Tuesday 21 September 2010

Spiff: The competition

In the intervening years between the first inception of Spiff (yes, I'm going camel case, upper case everywhere just won't do) and it's subsequent revival (and re-revival), a couple of competitors have appeared in the same space.

Closest in spirit to Spiff is Preon. It even states it's aim "to be to binary encoded data what Hibernate is to relational databases, and JAXB to XML", which pretty much sums up what I want with Spiff. Where Preon differs is in it's extensive use of annotations to do what Spiff does in it's format definition file (which Spiff calls an .adf file - Arbitrary Data Format). Preon will examine your classes and derive the data format from the order and types of annotated fields in the class. It also uses annotations to derive looping and conditional logic.

I'll admit that I've not used Preon - partly through fear of polluting my ideas about what Spiff could and should do, and partly in case I decided it was better than Spiff and just decided to call the whole thing off. So any discussion of it's merits and drawbacks are truly superficial. My general impression is that it's reliance on annotations are a little hairy - when you're expressing logic in annotations, things look a little awkward. Spiff trades off compactness (having everything described in the code) for readability and also portability. The event dispatching and class binding mechanism means that one .adf file can be used to populate classes of any shape without needing to respecify the file format. This also highlights the fact that it looks like Preon expects the classes to fully describe the file format, which is rarely what you want in the code. One of the use cases that led to me starting to write Spiff was wanting to get little pieces of the data without having to worry about the rest of the file format. And thus were .jump and .skip begat.

On the other side, the Google lads are also in the frame with protobuf. Protobuf is interesting in that it uses something analogous to the .adf file to describe the format. Where it differs is that it will generate classes for you to serialize and deserialize the format. That's something that Spiff might be capable of one day, but I like the idea of being able to write arbitrary POJOs and map the data onto them, rather than having objects in my code whose sole purpose is as marshallers. Also, it's largely oriented towards message-passing, that is, describing a message that will be passed between two systems, such as in RPC, where the user is in control of both ends of the transaction. To that end, the .proto definitions are reliant on using the underlying protobuf grammar for the message, for instance to recognise repeated blocks of data, and don't have some of the flexibility to express more complicated relationships between parts of the file.

I see a couple of strong points in Spiff from this. Portability of .adf files is possibly the biggest. Once someone has defined an .adf for, say, a .bmp file, or an ItunesDB file, anyone else can take that and use it to bind all or part of that data to their own classes. The other is flexibility, in hopefully being able to express all the things that can make binary file formats tricky to work with. I guess first step is to have a working product...

No comments: