Tuesday 21 September 2010

Spiff: The competition

In the years between the inception of Spiff (yes, I'm going camel case, upper case everywhere just won't do) and its subsequent revival (and re-revival), a couple of competitors have appeared in the same space.

Closest in spirit to Spiff is Preon. It even states its aim "to be to binary encoded data what Hibernate is to relational databases, and JAXB to XML", which pretty much sums up what I want with Spiff. Where Preon differs is in its extensive use of annotations to do what Spiff does in its format definition file (which Spiff calls an .adf file - Arbitrary Data Format). Preon will examine your classes and derive the data format from the order and types of annotated fields in the class. It also uses annotations to derive looping and conditional logic.

I'll admit that I've not used Preon - partly through fear of polluting my ideas about what Spiff could and should do, and partly in case I decided it was better than Spiff and called the whole thing off. So any discussion of its merits and drawbacks is truly superficial. My general impression is that its reliance on annotations is a little hairy - when you're expressing logic in annotations, things look a little awkward. Spiff trades off compactness (having everything described in the code) for readability and also portability. The event dispatching and class binding mechanism means that one .adf file can be used to populate classes of any shape without needing to respecify the file format. This also highlights that Preon appears to expect the classes to fully describe the file format, which is rarely what you want in the code. One of the use cases that led to me starting to write Spiff was wanting to get little pieces of the data without having to worry about the rest of the file format. And thus were .jump and .skip begat.
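To make the "little pieces" use case concrete, here is a rough sketch of the same idea done by hand in plain Java: reading only the width and height from a BMP header and ignoring everything else. This is not Spiff code or .adf syntax. The class and method names are made up for illustration, and it assumes the common layout where the 14-byte file header is followed by a BITMAPINFOHEADER, which puts the little-endian width and height at byte offsets 18 and 22.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of Spiff: pulls two fields out of a BMP
// header by jumping straight to their offsets and ignoring the rest.
class BmpPeek {
    static final int WIDTH_OFFSET = 18;   // assumes a BITMAPINFOHEADER
    static final int HEIGHT_OFFSET = 22;

    /** Returns {width, height} read from the leading bytes of a BMP file. */
    static int[] dimensions(byte[] header) {
        if (header.length < HEIGHT_OFFSET + 4 || header[0] != 'B' || header[1] != 'M') {
            throw new IllegalArgumentException("not a BMP header");
        }
        ByteBuffer buf = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);
        // Absolute reads: "jump" to the fields we care about, skip the rest.
        return new int[] { buf.getInt(WIDTH_OFFSET), buf.getInt(HEIGHT_OFFSET) };
    }

    public static void main(String[] args) {
        // Build a fake 26-byte header in memory so the sketch runs without a file.
        ByteBuffer fake = ByteBuffer.allocate(26).order(ByteOrder.LITTLE_ENDIAN);
        fake.put((byte) 'B').put((byte) 'M');
        fake.putInt(WIDTH_OFFSET, 640).putInt(HEIGHT_OFFSET, 480);
        int[] dims = dimensions(fake.array());
        System.out.println(dims[0] + "x" + dims[1]); // prints 640x480
    }
}
```

The catch with doing it this way is that the offsets and the header assumption are baked into the code, which is the kind of knowledge a shared format definition is meant to hold instead.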

On the other side, the Google lads are also in the frame with protobuf. Protobuf is interesting in that it uses something analogous to the .adf file to describe the format. Where it differs is that it will generate classes for you to serialize and deserialize the format. That's something that Spiff might be capable of one day, but I like the idea of being able to write arbitrary POJOs and map the data onto them, rather than having objects in my code whose sole purpose is as marshallers. Also, it's largely oriented towards message-passing, that is, describing a message that will be passed between two systems, such as in RPC, where the user is in control of both ends of the transaction. To that end, the .proto definitions are reliant on using the underlying protobuf grammar for the message, for instance to recognise repeated blocks of data, and don't have some of the flexibility to express more complicated relationships between parts of the file.

I see a couple of strong points in Spiff from this. Portability of .adf files is possibly the biggest. Once someone has defined an .adf for, say, a .bmp file, or an iTunesDB file, anyone else can take that and use it to bind all or part of that data to their own classes. The other is flexibility, in hopefully being able to express all the things that can make binary file formats tricky to work with. I guess the first step is to have a working product...


Ok, I'm back on Spiff. I mean it this time. Repeat after me - "I will ship software, I will ship software, I will ship software".

Coming back to code after a little time is an interesting experience. Almost every time you can guarantee a few nuggets of insight that hadn't occurred previously.

Today's lesson: if it's difficult to write a unit test for something (the usual suspects being FileNotFoundExceptions and if (x == null) conditions), it's probably not worth having in the code.

I'm not normally one for striving for 100% code coverage with tests. Like all things that fall under the agile/XP umbrella, if you're doing it by the book, you're doing it wrong. There isn't a book that tells you how you should be working on your projects.

However, in this instance, I thought it would be an interesting exercise to try and get up to 100%. By getting OCD on the unit tests, I found at least two conditions in my code that couldn't actually occur:
  • a null check on an object right after its constructor was called, and
  • an exception thrown from code where I was using dynamic assignment where static assignment was sufficient.
In the latter case, I had
try {
    lib = Class.forName("java.lang.Math");
} catch (ClassNotFoundException e) {
    // how do I get here?
}
Can you write a test that exercises the catch block? Nope. This was a remnant of old code that hadn't been cleaned up. What I should have been doing was
lib = java.lang.Math.class;
which doesn't throw any exception.

It's good to remember that adding test cases is not the only way to get closer to 100% code coverage - deleting code does just as well.

How final is final?

One interesting tidbit from Spiff development. How final is
final int x = 1
? Answer: Not very, if you're using reflection. Field.setAccessible(true) will soon get you round any awkward encapsulation issues.

So, newly armed with that knowledge, what's printed out here?
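import java.lang.reflect.Field;

public class HowFinal {
    private final int x = 1;

    public static void main(String[] args) throws Exception {
        HowFinal howFinal = new HowFinal();
        Field f = howFinal.getClass().getDeclaredField("x");
        f.setAccessible(true);  // get round the private/final checks
        f.setInt(howFinal, 2);  // overwrite the "final" field
        System.out.println(howFinal.getX());
    }

    public int getX() {
        return x;
    }
}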

The answer, unexpectedly, is 1.

Er, so x was final after all? Sort of. The compiler inlines constants at compile time, so as far as the runtime JVM is concerned, getX() contains the code return 1;. Querying the field via reflection shows its true value of 2.
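A small sketch to poke at this further, on the assumption that the inlining only applies when the field is initialised with a compile-time constant. The class FinalFieldDemo, its fields and the overwrite and peek helpers are all invented for this example. The second field is assigned in the constructor, so the compiler has nothing to inline and the getter has to read the field at runtime, where I'd expect it to see the overwritten value. Newer JDKs may warn about or restrict this kind of reflective write, so treat the behaviour as something to check on your own JVM.

```java
import java.lang.reflect.Field;

// Hypothetical demo class: contrasts a constant-initialised final field
// with a final field assigned in the constructor.
class FinalFieldDemo {
    private final int inlined = 1;   // compile-time constant: javac inlines reads
    private final int assigned;      // set in the constructor: read at runtime

    FinalFieldDemo() {
        assigned = 1;
    }

    int getInlined() { return inlined; }
    int getAssigned() { return assigned; }

    /** Overwrites a final instance field of the demo object via reflection. */
    static void overwrite(FinalFieldDemo target, String fieldName, int value) {
        try {
            Field f = FinalFieldDemo.class.getDeclaredField(fieldName);
            f.setAccessible(true);
            f.setInt(target, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reads a field via reflection, bypassing the getter. */
    static int peek(FinalFieldDemo target, String fieldName) {
        try {
            Field f = FinalFieldDemo.class.getDeclaredField(fieldName);
            f.setAccessible(true);
            return f.getInt(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        FinalFieldDemo demo = new FinalFieldDemo();
        overwrite(demo, "inlined", 2);
        overwrite(demo, "assigned", 2);
        System.out.println(demo.getInlined());       // getter compiled as "return 1;"
        System.out.println(peek(demo, "inlined"));   // the field itself now holds 2
        System.out.println(demo.getAssigned());      // getter reads the field at runtime
    }
}
```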

Is there any question whose answer doesn't start with "it depends"?

Spiff on GitHub

Spiff is now available to download/fork/whatever at GitHub. The GitHub site also has a wiki with some instructions for getting started.

Current state is pre-pre-pre-pre-alpha. That is, it doesn't really work, I'm rebuilding some of the core parts, but it's there for anyone who wants to snoop. Ship early, ship often!