Monday, 9 December 2013

The Two Queues

As a manager of several development teams, I have two queues at my desk.

The first consists of all the people complaining "the Dev estimate on this work is far too high!"

The second consists of all the people complaining "Dev have taken far longer than they estimated!"

It is not unknown for members of the first queue, having had their query swatted away, to immediately join the second queue, or vice versa, without a hint of irony.

Let's address the first one here. Why might someone think an estimate is too high?  Conversely, why might a Dev team give an estimate that others consider "high"?
  • You don't agree on the scope - estimation is often done off the back of a ticket in a workflow system that has little detail on it.  Even if the Dev team engage the product owner in a conversation when estimating (and you do, right?), there are often lots of implicit assumptions made by both sides, not least the assumption that the "other side" are making the same assumptions.  In the absence of knowing the right questions to ask, it's natural for developers to err on the side of caution and fill in the gaps with their own assumptions about what's needed.  There's a simple solution to this: keep talking.  Find out from the Dev team what assumptions they've made, or what tasks they're planning, and help them turn them into concrete things.
  • The team doesn't know the domain - perhaps they've been asked to take over a system they haven't worked on before.  It is unrealistic to expect a team that's never worked on something to make changes to it at the same rate as a team that may have built it from scratch.  Don't be surprised if it takes 2 or even 3 times longer. Developing software is not like following a recipe.  Either quit moving systems between teams, or consider the initial effort an investment to make things quicker next time.
  • The system has technical debt - Whisper it quietly, but sometimes software is not that well written.  Or at least, once upon a time it was, but over time things have gotten a little messy.  Perhaps the Dev team have been less than careful about the changes they made.  Perhaps they did it to get the work done quicker and make your estimates smaller last time.  Either way, now you're paying the price, and the price is more effort, which means a higher estimate.  It is the righteous and courageous Dev team that says "no, we're fixing this now, not later".  If my builder didn't do that, I'd probably be writing to Watchdog.
  • You think you know something the team doesn't - And so we come to the most common cause for "that estimate is too high!".  It's just a one-liner, right? 
  • You actually do know something the team doesn't - If you do, congratulations!  Share it with the Dev team.  Share your experience or your knowledge, or connect them with other people who have it. The Dev team will almost certainly be thankful for the input. However, just telling them what you know doesn't necessarily mean that right now they can do it as quickly as they will be able to once they've got experience for themselves.
  • The team know something you don't - Yes, it's a one-liner.  At least, it is once we've refactored the code into the right shape.  And written the unit test. And functional test. And integration test.  And tested it across all browsers. And deployed it.  And listened to your feedback about how it's slightly the wrong shade of blue and needs to blink. And attended the meeting with the customer to explain to them how to use the new feature. 
There is a third queue at my desk.  It's the people complaining "the Dev team have done the work far quicker than they estimated!".  It's currently empty.  Feel free to join it sometime.

Sunday, 7 April 2013

MD5 tests

Consider this test:
   public void testMapRowWhenDocIdNotInSolr() throws Exception {
      idToScore.put(DOC_ID101, 0L);
      context.checking(new Expectations() {{
         oneOf(solrResponse).getResults(); will(returnValue(new SolrDocumentList()));
         never(solr).getBinder(); will(returnValue(new DocumentObjectBinder()));
      }});

      unit.mapRow(rs, 101);

      assertEquals(new Integer(1), solrQueryHolder.getFirstHeldObject().getRows());
   }

And what exactly is this testing? Alas, the name of the test doesn't really tell us - it tells us what's going to be called (unit.mapRow), and what the pre-conditions are (the docId is not in Solr), but doesn't give any clue as to what the expected outcome is. That's a first smell of a bad test. Nor does it get any better. The test proceeds to make a bunch of mock expectations about what's going to get called, but none of this helps us understand what is really expected to happen.  

The problem here is that I want to refactor this class. I want to keep the same behaviour, but I don't want it in mapRow(). This is unfortunate, because all the tests in this project look like this, and if I move functionality elsewhere I don't have any tests at a functional level that will show that I'm still achieving the same end goal.  

The colloquial name for these tests in our team is an "MD5 test". That is, the test just tests that the method is exactly what I've written. I may as well test the MD5 hash of the method body. 

Mocking frameworks make these kinds of tests very easy to accidentally introduce.  The unfortunate coder will stray into the wilderness of mocking up every single method call, making sure to examine and verify every method signature along the way.  The alert coder will consider what the intent of the mocking is.  In some cases, verifying a specific method call is what you're intending to do, but I'd suggest a majority of the time mocking is simply a way to return fake values, in which case you probably want a stub.
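To make the stub idea concrete, here is a minimal sketch in plain Java with no mocking framework at all. The names (SearchIndex, Scorer, scoreFor) are made up for illustration and are not the classes from the project above. The stub just hands back a canned value, and the test is named for, and asserts on, the outcome rather than on which methods were called.

```java
import java.util.List;

public class StubExample {

    /** Hypothetical collaborator: looks up document ids in a search index. */
    interface SearchIndex {
        List<String> findIds(String query);
    }

    /** Hypothetical unit under test: scores a document by whether the index knows it. */
    static class Scorer {
        private final SearchIndex index;

        Scorer(SearchIndex index) {
            this.index = index;
        }

        /** Returns 0 when the index does not know the document, otherwise 1. */
        long scoreFor(String docId) {
            return index.findIds(docId).contains(docId) ? 1L : 0L;
        }
    }

    /** The test name states the expected outcome, not just the method being called. */
    static void scoreIsZeroWhenDocIdNotInIndex() {
        // Stub: returns a canned answer; nothing verifies how or whether it was called.
        SearchIndex emptyIndex = query -> List.of();
        Scorer scorer = new Scorer(emptyIndex);

        long score = scorer.scoreFor("101");

        if (score != 0L) {
            throw new AssertionError("expected 0 but was " + score);
        }
    }

    public static void main(String[] args) {
        scoreIsZeroWhenDocIdNotInIndex();
    }
}
```

Because the test only cares about the returned score, the lookup could move out of scoreFor() into some other method and the test would still pass, which is exactly what an "MD5 test" can't offer.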

Tuesday, 12 February 2013

On the Joy Of Less Code

There's only one thing I really enjoy more than coding, and that's not coding. Or rather, anti-coding.  Ctrl + D is my favourite key combination.  Watching lines disappear into bit-history fills me with a sense of excitement and cosiness that far exceeds the dread and fear with which I create them.

It's pretty simple.  Every time I delete a line, there are a handful fewer troublesome bytecodes in the world that can cause a bug, and that can surely only be a good thing.  There are, of course, perils, but I'm all about high-octane thrills.  I'm not talking about mere refactoring here - taking 3 lines of code and coming up with the sort of one-liner that über-geeks love to use for oneupmanship.  I'm talking taking a good look at your code and thinking about what really matters.  Is this code actually doing anything useful? 

There are many reasons to be deleting code.  Perhaps you found a better way of doing things and have refactored.  Perhaps requirements have changed and some functionality is no longer needed.  Perhaps you just went too far in the first place and made up a requirement in your head.  Occasionally, you have the headbanging moment when the compiler forces you to add code against your will to deal with a situation that will never occur.

But, remember the rules:
  1. Delete, don't comment - Commenting out unused code is a bit like putting dog poo in a bag and then leaving the bag.  If you're using version control (and if you're not, you're on your own), then all this history is freely available to you.  With modern DVCS systems like git, it's a no-brainer to start a repo anywhere you like, and to search back through history if you ever want to dig out that code again.
  2. YAGNI - it's far too tempting to think "but this might come in handy one day...".  If it's not being used now, get rid of it.  Again, if you want it in the future, it's safe and warm in the repo.  But not today.
  3. Delete mercilessly and thoroughly - Don't just remove the obvious chunk of code.  Inevitably there will be tests that call that code, configuration parameters, text strings - all sorts of other cruft that will be rendered redundant when you delete that code.  If it's not used right here, right now, delete it.
I'm going to stick my neck out and suggest that most codebases could probably lose 10% of bug-generating, coder-baffling, readability-stifling code, and still do exactly the same job.  So, grab your Ctrl, grab your D, and get deleting.  Watch your test coverage stats soar! 

Monday, 28 November 2011

Insert Post Here

Insert shame-faced regret at not updating blog more here

Friday, 12 November 2010

Please release me

Ship it, ship it, ship it! Spiff v0.1.0 (yes, I'm being cautious) out now, you can grab the jar from

Tuesday, 21 September 2010

Spiff: The competition

In the intervening years between the first inception of Spiff (yes, I'm going camel case, upper case everywhere just won't do) and its subsequent revival (and re-revival), a couple of competitors have appeared in the same space.

Closest in spirit to Spiff is Preon. It even states its aim "to be to binary encoded data what Hibernate is to relational databases, and JAXB to XML", which pretty much sums up what I want with Spiff. Where Preon differs is in its extensive use of annotations to do what Spiff does in its format definition file (which Spiff calls an .adf file - Arbitrary Data Format). Preon will examine your classes and derive the data format from the order and types of annotated fields in the class. It also uses annotations to derive looping and conditional logic.

I'll admit that I've not used Preon - partly through fear of polluting my ideas about what Spiff could and should do, and partly in case I decided it was better than Spiff and just decided to call the whole thing off. So any discussion of its merits and drawbacks is truly superficial. My general impression is that its reliance on annotations is a little hairy - when you're expressing logic in annotations, things look a little awkward. Spiff trades off compactness (having everything described in the code) for readability and also portability. The event dispatching and class binding mechanism means that one .adf file can be used to populate classes of any shape without needing to respecify the file format. This also highlights the fact that it looks like Preon expects the classes to fully describe the file format, which is rarely what you want in the code. One of the use cases that led to me starting to write Spiff was wanting to get little pieces of the data without having to worry about the rest of the file format. And thus were .jump and .skip begat.

On the other side, the Google lads are also in the frame with protobuf. Protobuf is interesting in that it uses something analogous to the .adf file to describe the format. Where it differs is that it will generate classes for you to serialize and deserialize the format. That's something that Spiff might be capable of one day, but I like the idea of being able to write arbitrary POJOs and map the data onto them, rather than having objects in my code whose sole purpose is as marshallers. Also, it's largely oriented towards message-passing, that is, describing a message that will be passed between two systems, such as in RPC, where the user is in control of both ends of the transaction. To that end, the .proto definitions are reliant on using the underlying protobuf grammar for the message, for instance to recognise repeated blocks of data, and don't have some of the flexibility to express more complicated relationships between parts of the file.

I see a couple of strong points in Spiff from this. Portability of .adf files is possibly the biggest. Once someone has defined an .adf for, say, a .bmp file, or an ItunesDB file, anyone else can take that and use it to bind all or part of that data to their own classes. The other is flexibility, in hopefully being able to express all the things that can make binary file formats tricky to work with. I guess first step is to have a working product...
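For a feel of the "little pieces of the data" use case, here is a rough sketch in plain Java. To be clear, this is not Spiff and not .adf syntax. It is just the hand-written jump-and-read code that a format definition would ideally replace. It assumes a .bmp with the common BITMAPINFOHEADER layout, where the width is a little-endian 32-bit integer at byte offset 18. The class and method names are made up, and fakeHeader() exists only so the sketch can run without a real file.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class SkipAndReadExample {

    // Assumption: BMP with a BITMAPINFOHEADER, so width sits at byte offset 18.
    static final int WIDTH_OFFSET = 18;

    /** Jumps straight to the width field and ignores the rest of the format. */
    static int readWidth(byte[] bmpBytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bmpBytes).order(ByteOrder.LITTLE_ENDIAN);
        buffer.position(WIDTH_OFFSET); // the hand-rolled equivalent of a "jump"
        return buffer.getInt();
    }

    /** Builds just enough of a header to exercise readWidth (illustration only). */
    static byte[] fakeHeader(int width, int height) {
        ByteBuffer buffer = ByteBuffer.allocate(26).order(ByteOrder.LITTLE_ENDIAN);
        buffer.put((byte) 'B').put((byte) 'M'); // magic bytes
        buffer.position(WIDTH_OFFSET);          // everything in between is left as zeros
        buffer.putInt(width).putInt(height);
        return buffer.array();
    }

    public static void main(String[] args) {
        System.out.println(readWidth(fakeHeader(640, 480)));
    }
}
```

The magic number 18 buried in the code is the sort of thing that is better off living in a shared, portable format description than in every project that wants to read an image width.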


Ok, I'm back on Spiff. I mean it this time. Repeat after me - "I will ship software, I will ship software, I will ship software".

Coming back to code after a little time is an interesting experience. Almost every time you can guarantee a few nuggets of insight that hadn't occurred previously.

Today's lesson: if it's difficult to write a unit test for something (the usual suspects being FileNotFoundExceptions and if (x == null) conditions), it's probably not worth having in the code.

I'm not normally one for striving for 100% code coverage with tests. Like all things that fall under the agile/XP umbrella, if you're doing it by the book, you're doing it wrong. There isn't a book that tells you how you should be working on your projects.

However, in this instance, I thought it would be an interesting exercise to try and get up to 100%. By getting OCD on the unit tests, I found at least two conditions in my code that couldn't actually occur:
  • a null check on an object right after its constructor was called, and
  • an exception thrown from code where I was using dynamic assignment where static assignment was sufficient.
In the latter case, I had
try {
    lib = Class.forName("java.lang.Math");
} catch (ClassNotFoundException e) {
    // how do I get here?
}
Can you write a test that exercises the catch block? Nope. This was a remnant of old code that hadn't been cleaned up. What I should have been doing was
lib = java.lang.Math.class;
which doesn't throw any exception.

It's good to remember that adding test cases is not the only way to get closer to 100% code coverage - deleting code does just as well.
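Here is a small, self-contained sketch of the two styles side by side. The class and method names are invented for illustration. The dynamic version drags a checked exception along with it, while the class literal is resolved at compile time and leaves no branch to cover.

```java
public class ClassLiteralExample {

    /** Dynamic lookup: the compiler forces callers to deal with ClassNotFoundException. */
    static Class<?> lookUpDynamically() throws ClassNotFoundException {
        return Class.forName("java.lang.Math");
    }

    /** Static reference: resolved at compile time, so there is no checked exception. */
    static Class<?> lookUpStatically() {
        return Math.class;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Both routes end up at the same Class object for java.lang.Math.
        System.out.println(lookUpDynamically() == lookUpStatically());
    }
}
```

Swapping the first for the second removes the untestable catch block entirely, so coverage goes up without a single new test.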