Tuesday, October 6, 2009

ZPlanner, TDD, and Over abstraction

As I've written once or twice before, I've been working on a lightweight Scrum/Agile piece of project management software. It's called ZPlanner and I've probably put all of 200-300 hours into it, though it's hard for me to say precisely. I started writing it when I thought I was going to be laid off from my job as development manager and wanted to brush up a bit on my Java programming skills. I somewhat arbitrarily chose to write a piece of project management software as it was a domain I knew and I've used XPlanner fairly frequently over the years. It's a decent tool, but has some serious shortcomings. But again, the motivation was mainly to try a few new technologies and do things "the right way" since I had no timelines or customers chomping at the bit.

Though I've been a big proponent of unit tests, and even TDD, for some time, my formal computer science education instilled in me the absolute need to "architect" everything I do prior to writing a line of code. Ever since I came into contact with TDD, I've questioned this approach and done my best to move away from big upfront design (BUFD), but if I'm honest with myself I've not really done TDD proper. With ZPlanner, however, there was nothing really on the line, so for the first time I decided to make a legitimate attempt at using TDD from the beginning.

I didn't think about a line of code or spend days upon days trying to come up with some perfect object hierarchy. Instead, for once, I just started coding:
  1. Write a failing unit test for new functionality
  2. Do the simplest possible thing to make the code pass the test
  3. Refactor
  4. Repeat


What was interesting was that in an amazingly short period of time I actually had something that worked at some basic level. My first milestone was to have a simple web page in which one could enter a task with a name, description, and an estimate and save it to the database. As trivial as that may seem, using my old approach I would have spent days and days (or longer) agonizing over some design, thinking of every possible use case, going back and forth on which technology to use, and so on. It probably would have been several weeks before I was even at the stage to write a line of code.

Instead, within 40 hours I had set up Maven2 for dependency management and to do builds, Hibernate Annotations to manage database persistence, Cobertura to do unit test coverage reports, and Mercurial to manage my revisioning. And I even had a working web application to boot!

And it did something meaningful, which was allow me to save tasks with estimates.

It was basically a slightly glorified Excel sheet (minus all the crazy macros and editing and so on).
As I kept moving forward I tried to repeat the TDD mantra:

  1. Write a failing unit test for new functionality
  2. Do the simplest possible thing to make the code pass the test
  3. Refactor
  4. Repeat
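
To make the cycle concrete, here is a minimal sketch of one turn of it for the task-with-an-estimate milestone described above. It is an illustration only, not ZPlanner's actual code: the class names, the getters, and the choice of hours as the unit are all assumptions, and it uses plain Java checks rather than a test framework so that it runs on its own.

```java
// Hypothetical sketch; none of these names come from the real ZPlanner source.

// Step 1: this check is written first. Before Task exists it fails
// (it does not even compile), which is the "failing test".
class TaskTest {
    static void storesNameDescriptionAndEstimate() {
        Task task = new Task("Login page", "Build the login form", 4.0);
        check(task.getName().equals("Login page"), "name");
        check(task.getDescription().equals("Build the login form"), "description");
        check(task.getEstimate() == 4.0, "estimate");
    }

    static void check(boolean condition, String label) {
        if (!condition) {
            throw new AssertionError("failed: " + label);
        }
    }

    public static void main(String[] args) {
        storesNameDescriptionAndEstimate();
        System.out.println("all checks passed");
    }
}

// Step 2: the simplest thing that makes the check pass. No validation,
// no hierarchy, no persistence; those wait until a test demands them.
class Task {
    private final String name;
    private final String description;
    private final double estimate; // assumed to be in hours

    Task(String name, String description, double estimate) {
        this.name = name;
        this.description = description;
        this.estimate = estimate;
    }

    String getName() { return name; }
    String getDescription() { return description; }
    double getEstimate() { return estimate; }
}
```

Step 3, refactoring, has nothing to do yet with a class this small. The point of the sketch is only the ordering: the check exists before the code it exercises.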

Of course, "doing the simplest possible thing" is a subjective assessment. Sometimes what seems the "simplest thing" is not at all the simplest thing. And though I'd overcome my education with regard to doing BUFD, my need to be a "good computer scientist" and abstract things to reduce duplication reared its head.

Unit test by unit test I built up the application, refactoring, removing duplication, and doing "the simplest thing possible". By a week or two ago I actually had a system that could store a hierarchical tree of projects, iterations, stories, and infinitely recursed tasks. It could roll up estimates using some fairly complex logic, and it had validations for the web app, a clean, easy build process, the ability to create users, an Ajax drag-and-drop interface to move around stories/tasks, and about 80-90% code coverage. All with about 20 files and only a few thousand lines of code.


Talking to one of my colleagues (thanks Jim!), though, he pointed something obvious out. Part of my interpretation of doing the "simplest thing possible" had been trying to reuse code as much as possible. This in turn resulted in an object model in which Iterations, Stories, and Tasks had all been abstracted into a notion of an EstimatedItem.

This meant most of my functionality was able to be captured in only a few lines of code, but it also meant it was pretty non-intuitive. The comments were strewn with explanations about how an EstimatedItem in one case represented a Story and in another case represented a Task.

I'd also oddly (and wrongly) decided to subclass the Iteration class from EstimatedItem as well, reasoning that the fundamental characteristic of an estimated item was that it had a name, description, and an estimate. Well, I figured at the time, an iteration has all of these things. It only differs in that it has a start date and an end date. It seemed to make sense at the time and it also seemed the "simplest thing possible".
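
A rough reconstruction of that model might look like the following. This is a hypothetical sketch, not the real ZPlanner code: only the names EstimatedItem, Iteration, and addSubItem come from this post, while the fields, the use of LocalDate, and the roll-up rule (a parent reports the sum of its children) are assumptions made for illustration.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// One class standing in for stories AND tasks; which one an instance
// "is" depends only on where it sits in the tree.
class EstimatedItem {
    private final String name;
    private final String description;
    private final double estimate;
    private final List<EstimatedItem> subItems = new ArrayList<>();

    EstimatedItem(String name, String description, double estimate) {
        this.name = name;
        this.description = description;
        this.estimate = estimate;
    }

    void addSubItem(EstimatedItem item) {
        subItems.add(item);
    }

    String getName() { return name; }
    String getDescription() { return description; }

    // Assumed roll-up rule: a leaf reports its own estimate,
    // anything with children reports the sum of its children.
    double getTotalEstimate() {
        if (subItems.isEmpty()) {
            return estimate;
        }
        double total = 0;
        for (EstimatedItem item : subItems) {
            total += item.getTotalEstimate();
        }
        return total;
    }
}

// The questionable subclass: an iteration treated as "an estimated
// item that also happens to have dates".
class Iteration extends EstimatedItem {
    private final LocalDate startDate;
    private final LocalDate endDate;

    Iteration(String name, String description, double estimate,
              LocalDate startDate, LocalDate endDate) {
        super(name, description, estimate);
        this.startDate = startDate;
        this.endDate = endDate;
    }

    LocalDate getStartDate() { return startDate; }
    LocalDate getEndDate() { return endDate; }
}
```

The roll-up logic is pleasantly short, but nothing in the types stops a caller from adding an Iteration underneath a task, and nothing tells a reader whether a given EstimatedItem is meant to be a story or a task.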


The problem of course is that an Iteration IS NOT a story.
A task ISN'T even a story.

One can abstract things sufficiently such that all things are the same. I blame Java and its ilk for this to some degree, with its notion of everything being an Object. Okay, great! Everything's an object. What does that MEAN? But Java isn't really to blame either; I've seen plenty of other examples of over-abstraction. The problem is we're taught in school and by our peers that reuse is good.

But what if our abstraction makes the code much harder to read? Rather than looking at the code and seeing Stories and Tasks and things that have concrete meaning, we're dealing with abstractions that really take some thought to understand. Maybe it's a little less code, but it's much harder to understand.

I don't really blame TDD for my mistake. I think it could just as easily (perhaps even more likely) have happened with a traditional approach. In any event, I'm now rewriting a large portion of the code, copy-and-pasting even, and where I had three classes I now have 9-12. I guess this could be seen as a bad thing, but suddenly what was abstruse and explained in comments makes perfect sense just looking at the code.

Where one used to add a task like this:

EstimatedItem story = new EstimatedItem(...);
story.addSubItem(new EstimatedItem(...));

one now writes this:

Story story = new Story(...);
story.addTask(new Task(...));

The latter is infinitely easier to understand in my view. And hopefully once I've exploded my classes I might be able to see some commonalities that *are* meaningful, rather than my flawed notion of abstraction. Now, since I have unit tests for all this stuff, it's pretty easy to change them and make sure I'm still passing the same tests. So, TDD is what makes this *better* design possible. And now we're back on TDD and we've come full circle.
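
For illustration, the "exploded" version could be sketched roughly as below. Again this is a guess at the shape, not the actual rewrite: Story, Task, and addTask appear in this post, but addSubTask, the constructor arguments, and the roll-up rule are assumptions. The duplicated roll-up loop is left in on purpose.

```java
import java.util.ArrayList;
import java.util.List;

// A task can contain sub-tasks, to any depth.
class Task {
    private final String name;
    private final double estimate;
    private final List<Task> subTasks = new ArrayList<>();

    Task(String name, double estimate) {
        this.name = name;
        this.estimate = estimate;
    }

    void addSubTask(Task task) {
        subTasks.add(task);
    }

    String getName() { return name; }

    // Assumed rule: a leaf reports its own estimate,
    // a task with sub-tasks reports their sum.
    double getTotalEstimate() {
        if (subTasks.isEmpty()) {
            return estimate;
        }
        double total = 0;
        for (Task task : subTasks) {
            total += task.getTotalEstimate();
        }
        return total;
    }
}

// A story holds tasks and nothing else; the compiler now enforces that.
class Story {
    private final String name;
    private final List<Task> tasks = new ArrayList<>();

    Story(String name) {
        this.name = name;
    }

    void addTask(Task task) {
        tasks.add(task);
    }

    String getName() { return name; }

    // Nearly the same loop as in Task: duplication accepted for clarity.
    double getTotalEstimate() {
        double total = 0;
        for (Task task : tasks) {
            total += task.getTotalEstimate();
        }
        return total;
    }
}

class StoryRollupCheck {
    public static void main(String[] args) {
        Story story = new Story("User can log in");
        Task form = new Task("Build login form", 0);
        form.addSubTask(new Task("Markup", 2));
        form.addSubTask(new Task("Validation", 3));
        story.addTask(form);
        story.addTask(new Task("Write docs", 1));

        // The same expectation the old EstimatedItem tests would have had.
        if (story.getTotalEstimate() != 6.0) {
            throw new AssertionError("roll-up changed during refactoring");
        }
        System.out.println("roll-up still holds");
    }
}
```

A check like the one in StoryRollupCheck is what makes the rewrite safe: the expected total is carried over from the old tests, and only the construction code changes.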

Until next time.

2 comments:

Joshua DeWald said...

The obvious question... are you now using ZPlanner to plan the future iterations of ZPlanner itself? Nothing like eating your own dog food to figure out what works and what doesn't :)

wrt the abstraction... I admit that I probably would have gone the same route with the EstimatedItem, assuming that most of the logic doesn't actually care about the difference between them. I think you could at least have EstimatedItem be an interface, such that you can treat it as that for purposes of calculation, even if you have to duplicate all of the actual implementation. I find a lot of value in interfaces, but I've found base classes to often cause problems (due to what you mentioned of trying to do it too early and finding that it does not actually contain good "base" logic).

Code Monkey said...

Haha. Jim has asked the same question about using ZPlanner to plan ZPlanner. I would consider using ZPlanner to plan future iterations of ZPlanner except
1) It's not a planned project really, I'm just working on it here and there, so there's not a lot of value in planning iterations. I am the dev team, product owner, QA, etc. I just do the feature I think most worthwhile next. What I'm doing is more akin to prototyping.
2) The table structure isn't consistent enough. I keep changing things, which would make maintaining the data more trouble than it's worth.

I've started using MyLyn within Eclipse to track how much time I spend implementing a given feature and to manage context. (If you haven't used MyLyn, I recommend giving it a try. It's an interesting concept, though I'm not entirely convinced it's a "game changer".)

As for interfaces, etc., that's the first path I went down for my refactoring. I may yet return to it, but it proved too troublesome at first. I tried to create interfaces for the separate characteristics of EstimatedItems. An estimated item has two orthogonal natures, one being its quality of being in a tree (i.e. a project has nodes under it which are iterations, which have nodes under them which are stories). This I represented with an ITreeNode interface with things like getParent and getChildren, etc. The other nature it has is being estimatable, i.e. how an estimate is calculated, which I represented with an IEstimatable interface. That approach was proving problematic so I reverted the code and took a "simpler", more brute force approach of exploding the abstraction. Once that's done I will of course look for commonalities, which may include creating interfaces. I'll be honest, though, I still am not a fan of interfaces. Perhaps it's because I cut my teeth on C++, which lacks the concept (and instead has true multiple inheritance), but I see little value in them over abstract classes with private members, etc. Not saying I'm right, I just tend not to use them. In fact, this is my 3rd pass at refactoring. The first was to create an abstract base class, then I tried interfaces, now I'm on the 'dumb approach' of removing my inheritance scheme.
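
For readers who want to picture the interface approach, here is a rough sketch of what two such interfaces might look like. Only the names ITreeNode, IEstimatable, getParent, and getChildren come from the comment above; getEstimate, setParent, the implementing classes, and the roll-up rule are guesses, and the sketch does not try to reproduce whatever made the approach troublesome in practice.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Tree nature: where an item sits in the hierarchy.
interface ITreeNode {
    ITreeNode getParent();
    List<? extends ITreeNode> getChildren();
}

// Estimate nature: how an item's estimate is calculated.
// getEstimate is an assumed method name.
interface IEstimatable {
    double getEstimate();
}

class Task implements ITreeNode, IEstimatable {
    private final double ownEstimate;
    private final List<Task> subTasks = new ArrayList<>();
    private ITreeNode parent; // null until attached to a story or task

    Task(double ownEstimate) {
        this.ownEstimate = ownEstimate;
    }

    void setParent(ITreeNode parent) {
        this.parent = parent;
    }

    void addSubTask(Task task) {
        task.setParent(this);
        subTasks.add(task);
    }

    @Override
    public ITreeNode getParent() { return parent; }

    @Override
    public List<? extends ITreeNode> getChildren() {
        return Collections.unmodifiableList(subTasks);
    }

    // Assumed rule: a leaf uses its own estimate, a parent sums its children.
    @Override
    public double getEstimate() {
        if (subTasks.isEmpty()) {
            return ownEstimate;
        }
        double total = 0;
        for (Task task : subTasks) {
            total += task.getEstimate();
        }
        return total;
    }
}

class Story implements ITreeNode, IEstimatable {
    private final List<Task> tasks = new ArrayList<>();

    void addTask(Task task) {
        task.setParent(this);
        tasks.add(task);
    }

    // Iterations are left out of this sketch, so a story has no parent here.
    @Override
    public ITreeNode getParent() { return null; }

    @Override
    public List<? extends ITreeNode> getChildren() {
        return Collections.unmodifiableList(tasks);
    }

    @Override
    public double getEstimate() {
        double total = 0;
        for (Task task : tasks) {
            total += task.getEstimate();
        }
        return total;
    }
}
```

Code that only walks the tree can depend on ITreeNode, and code that only totals estimates can depend on IEstimatable, while Story and Task stay distinct types. The cost is that every class has to implement both sets of methods itself.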

Also, I'm not sure how Hibernate Annotations would behave if I use interfaces for the private members of my persisted classes. I'm fairly certain you need a concrete class for it to know how to persist it.