Thursday, December 10, 2009

ZPlanner Technology Brief: Maven2

A few weeks ago, I devoted a blog to talking about Mercurial, the distributed revision control system I decided to use for ZPlanner, my Agile project tracking tool. This week, I figured I'd briefly talk about one of the other choices I made, namely to use Maven2.

For most of my career as a hands-on developer, I was trapped in the world of Perl and Apache. Our proprietary system didn't require doing builds. You just created a new file, put it in an appropriate directory, and voila, it worked. All of the libraries we used were global to our servers and installation of new packages (from CPAN) was a rare thing. When it did occur, it usually just meant shooting a request to one of our admins and asking him to install it. So, for all intents and purposes we never really thought much about dependency management or builds or any of that stuff.

A bit later, I was assigned as technical lead to a new, large scale (for our company) project and we decided, for various reasons, to use Java. Immediately, we had to start thinking about how we'd do our builds. Given that I was the first person to start coding and was relatively inexperienced with Java at that time, I didn't think about it very much. I started coding in Eclipse, created a directory structure that I thought was reasonable and did builds solely via Eclipse. Dependencies were managed by plopping a jar in the lib directory, then clicking 'Add to classpath'. I didn't think about it much after that. It seemed to work.

When we brought other, more experienced Java developers on board, they immediately started talking about Ant. My boss (whom I still regard as the most brilliant programmer I've ever worked with) steadfastly maintained that dependencies and builds should be managed via Eclipse, and that to use Ant (or Maven) was a duplication of what we'd be doing in Eclipse anyway. Why do we need to do builds via the command line? We should be able to build directly in Eclipse, which would produce an artifact (in the case of the service I was writing, a jar), deploy it any which way we wanted, and create simple scripts to set the classpath when we ran it. Why muck around with byzantine Ant files, when Eclipse managed all of this for us? And I really do think he had a good point.

Eventually, though, we did end up with an Ant file. I can’t recall if it was for any great reason. Thankfully, due primarily to the efforts of my boss, who whittled the initial file one of our contractors created down to about 1/3 its size, it wasn't too hard to understand. Most Ant files are not so nice. I've seen more than my share of horribly long, incomprehensible Ant files that copy files arbitrarily from one location to another, rename directories, and all sorts of other crazy, random stuff.

And the thing is, when you start creating an Ant file, you almost invariably end up reinventing the wheel. Deploying a jar or a war generally consists of pretty much the same steps every time. But because someone decided to structure the project a little differently than the last one, or made some other random decision, there are all these one-off, unique things that the Ant file has to do.

And Ant *only* really manages builds--and maybe starting up your app. If you're doing a build in Eclipse, you'll still have to go into the lib and make sure all the jars you need are on the classpath. Some may be assumed to be provided by your web container at runtime, but Eclipse doesn't know about those, so make sure you go and add those to the classpath too. It just ends up being a pain in the ass.

And that's where Maven comes in. Instead of reinventing the wheel every time you start a project, you agree to some conventions. You'll always structure your directories in a certain, canonical way. This is facilitated by the notion of archetypes in Maven. Archetypes are basically just templates for how projects are structured. They're available for most of the common (and even uncommon) types of applications you’re likely to build.

In return for this compliance with convention (with which you may or may not totally agree), Maven provides the basic things you always need to do: build, deploy, run tests without having to hand code them. So, in my mind, it’s already got Ant beat at this point. You don’t have to tell it to copy all the jars in your lib into some other dir, or copy some random config files you shoved in the ‘etc’ directory, into the other directories. But then Maven adds something much, much more useful: Dependency management.

Too often I’ve checked out some Java project, added all the jars in the lib path to the classpath, then tried to do a build…and stacktrace.

Crap, okay, so this lib I added, has a dependency on another lib. Time to go out on Google and search around. Then you download that, add it to the classpath, then blam.

Dammit, there’s some other dependency. And through trial and error (and a bunch of time), you eventually figure out what you were missing to start with. Often, these are libraries that are only needed for compilation, or they’re assumed to be provided by your web container, so they haven’t been included in lib. But you need them to do anything. And then you have the problem that it’s quite likely that none of your jars are versioned.

Okay, I have hibernate.jar from two years ago that has no version information. What the hell version is it? Can I upgrade it or will that break everything? These are all lovely questions to wrestle with.

Instead, Maven manages this stuff for you. The POM (Project Object Model) file that you write, instead of a build.xml as Ant uses, is not a procedural set of steps. Maven knows about doing builds via Plugins (which I won’t go into here), and so really all you end up needing to represent are which plugins (i.e. what types of activities you’re performing, rather than the explicit steps that you need to represent in Ant) and what the dependencies are that your project relies on. You can even notate that a certain dependency is only necessary for the build step or the test step. If one of the libraries you’re using requires some other jars of which you were unaware, Maven will figure this out and download them without you ever having to muck around.
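To make that a bit more concrete, here is a minimal sketch of what such a POM might look like. This is not ZPlanner's actual POM: the `com.example` coordinates are hypothetical, and the three dependencies (Hibernate, the servlet API, JUnit) and their versions are just illustrative picks. The point is the shape of the file: no build steps, only a declaration of what the project is and what it depends on, with `scope` marking a jar as supplied by the container or needed only for tests.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <!-- Hypothetical project coordinates, for illustration only -->
  <groupId>com.example</groupId>
  <artifactId>example-app</artifactId>
  <version>1.0-SNAPSHOT</version>

  <dependencies>
    <!-- Default (compile) scope: on the classpath everywhere and packaged
         with the app. Jars that this library itself needs are resolved
         transitively by Maven. -->
    <dependency>
      <groupId>org.hibernate</groupId>
      <artifactId>hibernate-core</artifactId>
      <version>3.3.2.GA</version>
    </dependency>

    <!-- provided: needed to compile, but assumed to be supplied by the
         web container at runtime, so it is not packaged. -->
    <dependency>
      <groupId>javax.servlet</groupId>
      <artifactId>servlet-api</artifactId>
      <version>2.5</version>
      <scope>provided</scope>
    </dependency>

    <!-- test: only on the classpath when compiling and running tests. -->
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.7</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>
```

With a file like this in place, running `mvn dependency:tree` prints the declared dependencies along with whatever they pulled in, which answers the "what the hell version is it?" question directly.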

And you can specify that *only* version 1.1.2 of a library should be used. You explicitly tell it what version to use, so later on down the line, you can easily make the decision of whether it’s safe to move to a new version. If you use the Maven2 Eclipse plugin, you can even reuse the POM to set your classpath, so that whole laborious process of adding jar after jar to your classpath…you don’t need to do that. Just check out the code and bam, ready to run.
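One subtlety worth sketching here, as I understand Maven's version handling: a plain `<version>1.1.2</version>` is treated as a preferred version, which can be overridden when another part of the dependency tree asks for something different. If you really mean *only* 1.1.2, the bracketed range form expresses that as a hard requirement. The coordinates below are hypothetical.

```xml
<dependency>
  <!-- Hypothetical library coordinates, for illustration only -->
  <groupId>com.example</groupId>
  <artifactId>some-library</artifactId>
  <!-- "1.1.2" alone is a soft requirement (a preference).
       "[1.1.2]" is a hard requirement: exactly this version, and the
       build fails if it conflicts with another hard requirement. -->
  <version>[1.1.2]</version>
</dependency>
```

In practice the plain form is what most POMs use, and it is usually enough; the bracketed form is for the cases where a different version sneaking in would actually break something.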

Of course, a lot of this happens by magic. But magic, to paraphrase Arthur C Clarke, just means that something is going on that is more advanced than you understand. And recently I’ve seen some complaints that when you have to delve into the magic, things can get messy.

But while using it on ZPlanner, I’ve not encountered any real problems. A few little hiccups here and there, but honestly I can't imagine going back to futzing around with 200-line-long build.xmls for Ant, or not having the dependency information readily available so I can make intelligent decisions about what my project should or shouldn't be relying on.

If you haven't played with Maven before, I highly recommend taking a look at the excellent online Maven book, available freely from Sonatype. It clearly explains the basic concepts of Maven and let me get my project up and running using Maven in a matter of hours.

Thursday, December 3, 2009

Timetracking, Funny Math, and the Accountants

I joined my current company about 2.5 years ago. And since then the one constant has been my frustration with how inefficient the company is at times. This has ranged from an overreliance on obscenely expensive contractors (the first team of five developers I managed cost somewhere around one million a year and contained no full-time employees), to endless meetings in which nothing is resolved, to various hopelessly complicated processes that amount to nothing more than busy work.

Of course, when the company was bought out a year ago and a new executive team was brought onboard, they immediately began trying to reduce cost. They jettisoned the 70% contractor work force (something a few of us had been complaining about for years), moved most of the real development offshore, and made cuts in various departments across the board.

But they're not done. And now, apparently, some of them have turned their gaze on our time tracking system.

Now, I've always seen corporate time tracking as a curious thing. Time tracking is almost always something mandated by the accountants and managers, so they can get a handle on cost. Like many things, it's an abstraction, and in every implementation I've seen it's an abstraction meant to give the financial people the information they want.

For the people doing the real work, meaning development and QA, the abstraction is generally far less meaningful. They don't see it as something valuable because financials and project codes aren't how they think about their lives. They think about the project or task they're doing. Sometimes these overlap, but just as often they don't.

Of course, I firmly believe in tracking estimates, how close actuals are to estimates, and so on. That's a big part of what I'm trying to tackle with ZPlanner. The difference, though, is that in tools like ZPlanner, the numbers come from the actual tasks and work that people have to do. Entering time in such a system gives them value, because they can look at a burndown and see how much work remains for their team, and they can look at how many hours and tasks are assigned to them and see what's left.

Time tracking systems don't give any of this valuable feedback to a developer. So, people enter their time because they have to. As such, its validity is very questionable. I have only to open up my manager's interface in our time tracking system, and see timecard after timecard filled in with an exact 8 hours every day, to know this is not "real" data. It's constructed for my benefit and the benefit of the accountants. My reports fill it out because if they don't, I send them an email chiding them for having not entered their time, not because they really *care* about it.

When people are doing something just because they have to, more often than not they do a half-assed job. That's just human nature. If you have a problem with that, take it up with evolution.

And the more complicated you make the rules and the more difficult you make it, the more carelessly the task will be performed. Which brings us back to the attempts of our executives to make our company more efficient.

For 2010, we now have a series of new rules for entering time. The two most impactful changes being:
  • Every time card must have 8 hours entered per day
  • Buckets for management will be eliminated, which means all time must be charged against a specific project

Currently, when I have to file a timecard, I really try my utmost to charge the time I spent to the appropriate project. But frequently, there will be a big chunk of time on any given day for which I can't really account. The time was usually spent in random conversation about our technology or other ongoing projects, answering email, and so on. It's in such small, discrete portions, though, that it's impossible to recollect exactly how it was spent. Currently, I log this to the "management" bucket. Now I will have to charge it against a project. Most likely this will mean I spread it evenly across all the projects for which I have oversight, which is everything in my division. Trying to count it accurately would take 50% of my overall time and would be an exercise in futility.

Even the time I spend with my reports in one-on-ones needs to be charged to a project, I'm told. I no longer have a 'management' code against which to charge it, so it will likely go to whatever project that particular person is assigned to. The fact that I probably spent the time discussing stuff having nothing to do with the project is immaterial. As flawed and inaccurate as the data is now, I'm pretty convinced these rules will cause it to be even more skewed.

My initial reaction to this was that it was a terrible thing. I kind of thought of it in an absolute sense, as in, time logged should correspond to reality, and that's the only way the data can lead to sane decisions. I thought the intention was to log time as accurately as possible, but I now understand that's not really the case.

Last night, I brought up my concerns about the new time tracking rules to a VP. He explained that the intention was to either pass along the cost or use these tools to find inefficiency. I told him I thought it was dumb to mandate that everyone has to log a minimum of 8 hours per day. There are probably days they work less than that. And I guarantee you, no one is "productive" eight hours out of the day at this company. If they work 8 hours, they take a coffee break here, discuss some random thing with their colleagues that's completely non-work related, and so on.

He countered that the new system would serve two goals: to make sure the cost is passed along to the customer and to make sure people are being efficient. He said that right now buckets like "Management" allow people to hide things. If they try to just amortize this time across their projects, the people who are in charge of those projects will see through it.

But that presumes people know exactly (or even remotely) how long "managerial" stuff takes. I don't think the problem in our company is (primarily) inefficiency on the part of the developers or QA or anyone doing real work. It's the amount of managerial overhead. I can't imagine what requires my team to have 6 project managers for the number of projects we have. Often a project manager has only 1-2 assigned projects. How can there be *that* much to manage? I just don't buy it.

The VP's contention is that this will come out. I suspect something different, however. People will learn other ways to game the system, and because the data will now be even less accurate (if that's possible) than it is now, the decisions the executives make based upon it will be ever more divorced from reality.

But maybe it all works out. I don't know. This VP did bring up some good points insofar as "reality" doesn't matter, if the cost is passed along appropriately to the customer and the margins rare. I guess my incredulity is because I don't have an MBA. Reality doesn't matter.

But I like to think it does. And the *real* way to answer these questions, to make sure that costs are calculated appropriately, is to start with how the estimates are generated. To track them in a real tool that captures how much real work is done and how much "management" is logged. To ask hard questions based on empirical observation of why so much *management* time is needed, and to cut the cord in some cases and see what happens.

Managers will always justify their existence. They create work for each other. It's not out of ill will; it's just their nature. There's a great quote from Northcote Parkinson, after whom Parkinson's law is named, which goes:

"But the day came when the air vice marshal went on leave. Shortly afterwards, as it happened, the colonel fell sick. The wing commander was attending a course, and I found I was the group. And I also found that, while the work had lessened as each of my superiors had disappeared, by the time it came to me, there was nothing to do at all. There never had been anything to do. We'd been making work for each other."

So to me using a tool that most people don't care about, with an abstraction meaningful only to a few who aren't actually doing any of the "real" work, and which the smart people will find simple ways to game is not the way to fix these problems.

But what the hell do I know, I don't have an MBA.