Thursday, December 10, 2009

ZPlanner Technology Brief: Maven2

A few weeks ago, I devoted a blog post to Mercurial, the distributed revision control system I decided to use for ZPlanner, my Agile project tracking tool. This week, I figured I'd briefly talk about one of the other choices I made, namely to use Maven2.

For most of my career as a hands-on developer, I was trapped in the world of Perl and Apache. Our proprietary system didn't require doing builds. You just created a new file, put it in an appropriate directory, and voila, it worked. All of the libraries we used were global to our servers, and installation of new packages (from CPAN) was a rare thing. When it did occur, it usually just meant shooting a request to one of our admins and asking him to install it. So, for all intents and purposes, we never really thought much about dependency management or builds or any of that stuff.

A bit later, I was assigned as technical lead to a new, large-scale (for our company) project and we decided, for various reasons, to use Java. Immediately, we had to start thinking about how we'd do our builds. Given that I was the first person to start coding and was relatively inexperienced with Java at that time, I didn't think about it very much. I started coding in Eclipse, created a directory structure that I thought was reasonable, and did builds solely via Eclipse. Dependencies were managed by plopping a jar in the lib directory, then clicking 'Add to classpath'. I didn't think about it much after that. It seemed to work.

When we brought other, more experienced Java developers on board, they immediately started talking about Ant. My boss (whom I still regard as the most brilliant programmer I've ever worked with) steadfastly maintained that dependencies and builds should be managed via Eclipse, and that using Ant (or Maven) would duplicate what we'd be doing in Eclipse anyway. Why do we need to do builds via the command line? We should be able to build directly in Eclipse, which would produce an artifact (in the case of the service I was writing, a jar), deploy it any which way we wanted, and create simple scripts to set the classpath when we ran it. Why muck around with byzantine Ant files when Eclipse managed all of this for us? And I really do think he had a good point.

Eventually, though, we did end up with an Ant file. I can’t recall if it was for any great reason. Thankfully, due primarily to the efforts of my boss, who whittled down the initial file one of our contractors created to about 1/3 its size, it wasn't too hard to understand. Most Ant files are not so nice. I've seen more than my share of horribly long, incomprehensible Ant files that copy files arbitrarily from one location to another, rename directories, and all sorts of other crazy, random stuff.

And the thing is, when you start creating an Ant file, you almost invariably end up reinventing the wheel. Deploying a jar or a war generally consists of pretty much the same steps every time. But because someone decided to structure the project a little differently than the last one, or made some other random decision, there are all these one-off, unique things that the Ant file has to do.

And Ant *only* really manages builds--and maybe starting up your app. If you're doing a build in Eclipse, you'll still have to go into the lib directory and make sure all the jars you need are on the classpath. Some may be assumed to be provided by your web container at runtime, but Eclipse doesn't know about those, so make sure you go and add those to the classpath too. It just ends up being a pain in the ass.

And that's where Maven comes in. Instead of reinventing the wheel every time you start a project, you agree to some conventions. You'll always structure your directories in a certain, canonical way. This is facilitated by the notion of archetypes in Maven. Archetypes are basically just templates for how projects are structured. They're available for most of the common (and even uncommon) types of applications you’re likely to build.

In return for this compliance with convention (with which you may or may not totally agree), Maven provides the basic things you always need to do: build, deploy, run tests without having to hand code them. So, in my mind, it’s already got Ant beat at this point. You don’t have to tell it to copy all the jars in your lib into some other dir, or copy some random config files you shoved in the ‘etc’ directory, into the other directories. But then Maven adds something much, much more useful: Dependency management.

Too often I’ve checked out some Java project, added all the jars in the lib path to the classpath, then tried to do a build…and stacktrace.

Crap, okay, so this lib I added has a dependency on another lib. Time to go out on Google and search around. Then you download that, add it to the classpath, then blam.

Dammit, there’s some other dependency. And through trial and error (and a bunch of time), you eventually figure out what you were missing to start with. Often, these are libraries that are only needed for compilation, or they’re assumed to be provided by your web container, so they haven’t been included in lib. But you need them to do anything. And then you have the problem that it’s quite likely that none of your jars are versioned.

Okay, I have hibernate.jar from two years ago that has no version information. What the hell version is it? Can I upgrade it or will that break everything? These are all lovely questions to wrestle with.

Instead, Maven manages this stuff for you. The POM (Project Object Model) file that you write, in place of the build.xml that Ant uses, is not a procedural set of steps. Maven knows how to do builds via plugins (which I won’t go into here), so all you really need to declare is which plugins you use (i.e. what types of activities you’re performing, rather than the explicit steps you'd have to spell out in Ant) and which dependencies your project relies on. You can even specify that a certain dependency is only necessary for the build step or the test step. If one of the libraries you’re using requires some other jars of which you were unaware, Maven will figure this out and download them without you ever having to muck around.
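To make that concrete, here is a minimal sketch of what such a POM can look like. The project coordinates (com.example, my-service) are placeholders, and the libraries and versions were picked purely for illustration; this is not ZPlanner's actual configuration.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <!-- Hypothetical coordinates for the project being built -->
  <groupId>com.example</groupId>
  <artifactId>my-service</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>jar</packaging>

  <dependencies>
    <!-- No scope given: defaults to "compile", so it is on the
         classpath for compiling, testing, and running -->
    <dependency>
      <groupId>commons-lang</groupId>
      <artifactId>commons-lang</artifactId>
      <version>2.4</version>
    </dependency>

    <!-- "provided": needed to compile, but the web container
         supplies it at runtime, so it is not packaged -->
    <dependency>
      <groupId>javax.servlet</groupId>
      <artifactId>servlet-api</artifactId>
      <version>2.5</version>
      <scope>provided</scope>
    </dependency>

    <!-- "test": only on the classpath when compiling and running tests -->
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.7</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>
```

Note that there are no copy or compile steps anywhere in the file. With the standard directory layout in place, running `mvn package` compiles the code, runs the tests, and builds the jar.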

And you can specify that *only* version 1.1.2 of a library should be used. You explicitly tell it what version to use, so later on down the line, you can easily make the decision of whether it’s safe to move to a new version. If you use the Maven2 Eclipse plugin, you can even reuse the POM to set your classpath, so that whole laborious process of adding jar after jar to your classpath…you don’t need to do that. Just check out the code and bam, ready to run.
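A small sketch of what pinning looks like, again with made-up coordinates. In Maven's version syntax, a bare 1.1.2 is technically a "soft" requirement, while square brackets turn it into a hard requirement for exactly that version. For a dependency you declare directly in your own POM, the bare form is normally what you get anyway, so treat the bracket form as an option rather than a necessity.

```xml
<dependency>
  <!-- Hypothetical library, for illustration only -->
  <groupId>com.example</groupId>
  <artifactId>some-library</artifactId>
  <!-- [1.1.2] = exactly 1.1.2; a bare 1.1.2 would be a soft requirement -->
  <version>[1.1.2]</version>
</dependency>
```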

Of course, a lot of this happens by magic. But magic, to paraphrase Arthur C. Clarke, just means that something is going on that is more advanced than you understand. And recently I’ve seen some complaints that when you have to delve into the magic, things can get messy.

But while using it on ZPlanner, I’ve not encountered any real problems. A few little hiccups here and there, but honestly I can't imagine going back to futzing around with 200-line-long build.xmls for Ant, or not having the dependency information readily available so I can make intelligent decisions about what my project should or shouldn't be relying on.

If you haven't played with Maven before, I highly recommend taking a look at the excellent online Maven book, available freely from Sonatype. It clearly explains the basic concepts of Maven and let me get my project up and running using Maven in a matter of hours.

2 comments:

Joshua DeWald said...

Extra points for quoting Clarke's Law ;)

The only issue I've had with Maven is when you have a large-ish project using multiple outside components, you can get into some situations where multiple versions of a library end up included and you have to add some exclusions.

But otherwise, I've found it nice to not have to individually download 26 dependencies so I can use some particular jar.

Code Monkey said...

Ah, yes.

Even in the very modestly sized ZPlanner (which doesn't have that many dependencies) I've had to add a few exclusion rules. Mostly for the very reason you mention, but also in some cases due to invalid POMs from the libraries I'm using.

Additionally, if the library you need isn't in the main Maven repository, it can be a pain in the ass. I decided to use the flexjson library, which isn't in the main repository. I could have installed it to my local one, but if anyone checked out the code they'd have to do the same thing. Instead, I used the 'system' scope, but it still doesn't pull it into the classpath as the documentation indicates it should.

Looks like the problem is more with Eclipse, as the command line works, but little things like this can be annoying to troubleshoot.

But even with the small faults, it's still miles and miles better than the alternative IMO.