Wednesday, April 03, 2013

Code analysis reporting tools don't work

Code analysis tools are good at highlighting code defects and technical debt, but when the issues are presented to the developer determines how effective the tool will be at making the code better. Tools that only generate reports nightly will be orders of magnitude less effective than tools that inform developers of errors before a change is put into the repository.

A few weeks ago I played with a code analysis tool that generates a website showing the errors it found in a codebase. Like most reporting tools, this one was made to run as a nightly cron job to generate its reports. Reflecting on my career, I have never seen tools of this type produce more than a small improvement in a project. After introduction there are a few developers who strive to keep the area they maintain clean, and even smaller pockets of developers who use the tools to raise the quality of their code to the next level, but they are the exception and not the norm. A scenario I have seen several times over my career is a project that has tools to automatically run unit tests at night. With this in place you would expect failures to be fixed the next day, but often I saw the failures continue for weeks or months and only get fixed right before a release. Once the commit is in the repository, the developer moves on to another task and considers it done. You could almost call it a law: before a developer gets a commit into the repository they are willing to move the moon to make the patch right, but after it is in the repository the patch will have to destroy the moon before they will think about revisiting it, and even in that case they will ask if you want to fix it so they don't have to. This means that code analysis reporting tools are able to make only a small impact, nowhere near the desired result.

After pondering why the reporting tools do so poorly and how they could be improved to make a bigger impact, I finally figured out what was really nagging at me: these tools were created because our existing processes are failing. If we could catch the issues sooner, it would both be cheaper to fix them and eliminate a whole class of wasted time. While you could think about new developer training, better code reviews, mentoring, etc., all of which can be improved, a simpler solution would be to move the tools' abilities closer to the time when the change is made.

In 2007 I started a project that included local commit hooks with Git. Anytime I had something that could be automated, it was added as a hook. When you modified the file foo.cpp it would run foo's unit tests, code style checking, project building, XML validation and more. This idea was wildly successful, and there were only a few times (~six?) in the lifetime of the project that the main branch failed to build on one of the three OSes or had failing unit tests. More importantly, the quality of the code was kept extremely high throughout the project lifetime. When I was working in the much larger WebKit project, putting up a patch for review on the project's Bugzilla would cause a bot to automatically grab the patch and run a gauntlet of tests against it, adding a comment to the patch when it was done. Often it was done before the human reviewer even had a chance to look at the patch. These bots would catch the same technical debt problems as the reporting tools, but because the results were presented at the time of review they would be cleaned up right then and there, when it was cheap and easy to do. Automatically reviewing patches after they are made but before they go into the main repository is a very successful way to prevent problems from ever appearing in the code base.
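To make the hook idea concrete, here is a minimal sketch of what a local pre-commit hook along these lines might look like. It is not the hook from the 2007 project. The scripts ./check_style.sh and ./run_tests.sh are hypothetical stand-ins for whatever a project actually uses, and the mapping from file type to checks is an assumption made for illustration.

```python
#!/usr/bin/env python3
"""Sketch of a local Git pre-commit hook: run only the checks that
are relevant to the files staged for this commit."""
import subprocess
from pathlib import PurePosixPath


def staged_files():
    """Return paths that are added, copied, or modified in the index."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        check=True, capture_output=True, text=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def checks_for(paths):
    """Map changed files to the commands to run, skipping duplicates."""
    commands = []
    for path in paths:
        p = PurePosixPath(path)
        if p.suffix in (".cpp", ".h"):
            # Hypothetical project scripts: style-check the file itself and
            # run the unit tests named after it (foo.cpp -> foo's tests).
            candidates = [["./check_style.sh", path],
                          ["./run_tests.sh", p.stem]]
        elif p.suffix == ".xml":
            # xmllint --noout parses the file and reports errors only.
            candidates = [["xmllint", "--noout", path]]
        else:
            candidates = []
        for cmd in candidates:
            if cmd not in commands:
                commands.append(cmd)
    return commands


def main():
    """Run each check; a non-zero return value makes Git abort the commit."""
    for cmd in checks_for(staged_files()):
        if subprocess.run(cmd).returncode != 0:
            print("pre-commit: failed: " + " ".join(cmd))
            return 1
    return 0

# When installed as .git/hooks/pre-commit (and made executable),
# end the file with:  raise SystemExit(main())
```

Because the checks are chosen from the staged files, a commit that touches only foo.cpp pays only for foo's checks, which keeps the hook fast enough that developers leave it turned on.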

But why stop at commit time? Many editors have built-in warnings ranging from code style to checking that the code parses. A lot has been written about LLVM's improved compiler warnings, and even John Carmack has written about how powerful turning on /analyze is for providing static code analysis at compile time. Much more could be done in this area to find and present issues to developers as soon as they create them, or even in real time.

Code analysis reporting tools will always be useful because they can provide a view into legacy code, but for new code, projects that report errors before commit time with hooks, bots, and editor integration will be able to actually prevent technical debt and do more for quality than nightly reports ever could.

1 comment:

Jake A. Smith said...

Good insight. The last time I remember looking at pre commit hooks I was disappointed that every developer had to set these up on their individual machines. Aside from good documentation and finding devs who actually read and follow documentation, have you found any good ways of automating the installation of pre commit hooks?
