Last week I received Making Software: What Really Works, and Why We Believe It. From the first moment I heard about the book I couldn't get the idea out of my head. Over the years I have read many books and many more blogs arguing that X is better than Y, but here was a book that set out to list what we really know, with studies to back it up. Rather than just claiming that office cubes are bad or pair programming is good, the essays present research and evidence for what they are saying. I couldn't wait to read it.
Just a few of the many questions that it tries to answer:
- How much time should you spend on a code review in one sitting?
- Is there a limit to the number of LOC you can accurately review?
- How much better/faster is pair programming?
- Does using design patterns make software better?
- Does test-driven development work as well as they say?
- How much do languages matter?
- What matters more: How far apart people are geographically, or how far apart they are in the org chart?
- Can code metrics predict the number of bugs in a piece of software?
- Which is better: offices or cubes?
- Does code coverage predict the number of bugs that will be later found?
- What is right/wrong with our bug tracking systems today?
- Why are graduates so lost in their first job?
- Many more...
A short trailer for "Making Software" by Greg Wilson, one of the two editors:
The book is broken into two somewhat different sections. The first eight chapters discuss "The Quest for Convincing Evidence": how and what to collect, some existing data that is available, how to know when you are convinced, how to get your ideas reviewed, and so on. I was sold on the book for chapters 9-30, each of which covers a specific idea or theory, so I found this first part a little annoying to wade through as I was unprepared for it, but it gives the book a solid foundation and sets your expectations for the rest.
Starting at chapter 9 the book breaks out into a number of essays, each written by a different author. Go skim the Table of Contents for the complete list of chapters. I would bet there is at least one chapter you would be very interested in reading this very moment.
Disclaimer: before getting into the results, many (if not all) of the chapters include big disclaimers about all of the places their data gathering or testing could have gone wrong. The statements below are my takeaways from reading the book and things I am going to do differently. If anything, the book very much encourages (and teaches) you to generate your own data and review/publish your results.
Chapter 11 on Conway's Corollary really drove home the point that software is social.
"Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization's communication structure."
If you have a large number of modules and you want to know which one is the most buggy, the best predictor is which module has been modified by the largest number of groups in the company. Code coverage, testing, edits, code churn, dependencies, number of pre-release bugs, etc. all give lower precision. I would love to see a script that took in a git repo and output an org chart; then you could compare it to the real org chart to discover where things are probably a complete mess. Communication structure, developers leaving, and the number of engineers working on a project are all discussed. But it isn't all bad: the chapter also discusses how you can leverage the corollary to your advantage.
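That script is hypothetical, but the core of it, counting distinct authors per module, could be sketched in a few lines. This is my own sketch, not something from the book; it assumes you have already parsed `git log --format='%aN' --name-only` output into (author, path) pairs:

```python
from collections import defaultdict

def authors_per_module(commits):
    """Map each top-level directory (treated as a 'module') to the set
    of authors who have touched a file in it. `commits` is an iterable
    of (author, path) pairs, e.g. parsed from
    `git log --format='%aN' --name-only`."""
    modules = defaultdict(set)
    for author, path in commits:
        module = path.split("/", 1)[0]  # top-level directory as the module
        modules[module].add(author)
    return modules

def riskiest_modules(commits):
    """Rank modules by distinct-author count, the predictor the chapter
    reports beats coverage, churn, and dependency metrics."""
    modules = authors_per_module(commits)
    return sorted(modules, key=lambda m: len(modules[m]), reverse=True)

# Synthetic example: three authors spread unevenly across two modules.
commits = [
    ("alice", "net/socket.c"),
    ("bob", "net/tcp.c"),
    ("carol", "net/udp.c"),
    ("alice", "fs/inode.c"),
]
print(riskiest_modules(commits))  # 'net' first: three distinct authors
```

A real version would also need a mapping from authors to teams before it could say anything about the org chart, but even the raw author counts flag the contested modules.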
Chapter 17 discusses pair programming, which is something I have been debating trying, and now I know I will. In one example cited, while one person took 10 hours to complete a task, two people pair programming finished it in 5 hours and 45 minutes: slightly more total effort than the original 10 hours, but in nearly half the elapsed time. On top of taking about half as long on the clock, the resulting software was of higher quality and you get cross-training in the process. They also discuss one vs. two keyboards, experience levels, and more.
Chapter 19 provides you with the evidence to convince your VP that you should move to offices, war rooms, or some combination, and get rid of your cubes. Don't just say you will be more productive; show them the chapter so they can read it and decide for themselves.
Chapter 24's report on bug trackers was more interesting than I expected. The biggest 'discovery' (in my opinion) was that marking a bug as 'Duplicate' in a bug tracker causes data to be lost. The duplicate bugs often contained rich information that would have helped a programmer solve a task quicker, but because they are marked as duplicates, developers often never see it. I would love to see Bugzilla and other bug trackers get a Merge feature rather than a Duplicate resolution so reports are not lost.
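To make the idea concrete, a Merge feature could look something like the sketch below. The data model and names are my own invention for illustration, not any real tracker's API: the point is simply that the duplicate's comments get folded into the surviving report instead of being hidden behind a resolution.

```python
from dataclasses import dataclass, field

@dataclass
class Bug:
    bug_id: int
    title: str
    comments: list = field(default_factory=list)
    merged_from: list = field(default_factory=list)  # ids of absorbed reports

def merge(primary: Bug, duplicate: Bug) -> Bug:
    """Fold a duplicate report into the primary one, keeping its
    comments visible instead of burying them under 'Duplicate'."""
    primary.comments.extend(duplicate.comments)
    primary.merged_from.append(duplicate.bug_id)
    return primary

# Two reports of the same crash; merging keeps both repro notes in view.
b1 = Bug(1, "crash on save", comments=["stack trace attached"])
b2 = Bug(2, "crash when saving", comments=["repro: open, edit, save"])
merge(b1, b2)
print(b1.comments)  # both comments now live on the surviving report
```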
Chapter 26 discusses the problem many employers have with hiring college graduates: they are not able to utilize them as quickly as they would hope. I have seen various things tried, but wasn't sure what worked best. Microsoft tackled this by following some new developers around and writing down every single thing they did. Like a UI review where you can only watch a user and can't say anything as they fumble around, I was frustrated reading the reports of what the new developers did. They were assigned to projects (not very well documented), working in groups (versus on their own as in school), using internal tools they had never heard of. It is no surprise it takes some time to get up to speed; for the most part they were just really confused by a lot of things. Close mentoring, and giving the mentor actual time to mentor, is very important. The chapter includes more tips on changes companies can make.
Chapter 27 discusses how to look at and evaluate your own code. After investigating some of my own open source projects it became clear that I need to do a better job of making it possible to distinguish what each change does, but even some basic, dumb analysis of the repository showed interesting results.
Chapter 28, "Copy-Paste as a Principled Engineering Tool", was very surprising to read. I have always been an advocate of the DRY principle, but never thought about it as deeply as the authors do. They take DRY apart, classify DRY violations into different types, and end up arguing that while many are bad, a select few are good or even necessary.
Chapter 29 is "How Usable Are Your APIs?". It tells the story of designing an API at Microsoft, just how important API usability is, and the tips and tricks you can use. All throughout the chapter I kept thinking that if only the author had read The little manual of API design [pdf] it would have saved them a lot of work. If you haven't read that little pdf, I highly recommend taking the next twenty minutes to go through it.
The book is filled with many more discussion points. Each chapter can be broken down and discussed in rich detail; there is so much in this book that it is hard to give any summary that does it justice. Each section comes backed by a ton of references so you can look up the data and not just take the author's word for it. The book reminds me of Diffusion of Innovations, which also reads a bit like a textbook but is overflowing with information, where whole books could be written on just one small part. Developers, managers, testers, human resources: the reach of those who can get value out of this book is large. There were a number of sections I held beliefs about and didn't think I would learn anything from, other sections I was very curious about, and others that completely surprised me. This book now sits on my shelf next to The Mythical Man-Month (dare I say replaces it?). If you only read one technical book this year, make it this one.
Edit: Some more links including a webinar discussion on third-bit.com.