Software Development Estimates, Where Do I Start?

For some reason many people discuss the problem of estimating software development timeframes without properly understanding the issue. There is a famous answer on Quora that exemplifies this. Lots of people like that story, even though it’s inaccurate and misguided. “Software development” is such a huge endeavor that it doesn’t even make sense to talk about estimates without an understanding of the kinds of problems software can solve. To put this in context, let’s forget software for a second and look at a few tangible problems of different magnitudes.

  • You are a medical researcher. A new disease makes headlines. It’s a virus. It seems to be spreading through sexual contact. How long is it going to take you to find a cure?
  • It’s 1903, and you’ve just built the first airplane. The government wants you to build a spaceship to fly to the moon. How long will it take?
  • You are in charge of a construction company that has built hundreds of buildings in your metropolitan area. I want you to build me a twenty-story apartment complex very similar to the one you just finished on the other side of town. When will it be done?
  • You own a chair factory that produces 1000 chairs per month. I need 3500 chairs. Can you make them in a month?

The four questions above are radically different in nature. The first two involve significant unknowns, and require scientific or technological breakthroughs. The other two, not so much. Meta-question: does software development look like the first or the second kind? Another meta-question: what kind of software development are we talking about?

Let’s focus on the construction company. People have been constructing buildings for centuries. There’s relatively little variation in the effort and costs of putting up vanilla high-rises (we’re not talking about Burj Khalifa). Of course there is uncertainty: economic conditions could change, suppliers could go out of business. The new mayor could have a personal vendetta against your company because your big brother bullied him in school. All those things have happened before, perhaps in combination. Let’s say you can give me an estimate of 15 to 24 months with 98% confidence based on past data. Sounds good to me.
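As a rough sketch of what "based on past data" could mean, here is one way to turn a list of past project durations into a range. The durations below are invented for illustration, and `range_98` is a hypothetical helper name; a real estimator would need far more history and would account for the risks listed above.

```python
import statistics

# Hypothetical durations, in months, of past projects similar to the new one.
# These numbers are made up for illustration only.
past_durations = [16, 17, 18, 18, 19, 19, 20, 20, 21, 22, 23, 24]


def range_98(durations):
    """Return the 1st and 99th percentiles of past durations.

    Roughly 98% of past outcomes fall between the two values.
    """
    # n=100 yields 99 cut points: the first is the 1st percentile,
    # the last is the 99th. "inclusive" keeps them within min..max.
    cuts = statistics.quantiles(durations, n=100, method="inclusive")
    return cuts[0], cuts[-1]


low, high = range_98(past_durations)
print(f"{low:.1f} to {high:.1f} months")
```

With only a dozen data points the result is barely distinguishable from the minimum and maximum of the sample, which is a reminder that the confidence in such a range comes from the amount of history behind it.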

If buildings were software you could break out a template, customize it a bit, install it on my plot of land in the cloud, the end. There are companies doing this for software, for example Bitnami (disclosure: I’m an investor). The process is so quick that you don’t even need to ask for a time estimate. You just see it happen in real time.

Let’s imagine that software could not be cloned at almost zero cost, just as physical things cannot. Most software developers would be like monks copying manuscripts. If you have been handwriting pages for long enough, you can confidently tell me that it will take you at least 15 years to produce a high-quality copy of the entire Bible (it can be done 4x faster today, not sure about the quality though). You could get sick, or suffer interruptions. However, the absolute number of hours you’ll need is well known. There is a type of software development that works like this: porting old applications to a new language (say COBOL to Java back in the day).

Of course, the more rewarding problems in software development look nothing like this. I enjoy trying to solve problems that nobody has solved before. The malleability of software makes it easy to explore an open problem. Some problems are deceptive: at first they may look like building a house, and as you discover unknowns they sometimes mutate to resemble a quest for an AIDS vaccine. If a problem is solvable with software, it may take weeks or months to come up with an imperfect solution. It will probably take years to build one that’s scalable and robust. The meta-problem is that some problems cannot be solved with software, or at least not yet, or not by me / my team / my company. I might give you an estimate that would look like:

  • less than a month: 30% chance
  • between a month and a year: 40% chance
  • never: 30% chance
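
An estimate like this is a small probability distribution, and it can be written down as one. The sketch below reads the three lines as mutually exclusive buckets (the middle one meaning "more than a month but less than a year"), which is the reading under which they add up to 100%. The dictionary keys and the helper name are my own labels for illustration.

```python
# Estimate expressed as percentages over mutually exclusive outcomes.
# Integers are used so the sums are exact.
estimate_pct = {
    "under a month": 30,
    "a month to a year": 40,
    "never": 30,
}

# A complete estimate has to account for every outcome.
assert sum(estimate_pct.values()) == 100


def chance_done_within_a_year(estimate):
    """Add up the buckets in which the problem gets solved within a year."""
    return estimate["under a month"] + estimate["a month to a year"]


print(chance_done_within_a_year(estimate_pct), "% chance of finishing within a year")
```

Writing the estimate this way makes the uncomfortable part explicit: the "never" bucket is as large as the optimistic one.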

Another kind of software development sits somewhere in the middle, and it may be the one that generates the most software jobs. Usually an organization wants a solution to a problem that has already been solved by others (e.g. building a cluster manager for a social network graph). Even though you don’t have access to the design or the code, you have an idea of what the solutions look like. You don’t know what key issues they ran into, how good the teams were, how lucky they got. Still, you know the problem can be solved by companies that look like yours for a reasonable cost. There is still quite a bit of uncertainty: you can estimate small tasks reasonably well, but you cannot predict which “week-long” task might expand to several months (e.g. it turns out that no open source tool solves X in a way that works for us, we’ll have to write our own).
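
To see how much a single expanding task matters, here is a toy calculation. The number of tasks and the durations are invented for illustration; the point is only the arithmetic.

```python
def total_weeks(task_weeks):
    """Total duration if the tasks are done one after another."""
    return sum(task_weeks)


# Hypothetical plan: ten tasks, each estimated at one week.
planned = [1] * 10

# Same plan, except one "week-long" task turns out to need 12 weeks
# (say, because a tool has to be written in-house).
actual = list(planned)
actual[3] = 12

overrun = total_weeks(actual) / total_weeks(planned)
print(f"planned {total_weeks(planned)} weeks, actual {total_weeks(actual)} weeks, "
      f"{overrun:.1f}x the plan")
```

Nine of the ten estimates were exactly right, and the project still took more than twice as long as planned.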

The gist of why estimates are hard: every new piece of software is a machine that has never been built before. The process of describing how the machine works is the same as building the machine. The more your machine looks like existing ones, the easier it is to estimate its difficulty. Of course it won’t be exactly the same machine; that can happen with a chair but not with software. On the other hand, you may want to boldly build what no one has built before. In that case, you’ll most likely adjust your scope so that you can build something that makes sense for the timeframes you work with. The solution to the original problem might take iterations over generations. Not necessarily generations of humans, perhaps generations of product versions, teams or even companies. You may set out to put a man on the moon, and your contribution would be the first airplane.
