The New York Times is reporting that the Healthcare.gov website contains "about 500 million lines of software code." This number, attributed to "one specialist" and widely repeated across the interwebs, seems incredibly far-fetched (even assuming a large fraction of that number includes standard libraries). If this is an accurate estimate, it would truly be staggering (as this fascinating infographic vividly reveals). I realize StackExchange:Programmers isn't Snopes.com, but I'd like to find out if anyone here believes this is even remotely possible. I'd like to know if there is a plausible system of accounting (using examples from publicly available data, if possible) that could lead someone to conclude that such an estimate is within the realm of reason. How could a codebase (by any measure) sum up to such an exorbitant number of lines of code?
1 Answer
I'm inclined to believe it. For a very generous definition of "the Healthcare.gov website."
The software I work on has almost 1.1 million lines checked in to trunk (according to Subversion's stats), and that's with just 4 in-house developers. The largest single chunk of that (about a quarter of a million lines) is simply auto-generated code from including a reference to eBay's web service. Add another 150k for the various other auto-generated web services combined.
Our database is relatively small, and despite my best efforts, the large majority of it is still using DBF tables. The portion of it that's using Entity Framework is another 11k lines. The web database's LINQ to SQL project weighs in at 28k. The sum total of all the JavaScript is somewhere around 46k (including both minified and unminified versions in that total).
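To make that accounting concrete, here is a minimal sketch that simply adds up the chunks itemized above and compares them with the trunk total. All figures are the rounded numbers quoted in this answer ("almost 1.1 million" is taken as 1,100,000 and "about a quarter of a million" as 250,000), and the dictionary labels and the `tally` helper are illustrative names, not anything from the actual repository.

```python
# Approximate figures quoted in this answer, in lines.
# "almost 1.1 million" is rounded up to 1,100,000 (an assumption).
TOTAL_TRUNK = 1_100_000

# Labels are illustrative; the counts are the rounded numbers given above.
itemized = {
    "eBay web service (auto-generated)": 250_000,
    "other auto-generated web services": 150_000,
    "Entity Framework portion": 11_000,
    "LINQ to SQL project": 28_000,
    "JavaScript (minified + unminified)": 46_000,
}


def tally(chunks, total):
    """Return (sum of the itemized chunks, their share of the total)."""
    subtotal = sum(chunks.values())
    return subtotal, subtotal / total


subtotal, share = tally(itemized, TOTAL_TRUNK)
print(f"itemized: {subtotal:,} lines ({share:.0%} of trunk)")
# prints "itemized: 485,000 lines (44% of trunk)"
```

So, on these rounded numbers, a bit under half of the trunk is accounted for by just the chunks listed here.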
Again, this is 4 developers over something like 10 years (although it only really started exploding a few years ago). It doesn't include much in the way of unit tests, database scripting (we prefer code), redundancy, or really fancy HTML5 graphical effects.
Add 3-5 subcontractors, each with their own external references and included third-party libraries, plus 10-50 times the developers we have, include all the database scripting we avoid, and so on, and I can easily see it getting that big. Especially if you start including documentation and/or heavily commenting the code. I interviewed for an FAA contractor once where they told me that their comment-to-code ratio was ideally 1:1.
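As a rough back-of-the-envelope sketch of the scaling described above, the snippet below multiplies the current line count by the 10-50x developer range and optionally applies a 1:1 comment-to-code ratio. The linear scaling with team size is purely an assumption for illustration, as is treating the whole 1.1 million lines as code; `scaled_lines` is a hypothetical helper, and the sketch deliberately leaves out the subcontractors' external references, third-party libraries, database scripts, and documentation.

```python
# Approximate trunk size quoted earlier in this answer.
LINES_NOW = 1_100_000


def scaled_lines(dev_multiplier, comment_ratio=0.0):
    """Scale the current line count linearly with team size (an assumption),
    then add comment lines at the given comment-to-code ratio."""
    code = LINES_NOW * dev_multiplier
    return int(code * (1 + comment_ratio))


# 10x the developers, no extra comments.
low = scaled_lines(10)
# 50x the developers, with a 1:1 comment-to-code ratio.
high = scaled_lines(50, comment_ratio=1.0)

print(f"low:  {low:,} lines")
print(f"high: {high:,} lines")
```

With these inputs the range runs from 11 million to 110 million lines. Since the sketch only covers team size and comments, it is a partial tally of the factors listed above, not an estimate of Healthcare.gov itself.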