I'm analyzing what successful open source projects have in common (if you can point me to good research on the topic, please email me!), and came across an interesting academic paper (purchase required) on the subject. Here are some of its findings. None will surprise those who have followed Joel West and Siobhan O'Mahony's work, but they may surprise you if you still believe in the magical open source community, numbered in the millions, anxiously waiting to contribute code to your project:
Undoubtedly, there are problems with the data set. But regardless of how many holes one can point out in these researchers' work, it is consistent with other academic work on the subject. The myth of a global, expansive open source development community is just that: a myth. The reality is more like severe clumping of development around Linux, Apache, and very few other projects. (Even JBoss and MySQL, as I've written before, are overwhelmingly developed by those respective companies, and not by a crowd of outside developers. 95% and 85%, respectively, I believe.)
- It is interesting to compare horizontal applications (applications used to build other software, where the end user is required to program and is likely a software professional) with vertical ones (applications used by an end user, with no programming required). Horizontal applications (categories Internet, System, Software development, Communications, Database, Security) account for 72%. The researchers interpret this data as evidence that the OS [open source] community is largely oriented toward producing applications for itself.
[No news here. The difference only comes when we add commercial open source to the mix. Once we bring in the commercial entities, the only thing bounding open source development is the success of the companies' business models.]
- Open source projects [at least, as housed on FreshMeat (the source for the researchers' data set), which tends to host newer and, hence, smaller projects] tend to be small (82% - suitable to one or two developers) and young. 60% of open source projects (as measured in February 2001 - admittedly, an ancient data set) had been in development less than a year, 22% from one to two years, 15% two to three years, and around 2% more than three years.
[For the full set of data, please visit AC/OS.]
- The GPL prevails, at 77% of projects. The LGPL is second at 6%, and BSD trails in third at 5%. Each of the other licenses accounts for between 1% and 3%.
- C is the most used programming language (41.5%), followed by C++ and Perl (~14% each), then PHP, Java, and Python (5% - 8%).
- 49% of projects have only one person developing the application; 15% have two to three developers; 20% have four to 10; 9% have 11 to 20; and 6% have more than 20. Clearly, this calls into question the ideal of "community" in open source. Last time I checked, even with my multiple personalities, I'm not a community.
- The researchers assumed that larger projects would have more developers. Wrong. "Instead we find that there is no meaningful increase of size with developers." Apparently, "[fewer] developers produce the same amount of code...." This isn't surprising - I take it as a given that a minority of people in any company/project/etc. will produce a majority of code/product/whatever. The interesting thing to note is that an open source project can be wildly successful without a massive community contributing code to it. The key is code quality and contributor productivity.
- Related to the above, 73% of projects have only one stable developer. 10% more projects have two stable developers (defined as a developer with a "prolonged collaboration with the project"). That leaves just 17% of projects that have more than two committed developers.
- Added to the above, the researchers found that 55% of projects have no transient developers at all ("Transient" defined as those providing at most one patch in the development of any section of a project or up to three patches to the same part of the code base). Of the remainder, 9% have one transient developer, 8% have two, and 20% have between two and 10.
- How does a project attain a larger status, such that it can sustain 10 or more developers? The researchers find that such conditions include "a defined and clear architecture and an adequately appealing function offered; both conditions require a meaningful size of code." This means that the initial developer(s) must be committed to seeing the project through its young, immature phase. But this isn't surprising, as the same principle holds true for religious movements, political uprisings (the United States is one example), and various other projects. Bluntly put, you need a fanatic or two (in the nicest sense of the word) at the beginning to blindly push forward against all odds. Open source software development appears to be no different.
- 80% of projects have fewer than 11 users (measured in terms of those who "subscribe" to a project - i.e., those who register and download a piece of code).
- 15% of projects are actively developed - the remainder (85%) wither and die on the vine or are, at best, "lethargically" developed. Over the six months measured, 90% of the projects on FreshMeat did not change.
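For anyone who wants to check the headcount arithmetic in the bullets above, here is a minimal Python sketch. The percentages are simply the ones quoted in this post; the variable names are illustrative and do not come from the paper, and the "unaccounted" figures only show where the quoted bands fall short of 100%, not where those projects actually sit.

```python
# Percentages as quoted in the post (FreshMeat data set).
# All names here are illustrative, not taken from the paper.
developers_per_project = {
    "1": 49,
    "2-3": 15,
    "4-10": 20,
    "11-20": 9,
    "more than 20": 6,
}
stable_developers = {"1": 73, "2": 10}
transient_developers = {"0": 55, "1": 9, "2": 8, "2-10": 20}


def unaccounted(distribution):
    """Return the share (in percentage points) not covered by the quoted bands."""
    return 100 - sum(distribution.values())


# Share of projects with at most three developers.
small_teams = developers_per_project["1"] + developers_per_project["2-3"]

# Share of projects with more than two stable developers (the "17%" in the post).
more_than_two_stable = unaccounted(stable_developers)

# Gaps left by the quoted bands.
developer_gap = unaccounted(developers_per_project)
transient_gap = unaccounted(transient_developers)

print(f"Projects with three or fewer developers: {small_teams}%")
print(f"Projects with more than two stable developers: {more_than_two_stable}%")
print(f"Developer bands leave {developer_gap}% unaccounted for")
print(f"Transient-developer bands leave {transient_gap}% unaccounted for")
```

The one-point gap in the developer bands looks like ordinary rounding. The larger gap in the transient-developer bands is not explained by the figures quoted here.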
Does this mean open source is a sham? Not at all. It is still a great way to engage prospective customers and incorporate them into one's development. And it's a great way to replicate Google's "perpetual beta" development methodology, which lets Google innovate and deliver code faster because it artificially sets expectations low.
It's also a reminder that companies engaging in open source should not delude themselves into thinking that some amorphous community will do their work for them. There is no community to do this. Whether one is a company or an individual developer, the onus of code production is on you. The community only comes when the project initiator has done the grueling, constant work to make the project worthwhile.
In this way, open source really isn't so different from closed source software.
Posted by Matt Asay on September 27, 2005 03:49 PM












