Opening Research on Open Source

Today's columnist is Carlo Daffara from Conecta. He writes:

Fresh from the wonderful OpenWorldForum in Paris, where I again had the pleasure of meeting many colleagues and researchers in fields related to open source software, be it security, adoption practices, or business aspects: experts like Dirk Riehle, Matthew Aslett, Martin Michlmayr, and several others who have greatly contributed to the current state of the art in OSS research.

It was an interesting meeting, with some peculiarities. The first is the recognition that relatively few people work in the field; I think we can count fewer than 100 publishing participants in such a group, substantially fewer than in other related fields. The second is that, despite the relative friendliness among all of us, most researchers still have to collect and process data on their own.

There are, of course, some exceptions. FLOSSmole provides large sets of data extracted from many popular source collaboration sites like SourceForge. The Free/Open Source Research Community began creating a database of papers from the community, but unfortunately the site has not been updated since February 2009. Beyond these exceptions, there is not much more. After talking with a few fellow researchers in the area, I believe it is possible to do something better.

For this reason, I propose an effort for sharing reusable data on open source, especially in areas that are on the boundaries between open source, management, and social aspects, to extend the existing efforts that are more focused on software engineering aspects, like FLOSSmole. In particular, I believe that we can share data like:

  • lists of open source companies, their markets, and the adopted business model

  • internal measurements, like data on use and reuse of OSS components in internal software production

  • parameters like conversion or monetization ratios and margins for companies using open core or freemium models

  • equations and tables for those interested in econometrics

I believe that such an effort can help maintain fresh and updated data sources, extending the work that every individual researcher or research group has to do to prepare a data-backed paper. It can even help companies interested in this kind of data to estimate markets and the receptiveness of specific business models. We will try to jumpstart the effort by releasing the data of our survey on open source business models, created in the context of the FLOSSMETRICS project. We will soon start a cleaning and reformatting activity to make the model clearly understandable for those not in the project and to foster additional contributions from external groups.

I would gladly accept suggestions on where and how to host such an effort. What kind of structure do you think would be most amenable to this kind of collaboration? Google Documents? A wiki? Any suggestion is welcome. Write me at, and let's try to apply the collaborative approach to our work!

