Background
This project is a collaboration between software engineering
researchers at the Computer Science Department (Chris Bird,
Prem Devanbu) and social scientists (Swaminathan, Hsu) at the Business
School, all at UC Davis. We gratefully acknowledge research funding from
two NSF grants, from the Human & Social Dynamics Program and the Science
of Design Program. The overall goal of this
research is to conduct a longitudinal study of the interactions between
design, social process, and product quality in Free/Libre Open
Source Software (FLOSS) systems. The specific goal of the project
described on this page is to study the process by which people become
accepted as developers in FLOSS projects.
Publications
This is a new project, so nothing has been published yet. However,
a draft paper is available on request: just send email
to <lastname of prem>@cs.ucdavis.edu. It is under review right
now, and we will be glad to put it up here as soon as we hear back. In
the meantime, the interested reader might find our earlier paper
of interest; the data extraction approaches used there
are quite similar, although the goals were different.
Graphs and Analysis
Our goal was to quantitatively model the rate at which people become
developers in FLOSS projects, using statistical hazard rate analysis.
We measure the time "at risk" as the delay between a newcomer joining
the mailing list and their first commit to the CVS repository. We
studied Apache HTTPD, Postgres, and Python.
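To make the "time at risk" idea concrete, here is a minimal sketch of a life-table style hazard estimate. It is not the analysis used in the study (that was done in Stata); the function name `hazard_by_year`, the record layout, and the choice to treat people who never commit as censored at a hypothetical `study_end` date are all assumptions made for illustration.

```python
from datetime import date

def hazard_by_year(records, study_end, max_years=6):
    """Life-table style hazard estimate per year of mailing-list tenure.

    records: list of (first_post_date, first_commit_date_or_None) pairs.
    People with no commit are treated as censored at study_end (an assumption).
    Returns a list of (year_index, at_risk, events, hazard) tuples.
    """
    durations = []
    for joined, committed in records:
        end = committed if committed is not None else study_end
        years = (end - joined).days / 365.25
        durations.append((years, committed is not None))

    table = []
    for k in range(max_years):
        # At risk: still observed as a non-developer at the start of year k.
        at_risk = sum(1 for t, _ in durations if t >= k)
        # Events: the first commit falls inside year k.
        events = sum(1 for t, became_dev in durations
                     if became_dev and k <= t < k + 1)
        hazard = events / at_risk if at_risk else None
        table.append((k, at_risk, events, hazard))
    return table
```

Each row gives, for one year of tenure, how many people were still at risk and what fraction of them made a first commit during that year. With real data, the at-risk counts shrink quickly, which is why estimates for later years rest on few samples.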
Here are some graphs showing the rate at which people become
developers; the x axis is years since the person first appeared on the
developer mailing list. It should be noted that very few people who
have not yet become developers stick around past the 4 year mark, so the
data at the upper end of the x axis is based on very few samples.
The curves shown are simply smoothed raw data, not fitted to any model.
The rate is low, because most people don't become developers.
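The page does not say which smoother was used for the curves. As one possible illustration only, the sketch below applies a centered moving average to a list of raw per-interval rates; the function name `moving_average` and the window size are hypothetical choices, not the study's method.

```python
def moving_average(values, window=3):
    """Centered moving average; the window shrinks near the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed
```

A smoother like this only averages neighbouring raw values; it imposes no parametric shape, so a rise-then-fall pattern in the output has to come from the data rather than from a fitted model.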
Our theoretical explanation for the non-monotonic behaviour of the rate
is complex, and is discussed in the paper (see Publications
above). Very briefly, the non-monotonicity results from 3 conflicting
effects: two that increase with time (project-specific skill and social
status) and one that decreases (level of technical commitment).


Figure 1: Rate at which people become developers, as
a function of their tenure on the mailing list. The leftmost picture
is for Apache HTTPD, the middle is Postgres, and the rightmost is
Python. Note the striking similarity.
We have also made available the trace of the analysis (using the Stata
data analysis package) and the descriptive statistics (including
several other predictive measures that were not in the scope of the
current paper) for Apache HTTPD, Postgres, and Python. In all the
analysis it should be noted that we are considering an
entire population at risk.
Our goal was to quantitatively evaluate these 3 hypotheses:
Hypothesis 1: Likelihood of attaining developer status will rise with
tenure, peak at some point, and then decline.
Hypothesis 2: Demonstration of skill level, such as patch submissions
and/or acceptances, will increase the likelihood of becoming a
developer.
Hypothesis 3: Social status will influence the rate at which a non-developer
becomes a developer.
Additional data will be made available subsequent to the review
process.