Predicting the number of forks for open source software project
| Content Provider | ACM Digital Library |
|---|---|
| Author | Zhang, Li; Chen, Fangwei; Li, Lei; Jiang, Jing |
| Abstract | GitHub is a successful open source software platform that attracts many developers. On GitHub, developers can fork and copy repositories without asking for permission, which makes contributing to projects much easier than it has ever been. Predicting the number of forks for open source software projects is therefore valuable: the prediction can help GitHub recommend popular projects and guide developers toward projects that are likely to succeed and worthy of their contribution. In this paper, we use stepwise regression to design a model that predicts the number of forks for open source software projects. We collect datasets of 1,000 repositories through GitHub’s APIs. We use the datasets of 700 repositories to compute the weights of attributes and build the model, and the other 300 repositories to verify its prediction accuracy. Advantages of our model include: (1) Some attributes used in our model are new, because GitHub differs from traditional open source software platforms and has new features, which we use to build our model. (2) Our model uses project information from the first t months after a project's creation to predict the number of forks in month T (t < T). Users can set the combination of time parameters to satisfy their own needs. (3) Our model predicts the exact number of forks, rather than a range. (4) Experiments show that our model has high prediction accuracy. For example, when we use project information from the first 3 months to predict the number of forks in month 6 after creation, the correlation coefficient is as high as 0.992, and the median absolute difference between the predicted and actual values is only 1.8, so the predicted number of forks is very close to the actual number. Our model also has high prediction accuracy under other time parameter settings. |
| Starting Page | 40 |
| Ending Page | 47 |
| Page Count | 8 |
| File Format | |
| ISBN | 9781450329651 |
| DOI | 10.1145/2627508.2627515 |
| Language | English |
| Publisher | Association for Computing Machinery (ACM) |
| Publisher Date | 2014-05-26 |
| Publisher Place | New York |
| Access Restriction | Subscribed |
| Subject Keyword | Fork; Open source software |
| Content Type | Text |
| Resource Type | Article |
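
The abstract reports two accuracy measures on a 700/300 split of repositories: a correlation coefficient and a median absolute difference between predicted and actual fork counts. The sketch below shows one way such measures could be computed. It assumes the correlation coefficient is Pearson's, which the abstract does not state, and the function names (`split_train_test`, `pearson_correlation`, `median_absolute_difference`) are hypothetical. It does not reproduce the authors' stepwise regression model or their data.

```python
import math
import statistics


def split_train_test(items, n_train):
    """Return the first n_train items for fitting and the rest for verification."""
    return items[:n_train], items[n_train:]


def pearson_correlation(predicted, actual):
    """Pearson correlation between two equal-length sequences (assumed metric)."""
    mean_p = statistics.fmean(predicted)
    mean_a = statistics.fmean(actual)
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov / math.sqrt(var_p * var_a)


def median_absolute_difference(predicted, actual):
    """Median of |predicted - actual| over all repositories."""
    return statistics.median([abs(p - a) for p, a in zip(predicted, actual)])


if __name__ == "__main__":
    # Placeholder repository IDs; the paper's dataset is not reproduced here.
    repo_ids = list(range(1000))
    train_ids, test_ids = split_train_test(repo_ids, 700)
    print(len(train_ids), len(test_ids))

    # Toy fork counts for illustration only, not results from the paper.
    predicted = [10.0, 20.0, 30.0]
    actual = [12.0, 19.0, 35.0]
    print(pearson_correlation(predicted, actual))
    print(median_absolute_difference(predicted, actual))
```

The median is used here rather than the mean because the abstract reports a median, which is less sensitive to a few repositories with very large fork counts.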