Abstract
Context: Technical leverage is the ratio between the dependencies (other people's code) and the own code of a software package. It has been shown to be useful for characterizing the Java ecosystem, and studies on the NPM ecosystem are also available. Objective: Using this metric, we aim to analyze the Python ecosystem, how it evolves, and how secure it is, as a developer would perceive it when deciding whether to adopt or update a library. Method: We collected a dataset of the top 600 Python packages (corresponding to 21,205 versions) and used a number of innovative approaches for its analysis, including a two-part statistical model to deal with excess zeros and a closed-form mathematical formulation to estimate vulnerabilities, which we confirm with bootstrapping on the actual dataset. Results: Small Python package versions have a median technical leverage of 6.9x their own code, while bigger package versions rely on dependency code amounting to a tenth of their own (median leverage of 0.1). In terms of evolution, Python packages tend to keep a stable technical leverage over time (once highly leveraged, always leveraged). On security, the chance of getting a safe package version when choosing a package is actually better than previous research has shown based on the ratio of safe package versions in the ecosystem. Conclusions: Python packages ship a lot of other people's code and tend to keep doing so. However, developers have a good chance of choosing a safe package version.
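As a minimal sketch of the ratio defined in the abstract, the snippet below computes technical leverage from two size measurements. Measuring size in lines of code, the function name `technical_leverage`, and the sample numbers are assumptions made for illustration; they are not taken from the paper's dataset or tooling.

```python
def technical_leverage(dependency_loc: int, own_loc: int) -> float:
    """Return the ratio of dependency code size to own code size.

    Assumption: both sizes are measured in lines of code (LOC).
    """
    if own_loc <= 0:
        raise ValueError("own_loc must be positive")
    if dependency_loc < 0:
        raise ValueError("dependency_loc must be non-negative")
    return dependency_loc / own_loc


# Hypothetical package versions, chosen only to mirror the medians
# quoted in the abstract (6.9 for small, 0.1 for bigger versions).
small_version = technical_leverage(dependency_loc=6_900, own_loc=1_000)
large_version = technical_leverage(dependency_loc=10_000, own_loc=100_000)

print(f"small package version: {small_version:.1f}")
print(f"large package version: {large_version:.1f}")
```

A leverage above 1 means the package version ships more dependency code than own code; a leverage below 1 means the opposite.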
| Original language | English |
|---|---|
| Article number | 139 |
| Pages (from-to) | 1-31 |
| Number of pages | 31 |
| Journal | Empirical Software Engineering |
| Volume | 28 |
| Issue number | 6 |
| Early online date | 13 Oct 2023 |
| DOIs | |
| Publication status | Published - Nov 2023 |
Bibliographical note
Funding Information: We would like to thank Ivan Pashchenko for useful discussions on technical leverage and Batbayar Narantsogt for providing the initial code base for extracting libraries for Python. We also thank the anonymous reviewers whose comments greatly helped to improve the paper. Any remaining mistakes are ours. This work was partly funded by the EU under the H2020 Program AssureMOSS (Grant n. 952647).
Publisher Copyright:
© 2023, The Author(s).
Funding
This work was partly funded by the EU under the H2020 Program AssureMOSS (Grant n. 952647).
Keywords
- Dependencies
- Empirical analysis
- Python ecosystem
- Security
- Software libraries
- Technical leverage
- Vulnerabilities