Analyzing the Evolution and Maintenance of ML Models on Hugging Face

Joel Castano*, Silverio Martinez-Fernandez, Xavier Franch, Justus Bogner

*Corresponding author for this work

Research output: Chapter in Book / Report / Conference proceeding > Conference contribution > Academic > peer-review

Abstract

Hugging Face (HF) has established itself as a crucial platform for the development and sharing of machine learning (ML) models. This repository mining study, which delves into more than 380,000 models using data gathered via the HF Hub API, aims to explore the community engagement, evolution, and maintenance around models hosted on HF - aspects that have yet to be comprehensively explored in the literature. We first examine the overall growth and popularity of HF, uncovering trends in ML domains, framework usage, author groupings, and the evolution of tags and datasets used. Through text analysis of model card descriptions, we also seek to identify prevalent themes and insights within the developer community. Our investigation further extends to the maintenance aspects of models, where we evaluate the maintenance status of ML models, classify commit messages into various categories (corrective, perfective, and adaptive), analyze the evolution of commit metrics across development stages, and introduce a new classification system that estimates the maintenance status of models based on multiple attributes. This study aims to provide valuable insights about ML model maintenance and evolution that could inform future model development strategies on platforms like HF.

CCS Concepts: • Information systems → Data mining; • Software and its engineering → Software maintenance tools; Software libraries and repositories.

Original language: English
Title of host publication: MSR 2024
Subtitle of host publication: Proceedings of the 21st International Conference on Mining Software Repositories
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 607-618
Number of pages: 12
ISBN (Electronic): 9798400705878
Publication status: Published - 2024
Event: 21st IEEE/ACM International Conference on Mining Software Repositories, MSR 2024 - Lisbon, Portugal
Duration: 15 Apr 2024 to 16 Apr 2024

Conference

Conference: 21st IEEE/ACM International Conference on Mining Software Repositories, MSR 2024
Country/Territory: Portugal
City: Lisbon
Period: 15/04/24 to 16/04/24

Bibliographical note

Publisher Copyright:
© 2024 ACM.

Keywords

  • maintenance
  • repository mining
  • software evolution
