Opendatabay APP

ML Model Popularity and Author Data

Data Science and Analytics

Tags and Keywords

Machine, Learning, AI, Models, Downloads

Free

About

A catalogue of models sourced from Hugging Face, Inc., a French-American company known for its open-source community and for the tools and resources it provides to build, train, and deploy machine learning models. The data gives insight into the breadth of models available and how heavily they are used, measured through download and like counts. It was collected by scraping the platform's models page to answer a simple question: how many models are hosted there, and how popular are they?

Columns

  • author: Identifies the creator or owner of the model, exhibiting approximately 159,000 unique values.
  • downloads: The volume of model downloads, with a maximum recorded value reaching over 75 million.
  • gated: A flag indicating whether access to the model is restricted, taking the values False (about 99% of records), auto, or Other.
  • id: The unique identifier assigned to the model, with approximately 597,000 unique values, one per record.
  • lastModified: The UTC timestamp denoting the most recent modification date of the model.
  • likes: The total number of likes received by the model, peaking at around 10.6 thousand.
  • authorData.type: Specifies whether the author is categorized as a standard user (88%) or an organization (12%).
  • authorData.isPro: A boolean field indicating if the author maintains a professional account (approximately 3% True).
  • authorData.isHf: A boolean field indicating whether the author is affiliated with Hugging Face (approximately 1% True).
  • pipeline_tag: Denotes the machine learning pipeline associated with the model, noting that nearly half of the records are missing this tag.
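The column layout above can be sketched as a small pandas loader. This is a minimal sketch, not the publisher's own tooling: the function names `load_models` and `missing_share` are hypothetical, the two sample rows are invented purely for illustration, and the timestamp format and the `user`/`org` labels are assumptions about how the CSV encodes these fields. Only the column names come from the listing.

```python
import io
import pandas as pd

# Column names as given in the listing above.
EXPECTED_COLUMNS = [
    "author", "downloads", "gated", "id", "lastModified", "likes",
    "authorData.type", "authorData.isPro", "authorData.isHf", "pipeline_tag",
]

def load_models(source):
    """Read the CSV, check the expected columns, and parse timestamps as UTC."""
    # 'gated' mixes values such as False and auto, so keep it as text.
    df = pd.read_csv(source, dtype={"gated": "string", "pipeline_tag": "string"})
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")
    df["lastModified"] = pd.to_datetime(df["lastModified"], utc=True, errors="coerce")
    return df

def missing_share(df, column):
    """Fraction of records with no value in the given column."""
    return float(df[column].isna().mean())

# Invented two-row sample (NOT real data) standing in for Models.csv.
SAMPLE = (
    "author,downloads,gated,id,lastModified,likes,"
    "authorData.type,authorData.isPro,authorData.isHf,pipeline_tag\n"
    "alice,10,False,alice/m1,2024-04-17T10:00:00.000Z,2,user,False,False,text-generation\n"
    "acme,500,auto,acme/m2,2021-02-26T08:30:00.000Z,7,org,False,False,\n"
)
df = load_models(io.StringIO(SAMPLE))
share_without_tag = missing_share(df, "pipeline_tag")
```

With the real file, `load_models("Models.csv")` would be the equivalent call, and `missing_share(df, "pipeline_tag")` gives a quick check of the "nearly half missing" figure quoted for that column.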

Distribution

The dataset is distributed as a single file, Models.csv (74.58 MB), containing approximately 597,000 valid records across 10 columns. Updates are expected monthly.

Usage

This data is suited to analysing trends in machine learning model creation and adoption within the open-source community. Researchers can use it to assess model popularity, treat download counts as a proxy for adoption, and track engagement through likes. It also supports analysis of author profiles, such as comparing contributions from individual users and organizations or examining how common professional accounts are.
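The analyses described above can be sketched with a pair of pandas helpers. This is an illustrative sketch only: the function names `summarise_by_author_type` and `top_models` are hypothetical, and the labels `user` and `org` are assumed values for authorData.type, since the listing does not state how that column is encoded.

```python
import pandas as pd

def summarise_by_author_type(df):
    """Model count, total downloads, and mean likes per author type."""
    return (
        df.groupby("authorData.type")
          .agg(
              models=("id", "count"),
              total_downloads=("downloads", "sum"),
              mean_likes=("likes", "mean"),
          )
          .sort_values("total_downloads", ascending=False)
    )

def top_models(df, n=10):
    """The n most-downloaded models with their author and like count."""
    return df.nlargest(n, "downloads")[["id", "author", "downloads", "likes"]]

# Invented rows (NOT real data); 'user' and 'org' are assumed labels.
demo = pd.DataFrame({
    "id": ["a/m1", "a/m2", "acme/m3"],
    "author": ["a", "a", "acme"],
    "downloads": [10, 20, 100],
    "likes": [1, 3, 8],
    "authorData.type": ["user", "user", "org"],
})
summary = summarise_by_author_type(demo)
best = top_models(demo, n=1)
```

Download counts are heavily skewed (the listing reports a maximum above 75 million), so medians or log-scaled values may be more informative than means when comparing groups on the full file.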

Coverage

The data covers the model catalogue available on huggingface.co. The last-modification dates of the records range from 26 February 2021 to 17 April 2024.
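The stated date range can be checked, or used as a filter, with a short sketch like the one below. The function names are hypothetical, the sample timestamps are invented, and treating the end date as inclusive of the whole day is an assumption rather than something the listing specifies.

```python
import pandas as pd

def last_modified_range(df):
    """Earliest and latest lastModified timestamps, parsed as UTC."""
    ts = pd.to_datetime(df["lastModified"], utc=True, errors="coerce")
    return ts.min(), ts.max()

def within_window(df, start="2021-02-26", end="2024-04-17"):
    """Rows whose lastModified falls between start and end (end day inclusive)."""
    ts = pd.to_datetime(df["lastModified"], utc=True, errors="coerce")
    lo = pd.Timestamp(start, tz="UTC")
    hi = pd.Timestamp(end, tz="UTC") + pd.Timedelta(days=1)
    # Unparseable timestamps become NaT and are excluded by the comparison.
    return df[(ts >= lo) & (ts < hi)]

# Invented timestamps (NOT real data) to exercise the helpers.
demo = pd.DataFrame({
    "id": ["m1", "m2", "m3"],
    "lastModified": [
        "2021-02-26T00:00:00Z",
        "2024-04-17T23:59:59Z",
        "2025-01-01T00:00:00Z",
    ],
})
earliest, latest = last_modified_range(demo)
in_scope = within_window(demo)
```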

License

CC0: Public Domain

Who Can Use It

The dataset is intended for data scientists, machine learning engineers, and technology industry analysts who need metrics on the status and utilisation of artificial intelligence models. It is also valuable for academics studying the growth and distribution of open-source resources in the field of AI.

Dataset Name Suggestions

  • Hugging Face Model Catalogue Metrics
  • ML Model Popularity and Author Data
  • Open Source AI Model Utilization
  • HuggingFace Model Scrape

Attributes

Listing Stats

  • Views: 1
  • Downloads: 0
  • Listed: 17/10/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
