Opendatabay APP

GitHub Programming Language Usage Data

Software and Technology

Tags and Keywords: GitHub, Programming, Languages, Software, Trends

Free

About

This dataset offers insight into the popularity of programming languages by analysing their usage trends on GitHub. It compiles statistics on the languages used in public repositories, pull requests, and issues. Popularity is measured by the number of projects and files created in each language, which addresses a common question in computer science and software engineering.

Columns

  • name: Represents the specific programming language, such as JavaScript or Python.
  • year: Indicates the calendar year for which the data is recorded, spanning from 2011 to 2022.
  • quarter: Denotes the quarter of the year, ranging from 1 to 4.
  • count: Specifies the total number of issues associated with repositories that primarily use the corresponding programming language.
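
As a rough illustration of the column layout above, the following sketch parses rows into typed records using only the Python standard library. The `IssueCount` type, the `read_issue_counts` helper, and the sample values are hypothetical and are not taken from issues.csv; the sketch also assumes the file has a header row with the four column names listed above.

```python
import csv
import io
from typing import NamedTuple


class IssueCount(NamedTuple):
    """One row of the dataset: a language's issue count for one quarter."""
    name: str
    year: int
    quarter: int
    count: int


# Hypothetical sample rows in the documented column layout.
# The numbers are made up for illustration only.
SAMPLE_CSV = """name,year,quarter,count
Python,2020,1,100
Python,2020,2,120
JavaScript,2020,1,150
"""


def read_issue_counts(handle):
    """Parse CSV rows with the columns name, year, quarter, count."""
    reader = csv.DictReader(handle)
    return [
        IssueCount(
            name=row["name"],
            year=int(row["year"]),
            quarter=int(row["quarter"]),
            count=int(row["count"]),
        )
        for row in reader
    ]


records = read_issue_counts(io.StringIO(SAMPLE_CSV))
for record in records:
    print(record)
```

To read the downloaded file instead of the sample, the same helper could be called on a handle opened with `open("issues.csv", newline="", encoding="utf-8")`, assuming the file name given in the listing.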

Distribution

The dataset is provided in CSV format as a single file, issues.csv, of approximately 63.62 kB. It comprises 4 columns and contains 3,375 valid records. The data was aggregated from Google BigQuery's public github_repos and githubarchive datasets.

Usage

This dataset suits analytical and research work on language adoption. It can be used to track the evolution of programming language popularity over time, identify emerging trends, and compare languages by their adoption on GitHub. It serves as a quantitative resource for understanding the landscape of software development and open-source contributions.
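
One way to compare languages over time is to compute each language's share of the yearly issue count. The sketch below is a minimal example under stated assumptions: the `yearly_share` function is hypothetical, and the rows are invented values in the (name, year, quarter, count) layout rather than real figures from the dataset.

```python
from collections import defaultdict

# Hypothetical (name, year, quarter, count) rows; values are illustrative only.
ROWS = [
    ("Python", 2020, 1, 100),
    ("Python", 2020, 2, 100),
    ("JavaScript", 2020, 1, 300),
    ("Python", 2021, 1, 300),
    ("JavaScript", 2021, 1, 300),
]


def yearly_share(rows):
    """Return {(year, name): fraction of that year's total issue count}."""
    per_language = defaultdict(int)  # (year, name) -> summed count
    per_year = defaultdict(int)      # year -> summed count over all languages
    for name, year, _quarter, count in rows:
        per_language[(year, name)] += count
        per_year[year] += count
    return {
        (year, name): total / per_year[year]
        for (year, name), total in per_language.items()
    }


shares = yearly_share(ROWS)
for (year, name), share in sorted(shares.items()):
    print(f"{year} {name}: {share:.1%}")
```

Using shares rather than raw counts keeps the comparison meaningful even as overall activity on GitHub grows from year to year.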

Coverage

The dataset covers programming language statistics on GitHub from 2011 to 2021. It focuses on data from public GitHub repositories, including their corresponding pull requests and issues. The scope is limited to publicly available data, so it may not reflect language usage across all GitHub repositories, including private ones.
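
The time span actually present in a copy of the file can be confirmed directly from the year and quarter columns. The following is a small sketch with a hypothetical `coverage` helper and invented rows; it reports the first and last year seen and any (year, quarter) pairs in between that have no rows.

```python
# Hypothetical (name, year, quarter, count) rows; values are illustrative only.
ROWS = [
    ("Python", 2011, 1, 10),
    ("Python", 2011, 2, 12),
    ("Python", 2011, 4, 15),
    ("Python", 2012, 1, 20),
    ("Python", 2012, 2, 22),
    ("Python", 2012, 3, 25),
    ("Python", 2012, 4, 30),
]


def coverage(rows):
    """Return (first_year, last_year, missing) for the given rows.

    `missing` lists every (year, quarter) between the first and last year
    for which no row exists.
    """
    periods = {(year, quarter) for _name, year, quarter, _count in rows}
    years = sorted({year for year, _quarter in periods})
    first, last = years[0], years[-1]
    missing = [
        (year, quarter)
        for year in range(first, last + 1)
        for quarter in range(1, 5)
        if (year, quarter) not in periods
    ]
    return first, last, missing


first_year, last_year, missing_periods = coverage(ROWS)
print(first_year, last_year, missing_periods)
```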

License

Attribution 4.0 International (CC BY 4.0)

Who Can Use It

  • Computer Science Researchers: For academic studies on software engineering trends, language evolution, and open-source project dynamics.
  • Software Developers and Architects: To make informed decisions about technology stacks, learn popular languages, or understand market demand for specific programming skills.
  • Data Scientists and Analysts: For data-driven insights into developer behaviour and the popularity of tools within the tech ecosystem.
  • Educators: To illustrate real-world applications and the historical popularity of programming languages to students.

Dataset Name Suggestions

  • GitHub Programming Language Usage Data
  • Decade of GitHub Language Popularity
  • Programming Language Trends on GitHub
  • GitHub Repository Language Statistics
  • Public GitHub Language Adoption

Attributes

Listing Stats

  • Views: 0
  • Downloads: 0
  • Listed: 12/08/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0