GitHub Programming Language Usage Data
Software and Technology
Free
About
This dataset tracks the popularity of programming languages through their usage trends on GitHub. It compiles statistics on the languages used in public repositories, pull requests, and issues. The data provides a quantitative measure of a language's popularity based on the number of projects and files created with it, which addresses a common question in computer science and software engineering: how widely a given language is used.
Columns
- name: Represents the specific programming language, such as JavaScript or Python.
- year: Indicates the calendar year for which the data is recorded, spanning from 2011 to 2022.
- quarter: Denotes the quarter of the year, ranging from 1 to 4.
- count: Specifies the total number of issues associated with repositories that primarily use the corresponding programming language.
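The four columns above can be read with a few lines of Python. The following is a minimal sketch, assuming the CSV header row uses exactly the names name, year, quarter, and count (the header is not shown in this description). The IssueCount type and read_issue_counts function are hypothetical helpers, and the sample rows are made up for illustration; they are not taken from the dataset.

```python
import csv
import io
from typing import NamedTuple


class IssueCount(NamedTuple):
    name: str     # programming language, e.g. "Python"
    year: int     # calendar year
    quarter: int  # quarter of the year, 1 to 4
    count: int    # number of issues for that language and quarter


def read_issue_counts(handle):
    """Parse rows that have the columns name, year, quarter, count."""
    records = []
    for row in csv.DictReader(handle):
        record = IssueCount(
            row["name"], int(row["year"]), int(row["quarter"]), int(row["count"])
        )
        # The column description says quarters range from 1 to 4.
        if not 1 <= record.quarter <= 4:
            raise ValueError(f"quarter out of range: {record}")
        records.append(record)
    return records


# Made-up sample rows for illustration only; not real dataset values.
SAMPLE = """name,year,quarter,count
Python,2020,1,100
Python,2020,2,120
JavaScript,2020,1,150
"""

records = read_issue_counts(io.StringIO(SAMPLE))
print(records[0])
```

To read the actual file, the same function could be called as read_issue_counts(open("issues.csv", newline="")), assuming the file sits in the working directory.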
Distribution
The dataset is provided in a CSV format, specifically as issues.csv, with a file size of approximately 63.62 kB. It comprises 4 columns and contains 3375 valid records. The data was aggregated from Google BigQuery's public github_repos and githubarchive datasets.
Usage
This dataset is ideal for a variety of analytical and research purposes. It can be used to track the evolution of programming language popularity over time, identify emerging trends, and compare different languages based on their adoption rates on GitHub. It serves as a quantitative resource for understanding the landscape of software development and open-source contributions.
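One way to compare languages over time is to compute each language's share of the yearly issue count. The sketch below is one possible approach rather than a prescribed method: yearly_share is a hypothetical helper, the rows are made-up (name, year, quarter, count) tuples and not real dataset values, and issue share is only one proxy for popularity.

```python
from collections import defaultdict


def yearly_share(rows):
    """Return {year: {language: share of that year's total issue count}}."""
    totals = defaultdict(int)                         # year -> total count
    per_lang = defaultdict(lambda: defaultdict(int))  # year -> language -> count
    for name, year, _quarter, count in rows:
        totals[year] += count
        per_lang[year][name] += count
    # Assumes every year present has a positive total count.
    return {
        year: {lang: n / totals[year] for lang, n in langs.items()}
        for year, langs in per_lang.items()
    }


# Made-up rows for illustration only: (name, year, quarter, count).
rows = [
    ("Python", 2020, 1, 100),
    ("Python", 2020, 2, 100),
    ("JavaScript", 2020, 1, 300),
    ("Python", 2021, 1, 300),
    ("JavaScript", 2021, 1, 300),
]

shares = yearly_share(rows)
for year in sorted(shares):
    print(year, shares[year])
```

Quarterly counts are summed per year before dividing, so a language with data in only some quarters of a year is still compared against the full-year total.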
Coverage
The dataset covers programming language statistics on GitHub from 2011 to 2021. It draws on public GitHub repositories and their corresponding pull requests and issues. Because its scope is limited to publicly available data, it may not reflect language usage across all GitHub repositories, such as private ones.
License
Attribution 4.0 International (CC BY 4.0)
Who Can Use It
- Computer Science Researchers: For academic studies on software engineering trends, language evolution, and open-source project dynamics.
- Software Developers and Architects: To make informed decisions about technology stacks, learn popular languages, or understand market demand for specific programming skills.
- Data Scientists and Analysts: For data-driven insights into developer behaviour and the popularity of tools within the tech ecosystem.
- Educators: To illustrate real-world applications and the historical popularity of programming languages to students.
Dataset Name Suggestions
- GitHub Programming Language Usage Data
- Decade of GitHub Language Popularity
- Programming Language Trends on GitHub
- GitHub Repository Language Statistics
- Public GitHub Language Adoption
Attributes
Original Data Source: GitHub Programming Language Usage Data