Capital Region Urban Air and Weather Data
Data Science and Analytics
Free
About
This dataset contains air quality measurements for Islamabad, Pakistan, taken between June 2019 and March 2023. It was derived from information published on the Pakistan Environmental Protection Agency (EPA) website. The source material was primarily in PDF format, sometimes as structured tables and sometimes as images of tables. Structured tables were extracted with tabula-py and PyPDF2, and image-based tables were extracted with OCR, specifically the Google Cloud Vision API. The resulting files are provided to be as accurate and complete as possible and cover the core variables needed for environmental analysis.
Columns
The dataset provides several key environmental metrics:
- Date: The date of the measurement.
- Temperature: Air temperature, measured in Celsius. For a sample month (August 2019), the mean temperature was 29.3 degrees Celsius.
- Humidity: Measured as a percentage (%). For a sample month, humidity ranged from 43.4% to 71%, with a mean of 60.9%.
- NO2: Nitrogen Dioxide concentration, reported in micrograms per cubic meter (µg/m³).
- SO2: Sulphur Dioxide concentration, reported in micrograms per cubic meter (µg/m³).
- PM2.5: Concentration of fine particulate matter (particles 2.5 micrometers or smaller in diameter), reported in micrograms per cubic meter (µg/m³).
- Year: Included in the merged file (final_data.csv) for easy time-series differentiation.
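As a minimal sketch of how these columns might be read and checked, the snippet below parses CSV text with Python's standard library and counts, per column, how many values are present and numeric. The header names (Date, Temperature, Humidity, NO2, SO2, PM2.5, Year) are assumed from the list above and may not match the actual headers in final_data.csv; the sample rows are made up for illustration and are not real measurements.

```python
import csv
import io

# Assumed column names; the real headers in final_data.csv may differ.
NUMERIC_COLUMNS = ["Temperature", "Humidity", "NO2", "SO2", "PM2.5"]

# Made-up sample rows for illustration only, not real measurements.
SAMPLE_CSV = """Date,Temperature,Humidity,NO2,SO2,PM2.5,Year
2019-08-01,29.0,60.0,20.0,10.0,30.0,2019
2019-08-02,30.0,,21.0,11.0,n/a,2019
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by header name."""
    return list(csv.DictReader(io.StringIO(text)))


def count_valid(rows, columns):
    """Return {column: number of rows whose value parses as a float}."""
    counts = {name: 0 for name in columns}
    for row in rows:
        for name in columns:
            try:
                float(row.get(name) or "")
            except ValueError:
                continue  # empty, missing, or non-numeric cell
            counts[name] += 1
    return counts


rows = load_rows(SAMPLE_CSV)
valid_counts = count_valid(rows, NUMERIC_COLUMNS)
print(valid_counts)
```

To run the same check on the real file, the text could be read with `open("final_data.csv", newline="")` and passed to `csv.DictReader` directly, after adjusting the column names to match the file's header row.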
Distribution
The primary data deliverable is a merged CSV file named "final_data.csv", which aggregates all observations into a single structure. The raw data was extracted from EPA documents using multiple tools. Monthly files are also provided: a sample file (AQR-Aug2019.csv) has 6 columns, and its validation statistics for Date, Temperature, Humidity, and the pollutants report 100% valid records, indicating no missing or mismatched observations for that month. The dataset is expected to receive updates on a quarterly basis.
Usage
This data is ideal for several analytical and research applications:
- Investigating trends in air pollution in urban areas.
- Modelling the relationship between weather and climate factors (temperature, humidity) and pollutant concentrations.
- Supporting public health studies related to air quality impacts.
- Informing governmental policy decisions concerning environmental protection and regulation.
- Academic research in Atmospheric Science and Earth and Nature studies.
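One simple starting point for the weather-versus-pollutant modelling mentioned above is a Pearson correlation between two columns. The sketch below implements it with the standard library only. The input values are made up and deliberately perfectly linear so the result is easy to verify; they say nothing about the actual relationship in the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up values for illustration only, not taken from the dataset.
temperature = [10.0, 15.0, 20.0, 25.0, 30.0]
pm25 = [90.0, 80.0, 70.0, 60.0, 50.0]

r = pearson(temperature, pm25)
print(round(r, 6))
```

With real data, rows that have a missing value in either column would need to be dropped before calling `pearson`, and a correlation alone would not establish a causal link.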
Coverage
This dataset focuses geographically on Islamabad, Pakistan. The time range of the observations extends from June 2019 to March 2023. This specific range was selected because it represents the period with the most unbroken data collection available from the original source.
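To check how continuous the coverage is in practice, one could list the calendar months in the stated range that have no observations. The sketch below assumes dates can be parsed into `datetime.date` objects (ISO format is used here as an assumption; the actual date format in the files is not stated), and the sample dates are hypothetical.

```python
from datetime import date


def months_between(start, end):
    """Yield (year, month) tuples from start to end, inclusive."""
    year, month = start
    while (year, month) <= end:
        yield (year, month)
        month += 1
        if month > 12:
            year, month = year + 1, 1


def missing_months(dates, start, end):
    """Return the (year, month) pairs in the range with no observation."""
    seen = {(d.year, d.month) for d in dates}
    return [ym for ym in months_between(start, end) if ym not in seen]


# Stated coverage of the dataset: June 2019 to March 2023.
expected = list(months_between((2019, 6), (2023, 3)))

# Hypothetical observation dates, assuming ISO-formatted strings.
sample_dates = [date.fromisoformat(s) for s in ("2019-06-01", "2019-08-15")]
gaps = missing_months(sample_dates, (2019, 6), (2019, 8))
print(len(expected), gaps)
```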
License
CC0: Public Domain
Who Can Use It
Environmental researchers, government agencies, urban planners, public health analysts, and students focused on atmospheric science. Given the Public Domain license, any user may access and utilise this data in their analyses without restriction.
Dataset Name Suggestions
- Islamabad Ambient Air Quality Metrics (2019-2023)
- Pakistan EPA Environmental Pollutant Readings
- Capital Region Urban Air and Weather Data
- Islamabad Air Quality Index Parameters
Attributes
Original Data Source: Capital Region Urban Air and Weather Data
