Water Contamination and Safety Data
Public Health & Epidemiology
Free
About
Access to clean and safe drinking water is a fundamental human right and a cornerstone of public health. This dataset offers a comprehensive collection of water quality measurements aimed at assessing potability and identifying factors contributing to unsafe drinking water. It serves as a valuable resource for public health research, environmental studies, and machine learning applications focused on water safety classification.
Dataset Features
Each row in the dataset represents water quality measurements for a specific water body, with the following columns:
- pH: Measures the acidity or alkalinity of water. The safe range for drinking water is typically 6.5 to 8.5.
- Hardness: Concentration of dissolved calcium and magnesium; higher values mean harder water.
- Solids (TDS): Total Dissolved Solids indicate the level of dissolved minerals and salts in the water.
- Chloramines: Level of chloramine disinfectants used for water treatment.
- Sulfate: Concentration of sulfate compounds, which can influence water taste and safety.
- Conductivity: Electrical conductivity, indicating mineral content.
- Organic Carbon: Amount of carbon bound in organic compounds in the water (total organic carbon).
- Trihalomethanes: Byproducts formed when chlorine used for disinfection reacts with organic matter; elevated levels can affect safety.
- Turbidity: Measure of water clarity; higher turbidity can indicate contamination.
- Potability: Binary target variable (1 = Potable, 0 = Not Potable).
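As a rough illustration of how the pH guideline above could be applied to rows of the CSV, here is a minimal sketch using only the Python standard library. The header names `ph` and `Potability` are assumptions about how the file spells its columns, the blank-cell handling is a precaution rather than a documented property of the data, and the three sample rows are synthetic, not taken from the dataset.

```python
import csv
import io

# Assumed header name; the real CSV may spell it differently.
PH_COLUMN = "ph"
SAFE_PH_RANGE = (6.5, 8.5)  # typical safe range quoted in the feature list


def ph_in_safe_range(value):
    """Return True/False for a numeric pH string, or None when the cell is blank."""
    if value is None or value.strip() == "":
        return None
    low, high = SAFE_PH_RANGE
    return low <= float(value) <= high


# Synthetic rows for illustration only; not taken from the dataset.
sample_csv = """ph,Potability
7.2,1
5.9,0
,0
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
flags = [ph_in_safe_range(row[PH_COLUMN]) for row in rows]
print(flags)  # [True, False, None]
```

A pH inside the range does not by itself make a sample potable; the check only flags one of the nine measured properties.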
Distribution
- Data Format: CSV (Comma Separated Values)
- Data Volume: 3,276 records and 10 columns
- Data Structure: Tabular format with nine numerical feature columns and one binary target column (Potability)
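One way to confirm that a downloaded copy matches the figures above is to count records and columns and tally the target values. The sketch below is a stdlib-only example under stated assumptions: the header spellings are guesses based on the feature list, and the two data rows are synthetic placeholders.

```python
import csv
import io

EXPECTED_COLUMNS = 10         # column count stated in the listing
TARGET_COLUMN = "Potability"  # assumed header name


def summarize(csv_file):
    """Count records and columns, and tally the binary target values."""
    reader = csv.DictReader(csv_file)
    n_columns = len(reader.fieldnames)
    n_records = 0
    target_counts = {"0": 0, "1": 0}
    for row in reader:
        n_records += 1
        label = row[TARGET_COLUMN].strip()
        if label not in target_counts:
            raise ValueError(f"unexpected target value: {label!r}")
        target_counts[label] += 1
    return n_records, n_columns, target_counts


# Synthetic sample with assumed header names; values are placeholders.
sample_csv = (
    "ph,Hardness,Solids,Chloramines,Sulfate,Conductivity,"
    "Organic_carbon,Trihalomethanes,Turbidity,Potability\n"
    "7.0,200.0,20000.0,7.0,330.0,420.0,14.0,66.0,4.0,1\n"
    "6.0,180.0,25000.0,8.0,350.0,400.0,12.0,70.0,3.5,0\n"
)

n_records, n_columns, target_counts = summarize(io.StringIO(sample_csv))
print(n_records, n_columns, target_counts)
```

To run the same check on the real file, pass an open file handle instead, for example `open("water_potability.csv", newline="")` (the file name here is hypothetical), and compare the result against 3,276 records and 10 columns.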
Usage
Ideal applications for this dataset include:
- Public Health Analysis: Investigate which measured properties are associated with unsafe drinking water.
- Machine Learning Projects: Develop classification models to predict water potability.
- Environmental Research: Study the correlation between chemical properties and water contamination.
- Education and Training: Use as a practice dataset for data science and machine learning projects.
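For the classification use case, a useful first step is a majority-class baseline: predict the most common Potability label for every sample and record the accuracy, so that any trained model has a reference score to beat. The sketch below shows the idea with synthetic labels; it is not a result from this dataset, and the function names are illustrative.

```python
from collections import Counter


def majority_baseline(train_labels):
    """Return the most frequent label in the training split."""
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Synthetic Potability labels for illustration only (1 = Potable, 0 = Not Potable).
train_labels = [0, 0, 1, 0, 1, 0]
test_labels = [0, 1, 0, 0]

baseline_label = majority_baseline(train_labels)
predictions = [baseline_label] * len(test_labels)
baseline_accuracy = accuracy(predictions, test_labels)
print(baseline_label, baseline_accuracy)  # 0 0.75
```

A classifier trained on the nine feature columns is only informative if it scores clearly above this baseline on held-out data.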
Coverage
- Geographic Coverage: Global (unspecified locations)
- Time Range: Not specified
- Demographics: Not applicable (focus on water samples)
License
- License: CC0 (Public Domain)
Who Can Use It
- Data Scientists: For building models to classify potable and non-potable water.
- Researchers: For scientific studies on water quality and health risks.
- Policymakers: For assessing water safety standards and compliance.
- Educators: For hands-on data science education and environmental studies.
Analyzing this dataset can show which water quality measurements are associated with potability and can inform decisions about improving access to clean drinking water worldwide.