
UCSD Gas Sensor Array Wind Tunnel Data

Data Science and Analytics

Tags and Keywords: Gas, Sensor, Chemical, Wind-tunnel, Classification

Free

About

Collected over a 16-month period from December 2010 to April 2012, this collection features time-series recordings derived from a chemical detection platform. The data was gathered at the BioCircuits Institute, University of California San Diego, utilising a wind tunnel research test-bed facility. It addresses a ten-class gas discrimination problem, capturing the response of gas sensor arrays to ten high-priority chemical gaseous substances at six different locations. The original dataset comprises 18,000 recordings and has been preprocessed to facilitate chemical classification and machine learning analysis in open sampling settings.

Columns

The dataset has 290 columns in total: 288 feature columns derived from the sensor arrays, one label column for chemical identification, and one index column. Key column types include:
  • Sensor Array Features (A1 - I8): 288 columns of continuous features computed from the sensor readings. According to the statistical summaries, they include columns such as mean_A1, mean_A2, mean_A3, mean_A4, mean_A5, mean_A6, mean_A7, mean_A8, and mean_B1. All values are numerical (valid floats) summarising sensor responses (e.g., means and standard deviations). If the labels A1 to I8 form a full grid of nine letters by eight indices (72 sensor positions), 288 columns works out to four features per position.
  • Chemical Name: A categorical column identifying the specific chemical substance detected (Label).
  • Number: A numerical index column found in the file structure.
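The column layout above can be separated into features, labels, and the index with a few lines of standard-library Python. This is a minimal sketch, not the publisher's loading code: it assumes the header spells the label and index columns exactly as "Chemical Name" and "Number", and it runs on a tiny synthetic stand-in (the labels gas_a and gas_b and the values are made up) rather than the real file.

```python
import csv
import io

LABEL_COLUMN = "Chemical Name"  # assumed exact header spelling
INDEX_COLUMN = "Number"         # assumed exact header spelling


def split_rows(csv_file):
    """Return (feature_names, X, y) read from an open CSV file object."""
    reader = csv.DictReader(csv_file)
    # Every column that is neither the label nor the index is a feature.
    feature_names = [
        name for name in reader.fieldnames
        if name not in (LABEL_COLUMN, INDEX_COLUMN)
    ]
    X, y = [], []
    for row in reader:
        X.append([float(row[name]) for name in feature_names])
        y.append(row[LABEL_COLUMN])
    return feature_names, X, y


# Synthetic stand-in with two feature columns instead of 288.
sample = io.StringIO(
    "Number,mean_A1,mean_A2,Chemical Name\n"
    "0,0.5,1.25,gas_a\n"
    "1,0.75,1.5,gas_b\n"
)
feature_names, X, y = split_rows(sample)
print(feature_names, X, y)
```

For the real data, the same function could be called on a file opened with `open("chemicals_in_wind_tunnel.csv", newline="")`, provided the header names match the assumptions above.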

Distribution

The data is provided in Comma-Separated Values (CSV) format. It consists of two distinct files:
  • chemicals_in_wind_tunnel.csv: Contains 17,921 rows and covers 11 chemicals (the ten classes of the discrimination problem plus, possibly, a background or control class). The file size is approximately 90.11 MB.
  • chemicals_in_wind_tunnel_3.csv: Contains 5,098 rows and covers 3 chemicals.

According to the statistical summary, the primary dataset has 100% valid entries, with no missing or mismatched values, and its sensor readings are evenly spread across the reported quantiles.
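The row counts, class counts, and validity figures quoted above can be checked after download. The sketch below is one possible way to do so with the standard library; the function name `summarise` is hypothetical, the label column name "Chemical Name" is an assumption about the header, and the demonstration input is synthetic.

```python
import csv
import io
from collections import Counter


def summarise(csv_file, label_column="Chemical Name"):
    """Return (row count, empty-cell count, per-label row counts)."""
    reader = csv.DictReader(csv_file)
    n_rows = 0
    n_missing = 0
    label_counts = Counter()
    for row in reader:
        n_rows += 1
        label_counts[row[label_column]] += 1
        for key, value in row.items():
            if key is None:
                continue  # surplus unnamed fields in an over-long row
            # Short rows yield None; blank cells yield an empty string.
            if value is None or value.strip() == "":
                n_missing += 1
    return n_rows, n_missing, label_counts


# Synthetic stand-in: three rows, one blank cell, two labels.
sample = io.StringIO(
    "Number,mean_A1,Chemical Name\n"
    "0,0.5,gas_a\n"
    "1,,gas_a\n"
    "2,0.7,gas_b\n"
)
n_rows, n_missing, label_counts = summarise(sample)
print(n_rows, n_missing, dict(label_counts))
```

On the real files, the listing's figures would correspond to 17,921 rows and 11 labels for chemicals_in_wind_tunnel.csv, 5,098 rows and 3 labels for chemicals_in_wind_tunnel_3.csv, and zero empty cells.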

Usage

This data is ideal for developing and testing algorithms in the following areas:
  • Chemical Classification: Training models to distinguish between different gaseous substances based on sensor array outputs.
  • Machine Learning in Open Environments: Evaluating classifier performance in wind tunnel simulations that mimic real-world open sampling settings.
  • Time-Series Analysis: Processing continuous sensor data streams to identify patterns over time.
  • Sensor Drift and Calibration: Analysing sensor performance stability over the 16-month collection period.
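As an illustration of the chemical classification use case, the sketch below implements a nearest-centroid classifier in plain Python. It is a deliberately simple baseline chosen for this example, not a method attributed to the dataset's authors, and it is run on made-up two-dimensional points with hypothetical labels (gas_a, gas_b) rather than the 288 real features.

```python
import math
from collections import defaultdict


def fit_centroids(X, y):
    """Compute the mean feature vector (centroid) of each class."""
    sums = {}
    counts = defaultdict(int)
    for features, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [total / counts[label] for total in class_sums]
        for label, class_sums in sums.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(
        centroids,
        key=lambda label: math.dist(centroids[label], features),
    )


# Synthetic training data: two well-separated classes.
X_train = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]]
y_train = ["gas_a", "gas_a", "gas_b", "gas_b"]
centroids = fit_centroids(X_train, y_train)

X_test = [[0.1, 0.0], [1.0, 0.9]]
predictions = [predict(centroids, features) for features in X_test]
print(predictions)
```

With the real data, features on very different scales would likely need standardising before distances are compared, and a held-out split would be needed to estimate accuracy.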

Coverage

  • Geographic Scope: BioCircuits Institute, University of California San Diego (Wind Tunnel Facility).
  • Time Range: December 2010 to April 2012.
  • Subject Scope: Ten high-priority chemical gaseous substances measured at six locations within the facility.

License

CC0: Public Domain

Who Can Use It

  • Data Scientists: For benchmarking classification algorithms and exploring high-dimensional feature spaces.
  • Chemical Engineers: For analysing gas sensor array responses and drift characteristics.
  • Academic Researchers: For studies in chemo-informatics, pattern recognition, and sensor technology.
  • Machine Learning Students: As a robust, clean dataset for multiclass classification projects.

Dataset Name Suggestions

  • UCSD Gas Sensor Array Wind Tunnel Data
  • Chemical Discrimination Time-Series
  • Wind Tunnel Gas Classification Records
  • 16-Month Chemical Detection Array

Listing Stats

  • Views: 0
  • Downloads: 0
  • Listed: 04/12/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
