Opendatabay APP

Colorado Forest Ecosystem Classification Data

Data Science and Analytics

Tags and Keywords

Forest

Cartography

Prediction

Elevation

Wilderness


Free

About

A collection of variables designed solely for the classification and prediction of forest cover type based exclusively on cartographic inputs. The data relates to 30 x 30 meter cells situated within the Roosevelt National Forest in northern Colorado. This focus area includes four designated wilderness areas: Rawah, Neota, Comanche Peak, and Cache la Poudre. These locations were chosen because the existing forest cover types are largely the result of natural ecological processes, having undergone minimal human-caused disturbance. Forest cover is categorised into 7 distinct types, including common species such as Spruce/Fir, Lodgepole Pine, Ponderosa Pine, and Aspen. The independent variables are derived from raw, unscaled data originally obtained from the US Forest Service and US Geological Survey.

Columns

This product includes 55 columns: 10 quantitative variables, two sets of qualitative binary indicator columns (4 for wilderness area and 40 for soil type), and the classification target.
  • Elevation: Quantitative, measured in meters.
  • Aspect: Quantitative, measured in degrees azimuth.
  • Slope: Quantitative, measured in degrees.
  • Horizontal_Distance_To_Hydrology: Quantitative, measured in meters (distance to nearest surface water features).
  • Vertical_Distance_To_Hydrology: Quantitative, measured in meters (vertical distance to nearest surface water features).
  • Horizontal_Distance_To_Roadways: Quantitative, measured in meters (distance to nearest roadway).
  • Hillshade_9am: Quantitative, index from 0 to 255 (Hillshade index at 9am, summer solstice).
  • Hillshade_Noon: Quantitative, index from 0 to 255 (Hillshade index at noon, summer solstice).
  • Hillshade_3pm: Quantitative, index from 0 to 255 (Hillshade index at 3pm, summer solstice).
  • Horizontal_Distance_To_Fire_Points: Quantitative, measured in meters (distance to nearest wildfire ignition points).
  • Wilderness_Area (4 binary columns): Qualitative, coded 0 (absence) or 1 (presence) for the four specific wilderness areas (Rawah, Neota, Comanche Peak, Cache la Poudre).
  • Soil_Type (40 binary columns): Qualitative, coded 0 (absence) or 1 (presence) designating the specific soil type.
  • Cover_Type (Classification target): Integer 1 to 7, designating the forest cover type (e.g., Spruce/Fir, Lodgepole Pine, Krummholz).
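The column layout above can be sketched in Python as follows. This is a minimal illustration, not the publisher's schema: the names of the binary columns (`Wilderness_Area1` to `Wilderness_Area4`, `Soil_Type1` to `Soil_Type40`) are an assumption, and `decode_one_hot` is a hypothetical helper that collapses one set of binary columns into a single 1-based index.

```python
# The 10 quantitative variables named in the column list.
QUANTITATIVE = [
    "Elevation",
    "Aspect",
    "Slope",
    "Horizontal_Distance_To_Hydrology",
    "Vertical_Distance_To_Hydrology",
    "Horizontal_Distance_To_Roadways",
    "Hillshade_9am",
    "Hillshade_Noon",
    "Hillshade_3pm",
    "Horizontal_Distance_To_Fire_Points",
]

# ASSUMPTION: the binary columns carry a numeric suffix; check the CSV header.
WILDERNESS = [f"Wilderness_Area{i}" for i in range(1, 5)]
SOIL = [f"Soil_Type{i}" for i in range(1, 41)]
TARGET = "Cover_Type"

ALL_COLUMNS = QUANTITATIVE + WILDERNESS + SOIL + [TARGET]


def decode_one_hot(row, columns):
    """Return the 1-based position of the single column in `columns` set to 1.

    `row` is a mapping from column name to a 0/1 value (int or numeric string).
    Raises ValueError if the row does not have exactly one active column.
    """
    active = [i for i, name in enumerate(columns, start=1) if int(row[name]) == 1]
    if len(active) != 1:
        raise ValueError(f"expected exactly one active column, found {len(active)}")
    return active[0]


# Toy row (hypothetical values): wilderness area 3 is flagged as present.
example_row = {name: 0 for name in WILDERNESS}
example_row["Wilderness_Area3"] = 1
wilderness_index = decode_one_hot(example_row, WILDERNESS)
```

Collapsing the indicator columns this way is optional; many models accept the binary columns directly.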

Distribution

The data is provided in raw form, meaning the variables are not scaled. Qualitative independent variables are presented as binary columns. The data comes as a single CSV file, covertype.csv, with a size of 75.75 MB. The dataset contains approximately 581,000 records across 55 columns. Elevation, for example, ranges from a minimum of 1859 meters to a maximum of 3858 meters, with a mean of about 2,960 meters. All records are reported as valid, with no mismatched or missing values.
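The summary figures above (minimum, maximum, mean, missing values) can be checked with a short script. The sketch below uses only the Python standard library and runs on a tiny inline sample; the sample rows are made up for illustration (only the 1859 and 3858 values echo the reported Elevation range), and `summarize_column` is a hypothetical helper, not part of any published tooling.

```python
import csv
import io


def summarize_column(rows, column):
    """Compute min, max, mean, and missing count for one numeric column.

    `rows` is an iterable of dicts as produced by csv.DictReader.
    Empty or absent cells are counted as missing and left out of the statistics.
    """
    values = []
    missing = 0
    for row in rows:
        raw = (row.get(column) or "").strip()
        if raw == "":
            missing += 1
            continue
        values.append(float(raw))
    if not values:
        raise ValueError(f"no numeric values found in column {column!r}")
    return {
        "count": len(values),
        "missing": missing,
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
    }


# Made-up sample with one deliberately missing Elevation cell.
SAMPLE_CSV = """Elevation,Slope,Cover_Type
1859,12,3
3858,8,7
,5,2
2960,20,1
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
stats = summarize_column(rows, "Elevation")
```

To run the same check on the real file, replace the `io.StringIO(SAMPLE_CSV)` source with `open("covertype.csv", newline="")`, assuming the file sits in the working directory.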

Usage

This data suits machine learning work on a multiclass classification problem: users can train and evaluate algorithms that predict one of the seven forest cover types using only the available cartographic and proximity variables. It also supports foundational ecological modelling, spatial analysis research, and studies of how terrain and human proximity features (such as distance to roadways) relate to natural forest distribution patterns.
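As a rough illustration of the multiclass setup, the sketch below fits a nearest-centroid baseline in plain Python. It is a stand-in chosen for brevity, not a recommended model for this dataset, and the training rows are made-up (Elevation, Slope) pairs rather than real records. Because the variables are unscaled, Euclidean distance here is dominated by whichever feature has the largest range, so scaling would likely matter in practice.

```python
import math
from collections import defaultdict


def fit_centroids(X, y):
    """Return a dict mapping each class label to the mean feature vector of its rows."""
    sums = {}
    counts = defaultdict(int)
    for features, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for j, value in enumerate(features):
            sums[label][j] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


def accuracy(centroids, X, y):
    """Fraction of rows whose predicted label equals the true label."""
    hits = sum(predict(centroids, f) == label for f, label in zip(X, y))
    return hits / len(y)


# Made-up toy data: [Elevation (m), Slope (degrees)] with three of the seven labels.
X_train = [[3300, 10], [3350, 12], [2900, 8], [2950, 9], [2400, 15], [2450, 14]]
y_train = [1, 1, 2, 2, 3, 3]

centroids = fit_centroids(X_train, y_train)
train_accuracy = accuracy(centroids, X_train, y_train)
new_prediction = predict(centroids, [2420, 15])
```

On real data, the accuracy should be measured on held-out rows rather than the training rows used here.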

Coverage

The geographic area is restricted to the Roosevelt National Forest in northern Colorado, United States. It focuses on four wilderness areas: Rawah, Neota, Comanche Peak, and Cache la Poudre. Rawah and Comanche Peak are considered more typical of the dataset as a whole than the relatively high-elevation Neota or the low-elevation Cache la Poudre. The data resolution is 30 x 30 meter cells. No temporal range is specified, and no future updates are expected.

License

CC0: Public Domain

Who Can Use It

  • Data Scientists: For developing classification models and testing feature importance in spatial prediction tasks.
  • Environmental Scientists: To understand the correlation between geographical features (slope, aspect, elevation) and natural forest type occurrence.
  • GIS Specialists: To integrate geographical predictor variables with ecological modelling efforts.

Dataset Name Suggestions

  • Forest Cover Type Cartographic Predictors
  • Colorado Forest Ecosystem Classification Data
  • Wilderness Area Tree Cover Prediction
  • Geospatial Variables for Forest Typing

Attributes

Listing Stats

VIEWS

3

DOWNLOADS

0

LISTED

26/11/2025

REGION

GLOBAL

UNIVERSAL DATA QUALITY SCORE (UDQS)

5 / 5

VERSION

1.0
