MAGIC Gamma Telescope Classification Data

Data Science and Analytics

Tags and Keywords

Gamma, Telescope, Classification, Astronomy, Cherenkov

Free

About

This dataset consists of Monte Carlo-generated data designed to model the detection of high-energy gamma particles. An atmospheric Cherenkov gamma telescope works by detecting the Cherenkov radiation emitted by charged particles produced in electromagnetic showers in the atmosphere. Incoming Cherenkov photons produce pulses on photomultiplier tubes, which are arranged in a camera. The resulting pattern of pulses, known as the shower image, is used to discriminate statistically between showers caused by primary gamma rays (the signal) and hadronic showers initiated by cosmic rays in the upper atmosphere (the background).

Columns

The listing reports 12 columns; the 11 described below comprise 10 continuous features and one binary class label:
  • fLength: The length of the major axis of the shower image ellipse, measured in millimetres.
  • fWidth: The length of the minor axis of the shower image ellipse, measured in millimetres.
  • fSize: The logarithm base 10 of the sum of the content of all pixels, expressed in photon counts.
  • fConc: The ratio of the sum of the content of the two highest pixels relative to the total fSize.
  • fConc1: The ratio of the content of the single highest pixel relative to the total fSize.
  • fAsym: The distance from the highest pixel to the centre of the image, projected onto the major axis, measured in millimetres.
  • fM3Long: The cube root of the third moment calculated along the major axis, measured in millimetres.
  • fM3Trans: The cube root of the third moment calculated along the minor axis, measured in millimetres.
  • fAlpha: The angle of the major axis relative to the vector pointing to the origin, measured in degrees.
  • fDist: The distance from the origin to the centre of the shower ellipse, measured in millimetres.
  • class: The binary classification label, indicating either 'g' (gamma, signal) or 'h' (hadron, background).
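
As an illustration of the column layout above, the following is a minimal sketch of reading the file with Python's standard csv module. It assumes the CSV header uses exactly the column names listed here; the two sample rows are made up for the example and are not real dataset values. Any extra column (such as a row index) would simply be ignored by this reader.

```python
import csv
import io

# Column names as listed above; that the CSV header matches them is an assumption.
FEATURES = ["fLength", "fWidth", "fSize", "fConc", "fConc1",
            "fAsym", "fM3Long", "fM3Trans", "fAlpha", "fDist"]
LABEL = "class"


def read_events(handle):
    """Parse rows into (feature_dict, label) pairs, validating each row."""
    events = []
    for row in csv.DictReader(handle):
        label = row[LABEL].strip()
        if label not in ("g", "h"):
            raise ValueError(f"unexpected class label: {label!r}")
        # float() raises ValueError on a missing or malformed feature value.
        features = {name: float(row[name]) for name in FEATURES}
        events.append((features, label))
    return events


# Two made-up rows for illustration only; these are not real dataset values.
SAMPLE = io.StringIO(
    "fLength,fWidth,fSize,fConc,fConc1,fAsym,fM3Long,fM3Trans,fAlpha,fDist,class\n"
    "30.0,15.0,2.5,0.40,0.20,10.0,20.0,-5.0,12.0,100.0,g\n"
    "80.0,40.0,3.0,0.20,0.10,-30.0,-40.0,8.0,60.0,200.0,h\n"
)
events = read_events(SAMPLE)

# fSize is a base-10 logarithm, so 10**fSize recovers the summed pixel content.
total_photons = 10 ** events[1][0]["fSize"]
print(len(events), events[0][1], total_photons)
```

To read the real file, the same function could be called as read_events(open("magic04_gamma.csv", newline="")), assuming the file sits in the working directory.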

Distribution

The data is provided as a single CSV file, magic04_gamma.csv (1.58 MB). It contains approximately 19,000 instances (listed as 19.0k), all of which are valid, with no missing or mismatched entries. The dataset is structured for classification tasks and is not expected to be updated (update frequency: "Never"). Gamma (signal) events account for 65% of the instances and hadron (background) events for the remaining 35%.
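
The class split stated above also sets a baseline worth checking before training anything. The sketch below computes class fractions and the accuracy of always predicting the majority class; the label list is a hypothetical stand-in that mirrors the stated 65/35 split, not the real file contents.

```python
from collections import Counter


def class_fractions(labels):
    """Return the fraction of events carrying each class label."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no labels given")
    return {label: count / total for label, count in counts.items()}


# Hypothetical labels mirroring the stated 65% gamma / 35% hadron split.
labels = ["g"] * 65 + ["h"] * 35
fractions = class_fractions(labels)

# Always predicting the majority class achieves this accuracy,
# so a useful classifier has to beat it.
majority_baseline = max(fractions.values())
print(fractions, majority_baseline)
```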

Usage

This data is well suited to machine learning classification tasks, in particular developing algorithms that distinguish signal (gamma particles) from background (hadronic cosmic-ray showers). In this physics problem, classifying a background event as signal is a worse error than classifying a signal event as background, so plain classification accuracy is not a meaningful measure and users should instead evaluate classifiers with the Receiver Operating Characteristic (ROC) curve. The relevant points on that curve are those where the probability of accepting a background event as signal (the false positive rate) is below one of the following thresholds: 0.01, 0.02, 0.05, 0.1, or 0.2.
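
The evaluation described above amounts to reading the ROC curve at fixed false positive rates. The following is a minimal pure-Python sketch of that reading: for each threshold it reports the highest signal efficiency (true positive rate) reachable while the false positive rate stays at or below the limit. The function name tpr_at_fpr and the toy scores are hypothetical; in practice the scores would come from a trained classifier, with higher scores meaning "more gamma-like".

```python
def tpr_at_fpr(scores, labels, max_fpr, positive="g"):
    """Highest true positive rate reachable while FPR stays <= max_fpr."""
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one event of each class")
    # Walk down the ranking, lowering the acceptance cut one distinct score at a time.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    best_tpr = 0.0
    tp = fp = 0
    i = 0
    while i < len(ranked):
        cut = ranked[i][0]
        # Events tied at the same score are accepted together.
        while i < len(ranked) and ranked[i][0] == cut:
            if ranked[i][1] == positive:
                tp += 1
            else:
                fp += 1
            i += 1
        if fp / n_neg > max_fpr:
            break  # FPR only grows from here, so stop.
        best_tpr = tp / n_pos
    return best_tpr


# Toy classifier scores and true labels, made up for illustration.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
labels = ["g", "g", "g", "h", "g", "g", "h", "h", "h", "h"]

for limit in (0.01, 0.02, 0.05, 0.1, 0.2):
    print(limit, tpr_at_fpr(scores, labels, limit))
```

With only five background events in the toy data, the four tightest limits all behave the same way; on the full dataset each limit would select a distinct operating point.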

Coverage

The data is derived from simulated detection events for the Major Atmospheric Gamma Imaging Cherenkov (MAGIC) telescope project. Note that the proportion of background events (class h) in this dataset is lower than it would be in practice: in real observational data, the hadronic background makes up the majority of events.
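
To see why the class mix matters, the sketch below estimates the purity of the accepted sample (the fraction of accepted events that are truly gamma) at a given operating point, under different assumed signal fractions. The operating point (60% signal efficiency at a 5% false positive rate) and the real-world signal fraction of 0.001 are hypothetical numbers chosen only for illustration; the source does not state the true ratio.

```python
def expected_purity(tpr, fpr, signal_fraction):
    """Fraction of accepted events that are true signal (Bayes' rule)."""
    accepted_signal = tpr * signal_fraction
    accepted_background = fpr * (1.0 - signal_fraction)
    total = accepted_signal + accepted_background
    if total == 0:
        raise ValueError("no events are accepted at this operating point")
    return accepted_signal / total


# Hypothetical operating point: 60% signal efficiency at 5% false positive rate.
tpr, fpr = 0.6, 0.05

# Purity under the dataset's own 65% gamma mix.
purity_dataset = expected_purity(tpr, fpr, 0.65)

# Purity under an assumed background-dominated mix (0.001 is illustrative only).
purity_background_dominated = expected_purity(tpr, fpr, 0.001)

print(purity_dataset, purity_background_dominated)
```

The same classifier yields a far less pure sample once background dominates, which is why the low false-positive-rate region of the ROC curve is the part that matters.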

License

Attribution 4.0 International (CC BY 4.0)

Who Can Use It

This product is beneficial for:
  • Data Scientists and Machine Learning Engineers: For benchmarking and developing robust classification models in imbalanced or error-sensitive domains.
  • Researchers in Astrophysics: Those studying gamma-ray astronomy and requiring simulated data for particle discrimination analysis.
  • Students: For intermediate-level classification projects focusing on continuous data and complex performance metrics like the ROC curve.

Dataset Name Suggestions

  1. MAGIC Gamma Telescope Classification Data
  2. Atmospheric Cherenkov Particle Discrimination
  3. High Energy Particle Shower Simulation

Attributes

Listing Stats

  • Views: 4
  • Downloads: 0
  • Listed: 15/12/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
