Historical Sunspot Number Dataset
About
Daily sunspot counts provide a vital record of solar activity. This dataset covers about 175 years, from 1850 to early 2025, and is intended for time series analysis, giving researchers and modellers a basis for forecasting future sunspot numbers. The raw figures originate from the World Data Center SILSO, Royal Observatory of Belgium, Brussels. The daily total sunspot number is calculated using the established formula R = Ns + 10 * Ng, where Ns is the number of spots and Ng is the number of groups observed across the solar disk. Years before 1850 had too many missing entries and were removed, so the resulting file contains no missing daily values.
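As a minimal sketch of the formula above, the following Python function computes R from a spot count and a group count. The function name `sunspot_number` and the example inputs are illustrative assumptions, not part of the dataset or of any SILSO tooling.

```python
def sunspot_number(n_spots: int, n_groups: int) -> int:
    """Return R = Ns + 10 * Ng for one observation of the solar disk."""
    if n_spots < 0 or n_groups < 0:
        raise ValueError("counts must be non-negative")
    return n_spots + 10 * n_groups


# Hypothetical observation: 3 groups containing 14 individual spots in total.
r = sunspot_number(n_spots=14, n_groups=3)
print(r)  # 44
```

Each group contributes ten to the total regardless of how many spots it contains, which is why R changes in steps of ten whenever a group appears or disappears.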
Columns
- Year, Month, Day: These three columns provide the Gregorian calendar date for the observation.
- Date in fraction of year (date_frac): The date represented as a fraction of the year.
- Daily total sunspot number (counts): The key measurement of solar activity for the specific day. Although the original data specification allows for a -1 placeholder for missing values, this particular file contains no such instances.
- Daily standard deviation (std): This value reflects the standard deviation of the input sunspot numbers reported by the individual observing stations. Before 1981, these error values were estimated using an auto-regressive model based on the Poissonian distribution of the Sunspot Numbers. From 1981 onwards, the figure represents the actual standard deviation derived from the sample of raw observations used for the daily count.
- Number of observations (nobs): The quantity of observations employed to calculate the daily total. Notably, for periods before 1981, this number is generally set to 1, as the Sunspot Number primarily represented the raw Wolf number from the Zürich Observatory during that era.
- Definitive/provisional indicator (indicator): This field marks the status of the observation. A blank (NaN) signifies that the value is definitive. A '*' symbol indicates that the value is still provisional and may be revised, which typically applies to the most recent three to six months of data.
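The column layout above could be read with Python's standard library roughly as follows. This is a sketch under assumptions: the header names (`Year`, `Month`, `Day`, `date_frac`, `counts`, `std`, `nobs`, `indicator`) are taken from the column list and may not match the file exactly, and the two sample rows are made-up placeholders, not real SILSO values.

```python
import csv
import io
from datetime import date

# Made-up rows for illustration only; header names are assumed from the column list.
SAMPLE = """Year,Month,Day,date_frac,counts,std,nobs,indicator
2024,6,1,2024.417,150,12.5,30,
2025,1,31,2025.083,140,15.0,25,*
"""


def read_rows(handle):
    """Parse rows into dicts with a real date and a boolean provisional flag."""
    rows = []
    for raw in csv.DictReader(handle):
        rows.append({
            "date": date(int(raw["Year"]), int(raw["Month"]), int(raw["Day"])),
            "counts": float(raw["counts"]),
            "std": float(raw["std"]),
            "nobs": int(raw["nobs"]),
            # A blank indicator means definitive; '*' means provisional.
            "provisional": (raw.get("indicator") or "").strip() == "*",
        })
    return rows


rows = read_rows(io.StringIO(SAMPLE))
definitive = [r for r in rows if not r["provisional"]]
print(len(rows), len(definitive))  # 2 1
```

To read the actual file, pass an open file handle to `read_rows` in place of the `io.StringIO` wrapper. Filtering on the provisional flag keeps values that may still be revised out of model training.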
Distribution
This data product is supplied as a CSV file (2.59 MB) containing 9 columns and 63.9 thousand records. The temporal data spans from 1 January 1850 to 31 January 2025. The date and measurement columns contain no missing values, so the file can be used for modelling without imputation; the indicator column is blank by design for definitive values. The dataset is expected to be updated monthly.
Usage
The dataset suits projects focused on time series analysis and predictive modelling, specifically:
- Forecasting solar cycles and predicting future sunspot numbers.
- Studying long-term trends and periodicities in solar activity.
- Research into solar physics and heliophysics.
- Machine learning applications requiring high-quality, continuous, chronological data.
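One simple way to look for periodicities of the kind listed above is to find the lag with the highest autocorrelation. The sketch below is an illustration only: the helper names are hypothetical, and the input is a synthetic sine wave with a period of 11 steps, chosen to mimic the roughly 11-year solar cycle, rather than values from the dataset. With the real daily data, one would likely aggregate to yearly means first.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive integer `lag`."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    if denom == 0:
        raise ValueError("series is constant")
    numer = sum(centered[t] * centered[t + lag] for t in range(n - lag))
    return numer / denom


def dominant_period(series, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the highest autocorrelation."""
    return max(range(min_lag, max_lag + 1),
               key=lambda k: autocorrelation(series, k))


# Synthetic stand-in for yearly means: ten full cycles of period 11.
synthetic = [50 + 50 * math.sin(2 * math.pi * t / 11) for t in range(110)]
period = dominant_period(synthetic, min_lag=2, max_lag=20)
print(period)  # 11
```

Real sunspot cycles vary in length and amplitude, so the autocorrelation peak on actual data will be broader and lower than in this idealised case.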
Coverage
The temporal scope extends from 1850 to 2025, providing a long-duration view of solar activity. Earlier data (1818 to 1849) contained excessive missing values and were excluded to maintain the integrity and continuity of the time series. There are no missing daily values within the specified date range.
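The continuity claim can be checked by comparing the dates in the file against every calendar day in the stated range. The following is a minimal sketch using only the standard library; the function names are hypothetical, and the short list of dates at the end is a toy input rather than data from the file.

```python
from datetime import date, timedelta


def expected_days(start: date, end: date) -> int:
    """Number of calendar days from start to end, inclusive."""
    return (end - start).days + 1


def missing_dates(dates, start: date, end: date):
    """Return every calendar day in [start, end] that is absent from `dates`."""
    seen = set(dates)
    all_days = (start + timedelta(days=i) for i in range(expected_days(start, end)))
    return [d for d in all_days if d not in seen]


# The stated range, 1 January 1850 to 31 January 2025, covers this many days.
print(expected_days(date(1850, 1, 1), date(2025, 1, 31)))  # 63949

# Toy input with one gap, to show what a failure would look like.
toy = [date(2025, 1, 1), date(2025, 1, 3)]
print(missing_dates(toy, date(2025, 1, 1), date(2025, 1, 3)))
```

A complete file should yield an empty list from `missing_dates` and a row count equal to `expected_days`, which is consistent with the 63.9 thousand records reported for the file.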
License
Attribution 4.0 International (CC BY 4.0)
Who Can Use It
- Time Series Analysts: For developing sophisticated prediction models.
- Astronomers and Solar Physicists: For research into solar behaviour and its correlation with terrestrial phenomena.
- Students and Educators: For foundational learning in time series, astronomy, and data science applied to real observational data.
- Data Engineers: For benchmarking data pipelines against a clean, well-structured historical time series.
Dataset Name Suggestions
- Daily Total Sunspot Count: 1850-2025
- Solar Activity Index Time Series
- Historical Sunspot Number Dataset
- SILSO Sunspot Counts
Attributes
Original Data Source: Historical Sunspot Number Dataset
