Synthetic Circle Point Collection

Synthetic Data Generation

Tags and Keywords

Synthetic

Clustering

Benchmark

Points

Tabular

Synthetic Circle Point Collection Dataset on Opendatabay data marketplace

Free

About

This synthetic dataset provides a structured benchmark for evaluating clustering algorithms. The points are arranged in a geometric pattern of 100 distinct, well-defined groups (circles). This arrangement makes the data useful for benchmarking machine learning models, such as k-means, where clear class separation and structure recognition are critical. The inclusion of class labels supports both unsupervised clustering analysis and supervised classification tasks.

Columns

  • x: The primary coordinate of the point along the horizontal axis. Values range from approximately -5 to 185, with a mean of 90.
  • y: The secondary coordinate of the point along the vertical axis. Values range from approximately -5 to 185, with a mean of 90.
  • class: The ground truth label indicating the specific circle membership for the point. These labels run from 0 up to 99, denoting the 100 separate clusters.
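The column layout above can be read with nothing more than Python's standard library. The following is a minimal sketch, assuming the CSV has a header row with exactly the names x, y, and class, and that class values are stored as plain integers; the helper name load_points and the sample values are hypothetical and are not taken from the dataset.

```python
import csv
import io


def load_points(fileobj):
    """Read rows with columns x, y, class into (x, y, label) tuples.

    Assumes a header row named exactly x, y, class (an assumption
    based on the column list, not a documented file layout).
    """
    reader = csv.DictReader(fileobj)
    points = []
    for row in reader:
        points.append((float(row["x"]), float(row["y"]), int(row["class"])))
    return points


# Tiny in-memory stand-in for the real file; these values are made up.
sample = io.StringIO("x,y,class\n0.5,4.9,0\n20.1,-3.2,1\n")
points = load_points(sample)
```

To read the downloaded file instead, pass an open file handle in place of the in-memory sample.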

Distribution

The dataset is provided in a standard tabular format with 10,000 instances (rows), organised as 100 groups of exactly 100 points each. All fields are complete and valid, with no missing or mismatched values. The data file is approximately 225 kB.
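The stated distribution (100 classes labelled 0 to 99, each with exactly 100 points) is easy to verify after download. The sketch below is one way to do that; the function name check_balance and its parameters are hypothetical, and the labels list is a generated stand-in rather than the real data.

```python
from collections import Counter


def check_balance(labels, n_classes=100, per_class=100):
    """Return a list of human-readable problems; empty means balanced."""
    counts = Counter(labels)
    problems = []
    # Labels should be exactly the integers 0 .. n_classes - 1.
    if set(counts) != set(range(n_classes)):
        problems.append("labels are not exactly 0..%d" % (n_classes - 1))
    # Every class that is present should have the same number of points.
    uneven = {label: n for label, n in counts.items() if n != per_class}
    if uneven:
        problems.append(
            "%d classes do not have %d points" % (len(uneven), per_class)
        )
    return problems


# Stand-in labels with the documented shape: 100 classes x 100 points.
labels = [i // 100 for i in range(10_000)]
issues = check_balance(labels)
```

An empty result indicates that the labels match the documented structure.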

Usage

The dataset is well suited to testing the accuracy and scalability of new clustering algorithms. It serves as a benchmark for models that attempt to detect non-linear geometric structures in two dimensions, and it is also suitable for educational purposes, such as illustrating density-based or centroid-based grouping principles.
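Because ground truth labels are included, a clustering result can be scored directly against them. The sketch below uses cluster purity as one simple option; it is an illustrative choice, not a metric prescribed by the dataset, and the function name purity is hypothetical. Purity rewards over-segmentation (one cluster per point scores 1.0), so it is best compared at a fixed number of clusters.

```python
from collections import Counter, defaultdict


def purity(true_labels, predicted_labels):
    """Fraction of points whose cluster's majority class matches their own."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("label sequences must not be empty")
    # For each predicted cluster, count how often each true class appears.
    by_cluster = defaultdict(Counter)
    for truth, cluster in zip(true_labels, predicted_labels):
        by_cluster[cluster][truth] += 1
    # Each cluster contributes the size of its most common true class.
    matched = sum(max(counts.values()) for counts in by_cluster.values())
    return matched / len(true_labels)


# Cluster ids need not match class ids; only the grouping matters.
perfect = purity([0, 0, 1, 1], [7, 7, 3, 3])
merged = purity([0, 0, 1, 1], [0, 0, 0, 0])
```

Here predicted_labels would come from whichever algorithm is being benchmarked, and true_labels from the class column.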

Coverage

The scope is purely abstract, focusing on two-dimensional mathematical space. As synthetic data, it has no geographic location, time range, or demographic limits. The classes are intentionally balanced, with 100 points present for each of the 100 classes (circles).

License

Attribution 4.0 International (CC BY 4.0)

Who Can Use It

  • Data Scientists: Validating the performance of custom clustering models against a known ground truth.
  • Machine Learning Researchers: Developing novel density estimation or geometric feature extraction techniques.
  • Academics and Educators: Demonstrating core principles of unsupervised learning and class separation.

Dataset Name Suggestions

  • Structured Clustering Benchmark Data
  • Synthetic Circle Point Collection
  • 2D Geometric Clustering Challenge
  • 100-Class Algorithm Evaluator

Attributes

Original Data Source: Synthetic Circle Point Collection

Listing Stats

VIEWS

0

DOWNLOADS

0

LISTED

26/11/2025

REGION

GLOBAL

UDQS (Universal Data Quality Score) QUALITY

5 / 5

VERSION

1.0

Download Dataset in CSV Format