High-Impact Synthetic Reasoning Dataset: +3% GPQA Diamond Lift

Synthetic Data Generation

Tags and Keywords

Synthetic data, DPO, GPQA, reasoning, alignment, quantum, neuroscience, gloss-free, data-efficient, LLM training, preference data, fine-tuning

Listed on the Opendatabay data marketplace.

£2,800

About

Overview

One-pass synthetic DPO preference pairs engineered for sustained rigor and escalation, with no gloss decay and no adversarial filtering.
This dataset of ~1,200 pairs was used to fine-tune Qwen2.5-7B-Instruct, producing verifiable, asymmetric lifts on GPQA Diamond, a hard reasoning benchmark.
Key Results (3 independent seeds; the full benchmark is 198 questions):
  • Full GPQA Diamond: +3.2 percentage points mean lift (36.53% vs. baseline 33.33%, low variance ±0.58%)
  • Quantum mechanics subset: +16.02 percentage points mean lift (51.92%)
  • Neuroscience/BCI transfer: +15.79 percentage points mean lift (52.63%)
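
The headline lift reads most naturally as an absolute difference in accuracy rather than a relative change. A minimal sketch of that arithmetic, using only the full-benchmark figures quoted above; the function and constant names are illustrative and not part of the scripts shipped with the dataset:

```python
def absolute_lift(baseline_pct: float, tuned_pct: float) -> float:
    """Difference in accuracy, in percentage points."""
    return tuned_pct - baseline_pct


def relative_lift(baseline_pct: float, tuned_pct: float) -> float:
    """Change relative to the baseline, as a percentage of the baseline."""
    return (tuned_pct - baseline_pct) / baseline_pct * 100.0


# Full GPQA Diamond figures from the listing (mean over 3 seeds).
BASELINE_PCT = 33.33
TUNED_PCT = 36.53

if __name__ == "__main__":
    print(f"absolute lift: {absolute_lift(BASELINE_PCT, TUNED_PCT):.2f} points")
    print(f"relative lift: {relative_lift(BASELINE_PCT, TUNED_PCT):.1f}% of baseline")
```

On these figures the absolute lift is 3.20 points, which corresponds to roughly a 9.6% improvement relative to the baseline.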
High-contrast pair geometry gives the set structural entropy stability, which makes it well suited to data-efficient reasoning fine-tunes.

Business Case & Value

A small dataset that delivers outsized gains in frontier reasoning domains. Results are reproducible (scripts and seeds are provided) and the data can be used for LoRAs, agents, or alignment experiments. The license is non-exclusive, so buyers can test quickly and scale with confidence.

Dataset Features

  • prompt: Input question/context for the preference pair.
  • chosen: Preferred response (deep, formal, escalated reasoning—expert-grade).
  • rejected: Non-preferred response (shallow/underpowered—high-contrast negative).
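
The three fields above can be modeled as a small record type. The following is a sketch only, assuming each field is a plain non-empty string; the class name `PreferencePair`, the validation rules, and the example record are hypothetical and not taken from the dataset:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferencePair:
    """One DPO preference pair with the three fields the listing names."""

    prompt: str
    chosen: str
    rejected: str

    def __post_init__(self) -> None:
        # Assumption: every field is a non-empty string.
        for name in ("prompt", "chosen", "rejected"):
            value = getattr(self, name)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"{name} must be a non-empty string")
        # A pair with identical responses carries no preference signal.
        if self.chosen.strip() == self.rejected.strip():
            raise ValueError("chosen and rejected must differ")


# Hypothetical record, invented for illustration (not from the dataset).
example = PreferencePair(
    prompt="Why are the energies of a particle in a 1D infinite well quantized?",
    chosen=(
        "The wavefunction must vanish at both walls, so only standing waves "
        "with k = n*pi/L fit, which gives E_n = n^2 pi^2 hbar^2 / (2 m L^2)."
    ),
    rejected="Because quantum things come in steps.",
)
```

Validating at construction time means a malformed pair fails early, before it reaches a training run.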

Distribution

  • Data Volume: ~1,200 records (pairs)
  • Format: JSONL (standard DPO structure: prompt, chosen, rejected per line)
  • Size: ~5-10 MB compressed
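
Because each line is one JSON object, the file can be streamed without loading it all at once. A minimal sketch using only the Python standard library; `iter_pairs` is an illustrative helper, and the two sample records are invented stand-ins for the real file:

```python
import io
import json
from typing import Iterator, TextIO

REQUIRED_KEYS = {"prompt", "chosen", "rejected"}


def iter_pairs(stream: TextIO) -> Iterator[dict]:
    """Yield one dict per non-blank JSONL line, checking the required keys."""
    for line_no, line in enumerate(stream, start=1):
        if not line.strip():
            continue  # tolerate blank lines, e.g. a trailing newline
        record = json.loads(line)
        if not isinstance(record, dict):
            raise ValueError(f"line {line_no}: expected a JSON object")
        missing = REQUIRED_KEYS - set(record)
        if missing:
            raise ValueError(f"line {line_no}: missing keys {sorted(missing)}")
        yield record


# In-memory stand-in for the downloaded file; both records are invented.
sample = io.StringIO(
    '{"prompt": "Q1", "chosen": "detailed answer", "rejected": "short answer"}\n'
    '{"prompt": "Q2", "chosen": "detailed answer", "rejected": "short answer"}\n'
)
pairs = list(iter_pairs(sample))
print(len(pairs))
```

For the real download, the same function would take an open file handle, for example `open("pairs.jsonl", encoding="utf-8")`, where the file name is a placeholder for whatever the ZIP archive contains.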

Listing Stats

  • Views: 98
  • Delivery: Instant download
  • Listed: 31/01/2026
  • Updated: 01/02/2026
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5

