Japanese Horse Racing Analytics Dataset
Data Science and Analytics
Free
About
Historical data from the Japan Racing Association (JRA) is presented, covering horse racing results, betting odds, lap times, and corner passing orders. The data spans from 1986 to 2021 and was gathered from netkeiba.com. While the original data is in Japanese, English translations for column names are provided, making it accessible for a broader audience. It is suitable for statistical analysis, developing horse racing forecast systems, and validating racing adages.
Columns
This data product is organised into four distinct CSV files, each containing specific information related to JRA horse races.
1. 19860105-20210731_race_result.csv
This file details the results for each horse in every race.
- Race PP ID: A unique identifier for a horse in a specific race, formed by appending the post position to the Race ID.
- Race ID: A unique identifier for each race, constructed from the date, racecourse code, meeting number, racing day, and race number.
- Race Day: The date of the race.
- Race Meeting Number: The number of the race meeting.
- Racecourse Code: A numeric code for the racecourse.
- Racecourse Name: The name of the racecourse venue.
- N-th Racing Day: The specific day number within a race meeting.
- Race Condition: The class or condition of the race.
- Race Symbol/*: A series of columns indicating specific race conditions, such as age or sex restrictions (Mare, Stallion, Gelding), weight rules (Special Weight, Handicap, Fixed Weight), and eligibility for horses or jockeys (Mixed, Specified, International).
- Race Number: The number of the race for that day.
- Graded Races N-th Time: The running number for a specific graded race.
- Race Name: The official name of the race.
- Listed and Graded Races: The classification of the race (e.g., Graded, Listed).
- Steeplechase Category: Identifies if the race is a steeplechase.
- Turf and Dirt Category: The primary surface of the track.
- Turf and Dirt Category2: Indicates when both turf and dirt surfaces are used, typically in steeplechase races.
- Clockwise, Anti-clockwise and Straight Course Category: The direction of the race course.
- Inner Circle, Outer Circle and Tasuki Course Category: Details the course layout, including the now-obsolete Tasuki course for steeplechases.
- Distance(m): The race distance in metres.
- Weather: The weather conditions on race day.
- Track Condition1: The condition of the main track surface.
- Track Condition2: The condition of a secondary track surface, if applicable.
- Post Time: The official start time.
- FP: The final position of the horse. This is left blank for disqualifications, scratches, etc.
- FP Note: Notes regarding the final position, such as 'Disqualified' or 'Scratched'.
- BK: The bracket number for betting purposes.
- PP: The post position (stall number) of the horse.
- Horse Name: The name of the horse.
- Sex: The horse's sex.
- Age: The horse's age.
- Weight(Kg): The weight carried by the horse.
- Jockey: The name of the jockey.
- Total Time(1/10s): The total race time, measured in tenths of a second.
- Margin: The finishing distance behind the horse immediately ahead.
- Position 1st Corner: The horse's position at the first corner.
- Position 2nd Corner: The horse's position at the second corner.
- Position 3rd Corner: The horse's position at the third corner.
- Position 4th Corner: The horse's position at the fourth corner.
- L3F: The time for the final 3 furlongs (600m) of the race.
- Win Odds(100Yen): The final odds for a "Win" bet.
- Win Fav: The horse's popularity ranking for winning.
- Horse Weight: The weight of the horse on race day.
- Horse Weight Gain and Loss: The change in the horse's weight since its last race.
- East, West, Foreign Country and Local Category: The region where the trainer is based.
- Trainer: The name of the trainer.
- Owner: The name of the owner.
- Prize Money(10000Yen): The prize money won, in units of 10,000 Yen.
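As a minimal sketch of how the result columns combine, the snippet below derives an average speed from Distance(m) and Total Time(1/10s). It assumes the column headers match the names listed above and that Total Time(1/10s) holds a count of tenths of a second; the rows are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Illustrative rows only; these values are made up, not taken from the dataset.
results = pd.DataFrame({
    "Distance(m)": [1200, 2000, 2400],
    "Total Time(1/10s)": [700, 1200, None],  # None stands in for a non-finisher
    "FP": [1, 3, None],                      # FP is blank for scratches etc.
})

# Keep only horses with a recorded finishing position and time.
finishers = results.dropna(subset=["FP", "Total Time(1/10s)"]).copy()

# Convert tenths of a second to seconds, then derive average speed in m/s.
finishers["time_s"] = finishers["Total Time(1/10s)"] / 10.0
finishers["speed_mps"] = finishers["Distance(m)"] / finishers["time_s"]

print(finishers[["Distance(m)", "time_s", "speed_mps"]])
```

Dropping rows with a blank FP first avoids treating disqualified or scratched horses as finishers.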
2. 19860105-20210731_odds.csv
This file contains the final odds for various betting types, linked by Race ID.
- Win(1/2)_(PP/Odds/Fav): Post position, odds, and popularity for Win bets.
- Place(1-5)_(PP/Odds/Fav): Post position, odds, and popularity for Place bets.
- Bracket Quinella(1/2)_(Permutation1/2/Odds/Fav): Combinations, odds, and popularity for Bracket Quinella bets.
- Quinella(1/2)_(Permutation1/2/Odds/Fav): Combinations, odds, and popularity for Quinella bets.
- Quinella Place(1-7)_(Permutation1/2/Odds/Fav): Combinations, odds, and popularity for Quinella Place (Wide) bets.
- Exacta(1/2)_(Exact Order1/2/Odds/Fav): Combinations, odds, and popularity for Exacta bets.
- Trio(1-3)_(Permutation1/2/3/Odds/Fav): Combinations, odds, and popularity for Trio bets.
- Trifecta(1-3)_(Exact Order1/2/3/Odds/Fav): Combinations, odds, and popularity for Trifecta bets.
3. 19860105-20210731_laptime.csv
This file provides lap and pace times for races, linked by Race ID.
- Race ID: Foreign key linking to the race result data.
- Lap Time (1-18): The time taken for each 200m segment of the race. In races whose distance is not a multiple of 200m, the first segment is 100m.
- Pace Time (1-18): The cumulative time at each segment point.
- F3F: Time of the first 3 furlongs (600m).
- L3F: Time of the last 3 furlongs (600m).
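The relationship between lap times and pace times described above can be sketched as a running total. The lap times here are made up for a hypothetical 1200m race with six 200m segments; in races where the first segment is 100m, the first three laps would not add up to 600m, so the F3F shortcut below would not apply.

```python
from itertools import accumulate

# Made-up lap times in seconds for a hypothetical 1200m race (six 200m segments).
lap_times = [12.3, 10.9, 11.2, 11.5, 11.6, 12.0]

# The pace time at each segment point is the cumulative sum of lap times so far.
pace_times = [round(t, 1) for t in accumulate(lap_times)]

# First and last 3 furlongs (600m each) as sums of three 200m segments.
f3f = round(sum(lap_times[:3]), 1)
l3f = round(sum(lap_times[-3:]), 1)

print(pace_times)
print(f3f, l3f)
```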
4. 20020615-20210731_corner_passing_order.csv
This file details the passing order and spacing between horses at each corner of the track, linked by Race ID.
- Race ID: Foreign key linking to the race result data.
- Position 1st Corner: The passing order at the first corner.
- Position 2nd Corner: The passing order at the second corner.
- Position 3rd Corner: The passing order at the third corner.
- Position 4th Corner: The passing order at the fourth corner.
- Symbol: Symbols that indicate the distance between horses (e.g., 併走 (running together), 1-2 lengths, 2-5 lengths).
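Because the files are linked by Race ID, they can be combined with an ordinary key join. The following is a rough sketch using pandas, with tiny made-up stand-ins for two of the files; the ID values and the corner string format are hypothetical, and only the column names come from the descriptions above.

```python
import pandas as pd

# Tiny stand-ins for two of the files; IDs and values are made up.
race_result = pd.DataFrame({
    "Race ID": ["R001", "R001", "R002"],
    "Horse Name": ["Horse A", "Horse B", "Horse C"],
    "FP": [1, 2, 1],
})
corner_order = pd.DataFrame({
    "Race ID": ["R001"],
    "Position 4th Corner": ["(1,2)"],  # hypothetical format
})

# Left join: every result row is kept; races without corner data get NaN.
# validate="many_to_one" raises if a Race ID appears twice on the right side.
merged = race_result.merge(
    corner_order, on="Race ID", how="left", validate="many_to_one"
)

print(merged)
```

A left join suits this case because the corner passing file starts in June 2002, so earlier races have no matching row. The real result and corner files share some column names (for example Position 1st Corner), so the `suffixes` argument of `merge` may be needed to tell them apart.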
Distribution
The dataset is distributed as four separate CSV files. NaN values are included where data was not available or not applicable for a given race or horse. As an example of file size, 19860105-20210731_laptime.csv is approximately 14.41 MB. The number of records is not specified for all files, and data completeness varies across the collection.
Usage
This data is well-suited for a variety of applications, including:
- Statistical Analysis: Performing detailed statistical studies of race outcomes and influencing factors.
- Predictive Modelling: Building and training machine learning models to forecast race results.
- Strategy Validation: Testing and verifying common horse racing theories and betting strategies.
- Academic Research: Supporting research in sports analytics, gambling markets, and performance analysis.
- Content Creation: Discovering interesting trends and topics for articles or discussions among racing fans.
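As one illustration of strategy validation, the sketch below estimates how often the betting favourite wins. It assumes that Win Fav equal to 1 marks the most-backed horse and that FP equal to 1 marks the winner; the rows are made up, so the resulting rate says nothing about real JRA races.

```python
import pandas as pd

# Made-up results for three hypothetical two-horse races.
results = pd.DataFrame({
    "Race ID": ["R001", "R001", "R002", "R002", "R003", "R003"],
    "FP": [1, 2, 2, 1, 1, 2],
    "Win Fav": [1, 2, 1, 2, 1, 2],
})

# Select the favourite in each race (assumed: Win Fav == 1).
favourites = results[results["Win Fav"] == 1]

# Share of favourites that finished first.
favourite_win_rate = (favourites["FP"] == 1).mean()

print(f"Favourite win rate: {favourite_win_rate:.3f}")
```

On the real data, rows with a blank FP (scratches, disqualifications) would need a deliberate decision: dropping them and counting them as losses give different rates.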
Coverage
- Geographic: The data covers horse races conducted by the Japan Racing Association (JRA) within Japan.
- Time Range: The data extends from 5 January 1986 to 31 July 2021. However, some data points, such as corner passing order details and F3F/L3F times, are only available from June 2002 onwards.
- Demographic: The data is primarily in Japanese and is best suited for users who can understand the language. English translations for column names are provided to assist non-Japanese speakers.
License
Attribution 4.0 International (CC BY 4.0)
Who Can Use It
- Data Scientists and Analysts: Can perform statistical analysis and build predictive models to understand the dynamics of horse racing.
- Betting Strategists and Enthusiasts: Can use historical data to test betting theories, identify profitable trends, and verify common racing wisdom.
- Academics and Researchers: Can utilise the dataset for studies in sports science, economics, and quantitative analysis.
- Content Creators and Journalists: Can find data-driven stories, statistics, and insights for articles, blogs, and discussions about Japanese horse racing.
Dataset Name Suggestions
- JRA Horse Racing Historical Data (1986-2021)
- Japan Racing Association (JRA) Results and Odds
- Japanese Horse Racing Analytics Dataset
- Historical JRA Race Data: Results, Odds, and Lap Times
- JRA Race Data Archive (1986-2021)
Attributes
Original Data Source: Japanese Horse Racing Analytics Dataset