Baseball Umpire Performance Analytics
Sports & Recreation
Free
About
This dataset offers a detailed analysis of Major League Baseball (MLB) umpire performance from the 2015 season through the 2022 season. It features advanced statistics on officiating accuracy and consistency, derived from scorecard data. The umpire is the central figure enforcing game rules and making critical judgement calls, so the quality of those calls matters to every game. This collection provides an opportunity to scrutinise umpiring trends and their impact on game outcomes within professional baseball.
Columns
The dataset contains 19 distinct columns, each detailing a specific aspect of umpire performance or game context:
- id: A unique identifier for each record.
- date: The date when the game took place, spanning from April 2015 to November 2022.
- umpire: The name of the umpire officiating the game.
- home: The abbreviation for the home baseball team.
- away: The abbreviation for the away baseball team.
- home_team_runs: The total runs scored by the home team.
- away_team_runs: The total runs scored by the away team.
- pitches_called: The total number of pitches called by the umpire in the game.
- incorrect_calls: The number of incorrect calls made by the umpire.
- expected_incorrect_calls: The statistically expected number of incorrect calls.
- correct_calls: The number of correct calls made by the umpire.
- expected_correct_calls: The statistically expected number of correct calls.
- correct_calls_above_expected: The difference between actual and expected correct calls.
- accuracy: The umpire's actual call accuracy percentage.
- expected_accuracy: The statistically expected call accuracy percentage.
- accuracy_above_expected: The difference between actual and expected accuracy percentages.
- consistency: A measure of the umpire's consistency in call-making.
- favor_home: A metric indicating the extent to which calls favoured the home team.
- total_run_impact: The overall impact of umpire calls on the total runs scored in the game.
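The derived columns above can be cross-checked against the raw counts. The sketch below is illustrative only: it assumes that accuracy is correct_calls divided by pitches_called, expressed as a percentage, and that the "above expected" columns are actual minus expected. The listing does not state these formulas, so treat them as assumptions to verify against the file. The function name derive_metrics and the sample record are hypothetical.

```python
def derive_metrics(row):
    """Recompute derived columns from the raw counts in one record.

    Assumption (not confirmed by the listing): accuracy is
    correct_calls / pitches_called as a percentage.
    """
    pitches = row["pitches_called"]
    accuracy = 100.0 * row["correct_calls"] / pitches
    expected_accuracy = 100.0 * row["expected_correct_calls"] / pitches
    return {
        "accuracy": accuracy,
        "expected_accuracy": expected_accuracy,
        # "Above expected" is taken here as actual minus expected.
        "correct_calls_above_expected": row["correct_calls"] - row["expected_correct_calls"],
        "accuracy_above_expected": accuracy - expected_accuracy,
    }


# Invented record for illustration only; not taken from the dataset.
sample = {
    "pitches_called": 150,
    "correct_calls": 141,
    "incorrect_calls": 9,
    "expected_correct_calls": 138.0,
}
metrics = derive_metrics(sample)
```

If the recomputed values differ from the stored columns by more than rounding, the dataset probably uses a different definition, and the stored values should take precedence.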
Distribution
The dataset is provided as a CSV file, named mlb-umpire-scorecard.csv, and has a size of 1.75 MB. It contains approximately 18,200 individual records, each with 19 detailed columns.
Usage
This dataset is ideal for a variety of analytical applications and use cases, including:
- Advanced Sports Analytics: Conducting in-depth statistical analysis of umpire performance.
- Performance Evaluation: Assessing individual umpire accuracy and consistency over time.
- Trend Identification: Discovering patterns and biases in officiating across different seasons or teams.
- Research: Supporting academic studies on sports officiating, decision-making, and game theory.
- Fan Engagement: Providing data for sports commentators, journalists, and enthusiasts to discuss umpire performance.
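As a starting point for the performance-evaluation use case, the following minimal sketch averages the accuracy column per umpire using only the Python standard library. The helper name mean_accuracy_by_umpire and the sample rows are hypothetical, and the code assumes the accuracy column parses as a plain number; rows where it does not are skipped.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def mean_accuracy_by_umpire(rows):
    """Average the accuracy column per umpire, skipping blank or non-numeric values."""
    by_umpire = defaultdict(list)
    for row in rows:
        try:
            name = row["umpire"]
            value = float(row["accuracy"])
        except (KeyError, TypeError, ValueError):
            continue  # missing column or unparseable value
        by_umpire[name].append(value)
    return {name: mean(values) for name, values in by_umpire.items()}


# Invented rows for illustration only; not taken from the dataset.
SAMPLE_CSV = """umpire,accuracy
Umpire A,94.0
Umpire A,92.0
Umpire B,95.5
Umpire B,
"""

sample_means = mean_accuracy_by_umpire(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# With the real file, the same function can be fed directly from disk:
# with open("mlb-umpire-scorecard.csv", newline="", encoding="utf-8") as f:
#     means = mean_accuracy_by_umpire(csv.DictReader(f))
```

Grouping on a different key, such as the home team or the season taken from the date column, follows the same pattern.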
Coverage
This dataset covers Major League Baseball (MLB) umpire scorecard data from the 2015 season up to the 2022 season. The specific date range included in the data runs from 5th April 2015 to 5th November 2022. The data is based on games officiated by a wide range of MLB umpires, with records for 124 unique officials. The dataset is expected to be updated annually.
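The coverage figures above can be checked once the file is loaded. The sketch below is a hedged example: it assumes the date column is stored in ISO format (YYYY-MM-DD), which the listing does not confirm, and the function name coverage_summary and the sample rows are hypothetical.

```python
from datetime import date


def coverage_summary(rows):
    """Report the first and last game dates, the seasons present, and the umpire count."""
    rows = list(rows)  # allow any iterable, including csv.DictReader
    # Assumption: dates are ISO formatted; adjust the parsing if the file differs.
    dates = [date.fromisoformat(r["date"]) for r in rows]
    return {
        "first": min(dates),
        "last": max(dates),
        "seasons": sorted({d.year for d in dates}),
        "umpires": len({r["umpire"] for r in rows}),
    }


# Invented rows for illustration only; umpire names are placeholders.
sample_rows = [
    {"date": "2015-04-05", "umpire": "Umpire A"},
    {"date": "2019-07-14", "umpire": "Umpire B"},
    {"date": "2022-11-05", "umpire": "Umpire A"},
]
summary = coverage_summary(sample_rows)
```

Run against the full file, the result should match the date range and the count of unique officials stated above if the date format assumption holds.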
License
CC0: Public Domain
Who Can Use It
This dataset is valuable for:
- Sports Statisticians: For detailed quantitative analysis of officiating.
- Baseball Analysts: To evaluate umpire performance and its influence on game dynamics.
- Journalists and Media: For data-driven sports reporting and commentary.
- Academic Researchers: Studying human performance, bias, and decision-making in high-pressure environments.
- Fantasy Sports Enthusiasts: Gaining deeper insights into game factors beyond player performance.
Dataset Name Suggestions
- MLB Umpire Scorecards (2015-2022)
- Baseball Umpire Performance Analytics
- Major League Baseball Officiating Data
- Umpire Accuracy & Consistency (MLB)
- Advanced MLB Umpire Statistics
Attributes
Original Data Source: Baseball Umpire Performance Analytics