La Liga Statistics for Exploratory Data Analysis
Data Science and Analytics
Free
About
This dataset contains statistical records of club performance during the 2021-22 La Liga season in Spanish football. It was originally created as a practical foundation for building skills in Exploratory Data Analysis (EDA) with tools such as Pandas. It gives analysts a detailed look at team performance, combining standard metrics such as total points, wins, and goal difference with advanced figures such as expected goals (xG), expected goals against (xGA), and possession percentage. The data was ethically scraped from a La Liga statistics website purely for educational purposes.
Columns
The dataset lists the club name followed by 33 team statistics columns:
- Squad: Name of the club.
- MP: Number of matches played.
- W: Number of wins.
- D: Number of draws.
- L: Number of losses.
- GF: Goals scored.
- GA: Goals conceded.
- GD: Goal difference.
- Pts: Total points accumulated.
- Pts/MP: Average points per match.
- xG: Expected goals.
- xGA: Expected goals against.
- xGD: Expected goal difference.
- xGD/90: Expected goal difference per 90 minutes.
- Attendance percent: Attendance percentage at matches.
- gsts: Goals scored by the top scorer in the team.
- NDF: Number of different players used.
- age: Average age of the squad.
- poss: Average possession percentage.
- NPG: Total number of non-penalty goals.
- YC: Number of yellow cards accumulated.
- RC: Number of red cards accumulated.
- GP90: Goals scored per 90 minutes.
- XG90: Expected goals per 90 minutes.
- XA90: Expected assists per 90 minutes.
- Save per: Save percentage of the squad’s goalkeepers.
- Pen save per: Save percentage of goalkeepers against penalties.
- shots per: Average shot conversion percentage (goals scored as a percentage of total shots attempted).
- pass completion: Completed passes as a percentage of passes attempted.
- prog dist: Progressive distance moved by the ball on the pitch (in yards).
- fouls: Total number of fouls conceded.
- fouls drawn: Total number of times the squad was fouled.
- offsides: Total number of offsides.
- aerial percentage: Average percentage of headers won.
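Several of the columns above are simple arithmetic on others: GD is GF minus GA, Pts/MP is Pts divided by MP, and xGD is xG minus xGA. Under standard league scoring (3 points per win, 1 per draw), Pts should also equal 3 × W + D. The following is a minimal sketch of a consistency check, assuming the CSV headers match the abbreviations listed above exactly; the function name and the tolerance values are illustrative choices, not part of the dataset.

```python
import pandas as pd


def check_derived_columns(
    df: pd.DataFrame, tol: float = 0.01, xg_tol: float = 0.15
) -> pd.DataFrame:
    """Report, per club, whether derived columns agree with their base columns.

    Assumes column names match the listing (Squad, MP, W, D, L, GF, GA, GD,
    Pts, Pts/MP, xG, xGA, xGD). The tolerances are illustrative guesses that
    allow for rounding in the published figures.
    """
    checks = pd.DataFrame({"Squad": df["Squad"]})
    # Every match ends in a win, a draw, or a loss.
    checks["MP_ok"] = df["MP"] == df["W"] + df["D"] + df["L"]
    # Goal difference is goals scored minus goals conceded.
    checks["GD_ok"] = df["GD"] == df["GF"] - df["GA"]
    # Standard league scoring: 3 points per win, 1 per draw.
    checks["Pts_ok"] = df["Pts"] == 3 * df["W"] + df["D"]
    # Points per match, compared within a rounding tolerance.
    checks["PtsMP_ok"] = (df["Pts/MP"] - df["Pts"] / df["MP"]).abs() <= tol
    # Expected goal difference, compared within a rounding tolerance.
    checks["xGD_ok"] = (df["xGD"] - (df["xG"] - df["xGA"])).abs() <= xg_tol
    return checks
```

Calling `check_derived_columns(pd.read_csv("laliga21-22.csv"))` would return one row per club. Any False value points at a row worth inspecting by hand rather than proving an error, since rounding or a points adjustment could explain a mismatch.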
Distribution
The data is provided as a single CSV file, laliga21-22.csv, of approximately 3.99 kB. It contains 19 records, each representing one team in the league. The original structure lists 35 columns, all of which contain valid data. Every statistic has all 19 values present, so there is no missing data.
Usage
This dataset is well suited to introductory data analysis and skill development. It supports Exploratory Data Analysis (EDA), such as visualising distributions and examining correlations between traditional performance metrics (like total points) and advanced metrics (like expected goal difference). Analysts can use it to compare the effectiveness of La Liga clubs during the season, and it offers a foundation for learning the basics needed before moving on to machine learning.
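A first EDA pass along these lines might confirm the shape and completeness of the table, then rank a few metrics by how strongly they correlate with total points. The sketch below assumes the column names `Pts`, `xGD`, and so on match the CSV headers; the helper names `summarize` and `correlation_with_points` are hypothetical.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> dict:
    """Return basic shape and missing-value counts for a quick first look."""
    return {
        "rows": df.shape[0],
        "columns": df.shape[1],
        "missing_cells": int(df.isna().sum().sum()),
    }


def correlation_with_points(df: pd.DataFrame, metrics: list) -> pd.Series:
    """Pearson correlation of each metric with Pts, strongest first.

    Assumes a numeric column named "Pts" and numeric columns for every
    name passed in `metrics`.
    """
    corr = df[metrics].corrwith(df["Pts"])
    # Sort by absolute value so strong negative correlations rank highly too.
    return corr.sort_values(key=lambda s: s.abs(), ascending=False)
```

For example, `summarize(pd.read_csv("laliga21-22.csv"))` can be compared against the record and column counts stated in the listing, and `correlation_with_points(df, ["xGD", "poss", "GA"])` gives a compact ranking. With so few rows, the correlations are descriptive rather than statistically robust.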
Coverage
The dataset focuses exclusively on the top-tier Spanish professional football league, La Liga. The data covers the entirety of the 2021-22 season. The statistics are aggregated at the club level and include metrics related to overall team performance, discipline (cards), and underlying player metrics (average age of the squad and number of different players used). All participating clubs from that season are represented.
License
CC BY-NC-SA 4.0
Who Can Use It
- Beginner Data Scientists: To practice data cleaning, manipulation, and analysis using Python libraries like Pandas.
- Sports Analysts and Enthusiasts: To conduct detailed comparisons of team efficiency using metrics beyond simple goals scored and conceded.
- Students: For educational projects requiring real-world, structured statistical data in the sports domain.
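One way to compare team efficiency beyond raw goals is to set actual goals against expected goals. The sketch below assumes the columns `Squad`, `GF`, `GA`, `xG`, and `xGA` exist under those names and that xG and xGA are season totals comparable to GF and GA. The function name and the `*_delta` column names are hypothetical.

```python
import pandas as pd


def over_under_performance(df: pd.DataFrame) -> pd.DataFrame:
    """Rank clubs by how far actual goals diverged from expected goals."""
    out = df[["Squad"]].copy()
    # Positive: the club scored more than its chances suggested.
    out["attack_delta"] = df["GF"] - df["xG"]
    # Positive: the club conceded fewer than its opponents' chances suggested.
    out["defence_delta"] = df["xGA"] - df["GA"]
    # Combined over- or under-performance at both ends of the pitch.
    out["net_delta"] = out["attack_delta"] + out["defence_delta"]
    return out.sort_values("net_delta", ascending=False).reset_index(drop=True)
```

A large positive `net_delta` may reflect clinical finishing and strong goalkeeping, or simply a favourable run of results, so it is best read alongside the other columns.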
Dataset Name Suggestions
- La Liga 2021-22 Season Team Performance Statistics
- Spanish Football Advanced Club Metrics (2021-22)
- La Liga Statistics for Exploratory Data Analysis
- Expected Goals and Team Stats: La Liga 2021-22
Attributes
Original Data Source: La Liga Statistics for Exploratory Data Analysis