Multi-feature Golf Play Dataset
Education & Learning Analytics
Free
About
This is the Extended Golf Play Dataset, a detailed collection that expands on the classic golf dataset [1]. It adds a wide range of features suited to various data science applications and is especially valuable for teaching [1]. The dataset is organised in a long format, where each row represents a single observation, and rows often include textual data such as player reviews or comments [2]. It also contains a set of mini datasets, each tailored to a specific teaching point, for example data cleaning or combining datasets [1]. These let beginners practise on concrete examples and are accompanied by notebooks with step-by-step guides [1].
Columns
The dataset features a variety of columns, including core, extra, and text-based attributes:
- ID: A unique identifying number for each player [1].
- Date: The specific day the data was recorded or the golf session took place [1, 2].
- Weekday: The day of the week, with numerical representation (e.g., 0 for Sunday, 1 for Monday) [1, 3].
- Holiday: Indicates whether the day was a holiday in Japan, given as Yes/No or as 1 for yes and 0 for no [1, 3].
- Month: The month in which golf was played [3].
- Season: The time of year, such as spring, summer, autumn, or winter [1, 3].
- Outlook: Describes the weather conditions during the session (e.g., sunny, cloudy, rainy, snowy) [1, 3].
- Temperature: The ambient temperature during the golf session, recorded in Celsius [1, 3].
- Humidity: The percentage of moisture in the air [1, 3].
- Windy: A boolean indicator of whether it was windy (True/False, or 1 for yes and 0 for no) [1, 3].
- Crowded-ness: A measure of how busy the golf course was, ranging from 0 to 1 [1, 4].
- PlayTime-Hour: The duration for which people played golf, in hours [1].
- Play: Indicates whether golf was played or not (Yes/No) [1].
- Review: Textual feedback from players about their day at golf [1].
- EmailCampaign: Text content of emails sent daily by the golf place [1].
- MaintenanceTasks: Descriptions of work carried out to maintain the golf course [1].
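The column list above can be turned into a typed record when reading a file. The following is a minimal sketch using only the Python standard library. The sample rows are invented placeholders, and the header spellings, the ISO date format, and the `Session`, `to_bool`, and `load_sessions` names are assumptions made for illustration; the actual files may differ.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date

# Invented placeholder rows. Header names follow the column list above,
# but the real file's exact headers and value encodings may differ.
SAMPLE_CSV = """ID,Date,Weekday,Holiday,Month,Season,Outlook,Temperature,Humidity,Windy,Crowded-ness,PlayTime-Hour,Play,Review
1,2021-01-03,0,0,Jan,Winter,sunny,5.5,60,True,0.25,1.5,Yes,Great day
2,2021-01-04,1,0,Jan,Winter,rainy,3.0,85,False,0.10,0.0,No,
"""


@dataclass
class Session:
    """One long-format row, restricted to a subset of the listed columns."""
    player_id: int
    day: date
    weekday: int          # 0 for Sunday, 1 for Monday, ...
    holiday: bool
    outlook: str
    temperature_c: float
    humidity_pct: float
    windy: bool
    crowdedness: float    # expected to lie between 0 and 1
    play_hours: float
    played: bool
    review: str


def to_bool(text: str) -> bool:
    """Accept the encodings mentioned in the listing: Yes/No, True/False, 1/0."""
    return text.strip().lower() in {"1", "true", "yes"}


def parse_row(row: dict) -> Session:
    return Session(
        player_id=int(row["ID"]),
        day=date.fromisoformat(row["Date"]),  # assumes YYYY-MM-DD dates
        weekday=int(row["Weekday"]),
        holiday=to_bool(row["Holiday"]),
        outlook=row["Outlook"],
        temperature_c=float(row["Temperature"]),
        humidity_pct=float(row["Humidity"]),
        windy=to_bool(row["Windy"]),
        crowdedness=float(row["Crowded-ness"]),
        play_hours=float(row["PlayTime-Hour"]),
        played=to_bool(row["Play"]),
        review=row["Review"],
    )


def load_sessions(text: str) -> list[Session]:
    return [parse_row(r) for r in csv.DictReader(io.StringIO(text))]


sessions = load_sessions(SAMPLE_CSV)
```

To read a real file, the `io.StringIO` wrapper would be replaced with an opened file handle, and the parsing adjusted to whatever encodings the file turns out to use.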
Distribution
This dataset is organised in a long format, meaning each row represents a single observation [2]. Data files are typically in CSV format, with sample files updated separately on the platform [5]. Row and record counts are not stated in the provided sources. The dataset also includes a collection of mini datasets within its structure [1].
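In a long-format file, a player who appears on several days contributes several rows rather than extra columns. As a rough illustration, the sketch below regroups long-format records into one entry per player. The records are invented, and the assumption that the same ID recurs across dates is not confirmed by the listing; `hours_by_player` is a hypothetical helper name.

```python
from collections import defaultdict

# Invented long-format records: one row per (ID, Date) observation.
rows = [
    {"ID": 1, "Date": "2021-01-03", "PlayTime-Hour": 1.5},
    {"ID": 2, "Date": "2021-01-03", "PlayTime-Hour": 0.0},
    {"ID": 1, "Date": "2021-01-04", "PlayTime-Hour": 2.0},
]


def hours_by_player(records):
    """Collect each player's observations into a {date: hours} mapping."""
    per_player = defaultdict(dict)
    for rec in records:
        per_player[rec["ID"]][rec["Date"]] = rec["PlayTime-Hour"]
    return dict(per_player)


wide = hours_by_player(rows)
```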
Usage
This dataset is highly versatile and ideal for learning and applying various data science skills:
- Data Visualisation: Learn to create graphs and identify patterns within the data [1].
- Predictive Modelling: Discover which data points are useful for predicting if golf will be played [1].
- Data Cleaning: Practise spotting and managing data that appears incorrect or inconsistent [1].
- Time Series Analysis: Understand how various factors change over time, such as daily or monthly trends [1, 2].
- Data Grouping: Learn to combine similar days or observations together [1].
- Text Analysis: Extract insights from textual features like player reviews, potentially for sentiment analysis or thematic extraction [1, 2].
- Recommendation Systems: Develop models to suggest optimal times to play golf based on historical data [1].
- Data Management: Gain experience in managing and analysing data structured in a long format, which is common for repeated measures [2].
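Two of the uses listed above, data cleaning and data grouping, can be sketched in a few lines of standard-library Python. The example drops rows whose Humidity falls outside 0 to 100 or whose Crowded-ness falls outside 0 to 1, then computes the share of rows with Play equal to "Yes" for each Outlook value. The rows are invented, and `is_valid` and `play_rate_by_outlook` are hypothetical helper names; the real data may call for different validity rules.

```python
from collections import defaultdict

# Invented rows; the third has an impossible humidity value on purpose.
rows = [
    {"Outlook": "sunny", "Humidity": 60.0, "Crowded-ness": 0.4, "Play": "Yes"},
    {"Outlook": "sunny", "Humidity": 55.0, "Crowded-ness": 0.7, "Play": "No"},
    {"Outlook": "rainy", "Humidity": 130.0, "Crowded-ness": 0.1, "Play": "No"},
    {"Outlook": "rainy", "Humidity": 90.0, "Crowded-ness": 0.2, "Play": "No"},
]


def is_valid(row):
    """Humidity is a percentage; Crowded-ness is described as 0 to 1."""
    return (0.0 <= row["Humidity"] <= 100.0
            and 0.0 <= row["Crowded-ness"] <= 1.0)


def play_rate_by_outlook(records):
    """Return the fraction of rows with Play == 'Yes' per Outlook value."""
    played = defaultdict(int)
    total = defaultdict(int)
    for row in records:
        total[row["Outlook"]] += 1
        if row["Play"] == "Yes":
            played[row["Outlook"]] += 1
    return {outlook: played[outlook] / total[outlook] for outlook in total}


clean = [r for r in rows if is_valid(r)]
rates = play_rate_by_outlook(clean)
```

Whether out-of-range rows should be dropped, corrected, or kept and flagged is a judgment call; dropping is used here only because it is the simplest option to show.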
Coverage
The dataset's regional coverage is global [6]. While the Date column records the day the data was captured or the session occurred, no specific time range for the collected data is stated beyond the listing date of 11/06/2025 [1, 6]. Demographic scope includes unique player IDs [1], but no specific demographic details or data availability notes for particular groups or years are provided.
License
CC-BY
Who Can Use It
This dataset is designed for a broad audience:
- New Learners: It is easy to understand and comes with guides to aid the learning process [1].
- Teachers: An excellent resource for conducting classes on data visualisation and interpretation [1].
- Researchers: Suitable for testing novel data analysis methodologies [1].
- Students: Can acquire a wide range of skills, from making graphs to understanding textual data and building recommendation systems [1].
Dataset Name Suggestions
- Golf Play Extended Analytics
- Advanced Golf Session Data
- Long Format Golf Insights
- Multi-feature Golf Play Dataset
- Textual Golf Data for Learning
Attributes
Original Data Source: ⛳️ Golf Play Dataset Extended