Poker Hand Prediction
Data Science and Analytics
Free
About
This dataset is designed for the classification of poker hands. Each record represents a hand of five playing cards drawn from a standard 52-card deck. The primary purpose is to predict the "Poker Hand" class attribute based on the card suits and ranks. Released in January 2007 by Robert Cattral and Franz Oppacher of Carleton University, it presents a challenging problem for classification algorithms. Relational learners and models capable of learning high-level constructs have demonstrated an advantage when used with this data.
Columns
The dataset comprises 10 predictive attributes describing the five cards in a hand, plus one goal attribute representing the poker hand itself. No attribute has missing values.
- S1, S2, S3, S4, S5 (Suit of card #1-5): Ordinal attributes (1-4) representing the suit of each card: 1 = Hearts, 2 = Spades, 3 = Diamonds, 4 = Clubs.
- C1, C2, C3, C4, C5 (Rank of card #1-5): Numerical attributes (1-13) representing the rank of each card: Ace, 2, 3, ..., Queen, King.
- CLASS (Poker Hand): Ordinal attribute (0-9) defining the type of poker hand:
  - 0: Nothing in hand; not a recognised poker hand.
  - 1: One pair; one pair of equal ranks within five cards.
  - 2: Two pairs; two pairs of equal ranks within five cards.
  - 3: Three of a kind; three equal ranks within five cards.
  - 4: Straight; five cards, sequentially ranked with no gaps.
  - 5: Flush; five cards with the same suit.
  - 6: Full house; a pair plus three of a kind of a different rank.
  - 7: Four of a kind; four equal ranks within five cards.
  - 8: Straight flush; a straight plus a flush.
  - 9: Royal flush; Ace, King, Queen, Jack, Ten, all of the same suit.
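The class definitions above can be expressed as a small rule-based labeller. The following is a minimal sketch, not the method used to build the dataset: the function name `classify_hand` is hypothetical, and it assumes standard poker rules in which an Ace (coded as 1) may end a straight low (A-2-3-4-5) or high (10-J-Q-K-A) but may not wrap around.

```python
from collections import Counter


def classify_hand(suits, ranks):
    """Return a CLASS label (0-9) for five cards.

    suits: five ints in 1-4; ranks: five ints in 1-13 (1 = Ace).
    Assumption: Ace may be low or high in a straight, with no wrap-around.
    """
    # Multiplicities of each rank, largest first, e.g. [3, 2] for a full house.
    counts = sorted(Counter(ranks).values(), reverse=True)
    rank_set = set(ranks)
    is_flush = len(set(suits)) == 1
    is_royal = rank_set == {1, 10, 11, 12, 13}
    # Five distinct ranks spanning exactly four steps form a straight;
    # the Ace-high case is checked separately because Ace is coded as 1.
    is_straight = len(rank_set) == 5 and (max(ranks) - min(ranks) == 4 or is_royal)

    if is_flush and is_royal:
        return 9
    if is_flush and is_straight:
        return 8
    if counts[0] == 4:
        return 7
    if counts == [3, 2]:
        return 6
    if is_flush:
        return 5
    if is_straight:
        return 4
    if counts[0] == 3:
        return 3
    if counts == [2, 2, 1]:
        return 2
    if counts[0] == 2:
        return 1
    return 0


# Ten, Jack, King, Queen, Ace of suit 1, in arbitrary order.
print(classify_hand([1, 1, 1, 1, 1], [10, 11, 13, 12, 1]))  # 9
```

Because the rules depend only on rank multiplicities and on whether all suits match, the result does not change when the five cards are reordered.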
Distribution
The dataset is typically provided in CSV format. It includes a total of 1,025,010 instances, split into 25,010 for training and 1,000,000 for testing. The testing dataset file, "poker-hand-testing.csv", is approximately 24.54 MB and contains 11 columns, all with valid and complete data.
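A minimal sketch of reading the 11-column CSV with Python's standard library follows. It rests on assumptions not stated here: that the columns are interleaved as S1, C1, S2, C2, ..., S5, C5, CLASS, and that a header row may or may not be present. The helper name `read_hands` and the sample row are illustrative only; check the actual file layout before relying on this.

```python
import csv
import io


def read_hands(lines):
    """Yield (suits, ranks, label) tuples from CSV text lines.

    Assumed column order: S1, C1, S2, C2, S3, C3, S4, C4, S5, C5, CLASS.
    """
    for row in csv.reader(lines):
        # Skip blank lines and a header row, if the file has one.
        if not row or not row[0].strip().isdigit():
            continue
        values = [int(v) for v in row]
        suits = values[0:10:2]   # S1..S5
        ranks = values[1:10:2]   # C1..C5
        yield suits, ranks, values[10]


# Hand-written sample standing in for the real file (illustrative row).
sample = io.StringIO(
    "S1,C1,S2,C2,S3,C3,S4,C4,S5,C5,CLASS\n"
    "1,10,1,11,1,13,1,12,1,1,9\n"
)
hands = list(read_hands(sample))
print(hands)
```

For the real file, the same function can be fed an open file handle, for example `with open("poker-hand-testing.csv", newline="") as f: hands = list(read_hands(f))`.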
Usage
This dataset is ideal for developing and evaluating machine learning models, particularly classification algorithms. It serves as a benchmark for researchers exploring the capabilities of different algorithms on challenging, pattern-rich data. Past usage includes research on evolutionary data mining with automatic rule generalisation.
Coverage
The dataset's scope is purely theoretical: it represents ordered poker hands rather than observed play, so it has no geographic, temporal, or demographic coverage. The data was released in January 2007. Card order matters: each ordering of the same five cards counts as a distinct hand, which is why there are 480 possible Royal Flush hands (4 suits × 5! = 120 orderings each) rather than just 4 if order were ignored.
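The Royal Flush count can be checked by direct enumeration. This short sketch uses the rank and suit encodings from the Columns section; the variable names are illustrative.

```python
from itertools import permutations
from math import factorial

# Ranks of a Royal Flush: Ace (1), Ten, Jack, Queen, King.
royal_ranks = (1, 10, 11, 12, 13)

# Every ordering of the five cards, for each of the four suits.
ordered_royals = {
    tuple((suit, rank) for rank in order)
    for suit in range(1, 5)
    for order in permutations(royal_ranks)
}

print(len(ordered_royals))  # 480
assert len(ordered_royals) == 4 * factorial(5)
```

Ignoring order collapses the 120 orderings per suit into a single hand, leaving one Royal Flush per suit.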
License
Attribution 4.0 International (CC BY 4.0)
Who Can Use It
This dataset is suitable for machine learning engineers, data scientists, and academic researchers focused on classification problems, pattern recognition, and algorithm development. It can be used for training and testing predictive models, exploring the advantages of relational learners, and evaluating the ability of algorithms to recognise complex, high-level constructs within data.
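One way to expose the high-level constructs to a conventional learner is to derive order-invariant features from the raw columns. The sketch below is an illustration under assumptions, not a method described by the dataset authors; the function name `hand_features` and the choice of features are hypothetical.

```python
from collections import Counter


def hand_features(suits, ranks):
    """Return features that do not depend on the order of the five cards."""
    # Rank multiplicities, largest first, padded to length 5,
    # e.g. (3, 2, 0, 0, 0) for a full house.
    rank_counts = sorted(Counter(ranks).values(), reverse=True)
    rank_counts += [0] * (5 - len(rank_counts))
    return {
        "rank_counts": tuple(rank_counts),
        "distinct_suits": len(set(suits)),   # 1 means all cards share a suit
        "sorted_ranks": tuple(sorted(ranks)),
    }


a = hand_features([1, 2, 3, 4, 1], [5, 5, 9, 9, 9])
b = hand_features([1, 4, 1, 2, 3], [9, 9, 5, 5, 9])  # same cards, reordered
print(a == b)  # True
```

Features like these make pairs, flushes, and runs directly visible, whereas a model given only the ten raw columns has to discover them across all orderings of a hand.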
Dataset Name Suggestions
- Poker Hand Dataset
- Card Hand Classification
- Playing Card Classification
- Poker Hand Prediction
Attributes
Original Data Source: Poker Hand Prediction