Legal Text Analysis Dataset

Government & Civic Records

Tags and Keywords

Law

Text

NLP

Government

Australia

Legal Text Analysis Dataset on the Opendatabay data marketplace

Free

About

This dataset contains Australian legal cases from the Federal Court of Australia (FCA), collected from AustLII (the Australasian Legal Information Institute). It covers cases from 2006 to 2009 and provides both full text and metadata. Each document captures catchphrases, citation sentences, citation catchphrases, and citation classes; a citation class indicates the type of treatment given to a case cited within the current document. The dataset is suited to training text classification models on legal data and to exploring the key terms associated with different case categories.

Columns

case_id: A unique identifier assigned to each legal case, with 24,985 distinct values present in the dataset.
case_outcome: The citation class, that is, the treatment given to a cited case rather than a verdict. Values include 'cited' (49%) and 'referred to' (18%); the remaining classes are grouped as 'Other' (34%, 8,382 entries). The percentages are rounded, so they sum to slightly more than 100%.
case_title: The official title of the legal case, containing 18,581 distinct titles.
case_text: The full textual content of the legal case document, with 17,921 unique text entries.
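
As a quick sanity check on the figures above, the distinct-value counts and the case_outcome shares can be recomputed from the CSV. The following is a minimal sketch using only the Python standard library. The sample rows are invented for illustration and are not taken from the dataset, and the file name mentioned in the comment is an assumption; only the four column names come from this listing.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows, invented purely for illustration.
# Only the column names are taken from the listing.
SAMPLE_CSV = """case_id,case_outcome,case_title,case_text
Case1,cited,Example v Sample [2007] FCA 1,The applicant relied on the earlier decision.
Case2,referred to,Another v Matter [2008] FCA 2,The court referred to the reasoning below.
Case3,cited,Example v Sample [2007] FCA 1,
"""

COLUMNS = ["case_id", "case_outcome", "case_title", "case_text"]


def summarize(rows):
    """Return (distinct non-empty values per column, share of each case_outcome)."""
    rows = list(rows)
    distinct = {
        col: len({row[col] for row in rows if row[col]})
        for col in COLUMNS
    }
    outcomes = Counter(row["case_outcome"] for row in rows)
    total = len(rows) or 1  # avoid division by zero on an empty file
    shares = {label: count / total for label, count in outcomes.items()}
    return distinct, shares


# For the real download, swap the StringIO for something like
# open("legal_text_classification.csv", newline="", encoding="utf-8")
# (the file name is an assumption). Very long case_text fields may also
# require raising the limit with csv.field_size_limit().
reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
distinct, shares = summarize(reader)
print(distinct)
print(shares)
```

If the listing is accurate, running this over the full file should report 24,985 distinct case_id values and roughly half of the rows labelled 'cited'.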

Distribution

The dataset is distributed as a CSV file and contains legal cases from 2006 to 2009. The exact number of records is not specified, but the 24,985 distinct case_id values suggest roughly 25,000 rows. Each record captures a case's text together with its title and citation treatment.

Usage

Text Classification: Develop and train machine learning models to classify legal documents based on their content and outcomes.
Exploratory Data Analysis (EDA): Conduct analysis to identify important keywords and phrases associated with different types of legal case categories.
Natural Language Processing (NLP): Apply NLP techniques for information extraction, sentiment analysis, or summarisation within the legal domain.
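
The exploratory analysis described above can be sketched as a per-class term count: group rows by case_outcome and list the most frequent words in case_text for each group. This is a rough illustration, not a recommended pipeline. The rows below are invented, the stop-word list is a tiny hand-picked placeholder, and the function name `top_terms_by_outcome` is hypothetical.

```python
import re
from collections import Counter, defaultdict

# A tiny, hand-picked stop-word list for illustration only;
# a real analysis would use a much fuller list.
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "and", "on",
              "was", "were", "that", "by", "for"}


def top_terms_by_outcome(rows, n=3):
    """Return the n most frequent non-stop-word terms in case_text per case_outcome."""
    counts = defaultdict(Counter)
    for row in rows:
        text = row.get("case_text") or ""  # tolerate missing or empty text
        tokens = re.findall(r"[a-z]+", text.lower())
        counts[row["case_outcome"]].update(
            t for t in tokens if t not in STOP_WORDS
        )
    return {
        label: [term for term, _ in counter.most_common(n)]
        for label, counter in counts.items()
    }


# Invented rows, not taken from the dataset.
rows = [
    {"case_outcome": "cited",
     "case_text": "The appeal was dismissed. The appeal lacked merit."},
    {"case_outcome": "cited",
     "case_text": "Costs of the appeal were awarded."},
    {"case_outcome": "referred to",
     "case_text": "The tribunal decision was referred to by the tribunal."},
]

result = top_terms_by_outcome(rows, n=1)
print(result)
```

Raw counts favour words that are common across every class, so a weighting scheme such as TF-IDF is usually a better basis for comparing categories once the basic distribution is understood.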

Coverage

Geographic Scope: The dataset is focused on Australia, specifically drawing cases from the Federal Court of Australia (FCA).
Time Range: It encompasses legal cases from a four-year period, including 2006, 2007, 2008, and 2009.
Data Availability: All cases from the Federal Court of Australia within the specified years are included in the dataset, ensuring a consistent collection over this period.

License

CC0

Who Can Use It

Data Scientists and Machine Learning Engineers: For building and refining models for legal text classification and legal analytics.
Legal Researchers and Scholars: To study legal trends, citation patterns, and judicial outcomes.
Academic Institutions: Particularly those involved in Computer Science and Engineering, for research into Natural Language Processing applied to legal texts.
Government Analysts: For insights into legal precedents and case management.

Dataset Name Suggestions

Australian Federal Court Cases
Legal Case Citation Dataset
AustLII Text Classification Data
FCA Legal Document Collection
Legal Text Analysis Dataset

Attributes

Original Data Source: Legal Citation Text Classification

Listing Stats

LISTED: 11/06/2025
REGION: GLOBAL
QUALITY: 5 / 5
VERSION: 1.0
