Opendatabay APP

TOI Crime News Dataset

Data Science and Analytics

Tags and Keywords

Beginner

Intermediate

NLP

Recommender


Free

About

This dataset aims to support a faster judicial process in India, a country where some legal cases have been known to persist for over a century [1]. On the principle that justice delayed is justice denied, this collection of crime articles is intended as a resource for legal professionals and the data science community alike [1]. It is designed to help judges reach verdicts more quickly by giving them access to similar historical cases, and to give lawyers recent case examples to strengthen their arguments [1]. It also offers a practical dataset for Natural Language Processing (NLP) and recommender system projects [1].

Columns

The 7k Unique crime articles.csv file within this dataset contains the following columns:
  • heading: The main title of the article [2].
  • content_summary: A concise summary of the article's content [2].
  • article_link: The URL leading to the full article [2].
  • img_link: The URL for any image associated with the article [2].
  • month_date: The month and day of publication [2].
  • time: The time of publication [2].
  • Year: The year of publication [2].
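As a minimal sketch of how the columns listed above might be checked when loading the file, the snippet below reads CSV rows with Python's standard library and verifies the header. The function name read_articles and the in-memory sample row are hypothetical, used only for illustration; the file's encoding and delimiter are not stated in the listing and are assumed here to be UTF-8 and a comma.

```python
import csv
import io

# Column names as given in the listing for "7k Unique crime articles.csv".
EXPECTED_COLUMNS = [
    "heading", "content_summary", "article_link",
    "img_link", "month_date", "time", "Year",
]


def read_articles(handle):
    """Read rows from an open CSV handle and fail early if a column is missing."""
    reader = csv.DictReader(handle)
    present = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in present]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


# Invented sample row (not taken from the dataset) so the sketch runs without the file.
sample = io.StringIO(
    "heading,content_summary,article_link,img_link,month_date,time,Year\n"
    "Example heading,Example summary,https://example.com/a,"
    "https://example.com/a.jpg,Jan 1,10:00,2023\n"
)
rows = read_articles(sample)
print(len(rows), rows[0]["heading"])
```

With the real file, the same function could be called on a handle opened as open("7k Unique crime articles.csv", newline="", encoding="utf-8"), assuming that encoding is correct.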

Distribution

The dataset comprises two CSV files: Crime_Articles.csv and 7k Unique crime articles.csv [3]. Crime_Articles.csv contains 80,000 articles with many repeats, of which over 6,000 are unique [3]. The 7k Unique crime articles.csv file contains nearly 8,000 unique crime articles [3]. The data was extracted from the Times of India website [3]. Images corresponding to individual articles are planned to be attached in the future [3]. Most articles are from 2021 (33%) and 2022 (29%), with additional 2023 data spanning 1st January to 31st December [4, 5]. The dataset's quality is rated 5 out of 5, and its current version is 1.0 [3, 6].
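The repeated articles in Crime_Articles.csv and the per-year shares quoted above could be examined with something like the following sketch. It assumes that Crime_Articles.csv has the same columns as the 7k file and that article_link identifies an article uniquely; neither is confirmed by the listing. The helper names deduplicate and year_shares are hypothetical.

```python
from collections import Counter


def deduplicate(rows, key="article_link"):
    """Keep the first row seen for each value of `key` (assumed to identify an article)."""
    seen = set()
    unique = []
    for row in rows:
        value = row[key].strip()
        if value not in seen:
            seen.add(value)
            unique.append(row)
    return unique


def year_shares(rows):
    """Return the fraction of rows per value of the Year column."""
    counts = Counter(row["Year"] for row in rows)
    total = sum(counts.values())
    return {year: count / total for year, count in counts.items()}


# Invented rows for illustration only.
rows = [
    {"article_link": "https://example.com/1", "Year": "2021"},
    {"article_link": "https://example.com/1", "Year": "2021"},
    {"article_link": "https://example.com/2", "Year": "2022"},
]
unique_rows = deduplicate(rows)
print(len(unique_rows), year_shares(unique_rows))
```

Deduplicating on the link rather than the heading is a design choice for this sketch: two distinct articles may share a heading, whereas a URL is more likely to be unique.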

Usage

This dataset is ideally suited for:
  • Developing recommender systems for legal professionals [1].
  • Assisting judges in quickly identifying precedents and similar cases for more efficient verdict delivery [1].
  • Providing lawyers with relevant and recent case examples to inform their arguments and strategies [1].
  • Supporting data scientists working on NLP tasks, text analysis, and machine learning models related to legal or journalistic data [1].
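One way the recommender use case above might be prototyped is with TF-IDF vectors and cosine similarity over the heading or content_summary text. The sketch below uses only the standard library and is an illustrative baseline, not a method described by the dataset's authors; the function names and the sample headlines are invented.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (token -> weight) per document."""
    term_counts = [Counter(tokenize(doc)) for doc in docs]
    doc_freq = Counter()
    for counts in term_counts:
        doc_freq.update(counts.keys())
    n_docs = len(docs)
    # Smoothed inverse document frequency, so no weight is ever zero or undefined.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return [{t: c * idf[t] for t, c in counts.items()} for counts in term_counts]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(docs, query_index, top_k=3):
    """Return indices of the top_k documents most similar to docs[query_index]."""
    vectors = tfidf_vectors(docs)
    query = vectors[query_index]
    scored = [(cosine(query, vec), i) for i, vec in enumerate(vectors) if i != query_index]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:top_k]]


# Invented headlines for illustration only.
headlines = [
    "Man arrested for bank robbery in Mumbai",
    "Police arrest two in Mumbai bank robbery case",
    "Court grants bail in land dispute",
]
print(most_similar(headlines, 0, top_k=1))
```

For the full dataset, a library implementation of TF-IDF or sentence embeddings would likely scale better; this version is meant only to make the idea concrete.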

Coverage

The dataset focuses on Indian crime articles [1]. The articles span 2021, 2022, and 2023, with date and time information for each entry [4, 5]. The 2023 data covers the entire year, from 01/01/2023 to 31/12/2023 [4]. No demographic scope is specified, since the data consists of crime news articles.
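To check the stated 2023 coverage, the month_date and Year columns could be combined into a single date, as in the sketch below. The listing does not say how month_date is formatted; the "%b %d" pattern (for example "Jan 1") is purely an assumption and would need to be adjusted after inspecting the file. The helper names are hypothetical.

```python
from datetime import date, datetime


def parse_publication_date(month_date, year, fmt="%b %d"):
    """Combine month_date and Year into a date; return None if the assumed format does not match."""
    text = f"{month_date.strip()} {str(year).strip()}"
    try:
        return datetime.strptime(text, f"{fmt} %Y").date()
    except ValueError:
        return None


def in_2023(row):
    """True if the row parses to a date within 01/01/2023 to 31/12/2023."""
    parsed = parse_publication_date(row["month_date"], row["Year"])
    return parsed is not None and date(2023, 1, 1) <= parsed <= date(2023, 12, 31)


# Invented row for illustration only.
print(in_2023({"month_date": "Dec 31", "Year": "2023"}))
```

Rows that fail to parse return None rather than raising, so the share of unparseable rows can be counted before choosing a final format string.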

License

CC BY-NC-SA

Who Can Use It

  • Lawyers: To research similar cases, understand recent legal trends, and find examples for court arguments [1].
  • Judges: To streamline the judgment process by referencing relevant prior cases [1].
  • Data Scientists and Analysts: For projects involving Natural Language Processing, building recommender systems, and exploring large text datasets [1].
  • Researchers: To study patterns in Indian crime reporting or the legal system.

Dataset Name Suggestions

  • Indian Crime Articles from TOI
  • Justice AI: Indian Legal Data
  • TOI Crime News Dataset
  • Indian Judiciary Support Data
  • Legal Recommendation Dataset (India)

Attributes

Listing Stats

VIEWS

0

DOWNLOADS

0

LISTED

17/06/2025

REGION

GLOBAL

UDQS QUALITY

5 / 5

VERSION

1.0
