Public Safety Crime Report
Public Safety & Security
Free
About
This dataset provides a detailed record of crimes reported across various regions from 2020 to the present. It offers valuable insights into crime trends, patterns, and changes in crime rates over time, making it suitable for analysis and modelling purposes to enhance public safety measures.
Columns
- DR_NO: A unique identifier for each crime report. This numerical code ranges from 817 to 252 million, with a mean value around 220 million and a standard deviation of 13.2 million.
- Date Rptd: The date when the crime was reported. This datetime field covers incidents from 1 January 2020 to 28 March 2025.
- DATE OCC: The actual date the crime occurred. This datetime field spans from 1 January 2020 to 27 March 2025.
- TIME OCC: The time the crime occurred, stored as a number in 24-hour HHMM format (for example, 1 is 00:01 and 2359 is 23:59). Values range from 1 to 2359, with a mean of about 1,340 and a standard deviation of 651.
- AREA: A numeric code representing the geographical area of the crime. Codes range from 1 to 21, with a mean of 10.7.
- AREA NAME: The name of the geographical area. There are 21 unique area names, with "Central" being the most common, accounting for 7% of records, and "77th Street" for 6%.
- Rpt Dist No: The reporting district number for the incident. Numbers range from 101 to 2199, with a mean of about 1,120 and a standard deviation of 611.
- Part 1-2: A classification of the crime, where "Part 1" indicates more serious crimes and "Part 2" less serious ones. Part 1 crimes account for approximately 60% of records, and Part 2 for approximately 40%.
- Crm Cd: A numeric code representing the type of crime. Codes range from 110 to 956, with a mean of 500.
- Crm Cd Desc: A description of the crime type. "VEHICLE - STOLEN" is the most frequent, making up 11% of records, followed by "BATTERY - SIMPLE ASSAULT" at 7%. There are 140 unique crime descriptions.
- Mocodes: Modus operandi codes describing the method used in the crime. This field has 15% missing values. Code "344" is the most common among non-null entries.
- Vict Age: The age of the victim. Ages range from -4 to 120, with a mean age of 28.9 and a standard deviation of 22. Note: The presence of negative ages may indicate data entry anomalies.
- Vict Sex: The sex of the victim. "M" (Male) accounts for 40% of records and "F" (Female) for 36%; the remaining 24% are other, unspecified, or blank. About 14% of records have no value in this field.
- Vict Descent: The ethnicity or descent of the victim. "H" (Hispanic) is the most common at 29%, followed by "W" (White) at 20%. This field has 14% missing values. There are 20 unique descent codes.
- Premis Cd: A numeric code for the type of premises where the crime occurred. Codes range from 101 to 976, with a mean of 306.
- Premis Desc: A description of the type of premises. "STREET" is the most common at 26%, followed by "SINGLE FAMILY DWELLING" at 16%. This field has a small number of missing values (588). There are 306 unique premise descriptions.
- Weapon Used Cd: A numeric code for the weapon used, if applicable. This field has a significant number of missing values (67%). Codes range from 101 to 516.
- Weapon Desc: A description of the weapon used. When a weapon is recorded, "STRONG-ARM (HANDS, FIST, FEET OR BODILY FORCE)" is the most common description, representing 17% of total records. This field also has 67% missing values.
- Status: A status code of the crime case. "IC" (Invest Cont) is the most common, making up 80% of records.
- Status Desc: A description of the case status. "Invest Cont" (Investigation Continues) is the primary status at 80%, followed by "Adult Other" at 11%.
- Crm Cd 1: The first of four crime-code slots for incidents involving multiple offences. Values range from 110 to 956, with a mean of 500, the same range and mean as Crm Cd. Only 11 records have no value in this field.
- Crm Cd 2: A second additional crime code. This field has a large number of missing values (93%). When present, codes range from 210 to 999.
- Crm Cd 3: A third additional crime code. This field is almost entirely empty: only 2,314 of roughly 1.01 million records hold a value, so the missing share rounds to 100%. When present, codes range from 310 to 999.
- Crm Cd 4: A fourth additional crime code. This field is almost entirely empty: only 64 of roughly 1.01 million records hold a value, so the missing share rounds to 100%. When present, codes range from 821 to 999.
- LOCATION: A text description of the crime location. There are over 66,000 unique location descriptions.
- Cross Street: A nearby cross street for the crime location. This field has a high percentage of missing values (85%). "BROADWAY" is a common cross street when specified.
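- LAT: The latitude of the crime location. Values are primarily concentrated between 33.65 and 34.33. Some records hold 0.00, which most likely marks an unknown location rather than a real coordinate.
- LON: The longitude of the crime location. Values are primarily concentrated between -118.67 and -116.29. Some records fall between -2.37 and 0.00, which most likely marks an unknown or mis-entered location.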
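Several of the columns above need light cleaning before analysis. The following is a minimal sketch, assuming the file is read with Python's standard library so every value arrives as a string. The helper names (parse_time_occ, clean_age, clean_coordinates) are hypothetical, and the HHMM reading of TIME OCC is inferred from its 1 to 2359 range rather than stated by the source.

```python
from datetime import time
from typing import Optional, Tuple


def parse_time_occ(raw: str) -> Optional[time]:
    """Convert a TIME OCC value such as '1' or '2359' into a time object."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return None
    # Assumption: the last two digits are minutes, the rest are hours (HHMM).
    hours, minutes = divmod(value, 100)
    if not (0 <= hours <= 23 and 0 <= minutes <= 59):
        return None
    return time(hours, minutes)


def clean_age(raw: str) -> Optional[int]:
    """Return the victim age, or None when it is missing or negative."""
    try:
        age = int(raw)
    except (TypeError, ValueError):
        return None
    # Negative ages are treated as entry anomalies; 120 is the stated maximum.
    return age if 0 <= age <= 120 else None


def clean_coordinates(lat_raw: str, lon_raw: str) -> Optional[Tuple[float, float]]:
    """Return (lat, lon) as floats, or None when either is missing or 0.00."""
    try:
        lat, lon = float(lat_raw), float(lon_raw)
    except (TypeError, ValueError):
        return None
    # Assumption: a coordinate of exactly 0.00 means the location is unknown.
    if lat == 0.0 or lon == 0.0:
        return None
    return lat, lon
```

Whether an age of 0 should also count as missing is left to the analyst; this sketch keeps it.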
Distribution
This dataset is provided in CSV format ("Crime_Data_from_2020_to_Present.csv") and has a file size of 238.98 MB. It contains 28 columns and approximately 1.01 million records. Most core fields are fully populated, but "Mocodes", "Vict Sex", "Vict Descent", "Weapon Used Cd", "Weapon Desc", "Cross Street", "Crm Cd 2", "Crm Cd 3", and "Crm Cd 4" contain a notable share of missing values, from about 14% for the victim fields to nearly 100% for "Crm Cd 3" and "Crm Cd 4". The dataset is expected to be updated quarterly.
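The missing-value shares quoted for each column can be checked directly. The sketch below is one possible way to do that with the standard csv module; the function name missing_share is hypothetical, and the four sample rows are made up so the block runs without the real file.

```python
import csv
import io
from collections import Counter


def missing_share(rows, columns):
    """Return {column: fraction of rows whose value is empty or absent}."""
    missing = Counter()
    total = 0
    for row in rows:
        total += 1
        for col in columns:
            if not (row.get(col) or "").strip():
                missing[col] += 1
    return {col: (missing[col] / total if total else 0.0) for col in columns}


# Tiny in-memory sample standing in for the real file (values are made up).
sample = io.StringIO(
    "DR_NO,Mocodes,Weapon Used Cd,Cross Street\n"
    "1,344,,\n"
    "2,,400,BROADWAY\n"
    "3,344,,\n"
    "4,1822,,\n"
)
shares = missing_share(
    csv.DictReader(sample), ["Mocodes", "Weapon Used Cd", "Cross Street"]
)
print(shares)
```

To run this against the real file, pass a csv.DictReader opened on "Crime_Data_from_2020_to_Present.csv" instead of the sample. The file's text encoding is not stated in this listing, so it may need to be set explicitly.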
Usage
This dataset is ideal for:
- Trend Analysis: Identifying seasonal or yearly patterns in crime rates.
- Predictive Modelling: Developing machine learning models to forecast high-risk areas.
- Policy Planning: Supporting policymakers in designing targeted crime prevention strategies.
- Visualisation Projects: Creating heatmaps, dashboards, and visual reports for crime data.
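As a starting point for the trend analysis mentioned above, incidents can be counted per calendar month of occurrence. This is a minimal sketch rather than a full workflow: the function name monthly_counts is hypothetical, and the date format is an assumption, since this listing does not say how "DATE OCC" is written in the file.

```python
from collections import Counter
from datetime import datetime

# Assumption: dates look like 03/27/2025; adjust the format to match the file.
DATE_FORMAT = "%m/%d/%Y"


def monthly_counts(rows, date_column="DATE OCC", date_format=DATE_FORMAT):
    """Count rows per (year, month), skipping values that fail to parse."""
    counts = Counter()
    for row in rows:
        try:
            occurred = datetime.strptime(row[date_column].strip(), date_format)
        except (KeyError, ValueError, AttributeError):
            continue
        counts[(occurred.year, occurred.month)] += 1
    # Sort by (year, month) so the result reads chronologically.
    return dict(sorted(counts.items()))


# Made-up rows for illustration only.
rows = [
    {"DATE OCC": "01/01/2020"},
    {"DATE OCC": "01/15/2020"},
    {"DATE OCC": "03/27/2025"},
    {"DATE OCC": "not a date"},
]
print(monthly_counts(rows))
```

The same counts can be grouped by "AREA NAME" or "Crm Cd Desc" as well to compare patterns across areas or crime types.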
Coverage
The dataset covers crimes reported across various geographical regions, indicated by area codes and names, as well as latitude and longitude coordinates. The time range for the data is from 2020 to the present, with reported crime dates extending up to 28 March 2025 and crime occurrence dates up to 27 March 2025. It includes demographic information about victims, such as age, sex, and descent, allowing for analysis of crime impact on different groups.
License
CC0: Public Domain
Who Can Use It
- Researchers: For academic studies on criminology and public safety.
- Data Analysts: To uncover patterns and generate insights into crime dynamics.
- Law Enforcement Agencies: For operational planning and resource allocation.
- Policymakers: To inform the development of evidence-based crime prevention strategies.
- General Public/Journalists: For understanding local crime landscapes and informing public discourse.
Dataset Name Suggestions
- Crime Records: 2020 Onwards
- Recent Crime Incidents Data
- Crime Trends 2020-Present
- Public Safety Crime Report
Attributes
Original Data Source: Public Safety Crime Report