Opendatabay APP

NYC Yellow Cab and For-Hire Vehicle Records

Data Science and Analytics

Tags and Keywords

Taxi

NYC

TLC

2017

Travel

Free

About

This data set contains trip-level records of taxi and limousine journeys recorded by the New York City Taxi and Limousine Commission (TLC) during 2017. The TLC licenses and regulates over 200,000 operators, who collectively handle a high volume of daily trips. This data set has been prepared for educational purposes.

Columns

  • ID: The unique trip identification number.
  • VendorID: A code identifying the TPEP (Taxicab Passenger Enhancement Program) provider that supplied the record: 1 = Creative Mobile Technologies, LLC; 2 = VeriFone Inc.
  • tpep_pickup_datetime: The specific date and time when the taximeter was engaged.
  • tpep_dropoff_datetime: The specific date and time when the taximeter was disengaged.
  • Passenger_count: The number of passengers, recorded by the driver.
  • Trip_distance: The distance of the trip in miles, as reported by the taximeter.
  • RateCodeID: The final rate code applied to the journey (e.g., Standard rate, JFK, Newark, Negotiated fare, or Group ride).
  • Store_and_fwd_flag: A flag indicating if the trip record was temporarily stored in vehicle memory ("Y") before being transmitted to the vendor due to a lack of server connection.
  • PULocationID: The TLC Taxi Zone ID where the taximeter was engaged (pickup location).
  • DOLocationID: The TLC Taxi Zone ID where the taximeter was disengaged (dropoff location).
  • Payment_type: A numeric code signifying the method of payment used by the passenger (e.g., Credit card, Cash, Dispute).
  • Fare_amount: The calculated fare based on time and distance.
  • Extra: Miscellaneous extras and surcharges, currently the $0.50 overnight charge and the $1 rush hour charge.
  • MTA_tax: The $0.50 MTA tax, triggered automatically based on the metered rate in use.
  • Tip_amount: The tip amount, recorded automatically for credit card payments; cash tips are not included.
  • Tolls_amount: The total sum of all tolls paid during the trip.
  • Improvement_surcharge: The $0.30 surcharge levied upon the initiation of the trip (flag drop).
  • Total_amount: The aggregate amount charged to passengers, exclusive of cash tips.
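
The column descriptions suggest that Total_amount should equal the sum of the individual charge columns. The sketch below checks that for a single record. It assumes the column names exactly as listed above (the real CSV header may use different capitalisation) and assumes no other charge feeds into the total. The example record is made up for illustration and is not taken from the data set.

```python
from decimal import Decimal

# Charge columns as named in the listing; the actual header may differ (assumption).
COMPONENTS = (
    "Fare_amount", "Extra", "MTA_tax",
    "Tip_amount", "Tolls_amount", "Improvement_surcharge",
)

def component_sum(row):
    """Sum the charge components of one trip record (a dict of strings)."""
    return sum((Decimal(row[name]) for name in COMPONENTS), Decimal("0"))

def total_matches(row, tolerance=Decimal("0.01")):
    """Return True if Total_amount agrees with the component sum within tolerance."""
    return abs(Decimal(row["Total_amount"]) - component_sum(row)) <= tolerance

# Made-up record for illustration only.
example = {
    "Fare_amount": "13.00", "Extra": "0.50", "MTA_tax": "0.50",
    "Tip_amount": "2.76", "Tolls_amount": "0.00",
    "Improvement_surcharge": "0.30", "Total_amount": "17.06",
}
print(total_matches(example))
```

Records that fail this check are worth inspecting before modelling, since they may indicate disputes, voided trips, or data entry issues.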

Distribution

The raw data is provided as a single CSV file, New York City TLC Data.csv (2.3 MB), containing 18 columns. Sample data analysis shows approximately 22,700 valid records. Recorded trip distances range up to 34 miles.
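
The figures above (column count, record count, maximum trip distance) can be re-checked after download with a short script. This is a minimal sketch using only the Python standard library. The column name Trip_distance is taken from the listing and may be capitalised differently in the actual file, and the inline sample is made up so the snippet runs without the download.

```python
import csv
import io

def summarize(csv_file):
    """Return (column_count, row_count, max_trip_distance) for an open CSV file."""
    reader = csv.DictReader(csv_file)
    row_count = 0
    max_distance = 0.0
    for row in reader:
        row_count += 1
        max_distance = max(max_distance, float(row["Trip_distance"]))
    return len(reader.fieldnames or []), row_count, max_distance

# Made-up two-row sample for illustration only.
sample = io.StringIO("ID,Trip_distance\n1,3.34\n2,0.80\n")
print(summarize(sample))

# Against the downloaded file (file name as given in the listing):
# with open("New York City TLC Data.csv", newline="") as f:
#     print(summarize(f))
```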

Usage

This data set is well suited to exploratory data analysis, particularly for investigating urban traffic flow and mobility patterns. It is also suitable for training machine learning models, such as decision trees and random forests, to predict trip duration or estimate fares.
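
Trip duration is not a column in the data set, so a duration model first needs a target derived from the pickup and dropoff timestamps. The sketch below shows one way to do that. The timestamp layout in TIMESTAMP_FORMAT is an assumption and should be adjusted to match the file, and the example record is made up.

```python
from datetime import datetime

# Assumed timestamp layout, e.g. "03/25/2017 8:55:43 AM"; adjust if the file differs.
TIMESTAMP_FORMAT = "%m/%d/%Y %I:%M:%S %p"

def trip_duration_minutes(row, fmt=TIMESTAMP_FORMAT):
    """Return the minutes between meter engagement and disengagement."""
    pickup = datetime.strptime(row["tpep_pickup_datetime"], fmt)
    dropoff = datetime.strptime(row["tpep_dropoff_datetime"], fmt)
    return (dropoff - pickup).total_seconds() / 60.0

# Made-up record for illustration: a trip that crosses midnight on New Year's Eve.
example_trip = {
    "tpep_pickup_datetime": "12/31/2017 11:50:00 PM",
    "tpep_dropoff_datetime": "01/01/2018 12:05:00 AM",
}
print(trip_duration_minutes(example_trip))
```

Zero or negative durations are a useful filter for implausible records before fitting a model.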

Coverage

The data covers taxi and limousine trips within New York City TLC zones. The temporal scope is the 2017 calendar year: timestamps run from January 1, 2017, through January 1, 2018, because trips that started on the last day of 2017 are included.

License

CC0: Public Domain

Who Can Use It

Intended users include data scientists focused on transportation dynamics and logistics, academics researching urban planning and congestion, and analysts modelling public transport demand and fare structures. It is also well suited to students who need public data sets for educational projects.

Dataset Name Suggestions

  • NYC Taxi and Limousine Trips 2017
  • New York TLC Journey Data 2017
  • NYC Yellow Cab and For-Hire Vehicle Records

Attributes

Listing Stats

  • Views: 0
  • Downloads: 0
  • Listed: 15/12/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
