
Jakarta Public Bus Transit Data

Data Science and Analytics

Tags and Keywords

Transjakarta

Transportation

Jakarta

Public

Transactions

Free

About

This dataset provides simulated transaction data for Transjakarta, a major public transportation company based in Jakarta, Indonesia. It was generated in Python with the Faker library and the random module, based on real master data. The purpose of this dummy dataset is to help data analysts build and test analytical frameworks and data structures without waiting for real transaction data, and it fills a gap in publicly available Transjakarta transaction data. Although the master data sources are authentic, the dataset does not represent actual Transjakarta transaction records or their internal structure. It includes detailed information about individual transactions, payment cards, routes, and stop locations for a specific period.
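
As a rough illustration of the approach described above, the sketch below draws one simulated transaction from a small master table using only Python's random module. It is not the generator used for this dataset: the MASTER_ROUTES table, the DEMO-1 corridor ID, the stop names, and the simulate_transaction helper are all hypothetical, and the real process also used Faker (for example for card holder names).

```python
import random
from datetime import datetime, timedelta

# Hypothetical master data; the actual generator drew on authentic route and stop tables.
MASTER_ROUTES = {
    "DEMO-1": {
        "name": "Cibubur - Balai Kota",
        "stops": ["Demo Stop A", "Demo Stop B", "Demo Stop C", "Demo Stop D"],
    },
}

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp layout


def simulate_transaction(rng, trans_id):
    """Return one made-up transaction row as a dict keyed by column name."""
    corridor_id = rng.choice(sorted(MASTER_ROUTES))
    route = MASTER_ROUTES[corridor_id]
    stops = route["stops"]
    # Tap-out must happen at a later stop than tap-in.
    start_seq = rng.randrange(0, len(stops) - 1)
    end_seq = rng.randrange(start_seq + 1, len(stops))
    # Tap-in falls somewhere in April 2023; the trip lasts 15 to 120 minutes.
    tap_in = datetime(2023, 4, 1) + timedelta(minutes=rng.randrange(30 * 24 * 60))
    tap_out = tap_in + timedelta(minutes=rng.randint(15, 120))
    return {
        "transID": trans_id,
        "corridorID": corridor_id,
        "corridorName": route["name"],
        "direction": rng.choice([0, 1]),
        "tapInStopsName": stops[start_seq],
        "stopStartSeq": start_seq,
        "tapInTime": tap_in.strftime(TIME_FORMAT),
        "tapOutStopsName": stops[end_seq],
        "stopEndSeq": end_seq,
        "tapOutTime": tap_out.strftime(TIME_FORMAT),
    }


rng = random.Random(42)  # fixed seed so the run is repeatable
sample_rows = [simulate_transaction(rng, f"T{i:04d}") for i in range(5)]
```

Passing a seeded random.Random instance keeps the simulation repeatable, which helps when the generated rows feed automated tests.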

Columns

  • transID: A unique identifier for each transaction.
  • payCardID: The primary identifier for customers, representing the payment card used for tapping in and out.
  • payCardBank: The name of the bank that issued the customer's payment card (e.g., dki, emoney).
  • payCardName: The customer's name as embedded in their payment card.
  • payCardSex: The customer's sex as embedded in their payment card (F for Female, M for Male).
  • payCardBirthDate: The customer's birth year, extracted from the card details, ranging from 1946 to 2012.
  • corridorID: An identifier for the route corridor, used for grouping routes.
  • corridorName: The name of the route corridor, indicating the start and finish points of each route (e.g., Cibubur - Balai Kota).
  • direction: The direction of travel along the route: 0 for "Go" (outbound) and 1 for "Back" (return).
  • tapInStops: The ID of the stop where the customer tapped in (entered the system).
  • tapInStopsName: The name of the stop where the customer tapped in (e.g., Penjaringan).
  • tapInStopsLat: The latitude coordinate of the tap-in stop, ranging from approximately -6.39 to -6.09.
  • tapInStopsLon: The longitude coordinate of the tap-in stop, ranging from approximately 106.61 to 107.02.
  • stopStartSeq: The sequence number of the tap-in stop within the route, relative to its direction, from 0 to 68.
  • tapInTime: The date and time when the customer tapped in.
  • tapOutStops: The ID of the stop where the customer tapped out (exited the system).
  • tapOutStopsName: The name of the stop where the customer tapped out (e.g., BKN).
  • tapOutStopsLat: The latitude coordinate of the tap-out stop, ranging from approximately -6.39 to -6.09.
  • tapOutStopsLon: The longitude coordinate of the tap-out stop, ranging from approximately 106.61 to 107.02.
  • stopEndSeq: The sequence number of the tap-out stop within the route, relative to its direction, from 1 to 77.
  • tapOutTime: The date and time when the customer tapped out.
  • payAmount: The amount paid by the customer for the transaction, with some transactions being free (0).
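
The column descriptions above suggest a typed record per row. The following is a minimal sketch of parsing a subset of the columns with Python's csv module. The sample rows are made up for illustration, and the timestamp format in TIME_FORMAT is an assumption that should be checked against the actual file. The parse_row and _optional helpers are hypothetical names.

```python
import csv
import io
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumption: verify against dfTransjakarta.csv

# Made-up rows covering a subset of the documented columns.
SAMPLE_CSV = """transID,payCardID,payCardSex,payCardBirthDate,corridorID,direction,stopStartSeq,tapInTime,stopEndSeq,tapOutTime,payAmount
T0001,1234567890,F,1990,DEMO-1,0,3,2023-04-03 06:10:00,9,2023-04-03 06:55:00,3500
T0002,9876543210,M,1975,,1,0,2023-04-03 17:20:00,,,0
"""


def _optional(value, convert):
    """Convert a CSV field, mapping the empty string to None."""
    return convert(value) if value != "" else None


def parse_row(row):
    """Turn one csv.DictReader row into a dict with typed values."""
    return {
        "transID": row["transID"],
        "payCardID": row["payCardID"],  # kept as text: an identifier, not a quantity
        "payCardSex": row["payCardSex"],
        "payCardBirthDate": int(row["payCardBirthDate"]),  # birth year
        "corridorID": _optional(row["corridorID"], str),
        "direction": int(row["direction"]),
        "stopStartSeq": int(row["stopStartSeq"]),
        "tapInTime": datetime.strptime(row["tapInTime"], TIME_FORMAT),
        "stopEndSeq": _optional(row["stopEndSeq"], int),
        "tapOutTime": _optional(
            row["tapOutTime"], lambda v: datetime.strptime(v, TIME_FORMAT)
        ),
        "payAmount": _optional(row["payAmount"], float),
    }


records = [parse_row(row) for row in csv.DictReader(io.StringIO(SAMPLE_CSV))]
```

Fields that the listing describes as possibly missing (corridor and tap-out information) are parsed as optional so that incomplete rows do not raise errors.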

Distribution

This dataset is provided as a single CSV file, dfTransjakarta.csv, with a size of 8.98 MB. It contains 22 columns and approximately 37,900 records. Most columns are fully populated, but a small percentage of records may have missing values in fields such as corridorID, corridorName, and the tap-out stop columns.
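
One way to check which columns have gaps is to compute the share of empty values per column. The sketch below does this on a few made-up rows; the missing_share helper is a hypothetical name, and treating blank strings as missing is an assumption about how gaps appear in the file.

```python
import csv
import io

# Made-up rows; in practice the reader would wrap the real file, e.g.
# open("dfTransjakarta.csv", newline="", encoding="utf-8").
SAMPLE_CSV = """transID,corridorID,corridorName,tapOutStops,tapOutTime
T0001,DEMO-1,Cibubur - Balai Kota,S010,2023-04-03 06:55:00
T0002,,,S011,2023-04-03 07:40:00
T0003,DEMO-1,Cibubur - Balai Kota,,
T0004,DEMO-1,Cibubur - Balai Kota,S012,2023-04-03 18:05:00
"""


def missing_share(reader):
    """Return {column: fraction of rows where the value is empty}."""
    columns = list(reader.fieldnames or [])
    missing = {column: 0 for column in columns}
    total = 0
    for row in reader:
        total += 1
        for column in columns:
            value = row.get(column)
            if value is None or value.strip() == "":
                missing[column] += 1
    return {c: (missing[c] / total if total else 0.0) for c in columns}


shares = missing_share(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for column, share in shares.items():
    print(f"{column}: {share:.0%} missing")
```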

Usage

This dataset suits a range of data analysis and system development tasks. Ideal uses include:
  • Building and validating analytical frameworks for public transportation systems.
  • Testing data structures to ensure they meet requirements for in-depth analytics.
  • Analysing public transport route efficiency, such as identifying busy or less used routes.
  • Investigating the impact of traffic congestion on specific routes.
  • Exploring customer demographics in relation to travel patterns.
  • Developing and refining dashboards and reporting tools for transport operations.
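
As an example of the route analysis mentioned in the list above, the sketch below counts trips per corridor and computes the median trip duration in minutes. The trips are made up, the timestamp format is assumed, and corridor_summary is a hypothetical helper. Because the dataset is simulated, results from the real file illustrate the method and do not describe actual ridership.

```python
from collections import Counter, defaultdict
from datetime import datetime
from statistics import median

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp layout

# Made-up trips using the documented column names.
trips = [
    {"corridorName": "Cibubur - Balai Kota",
     "tapInTime": "2023-04-03 06:10:00", "tapOutTime": "2023-04-03 06:55:00"},
    {"corridorName": "Cibubur - Balai Kota",
     "tapInTime": "2023-04-03 17:00:00", "tapOutTime": "2023-04-03 18:00:00"},
    {"corridorName": "Cibubur - Balai Kota",
     "tapInTime": "2023-04-04 08:00:00", "tapOutTime": ""},  # no tap-out recorded
    {"corridorName": "Demo Corridor B",
     "tapInTime": "2023-04-04 09:00:00", "tapOutTime": "2023-04-04 09:30:00"},
]


def corridor_summary(rows):
    """Return (corridor, trip count, median minutes) sorted by trip count."""
    counts = Counter()
    durations = defaultdict(list)
    for row in rows:
        name = row["corridorName"]
        if not name:
            continue  # skip rows with a missing corridor
        counts[name] += 1
        if row["tapOutTime"]:
            start = datetime.strptime(row["tapInTime"], TIME_FORMAT)
            end = datetime.strptime(row["tapOutTime"], TIME_FORMAT)
            durations[name].append((end - start).total_seconds() / 60)
    return [
        (name, n, median(durations[name]) if durations[name] else None)
        for name, n in counts.most_common()
    ]


summary = corridor_summary(trips)
```

Trips without a tap-out still count toward corridor volume but are left out of the duration figure, which keeps missing values from distorting the median.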

Coverage

The dataset covers Transjakarta, a public transportation service in Jakarta, Indonesia. The transactions are simulated for April 2023, with tap-in and tap-out times recorded between 1 April and 1 May 2023. Demographic data is included: customer birth year, sex, and card-issuing bank. Because this is a simulated dataset and not live transaction data, it may not precisely reflect real-world scenarios or the latest updates to the master data sources.
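
A quick sanity check on the stated coverage is to flag timestamps outside the window. The sketch below assumes the window runs from the start of 1 April 2023 to the end of 1 May 2023 and assumes the timestamp format shown; both should be confirmed against the file. The outside_window helper is a hypothetical name.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp layout
WINDOW_START = datetime(2023, 4, 1)  # inclusive
WINDOW_END = datetime(2023, 5, 2)    # exclusive, so all of 1 May is covered (assumption)


def outside_window(timestamps):
    """Return the timestamp strings that fall outside the coverage window."""
    flagged = []
    for text in timestamps:
        moment = datetime.strptime(text, TIME_FORMAT)
        if not (WINDOW_START <= moment < WINDOW_END):
            flagged.append(text)
    return flagged


# Made-up values for illustration.
flagged = outside_window([
    "2023-04-01 00:00:00",
    "2023-05-01 23:59:59",
    "2023-05-02 00:00:00",
    "2023-03-31 23:59:59",
])
```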

License

CC0: Public Domain

Who Can Use It

This dataset is intended for:
  • Data analysts looking to practise and refine their analytical skills on realistic, simulated data.
  • Researchers studying urban mobility, public transport usage patterns, and demographic influences on transit.
  • Students and educators seeking practical data for projects and teaching scenarios.
  • Developers who need sample data to test applications and systems designed for public transport management.
  • Anyone interested in understanding how public transport transaction data can be structured and analysed.

Dataset Name Suggestions

  • Transjakarta Simulated Passenger Transactions
  • Jakarta Public Bus Transit Data (April 2023)
  • Indonesian Urban Transport Dummy Transactions
  • Transjakarta Service Usage Simulation
  • April 2023 Jakarta Bus System Data

Attributes

Original Data Source: Jakarta Public Bus Transit Data

Listing Stats

  • Views: 1
  • Downloads: 0
  • Listed: 22/08/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
