DS4C South Korea COVID-19 Data

Public Health & Epidemiology

Tags and Keywords

Covid-19, Korea, Cases, Public, Health

Free

About

This dataset, DS4C (Data Science for COVID-19 in South Korea), provides structured information on COVID-19 infection cases in South Korea [3]. It was created by reprocessing and structuring report materials from the KCDC (Korea Centers for Disease Control & Prevention) and local governments, which are known for announcing information quickly and transparently [3, 4]. Its primary purpose is to make the data easy to analyse and to help uncover meaningful patterns through data mining and visualisation techniques [3, 4]. A portion of the dataset was accepted at NeurIPS 2020 [3]. The dataset is no longer updated, and the PatientRoute.csv file is currently unavailable due to privacy concerns [5].

Columns

The Case.csv file, a sample component of this dataset, includes the following columns:
  • case_id: A unique identifier for each infection case [6].
  • province: Specifies the Special City, Metropolitan City, or Province(-do) where the case occurred. Examples include Seoul and Gyeonggi-do [7].
  • city: Details the City(-si), County(-gun), or District(-gu) [7].
  • group: A boolean indicator (TRUE/FALSE) to show if the case is part of a group infection [8].
  • infection_case: The specific name of the infection group or other case descriptions, such as 'overseas inflow' [8].
  • confirmed: The accumulated number of confirmed cases related to that infection event [9].
  • latitude: The latitude coordinate (WGS84) of the infection group's location [9].
  • longitude: The longitude coordinate (WGS84) of the infection group's location [10].
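
As a rough illustration of the column list above, the following Python sketch reads Case.csv-style rows and converts each field to a natural type. The function name `load_cases`, the helper `_to_float`, and the sample rows are hypothetical and made up for this example; they are not taken from the dataset. The sketch also assumes that the `group` column is stored as the text TRUE/FALSE and that coordinates may be blank or non-numeric for some rows.

```python
import csv
import io

# Column names as listed for Case.csv above.
EXPECTED_COLUMNS = [
    "case_id", "province", "city", "group",
    "infection_case", "confirmed", "latitude", "longitude",
]


def _to_float(value):
    """Return a float, or None when the field is blank or non-numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def load_cases(handle):
    """Read Case.csv-style rows from an open text handle into typed dicts."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    rows = []
    for raw in reader:
        rows.append({
            "case_id": raw["case_id"].strip(),
            "province": raw["province"].strip(),
            "city": raw["city"].strip(),
            # Assumption: the boolean is written as the text TRUE or FALSE.
            "group": raw["group"].strip().upper() == "TRUE",
            "infection_case": raw["infection_case"].strip(),
            "confirmed": int(raw["confirmed"]),
            "latitude": _to_float(raw["latitude"]),
            "longitude": _to_float(raw["longitude"]),
        })
    return rows


# Illustrative rows only (invented for this sketch, not real records).
SAMPLE = """case_id,province,city,group,infection_case,confirmed,latitude,longitude
A1,Seoul,Example-gu,TRUE,example cluster,10,37.5,127.0
A2,Gyeonggi-do,,FALSE,overseas inflow,5,,
"""

cases = load_cases(io.StringIO(SAMPLE))
```

With the real file, the same function could be called on `open("Case.csv", newline="", encoding="utf-8")`, assuming the file is UTF-8 encoded.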

Distribution

The data is typically provided in CSV format [1]. The Case.csv sample file is 11.71 kB in size and contains 8 columns [6] and 174 valid records (rows) [7-10]. The PatientRoute.csv file is currently unavailable due to privacy considerations [5].
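
One way to confirm that a downloaded copy matches the stated shape (8 columns, 174 records) is a small check like the sketch below. The helper name `csv_shape` is hypothetical, and the check assumes the file has a single header row.

```python
import csv
import io


def csv_shape(handle):
    """Return (column_count, record_count) for a CSV with one header row."""
    reader = csv.reader(handle)
    header = next(reader, [])
    # csv.reader yields an empty list for blank lines; those are not counted.
    records = sum(1 for row in reader if row)
    return len(header), records


# Tiny inline example so the sketch runs without the real file.
demo_shape = csv_shape(io.StringIO("a,b\n1,2\n\n3,4\n"))

# Against the real file (path and encoding are assumptions):
# with open("Case.csv", newline="", encoding="utf-8") as f:
#     assert csv_shape(f) == (8, 174)
```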

Usage

This dataset is ideal for various applications and use cases, including:
  • Applying data mining and visualisation techniques to find meaningful patterns related to COVID-19 spread and cases [3, 4].
  • Conducting exploratory data analysis (EDA), such as analysing floating population data or identifying who spreads the coronavirus [5].
  • Developing time series geospatial analyses using tools like Folium [5].
  • Supporting research on public health and epidemiology, particularly in the context of disease outbreaks [3, 11, 12].
  • Participating in data visualisation and AI competitions focused on COVID-19 [13].
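
As a minimal sketch of the pattern-finding and geospatial uses listed above, the Python below totals confirmed cases per province (split by group and non-group infections) and converts rows with usable coordinates into a GeoJSON FeatureCollection. The function names and the sample rows are hypothetical and invented for illustration. The sketch assumes `group` is stored as TRUE/FALSE text and that rows without numeric coordinates should simply be skipped.

```python
import csv
import io
import json
from collections import defaultdict


def confirmed_by_province(rows):
    """Sum confirmed cases per province, split into group and non-group."""
    totals = defaultdict(lambda: {"group": 0, "non_group": 0})
    for row in rows:
        is_group = row["group"].strip().upper() == "TRUE"
        key = "group" if is_group else "non_group"
        totals[row["province"]][key] += int(row["confirmed"])
    return dict(totals)


def to_geojson(rows):
    """Build a GeoJSON FeatureCollection from rows with numeric coordinates."""
    features = []
    for row in rows:
        try:
            lat = float(row["latitude"])
            lon = float(row["longitude"])
        except (TypeError, ValueError):
            continue  # blank or non-numeric coordinates: skip this row
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {
                "infection_case": row["infection_case"],
                "confirmed": int(row["confirmed"]),
            },
        })
    return {"type": "FeatureCollection", "features": features}


# Illustrative rows only (invented for this sketch, not real records).
SAMPLE = """case_id,province,city,group,infection_case,confirmed,latitude,longitude
A1,Seoul,Example-gu,TRUE,example cluster,10,37.5,127.0
A2,Seoul,,FALSE,overseas inflow,3,,
A3,Gyeonggi-do,,FALSE,overseas inflow,5,,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
totals = confirmed_by_province(rows)
geojson_text = json.dumps(to_geojson(rows))
```

A FeatureCollection of this shape can be handed to mapping tools that accept GeoJSON, such as Folium's `GeoJson` layer, as a starting point for the geospatial analyses mentioned above.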

Coverage

The dataset primarily covers COVID-19 infection cases within South Korea, with data from provinces and cities across the country [3, 7]. Geographic coordinates (latitude and longitude) are provided for group infections [9, 10]. The data reflects the period when COVID-19 had infected more than 10,000 people in South Korea [3]. The dataset no longer receives updates [5]. The Case.csv columns do not include demographic attributes; each record describes an infection case rather than an individual patient [6-10].

License

CC BY-NC-SA 4.0

Who Can Use It

This dataset is intended for a range of users interested in public health data and data analysis:
  • Data scientists and analysts: To reprocess information, perform analyses, and find insights into COVID-19 patterns [3, 4].
  • Researchers and academics: Particularly those in public health, epidemiology, and data science, as evidenced by partnerships with universities and research institutions [11, 12].
  • Competitors in data challenges: Ideal for those participating in hackathons and competitions focused on COVID-19 visualisation and AI [13].
  • Journalists and media outlets: For informing public understanding through news articles and blog posts about the pandemic's impact in South Korea [12].

Dataset Name Suggestions

  • DS4C South Korea COVID-19 Data
  • Korean COVID-19 Infection Cases
  • KCDC COVID-19 Dataset
  • South Korea Pandemic Data

Attributes

Original Data Source: DS4C South Korea COVID-19 Data

Listing Stats

Views: 0
Downloads: 0
Listed: 08/07/2025
Region: Global
Universal Data Quality Score (UDQS): 5 / 5
Version: 1.0
