Otkritie Bank Housing Competition Data
Commodities & Real Estate
Free
About
The data consists of imaginary Paris housing information, originally collected for a data science competition held by "Otkritie" bank. It is structured to support clustering analysis and machine learning, and was used for model development with techniques such as k-Means and the elbow method. It provides property features, location prestige, and final price figures for apartments in the city.
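As an illustration of the k-Means and elbow-method workflow mentioned above, here is a minimal standard-library sketch. It is not the competition's actual code: the function names (`min_max_scale`, `kmeans`, `elbow_curve`) are hypothetical, the demo points are synthetic stand-ins rather than rows from the file, and in practice a library implementation such as scikit-learn's `KMeans` would usually be preferred. Scaling is included because the listed columns have very different ranges (for example squareMeters versus cityPartRange).

```python
import random


def min_max_scale(points):
    """Rescale every column to [0, 1] so large-valued columns do not dominate."""
    columns = list(zip(*points))
    lows = [min(col) for col in columns]
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return [
        tuple((v - lo) / span for v, lo, span in zip(p, lows, spans))
        for p in points
    ]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm. Returns (centroids, labels, inertia)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break
        labels = new_labels
    inertia = sum(
        squared_distance(p, centroids[lab]) for p, lab in zip(points, labels)
    )
    return centroids, labels, inertia


def elbow_curve(points, k_values, seed=0):
    """Inertia (within-cluster sum of squares) for each candidate k."""
    return {k: kmeans(points, k, seed=seed)[2] for k in k_values}


if __name__ == "__main__":
    # Synthetic stand-in data: three groups of (squareMeters, cityPartRange).
    rng = random.Random(1)
    centres = [(20_000, 2), (50_000, 5), (90_000, 9)]
    points = [
        (cx + rng.uniform(-2_000, 2_000), cy + rng.uniform(-0.5, 0.5))
        for cx, cy in centres
        for _ in range(30)
    ]
    for k, inertia in elbow_curve(min_max_scale(points), range(1, 7)).items():
        print(k, round(inertia, 4))
```

The "elbow" is the value of k after which the inertia stops dropping sharply. With a single random initialisation the curve can be slightly noisy, so several seeds may be worth comparing.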
Columns
The listing details 10 of the 18 available columns, all showing 100% validity across 10,000 records:
- price: The price of the house. Values range from 0 to 9,999, with a mean value of 5k.
- squareMeters: The area of the apartment in square metres. Values range from 89 to 100.0k, with a mean of 49.9k.
- numberOfRooms: The count of rooms in the apartment. There are 100 unique values, with 54 being the most common (1%).
- floors: The total number of floors in the house. Values range from 1 to 100, with a mean of 50.3.
- cityCode: The unique code identifying the city location. Values range from 3 to 100.0k, with a mean of 50.2k.
- cityPartRange: The prestige level of the area, defined on a range from 0 to 10. Values range from 1 to 10, with a mean of 5.51.
- numPrevOwners: The count of previous owners. Values range from 1 to 10, with a mean of 5.52.
- made: The year when the house was built. Values range from 1990 to 2021, with a mean of about 2010 (reported as 2.01k).
- isNewBuilt: A Boolean indicator detailing whether the apartment is new or renovated. The dataset is nearly balanced, with 4,991 records marked as true and 5,009 as false.
- hasStormProtector: A Boolean indicator showing if the apartment has a storm protector. The dataset is also balanced for this feature, with 4,999 records marked as true.
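The per-column figures listed above can be re-derived from the file itself. The following is a minimal sketch using only the standard library. The column names come from the list above; the helper names (`summarize`, `summarize_csv`) are hypothetical, and the encoding of the Boolean columns as 0/1 or true/false strings is an assumption that should be checked against the actual CSV.

```python
import csv
from statistics import mean

# Column names as given in the listing above.
NUMERIC_COLUMNS = [
    "price", "squareMeters", "numberOfRooms", "floors",
    "cityCode", "cityPartRange", "numPrevOwners", "made",
]
BOOLEAN_COLUMNS = ["isNewBuilt", "hasStormProtector"]

# Assumption: Boolean columns are stored as 0/1 or true/false strings.
TRUE_VALUES = {"1", "1.0", "true"}


def summarize(rows):
    """Summarize rows given as dicts keyed by column name (as from csv.DictReader)."""
    rows = list(rows)
    summary = {}
    for col in NUMERIC_COLUMNS:
        values = [float(r[col]) for r in rows]
        summary[col] = {"min": min(values), "max": max(values), "mean": mean(values)}
    for col in BOOLEAN_COLUMNS:
        truthy = sum(str(r[col]).strip().lower() in TRUE_VALUES for r in rows)
        summary[col] = {"true": truthy, "false": len(rows) - truthy}
    return summary


def summarize_csv(path="ParisHousing.csv"):
    """Read the CSV at `path` and return the per-column summary."""
    with open(path, newline="") as handle:
        return summarize(csv.DictReader(handle))
```

Calling `summarize_csv()` next to the downloaded file returns a dictionary whose entries can be compared with the ranges, means, and true/false counts quoted in the list.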
Distribution
The material is available in a single CSV file named ParisHousing.csv, which is 1.09 MB in size. The dataset includes 10,000 valid records. Data quality is high, with all listed columns being 100% valid and containing zero missing entries. The expected update frequency is Never.
Usage
This resource is ideally suited for machine learning experimentation, particularly focusing on unsupervised learning methods like clustering. It allows analysts to train models to group properties based on physical attributes, location prestige, and ownership history. The data can also be used for predictive modelling concerning property values.
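For the predictive-modelling use mentioned above, one simple starting point is a one-feature least-squares line, for example price against squareMeters. This is only a sketch under that assumption: the source does not say which model or features were used, and `fit_line` and `r_squared` are hypothetical helper names.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (closed form)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values are all identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("y values are all identical")
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_res / ss_tot
```

With the file loaded, `xs` would be the squareMeters column and `ys` the price column. A low `r_squared` would suggest adding further features or trying a different model.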
Coverage
The scope covers hypothetical housing information situated in Paris, Europe. The material details 10,000 apartments, including physical characteristics (area, rooms, floors), property history (age, owners), and location details. Construction years range from 1990 to 2021.
License
CC BY-NC-SA 4.0
Who Can Use It
The dataset is intended for users focused on computer science, specifically machine learning and data clustering techniques. This includes data science competition participants, academics, and researchers interested in property market simulation and predictive modelling. The material has a maximum usability rating of 10.00.
Dataset Name Suggestions
- Paris Housing Data for Clustering
- Imaginary Paris Apartment Features
- Otkritie Bank Housing Competition Data
Attributes
Original Data Source: Otkritie Bank Housing Competition Data
