
Geographic and Rating Analysis of Indian Hotels

Product Reviews & Feedback

Tags and Keywords

Travel, Hotel, India, Rating, Cleartrip

Free

About

This dataset contains detailed information on 5,000 Indian hotel properties extracted from Cleartrip.com, a major online travel portal in India. It is a subset of a much larger collection originally gathered by a dedicated web-crawling service. The data supports analysis of hotel characteristics, from property descriptions and amenity listings to multiple vendor and consumer rating systems, and offers useful input for understanding the Indian hospitality market and traveller preferences.

Columns

The dataset contains 33 fields covering property location, features, and performance metrics. The fields documented here are:
  • address: The physical location details of the property.
  • area: The specific locale where the property is situated.
  • city/province/state: Geographic identifiers. All properties are located in India; Jaipur and Kochi are among the most frequently listed cities.
  • landmark: Key reference points near the property.
  • latitude/longitude: Geographic coordinates for mapping and spatial analysis.
  • property_name: The title or business name of the accommodation.
  • property_id: The unique identifier assigned by Cleartrip to the listing.
  • property_type: Categorisation of the accommodation type (e.g., Hotel, Guest House).
  • hotel_description: Narrative text detailing the hotel's offering and context.
  • hotel_facilities: A detailed list of amenities available at the property (e.g., Internet, Air Conditioning, Parking).
  • hotel_star_rating: The official or assigned star rating of the accommodation.
  • room_type: Classification of rooms available (e.g., Deluxe Room, Standard Room AC).
  • room_facilities: Features specific to the individual rooms.
  • room_count: The total number of rooms present in the property.
  • image_count: The number of promotional images available for the listing.
  • image_urls: The URLs pointing to the property images.
  • cleartrip_seller_rating: A rating specific to the Cleartrip platform.
  • tad_review_count/tad_review_rating/tad_stay_review_rating: Metrics derived from TripAdvisor data, including review volumes and detailed category ratings (Location, Service, Value).
  • tripadvisor_seller_rating: The seller rating provided by TripAdvisor (e.g., Top 1%).
  • crawl_date/qts: Timestamp details related to the data extraction event.
  • pageurl/sitename: The listing URL and source site name, identifying the original Cleartrip listing.
  • uniq_id: A unique identifying code assigned during data collection.
  • similar_hotel: Identifiers for comparable properties.
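
Because latitude and longitude are intended for mapping, it can help to check them before use. The sketch below is a minimal example, assuming rows are dictionaries keyed by the column names above (as csv.DictReader would produce) and that coordinates are stored as decimal-degree strings. The bounding box and the helper names parse_coordinate and has_plausible_location are illustrative assumptions, not part of the dataset.

```python
# Approximate bounding box for India, including island territories.
# These limits are an assumption for sanity checking, not a precise border.
INDIA_LAT = (6.0, 38.0)
INDIA_LON = (68.0, 98.0)


def parse_coordinate(value):
    """Return the value as a float, or None if it is blank or malformed."""
    try:
        return float(str(value).strip())
    except ValueError:
        return None


def has_plausible_location(row):
    """True if the row's latitude/longitude parse and fall inside the box."""
    lat = parse_coordinate(row.get("latitude", ""))
    lon = parse_coordinate(row.get("longitude", ""))
    if lat is None or lon is None:
        return False
    return (INDIA_LAT[0] <= lat <= INDIA_LAT[1]
            and INDIA_LON[0] <= lon <= INDIA_LON[1])
```

Rows that fail this check are not necessarily wrong, but they are worth inspecting before any spatial analysis.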

Distribution

The collection is provided as a single CSV file, cleartrip_com-travel_sample.csv, with a file size of 15.43 MB. It contains exactly 5,000 records, each with 33 fields. This is a static collection with an expected update frequency of 'Never'.
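
A downloaded copy can be checked against the stated shape of 5,000 records and 33 fields. The following is a rough sketch using only Python's standard csv module; the function names csv_shape and matches_listing are hypothetical, and the in-memory sample is a made-up stand-in so the snippet runs without the real file.

```python
import csv
import io

EXPECTED_ROWS = 5000    # record count stated in the listing
EXPECTED_FIELDS = 33    # field count stated in the listing


def csv_shape(text_stream):
    """Return (data_row_count, header_field_count) for a CSV text stream."""
    reader = csv.reader(text_stream)
    header = next(reader, [])
    row_count = sum(1 for _ in reader)
    return row_count, len(header)


def matches_listing(text_stream):
    """True if the stream has the row and field counts the listing states."""
    return csv_shape(text_stream) == (EXPECTED_ROWS, EXPECTED_FIELDS)


# Made-up two-row sample, for illustration only.
sample = io.StringIO("uniq_id,city\nabc,Jaipur\ndef,Kochi\n")
print(csv_shape(sample))  # (2, 2)
```

For the real file, the stream would come from something like open("cleartrip_com-travel_sample.csv", newline="", encoding="utf-8"); the UTF-8 encoding is an assumption, since the listing does not state one.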

Usage

This data is suitable for various analytical purposes, including:
  • Developing predictive models for hotel pricing or demand in the Indian travel sector.
  • Performing geographic market segmentation and identifying concentration of facilities in specific areas (e.g., Airport Zones).
  • Analysing correlations between star ratings, seller ratings, and customer reviews to gauge market perception.
  • Building recommendation engines for travel planning platforms.
  • Extracting key phrases and features from property descriptions and facility listings using natural language processing.
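
As one illustration of the rating analysis mentioned above, the sketch below computes a Pearson correlation between two rating columns. It assumes rows are dictionaries keyed by column name and that hotel_star_rating and tad_review_rating hold plain numeric strings; the listing does not confirm the value format, so unparseable entries are simply skipped. The helper names are hypothetical.

```python
import math


def to_float(value):
    """Parse a finite float from a cell, or return None if that fails."""
    try:
        number = float(str(value).strip())
    except ValueError:
        return None
    return number if math.isfinite(number) else None


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences, or None if undefined."""
    n = len(xs)
    if n < 2 or n != len(ys):
        return None
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # a constant column has no defined correlation
    return cov / math.sqrt(var_x * var_y)


def rating_correlation(rows, x_field="hotel_star_rating",
                       y_field="tad_review_rating"):
    """Correlate two rating columns, using only rows where both parse."""
    pairs = []
    for row in rows:
        x = to_float(row.get(x_field, ""))
        y = to_float(row.get(y_field, ""))
        if x is not None and y is not None:
            pairs.append((x, y))
    if not pairs:
        return None
    xs, ys = zip(*pairs)
    return pearson(xs, ys)
```

Dropping rows with missing ratings keeps the calculation simple, but it can bias the result if missing values are concentrated in particular property types.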

Coverage

The dataset focuses entirely on hotels in India and covers data extracted between 16 August 2016 and 1 September 2016. Properties span several states, most prominently Rajasthan and Kerala, and 196 distinct cities, with significant representation in Jaipur and Kochi.
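
The city coverage described above can be reproduced with a simple tally. This is a minimal sketch, assuming rows are dictionaries with a city key; the whitespace and capitalisation normalisation is an assumption about how city names might vary, and city_counts is a hypothetical helper name.

```python
from collections import Counter


def city_counts(rows):
    """Count listings per city, ignoring blanks and case/spacing variants."""
    counts = Counter()
    for row in rows:
        city = (row.get("city") or "").strip().title()
        if city:
            counts[city] += 1
    return counts
```

With the full file loaded, len(city_counts(rows)) gives the number of distinct cities and city_counts(rows).most_common(10) lists the most heavily represented ones. The distinct count may differ from the 196 stated here if the normalisation merges spellings that the listing counted separately.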

License

CC0: Public Domain

Who Can Use It

  • Travel Industry Analysts: To study facility trends and competitive positioning of Indian hotels.
  • Data Scientists and Machine Learning Engineers: For training models related to geo-spatial correlation, text analysis, and sentiment prediction based on review scores.
  • Geographic Information System (GIS) Specialists: To map hotel density and distribution using provided latitude and longitude data.
  • Market Researchers: To gain insight into the property types, room facilities, and typical image counts used by accommodations in key tourist and business regions.
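
For the mapping use case, a common starting point is the great-circle distance between coordinates. The sketch below applies the standard haversine formula to find listings within a radius of a point. It assumes rows are dictionaries with latitude and longitude in decimal degrees; haversine_km and hotels_within are hypothetical helper names, and rows with missing or malformed coordinates are skipped.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.0.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def hotels_within(rows, lat, lon, radius_km):
    """Return the rows whose coordinates lie within radius_km of (lat, lon)."""
    nearby = []
    for row in rows:
        try:
            distance = haversine_km(lat, lon,
                                    float(row["latitude"]),
                                    float(row["longitude"]))
        except (KeyError, TypeError, ValueError):
            continue  # skip rows without usable coordinates
        if distance <= radius_km:
            nearby.append(row)
    return nearby
```

A linear scan like this is adequate for 5,000 records; a spatial index would only be worth adding for much larger collections.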

Dataset Name Suggestions

  • Indian Hotel Listings 2016 (Cleartrip Data)
  • Cleartrip Indian Accommodation Features and Ratings
  • India Travel Portal Hotel Data Sample
  • Geographic and Rating Analysis of Indian Hotels

Attributes

Listing Stats

  • Views: 0
  • Downloads: 0
  • Listed: 04/10/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
