Opendatabay APP

Happywhale Competition Cropping Data

Data Science and Analytics

Tags and Keywords

Whales

Dolphins

Bounding-boxes

Annotations

Vision

Free

About

Designed to assist in the identification of individual marine life forms, this collection provides bounding box annotations specifically created for cropping input images. It aggregates crowd-sourced data established for the Happywhale competition alongside several public manual annotation datasets. The primary objective is to facilitate the isolation of whales and dolphins within images to enhance identification algorithms.

Columns

  • identity: The unique identifier (UUID) of the user who created the annotation; a value of '0000000000' denotes an automated method.
  • image: The specific filename of the image as found in the competition's training file set (e.g., '19c22f1ba1d06d.jpg').
  • x1: The X coordinate for the first point of the bounding box.
  • y1: The Y coordinate for the first point of the bounding box.
  • x2: The X coordinate for the second point of the bounding box.
  • y2: The Y coordinate for the second point of the bounding box.
  • verified: A boolean value indicating whether the annotation has been manually reviewed (true/false).
  • judge_identity: The UUID of the person who verified the annotation; this field is null if the entry remains unverified.
  • judge_decision: Indicates whether the reviewer accepted the annotation (e.g., 'accepted').
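As a minimal sketch of how the columns above might be read, the following Python parses rows into typed records using only the standard library. The sample rows are invented for illustration and are not taken from annotations.csv. The code also assumes that booleans are serialised as 'true'/'false', that empty cells stand for null, and that the two box points may arrive in either order, since the column descriptions do not say which corner comes first. The names Annotation, parse_row and load_annotations are hypothetical.

```python
import csv
import io
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Identity value that the column list reserves for automated annotations.
AUTOMATED_IDENTITY = "0000000000"

# Hypothetical rows for illustration only; the values are invented and are
# not taken from annotations.csv.
SAMPLE_CSV = """identity,image,x1,y1,x2,y2,verified,judge_identity,judge_decision
0000000000,19c22f1ba1d06d.jpg,420,310,35,60,false,,
user-a,19c22f1ba1d06d.jpg,30,55,425,315,true,user-b,accepted
"""


@dataclass
class Annotation:
    identity: str
    image: str
    box: Tuple[float, float, float, float]  # (left, top, right, bottom)
    verified: bool
    judge_identity: Optional[str]
    judge_decision: Optional[str]

    @property
    def is_automated(self) -> bool:
        return self.identity == AUTOMATED_IDENTITY


def parse_row(row: dict) -> Annotation:
    """Convert one CSV row (a dict of strings) into an Annotation."""
    x1, y1, x2, y2 = (float(row[key]) for key in ("x1", "y1", "x2", "y2"))
    return Annotation(
        identity=row["identity"],
        image=row["image"],
        # Sort the corners so the box is valid whichever point came first.
        box=(min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2)),
        # Assumption: booleans are written as 'true' / 'false'.
        verified=row["verified"].strip().lower() == "true",
        # Assumption: an empty cell means the field is null.
        judge_identity=row["judge_identity"] or None,
        judge_decision=row["judge_decision"] or None,
    )


def load_annotations(text: str) -> List[Annotation]:
    """Parse CSV text with a header row into a list of Annotation records."""
    return [parse_row(row) for row in csv.DictReader(io.StringIO(text))]


annotations = load_annotations(SAMPLE_CSV)
```

For the real file, the same parse_row function could be applied to csv.DictReader(open("annotations.csv", newline="")), provided the header names match the column list.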

Distribution

The data is provided as a single CSV file, annotations.csv, of approximately 2.86 MB. It contains roughly 27,300 valid records across 9 columns: the bounding box coordinates are numeric, the verification status is boolean, and approximately 17% of the records have been manually verified.
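The record count and verified share quoted above could be re-checked with a short script along the following lines. This is a sketch under assumptions: the sample rows are invented, the verified column is assumed to hold 'true'/'false', and verification_summary is a hypothetical helper name.

```python
import csv
import io

# Hypothetical rows for illustration only; not taken from annotations.csv.
SAMPLE_CSV = """identity,image,x1,y1,x2,y2,verified,judge_identity,judge_decision
0000000000,a.jpg,10,20,200,180,false,,
0000000000,b.jpg,15,25,210,190,false,,
user-a,c.jpg,5,8,300,240,true,user-b,accepted
user-a,d.jpg,40,30,120,90,false,,
user-c,e.jpg,0,0,640,480,false,,
"""


def verification_summary(csv_text: str) -> dict:
    """Count records, verified records, and accepted records in CSV text."""
    total = verified = accepted = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        total += 1
        # Assumption: booleans are written as 'true' / 'false'.
        if (row.get("verified") or "").strip().lower() == "true":
            verified += 1
            if (row.get("judge_decision") or "").strip().lower() == "accepted":
                accepted += 1
    return {
        "total": total,
        "verified": verified,
        "accepted": accepted,
        # Guard against an empty file to avoid dividing by zero.
        "verified_share": verified / total if total else 0.0,
    }


summary = verification_summary(SAMPLE_CSV)
print(f"{summary['verified']} of {summary['total']} records verified "
      f"({summary['verified_share']:.0%})")
```

On the invented sample this reports 1 of 5 records verified; on the real file the share would be expected to land near the 17% stated above.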

Usage

This dataset is suited to training object detection models such as YOLOv5 and to building pre-processing pipelines that crop marine animals out of images before identification. The boxes can also support the development of image segmentation algorithms as coarse localisation labels, although they are not pixel-level masks. It serves as a ground truth resource for machine learning competitions and research focused on automated animal identification.
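For YOLO-style training, corner boxes are usually converted into label lines of the form "class x_center y_center width height", with every value normalised by the image size. The sketch below shows one way to do this. It assumes the coordinates are in pixels and that the image width and height are obtained separately, because the CSV does not list them. The function name to_yolo_line and the single class id 0 are illustrative choices, not part of the dataset.

```python
def to_yolo_line(x1, y1, x2, y2, image_width, image_height, class_id=0):
    """Convert a two-point box into a normalised YOLO label line."""
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")

    # Accept the two points in either order.
    left, right = sorted((float(x1), float(x2)))
    top, bottom = sorted((float(y1), float(y2)))

    # Clamp to the image so boxes drawn slightly outside stay valid.
    left = min(max(left, 0.0), image_width)
    right = min(max(right, 0.0), image_width)
    top = min(max(top, 0.0), image_height)
    bottom = min(max(bottom, 0.0), image_height)

    # Normalise the centre and size to the 0..1 range.
    x_center = (left + right) / 2 / image_width
    y_center = (top + bottom) / 2 / image_height
    width = (right - left) / image_width
    height = (bottom - top) / image_height

    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical box on an 800 x 600 image.
print(to_yolo_line(420, 310, 35, 60, 800, 600))
# prints: 0 0.284375 0.308333 0.481250 0.416667
```

Each image would normally get one text file containing one such line per box; restricting the conversion to verified, accepted rows is an option when label quality matters more than volume.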

Coverage

The data covers whale and dolphin imagery associated with the Happywhale competition. It integrates annotations from multiple sources, including Kaggle datasets (e.g., 'happierwhale', 'happywhales-labelme-segmentation-dataset') and a dedicated crowd-sourcing website. The update frequency is expected to be daily.

License

CC0: Public Domain

Who Can Use It

  • Data Scientists working on computer vision problems for animal recognition.
  • Marine Biologists seeking to automate the processing of survey images.
  • Machine Learning Engineers developing object detection systems for conservation technology.
  • Competition Participants needing refined training data for the Happywhale challenge.

Dataset Name Suggestions

  • Happywhale Crowdsourced Bounding Boxes
  • Marine Life Object Detection Annotations
  • Whale and Dolphin Image Crops
  • Happywhale Competition Cropping Data

Attributes

Listing Stats

  • Views: 2
  • Downloads: 0
  • Listed: 07/12/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
