New AI-powered dataset aims to save endangered birds in Macao

CO-EDP, VisionRI | Updated: 02-06-2025 08:47 IST | Created: 02-06-2025 08:47 IST
Representative Image. Credit: ChatGPT

A newly released dataset promises to transform avian conservation in the densely urbanized region of Macao by equipping artificial intelligence (AI) systems with curated, locally relevant data. Published in the journal Data, the study titled “Macao-ebird: A Curated Dataset for Artificial-Intelligence-Powered Bird Surveillance and Conservation in Macao” introduces a groundbreaking two-part image dataset tailored to endangered bird species in the region.

Developed by researchers from Macao Polytechnic University, Dongguan University of Technology, and the University of Bologna, the dataset addresses a longstanding barrier in AI-based biodiversity research: the lack of localized, high-quality visual data. Macao-ebird targets this gap by integrating 7341 curated images across 24 bird species, prioritizing those listed as endangered or nationally protected, alongside automated detection labels generated through state-of-the-art AI techniques.

How was the dataset built and what makes it unique?

The Macao-ebird dataset comprises two structured components: Macao-ebird-cls for classification tasks and Macao-ebird-det for object detection. The first subset was compiled through a hybrid web scraping approach, using publicly accessible platforms like eBird, Observation.org, and Baidu Image Search. After an initial collection phase, images underwent rigorous curation, including resolution checks, duplicate removal, and manual verification to ensure species accuracy and photographic quality. Only single-bird images were retained to reduce label ambiguity.
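The article does not publish the team's curation scripts, but the resolution and duplicate checks it describes can be sketched in a few lines. In the minimal Python sketch below, the folder names, the 640-pixel minimum side, and the use of byte-level hashing are illustrative assumptions, not the authors' actual pipeline; species verification and single-bird framing would remain manual steps.

```python
# Minimal curation sketch: resolution filtering and exact-duplicate removal.
# Folder layout, the 640-pixel cutoff, and the hashing choice are assumptions.
import hashlib
from pathlib import Path
from PIL import Image

RAW_DIR = Path("raw_downloads/Platalea_minor")   # hypothetical download folder
KEEP_DIR = Path("curated/Platalea_minor")        # hypothetical output folder
MIN_SIDE = 640                                   # assumed resolution cutoff

KEEP_DIR.mkdir(parents=True, exist_ok=True)
seen_hashes = set()

for img_path in sorted(RAW_DIR.glob("*.jpg")):
    data = img_path.read_bytes()
    digest = hashlib.sha256(data).hexdigest()
    if digest in seen_hashes:          # drop byte-identical duplicates
        continue
    seen_hashes.add(digest)

    with Image.open(img_path) as img:
        width, height = img.size
    if min(width, height) < MIN_SIDE:  # drop low-resolution images
        continue

    (KEEP_DIR / img_path.name).write_bytes(data)
```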

In the classification dataset, species such as Circus spilonotus (412 images) and Halcyon smyrnensis (340 images) are well represented, reflecting a focus on birds under threat within Macao’s diverse habitats. The dataset prioritizes species identified in authoritative sources like the Catalogue of Birds in the Cotai Ecological Zone and Report on the State of Macao’s Environment.

The detection subset, Macao-ebird-det, was created using AI-agent-assisted labeling powered by Grounding DINO, a zero-shot object detection model capable of aligning text prompts with image regions. This allowed the research team to automatically annotate images by feeding prompts such as “find the bird: Platalea minor,” significantly reducing manual labeling time. The pipeline achieved a 99.18% success rate across 7287 images.
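The exact AI-agent pipeline is not reproduced in the article, but prompt-driven box generation of this kind can be approximated with the Hugging Face transformers port of Grounding DINO. In the sketch below, the `IDEA-Research/grounding-dino-tiny` checkpoint, the confidence thresholds, and the image path are assumptions rather than the authors' settings; only the prompt wording follows the article.

```python
# Sketch of prompt-driven zero-shot annotation with a Grounding DINO checkpoint.
# Checkpoint, thresholds, and file path are assumptions, not the authors' code.
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForZeroShotObjectDetection

model_id = "IDEA-Research/grounding-dino-tiny"         # assumed checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForZeroShotObjectDetection.from_pretrained(model_id)

image = Image.open("curated/Platalea_minor/0001.jpg")  # hypothetical image
# Prompt wording from the article; Grounding DINO conventionally expects
# lowercase, period-terminated text queries.
prompt = "find the bird: Platalea minor."

inputs = processor(images=image, text=prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Keyword names may vary slightly across transformers versions.
results = processor.post_process_grounded_object_detection(
    outputs,
    inputs.input_ids,
    box_threshold=0.35,                  # assumed confidence cutoffs
    text_threshold=0.25,
    target_sizes=[image.size[::-1]],
)[0]

for box, score in zip(results["boxes"], results["scores"]):
    print([round(v, 1) for v in box.tolist()], round(score.item(), 3))
```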

How effective are the detection models and what are the technical benchmarks?

To evaluate the practical utility of the Macao-ebird-det dataset, the team ran extensive baseline experiments using YOLOv8 through YOLOv12 object detection models. These models were selected for their balance of computational efficiency and detection accuracy, which is critical for deployment in field conditions such as on drones or edge devices in conservation zones.
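The article does not list the training configuration, but a baseline of this kind can be reproduced with the Ultralytics package, which distributes the YOLOv8 through YOLOv12 family. In the sketch below, the `data.yaml` path, epoch count, and image size are assumptions; the weight name follows Ultralytics' published naming for YOLOv9-small.

```python
# Sketch of one detection baseline with the Ultralytics package.
# The dataset file, epochs, and image size are assumptions; the article does
# not report the authors' exact training configuration.
from ultralytics import YOLO

model = YOLO("yolov9s.pt")                 # pretrained weights as a starting point

model.train(
    data="macao_ebird_det/data.yaml",      # hypothetical YOLO-format dataset file
    epochs=100,
    imgsz=640,
)

metrics = model.val()                      # evaluates on the validation split
print(metrics.box.map50, metrics.box.map)  # map50 -> mAP50, map -> mAP50-95
```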

Among the best performers was YOLOv9s, achieving a mean average precision (mAP50) of 0.984 and mAP50-95 of 0.958. Other top models, such as YOLOv12s and YOLOv11s, also showed strong detection performance while maintaining competitive inference speeds, critical for real-time applications.

Data analysis revealed a highly balanced distribution across species categories, and bounding box annotations showed minimal localization errors. Most birds were centered in the images and relatively small in size, which aligns with realistic field surveillance conditions. Models demonstrated the ability to identify birds even in complex environments with diverse lighting and occlusion conditions.

The trade-off between model size and inference speed was evident. Lighter models like YOLOv8n (2.3 ms per inference) delivered fast but slightly less precise results, whereas more accurate models like YOLOv12s had higher latency (8.2 ms). YOLOv11s stood out for its balance, achieving 0.953 mAP50-95 with an inference time of 5.4 ms.
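As a rough illustration of how such a speed comparison might be measured (this is not the authors' benchmarking protocol), the following sketch times repeated single-image predictions for a few model sizes; absolute numbers depend heavily on hardware, batch size, and export format.

```python
# Rough latency comparison across model sizes. Weight names follow Ultralytics
# release naming and are assumptions; results will differ from the paper's
# reported timings depending on hardware.
import time
from ultralytics import YOLO

IMAGE = "curated/Platalea_minor/0001.jpg"   # hypothetical test image

for weights in ("yolov8n.pt", "yolo11s.pt", "yolo12s.pt"):
    model = YOLO(weights)
    model.predict(IMAGE, verbose=False)     # warm-up run
    start = time.perf_counter()
    for _ in range(50):
        model.predict(IMAGE, verbose=False)
    per_image_ms = (time.perf_counter() - start) / 50 * 1000
    print(f"{weights}: {per_image_ms:.1f} ms per image")
```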

What are the implications for conservation, education, and AI research?

The authors highlight three primary applications for the dataset: real-time avian surveillance, public education, and algorithm benchmarking.

In conservation, models trained on Macao-ebird can be deployed on drones, monitoring stations, or mobile applications to track endangered bird populations in real time. This enables better protection planning, early warning systems, and even habitat quality assessments. For example, the inclusion of rare species like Platalea minor allows targeted monitoring in Macao’s ecologically sensitive areas such as the Cotai wetlands.
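For deployment on drones or edge devices, a trained detector is typically exported to a portable runtime first. The sketch below uses the Ultralytics ONNX export as one plausible route; the format choice and weights path are assumptions, since the article does not specify the authors' deployment toolchain.

```python
# Sketch of preparing a trained detector for edge deployment by exporting it
# to ONNX. Format and weights path are assumptions, not the authors' setup.
from ultralytics import YOLO

model = YOLO("runs/detect/train/weights/best.pt")   # path produced by a training run
onnx_path = model.export(format="onnx", imgsz=640)  # returns the exported file path
print("Exported model:", onnx_path)
```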

Beyond conservation, the dataset’s extensive labeled imagery opens opportunities for citizen science engagement and environmental education. Schools, museums, and NGOs can develop bird identification tools, mobile apps, and interactive platforms to deepen public understanding of local biodiversity and foster ecological stewardship.

From a technical standpoint, Macao-ebird provides a robust benchmark for fine-grained visual classification and object detection algorithms. It supports research into annotation automation, lightweight model deployment, and AI-driven ecological modeling. The inclusion of Grounding DINO as an annotation tool showcases scalable techniques for dataset generation, potentially applicable to other regions and taxa.

What are the current limitations and future expansion plans?

Despite these advances, the dataset has constraints. It covers only 24 of the 174 bird species known in Macao, limiting its scope for comprehensive ecological assessments. Additionally, the current version includes only one bird per image, which restricts multi-object detection and flock behavior analysis.

The dataset also favors clear, high-quality imagery, leading to an underrepresentation of occluded or poorly lit samples, scenarios frequently encountered in real-world monitoring. The authors recommend that future iterations incorporate challenging samples and expand to include audio data, enabling multimodal bird recognition systems.

First published in: Devdiscourse