CAMotion | Dataset & Benchmark

Introduction

Discovering camouflaged objects is a challenging task in computer vision because camouflaged objects closely resemble their surroundings. While camouflaged object detection over sequential video frames has received increasing attention, existing video camouflaged object detection (VCOD) datasets are limited in scale and diversity, which hinders deeper analysis and broader evaluation of recent deep learning-based algorithms with data-hungry training strategies. To break this bottleneck, we construct CAMotion, a high-quality benchmark that covers a wide range of species for camouflaged moving object detection in the wild. CAMotion comprises sequences with multiple challenging attributes, such as uncertain edges, occlusion, motion blur, and shape complexity. The sequence annotation details and statistical distributions are presented from various perspectives, allowing CAMotion to support in-depth analyses of camouflaged objects’ motion characteristics in different challenging scenarios.

Figure 1: Examples from the CAMotion dataset with corresponding pixel-level annotations. Rows 1, 3, 5, and 7 show the original images; Rows 2, 4, 6, and 8 show the corresponding pixel-wise ground-truth annotations.

Statistics

149,319 Image Frames

6.51× MoCA-Mask

30,028 Annotated Frames

6.40× MoCA-Mask

151 Species

3.43× MoCA-Mask

Statistical comparison with other camouflage datasets

TABLE 1: Statistics of camouflage datasets. * indicates that #Species is not reported in the original paper and is estimated by us.

Figure 2: Scale and species comparison between existing camouflage datasets and CAMotion.

Figure 3: Scale distribution comparison of CAMotion and MoCA-Mask. The reported ratio is the proportion of foreground area relative to the entire image.

Dataset features

Figure 4: Taxonomic structure of CAMotion. The inner ring illustrates the class taxonomy, and the outer ring shows the corresponding order taxonomy.

Figure 5: An example of the hierarchy tree in CAMotion, illustrated with the Ray-finned Fish class.

Figure 6: The attribute distribution of CAMotion at the frame level and the sequence level.

Figure 7: Statistics of the CAMotion dataset. (a) Object size distribution. (b) Video duration distribution. (c) Global and local contrast distribution. (d) Motion statistics of the camouflaged objects. The reported ratio is the proportion of foreground area relative to the entire image.

Demo

Amazon Leaffish

Batfish

Cat

Clownfish

Common Octopus

Eurasian Bittern

Leaf-Tailed Geckos

Leafy Seadragon

Mockingbird

Moss Mimic Stick Insect

Orchid Mantis

Peppered Moth

Pygmy Seahorse

Snow Leopard

Snowy Owl

Stoat

Evaluation

We conduct comprehensive experiments on the CAMotion dataset to evaluate 18 COD/VCOD models. Although these models perform well on existing COD datasets, their performance declines notably on the CAMotion benchmark. Both COD and VCOD methods still struggle to balance discriminative capability for camouflaged objects with temporal consistency. Accurately identifying camouflaged objects across video frames while mitigating error accumulation over time remains a crucial challenge.

Metrics: S-measure, weighted F-measure, mean E-measure, mean absolute error (MAE), mean Dice, and mean IoU.
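To make three of these metrics concrete, the sketch below computes MAE, Dice, and IoU for a single prediction/ground-truth pair. It is illustrative only and is not the benchmark's official evaluation code: the function names, the single binarization threshold of 0.5, and the convention that two empty masks score 1.0 are all assumptions, and the official "mean" variants may aggregate differently (for example over several thresholds). The S-measure, weighted F-measure, and E-measure are more involved and are not covered here. The code is dependency-free for clarity; a real evaluation would typically use array operations.

```python
from typing import List, Sequence, Tuple

# A map is a sequence of rows; predictions hold values in [0, 1] and
# ground-truth masks are assumed to be already scaled to {0, 1}.
Map = Sequence[Sequence[float]]


def _flatten(m: Map) -> List[float]:
    """Flatten a 2D map into a list of floats."""
    return [float(v) for row in m for v in row]


def mae(pred: Map, gt: Map) -> float:
    """Mean absolute error between a prediction map and a binary mask."""
    p, g = _flatten(pred), _flatten(gt)
    if not p or len(p) != len(g):
        raise ValueError("pred and gt must be non-empty and the same size")
    return sum(abs(a - b) for a, b in zip(p, g)) / len(p)


def dice_and_iou(pred: Map, gt: Map, threshold: float = 0.5) -> Tuple[float, float]:
    """Dice and IoU after binarizing the prediction at `threshold` (assumed value)."""
    p = [v >= threshold for v in _flatten(pred)]
    g = [v >= 0.5 for v in _flatten(gt)]
    if len(p) != len(g):
        raise ValueError("pred and gt must be the same size")
    inter = sum(a and b for a, b in zip(p, g))
    union = sum(a or b for a, b in zip(p, g))
    total = sum(p) + sum(g)
    if total == 0:
        # Both maps are empty: treated as perfect agreement (assumed convention).
        return 1.0, 1.0
    return 2 * inter / total, inter / union
```

Per-frame scores from functions like these would then be averaged over the frames of each sequence or of the whole test set, depending on the protocol being followed.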

TABLE 2: Quantitative comparison with 18 cutting-edge methods on the CAMotion and MoCA-Mask test sets. ↑/↓ indicates that higher/lower is better. The best and second-best results are shown in bold and underlined, respectively. ‡ indicates that the prompt input, i.e., the first-frame annotation, is removed during both training and testing for a fair comparison.

Dataset

Dataset Download

The dataset is available for non-commercial research purposes only. Please use the following links.

Google Drive Baidu Netdisk

Depth and Optical Flow

The depth maps generated by Depth Anything V2 and the optical flow generated by GMFlow are available on Google Drive and Baidu Netdisk.

Attributes

To facilitate in-depth analysis of camouflaged videos under various challenging scenarios, we label each camouflaged frame with eight attributes: uncertain edge (UE), big object (BO), multiple objects (MO), small object (SO), occlusion (OC), shape complexity (SC), out-of-view (OV), and motion blur (MB). The definitions of these attributes are provided below.

Attr	Description
MO	Multiple Objects: image contains at least two objects.
BO	Big Object: ratio between object area and image area ≥ 0.15.
SO	Small Object: ratio between object area and image area ≤ 0.02.
UE	Uncertain Edge: the foreground and background areas around the object have similar colors and textures.
OC	Occlusion: the object is partially occluded.
SC	Shape Complexity: the object contains thin parts (e.g., an animal's foot).
OV	Out-of-View: some portion of the object leaves the camera field of view.
MB	Motion Blur: the object region is blurred due to the motion of the object or camera.
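The two size attributes are the only ones in the table defined by explicit thresholds, so they can be derived directly from a ground-truth mask. The sketch below shows one way to do this. The function names are hypothetical, the mask is assumed to be a 2D grid in which any non-zero value is foreground, and returning no tag for an empty mask is an assumed convention. The remaining attributes (UE, MO, OC, SC, OV, MB) are not covered here.

```python
from typing import List, Sequence


def foreground_ratio(mask: Sequence[Sequence[int]]) -> float:
    """Proportion of foreground (non-zero) pixels relative to the whole image."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask must contain at least one pixel")
    foreground = sum(1 for row in mask for v in row if v)
    return foreground / total


def size_attributes(ratio: float) -> List[str]:
    """Return the size tags implied by the thresholds in the table above."""
    if ratio <= 0.0:
        # No object in the frame: no size tag (assumed convention).
        return []
    tags = []
    if ratio >= 0.15:  # Big Object threshold from the table
        tags.append("BO")
    if ratio <= 0.02:  # Small Object threshold from the table
        tags.append("SO")
    return tags
```

For example, a 10×10 mask with 20 foreground pixels has a ratio of 0.2 and is tagged BO, while a ratio between the two thresholds receives neither tag.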

Data structure

CAMotion
├── TestDataset_per_sq
│   ├── African_twig_mantis_3
│   └── …
│
└── TrainDataset_per_sq
    ├── African_twig_mantis_2
    │   ├── BBox
    │   │   ├── 00000.txt
    │   │   └── …
    │   ├── Edge
    │   │   ├── 00000.png
    │   │   └── …
    │   ├── GT
    │   │   ├── 00000.png
    │   │   └── …
    │   └── Imgs
    │       ├── 00000.jpg
    │       ├── 00005.jpg
    │       ├── 00010.jpg
    │       └── …
    └── …
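A minimal sketch of walking this layout is shown below, using only the Python standard library. It assumes that each sequence folder contains `Imgs` and `GT` subfolders as drawn above, that an image and its mask share the same file stem (e.g. `00000.jpg` and `00000.png`), and that the test split mirrors the training split, which the tree above does not show. The function names are hypothetical. Images without a matching mask are skipped, and the `BBox` text files are not parsed because their format is not described here.

```python
from pathlib import Path
from typing import Iterator, List, Tuple


def iter_sequences(split_dir: Path) -> Iterator[Path]:
    """Yield each sequence folder inside a split folder, in sorted order."""
    for entry in sorted(split_dir.iterdir()):
        if entry.is_dir():
            yield entry


def paired_frames(seq_dir: Path) -> List[Tuple[Path, Path]]:
    """Pair every image in Imgs/ with the GT/ mask that has the same stem."""
    gt_by_stem = {p.stem: p for p in (seq_dir / "GT").glob("*.png")}
    pairs = []
    for img in sorted((seq_dir / "Imgs").glob("*.jpg")):
        gt = gt_by_stem.get(img.stem)
        if gt is not None:  # skip frames that have no annotation
            pairs.append((img, gt))
    return pairs
```

Usage would look like `for seq in iter_sequences(Path("CAMotion") / "TrainDataset_per_sq"): pairs = paired_frames(seq)`, where the root path is wherever the archive was extracted.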

People

Siyuan Yao

Sun Yat-sen University Shenzhen Campus

Hao Sun

Sun Yat-sen University Shenzhen Campus

Ruiqi Yu

Nanyang Technological University

Xiwei Jiang

Beijing University of Posts and Telecommunications

Wenqi Ren

Sun Yat-sen University Shenzhen Campus

Xiaochun Cao

Sun Yat-sen University Shenzhen Campus

Citation

Please cite CAMotion if it helps your research.

@article{yao2026camotion,
  title={C{AM}otion: A High-Quality Benchmark for Camouflaged Moving Object Detection in the Wild},
  author={Siyuan Yao and Hao Sun and Ruiqi Yu and Xiwei Jiang and Wenqi Ren and Xiaochun Cao},
  journal={arXiv preprint arXiv:2604.08287},
  year={2026}
}