Bounding-box-based multi-animal tracking for challenging videos of non-human primates in the wild.
PriMAT is a robust multi-animal tracking framework specifically adapted to challenges introduced by videos of non-human primates in the wild.
In addition to tracking, the framework can also be used to learn a classification task for each bounding box, for example individual identification.
Read the paper here:
PriMAT: Robust multi-animal tracking of primates in the wild
@article{vogg2026primat,
title={PriMAT: Robust multi-animal tracking of primates in the wild},
author={Vogg, Richard and Nuske, Matthias and Weis, Marissa A and L{\"u}ddecke, Timo and Karako{\c{c}}, Elif and Ahmed, Zurna and Pereira, Sofia M and Malaivijitnond, Suchinda and Meesawat, Suthirote and Murphy, Derek and others},
journal={PLoS One},
volume={21},
number={4},
pages={e0347669},
year={2026},
publisher={Public Library of Science San Francisco, CA USA}
}
Train the tracking model and apply it to a video directly in Colab:
Important: If you have more than one GPU in your system, make sure that the GPU you select (e.g., via the CUDA_VISIBLE_DEVICES environment variable) is supported by CUDA 10.2, and keep that selection for the entire setup.
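If you prefer to pin the GPU from inside a Python entry point instead of the shell, a minimal sketch is shown below; the device index 0 is an assumption, so replace it with the index of your CUDA 10.2-compatible card.

```python
import os

# Make only one GPU visible before importing torch.
# The index "0" is an assumption; use the index of your CUDA 10.2-compatible GPU.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch  # imported after the variable is set, so only the selected GPU is visible

print(torch.cuda.device_count())       # should print 1
print(torch.cuda.get_device_name(0))   # name of the selected GPU
```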
Clone this repository and navigate to the project folder, then create the environment:
conda env create -f environment.yml
conda activate primat
If you want to create output videos, you also need to install ffmpeg:
sudo apt install ffmpeg
For training, the dataset should be organized in a folder with two subfolders: images and labels_with_ids.
dataset_folder
├── images
│   ├── IMG0001.jpg
│   ├── IMG0002.jpg
│   └── ...
└── labels_with_ids
Use the notebook notebooks/create_labels.ipynb to convert your VoTT or CVAT annotations into the required format.
If your labels are stored in another format, either:
- adapt the data loader, or
- convert them into the formats described below.
If you only want to train the tracking model, you need one .txt file for each image in the images folder. Each label file must have the same name as the image file:
IMG0001.jpg → IMG0001.txt
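A minimal sketch (assuming the dataset_folder layout shown above; the paths are placeholders) to check that every image has a correspondingly named label file:

```python
from pathlib import Path

# Paths are assumptions; adjust them to your own dataset location.
dataset_folder = Path("dataset_folder")
image_dir = dataset_folder / "images"
label_dir = dataset_folder / "labels_with_ids"

missing = []
for image_path in sorted(image_dir.glob("*.jpg")):
    label_path = label_dir / (image_path.stem + ".txt")  # IMG0001.jpg -> IMG0001.txt
    if not label_path.exists():
        missing.append(label_path.name)

print(f"{len(missing)} label files are missing")
for name in missing:
    print("  missing:", name)
```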
Each line in the file corresponds to one detection and contains:
class id x_center y_center w h
The last four values are normalized by image width and image height.
If you only have one object type, class can always be 0.
In our setup:
0 = lemur
1 = feeding box
The id is a running number across the whole dataset, i.e. each individual in each image receives a new, unique ID.
Example:
0 379 0.520337 0.464083 0.031490 0.041500
0 380 0.538707 0.470500 0.024742 0.041333
1 337 0.547142 0.462833 0.020993 0.028000
1 338 0.498782 0.518167 0.033739 0.056000
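To illustrate how these values are computed, here is a minimal sketch (not part of the repository; the pixel boxes, IDs, and image size are made-up example values) that converts pixel bounding boxes into normalized label lines and writes one .txt file per image:

```python
from pathlib import Path

def boxes_to_label_lines(boxes, img_width, img_height):
    """Convert (class, id, x_min, y_min, box_w, box_h) pixel boxes into
    'class id x_center y_center w h' lines, with the last four values
    normalized by image width and height."""
    lines = []
    for cls, obj_id, x_min, y_min, box_w, box_h in boxes:
        x_center = (x_min + box_w / 2) / img_width
        y_center = (y_min + box_h / 2) / img_height
        lines.append(
            f"{cls} {obj_id} {x_center:.6f} {y_center:.6f} "
            f"{box_w / img_width:.6f} {box_h / img_height:.6f}"
        )
    return lines

# Made-up example: two lemurs (class 0) and one feeding box (class 1) in a
# 1920x1080 image, with running IDs continuing from earlier images.
boxes = [
    (0, 379, 980, 479, 60, 45),
    (0, 380, 1010, 486, 48, 45),
    (1, 337, 1040, 485, 40, 30),
]
label_dir = Path("dataset_folder/labels_with_ids")
label_dir.mkdir(parents=True, exist_ok=True)
(label_dir / "IMG0001.txt").write_text(
    "\n".join(boxes_to_label_lines(boxes, 1920, 1080)) + "\n"
)
```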
For individual identification or other classification tasks, an additional classification label is appended to each row.
Format:
class id x_center y_center w h classification_label
The first six values are the same as above.
In our experiments:
- 7 was used as an "Unsure" class
- only lemurs were classified
- feeding boxes were assigned class 7 and ignored during training
Example:
0 65 0.682461 0.544545 0.287594 0.446381 7
0 51 0.983571 0.44805 0.048391 0.120989 2
0 70 0.197715 0.476319 0.167154 0.230225 5
1 14 0.141534 0.510369 0.141242 0.207826 7
1 17 0.52678 0.527526 0.164393 0.311018 7
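A minimal sketch (assuming the formats described above; the label directory path is a placeholder) that validates label files, accepting both the 6-column tracking format and the 7-column format with a classification label:

```python
from pathlib import Path

def check_label_file(path):
    """Report lines that do not match the expected 6- or 7-column format
    or whose box values fall outside the normalized range [0, 1]."""
    problems = []
    for line_no, line in enumerate(path.read_text().splitlines(), start=1):
        fields = line.split()
        if len(fields) not in (6, 7):
            problems.append(f"{path.name}:{line_no}: expected 6 or 7 columns, got {len(fields)}")
            continue
        x_center, y_center, w, h = map(float, fields[2:6])
        if not all(0.0 <= value <= 1.0 for value in (x_center, y_center, w, h)):
            problems.append(f"{path.name}:{line_no}: box values are not normalized to [0, 1]")
    return problems

label_dir = Path("dataset_folder/labels_with_ids")  # path is an assumption
for label_path in sorted(label_dir.glob("*.txt")):
    for problem in check_label_file(label_path):
        print(problem)
```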
Download pretrained weights and datasets used in the publication:
https://data.goettingen-research-online.de/dataset.xhtml?persistentId=doi:10.25625/CMQY0Q
- ImageNet pretrained model
- MacaqueCopyPaste pretrained model
- Macaque tracking model
- Lemur and box tracking model
- Lemur identification model
- 500 macaque images for training
- 12 macaque videos for evaluation
- 500 lemur images for training
- 12 lemur videos for evaluation
- Lemur ID images for training
- 30 lemur ID videos for evaluation
You can train the tracking model on your own dataset using the following steps:
- Generate one .txt label file per image. Each line should follow the format: class id x_center/img_width y_center/img_height w/img_width h/img_height. See notebooks/create_labels.ipynb for code compatible with VoTT and CVAT annotations.
- Generate files containing image paths. Example files can be found in src/data/. See notebooks/create_labels_own.ipynb for an example.
- Create a .json file for your custom dataset in src/lib/cfg/. At minimum, specify the keys "root" and "train". Examples are available in src/lib/cfg/. (A sketch of this and the previous step is shown after this list.)
- Add the following argument to your experiment file during training: --data_cfg '../src/lib/cfg/your_dataset.json'
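A minimal sketch of the image-path file and dataset config steps; the file names, the dataset location, and the exact structure of the "train" entry are assumptions, so compare with the example files in src/data/ and src/lib/cfg/ before training:

```python
import json
from pathlib import Path

dataset_root = Path("/data/my_dataset")              # assumption: where dataset_folder lives
image_dir = dataset_root / "dataset_folder" / "images"

# Image-path file: one image path per line (file name and path style are assumptions;
# compare with the example files in src/data/).
path_file = Path("src/data/my_dataset.train")
path_file.write_text("\n".join(str(p) for p in sorted(image_dir.glob("*.jpg"))) + "\n")

# Dataset config with the required "root" and "train" keys. The shape of the "train"
# entry here mirrors FairMOT-style configs; check src/lib/cfg/ for the precise format.
cfg = {
    "root": str(dataset_root),
    "train": {"my_dataset": str(path_file)},
}
Path("src/lib/cfg/my_dataset.json").write_text(json.dumps(cfg, indent=4))
```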
Use experiments/run_inference.py and adapt it to your videos and models.
If you set:
--output_format video
a video showing the tracking results will be generated using ffmpeg.
To evaluate tracking results using HOTA, clone the TrackEval repository.
Then use:
notebooks/evaluate_tracking_HOTA.ipynb
to move files into the correct locations and evaluate the model.
For evaluating identification results, use:
notebooks/evaluate_lemur_id_on_videos.ipynb
We used the codebase from FairMOT as a starting point:
FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking
Yifu Zhang, Chunyu Wang, Xinggang Wang, Wenjun Zeng, Wenyu Liu
We also used multi-class modifications from:
A large part of their code builds on:
Thanks to the authors for their excellent work.