Clear Sky Science · en

A wild fish image dataset for individual re-identification and phenotyping

Why looking closely at fish faces matters

To the human eye, wild animals of the same species often look alike, yet each individual carries its own story of growth, movement and survival. For marine biologists, keeping track of these stories usually means catching fish, attaching tags and hoping to see them again later, a process that is costly and that stresses the animals. This study introduces Melops, a large open image collection of a colourful coastal fish, the corkwing wrasse, designed to let computers recognize individual fish from their facial patterns alone and to measure how their appearance changes over time.

Figure 1.

A growing photo album of wild fish

Over seven years along the shores of western Norway, researchers repeatedly caught nearly ten thousand wild corkwing wrasse. For each fish they recorded its length, sex and health, implanted a tiny electronic tag for reliable identification and photographed both sides of the body against a standard white background. This careful routine produced 24,578 images, including more than 8,500 photos of fish that were seen more than once. Because many individuals were recaptured over months and years, the dataset captures how real wild fish change in size, colour and condition as they age, breed and recover from injuries.

Turning raw photos into machine-ready data

To make these images useful for artificial intelligence, the team did far more than just store photographs. A subset of images was painstakingly annotated by hand, marking the outline of each fish, its head region and eleven anatomical landmarks such as the snout, eye and tail base. These examples were used to train modern computer vision models (YOLOv8) that can automatically locate the fish, crop out the body or just the head and pinpoint key body features across the entire collection. The result is a suite of standardized image crops (full body, head only, and body without the head), plus precise coordinates of important points on the fish.
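As a rough illustration of the cropping step, the sketch below cuts a region out of an image once a detector has supplied a bounding box, and re-expresses a landmark relative to that crop. It is not the authors' pipeline: the `Box` type, the function names and the toy image are hypothetical, and the box is assumed to come from a model such as YOLOv8 rather than being computed here.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned box in pixel coordinates; the max edges are exclusive."""
    x_min: int
    y_min: int
    x_max: int
    y_max: int


def clamp_box(box: Box, width: int, height: int) -> Box:
    """Clip a box so that it lies inside a width x height image."""
    return Box(
        max(0, box.x_min),
        max(0, box.y_min),
        min(width, box.x_max),
        min(height, box.y_max),
    )


def crop(image, box: Box):
    """Return the part of a row-major image (a list of rows) inside the box."""
    height, width = len(image), len(image[0])
    b = clamp_box(box, width, height)
    return [row[b.x_min:b.x_max] for row in image[b.y_min:b.y_max]]


def to_crop_coords(point, box: Box):
    """Express an (x, y) landmark relative to the crop's top-left corner."""
    x, y = point
    return (x - box.x_min, y - box.y_min)


# Toy 4 x 6 "image" whose pixel values encode their own row and column.
image = [[10 * r + c for c in range(6)] for r in range(4)]
head_box = Box(1, 1, 4, 3)        # hypothetical detector output for the head
head_crop = crop(image, head_box)
eye_in_crop = to_crop_coords((3, 2), head_box)  # hypothetical eye landmark
```

Keeping landmarks in crop-relative coordinates means the same point can be located in the head crop without going back to the original photograph. If a detected box extends past the image edge, the landmark should be shifted using the clamped box.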

Reading colour and shape with consistency

Because the photos were taken in the field under changing light, the researchers also tackled the challenge of making colours comparable from image to image. Most photos include a white reference card, allowing software to measure how much each picture drifts from a known standard and to correct brightness and colour balance accordingly. Scripts in Python and R, shared openly with the dataset, show how to perform these corrections and how to extract colour information from specific regions such as the cheek. This careful standardization is essential for studying subtle differences in colour linked to sex, season, health and social status.
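The idea of correcting against a white reference can be sketched with a simple per-channel scaling. This is a minimal stand-in, not the shared Python and R scripts: it assumes the pixels covering the white card have already been located, that images are lists of rows of RGB tuples, and that the card should map to a target value of 255 in each channel. All function names are hypothetical.

```python
from statistics import mean

Pixel = tuple[float, float, float]


def white_balance_gains(card_pixels: list[Pixel], target: float = 255.0):
    """Per-channel gains that map the mean card colour onto the target white."""
    means = [mean(p[c] for p in card_pixels) for c in range(3)]
    if min(means) <= 0:
        raise ValueError("white card pixels must be brighter than zero")
    return tuple(target / m for m in means)


def correct_pixel(pixel: Pixel, gains) -> Pixel:
    """Scale one RGB pixel by the gains and clip it to the 0-255 range."""
    return tuple(min(255.0, max(0.0, v * g)) for v, g in zip(pixel, gains))


def correct_image(image, gains):
    """Apply the same correction to every pixel of a row-major RGB image."""
    return [[correct_pixel(p, gains) for p in row] for row in image]


# A card photographed under slightly warm, dim light (assumed values).
card = [(200.0, 190.0, 180.0), (200.0, 190.0, 180.0)]
gains = white_balance_gains(card)
cheek = correct_pixel((100.0, 95.0, 90.0), gains)  # hypothetical cheek pixel
```

After correction, a pixel with the same colour cast as the card comes out as a neutral grey, which is what makes measurements from different photos comparable. A single gain per channel handles overall brightness and colour balance only; uneven lighting across the frame would need more than this.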

Figure 2.

Can people and machines spot the same fish?

The corkwing wrasse has intricate, high-contrast patterns on its head that act like a facial fingerprint. To see how well humans can use these patterns, the team built a simple online test called FishFaces. Participants were shown a query headshot of a fish and two candidate images, sometimes taken years apart from the query, and asked to choose which candidate showed the same individual. Eight experienced observers achieved near-perfect accuracy, even when the fish had grown or changed overall colour. Earlier computer experiments on a smaller subset of the data showed that current deep learning methods can already pick out the right fish in about half of difficult cases, and the new, larger dataset is meant to push these methods much further.
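A machine version of the two-candidate test can be sketched as follows. The sketch assumes that each headshot has already been turned into a numeric feature vector (an "embedding") by some model, and that the candidate most similar to the query is chosen. The source does not say how the deep learning methods compare images, so cosine similarity, the function names and the toy vectors are all illustrative assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("feature vectors must be non-zero")
    return dot / (norm_a * norm_b)


def pick_match(query, candidates):
    """Index of the candidate whose features are most similar to the query."""
    sims = [cosine_similarity(query, c) for c in candidates]
    return max(range(len(sims)), key=sims.__getitem__)


def accuracy(trials):
    """Fraction of trials in which the most similar candidate is the right one.

    Each trial is (query, candidates, index_of_true_match).
    """
    correct = sum(pick_match(q, cands) == truth for q, cands, truth in trials)
    return correct / len(trials)


# Two toy trials with made-up 2-D features; real embeddings would be longer.
trials = [
    ([1.0, 0.0], [[0.0, 1.0], [0.9, 0.1]], 1),  # picked correctly
    ([0.0, 1.0], [[0.1, 0.9], [1.0, 0.0]], 1),  # picked incorrectly
]
score = accuracy(trials)
```

With only two candidates per trial, random guessing would already be right about half the time, so scores from this kind of test are best read against that baseline.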

What this means for watching life under water

By openly releasing Melops and all associated tools, the authors offer a rare, richly documented record of thousands of wild fish followed through time. For non-specialists, the key message is that we may soon be able to monitor fish populations, sex ratios, growth and health simply by analysing photographs, reducing the need for invasive tags. The same framework can be adapted to other species with distinctive markings, providing a powerful way to study animal lives, fisheries impacts and environmental change while leaving more of the animals’ world undisturbed.

Citation: Sørdalen, T.K., Malde, K., Sauvaitre, C. et al. A wild fish image dataset for individual re-identification and phenotyping. Sci Data 13, 708 (2026). https://doi.org/10.1038/s41597-026-07045-1

Keywords: animal re-identification, computer vision, fish ecology, image dataset, colour pattern