Expert or Novice: do we see the same things?

Camera traps have become an invaluable tool for wildlife scientists, allowing us to capture images and videos of animals without disrupting their behaviours. However, as the wealth of data they generate continues to grow, our challenge lies in efficiently processing and classifying the thousands (or even millions) of images that cameras produce. With cameras capable of recording up to 40,000 images per day, researchers may find themselves overwhelmed with the task of content classification. This is where students, interns, citizen scientists or other assistants may come into play. But we are left with an important question: can we trust their observations?

Confronted with mass data, many research programmes turn to untrained volunteer citizen scientists to determine image contents. One example is the Snapshot Safari programme initiated by the University of Minnesota. In this ground-breaking effort, hundreds of camera traps were deployed across East and Southern African landscapes, empowering citizens worldwide to help identify animals through an online platform. Given volunteers' diverse backgrounds and varying levels of experience with recognising and identifying wildlife, can we truly rely on them to extract relevant information accurately and reliably? And beyond species identification, counting individuals requires a keen eye and a high level of concentration.


At the Ongava Research Centre, we have already processed over 8 million images, with about 500,000 more added annually. We are continually confronted with the challenge of evaluating how different observers might affect the quality of data extracted from these images. In a recent experiment, our focus was on understanding how experience with camera traps and knowledge of Namibian wildlife influence the data obtained. So, we enlisted a diverse group of 10 in-house researchers, consisting of three senior researchers, two research technicians, and five students. We categorised participants into novices (less than 1 year of experience), semi-experienced observers (1 to 5 years), and experienced observers (more than 5 years). Working with the same set of ~11,500 images, with no time constraints and plenty of aids to identify wildlife correctly, everyone was assigned two (seemingly very simple) objectives:

1) to identify all the mammal species present in the images, and

2) to count herbivores in each of the recorded herds.

From this, we tested how much different human observers agree in their wildlife observations recorded from the images.
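For readers who like to see the mechanics, here is one simple way such agreement can be quantified: pairwise percent agreement on species labels, i.e. the fraction of images on which two observers assigned the same label. This is only a minimal sketch: the observers, images and labels below are made up for illustration, and it is not necessarily the statistic used in our study.

```python
from itertools import combinations

# Hypothetical species labels assigned by three observers to the same five images
# (made-up data for illustration only).
labels = {
    "novice_1":  ["kudu", "unknown", "springbok", "rock hyrax", "plains zebra"],
    "student_2": ["kudu", "oryx",    "springbok", "unknown",    "mountain zebra"],
    "senior_1":  ["kudu", "oryx",    "springbok", "rock hyrax", "mountain zebra"],
}

def pairwise_agreement(a, b):
    """Fraction of images on which two observers assigned the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

# Average agreement over every pair of observers.
scores = {(o1, o2): pairwise_agreement(labels[o1], labels[o2])
          for o1, o2 in combinations(labels, 2)}
for (o1, o2), s in scores.items():
    print(f"{o1} vs {o2}: {s:.0%} agreement")
print(f"mean pairwise agreement: {sum(scores.values()) / len(scores):.0%}")
```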

The more experienced the observer, the quicker the classification!

The results were quite a surprise… Our experiment revealed considerable disparities not only in the time taken to complete these tasks, but also in the accuracy of species identification and herd counts among observers. First, only a single participant (a novice, we might add) found all 22 mammal species present in the images! While experienced observers outperformed their less experienced counterparts in terms of greater precision and confidence in identifying species, the novice observers tended to label a great number of recorded animals as “unknown”, something that the ‘experts’ did not do at all. Additionally, experienced observers were an impressive 4.5 times faster in their classifications compared to novices.

More experienced observers make fewer mistakes and are more confident.

Among the different challenges faced by less experienced observers, the most prominent errors involved small mammals like squirrels and rock hyraxes, which proved difficult to differentiate and were more likely to be overlooked altogether. Novices also struggled to distinguish between similar-looking mammals, such as plains and Hartmann’s mountain zebras.

More experienced observers produced more similar counts. But antelopes in large herds are difficult to count for everyone!

When it came to counting the group sizes of common herbivores, things really got interesting. We found enormous variation between the counts of different observers, with herd sizes varying by a staggering 140% to 170%. However, it was among novice observers that estimates exhibited the highest level of variability, while experienced observers recorded more consistent results. Further analysis showed that group size estimates also varied significantly between different species. For instance, species living in larger herds, such as black-faced impalas, posed greater challenges in counting, leading to increased variation in counts. Interestingly, even Hartmann’s mountain zebras, which do not typically form large groups, were recorded with considerable variation in group sizes. It’s not so easy after all, even when we’re provided with near-optimal recording conditions from static wildlife images and all the time in the world.
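To put a number on that kind of spread, a common summary is the coefficient of variation: the standard deviation of the counts divided by their mean. The sketch below uses invented counts purely for illustration; the statistics reported in the paper may be calculated differently.

```python
import statistics

# Hypothetical counts of the same impala herd made by ten observers
# (invented numbers for illustration).
herd_counts = [12, 15, 9, 22, 14, 30, 11, 18, 25, 16]

mean = statistics.mean(herd_counts)   # average count across observers
sd = statistics.stdev(herd_counts)    # sample standard deviation
cv = sd / mean                        # coefficient of variation

print(f"mean = {mean:.1f}, sd = {sd:.1f}, CV = {cv:.0%}")
# A large CV means observers disagree substantially about how many
# animals are visible in the very same frames.
```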

So, are we seeing the same things in camera trap images? Apparently not! But what do the results from our small experiment mean? If you have a large camera trap dataset and your focus is solely on identifying large and well-known species, involving volunteers or citizen scientists may be an effective approach. They can contribute valuable data to your research, facilitating the identification of easily identifiable, familiar species. If your study requires the inclusion of smaller, lesser-known or similar-looking wildlife species, however, it is best to rely solely on expert observers for extracting information from images. Their expertise ensures higher levels of accuracy and precision, particularly when dealing with cryptic or less distinctive species. This is especially true for difficult image sequences where only part of the animal’s body may be visible. If your goal is to estimate the group sizes of common herbivores, camera traps may not be the most suitable tool for achieving accurate results. If you still choose to use them, it becomes imperative to assemble a small team of highly experienced observers to do the counting. Their expertise will minimise errors and uncertainties, guaranteeing the most reliable estimates possible.

In summary, the decision of whether to involve volunteers or untrained citizen scientists in data extraction or not, and the choice of research tools, depends on the specific objectives and requirements of your wildlife study. For broader surveys focusing on well-known distinctive species, citizen scientists can be a valuable resource. However, when dealing with more difficult situations that involve similar-looking, cryptic or small species, or when estimating group sizes accurately is your goal, the expertise of seasoned observers becomes indispensable for obtaining high-quality data. The devil is in the detail.

If you’ve been following our previous posts, you might be thinking that Artificial Intelligence (AI) is the solution to our challenges – and you are partly right! AI holds the promise of addressing some of the issues we have discussed here, provided that the same algorithm is used consistently across different studies and that it is trained with standardised datasets. But let’s remember how the initial AI training works. To train or ‘inform’ an AI algorithm, a vast amount of data is required, often millions of images that have already been classified by humans. Correctly identified images enable the machine to ‘learn’ how to distinguish between various species, such as elephants and lions. However, there is a catch here – the training dataset itself still has to be classified by humans, bringing us back to our little experiment.
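To make that dependence on human-classified images concrete, here is a minimal transfer-learning sketch in Python (PyTorch/torchvision), where the species labels come straight from folders that people have already sorted images into. The directory layout, species classes and training settings are illustrative assumptions only; this is not the pipeline used in our study or by TrapTagger.

```python
# Minimal transfer-learning sketch: the folder names under train_dir ARE the
# human classifications the model learns from. All paths and settings below
# are illustrative assumptions, not the study's or TrapTagger's pipeline.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

train_dir = "camera_trap_images/train"  # e.g. .../train/elephant/*.jpg, .../train/lion/*.jpg

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
train_data = datasets.ImageFolder(train_dir, transform=preprocess)
loader = DataLoader(train_data, batch_size=32, shuffle=True)

# Start from a network pre-trained on ImageNet and replace its final layer
# with one output per species folder.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, len(train_data.classes))

optimiser = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(3):                  # a token number of passes over the data
    for images, targets in loader:      # targets come from the human-made folders
        optimiser.zero_grad()
        loss = loss_fn(model(images), targets)
        loss.backward()
        optimiser.step()
    print(f"epoch {epoch}: last batch loss = {loss.item():.3f}")
```

The key point is that every label the network sees during training was put there by a person, which is why observer accuracy matters long before any AI gets involved.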

The TrapTagger platform: online AI image classification

Circling back to the fundamental importance of human expertise: when it comes to extracting research data from camera trap images, relying on experienced staff rather than novices is likely the better approach. The wealth of knowledge and experience possessed by expert staff ensures more accurate and reliable information. And while AI surely holds great promise for minimising our challenges in the future, algorithms still rely on human input for their initial training. Hence, having experienced observers involved in the initial classification of camera trap images remains crucial for data quality and accuracy.

If you are interested in the full scientific paper, you can find it here: https://link.springer.com/article/10.1007/s10531-022-02472-z

This blog was written with the help of Florian Weise.