Biodiversity databases are not "talking" to each other: species occurrence records and range maps from different data sources do not match, and only some of them agree with data on how species interact with each other. A comparison between data from the International Union for the Conservation of Nature (IUCN), the Global Biodiversity Information Facility (GBIF) and a dataset of foodweb interactions (predation and herbivory) from the Serengeti ecosystem (East Africa) revealed many areas of "mismatch" that could indicate a lack of data for nine predators and their prey.
For some predatory mammals, nearly 100% of their range maps do not overlap with those of their prey, which leads us to question whether the predators can really be found in areas where there is no prey. This is especially true for those considered specialized predators in the Serengeti foodweb, such as the serval (a wild cat) and the black-backed jackal (a dog-like carnivoran). For the golden jackal, this mismatch is probably caused by inconsistent taxonomic information between datasets: its scientific identity has been heavily debated in the literature, and it is possible that the databases have not caught up with the updates.
Species occurrence data are widely used by ecologists to understand and predict the distribution of biodiversity. These analyses inform conservation policies, actions to fight climate change, public health guidelines and much more. This is only possible because these data are very often shared under a license that allows anyone to use them and are archived properly in databases such as IUCN and GBIF. But these data have their flaws: GBIF data are known to be biased (as a result of historically biased scientific activity), IUCN range maps are known to misestimate the distribution of rare species, and very few large-scale interaction datasets are available.
This study was based on a very simple rationale: if a predator can't feed, it is very unlikely to remain where it was found. We should therefore expect the range map of a predator to overlap almost perfectly with the combined range maps of its prey. If it does not, the reason could be that we misestimate the distribution of the predator or of its prey, or that we lack information about the species' diet.
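To make this rationale concrete, here is a minimal Python sketch that measures how much of a predator's range has no known prey at all. It assumes ranges have already been reduced to sets of grid-cell identifiers; the function name `mismatch_fraction` and the toy data are hypothetical illustrations, not the authors' code or data.

```python
def mismatch_fraction(predator_cells, prey_ranges):
    """Return the share of the predator's cells where none of its prey occur.

    predator_cells: set of grid-cell ids covered by the predator's range map
    prey_ranges: dict mapping each prey species to its set of grid-cell ids
    """
    if not predator_cells:
        return 0.0
    # Cells where at least one prey species is mapped.
    prey_union = set().union(*prey_ranges.values())
    # Cells where the predator is mapped but no prey is.
    unsupported = predator_cells - prey_union
    return len(unsupported) / len(predator_cells)


# Toy example (made-up cell ids): the predator occupies four cells,
# and its prey jointly cover only two of them.
predator = {1, 2, 3, 4}
prey = {"prey_a": {1, 2}, "prey_b": {2, 5}}
fraction = mismatch_fraction(predator, prey)
```

A value near 1.0 would correspond to the "nearly 100%" mismatch case: almost nowhere in the predator's mapped range is any of its known prey also mapped.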
It was with this initial idea that Gracielle, Gabriel, Francis, Fredric, Norma and Timothée got together and started analysing available data from IUCN and using the Serengeti foodweb to assess relationships between species.
"We are all interested in macroecology of interactions and species distribution modelling, and we think these things should be studied together. But we know that in order to integrate these two things, our available datasets need to talk." – Gracielle Higino
To assess whether range maps and ecological interaction data were "talking" to each other, the authors divided the map of the African continent into grid cells of approximately 50 km² and built a local foodweb for each cell from the published regional Serengeti foodweb and the IUCN range maps. With that, they could try to trace a connection between a top predator and a herbivore within each grid cell. Whenever such a connection was not possible, the grid cell was considered mismatched, calling into question either the presence of the top predator or the completeness of the information on its prey.
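The per-cell check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the species, feeding links and grid cells below are made up, the function names are hypothetical, and the authors' actual pipeline may differ. The sketch builds a local foodweb from the species mapped in a cell, then uses a breadth-first search to see whether feeding links connect the predator to any herbivore.

```python
from collections import deque

# Hypothetical regional foodweb: predator -> set of prey (toy data).
REGIONAL_WEB = {
    "lion": {"zebra", "jackal"},
    "jackal": {"hare"},
    "serval": {"hare"},
}
HERBIVORES = {"zebra", "hare"}

# Hypothetical grid: cell id -> species whose range maps cover that cell.
CELLS = {
    (0, 0): {"lion", "zebra"},
    (0, 1): {"lion", "jackal"},          # jackal mapped here, but no hare
    (1, 0): {"lion", "jackal", "hare"},
    (1, 1): {"serval"},
}


def local_web(regional_web, species_present):
    """Keep only the links whose predator and prey both occur in the cell."""
    return {
        pred: {p for p in prey if p in species_present}
        for pred, prey in regional_web.items()
        if pred in species_present
    }


def reaches_herbivore(web, predator, herbivores):
    """Breadth-first search down the feeding links for any herbivore."""
    queue = deque([predator])
    seen = {predator}
    while queue:
        current = queue.popleft()
        for prey in web.get(current, ()):
            if prey in herbivores:
                return True
            if prey not in seen:
                seen.add(prey)
                queue.append(prey)
    return False


def mismatched_cells(regional_web, herbivores, cells, predator):
    """List the cells where the predator is mapped but cannot reach a herbivore."""
    flagged = []
    for cell_id, present in cells.items():
        if predator not in present:
            continue
        web = local_web(regional_web, present)
        if not reaches_herbivore(web, predator, herbivores):
            flagged.append(cell_id)
    return flagged


lion_mismatches = mismatched_cells(REGIONAL_WEB, HERBIVORES, CELLS, "lion")
serval_mismatches = mismatched_cells(REGIONAL_WEB, HERBIVORES, CELLS, "serval")
```

In this toy grid, the lion is flagged only in the cell where its sole mapped prey (the jackal) has nothing to eat, and the serval is flagged in the cell where it is mapped alone. Searching the whole chain rather than only direct prey is a design choice of this sketch, so that a top predator feeding on smaller predators still counts as connected when those are themselves supported by herbivores.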
This method can also be used to map priority sampling locations for interaction and occurrence data. This would contribute to the monitoring of biodiversity in the face of climate change and habitat loss, and to another promising avenue for ecological data: the prediction of species' diets.
Geographical data (such as range maps) and ecological data (such as foodwebs) mismatch because of biases, and incentives for open data are one way to address that. The researchers believe that broader access to data is important to mitigate the error propagation in ecological models that can be caused by biased occurrence maps and incomplete interaction networks.
"Open science is a very core practice for all of us. We think open access to data and information is extremely important, and we want it to be done the right way. It is imperative that open ecological data are consistent and redundant across databases, which we didn't quite see in our study." – Gracielle Higino
As we lose ecological interactions at least as fast as we lose species due to environmental changes, open access to data becomes crucial to help inform public policies in conservation and public health.