AI Tool Boosts Ecological Applications

Washington University in St. Louis

By Shawn Ballard

Ever seen an image of an animal and wondered, "What is that?" TaxaBind, a new tool developed by computer scientists in the McKelvey School of Engineering at Washington University in St. Louis, can sate that curiosity and more.

TaxaBind addresses the need for more robust and unified approaches to ecological problems by combining multiple models to perform species classification (what kind of bear is this?), distribution mapping (where are the cardinals?), and other tasks related to ecology. The tool can also be used as a starting point for larger studies related to ecological modeling, which scientists might use to predict shifts in plant and animal populations, climate change effects, or impacts of human activities on ecosystems.

Srikumar Sastry, the lead author on the project, presented TaxaBind on March 2-3 at the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) in Tucson, AZ.

"With TaxaBind we're unlocking the potential of multiple modalities in the ecological domain," Sastry said. "Unlike existing models that only focus on one task at a time, we combine six modalities – ground-level images of species, geographic location, satellite images, text, audio and other environmental features – into one cohesive framework. This enables our models to address a diverse range of ecological tasks."

Sastry, a graduate student working with Nathan Jacobs, professor of computer science & engineering, used an innovative technique known as multimodal patching to distill information from different modalities into one binding modality. Sastry describes this binding modality as the "mutual friend" that connects and maintains synergy among the other five modalities.

For TaxaBind, the binding modality is ground-level images of species. The tool captures unique features from each of the other five modalities and condenses them into the binding modality, enabling the AI to learn from images, text, sound, geography and environmental context all at once.
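To make the binding idea concrete, here is a minimal sketch of how a single binding modality can anchor pairwise contrastive training, in the spirit of what the article describes. The placeholder encoders, embedding dimension, and InfoNCE loss below are illustrative assumptions for this sketch, not TaxaBind's actual architecture or training code.

```python
# Illustrative sketch: binding five modalities to ground-level images with
# pairwise contrastive (InfoNCE) losses. Encoders are stand-in linear layers;
# TaxaBind's real encoders, dimensions, and patching step are not shown here.
import torch
import torch.nn as nn
import torch.nn.functional as F

EMBED_DIM = 512

def make_encoder(input_dim: int) -> nn.Module:
    """Placeholder encoder: a single linear projection into the shared space."""
    return nn.Linear(input_dim, EMBED_DIM)

# One encoder per modality; the input sizes are arbitrary for the sketch.
encoders = nn.ModuleDict({
    "ground_image": make_encoder(768),   # binding modality
    "satellite":    make_encoder(768),
    "location":     make_encoder(2),     # lat/lon
    "text":         make_encoder(512),
    "audio":        make_encoder(256),
    "environment":  make_encoder(32),    # e.g. climate covariates
})

def info_nce(a: torch.Tensor, b: torch.Tensor, temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss between two batches of embeddings."""
    a, b = F.normalize(a, dim=-1), F.normalize(b, dim=-1)
    logits = a @ b.t() / temperature
    targets = torch.arange(a.size(0))
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

def training_step(batch: dict) -> torch.Tensor:
    """Bind every modality to ground-level images; no other pairs are trained directly."""
    anchor = encoders["ground_image"](batch["ground_image"])
    loss = torch.zeros(())
    for name, encoder in encoders.items():
        if name == "ground_image":
            continue
        loss = loss + info_nce(anchor, encoder(batch[name]))
    return loss

# Toy batch of 8 samples with random features, just to show the shapes involved.
batch = {
    "ground_image": torch.randn(8, 768),
    "satellite":    torch.randn(8, 768),
    "location":     torch.randn(8, 2),
    "text":         torch.randn(8, 512),
    "audio":        torch.randn(8, 256),
    "environment":  torch.randn(8, 32),
}
print(training_step(batch))  # scalar loss for this toy batch
```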

When the team assessed the tool's performance across various ecological tasks, TaxaBind demonstrated superior capabilities in zero-shot classification, which is the ability to classify a species not present in its training dataset. The demo version of the tool was trained on roughly 450,000 species and can classify a given image by the species it shows, including previously unseen species.
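As a rough illustration of zero-shot classification in a shared embedding space, the sketch below compares an image embedding against text embeddings of candidate species names and picks the closest match. The function, names, and random embeddings are placeholders under the same assumptions as the earlier sketch, not TaxaBind's API.

```python
# Illustrative zero-shot classification: embed a query image and candidate
# species names into the shared space, then rank by cosine similarity.
# The embeddings here are random stand-ins for real encoder outputs.
import torch
import torch.nn.functional as F

def classify_zero_shot(image_embedding: torch.Tensor,
                       species_names: list,
                       text_embeddings: torch.Tensor) -> str:
    """Return the species whose text embedding is closest to the image embedding."""
    img = F.normalize(image_embedding, dim=-1)
    txt = F.normalize(text_embeddings, dim=-1)
    scores = txt @ img                      # cosine similarity per species
    return species_names[int(scores.argmax())]

# Toy example: three candidate species. None of them needs to appear in the
# training set; only their names must be embeddable as text.
species = ["Ursus arctos", "Ursus americanus", "Ursus maritimus"]
image_emb = torch.randn(512)
text_embs = torch.randn(len(species), 512)
print(classify_zero_shot(image_emb, species, text_embs))
```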

"During training we only need to maintain the synergy between ground-level images and other modalities," Sastry said. "That bridge then creates emergent synergies between the modalities – for example, between satellite images and audio – when TaxaBind is applied to retrieval tasks, even though those modes were not trained together."

This cross-modal retrieval was another area where TaxaBind outperformed state-of-the-art methods. For example, the combination of satellite images and ground-level species images allowed TaxaBind to retrieve habitat characteristics and climate data related to species' locations. It also returned relevant satellite images based on species images, proving the tool's ability to link fine-grained ecological data with real-world environmental information.
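The general pattern behind such cross-modal retrieval can be sketched as ranking one modality's embeddings by cosine similarity to a query embedding from another modality. The tensors and dimensions below are placeholders consistent with the earlier sketch, not actual TaxaBind outputs.

```python
# Illustrative cross-modal retrieval: given a query embedding from one modality
# (e.g. a satellite image), return the indices of the top-k closest items from
# another modality (e.g. ground-level species images or audio clips).
import torch
import torch.nn.functional as F

def retrieve_top_k(query: torch.Tensor,
                   gallery: torch.Tensor,
                   k: int = 5) -> torch.Tensor:
    """Rank gallery items by cosine similarity to the query embedding."""
    q = F.normalize(query, dim=-1)
    g = F.normalize(gallery, dim=-1)
    scores = g @ q
    return scores.topk(k).indices

# Toy example: retrieve the 5 species-image embeddings most similar to a
# satellite-image embedding, even though those two modalities were only
# linked indirectly through the ground-level image binding modality.
satellite_query = torch.randn(512)
species_image_gallery = torch.randn(1000, 512)
print(retrieve_top_k(satellite_query, species_image_gallery))
```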

The implications of TaxaBind extend far beyond species classification. Sastry notes that the models are general purpose and could potentially be used as a foundation model for other ecology and climate-related applications, such as deforestation monitoring and habitat mapping. He also envisions future iterations of the technology that can make sense of natural language text inputs to respond to user queries.


Sastry S, Khanal S, Dhakal A, Ahmad A, Jacobs N. TaxaBind: A unified embedding space for ecological applications. IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Tucson, AZ, Feb. 28-March 4, 2025. https://arxiv.org/abs/2411.00683

This research used the TGI RAILs advanced compute and data resource, which is supported by the National Science Foundation (OAC-2232860) and the Taylor Geospatial Institute. This work was also supported by the Washington University in St. Louis Geospatial Research Initiative.
