As agricultural research continues to become more entwined with technology, smart farming – a phrase that encompasses research computing tools that help farmers to better address issues like crop disease, drought and sustainability – has quickly become a ubiquitous term in Ag labs across the country. The availability of NCSA resources like Delta for researchers, both nationally and on the University of Illinois Urbana-Champaign (U. of I.) campus, has fostered a hotbed of cutting-edge research projects in the agricultural domain.
Yi-Chia Chang, a Ph.D. student at the U. of I., focuses his research on machine learning (ML) and remote sensing. His most recent research, published on arXiv and accepted to IEEE IGARSS 2025, concerns crop mapping.
Imagine you're a farmer, and you're planning what to grow this season. You may want to know what crop would be most valuable to grow. If you're a policymaker, you might want to know if there would be a shortage of a particular crop and incentivize farmers to grow it through subsidies. Either way, you'd have to know what's currently growing to make those decisions – that's where crop mapping comes into play.
Crop mapping uses satellite imagery to create a map of all the crop types in a particular region. Crop maps are essential tools for monitoring crops and regional food supplies, and they help farmers plan which crops to plant in a growing season. The maps can also help with smart farming – using these crop maps, applications can monitor growth and precipitation conditions, predict yields and even detect disease.
All these tools are great for farmers, but they also help at a larger scale, helping policymakers and organizations determine how much food, and what types, are being produced in a given area. Machine learning is an essential component when it comes to keeping these crop maps up-to-date. In the U.S. alone, there are millions of acres of farmland to analyze, label and map. There aren't enough experts to keep up with the data needed to create up-to-date, accurate crop maps, so training machines to scan satellite images and label crops is far more efficient and useful.
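To make the idea of reading production figures off a crop map concrete, here is a minimal sketch in Python. It assumes a crop map is a grid of per-pixel crop labels and that each pixel covers 10 m by 10 m on the ground; the grid, the function name `area_by_crop` and the pixel size are illustrative assumptions, not details from Chang's study.

```python
from collections import Counter

PIXEL_SIZE_M = 10      # assumed ground size of one pixel edge, in meters
HECTARE_M2 = 10_000    # square meters in one hectare


def area_by_crop(label_grid, pixel_size_m=PIXEL_SIZE_M):
    """Return the mapped area in hectares for each crop label in the grid."""
    # Count how many pixels carry each crop label.
    counts = Counter(label for row in label_grid for label in row)
    # Convert one pixel's footprint from square meters to hectares.
    pixel_area_ha = pixel_size_m ** 2 / HECTARE_M2
    return {crop: n * pixel_area_ha for crop, n in counts.items()}


# Hypothetical 2 x 3 crop map, for illustration only.
crop_map = [
    ["maize", "maize", "wheat"],
    ["maize", "rice", "wheat"],
]

areas = area_by_crop(crop_map)
for crop, hectares in sorted(areas.items()):
    print(f"{crop}: {hectares:.2f} ha")
```

A real crop map would be a georeferenced raster with millions of pixels, but the tally works the same way: count the pixels per crop and multiply by the area each pixel covers.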
Researchers have had great success training machines to recognize not only crops but many other elements of farming from satellite imagery. They've created accurate models for crop mapping in well-researched regions like the U.S. However, there has been little research on how well these models work in new geographic areas, especially in regions where data is lacking. This raises concerns about "geospatial bias," meaning models trained on data from well-developed countries may not perform well in less-developed regions.
Our research will enable better-informed agricultural systems for policymakers and stakeholders to support global food security.
–Yi-Chia Chang, University of Illinois
Chang's study, which was inspired by his team's previous related research published in the NeurIPS 2023 proceedings, looks at how popular Earth observation models work when applied to new regions, particularly in agriculture, where differences in farming practices and uneven data availability make it harder to transfer knowledge between areas. To do this, Chang chose four major crops – maize, soybean, rice and wheat – and then tested three widely used pre-trained models, comparing their performance on data from regions they had seen before (in-distribution) versus data from new regions (out-of-distribution).
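The in-distribution versus out-of-distribution comparison can be sketched in a few lines of Python. This is an illustrative toy, not the study's evaluation code: the region names, the sample labels and the helper `id_vs_ood_accuracy` are all hypothetical, and plain accuracy stands in for whatever metrics the paper reports.

```python
from collections import defaultdict


def accuracy(pairs):
    """Fraction of (true_label, predicted_label) pairs that match."""
    pairs = list(pairs)
    if not pairs:
        return float("nan")
    return sum(true == pred for true, pred in pairs) / len(pairs)


def id_vs_ood_accuracy(samples, train_regions):
    """Score predictions separately for seen and unseen regions.

    Each sample is (region, true_label, predicted_label). A sample counts as
    in-distribution when its region was part of the training data.
    """
    groups = defaultdict(list)
    for region, true_label, predicted_label in samples:
        key = "in_distribution" if region in train_regions else "out_of_distribution"
        groups[key].append((true_label, predicted_label))
    return {key: accuracy(pairs) for key, pairs in groups.items()}


# Hypothetical predictions: the model was trained on region_a only.
samples = [
    ("region_a", "maize", "maize"),
    ("region_a", "wheat", "wheat"),
    ("region_a", "rice", "rice"),
    ("region_a", "soybean", "maize"),
    ("region_b", "maize", "wheat"),
    ("region_b", "rice", "rice"),
    ("region_b", "wheat", "maize"),
    ("region_b", "soybean", "soybean"),
]

scores = id_vs_ood_accuracy(samples, train_regions={"region_a"})
print(scores)
```

A gap between the two scores is the kind of signal that points to geospatial bias: the model does well where it has seen data and worse where it has not.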
The results showed that models pre-trained on Sentinel-2 satellite imagery (SSL4EO-S12) performed better than those pre-trained on general-purpose image datasets like ImageNet.
"By harmonizing crop type datasets across five continents, we found that foundation models pre-trained on full spectral bands of Sentinel-2 perform better for crop-type mapping," said Chang. "Our research also shows that training with out-of-distribution data can boost performance when the in-distribution data is scarce. In the long run, we still hope to acquire larger and more balanced labeled datasets since those can help achieve the best crop-type mapping results. I am excited to see how foundation models and transfer learning can benefit food security."
Chang's work has been fully integrated with TorchGeo, an open-source library for geospatial machine learning, so future research can easily build on his results. As his team looks ahead, they plan to extend the findings of this study and apply their methodology to new smart-farming models.
"Our future work will focus on expanding crop-type datasets and developing agriculture-specific pre-trained models," said Chang. "We will also establish benchmarks for agricultural applications of foundation models, such as crop-type mapping and crop-yield prediction, bridging the gap between GeoAI and food security solutions."
Chang's work required massive amounts of storage and compute power to complete. GPUs were necessary for the machine-learning aspect of the project to be completed in a timely manner, but a lot of space was also needed for all that satellite imagery.
HPC resources significantly accelerate the machine learning workflows using GPUs, reducing model training time from hours on CPUs to minutes on GPUs. Additionally, the large data-storage allocation enables us to efficiently manage the training datasets, pre-trained weights and model outputs in the cluster.
–Yi-Chia Chang, University of Illinois
Chang has experience using research computing. Prior to this project, he utilized the campus cluster hosted by a research group led by Arindam Banerjee, a professor of computer science at U. of I. Even with his previous experience with high-performance computing (HPC), Chang was happy to report that moving his project onto Delta was relatively simple.
"My experience using Delta has been smooth and user-friendly. The admin staff was responsive, approving token exchange for GPU hours and storage allocations within a few days. The technical staff efficiently helped with troubleshooting. I'd like to send a special thanks to Brett Bode for helping to allocate over 50 TB of storage for satellite imagery."