Plant breeders are gaining access to specialized training in advanced data analysis techniques through a new online course from the Feed the Future Innovation Lab for Crop Improvement (ILCI) at Cornell University. The course, which launched today on eCornell, equips participants with essential skills for managing and analyzing large plant breeding datasets, including techniques for identifying important genetic traits and evaluating plant breeding trials. The program introduces participants to powerful tools for data visualization and analysis, helping to strengthen research capacity in plant breeding programs around the world.
Developed by ILCI experts in genetics and informatics and taught by Cornell instructors, the online course "R and Plant Breeding: Analyzing Large Phenotype and Genotype Datasets in JupyterLab, An Introduction" is tailored for scientists in resource-limited settings managing the increasingly large and complex datasets that are essential to modern plant breeding programs.
"The advanced skills needed to manage and analyze large datasets are essential for plant breeders to make consistent genetic gains in their crop improvement programs," said Bethany Econopouly, associate director for biological and computational sciences at ILCI. "This course is designed to meet that demand by offering researchers practical training on tools, technologies and methods that will enhance their capacity to advance crop improvement projects around the world."
Participants in this inaugural cohort, composed of researchers from Costa Rica, Malawi, Mozambique, Senegal, Uganda and the U.S., will gain hands-on experience with essential tools, including the R programming language, computational notebooks in JupyterLab, and rTASSEL, along with insights into genome-wide association studies (GWAS) and mixed model methodologies. Developed by the Buckler Lab and partially funded by ILCI, rTASSEL is an R interface to the widely used TASSEL software for analyzing genomic diversity.
The course consists of three comprehensive modules, covering topics such as documenting and executing code in JupyterLab, performing GWAS, and applying mixed models for genetic evaluation. Each module includes interactive tutorials and graded assignments to ensure learners develop a strong grasp of the material.
"This course reflects our ongoing commitment to developing capacity in genomics, phenomics, and data analysis skills, which are essential for plant breeders worldwide," said Kelly Robbins, associate professor of plant breeding and genetics in the School of Integrative Plant Science. "This training supports Cornell's land-grant mission of sharing knowledge and fostering innovation, ensuring that plant breeders and agricultural communities worldwide gain access to these vital skills."
Brandon Monier, an international genomic informatician in the Buckler Lab and the creator and lead developer of rTASSEL, said, "This program teaches advanced data science concepts to current and future researchers. We aim to demystify these skill sets for plant breeders, supporting global efforts to improve food security and crop resilience."
ILCI's full funding of the course allows participants to access this training at no cost, reflecting its continued commitment to enhancing research capacity in Feed the Future countries and beyond. Researchers in the first cohort come from the National Semi Arid Resources Research Institute (Uganda), Lilongwe University of Agriculture and Natural Resources (Malawi), Instituto Nacional de Innovación y Transferencia en Tecnología Agropecuaria (Costa Rica), CIMMYT (Senegal), the Agricultural Research Institute of Mozambique, Institut Sénégalais de Recherches Agricoles (Senegal), Delaware State University (U.S.) and Clemson University (U.S.).