Leveraging national surveys, big-data advances and machine learning, Cornell researchers have piloted a new approach to mapping poverty that could help policymakers identify the neediest people in poor countries and target resources more effectively.
Eliminating extreme poverty - defined as surviving on less than $2.15 per person per day, in 2017 U.S. dollars - is the United Nations' top priority among 17 sustainable development goals. To achieve it, governments and development and humanitarian agencies need to know how many people live under that threshold, and where.
Yet that information is often lacking in the countries that need help most, the researchers said. Household surveys of consumption or income - considered the gold standard for defining poverty lines - may be unavailable or outdated because they are expensive and difficult to administer frequently. Meanwhile, data from satellites and other Earth observation systems that monitor infrastructure, natural conditions and human behavior has been used successfully to generate asset-based poverty indexes, but those indexes are disconnected from the monetary measure most relevant to policymakers.
The Cornell team's new "structural poverty" estimates seek to address that gap, translating abundant Earth observations into more actionable terms for policymakers. Focused on four southern and eastern African nations, the pilot project mapped poverty about as accurately as existing asset index methods, but for more useful measures - including the share of people living below the $2.15 per person per day global poverty line. The structural poverty approach outperformed previous monetary poverty methods and is forward-looking, making it especially useful for informing programming.
"Rapid advances in data science and machine learning haven't gained widespread acceptance in the operational community in part because they haven't generated estimates in a very usable form," said Chris Barrett, the Stephen B. and Janice G. Ashley Professor of Applied Economics and Management in the Cornell SC Johnson College of Business and professor in the Cornell Jeb E. Brooks School of Public Policy. "We've helped make computational and data advances more useful in practical terms, because the models are anchored to the money metric poverty line."
Barrett is the senior author of "Microlevel Structural Poverty Estimates for Southern and Eastern Africa," published Feb. 6 in Proceedings of the National Academy of Sciences as part of a series of inaugural articles by academy members elected in 2022, including Barrett. The first author is Elizabeth Tennant, research associate in the Department of Economics and visiting lecturer in the Charles H. Dyson School of Applied Economics and Management (SC Johnson College). Co-authors are Yating Ru, MRP '17, Ph.D. '24; Peizan Sheng, M.S. '22; and David Matteson, professor of statistics and data science and social statistics in the ILR School.
The research focused on Ethiopia, Malawi, Tanzania and Uganda - agricultural nations with high poverty rates where many development agencies are working, but with only a rough idea of where the poorest people live, the researchers said.
"These are places where we think the structural poverty model is quite relevant," Tennant said. "They're also places where we had good data on consumption and assets, so we were able to look at both and model their connections."
Structural poverty refers to an expected poverty status based on the relationship between the assets people own - bicycles, cars, land, livestock, businesses - and the income they generate. A focus on that relationship, the researchers thought, might capture the strengths of existing data sources in a more accessible way.
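The idea can be illustrated with a minimal sketch, which is not the team's actual model. It assumes a single hypothetical "asset index" per household and a simple straight-line fit standing in for the relationship between assets and consumption; the household numbers are made up. A household counts as structurally poor when the consumption expected from its assets falls below the $2.15 line.

```python
# Illustrative sketch only: a one-variable linear fit stands in for the
# asset-to-consumption relationship. The asset index and all household
# values below are hypothetical, not data from the study.

POVERTY_LINE = 2.15  # USD per person per day, in 2017 dollars


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def structurally_poor(asset_index, intercept, slope, line=POVERTY_LINE):
    """True if the consumption expected from assets is below the line."""
    expected_consumption = intercept + slope * asset_index
    return expected_consumption < line


# Made-up survey households: asset index and observed daily consumption.
assets = [0.0, 1.0, 2.0, 3.0, 4.0]
consumption = [1.0, 1.5, 2.0, 2.5, 3.0]

intercept, slope = fit_line(assets, consumption)
flags = [structurally_poor(a, intercept, slope) for a in assets]

# Share of households expected to live below the poverty line.
headcount_share = sum(flags) / len(flags)
print(flags, headcount_share)
```

The share computed at the end mirrors the kind of measure the article describes: the fraction of people expected to fall below the global poverty line, given what they own.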
The team trained machine-learning models using 13 household living standard surveys conducted in the four countries between 2008 and 2020. Those were linked to Earth observation data, incorporating information on assets including housing size and quality, land and livestock, vehicles, and access to technology including cell phones. In short, the researchers said, the older survey data trained models to generate "nowcasts" of current conditions from recent satellite observations.
The structural poverty models performed well when tested against data not used in their development. A model trained on data from all four countries explained 72% to 78% of the variation in structural poverty across villages. When the model saw data only from neighboring countries, it still predicted 40% to 54% of the variation. The proof of concept will need further refinement, the team said, but can be implemented using publicly available data, accessible machine-learning models and personal computers.
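The "neighboring countries" test can be sketched as a leave-one-country-out check, again as an illustration rather than the team's pipeline. The sketch assumes one hypothetical Earth observation feature per village, uses a straight-line fit in place of the machine-learning models, and runs on invented numbers with placeholder country names. The score is the standard R-squared: the share of variation in the held-out villages that the predictions explain.

```python
# Illustrative sketch only: a linear fit stands in for the machine-learning
# models, and every village record below is invented.


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(actual, predicted):
    """Share of variation in `actual` explained by `predicted`."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def leave_one_country_out(villages):
    """villages: list of (country, feature, poverty_rate) tuples.

    For each country, fit on all other countries and score on that one.
    """
    scores = {}
    for held_out in sorted({country for country, _, _ in villages}):
        train = [(f, r) for c, f, r in villages if c != held_out]
        test = [(f, r) for c, f, r in villages if c == held_out]
        intercept, slope = fit_line([f for f, _ in train],
                                    [r for _, r in train])
        predictions = [intercept + slope * f for f, _ in test]
        scores[held_out] = r_squared([r for _, r in test], predictions)
    return scores


# Hypothetical villages: (country, satellite-derived feature, poverty rate).
villages = [
    ("country_a", 0.2, 0.80), ("country_a", 0.5, 0.62), ("country_a", 0.8, 0.41),
    ("country_b", 0.3, 0.70), ("country_b", 0.6, 0.55), ("country_b", 0.9, 0.30),
    ("country_c", 0.1, 0.85), ("country_c", 0.4, 0.66), ("country_c", 0.7, 0.47),
]

scores = leave_one_country_out(villages)
for country, score in scores.items():
    print(country, round(score, 3))
```

Holding out a whole country is a tougher test than holding out random villages, because the model never sees any data from the place it is asked to predict. That is consistent with the lower share of variation the researchers report for that setting.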
"We're showing that you can get all the computational precision of advances the data science community has made, while having the policy and programming usefulness of these monetary measures - and in a forward-, not backward-looking way," Barrett said. "You want to know who's expected to be poor right now - not when a big survey was conducted years ago - and that's what our structural poverty models help predict."
The research was funded by the Cornell Atkinson Center for Sustainability and received computing support from the Cornell Center for Social Sciences.