RICHLAND, Wash. - The technology behind web search engines is useful for more than tracking down your long-lost buddy or discovering a delicious new recipe. Technology based on search engine algorithms might also help keep the lights on, the water running and the trains moving during an emergency.
Scientists at the Department of Energy's Pacific Northwest National Laboratory have shown that the algorithms that underlie web searches can help protect facilities like the electric grid, water treatment plants, food processors and hospitals.
"This is a resource for people who are trying to protect an important network from a threat such as a cyberattack, and they need to prioritize which structures are most important to safeguard," said mathematician Bill Kay, who led the work.
The new research, published recently in the journal Homeland Security Affairs, is built around Google's PageRank algorithm, designed to recommend the most relevant websites for people searching for information on the internet. To rank search results, the formula considers factors like how many influential websites point to a given page and then how many influential sites the page itself points to.
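The core of that idea fits in a few lines of code. Below is a minimal PageRank sketch in Python, with an invented four-page link graph: each page repeatedly passes its score along its outgoing links, so a page linked to by influential pages ends up with a high score. This is an illustration of the principle, not Google's production algorithm.

```python
# Minimal PageRank sketch. The four-page link graph is invented for
# illustration; real web graphs have billions of nodes.
links = {
    "news":    ["weather", "sports"],
    "weather": ["news"],
    "sports":  ["news", "weather"],
    "blog":    ["news"],
}

damping = 0.85                      # standard PageRank damping factor
pages = list(links)
rank = {p: 1.0 / len(pages) for p in pages}

for _ in range(50):                 # power iteration until scores settle
    new_rank = {}
    for page in pages:
        # A page's score is built from the scores of the pages that link
        # to it, each divided among that linker's outgoing links.
        incoming = sum(rank[src] / len(links[src])
                       for src in pages if page in links[src])
        new_rank[page] = (1 - damping) / len(pages) + damping * incoming
    rank = new_rank

print(sorted(rank.items(), key=lambda kv: -kv[1]))  # most influential first
```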
Kay's team applied the same principles to structures such as the electric grid that keeps power flowing, the treatment plants that keep our water clean and the hospitals that treat the sick and injured. Researchers refer to this network of facilities as "critical infrastructure": structures that, if damaged or destroyed, could threaten public safety or national security.

Stopping a cascade of failure
The task for researchers like Kay: Among the tangle of tens of thousands of important facilities in a nation like the United States, help planners prioritize which structures are most important to protect. What might be the most likely targets of an adversary? Which facilities are most likely to pass a failure along to other structures, and how can defenders stop the cascade as quickly as possible?
"Not all infrastructure assets are the same," said Kay. "If the failure is spreading, I want to know what will happen if a particular piece of equipment is taken out-how broad will the impact be?
"The key here is to identify systems where influence goes both ways," said Kay, "identifying systems that are influenced by a lot of other systems but also influence many others. It's like knowing who are the so-called popular kids in high school and especially those who are popular among the other popular kids."
A lot can happen quickly. For example, if an errant squirrel takes out a power substation, the pumps at a water treatment plant might stop. That could threaten the water supply to a nearby hospital or to a nuclear plant that needs water for cooling. Officials have robust backup systems, of course, and part of their planning is knowing how to stop such a cascade of failure as quickly as possible.
Kay's team began by adapting the existing PageRank algorithm: Instead of looking at interactions among web pages, the scientists analyzed interactions among structures. Which facilities would be most likely to be targeted or to fail? And which facilities would have a serious impact on other facilities if they did fail? Structures that met both criteria were deemed critical by the team.
"As failure propagates through a network, there are two things I'd want to know: which things are likely to fail, and which things, if they fail, are likely to propagate the failure forward," said Kay, who specializes in graph networks.
Layers of knowledge
The team didn't simply apply the PageRank formula but made modifications to weigh many streams of information simultaneously-akin to performing dozens of related web searches at the same time and having all the searches communicate among themselves. The team refers to this as a multilayer approach.
"To think of a multilayer algorithm, think of a multilayer sandwich-a club sandwich," said coauthor Patrick Mackey. "One layer might be the electric system. Another is transportation. Others might be oil pipelines or hospitals. Many people look at these aspects of infrastructure one at a time, in isolation; we're looking at them all together and how they affect each other, which helps identify which are most critical."
In several simulations, the team showed that its multilayer approach consistently stopped failures faster, and with fewer structures damaged, than alternatives including a straight PageRank algorithm and a simpler approach known as "outdegree." The team did not quantify exactly how much better the method would limit a real attack than those alternatives, treating the study instead as evidence that the approach is worth exploring.
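The "outdegree" baseline simply counts how many facilities each node directly feeds. The invented example below shows where the two rankings can disagree, and why that matters for containment: a facility that feeds only one other node can still be the most critical one if that node is a hub.

```python
import networkx as nx

# Edge A -> B means "A supplies B". Invented example.
G = nx.DiGraph([("hub", "a"), ("hub", "b"), ("hub", "c"),
                ("feeder", "hub")])     # "feeder" supplies only the hub

by_outdegree = sorted(G.nodes, key=G.out_degree, reverse=True)

pr = nx.pagerank(G.reverse())     # score loss that reaches far downstream
by_pagerank = sorted(pr, key=pr.get, reverse=True)

print("outdegree:", by_outdegree)  # hub first: it directly feeds three nodes
print("pagerank: ", by_pagerank)   # feeder first: losing it loses the hub too
```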
"A good algorithm for this type of work doesn't always need to incorporate detailed dynamics of the various entities of interest. Oftentimes it's sufficient, as a start, to adequately understand the relationships between those entities," said Kay. "It provides a starting point and can become very useful once a human expert adds in knowledge about the domain in question."
The work is part of a project portfolio led by PNNL researcher Sam Chatterjee, principal investigator and chief data scientist. It was funded by the Cybersecurity and Infrastructure Security Agency to enable consistent, repeatable and defensible analysis across a broad spectrum of potential failures.
"This work represents an excellent example of how network science methods can be adapted to address critical infrastructure risk and resilience challenges," said Chatterjee.
In addition to Chatterjee, Kay and Mackey, former intern Jacob Miller also contributed to the project.