Representatives from the Department of Energy (DOE) national laboratories, academia and industry convened recently at the University of California Livermore Collaboration Center (UCLCC) for a workshop aimed at aligning strategies for ensuring safe artificial intelligence (AI).
The daylong event, attended by dozens of AI researchers, included keynote speeches by thought leaders, panels of technical researchers and policymakers, and breakout discussions that addressed the urgent need for responsible AI development. Workshop organizers with Lawrence Livermore National Laboratory's (LLNL) Data Science Institute (DSI) and Center for Advanced Signal and Image Sciences (CASIS) said the event's goals included fostering collaboration between UC and the national labs, strategizing investments in AI and providing a platform for interdisciplinary dialogue focused on AI's societal impact.
"We're gathering [AI] experts out of industry, academia and national labs, with the goal of forming a community," said DSI Director Brian Giera in introducing the event. "If we don't solve these existential problems, it can lead to human extinction-level events - or at least that's a possibility - so it's a really important topic that we're handling. For over half a century [the national labs have been] managing a technology that has incredible consequences and incredible benefits. I think AI shares a lot of these properties and impacts our mission space, where we can change socio-political dynamics. So how do we do it responsibly? How do we do it safely?"
Throughout the workshop, speakers, panelists and attendees focused on algorithm development, the potential dangers of superhuman AI systems and the importance of understanding and mitigating risks to humans, as well as the urgent scientific and political measures needed to address those risks. They also stressed the importance of engaging with policymakers to ensure the responsible deployment of AI technologies and to prevent models from being used by bad actors for nefarious purposes. Others discussed developing a "Doomsday Clock"-style metric for quantifying and communicating the risk of human extinction due to AI.
Does AI pose a threat to humanity?
In welcoming attendees, Pat Falcone, LLNL's deputy director for Science and Technology, emphasized the national labs' historical ties with the UC system, adding that "no topic is more important than AI safety and security." Kathy Yelick, vice chancellor for research at the University of California, Berkeley, added that the workshop underscored the role of advanced computing and the challenges posed by AI in safety-critical and security domains.
"The labs - especially the [National Nuclear Security Administration] labs - are well known for pushing the boundaries of computing performance, but this [workshop] also relates to them working on the improving verification and validation and improving confidence in those kinds of computing. I think it's really appropriate to have this kind of meeting here at Livermore and involving the labs to dually think about the safety and security of AI."
In his keynote talk, Yoshua Bengio, a Canadian computer scientist and one of the world's foremost experts in AI and deep learning, spoke of ensuring AI systems have the "right goals" aligned with human values and of the difficulty of controlling AI behavior effectively, illustrating scenarios in which an AI might seek to take control of its reward mechanisms, leading to adverse consequences for humanity. Named one of Time magazine's 100 most influential people of 2024, Bengio stressed the importance of addressing both the scientific and political dimensions of AI safety and advocated for multi-stakeholder governance and democratic oversight to ensure the responsible development and deployment of AI technologies.
Bengio, who also gave a seminar to Lab employees the day before the workshop, explained that an AI system's capability to achieve goals is distinct from the goals themselves, raising concerns that AI could pursue self-preservation and potentially come into conflict with human interests. He proposed building AI systems that can reason about uncertainty and argued for a model-based approach - separating knowledge of how the world works from how the AI answers questions - to achieve better generalization and mitigate risks.
"In general, I think we know not nearly enough about what we could do to feel safe, and we should explore many possible ways," Bengio said. "Really, this is so important; if you believe that the bad things can happen, and that the stakes are so high that we should [explore solutions], we should be having a wide diversity of approaches to find and build safe and powerful AI. It's going to take time, both on the science side and the political side, to mitigate those risks. And we don't know how much time we have."
Assessing the technical challenges of safeguarding AI
Following Bengio's talk, experts from academia and research institutions held a technical panel discussion offering insights into the challenges and potential solutions in AI safety and governance. Panelists discussed immediate-term concerns such as external evaluation of AI systems, access to resources for researchers and the pressing risks associated with large language models (LLMs) and cybersecurity. While acknowledging that progress is being made, panelists agreed on the need for collaborative efforts to build more robust AI systems and improve safety.
"The basic first step that we need to improve upon, is [to not] build a model and then have it be retrained to do something bad, potentially by other actors," said UC Berkeley postdoctoral researcher Michael Cohen, who studies the behavior of advanced AI agents. "I think we will need to take more and more [time] in keeping models secure, so that we can actually control things."
Other panelists discussed technical aspects of AI alignment, safeguarding AI and the importance of uncertainty estimation, preference learning and interpretability for AI systems.
"We should very much be thinking about how we can have systems that work really well in a specific domain," said David Dalrymple, a program director at the UK Advanced Research + Invention Agency. "Then we're iterating the process through which we do a better job of specifying new domains and growing the realm of things we can operate in. An AI system that is general-purpose and superhuman is necessarily unpredictable, so it must not interact with the world without human supervision."
Other technical panelists included Zico Kolter, a faculty member at Carnegie Mellon University; David Krueger, an assistant professor at the University of Cambridge; and Dylan Hadfield-Mennell, an assistant professor at the Massachusetts Institute of Technology (MIT).
In the day's second keynote, UC Berkeley computer science professor and renowned AI researcher Stuart Russell stressed the need for proactive planning and a cautious approach to AI development that focuses on safety, reliability and alignment with human values, including designing AI systems that are inherently safe from the start.
Russell said it was "a fundamental mistake to train systems to imitate humans because they acquire goals of agency on their own," adding that society will need formal guarantees to ensure that AI systems remain under human control.
"Even if we never achieve alignment, it doesn't mean we're going to be misaligned; it just means that robots are going to be uncertain about what human preferences are, particularly over eventualities that have never arisen," Russell said. "When the uncertainty goes away, then the incentive [for AI] to shut itself off goes away. The system believes absolutely that it knows what the utility functions are, and that it's going to work."
Russell suggested approaches like assistance games and inverse reinforcement learning to help ensure that AI systems understand and respect human intentions. He also advocated for verification techniques and regulatory practices to ensure that AI remains under human control and can operate in a safe and beneficial manner, similar to regulatory practices in other industries including nuclear power.
A pathway to regulating AI models
Russell's remarks dovetailed with the panel that followed on AI policy and regulation, where panelists discussed their priorities and strategies for addressing AI-related risks. The panelists, including Adam Gleave, founder and CEO of FAR AI; Dan Hendrycks, director of the Center for AI Safety; Aleksander Madry, director of the MIT Center for Deployable Machine Learning; and California State Assemblymember Rebecca Bauer-Kahan (CA-16), generally agreed on the need for proactive regulatory measures that balance innovation and security concerns.
Bauer-Kahan highlighted the importance of building capacity for government agencies to understand AI's impacts, and of proposed legislation aimed at defining AI and addressing biases in automated decision tools. She advocated for concrete policy standards and mandatory registration of AI models, adding her concern for protecting the state's workforce and "building up an ecosystem of augmentation rather than replacement."
"A lot of what is happening in our government discussions is coming from a place of fear, and we need to take it back to a real understanding of risks… and drafting the legislation on the registry," Bauer-Kahan said. "[In government] we don't move as quickly as technology does, so how do we put into policy things that are as nimble as the technology will be? We are doing [AI model] registration, which I think is really critical; it's a really important push that we are already getting pushback on, even though it is just registration. But I think that we're taking a position around transparency and understanding that it's really a first step in any safety and trust policy."
Other panelists focused on adversarial robustness and various interventions, including evaluations of AI risks and potential liability for AI companies, and the need for "compute governance" to ensure trust and security, especially in critical infrastructure. Several panel members suggested "red lines" aimed at preventing AI systems from aiding malicious actors or causing significant harm, while cautioning against overly restrictive regulations that could hinder innovation.
The national laboratories' role in ensuring safe AI
Following a break for lunch, a panel moderated by LLNL's Giera centered on the goals and challenges surrounding AI safety in the context of the DOE national laboratories. Panelists from several national labs expressed concerns about trust in AI systems due to the rise of synthetic media, underlining the importance of tools, data standardization and other methods to maintain trustworthiness. The discussion also delved into workforce challenges, including recruitment and retention, with suggestions for engaging students and fostering interdisciplinary collaboration.
Ana Kupresanin, director of Lawrence Berkeley National Laboratory's Scientific Data Division and formerly of LLNL, discussed the importance of workforce development and of raising awareness of DOE programs such as the Trillion Parameter Consortium and the Frontiers in Artificial Intelligence for Science, Security, and Technology (FASST) program - DOE-wide efforts to advance AI infrastructure and speed up scientific discovery while freeing scientists to innovate.
"My concern is not just about AI; my concern is about the workforce in general, and the trends that I see happening in the nation," Kupresanin said. "I think it's very important for science because scientists need sustainable funding. Scientists who have to live on proposals and constantly writing proposals from one employer to another, don't have freedom to think and create."
"FASST is really an outgrowth of the AI for Science, Energy and Security workshops, but the idea is, 'What would a DOE program be and what it would entail?'" added Court Corley, chief scientist for AI in the Artificial Intelligence and Data Analytics Division at Pacific Northwest National Laboratory. "I think the next step for this effort is really building consensus; bringing academics together with industry to say, 'Where does it all fit?' That's the phase we're in right now. FASST is something happening in the background, and it's only just now that we're able to talk about it publicly."
Panelist Juston Moore, a team leader in AI assurance at Los Alamos National Laboratory, emphasized the need for large-scale projects driven by national ambition in science, similar to a "Moon Shot" or "Manhattan Project" for AI. Panelists also called for better outreach and collaboration with industry, academia and federal partners in building robust AI systems and discussed the national labs' unique position in evaluating AI systems, particularly in high-consequence domains such as nuclear security.
"DOE and the national labs also have a long history and culture of managing technology of extremely high consequence, thinking through what atomic energy and new nuclear weapons mean, and trying to build something you hope will never matter, and trying to keep it safe," said Philip Kegelmeyer, a machine learning researcher at Sandia National Laboratories. "There's a cultural behavior around extreme safety that I think we can bring to bear in the AI space."
The event concluded with breakout sessions focused on AI safety and development, featuring Diyi Yang from Stanford University discussing current challenges and limitations in existing AI safety approaches, Tim Lillicrap from University College London exploring essential factors for developing safe and robust AI systems, and Peer-Timo Bremer from LLNL addressing the regulation of AI and guidance for safe AI development.
"This community-forming event is a great example of LLNL's efforts in driving transformational technology development and initiating partnerships with leading experts in the global research community," said CASIS director and AI research group leader Ruben Glatt, who is part of the Strategy Alignment on AI Safety leadership team. "Making AI safe while increasing its capabilities is one of the greatest challenges of our time and requires a collaborative research approach to mitigate risks and create a reliable framework for AI safety."
The event was organized by DSI, the UCLCC and CASIS. Other Strategy Alignment on AI Safety organizers were DSI Deputy Director Cindy Gonzales and DSI Scientific Outreach Coordinator Leno da Silva.