Over the past four years, large language models have advanced in leaps and bounds, going from mangling candy heart messages to assisting with drafting performance self-evaluations.
In May 2024, Sandia National Laboratories became the first facility within the nuclear security enterprise to give its employees access to this powerful new tool. Through SandiaAI Chat, employees can ask questions that involve sensitive unclassified information.
Rather than just buying a corporate license for ChatGPT, Sandia established its own instance of ChatGPT in its own Azure Cloud "box." This means that the questions asked by Sandia employees and the corresponding responses are not shared with Microsoft or OpenAI. Only Sandia has access to this data, according to Mike Vigil, the project manager responsible for the web development portion of the project. Thus, certain types of sensitive information can be shared with the tool. However, SandiaAI doesn't learn from the internet beyond the initial training period or from earlier questions, Vigil said.
"SandiaAI Chat has all these resources that work together to make sure that our application works, and it is safe, secure and scalable for everyone at Sandia," Vigil said. "We received a directive from the top of the Labs, which opened up every single door that we needed. We're able to ensure that our development team was working on only one project."
More than 10,000 people have used SandiaAI at least once, said Brian Sims, the technical lead for the project. That figure covers nearly every Sandia employee or contractor with a dedicated computer.
First and fast
Sandia was the first facility within the nuclear security enterprise to gain access to a chat-based generative AI system. Starting in late October 2023, a cross-functional team met twice a week to achieve this milestone, Vigil said. Erica Grong, who oversaw the risk management portion of the project, credited the team's diverse skill sets for the project's rapid progress.
Development of the system took 27 business days, followed by six months of testing before its rollout to the Labs in early May 2024. Vigil said that this effort built on earlier initiatives by Information Technology Services to gain acceptance for Microsoft Cloud-based services at Sandia.
Grong added that although the process was blazingly fast, the team followed the National Institute of Standards and Technology AI Risk Management Framework to manage the associated risks, making Sandia one of the first institutions to use the NIST framework.
The team received support from the highest level of Labs leadership, Vigil said. This support and a strong partnership with cybersecurity allowed the team to get through the necessary processes without waiting. The team worked with Information Systems Security Manager Pete Warner to compile a document detailing all the tests performed on the new tool. Warner presented this document to the Sandia Field Office authorization official, who approved it within an astonishing two weeks.
Safe and secure
To ensure the safety and security of SandiaAI, the team followed Sandia's Solutions Delivery Lifecycle, which includes load, cybersecurity and penetration testing. Sims explained that since generative AI as a service was so new, neither the Sandia nor the Microsoft developers knew what load demands to expect. The team conducted four series of load tests, gradually increasing the number of participants submitting the same questions at the same time. This ensured that every component of the system could handle up to 20,000 potential users.
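A ramped load test of the kind described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the team's actual test harness: `run_load_series`, `fake_chat_service` and the user counts are hypothetical names and values, and threads on one machine only approximate many people asking at the same moment.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load_series(ask, question, user_counts):
    """Run one load test per entry in user_counts.

    Each test has that many simulated users submit the same question
    concurrently, then records failures and the slowest response time.
    """
    results = []
    for users in user_counts:
        def one_user(_):
            start = time.perf_counter()
            try:
                ask(question)
                ok = True
            except Exception:
                ok = False
            return ok, time.perf_counter() - start

        # One worker thread per simulated user, so all requests overlap.
        with ThreadPoolExecutor(max_workers=users) as pool:
            outcomes = list(pool.map(one_user, range(users)))

        results.append({
            "users": users,
            "failures": sum(1 for ok, _ in outcomes if not ok),
            "slowest_seconds": max(elapsed for _, elapsed in outcomes),
        })
    return results


def fake_chat_service(question):
    """Hypothetical stand-in for the real service; simulates brief processing."""
    time.sleep(0.01)
    return f"echo: {question}"


if __name__ == "__main__":
    # Four series with a gradually increasing number of simulated users.
    for row in run_load_series(fake_chat_service, "What is SandiaAI?", [5, 10, 20, 40]):
        print(row)
```

Raising the counts step by step, rather than jumping straight to the target, makes it easier to see at which load level failures or slowdowns first appear.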
Because each question requires significantly more computer processing than a traditional search, Microsoft limits the number of questions Sandia can ask and the length of the responses it can receive at a time. The team did not want employees' first experience with the tool to be an error message, so they coded in small delays, shorter than an eyeblink. This strategy ensured that Sandia did not exceed Microsoft's limit by asking too many questions simultaneously, Sims said.
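One common way to add such small delays is to pace requests so that they start no closer together than a fixed interval. The Python sketch below illustrates that general idea only; the article does not say how Sandia implemented its delays, and `RequestPacer`, `max_per_second` and the rate of 20 per second are hypothetical.

```python
import threading
import time


class RequestPacer:
    """Spaces out requests so they start at least 1/max_per_second apart."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_slot = clock()  # earliest time the next request may start

    def wait_for_slot(self):
        """Block until this caller's slot arrives; return the delay in seconds."""
        with self._lock:
            now = self._clock()
            # Take the next free slot, or start immediately if one has passed.
            slot = max(now, self._next_slot)
            self._next_slot = slot + self.min_interval
        delay = slot - now
        if delay > 0:
            self._sleep(delay)
        return delay


if __name__ == "__main__":
    pacer = RequestPacer(max_per_second=20)  # assumed limit, for illustration
    for i in range(5):
        waited = pacer.wait_for_slot()
        print(f"request {i} waited {waited * 1000:.0f} ms")
```

At an assumed 20 requests per second, consecutive requests are spaced 50 milliseconds apart. Under a large burst the waits add up, so a real deployment would likely pair pacing like this with a cap on queue length.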
"The cybersecurity team was intrigued by the concept of jailbreaking OpenAI: Can you get it to tell you how to build a dirty bomb?" Sims said.
However, Vigil said that large language models have guardrails in place to prevent such nefarious requests from going through, including the synchronous content filtering and asynchronous abuse monitoring built into Azure OpenAI. Sims added that the cybersecurity team identified a few ways to circumvent these security features and shared these vulnerabilities with Microsoft. Sandia's IT staff checks any requests flagged for abuse, but only false positives have been identified thus far.
Responsible use
When the project began in October 2023, the team did not have a clear understanding of how employees would use SandiaAI most effectively.
"This was one of the first software projects I've worked on where we really didn't have a list of very specific business cases or targeted audiences," Grong said. "We had so many different types of use cases. That was pretty exciting."
SandiaAI is a versatile tool that can generate code in several common programming languages, including those used to develop the SandiaAI application itself. It can also help compose clearer emails, particularly for non-native English speakers, and refine entries in performance management forms (PMFs), making it easier for individuals to highlight their achievements. Sims noted that SandiaAI usage spiked significantly the day before PMFs were due in August.
"I found that this was the easiest PMF I've had," Sims said. "If I extrapolate the amount of time it saved me, I know it saved Sandia hours and hours of time and in business terms, that equates to huge money savings."
Grong said that employees have responded overwhelmingly positively to the new tool, sharing stories of how SandiaAI has saved them time. They have also expressed a desire for additional functionality.
Soon, Sandia data scientists will analyze the usage data to assess how the workforce is using SandiaAI. That analysis could show how much time and money the tool is saving Sandia. Sims said that the team is working on adding the ability for people to upload their own files, which will be converted into a format that SandiaAI can analyze. This will allow for trend analysis, such as identifying the most expensive plane ticket in a year's worth of expense reports. Additionally, the team is developing the capability to generate images based on prompts and provide text descriptions of uploaded images.
"I would say I'm most excited about file upload," Grong said. "'Here's all my data: write user stories, write acceptance criteria, write a test plan.' I look to AI to replace the things that I find tedious. Having it produce something that I can just edit would save me a lot of time."
Sims agreed.
"There are so many neat things that are taking place around this technology," Sims said. "The next couple of years are going to be fascinating to watch. What we think of as black magic right now is going to be crude compared to what we'll see in the next couple of years."