Despite their impressive capabilities, large language models are far from perfect. These artificial intelligence models sometimes "hallucinate" by generating incorrect or unsupported information in response to a query.
Due to this hallucination problem, an LLM's responses are often verified by human fact-checkers, especially if a model is deployed in a high-stakes setting like health care or finance. However, validation processes typically require people to read through long documents cited by the model, a task so onerous and error-prone it may prevent some users from deploying generative AI models in the first place.
To help human validators, MIT researchers created a user-friendly system that enables people to verify an LLM's responses much more quickly. With this tool, called SymGen, an LLM generates responses with citations that point directly to the place in a source document, such as a given cell in a database.
Users hover over highlighted portions of the model's text response to see the data the model used to generate that specific word or phrase. At the same time, the unhighlighted portions show users which phrases need additional attention to check and verify.
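The idea of tying each highlighted span to a specific data cell can be illustrated with a minimal Python sketch. This is not SymGen's actual implementation: the `{row.column}` placeholder syntax, the `render_with_citations` function, and the sample game data are all hypothetical, chosen only to show how a response might carry a pointer from each generated value back to the cell it came from.

```python
import re

# Hypothetical placeholder syntax {row.column}; an assumption for illustration,
# not the format used by SymGen.
PLACEHOLDER = re.compile(r"\{(\w+)\.(\w+)\}")


def render_with_citations(template, table):
    """Fill placeholders from `table` and record which output span came from which cell."""
    parts = []   # pieces of the final text
    spans = []   # (start, end, row, column) for each span backed by a data cell
    pos = 0      # length of the output text built so far
    last = 0     # index in the template just after the previous placeholder
    for match in PLACEHOLDER.finditer(template):
        # Copy the literal text before the placeholder; it has no citation.
        literal = template[last:match.start()]
        parts.append(literal)
        pos += len(literal)
        # Look up the referenced cell and remember where its value lands.
        row, col = match.group(1), match.group(2)
        value = str(table[row][col])
        parts.append(value)
        spans.append((pos, pos + len(value), row, col))
        pos += len(value)
        last = match.end()
    parts.append(template[last:])
    return "".join(parts), spans


# Made-up sample data standing in for a source table.
table = {"game": {"home": "Celtics", "home_pts": 112, "away": "Knicks", "away_pts": 104}}
template = "The {game.home} beat the {game.away} {game.home_pts}-{game.away_pts} in a close game."

text, spans = render_with_citations(template, table)
print(text)
for start, end, row, col in spans:
    print(f"{text[start:end]!r} <- {row}.{col}")
```

In this sketch, the recorded spans correspond to the highlighted portions a user could hover over, while everything outside them (such as "in a close game") has no backing cell and would be the text a validator needs to check by hand.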
"We give people the ability to selectively focus on parts of the text they need to be more worried about. In the end, SymGen can give people higher confidence in a model's responses because they can easily take a closer look to ensure that the information is verified," says Shannon Shen, an electrical engineering and computer science graduate student and co-lead author of a paper on SymGen.