Small-Model Approach Could Be More Effective

Small language models are more reliable and secure than their large counterparts, primarily because they draw information from a circumscribed dataset. Expect to see more chatbots running on these slimmed-down alternatives in the coming months.

After OpenAI's widespread rollout of ChatGPT, a chatbot built on a large language model (LLM), in late 2022, many other big tech companies followed suit - at a pace that made clear they were not far behind and had in fact been working for years to develop their own generative artificial intelligence (GenAI) programs using natural language.

What's striking about the various GenAI programs available today is how similar they truly are. They all work in basically the same way: a model containing billions of parameters is trained, using deep learning, on huge datasets made up of content available on the internet.

Once trained, the models in turn generate content - in the form of texts, images, sounds and videos - by using statistics to predict which string of words, pixels or sounds is the most probable response to a prompt. "But this method comes with risks," says Nicolas Flammarion, who runs EPFL's Theory of Machine Learning Laboratory. "A hefty chunk of the content available online is toxic, dangerous or simply incorrect. That's why developers have to supervise and refine their models and add several filters."
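
To make that prediction step concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: a real model computes its scores (logits) with billions of trained parameters, not the fake_logits stand-in used here.

```python
import numpy as np

# Illustrative sketch of next-token prediction. The tiny vocabulary and the
# random "logits" are invented stand-ins; only the statistical sampling step
# mirrors what a real model does.
vocab = ["the", "cat", "sat", "on", "a", "mat"]

def fake_logits(context):
    # Stand-in for a trained network: one score per vocabulary word,
    # deterministic for a given context within a run.
    seed = abs(hash(tuple(context))) % (2**32)
    return np.random.default_rng(seed).normal(size=len(vocab))

def next_token(context, temperature=1.0):
    logits = fake_logits(context) / temperature
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                     # softmax: scores -> probabilities
    return np.random.choice(vocab, p=probs)  # sample one likely next word

context = ["the", "cat"]
for _ in range(4):
    context.append(next_token(context))
print(" ".join(context))
```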

How to avoid getting drowned in information

The way things currently stand, LLMs have created a suboptimal situation where machines housed in vast data centers crunch through billions of bytes of data - consuming large amounts of energy in the process - to find the tiny fraction that's relevant to a given prompt. It's as if, to answer a question, you had to flip through every book in the Library of Congress page by page until you came across the right passage.

Researchers are now exploring ways of leveraging the power of LLMs while making them more efficient, secure and economical to operate. "One method is to limit the sources of data that are fed into the model," says Martin Rajman, an EPFL lecturer and researcher on AI. "The result will be language models that are highly effective for a given application and that don't attempt to have the answers to everything."

This is where small language models (SLMs) come in. Such models can be small in various ways, but in this context, size usually refers to the dataset they draw on. The technique of grounding a model's answers in such a restricted dataset is known as retrieval-augmented generation (RAG). EPFL's Meditron provides an example of how this can be applied in practice: its models rely exclusively on reliable, verified medical datasets.

The advantage of this approach is that it prevents the spread of incorrect information. The trick is to pair these limited datasets with chatbots trained on large models: the chatbot retrieves the relevant information and links different pieces together to produce useful responses.
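
As a rough illustration of the idea, here is a toy RAG loop in Python. The three "verified" documents, the word-overlap retriever and the generate() stub are all invented placeholders; production systems use vector embeddings for retrieval and a real language model for generation.

```python
# Minimal retrieval-augmented generation (RAG) sketch under invented
# assumptions; not part of any real EPFL system.
corpus = [
    "Students must register for exams before the posted deadline.",
    "Laboratory access requires completing the safety training course.",
    "Sabbatical requests must be filed one semester in advance.",
]

def retrieve(question, documents, k=1):
    # Toy retriever: rank documents by word overlap with the question.
    q_words = set(question.lower().split())
    return sorted(documents,
                  key=lambda d: len(q_words & set(d.lower().split())),
                  reverse=True)[:k]

def generate(prompt):
    # Placeholder for a chatbot built on a large model; here it simply
    # echoes the retrieved context back as the "answer".
    return prompt.split("Context:\n", 1)[1].split("\n\nQ:", 1)[0]

def answer(question):
    context = "\n".join(retrieve(question, corpus))
    prompt = f"Answer using ONLY the context.\nContext:\n{context}\n\nQ: {question}"
    return generate(prompt)

print(answer("When do I register for exams?"))
```

Because generation is constrained to what the retriever returns, the chatbot can only repeat or recombine vetted material - which is precisely why this setup limits the spread of incorrect information.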

Several EPFL research groups are exploring the potential of SLMs. One project is Meditron, while another is a pilot test under way based on Polylex, EPFL's online repository of rules and policies. Two other projects are looking at improving how class recordings are transcribed so that they can be indexed more reliably, and streamlining some of the School's administrative processes.

Cheaper to use

Because SLMs rely on smaller datasets, they don't need huge amounts of processing power to run - some of them can even operate on a smartphone. "Another important advantage of SLMs is they function in a closed system, meaning the information users enter into a prompt is protected," says Rajman. "That's unlike ChatGPT, where if you ask it to transcribe a meeting and write up minutes, for example - something the model can do quite well - you don't know how the information will be used. It gets stored on unknown servers, although some of the information could be confidential or include personal data."
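
To make that "closed system" point concrete: a small open model can run entirely on local hardware, so prompts never leave the device. The sketch below assumes the Hugging Face transformers library; GPT-2 is used only because it is small and freely downloadable, not because it is one of the models discussed here. After a one-time model download, generation runs offline.

```python
# Sketch of fully local inference: the prompt is processed on this machine
# and never sent to a remote server. GPT-2 is an illustrative choice only.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # small enough for a laptop CPU
draft = generator("Meeting notes: the committee decided to",
                  max_new_tokens=40)
print(draft[0]["generated_text"])
```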

SLMs have all the chatbot-running capabilities of large models and come with considerably fewer risks. That's why businesses are increasingly interested in the technology, whether for their internal needs or for use with their customers. Chatbots designed for specific applications can be both very useful and extremely effective, and this has prompted tech companies worldwide to rush their own versions to market.

2023 may have been the year when LLMs - with all their strengths and weaknesses - made the headlines, but 2025 could very well be the year when their smaller, tailored and fully trustworthy counterparts steal the show. ■

Meditron, EPFL's industry-leading example

The first thing most of us do when we have a skin rash, unexplained calf pain or are prescribed a new medicine, for example, is to go online. Some people run a standard internet search, while others prefer to converse with a generative artificial intelligence (GenAI) program, looking for reassuring explanations or fueling their hypochondriac tendencies. But the diagnoses put forward by generalist large language models - like those used by ChatGPT and Claude - are drawn from obscure sources containing all kinds of data, raising questions about their reliability.

The solution is to develop smaller models that are better targeted, more efficient and fed with verified data. That's precisely what researchers at EPFL and Yale School of Medicine are doing for the healthcare industry: they've developed a program called Meditron that is currently the world's best-performing open-source language model for medicine. Introduced just over a year ago, it answered US medical exam questions more accurately than humans on average and came up with reasonable responses to several questions. While Meditron is not intended to replace doctors, it can help them make decisions and establish diagnoses. A human will always have the final say.

The program is built on Meta's Llama open-access large language model. What sets Meditron apart is that it has been trained on carefully selected medical data. These include peer-reviewed literature from open-access databases such as PubMed and a unique collection of clinical practice guidelines, including those issued by the ICRC and other international organizations, spanning a number of countries, regions and hospitals.
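
In broad strokes, that recipe (start from an open base model, then continue training it on a curated corpus) can be sketched with standard tooling. Everything below is a hypothetical illustration: the corpus file, the settings and the choice of base model are placeholders, not Meditron's published training pipeline.

```python
# Rough sketch of domain adaptation: continue training an open base model
# on a curated text corpus. All names and settings are illustrative.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

base = "meta-llama/Llama-2-7b-hf"   # gated model; any causal LM would do
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token       # Llama tokenizers ship without a pad token

# Hypothetical curated corpus: one vetted guideline passage per line.
data = load_dataset("text", data_files={"train": "guidelines.txt"})
train = data["train"].map(
    lambda batch: tok(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=AutoModelForCausalLM.from_pretrained(base),
    args=TrainingArguments(output_dir="medical-lm-sketch", num_train_epochs=1),
    train_dataset=train,
    # mlm=False gives standard next-token (causal) language modeling labels.
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
```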

"This open-access basis is perhaps the most important aspect of Meditron," says Prof. Annie Hartley from the Laboratory for Intelligent Global Health and Humanitarian Response Technologies (LiGHT), hosted jointly by EPFL and Yale. It can be downloaded to a smartphone and operate in remote areas where there's little or no internet access. Unlike the black boxes developed by large companies, Meditron is transparent, and it gets better each time it's used. "The program is in constant development," says Hartley. "One of its strengths is that it includes data from regions that are often underrepresented."

To make sure the program can be used as widely as possible and accurately reflects real-world conditions, its developers launched an initiative whereby medical professionals from around the world were asked to test the model in actual clinical settings and ask it challenging questions. "The fact that these professionals volunteered their time in our open-source community to independently validate Meditron is a recognition of its value," says Hartley. Martin Jaggi, head of EPFL's Machine Learning and Optimization Laboratory, adds: "None of that would've been possible with the closed models developed by big tech companies."

Another step towards personalized medicine

Other EPFL researchers are looking at improving the quality of data fed to language models. Emmanuel Abbé, who holds the Chair of Mathematical Data Science at EPFL, is carrying out one such project with the Lausanne University Hospital (CHUV) in order to help prevent heart attacks. The goal is to develop an AI system that can analyze images from an angiogram - a visualization of the heart and blood vessels - and compare them with those in a database to estimate a patient's risk of cardiac arrest. Abbé and his research group plan to conduct a large cohort study in Switzerland involving at least 1,000 participants over the next three years to collect data to train their model.

Such applications could also bring us one step closer to personalized medicine. "I see huge potential in combining the results of these models with patients' medical histories and the data collected by smartwatches and other health-related apps," says Olivier Crochat, executive director of EPFL's Center for Digital Trust. "But we have to make sure robust systems are in place to protect these highly sensitive data and ensure they're used ethically and fairly." ■ AMB

About this article

This article was published in the March 2025 issue of Dimensions, an EPFL magazine that showcases cutting-edge research through a series of in-depth articles, interviews, portraits and news highlights. Published four times a year in both English and French, it can be sent to anyone who wants to subscribe as well as contributing members of the EPFL Alumni Club. It is also distributed free of charge on EPFL's campuses.
