AI to Preserve Europe's Lesser-Spoken Languages?

Durham University


European Day of Languages is an annual celebration of the diverse range of languages spoken across the continent. But as Dr Fintan Mallory, from our Department of Philosophy, explains, we shouldn't expect AI technology to save lesser-spoken languages.

Can you explain your research and its connection with AI?

I do interdisciplinary work at the intersection of linguistics, machine learning and the philosophy of mind and language.

One question I work on is, what is it to know a language?

Some people think that Large Language Models (LLMs) like GPT-4o can genuinely understand language. I'm sceptical of this but have argued that it's not necessary for these models to fully know languages in order for us to use them.

If you book cinema tickets over the phone, you don't have to believe that the automated ticketing system is talking to you in order to use it. You can simply engage in a game of make-believe in order to get what you want.

A second question I work on is, how do neural networks represent things about the world? The question of how one thing represents another is a very old philosophical topic and philosophers have made quite a bit of progress with it (at least, we've got very good at detecting dead-ends).

Why does it matter how we define 'language'?

There are around 7,000 languages on the planet. It's likely that by the end of this century about half of those languages will have been killed. This killing is a part of the ongoing process of colonialism that has sought to extinguish cultures and peoples in their uniqueness. Linguicide, the killing of a language, plays a major role in the attacks on indigenous communities by political powers seeking to render those communities economically useful to them.

It is possible to think of a language as a database of information, a hoard of facts about grammar and word meanings that can be extracted and put into storage. This is sometimes a good way to view a language when you're doing linguistics — to remove the people from the equation. It's what LLMs do when they train on 'linguistic data' with no regard for the human beings whose lives are expressed in that data.

But if a language were just a body of information, we could in theory save languages simply by storing them in LLMs. An alternative view of language puts human beings at the centre and views a language as something more like the soul of a community. You can't store this in a machine. You can't solve a human problem like linguicide with a view of language that removes the human component.

What more can be done to keep minority languages alive?

Rather than approaching language preservation as a technical problem, I think indigenous communities need to be politically empowered, whether through government funding or legal protections for the use of their languages.

Durham University Public Release. This material from the originating organization/author(s) may be of a point-in-time nature and may have been edited for clarity, style and length. Mirage.News does not take institutional positions or sides, and all views, positions, and conclusions expressed herein are solely those of the author(s).