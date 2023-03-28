As the world's second most widely spoken mother tongue, the most widely learned and still expanding, and the second language of international communication, Spanish is seeking a place in Artificial Intelligence (AI) and in the contest with its traditional rival, English. That goal, however, will be neither easy nor immediate.

As King Felipe VI stated on Monday at the inauguration of the IX Congress of the Spanish Language, at the Teatro de Falla in Cádiz, “this is the time for Spanish, with all its voices, turns of phrase and nuances, with all its accents, with all its richness… The 21st century must be the century of Spanish, let’s make it possible”. But it is precisely those outstanding qualities of the language, the most diverse and mixed on the planet, that have kept Artificial Intelligence from speaking Spanish well and have even driven it crazy.

AI will therefore need a great deal of feeding before every Spanish speaker can address it naturally, given the language's 21 national varieties, its accents and sub-accents, and slangs such as that of Cádiz or lunfardo.

The inauguration of the IX International Congress of the Spanish Language (Cile) gave an idea of how far machines still have to go.

It came when the mayor of the host city, José María González Santos, better known as Kichi, wished the participants a good time in pure Cádiz slang.

“Ladies and gentlemen, be on the ‘liquindoi’, take advantage of ‘la collá’ and ‘conviá’, enjoy the ‘tangai’, so that when it’s your turn to ‘guannajarse’ you can proudly say that this congress has been a ‘bastinazo’,” Kichi said.

This is how one of the most popular programs in the world transcribed it: “ladies and gentlemen, beautiful China and take advantage of the ***** and the cumbia and enjoy the tangai so that when it’s your turn to save yourself you can proudly say that this congress has been a great match”.

This, explained the Uruguayan linguist Virginia Bertolotti, “is a sign that if we feed artificial intelligence with something that is relatively flat, such as the Internet, there are things that it does very well, but there are things that it cannot do.”

“If I tell it ‘write me a dialogue between 19th-century gauchos’, it comes up with any old thing,” said Bertolotti, a professor at the University of the Republic and a member of the Uruguayan Academy of Letters.

Incorrect correctors

Artificial intelligence is the “scientific discipline that deals with creating computer programs that execute operations comparable to those carried out by the human mind”, according to the definition of the Royal Spanish Academy (RAE), which chose it as word of the year in 2022.

It is a field with a lot at stake, indicated the Spanish Minister of Foreign Affairs, José Manuel Albares, when he warned that Spanish must be “positioned in the central nucleus of Artificial Intelligence (AI), in the metaverse”.

Many of the inventions of this new discipline are already in common use, such as automatic correctors, translators, editors and “chatbots”.

The problem with these tools, warned the director of the Royal Spanish Academy, Santiago Muñoz Machado, is that they “do not use the pan-Hispanic language canon, but the Silicon Valley canon.”

“We have warned at the Academy that what that corrector corrects for us is normally not correct, and this is not a play on words, it is exactly like that,” he added.

The remedy, Bertolotti explained, would be to “sophisticate” the large amount of data that artificial intelligence scrapes from the Internet by adding “many materials that are small specialized corpora, which we have been working on in linguistic research.”

“Speech on the street, everyday speech, in the case of Spanish, which is a language of great cultural depth, with a wide geographical distribution, is not necessarily captured by the type of data that is usually used to train artificial intelligence,” argued the linguist.

Dominant variants?

Asunción Gómez-Pérez, a member of the RAE and advisor on artificial intelligence to the Spanish government, believes that some variants of Spanish could end up being more dominant than others in artificial intelligence.

“The language models that we are using now are fed by large amounts of texts that have been written by people who belong to certain countries and use certain vocabulary,” she explained.

“The more texts there are of a variant, the greater the chances that variant has of being accepted,” Gómez-Pérez concluded.

If that turns out to be the case, those domestic outposts of artificial intelligence, Siri and Alexa, will struggle to understand that when they are asked to add green beans to the supermarket list under any of half a dozen regional names, they are being asked for exactly the same thing.

Now they are written like this

Meanwhile, the Royal Spanish Academy presented a new edition of the Diccionario Panhispánico de Dudas, which addresses various questions about the language, such as the recommended spellings for foreign words when they have to be used.

‘Jol’ for ‘hall’, ‘jáquer’ for ‘hacker’ and ‘wiski’ for ‘whisky’ are some of the examples provided by the person in charge of this work, Salvador Gutiérrez Ordóñez, who stressed, however, that “the first obligation of the Academy” is to make “proposals of equivalence”.

“If they succeed, fine, and if not, nothing happens,” he pointed out, recalling that this dictionary is not normative like the dictionary of the language, but rather answers speakers' doubts. For example, for ‘backstage’ it recommends the word ‘transcenio’; for ‘bullying’, ‘acoso escolar’ (school bullying), with a Spanish equivalent for ‘bully’ as well; and for ‘hall’, ‘vestíbulo’ (lobby), ‘entrada’ (entrance) or ‘recibidor’ (reception hall).

However, this dictionary also provides other ‘alternative’ spellings in Spanish that also serve as a recommendation for these foreign words. This is the case of ‘jol’ for ‘hall’. “The adaptation of the Anglicism is unnecessary, but, if done, the spelling ‘jol’ would be possible and correct,” explains the new pan-Hispanic dictionary.

Another striking case cited by Gutiérrez Ordóñez himself was that of ‘whisky’, which can also be written ‘wiski’. Until now, the Spanish spelling recommended by the RAE was ‘güisqui’, but that has changed owing to the greater acceptance of letters such as ‘w’ and ‘k’ in Spanish.

“Those letters were seen as foreign and you could say that they were condemned for a time, but now that has changed,” said the linguist. “This dictionary anticipates some questions and tries to answer questions that are not in the dictionary. It is true that almost no one writes like that, but if people ask, you have to say this and make it clear that they are not misspellings,” he added.

During the presentation, other examples were mentioned, such as ‘tour’ (“we already use ‘turista’ or ‘turoperador’, but an adaptation is not included”, he indicated) or, on the contrary, assimilations that are already consolidated, such as the case of ‘football’. “It was a difficult adaptation, because football was first discussed and derivatives are a problem,” he acknowledged.

‘Sambernardo’ was another of the recommended spellings that came up, along with ‘brauni’ for ‘brownie’ and ‘baipás’ for ‘bypass’. Before concluding, he remarked that there are still foreign words such as ‘hardware’ or ‘software’ that are “very difficult to banish” from Spanish.

‘Solo’, only without a tilde

In this dictionary, the RAE also included changes regarding the writing of ‘solo’, which may carry an accent mark when, in the opinion of the writer, there is a risk of ambiguity, although “ideally, no one writes it with an accent mark”, according to the person in charge of the pan-Hispanic dictionary, Salvador Gutiérrez Ordóñez.

“Maybe ten years will go by and nobody will write it with an accent anymore; that is a short time, but ideally nobody writes it like that,” Gutiérrez Ordóñez, who has also been director of the ‘Spanish Up to Date’ department since 2008, pointed out before the presentation.

He explained that the rule was changed because the previous one “was less clear”, although in general the wording of 2010 is maintained. “That regulation was more obscure and now the new wording is more exhaustive and clear,” he pointed out.

The new wording in the dictionary was in fact presented: ‘It has been determined that the use of the tilde is not mandatory, but optional in cases of ambiguity. It is mandatory to write it without accent marks in a context where it does not entail risks of ambiguity and optional to accent it where it does, in the opinion of the writer’.