Seminar on large language models and the Estonian language on November 19

People at Jakobi 2
Author: Mariana Tulf

On November 19 from 14:00 to 16:00, the University of Tartu Centre for Digital Humanities will host a seminar titled "Large Language Models and the Estonian Language: the State of the Art". The event is organized by the Centre for Digital Text Scholarship (DigiTS) in cooperation with the Language Data Research Infrastructure (KeTa).

The development of large language models (LLMs) has brought about revolutionary changes in language technology. Estonian, too, is moving rapidly in this direction, as both researchers and developers increasingly engage with questions of larger volumes of language data, model quality, and the possibilities of open artificial intelligence.

The seminar will focus on how Estonian language data are currently being collected for AI development, what strategies are used to assess quality and create benchmarking frameworks, and how to improve the Estonian-language capabilities of open LLMs. The discussion will also address the challenges of working with a small language and limited data resources.

The event will be held in English. Presentations can be followed either in person at Jakobi 2, room 114, or via Zoom. The seminar will be moderated by Professor Liina Lindström, with presentations by experts from the University of Tartu Centre for Digital Humanities, the University of Tartu Institute of Computer Science, and the Institute of the Estonian Language.

Schedule
Author: DigiTS

Registration is open until November 15.

The seminar will be followed by an invitation-only roundtable. The Centre for Digital Text Scholarship is funded by the European Union. More information about DigiTS is available on the project's website: www.digits.ut.ee.