DIGIHUM talk

  • 18 November: Maciej Eder (Institute of Polish Language)

"Literary and linguistic computing: from authorship attribution to assessing language change"



The presentation will focus on computer-assisted text analysis, understood as measuring textual similarities with statistical techniques. The talk will start with a concise introduction to authorship attribution, followed by a discussion of how attribution techniques can be extended to assess stylistic variation in (large) collections of texts. Since one of the core concepts in attribution is to identify the works that are stylistically (whatever that really means) most similar to the anonymous sample in question, the same idea can easily be adapted to assess any text collection. Instead of looking for authorship, however, such an analysis aims at tracing other stylometric signals: genre, gender, chronology, intertextuality, and so forth.


Prof. Maciej Eder is the director of the Institute of Polish Language (Polish Academy of Sciences), chair of the Committee of Linguistics at the Polish Academy of Sciences, principal investigator of the project Computational Literary Studies Infrastructure, co-founder of the Computational Stylistics Group, and the main developer of the R package ‘Stylo’ for performing stylometric analyses. He is interested in European literature of the Renaissance and the Baroque, classical heritage in early modern literature, and quantitative approaches to style variation. These include measuring style using statistical methods, authorship attribution based on quantitative measures, as well as “distant reading” methods to analyze dozens (or hundreds) of literary works at a time.

"Methods of extracting keywords and topics from text collections" (Maciej Eder)

The workshop will offer an introduction to methods for extracting information from collections of written texts.
