Talks by Michela Dota & Davide Mastrantonio, Christof Schöch
Speakers: Michela Dota (Dipartimento di Lingue, letterature, culture e mediazioni, Università degli Studi di Milano) & Davide Mastrantonio (Dipartimento di Studi Umanistici, Università Ca’ Foscari Venezia, Italy)
Title: The Academic Italian Dictionary (DIA) and the Academic Prose
Abstract: The talk presents the DIA (Academic Italian Dictionary) project and its dual purpose: to begin a systematic formal and functional investigation of the academic variety of Italian, based on corpus analysis, and to create an open-access digital dictionary in which academic expressions can be searched by form and by function. It will first illustrate the concept of academic Italian and the dictionary’s teaching potential, then briefly describe the corpus on which the dictionary is based and present two sample entries, one lexical and one functional.
Suggested readings:
- Mastrantonio D. (2021), L’italiano scritto accademico: problemi descrittivi e proposte didattiche, “Italiano LinguaDue”, 13 (1), pp. 348-368: https://riviste.unimi.it/index.php/promoitals/article/view/15871
- Mastrantonio D., Sakr A., Dota M., Nardella S. (2024), Il progetto PRIN 2022 PNRR “Dizionario dell’italiano accademico: forme e funzioni testuali” (DIA): prime acquisizioni e prospettive, “Italiano LinguaDue”, 16 (2), pp. 564-605: https://riviste.unimi.it/index.php/promoitals/article/view/27866
Speaker: Christof Schöch (Trier Center for Digital Humanities, Germany)
Title: MiMoText – Literary History between Computational Literary Studies and Linked Open Data
Abstract: This talk will explore an innovative approach to literary history that integrates machine learning (ML) and linked open data (LOD), with the goal of combining the depth of qualitative analysis with the scale of quantitative methods. This approach was developed in the Mining and Modeling Text project conducted at the Trier Center for Digital Humanities. The project relies on three primary data sources: bibliographic metadata from the “Bibliographie du genre romanesque français,” a corpus of 200 French novels from 1750–1800 encoded in XML-TEI, and scholarly literature about the French Enlightenment novel. By applying ML techniques such as topic modeling and named entity recognition, and by modeling the extracted information as LOD triples in a public Wikibase instance, the project constructs a semantic knowledge graph. This graph supports complex queries over the data, enabling researchers to uncover patterns and trends in 18th-century French literature. The approach emphasizes openness, federation, multilingualism, and collaboration in Computational Literary Studies.
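As a rough illustration of what "modeling extracted information as triples" and then querying it can look like, here is a minimal in-memory sketch in Python. It is not the project's data model and does not use Wikibase or SPARQL: the identifiers (`novel:A`, `theme:travel`), the predicate names (`about`, `publication_year`), and the helper functions `match` and `novels_about` are all hypothetical placeholders invented for this example.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Placeholder statements, invented for illustration only.
# They do not describe real novels or real project data.
TRIPLES: List[Triple] = [
    ("novel:A", "about", "theme:travel"),
    ("novel:A", "publication_year", "1761"),
    ("novel:B", "about", "theme:travel"),
    ("novel:B", "publication_year", "1782"),
    ("novel:C", "about", "theme:education"),
    ("novel:C", "publication_year", "1790"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples matching a pattern; None acts as a wildcard."""
    for t in triples:
        if (s is None or t[0] == s) and (p is None or t[1] == p) and (
            o is None or t[2] == o
        ):
            yield t


def novels_about(triples: List[Triple], theme: str, after_year: int) -> List[str]:
    """Combine two patterns: novels with a given theme published after a year."""
    candidates = {subj for subj, _, _ in match(triples, p="about", o=theme)}
    return sorted(
        subj
        for subj in candidates
        if any(
            int(year) > after_year
            for _, _, year in match(triples, s=subj, p="publication_year")
        )
    )


if __name__ == "__main__":
    print(novels_about(TRIPLES, "theme:travel", 1770))
```

The point of the sketch is only that once statements share one uniform subject-predicate-object shape, a question such as "which novels on a given theme appeared after a given year" becomes a combination of simple pattern matches. A knowledge graph in a Wikibase instance would typically be queried with SPARQL rather than with hand-written Python.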