«`html

Google DeepMind Introduces Aeneas: AI-Powered Contextualization and Restoration of Ancient Latin Inscriptions

The discipline of epigraphy, which studies texts inscribed on durable materials like stone and metal, is essential for understanding the Roman world. However, the field faces numerous challenges, including fragmentary inscriptions, uncertain dating, diverse geographical provenance, and a growing corpus of over 176,000 Latin inscriptions, with approximately 1,500 new inscriptions added annually.

Challenges in Latin Epigraphy

Latin inscriptions span roughly fifteen centuries, from the 7th century BCE to the 8th century CE, and come from across a Roman Empire that comprised more than sixty provinces. They range from imperial decrees and legal documents to tombstones and votive altars. Epigraphers traditionally restore partially lost or illegible texts using detailed knowledge of language, formulae, and cultural context. However, many inscriptions are physically damaged, which complicates dating and provenance attribution.

Aeneas: Addressing Epigraphic Challenges

To tackle these challenges, Google DeepMind developed Aeneas, a transformer-based generative neural network that performs restoration of damaged text segments, chronological dating, geographic attribution, and contextualization through retrieval of relevant epigraphic parallels.

Latin Epigraphic Dataset (LED)

Aeneas is trained on the Latin Epigraphic Dataset (LED), which contains 176,861 Latin inscriptions aggregated from three major databases. The dataset comprises approximately 16 million characters and covers inscriptions from the seventh century BCE to the eighth century CE; about 5% of the inscriptions have associated grayscale images.

Model Architecture and Input Modalities

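Aeneas’s core is a deep, narrow transformer decoder based on the T5 architecture, adapted for effective local and contextual character processing. It takes an inscription's transcribed text as input and, where one is available, an image of the inscription. The model includes multiple specialized task heads to perform: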

Restoration: Predict missing characters, supporting arbitrary-length unknown gaps.
Geographical Attribution: Classify inscriptions among 62 provinces.
Chronological Attribution: Estimate text dates by decade.
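
To make the "by decade" framing concrete, here is a minimal sketch of how dates could be mapped to decade bins and how a predicted distribution over those bins could be collapsed into a single year. The bounds (700 BCE to 800 CE), the function names, and the expected-value readout are assumptions made for illustration; they are not taken from the Aeneas code. The sketch also ignores the fact that the historical calendar has no year zero.

```python
# Hypothetical decade binning for chronological attribution.
# Years are signed integers: negative = BCE, positive = CE (assumption).
START_YEAR = -700   # assumed lower bound (7th century BCE)
END_YEAR = 800      # assumed upper bound (8th century CE)
BIN_WIDTH = 10      # one bin per decade
NUM_BINS = (END_YEAR - START_YEAR) // BIN_WIDTH


def year_to_bin(year):
    """Return the index of the decade bin that contains `year`."""
    if not START_YEAR <= year < END_YEAR:
        raise ValueError(f"year {year} is outside the supported range")
    return (year - START_YEAR) // BIN_WIDTH


def bin_to_midpoint(index):
    """Return the midpoint year of the decade bin at `index`."""
    return START_YEAR + index * BIN_WIDTH + BIN_WIDTH / 2


def expected_year(bin_probabilities):
    """Collapse a distribution over decade bins into one point estimate."""
    total = sum(bin_probabilities)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    weighted = sum(p * bin_to_midpoint(i) for i, p in enumerate(bin_probabilities))
    return weighted / total
```

Treating dating as a distribution over bins, rather than a single regressed number, lets a model express uncertainty such as two plausible periods for the same text.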

Performance and Evaluation

Evaluated on the LED test set and through a collaboration study with 23 epigraphers, Aeneas demonstrates significant improvements:

Restoration: Character error rate (CER) reduced to approximately 21% with Aeneas support, compared to 39% for unaided human experts.
Geographical Attribution: Achieves around 72% accuracy in classifying the province.
Chronological Attribution: Average error in date estimation is approximately 13 years for Aeneas.
Contextual Parallels: Retrieved parallels are accepted as useful starting points for historical research in approximately 90% of cases.
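
The article does not spell out the exact formulas behind these figures. As a rough guide, the sketch below uses the standard definitions: character error rate as Levenshtein edit distance divided by the length of the reference text, and dating error as the mean absolute difference in years. The function names and the sample strings are illustrative assumptions.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits that turn `a` into `b`."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def character_error_rate(prediction, reference):
    """Edit distance normalized by the reference length."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(prediction, reference) / len(reference)


def mean_absolute_error(predicted_years, true_years):
    """Average absolute gap, in years, between predicted and true dates."""
    if not predicted_years or len(predicted_years) != len(true_years):
        raise ValueError("inputs must be non-empty and of equal length")
    gaps = [abs(p - t) for p, t in zip(predicted_years, true_years)]
    return sum(gaps) / len(gaps)
```

Under these definitions, a CER of 21% means that roughly one character in five of the restored span differs from the ground truth.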

Integration in Research Workflows and Education

Aeneas operates as a cooperative tool, enhancing historians’ workflows by accelerating the search for epigraphic parallels and refining attribution. The tool and dataset are openly available via the Predicting the Past platform under permissive licenses, promoting interdisciplinary digital literacy.

FAQs

What is Aeneas and what tasks does it perform?

Aeneas is a generative multimodal neural network developed by Google DeepMind for Latin epigraphy. It assists historians by restoring damaged or missing text in ancient Latin inscriptions, estimating their date, attributing their geographical origin, and retrieving historically relevant parallel inscriptions.

How does Aeneas handle incomplete or damaged inscriptions?

Aeneas can predict missing text segments even when the length of the gap is unknown. It generates multiple plausible restoration hypotheses, ranked by likelihood, facilitating expert evaluation and further research.
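
The article does not describe how the ranked hypotheses are produced. One common way to obtain several ranked candidates from a character-level model is beam search, sketched below under simplifying assumptions: the gap has a known length, and each missing position comes with its own toy probability table (`position_probs` is a hypothetical stand-in for a model's output, which in practice would depend on the characters already chosen).

```python
import math


def beam_search(position_probs, beam_width=3):
    """Return up to `beam_width` (text, log_probability) pairs, best first.

    `position_probs` is a list with one dict per missing character,
    mapping each candidate character to its probability (toy input).
    """
    beams = [("", 0.0)]
    for probs in position_probs:
        candidates = [
            (text + char, score + math.log(p))
            for text, score in beams
            for char, p in probs.items()
            if p > 0
        ]
        # Keep only the most likely partial restorations.
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = candidates[:beam_width]
    return beams


# Toy example: restoring a two-character gap.
toy_probs = [{"u": 0.7, "v": 0.3}, {"s": 0.9, "m": 0.1}]
hypotheses = beam_search(toy_probs, beam_width=2)
```

Keeping several hypotheses instead of only the single best one is what allows an expert to compare alternatives and reject restorations that are linguistically likely but historically implausible.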

How is Aeneas integrated into historian workflows?

Aeneas provides historians with ranked lists of epigraphic parallels and predictive hypotheses for restoration, dating, and provenance. These outputs boost historians’ confidence and accuracy while reducing research time.
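
How the parallels are ranked is not detailed here. A common approach to this kind of retrieval, assumed for this sketch, is to represent every inscription as an embedding vector and rank the corpus by cosine similarity to the query. The vectors, identifiers, and function names below are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_parallels(query_vector, corpus, top_k=3):
    """Return the `top_k` (inscription_id, similarity) pairs, best first.

    `corpus` maps a hypothetical inscription id to its embedding vector.
    """
    scored = [
        (inscription_id, cosine_similarity(query_vector, vector))
        for inscription_id, vector in corpus.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy corpus with two-dimensional embeddings.
toy_corpus = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
ranking = rank_parallels([1.0, 0.1], toy_corpus, top_k=3)
```

A ranked list of this kind gives the historian a starting point; judging whether a retrieved parallel is actually relevant remains the expert's task.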

Check out the Paper, Project, and Google DeepMind Blog. All credit for this research goes to the researchers of this project.

«`