Rdgai: Classifying transcriptional changes using Large Language Models with a test case from an Arabic Gospel tradition
November 17, 2025 Β· Declared Dead Β· π arXiv.org
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Robert Turnbull
arXiv ID
2511.13801
Category
cs.DL: Digital Libraries
Cross-listed
cs.CL
Citations
0
Venue
arXiv.org
Last Checked
3 months ago
Abstract
Application of phylogenetic methods to textual traditions has traditionally treated all changes as equivalent even though it is widely recognized that certain types of variants were more likely to be introduced than others. While it is possible to give weights to certain changes using a maximum parsimony evaluation criterion, it is difficult to state a priori what these weights should be. Probabilistic methods, such as Bayesian phylogenetics, allow users to create categories of changes, and the transition rates for each category can be estimated as part of the analysis. This classification of types of changes in readings also allows for inspecting the probability of these categories across each branch in the resulting trees. However, classification of readings is time-consuming, as it requires categorizing each reading against every other reading at each variation unit, presenting a significant barrier to entry for this kind of analysis. This paper presents Rdgai, a software package that automates this classification task using multi-lingual large language models (LLMs). The tool allows users to easily manually classify changes in readings and then it uses these annotations in the prompt for an LLM to automatically classify the remaining reading transitions. These classifications are stored in TEI XML and ready for downstream phylogenetic analysis. This paper demonstrates the application with data an Arabic translation of the Gospels.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Digital Libraries
R.I.P.
π»
Ghosted
R.I.P.
π»
Ghosted
Measuring academic influence: Not all citations are equal
R.I.P.
π»
Ghosted
The Open Access Advantage Considering Citation, Article Usage and Social Media Attention
R.I.P.
π»
Ghosted
A Bibliometric Review of Large Language Models Research from 2017 to 2023
R.I.P.
π»
Ghosted
On the Performance of Hybrid Search Strategies for Systematic Literature Reviews in Software Engineering
R.I.P.
π»
Ghosted
A Systematic Identification and Analysis of Scientists on Twitter
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted