Skip to content
blog_transkribubus-1
Success Story Universities English

Sustainable efficiency: How the University of Georgia transcribed 20,000 pages in two months

Fiona Park
Fiona Park

The Finding Their Names: Discovery and Description of Enslavement Events project is a major initiative by the Hargrett Rare Book and Manuscript Library at the University of Georgia. Funded by a $137,000 grant from the National Historical Publications and Records Commission (NHPRC), the project aims to identify and document enslaved persons who lived during the Colonial and Antebellum periods of the United States by studying the content of historical documents most often created by the people who enslaved them.

The project team was tasked with analysing over 20,000 pages of archival documents sourced from over 80 collections, including correspondence, diaries, ledgers, and legal records. By creating high-resolution scans, full-text transcripts, and applicable datasets, and then publishing them online, the team was able to elevate these documents from collections of unscanned documents into searchable, machine-readable, and future-ready digital resources.

  

Match Page 2 to Page 1 (1)

 

Will Stanier, Librarian and Project Coordinator, University of Georgia Libraries 

Chris Lott, Digitization and Data Coordinator, UGA Special Collections Libraries

 

  

Key Facts

Organisation: Hargrett Rare Book and Manuscript Library at the University of Georgia (supported by a grant from the NHPRC).

Material: Over 20,000 pages of Colonial and Antebellum era archival documents.

Goal: To identify and document enslaved persons by creating searchable, machine-readable digital resources and publishing them to the Digital Library of Georgia (DLG) and Enslaved.org.

Project constraints: A small team of two people, a strict two-month timeline before Transkribus credits expired, and the need to meet specific technical standards/metadata requirements for external digital repositories.

Result: Successfully generated transcripts and metadata for 20,000+ pages in under two months, establishing a sustainable and efficient AI-assisted workflow for future projects.

 

IMG_0870The University of Georgia Libraries are home to thousands of documents relating to the slave trade. © University of Georgia Libraries

 

A challenging collection with technical requirements

The primary obstacle for the team was the nature of the collections. While some of the documents had already been digitised, they were not consistent in descriptions and metadata, making it difficult for researchers and library users to search across documents to understand narratives from mentions in different sources. To ensure the widest possible impact, the team planned to publish the digitised documents on both the Digital Library of Georgia (DLG) and the Enslaved.org digital archive, but this came with its own set of requirements.

“Documents needed to comply with various standards. First, the professional standards of library and archives generally. More specifically, the metadata standards for details like file-names that are inherent to both our library and the Digital Library of Georgia. And then also to the standards suggested by the Controlled Vocabulary published by Enslaved.org.” - Will Stanier,  Librarian and Project Coordinator, University of Georgia Libraries 

Operational constraints added another layer of difficulty. The library already had a Transkribus subscription, but the credits were soon to expire, and the project was managed by a small team of two. They required a methodology that could achieve results fast without compromising on technical requirements or transcription accuracy.

 

Screenshot 2026-05-19 115314
The Digital Library of Georgia contains resources from across all the libraries at the University of Georgia. © University of Georgia Libraries

  

A hybrid approach of AI and custom code

The Hargrett Library team developed a hybrid approach of Transkribus plus custom code to achieve optimal efficiency with very little resources. The Transkribus interface allows for extensive formatting and tagging of documents, with an array of options for exporting any metadata created. This flexibility allowed the team to write a custom script that facilitated the bulk uploading, recognition, and downloading of pages.

The script automatically compiled the data into text files that mirrored the original physical objects and complied with existing DLG and Enslaved.org standards. This hybrid approach of AI and custom code made the work very efficient.

“[Before creating this scripting workflow], we were looking at manually uploading material at the level at which we would eventually want to compile the transcripts (for each folder of material or, more commonly, for each item), or manually compiling after recognition, either of which would have required significant time investments that can now be directed toward quality control and subsequent phases of metadata creation.” - Will Stanier

 

Screenshot 2026-05-20 111158By combining Transkribus with custom code, the team of two were able to create a custom workflow that met their individual transcription requirements. © Transkribus 

 

More documents transcribed in less time

Increased productive capacity: Combining Transkribus and custom code allowed the team to generate full-text transcripts for over 20,000 pages of historical documents in less than two months. “The possibility of generating full-text transcripts for over 20,000 pages of 18th and 19th century handwritten documents in under two months, with two project staff, an AI tool, and a few lines of code presents exciting opportunities for special collections libraries.” - Chris Lott, Digitization and Data Coordinator, UGA Special Collections Libraries

Workflow efficiency: The workflow was so successful that the team were able to exceed their expectations by transcribing thousands of pages more than originally planned. “The workflow created the productive capacity to transcribe an additional 9,000 scans from non-grant digitization projects.” - Chris Lott

Enhanced discovery and research: The team was not only able to fulfil their digitisation goals on this project, but they also laid the foundations for future digitisation projects. “While we did not codify a specific approach to using Transkribus applicable to every future project, we did establish a sustainable model for recognizing documents quickly that will allow us much flexibility in how we approach these kinds of projects.” - Chris Lott

 

Conclusion

The Finding Their Names project demonstrates how special collections libraries can use AI to overcome traditional digitisation bottlenecks. By combining Transkribus with tailored automation, the Hargrett Rare Book and Manuscript Library at the University of Georgia has not only increased the speed of its digitisation efforts and set the groundwork for future projects, but has also ensured that the stories of enslaved individuals are more accessible and searchable than ever before.

Digitising written sources with specific technical requirements is a challenge experienced by many archives and institutions. If your team wants to find out more about how Transkribus can be used to create custom digitisation workflows for your project, then reach out to one of our advisors today and discover how Transkribus could help you reach your digitisation goals.

Share this post