
From paper to pixels: How to convert handwritten tables into digital spreadsheets

READ-COOP

From census records to trade ledgers, a handwritten table in a historical document often holds a wealth of valuable information. Yet much of this data remains locked on paper, accessible only by manually deciphering each page. Imagine the potential if these tables could be transformed into digital, searchable spreadsheets, allowing you to analyse whole volumes at the click of a button.

With AI tools like Transkribus, digitising tabular data in this way is now a reality. Transkribus enables users to convert handwritten tables into structured, searchable data, unlocking the stories within and making research both more efficient and more effective.

In this post, we will show you how to convert handwritten tables into digital spreadsheets, helping you bring historical records from paper to pixels.

 

Key takeaways:

  • Tables in paper-based documents can be converted into digital spreadsheets using Transkribus.
  • This is achieved by creating an AI model, using it to recognise the tables in your documents, and exporting the data as a spreadsheet file.
  • The digital spreadsheet can then be searched for specific data, or published online through Transkribus Sites.


Transkribus uses the power of AI to turn handwritten tables into digital data. © “Carnegie Corporation Register of Applications from Educational Institutions, 1911-1920” (Carnegie Corporation of New York Records)

 

Bring your handwritten tables from the archive to the computer

The tables in historical records provide valuable insights into aspects of life at particular times and places. But whether a table is handwritten, printed, or a mixture of both, its potential is often locked within physical documents. By digitising these tables and converting them into a format like Excel, researchers can search, analyse, and reorganise data in ways that simply aren’t possible on paper.

For example, you could transform lists of trades from centuries-old apprenticeship records into a dataset that reveals changing economic trends over time. Or you could convert registers of handwritten addresses into a searchable resource for genealogical research. Once digitised, these tables can also be uploaded to platforms like Transkribus Sites, where they can be shared and accessed worldwide.

 

What kind of records can be converted into spreadsheets?

Not all tables are equally easy to digitise. For best results, the tables should have a consistent layout across each page, with similar handwriting styles and clear separation between rows and columns. High-quality scans are essential, as blurry or poorly lit images can make it difficult for AI to accurately identify table boundaries or recognise text. Historical records that meet these criteria, such as census tables or registers with well-defined sections, are ideal candidates for this digitisation process.


The columns and rows in a table should be separated, whether by a line or blank space. © “A catalogue of pictures” (Paul Mellon Centre for Studies in British Art)

 

How to convert handwritten tables into spreadsheets

Below is an outline of the workflow for turning handwritten tables into digital spreadsheets. More detailed information about each step of the workflow can be found in our Help Center.

1. Train a Table Model

The first step is to train an AI Table Model that can locate the columns and rows in your handwritten table. To do this, you first need to prepare training data. Upload scanned images of your document to Transkribus, and use the Editor to mark where all the rows and columns are. You will need to prepare between 20 and 50 documents as training data, depending on the complexity and homogeneity of your tables.

Save these images as "Ground Truth" and use them to train a Table Model. You may need to retrain the model a few times before it is sufficiently accurate.

2. Run the Table Model

Now that your Table Model is ready, you can apply it to the rest of your document collection. This process tells Transkribus to identify the columns and rows in each table in your documents, creating a structured map of the tables.

3. Perform the layout recognition

Before performing the text recognition, it is important to run a separate layout recognition first. This allows Transkribus to locate the text within each cell of the table.

When running the layout recognition, select the following advanced parameters: "Keep existing regions" and "Split lines on region border". These parameters make sure that the table structure recognised in Step 2 is not altered by the layout recognition.

4. Perform the text recognition

Once Transkribus has located the cells and the text contained within them, it can then recognise the text and create a digital version of it. This is done in exactly the same way as a "normal" text recognition. A digital version of the table will be displayed on the right side of the Editor screen, and this can be corrected and edited as required.

 

5. Export as an Excel file

Finally, download the data to your computer as a spreadsheet file. This allows you to open it in programs such as Microsoft Excel or Google Sheets.
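
Once the export is on your computer, the data can be queried with a short script as well as in a spreadsheet program. The following is a minimal Python sketch, assuming the spreadsheet has been saved as a CSV file with a header row. The column names (`Name`, `Trade`, `Year`) and the sample rows are hypothetical and are not taken from a real Transkribus export.

```python
import csv
import io

# Hypothetical sample standing in for an exported table saved as CSV.
SAMPLE_CSV = """Name,Trade,Year
John Smith,Cooper,1782
Mary Brown,Weaver,1790
Thomas Hill,Master cooper,1801
"""


def search_rows(csv_file, column, term):
    """Return the rows whose `column` contains `term`, ignoring case."""
    reader = csv.DictReader(csv_file)
    needle = term.strip().lower()
    return [
        row for row in reader
        if needle in (row.get(column) or "").lower()
    ]


# In practice you would pass an open file instead of the in-memory sample,
# e.g. open("export.csv", newline="", encoding="utf-8").
matches = search_rows(io.StringIO(SAMPLE_CSV), "Trade", "cooper")
for row in matches:
    print(row["Name"], row["Year"])
```

Matching on a lower-cased substring is a deliberate choice here: recognised text often varies in capitalisation and wording, so an exact comparison would miss entries such as "Master cooper".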

 

Can I export all the document pages as one spreadsheet?

Yes, Transkribus allows you to choose between exporting each page as a separate sheet within the spreadsheet or merging all pages into one comprehensive spreadsheet. This flexibility makes it easy to structure your output in a way that suits your research goals.
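
Transkribus handles this choice at export time. If you have already exported each page separately and later want a single table, the merge can also be done afterwards. The following is a minimal Python sketch, assuming every page's table is available as CSV text with the same header row. The page contents and the added `page` column are hypothetical.

```python
import csv
import io

# Hypothetical per-page tables, keyed by page number, sharing one header row.
PAGES = {
    1: "Name,Trade\nJohn Smith,Cooper\n",
    2: "Name,Trade\nMary Brown,Weaver\nThomas Hill,Mason\n",
}


def merge_pages(pages):
    """Combine per-page CSV tables into one list of rows, recording the source page."""
    merged = []
    for page_number in sorted(pages):
        for row in csv.DictReader(io.StringIO(pages[page_number])):
            merged.append({"page": page_number, **row})
    return merged


def to_csv(rows):
    """Serialise merged rows into a single CSV string (assumes at least one row)."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


merged = merge_pages(PAGES)
print(to_csv(merged))
```

Keeping the page number as its own column means each row in the combined table can still be traced back to the scanned page it came from.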

 

Can I export it in another digital format?

Your tabular data doesn't have to be exported as an Excel file. Files can be downloaded from Transkribus in a range of formats, from PDF to TEI, so you can choose the one that best suits how you plan to process and analyse your data afterwards.

Various export formats supported by Transkribus.

How can I improve access to my documents?

Once your tables are digitised, consider uploading them to Transkribus Sites. This platform allows you to share your digital records online, making them accessible to researchers, historians, and the general public worldwide. By publishing on Transkribus Sites, you enhance the visibility of your work and contribute valuable resources to the wider research community.

 

Can I use Transkribus to digitise other types of documents?

Certainly. Transkribus is designed to digitise a wide range of documents, from medieval manuscripts to handwritten notes. It can also be used as optical character recognition (OCR) software for printed books. Its versatility makes it a valuable tool for any archive, library, or research institution working with historical records.

 

Where can I get more information on this topic?

The Transkribus Help Center has information and tutorials for all the different features and workflows in Transkribus. In the Table Models section, you can find step-by-step instructions for training a Table Model and applying it to your documents.

Alternatively, check out the recording of our Table Models webinar on our YouTube channel for a walkthrough of how to train a Table Model, and visit our Events page for information about upcoming webinars for both beginner and advanced users.
