OCR on Historical Documents

Skilja is proud to announce that we have received a grant from the European Union supporting a research and development project to improve OCR on historical documents. The grant is provided through the Eurostars program of the European Union. The program supports research-performing small and medium-sized enterprises (SMEs) that develop innovative products, processes and services in order to gain a competitive advantage. It is a transnational program in which projects have partners from two or more Eurostars countries. Thanks to this international collaboration, SMEs can more easily gain access to new markets. Please see here for more details on the program.
Skilja has won this grant together with our partner company Lumex in Norway (www.lumex.no). The grant will support our own investment in research activities and will run for three years. The Eurostars evaluation process ranked Lumex’s and Skilja’s proposal in the top 5% for technology and business model among hundreds of European SMEs representing all industries and sciences.

The goal of the three-year project is to improve the recognition of difficult historical documents. An example of such a document is given below: a typewritten document from a correspondence archive. The technology we develop also extends to standard and Gothic (Fraktur) fonts. The main target is the digitization of old archives and newspapers.

Example of a difficult historical typewritten document

This improved conversion will give researchers better access to cultural heritage and help preserve historical content for the digital world of the future. The project will build upon the current version of an accuracy extension for existing OCR that has been created by Lumex and Skilja. It will use advanced image processing and classification technologies to further improve the results.

This project is funded by the Federal Ministry of Education and Research (BMBF) of Germany and the European Union under the project OptO-Heritage and grant number 01QE140.

 
