CRS4

A TaLISMAN: Automatic Text and LIne Segmentation of historical MANuscripts

Ruggero Pintus, Ying Yang, Enrico Gobbetti, Holly Rushmeier
The 12th Eurographics Workshop on Graphics and Cultural Heritage - 2014
Download the publication: gch2014-atalisman.pdf [3.4 MB]
The acquisition and the preservation of historical and artistic books are crucial due to the value and the fragile condition of such deteriorating objects. Moreover, the study and browsing of digital libraries are invaluable for scholars in the Cultural Heritage field, but require automatic tools for analyzing and indexing these huge databases. In this scenario, document layout analysis plays a significant role, being a fundamental step of any document image understanding system. In this paper, we present a completely automatic algorithm to perform per-book text and line segmentation of old handwritten books. Our proposed technique has been evaluated on historical manuscripts of different writing styles and with various problematic attributes, such as holes, spots, ink bleed-through, ornamentation, background noise, and overlapping text lines. Our experimental results demonstrate that this approach is efficient and reliable, even when applied to very noisy and damaged books.

BibTeX references

@InProceedings{PYGR14,
  author       = {Pintus, R. and Yang, Y. and Gobbetti, E. and Rushmeier, H.},
  title        = {A TaLISMAN: Automatic Text and LIne Segmentation of historical MANuscripts},
  booktitle    = {The 12th Eurographics Workshop on Graphics and Cultural Heritage},
  year         = {2014},
  keywords     = {automatic text recognition, digital heritage},
  url          = {http://publications.crs4.it/pubdocs/2014/PYGR14},
}
