Please use this identifier to cite or link to this item: http://repository.iitr.ac.in/handle/123456789/15677
Title: An efficient coarse-to-fine indexing technique for fast text retrieval in historical documents
Authors: Roy, Partha Pratim
Rayar F.
Ramel J.-Y.
Published in: Proceedings of 10th IAPR International Workshop on Document Analysis Systems, DAS 2012
Abstract: In this paper, we present a fast text retrieval system to index and browse degraded historical documents. The indexing and retrieval strategy follows a two-level, coarse-to-fine approach to increase the speed of the retrieval process. During the indexing step, the text parts in the images are encoded into sequences of primitives obtained from two different codebooks: a coarse one corresponding to connected components and a fine one corresponding to glyph primitives. A glyph consists of a single character or a part of a character, depending on the shape complexity. During the querying step, the coarse and the fine signatures are generated from the query image using both codebooks. Then, a bi-level approximate string matching algorithm is applied to find similar words, using the coarse approach first and then the fine approach if necessary, by exploiting predetermined hypothetical locations. An experimental evaluation on datasets of real-life document images, gathered from historical books in different scripts, demonstrated the speed improvement and good accuracy in the presence of degradation. © 2012 IEEE.
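Illustrative sketch (not taken from the paper): the bi-level matching described in the abstract can be pictured as a coarse filter followed by a fine check that runs only on ambiguous candidates. The Python below is a minimal sketch under stated assumptions. The names Entry, edit_distance, retrieve, coarse_tol and fine_tol are hypothetical, codebook entries are represented as plain integers, words are assumed to be pre-segmented, and plain Levenshtein distance stands in for whatever approximate string matching the authors actually use.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Entry(NamedTuple):
    """One indexed word (hypothetical structure, not from the paper)."""
    location: str            # label of a hypothetical location in the corpus
    coarse: Tuple[int, ...]  # IDs from the connected-component codebook
    fine: Tuple[int, ...]    # IDs from the glyph-primitive codebook


def edit_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Levenshtein distance between two sequences of codebook IDs."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


def retrieve(query: Entry, index: List[Entry],
             coarse_tol: int = 1, fine_tol: int = 1) -> List[str]:
    """Return locations whose signatures match the query, coarse pass first."""
    hits = []
    for entry in index:
        d_coarse = edit_distance(query.coarse, entry.coarse)
        if d_coarse == 0:
            # Exact coarse match: accept without running the fine pass.
            hits.append(entry.location)
        elif d_coarse <= coarse_tol:
            # Ambiguous candidate: confirm with the fine signature.
            if edit_distance(query.fine, entry.fine) <= fine_tol:
                hits.append(entry.location)
        # Otherwise the coarse pass rejects the candidate outright.
    return hits


# Toy data with made-up codebook IDs.
query = Entry("query", (1, 2, 3), (10, 11, 12, 13, 14))
index = [
    Entry("w0", (1, 2, 3), (10, 11, 12, 13, 14)),  # identical signatures
    Entry("w1", (1, 4, 3), (10, 11, 12, 13, 14)),  # coarse differs, fine agrees
    Entry("w2", (1, 8, 3), (10, 11, 30, 31, 14)),  # coarse and fine both differ
    Entry("w3", (5, 6, 7), (20, 21, 22)),          # unrelated word
]
print(retrieve(query, index))
```

In this toy setup the fine comparison runs only for w1 and w2, which is the kind of saving a coarse-to-fine strategy aims for: most candidates are accepted or rejected on the shorter coarse signature alone.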
Citation: Proceedings of 10th IAPR International Workshop on Document Analysis Systems, DAS 2012, (2012), 150-154. Gold Coast, QLD
URI: https://doi.org/10.1109/DAS.2012.17
http://repository.iitr.ac.in/handle/123456789/15677
Issue Date: 2012
Keywords: Approximate String Matching
Historical Documents
Word Spotting
ISBN: 9.78077E+12
Author Scopus IDs: 56880478500
55247398900
8293131700
Author Affiliations: Roy, P.P., Laboratoire d'Informatique, Université François Rabelais, Tours, France
Rayar, F., Laboratoire d'Informatique, Université François Rabelais, Tours, France
Ramel, J.-Y., Laboratoire d'Informatique, Université François Rabelais, Tours, France
Corresponding Author: Roy, P.P.; Laboratoire d'Informatique, Université François Rabelais, Tours, France; email: partha.roy@univ-tours.fr
Appears in Collections: Conference Publications [CS]



Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.