Please use this identifier to cite or link to this item: http://repository.iitr.ac.in/handle/123456789/15860
Title: Multi-lingual text recognition from video frames
Authors: Sharma N.
Mandal R.
Sharma R.
Roy P.P.
Pal U.
Blumenstein M.
Published in: Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
Abstract: Text recognition from video frames is a challenging task due to low resolution, blur, noise, and complex, coloured backgrounds, among other factors. Consequently, traditional text recognition methods designed for scanned documents with simple backgrounds fail when applied to video text. Although various techniques are available for text recognition from handwritten and printed documents with simple backgrounds, text recognition from video frames has not been comprehensively investigated, especially for multi-lingual videos. In this paper, we present a technique for multi-lingual video text recognition that performs script identification in the first stage, followed by word and character recognition, and finally refines the results using a post-processing technique. Considering the inherent problems in videos, a Spatial Pyramid Matching (SPM) based technique, using patch-based SIFT descriptors and an SVM classifier, is employed for script identification. In the next stage, a Hidden Markov Model (HMM) based approach, which utilizes context information, is used for word and character recognition. Finally, a lexicon-based post-processing technique is applied to verify and refine the word recognition results. The proposed method was tested on a dataset comprising 4800 words from three different scripts, namely Roman (English), Hindi and Bengali. The script identification results obtained are encouraging. The word and character recognition results are also encouraging considering the complexity and problems associated with video text processing. © 2015 IEEE.
Citation: Proceedings of the International Conference on Document Analysis and Recognition, ICDAR, (2015), 951-955
URI: https://doi.org/10.1109/ICDAR.2015.7333902
http://repository.iitr.ac.in/handle/123456789/15860
Issue Date: 2015
Publisher: IEEE Computer Society
Keywords: Copying
Hidden Markov models
Linguistics
Markov processes
Text processing
Video signal processing
Context information
Post-processing techniques
Printed documents
Script identification
SIFT descriptors
Spatial Pyramid Matching
Text recognition
Video text recognition
Character recognition
ISBN: 9.78148E+12
ISSN: 1520-5363
Author Scopus IDs: 23991570600
54410932900
57013824000
56880478500
57200742116
56243577200
Author Affiliations: Sharma, N., Griffith University, QLD 4222, Australia
Mandal, R., Griffith University, QLD 4222, Australia
Sharma, R., CVPR Unit, Indian Statistical Institute, Kolkata, 700108, India
Roy, P.P., Indian Institute of Technology, Roorkee, India
Pal, U., CVPR Unit, Indian Statistical Institute, Kolkata, 700108, India
Blumenstein, M., Griffith University, QLD 4222, Australia
Appears in Collections: Conference Publications [CS]
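
Illustrative note: the abstract mentions a lexicon-based post-processing step that verifies and refines word recognition results, but this record does not say how the matching is done. The following is a minimal sketch of one common approach, assuming nearest-word matching by Levenshtein edit distance. The function names `edit_distance` and `refine_word` and the `max_distance` threshold are hypothetical and are not taken from the paper.

```python
from typing import Iterable


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def refine_word(recognized: str, lexicon: Iterable[str], max_distance: int = 2) -> str:
    """Return the closest lexicon word, or the input unchanged if none is close enough.

    Assumption: the threshold max_distance is an illustrative choice, not a
    value reported in the paper.
    """
    words = list(lexicon)
    if not words:
        return recognized
    best = min(words, key=lambda w: edit_distance(recognized, w))
    return best if edit_distance(recognized, best) <= max_distance else recognized
```

For example, `refine_word("vide0", ["video", "text", "frame"])` would map the misrecognized string to the lexicon entry `video`, while a string far from every lexicon entry is returned unchanged. Because the comparison works on Unicode strings, the same sketch applies to Hindi or Bengali lexicons, although a per-script lexicon selected after script identification is again an assumption here.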

Files in This Item:
There are no files associated with this item.
Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.