Please use this identifier to cite or link to this item: http://repository.iitr.ac.in/handle/123456789/15810
Full metadata record
DC Field | Value | Language
dc.contributor.author | Gaur S. | -
dc.contributor.author | Sonkar S. | -
dc.contributor.author | Pratim Roy, Partha | -
dc.date.accessioned | 2020-12-02T11:41:43Z | -
dc.date.available | 2020-12-02T11:41:43Z | -
dc.date.issued | 2015 | -
dc.identifier.citation | Proceedings of the International Conference on Document Analysis and Recognition, ICDAR, (2015), 491-495 | -
dc.identifier.isbn | 9.78148E+12 | -
dc.identifier.issn | 15205363 | -
dc.identifier.uri | https://doi.org/10.1109/ICDAR.2015.7333810 | -
dc.identifier.uri | http://repository.iitr.ac.in/handle/123456789/15810 | -
dc.description.abstract | This paper presents a novel approach to creating a synthetic dataset for word recognition systems. Our purpose is to improve the performance of off-line handwritten text recognizers by providing them with additional synthetic training data. The lack of a proper dataset for many languages makes recognition systems hard to train. To address this problem, synthetic handwriting can be used to expand the existing training dataset. Any available digital data, such as text from online newspapers and similar sources, can be used to generate this synthetic data. The digital data is distorted in such a way that the underlying pattern is preserved, so that the word can still be identified by both machines and human users. The resulting images can be used to train any classification system for handwriting recognition. This data can be used on its own to train the system or combined with natural handwritten data to augment the original dataset and improve the accuracy of the results. We experimented using only synthetic data and obtained high recognition accuracy in both character and word recognition. The data was tested on 3 Indian scripts for numerals (Hindi, Bengali and Telugu) and 1 script (Hindi) for words; the results achieved are highly promising. © 2015 IEEE. | -
dc.description.sponsorship | ABBYY; Google; iTESOFT; MyScript; Yooz | -
dc.language.iso | en_US | -
dc.publisher | IEEE Computer Society | -
dc.relation.ispartof | Proceedings of the International Conference on Document Analysis and Recognition, ICDAR | -
dc.subject | Hidden Markov Models | -
dc.subject | Indic Text Recognition | -
dc.subject | Synthetic Data Generation | -
dc.title | Generation of synthetic training data for handwritten Indic script recognition | -
dc.type | Conference Paper | -
dc.scopusid | 57188730765 | -
dc.scopusid | 57207533098 | -
dc.scopusid | 56880478500 | -
dc.affiliation | Gaur, S., Department of Computer Science and Engineering, Indian Institute of Technology, Roorkee, Uttarakhand, India | -
dc.affiliation | Sonkar, S., Department of Computer Science and Engineering, Indian Institute of Technology, Roorkee, Uttarakhand, India | -
dc.affiliation | Roy, P.P., Department of Computer Science and Engineering, Indian Institute of Technology, Roorkee, Uttarakhand, India | -
dc.identifier.conferencedetails | 13th International Conference on Document Analysis and Recognition, ICDAR 2015, 23-26 August 2015 | -
Appears in Collections: Conference Publications [CS]

Files in This Item:
There are no files associated with this item.

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.