Bengali text to image

5/30/2023

Tesseract is an open source OCR engine that works for many languages. Recently, it has started to work on Bengali script as well. One drawback of this OCR engine, however, is that it does not work for all Bengali fonts and sizes.

A group of researchers proposed a complete Bengali OCR methodology, tested it on different fonts and sizes, and obtained a good accuracy rate only for larger font sizes. The accuracy of the model was reduced by some connected characters. One of the approaches uses a two-zone method for character segmentation, which does not work for all fonts and sizes. Moreover, the system only works for the fixed font size that was used to train the model. A printed Bengali OCR system was also developed using a BLSTM-CTC architecture with a single hidden layer, covering pre-processing, line detection and recognition. One drawback of that OCR system is that it does not work properly if the resolution of the image is less than 300 dpi.
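To make the 300 dpi limit concrete, here is a minimal sketch in Python that estimates the effective resolution of a scan from its pixel width and the physical width of the page, and reports how much the image would have to be enlarged to reach the threshold. The function names and the idea of resampling are assumptions made for illustration, not part of any of the systems described above, and enlarging an image does not add detail that the scan never captured.

```python
MIN_DPI = 300  # resolution threshold mentioned above


def effective_dpi(pixels: int, inches: float) -> float:
    """Return the scan resolution in dots per inch along one dimension."""
    if inches <= 0:
        raise ValueError("physical size must be positive")
    return pixels / inches


def upscale_factor(pixels: int, inches: float, target_dpi: float = MIN_DPI) -> float:
    """Return the factor by which the image must be enlarged to reach target_dpi.

    A result of 1.0 means the scan already meets the target.
    """
    dpi = effective_dpi(pixels, inches)
    return max(1.0, target_dpi / dpi)


if __name__ == "__main__":
    # Hypothetical example: a page 8 inches wide scanned at 1200 pixels wide.
    print(effective_dpi(1200, 8.0))   # 150.0
    print(upscale_factor(1200, 8.0))  # 2.0
```

In practice, rescanning at a higher resolution is likely to help more than resampling, since the check above only says whether the input falls below the limit.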

A lot of research has been undertaken on Bengali OCR systems. Some open problems for Bengali OCR have also been noted, including the lack of standard samples and the complex structure of documents.
