OCR Summary

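For the past three months I have not worked on face-related tasks; I now work on OCR.
As usual, I start with links to the most recent OCR papers.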

1. State-of-the-art progress

https://github.com/hs105/Deep-Learning-for-OCR
https://github.com/chongyangtao/Awesome-Scene-Text-Recognition
http://mclab.eic.hust.edu.cn/~xbai/

2. Overview

Flowchart notes
[1]. The input image sometimes needs preprocessing, which aims to enhance image quality.

Different types of filters, such as averaging, min and max filters, can be applied.
Alternatively, morphological operations such as erosion, dilation, opening and closing can be performed.
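
The filters and morphological operations in [1] can be sketched in a few lines. This is a minimal pure-Python illustration, not a production implementation. It assumes a small image stored as a list of rows, a flat square k x k neighbourhood, bright foreground on a dark background, and borders handled by clamping to the nearest edge pixel. All function names here are hypothetical. With a flat structuring element, erosion is a min filter and dilation is a max filter.

```python
from typing import Callable, List

Image = List[List[int]]


def window_filter(img: Image, reduce: Callable[[List[int]], int], k: int = 3) -> Image:
    """Apply `reduce` to every k x k neighbourhood (borders clamp to the edge)."""
    h, w, r = len(img), len(img[0]), k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-r, r + 1)
                for dx in range(-r, r + 1)
            ]
            out[y][x] = reduce(window)
    return out


def mean_filter(img: Image, k: int = 3) -> Image:
    """Averaging filter: each pixel becomes the rounded mean of its window."""
    return window_filter(img, lambda win: round(sum(win) / len(win)), k)


def erode(img: Image, k: int = 3) -> Image:
    """Erosion with a flat square element is a min filter."""
    return window_filter(img, min, k)


def dilate(img: Image, k: int = 3) -> Image:
    """Dilation with a flat square element is a max filter."""
    return window_filter(img, max, k)


def opening(img: Image, k: int = 3) -> Image:
    """Erosion followed by dilation: removes small bright specks."""
    return dilate(erode(img, k), k)


def closing(img: Image, k: int = 3) -> Image:
    """Dilation followed by erosion: fills small dark holes."""
    return erode(dilate(img, k), k)
```

For example, opening a blank 5 x 5 image that contains one isolated foreground pixel returns a blank image, while closing a solid image with a single missing pixel fills the hole. In practice an image library would normally be used for this step.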

[2]. Layout analysis usually relies on traditional methods.

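[3]. When the captured image is strongly skewed or severely distorted, its orientation needs to be corrected, usually with an affine transformation.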
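
To make the affine correction in [3] concrete, here is a small sketch that builds a 2 x 3 rotation matrix about a centre point and applies it to coordinates. It assumes the skew angle is already known and that only rotation needs to be undone; the function names are hypothetical, and the sign convention (positive angles rotate counter-clockwise in a y-up coordinate system) is an assumption that flips visually for image coordinates where y points down.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Matrix = List[List[float]]  # 2 x 3 affine matrix [[a, b, tx], [c, d, ty]]


def rotation_about(center: Point, angle_deg: float) -> Matrix:
    """Affine matrix that rotates points by angle_deg around `center`."""
    cx, cy = center
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    # p' = R (p - center) + center, expanded into a single 2 x 3 matrix
    return [
        [c, -s, cx - c * cx + s * cy],
        [s, c, cy - s * cx - c * cy],
    ]


def apply_affine(m: Matrix, point: Point) -> Point:
    """Map one (x, y) point through the affine matrix."""
    x, y = point
    return (
        m[0][0] * x + m[0][1] * y + m[0][2],
        m[1][0] * x + m[1][1] * y + m[1][2],
    )


def deskew_points(points: List[Point], skew_deg: float) -> List[Point]:
    """Undo a known skew by rotating by -skew_deg about the centroid."""
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    m = rotation_about((cx, cy), -skew_deg)
    return [apply_affine(m, p) for p in points]
```

Applied to points sampled along a text baseline tilted by 10 degrees, `deskew_points(points, 10)` returns points that share the same y coordinate up to floating-point error. Resampling the pixels of a real image with the same matrix is normally left to an image library.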

[4]. Lexical processing based on Markov models and a dictionary can also help improve OCR results.
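
As a rough illustration of the lexical processing in [4], the sketch below combines a dictionary lookup with a first-order (word-bigram) Markov model. It assumes the OCR output is already split into tokens, that a token missing from the dictionary is an OCR error, and that candidates lie within a small edit distance. The tiny corpus, the `<s>` start marker and all function names are made up for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Set


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def train_bigrams(corpus: List[str]) -> Dict[str, Counter]:
    """Count word -> next-word transitions (a first-order Markov model)."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for sentence in corpus:
        words = ["<s>"] + sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def correct(tokens: List[str], vocab: Set[str],
            bigrams: Dict[str, Counter], max_dist: int = 2) -> List[str]:
    """Replace out-of-dictionary tokens with the best dictionary candidate."""
    out, prev = [], "<s>"
    for raw in tokens:
        word = raw.lower()
        if word not in vocab:
            candidates = [w for w in vocab if edit_distance(word, w) <= max_dist]
            if candidates:
                # Closest spelling first; ties go to the likelier transition,
                # then alphabetical order so the result is deterministic.
                word = min(
                    candidates,
                    key=lambda w: (edit_distance(raw.lower(), w),
                                   -bigrams.get(prev, {}).get(w, 0), w),
                )
        out.append(word)
        prev = word
    return out
```

With a corpus in which "the main" occurs more often than "the mail", the misread token "mai1" after "the" is resolved to "main" even though both words are one edit away, which is the kind of context a plain dictionary lookup cannot use.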

2.1 Challenges

[1] An important area of research is multi-lingual character recognition.
[2] Handwritten character recognition is very hard because writing styles differ between users, and even the same user moves the pen differently when writing the same character.

2.2 Applications

OCR has been used for mail sorting, bank cheque reading and signature verification.
Another useful application of OCR is helping blind and visually impaired people to read text.
Other uses of OCR include processing utility bills, passport validation, pen computing and automated number plate recognition.

2.3 Classification

Online & offline
[1] Online recognition is performed in real time while the user is writing the character. Online systems are less complex because they can capture temporal (time-based) information about the pen movement.
[2] Offline recognition systems operate on static data, i.e. the input is a bitmap.
Recognition is therefore considerably harder.
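
The difference between the two kinds of input can be shown with a small data-structure sketch. It assumes online data arrives as strokes of (x, y, timestamp) samples and that the offline input is a binary bitmap; the type aliases and function names are hypothetical, and the rasterizer marks only the sampled points without connecting them.

```python
from typing import List, Tuple

Sample = Tuple[int, int, float]  # (x, y, timestamp in seconds)
Stroke = List[Sample]            # one pen-down to pen-up movement


def rasterize(strokes: List[Stroke], width: int, height: int) -> List[List[int]]:
    """Turn online pen data into an offline bitmap (1 = ink)."""
    bitmap = [[0] * width for _ in range(height)]
    for stroke in strokes:
        for x, y, _t in stroke:  # the timestamp is discarded here
            if 0 <= x < width and 0 <= y < height:
                bitmap[y][x] = 1
    return bitmap


def stroke_order(strokes: List[Stroke]) -> List[int]:
    """Online-only feature: stroke indices sorted by their first timestamp."""
    return sorted(range(len(strokes)), key=lambda i: strokes[i][0][2])


# A "+" sign drawn horizontal-first and then vertical-first.
horizontal_first = [
    [(0, 1, 0.0), (1, 1, 0.1), (2, 1, 0.2)],
    [(1, 0, 0.5), (1, 1, 0.6), (1, 2, 0.7)],
]
vertical_first = [
    [(0, 1, 0.5), (1, 1, 0.6), (2, 1, 0.7)],
    [(1, 0, 0.0), (1, 1, 0.1), (1, 2, 0.2)],
]
```

Both recordings rasterize to the same 3 x 3 bitmap, yet `stroke_order` distinguishes them. An offline recognizer only ever sees the bitmap, so that cue is unavailable to it.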

3. Mainstream approaches

[1]. Google releases SLING, a natural language processing parser
[2]. Recursive Recurrent Nets with Attention Modeling for OCR in the Wild

4-1 Page dewarping

4-2 Angle estimation

5. Summary


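Copyright notice: this is an original article by dongfang1984, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.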