Citation: | BAI Zhi-cheng, LI Qing, CHEN Peng, GUO Li-qing. Text detection in natural scenes: a literature review[J]. Chinese Journal of Engineering, 2020, 42(11): 1433-1448. doi: 10.13374/j.issn2095-9389.2020.03.24.002 |
[1] |
戴津. 自然場景中文本檢測技術研究綜述. 計算機光盤軟件與應用, 2013(18):104
Dai J. Review of research on text detection technology in natural scenes. Comput CD Software Appl, 2013(18): 104
|
[2] |
卓力, 龍海霞, 彭遠帆, 等. 加密域圖像處理綜述. 北京工業大學學報, 2016, 42(2):174
Zhuo L, Long H X, Peng Y F, et al. Image processing in encrypted domain: a comprehensive survey. J Beijing Univ Technol, 2016, 42(2): 174
|
[3] |
樊亞玲. 移動終端自然場景文本檢測算法研究[學位論文]. 西安: 西安電子科技大學, 2015
Fan Y L. Natural Scene Text Detection Algorithm Research Based on Mobile Terminal [Dissertation]. Xi’an: Xidian University, 2015
|
[4] |
王潤民, 桑農, 丁丁, 等. 自然場景圖像中的文本檢測綜述. 自動化學報, 2018, 44(12):2113
Wang R M, Sang N, Ding D, et al. Text detection in natural scene image: a survey. Acta Autom Sin, 2018, 44(12): 2113
|
[5] |
李建更, 李立杰, 張巖, 等. 適用于具有多分類器的卷積神經網絡訓練方法. 北京工業大學學報, 2018, 44(10):1291
Li J G, Li L J, Zhang Y, et al. A method which is suitable for the training of convolutional neural networks with multiple classifiers. J Beijing Univ Technol, 2018, 44(10): 1291
|
[6] |
李曉理, 張博, 王康, 等. 人工智能的發展及應用. 北京工業大學學報, 2020, 46(6):583
Li X L, Zhang B, Wang K, et al. The development and application of artificial intelligence. J Beijing Univ Technol, 2020, 46(6): 583
|
[7] |
Canny J. A computational approach to edge detection. IEEE Trans Pattern Anal Mach Intell, 1986, 8(6): 679
|
[8] |
Liu C M, Wang C H, Dai R W. Text detection in images based on unsupervised classification of edge-based features // Eighth International Conference on Document Analysis and Recognition (ICDAR'05). Seoul, 2005: 610
|
[9] |
Sobel I E. Camera Models and Machine Perception [Dissertation]. Stanford: Stanford University, 1970
|
[10] |
Shivakumara P, Phan T Q, Tan C L. A Laplacian approach to multi-oriented text detection in video. IEEE Trans Pattern Anal Mach Intell, 2011, 33(2): 412 doi: 10.1109/TPAMI.2010.166
|
[11] |
Yu C, Song Y, Meng Q, et al. Text detection and recognition in natural scene with edge analysis. IET Computer Vision, 2015, 9(4): 603 doi: 10.1049/iet-cvi.2013.0307
|
[12] |
Buta M, Neumann L, Matas J. FASText: Efficient unconstrained scene text detector // Proceedings of the IEEE International Conference on Computer Vision. Santiago, 2015: 1206
|
[13] |
Epshtein B, Ofek E, Wexler Y. Detecting text in natural scenes with stroke width transform // 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. California, 2010: 2963
|
[14] |
Yao C, Bai X, Liu W Y, et al. Detecting texts of arbitrary orientations in natural images // 2012 IEEE Conference on Computer Vision and Pattern Recognition. Providence, 2012: 1083
|
[15] |
Huang W L, Lin Z, Yang J C, et al. Text localization in natural images using stroke feature transform and text covariance descriptors // Proceedings of the 2013 IEEE International Conference on Computer Vision. Sydney, 2013: 1241
|
[16] |
Matas J, Chum O, Urban M, et al. Robust wide-baseline stereo from maximally stable extremal regions. Image Vision Computing, 2004, 22(10): 761 doi: 10.1016/j.imavis.2004.02.006
|
[17] |
Gomez L, Karatzas D. Object proposals for text extraction in the wild // 2015 13th International Conference on Document Analysis and Recognition (ICDAR). Tunis, 2015: 206
|
[18] |
Neumann L, Matas J. A method for text localization and recognition in real-world images // 10th Asian Conference on Computer Vision. Queenstown, 2010: 770
|
[19] |
Neumann L, Matas J. Real-time scene text localization and recognition // 2012 IEEE Conference on Computer Vision and Pattern Recognition. Providence, 2012: 3538
|
[20] |
Sun L, Huo Q. A component-tree based method for user-intention guided text extraction // Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012). Tsukuba, 2012: 633
|
[21] |
Sun L, Huo Q, Jia W, et al. A robust approach for text detection from natural scene images. Pattern Recognit, 2015, 48(9): 2906 doi: 10.1016/j.patcog.2015.04.002
|
[22] |
周鵬飛. 自然場景圖像中的文本檢測與識別技術研究[學位論文]. 西安: 西安理工大學, 2019
Zhou P F. Research on Text Detection and Recognition in Natural Scene Images [Dissertation]. Xi’an: Xi’an University of Technology, 2019
|
[23] |
Chen X R, Yuille A L. Detecting and reading text in natural scenes // Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Washington, 2004: II
|
[24] |
Lee J J, Lee P H, Lee S W, et al. AdaBoost for text detection in natural scene // 2011 International Conference on Document Analysis and Recognition. Beijing, 2011: 429
|
[25] |
Liu Y L, Jin L W. Deep matching prior network: Toward tighter multi-oriented text detection // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, 2017: 1962
|
[26] |
尹寶才, 王文通, 王立春. 深度學習研究綜述. 北京工業大學學報, 2015, 41(1):48
Yin B C, Wang W T, Wang L C. Review of deep learning research. J Beijing Univ Technol, 2015, 41(1): 48
|
[27] |
Ren S Q, He K M, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell, 2017, 39(6): 1137 doi: 10.1109/TPAMI.2016.2577031
|
[28] |
Liu W, Anguelov D, Erhan D, et al. SSD: Single shot multibox detector // European Conference on Computer Vision. Amsterdam, 2016: 21
|
[29] |
Dai J F, Li Y, He K M, et al. R-FCN: object detection via region-based fully convolutional networks // Proceedings of the 30th International Conference on Neural Information Processing Systems. Barcelona, 2016: 379
|
[30] |
余崢, 王晴晴, 呂岳. 基于特征融合網絡的自然場景文本檢測. 計算機系統應用, 2018, 27(10):1
Yu Z, Wang Q Q, Lü Y. Scene text detection based on feature fusion network. Comput Syst Appl, 2018, 27(10): 1
|
[31] |
Karatzas D, Mestre S R, Mas J, et al. ICDAR 2011 robust reading competition-challenge 1: reading text in born-digital images (web and email) // 2011 International Conference on Document Analysis and Recognition. Beijing, 2011: 1485
|
[32] |
Karatzas D, Shafait F, Uchida S, et al. ICDAR 2013 robust reading competition // 2013 12th International Conference on Document Analysis and Recognition. Washington, 2013: 1484
|
[33] |
Ma J Q, Shao W Y, Ye H, et al. Arbitrary-oriented scene text detection via rotation proposals. IEEE Trans Multimedia, 2018, 20(11): 3111 doi: 10.1109/TMM.2018.2818020
|
[34] |
Jiang Y Y, Zhu X Y, Wang X B, et al. R2CNN: rotational region CNN for orientation robust scene text detection [J/OL]. arXiv preprint (2017-06-30)[2020-03-01]. https://arxiv.org/abs/1706.09579
|
[35] |
Zhong Z Y, Sun L, Huo Q. An anchor-free region proposal network for Faster R-CNN-based text detection approaches. Int J Doc Anal Recognit, 2019, 22(3): 315 doi: 10.1007/s10032-019-00335-y
|
[36] |
Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition [J/OL]. arXiv preprint (2015-04-10)[2020-03-01]. https://arxiv.org/abs/1409.1556
|
[37] |
Shi B G, Bai X, Belongie S. Detecting oriented text in natural images by linking segments // Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, 2017: 2550
|
[38] |
Liao M H, Shi B G, Bai X, et al. TextBoxes: a fast text detector with a single deep neural network // Thirty-First AAAI Conference on Artificial Intelligence. San Francisco, 2017: 4161
|
[39] |
Liao M H, Shi B G, Bai X. TextBoxes++: a single-shot oriented scene text detector. IEEE Trans Image Process, 2018, 27(8): 3676 doi: 10.1109/TIP.2018.2825107
|
[40] |
He P, Huang W L, He T, et al. Single shot text detector with regional attention // Proceedings of the 2017 IEEE International Conference on Computer Vision. Venice, 2017: 3047
|
[41] |
Szegedy C, Ioffe S, Vanhoucke V, et al. Inception-v4, inception-ResNet and the impact of residual connections on learning // Thirty-First AAAI Conference on Artificial Intelligence. San Francisco, 2017: 4278
|
[42] |
Liao M H, Zhu Z, Shi B G, et al. Rotation-sensitive regression for oriented scene text detection // Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, 2018: 5909
|
[43] |
Liu X, Zhang R, Zhou Y S, et al. Scene text detection with feature pyramid network and linking segments // 2019 International Conference on Document Analysis and Recognition (ICDAR). Sydney, 2019: 508
|
[44] |
Zhang S, Liu Y L, Jin L W, et al. Feature enhancement network: a refined scene text detector // Thirty-Second AAAI Conference on Artificial Intelligence. New Orleans, 2018: 2612
|
[45] |
Tian Z, Huang W L, He T, et al. Detecting text in natural image with connectionist text proposal network // European Conference on Computer Vision. Amsterdam, 2016: 56
|
[46] |
Wang X B, Jiang Y Y, Luo Z B, et al. Arbitrary shape scene text detection with adaptive text region representation // Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, 2019: 6449
|
[47] |
He K M, Gkioxari G, Dollár P, et al. Mask R-CNN // Proceedings of the 2017 IEEE International Conference on Computer Vision. Venice, 2017: 2961
|
[48] |
Shelhamer E, Long J, Darrell T. Fully convolutional networks for semantic segmentation. IEEE Trans Pattern Anal Mach Intell, 2017, 39(4): 640 doi: 10.1109/TPAMI.2016.2572683
|
[49] |
Li Y, Qi H Z, Dai J F, et al. Fully convolutional instance-aware semantic segmentation // Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, 2017: 4438
|
[50] |
Lyu P Y, Liao M H, Yao C, et al. Mask TextSpotter: an end-to-end trainable neural network for spotting text with arbitrary shapes // Proceedings of the European Conference on Computer Vision. Munich, 2018: 67
|
[51] |
Liao M H, Lyu P Y, He M H, et al. Mask TextSpotter: an end-to-end trainable neural network for spotting text with arbitrary shapes. IEEE Trans Pattern Anal Mach Intell, 2019: 1
|
[52] |
Xie E Z, Zang Y H, Shao S, et al. Scene text detection with supervised pyramid context network. Proc AAAI Conf Artif Intell, 2019, 33: 9038
|
[53] |
Zhang Z, Zhang C Q, Shen W, et al. Multi-oriented text detection with fully convolutional networks // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, 2016: 4159
|
[54] |
Long S B, Ruan J Q, Zhang W J, et al. TextSnake: a flexible representation for detecting text of arbitrary shapes // Proceedings of the European Conference on Computer Vision. Munich, 2018: 19
|
[55] |
He T, Huang W L, Qiao Y, et al. Accurate text localization in natural image with cascaded convolutional text network [J/OL]. arXiv preprint (2016-03-31)[2020-03-01]. https://arxiv.org/abs/1603.09423
|
[56] |
Deng D, Liu H F, Li X L, et al. PixelLink: Detecting scene text via instance segmentation [J/OL]. arXiv preprint (2018-01-04)[2020-03-01]. https://arxiv.org/abs/1801.01315
|
[57] |
Yang Q P, Cheng M L, Zhou W M, et al. IncepText: a new inception-text module with deformable PSROI pooling for multi-oriented scene text detection // Proceedings of the 27th International Joint Conference on Artificial Intelligence. Stockholm, 2018: 1071
|
[58] |
Dai Y C, Huang Z, Gao Y T, et al. Fused text segmentation networks for multi-oriented scene text detection // 2018 24th International Conference on Pattern Recognition (ICPR). Beijing, 2018: 3604
|
[59] |
Wang W H, Xie E Z, Li X, et al. Shape robust text detection with progressive scale expansion network // Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, 2019: 9328
|
[60] |
Wang W H, Xie E Z, Song X G, et al. Efficient and accurate arbitrary-shaped text detection with pixel aggregation network // Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Seoul, 2019: 8439
|
[61] |
He K M, Zhang X Y, Ren S Q, et al. Deep residual learning for image recognition // 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, 2016: 770
|
[62] |
Baek Y, Lee B, Han D, et al. Character region awareness for text detection // Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, 2019: 9357
|
[63] |
Tian Z T, Shu M, Lyu P Y, et al. Learning shape-aware embedding for scene text detection // Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, 2019: 4229
|
[64] |
Liao M H, Wan Z Y, Yao C, et al. Real-time scene text detection with differentiable binarization [J/OL]. arXiv preprint (2019-12-03)[2020-03-01]. https://arxiv.org/abs/1911.08947
|
[65] |
Lyu P Y, Yao C, Wu W H, et al. Multi-oriented scene text detection via corner localization and region segmentation // Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, 2018: 7553
|
[66] |
Li Y, Yu Y J, Li Z F, et al. Pixel-Anchor: a fast oriented scene text detector with combined networks [J/OL]. arXiv preprint (2018-11-19)[2020-03-01]. http://export.arxiv.org/abs/1811.07432
|
[67] |
Jiang F, Hao Z H, Liu X R. Deep scene text detection with connected component proposals [J/OL]. arXiv preprint (2017-08-17)[2020-03-01]. http://export.arxiv.org/abs/1708.05133
|
[68] |
Qiao L, Tang S L, Cheng Z Z, et al. Text perceptron: towards end-to-end arbitrary-shaped text spotting [J/OL]. arXiv preprint (2020-02-17)[2020-03-01]. https://arxiv.org/abs/2002.06820
|
[69] |
Zhou X Y, Yao C, Wen H, et al. EAST: an efficient and accurate scene text detector // Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, 2017: 2642
|
[70] |
Li J R, Zhou Z J, Su Z Z, et al. A new parallel detection-recognition approach for end-to-end scene text extraction // 2019 International Conference on Document Analysis and Recognition (ICDAR). Sydney, 2019: 1358
|
[71] |
He T, Tian Z, Huang W L, et al. An end-to-end TextSpotter with explicit alignment and attention // 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Salt Lake City, 2018: 5020
|
[72] |
Kim K H, Hong S, Roh B, et al. PVANET: Deep but lightweight neural networks for real-time object detection [J/OL]. arXiv preprint (2019-09-30)[2020-03-01]. https://arxiv.org/abs/1608.08021
|
[73] |
He K M, Zhang X Y, Ren S Q, et al. Deep residual learning for image recognition // IEEE Conference on Computer Vision & Pattern Recognition. Las Vegas, 2016: 770
|
[74] |
Wang F F, Zhao L M, Li X, et al. Geometry-aware scene text detection with instance transformation network // Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, 2018: 1381
|
[75] |
Duan J Q, Xu Y J, Kuang Z H, et al. Geometry normalization networks for accurate scene text detection // Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Seoul, 2019: 9136
|
[76] |
Liu Z C, Lin G S, Yang S, et al. Towards robust curve text detection with conditional spatial expansion // Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, 2019: 7261
|
[77] |
Liu Y L, Chen H, Shen C H, et al. ABCNet: real-time scene text spotting with adaptive Bezier-curve network [J/OL]. arXiv preprint (2020-02-25)[2020-03-01]. https://arxiv.org/abs/2002.10200v2
|
[78] |
Wang H, Lu P, Zhang H, et al. All you need is boundary: toward arbitrary-shaped text spotting [J/OL]. arXiv preprint (2019-11-21)[2020-03-01]. https://arxiv.org/abs/1911.09550
|
[79] |
張艾萱. 基于深度學習的自然場景文本檢測算法研究[學位論文]. 北京: 北方工業大學, 2019
Zhang A X. Research on Natural Scene Text Detection Algorithms Based on Deep Learning [Dissertation]. Beijing: North China University of Technology, 2019
|
[80] |
周翔宇, 高仲合. 基于YOLO的自然場景傾斜文本定位方法研究. 計算機工程與應用, 2020, 56(9):213 doi: 10.3778/j.issn.1002-8331.1911-0032
Zhou X Y, Gao Z H. Research on inclined text location method of natural scene based on YOLO. Comput Eng Appl, 2020, 56(9): 213 doi: 10.3778/j.issn.1002-8331.1911-0032
|
[81] |
Redmon J, Divvala S, Girshick R, et al. You only look once: unified, real-time object detection // 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, 2016: 779
|
[82] |
牛作東, 李捍東. 引入注意力機制的自然場景文本檢測算法研究. 計算機應用與軟件, 2019, 36(9):198 doi: 10.3969/j.issn.1000-386x.2019.09.035
Niu Z D, Li H D. Natural scene text detection algorithm with attention mechanism. Comput Appl Software, 2019, 36(9): 198 doi: 10.3969/j.issn.1000-386x.2019.09.035
|
[83] |
Yuan T L, Zhu Z, Xu K, et al. Chinese text in the wild [J/OL]. arXiv preprint (2018-02-26)[2020-03-01]. https://arxiv.org/abs/1803.00085
|
[84] |
Lucas S M, Panaretos A, Sosa L, et al. ICDAR 2003 robust reading competitions // Seventh International Conference on Document Analysis and Recognition. Edinburgh, 2003: 682
|
[85] |
Veit A, Matera T, Neumann L, et al. COCO-Text: dataset and benchmark for text detection and recognition in natural images [J/OL]. arXiv preprint (2016-06-19)[2020-03-01]. https://arxiv.org/abs/1601.07140
|
[86] |
Shi B G, Yao C, Liao M H, et al. ICDAR2017 competition on reading Chinese text in the wild (RCTW-17) // 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). Kyoto, 2017: 1429
|
[87] |
Ch’ng C K, Chan C S. Total-Text: a comprehensive dataset for scene text detection and recognition // 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). Kyoto, 2017: 935
|
[88] |
Nayef N, Yin F, Bizid I, et al. ICDAR2017 robust reading challenge on multi-lingual scene text detection and script identification-RRC-MLT // 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). Kyoto, 2017: 1454
|
[89] |
Chng C K, Liu Y L, Sun Y P, et al. ICDAR2019 robust reading challenge on arbitrary-shaped text-RRC-ArT // 2019 International Conference on Document Analysis and Recognition (ICDAR). Sydney, 2019: 1571
|
[90] |
Liu Y L, Jin L W, Zhang S T, et al. Detecting curve text in the wild: new dataset and new solution [J/OL]. arXiv preprint (2017-12-06)[2020-03-01]. https://arxiv.org/abs/1712.02170
|
[91] |
王建新, 王子亞, 田萱. 基于深度學習的自然場景文本檢測與識別綜述. 軟件學報, 2020, 31(5):1465
Wang J X, Wang Z Y, Tian X. Review of natural scene text detection and recognition based on deep learning. J Software, 2020, 31(5): 1465
|
[92] |
Liu Y L, Jin L W, Zhang S T, et al. Curved scene text detection via transverse and longitudinal sequence connection. Pattern Recognit, 2019, 90: 337 doi: 10.1016/j.patcog.2019.02.002
|
[93] |
彭碧發. 騰訊云大學大咖分享 | 解密OCR文字識別技術 [EB/OL] 騰訊云社區專欄 (2019-08-13) [2020-03-01]. https://cloud.tencent.com/developer/article/1473262
Peng B F. Tencent Cloud University expert talk | Demystifying OCR text recognition technology [EB/OL]. Tencent Cloud Community Column (2019-08-13) [2020-03-01]. https://cloud.tencent.com/developer/article/1473262
|
[94] |
有道智云. 文字識別OCR服務 [EB/OL]. 有道智云·AI開放平臺 (2019-12-17) [2020-03-01]. https://ai.youdao.com/product-ocr.s
Youdao Zhiyun. Text recognition OCR service [EB/OL]. Youdao Zhiyun AI Open Platform (2019-12-17) [2020-03-01]. https://ai.youdao.com/product-ocr.s
|
[95] |
百度云. 通用文字識別 [EB/OL]. 百度智能云 (2020-02-05) [2020-03-01]. https://cloud.baidu.com/product/ocr/general
Baidu Cloud. Universal text recognition [EB/OL]. Baidu Intelligent Cloud (2020-02-05) [2020-03-01]. https://cloud.baidu.com/product/ocr/general
|
[96] |
Chuangyejun. 創藍253-創藍萬數平臺圖像識別OCR技術 [EB/OL] 創藍253專欄 (2018-07-19) [2020-03-01]. https://blog.csdn.net/chuangyejun/article/details/81113833
Chuangyejun. Chuanglan 253: the image recognition OCR technology of the Chuanglan Myriads platform [EB/OL]. Chuanglan 253 Column (2018-07-19) [2020-03-01]. https://blog.csdn.net/chuangyejun/article/details/81113833
|
[97] |
ZJULearning. Pixel_link [EB/OL]. GitHub (2019-11-21) [2020-03-01]. https://github.com/ZJULearning/pixel_link
|
[98] |
Huoyijie. AdvancedEAST [EB/OL]. GitHub (2020-04-03) [2020-03-01]. https://github.com/huoyijie/AdvancedEAST
|
[99] |
Dengdan. Seglink [EB/OL]. GitHub (2018-05-03) [2020-03-01]. https://github.com/dengdan/seglink
|
[100] |
Tianzhi0549. CTPN [EB/OL]. GitHub (2020-04-03) [2020-03-01]. https://github.com/tianzhi0549/CTPN
|
[101] |
Tian Z, Huang W L, He T, et al. Detecting text in natural image with connectionist text proposal network // ECCV 2016: European Conference on Computer Vision. Amsterdam, 2016: 56
|