基于神经网络的车辆识别代号识别方法

孟凡俊; 尹东

doi:10.12086/oee.2021.200094

基于神经网络的车辆识别代号识别方法

孟凡俊^1,2,,
尹东^1,2, ,

- 1.
  中国科学技术大学信息科学技术学院，安徽合肥 230027
- 2.
  中国科学院电磁空间信息重点实验室，安徽合肥 230027
基金项目:
安徽省2018年度重点研究与开发计划项目(1804a09020049)

详细信息

作者简介:
孟凡俊(1994-)，男，硕士研究生，主要从事图像处理的研究。E-mail：fanjunm@mail.ustc.edu.cn

通讯作者: 尹东(1965-)，男，副教授，主要从事图像处理的研究。E-mail：yindong@ustc.edu.cn

中图分类号: TP391.41

收稿日期: 2020-03-20

修回日期: 2020-06-28

刊出日期: 2021-01-15

Vehicle identification number recognition based on neural network

Meng Fanjun^1,2,,
Yin Dong^1,2, ,

- 1.
  School of Information Science Technology, University of Science and Technology of China, Hefei, Anhui 230027, China
- 2.
  Key Laboratory of Electromagnetic Space Information, Chinese Academy of Sciences, Hefei, Anhui 230027, China
Fund Project: Key Research and Development Projects of Anhui Province (804a09020049)

More Information

Corresponding author: Yin Dong, E-mail: yindong@ustc.edu.cn

Received Date 20 March 2020

Revised Date 28 June 2020

Published Date 15 January 2021

摘要

摘要:
在车辆识别和车辆年检时，正确识别车架上金属刻印的车辆识别代号(VIN)是非常重要的环节。针对VIN序列，本文提出了一种基于神经网络的旋转VIN图片识别方法，它由VIN检测和VIN识别两部分组成。首先，在EAST算法基础上利用轻量级神经网络提取特征，并结合文本分割实现快速、准确的VIN检测；其次，将VIN识别任务作为一个序列分类问题，提出了一种新的识别VIN方法，即通过位置相关的序列分类器，预测出最终的车辆识别代号。为了验证本文的识别方法，引入了一个VIN数据集，其中包含用于检测的原始旋转VIN图像和用于识别的水平VIN图像。实验结果表明，本文方法能有效地识别车架VIN图片，同时达到了实时性。
- 车辆识别代号 /
- 神经网络 /
- 文本分割 /
- 机器视觉
Abstract:
It is far essential to properly recognize the vehicle identification number (VIN) engraved on the car frame for vehicle surveillance and identification. In this paper, we propose an algorithm for recognizing rotational VIN images based on neural network which incorporates two components: VIN detection and VIN recognition. Firstly, with lightweight neural network and text segmentation based on EAST, we attain efficient and excellent VIN detection performance. Secondly, the VIN recognition is regarded as a sequence classification problem. By means of connecting sequential classifiers, we predict VIN characters directly and precisely. For validating our algorithm, we collect a VIN dataset, which contains raw rotational VIN images and horizontal VIN images. Experimental results show that the algorithm we proposed achieves good performance on VIN detection and VIN recognition in real time.
- vehicle identification number /
- neural network /
- text segmentation /
- machine vision

Overview

Overview: It is far essential to properly recognize the vehicle identification number (VIN) engraved on the car frame for car surveillance and vehicle identification. Vehicle identification number is unique globally, which is assigned by car manufacturers to a car for identifying it. The vehicle identification number is usually engraved on the metallic car frame which is uneasy to tamper with, so it is of great significance for vehicle annual surveillance and vehicle identification. Although many important achievements have been made in text recognition, especially the wide application of OCR in document recognition in images, the complex background, arbitrary angle and fuzzy font of the engraved text in the images have made it difficult to identify the vehicle identification number automatically. In vehicle identification and annual car inspection, a large number of VIN pictures need to be manually reviewed every day, which is very inefficient. With the application of deep learning, we can make use of deep learning to accelerate this process, improve the efficiency of auditing greatly, and realize automated auditing. We introduce an algorithm for recognizing vehicle identification number in images based on neural network, which incorporates two components: VIN detection and VIN recognition. Firstly, in the VIN detection part, the lightweight Network is used as feature extraction network in order to accelerate the inference speed and reduce the network cost. Combined with FCN and FPN, the network is able to adapt to any size of input images and focus on the distribution difference between foreground text pixels and background pixels. In order to improve the performance on rotational VIN, the images are rotated at any angle lossless in the training stage to augment datasets. Secondly, in the VIN recognition stage, we take VIN recognition task as a sequence classification problem, using VGGNet as the feature extraction network, and the final vehicle identification number sequence is predicted through the position-related sequential classifier without character segmentation to simplify the recognition processing. Also, the text direction in images may be reversed in dataset, and in order to solve the situation, picture is rotated at 180 degrees randomly in network training. Finally, we introduce a VIN dataset, which contains raw rotational VIN images and horizontal VIN images for validating our algorithm, and all of our experiments are conducted on the dataset. Experimental results show that the algorithm we proposed can detect and recognize the VIN text in images efficiently in real time.

HTML全文

图 1 复杂背景以及任意角度方向的车辆识别代号图片

Figure 1. Samples of vehicle VIN images in complex background and with arbitrary orientation

下载: 全尺寸图片幻灯片

图 2 基于神经网络的VIN识别框架

Figure 2. The overall architecture of our proposed VIN recognition algorithm

下载: 全尺寸图片幻灯片

图 3 VIN检测网络基本框架

Figure 3. Illustration of the VIN detection architecture

下载: 全尺寸图片幻灯片

图 4 VIN检测网络中的后处理以及输出

Figure 4. Post-processing and output of VIN detection network

下载: 全尺寸图片幻灯片

图 5 VIN识别算法网络。(a) 网络整体框架；(b) VGGNet内部结构

Figure 5. Illustration of VIN recognition network. (a) Overall framework; (b) VGGNet network

下载: 全尺寸图片幻灯片

图 6 任意方向旋转图片和文本框标签

Figure 6. Rotating VIN images and labels at any angle to augment dataset

下载: 全尺寸图片幻灯片

图 7 VIN训练数据集

Figure 7. Samples of VIN dataset

下载: 全尺寸图片幻灯片

图 8 实际复杂环境下VIN图片识别效果

Figure 8. Samples of detection and recognition results of our algorithm on complicated VIN images

下载: 全尺寸图片幻灯片

图 9 实际复杂环境下检测和识别效果不好的图片

Figure 9. Samples of failed detection and recognition on complicated VIN images

下载: 全尺寸图片幻灯片

表 1 在VIN数据集上不同方法的检测效果

Table 1. Detection performance comparison of different algorithms on VIN dataset

Methods	FPS(frame/s)	Precision/%	Recall/%
PSENet^[28]	1	86.4	92.2
EAST	20	94.3	95.7
FOTS^[29]	24	89.9	93.6
OURS	33	98.5	98.5

下载: 导出CSV

表 2 在VIN数据集上不同方法的识别效果

Table 2. Recognition performance comparison of two algorithms on VIN dataset

Methods	FPS(frame/s)	F/%	A/%
Tesseract(320×32)^[30]	6	4.47	11.96
CRNN(320×32)	40	43.61	83.44
ASTER(320×32)^[31]	33	44.25	85.19
OURS(320×32)	122	61.47	93.28

下载: 导出CSV

表 3 在不同尺寸输入图片大小上的识别效果

Table 3. Recognition performance comparison of different input size

Input size	FPS(frame/s)	F/%	A/%
200×20+aug	134	47.53	89.66
200×20	134	41.21	85.73
320×32+aug	122	61.47	93.28
320×32	122	55.15	91.90
400×40+aug	94	66.29	95.12
400×40	94	58.70	91.42
500×50+aug	88	64.73	95.71
500×50	88	58.11	92.37

下载: 导出CSV

参考文献(31)

[1]	Smith R. An overview of the Tesseract OCR engine[C]// Proceedings of the 9th International Conference on Document Analysis and Recognition (ICDAR 2007), 2007: 629–633.
[2]	Mori S, Suen C Y, Yamamoto K. Historical review of OCR research and development[J]. Proc IEEE, 1992, 80(7): 1029–1058. doi: 10.1109/5.156468
[3]	Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[Z]. arXiv: 1409.1556, 2014.
[4]	He K M, Zhang X Y, Ren S Q, et al. Deep residual learning for image recognition[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016: 770–778.
[5]	唐有宝, 卜巍, 邬向前. 多层次MSER自然场景文本检测[J]. 浙江大学学报(工学版), 2016, 50(6): 1134–1140. http://www.cnki.com.cn/article/cjfdtotal-zdzc201606017.htm Tang Y B, Bu W, Wu X Q. Natural scene text detection based on multi-level MSER[J]. J Zhejiang Univ (Eng Sci), 2016, 50(6): 1134–1140. http://www.cnki.com.cn/article/cjfdtotal-zdzc201606017.htm
[6]	蒋弘毅, 朱丽平, 欧樟鹏. 基于MSER和Tesseract的自然场景图像文字识别[J]. 电脑知识与技术, 2017, 13(33): 213–216. http://www.cnki.com.cn/Article/CJFDTotal-DNZS201733095.htm Jiang H Y, Zhu L P, Ou Z P. Text recognition of natural scene image based on MSER and Tesseract[J]. Comput Knowl Technol, 2017, 13(33): 213–216. http://www.cnki.com.cn/Article/CJFDTotal-DNZS201733095.htm
[7]	张开玉, 邵康一, 卢迪. MSER快速自然场景倾斜文本定位算法[J]. 哈尔滨理工大学学报, 2019, 24(2): 81–88. http://www.cnki.com.cn/Article/CJFDTotal-HLGX201902013.htm Zhang K Y, Shao K Y, Lu D. MSER fast skewed scene-text location algorithm[J]. J Harbin Univ Sci Technol, 2019, 24(2): 81–88. http://www.cnki.com.cn/Article/CJFDTotal-HLGX201902013.htm
[8]	张国和, 黄凯, 张斌, 等. 最大稳定极值区域与笔画宽度变换的自然场景文本提取方法[J]. 西安交通大学学报, 2017, 51(1): 135–140. http://www.cnki.com.cn/Article/CJFDTotal-XAJT201701021.htm Zhang G H, Huang K, Zhang B, et al. A natural scene text extraction method based on the maximum stable extremal region and stroke width transform[J]. J Xi'an Jiaotong Univ, 2017, 51(1): 135–140. http://www.cnki.com.cn/Article/CJFDTotal-XAJT201701021.htm
[9]	南阳, 白瑞林, 李新. 卷积神经网络在喷码字符识别中的应用[J]. 光电工程, 2015, 42(4): 38–43. doi: 10.3969/j.issn.1003-501X.2015.04.007 Nan Y, Bai R L, Li X. Application of convolutional neural network in printed code characters recognition[J]. Opto-Electron Eng, 2015, 42(4): 38–43. doi: 10.3969/j.issn.1003-501X.2015.04.007
[10]	Wang K, Belongie S. Word spotting in the wild[C]//Proceedings of the 11th European Conference on Computer Vision, 2010, 6311: 591–604.
[11]	Wang K, Babenko B, Belongie S. End-to-end scene text recognition[C]//Proceedings of 2011 International Conference on Computer Vision, 2011: 1457–1464.
[12]	Tian Z, Huang W L, He T, et al. Detecting text in natural image with connectionist text proposal network[C]//Proceedings of the 14th European Conference on Computer Vision, 2016, 9912: 56–72.
[13]	Ren S Q, He K M, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems, 2015: 91–99.
[14]	Liao M H, Shi B G, Bai X, et al. TextBoxes: a fast text detector with a single deep neural network[Z]. arXiv: 1611.06779, 2016.
[15]	Liu W, Anguelov D, Erhan D, et al. SSD: single shot multibox detector[C]//Proceedings of the 14th European Conference on Computer Vision, 2016: 21–37.
[16]	Tian Z, Huang W L, He T, et al. Detecting text in natural image with connectionist text proposal network[C]//Proceedings of the 14th European Conference on Computer Vision, 2016, 9912: 56–72.
[17]	Liao M H, Shi B G, Bai X. TextBoxes++: a single-shot oriented scene text detector[J]. IEEE Trans Image Process, 2018, 27(8): 3676–3690. doi: 10.1109/TIP.2018.2825107
[18]	Ma J Q, Shao W Y, Ye H, et al. Arbitrary-oriented scene text detection via rotation proposals[J]. IEEE Trans Multimed, 2018, 20(11): 3111–3122. doi: 10.1109/TMM.2018.2818020
[19]	Zhou X Y, Yao C, Wen H, et al. EAST: an efficient and accurate scene text detector[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017: 2642–2651.
[20]	Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015: 3431–3440.
[21]	Deng D, Liu H, Li X, et al. PixelLink: detecting scene text via instance segmentation[Z]. arXiv: 1801.01315, 2018.
[22]	Shi B G, Bai X, Yao C. An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition[J]. I IEEE Trans Pattern Anal Mach Intell, 2017, 39(11): 2298–2304. doi: 10.1109/TPAMI.2016.2646371
[23]	Graves A, Fernández S, Gomez F, et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks[C]//Proceedings of the 23rd International Conference on Machine Learning, 2006: 369–376.
[24]	Lee C Y, Osindero S. Recursive recurrent nets with attention modeling for OCR in the wild[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016: 2231–2239.
[25]	Sandler M, Howard A, Zhu M L, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018: 4510–4520.
[26]	Lin T Y, Dollár P, Girshick R, et al. Feature pyramid networks for object detection[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017: 936–944.
[27]	Milletari F, Navab N, Ahmadi S A. V-Net: fully convolutional neural networks for volumetric medical image segmentation[C]// Proceedings of the 4th International Conference on 3D Vision (3DV), 2016: 565–571.
[28]	Li X, Wang W H, Hou W B, et al. Shape robust text detection with progressive scale expansion network[Z]. arXiv: 1806.02559, 2018.
[29]	Liu X B, Liang D, Yan S, et al. FOTS: fast oriented text spotting with a unified network[C]//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018: 5676–5685.
[30]	Thakare S, Kamble A, Thengne V, et al. Document Segmentation and Language Translation Using Tesseract-OCR[C]//2018 IEEE 13th International Conference on Industrial and Information Systems (ICⅡS). IEEE, 2018.
[31]	Shi B G, Yang M K, Wang X G, et al. ASTER: an attentional scene text recognizer with flexible rectification[J]. IEEE Trans Pattern Anal Mach Intell, 2019, 41(9): 2035–2048. https://pubmed.ncbi.nlm.nih.gov/29994467/

施引文献

资源附件(0)

访问统计

点击扫一扫

图(9)

表(3)

计量

文章访问数: 5186
PDF下载数: 1857
施引文献: 0

基于神经网络的车辆识别代号识别方法

作者简介:
孟凡俊(1994-)，男，硕士研究生，主要从事图像处理的研究。E-mail：fanjunm@mail.ustc.edu.cn

通讯作者: 尹东(1965-)，男，副教授，主要从事图像处理的研究。E-mail：yindong@ustc.edu.cn

Vehicle identification number recognition based on neural network

Corresponding author: Yin Dong, E-mail: yindong@ustc.edu.cn

计量

目录

作者须知

其他内容

条款和政策

基于神经网络的车辆识别代号识别方法

作者简介: 孟凡俊(1994-)，男，硕士研究生，主要从事图像处理的研究。E-mail：fanjunm@mail.ustc.edu.cn

通讯作者: 尹东(1965-)，男，副教授，主要从事图像处理的研究。E-mail：yindong@ustc.edu.cn

Vehicle identification number recognition based on neural network

Corresponding author: Yin Dong, E-mail: yindong@ustc.edu.cn

计量

出版历程

目录

作者须知

其他内容

条款和政策

作者简介:
孟凡俊(1994-)，男，硕士研究生，主要从事图像处理的研究。E-mail：fanjunm@mail.ustc.edu.cn