Infrared target detection and recognition in complex scene

Citation: Zhang R Z, Zhang J L, Qi X P, et al. Infrared target detection and recognition in complex scene[J]. Opto-Electron Eng, 2020, 47(10): 200314. doi: 10.12086/oee.2020.200314

  • Fund Project: Supported by the National High Technology Research and Development Program of China (863 Program) (G158207)
  • Corresponding author: Qi Xiaoping (1974-), male, associate research fellow, whose research focuses on high-precision localization of extended targets and on the optimization and real-time implementation of image processing algorithms. E-mail: qixiaoping@163.com
  • CLC number: TP391.41; TN219

  • Abstract: Mainstream object detection networks perform well on high-quality RGB images, but their detection performance drops noticeably when they are applied to low-resolution infrared images. To improve infrared target detection and recognition in complex scenes, this paper adopts the following measures. First, drawing on domain adaptation methods, suitable infrared image preprocessing is applied to make infrared images more similar to RGB images, so that mainstream detection networks can be applied and detection accuracy further improved (a preprocessing sketch follows below). Second, the single-stage detection network YOLOv3 is used as the base network, and the original MSE loss function is replaced with the GIOU loss function; experiments verify that detection accuracy on the public infrared dataset FLIR improves markedly. Third, to address the large span of target sizes in the FLIR dataset, an SPP module inspired by the spatial pyramid idea is added to enrich the representational power of the feature maps and enlarge their receptive field. Experiments show that the adopted methods further improve detection accuracy.
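
    To make the preprocessing step concrete, the sketch below applies the three variants compared in Fig. 1 and Table 2 (inversion, histogram equalization, denoising plus sharpening) to an 8-bit single-channel FLIR frame using OpenCV. The kernel sizes and the sharpening kernel are illustrative assumptions, not the authors' exact settings; Table 2 reports inversion as the most effective variant on FLIR.

```python
import cv2
import numpy as np

def preprocess_ir(img_u8, mode="inversion"):
    """Illustrative preprocessing of an 8-bit single-channel infrared frame.

    Modes mirror the variants compared in Fig. 1 / Table 2: inversion,
    histogram equalization, denoising + sharpening. Filter choices below
    are assumptions for illustration only.
    """
    if mode == "inversion":
        out = 255 - img_u8                              # invert gray levels
    elif mode == "hist_eq":
        out = cv2.equalizeHist(img_u8)                  # global histogram equalization
    elif mode == "denoise_sharpen":
        blur = cv2.GaussianBlur(img_u8, (3, 3), 0)      # suppress sensor noise
        kernel = np.array([[0, -1, 0],
                           [-1, 5, -1],
                           [0, -1, 0]], dtype=np.float32)  # simple sharpening kernel
        out = cv2.filter2D(blur, -1, kernel)
    else:
        out = img_u8
    # Replicate to 3 channels so an RGB-pretrained detector such as YOLOv3 accepts it.
    return cv2.merge([out, out, out])
```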

  • Overview: In recent years, with the continuous development of computer vision, the capability of deep-learning-based target detection has improved significantly. However, mainstream detection networks are trained and evaluated mostly on RGB images, and infrared target detection has received comparatively little attention. Moreover, while these networks detect targets well in high-quality RGB images, their performance drops markedly on lower-resolution infrared images. Compared with infrared images, visible images offer higher resolution and richer target detail, but under certain weather conditions visible images cannot be obtained. Infrared imaging offers long detection range, strong anti-interference capability, high measurement accuracy, all-weather and day-and-night operation, and good penetration of smoke, so it has been widely adopted since its introduction, and the demand for infrared target detection is correspondingly urgent.

    To improve infrared target detection in complex scenes, this paper adopts the following measures. First, drawing on domain adaptation, appropriate infrared image preprocessing is applied to make infrared images more similar to RGB images, so that mainstream detection networks can be applied and detection accuracy further improved. Second, the mean square error (MSE) loss treats each coordinate of the bounding box as an independent variable and ignores the integrity of the box, and ln-norm losses are sensitive to object scale; the algorithm therefore builds on the single-stage detector YOLOv3 and replaces the original MSE loss with the GIOU loss. Experiments verify that detection accuracy on the public infrared dataset FLIR improves significantly and that the inaccurate localization of the original network is effectively alleviated. Third, to address the large span of target sizes in the FLIR dataset, an SPP module inspired by the spatial pyramid idea is added to enrich the representation of the feature maps and enlarge their receptive field. Experiments show that the detection error rate decreases after the SPP module is added and that, with this deficiency of YOLOv3 overcome, detection accuracy improves further compared with modifying the loss function alone.
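
    For reference, the sketch below computes the GIOU loss for a single pair of axis-aligned boxes following the standard definition (IoU penalized by the empty portion of the smallest enclosing box). The (x1, y1, x2, y2) box format and the epsilon guard are assumptions for illustration, not the paper's implementation.

```python
def giou_loss(box_p, box_g, eps=1e-9):
    """GIOU loss for a pair of axis-aligned boxes in (x1, y1, x2, y2) format.

    GIOU = IoU - (|C| - |P union G|) / |C|, where C is the smallest box
    enclosing the predicted box P and the ground-truth box G. The loss
    1 - GIOU keeps a useful gradient even when the boxes do not overlap,
    unlike MSE on the raw coordinates.
    """
    # Areas of the two boxes.
    area_p = (box_p[2] - box_p[0]) * (box_p[3] - box_p[1])
    area_g = (box_g[2] - box_g[0]) * (box_g[3] - box_g[1])

    # Intersection and union.
    iw = max(0.0, min(box_p[2], box_g[2]) - max(box_p[0], box_g[0]))
    ih = max(0.0, min(box_p[3], box_g[3]) - max(box_p[1], box_g[1]))
    inter = iw * ih
    union = area_p + area_g - inter

    # Smallest enclosing box C.
    cw = max(box_p[2], box_g[2]) - min(box_p[0], box_g[0])
    ch = max(box_p[3], box_g[3]) - min(box_p[1], box_g[1])
    area_c = cw * ch

    iou = inter / (union + eps)
    giou = iou - (area_c - union) / (area_c + eps)
    return 1.0 - giou
```

    Identical boxes give a loss of 0, while widely separated boxes drive the loss toward 2, so the regression target remains informative even without overlap.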

    Figure 1.  Different infrared images (one per row) from the FLIR dataset. (a) Original image; (b) Inversion; (c) Histogram equalization; (d) Denoising + image sharpening

    Figure 2.  Modified YOLOv3 network structure

    Figure 3.  SPP module
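
    For reference, the sketch below shows a YOLOv3-SPP style spatial pyramid pooling block: the input feature map is concatenated with max-pooled copies of itself at several kernel sizes, with stride 1 and matching padding so the spatial size is preserved while the receptive field grows. The kernel sizes (5, 9, 13) are the values commonly used in YOLOv3-SPP and are assumed here; the paper's exact configuration may differ.

```python
import torch
import torch.nn as nn

class SPP(nn.Module):
    """Spatial pyramid pooling block (YOLOv3-SPP style), a sketch.

    The input is concatenated with max-pooled copies of itself at several
    kernel sizes; stride 1 and padding k // 2 keep the spatial size unchanged,
    so only the channel dimension grows (here by 4x).
    """
    def __init__(self, kernel_sizes=(5, 9, 13)):
        super().__init__()
        self.pools = nn.ModuleList(
            [nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)
             for k in kernel_sizes]
        )

    def forward(self, x):
        # Concatenate the original map with its pooled variants along channels.
        return torch.cat([x] + [pool(x) for pool in self.pools], dim=1)
```

    In common YOLOv3-SPP configurations a 512-channel input yields a 2048-channel output, which is then reduced back with a 1×1 convolution before the detection head.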

    Figure 4.  Algorithm flow chart

    Figure 5.  Detection speed and accuracy results of different networks for (a) all categories; (b) cars; (c) persons; (d) bicycles

    Figure 6.  (a) YOLOv3 detection results; (b) Ground truth; (c) Detection results of YOLOv3 with the GIOU loss function; (d) Detection results of YOLOv3 with the GIOU loss function and the SPP module

    Table 1.  Detection results on the FLIR infrared dataset trained from ImageNet and MS COCO pretrained weights

    Dataset     mAP/%
    ImageNet    49.86
    MS COCO     58.02

    Table 2.  Detection results of the YOLOv3 network trained with different preprocessing methods

    Preprocessing method       mAP/%
    Original image             58.02
    Inversion                  60.71
    Histogram equalization     57.26
    Denoising + sharpening     57.42

    Table 3.  Detection results of different frameworks on the FLIR dataset. The Faster R-CNN IOU threshold is 0.3, and the YOLOv3 IOU threshold is 0.6

    Detection method          AP/%(car)  AP/%(person)  AP/%(bicycle)  mAP/%  Test time/s
    Faster R-CNN (VGG16)      74.15      62.14         43.58          59.96  0.070
    Faster R-CNN (Res101)     77.96      65.72         45.86          63.18  0.163
    YOLOv3                    77.69      57.47         39.74          58.02  0.026
    Ours (YOLOv3+GIOU)        79.30      68.10         31.90          59.70  0.042
    Ours (YOLOv3+GIOU+SPP)    81.90      72.60         49.00          66.80  0.039


Publication history
  • Received: 2020-08-20
  • Revised: 2020-09-22
  • Published: 2020-10-15
