Small-pixel targets in video images are difficult to detect. To address small-pixel targets in urban road video, this paper proposes a detection method named Road_Net, based on the YOLOv3 convolutional neural network. First, a new convolutional neural network, Road_Net, is designed on the basis of an improved YOLOv3. Second, because the detection of small-pixel targets depends on shallow-level features, detection is performed at four scales. Finally, the improved M-Softer-NMS algorithm is applied to obtain higher detection accuracy for targets in the image. To verify the effectiveness of the proposed algorithm, a dataset named the Road-garbage Dataset was collected and labeled for small-pixel object detection on urban roads. The experimental results show that the algorithm can effectively detect objects such as paper scraps and stones, which occupy only a few pixels in the video relative to the road surface.
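To make two of the ideas in the abstract concrete, the following is a minimal Python sketch and not the authors' implementation. The first helper shows why a fourth detection scale helps with small targets: stock YOLOv3 predicts at strides 32, 16 and 8, and the extra stride of 4 used here is an assumption, since the abstract does not state it. The second part implements plain Gaussian Soft-NMS (Bodla et al.), the family of methods that Softer-NMS and the paper's M-Softer-NMS build on; it is shown only as a baseline, and the function names and the `sigma` and `score_threshold` defaults are illustrative choices.

```python
import math


def grid_sizes(input_size, strides=(32, 16, 8, 4)):
    """Grid side length at each detection scale.

    Strides 32/16/8 are the stock YOLOv3 scales; the stride of 4 is an
    ASSUMED fourth, shallower scale (not confirmed by the abstract).
    """
    return [input_size // s for s in strides]


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS baseline (not the paper's M-Softer-NMS).

    Overlapping detections have their scores decayed instead of being
    deleted. Returns (original_index, rescored_confidence) pairs in the
    order in which the boxes were selected.
    """
    candidates = [(i, float(s)) for i, s in enumerate(scores)]
    kept = []
    while candidates:
        # Select the remaining detection with the highest current score.
        pos = max(range(len(candidates)), key=lambda k: candidates[k][1])
        best_idx, best_score = candidates.pop(pos)
        if best_score < score_threshold:
            break  # everything left scores even lower
        kept.append((best_idx, best_score))
        # Decay every remaining score by its overlap with the selected box.
        candidates = [
            (i, s * math.exp(-(iou(boxes[best_idx], boxes[i]) ** 2) / sigma))
            for i, s in candidates
        ]
    return kept


if __name__ == "__main__":
    print(grid_sizes(416))
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    print(soft_nms(demo_boxes, [0.9, 0.8, 0.7]))
```

With a 416-pixel input, the assumed stride-4 scale yields a 104 x 104 grid, so each cell covers a 4 x 4 pixel patch instead of the 8 x 8 patch of the finest stock scale. In the Soft-NMS demo, the box that heavily overlaps the top detection is kept with a reduced score rather than discarded, which is the behaviour that distinguishes soft suppression from hard NMS.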
Object detection for small pixel in urban roads videos
First published: Sep 30, 2019
2018 Anhui Key Research and Development Plan Project (1804a09020049)
Citation: Jin Yao, Zhang Rui, Yin Dong. Object detection for small pixel in urban roads videos[J]. Opto-Electronic Engineering, 2019, 46(9): 190053.