Citation: Zhao C M, Chen Z B, Zhang J L. Application of aircraft target tracking based on deep learning[J]. Opto-Electronic Engineering, 2019, 46(9): 180261. doi: 10.12086/oee.2019.180261
Overview: Deep learning has achieved good results in image classification, semantic segmentation, object detection, and object recognition. In object tracking, however, it is still constrained by small training sample sets. Object tracking is one of the most important research topics in computer vision and has a wide range of applications. Its difficulty lies in complex conditions such as target rotation, multiple targets, blurred targets, cluttered backgrounds, scale change, occlusion, and fast motion. For target tracking, this paper proposes Siamese-MF (multi-feature Siamese networks), an improved convolutional network based on Siamese-FC (fully-convolutional Siamese networks). A tracking network has to balance speed against accuracy, so reducing computational complexity and enlarging the receptive field of the convolutional features are the directions taken to improve both. The classical convolutional network structure is modified in two ways: 1) feature fusion is introduced to enrich the features; 2) dilated convolution is introduced to reduce computational complexity and enlarge the receptive field. The improved convolutional layers serve as the feature extraction layers, and a fully convolutional layer computes the correlation between the target and the search area, so that the location of the tracked target can be read from the correlation map. The Siamese-MF algorithm tracks targets in complex scenes accurately and in real time. On OTB2015, the average speed reaches 76 f/s, the mean overlap reaches 0.44, and the mean precision reaches 0.61, which meets the requirements of real-time target tracking applications. For the target tracking in this paper, the Siamese-MF networks are trained using the five convolutional layers Conv1~Conv5 of AlexNet and the two connection layers Skip1~Skip2 to extract the target features.
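Two of the operations named above, dilated convolution and the correlation between target and search-area features, can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the function names `effective_kernel_size` and `cross_correlate`, the array shapes, and the plain sliding-window loop are assumptions chosen for readability.

```python
import numpy as np


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial span covered by a dilated kernel: k + (k - 1) * (d - 1).

    The number of weights stays k * k, so the receptive field grows
    without adding multiplications per output position.
    """
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def cross_correlate(search: np.ndarray, template: np.ndarray) -> np.ndarray:
    """Slide template features (C, h, w) over search features (C, H, W).

    Returns a score map of shape (H - h + 1, W - w + 1); each entry is the
    inner product of the template with one window of the search features.
    """
    channels, h, w = template.shape
    search_channels, big_h, big_w = search.shape
    if channels != search_channels:
        raise ValueError("template and search must have the same channel count")
    scores = np.empty((big_h - h + 1, big_w - w + 1))
    for i in range(scores.shape[0]):
        for j in range(scores.shape[1]):
            window = search[:, i:i + h, j:j + w]
            scores[i, j] = np.sum(window * template)
    return scores


if __name__ == "__main__":
    # A 3x3 kernel with dilation 2 spans a 5x5 area.
    print(effective_kernel_size(3, 2))

    # Hypothetical feature maps: the template is pasted into an empty search map.
    rng = np.random.default_rng(0)
    template = rng.standard_normal((2, 3, 3))
    search = np.zeros((2, 8, 8))
    search[:, 2:5, 3:6] = template
    print(cross_correlate(search, template).shape)
```

In a Siamese tracker the same feature extractor is applied to both the target patch and the search area, so the score map peaks where the search features best match the template.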
During tracking, the trained networks are used as feed-forward networks: the position of the maximum score in the output is taken as the target location, and the template is updated over time as the sequence progresses. The tracking result also adapts to changes in target scale.
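The tracking step described here can be sketched as follows. The source does not give the update rule or the scale-search procedure, so the running-average update in `update_template`, its `rate` parameter, and the best-peak selection in `pick_scale` are assumptions used only for illustration.

```python
import numpy as np


def locate_peak(score_map: np.ndarray) -> tuple:
    """Return the (row, col) of the maximum score, taken as the target location."""
    row, col = np.unravel_index(np.argmax(score_map), score_map.shape)
    return int(row), int(col)


def update_template(template: np.ndarray, new_features: np.ndarray,
                    rate: float = 0.1) -> np.ndarray:
    """Blend the current template with features from the newest frame.

    Assumed rule: a linear running average; `rate` is a hypothetical parameter.
    """
    return (1.0 - rate) * template + rate * new_features


def pick_scale(score_maps: list, scales: list) -> tuple:
    """Choose the candidate scale whose score map has the highest peak.

    Returns (best_scale, (row, col)) for that score map.
    """
    peaks = [float(np.max(m)) for m in score_maps]
    best = int(np.argmax(peaks))
    return scales[best], locate_peak(score_maps[best])


if __name__ == "__main__":
    scores = np.zeros((5, 5))
    scores[3, 1] = 2.0
    print(locate_peak(scores))
```

Evaluating a few rescaled search areas per frame and keeping the strongest response is one common way to make a correlation-based tracker follow scale changes.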
Feedforward network of Siamese-MF
Tracking process of Siamese-MF
Tracking results of Siamese-MF on OTB2015
Quantitative evaluation index analysis. (a) Overlap; (b) Precision; (c) Speed
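The overlap and precision values reported for OTB2015 are conventionally computed per frame from the predicted and ground-truth boxes. The sketch below assumes boxes given as (x, y, width, height) and a 20-pixel centre-error threshold, which is the value commonly used with OTB; the source does not state which threshold was applied, and the function names are illustrative.

```python
import math


def iou(a: tuple, b: tuple) -> float:
    """Overlap of two (x, y, w, h) boxes: intersection area over union area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def center_error(a: tuple, b: tuple) -> float:
    """Euclidean distance between the centres of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return math.hypot((ax + aw / 2) - (bx + bw / 2),
                      (ay + ah / 2) - (by + bh / 2))


def precision(preds: list, truths: list, threshold: float = 20.0) -> float:
    """Fraction of frames whose centre error is within `threshold` pixels."""
    if not preds or len(preds) != len(truths):
        raise ValueError("need equally long, non-empty box lists")
    hits = sum(center_error(p, t) <= threshold for p, t in zip(preds, truths))
    return hits / len(preds)


def mean_overlap(preds: list, truths: list) -> float:
    """Average per-frame overlap over a sequence."""
    if not preds or len(preds) != len(truths):
        raise ValueError("need equally long, non-empty box lists")
    return sum(iou(p, t) for p, t in zip(preds, truths)) / len(preds)
```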