Citation: | Li M, Chen N, An W, et al. Dual view fusion detection method for event camera detection of unmanned aerial vehicles[J]. Opto-Electron Eng, 2024, 51(11): 240208. doi: 10.12086/oee.2024.240208 |
[1] | Bouguettaya A, Zarzour H, Kechida A, et al. Vehicle detection from UAV imagery with deep learning: a review[J]. IEEE Trans Neural Netw Learn Syst, 2022, 33(11): 6047−6067. doi: 10.1109/TNNLS.2021.3080276 |
[2] | 陈海永, 刘登斌, 晏行伟. 基于IDOU-YOLO的红外图像无人机目标检测算法[J]. 应用光学, 2024, 45(4): 723−731. Chen H Y, Liu D B, Yan X W. Infrared image UAV target detection algorithm based on IDOU-YOLO[J]. Journal of Applied Optics, 2024, 45(4): 723−731. |
[3] | 韩江涛, 谭凯, 张卫国, 等. 协同随机森林方法和无人机LiDAR空谱数据的盐沼植被“精灵圈”识别[J]. 光电工程, 2024, 51(3): 230188. doi: 10.12086/oee.2024.230188 Han J T, Tan K, Zhang W G, et al. Identification of salt marsh vegetation “fairy circles” using random forest method and spatial-spectral data of unmanned aerial vehicle LiDAR[J]. Opto-Electron Eng, 2024, 51(3): 230188. doi: 10.12086/oee.2024.230188 |
[4] | Park S, Choi Y. Applications of unmanned aerial vehicles in mining from exploration to reclamation: a review[J]. Minerals, 2020, 10(8): 663. doi: 10.3390/min10080663 |
[5] | Sziroczak D, Rohacs D, Rohacs J. Review of using small UAV based meteorological measurements for road weather management[J]. Prog Aerosp Sci, 2022, 134: 100859. doi: 10.1016/j.paerosci.2022.100859 |
[6] | Wang Z Y, Gao Q, Xu J B, et al. A review of UAV power line inspection[C]//Proceedings of 2020 International Conference on Guidance, Navigation and Control, Tianjin, 2022: 3147–3159. https://doi.org/10.1007/978-981-15-8155-7_263. |
[7] | Khan A, Gupta S, Gupta S K. Emerging UAV technology for disaster detection, mitigation, response, and preparedness[J]. J Field Robot, 2022, 39(6): 905−955. doi: 10.1002/rob.22075 |
[8] | Li Y, Liu M, Jiang D D. Application of unmanned aerial vehicles in logistics: a literature review[J]. Sustainability, 2022, 14(21): 14473. doi: 10.3390/su142114473 |
[9] | Mademlis I, Mygdalis V, Nikolaidis N, et al. High-level multiple-UAV cinematography tools for covering outdoor events[J]. IEEE Trans Broadcast, 2019, 65(3): 627−635. doi: 10.1109/TBC.2019.2892585 |
[10] | 奚玉鼎, 于涌, 丁媛媛, 等. 一种快速搜索空中低慢小目标的光电系统[J]. 光电工程, 2018, 45(4): 170654. doi: 10.12086/oee.2018.170654 Xi Y D, Yu Y, Ding Y Y, et al. An optoelectronic system for fast search of low slow small target in the air[J]. Opto-Electron Eng, 2018, 45(4): 170654. doi: 10.12086/oee.2018.170654 |
[11] | 张润梅, 肖钰霏, 贾振楠, 等. 改进YOLOv7的无人机视角下复杂环境目标检测算法[J]. 光电工程, 2024, 51(5): 240051. doi: 10.12086/oee.2024.240051 Zhang R M, Xiao Y F, Jia Z N, et al. Improved YOLOv7 algorithm for target detection in complex environments from UAV perspective[J]. Opto-Electron Eng, 2024, 51(5): 240051. doi: 10.12086/oee.2024.240051 |
[12] | 陈旭, 彭冬亮, 谷雨. 基于改进YOLOv5s的无人机图像实时目标检测[J]. 光电工程, 2022, 49(3): 210372. doi: 10.12086/oee.2022.210372 Chen X, Peng D L, Gu Y. Real-time object detection for UAV images based on improved YOLOv5s[J]. Opto-Electron Eng, 2022, 49(3): 210372. doi: 10.12086/oee.2022.210372 |
[13] | 张明淳, 牛春晖, 刘力双, 等. 用于无人机探测系统的红外小目标检测算法[J]. 激光技术, 2024, 48(1): 114−120. doi: 10.7510/jgjs.issn.1001-3806.2024.01.018 Zhang M C, Niu C H, Liu L S, et al. Infrared small target detection algorithm for UAV detection system[J]. Laser Technol, 2024, 48(1): 114−120. doi: 10.7510/jgjs.issn.1001-3806.2024.01.018 |
[14] | Sedunov A, Haddad D, Salloum H, et al. Stevens drone detection acoustic system and experiments in acoustics UAV tracking[C]//Proceedings of 2019 IEEE International Symposium on Technologies for Homeland Security, Woburn, 2019: 1–7. https://doi.org/10.1109/HST47167.2019.9032916. |
[15] | Chiper F L, Martian A, Vladeanu C, et al. Drone detection and defense systems: survey and a software-defined radio-based solution[J]. Sensors, 2022, 22(4): 1453. doi: 10.3390/s22041453 |
[16] | de Quevedo Á D, Urzaiz F I, Menoyo J G, et al. Drone detection and radar-cross-section measurements by RAD-DAR[J]. IET Radar Sonar Navig, 2019, 13(9): 1437−1447. doi: 10.1049/iet-rsn.2018.5646 |
[17] | Gallego G, Delbrück T, Orchard G, et al. Event-based vision: a survey[J]. IEEE Trans Pattern Anal Mach Intell, 2020, 44(1): 154−180. doi: 10.1109/TPAMI.2020.3008413 |
[18] | Shariff W, Dilmaghani M S, Kielty P, et al. Event cameras in automotive sensing: a review[J]. IEEE Access, 2024, 12: 51275−51306. doi: 10.1109/ACCESS.2024.3386032 |
[19] | Paredes-Vallés F, Scheper K Y W, De Croon G C H E. Unsupervised learning of a hierarchical spiking neural network for optical flow estimation: from events to global motion perception[J]. IEEE Trans Pattern Anal Mach Intell, 2020, 42(8): 2051−2064. doi: 10.1109/TPAMI.2019.2903179 |
[20] | Cordone L, Miramond B, Thierion P. Object detection with spiking neural networks on automotive event data[C]// Proceedings of 2022 International Joint Conference on Neural Networks, Padua, 2022: 1–8. https://doi.org/10.1109/IJCNN55064.2022.9892618. |
[21] | Li Y J, Zhou H, Yang B B, et al. Graph-based asynchronous event processing for rapid object recognition[C]//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision, Montreal, 2021: 914–923. https://doi.org/10.1109/ICCV48922.2021.00097. |
[22] | Schaefer S, Gehrig D, Scaramuzza D. AEGNN: asynchronous event-based graph neural networks[C]//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, 2022: 12361–12371. https://doi.org/10.1109/CVPR52688.2022.01205. |
[23] | Jiang Z Y, Xia P F, Huang K, et al. Mixed frame-/event-driven fast pedestrian detection[C]// Proceedings of 2019 International Conference on Robotics and Automation, Montreal, 2019: 8332–8338. https://doi.org/10.1109/ICRA.2019.8793924. |
[24] | Lagorce X, Orchard G, Galluppi F, et al. HOTS: a hierarchy of event-based time-surfaces for pattern recognition[J]. IEEE Trans Pattern Anal Mach Intell, 2017, 39(7): 1346−1359. doi: 10.1109/TPAMI.2016.2574707 |
[25] | Zhu A, Yuan L Z, Chaney K, et al. Unsupervised event-based learning of optical flow, depth, and egomotion[C]//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, 2019: 989–997. https://doi.org/10.1109/CVPR.2019.00108. |
[26] | Wang D S, Jia X, Zhang Y, et al. Dual memory aggregation network for event-based object detection with learnable representation[C]//Proceedings of the 37th AAAI Conference on Artificial Intelligence, Washington, 2023: 2492–2500. https://doi.org/10.1609/aaai.v37i2.25346. |
[27] | Li J N, Li J, Zhu L, et al. Asynchronous spatio-temporal memory network for continuous event-based object detection[J]. IEEE Trans Image Process, 2022, 31: 2975−2987. doi: 10.1109/TIP.2022.3162962 |
[28] | Peng Y S, Zhang Y Y, Xiong Z W, et al. GET: group event transformer for event-based vision[C]//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision, Paris, 2023: 6015–6025. https://doi.org/10.1109/ICCV51070.2023.00555. |
[29] | Chen N F Y. Pseudo-labels for supervised learning on dynamic vision sensor data, applied to object detection under ego-motion[C]//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, 2018: 644–653. https://doi.org/10.1109/CVPRW.2018.00107. |
[30] | Afshar S, Nicholson A P, van Schaik A, et al. Event-based object detection and tracking for space situational awareness[J]. IEEE Sens J, 2020, 20(24): 15117−15132. doi: 10.1109/JSEN.2020.3009687 |
[31] | Huang H M, Lin L F, Tong R F, et al. UNet 3+: a full-scale connected UNet for medical image segmentation[C]// Proceedings of ICASSP 2020–2020 IEEE International Conference on Acoustics, Speech and Signal Processing, Barcelona, 2020: 1055–1059. https://doi.org/10.1109/ICASSP40776.2020.9053405. |
[32] | Wang C Y, Bochkovskiy A, Liao H Y M. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[C]//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, 2023: 7464–7475. https://doi.org/10.1109/CVPR52729.2023.00721. |
[33] | Liu W, Anguelov D, Erhan D, et al. SSD: single shot multibox detector[C]//Proceedings of the 14th European Conference on Computer Vision, Amsterdam, 2016: 21–37. https://doi.org/10.1007/978-3-319-46448-0_2. |
[34] | Girshick R. Fast R-CNN[C]//Proceedings of 2015 IEEE International Conference on Computer Vision, Santiago, 2015: 1440–1448. https://doi.org/10.1109/ICCV.2015.169. |
[35] | Charles R Q, Su H, Kaichun M. PointNet: deep learning on point sets for 3D classification and segmentation[C]// Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, 2017: 77–85. https://doi.org/10.1109/CVPR.2017.16. |
[36] | Charles R Q, Yi L, Su H, et al. PointNet++: deep hierarchical feature learning on point sets in a metric space[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, 2017: 5105–5114. https://dl.acm.org/doi/abs/10.5555/3295222.3295263. |
[37] | Gehrig M, Scaramuzza D. Recurrent vision transformers for object detection with event cameras[C]//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, 2023: 13884–13893. https://doi.org/10.1109/CVPR52729.2023.01334. |
[38] | Peng Y S, Li H B, Zhang Y Y, et al. Scene adaptive sparse transformer for event-based object detection[C]//Proceedings of 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, 2024: 16794–16804. https://doi.org/10.1109/CVPR52733.2024.01589. |
[39] | Li B Y, Xiao C, Wang L G, et al. Dense nested attention network for infrared small target detection[J]. IEEE Trans Image Process, 2023, 32: 1745−1758. doi: 10.1109/TIP.2022.3199107 |
[40] | Dai Y M, Li X, Zhou F, et al. One-stage cascade refinement networks for infrared small target detection[J]. IEEE Trans Geosci Remote Sens, 2023, 61: 5000917. doi: 10.1109/TGRS.2023.3243062 |
With the proliferation of UAVs across industries, concerns over airspace management, public safety, and privacy have escalated. Traditional detection methods, such as acoustic, radio-frequency, and radar detection, are often costly and complex, limiting their applicability. In contrast, machine vision-based techniques, particularly event cameras, offer a cost-effective and adaptable solution. Event cameras, also known as dynamic vision sensors (DVS), capture visual information as asynchronous events, each recording the pixel location, timestamp, and polarity of an intensity change. This address-event representation (AER) mechanism enables efficient, high-speed, high-dynamic-range visual sensing. Traditional cameras capture image frames with a fixed exposure time, which makes it difficult to adapt to changing lighting conditions and leads to detection blind spots in strong light and other challenging scenes; event cameras perform well in precisely these conditions. However, existing object detection algorithms are designed for synchronous data such as image frames and cannot directly process asynchronous event streams. The proposed method therefore treats UAV detection as a 3D point cloud segmentation task, leveraging the unique properties of the event stream. This paper introduces a dual-branch network, PVNet, which integrates voxel and point feature extraction branches. The voxel branch, resembling a U-Net architecture, extracts contextual information through downsampling and upsampling stages, while the point branch uses multi-layer perceptrons (MLPs) to extract point-wise features. The two branches interact and fuse at multiple stages, enriching the feature representation. A key innovation lies in the gated fusion mechanism, which selectively aggregates features from both branches and mitigates the influence of non-informative features; ablation studies show that it outperforms simple addition or concatenation fusion. The method is evaluated on a custom dataset, Ev-UAV, containing 60 event camera recordings of UAV flights under various conditions. Evaluation metrics include intersection over union (IoU) for segmentation accuracy and mean average precision (mAP) for detection performance. Compared with frame-based methods such as YOLOv7, SSD, and Fast R-CNN, event-based methods such as RVT and SAST, and point cloud-based methods such as PointNet and PointNet++, the proposed PVNet achieves superior performance, with an mAP of 69.5% and an IoU of 70.3%, while maintaining low latency. By modelling UAV detection as 3D point cloud segmentation and leveraging the strengths of event cameras, PVNet effectively addresses the challenge of detecting small, feature-scarce UAVs in event data. The results demonstrate the potential of event cameras and the proposed fusion mechanism for real-time, accurate UAV detection in complex environments.
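The abstract frames detection as segmentation of the event stream treated as a 3D point cloud, with a voxel branch operating on a gridded version of the same events. The following is a minimal NumPy sketch of that representation step, not the authors' implementation: the field names (`x`, `y`, `t`), grid size, and count-based voxel features are illustrative assumptions.

```python
import numpy as np

def events_to_point_cloud(events, t0):
    """Treat each event (x, y, t) as a 3D point; polarity can ride along as a feature."""
    return np.stack(
        [events["x"], events["y"], events["t"] - t0], axis=1
    ).astype(np.float32)  # (N, 3) point cloud

def voxelize(points, grid=(64, 64, 16)):
    """Scatter the point cloud into a fixed (x, y, t) event-count grid."""
    mins, maxs = points.min(axis=0), points.max(axis=0)
    scale = (np.array(grid) - 1) / np.maximum(maxs - mins, 1e-6)
    idx = np.clip(((points - mins) * scale).astype(np.int64), 0, np.array(grid) - 1)
    vol = np.zeros(grid, dtype=np.float32)
    np.add.at(vol, (idx[:, 0], idx[:, 1], idx[:, 2]), 1.0)  # events per voxel
    return vol, idx  # idx maps each point to its voxel for later feature gathering
```

Keeping the per-point voxel index `idx` is what allows voxel-branch features to be gathered back to individual points for the multi-stage interaction the abstract describes.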
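The key innovation named in the abstract is the gated fusion of point and voxel features. Below is a minimal PyTorch sketch of one way such a gate can be built, assuming both branches' features have already been aligned per point to a common channel width; the module name, shapes, and single-layer gate are illustrative assumptions, not the paper's exact module.

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Learned per-channel gate that mixes point-branch and voxel-branch features."""

    def __init__(self, channels):
        super().__init__()
        # The gate sees both branches and predicts a per-channel mixing weight.
        self.gate = nn.Sequential(nn.Linear(2 * channels, channels), nn.Sigmoid())

    def forward(self, point_feat, voxel_feat):
        # point_feat, voxel_feat: (N, C); voxel_feat is the voxel-branch feature
        # gathered back to each point via its voxel index.
        g = self.gate(torch.cat([point_feat, voxel_feat], dim=-1))  # (N, C) in (0, 1)
        # Convex combination: channels dominated by non-informative features in one
        # branch can be suppressed, unlike plain addition or concatenation.
        return g * point_feat + (1.0 - g) * voxel_feat

# Usage sketch: fuse = GatedFusion(64); fused = fuse(point_feat, voxel_feat)
```

This selectivity is the property the ablation studies credit for outperforming addition and concatenation, which weight both branches uniformly regardless of feature quality.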
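For reference, the segmentation IoU used as an evaluation metric can be computed over event points as in this minimal sketch, assuming binary per-point masks; this is a standard formulation, not the authors' evaluation code.

```python
import numpy as np

def point_iou(pred, gt):
    """IoU between binary per-point masks (1 = UAV event, 0 = background)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    return np.logical_and(pred, gt).sum() / union if union else 1.0
```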
Comparison of the outputs of an event camera and a traditional camera
Event camera drone detection dataset. (a) Event point cloud, with the background in blue and the drone target in red; (b) Visible light image frame; (c) Event frame, with the target within the red box
Point-voxel fusion segmentation network
Event point cloud voxelization
Point-voxel feature interaction fusion module
Visual comparison of detection results. The red box represents a correct detection, the yellow box a false alarm, and the green box a missed detection