An open-pit mine roadway obstacle warning method integrating object detection and a distance threshold model
Lu Caiwu, Qi Fan, Ruan Shunling
School of Management, Xi'an University of Architecture and Technology, Xi'an, Shaanxi 710055, China
Abstract: To address the problem that current driving warning methods cannot adapt to the unstructured roads of open-pit mines, this paper proposes an early warning method that integrates target detection with an obstacle distance threshold. First, the original Mask R-CNN detection framework is improved according to the characteristics of open-pit mine obstacles: dilated convolution is introduced into the network to enlarge the receptive field without shrinking the feature map, which preserves the detection accuracy for larger targets. Then, a linear distance factor is constructed from the target detection results to represent the depth information of obstacles in the input image, and an SVM warning model is established. Finally, to ensure the generalization ability of the warning model, transfer learning is adopted: the network is pre-trained on the COCO data set, and both the C5 stage and the detection layers are then trained on data collected in the field. The experimental results show that the accuracy and recall of the proposed method reach 98.47% and 97.56%, respectively, on the field data, and that the manually designed linear distance factor adapts well to the SVM warning model.
Keywords: obstacle warning; target detection; distance threshold model; dilated convolution; transfer learning

1 Introduction

2 Object detection framework

2.1 Mask R-CNN

Fig. 1 Improved Mask R-CNN framework

$\left\{ \begin{array}{l} P'_i = \mathrm{sum}(\mathrm{upsample}(P'_{i+1}),\ \mathrm{conv}(C_i)) \\ P_6 = \mathrm{maxpooling}(P_5) \\ P_i = \mathrm{conv}(P'_i) \end{array} \right.,$ (1)
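The data flow of Eq. (1) can be illustrated with a small pure-Python sketch. It is not the paper's implementation. The convolutions are replaced by the identity so that only the top-down merge is visible, each backbone map is assumed to be exactly twice the size of the next coarser one, and P6 is assumed to be a 2×2 max pooling with stride 2. The helper names (`upsample2x`, `add`, `maxpool2x2`, `build_pyramid`) are hypothetical.

```python
from typing import List

Grid = List[List[float]]  # a single-channel feature map


def upsample2x(fm: Grid) -> Grid:
    """Nearest-neighbour upsampling by a factor of 2 in both directions."""
    out: Grid = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def add(a: Grid, b: Grid) -> Grid:
    """Element-wise sum of two maps of identical shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def maxpool2x2(fm: Grid) -> Grid:
    """2x2 max pooling with stride 2 (assumed form of 'maxpooling')."""
    return [
        [max(fm[r][c], fm[r][c + 1], fm[r + 1][c], fm[r + 1][c + 1])
         for c in range(0, len(fm[0]) - 1, 2)]
        for r in range(0, len(fm) - 1, 2)
    ]


def build_pyramid(c_maps: List[Grid]) -> List[Grid]:
    """c_maps: backbone outputs ordered fine -> coarse, e.g. [C2, C3, C4, C5].

    Returns the pyramid levels in the same order, followed by P6.
    """
    merged: List[Grid] = [[] for _ in c_maps]
    merged[-1] = c_maps[-1]  # coarsest level: lateral conv taken as identity
    for i in range(len(c_maps) - 2, -1, -1):
        # P'_i = sum(upsample(P'_{i+1}), conv(C_i)), conv taken as identity
        merged[i] = add(upsample2x(merged[i + 1]), c_maps[i])
    # P_i = conv(P'_i), conv taken as identity in this sketch
    p6 = maxpool2x2(merged[-1])
    return merged + [p6]
```

In a real network the lateral and output convolutions are learned layers; the sketch only shows which maps are combined at each level.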

$\left\{ \begin{array}{l} x = (1 + \Delta x) \cdot x \\ y = (1 + \Delta y) \cdot y \\ w = \exp(\Delta w) \cdot w \\ h = \exp(\Delta h) \cdot h \end{array} \right..$ (2)
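As a minimal sketch, the box refinement of Eq. (2) can be written as one function that applies the four predicted offsets to a box given as (x, y, w, h). The function name `refine_box` and the tuple layout are assumptions made for illustration. The code follows Eq. (2) exactly as printed; many detector implementations instead shift the centre by Δx·w and Δy·h, which is a different parameterization.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h)


def refine_box(box: Box, deltas: Box) -> Box:
    """Apply the offsets (dx, dy, dw, dh) to a box as written in Eq. (2)."""
    x, y, w, h = box
    dx, dy, dw, dh = deltas
    return (
        (1.0 + dx) * x,    # x = (1 + dx) * x
        (1.0 + dy) * y,    # y = (1 + dy) * y
        math.exp(dw) * w,  # w = exp(dw) * w
        math.exp(dh) * h,  # h = exp(dh) * h
    )
```

All-zero offsets leave the box unchanged, and the exponential keeps the refined width and height positive for any predicted value.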

2.2 Improvement of the Block unit

Fig. 2 Two convolution operations. (a) A conventional convolution; (b) A dilated convolution with a dilation rate of 2
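The difference between the two operations can be sketched in one dimension. A kernel of size k with dilation rate d covers k + (k − 1)(d − 1) input samples, so a 3-tap kernel with rate 2 spans 5 samples while still using only 3 weights. The function names below are hypothetical, and the 1-D "valid" setting is a simplification of the 2-D layers used in a detection network.

```python
from typing import List, Sequence


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Input span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal: Sequence[float],
                   kernel: Sequence[float],
                   dilation: int = 1) -> List[float]:
    """'Valid' 1-D cross-correlation with kernel taps `dilation` samples apart."""
    span = effective_kernel_size(len(kernel), dilation)
    out: List[float] = []
    for start in range(len(signal) - span + 1):
        out.append(sum(w * signal[start + t * dilation]
                       for t, w in enumerate(kernel)))
    return out
```

Without padding the dilated output is shorter than the conventional one; in a network, padding is normally chosen so that the feature-map size is unchanged while the receptive field grows.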

Fig. 3 Dilated convolution introduced in the C5 stage

3 Warning model

$\left\{ \begin{array}{l} w_{\mathrm{anchor}} = w'_{\mathrm{anchor}} / w \\ h_{\mathrm{anchor}} = h'_{\mathrm{anchor}} / h \\ s_{\mathrm{anchor}} = s'_{\mathrm{anchor}} / s \\ s_{\mathrm{mask}} = s'_{\mathrm{mask}} / s \end{array} \right.,$ (3)
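The normalization in Eq. (3) can be sketched as follows. The symbol meanings are assumptions, since they are not defined in the surviving text: the primed quantities are taken to be the detected box width, height and area and the mask pixel count, and w, h, s are taken to be the image width, height and area (s = w·h). The function and parameter names are hypothetical.

```python
from typing import Dict


def normalize_features(box_w: float, box_h: float, mask_area: float,
                       img_w: float, img_h: float) -> Dict[str, float]:
    """Scale detection sizes by the image size, following Eq. (3).

    Assumes s'_anchor is the box area and s is the image area.
    """
    img_area = img_w * img_h
    return {
        "w_anchor": box_w / img_w,
        "h_anchor": box_h / img_h,
        "s_anchor": (box_w * box_h) / img_area,
        "s_mask": mask_area / img_area,
    }
```

Dividing by the image dimensions makes the features independent of the input resolution, so the same warning model can be applied to frames of different sizes.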

$\left\{ \begin{array}{l} \min\ \frac{1}{2}\sum\limits_{i=1}^N \sum\limits_{j=1}^N \alpha_i \alpha_j y_i y_j K(x_i, x_j) - \sum\limits_{i=1}^N \alpha_i \\ \mathrm{s.t.}\ \sum\limits_{i=1}^N \alpha_i y_i = 0,\quad 0 \leqslant \alpha_i \leqslant C,\quad i = 1, 2, \cdots, N \end{array} \right.,$ (4)

$\begin{array}{l} T = \{ (x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N) \}, \\ x_i = (w_{\mathrm{anchor}}, h_{\mathrm{anchor}}, s_{\mathrm{anchor}}, s_{\mathrm{mask}}, l), \quad y_i \in \{0, 1\}, \end{array}$
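The dual problem in Eq. (4) can be made concrete with a short sketch that evaluates the objective and checks the constraints for a candidate multiplier vector. It is not a solver and not the paper's code. The RBF kernel is an assumption, since the text writes only a generic K, and the mapping of the 0/1 labels to −1/+1 is the usual convention for the dual form rather than something stated in the source. All function names are hypothetical.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]
Kernel = Callable[[Vector, Vector], float]


def rbf_kernel(a: Vector, b: Vector, gamma: float = 1.0) -> float:
    """K(a, b) = exp(-gamma * ||a - b||^2); the kernel choice is an assumption."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))


def to_signed(labels: Sequence[int]) -> List[int]:
    """Map labels {0, 1} to {-1, +1} as required by the dual form."""
    return [1 if y == 1 else -1 for y in labels]


def dual_objective(alpha: Sequence[float], xs: Sequence[Vector],
                   ys: Sequence[int], kernel: Kernel = rbf_kernel) -> float:
    """0.5 * sum_ij a_i a_j y_i y_j K(x_i, x_j) - sum_i a_i, with ys in {-1, +1}."""
    n = len(alpha)
    quad = sum(alpha[i] * alpha[j] * ys[i] * ys[j] * kernel(xs[i], xs[j])
               for i in range(n) for j in range(n))
    return 0.5 * quad - sum(alpha)


def is_feasible(alpha: Sequence[float], ys: Sequence[int],
                c: float, tol: float = 1e-9) -> bool:
    """Check sum_i a_i y_i = 0 and 0 <= a_i <= C."""
    balanced = abs(sum(a * y for a, y in zip(alpha, ys))) <= tol
    boxed = all(-tol <= a <= c + tol for a in alpha)
    return balanced and boxed
```

A quadratic-programming routine or an off-the-shelf SVM library would search over the multipliers; the functions above only state what such a solver minimizes and which constraints it has to respect.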

4 Experiments

4.1 Data collection and model training

Fig. 4 Data set composition

4.2 Experimental results and analysis

 $P{\rm{ = }}\frac{{TP}}{{FP + TP}},$ (5)

$R = \frac{TP}{TP + FN},$ (6)

F1 score:

 ${F_{\rm{1}}}{\rm{ = }}\frac{{{\rm{2}}P \cdot R}}{{P + R}},$ (7)
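A short sketch of the three metrics, using the standard definitions of precision and recall in terms of true positives (TP), false positives (FP) and false negatives (FN). The function names are chosen for illustration. The F1 helper accepts either fractions or percentages, so it can be used to cross-check a reported F1 value against the reported precision and recall, for example the 98.47% and 97.56% quoted for the full feature set.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted warnings that are correct: TP / (TP + FP)."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Fraction of true warnings that are found: TP / (TP + FN)."""
    return tp / (tp + fn)


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall: 2PR / (P + R)."""
    return 2.0 * p * r / (p + r)


# Hypothetical counts, chosen only to show the call pattern.
p = precision(tp=90, fp=10)
r = recall(tp=90, fn=30)
f1 = f1_score(p, r)
```

Because F1 is a harmonic mean, it stays close to the lower of the two inputs, which is why a feature set with high precision but low recall still scores poorly.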

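| No. | Type | Accuracy/% | Recall/% | F1/% |
| --- | --- | --- | --- | --- |
| 1 | $w_{\mathrm{anchor}} + h_{\mathrm{anchor}} + s_{\mathrm{anchor}} + l$ | 88.12 | 95.13 | 91.49 |
| 2 | $w_{\mathrm{anchor}} + h_{\mathrm{anchor}} + s_{\mathrm{mask}} + l$ | 90.16 | 92.84 | 91.48 |
| 3 | $h_{\mathrm{anchor}} + s_{\mathrm{anchor}} + s_{\mathrm{mask}} + l$ | 90.22 | 81.36 | 85.56 |
| 4 | $w_{\mathrm{anchor}} + s_{\mathrm{anchor}} + s_{\mathrm{mask}} + l$ | 94.34 | 82.64 | 88.10 |
| 5 | $w_{\mathrm{anchor}} + h_{\mathrm{anchor}} + s_{\mathrm{anchor}} + s_{\mathrm{mask}}$ | 55.61 | 64.92 | 59.51 |
| 6 | $w_{\mathrm{anchor}} + h_{\mathrm{anchor}} + s_{\mathrm{anchor}} + s_{\mathrm{mask}} + l$ | 98.47 | 97.56 | 98.01 |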

| Model | Accuracy/% | Recall/% | F1/% | Time/ms |
| --- | --- | --- | --- | --- |
| YOLOv3+SVM | 95.08 | 95.31 | 95.19 | 87 |
| Mask R-CNN+SVM | 96.64 | 95.89 | 96.26 | 134 |
| Our model | 98.47 | 97.56 | 98.01 | 136 |

Fig. 5 Comparison of the warning results of the three models in various scenarios. From left to right: detection results of Mask R-CNN, the framework of this paper, and YOLOv3, each classified by the warning model. Red marks targets detected as warning targets and green marks safe targets. (a) Vehicle-meeting scene 1; (b) Vehicle-meeting scene 2; (c) Vehicle-meeting scene 3; (d) Car-following scene 1; (e) Car-following scene 2; (f) Pedestrian scene; (g) Complex scene with car following and pedestrians; (h) Medium-distance multi-vehicle meeting; (i) Close-range multi-vehicle meeting; (j) Long-distance multi-vehicle meeting

5 Conclusions

