LIU Jian, NIU Pei, GUO Feng, KOU Lei, ZHANG Han-ming
Existing object detection algorithms suffer from false detections, missed detections, poor anti-interference ability and low detection accuracy when applied to tunnel lining crack detection. To address these issues, this paper proposes RSwin, a tunnel lining crack detection algorithm tailored to practical working conditions. The algorithm makes two contributions: ① it proposes, for the first time, the Residual Swin Transformer Block (RSTB), which combines global modeling with local feature extraction for complex lining crack characteristics, enhancing the fusion and representation of multi-scale crack features and improving model performance and generalization ability; ② it integrates, for the first time, the Shape-IoU loss function, which refines how shape matching is evaluated by taking the characteristics of the bounding boxes themselves into account when computing the loss, thereby improving the model's target box matching in tunnel lining crack recognition. To verify the effectiveness of the proposed algorithm, eleven classic object detection models (YOLOv7, YOLOv8, YOLOv9, YOLOv10, Cascade Mask R-CNN, Cascade R-CNN, Faster R-CNN, FSAF (Feature Selective Anchor-free Module), FCOS (Fully Convolutional One-stage Object Detection), NAS FCOS (Neural Architecture Search Fully Convolutional One-stage Object Detection), Mask R-CNN) were trained, validated and tested on a self-collected tunnel inspection dataset for comparison. The training and visualization results show that the mAP50 of the RSwin algorithm is 97.6%, which is 14.51%, 5.57%, 4.41%, 2.98%, 3.2%, 2.5%, 6.43%, 11.7%, 3.1%, 4.7% and 2.4% higher than that of the eleven comparison algorithms, respectively; it also has the fastest inference speed, with a frame rate of 9.3 frames·s⁻¹ at an input size of 807 pixels × 606 pixels.
Among the compared models, the RSwin algorithm achieves the highest recognition accuracy and the best overall performance, and it can be applied effectively to practical tunnel crack detection tasks.
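As a rough reading aid, the reported figures can be turned into implied baseline scores with a few lines of arithmetic. The sketch below rests on two assumptions that the abstract does not state explicitly: the eleven gaps are absolute percentage points, and they follow the order in which the comparison models are listed. The variable names are illustrative only.

```python
# Implied mAP50 of each comparison model, derived from the abstract's numbers.
# ASSUMPTION: gaps are absolute percentage points, listed in model order.
RSWIN_MAP50 = 97.6  # reported mAP50 of RSwin, in percent

MODELS = [
    "YOLOv7", "YOLOv8", "YOLOv9", "YOLOv10",
    "Cascade Mask R-CNN", "Cascade R-CNN", "Faster R-CNN",
    "FSAF", "FCOS", "NAS FCOS", "Mask R-CNN",
]
GAPS = [14.51, 5.57, 4.41, 2.98, 3.2, 2.5, 6.43, 11.7, 3.1, 4.7, 2.4]

# Eleven models, eleven reported gaps.
assert len(MODELS) == len(GAPS) == 11

# Baseline score = RSwin score minus the reported gap.
implied = {m: round(RSWIN_MAP50 - g, 2) for m, g in zip(MODELS, GAPS)}

# Per-frame latency implied by the reported frame rate of 9.3 frames per second.
FPS = 9.3
ms_per_frame = round(1000 / FPS, 1)

for model, score in implied.items():
    print(f"{model:<20s} {score:6.2f}")
print(f"RSwin latency: about {ms_per_frame} ms per frame")
```

Under these assumptions the implied baselines span roughly 83% to 95% mAP50. If the gaps were instead meant as relative improvements, the implied values would differ.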