
Editor in chief:Prof. Sun Shenghe
Inauguration:1980
ISSN:1002-7300
CN:11-2175/TN
Domestic postal code:2-369
Liu Guoquan, Wang Yunwen, Zhou Shumin, Chu Hongyu
2026, 49(5):1-8.
Abstract:To address the low detection accuracy, large data fluctuation, and weak robustness of traditional detection systems in which a single UAV carries a nuclear radiation detector, this paper constructs a nuclear radiation detection system based on multi-UAV cooperation. First, a multi-UAV formation control algorithm is designed to realize cooperative control of the UAVs. Second, the multi-sensor extended Kalman data fusion mechanism is improved so that the radiation data collected by multiple sensors are fused into a single, more accurate estimate, which improves the detection accuracy of the system. Finally, the system is deployed on a physical platform, and its feasibility is verified. The experimental results show that the system reduces the detection error by about 50% compared with a single-sensor system and improves robustness. The improved extended Kalman data fusion algorithm reduces the fusion error by about 21% compared with the ordinary extended Kalman data fusion algorithm.
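The fusion idea in this abstract can be illustrated with a minimal sketch. It assumes a plain scalar (linear) Kalman filter with sequential measurement updates, not the improved extended Kalman mechanism of the paper; the function name `fuse_step`, the process noise `q`, and the sensor variances are hypothetical values chosen for illustration.

```python
def fuse_step(x, p, measurements, variances, q=0.01):
    """One predict/update cycle of a scalar Kalman filter over several sensors.

    x, p         : current state estimate and its variance
    measurements : one reading per sensor for this time step
    variances    : assumed measurement noise variance of each sensor
    q            : assumed process noise variance (hypothetical value)
    """
    # Predict: the radiation level is modelled as constant, so only
    # the uncertainty grows.
    p = p + q
    # Update: fold in each sensor reading in turn.
    for z, r in zip(measurements, variances):
        k = p / (p + r)          # Kalman gain
        x = x + k * (z - x)      # corrected estimate
        p = (1.0 - k) * p        # reduced uncertainty
    return x, p


# Usage: three sensors (for example one per UAV), two time steps.
x, p = 0.0, 1e6                  # vague prior
for readings in [(10.2, 9.7, 10.1), (9.9, 10.4, 10.0)]:
    x, p = fuse_step(x, p, readings, variances=(0.5, 0.5, 0.5))
```

After each cycle the fused variance `p` is smaller than the variance of any single sensor, which is the basic reason a multi-sensor system can report less error than a single-sensor one.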
Wang Zhen, Yang Ming, Sun Hang
2026, 49(5):9-18.
Abstract:To address the issue of reduced system stability caused by the dynamic interaction between the phase-locked loop, grid impedance and current control loop, this paper first utilizes the Hurwitz criterion for analysis. This analytical method can reveal the inherent impact mechanism of grid impedance fluctuations on the stability margin of the phase-locked loop synchronization process, and the analysis results indicate that grid impedance is an important unstable factor in the phase-locked loop-based synchronization system. Then, a grid impedance adaptive collaborative phase-locked loop control structure is proposed. Based on the proposed improved phase-locked loop control structure, a phase correction link is added in series to the current control loop to reshape the output impedance of the grid-connected inverter, thereby significantly enhancing the robustness of the grid-connected system under weak grid conditions. Finally, simulation and experimental results confirm the accuracy of theoretical analysis, indicating the feasibility of the improved strategy.
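As a reminder of what a Hurwitz check involves, the sketch below applies the criterion to a generic third-order characteristic polynomial. The polynomial order and the coefficients are assumptions for illustration only; the abstract does not give the characteristic equation of the phase-locked loop system.

```python
def is_hurwitz_stable_cubic(a3, a2, a1, a0):
    """Hurwitz test for a3*s^3 + a2*s^2 + a1*s + a0.

    All roots lie in the open left half-plane if and only if every
    coefficient has the same sign and a2*a1 > a3*a0.
    """
    if a3 < 0:
        # Normalise so the leading coefficient is positive.
        a3, a2, a1, a0 = -a3, -a2, -a1, -a0
    all_positive = a3 > 0 and a2 > 0 and a1 > 0 and a0 > 0
    return all_positive and a2 * a1 > a3 * a0


# (s + 1)(s + 2)(s + 3) = s^3 + 6s^2 + 11s + 6 has only negative roots.
stable = is_hurwitz_stable_cubic(1, 6, 11, 6)
# A larger constant term, as a parameter change might produce, breaks
# the inequality.
unstable = is_hurwitz_stable_cubic(1, 1, 1, 6)
```

In a stability-margin study, the coefficients would be functions of the grid impedance, and the inequality shows how far a parameter can drift before the condition fails.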
Yang Xingwang, Yang Aimin, Xue Tao
2026, 49(5):19-29.
Abstract:The performance indicators of sintered ore reflect its quality, and high-quality sintered ore in turn improves blast furnace production efficiency, reduces energy consumption and fuel ratios, and promotes green smelting and environmental protection. In predicting sintered ore quality, traditional deep neural networks suffer from poor interpretability, while fuzzy neural networks, which offer strong interpretability, are prone to rule explosion and difficult parameter tuning. This paper constructs a predictive model that combines fuzzy neural networks with deep neural networks. First, by improving the CBAM channel attention module, the model calculates both channel and spatial attention for input features to fuse effective features; this enhances the model's ability to model complex nonlinear relationships and dynamically allocate feature importance. Furthermore, the model is optimized with an improved Grey Wolf optimization algorithm, which improves its predictive accuracy. Finally, experimental studies were conducted on the prediction of the sintered ore drum index, sintered ore alkalinity, and RDI+3.15, achieving high accuracy and validating the feasibility of the proposed model and algorithm. A comparison of four models (GW-FNN, GW-DFNN, Attention-DFNN, and GW-Attention-DFNN) showed that the GW-Attention-DFNN model achieved an R² of 0.9682 for the drum index, 0.9750 for sintered ore alkalinity (R), and 0.9642 for RDI+3.15. These results indicate that the model performs well in predicting sintered ore quality.
Yang Qingyu, Yuan Liang, Lyu Kai
2026, 49(5):30-39.
Abstract:Imitation learning offers a powerful approach for enabling robotic arms to perform complex tasks in unstructured environments. However, many state-of-the-art methods are hindered by redundant input data, leading to inefficient training and limited trajectory prediction accuracy in complex tasks. To address these issues, this paper proposes KPT-O, a method for optimizing demonstrated trajectories by extracting keypoints. The method filters keypoints to streamline the learning data and optimizes their distribution to enhance prediction accuracy. To validate its performance, KPT-O was trained within a state-of-the-art framework and compared against leading methods on the HelloWorld and RoboTasks datasets. The results demonstrate that KPT-O not only significantly reduces training time but also achieves superior trajectory prediction accuracy. Furthermore, evaluations on a physical robot platform confirm the method's effectiveness in real-world robotic arm tasks involving changes in both position and orientation.
Jin Zhixin, Luo Shi, Li Yongan, Liang Wei, Li Jiahao
2026, 49(5):40-51.
Abstract:To address the insufficient safety and poor tracking accuracy of coal mine inspection robots during path planning in complex and dynamic underground tunnel environments, this paper proposes a path planning method that integrates an improved A* global planning algorithm with fuzzy PID motion control. By introducing an obstacle cost term and a dynamic weighting strategy into the cost function of the traditional A* algorithm, the efficiency and safety of global path planning are enhanced. The initial path is smoothed using B-spline curves to make it more compliant with the kinematic constraints of the robot, thereby improving its executability and trajectory smoothness. A fuzzy PID controller based on the robot's kinematic model is designed to replace the traditional PID controller. Through fuzzy control, the PID parameters are adaptively adjusted to achieve high-precision and high-stability tracking control of the smoothed global path, effectively coupling the linear and angular velocity control. Simulation results in MATLAB and ROS Gazebo show that the improved A* algorithm reduces the number of search nodes by approximately 65%, and the B-spline processing significantly improves the path smoothness. Compared with the traditional PID controller, the fuzzy PID controller performs better in terms of path tracking accuracy and stability. The maximum lateral error is within ±0.05 meters, and the maximum heading error is within ±0.2 radians. This method significantly improves the path planning and tracking performance of coal mine inspection robots.
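The sketch below shows one plausible way an obstacle cost term and a heuristic weight can enter an A* cost function on a grid. The penalty form (count of neighbouring obstacle cells), the gain `k_obs`, and the fixed weight `w` are assumptions for illustration; the abstract does not specify the paper's cost term or how its dynamic weight is scheduled.

```python
import heapq


def near_obstacle(grid, r, c):
    """Count obstacle cells among the 8 neighbours of (r, c)."""
    count = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr or dc) and 0 <= rr < len(grid) and 0 <= cc < len(grid[0]):
                count += grid[rr][cc]
    return count


def astar(grid, start, goal, k_obs=0.5, w=1.0):
    """Grid A* with f = g + w*h, where each step costs 1 + k_obs * penalty.

    grid holds 0 for free cells and 1 for obstacles. k_obs and w are
    hypothetical tuning parameters.
    """
    open_heap = [(0.0, start)]
    g = {start: 0.0}
    parent = {start: None}
    closed = set()
    while open_heap:
        _, cur = heapq.heappop(open_heap)
        if cur in closed:
            continue
        closed.add(cur)
        if cur == goal:
            path = []
            while cur is not None:      # walk parents back to the start
                path.append(cur)
                cur = parent[cur]
            return path[::-1]
        r, c = cur
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] or nxt in closed:
                continue
            # Obstacle cost term: cells hugging obstacles are more expensive.
            new_g = g[cur] + 1.0 + k_obs * near_obstacle(grid, nr, nc)
            if new_g < g.get(nxt, float("inf")):
                g[nxt] = new_g
                parent[nxt] = cur
                h = abs(nr - goal[0]) + abs(nc - goal[1])  # Manhattan distance
                heapq.heappush(open_heap, (new_g + w * h, nxt))
    return None                         # goal unreachable


grid = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
path = astar(grid, (0, 0), (2, 3))
```

Raising `w` above 1 makes the search greedier and expands fewer nodes at the price of optimality, which is the usual trade-off behind weighting strategies of this kind.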
Yang Jinli, Li Wei, He Xiaoyong, Liu Kaishu, Gu Jijun
2026, 49(5):52-62.
Abstract:To address the challenges of detecting oil leakage from pipelines on offshore platforms, such as dust and fog interference, varying target scales, and complex backgrounds, a detection method based on an improved YOLOv8 model is proposed. First, the C2f_MP module replaces the C2f module in the backbone network, which enhances the model's capacity to extract detailed features. Second, an efficient multi-scale attention (EMA) mechanism is incorporated into the model's neck structure. This addition improves the model's focus on features of multi-scale targets in complex scenarios, thereby enhancing its ability to recognize small targets. Third, the original detection head is replaced with four lightweight small-target detection heads, which improves the detection performance for small targets. Moreover, the WIoU loss function is used to improve training effectiveness and recognition accuracy. The experimental results indicate that the improved YOLOv8 model maintains a real-time detection speed of 118 fps. Compared with the baseline model YOLOv8s, the precision is increased by 2.3% and the mAP@0.5 by 2.4%. Practical application tests show that the average accuracy of the improved model in detecting oil leakage from pipelines on a particular offshore platform reaches 96.25%. This meets the requirements of engineering applications and offers an effective technical solution for the safety monitoring of offshore platform pipelines in complex industrial environments.
Guo Xupeng, Dong Lihong, Qin Yi
2026, 49(5):63-76.
Abstract:Addressing the issues of significant scale differences in small targets and strong interference from complex backgrounds in the detection of surface defects on coal mine steel wire ropes, a deep learning detection algorithm based on an improved YOLOv11 is proposed. Firstly, a receptive field attention feature extraction module, C3k2_RFAConv, is designed to enhance feature extraction capabilities under complex textures by dynamically adjusting convolution kernel weights. Secondly, a deformable large kernel attention mechanism, D-LKA, is introduced at the feature fusion layer, combining the advantages of large receptive fields and deformable convolutions to precisely focus on defect areas. Additionally, DySample upsampling optimization is adopted to suppress background noise interference and reduce the loss of small target features. Finally, an Inner-WIoU loss function is proposed to optimize bounding box regression and improve the localization accuracy of irregular defects. Experimental results show that the improved algorithm achieves an accuracy rate of 83.2%, a recall rate of 78.1%, and an average precision of 82.1%, which are 3.1%, 4.6% and 2.6% higher than those of the benchmark model YOLOv11, respectively. It also outperforms comparative models such as Faster-RCNN and YOLOv8. In addition, visual analysis proves that the improved algorithm has a reduced missed detection rate, providing an effective technical solution for real-time monitoring of potential safety hazards in mining steel wire ropes.
2026, 49(5):77-84.
Abstract:To detect defects and determine their relative distances in centralized photovoltaic panels inspected by unmanned aerial vehicles, a photovoltaic panel defect detection and location system based on the deep learning model YOLOv11n is proposed. Defects in high-resolution photovoltaic panel images are detected using YOLOv11n combined with slicing aided hyper inference (SAHI). Reference points and target points are set in the centralized photovoltaic panel array based on the detection results. The detection results are combined with the unmanned aerial vehicle equipment information and the Exchangeable Image File Format (EXIF) metadata of the dataset to calculate the longitude and latitude of the reference points and target points, from which the relative distance between them is calculated and the photovoltaic panel defect is located. The experimental results show that the mAP50 and mAP50:95 of the trained YOLOv11n model are 67.1% and 49.5% respectively, and the accuracy of target detection combined with SAHI is 88.73%. The accuracy rates of the defect location algorithm with location errors within 0 to 4 meters under visible light and thermal imaging cameras are 96.73% and 97.64% respectively, which basically meets operation and maintenance requirements. Finally, an interactive management system page is built with front-end development languages such as JavaScript, and the detection information is stored in a database. Through fault detection, location, and information storage, the system makes defect detection and location of photovoltaic panels intelligent and provides technical support for the maintenance of centralized photovoltaic panels.
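Once a reference point and a target point each have a latitude and longitude, their separation follows from a great-circle formula. The sketch below uses the standard haversine formula with a mean Earth radius of 6,371,000 m; whether the paper uses this formula or a local planar approximation is not stated, so it is an assumption, and the coordinates in the usage lines are made up.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, an assumed constant


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Hypothetical reference point and defect location a few meters apart.
reference = (30.000000, 104.000000)
target = (30.000020, 104.000030)
distance = haversine_m(*reference, *target)
```

At panel-array scale the distances are only a few meters, so the quality of the computed coordinates matters far more than the choice of distance formula.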
Ren Shuai, Yang Sinian, Cao Lijia, Guo Chuandong, Liu Yanju
2026, 49(5):85-94.
Abstract:Small targets are difficult to identify in strip defect detection, and the large number of parameters in target detection algorithms hinders model deployment. To address these problems, this paper proposes a lightweight strip defect detection algorithm based on the YOLOv11 framework. The algorithm enhances small target recognition by integrating a bidirectional feature pyramid network (BiFPN) and the GAM attention mechanism. In addition, a lightweight SlimNeck design is integrated into the neck network to reduce model parameters and complexity while maintaining detection accuracy. Experimental validation on the NEU-DET dataset demonstrates that the improved BSG-LiteYOLO detection model based on YOLOv11n significantly outperforms the original YOLOv11n. The optimized model achieves a 25.6% reduction in parameters, a 22.6% decrease in model weight size, and a 7.94% reduction in floating point operations (FLOPs), while improving mAP@0.5 by 4.19%. The results demonstrate the feasibility of the improved model for steel strip surface defect detection. The optimized algorithm was deployed on a Jetson Orin NX edge computing device, where it achieves 34.3 FPS, which meets practical production requirements.
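BiFPN fuses feature maps with learnable, non-negative weights. The sketch below shows the fast normalized fusion rule from the original BiFPN design, applied to flat lists of numbers instead of tensors; the abstract does not say which fusion variant the paper uses, so this is an assumption, and the feature values are invented.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse equally sized feature vectors as sum(w_i * x_i) / (eps + sum(w_i)).

    Weights are clipped at zero first (a ReLU), so every input contributes
    a non-negative share and the result stays a bounded weighted average.
    """
    clipped = [max(0.0, w) for w in weights]
    norm = eps + sum(clipped)
    length = len(features[0])
    return [
        sum(w * f[i] for w, f in zip(clipped, features)) / norm
        for i in range(length)
    ]


# Two hypothetical feature vectors from different pyramid levels,
# already resized to the same length.
top_down = [2.0, 0.0, 1.0]
lateral = [4.0, 2.0, 1.0]
fused = fast_normalized_fusion([top_down, lateral], weights=[1.0, 1.0])
```

Compared with a softmax over the weights, this rule avoids the exponential, which is one reason it suits lightweight detectors.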
Zhang Yihao, Li Yuezhong, Ye Lan
2026, 49(5):95-103.
Abstract:To enhance the efficiency and performance of trajectory planning for the robotic arm of a moxibustion robot, a study on joint space trajectory planning methods was conducted. An improved particle swarm optimization (PSO) algorithm was proposed, which incorporated dynamically adjusted inertia weights and learning factors, combined with 3-5-3 polynomial interpolation for trajectory planning. A six-axis robot model in MATLAB was used to establish the robotic arm model for simulation experiments. In the simulations, the proposed algorithm was compared with the standard PSO algorithm. The results showed that the joint angular displacement, angular velocity, and angular acceleration curves planned by the improved algorithm were continuous and smooth without abrupt changes. The initial and final velocities were zero, and the entire velocity and acceleration profiles strictly satisfied the constraints without exceeding the maximum operational limits. Meanwhile, the trajectory planning time was reduced from 7 s to 3.139 s, representing a 55.16% improvement in time efficiency. The results verify the effectiveness and superiority of the proposed algorithm in robotic arm trajectory planning.
Zhou Chenhan, Xu Xiaoyang, Wei Wei
2026, 49(5):104-116.
Abstract:Aiming at the issues of interference from redundant information in images, insufficient multi-scale information extraction, and low retrieval accuracy caused by the ineffective integration of global and local information in cross-modal retrieval tasks for remote sensing images, this paper proposes a network model of multi-scale cross-modal remote sensing image retrieval (IGMR) suitable for multi-scale tasks. Firstly, a multi-dimensional perception enhanced convolution module (MFE) is designed to extract local information while filtering redundant features. It also integrates a multi-attention module to focus on the high-frequency information of images, thereby enhancing feature expression ability. Secondly, a multi-scale patch attention network (RFPA) is developed to capture contextual information at different scales. Subsequently, an adaptive feature fusion module (AFFM) is constructed to dynamically fuse the extracted global and local features, strengthening attention to high-quality information. Experimental results on the public datasets RSICD and RSITMD demonstrate that the proposed IGMR method increases the average recall rate (mR) by 1.83% and 3.21% respectively in remote sensing cross-modal retrieval tasks, with retrieval accuracies reaching 19.73% and 31.83%. The overall retrieval performance is significantly improved.
Yang Xuhong, Hu Yingkai, Zhu Xiaofang, Wang Shun
2026, 49(5):117-125.
Abstract:To address harmonic distortion in the AC-side output current and the DC-side circulating current of the modular multilevel converter (MMC) under unbalanced grid voltage conditions, this paper proposes a sliding mode control strategy optimized by the red-tailed hawk (RTH) algorithm. First, the unbalanced current is decomposed according to the MMC's topology, and its mathematical model under unbalanced grid voltage is established. Then, a novel reaching law is introduced, its performance is analyzed, and its stability is verified with a Lyapunov function. On this basis, the red-tailed hawk optimization algorithm is applied to dynamically optimize the sliding mode control based on the new reaching law. Finally, simulation experiments on the MATLAB/Simulink platform compare the proposed method with integral sliding mode control and with the new reaching law sliding mode control without optimization. The results show that the proposed method has significant advantages in both dynamic response and steady-state accuracy.
Hou Hua, Xu Jinqian, Wang Diancheng, Wang Yan
2026, 49(5):126-133.
Abstract:To address the problem of low 3D positioning accuracy caused by multipath interference and non-line-of-sight propagation, this paper proposes an adaptive Kalman filter-based Chan-Taylor fusion positioning algorithm. First, an improved KF is applied to preprocess the TDOA-based measurements to suppress noise interference. Then, the Chan algorithm and the Taylor algorithm are respectively used to compute the position estimates based on the preprocessed data, providing initial location estimates. Finally, the difference between the initial estimates obtained by the Chan and Taylor algorithms is used as the iteration trigger condition: If the difference exceeds a predefined threshold, the algorithm enters Taylor iteration; otherwise, the initial position estimate is directly output. Simulation results show that, compared with the Chan algorithm, the KF-Chan algorithm, the Chan-Taylor algorithm, and the weighted Chan-Taylor algorithm, the proposed method improves positioning accuracy by 87.64%, 75.95%, 53.52% and 40.30%, respectively.
Li Yun, Xu Tao, Jia Yajun, Jiang Junjie
2026, 49(5):134-144.
Abstract:To address the difficulty of feature extraction and pattern recognition in identifying internal overvoltage types in distribution networks, this paper proposes an internal overvoltage identification method based on multi-domain fusion feature extraction and a Bayesian-optimized Transformer-BiGRU network. First, through multi-domain fusion feature extraction, time domain, frequency domain, and time-frequency domain features are extracted from the 10 kV bus neutral point overvoltage signal to construct a representative ten-dimensional feature vector. Then, multiple sets of ten-dimensional vectors for different overvoltage types are input into the Bayesian-optimized Transformer-BiGRU classifier to recognize five typical internal overvoltage types. To verify the effectiveness of the method, PSCAD simulation data and fault waveforms from a physical experimental platform were used to train and test the proposed algorithm in MATLAB, and the test results were compared with other methods. The results show that the recognition accuracy of the proposed algorithm reaches 99.11%, with stronger feature extraction ability and higher recognition accuracy than the other algorithms.
2026, 49(5):145-155.
Abstract:To address the common issues of complex architecture, high resource consumption, and significant latency in existing deep learning-based CAN bus intrusion detection methods, this paper proposes a lightweight CAN bus intrusion detection model based on an improved SqueezeNet architecture. First, CAN message data is converted into color images to enhance spatial and channel feature representation. Second, an efficient channel attention (ECA) mechanism is introduced to enable fine-grained modeling of anomalous communication patterns. Third, the network architecture is optimized by replacing standard convolutions with depthwise separable convolutions and Ghost modules, while pruning redundant layers to reduce computational overhead and parameter count. Finally, the Hardswish activation function is applied throughout the network to enhance nonlinear expressiveness and inference efficiency. Experimental results on the public Car-Hacking dataset demonstrate that the proposed method achieves 100% detection accuracy with a model size of only 0.35 MB and an average response time of 1.6 ms, offering high performance, low latency, and minimal resource consumption for deployment.
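The parameter saving from swapping a standard convolution for a depthwise separable one can be checked with simple arithmetic. The sketch below ignores bias terms, and the layer sizes are hypothetical rather than taken from the paper's network.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a k x k standard convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
std = standard_conv_params(3, 64, 128)         # 73728
dws = depthwise_separable_params(3, 64, 128)   # 8768
ratio = std / dws
```

For this layer the separable form needs roughly one eighth of the weights, which is why the substitution is common in models targeting sub-megabyte sizes.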
Xiong Weihua, Zhang Yong, Zhang Xuemei, Li Liangyao
2026, 49(5):156-167.
Abstract:Existing deep learning-based direction of arrival (DOA) estimation algorithms have large parameter counts and depend on covariance matrix inputs that are easily affected by noise, which makes deployment on resource-constrained edge devices challenging. To address these issues, this paper proposes a lightweight convolutional classification-regression neural network DOA estimation algorithm. The proposed method uses raw signals as model inputs and directly extracts DOA features from time-domain signals through end-to-end learning, thereby avoiding the performance degradation associated with traditional covariance matrix methods under low signal-to-noise ratio conditions. The model reduces the number of parameters through spatiotemporal feature compression and the integration of a Ghost bottleneck structure, and introduces an attention mechanism to adaptively recalibrate feature channel weights, enhancing focus on critical features. A dual-branch output strategy combining coarse classification and fine regression is adopted: the model first determines the angular interval and then predicts the offset within that sector, allowing high-precision estimation even under stringent conditions (e.g., -5 dB SNR). Experimental results demonstrate that the model maintains outstanding performance (accuracy of 96.3%) while remaining highly compact (actual deployment size of 118.83 kB, with 24,783 parameters). Compared with traditional algorithms and mainstream lightweight models, this model preserves both accuracy and computational efficiency while reducing the parameter count, providing edge devices with a high-precision, low-latency, and low-resource-consumption DOA estimation solution.
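The coarse-to-fine output can be pictured as an encode/decode pair: the classification branch picks a sector, and the regression branch gives the position inside it. The sketch below assumes an angular range of [-90°, 90°), a sector width of 10°, and an offset expressed as a fraction of the sector; none of these values come from the abstract.

```python
ANGLE_MIN = -90.0     # assumed lower bound of the search range, degrees
SECTOR_WIDTH = 10.0   # assumed width of each coarse class, degrees


def encode(theta):
    """Split an angle into (sector index, fractional offset in [0, 1))."""
    shifted = theta - ANGLE_MIN
    index = int(shifted // SECTOR_WIDTH)
    offset = shifted / SECTOR_WIDTH - index
    return index, offset


def decode(index, offset):
    """Rebuild the angle from the coarse class and the fine offset."""
    return ANGLE_MIN + (index + offset) * SECTOR_WIDTH


# Training targets for a source at 5 degrees, then the inverse mapping.
idx, off = encode(5.0)        # sector 9, offset 0.5
theta = decode(idx, off)      # 5.0
```

Splitting the target this way keeps the regression output in a small, bounded range, which tends to be easier to learn than a single regression over the whole angular span.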
Han Jianfeng, Nan Rujun, Song Lili, Fang Jiandong, Xu Zihan
2026, 49(5):168-179.
Abstract:To address the significant morphological differences in road surface damage and the unsatisfactory segmentation effects under complex environmental interference in UAV inspections, we propose an improved model, MDPR-DeepLabV3+, for road surface damage segmentation. Firstly, the original backbone network is replaced with the MobileNetV2 network to enhance operational efficiency. Secondly, a DFSP module is constructed in the encoder to efficiently integrate local details and global context through progressive feature accumulation and cross-scale attention interaction. In addition, a PSA_M attention module is added to strengthen the information of damaged edges. Finally, a residual channel decoupling (RCD) module is proposed in the decoder to promote the complementarity of deep and shallow layer information and enhance feature diversity. The experimental results demonstrate that the proposed model achieves mIoU, mPA, and mPrecision of 78.47%, 92.03% and 83.93%, respectively, on a self-made UAV inspection dataset for road damage. The effectiveness of this model is further validated on the publicly available dataset Crack500, indicating strong performance in terms of pavement damage recognition accuracy and robustness in complex environments.
Liu Zhilong, Wang Cheng, Du Junnan, Wang Tianyi
2026, 49(5):180-189.
Abstract:To address the issues of limited target diversity, low detection accuracy, and poor generalization in the detection of safety protective equipment for aerial workers in electric power scenarios, this paper proposes an improved YOLO11n-based detection algorithm tailored for high-altitude safety protective equipment detection. Firstly, a DySample dynamic upsampling method is introduced into the neck network to effectively prevent excessive amplification or information loss during upsampling, thereby enhancing feature retention while maintaining overall detection performance. Secondly, the RCM is optimized using depthwise separable convolutions to construct a new CFSCM, which captures key features across both spatial and channel dimensions, improving the model's perception of foreground protective equipment. Finally, a novel LQEH is designed to integrate the localization quality scores from the regression branch with the outputs of the classification branch, thereby addressing the lack of interaction between the two original branches and enhancing the correlation between classification and localization tasks. Experimental results demonstrate that the proposed algorithm achieves an mAP@0.5 of 93.1%, precision of 96.1%, and recall of 86.7%, representing improvements of 3.2%, 0.7% and 2.3% over the baseline model, respectively, with a detection speed of 131 fps. In addition, generalization experiments conducted on a high-altitude safety protective equipment dataset from the Roboflow platform show respective improvements of 2.1%, 5.2%, and 2.2% in mAP@0.5, precision, and recall compared to the baseline, validating the effectiveness of the proposed improvements in enhancing detection accuracy and generalization capability.
Chen Chen, Jiang Yi, Zhu Weixing
2026, 49(5):190-197.
Abstract:At present, computer vision-based pig aggression recognition mainly relies on deep learning algorithms. However, these methods recognize aggression only at the herd or pairwise level and cannot determine which pigs are involved. Recognizing the identity of individual pigs therefore helps to refine aggression recognition from the herd or pairwise level to the individual level. Because factors such as body deformation and overlapping reduce the accuracy of pig identification during aggression among group-housed pigs, this paper proposes IDBS-YOLOv10s, an improved YOLOv10s model for pig identification. Firstly, InceptionNeXt-DCNv3 replaces the convolution in the C2f module of the backbone network to reduce the parameter count and computational complexity of the model while enhancing the ability of the YOLOv10s network to extract features. Secondly, a weighted bidirectional feature pyramid network is used in the neck layer to enhance the model's ability to fuse different feature layers. Then, the SEAM attention mechanism is added before the detection head to enhance the model's ability to extract key feature information of pig identities. Finally, the v10detect detection head is used to recognize the identity of individual pigs. The identification precision of the model was 94.3%, the recall was 93.7%, the mean average precision was 95.8%, and the model weight was only 15.2 MB. The results indicate that this method can be used to recognize the identity of pigs in aggression scenes.
Wen Jinrui, Wu Xiaohong, Teng Qizhi, He Haibo
2026, 49(5):198-208.
Abstract:To address the challenges encountered in segmenting scanning electron microscope images of lanthanum tungsten rods—such as difficulties in distinguishing adhered grains and occlusion of grain boundaries—an improved method based on the Pix2PixGAN framework is proposed to achieve high-precision extraction of tungsten grain boundaries. First, the standard skip connection is replaced with an edge guided attention module, integrated with Laplacian feature map extraction, to enhance the multi-scale representation of grain boundary features. Second, an efficient upsampling convolution block is introduced for feature upsampling, effectively mitigating checkerboard artifacts and facilitating the fusion of multi-level features. The original L2 loss function is substituted with a combined loss function comprising weighted binary cross-entropy loss and weighted intersection-over-union loss, emphasizing the optimization of edge pixels. Finally, gradient penalty is incorporated to improve the stability and diversity of the generator. Experimental results demonstrate that the improved model achieves an F1-score of 72.47%, a recall rate of 77.21%, and a precision of 68.32%, representing improvements of 13.02%, 6.49%, and 16.87%, respectively, over the baseline Pix2PixGAN model. Furthermore, the proposed method surpasses RCF, RINDNet, UCTransNet, and MEGANet in terms of F1-score and precision, confirming its effectiveness in grain boundary extraction.
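A combined loss of weighted binary cross-entropy and weighted IoU can be written compactly. The sketch below assumes a per-pixel weight map is already available (larger near edges) and uses a common formulation with a smoothing constant of 1 in the IoU term; how the paper builds its weights and balances the two terms is not stated in the abstract.

```python
import math


def weighted_bce_iou(pred, target, weight, eps=1e-7):
    """Weighted BCE plus weighted IoU loss over flat lists of pixels.

    pred   : predicted probabilities in [0, 1]
    target : ground-truth labels, 0 or 1
    weight : per-pixel weights, assumed larger on grain-boundary pixels
    """
    bce = inter = total = 0.0
    for p, y, w in zip(pred, target, weight):
        p = min(max(p, eps), 1.0 - eps)          # keep log() finite
        bce += -w * (y * math.log(p) + (1 - y) * math.log(1 - p))
        inter += w * p * y                       # weighted intersection
        total += w * (p + y)                     # weighted sum of areas
    wbce = bce / sum(weight)
    wiou = 1.0 - (inter + 1.0) / (total - inter + 1.0)
    return wbce + wiou


# Hypothetical 3-pixel example; the middle pixel is an edge pixel.
weights = [1.0, 5.0, 1.0]
good = weighted_bce_iou([0.9, 0.1, 0.8], [1, 0, 1], weights)
bad = weighted_bce_iou([0.9, 0.9, 0.8], [1, 0, 1], weights)
```

A mistake on the heavily weighted pixel raises the loss far more than the same mistake elsewhere, which is how a weighting scheme of this kind steers optimization toward boundaries.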
Wang Yudie, Chen Lingyi, Han Lei, Su Xin, Lu Xiaochun
2026, 49(5):209-218.
Abstract:Uniqueness-based identity authentication is crucial for the implementation of agricultural insurance in dairy farms. However, there is currently no accurate and reliable method for cow identification, leading to incidents of insurance fraud and difficulties in coverage. To address this issue, this paper proposes a cross-attention mechanism and an adaptive loss function, built upon the YOLOv7 model framework, to detect cows in the complex environments of dairy farms. The cross-attention mechanism extracts correlation information from different directions in the images, integrating both deep and shallow features to adapt to scale variations caused by poor lighting conditions and shooting angles in farm settings. To tackle the inconsistency in image quality across the dataset, the adaptive loss function adjusts the weights of easy and hard samples, enabling the model to focus more on challenging samples during training, thereby enhancing the robustness and generalization performance of the detection model. Experimental results indicate that the proposed cross-attention mechanism and adaptive loss function model achieved an accuracy rate of 94.63% in the task of dairy cow detection and recognition, which is an improvement of 11.42% compared to the original YOLOv7 model.
Ran Ning, Ma Qiji, Zhang Shaokang, Hao Jinyuan
2026, 49(5):219-228.
Abstract:To address the low matching accuracy and poor real-time performance of existing image stitching algorithms in complex scenes, this paper proposes an image stitching algorithm based on improved ORB and MLESAC. In traditional image stitching approaches, feature detection is insufficiently robust, and descriptors lack discriminative power under abrupt illumination changes, viewpoint variations, or complex background interference. This deficiency readily causes mismatches, which ultimately lead to stitching misalignment or ghosting artifacts. In the preprocessing stage, the input image is therefore transformed into the CIE Lab color space to separate the brightness and color channels, and an adaptive image pyramid is constructed by combining information entropy with illumination statistics. In the feature detection and description stage, a lighting-adaptive FAST corner threshold adjustment mechanism is designed. Local geometric constraints are then introduced to filter corner points, and the BRIEF descriptor is extended to the L, a and b channels of the CIE Lab color space, thereby fusing local gradient direction information. In the feature matching stage, bidirectional Hamming distance matching is employed to establish a local-global constraint optimization framework that minimizes reprojection error. A more efficient MLESAC algorithm is then used to remove incorrect matches. Finally, a weighted average method is adopted to smooth the stitching area, achieving a seamless stitching effect. Experimental results demonstrate that the proposed algorithm delivers real-time performance and high-precision panoramic stitching quality in complex scenes. Specifically, on the APAP dataset, the algorithm achieved a matching accuracy of 97.63%.
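Bidirectional Hamming matching keeps a pair only when each descriptor is the other's nearest neighbour. The sketch below represents binary descriptors as Python integers and uses a brute-force search; the 4-bit descriptors are toy values, and the paper's additional constraint optimization is not reproduced.

```python
def hamming(a, b):
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def nearest(query, candidates):
    """Index of the candidate with the smallest Hamming distance to query."""
    return min(range(len(candidates)), key=lambda j: hamming(query, candidates[j]))


def bidirectional_match(desc_a, desc_b):
    """Return (i, j) pairs where desc_a[i] and desc_b[j] pick each other."""
    matches = []
    for i, d in enumerate(desc_a):
        j = nearest(d, desc_b)             # forward match: A -> B
        if nearest(desc_b[j], desc_a) == i:  # backward check: B -> A
            matches.append((i, j))
    return matches


# Toy 4-bit descriptors from two images.
image_a = [0b0000, 0b1111]
image_b = [0b1110, 0b0001, 0b0011]
pairs = bidirectional_match(image_a, image_b)   # [(0, 1), (1, 0)]
```

The descriptor `0b0011` is left unmatched because neither descriptor in the first image selects it, which is how the cross-check discards ambiguous candidates before a robust estimator such as MLESAC is run.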
Liang Ruoyu , Huang Weihua , Yan Ruyu , Li Ruimin
2026, 49(5):229-238.
Abstract:Aiming to address the problem of decreased localization accuracy or even failure in visual simultaneous localization and mapping systems caused by object occlusion in highly dynamic environments, this paper proposes a dynamic VSLAM algorithm based on a joint geometric-motion error model and trajectory prediction. Unlike methods that rely on semantic segmentation or optical flow estimation, this approach fuses camera and IMU information to jointly model the epipolar geometric error and IMU pre-integrated motion error, and employs a probabilistic model for dynamic object detection and occlusion state estimation, maintaining high robustness under occluded conditions. To improve the continuity and accuracy of dynamic object tracking, an Extended Kalman Filter-based trajectory prediction is introduced for object pose estimation. Meanwhile, a joint factor graph model is constructed to optimize the camera, map points, and dynamic object feature points, where a dynamic motion-smoothing factor is designed to suppress abrupt object motion and reduce accumulated errors. Finally, experiments on the KITTI tracking dataset and real-world scenarios demonstrate that, compared with geometry-based and object-tracking-based dynamic SLAM methods, the proposed algorithm achieves superior pose estimation accuracy and dynamic object tracking performance in object occlusion scenarios within highly dynamic environments.
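The epipolar geometric error mentioned in this abstract can be illustrated with a small sketch that measures how far a matched point lies from its epipolar line. It covers only the geometric term; the IMU pre-integration error and the probabilistic model are not shown. The fundamental matrix, the point coordinates and the 0.05 threshold are assumed values chosen for the example.

```python
import math

def epipolar_distance(F, x1, x2):
    """Distance from x2 to the epipolar line F @ x1 (points are (x, y) pairs)."""
    p1 = (x1[0], x1[1], 1.0)
    p2 = (x2[0], x2[1], 1.0)
    line = [sum(F[r][c] * p1[c] for c in range(3)) for r in range(3)]
    numerator = abs(sum(p2[r] * line[r] for r in range(3)))
    return numerator / math.hypot(line[0], line[1])

# Assumed camera motion: pure translation along x, identity rotation and intrinsics,
# so F equals the skew-symmetric matrix of t = (1, 0, 0).
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]

static_err = epipolar_distance(F, (0.2, 0.1), (0.5, 0.1))   # moves along its epipolar line
dynamic_err = epipolar_distance(F, (0.2, 0.1), (0.5, 0.4))  # leaves its epipolar line
is_dynamic = dynamic_err > 0.05  # 0.05 is an illustrative threshold
```

A static scene point stays on its epipolar line under camera motion, so a large residual suggests independent object motion.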
Ai Qiang , Feng Yongan , Wang Lingchao , Dong Lixin
2026, 49(5):239-250.
Abstract:To address the challenges of high computational overhead and insufficient detection capability for small and deformed targets when deploying existing traffic sign detection algorithms on edge computing devices, this paper proposes a lightweight YOLOv8-based traffic sign detection algorithm, LTS-YOLO, incorporating channel pruning. First, building upon YOLOv8n, we design a feature fusion module combining noise suppression and deep semantic enhancement (NSSE) to optimize multi-scale feature representation and suppress background interference. Second, we introduce a multi-scale channel attention (MSCA) mechanism into the backbone network and integrate local deformation attention (LDA) before the detection head, enhancing the model's perception of multi-scale targets and robustness to geometric deformations, thereby resulting in a high-precision model, TS-YOLO. Finally, to achieve model compression, we apply a BatchNorm scaling factor-based channel pruning strategy to compress TS-YOLO, obtaining the final LTS-YOLO model. Experimental results on the TT100K and CCTSDB datasets demonstrate that, compared to the baseline YOLOv8n, TS-YOLO improves mAP@50 by 2.5% and 1.8% on TT100K and CCTSDB, respectively. After pruning, the resulting LTS-YOLO model maintains its accuracy advantage while significantly reducing both parameter count and computational complexity, demonstrating the effectiveness and practicality of the proposed method.
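BatchNorm scaling factor-based channel pruning is commonly done by ranking channels by the magnitude of their scale factor and removing the smallest. The sketch below shows that selection step under assumed inputs. The function name, the global-threshold rule and the `prune_ratio` value are illustrative and are not taken from the paper.

```python
def channel_keep_masks(gammas_per_layer, prune_ratio):
    """Mark channels to keep using one global threshold on |gamma|.

    gammas_per_layer: list of lists of BatchNorm scale factors, one list per layer.
    prune_ratio: fraction of all channels to remove (hypothetical setting).
    """
    magnitudes = sorted(abs(g) for layer in gammas_per_layer for g in layer)
    n_prune = int(len(magnitudes) * prune_ratio)
    threshold = magnitudes[n_prune - 1] if n_prune > 0 else -1.0
    masks = []
    for layer in gammas_per_layer:
        mask = [abs(g) > threshold for g in layer]
        if not any(mask):  # never remove a whole layer: keep its largest channel
            best = max(range(len(layer)), key=lambda i: abs(layer[i]))
            mask[best] = True
        masks.append(mask)
    return masks

# Made-up scale factors for two layers of four channels each.
gammas = [[0.90, 0.02, 0.45, 0.01], [0.03, 0.04, 0.70, 0.60]]
masks = channel_keep_masks(gammas, prune_ratio=0.5)
```

In a real pipeline the masks would then be used to rebuild narrower convolution layers, usually followed by fine-tuning to recover accuracy.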
Zhao Zhibin , Wang Hao , Zhu Jianguang , Xue Dan
2026, 49(5):251-260.
Abstract:To address the issue of knowledge redundancy in knowledge distillation-based anomaly detection models, this paper proposes an industrial detection method based on asymptotic knowledge fusion distillation. This paper adopts reverse knowledge distillation as the backbone network of the detection model. Although reverse knowledge distillation can prevent the propagation of abnormal representations to the student network, the knowledge acquired by the student network under this method is not only complex but also redundant, making it difficult to ensure that the student network can reconstruct the corresponding shallow representations. To transform the high-level, complex, and redundant information output by the teacher network, an asymptotic knowledge fusion mechanism is proposed. For transforming complex representations, this mechanism gradually transfers basic geometric knowledge to deep semantic knowledge, enabling the student network to learn feature representations effectively. For eliminating redundant knowledge from the information output by multi-level teacher networks, this mechanism adopts a learnable feature weight assignment method, which promotes the reconstruction of the student network and improves the anomaly detection capability of the model. Experiments are conducted on the MVTec AD dataset. The results show that the evaluation metrics under all categories of the dataset are 99% for AUROC-image, 97.9% for AUROC-pixel, and 94.8% for AUPRO, which outperform most of the current mainstream detection models, verifying the effectiveness and superiority of the proposed method.
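The abstract does not specify how the learnable feature weights are assigned. One common pattern, shown here purely as an assumption, is to pass one learnable score per teacher level through a softmax and take the weighted sum of the level features. The function names and the toy feature vectors are hypothetical.

```python
import math

def softmax(logits):
    """Normalise raw scores into positive weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_levels(features, logits):
    """Weighted sum of same-length feature vectors, one per teacher level."""
    weights = softmax(logits)
    dim = len(features[0])
    return [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]

# Three hypothetical teacher levels, already projected to a common length.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
logits = [0.0, 0.0, 0.0]  # in training these would be learned parameters
fused = fuse_levels(features, logits)
```

With equal scores every level contributes equally. Training would move the scores so that levels carrying redundant information receive smaller weights.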
