Volume 47, Issue 1, 2024 Table of Contents

Design of cascading network time comparison system

Liu Yinhua, Zhang Yingbo, Liu Ya, Li Xiaohui

2024, 47(1):1-8.

Abstract:With the development of science and technology, high-accuracy time comparison systems built on networks of multiple cascaded nodes are needed in a growing number of applications, such as modern shooting ranges, battlefields, and radar observation bases. At present, navigation satellite common-view technology is widely used for multi-node time comparison where nanosecond-level accuracy is required. However, a data exchange link must be built before the common-view method can be used, so its application is limited when the resources to build such a link are insufficient. In this paper, a multi-node network time comparison system is designed based on pseudo-code ranging theory, the two-way time comparison method, and code division multiple access (CDMA). The system needs no additional data exchange link. In addition, the key technical problems are analyzed and solved. Finally, validation experiments are carried out on the designed system. The results show that a time comparison uncertainty better than 0.5 ns is reached when the baseline is shorter than 500 meters, and that changing the number of nodes does not degrade time comparison performance.
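
The two-way principle named in this abstract can be illustrated with a minimal sketch. It assumes that each node measures the interval between its own transmit epoch and the arrival of the other node's ranging signal, and that the path delay is the same in both directions. The function names and the 3.2 ns example offset are hypothetical and are not taken from the paper.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def simulate_intervals(offset_ns, baseline_m):
    """Return the intervals (in ns) measured at node A and node B.

    offset_ns is how much later node B's transmit epoch occurs than node A's.
    Assumption: both directions share the same free-space path delay.
    """
    delay_ns = baseline_m / SPEED_OF_LIGHT_M_PER_S * 1e9
    t_a = delay_ns + offset_ns  # A waits for B's (late) signal
    t_b = delay_ns - offset_ns  # B started late, so A's signal seems early
    return t_a, t_b


def two_way_offset_ns(t_a, t_b):
    """Half the difference cancels the symmetric path delay."""
    return (t_a - t_b) / 2.0


def one_way_delay_ns(t_a, t_b):
    """Half the sum cancels the clock offset and leaves the path delay."""
    return (t_a + t_b) / 2.0


t_a, t_b = simulate_intervals(offset_ns=3.2, baseline_m=500.0)
print(two_way_offset_ns(t_a, t_b), one_way_delay_ns(t_a, t_b))
```

Because the offset is recovered from the two local measurements alone, the sketch also shows why the delay does not need to be known in advance; any asymmetry between the two directions would appear directly as an offset error.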

Knowledge-aware recommendation model fused with interaction information between entities and relations

Yao Jing, Lyu Teng

2024, 47(1):9-16.

Abstract:Because knowledge graphs contain rich item attributes and associated information, introducing them into recommendation systems can alleviate the data sparsity and cold-start problems to some extent. For example, propagation-based recommendation systems use the graph structure of knowledge graphs to learn features such as user and item representations. However, the contribution of interaction information between entities and relations to feature representation is often ignored during propagation, so this paper proposes a knowledge-aware recommendation model fused with interaction information between entities and relations. First, collaborative information and knowledge associations are integrated, and heterogeneous propagation is used to propagate and expand the representations of users and items. Second, during propagation, an attention mechanism strengthens the interaction information between entities and relations, enhances semantic relevance, and ensures the effectiveness of knowledge-based high-order interaction between users and items. Then, a knowledge-aware attention mechanism distinguishes the importance of each entity's neighbors in every layer and generates more accurate representations of users and items. Finally, to predict the probability that a user interacts with an item, an aggregator combines the multiple representations into the final user and item representations. To optimize the model, a KL divergence loss function is added to reduce the difference between the model's predicted distribution and the real distribution. Experimental results on three datasets, Last.FM, Book-Crossing, and MovieLens-20M, show that the proposed model greatly improves CTR prediction performance compared with the baseline models.
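
As a rough illustration of a KL divergence term for click prediction, the sketch below treats each interaction as a Bernoulli distribution. This per-sample formulation is an assumption; the abstract does not state how the predicted and real distributions are defined, and the function name is hypothetical.

```python
import math


def bernoulli_kl(p_true, p_pred, eps=1e-12):
    """KL(P_true || P_pred) for two Bernoulli distributions.

    Probabilities are clipped to (eps, 1 - eps) so the logarithms stay finite.
    """
    p = min(max(p_true, eps), 1.0 - eps)
    q = min(max(p_pred, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


# The divergence is zero when the prediction matches and grows as it drifts.
for predicted in (0.9, 0.7, 0.5):
    print(predicted, bernoulli_kl(0.9, predicted))
```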

Research and simulation analysis of downhole NFC antenna

Zhang Qimengsha, Hu Yongjian, Sun Chengqin, Sun Qi, Zhang Guanjie

2024, 47(1):17-22.

Abstract:A ferrite antenna is modeled with 3D finite element electromagnetic simulation software to achieve near-field communication (NFC) in complex downhole environments. The results show that the coil radius, wire radius, and antenna spacing have a significant impact on the forward transmission gain (S21) of the antenna, while the influence of the relative permeability of the ferrite loop is relatively small. To investigate the impact of the underground environment on antenna transmission performance, antenna impedance matching is performed, and a drilling tool model with a drilling fluid medium is established. The results show that metals cause 3.3 dB attenuation of S21, and that increasing the conductivity of the drilling fluid from 0 S/m to 2 S/m causes 8.2 dB attenuation of S21. The simulation and test results agree well: the deviation of the S21 curve is less than 0.3 dB for the basic model and less than 5.5 dB for the extended model, which can guide downhole NFC antenna design.
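
To give the reported attenuation figures a linear-scale reading, the following sketch converts a loss in dB into the remaining power and amplitude fractions, using the standard definition of S21 in dB as 20·log10 of its magnitude. The helper names are hypothetical.

```python
def db_to_power_ratio(attenuation_db):
    """Fraction of transmitted power that remains after the given loss."""
    return 10.0 ** (-attenuation_db / 10.0)


def db_to_amplitude_ratio(attenuation_db):
    """Fraction of the S21 magnitude that remains after the given loss."""
    return 10.0 ** (-attenuation_db / 20.0)


for label, loss_db in (("metal drilling tool", 3.3), ("drilling fluid at 2 S/m", 8.2)):
    print(f"{label}: {db_to_power_ratio(loss_db):.3f} of the power, "
          f"{db_to_amplitude_ratio(loss_db):.3f} of the amplitude")
```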

Research on ultrasonic detection and recognition methods for debonding defects in rubber-lined pipes

Wei Chengyao, Wang Xuemei, Ni Wenbo, Chen Guo, Zhong Hao

2024, 47(1):23-30.

Abstract:In-service rubber-lined pipes currently lack effective detection methods for debonding defects, and existing detection suffers from low efficiency and accuracy. Based on the basic principle of the ultrasonic pulse-echo method, a scanning and probe clamping device suitable for ultrasonic detection of cylindrical rubber-lined pipes was designed, and a corresponding ultrasonic detection experimental system was established. The interference factors that affect ultrasonic echo signals in practical applications were analyzed, and a binary classification model for ultrasonic echo signals based on a one-dimensional convolutional neural network (CNN) was constructed. Experiments and comparison with traditional ultrasonic defect recognition methods show that the established detection system and one-dimensional CNN model identify debonding defects more accurately even in the presence of multiple interference factors, with an accuracy of 96.22%. This provides an effective method for the automated detection and recognition of debonding defects in in-service rubber-lined pipes.

Research on surface detection of ceramic insulator based on image processing

Zhu Ziqing, Liu Xiaofeng

2024, 47(1):31-37.

Abstract:Full-surface inspection of ceramic insulators for high-voltage lines is an important part of ensuring their quality. Because the insulator surface is complex, manual inspection is currently the mainstream approach, and missed and false detections are inevitable. In recent years, the trend has been to inspect ceramic insulators with automatic devices equipped with machine vision. This paper identifies the main insulator defects, bubbles and cracks, and preprocesses the images with improved KNN edge filtering. Bubble defects are extracted with a weighted fitting method, and the bubble ROI is used for positioning and extraction. Cracks are handled by threshold segmentation, morphological processing, and crack positioning, followed by feature screening. The method identifies ceramic surface defects quickly and accurately: the recognition time is within 200 ms and the recognition rate reaches 98.2%, meeting the precision requirements of the high-voltage line ceramic industry.

Research on the influence of aircraft airborne trunking structure on electromagnetic protection performance

Pang Siyu, Qian Yuchen, Xiao Haijian, Lu Xiang, Ding Zihang

2024, 47(1):38-45.

Abstract:Wiring harnesses on aircraft airborne platforms face complex electromagnetic interference from connected equipment and external radiation sources in a lightning strike environment. Considering how the trunking structure affects its protective performance, a theoretical expression for the transfer impedance of the trunking from low to high frequency is derived based on the triaxial method combined with the Van Helvoort shape coefficient. A simulation model of trunking with different aspect ratios is then established to study the protective performance of trunking of different shapes and sizes. Finally, a lightning electromagnetic protection simulation model is established for the trunking inside a composite fuselage to analyze the protective effect of trunking with different positions and shapes, further verifying the differences in protective performance between trunking structures. The results show that for every 1/2 increase in aspect ratio, the shielding effectiveness of the trunking increases by about 5 dB and the effective protection area gradually expands. As the distance between the body and the skin increases, the induced current of the cables inside the body decreases linearly. The sealing performance of the trunking is directly proportional to its protective performance.

IGBT aging prediction model based on improved DBO optimization BiLSTM

Han Sumin, Zhao Guoshuai, Shang Zhihao, Yu Yuewei, Guo Yu

2024, 47(1):46-54.

Abstract:To characterize the aging trend of IGBT modules in inverter faults and improve the prediction accuracy of the aging process, this paper proposes an IGBT aging prediction model in which an improved dung beetle optimizer (IDBO) optimizes the hyperparameters of a bidirectional long short-term memory network (BiLSTM). First, the time-frequency domain features of Vce.on during the aging process are extracted, and a normalized composite index is constructed by dimensionality reduction with kernel principal component analysis. Second, to address the shortcomings of the dung beetle optimizer (DBO), its optimization ability and convergence performance are improved by introducing an improved Circle chaotic mapping, Levy flight, and adaptive weighting factors, and the IDBO is used to globally optimize the hyperparameters of the BiLSTM prediction model. Finally, the effectiveness and superiority of the IDBO-optimized BiLSTM aging prediction model are verified on actual IGBT degradation data. The results show that the constructed IDBO-BiLSTM model reduces RMSE by 36.42%, MAE by 31.77%, and MAPE by 41.03% on average compared with the BiLSTM model.
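
The three error metrics quoted in this abstract follow their standard definitions, and the sketch below shows how a relative reduction between two models would be computed from them. The sample values are made up for illustration and are not data from the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be nonzero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def relative_reduction(baseline, improved):
    """Percentage by which an error metric drops relative to a baseline model."""
    return 100.0 * (baseline - improved) / baseline


actual = [1.0, 2.0, 4.0]      # illustrative values only
predicted = [1.0, 3.0, 2.0]
print(rmse(actual, predicted), mae(actual, predicted), mape(actual, predicted))
print(relative_reduction(baseline=2.0, improved=1.5))
```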

Research on AC-DC mixed voltage switching driving method for μHRG

Zhang Lin, Li Rongbing, Xu Jing, Li Zhongliang

2024, 47(1):55-62.

Abstract:The scale factor of the micro hemispherical resonator gyroscope (μHRG) is affected by the degree of damping nonuniformity and by amplitude control accuracy. Besides ensuring high symmetry of the gyroscope structure, accurate amplitude control of the driving mode is therefore particularly important. The common first-harmonic and second-harmonic voltage driving methods suffer from same-frequency driving noise interference and compensation phase delay, which lead to low amplitude control accuracy of the μHRG driving mode and insufficient stability of the scale factor. Taking into account that the equivalent driving form of the gyroscope changes after resonance, a novel approach is proposed based on the voltage driving principle of micro resonators. A dynamic vibration model of the micro resonator during the stable amplitude stage is constructed. A same-phase double-frequency voltage driving method is introduced for this stage, and a mixed AC-DC voltage switching driving method is designed for gyroscope modal control. This method effectively mitigates the amplitude fluctuation interference caused by phase compensation and same-frequency driving. Simulation experiments in force balance mode show a 5.882% improvement in the control accuracy of the modal amplitude and a 6.625% improvement in the stability of the gyroscope scale factor under the mixed voltage switching driving method.

Hysteresis soft-switching VLC-WiFi access selection algorithm based on quadrant partitioning

Zhang Huiying, Liang Shida, Li Yueyue, Sheng Meichun, Ma Chengyu

2024, 47(1):63-70.

Abstract:To address spectrum scarcity and frequent user switching in wireless networks, this paper proposes a VLC-WiFi heterogeneous access and hysteresis soft-switching algorithm based on quadrant partitioning. Using a three-dimensional coordinate system, the indoor area is divided into four parts; a user rate model is established to obtain the user rate, and an occlusion model is established to simulate blockage of the VLC link. Users access the network based on parameters such as location, quadrant, occlusion, and speed. Based on hysteresis, switching uses different thresholds for inward and outward switching, which creates a buffer zone for switching and suppresses the ping-pong effect. In a 5 m×5 m×5 m indoor space, multiple experiments show that, compared with traditional algorithms, the proposed indoor VLC-WiFi heterogeneous networking scheme reduces the number of horizontal switches by 43.8%, the number of occlusion and vertical switches by 45.3%, the number of ping-pong events by 52.27%, and the throughput by about 5.84%. The algorithm can significantly reduce switching and ping-pong events and provides a theoretical basis for the study of indoor heterogeneous network communication.
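
The effect of separate inward and outward thresholds can be sketched with a two-state switch. This is a generic hysteresis example under assumed inputs: the link-quality trace, the threshold values, and the function name are hypothetical, and the paper's actual decision also depends on quadrant, occlusion, and speed.

```python
def count_switches(quality_trace, enter_vlc, leave_vlc, start="wifi"):
    """Count network switches for a VLC link-quality trace.

    The user moves to VLC when quality reaches enter_vlc and falls back to
    WiFi only when quality drops below leave_vlc (leave_vlc <= enter_vlc).
    """
    if leave_vlc > enter_vlc:
        raise ValueError("leave_vlc must not exceed enter_vlc")
    network = start
    switches = 0
    for quality in quality_trace:
        if network == "wifi" and quality >= enter_vlc:
            network = "vlc"
            switches += 1
        elif network == "vlc" and quality < leave_vlc:
            network = "wifi"
            switches += 1
    return switches


trace = [0.40, 0.60, 0.45, 0.55, 0.45, 0.60]  # quality hovering around 0.5
print(count_switches(trace, enter_vlc=0.50, leave_vlc=0.50))  # single threshold
print(count_switches(trace, enter_vlc=0.58, leave_vlc=0.42))  # with hysteresis
```

With a single threshold every crossing triggers a switch, whereas the gap between the two thresholds absorbs small fluctuations, which is the ping-pong suppression the abstract describes.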

SOC estimation of ternary lithium-ion battery based on ASSA-RBF joint algorithm

Liu Qi, Wu Songrong, Deng Hongli, Zhang Hanwen, Fu Cong, Liu Bo

2024, 47(1):71-78.

Abstract:Accurately estimating the state of charge (SOC) of ternary lithium batteries is the foundation for ensuring the safe and stable operation of electric vehicles. To address the low estimation accuracy of traditional BP neural networks and the tendency of RBF neural networks to fall into local optima, this paper proposes a ternary lithium battery SOC estimation method that combines an adaptive sparrow search algorithm with an RBF neural network. First, the standard sparrow search algorithm is improved by using an elite chaos reverse mechanism to initialize the sparrow population, and a Cauchy-Gaussian mutation strategy is used to optimize the follower position update formula. Then, the improved sparrow search algorithm is used to optimize the initial weight and width parameters of the RBF neural network to improve the algorithm's SOC estimation accuracy. Finally, the model is validated on charging and discharging experimental data from ternary lithium batteries. The results show that under dynamic stress test conditions, the proposed joint algorithm has a root mean square error of 0.694% and an average percentage error of 3.15% in SOC estimation, so it is well suited to SOC estimation of ternary lithium batteries.
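
The weight and width parameters mentioned in this abstract belong to the RBF network's hidden layer. A minimal forward pass with Gaussian basis functions is sketched below; the Gaussian form, the two-feature input, and all numeric values are assumptions for illustration, and the sparrow search optimization itself is not shown.

```python
import math


def rbf_forward(x, centers, widths, weights, bias=0.0):
    """Output of a single-output RBF network with Gaussian hidden units.

    Each hidden unit j computes exp(-||x - c_j||^2 / (2 * width_j^2)); the
    output is the weighted sum of the hidden activations plus a bias.
    """
    output = bias
    for center, width, weight in zip(centers, widths, weights):
        squared_distance = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
        output += weight * math.exp(-squared_distance / (2.0 * width ** 2))
    return output


# Hypothetical two-feature input (for example normalized voltage and current).
centers = [[0.2, 0.1], [0.8, 0.5]]
widths = [0.3, 0.4]
weights = [0.25, 0.70]
print(rbf_forward([0.6, 0.4], centers, widths, weights, bias=0.05))
```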

Improved dung beetle optimization for feature selection tasks

Li Jun, Xu Qin

2024, 47(1):79-86.

Abstract:The dung beetle optimization (DBO) algorithm is a novel heuristic algorithm inspired by the behaviors of dung beetles. It exhibits faster convergence and a stronger ability to escape local optima than other algorithms. However, the DBO algorithm lacks the capability to perform feature selection. This paper proposes a dung beetle and grey wolf fusion algorithm (DBOG), an improvement of the DBO algorithm designed specifically for feature selection tasks. The DBOG incorporates three enhancement strategies: an elite population initialization strategy, a grey wolf-dung beetle fusion strategy, and a runtime acceleration strategy. These strategies aim to further enhance the performance of the DBO algorithm in feature selection tasks. Additionally, we provide pseudocode for the overall algorithm. Experimental results demonstrate that, compared with other improved heuristic algorithms, the DBOG achieves higher accuracy and lower-dimensional feature subsets across 12 classification datasets. Moreover, it offers faster convergence and better computational efficiency.

Optimization of fuzzy control for switching power supply based on improved whale algorithm

Zhang Jialiu, Li Zhengquan

2024, 47(1):87-92.

Abstract:For flyback switching power supplies, the traditional fuzzy PID control algorithm struggles to achieve good control performance. The whale optimization algorithm (WOA) is therefore used to optimize the fuzzy PID control system: the initial population of the whale algorithm is generated chaotically by Tent mapping, and the universe of discourse of the fuzzy PID control system is optimized by the resulting chaotic whale optimization algorithm (CWOA). This improves the output stability and anti-interference ability of the flyback switching power supply. A model of the flyback switching power supply is built in Matlab for simulation. The simulation results show that the stability of the flyback switching power supply under CWOA-FPID is better than under WOA and particle swarm optimization. When the load changes from 10 Ω to 5 Ω, the proposed control strategy has significant advantages in voltage recovery time and abrupt output voltage change over the other two algorithms. This shows that the fuzzy PID controller optimized by the improved whale algorithm can better meet the control requirements of flyback switching power supplies.
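
Chaotic initialization with a Tent map can be sketched as follows. The skewed map with its peak at 0.7, the seed, and the restart guard are assumptions chosen so that the orbit behaves well in floating point; the abstract does not give the exact map parameters used in the paper.

```python
def tent_map_sequence(length, seed=0.37, peak=0.7):
    """Generate a tent-map orbit with values strictly inside (0, 1)."""
    values = []
    x = seed
    for _ in range(length):
        x = x / peak if x < peak else (1.0 - x) / (1.0 - peak)
        if not 0.0 < x < 1.0:
            x = seed  # hypothetical guard: restart if the orbit hits 0 or 1
        values.append(x)
    return values


def chaotic_population(size, lower, upper, seed=0.37):
    """Scale a tent-map orbit onto the search bounds of every dimension."""
    dims = len(lower)
    orbit = tent_map_sequence(size * dims, seed)
    return [
        [lower[d] + orbit[i * dims + d] * (upper[d] - lower[d]) for d in range(dims)]
        for i in range(size)
    ]


# Five candidate solutions in a two-dimensional search space.
for individual in chaotic_population(5, lower=[0.0, -1.0], upper=[10.0, 1.0]):
    print(individual)
```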

Research on the control algorithm of inverted pendulum based on improved SAC

Zhang Xiaoli, Guo Shilin, Liu Ding, Song Wanying

2024, 47(1):93-100.

Abstract:Inverted pendulum systems are subject to external interference and are naturally unstable, and the SAC deep reinforcement learning algorithm suffers from low utilization of sampled data and slow convergence of its stochastic off-policy policy network. To address these problems, an improved algorithm, PRER_SAC, is proposed that combines recency experience sampling with an optimized policy network structure. A neural network fitting function is constructed; the policy network uses the better-performing Mish function as its activation function, and a self-adjusting temperature coefficient is set to enhance the agent's exploration ability. Two experience pools, far and near, are designed, together with a training strategy that changes the frequency of data storage. Simulation experiments show that, for the same amount of training, the return and convergence speed of the proposed method are better than those of the DDPG and SAC algorithms, and its control performance is better than that of the traditional PID and LQR control methods. Finally, an angle disturbance applied to the trained agent is eliminated within 2 s, which shows that the proposed algorithm has strong applicability.
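
For reference, the Mish activation named in this abstract is defined as x·tanh(softplus(x)). The scalar sketch below uses a numerically stable softplus; how the activation is wired into the policy network is not described in the abstract and is not shown here.

```python
import math


def softplus(x):
    """Stable ln(1 + e^x): avoids overflow for large positive x."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish activation: smooth, non-monotonic, slightly negative for x < 0."""
    return x * math.tanh(softplus(x))


for value in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(value, mish(value))
```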

Object tracking algorithm with jointing high order target aware and similarity matching

Zhang Nianchao, Zhang Baohua, Li Yongxiang, Gu Yu

2024, 47(1):101-109.

Abstract:The self-attention mechanism is used to enhance context in visual object tracking algorithms, but in complex scenes the correlations it computes are prone to mismatch. Therefore, an object tracking algorithm combining high-order target awareness and similarity matching is proposed. A high-order target-aware model is constructed on the first-order self-attention map of the self-attention mechanism: the collapsed polarization filtering method performs orthogonal modeling of the spatial and channel dimensions and optimizes the internal correlation. At the same time, a nonlinear fitting function is combined to avoid the information loss caused by the collapse, and a high-order self-attention map is then obtained to capture perceptual features with high-order context information. The perceptual features of the target are decomposed in different dimensions to refine the matching area, which suppresses background noise, constrains the response map of the current frame, and improves the discriminative power of the network. Experimental results on the OTB100 and UAV123 benchmarks show that the proposed algorithm achieves better tracking performance and can effectively handle problems such as interference from similar objects.

Instance segmentation algorithm for electrical equipment images under complex background conditions

Zhang Zhijun, Zhang Jinglei, Jia Xin

2024, 47(1):110-117.

Abstract:Visible-light images of electrical equipment captured during substation inspection are characterized by cluttered backgrounds and irregular target contours, which lowers equipment segmentation accuracy and degrades the equipment recognition of intelligent inspection systems. This paper proposes an improved YOLACT++ model to achieve accurate instance segmentation of equipment targets. First, the electrical equipment feature extraction backbone network DAGNet is designed to improve the network's attention to important features in complex backgrounds. In addition, the 3D attention module SimAM is introduced in the prototype network branch to reduce the interference of cluttered backgrounds with target segmentation. The model is validated on a labeled dataset of 1,730 visible-light images of six types of electrical equipment, including surge arresters and circuit breakers, obtained from inspections of 58 substations rated at 110 kV and 86 substations rated at 35 kV in eight regions of a city. The experimental results show that the APall of the improved YOLACT++ model reaches 84.1%, which is 4.4% higher than the original model and 4.0%, 9.3%, and 1.6% higher than the YOLACT, Mask R-CNN, and YOLOv8 models, respectively, giving better recognition of the six types of electrical equipment and meeting the accuracy and speed requirements of electric power inspection.

Multi-view stereo reconstruction based on gated recurrent deep range prediction network

Gao Yu, Zhu Lizhong, Liu Yunting, Liu Xiaoyu

2024, 47(1):118-124.

Abstract:3D reconstruction techniques have difficulty handling high-resolution images, and the reconstructed point clouds have low accuracy and blurred boundaries. To address these problems, this paper proposes a multi-stage, multi-scale dynamic depth range prediction network based on gated recurrent units. First, a curvature-guided dynamic scale convolutional network serves as the feature extraction module and obtains feature information for the optimal pixels by computing the surface normal curvature of the image at multiple scales. Then, the fine feature information is combined with a new depth range estimation module to dynamically estimate the depth range hypotheses of the next stage, so as to better merge information from neighboring pixels and achieve accurate matching between the reference image and the source image. The network is compared with more than 10 other methods; on the DTU dataset, its overall performance is 2.2% better than that of the second-ranked network. On the Tanks and Temples dataset, reconstruction performance on the Lighthouse, M60, and Panther scenes is substantially improved. Comparison and ablation experiments further demonstrate that the proposed dynamic depth prediction network significantly improves the accuracy and completeness of the reconstructed point clouds while reducing memory consumption.

Multi-level decoding neural network for pitting detection of ball screw

Zhao Huifeng, Li Tiejun

2024, 47(1):125-129.

Abstract:Because pitting areas on ball screws are small and environmental interference is severe, defects are difficult to detect in time. Therefore, a multi-level decoding neural network is proposed to segment pitting defects in ball screws. The network consists of an encoder, a multi-level decoder, and a multi-scale attention module. The encoder is built on ResNet34, and the Ghost module is introduced to build a lightweight multi-level decoder. The multi-scale attention module is designed to fuse multi-scale features and filter redundant information. A hybrid loss function composed of the BCE, IoU, and SSIM functions is used to train the network. Experiments on the ball screw defect dataset show that the multi-level decoding neural network achieves 0.7703 on the maxFβ metric, a better segmentation result than the other methods, with a processing time of 26 ms per image. It provides a new method for real-time segmentation of ball screw pitting defects.
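
Two of the three terms of such a hybrid segmentation loss are easy to sketch on flattened masks. The version below sums a binary cross-entropy term and a soft IoU term with equal weights; the SSIM term is deliberately omitted, and the equal weighting is an assumption because the abstract does not give the coefficients.

```python
import math


def bce_loss(pred, target, eps=1e-7):
    """Mean binary cross-entropy over flattened pixel probabilities."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # keep the logarithms finite
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(pred)


def iou_loss(pred, target, eps=1e-7):
    """Soft IoU loss: 1 - intersection / union on probabilities."""
    intersection = sum(p * t for p, t in zip(pred, target))
    union = sum(pred) + sum(target) - intersection
    return 1.0 - (intersection + eps) / (union + eps)


def hybrid_loss(pred, target):
    """Equal-weight sum of the two terms (SSIM term not included here)."""
    return bce_loss(pred, target) + iou_loss(pred, target)


target = [1.0, 0.0, 0.0, 1.0]
print(hybrid_loss([0.9, 0.1, 0.2, 0.8], target))
print(hybrid_loss([0.6, 0.4, 0.5, 0.3], target))
```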

Rail transit obstacle detection algorithm based on improved YOLOv5

Zhao Hongliang, Guo Youmin, Wang Jianxin, Yang Jun

2024, 47(1):130-135.

Abstract:To address the low accuracy and slow speed of obstacle detection against complex rail transit backgrounds, an improved object detection network based on YOLOv5 is proposed. First, EMO, a lightweight attention-based Transformer backbone, replaces some modules in the original YOLOv5 backbone, which keeps the model lightweight while improving its accuracy and stability. Second, Focal-EIoU replaces the CIoU loss function in YOLOv5 to solve the low training efficiency and slow convergence caused by CIoU. Finally, the lightweight upsampling operator CARAFE replaces the original upsampling layer in YOLOv5; it has a larger receptive field without introducing many extra parameters or much computational cost, and it improves detection accuracy and speed. Experimental results show that, compared with the original YOLOv5 network, the mean average precision of the proposed method is improved by 11.1%, the precision by 13%, and the recall by 11.4%, and the detection speed reaches 60.7 frames per second. The proposed method performs well in the object detection task and effectively enhances detection performance in the rail transit setting.

Cloud detection algorithm for remote sensing images based on semantic-guided and adaptive convolution

Xu Zichuan, Gong Xiaofeng

2024, 47(1):136-143.

Abstract:Cloud detection in remote sensing satellite data is a crucial component of remote sensing image processing. To address the low accuracy in detecting broken clouds and thin clouds, this paper proposes a novel cloud detection method that uses high-order semantic-guided decoding and adaptive convolutional encoding. The method leverages the spatial distribution relationship between the main cloud and broken clouds by introducing an adaptive convolutional encoder to extract correlation information between the main cloud clusters. A high-order semantic-guided decoding module then decodes the semantic features and restores high-resolution cloud mask images. Moreover, a dynamic fusion loss function is designed that computes its weight dynamically from the missed and wrong pixels in the prediction, guiding the network to focus on broken-cloud and thin-cloud features and thereby enhancing overall accuracy. Experimental results demonstrate that the proposed algorithm achieves an accuracy of over 96.5% and an intersection over union of over 88.1%, effectively detecting broken clouds and thin clouds.

Design of octagonal fiber size detection algorithm based on image processing

Wang Xiaolong, Chen Xiaorong, Wang Yuanye

2024, 47(1):144-149.

Abstract:Traditional fiber end-face size detection methods do not cover octagonal optical fibers, which leads to low measurement efficiency. To solve this problem, an image-processing algorithm for octagonal fiber size detection is designed. Three parameters are measured: the fiber core diameter, the cladding edge-to-edge distance, and the cladding diagonal distance. Gaussian filtering is used to denoise the image, and subpixel contours in different areas of the fiber under test are extracted. The cladding contour is divided into octagonal fiber contour segments using the Ramer algorithm, and a segmentation processing algorithm is proposed to process the multi-segment contour edges of different lengths and obtain eight valid contour segments. The least squares method is used for fitting: the fiber core contour is fitted as a circle to measure its diameter, and the cladding contour edges are fitted as straight lines to measure the distance between them. A vertex detection algorithm based on auxiliary lines is proposed to measure the diagonal distance. Experimental results show that the algorithm can accurately and quickly measure the core diameter, four groups of edge-to-edge distances, and four groups of diagonal distances of an octagonal fiber. The average accuracy of repeated measurements of the edge-to-edge and diagonal distances reaches 0.1 μm, which meets the requirements of enterprise technical indicators.
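
The least-squares circle fit used for the core diameter can be sketched with the algebraic (Kasa) formulation, which reduces to a 3×3 linear system. The choice of this particular formulation is an assumption, since the abstract only says that least squares is used, and the sample points below are synthetic.

```python
import math


def solve_3x3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        if abs(a[col][col]) < 1e-12:
            raise ValueError("points are degenerate (for example collinear)")
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n + 1):
                a[row][k] -= factor * a[col][k]
    solution = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(a[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (a[row][n] - tail) / a[row][row]
    return solution


def fit_circle(points):
    """Fit x^2 + y^2 = A*x + B*y + C and return (cx, cy, radius)."""
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    z = [x * x + y * y for x, y in points]
    normal = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(len(points))]]
    rhs = [
        sum(x * zi for (x, _), zi in zip(points, z)),
        sum(y * zi for (_, y), zi in zip(points, z)),
        sum(z),
    ]
    big_a, big_b, big_c = solve_3x3(normal, rhs)
    cx, cy = big_a / 2.0, big_b / 2.0
    return cx, cy, math.sqrt(big_c + cx * cx + cy * cy)


# Synthetic contour points on a circle centred at (2, -1) with radius 3.
contour = [(2 + 3 * math.cos(k * math.pi / 4), -1 + 3 * math.sin(k * math.pi / 4))
           for k in range(8)]
cx, cy, radius = fit_circle(contour)
print(cx, cy, 2 * radius)  # centre and fitted diameter
```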

Hyperspectral image denoising with sparse spatial-spectral transformer

Yang Zhixiang, Sun Yubao, Bai Zhiyuan, Luan Hongkang

2024, 47(1):150-158.

Abstract:The application of Transformer models has improved the performance of hyperspectral image (HSI) denoising. However, the original Transformer model still falls short in effectively leveraging the spatial-spectral coupling in HSIs. It tends to excessively smooth spatial features, leading to the loss of small-scale structures. Moreover, it attends to all spectral channel features equally, neglecting the differences between spectral bands. To solve these problems, this paper introduces a novel Sparse Spatial-Spectral Transformer model that enhances the utilization of spatial-spectral coupling. In the spatial dimension, a local enhancement module is introduced to refine spatial feature details and counter oversmoothing. In the spectral dimension, a top-k sparse self-attention mechanism is proposed, which adaptively selects the k most relevant spectral channel features for feature interaction, effectively capturing spatial-spectral characteristics. Finally, hyperspectral image denoising is achieved through hierarchical residual connections with the Sparse Spatial-Spectral Transformer. On the ICVL dataset, denoising under Gaussian noise and complex noise attains peak signal-to-noise ratios of 40.56 dB and 40.19 dB, respectively, demonstrating the superior performance of the proposed Sparse Spatial-Spectral Transformer model.
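
The idea of top-k sparse self-attention can be sketched for a single query: only the k highest-scoring keys take part in the softmax, and all other keys receive zero weight. This is a generic illustration with scaled dot-product scores; the function names and toy vectors are hypothetical, and the paper's channel-wise design may differ.

```python
import math


def topk_softmax(scores, k):
    """Softmax restricted to the k largest scores; the rest get weight 0."""
    keep = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    peak = max(scores[i] for i in keep)  # subtract the max for stability
    exps = {i: math.exp(scores[i] - peak) for i in keep}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(len(scores))]


def sparse_attention(query, keys, values, k):
    """Attention output for one query using only its top-k keys."""
    scale = math.sqrt(len(query))
    scores = [sum(q * c for q, c in zip(query, key)) / scale for key in keys]
    weights = topk_softmax(scores, k)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
print(sparse_attention([1.0, 0.0], keys, values, k=2))
```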

Layout segmentation based on Multi-WHFPN and SimAM attention mechanism

Yang Chenhui, Zhou Xiaoliang, Zhang Heng, Sun Zheng, Ye Ning

2024, 47(1):159-168.

Abstract:As a pre-processing step for OCR, layout segmentation is receiving increasing attention from both the academic and industrial communities. To address the problems encountered in layout segmentation, such as slow detection speed, inaccurate boundary detection of target areas, and frequent omission of small targets, the YOLOv7-MSY model is proposed. First, the Multi-WHFPN network structure is proposed by incorporating the idea of residual connections; trainable weight parameters are introduced to highlight the importance of features, and a small target detection head is added to enhance small target detection. Second, the SimAM attention mechanism is introduced to evaluate feature weights in three dimensions without adding extra parameters, enhancing important features and suppressing invalid ones. Finally, YEIOU replaces the original model's localization loss function, which improves the convergence speed and regression accuracy of the model. Experimental comparisons on the dataset provided by the Jiangsu Provincial Archives show that YOLOv7-MSY is more sensitive to the boundaries of target areas and performs better in detecting small targets. The mAP@.5 of YOLOv7-MSY reaches 0.871, which is 7.84% higher than that of the original YOLOv7 model. The layout segmentation of this model is superior to that of other layout segmentation algorithms. It generalizes well, and its layout segmentation speed is relatively high.

Spatial temporal convolutional Transformer network for skeleton-based action recognition

Liu Binbin , Zhao Hongtao , Wang Tian , Yang Yi

2024, 47(1):169-177.

Abstract:Skeleton-based action recognition methods built on graph convolution rely heavily on hand-designed graph topology when modeling joint features, and they lack the ability to model global joint dependencies. To address this issue, we propose a spatio-temporal convolutional Transformer network to model spatial and temporal joint features. For spatial joint feature modeling, we propose a dynamic grouping decoupling Transformer that groups the input skeleton sequence along the channel dimension and dynamically generates a different attention matrix for each group, establishing global dependencies between joints without requiring knowledge of the human topology. For temporal joint feature modeling, multi-scale temporal convolution is used to extract features of target behaviors at different scales. Finally, we propose a spatio-temporal channel joint attention module to further refine the extracted spatio-temporal features. The proposed method achieves Top-1 recognition accuracies of 92.5% and 89.3% under the cross-subject evaluation protocol on the NTU-RGB+D and NTU-RGB+D 120 datasets, respectively, demonstrating its effectiveness.
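
One common way to realize multi-scale temporal convolution is to run the same small kernel at several dilation rates and keep all the outputs. The sketch below does this for a single joint coordinate over time. The abstract does not say how the scales are produced, so the use of dilation, the zero padding, and the names `dilated_conv1d` and `multi_scale_temporal_features` are assumptions for illustration only.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1D cross-correlation with a dilated kernel."""
    half = len(kernel) // 2
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for i, w in enumerate(kernel):
            idx = t + (i - half) * dilation
            if 0 <= idx < len(signal):  # samples outside the sequence count as 0
                acc += w * signal[idx]
        out.append(acc)
    return out


def multi_scale_temporal_features(signal, kernel, dilations):
    """Return one filtered sequence per dilation rate (one per time scale)."""
    return [dilated_conv1d(signal, kernel, d) for d in dilations]


# Hypothetical input: a single impulse in a 7-frame sequence.
impulse = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
branches = multi_scale_temporal_features(impulse, [1.0, 1.0, 1.0], [1, 2])
```

The dilation-1 branch responds to adjacent frames, while the dilation-2 branch reaches frames two steps away, so the two branches cover short and longer motion patterns with the same kernel size.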

Method of space-time image velocimetry based on Radon transform

Li Han , Jin Shijun

2024, 47(1):178-185.

Abstract:The space-time image velocimetry technique uses the natural features of river surfaces for analysis. By examining the predominant texture direction in the generated space-time images, it calculates the one-dimensional time-averaged flow velocity of the river surface from the physical transformation relationships, the parameters of the captured video, and the tangent of the texture inclination angle. In practical applications, the accuracy of texture inclination angle detection in space-time images is strongly affected by noise. To address this problem, this paper proposes using an improved homomorphic filter to enhance the texture features of the river surface image, and adopts a frequency-domain filter integrated with adaptive histogram equalization to denoise the space-time image. The Radon transform is then used to determine the angular direction of the texture. The effectiveness of the improved method is verified through simulated texture image experiments and on-site river experiments under high and low flow conditions. The results show that, for standard simulated texture images, the relative error of the Radon transform's angle detection is less than 0.03%. In on-site river experiments with interference, the relative errors between the Radon transform-based texture angle detections and manual observations are less than 1.56% and 1.80% under low and high flow conditions, respectively. The experiments indicate that the Radon transform method is feasible and more accurate than other texture angle detection algorithms.
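
The conversion from texture angle to surface velocity mentioned in this abstract can be illustrated with the relation usually given for space-time image velocimetry, v = tan(θ) · S_x / S_t, where S_x is the ground distance per pixel along the measurement line and S_t is the time per frame. The sketch below assumes that θ is measured from the time axis of the space-time image; the function names and the sample numbers are hypothetical and are not taken from the paper.

```python
import math


def surface_velocity(angle_deg, metres_per_pixel, frames_per_second):
    """Time-averaged 1D surface velocity (m/s) from the texture angle.

    angle_deg is assumed to be measured from the time axis of the
    space-time image, so tan(angle) is in pixels per frame.
    """
    seconds_per_frame = 1.0 / frames_per_second
    pixels_per_frame = math.tan(math.radians(angle_deg))
    return pixels_per_frame * metres_per_pixel / seconds_per_frame


def relative_error(measured, reference):
    """Relative error of a measured value against a reference value."""
    return abs(measured - reference) / abs(reference)


# Hypothetical example: 45 degree texture, 1 cm per pixel, 25 fps video.
example_velocity = surface_velocity(45.0, 0.01, 25.0)
# Hypothetical detected angle versus a manually observed angle.
example_error = relative_error(44.3, 45.0)
```

Because the velocity depends on the tangent of the angle, a small angular error grows quickly for steep textures, which is why the abstract reports the accuracy of the angle detection itself.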

Cross-view gait recognition based on multi-scale feature fusion

Zou Xue , Tan Mian , Yan Xiaobo , Wang Fei , Wang Lin

2024, 47(1):186-192.

Abstract:In cross-view gait recognition, it is difficult to extract distinguishable and diverse gait features under clothing occlusion, which reduces recognition accuracy. A cross-view gait recognition method based on a multi-scale feature fusion network is proposed. The method exploits the complementarity among gait features to obtain discriminative and diverse gait features, thereby addressing the poorly discriminative, uniform features caused by clothing occlusion and improving the accuracy of cross-view gait recognition. The proposed method is evaluated on the public CASIA-B dataset. The experimental results show that it achieves a recognition accuracy of 73.4% for cross-view gait recognition with occlusion, and 95.5% and 88.0% under normal and backpack walking conditions, respectively. In addition, the method outperforms other typical gait recognition methods under occluded conditions.
