Abstract: Traditional graph neural networks typically process single-modality data, which leads to incomplete information, inaccurate graph structure construction, and difficulty in capturing spatial dependencies among nodes. To address these limitations, this paper proposes a fault diagnosis method based on the Multimodal Manhattan Graph Lap-Transformer. The method constructs a novel graph structure using Manhattan distance, which measures inter-node similarity more stably and removes the dependence on a fixed topological structure, thereby improving adaptability to complex relationships in fault data. Graph topological information is encoded through the graph Laplacian matrix, so that the attention mechanism considers both node feature similarity and graph structural connectivity. This dual focus strengthens the modeling of local and global dependencies and effectively captures spatial relationships between nodes. In experiments on the PU bearing dataset, the AUST bearing dataset, and the tunnel boring machine bearing dataset, the method achieved average fault diagnosis accuracies of 99.7%, 98.8%, and 99.8%, respectively, verifying its superiority in bearing fault diagnosis. The method achieves high diagnostic accuracy and adapts well to varying working conditions under noisy and multi-condition circumstances. It also exhibits good robustness and stability, providing a novel and efficient solution for the fault diagnosis of bearings and other mechanical equipment.
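The two graph ingredients named in the abstract, a Manhattan-distance graph and its graph Laplacian, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the paper's implementation: the k-nearest-neighbor rule, the value of `k`, the unnormalized Laplacian L = D - A, the random feature matrix, and the function names `manhattan_knn_graph` and `graph_laplacian`. How the Laplacian enters the attention computation is not specified in the abstract and is not modeled here.

```python
import numpy as np

def manhattan_knn_graph(features, k=2):
    """Build a symmetric k-nearest-neighbor adjacency matrix from
    pairwise Manhattan (L1) distances. The kNN rule is an assumption."""
    n = features.shape[0]
    # dist[i, j] = sum over dimensions of |x_i - x_j|
    dist = np.abs(features[:, None, :] - features[None, :, :]).sum(axis=-1)
    np.fill_diagonal(dist, np.inf)  # keep a node from choosing itself
    neighbors = np.argsort(dist, axis=1)[:, :k]  # k closest nodes per row
    adj = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    adj[rows, neighbors.ravel()] = 1.0
    # Symmetrize: keep an edge if either endpoint selected the other
    return np.maximum(adj, adj.T)

def graph_laplacian(adj):
    """Unnormalized graph Laplacian L = D - A (assumed variant)."""
    degree = np.diag(adj.sum(axis=1))
    return degree - adj

# Hypothetical node features: 6 nodes, 4 features each
rng = np.random.default_rng(0)
features = rng.normal(size=(6, 4))
adj = manhattan_knn_graph(features, k=2)
lap = graph_laplacian(adj)
```

Because the adjacency is derived from the feature distances themselves, no predefined topology is required, which is the property the abstract attributes to the Manhattan graph construction.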