Abstract: With the continuous development of intelligent security systems, pedestrian retrieval for all-day surveillance has become a research hotspot, giving rise to visible-infrared cross-modality person re-identification. The main challenge of this task is the large discrepancy between visible and infrared images of the same pedestrian. Existing methods focus on exploring shared information and reducing the feature variance of the same pedestrian across the two modalities. To further improve accuracy, this paper proposes a multi-granularity shared-disentangling relation network for re-identification. By embedding a shared-disentangling module, the parameter-sharing branch of the backbone is replicated and decomposed, overcoming the limitations of the original baseline model in multi-granularity feature extraction. A multi-granularity relation feature learning module fully exploits the modality-invariant correlation information of the pedestrian body, enhancing the learning of shared features. A multi-level loss function provides effective supervision for model training and optimizes the global-local feature alignment scheme. The proposed algorithm achieves superior performance on the two public datasets SYSU-MM01 and RegDB. On SYSU-MM01, Rank-1 and mAP reach 74.70% and 71.79%, respectively, in the All-search mode; on RegDB, Rank-1 and mAP exceed 90% in both retrieval modes, surpassing many state-of-the-art methods. Experiments demonstrate the advantages of this network in cross-modality feature alignment and adaptation to complex scenes.