Abstract: Few-shot military target recognition aims to achieve fast and accurate recognition of novel military targets with extremely limited labeled samples, and is of great significance in military remote sensing interpretation, battlefield situation awareness, and decision support. Metric learning-based methods perform recognition by constructing class prototypes and measuring the similarity between query samples and prototypes. Owing to their simple structure, flexible training, and strong transferability, these methods have become a mainstream approach in few-shot learning. However, most existing methods construct class prototypes by mean estimation, which is easily affected by background clutter, imaging noise, and outliers in remote sensing images, leading to prototype deviation. Moreover, all dimensions of the feature space are usually weighted equally, which makes it difficult to characterize their different contributions to classification. Consequently, when the feature distributions of different classes overlap heavily, the discriminative ability of the model is constrained. To address these problems, a spatial metric prototypical network for few-shot military target recognition is proposed. First, a feature extractor is employed to map samples into the embedding space, and feature translation and normalization are introduced to enhance the robustness and stability of feature representations. Subsequently, a prototype enhancement module is designed to adaptively optimize class prototypes within a low-rank subspace. By enhancing the principal discriminative directions and suppressing redundant noise information, the proposed network alleviates the entanglement of low-dimensional discriminative features. Finally, a metric function is constructed by incorporating spatial projection error to achieve fine-grained recognition of query samples.
Experimental results on the Ship, MAR20, and NWPU-RESISC45 datasets show that the proposed method improves the average recognition accuracy by 24.49%, 2.07%, and 4.03%, respectively, under the 5-way 1-shot setting, and by 26.98%, 8.92%, and 5.43%, respectively, under the 5-way 5-shot setting. These results demonstrate the effectiveness and generalization capability of the proposed network in complex remote sensing scenarios.
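
The general idea of scoring a query by its projection error onto a class subspace, rather than by its distance to a mean prototype alone, can be illustrated with a small NumPy sketch. This is not the proposed network: the abstract does not give the formulation, so the function names (`normalize`, `class_subspace`, `projection_error`, `classify`), the use of a plain SVD in place of the learned prototype enhancement module, and the fixed `rank` parameter are all assumptions made for illustration.

```python
import numpy as np


def normalize(features, shift):
    """Translate each feature vector by `shift`, then scale it to unit L2 norm."""
    centered = features - shift
    norms = np.linalg.norm(centered, axis=1, keepdims=True)
    return centered / np.maximum(norms, 1e-12)  # guard against division by zero


def class_subspace(support, rank):
    """Return the mean prototype and an orthonormal basis (dim x rank)
    spanning the top `rank` principal directions of one class's support set."""
    prototype = support.mean(axis=0)
    # Rows of vt are the principal directions of the centered support features.
    _, _, vt = np.linalg.svd(support - prototype, full_matrices=False)
    return prototype, vt[:rank].T


def projection_error(query, prototype, basis):
    """Squared norm of the part of (query - prototype) lying outside the subspace."""
    residual = query - prototype
    projected = basis @ (basis.T @ residual)
    return float(np.sum((residual - projected) ** 2))


def classify(query, subspaces):
    """Assign the query to the class whose subspace yields the smallest error."""
    errors = [projection_error(query, p, b) for p, b in subspaces]
    return int(np.argmin(errors))
```

With k support samples per class, the centered support matrix has rank at most k - 1, so in this sketch `rank` should not exceed k - 1. In the 1-shot case the centered matrix is zero and the basis returned by the SVD carries no information, so a practical method needs some other way to define the subspace; how the proposed network handles this is not stated in the abstract.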