Abstract: Accurate detection of defects in wafer images is essential for the timely identification of faults in wafer production. In the wafer testing phase, deep learning has been widely adopted for wafer defect detection because of its strong feature extraction capability. However, conventional deep learning models typically rely on large amounts of well-labeled, high-quality training data, whereas in practice balanced and sufficient labeled data are often difficult to obtain. To address this issue, we propose a VGGNet model that integrates an improved multi-head attention mechanism with a residual structure (RS), aiming to extract more comprehensive features from imbalanced datasets and thereby detect wafer surface defects accurately. Specifically, the improved multi-head attention mechanism maps the input wafer image features into multiple subspaces, which improves the expressiveness and generalization performance of the model. In addition, residual connections are introduced into the fully connected layers of the traditional VGGNet, which alleviates the vanishing gradient problem in deep network training. To validate the proposed model, extensive experiments were conducted on the real-world WM811K dataset, where it achieves a classification accuracy of 94.3%, which is 3% higher than the traditional VGGNet and, on average, 1% higher than existing similar models. These results show that the proposed method improves the robustness of wafer defect detection and outperforms existing algorithms on this imbalanced dataset.
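
To illustrate the residual connection around a fully connected layer mentioned in the abstract, here is a minimal sketch in plain Python. The abstract does not give the exact layer layout, so this assumes an identity shortcut around a single square fully connected layer with ReLU, y = x + ReLU(Wx + b). The names `residual_fc`, `linear`, and `init_layer`, as well as the toy feature vector, are hypothetical and not taken from the paper.

```python
import random


def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]


def linear(x, weight, bias):
    """Fully connected layer: weight is a list of rows, one per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def residual_fc(x, weight, bias):
    """Assumed residual form: y = x + ReLU(Wx + b).

    The weight matrix must be square so the shortcut and the
    transformed branch have the same length.
    """
    h = relu(linear(x, weight, bias))
    return [xi + hi for xi, hi in zip(x, h)]


def init_layer(dim, rng, scale=0.1):
    """Small random square weight matrix and zero bias (illustrative only)."""
    weight = [[rng.uniform(-scale, scale) for _ in range(dim)]
              for _ in range(dim)]
    bias = [0.0] * dim
    return weight, bias


rng = random.Random(0)
features = [0.5, -1.0, 2.0, 0.0]  # toy stand-in for flattened VGG features

weight, bias = init_layer(len(features), rng)
out = residual_fc(features, weight, bias)

# With all-zero weights the branch contributes nothing, so the block
# reduces to the identity mapping on its input.
zero_weight = [[0.0] * 4 for _ in range(4)]
identity_out = residual_fc(features, zero_weight, [0.0] * 4)
```

Because the shortcut passes the input through unchanged, the gradient of the output with respect to the input always contains an identity term, which is the usual argument for why such connections ease the vanishing gradient problem.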