Abstract: To address the insufficient detection accuracy of human keypoints, an improved network based on KAPAO (keypoints and poses as objects) is proposed. The generalization of the network is improved with the PoseTrans (pose transformation) data augmentation method. To address the weak feature fusion capability, a BiFPN (bi-directional feature pyramid network) module is designed to fully fuse features at different semantic levels, improving the integration of deep and shallow semantic information. An adaptive dilated convolution module is designed to adaptively fuse output branches with different dilation rates at the network output stage, effectively capturing the global information of the image. To retain the optimal keypoint prediction boxes, traditional NMS is replaced by SDR-NMS (soft DIoU relocation non-maximum suppression) in the post-processing stage. Experimental results show that the AP increases by 4.8% to 68.6%, with a detection time of 19.1 ms, indicating that the improved network performs well in both accuracy and detection speed.
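
The abstract does not define SDR-NMS in detail, so the following is only a minimal sketch of the general idea its name suggests: Gaussian Soft-NMS in which the overlap measure is DIoU rather than plain IoU. The relocation step is not described here and is omitted. The function names `diou` and `soft_diou_nms`, the `sigma` value, and the score threshold are illustrative assumptions, not the authors' implementation.

```python
import math


def diou(a, b):
    """Distance-IoU of two boxes given as (x1, y1, x2, y2)."""
    # Intersection and union areas
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union if union > 0 else 0.0
    # Squared distance between the two box centers
    d2 = ((a[0] + a[2]) - (b[0] + b[2])) ** 2 / 4 + ((a[1] + a[3]) - (b[1] + b[3])) ** 2 / 4
    # Squared diagonal of the smallest box enclosing both
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou - d2 / c2 if c2 > 0 else iou


def soft_diou_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS using DIoU as the overlap measure (assumed variant).

    Returns a list of (box index, decayed score) in selection order.
    """
    remaining = [[i, s] for i, s in enumerate(scores)]
    kept = []
    while remaining:
        # Select the highest-scoring remaining box
        best = max(remaining, key=lambda t: t[1])
        remaining.remove(best)
        if best[1] < score_thresh:
            break
        kept.append((best[0], best[1]))
        # Decay neighbors instead of discarding them outright;
        # negative DIoU (distant boxes) is clamped to zero, i.e. no decay
        for item in remaining:
            overlap = max(0.0, diou(boxes[best[0]], boxes[item[0]]))
            item[1] *= math.exp(-(overlap ** 2) / sigma)
        remaining = [t for t in remaining if t[1] >= score_thresh]
    return kept


# Hypothetical example: two heavily overlapping boxes and one distant box
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
for index, score in soft_diou_nms(boxes, scores):
    print(index, round(score, 3))
```

In this sketch an overlapping box is kept with a reduced score rather than removed, which is the property that distinguishes soft suppression from traditional NMS; the center-distance term in DIoU additionally weakens the penalty for boxes whose centers are far apart.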