Abstract: A lightweight action recognition network with fused attention is proposed to address three problems of traditional 3D convolutional neural networks: a large number of parameters, information redundancy, and insufficient extraction of temporal information. First, to reduce the number of network parameters and fuse short-, medium-, and long-term temporal information, an efficient residual block is developed to replace two cascaded 3×3×3 convolutions. Second, a temporal attention mechanism is derived by extending the channel attention mechanism, and both mechanisms are integrated into the proposed network to suppress the influence of redundant information on the recognition results. Finally, experiments are conducted on the UCF101 dataset to verify the effectiveness of the network. The results show that the proposed network has a computational cost of 8.9 GFLOPs, 18.0 M parameters, and a recognition accuracy of 94.8%, achieving high recognition accuracy at low computational cost compared with other action recognition networks.
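
To make the parameter-count motivation concrete, the following minimal sketch counts the learnable weights in two cascaded 3×3×3 convolutions and compares them with a generic bottleneck layout. The function names, the channel widths, and the bottleneck design (1×1×1 reduce, 3×3×3, 1×1×1 expand, with a reduction factor of 4) are assumptions chosen for illustration. They are not the efficient residual block proposed here, whose internal structure is not specified in the abstract.

```python
def conv3d_params(c_in, c_out, kernel=(3, 3, 3), bias=False):
    """Number of learnable weights in one 3D convolution layer."""
    kt, kh, kw = kernel
    n = c_in * c_out * kt * kh * kw
    return n + (c_out if bias else 0)


def cascaded_3x3x3_params(channels):
    """Two cascaded 3x3x3 convolutions with equal input/output width."""
    return 2 * conv3d_params(channels, channels)


def bottleneck_params(channels, reduction=4):
    """Hypothetical bottleneck: 1x1x1 reduce, 3x3x3, 1x1x1 expand.

    Illustrative assumption only; not the block proposed in the paper.
    """
    mid = channels // reduction
    return (conv3d_params(channels, mid, (1, 1, 1))
            + conv3d_params(mid, mid, (3, 3, 3))
            + conv3d_params(mid, channels, (1, 1, 1)))


if __name__ == "__main__":
    # Channel widths below are arbitrary examples, not values from the paper.
    for c in (64, 128, 256):
        base = cascaded_3x3x3_params(c)
        slim = bottleneck_params(c)
        print(f"C={c}: cascaded={base}, bottleneck={slim}, "
              f"ratio={base / slim:.1f}")
```

The cost of the cascaded pair grows as 54·C² for C channels, which is why replacing it is a natural target when lightening a 3D network.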