Abstract: In industrial production, prolonged high-intensity work can lead to worker fatigue, which increases the risk of safety incidents. Existing research has shown that physiological features measured with contact-based instruments can effectively represent fatigue status, but using such instruments to monitor fatigue in industrial environments interferes with operations. Fatigue detection based on surveillance video is therefore a more practical choice. Current video-based methods focus mainly on mouth and eye features and do not comprehensively reflect fatigue status. To address this issue, we propose a non-intrusive fatigue detection method that integrates facial appearance and physiological representations, using a video-based dual-branch network to monitor worker fatigue. First, we locate and segment the facial regions of interest in the video. By extracting the changes in skin reflectance caused by variations in capillary blood volume, we construct a physiological spatiotemporal map. Next, we build a dual-branch 3D convolutional network that extracts facial appearance and physiological feature representations separately. Finally, we fuse these features and pass them through a fully connected layer to produce the final fatigue detection result. The proposed method is validated on a fatigue dataset collected during simulated industrial production tasks. Experimental results show that fatigue detection based on the integration of facial appearance and physiological features from video reaches an accuracy of 88%, offering higher accuracy and stronger applicability in industrial settings than existing technologies.
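The construction of the physiological spatiotemporal map can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `build_st_map`, the `grid` parameter, the use of raw RGB channels, and the per-row min-max normalization are all assumptions made for illustration, since the abstract does not specify them. The sketch assumes the facial region of interest has already been cropped from each frame, splits it into blocks, averages each color channel per block and per frame, and stacks the resulting time series as rows of the map.

```python
from statistics import fmean


def build_st_map(frames, grid=(2, 2)):
    """Build a toy spatiotemporal map from cropped face-ROI frames.

    frames: list of T frames; each frame is a list of H rows, each row a
            list of W pixels, each pixel an (r, g, b) tuple.
    grid:   (rows, cols) of blocks the ROI is split into (hypothetical
            choice; H and W are assumed to be at least rows and cols).
    Returns a list of grid_rows * grid_cols * 3 rows, each a length-T
    series scaled to [0, 1].
    """
    n_rows, n_cols = grid
    height, width = len(frames[0]), len(frames[0][0])
    st_map = []
    for bi in range(n_rows):
        for bj in range(n_cols):
            # Pixel bounds of the current block.
            y0, y1 = bi * height // n_rows, (bi + 1) * height // n_rows
            x0, x1 = bj * width // n_cols, (bj + 1) * width // n_cols
            for ch in range(3):
                # Mean intensity of one channel in this block, per frame.
                # Small frame-to-frame changes in this mean are what carry
                # the blood-volume-related reflectance signal.
                series = [
                    fmean(
                        frame[y][x][ch]
                        for y in range(y0, y1)
                        for x in range(x0, x1)
                    )
                    for frame in frames
                ]
                # Min-max normalize over time (assumed normalization);
                # a constant series maps to all zeros.
                lo, hi = min(series), max(series)
                span = hi - lo
                st_map.append(
                    [(v - lo) / span if span > 0 else 0.0 for v in series]
                )
    return st_map
```

In a full pipeline, the resulting map would be the input to the physiological branch of the dual-branch network, while the cropped face frames would feed the appearance branch. Spatial averaging over blocks is used here because it suppresses per-pixel sensor noise, which is typically much larger than the reflectance change of interest.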