Abstract: Existing single-stage deep models for traffic accident detection often suffer from high false alarm rates and computational redundancy in highway scenarios, severely limiting their practical deployment. To address these issues, this paper proposes a two-stage traffic accident detection method tailored for highways, following a "stationary vehicle filtering + appearance-based recognition" strategy. In the first stage, YOLO11 and BoT-SORT are integrated to detect and track vehicles, and inter-frame speed analysis is used to identify stationary vehicles as potential accident candidates. In the second stage, an improved model named YOLO-EA performs appearance-based detection exclusively on the stationary candidates, combined with a multi-frame voting mechanism to enhance stability and robustness. Built upon the YOLO11 architecture, YOLO-EA incorporates an EAS-Stem module and an AWD-Conv module: the former enhances edge and contour extraction at the input stage, while the latter improves downsampling efficiency by retaining critical features and reducing computational cost. Experimental results show that YOLO-EA improves Precision, mAP@0.5, and mAP@0.5:0.95 by 10.9%, 3.4%, and 2.8%, respectively, while reducing the parameter count by 21%. On the constructed accident video dataset, the proposed method achieves an accident recognition rate of 81.25%, with a 24.46% reduction in false alarm rate compared to single-stage detection strategies. The method achieves a favorable balance between accuracy and inference efficiency, demonstrating strong potential for real-world deployment.
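The first-stage filtering and the multi-frame voting described above can be sketched as follows. This is a minimal illustration of the decision logic only; the track format, pixel-speed threshold, and vote count are assumed values for demonstration, not parameters from the paper.

```python
def is_stationary(centroids, speed_thresh=2.0):
    """Flag a tracked vehicle as stationary when its mean inter-frame
    centroid displacement (pixels/frame) stays below speed_thresh.
    Threshold is an illustrative assumption, not the paper's value."""
    if len(centroids) < 2:
        return False
    speeds = [
        ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
        for (x1, y1), (x2, y2) in zip(centroids, centroids[1:])
    ]
    return sum(speeds) / len(speeds) < speed_thresh

def multi_frame_vote(frame_decisions, min_votes=3):
    """Confirm an accident only when enough per-frame appearance-based
    detections (stage two) agree within the voting window."""
    return sum(frame_decisions) >= min_votes

# A vehicle whose centroid barely moves across five frames becomes a
# stationary candidate; per-frame detections are then aggregated by voting.
track = [(100.0, 200.0), (100.5, 200.2), (101.0, 200.1),
         (100.8, 200.3), (101.1, 200.0)]
print(is_stationary(track))                # True: stationary candidate
print(multi_frame_vote([1, 1, 0, 1, 1]))   # True: accident confirmed
```

In a real pipeline the centroids would come from the tracker's per-frame bounding boxes and the vote inputs from the second-stage detector's outputs; the filtering step means the appearance model runs only on the small set of stationary candidates rather than every vehicle in every frame.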