Abstract: To address the problem that reinforcement learning algorithms detect moving obstacles poorly in dynamic environments, which degrades the optimal obstacle avoidance strategy, a state prediction error-intrinsic curiosity module (SPE-ICM) that uses state prediction error as intrinsic motivation is proposed to improve the agent's policy function's ability to explore the environment. First, an internal reward mechanism is introduced to provide the agent with a multi-reward structure. Second, by optimizing the combined internal and external reward structure, the agent's perception of environmental information is improved, the method for collecting and detecting moving obstacles is improved at the data-structure level, and the optimal obstacle avoidance policy function is refined on the basis of the new detection method. Finally, the network model is combined with the deep deterministic policy gradient (DDPG) algorithm, and comparative experiments are carried out in a path planning simulation environment built with ROS to verify the feasibility of the proposed algorithm. The results show that the proposed algorithm performs significantly better in both detection ability and decision-making ability.