Visual navigation of mobile robots based on LSTM and PPO algorithms
Abstract:
To improve the map-free visual navigation ability of mobile robots and raise the navigation success rate, a visual navigation model is proposed that combines a long short-term memory (LSTM) network with the proximal policy optimization (PPO) algorithm. First, the model uses LSTM and PPO together as the network model for visual navigation. Second, a reward function is designed from factors such as the robot's actions, its distance to the target, and its running time, and serves as the training objective. Finally, the RGB-D image captured from the robot's first-person view and the polar coordinates of the target in the robot's coordinate system are used as the model input, and the robot's continuous action values are used as the model output, realizing end-to-end visual navigation without a map; the model can also reach new targets it was not trained on by inference. Compared with previous algorithms, the model converges faster in simulated environments, and its navigation success rate is on average 17.7% higher for old targets and 23.3% higher for new targets, indicating better navigation performance.
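The abstract names two ingredients that are easy to illustrate: the target expressed in polar coordinates relative to the robot, and a reward built from the robot's action, its distance to the target, and elapsed time. The following is only a minimal sketch of what such components might look like; it is not the authors' formulation. The function names, the weights (`w_progress`, `w_turn`, `time_penalty`), and the `goal_bonus` term are all hypothetical choices made for illustration.

```python
import math


def target_polar(robot_x, robot_y, robot_yaw, goal_x, goal_y):
    """Return (distance, bearing) of the goal in the robot's frame.

    Hypothetical helper: the abstract only states that polar coordinates
    of the target are part of the model input.
    """
    dx, dy = goal_x - robot_x, goal_y - robot_y
    distance = math.hypot(dx, dy)
    # Bearing relative to the robot heading, wrapped into [-pi, pi).
    bearing = math.atan2(dy, dx) - robot_yaw
    bearing = (bearing + math.pi) % (2.0 * math.pi) - math.pi
    return distance, bearing


def step_reward(prev_dist, dist, angular_vel, reached,
                w_progress=1.0, w_turn=0.05, time_penalty=0.01,
                goal_bonus=10.0):
    """Per-step reward from distance, action, and time (assumed weights).

    - distance term: reward progress toward the target
    - action term:   penalize large turning commands
    - time term:     constant cost per step, so shorter episodes score higher
    """
    if reached:
        return goal_bonus  # assumed terminal bonus, not from the source
    progress = prev_dist - dist
    return w_progress * progress - w_turn * abs(angular_vel) - time_penalty


if __name__ == "__main__":
    d, b = target_polar(0.0, 0.0, 0.0, 3.0, 4.0)
    print(d, b)
    print(step_reward(prev_dist=2.0, dist=1.5, angular_vel=0.0, reached=False))
```

In a setup like this, the `(distance, bearing)` pair would be concatenated with features extracted from the RGB-D image before being passed to the recurrent policy, and `step_reward` would be evaluated once per environment step during PPO training.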