Traditional robot navigation methods either require detailed maps or assume accurate localization is always available, assumptions that break down in indoor or unfamiliar environments. Visual simultaneous localization and mapping (SLAM) systems, often used as a fallback, can easily fail in scenes lacking distinct textures or during abrupt movements, leading to severe navigational errors.
Many earlier learning-based navigation models focused solely on finding collision-free paths, disregarding whether the robot could maintain reliable localization along the way. Even those that do account for localization often rely on rigid penalty thresholds that cannot adapt to changing conditions.
These challenges call for a more flexible, localization-aware navigation strategy if robots are to perform reliably across diverse, real-world scenarios.
A Localization-Aware Deep Learning Approach
A research team from Cardiff University, Hohai University, and Spirent Communications has developed a new deep reinforcement learning (DRL) model that helps robots plan smarter, safer paths in complex indoor environments.
Their study, published in the journal IET Cyber-Systems and Robotics, introduces a localization-aware navigation policy trained with RGB-D camera input and real-time feedback from ORB-SLAM2. Rather than relying on preset thresholds or fixed maps, the robot learns to adapt dynamically to visual conditions in its environment, thereby boosting its ability to maintain localization and significantly improving navigation success rates in simulated tests.
Integrating Localization into Decision-Making
The core innovation lies in integrating localization quality into every navigation decision. The robot is trained using a compact state representation that reflects the spatial distribution of visual map points, partitioned into 24 angular regions around its body. This design helps the robot identify which directions are visually "safer" to travel through, meaning areas more likely to provide reliable localization data.
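The article does not spell out the exact encoding, so the following Python sketch shows one plausible form of such a state: a normalized histogram of ORB-SLAM2 map points over 24 equal angular sectors in the robot's frame. The function name `sector_state` and the normalization step are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

NUM_SECTORS = 24  # angular regions around the robot, per the paper

def sector_state(map_points_xy: np.ndarray) -> np.ndarray:
    """Count visible SLAM map points falling into each of 24 angular
    sectors around the robot (points given in the robot's local frame,
    as an (N, 2) array of x, y coordinates).

    Returns a length-24 vector, normalized so the policy sees a
    scale-invariant distribution of localization features.
    """
    if map_points_xy.size == 0:
        return np.zeros(NUM_SECTORS)
    # Angle of each point relative to the robot's heading, in [0, 2*pi)
    angles = np.arctan2(map_points_xy[:, 1], map_points_xy[:, 0]) % (2 * np.pi)
    # Assign each point to one of 24 equal 15-degree sectors
    sector_ids = (angles / (2 * np.pi / NUM_SECTORS)).astype(int)
    counts = np.bincount(sector_ids, minlength=NUM_SECTORS).astype(float)
    # Normalize to a distribution so absolute point counts don't dominate
    total = counts.sum()
    return counts / total if total > 0 else counts
```

A distribution like this tells the policy which headings are rich in trackable features, which is exactly the "visually safer" signal described above.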
The team also emphasized the importance of using a continuous action space, trained with a Deep Deterministic Policy Gradient (DDPG) algorithm, rather than the discrete actions adopted by many earlier approaches. In addition, the researchers introduced a new reward function based on Relative Pose Error (RPE), offering instant feedback on whether a particular movement action improves or worsens the robot's understanding of its position.
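The precise reward formula is not given in this article, but a hedged sketch can convey the idea: penalize each step by how far the SLAM-estimated motion drifts from the true motion over that step. The function `rpe_reward`, its error weighting, and the `scale` factor below are hypothetical choices for illustration, not the paper's exact definition.

```python
import numpy as np

def rpe_reward(gt_delta: np.ndarray, slam_delta: np.ndarray,
               scale: float = 10.0) -> float:
    """Reward a single action by how little the SLAM-estimated motion
    (slam_delta) diverges from the true motion (gt_delta) over one step.

    Both deltas are [dx, dy, dtheta] relative pose changes. A small
    Relative Pose Error (RPE) yields a reward near zero; a large error
    is penalized proportionally.
    """
    # Translational component of the relative pose error
    trans_err = np.linalg.norm(gt_delta[:2] - slam_delta[:2])
    # Rotational component, wrapped into [-pi, pi) before taking magnitude
    rot_err = abs((gt_delta[2] - slam_delta[2] + np.pi) % (2 * np.pi) - np.pi)
    return -scale * (trans_err + 0.5 * rot_err)
```

Because the penalty is computed per step rather than over the whole trajectory, the agent gets the "instant feedback" the researchers describe: any single action that degrades the SLAM estimate is punished immediately.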
Unlike previous models that used static thresholds, this system employs a dynamic threshold that adjusts in real time, depending on environmental conditions.
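The article does not describe the adaptation rule itself. As a purely hypothetical illustration, a dynamic threshold could track running statistics of a localization-quality signal (such as the number of tracked map points) and flag values that fall well below the recent norm, rather than comparing against a fixed constant:

```python
class DynamicThreshold:
    """Hypothetical adaptive threshold: maintains an exponential moving
    mean and deviation of recent localization quality and flags values
    that drop well below the recent norm.
    """
    def __init__(self, alpha: float = 0.05, k: float = 2.0):
        self.alpha = alpha  # smoothing factor for the running statistics
        self.k = k          # how many deviations below the mean to flag
        self.mean = None
        self.dev = 0.0

    def is_unreliable(self, quality: float) -> bool:
        if self.mean is None:
            self.mean = quality  # initialize on the first observation
            return False
        # Update exponential moving deviation, then the moving mean
        self.dev = (1 - self.alpha) * self.dev + self.alpha * abs(quality - self.mean)
        self.mean = (1 - self.alpha) * self.mean + self.alpha * quality
        # Unreliable when quality falls well below the recent norm
        return quality < self.mean - self.k * self.dev
```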
Performance Evaluation and Results
To evaluate the approach, the team conducted extensive training using the iGibson simulation environment and tested their model against four baseline methods.
In challenging indoor scenarios, the new model outperformed others by a wide margin, achieving a 49% success rate compared to just 33% for conventional SLAM-based navigation. By contrast, a model trained with ground-truth poses reached 61%, while methods based on raw images or ATE-based rewards performed poorly, achieving only 2–5%.
The proposed approach also showed significantly lower localization error and better adaptability in new environments. In unseen test environments, success rates reached 60% and 53%, outperforming the SLAM-based baseline by 16–18 percentage points.
Notably, the model consistently chose longer but safer routes, demonstrating the value of prioritizing localization robustness over shortest-path efficiency.
Implications for Real-World Robotics
"Our aim wasn't just to teach the robot to move, it was to teach it to think about how well it knows where it is," said Dr. Ze Ji, the study's senior author. "Navigation isn't only about avoiding walls; it's about maintaining confidence in your position every step of the way.
By integrating perception and planning, our model enables smarter, safer movement in uncertain spaces. This could pave the way for more autonomous robots that can handle the complexities of the real world without constant human oversight."
Future Directions and Deployment Challenges
The implications of this work span indoor robotics, from service robots in hospitals and homes to warehouse automation systems. In environments where GPS is unavailable and visual conditions vary, the ability to assess and respond to localization reliability is crucial.
This method equips robots with the awareness to adjust their strategies based on how well they can see and understand their surroundings. Looking forward, the team plans to test their model on real robots and in dynamic scenes with pedestrians.
The authors also caution that simulation differs from real-world conditions in terms of texture diversity, lighting, and dynamic elements, and that sim-to-real transfer remains a significant challenge.
Addressing these gaps with domain adaptation techniques will be an important step toward deployment. With further development, the approach could become a key building block for trustworthy, mapless navigation in real-world human environments.
Journal reference:
- Gao, Y., Wu, J., Wei, C., Grech, R., & Ji, Z. (2024). Deep Reinforcement Learning for Localisability-Aware Mapless Navigation. IET Cyber-Systems and Robotics, 7(1), e70018. DOI: 10.1049/csy2.70018. https://ietresearch.onlinelibrary.wiley.com/doi/10.1049/csy2.70018