December 21, 2023

Indoor Navigation using Visual Feature Detection for Robots & Drones


Many robotic systems rely on depth sensors or lasers to avoid collisions while moving. Drones use depth data, stereo images, or lasers for better coordination. The main purpose is to obtain a 3D map of a room and to orient quickly, updating the map whenever something changes. But how simple could such a system be? For this experiment we chose monocular video without any additional information: a GoPro video of a walk through a hardware store.


What systems exist?

Several SLAM and SVO algorithms require additional packages such as ROS and specialized sensors. We wanted a universal open-source system that was easy to use and flexible.

SLAM and SVO algorithms analyze their surroundings, extract key features, and build a map of their environment. They can then locate themselves and update the map in real time. SLAM operates on keyframes for high accuracy, while SVO focuses on direct image alignment for speed.

Speed detection

ORB-SLAM2 strikes a balance, using ORB features for efficient and accurate localization and mapping. It’s widely used in real-time applications such as autonomous robots and drones. It can work with or without ROS and supports monocular, stereo, and RGB-D cameras.
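ORB descriptors are binary strings, so two features are compared by counting the bits in which they differ (the Hamming distance). The sketch below illustrates that idea with a plain brute-force matcher. It is a simplification written for this article and not ORB-SLAM2's own matching code, and the `max_distance` and `ratio` thresholds are hypothetical values chosen for the demo.

```python
# Illustrative only: brute-force matching of binary descriptors by Hamming
# distance. ORB-SLAM2's real matcher is more elaborate; this is not its code.

def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length descriptors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def match_descriptors(query, train, max_distance=50, ratio=0.8):
    """Return (query_index, train_index, distance) for confident matches.

    max_distance and ratio are hypothetical thresholds, not library defaults.
    """
    matches = []
    for qi, q in enumerate(query):
        # Rank every candidate in the other frame by distance to this feature.
        ranked = sorted((hamming(q, t), ti) for ti, t in enumerate(train))
        if not ranked:
            continue
        best_dist, best_idx = ranked[0]
        if best_dist > max_distance:
            continue  # even the closest candidate is too different
        # Ratio test: keep the match only if it clearly beats the runner-up.
        if len(ranked) > 1 and best_dist >= ratio * ranked[1][0]:
            continue
        matches.append((qi, best_idx, best_dist))
    return matches
```

With too few features per frame, a matcher like this finds too few confident pairs between consecutive frames, and tracking cannot start.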

Data Preparation

ORB-SLAM is a fully ROS-based system, and ORB_SLAM3 works on real-time input such as a live stream. ORB-SLAM2's minimum requirements are simpler: a folder with a PNG file for each frame and a file with camera calibration and distortion parameters. Converting a video to a set of images takes only a few lines of Python.
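A minimal sketch of such a conversion is shown here, using OpenCV's `VideoCapture` and `imwrite`. The function names, the `rgb/` subfolder, and the `rgb.txt` index file are assumptions: they follow the TUM-style layout that ORB-SLAM2's monocular example is understood to read (three header lines, then one `timestamp filename` pair per line), so check them against the example you actually run.

```python
from pathlib import Path


def frame_name(index: int) -> str:
    """Zero-padded PNG file name, so frames sort in capture order."""
    return f"{index:06d}.png"


def index_lines(frame_count: int, fps: float):
    """Yield 'timestamp filename' lines for a TUM-style index file."""
    for i in range(frame_count):
        yield f"{i / fps:.6f} rgb/{frame_name(i)}"


def extract_frames(video_path: str, out_dir: str) -> int:
    """Save every frame of a video as PNG and write the index file."""
    import cv2  # imported lazily so the helpers above work without OpenCV

    rgb_dir = Path(out_dir) / "rgb"
    rgb_dir.mkdir(parents=True, exist_ok=True)

    capture = cv2.VideoCapture(video_path)
    if not capture.isOpened():
        raise IOError(f"cannot open video: {video_path}")
    # Fall back to an assumed 30 fps if the container reports no frame rate.
    fps = capture.get(cv2.CAP_PROP_FPS) or 30.0

    count = 0
    while True:
        ok, frame = capture.read()
        if not ok:  # end of video or read error
            break
        cv2.imwrite(str(rgb_dir / frame_name(count)), frame)
        count += 1
    capture.release()

    # Three header lines, assumed to be skipped by the reader of this file.
    header = [
        "# color images",
        "# file: " + Path(video_path).name,
        "# timestamp filename",
    ]
    body = list(index_lines(count, fps))
    (Path(out_dir) / "rgb.txt").write_text("\n".join(header + body) + "\n")
    return count
```

Calling `extract_frames("walk.mp4", "store_sequence")` (both names hypothetical) would fill `store_sequence/rgb/` with numbered PNG files and return the number of frames written.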

The camera calibration and distortion parameters are more interesting, because wrong values can produce incorrect rotation angles in our trajectory. The OpenCV library can determine all of these parameters from an additional calibration video, but our goal was to use only our videos from the store, so we took approximate parameters found on the internet.
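When no calibration footage is available, one rough way to get starting values is the pinhole relation between image width, horizontal field of view, and focal length in pixels. The sketch below is an assumption-laden illustration and not the procedure used for this experiment: it assumes square pixels, a principal point at the image centre, and zero distortion, which is a crude approximation for a wide-angle action camera. The key names follow the format of ORB-SLAM2's example settings files, and the output is only the camera fragment of such a file.

```python
import math


def approximate_intrinsics(width, height, hfov_deg):
    """Pinhole intrinsics from image size and horizontal field of view.

    Assumes square pixels and a principal point at the image centre.
    """
    fx = (width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    return {"fx": fx, "fy": fx, "cx": width / 2.0, "cy": height / 2.0}


def camera_settings(intr, fps, n_features, distortion=(0.0, 0.0, 0.0, 0.0)):
    """Format the camera part of a settings file (a fragment, not a full file).

    The zero distortion default is a placeholder, not a measured value.
    """
    k1, k2, p1, p2 = distortion
    lines = [
        f"Camera.fx: {intr['fx']:.3f}",
        f"Camera.fy: {intr['fy']:.3f}",
        f"Camera.cx: {intr['cx']:.3f}",
        f"Camera.cy: {intr['cy']:.3f}",
        f"Camera.k1: {k1}",
        f"Camera.k2: {k2}",
        f"Camera.p1: {p1}",
        f"Camera.p2: {p2}",
        f"Camera.fps: {float(fps)}",
        f"ORBextractor.nFeatures: {int(n_features)}",
    ]
    return "\n".join(lines) + "\n"
```

For example, `camera_settings(approximate_intrinsics(1920, 1080, 90.0), 50, 10000)` uses a hypothetical 1920x1080 frame with a 90 degree field of view. Real values should come from the camera's documentation or a proper calibration.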

Prototype Assembly

This straightforward solution is portable across various environments, from a Raspberry Pi to a PC. ORB-SLAM2 is a C++ algorithm that relies on specific libraries, which may cause dependency issues, and it works best on Ubuntu 18.04. That doesn’t mean we have to prepare a separate computer for it: a Docker container with that OS gives us such a system in any environment. Because ORB-SLAM2 in its basic setup is provided as a desktop application, we need to configure the X server to see its windows on a display outside the container. To install ORB-SLAM2 directly in your environment you need a GPU, a camera, Linux, compatible hardware, and sufficient processing power. For the Docker option, the requirements are GPU support, an installed Docker, and adequate system resources.



In monocular mode, processing a 35-second video at 50 frames per second (1752 frames in total) on a MacBook with an i5 processor and 2 cores dedicated to Docker takes approximately 7 minutes and 44 seconds (464 seconds). This translates to around 3.8 frames per second, with 10,000 features extracted in each frame. The processing time scales proportionally to the number of features per frame, and that parameter can be changed in the settings. To achieve “real-time” visualization afterwards, a 20x acceleration is applied to the video examples. With fewer than 4,000 features, the system fails to construct a reliable map because it cannot match features across frames. The average CPU load is 200%, with a peak of 250%. Notably, the memory usage for processing 10,000 features per frame for 1752 frames is 850 MiB.
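The throughput follows directly from the frame count and the processing time reported above, and can be cross-checked in a few lines. The `estimated_seconds` helper is hypothetical: it only applies the proportionality stated in the text as a linear extrapolation and is not a measured result.

```python
frames = 1752                      # frames in the test video
video_seconds = 35                 # duration of the video
processing_seconds = 7 * 60 + 44   # 7 min 44 s = 464 s

# Average number of frames processed per second of wall-clock time.
throughput_fps = frames / processing_seconds

# How many times slower than real time the processing ran.
slowdown = processing_seconds / video_seconds


def estimated_seconds(n_features, ref_features=10_000,
                      ref_seconds=processing_seconds):
    """Linear extrapolation, assuming time is proportional to feature count."""
    return ref_seconds * n_features / ref_features


print(f"throughput: {throughput_fps:.2f} frames/s")
print(f"slowdown vs real time: {slowdown:.1f}x")
```

Under the proportionality assumption, halving the feature count would roughly halve the processing time, but the text notes that the map becomes unreliable below about 4,000 features.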

The system builds a point cloud of 27,000 points from the video, providing enough detail to discern specific features on shelves rather than just outlines.

Future of Indoor Navigation

ORB-SLAM2 has the potential to change the way big companies manage their storage spaces. By enabling drones to navigate autonomously and efficiently, it can streamline processes, improve inventory management, and enhance worker safety. For example, drones equipped with ORB-SLAM2 can quickly and accurately scan large warehouses, identifying and tracking inventory items. This information can then be used to optimize storage layout, improve picking and packing efficiency, and reduce errors.

The applications are not limited to business: you could check what your cat destroyed while you were away, find something you forgot to put back, provide a virtual tour of your office or event space, or check whether a shelf is the right size for your room.
