Implementing monocular-inertial SLAM on the Nvidia Jetson TX2 embedded platform
Publisher
North-West University
Abstract
SLAM is a complex computational problem where a robot or vehicle concurrently creates a representation of its environment while determining its position within that environment. State-of-the-art SLAM algorithms can achieve high accuracy at high speeds. However, these algorithms are usually developed, tested, and optimised on high-end desktop computers, which are not used in the design of the robots or vehicles that implement SLAM. These robots or vehicles use embedded platforms with limited processing power, which can cause severe degradation in the execution speed of the SLAM algorithm. SLAM algorithms therefore need to be tested on the embedded platforms on which they will actually run.
The study aims to create a prediction model that can estimate the performance that a SLAM algorithm can achieve on an embedded platform. ORB-SLAM3 is a state-of-the-art visual SLAM algorithm that supports multiple camera settings and configurations. ORB-SLAM3 is implemented on the Nvidia Jetson TX2, Raspberry Pi 3B+, and Raspberry Pi 4B embedded platforms, where its performance is measured. The EuRoC MAV dataset is used as it provides camera and inertial sensor data with accurate ground-truth measurements. ORB-SLAM3, built with the default version of OpenCV 3.2.0, was profiled, and OpenCV's FAST function was identified as the bottleneck when executing on the Nvidia Jetson TX2. Investigating the OpenCV 3.2.0 source code showed that the FAST function does not use SIMD instructions on ARM architectures, meaning the hardware resources of the Nvidia Jetson TX2 are not fully utilised.
ORB-SLAM3 was containerised and executed on the three embedded platforms, and the average tracking time was measured. PassMark was used to benchmark the embedded platforms in order to characterise their CPUs. Because three embedded platforms constitute a small sample, artificial CPUs were generated by varying the CPU frequencies and core counts. The results from the benchmarking and the execution of ORB-SLAM3 were used to create a dataset to train and test prediction models. The prediction model's target is the average tracking time that ORB-SLAM3 can achieve on embedded platforms. The inputs to the model were selected using cross-correlation and guided by the profiling results of ORB-SLAM3. The inputs used to create the prediction model are the CPU frequency, PassMark Single CPU score, PassMark NEON CPU score and PassMark Integer score.
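The correlation-based input selection described above can be illustrated with a minimal sketch. It assumes that a plain Pearson correlation against the target is an acceptable stand-in for the cross-correlation analysis; the abstract does not specify the exact procedure. The function names, feature names, and all numeric values are hypothetical placeholders, not measurements from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rank_features(features, target):
    """Return (name, r) pairs sorted by descending |r| against the target."""
    scores = [(name, pearson(values, target)) for name, values in features.items()]
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)


# Toy placeholder values for illustration only (NOT data from the study).
toy_features = {
    "cpu_frequency_mhz": [600, 900, 1200, 1500, 2000],
    "core_count": [4, 2, 4, 1, 2],
}
toy_tracking_time_ms = [120.0, 85.0, 60.0, 52.0, 38.0]

ranking = rank_features(toy_features, toy_tracking_time_ms)
for name, r in ranking:
    print(f"{name}: r = {r:+.3f}")
```

Candidate inputs with a large absolute correlation to the average tracking time would be kept as model inputs; in the study this choice was additionally guided by the profiling results.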
Three modelling techniques are used to create prediction models: a simple linear regression, an Extra Trees Regressor, and a Multi-Layer Perceptron model. Two experiments are conducted for verification and validation. The first experiment, which serves as verification, creates the models using the entire dataset of 429 unique entries, with a 75:25 ratio for training and testing. The second experiment, which serves as validation, removes one CPU from the dataset to see how the prediction models respond to an unseen CPU. The performance criteria that the models should achieve are an MAE and RMSE % of less than 10 % and an R² of more than 0.9.
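The evaluation criteria and the held-out-CPU split can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not define how the error percentages are normalised, so the sketch assumes both MAE and RMSE are expressed as a percentage of the mean observed tracking time. The `cpu` key, the function names, and the toy values are hypothetical.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def as_percent_of_mean(error, y_true):
    # ASSUMPTION: errors are normalised by the mean observed value.
    return 100.0 * error / (sum(y_true) / len(y_true))


def meets_criteria(y_true, y_pred):
    """Check MAE % < 10, RMSE % < 10 and R^2 > 0.9."""
    mae_pct = as_percent_of_mean(mae(y_true, y_pred), y_true)
    rmse_pct = as_percent_of_mean(rmse(y_true, y_pred), y_true)
    return mae_pct < 10.0 and rmse_pct < 10.0 and r2(y_true, y_pred) > 0.9


def hold_out_cpu(rows, cpu_name):
    """Split rows so that every entry for one CPU becomes the unseen test set."""
    train = [row for row in rows if row["cpu"] != cpu_name]
    test = [row for row in rows if row["cpu"] == cpu_name]
    return train, test


# Toy placeholder values for illustration only (NOT results from the study).
toy_true = [100.0, 80.0, 60.0, 40.0]
toy_pred = [98.0, 83.0, 58.0, 41.0]
print(meets_criteria(toy_true, toy_pred))
```

The first experiment would apply `meets_criteria` to the 25 % test portion of a random split, while the second would apply it to the test set returned by `hold_out_cpu`.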
Description
Master of Engineering in Computer and Electronic Engineering, North-West University, Potchefstroom Campus
