Jacinto is a trademark of Texas Instruments Incorporated.
All trademarks are the property of their respective owners.
Whether the task is simple, such as lane assist or blind spot detection, or more complex, such as autonomous navigation, understanding the surroundings of a vehicle or robot is vital for success, and therefore for safety. Vehicles and robots perceive their environment by converting data captured by sensors such as RADARs, LiDARs, and cameras into a format that can be consumed by the vehicle’s decision-making engine. Light Detection and Ranging (LiDAR)-based maps tend to be the most accurate; however, they are typically cost prohibitive for most vehicles or robots. Therefore, RADAR and camera-based solutions tend to be more widely used.
The Structure From Motion (SFM) algorithm is one of the more widely used algorithms for camera-based mapping. By itself, the SFM algorithm outputs a point cloud (a set of points extracted from surrounding objects), which can then be consumed by a mapping algorithm. The application described in this article feeds the point cloud to an Occupancy Grid (OG) mapping algorithm to generate a map of the surroundings.
In automotive and robotics applications, the steps of receiving sensor data, converting the data to a usable format, and prescribing actions based on the perceived environment are typically performed on an embedded platform. The Jacinto 7 TDA4x family of high-performance SoCs by Texas Instruments is designed from the ground up to address the varied algorithmic needs of the automotive, industrial, and robotics markets. The Structure From Motion (SFM) algorithm is one of the algorithms around which the device was designed. As a result, the key computational blocks of the algorithm map seamlessly to either hardware accelerators or general-purpose processing cores on the TDA4VM device. This article describes the SFM-based OG mapping algorithm, the TDA4VM device, and how the algorithm maps to the device to enable a high-fidelity, real-time map of the environment. It then shows some example implementations on the device with their corresponding outputs.
In Computer Vision, the position of an object with respect to a vehicle can be determined using images from two cameras, mounted at known, distinct locations, that both view the object in question. In particular, key points on the object are extracted from both images and matched, and the locations of those points are then computed using a process known as Triangulation. The process of determining the position of a point in space using two cameras is known in the Computer Vision community as Stereo Vision, or Stereo Depth Estimation, and the set of points generated from all the correspondences between the two images is referred to as a point cloud. Although Stereo Vision is widely used by the automotive and robotics communities, it comes at a high system cost in terms of both dollars and image processing requirements, because it requires two high-precision cameras capturing images at a relatively high frame rate.
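To make the triangulation step concrete, the following is a minimal sketch for a single matched key point. It assumes an idealized, rectified stereo pair (two identical pinhole cameras separated only by a horizontal baseline), which is a simplification not stated in this article. The function name `triangulate_rectified`, its parameters, and all numeric values are hypothetical and chosen only for illustration.

```python
def triangulate_rectified(u_left, u_right, v, focal_px, baseline_m, cx, cy):
    """Return (X, Y, Z) in the left-camera frame for one matched key point.

    Assumes rectified pinhole cameras: the match lies on the same image row
    in both views, and depth follows from the disparity Z = f * B / d.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("non-positive disparity: bad match or point at infinity")
    z = focal_px * baseline_m / disparity   # depth along the optical axis
    x = (u_left - cx) * z / focal_px        # lateral offset
    y = (v - cy) * z / focal_px             # vertical offset
    return x, y, z


# Hypothetical camera parameters and one hypothetical key-point match.
stereo_point = triangulate_rectified(
    u_left=700.0, u_right=660.0, v=400.0,
    focal_px=800.0, baseline_m=0.2, cx=640.0, cy=360.0,
)
```

Repeating this computation for every matched key point yields the point cloud. Because depth is inversely proportional to disparity, distant points with small disparities are the most sensitive to matching errors.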
In contrast, Structure From Motion, or SFM, is an algorithm that can generate a point cloud from a single camera in motion. As the name implies, SFM uses one camera that, because of its motion, occupies two distinct locations at two consecutive time instances. This is effectively the same as placing two cameras at distinct locations, provided that the objects in the frame have not moved between the two time instances and that the relative motion of the camera is known. Thus, the same theory used in Stereo Vision can be applied to generate a point cloud from just one camera.
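The idea of treating one moving camera as two cameras can be sketched as follows. This is not the implementation used on the device; it is an illustrative midpoint triangulation, a simple stand-in for whatever triangulation method a production SFM pipeline uses. It assumes a pinhole camera, a static scene, and a camera pose at each time instance that is known from some external source such as odometry. All function names, intrinsics, poses, and pixel coordinates are hypothetical.

```python
def pixel_to_ray(u, v, focal_px, cx, cy):
    """Back-project a pixel to a viewing-ray direction in the camera frame."""
    return ((u - cx) / focal_px, (v - cy) / focal_px, 1.0)


def rotate(rotation, vec):
    """Apply a 3x3 rotation (nested lists, row-major) to a 3-vector."""
    return tuple(sum(r * x for r, x in zip(row, vec)) for row in rotation)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment joining rays c1 + s*d1 and c2 + t*d2."""
    w = tuple(p - q for p, q in zip(c1, c2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel: no usable camera motion")
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = tuple(p + s * q for p, q in zip(c1, d1))
    p2 = tuple(p + t * q for p, q in zip(c2, d2))
    return tuple((x + y) / 2.0 for x, y in zip(p1, p2))


# Hypothetical intrinsics and a hypothetical 0.5 m sideways move between frames.
FOCAL_PX, CX, CY = 800.0, 640.0, 360.0
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
cam_t0 = (0.0, 0.0, 0.0)  # camera centre at the first time instance
cam_t1 = (0.5, 0.0, 0.0)  # camera centre at the second time instance

# The same key point matched in the two consecutive frames, expressed as rays
# in the frame of the first camera (no rotation between frames in this example).
ray_t0 = rotate(IDENTITY, pixel_to_ray(800.0, 440.0, FOCAL_PX, CX, CY))
ray_t1 = rotate(IDENTITY, pixel_to_ray(720.0, 440.0, FOCAL_PX, CX, CY))

sfm_point = triangulate_midpoint(cam_t0, ray_t0, cam_t1, ray_t1)
```

With these values the two rays intersect at roughly (1.0, 0.5, 5.0) metres in the frame of the first camera. The parallel-ray check reflects a practical limitation: if the camera has not moved enough between frames, there is no baseline and depth cannot be recovered.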
SFM algorithms come in two primary flavors: traditional Computer Vision based and Deep Learning based. Although both flavors can be executed on TDA4VM, the focus in this document is on the former, an algorithm based on traditional Computer Vision techniques. The point cloud generated by the SFM algorithm must then be used to generate a map of the surroundings, and the application described here uses a 2D OG mapping approach for this task.
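As a rough illustration of how a point cloud can be turned into a 2D occupancy grid, the sketch below bins points into square cells on the ground plane and marks a cell as occupied once it has collected enough points. This hit-counting rule is a deliberately simplified assumption and not necessarily the scheme used in the application described here. The sketch also assumes the points have already been transformed into a ground-aligned frame with x and y on the ground plane and z pointing up. The function name, parameters, and thresholds are hypothetical.

```python
import math


def build_occupancy_grid(points, cell_size_m, grid_dim,
                         min_hits=2, min_height_m=0.1, max_height_m=2.0):
    """Return a grid_dim x grid_dim grid of 0 (free/unknown) and 1 (occupied).

    The grid is centred on the origin of the ground-aligned frame.
    """
    half_extent = grid_dim * cell_size_m / 2.0
    counts = [[0] * grid_dim for _ in range(grid_dim)]
    for x, y, z in points:
        # Skip points on the ground or above the height of interest.
        if not (min_height_m <= z <= max_height_m):
            continue
        col = math.floor((x + half_extent) / cell_size_m)
        row = math.floor((y + half_extent) / cell_size_m)
        if 0 <= row < grid_dim and 0 <= col < grid_dim:
            counts[row][col] += 1
    # A cell is occupied once enough points have landed in it.
    return [[1 if hits >= min_hits else 0 for hits in row] for row in counts]


# Hypothetical point cloud in metres.
cloud = [
    (0.5, 0.5, 0.5), (0.6, 0.4, 1.0),  # two points on the same obstacle
    (-1.5, -1.5, 0.5),                 # isolated point, below min_hits
    (0.5, 0.5, 0.0),                   # ground point, filtered by height
    (10.0, 0.0, 1.0),                  # outside the grid
]
og_map = build_occupancy_grid(cloud, cell_size_m=1.0, grid_dim=4)
```

Requiring more than one hit per cell is a simple way to suppress isolated outliers in the point cloud, at the cost of missing sparsely sampled obstacles.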