Date: 2020-01-24

An AI model developed at MIT and the Qatar Computing Research Institute that uses only satellite imagery to automatically tag road features in digital maps could improve GPS navigation, especially in countries with limited map data. Credit: Google Maps / MIT News

A model invented by researchers at MIT and the Qatar Computing Research Institute (QCRI), which uses satellite imagery to tag road features in digital maps, could help improve GPS navigation.

Showing drivers more details about their routes can often help them navigate unfamiliar locations. Lane counts, for example, can allow a GPS system to warn drivers of diverging or merging lanes. Incorporating parking information can help drivers plan ahead, and mapping bike lanes can help cyclists negotiate busy city streets. Providing up-to-date information on road conditions can also improve disaster planning.

But creating detailed maps is a costly and time-consuming process done mainly by large companies, such as Google, which sends vehicles around with cameras strapped to their hoods to capture video and images of an area's streets. Combining that footage with other data can produce accurate, up-to-date maps. Because this process is expensive, however, some parts of the world are ignored.

One solution is to unleash machine-learning models on satellite images, which are easier to obtain and are updated fairly regularly, to automatically tag road features. But roads can be occluded by, for example, trees and buildings, which makes the task difficult. In a paper presented at the Association for the Advancement of Artificial Intelligence conference, the MIT and QCRI researchers describe “RoadTagger,” which uses a combination of neural network architectures to automatically predict the number of lanes and the road types behind obstructions.

In testing RoadTagger on occluded roads from digital maps of 20 U.S. cities, the model counted lane numbers with 77 percent accuracy and inferred road types with 93 percent accuracy. The researchers are also planning to enable RoadTagger to predict other features, such as parking spots and bike lanes.

“Most up-to-date digital maps come from places that big companies care about the most. If you’re in places they don’t care about much, you are at a disadvantage in terms of map quality,” says co-author Sam Madden, a professor in the Department of Electrical Engineering and Computer Science (EECS) and a researcher in the Computer Science and Artificial Intelligence Laboratory (CSAIL). “Our goal is to automate the process of producing high-quality digital maps so that they are available in any country.”

The paper’s co-authors are CSAIL graduate students Songtao He, Favyen Bastani and Edward Park; EECS undergraduate student Satvat Jagwani; CSAIL professors Mohammad Alizadeh and Hari Balakrishnan; and QCRI researchers Sanjay Chawla, Sofiane Abbar and Mohammad Amin Sadeghi.

Combining CNN and GNN

Qatar, where QCRI is based, is not “a priority for large digital map makers,” says Madden. Yet it is constantly building new roads and improving old ones, especially as it prepares to host the 2022 FIFA World Cup.

“During our visit to Qatar, we had experiences where our driver couldn’t figure out how to get where he was going because the map was so far off,” says Madden. “If navigation apps don’t have the right information for things like lane merging, this could be frustrating or worse.”

RoadTagger is based on a novel combination of a convolutional neural network (CNN), commonly used for image-processing tasks, and a graph neural network (GNN). GNNs model relationships between connected nodes in a graph and have become popular for analyzing things like social networks and molecular dynamics. The model is “end-to-end,” meaning that it is fed only raw data and automatically produces output, without human intervention.

The CNN takes raw satellite images of the target roads as input. The GNN breaks the road into segments of about 20 meters, or “tiles.” Each tile is a separate graph node, and the nodes are connected by lines along the road. For each node, the CNN extracts road features and shares this information with its immediate neighbors. Road information propagates along the whole graph, with each node receiving some information about the road attributes in every other node. If a particular tile is occluded in an image, RoadTagger uses information from all the tiles along the road to predict what is behind the occlusion.
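The tile graph and the neighbor-to-neighbor sharing described above can be sketched in a few lines of Python. This is only an illustration, not RoadTagger's implementation: the function names (`split_into_tiles`, `chain_edges`, `propagate`), the fixed `self_weight`, and the simple averaging step are all assumptions made for the example, whereas the real model learns how to combine features.

```python
import math

TILE_LENGTH_M = 20.0  # approximate tile length mentioned in the article


def split_into_tiles(road_length_m, tile_length_m=TILE_LENGTH_M):
    """Return (start, end) offsets in meters for each tile along a road."""
    n_tiles = max(1, math.ceil(road_length_m / tile_length_m))
    step = road_length_m / n_tiles
    return [(i * step, (i + 1) * step) for i in range(n_tiles)]


def chain_edges(n_tiles):
    """Connect consecutive tiles; neighbours[i] lists the nodes adjacent to i."""
    neighbours = {i: [] for i in range(n_tiles)}
    for i in range(n_tiles - 1):
        neighbours[i].append(i + 1)
        neighbours[i + 1].append(i)
    return neighbours


def propagate(features, neighbours, rounds=1, self_weight=0.5):
    """Blend each node's feature vector with the mean of its neighbours'.

    A hand-set averaging rule standing in for learned GNN layers.
    """
    current = [list(f) for f in features]
    for _ in range(rounds):
        updated = []
        for i, own in enumerate(current):
            adj = neighbours[i]
            if not adj:
                updated.append(list(own))
                continue
            mean = [sum(current[j][k] for j in adj) / len(adj)
                    for k in range(len(own))]
            updated.append([self_weight * o + (1 - self_weight) * m
                            for o, m in zip(own, mean)])
        current = updated
    return current


# A 100-meter road becomes five tiles; the middle tile looks like two lanes
# because it is occluded, while its neighbours look like four.
tiles = split_into_tiles(100.0)
graph = chain_edges(len(tiles))
lane_evidence = [[4.0], [4.0], [2.0], [4.0], [4.0]]
smoothed = propagate(lane_evidence, graph, rounds=1)
```

After one round, the occluded tile's value moves from 2.0 toward its neighbors' 4.0, and more rounds let information travel farther along the road.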

This combined architecture represents a more human-like intuition, the researchers say. Say part of a four-lane road is occluded by trees, so some tiles show only two lanes. People can easily surmise that a couple of lanes are hidden behind the trees. Traditional machine-learning models (say, a CNN alone) extract features only from individual tiles and would most likely predict that the occluded tile is a two-lane road.

“People can use information from adjacent tiles to guess the number of lanes in the occluded tiles, but networks can’t do that,” He says. “Our approach tries to mimic people’s natural behavior, where we capture local information from the CNN and global information from the GNN to make better predictions.”

Learning weights

To train and test RoadTagger, the researchers used a real-world map dataset called OpenStreetMap, which lets users edit and curate digital maps around the world. From this dataset, they collected confirmed road attributes from 688 square kilometers of maps of 20 U.S. cities, including Boston, Chicago, Washington and Seattle. They then gathered the corresponding satellite images from a Google Maps dataset.

In training, RoadTagger learns weights, which assign varying degrees of importance to features and node connections, for the CNN and GNN. The CNN extracts features from the pixel patterns of tiles, and the GNN propagates the learned features along the graph. From randomly selected subgraphs of the road, the system learns to predict the road attributes at each tile. In doing so, it automatically learns which image features are useful and how to propagate those features along the graph. For example, if a target tile has unclear lane markings but its adjacent tile has four lanes with clear lane markings and shares the same road width, then the target tile is likely to have four lanes as well. In this case, the model automatically learns that road width is a useful image feature, so if two adjacent tiles share the same road width, they are likely to have the same number of lanes.
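The road-width example can be written out as an explicit rule to make the reasoning concrete. The model learns this kind of relationship from data rather than having it coded by hand, so the sketch below is a hand-written stand-in: the `Tile` class, the `infer_lanes` function and the one-meter `width_tolerance_m` are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Tile:
    road_width_m: float          # hypothetical feature: road width seen in the image
    lanes: Optional[int] = None  # lane count, or None if the markings are unreadable


def infer_lanes(tiles: List[Tile], index: int,
                width_tolerance_m: float = 1.0) -> Optional[int]:
    """Guess the lane count of tiles[index] from adjacent tiles of similar width."""
    target = tiles[index]
    if target.lanes is not None:
        return target.lanes  # markings are readable, nothing to infer
    for j in (index - 1, index + 1):
        if 0 <= j < len(tiles):
            neighbour = tiles[j]
            same_width = abs(neighbour.road_width_m - target.road_width_m) <= width_tolerance_m
            if neighbour.lanes is not None and same_width:
                return neighbour.lanes
    return None  # no adjacent tile is similar enough to borrow from


# The middle tile has unreadable markings but matches its left neighbour's width.
road = [Tile(14.0, 4), Tile(14.2, None), Tile(7.0, 2)]
guess = infer_lanes(road, 1)
```

Here the middle tile borrows the four-lane label from the neighbor with matching width and ignores the much narrower two-lane tile on its other side.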

Given a road that was not seen in training on OpenStreetMap, the model breaks the road into tiles and uses its learned weights to make predictions. Tasked with predicting the number of lanes in an occluded tile, the model notes that the neighboring tiles have matching pixel patterns and therefore a high likelihood of sharing information. So, if those tiles have four lanes, the occluded tile must have four as well.

In another result, RoadTagger accurately predicted lane numbers in a dataset of synthesized, highly challenging road disruptions. In one example, a two-lane overpass covered a few tiles of a four-lane target road. The model detected the mismatched pixel patterns of the overpass, so it ignored the two lanes over the covered tiles and accurately predicted that four lanes lay underneath.

The researchers hope to use RoadTagger to help people quickly validate and approve continuous modifications to infrastructure in datasets such as OpenStreetMap, where many maps do not contain lane counts or other details. One particular area of interest is Thailand, says Bastani, where roads are constantly changing but there are few, if any, updates to the dataset.

“Roads that were once labeled as dirt roads have been paved over, so they are better to drive on, and some intersections have been completely built over. There are changes every year, but digital maps are out of date,” he says. “We want to keep these road attributes up to date based on the most recent imagery.”

