Converting addresses and place names to geographic coordinates, known as geocoding, is extremely useful. To plot points on a map, you first need to translate human-readable descriptions of a place into ones that computers can understand. There are many ways to make this happen, and entire industries are built around supporting them. In this post, we want to take a look at the options based on open data: data that is publicly available, either from governments or from communities. In many cases, the geocoding software used to access this data is also publicly available as open source.
We’ll include five popular ways to geocode using open data (and sometimes open source). But first, consider what open geo data is and what alternatives exist.
Collecting and maintaining data about every street and place in the world is a huge undertaking. There are entire companies, and big ones at that, built to do just that. Among them are Tele Atlas (part of TomTom) and NAVTEQ (acquired by Nokia). And, of course, there is also Google, which in many areas of the world uses its own dataset, often collected with its Street View vehicles.
Licensing data is expensive, and these companies make hundreds of millions in revenue. That is one of the biggest arguments in favor of open geographic data. Like open source, it is community-supported and available for anyone to use. Open data sidesteps licensing requirements, so it should arguably be less expensive.
On the flip side, the big players have a monetary incentive to maintain their data. They have the revenue to cover the globe in quality assurance. In other words, the paid data should be better.
Yet, in some areas, especially remote and quickly-changing places, volunteers are able to better update the data. Volunteers are more likely to ensure their own areas are mapped when it’s not economically viable for a larger company to take it on.
In the end, the quality of the data depends on the type of data and location coverage needs. When considering mapping services, test several, and go with the one that fits your project. For this post, we’ll show several options for open source and open data geocoders.
Perhaps best known for its customizable map studio and visual representation of geo data, Mapbox is a complete mapping solution. That includes a robust geocoding offering based on open data, both public and community-created.
One major distinction between Mapbox and some of the other tools listed here: it is neither open source nor free (although there are no fees for fewer than 50,000 requests per month). That said, it meets the criterion of being based on open data. When paired with Mapbox maps, you can fully configure how the map styles look. Compared to running your own geocoder, you don’t have to keep data updated or run a server to translate addresses to geo coordinates.
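As a rough sketch of what calling a hosted geocoder involves, the snippet below builds a forward-geocoding request URL and reads coordinates out of a GeoJSON-style response. It assumes the v5 `mapbox.places` endpoint and a `center` field ordered longitude first; check the current Mapbox documentation before relying on either. The token, the sample payload, and the function names are placeholders invented for illustration.

```python
from urllib.parse import quote, urlencode

# Assumed base URL for Mapbox's v5 forward geocoding endpoint.
MAPBOX_BASE = "https://api.mapbox.com/geocoding/v5/mapbox.places"


def mapbox_geocode_url(query: str, access_token: str, limit: int = 1) -> str:
    """Build a forward-geocoding URL; the search text goes in the path."""
    params = urlencode({"access_token": access_token, "limit": limit})
    return f"{MAPBOX_BASE}/{quote(query, safe='')}.json?{params}"


def first_center(feature_collection: dict):
    """Return (lat, lon) of the first feature, or None if nothing matched."""
    features = feature_collection.get("features", [])
    if not features:
        return None
    lon, lat = features[0]["center"]  # assumed order: longitude, latitude
    return lat, lon


# Hand-written sample shaped like a response, not real API output.
sample = {"features": [{"place_name": "Example Place", "center": [-122.42, 37.77]}]}
url = mapbox_geocode_url("1 Main St, Springfield", "YOUR_TOKEN")
print(url)
print(first_center(sample))
```

Keeping URL construction and response parsing separate from the network call makes both easy to test without spending any of the monthly request allowance.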
Like Mapbox, OpenCage is based on OpenStreetMap and other open data sources. Unlike Mapbox, which has many different mapping tools, OpenCage is completely focused on geocoding. While OpenCage also charges for its service, the free plan is generous: 2,500 requests per day, which works out to about 75,000 free requests in a 30-day month.
The company is even more generous to paying customers:
If you are on the free trial, when you hit 2,500 requests in a day you begin to receive a 402 HTTP response code. If however you are a paying customer, when you reach your limit we do … nothing. The limits are “soft” in that if you need more requests on a given day, nothing happens, you can keep geocoding. If you cross the limit repeatedly then the following month we ask you to move to the next higher tier.
OpenCage sits atop numerous open data and open source projects, as seen on its credits page. The tools are combined to create great results world-wide.
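The quota behavior quoted above can be handled explicitly in client code. The sketch below assumes the `https://api.opencagedata.com/geocode/v1/json` endpoint with `q` and `key` parameters, and a response whose `results` entries carry a `geometry` object with `lat` and `lng`. The exception class and function names are hypothetical, and the payload is a hand-written sample.

```python
from urllib.parse import urlencode

OPENCAGE_ENDPOINT = "https://api.opencagedata.com/geocode/v1/json"  # assumed
FREE_DAILY_LIMIT = 2_500  # free plan figure quoted in this post


class QuotaExceeded(Exception):
    """Raised when the service answers with HTTP 402 (hypothetical helper)."""


def opencage_url(query: str, api_key: str) -> str:
    """Build a forward-geocoding request URL."""
    return f"{OPENCAGE_ENDPOINT}?{urlencode({'q': query, 'key': api_key})}"


def parse_opencage(status_code: int, payload: dict):
    """Return (lat, lng) for the top result, or None if nothing matched."""
    if status_code == 402:
        raise QuotaExceeded("daily free quota used up; retry tomorrow or upgrade")
    if status_code != 200:
        raise RuntimeError(f"unexpected HTTP status {status_code}")
    results = payload.get("results", [])
    if not results:
        return None
    geometry = results[0]["geometry"]
    return geometry["lat"], geometry["lng"]


# Hand-written sample, not real API output.
sample = {"results": [{"geometry": {"lat": 52.52, "lng": 13.4}}]}
print(opencage_url("Berlin, Germany", "YOUR_KEY"))
print(parse_opencage(200, sample))
print(FREE_DAILY_LIMIT * 30)  # free requests in a 30-day month
```

Treating 402 as its own exception lets a batch job stop cleanly and resume the next day instead of recording every remaining address as a failed lookup.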
Pelias describes itself as “a distributed full-text geographic search engine.” Previously, it was part of a for-profit company called Mapzen, which ran a hosted version of Pelias as a service. Though the company is no longer in operation, Pelias is entirely open source and based on OpenStreetMap and other open data. Therefore, if you have some technical skills, you can run it yourself.
Running your own Pelias server requires Node.js, Elasticsearch, and various supporting libraries. You can import its data, which has 285 million address points available, and it can provide results for many more through interpolation (as shown in the image above).
Pelias is a powerful option that requires some technical chops to run on your own.
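To make the idea of interpolation concrete: when a house number is missing from the data but its neighbors on the same street are known, a position can be estimated between them. The sketch below uses plain linear interpolation between two known address points. It is a deliberate simplification and not Pelias's actual method; real interpolation typically follows the street geometry and accounts for odd and even sides. All names and coordinates are made up.

```python
def interpolate_address(number, low, high):
    """Estimate (lat, lon) for a house number between two known points.

    low and high are (house_number, lat, lon) tuples on the same street.
    """
    (n0, lat0, lon0), (n1, lat1, lon1) = low, high
    if n0 >= n1 or not n0 <= number <= n1:
        raise ValueError("number must lie between two distinct known house numbers")
    t = (number - n0) / (n1 - n0)  # fraction of the way from low to high
    return lat0 + t * (lat1 - lat0), lon0 + t * (lon1 - lon0)


# Known points for numbers 100 and 200; estimate where number 150 sits.
lat, lon = interpolate_address(150, (100, 40.000, -75.000), (200, 40.002, -75.002))
print(round(lat, 6), round(lon, 6))
```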
Nominatim is a search engine for OpenStreetMap data. You can connect to it as a service for free, but it has some very strict usage requirements. Nevertheless, the service can help you test OpenStreetMap data to see if it will work for your needs.
For production use, it’s best to install Nominatim yourself. It requires PostgreSQL with PostGIS and several other languages and libraries. Like Pelias, you’ll want to have some technical background to run it on your own.
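A minimal sketch of a polite Nominatim client follows. It assumes the `/search` endpoint with `q`, `format`, and `limit` parameters, a JSON array response with `lat` and `lon` as strings, and a usage policy of roughly one request per second with an identifying User-Agent on the public server. Verify these against the current Nominatim documentation. The `base_url` parameter lets the same code point at your own installation; the function names are illustrative.

```python
import json
import time
from urllib.parse import urlencode
from urllib.request import Request, urlopen

PUBLIC_NOMINATIM = "https://nominatim.openstreetmap.org"


def nominatim_url(query: str, base_url: str = PUBLIC_NOMINATIM) -> str:
    """Build a free-text search URL."""
    params = urlencode({"q": query, "format": "jsonv2", "limit": 1})
    return f"{base_url}/search?{params}"


def parse_nominatim(results: list):
    """Return (lat, lon) as floats for the top hit, or None if empty."""
    if not results:
        return None
    top = results[0]
    return float(top["lat"]), float(top["lon"])  # values arrive as strings


def geocode_all(queries, user_agent, base_url=PUBLIC_NOMINATIM, pause_seconds=1.0):
    """Geocode each query in turn, pausing between requests."""
    coords = {}
    for query in queries:
        request = Request(nominatim_url(query, base_url),
                          headers={"User-Agent": user_agent})
        with urlopen(request, timeout=10) as response:
            coords[query] = parse_nominatim(json.load(response))
        time.sleep(pause_seconds)  # assumed limit: about one request per second
    return coords


print(nominatim_url("Eiffel Tower"))
print(parse_nominatim([{"lat": "48.858", "lon": "2.294"}]))
```

Against a self-hosted instance you could pass your own `base_url` and lower `pause_seconds`, since the throttle exists only to respect the shared public server.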
Lastly, if you’re looking for place search, rather than full addresses, you may be happy with OSM Names. The OSM, as you may have guessed, stands for OpenStreetMap, the community project many of these open data geocoders are based upon. In this case, landmarks, cities, and postal codes are included in OSM Names. For many uses this can suffice!
Using open data and open source geocoders can be as simple as making an API call (which, admittedly, is not simple for everyone) or as complex as installing software on your own servers. The quickest start, and sometimes the best data, is available from services for a monthly fee. But if you’re looking to have the most control, you can go with the open source options that only cost you your time.
|Name||Open Data||Open Source||Addresses||Places||Free|
As you can see, if all of these things are important to you, you’ll want to go with Pelias. But, as mentioned previously, it’s worth comparing several options to each other. You can even compare the open versions to the geocoders from Google, Microsoft, and others.
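One simple way to compare geocoders is to send the same address to each and measure how far apart the returned points are. The sketch below computes pairwise great-circle distances with the haversine formula, assuming a spherical Earth with a mean radius of 6,371 km. The service names and coordinates in the example are invented, not real results.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * asin(sqrt(h))


def disagreement(results: dict) -> dict:
    """Distance in metres between every pair of geocoder results."""
    names = sorted(results)
    return {
        (x, y): haversine_m(results[x], results[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


# Made-up coordinates for one address from three hypothetical services.
results = {
    "service_a": (40.7128, -74.0060),
    "service_b": (40.7130, -74.0059),
    "service_c": (40.7306, -73.9866),
}
for pair, metres in disagreement(results).items():
    print(pair, round(metres), "m")
```

Large distances for a given address suggest that at least one service matched the wrong place or fell back to a street or city centroid, which is worth checking by hand.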