Geocoder updates

As a result of Yahoo Maps rolling back some functionality in their geocoder, I have had to get rid of a couple of features on the batch geocoder.

First, you will no longer be able to look up associated 9-digit zip codes for your address list. I am not sure how many of you were interested in this functionality; if you were a fan of it, I’m sorry. Of course you can always use single free zip+4 lookup services like this one.

Second, the exact precision of the geocoded addresses will no longer be reported. To help deal with this I have added a new feature that lets you view the street level location of any point on the map just by clicking on it. It will be used in place of an image if you haven’t populated the Image URL field, so you can click on points to verify that they fall on the correct street.

Please feel free to post your feedback on these new changes.

Yahoo disables JSON output on geocoding API

Word is that Yahoo will soon disable the JSON output format for their geocoding API; the REST-based geocoder will remain. What does this mean? Well, the JSON API is what makes tools like our batch geocoder possible. Without it I would need to use a server-side proxy, meaning requests going to Yahoo would come from our web server instead of the end user’s IP address. This means the 50,000 per day limit would apply to the server: only 50,000 geocodes total for batchgeocode.com.

Why can’t the user’s browser communicate directly with the XML-based REST geocoding API? Well, despite being built with nifty XML-enabling features like XMLHttpRequest, modern browsers are held back by security constraints that keep client-side scripts from communicating with multiple domains. JSON gets around this problem by using On-Demand JavaScript to dynamically load content through <script> tags that don’t have the same cross-domain limitation. Why do the browsers limit your ability to make calls out using XMLHttpRequest but not by using the <script> tag? Who knows…

What I do know is that I did see this coming. No way is Yahoo going to throw out a free geocoding API with a JSON output format and not think about the possibility of turning it off someday. It was inevitable that a service like batchgeocode.com would be created, and that would inevitably mean that the data providers would complain about such a service. Perhaps this is why the JSON output format was never mentioned on the Yahoo geocoding API reference page?

Still, Yahoo is interested in providing geocoding services in their maps; it’s what differentiates them from the competition. So geocoding isn’t really going away, it’s just getting reworked a bit. The whole farm is no longer available for free, but the house still is.

Calculating distances to multiple addresses is fun!

Okay, maybe it’s not really that entertaining, but you can do it now by checking the “Calculate distance” option in Step #4 of the batch geocoder.
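For the curious, here is a minimal sketch of how distances from one point to a list of geocoded points could be calculated. It assumes the haversine great-circle formula on a spherical Earth, which is a common choice but not necessarily what the batch geocoder does internally. The function names, the mean Earth radius constant, and the rounding to two decimal places are all my own assumptions for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; assumes a spherical Earth
KM_PER_MILE = 1.609344


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    # Haversine formula
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distances_from(origin, points):
    """Return (kilometers, miles) from origin to each point, rounded to two decimals.

    origin and each point are (lat, lon) tuples in degrees.
    """
    results = []
    for lat, lon in points:
        km = distance_km(origin[0], origin[1], lat, lon)
        results.append((round(km, 2), round(km / KM_PER_MILE, 2)))
    return results
```

Calling `distances_from((45.52, -122.68), [(47.61, -122.33)])` would give one (kilometers, miles) pair per destination point.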

The distance is purposely limited to miles and kilometers (two digit precision.) Why not display more precision by using feet and meters? Well, anyone who’s familiar with how geocoding works knows that it’s not quite that precise. Coordinates are calculated by finding the block the address is on; that part is quite accurate. Then the side of the street is determined by checking to see if the address number is odd or even. So far so good.

What follows is not so accurate…

First, the point is set a certain distance from the street centerline; after all, the building is not likely to be in the middle of the street. However, there is no good way to know just how far back from the center of the street the building is, so it’s guessed. Usually this is a global value set when the geocoder is configured. At the City of Portland we generally pick around 50 feet from the address block. The Yahoo Geocoder that I use for BatchGeocoder.com does not specify how far the offset from the centerline is, but from my crude measurements it’s probably close to 50′. There is no way to know for sure how far back the building or building entrance is located, but 50′ is usually darn close.

The final step (and least accurate) in the geocoding process is to try to approximate a location along the block using the address number. This part is really just a total guess. The reason is that the address range on your average block face is a nice big range like 1000-2000, or 100-200. However, on average there are only a dozen or fewer properties on a block. The geocoder does not actually know how many properties are located on the block; the centerline data does not indicate this. In fact it’s not even sure if the address is really there or not.

The geocoder’s best guess about where the address might be located on the block is made by taking the street number and calculating its position along the centerline using the block range. Example: If the range was 100-200, and the address number was 150, the geocoder would place the point halfway along the block. Again, this is a guess at best. If the geocoder manages to place the point right on top of the address it is just getting lucky!
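The steps described so far can be sketched in a few lines. This is only an illustration under stated assumptions, not how the Yahoo geocoder or any particular product is implemented: the block is a single straight segment in flat x/y coordinates measured in feet, odd numbers fall on the left of the segment direction and even numbers on the right (real centerline data records this per block), and the function name and the 50 foot default offset are my own choices.

```python
import math


def interpolate_address(number, range_start, range_end, start_xy, end_xy, offset_ft=50.0):
    """Guess an (x, y) point for a house number on one straight block segment.

    start_xy is the end of the block where range_start begins, end_xy the other end.
    Coordinates and offset_ft are in feet on a flat plane (an assumption).
    """
    # Fraction of the way along the block, based only on the address range.
    if range_end == range_start:
        fraction = 0.5
    else:
        fraction = (number - range_start) / (range_end - range_start)
    fraction = min(1.0, max(0.0, fraction))  # keep the point on the block

    (x1, y1), (x2, y2) = start_xy, end_xy
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("block segment has zero length")

    # Point on the centerline.
    cx, cy = x1 + fraction * dx, y1 + fraction * dy

    # Unit vector pointing to the left of the segment direction.
    nx, ny = -dy / length, dx / length

    # Assumed parity rule: odd numbers on the left, even numbers on the right.
    side = 1.0 if number % 2 else -1.0

    return (cx + side * offset_ft * nx, cy + side * offset_ft * ny)
```

With a 400 foot block running east from (0, 0) and a range of 100-200, number 150 lands at the midpoint of the block, 50 feet to the right of the centerline. Nothing in the calculation knows whether a building actually stands there.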

Now other things can help the accuracy, like setting an offset from the start of the block range (similar to the offset from the center line.) The geocoder does know which end of the street to start the calculation from (for example does the 100 address start on the north end or south end, east or west.) But for the most part, geocoding is not that accurate when looking at precision beyond the block range.

For most applications this doesn’t matter too much. If you are looking at points zoomed out to the zip code or city level, then who cares about ±100 feet of precision? For more precision you have to have parcel data that is linked to an address database. Then you are looking up actual addresses with attached parcel polygons and centering the point in the middle of the parcel (talk about accuracy!) A good example of this is PortlandMaps.com.

So that is the not so short explanation of why batchgeo.com will not show you distance precision in feet and meters. Isn’t GIS fun?