Amazon S3 for map tile storage and delivery?

Amazon recently launched its new S3 storage service, and everyone seems to be clamoring to figure out uses for it. Well, here is my contribution: map tile storage and serving.

Think about it: you want to create your own tile-based map delivery (because your boss has been nagging you about it ever since the Google Maps launch), but where are you going to store those gigabytes and gigabytes of images? And how are you going to deliver them? Stick them all in a database? Write a wrapper script around that? That sounds like an awful lot of bandwidth, cycles, and storage. And imagine that every time there is a breakdown, your pager goes off and you have to fix it.

Or, get an S3 account and blaze away. I’d bet dollars to donuts that S3 is a heck of a lot cheaper than your average “enterprise” network storage solution. In fact I’ll just tell you: it’s cheaper. The drawback is that you won’t have LAN-speed access to it, but if your target is the internet, who cares? It will likely take just as long to generate all those tiles as it will to upload them anyway.
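To make the idea a bit more concrete, here is a rough sketch of how tiles might be named as S3 objects. It assumes the common z/x/y Web Mercator tile numbering, a made-up bucket name, a made-up "tiles/" key prefix, and publicly readable objects; none of that comes from a real setup, and the actual upload step is left out.

```python
import math

BUCKET = "example-map-tiles"  # hypothetical bucket name, for illustration only


def latlon_to_tile(lat_deg, lon_deg, zoom):
    """Return the (x, y) tile containing a point, assuming z/x/y Web Mercator numbering."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the far edges still land on a valid tile.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y


def tile_key(zoom, x, y, ext="png"):
    """Object key for one tile; the 'tiles/' prefix is an arbitrary choice."""
    return f"tiles/{zoom}/{x}/{y}.{ext}"


def tile_url(zoom, x, y):
    """URL a map client would request, assuming the object is publicly readable."""
    return f"https://{BUCKET}.s3.amazonaws.com/{tile_key(zoom, x, y)}"
```

With a layout like this the map client can compute every tile URL on its own, so nothing but the storage service sits between the browser and the images.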

I’d also bet that Amazon’s delivery will be much faster, with lower latency, than what an average-sized shop could do on a T1 with a weenie little 4-processor database server. Who knows what sort of super-optimized proprietary network, hardware, and software architecture Amazon has put together to make its system work? More than likely it’s much better than what’s available off the shelf.

Did I mention scalability? Scalability in this model is just a matter of sending more dollars off to Amazon. Assuming your business model has you making more for each visit than you need to spend on it, you will just keep making more and more profit, no matter how many users show up. Got Slashdotted? No biggie: the capacity briefly expands to take on the Niagara Falls-sized volume, and then returns to normal when it has passed. The point is, you captured every bit of the revenue generated from that extra traffic.
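The arithmetic behind that claim is simple enough to sketch. The per-visit figures below are invented purely for illustration, and the model assumes every cost is a per-visit cost with no fixed overhead, which is the pay-as-you-go premise of this post rather than a measured fact.

```python
from decimal import Decimal


def profit(visits, revenue_per_visit, cost_per_visit):
    """Profit when both revenue and cost scale linearly with visits."""
    return visits * (revenue_per_visit - cost_per_visit)


# Hypothetical numbers: 2 cents earned and half a cent spent per visit.
REVENUE = Decimal("0.02")
COST = Decimal("0.005")

normal_day = profit(1_000, REVENUE, COST)        # 15 dollars
slashdotted_day = profit(100_000, REVENUE, COST)  # 1500 dollars
```

As long as the margin per visit stays positive, a hundred times the traffic means a hundred times the profit; the whole argument rests on that margin staying positive.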

So the next question is: after storage, what’s next? Application delivery?

Maybe instead of thinking in terms of racks of servers, we should be thinking of tracking cycles and storage down to the smallest possible measurements and paying for only what we need, with endless ability to scale. Developers can add their applications to the Internet Borg cube, and after some marketing, expect to see a linear increase in profits along with traffic. No more hassling over rack space, load balancers, hard drive failures, backups, software licensing, and so on.

Here it comes, the infinitely scalable internet application model. Sustainable growth, just add water.

Commercial or Public, It’s still all about the data

My day job at the City of Portland lets me work with really cool data. Take a look at PortlandMaps, for example. It has several dozen different datasets all rolled into one easy-to-navigate interface. The mapping GUI and speed are not up to today’s standards of AJAX-based map viewers (yet), but the underlying data is much more complete and powerful than what is available anywhere at the national level.

For example, we have access to four counties’ worth of parcel, or tax lot, data. This information is key to seeing where property lines are on a map without squinting through the trees on aerial photos (but we do have those too). We also have building footprints for the entire city of Portland. Overlay the two on top of a 6 inch/pixel aerial photograph and pair it with weekly updated assessor data, and you have a very powerful property viewing tool.

We also have great data for zoning, utilities, crime incidents, hazard levels, building permits, city parks, etc. We have firsthand access to all this data because our group at the city is responsible for gathering it from the various regional entities (mostly government based, at the city and county level). In exchange for the entities giving us their data, we give them back all of the other data we have collected. The three most popular datasets are tax lots, aerial photos, and street centerlines.

A few years ago we decided it might also be nice to open up access to the general public, hence PortlandMaps.com. After its launch, it soon became apparent that PortlandMaps was an excellent tool not only for citizen access, but for all of our data partners as well. It has become an invaluable resource for both.

Where am I going with all of this? Well, the main point is: it’s all about the data. PortlandMaps would not exist if it were not for the work of hundreds of individuals at the city, county, and state level creating datasets and giving them back to the public for free.

Now we have a parallel with companies like MapQuest, Yahoo, and Google all offering transportation/routing information at the national (and sometimes international) level. These are great services, but they only provide directions and routing.

Why not provide all of the data of PortlandMaps in a nationwide interface? Again: it’s all about the data. Even if they could collect data from all the various counties, cities, and states in the U.S., compiling it all into one database would be a sizeable task. Companies like Zillow are attempting this; they have parcel data in many areas as well as detailed assessor records. No doubt a huge effort went into gathering and normalizing data from all of these various entities.

I know how hard that can be, because I see what we must go through in Portland to do it on a local level. Data formats differ and can change at the will of the data provider. There are no standards, so creating one for all of the data to fit into is a big task, to say the least. Add to that the data providers’ tendency to set rules on how data can be used, how much it costs to obtain, and how difficult it is to get (hint: they don’t just leave it out on an FTP server somewhere).
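As a rough illustration of what that normalization work looks like, here is a minimal sketch that maps records from two providers into one common shape. The provider names, field names, and the three-field Parcel schema are all invented for the example; real assessor data is far messier than this.

```python
from dataclasses import dataclass


@dataclass
class Parcel:
    """Hypothetical common schema that every provider's records are mapped into."""
    parcel_id: str
    owner: str
    assessed_value: int  # whole dollars


# Hypothetical per-provider mapping: source field name -> common field name.
FIELD_MAPS = {
    "county_a": {"TLID": "parcel_id", "OWNER_NAME": "owner", "ASSESSED": "assessed_value"},
    "county_b": {"pin": "parcel_id", "owner": "owner", "total_val": "assessed_value"},
}


def normalize(provider, record):
    """Translate one provider record into the common Parcel shape."""
    mapping = FIELD_MAPS[provider]
    # Providers can change their format without notice, so fail loudly
    # when an expected field has disappeared or been renamed.
    missing = [src for src in mapping if src not in record]
    if missing:
        raise KeyError(f"{provider} record is missing fields: {missing}")
    common = {dst: record[src] for src, dst in mapping.items()}
    # Accept values such as "$250,000" as well as plain integers.
    value = int(str(common["assessed_value"]).replace("$", "").replace(",", ""))
    return Parcel(
        parcel_id=str(common["parcel_id"]).strip(),
        owner=str(common["owner"]).strip(),
        assessed_value=value,
    )
```

Every new provider means another mapping entry and, more often than not, another special case in the cleanup code, which is why doing this nationwide is such a large job.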

The local data providers know how valuable their data is, and even though they might be required by law to make it publicly available (in the case of government agencies), they will make it as difficult as possible. Again, there is a parallel with commercial data providers like Navteq and TeleAtlas. Getting data from these companies is not usually difficult; in fact, if you own a car with a navigation system you probably already have a copy of it. But they impose strict licensing rules that limit what you can use it for, and they may even charge extra. This is why it is estimated that Yahoo, MapQuest, and Google all pay a fee back to their data providers every time they calculate a route. No doubt it is tiny, probably a fraction of a penny, but it is a fee all the same.

Now the service providers are interested in giving away free APIs to further expose their branding and potential advertising to would-be affiliate web sites. No doubt checking their every move are the data providers, who desperately need to protect the value of their hard-won data. That data needs to be maintained constantly to keep up with the ever-changing infrastructure of our country. Just like the local data providers that help PortlandMaps become a service, everyone wants to protect what they work so hard to create and maintain.

It’s all about the data.

Google Maps vs. Yahoo Maps vs. MapQuest – APIs

Since Google Maps launched its API, allowing developers to use the mapping service to draw their own data, Yahoo has been playing catch-up with its own API. Now, with MapQuest’s announcement of its new API, it’s a three-way race. Which one to choose?

Google Maps API

Pros:

  • Fluid interface, brilliant looking map marker flyouts
  • International
  • Built-in aerial photos
  • Largest developer base, as a result…
  • Lots of hacks and how-to’s available

Cons:

  • No built-in geocoding service
  • No built-in routing capability

Yahoo Maps API

Pros:

  • Built-in and external geocoding capability
  • Very flexible and open APIs
  • Rate limiting by IP instead of appID
  • Built-in GeoRSS support
  • Flash version available

Cons:

  • U.S. and Canada only
  • Flyouts not quite as spiffy as Google
  • No aerial photo option

MapQuest API

Pros:

  • Built-in routing (driving directions) capability
  • Built-in geocoding capability

Cons:

  • No smooth AJAX client (yet)
  • Rate limiting by appID + web site URL (instead of end-user IP)
  • No photos option

Yahoo and MapQuest seem to be eager to please their developers, probably with good reason: they have a lot of catching up to do with Google. I give Yahoo a lot of credit for being first to release an AJAX map client with built-in geocoding functionality. That’s one clear area where they are ahead of Google.

Time will tell how sustainable each company’s model is and how much change will be necessary. Remember, too, that they aren’t always going to give this away for free; even if there is no charge in the future, there are bound to be ads.