Where I’m working at the moment we’re using the Yahoo Geocoding API but aren’t very happy with it. I’ve been asked to look into how we can improve our geocoding.
Geocoding services vary greatly in accuracy, precision, availability, rate limits, and other attributes, so it can help to try multiple services until you have sufficient confidence in the result.
It seems there are plenty of modules on CPAN for geocoding from a single source, including Yahoo, Google, Mapquest, Multimap, Cloudmade and Bing. The only one that I could find that handles multiple services was Geo::Coder::Multiple.
I’m writing this blog post for two reasons…
Firstly, I’m interested in your experiences with geocoding services. Which have you tried, and which would you recommend for geocoding US addresses? What problems have you encountered, and is there any advice you’d like to pass on?
Secondly, I’m interested in your thoughts on working with multiple services.
Geo::Coder::Multiple looks interesting but quite limited. For example, it’ll accept the first valid response even if it’s of low precision. There’s also no provision for checking multiple results to derive some measure of confidence, for “knowing when to stop”.
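To make “some measure of confidence” a little more concrete, here is a rough sketch of one possible approach: score how closely the coordinates returned by different services agree with each other. It is written in Python purely for illustration and is not taken from any CPAN module. The function names, the 0.5 km tolerance and the idea of using pairwise agreement are all my own assumptions.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))  # 6371 km: mean Earth radius

def agreement(points, tolerance_km=0.5):
    """Fraction of point pairs that lie within tolerance_km of each other.

    Returns 0.0 when there are fewer than two points, since a single
    result cannot be cross-checked against anything.
    """
    pairs = [(p, q) for i, p in enumerate(points) for q in points[i + 1:]]
    if not pairs:
        return 0.0
    close = sum(1 for p, q in pairs if haversine_km(p, q) <= tolerance_km)
    return close / len(pairs)
```

A caller could keep querying services until `agreement(...)` crosses some threshold, which is one way of “knowing when to stop”. Whether pairwise distance is the right measure probably depends on the kind of addresses involved.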
Some feature ideas:
- Ordered list of geocoders
- Automatic rate limiting: detect an over-limit response and disable that geocoder for a period, perhaps with exponential back-off.
- Result-filter callback to discard uninteresting responses, e.g., precision too low to be useful.
- Result-picking callback to pick the best result from those collected so far. It would be told whether there are more geocoders to try, and could return undef to mean “keep going”.
- Some pre-defined result-picking callbacks for common use cases.
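
To show how these ideas might fit together, here is a minimal sketch, again in Python for illustration only. Nothing here is the API of Geo::Coder::Multiple or any other existing module. The class and function names, the `OverLimit` exception, the numeric `precision` key on results and the 60-second base delay are all hypothetical. Python’s `None` plays the role of undef.

```python
import time

class OverLimit(Exception):
    """Hypothetical: raised by a geocoder wrapper on an over-limit response."""

class MultiGeocoder:
    def __init__(self, geocoders, keep, pick, base_delay=60.0,
                 clock=time.monotonic):
        self.geocoders = list(geocoders)  # ordered list of (name, callable)
        self.keep = keep                  # result-filter callback
        self.pick = pick                  # result-picking callback
        self.base_delay = base_delay      # seconds; doubles per repeat offence
        self.clock = clock
        self.strikes = {}                 # name -> consecutive over-limit count
        self.disabled_until = {}          # name -> clock time to re-enable

    def geocode(self, address):
        collected = []
        for i, (name, lookup) in enumerate(self.geocoders):
            if self.clock() < self.disabled_until.get(name, float("-inf")):
                continue  # still backing off from an earlier over-limit
            try:
                result = lookup(address)
            except OverLimit:
                n = self.strikes.get(name, 0)
                self.strikes[name] = n + 1
                # Exponential back-off: base, 2*base, 4*base, ...
                self.disabled_until[name] = self.clock() + self.base_delay * 2 ** n
                continue
            self.strikes[name] = 0
            if result is None or not self.keep(result):
                continue  # discard uninteresting responses
            collected.append(result)
            more_to_try = i + 1 < len(self.geocoders)
            best = self.pick(collected, more_to_try)
            if best is not None:  # None means "keep going"
                return best
        # Nothing left to try: let the picker settle for what we have.
        return self.pick(collected, False) if collected else None

def best_at_least(threshold):
    """Pre-defined picker: stop at the first result meeting the threshold,
    otherwise fall back to the most precise result once nothing is left."""
    def pick(results, more_to_try):
        good = [r for r in results if r["precision"] >= threshold]
        if good:
            return max(good, key=lambda r: r["precision"])
        if more_to_try:
            return None
        return max(results, key=lambda r: r["precision"])
    return pick
```

Usage might look like `MultiGeocoder([("yahoo", yahoo), ("google", google)], keep=lambda r: r["precision"] >= 2, pick=best_at_least(7))`, where `yahoo` and `google` are stand-ins for whatever wrappers you write around the real services. Note that `more_to_try` here is only a guess, since a later geocoder may turn out to be disabled.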
Any thoughts on those?
What kind of features would you like to see?
Want to help build this?