Turn address strings into GPS coordinates for free with Google

I recently wanted to geocode a large number of addresses (around 60,000) in Ireland as part of a visualisation of the Irish property market. In R, geocoding is straightforward with the geocode() function from the ggmap library, which uses Google’s Geocoding API to turn text addresses into latitude and longitude pairs.

Free users of the geocoding service are limited to 2,500 addresses per IP address per day. This hard limit cannot be overcome without using a new IP address or paying for a business account. To ease the pain of restarting an R process every day for each batch of 2,500 addresses, I’ve built a script that geocodes addresses up to the API query limit every day, with a few handy features:

  • Once it hits the geocoding limit, it patiently waits for Google’s servers to let it proceed.
  • The script pings Google once per hour during the down time to start geocoding again as soon as possible.
  • A temporary file containing the current data state is maintained during the process. Should the script be interrupted, it will start again from the place it left off once any problems with the data or connection have been rectified.
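
The three behaviours listed above can be sketched as a small resumable loop. This is only an illustrative sketch under stated assumptions, not the script itself: the names geocode_resumable, geocode_one, tempfile_path and wait_secs are hypothetical, and geocode_one stands in for whatever wrapper you write around ggmap's geocode(). It is assumed to return a one-row data frame with a status column, where "OVER_QUERY_LIMIT" signals that the daily quota is used up.

```r
# Geocode `addresses` one at a time, checkpointing progress to `tempfile_path`.
# `geocode_one` is a hypothetical function: it takes one address string and
# returns a one-row data frame with the columns lat, long and status.
geocode_resumable <- function(addresses, geocode_one, tempfile_path,
                              wait_secs = 60 * 60) {
  geocoded <- data.frame()
  start_index <- 1

  # Resume after the last completed row if a checkpoint file exists.
  if (file.exists(tempfile_path)) {
    geocoded <- readRDS(tempfile_path)
    start_index <- nrow(geocoded) + 1
  }

  i <- start_index
  while (i <= length(addresses)) {
    result <- geocode_one(addresses[i])

    if (isTRUE(as.character(result$status) == "OVER_QUERY_LIMIT")) {
      # Quota exhausted: wait (one hour by default), then retry this address.
      Sys.sleep(wait_secs)
      next
    }

    result$index <- i
    geocoded <- rbind(geocoded, result)
    saveRDS(geocoded, tempfile_path)  # checkpoint after every address
    i <- i + 1
  }

  geocoded
}
```

Passing the geocoder in as an argument keeps the loop testable without network access. Writing the checkpoint after every address favours safety over speed, so for very large inputs you may prefer to save every few hundred rows instead.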
[Image: points in Dublin]

The R script assumes that you are starting with a database that is contained in a single *.csv file, “input.csv”, where the addresses are contained in the “address” column. Feel free to use/modify to suit your own devices!
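
As a rough illustration of that assumed layout, the snippet below writes a tiny stand-in for input.csv to a temporary directory and reads the address column back. The sample rows and the temporary path are invented for the example; only the file layout and the “address” column name come from the description above.

```r
# Build a small stand-in for "input.csv" (the sample rows are invented).
input_path <- file.path(tempdir(), "input.csv")
writeLines(c(
  'id,address',
  '0,"1 Example Street, Dublin, Ireland"',
  '1,"2 Example Road, Cork, Ireland"'
), input_path)

# Read the addresses as character strings rather than factors.
data <- read.csv(input_path, stringsAsFactors = FALSE)

# Fail early, with a clear message, if the expected column is missing.
stopifnot("address" %in% names(data))
addresses <- data$address
```

Running head(data) at this point is a quick way to confirm the file was parsed as expected before any geocoding requests are sent.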

Comments are included where possible:

Let me know if you find a use for the script, or if you have any suggestions for improvements.

Please be aware that it is against the Google Geocoding API terms of service to geocode addresses without displaying them on a Google map. Please see the terms of service for more details on usage restrictions.

  1. Hi Ann – what’s the error that you are getting? Have you got your csv file in the same place as your R script? And finally, have you set the “working directory” of R to the same directory?

    As another try – change read.csv(paste0('./', infield, '.csv')) to just read.csv(paste0(infield, '.csv'))

    Hope this works!

  2. Hi Shane, thanks for this highly useful code. What do the rows of your input file look like? I mean, how is the address typed in each row? I need to add the US state after the primary address; for example, the primary address is “abcd high school”, followed by Oklahoma, US. Could you please help me out here?

    • The address can be just a string column in the csv file. So something like:

      id,address,other_column
      0,"Test Address, Test Town, State","other value"
      1,"Test Address2, Test Town2, State","other value"

      Does that make sense? You can use paste() to add a state at the end if you have it in another column etc.
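
A minimal sketch of the paste() suggestion, assuming the place name and state sit in two separate columns. The column names name and state and the second sample row are hypothetical, made up for illustration.

```r
# Hypothetical input with the place name and the state in separate columns.
data <- data.frame(
  name  = c("abcd high school", "efgh high school"),
  state = c("Oklahoma", "Texas"),
  stringsAsFactors = FALSE
)

# paste() is vectorised, so one call builds the full string for every row.
data$address <- paste(data$name, data$state, "US", sep = ", ")
```

The resulting address column can then be used as the input to the geocoder in place of the bare name.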

  4. Here is the error – Error in gzfile(file, mode) : cannot open the connection
    In addition: Warning message:
    In gzfile(file, mode) :
    cannot open compressed file ‘../data/C:/Users/*******/Documents/April 2017/Addresses.csv_geocoded.rds’, probable reason ‘Invalid argument’

    As a beginner, I find the error confusing and unclear. I followed your script without any changes and created my own csv file for testing purposes. All went well until I neared the end, unfortunately. Maybe the script should not be run exactly as shown, or maybe I am missing an important part of the instructions. Advice and suggestions are welcome.

  5. Hi Shanelynn,

    Thank you for the script, it has been very helpful for the purposes of my project. I noticed above in the comments that someone spotted the error on line 77 and it is now correct here. However, I accessed the script from Rbloggers here and the error is still there. I wanted to let you know in case you had any control over fixing the error on Rbloggers as well. Thank you!

  6. Hello,

    I am getting an error when I run line 63 – 73 which says

    Error in if (location == "") return(failedGeocodeReturn(output)) :
    missing value where TRUE/FALSE needed

    Thoughts? Ideas? Fixes?


  7. Hi 🙂 I noticed many people mentioning the repeating of addresses when it continues from index – would simply adding a +1 to the condition not work? Like this:

    if (file.exists(tempfilename)) {
      print("Found temp file - resuming from index")
      geocoded <- readRDS(tempfilename)
      startindex <- nrow(geocoded) + 1
    }

    I tried it and it seems to work for me, but maybe I’m missing something :/

  8. Hi, Shane, thanks for your amazing code. Super helpful.
    I have a basic question about tempfilename. I suppose I can use any name, but in which format should I save the file?


  9. Hi Shane,

    Thanks for the code! I have a couple of questions, but I’m brand new to programming, so forgive my ignorance.

    1) I have received the following error: “Error in data$Address (from DB_Geocode_trial1) : object of type ‘closure’ is not subsettable”. What is this? And how do I fix it?

    2) I am working on a water-related project in which I need to find coordinates for “Habitations” (similar to a village) in India. In other words, I don’t have street names. Is this code meant to accomplish a task like this? If not, do you have any ideas for how I can modify it?

    Sorry that I have so many questions. I appreciate any and all assistance!

    Thanks in advance,


    • Hi Brooke. It sounds like your data didn’t read in correctly for the first error – make sure your CSV file is correctly formatted, and that the data shows up correctly if you run “head(data)”. For the water project, the geocoder from Google should work fine with areas rather than street addresses, as long as those habitations are marked on Google. To help it along, you can add “India” to the end of the geocoding string.

    • Hi Rajanna, this is called “reverse-geocoding” – you’ll need to look at the specific API for that – Google has one if you look around the documentation!

  10. This worked for me and I don’t even have addresses. I passed it “Region, Country”, sometimes “,Region, Country” and it’s working well.

    Thank you!

  11. This is great! Would it be possible to adapt the code for the gmapsdistance function? Is this something you have already done? That is, I need to calculate multiple travel distances (in time)…rather than geocode addresses. Thanks for your thoughts!

  12. Hi Shane,
    Thanks for the useful code. When I tried it with 252 addresses it worked fine. Four days later I used a new input.csv file with 221 rows, and it started giving the error below:
    Error in if (location == "") return(failedGeocodeReturn(output)) :
    missing value where TRUE/FALSE needed

    It is also still using the values from the old 252-row file. Please provide a solution; I am waiting for your reply.

    Thanks and regards,
    Prakash Hullathi
