Last update 2009/04/29

The Labs - Design & Functionality For The Net

Automatic Geotagging Photos without GPS

  1. Introduction
  2. Finding Location
  3. Geotagging Photos
  4. Download
  5. Usage
  6. Examples
  7. Places of Interest
  8. Client/Server
  9. References
1. Introduction
I took a couple of hundred photos during my past bicycle travels, wrote diaries, and added descriptions to many of the photos.

I considered geotagging the photos (finding the proper location and its latitude and longitude coordinates), but postponed it after my first attempts. Now I made another attempt with a database from Geonames.org, using cities1000.txt, which lists approx. 85,000 cities with a population over 1,000 (allCountries.txt has 8,000,000 entries, which I will test next). The most useful data in this dataset are the aliases, which list city names in different languages and variations; they made my second attempt a success.

2. Finding Location

First I take cities1000.txt and fill a SQLite database with two tables, geocities and geoalias:

  • geocities holds one entry per city: name, alias, lat, long, district and country code
  • geoalias holds all aliases, each pointing to a geocities entry

The geoalias table speeds up the lookup, at least in theory.
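
A minimal sketch of such a two-table layout, written in Python with the standard sqlite3 module rather than the Perl the script itself uses. The column names (id, lon, city_id), the index name and the helper functions are assumptions made for illustration; the text above only names the two tables and the kind of data they hold.

```python
import sqlite3

def create_schema(db):
    # Table names come from the text; column and index names are assumptions.
    db.executescript("""
        CREATE TABLE geocities (
            id INTEGER PRIMARY KEY,
            name TEXT, lat REAL, lon REAL,
            district TEXT, country TEXT
        );
        CREATE TABLE geoalias (
            alias TEXT,
            city_id INTEGER REFERENCES geocities(id)
        );
        CREATE INDEX geoalias_alias ON geoalias(alias);
    """)

def add_city(db, city_id, name, lat, lon, district, country, aliases):
    db.execute("INSERT INTO geocities VALUES (?, ?, ?, ?, ?, ?)",
               (city_id, name, lat, lon, district, country))
    # Store the canonical name and every alias in lowercase,
    # so a lookup is a single indexed equality match.
    for alias in {name, *aliases}:
        db.execute("INSERT INTO geoalias VALUES (?, ?)",
                   (alias.lower(), city_id))

def lookup(db, text):
    # Return every city that has the given name or alias.
    rows = db.execute(
        "SELECT c.name, c.district, c.country, c.lat, c.lon "
        "FROM geoalias a JOIN geocities c ON c.id = a.city_id "
        "WHERE a.alias = ? ORDER BY c.id", (text.lower(),))
    return rows.fetchall()

db = sqlite3.connect(":memory:")
create_schema(db)
add_city(db, 2761369, "Wien", 48.2084877601653, 16.3720750808716,
         "09", "AT", ["Vienna", "Vienne"])
add_city(db, 4252025, "Vienna", 37.4153295, -88.8978435, "IL", "US", [])
for row in lookup(db, "Vienna"):
    print(row)
```

Looking up "Vienna" returns both the Austrian and the US entry here, which is why a further rule is needed to pick one of them.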

Applying Heuristics

 Finding a location in a text isn't as easy as one might think, so I came up with some assumptions (aka heuristics):

Anatomy of City Names

 I assumed a city name always starts with an uppercase letter followed by lowercase characters. Italian and French city names often consist of multiple terms, where the middle terms may be lowercase but the first and last terms start with uppercase.

So I ended up with the following pattern matching:

[A-Z][a-z]+ [A-Za-z]+ [A-Za-z]+ [A-Z][a-z]+
[A-Z][a-z]+ [A-Za-z]+ [A-Z][a-z]+
[A-Z][a-z]+ [A-Z][a-z]+
in that order.
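
As a rough illustration, the three patterns can be tried longest first with Python's re module. The function name candidates and the idea of collecting every overlapping match as a candidate are assumptions of this sketch; presumably each candidate would then be checked against the database, since ordinary capitalized words match the patterns too.

```python
import re

# The three patterns from the text, longest first.
PATTERNS = [
    r"[A-Z][a-z]+ [A-Za-z]+ [A-Za-z]+ [A-Z][a-z]+",
    r"[A-Z][a-z]+ [A-Za-z]+ [A-Z][a-z]+",
    r"[A-Z][a-z]+ [A-Z][a-z]+",
]

def candidates(text):
    """Return possible multi-term place names, longest patterns first."""
    found = []
    for pattern in PATTERNS:
        # The lookahead lets overlapping matches all be reported;
        # \b keeps matches aligned to whole words.
        for m in re.finditer(r"(?=\b(" + pattern + r")\b)", text):
            if m.group(1) not in found:
                found.append(m.group(1))
    return found

print(candidates("Arrived in Reggio nell Emilia today"))
```

For this sentence the sketch yields both "Arrived in Reggio" and "Reggio nell Emilia"; only the second would survive a database lookup.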

Nearby Preference

 Additionally, when multiple locations with the same name are found, I sort them by distance to the last found location. How does this help? E.g. when I make a tour and travel from Prague/Praha to Vienna and look up Vienna, I get 5 entries:

  • Vienna (VA,US) 38.9012225,-77.2652604 (#4791160)
  • Vienna (WV,US) 39.3270191,-81.5484578 (#4825976)
  • Vienna (GA,US) 32.0915577,-83.7954518 (#4228440)
  • Vienna (IL,US) 37.4153295,-88.8978435 (#4252025)
  • Wien (09,AT) 48.2084877601653,16.3720750808716 (#2761369)

But the entry I want is "Wien", and it is the one geographically closest to the previously looked-up "Prag". This small enhancement helped a lot to determine locations correctly.
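
The nearby preference can be sketched as a sort by great-circle distance. The haversine formula, the mean earth radius of 6371 km and the function names are assumptions of this sketch, not necessarily what geotag uses; the coordinates are the ones listed above.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean radius; an assumption of this sketch

def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    p1, p2 = radians(lat1), radians(lat2)
    dp = p2 - p1
    dl = radians(lon2 - lon1)
    a = sin(dp / 2) ** 2 + cos(p1) * cos(p2) * sin(dl / 2) ** 2
    # min() guards against rounding slightly above 1 for antipodal points
    return 2 * EARTH_RADIUS_KM * asin(min(1.0, sqrt(a)))

def sort_by_nearness(candidates, last):
    """candidates: list of (name, lat, lon); last: (lat, lon) or None."""
    if last is None:
        return list(candidates)  # no previous location: keep the given order
    return sorted(candidates,
                  key=lambda c: distance_km(last[0], last[1], c[1], c[2]))

praha = (50.0878367932108, 14.4241322001241)
viennas = [
    ("Vienna (VA,US)", 38.9012225, -77.2652604),
    ("Vienna (WV,US)", 39.3270191, -81.5484578),
    ("Vienna (GA,US)", 32.0915577, -83.7954518),
    ("Vienna (IL,US)", 37.4153295, -88.8978435),
    ("Wien (09,AT)", 48.2084877601653, 16.3720750808716),
]
print(sort_by_nearness(viennas, praha)[0][0])  # Wien (09,AT)
```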

3. Geotagging Photos

For the renekmueller.com web site I defined an internal file called "list", which resides in the folder of the photos. It lists every file with its description, like this:

0001.jpg    Zurich by night
0002.jpg    Rapperswil in the morning, after long night

My little Perl script geotag accepts either locations or filenames. If the argument is a file, it tries to find location names in it; if it is a 'list' file, it handles it accordingly and prints a similar 'list' file, which I call 'list.geo', that looks like this:

0001.jpg    geo:name=Zurich (ZH,CH),geo:long=8.55,geo:lat=47.3666667,time=123238128
0002.jpg    geo:name=Rapperswil (SG,CH),geo:long=8.82227897644043,geo:lat=47.2255721988597,time=123239228

The time is the timestamp of the photo.
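
A small sketch of reading one 'list' line and writing one 'list.geo' line in Python. Both function names are hypothetical, and the field order simply mirrors the sample output above.

```python
def parse_list_line(line):
    """Split 'filename<whitespace>description' into its two parts."""
    parts = line.rstrip("\n").split(None, 1)
    if not parts:
        return None  # blank line
    description = parts[1] if len(parts) > 1 else ""
    return parts[0], description

def format_geo_line(filename, name, lat, lon, timestamp):
    """Build one list.geo record in the layout shown above."""
    return (f"{filename}\tgeo:name={name},geo:long={lon},"
            f"geo:lat={lat},time={timestamp}")

print(parse_list_line("0001.jpg\tZurich by night\n"))
print(format_geo_line("0001.jpg", "Zurich (ZH,CH)", 47.3666667, 8.55, 123238128))
```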

Interpolation of Locations

 Since I have many photos, not all of them have a description, or a location in the description. For those I optionally try to interpolate the location: I take the locations found before and after and interpolate linearly between them according to the photo timestamp.

This works quite well for me, since I usually stop, take a few photos within 1-2 mins, then ride on to the next location and take photos there again, and I only put a description on the first photo of a sequence. Since it takes me 1-2 hours to reach the next location by bicycle, the timestamp gives a good guess: photos taken in quick succession are near the location of the first photo, the one with the location description.
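
The linear interpolation can be sketched as follows. The function name and the clamping to the two known points are assumptions; the sketch also interpolates latitude and longitude independently, which is only reasonable over short distances and away from the ±180° meridian.

```python
def interpolate(before, after, t):
    """Estimate (lat, lon) at time t between two known locations.

    before, after: (timestamp, lat, lon) of the nearest photos with a
    known location; t: timestamp of the photo without one.
    """
    t0, lat0, lon0 = before
    t1, lat1, lon1 = after
    if t1 == t0:
        return lat0, lon0
    f = (t - t0) / (t1 - t0)
    f = max(0.0, min(1.0, f))  # stay between the two known points
    return lat0 + f * (lat1 - lat0), lon0 + f * (lon1 - lon0)

# The two photos from the list.geo sample, and a photo taken in between
zurich = (123238128, 47.3666667, 8.55)
rapperswil = (123239228, 47.2255721988597, 8.82227897644043)
print(interpolate(zurich, rapperswil, 123238678))
```

A photo taken a minute after the Zurich one gets a position very close to Zurich, which matches the riding pattern described above.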

4. Download


  • 2009/04/29: 0.016: using Wikipedia to look up places (which aren't cities or aren't in the database)
  • 2009/04/26: 0.014: new homepage at http://the-labs.com/GeoTag/, as the script has now grown beyond a little script
  • 2009/04/25: 0.012: -server starts a server on port 10102, whereas -s behaves like a client connecting to a server
  • 2009/04/25: 0.011: major speed improvement by lowercasing aliases and creating indexes
  • 2009/04/23: 0.009: using spherical geometry to determine the distance between lat/long coordinates
  • 2009/04/22: 0.007: -f gpx format supported for openstreetmap.org inclusion
  • 2009/04/22: 0.004: lat/long notation, support for district and country search, e.g. vienna,il,us or vienna,at
  • 2009/04/22: 0.003: fine-grained matching of location names
  • 2009/04/21: 0.002: support for TAB-separated lists (descriptions of images), rudimentary interpolation between known places, and photo timestamps
  • 2009/04/21: 0.001: started with allCountries.txt, but it's way too big (8,000,000 entries, which take about 17 hours on my machine), whereas cities1000.txt has only 85,000 entries (takes 20 mins to put into sqlite).


Requirements

  • sqlite-3.x, install via your local package manager
  • Perl module DBD::SQLite
  • Perl module Time::HiRes

Install the Perl modules either with your local package manager, or via CPAN:

% perl -MCPAN -e 'install DBD::SQLite'
% perl -MCPAN -e 'install Time::HiRes'

5. Usage

Copy geotag into /usr/local/bin (as root) or keep it locally. First unpack cities1000.txt.gz or cities1000.zip:

% gzip -d cities1000.txt.gz

Next, run geotag with cities1000.txt in the same directory:

% ./geotag

It will create ~/DB/ and populate ~/DB/geotag.db. Creating the SQLite database geotag.db takes some minutes (about 20 mins on a Pentium4 2.4GHz). After that, lookups respond instantly.

6. Examples

% ./geotag prag
Praha (52,CZ) lat=50.0878367932108,long=14.4241322001241

% ./geotag vienna
Wien (09,AT) lat=48.2084877601653,long=16.3720750808716

% ./geotag boulder
Boulder (CO,US) lat=40.0149856,long=-105.2705456

% ./geotag vienna
Vienna (IL,US) lat=37.4153295,long=-88.8978435

% ./geotag paris
Paris (TN,US) lat=36.3020023,long=-88.3267107

% ./geotag munich
München (02,DE) lat=48.1376831438553,long=11.5743541717529

% ./geotag paris
Paris (A8,FR) lat=48.85341,long=2.3488

% ./geotag paris,il,us
Paris (IL,US) lat=39.611146,long=-87.6961374

% ./geotag vienna,at
Wien (09,AT) lat=48.2084877601653,long=16.3720750808716

% ./geotag -f gpx diary.txt > list.gpx

So it behaves as I wanted: the previously found matches determine the perimeter for the next location lookup.

I made a test run based on my ./list file, which holds the photo descriptions of my Europe 2008 tour:

% ./geotag list > list.geo
% ./geotag -f gpx list > list.gpx

It made 1-2 errors, which I corrected by hand, and I added one waypoint (Rapperswil) so the path doesn't go over a lake. The result is the route plotted on a map.

Note: I had no GPS coordinates to start with; I solely used the descriptions of my photos to derive the waypoints. To draw the map I used OpenStreetMap.org, adapted some of their examples, and referenced list.gpx within the JavaScript code:

   // GPX layer read from list.gpx (OpenLayers 2 API)
   var lgml = new OpenLayers.Layer.GML("GPX", "list.gpx", {
      format: OpenLayers.Format.GPX,
      style: {
          strokeColor: 'red', strokeWidth: 5,
          strokeOpacity: 0.5 },
      projection: new OpenLayers.Projection("EPSG:4326")
   });
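
For readers unfamiliar with the format, a list.gpx file of this kind can be produced with a few lines of Python. Whether geotag writes a track or individual waypoints is not stated in the text; this sketch assumes a single track, and the function name and creator string are made up for illustration.

```python
import xml.etree.ElementTree as ET

GPX_NS = "http://www.topografix.com/GPX/1/1"

def to_gpx(points, creator="geotag-sketch"):
    """points: iterable of (name, lat, lon); returns GPX 1.1 text, one track."""
    ET.register_namespace("", GPX_NS)  # emit GPX as the default namespace
    gpx = ET.Element(f"{{{GPX_NS}}}gpx", {"version": "1.1", "creator": creator})
    trk = ET.SubElement(gpx, f"{{{GPX_NS}}}trk")
    seg = ET.SubElement(trk, f"{{{GPX_NS}}}trkseg")
    for name, lat, lon in points:
        # GPX stores coordinates as lat/lon attributes in WGS84 degrees
        pt = ET.SubElement(seg, f"{{{GPX_NS}}}trkpt",
                           {"lat": str(lat), "lon": str(lon)})
        ET.SubElement(pt, f"{{{GPX_NS}}}name").text = name
    return ET.tostring(gpx, encoding="unicode")

print(to_gpx([
    ("Zurich (ZH,CH)", 47.3666667, 8.55),
    ("Rapperswil (SG,CH)", 47.2255721988597, 8.82227897644043),
]))
```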

I added some verbose output, such as the distance between the looked-up locations, which is printed to stderr, whereas the location data goes to stdout (so you can redirect it via > file):

% ./geotag berlin paris london
Berlin (16,DE) lat=52.5166667,long=13.4
Paris (A8,FR) lat=48.85341,long=2.3488
London (ENG,GB) lat=51.5084152563931,long=-0.125532746315002
        3 locations looked up, 3 successes, 0 failed (0.0%)
        1222.499km cumulative distance
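
The split between data and diagnostics can be sketched like this in Python. The function name report and the shape of the results list are assumptions; only the wording of the summary line is taken from the output above.

```python
import sys

def report(results, out=sys.stdout, err=sys.stderr):
    """results: list of (query, location line or None if the lookup failed)."""
    for _query, location in results:
        if location is not None:
            print(location, file=out)  # data: survives `> file`
    total = len(results)
    failed = sum(1 for _, location in results if location is None)
    pct = 100.0 * failed / total if total else 0.0
    # Diagnostics go to stderr, so they never end up in the redirected file
    print(f"\t{total} locations looked up, {total - failed} successes, "
          f"{failed} failed ({pct:.1f}%)", file=err)

report([
    ("berlin", "Berlin (16,DE) lat=52.5166667,long=13.4"),
    ("paris", "Paris (A8,FR) lat=48.85341,long=2.3488"),
])
```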

7. Places of Interest

Since version 0.016, places of interest are also supported, using wikipedia.org for the lookup (so this feature requires a network connection):
% ./geotag 'taj mahal,in'
Taj Mahal (,IN) lat=27.1741666666667,long=78.0422222222222
It is best to query a place of interest with the country code (and district code as well) the first time, so that the result is added to the database with district and country as well.
% ./geotag 'mount everest'
Mount Everest (,) lat=27.9880555555556,long=86.9252777777778
        1 locations looked up, 1 successes, 0 failed (0.0%)
        0.000km cumulative distance

% ./geotag northpole southpole
Northpole (,) lat=90,long=0
Southpole (,) lat=-90,long=0
        2 locations looked up, 2 successes, 0 failed (0.0%)
        20037.014km cumulative distance

8. Client/Server

Since version 0.012, a TCP-based client/server mode is also built in:

% ./geotag -server

Note this machine's IP address, then go to a client and run:

% ./geotag -s 'new york'

You can create a ~/.geotagrc where you can define the defaults:


and then call

% ./geotag 'new york'

and it will use the server to look up the locations, via TCP on port 10102.
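
To illustrate the idea only, here is a minimal TCP lookup server and client in Python. The wire protocol (one query per line, one answer per line), the in-memory PLACES table and all function names are assumptions of this sketch; the text gives only the port number, not geotag's actual protocol.

```python
import socket
import socketserver
import threading

PORT = 10102  # the port named in the text

# Toy stand-in for the database lookup (hypothetical)
PLACES = {"boulder": "Boulder (CO,US) lat=40.0149856,long=-105.2705456"}

class LookupHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # Assumed protocol: one query per line, one answer per line
        for raw in self.rfile:
            query = raw.decode("utf-8").strip().lower()
            answer = PLACES.get(query, "not found")
            self.wfile.write((answer + "\n").encode("utf-8"))

def serve(host="127.0.0.1", port=PORT):
    """Start the lookup server in a background thread and return it."""
    server = socketserver.ThreadingTCPServer((host, port), LookupHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

def ask(query, host="127.0.0.1", port=PORT):
    """Send one query to the server and return its one-line answer."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall((query + "\n").encode("utf-8"))
        with sock.makefile("r", encoding="utf-8") as reply:
            return reply.readline().rstrip("\n")
```

Calling serve() on one machine and ask("boulder", host=...) on another would mirror the -server and -s options. The benefit of this design is that only the server needs the populated database.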

9. References

iLife: iPhoto09
has similar functionality
MacOSX only app to geotag photos with map
GeoTag @ Sourceforge.net
Java multi-platform tool that maps photo timestamps to recorded GPS coordinates; it also has 'reverse geocoding', which is the functionality my 'geotag' primarily has



All Rights Reserved - (C) 1997 - 2012 by The Labs.Com
