├── README.md
├── grids.txt
├── predict_geolocation.ipynb
└── predict_places365.ipynb

/README.md:
--------------------------------------------------------------------------------
# Geo-location Tutorial

Download model:
https://s3.amazonaws.com/mmcommons-tutorial/models/RN101-5k500-0012.params
https://s3.amazonaws.com/mmcommons-tutorial/models/RN101-5k500-symbol.json

Geolocation model inspired by ideas presented in:
PlaNet - Photo Geolocation with Convolutional Neural Networks (ECCV 2016),
Tobias Weyand, Ilya Kostrikov, James Philbin
https://research.google.com/pubs/pub45488.html

## Data and Classes
Our data come from the geotagged images in the YFCC100M Multimedia Commons dataset.
Training, validation, and test images are split so that images uploaded by the same person do not appear in multiple sets.
Classes are created with the training data using [Google's S2 Geometry Library](https://code.google.com/archive/p/s2-geometry-library/)
as described in the PlaNet paper above. The classes are defined in `grids.txt` where the i-th line is the i-th class and the columns are:
`S2 Cell Token, Latitude, Longitude`.
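The class layout described above can be turned into a lookup from a predicted class index to a coordinate. The following is a minimal sketch, not the code used in the notebooks: the names `GridCell`, `parse_grid_lines`, and `class_to_latlon` are hypothetical, the column separator is assumed to be whitespace and/or commas (the README does not specify it), and the sample rows are made-up values, not taken from `grids.txt`.

```python
import re
from typing import Iterable, List, NamedTuple, Tuple


class GridCell(NamedTuple):
    token: str   # S2 cell token (first column)
    lat: float   # latitude in degrees (second column)
    lon: float   # longitude in degrees (third column)


def parse_grid_lines(lines: Iterable[str]) -> List[GridCell]:
    """Parse class definitions; the i-th non-empty line defines class i."""
    cells = []
    for line in lines:
        line = line.strip()
        if not line:
            # Assumption: blank lines are not classes and can be skipped.
            continue
        # Assumption: columns are separated by whitespace and/or commas.
        token, lat, lon = re.split(r"[,\s]+", line)[:3]
        cells.append(GridCell(token, float(lat), float(lon)))
    return cells


def class_to_latlon(cells: List[GridCell], class_index: int) -> Tuple[float, float]:
    """Return the (latitude, longitude) stored for a predicted class index."""
    cell = cells[class_index]
    return cell.lat, cell.lon


# Made-up sample rows for illustration only (not real entries of grids.txt).
sample_rows = ["89c25 40.7 -74.0", "808f7,37.8,-122.4"]
sample_cells = parse_grid_lines(sample_rows)
sample_prediction = class_to_latlon(sample_cells, 1)
```

To use it with the real file, pass an open file handle instead of the sample list, for example `parse_grid_lines(open("grids.txt"))`, and index the result with the arg-max of the model's class scores.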

## Differences between our model and PlaNet

| | Ours | PlaNet |
|---------------|-------|--------|
|Dataset source|Multimedia Commons|Images crawled from the web|
|Training set|33.9 million|91 million|
|Validation set|1.8 million|34 million|
|S2 Cell Partitioning|t_1=5,000, t_2=500 ==> 15,527 cells|t_1=10,000, t_2=50 ==> 26,263 cells|
|Model|ResNet-101|GoogLeNet/Inception|
|Optimization|SGD with Momentum and LR Schedule|Adagrad|
|Training time|9 days on 16 NVIDIA K80 GPUs (p2.16xlarge), 12 epochs|2.5 months on 200 CPU cores|
|Framework|MXNet|DistBelief|
|Test set|Placing Task 2016 Test Set (1.5 million Flickr images)|2.3 million geotagged Flickr images|

## Results
### Im2GPS test set
The values indicate the percentage of test set images that were localized to within the given distance of their true location.

|Method|1km|25km|200km|750km|2500km|
|------|---|----|-----|-----|------|
|PlaNet|8.4%|24.5%|37.6%|53.6%|71.3%|
|Ours|16.8%|39.2%|48.9%|67.9%|82.2%|

### Flickr Images
Note that the results in this table are not directly comparable, because the test set images used for PlaNet have not been publicly released.
The values indicate the percentage of test set images that were localized to within the given distance of their true location.

|Method|1km|25km|200km|750km|2500km|
|------|---|----|-----|-----|------|
|PlaNet|3.6%|10.1%|16.0%|28.4%|48.0%|
|Ours|6.2%|13.5%|20.8%|35.6%|55.2%|

--------------------------------------------------------------------------------