├── .gitattributes
├── .gitignore
├── LICENSE.md
├── README.md
├── example_output.png
├── geocode_api_keys.py
├── ratebeer.py
├── requirements.txt
└── yelp_reviews.py
/.gitattributes:
--------------------------------------------------------------------------------
1 | # Auto detect text files and perform LF normalization
2 | * text=auto
3 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 |
2 | *.xml
3 | *.iml
4 | *.html
5 | *.csv
6 | geocode_api_keys*
7 | .vscode
8 |
--------------------------------------------------------------------------------
/LICENSE.md:
--------------------------------------------------------------------------------
1 | Copyright (C) 2018 Spotlight Infosec LLC
2 |
3 | This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
4 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Test scripts for OSINT
2 |
3 | These scripts pull data from several sites and then plot the locations found on a Google Map.
4 |
5 | ## Usage
6 |
7 | ### Requirements
8 |
9 | The most important requirement is __this script is written in Python 3.x__.
10 |
11 | #### Modules
12 |
13 | * bs4
14 | * geocoder
15 | * gmplot
16 | * googlemaps
17 | * requests
18 |
19 | If you have pip installed, run `pip3 install -r requirements.txt` from the command line to install all required modules.
20 |
21 | #### Geocoding API
22 |
23 | You will need to create a file named `geocode_api_keys.py` and put the following in it:
24 |
25 | ```python
26 | google_api_key = 'YOUR_GOOGLE_API_KEY'
27 | bing_api_key = 'YOUR_BING_API_KEY'
28 | ```
29 |
30 | This means you need a valid Google Developer API key for the Geocoding API.
31 |
32 |
33 | You can also create a Bing API key for free.
34 |
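As a quick sanity check before running either script, the sketch below reports which geocoding keys look configured. It is only an illustration: `configured_providers` and the `PLACEHOLDERS` set are hypothetical names that are not part of this repository, and the placeholder strings are assumed from the key templates shown in this repository.

```python
# Hypothetical helper, not part of this repository.
# Placeholder values are assumed from the templates in README.md and geocode_api_keys.py.
PLACEHOLDERS = {
    "YOUR_GOOGLE_API_KEY",
    "YOUR_BING_API_KEY",
    "GOOGLE_API_KEY",
    "BING_API_KEY",
}


def configured_providers(google_key, bing_key):
    """Return the names of providers whose key is set and is not a placeholder."""
    providers = []
    if google_key and google_key not in PLACEHOLDERS:
        providers.append("google")
    if bing_key and bing_key not in PLACEHOLDERS:
        providers.append("bing")
    return providers
```

Calling `configured_providers(google_api_key, bing_api_key)` after `from geocode_api_keys import *` would return an empty list while both values are still placeholders.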
35 | ### Help Command Output
36 |
37 | #### Ratebeer.com
38 |
39 | ```bash
40 | $ python ratebeer.py -h
41 | usage: ratebeer.py [-h] -u USER
42 |
43 | Grab ratebeer user activity
44 |
45 | optional arguments:
46 |   -h, --help            show this help message and exit
47 |   -u USER, --user USER  Username to research
48 | ```
49 |
50 | #### Yelp.com
51 |
52 | ```bash
53 | $ python yelp_reviews.py -h
54 | usage: yelp_reviews.py [-h] -u USER [--csv]
55 |
56 | Grab yelp user activity
57 |
58 | optional arguments:
59 |   -h, --help            show this help message and exit
60 |   -u USER, --user USER  Username to research
61 |   --csv                 Output to CSV
62 | ```
63 |
64 | ### Example Output
65 |
66 | #### Ratebeer.com
67 |
68 | ```bash
69 | $ python ratebeer.py -u 105404
70 | [ ] VENUE DATA: Requesting https://www.ratebeer.com/ajax/user/105404/place-ratings/1/1/
71 | [ ] VENUE DATA: Requesting https://www.ratebeer.com/ajax/user/105404/place-ratings/2/2/
72 | [ ] VENUE DATA: Requesting https://www.ratebeer.com/ajax/user/105404/place-ratings/3/3/
73 | [ ] VENUE DATA: Requesting https://www.ratebeer.com/ajax/user/105404/place-ratings/4/4/
74 | [ ] VENUE DATA: Requesting https://www.ratebeer.com/ajax/user/105404/place-ratings/5/5/
75 | [ ] VENUE DATA: Requesting https://www.ratebeer.com/ajax/user/105404/place-ratings/6/6/
76 |
77 | [ ] HTML output file named ratebeer_map_105404_1539303736.html was written to disk.
78 | ```
79 |
80 | #### Yelp.com
81 |
82 | ```bash
83 | $ python yelp_reviews.py -u U4gWrMtHevbDF3Le3GBLHA
84 | [ ] VENUE DATA: Requesting https://www.yelp.com/user_details?userid=U4gWrMtHevbDF3Le3GBLHA
85 | [ ] Can Geocode with Google.
86 | [ ] VENUE DATA: Requesting https://www.yelp.com/user_details?userid=U4gWrMtHevbDF3Le3GBLHA&rec_pagestart=10
87 | [ ] Can Geocode with Google.
88 | [ ] VENUE DATA: Requesting https://www.yelp.com/user_details?userid=U4gWrMtHevbDF3Le3GBLHA&rec_pagestart=20
89 | [ ] Can Geocode with Google.
90 | [ ] VENUE DATA: Requesting https://www.yelp.com/user_details?userid=U4gWrMtHevbDF3Le3GBLHA&rec_pagestart=30
91 |
92 | [-] No review addresses found
93 |
94 | [ ] HTML output file named yelp_map_U4gWrMtHevbDF3Le3GBLHA_1539303863.html was written to disk.
95 | ```
96 |
97 | All scripts should produce HTML output files that show the geolocated content. See `example_output.png` for an example.
98 |
99 |
100 | If your web page shows "For Development Purposes Only" watermarks, you will need to edit the HTML file and add your Google API key for JavaScript Maps API. Add `key=YOUR_GOOGLE_API_KEY` to the end of the maps.googleapis.com line like this: `https://maps.googleapis.com/maps/api/js?libraries=visualization&sensor=true_or_false&key=YOUR_GOOGLE_API_KEY`
101 |
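The manual edit described above can also be scripted. The following is a minimal sketch, assuming the generated HTML references the Maps JavaScript API with a plain `https://maps.googleapis.com/maps/api/js?...` URL. `add_maps_key` is a hypothetical helper, not something gmplot or these scripts provide.

```python
import re

# Matches the Maps JavaScript API URL up to the closing quote or whitespace.
MAPS_URL = re.compile(r'(https://maps\.googleapis\.com/maps/api/js\?[^"\'\s>]*)')


def add_maps_key(html, api_key):
    """Append key=<api_key> to the Maps API URL unless a key is already present."""

    def _append(match):
        url = match.group(1)
        if re.search(r'[?&]key=', url):
            return url  # leave URLs that already carry a key untouched
        return '{}&key={}'.format(url, api_key)

    return MAPS_URL.sub(_append, html)
```

To apply it, read the generated map file into a string, pass it through `add_maps_key` together with your key, and write the result back to the same file.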
102 | ## License
103 |
104 | This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
105 |
--------------------------------------------------------------------------------
/example_output.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WebBreacher/foodndrink/7e3c62cecb2d8a49ad6c4a8cc38474a28e36a9c8/example_output.png
--------------------------------------------------------------------------------
/geocode_api_keys.py:
--------------------------------------------------------------------------------
1 |
2 | # You need to get one or more of these API keys from Google and Bing
3 | # The Google one may cost money. Bing is free.
4 | google_api_key = 'GOOGLE_API_KEY'
5 | bing_api_key = 'BING_API_KEY'
6 |
--------------------------------------------------------------------------------
/ratebeer.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | """
3 | Author: Micah Hoffman (@WebBreacher)
4 | Purpose: To look up a user on ratebeer.com
5 |
6 | # Test users
7 | 105404 5/5
8 | 11116 6/6
9 | """
10 |
11 | import argparse
12 | from bs4 import BeautifulSoup
13 | import geocoder
14 | import googlemaps
15 | import gmplot
16 | import re
17 | import requests
18 | from requests.packages.urllib3.exceptions import InsecureRequestWarning
19 | import time
20 | from geocode_api_keys import *
21 |
22 |
23 | ####
24 | # Functions
25 | ####
26 |
27 | def get_mean(lst):
28 |     return float(sum(lst) / len(lst))
29 |
30 |
31 | # Parse command line input
32 | parser = argparse.ArgumentParser(description='Grab ratebeer user activity')
33 | parser.add_argument('-u', '--user', required=True, help='Username to research')
34 | args = parser.parse_args()
35 |
36 |
37 | def get_data_from_ratebeer(url):
38 |     # Setting up and Making the Web Call
39 |     try:
40 |         user_agent = 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Firefox/68.0'
41 |         headers = {'User-Agent': user_agent}
42 |         # Make web request for that URL and don't verify SSL/TLS certs
43 |         response = requests.get(url, headers=headers, verify=False)
44 |         return response.text
45 |
46 |     except Exception as e:
47 |         print('[!] ERROR - ratebeer issue: {}'.format(str(e)))
48 |         exit(1)
49 |
50 | # TODO - Get user data
51 | def get_user_data(passed_user):
52 |     # Parsing user information
53 |     url = 'https://www.ratebeer.com/ajax/user/{}/'.format(passed_user)
54 |     print("\n[ ] USER DATA: Requesting {}".format(url))
55 |     resp = get_data_from_ratebeer(url)
56 |     html_doc = BeautifulSoup(resp, "html.parser")
57 |
58 |     """ TODO - Parse the content below
59 |
60 |     Kimberly "ratebeer Yoda" S.
61 |     Washington, DC
62 |     4369 Friends
63 |     1333 Reviews
64 |     10344 Photos
65 |     """
66 |
67 |     # if user1:
68 |     #     return user1
69 |
70 | # TODO - Get friend data from https://www.ratebeer.com/user_details_friends?userid=XXX
71 |
72 | def ratebeer_pages(url):
73 |     # This function retrieves location content by extracting link text (place name, country, region, city) from the page
74 |     entry = []
75 |     review_addresses = []
76 |     reviewlatslongs = []
77 |     gmaps = googlemaps.Client(key=google_api_key)
78 |     black_list = ['Location', 'Avg', 'Score', 'Date', 'next >', 'last >>', '< prev']
79 |     resp = get_data_from_ratebeer(url)
80 |     html_doc = BeautifulSoup(resp, 'html.parser')
81 |     addresses = html_doc.find_all('a')
82 |     if addresses:
83 |         for a in addresses:
84 |             if a.string in black_list or a.string is None:
85 |                 continue
86 |             else:
87 |                 entry.append(a.string)
88 |         counter = 1
89 |         for e in entry:
90 |             if counter == 1:
91 |                 placename = e
92 |                 counter += 1
93 |             elif counter == 2:
94 |                 country = e
95 |                 counter += 1
96 |             elif counter == 3:
97 |                 region = e
98 |                 counter += 1
99 |             elif counter == 4:
100 |                 city = e
101 |                 review_addresses.append('{}, {}, {}, {}'.format(placename, city, region, country))
102 |                 counter = 1
103 |
104 |         for addy in review_addresses:
105 |             g = gmaps.geocode(addy)
106 |             if g and g[0]['geometry']['location']:
107 |                 loc = g[0]['geometry']['location']['lat'], g[0]['geometry']['location']['lng']
108 |                 reviewlatslongs.append(tuple(loc))
109 |             else:
110 |                 continue
111 |         return reviewlatslongs
112 |     else:
113 |         print('\n[-] No additional entries found')
114 |         return False
115 |
116 | def get_venue_data(passed_user):
117 |     # Parsing check-in location information
118 |     url = 'https://www.ratebeer.com/ajax/user/{}/place-ratings/1/1/'.format(passed_user)
119 |     print('[ ] VENUE DATA: Requesting {}'.format(url))
120 |     reviewlatslongs = ratebeer_pages(url) or []
121 |
122 |     # Try to make pulls for additional reviews
123 |     for num in range(2, 100, 1):
124 |         url = 'https://www.ratebeer.com/ajax/user/{}/place-ratings/{}/{}/'.format(passed_user, num, num)
125 |         print('[ ] VENUE DATA: Requesting {}'.format(url))
126 |         reviewlatslongs1 = ratebeer_pages(url)
127 |         if reviewlatslongs1:
128 |             reviewlatslongs.extend(reviewlatslongs1)
129 |         else:
130 |             # If a false value came back, no addresses were found and we are done iterating
131 |             break
132 |
133 |     if not reviewlatslongs:
134 |         exit('[-] No locations could be geocoded, so there is nothing to map.')
135 |     review_lats, review_longs = zip(*reviewlatslongs)
136 |
137 |     # Compute the center Lat and Long to center the map
138 |     center_lat = get_mean(review_lats)
139 |     center_long = get_mean(review_longs)
140 |     gmap = gmplot.GoogleMapPlotter(center_lat, center_long, 6)
141 |     gmap.coloricon = "http://www.googlemapsmarkers.com/v1/%s/"
142 |     # Create the points/heatmap/circles on the map
143 |     gmap.heatmap(review_lats, review_longs, 1, 100)
144 |     gmap.scatter(review_lats, review_longs, '#333333', size=1, marker=True)
145 |     gmap.plot(review_lats, review_longs, '#FF33FF', edge_width=1)
146 |
147 |     outfile = 'ratebeer_map_{}_{}.html'.format(args.user, str(int(time.time())))
148 |     gmap.draw(outfile)
149 |     print('\n[ ] HTML output file named {} was written to disk.\n'.format(outfile))
150 |
151 |
152 | ###########################
153 | # Start
154 | ###########################
155 |
156 | # Suppress HTTPS warnings
157 | requests.packages.urllib3.disable_warnings(InsecureRequestWarning)
158 |
159 |
160 | ###############
161 | # Get Venue info
162 | ###############
163 | get_venue_data(args.user)
164 |
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | bs4
2 | geocoder
3 | gmplot
4 | googlemaps
5 | requests
--------------------------------------------------------------------------------
/yelp_reviews.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | """
3 | Author: Micah Hoffman (@WebBreacher)
4 | Purpose: To look up a user on yelp.com
5 |
6 | # Test users
7 | 29 reviews = U4gWrMtHevbDF3Le3GBLHA (Eva, CA)
8 | 52 reviews = 7Yn_ljl1SCd2br4NMFZkxA (Michele, CA)
9 | 136 reviews = 85wCGBfsbcsT4RkplclSKg (Konark, Brazil)
10 | 259 reviews = cZrsBqcs3-UGmsCwvvPutA (James, New Zealand)
11 | 1000 reviews = 4paWpLov6LjpsNNgE1fhSg (Scotty, SC)
12 | """
13 |
14 | import argparse
15 | from bs4 import BeautifulSoup
16 | import csv
17 | import geocoder
18 | import gmplot
19 | import googlemaps
20 | import random
21 | import re
22 | import requests
23 | from requests.packages.urllib3.exceptions import InsecureRequestWarning
24 | import time
25 | from geocode_api_keys import *
26 |
27 |
28 | ####
29 | # Functions
30 | ####
31 |
32 | def get_mean(lst):
33 |     return float(sum(lst) / len(lst))
34 |
35 |
36 | # Parse command line input
37 | parser = argparse.ArgumentParser(description='Grab yelp user activity')
38 | parser.add_argument('-u', '--user', required=True, help='Username to research')
39 | parser.add_argument('--csv', action='store_true', help='Output to CSV')
40 | args = parser.parse_args()
41 |
42 |
43 | def get_data_from_yelp(url):
44 |     # Setting up and Making the Web Call
45 |     try:
46 |         # Need to use a non-Python User Agent to interact with Yelp
47 |         user_agent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101 Firefox/68.0'
48 |         headers = {'User-Agent': user_agent}
49 |         # Make web request for that URL and don't verify SSL/TLS certs
50 |         response = requests.get(url, headers=headers, verify=False)
51 |         return response.text
52 |
53 |     except Exception as e:
54 |         print('[!] ERROR - yelp issue: {}'.format(str(e)))
55 |         exit(1)
56 |
57 | # TODO - Get user data
58 | def get_user_data(passed_user):
59 |     # Parsing user information
60 |     url = 'https://www.yelp.com/user_details?userid={}'.format(passed_user)
61 |     print("\n[ ] USER DATA: Requesting {}".format(url))
62 |     resp = get_data_from_yelp(url)
63 |     html_doc = BeautifulSoup(resp, "html.parser")
64 |
65 |     """ TODO - Parse the content below
66 |
67 |     Kimberly "Yelp Yoda" S.
68 |     Washington, DC
69 |     4369 Friends
70 |     1333 Reviews
71 |     10344 Photos
72 |     """
73 |
74 |     # if user1:
75 |     #     return user1
76 |
77 | # TODO - Get friend data from https://www.yelp.com/user_details_friends?userid=XXX
78 |
79 | def yelp_pages(url):
80 |     # This function retrieves location content by extracting <address> elements from the page
81 |     review_addresses = []
82 |     reviewlatslongs = []
83 |     gmaps = googlemaps.Client(key=google_api_key)
84 |     resp = get_data_from_yelp(url)
85 |     html_doc = BeautifulSoup(resp, "html.parser")
86 |     addresses = html_doc.find_all('address')
87 |     if addresses:
88 |         for a in addresses:
89 |             matchObj = re.search(r'\n\s+([a-zA-Z0-9].*)\s+<', str(a), re.M|re.I)
90 |             if matchObj:
91 |                 review_addresses.append(matchObj.group(1).replace('<br/>', ', '))
92 |         # Test if we can geocode with Google or OpenStreetMap
93 |         test_google = gmaps.geocode('1600 pennsylvania ave, washington, dc')
94 |         #print(test_google[0]['geometry']['location']) #DEBUG
95 |         if test_google and test_google[0]['geometry']['location']['lat']:
96 |             print('[ ] Can Geocode with Google.')
97 |             goog = True
98 |         else:
99 |             print('[!] ERROR - Cannot Geocode with Google.')
100 |             goog = False
101 |         # Test if we can geocode with Bing
102 |         test_bing = geocoder.bing('1600 pennsylvania ave, washington, dc', key=bing_api_key)
103 |         if test_bing.latlng:
104 |             print('[ ] Can Geocode with Bing.')
105 |             bing = True
106 |         else:
107 |             print('[!] ERROR - Cannot Geocode with Bing.')
108 |             bing = False
109 |         # Test if we can geocode with OpenStreetMaps
110 |         test_osm = geocoder.osm('1600 pennsylvania ave, washington, dc')
111 |         if test_osm.latlng:
112 |             print('[ ] Can Geocode with OSM.')
113 |             openstreet = True
114 |         else:
115 |             print('[!] ERROR - Cannot Geocode with OSM.')
116 |             openstreet = False
117 |
118 |
119 |         if not goog and not bing and not openstreet:
120 |             print('[!] ERROR - Cannot geocode to Google, Bing or OpenStreetMap. We are done here.')
121 |             exit(1)
122 |
123 |         # At least one of the geocoders worked
124 |         for addy in review_addresses:
125 |             if goog:
126 |                 g = gmaps.geocode(addy)
127 |                 if g and g[0]['geometry']['location']:
128 |                     loc = g[0]['geometry']['location']['lat'], g[0]['geometry']['location']['lng']
129 |                     reviewlatslongs.append(tuple(loc))
130 |             elif bing:
131 |                 b = geocoder.bing(addy, key=bing_api_key)
132 |                 if b.latlng:
133 |                     reviewlatslongs.append(tuple(b.latlng))
134 |             elif openstreet:
135 |                 osm = geocoder.osm(addy)
136 |                 if osm.latlng:
137 |                     reviewlatslongs.append(tuple(osm.latlng))
138 |         return reviewlatslongs
139 |     else:
140 |         print('\n[-] No review addresses found')
141 |         return False
142 |
143 | def get_venue_data(passed_user):
144 |     # Parsing check-in location information
145 |     url = 'https://www.yelp.com/user_details?userid={}'.format(passed_user)
146 |     print("[ ] VENUE DATA: Requesting {}".format(url))
147 |     reviewlatslongs = yelp_pages(url) or []
148 |     # Pause to prevent Yelp from shunning us
149 |     time.sleep(random.random())
150 |     # Try to make pulls for additional reviews
151 |     for num in range(10, 5000, 10):
152 |         url = 'https://www.yelp.com/user_details?userid={}&rec_pagestart={}'.format(passed_user, num)
153 |         print("[ ] VENUE DATA: Requesting {}".format(url))
154 |         reviewlatslongs1 = yelp_pages(url)
155 |         time.sleep(random.random())  # Pause between page requests
156 |         if reviewlatslongs1:
157 |             reviewlatslongs.extend(reviewlatslongs1)
158 |         else:
159 |             # If a false value came back, no addresses were found and we are done iterating
160 |             break
161 |
162 |     if not reviewlatslongs:
163 |         exit('[-] No locations could be geocoded, so there is nothing to map.')
164 |     review_lats, review_longs = zip(*reviewlatslongs)
165 |
166 |     # Compute the center Lat and Long to center the map
167 |     center_lat = get_mean(review_lats)
168 |     center_long = get_mean(review_longs)
169 |     gmap = gmplot.GoogleMapPlotter(center_lat, center_long, 6)
170 |     gmap.coloricon = "http://www.googlemapsmarkers.com/v1/%s/"
171 |     # Create the points/heatmap/circles on the map
172 |     gmap.heatmap(review_lats, review_longs, 1, 100)
173 |     gmap.scatter(review_lats, review_longs, '#333333', size=1, marker=True)
174 |     gmap.plot(review_lats, review_longs, '#FF33FF', edge_width=1)
175 |
176 |     outfile = 'yelp_map_{}_{}.html'.format(args.user, str(int(time.time())))
177 |     gmap.draw(outfile)
178 |     print("\n[ ] HTML output file named {} was written to disk.".format(outfile))
179 |
180 |     if args.csv:
181 |         outfile = 'yelp_map_{}_{}.csv'.format(args.user, str(int(time.time())))
182 |         csv_data = []
183 |         for row in reviewlatslongs:
184 |             row1 = '{}, {}'.format(row[0], row[1])
185 |             csv_data.append([passed_user, row1])
186 |         with open(outfile, 'w', newline='') as f:
187 |             writer = csv.writer(f)
188 |             writer.writerows(csv_data)
189 |         print('[ ] CSV output file named {} was written to disk.'.format(outfile))
190 |
191 | ###########################
192 | # Start
193 | ###########################
194 |
195 | # Suppress HTTPS warnings
196 | requests.packages.urllib3.disable_warnings(InsecureRequestWarning)
197 |
198 |
199 | ###############
200 | # Get Venue info
201 | ###############
202 | get_venue_data(args.user)
203 |
--------------------------------------------------------------------------------