├── LICENSE ├── README.md └── dork-cli.py /LICENSE: -------------------------------------------------------------------------------- 1 | BSD 2-Clause License 2 | 3 | Copyright (c) 2017, jgor 4 | All rights reserved. 5 | 6 | Redistribution and use in source and binary forms, with or without 7 | modification, are permitted provided that the following conditions are met: 8 | 9 | * Redistributions of source code must retain the above copyright notice, this 10 | list of conditions and the following disclaimer. 11 | 12 | * Redistributions in binary form must reproduce the above copyright notice, 13 | this list of conditions and the following disclaimer in the documentation 14 | and/or other materials provided with the distribution. 15 | 16 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 17 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 18 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 19 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 20 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 21 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 22 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 23 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 24 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 25 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 26 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | dork-cli 2 | ======== 3 | 4 | Command-line Google dork tool. 5 | 6 | dork-cli performs searches against a Google custom search engine and returns a list of all the unique page results it finds, optionally filtered by a set of dynamic page extensions. 
Any number of additional query terms / dorks can be specified. dork-cli was designed to be piped into an external tool such as a vulnerability scanner for automated testing. 7 | 8 | ## Setup ## 9 | To use this program you must configure, at minimum, two settings: a Google API key and a custom search engine ID. 10 | 11 | Custom Search Engine: 12 | * Create a custom search engine via https://www.google.com/cse/ 13 | * Add your desired domain(s) under "Sites to search" 14 | * Click the "Search engine ID" button to reveal the ID, or grab it from the "cx" URL parameter 15 | 16 | API key: 17 | * Open the Google API console at https://code.google.com/apis/console 18 | * Enable the Custom Search API via APIs & auth > APIs 19 | * Create a new API key via APIs & auth > Credentials > Create new Key 20 | * Select "Browser key", leave HTTP Referer blank, and click Create 21 | 22 | ## Usage ## 23 |
24 | $ ./dork-cli.py -h 25 | usage: dork-cli.py [-h] [-e ENGINE] [-f [FILETYPES]] [-k KEY] [-m MAX_QUERIES] 26 | [-s SLEEP] 27 | [T [T ...]] 28 | 29 | Find dynamic pages via Google dorks. 30 | 31 | positional arguments: 32 | T additional search term 33 | 34 | optional arguments: 35 | -h, --help show this help message and exit 36 | -e ENGINE, --engine ENGINE 37 | Google custom search engine id (cx value) 38 | -f [FILETYPES], --filetypes [FILETYPES] 39 | File extensions to return (if present but no 40 | extensions specified, builtin dynamic list is used) 41 | -k KEY, --key KEY Google API key 42 | -m MAX_QUERIES, --max-queries MAX_QUERIES 43 | Maximum number of queries to issue 44 | -s SLEEP, --sleep SLEEP 45 | Seconds to sleep before retry if daily API limit is 46 | reached (0=disable) 47 |48 | 49 | examples: 50 | * NOTE: including -f/--filetypes without an argument, e.g. followed by --, defaults to filtering by a builtin list of dynamic file extensions. 51 |
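The --filetypes value is folded into the Google query as a series of `filetype:` operators joined with `OR`, which is how the examples below are generated. A minimal sketch of that query-building logic (mirroring what dork-cli.py does internally):

```python
def build_query(terms, filetypes=None):
    """Build the 'q' parameter the way dork-cli.py does."""
    q = ' '.join(terms)
    if filetypes:
        # 'php,aspx' becomes ' filetype:php OR filetype:aspx'
        q += ' filetype:' + ' OR filetype:'.join(filetypes.split(','))
    return q

print(build_query(['intitle:login', 'inurl:admin'], 'php,aspx'))
# intitle:login inurl:admin filetype:php OR filetype:aspx
```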
52 | $ ./dork-cli.py inurl:login 53 | https://www.example.com/usher/Login.aspx 54 | https://www.example.com/login/ 55 | http://www.example.com/rooms/index.php?option=com_user&view=login&Itemid=8 56 | http://www.example.com/index.php?cmd=login 57 | [...] 58 |59 |
60 | $ ./dork-cli.py --filetypes -- inurl:id 61 | http://www.example.com/its/sla/sla.php?id=1617 62 | http://www.example.com/bbucks/index.php?site=5&scode=0&id=720 63 | http://www.example.com/directory/details.aspx?id=33 64 | http://www.example.com/SitePages/VOIP%20ID.aspx 65 | http://www.example.com/personnel_ext.php?id=44 66 | http://www.example.com/its/alerts/event.php?id=7220 67 | [...] 68 |69 |
70 | $ ./dork-cli.py --filetypes=php,aspx intitle:login inurl:admin 71 | https://www.example.com/users/lab/admin/portal.php 72 | https://www.example.com/admin/start/login.aspx?ReturnUrl=%2Fadmin%2Fscheduling%2Faudit%2Fdefault.aspx 73 | http://www.example.com/admin/admin.php 74 | [...] 75 |76 | 77 | ## API Limitations ## 78 | The free Google API limits you to 100 searches per day, with a maximum of 10 results per search. This means if you configure dork-cli.py to return 100 results, it will issue 10 queries (1/10th of your daily limit) each time it is run. You have the option to pay for additional searches via the Google API console. At the time of writing, signing up for billing on the Google API site gets you $300 free to spend on API calls for 60 days. 79 | 80 | -------------------------------------------------------------------------------- /dork-cli.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | from __future__ import print_function 3 | try: 4 | from urllib.request import urlopen 5 | from urllib.parse import urlencode,urlparse 6 | from urllib.error import HTTPError 7 | except ImportError: 8 | from urllib import urlencode 9 | from urllib2 import urlopen, HTTPError 10 | from urlparse import urlparse 11 | import json 12 | import sys 13 | import time 14 | import argparse 15 | 16 | domain = '' 17 | engine = '' 18 | key = '' 19 | max_queries = 10 20 | sleep = 0 21 | dynamic_filetypes = "asp,aspx,cfm,cgi,jsp,php,phtm,phtml,shtm,shtml" 22 | 23 | def main(): 24 | parser = argparse.ArgumentParser(description='Find dynamic pages via Google dorks.') 25 | parser.add_argument('-d', '--domain', default=domain, 26 | help='Specific domain to search (instead of all domains defined in CSE)') 27 | parser.add_argument('-e', '--engine', default=engine, 28 | help='Google custom search engine id (cx value)') 29 | parser.add_argument('-f', '--filetypes', nargs='?', default=[], 30 | const=dynamic_filetypes, 31 | 
help='File extensions to return (if present but no extensions specified, builtin dynamic list is used)') 32 | parser.add_argument('-k', '--key', default=key, 33 | help='Google API key') 34 | parser.add_argument('-m', '--max-queries', type=int, default=max_queries, 35 | help='Maximum number of queries to issue') 36 | parser.add_argument('-s', '--sleep', type=int, default=sleep, 37 | help='Seconds to sleep before retry if daily API limit is reached (0=disable)') 38 | parser.add_argument('terms', metavar='T', nargs='*', 39 | help='additional search term') 40 | 41 | args = parser.parse_args() 42 | 43 | if not args.key or not args.engine: 44 | print("ERROR: [key] and [engine] must be set", file=sys.stderr) 45 | parser.print_help() 46 | sys.exit(1) 47 | 48 | data = {} 49 | data['key'] = args.key 50 | data['cx'] = args.engine 51 | data['siteSearch'] = args.domain 52 | data['q'] = ' '.join(args.terms) 53 | if args.filetypes: 54 | filetypes = args.filetypes.split(',') 55 | data['q'] += ' filetype:' + ' OR filetype:'.join(filetypes) 56 | data['num'] = 10 57 | data['start'] = 1 58 | 59 | pages = set() 60 | found = 0 61 | query_max_reached = False 62 | query_count = 0 63 | data_saved = data['q'] 64 | 65 | while query_count < args.max_queries: 66 | url = 'https://www.googleapis.com/customsearch/v1?'+ urlencode(data) 67 | try: 68 | response_str = urlopen(url) 69 | query_count += 1 70 | response_str = response_str.read().decode('utf-8') 71 | response = json.loads(response_str) 72 | except HTTPError as e: 73 | response_str = e.read().decode('utf-8') 74 | response = json.loads(response_str) 75 | if "Invalid Value" in response['error']['message']: 76 | sys.exit(0) 77 | elif response['error']['code'] == 500: 78 | data['q'] = data_saved 79 | query_max_reached = True 80 | continue 81 | print("error: " + str(response['error']['code']) + " - " + str(response['error']['message']), file=sys.stderr) 82 | for error in response['error']['errors']: 83 | print(error['domain'] + "::" + 
error['reason'] + "::" + error['message'], file=sys.stderr) 84 | if "User Rate Limit Exceeded" in response['error']['message']: 85 | print("sleeping 5 seconds", file=sys.stderr) 86 | time.sleep(5) 87 | elif args.sleep and "Daily Limit Exceeded" in response['error']['message']: 88 | print("sleeping " + str(args.sleep) + " seconds", file=sys.stderr) 89 | time.sleep(args.sleep) 90 | continue 91 | else: 92 | sys.exit(1) 93 | data_saved = data['q'] 94 | for request in response['queries']['request']: 95 | if int(request['totalResults']) == 0: 96 | sys.exit(0) 97 | for item in response['items']: 98 | item_url = urlparse(item['link']) 99 | if item_url.path in pages: 100 | if not query_max_reached: 101 | data['q'] += " -inurl:" + item_url.path 102 | else: 103 | pages.add(item_url.path) 104 | found += 1 105 | print(item['link']) 106 | if found >= data['num'] or query_max_reached: 107 | data['start'] += data['num'] 108 | 109 | if __name__ == "__main__": 110 | main() 111 | 112 | --------------------------------------------------------------------------------
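The deduplication strategy in dork-cli.py is worth noting: results are keyed on the URL path, each path is reported only once, and repeated paths are appended to the query as `-inurl:` exclusions so that later result pages surface new pages instead of more parameter variations of the same script. A standalone sketch of that logic (the example URLs are illustrative):

```python
from urllib.parse import urlparse

def dedup_results(links, pages, query):
    """Mirror dork-cli.py's filtering: report each URL path once and
    grow the query with -inurl: exclusions for paths already seen."""
    unique = []
    for link in links:
        path = urlparse(link).path
        if path in pages:
            query += " -inurl:" + path  # push duplicate paths out of later pages
        else:
            pages.add(path)
            unique.append(link)
    return unique, query

seen = set()
links = ["http://www.example.com/a.php?id=1",
         "http://www.example.com/a.php?id=2",
         "http://www.example.com/b.php"]
unique, q = dedup_results(links, seen, "inurl:id")
print(unique)  # a.php?id=1 and b.php; the second a.php hit is dropped
print(q)       # inurl:id -inurl:/a.php
```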