├── .gitignore
├── Dockerfile
├── README.md
├── requirements.txt
└── webscraper
    ├── config.py
    ├── utils
    │   ├── extract_links_from_webpage.py
    │   ├── media_extensions.py
    │   ├── proxy_utils
    │   │   ├── proxy.py
    │   │   └── proxy_list.json
    │   ├── redislite_utils.py
    │   ├── request_client.py
    │   ├── url_utils.py
    │   └── user_agent_utils
    │       ├── user_agent.py
    │       └── user_agent_list.json
    └── webscraper.py

/.gitignore:
--------------------------------------------------------------------------------
/data/
/redis.db
.idea/
.DS_Store
__pycache__/
*.py[cod]

*/redis.db

--------------------------------------------------------------------------------
/Dockerfile:
--------------------------------------------------------------------------------
# Use an official Python runtime as a parent image
FROM python:3.10

# Set the working directory to /app
WORKDIR /app

# Copy the dependency list into the container at /app
COPY ./requirements.txt /app/.

# Install any needed packages specified in requirements.txt
RUN pip install -r requirements.txt


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# WebScraper

WebScraper is a Python-based web scraping tool designed to crawl websites efficiently while implementing sophisticated techniques to evade website security mechanisms and prevent blocking. Whether you require data extraction for research, analysis, or any other purpose, WebScraper streamlines the web scraping process, making it both effective and user-friendly.


## Features

WebScraper offers several essential features to enhance your web scraping experience:

- **Request Throttling:** Avoid overwhelming target websites by intelligently throttling your requests, ensuring a respectful and non-disruptive scraping process.

- **Random Time Intervals:** Implement randomized time intervals between requests to mimic human browsing behavior, reducing the likelihood of triggering website security measures.

- **User-Agent Rotation:** Automatically switch User-Agents for each request to make your scraping activities appear more like legitimate user interactions.

- **IP Rotation via Proxy Server:** Enable IP rotation through a proxy server to further disguise your scraping activities, making it challenging for websites to detect and block your access.

These features collectively enhance the reliability and stealthiness of your web scraping tasks, enabling you to gather data with minimal disruption and increased success rates.


## Getting Started

Follow these instructions to get a copy of WebScraper up and running on your local machine.

### Prerequisites

Make sure you have the following prerequisites installed:

- Python 3.x
- Pip (Python package manager)

### Installation

#### Direct Installation

1. Clone this repository to your local machine:

   ```bash
   git clone https://github.com/MLArtist/WebScraper.git
   ```

2. Navigate to the project directory:

   ```bash
   cd WebScraper
   ```

3. Install the required Python dependencies:

   ```bash
   pip install -r requirements.txt
   ```

Now, you're ready to start using WebScraper!

#### Docker Installation

If you prefer to use Docker for installation, follow these steps:

1. Make sure you have Docker installed on your system.

2. Build the WebScraper Docker image:

   ```bash
   docker build -t webscraper -f Dockerfile .
   ```

Now, you can use WebScraper in a Docker container!
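
The throttling, random time intervals, and User-Agent rotation listed under Features might fit together roughly as in the minimal sketch below. It is an illustration under stated assumptions, not the project's actual code: the names `random_delay`, `build_headers`, and `crawl_politely`, the 1 to 5 second bounds, and the shortened `USER_AGENTS` pool are all hypothetical, and `fetch` stands in for whatever function performs the HTTP request.

```python
import random
import time

# Hypothetical, shortened pool for illustration; the repository ships a much
# longer list in utils/user_agent_utils/user_agent_list.json.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36",
    "Mozilla/5.0 (Windows NT 6.3; rv:36.0) Gecko/20100101 Firefox/36.0",
]


def random_delay(min_seconds=1.0, max_seconds=5.0, rng=random):
    """Return a random pause length in seconds (the bounds are assumed values)."""
    return rng.uniform(min_seconds, max_seconds)


def build_headers(rng=random):
    """Pick a User-Agent at random for the next request."""
    return {"User-Agent": rng.choice(USER_AGENTS)}


def crawl_politely(urls, fetch, sleep=time.sleep, rng=random):
    """Call fetch(url, headers) for each URL, pausing a random time in between."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Throttle: wait a randomized interval before every request but the first.
            sleep(random_delay(rng=rng))
        results.append(fetch(url, build_headers(rng)))
    return results
```

Passing `sleep` and `rng` as parameters keeps the sketch easy to try without real waiting or network access; in practice `fetch` could wrap `requests.get(url, headers=headers)`.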


### Usage

#### Direct Usage

To start scraping, use the following command:

```bash
cd webscraper
python -m webscraper URL
```

Replace `URL` with the URL of the website you want to scrape.

> If you wish to start the crawling process from a supplied address and clear any previously scraped data, you can use the following command:
>
> ```bash
> cd webscraper
> python -m webscraper URL --start_afresh true
> ```
> Replace `URL` with the URL of the website you want to scrape.


#### Via Docker

To run WebScraper using Docker, execute the following command:

```bash
docker run -d -v $(pwd):/app -w /app/webscraper webscraper \
    python -m webscraper URL
```

Replace `URL` with the URL of the website you want to scrape.

#### Output

WebScraper generates output in the form of JSON files, which are stored in the `/data/` directory. Each JSON file contains the raw HTML content of a webpage in the following format:

```json
{
    "url": "URL of the webpage",
    "content": "Raw HTML content of the webpage associated with the URL"
}
```


## Acknowledgments

- Special thanks to the open-source community for providing valuable libraries and tools that made this project possible.


## Note

Web scraping should always be done responsibly and in compliance with the website's terms of service and legal regulations. Before using WebScraper, make sure you have the necessary permissions to scrape data from the targeted website.

Additionally, keep in mind that scraping large amounts of data or scraping too frequently from a website can put strain on the site's resources and may result in IP bans or legal action. Please use WebScraper responsibly and ethically.

The maintainers of this repository are not responsible for any misuse or legal consequences arising from the use of WebScraper. Users are encouraged to familiarize themselves with web scraping best practices and legal guidelines before using this tool.

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
beautifulsoup4==4.12.2
certifi==2024.7.4
chardet==5.1.0
idna==3.7
lxml==4.9.2
psutil==5.9.5
redis==4.5.5
redislite==6.2.859089
requests==2.32.0
soupsieve==2.4.1
urllib3

--------------------------------------------------------------------------------
/webscraper/config.py:
--------------------------------------------------------------------------------
import os
HOME_DIR = os.path.dirname(os.path.realpath(__file__))

DATA_DIR = os.path.join(HOME_DIR, os.pardir, r"data/")

# URL exclusions: MIME top-level types whose file extensions are skipped
EXCLUDED_URL_EXTENSIONS = ["application", "audio", "video", "image"]

USE_PROXY_SERVER = False

--------------------------------------------------------------------------------
/webscraper/utils/extract_links_from_webpage.py:
--------------------------------------------------------------------------------
from collections import Counter
from bs4 import BeautifulSoup

from utils.url_utils import url_split


def get_links(text, website_full_url):
    """
    get the local links and their frequency in a webpage
    :param
        text: response.text
        website_full_url: homepage of the website being scraped
    """

    local_urls = list()
    foreign_urls = list()

    # split the URL once and reuse the parts
    parts = url_split(website_full_url)
    base = parts["base"]
    strip_base = parts["strip_base"]
    base_url = parts["base_url"]
    path = parts["path"]
    scheme = parts["scheme"]

    soup = BeautifulSoup(text, "lxml")

    for link in soup.find_all("a"):
        anchor = link.attrs.get("href").strip() if "href" in link.attrs else ''
        if anchor.startswith('//'):
            # protocol-relative link: drop the leading slashes, keep a trailing one
            endchar = anchor[-1] if anchor.endswith("/") else ""
            anchor = anchor.strip("/") + endchar
        if anchor.startswith('javascript'):
            continue
        if ("#" in anchor or anchor.startswith("mailto:")
                or anchor.startswith("tel:")):
            continue
        elif anchor.startswith('/'):
            # root-relative link
            local_link = base_url + anchor
            local_urls.append(local_link)
        elif anchor.startswith("http") and strip_base.lower() in anchor[
                :anchor.find("/", anchor.find("//") + 2)].lower():
            # absolute link on the same host
            local_urls.append(anchor)
        elif strip_base.lower() in anchor[:anchor.find("/")].lower():
            # host given without a scheme
            local_link = "{}://{}".format(scheme, anchor)
            local_urls.append(local_link)
        elif not anchor.lower().startswith("http"):
            # link relative to the current path
            local_link = path + anchor
            local_urls.append(local_link)
        else:
            foreign_urls.append(anchor)
    return dict(Counter(local_urls))


if __name__ == '__main__':
    # get_links(text, website_full_url)
    pass

--------------------------------------------------------------------------------
/webscraper/utils/media_extensions.py:
--------------------------------------------------------------------------------
import mimetypes

import config


def get_extensions_for_type():
    """Return the set of file extensions whose MIME top-level type is excluded."""
    return_list = set()
    general_type = config.EXCLUDED_URL_EXTENSIONS
    for ext in mimetypes.types_map:
        if mimetypes.types_map[ext].split('/')[0] in general_type:
            return_list.add(ext.lower())
    return return_list


media_extensions_list = get_extensions_for_type()

--------------------------------------------------------------------------------
/webscraper/utils/proxy_utils/proxy.py:
--------------------------------------------------------------------------------
import json
import os
import random
import requests
from bs4 import BeautifulSoup

from config import HOME_DIR
from utils.user_agent_utils.user_agent import UserAgent

ua = UserAgent()

# Single location used for both reading and writing the cached proxy list
PROXY_LIST_FILE = os.path.join(HOME_DIR, "utils/proxy_utils/proxy_list.json")


class Proxy:
    def __init__(self):
        self.proxy_list = self.read_proxy_list()
        if not self.proxy_list:
            self.proxy_list = self.update_proxy_list()
            self.write_proxy_list()
        self.count = 0
        self.proxy = None

    @staticmethod
    def read_proxy_list():
        if os.path.exists(PROXY_LIST_FILE):
            with open(PROXY_LIST_FILE, "r") as fp:
                return json.load(fp)
        return None

    @staticmethod
    def update_proxy_list():
        # Retrieve latest proxies
        url = 'https://www.sslproxies.org/'
        try:
            response = requests.get(url, headers={'User-Agent': ua.user_agent()})
        except requests.RequestException as exc:
            raise Exception("Could not scrape proxies from {}!".format(url)) from exc

        soup = BeautifulSoup(response.text, 'html.parser')

        proxies_table = soup.find(id='proxylisttable')
        proxy_list = []
        # Collect each table row as "ip:port"
        for row in proxies_table.tbody.find_all('tr'):
            cells = row.find_all('td')
            proxy_list.append(cells[0].string + ':' + cells[1].string)
        return proxy_list

    def write_proxy_list(self):
        with open(PROXY_LIST_FILE, "w") as fp:
            json.dump(self.proxy_list, fp)

    def generate_proxy(self):
        """Choose a random proxy; keep it for 10 calls (or until it is removed), then change it
        """
        if self.count % 10 == 0 or self.proxy not in self.proxy_list:
            self.proxy = random.choice(self.proxy_list)
        self.count += 1
        return {
            "http": self.proxy,
            "https": self.proxy
        }


if __name__ == '__main__':
    pro = Proxy()
    print(pro.generate_proxy())

--------------------------------------------------------------------------------
/webscraper/utils/proxy_utils/proxy_list.json:
--------------------------------------------------------------------------------
["187.87.38.28:53281", "91.236.251.131:8118", "51.158.171.137:3128", "51.15.142.111:3128", "103.102.15.90:10714", "51.15.57.189:3128", "51.15.63.71:3128", "51.158.161.153:3128", "109.251.185.20:31692", "200.106.55.125:80", "51.15.132.100:3128", "51.15.83.142:3128", "51.158.164.40:3128", "51.15.134.242:3128", "51.15.70.100:3128", "51.15.104.165:3128", "51.158.189.154:3128", "51.15.231.181:3128", "212.83.128.152:3838", "51.15.58.59:3128", "51.158.99.58:3128", "51.158.178.4:3128", "51.158.184.55:3128", "51.158.184.126:3128", "45.76.176.170:3128", "51.158.168.245:3128", "51.158.189.203:3128", "91.224.98.206:36301", "51.15.118.123:3128", "51.15.57.44:3128", "202.40.177.69:80", "51.15.132.110:3128", "51.15.36.8:3128", "95.30.182.240:8080", "51.15.56.157:3128", "51.15.111.34:3128", "51.15.127.242:3128", "51.158.165.63:3128", "36.90.69.105:55443", "181.118.167.104:80", "103.28.121.58:3128", "51.15.207.198:3128", "110.44.133.135:3128", "47.89.8.186:80", "51.15.83.156:3128", "51.15.39.251:3128", "77.37.131.164:55443", "51.161.62.120:8080", "186.0.231.138:999", "191.232.170.36:80", "128.199.214.87:3128", "139.99.102.114:80", "51.15.119.118:3128", "82.200.233.4:3128", "51.15.96.241:3128", "51.15.73.212:3128", "51.15.111.19:3128", "51.158.172.174:3128", "163.172.151.90:3128", "176.202.92.62:8080", "190.214.45.18:31326", "94.228.21.66:45882", "103.224.185.48:32341", "36.92.124.2:39081", "116.0.2.162:36432", "200.55.218.202:53281", "223.165.1.170:44504", "85.10.219.97:1080", "46.175.70.69:40559", "68.183.30.238:1111", "109.199.133.161:23500", "109.87.40.23:33879", "195.78.93.28:3128", "103.142.110.130:36793", "117.54.4.245:42097", "43.252.18.140:54561", "36.89.183.253:48733", "37.17.38.196:53281", "188.0.138.147:8080", "186.159.8.218:59211", "103.87.207.188:44793", "3.23.23.63:20018", "77.221.220.133:44331", "77.247.94.131:48602", "95.0.206.216:8080", "64.71.145.122:3128", "45.7.133.174:1985", "81.89.113.142:57801", "1.10.187.149:44976", "51.15.115.249:3128", "51.15.105.176:3128", "51.15.124.3:3128", "51.158.182.115:3128", "51.15.119.72:3128", "51.15.141.178:3128", "51.15.108.88:3128", "51.158.67.66:3128", "51.15.47.6:3128", "51.15.110.18:3128", "51.15.62.232:3128"]
--------------------------------------------------------------------------------
/webscraper/utils/redislite_utils.py:
--------------------------------------------------------------------------------
from redislite import Redis

from utils.url_utils import get_filtered_links


redis_client = Redis(dbfilename="./redis.db", decode_responses=True)


def redis_cleanup(website_full_url):
    """Remove invalid entries from the redis caches."""
    # remove URLs that are both queued and already processed
    for anchor in redis_client.sinter("new_urls", "processed_urls"):
        redis_client.srem("new_urls", anchor)
        print("Removed processed URL from redis: {}!\n".format(anchor))
    # remove queued URLs that do not pass the link filter
    for anchor in redis_client.smembers("new_urls"):
        if len(get_filtered_links([anchor], website_full_url)) < 1:
            redis_client.srem("new_urls", anchor)


if __name__ == '__main__':
    redis_cleanup("https://www.wikipedia.org/")

--------------------------------------------------------------------------------
/webscraper/utils/request_client.py:
--------------------------------------------------------------------------------
from requests.auth import HTTPProxyAuth

from utils.proxy_utils.proxy import Proxy, ua
import requests
import config


class ReqestClient(Proxy):
    def __init__(self):
        super().__init__()

    def request_with_proxy_header(self, url):
        """GET the URL with a random User-Agent; return the response, or None on failure."""
        header = {'User-Agent': ua.user_agent()}
        # stays None if the request fails, so the final return is always defined
        response = None

        if config.USE_PROXY_SERVER:
            proxy = self.generate_proxy()
            auth = HTTPProxyAuth("", "")
            try:
                response = requests.get(url, proxies=proxy, auth=auth, headers=header, timeout=20, verify=True)
            except requests.RequestException:
                # remove the invalid proxy from the proxy list and update in the file
                self.proxy_list.remove(proxy.get("http"))
                self.write_proxy_list()
        else:
            try:
                response = requests.get(url, headers=header, timeout=20, verify=True)
            except requests.RequestException:
                pass

        return response


if __name__ == '__main__':
    cli = ReqestClient()
    print(cli.request_with_proxy_header("https://www.wikipedia.org/").text)

--------------------------------------------------------------------------------
/webscraper/utils/url_utils.py:
--------------------------------------------------------------------------------
from urllib.parse import urlsplit

from utils.media_extensions import media_extensions_list


def url_split(url):
    """Split the link into useful parts."""
    parts = urlsplit(url)
    scheme = parts.scheme
    base = '{0.netloc}'.format(parts)
    strip_base = base.replace('www.', "")
    base_url = "{0.scheme}://{0.netloc}".format(parts)
    path = url[:url.rfind('/') + 1] if '/' in parts.path else url
    return {
        "scheme": scheme,
        "base": base,
        "strip_base": strip_base,
        "base_url": base_url,
        "path": path
    }


def get_filtered_links(local_urls_list, website_full_url):
    """Get the filtered links for a list of local links."""
    filtered_list = []
    strip_base = url_split(website_full_url)["strip_base"].lower()
    for anchor in local_urls_list:
        # discard media extensions
        i = anchor.lower()
        if i[i.rfind("."):].strip("/") in media_extensions_list:
            continue
        # discard anchor tags
        if "#" in i:
            continue
        # discard javascript links
        if "javascript:" in i:
            continue
        if i.startswith("http"):
            # discard "https://www.google.com/c.org"
            http_loc = i.find("//")
            end_finder = i.find("/", http_loc + 2) if i.find("/", http_loc + 2) != -1 else len(i)
            if strip_base not in i[:end_finder]:
                continue
        else:
            i = i.strip("/")
            # discard "www.google.com/https://www.wikipedia.org"
            end_finder = i.find("/") if i.find("/") != -1 else len(i)
            if strip_base not in i[:end_finder]:
                continue
        filtered_list.append(anchor)
    return filtered_list


if __name__ == '__main__':
    pass
--------------------------------------------------------------------------------
/webscraper/utils/user_agent_utils/user_agent.py:
--------------------------------------------------------------------------------
import json
import os
import random
import config


class UserAgent:
    def __init__(self):
        self.user_agents = self.load(file=os.path.join(config.HOME_DIR, "utils/user_agent_utils/user_agent_list.json"))

    @staticmethod
    def load(file):
        """Load the data from the supplied JSON file."""
        with open(file) as fp:
            data = json.load(fp)
        if not data:
            # a bare `raise` has no active exception here, so raise an explicit error
            raise ValueError("Empty {} file".format(file))
        return data['browsers']

    def user_agent(self):
        """Return a user agent selected randomly from the provided data."""
        browser = random.choice(list(self.user_agents))
        return random.choice(self.user_agents[browser])


if __name__ == '__main__':
    pro = UserAgent()
    print(pro.user_agent())
--------------------------------------------------------------------------------
/webscraper/utils/user_agent_utils/user_agent_list.json:
--------------------------------------------------------------------------------
{"browsers": {"chrome": ["Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2227.1 Safari/537.36", "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2227.0 Safari/537.36", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko)
Chrome/41.0.2227.0 Safari/537.36", "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2226.0 Safari/537.36", "Mozilla/5.0 (Windows NT 6.4; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2225.0 Safari/537.36", "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2225.0 Safari/537.36", "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2224.3 Safari/537.36", "Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.93 Safari/537.36", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.124 Safari/537.36", "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2049.0 Safari/537.36", "Mozilla/5.0 (Windows NT 4.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2049.0 Safari/537.36", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.67 Safari/537.36", "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.67 Safari/537.36", "Mozilla/5.0 (X11; OpenBSD i386) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1944.0 Safari/537.36", "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.3319.102 Safari/537.36", "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.2309.372 Safari/537.36", "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.2117.157 Safari/537.36", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36", "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1866.237 Safari/537.36", "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) 
Chrome/34.0.1847.137 Safari/4E423F", "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.116 Safari/537.36 Mozilla/5.0 (iPad; U; CPU OS 3_2 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Version/4.0.4 Mobile/7B334b Safari/531.21.10", "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.517 Safari/537.36", "Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1667.0 Safari/537.36", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1664.3 Safari/537.36", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1664.3 Safari/537.36", "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.16 Safari/537.36", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1623.0 Safari/537.36", "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.17 Safari/537.36", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.62 Safari/537.36", "Mozilla/5.0 (X11; CrOS i686 4319.74.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.57 Safari/537.36", "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.2 Safari/537.36", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1468.0 Safari/537.36", "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1467.0 Safari/537.36", "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1464.0 Safari/537.36", "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1500.55 Safari/537.36", "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.93 Safari/537.36", "Mozilla/5.0 (Windows NT 6.1; WOW64) 
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.93 Safari/537.36", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.93 Safari/537.36", "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.93 Safari/537.36", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.93 Safari/537.36", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.93 Safari/537.36", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.90 Safari/537.36", "Mozilla/5.0 (X11; NetBSD) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36", "Mozilla/5.0 (X11; CrOS i686 3912.101.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.60 Safari/537.17", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1309.0 Safari/537.17", "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.15 (KHTML, like Gecko) Chrome/24.0.1295.0 Safari/537.15", "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.14 (KHTML, like Gecko) Chrome/24.0.1292.0 Safari/537.14"], "opera": ["Opera/9.80 (X11; Linux i686; Ubuntu/14.10) Presto/2.12.388 Version/12.16", "Opera/9.80 (Windows NT 6.0) Presto/2.12.388 Version/12.14", "Mozilla/5.0 (Windows NT 6.0; rv:2.0) Gecko/20100101 Firefox/4.0 Opera 12.14", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0) Opera 12.14", "Opera/12.80 (Windows NT 5.1; U; en) Presto/2.10.289 Version/12.02", "Opera/9.80 (Windows NT 6.1; U; es-ES) Presto/2.9.181 Version/12.00", "Opera/9.80 (Windows NT 5.1; U; zh-sg) Presto/2.9.181 Version/12.00", "Opera/12.0(Windows NT 5.2;U;en)Presto/22.9.168 Version/12.00", "Opera/12.0(Windows NT 5.1;U;en)Presto/22.9.168 Version/12.00", "Mozilla/5.0 (Windows NT 5.1) Gecko/20100101 Firefox/14.0 
Opera/12.0", "Opera/9.80 (Windows NT 6.1; WOW64; U; pt) Presto/2.10.229 Version/11.62", "Opera/9.80 (Windows NT 6.0; U; pl) Presto/2.10.229 Version/11.62", "Opera/9.80 (Macintosh; Intel Mac OS X 10.6.8; U; fr) Presto/2.9.168 Version/11.52", "Opera/9.80 (Macintosh; Intel Mac OS X 10.6.8; U; de) Presto/2.9.168 Version/11.52", "Opera/9.80 (Windows NT 5.1; U; en) Presto/2.9.168 Version/11.51", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; de) Opera 11.51", "Opera/9.80 (X11; Linux x86_64; U; fr) Presto/2.9.168 Version/11.50", "Opera/9.80 (X11; Linux i686; U; hu) Presto/2.9.168 Version/11.50", "Opera/9.80 (X11; Linux i686; U; ru) Presto/2.8.131 Version/11.11", "Opera/9.80 (X11; Linux i686; U; es-ES) Presto/2.8.131 Version/11.11", "Mozilla/5.0 (Windows NT 5.1; U; en; rv:1.8.1) Gecko/20061208 Firefox/5.0 Opera 11.11", "Opera/9.80 (X11; Linux x86_64; U; bg) Presto/2.8.131 Version/11.10", "Opera/9.80 (Windows NT 6.0; U; en) Presto/2.8.99 Version/11.10", "Opera/9.80 (Windows NT 5.1; U; zh-tw) Presto/2.8.131 Version/11.10", "Opera/9.80 (Windows NT 6.1; Opera Tablet/15165; U; en) Presto/2.8.149 Version/11.1", "Opera/9.80 (X11; Linux x86_64; U; Ubuntu/10.10 (maverick); pl) Presto/2.7.62 Version/11.01", "Opera/9.80 (X11; Linux i686; U; ja) Presto/2.7.62 Version/11.01", "Opera/9.80 (X11; Linux i686; U; fr) Presto/2.7.62 Version/11.01", "Opera/9.80 (Windows NT 6.1; U; zh-tw) Presto/2.7.62 Version/11.01", "Opera/9.80 (Windows NT 6.1; U; zh-cn) Presto/2.7.62 Version/11.01", "Opera/9.80 (Windows NT 6.1; U; sv) Presto/2.7.62 Version/11.01", "Opera/9.80 (Windows NT 6.1; U; en-US) Presto/2.7.62 Version/11.01", "Opera/9.80 (Windows NT 6.1; U; cs) Presto/2.7.62 Version/11.01", "Opera/9.80 (Windows NT 6.0; U; pl) Presto/2.7.62 Version/11.01", "Opera/9.80 (Windows NT 5.2; U; ru) Presto/2.7.62 Version/11.01", "Opera/9.80 (Windows NT 5.1; U;) Presto/2.7.62 Version/11.01", "Opera/9.80 (Windows NT 5.1; U; cs) Presto/2.7.62 Version/11.01", "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; 
rv:1.9.2.13) Gecko/20101213 Opera/9.80 (Windows NT 6.1; U; zh-tw) Presto/2.7.62 Version/11.01", "Mozilla/5.0 (Windows NT 6.1; U; nl; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6 Opera 11.01", "Mozilla/5.0 (Windows NT 6.1; U; de; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6 Opera 11.01", "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; de) Opera 11.01", "Opera/9.80 (X11; Linux x86_64; U; pl) Presto/2.7.62 Version/11.00", "Opera/9.80 (X11; Linux i686; U; it) Presto/2.7.62 Version/11.00", "Opera/9.80 (Windows NT 6.1; U; zh-cn) Presto/2.6.37 Version/11.00", "Opera/9.80 (Windows NT 6.1; U; pl) Presto/2.7.62 Version/11.00", "Opera/9.80 (Windows NT 6.1; U; ko) Presto/2.7.62 Version/11.00", "Opera/9.80 (Windows NT 6.1; U; fi) Presto/2.7.62 Version/11.00", "Opera/9.80 (Windows NT 6.1; U; en-GB) Presto/2.7.62 Version/11.00", "Opera/9.80 (Windows NT 6.1 x64; U; en) Presto/2.7.62 Version/11.00", "Opera/9.80 (Windows NT 6.0; U; en) Presto/2.7.39 Version/11.00"], "firefox": ["Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1", "Mozilla/5.0 (Windows NT 6.3; rv:36.0) Gecko/20100101 Firefox/36.0", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10; rv:33.0) Gecko/20100101 Firefox/33.0", "Mozilla/5.0 (X11; Linux i586; rv:31.0) Gecko/20100101 Firefox/31.0", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20130401 Firefox/31.0", "Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Firefox/31.0", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:29.0) Gecko/20120101 Firefox/29.0", "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/29.0", "Mozilla/5.0 (X11; OpenBSD amd64; rv:28.0) Gecko/20100101 Firefox/28.0", "Mozilla/5.0 (X11; Linux x86_64; rv:28.0) Gecko/20100101 Firefox/28.0", "Mozilla/5.0 (Windows NT 6.1; rv:27.3) Gecko/20130101 Firefox/27.3", "Mozilla/5.0 (Windows NT 6.2; Win64; x64; rv:27.0) Gecko/20121011 Firefox/27.0", "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/25.0", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; 
rv:25.0) Gecko/20100101 Firefox/25.0", "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0", "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:24.0) Gecko/20100101 Firefox/24.0", "Mozilla/5.0 (Windows NT 6.2; rv:22.0) Gecko/20130405 Firefox/23.0", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:23.0) Gecko/20130406 Firefox/23.0", "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:23.0) Gecko/20131011 Firefox/23.0", "Mozilla/5.0 (Windows NT 6.2; rv:22.0) Gecko/20130405 Firefox/22.0", "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:22.0) Gecko/20130328 Firefox/22.0", "Mozilla/5.0 (Windows NT 6.1; rv:22.0) Gecko/20130405 Firefox/22.0", "Mozilla/5.0 (Microsoft Windows NT 6.2.9200.0); rv:22.0) Gecko/20130405 Firefox/22.0", "Mozilla/5.0 (Windows NT 6.2; Win64; x64; rv:16.0.1) Gecko/20121011 Firefox/21.0.1", "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:16.0.1) Gecko/20121011 Firefox/21.0.1", "Mozilla/5.0 (Windows NT 6.2; Win64; x64; rv:21.0.0) Gecko/20121011 Firefox/21.0.0", "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:21.0) Gecko/20130331 Firefox/21.0", "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:21.0) Gecko/20100101 Firefox/21.0", "Mozilla/5.0 (X11; Linux i686; rv:21.0) Gecko/20100101 Firefox/21.0", "Mozilla/5.0 (Windows NT 6.2; WOW64; rv:21.0) Gecko/20130514 Firefox/21.0", "Mozilla/5.0 (Windows NT 6.2; rv:21.0) Gecko/20130326 Firefox/21.0", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:21.0) Gecko/20130401 Firefox/21.0", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:21.0) Gecko/20130331 Firefox/21.0", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:21.0) Gecko/20130330 Firefox/21.0", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:21.0) Gecko/20100101 Firefox/21.0", "Mozilla/5.0 (Windows NT 6.1; rv:21.0) Gecko/20130401 Firefox/21.0", "Mozilla/5.0 (Windows NT 6.1; rv:21.0) Gecko/20130328 Firefox/21.0", "Mozilla/5.0 (Windows NT 6.1; rv:21.0) Gecko/20100101 Firefox/21.0", "Mozilla/5.0 (Windows NT 5.1; rv:21.0) 
Gecko/20130401 Firefox/21.0", "Mozilla/5.0 (Windows NT 5.1; rv:21.0) Gecko/20130331 Firefox/21.0", "Mozilla/5.0 (Windows NT 5.1; rv:21.0) Gecko/20100101 Firefox/21.0", "Mozilla/5.0 (Windows NT 5.0; rv:21.0) Gecko/20100101 Firefox/21.0", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0", "Mozilla/5.0 (Windows NT 6.2; Win64; x64;) Gecko/20100101 Firefox/20.0", "Mozilla/5.0 (Windows x86; rv:19.0) Gecko/20100101 Firefox/19.0", "Mozilla/5.0 (Windows NT 6.1; rv:6.0) Gecko/20100101 Firefox/19.0", "Mozilla/5.0 (Windows NT 6.1; rv:14.0) Gecko/20100101 Firefox/18.0.1", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:18.0) Gecko/20100101 Firefox/18.0", "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:17.0) Gecko/20100101 Firefox/17.0.6"], "internetexplorer": ["Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; AS; rv:11.0) like Gecko", "Mozilla/5.0 (compatible, MSIE 11, Windows NT 6.3; Trident/7.0; rv:11.0) like Gecko", "Mozilla/5.0 (compatible; MSIE 10.6; Windows NT 6.1; Trident/5.0; InfoPath.2; SLCC1; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET CLR 2.0.50727) 3gpp-gba UNTRUSTED/1.0", "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 7.0; InfoPath.3; .NET CLR 3.1.40767; Trident/6.0; en-IN)", "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)", "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)", "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/5.0)", "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/4.0; InfoPath.2; SV1; .NET CLR 2.0.50727; WOW64)", "Mozilla/5.0 (compatible; MSIE 10.0; Macintosh; Intel Mac OS X 10_7_3; Trident/6.0)", "Mozilla/4.0 (Compatible; MSIE 8.0; Windows NT 5.2; Trident/6.0)", "Mozilla/4.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/5.0)", "Mozilla/1.22 (compatible; MSIE 10.0; Windows 3.1)", "Mozilla/5.0 (Windows; U; MSIE 9.0; WIndows NT 9.0; en-US))", "Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US)", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 7.1; 
Trident/5.0)", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; Media Center PC 6.0; InfoPath.3; MS-RTC LM 8; Zune 4.7)", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; Media Center PC 6.0; InfoPath.3; MS-RTC LM 8; Zune 4.7", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; Zune 4.0; InfoPath.3; MS-RTC LM 8; .NET4.0C; .NET4.0E)", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; chromeframe/12.0.742.112)", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET CLR 2.0.50727; Media Center PC 6.0)", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET CLR 2.0.50727; Media Center PC 6.0)", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0; .NET CLR 2.0.50727; SLCC2; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; Zune 4.0; Tablet PC 2.0; InfoPath.3; .NET4.0C; .NET4.0E)", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0; yie8)", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; InfoPath.2; .NET CLR 1.1.4322; .NET4.0C; Tablet PC 2.0)", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0; FunWebProducts)", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0; chromeframe/13.0.782.215)", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0; chromeframe/11.0.696.57)", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0) chromeframe/10.0.648.205", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/4.0; GTB7.4; InfoPath.1; SV1; .NET CLR 2.8.52393; WOW64; en-US)", 
"Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0; chromeframe/11.0.696.57)", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/4.0; GTB7.4; InfoPath.3; SV1; .NET CLR 3.1.76908; WOW64; en-US)", "Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; GTB7.4; InfoPath.2; SV1; .NET CLR 3.3.69573; WOW64; en-US)", "Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET CLR 1.0.3705; .NET CLR 1.1.4322)", "Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; InfoPath.1; SV1; .NET CLR 3.8.36217; WOW64; en-US)", "Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; .NET CLR 2.7.58687; SLCC2; Media Center PC 5.0; Zune 3.4; Tablet PC 3.6; InfoPath.3)", "Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 5.2; Trident/4.0; Media Center PC 4.0; SLCC1; .NET CLR 3.0.04320)", "Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; SLCC1; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET CLR 1.1.4322)", "Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; InfoPath.2; SLCC1; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET CLR 2.0.50727)", "Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727)", "Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 5.1; SLCC1; .NET CLR 1.1.4322)", "Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 5.0; Trident/4.0; InfoPath.1; SV1; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET CLR 3.0.04506.30)", "Mozilla/5.0 (compatible; MSIE 7.0; Windows NT 5.0; Trident/4.0; FBSMTWB; .NET CLR 2.0.34861; .NET CLR 3.0.3746.3218; .NET CLR 3.5.33652; msn OptimizedIE8;ENUS)", "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.2; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0)", "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; Media Center PC 6.0; InfoPath.2; MS-RTC LM 8)", 
"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; Media Center PC 6.0; InfoPath.2; MS-RTC LM 8", "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; Media Center PC 6.0; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C)", "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; InfoPath.3; .NET4.0C; .NET4.0E; .NET CLR 3.5.30729; .NET CLR 3.0.30729; MS-RTC LM 8)", "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; InfoPath.2)", "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; Zune 3.0)"], "safari": ["Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.75.14 (KHTML, like Gecko) Version/7.0.3 Safari/7046A194A", "Mozilla/5.0 (iPad; CPU OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5355d Safari/8536.25", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/537.13+ (KHTML, like Gecko) Version/5.1.7 Safari/534.57.2", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/534.55.3 (KHTML, like Gecko) Version/5.1.3 Safari/534.53.10", "Mozilla/5.0 (iPad; CPU OS 5_1 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko ) Version/5.1 Mobile/9B176 Safari/7534.48.3", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_8; de-at) AppleWebKit/533.21.1 (KHTML, like Gecko) Version/5.0.5 Safari/533.21.1", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_7; da-dk) AppleWebKit/533.21.1 (KHTML, like Gecko) Version/5.0.5 Safari/533.21.1", "Mozilla/5.0 (Windows; U; Windows NT 6.1; tr-TR) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27", "Mozilla/5.0 (Windows; U; Windows NT 6.1; ko-KR) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27", "Mozilla/5.0 (Windows; U; Windows NT 6.1; fr-FR) AppleWebKit/533.20.25 (KHTML, like 
Gecko) Version/5.0.4 Safari/533.20.27", "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27", "Mozilla/5.0 (Windows; U; Windows NT 6.1; cs-CZ) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27", "Mozilla/5.0 (Windows; U; Windows NT 6.0; ja-JP) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27", "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27", "Mozilla/5.0 (Macintosh; U; PPC Mac OS X 10_5_8; zh-cn) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27", "Mozilla/5.0 (Macintosh; U; PPC Mac OS X 10_5_8; ja-jp) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_7; ja-jp) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_6; zh-cn) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_6; sv-se) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_6; ko-kr) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_6; ja-jp) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_6; it-it) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_6; fr-fr) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_6; es-es) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_6; en-us) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27", 
"Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_6; en-gb) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_6; de-de) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27", "Mozilla/5.0 (Windows; U; Windows NT 6.1; sv-SE) AppleWebKit/533.19.4 (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4", "Mozilla/5.0 (Windows; U; Windows NT 6.1; ja-JP) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4", "Mozilla/5.0 (Windows; U; Windows NT 6.1; de-DE) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4", "Mozilla/5.0 (Windows; U; Windows NT 6.0; hu-HU) AppleWebKit/533.19.4 (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4", "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4", "Mozilla/5.0 (Windows; U; Windows NT 6.0; de-DE) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4", "Mozilla/5.0 (Windows; U; Windows NT 5.1; ru-RU) AppleWebKit/533.19.4 (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4", "Mozilla/5.0 (Windows; U; Windows NT 5.1; ja-JP) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4", "Mozilla/5.0 (Windows; U; Windows NT 5.1; it-IT) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4", "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_7; en-us) AppleWebKit/534.16+ (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_6; fr-ch) AppleWebKit/533.19.4 (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_5; de-de) AppleWebKit/534.15+ (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_5; ar) AppleWebKit/533.19.4 (KHTML, like Gecko) Version/5.0.3 
Safari/533.19.4", "Mozilla/5.0 (Android 2.2; Windows; U; Windows NT 6.1; en-US) AppleWebKit/533.19.4 (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4", "Mozilla/5.0 (Windows; U; Windows NT 6.1; zh-HK) AppleWebKit/533.18.1 (KHTML, like Gecko) Version/5.0.2 Safari/533.18.5", "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/533.19.4 (KHTML, like Gecko) Version/5.0.2 Safari/533.18.5", "Mozilla/5.0 (Windows; U; Windows NT 6.0; tr-TR) AppleWebKit/533.18.1 (KHTML, like Gecko) Version/5.0.2 Safari/533.18.5", "Mozilla/5.0 (Windows; U; Windows NT 6.0; nb-NO) AppleWebKit/533.18.1 (KHTML, like Gecko) Version/5.0.2 Safari/533.18.5", "Mozilla/5.0 (Windows; U; Windows NT 6.0; fr-FR) AppleWebKit/533.18.1 (KHTML, like Gecko) Version/5.0.2 Safari/533.18.5", "Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-TW) AppleWebKit/533.19.4 (KHTML, like Gecko) Version/5.0.2 Safari/533.18.5", "Mozilla/5.0 (Windows; U; Windows NT 5.1; ru-RU) AppleWebKit/533.18.1 (KHTML, like Gecko) Version/5.0.2 Safari/533.18.5", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_8; zh-cn) AppleWebKit/533.18.1 (KHTML, like Gecko) Version/5.0.2 Safari/533.18.5"]}, "randomize": {"344": "chrome", "819": "firefox", "346": "chrome", "347": "chrome", "340": "chrome", "341": "chrome", "342": "chrome", "343": "chrome", "810": "internetexplorer", "811": "internetexplorer", "812": "internetexplorer", "813": "firefox", "348": "chrome", "349": "chrome", "816": "firefox", "817": "firefox", "737": "chrome", "719": "chrome", "718": "chrome", "717": "chrome", "716": "chrome", "715": "chrome", "714": "chrome", "713": "chrome", "712": "chrome", "711": "chrome", "710": "chrome", "421": "chrome", "129": "chrome", "420": "chrome", "423": "chrome", "422": "chrome", "425": "chrome", "619": "chrome", "424": "chrome", "427": "chrome", "298": "chrome", "299": "chrome", "296": "chrome", "297": "chrome", "294": "chrome", "295": "chrome", "292": "chrome", "293": "chrome", "290": "chrome", "291": "chrome", "591": "chrome", "590": 
"chrome", "593": "chrome", "592": "chrome", "595": "chrome", "594": "chrome", "597": "chrome", "596": "chrome", "195": "chrome", "194": "chrome", "197": "chrome", "196": "chrome", "191": "chrome", "190": "chrome", "193": "chrome", "192": "chrome", "270": "chrome", "271": "chrome", "272": "chrome", "273": "chrome", "274": "chrome", "275": "chrome", "276": "chrome", "277": "chrome", "278": "chrome", "279": "chrome", "569": "chrome", "497": "chrome", "524": "chrome", "525": "chrome", "526": "chrome", "527": "chrome", "520": "chrome", "521": "chrome", "522": "chrome", "523": "chrome", "528": "chrome", "529": "chrome", "449": "chrome", "448": "chrome", "345": "chrome", "443": "chrome", "442": "chrome", "441": "chrome", "440": "chrome", "447": "chrome", "446": "chrome", "445": "chrome", "444": "chrome", "47": "chrome", "108": "chrome", "109": "chrome", "102": "chrome", "103": "chrome", "100": "chrome", "101": "chrome", "106": "chrome", "107": "chrome", "104": "chrome", "105": "chrome", "902": "firefox", "903": "firefox", "39": "chrome", "38": "chrome", "906": "firefox", "907": "firefox", "904": "firefox", "905": "firefox", "33": "chrome", "32": "chrome", "31": "chrome", "30": "chrome", "37": "chrome", "36": "chrome", "35": "chrome", "34": "chrome", "641": "chrome", "640": "chrome", "643": "chrome", "642": "chrome", "645": "chrome", "644": "chrome", "438": "chrome", "439": "chrome", "436": "chrome", "437": "chrome", "434": "chrome", "435": "chrome", "432": "chrome", "433": "chrome", "430": "chrome", "431": "chrome", "826": "firefox", "339": "chrome", "338": "chrome", "335": "chrome", "334": "chrome", "337": "chrome", "336": "chrome", "331": "chrome", "330": "chrome", "333": "chrome", "332": "chrome", "559": "chrome", "745": "chrome", "854": "firefox", "818": "firefox", "856": "firefox", "857": "firefox", "850": "firefox", "851": "firefox", "852": "firefox", "0": "chrome", "858": "firefox", "859": "firefox", "748": "chrome", "6": "chrome", "43": "chrome", "99": "chrome", 
"98": "chrome", "91": "chrome", "90": "chrome", "93": "chrome", "92": "chrome", "95": "chrome", "94": "chrome", "97": "chrome", "96": "chrome", "814": "firefox", "815": "firefox", "153": "chrome", "740": "chrome", "741": "chrome", "742": "chrome", "743": "chrome", "744": "chrome", "558": "chrome", "746": "chrome", "747": "chrome", "555": "chrome", "554": "chrome", "557": "chrome", "556": "chrome", "551": "chrome", "550": "chrome", "553": "chrome", "552": "chrome", "238": "chrome", "239": "chrome", "234": "chrome", "235": "chrome", "236": "chrome", "237": "chrome", "230": "chrome", "231": "chrome", "232": "chrome", "233": "chrome", "1": "chrome", "155": "chrome", "146": "chrome", "147": "chrome", "618": "chrome", "145": "chrome", "142": "chrome", "143": "chrome", "140": "chrome", "141": "chrome", "612": "chrome", "613": "chrome", "610": "chrome", "611": "chrome", "616": "chrome", "617": "chrome", "148": "chrome", "149": "chrome", "46": "chrome", "154": "chrome", "948": "safari", "949": "safari", "946": "safari", "947": "safari", "944": "safari", "945": "safari", "942": "safari", "943": "safari", "940": "safari", "941": "safari", "689": "chrome", "688": "chrome", "685": "chrome", "684": "chrome", "687": "chrome", "686": "chrome", "681": "chrome", "680": "chrome", "683": "chrome", "682": "chrome", "458": "chrome", "459": "chrome", "133": "chrome", "132": "chrome", "131": "chrome", "130": "chrome", "137": "chrome", "136": "chrome", "135": "chrome", "134": "chrome", "494": "chrome", "495": "chrome", "139": "chrome", "138": "chrome", "490": "chrome", "491": "chrome", "492": "chrome", "493": "chrome", "24": "chrome", "25": "chrome", "26": "chrome", "27": "chrome", "20": "chrome", "21": "chrome", "22": "chrome", "23": "chrome", "28": "chrome", "29": "chrome", "820": "firefox", "407": "chrome", "406": "chrome", "405": "chrome", "404": "chrome", "403": "chrome", "402": "chrome", "401": "chrome", "400": "chrome", "933": "firefox", "932": "firefox", "931": "firefox", "930": 
"firefox", "937": "safari", "452": "chrome", "409": "chrome", "408": "chrome", "453": "chrome", "414": "chrome", "183": "chrome", "415": "chrome", "379": "chrome", "378": "chrome", "228": "chrome", "829": "firefox", "828": "firefox", "371": "chrome", "370": "chrome", "373": "chrome", "372": "chrome", "375": "chrome", "374": "chrome", "377": "chrome", "376": "chrome", "708": "chrome", "709": "chrome", "704": "chrome", "705": "chrome", "706": "chrome", "707": "chrome", "700": "chrome", "144": "chrome", "702": "chrome", "703": "chrome", "393": "chrome", "392": "chrome", "88": "chrome", "89": "chrome", "397": "chrome", "396": "chrome", "395": "chrome", "394": "chrome", "82": "chrome", "83": "chrome", "80": "chrome", "81": "chrome", "86": "chrome", "87": "chrome", "84": "chrome", "85": "chrome", "797": "internetexplorer", "796": "internetexplorer", "795": "internetexplorer", "794": "internetexplorer", "793": "internetexplorer", "792": "internetexplorer", "791": "internetexplorer", "790": "internetexplorer", "749": "chrome", "799": "internetexplorer", "798": "internetexplorer", "7": "chrome", "170": "chrome", "586": "chrome", "587": "chrome", "584": "chrome", "585": "chrome", "582": "chrome", "583": "chrome", "580": "chrome", "581": "chrome", "588": "chrome", "589": "chrome", "245": "chrome", "244": "chrome", "247": "chrome", "246": "chrome", "241": "chrome", "614": "chrome", "243": "chrome", "242": "chrome", "615": "chrome", "249": "chrome", "248": "chrome", "418": "chrome", "419": "chrome", "519": "chrome", "518": "chrome", "511": "chrome", "510": "chrome", "513": "chrome", "512": "chrome", "515": "chrome", "514": "chrome", "517": "chrome", "516": "chrome", "623": "chrome", "622": "chrome", "621": "chrome", "620": "chrome", "627": "chrome", "626": "chrome", "625": "chrome", "624": "chrome", "450": "chrome", "451": "chrome", "629": "chrome", "628": "chrome", "454": "chrome", "455": "chrome", "456": "chrome", "457": "chrome", "179": "chrome", "178": "chrome", "177": 
"chrome", "199": "chrome", "175": "chrome", "174": "chrome", "173": "chrome", "172": "chrome", "171": "chrome", "198": "chrome", "977": "opera", "182": "chrome", "975": "opera", "974": "opera", "973": "opera", "972": "opera", "971": "opera", "970": "opera", "180": "chrome", "979": "opera", "978": "opera", "656": "chrome", "599": "chrome", "654": "chrome", "181": "chrome", "186": "chrome", "187": "chrome", "184": "chrome", "185": "chrome", "652": "chrome", "188": "chrome", "189": "chrome", "658": "chrome", "653": "chrome", "650": "chrome", "651": "chrome", "11": "chrome", "10": "chrome", "13": "chrome", "12": "chrome", "15": "chrome", "14": "chrome", "17": "chrome", "16": "chrome", "19": "chrome", "18": "chrome", "863": "firefox", "862": "firefox", "865": "firefox", "864": "firefox", "867": "firefox", "866": "firefox", "354": "chrome", "659": "chrome", "44": "chrome", "883": "firefox", "882": "firefox", "881": "firefox", "880": "firefox", "887": "firefox", "886": "firefox", "885": "firefox", "884": "firefox", "889": "firefox", "888": "firefox", "116": "chrome", "45": "chrome", "657": "chrome", "355": "chrome", "322": "chrome", "323": "chrome", "320": "chrome", "321": "chrome", "326": "chrome", "327": "chrome", "324": "chrome", "325": "chrome", "328": "chrome", "329": "chrome", "562": "chrome", "201": "chrome", "200": "chrome", "203": "chrome", "202": "chrome", "205": "chrome", "204": "chrome", "207": "chrome", "206": "chrome", "209": "chrome", "208": "chrome", "779": "internetexplorer", "778": "internetexplorer", "77": "chrome", "76": "chrome", "75": "chrome", "74": "chrome", "73": "chrome", "72": "chrome", "71": "chrome", "70": "chrome", "655": "chrome", "567": "chrome", "79": "chrome", "78": "chrome", "359": "chrome", "358": "chrome", "669": "chrome", "668": "chrome", "667": "chrome", "666": "chrome", "665": "chrome", "664": "chrome", "663": "chrome", "662": "chrome", "661": "chrome", "660": "chrome", "215": "chrome", "692": "chrome", "693": "chrome", "690": 
"chrome", "691": "chrome", "696": "chrome", "697": "chrome", "694": "chrome", "695": "chrome", "698": "chrome", "699": "chrome", "542": "chrome", "543": "chrome", "540": "chrome", "541": "chrome", "546": "chrome", "547": "chrome", "544": "chrome", "545": "chrome", "8": "chrome", "548": "chrome", "549": "chrome", "598": "chrome", "869": "firefox", "868": "firefox", "120": "chrome", "121": "chrome", "122": "chrome", "123": "chrome", "124": "chrome", "125": "chrome", "126": "chrome", "127": "chrome", "128": "chrome", "2": "chrome", "219": "chrome", "176": "chrome", "214": "chrome", "563": "chrome", "928": "firefox", "929": "firefox", "416": "chrome", "417": "chrome", "410": "chrome", "411": "chrome", "412": "chrome", "413": "chrome", "920": "firefox", "498": "chrome", "922": "firefox", "923": "firefox", "924": "firefox", "925": "firefox", "926": "firefox", "927": "firefox", "319": "chrome", "318": "chrome", "313": "chrome", "312": "chrome", "311": "chrome", "310": "chrome", "317": "chrome", "316": "chrome", "315": "chrome", "314": "chrome", "921": "firefox", "496": "chrome", "832": "firefox", "833": "firefox", "830": "firefox", "831": "firefox", "836": "firefox", "837": "firefox", "834": "firefox", "835": "firefox", "838": "firefox", "839": "firefox", "3": "chrome", "368": "chrome", "369": "chrome", "366": "chrome", "367": "chrome", "364": "chrome", "365": "chrome", "362": "chrome", "363": "chrome", "360": "chrome", "361": "chrome", "218": "chrome", "380": "chrome", "861": "firefox", "382": "chrome", "383": "chrome", "384": "chrome", "385": "chrome", "386": "chrome", "387": "chrome", "388": "chrome", "389": "chrome", "784": "internetexplorer", "785": "internetexplorer", "786": "internetexplorer", "787": "internetexplorer", "780": "internetexplorer", "781": "internetexplorer", "782": "internetexplorer", "381": "chrome", "788": "internetexplorer", "789": "internetexplorer", "860": "firefox", "151": "chrome", "579": "chrome", "578": "chrome", "150": "chrome", "573": 
"chrome", "572": "chrome", "571": "chrome", "570": "chrome", "577": "chrome", "576": "chrome", "575": "chrome", "574": "chrome", "60": "chrome", "61": "chrome", "62": "chrome", "259": "chrome", "64": "chrome", "65": "chrome", "66": "chrome", "67": "chrome", "68": "chrome", "253": "chrome", "250": "chrome", "251": "chrome", "256": "chrome", "257": "chrome", "254": "chrome", "255": "chrome", "499": "chrome", "157": "chrome", "156": "chrome", "939": "safari", "731": "chrome", "730": "chrome", "733": "chrome", "938": "safari", "735": "chrome", "734": "chrome", "508": "chrome", "736": "chrome", "506": "chrome", "738": "chrome", "504": "chrome", "505": "chrome", "502": "chrome", "503": "chrome", "500": "chrome", "501": "chrome", "630": "chrome", "631": "chrome", "632": "chrome", "633": "chrome", "469": "chrome", "468": "chrome", "636": "chrome", "637": "chrome", "465": "chrome", "464": "chrome", "467": "chrome", "466": "chrome", "461": "chrome", "900": "firefox", "463": "chrome", "462": "chrome", "901": "firefox", "168": "chrome", "169": "chrome", "164": "chrome", "165": "chrome", "166": "chrome", "167": "chrome", "160": "chrome", "161": "chrome", "162": "chrome", "163": "chrome", "964": "safari", "965": "safari", "966": "safari", "967": "safari", "960": "safari", "961": "safari", "962": "safari", "963": "safari", "783": "internetexplorer", "968": "safari", "969": "opera", "936": "firefox", "935": "firefox", "934": "firefox", "908": "firefox", "909": "firefox", "722": "chrome", "426": "chrome", "878": "firefox", "879": "firefox", "876": "firefox", "877": "firefox", "874": "firefox", "875": "firefox", "872": "firefox", "873": "firefox", "870": "firefox", "871": "firefox", "9": "chrome", "890": "firefox", "891": "firefox", "892": "firefox", "893": "firefox", "894": "firefox", "647": "chrome", "896": "firefox", "897": "firefox", "898": "firefox", "899": "firefox", "646": "chrome", "649": "chrome", "648": "chrome", "357": "chrome", "356": "chrome", "809": "internetexplorer", 
"808": "internetexplorer", "353": "chrome", "352": "chrome", "351": "chrome", "350": "chrome", "803": "internetexplorer", "802": "internetexplorer", "801": "internetexplorer", "800": "internetexplorer", "807": "internetexplorer", "806": "internetexplorer", "805": "internetexplorer", "804": "internetexplorer", "216": "chrome", "217": "chrome", "768": "chrome", "769": "chrome", "212": "chrome", "213": "chrome", "210": "chrome", "211": "chrome", "762": "chrome", "763": "chrome", "760": "chrome", "761": "chrome", "766": "chrome", "767": "chrome", "764": "chrome", "765": "chrome", "40": "chrome", "41": "chrome", "289": "chrome", "288": "chrome", "4": "chrome", "281": "chrome", "280": "chrome", "283": "chrome", "282": "chrome", "285": "chrome", "284": "chrome", "287": "chrome", "286": "chrome", "678": "chrome", "679": "chrome", "674": "chrome", "675": "chrome", "676": "chrome", "677": "chrome", "670": "chrome", "671": "chrome", "672": "chrome", "673": "chrome", "263": "chrome", "262": "chrome", "261": "chrome", "260": "chrome", "267": "chrome", "266": "chrome", "265": "chrome", "264": "chrome", "269": "chrome", "268": "chrome", "59": "chrome", "58": "chrome", "55": "chrome", "54": "chrome", "57": "chrome", "56": "chrome", "51": "chrome", "258": "chrome", "53": "chrome", "52": "chrome", "537": "chrome", "536": "chrome", "535": "chrome", "63": "chrome", "533": "chrome", "532": "chrome", "531": "chrome", "530": "chrome", "152": "chrome", "539": "chrome", "538": "chrome", "775": "internetexplorer", "774": "internetexplorer", "982": "opera", "983": "opera", "980": "opera", "981": "opera", "777": "internetexplorer", "984": "opera", "50": "chrome", "115": "chrome", "252": "chrome", "117": "chrome", "776": "internetexplorer", "111": "chrome", "110": "chrome", "113": "chrome", "69": "chrome", "771": "chrome", "119": "chrome", "118": "chrome", "770": "chrome", "773": "internetexplorer", "772": "internetexplorer", "429": "chrome", "428": "chrome", "534": "chrome", "919": "firefox", 
"918": "firefox", "915": "firefox", "914": "firefox", "917": "firefox", "916": "firefox", "911": "firefox", "910": "firefox", "913": "firefox", "912": "firefox", "308": "chrome", "309": "chrome", "855": "firefox", "300": "chrome", "301": "chrome", "302": "chrome", "303": "chrome", "304": "chrome", "305": "chrome", "306": "chrome", "307": "chrome", "895": "firefox", "825": "firefox", "824": "firefox", "827": "firefox", "847": "firefox", "846": "firefox", "845": "firefox", "844": "firefox", "843": "firefox", "842": "firefox", "841": "firefox", "840": "firefox", "821": "firefox", "853": "firefox", "849": "firefox", "848": "firefox", "823": "firefox", "822": "firefox", "240": "chrome", "390": "chrome", "732": "chrome", "753": "chrome", "752": "chrome", "751": "chrome", "750": "chrome", "757": "chrome", "756": "chrome", "755": "chrome", "754": "chrome", "560": "chrome", "561": "chrome", "759": "chrome", "758": "chrome", "564": "chrome", "565": "chrome", "566": "chrome", "701": "chrome", "739": "chrome", "229": "chrome", "507": "chrome", "227": "chrome", "226": "chrome", "225": "chrome", "224": "chrome", "223": "chrome", "222": "chrome", "221": "chrome", "220": "chrome", "114": "chrome", "391": "chrome", "726": "chrome", "727": "chrome", "724": "chrome", "725": "chrome", "568": "chrome", "723": "chrome", "720": "chrome", "721": "chrome", "728": "chrome", "729": "chrome", "605": "chrome", "604": "chrome", "607": "chrome", "606": "chrome", "601": "chrome", "600": "chrome", "603": "chrome", "602": "chrome", "159": "chrome", "158": "chrome", "112": "chrome", "609": "chrome", "608": "chrome", "976": "opera", "634": "chrome", "399": "chrome", "635": "chrome", "959": "safari", "958": "safari", "398": "chrome", "48": "chrome", "49": "chrome", "951": "safari", "950": "safari", "953": "safari", "952": "safari", "42": "chrome", "954": "safari", "957": "safari", "956": "safari", "638": "chrome", "5": "chrome", "639": "chrome", "460": "chrome", "489": "chrome", "488": "chrome", 
"487": "chrome", "486": "chrome", "485": "chrome", "484": "chrome", "483": "chrome", "482": "chrome", "481": "chrome", "480": "chrome", "509": "chrome", "955": "safari", "472": "chrome", "473": "chrome", "470": "chrome", "471": "chrome", "476": "chrome", "477": "chrome", "474": "chrome", "475": "chrome", "478": "chrome", "479": "chrome"}}
--------------------------------------------------------------------------------
/webscraper/webscraper.py:
--------------------------------------------------------------------------------
import argparse
import json
import os
import time
import uuid
from random import randint

from utils.extract_links_from_webpage import get_links
from utils.request_client import ReqestClient
from utils.url_utils import get_filtered_links
import config
from utils.redislite_utils import redis_cleanup, redis_client as redis

request_client = ReqestClient()


def write_url_data(url, response_text):
    """Dump a page body to DATA_DIR as JSON, in a file named by a UUID derived from the URL."""
    os.makedirs(config.DATA_DIR, exist_ok=True)
    file_name = str(uuid.uuid3(uuid.NAMESPACE_URL, str(url))) + ".json"
    file_path = os.path.join(config.DATA_DIR, file_name)
    # Skip pages that were already written in an earlier run.
    if not os.path.exists(file_path):
        with open(file_path, "w") as fp:
            json.dump({url: response_text}, fp)


class Websitescrap:
    def __init__(self, url, start_afresh=False):
        self.url = url
        if start_afresh:
            # Drop any earlier crawl state and seed the frontier with the start URL.
            redis.flushdb()
            redis.sadd("new_urls", url)

    def crawl(self, sleep_time_lower=30, sleep_time_upper=121):
        print("\ncrawling started\n")
        write_count = 0
        write_flag = 1
        # Keep going until the set of unvisited URLs is empty.
        while redis.scard("new_urls"):
            url = redis.spop("new_urls")
            redis.sadd("processed_urls", url)

            # Pause before each request; every 12th request may wait up to the full upper bound.
            if write_count % 12 == 0:
                time.sleep(randint(sleep_time_lower, sleep_time_upper))
            else:
                # Clamp so randint never gets an upper bound below the lower bound.
                time.sleep(randint(sleep_time_lower, max(sleep_time_lower, int(sleep_time_upper / 2))))

            if write_count % write_flag == 0:
                print('Processing %s' % url)

            write_count += 1

            response = request_client.request_with_proxy_header(url)
            # Skip URLs that could not be fetched or did not return HTTP 200.
            if not response or response.status_code != 200:
                continue

            # Write response.text to a JSON dump file.
            write_url_data(url, response.text)

            # Collect the links found on this page.
            local_urls = [*get_links(response.text, self.url).keys()]

            # Drop links to foreign sites and dummy links.
            local_urls = get_filtered_links(local_urls, self.url)

            # Queue every link that has not been processed yet.
            for link in local_urls:
                if not redis.sismember("processed_urls", link):
                    redis.sadd("new_urls", link)
            redis_cleanup(self.url)
            if write_count % write_flag == 0:
                print('Processed')


def str2bool(v):
    """Parse a command-line string such as 'yes' or '0' into a bool."""
    if isinstance(v, bool):
        return v
    if v.lower() in ('yes', 'true', 't', 'y', '1'):
        return True
    elif v.lower() in ('no', 'false', 'f', 'n', '0'):
        return False
    else:
        raise argparse.ArgumentTypeError('Boolean value expected.')


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('website_address')
    parser.add_argument("-s", "--start_afresh", help="whether to start a fresh crawl", required=False, default=True,
                        action='store', dest='start_afresh')

    args = parser.parse_args()
    website = args.website_address
    start_afresh = str2bool(args.start_afresh)

    if not website.startswith("http"):
        print("\033[91m {}\033[00m".format("Please include website scheme (http/https) in the provided address"))
        return

    scraper = Websitescrap(website, start_afresh=start_afresh)
    scraper.crawl(5, 18)


if __name__ == '__main__':
    main()

--------------------------------------------------------------------------------
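
The crawl loop in `webscraper.py` keeps two sets: URLs still to visit and URLs already processed. It sleeps for a random interval before each request. The sketch below is a dependency-free illustration of that pattern, not the project's implementation: plain Python sets stand in for the Redis sets, and `pick_delay`, `crawl_frontier`, `fetch_links` and `sleep` are hypothetical names introduced here. It also clamps the shorter delay range so that `random.randint` never receives an upper bound below its lower bound.

```python
import random
from typing import Callable, Iterable, List, Set


def pick_delay(count: int, lower: int, upper: int, long_every: int = 12) -> int:
    """Return a randomized pause in seconds (hypothetical helper).

    Every `long_every`-th request draws from the full [lower, upper] range;
    all other requests draw from [lower, max(lower, upper // 2)].
    """
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    if count % long_every == 0:
        return random.randint(lower, upper)
    # Clamp the upper bound so randint always gets a valid range.
    return random.randint(lower, max(lower, upper // 2))


def crawl_frontier(
    seed: str,
    fetch_links: Callable[[str], Iterable[str]],
    sleep: Callable[[float], None] = lambda seconds: None,
    lower: int = 0,
    upper: int = 0,
) -> List[str]:
    """Visit every URL reachable from `seed` once and return the visit order.

    In-memory sets stand in for the Redis sets "new_urls" and "processed_urls".
    `fetch_links` is an assumed callback that returns the links found on a page.
    """
    new_urls: Set[str] = {seed}
    processed: Set[str] = set()
    visited: List[str] = []
    count = 0
    while new_urls:
        url = new_urls.pop()
        processed.add(url)
        sleep(pick_delay(count, lower, upper))
        count += 1
        visited.append(url)
        # Queue only links that have not been processed yet.
        for link in fetch_links(url):
            if link not in processed:
                new_urls.add(link)
    return visited


# A tiny fake site: each key maps to the links found on that page.
site = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a"]}
order = crawl_frontier("a", lambda url: site.get(url, []))
```

Passing `time.sleep` as `sleep` together with non-zero bounds turns the no-op pause into a real one. Keeping the frontier in sets means a page that is linked from many places is still queued only once.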