├── .project
├── .pydevproject
├── PyTorStemPrivoxy.py
├── README.md
└── documentation
    └── Linux_TOR_Install.md
/.project:
--------------------------------------------------------------------------------
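1 | <?xml version="1.0" encoding="UTF-8"?>
2 | <projectDescription>
3 |     <name>PyTorStemPrivoxyTest</name>
4 |     <comment></comment>
5 |     <projects>
6 |     </projects>
7 |     <buildSpec>
8 |         <buildCommand>
9 |             <name>org.python.pydev.PyDevBuilder</name>
10 |             <arguments>
11 |             </arguments>
12 |         </buildCommand>
13 |     </buildSpec>
14 |     <natures>
15 |         <nature>org.python.pydev.pythonNature</nature>
16 |     </natures>
17 | </projectDescription>
18 |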
--------------------------------------------------------------------------------
/.pydevproject:
--------------------------------------------------------------------------------
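1 | <?xml version="1.0" encoding="UTF-8" standalone="no"?>
2 | <?eclipse-pydev version="1.0"?><pydev_project>
3 | <pydev_pathproperty name="org.python.pydev.PROJECT_SOURCE_PATH">
4 | <path>/${PROJECT_DIR_NAME}</path>
5 | </pydev_pathproperty>
6 | <pydev_property name="org.python.pydev.PYTHON_PROJECT_VERSION">python 2.7</pydev_property>
7 | <pydev_property name="org.python.pydev.PYTHON_PROJECT_INTERPRETER">python27</pydev_property>
8 | </pydev_project>
9 |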
--------------------------------------------------------------------------------
/PyTorStemPrivoxy.py:
--------------------------------------------------------------------------------
1 | '''
2 | Python script to connect to Tor via Stem and Privoxy, requesting a new connection (hence a new IP as well) as desired.
3 | '''
4 |
5 | import stem
6 | import stem.connection
7 |
8 | import time
9 | import urllib2
10 |
11 | from stem import Signal
12 | from stem.control import Controller
13 |
14 | # initialize some HTTP headers
15 | # for later usage in URL requests
16 | user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7'
17 | headers={'User-Agent':user_agent}
18 |
19 | # initialize some
20 | # holding variables
21 | oldIP = "0.0.0.0"
22 | newIP = "0.0.0.0"
23 |
24 | # how many IP addresses
25 | # through which to iterate?
26 | nbrOfIpAddresses = 3
27 |
28 | # seconds between
29 | # IP address checks
30 | secondsBetweenChecks = 2
31 |
32 | # request a URL
33 | def request(url):
34 |     # communicate with TOR via a local proxy (privoxy)
35 |     def _set_urlproxy():
36 |         proxy_support = urllib2.ProxyHandler({"http" : "127.0.0.1:8118"})
37 |         opener = urllib2.build_opener(proxy_support)
38 |         urllib2.install_opener(opener)
39 |
40 |     # request a URL
41 |     # via the proxy
42 |     _set_urlproxy()
43 |     request=urllib2.Request(url, None, headers)
44 |     return urllib2.urlopen(request).read()
45 |
46 | # signal TOR for a new connection
47 | def renew_connection():
48 |     with Controller.from_port(port = 9051) as controller:
49 |         controller.authenticate(password = 'my_password')
50 |         controller.signal(Signal.NEWNYM)
51 |         controller.close()
52 |
53 | # cycle through
54 | # the specified number
55 | # of IP addresses via TOR
56 | for i in range(0, nbrOfIpAddresses):
57 |
58 |     # if it's the first pass
59 |     if newIP == "0.0.0.0":
60 |         # renew the TOR connection
61 |         renew_connection()
62 |         # obtain the "new" IP address
63 |         newIP = request("http://icanhazip.com/")
64 |     # otherwise
65 |     else:
66 |         # remember the
67 |         # "new" IP address
68 |         # as the "old" IP address
69 |         oldIP = newIP
70 |         # refresh the TOR connection
71 |         renew_connection()
72 |         # obtain the "new" IP address
73 |         newIP = request("http://icanhazip.com/")
74 |
75 |     # zero the
76 |     # elapsed seconds
77 |     seconds = 0
78 |
79 |     # loop until the "new" IP address
80 |     # is different than the "old" IP address,
81 |     # as it may take the TOR network some
82 |     # time to effect a different IP address
83 |     while oldIP == newIP:
84 |         # sleep this thread
85 |         # for the specified duration
86 |         time.sleep(secondsBetweenChecks)
87 |         # track the elapsed seconds
88 |         seconds += secondsBetweenChecks
89 |         # obtain the current IP address
90 |         newIP = request("http://icanhazip.com/")
91 |         # signal that the program is still awaiting a different IP address
92 |         print ("%d seconds elapsed awaiting a different IP address." % seconds)
93 |     # output the
94 |     # new IP address
95 |     print ("")
96 |     print ("newIP: %s" % newIP)
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | PyTorStemPrivoxy
2 | ================
3 |
4 | Python, Tor, Stem, Privoxy program that requests new connections via Tor and thereby obtains new IP addresses as well.
5 |
6 | # Crawling Anonymously with Tor in Python #
7 |
8 | *adapted from the article "[Crawling anonymously with Tor in Python](http://sacharya.com/crawling-anonymously-with-tor-in-python/)" by S. Acharya, Nov 2, 2013.*
9 |
10 | The most common use case is hiding one's identity behind Tor, or changing identities programmatically, for example when crawling a website such as Google without being rate-limited or blocked by IP address.
11 |
12 | ## Tor ##
13 |
14 | Install Tor.
15 |
16 | ```shell
17 | sudo apt-get update
18 | sudo apt-get install tor
19 | sudo /etc/init.d/tor restart
20 | ```
21 |
22 | *Notice that the SOCKS listener is on port 9050.*
23 |
24 | Next, do the following:
25 |
26 | - Enable the ControlPort listener so that Tor listens on port 9051, the port on which Tor accepts commands from applications that talk to the Tor controller.
27 | - Hash a new password that prevents random access to the port by outside agents.
28 | - Implement cookie authentication as well.
29 |
30 | You can create a hashed password out of your password using:
31 |
32 | ```shell
33 | tor --hash-password my_password
34 | ```
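The same value can be produced without shelling out to `tor`. The sketch below assumes that Tor's hashed control password follows the iterated, salted S2K scheme of RFC 2440 (SHA-1, 8-byte salt, indicator byte `0x60`), which matches the layout of the example hash used in this guide. `tor_hash_password` is a hypothetical helper written for Python 3, so compare its output with `tor --hash-password` before relying on it.

```python
import binascii
import hashlib
import os


def tor_hash_password(password, salt=None):
    """Build a torrc-style '16:...' value with iterated, salted SHA-1."""
    if salt is None:
        salt = os.urandom(8)              # 8 random salt bytes
    if len(salt) != 8:
        raise ValueError("salt must be exactly 8 bytes")

    indicator = 0x60                      # assumed encoded iteration count
    exp_bias = 6
    # Decode the indicator into the number of bytes to feed into SHA-1.
    count = (16 + (indicator & 15)) << ((indicator >> 4) + exp_bias)

    material = salt + password.encode("utf-8")
    digest = hashlib.sha1()
    while count > 0:
        chunk = material[:count]          # the last round may be a partial chunk
        digest.update(chunk)
        count -= len(chunk)

    specifier = salt + bytes([indicator])
    hex_value = binascii.hexlify(specifier + digest.digest()).decode("ascii")
    return "16:" + hex_value.upper()
```

Calling `tor_hash_password("my_password")` returns a different string on every run because the salt is random; any of them should be usable as a `HashedControlPassword` value if the assumption above holds.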
35 |
36 | Then, update `/etc/tor/torrc` with the port, hashed password, and cookie authentication.
37 |
38 | ```shell
39 | sudo gedit /etc/tor/torrc
40 | ```
41 |
42 | ```shell
43 | ControlPort 9051
44 | # hashed password below is obtained via `tor --hash-password my_password`
45 | HashedControlPassword 16:E600ADC1B52C80BB6022A0E999A7734571A451EB6AE50FED489B72E3DF
46 | CookieAuthentication 1
47 | ```
48 |
49 | Restart Tor again so that the configuration changes are applied.
50 |
51 | ```shell
52 | sudo /etc/init.d/tor restart
53 | ```
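After the restart, it can be worth confirming that Tor is actually listening on both ports. The following is a minimal sketch using only the standard library, assuming Python 3 and a Tor instance on the local machine; `port_is_open` is a hypothetical helper name.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Ports taken from this guide: 9050 = Tor SOCKS, 9051 = Tor ControlPort.
    for name, port in (("Tor SOCKS", 9050), ("Tor ControlPort", 9051)):
        state = "listening" if port_is_open("127.0.0.1", port) else "not reachable"
        print("%s (port %d): %s" % (name, port, state))
```

A "not reachable" result for port 9051 usually means the `ControlPort` line is missing from `torrc` or Tor has not been restarted since the change.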
54 |
55 | ## python-stem ##
56 |
57 | Next, install `python-stem`, a Python module for interacting with the Tor controller. It lets us send commands to, and receive responses from, the Tor control port programmatically.
58 |
59 | ```shell
60 | sudo apt-get install python-stem
61 | ```
62 |
63 | ## privoxy ##
64 |
65 | Tor itself is not an HTTP proxy. To reach the Tor network over HTTP, use `privoxy` as an HTTP proxy that forwards through SOCKS5.
66 |
67 | Install `privoxy` via the following command:
68 |
69 | ```shell
70 | sudo apt-get install privoxy
71 | ```
72 |
73 | Now, tell `privoxy` to use Tor by routing all traffic through the SOCKS server at localhost port 9050.
74 |
75 | ```shell
76 | sudo gedit /etc/privoxy/config
77 | ```
78 |
79 | and enable `forward-socks5` as follows:
80 |
81 | ```shell
82 | # source https://stackoverflow.com/questions/9887505/how-to-change-tor-identity-in-python
83 | forward-socks5 / localhost:9050 . #dot is important at the end
84 | ```
85 |
86 | Restart `privoxy` after making the change to the configuration file.
87 |
88 | ```shell
89 | sudo /etc/init.d/privoxy restart
90 | ```
91 |
92 | ## Python Script ##
93 |
94 | In the script below, `urllib2` sends its requests through the proxy. `privoxy` listens on port 8118 by default and forwards the traffic to port 9050, on which the Tor SOCKS listener is waiting.
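The script targets Python 2.7, where `urllib2` is available. For readers on Python 3, where that module was folded into `urllib.request`, the same proxy setup might look like the following sketch. The helper names `build_privoxy_opener` and `fetch` are hypothetical, and proxying HTTPS as well as HTTP is an assumption that the original script does not make.

```python
import urllib.request

PRIVOXY_ADDRESS = "127.0.0.1:8118"   # Privoxy's default listener, as described above


def build_privoxy_opener(proxy_address=PRIVOXY_ADDRESS):
    """Return an opener that sends HTTP and HTTPS requests through Privoxy."""
    proxy_support = urllib.request.ProxyHandler({
        "http": proxy_address,
        "https": proxy_address,      # assumption: HTTPS should be proxied too
    })
    return urllib.request.build_opener(proxy_support)


def fetch(url, opener, user_agent="Mozilla/5.0", timeout=30):
    """Fetch url through the given opener and return the body as stripped text."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with opener.open(req, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

With Tor and Privoxy running, `fetch("http://icanhazip.com/", build_privoxy_opener())` should return the address of the current Tor exit rather than your own. Returning an opener instead of calling `install_opener` keeps the proxy setting local to the code that asks for it.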
95 |
96 | Additionally, the `renew_connection()` function sends a signal to the Tor controller to change the identity, so you get new identities without restarting Tor. This comes in handy when crawling a website and one doesn't want to be blocked based on IP address.
97 |
98 | **[PyTorStemPrivoxy.py](https://gist.github.com/KhepryQuixote/46cf4f3b999d7f658853#file-pytorstemprivoxy-py)**
99 |
100 | ```python
101 |
102 | '''
103 | Python script to connect to Tor via Stem and Privoxy, requesting a new connection (hence a new IP as well) as desired.
104 | '''
105 |
106 | import stem
107 | import stem.connection
108 |
109 | import time
110 | import urllib2
111 |
112 | from stem import Signal
113 | from stem.control import Controller
114 |
115 | # initialize some HTTP headers
116 | # for later usage in URL requests
117 | user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7'
118 | headers={'User-Agent':user_agent}
119 |
120 | # initialize some
121 | # holding variables
122 | oldIP = "0.0.0.0"
123 | newIP = "0.0.0.0"
124 |
125 | # how many IP addresses
126 | # through which to iterate?
127 | nbrOfIpAddresses = 3
128 |
129 | # seconds between
130 | # IP address checks
131 | secondsBetweenChecks = 2
132 |
133 | # request a URL
134 | def request(url):
135 |     # communicate with TOR via a local proxy (privoxy)
136 |     def _set_urlproxy():
137 |         proxy_support = urllib2.ProxyHandler({"http" : "127.0.0.1:8118"})
138 |         opener = urllib2.build_opener(proxy_support)
139 |         urllib2.install_opener(opener)
140 |
141 |     # request a URL
142 |     # via the proxy
143 |     _set_urlproxy()
144 |     request=urllib2.Request(url, None, headers)
145 |     return urllib2.urlopen(request).read()
146 |
147 | # signal TOR for a new connection
148 | def renew_connection():
149 |     with Controller.from_port(port = 9051) as controller:
150 |         controller.authenticate(password = 'my_password')
151 |         controller.signal(Signal.NEWNYM)
152 |         controller.close()
153 |
154 | # cycle through
155 | # the specified number
156 | # of IP addresses via TOR
157 | for i in range(0, nbrOfIpAddresses):
158 |
159 |     # if it's the first pass
160 |     if newIP == "0.0.0.0":
161 |         # renew the TOR connection
162 |         renew_connection()
163 |         # obtain the "new" IP address
164 |         newIP = request("http://icanhazip.com/")
165 |     # otherwise
166 |     else:
167 |         # remember the
168 |         # "new" IP address
169 |         # as the "old" IP address
170 |         oldIP = newIP
171 |         # refresh the TOR connection
172 |         renew_connection()
173 |         # obtain the "new" IP address
174 |         newIP = request("http://icanhazip.com/")
175 |
176 |     # zero the
177 |     # elapsed seconds
178 |     seconds = 0
179 |
180 |     # loop until the "new" IP address
181 |     # is different than the "old" IP address,
182 |     # as it may take the TOR network some
183 |     # time to effect a different IP address
184 |     while oldIP == newIP:
185 |         # sleep this thread
186 |         # for the specified duration
187 |         time.sleep(secondsBetweenChecks)
188 |         # track the elapsed seconds
189 |         seconds += secondsBetweenChecks
190 |         # obtain the current IP address
191 |         newIP = request("http://icanhazip.com/")
192 |         # signal that the program is still awaiting a different IP address
193 |         print ("%d seconds elapsed awaiting a different IP address." % seconds)
194 |     # output the
195 |     # new IP address
196 |     print ("")
197 |     print ("newIP: %s" % newIP)
198 |
198 |
199 | ```
200 |
201 | Execute the Python 2.7 script above via the following command:
202 |
203 | ```shell
204 | python PyTorStemPrivoxy.py
205 | ```
206 |
207 | When the above script is executed, one should see that the IP address is changing every few seconds.
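The script's inner `while` loop keeps polling until the address changes, which can run indefinitely, since a new circuit is not guaranteed to use a different exit relay. A bounded variant might look like the sketch below. It assumes Python 3, and `wait_for_new_ip` and its parameters are hypothetical names. The lookup function is passed in, so the logic can be exercised without a live Tor connection.

```python
import time


def wait_for_new_ip(fetch_ip, old_ip, interval=2.0, timeout=60.0, sleep=time.sleep):
    """Poll fetch_ip() until it differs from old_ip; return None on timeout."""
    waited = 0.0
    while True:
        current = fetch_ip().strip()   # drop any trailing newline in the response
        if current != old_ip:
            return current
        if waited >= timeout:
            return None                # give up instead of looping forever
        sleep(interval)
        waited += interval
```

In the script above, the call would take the place of the `while oldIP == newIP` loop, with `fetch_ip` wrapping `request("http://icanhazip.com/")`. A `None` result signals that the caller should send another `NEWNYM` signal or stop.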
208 |
209 |
210 |
211 | ## Adaptations to the original article ##
212 |
213 | - *tweaks of grammar.*
214 | - *the use of `python-stem` instead of `pytorctl`.*
215 | - *a slight difference of settings within the `/etc/tor/torrc` file.*
216 | - *the use of a different hashed password for the Tor controller, in this case `my_password`.*
217 | - *some modifications in the sample program to accommodate the use of `python-stem`, cleaner logic, and more comprehensive commentary.*
218 |
--------------------------------------------------------------------------------
/documentation/Linux_TOR_Install.md:
--------------------------------------------------------------------------------
1 | ## Linux TOR Install ##
2 |
3 | - 7zip
4 | - p7zip
5 | - p7zip-full
6 | - `sudo apt-get install p7zip p7zip-full`
7 | - nautilus
8 | - nautilus-open-terminal
9 | - `sudo apt-get install nautilus-open-terminal`
10 | - browsers
11 | - firefox
12 | - chromium-browser
13 | - `sudo apt-get install chromium-browser firefox`
14 | - cryptography
15 | - gnupg
16 | - gpa
17 | - kleopatra
18 | - seahorse
19 | - seahorse-nautilus
20 | - rng-tools
21 | - This package may be required on virtual machines for the following reasons:
22 | - Key generation requires the system to work with a source of random numbers. Systems which are better at generating random numbers than others are said to have higher entropy. This is typically obtained from the system hardware; the GnuPG documentation recommends that keys be generated only on a local machine (i.e. not one being accessed across a network), and that keyboard, mouse and disk activity be maximized during key generation to increase the entropy of the system.
23 | - Unfortunately, there are some scenarios - for example, on virtual machines which don’t have real hardware - where insufficient entropy causes key generation to be extremely slow. If you come across this problem, you should investigate means of increasing the system entropy. On virtualized Linux systems, this can often be achieved by installing the rng-tools package. This is available at least on RPM-based and APT-based systems (Red Hat/Fedora, Debian, Ubuntu and derivative distributions).
24 | - haveged
25 | - Installing this may assist in obtaining the needed entropy to make GPG key generation run within an acceptable timeframe.
26 | - `sudo apt-get install gnupg gpa kleopatra seahorse seahorse-nautilus rng-tools haveged`
27 | - databases
28 | - sqlite
29 | - sqlite3
30 | - sqlitebrowser
31 | - sqliteman
32 | - sqliteman-doc
33 | - libqt4-dev
34 | - libsqlite3-dev
35 | - sqlite3-doc
36 | - `sudo apt-get install sqlite3 sqlitebrowser sqliteman sqliteman-doc libsqlite3-dev libqt4-dev sqlite3-doc`
37 | - [sqlitestudio (Linux 64-bit)](http://sqlitestudio.pl/files/free/stable/linux64/sqlitestudio-2.1.5.bin)
38 | - downloaders
39 | - filezilla
40 | - transmission (a BitTorrent client)
41 | - transmission-gtk
42 | - `sudo apt-get install filezilla transmission transmission-gtk`
43 | - editors
44 | - gedit (already installed on many desktop distributions)
45 | - retext (Markdown editor)
46 | - `sudo apt-get install gedit retext`
47 | - folders
48 | - ~/data
49 | - voters
50 | - nc (these are North Carolina voter files suitable for use as "test" files)
51 | - ~/projects
52 | - python *(this is the "workspace" folder for the Spyder IDE)*
53 | - ~/temp
54 | - python
55 | - python 2.7
56 | - libraries
57 | - python
58 | - python-all
59 | - python-dev
60 | - python-bcrypt
61 | - python-configparser
62 | - python-crypto
63 | - python-gdal
64 | - python-gnupg
65 | - python-iniparse
66 | - python-pip
67 | - python-numpy
68 | - python-pandas
69 | - python-pyodbc
70 | - python-pysqlite2
71 | - python-socksipy
72 | - python-sphinx
73 | - python-stem
74 | - python-torctl
75 | - python-xlrd
76 | - python-zmq
77 | - `sudo apt-get install python python-all python-dev python-bcrypt python-configparser python-crypto python-gdal python-gnupg python-iniparse python-numpy python-pyodbc python-pandas python-pip python-pysqlite2 python-socksipy python-sphinx python-stem python-torctl python-xlrd python-zmq`
78 | - connectors
79 | - ***NOTE: add freetds.conf file to the {home} folder of the user***
80 | - [global]
81 | - tds version = auto
82 | - freetds-bin
83 | - freetds-common
84 | - freetds-dev
85 | - libdbd-freetds
86 | - python-pymssql
87 | - python-mysql.connector
88 | - python-pysqlite2 (for SQLite 3)
89 | - python-pysqlite2-doc
90 | - tdsodbc
91 | - unixodbc
92 | - unixodbc-dev
93 | - `sudo apt-get install freetds-bin freetds-common freetds-dev libdbd-freetds python-pymssql python-mysql.connector python-pysqlite2 tdsodbc unixodbc unixodbc-dev`
94 | - documentation
95 | - python-doc
96 | - python-pysqlite2-doc
97 | - python-numpy-doc
98 | - `sudo apt-get install python-doc python-pysqlite2-doc python-numpy-doc`
99 | - ides
100 | - spyder (Python IDE)
101 | - `sudo apt-get install spyder`
102 | - `sudo pip install --upgrade spyder`
103 | - [Spyder 2.3.0](https://pypi.python.org/packages/source/s/spyder/spyder-2.3.0.zip#md5=7c99e0bc6485b0700f9570201282a139)
104 | - python 3.4
105 | - libraries
106 | - python3
107 | - python3-all
108 | - python3-dev
109 | - python3-bcrypt
110 | - *python3-configparser (now integrated into Python 3)*
111 | - python3-crypto
112 | - python3-gdal
113 | - python3-gnupg
114 | - *python3-iniparse (not present yet)*
115 | - python3-pip
116 | - python3-numpy
117 | - python3-pandas
118 | - *python3-pyodbc (not present yet)*
119 | - *python3-socksipy (not present yet)*
120 | - python3-sphinx
121 | - python3-stem
122 | - *python3-torctl (not present yet)*
123 | - python3-xlrd
124 | - python3-zmq
125 | - `sudo apt-get install python3 python3-all python3-dev python3-bcrypt python3-crypto python3-gdal python3-gnupg python3-numpy python3-pandas python3-pip python3-sphinx python3-stem python3-xlrd python3-zmq`
126 | - connectors
127 | - ***NOTE: add freetds.conf file to the {home} folder of the user***
128 | - [global]
129 | - tds version = auto
130 | - freetds-bin
131 | - freetds-common
132 | - freetds-dev
133 | - libdbd-freetds
134 | - python3-pymssql (not present yet)
135 | - python3-mysql.connector
136 | - tdsodbc
137 | - unixodbc
138 | - unixodbc-dev
139 | - `sudo apt-get install freetds-bin freetds-common freetds-dev libdbd-freetds python3-mysql.connector tdsodbc unixodbc unixodbc-dev`
140 | - documentation
141 | - python3-doc
142 | - `sudo apt-get install python3-doc`
143 | - ides
144 | - spyder3 (Python IDE)
145 | - `sudo apt-get install spyder3`
146 | - `sudo pip3 install --upgrade spyder`
147 | - source code managers
148 | - git
149 | - rabbitvcs-cli
150 | - rabbitvcs-gedit
151 | - rabbitvcs-nautilus
152 | - rabbitvcs-core
153 | - `sudo apt-get install rabbitvcs-cli rabbitvcs-gedit rabbitvcs-nautilus rabbitvcs-core`
154 | - mercurial
155 | - mercurial
156 | - tortoisehg-nautilus
157 | - tortoisehg
158 | - `sudo apt-get install mercurial tortoisehg-nautilus tortoisehg`
159 | - tor
160 | - tor
161 | - tor-geoipdb
162 | - vidalia
163 | - torchat
164 | - privoxy
165 | - `sudo apt-get install tor tor-geoipdb torchat privoxy vidalia`
166 | - To compensate for AppArmor interfering with Vidalia when it is invoked:
167 | - `sudo ln -s /etc/apparmor.d/usr.bin.vidalia /etc/apparmor.d/disable/`
168 | - `sudo apparmor_parser -R /etc/apparmor.d/usr.bin.vidalia`
169 | - `sudo /etc/init.d/tor start` or `sudo /etc/init.d/tor restart`
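The `rng-tools` and `haveged` notes in the cryptography section concern low entropy on virtual machines. One way to see whether that applies is to read the kernel's entropy estimate before generating a key. This is a sketch, assuming Python 3 and a Linux system that exposes `/proc/sys/kernel/random/entropy_avail`; `available_entropy` is a hypothetical helper, and recent kernels may report a fixed value here.

```python
ENTROPY_PATH = "/proc/sys/kernel/random/entropy_avail"   # Linux-specific interface


def available_entropy(path=ENTROPY_PATH):
    """Return the kernel's entropy estimate in bits, or None if it cannot be read."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        # Missing file (non-Linux system) or unexpected content.
        return None
```

A persistently low value while GPG key generation stalls suggests that installing `rng-tools` or `haveged` may help.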
170 |
--------------------------------------------------------------------------------