634 |
635 | This program is free software: you can redistribute it and/or modify
636 | it under the terms of the GNU Affero General Public License as published
637 | by the Free Software Foundation, either version 3 of the License, or
638 | (at your option) any later version.
639 |
640 | This program is distributed in the hope that it will be useful,
641 | but WITHOUT ANY WARRANTY; without even the implied warranty of
642 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
643 | GNU Affero General Public License for more details.
644 |
645 | You should have received a copy of the GNU Affero General Public License
646 | along with this program. If not, see <https://www.gnu.org/licenses/>.
647 |
648 | Also add information on how to contact you by electronic and paper mail.
649 |
650 | If your software can interact with users remotely through a computer
651 | network, you should also make sure that it provides a way for users to
652 | get its source. For example, if your program is a web application, its
653 | interface could display a "Source" link that leads users to an archive
654 | of the code. There are many ways you could offer source, and different
655 | solutions will be better for different programs; see section 13 for the
656 | specific requirements.
657 |
658 | You should also get your employer (if you work as a programmer) or school,
659 | if any, to sign a "copyright disclaimer" for the program, if necessary.
660 | For more information on this, and how to apply and follow the GNU AGPL, see
661 | <https://www.gnu.org/licenses/>.
662 |
663 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # HTTPS Proxy
2 |
3 | This is a simple proxy server using [request.js](https://github.com/request/request) and
4 | [Node.js](https://nodejs.org/)' builtin [HTTP library](https://nodejs.org/api/http.html)
5 | to rewrite incoming requests for sites using the HTTP protocol to use HTTPS.
6 |
7 | It achieves this using the [Electronic Frontier Foundation](https://www.eff.org/)'s
8 | [HTTPS Everywhere](https://github.com/EFForg/https-everywhere) ruleset
9 | ([official site](https://www.eff.org/HTTPS-EVERYWHERE)) and part of the codebase.
10 | Below you will find a copy of the licensing information for the HTTPS Everywhere
11 | project. We ask that you kindly observe and respect existing copyrights and licensing
12 | terms relevant to both the MIT license used by EFF as well as the GNU Affero GPL v3
13 | license used by eQualit.ie.
14 |
15 | # Using this software
16 |
17 | ## Dependencies
18 |
19 | ### Node.js and NPM
20 |
21 | You can download Node.js and NPM together directly from the [official site](https://nodejs.org/download/).
22 |
23 | ### HTTPS Everywhere rulesets
24 |
25 | Before you can use HTTPS-Proxy, HTTPS Everywhere's rulesets must first be downloaded and compiled
26 | into a single XML file. The `fetchrules` script handles both steps automatically.
27 |
28 | ```bash
29 | sh tools/fetchrules.sh
30 | ```
31 | ### Libraries
32 |
33 | Next you must install the dependencies HTTPS-Proxy relies on.
34 |
35 | ```bash
36 | npm install
37 | ```
38 |
39 | ## Configuration
40 |
41 | Configuration settings for HTTPS-Proxy are provided in `config.js`. Below are short explanations
42 | for each of the available configuration options. Note that most of the settings pertain to
43 | options for the [request](https://github.com/request/request) library, which you can read more
44 | about [in the request README](https://github.com/request/request#requestoptions-callback).
45 |
46 | * `port` - The port number to have HTTPS-Proxy listen on
47 | * `address` - The IP address HTTPS-Proxy should bind to
48 | * `rewritePages` - Whether or not HTTPS-Proxy should rewrite HTTP URLs to HTTPS in responses it receives
49 | * `aggressive` - When true, HTTPS-Proxy will overwrite URLs in all responses, otherwise only in text (html, css, ...)
50 | * `followRedirect` - Whether or not HTTPS-Proxy should follow HTTP 3xx responses as redirects
51 | * `followAllRedirects` - Whether HTTPS-Proxy should also follow redirects for non-GET requests
52 | * `maxRedirects` - The maximum number of redirects HTTPS-Proxy should follow before returning the last response
53 | * `useProxy` - Whether or not HTTPS-Proxy should use another proxy to send requests through
54 | * `proxy` - The URI of the proxy to use. Only applies if `useProxy` is true
55 | * `strictSSL` - Whether or not SSL certificate validity should be strictly enforced
56 | * `useTunnel` - Whether or not HTTPS-Proxy should tunnel CONNECT requests and websocket data
57 | * `tunnel` - The settings for the tunnel
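As a rough illustration of how these options fit together, the sketch below layers a few overrides on top of defaults that mirror the shipped `config.js`, and checks the one dependency the list mentions: `proxy` only matters when `useProxy` is true. The `withOverrides` helper and the upstream proxy address are hypothetical and not part of HTTPS-Proxy.

```javascript
// Defaults copied from the shipped config.js.
const defaults = {
  port: 5641,
  address: '127.0.0.1',
  rewritePages: true,
  aggressive: false,
  followRedirect: true,
  followAllRedirects: false,
  maxRedirects: 5,
  useProxy: false,
  proxy: null,
  strictSSL: true,
  useTunnel: false,
  tunnel: null
};

// Hypothetical helper: merge overrides onto the defaults and reject
// a configuration that enables useProxy without naming a proxy URI.
function withOverrides(overrides) {
  const merged = Object.assign({}, defaults, overrides);
  if (merged.useProxy && !merged.proxy) {
    throw new Error('useProxy is true but no proxy URI was given');
  }
  return merged;
}

// Example: send outgoing requests through an assumed upstream proxy.
const config = withOverrides({
  useProxy: true,
  proxy: 'http://127.0.0.1:3128' // placeholder address, not a real default
});
console.log(config.proxy);
```

In the real project you would edit `config.js` directly; the helper only makes the relationship between `useProxy` and `proxy` explicit.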
58 |
59 | ## Running
60 |
61 | After you have downloaded the HTTPS Everywhere rulesets, installed the required dependencies,
62 | and changed any configuration settings you'd like, running HTTPS-Proxy is very simple.
63 |
64 | ```bash
65 | npm start
66 | ```
67 |
68 | ## Testing
69 |
70 | HTTPS-Proxy's unit tests can be run from the `HTTPS-Proxy/` directory with the following command:
71 |
72 | ```bash
73 | npm test
74 | ```
75 |
76 | ### Okay, I know the tests pass, but how do I know I'm secure?
77 |
78 | If you would like to verify that HTTPS-Proxy is doing its job and rewriting the URLs of requests you
79 | proxy through it, run the following commands.
80 |
81 | ```bash
82 | npm start & # Start HTTPS-Proxy in the background if you haven't already. Assumes the default port 5641
83 | curl -i -x http://127.0.0.1:5641 http://reddit.com > download1
84 | curl -i http://reddit.com > download2
85 | /usr/bin/diff -y download1 download2
86 | ```
87 |
88 | Sites like Reddit are served only over HTTPS, so a plain HTTP request, as in the second `curl`,
89 | is answered with a redirect instead of the page. When you run `diff`, you should see the headers
90 | of the first request, which was rewritten to use HTTPS, on the left, and the headers of the
91 | request that wasn't rewritten on the right.
92 |
93 | ```
94 | HTTP/1.1 200 OK | HTTP/1.1 301 Moved Permanently
95 | server: cloudflare-nginx | Date: Thu, 27 Aug 2015 21:16:24 GMT
96 | date: Thu, 27 Aug 2015 21:16:17 GMT | Transfer-Encoding: chunked
97 | content-type: text/html; charset=UTF-8 | Connection: keep-alive
98 | transfer-encoding: chunked | Set-Cookie: __cfduid=d49ff83163d2fa4e5151df7c33d3034181440710
99 | connection: close | Location: https://www.reddit.com/
100 | set-cookie: __cfduid=d1e297e4e5de5b53248b2b591758c67db14 | X-Content-Type-Options: nosniff
101 | x-ua-compatible: IE=edge | Server: cloudflare-nginx
102 | x-frame-options: SAMEORIGIN | CF-RAY: 21cacc1b97ca0f9f-YYZ
103 | x-content-type-options: nosniff <
104 | x-xss-protection: 1; mode=block <
105 | vary: accept-encoding <
106 | cache-control: max-age=0, must-revalidate <
107 | x-moose: majestic <
108 | strict-transport-security: max-age=15552000; includeSubD <
109 | cf-cache-status: EXPIRED <
110 | cf-ray: 21cacbdc5e880fab-YYZ <
111 | ```
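If you would rather script this check in Node.js than use `curl`, the sketch below shows one way to build the equivalent of `curl -x` for Node's built-in `http.request`. The `proxiedRequestOptions` helper is hypothetical. It assumes HTTPS-Proxy is listening on the default `127.0.0.1:5641` and a Node.js version that provides the global `URL` class.

```javascript
// Build options for http.request() that send `targetUrl` through a plain
// HTTP forward proxy, as `curl -x` does: connect to the proxy, put the
// absolute URL in the request line, and name the real site in Host.
function proxiedRequestOptions(targetUrl, proxyHost, proxyPort) {
  return {
    host: proxyHost,
    port: proxyPort,
    method: 'GET',
    path: targetUrl,
    headers: { Host: new URL(targetUrl).host }
  };
}

const options = proxiedRequestOptions('http://reddit.com/', '127.0.0.1', 5641);
console.log(options);

// To actually send the request (requires HTTPS-Proxy to be running):
//   require('http').request(options, function (res) {
//     console.log(res.statusCode, res.headers);
//   }).end();
```

The network call is left commented out so the snippet runs without a proxy; uncomment it to compare the status code and headers with the `diff` output above.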
112 |
113 | ### That's pretty cool, but what about the links in a page?
114 |
115 | If you haven't changed the `rewritePages` configuration option from `true` to `false`,
116 | HTTPS-Proxy will also rewrite any URLs it finds in pages using HTTP to HTTPS where there's
117 | a rule for that URL. You can test that it's working properly using a simple HTTP server
118 | shipped with [Python](https://docs.python.org/2/library/simplehttpserver.html) and a simple
119 | HTML document contained in the HTTPS-Proxy tests.
120 |
121 | Load up two terminals.
122 |
123 | In your first terminal:
124 |
125 | ```bash
126 | npm start& # Start HTTPS-Proxy if you haven't already
127 | cd tests
128 | python -m SimpleHTTPServer 8080 # Python 2; on Python 3 use: python3 -m http.server 8080
129 | ```
130 |
131 | In your second terminal:
132 |
133 | ```html
134 | curl http://127.0.0.1:8080/contentrewrite.html
135 |
136 |
137 |
138 | HTTPS-Proxy Test
139 |
140 |
141 |
142 | This anchor's URL should be rewritten
143 |
144 |
145 |
146 |
147 | curl -x http://127.0.0.1:5641 http://127.0.0.1:8080/contentrewrite.html
148 |
149 |
150 |
151 | HTTPS-Proxy Test
152 |
153 |
154 |
155 | This anchor's URL should be rewritten
156 |
157 |
158 |
159 | ```
160 |
161 | As you can see, in the output from the first `curl` where we didn't proxy
162 | through HTTPS-Proxy, we just got the contents of `contentrewrite.html` as
163 | they are. In the second case, we do proxy through HTTPS-Proxy and we can
164 | see that the URL to Reddit in the anchor tag has been rewritten to use HTTPS!
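The rewriting itself follows the pattern used by `RuleSet.prototype.apply` in `rewriter.js`: skip URLs covered by an exclusion, then return the result of the first rule whose pattern changes the URL. Below is a minimal sketch of that idea. The rule and exclusion are made up for illustration and are not taken from the real HTTPS Everywhere ruleset.

```javascript
// Hypothetical rule and exclusion, written in the from/to style of the
// HTTPS Everywhere rulesets; not copied from the actual ruleset file.
const rules = [
  { from: /^http:\/\/(www\.)?reddit\.com\//, to: 'https://www.reddit.com/' }
];
const exclusions = [/^http:\/\/www\.reddit\.com\/static\//];

// Return the rewritten URL, or null when the URL is excluded or no rule fires.
function applyRules(url) {
  for (const pattern of exclusions) {
    if (pattern.test(url)) return null;
  }
  for (const rule of rules) {
    const rewritten = url.replace(rule.from, rule.to);
    if (rewritten !== url) return rewritten;
  }
  return null;
}

console.log(applyRules('http://reddit.com/r/node'));              // https://www.reddit.com/r/node
console.log(applyRules('http://www.reddit.com/static/logo.png')); // null (excluded)
console.log(applyRules('http://example.com/'));                   // null (no rule matches)
```

Returning `null` instead of the unchanged URL lets the caller tell "no rule applied" apart from "rewritten".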
165 |
166 | # HTTPS Everywhere license
167 |
168 | HTTPS Everywhere:
169 | Copyright © 2010-2012 Mike Perry
170 | Peter Eckersley
171 | and many others
172 | (Licensed GPL v2+)
173 |
174 | Incorporating code from NoScript,
175 | Copyright © 2004-2007 Giorgio Maone
176 | Licensed GPL v2+
177 |
178 | Incorporating code from Convergence
179 | Copyright © Moxie Marlinspike
180 | Licensed GPL v3+
181 |
182 | Incorporating code from URI.js
183 | Copyright © Rodney Rehm
184 | Licensed MIT, GPL V3
185 |
186 | Incorporating code from js-lru
187 | Copyright © 2010 Rasmus Andersson
188 | Licensed MIT
189 |
190 | The build system incorporates code from Python 2.6,
191 | Copyright © 2001-2006 Python Software Foundation
192 | Python Software Foundation License Version 2
193 |
194 | Net License: GPL v3+ (complete tree)
195 | GPL v2+ (if Moxie's NSS.js is absent)
196 |
197 |
198 | Text of MIT License:
199 | ====================
200 | Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
201 |
202 | The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
203 |
204 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
205 |
--------------------------------------------------------------------------------
/config.js:
--------------------------------------------------------------------------------
1 | // Rather than parsing a JSON file, configuration for the HTTPS proxy will
2 | // be handled in good old javascript.
3 | // As you can see, at the most basic level, that doesn't change much.
4 | module.exports = {
5 | port: 5641, // reversed(str(sum(map(ord, 'HTTPSEverywhere'))))
6 | address: '127.0.0.1',
7 | rewritePages: true, // Rewrite the URLs found in responses to HTTPS
8 | aggressive: false, // Rewrite the URLs in *all* responses instead of just text
9 |
10 | // Configuration options for requests. See
11 | // https://github.com/request/request#requestoptions-callback
12 | followRedirect: true,
13 | followAllRedirects: false,
14 | maxRedirects: 5,
15 | useProxy: false,
16 | proxy: null, // Doesn't get set if useProxy is false
17 | strictSSL: true,
18 | useTunnel: false,
19 | tunnel: null // Doesn't get set if useTunnel is false
20 | };
21 |
--------------------------------------------------------------------------------
/index.js:
--------------------------------------------------------------------------------
1 | /**
2 | * A simple HTTP server that will forward incoming requests for sites
3 | * using the HTTP protocol to an HTTPS version if there is such a known
4 | * version in the HTTPS Everywhere ruleset library.
5 | * Information that gets passed on includes:
6 | * 1. Headers
7 | * 2. Request method
8 | * 3. Request body
9 | */
10 |
11 | var http = require('http');
12 | var request = require('request');
13 | var concatStream = require('concat-stream');
14 | var config = require('./config');
15 | var HttpsRewriter = require('./rewriter').HttpsRewriter;
16 |
17 | // It takes a little time to load up the rulesets, so it's best
18 | // to do it once at startup.
19 | console.log('Loading rulesets');
20 | var rewriter = new HttpsRewriter();
21 |
22 | // A list of the types of requests (by method) that can have bodies we want to forward.
23 | // See wikipedia for a list of all the existing HTTP request methods
24 | // https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol#Request_methods
25 | const CAN_HAVE_BODY = [
26 | 'POST',
27 | 'PUT',
28 | 'PATCH'
29 | ];
30 |
31 | /**
32 | * Help handle a POST request or other with a request body
33 | * by accumulating the body contents as they are received.
34 | * @param {http.IncomingMessage} req - The incoming request object
35 | * @param {function} callback - A callback invoked with any error and the body
36 | */
37 | function handleBody(req, callback) {
38 | req.on('error', function (err) {
39 | callback(err, null);
40 | });
41 | var concat = concatStream(function (data) {
42 | callback(null, data);
43 | });
44 | req.pipe(concat);
45 | }
46 |
47 | /**
48 | * Produce an `options` object for the outgoing request based on
49 | * data received by our own HTTP server and the provided config.
50 | * @param {string} url - The URL to request
51 | * @param {object} headers - An object containing key-value headers
52 | * @param {string} method - The HTTP method to use, e.g. get, post, ...
53 | */
54 | function requestOptions(url, headers, method) {
55 | var options = {
56 | url: url,
57 | headers: headers,
58 | method: method.toUpperCase(),
59 | followRedirect: config.followRedirect,
60 | followAllRedirects: config.followAllRedirects,
61 | maxRedirects: config.maxRedirects,
62 | strictSSL: config.strictSSL,
63 | gzip: true,
64 | encoding: null
65 | };
66 | if (config.useTunnel) {
67 | options.tunnel = config.tunnel;
68 | }
69 | if (config.useProxy) {
70 | options.proxy = config.proxy;
71 | }
72 | return options;
73 | }
74 |
75 | /**
76 | * Report that an error occurred.
77 | * Sets the status code to 500 and writes the error message as the body.
78 | * @param {http.ServerResponse} res - The outgoing response object
79 | * @param {error} error - The error object received
80 | */
81 | function reportError(res, error) {
82 | res.statusCode = 500;
83 | res.write(error.message);
84 | res.end();
85 | }
86 |
87 | /**
88 | * Copy the headers from a response made with `request` to a response object for
89 | * the proxy server. This function is really only here to make `forwardRequest`
90 | * shorter.
91 | * @param {http.ServerResponse} res - The HTTP response object to the invoker
92 | * @param {object} headersObj - key-value pairs of headers and their values
93 | */
94 | function copyHeaders(res, headersObj) {
95 | var headers = Object.keys(headersObj);
96 | var cntenc = 'content-encoding';
97 | // The request library handles gzip encoded data for us so we will
98 | // strip out a `content-encoding: gzip` header to avoid confusing browsers.
99 | if (headers.indexOf(cntenc) >= 0 && headersObj[cntenc].toLowerCase() === 'gzip') {
100 | headers.splice(headers.indexOf(cntenc), 1);
101 | delete headersObj[cntenc];
102 | }
103 | for (var i = 0, len = headers.length; i < len; i++) {
104 | res.setHeader(headers[i], headersObj[headers[i]]);
105 | }
106 | }
107 |
108 | /**
109 | * We don't want to modify the contents of things like images and javascript code
110 | * where we could potentially break functionality. Configuration allows for this
111 | * protection to be overridden.
112 | * @param {http.IncomingMessage} response - The response object from the forwarded request
113 | */
114 | function shouldNotRewrite(response) {
115 | var cnttyp = 'content-type';
116 | // Chain short-circuiting boolean checks to decide whether to rewrite or not.
117 | var propertyExists = cnttyp in response.headers;
118 | var isNotText = propertyExists && !/^text\//.test(response.headers[cnttyp]);
119 | var dontRewrite = isNotText && !config.aggressive;
120 | return dontRewrite;
121 | }
122 |
123 | /**
124 | * Make a request out of the proxy server with the provided options and
125 | * write back either any error that occurs in making the request or
126 | * else the response from the web server.
127 | * @param {http.ServerResponse} res - The outgoing response object
128 | * @param {object} options - The options object to dictate what the request does
129 | */
130 | function forwardRequest(res, options) {
131 | console.log('REQUEST FOR', options.url, '\nHEADERS', options.headers, '\n\n');
132 | request(options, function (err, response, body) {
133 | if (err) {
134 | reportError(res, err);
135 | } else {
136 | console.log('RESPONSE FOR', options.url, '\nHEADERS', response.headers, '\n\n');
137 | copyHeaders(res, response.headers);
138 | res.statusCode = response.statusCode;
139 | if (shouldNotRewrite(response)) {
140 | res.write(body);
141 | res.end();
142 | } else {
143 | body = body.toString();
144 | if (config.rewritePages) {
145 | body = rewriter.process(body);
146 | }
147 | res.write(body);
148 | res.end();
149 | }
150 | }
151 | });
152 | }
153 |
154 | /**
155 | * Read in information from the incoming request to pass on in the outgoing
156 | * request and write back either any error in reading the request body
157 | * or else the response.
158 | * @param {http.IncomingMessage} req - The incoming request object
159 | * @param {http.ServerResponse} res - The outgoing response object
160 | */
161 | function proxy(req, res) {
162 | var newUrl = rewriter.process(req.url);
163 | var method = req.method.toUpperCase();
164 | var headers = req.headers;
165 | var options = requestOptions(newUrl, headers, method);
166 | if (CAN_HAVE_BODY.indexOf(method) >= 0) {
167 | handleBody(req, function (err, body) {
168 | if (err) {
169 | reportError(res, err);
170 | } else {
171 | options.body = body;
172 | forwardRequest(res, options);
173 | }
174 | });
175 | } else {
176 | forwardRequest(res, options);
177 | }
178 | }
179 |
180 | // Start up the HTTP server!
181 | console.log('Starting HTTP server');
182 | var server = http.createServer(proxy);
183 | server.listen(config.port, config.address);
184 | console.log('Server running on ' + config.address + ':' + config.port);
185 |
186 | // Export the functions defined herein for testing purposes.
187 | module.exports = {
188 | proxy: proxy,
189 | forwardRequest: forwardRequest,
190 | reportError: reportError,
191 | handleBody: handleBody
192 | };
193 |
--------------------------------------------------------------------------------
/package.json:
--------------------------------------------------------------------------------
1 | {
2 | "name": "HTTPS-Proxy",
3 | "version": "0.1.0",
4 | "description": "A standalone nodejs proxy server in a single file that you can push requests through to have them come out using HTTPS",
5 | "main": "index.js",
7 | "scripts": {
8 | "test": "mocha tests/*.js",
9 | "start": "node index.js"
10 | },
11 | "repository": {
12 | "type": "git",
13 | "url": "https://github.com/equalitie/HTTPS-Proxy"
14 | },
15 | "dependencies": {
16 | "request": "latest",
17 | "URIjs": "1.11.2",
18 | "xmldom": "0.1.17",
19 | "concat-stream": "1.5.0"
20 | },
21 | "devDependencies": {
22 | "should": "latest",
23 | "mocha": "latest"
24 | },
25 | "keywords": [
26 | "HTTPS",
27 | "proxy",
28 | "server",
29 | "nodejs",
30 | "equalit.ie"
31 | ],
32 | "author": "Zack Mullaly (http://redwire.co/)",
33 | "license": "AGPL-3.0",
34 | "bugs": {
35 | "url": "https://github.com/equalitie/HTTPS-Proxy/issues"
36 | },
37 | "homepage": "https://github.com/equalitie/HTTPS-Proxy"
38 | }
39 |
--------------------------------------------------------------------------------
/rewriter.js:
--------------------------------------------------------------------------------
1 | /**
2 | * A simple rewriting utility that rewrites URLs in a body of text from HTTP to HTTPS
3 | * using EFF's HTTPS Everywhere (https://www.eff.org/Https-everywhere)
4 | * rulesets. This code is based in part on HTTPS Everywhere's own rewriter
5 | * (https://github.com/EFForg/https-everywhere/blob/master/rewriter/rewriter.js)
6 | *
7 | * From the HTTPS Everywhere LICENSE.txt
8 | * HTTPS Everywhere:
9 | * Copyright © 2010-2012 Mike Perry
10 | * Peter Eckersley
11 | * and many others
12 | * (Licensed GPL v2+)
13 | *
14 | * Incorporating code from NoScript,
15 | * Copyright © 2004-2007 Giorgio Maone
16 | * Licensed GPL v2+
17 | *
18 | * Incorporating code from Convergence
19 | * Copyright © Moxie Marlinspike
20 | * Licensed GPL v3+
21 | *
22 | * Incorporating code from URI.js
23 | * Copyright © Rodney Rehm
24 | * Licensed MIT, GPL V3
25 | *
26 | * Incorporating code from js-lru
27 | * Copyright © 2010 Rasmus Andersson
28 | * Licensed MIT
29 | *
30 | * The build system incorporates code from Python 2.6,
31 | * Copyright © 2001-2006 Python Software Foundation
32 | * Python Software Foundation License Version 2
33 | *
34 | * Net License: GPL v3+ (complete tree)
35 | * GPL v2+ (if Moxie's NSS.js is absent)
36 | *
37 | *
38 | * Text of MIT License:
39 | * ====================
40 | * Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
41 | *
42 | * The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
43 | *
44 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
45 | */
46 |
47 | var fs = require('fs');
48 | var path = require('path');
49 | var DOMParser = require('xmldom').DOMParser;
50 | var URI = require('URIjs');
51 |
52 | // The compiled rulesets XML file
53 | // run `sh tools/fetchrules.sh` from the `HTTPS-Proxy` directory if you haven't
54 | // already done so.
55 | const RULESET_FILE = 'httpse.rulesets';
56 |
57 | /**
58 | * Overwrite the default URI find_uri_expression with a modified one that
59 | * mitigates a catastrophic backtracking issue common in CSS.
60 | * The workaround was to insist that URLs start with http, since those are the
61 | * only ones we want to rewrite anyhow. Note that this may still go exponential
62 | * on certain inputs. http://www.regular-expressions.info/catastrophic.html
63 | * Example string that blows up URI.withinString:
64 | * image:url(http://img.youtube.com/vi/x7f
65 | */
66 | URI.find_uri_expression = /\b((?:http:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+)+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/ig;
67 |
68 | // Copied from https://github.com/EFForg/https-everywhere/blob/master/chromium/rules.js
69 | // Stubs so this runs under nodejs. Every log level used below must be defined, or log() calls throw a ReferenceError.
70 | var VERB = 1, DBUG = 2, INFO = 3, NOTE = 4, WARN = 5;
71 | function log(){};
72 |
73 | /**
74 | * A single rule
75 | * @param from
76 | * @param to
77 | * @constructor
78 | */
79 | function Rule(from, to) {
80 | //this.from = from;
81 | this.to = to;
82 | this.from_c = new RegExp(from);
83 | }
84 |
85 | /**
86 | * Regex-Compile a pattern
87 | * @param pattern The pattern to compile
88 | * @constructor
89 | */
90 | function Exclusion(pattern) {
91 | //this.pattern = pattern;
92 | this.pattern_c = new RegExp(pattern);
93 | }
94 |
95 | /**
96 | * Generates a CookieRule
97 | * @param host The host regex to compile
98 | * @param cookiename The cookie name Regex to compile
99 | * @constructor
100 | */
101 | function CookieRule(host, cookiename) {
102 | this.host = host;
103 | this.host_c = new RegExp(host);
104 | this.name = cookiename;
105 | this.name_c = new RegExp(cookiename);
106 | }
107 |
108 | /**
109 | * A collection of rules
110 | * @param set_name The name of this set
111 | * @param match_rule Quick test match rule
112 | * @param default_state activity state
113 | * @param note Note will be displayed in popup
114 | * @constructor
115 | */
116 | function RuleSet(set_name, match_rule, default_state, note) {
117 | this.name = set_name;
118 | if (match_rule)
119 | this.ruleset_match_c = new RegExp(match_rule);
120 | else
121 | this.ruleset_match_c = null;
122 | this.rules = [];
123 | this.exclusions = [];
124 | this.targets = [];
125 | this.cookierules = [];
126 | this.active = default_state;
127 | this.default_state = default_state;
128 | this.note = note;
129 | }
130 |
131 | RuleSet.prototype = {
132 | /**
133 | * Check if a URI can be rewritten and rewrite it
134 | * @param urispec The uri to rewrite
135 | * @returns {*} null or the rewritten uri
136 | */
137 | apply: function(urispec) {
138 | var returl = null;
139 | // If we're covered by an exclusion, go home
140 | for(var i = 0; i < this.exclusions.length; ++i) {
141 | if (this.exclusions[i].pattern_c.test(urispec)) {
142 | log(DBUG,"excluded uri " + urispec);
143 | return null;
144 | }
145 | }
146 | // If a ruleset has a match_rule and it fails, go no further
147 | if (this.ruleset_match_c && !this.ruleset_match_c.test(urispec)) {
148 | log(VERB, "ruleset_match_c excluded " + urispec);
149 | return null;
150 | }
151 |
152 | // Okay, now find the first rule that triggers
153 | for(var i = 0; i < this.rules.length; ++i) {
154 | returl = urispec.replace(this.rules[i].from_c,
155 | this.rules[i].to);
156 | if (returl != urispec) {
157 | return returl;
158 | }
159 | }
160 | if (this.ruleset_match_c) {
161 | // This is not an error, because we do not insist the matchrule
162 | // precisely describes the target space of URLs to be redirected
163 | log(DBUG,"Ruleset "+this.name
164 | +" had an applicable match-rule but no matching rules");
165 | }
166 | return null;
167 | }
168 |
169 | };
170 |
171 | /**
172 | * Initialize Rule Sets
173 | * @param userAgent The browser's user agent
174 | * @param cache a cache object (lru)
175 | * @param ruleActiveStates default state for rules
176 | * @constructor
177 | */
178 | function RuleSets(userAgent, cache, ruleActiveStates) {
179 | // Load rules into structure
180 | var t1 = new Date().getTime();
181 | this.targets = {};
182 | this.userAgent = userAgent;
183 |
184 | // A cache for potentiallyApplicableRulesets
185 | // Size chosen /completely/ arbitrarily.
186 | this.ruleCache = new cache(1000);
187 |
188 | // A cache for cookie hostnames.
189 | this.cookieHostCache = new cache(100);
190 |
191 | // A hash of rule name -> active status (true/false).
192 | this.ruleActiveStates = ruleActiveStates;
193 | }
194 |
195 |
196 | RuleSets.prototype = {
197 | /**
198 | * Iterate through data XML and load rulesets
199 | */
200 | addFromXml: function(ruleXml) {
201 | var sets = ruleXml.getElementsByTagName("ruleset");
202 | for (var i = 0; i < sets.length; ++i) {
203 | try {
204 | this.parseOneRuleset(sets[i]);
205 | } catch (e) {
206 | log(WARN, 'Error processing ruleset:' + e);
207 | }
208 | }
209 | },
210 |
211 | /**
212 | * Return the RegExp for the local platform
213 | */
214 | localPlatformRegexp: (function() {
215 | var isOpera = /(?:OPR|Opera)[\/\s](\d+)(?:\.\d+)/.exec(this.userAgent);
216 | if (isOpera && isOpera.length === 2 && parseInt(isOpera[1], 10) < 23) {
217 | // Opera <23 does not have mixed content blocking
218 | log(DBUG, 'Detected that we are running Opera < 23');
219 | return new RegExp("chromium|mixedcontent");
220 | } else {
221 | log(DBUG, 'Detected that we are running Chrome/Chromium');
222 | return new RegExp("chromium");
223 | }
224 | })(),
225 |
226 | /**
227 | * Load a user rule
228 | * @param params
229 | * @returns {boolean}
230 | */
231 | addUserRule : function(params) {
232 | log(INFO, 'adding new user rule for ' + JSON.stringify(params));
233 | var new_rule_set = new RuleSet(params.host, null, true, "user rule");
234 | var new_rule = new Rule(params.urlMatcher, params.redirectTo);
235 | new_rule_set.rules.push(new_rule);
236 | if (!(params.host in this.targets)) {
237 | this.targets[params.host] = [];
238 | }
239 | this.ruleCache.remove(params.host);
240 | // TODO: maybe promote this rule?
241 | this.targets[params.host].push(new_rule_set);
242 | if (new_rule_set.name in this.ruleActiveStates) {
243 | new_rule_set.active = (this.ruleActiveStates[new_rule_set.name] == "true");
244 | }
245 | log(INFO, 'done adding rule');
246 | return true;
247 | },
248 |
249 | /**
250 | * Does the loading of a ruleset.
251 | * @param ruletag The whole tag to parse
252 | */
253 | parseOneRuleset: function(ruletag) {
254 | var default_state = true;
255 | var note = "";
256 | if (ruletag.attributes.default_off) {
257 | default_state = false;
258 | note += ruletag.attributes.default_off.value + "\n";
259 | }
260 |
261 | // If a ruleset declares a platform, and we don't match it, treat it as
262 | // off-by-default
263 | var platform = ruletag.getAttribute("platform");
264 | if (platform) {
265 | if (platform.search(this.localPlatformRegexp) == -1) {
266 | default_state = false;
267 | }
268 | note += "Platform(s): " + platform + "\n";
269 | }
270 |
271 | var rule_set = new RuleSet(ruletag.getAttribute("name"),
272 | ruletag.getAttribute("match_rule"),
273 | default_state,
274 | note.trim());
275 |
276 | // Read user prefs
277 | if (rule_set.name in this.ruleActiveStates) {
278 | rule_set.active = (this.ruleActiveStates[rule_set.name] == "true");
279 | }
280 |
281 | var rules = ruletag.getElementsByTagName("rule");
282 | for(var j = 0; j < rules.length; j++) {
283 | rule_set.rules.push(new Rule(rules[j].getAttribute("from"),
284 | rules[j].getAttribute("to")));
285 | }
286 |
287 | var exclusions = ruletag.getElementsByTagName("exclusion");
288 | for(var j = 0; j < exclusions.length; j++) {
289 | rule_set.exclusions.push(
290 | new Exclusion(exclusions[j].getAttribute("pattern")));
291 | }
292 |
293 | var cookierules = ruletag.getElementsByTagName("securecookie");
294 | for(var j = 0; j < cookierules.length; j++) {
295 | rule_set.cookierules.push(new CookieRule(cookierules[j].getAttribute("host"),
296 | cookierules[j].getAttribute("name")));
297 | }
298 |
299 | var targets = ruletag.getElementsByTagName("target");
300 | for(var j = 0; j < targets.length; j++) {
301 | var host = targets[j].getAttribute("host");
302 | if (!(host in this.targets)) {
303 | this.targets[host] = [];
304 | }
305 | this.targets[host].push(rule_set);
306 | }
307 | },
308 |
309 | /**
310 | * Insert any elements from fromList into intoList, if they are not
311 | * already there. fromList may be null.
312 | * @param intoList
313 | * @param fromList
314 | */
315 | setInsert: function(intoList, fromList) {
316 | if (!fromList) return;
317 | for (var i = 0; i < fromList.length; i++)
318 | if (intoList.indexOf(fromList[i]) == -1)
319 | intoList.push(fromList[i]);
320 | },
321 |
322 | /**
323 | * Return a list of rulesets that apply to this host
324 | * @param host The host to check
325 | * @returns {*} (empty) list
326 | */
327 | potentiallyApplicableRulesets: function(host) {
328 | // Have we cached this result? If so, return it!
329 | var cached_item = this.ruleCache.get(host);
330 | if (cached_item !== undefined) {
331 | log(DBUG, "Ruleset cache hit for " + host + " items:" + cached_item.length);
332 | return cached_item;
333 | }
334 | log(DBUG, "Ruleset cache miss for " + host);
335 |
336 | var tmp;
337 | var results = [];
338 | if (this.targets[host]) {
339 | // Copy the host targets so we don't modify them.
340 | results = this.targets[host].slice();
341 | }
342 |
343 | // Replace each portion of the domain with a * in turn
344 | var segmented = host.split(".");
345 | for (var i = 0; i < segmented.length; ++i) {
346 | tmp = segmented[i];
347 | segmented[i] = "*";
348 | this.setInsert(results, this.targets[segmented.join(".")]);
349 | segmented[i] = tmp;
350 | }
351 | // now eat away from the left, with *, so that for x.y.z.google.com we
352 | // check *.z.google.com and *.google.com (we did *.y.z.google.com above)
353 | for (var i = 2; i <= segmented.length - 2; ++i) {
354 | var t = "*." + segmented.slice(i,segmented.length).join(".");
355 | this.setInsert(results, this.targets[t]);
356 | }
357 | log(DBUG,"Applicable rules for " + host + ":");
358 | if (results.length == 0)
359 | log(DBUG, " None");
360 | else
361 | for (var i = 0; i < results.length; ++i)
362 | log(DBUG, " " + results[i].name);
363 |
364 | // Insert results into the ruleset cache
365 | this.ruleCache.set(host, results);
366 | return results;
367 | },
368 |
369 | /**
370 | * Check whether the given cookie meets any of our cookierule criteria for being marked as secure.
371 | * knownHttps is true if the context for this cookie being set is known to be https.
372 | * @param cookie The cookie to test
373 | * @param knownHttps Whether the context in which this cookie is being set is known to be https
374 | * @returns {*} ruleset or null
375 | */
376 | shouldSecureCookie: function(cookie, knownHttps) {
377 | var hostname = cookie.domain;
378 | // cookie domain scopes can start with .
379 | while (hostname.charAt(0) == ".")
380 | hostname = hostname.slice(1);
381 |
382 | if (!knownHttps && !this.safeToSecureCookie(hostname)) {
383 | return null;
384 | }
385 |
386 | var rs = this.potentiallyApplicableRulesets(hostname);
387 | for (var i = 0; i < rs.length; ++i) {
388 | var ruleset = rs[i];
389 | if (ruleset.active) {
390 | for (var j = 0; j < ruleset.cookierules.length; j++) {
391 | var cr = ruleset.cookierules[j];
392 | if (cr.host_c.test(cookie.domain) && cr.name_c.test(cookie.name)) {
393 | return ruleset;
394 | }
395 | }
396 | }
397 | }
398 | return null;
399 | },
400 |
401 | /**
402 | * Check whether it is safe to secure the cookie (i.e. patch the secure flag in).
403 | * @param domain The domain of the cookie
404 | * @returns {*} true or false
405 | */
406 | safeToSecureCookie: function(domain) {
407 | // Check if the domain might be being served over HTTP. If so, it isn't
408 | // safe to secure a cookie! We can't always know this for sure because
409 | // observing cookie-changed doesn't give us enough context to know the
410 | // full origin URI.
411 |
412 | // First, if there are any redirect loops on this domain, don't secure
413 | // cookies. XXX This is not a very satisfactory heuristic. Sometimes we
414 | // would want to secure the cookie anyway, because the URLs that loop are
415 | // not authenticated or not important. Also by the time the loop has been
416 | // observed and the domain blacklisted, a cookie might already have been
417 | // flagged as secure.
418 |
419 | if (domain in domainBlacklist) {
420 | log(INFO, "cookies for " + domain + " blacklisted");
421 | return false;
422 | }
423 | var cached_item = this.cookieHostCache.get(domain);
424 | if (cached_item !== undefined) {
425 | log(DBUG, "Cookie host cache hit for " + domain);
426 | return cached_item;
427 | }
428 | log(DBUG, "Cookie host cache miss for " + domain);
429 |
430 | // If we passed that test, make up a random URL on the domain, and see if
431 | // we would HTTPSify that.
432 |
433 | var nonce_path = "/" + Math.random().toString();
434 | var test_uri = "http://" + domain + nonce_path + nonce_path;
435 |
436 | log(INFO, "Testing securecookie applicability with " + test_uri);
437 | var rs = this.potentiallyApplicableRulesets(domain);
438 | for (var i = 0; i < rs.length; ++i) {
439 | if (!rs[i].active) continue;
440 | if (rs[i].apply(test_uri)) {
441 | log(INFO, "Cookie domain could be secured.");
442 | this.cookieHostCache.set(domain, true);
443 | return true;
444 | }
445 | }
446 | log(INFO, "Cookie domain could NOT be secured.");
447 | this.cookieHostCache.set(domain, false);
448 | return false;
449 | },
450 |
451 | /**
452 | * Rewrite a URI
453 | * @param urispec The uri to rewrite
454 | * @param host The host of this uri
455 | * @returns {*} the new uri or null
456 | */
457 | rewriteURI: function(urispec, host) {
458 | var newuri = null;
459 | var rs = this.potentiallyApplicableRulesets(host);
460 | for(var i = 0; i < rs.length; ++i) {
461 | if (rs[i].active && (newuri = rs[i].apply(urispec)))
462 | return newuri;
463 | }
464 | return null;
465 | }
466 | };
467 |
468 | // Copied from https://raw.githubusercontent.com/EFForg/https-everywhere/master/chromium/lru.js
469 | /**
470 | * A doubly linked list-based Least Recently Used (LRU) cache. Will keep most
471 | * recently used items while discarding least recently used items when its limit
472 | * is reached.
473 | *
474 | * Licensed under MIT. Copyright (c) 2010 Rasmus Andersson
475 | * See README.md for details.
476 | *
477 | * Illustration of the design:
478 | *
479 | * entry entry entry entry
480 | * ______ ______ ______ ______
481 | * | head |.newer => | |.newer => | |.newer => | tail |
482 | * | A | | B | | C | | D |
483 | * |______| <= older.|______| <= older.|______| <= older.|______|
484 | *
485 | * removed <-- <-- <-- <-- <-- <-- <-- <-- <-- <-- <-- added
486 | */
487 | function LRUCache (limit) {
488 | // Current size of the cache. (Read-only).
489 | this.size = 0;
490 | // Maximum number of items this cache can hold.
491 | this.limit = limit;
492 | this._keymap = {};
493 | }
494 |
495 | /**
496 | * Put <value> into the cache associated with <key>. Returns the entry which was
497 | * removed to make room for the new entry. Otherwise undefined is returned
498 | * (i.e. if there was enough room already).
499 | */
500 | LRUCache.prototype.put = function(key, value) {
501 | var entry = {key:key, value:value};
502 | // Note: No protection against replacing, and thus orphan entries. By design.
503 | this._keymap[key] = entry;
504 | if (this.tail) {
505 | // link previous tail to the new tail (entry)
506 | this.tail.newer = entry;
507 | entry.older = this.tail;
508 | } else {
509 | // we're first in -- yay
510 | this.head = entry;
511 | }
512 | // add new entry to the end of the linked list -- it's now the freshest entry.
513 | this.tail = entry;
514 | if (this.size === this.limit) {
515 | // we hit the limit -- remove the head
516 | return this.shift();
517 | } else {
518 | // increase the size counter
519 | this.size++;
520 | }
521 | };
522 |
523 | /**
524 | * Purge the least recently used (oldest) entry from the cache. Returns the
525 | * removed entry or undefined if the cache was empty.
526 | *
527 | * If you need to perform any form of finalization of purged items, this is a
528 | * good place to do it. Simply override/replace this function:
529 | *
530 | * var c = new LRUCache(123);
531 | * c.shift = function() {
532 | * var entry = LRUCache.prototype.shift.call(this);
533 | * doSomethingWith(entry);
534 | * return entry;
535 | * }
536 | */
537 | LRUCache.prototype.shift = function() {
538 | // todo: handle special case when limit == 1
539 | var entry = this.head;
540 | if (entry) {
541 | if (this.head.newer) {
542 | this.head = this.head.newer;
543 | this.head.older = undefined;
544 | } else {
545 | this.head = undefined;
546 | }
547 | // Remove last strong reference to <entry> and remove links from the purged
548 | // entry being returned:
549 | entry.newer = entry.older = undefined;
550 | // delete is slow, but we need to do this to avoid uncontrollable growth:
551 | delete this._keymap[entry.key];
552 | }
553 | return entry;
554 | };
555 |
556 | /**
557 | * Get and register recent use of <key>. Returns the value associated with
558 | * <key> or undefined if not in cache.
559 | */
560 | LRUCache.prototype.get = function(key, returnEntry) {
561 | // First, find our cache entry
562 | var entry = this._keymap[key];
563 | if (entry === undefined) return; // Not cached. Sorry.
564 | // As <key> was found in the cache, register it as being requested recently
565 | if (entry === this.tail) {
566 | // Already the most recently used entry, so no need to update the list
567 | return returnEntry ? entry : entry.value;
568 | }
569 | // HEAD--------------TAIL
570 | // <.older .newer>
571 | // <--- add direction --
572 | // A B C <D> E
573 | if (entry.newer) {
574 | if (entry === this.head)
575 | this.head = entry.newer;
576 | entry.newer.older = entry.older; // C <-- E.
577 | }
578 | if (entry.older)
579 | entry.older.newer = entry.newer; // C. --> E
580 | entry.newer = undefined; // D --x
581 | entry.older = this.tail; // D. --> E
582 | if (this.tail)
583 | this.tail.newer = entry; // E. <-- D
584 | this.tail = entry;
585 | return returnEntry ? entry : entry.value;
586 | };
587 |
588 | // ----------------------------------------------------------------------------
589 | // Following code is optional and can be removed without breaking the core
590 | // functionality.
591 |
592 | /**
593 | * Check if <key> is in the cache without registering recent use. Feasible if
594 | * you do not want to change the state of the cache, but only "peek" at it.
595 | * Returns the entry associated with <key> if found, or undefined if not found.
596 | */
597 | LRUCache.prototype.find = function(key) {
598 | return this._keymap[key];
599 | };
600 |
601 | /**
602 | * Update the value of entry with <key>. Returns the old value, or undefined if
603 | * entry was not in the cache.
604 | */
605 | LRUCache.prototype.set = function(key, value) {
606 | var oldvalue, entry = this.get(key, true);
607 | if (entry) {
608 | oldvalue = entry.value;
609 | entry.value = value;
610 | } else {
611 | oldvalue = this.put(key, value);
612 | if (oldvalue) oldvalue = oldvalue.value;
613 | }
614 | return oldvalue;
615 | };
616 |
617 | /**
618 | * Remove entry <key> from cache and return its value. Returns undefined if not
619 | * found.
620 | */
621 | LRUCache.prototype.remove = function(key) {
622 | var entry = this._keymap[key];
623 | if (!entry) return;
624 | delete this._keymap[entry.key]; // need to do delete unfortunately
625 | if (entry.newer && entry.older) {
626 | // relink the older entry with the newer entry
627 | entry.older.newer = entry.newer;
628 | entry.newer.older = entry.older;
629 | } else if (entry.newer) {
630 | // remove the link to us
631 | entry.newer.older = undefined;
632 | // link the newer entry to head
633 | this.head = entry.newer;
634 | } else if (entry.older) {
635 | // remove the link to us
636 | entry.older.newer = undefined;
637 | // link the older entry to tail
638 | this.tail = entry.older;
639 | } else {// if(entry.older === undefined && entry.newer === undefined) {
640 | this.head = this.tail = undefined;
641 | }
642 |
643 | this.size--;
644 | return entry.value;
645 | };
646 |
647 | /** Removes all entries */
648 | LRUCache.prototype.removeAll = function() {
649 | // This should be safe, as we never expose strong references to the outside
650 | this.head = this.tail = undefined;
651 | this.size = 0;
652 | this._keymap = {};
653 | };
654 |
655 | /**
656 | * Return an array containing all keys of entries stored in the cache object, in
657 | * arbitrary order.
658 | */
659 | if (typeof Object.keys === 'function') {
660 | LRUCache.prototype.keys = function() { return Object.keys(this._keymap); };
661 | } else {
662 | LRUCache.prototype.keys = function() {
663 | var keys = [];
664 | for (var k in this._keymap) keys.push(k);
665 | return keys;
666 | };
667 | }
668 |
669 | /**
670 | * Call `fun` for each entry. Starting with the newest entry if `desc` is a true
671 | * value, otherwise starts with the oldest (head) entry and moves towards the
672 | * tail.
673 | *
674 | * `fun` is called with 3 arguments in the context `context`:
675 | * `fun.call(context, Object key, Object value, LRUCache self)`
676 | */
677 | LRUCache.prototype.forEach = function(fun, context, desc) {
678 | var entry;
679 | if (context === true) { desc = true; context = undefined; }
680 | else if (typeof context !== 'object') context = this;
681 | if (desc) {
682 | entry = this.tail;
683 | while (entry) {
684 | fun.call(context, entry.key, entry.value, this);
685 | entry = entry.older;
686 | }
687 | } else {
688 | entry = this.head;
689 | while (entry) {
690 | fun.call(context, entry.key, entry.value, this);
691 | entry = entry.newer;
692 | }
693 | }
694 | };
695 |
696 | /** Returns a JSON (array) representation */
697 | LRUCache.prototype.toJSON = function() {
698 | var s = [], entry = this.head;
699 | while (entry) {
700 | s.push({key:entry.key.toJSON(), value:entry.value.toJSON()});
701 | entry = entry.newer;
702 | }
703 | return s;
704 | };
705 |
706 | /** Returns a String representation */
707 | LRUCache.prototype.toString = function() {
708 | var s = '', entry = this.head;
709 | while (entry) {
710 | s += String(entry.key)+':'+entry.value;
711 | entry = entry.newer;
712 | if (entry)
713 | s += ' < ';
714 | }
715 | return s;
716 | };
717 |
718 | /**
719 | * Object containing rulesets and a single method for rewriting the content
720 | * of a provided web page to change HTTP URLs to HTTPS.
721 | */
722 | function HttpsRewriter() {
723 | var contents = fs.readFileSync(path.join(__dirname, RULESET_FILE), 'utf-8');
724 | var xml = new DOMParser().parseFromString(contents, 'text/xml');
725 | this._rules = new RuleSets('Bundler user agent', LRUCache, {});
726 | this._rules.addFromXml(xml);
727 | }
728 |
729 | /**
730 | * Rewrite the content of a web page so that URLs with the HTTP protocol
731 | * are rewritten to use HTTPS.
732 | * @param {string} pageContent - The raw, utf-8 encoded content of a document
733 | * @return the raw content of the modified document
734 | */
735 | HttpsRewriter.prototype.process = function (pageContent) {
736 | var thisLibrary = this;
737 | return URI.withinString(pageContent, function (url) {
738 | var uri = new URI(url);
739 | if (uri.protocol() !== 'http') {
740 | return url;
741 | }
742 | uri.normalize();
743 | var rewritten = thisLibrary._rules.rewriteURI(uri.toString(), uri.host());
744 | if (rewritten) {
745 | // If the rewrite was just a protocol change, output the normalized URI with the https protocol.
746 | var rewrittenUri = new URI(rewritten).protocol('http');
747 | if (rewrittenUri.toString() === uri.toString()) {
748 | return rewrittenUri.protocol('https').toString();
749 | } else {
750 | return rewritten;
751 | }
752 | } else {
753 | return url;
754 | }
755 | });
756 | };
757 |
758 | module.exports = {
759 | HttpsRewriter: HttpsRewriter
760 | };
761 |
--------------------------------------------------------------------------------
/tests/contentrewrite.html:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 | HTTPS-Proxy Test
5 |
6 |
7 |
8 | This anchor's URL should be rewritten
9 |
10 |
11 |
12 |
--------------------------------------------------------------------------------
/tests/httpsrewriter.js:
--------------------------------------------------------------------------------
1 | /**
2 | * Test that the HTTPS Rewriter based on HTTPS Everywhere's code works.
3 | */
4 |
5 | var HttpsRewriter = require('../rewriter').HttpsRewriter;
6 | var should = require('should');
7 |
8 | const TESTS = {
9 | 'html': '<a href="http://reddit.com">Reddit</a>',
10 | 'python code': 'def test():\n\treturn Link("http://google.com")',
11 | 'javascript code': 'document.location.href = "http://reddit.com";',
12 | 'json': '{"url": "http://news.ycombinator.com"}',
13 | 'markdown': 'Read more on [Github](http://github.com)!',
14 | 'raw string': 'http://reddit.com?returnto=index&numberofcats=allofthem'
15 | };
16 |
17 | describe('httpsrewriter', function () {
18 |
19 | before(function (done) {
20 | this.rewriter = new HttpsRewriter();
21 | done();
22 | });
23 |
24 | it('should successfully initialize the ruleset library object', function (done) {
25 | this.rewriter.should.have.property('_rules'); // The RuleSets member object
26 | this.rewriter._rules.should.have.property('targets'); // An attribute of RuleSets.
27 | this.rewriter._rules.should.have.property('rewriteURI'); // A method of RuleSets.
28 | this.rewriter._rules.rewriteURI.should.be.type('function');
29 | done();
30 | });
31 |
32 | it('should have the process method for modifying web documents', function (done) {
33 | this.rewriter.should.have.property('process');
34 | this.rewriter.process.should.be.type('function');
35 | done();
36 | });
37 |
38 | it('should produce a document containing an HTTPS URI in place of an existing HTTP URI', function (done) {
39 | var testCases = Object.keys(TESTS);
40 | for (var i = 0, len = testCases.length; i < len; i++) {
41 | var test = TESTS[testCases[i]];
42 | var rewritten = this.rewriter.process(test);
43 | rewritten.indexOf('http://').should.be.eql(-1, 'http:// found');
44 | rewritten.indexOf('https://').should.be.greaterThan(-1, 'https:// not found');
45 | }
46 | done();
47 | });
48 | });
49 |
--------------------------------------------------------------------------------
/tests/proxy.js:
--------------------------------------------------------------------------------
1 | /**
2 | * Test the behavior of the proxy server.
3 | */
4 |
5 | var http = require('http');
6 | var url = require('url');
7 | var qs = require('querystring');
8 | var request = require('request');
9 | var should = require('should');
10 | var proxy = require('../index');
11 |
12 | const TEST_PORT = 54345;
13 |
14 | describe('proxy', function () {
15 |
16 | describe('handleBody', function () {
17 |
18 | before(function (done) {
19 | this.testFlag = false;
20 | this.testData = 'argument1=hello&argument2=world';
21 | this.recvData= '';
22 | var thisInit= this;
23 | // Set up a server that will provide some indication that the handler was
24 | // called and will give us access to what handleBody produces.
25 | this.server = http.createServer(function (req, res) {
26 | proxy.handleBody(req, function (err, data) {
27 | should.not.exist(err);
28 | thisInit.recvData += data;
29 | thisInit.testFlag = true;
30 | res.end();
31 | });
32 | }).listen(TEST_PORT);
33 | done();
34 | });
35 |
36 | it('should produce the body of a request', function (done) {
37 | var thisTest = this;
38 | request({
39 | url: 'http://localhost:' + TEST_PORT,
40 | method: 'post',
41 | body: thisTest.testData
42 | }, function (err, response, body) {
43 | should.not.exist(err);
44 | response.statusCode.should.be.exactly(200);
45 | thisTest.testFlag.should.be.true;
46 | thisTest.recvData.should.be.type('string');
47 | thisTest.recvData.should.be.eql(thisTest.testData);
48 | done();
49 | });
50 | });
51 |
52 | after(function () {
53 | this.server.close();
54 | });
55 | });
56 |
57 | describe('reportError', function () {
58 |
59 | before(function (done) {
60 | this.testFlag = false;
61 | this.errorMessage = 'Error1234';
62 | var thisInitializer = this;
63 | // Set up a server that will provide indication that the handler was called.
64 | this.server = http.createServer(function (req, res) {
65 | thisInitializer.testFlag = true;
66 | proxy.reportError(res, new Error(thisInitializer.errorMessage));
67 | }).listen(TEST_PORT);
68 | done();
69 | });
70 |
71 | it('should write an error to the requestor and set statusCode to 500', function (done) {
72 | var thisTest = this;
73 | request('http://localhost:' + TEST_PORT, function (err, response, body) {
74 | should.not.exist(err);
75 | response.statusCode.should.be.exactly(500);
76 | thisTest.testFlag.should.be.true;
77 | body.toString().should.be.exactly(thisTest.errorMessage);
78 | done();
79 | });
80 | });
81 |
82 | after(function () {
83 | this.server.close();
84 | });
85 | });
86 |
87 | describe('forwardRequest', function () {
88 |
89 | before(function (done) {
90 | this.server = http.createServer(function (req, res) {
91 | // Send requests to http://localhost:TEST_PORT?url=
92 | var uri = qs.parse(url.parse(req.url).query).url;
93 | proxy.forwardRequest(res, {url: uri});
94 | }).listen(TEST_PORT);
95 | done();
96 | });
97 |
98 | it('should write an error for requests that fail', function (done) {
99 | request('http://localhost:' + TEST_PORT + '?url=adhfadgf234', function (err, res, body) {
100 | should.not.exist(err); // This request won't fail, but the response will be an error
101 | res.statusCode.should.be.exactly(500);
102 | done();
103 | });
104 | });
105 |
106 | it('should write back whatever it receives', function (done) {
107 | request('http://localhost:' + TEST_PORT + '?url=https://google.com', function (err, res, body) {
108 | should.not.exist(err);
109 | res.statusCode.should.be.exactly(200);
110 | res.headers.should.have.property('content-type');
111 | res.headers['content-type'].indexOf('text/html').should.be.exactly(0);
112 | done();
113 | });
114 | });
115 |
116 | after(function () {
117 | this.server.close();
118 | });
119 | });
120 |
121 | describe('proxy', function () {
122 |
123 | before(function (done) {
124 | // Set up a server that will act as the endpoint we want to send a request to.
125 | // We will set special headers to be tested.
126 | this.responseBody = 'Hello World! We got the request!';
127 | this.innerPort = 55555;
128 | var thisInitializer = this;
129 | this.endServer = http.createServer(function (req, res) {
130 | res.setHeader('x-reached-endserver', true);
131 | res.statusCode = 200;
132 | // If we get a POST/PUT/PATCH request, use the handleBody function to
133 | // write it back so that we can be sure it made it here.
134 | if (['POST', 'PUT', 'PATCH'].indexOf(req.method.toUpperCase()) >= 0) {
135 | proxy.handleBody(req, function (err, body) {
136 | res.write(body);
137 | res.end();
138 | });
139 | } else {
140 | res.write(thisInitializer.responseBody);
141 | res.end();
142 | }
143 | }).listen(this.innerPort);
144 | // Also set up a server that will invoke the proxy method for us.
145 | this.server = http.createServer(proxy.proxy).listen(TEST_PORT);
146 | done();
147 | });
148 |
149 | it('should forward headers and content written by the end server', function (done) {
150 | var thisTest = this;
151 | request({
152 | url: 'http://localhost:' + thisTest.innerPort,
153 | proxy: 'http://localhost:' + TEST_PORT
154 | }, function (err, response, body) {
155 | should.not.exist(err);
156 | response.statusCode.should.be.exactly(200);
157 | response.headers.should.have.property('x-reached-endserver');
158 | response.headers['x-reached-endserver'].should.be.true;
159 | body.toString().should.be.eql(thisTest.responseBody);
160 | done();
161 | });
162 | });
163 |
164 | it('should pass on the body of PUT/POST/PATCH requests', function (done) {
165 | var thisTest = this;
166 | var testBody = 'This is the body we expect to be written back by our echo server.';
167 | request({
168 | url: 'http://localhost:' + thisTest.innerPort,
169 | proxy: 'http://localhost:' + TEST_PORT,
170 | method: 'POST',
171 | body: testBody
172 | }, function (err, response, body) {
173 | should.not.exist(err);
174 | response.statusCode.should.be.exactly(200);
175 | response.headers.should.have.property('x-reached-endserver');
176 | response.headers['x-reached-endserver'].should.be.true;
177 | body.toString().should.be.eql(testBody);
178 | done();
179 | });
180 | });
181 |
182 | after(function () {
183 | this.server.close();
184 | this.endServer.close();
185 | });
186 | });
187 | });
188 |
--------------------------------------------------------------------------------
/tools/fetchrules.sh:
--------------------------------------------------------------------------------
1 | git clone https://github.com/EFForg/https-everywhere.git
2 | cd https-everywhere
3 | sh ./makecrx.sh
4 | mv ./pkg/crx/rules/default.rulesets ../httpse.rulesets
5 | cd ..
6 | rm -rf https-everywhere
7 |
--------------------------------------------------------------------------------