├── COPYRIGHT
├── INSTALL.md
├── README.md
├── replayproxy.py
└── test.pcap

/COPYRIGHT:
--------------------------------------------------------------------------------
Modifications by Tom Sparrow 2015. Original copyright notice still applies:

Copyright (c) 2011, Armin Buescher (armin.buescher@googlemail.com)
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
    * Redistributions of source code must retain the above copyright
      notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
      notice, this list of conditions and the following disclaimer in the
      documentation and/or other materials provided with the distribution.
    * Neither the name of the nor the
      names of its contributors may be used to endorse or promote products
      derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Author: Armin Buescher (armin.buescher@googlemail.com)
Contribs: Marco Cova (marco@lastline.com)
Thx to: Andrew Brampton (brampton@gmail.com) for his example code on how to parse HTTP streams from .pcap files using dpkt

--------------------------------------------------------------------------------
/INSTALL.md:
--------------------------------------------------------------------------------
## Installing replayproxy on Linux

In theory replayproxy _should_ work on Windows too if you can get pynids working, but this guide assumes Linux (e.g. Ubuntu).

### Get replayproxy code
* `git clone https://github.com/sparrowt/replayproxy.git`
* `cd replayproxy`

### Set up dependencies
* pynids library (http://jon.oberheide.org/pynids/)
* dpkt library (http://code.google.com/p/dpkt/)

Optional: set up a Python virtual environment to install into
* `virtualenv venv`
* `source venv/bin/activate`

#### Install **dpkt**
* `pip install dpkt`

#### Install **pynids**
* `wget https://jon.oberheide.org/pynids/downloads/pynids-0.6.1.tar.gz`
* `tar -xf pynids-0.6.1.tar.gz`
* `cd pynids-0.6.1`

Before building pynids you need to install its dependencies:

1. libpcap
    * `sudo apt-get install libpcap0.8 libpcap-dev`

2. libnet
    * `sudo apt-get install libnet1 libnet1-dev`

Now you can build pynids:
* `sudo apt-get install python-dev`
* `python setup.py build`

This should build `libnids`, which is included in the pynids download.

Then actually install pynids:
* `python setup.py install`

### Enjoy!
You should now be able to use replayproxy (see [README.md](README.md) for instructions)

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
## Origin
This project is originally from https://code.google.com/p/replayproxy/ by Armin Buescher.

This fork fixes some bugs & provides more details for usage & installation.

## Summary
replayproxy allows you to "re-live" an HTTP session which has been captured in a .pcap file (e.g. in Wireshark). It parses the HTTP streams, caches them, and starts an HTTP proxy. It then replies to HTTP requests with the matching response from the .pcap, ignoring all other requests.

## Usage
`replayproxy.py [-h] [-H HOST] [-p PORT] [-v[v]] FILENAME`

Arguments:
* `-h|--help` Show usage information
* `-H HOST` IP to start the proxy on (DEFAULT: 127.0.0.1)
* `-p PORT` Port to listen on (DEFAULT: 3128)
* `-v[v]` Verbose output (DEFAULT: log only ERRORs, -v = INFO, -vv = DEBUG)
* `FILENAME` Path to the .pcap file to parse (*required*)

Normal usage:
- obtain a .pcap file containing the captured HTTP session (e.g. using tcpdump or Wireshark)
- run replayproxy to start the HTTP proxy (see details above)
- configure your browser to use the proxy settings (IP & port) on which replayproxy is running
- browse to the site that was captured

To get you started, `test.pcap` in this repository contains a capture of a visit to http://www.honeynet.org

## Dependencies and Installation
* Python 2.7 (the script uses Python 2-only modules such as `urlparse` and `SocketServer`)
* dpkt library (http://code.google.com/p/dpkt/)
* pynids library (http://jon.oberheide.org/pynids/)

For detailed installation instructions, see the [INSTALL.md](INSTALL.md) file

--------------------------------------------------------------------------------
/replayproxy.py:
--------------------------------------------------------------------------------
#!/usr/bin/python

###################################################################################################
#
# Copyright (c) 2011, Armin Buescher (armin.buescher@googlemail.com)
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#     * Redistributions of source code must retain the above copyright
#       notice, this list of conditions and the following disclaimer.
#     * Redistributions in binary form must reproduce the above copyright
#       notice, this list of conditions and the following disclaimer in the
#       documentation and/or other materials provided with the distribution.
#     * Neither the name of the nor the
#       names of its contributors may be used to endorse or promote products
#       derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY
# DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
# (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
# SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
###################################################################################################
#
# File: replayproxy.py
# Desc.: ReplayProxy is a forensic tool to replay web-based attacks (and also general HTTP traffic) that were captured in a pcap file.
# Functionality:
#   * parse HTTP streams from .pcap files
#   * open a TCP socket and listen as a HTTP proxy using the extracted HTTP responses as a cache while refusing all requests for unknown URLs
# Author: Armin Buescher (armin.buescher@googlemail.com)
# Contribs: Marco Cova (marco@lastline.com), Tom Sparrow (sparrowt)
# Thx to: Andrew Brampton (brampton@gmail.com) for his example code on how to parse HTTP streams from .pcap files using dpkt
#
###################################################################################################
#
# Changelog
# 1.1 (Marco Cova, marco@lastline.com)
#   - tcpreassembly via pynids
#   - initial support for non-exact matches
#   - general refactoring
#
# 1.2 (Tom Sparrow, sparrowt)
#   - fix handling of missing content-length header
#   - handle request/response parse errors and continue
#
###################################################################################################

import argparse
import dpkt
import gzip
import logging
import nids
import sys
import urlparse
import SocketServer
import StringIO

END_STATES = (nids.NIDS_CLOSE, nids.NIDS_TIMEOUT, nids.NIDS_RESET)

# { full url: [response1] }
# for now just assume only 1 response per url
resources = {}


########################
# pcap parsing
########################

# keep track of non-closed TCP streams, which otherwise are not processed
# using the regular pynids API
openstreams = {}


def reassembleTcpStream(tcp):

    if tcp.nids_state == nids.NIDS_JUST_EST:
        # always assume it is HTTP traffic (else: if dport == 80)
        tcp.client.collect = 1
        tcp.server.collect = 1

        openstreams[tcp.addr] = tcp
    elif tcp.nids_state == nids.NIDS_DATA:
        # keep all of the stream's new data
        tcp.discard(0)

        openstreams[tcp.addr] = tcp
    elif tcp.nids_state in END_STATES:
        del openstreams[tcp.addr]

        processTcpStream(tcp)
    else:
        print >>sys.stderr, "Unknown nids state"


def processTcpStream(tcp):
    ((src, sport), (dst, dport)) = tcp.addr

    # We can not handle HTTPS
    if 443 in [sport, dport]:
        logging.warning("Ignoring HTTPS/SSL stream (%s:%s -> %s:%s)" % (src, sport, dst, dport))
        return

    # data to server
    server_data = tcp.server.data[:tcp.server.count]
    # data to client
    client_data = tcp.client.data[:tcp.client.count]

    # extract *all* the requests in this stream
    req = ""
    while len(req) < len(server_data):
        req_parsed = False
        # reset per request so the error handler below never hits an undefined name
        full_uri = None
        try:
            req = dpkt.http.Request(server_data)
            req_parsed = True
            host_hdr = req.headers['host']
            full_uri = req.uri if req.uri.startswith("http://") else \
                "http://%s:%d%s" % (host_hdr, dport, req.uri) if dport != 80 else \
                "http://%s%s" % (host_hdr, req.uri)
            logging.info("Processing tcp stream for %s", full_uri)
            res = dpkt.http.Response(client_data)
            logging.debug(res)
            if "content-length" in res.headers:
                body_len = int(res.headers["content-length"])
                hdr_len = client_data.find('\r\n\r\n')
                client_data = client_data[body_len + hdr_len + 4:]
            else:
                # no content-length: assume the body runs up to the next response,
                # or to the end of the stream if no further response follows
                hdr_len = client_data.find('\r\n\r\n')
                next_res = client_data.find("HTTP/1", hdr_len)
                client_data = client_data[next_res:] if next_res != -1 else ""

            if full_uri not in resources:
                resources[full_uri] = []
            resources[full_uri].append(res)

            server_data = server_data[len(req):]
        except Exception as ex:
            logging.error("Failed to parse {}. Exception: {}".format("response" if req_parsed else "request", str(ex)))
            logging.error("Stopping processing of TCP stream %s:%s -> %s:%s (%s)" % (src, sport, dst, dport, full_uri))
            break


def get_resource(uri):
    # exact match?
    if uri in resources:
        return resources[uri][0]

    # group the cached urls by hostname for the fuzzy matches below
    resources_by_domain = {}
    for u in resources:
        domain = urlparse.urlparse(u).hostname
        if domain not in resources_by_domain:
            resources_by_domain[domain] = []
        resources_by_domain[domain].append(u)

    uri_domain = urlparse.urlparse(uri).hostname
    uri_path = urlparse.urlparse(uri).path
    if uri_domain in resources_by_domain:
        # do we have one page from the same domain of the requested uri?
        if len(resources_by_domain[uri_domain]) == 1:
            logging.info("Matching %s with %s (one url from requested domain)", uri, resources_by_domain[uri_domain][0])
            return resources[resources_by_domain[uri_domain][0]][0]

        # is there a page with same path as requested uri?
        for u in resources_by_domain[uri_domain]:
            if urlparse.urlparse(u).path == uri_path:
                logging.info("Matching %s with %s (same path and domain)", uri, u)
                return resources[u][0]

    return None


########################
# HTTP proxy
########################
class ProxyServer(SocketServer.TCPServer):
    allow_reuse_address = True


class ProxyRequestHandler(SocketServer.BaseRequestHandler):

    def handle(self):
        # handles a request of a client
        # callback for SocketServer
        sock_client = self.request
        http_req = ProxyRequestHandler.recvRequest(sock_client)
        if http_req:
            resp = get_resource(http_req.uri)
            if resp:
                logging.info("Request for %s" % http_req.uri)
                ProxyRequestHandler.sendResponse(resp, sock_client)
            else:
                # unknown URL: reply with nothing and close the connection
                sock_client.send('')
                logging.warning("Request for unknown URL %s" % http_req.uri)
        sock_client.close()

    @staticmethod
    def recvRequest(sock):
        total_data = sock.recv(16384)
        while 1:
            try:
                http_req = dpkt.http.Request(total_data)
                return http_req
            except dpkt.NeedData:
                data = sock.recv(16384)
                if not data:
                    # client closed the connection before sending a full request
                    return None
                total_data += data
            except Exception:
                logging.error("Error while processing HTTP Request!")
                return None

    @staticmethod
    def sendResponse(resp, conn):
        resp.version = '1.0'
        if 'content-encoding' in resp.headers and resp.headers['content-encoding'] == 'gzip':
            # serve the body uncompressed
            del resp.headers['content-encoding']
            compressed = resp.body
            compressedstream = StringIO.StringIO(compressed)
            gzipper = gzip.GzipFile(fileobj=compressedstream)
            data = gzipper.read()
            resp.body = data
            resp.headers['content-length'] = len(resp.body)
        # sendall keeps writing until the whole response has been sent
        conn.sendall(resp.pack())


########################
# main
########################

def main():

    # parse args
    argparser = argparse.ArgumentParser()
    argparser.add_argument('PCAP', help='Path to the .pcap file to parse')
    argparser.add_argument('-H', metavar='HOST', default='127.0.0.1', help='Address to listen on (DEFAULT: 127.0.0.1)')
    argparser.add_argument('-p', metavar='PORT', type=int, default=3128, help='Port to listen on (DEFAULT: 3128)')
    argparser.add_argument('-v', action='append_const', const=1, default=[], help='Increase the verbosity level')
    args = argparser.parse_args()

    HOST, PORT = args.H, args.p
    verbosity = len(args.v)

    # setup logger
    if verbosity == 0:
        log_level = logging.ERROR
    elif verbosity == 1:
        log_level = logging.INFO
    else:
        log_level = logging.DEBUG
    logging.basicConfig(format='%(levelname)s:%(message)s', level=log_level)

    # setup the reassembler
    nids.param("scan_num_hosts", 0)  # disable portscan detection
    nids.chksum_ctl([('0.0.0.0/0', False)])  # disable checksum verification: jsunpack says it may cause missed traffic
    nids.param("filename", args.PCAP)
    nids.init()
    nids.register_tcp(reassembleTcpStream)
    logging.info("Processing TCP streams...")
    nids.run()

    # process the open streams, which are not processed by pynids
    logging.info("Processing open streams...")
    for stream in openstreams.values():
        processTcpStream(stream)

    # run proxy server (address reuse is already enabled on the ProxyServer class)
    server = ProxyServer((HOST, PORT), ProxyRequestHandler)
    try:
        logging.info("Proxy listening on %s:%d" % (HOST, PORT))
        server.serve_forever()
    except KeyboardInterrupt:
        return 0

if __name__ == "__main__":
    sys.exit(main())

--------------------------------------------------------------------------------
/test.pcap:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sparrowt/replayproxy/7668513b62701d81fe40fa49c58967e89107c43e/test.pcap

--------------------------------------------------------------------------------
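
The lookup order used by `get_resource` in `replayproxy.py` (exact URL, then the only cached URL for that host, then a cached URL with the same host and path) can be tried without pynids or dpkt. The following is only a minimal sketch written for Python 3, not the script's own code: `find_resource`, the `cache` dictionary and the `*.test` URLs are hypothetical names introduced for illustration, and plain strings stand in for the parsed responses.

```python
from urllib.parse import urlparse


def find_resource(uri, resources):
    """Hypothetical stand-in for get_resource: return a cached response or None."""
    # 1. exact URL match
    if uri in resources:
        return resources[uri][0]

    # group the cached URLs by hostname
    by_host = {}
    for cached in resources:
        by_host.setdefault(urlparse(cached).hostname, []).append(cached)

    wanted = urlparse(uri)
    candidates = by_host.get(wanted.hostname, [])

    # 2. exactly one cached URL for the requested host: serve it
    if len(candidates) == 1:
        return resources[candidates[0]][0]

    # 3. otherwise look for a cached URL with the same path (query string ignored)
    for cached in candidates:
        if urlparse(cached).path == wanted.path:
            return resources[cached][0]

    return None


# illustrative cache: { full url: [response] }, strings stand in for responses
cache = {
    "http://example.test/index.html?v=1": ["index"],
    "http://example.test/style.css": ["css"],
    "http://single.test/": ["only"],
}
```

With this cache, a request for `http://example.test/index.html?v=2` falls through to the same-path rule, while any URL on `single.test` is answered by that host's single cached response.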