├── .gitignore
├── LICENSE
├── README.md
├── api
│   ├── __init__.py
│   ├── config.py
│   ├── dbghelp.py
│   ├── main.py
│   ├── mainhandler.py
│   ├── objectpool.py
│   ├── requeststatistics.py
│   ├── statistics.py
│   ├── symbolhandler.py
│   ├── symbolserver.py
│   └── testhandler.py
├── config
│   └── default.pysymproxy.json
├── data
│   └── empty.txt
├── dbghelp
│   └── symsrv.yes
├── requirements.txt
├── run_server.bat
├── server.py
└── static
    ├── main.html.jinja
    └── screenshot.png
/.gitignore:
--------------------------------------------------------------------------------
1 | .idea/
2 | *.pyc
3 | data/
4 | *.dll
5 | .vs/
6 | *.pyproj
7 | *.sln
8 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | The MIT License (MIT)
2 |
3 | Copyright (c) 2016 Joshua Green
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # pySymProxy
2 | pySymProxy is an implementation of a Microsoft Symbol Proxy server using Python.
3 | Symbol proxy servers are used to organise, optimise, and cache debugging symbols from multiple other servers.
4 |
5 | 
6 |
7 | See the following links for information on symbols and symbol servers
8 | - [Debugging with Symbols](https://msdn.microsoft.com/en-us/library/windows/desktop/ee416588(v=vs.85).aspx)
9 | - [Symbol Server and Symbol Stores](https://msdn.microsoft.com/en-us/library/windows/desktop/ms680693(v=vs.85).aspx)
10 | - [SymProxy](https://msdn.microsoft.com/en-us/library/windows/hardware/ff558846(v=vs.85).aspx)
11 | - [Symbols the Microsoft way](https://randomascii.wordpress.com/2013/03/09/symbols-the-microsoft-way/)
12 |
13 | ---
14 | ## Features
15 | | Feature | Microsoft SymProxy | pySymProxy | Description |
16 | | ------------- |:-------------:|:-----:|-----|
17 | | Common configuration of symbol search paths | X| X| Configure clients once with the symbol server's details and then all future configuration changes only need to occur on one server. Entire studios can be updated from one place. |
18 | | Global blacklist | X | X | Commonly unavailable Symbols can be denied immediately by the service and avoid unnecessary requests and delays to debuggers. |
19 | | Per server blacklist | | X | Different servers house the artifacts of different builds. If a server will never have a symbol, blacklist it so the request isn't sent in the first place. Useful when symbols will exist on one of the servers you reference, so a global blacklist won't help. |
20 | | Per server whitelist | | X | This might be easier to configure than a per server blacklist. |
21 | | Statistics collection | | X | Logs and statistics about each symbol served are kept and made available on the web interface. These statistics may help you identify further configuration optimisation opportunities. |
22 | | Configurable server retry timeouts | | X| There is no need to query servers over and over again for symbols. If they didn't exist there 5 minutes ago, is it likely they're there now? Configure the timeout that makes sense for each server. |
23 | | Open source | | X | Need another feature? Jump in and implement it :)|
24 | ---
25 | ## Requirements
26 | | Requirement | Microsoft SymProxy | pySymProxy |
27 | | ------------- |:-------------:|:-----:|
28 | | Windows | X| X|
29 | | IIS | X| |
30 | | Python | | X|
31 | ---
32 | ## Why
33 | I was driven to implement this solution when I found myself waiting for symbols on my system to download and run all the time. This had become intolerably slow because one of the servers we might fetch symbols from is on the other side of the world. I had configured this server to be last in my symbol search paths so other (faster) servers would get the chance to service the request first. This worked for most symbols, but there are plenty of modules that we simply don't have access to the symbols for. These would always hit the slow server (after hitting the others) and wait for that server to respond to each request (every symbol fetch requires 3 HTTP requests to the remote servers). Some of my colleagues had set up exclusion lists for their configurations, but it became apparent that these configurations were non-trivial and completely different on each developer's machine.
34 |
35 | This sounded like the perfect reason to set up a SymProxy as described by the Microsoft documentation. After attempting to configure such a proxy, it also became clear that configuring it wasn't going to be easy, and it would still hit the slow path for many of the requests we would make.
36 |
37 | ---
38 | ## Configuration
39 | ### Installation
40 | Make sure you are running Windows and have Python 2.7 or 3+ installed.
41 | You can clone this repository into a folder on the Windows machine.
42 | To run the server, you can run `run_server.bat` in the root folder of the repository.
43 | This will run a server on the machine on port 8080.
44 | `run_server.bat` can be configured as a service on the machine so it automatically starts when the machine reboots. A search online can help with this configuration.
45 |
46 | Because Microsoft no longer explicitly allows redistribution of the `dbghelp.dll` files, I can't commit them directly into the repository. But I can point you to the place to get them: [Debugging Tools for Windows](https://developer.microsoft.com/en-us/windows/hardware/windows-driver-kit). Simply copy/paste the x86 `dbghelp.dll` and `symsrv.dll` to the `dbghelp` directory of the repository.
47 |
48 | ### Client configuration
49 | Developers' PCs should be configured to reference the pySymProxy service by setting the symbol path environment variable (typically `_NT_SYMBOL_PATH`) on their machine to the following:
50 | `srv*C:\Symbols*http://pysymproxy.company.local:8080/symbols`
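
As a rough illustration, the value above can be assembled and applied for a single debugging session from Python. This is only a sketch: the helper name `build_sympath` is hypothetical, and it assumes `_NT_SYMBOL_PATH` is the variable your debugger reads (it is the one used by WinDbg and the other Debugging Tools for Windows).

```python
import os

def build_sympath(local_cache, proxy_url):
    """Compose a 'srv*<cache>*<server>' symbol path (hypothetical helper)."""
    return "srv*{}*{}".format(local_cache, proxy_url)

sympath = build_sympath(r"C:\Symbols", "http://pysymproxy.company.local:8080/symbols")

# Assumption: the debugger picks the symbol path up from _NT_SYMBOL_PATH.
# Setting it here only affects this process and the children it launches.
os.environ["_NT_SYMBOL_PATH"] = sympath
print(sympath)
```

For a permanent setting, the same string would go into the machine's or user's environment variables instead.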
51 |
52 | ### Checking the status of the server
53 | At any time navigate to the base address of the server in a web browser.
54 | Such as: `http://pysymproxy.company.local:8080/`
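
The status page also has machine-readable counterparts: `api/mainhandler.py` serves `statistics.json`, `symbols.json`, and `pysymproxy.json` from the same base address. A minimal sketch of polling them, assuming Python 3's `urllib` and the example host name used above (the helper names `status_url` and `fetch_statistics` are hypothetical):

```python
import json
from urllib.request import urlopen

BASE_URL = "http://pysymproxy.company.local:8080"  # assumed host, as in the example above

def status_url(name, base_url=BASE_URL):
    """Build the URL of one of the JSON documents served by MainHandler."""
    return "{}/{}".format(base_url.rstrip("/"), name)

def fetch_statistics(base_url=BASE_URL, timeout=10):
    """Download and decode statistics.json (needs a running server)."""
    with urlopen(status_url("statistics.json", base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Only the URL is built here; fetch_statistics() is not called because it
# would need a reachable server.
print(status_url("statistics.json"))
```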
55 |
56 | ### Configuration files
57 | When starting, the server will attempt to load a configuration file.
58 | The following locations will be checked, and the first location that exists will be used as the configuration.
59 | - `../config/pysymproxy.json`
60 | - `./config/pysymproxy.json`
61 | - `./config/default.pysymproxy.json`
62 |
63 | The repository has a `./config/default.pysymproxy.json`. It is recommended to copy this file to one of the other locations, then begin to configure the server for your needs.
64 |
65 | These files are standard JSON files and contain sections that define how the server is configured.
66 |
67 | #### Configuration - `identity` section
68 | - `name` - (string) The title describing the server
69 | - `host` - (string) The name of the host that the server is running on
70 | - `administrator` - (string) Contact information for the server administrator (displayed on the status page)
71 | - `default_sympath` - (string) Client configuration information offered on the status page.
72 |
73 | #### Configuration - `general` section
74 | - `enableStatistics` - (boolean) Whether or not statistics should be enabled on the server - may affect performance
75 | - `cacheLocation` - (string) A location on disk where symbols can be cached for serving to clients
76 | - `blacklist` - (list of strings) A list of regular expression patterns. If a requested file matches one of these patterns it will be rejected immediately.
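
To make the `blacklist` behaviour concrete, here is a small sketch of how such a check could work. It is not the server's actual code: the function name is hypothetical, the patterns are invented examples, and it assumes patterns are applied case-insensitively with `re.search`.

```python
import re

def is_blacklisted(filename, patterns):
    """Return True if filename matches any of the regular expression patterns."""
    return any(re.search(p, filename, re.IGNORECASE) for p in patterns)

# Invented example patterns: files starting with "nv", and any PDB whose
# name ends in "_test".
blacklist = [r"^nv.*\.pdb$", r"_test\.pdb$"]

print(is_blacklisted("nvwgf2um.pdb", blacklist))
print(is_blacklisted("kernel32.pdb", blacklist))
```

Rejecting a request this way costs one regular expression pass and avoids any round trip to a remote server.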
77 |
78 | #### Configuration - `servers` section
79 | This section is expected to contain a list of server objects.
80 | When attempting to find symbols, these servers will be searched in the order they are defined in this list.
81 | Each object can have the following properties:
82 | - `name` - (string) The name of this server
83 | - `remote` - (string) a URL, file path, or network path where symbols are stored as a symbol server
84 | - `cacheLocation` - (string) optional location on disk where symbols from this specific server can be cached for serving to clients
85 | - `retryTimout` - (number) The number of seconds to wait after a failed symbol lookup before allowing another request for the same symbol to try again.
86 | - `maxRequests` - (number) Maximum number of simultaneous requests that can be served by this server
87 | - `blacklist` - (list of strings) A list of regular expression patterns. If a requested file matches one of these patterns it will be rejected immediately by this server.
88 | - `whitelist` - (list of strings) A list of regular expression patterns. If a requested file does NOT match one of these patterns it will be rejected immediately by this server.
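
The retry timeout described in the list above amounts to a small negative cache: remember when a lookup last failed and refuse to ask the remote again until the timeout has passed. The following is a sketch of that idea only, not the code in `api/symbolserver.py`. The class name `RetryGate` and its methods are hypothetical, and the clock is injectable purely so the behaviour can be demonstrated without waiting.

```python
import time

class RetryGate:
    """Remember failed lookups and suppress retries until a timeout expires."""

    def __init__(self, retry_timeout, clock=time.time):
        self._retry_timeout = retry_timeout  # seconds, like the per-server setting
        self._clock = clock                  # injectable for demonstration/testing
        self._failed_at = {}                 # symbol key -> time of last failure

    def should_try(self, key):
        failed_at = self._failed_at.get(key)
        if failed_at is None:
            return True  # never failed (or succeeded since), so go ahead
        return self._clock() - failed_at >= self._retry_timeout

    def record_failure(self, key):
        self._failed_at[key] = self._clock()

    def record_success(self, key):
        self._failed_at.pop(key, None)

# Demonstration with a fake clock so no real waiting is needed.
now = [1000.0]
gate = RetryGate(retry_timeout=300, clock=lambda: now[0])
gate.record_failure("foo.pdb/ABCDEF1")
print(gate.should_try("foo.pdb/ABCDEF1"))  # still inside the 300 second window
now[0] += 301
print(gate.should_try("foo.pdb/ABCDEF1"))  # the window has elapsed
```

This sketch is not thread-safe; a real server handling simultaneous requests would need a lock around the dictionary.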
89 |
90 | #### Configuration - `logging` section
91 | The config file can have a `logging` object defined. This defines a dict that is passed to Python's `logging.config.dictConfig`. It can be used to configure the server to write log output to disk for later analysis. The default configuration outputs to rotating log files and to the console. See the Python logging documentation for specific details on this section.
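
Putting the sections together, a configuration file could be generated as sketched below. Every value is a placeholder chosen for illustration (host names, paths, patterns, and timeouts are assumptions, not the contents of `default.pysymproxy.json`); only the key names come from the lists above, including `retryTimout`, which is spelled here exactly as it is documented.

```python
import json

config = {
    "identity": {
        "name": "Example pySymProxy",
        "host": "pysymproxy.company.local",
        "administrator": "admin@company.local",
        "default_sympath": "srv*C:\\Symbols*http://pysymproxy.company.local:8080/symbols",
    },
    "general": {
        "enableStatistics": True,
        "cacheLocation": "./data/symbols",
        "blacklist": [r"^nv.*\.pdb$"],
    },
    "servers": [
        {
            "name": "Microsoft",
            "remote": "https://msdl.microsoft.com/download/symbols",
            "retryTimout": 3600,  # key spelled as in the property list above
            "maxRequests": 4,
        },
        {
            "name": "Build share",
            "remote": "\\\\buildserver\\symbols",
            "retryTimout": 60,
            "maxRequests": 8,
            "whitelist": [r"^game.*\.pdb$"],
        },
    ],
    # Handed to logging.config.dictConfig, which requires a "version" key.
    "logging": {"version": 1, "disable_existing_loggers": False},
}

text = json.dumps(config, indent=2)
print(text)
```

Saving `text` as `./config/pysymproxy.json` would make it take precedence over the default file, following the lookup order described earlier.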
92 |
93 | ---
94 | ## Known issues
95 | ### "Failed to load symbol"
96 | Sometimes when a symbol exists on a remote server, the remote requests aren't serviced quickly enough, and the client (debugger) requesting the symbol treats it as a failure. Often attempting to load the symbol again will hit the cache and immediately load the symbol the second time around. This might be solved in future using chunked encoding streams and generating 0 bytes to keep the connection alive for longer.
97 |
98 | ### dbghelp.dll should not be accessed by multiple threads
99 | Yeah, so we're abusing the dbghelp.dll a bit here. The documentation makes no guarantees that the dll will do what it should if it is accessed by multiple threads. This isn't really useful for a webservice - so in an attempt to limit the risk of something bad happening - I've loaded the dll into the process multiple times. Each load of the DLL will only be used by one thread at a time. This still may not guarantee things will work, and it may not even be required in this situation. EAFP is the 'pythonic way' right?
100 |
101 | ---
102 | ## FAQ
103 | ### I use local network shares for my symbol storage. Why would I use this?
104 | You probably shouldn't. Local network shares are great and fast.
105 | However, there are some useful features that may make it worth trying, such as statistics collection, and the fact that symbol server configuration is managed by the server rather than each developer's individual PC configuration.
106 |
107 | ### Microsoft has an implementation of a Symbol Proxy (symproxy), why not use that?
108 | Our initial attempts to configure a symproxy as per the documentation failed. Varying versions of IIS, mixed with configuration headaches and legacy installations made it unclear where the issue was. Whilst attempting to configure the server it also became clear that there were desirable configuration settings missing, and the complexity of the task that was being performed by the service wasn't very high.
109 | So I thought I would give it a shot and write something that worked the way we wanted.
110 |
111 | ### Why not just use Linux? Why use Windows?
112 | When initially attempting an implementation, I tried to get a symproxy working using just Python. This could then have been hosted on a Linux server. I ran into trouble when attempting to forward requests to the Microsoft symbol server. This server doesn't accept requests from anything but their software. So the current implementation makes use of `dbghelp.dll` and `symsrv.dll` to build requests for other servers (just as your debugger would).
113 |
114 | ### How do I store symbols on this symbol server?
115 | This server implementation just serves symbols. It can serve them from a local network share, so the recommended practice here is to "store" your built symbols on a local network share as you normally would. Then configure the server to serve the symbols stored at this location.
116 |
117 | ---
118 | ## Technologies used
119 | - [Python](https://www.python.org/)
120 | - [Falcon](https://falconframework.org/)
121 | - [Waitress](http://docs.pylonsproject.org/projects/waitress/en/latest/)
122 | - [Jinja2](http://jinja.pocoo.org/)
123 | - [jQuery](https://jquery.com/)
124 | - [w3-css](http://www.w3schools.com/w3css/)
125 | - [Google Material Icons](https://material.io/icons/)
126 | - [Debugging Tools for Windows](https://developer.microsoft.com/en-us/windows/hardware/windows-driver-kit)
127 |
128 | ---
129 | ## Disclaimer
130 | Use at your own risk.
131 | This is the first time I've written a webservice or used python for more than a small script.
132 | If you think you could have done better - you probably could have :)
133 | Please send me your pull requests!
134 |
135 | ---
136 | ## Support
137 | Feel free to create issues if you see something that needs fixing or improvement.
138 | I also welcome pull requests for useful additions and bugfixes.
--------------------------------------------------------------------------------
/api/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/inbilla/pySymProxy/a696ac15a0a468f6ff3ad6fe4591b435656752c5/api/__init__.py
--------------------------------------------------------------------------------
/api/config.py:
--------------------------------------------------------------------------------
1 | import logging
2 | import logging.config
3 | import logging.handlers
4 | import json
5 | import os
6 |
7 | def findConfigFile(candidates):
8 |     for location in candidates:
9 |         if os.path.isfile(location):
10 |             return location
11 |     return candidates[-1]
12 |
13 | def findConfigValue(rootDict, name, required=False, default=None):
14 |     curElement = rootDict
15 |     elements = name.split(".")
16 |     for element in elements:
17 |         curElement = curElement.get(element)
18 |         if curElement is None:
19 |             break
20 |
21 |     if curElement is None:
22 |         if required:
23 |             raise Exception("Configuration value missing: " + name)
24 |         curElement = default
25 |
26 |     return curElement
27 |
28 | class Config:
29 |     def __init__(self, configFile):
30 |         # Load configuration information
31 |         self._configFile = configFile
32 |         with open(configFile) as data_file:
33 |             self._configData = json.load(data_file)
34 |         logging.config.dictConfig(self.loggingConfig())
35 |
36 |     def configFile(self):
37 |         return self._configFile
38 |
39 |     def name(self):
40 |         return self.findConfigValue("identity.name")
41 |
42 |     def host(self):
43 |         return self.findConfigValue("identity.host")
44 |
45 |     def administrator(self):
46 |         return self.findConfigValue("identity.administrator")
47 |
48 |     def sympath(self):
49 |         return self.findConfigValue("identity.default_sympath")
50 |
51 |     def servers(self):
52 |         return self.findConfigValue("servers")
53 |
54 |     def cacheLocation(self):
55 |         return self.findConfigValue("general.cacheLocation")
56 |
57 |     def blacklist(self):
58 |         return self.findConfigValue("general.blacklist")
59 |
60 |     def loggingConfig(self):
61 |         return self.findConfigValue("logging", required=False, default={"version": 1})  # dictConfig() rejects a dict without "version"
62 |
63 |     def extractLogFiles(self, logger, logfiles):
64 |         for handler in logger.handlers:
65 |             if isinstance(handler, logging.FileHandler):
66 |                 logfiles.append(handler.baseFilename)
67 |             if isinstance(handler, logging.handlers.RotatingFileHandler):
68 |                 for x in range(1, handler.backupCount + 1):  # backups are named <base>.1 .. <base>.<backupCount>
69 |                     logfiles.append(handler.baseFilename + "." + str(x))
70 |
71 |     def logfiles(self):
72 |         logfiles = []
73 |         for loggerName in logging.Logger.manager.loggerDict:
74 |             logger = logging.getLogger(loggerName)
75 |             self.extractLogFiles(logger, logfiles)
76 |             self.extractLogFiles(logger.root, logfiles)
77 |
78 |         logfiles = list(set(logfiles))
79 |         logfiles = [f for f in logfiles if os.path.exists(f)]
80 |         logfiles.sort()
81 |
82 |         return logfiles
83 |
84 |     def findConfigValue(self, name, required=True, default=None):
85 |         return findConfigValue(self._configData, name, required, default)
86 |
87 |
88 |
--------------------------------------------------------------------------------
/api/dbghelp.py:
--------------------------------------------------------------------------------
1 | from __future__ import print_function
2 | # Import print() function for Python 2.7 compatibility
3 |
4 | import ctypes
5 | import ctypes.wintypes
6 | import os
7 | import sys
8 | import uuid
9 | import threading
10 | from shutil import copyfile
11 | import logging
12 |
13 | logger = logging.getLogger(__name__)
14 |
15 | class dbghelp:
16 |     g_shareDll = True
17 |     g_uniqueInitializationHandle = 1
18 |     g_lock = threading.RLock()
19 |
20 |     def __init__(self, localCachePath, serverCachePath, server):
21 |         self._localCachePath = localCachePath
22 |         self._serverCachePath = serverCachePath
23 |         self._server = server
24 |
25 |         if dbghelp.g_shareDll:
26 |             self._lock = threading.RLock()
27 |         else:
28 |             self._lock = dbghelp.g_lock
29 |
30 |         self._uniqueProcessHandle = dbghelp.g_uniqueInitializationHandle
31 |         dbghelp.g_uniqueInitializationHandle = dbghelp.g_uniqueInitializationHandle + 1
32 |
33 |         self.initialize()
34 |
35 |     def __del__(self):
36 |         self.SymCleanup(self._uniqueProcessHandle)
37 |
38 |     def symCallbackProc(self, process, actionCode, callbackData, context):
39 |         with self._lock:
40 |             if actionCode == self.CBA_EVENT or \
41 |                     actionCode == self.CBA_SRCSRV_EVENT:
42 |
43 |                 class CBA_EVENT_DATA(ctypes.Structure):
44 |                     _fields_ = [
45 |                         ('severity', ctypes.c_ulong),
46 |                         ('code', ctypes.c_ulong),
47 |                         ('desc', ctypes.c_char_p),
48 |                         ('object', ctypes.c_void_p)]
49 |
50 |                 data = ctypes.cast(callbackData, ctypes.POINTER(CBA_EVENT_DATA))
51 |                 message = data[0].desc.replace("\b", "").strip()
52 |                 logger.info("dllEvent {}>({}) {}".format(self._uniqueProcessHandle, data[0].code, message))
53 |                 return 1
54 |             elif actionCode == 0x07:  # CBA_DEFERRED_SYMBOL_LOAD_CANCEL
55 |                 # Opportunity to cancel a download.
56 |                 # always returning false here
57 |                 wantToCancel = 0
58 |                 return wantToCancel
59 |             elif actionCode == 0x08:  # CBA_SET_OPTIONS
60 |                 # Event that indicates that setOptions has been called and applied new options to the system.
61 |                 # Don't need to know about this in our code
62 |                 return 1
63 |             else:
64 |                 logger.info("dllEvent {}> unknown event {}".format(self._uniqueProcessHandle, actionCode))
65 |
66 |         return 0
67 |
68 |     def initialize(self):
69 |         self.loadDll()
70 |
71 |         # Calculate the symbol path
72 |         self._sympath = "srv*"
73 |         if self._localCachePath is not None:
74 |             self._sympath += self._localCachePath + "*"
75 |         if self._serverCachePath is not None:
76 |             self._sympath += self._serverCachePath + "*"
77 |         self._sympath += self._server
78 |
79 |         SYMOPT_DEBUG = 0x80000000
80 |         symoptions = self.SymGetOptions()
81 |         symoptions |= SYMOPT_DEBUG
82 |         self.SymSetOptions(symoptions)
83 |
84 |         # Initialize the symbol system
85 |         success = self.SymInitialize(self._uniqueProcessHandle, ctypes.c_char_p(self._sympath), ctypes.c_bool(False))
86 |         if not success:
87 |             raise ctypes.WinError()
88 |
89 |         # Setup debug callback to hook logging
90 |         success = self.SymRegisterCallback(self._uniqueProcessHandle, self.callback, 0)
91 |         if not success:
92 |             raise ctypes.WinError()
93 |
94 |
95 |     def loadDll(self):
96 |         try:
97 |             dllName = "./dbghelp/dbghelp.dll"
98 |
99 |             if not os.path.exists(dllName):
100 |                 logger.error("dbghelp.dll and symsrv.dll must be placed in the dbghelp folder.")
101 |                 logger.error("These files can be downloaded in the Debugging Tools for Windows SDK.")
102 |                 raise Exception("dbghelp.dll and symsrv.dll are not in the expected location")
103 |
104 |             if dbghelp.g_shareDll:
105 |                 targetName = dllName + "." + str(self._uniqueProcessHandle) + ".dll"
106 |                 copyfile(dllName, targetName)
107 |                 dllName = targetName
108 |
109 |             self.dbghelp_dll = ctypes.windll.LoadLibrary(dllName)
110 |             logger.info("Loaded dll: {}".format(dllName))
111 |
112 |         except WindowsError as e:
113 |             print(e)
114 |             raise
115 |
116 |         self.SymInitialize = self.dbghelp_dll["SymInitialize"]
117 |         self.SymInitialize.argtypes = [ctypes.c_ulong, ctypes.c_char_p, ctypes.c_bool]
118 |         self.SymInitialize.restype = ctypes.c_ulong
119 |         self.SymSetOptions = self.dbghelp_dll["SymSetOptions"]
120 |         self.SymSetOptions.argtypes = [ctypes.c_ulong]
121 |         self.SymSetOptions.restype = ctypes.c_ulong
122 |         self.SymGetOptions = self.dbghelp_dll["SymGetOptions"]
123 |         self.SymGetOptions.argtypes = []
124 |         self.SymGetOptions.restype = ctypes.c_ulong
125 |         self.SymCleanup = self.dbghelp_dll["SymCleanup"]
126 |         self.SymCleanup.argtypes = [ctypes.c_ulong]
127 |         self.SymCleanup.restype = ctypes.c_ulong
128 |         self.SymFindFileInPath = self.dbghelp_dll["SymFindFileInPath"]
129 |         self.SymFindFileInPath.argtypes = [ctypes.c_ulong, ctypes.c_char_p, ctypes.c_char_p, ctypes.c_void_p,
130 |                                            ctypes.c_ulong, ctypes.c_ulong, ctypes.c_ulong, ctypes.c_char_p, ctypes.c_void_p, ctypes.c_void_p]
131 |         self.SymFindFileInPath.restype = ctypes.c_ulong
132 |         self.SymFindFileInPath_pdb = self.dbghelp_dll["SymFindFileInPath"]
133 |         self.SymFindFileInPath_pdb.argtypes = [ctypes.c_ulong, ctypes.c_char_p, ctypes.c_char_p, ctypes.c_void_p,
134 |                                                ctypes.c_ulong, ctypes.c_ulong, ctypes.c_ulong, ctypes.c_char_p,
135 |                                                ctypes.c_void_p, ctypes.c_void_p]
136 |         self.SymFindFileInPath_pdb.restype = ctypes.c_ulong
137 |
138 |         self.SymRegisterCallbackProc = ctypes.WINFUNCTYPE(ctypes.wintypes.BOOL, ctypes.c_ulong, ctypes.c_ulong, ctypes.c_void_p, ctypes.c_void_p)
139 |         self.SymRegisterCallback = self.dbghelp_dll["SymRegisterCallback"]
140 |         self.SymRegisterCallback.argtypes = [ctypes.c_ulong, self.SymRegisterCallbackProc, ctypes.c_void_p]
141 |         self.SymRegisterCallback.restype = ctypes.c_ulong
142 |         self.callback = self.SymRegisterCallbackProc(self.symCallbackProc)
143 |
144 |         self.SSRVOPT_DWORD = 0x00000002
145 |         self.SSRVOPT_DWORDPTR = 0x00000004
146 |         self.SSRVOPT_GUIDPTR = 0x00000008
147 |
148 |         self.CBA_EVENT = 0x00000010
149 |         self.CBA_SRCSRV_EVENT = 0x40000000
150 |
151 |     def extractIdentifiers_Pdb(self, id):
152 |         return (
153 |             #bytearray.fromhex(id[:32])
154 |             uuid.UUID(id[:32]),
155 |             int(id[32:], 16))
156 |
157 |     def extractIdentifiers_Binary(self, id):
158 |         return (
159 |             int(id[:8], 16),
160 |             int(id[8:], 16))
161 |
162 |     def findFile(self, name, identifier):
163 |         logger.info("Find request: {}/{}".format(name, identifier))
164 |         result = None
165 |         try:
166 |             with self._lock:
167 |                 if name.lower().endswith(".pdb"):
168 |                     result = self.findFile_Pdb(name, identifier)
169 |                 else:
170 |                     result = self.findFile_Binary(name, identifier)
171 |         except Exception:
172 |             raise
173 |         finally:
174 |             logger.info("Find result: {}".format(result))
175 |         return result
176 |
177 |     def findFile_Binary(self, name, identifier):
178 |         (id1, id2) = self.extractIdentifiers_Binary(identifier)
179 |
180 |         fileLocation = ctypes.create_string_buffer(b'\000' * 1024)
181 |         flags = self.SSRVOPT_DWORD
182 |         result = self.SymFindFileInPath(self._uniqueProcessHandle, self._sympath, name, id1, id2, 0, flags, fileLocation, None, None)
183 |         if not result:
184 |             raise ctypes.WinError()
185 |
186 |         return fileLocation.value
187 |
188 |     def findFile_Pdb(self, name, identifier):
189 |         (id1, id2) = self.extractIdentifiers_Pdb(identifier)
190 |
191 |         #convert guid to a pointer to a guid buffer
192 |         id1 = bytearray(id1.bytes_le)
193 |         id1 = (ctypes.c_ubyte * len(id1)).from_buffer(id1)
194 |
195 |         fileLocation = ctypes.create_string_buffer(b'\000' * 1024)
196 |         flags = self.SSRVOPT_GUIDPTR
197 |         result = self.SymFindFileInPath_pdb(self._uniqueProcessHandle, self._sympath, name, ctypes.byref(id1), id2, 0,
198 |                                             flags, fileLocation, None, None)
199 |
200 |         # if the search reports unsuccessful, it is possible it still
201 |         # succeeded. This appears common with long distance servers with high latency.
202 |         # Check if the file exists in the location we might expect:
203 |         if not result:
204 |             possible_location = "{}/{}/{}/{}".format(self._localCachePath, name, identifier, name)
205 |             if os.path.isfile(possible_location):
206 |                 logger.info("DbgHlp reported unable to find, but file was found locally anyway, returning local file. {}".format(possible_location))
207 |                 fileLocation.value = possible_location
208 |                 result = True
209 |
210 |         if not result:
211 |             raise ctypes.WinError()
212 |
213 |         return fileLocation.value
--------------------------------------------------------------------------------
/api/main.py:
--------------------------------------------------------------------------------
1 | import falcon
2 | from . import config
3 | from . import mainhandler
4 | from . import symbolhandler
5 | from . import testhandler
6 |
7 | configFile = config.findConfigFile([
8 |     '../config/pysymproxy.json',
9 |     './config/pysymproxy.json',
10 |     './config/default.pysymproxy.json'])
11 |
12 | configuration = config.Config(configFile)
13 |
14 | symbolroutehandler = symbolhandler.SymbolHandler(configuration)
15 | testrouteHandler = testhandler.TestHandler()
16 | defaultroutehandler = mainhandler.MainHandler(configuration, symbolroutehandler.getStats())
17 |
18 | api = falcon.API()
19 | api.add_route('/{file}', defaultroutehandler)
20 | api.add_route('/symbols/{file}/{identifier}/{rawfile}', symbolroutehandler)
21 | api.add_route('/test/{file1}/{file2}/{file3}', testrouteHandler)
--------------------------------------------------------------------------------
/api/mainhandler.py:
--------------------------------------------------------------------------------
1 | import json
2 | import os
3 | import logging
4 | from jinja2 import Environment, FileSystemLoader
5 | env = Environment(loader=FileSystemLoader('./static'))
6 |
7 | logger = logging.getLogger(__name__)
8 |
9 | class JsonEncoder(json.JSONEncoder):
10 |     def default(self, obj):
11 |         if callable(getattr(obj, 'encodeJSON', None)):
12 |             return obj.encodeJSON()
13 |         # Let the base class default method raise the TypeError
14 |         return json.JSONEncoder.default(self, obj)
15 |
16 | def getFolderSize(folder):
17 |     if folder is None:
18 |         return 0
19 |
20 |     try:
21 |         total_size = os.path.getsize(folder)
22 |         dirList = os.listdir(folder)
23 |         if not dirList:  # was `dirList == 0`, which is never true for a list
24 |             return 0
25 |
26 |         for item in dirList:
27 |             itempath = os.path.join(folder, item)
28 |             if os.path.isfile(itempath):
29 |                 total_size += os.path.getsize(itempath)
30 |             elif os.path.isdir(itempath):
31 |                 total_size += getFolderSize(itempath)
32 |         return total_size
33 |     except Exception:
34 |         return 0
35 |
36 | class MainHandler:
37 |     def __init__(self, config, statistics):
38 |         self._config = config
39 |         self._statistics = statistics
40 |         self._template = env.get_template('main.html.jinja')
41 |
42 |     def on_get(self, req, resp, file):
43 |         logger.info("get: {} client: {}".format(file, req.remote_addr))
44 |         try:
45 |             if file == "":
46 |                 return self.on_get_index(req, resp)
47 |             elif file == "pysymproxy.json":
48 |                 return self.on_get_config(req, resp)
49 |             elif file == "statistics.json":
50 |                 return self.on_get_statistics(req, resp)
51 |             elif file == "symbols.json":
52 |                 return self.on_get_symbols(req, resp)
53 |             elif file.endswith(".log"):
54 |                 return self.on_get_logfile(req, resp, file)
55 |         except Exception as e:
56 |             resp.body = "error: " + str(e)
57 |
58 |     def on_get_index(self, req, resp):
59 |         diskUsage = 0 # sum([getFolderSize(server.get("cacheLocation", None)) for server in self._config.servers()])
60 |         diskUsage += getFolderSize(self._config.cacheLocation())
61 |         self._template = env.get_template('main.html.jinja')
62 |         resp.body = self._template.render(
63 |             serverName=self._config.name(),
64 |             admin=self._config.administrator(),
65 |             clientConfig=self._config.sympath(),
66 |             servers=self._config.servers(),
67 |             statistics=self._statistics.getStats(),
68 |             config=self._config._configData,
69 |             diskUsage=diskUsage,
70 |             logfiles=self._config.logfiles()
71 |         )
72 |         resp.content_type = "text/html"
73 |
74 |     def on_get_config(self, req, resp):
75 |         configLocation = self._config.configFile()
76 |         resp.stream = open(configLocation, 'rb')
77 |         resp.stream_len = os.path.getsize(configLocation)
78 |         resp.content_type = "application/json"
79 |
80 |     def on_get_statistics(self, req, resp):
81 |         # Build a dictionary of information to send
82 |         # Serialise it and send
83 |         stats = self._statistics.getStats()
84 |         stats.diskUsage = 0 # sum([getFolderSize(server.get("cacheLocation", None)) for server in self._config.servers()])
85 |         stats.diskUsage += getFolderSize(self._config.cacheLocation())
86 |         stats.numAcceptedRequests = stats.numRequests.value - stats.numExcluded.value
87 |
88 |         resp.data = JsonEncoder().encode(stats)
89 |         resp.content_type = "application/json"
90 |
91 |     def on_get_symbols(self, req, resp):
92 |         # Build a dictionary of information to send
93 |         # Serialise it and send
94 |         symbols = self._statistics.getSymbols()
95 |
96 |         resp.data = JsonEncoder().encode(symbols)
97 |         resp.content_type = "application/json"
98 |
99 |     def on_get_logfile(self, req, resp, file):
100 |         # Get the list of log files
101 |         logfiles = self._config.logfiles()
102 |         logIndex = int(file[:-4])
103 |
104 |         logLocation = logfiles[logIndex - 1]
105 |         resp.stream = open(logLocation, 'rb')
106 |         resp.stream_len = os.path.getsize(logLocation)
107 |         resp.content_type = "text/plain"
108 |
--------------------------------------------------------------------------------
/api/objectpool.py:
--------------------------------------------------------------------------------
1 | # Python 3+ has module "queue", while 2.7 has module "Queue"
2 | try:
3 |     import queue
4 | except ImportError:
5 |     import Queue as queue
6 |
7 | from contextlib import contextmanager
8 | import threading
9 |
10 | class ObjectPool(object):
11 |     def __init__(self, maxSize, objectType, *args):
12 |         self._semaphore = threading.BoundedSemaphore(maxSize)
13 |         self._queue = queue.Queue()
14 |
15 |         for i in range(0, maxSize):
16 |             self._queue.put(objectType(*args))
17 |
18 |     def acquire(self):
19 |         self._semaphore.acquire()
20 |         return self._queue.get()
21 |
22 |     def release(self, obj):
23 |         self._queue.put(obj)
24 |         self._semaphore.release()
25 |
26 | @contextmanager
27 | def poolObject(pool):
28 |     obj = pool.acquire()
29 |     try:
30 |         yield obj
31 |     except Exception:
32 |         raise  # bare re-raise keeps the original traceback
33 |     finally:
34 |         pool.release(obj)
--------------------------------------------------------------------------------
/api/requeststatistics.py:
--------------------------------------------------------------------------------
1 | import threading
2 |
3 | class AtomicCounter:
4 |     def __init__(self):
5 |         self.value = 0
6 |         self._lock = threading.Lock()
7 |
8 |     def increment(self, amount=1):
9 |         with self._lock:
10 |             self.value += amount
11 |
12 |     def decrement(self, amount=1):
13 |         with self._lock:
14 |             self.value -= amount
15 |
16 |     def assign(self, value):
17 |         with self._lock:
18 |             self.value = value
19 |
20 |     def encodeJSON(self):
21 |         return self.value
22 |
23 | class RequestStatistics:
24 |     def __init__(self, file):
25 |         self.file = file
26 |         self.totalTimeServicing = AtomicCounter()
27 |         self.numRequests = AtomicCounter()
28 |         self.numExcluded = AtomicCounter()
29 |         self.numSuccess = AtomicCounter()
30 |         self.numFail = AtomicCounter()
31 |         self.numCacheHit = AtomicCounter()
32 |         self.numCacheMiss = AtomicCounter()
33 |         self.numPending = AtomicCounter()
34 |         self.lastAccessTime = AtomicCounter()
35 |         self.serverMisses = {}
36 |         self.serverHits = {}
37 |
38 |     def recordServerHit(self, server):
39 |         counter = self.serverHits.get(server.identifer(), None)
40 |         if counter is None:
41 |             counter = AtomicCounter()
42 |             self.serverHits[server.identifer()] = counter
43 |         counter.increment()
44 |
45 |     def recordServerMiss(self, server):
46 |         counter = self.serverMisses.get(server.identifer(), None)
47 |         if counter is None:
48 |             counter = AtomicCounter()
49 |             self.serverMisses[server.identifer()] = counter
50 |         counter.increment()
51 |
52 |     def encodeJSON(self):
53 |         return self.__dict__
54 |
55 |
--------------------------------------------------------------------------------
/api/statistics.py:
--------------------------------------------------------------------------------
1 | from . import requeststatistics
2 | import time
3 |
4 |
5 | class BlankObject(object):
6 | def encodeJSON(self):
7 | result = self.__dict__.copy()
8 | del result["symbols"]
9 | return result
10 |
11 | class Statistics:
12 | def __init__(self, config):
13 | self._statistics = BlankObject()
14 | self._statistics.numRequests = requeststatistics.AtomicCounter()
15 | self._statistics.numInvalidRequests = requeststatistics.AtomicCounter()
16 | self._statistics.numSuccess = requeststatistics.AtomicCounter()
17 | self._statistics.numSymbols = requeststatistics.AtomicCounter()
18 | self._statistics.numCacheHit = requeststatistics.AtomicCounter()
19 | self._statistics.numPending = requeststatistics.AtomicCounter()
20 | self._statistics.numExcluded = requeststatistics.AtomicCounter()
21 | self._statistics.symbols = {}
22 | self._pending = {}
23 | self._enabled = config.findConfigValue("general.enableStatistics", required=False, default=True)
24 |
25 | def recordId(self, file, identifier):
26 | return file
27 |
28 | def beginRequest(self, file, identifier):
29 | if not self._enabled:
30 | return
31 |
32 | record = self.recordId(file, identifier)
33 | stats = self._statistics.symbols.get(record, None)
34 | if stats is None:
35 | stats = requeststatistics.RequestStatistics(file)
36 | self._statistics.symbols[record] = stats
37 | self._statistics.numSymbols.increment()
38 |
39 | # Now manipulate the statistics
40 | self._statistics.numRequests.increment()
41 | self._statistics.numPending.increment()
42 | stats.numRequests.increment()
43 | stats.numPending.increment()
44 |
45 | return (stats, time.time())
46 |
47 | def endRequest(self, statrecord, file, identifier, location, cachehit, exclusion, valid, servers_attempted):
48 | if not self._enabled:
49 | return
50 |
51 | stats = statrecord[0]
52 | stats.numPending.decrement()
53 | self._statistics.numPending.decrement()
54 |
55 | if not valid:
56 | self._statistics.numInvalidRequests.increment()
57 | self._statistics.numRequests.decrement()
58 | stats.numRequests.decrement()
59 | return
60 |
61 | beginTime = statrecord[1]
62 | currentTime = time.time()
63 | stats.totalTimeServicing.increment(currentTime - beginTime)
64 | stats.lastAccessTime.assign(currentTime)
65 |
66 | if (location):
67 | stats.numSuccess.increment()
68 | self._statistics.numSuccess.increment()
69 | else:
70 | stats.numFail.increment()
71 |
72 | if (cachehit):
73 | stats.numCacheHit.increment()
74 | self._statistics.numCacheHit.increment()
75 | else:
76 | stats.numCacheMiss.increment()
77 |
78 | if (exclusion):
79 | self._statistics.numExcluded.increment()
80 | stats.numExcluded.increment()
81 |
82 | for server in servers_attempted:
83 | if server[1]:
84 | stats.recordServerHit(server[0])
85 | else:
86 | stats.recordServerMiss(server[0])
87 |
88 | def getStats(self):
89 | if not self._enabled:
90 | return None
91 |
92 | return self._statistics
93 |
94 | def getSymbols(self):
95 | if not self._enabled:
96 | return None
97 |
98 | return self._statistics.symbols
99 |
--------------------------------------------------------------------------------
/api/symbolhandler.py:
--------------------------------------------------------------------------------
1 | import falcon
2 | import os
3 | import re
4 | from . import symbolserver
5 | import logging
6 | from . import statistics
7 |
8 | logger = logging.getLogger(__name__)
9 |
10 | class SymbolHandler:
11 | def __init__(self, config):
12 | self._statistics = statistics.Statistics(config)
13 | self._blacklist = [re.compile(pattern) for pattern in config.blacklist()]
14 |
15 | # build up a list of servers
16 | self._servers = [symbolserver.SymbolServer(config, serverConfig) for serverConfig in config.servers()]
17 | self._previousResults = {}
18 |
19 | def getStats(self):
20 | return self._statistics
21 |
22 | def on_get(self, req, resp, file, identifier, rawfile):
23 | statRecord = self._statistics.beginRequest(file, identifier)
24 | symbolLocation = None
25 | cacheHit = False
26 | excluded = False
27 | valid = True
28 | servers_attempted = []
29 |
30 | try:
31 | logger.info("get: {}/{}/{} client: {}".format(file, identifier, rawfile, req.remote_addr))
32 |
33 | # Reject compressed or redirected variants (raw file name differs from the requested file)
34 | if file != rawfile:
35 | valid = False
36 | raise Exception("Requested file ignored. Compression and file redirection disabled")
37 |
38 | # Match against list of exclusions
39 | if any(regex.match(file) for regex in self._blacklist):
40 | excluded = True
41 | raise Exception("Matched exclusion pattern")
42 |
43 | # Check if we already have a cached record for this request
44 | recordId = file + "/" + identifier
45 | previousRecord = self._previousResults.get(recordId, None)
46 | if previousRecord is not None:
47 | if previousRecord.success:
48 | if os.path.exists(previousRecord.location):
49 | logger.info("Cache hit - success")
50 | symbolLocation = previousRecord.location
51 | cacheHit = True
52 |
53 | if symbolLocation is None:
54 | # If we made it here then we haven't seen a successful request yet
55 | # Attempt to find a server that will service this file request
56 | for server in self._servers:
57 | (symbolLocation, cacheHit, lookup_attempted) = server.findFile(file, identifier)
58 |
59 | if lookup_attempted:
60 | servers_attempted.append((server, symbolLocation is not None))
61 |
62 | if symbolLocation is not None:
63 | break
64 |
65 | # No servers attempted to lookup this request
66 | # so they all must have excluded it individually
67 | if len(servers_attempted) == 0:
68 | excluded = True
69 |
70 | newRecord = symbolserver.SymbolServer.SymbolRequestRecord(file, identifier, symbolLocation)
71 | self._previousResults[recordId] = newRecord
72 |
73 | if symbolLocation is not None:
74 | logger.info("response: {}".format(symbolLocation))
75 | resp.stream = open(symbolLocation, 'rb')
76 | resp.stream_len = os.path.getsize(symbolLocation)
77 | resp.content_type = "application/octet-stream"
78 | else:
79 | raise Exception("Unable to find file across the servers")
80 |
81 | except Exception as e:
82 | logger.error("{}".format(str(e)))
83 | resp.body = "404 could not find requested file.\nError: " + str(e)
84 | resp.status = falcon.HTTP_404
85 |
86 | self._statistics.endRequest(statRecord, file, identifier, symbolLocation, cacheHit, excluded, valid,
87 | servers_attempted)
88 |
--------------------------------------------------------------------------------
/api/symbolserver.py:
--------------------------------------------------------------------------------
1 | from . import dbghelp
2 | import time
3 | import os.path
4 | from . import objectpool
5 | from . import config
6 | import re
7 | import logging
8 |
9 | logger = logging.getLogger(__name__)
10 |
11 | class SymbolServer:
12 | def __init__(self, globalConfig, serverConfig):
13 | self._name = config.findConfigValue(serverConfig, "name")
14 | self._identifier = config.findConfigValue(serverConfig, "identifier", required=True)
15 | self._remoteURL = config.findConfigValue(serverConfig, "remote", required=True)
16 | self._cacheLocation = config.findConfigValue(serverConfig, "cacheLocation", default=None)
17 | self._retryTimeout = config.findConfigValue(serverConfig, "retryTimeout", default=60)
18 | self._maxRequests = config.findConfigValue(serverConfig, "maxRequests", default=10)
19 | self._whitelist = config.findConfigValue(serverConfig, "whitelist", default=[".*"])
20 | self._blacklist = config.findConfigValue(serverConfig, "blacklist", default=[])
21 | self._previousResults = {}
22 | self._dbgHelpPool = objectpool.ObjectPool(self._maxRequests, dbghelp.dbghelp, globalConfig.cacheLocation(), self._cacheLocation, self._remoteURL)
23 |
24 | # Generate regex objects for the filtering lists
25 | self._whitelist = [re.compile(pattern) for pattern in self._whitelist]
26 | self._blacklist = [re.compile(pattern) for pattern in self._blacklist]
27 |
28 | class SymbolRequestRecord:
29 | def __init__(self, file, identifier, location):
30 | self.success = location is not None
31 | self.timestamp = time.time()
32 | self.location = location
33 | self.file = file
34 | self.identifier = identifier
35 |
36 | def filterRequest(self, file):
37 | # File must match the whitelist
38 | if not any(regex.match(file) for regex in self._whitelist):
39 | return False
40 |
41 | # File must not match the blacklist
42 | if any(regex.match(file) for regex in self._blacklist):
43 | return False
44 |
45 | return True
46 |
47 | def findFile(self, file, identifier):
48 | logger.info("{}: find {}/{}".format(self._name, file, identifier))
49 |
50 | # Make sure the request is valid for this server
51 | if not self.filterRequest(file):
52 | logger.info("Find ignored - did not match filters")
53 | return None, True, False
54 |
55 | # Check if the symbol requested has already been requested before
56 | recordId = file + "/" + identifier
57 | previousRecord = self._previousResults.get(recordId, None)
58 | if previousRecord is not None:
59 | if previousRecord.success:
60 | if os.path.exists(previousRecord.location):
61 | logger.info("Cache hit - success")
62 | return previousRecord.location, True, True
63 | elif (time.time() - previousRecord.timestamp < self._retryTimeout):
64 | logger.info("Cache hit - rejection - retry in {}s".format(self._retryTimeout - (time.time() - previousRecord.timestamp)))
65 | return None, True, True
66 |
67 | # If we made it here then we need to retry the request
68 | # either because we haven't tried this file,
69 | # or we've tried before, but the retry timeout has expired
70 | location = None
71 | try:
72 | with objectpool.poolObject(self._dbgHelpPool) as dbgHelp:
73 | location = dbgHelp.findFile(file, identifier)
74 | except Exception as e:
75 | logger.error("{}".format(str(e)))
76 | pass
77 |
78 | newRecord = self.SymbolRequestRecord(file, identifier, location)
79 | self._previousResults[recordId] = newRecord
80 | return newRecord.location, False, True
81 |
82 | def identifer(self):
83 | return self._identifier
84 |
--------------------------------------------------------------------------------
/api/testhandler.py:
--------------------------------------------------------------------------------
1 | import os
2 | import logging
3 |
4 | logger = logging.getLogger(__name__)
5 |
6 | class TestHandler:
7 | def __init__(self):
8 | pass
9 |
10 | def on_get(self, req, resp, file1, file2, file3):
11 | file_location = "./test/" + file1 + "/" + file2 + "/" + file3
12 | logger.info("get: {} client: {}".format(file_location, req.remote_addr))
13 | try:
14 | resp.stream = open(file_location, 'rb')
15 | resp.stream_len = os.path.getsize(file_location)
16 | resp.content_type = "application/octet-stream"
17 |
18 | except Exception as e:
19 | resp.body = "error: " + str(e)
20 | resp.status = "404 Not Found"
21 |
--------------------------------------------------------------------------------
/config/default.pysymproxy.json:
--------------------------------------------------------------------------------
1 | {
2 | "identity": {
3 | "name": "PySymProxy - Default configuration",
4 | "host": "localhost",
5 | "administrator": "!enter your contact details!",
6 | "default_sympath": "srv*c:\\symbols*http://localhost:8080/symbols"
7 | },
8 | "general": {
9 | "enableStatistics": true,
10 | "cacheLocation": "./data/symbols",
11 | "blacklist": []
12 | },
13 | "servers": [
14 | {
15 | "name": "My builds",
16 | "identifier": "mb",
17 | "remote": "\\\\build-server\\symbols",
18 | "maxRequests": 10
19 | },
20 | {
21 | "name": "Microsoft symbol server",
22 | "identifier": "mss",
23 | "remote": "http://msdl.microsoft.com/download/symbols",
24 | "cacheLocation": "\\\\build-server\\mssymbols",
25 | "retryTimeout": 600,
26 | "maxRequests": 10
27 | }
28 | ],
29 | "logging":{
30 | "version": 1,
31 | "disable_existing_loggers": false,
32 | "formatters": {
33 | "console": {
34 | "level": "DEBUG",
35 | "format": "%(asctime)s [%(thread)s][%(levelname)s][%(module)s] %(message)s",
36 | "datefmt": "%m/%d/%Y %I:%M:%S %p"
37 | }
38 | },
39 | "handlers":{
40 | "console":{
41 | "class" : "logging.StreamHandler",
42 | "formatter" : "console",
43 | "level" : "DEBUG"
44 | },
45 | "file":{
46 | "class" : "logging.handlers.RotatingFileHandler",
47 | "filename" : "./data/log.txt",
48 | "maxBytes" : 524288,
49 | "backupCount" : 10,
50 | "formatter" : "console",
51 | "level" : "DEBUG"
52 | }
53 | },
54 | "loggers": {
55 | "": {
56 | "handlers": ["console", "file"],
57 | "level" : "DEBUG"
58 | }
59 | }
60 | }
61 | }
62 |
--------------------------------------------------------------------------------
/data/empty.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/inbilla/pySymProxy/a696ac15a0a468f6ff3ad6fe4591b435656752c5/data/empty.txt
--------------------------------------------------------------------------------
/dbghelp/symsrv.yes:
--------------------------------------------------------------------------------
1 | 1
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | falcon
2 | requests
3 | waitress
4 | jinja2
--------------------------------------------------------------------------------
/run_server.bat:
--------------------------------------------------------------------------------
1 | @echo off
2 | set CURPATH=%~dp0
3 | echo %CURPATH%
4 | pushd %CURPATH%
5 |
6 | pip install -r requirements.txt
7 |
8 | python server.py
9 |
10 | popd
--------------------------------------------------------------------------------
/server.py:
--------------------------------------------------------------------------------
1 | from __future__ import print_function
2 | # Import print() function for Python 2.7 compatibility
3 |
4 | from api.main import api
5 | from waitress import serve
6 | import logging
7 |
8 | logger = logging.getLogger(__name__)
9 | logger.info("Starting SymProxy Server")
10 | print("Test link: http://localhost:8080/symbols/wntdll.pdb/F999943DF7FB4B8EB6D99F2B047BC3101/wntdll.pdb")
11 | serve(api, host='0.0.0.0', port=8080)
12 |
13 | #TODO:
14 | # - Add lazy evaluation of storage space
15 | # - Avoid timeouts somehow on requests that run for a long time
16 | # - Detect when entire servers are down and stop attempting to fetch from them for a while
17 | # - Add cache budget management
--------------------------------------------------------------------------------
/static/main.html.jinja:
--------------------------------------------------------------------------------
1 |
2 |
3 |