├── README.md
├── _start-pickle-scan.cmd
├── pickle_inspector.py
└── pickle_scan.py


/README.md:
--------------------------------------------------------------------------------
 1 | # Stable Diffusion Pickle Scanner
 2 | 
 3 | Scan `.pt`, `.ckpt` and `.bin` files for potentially malicious code.
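
The idea behind the scan can be illustrated with a small, self-contained sketch: a `pickle.Unpickler` subclass whose `find_class` returns a recording stub instead of the real callable, so nothing referenced by the file is actually executed. The class name `RecordingUnpickler` and the hand-written payload are assumptions made for this example; the real logic lives in `pickle_inspector.py` and records more detail.

```python
import io
import pickle


class RecordingUnpickler(pickle.Unpickler):
    """Hypothetical, simplified stand-in for the inspector in pickle_inspector.py."""

    def __init__(self, file):
        super().__init__(file)
        self.calls = []

    def find_class(self, module, name):
        # Never import the real object; hand back a stub that only logs the call.
        full_name = f"{module}.{name}"

        def stub(*args, **kwargs):
            self.calls.append(f"{full_name}({args}, {kwargs})")

        return stub


# Hand-written protocol-0 pickle that would run os.system('echo hi')
# if it were loaded with the regular unpickler.
payload = b"cos\nsystem\n(S'echo hi'\ntR."

unpickler = RecordingUnpickler(io.BytesIO(payload))
unpickler.load()
print(unpickler.calls)  # ["os.system(('echo hi',), {})"]
```

The command is never run; only the attempted call is written to `calls`, which is the kind of record the scanner later checks.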
 4 | 
 5 | ## How to use
 6 | 
 7 | 1. Export `pickle_inspector.py` and `pickle_scan.py` to your Stable Diffusion base directory
 8 | 2. Open bash / CMD
 9 | 3. Run command `python pickle_scan.py models > scan_output.txt`
10 | 4. Open `scan_output.txt`
11 | 
12 | If you get an error about torch not being installed, start your webui, copy the venv Python path shown in the console, and use that path in place of `python`.
13 | 
14 | > It might look something like this:
15 | >
16 | > `venv "F:\Projects\stable-diffusion-webui\venv\Scripts\Python.exe"`
17 | >
18 | > Final command would look like:
19 | >
20 | > `"F:\Projects\stable-diffusion-webui\venv\Scripts\Python.exe" pickle_scan.py models > scan_output.txt`
21 | 
22 | ## Usage
23 | 
24 | ```shell
25 | python pickle_scan.py <directory> [debugmode]
26 | ```
27 | 
28 | Example
29 | 
30 | ```shell
31 | python pickle_scan.py models
32 | ```
33 | 
34 | ## Debug Mode
35 | 
36 | Add `1` after directory to see which calls / signals triggered the scan failure.
37 | 
38 | ```shell
39 | python pickle_scan.py models 1 > scan_output.txt
40 | ```
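
To make the debug output easier to read, here is a rough sketch of how a recorded call string is matched, mirroring the checks in `pickle_scan.py`: library calls are matched by prefix, signals by substring, and anything outside a short list of expected prefixes counts as non-standard. The function name `classify_call`, the constant `STANDARD_PREFIXES` and the returned labels are assumptions made for this example.

```python
BAD_CALLS = {'os', 'shutil', 'sys', 'requests', 'net'}
BAD_SIGNAL = {'rm ', 'cat ', 'nc ', '/bin/sh '}
STANDARD_PREFIXES = ('numpy.', '_codecs.', 'collections.', 'torch.')


def classify_call(call):
    """Return the list of reasons a recorded call string would be flagged."""
    reasons = []
    # Library calls: the call string must start with "<lib>."
    for lib in sorted(BAD_CALLS):
        if call.startswith(lib + '.'):
            reasons.append(f'lib call ({lib})')
            break
    # Signals: the substring may appear anywhere, e.g. inside an argument.
    for signal in sorted(BAD_SIGNAL):
        if signal in call:
            reasons.append(f'malicious signal ({signal})')
            break
    # Everything that is not from an expected library is reported as well.
    if not call.startswith(STANDARD_PREFIXES):
        reasons.append('non-standard lib call')
    return reasons


print(classify_call("os.system(('rm -rf /tmp/x',), {})"))
print(classify_call("torch._utils._rebuild_tensor_v2((), {})"))
```

A single call can be flagged for more than one reason, which is why the totals in the report can exceed the number of distinct calls.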
41 | 
42 | ## How to set up and use with AUTOMATIC1111 web UI (Windows)
43 | 
44 | 1. Download the three files `pickle_inspector.py`, `pickle_scan.py` and `_start-pickle-scan.cmd` to any directory
45 | 2. Open `_start-pickle-scan.cmd` with notepad (or any text editor)
46 | 3. Paste your venv path between the quotation marks in the line starting with `SET VENV_PATH=`. When you start the UI, this path is displayed in the first line of the console window. Example: *venv "**E:\stable-diffusion-webui\venv\Scripts\Python.exe**"*
47 | 4. Paste the path to your model folder between the quotation marks in the line starting with `SET SD_FOLDER=`. Example: *E:\stable-diffusion-webui\models*
48 | 5. (optional) If you would like to scan an additional folder, paste its path between the quotation marks in the line starting with `SET DOWNLOAD_FOLDER=`. This is useful if you want to scan a checkpoint before moving it into the proper model folder; otherwise leave it as is
49 | 6. Save the script file
50 | 7. Double-click `_start-pickle-scan.cmd` and wait for the scan to complete
51 | The last few lines show how many suspicious files were found:
52 | ```shell
53 | "Number of failed scans (potentially malicious files):"
54 | 
55 | ---------- SCAN_OUTPUT.TXT: 0
56 | ```
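
On systems without `find /c`, the same count can be obtained with a few lines of Python. This is only a sketch; the helper name `count_failed_scans` is an assumption, and it counts lines containing `SCAN FAILED`, as `find /c` does.

```python
from pathlib import Path


def count_failed_scans(report_path):
    """Count the lines of a scan report that contain 'SCAN FAILED'."""
    text = Path(report_path).read_text(encoding='utf-8', errors='replace')
    return sum('SCAN FAILED' in line for line in text.splitlines())


if __name__ == '__main__':
    report = Path('scan_output.txt')
    if report.exists():
        print(count_failed_scans(report))
```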
57 | 
58 | Example output (with `numpy` considered "non-standard"):
59 | 
60 | ![Code_-_Insiders_Db9qYRswOQ](https://user-images.githubusercontent.com/114846827/200138825-777e4e43-67c0-44cb-b5a7-80ee141ceb7c.png)
61 | 
62 | ## Notes
63 | 
64 | The scanner searches the given directory and all of its subdirectories for files ending with `.pt`, `.ckpt` and `.bin`.
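
The file discovery can be sketched roughly as follows, assuming the same extension set as `pickle_scan.py`. The helper name `find_model_files` is hypothetical and not part of the scripts.

```python
from pathlib import Path

EXTENSIONS = {'.pt', '.bin', '.ckpt'}


def find_model_files(base_dir):
    """Return every file under base_dir whose suffix is in EXTENSIONS."""
    return sorted(
        p for p in Path(base_dir).rglob('*')
        if p.is_file() and p.suffix in EXTENSIONS
    )


if __name__ == '__main__':
    for path in find_model_files('models'):
        print(path.as_posix())
```

The suffix comparison is case-sensitive, so a file named `model.CKPT` would be skipped by this sketch.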
65 | 
66 | ## License
67 | 
68 | https://creativecommons.org/licenses/by-nc-sa/4.0/
69 | 


--------------------------------------------------------------------------------
/_start-pickle-scan.cmd:
--------------------------------------------------------------------------------
 1 | @ECHO off
 2 | :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
 3 | ::: starter script for scanning SD models for malicious pickling
 4 | :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
 5 | ::: This script assumes you are running *AUTOMATIC1111's web UI* on Windows
 6 | ::: you will have to paste the path to your model folder(s) below
 7 | ::: where it says SET SD_FOLDER="..."
 8 | ::: Your VENV_PATH should be in the first line of the console when you start up the web UI
 9 | ::: The DOWNLOAD_FOLDER is an optional second folder that you might like to scan,
10 | ::: otherwise leave it as is
11 | :::
12 | :SETUP
13 | SET VENV_PATH="F:\Whatever\Path\stable-diffusion-webui\venv\Scripts\Python.exe"
14 | SET SD_FOLDER="F:\Whatever\Path\stable-diffusion-webui\models\Stable-diffusion"
15 | SET DOWNLOAD_FOLDER="F:\Whatever\Other\Folder"
16 | :::
17 | ::: how result details should be displayed ("yes" or "no"):
18 | SET SHOW_RESULT_IN_CONSOLE="yes"
19 | SET OPEN_RESULT_IN_NOTEPAD="yes"
20 | :::
21 | ::: End of setup, you can now save this script and run it by double clicking
22 | :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
23 | 
24 | REM create / overwrite the output file .\scan_output.txt and initialize with timestamp
25 | ECHO scan started on %date% %time% > scan_output.txt
26 | 
27 | :SCANNING
28 | REM check if VENV_PATH was set
29 | if %VENV_PATH% equ "F:\Whatever\Path\stable-diffusion-webui\venv\Scripts\Python.exe" (
30 | ECHO ##### ERROR please set your VENV_PATH #####
31 | goto EXIT
32 | )
33 | 
34 | ECHO "Scanning...Please wait a moment..."
35 | 
36 | REM check if SD_FOLDER was set
37 | if %SD_FOLDER% equ "F:\Whatever\Path\stable-diffusion-webui\models\Stable-diffusion" (
38 | ECHO ##### ERROR please set your SD_FOLDER #####
39 | goto EXIT
40 | )
41 | 
42 | ECHO "step 1: SD models folder"
43 | ECHO ####################################################################### >> scan_output.txt
44 | ECHO ##### scanning SD model folder "~~~webui\models\Stable-diffusion" ##### >> scan_output.txt
45 | ECHO ####################################################################### >> scan_output.txt
46 | %VENV_PATH% pickle_scan.py %SD_FOLDER% >> scan_output.txt
47 | 
48 | REM check if download folder was set, if not just skip instead of errorlevel
49 | if %DOWNLOAD_FOLDER% equ "F:\Whatever\Other\Folder" (
50 | ECHO "No download folder specified"
51 | goto DISPLAY_RESULT
52 | )
53 | ECHO "step 2: download folder"
54 | ECHO ##################################### >> scan_output.txt
55 | ECHO ##### scanning download folder  ##### >> scan_output.txt
56 | ECHO ##################################### >> scan_output.txt
57 | %VENV_PATH% pickle_scan.py %DOWNLOAD_FOLDER% >> scan_output.txt
58 | 
59 | :DISPLAY_RESULT
60 | if %SHOW_RESULT_IN_CONSOLE% equ "yes" (type scan_output.txt)
61 | if %OPEN_RESULT_IN_NOTEPAD% equ "yes" (start notepad scan_output.txt)
62 | 
63 | ECHO "Number of failed scans (potentially malicious files):"
64 | find /c "SCAN FAILED" scan_output.txt
65 | 
66 | :EXIT
67 | pause
68 | 
69 | ::: based on the original code by *TheEliteGeek* in issue #2:
70 | ::: @ECHO off
71 | ::: ECHO "Scanning...Please wait a moment..."
72 | ::: "F:\Whatever\Path\stable-diffusion-webui\venv\Scripts\Python.exe"  pickle_scan.py models > scan_output.txt
73 | ::: type scan_output.txt
74 | ::: pause


--------------------------------------------------------------------------------
/pickle_inspector.py:
--------------------------------------------------------------------------------
  1 | # Copyright (C) 2023  Lopho <contact@lopho.org>
  2 | #
  3 | # This program is free software: you can redistribute it and/or modify
  4 | # it under the terms of the GNU Affero General Public License as published
  5 | # by the Free Software Foundation, either version 3 of the License, or
  6 | # (at your option) any later version.
  7 | #
  8 | # This program is distributed in the hope that it will be useful,
  9 | # but WITHOUT ANY WARRANTY; without even the implied warranty of
 10 | # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 11 | # GNU Affero General Public License for more details.
 12 | #
 13 | # You should have received a copy of the GNU Affero General Public License
 14 | # along with this program.  If not, see <https://www.gnu.org/licenses/>.
 15 | 
 16 | import pickle as python_pickle
 17 | from types import ModuleType
 18 | from functools import partial
 19 | 
 20 | 
 21 | def _check_list(what, where):
 22 |     for s in where:
 23 |         if s == what or (s.endswith('*') and what.startswith(s[:-1])):
 24 |             return True
 25 |     return False
 26 | 
 27 | 
 28 | class InspectorResult:
 29 |     def __init__(self):
 30 |         self.classes = []
 31 |         self.calls = []
 32 |         self.structure = {}
 33 | 
 34 | 
 35 | class UnpickleConfig:
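 36 |     def __init__(self, blacklist = None, whitelist = None, tracklist = None):
 37 |         self.blacklist = blacklist if blacklist is not None else []  # avoid shared mutable defaults
 38 |         self.whitelist = whitelist if whitelist is not None else []
 39 |         self.tracklist = tracklist if tracklist is not None else []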
 40 |         self.record = True
 41 |         self.verbose = False
 42 |         self.strict = False
 43 | 
 44 | 
 45 | class StubBase:
 46 |     def __init__(self, module, name, result, config, *args, **kwargs):
 47 |         self.module = module
 48 |         self.name = name
 49 |         self.full_name = f'{module}.{name}'
 50 |         self.args = {'__init__': [args]}
 51 |         self.kwargs = {'__init__': [kwargs]}
 52 |         self.config = config
 53 |         self.result = result
 54 |         if config.record or _check_list(self.full_name, config.tracklist):
 55 |             result.calls.append(f'{self.full_name}({args}, {kwargs})')
 56 | 
 57 |     def __repr__(self):
 58 |         return f'{self.full_name}({self.args["__init__"]}, {self.kwargs["__init__"]})'
 59 |         
 60 |     def __getattr__(self, attr):
 61 |         return partial(self._call_tracer, attr)
 62 | 
 63 |     def __setitem__(self,*args, **kwargs):
 64 |         self._call_tracer('__setitem__', *args, **kwargs)
 65 | 
 66 |     def _call_tracer(self, attr, *args, **kwargs):
 67 |         if attr not in self.args:
 68 |             self.args[attr] = []
 69 |             self.kwargs[attr] = []
 70 |         self.args[attr].append(args)
 71 |         self.kwargs[attr].append(kwargs)
 72 |         self.result.calls.append(f'{self.full_name}.{attr}({args}, {kwargs})')
 73 | 
 74 | 
 75 | class UnpickleBase(python_pickle.Unpickler):
 76 |     config = UnpickleConfig()
 77 |     def _print(self, *_):
 78 |         if self.config.verbose:
 79 |             print(*_)
 80 | 
 81 | 
 82 | class UnpickleInspector(UnpickleBase):
 83 |     def find_class(self, result, module, name):
 84 |         full_name = f'{module}.{name}'
 85 |         self._print(f'STUBBED {full_name}')
 86 |         in_tracklist = _check_list(full_name, self.config.tracklist)
 87 |         if self.config.record or in_tracklist:
 88 |             result.classes.append(full_name)
 89 |         config = self.config
 90 |         class Stub(StubBase):
 91 |             def __init__(self, *args, **kwargs):
 92 |                 super().__init__(module, name, result, config, *args, **kwargs)
 93 |         return Stub
 94 | 
 95 |     def load(self):
 96 |         result = InspectorResult()
 97 |         self.persistent_load = lambda *_: None # torch
 98 |         self.find_class = partial(UnpickleInspector.find_class, self, result)
 99 |         result.structure = super().load()
100 |         return result
101 | 
102 | 
103 | class BlockedException(Exception):
104 |     def __init__(self, msg):
105 |         self.msg = msg
106 | 
107 | 
108 | class UnpickleControlled(UnpickleBase):
109 |     def find_class(self, result, module, name):
110 |         full_name = f'{module}.{name}'
111 |         in_blacklist = _check_list(full_name, self.config.blacklist)
112 |         in_whitelist = _check_list(full_name, self.config.whitelist)
113 |         if (in_blacklist and not in_whitelist) or (len(self.config.blacklist) < 1 and len(self.config.whitelist) > 0 and not in_whitelist):
114 |             if self.config.strict:
115 |                 raise BlockedException(f'strict mode: {full_name} blocked')
116 |             else:
117 |                 return UnpickleInspector.find_class(self, result, module, name)
118 |         self._print(full_name)
119 |         in_tracklist = _check_list(full_name, self.config.tracklist)
120 |         if self.config.record or in_tracklist:
121 |             result.classes.append(full_name)
122 |         return super().find_class(module, name)
123 |         
124 |     def load(self):
125 |         result = InspectorResult()
126 |         self.find_class = partial(UnpickleControlled.find_class, self, result)
127 |         result.structure = super().load()
128 |         return result
129 | 
130 | 
131 | def build(unpickler, conf = None):
132 |     if conf is not None:
133 |         class ConfiguredUnpickler(unpickler):
134 |             config = conf
135 |         unpickler = ConfiguredUnpickler
136 |             
137 |     class PickleModule(ModuleType):
138 |         Unpickler = unpickler
139 |     
140 |     return PickleModule('pickle')
141 | 
142 | 
143 | pickle = build(UnpickleInspector)
144 | 


--------------------------------------------------------------------------------
/pickle_scan.py:
--------------------------------------------------------------------------------
 1 | # copyright zxix 2022
 2 | # https://creativecommons.org/licenses/by-nc-sa/4.0/
 3 | import torch
 4 | import pickle_inspector
 5 | import sys
 6 | from pathlib import Path
 7 | 
 8 | sys.stdout.reconfigure(encoding='utf-8')
 9 | 
10 | debug = len(sys.argv) == 3
11 | 
12 | scan_dir = sys.argv[1]  # named scan_dir to avoid shadowing the built-in dir()
13 | print("checking dir: " + scan_dir)
14 | 
15 | BASE_DIR = Path(scan_dir)
16 | EXTENSIONS = {'.pt', '.bin', '.ckpt'}
17 | BAD_CALLS = {'os', 'shutil', 'sys', 'requests', 'net'}
18 | BAD_SIGNAL = {'rm ', 'cat ', 'nc ', '/bin/sh '}
19 | 
20 | for path in BASE_DIR.glob(r'**/*'):
21 |   if path.suffix in EXTENSIONS:
22 |     print("")
23 |     print("..." + path.as_posix())
24 |     result = torch.load(path.as_posix(), pickle_module=pickle_inspector.pickle)
25 |     result_total = 0
26 |     result_other = 0
27 |     result_calls = {}
28 |     result_signals = {}
29 |     result_output = ""
30 | 
31 |     for call in BAD_CALLS:
32 |       result_calls[call] = 0
33 | 
34 |     for signal in BAD_SIGNAL:
35 |       result_signals[signal] = 0
36 | 
37 |     for c in result.calls:
38 |       for call in BAD_CALLS:
39 |         if (c.find(call + ".") == 0):
40 |           result_calls[call] += 1
41 |           result_total += 1
42 |           result_output += "\n--- found lib call (" + call + ") ---\n"
43 |           result_output += c
44 |           result_output += "\n---------------\n"
45 |           break
46 |       for signal in BAD_SIGNAL:
47 |         if (c.find(signal) > -1):
48 |           result_signals[signal] += 1
49 |           result_total += 1
50 |           result_output += "\n--- found malicious signal (" + signal + ") ---\n"
51 |           result_output += c
52 |           result_output += "\n---------------\n"
53 |           break
54 | 
55 |       if (
56 |         c.find("numpy.") != 0 and 
57 |         c.find("_codecs.") != 0 and 
58 |         c.find("collections.") != 0 and 
59 |         c.find("torch.") != 0):
60 |         result_total += 1
61 |         result_other += 1
62 |         result_output += "\n--- found non-standard lib call ---\n"
63 |         result_output += c
64 |         result_output += "\n---------------\n"
65 | 
66 |     if (result_total > 0):
67 |       for call in BAD_CALLS:
68 |         print("library call (" + call + ".): " + str(result_calls[call]))
69 |       for signal in BAD_SIGNAL:
70 |         print("malicious signal (" + signal + "): " + str(result_signals[signal]))
71 |       print("non-standard calls: " + str(result_other))
72 |       print("total: " + str(result_total))
73 |       print("")
74 |       print("SCAN FAILED")
75 | 
76 |       if (debug):
77 |         print(result_output)
78 |     else:
79 |       print("SCAN PASSED!")
80 | 


--------------------------------------------------------------------------------