A dependency vulnerability scanner for your python projects, straight from the terminal.
14 |
15 | + can be used within large projects. (see [benchmarks](BENCHMARKS.md))
16 | + automatically finds dependencies either from configuration files or within source code.
17 | + support for poetry, hatch, flit, pdm and can be integrated into existing build processes.
18 | + hasn't been battle-hardened yet. PRs and issue makers welcome.
19 |
20 | ## 🕊️ Install
21 |
22 | ```bash
23 | pip install pyscan-rs
24 | ```
25 | **look out for the "-rs"** part
26 | or
27 |
28 | ```bash
29 | cargo install pyscan
30 | ```
31 |
32 |
33 |
34 | ## 🐇 Usage
35 |
36 | Go to your python source directory (or wherever you keep your `requirements.txt`/`pyproject.toml`) and run:
37 |
38 | ```bash
39 | > pyscan
40 | ```
41 | or
42 | ```bash
43 | > pyscan -d path/to/src
44 | ```
45 |
46 |
58 |
59 |
60 | Pyscan will find any dependencies added through poetry, hatch, flit, pdm, etc.
61 | Here's the order of precedence for a source/config file:
62 |
63 | + `requirements.txt`
64 | + `pyproject.toml`
65 | + your source code (`.py`)
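
The precedence above can be sketched in a few lines of Rust. This is only an illustration, not pyscan's actual discovery code: the `Source` enum and `pick_source` function are hypothetical names made up for this example, and it only looks at file names, not file contents.

```rust
use std::path::PathBuf;

/// Hypothetical description of where dependencies would be read from.
#[derive(Debug, PartialEq)]
enum Source {
    Requirements(PathBuf),
    Pyproject(PathBuf),
    PythonFiles(Vec<PathBuf>),
}

/// Pick one source from a list of discovered files, in the documented order:
/// requirements.txt first, then pyproject.toml, then plain .py files.
fn pick_source(files: &[PathBuf]) -> Option<Source> {
    // Find the first path whose file name matches exactly.
    let named = |name: &str| {
        files
            .iter()
            .find(|p| p.file_name().map_or(false, |f| f == name))
            .cloned()
    };
    if let Some(p) = named("requirements.txt") {
        return Some(Source::Requirements(p));
    }
    if let Some(p) = named("pyproject.toml") {
        return Some(Source::Pyproject(p));
    }
    // Nothing declarative found: fall back to every .py file.
    let py: Vec<PathBuf> = files
        .iter()
        .filter(|p| p.extension().map_or(false, |e| e == "py"))
        .cloned()
        .collect();
    if py.is_empty() {
        None
    } else {
        Some(Source::PythonFiles(py))
    }
}

fn main() {
    let files = vec![PathBuf::from("app/main.py"), PathBuf::from("pyproject.toml")];
    println!("{:?}", pick_source(&files));
}
```

The early returns keep the order of precedence readable: a later source is only considered when every earlier one is missing.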
66 |
67 | When a version is not specified, pyscan asks your local `pip` for the installed version and otherwise falls back to the latest version on [pypi.org](https://pypi.org). Still, **it is recommended to pin versions in your requirements** and use proper [PEP 508 syntax](https://peps.python.org/pep-0508/).
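
Here is a rough sketch of that fallback chain, assuming the order "version from the source file, then pip, then pypi.org". The names `Origin` and `resolve_version` are hypothetical, and the two lookups are passed in as closures so the example runs without spawning `pip` or touching the network; pyscan's real lookups live elsewhere in the codebase.

```rust
/// Which step supplied the version (hypothetical, for illustration only).
#[derive(Debug, PartialEq)]
enum Origin {
    Source,
    Pip,
    Pypi,
}

/// Try each version source in order and stop at the first one that answers.
fn resolve_version(
    pinned: Option<&str>,
    pip_lookup: impl Fn() -> Option<String>,
    pypi_lookup: impl Fn() -> Option<String>,
) -> Option<(String, Origin)> {
    pinned
        .map(|v| (v.to_string(), Origin::Source))
        // or_else is lazy: pip is only consulted when nothing was pinned.
        .or_else(|| pip_lookup().map(|v| (v, Origin::Pip)))
        // pypi.org is the last resort.
        .or_else(|| pypi_lookup().map(|v| (v, Origin::Pypi)))
}

fn main() {
    // Stand-ins for a pip query and a pypi.org request.
    let from_pip = || None::<String>;
    let from_pypi = || Some("2.31.0".to_string());
    println!("{:?}", resolve_version(None, from_pip, from_pypi));
}
```

A pinned version short-circuits everything else, which is why pinning your requirements gives the most predictable scan results.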
68 |
69 | ## Building
70 |
71 | pyscan requires Rust `>= 1.70` (the code uses `std::sync::OnceLock`, which was stabilized in 1.70) and might be unstable on earlier releases.
72 | There's an overview of the codebase at [architecture](./architecture/). Grateful for all the contributions so far.
73 |
74 | ## 🦀 Note
75 |
76 | pyscan doesn't make sure your code is safe from everything. Use all resources available to you, like [safety](https://pypi.org/project/safety/), Dependabot, [`pip-audit`](https://pypi.org/project/pip-audit/), trivy and the like.
77 |
78 | ## 🐰 Todo
79 |
80 | As of December 24, 2024:
81 |
82 | - [ ] Gather time to work on it (incredible task as a ~~high schooler~~ college freshman)
83 | - [ ] Persistent state representation of a project's security.
84 | - [ ] Graphical analysis of dependencies and their dependencies, and so on.
85 | - [ ] Better display, search, filter of vulns
86 | - [ ] Finish the "big" update (All of the above is a part of PR #17)
87 |
88 | ## 🐹 Donate
89 |
90 | While not coding, I am a broke ~~high school~~ college student with nothing else to do. I appreciate all the help I can get.
91 |
92 |
93 | [Support me on Ko-fi](https://ko-fi.com/Z8Z74DCR4)
94 |
--------------------------------------------------------------------------------
/SECURITY.md:
--------------------------------------------------------------------------------
1 | # Security Policy
2 |
3 | ## Supported Versions
4 |
5 | | Version | Supported |
6 | | ------- | ------------------ |
7 | | 0.1.x | :white_check_mark: |
8 |
9 |
10 | ## Reporting a Vulnerability
11 |
12 | Open an issue!
13 |
--------------------------------------------------------------------------------
/architecture/README.md:
--------------------------------------------------------------------------------
1 | # 🐍 Architecture / Codebase Overview
2 |
3 |
4 |
5 |
6 |
7 |
A very vague representation of how an ideal pyscan run works with no arguments given.
8 |
9 |
10 |
11 | Pyscan is coded in a pseudo-procedural manner: the top level works just like any procedural program (the functions are "chained" in a way), but the internals use structs and models/classes enough to call it OOP. It's a mix of both worlds.
12 |
13 | ## Important files to look at
14 |
15 | There are comments on almost anything comment-able and worthy. Feel free to look around.
16 |
17 | - [`parser.rs`](../src/parser/mod.rs) - top level look at the parser. Check out [`extractor.rs`](../src/parser/extractor.rs) to really see the extraction and file discovery being done.
18 |
19 | - [`scanner::api.rs`](../src/scanner/api.rs) - how the API stuff gets done using the struct `Osv`, look at `mod.rs` for a higher level view.
20 |
21 | - [`docker.rs`](../src/docker/mod.rs) - handles getting and doing stuff with Docker. [this one is buggy and might get deprecated because i dont really care about docker, just run the program inside the container or something]
22 |
23 | - [`display.rs`](../src/display/mod.rs) - some functions used to print to the screen, not all though.
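
To give a feel for what the API side deals with, here is a minimal sketch of the JSON body that OSV's public query endpoint (`POST https://api.osv.dev/v1/query`) expects for a PyPI package. It is not how the `Osv` struct actually builds its requests; `osv_query_body` and `json_escape` are hypothetical helpers, and the JSON is assembled by hand only so the example needs nothing outside the standard library.

```rust
/// Escape the two characters that would break a JSON string literal.
/// (Control characters are not handled; this is an illustration only.)
fn json_escape(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"")
}

/// Build the request body for one package/version pair in the PyPI ecosystem.
fn osv_query_body(name: &str, version: &str) -> String {
    format!(
        r#"{{"version":"{}","package":{{"name":"{}","ecosystem":"PyPI"}}}}"#,
        json_escape(version),
        json_escape(name)
    )
}

fn main() {
    println!("{}", osv_query_body("jinja2", "2.4.1"));
}
```

The response to such a query is a list of vulnerability records shaped like the schema in `osv_schema.json`. In real code a JSON library would be the safer choice over string formatting.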
24 |
25 | ## Notes for contributors
26 |
27 | - This thing will be updated every once in a while to detail how pyscan works in a much more articulate and better way, including subcommands and other arguments and quirks.
28 |
29 | - If you think the codebase is designed badly, I don't know, it might be. I have never made a CLI tool before so, there's that. Open an issue or make a PR and I'm more than willing to learn from you.
30 |
31 | - Please be descriptive and detailed in your PRs, comments and other decent things. It's very cool what the open source community has done for pyscan so far.
32 |
--------------------------------------------------------------------------------
/assets/2pyscan-repository.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ohaswin/pyscan/959b39c8d025e4802eee7a30fef7a408186b7f9f/assets/2pyscan-repository.png
--------------------------------------------------------------------------------
/assets/flowchart.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ohaswin/pyscan/959b39c8d025e4802eee7a30fef7a408186b7f9f/assets/flowchart.png
--------------------------------------------------------------------------------
/assets/pyscan.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ohaswin/pyscan/959b39c8d025e4802eee7a30fef7a408186b7f9f/assets/pyscan.png
--------------------------------------------------------------------------------
/assets/snake.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ohaswin/pyscan/959b39c8d025e4802eee7a30fef7a408186b7f9f/assets/snake.png
--------------------------------------------------------------------------------
/osv_schema.json:
--------------------------------------------------------------------------------
1 | {
2 | "title": "Open Source Vulnerability",
3 | "description": "A schema for describing a vulnerability in an open source package.",
4 | "type": "object",
5 | "properties": {
6 | "schema_version": {
7 | "type": "string"
8 | },
9 | "id": {
10 | "type": "string"
11 | },
12 | "modified": {
13 | "type": "string",
14 | "format": "date-time"
15 | },
16 | "published": {
17 | "type": "string",
18 | "format": "date-time"
19 | },
20 | "withdrawn": {
21 | "type": "string",
22 | "format": "date-time"
23 | },
24 | "aliases": {
25 | "type": ["array", "null"],
26 | "items": {
27 | "type": "string"
28 | }
29 | },
30 | "related": {
31 | "type": "array",
32 | "items": {
33 | "type": "string"
34 | }
35 | },
36 | "summary": {
37 | "type": "string"
38 | },
39 | "details": {
40 | "type": "string"
41 | },
42 | "severity": {
43 | "type": ["array", "null"],
44 | "items": {
45 | "type": "object",
46 | "properties": {
47 | "type": {
48 | "type": "string",
49 | "enum": [
50 | "CVSS_V2",
51 | "CVSS_V3"
52 | ]
53 | },
54 | "score": {
55 | "type": "string"
56 | }
57 | },
58 | "required": [
59 | "type",
60 | "score"
61 | ]
62 | }
63 | },
64 | "affected": {
65 | "type": ["array", "null"],
66 | "items": {
67 | "type": "object",
68 | "properties": {
69 | "package": {
70 | "type": "object",
71 | "properties": {
72 | "ecosystem": {
73 | "type": "string"
74 | },
75 | "name": {
76 | "type": "string"
77 | },
78 | "purl": {
79 | "type": "string"
80 | }
81 | },
82 | "required": [
83 | "ecosystem",
84 | "name"
85 | ]
86 | },
87 | "severity": {
88 | "type": ["array", "null"],
89 | "items": {
90 | "type": "object",
91 | "properties": {
92 | "type": {
93 | "type": "string",
94 | "enum": [
95 | "CVSS_V2",
96 | "CVSS_V3"
97 | ]
98 | },
99 | "score": {
100 | "type": "string"
101 | }
102 | },
103 | "required": [
104 | "type",
105 | "score"
106 | ]
107 | }
108 | },
109 | "ranges": {
110 | "type": "array",
111 | "items": {
112 | "type": "object",
113 | "properties": {
114 | "type": {
115 | "type": "string",
116 | "enum": [
117 | "GIT",
118 | "SEMVER",
119 | "ECOSYSTEM"
120 | ]
121 | },
122 | "repo": {
123 | "type": "string"
124 | },
125 | "events": {
126 | "type": "array",
127 | "contains": {
128 | "required": [
129 | "introduced"
130 | ]
131 | },
132 | "items": {
133 | "type": "object",
134 | "oneOf": [
135 | {
136 | "type": "object",
137 | "properties": {
138 | "introduced": {
139 | "type": "string"
140 | }
141 | },
142 | "required": [
143 | "introduced"
144 | ]
145 | },
146 | {
147 | "type": "object",
148 | "properties": {
149 | "fixed": {
150 | "type": "string"
151 | }
152 | },
153 | "required": [
154 | "fixed"
155 | ]
156 | },
157 | {
158 | "type": "object",
159 | "properties": {
160 | "last_affected": {
161 | "type": "string"
162 | }
163 | },
164 | "required": [
165 | "last_affected"
166 | ]
167 | },
168 | {
169 | "type": "object",
170 | "properties": {
171 | "limit": {
172 | "type": "string"
173 | }
174 | },
175 | "required": [
176 | "limit"
177 | ]
178 | }
179 | ]
180 | },
181 | "minItems": 1
182 | },
183 | "database_specific": {
184 | "type": "object"
185 | }
186 | },
187 | "allOf": [
188 | {
189 | "if": {
190 | "properties": {
191 | "type": {
192 | "const": "GIT"
193 | }
194 | }
195 | },
196 | "then": {
197 | "required": [
198 | "repo"
199 | ]
200 | }
201 | },
202 | {
203 | "if": {
204 | "properties": {
205 | "events": {
206 | "contains": {
207 | "required": ["last_affected"]
208 | }
209 | }
210 | }
211 | },
212 | "then": {
213 | "not": {
214 | "properties": {
215 | "events": {
216 | "contains": {
217 | "required": ["fixed"]
218 | }
219 | }
220 | }
221 | }
222 | }
223 | }
224 | ],
225 | "required": [
226 | "type",
227 | "events"
228 | ]
229 | }
230 | },
231 | "versions": {
232 | "type": "array",
233 | "items": {
234 | "type": "string"
235 | }
236 | },
237 | "ecosystem_specific": {
238 | "type": "object"
239 | },
240 | "database_specific": {
241 | "type": "object"
242 | }
243 | }
244 | }
245 | },
246 | "references": {
247 | "type": ["array", "null"],
248 | "items": {
249 | "type": "object",
250 | "properties": {
251 | "type": {
252 | "type": "string",
253 | "enum": [
254 | "ADVISORY",
255 | "ARTICLE",
256 | "DETECTION",
257 | "DISCUSSION",
258 | "REPORT",
259 | "FIX",
260 | "INTRODUCED",
261 | "GIT",
262 | "PACKAGE",
263 | "EVIDENCE",
264 | "WEB"
265 | ]
266 | },
267 | "url": {
268 | "type": "string",
269 | "format": "uri"
270 | }
271 | },
272 | "required": [
273 | "type",
274 | "url"
275 | ]
276 | }
277 | },
278 | "credits": {
279 | "type": "array",
280 | "items": {
281 | "type": "object",
282 | "properties": {
283 | "name": {
284 | "type": "string"
285 | },
286 | "contact": {
287 | "type": "array",
288 | "items": {
289 | "type": "string"
290 | }
291 | },
292 | "type": {
293 | "type": "string",
294 | "enum": [
295 | "FINDER",
296 | "REPORTER",
297 | "ANALYST",
298 | "COORDINATOR",
299 | "REMEDIATION_DEVELOPER",
300 | "REMEDIATION_REVIEWER",
301 | "REMEDIATION_VERIFIER",
302 | "TOOL",
303 | "SPONSOR",
304 | "OTHER"
305 | ]
306 | }
307 | },
308 | "required": [
309 | "name"
310 | ]
311 | }
312 | },
313 | "database_specific": {
314 | "type": "object"
315 | }
316 | },
317 | "required": [
318 | "id",
319 | "modified"
320 | ]
321 | }
322 |
--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------
1 | [build-system]
2 | requires = ["maturin>=0.15"]
3 | build-backend = "maturin"
4 |
5 | [project]
6 | name = "pyscan-rs"
7 | requires-python = ">=3.7"
8 | classifiers = [
9 | "Programming Language :: Rust",
10 | "Programming Language :: Python :: Implementation :: CPython",
11 | "Programming Language :: Python :: Implementation :: PyPy",
12 | ]
13 |
14 |
15 | [tool.maturin]
16 | bindings = "bin"
17 |
--------------------------------------------------------------------------------
/python/pyscan/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ohaswin/pyscan/959b39c8d025e4802eee7a30fef7a408186b7f9f/python/pyscan/__init__.py
--------------------------------------------------------------------------------
/python/pyscan/__main__.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | import sysconfig
4 | from pathlib import Path
5 |
6 |
7 | def find_pyscan_bin() -> Path:
8 | """Return the pyscan binary path."""
9 |
10 | pyscan_exe = "pyscan" + sysconfig.get_config_var("EXE")
11 |
12 | path = Path(sysconfig.get_path("bin")) / pyscan_exe
13 | if path.is_file():
14 | return path
15 |
16 | if sys.version_info >= (3, 10):
17 | user_scheme = sysconfig.get_preferred_scheme("user")
18 | elif os.name == "nt":
19 | user_scheme = "nt_user"
20 | elif sys.platform == "darwin" and sys._framework:
21 | user_scheme = "osx_framework_user"
22 | else:
23 | user_scheme = "posix_user"
24 |
25 | path = Path(sysconfig.get_path("bin", scheme=user_scheme)) / pyscan_exe
26 | if path.is_file():
27 | return path
28 |
29 | raise FileNotFoundError(path)
30 |
31 |
32 | if __name__ == "__main__":
33 | pyscan = find_pyscan_bin()
34 | sys.exit(os.spawnv(os.P_WAIT, pyscan, ["pyscan", *sys.argv[1:]]))
--------------------------------------------------------------------------------
/src/display/mod.rs:
--------------------------------------------------------------------------------
1 | use crate::parser::structs::ScannedDependency;
2 | use console::{style, Term};
3 | use once_cell::sync::Lazy;
4 | use std::{collections::HashMap, io, process::exit};
5 |
6 | static CONS: Lazy<Term> = Lazy::new(Term::stdout);
7 |
8 | pub struct Progress {
9 | // this progress info only contains progress info about the found vulns.
10 | pub count: usize,
11 | current_displayed: usize,
12 | }
13 |
14 | impl Progress {
15 | pub fn new() -> Progress {
16 | Progress {
17 | count: 0,
18 | current_displayed: 0,
19 | }
20 | }
21 | pub fn display(&mut self) {
22 | if self.count > 1 {
23 | let _ = CONS.clear_last_lines(1);
24 | }
25 |
26 | if self.count > self.current_displayed {
27 | let _ = CONS.write_line(
28 | format!(
29 | "Found {} vulnerabilities so far",
30 | style(self.count).bold().bright().red()
31 | )
32 | .as_str(),
33 | );
34 | self.current_displayed = self.count;
35 | }
36 | }
37 |
38 | pub fn count_one(&mut self) {
39 | self.count += 1;
40 | }
41 | pub fn end(&mut self) {
42 | let _ = CONS.clear_last_lines(1);
43 | }
44 | }
45 |
46 | pub fn display_queried(
47 | collected: &Vec<ScannedDependency>,
48 | imports_info: &mut HashMap<String, String>,
49 | ) {
50 | // --- displaying query result starts here ---
51 | for dep in collected {
52 | let _ = CONS.write_line(
53 | format!(
54 | "|-| {} [{}]{:^5}",
55 | style(dep.name.as_str()).bold().bright().yellow(),
56 | style(dep.version.as_str()).bold().dim(),
57 | style(" -> Found vulnerabilities!").bold().bright().red()
58 | )
59 | .as_str(),
60 | );
61 | } // displays all the deps where vuln has been found
62 |
63 | // remove the deps with vulns from imports_info so what remains is the safe deps, which we can display as safe
64 | for d in collected.iter() {
65 | imports_info.remove(d.name.as_str());
66 | }
67 |
68 | for (k, v) in imports_info.iter() {
69 | let _ = CONS.write_line(
70 | format!(
71 | "|-| {} [{}]{}",
72 | style(k.as_str()).bold().bright().yellow(),
73 | style(v.as_str()).bold().dim(),
74 | style(" -> No vulnerabilities found.")
75 | .bold()
76 | .bright()
77 | .green()
78 | )
79 | .as_str(),
80 | );
81 | } // display the safe deps
82 | let _ = display_summary(&collected);
83 | }
84 |
85 | pub fn display_summary(collected: &Vec<ScannedDependency>) -> io::Result<()> {
86 | // thing is, collected only has vulnerable dependencies, if theres a case where no vulns have been found, it will just skip this entire thing.
87 | if !collected.is_empty() {
88 | // --- summary starts here ---
89 | CONS.write_line(&format!(
90 | "{}",
91 | style("SUMMARY").bold().yellow().underlined()
92 | ))?;
93 | for v in collected {
94 | for vuln in &v.vuln.vulns {
95 | // DEPENDENCY
96 | let name = format!(
97 | "Dependency: {}",
98 | style(v.name.clone()).bold().bright().red()
99 | );
100 |
101 | CONS.write_line(name.as_str())?;
102 | CONS.flush()?;
103 |
104 | // ID
105 | let id = format!("ID: {}", style(vuln.id.as_str()).bold().bright().yellow());
106 | CONS.write_line(id.as_str())?;
107 | CONS.flush()?;
108 |
109 | // DETAILS
110 | let details = format!("Details: {}", style(vuln.details.as_str()).italic());
111 | CONS.write_line(details.as_str())?;
112 | CONS.flush()?;
113 |
114 | // VERSIONS AFFECTED from ... to
115 | let vers: Vec<Vec<String>> = vuln
116 | .affected
117 | .iter()
118 | .map(|affected| {
119 | vec![
120 | {
121 | if let Some(v) = &affected.versions {
122 | v.first().unwrap().to_string()
123 | } else {
124 | "This version".to_string()
125 | }
126 | },
127 | {
128 | if let Some(v) = &affected.versions {
129 | v.last().unwrap().to_string()
130 | } else {
131 | "Unknown".to_string()
132 | }
133 | },
134 | ]
135 | })
136 | .collect();
137 | // let vers: Vec<Vec<String>> = vuln.affected.iter().map(|affected| {vec![affected.versions.first().unwrap().to_string(), affected.versions.last().unwrap().to_string()]}).collect();
138 |
139 | let version = format!(
140 | "Versions affected: {} to {}",
141 | style(
142 | vers.first()
143 | .expect("No version found affected")
144 | .first()
145 | .unwrap()
146 | )
147 | .dim()
148 | .underlined(),
149 | style(
150 | vers.last()
151 | .expect("No version found affected")
152 | .last()
153 | .unwrap()
154 | )
155 | .dim()
156 | .underlined()
157 | );
158 |
159 | println!();
160 |
161 | CONS.write_line(version.as_str())?;
162 | CONS.flush()?;
163 | }
164 | }
165 | } else {
166 | println!("Finished scanning all found dependencies.");
167 | exit(0)
168 | }
169 | Ok(())
170 | }
171 |
--------------------------------------------------------------------------------
/src/docker/mod.rs:
--------------------------------------------------------------------------------
1 | // Import the std::process module to use Command
2 | use std::{path::{PathBuf, Path}, process::Command};
3 |
4 | use crate::parser::scan_dir;
5 |
6 | // Define a custom error type that wraps a String message
7 | #[derive(Debug)]
8 | pub struct DockerError(String);
9 |
10 | // Implement the std::error::Error trait for DockerError
11 | impl std::error::Error for DockerError {}
12 |
13 | // Implement the std::fmt::Display trait for DockerError
14 | impl std::fmt::Display for DockerError {
15 | fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
16 | write!(f, "Docker error: {}", self.0)
17 | }
18 | }
19 |
20 | // Define a function that takes a docker image name and a path inside the image,
21 | // copies that path to the host, scans it, and returns either () or a DockerError
22 | pub async fn list_files_in_docker_image(image: &str, path: PathBuf) -> Result<(), DockerError> {
23 | // Create a Command object to run docker commands
24 | let mut cmd = Command::new("docker");
25 |
26 | // Use the "create" subcommand to create a container from the image
27 | // without starting it
28 | cmd.arg("create").arg(image);
29 |
30 | // Execute the command and get the output
31 | let output = cmd.output().map_err(|e| DockerError(e.to_string()))?;
32 |
33 | // Check if the command was successful
34 | if !output.status.success() {
35 | // Return an error with the command's stderr
36 | return Err(DockerError(
37 | String::from_utf8_lossy(&output.stderr).to_string(),
38 | ));
39 | }
40 |
41 | // Get the container ID from the output
42 | let container_id = String::from_utf8(output.stdout)
43 | .map_err(|e| DockerError(e.to_string()))?
44 | .trim()
45 | .to_string();
46 |
47 | // Create another Command object to run docker commands
48 | let mut cmd = Command::new("docker");
49 | let cmd = cmd.current_dir(".");
50 |
51 | // Create a tmp folder to keep our docker-files
52 | create_tmp_folder(".")
53 | .expect("Could not create a temporary folder for the docker files. Try creating it yourself:\n./tmp/docker-files\n");
54 |
55 | // Use the "cp" subcommand to copy all files from the container
56 | // to a temporary directory on the host
57 | cmd.arg("cp")
58 | .arg(format!(
59 | "{}:/{}",
60 | container_id,
61 | path.to_str().expect("Path contains non-unicode characters")
62 | ))
63 | .arg("./tmp/docker-files");
64 |
65 | // Execute the command and get the output
66 | let output = cmd.output().map_err(|e| DockerError(e.to_string()))?;
67 |
68 | // Check if the command was successful
69 | if !output.status.success() {
70 | // Return an error with the command's stderr
71 | return Err(DockerError(
72 | String::from_utf8_lossy(&output.stderr).to_string(),
73 | ));
74 | }
75 |
76 | scan_dir(Path::new("./tmp/docker-files")).await;
77 | cleanup().map_err(|e| DockerError(e.to_string()) )?;
78 |
79 | // docker stop
80 | let mut cmd = Command::new("docker");
81 | cmd.arg("stop")
82 | .arg(container_id.clone());
83 |
84 | // Execute the command and get the output
85 | let _output = cmd.output().map_err(|e| DockerError(e.to_string()))?;
86 |
87 | // docker remove
88 | let mut cmd = Command::new("docker");
89 | cmd.arg("rm")
90 | .arg(container_id);
91 |
92 | // Execute the command and get the output
93 | let _output = cmd.output().map_err(|e| DockerError(e.to_string()))?;
94 | Ok(())
95 |
96 |
97 | // // Create another Command object to run shell commands
98 | // let mut cmd = Command::new("sh");
99 |
100 | // // Use the "-c" argument to run a shell command that lists all files
101 | // // in the temporary directory and removes the directory prefix
102 | // cmd.arg("-c")
103 | // .arg(format!("cd ./tmp/docker-files/{} && ls -F", path.to_str().unwrap()));
104 |
105 | // // Execute the command and get the output
106 | // let output = cmd.output().map_err(|e| DockerError(e.to_string()))?;
107 |
108 | // // Check if the command was successful
109 | // if !output.status.success() {
110 | // // Return an error with the command's stderr
111 | // return Err(DockerError(
112 | // String::from_utf8_lossy(&output.stderr).to_string(),
113 | // ));
114 | // }
115 |
116 | // // Get the filenames from the output as a vector of strings
117 | // let filenames = String::from_utf8(output.stdout)
118 | // .map_err(|e| DockerError(e.to_string()))?
119 | // .lines()
120 | // .map(|s| s.to_string())
121 | // .collect();
122 |
123 | // // Return the filenames vector as Ok value
124 | // Ok(filenames)
125 | }
126 |
127 | fn create_tmp_folder(path: &str) -> std::io::Result<()> {
128 | let tmp_path = format!("{}/tmp/docker-files", path);
129 | std::fs::create_dir_all(tmp_path)?;
130 | Ok(())
131 | }
132 |
133 | fn cleanup() -> Result<(), std::io::Error> {
134 | std::fs::remove_dir_all("./tmp/docker-files")
135 | }
136 |
--------------------------------------------------------------------------------
/src/main.rs:
--------------------------------------------------------------------------------
1 | use clap::{Parser, Subcommand};
2 | use console::style;
3 | use once_cell::sync::Lazy;
4 | use std::sync::OnceLock;
5 | use std::{path::PathBuf, process::exit};
6 | use utils::{PipCache, SysInfo};
7 | mod display;
8 | mod docker;
9 | mod parser;
10 | mod scanner;
11 | mod utils;
12 | use crate::{
13 | parser::structs::{Dependency, VersionStatus},
14 | utils::get_version,
15 | };
16 | use std::env;
17 | use tokio::task;
18 |
19 | #[derive(Parser, Debug)]
20 | #[command(
21 | author = "aswinnnn",
22 | version = "0.1.7",
23 | about = "python dependency vulnerability scanner.\n\ndo 'pyscan [subcommand] --help' for specific help."
24 | )]
25 | struct Cli {
26 | /// path to source. (default: current directory)
27 | #[arg(long,short,default_value=None,value_name="DIRECTORY")]
28 | dir: Option<PathBuf>,
29 |
30 | /// export the result to a desired format. [json]
31 | #[arg(long, short, required = false, value_name = "FILENAME")]
32 | output: Option<String>,
33 |
34 | /// search for a single package.
35 | #[command(subcommand)]
36 | subcommand: Option<SubCommand>,
37 |
38 | /// skip: skip the given databases
39 | /// ex. pyscan -s osv snyk
40 | /// hidden due to only having one database for now.
41 | #[arg(
42 | short,
43 | long,
44 | value_delimiter = ' ',
45 | value_name = "VAL1 VAL2 VAL3...",
46 | hide = true
47 | )]
48 | skip: Vec<String>,
49 |
50 | /// show the version and information about a package from all available sources. (does not search for vulns, use 'package' subcommand for that).
51 | /// usage: pyscan show requests pyscan-rs lxml koda
52 | /// hidden due to unfinished
53 | #[arg(
54 | long,
55 | value_delimiter = ' ',
56 | value_name = "package1 package2 package3...",
57 | hide = true
58 | )]
59 | show: Vec<String>,
60 |
61 | /// Uses pip to retrieve versions. if not provided it will use the source, falling back on pip if not, pypi.org.
62 | #[arg(long, required=false, action=clap::ArgAction::SetTrue)]
63 | pip: bool,
64 |
65 | /// Same as --pip except uses pypi.org to retrieve the latest version for the packages.
66 | #[arg(long, required=false,action=clap::ArgAction::SetTrue)]
67 | pypi: bool,
68 |
69 | /// turns off the caching of pip packages at the starting of execution.
70 | #[arg(long="cache-off", required=false,action=clap::ArgAction::SetTrue)]
71 | cache_off: bool,
72 | }
73 |
74 | #[derive(Subcommand, Debug, Clone)]
75 | enum SubCommand {
76 | /// query for a single python package
77 | Package {
78 | /// name of the package
79 | #[arg(long, short)]
80 | name: String,
81 |
82 | /// version of the package (defaults to latest if not provided)
83 | #[arg(long, short, default_value=None)]
84 | version: Option<String>,
85 | },
86 |
87 | /// scan inside a docker image
88 | Docker {
89 | /// name of the docker image
90 | #[arg(long, short)]
91 | name: String,
92 |
93 | /// path inside your docker container where requirements.txt is, or just the folder name where your Dockerfile (along with requirements.txt) is.
94 | #[arg(long, short, value_name = "DIRECTORY")]
95 | path: PathBuf,
96 | },
97 | }
98 |
99 | static ARGS: Lazy<OnceLock<Cli>> = Lazy::new(|| OnceLock::from(Cli::parse()));
100 |
101 | // Why is the args a static variable? Some arguments need to be seen by other files in the codebase
102 | // such as --pip or --pypi due to different use cases. Args only get written to once so it shouldn't pose a problem (the reason it's OnceLock'ed).
103 | // Why is it Lazy? Something about a non-const fn in a const world. Pretty surprised to see the compiler recommend an outside crate for this issue.
104 |
105 | static PIPCACHE: Lazy<PipCache> = Lazy::new(|| utils::PipCache::init());
106 | // is a hashmap of (package name, version) from 'pip list'
107 | // because calling 'pip show' everytime might get expensive if theres a lot of dependencies to check.
108 |
109 | #[tokio::main]
110 | async fn main() {
111 | match &ARGS.get().unwrap().subcommand {
112 | // subcommand package
113 | Some(SubCommand::Package { name, version }) => {
114 | // let osv = Osv::new().expect("Cannot access the API to get the latest package version.");
115 | let version = if let Some(v) = version {
116 | v.to_string()
117 | } else {
118 | utils::get_package_version_pypi(name.as_str())
119 | .await
120 | .expect("Error in retrieving stable version from API")
121 | .to_string()
122 | };
123 |
124 | let dep = Dependency {
125 | name: name.to_string(),
126 | version: Some(version),
127 | comparator: None,
128 | version_status: VersionStatus {
129 | pypi: false,
130 | pip: false,
131 | source: false,
132 | },
133 | };
134 |
135 | // start() from scanner only accepts Vec<Dependency> so
136 | let vdep = vec![dep];
137 |
138 | let _res = scanner::start(vdep).await;
139 | exit(0)
140 | }
141 | Some(SubCommand::Docker { name, path }) => {
142 | println!(
143 | "{} {}\n{} {}",
144 | style("Docker image:").yellow().blink(),
145 | style(name.clone()).bold().green(),
146 | style("Path inside container:").yellow().blink(),
147 | style(path.to_string_lossy()).bold().green()
148 | );
149 | println!("{}",
150 | style("--- Make sure you run the command with elevated permissions (sudo/administrator) as pyscan might have trouble accessing files inside docker containers ---").dim());
151 | docker::list_files_in_docker_image(name, path.to_path_buf())
152 | .await
153 | .expect("Error in scanning files from Docker image.");
154 | exit(0)
155 | }
156 | None => (),
157 | }
158 |
159 | println!(
160 | "pyscan v{} | by Aswin S (github.com/aswinnnn) | \x1b[90mConsider donating to a broke college student: https://ko-fi.com/aswinnnn \x1b[0m",
161 | get_version()
162 | );
163 |
164 | let sys_info = SysInfo::new().await;
165 | // supposed to be a global static, cant atm because async closures are unstable.
166 | // has to be run in a different thread due to underlying blocking functions, to be fixed soon.
167 |
168 | task::spawn(async move {
169 | // init pip cache, if cache-off is false or pip has been found
170 | if !&ARGS.get().unwrap().cache_off | sys_info.pip_found {
171 | let _ = PIPCACHE.lookup(" ");
172 | // since it's in Lazy, its first access would init the cache; the result is ignorable.
173 | }
174 | // has to be run on another thread to not block user functionality
175 | // it still blocks because i cant make pip_list() async or PIPCACHE would fail
176 | // as async closures aren't stable yet.
177 | // but it removes a 3s delay, for now.
178 | });
179 |
180 | // --- giving control to parser starts here ---
181 |
182 | // if a directory path is provided
183 | if let Some(dir) = &ARGS.get().unwrap().dir {
184 | parser::scan_dir(dir.as_path()).await
185 | }
186 | // if not, use cwd
187 | else if let Ok(dir) = env::current_dir() {
188 | parser::scan_dir(dir.as_path()).await
189 | } else {
190 | eprintln!("the given directory is empty.");
191 | exit(1)
192 | }; // err when dir is empty
193 | }
194 |
--------------------------------------------------------------------------------
/src/parser/extractor.rs:
--------------------------------------------------------------------------------
1 | use std::process::exit;
2 |
3 | /// for the parser module, extractor.rs is the backbone of all parsing
4 | /// it takes a String and a mutable reference to a Vec<Dependency>.
5 | /// String is the contents of a source file, while the mut ref vector will
6 | /// be used to collect the dependencies that we have extracted from the contents.
7 | use super::structs::{Dependency, VersionStatus};
8 |
9 | use lazy_static::lazy_static;
10 | use pep_508::{self, Spec};
11 | use regex::Regex;
12 |
13 | use toml::{de::Error, Table, Value};
14 |
15 | pub fn extract_imports_python(text: String, imp: &mut Vec<Dependency>) {
16 | lazy_static! {
17 | static ref IMPORT_REGEX: Regex =
18 | Regex::new(r"^\s*(?:from|import)\s+(\w+(?:\s*,\s*\w+)*)").unwrap();
19 | }
20 |
21 | for x in IMPORT_REGEX.find_iter(&text) {
22 | let mat = x.as_str().to_string();
23 | let mat = mat.replacen("import", "", 1).trim().to_string();
24 |
25 | imp.push(Dependency {
26 | name: mat,
27 | version: None,
28 | comparator: None,
29 | version_status: VersionStatus {
30 | pypi: false,
31 | pip: false,
32 | source: false,
33 | },
34 | })
35 | }
36 | }
37 |
38 | pub fn extract_imports_reqs(text: String, imp: &mut Vec<Dependency>) {
39 | // requirements.txt uses a PEP 508 parser to parse dependencies accordingly
40 | // you might think its just a text file, but I'm gonna decline reinventing the wheel
41 | // just to parse "requests >= 2.0.8"
42 |
43 | let parsed = pep_508::parse(text.as_str());
44 |
45 | if let Ok(ref dep) = parsed {
46 | let dname = dep.name.to_string();
47 | // println!("{:?}", parsed.clone());
48 | if let Some(ver) = &dep.spec {
49 | if let Spec::Version(verspec) = ver {
50 | if let Some(v) = verspec.iter().next() {
51 | // pyscan only takes the first version spec found for the dependency
52 | let version = v.version.to_string();
53 | let comparator = v.comparator;
54 | imp.push(Dependency {
55 | name: dname,
56 | version: Some(version),
57 | comparator: Some(comparator),
58 | version_status: VersionStatus {
59 | pypi: false,
60 | pip: false,
61 | source: true,
62 | },
63 | });
64 | }
65 | }
66 | } else {
67 | imp.push(Dependency {
68 | name: dname,
69 | version: None,
70 | comparator: None,
71 | version_status: VersionStatus {
72 | pypi: false,
73 | pip: false,
74 | source: false,
75 | },
76 | });
77 | }
78 | } else if let Err(e) = parsed {
79 | println!("{:#?}", e);
80 | }
81 | }
82 |
83 | // pub fn extract_imports_pyproject(f: String, imp: &mut Vec<Dependency>) {
84 | // let parsed = f.parse::<Table>();
85 | // if let Ok(parsed) = parsed {
86 | // let project = &parsed["project"];
87 | // let deps = &project["dependencies"];
88 | // let deps = deps
89 | // .as_array()
90 | // .expect("Could not find the dependencies table in your pyproject.toml");
91 | // for d in deps {
92 | // let d = d.as_str().unwrap();
93 | // let parsed = pep_508::parse(d);
94 | // if let Ok(dep) = parsed {
95 | // let dname = dep.name.to_string();
96 | // // println!("{:?}", dep.clone());
97 | // if let Some(ver) = dep.spec {
98 | // if let Spec::Version(verspec) = ver {
99 | // for v in verspec {
100 | // // pyscan only takes the first version spec found for the dependency
101 | // // for now.
102 | // let version = v.version.to_string();
103 | // let comparator = v.comparator;
104 | // imp.push(Dependency {
105 | // name: dname,
106 | // version: Some(version),
107 | // comparator: Some(comparator),
108 | // version_status: VersionStatus {
109 | // pypi: false,
110 | // pip: false,
111 | // source: true,
112 | // },
113 | // });
114 | // break;
115 | // }
116 | // }
117 | // } else {
118 | // imp.push(Dependency {
119 | // name: dname,
120 | // version: None,
121 | // comparator: None,
122 | // version_status: VersionStatus {
123 | // pypi: false,
124 | // pip: false,
125 | // source: false,
126 | // },
127 | // });
128 | // }
129 | // }
130 | // }
131 | // }
132 | // }
133 |
134 | pub fn extract_imports_setup_py(setup_py_content: &str, imp: &mut Vec<Dependency>) {
135 | let mut deps = Vec::new();
136 |
137 | // regex for install_requires section
138 | let re = Regex::new(r"install_requires\s*=\s*\[([^\]]+)\]").expect("Invalid regex pattern");
139 |
140 | for cap in re.captures_iter(setup_py_content) {
141 | if let Some(matched) = cap.get(1) {
142 | // Split the matched text by ',' and trim whitespace
143 | deps.extend(
144 | matched
145 | .as_str()
146 | .split(',')
147 | .map(|dep| dep.trim().replace("\"", "").replace("\\", "").to_string()),
148 | );
149 | }
150 | }
151 |
152 | for d in deps {
153 | let d = d.as_str();
154 | let parsed = pep_508::parse(d);
155 | if let Ok(dep) = parsed {
156 | let dname = dep.name.to_string();
157 | if let Some(ver) = dep.spec {
158 | if let Spec::Version(verspec) = ver {
159 | if let Some(v) = verspec.first() {
160 | // pyscan only takes the first version spec found for the dependency
161 | // for now.
162 | let version = v.version.to_string();
163 | let comparator = v.comparator;
164 | imp.push(Dependency {
165 | name: dname,
166 | version: Some(version),
167 | comparator: Some(comparator),
168 | version_status: VersionStatus {
169 | pypi: false,
170 | pip: false,
171 | source: true,
172 | },
173 | });
174 | }
175 | }
176 | } else {
177 | imp.push(Dependency {
178 | name: dname,
179 | version: None,
180 | comparator: None,
181 | version_status: VersionStatus {
182 | pypi: false,
183 | pip: false,
184 | source: false,
185 | },
186 | });
187 | }
188 | }
189 | }
190 | }
191 |
192 | pub fn extract_imports_pyproject(
193 | toml_content: String,
194 | imp: &mut Vec<Dependency>,
195 | ) -> Result<(), Error> {
196 | // Parse the toml content into a Value
197 | let toml_value: Value = toml::from_str(toml_content.as_str())?;
198 | // println!("{:#?}",toml_value);
199 |
200 | // Helper function to extract dependency values (version strings) including nested tables
201 | fn extract_dependencies(
202 | table: &toml::value::Table,
203 | poetry: Option<bool>,
204 | ) -> Result<Vec<String>, Error> {
205 | let mut deps = Vec::new();
206 |
207 | // for [project] in pyproject.toml, the insides require a different sort of parsing
208 | // for poetry you need both keys and values (as dependency name and version),
209 | // for [project] the values are just enough and the keys are in the vec below
210 | let projectlevel: Vec<&str> = vec![
211 | "dependencies",
212 | "optional-dependencies.docs",
213 | "optional-dependencies",
214 | ];
215 |
216 | for (key, version) in table {
217 | if projectlevel.contains(&key.as_str()) {
218 | match version {
219 | Value::String(version_str) => {
220 | deps.push(version_str.to_string());
221 | }
222 | Value::Table(nested_table) => {
223 | if "optional-dependencies" == key {
224 | parse_opt_deps_pyproject(nested_table.clone(), &mut deps);
225 | } else {
226 | // Recursively extract dependencies from nested tables
227 | let nested_deps = extract_dependencies(nested_table, None)?;
228 | deps.extend(nested_deps);
229 | }
230 | }
231 | Value::Array(array) => {
232 | // Extract dependencies from an array (if any)
233 | for item in array {
234 | if let Value::String(item_str) = item {
235 | deps.push(item_str.to_string());
236 | }
237 | }
238 | }
239 | _ => eprintln!("ERR: Invalid dependency syntax found while TOML parsing"),
240 | }
241 | } else if poetry.unwrap_or(false) {
242 | match version {
243 | Value::String(version_str) => {
244 | let verstr = version_str.to_string();
245 | if let Some(caret_version) = verstr.strip_prefix('^') {
246 | let s = format!("{} >= {}", key, caret_version);
247 | deps.push(s);
248 | } else if verstr == "*" {
249 | deps.push(key.to_string());
250 | }
251 | }
252 | Value::Table(nested_table) => {
253 | // Recursively extract dependencies from nested tables
254 | let nested_deps = extract_dependencies(nested_table, None)?;
255 | deps.extend(nested_deps);
256 | }
257 | Value::Array(array) => {
258 | // Extract dependencies from an array (if any)
259 | for item in array {
260 | if let Value::String(item_str) = item {
261 | deps.push(item_str.to_string());
262 | }
263 | }
264 | }
265 | _ => eprintln!("ERR: Invalid dependency syntax found while TOML parsing"),
266 | }
267 | }
268 | }
269 | Ok(deps)
270 | }
271 |
272 | // Extract dependencies from different sections
273 | let mut all_dependencies = Vec::new();
274 |
275 | // Look for keys like "dependencies" and "optional-dependencies"
276 | let keys_to_check = vec!["project", "optional-dependencies", "tool"];
277 |
278 | for key in keys_to_check {
279 | if key.contains("tool") {
280 | if let Some(dependencies_table) = toml_value.get("tool") {
281 | if let Some(dependencies_table) = dependencies_table.get("poetry") {
282 | let poetrylevel: Vec<&str> = vec!["dependencies", "dev-dependencies"];
283 | for k in poetrylevel.into_iter() {
284 | if let Some(dep) = dependencies_table.get(k) {
285 | match dep {
286 | Value::Table(table) => {
287 | all_dependencies
288 | .extend(extract_dependencies(table, Some(true))?);
289 | }
290 | // its definitely gonna be a table anyway, so...
291 | Value::String(_) => todo!(),
292 | Value::Integer(_) => todo!(),
293 | Value::Float(_) => todo!(),
294 | Value::Boolean(_) => todo!(),
295 | Value::Datetime(_) => todo!(),
296 | Value::Array(_) => todo!(),
297 | }
298 | }
299 | }
300 | }
301 | }
302 | }
303 | // if its not poetry, check for [project] dependencies
304 | else if !key.contains("poetry") {
305 | if let Some(dependencies_table) = toml_value.get(key) {
306 | if let Some(dependencies) = dependencies_table.as_table() {
307 | all_dependencies.extend(extract_dependencies(dependencies, None)?);
308 | }
309 | }
310 | } else {
311 | eprintln!(
312 | "This pyproject.toml does not look like that of a Python project. Please check and make \
313 | sure you are in the right directory, or check the toml file."
314 | );
315 | exit(1)
316 | }
317 | }
318 | // the toml might contain repeated dependencies for different tools, dev tests, etc.
319 | all_dependencies.sort(); // dedup() only removes adjacent duplicates, so sort first
320 | all_dependencies.dedup();
321 |
322 | for d in all_dependencies {
323 | let d = d.as_str();
324 | let parsed = pep_508::parse(d);
325 | if let Ok(dep) = parsed {
326 | let dname = dep.name.to_string();
327 | if let Some(ver) = dep.spec {
328 | if let Spec::Version(verspec) = ver {
329 | if let Some(v) = verspec.into_iter().next() {
330 | let version = v.version.to_string();
331 | let comparator = v.comparator;
332 | imp.push(Dependency {
333 | name: dname.clone(),
334 | version: Some(version),
335 | comparator: Some(comparator),
336 | version_status: VersionStatus {
337 | pypi: false,
338 | pip: false,
339 | source: true,
340 | },
341 | });
342 | }
343 | }
344 | } else {
345 | imp.push(Dependency {
346 | name: dname.clone(),
347 | version: None,
348 | comparator: None,
349 | version_status: VersionStatus {
350 | pypi: false,
351 | pip: false,
352 | source: false,
353 | },
354 | });
355 | }
356 | }
357 | }
358 | Ok(())
359 | }
360 |
361 | pub fn parse_opt_deps_pyproject(table: Table, deps: &mut Vec<String>) {
362 | for v in table.values() {
363 | match v {
364 | Value::Array(a) => {
365 | for d in a {
366 | match d {
367 | Value::String(dependency) => {
368 | deps.push(dependency.to_owned());
369 | }
370 | Value::Integer(_) => todo!(),
371 | Value::Float(_) => todo!(),
372 | Value::Boolean(_) => todo!(),
373 | Value::Datetime(datetime) => todo!(),
374 | Value::Array(vec) => todo!(),
375 | Value::Table(map) => todo!(),
376 | }
377 | }
378 | }
379 | Value::String(_) => todo!(),
380 | Value::Integer(_) => todo!(),
381 | Value::Float(_) => todo!(),
382 | Value::Boolean(_) => todo!(),
383 | Value::Datetime(datetime) => todo!(),
384 | Value::Table(map) => todo!(),
385 | }
386 | }
387 | }
388 |
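The Poetry branch of `extract_dependencies` turns a `name = "constraint"` pair into a PEP 508-style string before handing it to the parser. A minimal, std-only sketch of that conversion is below. `poetry_to_pep508` is a hypothetical helper name, not part of pyscan, and like the code above it keeps only the lower bound of a caret constraint and ignores every other constraint form.

```rust
/// Convert one `[tool.poetry.dependencies]` entry into a PEP 508-style string.
/// Hypothetical helper for illustration; not part of pyscan.
fn poetry_to_pep508(name: &str, constraint: &str) -> Option<String> {
    let constraint = constraint.trim();
    if constraint == "*" {
        // Any version is acceptable: emit the bare package name.
        Some(name.to_string())
    } else if let Some(version) = constraint.strip_prefix('^') {
        // Keep the lower bound only; the caret's upper bound is dropped.
        Some(format!("{} >= {}", name, version.trim()))
    } else {
        // Other constraint forms (e.g. `~4.2`) are not handled in this sketch.
        None
    }
}
```

Calling `poetry_to_pep508("requests", "^2.31.0")` yields `Some("requests >= 2.31.0")`, which a PEP 508 parser can then consume. Using `strip_prefix` both tests for the caret and removes it, so there is no separate `unwrap` that could panic.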
--------------------------------------------------------------------------------
/src/parser/mod.rs:
--------------------------------------------------------------------------------
1 | use std::fs;
2 | use std::io::{BufRead, BufReader};
3 | use std::path::Path;
4 | use std::process::exit;
5 | use std::{ffi::OsString, fs::File};
6 | mod extractor;
7 | pub mod structs;
8 | use super::scanner;
9 | use structs::{FileTypes, FoundFile, FoundFileResult};
10 |
11 | pub async fn scan_dir(dir: &Path) {
12 | let mut result = FoundFileResult::new(); // contains found files
13 |
14 | if let Ok(entries) = fs::read_dir(dir) {
15 | for entry in entries.flatten() {
16 | let filename = entry.file_name();
17 | let filext = if let Some(ext) = Path::new(&filename).extension() {
18 | ext.to_os_string()
19 | } else {"none".into()};
20 |
21 |
22 | // setup.py check comes first otherwise it might cause issues with .py checker
23 | if *"setup.py" == filename.clone() {
24 | result.add(FoundFile {
25 | name: filename,
26 | filetype: FileTypes::SetupPy,
27 | path: OsString::from(entry.path()),
28 | });
29 | result.setuppy();
30 | }
31 | // check if .py
32 | // checking file extension straight up from filename caused some bugs.
33 | else if "py" == filext { // Path::extension() returns the extension without the leading dot
34 | result.add(FoundFile {
35 | name: filename,
36 | filetype: FileTypes::Python,
37 | path: OsString::from(entry.path()),
38 | });
39 | result.python(); // internal count of the file found
40 | }
41 | // requirements.txt
42 | else if *"requirements.txt" == filename.clone() {
43 | result.add(FoundFile {
44 | name: filename,
45 | filetype: FileTypes::Requirements,
46 | path: OsString::from(entry.path()),
47 | });
48 | result.reqs();
49 | }
50 | // constraints.txt
51 | else if *"constraints.txt" == filename.clone() {
52 | result.add(FoundFile {
53 | name: filename,
54 | filetype: FileTypes::Constraints,
55 | path: OsString::from(entry.path()),
56 | });
57 | result.constraints();
58 | }
59 | // pyproject.toml
60 | else if *"pyproject.toml" == filename.clone() {
61 | result.add(FoundFile {
62 | name: filename,
63 | filetype: FileTypes::Pyproject,
64 | path: OsString::from(entry.path()),
65 | });
66 | result.pyproject();
67 | }
68 | }
69 | }
70 | // println!("{:?}", result.clone());
71 |
72 | // --- find_import takes the result ---
73 |
74 | find_import(result).await
75 | }
76 |
77 | /// A nice abstraction over different ways to find imports for different filetypes.
78 | async fn find_import(res: FoundFileResult) {
79 | let files = res.files;
80 | if res.reqs_found > res.pyproject_found {
81 | // if there's a requirements.txt and pyproject.toml isn't there
82 | find_reqs_imports(&files).await
83 | } else if res.reqs_found != 0 {
84 | // if both reqs and pyproject are present, go for reqs first
85 | find_reqs_imports(&files).await
86 | } else if res.constraints_found != 0 {
87 | // since constraints and requirements have the same syntax, it's okay to use the same parser.
88 | find_reqs_imports(&files).await
89 | } else if res.pyproject_found != 0 {
90 | // use pyproject instead (if it exists)
91 | find_pyproject_imports(&files).await
92 | } else if res.setuppy_found != 0 {
93 | find_setuppy_imports(&files).await
94 | } else if res.py_found != 0 {
95 | // make sure there's at least one python file, then use that
96 | find_python_imports(&files).await
97 | } else {
98 | eprintln!(
99 | "Could not find any requirements.txt, pyproject.toml or python files in this directory"
100 | ); exit(1)
101 | }
102 | }
103 |
104 | async fn find_setuppy_imports(f: &Vec<FoundFile>) {
105 | let cons = console::Term::stdout();
106 | cons.write_line("Using setup.py as source...")
107 | .unwrap();
108 |
109 | let mut imports = Vec::new();
110 | for file in f {
111 | if file.is_setuppy() {
112 | let readf = fs::read_to_string(file.path.clone());
113 | if let Ok(f) = readf {
114 | extractor::extract_imports_setup_py(f.as_str(), &mut imports);
115 | } else {
116 | eprintln!("There was a problem reading your setup.py")
117 | }
118 | }
119 | }
120 | // println!("{:?}", imports.clone());
121 | // cons.clear_last_lines(1).unwrap();
122 | // --- pass the dependencies to the scanner/api ---
123 | scanner::start(imports).await.unwrap();
124 | }
125 | async fn find_python_imports(f: &Vec<FoundFile>) {
126 | let cons = console::Term::stdout();
127 | cons.write_line("Using python file as source...").unwrap();
128 |
129 | let mut imports = Vec::new(); // contains the Dependencies
130 | for file in f {
131 | if file.is_python() {
132 | if let Ok(fhandle) = File::open(file.path.clone()) {
133 | let reader = BufReader::new(fhandle);
134 |
135 | for line in reader.lines().flatten() {
136 | extractor::extract_imports_python(line, &mut imports);
137 | }
138 | }
139 | }
140 | }
141 | // println!("{:?}", imports.clone());
142 | // cons.clear_last_lines(1).unwrap();
143 | // --- pass the dependencies to the scanner/api ---
144 | scanner::start(imports).await.unwrap(); // unwrapping is ok since the return value doesnt matter.
145 | }
146 |
147 | async fn find_reqs_imports(f: &Vec<FoundFile>) {
148 | let cons = console::Term::stdout();
149 | cons.write_line("Using requirements.txt...")
150 | .unwrap();
151 |
152 | let mut imports = Vec::new();
153 | for file in f {
154 | if file.is_reqs() || file.filetype == FileTypes::Constraints {
155 | if let Ok(fhandle) = File::open(file.path.clone()) {
156 | let reader = BufReader::new(fhandle);
157 |
158 | for line in reader.lines().flatten() {
159 | // pep-508 does not parse --hash embeds in requirements.txt
160 | // see (https://github.com/figsoda/pep-508/issues/2)
161 | extractor::extract_imports_reqs(line.trim().to_string(), &mut imports)
162 | }
163 | }
164 | }
165 | }
166 | // println!("{:?}", imports.clone());
167 |
168 | // --- pass the dependencies to the scanner/api ---
169 | scanner::start(imports).await.unwrap();
170 | }
171 |
172 | async fn find_pyproject_imports(f: &Vec<FoundFile>) {
173 | let cons = console::Term::stdout();
174 | cons.write_line("Using pyproject.toml as source...")
175 | .unwrap();
176 |
177 | let mut imports = Vec::new();
178 | for file in f {
179 | if file.is_pyproject() {
180 | let readf = fs::read_to_string(file.path.clone());
181 | if let Ok(f) = readf {
182 | let _ = extractor::extract_imports_pyproject(f, &mut imports);
183 | } else {
184 | eprintln!("There was a problem reading your pyproject.toml")
185 | }
186 | }
187 | }
188 | // println!("{:?}", imports.clone());
189 | // cons.clear_last_lines(1).unwrap();
190 | // --- pass the dependencies to the scanner/api ---
191 | scanner::start(imports).await.unwrap();
192 | }
193 |
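The branch order in `find_import` amounts to a fixed precedence list: requirements.txt, then constraints.txt, then pyproject.toml, then setup.py, then plain `.py` files. A small std-only sketch of the same decision expressed as data is below. `Source`, `Counts` and `pick_source` are hypothetical names used only for illustration and do not exist in pyscan.

```rust
/// Which kind of file the dependencies would be read from (hypothetical type).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Source {
    Requirements,
    Constraints,
    Pyproject,
    SetupPy,
    Python,
}

/// How many files of each kind were found in the scanned directory (hypothetical type).
#[derive(Default)]
struct Counts {
    reqs: u64,
    constraints: u64,
    pyproject: u64,
    setuppy: u64,
    py: u64,
}

/// Return the first source, in precedence order, that has at least one file.
fn pick_source(c: &Counts) -> Option<Source> {
    let ordered = [
        (c.reqs, Source::Requirements),
        (c.constraints, Source::Constraints),
        (c.pyproject, Source::Pyproject),
        (c.setuppy, Source::SetupPy),
        (c.py, Source::Python),
    ];
    ordered.iter().find(|(n, _)| *n > 0).map(|(_, s)| *s)
}
```

With `Counts { reqs: 1, pyproject: 1, ..Default::default() }` this picks `Source::Requirements`, and with every count at zero it returns `None`, which corresponds to the "could not find any..." error path above. Keeping the order in one array makes the precedence easy to read and to change.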
--------------------------------------------------------------------------------
/src/parser/structs.rs:
--------------------------------------------------------------------------------
1 | use console::style;
2 | use std::{ffi::OsString, process::exit};
3 |
4 | use crate::{scanner::models::Query, utils, ARGS};
5 |
6 | use super::scanner::models::Vulnerability;
7 |
8 | // struct Python;
9 | // struct Requirements;
10 | // struct Pyproject;
11 |
12 | #[derive(Debug, PartialEq, Eq, Clone)]
13 | pub enum FileTypes {
14 | Python,
15 | Requirements,
16 | Pyproject,
17 | Constraints,
18 | SetupPy,
19 | }
20 |
21 | #[derive(Debug, Clone)]
22 | pub struct FoundFile {
23 | pub name: OsString,
24 | pub filetype: FileTypes,
25 | pub path: OsString,
26 | }
27 |
28 | impl FoundFile {
29 | pub fn is_python(&self) -> bool {
30 | self.filetype == FileTypes::Python
31 | }
32 | pub fn is_reqs(&self) -> bool {
33 | self.filetype == FileTypes::Requirements
34 | }
35 | pub fn is_pyproject(&self) -> bool {
36 | self.filetype == FileTypes::Pyproject
37 | }
38 | pub fn is_setuppy(&self) -> bool {
39 | self.filetype == FileTypes::SetupPy
40 | }
41 | }
42 |
43 | #[derive(Debug, Clone)]
44 | pub struct FoundFileResult {
45 | /// provides overall info about the files found (useful for prioritising filetypes)
46 | pub files: Vec<FoundFile>,
47 | pub py_found: u64, // no. of said files found
48 | pub reqs_found: u64,
49 | pub pyproject_found: u64,
50 | pub constraints_found: u64,
51 | pub setuppy_found: u64
52 | }
53 |
54 | impl FoundFileResult {
55 | pub fn new() -> FoundFileResult {
56 | FoundFileResult {
57 | files: Vec::new(),
58 | py_found: 0,
59 | reqs_found: 0,
60 | pyproject_found: 0,
61 | constraints_found: 0,
62 | setuppy_found: 0,
63 | }
64 | }
65 | pub fn add(&mut self, f: FoundFile) {
66 | self.files.push(f)
67 | }
68 | pub fn python(&mut self) {
69 | self.py_found += 1
70 | }
71 | pub fn reqs(&mut self) {
72 | self.reqs_found += 1
73 | }
74 | pub fn pyproject(&mut self) {
75 | self.pyproject_found += 1
76 | }
77 | pub fn constraints(&mut self) {
78 | self.constraints_found += 1
79 | }
80 | pub fn setuppy(&mut self) {
81 | self.setuppy_found += 1
82 | }
83 | }
84 |
85 | #[derive(Debug, Clone)]
86 | pub struct Dependency {
87 | pub name: String,
88 | pub version: Option<String>,
89 | pub comparator: Option<pep_508::Comparator>,
90 | pub version_status: VersionStatus,
91 | }
92 |
93 | impl Dependency {
94 | pub fn to_query(&self) -> Query {
95 | Query::new(self.version.as_ref().unwrap().as_str(), self.name.as_str())
96 | }
97 | }
98 |
99 | #[derive(Debug, Clone)]
100 | pub struct VersionStatus {
101 | // pyscan may get version info from a lot of places. This keeps it in check.
102 | pub pypi: bool,
103 | pub pip: bool,
104 | pub source: bool,
105 | }
106 |
107 | /// Implementation for VersionStatus: retrieves versions while updating the status, and picks the source selected via arguments.
108 | impl VersionStatus {
109 | /// retrieves versions from pip and pypi.org in (pip, pypi) format.
110 | pub async fn _full_check(&mut self, name: &str) -> (String, String) {
111 | let pip = utils::get_python_package_version(name);
112 | let pip_v = if let Err(e) = pip {
113 | println!("An error occurred while retrieving version info from pip.\n{e}");
114 | exit(1)
115 | } else {
116 | pip.unwrap()
117 | };
118 |
119 | let pypi = utils::get_package_version_pypi(name).await;
120 | let pypi_v = if let Err(e) = pypi {
121 | println!("An error occurred while retrieving version info from pypi.org.\n{e}");
122 | exit(1)
123 | } else {
124 | *pypi.unwrap()
125 | };
126 |
127 | self.pip = true;
128 | self.pypi = true;
129 |
130 | (pip_v, pypi_v)
131 | }
132 |
133 | pub fn pip(name: &str) -> String {
134 | let pip = utils::get_python_package_version(name);
135 |
136 | if let Err(e) = pip {
137 | println!("An error occurred while retrieving version info from pip.\n{e}");
138 | exit(1)
139 | } else {
140 | pip.unwrap()
141 | }
142 | }
143 |
144 | pub async fn pypi(name: &str) -> String {
145 | let pypi = utils::get_package_version_pypi(name).await;
146 |
147 | if let Err(e) = pypi {
148 | println!("An error occurred while retrieving version info from pypi.org.\n{e}");
149 | exit(1)
150 | } else {
151 | *pypi.unwrap()
152 | }
153 | }
154 |
155 | /// returns the chosen version (from args or fallback)
156 | pub async fn choose(name: &str, dversion: &Option<String>) -> String {
157 | if ARGS.get().unwrap().pip {
158 | VersionStatus::pip(name)
159 | } else if ARGS.get().unwrap().pypi {
160 | VersionStatus::pypi(name).await
161 | } else {
162 | // fallback begins here once made sure no arguments are provided
163 | let d_version = if let Some(provided) = dversion {
164 | Some(provided.to_string())
165 | } else if let Ok(v) = utils::get_python_package_version(name) {
166 | println!("{} : {}",style(name).yellow().dim(), style("A version could not be detected in the source file, so retrieving version from pip instead.").dim());
167 | Some(v)
168 | } else if let Ok(v) = utils::get_package_version_pypi(name).await {
169 | println!("{} : {}",style(name).red().dim(), style("A version could not be detected through source or pip, so retrieving latest version from pypi.org instead.").dim());
170 | Some(v.to_string())
171 | } else {
172 | eprintln!("A version could not be retrieved for {}. This should not happen as pyscan defaults to pip or pypi.org, unless:\n1) Pip is not installed\n2) You don't have an internet connection\n3) You did not anticipate the consequences of not specifying a version for your dependency in the configuration files.\nReach out on github.com/aswinnnn/pyscan/issues if the above cases did not take place.", style(name).bright().red());
173 | exit(1);
174 | };
175 | d_version.unwrap()
176 | }
177 | }
178 | }
179 |
180 | #[derive(Debug, Clone)]
181 | pub struct ScannedDependency {
182 | pub name: String,
183 | pub version: String,
184 | pub vuln: Vulnerability,
185 | }
186 |
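When no `--pip` or `--pypi` argument is given, `VersionStatus::choose` falls back through three places: the version written in the source file, then the local pip installation, then the latest release on pypi.org. A synchronous, std-only sketch of that chain is below. `choose_version` is a hypothetical name, and the two closures stand in for the real pip and PyPI lookups, which are assumptions here and are not pyscan's API.

```rust
/// Pick a version by trying each source in order and stopping at the first hit.
/// `from_pip` and `from_pypi` are placeholder lookups supplied by the caller.
fn choose_version(
    from_source: Option<&str>,
    from_pip: impl FnOnce() -> Option<String>,
    from_pypi: impl FnOnce() -> Option<String>,
) -> Option<String> {
    from_source
        .map(str::to_string)
        // Only consulted when the source file gave no version.
        .or_else(from_pip)
        // Only consulted when pip also had nothing.
        .or_else(from_pypi)
}
```

For example, `choose_version(None, || None, || Some("3.0.0".to_string()))` returns `Some("3.0.0")`. Because `or_else` is lazy, the later lookups never run once an earlier source has produced a version.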
--------------------------------------------------------------------------------
/src/scanner/api.rs:
--------------------------------------------------------------------------------
1 | use crate::{display, ARGS};
2 | /// provides the functions needed to connect to various advisory sources.
3 | use crate::{parser::structs::Dependency, scanner::models::Vulnerability};
4 | use crate::{
5 | parser::structs::{ScannedDependency, VersionStatus},
6 | scanner::models::Vuln,
7 | };
8 | use reqwest::{self, Client, Method};
9 | use futures::future;
10 | use std::{fs, env};
11 | use std::process::exit;
12 | use super::{
13 | super::utils,
14 | models::{Query, QueryBatched, QueryResponse},
15 | };
16 |
17 | /// OSV provides a distributed database for vulns, with a free API
18 | #[derive(Debug)]
19 | pub struct Osv {
20 | /// check if the host is online
21 | pub online: bool,
22 | /// time of last query
23 | pub last_queried: String,
24 | /// the Client which handles the API.
25 | client: Client,
26 | }
27 |
28 | impl Osv {
29 | pub async fn new() -> Result<Osv, ()> {
30 | let version = utils::get_version();
31 | let pyscan_version = format!("pyscan {}", version);
32 | let client = reqwest::Client::builder()
33 | .user_agent(pyscan_version)
34 | .build();
35 |
36 | if let Ok(client) = client {
37 | let res = client.get("https://osv.dev").send().await;
38 |
39 | if let Ok(_success) = res {
40 | Ok(Osv {
41 | online: true,
42 | last_queried: { utils::get_time() },
43 | client,
44 | })
45 | } else {
46 | eprintln!(
47 | "Could not connect to the OSV website. Check your internet or try again."
48 | ); exit(1)
49 | }
50 | } else {
51 | eprintln!(
52 | "Could not build the network client to connect to OSV. Report this at github.com/aswinnnn/pyscan/issues"
53 | ); exit(1)
54 | }
55 | }
56 |
57 | pub async fn _query(&self, d: Dependency) -> Option<Vulnerability> {
58 | // returns None if no vulns found
59 | // else Some(Vulnerability)
60 |
61 | let version = if d.version.is_some() {
62 | d.version
63 | } else {
64 | let res = utils::get_package_version_pypi(d.name.as_str()).await;
65 | if let Err(e) = res {
66 | eprintln!("PypiError:\n{}", e);
67 | exit(1);
68 | } else if let Ok(res) = res {
69 | Some(res.to_string())
70 | } else {
71 | eprintln!("A very unexpected error occurred while retrieving version info from Pypi. Please report this on https://github.com/aswinnnn/pyscan/issues");
72 | exit(1);
73 | }
74 | };
75 | // println!("{:?}", self.get_latest_package_version(d.name.clone()));
76 |
77 |
78 | // println!("{:?}", res);
79 |
80 | self._get_json(d.name.as_str(), &version.unwrap()).await
81 | }
82 |
83 | pub async fn query_batched(&self, mut deps: Vec<Dependency>) -> Vec<ScannedDependency> {
84 | // runs the batch API. Each dep is converted into JSON format here, POSTed, and the response of vuln IDs -> queried into Vec<Vuln> -> returned as Vec<ScannedDependency>
85 | // The dep version conflicts are also solved over here.
86 | let _ = future::join_all(deps
87 | .iter_mut()
88 | .map(|d| async {
89 | d.version = if d.version.is_none() {
90 | Some(VersionStatus::choose(d.name.as_str(), &d.version).await)
91 | } else {
92 | d.version.clone()
93 | }
94 | })).await;
95 |
96 | // .collect::<Vec<_>>();
97 | let mut progress = display::Progress::new();
98 |
99 | let mut imports_info = utils::vecdep_to_hashmap(&deps);
100 |
101 | let url = "https://api.osv.dev/v1/querybatch";
102 |
103 | let queries: Vec<Query> = deps.iter().map(|d| d.to_query()).collect();
104 | let batched = QueryBatched::new(queries);
105 |
106 | let body = serde_json::to_string(&batched).unwrap();
107 |
108 | let res = self.client.request(Method::POST, url).body(body).send().await;
109 | if let Ok(response) = res {
110 | if response.status().is_client_error() {
111 | eprintln!("Failed connecting to OSV. [Client error]");
112 | exit(1)
113 | } else if response.status().is_server_error() {
114 | eprintln!("Failed connecting to OSV. [Server error]");
115 | exit(1)
116 | }
117 |
118 | let restext = response.text().await.unwrap();
119 |
120 | let parsed: Result<QueryResponse, serde_json::Error> = serde_json::from_str(&restext);
121 | let mut scanneddeps: Vec<ScannedDependency> = Vec::new();
122 | if ARGS.get().unwrap().output.is_some() {
123 | // txt or json extension inference, custom output filename
124 | let filename = ARGS.get().unwrap().output.as_ref().unwrap();
125 | if filename.ends_with(".json") {
126 | if let Ok(dir) = env::current_dir() {
127 | let r = fs::write(dir.join(filename), restext);
128 | if let Err(er) = r {
129 | eprintln!("Could not write output to file: {}", er.to_string());
130 | exit(1)
131 | }
132 | else {
133 | exit(0)
134 | }
135 | }
136 | }
137 | }
138 | if let Ok(p) = parsed {
139 | for vres in p.results {
140 | if let Some(vulns) = vres.vulns {
141 |
142 |
143 | let mut vecvulns: Vec<Vuln> = Vec::new();
144 | for qv in vulns.iter() {
145 | vecvulns.push(self.vuln_id(qv.id.as_str()).await) // retrieves vuln info from API with a vuln ID
146 | }
147 |
148 | // has to be turned into a Vulnerability before becoming a scanned dependency
149 | let structvuln = Vulnerability {vulns: vecvulns};
150 | progress.count_one(); progress.display(); // increment progress
151 | scanneddeps.push(structvuln.to_scanned_dependency(&imports_info));
152 |
153 | }
154 | else {continue;}
155 | }
156 | if progress.count > 0 {progress.end()} // clear progress line
157 |
158 | // --- passing to display module starts here ---
159 | display::display_queried(&scanneddeps, &mut imports_info);
160 | scanneddeps
161 | } else {
162 | eprintln!("Invalid parse of API response at src/scanner/api.rs::query_batched\nThis is usually due to an unforeseen API response or a malformed source file.");
163 | exit(1);
164 | }
165 | } else {
166 | eprintln!("Could not fetch a response from osv.dev [scanner/api/query_batched]");
167 | exit(1);
168 | }
169 | }
170 |
171 | /// get a Vuln from a vuln ID from OSV
172 | pub async fn vuln_id(&self, id: &str) -> Vuln {
173 | let url = format!("https://api.osv.dev/v1/vulns/{id}");
174 |
175 | let res = self.client.request(Method::GET, url).send().await;
176 |
177 | // println!("{:?}", res);
178 |
179 | if let Ok(response) = res {
180 | if response.status().is_client_error() {
181 | eprintln!("Failed connecting to OSV. [Client error]")
182 | } else if response.status().is_server_error() {
183 | eprintln!("Failed connecting to OSV. [Server error]")
184 | }
185 | let restext = response.text().await.unwrap();
186 | // println!("{:#?}", restext.clone());
187 | let parsed: Result<Vuln, serde_json::Error> = serde_json::from_str(&restext);
188 | if let Ok(p) = parsed {
189 | p
190 | } else if let Err(e) = parsed {
191 | eprintln!("Invalid parse of API response at src/scanner/api.rs::vuln_id\n{}", e);
192 | exit(1);
193 | }
194 | else {
195 | eprintln!("Invalid parse of API response at src/scanner/api.rs(vuln_id)");
196 | exit(1);
197 | }
198 | } else {
199 | eprintln!("Could not fetch a response from osv.dev [scanner/api/vulns_id]");
200 | exit(1);
201 | }
202 | }
203 |
204 | pub async fn _get_json(&self, name: &str, version: &str) -> Option<Vulnerability> {
205 | let url = r"https://api.osv.dev/v1/query";
206 |
207 | let body = Query::new(version, name); // struct implementation of query sent to OSV API.
208 | let body = serde_json::to_string(&body).unwrap();
209 |
210 | // println!("{}", body.clone());
211 |
212 | let res = self.client.request(Method::POST, url).body(body).send().await;
213 |
214 | // println!("{:?}", res);
215 |
216 | if let Ok(response) = res {
217 | if response.status().is_client_error() {
218 | eprintln!("Failed connecting to OSV. [Client error]")
219 | } else if response.status().is_server_error() {
220 | eprintln!("Failed connecting to OSV. [Server error]")
221 | }
222 | let restext = response.text().await.unwrap();
223 | if restext.len() >= 3 {
224 | // check if vulns exist by char len of json
225 | // api returns '{}' if none found so this is easy
226 |
227 | let parsed: Result<Vulnerability, serde_json::Error> =
228 | serde_json::from_str(&restext);
229 | // println!("{:?}", parsed);
230 | if let Ok(v) = parsed {
231 | Some(v)
232 | } else {
233 | None
234 | }
235 | } else {
236 | None
237 | }
238 | } else {
239 | eprintln!("Could not fetch a response from osv.dev");
240 | exit(1);
241 | }
242 | }
243 | }
244 |
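`query_batched` decides whether to dump the raw OSV response to disk by looking at the extension of the `--output` filename. A std-only sketch of that inference that cannot panic on short filenames is below. `wants_json` is a hypothetical helper name, and the case-insensitive comparison is an assumption of this sketch rather than pyscan's behaviour.

```rust
use std::path::Path;

/// Return true when the output filename has a `json` extension.
/// Hypothetical helper for illustration; not part of pyscan.
fn wants_json(filename: &str) -> bool {
    Path::new(filename)
        .extension()
        // `extension()` yields the part after the last dot, without the dot itself.
        .map_or(false, |ext| ext.eq_ignore_ascii_case("json"))
}
```

Here `wants_json("report.json")` is true while `wants_json("a")` is simply false. Going through `Path::extension` avoids manual index arithmetic such as `filename.len() - 5`, which underflows for names shorter than five bytes.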
--------------------------------------------------------------------------------
/src/scanner/mod.rs:
--------------------------------------------------------------------------------
1 | pub mod api;
2 | pub mod models;
3 | use std::process::exit;
4 | use super::parser::structs::Dependency;
5 | use console::{Term, style};
6 |
7 |
8 | pub async fn start(imports: Vec<Dependency>) -> Result<(), std::io::Error> {
9 | let osv = api::Osv::new().await.unwrap(); // err handling done inside, unwrapping is safe
10 | let cons = Term::stdout();
11 | let s = format!("Found {} dependencies", style(format!("{}", imports.len()))
12 | .bold()
13 | .green());
14 |
15 | cons.write_line(&s)?;
16 |
17 | // collected contains the dependencies with found vulns. imports_info contains a name, version hashmap of all found dependencies so we can display for all imports if vulns have been found or not
18 | let collected = osv.query_batched(imports).await;
19 | // query_batched passes stuff onto display module after
20 |
21 | // if we collected vulns:
22 | if !collected.is_empty() {
23 | exit(1)
24 | }
25 | else {
26 | Ok(()) // if collected is zero means no vulns found, no need for a non-zero exit code.
27 | }
28 | }
29 |
30 |
31 |
--------------------------------------------------------------------------------
/src/scanner/models.rs:
--------------------------------------------------------------------------------
1 | // automatically generated. do not change.
2 |
3 | use std::collections::HashMap;
4 |
5 | use serde::{Serialize, Deserialize};
6 |
7 | use crate::parser::structs::ScannedDependency;
8 |
9 |
10 |
11 | #[derive(Debug, Clone, Serialize, Deserialize)]
12 | pub struct Vulnerability {
13 | #[serde(rename = "vulns")]
14 | pub vulns: Vec<Vuln>,
15 | }
16 |
17 | #[derive(Debug, Clone, Serialize, Deserialize)]
18 | pub struct Vuln {
19 |     #[serde(rename = "id")]
20 |     pub id: String,
21 |
22 |     // #[serde(rename = "summary")]
23 |     // pub summary: Option<String>,
24 |
25 |     #[serde(rename = "details")]
26 |     pub details: String,
27 |
28 |     // #[serde(rename = "aliases")]
29 |     // pub aliases: Vec<String>,
30 |
31 |     // #[serde(rename = "modified")]
32 |     // pub modified: String,
33 |
34 |     // #[serde(rename = "published")]
35 |     // pub published: String,
36 |
37 |     // #[serde(rename = "database_specific")]
38 |     // pub database_specific: Option<VulnDatabaseSpecific>,
39 |
40 |     // #[serde(rename = "references")]
41 |     // pub references: Vec<Reference>,
42 |
43 |     #[serde(rename = "affected")]
44 |     pub affected: Vec<Affected>,
45 |
46 |     // #[serde(rename = "schema_version")]
47 |     // pub schema_version: String,
48 |
49 |     // #[serde(rename = "severity")]
50 |     // pub severity: Option<Vec<Severity>>,
51 | }
52 |
53 | #[derive(Debug, Clone, Serialize, Deserialize)]
54 | pub struct Affected {
55 |     #[serde(rename = "package")]
56 |     pub package: Package,
57 |
58 |     // #[serde(rename = "ranges")]
59 |     // pub ranges: Vec<Range>,
60 |
61 |     #[serde(rename = "versions")]
62 |     pub versions: Option<Vec<String>>,
63 |
64 |     // #[serde(rename = "database_specific")]
65 |     // pub database_specific: AffectedDatabaseSpecific,
66 |
67 |     // #[serde(rename = "ecosystem_specific")]
68 |     // pub ecosystem_specific: Option<EcosystemSpecific>,
69 | }
70 |
71 | #[derive(Debug, Clone, Serialize, Deserialize)]
72 | pub struct AffectedDatabaseSpecific {
73 |     #[serde(rename = "source")]
74 |     pub source: String,
75 | }
76 |
77 | #[derive(Debug, Clone, Serialize, Deserialize)]
78 | pub struct EcosystemSpecific {
79 |     #[serde(rename = "affected_functions")]
80 |     pub affected_functions: Vec<String>,
81 | }
82 |
83 | #[derive(Debug, Clone, Serialize, Deserialize)]
84 | pub struct Package {
85 |     #[serde(rename = "name")]
86 |     pub name: String,
87 |
88 |     #[serde(rename = "ecosystem")]
89 |     pub ecosystem: String,
90 |
91 |     #[serde(rename = "purl")]
92 |     pub purl: String,
93 | }
94 |
95 | #[derive(Debug, Clone, Serialize, Deserialize)]
96 | pub struct Range {
97 |     #[serde(rename = "type")]
98 |     pub range_type: String,
99 |
100 |     #[serde(rename = "events")]
101 |     pub events: Vec<Event>,
102 |
103 |     #[serde(rename = "repo")]
104 |     pub repo: Option<String>,
105 | }
106 |
107 | #[derive(Debug, Clone, Serialize, Deserialize)]
108 | pub struct Event {
109 |     #[serde(rename = "introduced")]
110 |     pub introduced: Option<String>,
111 |
112 |     #[serde(rename = "fixed")]
113 |     pub fixed: Option<String>,
114 | }
115 |
116 | #[derive(Debug, Clone, Serialize, Deserialize)]
117 | pub struct VulnDatabaseSpecific {
118 |     #[serde(rename = "cwe_ids")]
119 |     pub cwe_ids: Vec<String>,
120 |
121 |     #[serde(rename = "github_reviewed")]
122 |     pub github_reviewed: bool,
123 |
124 |     #[serde(rename = "severity")]
125 |     pub severity: String,
126 |
127 |     #[serde(rename = "github_reviewed_at")]
128 |     pub github_reviewed_at: String,
129 |
130 |     #[serde(rename = "nvd_published_at")]
131 |     pub nvd_published_at: Option<String>,
132 | }
133 |
134 | #[derive(Debug, Clone, Serialize, Deserialize)]
135 | pub struct Reference {
136 |     #[serde(rename = "type")]
137 |     pub reference_type: String,
138 |
139 |     #[serde(rename = "url")]
140 |     pub url: String,
141 | }
142 |
143 | #[derive(Debug, Clone, Serialize, Deserialize)]
144 | pub struct Severity {
145 |     #[serde(rename = "type")]
146 |     pub severity_type: String,
147 |
148 |     #[serde(rename = "score")]
149 |     pub score: String,
150 | }
151 |
152 | // --- pypi.org/pypi/<package>/json JSON response ---
153 |
154 | #[derive(Debug, Clone, Serialize, Deserialize)]
155 | pub struct PypiResponse {
156 | // #[serde(rename = "info")]
157 | // pub info: Info,
158 |
159 | // #[serde(rename = "last_serial")]
160 | // pub last_serial: i64,
161 |
162 | #[serde(rename = "releases")]
163 | pub releases: HashMap>>,
164 |
165 | // #[serde(rename = "urls")]
166 | // pub urls: Vec,
167 |
168 | // #[serde(rename = "vulnerabilities")]
169 | // pub vulnerabilities: Vec