├── go.mod ├── package.json ├── README.md └── scanner.go /go.mod: -------------------------------------------------------------------------------- 1 | module malware-scanner 2 | 3 | go 1.19 -------------------------------------------------------------------------------- /package.json: -------------------------------------------------------------------------------- 1 | { 2 | "name": "test-app", 3 | "version": "1.0.0", 4 | "dependencies": { 5 | "koa2-swagger-ui": "*", 6 | "express": "4.18.0", 7 | "lodash": "*", 8 | "@crowdstrike/tailwind-toucan-base":"5.0.2", 9 | "ngx-trend":"8.0.1", 10 | "@ctrl/ngx-csv":"6.0.2" 11 | }, 12 | "devDependencies": { 13 | "eslint": "8.0.0" 14 | } 15 | } -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # NPM Malware Scanner 2 | 3 | A powerful command-line tool that scans GitHub repositories, local files, and folders for malicious and suspicious npm packages in `package.json` files. The scanner uses a comprehensive malware database to identify potentially harmful npm dependencies across your projects. 
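As a rough illustration of the matching idea described above, the following Go sketch checks the dependencies of a `package.json` against a tiny in-memory list of known-bad versions. The `knownBad` map, the `classify` function, and the sample package names are hypothetical and far simpler than the scanner's real database and logic. In this sketch an exact pinned version that is listed counts as malware, and any other reference to a listed package counts as suspicious.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// knownBad is a hypothetical stand-in for the malware database:
// package name -> set of versions known to be malicious.
var knownBad = map[string]map[string]bool{
	"example-bad-pkg": {"1.0.0": true},
}

// classify returns "malware" when the requested version is listed,
// "suspicious" when the package is listed but the requested version
// is a different version or a range, and "clean" otherwise.
func classify(name, version string) string {
	versions, listed := knownBad[name]
	if !listed {
		return "clean"
	}
	if versions[version] {
		return "malware"
	}
	return "suspicious"
}

func main() {
	// A minimal manifest used only for this illustration.
	manifest := []byte(`{"dependencies": {"example-bad-pkg": "1.0.0", "express": "4.18.0"}}`)

	var pkg struct {
		Dependencies map[string]string `json:"dependencies"`
	}
	if err := json.Unmarshal(manifest, &pkg); err != nil {
		panic(err)
	}

	// Sort names so the report order is stable between runs.
	names := make([]string, 0, len(pkg.Dependencies))
	for name := range pkg.Dependencies {
		names = append(names, name)
	}
	sort.Strings(names)

	for _, name := range names {
		version := pkg.Dependencies[name]
		fmt.Printf("%s@%s: %s\n", name, version, classify(name, version))
	}
}
```

Treating a listed package at an unlisted version as suspicious rather than clean is a deliberately cautious choice for this sketch, since a version range or a later update can still resolve to a compromised release.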
4 | 5 | ## ✨ Features 6 | 7 | - **Multi-threaded Scanning**: Configurable thread count for optimal performance (default: 10) 8 | - **GitHub API Integration**: Seamless repository scanning with automatic retry logic 9 | - **Local File & Folder Support**: Scan individual package.json files or entire directories recursively 10 | - **NPM Malware Detection**: Identifies malicious, suspicious, and clean npm packages 11 | - **Beautiful Output**: Color-coded results with detailed formatting 12 | - **Advanced Rate Limit Handling**: Automatic retries on GitHub API rate limits, with increasing wait times between attempts 13 | - **Authentication Support**: Multiple GitHub token environment variables supported 14 | - **Concurrent Processing**: Efficient scanning of multiple `package.json` files 15 | - **Exit Codes**: Proper exit codes for CI/CD integration 16 | - **Built-in Database**: No need to manually specify malware database path 17 | 18 | ## 🚀 Installation 19 | 20 | ### Prerequisites 21 | - Go 1.16 or higher 22 | - GitHub Personal Access Token (recommended) 23 | 24 | ### Build from Source 25 | ```bash 26 | git clone https://github.com/ValkyriSecurity/npm-malware-scanner 27 | cd npm-malware-scanner 28 | ``` 29 | 30 | ## 🔧 Configuration 31 | 32 | ### GitHub Token Setup 33 | To avoid rate limiting and 403 errors, set up a GitHub Personal Access Token: 34 | 35 | 1. Go to [GitHub Settings → Developer settings → Personal access tokens](https://github.com/settings/tokens) 36 | 2. Generate a new token with appropriate permissions: 37 | - `public_repo` scope for public repositories 38 | - `repo` scope for private repositories 39 | 3. Set one of the supported environment variables (`GH_TOKEN`, `GITHUB_TOKEN`, or `GITHUB_PAT`), for example: 40 | 41 | **Linux/WSL:** 42 | ```bash 43 | export GH_TOKEN="your_github_token_here" 44 | ``` 45 | 46 | **PowerShell:** 47 | ```powershell 48 | $env:GH_TOKEN="your_github_token_here" 49 | ``` 50 | 51 | **Note**: When fetching files, the scanner checks `GH_TOKEN`, `GITHUB_TOKEN`, and `GITHUB_PAT` in that order and uses the first one that is set. The initial repository tree request reads only `GH_TOKEN`, so prefer that variable. 
Without a token, you may encounter rate limiting after ~60 requests per hour. 52 | 53 | ## 📖 Usage 54 | 55 | ### GitHub Repository Scanning 56 | ```bash 57 | # Basic repository scan 58 | go run scanner.go --repo https://github.com/owner/repo 59 | 60 | # With custom thread count 61 | go run scanner.go --repo https://github.com/owner/repo --threads=20 62 | ``` 63 | 64 | ### Local File Scanning 65 | ```bash 66 | # Scan a single package.json file 67 | go run scanner.go --file ./package.json 68 | 69 | # Scan with custom threads 70 | go run scanner.go --file ./my-project/package.json --threads=5 71 | ``` 72 | 73 | ### Local Folder Scanning 74 | ```bash 75 | # Recursively scan all package.json files in a directory 76 | go run scanner.go --folder ./my-project 77 | go run scanner.go -F ./my-project 78 | 79 | # Scan current directory 80 | go run scanner.go -F . 81 | ``` 82 | 83 | ### Command Line Options 84 | - `--repo`: GitHub repository URL to scan 85 | - `--file`: Path to a specific package.json file 86 | - `--folder` or `-F`: Directory path to scan recursively 87 | - `--threads=N`: Number of concurrent threads (default: 10) 88 | 89 | ### Examples 90 | ```bash 91 | # Scan a popular repository 92 | go run scanner.go --repo https://github.com/facebook/react 93 | 94 | # Scan local project with high concurrency 95 | go run scanner.go --folder ./my-app --threads=20 96 | 97 | # Quick single file check 98 | go run scanner.go --file package.json 99 | ``` 100 | 101 | ## 📊 Output Format 102 | 103 | The scanner provides beautifully formatted, color-coded output with detailed progress information: 104 | 105 | ### GitHub Repository Scanning 106 | ``` 107 | Scanning GitHub repository: https://github.com/owner/repo 108 | Using 10 threads 109 | GitHub token found - authenticated requests enabled 110 | 111 | 🛡️ NPM Malware Scanner 112 | =========================================== 113 | Repository: owner/repo 114 | 115 | [1] package.json 116 | ⚠ SUSPICIOUS: lodash@4.17.20 (Known 
vulnerabilities in older versions) 117 | ✗ MALWARE: malicious-package@1.0.0 118 | → 5 clean, 1 suspicious, 1 malware 119 | 120 | [2] frontend/package.json 121 | → 12 clean, 0 suspicious, 0 malware 122 | 123 | NPM Malware Scan Summary 124 | ============================== 125 | Repository: owner/repo 126 | Package.json files: 2 127 | Total dependencies scanned: 19 128 | 129 | ✓ Clean packages: 17 130 | ⚠ Suspicious packages: 1 131 | ✗ Malware packages: 1 132 | 133 | CRITICAL: Malware detected in repository! 134 | Scan completed in 2.34s 135 | ``` 136 | 137 | ### Local Folder Scanning 138 | ``` 139 | 🛡️ NPM Malware Scanner - Local Folder Scan 140 | =========================================== 141 | Scanning folder: ./my-project 142 | 143 | [1] ./package.json 144 | → 15 clean, 0 suspicious, 0 malware 145 | 146 | [2] ./frontend/package.json 147 | → 8 clean, 1 suspicious, 0 malware 148 | 149 | Local Folder Scan Summary 150 | ============================== 151 | Folder: ./my-project 152 | Package.json files: 2 153 | Total dependencies scanned: 24 154 | 155 | ✓ Clean packages: 23 156 | ⚠ Suspicious packages: 1 157 | ✗ Malware packages: 0 158 | 159 | SUSPICIOUS: Review flagged packages! 
160 | ``` 161 | 162 | ## 🎯 Exit Codes 163 | 164 | - `0`: All packages are clean 165 | - `1`: Suspicious packages detected 166 | - `2`: Malware packages detected 167 | 168 | ## 📋 Malware Database Format 169 | 170 | The malware database should be a JSON file with the following structure, where `reason` is either `MALWARE` or `SUSPICIOUS` (the only two values the scanner matches on): 171 | 172 | ```json 173 | [ 174 | { 175 | "package_name": "malicious-package", 176 | "version": "1.0.0", 177 | "reason": "MALWARE" 178 | }, 179 | { 180 | "package_name": "suspicious-lib", 181 | "version": "*", 182 | "reason": "SUSPICIOUS" 183 | } 184 | ] 185 | ``` 186 | 187 | ## 🔍 Detection Categories 188 | 189 | ### Malware Packages 190 | - Packages whose exact pinned version is listed as `MALWARE` in the database 191 | - Immediate security threat 192 | - Triggers exit code 2 193 | 194 | ### Suspicious Packages 195 | - Packages listed as `SUSPICIOUS` in the database 196 | - Packages with a known-malicious version that are pinned to a different version or requested through a version range 197 | - Triggers exit code 1 198 | 199 | ### Clean Packages 200 | - Packages not found in malware database 201 | - No known security issues 202 | - Contributes to exit code 0 203 | 204 | ## 🛠️ Technical Details 205 | 206 | ### Architecture 207 | - **Language**: Go 208 | - **Concurrency**: Goroutines with semaphore-based thread limiting 209 | - **API**: GitHub REST API v3 210 | - **Authentication**: Multiple token environment variables (GH_TOKEN, GITHUB_TOKEN, GITHUB_PAT) 211 | - **Rate Limiting**: Retry logic with wait times that increase on each attempt 212 | - **Error Handling**: Comprehensive 403 error detection and user guidance 213 | - **Timeout Management**: 30-second request timeouts with retry attempts 214 | 215 | ### Performance 216 | - Configurable thread count (default: 10, range: 1-100 recommended) 217 | - Up to 3 attempts per file request, with a linearly increasing wait between attempts 218 | - Concurrent file processing for local and remote scanning 219 | - Efficient memory usage with streaming JSON parsing 220 | - Real-time progress tracking and status updates 221 | - Built-in 
malware database (no external file required) 222 | 223 | ### Scanning Modes 224 | - **GitHub Repository**: Remote scanning with API authentication 225 | - **Local File**: Single package.json file analysis 226 | - **Local Folder**: Recursive directory scanning with file discovery 227 | 228 | ## 🔧 Troubleshooting 229 | 230 | ### Common Issues 231 | 232 | **403 Forbidden Errors** 233 | - Set a GitHub Personal Access Token using one of the supported environment variables 234 | - Ensure your token has the correct permissions (`public_repo` or `repo`) 235 | - Check if your token has expired 236 | 237 | **Rate Limiting** 238 | - The scanner automatically handles rate limits with retry logic 239 | - Authenticated requests have much higher rate limits (5000/hour vs 60/hour) 240 | - Wait times grow linearly between attempts: 1s then 2s for general failures, 5s then 10s after a 403 response 241 | 242 | **Repository Not Found** 243 | - Verify the repository URL is correct and accessible 244 | - For private repositories, ensure your token has `repo` scope 245 | - Check your internet connection 246 | 247 | **No package.json Files Found** 248 | - Ensure the repository or folder contains package.json files 249 | - Check that the files are named exactly `package.json` (case-sensitive) 250 | - For local scanning, verify the folder path exists 251 | 252 | ### Getting Help 253 | If you encounter issues: 254 | 1. Check the troubleshooting section above 255 | 2. Verify your GitHub token setup 256 | 3. Try running with `--threads=1` to isolate concurrency issues 257 | 4. Check the scanner output for specific error messages 258 | 259 | ## 🤝 Contributing 260 | 261 | 1. Fork the repository 262 | 2. Create a feature branch 263 | 3. Make your changes 264 | 4. Add tests if applicable 265 | 5. Submit a pull request 266 | 267 | ## 📝 License 268 | 269 | This project is licensed under the MIT License - see the LICENSE file for details. 
270 | 271 | ## 🚨 Security 272 | 273 | If you want a comprehensive audit for your supply chain, integrations or dependencies, contact [Valkyri](https://t.me/ValkyriSecurity) for security audits! 274 | 275 | 276 | ## 🙏 Acknowledgments 277 | 278 | - GitHub API for repository access 279 | - Go community for excellent tooling 280 | - [Aikido](https://malware-list.aikido.dev/malware_predictions.json) for vulnerability databases 281 | - [Stepsecurity](https://www.stepsecurity.io/blog/ctrl-tinycolor-and-40-npm-packages-compromised) 282 | - If I missed anyone, highly appreciate your work. Thanks 🙏 283 | 284 | --- 285 | 286 | **⚠️ Disclaimer**: This tool is for security research and defensive purposes only. Always verify results and follow responsible disclosure practices. 287 | -------------------------------------------------------------------------------- /scanner.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "encoding/base64" 5 | "encoding/json" 6 | "flag" 7 | "fmt" 8 | "io/ioutil" 9 | "log" 10 | "net/http" 11 | "os" 12 | "path/filepath" 13 | "sort" 14 | "strings" 15 | "sync" 16 | "time" 17 | ) 18 | 19 | // MalwareEntry represents a malware package entry 20 | type MalwareEntry struct { 21 | PackageName string `json:"package_name"` 22 | Version string `json:"version"` 23 | Reason string `json:"reason"` 24 | } 25 | 26 | // PackageJSON represents a package.json structure 27 | type PackageJSON struct { 28 | Dependencies map[string]string `json:"dependencies"` 29 | DevDependencies map[string]string `json:"devDependencies"` 30 | PeerDependencies map[string]string `json:"peerDependencies"` 31 | OptionalDependencies map[string]string `json:"optionalDependencies"` 32 | } 33 | 34 | // GitHubTreeResponse represents GitHub API tree response 35 | type GitHubTreeResponse struct { 36 | Tree []struct { 37 | Path string `json:"path"` 38 | Type string `json:"type"` 39 | } `json:"tree"` 40 | } 41 | 42 | // 
GitHubFileResponse represents GitHub API file response 43 | type GitHubFileResponse struct { 44 | Content string `json:"content"` 45 | } 46 | 47 | // ScanResult represents scan results for a file 48 | type ScanResult struct { 49 | FilePath string 50 | Clean int 51 | Suspicious int 52 | Malware int 53 | Messages []string 54 | } 55 | 56 | // Scanner holds the malware database and lookup maps 57 | type Scanner struct { 58 | malwareDB []MalwareEntry 59 | malwarePackages map[string]bool 60 | malwareVersions map[string]string // package@version -> vulnerable_version 61 | suspiciousPackages map[string]bool 62 | mu sync.RWMutex 63 | } 64 | 65 | // NewScanner creates a new scanner instance 66 | func NewScanner(dbPath string) (*Scanner, error) { 67 | data, err := ioutil.ReadFile(dbPath) 68 | if err != nil { 69 | return nil, fmt.Errorf("failed to read malware database: %v", err) 70 | } 71 | 72 | var entries []MalwareEntry 73 | if err := json.Unmarshal(data, &entries); err != nil { 74 | return nil, fmt.Errorf("failed to parse malware database: %v", err) 75 | } 76 | 77 | s := &Scanner{ 78 | malwareDB: entries, 79 | malwarePackages: make(map[string]bool), 80 | malwareVersions: make(map[string]string), 81 | suspiciousPackages: make(map[string]bool), 82 | } 83 | 84 | // Build optimized lookup maps 85 | for _, entry := range entries { 86 | if entry.Reason == "MALWARE" { 87 | s.malwarePackages[entry.PackageName] = true 88 | key := fmt.Sprintf("%s@%s", entry.PackageName, entry.Version) 89 | s.malwareVersions[key] = entry.Version 90 | } else if entry.Reason == "SUSPICIOUS" { 91 | s.suspiciousPackages[entry.PackageName] = true 92 | } 93 | } 94 | 95 | return s, nil 96 | } 97 | 98 | // GetVulnerableVersions returns all unique vulnerable versions for a package 99 | func (s *Scanner) GetVulnerableVersions(packageName string) []string { 100 | s.mu.RLock() 101 | defer s.mu.RUnlock() 102 | 103 | // Use map to deduplicate versions 104 | versionMap := make(map[string]bool) 105 | for _, entry := 
range s.malwareDB { 106 | if entry.PackageName == packageName && entry.Reason == "MALWARE" { 107 | versionMap[entry.Version] = true 108 | } 109 | } 110 | 111 | // Convert map keys to slice 112 | var versions []string 113 | for version := range versionMap { 114 | versions = append(versions, version) 115 | } 116 | 117 | // Sort for consistent output 118 | sort.Strings(versions) 119 | return versions 120 | } 121 | 122 | // CheckPackage checks if a package@version is malware, suspicious, or clean 123 | func (s *Scanner) CheckPackage(packageName, version string) (string, string) { 124 | s.mu.RLock() 125 | defer s.mu.RUnlock() 126 | 127 | // Helper function to check if version is a range/wildcard 128 | isVersionRange := func(v string) bool { 129 | if v == "" || v == "*" { 130 | return true 131 | } 132 | // Check for semantic version ranges 133 | return strings.ContainsAny(v, "^~><=") || strings.Contains(v, "latest") || strings.Contains(v, "x") || strings.Contains(v, "X") 134 | } 135 | 136 | // Check for exact malware match 137 | if s.malwarePackages[packageName] { 138 | if isVersionRange(version) { 139 | // Version range or wildcard - suspicious with vulnerable versions listed 140 | vulnVersions := s.GetVulnerableVersions(packageName) 141 | if len(vulnVersions) > 0 { 142 | return "suspicious", fmt.Sprintf("(vulnerable versions: %s)", strings.Join(vulnVersions, ", ")) 143 | } 144 | return "suspicious", "(version range allows vulnerable versions)" 145 | } 146 | 147 | // Check exact version match 148 | key := fmt.Sprintf("%s@%s", packageName, version) 149 | if vulnVersion, exists := s.malwareVersions[key]; exists { 150 | return "malware", vulnVersion 151 | } 152 | 153 | // Package exists in malware DB but different version (parenthesized like the other info strings) 154 | vulnVersions := s.GetVulnerableVersions(packageName) 155 | if len(vulnVersions) > 0 { 156 | return "suspicious", fmt.Sprintf("(vulnerable versions: %s)", strings.Join(vulnVersions, ", ")) 157 | } 158 | } 159 | 160 | // Check suspicious packages 161 | 
if s.suspiciousPackages[packageName] { 162 | if isVersionRange(version) { 163 | return "suspicious", "(version range)" 164 | } 165 | return "suspicious", "" 166 | } 167 | 168 | // Check if package exists in malware DB with any version (for wildcard detection) 169 | if isVersionRange(version) { 170 | // Check if this package has any malware entries 171 | for _, entry := range s.malwareDB { 172 | if entry.PackageName == packageName && entry.Reason == "MALWARE" { 173 | vulnVersions := s.GetVulnerableVersions(packageName) 174 | if len(vulnVersions) > 0 { 175 | return "suspicious", fmt.Sprintf("(vulnerable versions: %s)", strings.Join(vulnVersions, ", ")) 176 | } 177 | return "suspicious", "(version range allows vulnerable versions)" 178 | } 179 | } 180 | } 181 | 182 | return "clean", "" 183 | } 184 | 185 | // ScanGitHubRepo scans a GitHub repository for malware 186 | func (s *Scanner) ScanGitHubRepo(repoURL string, threads int) error { 187 | // Parse repository URL 188 | parts := strings.Split(strings.TrimPrefix(repoURL, "https://github.com/"), "/") 189 | if len(parts) != 2 { 190 | return fmt.Errorf("invalid GitHub repository URL") 191 | } 192 | owner, repo := parts[0], parts[1] 193 | 194 | fmt.Printf("\n\033[1;35m🔍 GitHub Repository Vulnerability Scanner\033[0m\n") 195 | fmt.Printf("\033[1;35m===========================================\033[0m\n") 196 | fmt.Printf("\033[1;36mRepository:\033[0m %s/%s\n\n", owner, repo) 197 | 198 | // Get repository tree 199 | treeURL := fmt.Sprintf("https://api.github.com/repos/%s/%s/git/trees/HEAD?recursive=1", owner, repo) 200 | 201 | // Create request with User-Agent header (required by GitHub API) 202 | req, err := http.NewRequest("GET", treeURL, nil) 203 | if err != nil { 204 | return fmt.Errorf("failed to create request: %v", err) 205 | } 206 | req.Header.Set("User-Agent", "Vulnerability-Scanner/1.0") 207 | 208 | // Add GitHub token if available 209 | if token := os.Getenv("GH_TOKEN"); token != "" { 210 | 
req.Header.Set("Authorization", "Bearer "+token) 211 | } 212 | 213 | client := &http.Client{Timeout: 30 * time.Second} 214 | resp, err := client.Do(req) 215 | if err != nil { 216 | return fmt.Errorf("failed to fetch repository tree: %v", err) 217 | } 218 | defer resp.Body.Close() 219 | 220 | // Check HTTP response status 221 | if resp.StatusCode != 200 { 222 | return fmt.Errorf("GitHub API returned status %d for tree request", resp.StatusCode) 223 | } 224 | 225 | var treeResp GitHubTreeResponse 226 | if err := json.NewDecoder(resp.Body).Decode(&treeResp); err != nil { 227 | return fmt.Errorf("failed to parse tree response: %v", err) 228 | } 229 | 230 | // Find package.json files (exact file name only, so paths like my-package.json are skipped) 231 | var packageFiles []string 232 | for _, item := range treeResp.Tree { 233 | if item.Type == "blob" && (item.Path == "package.json" || strings.HasSuffix(item.Path, "/package.json")) { 234 | packageFiles = append(packageFiles, item.Path) 235 | } 236 | } 237 | if len(packageFiles) == 0 { 238 | fmt.Println("No package.json files found in repository") 239 | return nil 240 | } 241 | 242 | // Process files concurrently with thread limit 243 | results := make(chan ScanResult, len(packageFiles)) 244 | var wg sync.WaitGroup 245 | semaphore := make(chan struct{}, threads) 246 | 247 | for i, filePath := range packageFiles { 248 | wg.Add(1) 249 | go func(index int, path string) { 250 | defer wg.Done() 251 | semaphore <- struct{}{} // Acquire semaphore 252 | defer func() { <-semaphore }() // Release semaphore 253 | result := s.scanPackageFile(owner, repo, path, index+1) 254 | results <- result 255 | }(i, filePath) 256 | } 257 | 258 | go func() { 259 | wg.Wait() 260 | close(results) 261 | }() 262 | 263 | // Collect and display results 264 | var allResults []ScanResult 265 | for result := range results { 266 | allResults = append(allResults, result) 267 | } 268 | 269 | // Sort results by file path for consistent output 270 | sort.Slice(allResults, func(i, j int) bool { 271 | return allResults[i].FilePath < allResults[j].FilePath 272 | }) 273 | 274 | // 
Display results with enhanced formatting 275 | totalClean, totalSuspicious, totalMalware := 0, 0, 0 276 | for i, result := range allResults { 277 | // File header with number and path 278 | fmt.Printf("\033[1;36m[%d]\033[0m %s\n", i+1, result.FilePath) 279 | 280 | // Show suspicious/malware packages with indentation 281 | for _, msg := range result.Messages { 282 | fmt.Printf(" %s\n", msg) 283 | } 284 | 285 | // Summary line with colors 286 | cleanColor := "\033[32m" // Green 287 | suspiciousColor := "\033[33m" // Yellow 288 | malwareColor := "\033[31m" // Red 289 | resetColor := "\033[0m" // Reset 290 | 291 | fmt.Printf(" → %s%d clean%s, %s%d suspicious%s, %s%d malware%s\n\n", 292 | cleanColor, result.Clean, resetColor, 293 | suspiciousColor, result.Suspicious, resetColor, 294 | malwareColor, result.Malware, resetColor) 295 | 296 | totalClean += result.Clean 297 | totalSuspicious += result.Suspicious 298 | totalMalware += result.Malware 299 | } 300 | 301 | // Display enhanced summary 302 | fmt.Printf("\033[1;37m\nGitHub Repository Scan Summary\033[0m\n") 303 | fmt.Printf("\033[1;37m==============================\033[0m\n") 304 | fmt.Printf("\033[1;36mRepository:\033[0m %s/%s\n", owner, repo) 305 | fmt.Printf("\033[1;36mPackage.json files:\033[0m %d\n", len(packageFiles)) 306 | fmt.Printf("\033[1;36mTotal dependencies scanned:\033[0m %d\n\n", totalClean+totalSuspicious+totalMalware) 307 | 308 | // Color-coded summary with icons 309 | fmt.Printf("\033[32m✓ Clean packages:\033[0m %d\n", totalClean) 310 | fmt.Printf("\033[33m⚠ Suspicious packages:\033[0m %d\n", totalSuspicious) 311 | fmt.Printf("\033[31m✗ Malware packages:\033[0m %d\n\n", totalMalware) 312 | 313 | if totalMalware > 0 { 314 | fmt.Printf("\033[1;31mCRITICAL: Malware detected in repository!\033[0m\n") 315 | os.Exit(2) 316 | } else if totalSuspicious > 0 { 317 | fmt.Printf("\033[1;33mWARNING: Suspicious packages detected!\033[0m\n") 318 | os.Exit(1) 319 | } else { 320 | fmt.Printf("\033[1;32mRepository 
appears clean!\033[0m\n") 321 | } 322 | 323 | return nil 324 | } 325 | 326 | // scanPackageFile scans a single package.json file 327 | func (s *Scanner) scanPackageFile(owner, repo, filePath string, fileNum int) ScanResult { 328 | result := ScanResult{ 329 | FilePath: filePath, 330 | Messages: []string{}, 331 | } 332 | 333 | // Get file content with retry logic 334 | fileURL := fmt.Sprintf("https://api.github.com/repos/%s/%s/contents/%s", owner, repo, filePath) 335 | 336 | var resp *http.Response 337 | var err error 338 | 339 | // Retry up to 3 times with exponential backoff 340 | for attempt := 1; attempt <= 3; attempt++ { 341 | // Create request with User-Agent header 342 | req, reqErr := http.NewRequest("GET", fileURL, nil) 343 | if reqErr != nil { 344 | result.Messages = append(result.Messages, fmt.Sprintf("✗ Failed to create request: %v", reqErr)) 345 | return result 346 | } 347 | req.Header.Set("User-Agent", "Vulnerability-Scanner/1.0") 348 | 349 | // Add GitHub token if available (try multiple env vars) 350 | token := os.Getenv("GH_TOKEN") 351 | if token == "" { 352 | token = os.Getenv("GITHUB_TOKEN") 353 | } 354 | if token == "" { 355 | token = os.Getenv("GITHUB_PAT") 356 | } 357 | if token != "" { 358 | req.Header.Set("Authorization", "Bearer "+token) 359 | } 360 | 361 | client := &http.Client{Timeout: 30 * time.Second} 362 | resp, err = client.Do(req) 363 | if err != nil { 364 | if attempt == 3 { 365 | result.Messages = append(result.Messages, fmt.Sprintf("✗ Failed to fetch after %d attempts: %v", attempt, err)) 366 | return result 367 | } 368 | time.Sleep(time.Duration(attempt) * time.Second) 369 | continue 370 | } 371 | 372 | // Handle rate limiting 373 | if resp.StatusCode == 403 { 374 | resp.Body.Close() 375 | if attempt == 3 { 376 | if token == "" { 377 | result.Messages = append(result.Messages, "✗ HTTP 403: Rate limited. 
Set GH_TOKEN, GITHUB_TOKEN, or GITHUB_PAT environment variable with your GitHub Personal Access Token") 378 | } else { 379 | result.Messages = append(result.Messages, "✗ HTTP 403: Rate limited or insufficient permissions. Check your GitHub token") 380 | } 381 | return result 382 | } 383 | // Wait longer for rate limit reset 384 | time.Sleep(time.Duration(attempt*5) * time.Second) 385 | continue 386 | } 387 | 388 | // Check other error status codes 389 | if resp.StatusCode != 200 { 390 | resp.Body.Close() 391 | if attempt == 3 { 392 | result.Messages = append(result.Messages, fmt.Sprintf("✗ HTTP %d: %s", resp.StatusCode, resp.Status)) 393 | return result 394 | } 395 | time.Sleep(time.Duration(attempt) * time.Second) 396 | continue 397 | } 398 | 399 | // Success, break out of retry loop 400 | break 401 | } 402 | 403 | defer resp.Body.Close() 404 | 405 | var fileResp GitHubFileResponse 406 | if err := json.NewDecoder(resp.Body).Decode(&fileResp); err != nil { 407 | result.Messages = append(result.Messages, fmt.Sprintf("✗ Failed to parse response: %v", err)) 408 | return result 409 | } 410 | 411 | // Check if content is empty 412 | if fileResp.Content == "" { 413 | result.Messages = append(result.Messages, "✗ Empty content received from GitHub API") 414 | return result 415 | } 416 | 417 | // Decode base64 content 418 | content, err := base64Decode(fileResp.Content) 419 | if err != nil { 420 | result.Messages = append(result.Messages, fmt.Sprintf("✗ Failed to decode content: %v", err)) 421 | return result 422 | } 423 | 424 | // Check if decoded content is empty 425 | if strings.TrimSpace(content) == "" { 426 | result.Messages = append(result.Messages, "✗ Empty content after base64 decode") 427 | return result 428 | } 429 | 430 | // Parse package.json 431 | var pkg PackageJSON 432 | if err := json.Unmarshal([]byte(content), &pkg); err != nil { 433 | result.Messages = append(result.Messages, fmt.Sprintf("✗ Invalid JSON: %v (content length: %d)", err, len(content))) 434 | 
return result 435 | } 436 | 437 | // Collect all dependencies 438 | allDeps := make(map[string]string) 439 | for name, version := range pkg.Dependencies { 440 | allDeps[name] = version 441 | } 442 | for name, version := range pkg.DevDependencies { 443 | allDeps[name] = version 444 | } 445 | for name, version := range pkg.PeerDependencies { 446 | allDeps[name] = version 447 | } 448 | for name, version := range pkg.OptionalDependencies { 449 | allDeps[name] = version 450 | } 451 | 452 | // Track found packages to avoid duplicates 453 | foundPackages := make(map[string]bool) 454 | 455 | // Scan dependencies 456 | for name, version := range allDeps { 457 | if foundPackages[name] { 458 | continue 459 | } 460 | foundPackages[name] = true 461 | 462 | status, info := s.CheckPackage(name, version) 463 | switch status { 464 | case "malware": 465 | result.Malware++ 466 | result.Messages = append(result.Messages, fmt.Sprintf("\033[31m✗ MALWARE:\033[0m %s@%s", name, version)) 467 | case "suspicious": 468 | result.Suspicious++ 469 | // Helper function to check if version is a range/wildcard (same as in CheckPackage) 470 | isVersionRange := func(v string) bool { 471 | if v == "" || v == "*" { 472 | return true 473 | } 474 | return strings.ContainsAny(v, "^~><=") || strings.Contains(v, "latest") || strings.Contains(v, "x") || strings.Contains(v, "X") 475 | } 476 | 477 | if isVersionRange(version) { 478 | displayVersion := version 479 | if version == "" { 480 | displayVersion = "*" 481 | } 482 | result.Messages = append(result.Messages, fmt.Sprintf("\033[33m⚠ SUSPICIOUS:\033[0m %s@%s %s", name, displayVersion, info)) 483 | } else { 484 | result.Messages = append(result.Messages, fmt.Sprintf("\033[33m⚠ SUSPICIOUS:\033[0m %s@%s %s", name, version, info)) 485 | } 486 | case "clean": 487 | result.Clean++ 488 | } 489 | } 490 | 491 | return result 492 | } 493 | 494 | // base64Decode decodes base64 content and removes whitespace 495 | func base64Decode(content string) (string, error) 
{ 496 | // Remove whitespace and newlines 497 | content = strings.ReplaceAll(content, "\n", "") 498 | content = strings.ReplaceAll(content, "\r", "") 499 | content = strings.ReplaceAll(content, " ", "") 500 | 501 | decoded, err := base64.StdEncoding.DecodeString(content) 502 | if err != nil { 503 | return "", err 504 | } 505 | return string(decoded), nil 506 | } 507 | 508 | // ScanLocalFolder scans all package.json files in a local folder recursively 509 | func (s *Scanner) ScanLocalFolder(folderPath string) error { 510 | var packageFiles []string 511 | 512 | // Walk through the directory to find all package.json files 513 | err := filepath.Walk(folderPath, func(path string, info os.FileInfo, err error) error { 514 | if err != nil { 515 | return err 516 | } 517 | if info.Name() == "package.json" { 518 | packageFiles = append(packageFiles, path) 519 | } 520 | return nil 521 | }) 522 | 523 | if err != nil { 524 | return fmt.Errorf("failed to walk directory %s: %v", folderPath, err) 525 | } 526 | 527 | if len(packageFiles) == 0 { 528 | fmt.Printf("No package.json files found in %s\n", folderPath) 529 | return nil 530 | } 531 | 532 | totalClean := 0 533 | totalSuspicious := 0 534 | totalMalware := 0 535 | totalDeps := 0 536 | 537 | // Scan each package.json file 538 | for i, filePath := range packageFiles { 539 | result, err := s.scanSingleFile(filePath, i+1) 540 | if err != nil { 541 | fmt.Printf("Error scanning %s: %v\n", filePath, err) 542 | continue 543 | } 544 | 545 | totalClean += result.Clean 546 | totalSuspicious += result.Suspicious 547 | totalMalware += result.Malware 548 | totalDeps += result.Clean + result.Suspicious + result.Malware 549 | } 550 | 551 | // Print overall summary 552 | fmt.Printf("\n\nFolder Scan Summary\n") 553 | fmt.Printf("==================\n") 554 | fmt.Printf("Folder: %s\n", folderPath) 555 | fmt.Printf("Package.json files: %d\n", len(packageFiles)) 556 | fmt.Printf("Total dependencies scanned: %d\n\n", totalDeps) 557 | fmt.Printf("✓ 
Clean packages: %d\n", totalClean) 558 | fmt.Printf("⚠ Suspicious packages: %d\n", totalSuspicious) 559 | fmt.Printf("✗ Malware packages: %d\n\n", totalMalware) 560 | 561 | if totalSuspicious > 0 || totalMalware > 0 { 562 | if totalMalware > 0 { 563 | fmt.Printf("CRITICAL: Malware packages detected!\n") 564 | os.Exit(2) // malware uses exit code 2, matching ScanGitHubRepo and the README 565 | } else { 566 | fmt.Printf("WARNING: Suspicious packages detected!\n") 567 | os.Exit(1) 568 | } 569 | } else { 570 | fmt.Printf("✓ No suspicious or malware packages found.\n") 571 | } 572 | 573 | return nil 574 | } 575 | 576 | // scanSingleFile scans a single package.json file and returns the result 577 | func (s *Scanner) scanSingleFile(filePath string, fileNum int) (ScanResult, error) { 578 | data, err := ioutil.ReadFile(filePath) 579 | if err != nil { 580 | return ScanResult{}, fmt.Errorf("failed to read file %s: %v", filePath, err) 581 | } 582 | 583 | var pkg PackageJSON 584 | if err := json.Unmarshal(data, &pkg); err != nil { 585 | return ScanResult{}, fmt.Errorf("failed to parse package.json: %v", err) 586 | } 587 | 588 | result := ScanResult{ 589 | FilePath: filePath, 590 | Messages: []string{}, 591 | } 592 | 593 | // Collect all dependencies 594 | allDeps := make(map[string]string) 595 | for name, version := range pkg.Dependencies { 596 | allDeps[name] = version 597 | } 598 | for name, version := range pkg.DevDependencies { 599 | allDeps[name] = version 600 | } 601 | for name, version := range pkg.PeerDependencies { 602 | allDeps[name] = version 603 | } 604 | for name, version := range pkg.OptionalDependencies { 605 | allDeps[name] = version 606 | } 607 | 608 | // Scan each dependency 609 | for name, version := range allDeps { 610 | status, info := s.CheckPackage(name, version) 611 | switch status { 612 | case "malware": 613 | result.Malware++ 614 | result.Messages = append(result.Messages, fmt.Sprintf("\033[31m✗ MALWARE:\033[0m %s@%s (%s)", name, version, info)) 615 | case "suspicious": 616 | result.Suspicious++ 617 | // Helper 
function to check if version is a range/wildcard (same as in CheckPackage) 618 | isVersionRange := func(v string) bool { 619 | if v == "" || v == "*" { 620 | return true 621 | } 622 | return strings.ContainsAny(v, "^~><=") || strings.Contains(v, "latest") || strings.Contains(v, "x") || strings.Contains(v, "X") 623 | } 624 | 625 | if isVersionRange(version) { 626 | displayVersion := version 627 | if version == "" { 628 | displayVersion = "*" 629 | } 630 | result.Messages = append(result.Messages, fmt.Sprintf("\033[33m⚠ SUSPICIOUS:\033[0m %s@%s %s", name, displayVersion, info)) 631 | } else { 632 | result.Messages = append(result.Messages, fmt.Sprintf("\033[33m⚠ SUSPICIOUS:\033[0m %s@%s %s", name, version, info)) 633 | } 634 | case "clean": 635 | result.Clean++ 636 | } 637 | } 638 | 639 | // Print results for this file 640 | fmt.Printf("\n[%d] %s\n", fileNum, filePath) 641 | for _, msg := range result.Messages { 642 | fmt.Printf(" %s\n", msg) 643 | } 644 | fmt.Printf(" → %d clean, %d suspicious, %d malware\n", result.Clean, result.Suspicious, result.Malware) 645 | 646 | return result, nil 647 | } 648 | 649 | // ScanLocalFile scans a local package.json file 650 | func (s *Scanner) ScanLocalFile(filePath string) error { 651 | data, err := ioutil.ReadFile(filePath) 652 | if err != nil { 653 | return fmt.Errorf("failed to read file %s: %v", filePath, err) 654 | } 655 | 656 | var pkg PackageJSON 657 | if err := json.Unmarshal(data, &pkg); err != nil { 658 | return fmt.Errorf("failed to parse package.json: %v", err) 659 | } 660 | 661 | result := ScanResult{ 662 | FilePath: filePath, 663 | Messages: []string{}, 664 | } 665 | 666 | // Collect all dependencies 667 | allDeps := make(map[string]string) 668 | for name, version := range pkg.Dependencies { 669 | allDeps[name] = version 670 | } 671 | for name, version := range pkg.DevDependencies { 672 | allDeps[name] = version 673 | } 674 | for name, version := range pkg.PeerDependencies { 675 | allDeps[name] = version 676 | } 677 
| for name, version := range pkg.OptionalDependencies { 678 | allDeps[name] = version 679 | } 680 | 681 | // Scan each dependency 682 | for name, version := range allDeps { 683 | status, info := s.CheckPackage(name, version) 684 | switch status { 685 | case "malware": 686 | result.Malware++ 687 | result.Messages = append(result.Messages, fmt.Sprintf("\033[31m✗ MALWARE:\033[0m %s@%s (%s)", name, version, info)) 688 | case "suspicious": 689 | result.Suspicious++ 690 | // Helper function to check if version is a range/wildcard (same as in CheckPackage) 691 | isVersionRange := func(v string) bool { 692 | if v == "" || v == "*" { 693 | return true 694 | } 695 | return strings.ContainsAny(v, "^~><=") || strings.Contains(v, "latest") || strings.Contains(v, "x") || strings.Contains(v, "X") 696 | } 697 | 698 | if isVersionRange(version) { 699 | displayVersion := version 700 | if version == "" { 701 | displayVersion = "*" 702 | } 703 | result.Messages = append(result.Messages, fmt.Sprintf("\033[33m⚠ SUSPICIOUS:\033[0m %s@%s %s", name, displayVersion, info)) 704 | } else { 705 | result.Messages = append(result.Messages, fmt.Sprintf("\033[33m⚠ SUSPICIOUS:\033[0m %s@%s %s", name, version, info)) 706 | } 707 | case "clean": 708 | result.Clean++ 709 | } 710 | } 711 | 712 | // Print results 713 | fmt.Printf("\n[1] %s\n", filePath) 714 | for _, msg := range result.Messages { 715 | fmt.Printf(" %s\n", msg) 716 | } 717 | fmt.Printf(" → %d clean, %d suspicious, %d malware\n\n", result.Clean, result.Suspicious, result.Malware) 718 | 719 | // Print summary 720 | total := result.Clean + result.Suspicious + result.Malware 721 | fmt.Printf("Local File Scan Summary\n") 722 | fmt.Printf("======================\n") 723 | fmt.Printf("File: %s\n", filePath) 724 | fmt.Printf("Total dependencies scanned: %d\n\n", total) 725 | fmt.Printf("✓ Clean packages: %d\n", result.Clean) 726 | fmt.Printf("⚠ Suspicious packages: %d\n", result.Suspicious) 727 | fmt.Printf("✗ Malware packages: %d\n\n", 
result.Malware) 728 | 729 | if result.Suspicious > 0 || result.Malware > 0 { 730 | if result.Malware > 0 { 731 | fmt.Printf("CRITICAL: Malware packages detected!\n") 732 | os.Exit(1) 733 | } else { 734 | fmt.Printf("WARNING: Suspicious packages detected!\n") 735 | os.Exit(1) 736 | } 737 | } else { 738 | fmt.Printf("✓ No suspicious or malware packages found.\n") 739 | } 740 | 741 | return nil 742 | } 743 | 744 | func main() { 745 | // Define command line flags 746 | threads := flag.Int("threads", 10, "Number of concurrent threads for scanning") 747 | file := flag.String("file", "", "Local package.json file to scan") 748 | folder := flag.String("folder", "", "Path to local folder to scan recursively for package.json files") 749 | folderShort := flag.String("F", "", "Path to local folder to scan recursively for package.json files (short form)") 750 | repo := flag.String("repo", "", "GitHub repository URL to scan") 751 | flag.Parse() 752 | 753 | // Default malware database path 754 | defaultDBPath := "malware_predictions.json" 755 | 756 | // Determine which folder flag was used 757 | folderPath := *folder 758 | if folderPath == "" { 759 | folderPath = *folderShort 760 | } 761 | 762 | if *file != "" { 763 | // Local file scanning mode 764 | scanner, err := NewScanner(defaultDBPath) 765 | if err != nil { 766 | log.Fatalf("Failed to initialize scanner: %v", err) 767 | } 768 | 769 | start := time.Now() 770 | if err := scanner.ScanLocalFile(*file); err != nil { 771 | log.Fatalf("Scan failed: %v", err) 772 | } 773 | fmt.Printf("\nScan completed in %v\n", time.Since(start)) 774 | } else if folderPath != "" { 775 | // Local folder scanning mode 776 | scanner, err := NewScanner(defaultDBPath) 777 | if err != nil { 778 | log.Fatalf("Failed to initialize scanner: %v", err) 779 | } 780 | 781 | start := time.Now() 782 | if err := scanner.ScanLocalFolder(folderPath); err != nil { 783 | log.Fatalf("Scan failed: %v", err) 784 | } 785 | fmt.Printf("\nScan completed in %v\n", 
time.Since(start)) 786 | } else if *repo != "" { 787 | // GitHub repository scanning mode with -repo flag 788 | scanner, err := NewScanner(defaultDBPath) 789 | if err != nil { 790 | log.Fatalf("Failed to initialize scanner: %v", err) 791 | } 792 | 793 | start := time.Now() 794 | if err := scanner.ScanGitHubRepo(*repo, *threads); err != nil { 795 | log.Fatalf("Scan failed: %v", err) 796 | } 797 | fmt.Printf("\nScan completed in %v\n", time.Since(start)) 798 | } else { 799 | // Show usage information 800 | fmt.Println("Usage:") 801 | fmt.Println(" go run scanner.go --repo # Scan GitHub repository") 802 | fmt.Println(" go run scanner.go --file # Scan local file") 803 | fmt.Println(" go run scanner.go --folder # Scan local folder") 804 | fmt.Println(" go run scanner.go -F # Scan local folder (short)") 805 | fmt.Println("") 806 | fmt.Println("Options:") 807 | fmt.Println(" --threads=N Number of concurrent threads (default: 10)") 808 | fmt.Println("") 809 | fmt.Println("Examples:") 810 | fmt.Println(" go run scanner.go --repo https://github.com/input-output-hk/lace") 811 | fmt.Println(" go run scanner.go --threads=20 --repo https://github.com/input-output-hk/lace") 812 | fmt.Println(" go run scanner.go --file test_wildcard.json") 813 | fmt.Println(" go run scanner.go --folder ./my-project") 814 | os.Exit(1) 815 | } 816 | } --------------------------------------------------------------------------------