├── .gitignore ├── LICENSE ├── README.md ├── SECURITY.md ├── ccs.py └── workflows └── main.yml /.gitignore: -------------------------------------------------------------------------------- 1 | 2 | .DS_Store 3 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 
34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. 
Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 
122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. 
In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "[]" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. 
We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright [yyyy] [name of copyright owner] 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Code Credential Scanner 2 | 3 | This script is intended to scan a large, diverse codebase for hard-coded credentials, or credentials present in 4 | configuration files. These represent a serious security issue, and can be extremely hard to detect and manage. 5 | 6 | The specific focus of this script is to create a tool that can be used directly by dev teams in a CI/CD pipeline, to 7 | manage the remediation process for this issue by alerting the team when credentials are present in the code, so that 8 | the team can immediately fix issues as they arise. 9 | 10 | It is possible to apply the tool as a point-in-time scanner for this issue, but - since credentials are likely to 11 | work their way back into the codebase over time - we strongly advise integration of the script into the CI/CD 12 | process, automated build mechanism or whatever other regularly scheduled automated scanning process the team carries 13 | out. 
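As a rough illustration of the CI/CD integration recommended above, the sketch below wraps the scanner in a small gate script. It assumes the scanner is run as `ccs.py` from the code root, that findings arrive on stdout one per line in the form `file:line:TYPE:Rule:id:text`, and that any finding should fail the build. The helper names (`parse_findings`, `gate`, `run_scan`) are hypothetical and are not part of the script.

```python
import re
import subprocess
import sys

# Assumed shape of one finding line:
#   <file>:<line>:<USER|PASSWORD>:Rule:<id>:<text>
FINDING = re.compile(r'^(.*?):(\d+):(USER|PASSWORD):Rule:(\d+):(.*)$')


def parse_findings(stdout_text):
    """Turn scanner stdout into (file, line, type, rule_id) tuples."""
    findings = []
    for raw in stdout_text.splitlines():
        match = FINDING.match(raw)
        if match:
            fname, line_num, kind, rule_id, _text = match.groups()
            findings.append((fname, int(line_num), kind, int(rule_id)))
    return findings


def gate(findings):
    """Return a process exit code: 1 if anything was found, otherwise 0."""
    return 1 if findings else 0


def run_scan(code_root, scanner="ccs.py"):
    """Run the scanner from the code root and return its parsed findings.

    The scanner path is an assumption; adjust it to wherever the script lives.
    """
    proc = subprocess.run(
        [sys.executable, scanner],
        cwd=code_root,
        capture_output=True,
        text=True,
    )
    return parse_findings(proc.stdout)
```

In a pipeline step, something like `sys.exit(gate(run_scan(".")))` would then fail the job whenever the scanner reports a finding.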
14 | 15 | The script is written with the following aims in mind: 16 | 17 | - Be language agnostic, regular-expression based, and require no parsing, so that it works on any codebase 18 | - Reduce false positives wherever possible, even at the (inevitable) cost of false negatives 19 | - Provide multiple, straightforward methods for suppressing issues, compatible with other SAST tools 20 | - Be concise, simple and performant 21 | 22 | # Suppression comments 23 | 24 | The script attempts to provide some compatibility with other popular SAST tools. 25 | 26 | The text '# noqa file' at or near the start of a file will suppress reporting of any further issues in that file, as will 27 | the text 'flake8: noqa'. 28 | 29 | The text '# noqa' on an individual line will suppress reporting of issues on that line. 30 | Many other common suppression comments will also work; the current list is: 31 | 32 | ``` 33 | # noinspection 34 | # noqa 35 | #noqa 36 | @SuppressWarnings 37 | DevSkim 38 | NOLINT 39 | NOSONAR 40 | checkmarx 41 | coverity 42 | fortify 43 | noinspection 44 | nosec 45 | safesql 46 | veracode 47 | ``` 48 | 49 | We also recommend the use of the comment '# noqa cred', to make it clear to team members that the presence of a 50 | credential is specifically the reason for the false positive. Many of the tools referenced here (e.g. devskim) 51 | make use of specific error codes, relating to tooling relevant to the language or platform in use, that serve the 52 | same purpose. It's possible for the same line of code to have multiple errors of different types. 53 | 54 | We caution that it is extremely bad practice to suppress an alert from a SAST tool that is a true positive. It is 55 | good practice to periodically review the SAST/lint suppression comments in a codebase to ensure that no 'true 56 | positives' have been suppressed. 57 | 58 | The '-nosuppress' command line flag causes the script to ignore all suppression comments. 
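The suppression behaviour described above can be sketched as a plain substring check. This is a minimal illustration rather than the script's complete logic: it assumes matching is a case-sensitive substring test against the markers listed above, and the names `is_suppressed` and `stops_file_scan` are hypothetical.

```python
# Line-level markers, copied from the list in this section.
LINE_MARKERS = [
    '# noinspection', '# noqa', '#noqa', '@SuppressWarnings', 'DevSkim',
    'NOLINT', 'NOSONAR', 'checkmarx', 'coverity', 'fortify',
    'noinspection', 'nosec', 'safesql', 'veracode',
]

# File-level markers: once seen, no further issues are reported for the file.
FILE_MARKERS = ['flake8: noqa', '# noqa file']


def is_suppressed(line, nosuppress=False):
    """True if the line carries a suppression marker and suppression is enabled."""
    if nosuppress:
        return False  # mirrors the '-nosuppress' flag: ignore all markers
    return any(marker in line for marker in LINE_MARKERS)


def stops_file_scan(line, nosuppress=False):
    """True if the line carries a file-level suppression marker."""
    if nosuppress:
        return False
    return any(marker in line for marker in FILE_MARKERS)


# The recommended '# noqa cred' comment works because it contains '# noqa'.
print(is_suppressed('db_password = "hunter22"  # noqa cred'))
print(is_suppressed('db_password = "hunter22"'))
```

Because the check is a substring match, a marker anywhere on the line suppresses it, including inside a string literal, which is one more reason to review suppressions periodically.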
59 | -------------------------------------------------------------------------------- /SECURITY.md: -------------------------------------------------------------------------------- 1 | # Security Policy 2 | 3 | ## Reporting a Vulnerability 4 | 5 | Please report security vulnerabilities to security@nccgroup.com 6 | -------------------------------------------------------------------------------- /ccs.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # Copyright (c) 2022 Chris Anley All Rights Reserved 3 | import os 4 | import re 5 | import signal 6 | import sys 7 | 8 | # Code Credential Scanner 9 | wrote_result = False 10 | 11 | SKIP_EXTS = [ 12 | re.compile(r'\.DS_Store$'), 13 | re.compile(r'\.css$'), 14 | re.compile(r'\.deps\.json$'), 15 | re.compile(r'\.dll$'), 16 | re.compile(r'\.eot$'), 17 | re.compile(r'\.exe$'), 18 | re.compile(r'\.gif$'), 19 | re.compile(r'\.ico$'), 20 | re.compile(r'\.jar$'), 21 | re.compile(r'\.jpg$'), 22 | re.compile(r'\.min\.js$'), 23 | re.compile(r'\.mov$'), 24 | re.compile(r'\.mp4$'), 25 | re.compile(r'\.png$'), 26 | re.compile(r'\.svg$'), 27 | re.compile(r'\.tif$'), 28 | re.compile(r'\.tiff$'), 29 | re.compile(r'\.ttf$'), 30 | re.compile(r'\.woff$'), 31 | re.compile(r'\.zip$'), 32 | re.compile(r'salt\.7$'), 33 | ] 34 | 35 | SKIP_DIRS = [ 36 | re.compile('/External/'), 37 | re.compile('/Samples/'), 38 | re.compile('/NuGet/'), 39 | # re.compile('/Setup/'), 40 | re.compile('/i18n/'), 41 | re.compile('/li8n/'), 42 | re.compile('/node_modules/'), 43 | re.compile('/packages/'), 44 | re.compile('(?i)/test/'), 45 | re.compile('/third_party/'), 46 | re.compile('/vendor/'), 47 | re.compile(r'/\.svn/'), 48 | re.compile(r'/\.git/'), 49 | re.compile('example'), 50 | ] 51 | 52 | SHORT_BAD_PASSWORDS = [ # All taken from Daniel Miessler's bad password lists 53 | # at https://github.com/danielmiessler/SecLists/tree/master/Passwords 54 | # Short strings are very likely to be 
non-passwords, but we allow these specific strings 55 | # since they are known-bad, common passwords 56 | '111111', 57 | '123', 58 | '123123', 59 | '1234', 60 | '12345', 61 | '123456', 62 | '123654', 63 | '159753', 64 | '1q2w3e', 65 | 'a12345', 66 | 'abc123', 67 | 'admin', 68 | 'asd123', 69 | 'asdf', 70 | 'azerty', 71 | 'bogus', 72 | 'dev', 73 | 'devop', 74 | 'devops', 75 | 'docker', 76 | 'dragon', 77 | 'love', 78 | 'mesh', 79 | 'monkey', 80 | 'mysql', 81 | 'pass', 82 | 'prod', 83 | 'qazwsx', 84 | 'qwerty', 85 | 'root', 86 | 'secret', 87 | 'shadow', 88 | 'swarm', 89 | 'stage', 90 | 'tinkle', 91 | 'test', 92 | 'toor', 93 | 'xxxx', 94 | ] 95 | 96 | PWD = r'''[^;<$\n\s'"]''' 97 | NON_PWD = r'''[;<$\n\s'"]''' 98 | GUID_LOWER = r'''[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}''' 99 | CLIENT_SECRET = r'''[a-zA-Z0-9_~\-\%/\+\=]{22,300}''' 100 | EMAIL_ADDR = r'''[.\-_a-zA-Z0-9]{1,80}\@(?:[a-z0-9][a-z0-9-]{1,80}\.){1,}[a-z]{1,10}''' 101 | 102 | # Regexes we use to extract likely passwords 103 | 104 | pwd_rules = [ 105 | # PASSWORD rules are reported first, in case we only report one result for the line 106 | (re.compile(r'(.*\W)(xox[abpr]-' + PWD + '{20,})(' + NON_PWD + '[^\n]*)'), 2, 'PASSWORD', None), # Slack access token 107 | (re.compile(r'(.*)(\$\da?\$\w{1,99}\$' + PWD + r'*)(' + NON_PWD + r'[^\n]*)'), 3, 'PASSWORD', None), # password hash 108 | (re.compile(r'(.*://[^:\n]+:)([^@:\n/]+)(@[^\n]*)'), 4, 'PASSWORD', None), # xyz://user:pass@ 109 | (re.compile(r'("\w+@(?:\w+\.)+\w+:)([^"/]+)(")'), 5, 'PASSWORD', None), # "x@y.com:pass" 110 | (re.compile(r'(?i)(.*]*>)(' + PWD + '+)(' + NON_PWD + '.*)'), 6, 'PASSWORD', None), 111 | (re.compile(r'(?i)(.*ApiKey\s*[=:]\s*")([^"]*)("[^\n]*)'), 7, 'PASSWORD', None), 112 | (re.compile(r'(?i)(.*ApiKey"[^"]+")([^"]*)("[^\n]*)'), 8, 'PASSWORD', None), 113 | (re.compile(r'(?i)(.*]*>)(' + PWD + '*)(' + NON_PWD + '[^\n]*)'), 9, 'PASSWORD', None), 114 | (re.compile(r'(?i)(.*AccountKey\s*[=:])(' + PWD + '*)(' + NON_PWD + 
'[^\n]*)'), 10, 'PASSWORD', None), 115 | (re.compile(r'(.*Authorization: (?:Basic|Bearer)\s+)(' + PWD + '*)(' + NON_PWD + '[^\n]*)'), 11, 'PASSWORD', None), 116 | (re.compile(r'(?i)(.*NetworkCredential\s*\(\s*"[^"]*"\s*,\s*")([^"]*)("[^\n]*)'), 12, 'PASSWORD', None), 117 | (re.compile(r'(?i)(.*_pass\s*[!=]?=\s*")([^"\n]+)("[^\n]*)'), 13, 'PASSWORD', None), # _pass != / == "foo" 118 | (re.compile(r'''(?i)(.*_passwd\s*[=:]\s*["'])([^"'\n]+)(["'][^\n]*)'''), 14, 'PASSWORD', None), 119 | (re.compile(r'(?i)(.*auth_token\s*[=:]\s*)(' + PWD + '*)(' + NON_PWD + '[^\n]*)'), 15, 'PASSWORD', None), 120 | (re.compile(r'(?i)(.*password\s*[=:]\s*)(' + PWD + '*)(' + NON_PWD + '[^\n]*)'), 16, 'PASSWORD', re.compile(r'(?i)\.ya?ml')), # xxxpassword : asdf 121 | (re.compile(r'''(?i)(.*password\s*[=:]\s*["'])([^"'\n]+)(["'][^\n]*)'''), 17, 'PASSWORD', None), # xxxpassword : 'asdf' 122 | (re.compile(r'''(?i)(.*password\s*[!=]=\s*['"])([^'"\n]+)(['"][^\n]*)'''), 18, 'PASSWORD', None), # password != / == "foo" 123 | (re.compile(r'''(?i)(.*"password\w*"[:=\s]+")([^"\n]+)("[^\n]*)'''), 19, 'PASSWORD', None), # "passwordxxx": "foo 124 | (re.compile(r'''(?i)(\$password\w*\s*=*\s')([^']+)('[^\n]*)'''), 20, 'PASSWORD', None), 125 | (re.compile(r'''(?i)(\$\w*password\s*=\s*')([^']+)('[^\n]*)'''), 21, 'PASSWORD', None), 126 | (re.compile(r'''(?i)("\w*ClientSecret":\s*")(''' + CLIENT_SECRET + r''')("[^\n]*)'''), 24, 'PASSWORD', None), 127 | (re.compile(r'''(?i)("\w*EncryptionKey":\s*")(''' + CLIENT_SECRET + r''')("[^\n]*)'''), 25, 'PASSWORD', None), 128 | (re.compile(r'''(?i)(.*(?:api|access|auth|client|secret)_key\s*:\s*)([^"\n]+)("[^\n]*)'''), 26, 'PASSWORD', None), # _key: foo 129 | (re.compile(r'''(?i)(.*(?:api|access|auth|client|secret)_key\s*[!=]?=\s*")([^"\n]+)("[^\n]*)'''), 27, 'PASSWORD', None), # _key = / != / == "foo" 130 | (re.compile(r'(?i)(.*(?:api|access|auth|client|secret)_key\s*[!=]?=\s*)(' + PWD + '{18,200})(' + NON_PWD + '*)'), 28, 'PASSWORD', None), # _key = / != / == foo 131 
| (re.compile(r'''(?i)(.*(?:api|access|auth|client|secret)_key"\s*:\s*")([^"\n]+)("[^\n]*)'''), 29, 'PASSWORD', None), # _key": "foo" 132 | (re.compile(r'''(?i)(.*key\s*=\s*.*GetBytes\(")([^"\n]+)("[^\n]*)'''), 30, 'PASSWORD', None), 133 | (re.compile(r'''(?i)(.*key\s*=\s*"\w*password\w*"\s+value\s*=\s*")([^"\n]{0,200})("[^\n]{0,200})'''), 31, 'PASSWORD', None), 134 | (re.compile(r'''(?i)(.*key\s*=\s*"\w+pwd"\s+value\s*=\s*")([^"\n]{0,200})("[^\n]{0,200})'''), 32, 'PASSWORD', None), 135 | (re.compile(r'''(?i)(.*key\s*=\s*"\w+secret"\s+value\s*=\s*")([^"\n]{0,200})("[^\n]{0,200})'''), 33, 'PASSWORD', None), 136 | (re.compile(r'''(?i)(.*pwd\s*[=:]\s*)([^;'"<$\n\s]*)[;'"<$\n\s]([^\n]*)'''), 34, 'PASSWORD', None), 137 | (re.compile(r'''(?i)(.*AzureStorageKey.*AccountKey\s*=\s*)([^;'"<$\n\s\\]*)([;'"<$\n\s\\][^\n]*)'''), 35, 'PASSWORD', None), 138 | (re.compile(r'''(?i)(.*secret\s*[=:]\s*)([^;'"<$\n\s]*)[;'"<$\n\s](.*)'''), 36, 'PASSWORD', None), 139 | (re.compile(r'''(curl.{0,200}\s-u\s*)([^\s]+)(\s.*)'''), 37, 'PASSWORD', None), 140 | (re.compile(r'''(mysql.{0,200}\s-p\s*)([^\s]+)(\s.*)'''), 38, 'PASSWORD', None), 141 | (re.compile(r'''("AUTH"[,\s]+")([^\n]{5,99})("[^\n]{0,200})'''), 39, 'PASSWORD', None), 142 | (re.compile(r'(?i)(\w*secret\s*=\s*")(' + CLIENT_SECRET + r')("[^\n]*)'), 40, 'PASSWORD', None), 143 | (re.compile(r'''(?i)(.*api_token\s*,\s*')(''' + PWD + r'''*)('\s*''' + NON_PWD + '[^\n]*)'), 41, 'PASSWORD', None), 144 | (re.compile(r'''(?i)(\w*API_KEY\s*=\s*")(''' + CLIENT_SECRET + r''')("[^\n]*)'''), 42, 'PASSWORD', None), 145 | (re.compile(r'''(?i)(\w*AUTH_KEY\s*=\s*")(''' + CLIENT_SECRET + r''')("[^\n]*)'''), 43, 'PASSWORD', None), 146 | (re.compile(r'''(?i)(\w*ACCESS_KEY\s*=\s*")(''' + CLIENT_SECRET + r''')("[^\n]*)'''), 44, 'PASSWORD', None), 147 | (re.compile(r'''(?i)(\w*TOKEN\s*=\s*['"])(''' + CLIENT_SECRET + r''')(['"][^\n]*)'''), 45, 'PASSWORD', None), 148 | (re.compile(r'''(?i)(\w*_PASS\s*=\s*")(''' + CLIENT_SECRET + r''')("[^\n]*)'''), 46, 
'PASSWORD', None), 149 | (re.compile(r'''(?i)("auth"\s*:\s*")(''' + CLIENT_SECRET + r''')("[^\n]*)'''), 47, 'PASSWORD', None), 150 | (re.compile(r'''(?i)(password\s+")(\w{5,200})("[^\n]*)'''), 48, 'PASSWORD', None), 151 | (re.compile(r'''(?i)("pass"\s*:\s*")(''' + PWD + r'''{5,100})("[^\n]*)'''), 49, 'PASSWORD', None), 152 | (re.compile(r'''(?i)("passphrase"\s*:\s*")(''' + PWD + r'''{5,100})("[^\n]*)'''), 50, 'PASSWORD', None), 153 | (re.compile(r'''(?i)(machine\s+[^\s]+\s+login\s+[^\s]+\s+password\s+)([^\s]+)(\s+[^\n]*)'''), 51, 'PASSWORD', None), 154 | (re.compile(r'''(?i)(_auth\s*=\s*)([^\s]{5,200})([^\n]*)'''), 52, 'PASSWORD', None), 155 | (re.compile(r'''(?i)(SECRET_KEY\s*=\s*)([^\s]{5,200})([^\n]*)'''), 53, 'PASSWORD', None), 156 | (re.compile(r'''(?i)(\.login\('[^'\n]+',\s*')([^\s\n']{5,200})([^\n]*)'''), 54, 'PASSWORD', None), 157 | (re.compile(r'''(?i)(secret_key_base:\s*)([^\s\n]{5,200})([^\n]*)'''), 55, 'PASSWORD', None), 158 | (re.compile(r'''(?i)(APP_KEY\s*=\s*)([^\s\n]{5,200})([^\n]*)'''), 56, 'PASSWORD', None), 159 | (re.compile(r'''(?i)(\w*_PASSWORD\s*=\s*)([^\s\n]{5,200})([^\n]*)'''), 57, 'PASSWORD', None), 160 | (re.compile(r'''(.*)(\$apr1\$\w{1,99}\$[^;<$\n\s'"]*)([^\n]*)'''), 58, 'PASSWORD', None), # password hash $apr$salt$... 
161 | (re.compile(r'''(?i)(\$\w*passwd\s*=\s*')([^\s\n']{5,200})([^\n]*)'''), 59, 'PASSWORD', None), 162 | (re.compile(r'''(?i)(\w*_PASSWORD',\s*')([^\s\n']{5,200})([^\n]*)'''), 60, 'PASSWORD', None), 163 | (re.compile(r'''(?i)(\w*_KEY',\s*')([^\s\n']{5,200})([^\n]*)'''), 61, 'PASSWORD', None), 164 | (re.compile(r'''(?i)("encryptedPassword":\s*")([^\s\n"]{5,200})([^\n]*)'''), 62, 'PASSWORD', None), 165 | (re.compile(r'''(?i)(api_key:\s*)([^\s\n"]{5,200})([^\n]*)'''), 63, 'PASSWORD', None), 166 | (re.compile(r'''(.*)(\$2y\$\d+\$[^\s\n"']{5,200})([^\n]*)'''), 64, 'PASSWORD', None), 167 | (re.compile(r'''(?i)(\w*Password"\s*:\s*")([^\s\n"]{5,200})([^\n]*)'''), 65, 'PASSWORD', None), 168 | (re.compile(r'''(?i)(\w*Passphrase"\s*:\s*")([^\s\n"]{5,200})([^\n]*)'''), 66, 'PASSWORD', None), 169 | (re.compile(r'''(.*)(\$\d\$[^$]{1,40}\$[^\s\n"']{5,200})([^\n]*)'''), 67, 'PASSWORD', None), 170 | (re.compile(r'''(?i)(.*)([^<\n]{5,200})([^\n]*)'''), 68, 'PASSWORD', None), 171 | (re.compile(r'''(?i)(.*]+>)([^<\n]{5,200})([^\n]*)'''), 69, 'PASSWORD', None), 172 | (re.compile(r'''(?i)(.*\w*API_KEY\s*=\s*['"]?)([^\n\s'"]{5,200})([^\n]*)'''), 81, 'PASSWORD', None), 173 | (re.compile(r'''(?i)(.*MLAB_PASS\s*=\s*)([^\n\s]{5,200})([^\n]*)'''), 82, 'PASSWORD', None), 174 | 175 | # USER rules below here 176 | (re.compile(r'''(?i)("\w*ClientId":\s*")(''' + GUID_LOWER + r''')("[^\n]*)'''), 22, 'USER', None), 177 | (re.compile(r'''(?i)("\w*TenantId":\s*")(''' + GUID_LOWER + r''')("[^\n]*)'''), 23, 'USER', None), 178 | (re.compile(r'''(?i)(.*ACCESS_KEY_ID\s*=\s*)([^\n\s]{18,200})([^\n]*)'''), 70, 'USER', None), 179 | (re.compile(r'''(?i)(.*S3_BUCKET\s*=\s*)([^\n\s]{5,200})([^\n]*)'''), 71, 'USER', None), 180 | (re.compile(r'''(?i)(.*RDS_HOST\s*=\s*)([^\n\s]{5,200})([^\n]*)'''), 72, 'USER', None), 181 | (re.compile(r'''(?i)(.*MLAB_URL\s*=\s*)([^\n\s]{5,200})([^\n]*)'''), 73, 'USER', None), 182 | (re.compile(r'''(?i)(.*MLAB_DB\s*=\s*)([^\n\s]{5,200})([^\n]*)'''), 74, 'USER', None), 183 | 
(re.compile(r'''(?i)(.*_USERNAME\s*=\s*["'])([^\n\s"']{5,200})([^\n]*)'''), 75, 'USER', None), 184 | (re.compile(r'''(?i)(.*_EMAIL\s*=\s*["'])([^\n\s"']{5,200})([^\n]*)'''), 76, 'USER', None), 185 | (re.compile(r'''(?i)(.*hostname\s+)([^\n\s"'.]+\.[^\n\s"'.]+\.[^\n\s"']+)([^\n]*)'''), 77, 'USER', None), 186 | (re.compile(r'''(?i)(.*username\s+)([^\n\s]{5,200})([^\n]*)'''), 78, 'USER', None), 187 | (re.compile(r'''(?i)(.*"host"\s*:\s*)([^\n\s]{5,200})([^\n]*)'''), 79, 'USER', None), 188 | (re.compile(r'''(?i)(.*"user"\s*:\s*)([^\n\s]{5,200})([^\n]*)'''), 80, 'USER', None), 189 | (re.compile(r'''(?i)(.*MAILCHIMP_LIST_ID\s*=\s*['"])([^\n\s'"]{5,200})([^\n]*)'''), 83, 'USER', None), 190 | (re.compile(r'''(?i)(.*"email"\s*=\s*['"])([^\n\s'"]{5,200})([^\n]*)'''), 84, 'USER', None), 191 | (re.compile(r'(.*)(AKIA[A-Z0-9]{16})([^A-Z0-9][^\n]*)'), 1, 'USER', None), # AWS Access Key 192 | (re.compile(r'''(?i)(.*\W)(''' + EMAIL_ADDR + r''')([^\n]*)'''), 85, 'USER', None), 193 | (re.compile(r'''(?i)(.*_USER\s*=\s*["'])([^\n\s"']{5,200})([^\n]*)'''), 86, 'USER', None), 194 | 195 | ] 196 | 197 | # Regexes we use to exclude likely false-positive passwords 198 | # Many of these are artefacts introduced by the typical context of the password detection regexes, 199 | # such as filenames, html/dom fragments, and other common string constants. 
200 | non_password_regexes = [ 201 | re.compile(r'''#[0-9a-f]{6}'''), # web colour code 202 | re.compile(r'''0x[0-9a-f]{2}'''), # hex 203 | re.compile(r'''(%n|%s|%y|%d|%m|%v)'''), # c format specifiers 204 | re.compile(r'''(\\n|\\t|\\r)'''), # escape codes 205 | re.compile(r'''(://)'''), # url 206 | re.compile(r''''''), # xml or html tag 207 | re.compile(r'''[,.]\s'''), # comma space or dot space 208 | re.compile(r'''[$@]\(\w+\)'''), # interpolation 209 | re.compile(r'''[)}\],(\[{]$'''), 210 | re.compile(r'''(\$\(|\$\w+->)'''), # php 211 | re.compile(r'''\$php'''), 212 | re.compile(r'''\):?$'''), # ends in ')' or '):' 213 | re.compile(r'''\*\.'''), 214 | re.compile(r'''\.(dll|exe|so|doc|pdf|hml|css|js|gif|png|jpg|jpeg|sh)'''), 215 | re.compile(r'''\d+\.\d+\.\d+'''), # version number 216 | re.compile(r'''\\[^\\]*\\'''), # windows path 217 | re.compile(r'''\\[ux][0-9a-f]{2}'''), # unicode or hex char 218 | re.compile(r'''(^\s|\s[^\s]*\s|\s\|\s|\s{4})'''), # spaces; probably text 219 | re.compile(r'''{#?[a-z0-9_ ]+#?}'''), # interpolation 220 | re.compile(r'''^[0\\][xX][0-9a-f]{4,}$'''), # hex constant 221 | re.compile(r'''^(A+|X+|Z+|a+|x+|z+)$'''), # AAAAAAA XXXXXXX etc 222 | re.compile(r'''^--'''), # command line option flag 223 | re.compile(r'''^@\w+$'''), 224 | re.compile(r'''^[0-9:./ ]$'''), # date/time or version number 225 | re.compile(r'''^[^a-z]+$'''), # entirely non alpha 226 | re.compile(r'''^\$\w+$'''), 227 | re.compile(r'''^([a-z0-9][a-z0-9-_]{1,80}\.){2,}[a-z_]{1,14}$'''), # looks like a fully qualified domain name 228 | re.compile(r'''^(}|\${|!join|false$|sha1-|sha256-|sha512-|split$|string\.|this\.|true$|user\.|xml|xsi)'''), # begins with 229 | re.compile(r'''_[^_]*_'''), 230 | re.compile(r'''^('.*'|".*"|\$.*;|\\.*"|any|\$auth;|alias|await|hash|keyfile|new|nil|none|null|null;|pwd|user:pass|>)$'''), # entire password is x 231 | re.compile(r'''(}|/|border|click|comments|focus|scroll|keydown|keyup|margin|pwd|resize|width|height|value)$'''), # ends with 
    re.compile(r'''(address|after|against|already|api.*key|associated|attribute|authentication|bearer|cannot)'''),
    re.compile(r'''(cdata|client|config|connect|contained|credentials|could|data\s*-|digest|either|element|enter\s|error|exists|extends|false|format|function)'''),
    re.compile(r'''(general|\.get|href|html|http|image|inactive|index|indicating|input|invalid|json|lambda|length|localhost|matches|md5|method|missing|mm:ss)'''),
    re.compile(r'''(option|passes|passphrase|\Wpassword\W|placeholder|plaintext|portion|property|provided|recovery|redacted|secret|settings|sha-1|should|source)'''),
    re.compile(r'''(string|text/|tlsv|token|true|type|uint|user_id|user_|username|utf-8|validation|value|var\.|video|wasn't|which|whose|xml|yyyy)'''),
]

# These regexes detect common 'placeholder' passwords used in test scripts that may not present a security risk (but then again, they may...)
non_password_regexes_strict = [
    re.compile(r'''(changeme|dummy|email|example|passwd|password|pswd|sample|secret)'''),
]


def get_passwords_from_line(fname, line):
    # Apply every applicable rule to the line and collect all candidate hits.
    results = []

    for pwd_regex, rule_id, result_type, files_include in pwd_rules:
        if not douser:
            if result_type == 'USER':
                continue
        if files_include:
            if not files_include.search(fname):
                continue  # this rule doesn't apply to this file
        remaining = line
        prefix = ''
        # Re-run the rule on whatever follows each hit, so one line can yield several results.
        while remaining:
            result = pwd_regex.search(remaining)

            if not result:
                break

            groups = result.groups()  # noqa
            g1 = groups[0] if len(groups) > 0 else ''
            g2 = groups[1] if len(groups) > 1 else ''
            g3 = groups[2] if len(groups) > 2 else ''
            results.append((rule_id, prefix + g1, g2, g3, result_type))
            remaining = g3
            prefix = g1 + g2
            if len(prefix) > 200: prefix = prefix[-200:]  # noqa
    return results


def is_not_a_password(pwd):
    # pwd is interpolation string, web rgb code, unicode char etc
    for npregex in non_password_regexes:
        if npregex.search(pwd.lower()):
            if vvv:
                eprint('Not a password: \'' + pwd + '\' because it matches regex \'' + str(npregex) + '\'')
            return True
    # unless -placeholder was given, also reject common placeholder values
    if not placeholder:
        for npregex in non_password_regexes_strict:
            if npregex.search(pwd.lower()):
                if vvv:
                    eprint('(Placeholder) Not a password: \'' + pwd + '\' because it matches regex \'' + str(npregex) + '\'')
                return True
    return False


def has_non_ascii(s):
    for ch in s:  # if pwd has non-ascii chars, ignore the password
        if ord(ch) > 127:
            return True
    return False


def is_file_suppression_comment(text):
    # flake8: noqa
    suppression_comment_indicators = [
        'flake8: noqa',
        '# noqa file',
    ]
    for indicator in suppression_comment_indicators:
        if indicator in text:
            return True
    return False


def is_line_suppression_comment(text):
    suppression_comment_indicators = [
        '# noinspection',
        '# noqa',
        '#noqa',
        '@SuppressWarnings',
        'DevSkim',
        'NOLINT',
        'NOSONAR',
        'checkmarx',
        'coverity',
        'fortify',
        'noinspection',
        'nosec',
        'safesql',
        'veracode',
    ]
    for indicator in suppression_comment_indicators:
        if indicator in text:
            return True
    return False


def check_line_password(fname, text):
    # Return a list of (rule_id, 'prefix:pwd:suffix', result_type) tuples for this line.
    try:
        outputs = []

        # if line has an 'ignore' comment, or is ridiculously long, ignore this line
        if not nosuppress:
            if is_line_suppression_comment(text): return outputs  # noqa
        if len(text) > 8192: return outputs  # noqa

        results = get_passwords_from_line(fname, text)

        for rule_id, prefix, pwd, suffix, result_type in results:
            # check conditions under which we stop scoring the candidate password
            # allow very short passwords only if they're very common and bad (taken from the top1000 password list)
            if len(pwd) < 5: continue  # noqa
            if len(pwd) > 200: continue  # noqa
            if 'publickeytoken' in prefix.lower(): continue  # noqa
            # if 'example' in prefix.lower() or 'example' in suffix.lower(): continue # noqa
            # if has_non_ascii(pwd): continue # noqa
            if is_not_a_password(pwd) and pwd not in SHORT_BAD_PASSWORDS:
                continue  # noqa # ignore known non-password patterns
            outputs = outputs + [(rule_id, prefix + ':' + pwd + ':' + suffix, result_type)]
            if not allow_duplicates:
                break
        return outputs

    except Exception as e:
        eprint("Exception: " + str(e))
        eprint("")
        return []  # keep the return type iterable for the caller


def skip_file(fname):
    global ns
    if ns:
        return False
    for skip in SKIP_DIRS:
        if skip.search(fname):
            return True
    for skip in SKIP_EXTS:
        if skip.search(fname):
            return True
    return False


def eprint(*args, **kwargs):
    print(*args, file=sys.stderr, **kwargs)


def write_result(result_msg):
    global wrote_result
    wrote_result = True
    print(result_msg.rstrip('\n'))


def do_line_cred_check(fname, line, line_num):
    try:
        results = check_line_password(fname, line)
        for rule_id, result, result_type in results:
            msg = f"{fname}:{line_num}:{result_type}:Rule:{str(rule_id)}:{result}"
            write_result(msg)  # noqa

    except Exception as e:
        eprint("Exception: " + str(e))
        eprint("")


def do_checks():
    if a:
        mode = 'rb'
    else:
        mode = 'r'

    for root, subdirs, files in os.walk(os.path.abspath(".")):
        for fn in files:
            fname = str(root) + "/" + str(fn)
            # exclude known-irrelevant files/paths unless -ns was given
            if skip_file(fname):
                continue
            if print_progress:
                eprint('Scanning ' + fname)

            try:
                with open(fname, mode) as f:
                    line_num = 1
                    for line in f:
                        # in binary mode, str() turns the bytes into their repr text ("b'...'")
                        if a: line = str(line)  # noqa
                        if not nosuppress:
                            if is_file_suppression_comment(line): break  # noqa
                        do_line_cred_check(fname, line, line_num)
                        line_num += 1
            except:  # noqa
                # unreadable or undecodable file: skip it
                continue
    if wrote_result:
        return 1
    else:
        return 0


def syntax():
    eprint(
        '''ccs.py : Code Credential Scanning Tool [ by Chris Anley ]
Syntax:

Run from code root directory. Output is to stdout, errors and 'verbose' messages are to stderr.

The default is to return fewer false-positives; use '-everything' for lots of false positives

"Result Type" is USER (for a username/email/account id), or PASSWORD (for a password, auth token, cryptographic key)
Password hashes and encrypted passwords are generally crackable, and are reported as 'PASSWORD'

ccs.py [options]
    -a            : check all files, including binaries (i.e. files containing invalid utf-8 chars)
    -dupes        : report all hits for a single line (default is to only report the first hit)
    -nosuppress   : ignore suppression comments such as # noqa, at line and file level
    -douser       : Run USERNAME checks as well as PASSWORD / KEY checks
    -everything   : Get all possible creds; equivalent to -nosuppress -douser -ns -sa -placeholder
    -ns           : no skip : don't skip files/directories that are irrelevant, like test, /vendor/, /node_modules/, .zip etc
    -p            : print progress
    -sa           : scan all files, not just recommended / code files
    -placeholder  : Allow some likely 'placeholder' false positives, like 'password', 'example', 'dummy'
    -v            : quite verbose
    -vv           : annoyingly verbose
    -vvv          : pointlessly verbose
''')


a = False
v = False
vv = False
vvv = False
ns = False
sa = False
sc = False
print_progress = False
nosuppress = False
allow_duplicates = False
placeholder = False
douser = False


def do_main():
    global a, v, vv, vvv, ns, sa, sc, print_progress, nosuppress, allow_duplicates, placeholder, douser
    argc = len(sys.argv)
    argv = sys.argv

    for i in range(1, argc):
        if argv[i] in ['-h', '-?', '--help', '--h', '/?']:
            return syntax()

        if argv[i] == '-a':
            a = True
            ns = True  # no skip directories
            sa = True  # apply all checks to all files
            continue
        if argv[i] == '-v':
            v = True
            continue
        if argv[i] == '-vv':
            v = True
            vv = True
            continue
        if argv[i] == '-vvv':
            v = True
            vv = True
            vvv = True
            print_progress = True
            continue
        if argv[i] == '-ns':  # no skip directories / files
            ns = True
            continue
        if argv[i] == '-sa':  # apply all checks to all files
            sa = True
            continue
        if argv[i] == '-p':
            print_progress = True
        if argv[i] == '-nosuppress':
            nosuppress = True
        if argv[i] == '-dupes':
            allow_duplicates = True
        if argv[i] == '-placeholder':
            placeholder = True
        if argv[i] == '-douser':
            douser = True
        if argv[i] == '-everything':
            douser = True
            nosuppress = True
            ns = True
            sa = True
            placeholder = True
    return do_checks()


def signal_handler(sig, frame):  # noqa
    os._exit(0)  # noqa


if __name__ == "__main__":
    signal.signal(signal.SIGINT, signal_handler)
    if do_main():
        sys.exit("CCS: Credentials were found\n")

--------------------------------------------------------------------------------
/workflows/main.yml:
--------------------------------------------------------------------------------
# Example workflow to run ccs on the current repository
# Note - this implies that you have copied ccs.py to 'scripts/ccs.py' in your repository
name: run ccs on repo
on:
  push:
    branches:
      - main
jobs:
  comment:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@master
      - run: ./scripts/ccs.py

--------------------------------------------------------------------------------
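Each hit that ccs.py writes to stdout has the shape `fname:line_num:type:Rule:rule_id:result`, where `result` is itself `prefix:pwd:suffix`. When the scanner runs in a workflow like the one above, a later step may want to turn those lines back into fields. The following is a minimal sketch of such a parser, not part of ccs.py: `CcsHit` and `parse_ccs_line` are hypothetical names, the sample line is made up, and the pattern assumes that rule ids contain no colon and that result types are upper-case words such as USER or PASSWORD.

```python
import re
from typing import NamedTuple, Optional


class CcsHit(NamedTuple):
    """One parsed result line (hypothetical helper type, not part of ccs.py)."""
    fname: str
    line_num: int
    result_type: str
    rule_id: str
    detail: str  # the 'prefix:pwd:suffix' part of the line


# Mirrors the output shape fname:line_num:type:Rule:rule_id:result.
# Assumptions: rule ids contain no ':' and result types are upper-case words.
_HIT_RE = re.compile(
    r'^(?P<fname>.+?):(?P<line_num>\d+):(?P<type>[A-Z_]+):Rule:'
    r'(?P<rule_id>[^:]*):(?P<detail>.*)$'
)


def parse_ccs_line(line: str) -> Optional[CcsHit]:
    """Return a CcsHit for a result line, or None if the line does not match."""
    m = _HIT_RE.match(line.rstrip('\n'))
    if m is None:
        return None
    return CcsHit(m['fname'], int(m['line_num']), m['type'], m['rule_id'], m['detail'])


if __name__ == "__main__":
    # Made-up sample line, for illustration only.
    sample = "/repo/app/settings.py:12:PASSWORD:Rule:7:db_pass = ':hunter2xyz:'"
    print(parse_ccs_line(sample))
```

Anchoring on the `:<digits>:<TYPE>:Rule:` sequence rather than splitting on every colon keeps the parser usable when the file name or the matched text contains colons of its own.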