├── .travis.yml └── README.rst /.travis.yml: -------------------------------------------------------------------------------- 1 | language: python 2 | 3 | sudo: false 4 | 5 | install: 6 | - pip install doc8 7 | - npm install -g write-good 8 | 9 | script: 10 | - doc8 README.rst 11 | - write-good README.rst --so --thereIs --cliches 12 | -------------------------------------------------------------------------------- /README.rst: -------------------------------------------------------------------------------- 1 | What happens when... 2 | ==================== 3 | 4 | This repository is an attempt to answer the age-old interview question "What 5 | happens when you type google.com into your browser's address box and press 6 | enter?" 7 | 8 | Except instead of the usual story, we're going to try to answer this question 9 | in as much detail as possible. No skipping out on anything. 10 | 11 | This is a collaborative process, so dig in and try to help out! There are tons 12 | of details missing, just waiting for you to add them! So send us a pull 13 | request, please! 14 | 15 | This is all licensed under the terms of the `Creative Commons Zero`_ license. 16 | 17 | Read this in `简体中文`_ (simplified Chinese), `日本語`_ (Japanese), `한국어`_ 18 | (Korean) and `Spanish`_. NOTE: these have not been reviewed by the alex/what-happens-when 19 | maintainers. 20 | 21 | Table of Contents 22 | ==================== 23 | 24 | .. contents:: 25 | :backlinks: none 26 | :local: 27 | 28 | The "g" key is pressed 29 | ---------------------- 30 | The following sections explain the physical keyboard actions 31 | and the OS interrupts. When you press the key "g" the browser receives the 32 | event and the auto-complete functions kick in. 33 | Depending on your browser's algorithm and if you are in 34 | private/incognito mode or not various suggestions will be presented 35 | to you in the dropdown below the URL bar. 
Most of these algorithms sort 36 | and prioritize results based on search history, bookmarks, cookies, and 37 | popular searches from the internet as a whole. As you are typing 38 | "google.com" many blocks of code run and the suggestions will be refined 39 | with each keypress. It may even suggest "google.com" before you finish typing 40 | it. 41 | 42 | The "enter" key bottoms out 43 | --------------------------- 44 | 45 | To pick a zero point, let's choose the Enter key on the keyboard hitting the 46 | bottom of its range. At this point, an electrical circuit specific to the enter 47 | key is closed (either directly or capacitively). This allows a small amount of 48 | current to flow into the logic circuitry of the keyboard, which scans the state 49 | of each key switch, debounces the electrical noise of the rapid intermittent 50 | closure of the switch, and converts it to a keycode integer, in this case 13. 51 | The keyboard controller then encodes the keycode for transport to the computer. 52 | This is now almost universally over a Universal Serial Bus (USB) or Bluetooth 53 | connection, but historically has been over PS/2 or ADB connections. 54 | 55 | *In the case of the USB keyboard:* 56 | 57 | - The USB circuitry of the keyboard is powered by the 5V supply provided over 58 | pin 1 from the computer's USB host controller. 59 | 60 | - The keycode generated is stored by internal keyboard circuitry memory in a 61 | register called "endpoint". 62 | 63 | - The host USB controller polls that "endpoint" every ~10ms (minimum value 64 | declared by the keyboard), so it gets the keycode value stored on it. 65 | 66 | - This value goes to the USB SIE (Serial Interface Engine) to be converted in 67 | one or more USB packets that follow the low-level USB protocol. 
68 | 69 | - Those packets are sent by a differential electrical signal over D+ and D- 70 | pins (the middle 2) at a maximum speed of 1.5 Mb/s, as an HID 71 | (Human Interface Device) device is always declared to be a "low-speed device" 72 | (USB 2.0 compliance). 73 | 74 | - This serial signal is then decoded at the computer's host USB controller, and 75 | interpreted by the computer's Human Interface Device (HID) universal keyboard 76 | device driver. The value of the key is then passed into the operating 77 | system's hardware abstraction layer. 78 | 79 | *In the case of Virtual Keyboard (as in touch screen devices):* 80 | 81 | - When the user puts their finger on a modern capacitive touch screen, a 82 | tiny amount of current gets transferred to the finger. This completes the 83 | circuit through the electrostatic field of the conductive layer and 84 | creates a voltage drop at that point on the screen. The 85 | ``screen controller`` then raises an interrupt reporting the coordinate of 86 | the keypress. 87 | 88 | - Then the mobile OS notifies the currently focused application of a press event 89 | in one of its GUI elements (which now is the virtual keyboard application 90 | buttons). 91 | 92 | - The virtual keyboard can now raise a software interrupt for sending a 93 | 'key pressed' message back to the OS. 94 | 95 | - This interrupt notifies the currently focused application of a 'key pressed' 96 | event. 97 | 98 | 99 | Interrupt fires [NOT for USB keyboards] 100 | --------------------------------------- 101 | 102 | The keyboard sends signals on its interrupt request line (IRQ), which is mapped 103 | to an ``interrupt vector`` (integer) by the interrupt controller. The CPU uses 104 | the ``Interrupt Descriptor Table`` (IDT) to map the interrupt vectors to 105 | functions (``interrupt handlers``) which are supplied by the kernel. When an 106 | interrupt arrives, the CPU indexes the IDT with the interrupt vector and runs 107 | the appropriate handler. 
Thus, the kernel is entered. 108 | 109 | (On Windows) A ``WM_KEYDOWN`` message is sent to the app 110 | -------------------------------------------------------- 111 | 112 | The HID transport passes the key down event to the ``KBDHID.sys`` driver which 113 | converts the HID usage into a scancode. In this case, the scancode maps to the 114 | virtual-key code ``VK_RETURN`` (``0x0D``). The ``KBDHID.sys`` driver interfaces with the 115 | ``KBDCLASS.sys`` (keyboard class driver). This driver is responsible for 116 | handling all keyboard and keypad input in a secure manner. It then calls into 117 | ``Win32K.sys`` (after potentially passing the message through 3rd party 118 | keyboard filters that are installed). This all happens in kernel mode. 119 | 120 | ``Win32K.sys`` figures out what window is the active window through the 121 | ``GetForegroundWindow()`` API. This API provides the window handle of the 122 | browser's address box. The main Windows "message pump" then calls 123 | ``SendMessage(hWnd, WM_KEYDOWN, VK_RETURN, lParam)``. ``lParam`` is a bitmask 124 | that indicates further information about the keypress: repeat count (0 in this 125 | case), the actual scan code (can be OEM dependent, but generally wouldn't be 126 | for ``VK_RETURN``), whether extended keys (e.g. alt, shift, ctrl) were also 127 | pressed (they weren't), and some other state. 128 | 129 | The Windows ``SendMessage`` API is a straightforward function that 130 | adds the message to a queue for the particular window handle (``hWnd``). 131 | Later, the main message processing function (called a ``WindowProc``) assigned 132 | to the ``hWnd`` is called in order to process each message in the queue. 133 | 134 | The window (``hWnd``) that is active is actually an edit control and the 135 | ``WindowProc`` in this case has a message handler for ``WM_KEYDOWN`` messages.
136 | This code looks within the 3rd parameter that was passed to ``SendMessage`` 137 | (``wParam``) and, because it is ``VK_RETURN`` knows the user has hit the ENTER 138 | key. 139 | 140 | (On OS X) A ``KeyDown`` NSEvent is sent to the app 141 | -------------------------------------------------- 142 | 143 | The interrupt signal triggers an interrupt event in the I/O Kit kext keyboard 144 | driver. The driver translates the signal into a key code which is passed to the 145 | OS X ``WindowServer`` process. Resultantly, the ``WindowServer`` dispatches an 146 | event to any appropriate (e.g. active or listening) applications through their 147 | Mach port where it is placed into an event queue. Events can then be read from 148 | this queue by threads with sufficient privileges calling the 149 | ``mach_ipc_dispatch`` function. This most commonly occurs through, and is 150 | handled by, an ``NSApplication`` main event loop, via an ``NSEvent`` of 151 | ``NSEventType`` ``KeyDown``. 152 | 153 | (On GNU/Linux) the Xorg server listens for keycodes 154 | --------------------------------------------------- 155 | 156 | When a graphical ``X server`` is used, ``X`` will use the generic event 157 | driver ``evdev`` to acquire the keypress. A re-mapping of keycodes to scancodes 158 | is made with ``X server`` specific keymaps and rules. 159 | When the scancode mapping of the key pressed is complete, the ``X server`` 160 | sends the character to the ``window manager`` (DWM, metacity, i3, etc), so the 161 | ``window manager`` in turn sends the character to the focused window. 162 | The graphical API of the window that receives the character prints the 163 | appropriate font symbol in the appropriate focused field. 
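On Linux, the ``evdev`` interface mentioned above hands each key event to readers as a small fixed-size record. As a rough sketch, the snippet below decodes one such record. It assumes the 64-bit little-endian layout of ``struct input_event`` (two 8-byte time fields, then type, code, and value); the helper name ``decode_input_event`` and the sample timestamp are hypothetical. ``EV_KEY`` (1) and ``KEY_ENTER`` (28) are the values from the Linux input event codes header.

```python
import struct

# Assumed layout of struct input_event on 64-bit little-endian Linux:
# seconds (int64), microseconds (int64), type (uint16), code (uint16), value (int32).
INPUT_EVENT = struct.Struct("<qqHHi")

EV_KEY = 0x01      # event type used for key presses and releases
KEY_ENTER = 28     # Linux keycode for the main Enter key
KEY_STATES = {0: "release", 1: "press", 2: "autorepeat"}


def decode_input_event(raw: bytes) -> dict:
    """Unpack one raw evdev record into a readable dictionary (hypothetical helper)."""
    sec, usec, ev_type, code, value = INPUT_EVENT.unpack(raw)
    return {
        "time": sec + usec / 1_000_000,
        "is_key": ev_type == EV_KEY,
        "code": code,
        "state": KEY_STATES.get(value, "unknown"),
    }


# A synthetic record standing in for what a reader of an evdev device node would get.
sample = INPUT_EVENT.pack(1_700_000_000, 250_000, EV_KEY, KEY_ENTER, 1)
event = decode_input_event(sample)
print(event)
```

The X server performs its own keycode mapping on top of records like this one; that step is not modelled here.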
164 | 165 | Parse URL 166 | --------- 167 | 168 | * The browser now has the following information contained in the URL (Uniform 169 | Resource Locator): 170 | 171 | - ``Protocol`` "http" 172 | Use 'Hyper Text Transfer Protocol' 173 | 174 | - ``Resource`` "/" 175 | Retrieve main (index) page 176 | 177 | 178 | Is it a URL or a search term? 179 | ----------------------------- 180 | 181 | When no protocol or valid domain name is given the browser proceeds to feed 182 | the text given in the address box to the browser's default web search engine. 183 | In many cases the URL has a special piece of text appended to it to tell the 184 | search engine that it came from a particular browser's URL bar. 185 | 186 | Convert non-ASCII Unicode characters in the hostname 187 | ------------------------------------------------ 188 | 189 | * The browser checks the hostname for characters that are not in ``a-z``, 190 | ``A-Z``, ``0-9``, ``-``, or ``.``. 191 | * Since the hostname is ``google.com`` there won't be any, but if there were 192 | the browser would apply `Punycode`_ encoding to the hostname portion of the 193 | URL. 194 | 195 | Check HSTS list 196 | --------------- 197 | * The browser checks its "preloaded HSTS (HTTP Strict Transport Security)" 198 | list. This is a list of websites that have requested to be contacted via 199 | HTTPS only. 200 | * If the website is in the list, the browser sends its request via HTTPS 201 | instead of HTTP. Otherwise, the initial request is sent via HTTP. 202 | (Note that a website can still use the HSTS policy *without* being in the 203 | HSTS list. The first HTTP request to the website by a user will receive a 204 | response requesting that the user only send HTTPS requests. However, this 205 | single HTTP request could potentially leave the user vulnerable to a 206 | `downgrade attack`_, which is why the HSTS list is included in modern web 207 | browsers.) 
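The hostname checks described in the last two sections can be sketched in a few lines. This is a simplified illustration, not how any particular browser implements it: it relies on Python's built-in ``idna`` codec for the Punycode step, the set ``PRELOADED_HSTS`` is a made-up stand-in for the real preload list, and the function names are hypothetical. Real preload entries can also cover subdomains, which this sketch ignores.

```python
def to_ascii_hostname(hostname: str) -> str:
    """Return the hostname with any non-ASCII labels Punycode-encoded."""
    # Python's built-in "idna" codec applies Punycode label by label.
    return hostname.encode("idna").decode("ascii")


# Hypothetical stand-in for the browser's preloaded HSTS list.
PRELOADED_HSTS = {"example-bank.test"}


def initial_scheme(hostname: str, preloaded=PRELOADED_HSTS) -> str:
    """Pick HTTPS when the host is on the preload list, otherwise HTTP."""
    return "https" if to_ascii_hostname(hostname) in preloaded else "http"


print(to_ascii_hostname("google.com"))       # plain ASCII, left unchanged
print(to_ascii_hostname("bücher.example"))   # the non-ASCII label is encoded
print(initial_scheme("example-bank.test"))
print(initial_scheme("not-listed.test"))
```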
208 | 209 | DNS lookup 210 | ---------- 211 | 212 | * The browser checks if the domain is in its cache (to see the DNS cache in 213 | Chrome, go to `chrome://net-internals/#dns <chrome://net-internals/#dns>`_). 214 | * If not found, the browser calls the ``gethostbyname`` library function (varies by 215 | OS) to do the lookup. 216 | * ``gethostbyname`` checks if the hostname can be resolved by reference in the 217 | local ``hosts`` file (whose location `varies by OS`_) before trying to 218 | resolve the hostname through DNS. 219 | * If ``gethostbyname`` neither has it cached nor finds it in the ``hosts`` 220 | file, it makes a request to the DNS server configured in the network 221 | stack. This is typically the local router or the ISP's caching DNS server. 222 | * If the DNS server is on the same subnet the network library follows the 223 | ``ARP process`` below for the DNS server. 224 | * If the DNS server is on a different subnet, the network library follows 225 | the ``ARP process`` below for the default gateway IP. 226 | 227 | 228 | ARP process 229 | ----------- 230 | 231 | In order to send an ARP (Address Resolution Protocol) broadcast the network 232 | stack library needs the target IP address to look up. It also needs to know the 233 | MAC address of the interface it will use to send out the ARP broadcast. 234 | 235 | The ARP cache is first checked for an ARP entry for our target IP. If it is in 236 | the cache, the library function returns the result: Target IP = MAC. 237 | 238 | If the entry is not in the ARP cache: 239 | 240 | * The route table is looked up, to see if the Target IP address is on any of 241 | the subnets on the local route table. If it is, the library uses the 242 | interface associated with that subnet. If it is not, the library uses the 243 | interface that has the subnet of our default gateway. 244 | 245 | * The MAC address of the selected network interface is looked up.
246 | 247 | * The network library sends a Layer 2 (data link layer of the `OSI model`_) 248 | ARP request: 249 | 250 | ``ARP Request``:: 251 | 252 | Sender MAC: interface:mac:address:here 253 | Sender IP: interface.ip.goes.here 254 | Target MAC: FF:FF:FF:FF:FF:FF (Broadcast) 255 | Target IP: target.ip.goes.here 256 | 257 | Depending on what type of hardware is between the computer and the router: 258 | 259 | Directly connected: 260 | 261 | * If the computer is directly connected to the router, the router responds 262 | with an ``ARP Reply`` (see below). 263 | 264 | Hub: 265 | 266 | * If the computer is connected to a hub, the hub will broadcast the ARP 267 | request out of all other ports. If the router is connected on the same "wire", 268 | it will respond with an ``ARP Reply`` (see below). 269 | 270 | Switch: 271 | 272 | * If the computer is connected to a switch, the switch will check its local 273 | CAM/MAC table to see which port has the MAC address we are looking for. If 274 | the switch has no entry for the MAC address it will rebroadcast the ARP 275 | request to all other ports. 276 | 277 | * If the switch has an entry in the MAC/CAM table it will send the ARP request 278 | to the port that has the MAC address we are looking for. 279 | 280 | * If the router is on the same "wire", it will respond with an ``ARP Reply`` 281 | (see below). 282 | 283 | ``ARP Reply``:: 284 | 285 | Sender MAC: target:mac:address:here 286 | Sender IP: target.ip.goes.here 287 | Target MAC: interface:mac:address:here 288 | Target IP: interface.ip.goes.here 289 | 290 | Now that the network library has the MAC address of either our DNS server or 291 | the default gateway it can resume its DNS process: 292 | 293 | * The DNS client establishes a socket to UDP port 53 on the DNS server, 294 | using a source port above 1023. 295 | * If the response size is too large, TCP will be used instead.
296 | * If the local/ISP DNS server does not have it, then a recursive search is 297 | requested and that flows up the list of DNS servers until the SOA is reached, 298 | and if found an answer is returned. 299 | 300 | Opening of a socket 301 | ------------------- 302 | Once the browser receives the IP address of the destination server, it takes 303 | that and the given port number from the URL (the HTTP protocol defaults to port 304 | 80, and HTTPS to port 443), and makes a call to the system library function 305 | named ``socket`` and requests a TCP socket stream - ``AF_INET/AF_INET6`` and 306 | ``SOCK_STREAM``. 307 | 308 | * This request is first passed to the Transport Layer where a TCP segment is 309 | crafted. The destination port is added to the header, and a source port is 310 | chosen from within the kernel's dynamic port range (ip_local_port_range in 311 | Linux). 312 | * This segment is sent to the Network Layer, which wraps an additional IP 313 | header. The IP address of the destination server as well as that of the 314 | current machine is inserted to form a packet. 315 | * The packet next arrives at the Link Layer. A frame header is added that 316 | includes the MAC address of the machine's NIC as well as the MAC address of 317 | the gateway (local router). As before, if the kernel does not know the MAC 318 | address of the gateway, it must broadcast an ARP query to find it. 319 | 320 | At this point the packet is ready to be transmitted through either: 321 | 322 | * `Ethernet`_ 323 | * `WiFi`_ 324 | * `Cellular data network`_ 325 | 326 | For most home or small business Internet connections the packet will pass from 327 | your computer, possibly through a local network, and then through a modem 328 | (MOdulator/DEModulator) which converts digital 1's and 0's into an analog 329 | signal suitable for transmission over telephone, cable, or wireless telephony 330 | connections. 
On the other end of the connection is another modem which converts 331 | the analog signal back into digital data to be processed by the next `network 332 | node`_ where the from and to addresses would be analyzed further. 333 | 334 | Most larger businesses and some newer residential connections will have fiber 335 | or direct Ethernet connections in which case the data remains digital and 336 | is passed directly to the next `network node`_ for processing. 337 | 338 | Eventually, the packet will reach the router managing the local subnet. From 339 | there, it will continue to travel to the autonomous system's (AS) border 340 | routers, other ASes, and finally to the destination server. Each router along 341 | the way extracts the destination address from the IP header and routes it to 342 | the appropriate next hop. The time to live (TTL) field in the IP header is 343 | decremented by one for each router that passes. The packet will be dropped if 344 | the TTL field reaches zero or if the current router has no space in its queue 345 | (perhaps due to network congestion). 
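The TTL behaviour described above can be illustrated with a toy model. This is only a sketch under simple assumptions: every router on the path decrements the TTL by exactly one and drops the packet when the value reaches zero, and queue overflow is ignored. The function name ``forward`` and the hop counts are made up for the example.

```python
def forward(ttl: int, routers_on_path: int):
    """Simulate routers decrementing TTL.

    Returns (delivered, routers_reached).
    """
    for hop in range(1, routers_on_path + 1):
        ttl -= 1                 # each router decrements the TTL by one
        if ttl == 0:
            return False, hop    # this router drops the packet
    return True, routers_on_path


print(forward(ttl=64, routers_on_path=12))  # a typical initial TTL survives the trip
print(forward(ttl=3, routers_on_path=5))    # too small: dropped at the third router
```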
346 | 347 | This send and receive happens multiple times, following the TCP connection flow: 348 | 349 | * Client chooses an initial sequence number (ISN) and sends the packet to the 350 | server with the SYN bit set to indicate it is setting the ISN 351 | * Server receives SYN and if it's in an agreeable mood: 352 | * Server chooses its own initial sequence number 353 | * Server sets SYN to indicate it is choosing its ISN 354 | * Server copies the (client ISN + 1) to its ACK field and adds the ACK flag 355 | to indicate it is acknowledging receipt of the first packet 356 | * Client acknowledges the connection by sending a packet: 357 | * Increases its own sequence number 358 | * Increases the receiver acknowledgment number 359 | * Sets ACK field 360 | * Data is transferred as follows: 361 | * As one side sends N data bytes, it increases its SEQ by that number 362 | * When the other side acknowledges receipt of that packet (or a string of 363 | packets), it sends an ACK packet with the ACK value equal to the last 364 | received sequence from the other 365 | * To close the connection: 366 | * The closer sends a FIN packet 367 | * The other side ACKs the FIN packet and sends its own FIN 368 | * The closer acknowledges the other side's FIN with an ACK 369 | 370 | TLS handshake 371 | ------------- 372 | * The client computer sends a ``ClientHello`` message to the server with its 373 | Transport Layer Security (TLS) version, list of cipher algorithms and 374 | compression methods available. 375 | 376 | * The server replies with a ``ServerHello`` message to the client with the 377 | TLS version, selected cipher, selected compression methods and the server's 378 | public certificate signed by a CA (Certificate Authority). The certificate 379 | contains a public key that will be used by the client to encrypt the rest of 380 | the handshake until a symmetric key can be agreed upon.
381 | 382 | * The client verifies the server digital certificate against its list of 383 | trusted CAs. If trust can be established based on the CA, the client 384 | generates a string of pseudo-random bytes and encrypts this with the server's 385 | public key. These random bytes can be used to determine the symmetric key. 386 | 387 | * The server decrypts the random bytes using its private key and uses these 388 | bytes to generate its own copy of the symmetric master key. 389 | 390 | * The client sends a ``Finished`` message to the server, encrypting a hash of 391 | the transmission up to this point with the symmetric key. 392 | 393 | * The server generates its own hash, and then decrypts the client-sent hash 394 | to verify that it matches. If it does, it sends its own ``Finished`` message 395 | to the client, also encrypted with the symmetric key. 396 | 397 | * From now on the TLS session transmits the application (HTTP) data encrypted 398 | with the agreed symmetric key. 399 | 400 | If a packet is dropped 401 | ---------------------- 402 | 403 | Sometimes, due to network congestion or flaky hardware connections, TLS packets 404 | will be dropped before they get to their final destination. The sender then has 405 | to decide how to react. The algorithm for this is called `TCP congestion 406 | control`_. This varies depending on the sender; the most common algorithms are 407 | `cubic`_ on newer operating systems and `New Reno`_ on almost all others. 408 | 409 | * Client chooses a `congestion window`_ based on the `maximum segment size`_ 410 | (MSS) of the connection. 411 | * For each packet acknowledged, the window doubles in size until it reaches the 412 | 'slow-start threshold'. In some implementations, this threshold is adaptive. 413 | * After reaching the slow-start threshold, the window increases additively for 414 | each packet acknowledged. If a packet is dropped, the window reduces 415 | exponentially until another packet is acknowledged. 
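The window growth described above can be sketched as a small simulation. This is a deliberately simplified, Reno-style model and not the behaviour of any specific operating system: it counts the window in whole segments, advances once per round trip rather than per acknowledged packet, and halves the window on a loss. The function name ``next_window``, the starting values, and the loss in round 7 are all assumptions made for the example.

```python
def next_window(cwnd: int, ssthresh: int, lost: bool):
    """Return (cwnd, ssthresh), in segments, after one round trip."""
    if lost:
        ssthresh = max(cwnd // 2, 2)   # remember half of the window that failed
        return ssthresh, ssthresh      # multiplicative decrease
    if cwnd < ssthresh:
        return min(cwnd * 2, ssthresh), ssthresh   # slow start: double
    return cwnd + 1, ssthresh                      # congestion avoidance: add one


cwnd, ssthresh = 1, 16
history = []
for round_trip in range(10):
    lost = round_trip == 7             # pretend a single loss in round 7
    cwnd, ssthresh = next_window(cwnd, ssthresh, lost)
    history.append(cwnd)

print(history)
```

The printed history shows the three phases in order: doubling up to the threshold, then growth by one segment per round trip, then a sharp cut when the loss occurs.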
416 | 417 | HTTP protocol 418 | ------------- 419 | 420 | If the web browser used was written by Google, instead of sending an HTTP 421 | request to retrieve the page, it will send a request to try and negotiate with 422 | the server an "upgrade" from HTTP to the SPDY protocol. 423 | 424 | If the client is using the HTTP protocol and does not support SPDY, it sends a 425 | request to the server of the form:: 426 | 427 | GET / HTTP/1.1 428 | Host: google.com 429 | Connection: close 430 | [other headers] 431 | 432 | where ``[other headers]`` refers to a series of colon-separated key-value pairs 433 | formatted as per the HTTP specification and separated by single newlines. 434 | (This assumes the web browser being used doesn't have any bugs violating the 435 | HTTP spec. This also assumes that the web browser is using ``HTTP/1.1``, 436 | otherwise it may not include the ``Host`` header in the request and the version 437 | specified in the ``GET`` request will either be ``HTTP/1.0`` or ``HTTP/0.9``.) 438 | 439 | HTTP/1.1 defines the "close" connection option for the sender to signal that 440 | the connection will be closed after completion of the response. For example, 441 | 442 | Connection: close 443 | 444 | HTTP/1.1 applications that do not support persistent connections MUST include 445 | the "close" connection option in every message. 446 | 447 | After sending the request and headers, the web browser sends a single blank 448 | newline to the server indicating that the content of the request is done. 449 | 450 | The server responds with a response code denoting the status of the request and 451 | responds with a response of the form:: 452 | 453 | 200 OK 454 | [response headers] 455 | 456 | Followed by a single newline, and then sends a payload of the HTML content of 457 | ``www.google.com``. The server may then either close the connection, or if 458 | headers sent by the client requested it, keep the connection open to be reused 459 | for further requests. 
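As a rough illustration of the exchange above, the sketch below serializes the minimal request shown earlier and splits a status line into its parts. The helper names ``build_request`` and ``parse_status_line`` are hypothetical. On the wire, HTTP/1.1 ends each header line with CRLF (``\r\n``), and the blank line that ends the headers is a bare CRLF. A full status line also starts with the protocol version, as in ``HTTP/1.1 200 OK``.

```python
def build_request(host: str, path: str = "/", extra_headers=None) -> bytes:
    """Serialize a minimal HTTP/1.1 GET request (hypothetical helper)."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "Connection: close"]
    for name, value in (extra_headers or {}).items():
        lines.append(f"{name}: {value}")
    # Each header line ends with CRLF; an empty line marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_status_line(line: str):
    """Split a status line such as 'HTTP/1.1 200 OK' into its three parts."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason


request = build_request("google.com")
print(request.decode("ascii"))
print(parse_status_line("HTTP/1.1 200 OK"))
```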
460 | 461 | If the HTTP headers sent by the web browser included sufficient information for 462 | the webserver to determine if the version of the file cached by the web 463 | browser has been unmodified since the last retrieval (i.e., if the web browser 464 | sent the cached ``ETag`` value in an ``If-None-Match`` header), it may instead respond with a response of 465 | the form:: 466 | 467 | 304 Not Modified 468 | [response headers] 469 | 470 | and no payload, and the web browser instead retrieves the HTML from its cache. 471 | 472 | After parsing the HTML, the web browser (and server) repeats this process 473 | for every resource (image, CSS, favicon.ico, etc.) referenced by the HTML page, 474 | except instead of ``GET / HTTP/1.1`` the request will be 475 | ``GET /$(URL relative to www.google.com) HTTP/1.1``. 476 | 477 | If the HTML referenced a resource on a different domain than 478 | ``www.google.com``, the web browser goes back to the steps involved in 479 | resolving the other domain, and follows all steps up to this point for that 480 | domain. The ``Host`` header in the request will be set to the appropriate 481 | server name instead of ``google.com``. 482 | 483 | HTTP Server Request Handle 484 | -------------------------- 485 | The HTTPD (HTTP Daemon) server is the one handling the requests/responses on 486 | the server-side. The most common HTTPD servers are Apache or nginx for Linux 487 | and IIS for Windows. 488 | 489 | * The HTTPD (HTTP Daemon) receives the request. 490 | * The server breaks down the request into the following parameters: 491 | * HTTP Request Method (either ``GET``, ``HEAD``, ``POST``, ``PUT``, 492 | ``PATCH``, ``DELETE``, ``CONNECT``, ``OPTIONS``, or ``TRACE``). In the 493 | case of a URL entered directly into the address bar, this will be ``GET``. 494 | * Domain, in this case - google.com. 495 | * Requested path/page, in this case - / (as no specific path/page was 496 | requested, / is the default path).
497 | * The server verifies that there is a Virtual Host configured on the server 498 | that corresponds with google.com. 499 | * The server verifies that google.com can accept GET requests. 500 | * The server verifies that the client is allowed to use this method 501 | (by IP, authentication, etc.). 502 | * If the server has a rewrite module installed (like mod_rewrite for Apache or 503 | URL Rewrite for IIS), it tries to match the request against one of the 504 | configured rules. If a matching rule is found, the server uses that rule to 505 | rewrite the request. 506 | * The server goes to pull the content that corresponds with the request, 507 | in our case it will fall back to the index file, as "/" is the main file 508 | (some cases can override this, but this is the most common method). 509 | * The server parses the file according to the handler. If Google 510 | is running on PHP, the server uses PHP to interpret the index file, and 511 | streams the output to the client. 512 | 513 | Behind the scenes of the Browser 514 | ---------------------------------- 515 | 516 | Once the server supplies the resources (HTML, CSS, JS, images, etc.) 517 | to the browser it undergoes the below process: 518 | 519 | * Parsing - HTML, CSS, JS 520 | * Rendering - Construct DOM Tree → Render Tree → Layout of Render Tree → 521 | Painting the render tree 522 | 523 | Browser 524 | ------- 525 | 526 | The browser's functionality is to present the web resource you choose, by 527 | requesting it from the server and displaying it in the browser window. 528 | The resource is usually an HTML document, but may also be a PDF, 529 | image, or some other type of content. The location of the resource is 530 | specified by the user using a URI (Uniform Resource Identifier). 531 | 532 | The way the browser interprets and displays HTML files is specified 533 | in the HTML and CSS specifications. 
These specifications are maintained 534 | by the W3C (World Wide Web Consortium) organization, which is the 535 | standards organization for the web. 536 | 537 | Browser user interfaces have a lot in common with each other. Among the 538 | common user interface elements are: 539 | 540 | * An address bar for inserting a URI 541 | * Back and forward buttons 542 | * Bookmarking options 543 | * Refresh and stop buttons for refreshing or stopping the loading of 544 | current documents 545 | * Home button that takes you to your home page 546 | 547 | **Browser High-Level Structure** 548 | 549 | The components of the browsers are: 550 | 551 | * **User interface:** The user interface includes the address bar, 552 | back/forward button, bookmarking menu, etc. Every part of the browser 553 | display except the window where you see the requested page. 554 | * **Browser engine:** The browser engine marshals actions between the UI 555 | and the rendering engine. 556 | * **Rendering engine:** The rendering engine is responsible for displaying 557 | requested content. For example if the requested content is HTML, the 558 | rendering engine parses HTML and CSS, and displays the parsed content on 559 | the screen. 560 | * **Networking:** The networking handles network calls such as HTTP requests, 561 | using different implementations for different platforms behind a 562 | platform-independent interface. 563 | * **UI backend:** The UI backend is used for drawing basic widgets like combo 564 | boxes and windows. This backend exposes a generic interface that is not 565 | platform-specific. 566 | Underneath it uses operating system user interface methods. 567 | * **JavaScript engine:** The JavaScript engine is used to parse and 568 | execute JavaScript code. 569 | * **Data storage:** The data storage is a persistence layer. The browser may 570 | need to save all sorts of data locally, such as cookies. 
Browsers also 571 | support storage mechanisms such as localStorage, IndexedDB, WebSQL and 572 | FileSystem. 573 | 574 | HTML parsing 575 | ------------ 576 | 577 | The rendering engine starts getting the contents of the requested 578 | document from the networking layer. This will usually be done in 8kB chunks. 579 | 580 | The primary job of the HTML parser is to parse the HTML markup into a parse tree. 581 | 582 | The output tree (the "parse tree") is a tree of DOM element and attribute 583 | nodes. DOM is short for Document Object Model. It is the object presentation 584 | of the HTML document and the interface of HTML elements to the outside world 585 | like JavaScript. The root of the tree is the "Document" object. Prior to 586 | any manipulation via scripting, the DOM has an almost one-to-one relation to 587 | the markup. 588 | 589 | **The parsing algorithm** 590 | 591 | HTML cannot be parsed using the regular top-down or bottom-up parsers. 592 | 593 | The reasons are: 594 | 595 | * The forgiving nature of the language. 596 | * The fact that browsers have traditional error tolerance to support well 597 | known cases of invalid HTML. 598 | * The parsing process is reentrant. For other languages, the source doesn't 599 | change during parsing, but in HTML, dynamic code (such as script elements 600 | containing `document.write()` calls) can add extra tokens, so the parsing 601 | process actually modifies the input. 602 | 603 | Unable to use the regular parsing techniques, the browser utilizes a custom 604 | parser for parsing HTML. The parsing algorithm is described in 605 | detail by the HTML5 specification. 606 | 607 | The algorithm consists of two stages: tokenization and tree construction. 608 | 609 | **Actions when the parsing is finished** 610 | 611 | The browser begins fetching external resources linked to the page (CSS, images, 612 | JavaScript files, etc.). 
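To make the resource-fetching step concrete, here is a minimal sketch that scans markup for the external resources a page references. It uses Python's standard ``html.parser`` module, which is a simple tolerant tokenizer and not the HTML5 parsing algorithm that browsers implement. The class name ``ResourceCollector``, the tag-to-attribute table, and the sample page are assumptions made for the example.

```python
from html.parser import HTMLParser


class ResourceCollector(HTMLParser):
    """Collect URLs of external resources referenced by a page."""

    # Tag -> attribute that names the external resource (illustrative subset).
    URL_ATTRS = {"link": "href", "script": "src", "img": "src"}

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        wanted = self.URL_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.resources.append((tag, value))


page = """
<html><head>
<link rel="stylesheet" href="/style.css">
<script src="/app.js"></script>
</head><body>
<img src="/logo.png"><p>unclosed paragraph
</body></html>
"""

collector = ResourceCollector()
collector.feed(page)
print(collector.resources)
```

The unclosed ``<p>`` in the sample does not raise an error here either, which loosely mirrors the forgiving nature of HTML parsing described earlier.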
613 | 614 | At this stage the browser marks the document as interactive and starts 615 | parsing scripts that are in "deferred" mode: those that should be 616 | executed after the document is parsed. The document state is 617 | set to "complete" and a "load" event is fired. 618 | 619 | Note there is never an "Invalid Syntax" error on an HTML page. Browsers fix 620 | any invalid content and go on. 621 | 622 | CSS interpretation 623 | ------------------ 624 | 625 | * Parse CSS files, ``