├── .travis.yml
└── README.rst


/.travis.yml:
--------------------------------------------------------------------------------
language: python

sudo: false

install:
  - pip install doc8
  - npm install -g write-good

script:
  - doc8 README.rst
  - write-good README.rst --so --thereIs --cliches

--------------------------------------------------------------------------------
/README.rst:
--------------------------------------------------------------------------------
What happens when...
====================

This repository is an attempt to answer the age old interview question "What
happens when you type google.com into your browser's address box and press
enter?"

Except instead of the usual story, we're going to try to answer this question
in as much detail as possible. No skipping out on anything.

This is a collaborative process, so dig in and try to help out! There are tons
of details missing, just waiting for you to add them! So send us a pull
request, please!

This is all licensed under the terms of the `Creative Commons Zero`_ license.

Read this in `简体中文`_ (simplified Chinese). NOTE: this has not been reviewed
by the alex/what-happens-when maintainers.

Table of Contents
====================

.. contents::
   :backlinks: none
   :local:

The "g" key is pressed
----------------------
The following sections explain the physical keyboard and the OS interrupts.
But a whole lot happens after that which isn't explained. When you just press
"g" the browser receives the event and the entire auto-complete machinery
kicks into high gear. Depending on your browser's algorithm and whether you
are in private/incognito mode, various suggestions will be presented to you in
the dropdown below the URL bar.
Most of these algorithms prioritize results based on search history and
bookmarks. You are going to type "google.com" so none of it matters, but a lot
of code will run before you get there and the suggestions will be refined with
each key press. It may even suggest "google.com" before you type it.

The "enter" key bottoms out
---------------------------

To pick a zero point, let's choose the Enter key on the keyboard hitting the
bottom of its range. At this point, an electrical circuit specific to the enter
key is closed (either directly or capacitively). This allows a small amount of
current to flow into the logic circuitry of the keyboard, which scans the state
of each key switch, debounces the electrical noise of the rapid intermittent
closure of the switch, and converts it to a keycode integer, in this case 13.
The keyboard controller then encodes the keycode for transport to the computer.
This is now almost universally over a Universal Serial Bus (USB) or Bluetooth
connection, but historically has been over PS/2 or ADB connections.

*In the case of the USB keyboard:*

- The USB circuitry of the keyboard is powered by the 5V supply provided over
  pin 1 from the computer's USB host controller.

- The keycode generated is stored by the keyboard's internal circuitry in a
  register called "endpoint".

- The host USB controller polls that "endpoint" every ~10ms (minimum value
  declared by the keyboard), so it gets the keycode value stored on it.

- This value goes to the USB SIE (Serial Interface Engine) to be converted into
  one or more USB packets that follow the low level USB protocol.

- Those packets are sent by a differential electrical signal over D+ and D-
  pins (the middle 2) at a maximum speed of 1.5 Mb/s, as an HID
  (Human Interface Device) is always declared to be a "low speed device"
  (USB 2.0 compliance).

- This serial signal is then decoded at the computer's host USB controller, and
  interpreted by the computer's Human Interface Device (HID) universal keyboard
  device driver. The value of the key is then passed into the operating
  system's hardware abstraction layer.

*In the case of a virtual keyboard (as in touch screen devices):*

- When the user puts their finger on a modern capacitive touch screen, a
  tiny amount of current gets transferred to the finger. This completes the
  circuit through the electrostatic field of the conductive layer and
  creates a voltage drop at that point on the screen. The
  ``screen controller`` then raises an interrupt reporting the coordinate of
  the key press.

- Then the mobile OS notifies the currently focused application of a press
  event in one of its GUI elements (which are now the virtual keyboard
  application's buttons).

- The virtual keyboard can now raise a software interrupt for sending a
  'key pressed' message back to the OS.

- This interrupt notifies the currently focused application of a
  'key pressed' event.


Interrupt fires [NOT for USB keyboards]
---------------------------------------

The keyboard sends signals on its interrupt request line (IRQ), which is mapped
to an ``interrupt vector`` (integer) by the interrupt controller. The CPU uses
the ``Interrupt Descriptor Table`` (IDT) to map the interrupt vectors to
functions (``interrupt handlers``) which are supplied by the kernel. When an
interrupt arrives, the CPU indexes the IDT with the interrupt vector and runs
the appropriate handler.
Thus, the kernel is entered.

(On Windows) A ``WM_KEYDOWN`` message is sent to the app
--------------------------------------------------------

The HID transport passes the key down event to the ``KBDHID.sys`` driver which
converts the HID usage into a scancode. In this case the scan code is
``VK_RETURN`` (``0x0D``). The ``KBDHID.sys`` driver interfaces with the
``KBDCLASS.sys`` (keyboard class driver). This driver is responsible for
handling all keyboard and keypad input in a secure manner. It then calls into
``Win32K.sys`` (after potentially passing the message through 3rd party
keyboard filters that are installed). This all happens in kernel mode.

``Win32K.sys`` figures out which window is the active window through the
``GetForegroundWindow()`` API. This API provides the window handle of the
browser's address box. The main Windows "message pump" then calls
``SendMessage(hWnd, WM_KEYDOWN, VK_RETURN, lParam)``. ``lParam`` is a bitmask
that indicates further information about the keypress: repeat count (0 in this
case), the actual scan code (can be OEM dependent, but generally wouldn't be
for ``VK_RETURN``), whether extended keys (e.g. alt, shift, ctrl) were also
pressed (they weren't), and some other state.

The Windows ``SendMessage`` API is a straightforward function that
adds the message to a queue for the particular window handle (``hWnd``).
Later, the main message processing function (called a ``WindowProc``) assigned
to the ``hWnd`` is called in order to process each message in the queue.

The window (``hWnd``) that is active is actually an edit control and the
``WindowProc`` in this case has a message handler for ``WM_KEYDOWN`` messages.
This code looks within the 3rd parameter that was passed to ``SendMessage``
(``wParam``) and, because it is ``VK_RETURN``, knows the user has hit the ENTER
key.

(On OS X) A ``KeyDown`` NSEvent is sent to the app
--------------------------------------------------

The interrupt signal triggers an interrupt event in the I/O Kit kext keyboard
driver. The driver translates the signal into a key code which is passed to the
OS X ``WindowServer`` process. As a result, the ``WindowServer`` dispatches an
event to any appropriate (e.g. active or listening) applications through their
Mach port where it is placed into an event queue. Events can then be read from
this queue by threads with sufficient privileges calling the
``mach_ipc_dispatch`` function. This most commonly occurs through, and is
handled by, an ``NSApplication`` main event loop, via an ``NSEvent`` of
``NSEventType`` ``KeyDown``.

(On GNU/Linux) the Xorg server listens for keycodes
---------------------------------------------------

When a graphical ``X server`` is used, ``X`` will use the generic event
driver ``evdev`` to acquire the keypress. A re-mapping of keycodes to scancodes
is made with ``X server`` specific keymaps and rules.
When the scancode mapping of the key pressed is complete, the ``X server``
sends the character to the ``window manager`` (DWM, metacity, i3, etc.), so the
``window manager`` in turn sends the character to the focused window.
The graphical API of the window that receives the character prints the
appropriate font symbol in the appropriate focused field.

Parse URL
---------

* The browser now has the following information contained in the URL (Uniform
  Resource Locator):

    - ``Protocol``  "http"
        Use 'Hyper Text Transfer Protocol'

    - ``Resource``  "/"
        Retrieve main (index) page


Is it a URL or a search term?
-----------------------------

When no protocol or valid domain name is given, the browser proceeds to feed
the text given in the address box to the browser's default web search engine.
In many cases the URL has a special piece of text appended to it to tell the
search engine that it came from a particular browser's URL bar.

Convert non-ASCII Unicode characters in hostname
------------------------------------------------

* The browser checks the hostname for characters that are not in ``a-z``,
  ``A-Z``, ``0-9``, ``-``, or ``.``.
* Since the hostname is ``google.com`` there won't be any, but if there were
  the browser would apply `Punycode`_ encoding to the hostname portion of the
  URL.

Check HSTS list
---------------
* The browser checks its "preloaded HSTS (HTTP Strict Transport Security)"
  list. This is a list of websites that have requested to be contacted via
  HTTPS only.
* If the website is in the list, the browser sends its request via HTTPS
  instead of HTTP. Otherwise, the initial request is sent via HTTP.
  (Note that a website can still use the HSTS policy *without* being in the
  HSTS list. The first HTTP request to the website by a user will receive a
  response requesting that the user only send HTTPS requests. However, this
  single HTTP request could potentially leave the user vulnerable to a
  `downgrade attack`_, which is why the HSTS list is included in modern web
  browsers.)
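
The URL parsing, hostname conversion, and HSTS steps above can be sketched in
a few lines of Python. This is only an illustration and not how any particular
browser implements them: ``HSTS_PRELOAD`` is a made-up stand-in for the real
preload list, ``plan_request`` is a hypothetical helper, and Python's built-in
``idna`` codec is assumed to be close enough to a browser's Punycode handling.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for the browser's preloaded HSTS list.
HSTS_PRELOAD = {"secure.example"}


def plan_request(typed: str) -> dict:
    """Split a typed URL and decide which scheme the first request uses."""
    # Assume "http" when the user typed no scheme, as in the walkthrough.
    if "://" not in typed:
        typed = "http://" + typed
    parts = urlsplit(typed)
    # Convert any non-ASCII labels to their Punycode ("xn--") form.
    host = parts.hostname.encode("idna").decode("ascii")
    # Hosts on the preload list are contacted over HTTPS from the start.
    scheme = "https" if host in HSTS_PRELOAD else parts.scheme
    return {"scheme": scheme, "host": host, "path": parts.path or "/"}


print(plan_request("google.com"))
print(plan_request("http://bücher.example/"))
print(plan_request("secure.example"))
```

With this made-up list, ``google.com`` keeps the ``http`` scheme and the
``/`` resource described under "Parse URL", while ``secure.example`` is
upgraded to ``https`` before any request is sent.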

DNS lookup
----------

* Browser checks if the domain is in its cache. (to see the DNS Cache in
  Chrome, go to `chrome://net-internals/#dns <chrome://net-internals/#dns>`_).
* If not found, the browser calls the ``gethostbyname`` library function
  (varies by OS) to do the lookup.
* ``gethostbyname`` checks if the hostname can be resolved by reference in the
  local ``hosts`` file (whose location `varies by OS`_) before trying to
  resolve the hostname through DNS.
* If ``gethostbyname`` neither has it cached nor can find it in the ``hosts``
  file then it makes a request to the DNS server configured in the network
  stack. This is typically the local router or the ISP's caching DNS server.
* If the DNS server is on the same subnet, the network library follows the
  ``ARP process`` below for the DNS server.
* If the DNS server is on a different subnet, the network library follows
  the ``ARP process`` below for the default gateway IP.


ARP process
-----------

In order to send an ARP (Address Resolution Protocol) broadcast the network
stack library needs the target IP address to look up. It also needs to know the
MAC address of the interface it will use to send out the ARP broadcast.

The ARP cache is first checked for an ARP entry for our target IP. If it is in
the cache, the library function returns the result: Target IP = MAC.

If the entry is not in the ARP cache:

* The route table is looked up, to see if the Target IP address is on any of
  the subnets on the local route table. If it is, the library uses the
  interface associated with that subnet. If it is not, the library uses the
  interface that has the subnet of our default gateway.

* The MAC address of the selected network interface is looked up.

* The network library sends a Layer 2 (data link layer of the `OSI model`_)
  ARP request:

``ARP Request``::

    Sender MAC: interface:mac:address:here
    Sender IP: interface.ip.goes.here
    Target MAC: FF:FF:FF:FF:FF:FF (Broadcast)
    Target IP: target.ip.goes.here

Depending on what type of hardware is between the computer and the router:

Directly connected:

* If the computer is directly connected to the router, the router responds
  with an ``ARP Reply`` (see below).

Hub:

* If the computer is connected to a hub, the hub will broadcast the ARP
  request out all other ports. If the router is connected on the same "wire",
  it will respond with an ``ARP Reply`` (see below).

Switch:

* If the computer is connected to a switch, the switch will check its local
  CAM/MAC table to see which port has the MAC address we are looking for. If
  the switch has no entry for the MAC address it will rebroadcast the ARP
  request to all other ports.

* If the switch has an entry in the MAC/CAM table it will send the ARP request
  to the port that has the MAC address we are looking for.

* If the router is on the same "wire", it will respond with an ``ARP Reply``
  (see below).

``ARP Reply``::

    Sender MAC: target:mac:address:here
    Sender IP: target.ip.goes.here
    Target MAC: interface:mac:address:here
    Target IP: interface.ip.goes.here

Now that the network library has the MAC address of either our DNS server or
the default gateway it can resume its DNS process:

* Port 53 is opened to send a UDP request to DNS server (if the response size
  is too large, TCP will be used instead).
* If the local/ISP DNS server does not have it, then a recursive search is
  requested and that flows up the list of DNS servers until the SOA is reached,
  and if found an answer is returned.

Opening of a socket
-------------------
Once the browser receives the IP address of the destination server, it takes
that and the given port number from the URL (the HTTP protocol defaults to port
80, and HTTPS to port 443), and makes a call to the system library function
named ``socket`` and requests a TCP socket stream - ``AF_INET/AF_INET6`` and
``SOCK_STREAM``.

* This request is first passed to the Transport Layer where a TCP segment is
  crafted. The destination port is added to the header, and a source port is
  chosen from within the kernel's dynamic port range (ip_local_port_range in
  Linux).
* This segment is sent to the Network Layer, which wraps it in an additional
  IP header. The IP address of the destination server as well as that of the
  current machine are inserted to form a packet.
* The packet next arrives at the Link Layer. A frame header is added that
  includes the MAC address of the machine's NIC as well as the MAC address of
  the gateway (local router). As before, if the kernel does not know the MAC
  address of the gateway, it must broadcast an ARP query to find it.

At this point the packet is ready to be transmitted through either:

* `Ethernet`_
* `WiFi`_
* `Cellular data network`_

For most home or small business Internet connections the packet will pass from
your computer, possibly through a local network, and then through a modem
(MOdulator/DEModulator) which converts digital 1's and 0's into an analog
signal suitable for transmission over telephone, cable, or wireless telephony
connections.
On the other end of the connection is another modem which converts
the analog signal back into digital data to be processed by the next `network
node`_ where the from and to addresses would be analyzed further.

Most larger businesses and some newer residential connections will have fiber
or direct Ethernet connections in which case the data remains digital and
is passed directly to the next `network node`_ for processing.

Eventually, the packet will reach the router managing the local subnet. From
there, it will continue to travel to the autonomous system's (AS) border
routers, other ASes, and finally to the destination server. Each router along
the way extracts the destination address from the IP header and routes it to
the appropriate next hop. The time to live (TTL) field in the IP header is
decremented by one for each router the packet passes through. The packet will
be dropped if the TTL field reaches zero or if the current router has no space
in its queue (perhaps due to network congestion).
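
The TTL rule in the paragraph above can be illustrated with a toy simulation.
This sketch assumes a simple chain of routers and ignores queueing and
congestion drops; the ``forward`` function and its parameters are hypothetical
names used only for this example.

```python
def forward(ttl: int, routers: int) -> str:
    """Pass a packet through `routers` hops, decrementing TTL at each one."""
    for hop in range(1, routers + 1):
        ttl -= 1  # every router on the path decrements the TTL by one
        if ttl <= 0:
            # The TTL reached zero, so this router drops the packet.
            return f"dropped at router {hop}"
    return f"delivered with TTL {ttl}"


print(forward(ttl=64, routers=12))  # delivered with TTL 52
print(forward(ttl=3, routers=5))    # dropped at router 3
```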

This send and receive happens multiple times following the TCP connection flow:

* Client chooses an initial sequence number (ISN) and sends the packet to the
  server with the SYN bit set to indicate it is setting the ISN
* Server receives SYN and if it's in an agreeable mood:
   * Server chooses its own initial sequence number
   * Server sets SYN to indicate it is choosing its ISN
   * Server copies the (client ISN +1) to its ACK field and adds the ACK flag
     to indicate it is acknowledging receipt of the first packet
* Client acknowledges the connection by sending a packet:
   * Increases its own sequence number
   * Increases the receiver acknowledgment number
   * Sets ACK field
* Data is transferred as follows:
   * As one side sends N data bytes, it increases its SEQ by that number
   * When the other side acknowledges receipt of that packet (or a string of
     packets), it sends an ACK packet with the ACK value equal to the last
     received sequence from the other
* To close the connection:
   * The closer sends a FIN packet
   * The other side ACKs the FIN packet and sends its own FIN
   * The closer acknowledges the other side's FIN with an ACK

TLS handshake
-------------
* The client computer sends a ``ClientHello`` message to the server with its
  Transport Layer Security (TLS) version, list of cipher algorithms and
  compression methods available.

* The server replies with a ``ServerHello`` message to the client with the
  TLS version, selected cipher, selected compression methods and the server's
  public certificate signed by a CA (Certificate Authority). The certificate
  contains a public key that will be used by the client to encrypt the rest of
  the handshake until a symmetric key can be agreed upon.

* The client verifies the server's digital certificate against its list of
  trusted CAs. If trust can be established based on the CA, the client
  generates a string of pseudo-random bytes and encrypts this with the server's
  public key. These random bytes can be used to determine the symmetric key.

* The server decrypts the random bytes using its private key and uses these
  bytes to generate its own copy of the symmetric master key.

* The client sends a ``Finished`` message to the server, encrypting a hash of
  the transmission up to this point with the symmetric key.

* The server generates its own hash, and then decrypts the client-sent hash
  to verify that it matches. If it does, it sends its own ``Finished`` message
  to the client, also encrypted with the symmetric key.

* From now on the TLS session transmits the application (HTTP) data encrypted
  with the agreed symmetric key.

HTTP protocol
-------------

If the web browser used was written by Google, instead of sending an HTTP
request to retrieve the page, it will send a request to try and negotiate with
the server an "upgrade" from HTTP to the SPDY protocol.

If the client is using the HTTP protocol and does not support SPDY, it sends a
request to the server of the form::

    GET / HTTP/1.1
    Host: google.com
    Connection: close
    [other headers]

where ``[other headers]`` refers to a series of colon-separated key-value pairs
formatted as per the HTTP specification and separated by single new lines.
(This assumes the web browser being used doesn't have any bugs violating the
HTTP spec.
This also assumes that the web browser is using ``HTTP/1.1``,
otherwise it may not include the ``Host`` header in the request and the version
specified in the ``GET`` request will either be ``HTTP/1.0`` or ``HTTP/0.9``.)

HTTP/1.1 defines the "close" connection option for the sender to signal that
the connection will be closed after completion of the response. For example,

    Connection: close

HTTP/1.1 applications that do not support persistent connections MUST include
the "close" connection option in every message.

After sending the request and headers, the web browser sends a single blank
newline to the server indicating that the content of the request is done.

The server responds with a response code denoting the status of the request,
in a response of the form::

    200 OK
    [response headers]

Followed by a single newline, and then sends a payload of the HTML content of
``www.google.com``. The server may then either close the connection, or if
headers sent by the client requested it, keep the connection open to be reused
for further requests.

If the HTTP headers sent by the web browser included sufficient information for
the web server to determine if the version of the file cached by the web
browser has been unmodified since the last retrieval (i.e. if the web browser
included the cached ``ETag`` value in an ``If-None-Match`` header), it may
instead respond with a response of the form::

    304 Not Modified
    [response headers]

and no payload, and the web browser instead retrieves the HTML from its cache.

After parsing the HTML, the web browser (and server) repeats this process
for every resource (image, CSS, favicon.ico, etc.) referenced by the HTML page,
except instead of ``GET / HTTP/1.1`` the request will be
``GET /$(URL relative to www.google.com) HTTP/1.1``.
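
The request format described above can be sketched as plain bytes. This is a
minimal illustration, not a complete HTTP client: ``build_request`` and
``parse_status_line`` are hypothetical helper names, nothing is sent over the
network, and the sketch assumes the CRLF line endings and the protocol version
in the status line that HTTP/1.1 uses on the wire.

```python
def build_request(host: str, path: str = "/", extra_headers=None) -> bytes:
    """Serialize a GET request like the one shown in this section."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "Connection: close"]
    for name, value in (extra_headers or {}).items():
        lines.append(f"{name}: {value}")
    # Header lines end with CRLF; an empty line marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_status_line(line: str):
    """Split a status line such as 'HTTP/1.1 304 Not Modified'."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason


print(build_request("google.com"))
print(build_request("google.com", "/favicon.ico"))
print(parse_status_line("HTTP/1.1 304 Not Modified"))
```

The second call shows the follow-up request for a resource referenced by the
page: only the path after ``GET`` changes.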

If the HTML referenced a resource on a different domain than
``www.google.com``, the web browser goes back to the steps involved in
resolving the other domain, and follows all steps up to this point for that
domain. The ``Host`` header in the request will be set to the appropriate
server name instead of ``google.com``.

HTTP Server Request Handle
--------------------------
The HTTPD (HTTP Daemon) server is the one handling the requests/responses on
the server side. The most common HTTPD servers are Apache and nginx for Linux
and IIS for Windows.

* The HTTPD (HTTP Daemon) receives the request.
* The server breaks down the request to the following parameters:
   * HTTP Request Method (either ``GET``, ``HEAD``, ``POST``, ``PUT``,
     ``DELETE``, ``CONNECT``, ``OPTIONS``, or ``TRACE``). In the case of a URL
     entered directly into the address bar, this will be ``GET``.
   * Domain, in this case - google.com.
   * Requested path/page, in this case - / (as no specific path/page was
     requested, / is the default path).
* The server verifies that there is a Virtual Host configured on the server
  that corresponds with google.com.
* The server verifies that google.com can accept GET requests.
* The server verifies that the client is allowed to use this method
  (by IP, authentication, etc.).
* If the server has a rewrite module installed (like mod_rewrite for Apache or
  URL Rewrite for IIS), it tries to match the request against one of the
  configured rules. If a matching rule is found, the server uses that rule to
  rewrite the request.
* The server goes to pull the content that corresponds with the request,
  in our case it will fall back to the index file, as "/" is the main file
  (some cases can override this, but this is the most common method).
* The server parses the file according to the handler. If Google
  is running on PHP, the server uses PHP to interpret the index file, and
  streams the output to the client.

Behind the scenes of the Browser
----------------------------------

Once the server supplies the resources (HTML, CSS, JS, images, etc.)
to the browser, they undergo the following process:

* Parsing - HTML, CSS, JS
* Rendering - Construct DOM Tree → Render Tree → Layout of Render Tree →
  Painting the render tree

Browser
-------

The browser's functionality is to present the web resource you choose, by
requesting it from the server and displaying it in the browser window.
The resource is usually an HTML document, but may also be a PDF,
image, or some other type of content. The location of the resource is
specified by the user using a URI (Uniform Resource Identifier).

The way the browser interprets and displays HTML files is specified
in the HTML and CSS specifications. These specifications are maintained
by the W3C (World Wide Web Consortium) organization, which is the
standards organization for the web.

Browser user interfaces have a lot in common with each other. Among the
common user interface elements are:

* An address bar for inserting a URI
* Back and forward buttons
* Bookmarking options
* Refresh and stop buttons for refreshing or stopping the loading of
  current documents
* Home button that takes you to your home page

**Browser High Level Structure**

The components of the browsers are:

* **User interface:** The user interface includes the address bar,
  back/forward button, bookmarking menu, etc. Every part of the browser
  display except the window where you see the requested page.
* **Browser engine:** The browser engine marshals actions between the UI
  and the rendering engine.
* **Rendering engine:** The rendering engine is responsible for displaying
  requested content. For example if the requested content is HTML, the
  rendering engine parses HTML and CSS, and displays the parsed content on
  the screen.
* **Networking:** The networking handles network calls such as HTTP requests,
  using different implementations for different platforms behind a
  platform-independent interface.
* **UI backend:** The UI backend is used for drawing basic widgets like combo
  boxes and windows. This backend exposes a generic interface that is not
  platform specific. Underneath it uses operating system user interface
  methods.
* **JavaScript engine:** The JavaScript engine is used to parse and
  execute JavaScript code.
* **Data storage:** The data storage is a persistence layer. The browser may
  need to save all sorts of data locally, such as cookies. Browsers also
  support storage mechanisms such as localStorage, IndexedDB, WebSQL and
  FileSystem.

HTML parsing
------------

The rendering engine starts getting the contents of the requested
document from the networking layer. This will usually be done in 8kB chunks.

The primary job of the HTML parser is to parse the HTML markup into a parse
tree.

The output tree (the "parse tree") is a tree of DOM element and attribute
nodes. DOM is short for Document Object Model. It is the object representation
of the HTML document and the interface of HTML elements to the outside world
like JavaScript. The root of the tree is the "Document" object. Prior to
any manipulation via scripting, the DOM has an almost one-to-one relation to
the markup.
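
To make the "almost one-to-one relation" between markup and tree concrete,
here is a toy tree builder on top of Python's standard ``html.parser`` module.
It is only a sketch: ``Node``, ``TreeBuilder`` and ``outline`` are hypothetical
names, it does not implement the HTML5 parsing algorithm, and it does not
treat void elements such as ``<br>`` specially.

```python
from html.parser import HTMLParser


class Node:
    """A minimal stand-in for a DOM node: a tag, attributes, children."""

    def __init__(self, tag, attrs=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.children = []


class TreeBuilder(HTMLParser):
    """Build a nested Node tree from well-formed markup (toy example)."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.stack = [self.root]  # currently open elements, innermost last

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest matching open element; never pop the root.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if data.strip():
            self.stack[-1].children.append(Node("#text:" + data.strip()))


def outline(node, depth=0):
    """Return the tree as a list of indented tag names."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


builder = TreeBuilder()
builder.feed('<html><body><p class="x">Hello</p></body></html>')
print("\n".join(outline(builder.root)))
```

Each element in the markup becomes one node under the ``#document`` root, with
the text content attached as a leaf.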

**The parsing algorithm**

HTML cannot be parsed using the regular top-down or bottom-up parsers.

The reasons are:

* The forgiving nature of the language.
* The fact that browsers have traditional error tolerance to support well
  known cases of invalid HTML.
* The parsing process is reentrant. For other languages, the source doesn't
  change during parsing, but in HTML, dynamic code (such as script elements
  containing ``document.write()`` calls) can add extra tokens, so the parsing
  process actually modifies the input.

Unable to use the regular parsing techniques, the browser utilizes a custom
parser for parsing HTML. The parsing algorithm is described in
detail by the HTML5 specification.

The algorithm consists of two stages: tokenization and tree construction.

**Actions when the parsing is finished**

The browser begins fetching external resources linked to the page (CSS, images,
JavaScript files, etc.).

At this stage the browser marks the document as interactive and starts
parsing scripts that are in "deferred" mode: those that should be
executed after the document is parsed. The document state is
set to "complete" and a "load" event is fired.

Note there is never an "Invalid Syntax" error on an HTML page. Browsers fix
any invalid content and go on.

CSS interpretation
------------------

* Parse CSS files, ``