├── .travis.yml
└── README.rst


/.travis.yml:
--------------------------------------------------------------------------------
 1 | language: python
 2 | 
 3 | sudo: false
 4 | 
 5 | install: 
 6 |     - pip install doc8
 7 |     - npm install -g write-good
 8 | 
 9 | script:
10 |     - doc8 README.rst
11 |     - write-good README.rst --so --thereIs --cliches
12 | 


--------------------------------------------------------------------------------
/README.rst:
--------------------------------------------------------------------------------
  1 | What happens when...
  2 | ====================
  3 | 
  4 | This repository is an attempt to answer the age-old interview question "What
  5 | happens when you type google.com into your browser's address box and press
  6 | enter?"
  7 | 
  8 | Except instead of the usual story, we're going to try to answer this question
  9 | in as much detail as possible. No skipping out on anything.
 10 | 
 11 | This is a collaborative process, so dig in and try to help out! There are tons
 12 | of details missing, just waiting for you to add them! So send us a pull
 13 | request, please!
 14 | 
 15 | This is all licensed under the terms of the `Creative Commons Zero`_ license.
 16 | 
 17 | Read this in `简体中文`_ (simplified Chinese), `日本語`_ (Japanese), `한국어`_
 18 | (Korean) and `Spanish`_. NOTE: these have not been reviewed by the alex/what-happens-when
 19 | maintainers.
 20 | 
 21 | Table of Contents
 22 | ====================
 23 | 
 24 | .. contents::
 25 |    :backlinks: none
 26 |    :local:
 27 | 
 28 | The "g" key is pressed
 29 | ----------------------
 30 | The following sections explain the physical keyboard actions
 31 | and the OS interrupts. When you press the key "g", the browser receives the
 32 | event and the auto-complete functions kick in.
 33 | Depending on your browser's algorithm and whether you are in
 34 | private/incognito mode, various suggestions will be presented
 35 | to you in the dropdown below the URL bar. Most of these algorithms sort
 36 | and prioritize results based on search history, bookmarks, cookies, and
 37 | popular searches from the internet as a whole. As you are typing
 38 | "google.com", many blocks of code run and the suggestions will be refined
 39 | with each keypress. It may even suggest "google.com" before you finish typing
 40 | it.
 41 | 
 42 | The "enter" key bottoms out
 43 | ---------------------------
 44 | 
 45 | To pick a zero point, let's choose the Enter key on the keyboard hitting the
 46 | bottom of its range. At this point, an electrical circuit specific to the enter
 47 | key is closed (either directly or capacitively). This allows a small amount of
 48 | current to flow into the logic circuitry of the keyboard, which scans the state
 49 | of each key switch, debounces the electrical noise of the rapid intermittent
 50 | closure of the switch, and converts it to a keycode integer, in this case 13.
 51 | The keyboard controller then encodes the keycode for transport to the computer.
 52 | This is now almost universally over a Universal Serial Bus (USB) or Bluetooth
 53 | connection, but historically has been over PS/2 or ADB connections.
 54 | 
 55 | *In the case of the USB keyboard:*
 56 | 
 57 | - The USB circuitry of the keyboard is powered by the 5V supply provided over
 58 |   pin 1 from the computer's USB host controller.
 59 | 
 60 | - The keycode generated is stored by internal keyboard circuitry memory in a
 61 |   register called "endpoint".
 62 | 
 63 | - The host USB controller polls that "endpoint" every ~10ms (minimum value
 64 |   declared by the keyboard), so it gets the keycode value stored on it.
 65 | 
 66 | - This value goes to the USB SIE (Serial Interface Engine) to be converted into
 67 |   one or more USB packets that follow the low-level USB protocol.
 68 | 
 69 | - Those packets are sent by a differential electrical signal over D+ and D-
 70 |   pins (the middle 2) at a maximum speed of 1.5 Mb/s, as an HID
 71 |   (Human Interface Device) is always declared to be a "low-speed device"
 72 |   (USB 2.0 compliance).
 73 | 
 74 | - This serial signal is then decoded at the computer's host USB controller, and
 75 |   interpreted by the computer's Human Interface Device (HID) universal keyboard
 76 |   device driver.  The value of the key is then passed into the operating
 77 |   system's hardware abstraction layer.
 78 | 
 79 | *In the case of Virtual Keyboard (as in touch screen devices):*
 80 | 
 81 | - When the user puts their finger on a modern capacitive touch screen, a
 82 |   tiny amount of current gets transferred to the finger. This completes the
 83 |   circuit through the electrostatic field of the conductive layer and
 84 |   creates a voltage drop at that point on the screen. The
 85 |   ``screen controller`` then raises an interrupt reporting the coordinate of
 86 |   the keypress.
 87 | 
 88 | - Then the mobile OS notifies the currently focused application of a press event
 89 |   in one of its GUI elements (which are now the virtual keyboard application's
 90 |   buttons).
 91 | 
 92 | - The virtual keyboard can now raise a software interrupt for sending a
 93 |   'key pressed' message back to the OS.
 94 | 
 95 | - This interrupt notifies the currently focused application of a 'key pressed'
 96 |   event.
 97 | 
 98 | 
 99 | Interrupt fires [NOT for USB keyboards]
100 | ---------------------------------------
101 | 
102 | The keyboard sends signals on its interrupt request line (IRQ), which is mapped
103 | to an ``interrupt vector`` (integer) by the interrupt controller. The CPU uses
104 | the ``Interrupt Descriptor Table`` (IDT) to map the interrupt vectors to
105 | functions (``interrupt handlers``) which are supplied by the kernel. When an
106 | interrupt arrives, the CPU indexes the IDT with the interrupt vector and runs
107 | the appropriate handler. Thus, the kernel is entered.
108 | 
109 | (On Windows) A ``WM_KEYDOWN`` message is sent to the app
110 | --------------------------------------------------------
111 | 
112 | The HID transport passes the key down event to the ``KBDHID.sys`` driver which
113 | converts the HID usage into a scancode. In this case, the scan code maps to
114 | the virtual-key code ``VK_RETURN`` (``0x0D``). ``KBDHID.sys`` interfaces with
115 | ``KBDCLASS.sys`` (the keyboard class driver), which is responsible for
116 | handling all keyboard and keypad input in a secure manner. It then calls into
117 | ``Win32K.sys`` (after potentially passing the message through 3rd party
118 | keyboard filters that are installed). This all happens in kernel mode.
119 | 
120 | ``Win32K.sys`` figures out what window is the active window through the
121 | ``GetForegroundWindow()`` API. This API provides the window handle of the
122 | browser's address box. The main Windows "message pump" then calls
123 | ``SendMessage(hWnd, WM_KEYDOWN, VK_RETURN, lParam)``. ``lParam`` is a bitmask
124 | that indicates further information about the keypress: repeat count (0 in this
125 | case), the actual scan code (can be OEM dependent, but generally wouldn't be
126 | for ``VK_RETURN``), whether extended keys (e.g. alt, shift, ctrl) were also
127 | pressed (they weren't), and some other state.
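
The ``lParam`` bitmask described above can be unpacked with a few shifts and
masks. The following is a minimal sketch in Python; the bit positions follow
Microsoft's published ``WM_KEYDOWN`` layout as commonly documented, the helper
name ``decode_keydown_lparam`` is hypothetical, and the example value is made
up for illustration rather than captured from a real keypress.

```python
def decode_keydown_lparam(lparam: int) -> dict:
    """Split a WM_KEYDOWN lParam into its individual bit fields."""
    return {
        "repeat_count": lparam & 0xFFFF,               # bits 0-15
        "scan_code": (lparam >> 16) & 0xFF,            # bits 16-23
        "is_extended_key": bool((lparam >> 24) & 1),   # bit 24
        "context_code": (lparam >> 29) & 1,            # bit 29
        "previous_state": (lparam >> 30) & 1,          # bit 30
        "transition_state": (lparam >> 31) & 1,        # bit 31
    }


# Illustrative value only: repeat count 1, scan code 0x1C.
example = (0x1C << 16) | 1
fields = decode_keydown_lparam(example)
print(fields)
```

The same masks work in any language; a real ``WindowProc`` would apply them
to the ``lParam`` argument it receives.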
128 | 
129 | The Windows ``SendMessage`` API is a straightforward function that
130 | adds the message to a queue for the particular window handle (``hWnd``).
131 | Later, the main message processing function (called a ``WindowProc``) assigned
132 | to the ``hWnd`` is called in order to process each message in the queue.
133 | 
134 | The window (``hWnd``) that is active is actually an edit control and the
135 | ``WindowProc`` in this case has a message handler for ``WM_KEYDOWN`` messages.
136 | This code looks within the 3rd parameter that was passed to ``SendMessage``
137 | (``wParam``) and, because it is ``VK_RETURN``, knows the user has hit the ENTER
138 | key.
139 | 
140 | (On OS X) A ``KeyDown`` NSEvent is sent to the app
141 | --------------------------------------------------
142 | 
143 | The interrupt signal triggers an interrupt event in the I/O Kit kext keyboard
144 | driver. The driver translates the signal into a key code which is passed to the
145 | OS X ``WindowServer`` process. As a result, the ``WindowServer`` dispatches an
146 | event to any appropriate (e.g. active or listening) applications through their
147 | Mach port where it is placed into an event queue. Events can then be read from
148 | this queue by threads with sufficient privileges calling the
149 | ``mach_ipc_dispatch`` function. This most commonly occurs through, and is
150 | handled by, an ``NSApplication`` main event loop, via an ``NSEvent`` of
151 | ``NSEventType`` ``KeyDown``.
152 | 
153 | (On GNU/Linux) the Xorg server listens for keycodes
154 | ---------------------------------------------------
155 | 
156 | When a graphical ``X server`` is used, ``X`` will use the generic event
157 | driver ``evdev`` to acquire the keypress. A re-mapping of keycodes to scancodes
158 | is made with ``X server`` specific keymaps and rules.
159 | When the scancode mapping of the key pressed is complete, the ``X server``
160 | sends the character to the ``window manager`` (DWM, metacity, i3, etc), so the
161 | ``window manager`` in turn sends the character to the focused window.
162 | The graphical API of the window that receives the character prints the
163 | appropriate font symbol in the appropriate focused field.
164 | 
165 | Parse URL
166 | ---------
167 | 
168 | * The browser now has the following information contained in the URL (Uniform
169 |   Resource Locator):
170 | 
171 |     - ``Protocol``  "http"
172 |         Use 'Hyper Text Transfer Protocol'
173 | 
174 |     - ``Resource``  "/"
175 |         Retrieve main (index) page
176 | 
177 | 
178 | Is it a URL or a search term?
179 | -----------------------------
180 | 
181 | When no protocol or valid domain name is given the browser proceeds to feed
182 | the text given in the address box to the browser's default web search engine.
183 | In many cases the URL has a special piece of text appended to it to tell the
184 | search engine that it came from a particular browser's URL bar.
185 | 
186 | Convert non-ASCII Unicode characters in the hostname
187 | ----------------------------------------------------
188 | 
189 | * The browser checks the hostname for characters that are not in ``a-z``,
190 |   ``A-Z``, ``0-9``, ``-``, or ``.``.
191 | * Since the hostname is ``google.com`` there won't be any, but if there were
192 |   the browser would apply `Punycode`_ encoding to the hostname portion of the
193 |   URL.
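
As a rough illustration of that conversion, Python's built-in ``idna`` codec
applies Punycode to each non-ASCII label of a hostname. This is only a sketch:
the codec implements the older IDNA 2003 rules, browsers may follow newer
ones, and the helper name ``to_ascii_hostname`` and the ``bücher.example``
hostname are made up for the example.

```python
def to_ascii_hostname(hostname: str) -> str:
    """Return the ASCII form of a hostname, Punycode-encoding labels as needed."""
    # The "idna" codec leaves pure-ASCII labels untouched and prefixes
    # converted labels with "xn--".
    return hostname.encode("idna").decode("ascii")


print(to_ascii_hostname("google.com"))        # already ASCII, so unchanged
print(to_ascii_hostname("bücher.example"))    # non-ASCII label gets encoded
```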
194 | 
195 | Check HSTS list
196 | ---------------
197 | * The browser checks its "preloaded HSTS (HTTP Strict Transport Security)"
198 |   list. This is a list of websites that have requested to be contacted via
199 |   HTTPS only.
200 | * If the website is in the list, the browser sends its request via HTTPS
201 |   instead of HTTP. Otherwise, the initial request is sent via HTTP.
202 |   (Note that a website can still use the HSTS policy *without* being in the
203 |   HSTS list.  The first HTTP request to the website by a user will receive a
204 |   response requesting that the user only send HTTPS requests.  However, this
205 |   single HTTP request could potentially leave the user vulnerable to a
206 |   `downgrade attack`_, which is why the HSTS list is included in modern web
207 |   browsers.)
208 | 
209 | DNS lookup
210 | ----------
211 | 
212 | * The browser checks if the domain is in its cache (to see the DNS cache in
213 |   Chrome, go to `chrome://net-internals/#dns <chrome://net-internals/#dns>`_).
214 | * If not found, the browser calls the ``gethostbyname`` library function
215 |   (varies by OS) to do the lookup.
216 | * ``gethostbyname`` checks if the hostname can be resolved by reference in the
217 |   local ``hosts`` file (whose location `varies by OS`_) before trying to
218 |   resolve the hostname through DNS.
219 | * If ``gethostbyname`` neither has it cached nor can find it in the ``hosts``
220 |   file, it makes a request to the DNS server configured in the network
221 |   stack. This is typically the local router or the ISP's caching DNS server.
222 | * If the DNS server is on the same subnet, the network library follows the
223 |   ``ARP process`` below for the DNS server.
224 | * If the DNS server is on a different subnet, the network library follows
225 |   the ``ARP process`` below for the default gateway IP.
226 | 
227 | 
228 | ARP process
229 | -----------
230 | 
231 | In order to send an ARP (Address Resolution Protocol) broadcast, the network
232 | stack library needs the target IP address to look up. It also needs to know
233 | the MAC address of the interface it will use to send out the ARP broadcast.
234 | 
235 | The ARP cache is first checked for an ARP entry for our target IP. If it is in
236 | the cache, the library function returns the result: Target IP = MAC.
237 | 
238 | If the entry is not in the ARP cache:
239 | 
240 | * The route table is looked up, to see if the Target IP address is on any of
241 |   the subnets on the local route table. If it is, the library uses the
242 |   interface associated with that subnet. If it is not, the library uses the
243 |   interface that has the subnet of our default gateway.
244 | 
245 | * The MAC address of the selected network interface is looked up.
246 | 
247 | * The network library sends a Layer 2 (data link layer of the `OSI model`_)
248 |   ARP request:
249 | 
250 | ``ARP Request``::
251 | 
252 |     Sender MAC: interface:mac:address:here
253 |     Sender IP: interface.ip.goes.here
254 |     Target MAC: FF:FF:FF:FF:FF:FF (Broadcast)
255 |     Target IP: target.ip.goes.here
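
To make the request above concrete, here is a minimal sketch that packs an ARP
request into an Ethernet frame with Python's ``struct`` module. The MAC and IP
addresses are placeholders, and ``build_arp_request`` is a hypothetical helper.
One assumption worth flagging: in this sketch the broadcast address goes in
the Ethernet header, while the target hardware field inside the ARP payload is
zero-filled because it is not yet known.

```python
import socket
import struct


def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Build an Ethernet frame carrying an ARP request for IPv4."""
    broadcast_mac = b"\xff" * 6
    # Ethernet header: destination MAC, source MAC, EtherType 0x0806 (ARP).
    ethernet_header = struct.pack("!6s6sH", broadcast_mac, sender_mac, 0x0806)
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6,                            # hardware address length
        4,                            # protocol address length
        1,                            # operation: 1 = request
        sender_mac,
        socket.inet_aton(sender_ip),
        b"\x00" * 6,                  # target MAC: unknown, zero-filled
        socket.inet_aton(target_ip),
    )
    return ethernet_header + arp_payload


# Placeholder addresses from documentation ranges.
frame = build_arp_request(bytes.fromhex("020000000001"), "192.0.2.10", "192.0.2.1")
print(len(frame), frame.hex())
```

Actually transmitting such a frame needs a raw socket and elevated
privileges, which is why the operating system normally does this on the
application's behalf.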
256 | 
257 | Depending on what type of hardware is between the computer and the router:
258 | 
259 | Directly connected:
260 | 
261 | * If the computer is directly connected to the router, the router responds
262 |   with an ``ARP Reply`` (see below).
263 | 
264 | Hub:
265 | 
266 | * If the computer is connected to a hub, the hub will broadcast the ARP
267 |   request out of all other ports. If the router is connected on the same "wire",
268 |   it will respond with an ``ARP Reply`` (see below).
269 | 
270 | Switch:
271 | 
272 | * If the computer is connected to a switch, the switch will check its local
273 |   CAM/MAC table to see which port has the MAC address we are looking for. If
274 |   the switch has no entry for the MAC address, it will rebroadcast the ARP
275 |   request to all other ports.
276 | 
277 | * If the switch has an entry in the MAC/CAM table, it will send the ARP request
278 |   to the port that has the MAC address we are looking for.
279 | 
280 | * If the router is on the same "wire", it will respond with an ``ARP Reply``
281 |   (see below)
282 | 
283 | ``ARP Reply``::
284 | 
285 |     Sender MAC: target:mac:address:here
286 |     Sender IP: target.ip.goes.here
287 |     Target MAC: interface:mac:address:here
288 |     Target IP: interface.ip.goes.here
289 | 
290 | Now that the network library has the MAC address of either our DNS server or
291 | the default gateway, it can resume its DNS process:
292 | 
293 | * The DNS client establishes a socket to UDP port 53 on the DNS server,
294 |   using a source port above 1023.
295 | * If the response size is too large, TCP will be used instead.
296 | * If the local/ISP DNS server does not have it, then a recursive search is
297 |   requested and that flows up the list of DNS servers until the SOA is reached,
298 |   and if found an answer is returned.
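
For a sense of what travels to UDP port 53, the sketch below hand-builds a
single-question DNS query for an A record. It is a simplified illustration,
not what any particular resolver library does: the helper name
``build_dns_query`` and the fixed query ID are assumptions, and a real client
would pick a random ID.

```python
import struct


def build_dns_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a DNS query message asking for the A record of hostname."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = Internet.
    question = qname + struct.pack("!HH", 1, 1)
    return header + question


query = build_dns_query("google.com")
print(len(query), query.hex())
```

Sending ``query`` with ``socket.sendto`` to a resolver's port 53 over a
``SOCK_DGRAM`` socket would produce the exchange described in the list above.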
299 | 
300 | Opening of a socket
301 | -------------------
302 | Once the browser receives the IP address of the destination server, it takes
303 | that and the given port number from the URL (the HTTP protocol defaults to port
304 | 80, and HTTPS to port 443), and makes a call to the system library function
305 | named ``socket`` and requests a TCP socket stream - ``AF_INET/AF_INET6`` and
306 | ``SOCK_STREAM``.
307 | 
308 | * This request is first passed to the Transport Layer where a TCP segment is
309 |   crafted. The destination port is added to the header, and a source port is
310 |   chosen from within the kernel's dynamic port range (ip_local_port_range in
311 |   Linux).
312 | * This segment is sent to the Network Layer, which wraps an additional IP
313 |   header. The IP address of the destination server as well as that of the
314 |   current machine is inserted to form a packet.
315 | * The packet next arrives at the Link Layer. A frame header is added that
316 |   includes the MAC address of the machine's NIC as well as the MAC address of
317 |   the gateway (local router). As before, if the kernel does not know the MAC
318 |   address of the gateway, it must broadcast an ARP query to find it.
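
From the application's side, all of this layering hides behind a couple of
calls. The sketch below uses Python's ``socket`` module; to stay
self-contained, a throwaway listener on the loopback interface stands in for
the destination server, which is an assumption of the example rather than
part of the original flow.

```python
import socket

# A throwaway listener on loopback stands in for the web server.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))      # port 0 asks the kernel for any free port
listener.listen(1)
server_address = listener.getsockname()

# The client side: request a TCP stream socket and connect.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(server_address)       # the kernel picks the source port here
source_ip, source_port = client.getsockname()

connection, peer = listener.accept()
print("client used source port", source_port)

client.close()
connection.close()
listener.close()
```

The source port reported by ``getsockname`` is the one the kernel chose from
its dynamic range, and it is the same port the server sees as the peer.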
319 | 
320 | At this point the packet is ready to be transmitted through either:
321 | 
322 | * `Ethernet`_
323 | * `WiFi`_
324 | * `Cellular data network`_
325 | 
326 | For most home or small business Internet connections the packet will pass from
327 | your computer, possibly through a local network, and then through a modem
328 | (MOdulator/DEModulator) which converts digital 1's and 0's into an analog
329 | signal suitable for transmission over telephone, cable, or wireless telephony
330 | connections. On the other end of the connection is another modem which converts
331 | the analog signal back into digital data to be processed by the next `network
332 | node`_ where the from and to addresses would be analyzed further.
333 | 
334 | Most larger businesses and some newer residential connections will have fiber
335 | or direct Ethernet connections in which case the data remains digital and
336 | is passed directly to the next `network node`_ for processing.
337 | 
338 | Eventually, the packet will reach the router managing the local subnet. From
339 | there, it will continue to travel to the autonomous system's (AS) border
340 | routers, other ASes, and finally to the destination server. Each router along
341 | the way extracts the destination address from the IP header and routes it to
342 | the appropriate next hop. The time to live (TTL) field in the IP header is
343 | decremented by one for each router that passes. The packet will be dropped if
344 | the TTL field reaches zero or if the current router has no space in its queue
345 | (perhaps due to network congestion).
346 | 
347 | This send and receive happens multiple times following the TCP connection flow:
348 | 
349 | * Client chooses an initial sequence number (ISN) and sends the packet to the
350 |   server with the SYN bit set to indicate it is setting the ISN
351 | * Server receives SYN and if it's in an agreeable mood:
352 |    * Server chooses its own initial sequence number
353 |    * Server sets SYN to indicate it is choosing its ISN
354 |    * Server copies the (client ISN +1) to its ACK field and adds the ACK flag
355 |      to indicate it is acknowledging receipt of the first packet
356 | * Client acknowledges the connection by sending a packet:
357 |    * Increases its own sequence number
358 |    * Increases the receiver acknowledgment number
359 |    * Sets ACK field
360 | * Data is transferred as follows:
361 |    * As one side sends N data bytes, it increases its SEQ by that number
362 |    * When the other side acknowledges receipt of that packet (or a string of
363 |      packets), it sends an ACK packet with the ACK value equal to the last
364 |      received sequence from the other
365 | * To close the connection:
366 |    * The closer sends a FIN packet
367 |    * The other side ACKs the FIN packet and sends its own FIN
368 |    * The closer acknowledges the other side's FIN with an ACK
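
The sequence and acknowledgment arithmetic of the opening handshake can be
modelled in a few lines. This is a toy model with no networking involved: the
``Segment`` class, the function name, and the example initial sequence numbers
are all hypothetical.

```python
from dataclasses import dataclass

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32 bits and wrap around


@dataclass
class Segment:
    flags: frozenset
    seq: int
    ack: int = 0


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three segments exchanged when a connection opens."""
    syn = Segment(frozenset({"SYN"}), seq=client_isn)
    # The server acknowledges the client's ISN + 1 and announces its own ISN.
    syn_ack = Segment(
        frozenset({"SYN", "ACK"}),
        seq=server_isn,
        ack=(syn.seq + 1) % SEQ_SPACE,
    )
    # The client advances its own sequence number and acknowledges the
    # server's ISN + 1.
    ack = Segment(
        frozenset({"ACK"}),
        seq=syn_ack.ack,
        ack=(syn_ack.seq + 1) % SEQ_SPACE,
    )
    return [syn, syn_ack, ack]


segments = three_way_handshake(client_isn=1000, server_isn=5000)
for segment in segments:
    print(sorted(segment.flags), segment.seq, segment.ack)
```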
369 | 
370 | TLS handshake
371 | -------------
372 | * The client computer sends a ``ClientHello`` message to the server with its
373 |   Transport Layer Security (TLS) version, list of cipher algorithms and
374 |   compression methods available.
375 | 
376 | * The server replies with a ``ServerHello`` message to the client with the
377 |   TLS version, selected cipher, selected compression methods and the server's
378 |   public certificate signed by a CA (Certificate Authority). The certificate
379 |   contains a public key that will be used by the client to encrypt the rest of
380 |   the handshake until a symmetric key can be agreed upon.
381 | 
382 | * The client verifies the server digital certificate against its list of
383 |   trusted CAs. If trust can be established based on the CA, the client
384 |   generates a string of pseudo-random bytes and encrypts this with the server's
385 |   public key. These random bytes can be used to determine the symmetric key.
386 | 
387 | * The server decrypts the random bytes using its private key and uses these
388 |   bytes to generate its own copy of the symmetric master key.
389 | 
390 | * The client sends a ``Finished`` message to the server, encrypting a hash of
391 |   the transmission up to this point with the symmetric key.
392 | 
393 | * The server generates its own hash, and then decrypts the client-sent hash
394 |   to verify that it matches. If it does, it sends its own ``Finished`` message
395 |   to the client, also encrypted with the symmetric key.
396 | 
397 | * From now on the TLS session transmits the application (HTTP) data encrypted
398 |   with the agreed symmetric key.
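
Applications rarely implement these steps themselves; a TLS library does. As a
hedged sketch, this is how a client could prepare for the handshake with
Python's standard ``ssl`` module. The choice of TLS 1.2 as the minimum version
is an assumption made for the example, not something the steps above require.

```python
import ssl

# Loads the system's trusted CA certificates and enables certificate
# and hostname verification.
context = ssl.create_default_context()
context.minimum_version = ssl.TLSVersion.TLSv1_2  # assumption for this example

# The cipher suites this client would be willing to negotiate.
offered_ciphers = [cipher["name"] for cipher in context.get_ciphers()]
print(len(offered_ciphers), "cipher suites enabled")
```

Calling ``context.wrap_socket(sock, server_hostname="google.com")`` on a
connected TCP socket would then run the handshake and return a socket that
encrypts application data.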
399 | 
400 | If a packet is dropped
401 | ----------------------
402 | 
403 | Sometimes, due to network congestion or flaky hardware connections, TLS packets
404 | will be dropped before they get to their final destination. The sender then has
405 | to decide how to react. The algorithm for this is called `TCP congestion
406 | control`_. This varies depending on the sender; the most common algorithms are
407 | `cubic`_ on newer operating systems and `New Reno`_ on almost all others.
408 | 
409 | * Client chooses a `congestion window`_ based on the `maximum segment size`_
410 |   (MSS) of the connection.
411 | * For each packet acknowledged, the window grows by one MSS (doubling every
412 |   round trip) up to the 'slow-start threshold', which may be adaptive.
413 | * After reaching the slow-start threshold, the window increases additively for
414 |   each round trip. If a packet is dropped, the window is reduced
415 |   multiplicatively (halved in New Reno) before growing again.
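
The growth pattern is easier to see in a small simulation. The sketch below is
a deliberately simplified, Reno-style model that works in whole segments per
round trip rather than per acknowledgment. The function name, the starting
window of one segment, and the halving on loss are assumptions of the model,
not a description of any specific kernel.

```python
def simulate_cwnd(round_trips: int, ssthresh: int, loss_at: set) -> list:
    """Return the congestion window (in segments) at the start of each round trip."""
    cwnd = 1
    history = []
    for rtt in range(round_trips):
        history.append(cwnd)
        if rtt in loss_at:
            # Loss detected: halve the window and remember it as the new threshold.
            ssthresh = max(cwnd // 2, 1)
            cwnd = ssthresh
        elif cwnd < ssthresh:
            # Slow start: the window doubles each round trip.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: the window grows by one segment per round trip.
            cwnd += 1
    return history


history = simulate_cwnd(round_trips=10, ssthresh=8, loss_at={6})
print(history)
```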
416 | 
417 | HTTP protocol
418 | -------------
419 | 
420 | If the web browser used was written by Google, instead of sending an HTTP
421 | request to retrieve the page, it will send a request to try and negotiate with
422 | the server an "upgrade" from HTTP to the SPDY protocol.
423 | 
424 | If the client is using the HTTP protocol and does not support SPDY, it sends a
425 | request to the server of the form::
426 | 
427 |     GET / HTTP/1.1
428 |     Host: google.com
429 |     Connection: close
430 |     [other headers]
431 | 
432 | where ``[other headers]`` refers to a series of colon-separated key-value pairs
433 | formatted as per the HTTP specification and separated by single newlines.
434 | (This assumes the web browser being used doesn't have any bugs violating the
435 | HTTP spec. This also assumes that the web browser is using ``HTTP/1.1``,
436 | otherwise it may not include the ``Host`` header in the request and the version
437 | specified in the ``GET`` request will either be ``HTTP/1.0`` or ``HTTP/0.9``.)
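
A minimal sketch of assembling that request as bytes follows. The helper name
``build_get_request`` is hypothetical, and only the three headers shown above
are included. On the wire each line ends with a carriage return and line feed,
and an empty line terminates the header section.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Serialize a bare-bones HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
    ]
    # Lines are joined with CRLF; the final blank line ends the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_get_request("google.com")
print(request.decode("ascii"))
```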
438 | 
439 | HTTP/1.1 defines the "close" connection option for the sender to signal that
440 | the connection will be closed after completion of the response. For example::
441 | 
442 |     Connection: close
443 | 
444 | HTTP/1.1 applications that do not support persistent connections MUST include
445 | the "close" connection option in every message.
446 | 
447 | After sending the request and headers, the web browser sends a single blank
448 | newline to the server indicating that the content of the request is done.
449 | 
450 | The server responds with a status code denoting the outcome of the request,
451 | in a response of the form::
452 | 
453 |     HTTP/1.1 200 OK
454 |     [response headers]
455 | 
456 | This is followed by a single newline, and then a payload of the HTML content
457 | of ``www.google.com``. The server may then either close the connection, or if
458 | headers sent by the client requested it, keep the connection open to be reused
459 | for further requests.
460 | 
461 | If the HTTP headers sent by the web browser included sufficient information for
462 | the webserver to determine if the version of the file cached by the web
463 | browser has been unmodified since the last retrieval (i.e. if the web browser
464 | sent a stored ``ETag`` value in an ``If-None-Match`` header), it may instead
465 | respond with a response of the form::
466 | 
467 |     HTTP/1.1 304 Not Modified
468 |     [response headers]
469 | 
470 | and no payload, and the web browser instead retrieves the HTML from its cache.
471 | 
472 | After parsing the HTML, the web browser (and server) repeats this process
473 | for every resource (image, CSS, favicon.ico, etc) referenced by the HTML page,
474 | except instead of ``GET / HTTP/1.1`` the request will be
475 | ``GET /$(URL relative to www.google.com) HTTP/1.1``.
476 | 
477 | If the HTML referenced a resource on a different domain than
478 | ``www.google.com``, the web browser goes back to the steps involved in
479 | resolving the other domain, and follows all steps up to this point for that
480 | domain. The ``Host`` header in the request will be set to the appropriate
481 | server name instead of ``google.com``.
482 | 
483 | HTTP Server Request Handle
484 | --------------------------
485 | The HTTPD (HTTP Daemon) server is the one handling the requests/responses on
486 | the server-side. The most common HTTPD servers are Apache or nginx for Linux
487 | and IIS for Windows.
488 | 
489 | * The HTTPD (HTTP Daemon) receives the request.
490 | * The server breaks down the request to the following parameters:
491 |    * HTTP Request Method (either ``GET``, ``HEAD``, ``POST``, ``PUT``,
492 |      ``PATCH``, ``DELETE``, ``CONNECT``, ``OPTIONS``, or ``TRACE``). In the
493 |      case of a URL entered directly into the address bar, this will be ``GET``.
494 |    * Domain, in this case - google.com.
495 |    * Requested path/page, in this case - / (as no specific path/page was
496 |      requested, / is the default path).
497 | * The server verifies that there is a Virtual Host configured on the server
498 |   that corresponds with google.com.
499 | * The server verifies that google.com can accept GET requests.
500 | * The server verifies that the client is allowed to use this method
501 |   (by IP, authentication, etc.).
502 | * If the server has a rewrite module installed (like mod_rewrite for Apache or
503 |   URL Rewrite for IIS), it tries to match the request against one of the
504 |   configured rules. If a matching rule is found, the server uses that rule to
505 |   rewrite the request.
506 | * The server goes to pull the content that corresponds with the request.
507 |   In our case it will fall back to the index file, as "/" is the main file
508 |   (some cases can override this, but this is the most common method).
509 | * The server parses the file according to the handler. If Google
510 |   is running on PHP, the server uses PHP to interpret the index file, and
511 |   streams the output to the client.
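
The first steps in the list above, breaking the request into method, domain,
and path, can be sketched as follows. This is an illustration only: real
servers such as Apache or nginx use far more defensive parsers, and the
function name ``parse_request`` is hypothetical.

```python
def parse_request(raw: bytes) -> dict:
    """Extract the method, path, version, and Host header from a raw request."""
    # Everything before the first blank line is the request line plus headers.
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    request_line, *header_lines = head.split("\r\n")
    method, path, version = request_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return {
        "method": method,
        "path": path,
        "version": version,
        "host": headers.get("host"),
    }


parsed = parse_request(
    b"GET / HTTP/1.1\r\nHost: google.com\r\nConnection: close\r\n\r\n"
)
print(parsed)
```

The ``host`` value is what a server would compare against its configured
virtual hosts.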
512 | 
513 | Behind the scenes of the Browser
514 | ----------------------------------
515 | 
516 | Once the server supplies the resources (HTML, CSS, JS, images, etc.)
517 | to the browser, the browser processes them as follows:
518 | 
519 | * Parsing - HTML, CSS, JS
520 | * Rendering - Construct DOM Tree → Render Tree → Layout of Render Tree →
521 |   Painting the render tree
522 | 
523 | Browser
524 | -------
525 | 
526 | The browser's functionality is to present the web resource you choose, by
527 | requesting it from the server and displaying it in the browser window.
528 | The resource is usually an HTML document, but may also be a PDF,
529 | image, or some other type of content. The location of the resource is
530 | specified by the user using a URI (Uniform Resource Identifier).
531 | 
532 | The way the browser interprets and displays HTML files is specified
533 | in the HTML and CSS specifications. These specifications are maintained
534 | by the W3C (World Wide Web Consortium) organization, which is the
535 | standards organization for the web.
536 | 
537 | Browser user interfaces have a lot in common with each other. Among the
538 | common user interface elements are:
539 | 
540 | * An address bar for inserting a URI
541 | * Back and forward buttons
542 | * Bookmarking options
543 | * Refresh and stop buttons for refreshing or stopping the loading of
544 |   current documents
545 | * Home button that takes you to your home page
546 | 
547 | **Browser High-Level Structure**
548 | 
549 | The components of the browsers are:
550 | 
551 | * **User interface:** The user interface includes the address bar,
552 |   back/forward button, bookmarking menu, etc. Every part of the browser
553 |   display except the window where you see the requested page.
554 | * **Browser engine:** The browser engine marshals actions between the UI
555 |   and the rendering engine.
556 | * **Rendering engine:** The rendering engine is responsible for displaying
557 |   requested content. For example, if the requested content is HTML, the
558 |   rendering engine parses HTML and CSS, and displays the parsed content on
559 |   the screen.
560 | * **Networking:** The networking handles network calls such as HTTP requests,
561 |   using different implementations for different platforms behind a
562 |   platform-independent interface.
563 | * **UI backend:** The UI backend is used for drawing basic widgets like combo
564 |   boxes and windows. This backend exposes a generic interface that is not
565 |   platform-specific.
566 |   Underneath it uses operating system user interface methods.
567 | * **JavaScript engine:** The JavaScript engine is used to parse and
568 |   execute JavaScript code.
569 | * **Data storage:** The data storage is a persistence layer. The browser may
570 |   need to save all sorts of data locally, such as cookies. Browsers also
571 |   support storage mechanisms such as localStorage, IndexedDB, WebSQL and
572 |   FileSystem.
573 | 
574 | HTML parsing
575 | ------------
576 | 
577 | The rendering engine starts getting the contents of the requested
578 | document from the networking layer. This will usually be done in 8kB chunks.
579 | 
580 | The primary job of the HTML parser is to parse the HTML markup into a parse tree.
581 | 
582 | The output tree (the "parse tree") is a tree of DOM element and attribute
583 | nodes. DOM is short for Document Object Model. It is the object representation
584 | of the HTML document and the interface of HTML elements to the outside world
585 | like JavaScript. The root of the tree is the "Document" object. Prior to
586 | any manipulation via scripting, the DOM has an almost one-to-one relation to
587 | the markup.
588 | 
589 | **The parsing algorithm**
590 | 
591 | HTML cannot be parsed using the regular top-down or bottom-up parsers.
592 | 
593 | The reasons are:
594 | 
595 | * The forgiving nature of the language.
596 | * The fact that browsers have traditional error tolerance to support well
597 |   known cases of invalid HTML.
598 | * The parsing process is reentrant. For other languages, the source doesn't
599 |   change during parsing, but in HTML, dynamic code (such as script elements
600 |   containing ``document.write()`` calls) can add extra tokens, so the parsing
601 |   process actually modifies the input.
602 | 
603 | Unable to use the regular parsing techniques, the browser utilizes a custom
604 | parser for parsing HTML. The parsing algorithm is described in
605 | detail by the HTML5 specification.
606 | 
607 | The algorithm consists of two stages: tokenization and tree construction.
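
The two stages can be illustrated with Python's ``html.parser`` module, which
acts as the tokenizer, plus a toy tree builder on top of it. This is not the
HTML5 algorithm: the ``TreeBuilder`` class and its recovery rule (an end tag
closes every element opened since its matching start tag) are simplifying
assumptions made for the example.

```python
from html.parser import HTMLParser


class TreeBuilder(HTMLParser):
    """Toy tree construction driven by the tokens HTMLParser emits."""

    VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}

    def __init__(self):
        super().__init__()
        self.root = {"tag": "#document", "children": []}
        self.stack = [self.root]  # currently open elements

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "attrs": dict(attrs), "children": []}
        self.stack[-1]["children"].append(node)
        if tag not in self.VOID_ELEMENTS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest matching open element and anything left open
        # inside it; stray end tags are ignored.
        for index in range(len(self.stack) - 1, 0, -1):
            if self.stack[index]["tag"] == tag:
                del self.stack[index:]
                break

    def handle_data(self, data):
        if data.strip():
            self.stack[-1]["children"].append({"tag": "#text", "text": data})


builder = TreeBuilder()
# Note the unclosed <b>: the builder recovers instead of raising an error.
builder.feed("<html><body><p>Hello <b>world</p></body></html>")
builder.close()
paragraph = builder.root["children"][0]["children"][0]["children"][0]
print([child["tag"] for child in paragraph["children"]])
```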
608 | 
609 | **Actions when the parsing is finished**
610 | 
611 | The browser begins fetching external resources linked to the page (CSS, images,
612 | JavaScript files, etc.).
613 | 
614 | At this stage the browser marks the document as interactive and starts
615 | running scripts that are in "deferred" mode: those that should be
616 | executed after the document is parsed. The document state is then
617 | set to "complete" and a "load" event is fired.
618 | 
619 | Note that there is never an "Invalid Syntax" error on an HTML page. Browsers
620 | fix any invalid content and carry on.
621 | 
622 | CSS interpretation
623 | ------------------
624 | 
625 | * Parse CSS files, ``<style>`` tag contents, and ``style`` attribute
626 |   values using `"CSS lexical and syntax grammar"`_.
627 | * Each CSS file is parsed into a ``StyleSheet`` object, where each object
628 |   contains CSS rules with selectors and objects corresponding to CSS grammar.
629 | * A CSS parser can be top-down or bottom-up, depending on the parser
630 |   generator used.
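
As a rough illustration of the steps above, the following Python sketch turns
CSS text into a ``StyleSheet`` object holding rules with selectors and
declarations. It is a simplification under stated assumptions: the ``Rule``,
``StyleSheet`` and ``parse_css`` names are hypothetical, the parsing is
regex-based instead of following the CSS grammar, and at-rules and nested
blocks are not handled.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Rule:
    selectors: list      # e.g. ["h1", "h2"]
    declarations: dict   # e.g. {"color": "red"}


@dataclass
class StyleSheet:
    rules: list = field(default_factory=list)


# Matches "selector text { declaration block }" without nested braces.
RULE_RE = re.compile(r"([^{}]+)\{([^{}]*)\}")


def parse_css(text):
    """Parse flat CSS rules into a StyleSheet (illustrative only)."""
    text = re.sub(r"/\*.*?\*/", "", text, flags=re.S)  # strip comments
    sheet = StyleSheet()
    for selector_text, body in RULE_RE.findall(text):
        selectors = [s.strip() for s in selector_text.split(",") if s.strip()]
        declarations = {}
        for declaration in body.split(";"):
            if ":" in declaration:
                prop, value = declaration.split(":", 1)
                declarations[prop.strip().lower()] = value.strip()
        sheet.rules.append(Rule(selectors, declarations))
    return sheet
```

For example, ``parse_css("h1, h2 { color: red }")`` produces one rule whose
selectors are ``h1`` and ``h2`` and whose declarations map ``color`` to
``red``.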
631 | 
632 | Page Rendering
633 | --------------
634 | 
635 | * Create a 'Frame Tree' or 'Render Tree' by traversing the DOM nodes, and
636 |   calculating the CSS style values for each node.
637 | * Calculate the preferred width of each node in the 'Frame Tree' bottom-up
638 |   by summing the preferred width of the child nodes and the node's
639 |   horizontal margins, borders, and padding.
640 | * Calculate the actual width of each node top-down by allocating each node's
641 |   available width to its children.
642 | * Calculate the height of each node bottom-up by applying text wrapping and
643 |   summing the child node heights and the node's margins, borders, and padding.
644 | * Calculate the coordinates of each node using the information calculated
645 |   above.
646 | * More complicated steps are taken when elements are ``floated``,
647 |   positioned ``absolutely`` or ``relatively``, or other complex features
648 |   are used. See
649 |   http://dev.w3.org/csswg/css2/ and http://www.w3.org/Style/CSS/current-work
650 |   for more details.
651 | * Create layers to describe which parts of the page can be animated as a group
652 |   without being re-rasterized. Each frame/render object is assigned to a layer.
653 | * Textures are allocated for each layer of the page.
654 | * The frame/render objects for each layer are traversed and drawing commands
655 |   are executed for their respective layer. This may be rasterized by the CPU
656 |   or drawn on the GPU directly using D2D/SkiaGL.
657 | * All of the above steps may reuse calculated values from the last time the
658 |   webpage was rendered, so that incremental changes require less work.
659 | * The page layers are sent to the compositing process where they are combined
660 |   with layers for other visible content like the browser chrome, iframes
661 |   and addon panels.
662 | * Final layer positions are computed and the composite commands are issued
663 |   via Direct3D/OpenGL. The GPU command buffer(s) are flushed to the GPU for
664 |   asynchronous rendering and the frame is sent to the window server.
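
The width, height and coordinate steps above can be sketched in Python for
the simplest case: block boxes stacked vertically. This is a sketch under
assumptions and not how any particular engine works. The ``Box`` class and
``layout`` function are hypothetical, each box uses one value per edge on all
four sides, text wrapping is assumed to have already produced
``content_height`` for leaf boxes, and margin collapsing, floats and
positioning are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    name: str
    margin: int = 0
    border: int = 0
    padding: int = 0
    content_height: int = 0  # height of leaf content, e.g. wrapped text
    children: list = field(default_factory=list)
    # Results of layout: position and size of the border box.
    x: int = 0
    y: int = 0
    width: int = 0
    height: int = 0


def layout(box, x, y, available_width):
    """Lay out a box and its children; return the outer height it consumed."""
    inset = box.border + box.padding

    # Width is resolved top-down from the space the parent offers.
    box.x = x + box.margin
    box.y = y + box.margin
    box.width = available_width - 2 * box.margin

    content_x = box.x + inset
    content_top = box.y + inset
    content_width = box.width - 2 * inset

    # Height is resolved bottom-up by stacking the children vertically.
    cursor_y = content_top
    for child in box.children:
        cursor_y += layout(child, content_x, cursor_y, content_width)

    if box.children:
        content_height = cursor_y - content_top
    else:
        content_height = box.content_height

    box.height = content_height + 2 * inset
    return box.height + 2 * box.margin
```

Laying out a ``body`` box with a margin of 8 in an 800 unit wide viewport
gives its children 784 units of available width. The height of ``body`` is
known only after all of its children have been laid out.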
665 | 
666 | GPU Rendering
667 | -------------
668 | 
669 | * During the rendering process the graphical computing layers can use the
670 |   general-purpose ``CPU`` or the graphics processor (``GPU``).
671 | 
672 | * When using the ``GPU`` for graphical rendering computations, the graphical
673 |   software layers split the task into multiple pieces so that it can take
674 |   advantage of the massive parallelism of the ``GPU`` for the floating-point
675 |   calculations required for the rendering process.
676 | 
677 | 
678 | Window Server
679 | -------------
680 | 
681 | Post-rendering and user-induced execution
682 | -----------------------------------------
683 | 
684 | After rendering has been completed, the browser executes JavaScript code as a result
685 | of some timing mechanism (such as a Google Doodle animation) or user
686 | interaction (typing a query into the search box and receiving suggestions).
687 | Plugins such as Flash or Java may execute as well, although not at this time on
688 | the Google homepage. Scripts can cause additional network requests to be
689 | performed, as well as modify the page or its layout, causing another round of
690 | page rendering and painting.
691 | 
692 | .. _`Creative Commons Zero`: https://creativecommons.org/publicdomain/zero/1.0/
693 | .. _`"CSS lexical and syntax grammar"`: http://www.w3.org/TR/CSS2/grammar.html
694 | .. _`Punycode`: https://en.wikipedia.org/wiki/Punycode
695 | .. _`Ethernet`: http://en.wikipedia.org/wiki/IEEE_802.3
696 | .. _`WiFi`: https://en.wikipedia.org/wiki/IEEE_802.11
697 | .. _`Cellular data network`: https://en.wikipedia.org/wiki/Cellular_data_communication_protocol
698 | .. _`analog-to-digital converter`: https://en.wikipedia.org/wiki/Analog-to-digital_converter
699 | .. _`network node`: https://en.wikipedia.org/wiki/Computer_network#Network_nodes
700 | .. _`TCP congestion control`: https://en.wikipedia.org/wiki/TCP_congestion_control
701 | .. _`cubic`: https://en.wikipedia.org/wiki/CUBIC_TCP
702 | .. _`New Reno`: https://en.wikipedia.org/wiki/TCP_congestion_control#TCP_New_Reno
703 | .. _`congestion window`: https://en.wikipedia.org/wiki/TCP_congestion_control#Congestion_window
704 | .. _`maximum segment size`: https://en.wikipedia.org/wiki/Maximum_segment_size
705 | .. _`varies by OS` : https://en.wikipedia.org/wiki/Hosts_%28file%29#Location_in_the_file_system
706 | .. _`简体中文`: https://github.com/skyline75489/what-happens-when-zh_CN
707 | .. _`한국어`: https://github.com/SantonyChoi/what-happens-when-KR
708 | .. _`日本語`: https://github.com/tettttsuo/what-happens-when-JA
709 | .. _`downgrade attack`: http://en.wikipedia.org/wiki/SSL_stripping
710 | .. _`OSI Model`: https://en.wikipedia.org/wiki/OSI_model
711 | .. _`Spanish`: https://github.com/gonzaleztroyano/what-happens-when-ES
712 | 


--------------------------------------------------------------------------------