├── .travis.yml
└── README.rst


/.travis.yml:
--------------------------------------------------------------------------------
 1 | language: python
 2 | 
 3 | sudo: false
 4 | 
 5 | install: 
 6 |     - pip install doc8
 7 |     - npm install -g write-good
 8 | 
 9 | script:
10 |     - doc8 README.rst
11 |     - write-good README.rst --so --thereIs --cliches
12 | 


--------------------------------------------------------------------------------
/README.rst:
--------------------------------------------------------------------------------
  1 | What happens when...
  2 | ====================
  3 | 
  4 | This repository is an attempt to answer the age old interview question "What
  5 | happens when you type google.com into your browser's address box and press
  6 | enter?"
  7 | 
  8 | Except instead of the usual story, we're going to try to answer this question
  9 | in as much detail as possible. No skipping out on anything.
 10 | 
 11 | This is a collaborative process, so dig in and try to help out! There are tons
 12 | of details missing, just waiting for you to add them! So send us a pull
 13 | request, please!
 14 | 
 15 | This is all licensed under the terms of the `Creative Commons Zero`_ license.
 16 | 
 17 | Read this in `简体中文`_ (simplified Chinese). NOTE: this has not been reviewed
 18 | by the alex/what-happens-when maintainers.
 19 | 
 20 | Table of Contents
 21 | ====================
 22 | 
 23 | .. contents::
 24 |    :backlinks: none
 25 |    :local:
 26 | 
 27 | The "g" key is pressed
 28 | ----------------------
 29 | The following sections explain the physical keyboard and the OS
 30 | interrupts. But a whole lot happens after that which isn't explained
 31 | here. When you press "g", the browser receives the event and the
 32 | entire auto-complete machinery kicks into high gear. Depending on
 33 | your browser's algorithm and whether you are in private/incognito
 34 | mode, various suggestions will be presented to you in the dropdown
 35 | below the URL bar. Most of these algorithms prioritize results based
 36 | on search history and bookmarks. You are going to type "google.com",
 37 | so none of it matters, but a lot of code will run before you get
 38 | there and the suggestions will be refined with each key press. The
 39 | browser may even suggest "google.com" before you finish typing it.
 40 | 
 41 | The "enter" key bottoms out
 42 | ---------------------------
 43 | 
 44 | To pick a zero point, let's choose the Enter key on the keyboard hitting the
 45 | bottom of its range. At this point, an electrical circuit specific to the enter
 46 | key is closed (either directly or capacitively). This allows a small amount of
 47 | current to flow into the logic circuitry of the keyboard, which scans the state
 48 | of each key switch, debounces the electrical noise of the rapid intermittent
 49 | closure of the switch, and converts it to a keycode integer, in this case 13.
 50 | The keyboard controller then encodes the keycode for transport to the computer.
 51 | This is now almost universally over a Universal Serial Bus (USB) or Bluetooth
 52 | connection, but historically has been over PS/2 or ADB connections.
 53 | 
 54 | *In the case of the USB keyboard:*
 55 | 
 56 | - The USB circuitry of the keyboard is powered by the 5V supply provided over
 57 |   pin 1 from the computer's USB host controller.
 58 | 
 59 | - The keycode generated is stored by internal keyboard circuitry memory in a
 60 |   register called "endpoint".
 61 | 
 62 | - The host USB controller polls that "endpoint" every ~10ms (minimum value
 63 |   declared by the keyboard), so it gets the keycode value stored on it.
 64 | 
 65 | - This value goes to the USB SIE (Serial Interface Engine) to be converted into
 66 |   one or more USB packets that follow the low-level USB protocol.
 67 | 
 68 | - Those packets are sent by a differential electrical signal over the D+ and
 69 |   D- pins (the middle 2) at a maximum speed of 1.5 Mb/s, as an HID
 70 |   (Human Interface Device) is always declared to be a "low speed device"
 71 |   (USB 2.0 compliance).
 72 | 
 73 | - This serial signal is then decoded at the computer's host USB controller, and
 74 |   interpreted by the computer's Human Interface Device (HID) universal keyboard
 75 |   device driver.  The value of the key is then passed into the operating
 76 |   system's hardware abstraction layer.
 77 | 
 78 | *In the case of Virtual Keyboard (as in touch screen devices):*
 79 | 
 80 | - When the user puts their finger on a modern capacitive touch screen, a
 81 |   tiny amount of current gets transferred to the finger. This completes the
 82 |   circuit through the electrostatic field of the conductive layer and
 83 |   creates a voltage drop at that point on the screen. The
 84 |   ``screen controller`` then raises an interrupt reporting the coordinate of
 85 |   the key press.
 86 | 
 87 | - Then the mobile OS notifies the current focused application of a press event
 88 |   in one of its GUI elements (which now is the virtual keyboard application
 89 |   buttons).
 90 | 
 91 | - The virtual keyboard can now raise a software interrupt for sending a
 92 |   'key pressed' message back to the OS.
 93 | 
 94 | - This interrupt notifies the current focused application of a 'key pressed'
 95 |   event.
 96 | 
 97 | 
 98 | Interrupt fires [NOT for USB keyboards]
 99 | ---------------------------------------
100 | 
101 | The keyboard sends signals on its interrupt request line (IRQ), which is mapped
102 | to an ``interrupt vector`` (integer) by the interrupt controller. The CPU uses
103 | the ``Interrupt Descriptor Table`` (IDT) to map the interrupt vectors to
104 | functions (``interrupt handlers``) which are supplied by the kernel. When an
105 | interrupt arrives, the CPU indexes the IDT with the interrupt vector and runs
106 | the appropriate handler. Thus, the kernel is entered.
107 | 
108 | (On Windows) A ``WM_KEYDOWN`` message is sent to the app
109 | --------------------------------------------------------
110 | 
111 | The HID transport passes the key down event to the ``KBDHID.sys`` driver which
112 | converts the HID usage into a scancode. The corresponding virtual-key code is
113 | ``VK_RETURN`` (``0x0D``). The ``KBDHID.sys`` driver interfaces with the
114 | ``KBDCLASS.sys`` (keyboard class driver). This driver is responsible for
115 | handling all keyboard and keypad input in a secure manner. It then calls into
116 | ``Win32K.sys`` (after potentially passing the message through 3rd party
117 | keyboard filters that are installed). This all happens in kernel mode.
118 | 
119 | ``Win32K.sys`` figures out what window is the active window through the
120 | ``GetForegroundWindow()`` API. This API provides the window handle of the
121 | browser's address box. The main Windows "message pump" then calls
122 | ``SendMessage(hWnd, WM_KEYDOWN, VK_RETURN, lParam)``. ``lParam`` is a bitmask
123 | that indicates further information about the keypress: repeat count (0 in this
124 | case), the actual scan code (can be OEM dependent, but generally wouldn't be
125 | for ``VK_RETURN``), whether extended keys (e.g. alt, shift, ctrl) were also
126 | pressed (they weren't), and some other state.
127 | 
128 | The Windows ``SendMessage`` API is a straightforward function that
129 | adds the message to a queue for the particular window handle (``hWnd``).
130 | Later, the main message processing function (called a ``WindowProc``) assigned
131 | to the ``hWnd`` is called in order to process each message in the queue.
132 | 
133 | The window (``hWnd``) that is active is actually an edit control and the
134 | ``WindowProc`` in this case has a message handler for ``WM_KEYDOWN`` messages.
135 | This code looks within the 3rd parameter that was passed to ``SendMessage``
136 | (``wParam``) and, because it is ``VK_RETURN`` knows the user has hit the ENTER
137 | key.
138 | 
139 | (On OS X) A ``KeyDown`` NSEvent is sent to the app
140 | --------------------------------------------------
141 | 
142 | The interrupt signal triggers an interrupt event in the I/O Kit kext keyboard
143 | driver. The driver translates the signal into a key code which is passed to the
144 | OS X ``WindowServer`` process. As a result, the ``WindowServer`` dispatches an
145 | event to any appropriate (e.g. active or listening) applications through their
146 | Mach port where it is placed into an event queue. Events can then be read from
147 | this queue by threads with sufficient privileges calling the
148 | ``mach_ipc_dispatch`` function. This most commonly occurs through, and is
149 | handled by, an ``NSApplication`` main event loop, via an ``NSEvent`` of
150 | ``NSEventType`` ``KeyDown``.
151 | 
152 | (On GNU/Linux) the Xorg server listens for keycodes
153 | ---------------------------------------------------
154 | 
155 | When a graphical ``X server`` is used, ``X`` will use the generic event
156 | driver ``evdev`` to acquire the keypress. A re-mapping of keycodes to scancodes
157 | is made with ``X server`` specific keymaps and rules.
158 | When the scancode mapping of the key pressed is complete, the ``X server``
159 | sends the character to the ``window manager`` (DWM, metacity, i3, etc), so the
160 | ``window manager`` in turn sends the character to the focused window.
161 | The graphical API of the window that receives the character prints the
162 | appropriate font symbol in the appropriate focused field.
163 | 
164 | Parse URL
165 | ---------
166 | 
167 | * The browser now has the following information contained in the URL (Uniform
168 |   Resource Locator):
169 | 
170 |     - ``Protocol``  "http"
171 |         Use 'Hyper Text Transfer Protocol'
172 | 
173 |     - ``Resource``  "/"
174 |         Retrieve main (index) page
175 | 
176 | 
177 | Is it a URL or a search term?
178 | -----------------------------
179 | 
180 | When no protocol or valid domain name is given, the browser proceeds to feed
181 | the text given in the address box to the browser's default web search engine.
182 | In many cases the URL has a special piece of text appended to it to tell the
183 | search engine that it came from a particular browser's URL bar.
184 | 
185 | Convert non-ASCII Unicode characters in hostname
186 | ------------------------------------------------
187 | 
188 | * The browser checks the hostname for characters that are not in ``a-z``,
189 |   ``A-Z``, ``0-9``, ``-``, or ``.``.
190 | * Since the hostname is ``google.com`` there won't be any, but if there were
191 |   the browser would apply `Punycode`_ encoding to the hostname portion of the
192 |   URL.
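
The conversion can be tried out with Python's built-in ``idna`` codec, which
applies Punycode to each non-ASCII label. This is only a sketch: the codec
implements the older IDNA 2003 rules, browsers may follow newer ones, and
``to_ascii_hostname`` is a hypothetical helper name.

```python
def to_ascii_hostname(hostname: str) -> str:
    """Return the ASCII form of a hostname, Punycode-encoding labels as needed."""
    return hostname.encode("idna").decode("ascii")


if __name__ == "__main__":
    # A pure-ASCII hostname passes through unchanged.
    print(to_ascii_hostname("google.com"))
    # A hostname with a non-ASCII character gets an "xn--" label.
    print(to_ascii_hostname("münchen.de"))
```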
193 | 
194 | Check HSTS list
195 | ---------------
196 | * The browser checks its "preloaded HSTS (HTTP Strict Transport Security)"
197 |   list. This is a list of websites that have requested to be contacted via
198 |   HTTPS only.
199 | * If the website is in the list, the browser sends its request via HTTPS
200 |   instead of HTTP. Otherwise, the initial request is sent via HTTP.
201 |   (Note that a website can still use the HSTS policy *without* being in the
202 |   HSTS list.  The first HTTP request to the website by a user will receive a
203 |   response requesting that the user only send HTTPS requests.  However, this
204 |   single HTTP request could potentially leave the user vulnerable to a
205 |   `downgrade attack`_, which is why the HSTS list is included in modern web
206 |   browsers.)
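
A minimal sketch of the list check, assuming the preloaded list is a plain set
of hostnames. ``PRELOADED_HSTS_HOSTS``, ``apply_hsts``, and the
``example-bank.test`` entry are made up for illustration; a real preload list
also carries per-entry options (such as covering subdomains) that are ignored
here.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical stand-in for the browser's preloaded HSTS list.
PRELOADED_HSTS_HOSTS = {"example-bank.test"}


def apply_hsts(url: str, preload=PRELOADED_HSTS_HOSTS) -> str:
    """Rewrite http:// to https:// when the host is on the preload list."""
    parts = urlsplit(url)
    if parts.scheme == "http" and parts.hostname in preload:
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


if __name__ == "__main__":
    print(apply_hsts("http://example-bank.test/"))  # upgraded to HTTPS
    print(apply_hsts("http://other.test/"))         # left as HTTP
```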
207 | 
208 | DNS lookup
209 | ----------
210 | 
211 | * The browser checks if the domain is in its cache (to see the DNS cache in
212 |   Chrome, go to `chrome://net-internals/#dns <chrome://net-internals/#dns>`_).
213 | * If not found, the browser calls the ``gethostbyname`` library function
214 |   (varies by OS) to do the lookup.
215 | * ``gethostbyname`` checks if the hostname can be resolved by reference in the
216 |   local ``hosts`` file (whose location `varies by OS`_) before trying to
217 |   resolve the hostname through DNS.
218 | * If ``gethostbyname`` neither has it cached nor can find it in the ``hosts``
219 |   file, it makes a request to the DNS server configured in the network
220 |   stack. This is typically the local router or the ISP's caching DNS server.
221 | * If the DNS server is on the same subnet the network library follows the
222 |   ``ARP process`` below for the DNS server.
223 | * If the DNS server is on a different subnet, the network library follows
224 |   the ``ARP process`` below for the default gateway IP.
225 | 
226 | 
227 | ARP process
228 | -----------
229 | 
230 | In order to send an ARP (Address Resolution Protocol) broadcast the network
231 | stack library needs the target IP address to look up. It also needs to know the
232 | MAC address of the interface it will use to send out the ARP broadcast.
233 | 
234 | The ARP cache is first checked for an ARP entry for our target IP. If it is in
235 | the cache, the library function returns the result: Target IP = MAC.
236 | 
237 | If the entry is not in the ARP cache:
238 | 
239 | * The route table is looked up, to see if the Target IP address is on any of
240 |   the subnets on the local route table. If it is, the library uses the
241 |   interface associated with that subnet. If it is not, the library uses the
242 |   interface that has the subnet of our default gateway.
243 | 
244 | * The MAC address of the selected network interface is looked up.
245 | 
246 | * The network library sends a Layer 2 (data link layer of the `OSI model`_)
247 |   ARP request:
248 | 
249 | ``ARP Request``::
250 | 
251 |     Sender MAC: interface:mac:address:here
252 |     Sender IP: interface.ip.goes.here
253 |     Target MAC: FF:FF:FF:FF:FF:FF (Broadcast)
254 |     Target IP: target.ip.goes.here
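
For readers who want to see the bytes, here is a sketch that packs such a
request into an Ethernet frame with Python's ``struct`` module. It assumes
IPv4 over Ethernet and only builds the frame; sending it would need a raw
socket and elevated privileges. The broadcast address goes in the Ethernet
destination field, while the target hardware address inside the ARP payload is
left zeroed, which is a common convention rather than something taken from the
listing above. ``build_arp_request`` is a hypothetical helper name.

```python
import socket
import struct


def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Build an Ethernet frame carrying an ARP request (IPv4 over Ethernet)."""
    broadcast = b"\xff" * 6
    # Ethernet header: destination MAC, source MAC, EtherType 0x0806 (ARP).
    ethernet = struct.pack("!6s6sH", broadcast, sender_mac, 0x0806)
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6,                            # hardware address length
        4,                            # protocol address length
        1,                            # operation: 1 = request
        sender_mac,
        socket.inet_aton(sender_ip),
        b"\x00" * 6,                  # target MAC is what we are asking for
        socket.inet_aton(target_ip),
    )
    return ethernet + arp


if __name__ == "__main__":
    frame = build_arp_request(b"\x02\x00\x00\x00\x00\x01", "192.168.1.10", "192.168.1.1")
    print(len(frame), frame.hex())
```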
255 | 
256 | Depending on what type of hardware is between the computer and the router:
257 | 
258 | Directly connected:
259 | 
260 | * If the computer is directly connected to the router the router responds
261 |   with an ``ARP Reply`` (see below)
262 | 
263 | Hub:
264 | 
265 | * If the computer is connected to a hub, the hub will broadcast the ARP
266 |   request out all other ports. If the router is connected on the same "wire",
267 |   it will respond with an ``ARP Reply`` (see below).
268 | 
269 | Switch:
270 | 
271 | * If the computer is connected to a switch, the switch will check its local
272 |   CAM/MAC table to see which port has the MAC address we are looking for. If
273 |   the switch has no entry for the MAC address it will rebroadcast the ARP
274 |   request to all other ports.
275 | 
276 | * If the switch has an entry in the MAC/CAM table it will send the ARP request
277 |   to the port that has the MAC address we are looking for.
278 | 
279 | * If the router is on the same "wire", it will respond with an ``ARP Reply``
280 |   (see below)
281 | 
282 | ``ARP Reply``::
283 | 
284 |     Sender MAC: target:mac:address:here
285 |     Sender IP: target.ip.goes.here
286 |     Target MAC: interface:mac:address:here
287 |     Target IP: interface.ip.goes.here
288 | 
289 | Now that the network library has the MAC address of either our DNS server or
290 | the default gateway, it can resume its DNS process:
291 | 
292 | * Port 53 is opened to send a UDP request to DNS server (if the response size
293 |   is too large, TCP will be used instead).
294 | * If the local/ISP DNS server does not have it, then a recursive search is
295 |   requested and that flows up the list of DNS servers until the SOA is reached,
296 |   and if found an answer is returned.
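
To make the UDP request more concrete, this sketch builds the bytes of a
single-question DNS query for an A record with the "recursion desired" flag
set. It only constructs the message; actually sending it (for example with a
UDP socket to port 53 of the configured resolver) is left out. The function
name and the fixed query ID are assumptions for illustration.

```python
import struct


def build_dns_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a DNS query message asking for the A record of hostname."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = IN (Internet).
    question = qname + struct.pack("!HH", 1, 1)
    return header + question


if __name__ == "__main__":
    print(build_dns_query("google.com").hex())
```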
297 | 
298 | Opening of a socket
299 | -------------------
300 | Once the browser receives the IP address of the destination server, it takes
301 | that and the given port number from the URL (the HTTP protocol defaults to port
302 | 80, and HTTPS to port 443), and makes a call to the system library function
303 | named ``socket`` and requests a TCP socket stream - ``AF_INET/AF_INET6`` and
304 | ``SOCK_STREAM``.
305 | 
306 | * This request is first passed to the Transport Layer where a TCP segment is
307 |   crafted. The destination port is added to the header, and a source port is
308 |   chosen from within the kernel's dynamic port range (ip_local_port_range in
309 |   Linux).
310 | * This segment is sent to the Network Layer, which wraps an additional IP
311 |   header. The IP address of the destination server as well as that of the
312 |   current machine is inserted to form a packet.
313 | * The packet next arrives at the Link Layer. A frame header is added that
314 |   includes the MAC address of the machine's NIC as well as the MAC address of
315 |   the gateway (local router). As before, if the kernel does not know the MAC
316 |   address of the gateway, it must broadcast an ARP query to find it.
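
The ``socket`` call and the kernel-chosen source port can be observed from
Python. The sketch below connects to a throwaway listener on the loopback
interface instead of a real web server, so the listener, the addresses, and the
``open_tcp_connection`` helper are all assumptions made for the demonstration.

```python
import socket


def open_tcp_connection(host: str, port: int) -> socket.socket:
    """Request an IPv4 TCP stream socket and connect it to (host, port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect((host, port))
    return sock


if __name__ == "__main__":
    # Local stand-in for the destination server; port 0 lets the OS pick one.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    client = open_tcp_connection(*listener.getsockname())
    # The source port was picked by the kernel from its dynamic range.
    print("source port:", client.getsockname()[1])
    print("destination:", client.getpeername())

    client.close()
    listener.close()
```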
317 | 
318 | At this point the packet is ready to be transmitted through either:
319 | 
320 | * `Ethernet`_
321 | * `WiFi`_
322 | * `Cellular data network`_
323 | 
324 | For most home or small business Internet connections the packet will pass from
325 | your computer, possibly through a local network, and then through a modem
326 | (MOdulator/DEModulator) which converts digital 1's and 0's into an analog
327 | signal suitable for transmission over telephone, cable, or wireless telephony
328 | connections. On the other end of the connection is another modem which converts
329 | the analog signal back into digital data to be processed by the next `network
330 | node`_ where the from and to addresses would be analyzed further.
331 | 
332 | Most larger businesses and some newer residential connections will have fiber
333 | or direct Ethernet connections in which case the data remains digital and
334 | is passed directly to the next `network node`_ for processing.
335 | 
336 | Eventually, the packet will reach the router managing the local subnet. From
337 | there, it will continue to travel to the autonomous system's (AS) border
338 | routers, other ASes, and finally to the destination server. Each router along
339 | the way extracts the destination address from the IP header and routes it to
340 | the appropriate next hop. The time to live (TTL) field in the IP header is
341 | decremented by one at each router it passes. The packet will be dropped if
342 | the TTL field reaches zero or if the current router has no space in its queue
343 | (perhaps due to network congestion).
344 | 
345 | This send and receive happens multiple times following the TCP connection flow:
346 | 
347 | * Client chooses an initial sequence number (ISN) and sends the packet to the
348 |   server with the SYN bit set to indicate it is setting the ISN
349 | * Server receives SYN and if it's in an agreeable mood:
350 |    * Server chooses its own initial sequence number
351 |    * Server sets SYN to indicate it is choosing its ISN
352 |    * Server copies the (client ISN +1) to its ACK field and adds the ACK flag
353 |      to indicate it is acknowledging receipt of the first packet
354 | * Client acknowledges the connection by sending a packet:
355 |    * Increases its own sequence number
356 |    * Increases the receiver acknowledgment number
357 |    * Sets ACK field
358 | * Data is transferred as follows:
359 |    * As one side sends N data bytes, it increases its SEQ by that number
360 |    * When the other side acknowledges receipt of that packet (or a string of
361 |      packets), it sends an ACK packet with the ACK value equal to the last
362 |      received sequence from the other
363 | * To close the connection:
364 |    * The closer sends a FIN packet
365 |    * The other side ACKs the FIN packet and sends its own FIN
366 |    * The closer acknowledges the other side's FIN with an ACK
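
The sequence and acknowledgment arithmetic of the opening handshake can be
followed in a small simulation. Nothing is sent over a network here; the
``Segment`` class and the function name are hypothetical, and the ISNs are
supplied by the caller, whereas a real stack chooses them itself.

```python
from dataclasses import dataclass

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


@dataclass
class Segment:
    flags: frozenset
    seq: int
    ack: int = 0


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three segments exchanged to open a connection."""
    # 1. Client sends SYN carrying its initial sequence number.
    syn = Segment(frozenset({"SYN"}), seq=client_isn)
    # 2. Server replies with SYN+ACK: its own ISN, acknowledging client ISN + 1.
    syn_ack = Segment(
        frozenset({"SYN", "ACK"}),
        seq=server_isn,
        ack=(syn.seq + 1) % SEQ_SPACE,
    )
    # 3. Client acknowledges with its sequence advanced by one.
    ack = Segment(
        frozenset({"ACK"}),
        seq=(syn.seq + 1) % SEQ_SPACE,
        ack=(syn_ack.seq + 1) % SEQ_SPACE,
    )
    return [syn, syn_ack, ack]


if __name__ == "__main__":
    for segment in three_way_handshake(client_isn=1000, server_isn=5000):
        print(sorted(segment.flags), segment.seq, segment.ack)
```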
367 | 
368 | TLS handshake
369 | -------------
370 | * The client computer sends a ``ClientHello`` message to the server with its
371 |   Transport Layer Security (TLS) version, list of cipher algorithms and
372 |   compression methods available.
373 | 
374 | * The server replies with a ``ServerHello`` message to the client with the
375 |   TLS version, selected cipher, selected compression methods and the server's
376 |   public certificate signed by a CA (Certificate Authority). The certificate
377 |   contains a public key that will be used by the client to encrypt the rest of
378 |   the handshake until a symmetric key can be agreed upon.
379 | 
380 | * The client verifies the server's digital certificate against its list of
381 |   trusted CAs. If trust can be established based on the CA, the client
382 |   generates a string of pseudo-random bytes and encrypts this with the server's
383 |   public key. These random bytes can be used to determine the symmetric key.
384 | 
385 | * The server decrypts the random bytes using its private key and uses these
386 |   bytes to generate its own copy of the symmetric master key.
387 | 
388 | * The client sends a ``Finished`` message to the server, encrypting a hash of
389 |   the transmission up to this point with the symmetric key.
390 | 
391 | * The server generates its own hash, and then decrypts the client-sent hash
392 |   to verify that it matches. If it does, it sends its own ``Finished`` message
393 |   to the client, also encrypted with the symmetric key.
394 | 
395 | * From now on the TLS session transmits the application (HTTP) data encrypted
396 |   with the agreed symmetric key.
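
From application code the whole exchange is usually hidden behind a library
call. The sketch below uses Python's ``ssl`` module: the default context loads
the system's trusted CAs and turns on certificate and hostname verification,
and ``wrap_socket`` runs the handshake. The helper names are hypothetical, and
the library may negotiate a newer TLS version whose key exchange differs from
the steps described above.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Create a client context that verifies the server against trusted CAs."""
    return ssl.create_default_context()


def tls_connect(host: str, port: int = 443) -> ssl.SSLSocket:
    """Open a TCP connection and perform the TLS handshake on top of it."""
    context = make_client_context()
    raw = socket.create_connection((host, port))
    # The handshake (hello messages, certificate check, key agreement)
    # happens inside wrap_socket; it raises an ssl.SSLError on failure.
    return context.wrap_socket(raw, server_hostname=host)


if __name__ == "__main__":
    context = make_client_context()
    print(context.verify_mode, context.check_hostname)
```

Calling ``tls_connect("google.com")`` needs network access, which is why the
demonstration above only inspects the context.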
397 | 
398 | HTTP protocol
399 | -------------
400 | 
401 | If the web browser used was written by Google, instead of sending an HTTP
402 | request to retrieve the page, it will send a request to try and negotiate with
403 | the server an "upgrade" from HTTP to the SPDY protocol.
404 | 
405 | If the client is using the HTTP protocol and does not support SPDY, it sends a
406 | request to the server of the form::
407 | 
408 |     GET / HTTP/1.1
409 |     Host: google.com
410 |     Connection: close
411 |     [other headers]
412 | 
413 | where ``[other headers]`` refers to a series of colon-separated key-value pairs
414 | formatted as per the HTTP specification and separated by single new lines.
415 | (This assumes the web browser being used doesn't have any bugs violating the
416 | HTTP spec. This also assumes that the web browser is using ``HTTP/1.1``,
417 | otherwise it may not include the ``Host`` header in the request and the version
418 | specified in the ``GET`` request will either be ``HTTP/1.0`` or ``HTTP/0.9``.)
419 | 
420 | HTTP/1.1 defines the "close" connection option for the sender to signal that
421 | the connection will be closed after completion of the response. For example::
422 | 
423 |     Connection: close
424 | 
425 | HTTP/1.1 applications that do not support persistent connections MUST include
426 | the "close" connection option in every message.
427 | 
428 | After sending the request and headers, the web browser sends a single blank
429 | newline to the server indicating that the content of the request is done.
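
Putting the pieces together, the request can be assembled as a byte string. On
the wire, HTTP/1.1 ends each line with a carriage return plus line feed
(``\r\n``), and the blank line that ends the headers is an empty line of the
same kind. ``build_get_request`` is a hypothetical helper, not part of any
browser.

```python
def build_get_request(host: str, path: str = "/", extra_headers=None) -> bytes:
    """Serialize a minimal HTTP/1.1 GET request."""
    headers = {"Host": host, "Connection": "close"}
    headers.update(extra_headers or {})
    lines = [f"GET {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # One CRLF ends the last header, a second one marks the end of the request.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


if __name__ == "__main__":
    print(build_get_request("google.com").decode("ascii"))
```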
430 | 
431 | The server responds with a status code denoting the outcome of the request, in
432 | a response of the form::
433 | 
434 |     200 OK
435 |     [response headers]
436 | 
437 | This is followed by a single newline and then the payload: the HTML content of
438 | ``www.google.com``. The server may then either close the connection, or if
439 | headers sent by the client requested it, keep the connection open to be reused
440 | for further requests.
441 | 
442 | If the HTTP headers sent by the web browser included sufficient information for
443 | the web server to determine if the version of the file cached by the web
444 | browser has been unmodified since the last retrieval (i.e. if the web browser
445 | sent its cached ``ETag`` value in an ``If-None-Match`` header), it may instead
446 | respond with a response of the form::
447 | 
448 |     304 Not Modified
449 |     [response headers]
450 | 
451 | and no payload, and the web browser instead retrieves the HTML from its cache.
452 | 
453 | After parsing the HTML, the web browser (and server) repeats this process
454 | for every resource (image, CSS, favicon.ico, etc) referenced by the HTML page,
455 | except instead of ``GET / HTTP/1.1`` the request will be
456 | ``GET /$(URL relative to www.google.com) HTTP/1.1``.
457 | 
458 | If the HTML referenced a resource on a different domain than
459 | ``www.google.com``, the web browser goes back to the steps involved in
460 | resolving the other domain, and follows all steps up to this point for that
461 | domain. The ``Host`` header in the request will be set to the appropriate
462 | server name instead of ``google.com``.
463 | 
464 | HTTP Server Request Handle
465 | --------------------------
466 | The HTTPD (HTTP Daemon) server is the one handling the requests/responses on
467 | the server side. The most common HTTPD servers are Apache or nginx for Linux
468 | and IIS for Windows.
469 | 
470 | * The HTTPD (HTTP Daemon) receives the request.
471 | * The server breaks down the request to the following parameters:
472 |    * HTTP Request Method (either ``GET``, ``HEAD``, ``POST``, ``PUT``,
473 |      ``DELETE``, ``CONNECT``, ``OPTIONS``, or ``TRACE``). In the case of a URL
474 |      entered directly into the address bar, this will be ``GET``.
475 |    * Domain, in this case - google.com.
476 |    * Requested path/page, in this case - / (as no specific path/page was
477 |      requested, / is the default path).
478 | * The server verifies that there is a Virtual Host configured on the server
479 |   that corresponds with google.com.
480 | * The server verifies that google.com can accept GET requests.
481 | * The server verifies that the client is allowed to use this method
482 |   (by IP, authentication, etc.).
483 | * If the server has a rewrite module installed (like mod_rewrite for Apache or
484 |   URL Rewrite for IIS), it tries to match the request against one of the
485 |   configured rules. If a matching rule is found, the server uses that rule to
486 |   rewrite the request.
487 | * The server goes to pull the content that corresponds with the request,
488 |   in our case it will fall back to the index file, as "/" is the main file
489 |   (some cases can override this, but this is the most common method).
490 | * The server parses the file according to the handler. If Google
491 |   is running on PHP, the server uses PHP to interpret the index file, and
492 |   streams the output to the client.
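
The first step, breaking the request down into method, path, and domain, might
look roughly like this. It is a simplified sketch and not how Apache, nginx, or
IIS are implemented: it assumes a well-formed request and ignores the body,
and ``parse_request`` is a hypothetical name.

```python
def parse_request(raw: bytes) -> dict:
    """Extract method, path, version and Host from a raw HTTP request."""
    # Headers end at the first blank line; anything after it is the body.
    head, _, _body = raw.partition(b"\r\n\r\n")
    request_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    method, path, version = request_line.split(" ", 2)

    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    return {
        "method": method,
        "path": path,
        "version": version,
        "host": headers.get("host"),
    }


if __name__ == "__main__":
    request = b"GET / HTTP/1.1\r\nHost: google.com\r\nConnection: close\r\n\r\n"
    print(parse_request(request))
```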
493 | 
494 | Behind the scenes of the Browser
495 | ----------------------------------
496 | 
497 | Once the server supplies the resources (HTML, CSS, JS, images, etc.)
498 | to the browser, the browser processes them as follows:
499 | 
500 | * Parsing - HTML, CSS, JS
501 | * Rendering - Construct DOM Tree → Render Tree → Layout of Render Tree →
502 |   Painting the render tree
503 | 
504 | Browser
505 | -------
506 | 
507 | The browser's functionality is to present the web resource you choose, by
508 | requesting it from the server and displaying it in the browser window.
509 | The resource is usually an HTML document, but may also be a PDF,
510 | image, or some other type of content. The location of the resource is
511 | specified by the user using a URI (Uniform Resource Identifier).
512 | 
513 | The way the browser interprets and displays HTML files is specified
514 | in the HTML and CSS specifications. These specifications are maintained
515 | by the W3C (World Wide Web Consortium) organization, which is the
516 | standards organization for the web.
517 | 
518 | Browser user interfaces have a lot in common with each other. Among the
519 | common user interface elements are:
520 | 
521 | * An address bar for inserting a URI
522 | * Back and forward buttons
523 | * Bookmarking options
524 | * Refresh and stop buttons for refreshing or stopping the loading of
525 |   current documents
526 | * Home button that takes you to your home page
527 | 
528 | **Browser High Level Structure**
529 | 
530 | The components of the browser are:
531 | 
532 | * **User interface:** The user interface includes the address bar,
533 |   back/forward button, bookmarking menu, etc. It covers every part of the
534 |   browser display except the window where you see the requested page.
535 | * **Browser engine:** The browser engine marshals actions between the UI
536 |   and the rendering engine.
537 | * **Rendering engine:** The rendering engine is responsible for displaying
538 |   requested content. For example if the requested content is HTML, the
539 |   rendering engine parses HTML and CSS, and displays the parsed content on
540 |   the screen.
541 | * **Networking:** The networking layer handles network calls such as HTTP
542 |   requests, using different implementations for different platforms behind
543 |   a platform-independent interface.
544 | * **UI backend:** The UI backend is used for drawing basic widgets like combo
545 |   boxes and windows. This backend exposes a generic interface that is not
546 |   platform specific.
547 |   Underneath it uses operating system user interface methods.
548 | * **JavaScript engine:** The JavaScript engine is used to parse and
549 |   execute JavaScript code.
550 | * **Data storage:** The data storage is a persistence layer. The browser may
551 |   need to save all sorts of data locally, such as cookies. Browsers also
552 |   support storage mechanisms such as localStorage, IndexedDB, WebSQL and
553 |   FileSystem.
554 | 
555 | HTML parsing
556 | ------------
557 | 
558 | The rendering engine starts getting the contents of the requested
559 | document from the networking layer. This will usually be done in 8kB chunks.
560 | 
561 | The HTML parser's primary job is to parse the HTML markup into a parse tree.
562 | 
563 | The output tree (the "parse tree") is a tree of DOM element and attribute
564 | nodes. DOM is short for Document Object Model. It is the object representation
565 | of the HTML document and the interface of HTML elements to the outside world
566 | like JavaScript. The root of the tree is the "Document" object. Prior to
567 | any manipulation via scripting, the DOM has an almost one-to-one relation to
568 | the markup.
569 | 
570 | **The parsing algorithm**
571 | 
572 | HTML cannot be parsed using the regular top-down or bottom-up parsers.
573 | 
574 | The reasons are:
575 | 
576 | * The forgiving nature of the language.
577 | * The fact that browsers have traditional error tolerance to support well
578 |   known cases of invalid HTML.
579 | * The parsing process is reentrant. For other languages, the source doesn't
580 |   change during parsing, but in HTML, dynamic code (such as script elements
581 |   containing ``document.write()`` calls) can add extra tokens, so the parsing
582 |   process actually modifies the input.
583 | 
584 | Unable to use the regular parsing techniques, the browser utilizes a custom
585 | parser for parsing HTML. The parsing algorithm is described in
586 | detail by the HTML5 specification.
587 | 
588 | The algorithm consists of two stages: tokenization and tree construction.
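
The tokenization stage and the error tolerance can be seen with Python's
standard ``html.parser`` module. That module is not an implementation of the
HTML5 parsing algorithm and builds no tree, so this is only a loose analogy;
the ``EventCollector`` class and ``tokenize`` function are hypothetical names.

```python
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Record the stream of tags and text found in the markup."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


def tokenize(markup: str) -> list:
    collector = EventCollector()
    collector.feed(markup)
    collector.close()  # flush any buffered text
    return collector.events


if __name__ == "__main__":
    # The <p> element is never closed, yet no error is raised.
    for event in tokenize("<p>Hi <b>there</b>"):
        print(event)
```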
589 | 
590 | **Actions when the parsing is finished**
591 | 
592 | The browser begins fetching external resources linked to the page (CSS, images,
593 | JavaScript files, etc.).
594 | 
595 | At this stage the browser marks the document as interactive and starts
596 | parsing scripts that are in "deferred" mode: those that should be
597 | executed after the document is parsed. The document state is
598 | set to "complete" and a "load" event is fired.
599 | 
600 | Note there is never an "Invalid Syntax" error on an HTML page. Browsers fix
601 | any invalid content and go on.
602 | 
603 | CSS interpretation
604 | ------------------
605 | 
606 | * Parse CSS files, ``<style>`` tag contents, and ``style`` attribute
607 |   values using `"CSS lexical and syntax grammar"`_
608 | * Each CSS file is parsed into a ``StyleSheet object``, where each object
609 |   contains CSS rules with selectors and objects corresponding to CSS grammar.
610 | * A CSS parser can be top-down or bottom-up when a specific parser generator
611 |   is used.
612 | 
613 | Page Rendering
614 | --------------
615 | 
616 | * Create a 'Frame Tree' or 'Render Tree' by traversing the DOM nodes, and
617 |   calculating the CSS style values for each node.
618 | * Calculate the preferred width of each node in the 'Frame Tree' bottom up
619 |   by summing the preferred width of the child nodes and the node's
620 |   horizontal margins, borders, and padding.
621 | * Calculate the actual width of each node top-down by allocating each node's
622 |   available width to its children.
623 | * Calculate the height of each node bottom-up by applying text wrapping and
624 |   summing the child node heights and the node's margins, borders, and padding.
625 | * Calculate the coordinates of each node using the information calculated
626 |   above.
627 | * More complicated steps are taken when elements are ``floated``,
628 |   positioned ``absolutely`` or ``relatively``, or other complex features
629 |   are used. See
630 |   http://dev.w3.org/csswg/css2/ and http://www.w3.org/Style/CSS/current-work
631 |   for more details.
632 | * Create layers to describe which parts of the page can be animated as a group
633 |   without being re-rasterized. Each frame/render object is assigned to a layer.
634 | * Textures are allocated for each layer of the page.
635 | * The frame/render objects for each layer are traversed and drawing commands
636 |   are executed for their respective layer. The layer may be rasterized by
637 |   the CPU or drawn on the GPU directly using D2D/SkiaGL.
638 | * All of the above steps may reuse calculated values from the last time the
639 |   webpage was rendered, so that incremental changes require less work.
640 | * The page layers are sent to the compositing process where they are combined
641 |   with layers for other visible content like the browser chrome, iframes
642 |   and addon panels.
643 | * Final layer positions are computed and the composite commands are issued
644 |   via Direct3D/OpenGL. The GPU command buffer(s) are flushed to the GPU for
645 |   asynchronous rendering and the frame is sent to the window server.
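
The width, height, and coordinate passes in the list above can be sketched in
Python under heavy simplifying assumptions: every node is a block whose
children stack vertically, margins, borders, and padding are folded into one
``edge`` value per side, and text wraps purely by width. The ``Box`` class
and the four functions are hypothetical names for this example, not how any
particular browser engine is written.

.. code-block:: python

    import math
    from dataclasses import dataclass, field


    @dataclass
    class Box:
        edge: int = 0         # margin + border + padding per side, in px
        text_width: int = 0   # width of the text if kept on one line
        line_height: int = 0
        children: list = field(default_factory=list)
        preferred: int = 0
        width: int = 0
        height: int = 0
        x: int = 0
        y: int = 0


    def preferred_width(box):
        """Bottom-up: children's preferred widths plus the node's edges."""
        inner = box.text_width
        inner += sum(preferred_width(c) for c in box.children)
        box.preferred = inner + 2 * box.edge
        return box.preferred


    def assign_width(box, available):
        """Top-down: hand the available width down to the children."""
        box.width = available
        for child in box.children:
            assign_width(child, available - 2 * box.edge)


    def compute_height(box):
        """Bottom-up: wrap text, then sum child heights and edges."""
        inner = box.width - 2 * box.edge  # assumed to be positive
        lines = math.ceil(box.text_width / inner) if box.text_width else 0
        content = lines * box.line_height
        content += sum(compute_height(c) for c in box.children)
        box.height = content + 2 * box.edge
        return box.height


    def place(box, x=0, y=0):
        """Top-down: stack the children vertically inside the parent."""
        box.x, box.y = x, y
        cursor = y + box.edge
        for child in box.children:
            place(child, x + box.edge, cursor)
            cursor += child.height


    first = Box(edge=5, text_width=300, line_height=20)
    second = Box(edge=5, text_width=100, line_height=20)
    root = Box(edge=10, children=[first, second])

    preferred_width(root)
    assign_width(root, 200)  # a 200px wide viewport
    compute_height(root)
    place(root)
    for box in (root, first, second):
        print(box.x, box.y, box.width, box.height)

The order of the calls mirrors the list: preferred widths first, then actual
widths, then heights, then coordinates. An incremental engine would cache
these values and redo only the passes that a change invalidates.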
646 | 
647 | GPU Rendering
648 | -------------
649 | 
650 | * During the rendering process, the graphical computing layers can use the
651 |   general-purpose ``CPU`` or the graphics processor (``GPU``) as well.
652 | 
653 | * When the ``GPU`` is used for rendering computations, the graphical
654 |   software layers split the task into multiple pieces so that it can take
655 |   advantage of the massive parallelism of the ``GPU`` for the floating-point
656 |   calculations required by the rendering process.
657 | 
658 | 
659 | Window Server
660 | -------------
661 | 
662 | Post-rendering and user-induced execution
663 | -----------------------------------------
664 | 
665 | After rendering has completed, the browser executes JavaScript code as a result
666 | of some timing mechanism (such as a Google Doodle animation) or user
667 | interaction (typing a query into the search box and receiving suggestions).
668 | Plugins such as Flash or Java may execute as well, although not at this time on
669 | the Google homepage. Scripts can cause additional network requests to be
670 | performed, as well as modify the page or its layout, causing another round of
671 | page rendering and painting.
672 | 
673 | .. _`Creative Commons Zero`: https://creativecommons.org/publicdomain/zero/1.0/
674 | .. _`"CSS lexical and syntax grammar"`: http://www.w3.org/TR/CSS2/grammar.html
675 | .. _`Punycode`: https://en.wikipedia.org/wiki/Punycode
676 | .. _`Ethernet`: http://en.wikipedia.org/wiki/IEEE_802.3
677 | .. _`WiFi`: https://en.wikipedia.org/wiki/IEEE_802.11
678 | .. _`Cellular data network`: https://en.wikipedia.org/wiki/Cellular_data_communication_protocol
679 | .. _`analog-to-digital converter`: https://en.wikipedia.org/wiki/Analog-to-digital_converter
680 | .. _`network node`: https://en.wikipedia.org/wiki/Computer_network#Network_nodes
681 | .. _`varies by OS`: https://en.wikipedia.org/wiki/Hosts_%28file%29#Location_in_the_file_system
682 | .. _`简体中文`: https://github.com/skyline75489/what-happens-when-zh_CN
683 | .. _`downgrade attack`: http://en.wikipedia.org/wiki/SSL_stripping
684 | .. _`OSI Model`: https://en.wikipedia.org/wiki/OSI_model
685 | 


--------------------------------------------------------------------------------