└── wiki
    ├── Main.wiki
    ├── Part3.wiki
    └── Part1.wiki

/wiki/Main.wiki:
--------------------------------------------------------------------------------
#summary Browser Security Handbook landing page
#labels Featured

Browser Security Handbook

* Written and maintained by [http://lcamtuf.coredump.cx/ Michal Zalewski] <[mailto:lcamtuf@google.com lcamtuf@google.com]>.
* Copyright 2008, 2009 Google Inc, rights reserved.
* Released under terms and conditions of the [http://creativecommons.org/licenses/by/3.0 CC-3.0-BY] license.

* _[http://code.google.com/p/browsersec/wiki/Part1 → Part 1: Basic concepts behind web browsers]_
* _[http://code.google.com/p/browsersec/wiki/Part2 → Part 2: Standard browser security features]_
* _[http://code.google.com/p/browsersec/wiki/Part3 → Part 3: Experimental and legacy security mechanisms]_

=Introduction=

Hello, and welcome to the _Browser Security Handbook_!

This document is meant to provide web application developers, browser engineers, and information security researchers with a one-stop reference to key security properties of contemporary web browsers. Insufficient understanding of these often poorly-documented characteristics is a major contributing factor to the prevalence of several classes of security vulnerabilities.

Although all browsers implement roughly the same set of baseline features, there is relatively little standardization - or conformance to standards - when it comes to many of the less apparent implementation details. Furthermore, vendors routinely introduce proprietary tweaks or improvements that may interfere with existing features in non-obvious ways, and seldom provide a detailed discussion of potential problems.

The current version of this document is based on the following versions of web browsers:

|| *Browser* || *Version* || *Test date* || *Usage*^*^ || *Notes* ||
|| Microsoft Internet Explorer 6 || 6.0.2900.5512 || Feb 2, 2009 || 16% || ||
|| Microsoft Internet Explorer 7 || 7.0.5730.11 || Dec 11, 2008 || 11% || ||
|| Microsoft Internet Explorer 8 || 8.0.6001.18702 || Sep 7, 2010 || 28% || ||
|| Mozilla Firefox 2 || 2.0.0.18 || Nov 28, 2008 || 1% || ||
|| Mozilla Firefox 3 || 3.6.8 || Sep 7, 2010 || 22% || ||
|| Apple Safari || 4.0 || Jun 10, 2009 || 5% || ||
|| Opera || 9.62 || Nov 18, 2008 || 2% || ||
|| Google Chrome || 7.0.503.0 || Sep 7, 2010 || 8% || ||
|| Android embedded browser || SDK 1.5 R3 || Oct 3, 2009 || n/a || ||

^*^ Approximate browser usage data based on public [http://marketshare.hitslink.com/browser-market-share.aspx?qprid=0 Net Applications] estimates for August 2010.

=Disclaimers and typographical conventions=

*Please note that although we tried to make this document as accurate as possible, some errors might have slipped through. Use this document only as an initial reference, and independently verify any characteristics you wish to depend upon. Test cases for properties featured in this document are [http://browsersec.googlecode.com/files/browser_tests-1.03.tar.gz freely available for download].*

The document attempts to capture the risks and security considerations present for the general populace of users accessing the web with default browser settings in place. Although occasionally noted, the degree of flexibility offered through non-standard settings is by itself not a subject of this comparative study.

Throughout the document, red color is used to bring attention to browser properties that seem particularly tricky or unexpected, and need to be carefully accounted for in server-side implementations. Likewise, whenever the status quo appears to bear no significant security consequences and is well-understood, but a particular browser implementation takes additional steps to protect application developers, we use green color to denote this. Rest assured, neither of these color codes implies that a particular browser is less or more secure than its counterparts.

=Acknowledgments=

_Browser Security Handbook_ would not be possible without the ideas and assistance from the following contributors:

* Filipe Almeida
* Brian Eaton
* Chris Evans
* Drew Hintz
* Nick Kralevich
* Marko Martin
* Tavis Ormandy
* Wladimir Palant
* David Ross
* Marius Schilder
* Parisa Tabriz
* Julien Tinnes
* Berend-Jan Wever
* Mike Wiacek

The document builds on top of previous security research by Adam Barth, Collin Jackson, Amit Klein, Jesse Ruderman, and many other security experts who painstakingly dissected browser internals for the past few years.

_([http://code.google.com/p/browsersec/wiki/Part1 Continue to basic concepts behind web browsers...])_

/wiki/Part3.wiki:
--------------------------------------------------------------------------------
#summary Browser Security Handbook, part 3

Browser Security Handbook, part 3

* Written and maintained by [http://lcamtuf.coredump.cx/ Michal Zalewski] <[mailto:lcamtuf@google.com lcamtuf@google.com]>.
* Copyright 2008, 2009 Google Inc, rights reserved.
* Released under terms and conditions of the [http://creativecommons.org/licenses/by/3.0 CC-3.0-BY] license.

* _[http://code.google.com/p/browsersec/wiki/Part2 ← Back to browser security features]_


=Experimental and legacy security mechanisms=

Through the years, browsers accrued a fair number of security mechanisms that have either fallen into disuse, or never caught on across more than a single vendor; as well as a number of ongoing proposals, enhancements, and extensions that are yet to prove their worth or become even vaguely standardized. This section provides a brief overview of several technologies that could fall into this class.

_Fun fact: when it comes to newly proposed features, many of them essentially introduce new security boundaries and permission systems mostly orthogonal to, yet intertwined with, same-origin controls. Although this appears to be a good idea at first sight, some researchers [http://crypto.stanford.edu/websec/origins/fgo.pdf warn about the pitfalls] of finer-grained origins as difficult for users to understand, and hard to fully examine for potential side effects and interactions with existing code._

==HTTP authentication==

HTTP authentication is an ancient mechanism most recently laid out in [http://www.ietf.org/rfc/rfc2617 RFC 2617]. It is a simple extension of HTTP:

* Any resource for which the server requires the user to provide valid credentials would initially return a `401 Unauthorized` HTTP error code, with a `WWW-Authenticate` header describing parameters such as the authentication realm, supported authentication modes, and other method-specific parameters.

* Upon receiving a `401` code, the client is expected to prompt the user for login and password, or obtain the data from in-memory cache or another credential store (where according to the RFC, it should be looked up by authentication realm name alone; although for security purposes, it should be bound to the host name as well, preferably also to port and protocol).

* User's credentials are then encoded and sent back in a new request with an additional `Authorization` header, which the server is expected to examine, and grant access or return an error message as appropriate.

Credentials are also cached for authentication with other subresources on the same site, and are sent out on future requests in an unsolicited manner (globally, or only for a particular path prefix).

Two key authentication schemes are supported by virtually all clients: `basic` and `digest`. The former simply sends user names and passwords in plain (`base64`) text - and hence the data is vulnerable to snooping, unless the process takes place over HTTPS. The latter uses a simple nonce-based challenge-response mechanism that prevents immediate password disclosure. Microsoft further extended the mechanism to include two proprietary schemes, `NTLM` and `Negotiate` ([http://www.innovation.ch/personal/ronald/ntlm.html reference]), that integrate seamlessly with Microsoft Windows domain authentication.

As hinted in the section on [http://code.google.com/p/browsersec/wiki/Part1#Uniform_Resource_Locators URL syntax], URLs are permitted to have an optional `user[:password]@` segment immediately before the host name, to enable pre-authenticated bookmarks or shared links. In practice, the mechanism would be seldom used for legitimate purposes, but became immensely popular with phishers - who would often construct URLs such as `http://``www.example-bank.com:something_something@``www.evilsite.com/` in hopes of confusing the user. This led to this URL syntax being banned in Microsoft Internet Explorer, and it often results in security prompts elsewhere.
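
The `basic` scheme and the `user[:password]@` URL segment described above can be illustrated with a minimal sketch. The helper names `basicAuthHeader` and `describeUrl` are made up for this example and are not part of any browser or library; the code assumes a runtime that provides `Buffer` and the standard `URL` parser (for example Node.js).

```javascript
// Hypothetical helper: builds the Authorization header value for the "basic"
// scheme. base64 is an encoding, not encryption, so the password is
// trivially recoverable by anyone who can observe the request.
function basicAuthHeader(user, password) {
  const token = Buffer.from(user + ":" + password, "utf8").toString("base64");
  return "Basic " + token;
}

// Hypothetical helper: separates the optional credentials segment of a URL
// from the host that will actually be contacted.
function describeUrl(text) {
  const url = new URL(text);
  return {
    user: url.username,
    password: url.password,
    host: url.hostname,
    hasCredentials: url.username !== "" || url.password !== "",
  };
}

console.log(basicAuthHeader("Aladdin", "open sesame"));
console.log(describeUrl("http://www.example-bank.com:something_something@www.evilsite.com/"));
```

In the second call, the part that looks like a bank host name is parsed as the user name, and the request would go to `www.evilsite.com`; a check along the lines of `hasCredentials` is one way an application might flag such links before displaying them.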

Because of these limitations and the relative inflexibility of this scheme to begin with, HTTP authentication has become almost completely extinct on the Internet, and has been replaced with custom solutions built around HTTP cookies (it is still sometimes used for intranet applications or for simple access control for personal resources).

Amusingly, its ghost still haunts modern web applications: HTTP authentication prompts often come up in browsers when viewing trusted pages where a minor authentication-requiring sub-resource, such as ``, is included from a rogue site - but these prompts usually do a poor job of clearly explaining who is asking for the credentials. This poses a phishing risk for services, such as blogs or discussion forums, that allow users to embed external content.

|| *Test description* || *MSIE6* || *MSIE7* || *MSIE8* || *FF2* || *FF3* || *Safari* || *Opera* || *Chrome* || *Android* ||
|| Is HTTP authentication supported? || YES || YES || YES || YES || YES || YES || YES || YES || NO ||
|| Does link-embedded authentication work? || NO || NO || NO || prompt || prompt || YES || prompt || YES || n/a ||
|| Is authentication data bound to realms? || NO || NO || NO || NO || NO || YES || NO || YES || n/a ||
|| Is authentication bound to host name? || YES || YES || YES || YES || YES || YES || YES || YES || n/a ||
|| Is authentication bound to protocol or port? || YES || YES || YES || YES || YES || YES || YES || YES || n/a ||
|| Are cached credentials sent out on all future requests? || YES || YES || YES || YES || YES || subdir only || subdir only || subdir only || n/a ||
|| Do password prompts come up on ``? || YES || YES || YES || YES || YES || YES || YES || YES || n/a ||
|| Do password prompts come up on `` / ``? || NO || NO || NO || YES || YES || NO || YES || YES || n/a ||
|| Do password prompts come up on `
{{{
>
02: alert(1)">
03:
04: alert(1)
05:
06:
07:
08:
09: ><SCRIPT>alert(1)</SCRIPT>

}}}

[http://code.google.com/p/doctype/wiki/ArticlesXSS Cross-site scripting] aside, another interesting property of HTML is that it permits certain HTTP directives to be encoded within HTML itself, using the following format:

{{{

}}}

Not all `HTTP-EQUIV` directives are meaningful - for example, the determination of `Content-Type`, `Content-Length`, `Location`, or `Content-Disposition` had already been made by the time HTML parsing begins - but some other values may be set this way. The strategy for resolving HTTP - HTML conflicts is not outlined in W3C standards - but in practice, valid HTTP headers take precedence over `HTTP-EQUIV`; on the other hand, `HTTP-EQUIV` takes precedence over unrecognized HTTP header values. `HTTP-EQUIV` tags will also take precedence when the content is moved to non-HTTP media, such as when it is saved to local disk.

A fascinating quirk exists in the Internet Explorer HTML parser. While all other browsers accept quoted string parameter values only if the quote appears in front of the parameter, MSIE looks for an _equals-quote_ substring anywhere in the middle:

{{{
}}}

# Two-digit, zero-padded, 8-bit hexadecimal numerical representations, with C-style `\x` prefix (`"test"` → `"t\x65st"`),
# Four-digit, zero-padded, 16-bit hexadecimal numerical Unicode representations, with `\u` prefix (`"test"` → `"t\u0065st"`),
# Raw `\` prefix before a literal (`"test"` → `"t\est"`),
# Special C-style `\` shorthand notation for certain control characters (such as `\n` or `\t`).

The fourth scheme is the most space-efficient, although generally error-prone; for example, a failure to escape `\` as `\\` may lead to a string of `\" + alert(1);"` being converted to `\\" + alert(1)`, and getting interpreted incorrectly; it also partly collides with the remaining `\`-prefixed escaping schemes, and is not compatible with C syntax.

The parsing of these schemes is uniform across all browsers if fewer than the expected number of input digits is seen: the value is interpreted correctly, and no subsequent text is consumed (`"\1test"` is accepted and becomes `"\001test"`).

HTML entity encoding has no special meaning within HTML-embedded `
{{{
');
}

}}}

Another interesting property specific to `
{{{
-->

}}}

Yet another characteristic that sets stylesheets apart from JavaScript is the fact that although the CSS parser is very strict, a syntax error does not cause it to bail out completely. Instead, a recovery from the next top-level syntax element is attempted. This makes handling user-controlled strings even harder - for example, if a stray newline in a user-supplied string is not properly escaped, it would actually permit the attacker to freely tinker with the stylesheet in most browsers:

{{{


Hello world (in red)!
}}}

Additional examples along these lines are explored in more detail on [http://centricle.com/ref/css/filters/ this page]. Several fundamental differences in style parsing between common browsers are outlined below:

|| *Test description* || *MSIE6* || *MSIE7* || *MSIE8* || *FF2* || *FF3* || *Safari* || *Opera* || *Chrome* || *Android* ||
|| Is JavaScript `expression(...)` supported? || YES || YES || YES || NO || NO || NO || NO || NO || NO ||
|| Is script-targeted `url(...)` supported? || YES || NO || NO || NO || NO || NO || NO || NO || NO ||
|| Is script-executing `-moz-binding` supported? || NO || NO || NO || YES || NO || NO || NO || NO || NO ||
|| Does `` take precedence over comment block parsing? || NO || NO || NO || YES || YES || NO || NO || NO || NO ||
|| Characters permitted as CSS field-value separators (excluding `\t \r \n \x20`) || \x0B \x0C \ \xA0 || \x0B \x0C \ \xA0 || \x0B \x0C \ \xA0 || \x0B \x0C \ || \x0C \ || \x0C \ || \x0C \ \xA0 || \x0C \ \xA0 || \x0C \ ||



===CSS character encoding===

In many cases, as with JavaScript, there is a need for web applications to render certain user-controlled strings within stylesheets in a safe manner. To handle various reserved characters, a method for escaping potentially troublesome values is required; confusingly, however, the CSS format supports neither HTML entity encoding, nor any of the common methods of encoding characters seen in JavaScript.

Instead, a rather unusual and incompatible scheme consisting of `\` followed by a non-prefixed, variable-length, one- to six-digit hexadecimal value is employed; for example, `"test"` may be encoded as `"t\65st"` or `"t\000065st"` - but not as `"t\est"`, `"t\x65st"`, `"t\u0065st"`, nor `"t&#x65;st"` ([http://www.w3.org/TR/CSS21/syndata.html#escaped-characters reference]).

A very important and little-known oddity unique to CSS parsing is that escape sequences are also accepted outside strings, and confusingly, may substitute some syntax control characters in the stylesheet; so for example, `color: expression(alert(1))` and `color: expression\028 alert \028 1 \029 \029` have the same meaning. To add insult to injury, mixed syntax such as `color: expression( alert \029 1 ) \029` would not work.

With the exception of Internet Explorer, stray multi-line string literals are not supported; but a lone `\` at the end of a line may be used to seamlessly break long lines.
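
The hexadecimal escaping scheme described above can be sketched as follows. Both function names are hypothetical and exist only for this example. `cssEscapeString` escapes every character outside `[A-Za-z0-9]` using the fixed six-digit form, an assumption made here so that a following hexadecimal-looking character can never be absorbed into a shorter escape. `cssUnescape` is a deliberately simplified decoder that handles hexadecimal escapes only.

```javascript
// Hypothetical encoder: emits "\" plus a six-digit, zero-padded hexadecimal
// code point for everything except ASCII letters and digits.
function cssEscapeString(text) {
  let out = "";
  for (const ch of text) { // for...of iterates by code point
    if (/^[A-Za-z0-9]$/.test(ch)) {
      out += ch;
    } else {
      out += "\\" + ch.codePointAt(0).toString(16).padStart(6, "0");
    }
  }
  return out;
}

// Simplified decoder: one to six hex digits after "\", optionally followed by
// a single whitespace character that terminates the escape and is dropped.
function cssUnescape(text) {
  return text.replace(/\\([0-9A-Fa-f]{1,6})[ \t\r\n\f]?/g, (match, hex) => {
    const code = parseInt(hex, 16);
    // Values beyond the Unicode range are replaced rather than decoded.
    return code > 0x10ffff ? "\uFFFD" : String.fromCodePoint(code);
  });
}

console.log(cssEscapeString('a"b'));
console.log(cssUnescape("t\\65st"));
console.log(cssUnescape("\\028 alert\\029"));
```

The last call shows why escaping matters outside strings as well: escaped parentheses decode to ordinary syntax characters, so a filter that only looks for a literal `(` would miss them.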

==Other built-in document formats==

HTML aside, modern browser renderers usually natively support an additional set of media formats that may be displayed as standalone documents. These can be generally divided into two groups:

* *Pure data formats.* These include plain text data or images (JPEG, GIF, BMP, etc), where the browser simply provides a basic rendering canvas and populates it with static data. In principle, no security consequences arise with these data types, as no malicious payloads may be embedded in the message - short of major and fairly rare implementation flaws. Common image formats aside, some browsers may also support other oddball or legacy media formats natively, such as the ability to play [http://en.wikipedia.org/wiki/Musical_Instrument_Digital_Interface MID] files specified by `` tags.

* *Rich data formats.* This category is primarily populated by non-HTML XML namespace parsers ([http://www.w3.org/Graphics/SVG/ SVG], [http://www.rssboard.org/rss-specification RSS], [http://tools.ietf.org/html/rfc4287 Atom]); beyond raw data, these document formats contain various rendering instructions, hints, or conditionals. Because of how XML works, each of the XML-based formats has two important security consequences:
  * Firstly, nested XML namespaces may be defined, and are usually not verified against MIME type intents, permitting HTML to be embedded, for example, inside `image/svg+xml`.
  * Secondly, these formats may actually come with built-in provisions for non-standard embedded HTML or JavaScript payloads or scripts, permitting HTML injection even if the attacker has no direct control over XML document structure.

One example of a document that, even if served as `image/svg+xml`, would still execute scripts in many current browsers despite the MIME type clearly stating a different intent, is as follows:

{{{

}}}

Furthermore, SVG natively permits [http://www.w3.org/TR/SVG/script.html embedded scripts] and event handlers; in all browsers that support SVG, these scripts execute when the image is loaded as a top-level document - but are ignored when rendered through `` tags.

Some of the non-HTML built-in document type behaviors are documented below:

|| *Test description* || *MSIE6* || *MSIE7* || *MSIE8* || *FF2* || *FF3* || *Safari* || *Opera* || *Chrome* || *Android* ||
|| Supported bitmap formats (excluding JPG, GIF, PNG) || BMP ICO WMF || BMP ICO WMF || BMP ICO WMF || BMP ICO TGA^*^ || BMP ICO TGA^*^ || BMP TIF || BMP^*^ || BMP ICO || BMP ICO ||
|| Is generic XML document support present? || YES || YES || YES || YES || YES || YES || YES || YES || YES ||
|| Is RSS feed support present? || NO || YES || YES || YES || YES || YES || YES || NO || NO ||
|| Is Atom feed support present? || NO || YES || YES || YES || YES || YES || YES || NO || NO ||
|| Does JavaScript execute within feeds? || (YES) || NO || NO || NO || NO || NO || NO || (YES) || (YES) ||
|| Are `javascript:` or `data:` URLs permitted in feeds? || n/a || NO || NO || NO || NO || NO || NO || n/a || n/a ||
|| Are CSS specifications permitted in feeds? || n/a || NO || YES || YES || YES || NO || YES || n/a || n/a ||
|| Is SVG image support present? || NO || NO || NO || YES || YES || YES || YES || YES || NO ||
|| May `image/svg+xml` document contain HTML `xmlns` payload? || (YES) || (YES) || (YES) || YES || YES || YES || YES || YES || (YES) ||

^*^ Format support limited, inconsistent, or broken.

_Trivia: curiously, Microsoft's XML-based [http://www.w3.org/TR/NOTE-VML.html Vector Markup Language] (VML) is not natively supported by the renderer, and is instead implemented as a plugin; whereas Scalable Vector Graphics (SVG) is implemented as a core renderer component in all browsers that support it._

==Plugin-supported content==

Unlike the lean and well-defined set of natively supported document formats, the landscape of browser plugins is extremely diverse, hairy, and evolving very quickly. Most of the common content-rendering plugins are invoked through the use of `
