├── .github ├── ISSUE_TEMPLATE.md └── PULL_REQUEST_TEMPLATE.md ├── .gitignore ├── LICENSE ├── README.md ├── docs ├── code-of-conduct.md ├── contributing.md ├── exercise-4b-hints │ ├── hint1.md │ ├── hint2.md │ ├── hint3.md │ ├── hint4.md │ └── hint5.md ├── exercise1a.md ├── exercise1b.md ├── exercise1c.md ├── exercise2a.md ├── exercise2b.md ├── exercise2c.md ├── exercise2d.md ├── exercise3a.md ├── exercise3b.md ├── exercise4a.md ├── exercise4b.md ├── for-teachers.md ├── setup.md └── solutions.md ├── install-chromebook-prerequisites.sh └── src ├── browser ├── browser.py └── html_table.py ├── fuzzer ├── .gitignore └── fuzzer.py ├── requirements.txt └── server ├── .gitignore ├── http_server.py ├── https_server.py └── pages ├── exercise1c-anotherpage.html ├── exercise1c.html └── spoofable.html /.github/ISSUE_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | ## Expected Behavior 2 | 3 | 4 | ## Actual Behavior 5 | 6 | 7 | ## Steps to Reproduce the Problem 8 | 9 | 1. 10 | 1. 11 | 1. 12 | 13 | ## Specifications 14 | 15 | - Version: 16 | - Platform: -------------------------------------------------------------------------------- /.github/PULL_REQUEST_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | Fixes # 2 | 3 | > It's a good idea to open an issue first for discussion. 
4 | 5 | - [ ] Tests pass 6 | - [ ] Appropriate changes to documentation are included in the PR 7 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | browser.json 2 | venv 3 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | 2 | Apache License 3 | Version 2.0, January 2004 4 | http://www.apache.org/licenses/ 5 | 6 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 7 | 8 | 1. Definitions. 9 | 10 | "License" shall mean the terms and conditions for use, reproduction, 11 | and distribution as defined by Sections 1 through 9 of this document. 12 | 13 | "Licensor" shall mean the copyright owner or entity authorized by 14 | the copyright owner that is granting the License. 15 | 16 | "Legal Entity" shall mean the union of the acting entity and all 17 | other entities that control, are controlled by, or are under common 18 | control with that entity. For the purposes of this definition, 19 | "control" means (i) the power, direct or indirect, to cause the 20 | direction or management of such entity, whether by contract or 21 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 22 | outstanding shares, or (iii) beneficial ownership of such entity. 23 | 24 | "You" (or "Your") shall mean an individual or Legal Entity 25 | exercising permissions granted by this License. 26 | 27 | "Source" form shall mean the preferred form for making modifications, 28 | including but not limited to software source code, documentation 29 | source, and configuration files. 30 | 31 | "Object" form shall mean any form resulting from mechanical 32 | transformation or translation of a Source form, including but 33 | not limited to compiled object code, generated documentation, 34 | and conversions to other media types. 
35 | 36 | "Work" shall mean the work of authorship, whether in Source or 37 | Object form, made available under the License, as indicated by a 38 | copyright notice that is included in or attached to the work 39 | (an example is provided in the Appendix below). 40 | 41 | "Derivative Works" shall mean any work, whether in Source or Object 42 | form, that is based on (or derived from) the Work and for which the 43 | editorial revisions, annotations, elaborations, or other modifications 44 | represent, as a whole, an original work of authorship. For the purposes 45 | of this License, Derivative Works shall not include works that remain 46 | separable from, or merely link (or bind by name) to the interfaces of, 47 | the Work and Derivative Works thereof. 48 | 49 | "Contribution" shall mean any work of authorship, including 50 | the original version of the Work and any modifications or additions 51 | to that Work or Derivative Works thereof, that is intentionally 52 | submitted to Licensor for inclusion in the Work by the copyright owner 53 | or by an individual or Legal Entity authorized to submit on behalf of 54 | the copyright owner. For the purposes of this definition, "submitted" 55 | means any form of electronic, verbal, or written communication sent 56 | to the Licensor or its representatives, including but not limited to 57 | communication on electronic mailing lists, source code control systems, 58 | and issue tracking systems that are managed by, or on behalf of, the 59 | Licensor for the purpose of discussing and improving the Work, but 60 | excluding communication that is conspicuously marked or otherwise 61 | designated in writing by the copyright owner as "Not a Contribution." 62 | 63 | "Contributor" shall mean Licensor and any individual or Legal Entity 64 | on behalf of whom a Contribution has been received by Licensor and 65 | subsequently incorporated within the Work. 66 | 67 | 2. Grant of Copyright License. 
Subject to the terms and conditions of 68 | this License, each Contributor hereby grants to You a perpetual, 69 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 70 | copyright license to reproduce, prepare Derivative Works of, 71 | publicly display, publicly perform, sublicense, and distribute the 72 | Work and such Derivative Works in Source or Object form. 73 | 74 | 3. Grant of Patent License. Subject to the terms and conditions of 75 | this License, each Contributor hereby grants to You a perpetual, 76 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 77 | (except as stated in this section) patent license to make, have made, 78 | use, offer to sell, sell, import, and otherwise transfer the Work, 79 | where such license applies only to those patent claims licensable 80 | by such Contributor that are necessarily infringed by their 81 | Contribution(s) alone or by combination of their Contribution(s) 82 | with the Work to which such Contribution(s) was submitted. If You 83 | institute patent litigation against any entity (including a 84 | cross-claim or counterclaim in a lawsuit) alleging that the Work 85 | or a Contribution incorporated within the Work constitutes direct 86 | or contributory patent infringement, then any patent licenses 87 | granted to You under this License for that Work shall terminate 88 | as of the date such litigation is filed. 89 | 90 | 4. Redistribution. 
You may reproduce and distribute copies of the 91 | Work or Derivative Works thereof in any medium, with or without 92 | modifications, and in Source or Object form, provided that You 93 | meet the following conditions: 94 | 95 | (a) You must give any other recipients of the Work or 96 | Derivative Works a copy of this License; and 97 | 98 | (b) You must cause any modified files to carry prominent notices 99 | stating that You changed the files; and 100 | 101 | (c) You must retain, in the Source form of any Derivative Works 102 | that You distribute, all copyright, patent, trademark, and 103 | attribution notices from the Source form of the Work, 104 | excluding those notices that do not pertain to any part of 105 | the Derivative Works; and 106 | 107 | (d) If the Work includes a "NOTICE" text file as part of its 108 | distribution, then any Derivative Works that You distribute must 109 | include a readable copy of the attribution notices contained 110 | within such NOTICE file, excluding those notices that do not 111 | pertain to any part of the Derivative Works, in at least one 112 | of the following places: within a NOTICE text file distributed 113 | as part of the Derivative Works; within the Source form or 114 | documentation, if provided along with the Derivative Works; or, 115 | within a display generated by the Derivative Works, if and 116 | wherever such third-party notices normally appear. The contents 117 | of the NOTICE file are for informational purposes only and 118 | do not modify the License. You may add Your own attribution 119 | notices within Derivative Works that You distribute, alongside 120 | or as an addendum to the NOTICE text from the Work, provided 121 | that such additional attribution notices cannot be construed 122 | as modifying the License. 
123 | 124 | You may add Your own copyright statement to Your modifications and 125 | may provide additional or different license terms and conditions 126 | for use, reproduction, or distribution of Your modifications, or 127 | for any such Derivative Works as a whole, provided Your use, 128 | reproduction, and distribution of the Work otherwise complies with 129 | the conditions stated in this License. 130 | 131 | 5. Submission of Contributions. Unless You explicitly state otherwise, 132 | any Contribution intentionally submitted for inclusion in the Work 133 | by You to the Licensor shall be under the terms and conditions of 134 | this License, without any additional terms or conditions. 135 | Notwithstanding the above, nothing herein shall supersede or modify 136 | the terms of any separate license agreement you may have executed 137 | with Licensor regarding such Contributions. 138 | 139 | 6. Trademarks. This License does not grant permission to use the trade 140 | names, trademarks, service marks, or product names of the Licensor, 141 | except as required for reasonable and customary use in describing the 142 | origin of the Work and reproducing the content of the NOTICE file. 143 | 144 | 7. Disclaimer of Warranty. Unless required by applicable law or 145 | agreed to in writing, Licensor provides the Work (and each 146 | Contributor provides its Contributions) on an "AS IS" BASIS, 147 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 148 | implied, including, without limitation, any warranties or conditions 149 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 150 | PARTICULAR PURPOSE. You are solely responsible for determining the 151 | appropriateness of using or redistributing the Work and assume any 152 | risks associated with Your exercise of permissions under this License. 153 | 154 | 8. Limitation of Liability. 
In no event and under no legal theory, 155 | whether in tort (including negligence), contract, or otherwise, 156 | unless required by applicable law (such as deliberate and grossly 157 | negligent acts) or agreed to in writing, shall any Contributor be 158 | liable to You for damages, including any direct, indirect, special, 159 | incidental, or consequential damages of any character arising as a 160 | result of this License or out of the use or inability to use the 161 | Work (including but not limited to damages for loss of goodwill, 162 | work stoppage, computer failure or malfunction, or any and all 163 | other commercial damages or losses), even if such Contributor 164 | has been advised of the possibility of such damages. 165 | 166 | 9. Accepting Warranty or Additional Liability. While redistributing 167 | the Work or Derivative Works thereof, You may choose to offer, 168 | and charge a fee for, acceptance of support, warranty, indemnity, 169 | or other liability obligations and/or rights consistent with this 170 | License. However, in accepting such obligations, You may act only 171 | on Your own behalf and on Your sole responsibility, not on behalf 172 | of any other Contributor, and only if You agree to indemnify, 173 | defend, and hold each Contributor harmless for any liability 174 | incurred by, or claims asserted against, such Contributor by reason 175 | of your accepting any such warranty or additional liability. 176 | 177 | END OF TERMS AND CONDITIONS 178 | 179 | APPENDIX: How to apply the Apache License to your work. 180 | 181 | To apply the Apache License to your work, attach the following 182 | boilerplate notice, with the fields enclosed by brackets "[]" 183 | replaced with your own identifying information. (Don't include 184 | the brackets!) The text should be enclosed in the appropriate 185 | comment syntax for the file format. 
We also recommend that a 186 | file or class name and description of purpose be included on the 187 | same "printed page" as the copyright notice for easier 188 | identification within third-party archives. 189 | 190 | Copyright [yyyy] [name of copyright owner] 191 | 192 | Licensed under the Apache License, Version 2.0 (the "License"); 193 | you may not use this file except in compliance with the License. 194 | You may obtain a copy of the License at 195 | 196 | http://www.apache.org/licenses/LICENSE-2.0 197 | 198 | Unless required by applicable law or agreed to in writing, software 199 | distributed under the License is distributed on an "AS IS" BASIS, 200 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 201 | See the License for the specific language governing permissions and 202 | limitations under the License. 203 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Browser learning exercises 2 | 3 | This project provides games for high school / secondary school computer scientists 4 | to learn about how browsers work and what security bugs might exist in them. 5 | 6 | See [Contributing](docs/contributing.md) to contribute. 7 | 8 | ## For teachers 9 | 10 | See [For Teachers](docs/for-teachers.md), and [setup](docs/setup.md). 11 | 12 | ## For students 13 | 14 | If your teacher has already run through the [setup instructions](docs/setup.md), 15 | or you're comfortable doing them yourself, then proceed to the actual content 16 | below. 17 | 18 | ## Part One: What's a browser and a server 19 | 20 | * [Exercise 1a](docs/exercise1a.md): Use a real browser. Navigate to [wikipedia](https://en.wikipedia.org). 21 | View source. Find some things in the HTML. 22 | * [Exercise 1b](docs/exercise1b.md): Use our mini Python browser. Go to the same place. 
Look in 23 | the code of our Python browser to see how it handles `b` and `a` tags. 24 | * [Exercise 1c](docs/exercise1c.md): Run our mini Python web server. Create your own HTML file. 25 | See how the browser responds. 26 | 27 | ## Part Two: Building a browser 28 | 29 | * [Exercise 2a](docs/exercise2a.md): Add support for italic text in the browser. 30 | * [Exercise 2b](docs/exercise2b.md): Add support for `font color` or similar. 31 | * [Exercise 2c](docs/exercise2c.md): Add word wrap in the browser. (HARD!) 32 | * [Exercise 2d](docs/exercise2d.md): Invent your own HTML tag. Add it to the browser and to some 33 | HTML pages in the server. See how it works end-to-end. 34 | 35 | ## Part Three: Encryption 36 | 37 | * [Exercise 3a](docs/exercise3a.md): See the web traffic flowing between the mini browser and 38 | the mini server when using HTTP. See what happens when you try different 39 | URIs. See what happens when different types of error occur. 40 | * [Exercise 3b](docs/exercise3b.md): Try the same with HTTPS. 41 | 42 | ## Part Four: Security bugs 43 | 44 | * [Exercise 4a](docs/exercise4a.md): Find some security bugs hidden in the code of our mini 45 | browser. 46 | * [Exercise 4b](docs/exercise4b.md): Write a program that finds bugs in another program 47 | (a "fuzzer"). 48 | 49 | ## Related projects 50 | 51 | If you enjoyed this project, see also [browser.engineering](https://browser.engineering/), 52 | another Python demonstration browser! browser.engineering goes much further than this 53 | browser - it includes JavaScript, CSS etc. - with commensurately more complex code and 54 | runtime needs.
55 | -------------------------------------------------------------------------------- /docs/code-of-conduct.md: -------------------------------------------------------------------------------- 1 | # Code of Conduct 2 | 3 | ## Our Pledge 4 | 5 | In the interest of fostering an open and welcoming environment, we as 6 | contributors and maintainers pledge to making participation in our project and 7 | our community a harassment-free experience for everyone, regardless of age, body 8 | size, disability, ethnicity, gender identity and expression, level of 9 | experience, education, socio-economic status, nationality, personal appearance, 10 | race, religion, or sexual identity and orientation. 11 | 12 | ## Our Standards 13 | 14 | Examples of behavior that contributes to creating a positive environment 15 | include: 16 | 17 | * Using welcoming and inclusive language 18 | * Being respectful of differing viewpoints and experiences 19 | * Gracefully accepting constructive criticism 20 | * Focusing on what is best for the community 21 | * Showing empathy towards other community members 22 | 23 | Examples of unacceptable behavior by participants include: 24 | 25 | * The use of sexualized language or imagery and unwelcome sexual attention or 26 | advances 27 | * Trolling, insulting/derogatory comments, and personal or political attacks 28 | * Public or private harassment 29 | * Publishing others' private information, such as a physical or electronic 30 | address, without explicit permission 31 | * Other conduct which could reasonably be considered inappropriate in a 32 | professional setting 33 | 34 | ## Our Responsibilities 35 | 36 | Project maintainers are responsible for clarifying the standards of acceptable 37 | behavior and are expected to take appropriate and fair corrective action in 38 | response to any instances of unacceptable behavior. 
39 | 40 | Project maintainers have the right and responsibility to remove, edit, or reject 41 | comments, commits, code, wiki edits, issues, and other contributions that are 42 | not aligned to this Code of Conduct, or to ban temporarily or permanently any 43 | contributor for other behaviors that they deem inappropriate, threatening, 44 | offensive, or harmful. 45 | 46 | ## Scope 47 | 48 | This Code of Conduct applies both within project spaces and in public spaces 49 | when an individual is representing the project or its community. Examples of 50 | representing a project or community include using an official project e-mail 51 | address, posting via an official social media account, or acting as an appointed 52 | representative at an online or offline event. Representation of a project may be 53 | further defined and clarified by project maintainers. 54 | 55 | This Code of Conduct also applies outside the project spaces when the Project 56 | Steward has a reasonable belief that an individual's behavior may have a 57 | negative impact on the project or its community. 58 | 59 | ## Conflict Resolution 60 | 61 | We do not believe that all conflict is bad; healthy debate and disagreement 62 | often yield positive results. However, it is never okay to be disrespectful or 63 | to engage in behavior that violates the project’s code of conduct. 64 | 65 | If you see someone violating the code of conduct, you are encouraged to address 66 | the behavior directly with those involved. Many issues can be resolved quickly 67 | and easily, and this gives people more control over the outcome of their 68 | dispute. If you are unable to resolve the matter for any reason, or if the 69 | behavior is threatening or harassing, report it. We are dedicated to providing 70 | an environment where participants feel welcome and safe. 71 | 72 | Reports should be directed to *[PROJECT STEWARD NAME(s) AND EMAIL(s)]*, the 73 | Project Steward(s) for *[PROJECT NAME]*. 
It is the Project Steward’s duty to 74 | receive and address reported violations of the code of conduct. They will then 75 | work with a committee consisting of representatives from the Open Source 76 | Programs Office and the Google Open Source Strategy team. If for any reason you 77 | are uncomfortable reaching out to the Project Steward, please email 78 | opensource@google.com. 79 | 80 | We will investigate every complaint, but you may not receive a direct response. 81 | We will use our discretion in determining when and how to follow up on reported 82 | incidents, which may range from not taking action to permanent expulsion from 83 | the project and project-sponsored spaces. We will notify the accused of the 84 | report and provide them an opportunity to discuss it before any action is taken. 85 | The identity of the reporter will be omitted from the details of the report 86 | supplied to the accused. In potentially harmful situations, such as ongoing 87 | harassment or threats to anyone's safety, we may take action without notice. 88 | 89 | ## Attribution 90 | 91 | This Code of Conduct is adapted from the Contributor Covenant, version 1.4, 92 | available at 93 | https://www.contributor-covenant.org/version/1/4/code-of-conduct/ 94 | -------------------------------------------------------------------------------- /docs/contributing.md: -------------------------------------------------------------------------------- 1 | # How to Contribute 2 | 3 | We would love to accept your patches and contributions to this project. 4 | 5 | ## Before you begin 6 | 7 | ### Sign our Contributor License Agreement 8 | 9 | Contributions to this project must be accompanied by a 10 | [Contributor License Agreement](https://cla.developers.google.com/about) (CLA). 11 | You (or your employer) retain the copyright to your contribution; this simply 12 | gives us permission to use and redistribute your contributions as part of the 13 | project. 
14 | 15 | If you or your current employer have already signed the Google CLA (even if it 16 | was for a different project), you probably don't need to do it again. 17 | 18 | Visit the Contributor License Agreement site linked above to see your current agreements or to 19 | sign a new one. 20 | 21 | ### Review our Community Guidelines 22 | 23 | This project follows [Google's Open Source Community 24 | Guidelines](https://opensource.google/conduct/). 25 | 26 | ## Contribution process 27 | 28 | ### Code Reviews 29 | 30 | All submissions, including submissions by project members, require review. We 31 | use [GitHub pull requests](https://docs.github.com/articles/about-pull-requests) 32 | for this purpose. 33 | -------------------------------------------------------------------------------- /docs/exercise-4b-hints/hint1.md: -------------------------------------------------------------------------------- 1 | The bug is in the handling of some standard, fairly common, HTML tags. 2 | -------------------------------------------------------------------------------- /docs/exercise-4b-hints/hint2.md: -------------------------------------------------------------------------------- 1 | Although you're not allowed to look in `html_table.py`, perhaps the name of that file 2 | gives you some clues about what sorts of HTML tags might be involved? 3 | -------------------------------------------------------------------------------- /docs/exercise-4b-hints/hint3.md: -------------------------------------------------------------------------------- 1 | There are several standard HTML table tags - `tr`, `td`, `table` (and various others). 2 | -------------------------------------------------------------------------------- /docs/exercise-4b-hints/hint4.md: -------------------------------------------------------------------------------- 1 | There is a combination of these table-related tags which will cause the browser to crash. You should write code to generate random combinations of these tags and intervening data.
2 | -------------------------------------------------------------------------------- /docs/exercise-4b-hints/hint5.md: -------------------------------------------------------------------------------- 1 | There is no hint 5. You should be able to get it from hint 4. So there! I bet you hate me now, right? 2 | -------------------------------------------------------------------------------- /docs/exercise1a.md: -------------------------------------------------------------------------------- 1 | # Exercise 1a: Use a real browser, spot things in HTML 2 | 3 | * Open up Chrome (or any other browser) 4 | * Go to https://en.wikipedia.org in that browser 5 | * Look for: 6 | * Some text which is **bold** (remember what it says) 7 | * Some text which is _italic_ (remember what it says) 8 | * Some text which is a hyperlink (remember what it says) 9 | * Choose the "view source" option (or "inspect element") 10 | * Find the text you found 11 | 12 | Here's what's happening: 13 | 14 | ```mermaid 15 | graph LR; 16 | browser(Browser on your computer) 17 | server(Wikipedia web server) 18 | browser-- request for a particular page --> server; 19 | server -- response containing HTML -->browser; 20 | ``` 21 | 22 | Questions: 23 | 24 | * What's the HTML tag meaning 'bold'? 25 | * What's the HTML tag meaning 'italic'? 26 | * What's the HTML tag meaning hyperlink? What happens when you click the link? 27 | The browser goes to another web page - how does it know where to go? 28 | 29 | (you'll need to know these things later!) 
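If you'd like to see the "response containing HTML" step from a program's point of view, here is a minimal sketch. It uses Python's standard `html.parser.HTMLParser` to list the tags and text inside a piece of HTML. The class name `TagLister` and the `SAMPLE_HTML` string are made up for illustration; they are not part of this project, and a real page from Wikipedia would contain far more tags.

```python
from html.parser import HTMLParser


class TagLister(HTMLParser):
    """Collects every opening tag and every piece of text it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []   # names of opening tags, in the order they appear
        self.texts = []  # non-empty chunks of text found between tags

    def handle_starttag(self, tag, attrs):
        # Called once for each opening tag, e.g. "h1" for <h1>
        self.tags.append(tag)

    def handle_data(self, data):
        # Called for the text between tags; skip whitespace-only chunks
        if data.strip():
            self.texts.append(data.strip())


# A made-up response body, standing in for what a server might send back.
SAMPLE_HTML = "<html><body><h1>Hello</h1><p>Some text</p></body></html>"

lister = TagLister()
lister.feed(SAMPLE_HTML)
print(lister.tags)
print(lister.texts)
```

Try swapping `SAMPLE_HTML` for a few lines you copied out of "view source" and see which tag names turn up.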
-------------------------------------------------------------------------------- /docs/exercise1b.md: -------------------------------------------------------------------------------- 1 | # Exercise 1b: Using our own browser 2 | 3 | * Do the same in our own browser, by running `python3 src/browser/browser.py` 4 | * Does it support: 5 | * **Bold** text 6 | * _Italic_ text 7 | * Hyperlinks 8 | * Open up the browser source code (open the file `src/browser/browser.py`) - 9 | it's recommended to use Visual Studio Code. (Specifically, if you're using a Chromebook, 10 | choose File -> Open Folder, click browser-learning once, and click Open. You can then use 11 | the side panel to navigate to `src/browser/browser.py`) 12 | * Find where it handles these tags. See which is missing. 13 | 14 | ## About reading code 15 | 16 | This exercise is about _reading_ code, not writing it. It's a different skill! 17 | Look for clues. See if you can piece together how the whole browser works, 18 | by cross-referencing the way different bits of the code work with each other. 19 | Use "find" (or equivalent options in Visual Studio Code). 20 | 21 | **Top tip**: try to understand how the whole thing roughly fits together 22 | before worrying about the details. Skip over bits you don't understand. 23 | 24 | Clues below! But get reading the code first, then come back to this. 25 | 26 | ## What's up with... 27 | 28 | ### The editor 29 | 30 | Top tip for reading code: make the window as big as possible. 31 | 32 | ### The overall structure 33 | 34 | The program itself is near the bottom of the file. One of the first things 35 | it does is create an object belonging to a `class` called `Browser` - see 36 | below! It then tells the browser to do things. 37 | 38 | ### `class` 39 | 40 | If your program had 1000 functions, you'd get lost. Large programs are 41 | often organized into "classes". A "class" represents a type of _thing_ in 42 | your program - called an _object_.
It might be a physical object that appears on 43 | the screen, such as an icon, or a non-tangible thing in the background 44 | (like an image decoder or a score calculator). 45 | 46 | Think of classes like nouns. 47 | 48 | Objects have: 49 | * A bunch of functions. One object can tell another object to do something - 50 | that will call one of its functions. Think of these as the verbs which 51 | act on the noun. 52 | * Some data. Think of these things as facts about the object. 53 | 54 | For instance, in Python: 55 | 56 | ```python 57 | class Dinosaur: 58 | def __init__(self): 59 | self.number_of_legs = 0 # data belonging to each object in the class. 60 | # Although each dinosaur has a number of legs, they might be different 61 | # in different dinosaurs. 62 | self.tummy_fullness = 0 63 | 64 | def eat_something(self): # other code can tell the dinosaur to eat something 65 | self.tummy_fullness = self.tummy_fullness + 1 # modify some data belonging 66 | # to this dinosaur 67 | ``` 68 | 69 | In this browser, we have exactly two classes: `Renderer` and `Browser`. 70 | There's one object of the `Browser` class existing for the whole life of the 71 | program. We create a new `Renderer` each time we visit a new page. 72 | 73 | ### `self` 74 | 75 | Most of the functions in this browser code are functions belonging to classes 76 | of objects - see `class`, above. `self` represents the current object. 77 | 78 | ### `browser.navigate()` 79 | 80 | We have an object of the class `Browser` kept in a variable called `browser`. 81 | This is calling one of the functions belonging to the `Browser` class. 82 | 83 | ### `self.canvas.create_text()` and stuff like that 84 | 85 | The class we're in has some data called `canvas` which itself contains 86 | another object, and we're calling one of the canvas' functions. 87 | 88 | ### `class Renderer(HTMLParser):` 89 | 90 | Good question! 
That means that our class `Renderer` is a special type of 91 | a different class called `HTMLParser` which happens to be provided by one 92 | of the libraries we're using. 93 | 94 | ### What's the difference between the renderer and browser? 95 | 96 | The `Browser` class is responsible for loading pages on the network. The 97 | `Renderer` class is responsible for drawing them on the screen. The two 98 | classes interact in both directions, like this: 99 | 100 | ```mermaid 101 | sequenceDiagram 102 | Browser->>Renderer: Please draw the following HTML 103 | Renderer->>Browser: User clicked a link, please load it 104 | Browser->>Renderer: Please draw the following HTML for the new page 105 | ``` 106 | -------------------------------------------------------------------------------- /docs/exercise1c.md: -------------------------------------------------------------------------------- 1 | # Exercise 1c: Make your own HTML file 2 | 3 | Inside the `src` directory is another directory called `server`, and inside 4 | there is `pages`. 5 | 6 | Create a new HTML file in there. Use the (few!) types of tag supported by 7 | the browser. 8 | 9 | You should make sure you know how to add these tags: 10 | * `a href` 11 | * `b` 12 | 13 | Make it work end-to-end! 14 | 15 | Here's how to run the server: 16 | 17 | `python3 src/server/http_server.py` 18 | 19 | Then use the browser to visit (exactly): 20 | 21 | `http://localhost:8000/exercise1c.html` 22 | (change `exercise1c.html` to whatever you called your page) 23 | 24 | > [!TIP] 25 | > You need to run the browser and the server at the same time. The best way to 26 | > do this is to type Control-Z, and then type `bg`, which will run the browser 27 | > in the background. (You can see this by typing `jobs`.) Another way is to 28 | > open another terminal and then run `cd browser-learning` then `. venv/bin/activate`. 29 | 30 | Ensure you can see bold text and hyperlinks. 31 | 32 | ## What's up with? 
33 | 34 | ### `http://localhost:8000/exercise1c.html` 35 | 36 | * `http` here is the "scheme". That is, it's an agreement about the way that 37 | the browser and server communicate. There are others including `https` 38 | and `ftp`. We'll see the details of `http` later. 39 | * `localhost`: this is a special name which means "your own computer". 40 | Since the web server is running on your own computer, that's what we'll 41 | connect to. 42 | * `:8000`: this is called the "port number" and isn't very important right now. 43 | A single computer can be running several different web servers, and each 44 | has its own port number. We chose 8000 for our server. (The standard port is 45 | 80, but we're not using that for our exercises in case something else is 46 | already using that port number.) 47 | * `exercise1c.html`: the name of the HTML page that the browser will request 48 | from the server. 49 | -------------------------------------------------------------------------------- /docs/exercise2a.md: -------------------------------------------------------------------------------- 1 | # Exercise 2a 2 | 3 | Make the browser support _italic_ text whenever it comes across the 4 | HTML tags `<i>` or `<em>`. (`em` means "emphasis" and sometimes websites 5 | use that rather than `i` for italic.) 6 | 7 | Check it works using a web page you create. 8 | -------------------------------------------------------------------------------- /docs/exercise2b.md: -------------------------------------------------------------------------------- 1 | # Exercise 2b: Add support for other tags 2 | 3 | Do some [research on standard HTML tags](https://www.w3schools.com/tags/), and add support for another one of your choice. 4 | 5 | * (Hopefully) easy choices: anything that just alters the appearance of the text, e.g. `font color`. 6 | * Slightly harder: support the horizontal rule, `hr`, or the `ul` and `li` list tags.
7 | * Medium choices: change the title bar of the browser window when it finds a `title` tag. (Hint: this will 8 | probably involve adding another function to the `Browser` class which you'll call from the `Renderer` class). 9 | * Very hard choice: add support for `img src` which will include pictures on the page. 10 | This will likely take several hours even if you're a Python wizard. You'd need to do this: 11 | 1. Get an absolute URI to the image (see the existing code in `link_clicked` which does the same) 12 | 2. Download the image file using [`requests`](https://pypi.org/project/requests/), perhaps into a 13 | [`NamedTemporaryFile`](https://docs.python.org/3/library/tempfile.html#tempfile.NamedTemporaryFile) 14 | 3. Use `create_image` on the `canvas` to draw it. 15 | * Another very hard choice: the `blink` tag. This will require you to set a timer to re-draw the page 16 | periodically. 17 | * Impossible choice: anything to do with page layout. Don't attempt to support tables, frames or anything 18 | fancy like that. It would take hundreds of hours - it's all very complicated. Don't even try. 19 | 20 | 21 | -------------------------------------------------------------------------------- /docs/exercise2c.md: -------------------------------------------------------------------------------- 1 | # Exercise 2c: word wrap! 2 | 3 | You might notice that the browser doesn't have word wrap. Really long lines 4 | of text flow right off the side of the window and you can't read them. Word 5 | wrap is the way most browsers (and other apps) will chop those long lines 6 | into shorter lines automatically, so you can read all of it. 7 | 8 | How would you do that? 9 | 10 | Hints: 11 | * This will all be inside the `Renderer` class. 12 | * In `handle_data` 13 | * Wrap to 300 pixels wide. Don't draw anything beyond that. 14 | * Do it _approximately_ first. 
15 | * Right now, it receives all the text inside each HTML tag in one go, for 16 | example `All this text arrives in one call to that function and 17 | this might be too wide to fit on the screen in one line.` 18 | * Maybe you can split that into words and draw one word at a time? 19 | * That sounds like a loop, right? 20 | * The Python [`split` function](https://www.w3schools.com/python/ref_string_split.asp) might be useful. 21 | * We already measure how wide the text was. Perhaps you can use that to 22 | decide when to start a new line? 23 | * If you're a perfectionist, you might find that some words have gone too far. 24 | Maybe `canvas.delete` helps! 25 | -------------------------------------------------------------------------------- /docs/exercise2d.md: -------------------------------------------------------------------------------- 1 | # Exercise 2d: invent your own! 2 | 3 | Make up your own HTML tag! Do it at both the browser and server side. 4 | -------------------------------------------------------------------------------- /docs/exercise3a.md: -------------------------------------------------------------------------------- 1 | # Exercise 3a: spy on requests and responses 2 | 3 | ## What you need to know 4 | 5 | We've seen that there is a web browser, which makes a request to a web server, 6 | and receives a response. 7 | 8 | ```mermaid 9 | graph LR; 10 | Browser-- request -->Server; 11 | Server-- response -->Browser; 12 | ``` 13 | 14 | That request, and its response, travels across wires and through airwaves. 15 | Maybe it contains your credit card number, your secret crush, or your health 16 | details. 17 | 18 | What if someone is listening in? 19 | 20 | ```mermaid 21 | graph LR; 22 | Browser-- request -->Hacker; 23 | Hacker --> Server 24 | Server-- response -->Hacker; 25 | Hacker-->Browser; 26 | ``` 27 | 28 | Let's be such a hacker! We're going to use a tool called `tcpdump` which can 29 | spy on *all the network traffic*. 
30 | 31 | (`tcpdump` means `TCP` dump. `TCP` is the ["transmission control protocol", which 32 | is an agreement about how computers can communicate over the internet.](https://en.wikipedia.org/wiki/Transmission_Control_Protocol)) 33 | 34 | ## Getting started 35 | 36 | Make an HTML page within `src/server/pages/crush.html` with your secret crush, in HTML. 37 | 38 | ## Running `tcpdump` 39 | 40 | The computer you're using probably has several **interfaces**. That is, ways 41 | it can talk over the network. It might have a wired network socket, and it might 42 | also be able to work on wireless ("WiFi" networks). 43 | 44 | Run this command to find out what interfaces you have: 45 | 46 | ``` 47 | sudo tcpdump -D 48 | ``` 49 | 50 | > [!TIP] 51 | > Like in exercise 1, you need to run several commands at once. Once 52 | > a command is running, the terminal won't act on other commands you 53 | > give it. You can create 54 | > several terminal windows to run several commands, or after you've run a 55 | > command you can press Control-Z and then run the command `bg` 56 | 57 | Hopefully, one of them is labelled "loopback". That's the one we want today, 58 | which enables us to spy on a browser and server running on your computer. 59 | What's it called? It might be called `lo`. Remember that. 60 | 61 | Now run 62 | 63 | ``` 64 | sudo tcpdump -i lo -A 'tcp port 8000' 65 | ``` 66 | 67 | (You might need to swap `lo` with whatever your loopback interface was called.) 68 | 69 | > [!TIP] 70 | > You can press the Up arrow to edit a command you previously ran. 71 | 72 | Imagine you're doing this on a computer that's somewhere in between the browser 73 | and server. It would work just the same way. 74 | 75 | ## Spying on an HTTP request 76 | 77 | Run the simple web browser. Navigate to `http://localhost:8000/crush.html`. 78 | 79 | You should see the HTTP request go past, and the response. 
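If you'd like proof that an HTTP request really is just a few lines of text, here's a rough sketch (not part of the course code) which types out a request by hand over a socket instead of using the browser. So that it runs on its own, it starts a throwaway web server on a spare port rather than assuming `http_server.py` is running; the name `fetch_raw` is made up for this example.

```python
import http.server
import socket
import threading


def fetch_raw(host, port, path):
    """Send a hand-written HTTP request and return the raw response as text."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "Connection: close\r\n"
        "\r\n"  # a blank line marks the end of the request headers
    )
    with socket.create_connection((host, port)) as conn:
        conn.sendall(request.encode("ascii"))
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:  # the server closed the connection: we have everything
                break
            chunks.append(chunk)
    return b"".join(chunks).decode("utf-8", errors="replace")


# A throwaway server for this demo only. Port 0 means "pick any free port".
server = http.server.ThreadingHTTPServer(
    ("127.0.0.1", 0), http.server.SimpleHTTPRequestHandler
)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

response = fetch_raw("127.0.0.1", port, "/")
status_line = response.split("\r\n")[0]
print(status_line)  # the first line of the response, e.g. the HTTP version and status

server.shutdown()
server.server_close()
```

To watch this traffic in `tcpdump` with the command above, you'd instead leave out the throwaway server and call `fetch_raw("localhost", 8000, "/crush.html")` while `http_server.py` is running.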
80 | 81 | Here's an example request (yours will look slightly different): 82 | 83 | ``` 84 | GET /crush.html HTTP/1.1 85 | Host: localhost:8000 86 | User-Agent: python-requests/2.31.0 87 | Accept-Encoding: gzip, deflate 88 | Accept: */* 89 | Connection: keep-alive 90 | ``` 91 | 92 | Here's an example response (again yours will be a bit different): 93 | 94 | ``` 95 | HTTP/1.0 200 OK 96 | Server: SimpleHTTP/0.6 Python/3.12.2 97 | Date: Fri, 16 Feb 2024 16:43:01 GMT 98 | Content-type: text/html 99 | Content-Length: 344 100 | Last-Modified: Fri, 16 Feb 2024 16:13:49 GMT 101 | 102 | 103 | 104 | About my crush 105 | 106 | 107 |

My secret crush is Elsa from Frozen.

108 | 109 | 110 | ``` 111 | 112 | This is the information flowing over the network between the browser and the 113 | server. 114 | 115 | We're doing it on the same computer as the browser and server, but your 116 | information passes across dozens of computers on the way, and normally, 117 | any of them could spy. 118 | 119 | # Questions to answer 120 | 121 | * In the request, what HTTP version is being used? 122 | * What's the "content type" of the response from the server? (HTTP servers can 123 | supply images, video, and audio as well as HTML - you might have noticed them) 124 | 125 | 126 | What do we do? 127 | 128 | Type Control-C to zap `tcpdump` when you're finished. -------------------------------------------------------------------------------- /docs/exercise3b.md: -------------------------------------------------------------------------------- 1 | # Exercise 3b: encryption 2 | 3 | Do the same thing, but: 4 | 5 | * Use `https_server.py` instead of `http_server.py` 6 | * Navigate to `https://localhost:4443/exercise1b.html` instead of `http://localhost:8000/exercise1b.html` (note _both_ the different number, and the extra `s` on the end of `http`) 7 | * Use a slightly different `tcpdump` command: 8 | 9 | ``` 10 | sudo tcpdump -i lo -A 'tcp port 4443' 11 | ``` 12 | 13 | What do you see now? -------------------------------------------------------------------------------- /docs/exercise4a.md: -------------------------------------------------------------------------------- 1 | # Exercise 4a: find security bugs 2 | 3 | Imagine you're a website operator who wants to deceive the browser user or 4 | cause them harm. Muahahaha. 5 | 6 | Find a way to do that! There are several bugs hidden in the browser. 7 | 8 | Any of the following counts as a success: 9 | 10 | * Any way you can *crash* the browser, that is, cause it to exit without 11 | the user asking for it to exit. 
12 | * A way you can make the user think they're looking at one website 13 | when they're actually looking at another. (Imagine visiting a website 14 | showing the bank account details of somebody you're paying.) 15 | * A way that one website can find out what was displayed in some 16 | other website. 17 | 18 | Rules: 19 | * You can only do this by *altering the HTML content of the web page*. Remember, 20 | you're a website operator. You *cannot* change the browser code. 21 | * You *must not* look in the `html_table.py` file, because that's for 22 | a subsequent exercise. 23 | 24 | > [!TIP] 25 | > If you find a bug which makes one website look like another one, 26 | > maybe you want to involve `src/server/pages/spoofable.html` to make 27 | > a convincing demo. 28 | 29 | There are (at least) three different crashes to find, plus one other bug. 30 | (There might be others as well!) 31 | 32 | ## What sorts of things cause security bugs? 33 | 34 | * [Buffer overflow](https://en.wikipedia.org/wiki/Buffer_overflow) 35 | * [Divide by zero](https://en.wikipedia.org/wiki/Division_by_zero) 36 | * [Type confusion](https://www.microsoft.com/en-us/security/blog/2015/06/17/understanding-type-confusion-vulnerabilities-cve-2015-0336/) 37 | * [Use after free](https://en.wikipedia.org/wiki/Dangling_pointer#use_after_free) (not possible in Python) 38 | * [Violations of the Line of Death](https://textslashplain.com/2017/01/14/the-line-of-death/) or other spoofing of parts of the user interface that the user might base their security judgements on 39 | 40 | ## If you find a security bug in a real browser 41 | 42 | [The browser maker will pay you!](https://bughunters.google.com/about/rules/5745167867576320/chrome-vulnerability-reward-program-rules#reward-amounts). 43 | 44 | > [!TIP] 45 | > Not all _crashes_ are security bugs - it depends whether attackers can 46 | > use them to steal some data instead of crashing. 
-------------------------------------------------------------------------------- /docs/exercise4b.md: -------------------------------------------------------------------------------- 1 | # Exercise 4b: find security bugs (automatically) 2 | 3 | Congratulations! If you're reading this, you must have successfully found all 4 | the hidden security bugs in the previous exercise. 5 | 6 | Now you're going to write a program to find more security bugs. A program which 7 | finds security bugs by testing another program is called a "fuzzer". 8 | 9 | **Note**: this exercise probably only works on Linux, Mac or Chromebooks. 10 | 11 | Do this: 12 | 13 | * Do *NOT* look at the code for `src/browser/html_table.py`. That is cheating! 14 | * Open `src/fuzzer/fuzzer.py` in VSCode and read it. 15 | * Run `python3 src/fuzzer/fuzzer.py`. Watch what it does. 16 | * Press Control-C to cancel it. 17 | 18 | Now: 19 | 20 | 1. Modify *one single number* in the `generate_testcase` function so that it 21 | finds one of the security bugs. Run the fuzzer again. 22 | 2. Now, modify `generate_testcase` to find another bug which is hidden in 23 | `src/browser/html_table.py`. Do *not* look at its code - that's cheating! 24 | To be clear, this is an _extra_ security bug which wasn't in `browser.py`. 25 | 26 | ## General hints (no spoilers! Fine to read) 27 | 28 | Writing a good fuzzer is hard. You'll need to think about: 29 | 30 | * How long it takes the fuzzer to explore all the things you want it to 31 | explore. 32 | * Whether you are aiming to generate fake HTML tags, or snippets of HTML 33 | consisting of valid tags, or both. Both is hard. 34 | * Generating all possible HTML tags. Consider using [`random.choice`](https://docs.python.org/3/library/random.html#random.choice). 35 | * Connecting several HTML tokens together, possibly by generating multiple tags in a loop 36 | and then building a string containing all the tokens you made.
You can go through 37 | the loop a [random number of times](https://docs.python.org/3/library/random.html#random.choice). 38 | * Sometimes it's worth calculating roughly how long it might take before the 39 | fuzzer happens upon the test case you want. If it's too long to be 40 | realistic, change the fuzzer to be more targeted. 41 | * Sometimes your fuzzer will need to actively _avoid_ existing known bugs. 42 | In this case, you'll want to write `generate_testcase` to avoid triggering 43 | the bug with headers, or it may prevent you finding the other bug you're 44 | looking for. 45 | 46 | ## Specific hints (spoilers ahead!) 47 | 48 | Read one at a time and see if it's enough for you to write a fuzzer to find it... 49 | 50 | * [Hint one](exercise-4b-hints/hint1.md) 51 | * [Hint two](exercise-4b-hints/hint2.md) 52 | * [Hint three](exercise-4b-hints/hint3.md) 53 | * [Hint four](exercise-4b-hints/hint4.md) 54 | * [Hint five](exercise-4b-hints/hint5.md) 55 | 56 | ## Bonus exercise 57 | 58 | Congratulations again, you've got to the end of the course! 59 | 60 | If you've got spare time, do this: 61 | 62 | * Pair up with someone else who has finished. 63 | * One of you now has to hide an *extra* security bug in the browser. 64 | * And the other one has to make sure your fuzzer is good enough to find it. 65 | * Then swap round! 66 | -------------------------------------------------------------------------------- /docs/for-teachers.md: -------------------------------------------------------------------------------- 1 | # For teachers 2 | 3 | Intended educational outcomes (this page will later be updated with references 4 | to the UK computer science GCSE curriculum, and maybe others): 5 | 6 | * **Get comfortable reading code**. The project will involve _reading_ a 7 | Python web browser containing about 100 lines of code. It will be covered 8 | in comments and won't assume more than basic Python knowledge - but still, 9 | reading large amounts of code is a skill in itself. 
There will be games 10 | based on finding the right place to add small new features, which will be 11 | mostly about figuring out where to add them. 12 | * **Understand encryption.** We'll demonstrate, using a Python web 13 | browser and a Python web server, how unencrypted HTTP traffic can be 14 | intercepted, and how HTTPS traffic can't. 15 | * **Understand networking.** We'll see the requests and responses flowing 16 | back and forth between the web browser and web server. 17 | * **Understand security bugs.** There are games to find some hidden security 18 | bugs in the web browser. 19 | 20 | Because this involves a web server and a web browser, these exercises can't 21 | be run on an online Python REPL. The kids will need a real computer capable 22 | of running Python locally, and they'll need to be able to install a few Python 23 | libraries using `pip`. You should carefully run through the [setup requirements](setup.md) 24 | before deciding if this project is right for you. 25 | 26 | You can find [solutions on this page](solutions.md). 27 | -------------------------------------------------------------------------------- /docs/setup.md: -------------------------------------------------------------------------------- 1 | # System requirements and setup 2 | 3 | In all cases, open a browser to visit `https://github.com/adetaylor/browser-learning` 4 | - you should be looking at these pages! 5 | 6 | ## Mac 7 | 8 | 1. Open Terminal. 9 | 2. Run `python3 --version`. See what Python version you have. You need at least Python 3.12. You can get it from the [main Python website](https://www.python.org/downloads/). 10 | 3. Fetch and install [Visual Studio Code](https://code.visualstudio.com/) if you don't already have it. 11 | 4. Fetch the zip of this course (your instructor will tell you how). 12 | 5. In Terminal, run: 13 | ``` 14 | git clone https://github.com/adetaylor/browser-learning.git 15 | cd browser-learning 16 | python3 -m venv venv 17 | .
venv/bin/activate 18 | python3 -m pip install -r src/requirements.txt 19 | ``` 20 | 21 | ## Chromebooks 22 | 23 | 1. Under the main menu in bottom left corner of the screen, open Terminal. Follow the instructions to turn on the Linux environment if necessary. Go with all the standard settings. (Note that this environment may not be available for guest users) 24 | 2. In the terminal, run these commands: 25 | ``` 26 | git clone https://github.com/adetaylor/browser-learning.git 27 | cd browser-learning 28 | ./install-chromebook-prerequisites.sh 29 | python3 -m venv venv 30 | . venv/bin/activate 31 | pip3 install -r src/requirements.txt 32 | ``` 33 | 3. Keep the terminal open - you'll need it to run commands. (If you open a new terminal, run `cd browser-learning` and then `. venv/bin/activate`) 34 | 4. From the main menu, also open Visual Studio code. 35 | 36 | ## Other types of machine 37 | 38 | Do what's required to install: 39 | 40 | * Python 3.12+ 41 | * `tcpdump` 42 | * Visual Studio code, or some other good code editor (the students will be doing lots of _reading code_ so a good IDE is highly recommended) 43 | 44 | ## Troubleshooting 45 | 46 | * I see `"If this fails your Python may not be configured for Tk".` 47 | This is probably on MacOS and you probably installed python using `brew`; `brew install python-tk` might work. 48 | -------------------------------------------------------------------------------- /docs/solutions.md: -------------------------------------------------------------------------------- 1 | # Solutions 2 | 3 | Example solutions (there may be others!) 4 | 5 | # Exercise 4b 6 | 7 | ``` 8 | text = "" 9 | num = random.randrange(0, 12) 10 | for x in range(0, num): 11 | text += random.choice(["", "", "", "", "", "
", "hello"]) 12 | return text 13 | ``` 14 | -------------------------------------------------------------------------------- /install-chromebook-prerequisites.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | sudo apt-get -y update 4 | curl -o code.deb -L http://go.microsoft.com/fwlink/?LinkID=760868 5 | sudo dpkg -i code.deb 6 | sudo apt -y --fix-broken install 7 | sudo apt-get -y install python3.11-venv python3.11-tk pip openssl tcpdump libnss3 libnspr4 8 | -------------------------------------------------------------------------------- /src/browser/browser.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | 3 | # Copyright 2024 Google LLC 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # https://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | 17 | # Simple demo python web browser. Lacks all sorts of important features. 
18 | 19 | from PyQt6.QtWidgets import QApplication, QWidget, QMainWindow, QVBoxLayout, QHBoxLayout, QPushButton, QLabel, QLineEdit, QSizePolicy 20 | from PyQt6.QtCore import QSettings, Qt, QPoint, QSize, QSocketNotifier, QTimer 21 | from PyQt6.QtGui import QFont, QMouseEvent, QPainter, QFontMetrics 22 | import requests 23 | import os 24 | import signal 25 | import sys 26 | import html_table # do not look inside this file, that would be cheating on a later exercise 27 | from html.parser import HTMLParser 28 | from urllib.parse import urlparse 29 | 30 | # How much bigger to make the font when we come across

<h1> to <h6>

tags 31 | FONT_SIZE_INCREASES_FOR_HEADERS_1_TO_6 = [10, 6, 4, 3, 2, 1] 32 | 33 | 34 | class Renderer(HTMLParser, QWidget): 35 | """ 36 | Represents the area of the screen occupied by the web content (as 37 | opposed to the URL bar, etc.) Knows how to convert HTML tags 38 | (e.g. some text) into actual pixels on the screen, by drawing 39 | the right sort of text in the right places. 40 | """ 41 | 42 | def __init__(self, browser, parent=None): 43 | """ 44 | Code which is run when we create a new Renderer. 45 | """ 46 | # Set up the underlying HTML parser and user interface code. 47 | super(Renderer, self).__init__() 48 | QWidget.__init__(self, parent) 49 | # Tell the user interface code that we are stretchy, in case 50 | # the window is resized. 51 | self.setSizePolicy(QSizePolicy(QSizePolicy.Policy.Expanding, QSizePolicy.Policy.Expanding)) 52 | self.known_links = list() # stores clickable UI areas 53 | # and the URL we want to visit when it's clicked 54 | # as a list of five-tuples like this: 55 | # (x1, y1, x2, y2, url) 56 | # e.g. 57 | # (10, 20, 50, 30, "http://foo.com") 58 | self.html = "" 59 | self.browser = browser 60 | 61 | def minimumSizeHint(self): 62 | """ 63 | Returns the smallest possible size on the screen for our renderer. 64 | """ 65 | return QSize(800, 400) 66 | 67 | def mouseReleaseEvent(self, event) -> None: 68 | """ 69 | Handle a click somewhere in the renderer area. See if it 70 | matches any known link. 71 | """ 72 | if event is not None and self.browser is not None: 73 | x = event.position().x() 74 | y = event.position().y() 75 | for possible_link in self.known_links: 76 | (x1, y1, x2, y2, url) = possible_link 77 | if x >= x1 and x <= x2 and y >= y1 and y <= y2: 78 | self.browser.link_clicked(url) 79 | return None 80 | return super().mouseReleaseEvent(event) 81 | 82 | def set_html(self, html): 83 | """ 84 | Store the HTML text we've been given. We'll use it when 85 | the user interface asks us to repaint. 
86 | """ 87 | self.html = html 88 | self.update() # redraw. This will arrange for the UI 89 | # library to call paintEvent, just below. 90 | 91 | def paintEvent(self, event): 92 | """ 93 | This is called by the user interface whenever we need 94 | to draw the screen. This may be called because of the 95 | "self.update()" call just above, or because the UI 96 | has decided we need to redraw for some other reason. 97 | 98 | Take the HTML we were given by the website, and draw it 99 | onto the screen. 100 | """ 101 | self.painter = QPainter(self) 102 | # Clear the area. 103 | self.painter.fillRect(self.rect(), Qt.GlobalColor.white) 104 | # Information we are remembering as we draw the page, 105 | # to influence how we draw subsequent bits of the page. 106 | self.y_pos = 0 # where we should draw the next text 107 | self.x_pos = 0 # where we should draw the next text 108 | self.ignore_current_text = False 109 | self.is_bold = False 110 | self.is_strikethrough = False 111 | self.font_size = 12 112 | self.tallest_text_in_previous_line = 0 113 | self.space_needed_before_next_data = False 114 | self.current_link = None # if we're in a hyperlink 115 | self.known_links = list() # Links anywhere on the page 116 | self.table = None # whether we're in an HTML table 117 | # The following call interprets all the HTML in page_html. 118 | # You can't see most of the code which does this because it's 119 | # in the library which provides the HTMLParser class. But it will 120 | # result in lots of calls to handle_starttag, handle_endtag and 121 | # handle_data. 122 | # So you can think of this as lots of calls to handle_starttag, 123 | # handle_data and handle_endtag depending on what's inside self.html. 
 124 | self.feed(self.html) 125 | self.painter = None 126 | # Ignore the following two lines, they're used for exercise 4b only 127 | if os.environ.get("OUTPUT_STATUS") is not None: 128 | print("Rendering completed\n", flush=True) 129 | 130 | def handle_starttag(self, tag, attrs): 131 | """ 132 | Handle an HTML start tag, for instance, <b> 133 | or <a href="...">. In these cases, 'b' and 'a' 134 | are the tag, and in the latter case we also have an "attribute" 135 | (attr) 136 | """ 137 | if tag == 'script' or tag == 'style' or tag == 'title': 138 | # Stuff inside these tags isn't actually HTML 139 | # to display on the screen. 140 | self.ignore_current_text = True 141 | if self.table is not None: 142 | # If we're inside a table, handle table-related tags but no others 143 | if tag == 'tr': 144 | self.table.handle_tr_start() 145 | if tag == 'td': 146 | self.table.handle_td_start() 147 | return 148 | if tag == 'b' or tag == 'strong': 149 | self.is_bold = True 150 | if tag == 's': 151 | self.is_strikethrough = True 152 | if tag == 'a': # hyperlink.
153 | for tag_name, tag_value in attrs: 154 | if tag_name == 'href': 155 | self.current_link = tag_value 156 | if tag == 'meta': # sometimes sites redirect users to other sites 157 | # Looks like 158 | is_refresh = False 159 | content = None 160 | for tag_name, tag_value in attrs: 161 | if tag_name == 'http-equiv': 162 | if tag_value == 'refresh': 163 | is_refresh = True 164 | if tag_name == 'content': 165 | content = tag_value 166 | if is_refresh and content is not None: 167 | parts = content.split('; ') 168 | if len(parts) == 2: 169 | self.browser.set_window_url(parts[1]) 170 | if parts[0] == '0': # navigate immediately to the requested URL 171 | self.browser.navigate(parts[1]) 172 | # Delayed navigations not yet supported by this browser 173 | if tag == 'small': 174 | self.font_size -= 1 175 | if tag == 'big': 176 | self.font_size += 1 177 | # h1...h6 header tags 178 | if len(tag) == 2 and tag[0] == 'h' and tag != 'hr': 179 | self.newline() 180 | heading_number = int(tag[1]) 181 | font_size_difference = FONT_SIZE_INCREASES_FOR_HEADERS_1_TO_6[heading_number - 1] 182 | self.font_size += font_size_difference 183 | if tag == 'table': 184 | self.table = html_table.HTMLTable() 185 | self.space_needed_before_next_data = True 186 | 187 | def handle_endtag(self, tag): 188 | """ 189 | Handle an HTML end tag, for example or 190 | """ 191 | if self.table is not None: 192 | # If we're inside a table, handle table end but no other tags 193 | if tag == 'table': 194 | self.y_pos = self.table.handle_table_end(self.y_pos, lambda x, y, content: self.draw_text(x, y, content)) 195 | self.table = None 196 | return 197 | if tag == 'br' or tag == 'p': # move to a new line 198 | self.newline() 199 | if tag == 'script' or tag == 'style' or tag == 'title': 200 | self.ignore_current_text = False 201 | if tag == 'b' or tag == 'strong': 202 | self.is_bold = False 203 | if tag == 'a': 204 | self.current_link = None 205 | if tag == 's': 206 | self.is_strikethrough = False 207 | if tag == 
'small': 208 | self.font_size += 1 209 | if tag == 'big': 210 | self.font_size -= 1 211 | if len(tag) == 2 and tag[0] == 'h' and tag != 'hr': 212 | self.newline() 213 | heading_number = int(tag[1]) 214 | font_size_difference = FONT_SIZE_INCREASES_FOR_HEADERS_1_TO_6[heading_number - 1] 215 | self.font_size -= font_size_difference 216 | self.space_needed_before_next_data = True 217 | 218 | def newline(self): 219 | """ 220 | Start a new line of text. 221 | """ 222 | SPACING = 3 # just allow a bit of extra space between lines 223 | self.y_pos += self.tallest_text_in_previous_line + SPACING 224 | self.x_pos = 0 225 | self.tallest_text_in_previous_line = 0 226 | 227 | def handle_data(self, data): 228 | """ 229 | Handle some actual text, found within a tag or outside them. For example, 230 | FOO in FOO. 231 | """ 232 | data = data.rstrip() 233 | if self.ignore_current_text or data == '': 234 | return 235 | if self.space_needed_before_next_data: 236 | self.space_needed_before_next_data = False 237 | data = ' ' + data 238 | if self.table is not None: 239 | # If we're inside a table, ask our table layout code to 240 | # figure out where to draw it later 241 | self.table.handle_data(data) 242 | else: 243 | (text_width, text_height) = self.draw_text(self.x_pos, self.y_pos, data) 244 | self.x_pos = self.x_pos + text_width 245 | if text_height > self.tallest_text_in_previous_line: 246 | self.tallest_text_in_previous_line = text_height 247 | 248 | def draw_text(self, x_pos, y_pos, text): 249 | """ 250 | Draw some text on the screen. 251 | Returns a tuple of (x, y) space occupied 252 | """ 253 | # Work out what font we'll draw this in. 
254 | weight = QFont.Weight.Normal 255 | if self.is_bold: 256 | weight = QFont.Weight.Bold 257 | font = QFont("Helvetica", weight=weight, pointSize=self.font_size, italic=False) 258 | self.painter.setFont(font) 259 | fill = Qt.GlobalColor.black 260 | if self.current_link is not None: 261 | fill = Qt.GlobalColor.blue 262 | self.painter.setPen(fill) 263 | # Work out the size of the text we're about to draw. 264 | text_measurer = QFontMetrics(font) 265 | text_width = int(text_measurer.horizontalAdvance(text)) 266 | text_height = int(text_measurer.height()) 267 | # Tell our GUI canvas to draw some text! The important bit! 268 | self.painter.drawText(QPoint(x_pos, y_pos + text_height), text) 269 | # If we're in a hyperlink, underline it and record its coordinates 270 | # in case it gets clicked later. 271 | if self.current_link is not None: 272 | self.painter.drawLine(x_pos, y_pos + text_height, x_pos + text_width, y_pos + text_height) 273 | self.known_links.append((x_pos, y_pos, x_pos + text_width, y_pos + text_height, self.current_link)) 274 | # Strikethrough - draw a line over the text but only 275 | # if we don't cover more than 50% of it, we don't want it illegible 276 | if self.is_strikethrough: 277 | fraction_of_text_covered = 6 / self.font_size 278 | if fraction_of_text_covered <= 0.5: 279 | strikethrough_line_y_pos = y_pos + (self.font_size / 2) - 80 280 | self.canvas.create_line(x_pos, strikethrough_line_y_pos, 281 | x_pos + text_width, strikethrough_line_y_pos) 282 | return (text_width, text_height) 283 | 284 | 285 | class Browser(QMainWindow): 286 | """ 287 | A class of objects representing the browser window. At any time there's 288 | exactly one of these Browser objects existing. 289 | """ 290 | 291 | def __init__(self, initial_url): 292 | """ 293 | Code which is run when our Browser is created. 294 | """ 295 | super(Browser, self).__init__() 296 | self.current_url = None 297 | # All the following code lays out the UI. 
298 | toolbar_layout = QHBoxLayout() 299 | toolbar_layout.addWidget(QLabel("URL:")) 300 | self.url_box = QLineEdit() 301 | self.url_box.returnPressed.connect(self.go_button_clicked) 302 | toolbar_layout.addWidget(self.url_box) 303 | go_button = QPushButton("Go") 304 | go_button.clicked.connect(self.go_button_clicked) 305 | toolbar_layout.addWidget(go_button) 306 | exit_button = QPushButton("Exit") 307 | exit_button.clicked.connect(lambda: QApplication.quit()) 308 | toolbar_layout.addWidget(exit_button) 309 | toolbar = QWidget() 310 | toolbar.setLayout(toolbar_layout) 311 | overall_layout = QVBoxLayout() 312 | overall_layout.addWidget(toolbar) 313 | self.renderer = Renderer(self) 314 | overall_layout.addWidget(self.renderer) 315 | self.status_bar = QLabel("Status:") 316 | overall_layout.addWidget(self.status_bar) 317 | widget = QWidget() 318 | widget.setLayout(overall_layout) 319 | self.setCentralWidget(widget) 320 | # Set up somewhere to remember the last URL the user used 321 | self.settings = QSettings("browser-learning", "browser") 322 | if initial_url is None: 323 | initial_url = self.settings.value("url", "https://en.wikipedia.org", type=str) 324 | else: 325 | self.navigate(initial_url) 326 | self.set_window_url(initial_url) 327 | self.setup_fuzzer_handling() # ignore 328 | 329 | def go_button_clicked(self): 330 | """ 331 | Called when the Go button is clicked 332 | """ 333 | url = self.url_box.text() 334 | self.navigate(url) 335 | 336 | def link_clicked(self, url): 337 | """ 338 | Called when a link on the page was clicked. 339 | """ 340 | if not ':' in url: 341 | # The hyperlink was a relative URL, e.g. just "some_page.html". 342 | # We need to change that into an absolute URL, e.g. 343 | # https://en.wikipedia.org/some_page.html 344 | # by combining it with parts of the currently-viewed URL. 345 | # (This is a simplification of the real checks for relative URLs...) 
346 | current_url_parts = urlparse(self.current_url) 347 | url = current_url_parts._replace(path=url).geturl() 348 | # fill in the URL bar with the new URL 349 | self.set_window_url(url) 350 | self.navigate(url) 351 | 352 | def set_status(self, message): 353 | """ 354 | Update the status line at the bottom of the screen 355 | """ 356 | self.status_bar.setText(message) 357 | 358 | def set_window_url(self, url): 359 | """ 360 | Sets the URL bar 361 | """ 362 | self.url_box.setText(url) 363 | 364 | def navigate(self, url): 365 | """ 366 | Navigates the browser to a new URL 367 | """ 368 | if not ':' in url: 369 | url = 'https://' + url 370 | self.set_window_url(url) 371 | self.settings.setValue("url", url) 372 | self.current_url = url 373 | self.set_status('Status: loading...') 374 | self.setup_encryption(url) 375 | # Connect over the network to a web server to get the HTML 376 | # at this URL. 377 | try: 378 | response = requests.get(url) 379 | except: 380 | self.set_status('Status: unable to connect to %s' % url) 381 | self.renderer.set_html("") 382 | return 383 | if not response.ok: 384 | self.set_status('Status: web server gave us error code %d' % 385 | response.status_code) 386 | self.renderer.set_html("") 387 | return 388 | page_html = response.text 389 | # Tell the renderer about the HTML. 390 | # It doesn't actually handle it until it's asked to paint. 391 | self.renderer.set_html(page_html) 392 | self.set_status('Status: OK') 393 | 394 | def setup_encryption(self, url): 395 | """ 396 | Ignore this function - it's used to set up 397 | encryption for some of the later exercises. 
398 | """ 399 | if "localhost" in url: 400 | os.environ["REQUESTS_CA_BUNDLE"] = os.path.join(os.path.dirname( 401 | os.path.dirname(__file__)), "server/tls_things/server.crt") 402 | elif "REQUESTS_CA_BUNDLE" in os.environ: 403 | del os.environ["REQUESTS_CA_BUNDLE"] 404 | 405 | def setup_fuzzer_handling(self): 406 | """ 407 | Ignore this function - it's used to set up 408 | fuzzing for some of the later exercises. 409 | """ 410 | self.reader, self.writer = os.pipe() 411 | signal.signal(signal.SIGHUP, lambda _s, _h: os.write(self.writer, b'a')) 412 | notifier = QSocketNotifier(self.reader, QSocketNotifier.Type.Read, self) 413 | notifier.setEnabled(True) 414 | def signal_received(): 415 | os.read(self.reader, 1) 416 | window.go_button_clicked() 417 | notifier.activated.connect(signal_received) 418 | 419 | 420 | ######################################### 421 | # Main program here 422 | ######################################### 423 | 424 | # Set up the graphical user interface (GUI) 425 | app = QApplication([]) 426 | 427 | # See if we were given a URL on the command-line 428 | initial_url = None 429 | if len(sys.argv) > 1: 430 | initial_url = sys.argv[1] 431 | 432 | # Create the one (and only) example of our Browser class. 433 | window = Browser(initial_url) 434 | window.show() 435 | 436 | # Every 100 msec, check if we've been asked to reload - 437 | # this is only relevant for exercise 4b and works around a bug 438 | # in the GUI toolkit. 439 | timer = QTimer() 440 | timer.timeout.connect(lambda: None) 441 | timer.start(100) 442 | 443 | # The "event loop". An event is something like a click or the user 444 | # typing something. Keep handling those events from the user until 445 | # Exit is clicked or the window is closed. All of this happens 446 | # within app.exec(). 
447 | # In particular, this will end up calling the "repaint" method whenever 448 | # we need to display something on the screen, along with 449 | # methods above like "go_button_clicked" or "mouseReleaseEvent" 450 | # when the user interacts with the app. 451 | app.exec() -------------------------------------------------------------------------------- /src/browser/html_table.py: -------------------------------------------------------------------------------- 1 | # Copyright 2024 Google LLC 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # https://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | 15 | ##################################### 16 | ##################################### 17 | ##################################### 18 | ### DO NOT LOOK INSIDE THIS FILE! ### 19 | ##################################### 20 | ##################################### 21 | ##################################### 22 | ##################################### 23 | # This contains spoilers for exercise 24 | # 4b. Reading this code is cheating! 
25 | ##################################### 26 | ##################################### 27 | ##################################### 28 | ##################################### 29 | 30 | class HTMLTable: 31 | def __init__(self): 32 | self.rows = list() 33 | 34 | def handle_tr_start(self): 35 | self.rows.append(list()) 36 | 37 | def handle_td_start(self): 38 | if len(self.rows) == 0: # no tr was found 39 | return 40 | self.rows[-1].append("") 41 | 42 | def handle_data(self, data): 43 | if len(self.rows) == 0: # no tr was found 44 | return 45 | if len(self.rows[-1]) == 0: # no td was found 46 | return 47 | self.rows[-1][-1] += data 48 | 49 | def handle_table_end(self, initial_y_pos, draw_at): 50 | """ 51 | Draws the table, using the passed function which takes 52 | x and y positions and content, draws the content, 53 | and returns a tuple of (x, y) space 54 | occupied. 55 | Returns the y position after the table is drawn. 56 | """ 57 | if len(self.rows) == 0: 58 | return initial_y_pos 59 | y_pos = initial_y_pos 60 | column_widths = list() 61 | first_row = True 62 | # Column widths are based on the first row space 63 | # occupied. A real algorithm would consider other rows. 
64 | for row in self.rows: 65 | max_height = 0 66 | if first_row: 67 | first_row = False 68 | for cell in row: 69 | current_x_pos = sum(column_widths) 70 | (width, height) = draw_at(current_x_pos, y_pos, cell) 71 | column_widths.append(width + 10) # padding 72 | max_height = max(max_height, height) 73 | else: 74 | current_x_pos = 0 75 | for n, cell in enumerate(row): 76 | (_, height) = draw_at(current_x_pos, y_pos, cell) 77 | max_height = max(max_height, height) 78 | current_x_pos += column_widths[n] 79 | y_pos += max_height + 10 # padding 80 | return y_pos -------------------------------------------------------------------------------- /src/fuzzer/.gitignore: -------------------------------------------------------------------------------- 1 | browser-v2.json 2 | -------------------------------------------------------------------------------- /src/fuzzer/fuzzer.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | 3 | # Copyright 2024 Google LLC 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # https://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 
16 | 17 | from http.server import HTTPServer, BaseHTTPRequestHandler 18 | 19 | import os 20 | import threading 21 | import subprocess 22 | import signal 23 | import random 24 | import sys 25 | 26 | def generate_testcase(): 27 | """ 28 | Generate some HTML to try in the browser 29 | """ 30 | # TODO: Modify this code for the exercises 31 | x = random.randrange(0, 7) 32 | return "<h%d>Test header</h%d>" % (x, x) 33 | 34 | # You should not need to modify anything below here 35 | 36 | testcase = "" 37 | running = True 38 | exited_intentionally = False 39 | 40 | def signal_handler(sig, frame): 41 | print('Ctrl-C intercepted, finishing fuzzing!') 42 | global exited_intentionally, running 43 | exited_intentionally = True 44 | running = False 45 | sys.exit(0) 46 | 47 | signal.signal(signal.SIGINT, signal_handler) 48 | 49 | class FuzzerRequestHandler(BaseHTTPRequestHandler): 50 | def do_GET(self): 51 | self.send_response(200) 52 | self.end_headers() 53 | self.wfile.write(testcase.encode()) 54 | 55 | def log_message(self, format, *args): 56 | pass 57 | 58 | httpd = HTTPServer(('localhost', 8001), FuzzerRequestHandler) 59 | 60 | def run_http_server(): 61 | httpd.serve_forever() 62 | 63 | httpd_server_thread = threading.Thread(target=run_http_server, daemon=True) 64 | httpd_server_thread.start() 65 | 66 | print("Beginning fuzzing.
Press Control-C (maybe several times) to stop.") 67 | 68 | browser_path = os.path.join(os.path.dirname(os.path.dirname(os.path.realpath(__file__))), "browser", "browser.py") 69 | 70 | url = "http://localhost:8001/testcase.html" 71 | 72 | browserenv = os.environ 73 | browserenv["OUTPUT_STATUS"] = "1" 74 | browser_proc = subprocess.Popen(stdout=subprocess.PIPE, args=[browser_path, url], env=browserenv, encoding="utf8") 75 | 76 | first = True 77 | 78 | while running: 79 | testcase = generate_testcase() 80 | print("Trying %s" % testcase) 81 | crashed = True 82 | render_completed = False 83 | if first: 84 | first = False 85 | else: 86 | browser_proc.send_signal(signal.SIGHUP) 87 | while browser_proc.poll() is None and running and not render_completed: 88 | try: 89 | line = browser_proc.stdout.readline() 90 | if "Rendering completed\n" in line: 91 | crashed = False 92 | render_completed = True 93 | except: 94 | pass 95 | if crashed and not exited_intentionally: 96 | print("The HTML\n%s\ncrashed the browser! Good job!" % testcase) 97 | running = False 98 | browser_proc.terminate() 99 | -------------------------------------------------------------------------------- /src/requirements.txt: -------------------------------------------------------------------------------- 1 | html-parser==0.2 2 | PyQt6 3 | requests==2.31.0 4 | urllib3==2.2.0 5 | -------------------------------------------------------------------------------- /src/server/.gitignore: -------------------------------------------------------------------------------- 1 | tls_things/ 2 | -------------------------------------------------------------------------------- /src/server/http_server.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | 3 | # Copyright 2024 Google LLC 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 
7 | # You may obtain a copy of the License at 8 | # 9 | # https://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | 17 | from http.server import HTTPServer, SimpleHTTPRequestHandler 18 | 19 | import os 20 | 21 | class PagesDirectoryRequestHandler(SimpleHTTPRequestHandler): 22 | def __init__(self, request, client_address, server): 23 | pages_dir = os.path.join(os.path.dirname(__file__), "pages") 24 | super().__init__(request, client_address, server, directory=pages_dir) 25 | 26 | httpd = HTTPServer(('localhost', 8000), PagesDirectoryRequestHandler) 27 | 28 | httpd.serve_forever() 29 | -------------------------------------------------------------------------------- /src/server/https_server.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | 3 | # Copyright 2024 Google LLC 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # https://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 
16 | 17 | from http.server import HTTPServer, SimpleHTTPRequestHandler 18 | import ssl 19 | import os 20 | import subprocess 21 | import tempfile 22 | 23 | 24 | class PagesDirectoryRequestHandler(SimpleHTTPRequestHandler): 25 | def __init__(self, request, client_address, server): 26 | pages_dir = os.path.join(os.path.dirname(__file__), "pages") 27 | super().__init__(request, client_address, server, directory=pages_dir) 28 | 29 | 30 | httpd = HTTPServer(('localhost', 4443), PagesDirectoryRequestHandler) 31 | 32 | context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER) 33 | 34 | tls_things_dir = os.path.join(os.path.dirname(__file__), "tls_things") 35 | server_certificate_file = os.path.join(tls_things_dir, "server.pem") 36 | 37 | if not os.path.exists(server_certificate_file): 38 | if not os.path.exists(tls_things_dir): 39 | os.makedirs(tls_things_dir) 40 | with tempfile.NamedTemporaryFile(delete=False) as config: 41 | config.write( 42 | b"[dn]\nCN=localhost\n[req]\ndistinguished_name = dn\n[EXT]\nsubjectAltName=DNS:localhost\nkeyUsage=digitalSignature\nextendedKeyUsage=serverAuth") 43 | config.close() 44 | subprocess.check_call(["openssl", "req", "-x509", "-out", "server.crt", "-keyout", "server.key", "-newkey", "rsa:4096", "-nodes", "-sha256", "-subj", "/CN=localhost", 45 | "-extensions", "EXT", "-config", config.name], 46 | cwd=tls_things_dir) 47 | with open(server_certificate_file, "w") as outfile: 48 | with open(os.path.join(tls_things_dir, "server.crt")) as infile: 49 | outfile.write(infile.read()) 50 | with open(os.path.join(tls_things_dir, "server.key")) as infile: 51 | outfile.write(infile.read()) 52 | 53 | context.load_cert_chain(certfile=server_certificate_file) 54 | httpd.socket = context.wrap_socket(httpd.socket, server_side=True) 55 | httpd.serve_forever() 56 | -------------------------------------------------------------------------------- /src/server/pages/exercise1c-anotherpage.html: 
-------------------------------------------------------------------------------- 1 | 2 | 3 | Here's a second web page 4 | 5 | 6 |

This is a different web page, though it still has bold text and a link back to the original page.

7 | 8 | 9 | -------------------------------------------------------------------------------- /src/server/pages/exercise1c.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | Here's a simple web page 4 | 5 | 6 |

Here's a simple web page

7 |

This is a paragraph of text.

8 |

And this is another paragraph of text. It has some bold text, and perhaps some italic text!

9 |

It also has a link to another page which you can click in the browser.

10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 |
A long    B very long indeed    C quite long
1         2                     3
4         5                     6
19 |

More text after table

20 | 21 | 22 | -------------------------------------------------------------------------------- /src/server/pages/spoofable.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | School trip payment 4 | 5 | 6 |

School trip payment

7 |

Thank you for your interest in this school trip to Google. Please pay 8 | the school trip fee into:

9 |

Sort code 01-02-03

10 |

Account number 0987654

11 | 12 | 13 | --------------------------------------------------------------------------------
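The comment in `link_clicked` (src/browser/browser.py) says that replacing the path of the current URL is a simplification of the real checks for relative URLs. As a rough sketch of what fuller handling might look like, the standard library's `urllib.parse.urljoin` can resolve a link against the page it appeared on. The helper names `resolve_link_simplified` and `resolve_link`, and the example.com URLs, are assumptions made for illustration; they are not part of the repository.

```python
from urllib.parse import urljoin, urlparse


def resolve_link_simplified(current_url, link):
    """Mirror the approach in link_clicked: use the link as the whole path."""
    if ':' in link:
        # Anything containing a colon is treated as already absolute.
        return link
    return urlparse(current_url)._replace(path=link).geturl()


def resolve_link(current_url, link):
    """Hypothetical alternative: let urljoin apply the usual resolution rules."""
    return urljoin(current_url, link)


# A page that lives in a subdirectory shows where the two approaches differ.
page = "https://example.com/docs/index.html"
simplified = resolve_link_simplified(page, "setup.html")
joined = resolve_link(page, "setup.html")
parent = resolve_link(page, "../about.html")
absolute = resolve_link(page, "https://example.org/other.html")

print(simplified)
print(joined)
print(parent)
print(absolute)
```

For a page at the top level of a site, such as the pages served by `http_server.py`, both helpers give the same answer. They differ once the current page sits inside a directory: the simplified version drops the directory, while `urljoin` keeps it and also understands `../`.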