├── .github
│   ├── ISSUE_TEMPLATE
│   │   ├── typo-or-grammar-error.md
│   │   ├── unclear-statement-requirement.md
│   │   └── wrong-statement.md
│   └── workflows
│       └── gh-pages.yaml
├── .gitignore
├── LICENSE
├── Makefile
├── README.md
├── _templates
│   └── layout.html
├── conf.py
├── drawio
│   ├── argv.drawio
│   ├── buddy.drawio
│   ├── buddy_frame_array.drawio
│   └── vfs.drawio
├── external_reference
│   └── index.rst
├── hardware
│   ├── asm.rst
│   ├── mailbox.rst
│   └── uart.rst
├── index.rst
└── labs
    ├── img
    │   ├── argv.svg
    │   ├── buddy.svg
    │   ├── buddy_frame_array.svg
    │   ├── disk_dump.png
    │   ├── exception_levels.jpg
    │   ├── mem_layout.png
    │   ├── vector_table.jpg
    │   └── vfs.png
    ├── lab0.rst
    ├── lab1.rst
    ├── lab2.rst
    ├── lab3.rst
    ├── lab4.rst
    ├── lab5.rst
    ├── lab6.rst
    ├── lab7.rst
    └── lab8.rst
/.github/ISSUE_TEMPLATE/typo-or-grammar-error.md:
--------------------------------------------------------------------------------
1 | ---
2 | name: Typo or grammar errors
3 | about: For typo or grammar errors in documentation
4 | title: ''
5 | labels: ''
6 | assignees: ''
7 |
8 | ---
9 |
10 | ### Typo or grammar errors in documentation
11 |
12 | **URL:** https://
13 |
14 | **Wrong sentence**
15 |
16 | > XXXX
17 |
18 | **Correct sentence**
19 |
20 | > YYYY
--------------------------------------------------------------------------------
/.github/ISSUE_TEMPLATE/unclear-statement-requirement.md:
--------------------------------------------------------------------------------
1 | ---
2 | name: Unclear requirement or statement
3 | about: For unclear requirement or statement in documentation
4 | title: ''
5 | labels: ''
6 | assignees: ''
7 |
8 | ---
9 |
10 | ### Unclear requirement or statement in documentation
11 |
12 | **URL:** https://
13 |
14 | **Confused sentence**
15 |
16 | > XXX
17 |
18 | **Your understanding or question:**
19 |
20 | Briefly state which part of the statement confuses you and what your understanding of the context is.
--------------------------------------------------------------------------------
/.github/ISSUE_TEMPLATE/wrong-statement.md:
--------------------------------------------------------------------------------
1 | ---
2 | name: 'Wrong statement'
3 | about: For wrong statement in documentation
4 | title: ''
5 | labels: ''
6 | assignees: ''
7 |
8 | ---
9 |
10 | ### Wrong statement in documentation
11 |
12 | **URL:** https://
13 |
14 | **Wrong statement**
15 |
16 | > XXX
17 |
18 | **Correct statement**
19 |
20 | > YYY
21 |
22 | **Explanation**
23 |
24 | Explain why the original statement is wrong and why yours is correct.
25 |
26 | **Reference**
27 |
28 | A supporting reference is preferred.
29 |
--------------------------------------------------------------------------------
/.github/workflows/gh-pages.yaml:
--------------------------------------------------------------------------------
1 | name: github pages
2 |
3 | on:
4 |   push:
5 |     branches: [ main ]
6 |
7 | jobs:
8 |   deploy:
9 |     runs-on: self-hosted
10 |
11 |     steps:
12 |
13 |       - name: Checkout
14 |         uses: actions/checkout@v2
15 |
16 |       - name: Build docs
17 |         run: make
18 |
19 |       - name: Deploy
20 |         uses: peaceiris/actions-gh-pages@v3
21 |         with:
22 |           github_token: ${{ secrets.GITHUB_TOKEN }}
23 |           publish_dir: ./docs
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | docs
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2021 GRASS 綠色運算與嵌入式系統實驗室
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
1 | HTML_OUTPUT = docs
2 |
3 | .PHONY: all clean
4 |
5 | all:
6 | 	sphinx-build . ${HTML_OUTPUT}
7 |
8 | clean:
9 | 	rm -rf ${HTML_OUTPUT}
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # NYCU_Operating_System_Capstone
2 |
3 | This repository contains the documentation for the course labs.
4 |
5 | https://grasslab.github.io/NYCU_Operating_System_Capstone
6 |
7 | ## Prerequisites
8 |
9 | ```
10 | pip install sphinx
11 | pip install sphinx_rtd_theme
12 | ```
--------------------------------------------------------------------------------
/_templates/layout.html:
--------------------------------------------------------------------------------
1 | {% extends "!layout.html" %}
2 | {%- block extrahead %}
3 | {{ super() }}
4 |
5 |
6 |
13 | {% endblock %}
--------------------------------------------------------------------------------
/conf.py:
--------------------------------------------------------------------------------
1 | # Configuration file for the Sphinx documentation builder.
2 | #
3 | # This file only contains a selection of the most common options. For a full
4 | # list see the documentation:
5 | # https://www.sphinx-doc.org/en/master/usage/configuration.html
6 |
7 | # -- Path setup --------------------------------------------------------------
8 |
9 | # If extensions (or modules to document with autodoc) are in another directory,
10 | # add these directories to sys.path here. If the directory is relative to the
11 | # documentation root, use os.path.abspath to make it absolute, like shown here.
12 | #
13 | # import os
14 | # import sys
15 | # sys.path.insert(0, os.path.abspath('.'))
16 | master_doc = 'index'
17 |
18 | # -- Project information -----------------------------------------------------
19 |
20 | project = 'nycuos'
21 | copyright = '2021, Jim'
22 | author = 'Jim'
23 |
24 | # The full version, including alpha/beta/rc tags
25 | release = '0.0'
26 |
27 |
28 | # -- General configuration ---------------------------------------------------
29 |
30 | # Add any Sphinx extension module names here, as strings. They can be
31 | # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
32 | # ones.
33 | extensions = [
34 | "sphinx_rtd_theme"
35 | ]
36 |
37 | # Add any paths that contain templates here, relative to this directory.
38 | templates_path = ['_templates']
39 |
40 | # List of patterns, relative to source directory, that match files and
41 | # directories to ignore when looking for source files.
42 | # This pattern also affects html_static_path and html_extra_path.
43 | exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
44 |
45 |
46 | # -- Options for HTML output -------------------------------------------------
47 |
48 | # The theme to use for HTML and HTML Help pages. See the documentation for
49 | # a list of builtin themes.
50 | #
51 | html_theme = 'sphinx_rtd_theme'
52 |
53 | # Add any paths that contain custom static files (such as style sheets) here,
54 | # relative to this directory. They are copied after the builtin static files,
55 | # so a file named "default.css" will overwrite the builtin "default.css".
56 | #html_static_path = ['_static']
--------------------------------------------------------------------------------
/external_reference/index.rst:
--------------------------------------------------------------------------------
1 | External Reference
2 | ===================
3 | These are some great websites you can refer to.
4 |
5 | **Bare metal tutorials in C**
6 |
7 | https://github.com/bztsrc/raspi3-tutorial
8 |
9 | **Operating system tutorials in C**
10 |
11 | https://github.com/s-matyukevich/raspberry-pi-os
12 |
13 | **Operating system tutorials in Rust**
14 |
15 | https://github.com/rust-embedded/rust-raspi3-OS-tutorials
16 |
17 | **Raspberry Pi official GitHub**
18 |
19 | https://github.com/raspberrypi/
20 |
21 | **Embedded Linux wiki page for Raspberry Pi**
22 |
23 | https://elinux.org/RPi_Hub
--------------------------------------------------------------------------------
/hardware/asm.rst:
--------------------------------------------------------------------------------
1 | The Assembly You Need
2 | ======================
3 |
4 | Although this course is not about learning assembly language,
5 | you still need to write some assembly for bare-metal programming.
6 |
7 | Here, we provide pieces of assembly code you may need.
8 | After copying and pasting them, you still need to look up the manuals to understand how the code works.
9 |
10 |
11 | .. note::
12 | You might need other instructions to implement extra functions.
13 | Please refer to these manuals for more information.
14 | https://static.docs.arm.com/100898/0100/the_a64_Instruction_set_100898_0100.pdf
15 | https://static.docs.arm.com/ddi0596/a/DDI_0596_ARM_a64_instruction_set_architecture.pdf
16 | https://static.docs.arm.com/ddi0487/ea/DDI0487E_a_armv8_arm.pdf
17 |
18 |
19 | Lab 0
20 | -----
21 |
22 | .. code-block:: c
23 |
24 | // enter busy loop
25 | _start:
26 | wfe
27 | b _start
28 |
29 | Lab 1
30 | -----
31 |
32 | .. code-block:: c
33 |
34 | // set stack pointer and branch to main function.
35 | 2:
36 | ldr x0, = _stack_top
37 | mov sp, x0
38 | bl main
39 | 1:
40 | b 1b
--------------------------------------------------------------------------------
/hardware/mailbox.rst:
--------------------------------------------------------------------------------
1 | .. _mailbox:
2 |
3 | =======
4 | Mailbox
5 | =======
6 |
7 | The mailbox is the communication mechanism between the ARM CPUs and the VideoCore IV GPU on rpi3.
8 | You can use it to set up the framebuffer or configure some peripherals.
9 |
10 | We only list the materials needed for the labs.
11 | For details, please refer to https://github.com/raspberrypi/firmware/wiki/Mailboxes
12 |
13 | *******
14 | Basics
15 | *******
16 |
17 | The mailbox mechanism consists of three components: mailbox registers, channels, and messages.
18 |
19 | Mailbox Registers
20 | ==================
21 |
22 | Mailbox registers are accessed by MMIO. We only need Mailbox 0 Read/Write (CPU reads from GPU),
23 | Mailbox 0 status (check GPU status), and Mailbox 1 Read/Write (CPU writes to GPU).
24 |
25 | Channels
26 | ========
27 |
28 | Mailbox 0 defines several channels, but we only use channel 8 (CPU->GPU) for communication.
29 |
30 | Message
31 | =======
32 |
33 | To pass messages by the mailbox, you need to prepare a message array.
34 | Then apply the following steps.
35 |
36 | 1. Combine the message address (upper 28 bits) with channel number (lower 4 bits)
37 |
38 | 2. Check if Mailbox 0 status register's full flag is set.
39 |
40 | 3. If not, then you can write to Mailbox 1 Read/Write register.
41 |
42 | 4. Check if Mailbox 0 status register's empty flag is set.
43 |
44 | 5. If not, then you can read from Mailbox 0 Read/Write register.
45 |
46 | 6. Check if the value is the same as you wrote in step 1.
47 |
48 | .. note::
49 | Because only the upper 28 bits of the message address can be passed, the message array must be 16-byte aligned so that the lower 4 bits of its address are zero.
50 |
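Step 1 and the alignment note above can be sketched as a small helper. This is only an illustration: ``mbox_compose`` and ``MBOX_CH_PROP`` are hypothetical names, not part of the course material, and returning 0 on a bad argument is an arbitrary choice made for this sketch.

```c
#include <stdint.h>

#define MBOX_CH_PROP 8u /* channel 8 (CPU->GPU), the channel used in this document */

/* Combine a message buffer address (upper 28 bits) with a channel number
 * (lower 4 bits). The address must be 16-byte aligned, otherwise its low
 * 4 bits would be overwritten by the channel; in that case, or if the
 * channel does not fit in 4 bits, this sketch returns 0. */
static uint32_t mbox_compose(uint32_t addr, uint32_t channel)
{
    if ((addr & 0xFu) != 0u || channel > 0xFu)
        return 0u;
    return (addr & ~0xFu) | channel;
}
```

For example, with a made-up buffer address of 0x00080010, ``mbox_compose(0x00080010, MBOX_CH_PROP)`` yields 0x00080018, which is the value you would write in step 3 and expect to read back in step 6.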
51 | Mailbox Address and Flags
52 | =========================
53 |
54 | .. code-block:: c
55 |
56 | #define MMIO_BASE 0x3f000000
57 | #define MAILBOX_BASE (MMIO_BASE + 0xb880)
58 |
59 | #define MAILBOX_READ (MAILBOX_BASE)
60 | #define MAILBOX_STATUS (MAILBOX_BASE + 0x18)
61 | #define MAILBOX_WRITE (MAILBOX_BASE + 0x20)
62 |
63 | #define MAILBOX_EMPTY 0x40000000
64 | #define MAILBOX_FULL 0x80000000
65 |
66 | ***
67 | Tag
68 | ***
69 |
70 | The mailbox property interface (channels 8 and 9) contains several tags that indicate different operations.
71 | You should refer to https://github.com/raspberrypi/firmware/wiki/Mailbox-property-interface
72 | for the detailed specifications.
73 |
74 | Below, we provide an example to get rpi3's board revision number.
75 |
76 | .. code-block:: c
77 |
78 | #define GET_BOARD_REVISION 0x00010002
79 | #define REQUEST_CODE 0x00000000
80 | #define REQUEST_SUCCEED 0x80000000
81 | #define REQUEST_FAILED 0x80000001
82 | #define TAG_REQUEST_CODE 0x00000000
83 | #define END_TAG 0x00000000
84 |
85 | void get_board_revision(){
86 |     unsigned int __attribute__((aligned(16))) mailbox[7]; // 16-byte aligned so the lower 4 address bits are free for the channel
87 |     mailbox[0] = 7 * 4; // buffer size in bytes
88 |     mailbox[1] = REQUEST_CODE;
89 |     // tags begin
90 |     mailbox[2] = GET_BOARD_REVISION; // tag identifier
91 |     mailbox[3] = 4; // maximum of request and response value buffer's length.
92 |     mailbox[4] = TAG_REQUEST_CODE;
93 |     mailbox[5] = 0; // value buffer
94 |     // tags end
95 |     mailbox[6] = END_TAG;
96 |
97 |     mailbox_call(mailbox); // message passing procedure call, you should implement it following the 6 steps provided above.
98 |
99 |     printf("0x%x\n", mailbox[5]); // it should be 0xa020d3 for rpi3 b+
100 | }
101 |
102 | ***********
103 | Framebuffer
104 | ***********
105 |
106 | Rpi3 has a display output controlled by the GPU.
107 | You can set up the GPU's framebuffer through the mailbox to show an image or text on your screen.
108 |
109 | There are several items to configure.
110 | We list some of them with brief explanations.
111 | You should experiment with different configurations on a real rpi3 to get a better understanding.
112 |
113 | * **Allocate buffer:**
114 | Get the framebuffer's memory base address.
115 | You can then set a pixel's color by writing to its address.
116 |
117 | .. note::
118 | The buffer address returned by the GPU should be bitwise ANDed with 0x3fffffff.
119 |
120 | * **Physical (display) width/height:**
121 | The display buffer size.
122 |
123 | * **Virtual (buffer) width/height:**
124 | The portion of the framebuffer that is sent to the display.
125 |
126 | * **Virtual (buffer) offset:**
127 | The virtual buffer may be bigger than the display buffer, so an offset decides which part of the virtual buffer is sent to the display.
128 |
129 | * **Depth:**
130 | The number of bits used to represent a pixel.
131 |
132 | * **Pixel order:**
133 | Pixel order is either RGB or BGR.
134 |
135 | * **Get pitch:**
136 | Pitch is the number of bytes stored per horizontal line.
137 | To draw the k-th row on the screen, you need to skip k times the pitch instead of k times the display width.
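
The role of the pitch can be illustrated with a short sketch that computes where a pixel lives in the framebuffer. The function names are hypothetical, and the sketch assumes the depth is a multiple of 8 and, for the write helper, that the depth is 32 and the pitch is a multiple of 4.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offset of pixel (x, y) from the framebuffer base address.
 * pitch: bytes per horizontal line, as reported by the GPU.
 * depth: bits per pixel (assumed to be a multiple of 8). */
static size_t fb_pixel_offset(uint32_t x, uint32_t y, uint32_t pitch, uint32_t depth)
{
    return (size_t)y * pitch + (size_t)x * (depth / 8u);
}

/* Write one pixel, assuming a depth of 32 bits per pixel. */
static void fb_put_pixel32(uint8_t *fb_base, uint32_t x, uint32_t y,
                           uint32_t pitch, uint32_t color)
{
    uint32_t *p = (uint32_t *)(fb_base + fb_pixel_offset(x, y, pitch, 32u));
    *p = color;
}
```

Note that the row stride is ``pitch``, not ``width * depth / 8``; the two are equal only when the GPU adds no padding at the end of each line.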
--------------------------------------------------------------------------------
/hardware/uart.rst:
--------------------------------------------------------------------------------
1 | .. _uart:
2 |
3 | ====
4 | UART
5 | ====
6 |
7 | Rpi3 has 2 UARTs: mini UART and PL011 UART.
8 | We provide an overview of how to access them.
9 |
10 | For details, please refer to https://cs140e.sergio.bz/docs/BCM2837-ARM-Peripherals.pdf
11 |
12 | **********
13 | Background
14 | **********
15 |
16 | MMIO
17 | ====
18 |
19 | Rpi3 accesses its peripherals through memory-mapped I/O (MMIO).
20 | When a CPU loads/stores a value at a specific physical address, it gets/sets
21 | a peripheral's register.
22 | Therefore, you can access different peripherals from different memory addresses.
23 |
24 | .. note::
25 | There is a VideoCore/ARM MMU translating physical addresses to bus addresses.
26 | The MMU maps physical address 0x3f000000 to bus address 0x7e000000.
27 | In your code, you should **use physical addresses instead of bus addresses**.
28 | However, the reference uses bus addresses, so you should translate them into physical addresses.
29 |
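The translation described in the note above is a fixed offset, which the following sketch makes explicit. The macro and function names are hypothetical; the two base addresses are the ones given in the note.

```c
#include <stdint.h>

#define BUS_PERIPHERAL_BASE  0x7e000000u /* base used by the reference manual */
#define PHYS_PERIPHERAL_BASE 0x3f000000u /* base seen by the ARM CPU */

/* Translate a peripheral bus address from the manual into the physical
 * address that your code should use. */
static uint32_t bus_to_phys(uint32_t bus_addr)
{
    return bus_addr - BUS_PERIPHERAL_BASE + PHYS_PERIPHERAL_BASE;
}
```

For instance, a register the manual lists at bus address 0x7e215040 is accessed at physical address 0x3f215040.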
30 | GPIO
31 | ====
32 |
33 | Rpi3 has several GPIO lines for basic input-output devices such as LEDs or buttons.
34 | In addition, some GPIO lines provide alternate functions such as UART and SPI.
35 | Before using UART, you should configure the GPIO pins to the corresponding mode.
36 |
37 | GPIO 14 and 15 can be used for both mini UART and PL011 UART.
38 | However, mini UART requires ALT5 and PL011 UART requires ALT0.
39 | You need to **configure the GPFSELn register to change the alternate function.**
40 |
41 | Next, you need to **configure the pull up/down register to disable GPIO pull up/down**.
42 | This is because these GPIO pins use alternate functions, not basic input-output.
43 | Please refer to the description of **GPPUD and GPPUDCLKn** registers for a detailed setup.
44 |
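As a rough sketch of the GPFSELn update, the helper below computes the new register value for one pin. It assumes the layout described in the peripheral manual (each GPFSELn register covers 10 pins with a 3-bit field per pin, ALT0 encoded as 0b100 and ALT5 as 0b010); please verify these against the manual. The names are hypothetical, and the actual MMIO read and write of the register are left out.

```c
#include <stdint.h>

#define GPIO_FSEL_ALT0 0x4u /* 0b100, for PL011 UART on GPIO 14/15 */
#define GPIO_FSEL_ALT5 0x2u /* 0b010, for mini UART on GPIO 14/15 */

/* Return the GPFSELn value `old` with the 3-bit function field of `pin`
 * replaced by `func`. The register index n is pin / 10, so GPIO 14 and 15
 * both live in GPFSEL1. */
static uint32_t gpfsel_set(uint32_t old, unsigned pin, uint32_t func)
{
    unsigned shift = (pin % 10u) * 3u;
    return (old & ~(0x7u << shift)) | ((func & 0x7u) << shift);
}
```

A typical use is a read-modify-write: read GPFSEL1, apply ``gpfsel_set`` for pins 14 and 15, and write the result back, so that the other pins in the register keep their functions.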
45 | *********
46 | Mini UART
47 | *********
48 |
49 | Mini UART is provided by rpi3's auxiliary peripherals.
50 | It supports limited functions of UART.
51 |
52 | Initialization
53 | ==============
54 |
55 | 1. Set AUXENB register to enable mini UART.
56 | After that, the mini UART registers can be accessed.
57 |
58 | 2. Set AUX_MU_CNTL_REG to 0. Disable the transmitter and receiver during configuration.
59 |
60 | 3. Set AUX_MU_IER_REG to 0. Disable interrupts, because you don't need them yet.
61 |
62 | 4. Set AUX_MU_LCR_REG to 3. Set the data size to 8 bits.
63 |
64 | 5. Set AUX_MU_MCR_REG to 0. Auto flow control is not needed.
65 |
66 | 6. Set AUX_MU_BAUD to 270. Set the baud rate to 115200.
67 |
68 | After booting, the system clock is 250 MHz.
69 |
70 | .. math::
71 | \text{baud rate} = \frac{\text{system clock freq}}{8\times(\text{AUX_MU_BAUD}+1)}
72 |
73 | 7. Set AUX_MU_IIR_REG to 6. Clear the receive and transmit FIFOs.
74 |
75 | 8. Set AUX_MU_CNTL_REG to 3. Enable the transmitter and receiver.
76 |
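The baud rate formula in step 6 can be checked with integer arithmetic. The sketch below uses hypothetical function names and takes the 250 MHz system clock stated above as an input.

```c
#include <stdint.h>

/* Baud rate produced by a given AUX_MU_BAUD value:
 * baud = clock / (8 * (AUX_MU_BAUD + 1)), truncated to an integer. */
static uint32_t mini_uart_baud(uint32_t clock_hz, uint32_t baud_reg)
{
    return clock_hz / (8u * (baud_reg + 1u));
}

/* AUX_MU_BAUD value for a target baud rate (the formula solved for the
 * register value, truncated to an integer). */
static uint32_t mini_uart_baud_reg(uint32_t clock_hz, uint32_t baud)
{
    return clock_hz / (8u * baud) - 1u;
}
```

With a 250 MHz clock, a target of 115200 gives a register value of 270, and 270 in turn produces an actual rate of about 115313, roughly 0.1% above the target.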
77 | Read data
78 | =========
79 |
80 | 1. Check AUX_MU_LSR_REG's data ready field.
81 |
82 | 2. If set, read from AUX_MU_IO_REG
83 |
84 | Write data
85 | ==========
86 |
87 | 1. Check AUX_MU_LSR_REG's Transmitter empty field.
88 |
89 | 2. If set, write to AUX_MU_IO_REG
90 |
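The read and write steps above amount to polling AUX_MU_LSR_REG before touching AUX_MU_IO_REG. In the sketch below the registers are passed in as pointers so that the logic does not depend on specific addresses; the function names are hypothetical, and the bit positions (bit 0 for data ready, bit 5 for transmitter empty) are taken from the peripheral manual and should be verified there.

```c
#include <stdint.h>

#define LSR_DATA_READY 0x01u /* bit 0: receive FIFO holds at least one byte */
#define LSR_TX_EMPTY   0x20u /* bit 5: transmit FIFO can accept a byte */

/* Block until a byte is available, then read it from the IO register. */
static char mini_uart_recv(volatile uint32_t *lsr, volatile uint32_t *io)
{
    while ((*lsr & LSR_DATA_READY) == 0u) {
        /* busy wait */
    }
    return (char)(*io & 0xFFu);
}

/* Block until the transmitter can accept a byte, then write it. */
static void mini_uart_send(volatile uint32_t *lsr, volatile uint32_t *io, char c)
{
    while ((*lsr & LSR_TX_EMPTY) == 0u) {
        /* busy wait */
    }
    *io = (uint32_t)(unsigned char)c;
}
```

On rpi3 you would pass the physical addresses of AUX_MU_LSR_REG and AUX_MU_IO_REG, cast to ``volatile uint32_t *``.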
91 | Interrupt
92 | =========
93 |
94 | * AUX_MU_IER_REG: enable tx/rx interrupt
95 |
96 | * AUX_MU_IIR_REG: check interrupt cause
97 |
98 | * Interrupt enable register 1 (page 116 of the manual): set bit 29 to enable the AUX interrupt.
99 |
100 | .. note::
101 | By default, QEMU uses UART0 (PL011 UART) for serial I/O.
102 | If you want to use UART1 (mini UART), pass the flags ``-serial null -serial stdio``.
103 |
104 | **********
105 | PL011 UART
106 | **********
107 |
108 | To use PL011 UART, you should set up the clock for it first.
109 | It's configured by :ref:`mailbox`.
110 |
111 | Besides the clock configuration, it's similar to mini UART.
112 |
113 | Initialization
114 | ==============
115 |
116 | 1. Configure the UART clock frequency by mailbox.
117 |
118 | 2. Enable GPIO (almost the same as for mini UART).
119 |
120 | 3. Set IBRD and FBRD to configure baud rate.
121 |
122 | 4. Set LCRH to configure line control.
123 |
124 | 5. Set CR to enable UART.
125 |
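For step 3, the PL011 splits its baud rate divisor, UART clock / (16 x baud rate), into an integer part (IBRD) and a 6-bit fractional part (FBRD). The sketch below computes both values; the struct and function names are hypothetical, and the UART clock is whatever frequency you configured through the mailbox in step 1.

```c
#include <stdint.h>

struct pl011_divisor {
    uint32_t ibrd; /* integer part of the divisor */
    uint32_t fbrd; /* fractional part, in units of 1/64 */
};

/* Compute IBRD/FBRD for the given UART clock and baud rate.
 * divisor * 64 = uartclk * 64 / (16 * baud) = uartclk * 4 / baud,
 * rounded to the nearest integer. */
static struct pl011_divisor pl011_baud_divisor(uint32_t uartclk_hz, uint32_t baud)
{
    uint64_t div64 = ((uint64_t)uartclk_hz * 4u + baud / 2u) / baud;
    struct pl011_divisor d = { (uint32_t)(div64 >> 6), (uint32_t)(div64 & 0x3Fu) };
    return d;
}
```

As an example, assuming a 4 MHz UART clock and a baud rate of 115200, the divisor is about 2.17, so IBRD is 2 and FBRD is 11.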
126 | Read data
127 | =========
128 |
129 | 1. Check FR
130 |
131 | 2. Read from DR
132 |
133 | Write data
134 | ==========
135 |
136 | 1. Check FR
137 |
138 | 2. Write to DR
139 |
140 | Interrupt
141 | =========
142 |
143 | * IMSC: enable tx/rx interrupt
144 |
145 | * MIS: check interrupt cause
146 |
147 | * ICR: clear interrupt (reading or writing DR clears it automatically)
148 |
149 | * Interrupt enable register 2 (page 117 of the manual): set bit 25 to enable the UART interrupt.
--------------------------------------------------------------------------------
/index.rst:
--------------------------------------------------------------------------------
1 | NYCU, Operating System Capstone, Spring 2021
2 | ===========================================================
3 |
4 | This course aims to introduce the design and implementation of operating system kernels.
5 | You'll learn both concept and implementation from a series of labs.
6 |
7 | This course uses `Raspberry Pi 3 Model B+ `_ (rpi3 for short)
8 | as the hardware platform.
9 | Students can get their hands dirty on a **Real Machine** instead of an emulator.
10 |
11 | Labs
12 | -----
13 | There are 8 + 1 labs in this course.
14 | You'll learn how to **design** a kernel by **implementing** it yourself.
15 |
16 | There are 2 types of labels in each lab.
17 |
18 | ================== ===========================================================================================
19 | ``required`` You're required to implement these as described; they make up the majority of your score.
20 | ``elective`` You can implement some of them to get a bonus.
21 | ================== ===========================================================================================
22 |
23 | There is no limitation on which programming language you should use for the labs.
24 | However, there are a lot of things which are language dependent and even compiler dependent.
25 | You need to manage them yourself.
26 |
27 | You can check last year's `course website `_
28 | and `submission repository `_ to see what you might need
29 | to do during this semester.
30 | However, the requirements and descriptions may differ this semester.
31 |
32 | Grading Policy
33 | ---------------
34 |
35 | You are allowed and encouraged to read others' code, but you still need to write your code on your own
36 | instead of copying and pasting.
37 |
38 | TAs check for plagiarism by asking about the details of your implementation.
39 | If you can't explain your code clearly, you only get 70% of the score.
40 |
41 | Your code may work on an emulator even if it's wrong.
42 | Hence, you get 90% of the score if your code works on QEMU but not on a real rpi3.
43 |
44 | For late submissions, the penalty is 1% per week.
45 |
46 | Disclaimer
47 | ----------
48 | We're not kernel developers or experienced embedded system developers.
49 | It's likely that we made mistakes in the descriptions.
50 | If you find any of them, open an issue or send a PR to `this github repo `_.
51 |
52 | .. note::
53 | This documentation is not self-contained; you can get more information from external references.
54 |
55 | ..
56 | chapters tree below
57 |
58 | .. toctree::
59 | :caption: Labs
60 | :hidden:
61 |
62 | labs/lab0
63 | labs/lab1
64 | labs/lab2
65 | labs/lab3
66 | labs/lab4
67 | labs/lab5
68 | labs/lab6
69 | labs/lab7
70 | labs/lab8
71 |
72 | .. toctree::
73 | :caption: Hardware
74 | :hidden:
75 |
76 | hardware/asm
77 | hardware/uart
78 | hardware/mailbox
79 |
80 | .. toctree::
81 | :caption: Miscs
82 | :hidden:
83 |
84 | external_reference/index
85 |
--------------------------------------------------------------------------------
/labs/img/disk_dump.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/GrassLab/NYCU_Operating_System_Capstone/405dfe554d0d49f8dac6149570d4a7e56ea32907/labs/img/disk_dump.png
--------------------------------------------------------------------------------
/labs/img/exception_levels.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/GrassLab/NYCU_Operating_System_Capstone/405dfe554d0d49f8dac6149570d4a7e56ea32907/labs/img/exception_levels.jpg
--------------------------------------------------------------------------------
/labs/img/mem_layout.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/GrassLab/NYCU_Operating_System_Capstone/405dfe554d0d49f8dac6149570d4a7e56ea32907/labs/img/mem_layout.png
--------------------------------------------------------------------------------
/labs/img/vector_table.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/GrassLab/NYCU_Operating_System_Capstone/405dfe554d0d49f8dac6149570d4a7e56ea32907/labs/img/vector_table.jpg
--------------------------------------------------------------------------------
/labs/img/vfs.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/GrassLab/NYCU_Operating_System_Capstone/405dfe554d0d49f8dac6149570d4a7e56ea32907/labs/img/vfs.png
--------------------------------------------------------------------------------
/labs/lab0.rst:
--------------------------------------------------------------------------------
1 | =========================
2 | Lab 0: Environment Setup
3 | =========================
4 |
5 | *************
6 | Introduction
7 | *************
8 | In Lab 0, you need to prepare the environment for future development.
9 | You should install the target toolchain and use it to build a bootable image for rpi3.
10 |
11 | *****
12 | Goals
13 | *****
14 |
15 | * Set up the development environment.
16 | * Understand what cross-platform development is.
17 | * Test your rpi3.
18 |
19 | .. note::
20 | This lab is an introductory lab.
21 | It won't be part of your final grade, but you still need to finish all ``required`` parts,
22 | or you'll be in trouble in the next lab.
23 |
24 | ***************************
25 | Cross-Platform Development
26 | ***************************
27 |
28 | Cross Compiler
29 | ==============
30 |
31 | Rpi3 uses an ARM Cortex-A53 CPU.
32 | To compile your source code to 64-bit ARM machine code, you need a cross compiler if you develop
33 | in a non-ARM64 environment.
34 |
35 | ``required`` Install a cross compiler on your host computer.
36 |
37 | Linker
38 | ======
39 |
40 | You might not have noticed the existence of linkers before.
41 | That's because the compiler uses a default linker script for you. (Run ``ld --verbose`` to check its content.)
42 | In bare metal programming, you should set the memory layout yourself.
43 |
44 | This is an incomplete linker script for you.
45 | You should extend it in the following lab.
46 |
47 | .. code-block:: none
48 | :linenos:
49 |
50 | SECTIONS
51 | {
52 | . = 0x80000;
53 | .text : { *(.text) }
54 | }
55 |
56 |
57 | QEMU
58 | ====
59 |
60 | In cross-platform development,
61 | it's easier to validate your code on an emulator first.
62 | You can use QEMU to test your code before validating it on a real rpi3.
63 |
64 | .. warning::
65 | Although QEMU provides a machine option for rpi3, it doesn't behave the same as a real rpi3.
66 | You should validate your code on your rpi3, too.
67 |
68 | ``required`` Install ``qemu-system-aarch64``.
69 |
70 |
71 | ********************************
72 | From Source Code to Kernel Image
73 | ********************************
74 |
75 | You have the basic knowledge of the toolchain for cross-platform development. Now, it’s time to practice them.
76 |
77 | From Source Code to Object Files
78 | ================================
79 |
80 | Source code is converted to object files by a cross compiler.
81 | After saving the following assembly as ``a.S``,
82 | you can convert it to an object file with ``aarch64-linux-gnu-gcc -c a.S``.
83 | Alternatively, you can use LLVM's compiler, ``clang -mcpu=cortex-a53 --target=aarch64-rpi3-elf -c a.S``,
84 | especially if you are developing on macOS.
85 |
86 | .. code-block:: c
87 |
88 | .section ".text"
89 | _start:
90 | wfe
91 | b _start
92 |
93 | From Object Files to ELF
94 | ========================
95 |
96 | A linker links object files to an ELF file.
97 | An ELF file can be loaded and executed by program loaders.
98 | Program loaders are usually provided by the operating system in a regular development environment.
99 | In bare metal programming, ELF can be loaded by some bootloaders.
100 |
101 |
102 | To convert the object file from the previous step to an ELF file,
103 | you can save the provided linker script as ``linker.ld``, and run the following command.
104 |
105 | .. code-block:: none
106 |
107 | # On GNU LD
108 | aarch64-linux-gnu-ld -T linker.ld -o kernel8.elf a.o
109 | # On LLVM
110 | ld.lld -m aarch64elf -T linker.ld -o kernel8.elf a.o
111 |
112 | From ELF to Kernel Image
113 | ========================
114 |
115 | Rpi3's bootloader can't load ELF files.
116 | Hence, you need to convert the ELF file to a raw binary image.
117 | You can use ``objcopy`` to convert ELF files to raw binary.
118 |
119 | .. code-block:: none
120 |
121 | aarch64-linux-gnu-objcopy -O binary kernel8.elf kernel8.img
122 | # Or
123 | llvm-objcopy --output-target=aarch64-rpi3-elf -O binary kernel8.elf kernel8.img
124 |
125 | Check on QEMU
126 | =============
127 |
128 | After building, you can use QEMU to see the dumped assembly.
129 |
130 | .. code-block:: none
131 |
132 | qemu-system-aarch64 -M raspi3 -kernel kernel8.img -display none -d in_asm
133 |
134 | ``required`` Build your first kernel image, and check it on QEMU.
135 |
136 | *******************
137 | Deploy to REAL Rpi3
138 | *******************
139 |
140 | Flash Bootable Image to SD Card
141 | ===============================
142 |
143 | To prepare a bootable image for rpi3, you need at least the following.
144 |
145 | * A FAT16/32 partition containing
146 |
147 | * Firmware for GPU.
148 |
149 | * Kernel image (``kernel8.img``).
150 |
151 | There are two ways to do it.
152 |
153 | 1.
154 | We already prepared a `bootable image
155 | `_.
156 | You can use the following command to flash it to your SD card.
157 |
158 | .. code-block:: none
159 |
160 | dd if=nctuos.img of=/dev/sdb
161 |
162 | .. warning:: /dev/sdb should be replaced by your SD card device. You can check it by `lsblk`
163 |
164 | It's already partitioned and contains a FAT32 filesystem with the firmware inside.
165 | You can mount the partition to check.
166 |
167 | 2.
168 | Partition the disk and prepare the booting firmware yourself.
169 | You can download the firmware from
170 | https://github.com/raspberrypi/firmware/tree/master/boot
171 |
172 | bootcode.bin, fixup.dat and start.elf are essential.
173 | More information about rpi3's booting can be found on the official website:
174 | https://www.raspberrypi.org/documentation/configuration/boot_folder.md
175 | https://www.raspberrypi.org/documentation/hardware/raspberrypi/bootmodes/README.md
176 |
177 | Finally, put the firmware and your kernel image into the FAT partition.
178 |
179 | .. note::
180 | Besides using ``mkfs.fat -F 32`` to create a FAT32 filesystem, you should also set the partition type to FAT.
181 |
182 |
183 | ``required`` Use either one of the methods to set up your SD card.
184 |
185 | Interact with Rpi3
186 | ==================
187 |
188 | Our provided bootable image contains a kernel image that echoes what you type through UART.
189 | You can use it to test whether your lab kit works.
190 |
191 | 1. If you use method 2 to set up your bootable image, you should download `kernel8.img `_
192 | , and put it into your boot partition. It's identical to the one in the provided bootable image.
193 |
194 | 2. Plug the UART to USB converter into your host machine, and open it with a serial console such as screen or putty at the correct baud rate.
195 |
196 | 3. Connect TX, RX, GND to the corresponding pins on rpi3, and turn on your rpi3.
197 |
198 | 4. After your rpi3 powers on, you can type some letters, and your serial console should print what you just typed.
199 |
200 | .. code-block:: none
201 |
202 | screen /dev/ttyUSB0 115200
203 |
204 | *********
205 | Debugging
206 | *********
207 |
208 | Debug on QEMU
209 | =============
210 |
211 | Debugging on QEMU is a relatively easy way to validate your code.
212 | QEMU can dump memory and registers, and expose them to a debugger.
213 | You can use the following command to make QEMU wait for a gdb connection.
214 |
215 | .. code-block:: none
216 |
217 | qemu-system-aarch64 -M raspi3 -kernel kernel8.img -display none -S -s
218 |
219 | Then you can use the following command in gdb to load debugging information and connect to QEMU.
220 |
221 | .. code-block:: none
222 |
223 | file kernel8.elf
224 | target remote :1234
225 |
226 | .. note::
227 | Your gdb should also be a cross-platform gdb.
228 |
229 |
230 | Debug on Real Rpi3
231 | ==================
232 |
233 | You can use either print logs or JTAG to debug on a real rpi3.
234 | We don't provide JTAG in this course, but you can try it if you have one.
235 | https://metebalci.com/blog/bare-metal-raspberry-pi-3b-jtag/
236 |
--------------------------------------------------------------------------------
/labs/lab1.rst:
--------------------------------------------------------------------------------
1 | ==================
2 | Lab 1: Hello World
3 | ==================
4 |
5 | ************
6 | Introduction
7 | ************
8 |
9 | In Lab 1, you will practice bare metal programming by implementing a simple shell.
10 | You need to set up mini UART so that your host and rpi3 can communicate through it.
11 |
12 | *****
13 | Goals
14 | *****
15 |
16 | * Practice bare metal programming.
17 | * Understand how to access rpi3’s peripherals.
18 | * Set up mini UART.
19 |
20 | ********
21 | Required
22 | ********
23 |
24 | Requirement 1
25 | =============
26 |
27 | Basic Initialization
28 | --------------------
29 |
30 | When a program is loaded, it requires that
31 |
32 | * All its data is present at the correct memory addresses.
33 | * The program counter is set to the correct memory address.
34 | * The bss segment is initialized to 0.
35 | * The stack pointer is set to a proper address.
36 |
37 | After rpi3 boots, its bootloader loads kernel8.img to physical address 0x80000,
38 | and starts executing the loaded program.
39 | If the linker script is correct, the first two requirements above are met.
40 |
41 | However, neither the bss segment nor the stack pointer is properly initialized.
42 | Hence, you need to initialize them at the very beginning.
43 | Otherwise, it may lead to undefined behavior.
44 |
45 | ``required 1`` Initialize rpi3 after it is booted by the bootloader.
46 |
47 | Requirement 2
48 | =============
49 |
50 | Mini UART
51 | ---------
52 |
53 | You'll use UART as a bridge between rpi3 and host computer for all the labs.
54 | Rpi3 has 2 different UARTs, mini UART and PL011 UART.
55 | In this lab, you need to set up the mini UART.
56 |
57 | ``required 2`` Follow :ref:`uart` to set up mini UART.
58 |
59 | Requirement 3
60 | =============
61 |
62 | Simple Shell
63 | ------------
64 |
65 | After setting up UART, you should implement a simple shell to let rpi3 interact with the host computer.
66 | The shell should be able to execute the following commands.
67 |
68 | ======== ============================
69 | command Description
70 | ======== ============================
71 | help print all available commands
72 | hello print Hello World!
73 | ======== ============================
74 |
75 | ``required 3`` Implement a simple shell supporting the listed commands.
76 |
77 | .. note::
78 | There may be some text alignment issues on screen I/O; think about ``\r\n`` on both input and output.
79 |
80 | ********
81 | Elective
82 | ********
83 |
84 | Reboot
85 | ======
86 |
87 | Rpi3 doesn't provide an on-board reset button.
88 |
89 | You can follow the example code below to reset your rpi3.
90 |
91 | .. code-block:: c
92 |
93 | #define PM_PASSWORD 0x5a000000
94 | #define PM_RSTC 0x3F10001c
95 | #define PM_WDOG 0x3F100024
96 |
97 | void reset(int tick){ // reboot after watchdog timer expire
98 | set(PM_RSTC, PM_PASSWORD | 0x20); // full reset
99 | set(PM_WDOG, PM_PASSWORD | tick); // number of watchdog tick
100 | }
101 |
102 | void cancel_reset() {
103 | set(PM_RSTC, PM_PASSWORD | 0); // cancel the pending reset
104 | set(PM_WDOG, PM_PASSWORD | 0); // clear the watchdog tick count
105 | }
106 |
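The example code calls ``set`` without defining it. A minimal sketch, assuming ``set`` simply stores a 32-bit value to a memory-mapped register address (this signature is an assumption, not something the lab specifies):

```c
#include <stdint.h>

/* Hypothetical helper assumed by the reset example: write a 32-bit value
 * to a memory-mapped register. The volatile access keeps the compiler
 * from removing or reordering the store. */
static inline void set(uintptr_t addr, uint32_t value)
{
    *(volatile uint32_t *)addr = value;
}
```

With this helper in place, ``reset(100)`` would arm the watchdog, and ``cancel_reset()`` would disarm it before it expires.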
107 |
108 | ``elective 1`` Add a reboot command.
109 |
110 | .. note::
111 | This snippet of code only works on real rpi3, not on QEMU.
112 |
--------------------------------------------------------------------------------
/labs/lab2.rst:
--------------------------------------------------------------------------------
1 | ===============
2 | Lab 2: Booting
3 | ===============
4 |
5 | ************
6 | Introduction
7 | ************
8 |
9 | Booting is the process of setting up the environment to run various user programs after a computer reset.
10 | It includes loading a kernel with a bootloader, initializing subsystems, matching devices to drivers, and loading the init user program to bring
11 | up the remaining services in userspace.
12 |
13 | In Lab 2, you'll learn one of the methods to load your kernel and user programs.
14 | Also, you'll learn how to match a device to a driver on rpi3.
15 | The initialization of the remaining subsystems will be introduced in later labs.
16 |
17 |
18 | *******************
19 | Goals of this lab
20 | *******************
21 |
22 | * Implement a bootloader that loads kernel images through UART.
23 |
24 | * Understand what an initial ramdisk is.
25 |
26 | * Understand what a devicetree is.
27 |
28 | **********
29 | Background
30 | **********
31 |
32 | How a Kernel is Loaded on Rpi3
33 | ===============================
34 |
35 | There are 4 steps before your kernel starts its execution.
36 |
37 | 1. GPU executes the first stage bootloader from ROM on the SoC.
38 |
39 | 2. The first stage bootloader recognizes the FAT16/32 file system and loads the second stage bootloader bootcode.bin from SD card to L2 cache.
40 |
41 | 3. bootcode.bin initializes SDRAM and loads start.elf.
42 |
43 | 4. start.elf reads the configuration to load a kernel and other data to memory then wakes up CPUs to start execution.
44 |
45 | The kernel loaded at step 4 can also be another bootloader with more powerful functionalities such as network booting, or ELF loading.
46 |
47 | In Lab 2, you'll implement a bootloader that loads the actual kernel through UART, and it's loaded by the previous stage bootloader.
48 |
49 | *********
50 | Required
51 | *********
52 |
53 | Requirement 1
54 | ==============
55 |
56 | UART Bootloader
57 | ---------------
58 |
59 | In Lab 1, you probably had to move the SD card between your host and rpi3 very often during debugging.
60 | You can eliminate this by introducing another bootloader to load the kernel under debugging.
61 |
62 | To send binary through UART, you should devise a protocol to read raw data.
63 | UART rarely drops data during transmission, so you can keep the protocol simple.
64 |
65 | On Linux, you can write data from the host to rpi3 through the serial device's device file.
66 |
67 | .. code-block:: python
68 |
69 | with open('/dev/ttyUSB0', "wb", buffering = 0) as tty:
70 | tty.write(...)
71 |
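On the receiving side, one possible framing is sketched below. It is an assumption, not a required design: the host sends a 4-byte little-endian length, then the payload, then a 4-byte little-endian sum of the payload bytes. The names ``recv_byte_fn``, ``load_image``, ``mem_source`` and ``mem_recv`` are hypothetical; on rpi3 the byte source would wrap your mini UART receive routine.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical byte source: returns the next received byte. */
typedef uint8_t (*recv_byte_fn)(void *ctx);

/* Host-side stand-in for the UART so the sketch can run without hardware. */
struct mem_source {
    const uint8_t *data;
    size_t pos;
};

static uint8_t mem_recv(void *ctx)
{
    struct mem_source *src = ctx;
    return src->data[src->pos++];
}

/* Read a 32-bit little-endian integer, one byte at a time. */
static uint32_t recv_u32(recv_byte_fn recv, void *ctx)
{
    uint32_t v = 0;
    for (int i = 0; i < 4; i++)
        v |= (uint32_t)recv(ctx) << (8 * i);
    return v;
}

/* Receive one image into dest.
 * Returns the image size, or -1 if the announced size exceeds max_size
 * or the byte sum does not match the trailer. */
long load_image(recv_byte_fn recv, void *ctx, uint8_t *dest, size_t max_size)
{
    uint32_t size = recv_u32(recv, ctx);
    if (size > max_size)
        return -1;

    uint32_t sum = 0;
    for (uint32_t i = 0; i < size; i++) {
        dest[i] = recv(ctx);
        sum += dest[i];
    }
    return recv_u32(recv, ctx) == sum ? (long)size : -1;
}
```

A simple sum is enough to catch a truncated or shifted transfer; a stronger check is a design choice left to you.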
72 |
73 | .. hint::
74 | You can use ``qemu-system-aarch64 -serial null -serial pty`` to create a pseudo TTY device and test your bootloader through it.
75 |
76 |
77 | Config Kernel Loading Setting
78 | ------------------------------
79 |
80 | You may still want to load your actual kernel image at 0x80000, but it then overlaps with your bootloader.
81 | You can first move the bootloader's start address elsewhere by **re-writing the linker script**.
82 | Then, add a ``config.txt`` file to your SD card's boot partition to specify the loading address with ``kernel_address=``.
83 |
84 | To further distinguish your bootloader from the actual kernel, you can set the name of the image to load with
85 | ``kernel=`` and add ``arm_64bit=1``.
86 |
87 | .. code-block:: none
88 |
89 | kernel_address=0x60000
90 | kernel=bootloader.img
91 | arm_64bit=1
92 |
93 |
94 | ``required 1`` Implement UART bootloader that loads kernel images through UART.
95 |
96 |
97 | .. note::
98 | UART is a low-speed interface. It's okay to send your kernel image because it's quite small. Don't use it to send large binary files.
99 |
100 |
101 | Requirement 2
102 | ==============
103 |
104 | Initial Ramdisk
105 | ---------------
106 |
107 | After a kernel is initialized, it mounts a root filesystem and runs an init user program.
108 | The init program can be a script or executable binary to bring up other services or load other drivers later on.
109 |
110 | However, you haven't implemented any filesystem and storage driver code yet, so you can't load anything from the SD card using your kernel.
111 | Another approach is loading user programs through initial ramdisk.
112 |
113 | Initial ramdisk is a file loaded by bootloader or embedded in a kernel.
114 | It's usually an archive that can be extracted to build a root filesystem.
115 |
116 | New ASCII Format Cpio Archive
117 | ------------------------------
118 |
119 | Cpio is a very simple archive format to pack directories and files.
120 | Each directory and file is recorded as **a header followed by its pathname and content**.
121 |
122 | In Lab 2, you are going to use the New ASCII Format to create a cpio archive.
123 | You can first create a ``rootfs`` directory and put all files you need inside it.
124 | Then, use the following commands to archive it.
125 |
126 | .. code-block:: sh
127 |
128 | cd rootfs
129 | find . | cpio -o -H newc > ../initramfs.cpio
130 | cd ..
131 |
132 | `FreeBSD's man page `_ has a detailed definition of how
133 | New ASCII Format Cpio Archive is structured.
134 | You should read it and implement a parser to read files in the archive.
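
As a rough starting point, the lookup can be sketched as below. It relies on the New ASCII layout: a 6-byte magic ``070701`` followed by 13 fields of 8 hexadecimal digits (110 bytes in total), then the NUL-terminated pathname, then the file data, with the pathname and the data each padded to a 4-byte boundary, and a final entry named ``TRAILER!!!``. The name ``cpio_find`` is hypothetical, the sketch assumes the archive starts at a 4-byte aligned address, and a kernel without a C library would need its own ``memcmp`` and ``strcmp``.

```c
#include <stddef.h>
#include <string.h>

#define CPIO_HEADER_SIZE 110 /* 6-byte magic + 13 fields of 8 hex digits */

/* Parse 8 ASCII hex digits into an unsigned value. */
static unsigned long hex8(const char *s)
{
    unsigned long v = 0;
    for (int i = 0; i < 8; i++) {
        char c = s[i];
        v <<= 4;
        if (c >= '0' && c <= '9')
            v |= (unsigned long)(c - '0');
        else if (c >= 'a' && c <= 'f')
            v |= (unsigned long)(c - 'a' + 10);
        else if (c >= 'A' && c <= 'F')
            v |= (unsigned long)(c - 'A' + 10);
    }
    return v;
}

/* Round n up to the next multiple of 4. */
static size_t align4(size_t n)
{
    return (n + 3) & ~(size_t)3;
}

/* Return a pointer to the content of `path` and store its size in *size,
 * or return NULL if the archive has no such entry. */
const char *cpio_find(const char *archive, const char *path, size_t *size)
{
    const char *p = archive;
    while (memcmp(p, "070701", 6) == 0) {
        size_t filesize = hex8(p + 6 + 8 * 6);  /* c_filesize: 7th field */
        size_t namesize = hex8(p + 6 + 8 * 11); /* c_namesize: 12th field */
        const char *name = p + CPIO_HEADER_SIZE;
        const char *data = p + align4(CPIO_HEADER_SIZE + namesize);

        if (strcmp(name, "TRAILER!!!") == 0)
            break; /* end-of-archive marker */
        if (strcmp(name, path) == 0) {
            *size = filesize;
            return data;
        }
        p = data + align4(filesize); /* next header */
    }
    return NULL;
}
```

The returned pointer refers to bytes inside the archive itself, so no copy or allocation is needed.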
135 |
136 | Loading Cpio Archive
137 | ---------------------
138 |
139 | **QEMU**
140 |
141 | Add the argument ``-initrd initramfs.cpio`` to QEMU.
142 | QEMU loads the cpio archive file to 0x8000000 by default.
143 |
144 | **Rpi3**
145 |
146 | Move the cpio archive onto the SD card.
147 | Then specify the name and loading address in ``config.txt``.
148 |
149 | .. code-block:: none
150 |
151 | initramfs initramfs.cpio 0x20000000
152 |
153 | ``required 2`` Parse the New ASCII Format cpio archive, and read a file's content given its pathname.
154 |
155 |
156 | .. note::
157 | In Lab 2, you only need to **put some plain text files inside your archive** to test the functionality.
158 | In the later labs, you will also put script files and executables inside to automate the testing.
159 |
160 |
161 | ***********
162 | Elective
163 | ***********
164 |
165 | Bootloader Self Relocation
166 | ===========================
167 |
168 | In the required part, you are allowed to specify the loading address of your bootloader in ``config.txt``.
169 | However, not all previous stage bootloaders are able to specify the loading address.
170 | Hence, a bootloader should be able to relocate itself to another address, so it can load a kernel to an address overlapping with its loading address.
171 |
172 |
173 | ``elective 1`` Add self-relocation to your UART bootloader, so you don't need ``kernel_address=`` option in ``config.txt``
174 |
175 |
176 | Devicetree
177 | ===========
178 |
179 | Introduction
180 | --------------
181 |
182 | During the booting process, a kernel should know what devices are currently connected and use the corresponding driver to initialize and access it.
183 | For powerful buses such as PCIe and USB, the kernel can detect what devices are connected by querying the bus's registers.
184 | Then, it matches the device's name with all drivers and uses the compatible driver to initialize and access the device.
185 |
186 | However, for a computer system with a simple bus, a kernel can't detect what devices are connected.
187 | One approach to drive these devices is as you did in Lab 1;
188 | developers know the target machine and hard code the I/O memory addresses in their kernel.
189 | As a result, the driver code is not portable.
190 |
191 | A cleaner approach is a file describing what devices are on a computer system.
192 | It also records the properties of each device and the relationships between devices.
193 | Then, a kernel can query this file, much as it would query a powerful bus, to load the correct driver.
194 | The file is called a **devicetree**.
195 |
196 | Format
197 | -------
198 |
199 | Devicetree has two formats: **devicetree source (dts)** and **flattened devicetree (dtb)**.
200 | Devicetree source describes the device tree in human-readable form.
201 | It's then compiled into a flattened devicetree so that parsing can be simpler and faster on slow embedded systems.
202 |
203 | You can read rpi3's dts from raspberry pi's
204 | `linux repository `_
205 |
206 | You can get rpi3's dtb by either compiling it manually or downloading the `off-the-shelf one `_.
207 |
208 | Parsing
209 | ---------
210 |
211 | In this elective part, you should implement a parser to parse the flattened devicetree.
212 | Besides, your kernel should provide an interface that takes a callback function argument.
213 | This way, driver code can walk the entire devicetree to query each device node and match itself by checking the node's name and properties.
214 |
215 | You can get the latest specification from the `devicetree's official website `_.
216 | Then follow the order Chapter 5, 2, 3 and read rpi3's dts to implement your parser.
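
A minimal sketch of such a walk is shown below. It relies on the flattened devicetree layout: all header fields and tokens are big-endian 32-bit values, the magic is ``0xd00dfeed``, the offset of the structure block is the third header field, and the structure block is a stream of ``FDT_BEGIN_NODE`` (1), ``FDT_END_NODE`` (2), ``FDT_PROP`` (3), ``FDT_NOP`` (4) and ``FDT_END`` (9) tokens padded to 4 bytes. The names ``fdt_walk``, ``fdt_node_cb`` and ``count_node`` are hypothetical, and the sketch only reports node names; it skips over properties instead of passing them to the callback.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FDT_MAGIC      0xd00dfeedu
#define FDT_BEGIN_NODE 1u
#define FDT_END_NODE   2u
#define FDT_PROP       3u
#define FDT_NOP        4u
#define FDT_END        9u

/* Read a big-endian 32-bit value byte by byte. */
static uint32_t be32(const uint8_t *b)
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Hypothetical callback type: called once per node. */
typedef void (*fdt_node_cb)(const char *name, int depth, void *ctx);

/* Example callback: counts the nodes visited; ctx points to an int. */
static void count_node(const char *name, int depth, void *ctx)
{
    (void)name;
    (void)depth;
    ++*(int *)ctx;
}

/* Walk every node in the structure block.
 * Returns 0 on success, -1 on a bad magic or an unknown token. */
int fdt_walk(const void *dtb, fdt_node_cb cb, void *ctx)
{
    const uint8_t *base = dtb;
    if (be32(base) != FDT_MAGIC)
        return -1;

    const uint8_t *p = base + be32(base + 8); /* off_dt_struct */
    int depth = 0;
    for (;;) {
        uint32_t token = be32(p);
        p += 4;
        switch (token) {
        case FDT_BEGIN_NODE: {
            const char *name = (const char *)p;
            cb(name, depth++, ctx);
            p += (strlen(name) + 1 + 3) & ~(size_t)3; /* padded name */
            break;
        }
        case FDT_END_NODE:
            depth--;
            break;
        case FDT_PROP: {
            uint32_t len = be32(p); /* followed by nameoff, then value */
            p += 8 + ((len + 3) & ~(uint32_t)3);
            break;
        }
        case FDT_NOP:
            break;
        case FDT_END:
            return 0;
        default:
            return -1;
        }
    }
}
```

A driver-facing version would also hand each property's name (looked up in the strings block) and value to the callback so the driver can match itself.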
217 |
218 | Dtb Loading
219 | ------------
220 |
221 | A bootloader loads a dtb into memory and passes the loading address to the kernel in register ``x0``.
222 | Besides, it modifies the original dtb content to match the actual machine setting.
223 | For example, it adds initial ramdisk's loading address in dtb if you ask the bootloader to load an initial ramdisk.
224 |
225 | **QEMU**
226 |
227 | Add the argument ``-dtb bcm2710-rpi-3-b-plus.dtb`` to QEMU.
228 |
229 | **Rpi3**
230 |
231 | Move ``bcm2710-rpi-3-b-plus.dtb`` onto the SD card.
232 |
233 | ``elective 2`` Implement a parser that can iterate the device tree. Also, provide an API that takes a callback function,
234 | so driver code can access the content of device node during device tree iteration.
235 |
--------------------------------------------------------------------------------
/labs/lab3.rst:
--------------------------------------------------------------------------------
1 | ================
2 | Lab 3: Allocator
3 | ================
4 |
5 | ************
6 | Introduction
7 | ************
8 |
9 | A kernel allocates physical memory for maintaining its internal states and user programs' use.
10 | Without memory allocators, you need to statically partition the physical memory into several memory pools for
11 | different objects.
12 | It's sufficient for some systems that run known applications on known devices.
13 | Yet, general-purpose operating systems that run diverse applications on diverse devices determine the use and amount
14 | of physical memory at runtime.
15 | Therefore, dynamic memory allocation is necessary.
16 |
17 | In Lab 3, you need to implement memory allocators.
18 | They'll be used in all later labs.
19 |
20 | *******************
21 | Goals of this lab
22 | *******************
23 |
24 | * Implement a page frame allocator.
25 |
26 | * Implement a dynamic memory allocator.
27 |
28 | * Implement a startup allocator.
29 |
30 | ************
31 | Background
32 | ************
33 |
34 | Reserved Memory
35 | ================
36 |
37 | After rpi3 is booted, some physical memory is already in use.
38 | For example, there are already spin tables for multicore boot (``0x0000 - 0x1000``), the flattened devicetree,
39 | initramfs, and your kernel image in the physical memory.
40 | Your memory allocator should not allocate these memory blocks if you still need to use them.
41 |
42 | Dynamic Memory Allocator
43 | ========================
44 |
45 | Given the allocation size,
46 | a dynamic memory allocator needs to find a large enough contiguous memory block and return the pointer to it.
47 | Also, the memory block would be released after use.
48 | A dynamic memory allocator should be able to reuse the memory block for another memory allocation.
49 | Reserved or allocated memory should not be allocated again to prevent data corruption.
50 |
51 | Page Frame Allocator
52 | ======================
53 |
54 | You may be wondering why you are asked to implement another memory allocator for page frames.
55 | Isn't a well-designed dynamic memory allocator enough for all dynamic memory allocation cases?
56 | Indeed, you don't need it if you run applications in kernel space only.
57 |
58 | However, if you need to run user space applications with virtual memory,
59 | you'll need a lot of 4KB memory blocks with 4KB memory alignment called page frames.
60 | That's because 4KB is the unit of virtual memory mapping.
61 | A regular dynamic memory allocator that stores a size header before each memory block breaks this alignment,
62 | and it turns out that about half of the physical memory can't be used as page frames.
63 | Therefore, a better approach is representing available physical memory as page frames.
64 | The page frame allocator reserves and uses an additional page frame array to bookkeep the use of page frames.
65 | The page frame allocator should be able to allocate contiguous page frames for allocating large buffers.
66 | For fine-grained memory allocation, a dynamic memory allocator can allocate a page frame first then cut it into chunks.
67 |
68 | Observability of Allocators
69 | ============================
70 |
71 | It's hard to observe the internal state of a memory allocator and hence hard to demo.
72 | To check the correctness of your allocator, you need to **print the log of each allocation and free**.
73 |
74 | .. note::
75 | TAs will verify the correctness by these logs in the demo.
76 |
77 | *********
78 | Required
79 | *********
80 |
81 | In the required part, your allocator doesn't need to deal with the reserved memory problem.
82 | You can find an unused memory region (e.g. 0x1000_0000 -> 0x2000_0000) and manage that part of memory only.
83 |
84 | Requirement 1
85 | =============
86 |
87 | Buddy System
88 | -------------
89 |
90 | Buddy system is a well-known and simple algorithm for allocating contiguous memory blocks.
91 | It has an internal fragmentation problem, but it's still suitable for page frame allocation
92 | because the problem can be reduced with the dynamic memory allocator.
93 | We provide one possible implementation in the following part.
94 | You can still design it yourself as long as you follow the specification of the buddy system.
95 |
96 | ``required 1`` Implement the buddy system for contiguous page frames allocation.
97 |
98 | .. note::
99 |
100 | You don't need to handle the case of out-of-memory.
101 |
102 | Data Structure
103 | ----------------
104 |
105 | **The Frame Array** (or *"The Array"*, so to speak)
106 |
107 | *The Array* represents the allocation status of the memory by constructing a 1-1 relationship between the physical memory frame and *The Array*'s entries.
108 | For example, if the total allocable memory is 200KB and each frame is 4KB, then *The Array* consists of 50 entries, with the first and second entries representing the frames that start at 0x0 and 0x1000 (4KB).
109 |
110 | However, to describe a live buddy system with *The Array*, we need to give extra meaning to its entries by assigning values to them, defined as follows:
111 |
112 | For each entry in *The Array* with index :math:`\text{idx}` and value :math:`\text{val}`
113 | (Suppose the framesize to be ``4kb``)
114 |
115 | if :math:`\text{val} \geq 0`:
116 | There is an allocable, contiguous memory that starts from the :math:`\text{idx}`'th frame with :math:`\text{size} = 2^{\text{val}}` :math:`\times` ``4kb``.
117 |
118 | if :math:`\text{val} = \text{}`: (user defined value)
119 | The :math:`\text{idx}`'th frame is free, but it belongs to a larger contiguous memory block. Hence, buddy system doesn't directly allocate it.
120 |
121 | if :math:`\text{val} = \text{}`: (user defined value)
122 | The :math:`\text{idx}`'th frame is already allocated, hence not allocable.
123 |
124 | .. image:: img/buddy_frame_array.svg
125 |
126 | Below is the generalized view of **The Frame Array**:
127 |
128 | .. image:: img/buddy.svg
129 |
130 |
131 | You can calculate the address and the size of the contiguous block by the following formula.
132 |
133 | + :math:`\text{block's physical address} = \text{block's index} \times 4096 + \text{base address}`
134 | + :math:`\text{block's size} = 4096 \times 2^\text{block's exponent}`
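
The two formulas translate directly into code. A minimal sketch, where ``FRAME_SIZE``, ``block_addr`` and ``block_size`` are hypothetical names and ``base`` is assumed to be the start of the region managed by the buddy system:

```c
#include <stdint.h>

#define FRAME_SIZE 4096u

/* Physical address of the block whose first frame has the given index. */
static inline uintptr_t block_addr(uintptr_t base, uint32_t index)
{
    return base + (uintptr_t)index * FRAME_SIZE;
}

/* Size in bytes of a block with the given exponent. */
static inline uint64_t block_size(uint32_t exponent)
{
    return (uint64_t)FRAME_SIZE << exponent;
}
```

For example, with a base address of 0x10000000, the block at index 2 starts at 0x10002000, and a block with exponent 3 spans 8 frames (32KB).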
135 |
136 | Linked-lists for blocks with different size
137 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
138 | You can set a maximum contiguous block size and create one linked-list for each size.
139 | The linked-list links free blocks of the same size.
140 | The buddy allocator's search starts from the specified block size list.
141 | If the list is empty, it tries to find a larger block in a larger block list.
142 |
143 | .. _release_redu:
144 |
145 | Release redundant memory block
146 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
147 | The above algorithm may allocate one block far larger than the required size.
148 | The allocator should cut off the bottom half of the block and put it back to the buddy system until the size equals the required size.
149 |
150 | .. note::
151 | You should print the log of releasing redundant memory block for the demo
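
The splitting step can be sketched as a pure function that lists which halves go back to the buddy system. This sketch assumes the allocator keeps the lower-address half and releases the upper-address half at each step; ``struct block`` and ``split_block`` are hypothetical names, and inserting the released blocks into the free lists is left out.

```c
#include <stddef.h>
#include <stdint.h>

struct block {
    uint32_t index;    /* index of the block's first frame */
    uint32_t exponent; /* block covers 2^exponent frames */
};

/* Split the block at `index` from exponent `have` down to exponent `want`.
 * Each released half is written to `out`, largest first.
 * Returns the number of released blocks; the caller keeps the block at
 * `index` with exponent `want`. `out` needs room for have - want entries. */
size_t split_block(uint32_t index, uint32_t have, uint32_t want,
                   struct block *out)
{
    size_t n = 0;
    while (have > want) {
        have--; /* halve the block */
        out[n].index = index + (1u << have); /* upper half is released */
        out[n].exponent = have;
        n++;
    }
    return n;
}
```

Splitting a 16-frame block at index 0 down to 2 frames would release the blocks at index 8 (8 frames), index 4 (4 frames) and index 2 (2 frames), which is also a natural place to print the required log.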
152 |
153 | Free and Coalesce Blocks
154 | --------------------------
155 | To keep larger contiguous memory blocks available in the buddy system,
156 | the buddy allocator should not naively put a block back to the linked-list when the user frees it.
157 | It should try to :ref:`find_buddy` and :ref:`merge_iter`.
158 |
159 | .. _find_buddy:
160 |
161 | Find the buddy
162 | ^^^^^^^^^^^^^^
163 |
164 | You can XOR the block's index with :math:`2^{\text{exponent}}` (that is, ``1 << exponent``) to find its buddy's index.
165 | If its buddy is marked free with the same exponent in the page frame array, you can merge them into a larger block.
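
The index arithmetic can be sketched as below. A block with exponent ``e`` spans ``1 << e`` frames, so its buddy's index differs from its own only in bit ``e``. The names ``buddy_index`` and ``merged_index`` are hypothetical.

```c
#include <stdint.h>

/* Index of the buddy of the block at `index` with the given exponent. */
static inline uint32_t buddy_index(uint32_t index, uint32_t exponent)
{
    return index ^ (1u << exponent);
}

/* Index of the block formed by merging a block with its buddy:
 * the lower of the two indices, i.e. bit `exponent` cleared. */
static inline uint32_t merged_index(uint32_t index, uint32_t exponent)
{
    return index & ~(1u << exponent);
}
```

For instance, the 2-frame block at index 10 has its buddy at index 8, and merging them yields a 4-frame block at index 8.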
166 |
167 | .. _merge_iter:
168 |
169 | Merge iteratively
170 | ^^^^^^^^^^^^^^^^^
171 | The merged block may still have a buddy of its own.
172 | You should use the same way to find the buddy of the merged block.
173 | When you can't find a free buddy for the merged block or the merged block size is maximum-block-size,
174 | the allocator stops and puts the merged block into the linked-list.
175 |
176 | .. note::
177 | You should print the log of merge iteration for the demo.
178 |
179 | Requirement 2
180 | =============
181 |
182 | Dynamic Memory Allocator
183 | -------------------------
184 |
185 | Your page frame allocator already provides the functionality for large contiguous memory allocation.
186 | Your dynamic memory allocator only needs to add a wrapper to translate a page frame to its physical address.
187 | For small memory allocation, you can create several memory pools for some common size such as [16, 32, 48, 96 ...].
188 | Then, partition page frames into several chunk slots.
189 | When there is a memory allocation request, round up the requested allocation size to the nearest size and check if
190 | there is any unallocated slot.
191 | If not, allocate a new page frame from the page allocator.
192 | Then, return one chunk to the caller.
193 | Objects from the same page frame have a common prefix address.
194 | The allocator can use it to determine the memory pool the chunk belonged to when it's freed.
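
Two small pieces of this scheme are sketched below: choosing a pool for a request, and recovering the page frame a chunk came from. The pool sizes are the four listed above; ``pool_index_for`` and ``page_base`` are hypothetical names, and how the allocator stores per-page metadata is left open.

```c
#include <stddef.h>
#include <stdint.h>

#define FRAME_SIZE 4096u

/* Chunk sizes of the memory pools, in ascending order. */
static const size_t pool_sizes[] = {16, 32, 48, 96};
#define POOL_COUNT (sizeof pool_sizes / sizeof pool_sizes[0])

/* Index of the smallest pool whose chunks fit `size`, or -1 if the
 * request is too large for any pool and should go to the page allocator. */
int pool_index_for(size_t size)
{
    for (size_t i = 0; i < POOL_COUNT; i++)
        if (size <= pool_sizes[i])
            return (int)i;
    return -1;
}

/* Common address prefix of all chunks cut from the same page frame. */
uintptr_t page_base(uintptr_t addr)
{
    return addr & ~(uintptr_t)(FRAME_SIZE - 1);
}
```

On free, ``page_base`` of the chunk address identifies the page frame, and the allocator can look up which pool that frame was assigned to.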
195 |
196 | ``required 2`` Implement a dynamic memory allocator.
197 |
198 |
199 | ***********
200 | Elective
201 | ***********
202 |
203 | .. _startup_alloc:
204 |
205 | Startup Allocator
206 | ===================
207 |
208 | In general purpose operating systems, the amount of physical memory is determined at runtime.
209 | Hence, a kernel needs to dynamically allocate its page frame array for its page frame allocator.
210 | The page frame allocator then depends on dynamic memory allocation.
211 | The dynamic memory allocator depends on the page frame allocator.
212 | This is a chicken-and-egg problem.
213 | To break the dilemma, you need a dedicated allocator during startup.
214 |
215 | The design of the startup allocator is quite simple.
216 | Just implement a dynamic memory allocator not based on the page allocator.
217 | It records the start address and size of the allocated and reserved blocks in a statically allocated array.
218 | If there are not many memory holes in the physical memory, it needs only a small number of entries for bookkeeping.
219 |
220 | Your startup allocator should be able to reserve memory for the buddy system, kernel, initramfs, etc.
221 | In the end, it hands the physical memory to the buddy system.
222 | The buddy system should mark the reserved segment as allocated.
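
A minimal sketch of the bookkeeping, assuming a fixed-size static table of reserved regions. The table capacity and the names ``startup_reserve`` and ``startup_is_reserved`` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_RESERVED 16   /* hypothetical capacity of the static table */

struct region { uintptr_t start; size_t size; };

static struct region reserved[MAX_RESERVED];
static size_t num_reserved;

/* Record [start, start + size) as reserved.
 * Returns 0 on success, -1 if it overlaps an entry or the table is full. */
static int startup_reserve(uintptr_t start, size_t size)
{
    if (size == 0 || num_reserved == MAX_RESERVED)
        return -1;
    for (size_t i = 0; i < num_reserved; i++) {
        uintptr_t s = reserved[i].start;
        uintptr_t e = s + reserved[i].size;
        if (start < e && s < start + size)
            return -1;
    }
    reserved[num_reserved].start = start;
    reserved[num_reserved].size = size;
    num_reserved++;
    return 0;
}

/* Nonzero if `addr` lies inside a reserved region; the buddy system
 * could use this to mark frames as allocated when it takes over. */
static int startup_is_reserved(uintptr_t addr)
{
    for (size_t i = 0; i < num_reserved; i++)
        if (addr >= reserved[i].start &&
            addr - reserved[i].start < reserved[i].size)
            return 1;
    return 0;
}
```

Because the table is statically allocated, the startup allocator does not depend on either of the allocators it helps to bring up.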
223 |
224 | ``elective 1`` Implement a startup allocator.
225 |
226 | .. note::
227 |   * Your startup allocator should still work when the memory size is large or contains memory holes.
228 |
229 |   * Reserved memory block detection is not part of the startup allocator. You can either find a way to get that information or hard-code it. Then call
230 |     the startup allocator's API to reserve those regions.
--------------------------------------------------------------------------------
/labs/lab4.rst:
--------------------------------------------------------------------------------
1 | ==============================
2 | Lab 4: Exception and Interrupt
3 | ==============================
4 |
5 | ************
6 | Introduction
7 | ************
8 |
9 | An exception is an event that causes the currently executing program to relinquish the CPU to the corresponding handler.
10 | The exception mechanism allows
11 |
12 | 1. an operating system to handle errors properly when they occur during execution.
13 | 2. a user program to generate an exception to request the corresponding operating system service.
14 | 3. a peripheral device to force the currently executing program to relinquish the CPU and execute the device's handler.
15 |
16 | *******************
17 | Goals of this lab
18 | *******************
19 |
20 | * Understand what exception levels are in Armv8-A.
21 | * Understand what exception handling is.
22 | * Understand what an interrupt is.
23 | * Understand how rpi3's peripherals interrupt the CPU through interrupt controllers.
24 | * Understand how to multiplex a timer.
25 | * Understand how to concurrently handle I/O devices.
26 |
27 | **********
28 | Background
29 | **********
30 |
31 | Official Reference
32 | ===================
33 |
34 | Exceptions are tightly coupled with the CPU's design.
35 | We only briefly introduce the components needed in this lab.
36 | If you want to know the details, please refer to ARM's official
37 | `introduction `_
38 | and
39 | `manual `_'s Chapter D1 (page 1496).
40 |
41 |
42 | Exception Levels
43 | =================
44 |
45 | `Principle of least privilege `_
46 | limits the resources that a program can access.
47 | The limitation reduces possible errors during execution and the attack surface,
48 | so the system's security and stability are increased.
49 | Armv8-A CPUs follow the principle and implement exception levels,
50 | so an operating system can run diverse user applications without crashing the entire system.
51 |
52 | .. image:: img/exception_levels.jpg
53 |
54 | Armv8-A has 4 exception levels (ELs).
55 | In general, all user programs run in EL0, and operating systems run in EL1.
56 | The operating system can decrease the exception level and jump to the user program by setting the system registers and executing an exception return instruction.
57 | When an exception is taken during a user program's execution, the exception level is increased, and the CPU will jump to the exception handler.
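
To check which level the CPU is currently in, the kernel can read the ``CurrentEL`` system register (for example with ``mrs x0, CurrentEL``), which holds the level in bits [3:2]. A small sketch of the decoding, with the hypothetical helper name ``current_el_from_reg``:

```c
#include <stdint.h>

/* CurrentEL holds the exception level in bits [3:2]. */
static unsigned current_el_from_reg(uint64_t current_el_reg)
{
    return (unsigned)((current_el_reg >> 2) & 0x3);
}
```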
58 |
59 | Exception Handling
60 | ==================
61 |
62 | When a CPU takes an exception, it does the following things.
63 |
64 | * Save the current processor's state (PSTATE) in SPSR_ELx. (x is the target Exception level)
65 | * Save the exception return address in ELR_ELx.
66 | * Mask its interrupts. (PSTATE.{D,A,I,F} are set to 1).
67 | * If the exception is a synchronous exception or an SError interrupt, save the cause of that exception in ESR_ELx.
68 | * Switch to the target Exception level and start at the corresponding vector address.
69 |
70 | After the exception handler finishes, it issues ``eret`` to return from the exception.
71 | Then, the CPU does the following things.
72 |
73 | * Restore program counter from ELR_ELx.
74 | * Restore PSTATE from SPSR_ELx.
75 | * Switch to the corresponding Exception level according to SPSR_ELx.
76 |
77 | Vector Table
78 | -------------
79 |
80 | As mentioned above, the CPU starts its execution at the corresponding vector address.
81 | The address is defined by the following vector table, and the base address of the table is saved in VBAR_ELx.
82 |
83 | .. image:: img/vector_table.jpg
84 |
85 | The left part of the table is the cause of an exception.
86 | In this lab, we only focus on Synchronous and IRQ exceptions.
87 | The right part of the table is the relationship between the EL where the exception happens and the EL that it targets.
88 | In this lab, we only focus on the case that the kernel takes exceptions (EL1 -> EL1) and 64-bit user programs take exceptions (EL0 -> EL1).
89 | Also, we want the kernel and user programs to use different stacks (use SP_ELx).
90 | Therefore, it corresponds to the table's
91 |
92 |   Exception from the current EL while using SP_ELx
93 |
94 | and
95 |
96 |   Exception from a lower EL and at least one lower EL is AArch64.
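
The layout of the table can be expressed as simple arithmetic: the table has four groups of four entries, and each entry is 0x80 bytes. The sketch below computes an entry's offset from VBAR_ELx; the enum and function names are hypothetical.

```c
#include <stdint.h>

enum vec_source {            /* where the exception came from */
    CUR_EL_SP0 = 0,          /* current EL while using SP_EL0 */
    CUR_EL_SPX = 1,          /* current EL while using SP_ELx */
    LOWER_EL_A64 = 2,        /* lower EL, AArch64 */
    LOWER_EL_A32 = 3         /* lower EL, AArch32 */
};

enum vec_type { VEC_SYNC = 0, VEC_IRQ = 1, VEC_FIQ = 2, VEC_SERROR = 3 };

/* Each entry is 0x80 bytes and each group of four entries is 0x200 bytes. */
static uint64_t vector_offset(enum vec_source src, enum vec_type type)
{
    return (uint64_t)src * 0x200 + (uint64_t)type * 0x80;
}
```

For the two cases this lab focuses on, the synchronous and IRQ entries are at offsets 0x200 and 0x280 (current EL with SP_ELx) and 0x400 and 0x480 (lower EL, AArch64).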
97 |
98 |
99 | Interrupts
100 | ===========
101 |
102 | Besides computing, the CPU also needs to control I/O devices.
103 | I/O devices are not always ready.
104 | Hence, the CPU needs to check the device's status and readiness before processing an I/O device's data.
105 |
106 | In previous labs, you did it by busy polling the devices.
107 | However, redundant polling wastes CPU time.
108 | Also, the CPU may not handle an I/O device's data immediately after the device is ready.
109 | It may lead to the I/O device's underutilization or data loss.
110 |
111 | As an alternative, interrupts allow I/O devices to inform the CPU when they're ready.
112 | Then, the CPU can run the device's handler to process the data immediately.
113 | Besides, an interrupt forces the currently running process to relinquish control of the CPU.
114 | Therefore, no process can run indefinitely.
115 |
116 | Interrupt Controllers
117 | ----------------------
118 |
119 | Rpi3 has two levels of interrupt controllers.
120 | The first level controller routes interrupts to each CPU core, so each CPU core can have its own timer interrupt and the cores can send inter-processor interrupts to each other.
121 | The details can be found in
122 |
123 | https://github.com/raspberrypi/documentation/blob/master/hardware/raspberrypi/bcm2836/QA7_rev3.4.pdf
124 |
125 | The second level controller routes interrupts from peripherals such as the UART and the system timer. They are aggregated and sent to the first level interrupt controller as the GPU IRQ.
126 | The details can be found in
127 |
128 | https://cs140e.sergio.bz/docs/BCM2837-ARM-Peripherals.pdf (page 109)
129 |
130 |
131 | Critical Sections
132 | -----------------
133 |
134 | A critical section is a code segment that can't be executed concurrently.
135 | When interrupts are enabled, the CPU can be interrupted when it's accessing some shared data.
136 | If the interrupt handler also accesses the data, the data may be corrupted.
137 | Therefore, the kernel needs to protect the shared data.
138 |
139 | In the required part, interrupts are only enabled in user programs (EL0).
140 | Hence, your kernel doesn't need to handle this problem.
141 |
142 | In the elective part, your kernel should enable interrupts.
143 | Hence, you should either
144 |
145 | * disable the CPU's interrupt temporarily in critical sections to prevent concurrent access in the interrupt handlers.
146 | * prevent interrupt handlers from calling the APIs that access the shared data.
147 |
148 | *********
149 | Required
150 | *********
151 |
152 | Requirement 1
153 | =============
154 |
155 | Exception Level Switch
156 | -----------------------
157 |
158 | EL2 to EL1
159 | ^^^^^^^^^^^
160 |
161 | By default, rpi3's CPU runs in EL2 after booting, but we want the kernel to run in EL1.
162 | Hence, your kernel needs to switch to EL1 at the beginning.
163 |
164 | You can use the following code to switch from EL2 to EL1.
165 | It configures ``hcr_el2`` so EL1 runs in AArch64.
166 | Then it sets ``spsr_el2`` and ``elr_el2``, so the CPU can return to the target address with the correct PSTATE after ``eret``.
167 |
168 | .. code:: c
169 |
170 |   ...
171 |       bl from_el2_to_el1
172 |       // the next instruction runs in EL1
173 |   ...
174 |   from_el2_to_el1:
175 |       mov x0, (1 << 31) // EL1 uses aarch64
176 |       msr hcr_el2, x0
177 |       mov x0, 0x3c5 // EL1h (SPSel = 1) with interrupt disabled
178 |       msr spsr_el2, x0
179 |       msr elr_el2, lr
180 |       eret // return to EL1
181 |
182 | ``required 1-1`` Switch from EL2 to EL1.
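
The value ``0x3c5`` written to ``spsr_el2`` is not arbitrary. As a sketch, it can be composed from the four interrupt mask bits (D, A, I, F in bits 9 to 6) and the mode field in the low bits. The macro and function names here are hypothetical.

```c
#include <stdint.h>

#define SPSR_MASK_D (1u << 9)   /* debug exceptions masked */
#define SPSR_MASK_A (1u << 8)   /* SError masked */
#define SPSR_MASK_I (1u << 7)   /* IRQ masked */
#define SPSR_MASK_F (1u << 6)   /* FIQ masked */
#define SPSR_MASK_ALL (SPSR_MASK_D | SPSR_MASK_A | SPSR_MASK_I | SPSR_MASK_F)

#define SPSR_MODE_EL0T 0x0u     /* EL0, using SP_EL0 */
#define SPSR_MODE_EL1T 0x4u     /* EL1, using SP_EL0 */
#define SPSR_MODE_EL1H 0x5u     /* EL1, using SP_EL1 */

/* Combine a target mode with the interrupt masks to apply after eret. */
static uint32_t spsr_value(uint32_t mode, uint32_t masks)
{
    return masks | mode;
}
```

With all four masks set, EL1h gives ``0x3c5`` and EL0t gives ``0x3c0``; EL0t with no masks gives ``0``, which returns to EL0 with interrupts enabled.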
183 |
184 | EL1 to EL0
185 | ^^^^^^^^^^
186 |
187 | After the kernel is initialized, it can load user programs and execute them in EL0 by ``eret``.
188 | You need to add a command to your shell that can
189 |
190 | 1. load a user program in the initramfs to a specific address.
191 | 2. set ``spsr_el1`` to ``0x3c0`` and ``elr_el1`` to the program's start address.
192 | 3. set the user program's stack pointer to a proper position by setting ``sp_el0``.
193 | 4. issue ``eret`` to return to the user code.
194 |
195 | ``required 1-2`` Add a command that can load a user program in the initramfs. Then, use ``eret`` to jump to the start address.
196 |
197 | .. hint::
198 |   You can use QEMU and GDB to check whether you did it correctly.
199 |
200 | EL0 to EL1
201 | ^^^^^^^^^^
202 |
203 | The user program can go back to EL1 by taking an exception.
204 | But you need to set up the exception vector table first.
205 | You can use the following vector table and set ``vbar_el1`` to its address.
206 |
207 | .. code:: c
208 |
209 |   exception_handler:
210 |       ...
211 |   .align 11 // vector table should be aligned to 0x800
212 |   .global exception_vector_table
213 |   exception_vector_table:
214 |       b exception_handler // branch to a handler function.
215 |       .align 7 // entry size is 0x80, .align will pad 0
216 |       b exception_handler
217 |       .align 7
218 |       b exception_handler
219 |       .align 7
220 |       b exception_handler
221 |       .align 7
222 |
223 |       b exception_handler
224 |       .align 7
225 |       b exception_handler
226 |       .align 7
227 |       b exception_handler
228 |       .align 7
229 |       b exception_handler
230 |       .align 7
231 |
232 |       b exception_handler
233 |       .align 7
234 |       b exception_handler
235 |       .align 7
236 |       b exception_handler
237 |       .align 7
238 |       b exception_handler
239 |       .align 7
240 |
241 |       b exception_handler
242 |       .align 7
243 |       b exception_handler
244 |       .align 7
245 |       b exception_handler
246 |       .align 7
247 |       b exception_handler
248 |       .align 7
249 |
250 |   set_exception_vector_table:
251 |       adr x0, exception_vector_table
252 |       msr vbar_el1, x0
253 |
254 | .. note::
255 |
256 |   The vector table's base address should be aligned to 0x800.
257 |
258 | Exception Handling
259 | -------------------
260 |
261 | After setting the vector table, load the following user program.
262 | The user program takes an exception with the ``svc`` instruction, which is used for system calls.
263 |
264 | The design of system calls is left to the next lab.
265 | Now, your kernel only needs to print the content of ``spsr_el1``, ``elr_el1``, and ``esr_el1`` in the exception handler.
266 |
267 | .. code:: c
268 |
269 |   .section ".text"
270 |   .global _start
271 |   _start:
272 |       mov x0, 0
273 |   1:
274 |       add x0, x0, 1
275 |       svc 0
276 |       cmp x0, 5
277 |       blt 1b
278 |   1:
279 |       b 1b
280 |
281 | ``required 1-3`` Set the vector table and implement the exception handler.
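
When printing ``esr_el1``, it helps to split the value into its fields. The sketch below assumes the Armv8-A layout in which the exception class (EC) is in bits [31:26] and the instruction specific syndrome (ISS) is in bits [24:0]; the helper names are hypothetical.

```c
#include <stdint.h>

#define ESR_EC_SVC64 0x15u   /* SVC instruction executed in AArch64 state */

/* Exception class: ESR_ELx bits [31:26]. */
static uint32_t esr_ec(uint64_t esr)
{
    return (uint32_t)((esr >> 26) & 0x3f);
}

/* Instruction specific syndrome: ESR_ELx bits [24:0].
 * For svc, the low 16 bits hold the instruction's immediate. */
static uint32_t esr_iss(uint64_t esr)
{
    return (uint32_t)(esr & 0x1ffffff);
}
```

For the user program above, the handler would be expected to see an EC equal to ``ESR_EC_SVC64`` and an ISS of 0, because the program issues ``svc 0``.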
282 |
283 | Context saving
284 | ^^^^^^^^^^^^^^^
285 |
286 | You may find that the above user program behaves unexpectedly.
287 | That's because the user program and the exception handler share the same general-purpose register bank.
288 | You need to save the registers before entering the kernel's function.
289 | Otherwise, they may be corrupted.
290 |
291 | You can use the following code to save registers before entering the kernel and load them before exiting the kernel.
292 |
293 | .. code:: c
294 |
295 |   // save general registers to stack
296 |   .macro save_all
297 |       sub sp, sp, 32 * 8
298 |       stp x0, x1, [sp, 16 * 0]
299 |       stp x2, x3, [sp, 16 * 1]
300 |       stp x4, x5, [sp, 16 * 2]
301 |       stp x6, x7, [sp, 16 * 3]
302 |       stp x8, x9, [sp, 16 * 4]
303 |       stp x10, x11, [sp, 16 * 5]
304 |       stp x12, x13, [sp, 16 * 6]
305 |       stp x14, x15, [sp, 16 * 7]
306 |       stp x16, x17, [sp, 16 * 8]
307 |       stp x18, x19, [sp, 16 * 9]
308 |       stp x20, x21, [sp, 16 * 10]
309 |       stp x22, x23, [sp, 16 * 11]
310 |       stp x24, x25, [sp, 16 * 12]
311 |       stp x26, x27, [sp, 16 * 13]
312 |       stp x28, x29, [sp, 16 * 14]
313 |       str x30, [sp, 16 * 15]
314 |   .endm
315 |
316 |   // load general registers from stack
317 |   .macro load_all
318 |       ldp x0, x1, [sp, 16 * 0]
319 |       ldp x2, x3, [sp, 16 * 1]
320 |       ldp x4, x5, [sp, 16 * 2]
321 |       ldp x6, x7, [sp, 16 * 3]
322 |       ldp x8, x9, [sp, 16 * 4]
323 |       ldp x10, x11, [sp, 16 * 5]
324 |       ldp x12, x13, [sp, 16 * 6]
325 |       ldp x14, x15, [sp, 16 * 7]
326 |       ldp x16, x17, [sp, 16 * 8]
327 |       ldp x18, x19, [sp, 16 * 9]
328 |       ldp x20, x21, [sp, 16 * 10]
329 |       ldp x22, x23, [sp, 16 * 11]
330 |       ldp x24, x25, [sp, 16 * 12]
331 |       ldp x26, x27, [sp, 16 * 13]
332 |       ldp x28, x29, [sp, 16 * 14]
333 |       ldr x30, [sp, 16 * 15]
334 |       add sp, sp, 32 * 8
335 |   .endm
336 |
337 |   exception_handler:
338 |       save_all
339 |       bl exception_entry
340 |       load_all
341 |       eret
342 |
343 | ``required 1-4`` Save the user program's context before executing the exception handler.
344 |
345 | Requirement 2
346 | =============
347 |
348 | Core Timer Interrupt
349 | ---------------------
350 |
351 | Each of rpi3's CPU cores has its own core timer.
352 | It can be configured by the following system registers.
353 |
354 | * ``cntpct_el0``: The timer's current count.
355 |
356 | * ``cntp_cval_el0``: A compared timer count. If ``cntpct_el0`` >= ``cntp_cval_el0``, interrupt the CPU core.
357 |
358 | * ``cntp_tval_el0``: (``cntp_cval_el0`` - ``cntpct_el0``). You can use it to set a timeout relative to the current timer count.
359 |
360 | To enable the timer's interrupt, you need to
361 |
362 | 1. set ``cntp_ctl_el0`` to 1.
363 | 2. unmask the timer interrupt from the first level interrupt controller.
364 | 3. enable the CPU core's interrupt.
365 |
366 | In the required part, you only need to enable interrupts in EL0.
367 | You can do it by setting ``spsr_el1`` to 0 before returning to EL0.
368 |
369 | You can use the following code to enable the core timer's interrupt.
370 |
371 | .. code:: c
372 |
373 |   #define CORE0_TIMER_IRQ_CTRL 0x40000040
374 |
375 |   core_timer_enable:
376 |       mov x0, 1
377 |       msr cntp_ctl_el0, x0 // enable the timer
378 |       mrs x0, cntfrq_el0
379 |       msr cntp_tval_el0, x0 // set the timeout to cntfrq_el0 ticks (1 second) from now
380 |       mov x0, 2
381 |       ldr x1, =CORE0_TIMER_IRQ_CTRL
382 |       str w0, [x1] // unmask timer interrupt
383 |
384 |   core_timer_handler:
385 |       mrs x0, cntfrq_el0
386 |       msr cntp_tval_el0, x0 // set the next timeout to 1 second from now
387 |
388 | ``required 2`` Enable the core timer's interrupt. The interrupt handler should print the seconds since boot and set the next timeout to 2 seconds later.
389 |
390 | .. hint::
391 |
392 |   You can get the seconds since boot from the count of the timer (``cntpct_el0``) and the frequency of the timer (``cntfrq_el0``).
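
The arithmetic in the hint can be sketched in C as follows. The functions take the register values as plain arguments so that they can be tried off-target; the names ``seconds_since_boot`` and ``tval_for_seconds`` are hypothetical.

```c
#include <stdint.h>

/* Seconds since boot, from the counter value and its frequency in Hz. */
static uint64_t seconds_since_boot(uint64_t cntpct, uint64_t cntfrq)
{
    return cntpct / cntfrq;
}

/* Value to write to cntp_tval_el0 so the timer fires `seconds` from now. */
static uint64_t tval_for_seconds(uint64_t cntfrq, uint64_t seconds)
{
    return cntfrq * seconds;
}
```

In the kernel, the two inputs would come from ``mrs`` reads of ``cntpct_el0`` and ``cntfrq_el0``.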
393 |
394 | *********
395 | Elective
396 | *********
397 |
398 | Enable Interrupt in EL1
399 | ========================
400 |
401 | In the elective part, you are required to enable interrupts in EL1.
402 | You may disable interrupts only to protect critical sections.
403 | You can use the following code to enable/disable interrupts.
404 |
405 | .. code-block:: c
406 |
407 |   // enable interrupt
408 |   msr DAIFClr, 0xf
409 |   // disable interrupt
410 |   msr DAIFSet, 0xf
411 |
412 | .. note::
413 |
414 |   This part is a prerequisite for the following elective parts, but it doesn't count toward your score.
415 |
416 | Rpi3's Peripheral Interrupt
417 | ============================
418 |
419 | In this elective part, you need to implement rpi3's mini UART's interrupt handling.
420 | Then, you no longer have to busy-poll the UART device.
421 |
422 | Enable mini UART's Interrupt
423 | ------------------------------
424 |
425 | To enable mini UART's interrupt,
426 | you need to set ``AUX_MU_IER_REG(0x3f215044)`` and bit 29 of the second level interrupt controller's ``Enable IRQs1(0x3f00b210)``.
427 |
428 | Determine the Interrupt Source
429 | --------------------------------
430 |
431 | When the UART's interrupt is enabled, there is more than one interrupt source to the CPU.
432 | Hence, your kernel needs to check the source of the interrupt before executing the corresponding interrupt handler.
433 | Please refer to both interrupt controllers' manuals to determine the interrupt source.
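
The dispatch logic can be sketched as below. The bit positions are assumptions about the two controllers (core timer and GPU bits in the core's interrupt source register, mini UART bit in IRQ pending 1); verify them against the manuals before relying on them. The function takes the register values as arguments, and all names are hypothetical.

```c
#include <stdint.h>

/* Assumed bit positions; check them against both manuals. */
#define CORE_SRC_CNTPNSIRQ (1u << 1)   /* core timer, in the core's interrupt source register */
#define CORE_SRC_GPU       (1u << 8)   /* aggregated peripheral (GPU) interrupt */
#define PENDING1_AUX       (1u << 29)  /* mini UART, in IRQ pending 1 */

enum irq_source { IRQ_NONE, IRQ_CORE_TIMER, IRQ_MINI_UART, IRQ_OTHER_PERIPHERAL };

/* Decide which handler to run from the two controllers' status values. */
static enum irq_source classify_irq(uint32_t core_source, uint32_t pending1)
{
    if (core_source & CORE_SRC_CNTPNSIRQ)
        return IRQ_CORE_TIMER;
    if (core_source & CORE_SRC_GPU)
        return (pending1 & PENDING1_AUX) ? IRQ_MINI_UART : IRQ_OTHER_PERIPHERAL;
    return IRQ_NONE;
}
```

This sketch reports only one source per call. If several sources are pending at once, a real handler would either loop until nothing is pending or rely on the interrupt being taken again.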
434 |
435 | Asynchronous Read and Write
436 | ----------------------------
437 |
438 | In previous labs, your shell blocks the execution by busy polling the UART when it needs to read or write.
439 | Now, you can create a read buffer and a write buffer.
440 | Your shell writes bytes to the write buffer when it prints a message.
441 | The data is sent asynchronously by the UART's TX interrupt handler.
442 | Also, the UART's RX interrupt handler puts data in the read buffer.
443 | The shell reads an array of bytes from the buffer and gets the number of bytes it read.
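
One common way to build such buffers is a ring buffer. The sketch below is an illustration under assumptions, not a required design: the capacity and the names ``ring_put`` and ``ring_read`` are hypothetical.

```c
#include <stddef.h>

#define RING_SIZE 64   /* hypothetical capacity; must be a power of two */

struct ring {
    char data[RING_SIZE];
    unsigned head;   /* count of bytes written so far */
    unsigned tail;   /* count of bytes read so far */
};

/* Append one byte. Returns 0 on success, -1 if the buffer is full. */
static int ring_put(struct ring *r, char c)
{
    if (r->head - r->tail == RING_SIZE)
        return -1;
    r->data[r->head++ % RING_SIZE] = c;
    return 0;
}

/* Copy up to `n` bytes into `out`; returns the number of bytes copied. */
static size_t ring_read(struct ring *r, char *out, size_t n)
{
    size_t got = 0;
    while (got < n && r->tail != r->head)
        out[got++] = r->data[r->tail++ % RING_SIZE];
    return got;
}
```

The RX handler would call ``ring_put`` and the shell would call ``ring_read``. Because both sides touch ``head`` and ``tail``, the shell side is a critical section when interrupts are enabled in EL1.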
444 |
445 | ``elective 1`` Implement the asynchronous UART read/write by interrupt handlers.
446 |
447 | .. note::
448 |
449 |   You don't have to replace all print functions with the asynchronous version.
450 |
451 | Timer Multiplexing
452 | ===================
453 |
454 | Timers can be used for periodic jobs such as scheduling and journaling, and for one-shot jobs such as sleeping and timeouts.
455 | However, the number of hardware timers is limited.
456 | Therefore, the kernel needs a software mechanism to multiplex the timer.
457 |
458 | One simple way is using a periodic timer.
459 | The kernel can use the tick period as the time unit and calculate the corresponding timeout tick.
460 | For example, suppose the periodic timer's frequency is 1000 Hz and a process sleeps for 1.5 seconds.
461 | The kernel can add a wake-up event 1500 ticks after the current tick.
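
The conversion in this example can be written as a small helper. Rounding up is a design choice made here so that a sleeper never wakes early; the name ``ms_to_ticks`` is hypothetical.

```c
#include <stdint.h>

/* Convert a duration in milliseconds to timer ticks, rounding up.
 * `hz` is the tick frequency. */
static uint64_t ms_to_ticks(uint64_t ms, uint64_t hz)
{
    return (ms * hz + 999) / 1000;
}
```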
462 |
463 | However, when the tick frequency is too low, the timer has poor resolution.
464 | Then, it can't be used for time-sensitive jobs.
465 | When the tick frequency is too high, it introduces a lot of overhead for redundant timer interrupt handling.
466 |
467 | Another way is using a one-shot timer.
468 | When someone needs a timeout event, a timer is inserted into a timer queue.
469 | If the timeout is earlier than the previously programmed expiry time, the kernel reprograms the hardware timer to the earlier one.
470 | In the timer interrupt handler, the kernel executes the expired timer's callback function.
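
A minimal sketch of such a timer queue, assuming a small static pool and a list kept sorted by expiry time. Times are passed in as plain tick counts so the code can be tried off-target. The names ``timer_add``, ``run_expired``, and ``record`` are hypothetical, and ``record`` is only an example callback.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_TIMERS 8   /* hypothetical pool size */

struct timer {
    uint64_t expires;             /* absolute time, in timer ticks */
    void (*callback)(void *);
    void *data;
    struct timer *next;
    int in_use;
};

static struct timer pool[MAX_TIMERS];
static struct timer *timer_head;  /* sorted by `expires`, earliest first */

/* Queue a timer that fires `after` ticks from `now`. Returns 1 if it became
 * the earliest one (the caller must reprogram the hardware timer), 0 if not,
 * and -1 if the pool is full. */
static int timer_add(void (*callback)(void *), void *data,
                     uint64_t now, uint64_t after)
{
    struct timer *t = NULL;
    for (int i = 0; i < MAX_TIMERS; i++)
        if (!pool[i].in_use) { t = &pool[i]; break; }
    if (!t)
        return -1;
    t->expires = now + after;
    t->callback = callback;
    t->data = data;
    t->in_use = 1;

    struct timer **pp = &timer_head;
    while (*pp && (*pp)->expires <= t->expires)
        pp = &(*pp)->next;
    t->next = *pp;
    *pp = t;
    return timer_head == t;
}

/* Called from the timer interrupt handler: run every expired callback. */
static void run_expired(uint64_t now)
{
    while (timer_head && timer_head->expires <= now) {
        struct timer *t = timer_head;
        timer_head = t->next;
        t->in_use = 0;
        t->callback(t->data);
    }
}

/* Example callback: records the order in which timers fire. */
static int fired[MAX_TIMERS];
static int num_fired;
static void record(void *data) { fired[num_fired++] = *(int *)data; }
```

After ``run_expired`` returns, the handler would program the hardware timer for the new head of the queue, if there is one.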
471 |
472 | In this elective part, you need to implement a timer API that lets a user register a callback function to be called on
473 | timeout, using the one-shot timer (the core timer is a one-shot timer).
474 | The API and its use case should look like the pseudocode below.
475 |
476 | .. code:: python
477 |
478 |   # An example API
479 |   def add_timer(callback, data, after):
480 |       ...
481 |
482 |   # An example use case
483 |   def sleep(duration):
484 |       add_timer(wakeup, current_process, duration)
485 |
486 | To test the API, you need to implement the shell command ``setTimeout MESSAGE SECONDS``.
487 | It prints MESSAGE after SECONDS, along with the current time and the time the command was executed.
488 |
489 | ``elective 2`` Implement the ``setTimeout`` command with the timer API.
490 |
491 | .. note::
492 |   ``setTimeout`` is non-blocking. Users can set multiple timeouts.
493 |   The printing order is determined by the time each command was executed and the user-specified SECONDS.
494 |
495 | Concurrent I/O Devices Handling
496 | ===============================
497 |
498 | The kernel needs to handle a lot of I/O devices at the same time.
499 | For devices (e.g. UART) that need only a short processing time,
500 | the kernel can finish their handlers immediately after they're ready.
501 | However, for those devices (e.g. network interface controller) that require a longer time for the follow-up processing,
502 | the kernel needs to schedule the execution order.
503 |
504 | Usually, we want to use the first-come, first-served principle to prevent starvation.
505 | However, we may also want prioritized execution for some critical handlers.
506 | In this part, you need to know how to implement it using a single thread (i.e. a single stack).
507 |
508 | Decouple the Interrupt Handlers
509 | ---------------------------------
510 |
511 | A simple way to implement an interrupt handler is to process all of the device's data in the handler with interrupts disabled, one handler at a time.
512 | However, a less critical interrupt handler can block a more critical one for a long time.
513 | Hence, we want to decouple the interrupt handler and the actual processing.
514 |
515 | This can be achieved by a task queue.
516 | In the interrupt handler, the kernel
517 |
518 | 1. masks the device's interrupt line,
519 | 2. moves data from the device's buffer through DMA or copies it manually,
520 | 3. enqueues the processing task to the task queue,
521 | 4. does the tasks with interrupts enabled,
522 | 5. unmasks the interrupt line to get the next interrupt at the end of the task.
523 |
524 | Those tasks in the queue can be processed when the system is idle.
525 | Also, the kernel can execute the task in any order such as FIFO or LIFO.
526 |
527 | ``elective 3-1`` Implement a task queue mechanism, so interrupt handlers can add their processing tasks to it.
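
A minimal FIFO sketch of such a queue, assuming a fixed-size array of function pointers. The capacity and all names are hypothetical, and ``bump`` is only an example task.

```c
#include <stddef.h>

#define MAX_TASKS 16   /* hypothetical capacity; must be a power of two */

struct task { void (*fn)(void *); void *arg; };

static struct task queue[MAX_TASKS];
static unsigned q_head, q_tail;   /* q_head: next to run, q_tail: next free slot */

/* Called by an interrupt handler while interrupts are disabled.
 * Returns 0 on success, -1 if the queue is full. */
static int task_enqueue(void (*fn)(void *), void *arg)
{
    if (q_tail - q_head == MAX_TASKS)
        return -1;
    queue[q_tail++ % MAX_TASKS] = (struct task){ fn, arg };
    return 0;
}

/* Run queued tasks in FIFO order; returns how many ran. */
static int task_run_all(void)
{
    int ran = 0;
    while (q_head != q_tail) {
        struct task t = queue[q_head++ % MAX_TASKS];
        /* A kernel would enable interrupts here and disable them afterwards. */
        t.fn(t.arg);
        ran++;
    }
    return ran;
}

/* Example task: increments the counter it is given. */
static void bump(void *arg) { *(int *)arg += 1; }
```

The task is copied out of the queue before it runs, so a handler that interrupts the task can safely enqueue more work.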
528 |
529 | Nested Interrupt
530 | ------------------
531 |
532 | The tasks in the queue can be executed at any time, but we want them to be executed as soon as possible.
533 | This is because a high-priority process may be waiting for the data.
534 |
535 | Therefore, before the interrupt handler returns to the user program,
536 | it should execute the tasks in the interrupt context with interrupts enabled (otherwise, critical interrupts are blocked).
537 | Then, the interrupt handler may be nested.
538 | Hence, besides general-purpose registers, you should also save ``spsr_el1`` and ``elr_el1`` so the previously saved values are preserved.
539 |
540 | ``elective 3-2`` Execute the tasks in the queue before returning to the user program with interrupts enabled.
541 |
542 | Preemption
543 | -----------
544 |
545 | Now, any interrupt handler can preempt the task's execution, but the newly enqueued task still needs to wait for
546 | the currently running task's completion.
547 | It would be better if a newly enqueued task with a higher priority could preempt the currently running task.
548 |
549 | To achieve the preemption,
550 | the kernel can check the last executing task's priority before returning to the previous interrupt handler.
551 | If there are higher priority tasks, execute the highest priority task.
552 |
553 | ``elective 3-3`` Implement the task queue's preemption mechanism.
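
One possible shape of the preemption check is sketched below, assuming that a larger number means a higher priority and that the kernel tracks the priority of the task it is currently running. All names are hypothetical. ``low_task`` simulates an interrupt that arrives in the middle of a task by queuing a higher-priority task and calling ``ptask_run`` itself.

```c
#include <stddef.h>

#define MAX_PENDING 16
#define PRIO_IDLE   (-1)          /* no task is running */

struct ptask { void (*fn)(void *); void *arg; int prio; int used; };

static struct ptask pending[MAX_PENDING];
static int current_prio = PRIO_IDLE;   /* priority of the task being run */

static int ptask_add(void (*fn)(void *), void *arg, int prio)
{
    for (int i = 0; i < MAX_PENDING; i++)
        if (!pending[i].used) {
            pending[i] = (struct ptask){ fn, arg, prio, 1 };
            return 0;
        }
    return -1;
}

/* Call before returning from an interrupt handler. Runs every pending task
 * whose priority is higher than the interrupted task's, highest first. */
static void ptask_run(void)
{
    for (;;) {
        int best = -1;
        for (int i = 0; i < MAX_PENDING; i++)
            if (pending[i].used && pending[i].prio > current_prio &&
                (best < 0 || pending[i].prio > pending[best].prio))
                best = i;
        if (best < 0)
            return;
        struct ptask t = pending[best];
        pending[best].used = 0;
        int saved = current_prio;
        current_prio = t.prio;
        t.fn(t.arg);              /* interrupts would be enabled while this runs */
        current_prio = saved;
    }
}

/* Demo: a log of the order in which tasks finish. */
static int order_log[4], order_len;
static void finish(void *arg) { order_log[order_len++] = *(int *)arg; }

/* A low-priority task that is "interrupted" halfway through. */
static void low_task(void *arg)
{
    static int high_id = 2;
    ptask_add(finish, &high_id, 5);
    ptask_run();                  /* the higher-priority task preempts here */
    finish(arg);
}
```

Tasks with a priority equal to or lower than the running task stay in the queue until the running task finishes, which keeps the nesting depth bounded by the number of priority levels.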
554 |
--------------------------------------------------------------------------------
/labs/lab5.rst:
--------------------------------------------------------------------------------
1 | ==============================
2 | Lab 5: Thread and User Process
3 | ==============================
4 |
5 | ************
6 | Introduction
7 | ************
8 |
9 | Multitasking is the most important feature of an operating system.
10 | In this lab, you'll learn how to create threads and how to switch between different threads to achieve multitasking.
11 | Moreover, you'll learn how a user program becomes a user process and accesses services provided by the kernel through system calls.
12 |
13 | *******************
14 | Goals of this lab
15 | *******************
16 |
17 | * Understand how to create threads and user processes.
18 | * Understand how to implement a scheduler and context switching.
19 | * Understand what preemption is.
20 | * Understand how to implement the waiting mechanism.
21 | * Understand how to implement POSIX signals.
22 |
23 |
24 | **********
25 | Background
26 | **********
27 |
28 | Threads
29 | =======
30 |
31 | In the previous lab, you already learned how to implement multitasking with a single stack.
32 | However, in the case of single stack multitasking, the CPU thread can't switch between two tasks at any time.
33 | Otherwise, a task may corrupt another task's context stored on the stack.
34 |
35 | As you know, the context of a CPU thread is determined by the values of its register set.
36 | Therefore, we can create different copies of register sets stored in the memory to represent different threads.
37 | When we want a CPU thread to run a specific thread, we let the CPU thread load the corresponding register set in the memory to its registers.
38 | Then, from a macro point of view, there are multiple CPU threads running tasks independently at the same time.
39 | Moreover, these register sets can be loaded by any other CPU thread to achieve true parallelism.
40 |
41 | .. note::
42 |   In this documentation, a thread means a software thread.
43 |   Processing elements that contain their own hardware register sets are called CPU threads.
44 |
45 | User Process
46 | ============
47 |
48 | When a user wants to run an application,
49 | the operating system loads the user program into memory and runs it with one or multiple threads.
50 | However, users want to run multiple programs or multiple copies of the same program.
51 | Moreover, they want each executing program to be isolated and have its own identity, capabilities, and resources.
52 | To achieve this, the operating system maintains multiple isolated execution instances called processes.
53 | A process can only access the resources it owns.
54 | If it needs additional resources, it invokes the corresponding system calls.
55 | The kernel then checks the capabilities of the process and only provides the resource if the process has the access right.
56 |
57 | MMU-less
58 | ---------
59 |
60 | In general, programs that directly run machine code on the CPU are isolated by virtual memory.
61 | However, we don't enable the MMU in this lab,
62 | so we can't prevent illegal memory access and we can't use the same virtual address for different processes.
63 | If you want to execute multiple programs, please use different linker scripts for different programs
64 | and load them to different addresses to prevent overlapping.
65 |
66 | Run Queue and Wait Queue
67 | =========================
68 |
69 | One CPU thread can run only one thread at a time, but there may be multiple runnable threads at the same time.
70 | Those runnable threads are put in the run queue.
71 | When the current thread relinquishes control of the CPU thread, it calls the scheduler to pick the next thread.
72 | Then, a piece of code saves the CPU thread's register set and loads the next thread's register set.
73 |
74 | During a thread's execution, it may need to wait for a certain resource (e.g. a locked mutex or a non-ready IO device).
75 | Instead of busy waiting, a more efficient way is to yield the CPU thread so other threads can do meaningful jobs.
76 | Yet, yielding CPU is not enough because the thread may be scheduled again and waste CPU time.
77 | Therefore, when a thread needs to wait for a long time, it removes itself from the run queue, puts itself in a wait queue,
78 | and waits for others to wake it up.
79 |
80 | In general, each resource has its own wait queue.
81 | When the resource is ready, one or many waiting threads in the wait queue will be put back into the run queue.
82 | The awakened thread is eventually scheduled and runs.
83 | Then, it can get the resource if the resource is still available.
84 |
85 | Yield and Preemption
86 | =====================
87 | As mentioned above, a thread can voluntarily yield the CPU thread to others.
88 | Yet, we can't rely on a voluntary yield, because if a thread never yields,
89 | a high-priority thread can't run even when it's runnable.
90 | Hence, the kernel should be able to force the current thread to yield the CPU thread (i.e. preemption).
91 |
92 | The implementation of preemption is simple.
93 | Once a thread relinquishes control of the CPU thread during its execution,
94 | there is a chance for another piece of code to call the scheduler and switch to another thread.
95 | For example, when a thread in kernel mode is interrupted, the control is handed over to the interrupt handler.
96 | Before returning to the original execution, the kernel can call the scheduler to do a context switch to achieve kernel preemption.
97 | When a user process takes exceptions(system calls, interrupts, etc.), the control is handed over to the exception handler.
98 | Before returning to the original execution, the kernel can call the scheduler to do a context switch to achieve user preemption.
99 |
100 | The tricky part of preemption is the protection of critical sections because code executions are arbitrarily interleaved now.
101 | Fortunately, user programs protect their critical sections themselves.
102 | Even if the user program doesn't protect the critical sections well, it's the user program's developer's fault, and no one will blame the operating system.
103 | Also, it's not possible to break other isolated processes.
104 | Therefore, the kernel developers don't need to worry about problems caused by enabling user preemption.
105 |
106 | On the contrary, there are multiple shared resources in the kernel.
107 | Meanwhile, a data race in the kernel can break the entire system.
108 | If the kernel's developers need to enable fine-grained preemption in kernel mode,
109 | they need to be aware of all possible shared resource accesses and adopt the right methods to protect them.
110 | Hence, it's more complex to enable kernel preemption.
111 |
112 |
113 | *********
114 | Required
115 | *********
116 |
117 | Requirement 1
118 | =============
119 |
120 | In this part, you need to implement the creation, switching, and recycling of threads.
121 |
122 | Creating a Thread
123 | ------------------
124 |
125 | Implement a thread-creating API.
126 | Users can pass a function (task) to the API, and the function is run in a newly created thread.
127 | To make the thread schedulable and runnable, you should create a data structure and a stack for it.
128 | Then, put it into the run queue.
129 |
130 | The example API is listed below.
131 |
132 | .. code:: python
133 |
134 | def foo():
135 | pass
136 |
137 | t = Thread(foo)
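
In C, the data structure and creation step might look like the sketch below. It assumes the saved context holds the callee-saved registers in the order x19 to x28, fp, lr, sp, which matches the ``switch_to`` listing in the next section. The struct layout, ``STACK_SIZE``, and ``thread_create`` are hypothetical, and ``malloc``/``calloc`` stand in for the kernel's own allocator.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define STACK_SIZE 4096   /* hypothetical per-thread stack size */

/* Callee-saved context: x19..x28 at offsets 0..79, fp at 80, lr at 88, sp at 96. */
struct context {
    uint64_t x19_x28[10];
    uint64_t fp;
    uint64_t lr;
    uint64_t sp;
};

struct thread {
    struct context ctx;   /* kept first so a thread pointer is also a context pointer */
    int id;
    void *stack;
    struct thread *next;  /* run queue link */
};

/* Allocate a thread whose first context switch jumps into `func`. */
static struct thread *thread_create(void (*func)(void))
{
    static int next_id;
    struct thread *t = calloc(1, sizeof(*t));
    if (!t)
        return NULL;
    t->stack = malloc(STACK_SIZE);
    if (!t->stack) {
        free(t);
        return NULL;
    }
    t->id = next_id++;
    t->ctx.lr = (uint64_t)(uintptr_t)func;                   /* first `ret` enters func */
    t->ctx.sp = (uint64_t)(uintptr_t)t->stack + STACK_SIZE;  /* stack grows downward */
    t->ctx.fp = t->ctx.sp;
    return t;
}

/* Example task, mirroring foo() above. */
static void foo(void) { }
```

After creation, the caller would put the returned thread into the run queue.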
138 |
139 | Scheduler and Context Switch
140 | -----------------------------
141 |
142 | Implement the ``schedule()`` API.
143 | When the current thread calls this API, the scheduler picks the next thread from the run queue.
144 | In this lab, your scheduler should at least be able to schedule the threads of the same priority in a **round-robin** manner.
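
A round-robin pick can be sketched with a singly linked FIFO run queue. The struct and function names are hypothetical, and the sketch ignores priorities.

```c
#include <stddef.h>

struct rq_thread {
    int id;
    struct rq_thread *next;
};

static struct rq_thread *rq_head, *rq_tail;

/* Append a runnable thread to the back of the run queue. */
static void rq_push(struct rq_thread *t)
{
    t->next = NULL;
    if (rq_tail)
        rq_tail->next = t;
    else
        rq_head = t;
    rq_tail = t;
}

/* Remove and return the thread at the front, or NULL if the queue is empty. */
static struct rq_thread *rq_pop(void)
{
    struct rq_thread *t = rq_head;
    if (t) {
        rq_head = t->next;
        if (!rq_head)
            rq_tail = NULL;
    }
    return t;
}

/* Round-robin: the current thread goes to the back of the queue and the
 * thread at the front runs next. Returns the thread to switch to. */
static struct rq_thread *pick_next(struct rq_thread *current)
{
    rq_push(current);
    return rq_pop();
}
```

If the current thread is the only runnable one, ``pick_next`` returns it again and no context switch is needed.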
145 |
146 | After the next thread is picked, the kernel can save the current thread's register set and load the next thread's.
147 |
148 | .. code:: c
149 |
150 |   .global switch_to
151 |   switch_to:
152 |       stp x19, x20, [x0, 16 * 0]
153 |       stp x21, x22, [x0, 16 * 1]
154 |       stp x23, x24, [x0, 16 * 2]
155 |       stp x25, x26, [x0, 16 * 3]
156 |       stp x27, x28, [x0, 16 * 4]
157 |       stp fp, lr, [x0, 16 * 5]
158 |       mov x9, sp
159 |       str x9, [x0, 16 * 6]
160 |
161 |       ldp x19, x20, [x1, 16 * 0]
162 |       ldp x21, x22, [x1, 16 * 1]
163 |       ldp x23, x24, [x1, 16 * 2]
164 |       ldp x25, x26, [x1, 16 * 3]
165 |       ldp x27, x28, [x1, 16 * 4]
166 |       ldp fp, lr, [x1, 16 * 5]
167 |       ldr x9, [x1, 16 * 6]
168 |       mov sp, x9
169 |       msr tpidr_el1, x1
170 |       ret
171 |
172 |   .global get_current
173 |   get_current:
174 |       mrs x0, tpidr_el1
175 |       ret
176 |
177 | In the above example, ``get_current`` reads the current thread's data structure from the system register ``tpidr_el1``.
178 | The scheduler then passes the current thread and the next thread to ``switch_to(prev, next)``.
179 | ``switch_to`` saves the CPU's callee-saved registers into the current thread's data structure
180 | and loads the next thread's saved registers.
181 | After the stack pointer and ``tpidr_el1`` are switched, the CPU runs in the context of the next thread.
182 |
183 | .. note::
184 | You only need to save the callee-saved registers,
185 | because the caller has already saved any other registers it still needs on its stack before calling ``switch_to``.
186 |
187 | The Idle Thread
188 | ---------------
189 | The idle thread is a thread that is always runnable.
190 | When there are no other runnable threads,
191 | the scheduler should pick it so that the CPU always has a next instruction to fetch and execute.
192 |
193 | End of a Thread
194 | ---------------
195 |
196 | When a thread finishes its job, it needs to call ``exit()`` explicitly or implicitly (by returning and letting the caller call it)
197 | to indicate that it has terminated.
198 |
199 | In general, a thread can't recycle all of its own resources.
200 | Memory deallocation is a function call, and a thread shouldn't free its stack while still running on it.
201 | Therefore, the finished thread only removes itself from the run queue,
202 | releases the resources it can safely free, sets its state to dead,
203 | and waits for another thread to recycle the rest.
204 |
205 | In UNIX-like operating systems, the recycler is the thread that created the zombie thread (its parent).
206 | The parent can also read the status code from the zombie child's data structure.
207 | In this lab, you can let the idle thread do the job to simplify the implementation.
208 | When the idle thread is scheduled, it checks whether there are any zombie threads.
209 | If so, it recycles them as follows.
210 |
211 | .. code:: python
212 |
213 |     def idle():
214 |         while True:
215 |             kill_zombies() # reclaim threads marked as DEAD
216 |             schedule() # switch to any other runnable thread
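
The split between the exiting thread and the recycler described in this section can be modeled as follows. This is a minimal sketch in plain Python, not kernel code: ``thread_exit``, the ``zombies`` list, and the ``bytearray`` standing in for a stack are assumptions made for illustration.

```python
RUNNABLE, DEAD = "runnable", "dead"

class Thread:
    def __init__(self, tid):
        self.tid = tid
        self.state = RUNNABLE
        self.stack = bytearray(4096)  # stands in for the thread's stack memory

run_queue = []
zombies = []

def thread_exit(thread):
    # The exiting thread is still running on its own stack, so it only
    # unlinks itself and marks itself dead; the stack is left untouched.
    run_queue.remove(thread)
    thread.state = DEAD
    zombies.append(thread)

def kill_zombies():
    # Runs in another thread (the idle thread in this lab), so releasing
    # the dead thread's stack is safe here.
    reclaimed = 0
    while zombies:
        dead = zombies.pop()
        dead.stack = None
        reclaimed += 1
    return reclaimed

a, b = Thread(1), Thread(2)
run_queue.extend([a, b])
thread_exit(a)
```

After ``thread_exit(a)``, thread ``a`` is off the run queue but still owns its stack; only a later ``kill_zombies()`` call from a different thread releases it.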
217 |
218 | Test
219 | ----
220 |
221 | Please test your implementation in the demo with the following code or logically equivalent code.
222 |
223 | Expected result: multiple threads print the content interleaved.
224 |
225 | .. code:: c
226 |
227 | void foo(){
228 | for(int i = 0; i < 10; ++i) {
229 | printf("Thread id: %d %d\n", current_thread().id(), i);
230 | delay(1000000);
231 | schedule();
232 | }
233 | }
234 |
235 | void kernel_main() {
236 | // ...
237 | // boot setup
238 | // ...
239 | for(int i = 0; i < N; ++i) { // N should be > 2
240 | thread_create(foo);
241 | }
242 | idle();
243 | }
244 |
245 | ``required 1`` Implement the thread mechanism.
246 |
247 | Requirement 2
248 | =============
249 |
250 | In this part, you need to implement the basic user process mechanism such as system calls and user preemption.
251 |
252 |
253 | Arguments Passing
254 | ------------------
255 | In the previous lab, your kernel could already load a user program and handle system calls from it.
256 | In this lab, you need to add argument passing to the program loader,
257 | so you can create processes with different arguments.
258 |
259 | .. image:: img/argv.svg
260 |
261 | As shown in the above image, you can put the strings, the pointers to the strings, and the number of arguments at the top of the user stack.
262 | Then set the user's stack pointer to the corresponding address so that the user program can find the passed arguments.
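
As an illustration, the sketch below builds one possible layout in a byte string: the strings at the highest addresses, and below them ``argc``, the ``argv`` pointers, and a terminating NULL pointer, with the stack pointer kept 16-byte aligned. The exact ordering is an assumption and may differ from the figure; ``build_user_stack`` is a hypothetical helper, and 8-byte little-endian pointers are assumed.

```python
import struct

def build_user_stack(stack_top, args):
    """Return (sp, image) for one possible argv layout below stack_top."""
    # 1. Place the NUL-terminated strings at the highest addresses.
    addr = stack_top
    pointers = []
    blobs = []
    for arg in reversed(args):
        data = arg.encode() + b"\0"
        addr -= len(data)
        pointers.insert(0, addr)
        blobs.insert(0, data)
    strings = b"".join(blobs)
    # 2. Below the strings: argc, argv[0..argc-1], NULL (8 bytes each).
    table_size = 8 * (1 + len(args) + 1)
    sp = (addr - table_size) & ~0xF          # keep sp 16-byte aligned
    padding = addr - table_size - sp         # gap between table and strings
    table = struct.pack("<Q", len(args))
    table += b"".join(struct.pack("<Q", p) for p in pointers)
    table += struct.pack("<Q", 0)
    return sp, table + b"\0" * padding + strings

sp, image = build_user_stack(0x80000, ["init", "arg1"])
```

``image`` is the content to copy to address ``sp``. Whether the user program then reads ``argc`` and ``argv`` from the stack or receives them in registers is a design choice left to the loader and the program's startup code.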
263 |
264 | After that, you can create multiple threads with different arguments to run multiple user processes.
265 |
266 | .. code:: python
267 |
268 |     def init():
269 |         exec("init", ["init", "arg1"])
270 |
271 |     init_thread = Thread(init)
272 |
273 | .. warning::
274 | Be aware of alignment when setting up the user stack: an unaligned access on rpi3 causes an exception.
275 |
276 | System Calls
277 | -------------
278 | In the previous lab, your user program could already trap to the kernel with the ``svc`` instruction.
279 | In this lab, you need to understand how arguments and return values are passed between user mode and kernel mode.
280 | Also, you need to implement some basic system calls so you can write simple user programs.
281 |
282 | Trap Frame
283 | ^^^^^^^^^^^
284 | When a user process takes an exception and enters kernel mode,
285 | its registers are saved at the top of the kernel stack.
286 | Before returning to user mode, the registers are restored.
287 | The saved content is called the trap frame.
288 |
289 | In regular exception handling (e.g. page fault, interrupt),
290 | the kernel doesn't touch the trap frame, so the user process doesn't notice that it entered kernel mode.
291 | However, in the case of system calls, the user program expects the kernel to do something for it.
292 |
293 | As with regular function calls, the program sets the arguments and gets the return value through the general-purpose registers.
294 | The kernel can therefore read the trap frame to get the user's arguments and write to the trap frame to set the return value and the error code.
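
A minimal sketch of this idea, with the trap frame modeled as a Python list holding the saved ``x0`` to ``x30``. The system call numbers and the choice of ``x8`` as the register that carries the number are assumptions for illustration; the lab does not prescribe them.

```python
SYS_GETPID = 0   # hypothetical system call number

def handle_svc(trap_frame, current_pid):
    """Toy dispatcher: trap_frame is the saved x0..x30 as a list of 31 ints."""
    number = trap_frame[8]            # assumption: syscall number passed in x8
    if number == SYS_GETPID:
        trap_frame[0] = current_pid   # return value goes back to the user in x0
    else:
        trap_frame[0] = -1            # unknown call: report an error

frame = [0] * 31
frame[8] = SYS_GETPID
frame[19] = 0x1234                    # some unrelated user register
handle_svc(frame, current_pid=7)
```

Only ``x0`` in the saved frame changes, so when the registers are restored the user program sees the return value and nothing else is disturbed.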
295 |
296 | Required System Calls
297 | ^^^^^^^^^^^^^^^^^^^^^^
298 |
299 | You need to implement the following system calls for user programs.
300 |
301 | **int getpid()**
302 | Get the current process's ID.
303 |
304 | **size_t uart_read(char buf[], size_t size)**
305 | Read **size** bytes into the user-provided buffer **buf** and return the number of bytes read.
306 |
307 | **size_t uart_write(const char buf[], size_t size)**
308 | Write **size** bytes from the user-provided buffer **buf** and return the number of bytes written.
309 |
310 | **int exec(const char* name, char *const argv[])**
311 | Execute the program with arguments.
312 |
313 | **void exit()**
314 | Terminate the current process.
315 |
316 | In addition, ``fork()`` is the classic way for UNIX-like operating systems to duplicate the current process.
317 | You also need to implement it so a process can create another process.
318 | After ``fork()`` is invoked, two processes execute the same code.
319 | To distinguish them, set the parent's return value to the child's ID and the child's return value to 0.
320 |
321 | Note that the MMU is not enabled yet, so the two processes behave more like two threads.
322 | Please copy the content of the parent's stack to the child's stack, and don't use global variables, because both processes would share them.
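
The bookkeeping for ``fork()`` can be sketched as below. This is a toy model under stated assumptions: ``Process`` and ``do_fork`` are hypothetical names, the trap frame is a dictionary, and only ``x0`` and ``sp`` are modeled. Because the child's stack lives at a different address when the MMU is off, the sketch moves the child's ``sp`` by the distance between the two stacks.

```python
import copy

class Process:
    def __init__(self, pid, stack_base, stack_size=4096):
        self.pid = pid
        self.stack_base = stack_base           # lowest address of the user stack
        self.stack = bytearray(stack_size)     # stands in for the stack memory
        self.trap_frame = {"x0": 0, "sp": stack_base + stack_size}

def do_fork(parent, child_pid, child_stack_base):
    child = Process(child_pid, child_stack_base, len(parent.stack))
    child.stack[:] = parent.stack                        # duplicate stack content
    child.trap_frame = copy.deepcopy(parent.trap_frame)
    # Keep the same offset into the stack, but in the child's own stack.
    child.trap_frame["sp"] += child.stack_base - parent.stack_base
    parent.trap_frame["x0"] = child.pid                  # parent sees the child's id
    child.trap_frame["x0"] = 0                           # child sees 0
    return child

parent = Process(1, stack_base=0x10000)
parent.stack[-8:] = b"ABCDEFGH"      # pretend the parent pushed 8 bytes
parent.trap_frame["sp"] -= 8
child = do_fork(parent, child_pid=2, child_stack_base=0x20000)
```

In a real kernel, any other saved value that points into the parent's stack (for example a saved frame pointer) would likely need the same adjustment; the sketch does not model that.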
323 |
324 | User Preemption
325 | ----------------
326 |
327 | To implement user preemption, at the end of EL0 to EL1 exception handling,
328 | the kernel should check whether the current thread should be switched out (e.g. its time slice is used up).
329 | If so, call ``schedule()`` to switch to the next thread.
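
One way to structure this check is a per-thread tick counter plus a "needs rescheduling" flag, sketched below in plain Python. ``TIME_SLICE``, ``timer_tick``, and ``exception_exit`` are hypothetical names, and the slice length is arbitrary.

```python
TIME_SLICE = 3  # ticks per time slice; the value is arbitrary

class Thread:
    def __init__(self):
        self.ticks_left = TIME_SLICE
        self.need_resched = False

def timer_tick(thread):
    # Called from the timer interrupt handler.
    thread.ticks_left -= 1
    if thread.ticks_left <= 0:
        thread.need_resched = True

def exception_exit(thread, schedule):
    # Called at the end of EL0 -> EL1 exception handling.
    if thread.need_resched:
        thread.need_resched = False
        thread.ticks_left = TIME_SLICE
        schedule()

switches = []
t = Thread()
for tick in range(7):
    timer_tick(t)
    exception_exit(t, lambda: switches.append(tick))
```

Here the interrupt handler only records that the slice is used up, and the actual call to ``schedule()`` happens at a single well-defined point on the way back to user mode.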
330 |
331 | Test
332 | -----
333 |
334 | Please test your implementation in the demo with the following code or logically equivalent code.
335 |
336 | Expected result:
337 |
338 | 1. argv_test prints the arguments.
339 | 2. fork_test's pid should be the same as argv_test's.
340 | 3. fork_test's parent should print the correct child pid.
341 | 4. fork_test's child should start execution at the correct location.
342 | 5. All processes should exit properly.
343 |
344 | argv_test.c
345 |
346 | .. code:: c
347 |
348 | int main(int argc, char **argv) {
349 | printf("Argv Test, pid %d\n", getpid());
350 | for (int i = 0; i < argc; ++i) {
351 | puts(argv[i]);
352 | }
353 | char *fork_argv[] = {"fork_test", 0};
354 | exec("fork_test", fork_argv);
355 | }
356 |
357 |
358 | fork_test.c
359 |
360 | .. code:: c
361 |
362 | int main(void) {
363 | printf("Fork Test, pid %d\n", getpid());
364 | int cnt = 1;
365 | int ret = 0;
366 | if ((ret = fork()) == 0) { // child
367 | printf("pid: %d, cnt: %d, ptr: %p\n", getpid(), cnt, &cnt);
368 | ++cnt;
369 | fork();
370 | while (cnt < 5) {
371 | printf("pid: %d, cnt: %d, ptr: %p\n", getpid(), cnt, &cnt);
372 | delay(1000000);
373 | ++cnt;
374 | }
375 | } else {
376 | printf("parent here, pid %d, child %d\n", getpid(), ret);
377 | }
378 | }
379 |
380 |
381 | kernel code
382 |
383 | .. code:: c
384 |
385 | void user_test(){
386 | char* argv[] = {"argv_test", "-o", "arg2", 0};
387 | exec("argv_test", argv);
388 | }
389 |
390 | void kernel_main() {
391 | // ...
392 | // boot setup
393 | // ...
394 | thread_create(user_test);
395 | idle();
396 | }
397 |
398 | ``required 2`` Implement the user process related mechanisms.
399 |
400 | *********
401 | Elective
402 | *********
403 |
404 | Wait Queue
405 | ===========
406 |
407 | Implement APIs like the example pseudocode below, so each resource can declare its own wait queue
408 | and a thread can suspend itself until the resource is ready.
409 |
410 | .. code:: python
411 |
412 |     wait_queue = WaitQueue()
413 |
414 |     def block_read():
415 |         while nonblock_read() == Again:
416 |             wait_queue.wait()
417 |
418 |     def handler():
419 |         # ...
420 |         wait_queue.wake_up()
421 |
422 | Then use the wait queue APIs to implement ``sleep()`` and a blocking version of ``uart_read()``.
423 | The current thread suspends itself and waits for an event to wake it up.
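
A minimal sketch of such a wait queue is shown below. It is a plain Python model, not kernel code: threads are represented by strings, ``wait()`` takes the thread explicitly instead of using the current thread, and the switch to another thread is only indicated by a comment.

```python
from collections import deque

run_queue = deque()

class WaitQueue:
    def __init__(self):
        self.waiters = deque()

    def wait(self, thread):
        # Take the thread off the run queue until the resource is ready.
        run_queue.remove(thread)
        self.waiters.append(thread)
        # A real kernel would call schedule() here to switch away.

    def wake_up(self):
        # Make every waiter runnable again; each one re-checks its condition.
        while self.waiters:
            run_queue.append(self.waiters.popleft())

run_queue.extend(["reader", "worker"])
uart_queue = WaitQueue()
uart_queue.wait("reader")            # reader blocks on the UART
while_blocked = list(run_queue)
uart_queue.wake_up()                 # e.g. called from the UART interrupt handler
```

Waking all waiters and letting each re-check its condition matches the ``while nonblock_read() == Again`` loop in the pseudocode above; waking only one waiter is an alternative design.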
424 |
425 | ``elective 1`` Implement ``sleep()`` and ``uart_read()`` using wait queues.
426 |
427 | Kernel Preemption
428 | ==================
429 |
430 | To implement kernel preemption, at the end of EL1 interrupt handling,
431 | the kernel should check whether the current thread should be switched out (e.g. its time slice is used up).
432 | If so, call ``schedule()`` to switch to the next thread.
433 |
434 | Note that you may disable preemption or interrupts only when necessary.
435 | At all other times, your kernel should be preemptible.
436 |
437 | ``elective 2`` Implement kernel preemption.
438 |
439 | POSIX Signal
440 | ==============
441 |
442 | POSIX signals are an asynchronous inter-process communication mechanism.
443 | When a user process receives a signal, it runs a default or registered signal handler, similar to interrupt handling.
444 |
445 | You need to implement the ``kill(pid, signal)`` system call so that a process can send signals to any process.
446 | You also need to implement the default signal handler for SIGINT and SIGKILL (terminate the process).
447 | Finally, you need to implement the ``signal(signal, handler)`` system call so that a user program can register a function as a signal's handler.
448 |
449 |
450 | Implementation
451 | ----------------
452 |
453 | One possible implementation is that the kernel checks if there is any pending signal before the process returns to the user mode.
454 | If yes, the kernel runs the corresponding handler.
455 |
456 | The default signal handlers can run entirely in kernel mode.
457 | In contrast, registered signal handlers should run in user mode.
458 | Furthermore, the user process may enter kernel mode again because of another system call or interrupt while the handler is running.
459 | Therefore, you should save the original context before executing the handler.
460 | After the handler finishes, the kernel restores the context to continue the original execution.
461 |
462 | When the handler finishes and returns, the process is still in user mode.
463 | To make it enter kernel mode and indicate that the handler has finished,
464 | the kernel can set the handler's return address (``lr``) to a piece of code that makes the ``sigreturn()`` system call.
465 | After that call, the kernel knows that the handler is done and restores the previous context.
466 |
467 | Lastly, the handler needs a user stack during its execution.
468 | The kernel should allocate another stack for the handler and recycle it after the handler finishes.
469 | The kernel can also put the process's previous user context and ``sigreturn()`` on it.
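
The flow described above can be sketched as follows. This is a simplified model under stated assumptions: the user context is a dictionary with only ``pc``, ``lr``, ``sp``, and ``x0``; ``SIGRETURN_TRAMPOLINE`` is a made-up address for the code that calls ``sigreturn()``; default actions and stack allocation are not modeled.

```python
SIGRETURN_TRAMPOLINE = 0x7000   # hypothetical address of code that calls sigreturn()

class Process:
    def __init__(self):
        self.context = {"pc": 0x1000, "lr": 0x1100, "sp": 0x8000, "x0": 0}
        self.handlers = {}        # signal number -> registered handler address
        self.pending = []         # signals waiting to be delivered
        self.saved_context = None

def deliver_signal(proc, handler_stack_top):
    """Called just before the process returns to user mode."""
    if not proc.pending:
        return False
    signal = proc.pending.pop(0)
    handler = proc.handlers.get(signal)
    if handler is None:
        return False              # default action runs in kernel mode (not modeled)
    proc.saved_context = dict(proc.context)        # remember where the process was
    proc.context.update(pc=handler, lr=SIGRETURN_TRAMPOLINE,
                        sp=handler_stack_top, x0=signal)
    return True

def sys_sigreturn(proc):
    proc.context = proc.saved_context              # resume the interrupted code
    proc.saved_context = None

p = Process()
p.handlers[10] = 0x2000
p.pending.append(10)
before = dict(p.context)
delivered = deliver_signal(p, handler_stack_top=0x9000)
```

Passing the signal number in ``x0`` makes it the handler's first argument. Keeping a single ``saved_context`` is enough here only because nested registered handlers are out of scope.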
470 |
471 | .. note::
472 | You don't need to handle the case of nested registered signal handlers.
473 |
474 | ``elective 3`` Implement POSIX signal.
475 |
--------------------------------------------------------------------------------
/labs/lab6.rst:
--------------------------------------------------------------------------------
1 | ===========================
2 | Lab 6 : Virtual File System
3 | ===========================
4 |
5 | ***************
6 | Introduction
7 | ***************
8 |
9 | A file system manages data on storage media.
10 | Each file system has a specific way to store and retrieve the data.
11 | Hence, general-purpose operating systems commonly use a virtual file system (VFS) to provide a unified interface for all file systems.
12 |
13 | In this lab, you'll implement a memory-based file system(tmpfs) to get familiar with the concept of VFS.
14 | In the next lab, you'll implement the FAT32 file system to access files from an SD card.
15 | It's recommended to do both together.
16 |
17 | *****************
18 | Goals of this lab
19 | *****************
20 |
21 | * Understand how to set up a root file system.
22 | * Understand how to create, open, close, read, and write files.
23 | * Understand how a user process accesses files through the virtual file system.
24 | * Understand how to mount a file system and look up a file across file systems.
25 | * Understand how to design the procfs.
26 |
27 | ***********
28 | Background
29 | ***********
30 |
31 | Tree Structure
32 | ===============
33 |
34 | A file system is usually hierarchical, so a tree is a suitable data structure to represent it.
35 |
36 | Each **node** represents an entity such as a file or directory in the file system.
37 |
38 | Each **edge** has a name, which is stored in the directory entry.
39 |
40 | There is exactly one **path** between any two **nodes**.
41 | Concatenating the names of all the **edges** on the **path** forms a **pathname** to the file or directory.
42 | The VFS parses the **pathname** to find the target file or directory.
43 |
44 | **Example graph**
45 |
46 | .. image:: img/vfs.png
47 |
48 | Terminology
49 | ============
50 |
51 | File System
52 | ------------
53 | In this documentation, a file system refers to a concrete file system type such as tmpfs, FAT32, etc.
54 | The virtual file system will be shortened as VFS.
55 |
56 | Vnode
57 | ------------
58 |
59 | A vnode is an abstract class in the VFS tree.
60 | The underlying file system implements the methods and creates the instances.
61 | Then, users can access files and directories with a unified interface.
62 |
63 | Component Name
64 | ---------------
65 |
66 | The names in a pathname are separated by '/'.
67 | We call each separated name a **component name**.
68 |
69 | File Handle
70 | ------------
71 | A file in the file system can be opened by multiple processes.
72 | The VFS should maintain a data structure for each opened file,
73 | so information such as the current position for the next read/write operation can be kept individually.
74 | We call the data structure **file handle**.
75 |
76 | ***********
77 | Required
78 | ***********
79 |
80 | Requirement 1
81 | ==============
82 |
83 | In this part, you need to implement the interfaces of the VFS.
84 | Then, you should follow the interfaces to implement a memory-based file system, tmpfs.
85 | Finally, you should mount the tmpfs as the root file system and populate the root file system with the initramfs.
86 |
87 | .. note::
88 | If you only want to finish the required part, you can assume that the root file system contains no subdirectory and no other mounted file system.
89 |
90 |
91 | Mount the Root File System
92 | --------------------------
93 |
94 | The VFS should provide an API for users to choose a file system (e.g. tmpfs in this lab or FAT32 in the next lab)
95 | as the root file system.
96 | Since each file system has its own initialization method, the VFS should provide another interface for each file system to register that method.
97 | Then, the user can specify the file system's name to mount it as the root file system.
98 |
99 | Note that each file system should have a root directory, and the VFS uses the root directory's vnode to look up files.
100 | Hence, the file system should also create a vnode for its root directory.
101 |
102 | Open(Create) and Close a File
103 | ------------------------------
104 |
105 | The VFS should provide an API for users to open a file with the pathname.
106 | The VFS then tries to find the vnode of the file in the mounted file system.
107 |
108 | However, the VFS doesn't know the implementation of the underlying file system.
109 | Hence, the VFS iteratively passes the component name to the current vnode to get the next vnode.
110 |
111 | The underlying file system creates and initializes the vnode, so the vnode should have the method to look up the next level vnode.
112 | Therefore, the VFS can get the vnode of the file if it exists.
113 | Then, the VFS returns the file handle to the caller.
114 |
115 | To create a file that doesn't exist yet, the user passes a flag to the open-file API.
116 | The underlying file system then uses its own method to create the file with the given component name in the directory.
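
The open path described above can be sketched in Python as below. ``TmpfsNode``, ``File``, and this ``vfs_open`` are illustrative stand-ins rather than the lab's required interface, and the value of ``O_CREAT`` is an assumption.

```python
O_CREAT = 0o100   # flag value is an assumption; any distinct bit works

class TmpfsNode:
    """Toy vnode: directories keep their children in a dict."""
    def __init__(self, is_dir):
        self.is_dir = is_dir
        self.children = {}

    def lookup(self, name):
        return self.children.get(name)

    def create(self, name):
        node = TmpfsNode(is_dir=False)
        self.children[name] = node
        return node

class File:
    def __init__(self, vnode, flags):
        self.vnode, self.flags, self.f_pos = vnode, flags, 0

def vfs_open(root, pathname, flags):
    components = [c for c in pathname.split("/") if c]
    if not components:
        return None                   # opening a directory is not modeled here
    node = root
    for name in components[:-1]:      # walk the directories one component at a time
        node = node.lookup(name)
        if node is None or not node.is_dir:
            return None
    target = node.lookup(components[-1])
    if target is None and flags & O_CREAT:
        target = node.create(components[-1])
    return File(target, flags) if target is not None else None

root = TmpfsNode(is_dir=True)
missing = vfs_open(root, "hello", 0)
created = vfs_open(root, "hello", O_CREAT)
reopened = vfs_open(root, "/hello", 0)
```

The VFS code never looks inside ``children`` itself; it only calls ``lookup`` and ``create`` on the current vnode, which is what lets a different file system plug in its own implementation.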
117 |
118 | Read and Write a File
119 | ----------------------
120 |
121 | The VFS should provide APIs for users to read or write the opened file.
122 | When the user calls the API, the VFS should call the file's corresponding method, so each file system can read or write files in its own way.
123 |
124 | After the underlying file system reads or writes the file, it should update the file's size and the file handle's current position if needed.
125 | Note that the file handle's current position should not exceed the file's size.
126 | Hence, once a file handle reaches the end of file (EOF) in a read operation, it should stop there and return the number of bytes it read.
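
The position and EOF rules can be illustrated with a small in-memory model. The names ``tmpfs_write`` and ``tmpfs_read`` and the use of a shared ``bytearray`` as the file's content are assumptions for this sketch; the read function returns the bytes themselves instead of filling a caller-provided buffer.

```python
class File:
    def __init__(self, data):
        self.data = data      # bytearray owned by the file's vnode
        self.f_pos = 0        # next read/write position of this handle

def tmpfs_write(file, buf):
    end = file.f_pos + len(buf)
    if end > len(file.data):
        file.data.extend(b"\0" * (end - len(file.data)))   # grow the file
    file.data[file.f_pos:end] = buf
    file.f_pos = end
    return len(buf)

def tmpfs_read(file, length):
    # Slicing stops at the end of the data, so f_pos never passes EOF.
    chunk = bytes(file.data[file.f_pos:file.f_pos + length])
    file.f_pos += len(chunk)
    return chunk

storage = bytearray()
writer = File(storage)
written = tmpfs_write(writer, b"Hello ")
reader = File(storage)            # a second handle with its own position
first = tmpfs_read(reader, 100)
second = tmpfs_read(reader, 100)
```

The two handles share the file content but keep separate positions, and a read at EOF returns zero bytes rather than an error.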
127 |
128 | Populate the Root File System
129 | ------------------------------
130 |
131 | When the user uses the initramfs to set up the system,
132 | the kernel can mount the root file system using the tmpfs.
133 | Next, it populates the root file system with the content of the initramfs.
134 | Then, it can use the user program inside the initramfs to set up the remaining system.
135 |
136 | ``required 1`` Populate the root file system with initramfs using the VFS APIs and the tmpfs.
137 |
138 | Example VFS Code
139 | -----------------
140 |
141 | The following code provides concrete APIs and interfaces.
142 | You don't have to follow it, but it provides a good starting point.
143 |
144 | .. code:: c
145 |
146 | struct vnode {
147 | struct mount* mount;
148 | struct vnode_operations* v_ops;
149 | struct file_operations* f_ops;
150 | void* internal;
151 | };
152 |
153 | struct file {
154 | struct vnode* vnode;
155 | size_t f_pos; // The next read/write position of this opened file
156 | struct file_operations* f_ops;
157 | int flags;
158 | };
159 |
160 | struct mount {
161 | struct vnode* root;
162 | struct filesystem* fs;
163 | };
164 |
165 | struct filesystem {
166 | const char* name;
167 | int (*setup_mount)(struct filesystem* fs, struct mount* mount);
168 | };
169 |
170 | struct file_operations {
171 | int (*write) (struct file* file, const void* buf, size_t len);
172 | int (*read) (struct file* file, void* buf, size_t len);
173 | };
174 |
175 | struct vnode_operations {
176 | int (*lookup)(struct vnode* dir_node, struct vnode** target, const char* component_name);
177 | int (*create)(struct vnode* dir_node, struct vnode** target, const char* component_name);
178 | };
179 |
180 | struct mount* rootfs;
181 |
182 | int register_filesystem(struct filesystem* fs) {
183 | // register the file system to the kernel.
184 | }
185 |
186 | struct file* vfs_open(const char* pathname, int flags) {
187 | // 1. Look up pathname from the root vnode.
188 | // 2. Create a new file handle for this vnode if found.
189 | // 3. Create a new file if it is not found and O_CREAT is specified in flags.
190 | }
191 | int vfs_close(struct file* file) {
192 | // 1. release the file handle
193 | }
194 | int vfs_write(struct file* file, const void* buf, size_t len) {
195 | // 1. write len byte from buf to the opened file.
196 | // 2. return written size or error code if an error occurs.
197 | }
198 | int vfs_read(struct file* file, void* buf, size_t len) {
199 | // 1. read min(len, readable file data size) byte to buf from the opened file.
200 | // 2. return read size or error code if an error occurs.
201 | }
202 |
203 | Requirement 2
204 | ==============
205 |
206 | As mentioned in the previous lab, each user process should be isolated and have its own resources.
207 | Hence, the kernel should maintain a per-process data structure and provide system calls for accessing the file system.
208 |
209 | File Descriptor Table
210 | ----------------------
211 |
212 | Each process should have a file descriptor table to keep track of its opened files.
213 | When the user opens a file, the kernel creates a file handle, stores it in the table, and returns the index (file descriptor) to the user.
214 | After that, the user can pass the file descriptor to the kernel to refer to the file handle.
215 | Then, the kernel calls the corresponding VFS API with the file handle and returns the result to the user.
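
A file descriptor table can be as simple as a fixed-size array indexed by descriptor. The sketch below is one possible shape; ``FdTable`` and its method names are hypothetical, and returning the lowest free index is a common convention rather than a requirement of the lab.

```python
class FdTable:
    def __init__(self, size=16):
        self.slots = [None] * size      # index = file descriptor

    def install(self, file_handle):
        for fd, slot in enumerate(self.slots):
            if slot is None:            # lowest free descriptor
                self.slots[fd] = file_handle
                return fd
        return -1                       # table is full

    def get(self, fd):
        if 0 <= fd < len(self.slots):
            return self.slots[fd]
        return None                     # out of range

    def release(self, fd):
        if self.get(fd) is None:
            return -1                   # not an open descriptor
        self.slots[fd] = None
        return 0

table = FdTable(size=2)
a = table.install("handle for hello")
b = table.install("handle for world")
full = table.install("one too many")
```

The system call layer would call ``install`` in ``open``, ``get`` in ``read`` and ``write``, and ``release`` in ``close``, always validating the descriptor that comes from user space.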
216 |
217 | System Calls
218 | -------------
219 |
220 | You need to provide the following system calls so a user process can access the file system.
221 |
222 | .. code:: c
223 |
224 | int open(const char *pathname, int flags);
225 | int close(int fd);
226 | int write(int fd, const void *buf, int count);
227 | int read(int fd, void *buf, int count);
228 |
229 |
230 | Test
231 | -----
232 |
233 | Please test your implementation in the demo with the following code or logically equivalent code.
234 |
235 | Expected result: print ``Hello World!``.
236 |
237 |
238 | .. code:: c
239 |
240 | int a = open("hello", O_CREAT);
241 | int b = open("world", O_CREAT);
242 | write(a, "Hello ", 6);
243 | write(b, "World!", 6);
244 | close(a);
245 | close(b);
246 | b = open("hello", 0);
247 | a = open("world", 0);
248 | char buf[201]; // two reads of up to 100 bytes each, plus the terminating '\0'
249 | int sz = read(b, buf, 100);
250 | sz += read(a, buf + sz, 100);
251 | buf[sz] = '\0';
252 | printf("%s\n", buf); // should be Hello World!
253 |
254 | ``required 2`` Implement the file descriptor table and the system calls so user processes can access the file system.
255 |
256 | ***********
257 | Elective
258 | ***********
259 |
260 | Read a Directory
261 | =================
262 |
263 | A directory can have multiple entries for files or subdirectories.
264 |
265 | In this part, you need to implement the APIs to iterate over the directory entries of an opened directory.
266 | Next, implement the system calls.
267 | Finally, implement the ``ls`` user program to list the target directory.
268 |
269 | ls.c
270 |
271 | .. code:: c
272 |
273 | int main(int argc, char** argv) {
274 | int fd = open(argv[1], 0);
275 | char name[100];
276 | int size;
277 | // Modify the for loop to iterate the directory entries of the opened directory.
278 | for(;;) {
279 | printf("Name: %s Size: %d\n", name, size);
280 | }
281 | }
282 |
283 | ``elective 1`` Implement the ``ls`` program to list the target directory.
284 |
285 | Multi-level VFS
286 | ==========================
287 |
288 | In the required part, the VFS contains only the root directory and files under it.
289 | Now, your VFS should be able to
290 |
291 | * Create subdirectories.
292 |
293 | * Change the current working directory.
294 |
295 | * Mount file systems on directories.
296 |
297 | * Look up the vnode with its full pathname.
298 |
299 | Create a Directory
300 | -------------------
301 |
302 | Creating a directory is almost the same as creating a regular file.
303 | The VFS should find the parent directory of a newly created directory first.
304 | If the parent directory is found, the VFS calls the file system's method with the component name to create a new directory.
305 |
306 | Change the Directory
307 | --------------------
308 |
309 | As the VFS tree gets deeper, it becomes inefficient to always specify pathnames starting from the root vnode.
310 | Hence, besides the root vnode, the VFS should also keep a per-process current working directory vnode.
311 | Then, the user can specify a shorter pathname relative to the current working directory.
312 |
313 | Mount Another File System
314 | ----------------------------
315 |
316 | You should implement the following API to mount a file system.
317 |
318 | int mount(const char* device, const char* mountpoint, const char* filesystem)
319 | **filesystem** is the file system's name.
320 |
321 | * The VFS should find and call the file system's method to set up the mount.
322 |
323 | **device** is a name whose meaning depends on the file system type:
324 |
325 | * For device-based file systems, the name should be the pathname of a device file that stores a file system.
326 |
327 | * For memory-based file systems, the name can be used as the name of the mounted file system.
328 |
329 | **mountpoint** is the pathname of the directory to mount on.
330 |
331 |
332 | You should also implement the following API to unmount a file system.
333 |
334 | int umount(const char* mountpoint)
335 | **mountpoint** is the pathname of a directory that has a file system mounted on it.
336 |
337 | Pathname Lookup
338 | ------------------
339 |
340 | As mentioned in the background section, a pathname lookup simply traverses the vnodes.
341 | The VFS can use the following steps to find the target file or directory.
342 |
343 | 1. Start from one vnode.
344 |
345 | 2. Get the next component name.
346 |
347 | 3. Get the next vnode by the next component name using the current vnode's method.
348 |
349 | 4. Go to the next vnode.
350 |
351 | By repeating steps 2-4, the VFS reaches the target vnode if it exists.
352 |
353 | Absolute vs. Relative Pathname
354 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
355 |
356 | A pathname that starts with '/' is an absolute pathname.
357 | The lookup starts at the root directory vnode.
358 |
359 | Otherwise, it's a relative pathname.
360 | The lookup starts at the current working directory vnode.
361 |
362 | Dot Component Name
363 | ^^^^^^^^^^^^^^^^^^
364 |
365 | **"."** and **".."** are special component names.
366 |
367 | **"."** refers to the current directory.
368 |
369 | **".."** refers to the parent directory.
370 |
371 | Cross the Mountpoint
372 | ^^^^^^^^^^^^^^^^^^^^
373 |
374 | A pathname lookup crosses a mountpoint in the following cases.
375 |
376 | * The current directory is the root of a file system, and the next component name is "\.\."
377 |
378 | * The next component name is a mountpoint.
379 |
380 | In the first case, if the current directory is also the root of the VFS, the VFS can just stay at the root directory vnode.
381 | Otherwise, the file system is mounted on another file system's mountpoint.
382 | The VFS should go to the parent directory vnode of the mountpoint.
383 |
384 | In the second case, the VFS should go to the mounted file system's root directory vnode instead of the mountpoint's vnode.
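
The lookup rules in this section (absolute vs. relative start, dot components, and both directions of mountpoint crossing) can be combined into one loop. The sketch below is a simplified Python model: ``Vnode``, ``mount``, and ``resolve`` are illustrative names, and each vnode directly stores its parent and, where relevant, the mounted root or the mountpoint it covers.

```python
class Vnode:
    def __init__(self, parent=None):
        # The root of a file system is modeled as its own parent.
        self.parent = parent if parent is not None else self
        self.children = {}
        self.mounted_root = None        # root vnode of a file system mounted here
        self.mountpoint = None          # on a mounted root: the vnode it covers

    def mkdir(self, name):
        self.children[name] = Vnode(self)
        return self.children[name]

def mount(mountpoint, fs_root):
    mountpoint.mounted_root = fs_root
    fs_root.mountpoint = mountpoint

def resolve(root, cwd, pathname):
    node = root if pathname.startswith("/") else cwd
    for name in pathname.split("/"):
        if name in ("", "."):
            continue                              # stay in the current directory
        if name == "..":
            if node.mountpoint is not None:       # leaving a mounted file system
                node = node.mountpoint
            node = node.parent                    # the VFS root stays where it is
            continue
        node = node.children.get(name)
        if node is None:
            return None
        if node.mounted_root is not None:         # entering a mounted file system
            node = node.mounted_root
    return node

root = Vnode()
mnt = root.mkdir("mnt")
mnt.mkdir("old")                # hidden once something is mounted on mnt
other = Vnode()                 # root directory of another file system
new = other.mkdir("new")
mount(mnt, other)
```

After the mount, ``/mnt`` resolves to the mounted file system's root, so entries that existed under the original ``mnt`` directory are no longer reachable until the file system is unmounted.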
385 |
386 | Test
387 | ----
388 |
389 | Please test your implementation in the demo with the following code or logically equivalent code.
390 |
391 | test.c
392 |
393 | .. code:: c
394 |
395 | int main() {
396 | char buf[8];
397 | mkdir("mnt");
398 | int fd = open("/mnt/a.txt", O_CREAT);
399 | write(fd, "Hi", 2);
400 | close(fd);
401 | chdir("mnt");
402 | fd = open("./a.txt", 0);
403 | assert(fd >= 0);
404 | read(fd, buf, 2);
405 | assert(strncmp(buf, "Hi", 2) == 0);
406 |
407 | chdir("..");
408 | mount("tmpfs", "mnt", "tmpfs");
409 | fd = open("mnt/a.txt", 0);
410 | assert(fd < 0);
411 |
412 | umount("/mnt");
413 | fd = open("/mnt/a.txt", 0);
414 | assert(fd >= 0);
415 | read(fd, buf, 2);
416 | assert(strncmp(buf, "Hi", 2) == 0);
417 | }
418 |
419 | ``elective 2`` Implement the multi-level VFS and pass the test.
420 |
421 | Procfs
422 | =======
423 |
424 | Even if a piece of code has nothing to do with data storage,
425 | as long as it implements the VFS's interfaces,
426 | it can be mounted by the VFS.
427 |
428 | Procfs is one of the examples.
429 | It's used in UNIX-like operating systems to expose the states of the kernel and the processes.
430 | In this part, you need to implement it so user processes can access the kernel's internal states.
431 |
432 | Read and Write the Kernel's States
433 | -----------------------------------
434 | In the procfs, the kernel's states are retrieved/modified by reading/writing the corresponding files.
435 | Here, you can practice the concept by creating **switch** and **hello** files.
436 |
437 | Reading **hello** always gets the string "hello".
438 |
439 | The content of **switch** can change the letter case of **hello**.
440 |
441 | * Writing "0" to **switch**, the content of **hello** becomes "hello".
442 |
443 | * Writing "1" to **switch**, the content of **hello** becomes "HELLO".
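
The behavior of the two files can be sketched as a pair of specialized read/write methods that share one piece of kernel state. ``ProcfsState``, ``switch_write``, and ``hello_read`` are hypothetical names, and the file position is ignored to keep the example short.

```python
class ProcfsState:
    def __init__(self):
        self.upper = False            # toggled by writing to "switch"

def switch_write(state, buf):
    # Only the first byte matters in this sketch.
    if buf[:1] == b"1":
        state.upper = True
    elif buf[:1] == b"0":
        state.upper = False
    return len(buf)

def hello_read(state, length):
    # The content is generated on every read instead of being stored.
    text = b"HELLO" if state.upper else b"hello"
    return text[:length]

state = ProcfsState()
default = hello_read(state, 16)
switch_write(state, b"1")
upper = hello_read(state, 16)
switch_write(state, b"0")
lower = hello_read(state, 3)
```

Nothing is stored for **hello**: its content is computed from the current state at read time, which is the same idea the per-process files below rely on.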
444 |
445 |
446 | ``elective 3-1`` The procfs creates **switch** and **hello** file in its root directory. Users can access them by open, read, and write.
447 |
448 | Get Per-process States
449 | -----------------------
450 | Creating the files and directories for per-process states ahead of time is not a good idea, because the processes' states change constantly.
451 | Hence, they should be created and updated lazily, only when someone accesses them.
452 |
453 | Then, the procfs should
454 |
455 | 1. Get the information from the process subsystem.
456 |
457 | 2. Create/Delete process's directory in procfs.
458 |
459 | 3. Create/Update process's related files.
460 |
461 | ``elective 3-2`` The procfs lazily updates the per-process directories and files. Users can read a process's status by reading ``/status``.
462 |
463 | .. note::
464 | The procfs should still follow the VFS interface, but the lookup/read/write methods could be specialized for different vnodes.
465 |
466 | Test
467 | -----
468 |
469 | Please test your implementation in the demo with the following code or logically equivalent code.
470 |
471 | proc.c
472 |
473 | .. code:: c
474 |
475 | int main() {
476 | char buf[17]; // up to 16 bytes of data plus the terminating '\0'
477 | mkdir("proc");
478 | mount("procfs", "proc", "procfs");
479 | int fd = open("/proc/switch", 0);
480 | write(fd, "0", 1);
481 | close(fd);
482 |
483 | fd = open("/proc/hello", 0);
484 | int sz = read(fd, buf, 16);
485 | buf[sz] = '\0';
486 | printf("%s\n", buf); // should be hello
487 | close(fd);
488 |
489 | fd = open("/proc/switch", 0);
490 | write(fd, "1", 1);
491 | close(fd);
492 |
493 | fd = open("/proc/hello", 0);
494 | sz = read(fd, buf, 16);
495 | buf[sz] = '\0';
496 | printf("%s\n", buf); //should be HELLO
497 | close(fd);
498 |
499 | fd = open("/proc/1/status", 0); // choose a created process's id here
500 | sz = read(fd, buf, 16);
501 | buf[sz] = '\0';
502 | printf("%s\n", buf); // process's status.
503 | close(fd);
504 |
505 | fd = open("/proc/999/status", 0); // choose a non-existent process's id here
506 | assert(fd < 0);
507 | }
508 |
--------------------------------------------------------------------------------
/labs/lab7.rst:
--------------------------------------------------------------------------------
1 | ==================================
2 | Lab 7 : File System Meets Hardware
3 | ==================================
4 |
5 | *************
6 | Introduction
7 | *************
8 |
9 | In the previous lab,
10 | file operations happened only in memory.
11 | In this lab, you need to read data from an external storage device, modify it in memory,
12 | and write it back.
13 | In addition, you need to implement the basics of the FAT32 file system.
14 |
15 | **********************
16 | Goals of this lab
17 | **********************
18 |
19 | * Understand how to read/write data from an SD card.
20 | * Implement the FAT32 file system.
21 | * Understand how to access devices by the VFS.
22 | * Understand how memory can be used as a cache for slow external storage media.
23 |
24 | ************
25 | Background
26 | ************
27 |
28 | SD Card
29 | ===============
30 |
31 | SD Card Driver
32 | ---------------
33 |
34 | We provide an `SD controller driver
35 | `_
36 | for you.
37 |
38 | You should first call ``sd_init()`` to set up the GPIO and the SD host and to initialize the SD card.
39 | Then you can call the following APIs.
40 |
41 | ``readblock(int block_id, char buf[512])``
42 |
43 | ``writeblock(int block_id, char buf[512])``
44 |
45 | They read/write 512 bytes from/to the SD card to/from ``buf``.
46 |
47 | .. note::
48 | You need to modify the MMIO base according to your kernel mapping.
49 | You can also modify the code to meet your requirements.
50 |
51 | .. warning::
52 | The driver code may contain bugs.
53 | Also, it has only been tested on QEMU and on the rpi3 with the SD card we gave you.
54 | Please report any problem you encounter. You can also get a bonus if you find a bug in the code.
55 |
56 | SD Card on QEMU
57 | ----------------
58 |
59 | It's always easier to test your code on QEMU first.
60 |
61 | You can add the argument ``-drive if=sd,file=,format=raw`` to attach an SD card file to QEMU.
62 |
63 | Sector vs. Block vs. Cluster
64 | =============================
65 |
66 | These terms appear often in the documentation,
67 | so you should be able to distinguish them.
68 |
69 | Sector
70 | -------
71 |
72 | A sector is the smallest unit for a hard drive to manage its data.
73 |
74 | The size is usually 512 bytes.
75 |
76 | Block
77 | ------
78 |
79 | A block is the smallest unit for a block device driver to read/write a device.
80 | It's sometimes interchangeable with a sector.
81 |
82 | The provided SD card driver uses 512 bytes as the block size.
83 |
84 | Cluster
85 | ---------
86 |
87 | A cluster is composed of contiguous blocks.
88 | A file system uses it as a basic unit to store a regular file or a directory.
89 |
90 | FAT File System
91 | ================
92 | FAT is a simple and widely used file system.
93 | It has at least one file allocation table (FAT), in which each entry stores the allocation status of one cluster. That is where the name comes from.
94 | The entry size is 12 bits (FAT12), 16 bits (FAT16), or 32 bits (FAT32).
95 |
96 | .. note::
97 | You only need to implement FAT32 in this lab.
98 |
99 | .. note::
100 | FAT is a portable file system used in Windows, Linux, and some RTOSes.
101 | Hence, your FAT file system should be able to read a file written by another operating system(e.g. Linux).
102 | Also, another operating system should be able to read a file written by your FAT file system.
103 |
104 |
105 |
106 | Short Filenames(SFN) vs. Long Filenames(LFN)
107 | --------------------------------------------
108 | Originally, FAT used 8 bytes to store a file's name and 3 bytes to store the file's extension name.
109 | For example, if the file's entire name is a.out, "a" will be stored in the 8-byte filename, "out" will be stored in the 3-byte extension name, and "." is not stored.
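The 8.3 layout described above can be sketched as a small helper. This is only an illustration: ``sfn_encode`` is a hypothetical name, and the space padding and upper-casing follow the usual FAT convention rather than anything this document specifies. It does not reject characters that are illegal in an SFN.

```c
#include <ctype.h>
#include <string.h>

/* Convert a name such as "a.out" into the 11-byte 8.3 field of a
 * directory entry: 8 bytes of name and 3 bytes of extension, both
 * space-padded and upper-cased, with the '.' left out.
 * Returns 0 on success, -1 if the name is empty or a part is too long. */
int sfn_encode(const char *name, char out[11]) {
    const char *dot = strrchr(name, '.');
    size_t base_len = dot ? (size_t)(dot - name) : strlen(name);
    size_t ext_len = dot ? strlen(dot + 1) : 0;

    if (base_len == 0 || base_len > 8 || ext_len > 3)
        return -1;

    memset(out, ' ', 11);
    for (size_t i = 0; i < base_len; i++)
        out[i] = (char)toupper((unsigned char)name[i]);
    for (size_t i = 0; i < ext_len; i++)
        out[8 + i] = (char)toupper((unsigned char)dot[1 + i]);
    return 0;
}
```

The resulting 11 bytes can be compared directly against the name field of a directory entry with ``memcmp()``.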
110 |
111 | However, this limits the filename's length and makes it impossible to store a filename with special characters.
112 | Hence, LFN was invented.
113 | It stores the filename in Unicode and can chain multiple directory entries to support variable-length filenames.
114 |
115 | SFN is easier than LFN, so you only need to support SFN in the required part.
116 | The nctuos.img we provided in lab0 stores the filename in LFN.
117 | We provide a new SFN `sfn_nctuos.img
118 | `_
119 |
120 | There is also a kernel8.img inside.
121 | The kernel8.img prints the first block and the first partition block of the SD card.
122 | You can replace the kernel8.img with yours later on.
123 |
124 | .. image:: img/disk_dump.png
125 |
126 | .. note::
127 | In Linux, you can specify how to mount a FAT file system.
128 |
129 | ``mount -t msdos ``: store and load filename by SFN.
130 |
131 | ``mount -t vfat ``: store and load filename by LFN.
132 |
133 | .. hint::
134 | In Linux, you can set up a loop device for a file.
135 |
136 | ``losetup -fP sfn_nctuos.img``: set up a loop device for sfn_nctuos.img.
137 |
138 | ``losetup -d ``: detach the loop device from the sfn_nctuos.img.
139 |
140 | Then, you can update the SD image to test your code on QEMU first.
141 |
142 | Details of FAT
143 | --------------
144 |
145 | In this lab, you need to understand the format of FAT to be able to find, read, and write files in FAT.
146 | The details are not covered by this documentation.
147 | Please refer to https://en.wikipedia.org/wiki/Design_of_the_FAT_file_system
148 | You can find everything about FAT there.
149 |
150 | *********
151 | Required
152 | *********
153 |
154 | In the required part, you should be able to read and write existing files under the root directory of a FAT32 file system.
155 |
156 | You can create a new text file on your host computer first.
157 | Then read/write the file on your rpi3.
158 |
159 | The size of a FAT32 cluster is usually larger than the block size, but you can assume that the directories and regular files you read/write fit in the first block of their cluster.
160 |
161 | Requirement 1
162 | ===============
163 |
164 | In this requirement, you need to mount the FAT32 file system on the SD card.
165 | You can set the FAT32 file system as the root file system if you didn't implement the multi-level VFS in the previous lab.
166 |
167 | Get the FAT32 Partition
168 | ---------------------------------
169 |
170 | Before mounting the FAT32 file system, you need to know its location on the SD card.
171 |
172 | The SD card should already be partitioned with an MBR.
173 | You can parse it to get each partition's type, size, and block index.
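A minimal sketch of parsing one MBR partition entry follows. It assumes the classic MBR layout (four 16-byte entries starting at byte 446, boot signature 0x55 0xAA at bytes 510 and 511); ``struct mbr_partition`` and ``mbr_parse`` are hypothetical names, and the buffer is expected to hold block 0 as read by ``readblock()``.

```c
#include <stdint.h>

struct mbr_partition {
    uint8_t  type;        /* 0x0B or 0x0C usually marks FAT32 */
    uint32_t first_block; /* index of the partition's first block */
    uint32_t num_blocks;  /* partition size in blocks */
};

/* Read a little-endian 32-bit value without assuming alignment. */
static uint32_t le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parse partition entry idx (0..3) from block 0 of the card.
 * Returns 0 on success, -1 if the index or boot signature is invalid. */
int mbr_parse(const uint8_t mbr[512], int idx, struct mbr_partition *out) {
    if (idx < 0 || idx > 3 || mbr[510] != 0x55 || mbr[511] != 0xAA)
        return -1;
    const uint8_t *e = mbr + 446 + 16 * idx;
    out->type = e[4];
    out->first_block = le32(e + 8);
    out->num_blocks = le32(e + 12);
    return 0;
}
```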
174 |
175 | .. hint::
176 | If you use the provided sfn_nctuos.img, the FAT32 partition's block index is 2048.
177 |
178 | ``required 1-1`` Get the FAT32 partition.
179 |
180 | Mount the FAT32 File System
181 | -----------------------------
182 |
183 | A FAT32 file system stores its metadata in the first sector of the partition.
184 |
185 | You need to do the following things during mounting.
186 |
187 | 1. Parse the metadata on the SD card.
188 |
189 | 2. Create a kernel object to store the metadata in memory.
190 |
191 | 3. Get the root directory cluster number and create a FAT32's root directory object.
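The mounting steps above might look like the following sketch. The field offsets follow the standard FAT32 boot sector layout; ``struct fat32_info``, ``fat32_parse_boot``, and ``fat32_cluster_to_block`` are hypothetical names, and the sketch assumes 512-byte sectors so that a sector index equals a block index of the provided driver.

```c
#include <stdint.h>

struct fat32_info {
    uint32_t fat_start;    /* block index of the first FAT */
    uint32_t data_start;   /* block index of cluster 2 */
    uint32_t sec_per_clus; /* blocks per cluster */
    uint32_t root_cluster; /* first cluster of the root directory */
};

static uint16_t rd16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parse the first sector of the partition that starts at block part_start.
 * Returns 0 on success, -1 if the sector size is not 512 bytes. */
int fat32_parse_boot(const uint8_t bs[512], uint32_t part_start,
                     struct fat32_info *fs) {
    if (rd16(bs + 11) != 512)            /* bytes per sector */
        return -1;
    uint32_t reserved = rd16(bs + 14);   /* reserved sector count */
    uint32_t num_fats = bs[16];          /* number of FATs */
    uint32_t fat_size = rd32(bs + 36);   /* sectors per FAT */
    fs->sec_per_clus = bs[13];
    fs->root_cluster = rd32(bs + 44);
    fs->fat_start = part_start + reserved;
    fs->data_start = fs->fat_start + num_fats * fat_size;
    return 0;
}

/* Cluster numbering starts at 2, so cluster 2 is the first data block. */
uint32_t fat32_cluster_to_block(const struct fat32_info *fs, uint32_t cluster) {
    return fs->data_start + (cluster - 2) * fs->sec_per_clus;
}
```

``fat32_cluster_to_block(&fs, fs.root_cluster)`` then gives the block to read for the root directory.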
192 |
193 | ``required 1-2`` Parse the FAT32's metadata and set up the mount.
194 |
195 | Requirement 2
196 | ===============
197 |
198 | Lookup and Open a File in FAT32
199 | --------------------------------
200 | To look up a file in a FAT32 directory,
201 |
202 | 1. Get the cluster of the directory and calculate its block index.
203 |
204 | 2. Read the first block of the cluster with ``readblock()``.
205 |
206 | 3. Traverse the directory entries and compare the component name with filename + extension name to find the file.
207 |
208 | 4. You can get the first cluster of the file in the directory entry.
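Steps 3 and 4 can be sketched as a scan over the 32-byte directory entries of one block. The offsets used (name at 0, attributes at 11, first cluster high/low at 20 and 26, size at 28) come from the standard FAT directory entry layout; ``fat32_find_in_block`` is a hypothetical name, and the name to search for is assumed to be already in 11-byte SFN form.

```c
#include <stdint.h>
#include <string.h>

#define DIR_ENTRY_SIZE 32
#define ATTR_LFN 0x0F

/* Search one 512-byte directory block for an 11-byte SFN name.
 * Returns the entry's byte offset in the block, or -1 if not found.
 * On success, stores the file's first cluster and size. */
int fat32_find_in_block(const uint8_t block[512], const char sfn[11],
                        uint32_t *first_cluster, uint32_t *size) {
    for (int off = 0; off < 512; off += DIR_ENTRY_SIZE) {
        const uint8_t *e = block + off;
        if (e[0] == 0x00) break;                  /* no more entries */
        if (e[0] == 0xE5) continue;               /* deleted entry */
        if ((e[11] & 0x3F) == ATTR_LFN) continue; /* LFN entry, skipped here */
        if (memcmp(e, sfn, 11) != 0) continue;
        uint32_t hi = (uint32_t)e[20] | ((uint32_t)e[21] << 8);
        uint32_t lo = (uint32_t)e[26] | ((uint32_t)e[27] << 8);
        *first_cluster = (hi << 16) | lo;
        *size = (uint32_t)e[28] | ((uint32_t)e[29] << 8) |
                ((uint32_t)e[30] << 16) | ((uint32_t)e[31] << 24);
        return off;
    }
    return -1;
}
```

The returned offset is useful later: a write that changes the file's size has to update the size field of this same entry.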
209 |
210 | ``required 2-1`` Look up and open a file in FAT32.
211 |
212 | Read/Write a File in FAT32
213 | ---------------------------
214 |
215 | After you get the first cluster of the file, you can use ``readblock()``/``writeblock()`` to read/write the file.
216 |
217 | ``required 2-2`` Read/Write a file in FAT32.
218 |
219 | .. note::
220 | You need to update the file's size in the FAT32 directory entry if a write changes the file's size.
221 |
222 | ************
223 | Elective
224 | ************
225 |
226 | Create a File in FAT32
227 | ========================
228 |
229 | To create a new file in FAT32,
230 |
231 | 1. Find an empty entry in the FAT table.
232 |
233 | 2. Find an empty directory entry in the target directory.
234 |
235 | 3. Set them to proper values.
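Step 1 can be sketched as a scan over one cached FAT block. The sketch assumes the usual FAT32 conventions (a free entry has value 0 in its low 28 bits, entries 0 and 1 are reserved, 0x0FFFFFFF marks the end of a chain) and a little-endian CPU such as the rpi3, so a block read from the card can be viewed as an array of ``uint32_t``. ``fat_alloc_in_block`` is a hypothetical name.

```c
#include <stdint.h>

#define FAT_ENTRIES_PER_BLOCK 128  /* 512 bytes / 4 bytes per entry */
#define FAT_EOC 0x0FFFFFFFu        /* end-of-chain marker */

/* Scan one FAT block for a free entry and mark it as the end of a
 * one-cluster chain. first_cluster is the cluster number described
 * by entry 0 of this block.
 * Returns the allocated cluster number, or 0 if the block has no free entry. */
uint32_t fat_alloc_in_block(uint32_t fat[FAT_ENTRIES_PER_BLOCK],
                            uint32_t first_cluster) {
    for (uint32_t i = 0; i < FAT_ENTRIES_PER_BLOCK; i++) {
        uint32_t cluster = first_cluster + i;
        if (cluster < 2)
            continue;                        /* reserved entries */
        if ((fat[i] & 0x0FFFFFFFu) == 0) {
            /* keep the reserved top 4 bits, set the low 28 bits */
            fat[i] = (fat[i] & 0xF0000000u) | FAT_EOC;
            return cluster;
        }
    }
    return 0;
}
```

After a successful allocation, the modified FAT block has to be written back with ``writeblock()``, and the new directory entry should record the returned cluster as the file's first cluster.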
236 |
237 | ``elective 1`` Create a file in FAT32.
238 |
239 | FAT32 with LFN
240 | ===============
241 | In the required part, your FAT32 file system supports only SFN.
242 | In this part, please modify your code to support LFN.
243 |
244 | Note that the LFN directory entries are stored in UCS-2.
245 | You may need to convert the UCS-2 strings to another encoding if your terminal uses a different one.
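As an illustration, assuming the terminal expects UTF-8, one UCS-2 code unit can be converted as in the sketch below. ``ucs2_to_utf8`` is a hypothetical helper; it relies on the fact that every UCS-2 value fits in at most three UTF-8 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one UCS-2 code unit as UTF-8.
 * Returns the number of bytes written to out (1 to 3). */
size_t ucs2_to_utf8(uint16_t c, char out[3]) {
    if (c < 0x80) {                     /* 7 bits: plain ASCII */
        out[0] = (char)c;
        return 1;
    }
    if (c < 0x800) {                    /* up to 11 bits: two bytes */
        out[0] = (char)(0xC0 | (c >> 6));
        out[1] = (char)(0x80 | (c & 0x3F));
        return 2;
    }
    out[0] = (char)(0xE0 | (c >> 12));  /* up to 16 bits: three bytes */
    out[1] = (char)(0x80 | ((c >> 6) & 0x3F));
    out[2] = (char)(0x80 | (c & 0x3F));
    return 3;
}
```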
246 |
247 | ``elective 2`` Implement a FAT32 with LFN support. You should create/lookup a file with special characters(e.g. Chinese) in its name.
248 |
249 | Device File
250 | ============
251 |
252 | A vnode in the VFS tree can also represent a device, and we call it a device file.
253 | To support device files in the VFS, you need
254 |
255 | * an API for users to create a device file's vnode,
256 |
257 | * an API for each device driver to register itself to the VFS.
258 |
259 | Device File Registration
260 | -------------------------
261 |
262 | A device can register itself to the VFS in its setup.
263 | The VFS assigns the device a unique device id.
264 | Then the device can be recognized by the VFS.
265 |
266 | Mknod
267 | ------
268 |
269 | A user can use the device id to create a device file in a file system.
270 |
271 | After the device file is found in the file system,
272 | the VFS uses the device id to find the device driver.
273 | Next, the driver initializes the file handle with its method.
274 | Then, the user can read/write the file handle to access the device.
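One possible shape for the registration side is a small table indexed by device id. Everything named here (``struct file_operations``, ``register_dev``, ``dev_lookup``, ``MAX_DEVICES``) is hypothetical and should be adapted to the VFS interfaces of your previous lab.

```c
#include <stddef.h>

#define MAX_DEVICES 16

struct file; /* the VFS file handle, defined elsewhere in the kernel */

struct file_operations {
    int (*read)(struct file *f, void *buf, size_t len);
    int (*write)(struct file *f, const void *buf, size_t len);
};

static const struct file_operations *dev_table[MAX_DEVICES];
static int dev_count;

/* Called by a driver during its setup. Returns the assigned device id,
 * or -1 if the table is full. */
int register_dev(const struct file_operations *fops) {
    if (dev_count >= MAX_DEVICES)
        return -1;
    dev_table[dev_count] = fops;
    return dev_count++;
}

/* Called by the VFS when a device file is opened. Returns the driver's
 * methods, or NULL for an unknown device id. */
const struct file_operations *dev_lookup(int dev_id) {
    if (dev_id < 0 || dev_id >= dev_count)
        return NULL;
    return dev_table[dev_id];
}
```

With this layout, the device file's vnode only needs to store the device id. On open, the VFS calls ``dev_lookup()`` and installs the returned methods in the file handle.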
275 |
276 | Console
277 | ---------
278 |
279 | You need to create a device file for your UART device as the console.
280 | Then, users can get/print characters from/to the console by reading or writing its device file.
281 |
282 | ``elective 3`` Create a UART device file as the console so users can get/print characters from/to the console by reading or writing its device file.
283 |
284 | .. note::
285 | Device files can be persistently stored in some file systems, but you only need to create them in the tmpfs in this lab.
286 |
287 | Memory as Cache for External Storage
288 | =======================================
289 |
290 | Accessing an SD card is much slower than accessing memory.
291 | Until the CPU shuts down or the SD card is ejected, it's not necessary to synchronize the data between memory and the SD card.
292 | Hence, it's more efficient to preserve the data in memory and use memory as a cache for external storage.
293 |
294 | We can categorize the file's data on the storage into three types: file's content, file's name, and file's metadata.
295 |
296 | File's Metadata
297 | -----------------
298 |
299 | Besides the content of a file, additional information such as file size is stored in external storage, too.
300 | The additional information is the file's metadata.
301 | There is also metadata for a file system such as FAT tables in FAT.
302 |
303 | Those metadata are cached by a file system's kernel objects.
304 | You should have already implemented it.
305 |
306 | File's Name
307 | ------------
308 |
309 | A pathname lookup for a file system on external storage involves the following steps:
310 |
311 | 1. Read the directory block from the external storage.
312 |
313 | 2. Parse the directory entry and compare the directory entry's filename with the component name.
314 |
315 | 3. Get the next directory location.
316 |
317 | The VFS can reduce the time spent on reading directory blocks and parsing directory entries with a component name cache.
318 | A component name cache mechanism can be implemented as:
319 |
320 | 1. Look up the component name cache of the directory first.
321 |
322 | 2. If the vnode is found, return it. Otherwise, call the lookup method of the underlying file system.
323 |
324 | 3. The underlying file system looks up from external storage.
325 |
326 | 4. If it successfully finds the file, it creates a vnode for the file and puts it into the component name cache.
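The mechanism above can be sketched as a tiny per-directory cache. The fixed slot count, the round-robin replacement, and the names ``struct name_cache``, ``name_cache_get``, and ``name_cache_put`` are all assumptions made for illustration. A real kernel would likely use a hash table and also invalidate entries when files are removed.

```c
#include <stddef.h>
#include <string.h>

#define NAME_MAX_LEN 16
#define CACHE_SLOTS 8

struct vnode; /* defined by the VFS */

struct name_cache {
    char name[CACHE_SLOTS][NAME_MAX_LEN];
    struct vnode *vnode[CACHE_SLOTS];
    int next; /* next slot to overwrite (round-robin) */
};

/* Steps 1 and 2: return the cached vnode, or NULL on a miss. */
struct vnode *name_cache_get(const struct name_cache *c, const char *name) {
    for (int i = 0; i < CACHE_SLOTS; i++)
        if (c->vnode[i] && strcmp(c->name[i], name) == 0)
            return c->vnode[i];
    return NULL;
}

/* Step 4: remember a vnode found by the underlying file system. */
void name_cache_put(struct name_cache *c, const char *name, struct vnode *v) {
    if (strlen(name) >= NAME_MAX_LEN)
        return; /* too long to cache, lookups simply miss */
    strcpy(c->name[c->next], name);
    c->vnode[c->next] = v;
    c->next = (c->next + 1) % CACHE_SLOTS;
}
```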
327 |
328 | ``elective 4-1`` Implement a component name cache mechanism for faster pathname lookup.
329 |
330 | File's Content
331 | -----------------
332 |
333 | The VFS can cache a file's content in memory by page frames.
334 | A page cache mechanism can be implemented as:
335 |
336 | 1. Check whether the file's page frames exist when reading or writing a file.
337 |
338 | 2. If the page frames don't exist, allocate page frames for the file.
339 |
340 | 3. The underlying file system populates the page frames with the file's content in external storage if necessary.
341 |
342 | 4. Read or write the page frames of the file.
343 |
344 | ``elective 4-2`` Implement a page cache mechanism for faster file read and write.
345 |
346 |
347 | Sync
348 | ------
349 |
350 | The VFS should synchronize the file's memory cache with the external storage when the user wants to eject it.
351 | Hence, the VFS should provide an API for users to synchronize the data, and the file system should implement the synchronize method to write data back to the external storage.
352 |
353 | ``elective 4-3`` Implement the ``sync`` API to write back the cache data.
--------------------------------------------------------------------------------
/labs/lab8.rst:
--------------------------------------------------------------------------------
1 | ======================
2 | Lab 8 : Virtual Memory
3 | ======================
4 |
5 | ************
6 | Introduction
7 | ************
8 |
9 | Virtual memory provides isolated address spaces,
10 | so each user process can run in its address space without interfering with others.
11 |
12 | In this lab, you need to initialize the memory management unit (MMU) and
13 | set up the address spaces for the kernel and user processes to achieve process isolation.
14 |
15 | *******************
16 | Goals of this lab
17 | *******************
18 |
19 | * Understand ARMv8-A virtual memory system architecture.
20 | * Understand how the kernel manages memory for user processes.
21 | * Understand how demand paging works.
22 | * Understand how copy-on-write works.
23 |
24 | *******************
25 | Background
26 | *******************
27 |
28 | Terminology
29 | ============
30 |
31 | Translation Levels
32 | --------------------
33 |
34 | Translating a virtual address to a physical address involves levels of translation.
35 | ARMv8-A has 2 to 4 levels of translation for different page sizes, plus a second-stage translation for hypervisors (not used in the labs).
36 |
37 | We name each level as in Linux.
38 | The top level is the page global directory (PGD), followed by the page upper directory (PUD), the page middle directory (PMD), and the page table entry (PTE).
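With the configuration used in this lab (48-bit addresses, 4KB granule), each level consumes 9 bits of the virtual address and the low 12 bits are the offset inside the page. A minimal sketch of extracting the per-level indices follows; the macro and function names are hypothetical.

```c
#include <stdint.h>

/* Each table holds 512 entries, so each level is indexed by 9 bits. */
#define PGD_SHIFT 39
#define PUD_SHIFT 30
#define PMD_SHIFT 21
#define PTE_SHIFT 12
#define IDX_MASK  0x1FFULL

/* Index into the table of the level selected by shift. */
static inline unsigned pt_index(uint64_t va, int shift) {
    return (unsigned)((va >> shift) & IDX_MASK);
}
```

For example, ``pt_index(va, PUD_SHIFT)`` selects which 1GB region of the PUD a virtual address falls into.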
39 |
40 | Page vs. Page Frame vs. Page Table
41 | -------------------------------------
42 |
43 | **Page**: A chunk of virtual memory pointed to by one entry of PTE.
44 |
45 | **Block**: A chunk of virtual memory pointed to by one entry of PUD or PMD.
46 |
47 | **Page frame**: A chunk of physical memory.
48 |
49 | **Page table**: A page frame whose entries point to the next level page tables, blocks, or pages.
50 | In this documentation, PGD, PUD, PMD, and PTE are all called page tables.
51 |
52 | Page's Descriptor
53 | ===================
54 |
55 | As mentioned earlier, each entry of a page table points to the next level page table, a block, or a page.
56 | Each entry combines the physical address of a page frame with the attributes of the region.
57 |
58 | We list the necessary content for you.
59 |
60 | Descriptor's Format(simplified)
61 | ---------------------------------
62 |
63 | .. code:: none
64 |
65 | Entry of PGD, PUD, PMD which point to a page table
66 |
67 | +-----+------------------------------+---------+--+
68 | | | next level table's phys addr | ignored |11|
69 | +-----+------------------------------+---------+--+
70 | 47 12 2 0
71 |
72 | Entry of PUD, PMD which point to a block
73 |
74 | +-----+------------------------------+---------+--+
75 | | | block's physical address |attribute|01|
76 | +-----+------------------------------+---------+--+
77 | 47 n 2 0
78 |
79 | Entry of PTE which point to a page
80 |
81 | +-----+------------------------------+---------+--+
82 | | | page's physical address |attribute|11|
83 | +-----+------------------------------+---------+--+
84 | 47 12 2 0
85 |
86 | Invalid entry
87 |
88 | +-----+------------------------------+---------+--+
89 | | | page's physical address |attribute|*0|
90 | +-----+------------------------------+---------+--+
91 | 47 12 2 0
92 |
93 | .. _page_attr:
94 |
95 | Attributes Used in this Lab
96 | ---------------------------------
97 |
98 | **Bits[54]**
99 | The unprivileged execute-never bit, non-executable page frame for EL0 if set.
100 |
101 | **Bits[53]**
102 | The privileged execute-never bit, non-executable page frame for EL1 if set.
103 |
104 | **Bits[47:n]**:
105 | The physical address the entry points to.
106 | Note that the address should be aligned to :math:`2^n` bytes.
107 |
108 | **Bits[10]**
109 | The access flag, a page fault is generated if not set.
110 |
111 | **Bits[7]**
112 | 0 for read-write, 1 for read-only.
113 |
114 | **Bits[6]**
115 | 0 for only kernel access, 1 for user/kernel access.
116 |
117 | **Bits[4:2]**
118 | The index to MAIR.
119 |
120 | **Bits[1:0]**
121 | Specify the next level is a block/page, page table, or invalid.
122 |
123 | .. warning::
124 | If you set Bits[7:6] to 0b01, which means the user can read/write the region,
125 | then the kernel is automatically not executable in that region no matter what the value of Bits[53] is.
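The attribute bits listed above can be combined into a page descriptor as in the following sketch. The bit positions come from the list. The macro names (``PD_RDONLY``, ``PD_USER``, ``PD_PXN``, ``PD_UXN``, ``PD_ADDR_MASK``) and ``make_pte`` are hypothetical, and the access flag is always set here so that the first access does not fault.

```c
#include <stdint.h>

#define PD_PAGE      0x3ULL                 /* Bits[1:0] = 0b11 */
#define PD_USER      (1ULL << 6)            /* user/kernel access */
#define PD_RDONLY    (1ULL << 7)            /* read-only */
#define PD_ACCESS    (1ULL << 10)           /* access flag */
#define PD_PXN       (1ULL << 53)           /* non-executable for EL1 */
#define PD_UXN       (1ULL << 54)           /* non-executable for EL0 */
#define PD_ADDR_MASK 0x0000FFFFFFFFF000ULL  /* Bits[47:12] */

/* Build a PTE entry for a 4KB page at physical address phys,
 * using MAIR index mair_idx and the extra attribute flags. */
uint64_t make_pte(uint64_t phys, unsigned mair_idx, uint64_t flags) {
    return (phys & PD_ADDR_MASK) | flags |
           ((uint64_t)(mair_idx & 0x7) << 2) | PD_ACCESS | PD_PAGE;
}
```

A read-only user data page that neither EL0 nor EL1 may execute would then be ``make_pte(phys, 1, PD_USER | PD_RDONLY | PD_UXN | PD_PXN)``, with index 1 being whichever MAIR slot holds your normal memory attribute.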
126 |
127 | AArch64 memory layout
128 | ========================
129 |
130 | In the 64-bit virtual memory system, the upper address space is usually for kernel mode, and the lower address space is for user mode.
131 |
132 |
133 | .. image:: img/mem_layout.png
134 |
135 | .. note::
136 | In the labs, the entire accessible physical address space can be linearly mapped at offset 0xffff_0000_0000_0000 for kernel access.
137 | This simplifies the design.
138 |
139 | Configuration
140 | ===============
141 |
142 | ARMv8-A is flexible and supports different configurations.
143 | You can change the granularity of paging, the addressable region, etc.
144 | To keep everything simple, the following configuration is specified for this lab.
145 |
146 | * Disable instruction cache.
147 | * Disable data cache.
148 | * The addressable region is 48 bits.
149 | * The page granule size is 4KB.
150 | * Do not use the address space ID (ASID).
151 |
152 | Reference
153 | ============
154 |
155 | So far, we have briefly introduced the concept of virtual memory and the ARMv8-A virtual memory system architecture.
156 | For details, you can refer to
157 |
158 | * `ARMv8-A Address Translation `_
159 | * **The AArch64 Virtual Memory System Architecture Chapter(page 1720)** of `ARMv8-A Architecture Reference `_
160 |
161 | *********
162 | Required
163 | *********
164 |
165 | Requirement 1
166 | ===============
167 |
168 | We provide a step-by-step tutorial to guide you in making your original kernel work with virtual memory.
169 | However, we only give the essential explanation for each step.
170 | For details, please refer to the manual.
171 |
172 | Translation Control Register (TCR)
173 | -----------------------------------
174 |
175 | Paging is configured by TCR.
176 | The following basic configuration is used in this lab.
177 |
178 | .. code:: c
179 |
180 | #define TCR_CONFIG_REGION_48bit (((64 - 48) << 0) | ((64 - 48) << 16))
181 | #define TCR_CONFIG_4KB ((0b00 << 14) | (0b10 << 30))
182 | #define TCR_CONFIG_DEFAULT (TCR_CONFIG_REGION_48bit | TCR_CONFIG_4KB)
183 |
184 | ldr x0, = TCR_CONFIG_DEFAULT
185 | msr tcr_el1, x0
186 |
187 | ``required 1-1`` Set up TCR_EL1.
188 |
189 | Memory Attribute Indirection Register (MAIR)
190 | ---------------------------------------------
191 |
192 | Brief Introduction
193 | ^^^^^^^^^^^^^^^^^^^
194 |
195 | The MMU has different memory **access policies** for different **memory regions**.
196 |
197 | * Memory **access policies** are encoded as attributes and stored in MAIR.
198 |
199 | * To select the attribute for a certain **memory region**, each page table's entry contains the index to the attribute. (see :ref:`page_attr`)
200 |
201 | When the MMU gets a virtual address, it gets the index from the page table's entry and looks up MAIR to get the memory attribute.
202 | Then, it accesses the memory with different access policies.
203 |
204 | Used Memory Attributes
205 | ^^^^^^^^^^^^^^^^^^^^^^^
206 |
207 | The following two attributes are used in the lab.
208 |
209 | * Device memory nGnRnE:
210 |
211 | * Peripheral access.
212 |
213 | * The most restricted memory access.
214 |
215 |
216 | * Normal memory without cache:
217 |
218 | * Normal RAM access.
219 |
220 | * Memory gathering, reordering, and speculative execution are possible but without cache.
221 |
222 | .. code:: c
223 |
224 | #define MAIR_DEVICE_nGnRnE 0b00000000
225 | #define MAIR_NORMAL_NOCACHE 0b01000100
226 | #define MAIR_IDX_DEVICE_nGnRnE 0
227 | #define MAIR_IDX_NORMAL_NOCACHE 1
228 |
229 | ldr x0, =( \
230 | (MAIR_DEVICE_nGnRnE << (MAIR_IDX_DEVICE_nGnRnE * 8)) | \
231 | (MAIR_NORMAL_NOCACHE << (MAIR_IDX_NORMAL_NOCACHE * 8)) \
232 | )
233 | msr mair_el1, x0
234 |
235 | ``required 1-2`` Set up ``mair_el1``.
236 |
237 |
238 | Identity Paging
239 | ----------------
240 |
241 | Before enabling the MMU, you need to set up the page tables for the kernel.
242 | You can start with identity paging using two-level translation.
243 |
244 | In a two-level translation, you only need PGD and PUD.
245 | Each entry of PUD points to a 1GB block.
246 | Hence, you only need
247 |
248 | * The first entry of PGD which points to PUD
249 |
250 | * The first two entries of PUD.
251 |
252 | * The first one maps 0x00000000 - 0x3fffffff (RAM and GPU peripherals)
253 |
254 | * The second one maps 0x40000000 - 0x7fffffff(ARM local peripherals).
255 |
256 | **setup**
257 |
258 | * 2 page frames for PGD and PUD.
259 |
260 | * PUD's entries are blocks.
261 |
262 | * Map all memory as Device nGnRnE.
263 |
264 | .. code:: c
265 |
266 | #define PD_TABLE 0b11
267 | #define PD_BLOCK 0b01
268 | #define PD_ACCESS (1 << 10)
269 | #define BOOT_PGD_ATTR PD_TABLE
270 | #define BOOT_PUD_ATTR (PD_ACCESS | (MAIR_IDX_DEVICE_nGnRnE << 2) | PD_BLOCK)
271 |
272 | mov x0, 0 // PGD's page frame at 0x0
273 | mov x1, 0x1000 // PUD's page frame at 0x1000
274 |
275 | ldr x2, = BOOT_PGD_ATTR
276 | orr x2, x1, x2 // combine the physical address of next level page with attribute.
277 | str x2, [x0]
278 |
279 | ldr x2, = BOOT_PUD_ATTR
280 | mov x3, 0x00000000
281 | orr x3, x2, x3
282 | str x3, [x1] // 1st 1GB mapped by the 1st entry of PUD
283 | mov x3, 0x40000000
284 | orr x3, x2, x3
285 | str x3, [x1, 8] // 2nd 1GB mapped by the 2nd entry of PUD
286 |
287 | msr ttbr0_el1, x0 // load PGD to the lower translation table base register.
288 |
289 | mrs x2, sctlr_el1
290 | orr x2 , x2, 1
291 | msr sctlr_el1, x2 // enable MMU, cache remains disabled
292 |
293 | If you set up correctly, your kernel should work as before.
294 |
295 | ``required 1-3`` Set up identity paging.
296 |
297 | Map the Kernel Space
298 | ---------------------
299 |
300 | As mentioned earlier, the kernel space is the upper address space.
301 | Now, you need to modify your linker script to place your kernel's symbols in the upper address space.
302 |
303 | .. code:: none
304 |
305 | SECTIONS
306 | {
307 | . = 0xffff000000000000; // kernel space
308 | . += 0x80000; // kernel load address
309 | _kernel_start = . ;
310 | // ...
311 | }
312 |
313 | After the kernel is re-built and loaded, load the identity paging's PGD to ``ttbr1_el1``.
314 | Next, enable the MMU and use an indirect branch to jump to the virtual address.
315 | After that, the CPU runs your kernel in the upper address space.
316 |
317 | .. code:: c
318 |
319 | // ...
320 |
321 | msr ttbr0_el1, x0
322 | msr ttbr1_el1, x0 // also load PGD to the upper translation table base register.
323 | mrs x2, sctlr_el1
324 | orr x2 , x2, 1
325 | msr sctlr_el1, x2
326 |
327 | ldr x2, = boot_rest // indirect branch to the virtual address
328 | br x2
329 |
330 | boot_rest:
331 | // ...
332 |
333 | ``required 1-4`` Modify the linker script, and map the kernel space.
334 |
335 | .. note::
336 | If there are hard-coded addresses (e.g. IO addresses) in your kernel, you should also move them to the upper address space.
337 |
338 | Finer Granularity Paging
339 | ----------------------------------
340 |
341 | The granularity of two-level translation is 1GB.
342 | In the previous setting, all memory regions are mapped as device memory.
343 |
344 | However, unaligned access to device memory causes an alignment exception, and the compiler sometimes generates unaligned accesses.
345 | Hence, you should map most of the RAM as normal memory and MMIO region as device memory.
346 |
347 | Then, you should use three-level translation (2MB) or four-level translation (4KB) for linear mapping.
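For the three-level variant, one PMD covers the first 1GB with 512 blocks of 2MB each. The sketch below fills such a PMD in C. It assumes the rpi3 peripheral range starts at ``0x3F000000``, which is not stated in this document, so check it against your board. ``fill_boot_pmd`` is a hypothetical name, and the macros mirror the ones used in the boot code above with binary literals written as hex.

```c
#include <stdint.h>

#define PD_BLOCK   0x1ULL
#define PD_ACCESS  (1ULL << 10)
#define MAIR_IDX_DEVICE_nGnRnE  0
#define MAIR_IDX_NORMAL_NOCACHE 1
#define BLOCK_2MB  0x200000ULL
#define MMIO_BASE  0x3F000000ULL  /* assumption: rpi3 peripheral base */

/* Fill one PMD so that it linearly maps the first 1GB with 2MB blocks:
 * RAM as normal memory, the MMIO range as device memory. */
void fill_boot_pmd(uint64_t pmd[512]) {
    for (uint64_t i = 0; i < 512; i++) {
        uint64_t phys = i * BLOCK_2MB;
        uint64_t idx = (phys >= MMIO_BASE) ? MAIR_IDX_DEVICE_nGnRnE
                                           : MAIR_IDX_NORMAL_NOCACHE;
        pmd[i] = phys | PD_ACCESS | (idx << 2) | PD_BLOCK;
    }
}
```

The first PUD entry then has to point to this PMD as a table (``0b11``) instead of mapping a 1GB block.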
348 |
349 | ``required 1-5`` Linear map kernel with finer granularity and map RAM as normal memory.
350 |
351 | Requirement 2
352 | ===============
353 |
354 | PGD Allocation
355 | ---------------
356 |
357 | To isolate user processes, you should create an address space for each of them.
358 | Hence, the kernel should allocate one PGD for each process when it creates a process.
359 |
360 | Map the User Space
361 | -------------------
362 |
363 | As with kernel space mapping, you need to iteratively fill in the entries of page tables from PGD -> PUD -> PMD -> PTE.
364 |
365 | During this process, the next level page tables such as PUD, PMD, and PTE may not be present yet.
366 | In that case, you should allocate one page frame to be used as the next level page table.
367 | Then, fill the page frame's entries to map the virtual address.
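The walk described above can be sketched as follows. To keep the example runnable outside the kernel, it uses a toy pool of frames as "physical memory". ``alloc_frame``, ``phys_to_virt``, and ``map_page`` are hypothetical names that stand in for your page frame allocator, your linear mapping, and your own mapping routine.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE    4096ULL
#define PD_TABLE     0x3ULL
#define PD_PAGE      0x3ULL
#define PD_ACCESS    (1ULL << 10)
#define PD_ADDR_MASK 0x0000FFFFFFFFF000ULL

/* Assumption: a toy physical memory. Frame i lives at (i + 1) * 4KB,
 * so physical address 0 can be used to signal "out of memory". */
#define NFRAMES 16
static uint64_t frames[NFRAMES][512];
static unsigned next_frame;

static uint64_t alloc_frame(void) {
    if (next_frame >= NFRAMES)
        return 0;
    memset(frames[next_frame], 0, PAGE_SIZE);
    return (uint64_t)(++next_frame) * PAGE_SIZE;
}

static uint64_t *phys_to_virt(uint64_t phys) {
    return frames[phys / PAGE_SIZE - 1];
}

/* Map one 4KB page: walk PGD -> PUD -> PMD, allocating missing tables,
 * then write the PTE entry. Returns 0 on success, -1 if out of memory. */
int map_page(uint64_t pgd_phys, uint64_t va, uint64_t pa, uint64_t attr) {
    uint64_t *table = phys_to_virt(pgd_phys);
    for (int shift = 39; shift > 12; shift -= 9) {
        uint64_t *entry = &table[(va >> shift) & 0x1FF];
        if ((*entry & 1) == 0) {             /* invalid: allocate next level */
            uint64_t next = alloc_frame();
            if (next == 0)
                return -1;
            *entry = next | PD_TABLE;
        }
        table = phys_to_virt(*entry & PD_ADDR_MASK);
    }
    table[(va >> 12) & 0x1FF] = (pa & PD_ADDR_MASK) | attr | PD_ACCESS | PD_PAGE;
    return 0;
}
```

Mapping a second page in the same 2MB range reuses the tables allocated for the first one, so only the first mapping in a region pays for the three extra frames.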
368 |
369 | ``required 2-1`` Implement user space paging.
370 |
371 | .. note::
372 | You should use 4KB pages for user processes in this lab, so you need PGD, PUD, PMD, and PTE for four-level translation.
373 |
374 |
375 | Revisit Fork and Exec
376 | ^^^^^^^^^^^^^^^^^^^^^^
377 |
378 | In lab 5, different user programs used different linker scripts to prevent address overlapping.
379 | Also, the child process can't use the same user stack address as the parent.
380 |
381 | With virtual memory, the same virtual address can be mapped to different physical addresses.
382 | Therefore, you can revisit ``fork()`` and ``exec()`` with virtual memory to solve the problems mentioned above.
383 |
384 | ``required 2-2`` Revisit ``fork()`` and ``exec()`` to map the same virtual address to different physical addresses for different processes.
385 |
386 | Context Switch
387 | ---------------
388 |
389 | To switch between different address spaces,
390 | you can set the translation table base register (``ttbr0_el1``) to different PGDs.
391 |
392 | In addition, you might need memory barriers to guarantee that previous instructions have finished.
393 | Also, a TLB invalidation is needed because the old entries are stale.
394 |
395 | .. code:: c
396 |
397 | ldr x0, = next_pgd
398 | dsb ish // ensure write has completed
399 | msr ttbr0_el1, x0 // switch translation table base address.
400 | tlbi vmalle1is // invalidate all TLB entries
401 | dsb ish // ensure completion of TLB invalidation
402 | isb // clear pipeline
403 |
404 | ``required 2-3`` Set ``ttbr0_el1`` to switch the address space in context switches.
405 |
406 |
407 | Simple Page Fault Handler
408 | --------------------------
409 |
410 | When the CPU accesses a non-mapped address, a page fault exception is taken.
411 | You should **print the fault address** stored in ``far_el1`` in kernel mode and **terminate the user process**.
412 |
413 | ``required 2-4`` Implement a simple page fault handler.
414 |
415 | Test
416 | =======
417 |
418 | Please test your implementation with the following code, or logically equivalent code, in the demo.
419 |
420 | test.c
421 |
422 | .. code:: c
423 |
424 | int main(void) {
425 | int cnt = 0;
426 | if(fork() == 0) {
427 | fork();
428 | fork();
429 | while(cnt < 10) {
430 | printf("pid: %d, sp: 0x%llx cnt: %d\n", getpid(), &cnt, cnt++); // address should be the same, but the cnt should be increased independently
431 | delay(1000000);
432 | }
433 | } else {
434 | int* a = 0x0; // a non-mapped address.
435 | printf("%d\n", *a); // trigger simple page fault.
436 | printf("Should not be printed\n");
437 | }
438 | }
439 |
440 | **********
441 | Elective
442 | **********
443 |
444 | Mmap
445 | =====
446 |
447 | ``mmap()`` is the system call to create memory regions for a user process.
448 | Each region can be mapped to a file or to anonymous pages (page frames not related to any file) with different protection.
449 | Users can then create a heap and memory-mapped files using the system call.
450 |
451 | Besides, the kernel can also use it for implementing the program loader.
452 | Memory regions such as .text and .data can be created by **memory-mapped files**.
453 | Memory regions such as **.bss** and **user stack** can be created by **anonymous page mapping**.
454 |
455 | API Specification
456 | ------------------
457 |
458 | (void*) mmap(void* addr, size_t len, int prot, int flags, int fd, int file_offset)
459 | The kernel uses **addr** and **len** to create a new valid region for the current process.
460 |
461 | * If **addr** is NULL, the kernel decides the new region's start address
462 |
463 | * If **addr** is not NULL
464 |
465 | * If the new region **overlaps** with existing regions, or **addr** is **not page-aligned**
466 |
467 | * If MAP_FIXED is set, ``mmap()`` fails.
468 |
469 | * Otherwise, the kernel takes **addr** as a hint and decides the new region's start address.
470 |
471 | * Otherwise, the kernel uses **addr** as the new region's start address.
472 |
473 | * The memory region created by ``mmap()`` should be page-aligned. If **len** is not a multiple of the page size, the kernel rounds it up.
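The alignment checks and the rounding of **len** can be sketched with two small helpers. The names are hypothetical and the 4KB page size follows this lab's configuration.

```c
#include <stdint.h>

#define PAGE_SIZE 4096ULL

/* Round len up to the next multiple of the page size. */
static inline uint64_t page_align_up(uint64_t len) {
    return (len + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
}

/* Check whether addr lies on a page boundary. */
static inline int is_page_aligned(uint64_t addr) {
    return (addr & (PAGE_SIZE - 1)) == 0;
}
```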
474 |
475 |
476 | **prot** is the region's access protection
477 |
478 | * PROT_NONE : not accessible
479 |
480 | * PROT_READ : readable
481 |
482 | * PROT_WRITE : writable
483 |
484 | * PROT_EXEC : executable
485 |
486 | The following **flags** should be implemented
487 |
488 | * MAP_FIXED: New region's start should be **addr**, otherwise the ``mmap()`` fails.
489 |
490 | * MAP_ANONYMOUS: New region is mapped to anonymous page. It's usually used for stack and heap.
491 |
492 | * MAP_POPULATE: After ``mmap()``, it directly does :ref:`region_map`. (You don't have to implement it if you implement demand paging)
493 |
494 | **fd** is the mapped file's file descriptor.
495 |
496 | The new region is mapped to the **file_offset** of the mapped file.
497 |
498 | * The file_offset should be page-aligned.
499 |
500 | .. note::
501 | * You don't need to handle the case that the new region overlaps existing regions.
502 |
503 | * We use memory-mapped files for the ELF loader. If you don't implement the ELF loader, you don't need to implement **fd**, **file_offset**, and **MAP_FIXED**.
504 |
505 |
506 |
507 | .. _region_map:
508 |
509 | Region Page Mapping
510 | ----------------------
511 |
512 | If the user specifies MAP_POPULATE in the ``mmap()`` call,
513 | the kernel should create the page mapping for the newly created region.
514 |
515 | * If the region is mapped to anonymous pages
516 |
517 | 1. Allocate page frames.
518 |
519 | 2. Map the region to page frames, and set the page attributes according to region's protection policy.
520 |
521 | * If the region is mapped to a file
522 |
523 | 1. Allocate page frames.
524 |
525 | 2. Map the region to page frames, and set the page attributes according to region's protection policy.
526 |
527 | 3. Copy the file's content to the memory region.
528 |
529 | Tests
530 | -----
531 |
532 | Please test your implementation with the following code or equivalent logic code in the demo.
533 |
534 | illegal_read.c
535 |
536 | .. code:: c
537 |
538 | int main(void) {
539 | char* ptr = mmap(0x1000, 4096, PROT_READ, MAP_ANONYMOUS, -1, 0);
540 | printf("addr: %llx\n", ptr);
541 | printf("%d\n", ptr[1000]); // should be 0
542 | printf("%d\n", ptr[4097]); // should be segfault
543 | }
544 |
545 | illegal_write.c
546 |
547 | .. code:: c
548 |
549 | int main() {
550 | char* ptr = mmap(NULL, 4096, PROT_READ, MAP_ANONYMOUS, -1, 0);
551 | printf("addr: %llx\n", ptr);
552 | printf("%d\n", ptr[1000]); // should be 0
553 | ptr[0] = 1; // should be seg fault
554 | printf("%d\n", ptr[0]); // not reached
555 | }
556 |
557 | ``elective 1`` Implement ``mmap()``.
558 |
559 |
560 | .. _ELF:
561 |
562 | ELF Loader
563 | ==========
564 |
565 | In this part, you need to implement an ELF loader to replace the raw binary loader.
566 |
567 | ELF Parsing
568 | -------------
569 |
570 | The difference between a raw binary and an ELF file is the header.
571 | You can get the segment information by parsing the ELF file's headers.
572 |
573 | To implement an ELF loader, you only need to care about the ELF header and the program headers.
574 | The following are the struct members you need for loading a statically linked ELF.
575 |
576 | ELF Header
577 | ^^^^^^^^^^^
578 |
579 | * **e_entry**: The ELF's entry point. You need to set the user exception return address to it.
580 |
581 | * **e_phoff**: The offset of the program headers from the start of the ELF file.
582 |
583 | * **e_phnum**: The number of program headers.
584 |
585 | Program Header
586 | ^^^^^^^^^^^^^^^
587 |
588 | * **p_type**: The type of the program header. You only need to care about PT_LOAD (LOAD segments).
589 |
590 | * **p_vaddr**: The virtual address the segment should be loaded to.
591 |
592 | * **p_offset**: The offset of the segment from the start of the ELF file.
593 |
594 | * **p_align**: **p_vaddr** :math:`\equiv` **p_offset** (mod **p_align**)
595 |
596 | * **p_filesz**: The size of the segment in the file. It covers .text, .data, etc.
597 |
598 | * **p_memsz**: The size of the segment in memory. It usually equals **p_filesz**. If the segment contains .bss, it is larger than **p_filesz**.
599 |
600 | * **p_flags**: The segment flags. You only need to care about the rwx bits.
601 |
602 |
603 | .. note::
604 | Don't confuse **p_offset** with **file_offset** in ``mmap()``. **p_offset** may not be page-aligned.
605 |
606 | Don't confuse **p_vaddr** with **addr** in ``mmap()``. **p_vaddr** may not be page-aligned.
607 |
608 | ``elective 2-1`` Parse the ELF header.
609 |
610 | .. hint::
611 | You can check the correctness with ``readelf -l`` on Linux.
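
A minimal sketch of the parsing step, assuming a 64-bit little-endian ELF image that is already in memory. The struct layouts follow the ELF64 specification, while ``elf_load_segments()`` and its parameters are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ELF64 file header layout (64 bytes). */
struct elf64_ehdr {
    unsigned char e_ident[16];
    uint16_t e_type, e_machine;
    uint32_t e_version;
    uint64_t e_entry, e_phoff, e_shoff;
    uint32_t e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx;
};

/* ELF64 program header layout (56 bytes). */
struct elf64_phdr {
    uint32_t p_type, p_flags;
    uint64_t p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_align;
};

#define PT_LOAD 1u

/* Hypothetical helper: copies the PT_LOAD program headers into out[]
 * and stores e_entry in *entry.
 * Returns the number of LOAD segments, or -1 if the image is rejected. */
static int elf_load_segments(const uint8_t *image, size_t size,
                             uint64_t *entry, struct elf64_phdr *out, int max)
{
    struct elf64_ehdr eh;

    if (size < sizeof eh)
        return -1;
    memcpy(&eh, image, sizeof eh);  /* memcpy avoids unaligned access */

    /* Check the magic number and EI_CLASS == ELFCLASS64 (2). */
    if (memcmp(eh.e_ident, "\177ELF", 4) != 0 || eh.e_ident[4] != 2)
        return -1;
    if (eh.e_phentsize != sizeof(struct elf64_phdr))
        return -1;
    /* The whole program header table must lie inside the image. */
    if (eh.e_phoff > size ||
        (size - eh.e_phoff) / sizeof(struct elf64_phdr) < eh.e_phnum)
        return -1;

    *entry = eh.e_entry;

    int n = 0;
    for (uint16_t i = 0; i < eh.e_phnum; i++) {
        struct elf64_phdr ph;
        memcpy(&ph, image + eh.e_phoff + (size_t)i * sizeof ph, sizeof ph);
        if (ph.p_type != PT_LOAD)
            continue;               /* only LOAD segments matter here */
        if (n == max)
            return -1;
        out[n++] = ph;
    }
    return n;
}
```

The values collected in ``out[]`` can be compared field by field with the ``LOAD`` rows printed by ``readelf -l``.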
612 |
613 | ELF reference
614 | ^^^^^^^^^^^^^^^
615 |
616 | * https://en.wikipedia.org/wiki/Executable_and_Linkable_Format
617 |
618 | ELF mapping
619 | --------------
620 |
621 | You can use ``mmap()`` to create regions for the ELF file according to the LOAD segments in program headers.
622 |
623 | In general, you can use
624 | :code:`mmap(p_vaddr, p_filesz, p_flags, MAP_FIXED | MAP_POPULATE, bin_start, p_offset); // MAP_POPULATE can be removed if you implement demand paging`
625 | to create memory regions, and :ref:`region_map` can do the mapping and copying jobs for you.
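
The call above passes **p_flags** directly as the protection argument, which only works if your protection bits use the same encoding as the ELF flag bits. ELF defines PF_X = 1, PF_W = 2 and PF_R = 4. The sketch below converts them explicitly; the ``LAB_PROT_*`` values are assumptions made for this example, so substitute the definitions from your own ``mmap()`` implementation.

```c
#include <assert.h>
#include <stdint.h>

/* ELF segment permission bits, as defined by the ELF specification. */
#define PF_X 0x1u
#define PF_W 0x2u
#define PF_R 0x4u

/* ASSUMED encodings of this lab's protection bits; replace with your own. */
#define LAB_PROT_READ  0x1
#define LAB_PROT_WRITE 0x2
#define LAB_PROT_EXEC  0x4

/* Hypothetical helper: translates p_flags into mmap() protection bits. */
static int pflags_to_prot(uint32_t p_flags)
{
    int prot = 0;

    if (p_flags & PF_R)
        prot |= LAB_PROT_READ;
    if (p_flags & PF_W)
        prot |= LAB_PROT_WRITE;
    if (p_flags & PF_X)
        prot |= LAB_PROT_EXEC;
    return prot;
}
```

With these assumed values, a typical text segment (``R E`` in ``readelf -l``) becomes ``LAB_PROT_READ | LAB_PROT_EXEC``.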
626 |
627 | However, there are some cases you need to care about:
628 |
629 | p_memsz > p_filesz
630 | ^^^^^^^^^^^^^^^^^^^
631 |
632 | This usually happens when .bss and .data are in one LOAD segment, or when .bss has its own LOAD segment.
633 | In this case, **.data** should still **map to the ELF file**, but **.bss** should **map to anonymous page frames** by setting MAP_ANONYMOUS because it's not backed by the ELF file.
634 |
635 | If, unfortunately, **.bss and .data are in the same segment** and their **boundary is in the middle of a page frame**,
636 | you should
637 |
638 | 1. Map the page as a normal file-mapped region, as in :ref:`region_map`.
639 |
640 | 2. Initialize the part of the page frame that belongs to .bss to 0.
641 |
642 | .. note::
643 | If you implement demand paging, you should pre-fault the page at the .data and .bss boundary and zero-initialize the head of .bss.
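
The address arithmetic for this case can be sketched as follows. ``struct bss_plan``, ``plan_bss()`` and ``zero_bss_head()`` are hypothetical names, and a 4 KB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096ULL

static uint64_t page_up(uint64_t addr)
{
    return (addr + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
}

/* Hypothetical result type describing how to handle p_memsz > p_filesz. */
struct bss_plan {
    uint64_t zero_from;   /* offset in the boundary page where .bss starts,
                             0 if the boundary is page aligned */
    uint64_t anon_start;  /* first address to back with anonymous frames */
    uint64_t anon_len;    /* length of the anonymous part, 0 if not needed */
};

static struct bss_plan plan_bss(uint64_t p_vaddr, uint64_t p_filesz,
                                uint64_t p_memsz)
{
    uint64_t file_end = p_vaddr + p_filesz;  /* .data / .bss boundary */
    uint64_t mem_end = p_vaddr + p_memsz;
    struct bss_plan plan;

    plan.zero_from = file_end % PAGE_SIZE;
    plan.anon_start = page_up(file_end);
    plan.anon_len = mem_end > plan.anon_start
                        ? page_up(mem_end) - plan.anon_start
                        : 0;
    return plan;
}

/* Clears the .bss part of the file-backed boundary page.
 * frame points to the start of that page frame.
 * Returns the number of bytes cleared. */
static size_t zero_bss_head(uint8_t *frame, uint64_t zero_from)
{
    if (zero_from == 0)
        return 0;  /* boundary is page aligned, nothing to clear */
    memset(frame + zero_from, 0, PAGE_SIZE - zero_from);
    return (size_t)(PAGE_SIZE - zero_from);
}
```

Pages below ``anon_start`` stay file-backed, and the range starting at ``anon_start`` can be created with MAP_ANONYMOUS.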
644 |
645 | p_vaddr and p_offset are not page aligned
646 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
647 |
648 | The region created by ``mmap()`` should be page-aligned.
649 | With the MAP_FIXED flag, some parameters need to be modified:
650 |
651 | * **addr** should be set to **p_vaddr** - (**p_vaddr** MOD **page_size**)
652 |
653 | * **file_offset** should be set to **p_offset** - (**p_offset** MOD **page_size**)
654 |
655 | * **len** should be set to **p_filesz** + (**p_offset** MOD **page_size**)
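
The three adjustments above can be written as one small helper. This is a sketch: ``struct mmap_args`` and ``align_segment()`` are hypothetical names, a 4 KB page size is assumed, and **p_align** is assumed to be a multiple of the page size so that **p_vaddr** and **p_offset** have the same offset within a page.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096ULL

/* Hypothetical container for the adjusted mmap() parameters. */
struct mmap_args {
    uint64_t addr;
    uint64_t file_offset;
    uint64_t len;
};

static struct mmap_args align_segment(uint64_t p_vaddr, uint64_t p_offset,
                                      uint64_t p_filesz)
{
    uint64_t skew = p_offset % PAGE_SIZE;
    struct mmap_args args;

    /* Holds when p_align is a multiple of the page size (assumption). */
    assert(p_vaddr % PAGE_SIZE == skew);

    args.addr = p_vaddr - (p_vaddr % PAGE_SIZE);
    args.file_offset = p_offset - skew;
    args.len = p_filesz + skew;  /* also cover the bytes before p_vaddr */
    return args;
}
```

For example, a segment with ``p_vaddr = 0x400e10``, ``p_offset = 0xe10`` and ``p_filesz = 0x230`` yields ``addr = 0x400000``, ``file_offset = 0`` and ``len = 0x1040``.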
656 |
657 | ``elective 2-2`` Implement ELF mapping.
658 |
659 |
660 | Page Fault Handler & Demand Paging
661 | ======================================
662 |
663 | The page frames are pre-allocated in the previous parts.
664 | However, a user program might allocate a huge space on the heap or in memory mapped files without using it.
665 | The kernel then wastes CPU time and physical memory on pages that are never touched.
666 |
667 | In this part, your kernel should allocate page frames for user processes on demand.
668 | The kernel only allocates the PGD for a newly created process in the beginning.
669 |
670 | When a page fault is generated,
671 |
672 | * If the fault address is not part of any region in the process's address space,
673 |
674 | * A segmentation fault is generated, and the kernel terminates the process.
675 |
676 | * If it's part of one region,
677 |
678 | Follow :ref:`region_map`, but only map **one page frame** for the fault address.
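
The decision above can be sketched as a lookup over the process's regions. ``struct region``, ``enum fault_action`` and ``on_translation_fault()`` are hypothetical names, and a 4 KB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096ULL

/* Hypothetical, minimal region descriptor. */
struct region {
    uint64_t start;  /* page-aligned start address */
    uint64_t len;    /* length in bytes */
};

enum fault_action {
    FAULT_SEGV,         /* address is outside every region: kill the process */
    FAULT_MAP_ONE_PAGE  /* address is inside a region: map exactly one page */
};

/* Decides how to react to a fault at fault_addr.
 * On FAULT_MAP_ONE_PAGE, *page_base receives the page to map. */
static enum fault_action on_translation_fault(const struct region *regions,
                                              size_t count,
                                              uint64_t fault_addr,
                                              uint64_t *page_base)
{
    for (size_t i = 0; i < count; i++) {
        const struct region *r = &regions[i];

        /* Written this way to avoid overflow in start + len. */
        if (fault_addr >= r->start && fault_addr - r->start < r->len) {
            *page_base = fault_addr & ~(PAGE_SIZE - 1);
            return FAULT_MAP_ONE_PAGE;
        }
    }
    return FAULT_SEGV;
}
```

The caller then runs the page mapping steps for the single page at ``page_base``, using the matching region's protection and backing file.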
679 |
680 | ``elective 3`` Implement demand paging.
681 |
682 | Copy on Write
683 | ================
684 |
685 | When a process calls ``fork()`` to create a child process,
686 | the kernel needs to copy all the page frames owned by the parent in the previous implementation.
687 | Otherwise, the two processes would share the frames, and a write by either the child or the parent would change the other's memory without its knowledge and cause errors.
688 |
689 | However, a ``fork()`` followed by an ``exec()`` call is quite common in UNIX programming.
690 | The original mapping of the child would be destroyed, and you would waste a lot of time copying page frames that are never used.
691 | Hence, a copy-on-write mechanism helps avoid this waste.
692 |
693 | The following describes a possible copy-on-write implementation.
694 |
695 | On Forking a New Process
696 | ------------------------
697 |
698 | 1. Copy the page frames of page tables.
699 |
700 | 2. Then mark the PTEs of **both child and parent** as **read-only**, even for originally read-write pages.
701 |
702 | When Either the Child or the Parent Writes to That Page
703 | -------------------------------------------------------
704 |
705 | A permission fault is generated because the PTE is marked as read-only.
706 |
707 | You should then check the region's permission in the address space.
708 |
709 | * If the corresponding region is **read-only**, then a **segmentation fault** is generated because the user is trying to write to a read-only region.
710 |
711 | * If the corresponding region is **read-write**, then it's a **copy-on-write fault**.
712 |
713 | * The kernel should allocate a page frame, copy the data, and update the page table entry with the correct permission.
714 |
715 | .. note::
716 | ``fork()`` may be executed many times, so page frames may be shared by many children and one parent.
717 | Hence, you need a reference count for each page frame.
718 | And you should not reclaim the page frame if there is still someone referring to it.
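
A host-side sketch of the fork and write-fault handling with reference counts. All names here (``struct frame``, ``struct pte``, ``cow_fork_share()``, ``cow_write_fault()``) are hypothetical, and ``malloc()`` stands in for the page frame allocator. Reusing the frame without copying when the reference count is 1 is an optional shortcut assumed by this example, not a requirement of the lab.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 4096u

/* Hypothetical page frame with its reference count. */
struct frame {
    uint8_t data[PAGE_SIZE];
    int refcount;
};

/* Stand-in for a page table entry. */
struct pte {
    struct frame *frame;
    int writable;
};

enum cow_result { COW_SEGV, COW_COPIED, COW_REUSED, COW_NOMEM };

/* fork(): share the frame and mark both entries read-only. */
static void cow_fork_share(struct pte *parent, struct pte *child)
{
    parent->writable = 0;
    *child = *parent;
    parent->frame->refcount++;
}

/* Permission fault on a write. region_writable is the permission of the
 * region that contains the fault address. */
static enum cow_result cow_write_fault(struct pte *pte, int region_writable)
{
    enum cow_result result = COW_REUSED;

    if (!region_writable)
        return COW_SEGV;  /* a real write to a read-only region */

    if (pte->frame->refcount > 1) {
        /* Still shared: give this process a private copy. */
        struct frame *copy = malloc(sizeof *copy);
        if (copy == NULL)
            return COW_NOMEM;
        memcpy(copy->data, pte->frame->data, PAGE_SIZE);
        copy->refcount = 1;
        pte->frame->refcount--;  /* the old frame loses one user */
        pte->frame = copy;
        result = COW_COPIED;
    }
    /* With a single remaining user the frame is written in place. */
    pte->writable = 1;
    return result;
}
```

A frame may be returned to the allocator only when its reference count drops to 0, for example when the last process that maps it exits.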
719 |
720 | ``elective 4`` Implement copy-on-write.
721 |
--------------------------------------------------------------------------------