├── content ├── _index.md ├── blog │ ├── _index.md │ └── 2024-10-16-new-website.md ├── glossary │ ├── git.md │ ├── sbom.md │ ├── omnibor.md │ ├── artifact.md │ ├── build_tool.md │ ├── artifact_dependency_graph.md │ ├── git_blob.svg │ ├── git_object.svg │ ├── gitoid.svg │ ├── _index.md │ └── gitref.svg ├── code-of-conduct.md ├── resources │ ├── supply-chain-talk.md │ ├── cloud-native-security-con-eu-2022.md │ ├── intro.md │ ├── _index.md │ ├── faq.md │ └── whitepaper.md ├── spec │ ├── _index.md │ └── v0.1 │ │ ├── annex-b.md │ │ ├── annex-a.md │ │ ├── annex-c.md │ │ └── _index.md ├── docs │ ├── _index.md │ ├── input-manifests.md │ └── artifact-ids.md ├── third-party.md ├── contribute.md └── project.md ├── styles └── main.css ├── templates ├── shortcodes │ ├── img.html │ ├── mermaid.html │ └── calendar.html ├── macros │ ├── icon.html │ └── toc.html ├── redirect.html ├── blog.html ├── page.html ├── section.html ├── doc.html ├── spec.html ├── glossary.html ├── spec_page.html ├── project.html ├── includes │ ├── header.html │ └── footer.html ├── blog_post.html ├── index.html └── base.html ├── package.json ├── tailwind.config.js ├── README.md ├── netlify.toml ├── config.toml ├── .gitignore └── LICENSE /content/_index.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "OmniBOR" 3 | --- 4 | -------------------------------------------------------------------------------- /styles/main.css: -------------------------------------------------------------------------------- 1 | @tailwind base; 2 | @tailwind components; 3 | @tailwind utilities; 4 | -------------------------------------------------------------------------------- /content/blog/_index.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Blog" 3 | template: "blog.html" 4 | page_template: "blog_post.html" 5 | --- 6 | -------------------------------------------------------------------------------- 
/templates/shortcodes/img.html: -------------------------------------------------------------------------------- 1 |
2 | {{ alt }} 3 |
4 | -------------------------------------------------------------------------------- /templates/shortcodes/mermaid.html: -------------------------------------------------------------------------------- 1 | 2 |
3 |     {{ body }}
4 | 
5 | -------------------------------------------------------------------------------- /content/glossary/git.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Git" 3 | template: "redirect.html" 4 | extra: 5 | redirect_to: "@/glossary/_index.md#git" 6 | --- 7 | 8 |   9 | -------------------------------------------------------------------------------- /content/glossary/sbom.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "SBOM" 3 | template: "redirect.html" 4 | extra: 5 | redirect_to: "@/glossary/_index.md#sbom" 6 | --- 7 | 8 |   9 | -------------------------------------------------------------------------------- /package.json: -------------------------------------------------------------------------------- 1 | { 2 | "type": "commonjs", 3 | "devDependencies": { 4 | "@tailwindcss/typography": "^0.5.15", 5 | "tailwindcss": "^3.4.13" 6 | } 7 | } 8 | -------------------------------------------------------------------------------- /content/glossary/omnibor.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "OmniBOR" 3 | template: "redirect.html" 4 | extra: 5 | redirect_to: "@/glossary/_index.md#omnibor" 6 | --- 7 | 8 |   9 | -------------------------------------------------------------------------------- /content/glossary/artifact.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Artifact" 3 | template: "redirect.html" 4 | extra: 5 | redirect_to: "@/glossary/_index.md#artifact" 6 | --- 7 | 8 |   9 | -------------------------------------------------------------------------------- /content/glossary/build_tool.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Build Tool" 3 | template: "redirect.html" 4 | extra: 5 | redirect_to: "@/glossary/_index.md#build-tool" 6 | --- 7 | 8 |   9 | 
-------------------------------------------------------------------------------- /content/code-of-conduct.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Code of Conduct" 3 | template: "redirect.html" 4 | extra: 5 | redirect_to: "https://github.com/omnibor/spec/blob/main/code_of_conduct.md" 6 | --- 7 | 8 |   9 | -------------------------------------------------------------------------------- /content/glossary/artifact_dependency_graph.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Artifact Dependency Graph" 3 | template: "redirect.html" 4 | extra: 5 | redirect_to: "@/glossary/_index.md#artifact-dependency-graph" 6 | --- 7 | 8 |   9 | -------------------------------------------------------------------------------- /content/resources/supply-chain-talk.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Supply Chain Security Con NA 2021" 3 | template: "redirect.html" 4 | extra: 5 | redirect_to: "https://www.youtube.com/watch?v=GKyrsDOse6s&t=546s" 6 | --- 7 | 8 |   9 | -------------------------------------------------------------------------------- /content/resources/cloud-native-security-con-eu-2022.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Cloud Native Security Con EU 2022" 3 | template: "redirect.html" 4 | extra: 5 | redirect_to: "https://www.youtube.com/watch?v=2SSkNLWL4UM" 6 | --- 7 | 8 |   9 | -------------------------------------------------------------------------------- /content/resources/intro.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Intro to GitBOM" 3 | template: "redirect.html" 4 | extra: 5 | redirect_to: "https://docs.google.com/presentation/d/1fSyRyvYhRYQr-RGm5N1TFcSLQdNV7YYtZmaf2xVwjy4/edit?usp=sharing" 6 | --- 7 | 8 |   9 | 
-------------------------------------------------------------------------------- /templates/macros/icon.html: -------------------------------------------------------------------------------- 1 | {% macro icon(name, classes="") %} 2 | 3 | 4 | 5 | {% endmacro icon %} 6 | -------------------------------------------------------------------------------- /templates/redirect.html: -------------------------------------------------------------------------------- 1 | {% extends "base.html" %} 2 | 3 | {% block scripts %} 4 | {% if page.extra.redirect_to %} 5 | {% set target = page.extra.redirect_to %} 6 | 7 | {% if target is starting_with("http") %} 8 | {% set url = target %} 9 | {% else %} 10 | {% set url = get_url(path=target) %} 11 | {% endif %} 12 | 13 | 16 | {% endif %} 17 | {% endblock %} 18 | -------------------------------------------------------------------------------- /tailwind.config.js: -------------------------------------------------------------------------------- 1 | /** @type {import('tailwindcss').Config} */ 2 | 3 | export default { 4 | content: [ 5 | "./templates/**/*.html", 6 | "./public/**/*.html", 7 | "./content/**/*.md", 8 | "config.toml", 9 | ], 10 | theme: { 11 | fontFamily: { 12 | sans: [ 13 | "'InterVariable', 'Inter', 'sans-serif'", 14 | { 15 | fontFeatureSettings: "'liga' 1, 'calt' 1", 16 | }, 17 | ], 18 | }, 19 | }, 20 | plugins: [require("@tailwindcss/typography")], 21 | }; 22 | -------------------------------------------------------------------------------- /templates/shortcodes/calendar.html: -------------------------------------------------------------------------------- 1 | 2 | {# For Large Screens #} 3 | 6 | 7 | {# For Small Screens #} 8 |
9 | 10 |
11 | -------------------------------------------------------------------------------- /content/spec/_index.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Specification" 3 | page_template: spec_page.html 4 | --- 5 | 6 | The OmniBOR specification is being actively developed. The following is a table 7 | with the current state of the in-development specification and prior tagged 8 | releases of the specification. 9 | 10 | | Version | Stable? | Rendered | Source | 11 | |:--------|:--------|:---------------|:-------------| 12 | | `main` | | | [View][main] | 13 | | `v0.1` | Yes | [View][v0_1_r] | [View][v0_1] | 14 | 15 | [main]: https://github.com/omnibor/spec/blob/main/spec/SPEC.md 16 | [v0_1]: https://github.com/omnibor/spec/blob/v0.1/spec/SPEC.md 17 | [v0_1_r]: @/spec/v0.1/_index.md 18 | -------------------------------------------------------------------------------- /templates/macros/toc.html: -------------------------------------------------------------------------------- 1 | 2 | {% macro toc(obj) %} 3 | {% if obj.toc %} 4 |
5 |

Table of Contents

6 | 7 |
    8 | {% for entry in obj.toc %} 9 |
  1. 10 | {% set link = entry.permalink | split(pat='#') | nth(n=1) %} 11 | 12 | 13 | {{ entry.title }} 14 | 15 |
  2. 16 | {% endfor %} 17 |
18 |
19 | {% endif %} 20 | {% endmacro %} 21 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # OmniBOR Website 2 | 3 | This is the repository for the official [OmniBOR website][omnibor_site]. The 4 | site is built using Zola, and is deployed to Netlify. 5 | 6 | ## Contributing 7 | 8 | We're happy to accept contributions to the site! All contributions are done 9 | under the terms of the Apache 2.0 license. See the [`LICENSE` file][license] 10 | for more information. 11 | 12 | We provide a [devcontainer][devcontainer] to make contributing easier; it 13 | comes with all dependencies required to build the site already installed. 14 | You can use the container locally, or in [GitHub Codespaces][codespaces]. 15 | 16 | ## License 17 | 18 | The website code is licensed under the Apache 2.0 license. The full contents 19 | can be found in the [`LICENSE` file][license]. 20 | 21 | [omnibor_site]: https://omnibor.io/ 22 | [license]: https://github.com/omnibor/site/blob/main/LICENSE 23 | [devcontainer]: https://containers.dev/ 24 | [codespaces]: https://docs.github.com/en/codespaces/developing-in-a-codespace/developing-in-a-codespace 25 | 26 | 27 | -------------------------------------------------------------------------------- /templates/blog.html: -------------------------------------------------------------------------------- 1 | {% extends "base.html" %} 2 | 3 | {% block hero %} 4 | {% endblock %} 5 | 6 | {% block content %} 7 |

8 | {{ section.title }} 9 | {% if section.extra.subtitle %} 10 | {{ section.extra.subtitle }} 11 | {% endif %} 12 |

13 | 14 | {{ section.content | markdown | safe }} 15 | 16 | 27 | {% endblock %} 28 | 29 | {% block sidebar %} 30 | {% endblock %} 31 | -------------------------------------------------------------------------------- /templates/page.html: -------------------------------------------------------------------------------- 1 | {% extends "base.html" %} 2 | 3 | {% block hero %} 4 | {% endblock %} 5 | 6 | {% block content %} 7 |

8 | {{ page.title }} 9 | {% if page.extra.subtitle %} 10 | {{ page.extra.subtitle }} 11 | {% endif %} 12 |

13 | 14 |
39 | {{ page.content | safe }} 40 |
41 | {% endblock %} 42 | 43 | {% block sidebar %} 44 | {% endblock %} 45 | -------------------------------------------------------------------------------- /templates/section.html: -------------------------------------------------------------------------------- 1 | {% extends "base.html" %} 2 | 3 | {% block hero %} 4 | {% endblock %} 5 | 6 | {% block content %} 7 |

8 | {{ section.title }} 9 | {% if section.extra.subtitle %} 10 | {{ section.extra.subtitle }} 11 | {% endif %} 12 |

13 | 14 |
39 | {{ section.content | safe }} 40 |
41 | {% endblock %} 42 | 43 | {% block sidebar %} 44 | {% endblock %} 45 | -------------------------------------------------------------------------------- /templates/doc.html: -------------------------------------------------------------------------------- 1 | {% extends "base.html" %} 2 | 3 | {% block hero %} 4 | {% endblock %} 5 | 6 | {% block content %} 7 |

8 | {{ page.title }} 9 | {% if page.extra.subtitle %} 10 | {{ page.extra.subtitle }} 11 | {% endif %} 12 |

13 | 14 |
39 | {{ page.content | safe }} 40 |
41 | {% endblock %} 42 | 43 | {% block sidebar %} 44 | {{ toc::toc(obj=page) }} 45 | {% endblock %} 46 | -------------------------------------------------------------------------------- /templates/spec.html: -------------------------------------------------------------------------------- 1 | {% extends "base.html" %} 2 | 3 | {% block hero %} 4 | {% endblock %} 5 | 6 | {% block content %} 7 |

8 | {{ section.title }} 9 | {% if section.extra.subtitle %} 10 | {{ section.extra.subtitle }} 11 | {% endif %} 12 |

13 | 14 |
39 | {{ section.content | safe }} 40 |
41 | {% endblock %} 42 | 43 | {% block sidebar %} 44 | {{ toc::toc(obj=section) }} 45 | {% endblock %} 46 | -------------------------------------------------------------------------------- /templates/glossary.html: -------------------------------------------------------------------------------- 1 | {% extends "base.html" %} 2 | 3 | {% block hero %} 4 | {% endblock %} 5 | 6 | {% block content %} 7 |

8 | {{ section.title }} 9 | {% if section.extra.subtitle %} 10 | {{ section.extra.subtitle }} 11 | {% endif %} 12 |

13 | 14 |
39 | {{ section.content | safe }} 40 |
41 | {% endblock %} 42 | 43 | {% block sidebar %} 44 | {{ toc::toc(obj=section) }} 45 | {% endblock %} 46 | -------------------------------------------------------------------------------- /templates/spec_page.html: -------------------------------------------------------------------------------- 1 | {% extends "base.html" %} 2 | 3 | {% block hero %} 4 | {% endblock %} 5 | 6 | {% block content %} 7 |

8 | {{ page.title }} 9 | {% if page.extra.subtitle %} 10 | {{ page.extra.subtitle }} 11 | {% endif %} 12 |

13 | 14 |
39 | {{ page.content | safe }} 40 |
41 | {% endblock %} 42 | 43 | {% block sidebar %} 44 | {{ toc::toc(obj=page) }} 45 | {% endblock %} 46 | -------------------------------------------------------------------------------- /content/docs/_index.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Docs 3 | --- 4 | 5 | While the [OmniBOR Specification][spec] is the official reference information 6 | for what OmniBOR is, how its key concepts are defined, and how producers and 7 | users of OmniBOR data should work, it's not necessarily an approachable guide 8 | to understanding the motivation and purpose behind the design of OmniBOR. 9 | 10 | This documentation is intended to supplement the specification. If this 11 | material and the specification are ever in conflict, the specification 12 | supersedes this content. 13 | 14 | Currently, there are four pieces of documentation: 15 | 16 | * [__Artifact ID__][artifact_id]: An explanation of what Artifact IDs are, how they're defined, 17 | and how to derive them yourself. 18 | * [__Input Manifest__][input_manifest]: An explanation of what Input Manifests are, how they're 19 | structured, stored, and distributed. 20 | * [__Glossary__][glossary]: A glossary of important terms used throughout the OmniBOR 21 | project. 22 | * [__Resources__][resources]: Links to other helpful resources including written materials, 23 | frequently asked questions, and conference talks. 24 | 25 | [spec]: @/spec/_index.md 26 | [artifact_id]: @/docs/artifact-ids.md 27 | [input_manifest]: @/docs/input-manifests.md 28 | [glossary]: @/glossary/_index.md 29 | [resources]: @/resources/_index.md 30 | -------------------------------------------------------------------------------- /content/resources/_index.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Resources 3 | --- 4 | 5 | The following are collected resources related to OmniBOR. 
6 | 7 | ## Frequently Asked Questions 8 | 9 | We maintain a list of answers to [frequently asked questions][faq] about 10 | OmniBOR. 11 | 12 | ## Whitepaper 13 | 14 | OmniBOR was originally envisioned in a whitepaper drafted by Aeva Black. 15 | [View the whitepaper here][whitepaper]. 16 | 17 | ## Talks 18 | 19 | | Event | Video | 20 | |:----------------------------------|:------------------------------------------| 21 | | Cloud Native Security Con EU 2022 | [View][cloud_native_security_con_eu_2022] | 22 | | Supply Chain Security Con NA 2021 | [View][supply_chain_security_con_na_2021] | 23 | 24 | ## Slides 25 | 26 | | Title | Link | Notes | 27 | |:----------------|:--------------|:------------------------------------------------------------------| 28 | | Intro to GitBOM | [View][intro] | These slides predate the project renaming from GitBOM to OmniBOR. | 29 | 30 | [faq]: @/resources/faq.md 31 | [cloud_native_security_con_eu_2022]: https://www.youtube.com/watch?v=2SSkNLWL4UM 32 | [supply_chain_security_con_na_2021]: https://www.youtube.com/watch?v=GKyrsDOse6s&t=546s 33 | [intro]: https://docs.google.com/presentation/d/1fSyRyvYhRYQr-RGm5N1TFcSLQdNV7YYtZmaf2xVwjy4/edit?usp=sharing 34 | [whitepaper]: @/resources/whitepaper.md 35 | -------------------------------------------------------------------------------- /templates/project.html: -------------------------------------------------------------------------------- 1 | {% extends "base.html" %} 2 | 3 | {% block hero %} 4 | {% endblock %} 5 | 6 | {% block content %} 7 |

8 | {{ page.title }} 9 | {% if page.extra.subtitle %} 10 | {{ page.extra.subtitle }} 11 | {% endif %} 12 |

13 | 14 |
39 | {{ page.content | safe }} 40 |
41 | {% endblock %} 42 | 43 | {% block sidebar %} 44 | {% endblock %} 45 | 46 | {% block body_scripts %} 47 | 60 | {% endblock %} 61 | -------------------------------------------------------------------------------- /content/third-party.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Third-Party Integrations 3 | --- 4 | 5 | One of the goals of the OmniBOR project is to integrate the generation of 6 | [Input Manifests][input-manifests] into compilers, linkers, archivers, 7 | bundlers, and containerization tools people already use today. Ideally, we can 8 | achieve a future where Input Manifests are built automatically, and where 9 | anyone distributing software distributes the Input Manifests alongside it. This 10 | is how we can collectively achieve a future of universal transparency through 11 | Artifact Dependency Graphs. 12 | 13 | ## Build Tool Patches 14 | 15 | This goal involves working with a large number of open source projects and 16 | communities. In the spirit of open source, collaboration, and putting in the 17 | work, the OmniBOR Project maintains patches for some existing build tools. 18 | Long-term, our plan is to work with these projects to get OmniBOR generation 19 | integrated upstream. 20 | 21 | Today, we maintain patches for the following tools: 22 | 23 | | Name | Patch | 24 | |:----------|:-----------------| 25 | | GCC | [Link][gcc] | 26 | | LLVM | [Link][llvm] | 27 | | Binutils | [Link][binutils] | 28 | | GNU Patch | [Link][patch] | 29 | 30 | If you're interested in contributing to these patches, helping maintain them, 31 | or getting this work upstreamed into their respective projects, we'd love for 32 | you to [get involved][contribute]! 
33 | 34 | [input-manifests]: @/docs/input-manifests.md 35 | [gcc]: https://github.com/omnibor/gcc-omnibor 36 | [llvm]: https://github.com/omnibor/llvm-omnibor 37 | [binutils]: https://github.com/omnibor/binutils-omnibor 38 | [patch]: https://github.com/omnibor/patch-omnibor 39 | [contribute]: @/contribute.md 40 | -------------------------------------------------------------------------------- /netlify.toml: -------------------------------------------------------------------------------- 1 | #============================================================================ 2 | # General Build Configuration 3 | #---------------------------------------------------------------------------- 4 | 5 | [build] 6 | publish = "public" 7 | 8 | #============================================================================ 9 | # Deployment Contexts 10 | # 11 | # Learn more: https://docs.netlify.com/configure-builds/file-based-configuration/#deploy-contexts 12 | #---------------------------------------------------------------------------- 13 | 14 | # Deploys coming from `main`. 15 | [context.production] 16 | command = "zola build && tailwindcss -i styles/main.css -o public/main.css" 17 | environment = {ZOLA_VERSION = "0.13.0"} 18 | 19 | # Deploys coming from pull requests. 20 | [context.deploy-preview] 21 | command = "zola build --base-url $DEPLOY_PRIME_URL && tailwindcss -i styles/main.css -o public/main.css" 22 | environment = {ZOLA_VERSION = "0.13.0"} 23 | 24 | # Deploys coming from branches other than `main`. 
25 | [context.branch-deploy] 26 | command = "zola build --base-url $DEPLOY_PRIME_URL && tailwindcss -i styles/main.css -o public/main.css" 27 | environment = {ZOLA_VERSION = "0.13.0"} 28 | 29 | #============================================================================ 30 | # Redirects 31 | #---------------------------------------------------------------------------- 32 | 33 | [[redirects]] 34 | from = "/community" 35 | to = "/contribute" 36 | 37 | [[redirects]] 38 | from = "/glossary/git/#git-ref" 39 | to = "/glossary/git/#git-object-id-gitoid" 40 | 41 | [[redirects]] 42 | from = "https://gitbom.dev/*" 43 | to = "https://omnibor.io/:splat" 44 | status = 301 45 | force = true 46 | -------------------------------------------------------------------------------- /templates/includes/header.html: -------------------------------------------------------------------------------- 1 |
2 | 37 |
38 | -------------------------------------------------------------------------------- /content/blog/2024-10-16-new-website.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: A New OmniBOR Website 3 | authors: 4 | - Andrew Lilley Brinker 5 | --- 6 | 7 | The OmniBOR Project has a new website, new documentation, and for the first 8 | time, a project blog! 9 | 10 | 11 | 12 | If you're reading this, then you can already see the new look of the OmniBOR 13 | website. This new website does include a substantial redesign, intended to work 14 | better on both desktop and mobile, but it also includes a substantial addition 15 | of new content! 16 | 17 | The project website now hosts the [OmniBOR specification][spec]. Our intent is 18 | to keep this updated with any new release of the specification. If you'd like 19 | to track the pre-release updates of new versions, we also link to the 20 | [`omnibor/spec`][spec_repo] repository. 21 | 22 | We've also added new [documentation][docs] intended to provide a more 23 | accessible explanation of [Artifact IDs][artifact-ids], 24 | [Input Manifests][input-manifests], and the other key concepts underlying what 25 | we're doing with OmniBOR. If you're brand new to OmniBOR, we recommend you 26 | start with this new documentation before diving into the more detailed 27 | specification. 28 | 29 | Finally, the new site has [a blog][blog]! This is the first post on that blog. In the 30 | future we'll be using this to provide updates on the project, to share 31 | information about work we're doing around the open source ecosystem, and more. 32 | 33 | As always, OmniBOR is a project that benefits enormously from the 34 | involvement of a broad community of people. If you're interested in solving 35 | the challenge of universal software identity, we'd love for you to [get 36 | involved][contribute]! 
37 | 38 | [spec]: @/spec/_index.md 39 | [spec_repo]: https://github.com/omnibor/spec 40 | [docs]: @/docs/_index.md 41 | [artifact-ids]: @/docs/artifact-ids.md 42 | [input-manifests]: @/docs/input-manifests.md 43 | [blog]: @/blog/_index.md 44 | [contribute]: @/contribute.md 45 | -------------------------------------------------------------------------------- /templates/blog_post.html: -------------------------------------------------------------------------------- 1 | {% extends "base.html" %} 2 | 3 | {% block hero %} 4 | {% endblock %} 5 | 6 | {% block content %} 7 |

8 | {{ page.title }} 9 | {% if page.extra.subtitle %} 10 | {{ page.extra.subtitle }} 11 | {% endif %} 12 |

13 |
14 | {% if page.authors %} 15 | 16 | {% set num_authors = page.authors | length %} 17 | {% if num_authors != 0 %} 18 | Written by 19 | {% for author in page.authors -%} 20 | {# I WILL produce a proper Oxford comma #} 21 | {%- if not loop.first and not loop.last -%}, {%- elif loop.last and num_authors != 1 -%}{% if num_authors >= 3%},{% endif %} and {%- endif -%}{{ author }} 22 | {%- endfor %} 23 | {% endif %} 24 | 25 | {% endif %} 26 | 27 | Posted on {{ page.date | date(format="%B %-d, %Y") }} 28 | 29 |
30 | 31 |
56 | {{ page.content | safe }} 57 |
58 | {% endblock %} 59 | 60 | {% block sidebar %} 61 | {% endblock %} 62 | -------------------------------------------------------------------------------- /content/contribute.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Contribute" 3 | --- 4 | 5 | The OmniBOR Project is happy to accept contributions from anyone! Part of our 6 | ethos is that __OmniBOR is for everyone__. Below you'll find some guidance for 7 | making contributions to different parts of the OmniBOR project! 8 | 9 | ## Contributing to the OmniBOR Specification 10 | 11 | The OmniBOR specification is the central location where we define Artifact IDs, 12 | Input Manifests, and how they should be constructed, stored, and distributed. 13 | 14 | Contributions to the specification can take many shapes. For small, 15 | non-semantic contributions, like rewriting prose for clarity or fixing typos 16 | and grammatical mistakes, we encourage you to just make a Pull Request directly 17 | to the specification repository on GitHub. 18 | 19 | For larger contributions, especially those which change the semantic meaning of 20 | the spec, we work by a consensus model which involves: 21 | 22 | 1. Proposing changes 23 | 2. Discussing those changes among the OmniBOR Working Group 24 | 3. Modifying proposals based on feedback 25 | 4. If consensus is reached, making edits to the specification 26 | 27 | The process of gathering feedback currently happens largely synchronously during 28 | the weekly OmniBOR Working Group calls, held on Zoom every Monday from 10am to 29 | 11am Pacific Time. 
30 | 31 | ### Working Group Meetings 32 | 33 | [iCal-format calendar subscription](https://calendar.google.com/calendar/ical/rqmtkd0ucekn9obagmo9v4b6s4%40group.calendar.google.com/public/basic.ics) 34 | 35 | {{ calendar(id = "rqmtkd0ucekn9obagmo9v4b6s4") }} 36 | 37 | ## Contributing Code to the OmniBOR Project 38 | 39 | The OmniBOR Project maintains a number of existing software projects, including: 40 | 41 | - First-party implementations of the OmniBOR specification 42 | - Patches for third-party tools to support producing OmniBOR data 43 | - Additional tools to support interacting with OmniBOR data 44 | 45 | We are happy to accept contributions to any of these! 46 | 47 | Each project has its own license and may have specific unique contribution 48 | guidance, so you should review the policies of the specific repository before 49 | contributing. For first-party implementations, we default to permissive 50 | open source licenses like Apache 2.0 or MIT. For patches to third-party tools, 51 | we match the licensing of the upstream tool, as our goal is to eventually 52 | merge any patches we maintain back into upstream so others can make use of our 53 | changes. 
54 | -------------------------------------------------------------------------------- /templates/includes/footer.html: -------------------------------------------------------------------------------- 1 | 2 | 54 | -------------------------------------------------------------------------------- /config.toml: -------------------------------------------------------------------------------- 1 | base_url = "https://omnibor.io" 2 | title = "OmniBOR" 3 | compile_sass = false 4 | build_search_index = true 5 | 6 | [markdown] 7 | 8 | highlight_code = true 9 | highlight_theme = "ayu-light" 10 | 11 | [extra] 12 | 13 | nav = [ 14 | { name = "Spec", url = "@/spec/_index.md", pill = "v0.1" }, 15 | { name = "Docs", url = "@/docs/_index.md" }, 16 | { sep = true }, 17 | { name = "Blog", url = "@/blog/_index.md" }, 18 | { name = "Project", url = "@/project.md" }, 19 | { name = "Contribute", url = "@/contribute.md" }, 20 | { sep = true }, 21 | { icon = "github", url = "https://github.com/omnibor" }, 22 | ] 23 | 24 | footer = [ 25 | [ 26 | { name = "Spec", title = true }, 27 | { name = "Stable", url = "@/spec/v0.1/_index.md", pill = "v0.1" }, 28 | { name = "Latest", url = "https://github.com/omnibor/spec", external = true }, 29 | ], 30 | [ 31 | { name = "Tools", title = true }, 32 | { name = "OmniBOR CLI", url = "https://github.com/omnibor/omnibor-rs/tree/main/omnibor-cli", external = true }, 33 | { name = "bomsh", url = "https://github.com/omnibor/bomsh", external = true }, 34 | { name = "jbor", url = "https://github.com/omnibor/jbor", external = true }, 35 | { name = "Third-Party Tools", url = "@/third-party.md" }, 36 | ], 37 | [ 38 | { name = "Libraries", title = true }, 39 | { name = "Rust", url = "https://github.com/omnibor/omnibor-rs", external = true }, 40 | { name = "Go", url = "https://github.com/omnibor/omnibor-go", external = true }, 41 | { name = ".NET", url = "https://github.com/omnibor/omnibor-dotnet", external = true }, 42 | { name = "Python", url = 
"https://github.com/omnibor/omnibor-py", external = true }, 43 | ], 44 | [ 45 | { name = "Docs", title = true }, 46 | { name = "Artifact IDs", url = "@/docs/artifact-ids.md" }, 47 | { name = "Input Manifests", url = "@/docs/input-manifests.md" }, 48 | { name = "Glossary", url = "@/glossary/_index.md" }, 49 | { name = "Resources", url = "@/resources/_index.md" }, 50 | ], 51 | [ 52 | { name = "Info", title = true }, 53 | { name = "Contribute", url = "@/contribute.md" }, 54 | { name = "Project", url = "@/project.md" }, 55 | ], 56 | [ 57 | { name = "Social", title = true }, 58 | { name = "YouTube", url = "https://www.youtube.com/@omnibor", external = true }, 59 | { name = "Twitter / X", url = "https://x.com/omnibor", external = true }, 60 | ], 61 | ] 62 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | public/ 2 | 3 | # Logs 4 | logs 5 | *.log 6 | npm-debug.log* 7 | yarn-debug.log* 8 | yarn-error.log* 9 | lerna-debug.log* 10 | .pnpm-debug.log* 11 | 12 | # Diagnostic reports (https://nodejs.org/api/report.html) 13 | report.[0-9]*.[0-9]*.[0-9]*.[0-9]*.json 14 | 15 | # Runtime data 16 | pids 17 | *.pid 18 | *.seed 19 | *.pid.lock 20 | 21 | # Directory for instrumented libs generated by jscoverage/JSCover 22 | lib-cov 23 | 24 | # Coverage directory used by tools like istanbul 25 | coverage 26 | *.lcov 27 | 28 | # nyc test coverage 29 | .nyc_output 30 | 31 | # Grunt intermediate storage (https://gruntjs.com/creating-plugins#storing-task-files) 32 | .grunt 33 | 34 | # Bower dependency directory (https://bower.io/) 35 | bower_components 36 | 37 | # node-waf configuration 38 | .lock-wscript 39 | 40 | # Compiled binary addons (https://nodejs.org/api/addons.html) 41 | build/Release 42 | 43 | # Dependency directories 44 | node_modules/ 45 | jspm_packages/ 46 | 47 | # Snowpack dependency directory (https://snowpack.dev/) 48 | web_modules/ 49 | 50 | # 
TypeScript cache 51 | *.tsbuildinfo 52 | 53 | # Optional npm cache directory 54 | .npm 55 | 56 | # Optional eslint cache 57 | .eslintcache 58 | 59 | # Optional stylelint cache 60 | .stylelintcache 61 | 62 | # Microbundle cache 63 | .rpt2_cache/ 64 | .rts2_cache_cjs/ 65 | .rts2_cache_es/ 66 | .rts2_cache_umd/ 67 | 68 | # Optional REPL history 69 | .node_repl_history 70 | 71 | # Output of 'npm pack' 72 | *.tgz 73 | 74 | # Yarn Integrity file 75 | .yarn-integrity 76 | 77 | # dotenv environment variable files 78 | .env 79 | .env.development.local 80 | .env.test.local 81 | .env.production.local 82 | .env.local 83 | 84 | # parcel-bundler cache (https://parceljs.org/) 85 | .cache 86 | .parcel-cache 87 | 88 | # Next.js build output 89 | .next 90 | out 91 | 92 | # Nuxt.js build / generate output 93 | .nuxt 94 | dist 95 | 96 | # Gatsby files 97 | .cache/ 98 | # Comment in the public line in if your project uses Gatsby and not Next.js 99 | # https://nextjs.org/blog/next-9-1#public-directory-support 100 | # public 101 | 102 | # vuepress build output 103 | .vuepress/dist 104 | 105 | # vuepress v2.x temp and cache directory 106 | .temp 107 | .cache 108 | 109 | # Docusaurus cache and generated files 110 | .docusaurus 111 | 112 | # Serverless directories 113 | .serverless/ 114 | 115 | # FuseBox cache 116 | .fusebox/ 117 | 118 | # DynamoDB Local files 119 | .dynamodb/ 120 | 121 | # TernJS port file 122 | .tern-port 123 | 124 | # Stores VSCode versions used for testing VSCode extensions 125 | .vscode-test 126 | 127 | # yarn v2 128 | .yarn/cache 129 | .yarn/unplugged 130 | .yarn/build-state.yml 131 | .yarn/install-state.gz 132 | .pnp.* 133 | -------------------------------------------------------------------------------- /templates/index.html: -------------------------------------------------------------------------------- 1 | {% extends "base.html" %} 2 | 3 | {% block hero %} 4 |

5 | Reproducible identifiers & fine-grained build dependency tracking for software artifacts. 6 |

7 | 8 |
9 | {# Text block #} 10 |
11 |

12 | OmniBOR defines two key concepts, Artifact IDs and 13 | Input Manifests, that enable anyone to 14 | independently produce the same identifier for any software artifact, and to detect any artifact built 15 | with vulnerable inputs.

16 |
17 | 18 | {# Artifact ID block #} 19 |
20 |
21 |
{{ icon::icon(name="tag", classes="w-5 h-5") }}
22 |
23 |

Artifact IDs

24 |

Reproducible identifiers based only on an artifact itself.

25 | Learn more ↗ 26 |
27 |
28 |
29 | 30 | {# Input Manifest block #} 31 |
32 |
33 |
{{ icon::icon(name="book-open", classes="w-5 h-5") }}
34 |
35 |

Input Manifests

36 |

Records of Artifact IDs for artifact build inputs.

37 | Learn more ↗ 38 |
39 |
40 |
41 |
42 | {% endblock %} 43 | 44 | {% block content %} 45 | {% endblock %} 46 | 47 | {% block sidebar %} 48 | {% endblock %} 49 | -------------------------------------------------------------------------------- /content/spec/v0.1/annex-b.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: OmniBOR Specification, Version 0.1 3 | extra: 4 | subtitle: Annex B — ELF Embedding 5 | --- 6 | 7 | Annex B contains a method of embedding Input Manifest Identifiers into ELF 8 | files. 9 | 10 | ## Input Manifest Identifiers 11 | 12 | Input Manifest Identifiers are Artifact Identifiers (Git Object Identifiers 13 | \[GitOIDs\]) for Artifact Input Manifests. They identify an Artifact Input 14 | Manifest and MAY be embedded into an artifact to relate the artifact to its 15 | Artifact Input Manifests. 16 | 17 | If an ELF artifact contains an embedded Input Manifest Identifier, then 18 | implementations MUST conform to the format specified in this document. 19 | 20 | Note that multiple Input Manifests MUST be produced for a single artifact, 21 | reflecting the use of different hash functions to produce the Artifact 22 | Identifiers. 23 | 24 | ## Input Manifest Identifier persistence in ELF Objects/Executables 25 | 26 | Input Manifest Identifiers MUST be persisted by build tools when they build 27 | an artifact and produce an Artifact Input Manifest for that artifact. 28 | 29 | When persisting Input Manifest Identifiers into an ELF object or an ELF 30 | executable, the build tool MUST create a [section][elf_section] 31 | `.note.omnibor` and place the Input Manifest Identifiers in the descriptor 32 | field of the note entry. This section MUST be of type `SHT_NOTE` and MUST have 33 | the attribute `SHF_ALLOC`. Multiple Note entries MUST be created, one for each 34 | Artifact Identifier type when multiple Artifact Identifier types are involved. 
35 | Each note entry MUST contain the following fields in the same order as given 36 | below: 37 | 38 | 1. `namesz` (4 bytes): This field MUST be set to a value of `8`, the length of 39 | the 'owner' field `OMNIBOR\0` in bytes. 40 | 2. `descz` (4 bytes): This field MUST contain the length of the Input Manifest 41 | Identifier in bytes, including a byte for the null terminator. 42 | 3. `type` (4 bytes): This field MUST contain the value associated with one of 43 | the reserved Artifact Identifier types. The values for the reserved types 44 | are in the range of `0x00000000` to `0x7fffffff`. Permissible types with 45 | reserved values are: 46 | 47 | ``` 48 | NT_GITOID_BLOB_SHA1 = 0x1, 49 | NT_GITOID_BLOB_SHA256 = 0x2, 50 | ``` 51 | 52 | 4. `owner` (8 bytes): This field MUST contain the string `OMNIBOR\0`, padded to 53 | 8 bytes. 54 | 5. `descriptor`: This field MUST contain the Input Manifest Identifiers as raw 55 | bytes. The length of this field is the same as the value in the `descz` field. 56 | 57 | When recording multiple Input Manifest Identifiers in the note section, 58 | 59 | 1. There MUST be only one note entry for each Input Manifest Identifier type. 60 | 2. The note entries MUST be in ascending order of Input Manifest Identifier 61 | type. 62 | 63 | Conforming build tools MUST generate all Input Manifest Identifier types, 64 | currently SHA1 and SHA256 Artifact Identifiers. 65 | 66 | [elf_section]: https://refspecs.linuxfoundation.org/LSB_3.0.0/LSB-PDA/LSB-PDA.junk/sections.html 67 | -------------------------------------------------------------------------------- /content/spec/v0.1/annex-a.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: OmniBOR Specification, Version 0.1 3 | extra: 4 | subtitle: Annex A — File System Storage 5 | --- 6 | 7 | Annex A documents known methods of persisting OmniBOR Documents to various 8 | stores.
9 | 10 | ## Input Manifest persistence by a Build Tool to its local filesystem 11 | 12 | If a build tool persists an Input Manifest to its local filesystem, the build 13 | tool should write out the Input Manifest to 14 | `${OMNIBOR_DIR}/objects/${Artifact Identifier Type uri prefix with ':' replaced by '_'}/${Input Manifest Identifier:0:2}/${Input Manifest Identifier:2:}` 15 | where `${Input Manifest Identifier}` is the Input Manifest Identifier in lowercase 16 | hexadecimal with leading zeros NOT suppressed. 17 | 18 | Example: 19 | 20 | If `OMNIBOR_DIR=.omnibor` then the Input Manifest with `gitoid:blob:sha1` Input 21 | Manifest Identifier `0e8efd4cdf0d5bafcfcae658c2662a73b199b301` would be stored 22 | in: 23 | 24 | ``` 25 | .omnibor/objects/gitoid_blob_sha1/0e/8efd4cdf0d5bafcfcae658c2662a73b199b301 26 | ``` 27 | 28 | ## Build tool persistence of related metadata 29 | 30 | A build tool may persist additional metadata that makes reference to the 31 | Artifact Dependency Graph (ADG). It should persist such metadata to a 32 | subdirectory of the directory to which the output artifact is being 33 | written, of the form: `${OMNIBOR_DIR}/metadata/${context}/`. 34 | 35 | For metadata specific to a particular build tool, `${context}` should be a name 36 | uniquely associated with the build tool. For example: 37 | 38 | - `${OMNIBOR_DIR}/metadata/llvm` 39 | - `${OMNIBOR_DIR}/metadata/clang` 40 | - `${OMNIBOR_DIR}/metadata/go` 41 | - `${OMNIBOR_DIR}/metadata/rustc` 42 | - `${OMNIBOR_DIR}/metadata/gcc` 43 | 44 | Build tools should report their selection of `${context}` subdirectory name to 45 | the OmniBOR spec for inclusion in a list to preclude `${context}` collision. 46 | 47 | Metadata persisted by multiple build tools in the same way should be documented 48 | in a specification for that metadata. Such specs must include the `${context}` 49 | for that metadata. Such specs should be reported to the OmniBOR spec for 50 | inclusion in a list to preclude `${context}` collision.
For example, if a 51 | group of build tools decide to store metadata about file locations in a common 52 | format, they might choose to define a `${context}` `filelocation` in which case 53 | the metadata would be stored in `${OMNIBOR_DIR}/metadata/filelocation`. 54 | 55 | Subdirectory structure, filenaming, and file schema below 56 | `${OMNIBOR_DIR}/metadata/${context}/` are at the discretion of the build tool 57 | for build tool specific metadata or the metadata spec for common metadata. 58 | 59 | ## Build tool selection of `OMNIBOR_DIR` 60 | 61 | `OMNIBOR_DIR` may be set by the following methods, listed in order of 62 | precedence: 63 | 64 | 1. A build tool specific flag 65 | 2. A non-empty env variable named `OMNIBOR_DIR` 66 | 67 | The absence of specification of a location to write omnibor data via either the 68 | build tool specific flag or `OMNIBOR_DIR` variable may be taken as a signal to 69 | skip OmniBOR generation. 70 | -------------------------------------------------------------------------------- /content/project.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Project" 3 | template: project.html 4 | --- 5 | 6 | The OmniBOR project consists of: 7 | 8 | - The [OmniBOR specification][spec] 9 | - A set of first-party OmniBOR implementations, including: 10 | - [`omnibor-rs`][rust]: A Rust implementation 11 | - [`omnibor-go`][go]: A Go implementation 12 | - [`omnibor-dotnet`][dotnet]: A .NET implementation 13 | - [`omnibor-py`][python]: A Python implementation 14 | - A set of patches for third-party software to add OmniBOR support, including: 15 | - [`patch-omnibor`][patch]: A patched version of GNU `patch`. 16 | - [`gcc-omnibor`][gcc]: A patched version of GCC. 17 | - [`binutils-omnibor`][binutils]: A patched version of `binutils`. 18 | - [`llvm-omnibor`][llvm]: A patched version of LLVM. 
19 | - Miscellaneous other tools related to OmniBOR, including: 20 | - [`bomsh`][bomsh]: Shell scripts for interacting with OmniBOR data. 21 | - [`jbor`][jbor]: A Java agent to log OmniBOR Artifact IDs. 22 | - The [OmniBOR website][site] 23 | - OmniBOR project spaces, including: 24 | - GitHub Discussions on any OmniBOR repository. 25 | - The weekly OmniBOR Working Group meetings. 26 | - Any other meetings or discussion spaces operated by the OmniBOR project. 27 | 28 | ## Code of Conduct 29 | 30 | All OmniBOR projects and spaces are covered by the [OmniBOR Project Code of 31 | Conduct][coc]. 32 | 33 | ## Governance 34 | 35 | The OmniBOR Project is governed by consensus among active project 36 | participants. Generally, being an "active project participant" means 37 | participating in the weekly OmniBOR Working Group meetings, currently 38 | held over Zoom from 10am to 11am Pacific Time on Mondays. 39 | 40 | Proposals for improvements to the project are generally discussed 41 | during these meetings, and when consensus is reached on a design, a formal 42 | proposal is made to the relevant repository and the change is merged. 43 | 44 | The project does have a Core Team of long-term active participants.
The 45 | OmniBOR Core Team currently consists of: 46 | 47 | 55 | 56 | [spec]: https://github.com/omnibor/spec 57 | [coc]: https://github.com/omnibor/spec/blob/main/code_of_conduct.md 58 | [rust]: https://github.com/omnibor/omnibor-rs 59 | [go]: https://github.com/omnibor/omnibor-go 60 | [dotnet]: https://github.com/omnibor/omnibor-dotnet 61 | [python]: https://github.com/omnibor/omnibor-py 62 | [patch]: https://github.com/omnibor/patch-omnibor 63 | [gcc]: https://github.com/omnibor/gcc-omnibor 64 | [binutils]: https://github.com/omnibor/binutils-omnibor 65 | [llvm]: https://github.com/omnibor/llvm-omnibor 66 | [bomsh]: https://github.com/omnibor/bomsh 67 | [jbor]: https://github.com/omnibor/jbor 68 | [site]: https://omnibor.io 69 | -------------------------------------------------------------------------------- /content/spec/v0.1/annex-c.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: OmniBOR Specification, Version 0.1 3 | extra: 4 | subtitle: Annex C – Source Embedding 5 | --- 6 | 7 | Annex C contains a method of embedding an Input Manifest Identifier into source code files. 8 | 9 | ### Embedded Input Manifest Identifier 10 | 11 | Most source code files are hand coded by humans. Some however are generated from other input(s) by a build tool. 
12 | 13 | A build tool outputting a source code file may embed the Input Manifest Identifier for the output source code file into 14 | the output source code file by adding a comment line containing a string of the form: 15 | 16 | ``` 17 | OmniBOR-Input-Manifest-ID: [ ${comma separated list of Input Manifest Identifier URIs} ] 18 | ``` 19 | 20 | For a file with C commenting semantics (like C, C++, Java, Go, etc) a concrete example might be: 21 | 22 | ``` 23 | // OmniBOR-Input-Manifest-ID: [ gitoid:blob:sha1:261eeb9e9f8b2b4b0d119366dda99c6fd7d35c64, gitoid:blob:sha256:09c825ac02df9150e4f93d12ba1da5d1ff5846c3e62503c814aa3a300c535772 ] 24 | ``` 25 | 26 | For a file with shell commenting semantics (like a shell script, Python, etc) a concrete example might be: 27 | ``` 28 | # OmniBOR-Input-Manifest-ID: [ gitoid:blob:sha1:261eeb9e9f8b2b4b0d119366dda99c6fd7d35c64, gitoid:blob:sha256:09c825ac02df9150e4f93d12ba1da5d1ff5846c3e62503c814aa3a300c535772 ] 29 | ``` 30 | 31 | When interpreting an OmniBOR-Input-Manifest-ID comment line, a reader should ignore any leading and trailing spaces around '[' or ']' 32 | or ','. 33 | 34 | ### Placement of OmniBOR-Input-Manifest-ID Comment Line 35 | 36 | The OmniBOR-Input-Manifest-ID comment line should be placed as the last line in the source code file. The OmniBOR-Input-Manifest-ID comment line should be preceded by a blank line to ensure it is not interpreted as part of another comment block. 37 | 38 | A tool reading the source code file should interpret the last OmniBOR-Input-Manifest-ID comment line it encounters in the file as being the Input Manifest Identifier, and ignore previous comment lines in the file which may contain Input Manifest Identifiers. 39 | 40 | Example: 41 | 42 | If the input source code file begins with: 43 | 44 | ```go 45 | // Code generated by stringer DO NOT EDIT. 46 | 47 | import ( 48 | "fmt" 49 | ) 50 | ...
51 | ``` 52 | 53 | The output source code file should look like: 54 | ```go 55 | // Code generated by stringer DO NOT EDIT. 56 | 57 | import ( 58 | "fmt" 59 | ) 60 | ... 61 | 62 | // OmniBOR-Input-Manifest-ID: [ gitoid:blob:sha1:261eeb9e9f8b2b4b0d119366dda99c6fd7d35c64, gitoid:blob:sha256:09c825ac02df9150e4f93d12ba1da5d1ff5846c3e62503c814aa3a300c535772 ] 63 | ``` 64 | 65 | If the input source code file begins with: 66 | 67 | ```c 68 | /* 69 | * Copyright 2023 Yoyodyne Inc 70 | * SPDX-License-Identifier: 71 | */ 72 | 73 | #include 74 | int main() { 75 | // printf() displays the string inside quotation 76 | printf("Hello, World!"); 77 | return 0; 78 | } 79 | ``` 80 | 81 | The output source code file should look like: 82 | 83 | ```c 84 | /* 85 | * Copyright 2023 Yoyodyne Inc 86 | * SPDX-License-Identifier: 87 | */ 88 | 89 | #include 90 | int main() { 91 | // printf() displays the string inside quotation 92 | printf("Hello, World!"); 93 | return 0; 94 | } 95 | 96 | //* OmniBOR-Input-Manifest-ID: [ gitoid:blob:sha1:261eeb9e9f8b2b4b0d119366dda99c6fd7d35c64, gitoid:blob:sha256:09c825ac02df9150e4f93d12ba1da5d1ff5846c3e62503c814aa3a300c535772 ] */ 97 | ``` 98 | 99 | ### Tools which mutate existing source code files 100 | 101 | Many source code generation tools, like patch, specifically mutate an existing input source code file which may contain 102 | an existing OmniBOR-Input-Manifest comment. In such circumstances the tool should either 103 | 104 | 1. Replace an existing OmniBOR-Input-Manifest comment if found 105 | 2. Insert the OmniBOR-Input-Manifest normally, which will cause it to be placed *after* the existing OmniBOR-Input-Manifest comment line. 
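The reading rules above (take the last OmniBOR-Input-Manifest-ID comment line, ignore spaces around `[`, `]`, and `,`) can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the specification: the function name `find_input_manifest_ids` and the regular expression are assumptions made for this example, and the identifiers in the usage note are shortened placeholders.

```python
import re

# Matches the marker anywhere in a line, so it works behind `//`, `#`, or
# other comment prefixes. The bracketed list is captured up to the first `]`.
_MARKER = re.compile(r"OmniBOR-Input-Manifest-ID:\s*\[(.*?)\]")


def find_input_manifest_ids(source_text):
    """Return the identifier URIs from the last marker line, or an empty list."""
    found = []
    for line in source_text.splitlines():
        match = _MARKER.search(line)
        if match:
            # A later marker line overrides any earlier one in the same file.
            found = [
                part.strip()
                for part in match.group(1).split(",")
                if part.strip()
            ]
    return found
```

Given a file that contains an older marker line followed by a newer one, for instance after a tool such as `patch` has appended a fresh line, this function returns only the identifiers from the newer line.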
106 | -------------------------------------------------------------------------------- /templates/base.html: -------------------------------------------------------------------------------- 1 | {# IMPORT MACROS #} 2 | {% import "macros/icon.html" as icon %} 3 | {% import "macros/toc.html" as toc %} 4 | 5 | {# CONFIGURE title AND nav_title #} 6 | {% if page and page.title %} 7 | {% set nav_title = page.title ~ " — " ~ config.title %} 8 | {% set title = page.title %} 9 | {% elif section and section.title %} 10 | {% set nav_title = section.title ~ " — " ~ config.title %} 11 | {% set title = section.title %} 12 | {% else %} 13 | {% set nav_title = config.title %} 14 | {% set title = config.title %} 15 | {% endif %} 16 | 17 | {# CONFIGURE description #} 18 | {% if page and page.description %} 19 | {% set description = page.description %} 20 | {% elif section and section.description %} 21 | {% set description = section.description %} 22 | {% else %} 23 | {% set description = config.description %} 24 | {% endif %} 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | {{ nav_title }} 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 53 | 54 | {% block scripts %}{% endblock %} 55 | 56 | 57 | {% include "includes/header.html" %} 58 | 59 |
60 |
61 |
62 | {% block hero %}{% endblock %} 63 |
64 | 65 |
66 |
67 | {% block content %}{% endblock %} 68 |
69 |
70 | {% block sidebar %}{% endblock %} 71 |
72 |
73 |
74 |
75 | 76 |
77 | {% block footer %} 78 | {% include "includes/footer.html" %} 79 | {% endblock %} 80 |
81 | 82 | 88 | 89 | {# Smooth scrolling on anchor click for the current page. #} 90 | 105 | 106 | {% block body_scripts %}{% endblock %} 107 | 108 | 109 | -------------------------------------------------------------------------------- /content/resources/faq.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Frequently Asked Questions 3 | --- 4 | 5 | This FAQ is a good-faith attempt by the OmniBOR community to answer common 6 | OmniBOR questions. This document will evolve over time and as the community 7 | grows. 8 | 9 | To propose a new question, please open [an issue][site_issue]. 10 | 11 | To propose a question-answer pair, please open [a pull request][site_pr] 12 | updating this file. 13 | 14 | ## GitOID's are based on SHA-1, isn't SHA-1 broken? 15 | 16 | Git itself still uses SHA-1, and we'd like to be compatible with git and tools 17 | that already use gitoids. We plan to update to another hashing algorithm 18 | if/when git updates. 19 | 20 | [shattered.io][shattered] has found an impressive attack on SHA-1 in PDF files. 21 | There are [misconceptions about what that means][shattered_misconceptions]. 22 | 23 | GitHub has published some analysis of its [implications for git][shattered_git]. 24 | 25 | ## Git has been considering moving to SHA-256 for some time. Why doesn't OmniBOR simply adopt that? 26 | 27 | Great question. We might — when Git does. 28 | 29 | ## Why isn't information about the compiler or linker included in the OmniBOR? 30 | 31 | Our view is that build environment information that does not affect the build 32 | output should not be represented in the OmniBOR. Doing so would invalidate the 33 | characteristics of **Uniqueness** and **Artifact Identity**. 34 | 35 | ## Why isn't metadata included in the OmniBOR? 36 | 37 | OmniBOR seeks to have the following characteristics: 38 | 39 | 1. **Artifact Equivalence**: Two artifacts are equivalent if and only if they 40 | are bit-for-bit identical. 
41 | 2. **Artifact Identity**: Independent parties derive the same artifact identity 42 | when presented with equivalent artifacts. 43 | 3. **Immutability**: An identified artifact can not be modified without changing 44 | its identity. 45 | 4. **Uniqueness**: An artifact can have precisely *one* artifact identity graph. 46 | All equivalent artifacts have the same graph. 47 | 48 | The uniqueness requirement is what drives the exclusion of metadata from 49 | OmniBOR. 50 | 51 | ## Will the generation of artifact dependency graphs slow down build processes? Will the graphs be very large? 52 | 53 | We don't think so and would be delighted to receive data from very large 54 | projects that would either challenge or validate this assumption. 55 | 56 | ## What about files with duplicate hashes? 57 | 58 | We don't think this will be a problem because OmniBOR does not include any 59 | metadata, such as provenance, timestamp, and licence — the domain of SBOMs. 60 | 61 | While duplicate hashes of empty files and regularly copied files (such as 62 | LICENSE files) are guaranteed to occur, this does not affect the security 63 | properties of OmniBOR. 64 | 65 | ## How do [Software Heritage Foundation][swh] identifiers relate to OmniBOR Identifiers? 66 | 67 | [Software Heritage Foundation Identifiers][swhid] use 68 | [Git Object IDs][swhid_gitoid] as part of their 69 | [core identifiers][swhid_coreids]: 70 | 71 | > SWHIDs for contents, directories, revisions, and releases are, at present, 72 | > compatible with the Git way of computing identifiers for its objects. The 73 | > part of a SWHID for a content object is the Git blob identifier 74 | > of any file with the same content; for a revision it is the Git commit 75 | > identifier for the same revision, etc. 76 | 77 | OmniBOR uses Git Object IDs as the entire Artifact ID. 
78 | 79 | Whereas SWHIDs' core identifier includes additional metadata (see 80 | [SWHID Syntax][swhid_metadata]): 81 | 82 | ``` 83 | ::= "swh" ":" ":" ":" ; 84 | ``` 85 | 86 | … the Git Object ID is the object's identifier in an Input Manifest. 87 | 88 | ``` 89 | ::= 90 | ``` 91 | 92 | The scheme in which SWHIDs are used is also different from the scheme in which 93 | OmniBOR Artifact IDs are used in an Input Manifest. 94 | 95 | [site_issue]: https://github.com/omnibor/site/issues 96 | [site_pr]: https://github.com/omnibor/site/pulls 97 | [shattered]: https://shattered.io/ 98 | [shattered_misconceptions]: https://manishearth.github.io/blog/2017/02/26/clarifying-misconceptions-about-shattered/ 99 | [shattered_git]: https://github.blog/2017-03-20-sha-1-collision-detection-on-github-com/ 100 | [swh]: https://www.softwareheritage.org/ 101 | [swhid]: https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html#persistent-identifiers 102 | [swhid_gitoid]: https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html#git-compatibility 103 | [swhid_coreids]: https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html#core-identifiers 104 | [swhid_metadata]: https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html#syntax 105 | -------------------------------------------------------------------------------- /content/docs/input-manifests.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Input Manifests 3 | template: doc.html 4 | --- 5 | 6 | Input Manifests, alongside [Artifact IDs][artifact_ids], are one half of the 7 | equation for OmniBOR. Input Manifests are how OmniBOR records the inputs used 8 | to build software artifacts, and form the basis for how OmniBOR can allow 9 | consumers of software to build fine-grained _Artifact Dependency Graphs_ (ADGs) 10 | that enable rapid discovery of vulnerable components and much more. 
11 | 12 | ## What is an Input Manifest? 13 | 14 | An Input Manifest is a small text file that records information about 15 | the inputs used to build a software artifact. By "inputs" we mean anything 16 | provided to a build tool in order to produce the artifact. For example, when 17 | building a project written in the C programming language, a `.o` file 18 | (an object file) built from an associated `.c` file (a 19 | source file) would have an Input Manifest recording the Artifact ID of the 20 | `.c` file. 21 | 22 | ## What do Input Manifests Look Like? 23 | 24 | Input Manifests look something like this: 25 | 26 | ``` 27 | gitoid:blob:sha256\n 28 | 09c825ac02df9150e4f93d12ba1da5d1ff5846c3e62503c814aa3a300c535772\n 29 | 230f3515d1306690815bd9c3da0d15d8b6fcf43894d17100eb44b6d329a92f61\n 30 | 2f4a51b16b76bbc87c4c27af8ae062b1b50b280f1ab78e3eec155334588dc88e manifest 4f3a822f776412c049dda53c3277bf2225b51b805ce8a99222af23a7d9f55636\n 31 | c71d239df91726fc519c6eb72d318ec65820627232b2f796219e87dcf35d0ab4\n 32 | f47ffb3518f236eea6525fd29f057ddd5cda1bb803ccc662e6bc5925afd1e4af\n 33 | ``` 34 | 35 | Every Input Manifest starts with a header that provides some information about 36 | the Artifact IDs used throughout the rest of the manifest. Every Artifact ID 37 | includes `blob` as its object type, and `sha256` as its hash type, and _all_ 38 | Artifact IDs in a single Input Manifest must have the same hash type as all 39 | others. This ensures that, if Artifact IDs are ever updated to 40 | support more hash algorithms, a single Input Manifest still uses only one hash 41 | algorithm at a time. 42 | 43 | Then we have Artifact IDs for each input artifact used to build the "target 44 | artifact" being described. These Artifact IDs are listed in lexical order. 45 | 46 | Each line ends with a single newline character (`\n`) regardless of 47 | the user's current platform.
This is because we need these Input Manifests to 48 | always be bit-for-bit identical regardless of where they're derived. 49 | 50 | The one additional wrinkle is that if an input artifact itself has an Input 51 | Manifest, we can include the Artifact ID of the input artifact's Input Manifest 52 | as well. 53 | 54 | _This_ is a key part of the secret sauce of OmniBOR, which we'll explain in 55 | the next section. 56 | 57 | ## From Input Manifests to Artifact Dependency Graphs 58 | 59 | There are two key ideas that we've not discussed yet which turn OmniBOR from 60 | a lightweight method for listing IDs of input files into a 61 | Merkle tree for a software artifact's complete dependency tree: 62 | 63 | - Input Manifests record Artifact IDs of their inputs and (if available) their 64 | inputs' own Input Manifests. 65 | - The Artifact ID of an Input Manifest should be embedded in the artifact 66 | itself at build time. 67 | 68 | These two details, when implemented in tooling, mean that artifacts become 69 | cryptographically tied to a description of their own inputs, which can't be 70 | modified without detection. Because an Artifact ID is based on the contents of 71 | an artifact, if the artifact's contents include the Artifact ID of its own 72 | Input Manifest, any change to that manifest results in a change in the 73 | artifact's own Artifact ID. 74 | 75 | Thinking this through, this means a change in a dependency anywhere in the 76 | dependency graph results in changes of Artifact IDs for _anything_ derived from 77 | it, no matter how many steps removed it is! 78 | 79 | You may already be thinking this sounds like a form of Merkle 80 | tree, and you're right! OmniBOR's Artifact IDs and Input Manifests come 81 | together to form a Merkle tree of all dependencies used to construct an 82 | artifact.
If a software consumer receives all the Input Manifests for an 83 | artifact and its dependencies, they can not just detect changes in the 84 | Artifact ID of the artifact itself, but they can use the manifests to drill 85 | down to exactly what changed and where. 86 | 87 | ## How Should Input Manifests be Produced? 88 | 89 | Our dream as a project is to get changes upstreamed into popular software build 90 | tools like compilers, containerization tools, linkers, archivers, and more to 91 | be able to automatically produce Input Manifests and, whenever possible, to 92 | embed their Artifact IDs into the artifacts being constructed. 93 | 94 | We're also working on tooling to enable users of these tools to wrap them so 95 | they produce Artifact IDs today, though this is not yet ready. 96 | 97 | ## What's Next? 98 | 99 | We're still working to develop the implementations of OmniBOR to fruition that 100 | will enable others to start integrating it and producing Artifact IDs and 101 | Input Manifests. If that sounds interesting to you, [come help us][contribute]! 102 | 103 | [artifact_ids]: @/docs/artifact-ids.md 104 | [contribute]: @/contribute.md 105 | -------------------------------------------------------------------------------- /content/docs/artifact-ids.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Artifact IDs 3 | template: doc.html 4 | --- 5 | 6 | NOTE: The following explanations are based on the latest version of the 7 | OmniBOR specification. If any conflicts arise between the explanations given 8 | here and the OmniBOR specification, the specification supersedes these 9 | explanations. 10 | 11 | ## What are Artifact IDs? 12 | 13 | Artifact IDs are how OmniBOR solves the challenge of reproducibly identifying 14 | software artifacts. What we mean by "reproducible" is that anyone with access 15 | to the artifact can produce an identifier for it, and that identifier will 16 | always be the same. 
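To make "reproducible" concrete, here is a minimal Python sketch that derives an identifier from nothing but an artifact's bytes. It follows the SHA-256 "GitOID construction" that this page describes in its later sections. The function name `artifact_id` is an assumption made for this example and does not come from any OmniBOR library.

```python
import hashlib


def artifact_id(data: bytes) -> str:
    """Compute a gitoid:blob:sha256 identifier for the given bytes."""
    # Git-style header: object type, a space, the decimal byte length, a NUL byte.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    hasher = hashlib.sha256()
    hasher.update(header)  # the prefix string is hashed first
    hasher.update(data)    # then the artifact's own bytes
    return "gitoid:blob:sha256:" + hasher.hexdigest()


if __name__ == "__main__":
    # Two parties holding the same bytes derive the same identifier.
    print(artifact_id(b"hello") == artifact_id(b"hello"))
    # Any change to the bytes changes the identifier.
    print(artifact_id(b"hello") == artifact_id(b"hello!"))
```

Because the function takes only the artifact's content as input, with no file name, timestamp, or registry lookup, any two parties running it on identical bytes obtain the same string.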
17 | 18 | ## How are Artifact IDs Defined? 19 | 20 | OmniBOR Artifact IDs achieve this by choosing Git Object Identifiers (GitOIDs) 21 | as their identification scheme. GitOIDs are how Git identifies objects it 22 | tracks, and they're formed by choosing a hash function and then producing 23 | hashes using what we'll call the "GitOID construction". 24 | 25 | ### Choice of Hash Function 26 | 27 | Today, Git supports three hash functions: 28 | 29 | - SHA-1 30 | - SHA-1CD 31 | - SHA-256 32 | 33 | SHA-1 is the hash function most people are familiar with in Git, and for a long 34 | time it was the _only_ hash function Git supported. However, since the 35 | discovery of the SHAttered attack against SHA-1, the Git project has done two things: 36 | 37 | 1. Introduced SHA-1CD to mitigate the risk of collisions arising from attempts 38 | to exploit the SHAttered attack. 39 | 2. Began a process of transitioning to SHA-256 as the basis for all GitOIDs, 40 | which is still underway. 41 | 42 | SHA-1CD is _almost_ equivalent to SHA-1. In essence, SHA-1CD tries to 43 | detect attempts to engineer collisions (hence the "CD," for "collision 44 | detection") and modifies the output of the hash in those cases to break the 45 | collision. For Git, this kind of modification is fine, and so Git in recent 46 | versions usually uses SHA-1CD under the hood by default, though Git 47 | documentation still just calls it SHA-1. In the context of a single Git 48 | repository, the distinction doesn't generally matter. 49 | 50 | However, for the purposes of achieving a universally reproducible identifier, 51 | we do need to care about the distinction between SHA-1 and SHA-1CD, which is 52 | why we list them separately here. 53 | 54 | The SHA-256 transition in Git has been moving slowly, with successive versions 55 | periodically adding more support to smooth the transition. Nonetheless, progress 56 | is generally recognized as slow.
57 | 58 | __OmniBOR Artifact IDs only support SHA-256 today.__ 59 | 60 | This is important. While SHA-1 (really, SHA-1CD) is in wide use in existing Git 61 | data today, we expect that in the long run, it will be phased out. The Git 62 | project itself continues its slow movement along the SHA-256 transition, and 63 | (perhaps more importantly) we anticipate there will likely be government 64 | standards in the future which mandate a move away from SHA-1, in a similar 65 | fashion to prior widespread mandates to move away from MD5. 66 | 67 | While we could in theory support multiple identifiers at the same time, even 68 | just supporting two would double the complexity of OmniBOR operationally for 69 | producers and consumers. Worse, given the SHA-1 / SHA-1CD split, we'd likely 70 | need to support all three if we're going to support SHA-1. 71 | 72 | The specification does reserve the right to add support for alternative hashes 73 | in the future if, for example, SHA-256 is later found to be broken in a manner 74 | similar to how SHA-1 can be broken today. 75 | 76 | ### The "GitOID Construction" 77 | 78 | GitOIDs are not constructed by hashing only the data of the object itself. 79 | Instead, a small "prefix string" is hashed in first, with the following 80 | structure: 81 | 82 | ``` 83 | <object type>⎵<length in bytes>\0 84 | ``` 85 | 86 | Here `⎵` refers to the ASCII space character (`0x20`), replaced with a visual 87 | character for clarity. 88 | 89 | This prefix string has two purposes in Git. First, the object type (which can 90 | be `blob`, `tree`, `commit`, or `tag`) indicates the type of the data being 91 | stored based on Git's object model. This helps differentiate hashes for the 92 | different types of objects. Second, the length being hashed in helps provide 93 | additional protection against collisions.
With this length included, an attacker 94 | trying to engineer a collision in Git's object storage would need to account 95 | for how the length of the colliding data impacts the hash as well. The 96 | SHAttered attack specifically relies on extending the original data in 97 | highly flexible formats like PDFs, and hashing in the length is an effective 98 | protection against this. 99 | 100 | For Artifact IDs, the object type is always `blob`, so for our purposes 101 | the prefix string is 102 | 103 | ``` 104 | blob⎵<length in bytes>\0 105 | ``` 106 | 107 | Thus, the "GitOID construction" for Artifact IDs is to: 108 | 109 | 1. Calculate the length of the object being identified, in bytes. 110 | 2. Provide this prefix string, with the length written in decimal, to the SHA-256 hasher. 111 | 3. Provide the bytes of the object to the hasher. 112 | 4. Generate the hash from the hasher. 113 | 114 | ## How are Artifact IDs represented? 115 | 116 | The textual representation of an Artifact ID looks like this: 117 | 118 | ``` 119 | gitoid:blob:sha256:9f64df92367881be21e23567a31a8ce01994d98b69d28917b5c132ce32a8e6c8 120 | ``` 121 | 122 | This is a URI using the `gitoid` scheme, [registered with IANA][uri]. In this 123 | scheme, the representation includes four parts, each separated by a colon (`:`). 124 | The first is the string `gitoid`, indicating the URI scheme. The second is 125 | the object type, which for Artifact IDs is always `blob`. The third is the hash 126 | algorithm, which for Artifact IDs is `sha256`. The fourth is a lowercase 127 | hexadecimal representation of the SHA-256 hash of the object made with the 128 | GitOID construction. 129 | 130 | ## Why are Artifact IDs Useful? 131 | 132 | Artifact IDs are used for uniquely and reproducibly identifying software 133 | artifacts.
Because the construction of an Artifact ID relies only on the 134 | contents of an artifact itself, anyone who has access to the artifact can 135 | derive its Artifact ID, and the Artifact ID they derive will be exactly 136 | equal to one derived by anyone else with access to the same artifact. 137 | 138 | This means that, as an identifier scheme, Artifact IDs can scale without 139 | limits! Other identification systems, like 140 | [Common Platform Enumerations (CPE)][cpe] or [Package URLs (pURLs)][purl], 141 | rely on some form of centralization. CPEs are identifiers which rely on a 142 | centralized dictionary, maintained by the United States' National 143 | Institute of Standards and Technology (NIST). Package URLs rely on a central 144 | list of known package hosts. In either case, while these identifier schemes 145 | are _very_ useful (and we view OmniBOR's Artifact IDs as _complementary_ to 146 | these other identifier schemes), they lack the property of independent 147 | reproducibility that makes Artifact IDs so powerful! 148 | 149 | ## What's Next? 150 | 151 | Of course, Artifact IDs by themselves are only one part of the equation. To 152 | understand more, learn about [Input Manifests][input_manifests] next! 153 | 154 | [uri]: https://www.iana.org/assignments/uri-schemes/prov/gitoid 155 | [cpe]: https://nvd.nist.gov/products/cpe 156 | [purl]: https://github.com/package-url/purl-spec 157 | [input_manifests]: @/docs/input-manifests.md 158 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 
11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 
47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. 
Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 
122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. 
In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "[]" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. 
We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright [yyyy] [name of copyright owner] 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | -------------------------------------------------------------------------------- /content/glossary/git_blob.svg: -------------------------------------------------------------------------------- 1 | 2 | 3 | -------------------------------------------------------------------------------- /content/glossary/git_object.svg: -------------------------------------------------------------------------------- 1 | 2 | 3 | -------------------------------------------------------------------------------- /content/spec/v0.1/_index.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "OmniBOR Specification, Version 0.1" 3 | template: spec.html 4 | --- 5 | 6 | ## Foreword 7 | 8 | This specification is subject to the Community Specification License 1.0, 9 | available at . 10 | 11 | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", 12 | "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" are used as described in 13 | [RFC 2119][rfc_2119]. 
14 | 15 | Attention is drawn to the possibility that some of the elements of this 16 | document may be the subject of patent rights. No party shall be held 17 | responsible for identifying any or all such patent rights. 18 | 19 | Any trade name used in this document is information given for the convenience 20 | of users and does not constitute an endorsement. 21 | 22 | This document was prepared by the OmniBOR Community. 23 | 24 | Known patent licensing exclusions are available in the specification 25 | repository's `NOTICES.md` file. 26 | 27 | Any feedback or questions on this document should be directed to the 28 | specification's repository, located at . 29 | 30 | THESE MATERIALS ARE PROVIDED "AS IS." The Contributors and Licensees expressly 31 | disclaim any warranties (express, implied, or otherwise), including implied 32 | warranties of merchantability, non-infringement, fitness for a particular 33 | purpose, or title, related to the materials. The entire risk as to 34 | implementing or otherwise using the materials is assumed by the implementer and 35 | user. IN NO EVENT WILL THE CONTRIBUTORS OR LICENSEES BE LIABLE TO ANY OTHER 36 | PARTY FOR LOST PROFITS OR ANY FORM OF INDIRECT, SPECIAL, INCIDENTAL, OR 37 | CONSEQUENTIAL DAMAGES OF ANY CHARACTER FROM ANY CAUSES OF ACTION OF ANY KIND 38 | WITH RESPECT TO THIS DELIVERABLE OR ITS GOVERNING AGREEMENT, WHETHER BASED ON 39 | BREACH OF CONTRACT, TORT (INCLUDING NEGLIGENCE), OR OTHERWISE, AND WHETHER OR 40 | NOT THE OTHER MEMBER HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 41 | 42 | ## Introduction 43 | 44 | Software supply chains face many challenges: security and compliance chief 45 | among them. Often, projects are hamstrung by the inability to easily and 46 | reliably capture a complete, concise, verifiable accounting of exactly 47 | __what__ inputs were built into software. Without this information, 48 | identifying vulnerable software to patch or replace is difficult. 
While 49 | Software Bills of Material (SBOMs) help identify third-party components, 50 | they do not go far enough to precisely identify the exact inputs necessary 51 | for vulnerability management. 52 | 53 | The OmniBOR standard defines three concepts, which together enable the 54 | consistent, reproducible, and embeddable encoding of the exact inputs used to 55 | build a software artifact: Artifact Identifiers, Input Manifests, and Artifact 56 | Dependency Graphs. 57 | 58 | An Artifact Identifier is a content-based identifier of a single input (for 59 | example, a single file) used to build a software artifact. Identifiers are 60 | reproducible, meaning two individuals will always derive the same identifier 61 | for the same input. With these identifiers, we can consistently and precisely 62 | identify any software artifact or its input, for use in forensics, accounting, 63 | and vulnerability management. 64 | 65 | Next, an Input Manifest lists the Artifact Identifier of every input used to 66 | produce an artifact. For example, if an executable is compiled by linking 67 | together a collection of object files, the Artifact Identifier of every object 68 | file would be listed in the Input Manifest for the executable. Input Manifests 69 | can be identified by treating them as artifacts and applying to them the same 70 | identifier scheme applied to any other artifact. For purposes of discussion, 71 | these are typically called Input Manifest Identifiers, Input Manifest IDs, 72 | or IMIDs. The Input Manifest Identifier can be embedded directly into 73 | executable files, or can be provided in a separate file alongside the artifact 74 | whose inputs they describe. 75 | 76 | Finally, a collection of Input Manifests can be combined to produce an Artifact 77 | Dependency Graph. The Artifact Dependency Graph is a complete description of 78 | all inputs, direct or transitive, used to produce a software artifact.
79 | 80 | Returning to the example of building an executable: the executable's Input 81 | Manifest would list the Artifact Identifier of every object file, and each 82 | object file would have its own Input Manifest listing the Artifact Identifier 83 | of each of its source files. This set of Input Manifests can then be 84 | resolved to produce an overall graph completely describing the inputs which 85 | produced the executable. 86 | 87 | With the Artifact Dependency Graph, consumers of this information could then 88 | exactly identify when two artifacts were produced with exactly identical 89 | inputs, and if inputs vary, could identify the exact inputs which vary and 90 | observe how that affects the entirety of the graph. When coupled with 91 | SBOM information about third-party dependencies, this can provide highly 92 | specific and accurate identification of supply chain differences and their 93 | causes. 94 | 95 | This Artifact Dependency Graph may also be used to supplement vulnerability 96 | information by precisely identifying affected files or resolving the impacts 97 | of changes to those files across all users of those projects. By leveraging 98 | transparent inclusion of Input Manifests into executable and other formats, 99 | users would also gain the benefits of high-precision supply chain information 100 | without manually recording or updating those manifests as projects develop over 101 | time. 102 | 103 | ## Scope 104 | 105 | Specifies procedures for constructing and conveying Input Manifests, 106 | Artifact Dependency Graphs (ADGs), and other related data structures 107 | for artifacts.
This includes, but is not limited to: 108 | 109 | - formats for artifact identifiers 110 | - formats for specifying graph relationships between artifacts 111 | - manner of embedding identifiers for Input Manifests, ADGs, and other related 112 | data structures in artifacts of various types 113 | - guidance on metadata which references Input Manifests, ADGs, and other related data structures 114 | - guidance for build tools for: 115 | - constructing Input Manifests, ADGs, and other related data structures 116 | - conveying Input Manifests, ADGs, and other related data structures 117 | - embedding identifiers for Input Manifests and other related data structures in artifacts 118 | - manners of conveyance of Input Manifests and other related data structures 119 | - descriptions of use cases for which Input Manifests, ADGs, and other related data structures may be used 120 | 121 | ## Normative References 122 | 123 | - [GitOID URI][gitoid_uri] 124 | 125 | ## Terms and Definitions 126 | 127 | For the purposes of this document, the following terms and definitions apply. 128 | 129 | ### Artifact 130 | 131 | An artifact is any object of interest that can be represented as arrays of 132 | bytes (`[]byte`). 133 | 134 | ### Artifact Equivalency 135 | 136 | Two artifacts are equivalent if and only if their binary representations are 137 | equal. This can be expressed in pseudocode with the following expression: 138 | `[]byte(artifact1) == []byte(artifact2)` 139 | 140 | ### Artifact Identifiers 141 | 142 | It should be possible to identify each artifact with an artifact identifier with 143 | the following characteristics: 144 | 145 | - __Reproducible__: Independent parties, presented with equivalent artifacts, 146 | derive the same artifact identity. 147 | - __Unique__: Non-equivalent artifacts have distinct identities. 148 | - __Immutable__: An identified artifact cannot be modified without also 149 | changing its identity.
150 | 151 | ## Build Tools 152 | 153 | A build tool is something which reads one or more input artifacts and writes 154 | one or more output artifacts. Examples of build tools include: 155 | 156 | - compilers: 157 | - llvm-clang 158 | - gcc 159 | - javac 160 | - rustc 161 | - go 162 | - linkers: 163 | - llvm-lld 164 | - binutils-ld 165 | - runtimes 166 | - Java JVM 167 | - Node.js 168 | - Python interpreter 169 | - code generators 170 | 171 | ## Specifications 172 | 173 | ### Artifact ID 174 | 175 | Because two artifacts are equivalent if and only if their binary 176 | representations are equal, a hash function may be applied to the binary 177 | representation of an artifact to yield an identifier which satisfies the 178 | canonical, unique, and immutable requirements of artifact identifiers. 179 | 180 | ### Artifact Identifier Types 181 | 182 | The majority of source code artifacts are already stored in git and 183 | indexed by their git object identifiers ("gitoids") as git objects of type 184 | "blob". 185 | 186 | For this reason, OmniBOR has chosen to use the "gitoid" of an Artifact as 187 | its Artifact Identifier. 188 | 189 | Git currently supports two varieties of gitoids. One is based on SHA1 and is 190 | in common use. The other is based on SHA256 and has been very slow to garner 191 | adoption. The [gitoid URI spec][gitoid_uri] uses different prefixes, 192 | `gitoid:blob:sha1` or `gitoid:blob:sha256`, to distinguish which algorithm is 193 | being used for computing the gitoid of a blob. This document adopts the gitoid 194 | URI prefixes to distinguish Artifact Identifier Types. This approach is 195 | anticipated to extend gracefully as git adopts new hash types in the future. 
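As a minimal sketch of how such an identifier could be computed (not a normative implementation), the following Python uses the standard `hashlib` module to hash the Git blob header followed by the artifact bytes, then formats the result with the gitoid URI prefix. The function names `gitoid_hex` and `gitoid_uri` are illustrative assumptions, not part of any OmniBOR library. Note that `hashlib.sha1` is plain SHA-1 rather than the collision-detecting variant used by recent Git versions, so the two should only be expected to agree on inputs that do not trigger collision detection.

```python
import hashlib


def gitoid_hex(data: bytes, algorithm: str = "sha256") -> str:
    """Hash b"blob <length>\\0" followed by the data, as Git does for blobs."""
    if algorithm not in ("sha1", "sha256"):
        raise ValueError(f"unsupported algorithm: {algorithm}")
    hasher = hashlib.new(algorithm)
    # The header is the object type, a space, the decimal length, and a NUL byte.
    hasher.update(b"blob %d\x00" % len(data))
    hasher.update(data)
    return hasher.hexdigest()


def gitoid_uri(data: bytes, algorithm: str = "sha256") -> str:
    """Format the identifier using the gitoid URI prefix for the algorithm."""
    return f"gitoid:blob:{algorithm}:{gitoid_hex(data, algorithm)}"
```

For example, `gitoid_uri(open(path, "rb").read())` would yield a `gitoid:blob:sha256:...` string for the file at a hypothetical `path`.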
196 | 197 | All subsequent references to mandatory identifier types in this document should 198 | be interpreted to mean the list: 199 | 200 | - `gitoid:blob:sha1` 201 | - `gitoid:blob:sha256` 202 | 203 | ### Artifact Input Manifest 204 | 205 | An Artifact Input Manifest for an Artifact enumerates the inputs to the 206 | build tool that produced the artifact. 207 | 208 | Hereafter in the spec, an Artifact Input Manifest will simply be referred to as an Input Manifest. 209 | 210 | A given Input Manifest utilizes precisely one Artifact Identifier Type. 211 | 212 | #### Input Manifest Identifier 213 | 214 | An Input Manifest is identified by computing its identifier as an artifact 215 | with the Artifact Identifier Type used for identifiers within the Input Manifest 216 | itself. 217 | 218 | The Input Manifest Identifier for the Input Manifest of an artifact is sometimes 219 | referred to as the Input Manifest Identifier of the artifact. 220 | 221 | #### Input Manifest Header 222 | 223 | In order to distinguish the type of identifier used in the Input Manifest, 224 | it begins with a single newline-terminated header line: 225 | 226 | ``` 227 | ${Artifact Identifier Type uri prefix}\n 228 | ``` 229 | 230 | For example: 231 | 232 | ``` 233 | gitoid:blob:sha1\n 234 | ``` 235 | 236 | or 237 | 238 | ``` 239 | gitoid:blob:sha256\n 240 | ``` 241 | 242 | All identifiers in an Input Manifest MUST be of the Artifact Identifier 243 | Type declared in the header.
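A reader of an Input Manifest might use the header to decide how long each identifier is expected to be. The following Python is a sketch under that assumption; the name `parse_header` and the `HEX_LENGTHS` table are hypothetical and are not defined by this specification. The lengths follow from the digest sizes of SHA-1 (160 bits) and SHA-256 (256 bits).

```python
from typing import Tuple

# Hex digits per identifier for each Artifact Identifier Type.
HEX_LENGTHS = {"gitoid:blob:sha1": 40, "gitoid:blob:sha256": 64}


def parse_header(manifest: bytes) -> Tuple[str, int]:
    """Return (identifier type, expected hex digits per identifier)."""
    header, newline, _rest = manifest.partition(b"\n")
    if not newline:
        raise ValueError("header line is not newline terminated")
    id_type = header.decode("ascii")
    if id_type not in HEX_LENGTHS:
        raise ValueError(f"unknown Artifact Identifier Type: {id_type!r}")
    return id_type, HEX_LENGTHS[id_type]
```

Rejecting an unknown header early keeps a reader from silently mixing identifiers of different types.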
244 | 245 | #### Input Manifest Records 246 | 247 | After the header, the Input Manifest consists of a list of newline-terminated 248 | input records. 249 | 250 | An input record for an artifact for which no Input Manifest Identifier is known is represented as: 251 | 252 | ``` 253 | blob⎵${artifact identifier of the input artifact}\n 254 | ``` 255 | 256 | An input record for an artifact for which an Input Manifest Identifier is known is represented as: 257 | 258 | ``` 259 | blob⎵${artifact identifier of the input artifact}⎵bom⎵${input manifest identifier of the input artifact}\n 260 | ``` 261 | 262 | `⎵` above refers to the ASCII space character (0x20). 263 | 264 | Artifact identifiers in input records should be represented as strings in lowercase hexadecimal. For example: 265 | `514516097a2f95c893f2a9685bcecfb85b7598e6`. 266 | 267 | The input artifact records must be written to the Input Manifest in lexical 268 | order. 269 | 270 | The Artifact Identifier and Input Manifest Identifier must both be of the Artifact Identifier 271 | Type declared in the Input Manifest header. 272 | 273 | #### Input Manifest Character Encoding 274 | 275 | All characters in an Input Manifest are encoded in ASCII. Please note: all line terminators 276 | must be encoded as '\n' (line feed, 0x0A) characters, _not_ the line delimiter of the platform. 277 | 278 | #### Input Manifest Identifier Embedding 279 | 280 | Each build tool should embed into the output artifact a deterministically 281 | ordered list of Input Manifest Identifiers for each mandatory Artifact 282 | Identifier Type in a manner: 283 | 284 | 1. Appropriate to the type of artifact 285 | 2. Generally agreed upon for that artifact 286 | 287 | #### Input Manifest Construction by a Build Tool 288 | 289 | A build tool creating an output artifact must compute an Input Manifest of 290 | each mandatory artifact identifier type. 291 | 292 | For each input artifact the build tool must: 293 | 294 | 1.
Compute the artifact identifier of the input - `${artifact identifier}` 295 | 2. Examine the input for an embedded Input Manifest Identifier - 296 | `${input manifest identifier}` 297 | 298 | The build tool must persist an Input Manifest using the 299 | `${artifact identifier}` and `${input manifest identifier}` for each input. 300 | 301 | #### Input Manifest Examples 302 | 303 | ``` 304 | gitoid:blob:sha1 305 | blob 06a6891154fff74e1ddb6245f4a0467b09c617c5 306 | blob 06dd79bc831bb06a6267a36ad2d62beccd7900b2 bom a9a64def763517df596fbb4348a8561069b5dc4b 307 | blob 0bc39408c1e5feaadd6f0420d14324b477420b93 308 | blob 15acd4427ca14000111aad5071563bc7f2dc09f4 309 | blob 1be90e6fab4ab9b7dd3b27cea5bb1fe29acc0204 310 | blob 1d8a4e28d1b62a2bfeba837fe18422cd106e6ddf bom 5bda8237d1676df0a2d0b8682d40f99a27ef5b13 311 | blob 28488e0b05954ccf87c779f5f9258987e4d68ac5 312 | blob 2c0cde251f1a9f05563a5f7a7f32588f04aaa235 313 | ``` 314 | 315 | ``` 316 | gitoid:blob:sha256 317 | blob 09c825ac02df9150e4f93d12ba1da5d1ff5846c3e62503c814aa3a300c535772 318 | blob 230f3515d1306690815bd9c3da0d15d8b6fcf43894d17100eb44b6d329a92f61 319 | blob 2f4a51b16b76bbc87c4c27af8ae062b1b50b280f1ab78e3eec155334588dc88e bom 4f3a822f776412c049dda53c3277bf2225b51b805ce8a99222af23a7d9f55636 320 | blob c71d239df91726fc519c6eb72d318ec65820627232b2f796219e87dcf35d0ab4 321 | blob f47ffb3518f236eea6525fd29f057ddd5cda1bb803ccc662e6bc5925afd1e4af 322 | ``` 323 | 324 | ### Artifact Dependency Graph (ADG) 325 | 326 | The Artifact Dependency Graph (ADG) of an artifact is the recursive DAG 327 | (Directed Acyclic Graph) of all the "input artifacts" that are transformed 328 | by a build tool into that artifact. It includes the direct input artifacts, 329 | and the recursive set of input artifacts to each input artifact, all the way 330 | down the graph. 
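The recursive structure described above can be sketched as a walk over Input Manifests. The following Python assumes a hypothetical in-memory `store` mapping each Input Manifest Identifier to the text of its manifest; the spec does not define such a store, and the names `parse_records` and `manifest_closure` are illustrative only. Manifests that are referenced but not present in the store are simply skipped.

```python
from typing import Dict, List, Optional, Set, Tuple


def parse_records(manifest_text: str) -> List[Tuple[str, Optional[str]]]:
    """Return (artifact id, manifest id or None) for each input record."""
    records = []
    for line in manifest_text.split("\n")[1:]:  # skip the header line
        if not line:
            continue  # empty string left after the final newline
        fields = line.split(" ")
        if len(fields) == 2 and fields[0] == "blob":
            records.append((fields[1], None))
        elif len(fields) == 4 and fields[0] == "blob" and fields[2] == "bom":
            records.append((fields[1], fields[3]))
        else:
            raise ValueError(f"malformed input record: {line!r}")
    return records


def manifest_closure(root_id: str, store: Dict[str, str]) -> Set[str]:
    """Collect every Input Manifest Identifier reachable from root_id."""
    seen: Set[str] = set()
    pending = [root_id]
    while pending:
        manifest_id = pending.pop()
        if manifest_id in seen or manifest_id not in store:
            continue  # already visited, or manifest not available locally
        seen.add(manifest_id)
        for _artifact_id, child_id in parse_records(store[manifest_id]):
            if child_id is not None:
                pending.append(child_id)
    return seen
```

An explicit work list is used instead of recursion so that deep graphs do not hit Python's recursion limit.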
331 | 332 | Concretely the Artifact Dependency Graph (ADG) of an Artifact is: 333 | 334 | - The set of Input Manifests defined by: 335 | - The Input Manifest of the Artifact 336 | - Any Input Manifest referenced in an Input Manifest in the set (ie the transitive closure of the Input Manifests) 337 | - The Input Manifest Identifier of the Artifact 338 | ## Annexes 339 | 340 | - [Annex A - File System Storage](@/spec/v0.1/annex-a.md) 341 | - [Annex B - ELF Embedding](@/spec/v0.1/annex-b.md) 342 | - [Annex C - Source Embedding](@/spec/v0.1/annex-c.md) 343 | 344 | ## Bibliography 345 | 346 | - RFC 2119: 347 | - GitOID URI: 348 | 349 | [rfc_2119]: https://tools.ietf.org/html/rfc2119 350 | [gitoid_uri]: https://www.iana.org/assignments/uri-schemes/prov/gitoid 351 | -------------------------------------------------------------------------------- /content/glossary/gitoid.svg: -------------------------------------------------------------------------------- 1 | 2 | 3 | 5 | 14 | gitoid(${contents}) = sha1( 15 | ) 16 | 18 | 20 | 21 | 22 | 23 | 41 | 42 | 81 | 88 | 89 | 146 | 147 | -------------------------------------------------------------------------------- /content/glossary/_index.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Glossary 3 | template: glossary.html 4 | --- 5 | 6 | The following is a glossary of terms defined by the OmniBOR project. For the 7 | current precise definitions, refer to the [specification]. 8 | 9 | ## Artifact 10 | 11 | An artifact is any software object of interest. 12 | 13 | Examples: 14 | 15 | - source code file (of any language) 16 | - `.o` object file 17 | - `.so` shared object file 18 | - `.class` Java class file 19 | - `.jar` file 20 | - `.pyc` compiled python file 21 | - executable file 22 | - container image 23 | 24 | What all artifacts have in common is that they are all arrays of bytes. 
25 | 26 | ## Artifact Equivalency 27 | 28 | Two artifacts are equivalent if and only if their byte representations are 29 | exactly equal. 30 | 31 | ## Derived Artifacts 32 | 33 | Most artifacts are produced by a [build tool](#build-tool) consuming some set 34 | of input artifacts to produce an artifact as an output. Such artifacts are said 35 | to be 'derived artifacts'. 36 | 37 | ## Leaf Artifacts 38 | 39 | Artifacts which are not 'derived artifacts' are said to be 'leaf artifacts'. 40 | Leaf artifacts are usually source code files constructed by hand by humans. 41 | 42 | Examples: 43 | 44 | - "`foo.o` is derived from `foo.c` and `bar.h` using `gcc`" 45 | - "`fooexecutable` is derived from `foo.o` and `baz.o` using `ld`" 46 | - "`foo.class` is derived from `foo.java` using `javac`" 47 | 48 | ## Artifact ID 49 | 50 | It should be possible to identify each artifact with an Artifact ID. 51 | 52 | Artifact IDs should have the following characteristics: 53 | 54 | **Canonical** 55 | : Independent parties, presented with equivalent artifacts, derive the same 56 | Artifact ID. 57 | 58 | **Unique** 59 | : Non-equivalent artifacts have distinct Artifact IDs. 60 | 61 | **Immutable** 62 | : An artifact cannot be modified without also changing its Artifact ID. 63 | 64 | OmniBOR uses the [GitOID](#gitoid) of an artifact as its 65 | Artifact ID. 66 | 67 | Source code [leaf artifacts](#leaf-artifacts) are typically already being 68 | stored in [Git](#git) where they are identified via their [GitOID](#gitoid). 69 | 70 | ## Artifact Dependency Graph 71 | 72 | The Artifact Dependency Graph (ADG) of an [artifact](#artifact) is the 73 | DAG (Directed Acyclic Graph) of all the ['leaf artifacts'](#leaf-artifacts) that 74 | are transformed by a [build tool](#build-tool) into that artifact. This 75 | includes the direct input artifacts, and the transitive set of artifacts to 76 | each input artifact, all the way down to source code. 
77 | 78 | ### Examples 79 | 80 | Simple C Executable 81 | 82 | {% mermaid() %} 83 | flowchart BT 84 | c1[.c] --> o1[.o] 85 | h1.1[.h] --> o1[.o] 86 | h1.2[.h] --> o1[.o] 87 | c2[.c] --> o2[.o] 88 | h2.1[.h] --> o2[.o] 89 | h2.2[.h] --> o2[.o] 90 | o1 --> executable 91 | o2 --> executable 92 | {% end %} 93 | 94 | Running C Executable with Shared Object 95 | 96 | {% mermaid() %} 97 | flowchart BT 98 | c1[.c] --> o1[.o] 99 | h1.1[.h] --> o1[.o] 100 | h1.2[.h] --> o1[.o] 101 | c2[.c] --> o2[.o] 102 | h2.1[.h] --> o2[.o] 103 | h2.2[.h] --> o2[.o] 104 | o1 --> executable 105 | o2 --> executable 106 | c3[.c] --> o3[.o] 107 | h3.1[.h] --> o3[.o] 108 | h3.2[.h] --> o3[.o] 109 | c4[.c] --> o4[.o] 110 | h4.1[.h] --> o4[.o] 111 | h4.2[.h] --> o4[.o] 112 | o3 --> .so 113 | o4 --> .so 114 | executable --> running[running executable] 115 | .so --> running[running executable] 116 | {% end %} 117 | 118 | Java Example 119 | 120 | {% mermaid() %} 121 | flowchart BT 122 | java1[.java] --> cls1[.class] 123 | java2[.java] --> cls2[.class] 124 | java3[.java] --> cls3[.class] 125 | java4[.java] --> cls4[.class] 126 | java5[.java] --> cls5[.class] 127 | cls1 --> running[running executable] 128 | cls2 --> running[running executable] 129 | cls3 --> running[running executable] 130 | cls4 --> running[running executable] 131 | cls5 --> running[running executable] 132 | {% end %} 133 | 134 | Go Example 135 | 136 | {% mermaid() %} 137 | flowchart BT 138 | go1[.go] --> o1[.o] 139 | go2[.go] --> o2[.o] 140 | go3[.go] --> o3[.o] 141 | go4[.go] --> o4[.o] 142 | go5[.go] --> o5[.o] 143 | o1 --> executable 144 | o2 --> executable 145 | o3 --> executable 146 | o4 --> executable 147 | o5 --> executable 148 | {% end %} 149 | 150 | Python Example 151 | 152 | {% mermaid() %} 153 | flowchart BT 154 | py1[.py] --> pyc1[.pyc] 155 | py2[.py] --> pyc2[.pyc] 156 | py3[.py] --> pyc3[.pyc] 157 | py4[.py] --> pyc4[.pyc] 158 | py5[.py] --> pyc5[.pyc] 159 | pyc1 --> running[running executable] 160 | pyc2 --> 
running[running executable] 161 | pyc3 --> running[running executable] 162 | pyc4 --> running[running executable] 163 | pyc5 --> running[running executable] 164 | {% end %} 165 | 166 | ## Build Tool 167 | 168 | A build tool is something which reads one or more input [artifacts](#artifact) 169 | and writes one or more output [artifacts](#artifact). 170 | 171 | {% mermaid() %} 172 | flowchart LR 173 | input1 --> buildtool[build tool] --> output 174 | input2 --> buildtool[build tool] 175 | input3 --> buildtool[build tool] 176 | {% end %} 177 | 178 | Examples: 179 | 180 | * C compiler consumes one `.c` file and zero or more `.h` files to produce a 181 | `.o` file 182 | 183 | {% mermaid() %} 184 | flowchart LR 185 | .c --> compiler[[compiler]] 186 | *.h --> compiler[[compiler]] 187 | compiler --> .o 188 | {% end %} 189 | 190 | * C linker consumes one or more `.o` files to produce an executable file 191 | 192 | {% mermaid() %} 193 | flowchart LR 194 | *.o --> linker[[linker]] 195 | linker --> executable 196 | {% end %} 197 | 198 | * C linker consumes one or more `.o` files to produce a shared object 199 | 200 | {% mermaid() %} 201 | flowchart LR 202 | *.o --> linker[[linker]] 203 | linker --> .so 204 | {% end %} 205 | 206 | * Dynamic linker consumes an executable file and zero or more shared objects to 207 | produce a running process 208 | 209 | {% mermaid() %} 210 | flowchart LR 211 | executable --> linker[[dynamic linker]] 212 | *.so --> linker[[dynamic linker]] 213 | linker --> running[running executable] 214 | {% end %} 215 | 216 | * Java compiler consumes a `.java` file to produce a `.class` file 217 | 218 | {% mermaid() %} 219 | flowchart LR 220 | .java --> compiler[[compiler]] 221 | compiler --> classfile[.class] 222 | {% end %} 223 | 224 | * Java runtime consumes one or more `.class` files to produce a running process 225 | 226 | {% mermaid() %} 227 | flowchart LR 228 | classfile[*.class] --> runtime[[runtime]] 229 | runtime --> running[running executable] 230 | 
{% end %} 231 | 232 | * Python bytecode compiler consumes a `.py` file to produce a `.pyc` file 233 | 234 | {% mermaid() %} 235 | flowchart LR 236 | .py --> compiler[[compiler]] 237 | compiler --> .pyc 238 | {% end %} 239 | 240 | The totality of ancestors for a given artifact may be represented as an 241 | [Artifact Dependency Graph (ADG)](#artifact-dependency-graph). 242 | 243 | ## Code Generators 244 | 245 | Typically, source code files are hand written by humans, and as such are 246 | [leaf artifacts](#leaf-artifacts) in the 247 | [Artifact Dependency Graph (ADG)](#artifact-dependency-graph). 248 | 249 | Source code files can also be **generated** from other inputs by a code 250 | generator. 251 | 252 | {% mermaid() %} 253 | flowchart LR 254 | input[input] --> codegenerator[[code generator]] --> generatedsrc[generated source code file] 255 | {% end %} 256 | 257 | In this scenario, the generated source code file is a 258 | [derived artifact](#derived-artifacts). This is because the 259 | [code generator](#code-generators) is a [build tool](#build-tool) and, by 260 | definition, the output from the [build tool](#build-tool) is a 261 | [derived artifact](/glossary/artifact/#derived-artifacts). 262 | 263 | Code generation is very common in many languages. 264 | See [go generate](https://eli.thegreenplace.net/2021/a-comprehensive-guide-to-go-generate/), 265 | [Java Xtend](https://www.eclipse.org/xtend/), and 266 | [qtcpp](https://qface.readthedocs.io/en/latest/qtcpp.html) for examples. 267 | 268 | ## Git 269 | 270 | [Git](https://git-scm.com/) is an object store masquerading as a source 271 | code management system (SCM). 272 | 273 | Git's storage model stores source code and metadata using a Merkle tree.
274 | 275 | ## Git Objects 276 | 277 | Git Objects are represented as follows: 278 | 279 | {{ img(path = "/glossary/git_object.svg", alt = "Git Object") }} 280 | 281 | * `${type}` - Git Object Type as a string 282 | - `blob` - any bytes 283 | - `tree` - represents a filesystem tree 284 | - `commit` - represents a Git commit 285 | - `tag` - represents a Git tag 286 | * `${size}`: size in bytes of `${content}` represented as a string base 10. 287 | * `${content}`: the byte content of the object 288 | 289 | ## Git Blob 290 | 291 | A Git [blob](https://git-scm.com/book/en/v2/Git-Internals-Git-Objects) 292 | (binary large object) is the type used for file contents in git: 293 | 294 | {{ img(path = "/glossary/git_blob.svg", alt = "Git Blobs") }} 295 | 296 | - `${content}` - bytes of the file contents 297 | - Does not include filename or path 298 | - Does not include mode information 299 | - Does not include *any* metadata 300 | - Just the contents 301 | - **Any file anywhere with the same contents will have the same 'blob' object** 302 | - **Any file anywhere with the same contents will have the same GitOID** 303 | 304 | ## GitOID 305 | 306 | Git Blobs are identified by the SHA-1 hash of the blob object with the GitOID 307 | construction, which first hashes in a string containing the object type, an 308 | ASCII space character, the length of the content in number of bytes, and an 309 | ASCII null terminator character: 310 | 311 | {{ img(path = "/glossary/gitoid.svg", alt = "GitOIDs") }} 312 | 313 | ## OmniBOR 314 | 315 | An [artifact dependency graph](#artifact-dependency-graph) can be represented as 316 | a graph with nodes identified by an [Artifact ID](#artifact-id). In the examples 317 | below, we only show tree structures for simplicity. 
318 | 319 | {% mermaid() %} 320 | flowchart BT 321 | Artifact-2[Artifact-2 ID] --> Artifact-1[Artifact-1 ID] 322 | Artifact-3[Artifact-3 ID] --> Artifact-1[Artifact-1 ID] 323 | Artifact-4[Artifact-4 ID] --> Artifact-2[Artifact-2 ID] 324 | Artifact-5[Artifact-5 ID] --> Artifact-2[Artifact-2 ID] 325 | Artifact-6[Artifact-6 ID] --> Artifact-3[Artifact-3 ID] 326 | Artifact-7[Artifact-7 ID] --> Artifact-3[Artifact-3 ID] 327 | {% end %} 328 | 329 | OmniBOR uses the [GitOID](#gitoid) of an artifact as its 330 | [Artifact ID](#artifact-id). 331 | 332 | {% mermaid() %} 333 | flowchart BT 334 | Artifact-2[Artifact-2 gitoid] --> Artifact-1[Artifact-1 gitoid] 335 | Artifact-3[Artifact-3 gitoid] --> Artifact-1[Artifact-1 gitoid] 336 | Artifact-4[Artifact-4 gitoid] --> Artifact-2[Artifact-2 gitoid] 337 | Artifact-5[Artifact-5 gitoid] --> Artifact-2[Artifact-2 gitoid] 338 | Artifact-6[Artifact-6 gitoid] --> Artifact-3[Artifact-3 gitoid] 339 | Artifact-7[Artifact-7 gitoid] --> Artifact-3[Artifact-3 gitoid] 340 | {% end %} 341 | 342 | ## Input Manifest 343 | 344 | The parent-child relationship is captured by a set of Input Manifests. 345 | 346 | Each artifact has an Input Manifest that describes its immediate children, 347 | consisting of a set of newline-delimited records, one for each child, in 348 | lexical order.
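
A minimal Python sketch of assembling such a manifest is shown here, using the record layout from the examples in this section: a header line, then one record per child in lexical order, with the `manifest` field present only for derived children. The function name `build_input_manifest`, the `children` mapping, and the default header value are illustrative assumptions, and the `⎵` symbol used in this glossary is taken to mean a single ASCII space. The specification uses a slightly different record syntax (`blob ... bom ...`), so treat this as a sketch of the idea rather than a conforming implementation.

```python
def build_input_manifest(children, header="gitoid:sha256"):
    """Return Input Manifest text for the given children.

    `children` is a hypothetical dict mapping each child's Artifact ID to the
    Artifact ID of that child's own Input Manifest, or None for a leaf artifact.
    """
    lines = [header]
    # Records are emitted in lexical order of the child Artifact ID.
    for artifact_id in sorted(children):
        manifest_id = children[artifact_id]
        if manifest_id is None:
            lines.append(artifact_id)  # leaf artifact
        else:
            lines.append(f"{artifact_id} manifest {manifest_id}")  # derived artifact
    # Every line, including the last, ends with a newline.
    return "\n".join(lines) + "\n"
```

Sorting before writing means two build tools given the same set of children produce byte-identical manifests, and therefore the same Artifact ID for the manifest itself.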
349 | 350 | A child artifact which is itself a [leaf artifact](#leaf-artifacts) would be 351 | represented by: 352 | 353 | ``` 354 | ${Artifact ID of child}\n 355 | ``` 356 | 357 | A child artifact which is itself a [derived artifact](#derived-artifacts) would 358 | be represented by: 359 | 360 | ``` 361 | ${Artifact ID of child}⎵manifest⎵${Artifact ID of child's Input Manifest}\n 362 | ``` 363 | 364 | Example: 365 | 366 | {% mermaid() %} 367 | flowchart BT 368 | Artifact-2[Artifact-2 Artifact ID] --> Artifact-1[Artifact-1 Artifact ID] 369 | Artifact-3[Artifact-3 Artifact ID] --> Artifact-1[Artifact-1 Artifact ID] 370 | Artifact-4[Artifact-4 Artifact ID] --> Artifact-2[Artifact-2 Artifact ID] 371 | Artifact-5[Artifact-5 Artifact ID] --> Artifact-2[Artifact-2 Artifact ID] 372 | Artifact-6[Artifact-6 Artifact ID] --> Artifact-3[Artifact-3 Artifact ID] 373 | Artifact-7[Artifact-7 Artifact ID] --> Artifact-3[Artifact-3 Artifact ID] 374 | {% end %} 375 | 376 | Artifact-2's Input Manifest: 377 | 378 | ``` 379 | gitoid:sha256\n 380 | ${Artifact ID of Artifact-4}\n 381 | ${Artifact ID of Artifact-5}\n 382 | ``` 383 | 384 | Artifact-3's Input Manifest: 385 | 386 | ``` 387 | gitoid:sha256\n 388 | ${Artifact ID of Artifact-6}\n 389 | ${Artifact ID of Artifact-7}\n 390 | ``` 391 | 392 | Artifact-1's Input Manifest: 393 | 394 | ``` 395 | gitoid:sha256\n 396 | ${Artifact ID of Artifact-2}⎵manifest⎵${Artifact ID of Artifact-2's Input Manifest}\n 397 | ${Artifact ID of Artifact-3}⎵manifest⎵${Artifact ID of Artifact-3's Input Manifest}\n 398 | ``` 399 | 400 | ### Embedding of Artifact IDs for Input Manifests 401 | 402 | OmniBOR advocates for [build tools](#build-tool) to embed into each 403 | [derived artifact](#derived-artifacts) the Artifact ID of that derived 404 | artifact's Input Manifest.
405 | 406 | Examples: 407 | 408 | **ELF Files (Executables and `.so`, and `.o` files)** 409 | : Embed Input Manifest Artifact ID into an ELF section named `.omnibor` 410 | 411 | **ar Files (`.a` static libraries)** 412 | : Embed Input Manifest Artifact ID into an archive entry named `.omnibor` 413 | 414 | **General Archive files (`tar`, `gzip`, etc.)** 415 | : Embed Input Manifest Artifact ID into an archive entry named `.omnibor` 416 | 417 | **Java `.class` file** 418 | : Embed Input Manifest Artifact ID into an annotation named `@OMNIBOR` in the 419 | `.class` file. 420 | 421 | **Python `.pyc` files** 422 | : Embed Input Manifest Artifact ID into an `__omnibor__` in the `.pyc` file. 423 | 424 | **Container Images** 425 | : Embed Input Manifest Artifact ID into the image manifest as an annotation 426 | named `dot.omnibor` 427 | 428 | **Generated Source Code** 429 | : Embed Input Manifest Artifact ID for a generated source code file using a 430 | comment 431 | 432 | ## SBOM 433 | 434 | OmniBOR is not a Software Bill of Materials (SBOM). It is designed to 435 | complement SBOMs, such as [SPDX](https://spdx.dev/) or 436 | [CycloneDX](https://cyclonedx.org/). 437 | 438 | [OmniBOR](#omnibor) can help [SBOMs](#sbom) be more precise and reliable. 439 | 440 | Most [SBOMs](#sbom) allow for 'external identifiers' and can thus use 441 | [Artifact IDs](#artifact-id) to reference the artifacts in the OmniBOR 442 | [Artifact Dependency Graph (ADG)](#artifact-dependency-graph). This allows an 443 | [SBOM](#sbom) describing a specific component, e.g. 444 | `Component Name: Django` and `Component Version: 1.11.1`, to reference a list 445 | of applicable [Artifact IDs](#artifact-id). 446 | 447 | This is helpful because today two different tools might produce two different 448 | SBOMs for the same software [artifact](#artifact). This could occur if the SBOM 449 | generation tools use different sources to identify and describe the component. 
450 | OmniBOR provides a precise software [Artifact ID](#artifact-id) which can be 451 | used in SBOMs in situations where naming schemes may be ambiguous. 452 | 453 | **Example 1**: If one SBOM generation tool uses [CPEs](https://nvd.nist.gov/products/cpe): 454 | 455 | ``` 456 | cpe:2.3:a:djangoproject:django:1.11.1:*:*:*:*:*:*:* 457 | ``` 458 | 459 | and the other uses [Package URLs (pURLs)](https://github.com/package-url/purl-spec): 460 | 461 | ``` 462 | pkg:pypi/django@1.11.1 463 | ``` 464 | 465 | … then these two SBOMs might diverge when they define the component 466 | supplier: it could be `Component Supplier: djangoproject` or 467 | `Component Supplier: pypi`. 468 | 469 | **Example 2:** In another instance a vendor might choose to use their 470 | product's current marketing name for the component name in their SBOM 471 | generation tools, whereas third-party SBOM generation tools might use the 472 | vendor's product name as listed in a [CPE](https://nvd.nist.gov/products/cpe) 473 | or [SWID tag](https://nvd.nist.gov/products/swid). 474 | 475 | By enabling both SBOM generation tools to list the OmniBOR Artifact ID(s) 476 | associated with the component, an SBOM consumer can quickly understand that 477 | both SBOMs do describe the same artifact, regardless of ambiguities in naming 478 | schemes.
479 | 480 | [specification]: @/spec/_index.md 481 | -------------------------------------------------------------------------------- /content/resources/whitepaper.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Whitepaper 3 | --- 4 | 5 | 6 | OmniBOR: Enabling Universal Artifact Traceability In Software Supply Chains 7 | === 8 | 9 | * Author: Aeva Black 10 | * Status: DRAFT 11 | * Last updated: 2022-01-25 12 | 13 | ## Summary 14 | 15 | OmniBOR is an application of the [git](https://en.wikipedia.org/wiki/Git) [DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph), a widely used Merkle tree with a flat-file storage format, to the challenge of creating build artifact dependency graphs in today's language-heterogeneous open source environments. 16 | 17 | By generating artifact dependency graphs at build time, embedding the hash of the graph in produced artifacts, and referencing that hash in each subsequent build step, OmniBOR will enable the creation of verifiable and complete artifact dependency graphs while requiring no effort from, or changes in, most open source projects. Furthermore, it will enable efficient correlation of vulnerability databases against a concise representation of the artifact dependency graph within run-time environments, if vulnerability databases can be correlated to source files or intermediary packages or libraries. These benefits would also accrue to closed-source projects that use the same build tools, and provide insights which span both open and closed source components in a consistent manner. 18 | 19 | ### Objective 20 | 21 | It is desirable to enable efficient launch-time comparison of the verifiable and complete build tree of any executable component [1] against a then-current list of source files [2] which are known to be undesirable, where such a build tree contains unique referents for all sources from which the given executable object was composed.
22 | 23 | [1]: binary, dynamically-linked library, container image, etc. 24 | 25 | [2]: because vulnerabilities may be discovered between the time an executable is created and the time when it is run, these processes must be decoupled 26 | 27 | 28 | ### Proposal 29 | 30 | In an ideal scenario, an open source consumer would have available to them a complete artifact dependency graph, tracing dependencies to their ultimate depth. Even if we do not achieve this ideal, we should seek a solution with the lowest cost of adoption so as to enable the greatest buy-in across all open source ecosystems and communities. 31 | 32 | For this reason we propose two areas of work: 33 | 1. enhancing artifact-generating tools (e.g., compilers, linkers, and container image generators) to also output metadata regarding their inputs and outputs 34 | 2. defining a storage format which represents the minimum information to describe the artifact dependency graph, and which is based on git's on-disk storage format 35 | 36 | Following from (1), this approach will require minimal to no effort on the part of open source project maintainers, thus significantly increasing its chances of widespread adoption as compared to any approach which requires maintainers to perform additional actions (e.g., implementing substantive changes in their CI/CD or package build pipeline to generate an SBOM). 37 | 38 | Following from (2), this on-disk format provides an efficient and already well-understood method for cross referencing artifacts and source files by a deterministically-generated UUID (SHA1 or SHA256). 
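
To make the identifier concrete, here is a small Python sketch of the blob-style GitOID construction that Git itself uses: hash the object type, an ASCII space, the content length in decimal, a null byte, and then the content. The function name `gitoid` and its parameters are illustrative and not part of any OmniBOR tooling; only the standard library `hashlib` module is used.

```python
import hashlib


def gitoid(content: bytes, hash_name: str = "sha1", obj_type: str = "blob") -> str:
    """Return the hex GitOID of `content` using Git's object-header construction."""
    # Header is "<type> <size>\0", where size is the byte length in base 10.
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\0"
    digest = hashlib.new(hash_name)
    digest.update(header)
    digest.update(content)
    return digest.hexdigest()


if __name__ == "__main__":
    # Matches `echo "hello world" | git hash-object --stdin`.
    print(gitoid(b"hello world\n"))
    # The same construction with SHA-256 yields a 64-character identifier.
    print(gitoid(b"hello world\n", hash_name="sha256"))
```

Because the identifier depends only on the bytes of the content, any two parties hashing the same file obtain the same value without coordinating.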
39 | 40 | 41 | ### ASCII-Art Flow Chart 42 | ``` 43 | ┌─────────────────────────────-┐ 44 | │ Build Time: Graph Generation │ 45 | │ │ 46 | │ ┌────────┐ ┌────────┐ │ 47 | │ │ Src A │ │ Src B │ │ 48 | │ └───┬────┘ └──┬─────┘ │ 49 | │ │ │ │ 50 | │ ▼ │ │ 51 | │ ┌───────┐ │ │ 52 | │ │ Obj A │ │ │ 53 | │ └─────┬─┘ │ │ 54 | │ │ │ │ 55 | │ ▼ ▼ │ 56 | │ ┌─────────────┐ │ 57 | │ │ Compilation │ │ 58 | │ │ & Signing │ │ 59 | │ └─────┬───────┘ │ 60 | │ │ │ 61 | └──────────────┼──────────────-┘ 62 | │ 63 | ┌──▼─┐ 64 | ▼ ▼ 65 | ┌──────────┐ ┌──────--┐ 66 | │ [header] ├──►│omnibor │ 67 | │executable│ │ graph │ 68 | └──────┬───┘ └┬─────--┘ 69 | │ │ 70 | ┌───────┼────────┼────────────────────────────┐ 71 | │ │ │ Run Time: Comparison │ 72 | │ │ │ │ 73 | │ ▼ ▼ ┌────────────┐ │ 74 | │ ┌───────────────┐ │ Public │ │ 75 | │ │ Policy │◄─────►│ Vuln │ │ 76 | │ │ Enforcement │ │ Database │ │ 77 | │ └─┬─────────────┘ └────────────┘ │ 78 | │ │ | │ 79 | │ ▼ ▼ │ 80 | │ ┌─────────────┐ ┌────────────┐ │ 81 | │ │ Runtime | | Scanning | | 82 | | | Environment │◄─────►│ Tools | │ 83 | │ └─────────────┘ └────────────┘ │ 84 | │ │ 85 | └─────────────────────────────────────────────┘ 86 | 87 | ``` 88 | 89 | 90 | ## OmniBOR 91 | 92 | OmniBOR is an approach which has the following properties: 93 | 1. re-uses a well understood paradigm for modelling artifact relationships efficiently in flat files on disk in a machine-readable format 94 | 2. optimally efficient approach for run-time comparison of any given binary object against a dataset of signatures of known-vulnerable inputs 95 | 3. does not require project maintainers to make any changes to their workflow in order to comply with the [Biden Executive Order](https://www.whitehouse.gov/briefing-room/presidential-actions/2021/05/12/executive-order-on-improving-the-nations-cybersecurity/) 96 | 4. has a bounded scope of work to achieve near-complete coverage of the F/OSS landscape 97 | 5. 
could be integrated with both free and commercial services 98 | 99 | ### Characteristics 100 | 101 | 1. **Artifact Equivalence**: Two artifacts are equivalent IFF `[]byte(artifact1) == []byte(artifact2)`. 102 | 2. **Artifact Identity**: Independent parties, presented with equivalent artifacts, derive the same artifact identity. 103 | 3. **Immutability**: An identified artifact can not be modified without also changing its identity. 104 | 4. **Uniqueness**: An artifact can have precisely *one* artifact identity graph. All equivalent artifacts have the same graph. 105 | 5. **Transparently Opaque**: Artifacts and associated metadata may be obfuscated when sharing the artifact identity graph, while preserving other properties. 106 | 6. **Truncatability of Graph**: Artifact identity graphs may themselves be treated as artifacts, enabling truncation of a part of the graph and replacing the leading node with a signature of the sub-graph, thereby preserving all other properties with respect to the whole. 107 | 7. **Independent Metadata**: Artifacts may be associated, through their identity, to independently generated metadata stored outside of the artifact identity graph, such as an SBOM containing license and provenance metadata. 108 | 8. **Authoritative Reference**: By generating the artifacts in the authoring function, correctness of the generated artifact identity graph can have the minimum number of dependencies (N=1) and least error rate of all solutions which could generate similar graphs. 109 | 9. **Non-reputability**: 110 | 10. **Embedded**: An artifact includes a unique identifier of the document containing the artifact identity graph used to generate that artifact. 111 | 112 | 113 | #### 1. Artifact Equivalence 114 | 115 | *Two artifacts are equivalent if `[]byte(artifact1) == []byte(artifact2)`.* 116 | 117 | Two artifacts are said to be equivalent if and only if they are byte-for-byte identical. 
This implies that OmniBOR is not concerned with questions of provenance, origination, licensure, or many other aspects which are encompassed by a software bill of materials, and which could differ between byte-equivalent artifacts. 118 | 119 | #### 2. Artifact Identity 120 | 121 | *Independent parties, presented with equivalent artifacts, derive the same artifact identity.* 122 | 123 | This implies that a deterministic hashing function may be used to derive artifact identity, such as SHA256. 124 | 125 | #### 3. Immutability 126 | 127 | *An identified artifact cannot be modified without also changing its identity. Non-equivalent artifacts have distinct identities.* 128 | 129 | "An identified artifact" means an artifact whose identity has been determined. "Cannot be modified without also changing its identity" means that the deterministic hashing function has no collisions, and therefore any change to the artifact results in a change to its identity. In this way, the relationship between artifact and identity is immutable. 130 | 131 | #### 4. Uniqueness 132 | 133 | *An artifact can have precisely one artifact identity graph. All equivalent artifacts have the same graph.* 134 | 135 | This implies that we must not include build tooling in the artifact dependency graph, as doing otherwise would violate the Uniqueness requirement. For example, two reproducible build systems which rely on different auxiliary libraries (e.g., zlib) and result in byte-equivalent outputs **must** yield identical OmniBORs. 136 | 137 | For further exploration of this topic, see Wheeler's work on reproducibility as a means to verify trustability: [Countering Trusting Trust through Diverse Double-Compiling](https://dwheeler.com/trusting-trust/) 138 | 139 | {{% notification type="info" %}} 140 | **Note the implication** that for any artifact, there can only be one artifact identity graph, but the reverse is not true.
Each artifact identity graph may generate multiple artifacts (e.g., if different build parameters are used, or it is compiled on a different architecture, or different metadata, such as compile time, were embedded in the built artifact). 141 | {{% /notification %}} 142 | 143 | #### 5. Transparently Opaque 144 | 145 | *Artifacts and associated metadata may be obfuscated when sharing the artifact identity graph, while preserving other properties.* 146 | 147 | Metadata about artifacts and their associated artifact dependency graphs may have varying levels of sensitivity. OmniBOR allows the supplier to reveal as little or as much as they, in negotiation with their consumers, choose. The OmniBOR graph itself is just a [merkle tree](https://en.wikipedia.org/wiki/Merkle_tree) of opaque hashes. This provides transparency about the artifact dependency graph and its structure, while allowing supplier-modulated levels of opaqueness about the metadata. 148 | 149 | #### 6. Truncatability of Graph 150 | 151 | 152 | 153 | #### 7. Independent Metadata 154 | 155 | *Artifacts may be associated, through their identity, to independently generated metadata stored outside of the artifact identity graph, such as an SBOM containing license and provenance metadata.* 156 | 157 | There are many use cases that could use OmniBORs. An incomplete list would include: 158 | 159 | * Detecting potential vulnerabilities in executables/containers. 160 | * Identifying Open Source License obligations 161 | * Identifying commercial license obligations 162 | * More reliable attestation 163 | * Post exploit forensics 164 | 165 | Undoubtedly, more will arise. Independence of metadata enables independent, permissionless innovation around each use case without the need for cross-domain coordination. This lowers the cost of innovation and thus allows more productive innovation in this space. 166 | 167 | #### 8. Authoritative Reference 168 | 169 | #### 9.
Non-reputability 170 | 171 | 172 | ### What OmniBOR is not 173 | 174 | 1. Not a system for build reproducibility, but it does provide information that is useful for that. 175 | 2. Not a version control system, though it is designed to co-exist with them. 176 | 3. Not an SBOM, though it is designed to complement them. 177 | 4. Not a globally unique software identifier (SWID). 178 | 5. Not reliant on any particular packaging or distribution mechanism, either for artifacts or for artifact identity graphs). 179 | 180 | 181 | ### Comparison to Software Bill Of Materials and our Objective 182 | 183 | {{% notification type="info" %}} 184 | OmniBOR is **not** an SBOM standard. 185 | {{% /notification%}} 186 | 187 | From the OmniBOR perspective, any SBOM document is a type of artifact which could be referenced in an artifact dependency graph. 188 | 189 | From an SBOM perspective, OmniBOR is a common precise way to identify artifacts and their artifact dependency graphs, and nothing more. This makes OmniBOR incapable of fulfilling many of the objectives of SBOMs, such as recording provenance, origination, build environment information, licensure, and other qualities. 190 | 191 | {{% notification type="info" %}} 192 | Speaking strictly from an **SPDX 3.0-draft** perspective, OmniBOR is a lossy serialization format that only includes the minimum metadata field of "Identifier". 193 | {{% /notification %}} 194 | 195 | Current metadata formats, such as SPDX 2.x, as well as current systems to sign and transport metadata documents, do not *efficiently* support [our use case](#Objective) in the general case. They may well, however, support this use case in a specialized case, which we will discuss. 196 | 197 | An argument can be made that current metadata formats can enable run-time analysis of the complete artifact dependency graph. 
Achieving this would require (1) that generation of SBOM metadata be performed using compatible tooling by every project within the graph, (2) that the documents' distribution be consistent, and, crucially, (3) that a separate system exist to recursively fetch and parse metadata documents for all related projects and index them in a manner enabling efficient search. 198 | 199 | Let us look briefly at these three adoption requirements in more detail to understand the implications for (and, at least, one motivation for hesitancy in uptake of) volunteer-maintained open source projects. 200 | 201 | 1. Current tooling to generate SBOM documents requires effort on the part of every OSS project maintainer to integrate with their build systems. While full SBOM generation *could* be integrated into compilers and linkers, as we propose for OmniBOR, many view the complexity as overly burdensome on small projects, [creating a source of friction](https://opensource.com/article/21/8/open-source-maintainers) that has and may continue to hamper adoption. On the other hand, due to the pervasiveness of Git itself, we believe a minimalist approach that *already feels familiar* will be better received by this long tail of OSS projects. 202 | 203 | 2. One obstacle in the distribution and adoption of SBOMs has been competing standards (see the "Landscape" document for examples in addition to SPDX). By proposing to capture only the bare minimum metadata necessary to enable this scenario, we believe this proposal will avoid the ongoing debates about competing standards. *N.B.: Early socialization of this idea has received fairly wide support for the principle of a minimalist disk-based representation of the artifact dependency graph.* 204 | 205 | 3. Run-time comparison, as described in the Objective, must be within the capabilities of even small and independent consumers of open source.
A proposal which required large investments in infrastructure (e.g., that an operator maintain a database containing complete SBOM documents for the totality of open source) will not be seen as a reasonable requirement for smaller and independent organizations (even though it may make for a very compelling product offering, were someone to build and license it!). 206 | 207 | ### How will this intersect with reproducible build systems? 208 | 209 | ***TODO*** 210 | 211 | ### Does this play well with In-Toto? 212 | 213 | ***TODO: Santiago*** 214 | 215 | ### OmniBOR and SWID 216 | 217 | ### OmniBOR and pURL 218 | 219 | 220 | ## Examples 221 | 222 | ### Example: hello.c 223 | 224 | Imagine we have the following two files: 225 | 226 | `hello.c` has gitoid `c64efd8bd8bceca8c69f9b5b7647cf0ff61fed59` and includes `stdio.h` 227 | 228 | `stdio.h` has gitoid `c0f35b8ae567f5348df3711496fdc0ef6f634169` 229 | 230 | From these two inputs, we compile `hello.o`. The resulting OmniBOR is a document (text file) containing the lexically ordered sequence of the gitoids of each input artifact related to this build step: 231 | 232 | ``` 233 | blob⎵c0f35b8ae567f5348df3711496fdc0ef6f634169\n 234 | blob⎵c64efd8bd8bceca8c69f9b5b7647cf0ff61fed59\n 235 | ``` 236 | 237 | The gitoid of the resulting document is `85322091b1d50a23d1c2a0f5933788a2a958f2ad`, and this document is written out to disk in a directory in the build environment, e.g.: 238 | 239 | ``` 240 | ./.bom/object/85/322091b1d50a23d1c2a0f5933788a2a958f2ad 241 | ``` 242 | 243 | The compiler would also embed this gitoid in a new ELF section of the resulting `hello.o` binary; this adds a total of 89 bytes when accounting for ELF section formatting. 244 | 245 | ### Example: OCI v2 / ORAS 246 | 247 | Imagine we have the following Dockerfile: 248 | ```docker 249 | FROM : 250 | RUN 251 | ``` 252 | 253 | We calculate the hash of `:`, which is: `000TODO`.

Things get a little trickier when we go to calculate the hash of the next layer.

Also, we want to produce an artifact dependency graph that can reference the OmniBOR of any artifacts added to that layer, not merely a hash of the whole layer. We'll do that by ... *TODO* ...

Combining these together, we produce the following OmniBOR document:
```
blob_000TODO
blob_000TODO
```
... and embed the gitoid of this OmniBOR in the image manifest's `annotations` field, like so:

```
{
  "schemaVersion": 2,
  "config": {...},
  "layers": [ {...}, {...} ],
  "annotations": {
    "omnibor": "sha256:abc123TODO"
  }
}
```
{{% notification type="info" %}}
**NOTE**: The annotation type 'omnibor' is not yet standardized or accepted to OCI. In the above snippet, 'omnibor' is merely an example.
{{% /notification %}}

### Example: truncating a graph for non-public subgraphs

***TODO***

### Example: very large build systems (e.g., Linux)

***TODO***

## Proposed Implementation

### For Compiled Artifacts

**TODO:** Replace / reformat examples as a specification
- *Describe implementation for GCC*
- *Describe implementation for LLVM*
- *Address container image composition*

### For Non-compiled Artifacts

**TODO**
- *Address run-time compiled languages, such as Python and Java*


## Credits and Gratitudes

I must thank Ed Warnicke, who pitched this idea to me one sunny summer afternoon in 2021 while I was stuck in Puget Sound traffic, and who graciously accommodated my awkward schedule throughout the rest of the year, most often while both of us were in a car.

I must also thank everyone who provided input and feedback to my "Open Source Landscape" document in 2021, which I have since migrated to a [GitHub repo](https://github.com/AevaOnline/supply-chain-synthesis). The knowledge I gained through those discussions allowed me to identify a tool that was missing from my "supply chain backpack": the OmniBOR.

-------------------------------------------------------------------------------- /content/glossary/gitref.svg: -------------------------------------------------------------------------------- 1 | 2 | 3 | --------------------------------------------------------------------------------