├── publications
│   └── threats-risks-mitigations
│       ├── v1.1
│       │   ├── img
│       │   │   ├── Ideation.png
│       │   │   ├── Threat&Risks.png
│       │   │   ├── WritingCode.png
│       │   │   ├── AttackSurface.png
│       │   │   ├── AttackSurface2.png
│       │   │   ├── PackageIdentity.png
│       │   │   ├── LocalDevelopment.png
│       │   │   ├── ExternalContribution.png
│       │   │   ├── CentralInfrastructure.png
│       │   │   ├── PackageConsumptionFlow.png
│       │   │   ├── VulnerabilitiesFixingFlow.png
│       │   │   └── ReducingTheLikelihoodThatAVulnerabilityWillBeIntroduced.png
│       │   ├── Supporting Visio Diagrams.vsdx
│       │   ├── Threats, Risks, and Mitigations in the Open Source Ecosystem - v1.1.docx
│       │   ├── Threats, Risks, and Mitigations in the Open Source Ecosystem - v1.1.pdf
│       │   └── Threats, Risks, and Mitigations in the Open Source Ecosystem.md
│       ├── v1.2
│       │   └── img
│       │       ├── Ideation.png
│       │       ├── Threat&Risks.png
│       │       ├── WritingCode.png
│       │       ├── AttackSurface.png
│       │       ├── AttackSurface2.png
│       │       ├── PackageIdentity.png
│       │       ├── LocalDevelopment.png
│       │       ├── ExternalContribution.png
│       │       ├── CentralInfrastructure.png
│       │       ├── PackageConsumptionFlow.png
│       │       ├── VulnerabilitiesFixingFlow.png
│       │       └── ReducingTheLikelihoodThatAVulnerabilityWillBeIntroduced.png
│       ├── v1
│       │   ├── Supporting Visio Diagrams.vsdx
│       │   ├── Threats, Risks, and Mitigations in the Open Source Ecosystem - v1.docx
│       │   └── Threats, Risks, and Mitigations in the Open Source Ecosystem - v1.pdf
│       └── README.md
├── SECURITY.md
├── .github
│   └── settings.yml
├── virtual-mini-summit-for-maintainers-of-critical-OSS-projects.md
├── README.md
├── CHARTER.md
└── LICENSE
--------------------------------------------------------------------------------
(Binary files are not inlined in this dump: the .png images, the Supporting
Visio Diagrams .vsdx files, and the .docx/.pdf documents listed above are each
available at
https://raw.githubusercontent.com/ossf/wg-metrics-and-metadata/HEAD/<path-to-file>.)
--------------------------------------------------------------------------------
/SECURITY.md:
--------------------------------------------------------------------------------
# Security

Per the [Linux Foundation Vulnerability Disclosure Policy](https://www.linuxfoundation.org/security),
if you find a vulnerability in a project maintained by the Open Source Security Foundation (OpenSSF),
please report that directly to the project maintaining that code, preferably using
GitHub's [Private Vulnerability
Reporting](https://docs.github.com/en/code-security/security-advisories/guidance-on-reporting-and-writing/privately-reporting-a-security-vulnerability#privately-reporting-a-security-vulnerability).

If you've been unable to find a way to report it, or have received no response after repeated attempts,
please contact the OpenSSF security contact email, [security@openssf.org](mailto:security@openssf.org).

Thank you.
--------------------------------------------------------------------------------
/publications/threats-risks-mitigations/README.md:
--------------------------------------------------------------------------------
# About

The purpose of this document is to build a mutual understanding of the high-level threats,
security risks, and potential mitigations associated with the open source ecosystem. There
is a natural overlap between these threats and risks, and those that affect the more general
software development process.

The primary intended audience consists of members of the
[Open Source Security Foundation](https://github.com/ossf) and similar organizations
interested in promoting and advancing improvements to the security of the open source
ecosystem. This document should not be considered a product roadmap or promised set of
features. It should also be noted that it focuses exclusively on security risk and does
not include risks related to intellectual property (i.e., patents, copyright, licensing,
contracts) or other domains.
--------------------------------------------------------------------------------
/.github/settings.yml:
--------------------------------------------------------------------------------
repository:
  # See https://developer.github.com/v3/repos/#edit for all available settings.

  # The name of the repository.
  # Changing this will rename the repository.
  name: wg-identifying-security-threats

  # A short description of the repository that will show up on GitHub
  description: The purpose of the Identifying Security Threats working group is to enable stakeholders to have informed confidence in the security of open source projects. We do this by collecting, curating, and communicating relevant metrics and metadata from open source projects and the ecosystems of which they are a part.

  # A URL with more information about the repository
  homepage: https://openssf.org

  # Collaborators: give specific users access to this repository.
  # See /governance/roles.md for details on the write-access policy.
  # Note that the permissions below may provide wider access than needed for
  # a specific role, and we trust these individuals to act according to their
  # role. If there are questions, please contact one of the chairs.
  collaborators:
    # Chairs and admin help
    - username: scovetta
      permission: admin

    - username: rhaning
      permission: admin

    # Contributors
    # all permissions except admin

    - username: caniszczyk
      permission: push

labels:
  - name: helpwanted
    color: ffff54
  - name: good first issue
    color: ff8c00
  - name: meeting
    color: 00ff00

# additional colors in this palette:
# 7f0000, 1e90ff, ffdab9, ff69b4
--------------------------------------------------------------------------------
/virtual-mini-summit-for-maintainers-of-critical-OSS-projects.md:
--------------------------------------------------------------------------------
# Virtual Mini Summit for Maintainers of Critical OSS Projects

:calendar: One-day event: February 22, 2023 at 10:30 EST

🌎 Online event

### What?

A virtual meeting for maintainers of critical open source software.

It provides a space where maintainers of said projects can share their problems, pain points, and common experiences around the challenges of securing and maintaining their projects.

### What is it not?

This is not an effort to drive the OSS community toward a higher security level by presenting solutions, but to explore in detail the particular pain points experienced by open source projects. It is not to be confused with the Alpha-Omega project; rather, it is a public forum/space where maintainers can let people know about their efforts to make their OSS projects more secure.

### Who?

List of invitees (candidates):
- [List of Critical Open Source Projects, Components, and Frameworks](https://github.com/ossf/wg-securing-critical-projects)
- Ask the foundations for their strongest contributor leads

Event size: ~40 people and/or ~15 critical projects.

This is a private event, and attendees will be asked to abide by the Chatham House Rule.

### Event Details

- Date: February 22, 2023
- Time: 0730 Pacific / 1030 Eastern / 1630 CET
- Duration: one-day event, 3.5–4 hours
- Workshop format: 1.5 hours

### Event Schedule

- 10 min – Introduction / Welcome
- 15 min – Framing / Survey Results
- 45 min – Panel
- 10 min – Break
- 65 min – Breakout – Moderated working groups (4)
- 15 min – Break
- 20 min – Open discussion
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Metrics and Metadata in Open Source Projects

The purpose of this working group is to enable stakeholders to have informed
confidence in the security of open source projects.
We do this by collecting,
curating, and communicating relevant metrics and metadata from open source
projects and the ecosystems of which they are a part.

### Motivation

Open source software is an essential part of modern software development, and
of practically all technology solutions. Adoption of open source software has
grown over the past two decades, powering everything from tiny "Internet of
Things" devices to the most advanced supercomputers in the world. This has led
to enormous productivity gains, allowing software engineers to focus more on
solving business problems and less on creating and re-creating the same
building blocks needed in many situations.

With these benefits, however, comes some risk. Attackers frequently target
open source projects and the ecosystems they are a part of in order to
compromise the organizations or users that use those projects. It's
essential that we understand these threats and work to build defenses against
them.

### Objective

Our objective is to enable stakeholders to have informed confidence in the
security of open source projects. This includes identifying threats to the
open source ecosystem and recommending practical mitigations. We will also
identify a set of key metrics and build tooling to communicate those metrics
to stakeholders, enabling a better understanding of the security posture of
individual open source software components.

### Scope

The scope of this working group includes "security", as opposed to privacy,
resiliency, or other related areas. We also consider the broad open source
ecosystem, as opposed to focusing exclusively on critical open source projects.
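
To make the "key metrics" objective above concrete, here is a minimal sketch of how per-check security results (in the style of the dashboard inputs mentioned below) might be rolled up into a single posture summary. The check names, scores, and the failing threshold are invented for illustration; this is not a real Scorecard or dashboard schema.

```python
# Minimal sketch: roll per-check security results up into one summary.
# The check names, scores, and the "failing" threshold below are
# illustrative assumptions, not a real schema.
from statistics import mean

def summarize(checks):
    """Return (overall score rounded to 1 decimal, names of weak checks)."""
    overall = round(mean(c["score"] for c in checks), 1)
    failing = [c["name"] for c in checks if c["score"] < 5]
    return overall, failing

sample = [  # hypothetical results for a single project
    {"name": "Branch-Protection", "score": 8},
    {"name": "Dependency-Update-Tool", "score": 10},
    {"name": "Signed-Releases", "score": 2},
]

print(summarize(sample))  # (6.7, ['Signed-Releases'])
```

A real dashboard would, of course, draw these results from live sources rather than inline data, but the aggregation step looks much like this.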

### Active Projects

* [Security Insights](https://github.com/ossf/security-insights-spec) - Provides a mechanism for projects to report information about their security practices in a machine-readable way.
  * Lead: Luigi Gubello

* **Security Risk Dashboard** - This project's purpose is to collect, organize, and provide interesting security metrics for
  open source projects to stakeholders, including users.
  * Lead: Jay White

* [Security Reviews](https://github.com/ossf/security-reviews) -
  This repository contains a collection of security reviews of open source software.

* [Threats, Risks, and Mitigations in the Open Source Ecosystem](https://github.com/ossf/wg-identifying-security-threats/blob/main/publications/threats-risks-mitigations/v1.1/Threats%2C%20Risks%2C%20and%20Mitigations%20in%20the%20Open%20Source%20Ecosystem%20-%20v1.1.pdf)

### Get Involved

* Please get involved with our specific projects, e.g.,
  * [Mailing List](https://lists.openssf.org/g/openssf-wg-security-threats) and [Security Reviews](https://github.com/ossf/security-reviews).
    ([Manage your subscriptions to OpenSSF mailing lists](https://lists.openssf.org/g/main/subgroups))
* [OpenSSF Community Calendar](https://calendar.google.com/calendar?cid=czYzdm9lZmhwNWk5cGZsdGI1cTY3bmdwZXNAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ)
* [Join us on Slack](https://openssf.slack.com/archives/C01A50B978T)

### Related Work

* [OpenSSF Best Practices Badge Program](https://bestpractices.coreinfrastructure.org/) - an input to the metrics dashboard generated by the Security Metrics project (formerly named the CII Best Practices Badge Program).
* [OpenSSF Scorecard](https://github.com/ossf/scorecard) - another input to the metrics dashboard

* [CHAOSS](https://chaoss.community) - develops definitions of metrics

### Quick Start

The best way to get started is to simply join a working group meeting. You can also
read our [Meeting Minutes](https://docs.google.com/document/d/1XimygAYGbG2aofAiBD9--ZMTALdnbdbVw53R851ZZKY/edit?usp=sharing) to get up to speed with what we're up to.

### Meeting Times

* We meet every other week on Wednesdays. See the
  [OpenSSF Community Calendar](https://calendar.google.com/calendar?cid=czYzdm9lZmhwNWk5cGZsdGI1cTY3bmdwZXNAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ).

### Meeting Notes

See the [Meeting Minutes](https://docs.google.com/document/d/1XimygAYGbG2aofAiBD9--ZMTALdnbdbVw53R851ZZKY/edit?usp=sharing). If attending, please add your name; if you are a returning attendee, please change the color of your name from gray to black.

### Antitrust Policy Notice

Linux Foundation meetings involve participation by industry competitors, and it is the intention of the Linux Foundation to conduct all of its activities in accordance with applicable antitrust and competition laws. It is therefore extremely important that attendees adhere to meeting agendas, and be aware of, and not participate in, any activities that are prohibited under applicable US state, federal or foreign antitrust and competition laws.

Examples of types of actions that are prohibited at Linux Foundation meetings and in connection with Linux Foundation activities are described in the Linux Foundation Antitrust Policy, available at https://www.linuxfoundation.org/antitrust-policy. If you have questions about these matters, please contact your company counsel, or if you are a member of the Linux Foundation, feel free to contact Andrew Updegrove of the firm of Gesmer Updegrove LLP, which provides legal counsel to the Linux Foundation.

### Governance

The [CHARTER](https://github.com/ossf/wg-identifying-security-threats/blob/main/CHARTER.md)
document outlines the scope and governance of our group activities.

The workgroup leads are:
* Michael Scovetta
* Luigi Gubello
--------------------------------------------------------------------------------
/CHARTER.md:
--------------------------------------------------------------------------------
# Technical Charter for Open Source Security Foundation

Metrics and Metadata - Working Group

Adopted June 23, 2022

This Technical Charter sets forth the responsibilities and procedures for technical contribution to, and oversight of, the Metrics and Metadata working group of the open source community, which has been established as a Working Group (the "Technical Initiative") under the Open Source Security Foundation (the "OpenSSF"). All contributors (including committers, maintainers, and other technical positions) and other participants in the Technical Initiative (collectively, "Collaborators") must comply with the terms of this Technical Charter and the OpenSSF Charter.

#### 1. Mission and Scope of the Technical Initiative

- a. The mission of the Technical Initiative is to enable stakeholders to have informed confidence in the security of open source projects. We do this by collecting, curating, and communicating relevant metrics and metadata from open source projects and the ecosystems of which they are a part.

- b. The scope of the Technical Initiative includes collaborative development under the Technical Initiative License (as defined herein) supporting the mission, including organizing collaboration activities, defining best practices, documentation, testing, integration, and the creation of other artifacts that support the mission.

#### 2. Technical Steering Committee

- a.
The Technical Steering Committee (the "TSC") will be responsible for all oversight of the Technical Initiative.

- b. The TSC voting members are initially the Technical Initiative's Maintainers. The Maintainers will be documented in the Technical Initiative repository. The TSC is responsible for determining the future process for defining voting members of the TSC, and any such alternative approach will also be documented appropriately. Any meetings of the Technical Steering Committee are intended to be open to the public, and can be conducted electronically, via teleconference, or in person.

- c. The Technical Initiative generally will involve Collaborators and Contributors. The TSC may adopt or modify additional roles so long as the roles are documented in the Technical Initiative's repository. Unless otherwise documented:

  - i. Contributors include anyone in the technical community who contributes effort, ideas, code, documentation, or other artifacts to the Technical Initiative;

  - ii. Collaborators are Contributors who have earned the ability to modify ("commit") text, source code, documentation or other artifacts in the Technical Initiative's repository or direct the agenda or working activities of the Technical Initiative; and

  - iii. A Contributor may become a Collaborator by a majority approval of the existing Collaborators. A Collaborator may be removed by a majority approval of the other existing Collaborators.

  - iv. Maintainers are the initial Collaborators defined at the creation of the Technical Initiative. The Maintainers will determine the process for selecting future Maintainers. A Maintainer may be removed by two-thirds approval of the other existing Maintainers, or a majority of the other existing Collaborators.

- d.
Participation in the Technical Initiative through becoming a Contributor, Collaborator, or Maintainer is open to anyone, whether an OpenSSF member or not, so long as they abide by the terms of this Technical Charter.

- e. The TSC may create, change, modify, or remove roles or their definitions, so long as the definitions of roles for the Technical Initiative are publicly available in the Technical Initiative repository.

- f. The TSC may elect a TSC Chair, who will preside over meetings of the TSC and will serve until their resignation or replacement by the TSC. The TSC Chair, or any other TSC member so designated by the TSC, will serve as the Technical Initiative's voting representative on the OpenSSF's Technical Advisory Council (the "TAC").

- g. Responsibilities: The TSC will be responsible for all aspects of oversight relating to the Technical Initiative, which may include:

  - i. coordinating the direction of the Technical Initiative;

  - ii. approving, organizing, or removing activities and projects;

  - iii. establishing community norms, workflows, processes, release requirements, and templates for the operation of the Technical Initiative;

  - iv. establishing a fundraising model, and approving or modifying a Technical Initiative budget, subject to OpenSSF Governing Board approval;

  - v. appointing representatives to work with other open source or open standards communities;

  - vi. approving and implementing policies and processes for contributing (to be published in the Technical Initiative repository) and coordinating with the Linux Foundation to resolve matters or concerns that may arise as set forth in Section 6 of this Technical Charter;

  - vii. facilitating discussions, seeking consensus, and where necessary, voting on technical matters relating to the Technical Initiative; and

  - viii. coordinating any communications regarding the Technical Initiative.

#### 3. TSC Voting

- a. While the Technical Initiative aims to operate as a consensus-based community, if any TSC decision requires a vote to move the Technical Initiative forward, the voting members of the TSC will vote on a one-vote-per-voting-member basis.

- b. Quorum for TSC meetings requires at least fifty percent of all voting members of the TSC to be present. The TSC may continue to meet if quorum is not met but will be prevented from making any decisions at the meeting.

- c. Except as provided in Sections 7.c. and 8.a., decisions by vote at a meeting require a majority vote of those in attendance, provided quorum is met. Decisions made by electronic vote without a meeting require a majority vote of all voting members of the TSC.

- d. In the event a vote cannot be resolved by the TSC, any voting member of the TSC may refer the matter to the TAC for assistance in reaching a resolution.

#### 4. Compliance with Policies

- a. This Technical Charter is subject to the OpenSSF Charter and any rules or policies established for all Technical Initiatives.

- b. The Technical Initiative participants must conduct their business in a professional manner, subject to the Contributor Covenant Code of Conduct 2.0, available at [https://www.contributor-covenant.org/version/2/0/code_of_conduct](https://www.contributor-covenant.org/version/2/0/code_of_conduct/). The TSC may adopt a different code of conduct ("CoC") for the Technical Initiative, subject to approval by the TAC.

- c. All Collaborators must allow open participation from any individual or organization meeting the requirements for contributing under this Technical Charter and any policies adopted for all Collaborators by the TSC, regardless of competitive interests.
Put another way, the Technical Initiative community must not seek to exclude any participant based on any criteria, requirement, or reason other than those that are reasonable and applied on a non-discriminatory basis to all Collaborators in the Technical Initiative community. All activities conducted in the Technical Initiative are subject to the Linux Foundation’s Antitrust Policy, available at [https://www.linuxfoundation.org/antitrust-policy](https://www.linuxfoundation.org/antitrust-policy/). 73 | 74 | - d. The Technical Initiative will operate in a transparent, open, collaborative, and ethical manner at all times. The output of all Technical Initiative discussions, proposals, timelines, decisions, and status should be made open and easily visible to all. Any potential violations of this requirement should be reported immediately to the TAC. 75 | 76 | #### 5. Community Assets 77 | 78 | - a. The Linux Foundation will hold title to all trade or service marks used by the Technical Initiative ("Technical Initiative Trademarks"), whether based on common law or registered rights. Technical Initiative Trademarks may be transferred and assigned to LF Technical Initiatives to hold on behalf of the Technical Initiative. Any use of any Technical Initiative Trademarks by Collaborators in the Technical Initiative will be in accordance with the trademark usage policy of the Linux Foundation, available at [https://www.linuxfoundation.org/trademark-usage](https://www.linuxfoundation.org/trademark-usage/), and inure to the benefit of the Linux Foundation. 79 | 80 | - b. The Linux Foundation or Technical Initiative must own or control the repositories, social media accounts, and domain name registrations created for use by the Technical Initiative community. 81 | 82 | - c. 
Under no circumstances will the Linux Foundation be expected or required to undertake any action on behalf of the Technical Initiative that is inconsistent with the policies or tax-exempt status or purpose, as applicable, of the Linux Foundation. 83 | 84 | #### 6. Intellectual Property Policy 85 | 86 | - a. Collaborators acknowledge that the copyright in all new contributions will be retained by the copyright holder as independent works of authorship and that no contributor or copyright holder will be required to assign copyrights to the Technical Initiative. 87 | 88 | - b. Except as described in Section 6.c., all contributions to the Technical Initiative are subject to the following: 89 | 90 | - i. All new inbound code contributions to the Technical Initiative must be made using the Apache License, Version 2.0, available at [https://www.apache.org/licenses/LICENSE-2.0](https://www.apache.org/licenses/LICENSE-2.0) (the "Technical Initiative License"). 91 | 92 | - ii. All new inbound code contributions must also be accompanied by a Developer Certificate of Origin ([http://developercertificate.org](http://developercertificate.org)) sign-off in the source code system that is submitted through a TSC-approved contribution process which will bind the authorized contributor and, if not self-employed, their employer to the applicable license; 93 | 94 | - iii. All outbound code will be made available under the Technical Initiative License. 95 | 96 | - iv. Documentation will be received and made available by the Technical Initiative under the Creative Commons Attribution 4.0 International License, available at [http://creativecommons.org/licenses/by/4.0/](http://creativecommons.org/licenses/by/4.0/). 97 | 98 | - v. To the extent a contribution includes or consists of data, any rights in such data shall be made available under the CDLA-Permissive 1.0 License. 99 | 100 | - vi. 
The Technical Initiative may seek to integrate and contribute back to other open source projects ("Upstream Projects"). In such cases, the Technical Initiative will conform to all license requirements of the Upstream Projects, including dependencies, leveraged by the Technical Initiative. Upstream Project code contributions not stored within the Technical Initiative’s main code repository will comply with the contribution process and license terms for the applicable Upstream Project. 101 | 102 | - c. The TSC may approve the use of an alternative license or licenses for inbound or outbound contributions on an exception basis. To request an exception, please describe the contribution, the alternative open source license(s), and the justification for using an alternative open source license for the Technical Initiative. License exceptions must be approved by a two-thirds vote of the entire Governing Board. 103 | 104 | - d. Contributed files should contain license information, such as SPDX short form identifiers, indicating the open source license or licenses pertaining to the file. 105 | 106 | #### 7. Amendments 107 | 108 | - a. This charter may be amended by a two-thirds vote of the entire TSC and is subject to approval by the TAC. 109 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 
14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 
47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. 
Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 
122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. 
In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "[]" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. 
We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright [yyyy] [name of copyright owner] 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | -------------------------------------------------------------------------------- /publications/threats-risks-mitigations/v1.1/Threats, Risks, and Mitigations in the Open Source Ecosystem.md: -------------------------------------------------------------------------------- 1 | ## Threats, Risks, and Mitigations in the Open Source Ecosystem 2 | 3 | *[Michael Scovetta](mailto:michael.scovetta@microsoft.com), Microsoft* 4 | 5 | *in collaboration with the Open Source Security Coalition* 6 | 7 | The purpose of this document is to build a mutual understanding of the high-level threats, security risks, and potential mitigations associated with the open source ecosystem. There is a natural overlap between these threats and risks, and those that affect the more general software development process. 
The primary intended audience consists of members of the [Open Source Security Coalition](https://securitylab.github.com/) (the “Coalition”, herein) and similar organizations interested in promoting and advancing improvements to the security of the open source ecosystem; this document should not be considered a product roadmap or a promised set of features. It should also be noted that this document focuses exclusively on security risk and does not include risks related to intellectual property (i.e. patents, copyright, licensing, contracts). 8 | 9 | # Introduction 10 | 11 | Open source software is an essential part of modern software development, and of practically all technology solutions. Adoption of open source software has grown over the past two decades, powering everything from tiny Internet of Things devices to the most advanced supercomputers in the world. Over the last decade, the number of open source packages available through package management systems has grown from around 30,000 to well over two million today. This has led to enormous productivity gains, allowing software engineers to focus more on solving business problems and less on creating and re-creating the same building blocks needed in many situations. 12 | 13 | Open source itself, however, is primarily created by volunteers, working on their own time on a project they are passionate about. They often receive no monetary compensation for their work other than the satisfaction that their creation is useful to others, but their work product is routinely used to power for-profit businesses and other organizations. This can create discord between producers and consumers. 14 | 15 | > *"Open Source & I are going through a labor dispute right now. I really lost a lot of faith in open source when I noticed billion dollar corporations were using my software and not a single one ever bothered to donate even a few dollars to keep it going, but filed tickets."
- Jordon Bedwell (@envygeeks)* 16 | 17 | Open source software brings great capability, but with it comes some amount of risk. According to the [2019 State of Software Supply Chain](https://www.sonatype.com/en-us/software-supply-chain-2019) report released by Sonatype: 18 | 19 | - The number of days between vulnerability disclosure and exploit creation has shrunk from 45 to 3. 20 | 21 | - Over half of JavaScript components contain at least one known security vulnerability. 22 | 23 | - JavaScript packages are downloaded over 10 billion times per week (via NPM), which averages to more than 53,000 per developer per year. 24 | 25 | More generally, security vulnerabilities continue to grow in number, with over 17,000 CVEs¹ published in 2019 and nearly 9,000 published in the first half of 2020. Of those 26,000, over 4,000, or fifteen percent, were rated [critical](https://nvd.nist.gov/vuln-metrics/cvss). 26 | 27 | These are scary numbers, but they do not tell the whole story. The purpose of this document is not to promote fear, but to offer solutions and align disparate efforts toward a common goal. To move forward, we must first build a mutual understanding of the threats and risks associated with the open source ecosystem. Where applicable, we offer suggestions on ways to address the threats and mitigate the risks, but we do not presume any of this to be exhaustive. At best, we hope to start a conversation about the best way to proceed. 28 | 29 | A summary of recommendations can be found in the [Appendix](#Appendix). 30 | 31 | ## Version History 32 | 33 | | | | | 34 | |:---:|:--------------------------------- |:---------:| 35 | | 0.1 | Initial draft | 4/16/2020 | 36 | | 0.2 | Final draft | 5/5/2020 | 37 | | 1.0 | Initial release | 5/13/2020 | 38 | | 1.1 | Updates based on initial feedback | 6/16/2020 | 39 | 40 | --- 41 | 42 | ¹ CVEs cover both open source and proprietary software. 
43 | 44 | --- 45 | 46 | ## Related Work 47 | 48 | The following resources contain content that supplements the information in this document. 49 | 50 | ### Industry Reports 51 | 52 | - [The State of Open Source Security Vulnerabilities](https://www.whitesourcesoftware.com/open-source-vulnerability-management-report/) (2020, WhiteSource) 53 | 54 | - [The State of Open Source Security](https://snyk.io/opensourcesecurity-2019/) (2019, Snyk) 55 | 56 | - [Open Source Security and Risk Analysis (OSSRA)](https://www.synopsys.com/software-integrity/resources/analyst-reports/2019-open-source-security-risk-analysis.html) (2019, Synopsys) 57 | 58 | - [2019 Software Supply Chain Report](https://www.sonatype.com/en-us/software-supply-chain-2019) (2019, Sonatype) 59 | 60 | ### Guidelines, White Papers, and Standards 61 | 62 | - [Fundamental Practices for Secure Software Development](https://safecode.org/wp-content/uploads/2018/03/SAFECode_Fundamental_Practices_for_Secure_Software_Development_March_2018.pdf) (SAFECode) 63 | 64 | - [Managing Security Risks Inherent in the Use of Third-party Components](https://safecode.org/wp-content/uploads/2017/05/SAFECode_TPC_Whitepaper.pdf) (SAFECode) 65 | 66 | - [Microsoft Security Development Lifecycle](https://www.microsoft.com/en-us/securityengineering/sdl/) 67 | 68 | - [NIST SP 800-160](https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-160v1.pdf) (Systems Security Engineering) 69 | 70 | - [NIST SP 800-37](https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-37r2.pdf) (Risk Management Framework for Information Systems and Organizations) 71 | 72 | - [OWASP Packman](https://github.com/OWASP/packman) (Documentation/tracking of security controls of popular package management systems) 73 | 74 | - [OWASP Software Component Verification Standard](https://owasp.org/www-project-software-component-verification-standard/) 75 | 76 | ## Acknowledgments 77 | 78 | Thank you to everyone who reviewed, commented, and provided
content for this document; most especially to Guy Acosta, Bas Alberts, Chris Aniszczyk, Charles Brenner, Jennifer Fernick, John Gossman, Luigi Gubello, Chris Jeuell, Maya Kaczorowski, Radoslaw Karpowicz, Steve Lipner, Jason Keirstead, Dan Lorenc, Elie Saad, Andrew Trompler, and Kay Williams. 79 | 80 | ## Table of Contents 81 | 82 | | | | 83 | | ------------------------------------------------------------------------------------------------------- | ---:| 84 | | [Introduction](#Introduction) | 1 | 85 | | [Threats & Risks](#Threats-&-Risks) | 5 | 86 | | [Ideation / Concept Phase](#Ideation-/-Concept-Phase) | 5 | 87 | | [Local Development Phase](#Local-Development-Phase) | 8 | 88 | | [External Contributions Phase](#External-Contributions-Phase) | 17 | 89 | | [Central Infrastructure Phase](#Central-Infrastructure-Phase) | 19 | 90 | | [Package Consumption Phase](#Package-Consumption-Phase) | 23 | 91 | | [Vulnerability Reporting & Security Response Phase](#Vulnerability-Reporting-&-Security-Response-Phase) | 34 | 92 | | [Cross-Cutting Activities](#Cross-Cutting-Tasks) | 38 | 93 | | [Conclusion](#Conclusion) | 47 | 94 | | [Appendix](#Appendix) | 48 | 95 | 96 | # Threats & Risks 97 | 98 | To better frame what we’re going to be exploring, we’ll start with a simple diagram that describes the major parts of the open source ecosystem and how they often relate to one another. 99 | 100 | ![The graph shows the major parts of an open-source ecosystem, from the ideation of a project to its distribution](img/Threat&Risks.png) 101 | 102 | We will use the diagram above to frame our exploration of threats and possible mitigations, after which we’ll discuss some general, cross-cutting practices and recommendations.
103 | 104 | ## Ideation / Concept Phase 105 | 106 | ![First phase: ideation and concept](img/Ideation.png) 107 | 108 | In this phase, there are few explicit threat actors; instead, there is the potential for “business”-level security flaws, biases that have security implications, and other high-level design problems that can have severe consequences if not identified and properly addressed. 109 | 110 | These risks can be influenced by various aspects of a system: 111 | 112 | - **Attack Surface.** As a system’s attack surface grows or becomes less well-defined, it becomes more susceptible to attack. For example, systems that attackers have physical possession of are often harder to secure against tampering or introspection. Systems that have many dependencies could be attacked using a defect in any of them. 113 | 114 | - **Technology Stack Risks.** Certain technologies are inherently more susceptible to attack than others; for example, most modern programming languages have been built to avoid the memory safety challenges that have [affected](https://www.zdnet.com/article/microsoft-70-percent-of-all-security-bugs-are-memory-safety-issues/) C and other “low-level” languages. 115 | 116 | - **Security-Sensitive Functionality.** From a security perspective, certain tasks are simply more important than others. For example, it is more important that a library that performs authentication or authorization do so correctly than it is that a calendar widget properly account for leap years. Operating system kernels, cryptographic libraries, and cloud orchestration systems also fall into this category, as do many other components that depend on the context of the system they are being used with. 117 | 118 | - **Unproven Technology.** New technologies often offer advantages, but practically no one “gets it right” on their first try.
Whether it’s a new platform, programming language, framework, or library, placing too much reliance on the security of that new technology can be a mistake until it’s been proven out. 119 | 120 | In addition, it’s important to realize that not all risks are technical in nature; systems that perform any of the following are more likely to be attacked than others: 121 | 122 | - Systems that transfer money or other assets between accounts. 123 | 124 | - Systems that provide assurance that certain events have taken place. 125 | 126 | - Systems that protect access to highly valuable assets. 127 | 128 | - Systems that claim to provide anonymity or pseudonymity. 129 | 130 | To mitigate these risks, security practitioners often recommend [threat modeling](https://safecode.org/wp-content/uploads/2017/05/SAFECode_TM_Whitepaper.pdf) and a security-oriented design review (also known as an architectural risk analysis). There are many approaches to performing these, and though they are often as much “art” as “science,” there are some excellent examples that are publicly available: 131 | 132 | - [Kubernetes Security Audit and Threat Model](https://github.com/kubernetes/community/tree/master/wg-security-audit) 133 | 134 | - [OAuth 2.0 Threat Model and Security Considerations](https://tools.ietf.org/html/rfc6819) 135 | 136 | The main challenges around threat modeling include: 137 | 138 | - Most software engineers do not perform threat modeling or similar activities when creating open source software. This may be due to limited understanding of the value these tasks provide, or a desire to work on the core (software) parts of the project. 139 | 140 | - A project’s core concepts often shift over time, especially early in the project’s development, which means this process should be repeated regularly. 141 | 142 | - The maintainers may change over time; there may be no one available to describe how a certain part of the project works.
143 | 144 | - These tasks are often difficult or require specialized training or expertise that may not be available. 145 | 146 | - There may also be a perception that these activities don’t provide much value to open source components, especially since the developer will not have the context of how the component will be ultimately used. 147 | 148 | ### Threat Modeling 149 | 150 | Formally, threat modeling is a process by which potential threats are identified and rated for severity, and possible mitigations are discussed. Less formally, threat modeling happens when you think about how the system you’re building could be broken, and consider what you can do to prevent that from happening. 151 | 152 | It is important to stress that threat modeling is a process, not a tool. While tools can help the process be more efficient (e.g., by providing visualization, tracking changes over time, or identifying changes to software that would be more likely to affect its threat model), tools by themselves cannot currently take the place of humans reasoning about how other humans would attack a system. 153 | 154 | Threat modeling can be most effective when multiple stakeholders can come together to look at a system from different angles: developers, architects, service engineers, designers, and end users, along with security specialists. The discussion can be as simple as walking through how the system is used, how it is *supposed* to work, and comparing that to how it *actually* works. Security specialists will often ask questions to get a better understanding of the security controls in place, and very often, everyone will leave with a better understanding of the risks that affect the system.
155 | 156 | There are tools, like [OWASP Threat Dragon](https://owasp.org/www-project-threat-dragon) and [SD Elements](https://www.securitycompass.com/sdelements/), that bring structure to this process, and many good tutorials, including [Threat Modeling in 2019](https://www.youtube.com/watch?v=ZoxHIpzaZ6U) (RSA/Adam Shostack, 2019) and the SAFECode [Tactical Threat Modeling](https://safecode.org/safecodepublications/tactical-threat-modeling/) white paper. 157 | 158 | We recommend considering the following projects to advance this area: 159 | 160 | - High-quality training materials for conducting threat modeling and a security-oriented design review should be curated or created and targeted at high-risk projects. 161 | 162 | - Templated threat models should be created for common (and representative) types of open source components and expanded over time with community involvement. 163 | 164 | - Experienced security professionals should collaborate with critical, high-risk open source projects to create security reviews and associated threat models. 165 | 166 | ## Local Development Phase 167 | 168 | ![The graph shows the local development of open source projects](img/LocalDevelopment.png) 169 | 170 | Local development of open source projects usually takes place on the maintainer’s personal infrastructure (workstations, local network, etc.), which, like any other infrastructure, can be vulnerable to attack. For example, [XcodeGhost](https://en.wikipedia.org/wiki/XcodeGhost) was a malicious distribution of Apple’s Xcode software, targeting the far left of the supply chain—the developer’s IDE and local build environment. Similar attacks have become more common, such as the [eslint-scope malware](https://nodesource.com/blog/a-high-level-post-mortem-of-the-eslint-scope-security-incident/) that attempted to exfiltrate the developer’s NPM access tokens during installation.
Threats that apply to this phase include:

- Attackers compromise a developer’s environment and use that access to compromise the software components themselves (e.g., making false commits, silently pushing content to a source code repository, modifying files, etc.).

- Attackers compromise a developer’s environment in order to exfiltrate not-yet-public commits, staying a step ahead of the competition.

- Attackers compromise a maintainer’s machine, network, or communications tools to intercept researcher-submitted bug reports of zero-day vulnerabilities.

- Developers leave “debug” functionality that bypasses security controls.

- Developers copy/paste source code from Stack Overflow or similar sources without considering whether that code contains security defects.

- Maintainers create software with code-level security defects (vulnerabilities).

- Maintainers accidentally check secrets into source code or publish them in packages, which attackers find and exploit.

### Technical Architecture

The choice of a technical architecture can have a significant impact on the overall security of a system and the investment needed to keep that system secure in the future. A good (secure) technical architecture can mitigate security risk systemically, while a poor technical architecture can amplify it. Our goal in choosing a secure technical architecture is to reduce the likelihood that the system will contain exploitable security vulnerabilities in the future.

As an example, consider security vulnerabilities that result from inconsistencies created when a concept is translated into a program’s source code.
For example, a developer may assume that a person’s age will never be greater than 120, so they assign an 8-bit field to store it; an attacker submits a record with an age of 100 million, and the program state overflows, resulting in corruption and potential execution of code supplied by the attacker.

Tactically, bugs like this can often be found through static analysis or fuzzing, but one of the best ways to address entire classes of vulnerabilities is to provide software developers programming languages and platforms that make it easier to write secure code. For example, many modern “managed” programming languages avoid manual memory management; in the above example, an attacker supplying a number greater than the allowed size would result in a runtime error rather than memory corruption.

Memory management issues obviously aren’t the only kind of security flaw, however, and many higher-level constructs cannot feasibly be handled at the programming language level, such as a properly implemented authorization mechanism or the implementation of a new cryptographic protocol. For many of these cases, having well-vetted, commonly used libraries that implement these constructs has advantages over each package author implementing the construct independently. (Indeed, this is one of the primary benefits of using open source software in the first place!)

The [MITRE Common Weakness Enumeration](https://cwe.mitre.org/data/slices/699.html) breakdown is a good resource for understanding the range of issues that can affect a system.

Platform and framework selection can also have a significant impact on the security of an overall system. For example, containers have become ubiquitous in modern software development, but it can be easy to accidentally expose host resources to the container environment, reducing any security protections such a configuration would normally provide.
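Returning to the age-field example above, the contrast between silent truncation and a loud runtime error can be sketched in Python (which has arbitrary-precision integers, so the 8-bit field is simulated explicitly with a bit mask):

```python
def store_age_unchecked(age: int) -> int:
    # Simulate writing to an 8-bit field: the value silently wraps
    # modulo 256. In an unmanaged language, the same mismatch can
    # corrupt adjacent memory rather than just the stored value.
    return age & 0xFF

def store_age_checked(age: int) -> int:
    # The "managed" alternative: enforce the invariant and fail loudly.
    if not 0 <= age <= 120:
        raise ValueError(f"implausible age: {age}")
    return age

print(store_age_unchecked(100_000_000))  # wraps silently to 0
```

The unchecked version quietly produces a wrong (and attacker-influenced) value; the checked version turns the same bad input into an error the program can handle safely.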
We recommend the following activities in this area:

- Guidance should be created or curated that describes how to securely configure some of the most common platforms and frameworks (e.g., Docker, Kubernetes, Node/Express).

- References to high-quality, “batteries included” libraries and frameworks like [ESAPI](https://owasp.org/www-project-enterprise-security-api/) and [Python/Cryptography](https://cryptography.io/en/latest/) should be collected and curated centrally.

- Guidance should be created or curated to help developers choose or design secure technical architectures, with easy-to-understand examples.

### Writing Code

![Writing Code](img/WritingCode.png)

All software contains flaws, and those flaws can often impact the security quality of a system. There is clear consensus that the best time to detect and fix security flaws is early in the development process, but this understanding does not always translate into clear action. Many teams apply some sort of analysis, ranging from linters to advanced static analysis tools.
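At the simplest end of that spectrum, even a small pattern-matching check over a syntax tree can catch risky constructs before review. A toy sketch in Python (the two flagged built-ins are illustrative; real analyzers ship far richer rule sets):

```python
import ast

RISKY_CALLS = {"eval", "exec"}  # illustrative rule set, not from any real tool

def find_risky_calls(source: str) -> list[tuple[int, str]]:
    """Walk the syntax tree and report direct calls to risky built-ins."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in RISKY_CALLS):
            findings.append((node.lineno, node.func.id))
    return findings

print(find_risky_calls("x = 1\ny = eval(data)\n"))  # → [(2, 'eval')]
```

Working on the syntax tree rather than raw text is what lets even this toy avoid false positives on, say, a string literal containing the word “eval”.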
There are a few different options for addressing exploitable vulnerabilities:

- [Identifying Security Vulnerabilities in Source Code](#Identifying-Security-Vulnerabilities-in-Source-Code) (Detect)

- [Identifying Security Vulnerabilities During Execution](#Identifying-Security-Vulnerabilities-During-Execution) (Detect)

- [Reducing the Likelihood that a Vulnerability will be Introduced](#Reducing-the-Likelihood-that-a-Vulnerability-will-be-Introduced) (Prevent)

- [Reducing the Likelihood that a Vulnerability will be Exploited](#Reducing-the-Likelihood-that-a-Vulnerability-will-be-Exploited) (Prevent)

#### Identifying Security Vulnerabilities in Source Code

Static analysis is a term used to describe the process of examining a program outside of its running state (i.e., source code) in an attempt to identify vulnerabilities that would be present if the program were running. These techniques range from simple pattern matching, to analysis of control and data flow graphs, to program emulation, to formal methods for validating pre- and post-conditions.

The quality of static analyzers has increased in recent years, and while you shouldn’t expect perfect accuracy, these tools can often detect exploitable security vulnerabilities before they are even merged into an official code branch.

Challenges of using static analysis include:

- **Cost.** Open source projects cannot usually afford to pay for a commercial static analysis tool. Fortunately, most are available for free to open source projects, including [LGTM](https://lgtm.com) (GitHub), [Coverity](https://scan.coverity.com/), and [Reshift](https://www.reshiftsecurity.com/). A fairly comprehensive list of static analysis tools can be found on [Wikipedia](https://en.wikipedia.org/wiki/List_of_tools_for_static_code_analysis).
- **Inability to Analyze.** Most static analyzers are language-dependent, meaning that if the tool supports only Java and C#, but your project is written in PHP, then the tool won’t provide you much value. In the worst case, you may be using a programming language for which no analysis tools exist. Analyzers also tend to have challenges when tracing execution between different programming languages (e.g., data coming from a web application, passing through a backend application, to a separate micro-service, and then to a NoSQL store).

- **Complexity to Analyze.** In the best case, analysis is simple: you push a button, analysis runs, and findings are generated. However, depending on the analyzer, you may need to spend additional time configuring it; at scale, this hurdle could prevent many open source projects from adopting static analysis.

- **Cost/Time to Review.** Once a static analysis tool is used, findings need to be reviewed and appropriately actioned. For open source projects, especially those with large code bases, this work can be substantial, especially when it is done for the first time.

The best way to address these challenges is through improved tools: more accurate rules, better guidance on how to remediate, “turn-key” execution, and broader support for modern and emerging platforms and architectures. We therefore recommend the following:

- Build (and meta-build) systems should improve integration with static analysis tools, enabling “default on” high-quality analysis.

- [GitHub Security Lab](https://securitylab.github.com/) should continue to support community development of security rules and should drive toward “on by default” static analysis within source code repositories.

- Research should be directed toward a standard (cross-tool) format for expressing detection rules, enabling more efficient use of security engineering activities.
- Research should also be directed toward advancing the state of the art around “auto-fixes”: the ability for a static analysis tool to automatically submit a pull request with the required code change to remediate a vulnerability. (We are under no illusions here: this would be difficult or impossible to solve in the general case, but there is plenty of “low-hanging fruit” for which this seems reasonable.)

##### Stack Overflow

An increasingly common development practice is to make extensive use of Stack Overflow. While there isn’t anything wrong with this per se, [many](https://arxiv.org/pdf/1910.01321.pdf) of the answers provided contain [security defects](https://www.ieee-security.org/TC/SP2017/papers/7.pdf), and copying/pasting those answers [can lead to vulnerabilities](https://stackoverflow.blog/2019/11/26/copying-code-from-stack-overflow-you-might-be-spreading-security-vulnerabilities/). While static analysis can be used at the point of consumption to identify vulnerabilities, it would be more efficient to address the problem at its source.

We therefore recommend:

- Source code snippets submitted to Stack Overflow should be analyzed, with feedback going to either the author or the public. Readers should be made aware of vulnerable snippets. (The Coalition should consider engaging with Stack Overflow to advocate for a solution integrated into the Stack Overflow experience.)

- To address existing snippets already available on Stack Overflow, a separate analysis could take place, looking for vulnerable code patterns and recommending fixes, or at least adding commentary alerting individuals potentially affected by the vulnerability.

#### Identifying Security Vulnerabilities During Execution

While most security defects can theoretically be found using static analysis, in practice, static analysis tools are only as good as the rules implemented, and are often language-dependent.
As a result, software developers often use tools that actively validate a program as it’s running, through a variety of methods:

- Dynamic Application Security Testing (DAST), which involves feeding data to a running application in order to “break” it in some way. Examples of this include fuzzing tools like Google’s [OSS-Fuzz](https://github.com/google/oss-fuzz) and web application penetration testing tools like [OWASP Zed Attack Proxy (ZAP)](https://www.zaproxy.org/).

- Interactive Application Security Testing (IAST), which involves instrumenting the application to achieve better accuracy when detecting vulnerabilities. Examples of this include memory checkers like [Valgrind](https://valgrind.org/) and [AddressSanitizer](https://clang.llvm.org/docs/AddressSanitizer.html).

- Runtime Application Self-Protection (RASP), which involves instrumenting the application to detect and sometimes prevent attacks. An example of this is a web application firewall.

It may be tempting to consider execution-time security testing only for freestanding applications, like a database or web server, but this would be incomplete. Just as unit tests exist to validate individual pieces of functionality, so too can the active testing techniques described above apply to individual components.

We recommend the following:

- Build (and meta-build) systems should improve support for plugging into fuzzing tools.

- We should advocate for increased use of execution-time tools for open source projects.

#### Reducing the Likelihood that a Vulnerability will be Introduced

In an ideal world, all security defects would be identified immediately, enabling the software developer to fix them before they are ever checked in.
In the real world, security defects are found at all times throughout the lifecycle, but there are obvious advantages (risk, [cost](https://www.researchgate.net/publication/255965523_Integrating_Software_Assurance_into_the_Software_Development_Life_Cycle_SDLC), etc.) to identifying them as early as possible. This is often referred to as “shifting left”, based on a simplified view of the development lifecycle:

![Graph of the development lifecycle: ideation, design, development, testing, release, maintenance, retirement](img/ReducingTheLikelihoodThatAVulnerabilityWillBeIntroduced.png)

In order to “shift left” as much as possible, software developers require access to high-quality guidance on how to address common classes of software vulnerabilities. While there are some high-quality sources available, including the [OWASP](https://owasp.org) [Cheat Sheet Series](https://cheatsheetseries.owasp.org/), few are comprehensive, curated, and kept up to date.

We recommend the following:

- Provide developers with the [proper training](#Secure_Education) required to introduce and implement security in the technology stack at hand.

- Provide developers with a curated list of security tools and related resources.

- Provide developers with technical expertise when needed, particularly for critical projects.

#### Reducing the Likelihood that a Vulnerability will be Exploited

All software contains defects, and some of those defects have security implications. Over the past two decades, considerable work has gone into making it harder for these security defects to be successfully exploited by an attacker. Indeed, just as secure facilities in the physical world have more than one “lock”, secure software systems have more than one control to prevent abuse.
This “defense in depth” is a hallmark of secure software and can have a significant impact on the overall security and resilience of a system.

Moreover, there are various types of software vulnerabilities, and various exploitation techniques for turning those vulnerabilities into attacks. Building the security architecture of a system therefore requires an understanding of the relationships between:
- Bug detection mechanisms,
- Defense technologies,
- Vulnerability classes,
- Exploitation techniques.

For example, the [Linux Kernel Defence Map](https://github.com/a13xp0p0v/linux-kernel-defence-map) shows such relationships for the Linux kernel. This map is useful for developing a threat model for your GNU/Linux system and then learning about kernel defenses that can help against some of these threats.

In practice, this work also takes the form of security controls implemented within the platform, runtime, or operating system that identify when the application is doing something unexpected and take some form of corrective action. Examples include:

- An application-level firewall notices patterns associated with [Cross-Site Scripting (XSS)](https://owasp.org/www-community/attacks/xss/) or [SQL Injection](https://owasp.org/www-community/attacks/SQL_Injection), and blocks traffic from getting to the application.

- [Address Space Layout Randomization](https://en.wikipedia.org/wiki/Address_space_layout_randomization) (ASLR) reduces the likelihood that a buffer overflow will escalate to arbitrary code execution, by placing shared libraries at random locations in memory.

- Android applications request specific [permissions](https://source.android.com/devices/tech/config), which the user must grant during installation. (Access beyond those granted permissions will be blocked by the operating system.)
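ASLR can be observed directly. A sketch in Python that launches fresh interpreter processes and samples the address of a C-level object in each (this assumes the interpreter is built as a position-independent executable, as most modern distributions’ interpreters are):

```python
import subprocess
import sys

def sample_address() -> str:
    # Launch a fresh interpreter and report where a C-level object
    # (the built-in `object` type) landed in memory for that run.
    result = subprocess.run(
        [sys.executable, "-c", "print(hex(id(object)))"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

addresses = {sample_address() for _ in range(5)}
# With ASLR active (and a position-independent interpreter), the
# addresses differ between runs; identical addresses across all runs
# would suggest randomization is not in effect for this binary.
print(addresses)
```

This is the property attackers must defeat: without a stable address to aim at, a buffer overflow is far harder to turn into reliable code execution.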
We recommend the following projects in this area:

- Create guidance on how to leverage binary- and platform-level mitigations when building or deploying systems; this could include activities like enabling [Control Flow Integrity](https://clang.llvm.org/docs/ControlFlowIntegrity.html), avoiding [speculative execution](https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/spectre.html) attacks, and enforcing [address space layout randomization](https://en.wikipedia.org/wiki/Address_space_layout_randomization).

- Analyze state-of-the-art techniques for binary hardening across different operating systems; start work on porting techniques where there are significant differences, if any. For example, the [kconfig-hardened-check tool](https://github.com/a13xp0p0v/kconfig-hardened-check) provides recommendations you can use for hardening the Linux kernel on your systems.

- Implement a “capabilities” model into one or more programming language runtimes.

#### Fixing Vulnerabilities

Once a vulnerability is identified and understood, the next obvious task is to fix it. In the ideal scenario, the project maintainer(s) will create a fix, test it, and publish a new release. Consumers (i.e., end users and downstream package maintainers) will begin to use the new release, and the risk from the vulnerability will be mitigated. Unfortunately, there are often reasons why a prompt fix is not made:

- The maintainer(s) may not be actively working on the project. For side projects (see [Project Archetypes](#Project-Archetypes)), maintainers may issue a fix, but only at their convenience.

- The vulnerability may be categorized as low risk by the maintainer and perceived as not being worth the effort to fix.
- The vulnerability may require significant effort to fix, due to complexity, compatibility issues with other components, or simply the amount of code that will need to be modified.

In all of these cases, the larger community can put undue pressure on the project maintainer, who is often working on the project without compensation or may have other priorities. This can stand at odds with the consumers of that project, who are often employed: corporate developers who are being compensated to deliver a software product.

It is important to note that the scenario above is not the only possible workflow:

- The security researcher who found the vulnerability may contribute a code fix or may even join the project as a maintainer.

- The “fix” may be incomplete, and its revelation may encourage attackers to target instances of the software that are now known to be vulnerable.

- The security researcher may be unwilling to wait for a fix (particularly a delayed one) and may therefore release the details publicly, including, in some cases, a fully “weaponized” exploit.

A reasonable metric around fixing vulnerabilities could be the elapsed time between when a vulnerability is first identified and when all users have updated the package to a fixed version (up to some threshold). However, it makes sense to split this metric into two separate parts:

- The elapsed time between when a maintainer is notified and when a (correctly) fixed version is made available to consumers.

- The elapsed time between when an updated version is available and when it becomes integrated into the downstream project.

These metrics cannot be completely separated from one another; for one thing, many open source projects are both consumers of upstream packages and providers of packages to downstream consumers.
Consider the following:

![The graph shows a chain of components, dependent on each other, and shows the time difference between the update of the first component and the last](img/VulnerabilitiesFixingFlow.png)

In this scenario, think of yourself as a software developer using Component D. A vulnerability in Component A is found on January 1st and fixed ten days later. Downstream packages pick up the fix, one by one, until Component D issues a release in mid-March, which you notice and update to at the end of March. Depending on the specifics of the vulnerability, you could have been affected by this publicly-known vulnerability for over two months, even though everyone was issuing fixes in a reasonable timeframe.

Services like [Snyk](https://snyk.io/) and [Dependabot](https://github.blog/2019-05-23-introducing-new-ways-to-keep-your-code-secure/#automated-security-fixes-with-dependabot) (now part of GitHub) can significantly shorten these delay chains by automatically opening pull requests when vulnerabilities are fixed in dependencies.

We recommend the following:

- Consideration should be given to funding high-impact projects that contain security vulnerabilities, possibly using a model similar to the [Core Infrastructure Initiative](https://coreinfrastructure.org).

- Educational material should be created to advocate for projects to keep open source packages up to date, even in the absence of any known security flaws. (See Package Update for additional discussion of this topic.)

- Bug bounties should be created or expanded to include rewards for (accepted) patches, under the assumption that a project maintainer will be more likely to accept a quality pull request than to take the time to investigate and create a fix themselves. (Google’s [patch rewards program](https://www.google.com/about/appsecurity/patch-rewards/) is a good example of this.)
- A funded pool of software engineers could be created and directed at high-risk situations, such as creating fixes for vulnerabilities in critical projects.

### Secrets Management

Secrets management is the practice of ensuring that credentials, tokens, cryptographic keys, and other sensitive material are not unexpectedly disclosed. This disclosure can occur in many scenarios, including:

- [Secrets disclosed in source code when pushed to a source code repository](https://darkport.co.uk/blog/ahh-shhgit!/)

- [Secrets disclosed in a published package](https://thenewstack.io/npm-password-resets-show-developers-need-better-security-practices/)

- [Secrets disclosed in artifacts from a CI/CD pipeline](https://blog.travis-ci.com/2017-05-08-security-advisory)

- [Secrets disclosed to unauthorized entities when a package is installed or is executing](https://www.zdnet.com/article/microsoft-spots-malicious-npm-package-stealing-data-from-unix-systems/)

The effect of this disclosure is that, very often, the secrets themselves allow an attacker to masquerade as the victim (i.e., package author, publisher, or consumer).

A number of mitigations exist that address this, including:

- Tools, such as [truffleHog](https://github.com/dxa4481/truffleHog) and [shhgit](https://shhgit.darkport.co.uk/), identify secrets disclosed in source code, and can be used at appropriate points in the development lifecycle (e.g., pre-commit or pre-receive hooks).

- Centralized (and well-protected) secret management services can be used to protect, rotate, and audit secrets more effectively.

- Many CI/CD pipelines contain features that enable secrets to be inserted at runtime and protected from disclosure in logs or other artifacts.

- Modern software deployment practices (e.g., micro-services, serverless architectures, containers, etc.)
reduce the likelihood that an untrusted process will be able to read secrets from the target (though they do increase the attack surface area).

We recommend the following projects to advance this area:

- Expand secret detection capabilities in key systems, including source code repositories (e.g., GitHub, GitLab, etc.) and package management systems (e.g., NPM, PyPI, etc.).

- Reach out to the maintainer of [truffleHog](https://github.com/dxa4481/truffleHog) to help improve/extend the tool (there are currently 36 open pull requests), or rally around another tool to achieve the same ends: a high-quality secrets detector.

- Create “play books” showing how fast and simple it can be to use a secure secret management facility.

- Include secrets management in key best practice documentation for developers.

### Dependency Management

When authoring a software project, it’s typical to bring in dependencies for functionality that you don’t want to implement yourself. This is done both by developers who create final software products and by developers who create open source components.

We discuss dependency management as part of Package Consumption, later in this document.

### Local Testing

Testing often takes place both locally (usually informally) and within a formal build pipeline. We discuss testing within the Security Validation section, later in this document.

## External Contributions Phase

![External Contributions](img/ExternalContribution.png)

In this phase, we explore changes made to a software component by a loosely-affiliated individual, which is to say not by the main author or a trusted maintainer. This contributor can be trustworthy or underhanded, and the contribution itself can be of any level of quality. Most open source projects have a way to validate and accept (or reject) these contributions, and the most common way is through a pull request.
When a contribution is made, a maintainer usually needs to “sign off” on the change before merging it into an “official” code branch.

In this phase, we have the following threat actors:

- [An attacker trying to “sneak” a malicious change into a code base.](#Preventing-Malicious-Changes-from-Contributors)

- [An attacker attempting to undermine the pull request validation infrastructure.](#Undermining-Automated-Validation-on-Pull-Requests)

### Preventing Malicious Changes from Contributors

Typically, pull requests are validated through some combination of automated tools and manual introspection. Tools often cover things like ensuring a Contributor License Agreement (CLA) is in place for corporate organizations, and that code passes unit tests, meets style requirements, passes linting, or is free of (detectable) security vulnerabilities. Unfortunately, attackers can mimic these checks locally, tweaking their contribution until it passes. As a result, manual introspection is an essential part of accepting pull requests.

This risk can be mitigated to some degree by:

- Ensuring that all contributions from less-trusted parties are reviewed, preferably by two maintainers, before they are merged.

- Ensuring that all security tools are run successfully before completing a pull request.

- Ensuring that all changes to the component’s attack surface or core characteristics are properly reviewed, using tools like [Microsoft Application Inspector](https://github.com/Microsoft/ApplicationInspector) and the [NPM Security Insights API](https://blog.npmjs.org/post/188385634100/npm-security-insights-api-preview-part-2-malware).

However, we strongly suspect that manual introspection of malicious, intentionally obfuscated changes will not be entirely effective.
To prove this out, we recommend a “red team” exercise be performed to ascertain the likelihood that a “hidden” change will pass through a code review.

### Undermining Automated Validation on Pull Requests

Tools are often used to validate that a pull request meets some type of quality bar. This often involves performing a build and running unit tests and other tools to detect issues.

A modern coding practice is to include build scripts, configuration, and unit tests within the project’s source code repository. This means that an attacker could submit a pull request that, for example, disables both a security feature and its associated unit test. Validation would “pass”, and only a manual review would detect this change as suspect.

An attacker would also be able to do things like exfiltrate any secrets accessible to the build environment or execute arbitrary code (via a build script) within that environment.

Many continuous integration systems have built-in controls to mitigate the risk of disclosing secrets, including [GitHub Actions](https://github.community/t5/GitHub-Actions/don-t-run-actions-on-pull-request-from-fork/td-p/45499), [Azure DevOps](https://docs.microsoft.com/en-us/azure/devops/pipelines/repos/github?view=azure-devops&tabs=yaml#validate-contributions-from-forks), [Travis CI](https://docs.travis-ci.com/user/pull-requests#Pull-Requests-and-Security-Restrictions), and [CircleCI](https://circleci.com/blog/managing-secrets-when-you-have-pull-requests-from-outside-contributors/). This is often implemented by not passing secrets to pull requests initiated from forked repositories.

To mitigate the remaining risks, we recommend:

- Any changes to build configurations should be validated, similar to any other change.
- All pull request validation routines should be limited to an expected duration and frequency, in order to avoid denial of service or resource exhaustion.

## Central Infrastructure Phase

![The graph shows the open source supply chain: source code repository, testing and validation, continuous integration and delivery, package publishing, package management](img/CentralInfrastructure.png)

“Central Infrastructure” refers to elements in the open source supply chain that are typically operated “as a service” by a trusted third party (e.g., GitHub, NPM, Travis CI, Azure DevOps, etc.). This has advantages for both the maintainer (lower cost and complexity, higher quality, etc.) and the consumer (increased trust), but some threats still apply here.

### Source Code Repository

First, attackers could target the source code repository. Although most open source development now uses a distributed source control system (git), most source code is stored in a central location, such as GitHub, Bitbucket, GitLab, or Azure DevOps. New developers (or existing developers performing a git clone-style operation) would have a hard time determining the authenticity of a repository if it were modified by an attacker.

This threat is already mitigated by the strong operational security practices these central organizations employ, and can be further mitigated by increased use of [commit signing](https://git-scm.com/book/en/v2/Git-Tools-Signing-Your-Work) with central trust authorities.

The open source developer’s credentials could also be hijacked by an attacker. With those credentials, the attacker could modify the source code available, perform builds, and trigger publishing pipelines (essentially, anything the “real” maintainer could do).
This risk is mitigated to a large extent by the use of multi-factor authentication, but mitigations could be expanded to include anomaly detection (e.g., requiring an additional layer of authentication if an action is triggered from an unexpected location, or based on other metadata).

### Security Validation

Security validation usually takes place either on a developer’s local workstation or centrally once changes have been made to a source code repository. We’re using the term “security validation” to include things like static analysis, automated penetration testing, fuzzing, and related tasks—the goal of which is to identify security defects so they can be fixed before they can be found by an attacker and exploited.

There are a few threats that pertain to this phase of the development lifecycle, including:

- [An attacker learns of security defects prior to a fix being available.](#Premature-Disclosure)

- [An attacker is able to disable certain security checks.](#Attacker-Disabling-Security-Checks)

- [An attacker is able to author a malicious contribution that isn’t identified when analyzed.](#Malicious-Contributions)

#### Premature Disclosure

Premature disclosure occurs when a tool is used to identify security defects in a system or software component, but the results wind up being disclosed to an attacker before they can be remediated.

There is room for argument here on whether full, public disclosure leads to better outcomes, with market forces putting pressure on maintainers (or other contributors) to fix security defects. After all, we should assume that most attackers are able to perform the same validation activities themselves. On the other hand, security is often about increasing the cost to an attacker, and providing a public list of vulnerable components could certainly lead to more successful attacks.
An additional point here concerns the level of triage involved. Security tools can often yield a large number of false positives: either completely invalid findings or those that cannot be exploited for one reason or another. The value to an attacker of un-triaged findings is far less than the value of those that have been validated as high-impact—the latter approaching the realm of [Responsible Disclosure](https://en.wikipedia.org/wiki/Responsible_disclosure).

We recommend the following:

- Research should be conducted, reviewed, or consolidated to come up with the right policy on public disclosure of un-triaged and minimally triaged findings from automated security tools.

- Funding should be considered to triage potentially high-impact security findings detected by certain tools (e.g., [lgtm.com](https://lgtm.com) and others).

#### Attacker Disabling Security Checks

If an attacker is able to change the configuration of security checks, they would be more likely to “slip something by” the security validation process and insert vulnerable or malicious code into the software component.

As a result, all changes to the security configuration (e.g., [lgtm.yml](https://lgtm.com/help/lgtm/lgtm.yml-configuration-file)) should be examined by a human to ensure they don’t, for example:

- reduce the severity of certain defect types,

- ignore certain paths,

- change build commands, or

- change alerting or notification settings.

We recommend the following:

- Research should be performed to determine the feasibility of this attack and what types of mitigation could help. Specifically, a “red team” exercise should be performed, to attempt to undermine the security checks performed against a project created for this purpose.
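Such a human review could be assisted by simple tooling that diffs the old and new configurations and flags exactly these kinds of changes. The sketch below is illustrative only; the key names (`paths-ignore`, `severity-threshold`, `build-command`) are hypothetical stand-ins, not any real tool’s schema, and it assumes configurations have already been parsed into dictionaries.

```python
def flag_risky_config_changes(old: dict, new: dict) -> list:
    """Return human-readable warnings for configuration changes a reviewer
    should inspect closely before approving."""
    warnings = []

    # Newly ignored paths can hide vulnerable code from the scanner.
    added_ignores = set(new.get("paths-ignore", [])) - set(old.get("paths-ignore", []))
    for path in sorted(added_ignores):
        warnings.append("path newly excluded from analysis: " + path)

    # Raising the reporting threshold silently drops lower-severity findings.
    levels = {"low": 0, "medium": 1, "high": 2}
    old_threshold = levels.get(old.get("severity-threshold", "low"), 0)
    new_threshold = levels.get(new.get("severity-threshold", "low"), 0)
    if new_threshold > old_threshold:
        warnings.append("severity threshold raised; fewer findings will be reported")

    # Any change to build commands deserves a close look.
    if old.get("build-command") != new.get("build-command"):
        warnings.append("build command changed")

    return warnings
```

A pull request bot could post these warnings as review comments, so the maintainer’s attention goes straight to the security-relevant portion of the diff.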
#### Malicious Contributions

Most open source projects accept contribution requests from anyone in the form of pull requests. An attacker could attempt to “trick” a maintainer into accepting their contributions through a few different means:

- **Minified Code.** Changes made to minified code can be very difficult to read, especially with a line-by-line view of the differences. (Though to be clear, minified code should probably be the output of a build process, and therefore not checked into a source code repository in the first place.)

- **Many good changes, one bad one.** If an attacker has provided multiple “good” changes, especially over time, the maintainer may relax their guard and accept a subsequent change without reviewing it as closely. Similarly, if a single pull request had hundreds of similar changes (e.g., “fix indenting”), it may be hard to find the malicious change.

- **Homoglyphs.** Homoglyphs can be problematic because the text will “look correct” to a human reviewer but will actually be different, leading the victim to an alternate library.

*Can you spot the difference?*

| | | | | | | |
|:---:|:---:|:---:|:----:|:---:|:---:|:---:|
| l | e | f | t | p | a | d |
| | | | *vs* | | | |
| ⅼ | е | f | t | р | а | ⅾ |

*In the bottom row, six of the seven characters are from “non-ASCII” character sets defined by Unicode.*

- **Large Diffs.** GitHub does not show large diffs by default, and instead shows a “load diff” link, which can be easy to miss, [especially in lock files](https://snyk.io/blog/why-npm-lockfiles-can-be-a-security-blindspot-for-injecting-malicious-modules/).

- **Binary Files.** Most pull request review systems do not render differences made to binary files, leaving it up to the maintainer to review these out-of-band.
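The homoglyph risk above can be reduced mechanically: a review tool or registry can flag any non-ASCII characters appearing in names and identifiers, since they may be visual look-alikes of ASCII letters. A minimal sketch using Python’s standard `unicodedata` module:

```python
import unicodedata

def find_suspicious_characters(name: str) -> list:
    """Return (character, Unicode name) pairs for every non-ASCII character
    in `name`; these may be homoglyphs of ordinary ASCII letters."""
    suspicious = []
    for ch in name:
        if ord(ch) > 127:
            # unicodedata.name() reveals the character's true identity,
            # e.g. "CYRILLIC SMALL LETTER ER" for a fake Latin "p".
            suspicious.append((ch, unicodedata.name(ch, "UNKNOWN")))
    return suspicious
```

A real implementation would also normalize names (e.g., via NFKC) and compare against known confusable-character tables, but even this simple check would catch the `leftpad` example above.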
We recommend the following:

- Research should be performed to determine the feasibility of this attack and what types of mitigation could help (such as improvements to the pull request review user interface). Specifically, an adversarial “red team” exercise should be undertaken to attempt to submit malicious changes to a target repository created for this purpose.

### Continuous Integration & Delivery

Within a continuous integration and delivery environment, there is some increased risk that vulnerable or malicious content will make its way out to a published package without being detected. Indeed, many of the friction-reducing practices that enable agile development also enable faster “time to market” for “bad” code. If a dependency is updated and all tests pass, a malicious change could make its way into a published package in a matter of seconds.

It is important to consider the principles and practices of [DevSecOps](https://www.csoonline.com/article/3245748/what-is-devsecops-developing-more-secure-applications.html) and how they can be applied within the CI/CD context.
Some useful resources with specific recommendations and walkthroughs include:

- [Every Security Team is a Software Team Now](https://www.youtube.com/watch?v=8armE3Wz0jk) (Dino Dai Zovi, Black Hat USA 2019)

- [Building Secure & Reliable Systems](https://landing.google.com/sre/books/) (Google)

- [DevSecOps - Implementing Secure CI/CD Pipelines](https://www.youtube.com/playlist?list=PLjNII-Jkdjfz5EXWlGMBRk63PC8uJsHMo) (YouTube)

- [A Primer on Secure DevOps: Why DevSecOps Matters](https://techbeacon.com/security/primer-secure-devops-why-devsecops-matters) (Chris Romeo, TechBeacon)

- [Six Pillars of DevSecOps](https://cloudsecurityalliance.org/artifacts/six-pillars-of-devsecops/) (Cloud Security Alliance)

### Package Publishing

The act of publishing a package to a package management repository usually starts with the maintainer establishing an account within that ecosystem, and then performing some action that ends with a package being available for consumers to select and install.

The main threat here is that an attacker would gain access to the maintainer’s credentials, either during account creation, access, or publishing, and use those credentials to perform malicious activities. This could occur through a local attack on the maintainer, an attack on the network or DNS infrastructure, or an attack closer to the central package management infrastructure.

## Package Consumption Phase

![The graph shows the package consumption phases: package selection, package installation, package use, package removal](img/PackageConsumptionFlow.png)

Package consumption is the process through which “external” packages are chosen and integrated into a software component. In this case, package consumption actually refers to two similar things:

- An “end-user” selects an OSS component to use in their software product.
- An OSS author selects an OSS component to use within their OSS component.

In both of these cases, we’ll refer to the “end-user” and the OSS author generically as the “consumer”.

### Package Selection

From a consumer’s perspective, things begin when they are searching for a package to consume. This often takes place on the package management system’s web page or through a command line (e.g., `pip search mysql` or `apt-cache search mail`).

Threats that apply to package selection include:

- [An attacker could compromise a maintainer’s credentials and publish malicious packages.](#Account-Hijacking)

- [An attacker could subvert the package selection client software.](#Compromised-Package-Repository-Client-Software)

- [An attacker could compromise the website that displays the package listings.](#Compromised-Package-Repository-Websites)

- [An attacker could create a new package with a name similar to an existing package (i.e., typo-squatting).](#Typo-Squatting-Attacks)

- [An attacker could modify an existing package within a package management repository.](#Compromised-Package-Repository-Packages)

- [An attacker could remove a component from a package management repository.](#Package-Removal)

#### Account Hijacking

Open source software developers nearly always publish source code and packages to centralized systems, such as GitHub and NPM. These systems typically require credentials in order to perform certain tasks, like publishing a new release, and attackers have frequently targeted open source developers as a vector for publishing malicious code.

To combat this, practically all source code repositories and package management systems have implemented some form of multi-factor authentication for logging in. Unfortunately, many, if not most, developers do not take advantage of this.
In addition, CI/CD pipelines can make multi-factor authentication difficult or impossible, so alternatives like IP-restricted, limited-scope tokens are used instead. These are not ubiquitous, and as a result, attackers continue to hijack accounts and publish malicious code on a regular basis.

We recommend the following:

- Each package management system should expose a flag (and related information) indicating whether a package was published under an account that used a strong authentication method. This should be announced far in advance, to give users ample opportunity to enable strong authentication.

- Each package management client should expose a flag that gives package consumers control over whether or not they allow packages that were not published using a strong authentication method (e.g., `--flag-strong-auth=[silent|warn|fail]`).

- An analysis should be conducted of the top package management systems, documenting best practices and sharing them with the broader package management community.

#### Compromised Package Repository Client Software

Modern CI/CD systems partially mitigate the threat of compromised client software, in that official builds are performed in a more trusted environment than the developer’s local workstation. Those builds would (presumably) select packages as expected.

#### Compromised Package Repository Websites

Most (if not all) package repositories can be browsed through a website (e.g., [npmjs.com](https://npmjs.com), [nuget.org](https://nuget.org), [pypi.org](https://pypi.org)). If these were compromised by an attacker, packages and metadata could be changed in ways that steer users to malicious packages that appear to be authentic. This threat is mitigated (somewhat) when the organizations that manage these resources are sufficiently resourced to maintain a strong security posture.
#### Typo-Squatting Attacks

Typo-squatting occurs when an attacker creates a resource with a name closely resembling an existing name, specifically with the intent of having victims accidentally type the wrong name and access the attacker’s resource. In the context of package management systems, these resources are typically package names (e.g., `djamgo` masquerading as `django`), and they occasionally cross package management systems (`python-dateutil` is available on PyPI, while `python3-dateutil` is the name of the Ubuntu package).

Typo-squatting can be largely mitigated by validating project names when they are published, using indicators like glyph similarity, keyboard distance, edit distance, and related metrics, and taking action when a package’s name is too close to the name of an existing package. Simple solutions like comparing the number of installations of a package may also provide value (e.g., *I see you’re trying to install **momenr**, which has had 4 installations. Did you mean **moment**, which has had 40 million installations?*).

We recommend the following:

- All package management systems should implement measures to detect or prevent package typo-squatting, to reduce the likelihood that an attacker will be able to publish a package that masquerades as a different, authentic package. This information could either be actioned centrally and/or conveyed to the package consumer at selection time.

#### Compromised Package Repository Packages

If an attacker were able to compromise a package repository’s storage or distribution system, they could replace an existing package with a malicious version. Since in most cases trust is anchored to the repository itself, victims who consume the malicious package would have no way of detecting this.

Certain package management systems (e.g.,
[Ubuntu PPAs](https://help.launchpad.net/Packaging/PPA/InstallingSoftware), [NuGet Package Signing](https://docs.microsoft.com/en-us/nuget/create-packages/sign-a-package)) provide package signing that roots trust at least partially in the package author or maintainer, rather than the repository. Others provide [repository signing](https://devblogs.microsoft.com/nuget/introducing-repository-signatures/), which mitigates the risk of a mirror being compromised. The Go ecosystem uses a [notary mechanism](https://go.googlesource.com/proposal/+/master/design/25530-sumdb.md) to ensure that cryptographic checksums of modules do not change after they are initially published.

#### Package Removal

Many software developers assume not only that packages are immutable, but also that they will always be available. Even allowing for periodic network and infrastructure outages, little consideration is generally given to the scenario where a maintainer removes a package from a repository.

This became a problem in March 2016, when the author of 273 NPM packages [removed them all](https://kodfabrik.com/journal/i-ve-just-liberated-my-modules) after a legal/trademark dispute. As a result, any build that used one of those packages (and did not have a cached copy available) began to fail.

A related threat would occur if only certain versions were removed, requiring consumers to downgrade to older, less secure versions. One possible scenario is that an intellectual property claim could be made against only certain versions of a package.

If a package is removed entirely and the package name then becomes available for others to register, an attacker would be able to take advantage of this. Similar to typo-squatting, this would be analogous to forgetting to renew a domain name registration and having a domain squatter take it over and use it to distribute malware.
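Both typo-squatting and name-reuse scenarios could be checked for at registration time by comparing a candidate name against existing (and previously delisted) package names. A minimal sketch using Python’s standard `difflib`; a real implementation would also weigh keyboard distance, glyph similarity, and download counts:

```python
from difflib import SequenceMatcher

def too_similar(candidate: str, existing: list, threshold: float = 0.8) -> list:
    """Return existing package names that are suspiciously close to
    `candidate`; a registry could block or flag registration when the
    returned list is non-empty."""
    matches = []
    for name in existing:
        # SequenceMatcher.ratio() is 1.0 for identical strings and
        # approaches 0.0 as the strings diverge.
        if candidate != name and SequenceMatcher(None, candidate, name).ratio() >= threshold:
            matches.append(name)
    return matches
```

For example, `too_similar("djamgo", ["django", "flask"])` would flag `django`, while an unrelated name would pass through unflagged.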
We recommend the following:

- The scope of this problem should be researched. For each of the major package managers: What is the process for un-publishing a package? How many packages are un-published, and how many installations were they associated with? How many of those were taken down for a non-security reason? When packages are delisted, do the names become available for others to register?

- Each package management system should clearly describe its principles and process for package removal.

### Package Installation

The package installation process usually includes a few different steps, and starts once a package is selected:

- Retrieving and processing metadata about the location of the package.

- Retrieving the actual package contents.

- Validating that the package contents have not been tampered with.

- Extracting those contents into a location on the local file system.

In many cases, an additional step is added:

- Executing an installation script included within the package.

Each of these steps can be subverted in different ways:

- [Network attacks on package installation](#Network-Attacks-on-Package-Installation)

- [Local attacks on the build system configuration](#Local-Attacks-on-the-Build-System-Configuration)

- [Malicious installation scripts](#Malicious-Installation-Scripts)

- [Installation of opaque binaries](#Installation-of-Opaque-Binaries)

#### Network Attacks on Package Installation

Network-layer attacks can modify metadata and package contents while in transit to the consumer. The nearly universal use of TLS significantly reduces this risk, and to a lesser extent, so does the use of private package repositories.
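Transport security is complemented by content validation (the third installation step listed above): a client can verify a downloaded archive against a checksum recorded at selection time, for example in a lock file, so that in-transit tampering is detected even if TLS is misconfigured. A minimal sketch:

```python
import hashlib

def verify_package(data: bytes, expected_sha256: str) -> bool:
    """Compare a downloaded archive against the SHA-256 checksum recorded
    when the package was selected; the client should reject the package
    (not merely warn) on any mismatch."""
    return hashlib.sha256(data).hexdigest() == expected_sha256
```

This is essentially what lock files in NPM, pip, and similar tools provide: the hash is pinned once, and every later install is checked against it.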
Many package management clients allow developers to disable certificate checking (e.g., [NPM](https://docs.npmjs.com/misc/config#strict-ssl), [PyPI](https://pip.pypa.io/en/stable/reference/pip/#cmdoption-trusted-host)), but doing so is [generally discouraged](https://arxiv.org/pdf/1709.09970.pdf) on [Stack Overflow](https://stackoverflow.com) and other forums.

Network-layer attacks may also pertain to private package repositories, which could be configured without TLS under the (mistaken) belief that a private network is always secure.

#### Local Attacks on the Build System Configuration

An attacker could compromise some part of a consumer’s system; for example, an attacker who is able to modify a user’s `.npmrc` file would be able to point the user to an attacker-controlled registry and deliver malicious content in response to any request. Such an attack would be similar to any other local attack, as described here.

We recommend mitigating this (partially, for certain types of resources) by:

- Encouraging developers to use scoped tokens that only grant access to a minimal set of functions, or tokens that are only usable from certain locations.

#### Malicious Installation Scripts

Many packages require special actions to take place as part of installation. These are often codified in an installation script, specified directly in a manifest (as is the case for NPM modules) or as a separate, optional file (as in the case of NuGet). Sometimes, the manifest file itself is executable (as in the case of Python).

In all of these cases, installation is the first opportunity for a malicious package to execute. This often occurs on build servers and, sometimes, on trusted infrastructure.
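One partial defense is to statically flag risky constructs in installation scripts before they are executed or reviewed. The sketch below uses a hypothetical, deliberately incomplete pattern list; real scanners combine far richer static signals with dynamic (sandboxed) analysis:

```python
import re

# Illustrative categories only; a production scanner would use a much
# larger ruleset and parse the code rather than pattern-match its text.
RISKY_PATTERNS = {
    "network access": re.compile(r"\b(urlopen|requests\.(get|post)|socket\.)"),
    "process execution": re.compile(r"\b(subprocess|os\.system|eval|exec)\b"),
    "credential paths": re.compile(r"(\.ssh|\.npmrc|\.aws)"),
}

def scan_install_script(source: str) -> list:
    """Return the categories of risky behavior found in a script's source,
    for a human reviewer or publishing pipeline to act on."""
    return [label for label, pattern in RISKY_PATTERNS.items()
            if pattern.search(source)]
```

An empty result is not a clean bill of health (obfuscation defeats simple pattern matching), but a non-empty one is a cheap, high-signal reason to hold a package for review.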
Attackers sometimes target installation files as a simple way to exfiltrate data, as an attacker did in the NPM [1337qq-js](https://www.zdnet.com/article/microsoft-spots-malicious-npm-package-stealing-data-from-unix-systems/) package.

The [Chocolatey](https://chocolatey.org/) package manager takes an interesting [approach](https://chocolatey.org/docs/security), including a human review of every package published (except for a set of “trusted” package publishers). Unfortunately, it’s hard to see how this can scale to the [thousands](http://www.modulecounts.com/) of packages that are published each day.

To defend against this, we recommend:

- A combination of static and dynamic analysis should take place within the publishing pipeline, analyzing installation script code and behavior to discover unwanted activity. This analysis could use maintainer reputation as an input, and could naturally be extended to cover the entire package (not just the installation script).

- Security teams at each of the major package management systems should communicate regularly, sharing information when new attack patterns have been noticed or reported.

#### Installation of Opaque Binaries

Most consumers assume that when a package is selected and installed, the contents are what they expect. For some types of packages, this is easy to validate: the source code repository and the package contents are often at least similar, if not identical. In other cases, such as [Python wheels](https://www.python.org/dev/peps/pep-0427/), NuGet packages, or Ubuntu PPAs, the packages that are delivered are often compiled, platform-specific binaries. For the purposes of this section, we will consider packages that contain minified or obfuscated code to be similar to binary packages.
Binaries are distributed for a few general reasons:

- The package may need to be compiled using a toolchain, configuration, or dependencies that the downstream developer may not have available at installation time.

- Copying a pre-built binary will almost always be significantly faster than compiling it from source code.

- The developer may wish to obfuscate their package to deter developers from examining it. (For example, not all packages are distributed under an open source license.)

Binary packages are inherently riskier than packages containing readable source code because it is significantly harder to reason over them. The only feasible way to validate that a binary came from a purported source code distribution would be to perform a build and check whether the output is the same. Due to the thousands of different build environments and configuration steps, it is infeasible to do this in a fully automated way. As a result, most consumers are “forced” to place a great deal of trust in the binaries they obtain.

To be clear, the threat here is that a maintainer (either malicious to begin with, or benevolent but with compromised credentials) will publish a version of a package that contains malicious code that was never included in the source code repository.

This threat can be mitigated by requiring reproducibility as part of the publishing process. This could take the form of a process by which the maintainer configures a trusted, well-architected continuous integration system to create a package, with the output then being published directly once the build is complete and validated. Essentially, there would be no “upload package” functionality for the maintainer.
This doesn’t fully mitigate the risk; after all, the build script could pull in and insert malicious code at that stage. In general, though, it would lead to increased transparency and, accordingly, trust.

However, compilers are free to make certain decisions non-deterministically (e.g., changing the ordering of functions within a binary, selecting variable names during minification, selecting different assembly instructions that achieve the same effect, or adding dynamic elements such as a timestamp into the executable). In order to verify reproducibility, some of these settings must be [pre-configured](https://blog.conan.io/2019/09/02/Deterministic-builds-with-C-C++.html).

We therefore recommend the following:

- Consideration should be given to the [Reproducible Builds](https://reproducible-builds.org/) project and whether it can be used as a model for expanding reproducibility across the open source ecosystem.

- Package management systems should consider tighter integration with CI/CD publishing, encouraging publishers to take part in it, and providing a metadata flag to consumers when packages have been built by a trusted entity.

### Package Use

As the name suggests, “package use” occurs when a software product calls a part of an external package to perform some function.

There are a few different threats that apply uniquely to this area:

- [Malicious Packages](#Malicious-Packages)

- [Unconstrained Packages](#Unconstrained-Packages)

- [Dynamic Packages](#Dynamic-Packages)

#### Malicious Packages

In addition to installation scripts, the packages themselves may be malicious in nature. This can be particularly challenging to detect when the source code isn’t readily available, either because it’s published as a binary (see above) or because the installation takes place within a larger system.
For example, extensions installed into Visual Studio Code, Jenkins, WordPress, or even GitHub run arbitrary code against the developer’s software, and the consumer seldom goes through the trouble of finding and validating the source. Since these are often “one-click” installs, there is essentially no barrier to entry.

These threats can be mitigated to a degree through tooling (see [Identifying Security Vulnerabilities in Source Code](#Identifying-Security-Vulnerabilities-in-Source-Code)) and manual introspection (albeit at a higher cost). Other potential mitigations include community scoring based on reputation, transparency around what the project is capable of doing, and rapid investigation (including variant analysis) and removal when malware is found. To be clear, traditional anti-malware solutions are typically insufficient to address this risk.

To address this threat, we recommend the following:

- Package management systems should conduct automated analyses when packages are published; these analyses should include detection of malicious code patterns.

#### Unconstrained Packages

An unconstrained package is one that can do more than what the consumer expects it to do. Typically, packages execute with the same permissions as the calling function. (We are referring to typical in-process calls, not inter-process or network-based calls.) For example, the purpose of left-pad is to pad a string out to a fixed length. It does not need to establish a network connection or write to disk, but there’s nothing stopping it from doing so.

Unconstrained packages are risky for two main reasons:

- A vulnerability in the package could be exploited to take advantage of those additional permissions, which are available but not ordinarily used.

- An attacker who can compromise the package could publish an updated version that performs additional, unexpected actions.
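To illustrate the point, the hypothetical snippet below shows how even a trivial utility runs with its caller’s full permissions; nothing in the runtime prevents the extra, unrelated behavior:

```python
import os

def left_pad(text: str, width: int, fill: str = " ") -> str:
    """An innocuous-looking padding utility (illustrative only)."""
    # Nothing stops an in-process package from also doing something like
    # this: environment variables frequently hold CI tokens and secrets,
    # and a compromised version could quietly exfiltrate them.
    env_snapshot = dict(os.environ)
    _ = env_snapshot  # discarded here; an attacker would send it elsewhere
    return text.rjust(width, fill)
```

The padding result is correct either way, so nothing observable to the caller distinguishes the benign version from a compromised one.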
There have been a few noteworthy attempts to mitigate this risk, including [Android Permissions](https://developer.android.com/guide/topics/permissions/overview), [iOS Permissions](https://developer.apple.com/design/human-interface-guidelines/ios/app-architecture/requesting-permission/), [Windows App Permissions](https://support.microsoft.com/en-us/help/10557/windows-10-app-permissions), and OpenBSD’s [pledge](https://man.openbsd.org/pledge.2) system call. There have also been attempts to add a capabilities model within specific programming languages, including [Java](https://docs.oracle.com/en/java/javase/14/security/permissions-jdk1.html) and [.NET](https://docs.microsoft.com/en-us/dotnet/framework/misc/code-access-security), as well as automated analyses like the [NPM Security Insights API](https://blog.npmjs.org/post/188234999089/new-security-insights-api-sneak-peek).

From a consumer’s perspective, a reasonable interface might be to include the required permissions within the package manifest, to be examined by the consumer during selection and approved during installation. If that package depends on another package, the full set of permissions would have to propagate all the way back to the consumer.

From a package maintainer’s perspective, tooling would need to exist to calculate the minimum required permissions and create the associated manifest attributes.

From a runtime environment’s perspective, certain calls would need to be brokered to ensure that all calls respect the approved permissions.

This could of course become more complex. If component A depends on a small subset of component B that doesn’t require any special permission, but other parts of B *do* require special permission, the maintainer of project A should be able to specify that in the manifest, and consumers would be protected from B using those permissions not explicitly assigned.
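The propagation described above can be sketched as a transitive union over per-package permission declarations. The manifest format here is hypothetical; no mainstream package ecosystem implements such a model today:

```python
def effective_permissions(package: str, manifests: dict, dependencies: dict) -> set:
    """Union of a package's declared permissions with those of its
    (transitive) dependencies; this is the set a consumer would review at
    selection time and approve at installation time.

    Assumes an acyclic dependency graph; real tooling would need cycle
    detection and support for the subset-declaration case described above."""
    perms = set(manifests.get(package, []))
    for dep in dependencies.get(package, []):
        perms |= effective_permissions(dep, manifests, dependencies)
    return perms
```

For example, a package declaring only `filesystem` access that depends on an HTTP client would surface both `filesystem` and `network` to the consumer, making the dependency’s reach visible before installation.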
We recommend the following:

- There should be continued research toward a package-level capabilities system, ultimately integrated into the various runtime environments.

#### Dynamic Packages

Dynamic packages are those that include logic that is not defined within the package contents. For example, a package could download code during installation or at runtime, calling those functions once they are accessible. Dynamic packages present a significant risk for two reasons:

- It can be exceedingly difficult to gain assurance, since the remote resource could change at any point in the future.

- If the remote resource were to become unavailable, at least a portion of a component that references it would fail in some way.

JavaScript uses dynamic packages extensively, whenever a remote `