# PySecDB: security commit dataset in Python

## Description

To foster large-scale research on vulnerability mitigation and to enable comparison of different detection approaches, we are making ***PySecDB***, the dataset from our ICSME 2023 paper, publicly available.

PySecDB is a real-world Python security commit dataset that contains around 1.2K security commits and 2.8K non-security commits. You can find more details on the dataset in the paper *"[Exploring Security Commits in Python](https://csis.gmu.edu/ksun/)"*.

## Download Policy

We are delighted to share PySecDB and hope you find the dataset useful in your research.
However, to prevent misuse, we kindly ask you to fill out a **request form** stating your identity and research scope.
We will verify this information and then send you the download link for the PySecDB dataset.

**Request Steps:**

1. Open the online request form in a browser. \
   **PySecDB Request Form**: https://forms.gle/Uu441xPQ4dqnVGV39. \
   (If you are unable to access the page, please contact SunLab by email.)

2. Sign in to your Google account. \
   *Since our request form and download link are hosted by Google, please use a Gmail address to receive the form response.*

3. In the request form, include your name, affiliation, work email, homepage, and the purpose of using PySecDB. \
   *This information is needed for verification.
   Note that your request may be ignored if we are unable to determine your identity or affiliation.
   We do not share your personal information with any third parties.*

4. Confirm that all the information you provided is correct.

5. Read and acknowledge the [Disclaimer & Download Agreement](#jump) for PySecDB.

6. Submit the request form. \
   *A request receipt will be sent to the email address you provided.
   Once we verify your information, we will email the download link to you as soon as possible.*

**If you are using PySecDB for work that will result in a publication (thesis, dissertation, paper, article), please use the following citation:**


## Disclaimer & Download Agreement

To download the PySecDB dataset, you must agree to the following Disclaimer & Download Agreement terms. Please read them carefully before submitting the PySecDB request form.

- PySecDB was constructed and cross-checked by three experts who work in security patch research.
  Because subjective factors may lead to misclassification, the Sun Security Laboratory (SunLab) cannot guarantee 100% accuracy for the samples in the dataset.

- The copyright of the PySecDB dataset is owned by SunLab.

- PySecDB may be used only for non-commercial research and/or personal use. The dataset must not be used for commercial or for-profit purposes.

- The PySecDB dataset must not be resold or redistributed. Anyone who has obtained PySecDB must not share the dataset with others without permission from SunLab.

## Team

The PySecDB dataset is built by [Sun Security Laboratory](https://sunlab-gmu.github.io/) (SunLab) at [George Mason University](https://www2.gmu.edu/), Fairfax, VA.

---
Last Updated: July 2023