├── .env-dist ├── .gitignore ├── .temp ├── README.md ├── chat ├── completions.go ├── json.go └── models.go ├── cmd ├── ingest.go └── update.go ├── cve ├── json.go └── types.go ├── database ├── affected.go ├── load.go ├── nvd_info.go ├── ovs_info.go ├── parser.go ├── verification.go └── write.go ├── files ├── pkg_info.json └── pkg_info_pypi_test.json ├── go.mod ├── go.sum └── main.go /.env-dist: -------------------------------------------------------------------------------- 1 | DATA_PATH= 2 | GROQ_API_KEY= 3 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .env 2 | nvdbase 3 | -------------------------------------------------------------------------------- /.temp: -------------------------------------------------------------------------------- 1 | 2 | // func loadDataFromNVD(nvdUrl, startDate, endDate, llm, groqApiKey string) { 3 | // req, err := http.NewRequest("GET", nvdUrl, nil) 4 | // if err != nil { 5 | // log.Fatal("Error creating new request: ", err.Error()) 6 | // } 7 | // response, err := http.DefaultClient.Do(req) 8 | // if err != nil { 9 | // log.Fatal("Error executing request: ", err.Error()) 10 | // } 11 | // defer response.Body.Close() 12 | // body, err := io.ReadAll(response.Body) 13 | // if err != nil { 14 | // log.Fatal("Error reading response body: ", err.Error()) 15 | // } 16 | // 17 | // fmt.Printf("%s\n", body) 18 | // var obj cve.NvdResponse 19 | // err = cve.Decode(body, &obj) 20 | // if err != nil { 21 | // log.Fatal("Error decoding json: ", err.Error()) 22 | // } 23 | // 24 | // } 25 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # CVE database ingestion pipeline 2 | 3 | In this exercise, I wrote a program that can create and update a database of Vulnerabilities and their 
corresponding affected Java packages (so primarily the Maven repositories). 4 | The CVE feeds are published [here](https://nvd.nist.gov/vuln/data-feeds#JSON_FEED) and may also be indexed in the [OSV database](https://osv.dev/). 5 | 6 | ## Methodology 7 | 8 | I will briefly explain my thought process for this project. 9 | 10 | - First, I had to get the NVD feeds. I explicitly limited myself to the feeds for 2023 and 2024, so I downloaded the archives from the link above. This was not my first approach, though: I saw that the feeds were going to be migrated to an API, so I tried to implement a search-after / pagination request to gather all the feeds from `start-date` to `end-date`. I soon realized that this would take a lot of time, and I had told myself I shouldn't spend more than 3 hours on a side project like this, so I just downloaded the feeds. 11 | 12 | - The feeds from NVD contain not only packages but also other applications, which do not concern us. The only fields common to the NVD database and the OSV database are the IDs (which can be a GHSA ID, an OSV ID, or even a CVE ID if it exists in the aliases) and the commit SHA of the affected package; an entry can have either one of them, or both. (I'm guessing it's some sort of Elasticsearch/OpenSearch-like database.) The problem is that the JSON from the feeds is deeply nested, and these values sometimes show up in different parts of the JSON. So, to simplify extracting information from the JSON, I used an LLM, namely Llama-3 (8B parameters). In general, when extracting information from very dense JSON, using an LLM seems like a good choice, as they have become quite good at understanding JSON. 13 | 14 | - Luckily, the OSV database API is not rate limited. So the moment we receive the extracted information, we can pass it on to the OSV API, check whether that package exists, and extract the affected package ecosystem, version ranges, etc. 
and filter based on the desired ecosystem, in this case, Maven repositories. 15 | - Once that information is gathered, we will ideally have a CVE ID and its corresponding affected Maven packages and their versions in a JSON file. **Note**: I did not use an actual database; I just used a JSON file called `pkg_info.json`. 16 | 17 | ## Tech choices 18 | 19 | - I made this project in `Go` due to its concurrency patterns. 20 | - The LLM I used is served by an inference engine called [Groq](https://groq.com/), as they have very generous rate limits in the free tier and it's extremely fast: ~0.26 seconds per request on average, which for a model like Llama-3 is extraordinary! 21 | 22 | ### Special tools and libraries 23 | 24 | - I mostly stuck to the Go standard library, except for a scheduler package called [gocron](https://github.com/jasonlvhit/gocron), used to start an update job every 2 hours, since that's roughly how often the NVD database gets updated. 25 | 26 | ## Installation 27 | 28 | Make sure to have [Go](https://go.dev/doc/install) installed. 29 | 30 | Then run: 31 | 32 | ```sh 33 | go build 34 | ``` 35 | 36 | Get your Groq API key [here](https://console.groq.com/keys). 37 | It should be very straightforward: just creating an account should do the job. However, if you want to use my API key, please shoot me an email and I'll send it over :) 38 | 39 | Check the `.env-dist` file to fill in the necessary variables into a `.env`: 40 | 41 | ```sh 42 | DATA_PATH= 43 | GROQ_API_KEY= 44 | ``` 45 | 46 | The `DATA_PATH` is the folder containing the feeds from 2023 and 2024 in JSON format (they have to be downloaded in order to test the code, since they are relatively large files for GitHub), 47 | and the `GROQ_API_KEY` is, well, the Groq API key. 
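
As a quick sanity check before running the ingestion, the sketch below lists which feed files are missing under `DATA_PATH`. The file names follow the `nvdcve-1.1-<year>.json` pattern that `cmd/ingest.go` expects. The helper names `feedPaths` and `missingFeeds` are hypothetical, and the sketch assumes `DATA_PATH` is exported in the process environment rather than read from `.env`.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// feedPaths builds the expected feed file paths under dataPath, using the
// nvdcve-1.1-<year>.json naming that cmd/ingest.go relies on.
// (Hypothetical helper, not part of the repository.)
func feedPaths(dataPath string, years ...int) []string {
	paths := make([]string, 0, len(years))
	for _, y := range years {
		name := fmt.Sprintf("nvdcve-1.1-%d.json", y)
		paths = append(paths, filepath.Join(dataPath, name))
	}
	return paths
}

// missingFeeds returns the expected feed files that cannot be stat'ed,
// for example because they were never downloaded or unzipped.
func missingFeeds(dataPath string, years ...int) []string {
	var missing []string
	for _, p := range feedPaths(dataPath, years...) {
		if _, err := os.Stat(p); err != nil {
			missing = append(missing, p)
		}
	}
	return missing
}

func main() {
	// Assumption: DATA_PATH is set in the environment of this process.
	dataPath := os.Getenv("DATA_PATH")
	if dataPath == "" {
		fmt.Println("DATA_PATH is not set")
		return
	}
	for _, p := range missingFeeds(dataPath, 2023, 2024) {
		fmt.Println("missing feed file:", p)
	}
}
```

If nothing is printed, both yearly feeds are in place and `./nvdbase -c ingest` should at least find its input files.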
48 | 49 | To ingest current data from 2023 - 2024: 50 | 51 | ```sh 52 | ./nvdbase -c ingest 53 | ``` 54 | 55 | To launch a cron job that updates the last modified feeds from NVD: 56 | 57 | ```sh 58 | ./nvdbase -c update 59 | ``` 60 | 61 | ## Challenges faced 62 | 63 | - The Groq API, in spite of having a generous free tier, is very limiting for a large and continuous database feed like the NVD, so I was rate limited quite a few times and had to wait. 64 | - There are significantly fewer Maven packages than, say, PyPI packages, and I was rate limited before I even hit my first Maven package search. So, to even test whether my code was working, I first had to run it against the PyPI ecosystem to see that the JSON file was being written properly. (I included some test results for the PyPI package ecosystem in a file called `pkg_info_pypi_test.json`.) 65 | - Since I initially spent time exploring the NVD feeds API, I had lost some time before I started coding, and I could not finish this test in the given time frame of 2-3 hours. (It took me roughly 5 hours 20 minutes in total.) 66 | 67 | ## Potential improvements 68 | 69 | This code is by no means perfect and could use some improvements if this project scales up. Here are some things that I think would be nice: 70 | 71 | - If a Mistral/OpenAI API key is available, integrating the tool calling / function calling capability of LLMs to call the OSV endpoint directly would be a plus! 72 | - Actually implementing the paginated search for NVD feeds, so that ingestion does not rely solely on files downloaded from the web page. 73 | - Parametrize the `ecosystem` variable to run this ingestion pipeline against many different types of packages. 
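
The last improvement could look roughly like the sketch below, which turns the hard-coded `"Maven"` check in `database/affected.go` into a parameter. The `AffectedPackage` type here is a local copy of `cve.AffectedPackage` so that the example is self-contained. The function name `filterByEcosystem` and the package name `org.example:demo` are hypothetical and only serve as illustration.

```go
package main

import "fmt"

// AffectedPackage mirrors the shape of cve.AffectedPackage in this repository.
type AffectedPackage struct {
	Package  map[string]string `json:"package"`
	Ranges   []map[string]any  `json:"ranges"`
	Versions []string          `json:"versions"`
}

// filterByEcosystem keeps only the packages whose "ecosystem" field equals
// the requested ecosystem (for example "Maven" or "PyPI").
// (Hypothetical helper, not part of the repository.)
func filterByEcosystem(pkgs []AffectedPackage, ecosystem string) []AffectedPackage {
	var out []AffectedPackage
	for _, p := range pkgs {
		if p.Package["ecosystem"] == ecosystem {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	pkgs := []AffectedPackage{
		{Package: map[string]string{"ecosystem": "PyPI", "name": "nautobot"}},
		// Made-up Maven coordinate, used only for illustration.
		{Package: map[string]string{"ecosystem": "Maven", "name": "org.example:demo"}},
	}
	for _, p := range filterByEcosystem(pkgs, "Maven") {
		fmt.Println(p.Package["name"])
	}
}
```

The ecosystem could then be passed in from a command-line flag or an extra variable in `.env`, which would also make the PyPI test run described above a regular configuration rather than a code change.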
74 | -------------------------------------------------------------------------------- /chat/completions.go: -------------------------------------------------------------------------------- 1 | package chat 2 | 3 | import ( 4 | "bytes" 5 | "fmt" 6 | "io" 7 | "log" 8 | "net/http" 9 | "time" 10 | ) 11 | 12 | const SystemMessageStr = ` 13 | You will receive a JSON object as an input from the user, and your task is to extract the following information from it: 14 | - A GHSA_ID usually starts with GHSA- (If it doesn't exist, return an empty string) 15 | - A commit SHA (Again, if it doesn't exist, return an empty string) 16 | - The CVE ID of the given vulnerability 17 | - The description 18 | 19 | Remember to respond in a JSON format. Avoid explaining your response. Refer to the sample outputs below for help 20 | 21 | Example outputs: 22 | { 23 | "ghsa_id": "GHSA-m732-wvh2-7cq4", 24 | "commit": "", 25 | "cve": "CVE-2024-29199" 26 | } 27 | 28 | Empty example: 29 | { 30 | "ghsa_id": "", 31 | "commit": "", 32 | "cve": "CVE-2024-29199" 33 | } 34 | ` 35 | 36 | type ChatCompletionError struct { 37 | StatusCode int 38 | } 39 | 40 | func (c *ChatCompletionError) Error() string { 41 | return fmt.Sprintf("Error from completion response API. 
Status code: %d\n", c.StatusCode) 42 | } 43 | 44 | func GetChatCompletion( 45 | chatRequest *ChatRequest, 46 | apiKey string, 47 | ) (ChatResponse, error) { 48 | var bearer string = fmt.Sprintf("Bearer %s", apiKey) 49 | encodedRequest, err := encode(chatRequest) 50 | if err != nil { 51 | return ChatResponse{}, err 52 | } 53 | bodyReader := bytes.NewReader(encodedRequest) 54 | req, err := http.NewRequest( 55 | "POST", 56 | "https://api.groq.com/openai/v1/chat/completions", 57 | bodyReader, 58 | ) 59 | if err != nil { 60 | log.Println(err.Error()) 61 | return ChatResponse{}, err 62 | } 63 | req.Header.Add("Authorization", bearer) 64 | response, err := http.DefaultClient.Do(req) 65 | if err != nil { 66 | log.Println(err.Error()) 67 | return ChatResponse{}, err 68 | } 69 | defer response.Body.Close() // close the body on every return path, including error statuses 70 | if response.StatusCode == http.StatusTooManyRequests { 71 | log.Println("Too many requests, retrying in 30 seconds") 72 | time.Sleep(30 * time.Second) 73 | return GetChatCompletion(chatRequest, apiKey) 74 | } 75 | if response.StatusCode != http.StatusOK { 76 | log.Printf("Got status code: %d, exiting with empty response", response.StatusCode) 77 | return ChatResponse{}, &ChatCompletionError{StatusCode: response.StatusCode} 78 | } 79 | chatResponse, err := io.ReadAll(response.Body) 80 | if err != nil { 81 | return ChatResponse{}, err 82 | } 83 | var objMap ChatResponse 84 | err = decode(chatResponse, &objMap) 85 | if err != nil { 86 | return ChatResponse{}, err 87 | } 88 | return objMap, nil 89 | } 90 | -------------------------------------------------------------------------------- /chat/json.go: -------------------------------------------------------------------------------- 1 | package chat 2 | 3 | import ( 4 | "encoding/json" 5 | ) 6 | 7 | func encode[T any](object T) ([]byte, error) { 8 | bytes, err := json.Marshal(object) 9 | if err != nil { 10 | return nil, err 11 | } 12 | return bytes, nil 13 | } 14 | 15 | 
func decode[T any](input []byte, obj T) error { 16 | err := json.Unmarshal(input, obj) 17 | if err != nil { 18 | return err 19 | } 20 | return nil 21 | } 22 | -------------------------------------------------------------------------------- /chat/models.go: -------------------------------------------------------------------------------- 1 | package chat 2 | 3 | type Message struct { 4 | Role string `json:"role"` 5 | Content string `json:"content"` 6 | } 7 | 8 | type ChatRequest struct { 9 | Messages []Message `json:"messages"` 10 | Model string `json:"model"` 11 | } 12 | 13 | type ChatResponse struct { 14 | Choices []Choice `json:"choices"` 15 | } 16 | 17 | type Choice struct { 18 | Message Message `json:"message"` 19 | LogProbs float32 `json:"log_probs"` 20 | FinishReason string `json:"finish_reason"` 21 | Index int `json:"index"` 22 | } 23 | -------------------------------------------------------------------------------- /cmd/ingest.go: -------------------------------------------------------------------------------- 1 | package cmd 2 | 3 | import ( 4 | "fmt" 5 | "log" 6 | 7 | "nvdbase/database" 8 | ) 9 | 10 | const ( 11 | OVS_COMMIT_SEARCH_URL = "https://api.osv.dev/v1/query" 12 | OVS_ID_SERCH_URL = "https://api.osv.dev/v1/vulns/" 13 | MODEL_NAME = "llama3-8b-8192" 14 | ) 15 | 16 | func Ingest(dataPath, outputFilePath, groqApiKey string) { 17 | data2023File := fmt.Sprintf("%s/nvdcve-1.1-2023.json", dataPath) 18 | data2024File := fmt.Sprintf("%s/nvdcve-1.1-2024.json", dataPath) 19 | vulnerabilities := database.LoadData( 20 | data2023File, 21 | data2024File, 22 | ) 23 | 24 | if len(vulnerabilities) == 0 { 25 | log.Println("Nothing to do, exiting") 26 | return 27 | } 28 | 29 | pkgInfoCh := database.GetPkgInfo(vulnerabilities, groqApiKey, MODEL_NAME) 30 | for pkgInfo := range pkgInfoCh { 31 | database.VerifyAndWrite(&pkgInfo, outputFilePath) 32 | } 33 | } 34 | -------------------------------------------------------------------------------- /cmd/update.go: 
-------------------------------------------------------------------------------- 1 | package cmd 2 | 3 | import ( 4 | "fmt" 5 | "log" 6 | "os/exec" 7 | 8 | "github.com/jasonlvhit/gocron" 9 | 10 | "nvdbase/database" 11 | ) 12 | 13 | func Update(dataPath, outputFilePath, groqApiKey string) { 14 | gocron.Every(2).Hours().Do(updateDatabase, dataPath, groqApiKey, outputFilePath) 15 | <-gocron.Start() 16 | } 17 | 18 | func updateDatabase(dataPath, groqApiKey, outputFilePath string) { 19 | getModifiedData(dataPath) 20 | dataModifiedFile := fmt.Sprintf("%s/nvdcve-1.1-modified.json", dataPath) 21 | vulnerabilities := database.LoadData(dataModifiedFile) 22 | if len(vulnerabilities) == 0 { 23 | log.Println("Nothing to do, exiting") 24 | return 25 | } 26 | 27 | pkgInfoCh := database.GetPkgInfo(vulnerabilities, groqApiKey, MODEL_NAME) 28 | for pkgInfo := range pkgInfoCh { 29 | database.VerifyAndWrite(&pkgInfo, outputFilePath) 30 | } 31 | } 32 | 33 | func getModifiedData(dataPath string) { 34 | outputFilePath := fmt.Sprintf("%s/modified.zip", dataPath) 35 | curlOutput, err := exec.Command( 36 | "curl", 37 | "-o", 38 | outputFilePath, 39 | "https://nvd.nist.gov/feeds/json/cve/1.1/nvdcve-1.1-modified.json.zip", 40 | ).Output() 41 | if err != nil { 42 | log.Println("error executing curl: ", err.Error()) 43 | return 44 | } 45 | log.Println(string(curlOutput)) // log the output as text, not as a byte slice 46 | 47 | _, err = exec.Command("unzip", "-o", outputFilePath, "-d", dataPath).Output() // -o overwrites the previous extract on repeated runs 48 | if err != nil { 49 | log.Println("error executing unzip: ", err.Error()) 50 | return 51 | } 52 | } 53 | -------------------------------------------------------------------------------- /cve/json.go: -------------------------------------------------------------------------------- 1 | package cve 2 | 3 | import "encoding/json" 4 | 5 | func Encode[T any](object T) ([]byte, error) { 6 | bytes, err := json.Marshal(object) 7 | if err != nil { 8 | return nil, err 9 | } 10 | return bytes, nil 11 | } 12 | 13 | func Decode[T any](input []byte, obj *T) error { 
14 | err := json.Unmarshal(input, obj) 15 | if err != nil { 16 | return err 17 | } 18 | return nil 19 | } 20 | -------------------------------------------------------------------------------- /cve/types.go: -------------------------------------------------------------------------------- 1 | package cve 2 | 3 | type NvdResponse struct { 4 | ResultsPerPage int `json:"resultsPerPage"` 5 | StartIndex int `json:"startIndex"` 6 | TotalResults int `json:"totalResults"` 7 | Format string `json:"format"` 8 | Version string `json:"version"` 9 | Timestamp string `json:"timestamp"` 10 | Vulnerabilities []Vulnerability `json:"vulnerabilities"` 11 | } 12 | 13 | type Vulnerability struct { 14 | Cve Cve `json:"cve"` 15 | } 16 | 17 | type Cve struct { 18 | References References `json:"references"` 19 | CveMetaData CveMetaData `json:"CVE_data_meta"` 20 | } 21 | 22 | type CveMetaData struct { 23 | Id string `json:"ID"` 24 | Assigner string `json:"ASSIGNER"` 25 | } 26 | 27 | type References struct { 28 | ReferenceData []ReferenceData `json:"reference_data"` 29 | } 30 | 31 | type ReferenceData struct { 32 | Url string `json:"url"` 33 | Name string `json:"name"` 34 | Refsource string `json:"refsource"` 35 | Tags []string `json:"tags"` 36 | } 37 | 38 | type PackageInfoNVD struct { 39 | GHSAId string `json:"ghsa_id"` 40 | Commit string `json:"commit"` 41 | Cve string `json:"cve"` 42 | } 43 | 44 | type PackageInfoOVS struct { 45 | Id string `json:"id"` 46 | Summary string `json:"summary"` 47 | Details string `json:"details"` 48 | Affected []AffectedPackage `json:"affected"` 49 | } 50 | 51 | type PackageListOVS struct { 52 | Vulns []PackageInfoOVS `json:"vulns"` 53 | } 54 | 55 | type AffectedPackage struct { 56 | Package map[string]string `json:"package"` 57 | Ranges []map[string]any `json:"ranges"` 58 | Versions []string `json:"versions"` 59 | } 60 | 61 | type NvdDataStatic struct { 62 | CVEDataType string `json:"CVE_data_type"` 63 | CVEDataFormat string `json:"CVE_data_format"` 64 | 
CVEDataVersion string `json:"CVE_data_version"` 65 | CVEDataNumberOfCVES string `json:"CVE_data_number_of_cves"` 66 | CVEDataTimestamp string `json:"CVE_data_timestamp"` 67 | CVEItems []Vulnerability `json:"CVE_items"` 68 | } 69 | 70 | type PackageEntry struct { 71 | Cve string `json:"cve"` 72 | AffectedPackages []AffectedPackage `json:"affected_packages"` 73 | } 74 | -------------------------------------------------------------------------------- /database/affected.go: -------------------------------------------------------------------------------- 1 | package database 2 | 3 | import ( 4 | "log" 5 | 6 | "nvdbase/cve" 7 | ) 8 | 9 | func getAffectedPackages(pkgInfo *cve.PackageInfoNVD) []cve.AffectedPackage { 10 | log.Println("Getting affected packages for ", pkgInfo.Cve) 11 | additionalInfo := getAdditionalInformation(pkgInfo) 12 | var affectedPackages []cve.AffectedPackage 13 | for _, vuln := range additionalInfo.Vulns { 14 | for _, affected := range vuln.Affected { 15 | if affected.Package["ecosystem"] != "Maven" { 16 | continue 17 | } 18 | affectedPackages = append(affectedPackages, affected) 19 | } 20 | } 21 | return affectedPackages 22 | } 23 | -------------------------------------------------------------------------------- /database/load.go: -------------------------------------------------------------------------------- 1 | package database 2 | 3 | import ( 4 | "log" 5 | 6 | "nvdbase/cve" 7 | ) 8 | 9 | func LoadData(dataPaths ...string) []cve.Vulnerability { 10 | var vulnerabilities []cve.Vulnerability = []cve.Vulnerability{} 11 | for _, path := range dataPaths { 12 | nvdData, err := JSONParser(path) 13 | if err != nil { 14 | log.Printf("Error opening datafile: %s\n", err.Error()) 15 | continue 16 | } 17 | vulnerabilities = append(vulnerabilities, nvdData.CVEItems...) 
18 | } 19 | 20 | return vulnerabilities 21 | } 22 | -------------------------------------------------------------------------------- /database/nvd_info.go: -------------------------------------------------------------------------------- 1 | package database 2 | 3 | import ( 4 | "fmt" 5 | "log" 6 | 7 | "nvdbase/chat" 8 | "nvdbase/cve" 9 | ) 10 | 11 | func GetPkgInfoFromVuln( 12 | vuln cve.Vulnerability, 13 | apiKey, modelName string, 14 | ) (cve.PackageInfoNVD, error) { 15 | // Serialize the vulnerability as JSON, since the system prompt promises a JSON object. 16 | vulnData, err := cve.Encode(vuln) 17 | if err != nil { 18 | return cve.PackageInfoNVD{}, err 19 | } 20 | 21 | systemMessage := &chat.Message{Role: "system", Content: chat.SystemMessageStr} 22 | userMessage := &chat.Message{Role: "user", Content: string(vulnData)} 23 | chatReq := &chat.ChatRequest{ 24 | Messages: []chat.Message{*systemMessage, *userMessage}, 25 | Model: modelName, 26 | } 27 | 28 | completion, err := chat.GetChatCompletion(chatReq, apiKey) 29 | if err != nil { 30 | log.Println("Got err: ", err.Error(), " from chat completion. returning") 31 | return cve.PackageInfoNVD{}, err 32 | } 33 | // Guard against an empty choices list before indexing into it. 34 | if len(completion.Choices) == 0 { 35 | return cve.PackageInfoNVD{}, fmt.Errorf("chat completion returned no choices") 36 | } 37 | 38 | var pkgInfo cve.PackageInfoNVD 39 | err = cve.Decode([]byte(completion.Choices[0].Message.Content), &pkgInfo) 40 | if err != nil { 41 | log.Println("Error decoding into package information") 42 | return cve.PackageInfoNVD{}, err 43 | } 44 | 45 | fmt.Printf("got package info: %+v\n", pkgInfo) 46 | return pkgInfo, nil 47 | } 48 | 49 | func GetPkgInfo( 50 | vulnerabilities []cve.Vulnerability, 51 | apiKey, modelName string, 52 | ) <-chan cve.PackageInfoNVD { 53 | pkgInfoCh := make(chan cve.PackageInfoNVD) 54 | go func() { 55 | defer close(pkgInfoCh) 56 | for _, vuln := range vulnerabilities { 57 | pkgInfo, err := GetPkgInfoFromVuln(vuln, apiKey, modelName) 58 | if err != nil { 59 | log.Printf( 60 | "Error getting package information from vulnerability node: %s\n", 61 | err.Error(), 62 | ) 63 | continue 64 | } 65 | pkgInfoCh <- pkgInfo 66 | } 67 | }() 68 | 69 | return pkgInfoCh 70 | } 71 | 
-------------------------------------------------------------------------------- /database/ovs_info.go: -------------------------------------------------------------------------------- 1 | package database 2 | 3 | import ( 4 | "bytes" 5 | "fmt" 6 | "io" 7 | "log" 8 | "net/http" 9 | 10 | "nvdbase/cve" 11 | ) 12 | 13 | func GetAdditionalInfo(pkgInfo *cve.PackageInfoNVD) { 14 | fmt.Printf("Getting additional information with package: %+v\n", pkgInfo) 15 | if pkgInfo.GHSAId == "" && pkgInfo.Commit == "" { 16 | log.Println("no package found, returning") 17 | return 18 | } 19 | 20 | if pkgInfo.Commit != "" { 21 | // write package info with commit id 22 | commitRequest := map[string]interface{}{"commit": pkgInfo.Commit} 23 | requestBody, err := cve.Encode(commitRequest) 24 | if err != nil { 25 | log.Println(err.Error()) 26 | return 27 | } 28 | req, err := http.NewRequest("POST", 29 | OVS_COMMIT_SEARCH_URL, 30 | bytes.NewBuffer(requestBody)) 31 | if err != nil { 32 | log.Println(err.Error()) 33 | return 34 | } 35 | response, err := http.DefaultClient.Do(req) 36 | if err != nil { 37 | log.Println(err.Error()) 38 | return 39 | } 40 | 41 | defer response.Body.Close() 42 | responseBody, err := io.ReadAll(response.Body) 43 | if err != nil { 44 | log.Println(err.Error()) 45 | return 46 | } 47 | 48 | fmt.Println("Received response: ", string(responseBody)) 49 | 50 | var ovsResponse cve.PackageListOVS 51 | err = cve.Decode(responseBody, &ovsResponse) 52 | if err != nil { 53 | log.Println(err.Error()) 54 | return 55 | } 56 | fmt.Printf("OVS response: %+v\n", ovsResponse) 57 | return 58 | } 59 | 60 | if pkgInfo.GHSAId != "" { 61 | // write package info with GHSA id 62 | url := fmt.Sprintf("%s%s", OVS_ID_SERCH_URL, pkgInfo.GHSAId) 63 | log.Println("Performing request with url: ", url) 64 | req, err := http.NewRequest( 65 | "GET", 66 | url, 67 | nil, 68 | ) 69 | if err != nil { 70 | log.Println(err.Error()) 71 | return 72 | } 73 | response, err := http.DefaultClient.Do(req) 74 | if 
err != nil { 75 | log.Println(err.Error()) 76 | return 77 | } 78 | 79 | defer response.Body.Close() // close the body on every return path below 80 | if response.StatusCode != http.StatusOK { 81 | log.Println("Received status: ", response.Status) 82 | return 83 | } 84 | 85 | responseBody, err := io.ReadAll(response.Body) 86 | fmt.Println("Received response: ", string(responseBody)) 87 | if err != nil { 88 | log.Println(err.Error()) 89 | return 90 | } 91 | 92 | var ovsResponse cve.PackageInfoOVS 93 | err = cve.Decode(responseBody, &ovsResponse) 94 | if err != nil { 95 | log.Println(err.Error()) 96 | return 97 | } 98 | fmt.Printf("OVS response: %+v\n", ovsResponse) 99 | return 100 | } 101 | 102 | return 103 | } 104 | -------------------------------------------------------------------------------- /database/parser.go: -------------------------------------------------------------------------------- 1 | package database 2 | 3 | import ( 4 | "io" 5 | "os" 6 | 7 | "nvdbase/cve" 8 | ) 9 | 10 | func JSONParser(filename string) (cve.NvdDataStatic, error) { 11 | var nvdData cve.NvdDataStatic 12 | jsonObj, err := os.Open(filename) 13 | if err != nil { 14 | return cve.NvdDataStatic{}, err 15 | } 16 | defer jsonObj.Close() 17 | 18 | body, err := io.ReadAll(jsonObj) 19 | if err != nil { 20 | return cve.NvdDataStatic{}, err 21 | } 22 | 23 | err = cve.Decode(body, &nvdData) 24 | if err != nil { 25 | return cve.NvdDataStatic{}, err 26 | } 27 | 28 | return nvdData, nil 29 | } 30 | -------------------------------------------------------------------------------- /database/verification.go: -------------------------------------------------------------------------------- 1 | package database 2 | 3 | import ( 4 | "bytes" 5 | "fmt" 6 | "io" 7 | "log" 8 | "net/http" 9 | 10 | "nvdbase/cve" 11 | ) 12 | 13 | const ( 14 | OVS_COMMIT_SEARCH_URL = "https://api.osv.dev/v1/query" 15 | OVS_ID_SERCH_URL = "https://api.osv.dev/v1/vulns/" 16 | ) 17 | 18 | func getAdditionalInformation(pkgInfo *cve.PackageInfoNVD) cve.PackageListOVS { 19 | 
fmt.Printf("Getting additional information with package: %+v\n", pkgInfo) 20 | if pkgInfo.GHSAId == "" && pkgInfo.Commit == "" { 21 | log.Println("no package found, returning") 22 | return cve.PackageListOVS{Vulns: []cve.PackageInfoOVS{}} 23 | } 24 | 25 | if pkgInfo.Commit != "" { 26 | // write package info with commit id 27 | commitRequest := map[string]interface{}{"commit": pkgInfo.Commit} 28 | requestBody, err := cve.Encode(commitRequest) 29 | if err != nil { 30 | log.Println(err.Error()) 31 | return cve.PackageListOVS{Vulns: []cve.PackageInfoOVS{}} 32 | } 33 | req, err := http.NewRequest("POST", 34 | OVS_COMMIT_SEARCH_URL, 35 | bytes.NewBuffer(requestBody)) 36 | if err != nil { 37 | log.Println(err.Error()) 38 | return cve.PackageListOVS{Vulns: []cve.PackageInfoOVS{}} 39 | } 40 | response, err := http.DefaultClient.Do(req) 41 | if err != nil { 42 | log.Println(err.Error()) 43 | return cve.PackageListOVS{Vulns: []cve.PackageInfoOVS{}} 44 | } 45 | 46 | defer response.Body.Close() 47 | responseBody, err := io.ReadAll(response.Body) 48 | if err != nil { 49 | log.Println(err.Error()) 50 | return cve.PackageListOVS{Vulns: []cve.PackageInfoOVS{}} 51 | } 52 | 53 | // fmt.Println("Received response: ", string(responseBody)) 54 | 55 | var ovsResponse cve.PackageListOVS 56 | err = cve.Decode(responseBody, &ovsResponse) 57 | if err != nil { 58 | log.Println(err.Error()) 59 | return cve.PackageListOVS{Vulns: []cve.PackageInfoOVS{}} 60 | } 61 | // fmt.Printf("OVS response: %+v\n", ovsResponse) 62 | return ovsResponse 63 | } 64 | 65 | if pkgInfo.GHSAId != "" { 66 | // write package info with GHSA id 67 | url := fmt.Sprintf("%s%s", OVS_ID_SERCH_URL, pkgInfo.GHSAId) 68 | log.Println("Performing request with url: ", url) 69 | req, err := http.NewRequest( 70 | "GET", 71 | url, 72 | nil, 73 | ) 74 | if err != nil { 75 | log.Println(err.Error()) 76 | return cve.PackageListOVS{Vulns: []cve.PackageInfoOVS{}} 77 | } 78 | response, err := http.DefaultClient.Do(req) 79 | if err != nil { 
80 | log.Println(err.Error()) 81 | return cve.PackageListOVS{Vulns: []cve.PackageInfoOVS{}} 82 | } 83 | 84 | defer response.Body.Close() // close the body on every return path below 85 | if response.StatusCode != http.StatusOK { 86 | log.Println("Received status: ", response.Status) 87 | return cve.PackageListOVS{Vulns: []cve.PackageInfoOVS{}} 88 | } 89 | 90 | responseBody, err := io.ReadAll(response.Body) 91 | fmt.Println("Received response: ", string(responseBody)) 92 | if err != nil { 93 | log.Println(err.Error()) 94 | return cve.PackageListOVS{Vulns: []cve.PackageInfoOVS{}} 95 | } 96 | 97 | var ovsResponse cve.PackageInfoOVS 98 | err = cve.Decode(responseBody, &ovsResponse) 99 | if err != nil { 100 | log.Println(err.Error()) 101 | return cve.PackageListOVS{Vulns: []cve.PackageInfoOVS{}} 102 | } 103 | // fmt.Printf("OVS response: %+v\n", ovsResponse) 104 | return cve.PackageListOVS{Vulns: []cve.PackageInfoOVS{ovsResponse}} 105 | } 106 | 107 | return cve.PackageListOVS{Vulns: []cve.PackageInfoOVS{}} 108 | } 109 | -------------------------------------------------------------------------------- /database/write.go: -------------------------------------------------------------------------------- 1 | package database 2 | 3 | import ( 4 | "log" 5 | "os" 6 | 7 | "nvdbase/cve" 8 | ) 9 | 10 | func VerifyAndWrite(pkgInfo *cve.PackageInfoNVD, fileName string) { 11 | affectedPackages := getAffectedPackages(pkgInfo) 12 | if len(affectedPackages) == 0 { 13 | log.Printf("Found no affected packages for %s, skipping", pkgInfo.Cve) 14 | return 15 | } 16 | 17 | log.Printf("Found %d affected java packages for %s", len(affectedPackages), pkgInfo.Cve) 18 | pkgEntry := cve.PackageEntry{Cve: pkgInfo.Cve, AffectedPackages: affectedPackages} 19 | encoded, err := cve.Encode(pkgEntry) 20 | if err != nil { 21 | log.Printf("Error marshalling package entry: %s\n", err.Error()) 22 | return 23 | } 24 | 25 | log.Printf("Writing affected packages: %+v\n", affectedPackages) 26 | 27 | f, err := os.OpenFile(fileName, 
os.O_APPEND|os.O_WRONLY|os.O_CREATE, 0600) 28 | if err != nil { 29 | panic(err) 30 | } 31 | 32 | defer f.Close() 33 | 34 | if _, err = f.WriteString(string(encoded) + "\n"); err != nil { 35 | panic(err) 36 | } 37 | } 38 | -------------------------------------------------------------------------------- /files/pkg_info.json: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/softmaxer/cve-database-ingestion/0d544c11714fe82b2d17e85f2e94c517142b06ff/files/pkg_info.json -------------------------------------------------------------------------------- /files/pkg_info_pypi_test.json: -------------------------------------------------------------------------------- 1 | {"cve":"CVE-2023-0015","affected_packages":[{"package":{"ecosystem":"PyPI","name":"nautobot","purl":"pkg:pypi/nautobot"},"ranges":[{"events":[{"introduced":"0"},{"fixed":"1.6.16"}],"type":"ECOSYSTEM"}],"versions":["1.0.0","1.0.0a1","1.0.0a2","1.0.0b1","1.0.0b2","1.0.0b3","1.0.0b4","1.0.1","1.0.2","1.0.3","1.1.0","1.1.1","1.1.2","1.1.3","1.1.4","1.1.5","1.1.6","1.2.0","1.2.1","1.2.10","1.2.11","1.2.2","1.2.3","1.2.4","1.2.5","1.2.6","1.2.7","1.2.8","1.2.9","1.3.0","1.3.1","1.3.10","1.3.2","1.3.3","1.3.4","1.3.5","1.3.6","1.3.7","1.3.8","1.3.9","1.4.0","1.4.1","1.4.10","1.4.2","1.4.3","1.4.4","1.4.5","1.4.7","1.4.8","1.4.9","1.5.0","1.5.1","1.5.10","1.5.11","1.5.12","1.5.13","1.5.14","1.5.15","1.5.16","1.5.17","1.5.18","1.5.19","1.5.2","1.5.20","1.5.21","1.5.22","1.5.23","1.5.24","1.5.3","1.5.4","1.5.5","1.5.6","1.5.7","1.5.8","1.5.9","1.6.0","1.6.1","1.6.10","1.6.11","1.6.12","1.6.13","1.6.14","1.6.15","1.6.2","1.6.3","1.6.4","1.6.5","1.6.6","1.6.7","1.6.8","1.6.9"]},{"package":{"ecosystem":"PyPI","name":"nautobot","purl":"pkg:pypi/nautobot"},"ranges":[{"events":[{"introduced":"2.0.0"},{"fixed":"2.1.9"}],"type":"ECOSYSTEM"}],"versions":["2.0.0","2.0.1","2.0.2","2.0.3","2.0.4","2.0.5","2.0.6","2.1.0","2.1.0b1","2.1.1","2.1.2","2.1.3","2.1.4","2.1.5",
"2.1.6","2.1.7","2.1.8"]}]} 2 | {"cve":"CVE-2023-0024","affected_packages":[{"package":{"ecosystem":"PyPI","name":"nautobot","purl":"pkg:pypi/nautobot"},"ranges":[{"events":[{"introduced":"0"},{"fixed":"1.6.16"}],"type":"ECOSYSTEM"}],"versions":["1.0.0","1.0.0a1","1.0.0a2","1.0.0b1","1.0.0b2","1.0.0b3","1.0.0b4","1.0.1","1.0.2","1.0.3","1.1.0","1.1.1","1.1.2","1.1.3","1.1.4","1.1.5","1.1.6","1.2.0","1.2.1","1.2.10","1.2.11","1.2.2","1.2.3","1.2.4","1.2.5","1.2.6","1.2.7","1.2.8","1.2.9","1.3.0","1.3.1","1.3.10","1.3.2","1.3.3","1.3.4","1.3.5","1.3.6","1.3.7","1.3.8","1.3.9","1.4.0","1.4.1","1.4.10","1.4.2","1.4.3","1.4.4","1.4.5","1.4.7","1.4.8","1.4.9","1.5.0","1.5.1","1.5.10","1.5.11","1.5.12","1.5.13","1.5.14","1.5.15","1.5.16","1.5.17","1.5.18","1.5.19","1.5.2","1.5.20","1.5.21","1.5.22","1.5.23","1.5.24","1.5.3","1.5.4","1.5.5","1.5.6","1.5.7","1.5.8","1.5.9","1.6.0","1.6.1","1.6.10","1.6.11","1.6.12","1.6.13","1.6.14","1.6.15","1.6.2","1.6.3","1.6.4","1.6.5","1.6.6","1.6.7","1.6.8","1.6.9"]},{"package":{"ecosystem":"PyPI","name":"nautobot","purl":"pkg:pypi/nautobot"},"ranges":[{"events":[{"introduced":"2.0.0"},{"fixed":"2.1.9"}],"type":"ECOSYSTEM"}],"versions":["2.0.0","2.0.1","2.0.2","2.0.3","2.0.4","2.0.5","2.0.6","2.1.0","2.1.0b1","2.1.1","2.1.2","2.1.3","2.1.4","2.1.5","2.1.6","2.1.7","2.1.8"]}]} 3 | 
{"cve":"CVE-2023-0044","affected_packages":[{"package":{"ecosystem":"PyPI","name":"nautobot","purl":"pkg:pypi/nautobot"},"ranges":[{"events":[{"introduced":"0"},{"fixed":"1.6.16"}],"type":"ECOSYSTEM"}],"versions":["1.0.0","1.0.0a1","1.0.0a2","1.0.0b1","1.0.0b2","1.0.0b3","1.0.0b4","1.0.1","1.0.2","1.0.3","1.1.0","1.1.1","1.1.2","1.1.3","1.1.4","1.1.5","1.1.6","1.2.0","1.2.1","1.2.10","1.2.11","1.2.2","1.2.3","1.2.4","1.2.5","1.2.6","1.2.7","1.2.8","1.2.9","1.3.0","1.3.1","1.3.10","1.3.2","1.3.3","1.3.4","1.3.5","1.3.6","1.3.7","1.3.8","1.3.9","1.4.0","1.4.1","1.4.10","1.4.2","1.4.3","1.4.4","1.4.5","1.4.7","1.4.8","1.4.9","1.5.0","1.5.1","1.5.10","1.5.11","1.5.12","1.5.13","1.5.14","1.5.15","1.5.16","1.5.17","1.5.18","1.5.19","1.5.2","1.5.20","1.5.21","1.5.22","1.5.23","1.5.24","1.5.3","1.5.4","1.5.5","1.5.6","1.5.7","1.5.8","1.5.9","1.6.0","1.6.1","1.6.10","1.6.11","1.6.12","1.6.13","1.6.14","1.6.15","1.6.2","1.6.3","1.6.4","1.6.5","1.6.6","1.6.7","1.6.8","1.6.9"]},{"package":{"ecosystem":"PyPI","name":"nautobot","purl":"pkg:pypi/nautobot"},"ranges":[{"events":[{"introduced":"2.0.0"},{"fixed":"2.1.9"}],"type":"ECOSYSTEM"}],"versions":["2.0.0","2.0.1","2.0.2","2.0.3","2.0.4","2.0.5","2.0.6","2.1.0","2.1.0b1","2.1.1","2.1.2","2.1.3","2.1.4","2.1.5","2.1.6","2.1.7","2.1.8"]}]} 4 | 
{"cve":"CVE-2023-0055","affected_packages":[{"package":{"ecosystem":"PyPI","name":"pyload-ng","purl":"pkg:pypi/pyload-ng"},"ranges":[{"events":[{"introduced":"0"},{"fixed":"1374c824271cb7e927740664d06d2e577624ca3e"},{"fixed":"c7cdc18ad9134a75222974b39e8b427c4af845fc"}],"repo":"https://github.com/pyload/pyload","type":"GIT"},{"events":[{"introduced":"0"},{"fixed":"0.5.0b3.dev78"}],"type":"ECOSYSTEM"}],"versions":["0.5.0a5.dev528","0.5.0a5.dev532","0.5.0a5.dev535","0.5.0a5.dev536","0.5.0a5.dev537","0.5.0a5.dev539","0.5.0a5.dev540","0.5.0a5.dev545","0.5.0a5.dev562","0.5.0a5.dev564","0.5.0a5.dev565","0.5.0a6.dev570","0.5.0a6.dev578","0.5.0a6.dev587","0.5.0a7.dev596","0.5.0a8.dev602","0.5.0a9.dev615","0.5.0a9.dev629","0.5.0a9.dev632","0.5.0a9.dev641","0.5.0a9.dev643","0.5.0a9.dev655","0.5.0a9.dev806","0.5.0b1.dev1","0.5.0b1.dev2","0.5.0b1.dev3","0.5.0b1.dev4","0.5.0b1.dev5","0.5.0b2.dev10","0.5.0b2.dev11","0.5.0b2.dev12","0.5.0b2.dev9","0.5.0b3.dev13","0.5.0b3.dev14","0.5.0b3.dev17","0.5.0b3.dev18","0.5.0b3.dev19","0.5.0b3.dev20","0.5.0b3.dev21","0.5.0b3.dev22","0.5.0b3.dev24","0.5.0b3.dev26","0.5.0b3.dev27","0.5.0b3.dev28","0.5.0b3.dev29","0.5.0b3.dev30","0.5.0b3.dev31","0.5.0b3.dev32","0.5.0b3.dev33","0.5.0b3.dev34","0.5.0b3.dev35","0.5.0b3.dev38","0.5.0b3.dev39","0.5.0b3.dev40","0.5.0b3.dev41","0.5.0b3.dev42","0.5.0b3.dev43","0.5.0b3.dev44","0.5.0b3.dev45","0.5.0b3.dev46","0.5.0b3.dev47","0.5.0b3.dev48","0.5.0b3.dev49","0.5.0b3.dev50","0.5.0b3.dev51","0.5.0b3.dev52","0.5.0b3.dev53","0.5.0b3.dev54","0.5.0b3.dev57","0.5.0b3.dev60","0.5.0b3.dev62","0.5.0b3.dev64","0.5.0b3.dev65","0.5.0b3.dev66","0.5.0b3.dev67","0.5.0b3.dev68","0.5.0b3.dev69","0.5.0b3.dev70","0.5.0b3.dev71","0.5.0b3.dev72","0.5.0b3.dev73","0.5.0b3.dev74","0.5.0b3.dev75","0.5.0b3.dev76","0.5.0b3.dev77"]}]} 5 | 
{"cve":"CVE-2023-0057","affected_packages":[{"package":{"ecosystem":"PyPI","name":"pyload-ng","purl":"pkg:pypi/pyload-ng"},"ranges":[{"events":[{"introduced":"0"},{"fixed":"1374c824271cb7e927740664d06d2e577624ca3e"},{"fixed":"c7cdc18ad9134a75222974b39e8b427c4af845fc"}],"repo":"https://github.com/pyload/pyload","type":"GIT"},{"events":[{"introduced":"0"},{"fixed":"0.5.0b3.dev78"}],"type":"ECOSYSTEM"}],"versions":["0.5.0a5.dev528","0.5.0a5.dev532","0.5.0a5.dev535","0.5.0a5.dev536","0.5.0a5.dev537","0.5.0a5.dev539","0.5.0a5.dev540","0.5.0a5.dev545","0.5.0a5.dev562","0.5.0a5.dev564","0.5.0a5.dev565","0.5.0a6.dev570","0.5.0a6.dev578","0.5.0a6.dev587","0.5.0a7.dev596","0.5.0a8.dev602","0.5.0a9.dev615","0.5.0a9.dev629","0.5.0a9.dev632","0.5.0a9.dev641","0.5.0a9.dev643","0.5.0a9.dev655","0.5.0a9.dev806","0.5.0b1.dev1","0.5.0b1.dev2","0.5.0b1.dev3","0.5.0b1.dev4","0.5.0b1.dev5","0.5.0b2.dev10","0.5.0b2.dev11","0.5.0b2.dev12","0.5.0b2.dev9","0.5.0b3.dev13","0.5.0b3.dev14","0.5.0b3.dev17","0.5.0b3.dev18","0.5.0b3.dev19","0.5.0b3.dev20","0.5.0b3.dev21","0.5.0b3.dev22","0.5.0b3.dev24","0.5.0b3.dev26","0.5.0b3.dev27","0.5.0b3.dev28","0.5.0b3.dev29","0.5.0b3.dev30","0.5.0b3.dev31","0.5.0b3.dev32","0.5.0b3.dev33","0.5.0b3.dev34","0.5.0b3.dev35","0.5.0b3.dev38","0.5.0b3.dev39","0.5.0b3.dev40","0.5.0b3.dev41","0.5.0b3.dev42","0.5.0b3.dev43","0.5.0b3.dev44","0.5.0b3.dev45","0.5.0b3.dev46","0.5.0b3.dev47","0.5.0b3.dev48","0.5.0b3.dev49","0.5.0b3.dev50","0.5.0b3.dev51","0.5.0b3.dev52","0.5.0b3.dev53","0.5.0b3.dev54","0.5.0b3.dev57","0.5.0b3.dev60","0.5.0b3.dev62","0.5.0b3.dev64","0.5.0b3.dev65","0.5.0b3.dev66","0.5.0b3.dev67","0.5.0b3.dev68","0.5.0b3.dev69","0.5.0b3.dev70","0.5.0b3.dev71","0.5.0b3.dev72","0.5.0b3.dev73","0.5.0b3.dev74","0.5.0b3.dev75","0.5.0b3.dev76","0.5.0b3.dev77"]}]} 6 | -------------------------------------------------------------------------------- /go.mod: -------------------------------------------------------------------------------- 1 | module nvdbase 2 | 3 | 
go 1.22.3 4 | 5 | require github.com/go-co-op/gocron v1.37.0 6 | 7 | require ( 8 | github.com/PuerkitoBio/goquery v1.9.2 // indirect 9 | github.com/andybalholm/cascadia v1.3.2 // indirect 10 | github.com/antchfx/htmlquery v1.3.1 // indirect 11 | github.com/antchfx/xmlquery v1.4.0 // indirect 12 | github.com/antchfx/xpath v1.3.0 // indirect 13 | github.com/gobwas/glob v0.2.3 // indirect 14 | github.com/gocolly/colly v1.2.0 // indirect 15 | github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da // indirect 16 | github.com/golang/protobuf v1.5.2 // indirect 17 | github.com/google/uuid v1.4.0 // indirect 18 | github.com/jasonlvhit/gocron v0.0.1 // indirect 19 | github.com/joho/godotenv v1.5.1 // indirect 20 | github.com/kennygrant/sanitize v1.2.4 // indirect 21 | github.com/robfig/cron v1.2.0 // indirect 22 | github.com/robfig/cron/v3 v3.0.1 // indirect 23 | github.com/saintfish/chardet v0.0.0-20230101081208-5e3ef4b5456d // indirect 24 | github.com/temoto/robotstxt v1.1.2 // indirect 25 | go.uber.org/atomic v1.9.0 // indirect 26 | golang.org/x/net v0.26.0 // indirect 27 | golang.org/x/text v0.16.0 // indirect 28 | google.golang.org/appengine v1.6.8 // indirect 29 | google.golang.org/protobuf v1.26.0 // indirect 30 | ) 31 | -------------------------------------------------------------------------------- /go.sum: -------------------------------------------------------------------------------- 1 | github.com/PuerkitoBio/goquery v1.9.2 h1:4/wZksC3KgkQw7SQgkKotmKljk0M6V8TUvA8Wb4yPeE= 2 | github.com/PuerkitoBio/goquery v1.9.2/go.mod h1:GHPCaP0ODyyxqcNoFGYlAprUFH81NuRPd0GX3Zu2Mvk= 3 | github.com/andybalholm/cascadia v1.3.2 h1:3Xi6Dw5lHF15JtdcmAHD3i1+T8plmv7BQ/nsViSLyss= 4 | github.com/andybalholm/cascadia v1.3.2/go.mod h1:7gtRlve5FxPPgIgX36uWBX58OdBsSS6lUvCFb+h7KvU= 5 | github.com/antchfx/htmlquery v1.3.1 h1:wm0LxjLMsZhRHfQKKZscDf2COyH4vDYA3wyH+qZ+Ylc= 6 | github.com/antchfx/htmlquery v1.3.1/go.mod h1:PTj+f1V2zksPlwNt7uVvZPsxpKNa7mlVliCRxLX6Nx8= 7 | 
github.com/antchfx/xmlquery v1.4.0 h1:xg2HkfcRK2TeTbdb0m1jxCYnvsPaGY/oeZWTGqX/0hA= 8 | github.com/antchfx/xmlquery v1.4.0/go.mod h1:Ax2aeaeDjfIw3CwXKDQ0GkwZ6QlxoChlIBP+mGnDFjI= 9 | github.com/antchfx/xpath v1.3.0 h1:nTMlzGAK3IJ0bPpME2urTuFL76o4A96iYvoKFHRXJgc= 10 | github.com/antchfx/xpath v1.3.0/go.mod h1:i54GszH55fYfBmoZXapTHN8T8tkcHfRgLyVwwqzXNcs= 11 | github.com/creack/pty v1.1.9/go.mod h1:oKZEueFk5CKHvIhNR5MUki03XCEU+Q6VDXinZuGJ33E= 12 | github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= 13 | github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c= 14 | github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= 15 | github.com/fsnotify/fsnotify v1.4.7/go.mod h1:jwhsz4b93w/PPRr/qN1Yymfu8t87LnFCMoQvtojpjFo= 16 | github.com/go-co-op/gocron v1.37.0 h1:ZYDJGtQ4OMhTLKOKMIch+/CY70Brbb1dGdooLEhh7b0= 17 | github.com/go-co-op/gocron v1.37.0/go.mod h1:3L/n6BkO7ABj+TrfSVXLRzsP26zmikL4ISkLQ0O8iNY= 18 | github.com/go-redis/redis v6.15.5+incompatible/go.mod h1:NAIEuMOZ/fxfXJIrKDQDz8wamY7mA7PouImQ2Jvg6kA= 19 | github.com/gobwas/glob v0.2.3 h1:A4xDbljILXROh+kObIiy5kIaPYD8e96x1tgBhUI5J+Y= 20 | github.com/gobwas/glob v0.2.3/go.mod h1:d3Ez4x06l9bZtSvzIay5+Yzi0fmZzPgnTbPcKjJAkT8= 21 | github.com/gocolly/colly v1.2.0 h1:qRz9YAn8FIH0qzgNUw+HT9UN7wm1oF9OBAilwEWpyrI= 22 | github.com/gocolly/colly v1.2.0/go.mod h1:Hof5T3ZswNVsOHYmba1u03W65HDWgpV5HifSuueE0EA= 23 | github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da h1:oI5xCqsCo564l8iNU+DwB5epxmsaqB+rhGL0m5jtYqE= 24 | github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da/go.mod h1:cIg4eruTrX1D+g88fzRXU5OdNfaM+9IcxsU14FzY7Hc= 25 | github.com/golang/protobuf v1.2.0/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U= 26 | github.com/golang/protobuf v1.5.0/go.mod h1:FsONVRAS9T7sI+LIUmWTfcYkHO4aIWwzhcaSAoJOfIk= 27 | github.com/golang/protobuf v1.5.2 h1:ROPKBNFfQgOUMifHyP+KYbvpjbdoFNs+aK7DXlji0Tw= 28 | 
github.com/golang/protobuf v1.5.2/go.mod h1:XVQd3VNwM+JqD3oG2Ue2ip4fOMUkwXdXDdiuN0vRsmY= 29 | github.com/google/go-cmp v0.5.5/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE= 30 | github.com/google/uuid v1.4.0 h1:MtMxsa51/r9yyhkyLsVeVt0B+BGQZzpQiTQ4eHZ8bc4= 31 | github.com/google/uuid v1.4.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo= 32 | github.com/hpcloud/tail v1.0.0/go.mod h1:ab1qPbhIpdTxEkNHXyeSf5vhxWSCs/tWer42PpOxQnU= 33 | github.com/jasonlvhit/gocron v0.0.1 h1:qTt5qF3b3srDjeOIR4Le1LfeyvoYzJlYpqvG7tJX5YU= 34 | github.com/jasonlvhit/gocron v0.0.1/go.mod h1:k9a3TV8VcU73XZxfVHCHWMWF9SOqgoku0/QlY2yvlA4= 35 | github.com/joho/godotenv v1.5.1 h1:7eLL/+HRGLY0ldzfGMeQkb7vMd0as4CfYvUVzLqw0N0= 36 | github.com/joho/godotenv v1.5.1/go.mod h1:f4LDr5Voq0i2e/R5DDNOoa2zzDfwtkZa6DnEwAbqwq4= 37 | github.com/kennygrant/sanitize v1.2.4 h1:gN25/otpP5vAsO2djbMhF/LQX6R7+O1TB4yv8NzpJ3o= 38 | github.com/kennygrant/sanitize v1.2.4/go.mod h1:LGsjYYtgxbetdg5owWB2mpgUL6e2nfw2eObZ0u0qvak= 39 | github.com/kr/pretty v0.1.0/go.mod h1:dAy3ld7l9f0ibDNOQOHHMYYIIbhfbHSm3C4ZsoJORNo= 40 | github.com/kr/pretty v0.2.1/go.mod h1:ipq/a2n7PKx3OHsz4KJII5eveXtPO4qwEXGdVfWzfnI= 41 | github.com/kr/pretty v0.3.0/go.mod h1:640gp4NfQd8pI5XOwp5fnNeVWj67G7CFk/SaSQn7NBk= 42 | github.com/kr/pty v1.1.1/go.mod h1:pFQYn66WHrOpPYNljwOMqo10TkYh1fy3cYio2l3bCsQ= 43 | github.com/kr/text v0.1.0/go.mod h1:4Jbv+DJW3UT/LiOwJeYQe1efqtUx/iVham/4vfdArNI= 44 | github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE= 45 | github.com/onsi/ginkgo v1.6.0/go.mod h1:lLunBs/Ym6LB5Z9jYTR76FiuTmxDTDusOGeTQH+WWjE= 46 | github.com/onsi/ginkgo v1.10.1/go.mod h1:lLunBs/Ym6LB5Z9jYTR76FiuTmxDTDusOGeTQH+WWjE= 47 | github.com/onsi/gomega v1.7.0/go.mod h1:ex+gbHU/CVuBBDIJjb2X0qEXbFg53c61hWP/1CpauHY= 48 | github.com/pkg/diff v0.0.0-20210226163009-20ebb0f2a09e/go.mod h1:pJLUxLENpZxwdsKMEsNbx1VGcRFpLqf3715MtcvvzbA= 49 | github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM= 50 | 
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4= 51 | github.com/robfig/cron v1.2.0 h1:ZjScXvvxeQ63Dbyxy76Fj3AT3Ut0aKsyd2/tl3DTMuQ= 52 | github.com/robfig/cron v1.2.0/go.mod h1:JGuDeoQd7Z6yL4zQhZ3OPEVHB7fL6Ka6skscFHfmt2k= 53 | github.com/robfig/cron/v3 v3.0.1 h1:WdRxkvbJztn8LMz/QEvLN5sBU+xKpSqwwUO1Pjr4qDs= 54 | github.com/robfig/cron/v3 v3.0.1/go.mod h1:eQICP3HwyT7UooqI/z+Ov+PtYAWygg1TEWWzGIFLtro= 55 | github.com/rogpeppe/go-internal v1.6.1/go.mod h1:xXDCJY+GAPziupqXw64V24skbSoqbTEfhy4qGm1nDQc= 56 | github.com/rogpeppe/go-internal v1.8.1/go.mod h1:JeRgkft04UBgHMgCIwADu4Pn6Mtm5d4nPKWu0nJ5d+o= 57 | github.com/saintfish/chardet v0.0.0-20230101081208-5e3ef4b5456d h1:hrujxIzL1woJ7AwssoOcM/tq5JjjG2yYOc8odClEiXA= 58 | github.com/saintfish/chardet v0.0.0-20230101081208-5e3ef4b5456d/go.mod h1:uugorj2VCxiV1x+LzaIdVa9b4S4qGAcH6cbhh4qVxOU= 59 | github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME= 60 | github.com/stretchr/objx v0.4.0/go.mod h1:YvHI0jy2hoMjB+UWwv71VJQ9isScKT/TqJzVSSt89Yw= 61 | github.com/stretchr/objx v0.5.0/go.mod h1:Yh+to48EsGEfYuaHDzXPcE3xhTkx73EhmCGUpEOglKo= 62 | github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI= 63 | github.com/stretchr/testify v1.4.0/go.mod h1:j7eGeouHqKxXV5pUuKE4zz7dFj8WfuZ+81PSLYec5m4= 64 | github.com/stretchr/testify v1.7.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg= 65 | github.com/stretchr/testify v1.8.0/go.mod h1:yNjHg4UonilssWZ8iaSj1OCr/vHnekPRkoO+kdMU+MU= 66 | github.com/stretchr/testify v1.8.2 h1:+h33VjcLVPDHtOdpUCuF+7gSuG3yGIftsP1YvFihtJ8= 67 | github.com/stretchr/testify v1.8.2/go.mod h1:w2LPCIKwWwSfY2zedu0+kehJoqGctiVI29o6fzry7u4= 68 | github.com/temoto/robotstxt v1.1.2 h1:W2pOjSJ6SWvldyEuiFXNxz3xZ8aiWX5LbfDiOFd7Fxg= 69 | github.com/temoto/robotstxt v1.1.2/go.mod h1:+1AmkuG3IYkh1kv0d2qEB9Le88ehNO0zwOr3ujewlOo= 70 | github.com/yuin/goldmark v1.4.13/go.mod h1:6yULJ656Px+3vBD8DxQVa3kxgyrAnzto9xy5taEt/CY= 
71 | go.uber.org/atomic v1.9.0 h1:ECmE8Bn/WFTYwEW/bpKD3M8VtR/zQVbavAoalC1PYyE= 72 | go.uber.org/atomic v1.9.0/go.mod h1:fEN4uk6kAWBTFdckzkM89CLk9XfWZrxpCo0nPH17wJc= 73 | golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w= 74 | golang.org/x/crypto v0.0.0-20210921155107-089bfa567519/go.mod h1:GvvjBRRGRdwPK5ydBHafDWAxML/pGHZbMvKqRZ5+Abc= 75 | golang.org/x/mod v0.6.0-dev.0.20220419223038-86c51ed26bb4/go.mod h1:jJ57K6gSWd91VN4djpZkiMVwK6gcyfeH4XE8wZrZaV4= 76 | golang.org/x/mod v0.8.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs= 77 | golang.org/x/net v0.0.0-20180906233101-161cd47e91fd/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= 78 | golang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s= 79 | golang.org/x/net v0.0.0-20200226121028-0de0cce0169b/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s= 80 | golang.org/x/net v0.0.0-20210226172049-e18ecbb05110/go.mod h1:m0MpNAwzfU5UDzcl9v0D8zg8gWTRqZa9RBIspLL5mdg= 81 | golang.org/x/net v0.0.0-20220722155237-a158d28d115b/go.mod h1:XRhObCWvk6IyKnWLug+ECip1KBveYUHfp+8e9klMJ9c= 82 | golang.org/x/net v0.6.0/go.mod h1:2Tu9+aMcznHK/AK1HMvgo6xiTLG5rD5rZLDS+rp2Bjs= 83 | golang.org/x/net v0.7.0/go.mod h1:2Tu9+aMcznHK/AK1HMvgo6xiTLG5rD5rZLDS+rp2Bjs= 84 | golang.org/x/net v0.9.0/go.mod h1:d48xBJpPfHeWQsugry2m+kC02ZBRGRgulfHnEXEuWns= 85 | golang.org/x/net v0.26.0 h1:soB7SVo0PWrY4vPW/+ay0jKDNScG2X9wFeYlXIvJsOQ= 86 | golang.org/x/net v0.26.0/go.mod h1:5YKkiSynbBIh3p6iOc/vibscux0x38BZDkn8sCUPxHE= 87 | golang.org/x/sync v0.0.0-20180314180146-1d60e4601c6f/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= 88 | golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= 89 | golang.org/x/sync v0.0.0-20190911185100-cd5d95a43a6e/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= 90 | golang.org/x/sync v0.0.0-20220722155255-886fb9371eb4/go.mod 
h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= 91 | golang.org/x/sync v0.1.0/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= 92 | golang.org/x/sys v0.0.0-20180909124046-d0be0721c37e/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= 93 | golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= 94 | golang.org/x/sys v0.0.0-20190412213103-97732733099d/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= 95 | golang.org/x/sys v0.0.0-20201119102817-f84b799fce68/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= 96 | golang.org/x/sys v0.0.0-20210615035016-665e8c7367d1/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= 97 | golang.org/x/sys v0.0.0-20220520151302-bc2c85ada10a/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= 98 | golang.org/x/sys v0.0.0-20220722155257-8c9f86f7a55f/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= 99 | golang.org/x/sys v0.5.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= 100 | golang.org/x/sys v0.7.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= 101 | golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo= 102 | golang.org/x/term v0.0.0-20210927222741-03fcf44c2211/go.mod h1:jbD1KX2456YbFQfuXm/mYQcufACuNUgVhRMnK/tPxf8= 103 | golang.org/x/term v0.5.0/go.mod h1:jMB1sMXY+tzblOD4FWmEbocvup2/aLOaQEp7JmGp78k= 104 | golang.org/x/term v0.7.0/go.mod h1:P32HKFT3hSsZrRxla30E9HqToFYAQPCMs/zFMBUFqPY= 105 | golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ= 106 | golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ= 107 | golang.org/x/text v0.3.7/go.mod h1:u+2+/6zg+i71rQMx5EYifcz6MCKuco9NR6JIITiCfzQ= 108 | golang.org/x/text v0.3.8/go.mod h1:E6s5w1FMmriuDzIBO73fBruAKo1PCIq6d2Q6DHfQ8WQ= 109 | golang.org/x/text v0.7.0/go.mod h1:mrYo+phRRbMaCq/xk9113O4dZlRixOauAjOtrjsXDZ8= 110 | golang.org/x/text v0.9.0/go.mod 
h1:e1OnstbJyHTd6l/uOt8jFFHp6TRDWZR/bV3emEE/zU8= 111 | golang.org/x/text v0.16.0 h1:a94ExnEXNtEwYLGJSIUxnWoxoRz/ZcCsV63ROupILh4= 112 | golang.org/x/text v0.16.0/go.mod h1:GhwF1Be+LQoKShO3cGOHzqOgRrGaYc9AvblQOmPVHnI= 113 | golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ= 114 | golang.org/x/tools v0.0.0-20191119224855-298f0cb1881e/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo= 115 | golang.org/x/tools v0.1.12/go.mod h1:hNGJHUnrk76NpqgfD5Aqm5Crs+Hm0VOH/i9J2+nxYbc= 116 | golang.org/x/tools v0.6.0/go.mod h1:Xwgl3UAJ/d3gWutnCtw505GrjyAbvKui8lOU390QaIU= 117 | golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= 118 | golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= 119 | google.golang.org/appengine v1.6.8 h1:IhEN5q69dyKagZPYMSdIjS2HqprW324FRQZJcGqPAsM= 120 | google.golang.org/appengine v1.6.8/go.mod h1:1jJ3jBArFh5pcgW8gCtRJnepW8FzD1V44FJffLiz/Ds= 121 | google.golang.org/protobuf v1.26.0-rc.1/go.mod h1:jlhhOSvTdKEhbULTjvd4ARK9grFBp09yW+WbY/TyQbw= 122 | google.golang.org/protobuf v1.26.0 h1:bxAC2xTBsZGibn2RTntX0oH50xLsqy1OxA9tTL3p/lk= 123 | google.golang.org/protobuf v1.26.0/go.mod h1:9q0QmTI4eRPtz6boOQmLYwt+qCgq0jsYwAQnmE0givc= 124 | gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= 125 | gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= 126 | gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c/go.mod h1:JHkPIbrfpd72SG/EVd6muEfDQjcINNoR0C8j2r3qZ4Q= 127 | gopkg.in/errgo.v2 v2.1.0/go.mod h1:hNsd1EY+bozCKY1Ytp96fpM3vjJbqLJn88ws8XvfDNI= 128 | gopkg.in/fsnotify.v1 v1.4.7/go.mod h1:Tz8NjZHkW78fSQdbUxIjBTcgA1z1m8ZHf0WmKUhAMys= 129 | gopkg.in/tomb.v1 v1.0.0-20141024135613-dd632973f1e7/go.mod h1:dt/ZhP58zS4L8KSrWDmTeBkI65Dw0HsyUHuEVlX15mw= 130 | gopkg.in/yaml.v2 v2.2.1/go.mod 
h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI= 131 | gopkg.in/yaml.v2 v2.2.2/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI= 132 | gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM= 133 | gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA= 134 | gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM= 135 | -------------------------------------------------------------------------------- /main.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "flag" 5 | "fmt" 6 | "os" 7 | 8 | "github.com/joho/godotenv" 9 | 10 | "nvdbase/cmd" 11 | ) 12 | 13 | func main() { 14 | // Load .env if present; the returned error is ignored. 15 | godotenv.Load(".env") 16 | groqApiKey := os.Getenv("GROQ_API_KEY") 17 | dataPath := os.Getenv("DATA_PATH") 18 | var outputFilePath string 19 | var command string 20 | flag.StringVar( 21 | &outputFilePath, 22 | "o", 23 | "files/pkg_info.json", 24 | "The output file to write the affected package information to.", 25 | ) 26 | flag.StringVar( 27 | &command, 28 | "c", 29 | "ingest", 30 | "Command to run: \"ingest\" ingests data from 2023-2024, \"update\" starts an update script", 31 | ) 32 | flag.Parse() 33 | 34 | switch command { 35 | case "ingest": 36 | cmd.Ingest(dataPath, outputFilePath, groqApiKey) 37 | case "update": 38 | cmd.Update(dataPath, outputFilePath, groqApiKey) 39 | default: 40 | // Reject unknown commands instead of exiting silently with status 0. 41 | fmt.Fprintf(os.Stderr, "unknown command %q: expected \"ingest\" or \"update\"\n", command) 42 | os.Exit(2) 43 | } 44 | } 45 | --------------------------------------------------------------------------------
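The file written through the `-o` flag (`files/pkg_info.json` by default) appears, from the records listed above, to hold one JSON object per line with a `cve` ID and an `affected_packages` array. The sketch below shows one way such a line could be decoded in Go. It is only an illustration: the type names (`Record`, `Affected`, `Package`, `Range`, `Event`) and the helpers `decodeRecord` and `fixedVersions` are hypothetical and are not taken from `cve/types.go`, and `sampleLine` is a record from the listing with its `versions` array shortened.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical types mirroring the fields visible in files/pkg_info.json.
// They are assumptions for illustration, not the repository's own types.

// Event is one boundary of a range, e.g. {"introduced":"0"} or {"fixed":"1.6.16"}.
type Event struct {
	Introduced string `json:"introduced,omitempty"`
	Fixed      string `json:"fixed,omitempty"`
}

// Range is one entry of "ranges"; Repo only appears on GIT ranges in the sample data.
type Range struct {
	Type   string  `json:"type"`
	Repo   string  `json:"repo,omitempty"`
	Events []Event `json:"events"`
}

// Package identifies the affected package.
type Package struct {
	Ecosystem string `json:"ecosystem"`
	Name      string `json:"name"`
	Purl      string `json:"purl"`
}

// Affected groups a package with its affected ranges and versions.
type Affected struct {
	Package  Package  `json:"package"`
	Ranges   []Range  `json:"ranges"`
	Versions []string `json:"versions"`
}

// Record is one line of the output file.
type Record struct {
	CVE              string     `json:"cve"`
	AffectedPackages []Affected `json:"affected_packages"`
}

// decodeRecord parses a single JSON line into a Record.
func decodeRecord(line []byte) (Record, error) {
	var rec Record
	err := json.Unmarshal(line, &rec)
	return rec, err
}

// fixedVersions collects the "fixed" value of every ECOSYSTEM range,
// skipping GIT ranges, whose "fixed" values are commit hashes.
func fixedVersions(rec Record) []string {
	var out []string
	for _, ap := range rec.AffectedPackages {
		for _, r := range ap.Ranges {
			if r.Type != "ECOSYSTEM" {
				continue
			}
			for _, ev := range r.Events {
				if ev.Fixed != "" {
					out = append(out, ev.Fixed)
				}
			}
		}
	}
	return out
}

// sampleLine is abbreviated from the listing: the "versions" array is shortened.
const sampleLine = `{"cve":"CVE-2023-0024","affected_packages":[{"package":{"ecosystem":"PyPI","name":"nautobot","purl":"pkg:pypi/nautobot"},"ranges":[{"events":[{"introduced":"0"},{"fixed":"1.6.16"}],"type":"ECOSYSTEM"}],"versions":["1.6.14","1.6.15"]}]}`

func main() {
	rec, err := decodeRecord([]byte(sampleLine))
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(rec.CVE, fixedVersions(rec))
}
```

Reading the whole file would then be a matter of calling `decodeRecord` once per line, for example with a `bufio.Scanner` whose buffer is enlarged, since individual records in the listing are long.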