├── README.md
├── app
│   ├── __init__.py
│   ├── __pycache__
│   │   └── __init__.cpython-37.pyc
│   ├── log.txt
│   ├── models
│   │   ├── __pycache__
│   │   │   └── cve.cpython-37.pyc
│   │   └── cve.py
│   ├── mods
│   │   └── mod_main
│   │       ├── GHlib.py
│   │       ├── TWlib.py
│   │       ├── __pycache__
│   │       │   ├── GHlib.cpython-37.pyc
│   │       │   ├── TWlib.cpython-37.pyc
│   │       │   ├── databasePSQL.cpython-37.pyc
│   │       │   ├── textProcessing.cpython-37.pyc
│   │       │   ├── utils.cpython-37.pyc
│   │       │   └── views.cpython-37.pyc
│   │       ├── databasePSQL.py
│   │       ├── textProcessing.py
│   │       ├── utils.py
│   │       ├── utils.py.bak
│   │       └── views.py
│   ├── static
│   │   ├── css
│   │   │   ├── AdminLTE.min.css
│   │   │   ├── AdminLTE.min.css.save
│   │   │   ├── bootstrap.css
│   │   │   └── styles.css
│   │   └── js
│   │       ├── adminlte.js
│   │       ├── chart.js
│   │       └── main.js
│   └── templates
│       ├── index.html
│       ├── index.html.bak
│       ├── index.html.old
│       └── layout.html
├── config.ini
├── cve_details.py
├── images
│   ├── chart_with_exploit.png
│   ├── confusion_matrix_nlp_tw2.png
│   ├── main.png
│   ├── multinominalNB.png
│   ├── nlp_infopage_vs_exploitcode.png
│   ├── nlp_pipeline.png
│   └── twitter_exploit_vs_info.png
├── log.txt
├── models
│   ├── TW_exploit_detect_NLP_Modelv1.joblib
│   ├── stopwords
│   │   └── corpora
│   │       ├── stopwords.zip
│   │       └── stopwords
│   │           ├── README
│   │           ├── arabic
│   │           ├── azerbaijani
│   │           ├── danish
│   │           ├── dutch
│   │           ├── english
│   │           ├── finnish
│   │           ├── french
│   │           ├── german
│   │           ├── greek
│   │           ├── hungarian
│   │           ├── indonesian
│   │           ├── italian
│   │           ├── kazakh
│   │           ├── nepali
│   │           ├── norwegian
│   │           ├── portuguese
│   │           ├── romanian
│   │           ├── russian
│   │           ├── slovene
│   │           ├── spanish
│   │           ├── swedish
│   │           ├── tajik
│   │           └── turkish
│   └── wordnet
│       └── corpora
│           ├── wordnet.zip
│           └── wordnet
│               ├── LICENSE
│               ├── README
│               ├── adj.exc
│               ├── adv.exc
│               ├── citation.bib
│               ├── cntlist.rev
│               ├── data.adj
│               ├── data.adv
│               ├── data.noun
│               ├── data.verb
│               ├── index.adj
│               ├── index.adv
│               ├── index.noun
│               ├── index.sense
│               ├── index.verb
│               ├── lexnames
│               ├── noun.exc
│               └── verb.exc
├── notebooks
│   ├── CVE-S3arch.ipynb
│   ├── README.md
│   └── Timeseries.ipynb
├── requirements.txt
├── run.py
└── setup.py

/README.md:
--------------------------------------------------------------------------------
# CVE-Search

[Twitter](https://twitter.com/alexfrancow) [LinkedIn](https://www.linkedin.com/in/alexfrancow)

CVE-Search (name still in alpha) is a machine learning tool that detects exploits or proofs of concept shared on platforms such as Twitter and GitHub. It can also run related searches for CVEs on Google, Yandex, and DuckDuckGo and detect whether the content may be a functional exploit, a proof of concept, or simply information about the vulnerability.


I've used MultinomialNB instead of logistic regression.

Using the MultinomialNB algorithm, we obtained a precision score of 0.77. Here is the confusion matrix (1 = exploit, 0 = non-exploit).

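
To make the relationship between the confusion matrix and the precision score concrete, here is a minimal, standard-library-only sketch. The labels are invented for illustration and are not the project's data, so the resulting value is not the 0.77 reported above; the helper names `confusion_matrix` and `precision` are hypothetical and defined here, not taken from the repository.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred):
    """Return counts keyed by (actual, predicted) for binary labels 0/1."""
    counts = Counter(zip(y_true, y_pred))
    return {(a, p): counts[(a, p)] for a in (0, 1) for p in (0, 1)}


def precision(y_true, y_pred, positive=1):
    """Precision = TP / (TP + FP) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical labels (1 = exploit, 0 = non-exploit), invented for this example.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

cm = confusion_matrix(y_true, y_pred)
print(cm)                         # counts per (actual, predicted) pair
print(precision(y_true, y_pred))  # 3 true positives, 1 false positive -> 0.75
```

Precision only looks at the items predicted as exploits, so a high value means few non-exploit items are flagged by mistake; it says nothing about exploits that were missed.
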

NLP (Natural Language Processing) to detect exploits on Twitter based on text classification

### Natural Language Processing (NLP) to detect exploit code on web pages

Natural Language Processing, usually shortened to NLP, is a branch of artificial intelligence that deals with the interaction between computers and humans using natural language.

The ultimate objective of NLP is to read, decipher, understand, and make sense of human languages in a manner that is valuable. Most NLP techniques rely on machine learning to derive meaning from human languages.

With NLP we can classify web pages as either "Just Information Page" or "Exploit Page!".

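
As a rough illustration of this kind of text classification, the sketch below implements a tiny multinomial Naive Bayes classifier with add-one smoothing using only the standard library. It is not the project's trained model or pipeline: the class `TinyMultinomialNB`, the `tokenize` helper, and the four training snippets are all hypothetical and exist only to show the idea on the two page labels.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


class TinyMultinomialNB:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokenize(doc):
                if tok in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training snippets, invented for this example.
train = [
    ("import socket payload = shellcode buffer overflow exploit send", "Exploit Page!"),
    ("python poc script sends crafted payload to trigger remote code execution", "Exploit Page!"),
    ("the vendor released an advisory and a patch for the vulnerability", "Just Information Page"),
    ("security bulletin describes affected versions and mitigation steps", "Just Information Page"),
]
model = TinyMultinomialNB().fit([t for t, _ in train], [l for _, l in train])

print(model.predict("poc exploit with shellcode payload"))
print(model.predict("advisory lists affected versions and the patch"))
```

With these toy snippets the first query shares vocabulary only with the exploit examples and the second only with the informational ones, so they are labeled "Exploit Page!" and "Just Information Page" respectively. A real classifier would need far more data plus preprocessing such as stop-word removal and lemmatization.
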