├── Procfile
├── vectorizer.pkl
├── requirements.txt
├── Screenshot (58).png
├── Screenshot (59).png
├── finalized_model.pkl
├── app.py
├── README.md
├── templates
    ├── prediction.html
    └── index.html
├── static
    └── image.svg
├── .ipynb_checkpoints
    └── Fake News Detection-checkpoint.ipynb
└── Fake News Detection.ipynb


/Procfile:
--------------------------------------------------------------------------------
web: gunicorn app:app

--------------------------------------------------------------------------------
/vectorizer.pkl:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/codejay411/Fake-News-Detection-App/HEAD/vectorizer.pkl

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/codejay411/Fake-News-Detection-App/HEAD/requirements.txt

--------------------------------------------------------------------------------
/Screenshot (58).png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/codejay411/Fake-News-Detection-App/HEAD/Screenshot (58).png

--------------------------------------------------------------------------------
/Screenshot (59).png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/codejay411/Fake-News-Detection-App/HEAD/Screenshot (59).png

--------------------------------------------------------------------------------
/finalized_model.pkl:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/codejay411/Fake-News-Detection-App/HEAD/finalized_model.pkl

--------------------------------------------------------------------------------
/app.py:
--------------------------------------------------------------------------------
# Implements the fake news detection concept as a machine learning model served with Flask

import pickle

# The original also imported `escape`, which was never used; it is dropped here.
from flask import Flask, request, render_template

# Load the fitted vectorizer and classifier once at startup, closing the files afterwards.
with open("vectorizer.pkl", 'rb') as f:
    vector = pickle.load(f)
with open("finalized_model.pkl", 'rb') as f:
    model = pickle.load(f)

app = Flask(__name__)


@app.route('/')
def home():
    return render_template("index.html")


@app.route('/prediction', methods=['GET', 'POST'])
def prediction():
    if request.method == "POST":
        news = str(request.form['news'])
        print(news)  # debug output of the submitted text

        # Vectorize the single submitted text and take its predicted label.
        predict = model.predict(vector.transform([news]))[0]
        print(predict)  # debug output of the predicted label

        return render_template("prediction.html", prediction_text="News headline is -> {}".format(predict))

    else:
        return render_template("prediction.html")


if __name__ == '__main__':
    app.debug = True
    app.run()

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# About Fake News Detection Project

This project is open to open source contributions.

## Video link on YouTube
For a detailed explanation of this project, watch the video series. If you find the videos useful, please subscribe to my channel.\
[Fake News Detection Using Machine Learning](https://www.youtube.com/watch?v=CUkggjNNoWs&list=PLA0J2h1KIAR7xoDbI1usGLVRW6_6qiLuq&index=19)

## Overview
The topic of fake news detection on social media has recently attracted tremendous attention. The basic countermeasure of comparing websites against a list of labeled fake news sources is inflexible, and so a machine learning approach is desirable. Our project aims to use machine learning algorithms to detect fake news directly, based on the text content of news articles.
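
As a rough illustration of what classifying "based on the text content" can involve, the sketch below computes plain TF-IDF weights by hand. It is a toy under stated assumptions: the whitespace tokenizer, the unsmoothed `log(N / df)` formula, the `tokenize` and `tfidf` helper names, and the three example headlines are all made up for illustration. The notebook in this repository uses scikit-learn's `TfidfVectorizer`, which applies smoothing and normalization by default and therefore produces different numbers.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def tfidf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical headlines, not taken from the project's dataset.
docs = [
    "senate passes budget bill",
    "aliens endorse budget bill",
    "senate debates new bill",
]
weights = tfidf(docs)

# Terms shared by every document carry no weight; rarer terms carry more.
for term, weight in sorted(weights[1].items(), key=lambda kv: -kv[1]):
    print(f"{term:8s} {weight:.3f}")
```

A term that appears in every document, such as "bill" here, gets a weight of zero, while a term unique to one document gets the largest weight. A classifier trained on such vectors can then pick up on vocabulary that is distinctive of one class.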

## Problem Definition
Develop a machine learning program to identify when a news source may be producing fake news. We aim to use a corpus of labeled real and fake news articles to build a classifier that can judge new text based on patterns learned from the corpus. The model will focus on identifying fake news sources, based on multiple articles originating from a source. Once a source is labeled as a producer of fake news, we can predict with high confidence that any future articles from that source will also be fake news. Focusing on sources widens our article misclassification tolerance because we will have multiple data points coming from each source.

The intended application of the project is applying visibility weights in social media. Using weights produced by this model, social networks can make stories that are highly likely to be fake news less visible.

Planning:
1. Data Collection
2. Model Building
3. Backend work
4. Deployment

## Project link

[Fake News Detection Using Machine Learning](https://youtu.be/CUkggjNNoWs)

All parts are available in the playlist on the [codejay](https://www.youtube.com/channel/UCZnkti7aeEmQ7CzumqEEsLg) channel.

## Installation

Use the package manager [pip](https://pip.pypa.io/en/stable/) to install `virtualenv`, then create and activate a virtual environment. The activation command below uses the Windows directory layout; on Linux or macOS the equivalent is `source env_name/bin/activate`.

```bash
pip install virtualenv
```
```bash
virtualenv env_name
```
```bash
env_name/scripts/activate
```
## Start Project

Follow these commands to start your project.

```bash
pip install -r requirements.txt
```
```bash
python app.py
```
## Home page

![Test Image 1](https://github.com/codejay411/Fake_News_detection/blob/main/Screenshot%20(58).png)

## Prediction page

![Test Image 1](https://github.com/codejay411/Fake_News_detection/blob/main/Screenshot%20(59).png)

--------------------------------------------------------------------------------
/templates/prediction.html:
--------------------------------------------------------------------------------
Fake News Detection
Fake News Detection

Machine learning project

Fake News Detection

Whatever cardigan tote bag tumblr hexagon brooklyn asymmetrical gentrify, subway tile poke farm-to-table. Franzen you probably haven't heard of them man bun deep jianbing selfies heirloom prism food truck ugh squid celiac humblebrag.

{{prediction_text}}

We'll never share your email with anyone else.
53 | 54 | 55 | 56 | 57 | 58 | 59 | 60 | 61 | 65 | 66 | -------------------------------------------------------------------------------- /static/image.svg: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /.ipynb_checkpoints/Fake News Detection-checkpoint.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "4da05d9a", 6 | "metadata": {}, 7 | "source": [ 8 | "## Fake News Detection" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "fd31b0c2", 14 | "metadata": {}, 15 | "source": [ 16 | "#### Import library" 17 | ] 18 | }, 19 | { 20 | "cell_type": "code", 21 | "execution_count": 1, 22 | "id": "c8c65619", 23 | "metadata": {}, 24 | "outputs": [], 25 | "source": [ 26 | "import pandas as pd\n", 27 | "import numpy as np\n", 28 | "import itertools" 29 | ] 30 | }, 31 | { 32 | "cell_type": "code", 33 | "execution_count": 2, 34 | "id": "c2080b11", 35 | "metadata": {}, 36 | "outputs": [], 37 | "source": [ 38 | "# !pip install pandas # download and install pandas library" 39 | ] 40 | }, 41 | { 42 | "cell_type": "code", 43 | "execution_count": 3, 44 | "id": "bba61f32", 45 | "metadata": {}, 46 | "outputs": [], 47 | "source": [ 48 | "df = pd.read_csv(\"news.csv\")" 49 | ] 50 | }, 51 | { 52 | "cell_type": "code", 53 | "execution_count": 4, 54 | "id": "eb0ccf03", 55 | "metadata": {}, 56 | "outputs": [ 57 | { 58 | "data": { 59 | "text/html": [ 60 | "
\n", 61 | "\n", 74 | "\n", 75 | " \n", 76 | " \n", 77 | " \n", 78 | " \n", 79 | " \n", 80 | " \n", 81 | " \n", 82 | " \n", 83 | " \n", 84 | " \n", 85 | " \n", 86 | " \n", 87 | " \n", 88 | " \n", 89 | " \n", 90 | " \n", 91 | " \n", 92 | " \n", 93 | " \n", 94 | " \n", 95 | " \n", 96 | " \n", 97 | " \n", 98 | " \n", 99 | " \n", 100 | " \n", 101 | " \n", 102 | " \n", 103 | " \n", 104 | " \n", 105 | " \n", 106 | " \n", 107 | " \n", 108 | " \n", 109 | " \n", 110 | " \n", 111 | " \n", 112 | " \n", 113 | " \n", 114 | " \n", 115 | " \n", 116 | " \n", 117 | " \n", 118 | " \n", 119 | " \n", 120 | " \n", 121 | "
|   | Unnamed: 0 | title | text | label |
|---|---|---|---|---|
| 0 | 8476 | You Can Smell Hillary’s Fear | Daniel Greenfield, a Shillman Journalism Fello... | FAKE |
| 1 | 10294 | Watch The Exact Moment Paul Ryan Committed Pol... | Google Pinterest Digg Linkedin Reddit Stumbleu... | FAKE |
| 2 | 3608 | Kerry to go to Paris in gesture of sympathy | U.S. Secretary of State John F. Kerry said Mon... | REAL |
| 3 | 10142 | Bernie supporters on Twitter erupt in anger ag... | — Kaydee King (@KaydeeKing) November 9, 2016 T... | FAKE |
| 4 | 875 | The Battle of New York: Why This Primary Matters | It's primary day in New York and front-runners... | REAL |
\n", 122 | "
" 123 | ], 124 | "text/plain": [ 125 | " Unnamed: 0 title \\\n", 126 | "0 8476 You Can Smell Hillary’s Fear \n", 127 | "1 10294 Watch The Exact Moment Paul Ryan Committed Pol... \n", 128 | "2 3608 Kerry to go to Paris in gesture of sympathy \n", 129 | "3 10142 Bernie supporters on Twitter erupt in anger ag... \n", 130 | "4 875 The Battle of New York: Why This Primary Matters \n", 131 | "\n", 132 | " text label \n", 133 | "0 Daniel Greenfield, a Shillman Journalism Fello... FAKE \n", 134 | "1 Google Pinterest Digg Linkedin Reddit Stumbleu... FAKE \n", 135 | "2 U.S. Secretary of State John F. Kerry said Mon... REAL \n", 136 | "3 — Kaydee King (@KaydeeKing) November 9, 2016 T... FAKE \n", 137 | "4 It's primary day in New York and front-runners... REAL " 138 | ] 139 | }, 140 | "execution_count": 4, 141 | "metadata": {}, 142 | "output_type": "execute_result" 143 | } 144 | ], 145 | "source": [ 146 | "df.head()" 147 | ] 148 | }, 149 | { 150 | "cell_type": "code", 151 | "execution_count": 5, 152 | "id": "5aab448c", 153 | "metadata": {}, 154 | "outputs": [ 155 | { 156 | "data": { 157 | "text/plain": [ 158 | "(6335, 4)" 159 | ] 160 | }, 161 | "execution_count": 5, 162 | "metadata": {}, 163 | "output_type": "execute_result" 164 | } 165 | ], 166 | "source": [ 167 | "df.shape" 168 | ] 169 | }, 170 | { 171 | "cell_type": "code", 172 | "execution_count": 6, 173 | "id": "9921c042", 174 | "metadata": {}, 175 | "outputs": [ 176 | { 177 | "data": { 178 | "text/plain": [ 179 | "Unnamed: 0 0\n", 180 | "title 0\n", 181 | "text 0\n", 182 | "label 0\n", 183 | "dtype: int64" 184 | ] 185 | }, 186 | "execution_count": 6, 187 | "metadata": {}, 188 | "output_type": "execute_result" 189 | } 190 | ], 191 | "source": [ 192 | "df.isnull().sum()" 193 | ] 194 | }, 195 | { 196 | "cell_type": "code", 197 | "execution_count": 7, 198 | "id": "a8107bfc", 199 | "metadata": {}, 200 | "outputs": [], 201 | "source": [ 202 | "labels = df.label" 203 | ] 204 | }, 205 | { 206 | "cell_type": "code", 207 | 
"execution_count": 8, 208 | "id": "d3f6bc52", 209 | "metadata": {}, 210 | "outputs": [ 211 | { 212 | "data": { 213 | "text/plain": [ 214 | "0 FAKE\n", 215 | "1 FAKE\n", 216 | "2 REAL\n", 217 | "3 FAKE\n", 218 | "4 REAL\n", 219 | "Name: label, dtype: object" 220 | ] 221 | }, 222 | "execution_count": 8, 223 | "metadata": {}, 224 | "output_type": "execute_result" 225 | } 226 | ], 227 | "source": [ 228 | "labels.head()" 229 | ] 230 | }, 231 | { 232 | "cell_type": "code", 233 | "execution_count": 9, 234 | "id": "e30acb39", 235 | "metadata": {}, 236 | "outputs": [], 237 | "source": [ 238 | "from sklearn.model_selection import train_test_split" 239 | ] 240 | }, 241 | { 242 | "cell_type": "code", 243 | "execution_count": 10, 244 | "id": "8ae64cea", 245 | "metadata": {}, 246 | "outputs": [], 247 | "source": [ 248 | "x_train, x_test, y_train, y_test = train_test_split(df[\"text\"], labels, test_size = 0.2, random_state = 20)" 249 | ] 250 | }, 251 | { 252 | "cell_type": "code", 253 | "execution_count": 11, 254 | "id": "d7fee436", 255 | "metadata": {}, 256 | "outputs": [ 257 | { 258 | "data": { 259 | "text/plain": [ 260 | "4741 NAIROBI, Kenya — President Obama spoke out Sun...\n", 261 | "2089 Killing Obama administration rules, dismantlin...\n", 262 | "4074 Dean Obeidallah, a former attorney, is the hos...\n", 263 | "5376 WashingtonsBlog \\nCNN’s Jake Tapper hit the ...\n", 264 | "6028 Some of the biggest issues facing America this...\n", 265 | "Name: text, dtype: object" 266 | ] 267 | }, 268 | "execution_count": 11, 269 | "metadata": {}, 270 | "output_type": "execute_result" 271 | } 272 | ], 273 | "source": [ 274 | "x_train.head()" 275 | ] 276 | }, 277 | { 278 | "cell_type": "code", 279 | "execution_count": 12, 280 | "id": "c731d247", 281 | "metadata": {}, 282 | "outputs": [], 283 | "source": [ 284 | "from sklearn.feature_extraction.text import TfidfVectorizer\n", 285 | "from sklearn.linear_model import PassiveAggressiveClassifier" 286 | ] 287 | }, 288 | { 289 | "cell_type": 
"code", 290 | "execution_count": 13, 291 | "id": "9ecc3123", 292 | "metadata": {}, 293 | "outputs": [], 294 | "source": [ 295 | "# initialise a TfidfVectorizer\n", 296 | "vector = TfidfVectorizer(stop_words='english', max_df=0.7)" 297 | ] 298 | }, 299 | { 300 | "cell_type": "code", 301 | "execution_count": 14, 302 | "id": "ab7def54", 303 | "metadata": {}, 304 | "outputs": [], 305 | "source": [ 306 | "# fit and transform\n", 307 | "tf_train = vector.fit_transform(x_train)\n", 308 | "tf_test = vector.transform(x_test)" 309 | ] 310 | }, 311 | { 312 | "cell_type": "code", 313 | "execution_count": 15, 314 | "id": "d9e371a8", 315 | "metadata": {}, 316 | "outputs": [ 317 | { 318 | "data": { 319 | "text/plain": [ 320 | "PassiveAggressiveClassifier(max_iter=50)" 321 | ] 322 | }, 323 | "execution_count": 15, 324 | "metadata": {}, 325 | "output_type": "execute_result" 326 | } 327 | ], 328 | "source": [ 329 | "# initialise a PassiveAggressiveClassifier\n", 330 | "pac = PassiveAggressiveClassifier(max_iter=50)\n", 331 | "pac.fit(tf_train, y_train)" 332 | ] 333 | }, 334 | { 335 | "cell_type": "code", 336 | "execution_count": 16, 337 | "id": "313da13d", 338 | "metadata": {}, 339 | "outputs": [], 340 | "source": [ 341 | "# predict on the test dataset\n", 342 | "from sklearn.metrics import accuracy_score, confusion_matrix\n", 343 | "y_pred = pac.predict(tf_test)" 344 | ] 345 | }, 346 | { 347 | "cell_type": "code", 348 | "execution_count": 17, 349 | "id": "1aaecf16", 350 | "metadata": {}, 351 | "outputs": [], 352 | "source": [ 353 | "score = accuracy_score(y_test, y_pred)" 354 | ] 355 | }, 356 | { 357 | "cell_type": "code", 358 | "execution_count": 18, 359 | "id": "59496c09", 360 | "metadata": {}, 361 | "outputs": [ 362 | { 363 | "name": "stdout", 364 | "output_type": "stream", 365 | "text": [ 366 | "Accuracy : 94.87%\n" 367 | ] 368 | } 369 | ], 370 | "source": [ 371 | "print(f\"Accuracy : {round(score*100,2)}%\")" 372 | ] 373 | }, 374 | { 375 | "cell_type": "code", 376 |
"execution_count": 19, 377 | "id": "effdf53c", 378 | "metadata": {}, 379 | "outputs": [ 380 | { 381 | "data": { 382 | "text/plain": [ 383 | "array([[622, 26],\n", 384 | " [ 39, 580]], dtype=int64)" 385 | ] 386 | }, 387 | "execution_count": 19, 388 | "metadata": {}, 389 | "output_type": "execute_result" 390 | } 391 | ], 392 | "source": [ 393 | "# confusion metrics\n", 394 | "confusion_matrix(y_test, y_pred, labels=['FAKE', 'REAL'])" 395 | ] 396 | }, 397 | { 398 | "cell_type": "code", 399 | "execution_count": 20, 400 | "id": "d113323f", 401 | "metadata": {}, 402 | "outputs": [], 403 | "source": [ 404 | "# save model\n", 405 | "import pickle\n", 406 | "filename = 'finalized_model.pkl'\n", 407 | "pickle.dump(pac, open(filename, 'wb'))" 408 | ] 409 | }, 410 | { 411 | "cell_type": "code", 412 | "execution_count": null, 413 | "id": "25f6c11e", 414 | "metadata": {}, 415 | "outputs": [], 416 | "source": [] 417 | } 418 | ], 419 | "metadata": { 420 | "kernelspec": { 421 | "display_name": "Python 3 (ipykernel)", 422 | "language": "python", 423 | "name": "python3" 424 | }, 425 | "language_info": { 426 | "codemirror_mode": { 427 | "name": "ipython", 428 | "version": 3 429 | }, 430 | "file_extension": ".py", 431 | "mimetype": "text/x-python", 432 | "name": "python", 433 | "nbconvert_exporter": "python", 434 | "pygments_lexer": "ipython3", 435 | "version": "3.8.9" 436 | } 437 | }, 438 | "nbformat": 4, 439 | "nbformat_minor": 5 440 | } 441 | -------------------------------------------------------------------------------- /Fake News Detection.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "4da05d9a", 6 | "metadata": {}, 7 | "source": [ 8 | "## Fake News Detection" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "fd31b0c2", 14 | "metadata": {}, 15 | "source": [ 16 | "#### Import library" 17 | ] 18 | }, 19 | { 20 | "cell_type": "code", 21 | "execution_count": 1, 22 
| "id": "c8c65619", 23 | "metadata": {}, 24 | "outputs": [], 25 | "source": [ 26 | "import pandas as pd\n", 27 | "import numpy as np\n", 28 | "import itertools" 29 | ] 30 | }, 31 | { 32 | "cell_type": "code", 33 | "execution_count": 2, 34 | "id": "c2080b11", 35 | "metadata": {}, 36 | "outputs": [], 37 | "source": [ 38 | "# !pip install pandas # download and install pandas library" 39 | ] 40 | }, 41 | { 42 | "cell_type": "code", 43 | "execution_count": 3, 44 | "id": "bba61f32", 45 | "metadata": {}, 46 | "outputs": [], 47 | "source": [ 48 | "df = pd.read_csv(\"news.csv\")" 49 | ] 50 | }, 51 | { 52 | "cell_type": "code", 53 | "execution_count": 4, 54 | "id": "eb0ccf03", 55 | "metadata": {}, 56 | "outputs": [ 57 | { 58 | "data": { 59 | "text/html": [ 60 | "
\n", 61 | "\n", 74 | "\n", 75 | " \n", 76 | " \n", 77 | " \n", 78 | " \n", 79 | " \n", 80 | " \n", 81 | " \n", 82 | " \n", 83 | " \n", 84 | " \n", 85 | " \n", 86 | " \n", 87 | " \n", 88 | " \n", 89 | " \n", 90 | " \n", 91 | " \n", 92 | " \n", 93 | " \n", 94 | " \n", 95 | " \n", 96 | " \n", 97 | " \n", 98 | " \n", 99 | " \n", 100 | " \n", 101 | " \n", 102 | " \n", 103 | " \n", 104 | " \n", 105 | " \n", 106 | " \n", 107 | " \n", 108 | " \n", 109 | " \n", 110 | " \n", 111 | " \n", 112 | " \n", 113 | " \n", 114 | " \n", 115 | " \n", 116 | " \n", 117 | " \n", 118 | " \n", 119 | " \n", 120 | " \n", 121 | "
|   | Unnamed: 0 | title | text | label |
|---|---|---|---|---|
| 0 | 8476 | You Can Smell Hillary’s Fear | Daniel Greenfield, a Shillman Journalism Fello... | FAKE |
| 1 | 10294 | Watch The Exact Moment Paul Ryan Committed Pol... | Google Pinterest Digg Linkedin Reddit Stumbleu... | FAKE |
| 2 | 3608 | Kerry to go to Paris in gesture of sympathy | U.S. Secretary of State John F. Kerry said Mon... | REAL |
| 3 | 10142 | Bernie supporters on Twitter erupt in anger ag... | — Kaydee King (@KaydeeKing) November 9, 2016 T... | FAKE |
| 4 | 875 | The Battle of New York: Why This Primary Matters | It's primary day in New York and front-runners... | REAL |
\n", 122 | "
" 123 | ], 124 | "text/plain": [ 125 | " Unnamed: 0 title \\\n", 126 | "0 8476 You Can Smell Hillary’s Fear \n", 127 | "1 10294 Watch The Exact Moment Paul Ryan Committed Pol... \n", 128 | "2 3608 Kerry to go to Paris in gesture of sympathy \n", 129 | "3 10142 Bernie supporters on Twitter erupt in anger ag... \n", 130 | "4 875 The Battle of New York: Why This Primary Matters \n", 131 | "\n", 132 | " text label \n", 133 | "0 Daniel Greenfield, a Shillman Journalism Fello... FAKE \n", 134 | "1 Google Pinterest Digg Linkedin Reddit Stumbleu... FAKE \n", 135 | "2 U.S. Secretary of State John F. Kerry said Mon... REAL \n", 136 | "3 — Kaydee King (@KaydeeKing) November 9, 2016 T... FAKE \n", 137 | "4 It's primary day in New York and front-runners... REAL " 138 | ] 139 | }, 140 | "execution_count": 4, 141 | "metadata": {}, 142 | "output_type": "execute_result" 143 | } 144 | ], 145 | "source": [ 146 | "df.head()" 147 | ] 148 | }, 149 | { 150 | "cell_type": "code", 151 | "execution_count": 5, 152 | "id": "5aab448c", 153 | "metadata": {}, 154 | "outputs": [ 155 | { 156 | "data": { 157 | "text/plain": [ 158 | "(6335, 4)" 159 | ] 160 | }, 161 | "execution_count": 5, 162 | "metadata": {}, 163 | "output_type": "execute_result" 164 | } 165 | ], 166 | "source": [ 167 | "df.shape" 168 | ] 169 | }, 170 | { 171 | "cell_type": "code", 172 | "execution_count": 6, 173 | "id": "9921c042", 174 | "metadata": {}, 175 | "outputs": [ 176 | { 177 | "data": { 178 | "text/plain": [ 179 | "Unnamed: 0 0\n", 180 | "title 0\n", 181 | "text 0\n", 182 | "label 0\n", 183 | "dtype: int64" 184 | ] 185 | }, 186 | "execution_count": 6, 187 | "metadata": {}, 188 | "output_type": "execute_result" 189 | } 190 | ], 191 | "source": [ 192 | "df.isnull().sum()" 193 | ] 194 | }, 195 | { 196 | "cell_type": "code", 197 | "execution_count": 7, 198 | "id": "a8107bfc", 199 | "metadata": {}, 200 | "outputs": [], 201 | "source": [ 202 | "labels = df.label" 203 | ] 204 | }, 205 | { 206 | "cell_type": "code", 207 | 
"execution_count": 8, 208 | "id": "d3f6bc52", 209 | "metadata": {}, 210 | "outputs": [ 211 | { 212 | "data": { 213 | "text/plain": [ 214 | "0 FAKE\n", 215 | "1 FAKE\n", 216 | "2 REAL\n", 217 | "3 FAKE\n", 218 | "4 REAL\n", 219 | "Name: label, dtype: object" 220 | ] 221 | }, 222 | "execution_count": 8, 223 | "metadata": {}, 224 | "output_type": "execute_result" 225 | } 226 | ], 227 | "source": [ 228 | "labels.head()" 229 | ] 230 | }, 231 | { 232 | "cell_type": "code", 233 | "execution_count": 9, 234 | "id": "e30acb39", 235 | "metadata": {}, 236 | "outputs": [], 237 | "source": [ 238 | "from sklearn.model_selection import train_test_split" 239 | ] 240 | }, 241 | { 242 | "cell_type": "code", 243 | "execution_count": 10, 244 | "id": "8ae64cea", 245 | "metadata": {}, 246 | "outputs": [], 247 | "source": [ 248 | "x_train, x_test, y_train, y_test = train_test_split(df[\"text\"], labels, test_size = 0.2, random_state = 20)" 249 | ] 250 | }, 251 | { 252 | "cell_type": "code", 253 | "execution_count": 11, 254 | "id": "d7fee436", 255 | "metadata": {}, 256 | "outputs": [ 257 | { 258 | "data": { 259 | "text/plain": [ 260 | "4741 NAIROBI, Kenya — President Obama spoke out Sun...\n", 261 | "2089 Killing Obama administration rules, dismantlin...\n", 262 | "4074 Dean Obeidallah, a former attorney, is the hos...\n", 263 | "5376 WashingtonsBlog \\nCNN’s Jake Tapper hit the ...\n", 264 | "6028 Some of the biggest issues facing America this...\n", 265 | "Name: text, dtype: object" 266 | ] 267 | }, 268 | "execution_count": 11, 269 | "metadata": {}, 270 | "output_type": "execute_result" 271 | } 272 | ], 273 | "source": [ 274 | "x_train.head()" 275 | ] 276 | }, 277 | { 278 | "cell_type": "code", 279 | "execution_count": 12, 280 | "id": "c731d247", 281 | "metadata": {}, 282 | "outputs": [], 283 | "source": [ 284 | "from sklearn.feature_extraction.text import TfidfVectorizer\n", 285 | "from sklearn.linear_model import PassiveAggressiveClassifier" 286 | ] 287 | }, 288 | { 289 | "cell_type": 
"code", 290 | "execution_count": 13, 291 | "id": "9ecc3123", 292 | "metadata": {}, 293 | "outputs": [], 294 | "source": [ 295 | "# initialise a TfidfVectorizer\n", 296 | "vector = TfidfVectorizer(stop_words='english', max_df=0.7)" 297 | ] 298 | }, 299 | { 300 | "cell_type": "code", 301 | "execution_count": 14, 302 | "id": "ab7def54", 303 | "metadata": {}, 304 | "outputs": [], 305 | "source": [ 306 | "# fit and transform\n", 307 | "tf_train = vector.fit_transform(x_train)\n", 308 | "tf_test = vector.transform(x_test)" 309 | ] 310 | }, 311 | { 312 | "cell_type": "code", 313 | "execution_count": 15, 314 | "id": "d9e371a8", 315 | "metadata": {}, 316 | "outputs": [ 317 | { 318 | "data": { 319 | "text/plain": [ 320 | "PassiveAggressiveClassifier(max_iter=50)" 321 | ] 322 | }, 323 | "execution_count": 15, 324 | "metadata": {}, 325 | "output_type": "execute_result" 326 | } 327 | ], 328 | "source": [ 329 | "# initialise a PassiveAggressiveClassifier\n", 330 | "pac = PassiveAggressiveClassifier(max_iter=50)\n", 331 | "pac.fit(tf_train, y_train)" 332 | ] 333 | }, 334 | { 335 | "cell_type": "code", 336 | "execution_count": 16, 337 | "id": "313da13d", 338 | "metadata": {}, 339 | "outputs": [], 340 | "source": [ 341 | "# predict on the test dataset\n", 342 | "from sklearn.metrics import accuracy_score, confusion_matrix\n", 343 | "y_pred = pac.predict(tf_test)" 344 | ] 345 | }, 346 | { 347 | "cell_type": "code", 348 | "execution_count": 17, 349 | "id": "1aaecf16", 350 | "metadata": {}, 351 | "outputs": [], 352 | "source": [ 353 | "score = accuracy_score(y_test, y_pred)" 354 | ] 355 | }, 356 | { 357 | "cell_type": "code", 358 | "execution_count": 18, 359 | "id": "59496c09", 360 | "metadata": {}, 361 | "outputs": [ 362 | { 363 | "name": "stdout", 364 | "output_type": "stream", 365 | "text": [ 366 | "Accuracy : 94.63%\n" 367 | ] 368 | } 369 | ], 370 | "source": [ 371 | "print(f\"Accuracy : {round(score*100,2)}%\")" 372 | ] 373 | }, 374 | { 375 | "cell_type": "code", 376 |
"execution_count": 19, 377 | "id": "effdf53c", 378 | "metadata": {}, 379 | "outputs": [ 380 | { 381 | "data": { 382 | "text/plain": [ 383 | "array([[621, 27],\n", 384 | " [ 41, 578]], dtype=int64)" 385 | ] 386 | }, 387 | "execution_count": 19, 388 | "metadata": {}, 389 | "output_type": "execute_result" 390 | } 391 | ], 392 | "source": [ 393 | "# confusion metrics\n", 394 | "confusion_matrix(y_test, y_pred, labels=['FAKE', 'REAL'])" 395 | ] 396 | }, 397 | { 398 | "cell_type": "code", 399 | "execution_count": 20, 400 | "id": "d113323f", 401 | "metadata": {}, 402 | "outputs": [], 403 | "source": [ 404 | "# save model\n", 405 | "import pickle\n", 406 | "filename = 'finalized_model.pkl'\n", 407 | "pickle.dump(pac, open(filename, 'wb'))" 408 | ] 409 | }, 410 | { 411 | "cell_type": "code", 412 | "execution_count": 26, 413 | "id": "25f6c11e", 414 | "metadata": {}, 415 | "outputs": [], 416 | "source": [ 417 | "# save vectorizer\n", 418 | "filename = 'vectorizer.pkl'\n", 419 | "pickle.dump(vector, open(filename, 'wb'))" 420 | ] 421 | }, 422 | { 423 | "cell_type": "code", 424 | "execution_count": null, 425 | "id": "48321b9c", 426 | "metadata": {}, 427 | "outputs": [], 428 | "source": [] 429 | } 430 | ], 431 | "metadata": { 432 | "kernelspec": { 433 | "display_name": "Python 3 (ipykernel)", 434 | "language": "python", 435 | "name": "python3" 436 | }, 437 | "language_info": { 438 | "codemirror_mode": { 439 | "name": "ipython", 440 | "version": 3 441 | }, 442 | "file_extension": ".py", 443 | "mimetype": "text/x-python", 444 | "name": "python", 445 | "nbconvert_exporter": "python", 446 | "pygments_lexer": "ipython3", 447 | "version": "3.8.9" 448 | } 449 | }, 450 | "nbformat": 4, 451 | "nbformat_minor": 5 452 | } 453 | -------------------------------------------------------------------------------- /templates/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | Fake News Detection 13 | 14 | 
Fake News Detection

Fake News Detection
Machine learning project

Copper mug try-hard pitchfork pour-over freegan heirloom neutra air plant cold-pressed tacos poke beard tote bag. Heirloom echo park mlkshk tote bag selvage hot chicken authentic tumeric truffaut hexagon try-hard chambray.

hero

Testimonials

Synth chartreuse iPhone lomo cray raw denim brunch everyday carry neutra before they sold out fixie 90's microdosing. Tacos pinterest fanny pack venmo, post-ironic heirloom try-hard pabst authentic iceland.

testimonial
Holden Caulfield
UI DEVELOPER

Synth chartreuse iPhone lomo cray raw denim brunch everyday carry neutra before they sold out fixie 90's microdosing. Tacos pinterest fanny pack venmo, post-ironic heirloom try-hard pabst authentic iceland.

testimonial
Alper Kamu
DESIGNER

Raw Denim Heirloom Man Braid
Selfies Wayfarers

Shooting Stars

Blue bottle crucifix vinyl post-ironic four dollar toast vegan taxidermy. Gastropub indxgo juice poutine, ramps microdosing banh mi pug VHS try-hard ugh iceland kickstarter tumblr live-edge tilde.

Learn More

The Catalyzer

Blue bottle crucifix vinyl post-ironic four dollar toast vegan taxidermy. Gastropub indxgo juice poutine, ramps microdosing banh mi pug VHS try-hard ugh iceland kickstarter tumblr live-edge tilde.

Learn More

Neptune

Blue bottle crucifix vinyl post-ironic four dollar toast vegan taxidermy. Gastropub indxgo juice poutine, ramps microdosing banh mi pug VHS try-hard ugh iceland kickstarter tumblr live-edge tilde.

Learn More

--------------------------------------------------------------------------------