├── .gitignore ├── Instance based VS Model based learning.ipynb └── README.md /.gitignore: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/yassermessahli/Instance-based-vs-Model-based-Learning/849b9f4cd02a7d5ea62f4c86069b3082411b5cb9/.gitignore -------------------------------------------------------------------------------- /Instance based VS Model based learning.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "9973ce1d", 6 | "metadata": {}, 7 | "source": [ 8 | "# Instance-based vs Model-based Learning\n", 9 | "### 1. Introduction\n", 10 | "Welcome🧠! In this lecture we will discuss two types of machine learning: **instance-based** and **model-based** learning. We will cover both **Regression** and **Classification** with each type of learning, giving the definition, the principle, and a practical example of each. I hope you enjoy the lecture. If you find this lecture valuable, please consider sharing it with your mates so they can also benefit. Let's begin!💡" 11 | ] 12 | }, 13 | { 14 | "cell_type": "markdown", 15 | "id": "a8c81be7", 16 | "metadata": {}, 17 | "source": [ 18 | "### 2. Definitions\n", 19 | "\n", 20 | "#### 2.1 Model-Based Learning\n", 21 | "Model-based learning involves creating a mathematical model that can predict outcomes based on input data. The model is trained on a large dataset and then used to make predictions on new data. The model can be thought of as a set of rules that the machine uses to make predictions.\n", 22 | "\n", 23 | "#### 2.2 Instance-Based Learning\n", 24 | "Instance-based learning involves using the entire dataset to make predictions. The machine learns by storing all instances of data and then using these instances to make predictions on new data. 
The machine compares the new data to the instances it has seen before and uses the closest match to make a prediction.\n", 25 | "\n", 26 | "**🔖Source:** [Model-Based vs Instance-Based Learning: Understanding the Differences with Examples - medium](https://medium.com/@pp1222001/model-based-vs-instance-based-learning-understanding-the-differences-with-examples-1545c9c3a056#:~:text=Model-based%20learning%20is%20typically%20faster%20and%20more%20accurate,is%20slower%20and%20can%20make%20less%20accurate%20predictions.)" 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "id": "376eb0d4", 32 | "metadata": {}, 33 | "source": [ 34 | "### 3. Regression\n", 35 | "#### 3.1 Instance-based regression\n", 36 | "- **Definition:** as discussed previously, in instance-based learning we use the training instances themselves for the prediction. The more similar our instance is to a given training instance, the closer its label value is to that instance's label value.\n", 37 | "**i.e.**: our_instance ~ given_instance, so our_label ~ given_label\n", 38 | "\n", 39 | "- **Principle:** choose the training instances most similar to the new instance and use their labels to predict the new label: the predicted value is the average of those labels.\n", 40 | "\n", 41 | "- **Example of implementation:** we will implement a regression decision function that predicts values directly from the training instances" 42 | ] 43 | }, 44 | { 45 | "cell_type": "markdown", 46 | "id": "9a6024ea", 47 | "metadata": {}, 48 | "source": [ 49 | "- Import dataset" 50 | ] 51 | }, 52 | { 53 | "cell_type": "code", 54 | "execution_count": 127, 55 | "id": "69469c6b", 56 | "metadata": {}, 57 | "outputs": [], 58 | "source": [ 59 | "import warnings\n", 60 | "from sklearn.datasets import make_regression\n", 61 | "warnings.filterwarnings('ignore')\n", 62 | "\n", 63 | "data = make_regression(n_features=1, noise=15, bias=20)" 64 | ] 65 | }, 66 | { 67 | "cell_type": "markdown", 68 | "id": "31d87ca4", 69 | "metadata": {}, 70 | 
"source": [ 71 | "- Visualisation of the dataset" 72 | ] 73 | }, 74 | { 75 | "cell_type": "code", 76 | "execution_count": 128, 77 | "id": "b4beec95", 78 | "metadata": {}, 79 | "outputs": [ 80 | { 81 | "data": { 82 | "image/png": "\n", 83 | "text/plain": [ 84 | "
" 85 | ] 86 | }, 87 | "metadata": { 88 | "needs_background": "light" 89 | }, 90 | "output_type": "display_data" 91 | } 92 | ], 93 | "source": [ 94 | "import matplotlib.pyplot as plt\n", 95 | "%matplotlib inline\n", 96 | "\n", 97 | "fig = plt.figure(figsize=(12, 6))\n", 98 | "ax1 = fig.add_subplot(111)\n", 99 | "ax1.scatter(x=data[0], y=data[1], s=10, c='red', label='training instances')\n", 100 | "\n", 101 | "plt.xlabel('X')\n", 102 | "plt.ylabel('y')\n", 103 | "plt.legend(loc='best')\n", 104 | "plt.title('data visualisation')\n", 105 | "fig.show()" 106 | ] 107 | }, 108 | { 109 | "cell_type": "markdown", 110 | "id": "18e93eb0", 111 | "metadata": {}, 112 | "source": [ 113 | "- We have to choose the training instances most similar to our new instance in order to predict its label (these instances are called predictors). For that we need a similarity measure; we will use the simple euclidean_distance implemented below (a smaller distance means a more similar instance)" 114 | ] 115 | }, 116 | { 117 | "cell_type": "code", 118 | "execution_count": 129, 119 | "id": "c4bb0b8d", 120 | "metadata": {}, 121 | "outputs": [], 122 | "source": [ 123 | "import numpy as np\n", 124 | "def euclidean_distance(x, y):\n", 125 | " return np.sqrt(np.sum((np.array(x) - np.array(y)) ** 2))" 126 | ] 127 | }, 128 | { 129 | "cell_type": "markdown", 130 | "id": "c59bd535", 131 | "metadata": {}, 132 | "source": [ 133 | "- instance-based decision function implementation.\n", 134 | "\n", 135 | "I call the parameter alpha \"the generalisation rate\": it defines how the function generalises the data to predict the new value. It is the number of predictors and defaults to 2" 136 | ] 137 | }, 138 | { 139 | "cell_type": "code", 140 | "execution_count": 130, 141 | "id": "73810893", 142 | "metadata": {}, 143 | "outputs": [], 144 | "source": [ 145 | "import numpy as np\n", 146 | "\n", 147 | "def decision_function(X, y, new_x, alpha=2):\n", 148 | "# distances between each training instance and the new instance (smaller = more similar)\n", 149 | " similarities = 
[euclidean_distance([x], [new_x]) for x in X]\n", 150 | " \n", 151 | "# predictors are the indices of the instances most similar to our instance\n", 152 | " predictors = np.argsort(similarities)[:alpha]\n", 153 | " \n", 154 | " prediction = np.mean(y[predictors])\n", 155 | " return prediction, predictors" 156 | ] 157 | }, 158 | { 159 | "cell_type": "markdown", 160 | "id": "ed03f0cc", 161 | "metadata": {}, 162 | "source": [ 163 | "- new instance prediction" 164 | ] 165 | }, 166 | { 167 | "cell_type": "code", 168 | "execution_count": 131, 169 | "id": "6fb67c13", 170 | "metadata": {}, 171 | "outputs": [ 172 | { 173 | "data": { 174 | "text/plain": [ 175 | "(0.787602032321169, 57.86976055058832)" 176 | ] 177 | }, 178 | "execution_count": 131, 179 | "metadata": {}, 180 | "output_type": "execute_result" 181 | } 182 | ], 183 | "source": [ 184 | "import random as rnd\n", 185 | "new_instance = (rnd.random() * 4) - 2 # -2 <= new_instance < 2\n", 186 | "y_pred, predictors = decision_function(data[0], data[1], new_instance, alpha=5) # number of predictors = 5\n", 187 | "(new_instance, y_pred)" 188 | ] 189 | }, 190 | { 191 | "cell_type": "markdown", 192 | "id": "58dae75c", 193 | "metadata": {}, 194 | "source": [ 195 | "- visualisation of the new instance and the predictors" 196 | ] 197 | }, 198 | { 199 | "cell_type": "code", 200 | "execution_count": 132, 201 | "id": "b8657efb", 202 | "metadata": {}, 203 | "outputs": [ 204 | { 205 | "data": { 206 | "image/png": "\n", 207 | "text/plain": [ 208 | "
" 209 | ] 210 | }, 211 | "execution_count": 132, 212 | "metadata": {}, 213 | "output_type": "execute_result" 214 | } 215 | ], 216 | "source": [ 217 | "ax1.scatter(data[0][predictors], data[1][predictors], marker='+', c='#BDB300', s=70, label='predictors')\n", 218 | "ax1.scatter(x=new_instance, y=y_pred, marker='x', c='blue', s=70, label='new instance')\n", 219 | "ax1.legend()\n", 220 | "fig" 221 | ] 222 | }, 223 | { 224 | "cell_type": "markdown", 225 | "id": "cb6ffe03", 226 | "metadata": {}, 227 | "source": [ 228 | "#### 3.2 Model-based regression\n" 229 | ] 230 | }, 231 | { 232 | "cell_type": "markdown", 233 | "id": "29cb0ea6", 234 | "metadata": {}, 235 | "source": [ 236 | "- **Definition**: As discussed previously, in model-based learning we create a model (a function or formula) that generalises the rules needed to predict, then use it to make predictions without needing the training instances at prediction time\n", 237 | "\n", 238 | "- **Principle**: choose a regression model and fit it on the training instances. 
Then predict new labels using the trained model only.\n", 239 | "\n", 240 | "- **Example of implementation**: A simple linear regression is an example of model-based regression" 241 | ] 242 | }, 243 | { 244 | "cell_type": "code", 245 | "execution_count": 133, 246 | "id": "b0ebcaf7", 247 | "metadata": {}, 248 | "outputs": [], 249 | "source": [ 250 | "from sklearn.linear_model import LinearRegression\n", 251 | "\n", 252 | "lr = LinearRegression()\n", 253 | "lr.fit(data[0], data[1])\n", 254 | "\n", 255 | "# model parameters (the model: y = w*X + b)\n", 256 | "w, b = lr.coef_, lr.intercept_\n", 257 | "\n", 258 | "y_pred = w * new_instance + b" 259 | ] 260 | }, 261 | { 262 | "cell_type": "code", 263 | "execution_count": 134, 264 | "id": "4694661b", 265 | "metadata": {}, 266 | "outputs": [ 267 | { 268 | "data": { 269 | "text/plain": [ 270 | "(0.787602032321169, array([65.68719033]))" 271 | ] 272 | }, 273 | "execution_count": 134, 274 | "metadata": {}, 275 | "output_type": "execute_result" 276 | } 277 | ], 278 | "source": [ 279 | "(new_instance, y_pred)" 280 | ] 281 | }, 282 | { 283 | "cell_type": "markdown", 284 | "id": "e38cdfd7", 285 | "metadata": {}, 286 | "source": [ 287 | "- visualisation of the model and the new instance" 288 | ] 289 | }, 290 | { 291 | "cell_type": "code", 292 | "execution_count": 135, 293 | "id": "7e3c0af0", 294 | "metadata": {}, 295 | "outputs": [ 296 | { 297 | "data": { 298 | "image/png": "\n", 299 | "text/plain": [ 300 | "
" 301 | ] 302 | }, 303 | "metadata": { 304 | "needs_background": "light" 305 | }, 306 | "output_type": "display_data" 307 | } 308 | ], 309 | "source": [ 310 | "x = np.linspace(-2.5, 2.5, num=100)\n", 311 | "y = w*x + b\n", 312 | "\n", 313 | "\n", 314 | "fig2 = plt.figure(figsize=(12, 6))\n", 315 | "ax2 = fig2.add_subplot(111)\n", 316 | "\n", 317 | "ax2.scatter(x=data[0], y=data[1], s=10, c='red', label='training instances')\n", 318 | "ax2.scatter(x=new_instance, y=y_pred, marker='x', c='blue', s=70, label='new instance')\n", 319 | "ax2.plot(x, y, 'y--', label='LinearRegression model')\n", 320 | "\n", 321 | "plt.xlabel('X')\n", 322 | "plt.ylabel('y')\n", 323 | "plt.legend(loc='best')\n", 324 | "plt.title('visualisation')\n", 325 | "fig2.show()" 326 | ] 327 | }, 328 | { 329 | "cell_type": "markdown", 330 | "id": "926e0e4b", 331 | "metadata": {}, 332 | "source": [ 333 | "### 4. Classification\n", 334 | "#### 4.1 Instance-based Classification\n", 335 | "- **Definition:** we defined it previously (see section 3.1)\n", 336 | "\n", 337 | "- **Principle:** we choose a number **n** of the instances most similar to the new instance (n is given by the parameter alpha); the class of the new instance is then the majority class among those instances (the most frequent class)\n", 338 | "\n", 339 | "- **Example of implementation:** implement a classifier that uses the training instances directly, without a training step" 340 | ] 341 | }, 342 | { 343 | "cell_type": "markdown", 344 | "id": "4c30ec80", 345 | "metadata": {}, 346 | "source": [ 347 | "- Import dataset " 348 | ] 349 | }, 350 | { 351 | "cell_type": "code", 352 | "execution_count": 72, 353 | "id": "c8f6f2a8", 354 | "metadata": {}, 355 | "outputs": [], 356 | "source": [ 357 | "from sklearn.datasets import make_moons\n", 358 | "data = make_moons(noise=0.3)\n", 359 | "X = data[0]\n", 360 | "y = data[1]" 361 | ] 362 | }, 363 | { 364 | "cell_type": "markdown", 365 | "id": "bebb19a1", 366 | "metadata": {}, 367 | "source": [ 
368 | "- Visualisation of the dataset" 369 | ] 370 | }, 371 | { 372 | "cell_type": "code", 373 | "execution_count": 124, 374 | "id": "47ceca70", 375 | "metadata": {}, 376 | "outputs": [ 377 | { 378 | "data": { 379 | "image/png": "\n", 380 | "text/plain": [ 381 | "
" 382 | ] 383 | }, 384 | "metadata": { 385 | "needs_background": "light" 386 | }, 387 | "output_type": "display_data" 388 | } 389 | ], 390 | "source": [ 391 | "fig3 = plt.figure(figsize=(12, 6))\n", 392 | "ax3 = fig3.add_subplot(111)\n", 393 | "\n", 394 | "ax3.scatter(X[y==0][:, 0], X[y==0][:, 1], marker='^', s=30, c='green', label='class0')\n", 395 | "ax3.scatter(X[y==1][:, 0], X[y==1][:, 1], marker='s', s=30, c='blue', label='class1')\n", 396 | "\n", 397 | "plt.title('visualisation of the data')\n", 398 | "plt.legend()\n", 399 | "plt.xlabel('X1')\n", 400 | "plt.ylabel('X2')\n", 401 | "plt.show()" 402 | ] 403 | }, 404 | { 405 | "cell_type": "markdown", 406 | "id": "70e9f42f", 407 | "metadata": {}, 408 | "source": [ 409 | "- Decision function implementation\n", 410 | "\n", 411 | "We will use the same similarity measure as before, which is ***euclidean_distance()***. We will also use the ***statistics.mode()*** function to select the majority class" 412 | ] 413 | }, 414 | { 415 | "cell_type": "code", 416 | "execution_count": 97, 417 | "id": "89fac3d7", 418 | "metadata": {}, 419 | "outputs": [], 420 | "source": [ 421 | "import statistics\n", 422 | "\n", 423 | "def instance_based_classifier(X, y, new_x, alpha):\n", 424 | " similarities = [euclidean_distance(x, new_x) for x in X]\n", 425 | " predictors = np.argsort(similarities)[:alpha]\n", 426 | " prediction = statistics.mode(y[predictors])\n", 427 | " return prediction, predictors" 428 | ] 429 | }, 430 | { 431 | "cell_type": "markdown", 432 | "id": "f4771b98", 433 | "metadata": {}, 434 | "source": [ 435 | "- New instance" 436 | ] 437 | }, 438 | { 439 | "cell_type": "code", 440 | "execution_count": 99, 441 | "id": "242c3cc5", 442 | "metadata": {}, 443 | "outputs": [ 444 | { 445 | "data": { 446 | "text/plain": [ 447 | "1" 448 | ] 449 | }, 450 | "execution_count": 99, 451 | "metadata": {}, 452 | "output_type": "execute_result" 453 | } 454 | ], 455 | "source": [ 456 | "new_instance = np.array([1.25, 0.25])\n", 457 | "y_pred, predictors 
= instance_based_classifier(X, y, new_instance, alpha=20)\n", 458 | "y_pred" 459 | ] 460 | }, 461 | { 462 | "cell_type": "code", 463 | "execution_count": 100, 464 | "id": "62c9dfda", 465 | "metadata": {}, 466 | "outputs": [ 467 | { 468 | "data": { 469 | "text/plain": [ 470 | "array([1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0],\n", 471 | " dtype=int64)" 472 | ] 473 | }, 474 | "execution_count": 100, 475 | "metadata": {}, 476 | "output_type": "execute_result" 477 | } 478 | ], 479 | "source": [ 480 | "y[predictors]" 481 | ] 482 | }, 483 | { 484 | "cell_type": "markdown", 485 | "id": "2b121887", 486 | "metadata": {}, 487 | "source": [ 488 | "- visualisation of the new instance and the predictors" 489 | ] 490 | }, 491 | { 492 | "cell_type": "code", 493 | "execution_count": 102, 494 | "id": "78fd22ea", 495 | "metadata": {}, 496 | "outputs": [ 497 | { 498 | "data": { 499 | "text/plain": [ 500 | "Text(0, 0.5, 'X2')" 501 | ] 502 | }, 503 | "execution_count": 102, 504 | "metadata": {}, 505 | "output_type": "execute_result" 506 | }, 507 | { 508 | "data": { 509 | "image/png": "\n", 510 | "text/plain": [ 511 | "
" 512 | ] 513 | }, 514 | "metadata": { 515 | "needs_background": "light" 516 | }, 517 | "output_type": "display_data" 518 | } 519 | ], 520 | "source": [ 521 | "fig4 = plt.figure(figsize=(12, 6))\n", 522 | "ax4 = fig4.add_subplot(111)\n", 523 | "\n", 524 | "ax4.scatter(X[y==0][:, 0], X[y==0][:, 1], marker='^', s=30, c='green', label='class0')\n", 525 | "ax4.scatter(X[y==1][:, 0], X[y==1][:, 1], marker='s', s=30, c='blue', label='class1')\n", 526 | "ax4.scatter(X[predictors][:, 0], X[predictors][:, 1], marker='o', c='#BDB300', s=70, label='predictors')\n", 527 | "\n", 528 | "class_shape = {\n", 529 | " 0: '^',\n", 530 | " 1: 's'\n", 531 | "}\n", 532 | "\n", 533 | "marker = class_shape[y_pred]\n", 534 | "ax4.scatter(new_instance[0], new_instance[1], marker=marker, c='red', s=100, label=f'new instance class({y_pred})')\n", 535 | "\n", 536 | "plt.title('visualisation of the data')\n", 537 | "plt.legend()\n", 538 | "plt.xlabel('X1')\n", 539 | "plt.ylabel('X2')" 540 | ] 541 | }, 542 | { 543 | "cell_type": "markdown", 544 | "id": "711d09bd", 545 | "metadata": {}, 546 | "source": [ 547 | "#### 4.2 Model-based Classification\n", 548 | "- **Definition:** we defined it previously (see section 3.2)\n", 549 | "\n", 550 | "- **Principle:** choose a classification model and fit it on the training instances. 
Then predict the new class using the trained model only.\n", 551 | "\n", 552 | "- **Example of implementation:** A decision tree classifier is an example of model-based classification" 553 | ] 554 | }, 555 | { 556 | "cell_type": "markdown", 557 | "id": "bf9ab6e2", 558 | "metadata": {}, 559 | "source": [ 560 | "- import, fit, and visualise the model" 561 | ] 562 | }, 563 | { 564 | "cell_type": "code", 565 | "execution_count": 104, 566 | "id": "4fd4dbd3", 567 | "metadata": {}, 568 | "outputs": [ 569 | { 570 | "data": { 571 | "text/plain": [ 572 | "[Text(0.5, 0.75, 'x[1] <= 0.016\\ngini = 0.5\\nsamples = 100\\nvalue = [50, 50]'),\n", 573 | " Text(0.25, 0.25, 'gini = 0.145\\nsamples = 38\\nvalue = [3, 35]'),\n", 574 | " Text(0.75, 0.25, 'gini = 0.367\\nsamples = 62\\nvalue = [47, 15]')]" 575 | ] 576 | }, 577 | "execution_count": 104, 578 | "metadata": {}, 579 | "output_type": "execute_result" 580 | }, 581 | { 582 | "data": { 583 | "image/png": "\n", 584 | "text/plain": [ 585 | "
" 586 | ] 587 | }, 588 | "metadata": { 589 | "needs_background": "light" 590 | }, 591 | "output_type": "display_data" 592 | } 593 | ], 594 | "source": [ 595 | "from sklearn.tree import DecisionTreeClassifier, plot_tree\n", 596 | "\n", 597 | "dtc = DecisionTreeClassifier(max_depth=1)\n", 598 | "dtc.fit(X, y)\n", 599 | "\n", 600 | "plot_tree(dtc)" 601 | ] 602 | }, 603 | { 604 | "cell_type": "markdown", 605 | "id": "3f060b29", 606 | "metadata": {}, 607 | "source": [ 608 | "- new instance prediction" 609 | ] 610 | }, 611 | { 612 | "cell_type": "code", 613 | "execution_count": 112, 614 | "id": "e870cb2a", 615 | "metadata": {}, 616 | "outputs": [ 617 | { 618 | "data": { 619 | "text/plain": [ 620 | "(array([ 1. , -0.125]), 1)" 621 | ] 622 | }, 623 | "execution_count": 112, 624 | "metadata": {}, 625 | "output_type": "execute_result" 626 | } 627 | ], 628 | "source": [ 629 | "new_instance = np.array([1, -0.125])\n", 630 | "\n", 631 | "y_pred = dtc.predict([new_instance])\n", 632 | "(new_instance, y_pred[0])" 633 | ] 634 | }, 635 | { 636 | "cell_type": "markdown", 637 | "id": "7c1c943b", 638 | "metadata": {}, 639 | "source": [ 640 | "- visualisation of the data" 641 | ] 642 | }, 643 | { 644 | "cell_type": "code", 645 | "execution_count": 113, 646 | "id": "8bdc1cef", 647 | "metadata": {}, 648 | "outputs": [ 649 | { 650 | "data": { 651 | "image/png": "\n", 652 | "text/plain": [ 653 | "
" 654 | ] 655 | }, 656 | "metadata": { 657 | "needs_background": "light" 658 | }, 659 | "output_type": "display_data" 660 | } 661 | ], 662 | "source": [ 663 | "fig5 = plt.figure(figsize=(12, 6))\n", 664 | "ax5 = fig5.add_subplot(111)\n", 665 | "\n", 666 | "ax5.scatter(X[y==0][:, 0], X[y==0][:, 1], marker='^', s=30, c='green', label='class0')\n", 667 | "ax5.scatter(X[y==1][:, 0], X[y==1][:, 1], marker='s', s=30, c='blue', label='class1')\n", 668 | "\n", 669 | "x = np.linspace(-2, 2.5, 100)\n", 670 | "ax5.plot(x, np.full(100, 0.016), 'y--', label='Decision boundary')\n", 671 | "\n", 672 | "class_shape = {\n", 673 | " 0: '^',\n", 674 | " 1: 's'\n", 675 | "}\n", 676 | "ax5.scatter(new_instance[0], new_instance[1], c='red', marker=class_shape[y_pred[0]], s=100, label='new instance')\n", 677 | "\n", 678 | "\n", 679 | "plt.title('visualisation of the data')\n", 680 | "plt.legend()\n", 681 | "plt.xlabel('X1')\n", 682 | "plt.ylabel('X2')\n", 683 | "plt.show()" 684 | ] 685 | }, 686 | { 687 | "cell_type": "markdown", 688 | "id": "730e57ca", 689 | "metadata": {}, 690 | "source": [ 691 | "# That's it\n", 692 | "That was the difference between the two techniques of machine learning. 
If you find this lecture useful, share it with your mates 😊" 693 | ] 694 | } 695 | ], 696 | "metadata": { 697 | "kernelspec": { 698 | "display_name": "Python 3 (ipykernel)", 699 | "language": "python", 700 | "name": "python3" 701 | }, 702 | "language_info": { 703 | "codemirror_mode": { 704 | "name": "ipython", 705 | "version": 3 706 | }, 707 | "file_extension": ".py", 708 | "mimetype": "text/x-python", 709 | "name": "python", 710 | "nbconvert_exporter": "python", 711 | "pygments_lexer": "ipython3", 712 | "version": "3.10.6" 713 | } 714 | }, 715 | "nbformat": 4, 716 | "nbformat_minor": 5 717 | } 718 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Instance-based-vs-Model-based-Learning 2 | [***📢 See the notebook for the full explanation with code samples***] 3 | ### 1. Introduction 4 | Welcome🧠! In this lecture we will discuss two types of machine learning: **instance-based** and **model-based** learning. We will cover both **Regression** and **Classification** with each type of learning, giving the definition, the principle, and a practical example of each. I hope you enjoy the lecture. If you find this lecture valuable, please consider sharing it with your mates so they can also benefit. Let's begin!💡 5 | 6 | ### 2. Definitions 7 | #### 2.1 Model-Based Learning 8 | Model-based learning involves creating a mathematical model that can predict outcomes based on input data. The model is trained on a large dataset and then used to make predictions on new data. The model can be thought of as a set of rules that the machine uses to make predictions. 9 | #### 2.2 Instance-Based Learning 10 | Instance-based learning involves using the entire dataset to make predictions. The machine learns by storing all instances of data and then using these instances to make predictions on new data. 
The machine compares the new data to the instances it has seen before and uses the closest match to make a prediction. 11 | 12 | **🔖Source:** [Model-Based vs Instance-Based Learning: Understanding the Differences with Examples - medium](https://medium.com/@pp1222001/model-based-vs-instance-based-learning-understanding-the-differences-with-examples-1545c9c3a056#:~:text=Model-based%20learning%20is%20typically%20faster%20and%20more%20accurate,is%20slower%20and%20can%20make%20less%20accurate%20predictions.) 13 | 14 | ### 3. Regression 15 | 16 | #### 3.1 Instance-based regression 17 | - **Definition:** as discussed previously, in instance-based learning we use the training instances themselves for the prediction. The more similar our instance is to a given training instance, the closer its label value is to that instance's label value. 18 | **i.e.**: our_instance ~ given_instance, so our_label ~ given_label 19 | 20 | - **Principle:** choose the training instances most similar to the new instance and use their labels to predict the new label: the predicted value is the average of those labels. 21 | 22 | - **Example of implementation:** we will implement a regression decision function that predicts values directly from the training instances 23 | 24 | 25 | #### 3.2 Model-based regression 26 | - **Definition**: As discussed previously, in model-based learning we create a model (a function or formula) that generalises the rules needed to predict, then use it to make predictions without needing the training instances at prediction time 27 | 28 | - **Principle**: choose a regression model and fit it on the training instances. Then predict new labels using the trained model only. 29 | 30 | - **Example of implementation**: A simple linear regression is an example of model-based regression 31 | 32 | 33 | 34 | ### 4. 
Classification 35 | #### 4.1 Instance-based Classification 36 | - **Definition:** we defined it previously (see section 3.1) 37 | 38 | - **Principle:** we choose a number **n** of the instances most similar to the new instance (n is given by the parameter alpha); the class of the new instance is then the majority class among those instances (the most frequent class) 39 | 40 | - **Example of implementation:** implement a classifier that uses the training instances directly, without a training step 41 | 42 | 43 | #### 4.2 Model-based Classification 44 | - **Definition:** we defined it previously (see section 3.2) 45 | 46 | - **Principle:** choose a classification model and fit it on the training instances. Then predict the new class using the trained model only. 47 | 48 | - **Example of implementation:** A decision tree classifier is an example of model-based classification 49 | 50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59 | 60 | --------------------------------------------------------------------------------
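The README states the principles without code, so here is a minimal, library-free sketch of both approaches side by side. It is not the notebook's implementation: the helper names `knn_predict` and `fit_line`, the parameter `k` (playing the role of the notebook's `alpha`), and the toy data are all assumptions made up for illustration.

```python
import math
import statistics


def knn_predict(X, y, new_x, k=3, classify=False):
    """Instance-based: keep the training data and consult it at prediction time."""
    # Indices of the k training instances closest to new_x (Euclidean distance).
    nearest = sorted(range(len(X)), key=lambda i: math.dist(X[i], new_x))[:k]
    labels = [y[i] for i in nearest]
    # Regression averages the neighbours' labels; classification takes a majority vote.
    return statistics.mode(labels) if classify else statistics.fmean(labels)


def fit_line(xs, ys):
    """Model-based: compress the training data into two parameters (w, b)."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    covariance = sum((x - mean_x) * (t - mean_y) for x, t in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    w = covariance / variance      # least-squares slope
    b = mean_y - w * mean_x        # least-squares intercept
    return w, b


# Hypothetical 1-D regression data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

# Instance-based prediction needs the whole training set at prediction time.
knn_value = knn_predict([[x] for x in xs], ys, [2.1], k=3)

# Model-based prediction needs only the fitted parameters.
w, b = fit_line(xs, ys)
line_value = w * 2.1 + b

# Hypothetical 2-D classification data with two well-separated classes.
points = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
classes = [0, 0, 0, 1, 1, 1]
knn_class = knn_predict(points, classes, [4.5, 5.0], k=3, classify=True)

print(knn_value, line_value, knn_class)
```

The design difference shows up in what each prediction needs: `knn_predict` must be handed `X` and `y` on every call, while after `fit_line` the training data can be discarded and only `w` and `b` are kept.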