├── .gitignore
├── Processed Datasets
    ├── AmazonMusic.tar.xz
    ├── AmazonMusicCompact.tar.xz
    ├── Anime
    │   ├── README.md
    │   └── anime.zip
    ├── BookCrossing
    │   ├── README.md
    │   └── book_crossing.zip
    ├── FBC.ipynb
    ├── FC.ipynb
    ├── RS_NonPersonalized.ipynb
    ├── RetailrocketEcommerce
    │   ├── README.md
    │   └── Retailrocket_Ecommerce.zip
    └── Steam
    │   ├── README.md
    │   └── steam.zip
└── README.md


/.gitignore:
--------------------------------------------------------------------------------
1 | # Created by .ignore support plugin (hsz.mobi)
2 | .gitignore
3 | .idea/
4 | 


--------------------------------------------------------------------------------
/Processed Datasets/AmazonMusic.tar.xz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/caserec/Datasets-for-Recommender-Systems/4180b4dc4103452c591a1718560d29bdf1f48540/Processed Datasets/AmazonMusic.tar.xz


--------------------------------------------------------------------------------
/Processed Datasets/AmazonMusicCompact.tar.xz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/caserec/Datasets-for-Recommender-Systems/4180b4dc4103452c591a1718560d29bdf1f48540/Processed Datasets/AmazonMusicCompact.tar.xz


--------------------------------------------------------------------------------
/Processed Datasets/Anime/README.md:
--------------------------------------------------------------------------------
 1 | SUMMARY & USAGE LICENSE
 2 | =============================================
 3 | 
 4 | This dataset is provided by Kaggle through the link: https://www.kaggle.com/CooperUnion/anime-recommendations-database
 5 | 
 6 | * Context
 7 | The original dataset contains user preference data from 73,516 users on 12,294 anime. Each user can add anime to their completed list and rate it; this dataset is a compilation of those ratings. It has been organized and reduced by Arthur Fortes [1], who generated new IDs for users and items, split the history and ratings into separate files, and randomly selected a subsample of 5,000 users.
 8 | 
 9 | The reduction was performed with Python [2] using random.seed(123).
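
The seeded subsampling described above might look roughly like the sketch below. The README only states that Python and `random.seed(123)` were used, so the function name `subsample_users`, the tuple layout, and the use of `random.Random(...).sample` are assumptions for illustration, not the original script.

```python
import random


def subsample_users(interactions, n_users, seed=123):
    """Keep only the interactions of n_users randomly chosen users.

    interactions: list of (user_id, item_id, feedback) tuples (assumed layout).
    """
    # Sort the distinct users so the draw does not depend on set ordering.
    users = sorted({user for user, _item, _value in interactions})
    # A local generator seeded with 123 (hypothetical stand-in for random.seed(123)).
    rng = random.Random(seed)
    kept = set(rng.sample(users, min(n_users, len(users))))
    return [row for row in interactions if row[0] in kept]


# Toy data standing in for the full interaction log.
toy = [(1, 10, 8), (1, 11, 7), (2, 10, 9), (3, 12, 6), (4, 11, 10)]
sample = subsample_users(toy, n_users=2)
```

Because the generator is seeded, repeated calls with the same input return the same users, which is what makes a reduction of this kind reproducible.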
10 | 
11 | * Acknowledgements
12 | Thanks to myanimelist.net API for providing anime data and user ratings.
13 | 
14 | 
15 | Detailed descriptions of the data files can be found at the end of this file.
16 |  
17 | This dataset consists of:
18 |   * 520,610 interactions (history / ratings) from 5,000 users on 7,718 animes.
19 |     - History: 5,000 users and 7,390 animes (520,610 interactions)
20 |     - Ratings: 4,714 users and 7,157 animes (419,944 interactions) 
21 | 
22 | If you have any further questions or comments, please contact me
23 | <fortes.arthur@gmail.com>. 
24 | 
25 | 
26 | DETAILED DESCRIPTIONS OF DATA FILES
27 | ==============================================
28 | 
29 | Here are brief descriptions of the data.
30 | 
31 | anime_ratings.dat    
32 | 
33 |                   The full ratings set: 419,944 interactions by 4,714 users on 7,157 animes.
34 |                   Users and items are numbered consecutively from 1. The data is ordered by user ID. 
35 |                   This is a tab separated list of 
36 |                   User_ID | Anime_ID | Feedback 
37 | 
38 |                   Range of ratings: 1 - 10
39 | 
40 | anime_history.dat    
41 | 
42 |                   The full history set: 520,610 interactions by 5,000 users on 7,390 animes.
43 |                   Users and items are numbered consecutively from 1. The data is ordered by user ID. 
44 |                   This is a tab separated list of 
45 |                   User_ID | Anime_ID | Feedback 
46 | 
47 | anime_info.dat 
48 | 
49 |                   Information about the items (animes); this is a tab separated list of
50 |                   anime_ids | name | genre | type | episodes | rating | members
51 | 
52 |                   The item ids are the ones used in the anime_ratings.dat 
53 |                   and anime_history.dat files.
54 | 
55 |                   anime_ids - myanimelist.net's unique id identifying an anime.
56 |                   name - full name of anime.
57 |                   genre - comma separated list of genres for this anime.
58 |                   type - movie, TV, OVA, etc.
59 |                   episodes - how many episodes in this show. (1 if movie).
60 |                   rating - average rating out of 10 for this anime.
61 |                   members - number of community members that are in this anime's "group".
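
As a minimal sketch of reading the tab-separated interaction files described above (anime_ratings.dat and anime_history.dat), the snippet below parses `User_ID`, `Anime_ID`, `Feedback` rows with the standard library. It assumes the files have no header row and that the feedback column is numeric; neither is confirmed by this README, and `read_interactions` is a hypothetical helper name.

```python
import csv
import io


def read_interactions(handle):
    """Parse tab-separated User_ID, Anime_ID, Feedback rows into tuples."""
    rows = []
    for record in csv.reader(handle, delimiter="\t"):
        if not record:  # skip blank lines
            continue
        user_id, anime_id, feedback = record[:3]
        rows.append((int(user_id), int(anime_id), float(feedback)))
    return rows


# In-memory sample with the same layout as the .dat files (made-up values).
sample_text = "1\t20\t8\n1\t24\t10\n2\t20\t7\n"
ratings = read_interactions(io.StringIO(sample_text))
```

With the extracted archive, the same function could be called as `read_interactions(open("anime_ratings.dat", newline=""))`, where the path is whatever location the file was unpacked to.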
62 | 
63 | 
64 | REFERENCES
65 | ==============================================
66 | 
67 | [1] Da Costa, Arthur Fortes. PhD candidate at the Institute of Mathematical and Computational Sciences, 
68 | University of São Paulo. URL: https://arthurfortes.github.io/
69 | 
70 | [2] https://www.python.org/


--------------------------------------------------------------------------------
/Processed Datasets/Anime/anime.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/caserec/Datasets-for-Recommender-Systems/4180b4dc4103452c591a1718560d29bdf1f48540/Processed Datasets/Anime/anime.zip


--------------------------------------------------------------------------------
/Processed Datasets/BookCrossing/README.md:
--------------------------------------------------------------------------------
 1 | SUMMARY & USAGE LICENSE
 2 | =============================================
 3 | 
 4 | The Book-Crossing dataset was collected by Cai-Nicolas Ziegler in a 4-week crawl (August / September 2004) 
 5 | from the Book-Crossing community with kind permission from Ron Hornbaker, CTO of Humankind Systems. 
 6 | 
 7 | This data has been organized and cleaned up by Arthur Fortes [1] based on the MovieLens 100k treatment [2]: 
 8 | all users and items with fewer than 20 and 10 interactions, respectively, were removed, as were items with no information, 
 9 | and the explicit and implicit interactions were split into separate files. 
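
The minimum-interaction filter described above can be sketched as follows. This is a single-pass illustration under assumed names (`filter_sparse`, `min_user`, `min_item`); the README does not say whether the original cleanup was applied once or repeated until every remaining user and item met the thresholds.

```python
from collections import Counter


def filter_sparse(interactions, min_user=20, min_item=10):
    """Drop interactions whose user or item falls below the given counts.

    interactions: list of (user_id, item_id) pairs (assumed layout).
    """
    user_counts = Counter(user for user, _item in interactions)
    item_counts = Counter(item for _user, item in interactions)
    return [
        (user, item)
        for user, item in interactions
        if user_counts[user] >= min_user and item_counts[item] >= min_item
    ]


# Toy example with small thresholds so the effect is visible.
toy = [(1, "a"), (1, "b"), (2, "a"), (3, "a"), (3, "c")]
kept = filter_sparse(toy, min_user=2, min_item=2)
```

Note that one pass can leave users or items below the threshold again (user 1 ends up with a single interaction in the toy data), so a stricter cleanup would call the function repeatedly until the output stops changing.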
10 | 
11 | Detailed descriptions of the data files can be found at the end of this file.
12 |  
13 | This dataset consists of:
14 |   * 272,679 interactions (explicit / implicit) from 2,946 users on 17,384 books.
15 |     - Ratings: 1,295 users and 14,684 books (62,657 ratings applied)
16 |     - History: 2,946 users and 17,384 books (272,679 accesses) 
17 |   * Ratings are between 1 - 10. Implicit feedback is represented by 1.
18 |   * Simple demographic info for the users (location, age)
19 | 
20 | If you have any further questions or comments, please contact me
21 | <fortes.arthur@gmail.com>. 
22 | 
23 | 
24 | CITATION
25 | ==============================================
26 | 
27 | Freely available for research use when acknowledged with the following reference (further details on the dataset are given in this publication):
28 | 
29 | Improving Recommendation Lists Through Topic Diversification, Cai-Nicolas Ziegler, Sean M. McNee, Joseph A. Konstan, Georg Lausen; 
30 | Proceedings of the 14th International World Wide Web Conference (WWW '05), May 10-14, 2005, Chiba, Japan.
31 | 
32 | 
33 | DETAILED DESCRIPTIONS OF DATA FILES
34 | ==============================================
35 | 
36 | Here are brief descriptions of the data.
37 | 
38 | items_info.dat    
39 | 
40 |                   Information about the items (books); this is a tab separated list of
41 |                   Book_ID | ISBN | Book-Title | Book-Author | Year-Of-Publication | 
42 |                   Publisher | Image-URL-S | Image-URL-M | Image-URL-L
43 | 
44 |                   The item ids are the ones used in the book_history.dat 
45 |                   and book_ratings.dat files.
46 | 
47 | users_info.dat    
48 | 
49 |                   Demographic information about the users; this is a tab
50 |                   separated list of
51 |                   User-ID | Location | Age
52 | 
53 |                   The user ids are the ones used in the book_history.dat 
54 |                   and book_ratings.dat files.
55 | 
56 | 
57 | book_history.dat  
58 | 
59 |                   The full history set, 272,679 accesses by 2,946 users on 17,384 books.
60 |                   Each user has accessed at least 20 books.  Users and items are
61 |                   numbered consecutively from 1.  The data is ordered by user ID. 
62 |                   This is a tab separated list of 
63 |                   user id | item id | accessed 
64 | 
65 | book_ratings.dat
66 | 
67 |                   The full ratings set, 62,657 ratings by 1,295 users on 14,684 books.
68 |                   Users and items are numbered consecutively from 1. The data is ordered by user ID. 
69 |                   This is a tab separated list of 
70 |                   user id | item id | ratings 
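
One way to check a loaded file against the user, item, and interaction counts quoted in this README is a small summary helper like the one below. It is a sketch: `summarize` is a hypothetical name, and the rows are assumed to be already parsed into `(user id, item id, value)` tuples.

```python
def summarize(interactions):
    """Count distinct users, distinct items, and total interactions."""
    users = {row[0] for row in interactions}
    items = {row[1] for row in interactions}
    return {
        "users": len(users),
        "items": len(items),
        "interactions": len(interactions),
    }


# Toy rows in the (user id, item id, rating) layout of book_ratings.dat.
toy = [(1, 5, 8), (1, 6, 10), (2, 5, 7)]
stats = summarize(toy)
```

Run on the parsed book_ratings.dat, the three numbers would be compared with the 1,295 users, 14,684 books, and 62,657 ratings stated above.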
71 | 
72 | 
73 | REFERENCES
74 | ==============================================
75 | 
76 | [1] Da Costa, Arthur Fortes. PhD candidate at the Institute of Mathematical and Computational Sciences, 
77 | University of São Paulo. URL: https://arthurfortes.github.io/
78 | 
79 | 
80 | [2] MovieLens 100K Dataset. Stable benchmark dataset. 100,000 ratings from 1000 users on 1700 movies. 
81 | Released 4/1998. URL: https://grouplens.org/datasets/movielens/100k/
82 | Generated by GroupLens [Department of Computer Science and Engineering at the University of Minnesota].
83 | 


--------------------------------------------------------------------------------
/Processed Datasets/BookCrossing/book_crossing.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/caserec/Datasets-for-Recommender-Systems/4180b4dc4103452c591a1718560d29bdf1f48540/Processed Datasets/BookCrossing/book_crossing.zip


--------------------------------------------------------------------------------
/Processed Datasets/FBC.ipynb:
--------------------------------------------------------------------------------
   1 | {
   2 |   "nbformat": 4,
   3 |   "nbformat_minor": 0,
   4 |   "metadata": {
   5 |     "colab": {
   6 |       "name": "FBC.ipynb",
   7 |       "version": "0.3.2",
   8 |       "provenance": []
   9 |     },
  10 |     "kernelspec": {
  11 |       "name": "python3",
  12 |       "display_name": "Python 3"
  13 |     }
  14 |   },
  15 |   "cells": [
  16 |     {
  17 |       "cell_type": "code",
  18 |       "metadata": {
  19 |         "id": "UbCVwAjd4Lk7",
  20 |         "colab_type": "code",
  21 |         "colab": {}
  22 |       },
  23 |       "source": [
  24 |         "import pandas as pd\n",
  25 |         "import numpy as np"
  26 |       ],
  27 |       "execution_count": 0,
  28 |       "outputs": []
  29 |     },
  30 |     {
  31 |       "cell_type": "code",
  32 |       "metadata": {
  33 |         "id": "8OFEkZuC4d0G",
  34 |         "colab_type": "code",
  35 |         "colab": {}
  36 |       },
  37 |       "source": [
  38 |         "metadata = pd.read_csv('AmazonMusic/amazon_music_metadata.csv')"
  39 |       ],
  40 |       "execution_count": 0,
  41 |       "outputs": []
  42 |     },
  43 |     {
  44 |       "cell_type": "code",
  45 |       "metadata": {
  46 |         "id": "az-cFZrF7KY9",
  47 |         "colab_type": "code",
  48 |         "colab": {
  49 |           "base_uri": "https://localhost:8080/",
  50 |           "height": 412
  51 |         },
  52 |         "outputId": "963ec8ff-2fcb-41f4-f4f8-493cf90befcb"
  53 |       },
  54 |       "source": [
  55 |         "metadata.head()"
  56 |       ],
  57 |       "execution_count": 6,
  58 |       "outputs": [
  59 |         {
  60 |           "output_type": "execute_result",
  61 |           "data": {
  62 |             "text/html": [
  63 |               "<div>\n",
  64 |               "<style scoped>\n",
  65 |               "    .dataframe tbody tr th:only-of-type {\n",
  66 |               "        vertical-align: middle;\n",
  67 |               "    }\n",
  68 |               "\n",
  69 |               "    .dataframe tbody tr th {\n",
  70 |               "        vertical-align: top;\n",
  71 |               "    }\n",
  72 |               "\n",
  73 |               "    .dataframe thead th {\n",
  74 |               "        text-align: right;\n",
  75 |               "    }\n",
  76 |               "</style>\n",
  77 |               "<table border=\"1\" class=\"dataframe\">\n",
  78 |               "  <thead>\n",
  79 |               "    <tr style=\"text-align: right;\">\n",
  80 |               "      <th></th>\n",
  81 |               "      <th>asin</th>\n",
  82 |               "      <th>title</th>\n",
  83 |               "      <th>Accessories</th>\n",
  84 |               "      <th>Acid Jazz</th>\n",
  85 |               "      <th>Acoustic Blues</th>\n",
  86 |               "      <th>Adult Alternative</th>\n",
  87 |               "      <th>Adult Contemporary</th>\n",
  88 |               "      <th>Africa</th>\n",
  89 |               "      <th>Afro Brazilian</th>\n",
  90 |               "      <th>Afro-Cuban</th>\n",
  91 |               "      <th>Air Tool Accessories</th>\n",
  92 |               "      <th>Album-Oriented Rock (AOR)</th>\n",
  93 |               "      <th>Alt Industrial</th>\n",
  94 |               "      <th>Alt-Country &amp; Americana</th>\n",
  95 |               "      <th>Alternative Medicine</th>\n",
  96 |               "      <th>Alternative Metal</th>\n",
  97 |               "      <th>Alternative Rock</th>\n",
  98 |               "      <th>Ambient</th>\n",
  99 |               "      <th>Ambient Pop</th>\n",
 100 |               "      <th>American Alternative</th>\n",
 101 |               "      <th>American Punk</th>\n",
 102 |               "      <th>Americana</th>\n",
 103 |               "      <th>Amplifiers &amp; Effects</th>\n",
 104 |               "      <th>Andes</th>\n",
 105 |               "      <th>Arena Rock</th>\n",
 106 |               "      <th>Argentina</th>\n",
 107 |               "      <th>Arts &amp; Crafts Supplies</th>\n",
 108 |               "      <th>Arts, Crafts &amp; Sewing</th>\n",
 109 |               "      <th>Australia &amp; New Zealand</th>\n",
 110 |               "      <th>Austria</th>\n",
 111 |               "      <th>Avant Garde &amp; Free Jazz</th>\n",
 112 |               "      <th>Baby Products</th>\n",
 113 |               "      <th>Bachata</th>\n",
 114 |               "      <th>Bags &amp; Cases</th>\n",
 115 |               "      <th>Bakersfield Sound</th>\n",
 116 |               "      <th>Ballets</th>\n",
 117 |               "      <th>Ballets &amp; Dances</th>\n",
 118 |               "      <th>Baroque Pop</th>\n",
 119 |               "      <th>Bass</th>\n",
 120 |               "      <th>Bass Guitars</th>\n",
 121 |               "      <th>...</th>\n",
 122 |               "      <th>Third Wave Ska</th>\n",
 123 |               "      <th>Thrash &amp; Speed Metal</th>\n",
 124 |               "      <th>Tin Pan Alley</th>\n",
 125 |               "      <th>Tools &amp; Accessories</th>\n",
 126 |               "      <th>Tools &amp; Home Improvement</th>\n",
 127 |               "      <th>Traditional</th>\n",
 128 |               "      <th>Traditional Blues</th>\n",
 129 |               "      <th>Traditional British &amp; Celtic Folk</th>\n",
 130 |               "      <th>Traditional Folk</th>\n",
 131 |               "      <th>Traditional Jazz &amp; Ragtime</th>\n",
 132 |               "      <th>Traditional Pop</th>\n",
 133 |               "      <th>Traditional Vocal Pop</th>\n",
 134 |               "      <th>Trance</th>\n",
 135 |               "      <th>Tributes</th>\n",
 136 |               "      <th>Trim &amp; Embellishments</th>\n",
 137 |               "      <th>Trip-Hop</th>\n",
 138 |               "      <th>Turkey</th>\n",
 139 |               "      <th>Turntablists</th>\n",
 140 |               "      <th>Twee Pop</th>\n",
 141 |               "      <th>Urban &amp; Contemporary</th>\n",
 142 |               "      <th>Urban Folk</th>\n",
 143 |               "      <th>Uruguay</th>\n",
 144 |               "      <th>Venezuela</th>\n",
 145 |               "      <th>Vitamins &amp; Dietary Supplements</th>\n",
 146 |               "      <th>Vocal Blues</th>\n",
 147 |               "      <th>Vocal Jazz</th>\n",
 148 |               "      <th>Vocal Non-Opera</th>\n",
 149 |               "      <th>Vocal Pop</th>\n",
 150 |               "      <th>Voices</th>\n",
 151 |               "      <th>Walkers</th>\n",
 152 |               "      <th>Wall Stickers</th>\n",
 153 |               "      <th>Wall Switches</th>\n",
 154 |               "      <th>Washers</th>\n",
 155 |               "      <th>Wave Washers &amp; Wave Springs</th>\n",
 156 |               "      <th>Wedding Music</th>\n",
 157 |               "      <th>West Coast</th>\n",
 158 |               "      <th>West Coast Blues</th>\n",
 159 |               "      <th>Western Swing</th>\n",
 160 |               "      <th>World Dance</th>\n",
 161 |               "      <th>World Music</th>\n",
 162 |               "    </tr>\n",
 163 |               "  </thead>\n",
 164 |               "  <tbody>\n",
 165 |               "    <tr>\n",
 166 |               "      <th>0</th>\n",
 167 |               "      <td>5555991584</td>\n",
 168 |               "      <td>Memory of Trees</td>\n",
 169 |               "      <td>0.0</td>\n",
 170 |               "      <td>0.0</td>\n",
 171 |               "      <td>0.0</td>\n",
 172 |               "      <td>0.0</td>\n",
 173 |               "      <td>0.0</td>\n",
 174 |               "      <td>0.0</td>\n",
 175 |               "      <td>0.0</td>\n",
 176 |               "      <td>0.0</td>\n",
 177 |               "      <td>0.0</td>\n",
 178 |               "      <td>0.0</td>\n",
 179 |               "      <td>0.0</td>\n",
 180 |               "      <td>0.0</td>\n",
 181 |               "      <td>0.0</td>\n",
 182 |               "      <td>0.0</td>\n",
 183 |               "      <td>0.0</td>\n",
 184 |               "      <td>0.0</td>\n",
 185 |               "      <td>0.0</td>\n",
 186 |               "      <td>0.0</td>\n",
 187 |               "      <td>0.0</td>\n",
 188 |               "      <td>0.0</td>\n",
 189 |               "      <td>0.0</td>\n",
 190 |               "      <td>0.0</td>\n",
 191 |               "      <td>0.0</td>\n",
 192 |               "      <td>0.0</td>\n",
 193 |               "      <td>0.0</td>\n",
 194 |               "      <td>0.0</td>\n",
 195 |               "      <td>0.0</td>\n",
 196 |               "      <td>0.0</td>\n",
 197 |               "      <td>0.0</td>\n",
 198 |               "      <td>0.0</td>\n",
 199 |               "      <td>0.0</td>\n",
 200 |               "      <td>0.0</td>\n",
 201 |               "      <td>0.0</td>\n",
 202 |               "      <td>0.0</td>\n",
 203 |               "      <td>0.0</td>\n",
 204 |               "      <td>0.0</td>\n",
 205 |               "      <td>0.0</td>\n",
 206 |               "      <td>0.0</td>\n",
 207 |               "      <td>...</td>\n",
 208 |               "      <td>0.0</td>\n",
 209 |               "      <td>0.0</td>\n",
 210 |               "      <td>0.0</td>\n",
 211 |               "      <td>0.0</td>\n",
 212 |               "      <td>0.0</td>\n",
 213 |               "      <td>0.0</td>\n",
 214 |               "      <td>0.0</td>\n",
 215 |               "      <td>0.0</td>\n",
 216 |               "      <td>0.0</td>\n",
 217 |               "      <td>0.0</td>\n",
 218 |               "      <td>0.0</td>\n",
 219 |               "      <td>0.0</td>\n",
 220 |               "      <td>0.0</td>\n",
 221 |               "      <td>0.0</td>\n",
 222 |               "      <td>0.0</td>\n",
 223 |               "      <td>0.0</td>\n",
 224 |               "      <td>0.0</td>\n",
 225 |               "      <td>0.0</td>\n",
 226 |               "      <td>0.0</td>\n",
 227 |               "      <td>0.0</td>\n",
 228 |               "      <td>0.0</td>\n",
 229 |               "      <td>0.0</td>\n",
 230 |               "      <td>0.0</td>\n",
 231 |               "      <td>0.0</td>\n",
 232 |               "      <td>0.0</td>\n",
 233 |               "      <td>0.0</td>\n",
 234 |               "      <td>0.0</td>\n",
 235 |               "      <td>0.0</td>\n",
 236 |               "      <td>0.0</td>\n",
 237 |               "      <td>0.0</td>\n",
 238 |               "      <td>0.0</td>\n",
 239 |               "      <td>0.0</td>\n",
 240 |               "      <td>0.0</td>\n",
 241 |               "      <td>0.0</td>\n",
 242 |               "      <td>0.0</td>\n",
 243 |               "      <td>0.0</td>\n",
 244 |               "      <td>0.0</td>\n",
 245 |               "      <td>0.0</td>\n",
 246 |               "      <td>0.0</td>\n",
 247 |               "      <td>0.0</td>\n",
 248 |               "    </tr>\n",
 249 |               "    <tr>\n",
 250 |               "      <th>1</th>\n",
 251 |               "      <td>6308051551</td>\n",
 252 |               "      <td>Dont Drink His Blood</td>\n",
 253 |               "      <td>0.0</td>\n",
 254 |               "      <td>0.0</td>\n",
 255 |               "      <td>0.0</td>\n",
 256 |               "      <td>0.0</td>\n",
 257 |               "      <td>0.0</td>\n",
 258 |               "      <td>0.0</td>\n",
 259 |               "      <td>0.0</td>\n",
 260 |               "      <td>0.0</td>\n",
 261 |               "      <td>0.0</td>\n",
 262 |               "      <td>0.0</td>\n",
 263 |               "      <td>0.0</td>\n",
 264 |               "      <td>0.0</td>\n",
 265 |               "      <td>0.0</td>\n",
 266 |               "      <td>0.0</td>\n",
 267 |               "      <td>1.0</td>\n",
 268 |               "      <td>0.0</td>\n",
 269 |               "      <td>0.0</td>\n",
 270 |               "      <td>0.0</td>\n",
 271 |               "      <td>0.0</td>\n",
 272 |               "      <td>0.0</td>\n",
 273 |               "      <td>0.0</td>\n",
 274 |               "      <td>0.0</td>\n",
 275 |               "      <td>0.0</td>\n",
 276 |               "      <td>0.0</td>\n",
 277 |               "      <td>0.0</td>\n",
 278 |               "      <td>0.0</td>\n",
 279 |               "      <td>0.0</td>\n",
 280 |               "      <td>0.0</td>\n",
 281 |               "      <td>0.0</td>\n",
 282 |               "      <td>0.0</td>\n",
 283 |               "      <td>0.0</td>\n",
 284 |               "      <td>0.0</td>\n",
 285 |               "      <td>0.0</td>\n",
 286 |               "      <td>0.0</td>\n",
 287 |               "      <td>0.0</td>\n",
 288 |               "      <td>0.0</td>\n",
 289 |               "      <td>0.0</td>\n",
 290 |               "      <td>0.0</td>\n",
 291 |               "      <td>...</td>\n",
 292 |               "      <td>0.0</td>\n",
 293 |               "      <td>0.0</td>\n",
 294 |               "      <td>0.0</td>\n",
 295 |               "      <td>0.0</td>\n",
 296 |               "      <td>0.0</td>\n",
 297 |               "      <td>0.0</td>\n",
 298 |               "      <td>0.0</td>\n",
 299 |               "      <td>0.0</td>\n",
 300 |               "      <td>0.0</td>\n",
 301 |               "      <td>0.0</td>\n",
 302 |               "      <td>0.0</td>\n",
 303 |               "      <td>0.0</td>\n",
 304 |               "      <td>0.0</td>\n",
 305 |               "      <td>0.0</td>\n",
 306 |               "      <td>0.0</td>\n",
 307 |               "      <td>0.0</td>\n",
 308 |               "      <td>0.0</td>\n",
 309 |               "      <td>0.0</td>\n",
 310 |               "      <td>0.0</td>\n",
 311 |               "      <td>0.0</td>\n",
 312 |               "      <td>0.0</td>\n",
 313 |               "      <td>0.0</td>\n",
 314 |               "      <td>0.0</td>\n",
 315 |               "      <td>0.0</td>\n",
 316 |               "      <td>0.0</td>\n",
 317 |               "      <td>0.0</td>\n",
 318 |               "      <td>0.0</td>\n",
 319 |               "      <td>0.0</td>\n",
 320 |               "      <td>0.0</td>\n",
 321 |               "      <td>0.0</td>\n",
 322 |               "      <td>0.0</td>\n",
 323 |               "      <td>0.0</td>\n",
 324 |               "      <td>0.0</td>\n",
 325 |               "      <td>0.0</td>\n",
 326 |               "      <td>0.0</td>\n",
 327 |               "      <td>0.0</td>\n",
 328 |               "      <td>0.0</td>\n",
 329 |               "      <td>0.0</td>\n",
 330 |               "      <td>0.0</td>\n",
 331 |               "      <td>0.0</td>\n",
 332 |               "    </tr>\n",
 333 |               "    <tr>\n",
 334 |               "      <th>2</th>\n",
 335 |               "      <td>7901622466</td>\n",
 336 |               "      <td>On Fire</td>\n",
 337 |               "      <td>0.0</td>\n",
 338 |               "      <td>0.0</td>\n",
 339 |               "      <td>0.0</td>\n",
 340 |               "      <td>0.0</td>\n",
 341 |               "      <td>0.0</td>\n",
 342 |               "      <td>0.0</td>\n",
 343 |               "      <td>0.0</td>\n",
 344 |               "      <td>0.0</td>\n",
 345 |               "      <td>0.0</td>\n",
 346 |               "      <td>0.0</td>\n",
 347 |               "      <td>0.0</td>\n",
 348 |               "      <td>0.0</td>\n",
 349 |               "      <td>0.0</td>\n",
 350 |               "      <td>0.0</td>\n",
 351 |               "      <td>0.0</td>\n",
 352 |               "      <td>0.0</td>\n",
 353 |               "      <td>0.0</td>\n",
 354 |               "      <td>0.0</td>\n",
 355 |               "      <td>0.0</td>\n",
 356 |               "      <td>0.0</td>\n",
 357 |               "      <td>0.0</td>\n",
 358 |               "      <td>0.0</td>\n",
 359 |               "      <td>0.0</td>\n",
 360 |               "      <td>0.0</td>\n",
 361 |               "      <td>0.0</td>\n",
 362 |               "      <td>0.0</td>\n",
 363 |               "      <td>0.0</td>\n",
 364 |               "      <td>0.0</td>\n",
 365 |               "      <td>0.0</td>\n",
 366 |               "      <td>0.0</td>\n",
 367 |               "      <td>0.0</td>\n",
 368 |               "      <td>0.0</td>\n",
 369 |               "      <td>0.0</td>\n",
 370 |               "      <td>0.0</td>\n",
 371 |               "      <td>0.0</td>\n",
 372 |               "      <td>0.0</td>\n",
 373 |               "      <td>0.0</td>\n",
 374 |               "      <td>0.0</td>\n",
 375 |               "      <td>...</td>\n",
 376 |               "      <td>0.0</td>\n",
 377 |               "      <td>0.0</td>\n",
 378 |               "      <td>0.0</td>\n",
 379 |               "      <td>0.0</td>\n",
 380 |               "      <td>0.0</td>\n",
 381 |               "      <td>0.0</td>\n",
 382 |               "      <td>0.0</td>\n",
 383 |               "      <td>0.0</td>\n",
 384 |               "      <td>0.0</td>\n",
 385 |               "      <td>0.0</td>\n",
 386 |               "      <td>0.0</td>\n",
 387 |               "      <td>0.0</td>\n",
 388 |               "      <td>0.0</td>\n",
 389 |               "      <td>0.0</td>\n",
 390 |               "      <td>0.0</td>\n",
 391 |               "      <td>0.0</td>\n",
 392 |               "      <td>0.0</td>\n",
 393 |               "      <td>0.0</td>\n",
 394 |               "      <td>0.0</td>\n",
 395 |               "      <td>0.0</td>\n",
 396 |               "      <td>0.0</td>\n",
 397 |               "      <td>0.0</td>\n",
 398 |               "      <td>0.0</td>\n",
 399 |               "      <td>0.0</td>\n",
 400 |               "      <td>0.0</td>\n",
 401 |               "      <td>0.0</td>\n",
 402 |               "      <td>0.0</td>\n",
 403 |               "      <td>0.0</td>\n",
 404 |               "      <td>0.0</td>\n",
 405 |               "      <td>0.0</td>\n",
 406 |               "      <td>0.0</td>\n",
 407 |               "      <td>0.0</td>\n",
 408 |               "      <td>0.0</td>\n",
 409 |               "      <td>0.0</td>\n",
 410 |               "      <td>0.0</td>\n",
 411 |               "      <td>0.0</td>\n",
 412 |               "      <td>0.0</td>\n",
 413 |               "      <td>0.0</td>\n",
 414 |               "      <td>0.0</td>\n",
 415 |               "      <td>0.0</td>\n",
 416 |               "    </tr>\n",
 417 |               "    <tr>\n",
 418 |               "      <th>3</th>\n",
 419 |               "      <td>B0000000ZW</td>\n",
 420 |               "      <td>Changing Faces</td>\n",
 421 |               "      <td>0.0</td>\n",
 422 |               "      <td>0.0</td>\n",
 423 |               "      <td>0.0</td>\n",
 424 |               "      <td>0.0</td>\n",
 425 |               "      <td>0.0</td>\n",
 426 |               "      <td>0.0</td>\n",
 427 |               "      <td>0.0</td>\n",
 428 |               "      <td>0.0</td>\n",
 429 |               "      <td>0.0</td>\n",
 430 |               "      <td>0.0</td>\n",
 431 |               "      <td>0.0</td>\n",
 432 |               "      <td>0.0</td>\n",
 433 |               "      <td>0.0</td>\n",
 434 |               "      <td>0.0</td>\n",
 435 |               "      <td>0.0</td>\n",
 436 |               "      <td>0.0</td>\n",
 437 |               "      <td>0.0</td>\n",
 438 |               "      <td>0.0</td>\n",
 439 |               "      <td>0.0</td>\n",
 440 |               "      <td>0.0</td>\n",
 441 |               "      <td>0.0</td>\n",
 442 |               "      <td>0.0</td>\n",
 443 |               "      <td>0.0</td>\n",
 444 |               "      <td>0.0</td>\n",
 445 |               "      <td>0.0</td>\n",
 446 |               "      <td>0.0</td>\n",
 447 |               "      <td>0.0</td>\n",
 448 |               "      <td>0.0</td>\n",
 449 |               "      <td>0.0</td>\n",
 450 |               "      <td>0.0</td>\n",
 451 |               "      <td>0.0</td>\n",
 452 |               "      <td>0.0</td>\n",
 453 |               "      <td>0.0</td>\n",
 454 |               "      <td>0.0</td>\n",
 455 |               "      <td>0.0</td>\n",
 456 |               "      <td>0.0</td>\n",
 457 |               "      <td>0.0</td>\n",
 458 |               "      <td>0.0</td>\n",
 459 |               "      <td>...</td>\n",
 460 |               "      <td>0.0</td>\n",
 461 |               "      <td>0.0</td>\n",
 462 |               "      <td>0.0</td>\n",
 463 |               "      <td>0.0</td>\n",
 464 |               "      <td>0.0</td>\n",
 465 |               "      <td>0.0</td>\n",
 466 |               "      <td>0.0</td>\n",
 467 |               "      <td>0.0</td>\n",
 468 |               "      <td>0.0</td>\n",
 469 |               "      <td>0.0</td>\n",
 470 |               "      <td>0.0</td>\n",
 471 |               "      <td>0.0</td>\n",
 472 |               "      <td>0.0</td>\n",
 473 |               "      <td>0.0</td>\n",
 474 |               "      <td>0.0</td>\n",
 475 |               "      <td>0.0</td>\n",
 476 |               "      <td>0.0</td>\n",
 477 |               "      <td>0.0</td>\n",
 478 |               "      <td>0.0</td>\n",
 479 |               "      <td>0.0</td>\n",
 480 |               "      <td>0.0</td>\n",
 481 |               "      <td>0.0</td>\n",
 482 |               "      <td>0.0</td>\n",
 483 |               "      <td>0.0</td>\n",
 484 |               "      <td>0.0</td>\n",
 485 |               "      <td>0.0</td>\n",
 486 |               "      <td>0.0</td>\n",
 487 |               "      <td>0.0</td>\n",
 488 |               "      <td>0.0</td>\n",
 489 |               "      <td>0.0</td>\n",
 490 |               "      <td>0.0</td>\n",
 491 |               "      <td>0.0</td>\n",
 492 |               "      <td>0.0</td>\n",
 493 |               "      <td>0.0</td>\n",
 494 |               "      <td>0.0</td>\n",
 495 |               "      <td>0.0</td>\n",
 496 |               "      <td>0.0</td>\n",
 497 |               "      <td>0.0</td>\n",
 498 |               "      <td>0.0</td>\n",
 499 |               "      <td>0.0</td>\n",
 500 |               "    </tr>\n",
 501 |               "    <tr>\n",
 502 |               "      <th>4</th>\n",
 503 |               "      <td>B00000016W</td>\n",
 504 |               "      <td>Pet Sounds</td>\n",
 505 |               "      <td>0.0</td>\n",
 506 |               "      <td>0.0</td>\n",
 507 |               "      <td>0.0</td>\n",
 508 |               "      <td>0.0</td>\n",
 509 |               "      <td>0.0</td>\n",
 510 |               "      <td>0.0</td>\n",
 511 |               "      <td>0.0</td>\n",
 512 |               "      <td>0.0</td>\n",
 513 |               "      <td>0.0</td>\n",
 514 |               "      <td>0.0</td>\n",
 515 |               "      <td>0.0</td>\n",
 516 |               "      <td>0.0</td>\n",
 517 |               "      <td>0.0</td>\n",
 518 |               "      <td>0.0</td>\n",
 519 |               "      <td>1.0</td>\n",
 520 |               "      <td>0.0</td>\n",
 521 |               "      <td>0.0</td>\n",
 522 |               "      <td>0.0</td>\n",
 523 |               "      <td>0.0</td>\n",
 524 |               "      <td>0.0</td>\n",
 525 |               "      <td>0.0</td>\n",
 526 |               "      <td>0.0</td>\n",
 527 |               "      <td>0.0</td>\n",
 528 |               "      <td>0.0</td>\n",
 529 |               "      <td>0.0</td>\n",
 530 |               "      <td>0.0</td>\n",
 531 |               "      <td>0.0</td>\n",
 532 |               "      <td>0.0</td>\n",
 533 |               "      <td>0.0</td>\n",
 534 |               "      <td>0.0</td>\n",
 535 |               "      <td>0.0</td>\n",
 536 |               "      <td>0.0</td>\n",
 537 |               "      <td>0.0</td>\n",
 538 |               "      <td>0.0</td>\n",
 539 |               "      <td>0.0</td>\n",
 540 |               "      <td>1.0</td>\n",
 541 |               "      <td>0.0</td>\n",
 542 |               "      <td>0.0</td>\n",
 543 |               "      <td>...</td>\n",
 544 |               "      <td>0.0</td>\n",
 545 |               "      <td>0.0</td>\n",
 546 |               "      <td>0.0</td>\n",
 547 |               "      <td>0.0</td>\n",
 548 |               "      <td>0.0</td>\n",
 549 |               "      <td>0.0</td>\n",
 550 |               "      <td>0.0</td>\n",
 551 |               "      <td>0.0</td>\n",
 552 |               "      <td>0.0</td>\n",
 553 |               "      <td>0.0</td>\n",
 554 |               "      <td>0.0</td>\n",
 555 |               "      <td>0.0</td>\n",
 556 |               "      <td>0.0</td>\n",
 557 |               "      <td>0.0</td>\n",
 558 |               "      <td>0.0</td>\n",
 559 |               "      <td>0.0</td>\n",
 560 |               "      <td>0.0</td>\n",
 561 |               "      <td>0.0</td>\n",
 562 |               "      <td>0.0</td>\n",
 563 |               "      <td>0.0</td>\n",
 564 |               "      <td>0.0</td>\n",
 565 |               "      <td>0.0</td>\n",
 566 |               "      <td>0.0</td>\n",
 567 |               "      <td>0.0</td>\n",
 568 |               "      <td>0.0</td>\n",
 569 |               "      <td>0.0</td>\n",
 570 |               "      <td>0.0</td>\n",
 571 |               "      <td>0.0</td>\n",
 572 |               "      <td>0.0</td>\n",
 573 |               "      <td>0.0</td>\n",
 574 |               "      <td>0.0</td>\n",
 575 |               "      <td>0.0</td>\n",
 576 |               "      <td>0.0</td>\n",
 577 |               "      <td>0.0</td>\n",
 578 |               "      <td>0.0</td>\n",
 579 |               "      <td>0.0</td>\n",
 580 |               "      <td>0.0</td>\n",
 581 |               "      <td>0.0</td>\n",
 582 |               "      <td>0.0</td>\n",
 583 |               "      <td>0.0</td>\n",
 584 |               "    </tr>\n",
 585 |               "  </tbody>\n",
 586 |               "</table>\n",
 587 |               "<p>5 rows × 463 columns</p>\n",
 588 |               "</div>"
 589 |             ],
 590 |             "text/plain": [
 591 |               "         asin                 title  ...  World Dance  World Music\n",
 592 |               "0  5555991584       Memory of Trees  ...          0.0          0.0\n",
 593 |               "1  6308051551  Dont Drink His Blood  ...          0.0          0.0\n",
 594 |               "2  7901622466               On Fire  ...          0.0          0.0\n",
 595 |               "3  B0000000ZW        Changing Faces  ...          0.0          0.0\n",
 596 |               "4  B00000016W            Pet Sounds  ...          0.0          0.0\n",
 597 |               "\n",
 598 |               "[5 rows x 463 columns]"
 599 |             ]
 600 |           },
 601 |           "metadata": {
 602 |             "tags": []
 603 |           },
 604 |           "execution_count": 6
 605 |         }
 606 |       ]
 607 |     },
 608 |     {
 609 |       "cell_type": "code",
 610 |       "metadata": {
 611 |         "id": "LA5NCxJC7coB",
 612 |         "colab_type": "code",
 613 |         "colab": {
 614 |           "base_uri": "https://localhost:8080/",
 615 |           "height": 206
 616 |         },
 617 |         "outputId": "875b8583-6db8-4a3d-c351-84acd5b7440a"
 618 |       },
 619 |       "source": [
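 620 |         "new_metadata = metadata.iloc[:,1:]  # drop the 'asin' column, keep title + genre flags\n",
 621 |         "new_metadata = new_metadata.melt(id_vars=[\"title\"])  # wide -> long: one row per (title, genre)\n",
 622 |         "new_metadata = new_metadata[new_metadata.value != 0]  # keep only genres that apply to the title\n",
 623 |         "new_metadata.reset_index(inplace=True, drop=True)\n",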
 624 |         "new_metadata.tail()"
 625 |       ],
 626 |       "execution_count": 31,
 627 |       "outputs": [
 628 |         {
 629 |           "output_type": "execute_result",
 630 |           "data": {
 631 |             "text/html": [
 632 |               "<div>\n",
 633 |               "<style scoped>\n",
 634 |               "    .dataframe tbody tr th:only-of-type {\n",
 635 |               "        vertical-align: middle;\n",
 636 |               "    }\n",
 637 |               "\n",
 638 |               "    .dataframe tbody tr th {\n",
 639 |               "        vertical-align: top;\n",
 640 |               "    }\n",
 641 |               "\n",
 642 |               "    .dataframe thead th {\n",
 643 |               "        text-align: right;\n",
 644 |               "    }\n",
 645 |               "</style>\n",
 646 |               "<table border=\"1\" class=\"dataframe\">\n",
 647 |               "  <thead>\n",
 648 |               "    <tr style=\"text-align: right;\">\n",
 649 |               "      <th></th>\n",
 650 |               "      <th>title</th>\n",
 651 |               "      <th>variable</th>\n",
 652 |               "      <th>value</th>\n",
 653 |               "    </tr>\n",
 654 |               "  </thead>\n",
 655 |               "  <tbody>\n",
 656 |               "    <tr>\n",
 657 |               "      <th>62572</th>\n",
 658 |               "      <td>Lloyd Im Ready to Be Heartbroken</td>\n",
 659 |               "      <td>World Music</td>\n",
 660 |               "      <td>1.0</td>\n",
 661 |               "    </tr>\n",
 662 |               "    <tr>\n",
 663 |               "      <th>62573</th>\n",
 664 |               "      <td>I Sincerely Apologize For All The Trouble Ive ...</td>\n",
 665 |               "      <td>World Music</td>\n",
 666 |               "      <td>1.0</td>\n",
 667 |               "    </tr>\n",
 668 |               "    <tr>\n",
 669 |               "      <th>62574</th>\n",
 670 |               "      <td>Faster Pussycat</td>\n",
 671 |               "      <td>World Music</td>\n",
 672 |               "      <td>1.0</td>\n",
 673 |               "    </tr>\n",
 674 |               "    <tr>\n",
 675 |               "      <th>62575</th>\n",
 676 |               "      <td>Eva Contro Eva</td>\n",
 677 |               "      <td>World Music</td>\n",
 678 |               "      <td>1.0</td>\n",
 679 |               "    </tr>\n",
 680 |               "    <tr>\n",
 681 |               "      <th>62576</th>\n",
 682 |               "      <td>Waters of Nazareth</td>\n",
 683 |               "      <td>World Music</td>\n",
 684 |               "      <td>1.0</td>\n",
 685 |               "    </tr>\n",
 686 |               "  </tbody>\n",
 687 |               "</table>\n",
 688 |               "</div>"
 689 |             ],
 690 |             "text/plain": [
 691 |               "                                                   title     variable  value\n",
 692 |               "62572                   Lloyd Im Ready to Be Heartbroken  World Music    1.0\n",
 693 |               "62573  I Sincerely Apologize For All The Trouble Ive ...  World Music    1.0\n",
 694 |               "62574                                    Faster Pussycat  World Music    1.0\n",
 695 |               "62575                                     Eva Contro Eva  World Music    1.0\n",
 696 |               "62576                                 Waters of Nazareth  World Music    1.0"
 697 |             ]
 698 |           },
 699 |           "metadata": {
 700 |             "tags": []
 701 |           },
 702 |           "execution_count": 31
 703 |         }
 704 |       ]
 705 |     },
 706 |     {
 707 |       "cell_type": "code",
 708 |       "metadata": {
 709 |         "id": "l14vmh1695tf",
 710 |         "colab_type": "code",
 711 |         "colab": {}
 712 |       },
 713 |       "source": [
 714 |         "dict_title = np.load('map_tilte.npy', allow_pickle=True).tolist()  # unwrap the pickled dict from its 0-d object array\n",
 715 |         "inverse_dict_title = {value: int(key) for key, value in dict_title.items()}  # invert: title -> integer item id"
 716 |       ],
 717 |       "execution_count": 0,
 718 |       "outputs": []
 719 |     },
 720 |     {
 721 |       "cell_type": "code",
 722 |       "metadata": {
 723 |         "id": "6LAXPwCj-btQ",
 724 |         "colab_type": "code",
 725 |         "colab": {}
 726 |       },
 727 |       "source": [
 728 |         "new_metadata['asin_id'] = new_metadata['title'].map(inverse_dict_title)  # titles missing from the mapping become NaN"
 729 |       ],
 730 |       "execution_count": 0,
 731 |       "outputs": []
 732 |     },
 733 |     {
 734 |       "cell_type": "code",
 735 |       "metadata": {
 736 |         "id": "TxxEVPuv7SAf",
 737 |         "colab_type": "code",
 738 |         "colab": {
 739 |           "base_uri": "https://localhost:8080/",
 740 |           "height": 1000
 741 |         },
 742 |         "outputId": "9f5f42f9-b4c9-4225-c86e-2df391f0e993"
 743 |       },
 744 |       "source": [
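 745 |         "new_metadata.dropna(inplace=True)  # drop rows with missing values (titles absent from the mapping have NaN ids)\n",
 746 |         "new_metadata = new_metadata[['asin_id', 'variable', 'value']]\n",
 747 |         "new_metadata['asin_id'] = new_metadata.asin_id.astype(int)  # map() produced floats because of the NaNs\n",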
 748 |         "new_metadata"
 749 |       ],
 750 |       "execution_count": 49,
 751 |       "outputs": [
 752 |         {
 753 |           "output_type": "execute_result",
 754 |           "data": {
 755 |             "text/html": [
 756 |               "<div>\n",
 757 |               "<style scoped>\n",
 758 |               "    .dataframe tbody tr th:only-of-type {\n",
 759 |               "        vertical-align: middle;\n",
 760 |               "    }\n",
 761 |               "\n",
 762 |               "    .dataframe tbody tr th {\n",
 763 |               "        vertical-align: top;\n",
 764 |               "    }\n",
 765 |               "\n",
 766 |               "    .dataframe thead th {\n",
 767 |               "        text-align: right;\n",
 768 |               "    }\n",
 769 |               "</style>\n",
 770 |               "<table border=\"1\" class=\"dataframe\">\n",
 771 |               "  <thead>\n",
 772 |               "    <tr style=\"text-align: right;\">\n",
 773 |               "      <th></th>\n",
 774 |               "      <th>asin_id</th>\n",
 775 |               "      <th>variable</th>\n",
 776 |               "      <th>value</th>\n",
 777 |               "    </tr>\n",
 778 |               "  </thead>\n",
 779 |               "  <tbody>\n",
 780 |               "    <tr>\n",
 781 |               "      <th>6</th>\n",
 782 |               "      <td>132</td>\n",
 783 |               "      <td>Acid Jazz</td>\n",
 784 |               "      <td>1.0</td>\n",
 785 |               "    </tr>\n",
 786 |               "    <tr>\n",
 787 |               "      <th>8</th>\n",
 788 |               "      <td>243</td>\n",
 789 |               "      <td>Acid Jazz</td>\n",
 790 |               "      <td>1.0</td>\n",
 791 |               "    </tr>\n",
 792 |               "    <tr>\n",
 793 |               "      <th>11</th>\n",
 794 |               "      <td>545</td>\n",
 795 |               "      <td>Acid Jazz</td>\n",
 796 |               "      <td>1.0</td>\n",
 797 |               "    </tr>\n",
 798 |               "    <tr>\n",
 799 |               "      <th>12</th>\n",
 800 |               "      <td>587</td>\n",
 801 |               "      <td>Acid Jazz</td>\n",
 802 |               "      <td>1.0</td>\n",
 803 |               "    </tr>\n",
 804 |               "    <tr>\n",
 805 |               "      <th>13</th>\n",
 806 |               "      <td>601</td>\n",
 807 |               "      <td>Acid Jazz</td>\n",
 808 |               "      <td>1.0</td>\n",
 809 |               "    </tr>\n",
 810 |               "    <tr>\n",
 811 |               "      <th>24</th>\n",
 812 |               "      <td>1255</td>\n",
 813 |               "      <td>Acid Jazz</td>\n",
 814 |               "      <td>1.0</td>\n",
 815 |               "    </tr>\n",
 816 |               "    <tr>\n",
 817 |               "      <th>26</th>\n",
 818 |               "      <td>1258</td>\n",
 819 |               "      <td>Acid Jazz</td>\n",
 820 |               "      <td>1.0</td>\n",
 821 |               "    </tr>\n",
 822 |               "    <tr>\n",
 823 |               "      <th>29</th>\n",
 824 |               "      <td>1907</td>\n",
 825 |               "      <td>Acid Jazz</td>\n",
 826 |               "      <td>1.0</td>\n",
 827 |               "    </tr>\n",
 828 |               "    <tr>\n",
 829 |               "      <th>30</th>\n",
 830 |               "      <td>2048</td>\n",
 831 |               "      <td>Acid Jazz</td>\n",
 832 |               "      <td>1.0</td>\n",
 833 |               "    </tr>\n",
 834 |               "    <tr>\n",
 835 |               "      <th>33</th>\n",
 836 |               "      <td>1282</td>\n",
 837 |               "      <td>Acid Jazz</td>\n",
 838 |               "      <td>1.0</td>\n",
 839 |               "    </tr>\n",
 840 |               "    <tr>\n",
 841 |               "      <th>42</th>\n",
 842 |               "      <td>2149</td>\n",
 843 |               "      <td>Acid Jazz</td>\n",
 844 |               "      <td>1.0</td>\n",
 845 |               "    </tr>\n",
 846 |               "    <tr>\n",
 847 |               "      <th>43</th>\n",
 848 |               "      <td>2189</td>\n",
 849 |               "      <td>Acid Jazz</td>\n",
 850 |               "      <td>1.0</td>\n",
 851 |               "    </tr>\n",
 852 |               "    <tr>\n",
 853 |               "      <th>48</th>\n",
 854 |               "      <td>1487</td>\n",
 855 |               "      <td>Acid Jazz</td>\n",
 856 |               "      <td>1.0</td>\n",
 857 |               "    </tr>\n",
 858 |               "    <tr>\n",
 859 |               "      <th>55</th>\n",
 860 |               "      <td>1607</td>\n",
 861 |               "      <td>Acid Jazz</td>\n",
 862 |               "      <td>1.0</td>\n",
 863 |               "    </tr>\n",
 864 |               "    <tr>\n",
 865 |               "      <th>56</th>\n",
 866 |               "      <td>1620</td>\n",
 867 |               "      <td>Acid Jazz</td>\n",
 868 |               "      <td>1.0</td>\n",
 869 |               "    </tr>\n",
 870 |               "    <tr>\n",
 871 |               "      <th>61</th>\n",
 872 |               "      <td>1724</td>\n",
 873 |               "      <td>Acid Jazz</td>\n",
 874 |               "      <td>1.0</td>\n",
 875 |               "    </tr>\n",
 876 |               "    <tr>\n",
 877 |               "      <th>66</th>\n",
 878 |               "      <td>1837</td>\n",
 879 |               "      <td>Acid Jazz</td>\n",
 880 |               "      <td>1.0</td>\n",
 881 |               "    </tr>\n",
 882 |               "    <tr>\n",
 883 |               "      <th>75</th>\n",
 884 |               "      <td>1953</td>\n",
 885 |               "      <td>Acid Jazz</td>\n",
 886 |               "      <td>1.0</td>\n",
 887 |               "    </tr>\n",
 888 |               "    <tr>\n",
 889 |               "      <th>78</th>\n",
 890 |               "      <td>1999</td>\n",
 891 |               "      <td>Acid Jazz</td>\n",
 892 |               "      <td>1.0</td>\n",
 893 |               "    </tr>\n",
 894 |               "    <tr>\n",
 895 |               "      <th>79</th>\n",
 896 |               "      <td>2021</td>\n",
 897 |               "      <td>Acid Jazz</td>\n",
 898 |               "      <td>1.0</td>\n",
 899 |               "    </tr>\n",
 900 |               "    <tr>\n",
 901 |               "      <th>80</th>\n",
 902 |               "      <td>2028</td>\n",
 903 |               "      <td>Acid Jazz</td>\n",
 904 |               "      <td>1.0</td>\n",
 905 |               "    </tr>\n",
 906 |               "    <tr>\n",
 907 |               "      <th>90</th>\n",
 908 |               "      <td>2327</td>\n",
 909 |               "      <td>Acid Jazz</td>\n",
 910 |               "      <td>1.0</td>\n",
 911 |               "    </tr>\n",
 912 |               "    <tr>\n",
 913 |               "      <th>91</th>\n",
 914 |               "      <td>2364</td>\n",
 915 |               "      <td>Acid Jazz</td>\n",
 916 |               "      <td>1.0</td>\n",
 917 |               "    </tr>\n",
 918 |               "    <tr>\n",
 919 |               "      <th>94</th>\n",
 920 |               "      <td>2442</td>\n",
 921 |               "      <td>Acid Jazz</td>\n",
 922 |               "      <td>1.0</td>\n",
 923 |               "    </tr>\n",
 924 |               "    <tr>\n",
 925 |               "      <th>95</th>\n",
 926 |               "      <td>1126</td>\n",
 927 |               "      <td>Acid Jazz</td>\n",
 928 |               "      <td>1.0</td>\n",
 929 |               "    </tr>\n",
 930 |               "    <tr>\n",
 931 |               "      <th>99</th>\n",
 932 |               "      <td>1840</td>\n",
 933 |               "      <td>Acoustic Blues</td>\n",
 934 |               "      <td>1.0</td>\n",
 935 |               "    </tr>\n",
 936 |               "    <tr>\n",
 937 |               "      <th>106</th>\n",
 938 |               "      <td>167</td>\n",
 939 |               "      <td>Acoustic Blues</td>\n",
 940 |               "      <td>1.0</td>\n",
 941 |               "    </tr>\n",
 942 |               "    <tr>\n",
 943 |               "      <th>110</th>\n",
 944 |               "      <td>562</td>\n",
 945 |               "      <td>Acoustic Blues</td>\n",
 946 |               "      <td>1.0</td>\n",
 947 |               "    </tr>\n",
 948 |               "    <tr>\n",
 949 |               "      <th>111</th>\n",
 950 |               "      <td>570</td>\n",
 951 |               "      <td>Acoustic Blues</td>\n",
 952 |               "      <td>1.0</td>\n",
 953 |               "    </tr>\n",
 954 |               "    <tr>\n",
 955 |               "      <th>112</th>\n",
 956 |               "      <td>574</td>\n",
 957 |               "      <td>Acoustic Blues</td>\n",
 958 |               "      <td>1.0</td>\n",
 959 |               "    </tr>\n",
 960 |               "    <tr>\n",
 961 |               "      <th>...</th>\n",
 962 |               "      <td>...</td>\n",
 963 |               "      <td>...</td>\n",
 964 |               "      <td>...</td>\n",
 965 |               "    </tr>\n",
 966 |               "    <tr>\n",
 967 |               "      <th>62448</th>\n",
 968 |               "      <td>2533</td>\n",
 969 |               "      <td>World Music</td>\n",
 970 |               "      <td>1.0</td>\n",
 971 |               "    </tr>\n",
 972 |               "    <tr>\n",
 973 |               "      <th>62451</th>\n",
 974 |               "      <td>2394</td>\n",
 975 |               "      <td>World Music</td>\n",
 976 |               "      <td>1.0</td>\n",
 977 |               "    </tr>\n",
 978 |               "    <tr>\n",
 979 |               "      <th>62452</th>\n",
 980 |               "      <td>2515</td>\n",
 981 |               "      <td>World Music</td>\n",
 982 |               "      <td>1.0</td>\n",
 983 |               "    </tr>\n",
 984 |               "    <tr>\n",
 985 |               "      <th>62454</th>\n",
 986 |               "      <td>2401</td>\n",
 987 |               "      <td>World Music</td>\n",
 988 |               "      <td>1.0</td>\n",
 989 |               "    </tr>\n",
 990 |               "    <tr>\n",
 991 |               "      <th>62455</th>\n",
 992 |               "      <td>2515</td>\n",
 993 |               "      <td>World Music</td>\n",
 994 |               "      <td>1.0</td>\n",
 995 |               "    </tr>\n",
 996 |               "    <tr>\n",
 997 |               "      <th>62462</th>\n",
 998 |               "      <td>1850</td>\n",
 999 |               "      <td>World Music</td>\n",
1000 |               "      <td>1.0</td>\n",
1001 |               "    </tr>\n",
1002 |               "    <tr>\n",
1003 |               "      <th>62474</th>\n",
1004 |               "      <td>2515</td>\n",
1005 |               "      <td>World Music</td>\n",
1006 |               "      <td>1.0</td>\n",
1007 |               "    </tr>\n",
1008 |               "    <tr>\n",
1009 |               "      <th>62478</th>\n",
1010 |               "      <td>2427</td>\n",
1011 |               "      <td>World Music</td>\n",
1012 |               "      <td>1.0</td>\n",
1013 |               "    </tr>\n",
1014 |               "    <tr>\n",
1015 |               "      <th>62482</th>\n",
1016 |               "      <td>2432</td>\n",
1017 |               "      <td>World Music</td>\n",
1018 |               "      <td>1.0</td>\n",
1019 |               "    </tr>\n",
1020 |               "    <tr>\n",
1021 |               "      <th>62486</th>\n",
1022 |               "      <td>2438</td>\n",
1023 |               "      <td>World Music</td>\n",
1024 |               "      <td>1.0</td>\n",
1025 |               "    </tr>\n",
1026 |               "    <tr>\n",
1027 |               "      <th>62487</th>\n",
1028 |               "      <td>2441</td>\n",
1029 |               "      <td>World Music</td>\n",
1030 |               "      <td>1.0</td>\n",
1031 |               "    </tr>\n",
1032 |               "    <tr>\n",
1033 |               "      <th>62491</th>\n",
1034 |               "      <td>2530</td>\n",
1035 |               "      <td>World Music</td>\n",
1036 |               "      <td>1.0</td>\n",
1037 |               "    </tr>\n",
1038 |               "    <tr>\n",
1039 |               "      <th>62492</th>\n",
1040 |               "      <td>2451</td>\n",
1041 |               "      <td>World Music</td>\n",
1042 |               "      <td>1.0</td>\n",
1043 |               "    </tr>\n",
1044 |               "    <tr>\n",
1045 |               "      <th>62494</th>\n",
1046 |               "      <td>1907</td>\n",
1047 |               "      <td>World Music</td>\n",
1048 |               "      <td>1.0</td>\n",
1049 |               "    </tr>\n",
1050 |               "    <tr>\n",
1051 |               "      <th>62506</th>\n",
1052 |               "      <td>2471</td>\n",
1053 |               "      <td>World Music</td>\n",
1054 |               "      <td>1.0</td>\n",
1055 |               "    </tr>\n",
1056 |               "    <tr>\n",
1057 |               "      <th>62509</th>\n",
1058 |               "      <td>2512</td>\n",
1059 |               "      <td>World Music</td>\n",
1060 |               "      <td>1.0</td>\n",
1061 |               "    </tr>\n",
1062 |               "    <tr>\n",
1063 |               "      <th>62513</th>\n",
1064 |               "      <td>2479</td>\n",
1065 |               "      <td>World Music</td>\n",
1066 |               "      <td>1.0</td>\n",
1067 |               "    </tr>\n",
1068 |               "    <tr>\n",
1069 |               "      <th>62519</th>\n",
1070 |               "      <td>2490</td>\n",
1071 |               "      <td>World Music</td>\n",
1072 |               "      <td>1.0</td>\n",
1073 |               "    </tr>\n",
1074 |               "    <tr>\n",
1075 |               "      <th>62524</th>\n",
1076 |               "      <td>2498</td>\n",
1077 |               "      <td>World Music</td>\n",
1078 |               "      <td>1.0</td>\n",
1079 |               "    </tr>\n",
1080 |               "    <tr>\n",
1081 |               "      <th>62525</th>\n",
1082 |               "      <td>2499</td>\n",
1083 |               "      <td>World Music</td>\n",
1084 |               "      <td>1.0</td>\n",
1085 |               "    </tr>\n",
1086 |               "    <tr>\n",
1087 |               "      <th>62526</th>\n",
1088 |               "      <td>2502</td>\n",
1089 |               "      <td>World Music</td>\n",
1090 |               "      <td>1.0</td>\n",
1091 |               "    </tr>\n",
1092 |               "    <tr>\n",
1093 |               "      <th>62527</th>\n",
1094 |               "      <td>2504</td>\n",
1095 |               "      <td>World Music</td>\n",
1096 |               "      <td>1.0</td>\n",
1097 |               "    </tr>\n",
1098 |               "    <tr>\n",
1099 |               "      <th>62528</th>\n",
1100 |               "      <td>2509</td>\n",
1101 |               "      <td>World Music</td>\n",
1102 |               "      <td>1.0</td>\n",
1103 |               "    </tr>\n",
1104 |               "    <tr>\n",
1105 |               "      <th>62533</th>\n",
1106 |               "      <td>2521</td>\n",
1107 |               "      <td>World Music</td>\n",
1108 |               "      <td>1.0</td>\n",
1109 |               "    </tr>\n",
1110 |               "    <tr>\n",
1111 |               "      <th>62537</th>\n",
1112 |               "      <td>2526</td>\n",
1113 |               "      <td>World Music</td>\n",
1114 |               "      <td>1.0</td>\n",
1115 |               "    </tr>\n",
1116 |               "    <tr>\n",
1117 |               "      <th>62553</th>\n",
1118 |               "      <td>2541</td>\n",
1119 |               "      <td>World Music</td>\n",
1120 |               "      <td>1.0</td>\n",
1121 |               "    </tr>\n",
1122 |               "    <tr>\n",
1123 |               "      <th>62561</th>\n",
1124 |               "      <td>2550</td>\n",
1125 |               "      <td>World Music</td>\n",
1126 |               "      <td>1.0</td>\n",
1127 |               "    </tr>\n",
1128 |               "    <tr>\n",
1129 |               "      <th>62562</th>\n",
1130 |               "      <td>2555</td>\n",
1131 |               "      <td>World Music</td>\n",
1132 |               "      <td>1.0</td>\n",
1133 |               "    </tr>\n",
1134 |               "    <tr>\n",
1135 |               "      <th>62563</th>\n",
1136 |               "      <td>2555</td>\n",
1137 |               "      <td>World Music</td>\n",
1138 |               "      <td>1.0</td>\n",
1139 |               "    </tr>\n",
1140 |               "    <tr>\n",
1141 |               "      <th>62566</th>\n",
1142 |               "      <td>2562</td>\n",
1143 |               "      <td>World Music</td>\n",
1144 |               "      <td>1.0</td>\n",
1145 |               "    </tr>\n",
1146 |               "  </tbody>\n",
1147 |               "</table>\n",
1148 |               "<p>26048 rows × 3 columns</p>\n",
1149 |               "</div>"
1150 |             ],
1151 |             "text/plain": [
1152 |               "       asin_id        variable  value\n",
1153 |               "6          132       Acid Jazz    1.0\n",
1154 |               "8          243       Acid Jazz    1.0\n",
1155 |               "11         545       Acid Jazz    1.0\n",
1156 |               "12         587       Acid Jazz    1.0\n",
1157 |               "13         601       Acid Jazz    1.0\n",
1158 |               "24        1255       Acid Jazz    1.0\n",
1159 |               "26        1258       Acid Jazz    1.0\n",
1160 |               "29        1907       Acid Jazz    1.0\n",
1161 |               "30        2048       Acid Jazz    1.0\n",
1162 |               "33        1282       Acid Jazz    1.0\n",
1163 |               "42        2149       Acid Jazz    1.0\n",
1164 |               "43        2189       Acid Jazz    1.0\n",
1165 |               "48        1487       Acid Jazz    1.0\n",
1166 |               "55        1607       Acid Jazz    1.0\n",
1167 |               "56        1620       Acid Jazz    1.0\n",
1168 |               "61        1724       Acid Jazz    1.0\n",
1169 |               "66        1837       Acid Jazz    1.0\n",
1170 |               "75        1953       Acid Jazz    1.0\n",
1171 |               "78        1999       Acid Jazz    1.0\n",
1172 |               "79        2021       Acid Jazz    1.0\n",
1173 |               "80        2028       Acid Jazz    1.0\n",
1174 |               "90        2327       Acid Jazz    1.0\n",
1175 |               "91        2364       Acid Jazz    1.0\n",
1176 |               "94        2442       Acid Jazz    1.0\n",
1177 |               "95        1126       Acid Jazz    1.0\n",
1178 |               "99        1840  Acoustic Blues    1.0\n",
1179 |               "106        167  Acoustic Blues    1.0\n",
1180 |               "110        562  Acoustic Blues    1.0\n",
1181 |               "111        570  Acoustic Blues    1.0\n",
1182 |               "112        574  Acoustic Blues    1.0\n",
1183 |               "...        ...             ...    ...\n",
1184 |               "62448     2533     World Music    1.0\n",
1185 |               "62451     2394     World Music    1.0\n",
1186 |               "62452     2515     World Music    1.0\n",
1187 |               "62454     2401     World Music    1.0\n",
1188 |               "62455     2515     World Music    1.0\n",
1189 |               "62462     1850     World Music    1.0\n",
1190 |               "62474     2515     World Music    1.0\n",
1191 |               "62478     2427     World Music    1.0\n",
1192 |               "62482     2432     World Music    1.0\n",
1193 |               "62486     2438     World Music    1.0\n",
1194 |               "62487     2441     World Music    1.0\n",
1195 |               "62491     2530     World Music    1.0\n",
1196 |               "62492     2451     World Music    1.0\n",
1197 |               "62494     1907     World Music    1.0\n",
1198 |               "62506     2471     World Music    1.0\n",
1199 |               "62509     2512     World Music    1.0\n",
1200 |               "62513     2479     World Music    1.0\n",
1201 |               "62519     2490     World Music    1.0\n",
1202 |               "62524     2498     World Music    1.0\n",
1203 |               "62525     2499     World Music    1.0\n",
1204 |               "62526     2502     World Music    1.0\n",
1205 |               "62527     2504     World Music    1.0\n",
1206 |               "62528     2509     World Music    1.0\n",
1207 |               "62533     2521     World Music    1.0\n",
1208 |               "62537     2526     World Music    1.0\n",
1209 |               "62553     2541     World Music    1.0\n",
1210 |               "62561     2550     World Music    1.0\n",
1211 |               "62562     2555     World Music    1.0\n",
1212 |               "62563     2555     World Music    1.0\n",
1213 |               "62566     2562     World Music    1.0\n",
1214 |               "\n",
1215 |               "[26048 rows x 3 columns]"
1216 |             ]
1217 |           },
1218 |           "metadata": {
1219 |             "tags": []
1220 |           },
1221 |           "execution_count": 49
1222 |         }
1223 |       ]
1224 |     },
1225 |     {
1226 |       "cell_type": "code",
1227 |       "metadata": {
1228 |         "id": "j4YNe3p39Oj5",
1229 |         "colab_type": "code",
1230 |         "colab": {}
1231 |       },
1232 |       "source": [
1233 |         "new_metadata.to_csv('items_metadata.dat', index=False, sep='\\t', header=False)"
1234 |       ],
1235 |       "execution_count": 0,
1236 |       "outputs": []
1237 |     },
1238 |     {
1239 |       "cell_type": "markdown",
1240 |       "metadata": {
1241 |         "id": "GRSYW3CY_r8T",
1242 |         "colab_type": "text"
1243 |       },
1244 |       "source": [
1245 |         "### Case Recommender\n"
1246 |       ]
1247 |     },
1248 |     {
1249 |       "cell_type": "code",
1250 |       "metadata": {
1251 |         "id": "rYyrosGg_q82",
1252 |         "colab_type": "code",
1253 |         "colab": {}
1254 |       },
1255 |       "source": [
1256 |         "from caserec.recommenders.rating_prediction.item_attribute_knn import ItemAttributeKNN"
1257 |       ],
1258 |       "execution_count": 0,
1259 |       "outputs": []
1260 |     },
1261 |     {
1262 |       "cell_type": "code",
1263 |       "metadata": {
1264 |         "id": "6itklNgN_8eh",
1265 |         "colab_type": "code",
1266 |         "colab": {
1267 |           "base_uri": "https://localhost:8080/",
1268 |           "height": 173
1269 |         },
1270 |         "outputId": "42f96b5c-6251-4311-a683-5c5e72e69828"
1271 |       },
1272 |       "source": [
1273 |         "ItemAttributeKNN('train.dat', 'test.dat', metadata_file='items_metadata.dat', as_similar_first=True).compute()"
1274 |       ],
1275 |       "execution_count": 56,
1276 |       "outputs": [
1277 |         {
1278 |           "output_type": "stream",
1279 |           "text": [
1280 |             "[Case Recommender: Rating Prediction > Item Attribute KNN Algorithm]\n",
1281 |             "\n",
1282 |             "train data:: 5036 users and 2581 items (34703 interactions) | sparsity:: 99.73%\n",
1283 |             "test data:: 4508 users and 2493 items (17093 interactions) | sparsity:: 99.85%\n",
1284 |             "\n",
1285 |             "training_time:: 10.775388 sec\n",
1286 |             ">> metadata:: 2521 items and 292 metadata (26048 interactions) | sparsity:: 96.46%\n",
1287 |             "prediction_time:: 0.544531 sec\n",
1288 |             "Eval:: MAE: 0.698327 RMSE: 0.984717 \n"
1289 |           ],
1290 |           "name": "stdout"
1291 |         }
1292 |       ]
1293 |     }
1294 |   ]
1295 | }


--------------------------------------------------------------------------------
/Processed Datasets/FC.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |   "nbformat": 4,
  3 |   "nbformat_minor": 0,
  4 |   "metadata": {
  5 |     "colab": {
  6 |       "name": "FC.ipynb",
  7 |       "version": "0.3.2",
  8 |       "provenance": []
  9 |     },
 10 |     "kernelspec": {
 11 |       "name": "python3",
 12 |       "display_name": "Python 3"
 13 |     }
 14 |   },
 15 |   "cells": [
 16 |     {
 17 |       "cell_type": "code",
 18 |       "metadata": {
 19 |         "id": "BwBeOhCaERjC",
 20 |         "colab_type": "code",
 21 |         "colab": {}
 22 |       },
 23 |       "source": [
 24 |         "import numpy as np\n",
 25 |         "import pandas as pd"
 26 |       ],
 27 |       "execution_count": 0,
 28 |       "outputs": []
 29 |     },
 30 |     {
 31 |       "cell_type": "code",
 32 |       "metadata": {
 33 |         "id": "kBWUYelzE4R1",
 34 |         "colab_type": "code",
 35 |         "colab": {}
 36 |       },
 37 |       "source": [
 38 |         "title_dict = np.load('map_tilte.npy', allow_pickle=True).tolist()"
 39 |       ],
 40 |       "execution_count": 0,
 41 |       "outputs": []
 42 |     },
 43 |     {
 44 |       "cell_type": "code",
 45 |       "metadata": {
 46 |         "id": "lfGR9oltFwcf",
 47 |         "colab_type": "code",
 48 |         "colab": {
 49 |           "base_uri": "https://localhost:8080/",
 50 |           "height": 142
 51 |         },
 52 |         "outputId": "bb5a5368-adde-4b9c-fb6e-69e49b4e6f76"
 53 |       },
 54 |       "source": [
 55 |         "test = pd.read_csv('test.dat', sep='\\t', names=['reviewerID', 'asin', 'rate', 'title'])\n",
 56 |         "test[test.reviewerID == 0]"
 57 |       ],
 58 |       "execution_count": 18,
 59 |       "outputs": [
 60 |         {
 61 |           "output_type": "execute_result",
 62 |           "data": {
 63 |             "text/html": [
 64 |               "<div>\n",
 65 |               "<style scoped>\n",
 66 |               "    .dataframe tbody tr th:only-of-type {\n",
 67 |               "        vertical-align: middle;\n",
 68 |               "    }\n",
 69 |               "\n",
 70 |               "    .dataframe tbody tr th {\n",
 71 |               "        vertical-align: top;\n",
 72 |               "    }\n",
 73 |               "\n",
 74 |               "    .dataframe thead th {\n",
 75 |               "        text-align: right;\n",
 76 |               "    }\n",
 77 |               "</style>\n",
 78 |               "<table border=\"1\" class=\"dataframe\">\n",
 79 |               "  <thead>\n",
 80 |               "    <tr style=\"text-align: right;\">\n",
 81 |               "      <th></th>\n",
 82 |               "      <th>reviewerID</th>\n",
 83 |               "      <th>asin</th>\n",
 84 |               "      <th>rate</th>\n",
 85 |               "      <th>title</th>\n",
 86 |               "    </tr>\n",
 87 |               "  </thead>\n",
 88 |               "  <tbody>\n",
 89 |               "    <tr>\n",
 90 |               "      <th>11196</th>\n",
 91 |               "      <td>0</td>\n",
 92 |               "      <td>2471</td>\n",
 93 |               "      <td>2</td>\n",
 94 |               "      <td>Jagged Little Pill Acoustic</td>\n",
 95 |               "    </tr>\n",
 96 |               "    <tr>\n",
 97 |               "      <th>14517</th>\n",
 98 |               "      <td>0</td>\n",
 99 |               "      <td>1978</td>\n",
100 |               "      <td>5</td>\n",
101 |               "      <td>La Revancha Del Tango</td>\n",
102 |               "    </tr>\n",
103 |               "    <tr>\n",
104 |               "      <th>16513</th>\n",
105 |               "      <td>0</td>\n",
106 |               "      <td>0</td>\n",
107 |               "      <td>5</td>\n",
108 |               "      <td>Memory of Trees</td>\n",
109 |               "    </tr>\n",
110 |               "  </tbody>\n",
111 |               "</table>\n",
112 |               "</div>"
113 |             ],
114 |             "text/plain": [
115 |               "       reviewerID  asin  rate                        title\n",
116 |               "11196           0  2471     2  Jagged Little Pill Acoustic\n",
117 |               "14517           0  1978     5        La Revancha Del Tango\n",
118 |               "16513           0     0     5              Memory of Trees"
119 |             ]
120 |           },
121 |           "metadata": {
122 |             "tags": []
123 |           },
124 |           "execution_count": 18
125 |         }
126 |       ]
127 |     },
128 |     {
129 |       "cell_type": "markdown",
130 |       "metadata": {
131 |         "id": "PG2-igcjGgzs",
132 |         "colab_type": "text"
133 |       },
134 |       "source": [
135 |         "## Memory-based"
136 |       ]
137 |     },
138 |     {
139 |       "cell_type": "code",
140 |       "metadata": {
141 |         "id": "gnQKrVXnDurV",
142 |         "colab_type": "code",
143 |         "colab": {
144 |           "base_uri": "https://localhost:8080/",
145 |           "height": 153
146 |         },
147 |         "outputId": "a9098c97-03b2-4c74-d2d1-6cba9b7594eb"
148 |       },
149 |       "source": [
150 |         "from caserec.recommenders.rating_prediction.itemknn import ItemKNN\n",
151 |         "\n",
152 |         "ItemKNN('train.dat', 'test.dat', 'rp_iknn.dat').compute()"
153 |       ],
154 |       "execution_count": 9,
155 |       "outputs": [
156 |         {
157 |           "output_type": "stream",
158 |           "text": [
159 |             "[Case Recommender: Rating Prediction > ItemKNN Algorithm]\n",
160 |             "\n",
161 |             "train data:: 5036 users and 2581 items (34703 interactions) | sparsity:: 99.73%\n",
162 |             "test data:: 4508 users and 2493 items (17093 interactions) | sparsity:: 99.85%\n",
163 |             "\n",
164 |             "training_time:: 10.051786 sec\n",
165 |             "prediction_time:: 0.558465 sec\n",
166 |             "Eval:: MAE: 0.710864 RMSE: 1.104636 \n"
167 |           ],
168 |           "name": "stdout"
169 |         }
170 |       ]
171 |     },
172 |     {
173 |       "cell_type": "code",
174 |       "metadata": {
175 |         "id": "L1EoGufqFHJp",
176 |         "colab_type": "code",
177 |         "colab": {
178 |           "base_uri": "https://localhost:8080/",
179 |           "height": 142
180 |         },
181 |         "outputId": "9fcc6715-9345-4c8e-843d-df3a81a10553"
182 |       },
183 |       "source": [
184 |         "predictions = pd.read_csv('rp_iknn.dat', sep='\\t', names=['reviewerID', 'asin', 'rate'])\n",
185 |         "predictions['title'] = predictions.asin.map(title_dict)\n",
186 |         "predictions.head(3)"
187 |       ],
188 |       "execution_count": 15,
189 |       "outputs": [
190 |         {
191 |           "output_type": "execute_result",
192 |           "data": {
193 |             "text/html": [
194 |               "<div>\n",
195 |               "<style scoped>\n",
196 |               "    .dataframe tbody tr th:only-of-type {\n",
197 |               "        vertical-align: middle;\n",
198 |               "    }\n",
199 |               "\n",
200 |               "    .dataframe tbody tr th {\n",
201 |               "        vertical-align: top;\n",
202 |               "    }\n",
203 |               "\n",
204 |               "    .dataframe thead th {\n",
205 |               "        text-align: right;\n",
206 |               "    }\n",
207 |               "</style>\n",
208 |               "<table border=\"1\" class=\"dataframe\">\n",
209 |               "  <thead>\n",
210 |               "    <tr style=\"text-align: right;\">\n",
211 |               "      <th></th>\n",
212 |               "      <th>reviewerID</th>\n",
213 |               "      <th>asin</th>\n",
214 |               "      <th>rate</th>\n",
215 |               "      <th>title</th>\n",
216 |               "    </tr>\n",
217 |               "  </thead>\n",
218 |               "  <tbody>\n",
219 |               "    <tr>\n",
220 |               "      <th>0</th>\n",
221 |               "      <td>0</td>\n",
222 |               "      <td>0</td>\n",
223 |               "      <td>2.705333</td>\n",
224 |               "      <td>Memory of Trees</td>\n",
225 |               "    </tr>\n",
226 |               "    <tr>\n",
227 |               "      <th>1</th>\n",
228 |               "      <td>0</td>\n",
229 |               "      <td>1978</td>\n",
230 |               "      <td>4.839800</td>\n",
231 |               "      <td>La Revancha Del Tango</td>\n",
232 |               "    </tr>\n",
233 |               "    <tr>\n",
234 |               "      <th>2</th>\n",
235 |               "      <td>0</td>\n",
236 |               "      <td>2471</td>\n",
237 |               "      <td>4.478565</td>\n",
238 |               "      <td>Jagged Little Pill Acoustic</td>\n",
239 |               "    </tr>\n",
240 |               "  </tbody>\n",
241 |               "</table>\n",
242 |               "</div>"
243 |             ],
244 |             "text/plain": [
245 |               "   reviewerID  asin      rate                        title\n",
246 |               "0           0     0  2.705333              Memory of Trees\n",
247 |               "1           0  1978  4.839800        La Revancha Del Tango\n",
248 |               "2           0  2471  4.478565  Jagged Little Pill Acoustic"
249 |             ]
250 |           },
251 |           "metadata": {
252 |             "tags": []
253 |           },
254 |           "execution_count": 15
255 |         }
256 |       ]
257 |     },
258 |     {
259 |       "cell_type": "code",
260 |       "metadata": {
261 |         "id": "dqOhsX0XEIG_",
262 |         "colab_type": "code",
263 |         "colab": {
264 |           "base_uri": "https://localhost:8080/",
265 |           "height": 153
266 |         },
267 |         "outputId": "10857605-fd20-41d2-a412-056e83d8e805"
268 |       },
269 |       "source": [
270 |         "from caserec.recommenders.rating_prediction.userknn import UserKNN\n",
271 |         "\n",
272 |         "UserKNN('train.dat', 'test.dat', 'rp_uknn.dat').compute()"
273 |       ],
274 |       "execution_count": 13,
275 |       "outputs": [
276 |         {
277 |           "output_type": "stream",
278 |           "text": [
279 |             "[Case Recommender: Rating Prediction > UserKNN Algorithm]\n",
280 |             "\n",
281 |             "train data:: 5036 users and 2581 items (34703 interactions) | sparsity:: 99.73%\n",
282 |             "test data:: 4508 users and 2493 items (17093 interactions) | sparsity:: 99.85%\n",
283 |             "\n",
284 |             "training_time:: 9.999057 sec\n",
285 |             "prediction_time:: 3.507684 sec\n",
286 |             "Eval:: MAE: 0.687115 RMSE: 1.008135 \n"
287 |           ],
288 |           "name": "stdout"
289 |         }
290 |       ]
291 |     },
292 |     {
293 |       "cell_type": "code",
294 |       "metadata": {
295 |         "id": "fG8zLJsFFVN9",
296 |         "colab_type": "code",
297 |         "colab": {
298 |           "base_uri": "https://localhost:8080/",
299 |           "height": 142
300 |         },
301 |         "outputId": "71f85886-4531-4851-e97a-a3bb4d04118f"
302 |       },
303 |       "source": [
304 |         "predictions = pd.read_csv('rp_uknn.dat', sep='\\t', names=['reviewerID', 'asin', 'rate'])\n",
305 |         "predictions['title'] = predictions.asin.map(title_dict)\n",
306 |         "predictions.head(3)"
307 |       ],
308 |       "execution_count": 16,
309 |       "outputs": [
310 |         {
311 |           "output_type": "execute_result",
312 |           "data": {
313 |             "text/html": [
314 |               "<div>\n",
315 |               "<style scoped>\n",
316 |               "    .dataframe tbody tr th:only-of-type {\n",
317 |               "        vertical-align: middle;\n",
318 |               "    }\n",
319 |               "\n",
320 |               "    .dataframe tbody tr th {\n",
321 |               "        vertical-align: top;\n",
322 |               "    }\n",
323 |               "\n",
324 |               "    .dataframe thead th {\n",
325 |               "        text-align: right;\n",
326 |               "    }\n",
327 |               "</style>\n",
328 |               "<table border=\"1\" class=\"dataframe\">\n",
329 |               "  <thead>\n",
330 |               "    <tr style=\"text-align: right;\">\n",
331 |               "      <th></th>\n",
332 |               "      <th>reviewerID</th>\n",
333 |               "      <th>asin</th>\n",
334 |               "      <th>rate</th>\n",
335 |               "      <th>title</th>\n",
336 |               "    </tr>\n",
337 |               "  </thead>\n",
338 |               "  <tbody>\n",
339 |               "    <tr>\n",
340 |               "      <th>0</th>\n",
341 |               "      <td>0</td>\n",
342 |               "      <td>0</td>\n",
343 |               "      <td>5.000000</td>\n",
344 |               "      <td>Memory of Trees</td>\n",
345 |               "    </tr>\n",
346 |               "    <tr>\n",
347 |               "      <th>1</th>\n",
348 |               "      <td>0</td>\n",
349 |               "      <td>1978</td>\n",
350 |               "      <td>4.736122</td>\n",
351 |               "      <td>La Revancha Del Tango</td>\n",
352 |               "    </tr>\n",
353 |               "    <tr>\n",
354 |               "      <th>2</th>\n",
355 |               "      <td>0</td>\n",
356 |               "      <td>2471</td>\n",
357 |               "      <td>3.899040</td>\n",
358 |               "      <td>Jagged Little Pill Acoustic</td>\n",
359 |               "    </tr>\n",
360 |               "  </tbody>\n",
361 |               "</table>\n",
362 |               "</div>"
363 |             ],
364 |             "text/plain": [
365 |               "   reviewerID  asin      rate                        title\n",
366 |               "0           0     0  5.000000              Memory of Trees\n",
367 |               "1           0  1978  4.736122        La Revancha Del Tango\n",
368 |               "2           0  2471  3.899040  Jagged Little Pill Acoustic"
369 |             ]
370 |           },
371 |           "metadata": {
372 |             "tags": []
373 |           },
374 |           "execution_count": 16
375 |         }
376 |       ]
377 |     },
378 |     {
379 |       "cell_type": "markdown",
380 |       "metadata": {
381 |         "id": "lch6FzLoGF4F",
382 |         "colab_type": "text"
383 |       },
384 |       "source": [
385 |         "## Model-based"
386 |       ]
387 |     },
388 |     {
389 |       "cell_type": "code",
390 |       "metadata": {
391 |         "id": "hTgDDYUlGJTc",
392 |         "colab_type": "code",
393 |         "colab": {
394 |           "base_uri": "https://localhost:8080/",
395 |           "height": 187
396 |         },
397 |         "outputId": "f9683bc4-47bc-4055-bb6d-0f03216a6cf4"
398 |       },
399 |       "source": [
400 |         "from caserec.recommenders.rating_prediction.matrixfactorization import MatrixFactorization\n",
401 |         "\n",
402 |         "MatrixFactorization('train.dat', 'test.dat', 'rp_mf.dat').compute()"
403 |       ],
404 |       "execution_count": 19,
405 |       "outputs": [
406 |         {
407 |           "output_type": "stream",
408 |           "text": [
409 |             "[Case Recommender: Rating Prediction > Matrix Factorization]\n",
410 |             "\n",
411 |             "train data:: 5036 users and 2581 items (34703 interactions) | sparsity:: 99.73%\n",
412 |             "test data:: 4508 users and 2493 items (17093 interactions) | sparsity:: 99.85%\n",
413 |             "\n",
414 |             "training_time:: 14.870324 sec\n",
415 |             "prediction_time:: 0.051896 sec\n",
416 |             "\n",
417 |             "\n",
418 |             "Eval:: MAE: 0.713848 RMSE: 0.979218 \n"
419 |           ],
420 |           "name": "stdout"
421 |         }
422 |       ]
423 |     },
424 |     {
425 |       "cell_type": "code",
426 |       "metadata": {
427 |         "id": "uPF0WFUkGE3q",
428 |         "colab_type": "code",
429 |         "colab": {
430 |           "base_uri": "https://localhost:8080/",
431 |           "height": 142
432 |         },
433 |         "outputId": "1a70e882-7d42-4afe-d7d7-0616855c1fc5"
434 |       },
435 |       "source": [
436 |         "predictions = pd.read_csv('rp_mf.dat', sep='\\t', names=['reviewerID', 'asin', 'rate'])\n",
437 |         "predictions['title'] = predictions.asin.map(title_dict)\n",
438 |         "predictions.head(3)"
439 |       ],
440 |       "execution_count": 20,
441 |       "outputs": [
442 |         {
443 |           "output_type": "execute_result",
444 |           "data": {
445 |             "text/html": [
446 |               "<div>\n",
447 |               "<style scoped>\n",
448 |               "    .dataframe tbody tr th:only-of-type {\n",
449 |               "        vertical-align: middle;\n",
450 |               "    }\n",
451 |               "\n",
452 |               "    .dataframe tbody tr th {\n",
453 |               "        vertical-align: top;\n",
454 |               "    }\n",
455 |               "\n",
456 |               "    .dataframe thead th {\n",
457 |               "        text-align: right;\n",
458 |               "    }\n",
459 |               "</style>\n",
460 |               "<table border=\"1\" class=\"dataframe\">\n",
461 |               "  <thead>\n",
462 |               "    <tr style=\"text-align: right;\">\n",
463 |               "      <th></th>\n",
464 |               "      <th>reviewerID</th>\n",
465 |               "      <th>asin</th>\n",
466 |               "      <th>rate</th>\n",
467 |               "      <th>title</th>\n",
468 |               "    </tr>\n",
469 |               "  </thead>\n",
470 |               "  <tbody>\n",
471 |               "    <tr>\n",
472 |               "      <th>0</th>\n",
473 |               "      <td>0</td>\n",
474 |               "      <td>2471</td>\n",
475 |               "      <td>4.388470</td>\n",
476 |               "      <td>Jagged Little Pill Acoustic</td>\n",
477 |               "    </tr>\n",
478 |               "    <tr>\n",
479 |               "      <th>1</th>\n",
480 |               "      <td>0</td>\n",
481 |               "      <td>1978</td>\n",
482 |               "      <td>4.943073</td>\n",
483 |               "      <td>La Revancha Del Tango</td>\n",
484 |               "    </tr>\n",
485 |               "    <tr>\n",
486 |               "      <th>2</th>\n",
487 |               "      <td>0</td>\n",
488 |               "      <td>0</td>\n",
489 |               "      <td>4.829396</td>\n",
490 |               "      <td>Memory of Trees</td>\n",
491 |               "    </tr>\n",
492 |               "  </tbody>\n",
493 |               "</table>\n",
494 |               "</div>"
495 |             ],
496 |             "text/plain": [
497 |               "   reviewerID  asin      rate                        title\n",
498 |               "0           0  2471  4.388470  Jagged Little Pill Acoustic\n",
499 |               "1           0  1978  4.943073        La Revancha Del Tango\n",
500 |               "2           0     0  4.829396              Memory of Trees"
501 |             ]
502 |           },
503 |           "metadata": {
504 |             "tags": []
505 |           },
506 |           "execution_count": 20
507 |         }
508 |       ]
509 |     }
510 |   ]
511 | }


--------------------------------------------------------------------------------
/Processed Datasets/RS_NonPersonalized.ipynb:
--------------------------------------------------------------------------------
   1 | {
   2 |   "nbformat": 4,
   3 |   "nbformat_minor": 0,
   4 |   "metadata": {
   5 |     "kernelspec": {
   6 |       "display_name": "Python 3",
   7 |       "language": "python",
   8 |       "name": "python3"
   9 |     },
  10 |     "language_info": {
  11 |       "codemirror_mode": {
  12 |         "name": "ipython",
  13 |         "version": 3
  14 |       },
  15 |       "file_extension": ".py",
  16 |       "mimetype": "text/x-python",
  17 |       "name": "python",
  18 |       "nbconvert_exporter": "python",
  19 |       "pygments_lexer": "ipython3",
  20 |       "version": "3.7.3"
  21 |     },
  22 |     "colab": {
  23 |       "name": "RS_NonPersonalized.ipynb",
  24 |       "version": "0.3.2",
  25 |       "provenance": []
  26 |     }
  27 |   },
  28 |   "cells": [
  29 |     {
  30 |       "cell_type": "code",
  31 |       "metadata": {
  32 |         "id": "UzAb8mL9A7D_",
  33 |         "colab_type": "code",
  34 |         "colab": {
  35 |           "base_uri": "https://localhost:8080/",
  36 |           "height": 442
  37 |         },
  38 |         "outputId": "b8239530-747b-4b76-fcb7-8a2344469d5b"
  39 |       },
  40 |       "source": [
  41 |         "! wget https://github.com/caserec/Datasets-for-Recommneder-Systems/raw/master/Processed%20Datasets/AmazonMusic.tar.xz\n",
  42 |         "! tar -xf AmazonMusic.tar.xz\n",
  43 |         "! pip install caserecommender"
  44 |       ],
  45 |       "execution_count": 30,
  46 |       "outputs": [
  47 |         {
  48 |           "output_type": "stream",
  49 |           "text": [
  50 |             "--2019-09-04 20:48:25--  https://github.com/caserec/Datasets-for-Recommneder-Systems/raw/master/Processed%20Datasets/AmazonMusic.tar.xz\n",
  51 |             "Resolving github.com (github.com)... 192.30.255.112\n",
  52 |             "Connecting to github.com (github.com)|192.30.255.112|:443... connected.\n",
  53 |             "HTTP request sent, awaiting response... 302 Found\n",
  54 |             "Location: https://raw.githubusercontent.com/caserec/Datasets-for-Recommneder-Systems/master/Processed%20Datasets/AmazonMusic.tar.xz [following]\n",
  55 |             "--2019-09-04 20:48:25--  https://raw.githubusercontent.com/caserec/Datasets-for-Recommneder-Systems/master/Processed%20Datasets/AmazonMusic.tar.xz\n",
  56 |             "Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.0.133, 151.101.64.133, 151.101.128.133, ...\n",
  57 |             "Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.0.133|:443... connected.\n",
  58 |             "HTTP request sent, awaiting response... 200 OK\n",
  59 |             "Length: 22112728 (21M) [application/octet-stream]\n",
  60 |             "Saving to: ‘AmazonMusic.tar.xz.2’\n",
  61 |             "\n",
  62 |             "AmazonMusic.tar.xz. 100%[===================>]  21.09M   137MB/s    in 0.2s    \n",
  63 |             "\n",
  64 |             "2019-09-04 20:48:25 (137 MB/s) - ‘AmazonMusic.tar.xz.2’ saved [22112728/22112728]\n",
  65 |             "\n",
  66 |             "Requirement already satisfied: caserecommender in /usr/local/lib/python3.6/dist-packages (1.0.918.post0)\n",
  67 |             "Requirement already satisfied: scikit-learn in /usr/local/lib/python3.6/dist-packages (from caserecommender) (0.21.3)\n",
  68 |             "Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from caserecommender) (1.16.4)\n",
  69 |             "Requirement already satisfied: scipy in /usr/local/lib/python3.6/dist-packages (from caserecommender) (1.3.1)\n",
  70 |             "Requirement already satisfied: pandas in /usr/local/lib/python3.6/dist-packages (from caserecommender) (0.24.2)\n",
  71 |             "Requirement already satisfied: joblib>=0.11 in /usr/local/lib/python3.6/dist-packages (from scikit-learn->caserecommender) (0.13.2)\n",
  72 |             "Requirement already satisfied: python-dateutil>=2.5.0 in /usr/local/lib/python3.6/dist-packages (from pandas->caserecommender) (2.5.3)\n",
  73 |             "Requirement already satisfied: pytz>=2011k in /usr/local/lib/python3.6/dist-packages (from pandas->caserecommender) (2018.9)\n",
  74 |             "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.6/dist-packages (from python-dateutil>=2.5.0->pandas->caserecommender) (1.12.0)\n"
  75 |           ],
  76 |           "name": "stdout"
  77 |         }
  78 |       ]
  79 |     },
  80 |     {
  81 |       "cell_type": "code",
  82 |       "metadata": {
  83 |         "id": "MngqcgCxBLeX",
  84 |         "colab_type": "code",
  85 |         "colab": {
  86 |           "base_uri": "https://localhost:8080/",
  87 |           "height": 51
  88 |         },
  89 |         "outputId": "7490f411-aaf4-4582-aeb2-8df3677adf20"
  90 |       },
  91 |       "source": [
  92 |         "ls"
  93 |       ],
  94 |       "execution_count": 31,
  95 |       "outputs": [
  96 |         {
  97 |           "output_type": "stream",
  98 |           "text": [
  99 |             "AmazonMusic/        AmazonMusic.tar.xz.1  map_tilte.npy  test.dat\n",
 100 |             "AmazonMusic.tar.xz  AmazonMusic.tar.xz.2  sample_data/   train.dat\n"
 101 |           ],
 102 |           "name": "stdout"
 103 |         }
 104 |       ]
 105 |     },
 106 |     {
 107 |       "cell_type": "code",
 108 |       "metadata": {
 109 |         "id": "L0e1hREHA1GH",
 110 |         "colab_type": "code",
 111 |         "colab": {}
 112 |       },
 113 |       "source": [
 114 |         "import pandas as pd\n",
 115 |         "import numpy as np"
 116 |       ],
 117 |       "execution_count": 0,
 118 |       "outputs": []
 119 |     },
 120 |     {
 121 |       "cell_type": "code",
 122 |       "metadata": {
 123 |         "id": "OAQRi8d5A1GO",
 124 |         "colab_type": "code",
 125 |         "colab": {
 126 |           "base_uri": "https://localhost:8080/",
 127 |           "height": 204
 128 |         },
 129 |         "outputId": "66cd718f-710e-4bf7-aade-4fbf9bac24ae"
 130 |       },
 131 |       "source": [
 132 |         "dataset = pd.read_json('./AmazonMusic/Digital_Music_5.json', lines=True)\n",
 133 |         "dataset.head()"
 134 |       ],
 135 |       "execution_count": 33,
 136 |       "outputs": [
 137 |         {
 138 |           "output_type": "execute_result",
 139 |           "data": {
 140 |             "text/html": [
 141 |               "<div>\n",
 142 |               "<style scoped>\n",
 143 |               "    .dataframe tbody tr th:only-of-type {\n",
 144 |               "        vertical-align: middle;\n",
 145 |               "    }\n",
 146 |               "\n",
 147 |               "    .dataframe tbody tr th {\n",
 148 |               "        vertical-align: top;\n",
 149 |               "    }\n",
 150 |               "\n",
 151 |               "    .dataframe thead th {\n",
 152 |               "        text-align: right;\n",
 153 |               "    }\n",
 154 |               "</style>\n",
 155 |               "<table border=\"1\" class=\"dataframe\">\n",
 156 |               "  <thead>\n",
 157 |               "    <tr style=\"text-align: right;\">\n",
 158 |               "      <th></th>\n",
 159 |               "      <th>asin</th>\n",
 160 |               "      <th>helpful</th>\n",
 161 |               "      <th>overall</th>\n",
 162 |               "      <th>reviewText</th>\n",
 163 |               "      <th>reviewTime</th>\n",
 164 |               "      <th>reviewerID</th>\n",
 165 |               "      <th>reviewerName</th>\n",
 166 |               "      <th>summary</th>\n",
 167 |               "      <th>unixReviewTime</th>\n",
 168 |               "    </tr>\n",
 169 |               "  </thead>\n",
 170 |               "  <tbody>\n",
 171 |               "    <tr>\n",
 172 |               "      <th>0</th>\n",
 173 |               "      <td>5555991584</td>\n",
 174 |               "      <td>[3, 3]</td>\n",
 175 |               "      <td>5</td>\n",
 176 |               "      <td>It's hard to believe \"Memory of Trees\" came ou...</td>\n",
 177 |               "      <td>09 12, 2006</td>\n",
 178 |               "      <td>A3EBHHCZO6V2A4</td>\n",
 179 |               "      <td>Amaranth \"music fan\"</td>\n",
 180 |               "      <td>Enya's last great album</td>\n",
 181 |               "      <td>1158019200</td>\n",
 182 |               "    </tr>\n",
 183 |               "    <tr>\n",
 184 |               "      <th>1</th>\n",
 185 |               "      <td>5555991584</td>\n",
 186 |               "      <td>[0, 0]</td>\n",
 187 |               "      <td>5</td>\n",
 188 |               "      <td>A clasically-styled and introverted album, Mem...</td>\n",
 189 |               "      <td>06 3, 2001</td>\n",
 190 |               "      <td>AZPWAXJG9OJXV</td>\n",
 191 |               "      <td>bethtexas</td>\n",
 192 |               "      <td>Enya at her most elegant</td>\n",
 193 |               "      <td>991526400</td>\n",
 194 |               "    </tr>\n",
 195 |               "    <tr>\n",
 196 |               "      <th>2</th>\n",
 197 |               "      <td>5555991584</td>\n",
 198 |               "      <td>[2, 2]</td>\n",
 199 |               "      <td>5</td>\n",
 200 |               "      <td>I never thought Enya would reach the sublime h...</td>\n",
 201 |               "      <td>07 14, 2003</td>\n",
 202 |               "      <td>A38IRL0X2T4DPF</td>\n",
 203 |               "      <td>bob turnley</td>\n",
 204 |               "      <td>The best so far</td>\n",
 205 |               "      <td>1058140800</td>\n",
 206 |               "    </tr>\n",
 207 |               "    <tr>\n",
 208 |               "      <th>3</th>\n",
 209 |               "      <td>5555991584</td>\n",
 210 |               "      <td>[1, 1]</td>\n",
 211 |               "      <td>5</td>\n",
 212 |               "      <td>This is the third review of an irish album I w...</td>\n",
 213 |               "      <td>05 3, 2000</td>\n",
 214 |               "      <td>A22IK3I6U76GX0</td>\n",
 215 |               "      <td>Calle</td>\n",
 216 |               "      <td>Ireland produces good music.</td>\n",
 217 |               "      <td>957312000</td>\n",
 218 |               "    </tr>\n",
 219 |               "    <tr>\n",
 220 |               "      <th>4</th>\n",
 221 |               "      <td>5555991584</td>\n",
 222 |               "      <td>[1, 1]</td>\n",
 223 |               "      <td>4</td>\n",
 224 |               "      <td>Enya, despite being a successful recording art...</td>\n",
 225 |               "      <td>01 17, 2008</td>\n",
 226 |               "      <td>A1AISPOIIHTHXX</td>\n",
 227 |               "      <td>Cloud \"...\"</td>\n",
 228 |               "      <td>4.5; music to dream to</td>\n",
 229 |               "      <td>1200528000</td>\n",
 230 |               "    </tr>\n",
 231 |               "  </tbody>\n",
 232 |               "</table>\n",
 233 |               "</div>"
 234 |             ],
 235 |             "text/plain": [
 236 |               "         asin helpful  ...                       summary unixReviewTime\n",
 237 |               "0  5555991584  [3, 3]  ...       Enya's last great album     1158019200\n",
 238 |               "1  5555991584  [0, 0]  ...      Enya at her most elegant      991526400\n",
 239 |               "2  5555991584  [2, 2]  ...               The best so far     1058140800\n",
 240 |               "3  5555991584  [1, 1]  ...  Ireland produces good music.      957312000\n",
 241 |               "4  5555991584  [1, 1]  ...        4.5; music to dream to     1200528000\n",
 242 |               "\n",
 243 |               "[5 rows x 9 columns]"
 244 |             ]
 245 |           },
 246 |           "metadata": {
 247 |             "tags": []
 248 |           },
 249 |           "execution_count": 33
 250 |         }
 251 |       ]
 252 |     },
 253 |     {
 254 |       "cell_type": "code",
 255 |       "metadata": {
 256 |         "id": "BOLVuBBqA1GV",
 257 |         "colab_type": "code",
 258 |         "colab": {
 259 |           "base_uri": "https://localhost:8080/",
 260 |           "height": 265
 261 |         },
 262 |         "outputId": "2b82c7d7-bf47-41ae-81c1-5a3bc6a17ed4"
 263 |       },
 264 |       "source": [
 265 |         "# Bar chart of how many reviews gave each rating value, most frequent first\ndataset.overall.value_counts().plot(kind='bar', color=['g', 'c', 'y', 'b', 'r']);"
 266 |       ],
 267 |       "execution_count": 34,
 268 |       "outputs": [
 269 |         {
 270 |           "output_type": "display_data",
 271 |           "data": {
 272 |             "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYcAAAD4CAYAAAAHHSreAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4zLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvnQurowAAE4xJREFUeJzt3X+MXeWd3/H3J+bHsk0TTJgiZFsL\n2liKnLTrhFlDlVXLEsUYWtWslEbQanERwlsFVFZdVSHbVg5JkJI/dlEjJUhscTCr3RDKboQbmfVa\nLGyUVvwYCAUMi5glibDFj9mYH0tZgcx++8d9XK78zHjGM+O5Y/x+SVf33O95zrnfc2XPZ+45z72T\nqkKSpGEfGHUDkqTlx3CQJHUMB0lSx3CQJHUMB0lSx3CQJHUMB0lSx3CQJHUMB0lS56RRNzBfZ555\nZp1zzjmjbkOSjiuPPvro31TV2GzjjttwOOecc5iYmBh1G5J0XEnys7mM87SSJKljOEiSOoaDJKlj\nOEiSOoaDJKljOEiSOoaDJKljOEiSOrN+CC7JLwA/BE5t4++uqm1Jbgf+OfB6G/rvqurxJAH+G3Ap\n8FarP9b2tQX4L23816pqR6ufB9wOnAbsAq6vY/zHrXNjjuXu56y2+Te8JS0/c/mE9NvARVX1ZpKT\ngR8lubet+09Vdfdh4y8B1rbb+cAtwPlJzgC2AeNAAY8m2VlVr7Yx1wAPMQiHTcC9SJJGYtbTSjXw\nZnt4crsd6dfdzcAdbbsHgdOTnA1cDOypqgMtEPYAm9q6D1XVg+3dwh3AZQs4JknSAs3pmkOSFUke\nB15h8AP+obbqpiRPJLk5yamttgp4YWjzfa12pPq+aeqSpBGZUzhU1btVtR5YDWxI8gngS8DHgF8F\nzgC+eMy6bJJsTTKRZGJqaupYP50knbCOarZSVb0G3A9sqqoX26mjt4HvABvasP3AmqHNVrfakeqr\np6lP9/y3VtV4VY2Pjc36jbOSpHmaNRySjCU5vS2fBnwW+Kt2rYA2O+ky4Km2yU7gygxcALxeVS8C\nu4GNSVYmWQlsBHa3dW8kuaDt60rgnsU9TEnS0ZjLbKWzgR1JVjAIk7uq6gdJ/iLJGBDgceDft/G7\nGExjnWQwlfUqgKo6kOSrwCNt3Feq6kBb/gLvTWW9F2cqSdJIzRoOVfUE8Mlp6hfNML6Aa2dYtx3Y\nPk19AvjEbL1IkpaGn5CWJHUMB0lSx3CQJHUMB0lSx3CQJHUMB0lSx3CQJHUMB0lSx3CQJHUMB0lS\nx3CQJHUMB0lSx3CQJHUMB0lSx3CQJHUMB0lSx3CQJHUMB0lSx3CQJHUMB0lSx3CQJHVmDYckv5Dk\n4ST/J8neJDe2+rlJHkoymeR7SU5p9VPb48m2/pyhfX2p1Z9NcvFQfVOrTSa5YfEPU5J0NObyzuFt\n4KKq+hVgPbApyQXAN4Cbq+qjwKvA1W381cCrrX5zG0eSdcDlwMeBTcC3k6xIsgL4FnAJsA64oo2V\nJI3IrOFQA2+2hye3WwEXAXe3+g7gsra8uT2mrf9MkrT6nVX1dlX9BJgENrTbZFU9X1XvAHe2sZKk\nEZnTNYf2G/7jwCvAHuCvgdeq6mAbsg9Y1ZZXAS8AtPWvAx8Zrh+2zUz16frYmmQiycTU1NRcWpck\nzcOcwqGq3q2q9cBqBr/pf+yYdjVzH7dW1XhVjY+NjY2iBUk6IRzVbKWqeg24H/inwOlJTmqrVgP7\n2/J+YA1AW/9h4OfD9cO2makuSRqRucxWGktyels+Dfgs8AyDkPhcG7YFuKct72yPaev/oqqq1S9v\ns5nOBdYCDwOPAGvb7KdTGFy03rkYBydJmp+TZh/C2cCONqvoA8BdVfWDJE8Ddyb5GvBj4LY2/jbg\nD5NMAgcY/LCnqvYmu
Qt4GjgIXFtV7wIkuQ7YDawAtlfV3kU7QknSUcvgl/rjz/j4eE1MTMx7+9yY\nRexm/mrb8fn6Szo+JXm0qsZnG+cnpCVJHcNBktQxHCRJHcNBktQxHCRJHcNBktQxHCRJHcNBktQx\nHCRJHcNBktQxHCRJHcNBktQxHCRJHcNBktQxHCRJHcNBktQxHCRJHcNBktQxHCRJHcNBktSZNRyS\nrElyf5Knk+xNcn2rfznJ/iSPt9ulQ9t8KclkkmeTXDxU39Rqk0luGKqfm+ShVv9eklMW+0AlSXM3\nl3cOB4Hfqap1wAXAtUnWtXU3V9X6dtsF0NZdDnwc2AR8O8mKJCuAbwGXAOuAK4b28422r48CrwJX\nL9LxSZLmYdZwqKoXq+qxtvy3wDPAqiNsshm4s6rerqqfAJPAhnabrKrnq+od4E5gc5IAFwF3t+13\nAJfN94AkSQt3VNcckpwDfBJ4qJWuS/JEku1JVrbaKuCFoc32tdpM9Y8Ar1XVwcPq0z3/1iQTSSam\npqaOpnVJ0lGYczgk+SDwJ8BvV9UbwC3ALwPrgReB3zsmHQ6pqluraryqxsfGxo7100nSCeukuQxK\ncjKDYPijqvpTgKp6eWj9HwA/aA/3A2uGNl/dasxQ/zlwepKT2ruH4fGSpBGYy2ylALcBz1TV7w/V\nzx4a9hvAU215J3B5klOTnAusBR4GHgHWtplJpzC4aL2zqgq4H/hc234LcM/CDkuStBBzeefwaeA3\ngSeTPN5qv8tgttF6oICfAr8FUFV7k9wFPM1gptO1VfUuQJLrgN3ACmB7Ve1t+/sicGeSrwE/ZhBG\nkqQRmTUcqupHQKZZtesI29wE3DRNfdd021XV8wxmM0mSlgE/IS1J6hgOkqSO4SBJ6hgOkqSO4SBJ\n6hgOkqSO4SBJ6hgOkqSO4SBJ6hgOkqSO4SBJ6hgOkqSO4SBJ6hgOkqSO4SBJ6hgOkqSO4SBJ6hgO\nkqSO4SBJ6hgOkqTOrOGQZE2S+5M8nWRvkutb/Ywke5I81+5XtnqSfDPJZJInknxqaF9b2vjnkmwZ\nqp+X5Mm2zTeT5FgcrCRpbubyzuEg8DtVtQ64ALg2yTrgBuC+qloL3NceA1wCrG23rcAtMAgTYBtw\nPrAB2HYoUNqYa4a227TwQ5Mkzdes4VBVL1bVY235b4FngFXAZmBHG7YDuKwtbwbuqIEHgdOTnA1c\nDOypqgNV9SqwB9jU1n2oqh6sqgLuGNqXJGkEjuqaQ5JzgE8CDwFnVdWLbdVLwFlteRXwwtBm+1rt\nSPV909Sne/6tSSaSTExNTR1N65KkozDncEjyQeBPgN+uqjeG17Xf+GuRe+tU1a1VNV5V42NjY8f6\n6STphDWncEhyMoNg+KOq+tNWfrmdEqLdv9Lq+4E1Q5uvbrUj1VdPU5ckjchcZisFuA14pqp+f2jV\nTuDQjKMtwD1D9SvbrKULgNfb6afdwMYkK9uF6I3A7rbujSQXtOe6cmhfkqQROGkOYz4N/CbwZJLH\nW+13ga8DdyW5GvgZ8Pm2bhdwKTAJvAVcBVBVB5J8FXikjftKVR1oy18AbgdOA+5tN0nSiMwaDlX1\nI2Cmzx18ZprxBVw7w762A9unqU8An5itF0nS0vAT0pKkjuEgSeoYDpKkjuEgSeoYDpKkjuEgSeoY\nDpKkjuEgSeoYDpKkjuEgSeoYDpKkjuEgSeoYDpKkjuEgSerM5e856H0uDzww6hYAqAsvHHULkhrf\nOUiSOoaDJKljOEiSOoaDJKljOEiSOrOGQ5LtSV5J8tRQ7ctJ9id5vN0uHVr3pSSTSZ5NcvFQfVOr\nTSa5Yah+bpKHWv17SU5ZzAOUJB29ubxzuB3YNE395qpa3267AJKsAy4HPt62+XaSFUlWAN8CLgHW\nAVe0sQDfaPv6KPAqcPVCDkiStHCzhkNV/RA4MMf9bQburKq3q+onwCSwod0mq+r5qno
HuBPYnCTA\nRcDdbfsdwGVHeQySpEW2kGsO1yV5op12Wtlqq4AXhsbsa7WZ6h8BXquqg4fVp5Vka5KJJBNTU1ML\naF2SdCTzDYdbgF8G1gMvAr+3aB0dQVXdWlXjVTU+Nja2FE8pSSekeX19RlW9fGg5yR8AP2gP9wNr\nhoaubjVmqP8cOD3JSe3dw/B4SdKIzOudQ5Kzhx7+BnBoJtNO4PIkpyY5F1gLPAw8AqxtM5NOYXDR\nemdVFXA/8Lm2/Rbgnvn0JElaPLO+c0jyXeBC4Mwk+4BtwIVJ1gMF/BT4LYCq2pvkLuBp4CBwbVW9\n2/ZzHbAbWAFsr6q97Sm+CNyZ5GvAj4HbFu3oJEnzMms4VNUV05Rn/AFeVTcBN01T3wXsmqb+PIPZ\nTJKkZcJPSEuSOoaDJKljOEiSOoaDJKljOEiSOoaDJKljOEiSOoaDJKljOEiSOoaDJKljOEiSOoaD\nJKljOEiSOoaDJKljOEiSOoaDJKljOEiSOoaDJKljOEiSOoaDJKkzazgk2Z7klSRPDdXOSLInyXPt\nfmWrJ8k3k0wmeSLJp4a22dLGP5dky1D9vCRPtm2+mSSLfZCSpKMzl3cOtwObDqvdANxXVWuB+9pj\ngEuAte22FbgFBmECbAPOBzYA2w4FShtzzdB2hz+XJGmJzRoOVfVD4MBh5c3Ajra8A7hsqH5HDTwI\nnJ7kbOBiYE9VHaiqV4E9wKa27kNV9WBVFXDH0L4kSSMy32sOZ1XVi235JeCstrwKeGFo3L5WO1J9\n3zT1aSXZmmQiycTU1NQ8W5ckzWbBF6Tbb/y1CL3M5blurarxqhofGxtbiqeUpBPSfMPh5XZKiHb/\nSqvvB9YMjVvdakeqr56mLkkaofmGw07g0IyjLcA9Q/Ur26ylC4DX2+mn3cDGJCvbheiNwO627o0k\nF7RZSlcO7UuSNCInzTYgyXeBC4Ezk+xjMOvo68BdSa4GfgZ8vg3fBVwKTAJvAVcBVNWBJF8FHmnj\nvlJVhy5yf4HBjKjTgHvbTZI0QrOGQ1VdMcOqz0wztoBrZ9jPdmD7NPUJ4BOz9SFJWjp+QlqS1DEc\nJEkdw0GS1DEcJEkdw0GS1DEcJEkdw0GS1DEcJEkdw0GS1DEcJEkdw0GS1DEcJEkdw0GS1Jn1W1ml\nE8kDD2TULQBw4YVL8scVpRn5zkGS1DEcJEkdw0GS1DEcJEkdw0GS1DEcJEmdBYVDkp8meTLJ40km\nWu2MJHuSPNfuV7Z6knwzyWSSJ5J8amg/W9r455JsWdghSZIWajHeOfx6Va2vqvH2+AbgvqpaC9zX\nHgNcAqxtt63ALTAIE2AbcD6wAdh2KFAkSaNxLE4rbQZ2tOUdwGVD9Ttq4EHg9CRnAxcDe6rqQFW9\nCuwBNh2DviRJc7TQcCjgz5M8mmRrq51VVS+25ZeAs9ryKuCFoW33tdpM9U6SrUkmkkxMTU0tsHVJ\n0kwW+vUZv1ZV+5P8I2BPkr8aXllVlWTRvgegqm4FbgUYHx/3+wUk6RhZ0DuHqtrf7l8Bvs/gmsHL\n7XQR7f6VNnw/sGZo89WtNlNdkjQi8w6HJP8gyT88tAxsBJ4CdgKHZhxtAe5pyzuBK9uspQuA19vp\np93AxiQr24Xoja0mSRqRhZxWOgv4fpJD+/njqvqzJI8AdyW5GvgZ8Pk2fhdwKTAJvAVcBVBVB5J8\nFXikjftKVR1YQF+SpAWadzhU1fPAr0xT/znwmWnqBVw7w762A9vn24skaXH5CWlJUsdwkCR1DAdJ\nUsdwkCR1DAdJUsdwkCR1Fvr1GZLepwYfYRq98otyRsJwkKTZnIBJ6WklSVLHcJAkdQwHSVLHcJAk\ndQwHSVLHcJAkdQwHSVLHcJAkdQwHSVLHcJAkdQwHSVLHcJAkdZZNOCTZlOTZJJNJbhh1P5J0IlsW\n4ZBkBfAt4BJgHXBFknWj7UqSTlzLIhyADcBkVT1
fVe8AdwKbR9yTJJ2wlsvfc1gFvDD0eB9w/uGD\nkmwFtraHbyZ5dgl6O5Izgb9ZyA7y5WXyPfELt/DXYpEaWQYW/Fq8j16Nhf+7eN+8FIvw72JxXoxf\nmsug5RIOc1JVtwK3jrqPQ5JMVNX4qPtYDnwt3uNr8R5fi/ccb6/FcjmttB9YM/R4datJkkZguYTD\nI8DaJOcmOQW4HNg54p4k6YS1LE4rVdXBJNcBu4EVwPaq2jvituZi2ZziWgZ8Ld7ja/EeX4v3HFev\nRWoJ/2C1JOn4sFxOK0mSlhHDQZLUMRwkSR3DQYsiyR2j7mFUkmxI8qtteV2S/5jk0lH3JS3Espit\ndDxK8msMvvbjqar681H3s5SSHD7NOMCvJzkdoKr+1dJ3NRpJtjH4TrCTkuxh8Mn++4Ebknyyqm4a\naYNLLMnHGHzjwUNV9eZQfVNV/dnoOtPRcrbSHCV5uKo2tOVrgGuB7wMbgf9ZVV8fZX9LKcljwNPA\nfweKQTh8l8HnU6iqvxxdd0sryZPAeuBU4CVgdVW9keQ0Bj8g/8lIG1xCSf4Dg/8XzzB4Ta6vqnva\nuseq6lOj7G+5SHJVVX1n1H3MxtNKc3fy0PJW4LNVdSODcPi3o2lpZMaBR4H/DLxeVQ8Af1dVf3ki\nBUNzsKreraq3gL+uqjcAqurvgL8fbWtL7hrgvKq6DLgQ+K9Jrm/r3j/fkLRwN466gbnwtNLcfSDJ\nSgaBmqqaAqiq/5vk4GhbW1pV9ffAzUn+R7t/mRP339I7SX6xhcN5h4pJPsyJFw4fOHQqqap+muRC\n4O4kv8QJFg5JnphpFXDWUvYyXyfqf+j5+DCD35YDVJKzq+rFJB/kBPuHf0hV7QP+dZJ/Abwx6n5G\n5J9V1dvw/0PzkJOBLaNpaWReTrK+qh4HqKo3k/xLYDvwj0fb2pI7C7gYePWweoD/vfTtHD2vOSxQ\nkl8Ezqqqn4y6F2mUkqxmcJrtpWnWfbqq/tcI2hqJJLcB36mqH02z7o+r6t+MoK2jYjhIkjpekJYk\ndQwHSVLHcJAkdQwHSVLn/wGh/eI8lieAxAAAAABJRU5ErkJggg==\n",
 273 |             "text/plain": [
 274 |               "<Figure size 432x288 with 1 Axes>"
 275 |             ]
 276 |           },
 277 |           "metadata": {
 278 |             "tags": []
 279 |           }
 280 |         }
 281 |       ]
 282 |     },
 283 |     {
 284 |       "cell_type": "code",
 285 |       "metadata": {
 286 |         "id": "Ir7FstouA1Gc",
 287 |         "colab_type": "code",
 288 |         "colab": {
 289 |           "base_uri": "https://localhost:8080/",
 290 |           "height": 406
 291 |         },
 292 |         "outputId": "a7e4a83e-2c67-4aaf-9179-5b019e4ac300"
 293 |       },
 294 |       "source": [
 295 |         "# Load the item metadata: asin, title, and one 0/1 indicator column per category\ndataset_metadata = pd.read_csv('AmazonMusic/amazon_music_metadata.csv')\n",
 296 |         "dataset_metadata.head()"
 297 |       ],
 298 |       "execution_count": 35,
 299 |       "outputs": [
 300 |         {
 301 |           "output_type": "execute_result",
 302 |           "data": {
 303 |             "text/html": [
 304 |               "<div>\n",
 305 |               "<style scoped>\n",
 306 |               "    .dataframe tbody tr th:only-of-type {\n",
 307 |               "        vertical-align: middle;\n",
 308 |               "    }\n",
 309 |               "\n",
 310 |               "    .dataframe tbody tr th {\n",
 311 |               "        vertical-align: top;\n",
 312 |               "    }\n",
 313 |               "\n",
 314 |               "    .dataframe thead th {\n",
 315 |               "        text-align: right;\n",
 316 |               "    }\n",
 317 |               "</style>\n",
 318 |               "<table border=\"1\" class=\"dataframe\">\n",
 319 |               "  <thead>\n",
 320 |               "    <tr style=\"text-align: right;\">\n",
 321 |               "      <th></th>\n",
 322 |               "      <th>asin</th>\n",
 323 |               "      <th>title</th>\n",
 324 |               "      <th>Accessories</th>\n",
 325 |               "      <th>Acid Jazz</th>\n",
 326 |               "      <th>Acoustic Blues</th>\n",
 327 |               "      <th>Adult Alternative</th>\n",
 328 |               "      <th>Adult Contemporary</th>\n",
 329 |               "      <th>Africa</th>\n",
 330 |               "      <th>Afro Brazilian</th>\n",
 331 |               "      <th>Afro-Cuban</th>\n",
 332 |               "      <th>Air Tool Accessories</th>\n",
 333 |               "      <th>Album-Oriented Rock (AOR)</th>\n",
 334 |               "      <th>Alt Industrial</th>\n",
 335 |               "      <th>Alt-Country &amp; Americana</th>\n",
 336 |               "      <th>Alternative Medicine</th>\n",
 337 |               "      <th>Alternative Metal</th>\n",
 338 |               "      <th>Alternative Rock</th>\n",
 339 |               "      <th>Ambient</th>\n",
 340 |               "      <th>Ambient Pop</th>\n",
 341 |               "      <th>American Alternative</th>\n",
 342 |               "      <th>American Punk</th>\n",
 343 |               "      <th>Americana</th>\n",
 344 |               "      <th>Amplifiers &amp; Effects</th>\n",
 345 |               "      <th>Andes</th>\n",
 346 |               "      <th>Arena Rock</th>\n",
 347 |               "      <th>Argentina</th>\n",
 348 |               "      <th>Arts &amp; Crafts Supplies</th>\n",
 349 |               "      <th>Arts, Crafts &amp; Sewing</th>\n",
 350 |               "      <th>Australia &amp; New Zealand</th>\n",
 351 |               "      <th>Austria</th>\n",
 352 |               "      <th>Avant Garde &amp; Free Jazz</th>\n",
 353 |               "      <th>Baby Products</th>\n",
 354 |               "      <th>Bachata</th>\n",
 355 |               "      <th>Bags &amp; Cases</th>\n",
 356 |               "      <th>Bakersfield Sound</th>\n",
 357 |               "      <th>Ballets</th>\n",
 358 |               "      <th>Ballets &amp; Dances</th>\n",
 359 |               "      <th>Baroque Pop</th>\n",
 360 |               "      <th>Bass</th>\n",
 361 |               "      <th>Bass Guitars</th>\n",
 362 |               "      <th>...</th>\n",
 363 |               "      <th>Third Wave Ska</th>\n",
 364 |               "      <th>Thrash &amp; Speed Metal</th>\n",
 365 |               "      <th>Tin Pan Alley</th>\n",
 366 |               "      <th>Tools &amp; Accessories</th>\n",
 367 |               "      <th>Tools &amp; Home Improvement</th>\n",
 368 |               "      <th>Traditional</th>\n",
 369 |               "      <th>Traditional Blues</th>\n",
 370 |               "      <th>Traditional British &amp; Celtic Folk</th>\n",
 371 |               "      <th>Traditional Folk</th>\n",
 372 |               "      <th>Traditional Jazz &amp; Ragtime</th>\n",
 373 |               "      <th>Traditional Pop</th>\n",
 374 |               "      <th>Traditional Vocal Pop</th>\n",
 375 |               "      <th>Trance</th>\n",
 376 |               "      <th>Tributes</th>\n",
 377 |               "      <th>Trim &amp; Embellishments</th>\n",
 378 |               "      <th>Trip-Hop</th>\n",
 379 |               "      <th>Turkey</th>\n",
 380 |               "      <th>Turntablists</th>\n",
 381 |               "      <th>Twee Pop</th>\n",
 382 |               "      <th>Urban &amp; Contemporary</th>\n",
 383 |               "      <th>Urban Folk</th>\n",
 384 |               "      <th>Uruguay</th>\n",
 385 |               "      <th>Venezuela</th>\n",
 386 |               "      <th>Vitamins &amp; Dietary Supplements</th>\n",
 387 |               "      <th>Vocal Blues</th>\n",
 388 |               "      <th>Vocal Jazz</th>\n",
 389 |               "      <th>Vocal Non-Opera</th>\n",
 390 |               "      <th>Vocal Pop</th>\n",
 391 |               "      <th>Voices</th>\n",
 392 |               "      <th>Walkers</th>\n",
 393 |               "      <th>Wall Stickers</th>\n",
 394 |               "      <th>Wall Switches</th>\n",
 395 |               "      <th>Washers</th>\n",
 396 |               "      <th>Wave Washers &amp; Wave Springs</th>\n",
 397 |               "      <th>Wedding Music</th>\n",
 398 |               "      <th>West Coast</th>\n",
 399 |               "      <th>West Coast Blues</th>\n",
 400 |               "      <th>Western Swing</th>\n",
 401 |               "      <th>World Dance</th>\n",
 402 |               "      <th>World Music</th>\n",
 403 |               "    </tr>\n",
 404 |               "  </thead>\n",
 405 |               "  <tbody>\n",
 406 |               "    <tr>\n",
 407 |               "      <th>0</th>\n",
 408 |               "      <td>5555991584</td>\n",
 409 |               "      <td>Memory of Trees</td>\n",
 410 |               "      <td>0.0</td>\n",
 411 |               "      <td>0.0</td>\n",
 412 |               "      <td>0.0</td>\n",
 413 |               "      <td>0.0</td>\n",
 414 |               "      <td>0.0</td>\n",
 415 |               "      <td>0.0</td>\n",
 416 |               "      <td>0.0</td>\n",
 417 |               "      <td>0.0</td>\n",
 418 |               "      <td>0.0</td>\n",
 419 |               "      <td>0.0</td>\n",
 420 |               "      <td>0.0</td>\n",
 421 |               "      <td>0.0</td>\n",
 422 |               "      <td>0.0</td>\n",
 423 |               "      <td>0.0</td>\n",
 424 |               "      <td>0.0</td>\n",
 425 |               "      <td>0.0</td>\n",
 426 |               "      <td>0.0</td>\n",
 427 |               "      <td>0.0</td>\n",
 428 |               "      <td>0.0</td>\n",
 429 |               "      <td>0.0</td>\n",
 430 |               "      <td>0.0</td>\n",
 431 |               "      <td>0.0</td>\n",
 432 |               "      <td>0.0</td>\n",
 433 |               "      <td>0.0</td>\n",
 434 |               "      <td>0.0</td>\n",
 435 |               "      <td>0.0</td>\n",
 436 |               "      <td>0.0</td>\n",
 437 |               "      <td>0.0</td>\n",
 438 |               "      <td>0.0</td>\n",
 439 |               "      <td>0.0</td>\n",
 440 |               "      <td>0.0</td>\n",
 441 |               "      <td>0.0</td>\n",
 442 |               "      <td>0.0</td>\n",
 443 |               "      <td>0.0</td>\n",
 444 |               "      <td>0.0</td>\n",
 445 |               "      <td>0.0</td>\n",
 446 |               "      <td>0.0</td>\n",
 447 |               "      <td>0.0</td>\n",
 448 |               "      <td>...</td>\n",
 449 |               "      <td>0.0</td>\n",
 450 |               "      <td>0.0</td>\n",
 451 |               "      <td>0.0</td>\n",
 452 |               "      <td>0.0</td>\n",
 453 |               "      <td>0.0</td>\n",
 454 |               "      <td>0.0</td>\n",
 455 |               "      <td>0.0</td>\n",
 456 |               "      <td>0.0</td>\n",
 457 |               "      <td>0.0</td>\n",
 458 |               "      <td>0.0</td>\n",
 459 |               "      <td>0.0</td>\n",
 460 |               "      <td>0.0</td>\n",
 461 |               "      <td>0.0</td>\n",
 462 |               "      <td>0.0</td>\n",
 463 |               "      <td>0.0</td>\n",
 464 |               "      <td>0.0</td>\n",
 465 |               "      <td>0.0</td>\n",
 466 |               "      <td>0.0</td>\n",
 467 |               "      <td>0.0</td>\n",
 468 |               "      <td>0.0</td>\n",
 469 |               "      <td>0.0</td>\n",
 470 |               "      <td>0.0</td>\n",
 471 |               "      <td>0.0</td>\n",
 472 |               "      <td>0.0</td>\n",
 473 |               "      <td>0.0</td>\n",
 474 |               "      <td>0.0</td>\n",
 475 |               "      <td>0.0</td>\n",
 476 |               "      <td>0.0</td>\n",
 477 |               "      <td>0.0</td>\n",
 478 |               "      <td>0.0</td>\n",
 479 |               "      <td>0.0</td>\n",
 480 |               "      <td>0.0</td>\n",
 481 |               "      <td>0.0</td>\n",
 482 |               "      <td>0.0</td>\n",
 483 |               "      <td>0.0</td>\n",
 484 |               "      <td>0.0</td>\n",
 485 |               "      <td>0.0</td>\n",
 486 |               "      <td>0.0</td>\n",
 487 |               "      <td>0.0</td>\n",
 488 |               "      <td>0.0</td>\n",
 489 |               "    </tr>\n",
 490 |               "    <tr>\n",
 491 |               "      <th>1</th>\n",
 492 |               "      <td>6308051551</td>\n",
 493 |               "      <td>Dont Drink His Blood</td>\n",
 494 |               "      <td>0.0</td>\n",
 495 |               "      <td>0.0</td>\n",
 496 |               "      <td>0.0</td>\n",
 497 |               "      <td>0.0</td>\n",
 498 |               "      <td>0.0</td>\n",
 499 |               "      <td>0.0</td>\n",
 500 |               "      <td>0.0</td>\n",
 501 |               "      <td>0.0</td>\n",
 502 |               "      <td>0.0</td>\n",
 503 |               "      <td>0.0</td>\n",
 504 |               "      <td>0.0</td>\n",
 505 |               "      <td>0.0</td>\n",
 506 |               "      <td>0.0</td>\n",
 507 |               "      <td>0.0</td>\n",
 508 |               "      <td>1.0</td>\n",
 509 |               "      <td>0.0</td>\n",
 510 |               "      <td>0.0</td>\n",
 511 |               "      <td>0.0</td>\n",
 512 |               "      <td>0.0</td>\n",
 513 |               "      <td>0.0</td>\n",
 514 |               "      <td>0.0</td>\n",
 515 |               "      <td>0.0</td>\n",
 516 |               "      <td>0.0</td>\n",
 517 |               "      <td>0.0</td>\n",
 518 |               "      <td>0.0</td>\n",
 519 |               "      <td>0.0</td>\n",
 520 |               "      <td>0.0</td>\n",
 521 |               "      <td>0.0</td>\n",
 522 |               "      <td>0.0</td>\n",
 523 |               "      <td>0.0</td>\n",
 524 |               "      <td>0.0</td>\n",
 525 |               "      <td>0.0</td>\n",
 526 |               "      <td>0.0</td>\n",
 527 |               "      <td>0.0</td>\n",
 528 |               "      <td>0.0</td>\n",
 529 |               "      <td>0.0</td>\n",
 530 |               "      <td>0.0</td>\n",
 531 |               "      <td>0.0</td>\n",
 532 |               "      <td>...</td>\n",
 533 |               "      <td>0.0</td>\n",
 534 |               "      <td>0.0</td>\n",
 535 |               "      <td>0.0</td>\n",
 536 |               "      <td>0.0</td>\n",
 537 |               "      <td>0.0</td>\n",
 538 |               "      <td>0.0</td>\n",
 539 |               "      <td>0.0</td>\n",
 540 |               "      <td>0.0</td>\n",
 541 |               "      <td>0.0</td>\n",
 542 |               "      <td>0.0</td>\n",
 543 |               "      <td>0.0</td>\n",
 544 |               "      <td>0.0</td>\n",
 545 |               "      <td>0.0</td>\n",
 546 |               "      <td>0.0</td>\n",
 547 |               "      <td>0.0</td>\n",
 548 |               "      <td>0.0</td>\n",
 549 |               "      <td>0.0</td>\n",
 550 |               "      <td>0.0</td>\n",
 551 |               "      <td>0.0</td>\n",
 552 |               "      <td>0.0</td>\n",
 553 |               "      <td>0.0</td>\n",
 554 |               "      <td>0.0</td>\n",
 555 |               "      <td>0.0</td>\n",
 556 |               "      <td>0.0</td>\n",
 557 |               "      <td>0.0</td>\n",
 558 |               "      <td>0.0</td>\n",
 559 |               "      <td>0.0</td>\n",
 560 |               "      <td>0.0</td>\n",
 561 |               "      <td>0.0</td>\n",
 562 |               "      <td>0.0</td>\n",
 563 |               "      <td>0.0</td>\n",
 564 |               "      <td>0.0</td>\n",
 565 |               "      <td>0.0</td>\n",
 566 |               "      <td>0.0</td>\n",
 567 |               "      <td>0.0</td>\n",
 568 |               "      <td>0.0</td>\n",
 569 |               "      <td>0.0</td>\n",
 570 |               "      <td>0.0</td>\n",
 571 |               "      <td>0.0</td>\n",
 572 |               "      <td>0.0</td>\n",
 573 |               "    </tr>\n",
 574 |               "    <tr>\n",
 575 |               "      <th>2</th>\n",
 576 |               "      <td>7901622466</td>\n",
 577 |               "      <td>On Fire</td>\n",
 578 |               "      <td>0.0</td>\n",
 579 |               "      <td>0.0</td>\n",
 580 |               "      <td>0.0</td>\n",
 581 |               "      <td>0.0</td>\n",
 582 |               "      <td>0.0</td>\n",
 583 |               "      <td>0.0</td>\n",
 584 |               "      <td>0.0</td>\n",
 585 |               "      <td>0.0</td>\n",
 586 |               "      <td>0.0</td>\n",
 587 |               "      <td>0.0</td>\n",
 588 |               "      <td>0.0</td>\n",
 589 |               "      <td>0.0</td>\n",
 590 |               "      <td>0.0</td>\n",
 591 |               "      <td>0.0</td>\n",
 592 |               "      <td>0.0</td>\n",
 593 |               "      <td>0.0</td>\n",
 594 |               "      <td>0.0</td>\n",
 595 |               "      <td>0.0</td>\n",
 596 |               "      <td>0.0</td>\n",
 597 |               "      <td>0.0</td>\n",
 598 |               "      <td>0.0</td>\n",
 599 |               "      <td>0.0</td>\n",
 600 |               "      <td>0.0</td>\n",
 601 |               "      <td>0.0</td>\n",
 602 |               "      <td>0.0</td>\n",
 603 |               "      <td>0.0</td>\n",
 604 |               "      <td>0.0</td>\n",
 605 |               "      <td>0.0</td>\n",
 606 |               "      <td>0.0</td>\n",
 607 |               "      <td>0.0</td>\n",
 608 |               "      <td>0.0</td>\n",
 609 |               "      <td>0.0</td>\n",
 610 |               "      <td>0.0</td>\n",
 611 |               "      <td>0.0</td>\n",
 612 |               "      <td>0.0</td>\n",
 613 |               "      <td>0.0</td>\n",
 614 |               "      <td>0.0</td>\n",
 615 |               "      <td>0.0</td>\n",
 616 |               "      <td>...</td>\n",
 617 |               "      <td>0.0</td>\n",
 618 |               "      <td>0.0</td>\n",
 619 |               "      <td>0.0</td>\n",
 620 |               "      <td>0.0</td>\n",
 621 |               "      <td>0.0</td>\n",
 622 |               "      <td>0.0</td>\n",
 623 |               "      <td>0.0</td>\n",
 624 |               "      <td>0.0</td>\n",
 625 |               "      <td>0.0</td>\n",
 626 |               "      <td>0.0</td>\n",
 627 |               "      <td>0.0</td>\n",
 628 |               "      <td>0.0</td>\n",
 629 |               "      <td>0.0</td>\n",
 630 |               "      <td>0.0</td>\n",
 631 |               "      <td>0.0</td>\n",
 632 |               "      <td>0.0</td>\n",
 633 |               "      <td>0.0</td>\n",
 634 |               "      <td>0.0</td>\n",
 635 |               "      <td>0.0</td>\n",
 636 |               "      <td>0.0</td>\n",
 637 |               "      <td>0.0</td>\n",
 638 |               "      <td>0.0</td>\n",
 639 |               "      <td>0.0</td>\n",
 640 |               "      <td>0.0</td>\n",
 641 |               "      <td>0.0</td>\n",
 642 |               "      <td>0.0</td>\n",
 643 |               "      <td>0.0</td>\n",
 644 |               "      <td>0.0</td>\n",
 645 |               "      <td>0.0</td>\n",
 646 |               "      <td>0.0</td>\n",
 647 |               "      <td>0.0</td>\n",
 648 |               "      <td>0.0</td>\n",
 649 |               "      <td>0.0</td>\n",
 650 |               "      <td>0.0</td>\n",
 651 |               "      <td>0.0</td>\n",
 652 |               "      <td>0.0</td>\n",
 653 |               "      <td>0.0</td>\n",
 654 |               "      <td>0.0</td>\n",
 655 |               "      <td>0.0</td>\n",
 656 |               "      <td>0.0</td>\n",
 657 |               "    </tr>\n",
 658 |               "    <tr>\n",
 659 |               "      <th>3</th>\n",
 660 |               "      <td>B0000000ZW</td>\n",
 661 |               "      <td>Changing Faces</td>\n",
 662 |               "      <td>0.0</td>\n",
 663 |               "      <td>0.0</td>\n",
 664 |               "      <td>0.0</td>\n",
 665 |               "      <td>0.0</td>\n",
 666 |               "      <td>0.0</td>\n",
 667 |               "      <td>0.0</td>\n",
 668 |               "      <td>0.0</td>\n",
 669 |               "      <td>0.0</td>\n",
 670 |               "      <td>0.0</td>\n",
 671 |               "      <td>0.0</td>\n",
 672 |               "      <td>0.0</td>\n",
 673 |               "      <td>0.0</td>\n",
 674 |               "      <td>0.0</td>\n",
 675 |               "      <td>0.0</td>\n",
 676 |               "      <td>0.0</td>\n",
 677 |               "      <td>0.0</td>\n",
 678 |               "      <td>0.0</td>\n",
 679 |               "      <td>0.0</td>\n",
 680 |               "      <td>0.0</td>\n",
 681 |               "      <td>0.0</td>\n",
 682 |               "      <td>0.0</td>\n",
 683 |               "      <td>0.0</td>\n",
 684 |               "      <td>0.0</td>\n",
 685 |               "      <td>0.0</td>\n",
 686 |               "      <td>0.0</td>\n",
 687 |               "      <td>0.0</td>\n",
 688 |               "      <td>0.0</td>\n",
 689 |               "      <td>0.0</td>\n",
 690 |               "      <td>0.0</td>\n",
 691 |               "      <td>0.0</td>\n",
 692 |               "      <td>0.0</td>\n",
 693 |               "      <td>0.0</td>\n",
 694 |               "      <td>0.0</td>\n",
 695 |               "      <td>0.0</td>\n",
 696 |               "      <td>0.0</td>\n",
 697 |               "      <td>0.0</td>\n",
 698 |               "      <td>0.0</td>\n",
 699 |               "      <td>0.0</td>\n",
 700 |               "      <td>...</td>\n",
 701 |               "      <td>0.0</td>\n",
 702 |               "      <td>0.0</td>\n",
 703 |               "      <td>0.0</td>\n",
 704 |               "      <td>0.0</td>\n",
 705 |               "      <td>0.0</td>\n",
 706 |               "      <td>0.0</td>\n",
 707 |               "      <td>0.0</td>\n",
 708 |               "      <td>0.0</td>\n",
 709 |               "      <td>0.0</td>\n",
 710 |               "      <td>0.0</td>\n",
 711 |               "      <td>0.0</td>\n",
 712 |               "      <td>0.0</td>\n",
 713 |               "      <td>0.0</td>\n",
 714 |               "      <td>0.0</td>\n",
 715 |               "      <td>0.0</td>\n",
 716 |               "      <td>0.0</td>\n",
 717 |               "      <td>0.0</td>\n",
 718 |               "      <td>0.0</td>\n",
 719 |               "      <td>0.0</td>\n",
 720 |               "      <td>0.0</td>\n",
 721 |               "      <td>0.0</td>\n",
 722 |               "      <td>0.0</td>\n",
 723 |               "      <td>0.0</td>\n",
 724 |               "      <td>0.0</td>\n",
 725 |               "      <td>0.0</td>\n",
 726 |               "      <td>0.0</td>\n",
 727 |               "      <td>0.0</td>\n",
 728 |               "      <td>0.0</td>\n",
 729 |               "      <td>0.0</td>\n",
 730 |               "      <td>0.0</td>\n",
 731 |               "      <td>0.0</td>\n",
 732 |               "      <td>0.0</td>\n",
 733 |               "      <td>0.0</td>\n",
 734 |               "      <td>0.0</td>\n",
 735 |               "      <td>0.0</td>\n",
 736 |               "      <td>0.0</td>\n",
 737 |               "      <td>0.0</td>\n",
 738 |               "      <td>0.0</td>\n",
 739 |               "      <td>0.0</td>\n",
 740 |               "      <td>0.0</td>\n",
 741 |               "    </tr>\n",
 742 |               "    <tr>\n",
 743 |               "      <th>4</th>\n",
 744 |               "      <td>B00000016W</td>\n",
 745 |               "      <td>Pet Sounds</td>\n",
 746 |               "      <td>0.0</td>\n",
 747 |               "      <td>0.0</td>\n",
 748 |               "      <td>0.0</td>\n",
 749 |               "      <td>0.0</td>\n",
 750 |               "      <td>0.0</td>\n",
 751 |               "      <td>0.0</td>\n",
 752 |               "      <td>0.0</td>\n",
 753 |               "      <td>0.0</td>\n",
 754 |               "      <td>0.0</td>\n",
 755 |               "      <td>0.0</td>\n",
 756 |               "      <td>0.0</td>\n",
 757 |               "      <td>0.0</td>\n",
 758 |               "      <td>0.0</td>\n",
 759 |               "      <td>0.0</td>\n",
 760 |               "      <td>1.0</td>\n",
 761 |               "      <td>0.0</td>\n",
 762 |               "      <td>0.0</td>\n",
 763 |               "      <td>0.0</td>\n",
 764 |               "      <td>0.0</td>\n",
 765 |               "      <td>0.0</td>\n",
 766 |               "      <td>0.0</td>\n",
 767 |               "      <td>0.0</td>\n",
 768 |               "      <td>0.0</td>\n",
 769 |               "      <td>0.0</td>\n",
 770 |               "      <td>0.0</td>\n",
 771 |               "      <td>0.0</td>\n",
 772 |               "      <td>0.0</td>\n",
 773 |               "      <td>0.0</td>\n",
 774 |               "      <td>0.0</td>\n",
 775 |               "      <td>0.0</td>\n",
 776 |               "      <td>0.0</td>\n",
 777 |               "      <td>0.0</td>\n",
 778 |               "      <td>0.0</td>\n",
 779 |               "      <td>0.0</td>\n",
 780 |               "      <td>0.0</td>\n",
 781 |               "      <td>1.0</td>\n",
 782 |               "      <td>0.0</td>\n",
 783 |               "      <td>0.0</td>\n",
 784 |               "      <td>...</td>\n",
 785 |               "      <td>0.0</td>\n",
 786 |               "      <td>0.0</td>\n",
 787 |               "      <td>0.0</td>\n",
 788 |               "      <td>0.0</td>\n",
 789 |               "      <td>0.0</td>\n",
 790 |               "      <td>0.0</td>\n",
 791 |               "      <td>0.0</td>\n",
 792 |               "      <td>0.0</td>\n",
 793 |               "      <td>0.0</td>\n",
 794 |               "      <td>0.0</td>\n",
 795 |               "      <td>0.0</td>\n",
 796 |               "      <td>0.0</td>\n",
 797 |               "      <td>0.0</td>\n",
 798 |               "      <td>0.0</td>\n",
 799 |               "      <td>0.0</td>\n",
 800 |               "      <td>0.0</td>\n",
 801 |               "      <td>0.0</td>\n",
 802 |               "      <td>0.0</td>\n",
 803 |               "      <td>0.0</td>\n",
 804 |               "      <td>0.0</td>\n",
 805 |               "      <td>0.0</td>\n",
 806 |               "      <td>0.0</td>\n",
 807 |               "      <td>0.0</td>\n",
 808 |               "      <td>0.0</td>\n",
 809 |               "      <td>0.0</td>\n",
 810 |               "      <td>0.0</td>\n",
 811 |               "      <td>0.0</td>\n",
 812 |               "      <td>0.0</td>\n",
 813 |               "      <td>0.0</td>\n",
 814 |               "      <td>0.0</td>\n",
 815 |               "      <td>0.0</td>\n",
 816 |               "      <td>0.0</td>\n",
 817 |               "      <td>0.0</td>\n",
 818 |               "      <td>0.0</td>\n",
 819 |               "      <td>0.0</td>\n",
 820 |               "      <td>0.0</td>\n",
 821 |               "      <td>0.0</td>\n",
 822 |               "      <td>0.0</td>\n",
 823 |               "      <td>0.0</td>\n",
 824 |               "      <td>0.0</td>\n",
 825 |               "    </tr>\n",
 826 |               "  </tbody>\n",
 827 |               "</table>\n",
 828 |               "<p>5 rows × 463 columns</p>\n",
 829 |               "</div>"
 830 |             ],
 831 |             "text/plain": [
 832 |               "         asin                 title  ...  World Dance  World Music\n",
 833 |               "0  5555991584       Memory of Trees  ...          0.0          0.0\n",
 834 |               "1  6308051551  Dont Drink His Blood  ...          0.0          0.0\n",
 835 |               "2  7901622466               On Fire  ...          0.0          0.0\n",
 836 |               "3  B0000000ZW        Changing Faces  ...          0.0          0.0\n",
 837 |               "4  B00000016W            Pet Sounds  ...          0.0          0.0\n",
 838 |               "\n",
 839 |               "[5 rows x 463 columns]"
 840 |             ]
 841 |           },
 842 |           "metadata": {
 843 |             "tags": []
 844 |           },
 845 |           "execution_count": 35
 846 |         }
 847 |       ]
 848 |     },
 849 |     {
 850 |       "cell_type": "code",
 851 |       "metadata": {
 852 |         "id": "5_SI0EI7A1Gh",
 853 |         "colab_type": "code",
 854 |         "colab": {
 855 |           "base_uri": "https://localhost:8080/",
 856 |           "height": 204
 857 |         },
 858 |         "outputId": "03785f7b-5fc7-4c57-c65b-b99a18a1daac"
 859 |       },
 860 |       "source": [
 861 |         "df_recsys = dataset[['reviewerID', 'asin', 'overall']] \n",
 862 |         "df_recsys.head()"
 863 |       ],
 864 |       "execution_count": 36,
 865 |       "outputs": [
 866 |         {
 867 |           "output_type": "execute_result",
 868 |           "data": {
 869 |             "text/html": [
 870 |               "<div>\n",
 871 |               "<style scoped>\n",
 872 |               "    .dataframe tbody tr th:only-of-type {\n",
 873 |               "        vertical-align: middle;\n",
 874 |               "    }\n",
 875 |               "\n",
 876 |               "    .dataframe tbody tr th {\n",
 877 |               "        vertical-align: top;\n",
 878 |               "    }\n",
 879 |               "\n",
 880 |               "    .dataframe thead th {\n",
 881 |               "        text-align: right;\n",
 882 |               "    }\n",
 883 |               "</style>\n",
 884 |               "<table border=\"1\" class=\"dataframe\">\n",
 885 |               "  <thead>\n",
 886 |               "    <tr style=\"text-align: right;\">\n",
 887 |               "      <th></th>\n",
 888 |               "      <th>reviewerID</th>\n",
 889 |               "      <th>asin</th>\n",
 890 |               "      <th>overall</th>\n",
 891 |               "    </tr>\n",
 892 |               "  </thead>\n",
 893 |               "  <tbody>\n",
 894 |               "    <tr>\n",
 895 |               "      <th>0</th>\n",
 896 |               "      <td>A3EBHHCZO6V2A4</td>\n",
 897 |               "      <td>5555991584</td>\n",
 898 |               "      <td>5</td>\n",
 899 |               "    </tr>\n",
 900 |               "    <tr>\n",
 901 |               "      <th>1</th>\n",
 902 |               "      <td>AZPWAXJG9OJXV</td>\n",
 903 |               "      <td>5555991584</td>\n",
 904 |               "      <td>5</td>\n",
 905 |               "    </tr>\n",
 906 |               "    <tr>\n",
 907 |               "      <th>2</th>\n",
 908 |               "      <td>A38IRL0X2T4DPF</td>\n",
 909 |               "      <td>5555991584</td>\n",
 910 |               "      <td>5</td>\n",
 911 |               "    </tr>\n",
 912 |               "    <tr>\n",
 913 |               "      <th>3</th>\n",
 914 |               "      <td>A22IK3I6U76GX0</td>\n",
 915 |               "      <td>5555991584</td>\n",
 916 |               "      <td>5</td>\n",
 917 |               "    </tr>\n",
 918 |               "    <tr>\n",
 919 |               "      <th>4</th>\n",
 920 |               "      <td>A1AISPOIIHTHXX</td>\n",
 921 |               "      <td>5555991584</td>\n",
 922 |               "      <td>4</td>\n",
 923 |               "    </tr>\n",
 924 |               "  </tbody>\n",
 925 |               "</table>\n",
 926 |               "</div>"
 927 |             ],
 928 |             "text/plain": [
 929 |               "       reviewerID        asin  overall\n",
 930 |               "0  A3EBHHCZO6V2A4  5555991584        5\n",
 931 |               "1   AZPWAXJG9OJXV  5555991584        5\n",
 932 |               "2  A38IRL0X2T4DPF  5555991584        5\n",
 933 |               "3  A22IK3I6U76GX0  5555991584        5\n",
 934 |               "4  A1AISPOIIHTHXX  5555991584        4"
 935 |             ]
 936 |           },
 937 |           "metadata": {
 938 |             "tags": []
 939 |           },
 940 |           "execution_count": 36
 941 |         }
 942 |       ]
 943 |     },
 944 |     {
 945 |       "cell_type": "code",
 946 |       "metadata": {
 947 |         "id": "Ig3bImzsA1Gm",
 948 |         "colab_type": "code",
 949 |         "colab": {}
 950 |       },
 951 |       "source": [
 952 |         "df_recsys = df_recsys.merge(dataset_metadata[['asin', 'title']])"
 953 |       ],
 954 |       "execution_count": 0,
 955 |       "outputs": []
 956 |     },
 957 |     {
 958 |       "cell_type": "code",
 959 |       "metadata": {
 960 |         "id": "EHTjinG5A1Gq",
 961 |         "colab_type": "code",
 962 |         "colab": {
 963 |           "base_uri": "https://localhost:8080/",
 964 |           "height": 51
 965 |         },
 966 |         "outputId": "707561f9-176d-4cd1-aec2-9f3eed7825bd"
 967 |       },
 968 |       "source": [
 969 |         "# unique users\n",
 970 |         "df_recsys.reviewerID.unique()"
 971 |       ],
 972 |       "execution_count": 38,
 973 |       "outputs": [
 974 |         {
 975 |           "output_type": "execute_result",
 976 |           "data": {
 977 |             "text/plain": [
 978 |               "array(['A3EBHHCZO6V2A4', 'AZPWAXJG9OJXV', 'A38IRL0X2T4DPF', ...,\n",
 979 |               "       'A3IZB368BG43JS', 'A1TPW86OHXTXFC', 'AVSVOKDI0AGR7'], dtype=object)"
 980 |             ]
 981 |           },
 982 |           "metadata": {
 983 |             "tags": []
 984 |           },
 985 |           "execution_count": 38
 986 |         }
 987 |       ]
 988 |     },
 989 |     {
 990 |       "cell_type": "code",
 991 |       "metadata": {
 992 |         "id": "Nre-Q5NIA1Gx",
 993 |         "colab_type": "code",
 994 |         "colab": {
 995 |           "base_uri": "https://localhost:8080/",
 996 |           "height": 51
 997 |         },
 998 |         "outputId": "536ad431-244c-4935-c76a-9e9e4be17878"
 999 |       },
1000 |       "source": [
1001 |         "# unique items\n",
1002 |         "df_recsys.asin.unique()"
1003 |       ],
1004 |       "execution_count": 39,
1005 |       "outputs": [
1006 |         {
1007 |           "output_type": "execute_result",
1008 |           "data": {
1009 |             "text/plain": [
1010 |               "array(['5555991584', 'B0000000ZW', 'B00000016T', ..., 'B000FBGBQ6',\n",
1011 |               "       'B000FDEUI0', 'B000FDFRX2'], dtype=object)"
1012 |             ]
1013 |           },
1014 |           "metadata": {
1015 |             "tags": []
1016 |           },
1017 |           "execution_count": 39
1018 |         }
1019 |       ]
1020 |     },
1021 |     {
1022 |       "cell_type": "code",
1023 |       "metadata": {
1024 |         "id": "1PgnxlkhA1G1",
1025 |         "colab_type": "code",
1026 |         "colab": {
1027 |           "base_uri": "https://localhost:8080/",
1028 |           "height": 204
1029 |         },
1030 |         "outputId": "c98e012e-0e8d-42e3-b00d-330c70d1c598"
1031 |       },
1032 |       "source": [
1033 |         "df_recsys.tail()"
1034 |       ],
1035 |       "execution_count": 40,
1036 |       "outputs": [
1037 |         {
1038 |           "output_type": "execute_result",
1039 |           "data": {
1040 |             "text/html": [
1041 |               "<div>\n",
1042 |               "<style scoped>\n",
1043 |               "    .dataframe tbody tr th:only-of-type {\n",
1044 |               "        vertical-align: middle;\n",
1045 |               "    }\n",
1046 |               "\n",
1047 |               "    .dataframe tbody tr th {\n",
1048 |               "        vertical-align: top;\n",
1049 |               "    }\n",
1050 |               "\n",
1051 |               "    .dataframe thead th {\n",
1052 |               "        text-align: right;\n",
1053 |               "    }\n",
1054 |               "</style>\n",
1055 |               "<table border=\"1\" class=\"dataframe\">\n",
1056 |               "  <thead>\n",
1057 |               "    <tr style=\"text-align: right;\">\n",
1058 |               "      <th></th>\n",
1059 |               "      <th>reviewerID</th>\n",
1060 |               "      <th>asin</th>\n",
1061 |               "      <th>overall</th>\n",
1062 |               "      <th>title</th>\n",
1063 |               "    </tr>\n",
1064 |               "  </thead>\n",
1065 |               "  <tbody>\n",
1066 |               "    <tr>\n",
1067 |               "      <th>51791</th>\n",
1068 |               "      <td>A2LZJ5J9H862SN</td>\n",
1069 |               "      <td>B000FDFRX2</td>\n",
1070 |               "      <td>5</td>\n",
1071 |               "      <td>The Best Of Survivor</td>\n",
1072 |               "    </tr>\n",
1073 |               "    <tr>\n",
1074 |               "      <th>51792</th>\n",
1075 |               "      <td>A14W8HXP3RM3ZS</td>\n",
1076 |               "      <td>B000FDFRX2</td>\n",
1077 |               "      <td>3</td>\n",
1078 |               "      <td>The Best Of Survivor</td>\n",
1079 |               "    </tr>\n",
1080 |               "    <tr>\n",
1081 |               "      <th>51793</th>\n",
1082 |               "      <td>AIMMIYQCNGM24</td>\n",
1083 |               "      <td>B000FDFRX2</td>\n",
1084 |               "      <td>5</td>\n",
1085 |               "      <td>The Best Of Survivor</td>\n",
1086 |               "    </tr>\n",
1087 |               "    <tr>\n",
1088 |               "      <th>51794</th>\n",
1089 |               "      <td>AGGC3BHIG6A5K</td>\n",
1090 |               "      <td>B000FDFRX2</td>\n",
1091 |               "      <td>5</td>\n",
1092 |               "      <td>The Best Of Survivor</td>\n",
1093 |               "    </tr>\n",
1094 |               "    <tr>\n",
1095 |               "      <th>51795</th>\n",
1096 |               "      <td>A3464G00K8ZYD1</td>\n",
1097 |               "      <td>B000FDFRX2</td>\n",
1098 |               "      <td>5</td>\n",
1099 |               "      <td>The Best Of Survivor</td>\n",
1100 |               "    </tr>\n",
1101 |               "  </tbody>\n",
1102 |               "</table>\n",
1103 |               "</div>"
1104 |             ],
1105 |             "text/plain": [
1106 |               "           reviewerID        asin  overall                 title\n",
1107 |               "51791  A2LZJ5J9H862SN  B000FDFRX2        5  The Best Of Survivor\n",
1108 |               "51792  A14W8HXP3RM3ZS  B000FDFRX2        3  The Best Of Survivor\n",
1109 |               "51793   AIMMIYQCNGM24  B000FDFRX2        5  The Best Of Survivor\n",
1110 |               "51794   AGGC3BHIG6A5K  B000FDFRX2        5  The Best Of Survivor\n",
1111 |               "51795  A3464G00K8ZYD1  B000FDFRX2        5  The Best Of Survivor"
1112 |             ]
1113 |           },
1114 |           "metadata": {
1115 |             "tags": []
1116 |           },
1117 |           "execution_count": 40
1118 |         }
1119 |       ]
1120 |     },
1121 |     {
1122 |       "cell_type": "markdown",
1123 |       "metadata": {
1124 |         "id": "ZZUCE_YSA1G6",
1125 |         "colab_type": "text"
1126 |       },
1127 |       "source": [
1128 |         "### Map users and items"
1129 |       ]
1130 |     },
1131 |     {
1132 |       "cell_type": "code",
1133 |       "metadata": {
1134 |         "id": "PJztpr4NA1G8",
1135 |         "colab_type": "code",
1136 |         "colab": {}
1137 |       },
1138 |       "source": [
1139 |         "map_users = {user: u_id for u_id, user in enumerate(df_recsys.reviewerID.unique())}\n",
1140 |         "map_items = {item: i_id for i_id, item in enumerate(df_recsys.asin.unique())}"
1141 |       ],
1142 |       "execution_count": 0,
1143 |       "outputs": []
1144 |     },
1145 |     {
1146 |       "cell_type": "code",
1147 |       "metadata": {
1148 |         "id": "FrynmP1eA1HA",
1149 |         "colab_type": "code",
1150 |         "colab": {}
1151 |       },
1152 |       "source": [
1153 |         "df_recsys['asin'] = df_recsys['asin'].map(map_items)\n",
1154 |         "df_recsys['reviewerID'] = df_recsys['reviewerID'].map(map_users)"
1155 |       ],
1156 |       "execution_count": 0,
1157 |       "outputs": []
1158 |     },
1159 |     {
1160 |       "cell_type": "code",
1161 |       "metadata": {
1162 |         "id": "LMtVjn6JA1HF",
1163 |         "colab_type": "code",
1164 |         "colab": {
1165 |           "base_uri": "https://localhost:8080/",
1166 |           "height": 204
1167 |         },
1168 |         "outputId": "a7475e4d-06c9-47b3-bcb6-e46dca74b7cd"
1169 |       },
1170 |       "source": [
1171 |         "df_recsys.head()"
1172 |       ],
1173 |       "execution_count": 43,
1174 |       "outputs": [
1175 |         {
1176 |           "output_type": "execute_result",
1177 |           "data": {
1178 |             "text/html": [
1179 |               "<div>\n",
1180 |               "<style scoped>\n",
1181 |               "    .dataframe tbody tr th:only-of-type {\n",
1182 |               "        vertical-align: middle;\n",
1183 |               "    }\n",
1184 |               "\n",
1185 |               "    .dataframe tbody tr th {\n",
1186 |               "        vertical-align: top;\n",
1187 |               "    }\n",
1188 |               "\n",
1189 |               "    .dataframe thead th {\n",
1190 |               "        text-align: right;\n",
1191 |               "    }\n",
1192 |               "</style>\n",
1193 |               "<table border=\"1\" class=\"dataframe\">\n",
1194 |               "  <thead>\n",
1195 |               "    <tr style=\"text-align: right;\">\n",
1196 |               "      <th></th>\n",
1197 |               "      <th>reviewerID</th>\n",
1198 |               "      <th>asin</th>\n",
1199 |               "      <th>overall</th>\n",
1200 |               "      <th>title</th>\n",
1201 |               "    </tr>\n",
1202 |               "  </thead>\n",
1203 |               "  <tbody>\n",
1204 |               "    <tr>\n",
1205 |               "      <th>0</th>\n",
1206 |               "      <td>0</td>\n",
1207 |               "      <td>0</td>\n",
1208 |               "      <td>5</td>\n",
1209 |               "      <td>Memory of Trees</td>\n",
1210 |               "    </tr>\n",
1211 |               "    <tr>\n",
1212 |               "      <th>1</th>\n",
1213 |               "      <td>1</td>\n",
1214 |               "      <td>0</td>\n",
1215 |               "      <td>5</td>\n",
1216 |               "      <td>Memory of Trees</td>\n",
1217 |               "    </tr>\n",
1218 |               "    <tr>\n",
1219 |               "      <th>2</th>\n",
1220 |               "      <td>2</td>\n",
1221 |               "      <td>0</td>\n",
1222 |               "      <td>5</td>\n",
1223 |               "      <td>Memory of Trees</td>\n",
1224 |               "    </tr>\n",
1225 |               "    <tr>\n",
1226 |               "      <th>3</th>\n",
1227 |               "      <td>3</td>\n",
1228 |               "      <td>0</td>\n",
1229 |               "      <td>5</td>\n",
1230 |               "      <td>Memory of Trees</td>\n",
1231 |               "    </tr>\n",
1232 |               "    <tr>\n",
1233 |               "      <th>4</th>\n",
1234 |               "      <td>4</td>\n",
1235 |               "      <td>0</td>\n",
1236 |               "      <td>4</td>\n",
1237 |               "      <td>Memory of Trees</td>\n",
1238 |               "    </tr>\n",
1239 |               "  </tbody>\n",
1240 |               "</table>\n",
1241 |               "</div>"
1242 |             ],
1243 |             "text/plain": [
1244 |               "   reviewerID  asin  overall            title\n",
1245 |               "0           0     0        5  Memory of Trees\n",
1246 |               "1           1     0        5  Memory of Trees\n",
1247 |               "2           2     0        5  Memory of Trees\n",
1248 |               "3           3     0        5  Memory of Trees\n",
1249 |               "4           4     0        4  Memory of Trees"
1250 |             ]
1251 |           },
1252 |           "metadata": {
1253 |             "tags": []
1254 |           },
1255 |           "execution_count": 43
1256 |         }
1257 |       ]
1258 |     },
1259 |     {
1260 |       "cell_type": "code",
1261 |       "metadata": {
1262 |         "id": "5ka4hSNAA1HK",
1263 |         "colab_type": "code",
1264 |         "colab": {}
1265 |       },
1266 |       "source": [
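1267 |         "# Map each integer item id to its title. Every row of an item carries the same title,\n",
1268 |         "# so a vectorized zip builds the same dictionary as a row-by-row loop, only faster.\n",
1269 |         "asin_title = dict(zip(df_recsys['asin'], df_recsys['title']))\n",
1270 |         "\n",
1271 |         "# np.save pickles the dict; reload it with np.load(..., allow_pickle=True).item()\n",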
1272 |         "np.save('map_tilte.npy', asin_title)"
1273 |       ],
1274 |       "execution_count": 0,
1275 |       "outputs": []
1276 |     },
1277 |     {
1278 |       "cell_type": "markdown",
1279 |       "metadata": {
1280 |         "id": "jsZzl0wbA1HP",
1281 |         "colab_type": "text"
1282 |       },
1283 |       "source": [
1284 |         "### Divide dataset\n",
1285 |         "\n",
1286 |         "Hold out 33% of the interactions as a test set with scikit-learn's `train_test_split`: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html"
1287 |       ]
1288 |     },
1289 |     {
1290 |       "cell_type": "code",
1291 |       "metadata": {
1292 |         "id": "0p-YlyM6A1HQ",
1293 |         "colab_type": "code",
1294 |         "colab": {}
1295 |       },
1296 |       "source": [
1297 |         "from sklearn.model_selection import train_test_split"
1298 |       ],
1299 |       "execution_count": 0,
1300 |       "outputs": []
1301 |     },
1302 |     {
1303 |       "cell_type": "code",
1304 |       "metadata": {
1305 |         "id": "h9Co8FfNA1HU",
1306 |         "colab_type": "code",
1307 |         "colab": {}
1308 |       },
1309 |       "source": [
1310 |         "train, test = train_test_split(df_recsys, test_size=0.33, random_state=42)\n",
1311 |         "train.to_csv('train.dat', index=False, header=False, sep='\\t')\n",
1312 |         "test.to_csv('test.dat', index=False, header=False, sep='\\t')"
1313 |       ],
1314 |       "execution_count": 0,
1315 |       "outputs": []
1316 |     },
1317 |     {
1318 |       "cell_type": "code",
1319 |       "metadata": {
1320 |         "id": "qMY469OKBwJv",
1321 |         "colab_type": "code",
1322 |         "colab": {
1323 |           "base_uri": "https://localhost:8080/",
1324 |           "height": 170
1325 |         },
1326 |         "outputId": "3186d9a2-ed33-49f1-bb50-bcc1834796b4"
1327 |       },
1328 |       "source": [
1329 |         "!ls -l"
1330 |       ],
1331 |       "execution_count": 47,
1332 |       "outputs": [
1333 |         {
1334 |           "output_type": "stream",
1335 |           "text": [
1336 |             "total 66312\n",
1337 |             "drwxrwxr-x 2 1001 1001     4096 Sep  4 20:33 AmazonMusic/\n",
1338 |             "-rw-r--r-- 1 root root 22112728 Sep  4 20:43 AmazonMusic.tar.xz\n",
1339 |             "-rw-r--r-- 1 root root 22112728 Sep  4 20:46 AmazonMusic.tar.xz.1\n",
1340 |             "-rw-r--r-- 1 root root 22112728 Sep  4 20:48 AmazonMusic.tar.xz.2\n",
1341 |             "-rw-r--r-- 1 root root    74684 Sep  4 20:48 map_tilte.npy\n",
1342 |             "drwxr-xr-x 1 root root     4096 Aug 27 16:17 sample_data/\n",
1343 |             "-rw-r--r-- 1 root root   484253 Sep  4 20:48 test.dat\n",
1344 |             "-rw-r--r-- 1 root root   985210 Sep  4 20:48 train.dat\n"
1345 |           ],
1346 |           "name": "stdout"
1347 |         }
1348 |       ]
1349 |     },
1350 |     {
1351 |       "cell_type": "markdown",
1352 |       "metadata": {
1353 |         "id": "2n6LQb5cA1HZ",
1354 |         "colab_type": "text"
1355 |       },
1356 |       "source": [
1357 |         "# Case Recommender"
1358 |       ]
1359 |     },
1360 |     {
1361 |       "cell_type": "markdown",
1362 |       "metadata": {
1363 |         "id": "XzzXj3o1A1Hc",
1364 |         "colab_type": "text"
1365 |       },
1366 |       "source": [
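1367 |         "Instead of a single hold-out split, Case Recommender can generate k-fold cross-validation splits itself (here `dataset` stands for the ratings file path and `dir_path` for the output directory):\n",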
1368 |         "\n",
1369 |         "> from caserec.utils.split_database import SplitDatabase\n",
1370 |         "\n",
1371 |         "> SplitDatabase(input_file=dataset, dir_folds=dir_path, n_splits=10).k_fold_cross_validation()"
1372 |       ]
1373 |     },
1374 |     {
1375 |       "cell_type": "markdown",
1376 |       "metadata": {
1377 |         "id": "bV2yax1HA1Hd",
1378 |         "colab_type": "text"
1379 |       },
1380 |       "source": [
1381 |         "### Rating Prediction"
1382 |       ]
1383 |     },
1384 |     {
1385 |       "cell_type": "code",
1386 |       "metadata": {
1387 |         "id": "YK7liINfA1He",
1388 |         "colab_type": "code",
1389 |         "colab": {
1390 |           "base_uri": "https://localhost:8080/",
1391 |           "height": 170
1392 |         },
1393 |         "outputId": "8dd5f560-e485-40b0-91da-b3e06281aaee"
1394 |       },
1395 |       "source": [
1396 |         "from caserec.recommenders.rating_prediction.most_popular import MostPopular\n",
1397 |         "\n",
1398 |         "MostPopular('train.dat', 'test.dat', 'rp_mostPopular.dat').compute()"
1399 |       ],
1400 |       "execution_count": 48,
1401 |       "outputs": [
1402 |         {
1403 |           "output_type": "stream",
1404 |           "text": [
1405 |             "[Case Recommender: Rating Prediction > Most Popular]\n",
1406 |             "\n",
1407 |             "train data:: 5036 users and 2581 items (34703 interactions) | sparsity:: 99.73%\n",
1408 |             "test data:: 4508 users and 2493 items (17093 interactions) | sparsity:: 99.85%\n",
1409 |             "\n",
1410 |             "prediction_time:: 0.349664 sec\n",
1411 |             "\n",
1412 |             "\n",
1413 |             "Eval:: MAE: 0.744015 RMSE: 1.005638 \n"
1414 |           ],
1415 |           "name": "stdout"
1416 |         }
1417 |       ]
1418 |     },
1419 |     {
1420 |       "cell_type": "code",
1421 |       "metadata": {
1422 |         "id": "n1EcaoRjA1Hi",
1423 |         "colab_type": "code",
1424 |         "colab": {
1425 |           "base_uri": "https://localhost:8080/",
1426 |           "height": 204
1427 |         },
1428 |         "outputId": "8f69e77d-22e1-4987-e672-3dd2c7153a0e"
1429 |       },
1430 |       "source": [
1431 |         "predictions = pd.read_csv('rp_mostPopular.dat', sep='\\t', names=['reviewerID', 'asin', 'rate'])\n",
1432 |         "predictions['title'] = predictions.asin.map(asin_title)\n",
1433 |         "predictions.head()"
1434 |       ],
1435 |       "execution_count": 49,
1436 |       "outputs": [
1437 |         {
1438 |           "output_type": "execute_result",
1439 |           "data": {
1440 |             "text/html": [
1441 |               "<div>\n",
1442 |               "<style scoped>\n",
1443 |               "    .dataframe tbody tr th:only-of-type {\n",
1444 |               "        vertical-align: middle;\n",
1445 |               "    }\n",
1446 |               "\n",
1447 |               "    .dataframe tbody tr th {\n",
1448 |               "        vertical-align: top;\n",
1449 |               "    }\n",
1450 |               "\n",
1451 |               "    .dataframe thead th {\n",
1452 |               "        text-align: right;\n",
1453 |               "    }\n",
1454 |               "</style>\n",
1455 |               "<table border=\"1\" class=\"dataframe\">\n",
1456 |               "  <thead>\n",
1457 |               "    <tr style=\"text-align: right;\">\n",
1458 |               "      <th></th>\n",
1459 |               "      <th>reviewerID</th>\n",
1460 |               "      <th>asin</th>\n",
1461 |               "      <th>rate</th>\n",
1462 |               "      <th>title</th>\n",
1463 |               "    </tr>\n",
1464 |               "  </thead>\n",
1465 |               "  <tbody>\n",
1466 |               "    <tr>\n",
1467 |               "      <th>0</th>\n",
1468 |               "      <td>0</td>\n",
1469 |               "      <td>2471</td>\n",
1470 |               "      <td>3.777778</td>\n",
1471 |               "      <td>Jagged Little Pill Acoustic</td>\n",
1472 |               "    </tr>\n",
1473 |               "    <tr>\n",
1474 |               "      <th>1</th>\n",
1475 |               "      <td>0</td>\n",
1476 |               "      <td>1978</td>\n",
1477 |               "      <td>4.583333</td>\n",
1478 |               "      <td>La Revancha Del Tango</td>\n",
1479 |               "    </tr>\n",
1480 |               "    <tr>\n",
1481 |               "      <th>2</th>\n",
1482 |               "      <td>0</td>\n",
1483 |               "      <td>0</td>\n",
1484 |               "      <td>4.875000</td>\n",
1485 |               "      <td>Memory of Trees</td>\n",
1486 |               "    </tr>\n",
1487 |               "    <tr>\n",
1488 |               "      <th>3</th>\n",
1489 |               "      <td>1</td>\n",
1490 |               "      <td>1272</td>\n",
1491 |               "      <td>4.333333</td>\n",
1492 |               "      <td>Ani Difranco</td>\n",
1493 |               "    </tr>\n",
1494 |               "    <tr>\n",
1495 |               "      <th>4</th>\n",
1496 |               "      <td>1</td>\n",
1497 |               "      <td>667</td>\n",
1498 |               "      <td>4.833333</td>\n",
1499 |               "      <td>For the Roses</td>\n",
1500 |               "    </tr>\n",
1501 |               "  </tbody>\n",
1502 |               "</table>\n",
1503 |               "</div>"
1504 |             ],
1505 |             "text/plain": [
1506 |               "   reviewerID  asin      rate                        title\n",
1507 |               "0           0  2471  3.777778  Jagged Little Pill Acoustic\n",
1508 |               "1           0  1978  4.583333        La Revancha Del Tango\n",
1509 |               "2           0     0  4.875000              Memory of Trees\n",
1510 |               "3           1  1272  4.333333                 Ani Difranco\n",
1511 |               "4           1   667  4.833333                For the Roses"
1512 |             ]
1513 |           },
1514 |           "metadata": {
1515 |             "tags": []
1516 |           },
1517 |           "execution_count": 49
1518 |         }
1519 |       ]
1520 |     },
1521 |     {
1522 |       "cell_type": "markdown",
1523 |       "metadata": {
1524 |         "id": "jR5JDgESA1Hm",
1525 |         "colab_type": "text"
1526 |       },
1527 |       "source": [
1528 |         "### Ranking"
1529 |       ]
1530 |     },
1531 |     {
1532 |       "cell_type": "code",
1533 |       "metadata": {
1534 |         "id": "M-rLU15PA1Hn",
1535 |         "colab_type": "code",
1536 |         "colab": {
1537 |           "base_uri": "https://localhost:8080/",
1538 |           "height": 187
1539 |         },
1540 |         "outputId": "2e8dcc95-fe44-4d53-ae43-6fe10337f30c"
1541 |       },
1542 |       "source": [
1543 |         "from caserec.recommenders.item_recommendation.most_popular import MostPopular\n",
1544 |         "\n",
1545 |         "MostPopular('train.dat', 'test.dat', 'rank_mostPopular.dat').compute(as_table=True, metrics=['NDCG'])"
1546 |       ],
1547 |       "execution_count": 50,
1548 |       "outputs": [
1549 |         {
1550 |           "output_type": "stream",
1551 |           "text": [
1552 |             "[Case Recommender: Item Recommendation > Most Popular]\n",
1553 |             "\n",
1554 |             "train data:: 5036 users and 2581 items (34703 interactions) | sparsity:: 99.73%\n",
1555 |             "test data:: 4508 users and 2493 items (17093 interactions) | sparsity:: 99.85%\n",
1556 |             "\n",
1557 |             "prediction_time:: 96.215000 sec\n",
1558 |             "\n",
1559 |             "\n",
1560 |             "NDCG@1\tNDCG@3\tNDCG@5\tNDCG@10\t\n",
1561 |             "0.019299\t0.041359\t0.051351\t0.065469\t\n"
1562 |           ],
1563 |           "name": "stdout"
1564 |         }
1565 |       ]
1566 |     },
1567 |     {
1568 |       "cell_type": "code",
1569 |       "metadata": {
1570 |         "id": "INpBAxt-A1Hq",
1571 |         "colab_type": "code",
1572 |         "colab": {
1573 |           "base_uri": "https://localhost:8080/",
1574 |           "height": 359
1575 |         },
1576 |         "outputId": "e2fc6830-3931-4feb-8102-81b3a1e4b108"
1577 |       },
1578 |       "source": [
1579 |         "ranking = pd.read_csv('rank_mostPopular.dat', sep='\\t', names=['reviewerID', 'asin', 'score'])\n",
1580 |         "ranking['title'] = ranking.asin.map(asin_title)\n",
1581 |         "ranking.head(10)"
1582 |       ],
1583 |       "execution_count": 51,
1584 |       "outputs": [
1585 |         {
1586 |           "output_type": "execute_result",
1587 |           "data": {
1588 |             "text/html": [
1589 |               "<div>\n",
1590 |               "<style scoped>\n",
1591 |               "    .dataframe tbody tr th:only-of-type {\n",
1592 |               "        vertical-align: middle;\n",
1593 |               "    }\n",
1594 |               "\n",
1595 |               "    .dataframe tbody tr th {\n",
1596 |               "        vertical-align: top;\n",
1597 |               "    }\n",
1598 |               "\n",
1599 |               "    .dataframe thead th {\n",
1600 |               "        text-align: right;\n",
1601 |               "    }\n",
1602 |               "</style>\n",
1603 |               "<table border=\"1\" class=\"dataframe\">\n",
1604 |               "  <thead>\n",
1605 |               "    <tr style=\"text-align: right;\">\n",
1606 |               "      <th></th>\n",
1607 |               "      <th>reviewerID</th>\n",
1608 |               "      <th>asin</th>\n",
1609 |               "      <th>score</th>\n",
1610 |               "      <th>title</th>\n",
1611 |               "    </tr>\n",
1612 |               "  </thead>\n",
1613 |               "  <tbody>\n",
1614 |               "    <tr>\n",
1615 |               "      <th>0</th>\n",
1616 |               "      <td>0</td>\n",
1617 |               "      <td>1770</td>\n",
1618 |               "      <td>596.0</td>\n",
1619 |               "      <td>The Marshall Mathers LP</td>\n",
1620 |               "    </tr>\n",
1621 |               "    <tr>\n",
1622 |               "      <th>1</th>\n",
1623 |               "      <td>0</td>\n",
1624 |               "      <td>2039</td>\n",
1625 |               "      <td>577.0</td>\n",
1626 |               "      <td>The Eminem Show [Limited Edition w/ Bonus DVD]</td>\n",
1627 |               "    </tr>\n",
1628 |               "    <tr>\n",
1629 |               "      <th>2</th>\n",
1630 |               "      <td>0</td>\n",
1631 |               "      <td>2133</td>\n",
1632 |               "      <td>551.0</td>\n",
1633 |               "      <td>Get Rich Or Die Tryin</td>\n",
1634 |               "    </tr>\n",
1635 |               "    <tr>\n",
1636 |               "      <th>3</th>\n",
1637 |               "      <td>0</td>\n",
1638 |               "      <td>169</td>\n",
1639 |               "      <td>511.0</td>\n",
1640 |               "      <td>All Eyez on Me</td>\n",
1641 |               "    </tr>\n",
1642 |               "    <tr>\n",
1643 |               "      <th>4</th>\n",
1644 |               "      <td>0</td>\n",
1645 |               "      <td>2212</td>\n",
1646 |               "      <td>510.0</td>\n",
1647 |               "      <td>Speakerboxxx/ The Love Below</td>\n",
1648 |               "    </tr>\n",
1649 |               "    <tr>\n",
1650 |               "      <th>5</th>\n",
1651 |               "      <td>0</td>\n",
1652 |               "      <td>992</td>\n",
1653 |               "      <td>509.0</td>\n",
1654 |               "      <td>Are You Experienced</td>\n",
1655 |               "    </tr>\n",
1656 |               "    <tr>\n",
1657 |               "      <th>6</th>\n",
1658 |               "      <td>0</td>\n",
1659 |               "      <td>2408</td>\n",
1660 |               "      <td>492.0</td>\n",
1661 |               "      <td>The Documentary</td>\n",
1662 |               "    </tr>\n",
1663 |               "    <tr>\n",
1664 |               "      <th>7</th>\n",
1665 |               "      <td>0</td>\n",
1666 |               "      <td>1665</td>\n",
1667 |               "      <td>480.0</td>\n",
1668 |               "      <td>Toxicity</td>\n",
1669 |               "    </tr>\n",
1670 |               "    <tr>\n",
1671 |               "      <th>8</th>\n",
1672 |               "      <td>0</td>\n",
1673 |               "      <td>1955</td>\n",
1674 |               "      <td>470.0</td>\n",
1675 |               "      <td>Blueprint</td>\n",
1676 |               "    </tr>\n",
1677 |               "    <tr>\n",
1678 |               "      <th>9</th>\n",
1679 |               "      <td>0</td>\n",
1680 |               "      <td>459</td>\n",
1681 |               "      <td>467.0</td>\n",
1682 |               "      <td>Thriller</td>\n",
1683 |               "    </tr>\n",
1684 |               "  </tbody>\n",
1685 |               "</table>\n",
1686 |               "</div>"
1687 |             ],
1688 |             "text/plain": [
1689 |               "   reviewerID  asin  score                                           title\n",
1690 |               "0           0  1770  596.0                         The Marshall Mathers LP\n",
1691 |               "1           0  2039  577.0  The Eminem Show [Limited Edition w/ Bonus DVD]\n",
1692 |               "2           0  2133  551.0                           Get Rich Or Die Tryin\n",
1693 |               "3           0   169  511.0                                  All Eyez on Me\n",
1694 |               "4           0  2212  510.0                    Speakerboxxx/ The Love Below\n",
1695 |               "5           0   992  509.0                             Are You Experienced\n",
1696 |               "6           0  2408  492.0                                 The Documentary\n",
1697 |               "7           0  1665  480.0                                        Toxicity\n",
1698 |               "8           0  1955  470.0                                       Blueprint\n",
1699 |               "9           0   459  467.0                                        Thriller"
1700 |             ]
1701 |           },
1702 |           "metadata": {
1703 |             "tags": []
1704 |           },
1705 |           "execution_count": 51
1706 |         }
1707 |       ]
1708 |     },
1709 |     {
1710 |       "cell_type": "code",
1711 |       "metadata": {
1712 |         "id": "5Y9tzYdEA1Hz",
1713 |         "colab_type": "code",
1714 |         "colab": {
1715 |           "base_uri": "https://localhost:8080/",
1716 |           "height": 173
1717 |         },
1718 |         "outputId": "4426df3a-0392-43f6-b0ef-586ea1cfff57"
1719 |       },
1720 |       "source": [
1721 |         "train[train.reviewerID == 0]"
1722 |       ],
1723 |       "execution_count": 52,
1724 |       "outputs": [
1725 |         {
1726 |           "output_type": "execute_result",
1727 |           "data": {
1728 |             "text/html": [
1729 |               "<div>\n",
1730 |               "<style scoped>\n",
1731 |               "    .dataframe tbody tr th:only-of-type {\n",
1732 |               "        vertical-align: middle;\n",
1733 |               "    }\n",
1734 |               "\n",
1735 |               "    .dataframe tbody tr th {\n",
1736 |               "        vertical-align: top;\n",
1737 |               "    }\n",
1738 |               "\n",
1739 |               "    .dataframe thead th {\n",
1740 |               "        text-align: right;\n",
1741 |               "    }\n",
1742 |               "</style>\n",
1743 |               "<table border=\"1\" class=\"dataframe\">\n",
1744 |               "  <thead>\n",
1745 |               "    <tr style=\"text-align: right;\">\n",
1746 |               "      <th></th>\n",
1747 |               "      <th>reviewerID</th>\n",
1748 |               "      <th>asin</th>\n",
1749 |               "      <th>overall</th>\n",
1750 |               "      <th>title</th>\n",
1751 |               "    </tr>\n",
1752 |               "  </thead>\n",
1753 |               "  <tbody>\n",
1754 |               "    <tr>\n",
1755 |               "      <th>18167</th>\n",
1756 |               "      <td>0</td>\n",
1757 |               "      <td>959</td>\n",
1758 |               "      <td>5</td>\n",
1759 |               "      <td>Ray of Light</td>\n",
1760 |               "    </tr>\n",
1761 |               "    <tr>\n",
1762 |               "      <th>19290</th>\n",
1763 |               "      <td>0</td>\n",
1764 |               "      <td>1011</td>\n",
1765 |               "      <td>5</td>\n",
1766 |               "      <td>Axis</td>\n",
1767 |               "    </tr>\n",
1768 |               "    <tr>\n",
1769 |               "      <th>27005</th>\n",
1770 |               "      <td>0</td>\n",
1771 |               "      <td>1523</td>\n",
1772 |               "      <td>5</td>\n",
1773 |               "      <td>Experience Hendrix</td>\n",
1774 |               "    </tr>\n",
1775 |               "    <tr>\n",
1776 |               "      <th>37443</th>\n",
1777 |               "      <td>0</td>\n",
1778 |               "      <td>2014</td>\n",
1779 |               "      <td>2</td>\n",
1780 |               "      <td>Come Away with Me</td>\n",
1781 |               "    </tr>\n",
1782 |               "  </tbody>\n",
1783 |               "</table>\n",
1784 |               "</div>"
1785 |             ],
1786 |             "text/plain": [
1787 |               "       reviewerID  asin  overall               title\n",
1788 |               "18167           0   959        5        Ray of Light\n",
1789 |               "19290           0  1011        5                Axis\n",
1790 |               "27005           0  1523        5  Experience Hendrix\n",
1791 |               "37443           0  2014        2   Come Away with Me"
1792 |             ]
1793 |           },
1794 |           "metadata": {
1795 |             "tags": []
1796 |           },
1797 |           "execution_count": 52
1798 |         }
1799 |       ]
1800 |     }
1801 |   ]
1802 | }


--------------------------------------------------------------------------------
/Processed Datasets/RetailrocketEcommerce/README.md:
--------------------------------------------------------------------------------
 1 | SUMMARY & USAGE LICENSE
 2 | =============================================
 3 | 
 4 | This dataset is provided by Kaggle through the link: https://www.kaggle.com/retailrocket/ecommerce-dataset
 5 | 
 6 | The data has been collected from a real-world ecommerce website by Retail Rocket (retailrocket.io). It is raw data, i.e. without any content transformations; however, all values are hashed due to confidentiality concerns. The purpose of publishing it is to motivate research in the field of recommender systems with implicit feedback. 
 7 | 
 8 | The behaviour data, i.e. events like clicks, add to carts, transactions, represent interactions that were collected over a period of 4.5 months. A visitor can make three types of events, namely “view”, “addtocart” or “transaction”. In the original dataset, there are 2,756,101 events including 2,664,312 views, 69,332 add to carts and 22,457 transactions produced by 1,407,580 unique visitors. 
 9 | 
10 | 
11 | This data has been organized and cleaned up by Arthur Fortes [1], similar to the MovieLens 100k treatment [2]: 
12 | all users with fewer than 10 interactions and all items with fewer than 40 interactions were removed, and the 
13 | events were separated into one file per event type. 
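
The minimum-interaction filtering described above can be sketched with pandas. This is only an illustration under stated assumptions: the README does not say whether the filter was applied once or repeated until stable, nor in which order, so the helper below (a hypothetical `filter_min_interactions`) does a single pass, users first and then items. The column names `visitorid` and `itemid` come from the file descriptions in this README; the toy data and the small thresholds in the demo are made up.

```python
import pandas as pd


def filter_min_interactions(events, min_user=10, min_item=40,
                            user_col="visitorid", item_col="itemid"):
    """Single-pass filter (an assumption): drop sparse users, then sparse items."""
    # Number of interactions of each row's user, aligned with the rows.
    user_counts = events[user_col].map(events[user_col].value_counts())
    events = events[user_counts >= min_user]
    # Number of interactions of each row's item among the remaining rows.
    item_counts = events[item_col].map(events[item_col].value_counts())
    return events[item_counts >= min_item].reset_index(drop=True)


# Toy data with small thresholds so the effect is visible.
toy = pd.DataFrame({
    "visitorid": [1, 1, 2, 2, 3],
    "itemid": [10, 20, 10, 20, 10],
})
kept = filter_min_interactions(toy, min_user=2, min_item=2)
print(kept)
```

In the toy run, visitor 3 has a single interaction and is removed; both items still have two interactions afterwards, so they are kept. Because removing items can push some users back under the user threshold, a stricter variant would repeat the two steps until nothing changes.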
14 | 
15 | Detailed descriptions of the data files can be found at the end of this file.
16 |  
17 | This dataset consists of:
18 |   * 92,490 interactions from 3,431 users on 8,885 items.
19 |     - History: 3,423 users and 8,878 items (78,371 accesses) 
20 |     - Purchase: 824 users and 3,077 items (5,088 interactions)
21 |     - Add to cart: 1,557 users and 4,447 items (9,028 interactions) 
22 | 
23 | If you have any further questions or comments, please contact me
24 | <fortes.arthur@gmail.com>. 
25 | 
26 | 
27 | ACKNOWLEDGEMENTS
28 | ==============================================
29 | 
30 | Retail Rocket (retailrocket.io) helps web shoppers make better shopping decisions by providing personalized real-time recommendations through multiple channels with over 100MM unique monthly users and 1000+ retail partners over the world.
31 | 
32 | 
33 | DETAILED DESCRIPTIONS OF DATA FILES
34 | ==============================================
35 | 
36 | Here are brief descriptions of the data.
37 | 
38 | view_ecommerce.dat    
39 |                   
40 |                   The full history set, 78,371 accesses by 3,423 users on 8,878 items.
41 |                   Each user has accessed at least 10 items.  Users and items are
42 |                   numbered consecutively from 1.  The data is ordered by user ID. 
43 |                   This is a tab-separated list of 
44 |                   visitorid | itemid | event
45 | 
46 | 
47 | add_to_cart_ecommerce.dat    
48 | 
49 |                   The full add-to-cart set, 9,028 interactions by 1,557 users on 4,447 items.
50 |                   Users and items are numbered consecutively from 1.  
51 |                   The data is ordered by user ID. This is a tab-separated list of 
52 |                   visitorid | itemid | event
53 | 
54 | 
55 | purchase_ecommerce.dat  
56 |                   
57 |                   The full purchase set, 5,088 interactions by 824 users on 3,077 items.
58 |                   Users and items are numbered consecutively from 1. The data is ordered by user ID. 
59 |                   This is a tab-separated list of 
60 |                   visitorid | itemid | event
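
As a minimal sketch of reading these files, the snippet below loads one tab-separated event file with pandas and counts users, items and interactions. It assumes the files have no header row, which this README does not state. The helper name `load_events` is hypothetical, and an in-memory buffer with made-up values stands in for a real file such as view_ecommerce.dat.

```python
import io

import pandas as pd

# Column order as listed in the file descriptions above.
COLUMNS = ["visitorid", "itemid", "event"]


def load_events(source):
    """Read one tab-separated event file (assumed to have no header row)."""
    return pd.read_csv(source, sep="\t", names=COLUMNS)


# In-memory stand-in for a file on disk; the values are made up.
sample = io.StringIO("1\t5\tview\n1\t7\tview\n2\t5\tview\n")
views = load_events(sample)

n_users = views["visitorid"].nunique()
n_items = views["itemid"].nunique()
print(f"{n_users} users, {n_items} items, {len(views)} interactions")
```

Passing a path such as `'view_ecommerce.dat'` instead of the buffer should work the same way once the archive has been extracted, provided the no-header assumption holds.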
61 | 
62 | REFERENCES
63 | ==============================================
64 | 
65 | [1] Da Costa, Arthur Fortes. PhD candidate at the Institute of Mathematical and Computational Sciences, 
66 | University of São Paulo. URL: https://arthurfortes.github.io/
67 | 
68 | 
69 | [2] MovieLens 100K Dataset. Stable benchmark dataset. 100,000 ratings from 1000 users on 1700 movies. 
70 | Released 4/1998. URL: https://grouplens.org/datasets/movielens/100k/
71 | Generated by GroupLens [Department of Computer Science and Engineering at the University of Minnesota].
72 | 


--------------------------------------------------------------------------------
/Processed Datasets/RetailrocketEcommerce/Retailrocket_Ecommerce.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/caserec/Datasets-for-Recommender-Systems/4180b4dc4103452c591a1718560d29bdf1f48540/Processed Datasets/RetailrocketEcommerce/Retailrocket_Ecommerce.zip


--------------------------------------------------------------------------------
/Processed Datasets/Steam/README.md:
--------------------------------------------------------------------------------
 1 | SUMMARY & USAGE LICENSE
 2 | =============================================
 3 | 
 4 | This dataset is provided by Kaggle through the link: https://www.kaggle.com/tamber/steam-video-games/data.
 5 | 
 6 | * Context
 7 | Steam is the world's most popular PC Gaming hub, with over 6,000 games and a community of millions of gamers. With a massive collection that includes everything from AAA blockbusters to small indie titles, great discovery tools are a highly valuable asset for Steam. How can we make them better?
 8 | 
 9 | * Content
10 | This dataset is a list of user behaviors, with columns: user-id, game-title, behavior-name, value. The behaviors included are 'purchase' and 'play'. The value indicates the degree to which the behavior was performed - in the case of 'purchase' the value is always 1, and in the case of 'play' the value represents the number of hours the user has played the game.
11 | 
12 | * Acknowledgements
13 | This dataset is generated entirely from public Steam data, so we want to thank Steam for building such an awesome platform and community!
14 | 
15 | 
16 | This data has been organized by Arthur Fortes [1], who generated new IDs for users and items and separated the 
17 | purchase and play-hours information into different files. 
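
The reorganization described above (new consecutive IDs, one table per behavior) can be approximated as follows. This is a sketch, not the script actually used to build the dataset: the raw column names follow the Content paragraph, the output names `User_ID`, `Game_ID`, `Purchase` and `Hours` follow the file descriptions in this README, and the rows are made up. Assigning IDs in order of first appearance with `pd.factorize` is an assumption.

```python
import pandas as pd

# Made-up rows in the raw layout: user-id, game-title, behavior-name, value.
raw = pd.DataFrame({
    "user-id": [1001, 1001, 1002, 1002],
    "game-title": ["Game A", "Game A", "Game B", "Game A"],
    "behavior-name": ["purchase", "play", "purchase", "purchase"],
    "value": [1.0, 273.0, 1.0, 1.0],
})

# factorize returns 0-based codes in order of first appearance; add 1 for 1-based IDs.
raw["User_ID"] = pd.factorize(raw["user-id"])[0] + 1
raw["Game_ID"] = pd.factorize(raw["game-title"])[0] + 1

# One table per behavior, with the value column renamed to match its meaning.
cols = ["User_ID", "Game_ID", "value"]
purchase = raw.loc[raw["behavior-name"] == "purchase", cols].rename(columns={"value": "Purchase"})
play = raw.loc[raw["behavior-name"] == "play", cols].rename(columns={"value": "Hours"})

print(purchase)
print(play)
```

Each table could then be written with `to_csv(..., sep='\t', header=False, index=False)` to obtain tab-separated files in the layout described in this README.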
18 | 
19 | Detailed descriptions of the data files can be found at the end of this file.
20 |  
21 | This dataset consists of:
22 |   * 200,000 interactions (play / purchase) from 12,393 users on 5,155 games.
23 |     - Play Hours: 11,350 users and 3,600 games (70,490 interactions)
24 |     - Purchase: 12,393 users and 5,155 games (129,512 interactions) 
25 | 
26 | If you have any further questions or comments, please contact me
27 | <fortes.arthur@gmail.com>. 
28 | 
29 | 
30 | DETAILED DESCRIPTIONS OF DATA FILES
31 | ==============================================
32 | 
33 | Here are brief descriptions of the data.
34 | 
35 | items_info.dat    
36 | 
37 |                   Information about the items (games); this is a tab-separated list of
38 |                   Game_ID | Game Name |
39 |                   The item ids are the ones used in the game_purchase.dat 
40 |                   and game_play.dat files.
41 | 
42 | 
43 | users_info.dat    
44 | 
45 |                   ID information about the users; this is a tab-separated list of
46 |                   New_ID | Real_ID |
47 | 
48 |                   The user ids are the ones used in the game_purchase.dat 
49 |                   and game_play.dat files.
50 | 
51 | 
52 | game_purchase.dat 
53 | 
54 |                   The full purchase set: 129,512 interactions by 12,393 users on 5,155 games.
55 |                   Users and items are numbered consecutively from 1. The data is ordered by user ID. 
56 |                   This is a tab-separated list of 
57 |                   User_ID | Game_ID | Purchase 
58 |                   
59 | 
60 | game_play.dat     
61 | 
62 |                   The full play hours set: 70,490 interactions by 11,350 users on 3,600 games.
63 |                   Users and items are numbered consecutively from 1. The data is ordered by user ID. 
64 |                   This is a tab-separated list of 
65 |                   User_ID | Game_ID | Hours 
66 | 
67 | 
68 | REFERENCES
69 | ==============================================
70 | 
71 | [1] Da Costa, Arthur Fortes. PhD candidate at the Institute of Mathematical and Computational Sciences, 
72 | University of São Paulo. URL: https://arthurfortes.github.io/


--------------------------------------------------------------------------------
/Processed Datasets/Steam/steam.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/caserec/Datasets-for-Recommender-Systems/4180b4dc4103452c591a1718560d29bdf1f48540/Processed Datasets/Steam/steam.zip


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Public Datasets For Recommender Systems
 2 | 
 3 | This is a repository of high-quality, topic-centric public data sources for Recommender Systems (RS). They are collected and tidied from Stack Overflow, articles, recommender sites and academic experiments. Most of the datasets presented here are free and have open source licenses; however, some are not, and you need to ask permission to use them or cite the authors' work. 
 4 | 
 5 | > In addition, this repository contains some pre-processed datasets with treatment for academic experiments.
 6 | 
 7 | ## Links and dataset descriptions
 8 | 
 9 | ### Book
10 |   - [Book Crossing](http://www2.informatik.uni-freiburg.de/~cziegler/BX/):: The BookCrossing (BX) dataset was collected by Cai-Nicolas Ziegler in a 4-week crawl (August / September 2004) from the Book-Crossing community
11 |   
12 | ### Dating
13 |   - [Dating Agency](http://www.occamslab.com/petricek/data/):: This dataset contains 17,359,346 anonymous ratings of 168,791 profiles made by 135,359 LibimSeTi users as dumped on April 4, 2006.
14 | 
15 | ### E-commerce
16 |   - [Amazon](http://jmcauley.ucsd.edu/data/amazon/):: This dataset contains product reviews and metadata from Amazon, including 142.8 million reviews spanning May 1996 - July 2014
17 |   - [Retailrocket recommender system dataset](https://www.kaggle.com/retailrocket/ecommerce-dataset):: The dataset consists of three files: a file with behaviour data (events.csv), a file with item properties (item_properties.csv) and a file that describes the category tree (category_tree.csv). The data has been collected from a real-world ecommerce website. 
18 | 
19 | ### Music
20 |   - [Amazon Music](http://jmcauley.ucsd.edu/data/amazon/):: This digital music dataset contains reviews and metadata from Amazon
21 |   - [Yahoo Music](https://webscope.sandbox.yahoo.com/catalog.php?datatype=r):: This dataset represents a snapshot of the Yahoo! Music community's preferences for various musical artists.
22 |   - [LastFM (Implicit)](https://grouplens.org/datasets/hetrec-2011/):: This dataset contains social networking, tagging, and music artist listening information from a set of 2K users of the Last.fm online music system.
23 |   - [Million Song Dataset](https://labrosa.ee.columbia.edu/millionsong/):: The Million Song Dataset is a freely-available collection of audio features and metadata for a million contemporary popular music tracks.
24 | 
25 | ### Movies
26 |   - [MovieLens](https://grouplens.org/datasets/movielens/):: GroupLens Research has collected and made available rating datasets from their movie web site 
27 |   - [Yahoo Movies](https://webscope.sandbox.yahoo.com/catalog.php?datatype=r):: This dataset contains a sample of the Yahoo! Movies community's ratings for various movies, along with descriptive information about the movies. 
28 |   - [CiaoDVD](https://drive.google.com/file/d/1w1FuVSQC9nqxcK5xj0Aw5Oxc1qV7d09A/view?usp=sharing):: CiaoDVD is a dataset crawled from the entire category of DVDs from the dvd.ciao.co.uk website in December, 2013
29 |   - [FilmTrust](https://drive.google.com/file/d/1ohQ9oo8aaR7aWlpe56hXx66x-bwXxB56/view?usp=sharing):: FilmTrust is a small dataset crawled from the entire FilmTrust website in June, 2011
30 |   - [Netflix](http://academictorrents.com/details/9b13183dc4d60676b773c9e2cd6de5e5542cee9a):: This is the official data set used in the Netflix Prize competition. 
31 |   
32 | ### Games
33 | 
34 |   - [Steam Video Games](https://www.kaggle.com/tamber/steam-video-games/data):: This dataset is a list of user behaviors, with columns: user-id, game-title, behavior-name, value. The behaviors included are 'purchase' and 'play'. The value indicates the degree to which the behavior was performed - in the case of 'purchase' the value is always 1, and in the case of 'play' the value represents the number of hours the user has played the game. 
35 | 
36 | ### Jokes
37 |   - [Jester](http://www.ieor.berkeley.edu/~goldberg/jester-data/):: This Joke dataset contains 4.1 million continuous ratings (-10.00 to +10.00) of 100 jokes from 73,496 users
38 |   
39 | ### Food
40 |   - [Chicago Entree](http://archive.ics.uci.edu/ml/datasets/Entree+Chicago+Recommendation+Data):: This dataset contains a record of user interactions with the Entree Chicago restaurant recommendation system.
41 |   
42 | ### Anime
43 |   - [Anime Recommendations Database](https://www.kaggle.com/CooperUnion/anime-recommendations-database):: This data set contains information on user preference data from 73,516 users on 12,294 anime. Each user is able to add anime to their completed list and give it a rating and this data set is a compilation of those ratings.
44 | 
45 | ### Android Applications
46 | 
47 |   - [Myket Android Application Install Dataset](https://github.com/erfanloghmani/myket-android-application-market-dataset):: This dataset contains 694,121 application install interactions from 10,000 anonymous users and 7,988 Android applications.
48 | 
49 | ### Other datasets
50 | 
51 | You can find more datasets in:
52 | 
53 |   - GroupLens Datasets [link](https://grouplens.org/datasets)
54 |   - LibRec Datasets [link](https://www.librec.net/datasets.html)
55 |   - Yahoo Research [link](https://webscope.sandbox.yahoo.com/catalog.php?datatype=r)
56 |   - Datasets for Machine Learning [link](https://gist.github.com/entaroadun/1653794)
57 |   - Stanford Large Network Dataset Collection [link](https://snap.stanford.edu/data/)
58 |   
59 | ## Usage and License
60 | 
61 | Before using these data sets, please review their README files or sites for the usage licenses, acknowledgments and other details.
62 | 
63 | `Note`: If you have difficulty downloading any of these datasets, please contact me. I have backups of all datasets.
64 | 
65 | ## Recommender Tools
66 | 
67 |   - [Case Recommender](https://github.com/caserec/CaseRecommender):: Python.
68 |   - [MyMediaLite](http://www.mymedialite.net/):: C#.
69 | 
70 | ## Contributors
71 | 
72 |     Arthur Fortes da Costa {fortes [dot] arthur [at] gmail [dot] com} [Editor]
73 | 
74 | 
75 | 


--------------------------------------------------------------------------------