├── .gitignore
├── README.md
├── docs
│   ├── paper.pdf
│   └── slides.pdf
├── notebook
│   ├── movielens_dataset_parse.ipynb
│   └── movielens_experiments.ipynb
├── scripts
│   ├── time_test.sh
│   └── time_validation.sh
└── time_filter_exp
    ├── time_test.py
    └── time_validation.py

/.gitignore:
--------------------------------------------------------------------------------
*.ipynb_checkpoints/
dataset/
*logs/
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Dependency
We used [OpenRec](http://openrec.ai/), an open-source and modular library for neural-network-inspired recommendation algorithms, to conduct the experiments in the [paper](https://github.com/whongyi/datafilter-recsys/blob/master/docs/paper.pdf). Please refer to the [repo](https://github.com/ylongqi/openrec) for installation details.

# Dataset
The [MovieLens 20M](http://files.grouplens.org/datasets/movielens/ml-20m-README.html) dataset was used as a testbed to evaluate how user-controlled data filtering affects recommendation performance. Follow [this notebook](/notebook/movielens_dataset_parse.ipynb) to preprocess the dataset.

# Experiments
Please refer to our paper for details about the experiments and the findings. Here we focus on explaining how to reproduce the results.

### Hyperparameter selection
Under the project folder, run `./scripts/time_validation.sh $RECOMMENDER` to conduct hyperparameter selection for the recommender, e.g. `./scripts/time_validation.sh CML`. `$RECOMMENDER` is one of "CML", "BPR", or "PMF" (you can extend the scripts to other recommenders as well). Log files will be saved into the `./movielens_validation_logs/` folder.

### Model configurations
After the validation logs are generated, follow the [Model configuration](/notebook/movielens_experiments.ipynb#Model-configuration) section of the experiments notebook to generate model configurations for testing.

### Evaluation
Under the project folder, run `./scripts/time_test.sh $RECOMMENDER $EVALUATOR` to evaluate recommendation performance on the test set, e.g. `./scripts/time_test.sh PMF Recall`. `$EVALUATOR` is one of "Recall" (Hit Ratio) and "NDCG" (Normalized Discounted Cumulative Gain). Test logs will be saved into the `./movielens_test_logs/` folder.

### Results
Follow the [Experiments](/notebook/movielens_experiments.ipynb#Experiments) section to generate figures that illustrate the experimental results.
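
### Putting it together
A minimal end-to-end pass, as a sketch: it assumes OpenRec is installed, the dataset has been preprocessed as above, and the default folder names used by the scripts.

```sh
# 1. hyperparameter selection (logs go to ./movielens_validation_logs/)
./scripts/time_validation.sh CML

# 2. generate ./configs/CML_*_test_config.txt via the "Model configuration"
#    section of notebook/movielens_experiments.ipynb

# 3. evaluate on the test set (logs go to ./movielens_test_logs/)
./scripts/time_test.sh CML Recall
```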
# Reference
Hongyi Wen, Longqi Yang, Michael Sobolev, and Deborah Estrin. 2018. Exploring Recommendations Under User-Controlled Data Filtering. In Twelfth ACM Conference on Recommender Systems (RecSys ’18), October 2–7, 2018, Vancouver, BC, Canada.
[[PDF]](https://github.com/whongyi/datafilter-recsys/blob/master/docs/paper.pdf) [[Slides]](https://github.com/whongyi/datafilter-recsys/blob/master/docs/slides.pdf)
--------------------------------------------------------------------------------
/docs/paper.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/whongyi/datafilter-recsys/ab1f1e910fde4902f592ddc304b4e4e4908ae646/docs/paper.pdf
--------------------------------------------------------------------------------
/docs/slides.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/whongyi/datafilter-recsys/ab1f1e910fde4902f592ddc304b4e4e4908ae646/docs/slides.pdf
--------------------------------------------------------------------------------
/notebook/movielens_dataset_parse.ipynb:
--------------------------------------------------------------------------------
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "import os\n",
    "from collections import defaultdict\n",
    "from datetime import datetime, timedelta"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### download dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "wget http://files.grouplens.org/datasets/movielens/ml-20m.zip\n",
    "unzip ml-20m.zip"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "rating_df = pd.read_csv('./ml-20m/ratings.csv')\n",
    "rating_df[\"timestamp\"] = pd.to_datetime(rating_df['timestamp'], unit='s')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### truncated dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def truncate_dataset(year, month):\n",
    "    truncated_df = rating_df[rating_df[\"timestamp\"] > datetime(year, month, 1)]\n",
    "    truncated_uni_movie_id = truncated_df.movieId.unique()\n",
    "    truncated_uni_user_id = truncated_df.userId.unique()\n",
    "\n",
    "    # user_id mapping\n",
    "    truncated_user_id = {}\n",
    "    max_user_len = 0\n",
    "    for m_id in truncated_uni_user_id:\n",
    "        truncated_user_id[m_id] = max_user_len\n",
    "        max_user_len += 1\n",
    "\n",
    "    # item_id mapping\n",
    "    truncated_movie_id = {}\n",
    "    max_movie_len = 0\n",
    "    for m_id in truncated_uni_movie_id:\n",
    "        truncated_movie_id[m_id] = max_movie_len\n",
    "        max_movie_len += 1\n",
    "\n",
    "    print(\"max_user:\", max_user_len, \"max_item:\", max_movie_len)\n",
    "    truncated_df = truncated_df.sort_values(by=\"timestamp\", ascending=False)  # sort by timestamp in descending order\n",
    "    truncated_df[\"movieId\"] = truncated_df[\"movieId\"].apply(lambda x: truncated_movie_id[x])\n",
    "    truncated_df[\"userId\"] = truncated_df[\"userId\"].apply(lambda x: truncated_user_id[x])\n",
    "\n",
    "    df = truncated_df[[\"userId\", \"movieId\", \"timestamp\"]]\n",
    "    a = np.array([tuple(i) for i in df.values], dtype=np.dtype([('user_id', '
   ]
  },
  {
   "cell_type": "code",
"execution_count": null, 126 | "metadata": {}, 127 | "outputs": [], 128 | "source": [ 129 | "def generate_test_config(val_dir, exp):\n", 130 | " display_itr = 1000 # result outputs after every 1000 iterations, change this according to variable used in 'time_validation.py' \n", 131 | " folders = get_valid_folders(val_dir, exp)\n", 132 | " res = np.array([get_eva(folder, evaluator=exp.evaluator, eval_name=\"Val\") for folder in folders])\n", 133 | " max_epoch = np.max(res, axis=2)\n", 134 | " max_l2 = np.argmax(max_epoch, axis=0)\n", 135 | " train_config = folders[flatten(max_l2)[0]].split('/')[-1].split('_')\n", 136 | " training_itr = (flatten(np.argmax(res, axis=2)[max_l2])[0]+1) * display_itr \n", 137 | " test_config = train_config[:-2] + [str(training_itr), \"_\".join(train_config[-2:])]\n", 138 | " return test_config" 139 | ] 140 | }, 141 | { 142 | "cell_type": "code", 143 | "execution_count": null, 144 | "metadata": {}, 145 | "outputs": [], 146 | "source": [ 147 | "def save_test_config(recommender, evaluator):\n", 148 | " \n", 149 | " with open(\"../configs/{}_{}_test_config.txt\".format(recommender, evaluator), 'w') as outfile:\n", 150 | " # baseline: no-filtering\n", 151 | " exp = Experiment(recommender=recommender, evaluator=evaluator, user_per=0.0, keep_days=0)\n", 152 | " test_config = generate_test_config(val_dir, exp)\n", 153 | " outfile.write(\" \".join(test_config) + \"\\n\")\n", 154 | "\n", 155 | " # with filter \n", 156 | " for user_per in [0.25, 0.5, 0.75, 1.0]:\n", 157 | " for keep_days in [1,7,14,30,60,90,180]:\n", 158 | " exp = Experiment(recommender=recommender, evaluator=evaluator, user_per=user_per, keep_days=keep_days)\n", 159 | " test_config = generate_test_config(val_dir, exp)\n", 160 | " outfile.write(\" \".join(test_config) + \"\\n\")\n", 161 | "\n", 162 | " # complete user records for time intervals of varying length in multiples of 6 months\n", 163 | " for year in range(2010, 2014):\n", 164 | " for month in [1,7]:\n", 165 | " hist = \"{}_{}\".format(year, month)\n", 166 | " exp = Experiment(recommender=recommender, evaluator=evaluator, user_per=0.0, keep_days=0, hist=hist)\n", 167 | " test_config = generate_test_config(val_dir, exp)\n", 168 | " outfile.write(\" \".join(test_config) + \"\\n\")" 169 | ] 170 | }, 171 | { 172 | "cell_type": "markdown", 173 | "metadata": {}, 174 | "source": [ 175 | "# Model configuration" 176 | ] 177 | }, 178 | { 179 | "cell_type": "code", 180 | "execution_count": null, 181 | "metadata": {}, 182 | "outputs": [], 183 | "source": [ 184 | "val_dir = \"../movielens_validation_logs/\" # change to your log files directory\n", 185 | "for recommender in ['PMF', 'BPR', 'CML']:\n", 186 | " for evaluator in ['Recall','NDCG']:\n", 187 | " save_test_config(recommender, evaluator)" 188 | ] 189 | }, 190 | { 191 | "cell_type": "raw", 192 | "metadata": {}, 193 | "source": [ 194 | "Note: this shoud generate model configurations in \"./configs/\" under the project folder." 195 | ] 196 | }, 197 | { 198 | "cell_type": "markdown", 199 | "metadata": {}, 200 | "source": [ 201 | "# Experiments" 202 | ] 203 | }, 204 | { 205 | "cell_type": "markdown", 206 | "metadata": {}, 207 | "source": [ 208 | "## 1. 
    "## 1. Population-level performance"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_test_results(exp):\n",
    "    folders = get_test_folders(exp)\n",
    "    results = get_eva(folders[0], evaluator=exp.evaluator, eval_name=\"Test\")\n",
    "    return np.concatenate(results)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def generate_df(recommender=\"PMF\", evaluator=\"Recall\"):\n",
    "    df = pd.DataFrame(columns=[\"user_per\", \"keep_days\", \"recommender\", \"evaluator\", \"result\"])\n",
    "    row = 0\n",
    "    exp = Experiment(recommender=recommender, evaluator=evaluator, user_per=0.0, keep_days=0)\n",
    "    baseline_results = get_test_results(exp)\n",
    "    df.loc[row] = [0.0, 0, recommender, evaluator, np.mean(baseline_results)]\n",
    "    row += 1\n",
    "\n",
    "    for user_per in [0.25, 0.5, 0.75, 1.0]:\n",
    "        for keep_days in [1, 7, 14, 30, 60, 90, 180]:\n",
    "            exp = Experiment(recommender=recommender, evaluator=evaluator, user_per=user_per, keep_days=keep_days)\n",
    "            results = get_test_results(exp)\n",
    "            df.loc[row] = [user_per, keep_days, recommender, evaluator, np.mean(results)]\n",
    "            row += 1\n",
    "    return df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "def plot_population():\n",
    "    fig, axn = plt.subplots(2, 3, sharex=True, sharey=True)\n",
    "    fig.set_size_inches(12, 5)\n",
    "    fig.tight_layout(rect=[0, 0.1, 1, 1])\n",
    "\n",
    "    recommenders = [\"CML\", \"BPR\", \"PMF\"]\n",
    "    evaluators = [\"Recall\", \"NDCG\"]\n",
    "    for i, ax in enumerate(axn.flat):\n",
    "        recommender = recommenders[i%3]\n",
    "        evaluator = evaluators[i//3]\n",
    "        df = generate_df(recommender=recommender, evaluator=evaluator)\n",
    "        baseline = df[\"result\"][0]\n",
    "        # percentage change relative to the no-filtering baseline\n",
    "        df[\"result_mean\"] = df[\"result\"].apply(lambda x: (x-baseline)*100/baseline)\n",
    "        plot_df = df.iloc[1:, ].copy()\n",
    "        plot_df[\"user_per\"] = plot_df[\"user_per\"].apply(lambda x: \"P={}\".format(x))\n",
    "\n",
    "        sns.pointplot(data=plot_df, ax=ax, x=\"keep_days\", y=\"result_mean\", hue=\"user_per\", linestyles='--', palette=\"tab20\")\n",
    "\n",
    "        plt.subplots_adjust(hspace=.1, wspace=.1)\n",
    "        ax.set_xlabel(\"\")\n",
    "        ax.set_ylabel(\"\")\n",
    "        ax.set_ylim(-45, 20)\n",
    "        ax.axhline(y=0, color='k', linestyle=\"--\", alpha=0.5)\n",
    "        ax.legend(bbox_to_anchor=(-0.05, -0.2, 1.1, -0.1), mode=\"expand\", ncol=4).set_visible(i==4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# show population-level performance change under different settings (P,N)\n",
    "plot_population()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Compare recommenders on complete user records for time intervals of varying length"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def truncate_history(recommender, evaluator):\n",
    "    df = pd.DataFrame(columns=[\"recommender\", \"evaluator\", \"history_length\", \"result\"])\n",
    "    row = 0\n",
    "    hist = \"{}_{}\".format(2014, 1)\n",
    "    exp = Experiment(recommender=recommender, evaluator=evaluator, user_per=0.0, keep_days=0, hist=hist)\n",
    "    result = get_test_results(exp)\n",
    "    df.loc[row] = [recommender, evaluator, hist, np.mean(result)]\n",
    "    row += 1\n",
    "    for year in range(2010, 2014):\n",
    "        for month in [1, 7]:\n",
    "            hist = \"{}_{}\".format(year, month)\n",
    "            exp = Experiment(recommender=recommender, evaluator=evaluator, user_per=0.0, keep_days=0, hist=hist)\n",
    "            result = get_test_results(exp)\n",
    "            df.loc[row] = [recommender, evaluator, hist, np.mean(result)]\n",
    "            row += 1\n",
    "    return df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "truncate_history(\"PMF\", \"Recall\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "truncate_history(\"BPR\", \"NDCG\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "truncate_history(\"CML\", \"Recall\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Group-level performance"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from openrec.utils.evaluators import Recall, NDCG"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "def user_scores(p):\n",
    "    recall_evaluator = Recall(recall_at=[10])\n",
    "    ndcg_evaluator = NDCG(ndcg_at=[10])\n",
    "\n",
    "    score_per_user = dict()\n",
    "    count_per_user = dict()\n",
    "\n",
    "    for user in p['users']:\n",
    "        neg_scores = p['results'][user][:p['num_negatives']]\n",
    "        for i in range(len(p['user_items'][user][p['num_negatives']:])):\n",
    "            pos_score = p['results'][user][p['num_negatives'] + i]\n",
    "            rank_above = np.array([float(np.sum(neg_scores > pos_score))])\n",
    "            negative_num = float(p['num_negatives'])\n",
    "            curr_score_recall = recall_evaluator.compute(rank_above, negative_num)[0]\n",
    "            curr_score_ndcg = ndcg_evaluator.compute(rank_above, negative_num)[0]\n",
    "            if user not in score_per_user:\n",
    "                score_per_user[user] = list()\n",
    "            if user not in count_per_user:\n",
    "                count_per_user[user] = 0.0\n",
    "            score_per_user[user].append((curr_score_recall, curr_score_ndcg))\n",
    "            count_per_user[user] += 1\n",
    "\n",
    "    # calculate per-user scores\n",
    "    per_user_recall = dict()\n",
    "    per_user_ndcg = dict()\n",
    "\n",
    "    for key in score_per_user.keys():\n",
    "        curr_recall = 0.0\n",
    "        curr_ndcg = 0.0\n",
    "        for tup in score_per_user[key]:\n",
    "            curr_recall += tup[0]\n",
403 | " curr_ndcg += tup[1]\n", 404 | " per_user_recall[ key ] = curr_recall / count_per_user[key]\n", 405 | " per_user_ndcg[ key ] = curr_ndcg / count_per_user[key]\n", 406 | " \n", 407 | " return {\"Recall\":per_user_recall,\n", 408 | " \"NDCG\": per_user_ndcg\n", 409 | " }" 410 | ] 411 | }, 412 | { 413 | "cell_type": "code", 414 | "execution_count": null, 415 | "metadata": {}, 416 | "outputs": [], 417 | "source": [ 418 | "def user_grouping(logdir):\n", 419 | " \n", 420 | " with open(logdir+\"/filtered_data.npy\", 'rb') as filter_profile:\n", 421 | " filter_data = np.load(filter_profile)\n", 422 | "\n", 423 | " with open(logdir+\"/train_data.npy\", 'rb') as train_profile:\n", 424 | " train_data = np.load(train_profile)\n", 425 | " \n", 426 | " with open(logdir+\"/test_data.npy\", 'rb') as test_profile:\n", 427 | " test_data = np.load(test_profile)\n", 428 | " \n", 429 | " filter_user = np.unique(filter_data[\"user_id\"])\n", 430 | " train_user = np.unique(train_data[\"user_id\"])\n", 431 | " test_user = np.unique(test_data[\"user_id\"])\n", 432 | " \n", 433 | " # decompose test users into three groups: (1)user profile changed (2) user profile unchanged (3) new users (cold start)\n", 434 | " group_filter = [i for i in test_user if i in filter_user]\n", 435 | " group_same = [i for i in test_user if i in (set(train_user) - set(filter_user))]\n", 436 | " group_new = [i for i in test_user if i not in (set(train_user) | set(filter_user))]\n", 437 | " return [len(train_data), len(train_user), len(test_data), group_filter, group_same, group_new]" 438 | ] 439 | }, 440 | { 441 | "cell_type": "code", 442 | "execution_count": null, 443 | "metadata": {}, 444 | "outputs": [], 445 | "source": [ 446 | "def user_compare(recommender=None, evaluator=None):\n", 447 | " baseline = Experiment(recommender=recommender, evaluator=evaluator, user_per=0.0, keep_days=0)\n", 448 | " baseline_folder = get_test_folders(baseline)[0]\n", 449 | " baseline_files = [os.path.join(baseline_folder, f) for f in sorted(os.listdir(baseline_folder))]\n", 450 | " \n", 451 | " df = pd.DataFrame(columns=[\"group\", \"user_per\", \"keep_days\", \"recommender\", \"evaluator\", \"result_mean\", \"baseline_result_mean\", \"performance_change\"])\n", 452 | " row = 0\n", 453 | " \n", 454 | " for user_per in [0.25, 0.5, 0.75, 1.0]:\n", 455 | " for keep_days in [1,7,14,30,60,90,180]:\n", 456 | " print (user_per, keep_days)\n", 457 | " exp = Experiment(recommender=recommender, evaluator=evaluator, user_per=user_per, keep_days=keep_days)\n", 458 | " folder = get_test_folders(exp)[0]\n", 459 | " log_files = [os.path.join(folder, f) for f in sorted(os.listdir(folder))]\n", 460 | "\n", 461 | " filter_users = []\n", 462 | " unfilter_users = []\n", 463 | " new_users = []\n", 464 | "\n", 465 | " base_filter_users = []\n", 466 | " base_unfilter_users = []\n", 467 | " base_new_users = []\n", 468 | "\n", 469 | " for i in range(len(log_files)):\n", 470 | " \n", 471 | " #print (\"###with user filtering###\") \n", 472 | " with open(log_files[i]+\"/_evaluate_partial.pickle\", 'rb') as eva_file:\n", 473 | " p = pickle.load(eva_file)\n", 474 | " per_user_performance = user_scores(p)[evaluator]\n", 475 | " \n", 476 | " group_filter, group_same, group_new = user_grouping(log_files[i])[-3:]\n", 477 | " #print (len(group_filter), len(group_same), len(group_new))\n", 478 | " assert len(per_user_performance) == len(group_filter) + len(group_same) + len(group_new)\n", 479 | " \n", 480 | " after_group_filter = [per_user_performance[u] for u in group_filter]\n", 481 
| " after_group_same = [per_user_performance[u] for u in group_same]\n", 482 | " after_group_new = [per_user_performance[u] for u in group_new]\n", 483 | "\n", 484 | " filter_users.append(np.mean(after_group_filter))\n", 485 | " unfilter_users.append(np.mean(after_group_same))\n", 486 | " new_users.append(np.mean(after_group_new))\n", 487 | " \n", 488 | " #print (\"###baseline###\") \n", 489 | " with open(baseline_files[i]+\"/_evaluate_partial.pickle\", 'rb') as eva_file:\n", 490 | " p = pickle.load(eva_file)\n", 491 | " per_user_performance = user_scores(p)[evaluator]\n", 492 | " \n", 493 | " baseline_group_filter = [per_user_performance[u] for u in group_filter]\n", 494 | " baseline_group_same = [per_user_performance[u] for u in group_same]\n", 495 | " baseline_group_new = [per_user_performance[u] for u in group_new]\n", 496 | " \n", 497 | " base_filter_users.append(np.mean(baseline_group_filter))\n", 498 | " base_unfilter_users.append(np.mean(baseline_group_same))\n", 499 | " base_new_users.append(np.mean(baseline_group_new))\n", 500 | " \n", 501 | " df.loc[row] = [\"filtered\", user_per, keep_days, recommender, evaluator, \n", 502 | " np.mean(filter_users), np.mean(base_filter_users),\n", 503 | " (np.mean(filter_users) - np.mean(base_filter_users))/np.mean(base_filter_users)]\n", 504 | " row +=1\n", 505 | " \n", 506 | " df.loc[row] = [\"no filtered\", user_per, keep_days, recommender, evaluator, \n", 507 | " np.mean(unfilter_users), np.mean(base_unfilter_users),\n", 508 | " (np.mean(unfilter_users) - np.mean(base_unfilter_users))/np.mean(base_unfilter_users)]\n", 509 | " row +=1\n", 510 | " \n", 511 | " df.loc[row] = [\"cold start\", user_per, keep_days, recommender, evaluator, \n", 512 | " np.mean(new_users), np.mean(base_new_users),\n", 513 | " (np.mean(new_users) - np.mean(base_new_users))/np.mean(base_new_users)]\n", 514 | " row +=1\n", 515 | " \n", 516 | " return df" 517 | ] 518 | }, 519 | { 520 | "cell_type": "markdown", 521 | "metadata": {}, 522 | "source": [ 523 | "### User group distribution" 524 | ] 525 | }, 526 | { 527 | "cell_type": "code", 528 | "execution_count": null, 529 | "metadata": { 530 | "scrolled": true 531 | }, 532 | "outputs": [], 533 | "source": [ 534 | "# use any recommender and evaluator \n", 535 | "recommender = \"PMF\" \n", 536 | "evaluator = \"NDCG\"\n", 537 | "df = pd.DataFrame(columns=[\"group\", \"user_per\", \"keep_days\", \"user_num\", \"percentage\"])\n", 538 | "row = 0\n", 539 | "for user_per in [0.25, 0.5, 0.75, 1.0]:\n", 540 | " for keep_days in [1,7,14,30,60,90,180]:\n", 541 | " exp = Experiment(recommender=recommender, evaluator=evaluator, user_per=user_per, keep_days=keep_days)\n", 542 | " folder = get_test_folders(exp)[0]\n", 543 | " log_files = [os.path.join(folder, f) for f in sorted(os.listdir(folder))]\n", 544 | " stats = []\n", 545 | " for i in range(len(log_files)):\n", 546 | " stats.append(user_grouping(log_files[i]))\n", 547 | " groups = np.mean(stats, axis=0)\n", 548 | " df.loc[row] = [\"Group 1\", user_per, keep_days, len(groups[-3]), len(groups[-3])/len(groups[-4])]\n", 549 | " row += 1\n", 550 | " df.loc[row] = [\"Group 2\", user_per, keep_days, len(groups[-2]), len(groups[-2])/len(groups[-4])]\n", 551 | " row +=1\n", 552 | " df.loc[row] = [\"Group 3\", user_per, keep_days, len(groups[-1]), len(groups[-1])/len(groups[-4])]\n", 553 | " row += 1\n", 554 | " \n", 555 | "fig, axn = plt.subplots(1, 4, sharex=True, sharey=True)\n", 556 | "fig.set_size_inches(15,3)\n", 557 | "fig.tight_layout(rect=[0, 0.2, 1, 1])\n", 558 | "user_pers = 
[0.25, 0.5, 0.75, 1.0]\n", 559 | "days = [\"1\",\"7\",\"14\",\"30\",\"60\",\"90\",\"180\"]\n", 560 | "for i, ax in enumerate(axn.flat):\n", 561 | " user_per = user_pers[i%4]\n", 562 | " plot_df = df[df[\"user_per\"] == user_per]\n", 563 | " group1 = plot_df[plot_df[\"group\"] == \"Group 1\"][\"percentage\"]\n", 564 | " group2 = plot_df[plot_df[\"group\"] == \"Group 2\"][\"percentage\"]\n", 565 | " group3 = plot_df[plot_df[\"group\"] == \"Group 3\"][\"percentage\"]\n", 566 | " ax.bar(days, group1, label=\"Group 1\",alpha=0.8)\n", 567 | " ax.bar(days, group2, bottom=group1, label=\"Group 2\", alpha=0.8)\n", 568 | " ax.bar(days, group3, bottom=np.array(group2) + np.array(group1), label=\"Group 3\", alpha=0.8)\n", 569 | " ax.legend(bbox_to_anchor=(-0.65, -0.2, 1.1, -0.1), mode=\"expand\", ncol=3).set_visible(i==2)" 570 | ] 571 | }, 572 | { 573 | "cell_type": "markdown", 574 | "metadata": {}, 575 | "source": [ 576 | "### Group-level performance change after data filtering " 577 | ] 578 | }, 579 | { 580 | "cell_type": "code", 581 | "execution_count": null, 582 | "metadata": { 583 | "scrolled": true 584 | }, 585 | "outputs": [], 586 | "source": [ 587 | "def plot_grouptrend(evaluator):\n", 588 | " fig, axn = plt.subplots(3, 4, sharex=True, sharey=True)\n", 589 | " fig.set_size_inches(12,6)\n", 590 | " fig.tight_layout(rect=[0, 0.1, 1, 1])\n", 591 | " recommenders = [\"CML\", \"BPR\", \"PMF\"]\n", 592 | " user_pers = [0.25, 0.5, 0.75, 1.0]\n", 593 | " legend_mapping = {\"filtered\": \"Group 1\", \"no filtered\":\"Group 2\", \"cold start\": \"Group 3\"}\n", 594 | " \n", 595 | " for i, ax in enumerate(axn.flat):\n", 596 | " recommender = recommenders[i//4]\n", 597 | " user_per = user_pers[i%4]\n", 598 | " group_df = user_compare(recommender=recommender, evaluator=evaluator)\n", 599 | " plot_df = group_df[(group_df[\"user_per\"] == user_per)]\n", 600 | " plot_df[\"performance_change\"] = group_df[\"performance_change\"].apply(lambda x: x*100)\n", 601 | " plot_df[\"group\"] = group_df[\"group\"].apply(lambda x: legend_mapping[x])\n", 602 | " # sns.set(font_scale=1.1)\n", 603 | " sns.pointplot(data=plot_df, ax=ax, x=\"keep_days\", y=\"performance_change\", hue=\"group\", linestyles='--', palette=\"tab10\")\n", 604 | " ax.set_ylim(-45, 20)\n", 605 | " ax.axhline(y=0, color='k', linestyle=\"--\", alpha=0.5)\n", 606 | " plt.subplots_adjust(hspace = .1, wspace=.1)\n", 607 | " ax.set_xlabel(\"\")\n", 608 | " ax.set_ylabel(\"\")\n", 609 | " ax.legend(bbox_to_anchor=(-0.75, -0.15, 1.5, -0.2), mode=\"expand\", ncol=4).set_visible(i==10)" 610 | ] 611 | }, 612 | { 613 | "cell_type": "code", 614 | "execution_count": null, 615 | "metadata": {}, 616 | "outputs": [], 617 | "source": [ 618 | "plot_grouptrend(\"Recall\")" 619 | ] 620 | }, 621 | { 622 | "cell_type": "code", 623 | "execution_count": null, 624 | "metadata": {}, 625 | "outputs": [], 626 | "source": [ 627 | "plot_grouptrend(\"NDCG\")" 628 | ] 629 | } 630 | ], 631 | "metadata": { 632 | "kernelspec": { 633 | "display_name": "Python 3", 634 | "language": "python", 635 | "name": "python3" 636 | }, 637 | "language_info": { 638 | "codemirror_mode": { 639 | "name": "ipython", 640 | "version": 3 641 | }, 642 | "file_extension": ".py", 643 | "mimetype": "text/x-python", 644 | "name": "python", 645 | "nbconvert_exporter": "python", 646 | "pygments_lexer": "ipython3", 647 | "version": "3.5.2" 648 | } 649 | }, 650 | "nbformat": 4, 651 | "nbformat_minor": 2 652 | } 653 | -------------------------------------------------------------------------------- 
/scripts/time_test.sh:
--------------------------------------------------------------------------------
#!/bin/sh
# trap "exit" INT

config_dir="configs/"

while read config; do
    while read date; do
        python3 "time_filter_exp/time_test.py" movielens $config $date $2
    done < ${config_dir}"test_dates.txt"
done < ${config_dir}$1"_"$2"_test_config.txt"

# example: ./scripts/time_test.sh PMF Recall
--------------------------------------------------------------------------------
/scripts/time_validation.sh:
--------------------------------------------------------------------------------
#!/bin/sh

# validate on 1 year of complete user records (baseline)
for l in 0.1 0.01 0.001 0.0001; do
    python3 "time_filter_exp/time_validation.py" "movielens" $1 $l "2015-01-01" 2014_1
done

# validate on different filtering settings (P, N)
for P in 1.0 0.75 0.5 0.25; do
    for N in 1 7 14 30 60 90 180; do
        for l in 0.1 0.01 0.001 0.0001; do
            python3 "time_filter_exp/time_validation.py" "movielens" $1 $l "2015-01-01" 2014_1 $P $N
        done
    done
done

# validate on complete user records for time intervals of varying length
for year in 2013 2012 2011 2010; do
    for month in 7 1; do
        for l in 0.1 0.01 0.001 0.0001; do
            python3 "time_filter_exp/time_validation.py" "movielens" $1 $l "2015-01-01" $year"_"$month
        done
    done
done

--------------------------------------------------------------------------------
/time_filter_exp/time_test.py:
--------------------------------------------------------------------------------
import os
import sys
from openrec.legacy import ImplicitModelTrainer
from openrec.legacy.utils import ImplicitDataset
from openrec.legacy.recommenders import PMF, CML, BPR
from openrec.legacy.utils.evaluators import Recall, NDCG
from openrec.legacy.utils.samplers import PairwiseSampler, PointwiseSampler
import logging
import datetime
import csv
import tensorflow as tf
import numpy as np

batch_size = 1000
test_batch_size = 100
LOG_TYPE = 'test'
LOGGING = True

def load_data(dataset, hist_len):
    if dataset == 'movielens':
        raw_data = np.load("./dataset/user_data_truncated_%s.npy" % hist_len)
        print ('raw movielens dataset loaded')
        return raw_data
    else:
        print ("No dataset loaded...")
        return

def run_test_exp(model_name=None, evaluator=None, raw_data=None, user_per=1.0, keep_days=1, l2_reg=0.001, test_date=None, outdir=None, num_itr=int(1e4)+1):

    # parse dataset into incremental training and testing sets
    data = raw_data
    max_user = len(np.unique(data["user_id"]))
    max_item = len(np.unique(data["item_id"]))
    print ("max_user:{}, max_item:{}".format(max_user, max_item))

    test_date = datetime.datetime.strptime(test_date, "%Y-%m-%d").date()
    print ("test date:%s" % test_date)
    train_data = data[data["timestamp"] < test_date]
    test_data = data[(data["timestamp"] >= test_date) & (data["timestamp"] < (test_date+datetime.timedelta(days=7)))]
    np.random.seed(10)
    test_data = np.asarray([np.random.choice(test_data[test_data["user_id"] == uid], 1)[0] for uid in np.unique(test_data["user_id"])])

    # filter training data: for selected users, keep only the most recent n days of data
    print ("filter user percentage:%f" % user_per)
    print ("ratings before filter:%d" % len(train_data))
    user_list = np.unique(train_data["user_id"])
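    # Note: filter_user below samples the fraction user_per of users who opt in to
    # filtering; filter_mask then marks their ratings older than (test_date - keep_days),
    # which are dropped from the training set further down.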
np.unique(train_data["user_id"]) 47 | filter_user = np.random.choice(user_list, int(len(user_list)*user_per), replace=False) 48 | filter_mask = (np.isin(train_data["user_id"], filter_user)) & (train_data["timestamp"] < (test_date-datetime.timedelta(days=keep_days))) 49 | 50 | # output filtered data and test data 51 | if LOGGING: 52 | np.save(outdir+"filtered_data.npy", train_data[filter_mask]) 53 | np.save(outdir+"train_data.npy", train_data[~filter_mask]) 54 | np.save(outdir+"test_data.npy", test_data) 55 | 56 | train_data = train_data[~filter_mask] 57 | print ("ratings after filter:%d" % len(train_data)) 58 | 59 | train_dataset = ImplicitDataset(train_data, max_user, max_item, name='Train') 60 | test_dataset = ImplicitDataset(test_data, max_user, max_item, name='Test') 61 | 62 | num_process = 8 63 | dim_embed = 50 64 | if model_name == 'PMF': 65 | model = PMF(batch_size=batch_size, max_user=train_dataset.max_user(), max_item=train_dataset.max_item(), dim_embed=dim_embed, opt='Adam', l2_reg=l2_reg) 66 | sampler = PointwiseSampler(batch_size=batch_size, dataset=train_dataset, pos_ratio=0.5, num_process=num_process) 67 | elif model_name == 'CML': 68 | model = CML(batch_size=batch_size, max_user=train_dataset.max_user(), max_item=train_dataset.max_item(), 69 | dim_embed=dim_embed, opt='Adam', l2_reg=l2_reg) 70 | sampler = PairwiseSampler(batch_size=batch_size, dataset=train_dataset, num_process=num_process) 71 | elif model_name == 'BPR': 72 | model = BPR(batch_size=batch_size, max_user=train_dataset.max_user(), max_item=train_dataset.max_item(), 73 | dim_embed=dim_embed, opt='Adam', l2_reg=l2_reg) 74 | sampler = PairwiseSampler(batch_size=batch_size, dataset=train_dataset, num_process=num_process) 75 | else: 76 | print ("Wrong model assigned") 77 | return 78 | 79 | if evaluator == 'Recall': 80 | test_evaluator = Recall(recall_at=[10, 20, 30, 40, 50, 60, 70, 80, 90, 100]) 81 | elif evaluator == 'NDCG': 82 | test_evaluator = NDCG(ndcg_at=[10, 20, 30, 40, 50, 60, 70, 80, 90, 100]) 83 | else: 84 | print ("Wrong evaluator assisgned") 85 | return 86 | 87 | model_trainer = ImplicitModelTrainer(batch_size=batch_size, test_batch_size=test_batch_size, train_dataset=train_dataset, model=model, sampler=sampler, item_serving_size=1,eval_save_prefix=outdir) 88 | model_trainer.train(num_itr=num_itr+1, display_itr=num_itr, eval_datasets=[test_dataset], evaluators=[test_evaluator], num_negatives=200) 89 | 90 | 91 | if __name__ == "__main__": 92 | print (sys.argv) 93 | dataset = sys.argv[1] 94 | model_name = sys.argv[2] 95 | user_per = float(sys.argv[3]) 96 | keep_days = int(sys.argv[4]) 97 | l2_reg = float(sys.argv[5]) 98 | num_itr = int(sys.argv[6]) 99 | hist_len = sys.argv[7] 100 | test_date = sys.argv[8] 101 | evaluator = sys.argv[9] 102 | 103 | raw_data = load_data(dataset, hist_len) 104 | # logging 105 | outdir = None 106 | if LOGGING: 107 | outdir = "{}_{}_logs/{}_{}_{}_{}_{}/{}/".format(dataset, LOG_TYPE, model_name, evaluator, user_per, keep_days, hist_len, test_date) 108 | os.popen("mkdir -p %s" % outdir).read() 109 | log = open(outdir+ "training.log", "w") 110 | sys.stdout = log 111 | 112 | parameters = {"model": model_name, 'evaluator': evaluator, "l2_reg": l2_reg, "num_itr":num_itr, "user_per":user_per, "keep_days":keep_days, "dataset": dataset} 113 | print (parameters) 114 | 115 | run_test_exp(model_name=model_name, evaluator=evaluator, raw_data=raw_data, user_per=user_per, keep_days=keep_days, l2_reg=l2_reg, test_date=test_date, outdir=outdir, num_itr=num_itr) 116 | 
--------------------------------------------------------------------------------
/time_filter_exp/time_validation.py:
--------------------------------------------------------------------------------
import os
import sys
from openrec.legacy import ImplicitModelTrainer
from openrec.legacy.utils import ImplicitDataset
from openrec.legacy.recommenders import PMF, CML, BPR
from openrec.legacy.utils.evaluators import Recall, NDCG
from openrec.legacy.utils.samplers import PairwiseSampler, PointwiseSampler
import logging
import datetime
import csv
import tensorflow as tf
import numpy as np

batch_size = 1000
test_batch_size = 100
num_itr = int(1e4+1)
display_itr = 1000
LOG_TYPE = 'validation'
LOGGING = True

def load_data(dataset, hist_len):
    if dataset == 'movielens':
        raw_data = np.load("./dataset/user_data_truncated_%s.npy" % hist_len)
        print ('raw movielens dataset loaded')
        return raw_data
    else:
        print ("No dataset loaded...")
        return

def run_exp(model_name=None, raw_data=None, user_per=1.0, keep_days=1, l2_reg=0.001, test_date=None, outdir=None):

    # parse dataset into incremental training and testing sets
    data = raw_data
    max_user = len(np.unique(data["user_id"]))
    max_item = len(np.unique(data["item_id"]))
    print ("max_user:{}, max_item:{}".format(max_user, max_item))

    test_date = datetime.datetime.strptime(test_date, "%Y-%m-%d").date()
    print ("test date:%s" % test_date)
    train_data = data[data["timestamp"] < test_date]

    np.random.seed(10)

    # filter training data: for selected users, keep only the most recent n days of data
    print ("filter user percentage:%f" % user_per)
    print ("ratings before filter:%d" % len(train_data))
    user_list = np.unique(train_data["user_id"])
    filter_user = np.random.choice(user_list, int(len(user_list)*user_per), replace=False)
    mask = (np.isin(train_data["user_id"], filter_user)) & (train_data["timestamp"] < (test_date-datetime.timedelta(days=keep_days)))
    train_data = train_data[~mask]
    print ("ratings after filter:%d" % len(train_data))

    # hold out one rating per user for validation
    user_list = np.unique(train_data["user_id"])
    val_index = [np.where(train_data["user_id"] == uid)[0][0] for uid in user_list]  # leave out each user's most recent rating (data is sorted by timestamp in descending order)
    val_data = train_data[val_index]
    train_data = np.delete(train_data, val_index)
    print ("train data: %d, validation data %d" % (len(train_data), len(val_data)))

    train_dataset = ImplicitDataset(train_data, max_user, max_item, name='Train')
    val_dataset = ImplicitDataset(val_data, max_user, max_item, name='Val')

    num_process = 8
    dim_embed = 50
    if model_name == 'PMF':
        model = PMF(batch_size=batch_size, max_user=train_dataset.max_user(), max_item=train_dataset.max_item(), dim_embed=dim_embed, opt='Adam', l2_reg=l2_reg)
        sampler = PointwiseSampler(batch_size=batch_size, dataset=train_dataset, pos_ratio=0.5, num_process=num_process)
    elif model_name == 'CML':
        model = CML(batch_size=batch_size, max_user=train_dataset.max_user(), max_item=train_dataset.max_item(),
                    dim_embed=dim_embed, opt='Adam', l2_reg=l2_reg)
        sampler = PairwiseSampler(batch_size=batch_size, dataset=train_dataset, num_process=num_process)
    elif model_name == 'BPR':
        model = BPR(batch_size=batch_size, max_user=train_dataset.max_user(), max_item=train_dataset.max_item(),
                    dim_embed=dim_embed, opt='Adam', l2_reg=l2_reg)
        sampler = PairwiseSampler(batch_size=batch_size, dataset=train_dataset, num_process=num_process)
    else:
        print ("Wrong model assigned")
        return

    recall_evaluator = Recall(recall_at=[10, 20, 30, 40, 50, 60, 70, 80, 90, 100])
    ndcg_evaluator = NDCG(ndcg_at=[10, 20, 30, 40, 50, 60, 70, 80, 90, 100])

    model_trainer = ImplicitModelTrainer(batch_size=batch_size, test_batch_size=test_batch_size, train_dataset=train_dataset, model=model, sampler=sampler, item_serving_size=1)
    model_trainer.train(num_itr=num_itr, display_itr=display_itr, eval_datasets=[val_dataset], evaluators=[recall_evaluator, ndcg_evaluator], num_negatives=200)

if __name__ == "__main__":
    print (sys.argv)
    dataset = sys.argv[1]
    model_name = sys.argv[2]
    l2_reg = float(sys.argv[3])
    test_date = sys.argv[4]
    hist_len = sys.argv[5]

    if len(sys.argv) >= 8:
        user_per = float(sys.argv[6])
        keep_days = int(sys.argv[7])
    else:
        user_per = 0.0
        keep_days = 0

    raw_data = load_data(dataset, hist_len)
    # logging
    outdir = None
    if LOGGING:
        outdir = "{}_{}_logs/{}_{}_{}_{}_{}/".format(dataset, LOG_TYPE, model_name, user_per, keep_days, l2_reg, hist_len)
        os.popen("mkdir -p %s" % outdir).read()
        log = open(outdir + test_date + "_training.log", "w")
        sys.stdout = log

    run_exp(model_name=model_name, raw_data=raw_data, user_per=user_per, keep_days=keep_days, l2_reg=l2_reg, test_date=test_date, outdir=outdir)
--------------------------------------------------------------------------------