# BigData_Beverage_Recommender-System

The data was collected from Kaggle and stored in an AWS S3 bucket. It has over 1.5 million rows and 13 columns of BeerAdvocate user reviews. To predict customer ratings for the beers they have reviewed, I used a collaborative filtering algorithm, Alternating Least Squares (ALS), on AWS with `pyspark.ml` and a SQL context. Root mean squared error (RMSE) was used as the evaluation metric for prediction accuracy. The model reached an RMSE of 0.55, meaning the root-mean-square difference between the actual and predicted ratings is about half a point on the 0-5 rating scale.
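
To make the ALS and RMSE steps concrete, here is a minimal pure-Python sketch. It is not the `pyspark.ml` pipeline used in this project: it fits a rank-1 factorization on a handful of made-up ratings, and the names `als_rank1`, `rmse`, `toy_ratings`, and the `reg` and `iters` values are all assumptions chosen for illustration. The idea is the same, though: alternately fix the item factors and solve for the user factors, then swap, and score the fit with RMSE over the observed ratings.

```python
import math


def rmse(ratings, user_f, item_f):
    """Root mean squared error over observed (user, item, rating) triples."""
    sq_errors = [(r - user_f[u] * item_f[i]) ** 2 for u, i, r in ratings]
    return math.sqrt(sum(sq_errors) / len(sq_errors))


def als_rank1(ratings, n_users, n_items, reg=0.1, iters=10):
    """Rank-1 ALS: each rating is approximated by user_f[u] * item_f[i]."""
    user_f = [1.0] * n_users
    item_f = [1.0] * n_items
    for _ in range(iters):
        # Fix the item factors; each user factor is a 1-D ridge regression.
        for u in range(n_users):
            rated = [(i, r) for uu, i, r in ratings if uu == u]
            num = sum(r * item_f[i] for i, r in rated)
            den = reg + sum(item_f[i] ** 2 for i, r in rated)
            user_f[u] = num / den
        # Fix the user factors; solve for each item factor the same way.
        for i in range(n_items):
            raters = [(u, r) for u, ii, r in ratings if ii == i]
            num = sum(r * user_f[u] for u, r in raters)
            den = reg + sum(user_f[u] ** 2 for u, r in raters)
            item_f[i] = num / den
    return user_f, item_f


# Hypothetical toy data (user_id, beer_id, rating); not from BeerAdvocate.
toy_ratings = [
    (0, 0, 4.0), (0, 1, 3.5),
    (1, 0, 5.0), (1, 2, 2.0),
    (2, 1, 3.0), (2, 2, 1.5),
]

before = rmse(toy_ratings, [1.0] * 3, [1.0] * 3)
user_f, item_f = als_rank1(toy_ratings, n_users=3, n_items=3)
after = rmse(toy_ratings, user_f, item_f)
print(f"RMSE before training: {before:.3f}, after: {after:.3f}")
```

This sketch scores the same ratings it was trained on, which flatters the result. A figure like the 0.55 above is only meaningful when the RMSE is measured on ratings the model did not see during training.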