├── Consumer buying behavior - Paul.ipynb
└── README.md

/README.md:
--------------------------------------------------------------------------------
# Customer-segmentation-and-consumer-behavior-analysis

(Please click on the Consumer buying behavior - Paul.ipynb above to see the detailed application of analytics and its interpretation)

Brief Snapshot:

Topic # 1:

Segmenting consumers based on buying behavior and applying the 80/20 rule to identify the top customers/products/geographic locations driving 80% of total $ sales

This is a real sales data set of a UK-based retailer

Objective: Segment customers based on buying behavior by applying the k-means clustering algorithm, using the elbow method to choose the optimal number of customer segments with similar buying habits (features).

Data Source: https://archive.ics.uci.edu/ml/datasets/Wholesale+customers


Topic # 2:

Apply the 80/20 rule to identify the top 20% of

1) Customer segments

2) Products and

3) Geographic locations

resulting in 80% of $ sales revenue.

Data Source used: https://archive.ics.uci.edu/ml/datasets/online+retail

## Customer segments based on buying behavior by applying the k-means clustering (unsupervised learning) algorithm:

### Elbow method to choose the optimal number of customer segments (clusters):

![image](https://user-images.githubusercontent.com/38769913/51401473-da786880-1b18-11e9-9f8e-3a79651f25f3.png)


### Customer segments:

![image](https://user-images.githubusercontent.com/38769913/51401411-aa30ca00-1b18-11e9-957d-5bd9342b9093.png)


## Distribution of customers in the 3 (optimal # of) clusters:

![image](https://user-images.githubusercontent.com/38769913/51401510-ef54fc00-1b18-11e9-996e-e06ce24b873d.png)


### Analysis:
There is some overlap between Segment # 3 and each of Segments # 1 & 2 at the boundaries, which we need to keep in mind while analyzing customers.

Because k-means is an unsupervised learning algorithm, there are no labels to score the model against, so we perform a quick visual check of the model's performance using the visualization chart.

Business Strategy:
Customer segments # 1 & 2 have opportunities for growth and future expansion. Because the retail industry is saturated, customer segment # 3 may already be dominated by other retailers, so our client can try to increase sales in the other 2 customer segments (# 1 & 2) through suitable competitive positioning, pricing strategy, cohesive sales & marketing efforts, promotions, bundling, etc.


### Buying behavior of customers within each cluster:

![image](https://user-images.githubusercontent.com/38769913/51401375-91c0af80-1b18-11e9-9eb9-be9fcc102d66.png)

![image](https://user-images.githubusercontent.com/38769913/51401293-60e07a80-1b18-11e9-8f6d-0f910b8e6d74.png)

### Analysis:

For all 3 customer segments, the majority of customers are from Region 3, so Region is not a key factor for segmenting customers.

Fresh: Segment # 2 makes significant purchases compared to other segments, followed by Segment # 3

Milk: Segment # 1 makes significant purchases compared to other segments

Grocery: Segment # 1 makes significant purchases compared to other segments

Frozen: Segment # 2 makes significant purchases, followed by Segment # 3

Detergent_Paper: Segment # 1 is a major purchaser

Delicassen: Segment # 2 on average makes the most purchases, followed by Segment # 1


## Snapshots of a few analyses:
(Detailed analysis shown in the Consumer buying behavior - Paul.ipynb file)

### Customers:

1084 out of a total of 3877: the top 25% of customers account for 79% of total $ sales amount.
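
The 80/20 figures in this section come from ranking entities by sales and accumulating their share of the total. The snippet below is a minimal plain-Python sketch of that calculation, not the notebook's actual code: the function name `pareto_summary`, the toy rows, and the `(customer_id, quantity, unit_price)` layout are all assumptions made for illustration.

```python
from collections import defaultdict


def pareto_summary(sales_by_key, target_share=0.80):
    """Count how many top-ranked keys are needed to reach `target_share` of sales.

    `sales_by_key` maps a key (customer, product, country) to its total sales.
    This is a hypothetical helper, not a function from the notebook.
    """
    ranked = sorted(sales_by_key.values(), reverse=True)
    total = sum(ranked)
    if not ranked or total <= 0:
        raise ValueError("need at least one key with positive total sales")

    running = 0.0
    n_top = 0
    for amount in ranked:
        running += amount
        n_top += 1
        # Stop at the first rank where the cumulative share reaches the target.
        if running / total >= target_share:
            break

    return {
        "n_top": n_top,
        "n_total": len(ranked),
        "pct_of_keys": n_top / len(ranked),
        "pct_of_sales": running / total,
    }


# Toy rows (made up): (customer_id, quantity, unit_price)
rows = [
    ("A", 10, 10.0),
    ("A", 5, 10.0),
    ("B", 2, 5.0),
    ("C", 1, 5.0),
    ("D", 1, 5.0),
    ("E", 1, 5.0),
]

# Aggregate line items into total sales per customer.
sales_per_customer = defaultdict(float)
for customer_id, quantity, unit_price in rows:
    sales_per_customer[customer_id] += quantity * unit_price

summary = pareto_summary(sales_per_customer)
print(summary)
```

The same helper can be applied to products or countries by changing the aggregation key; only the per-key totals matter to the ranking step.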

![image](https://user-images.githubusercontent.com/38769913/51401094-f92a2f80-1b17-11e9-9772-da123b92844f.png)


### Products:

775 out of a total of 3877: the top 20% of products account for 79% of total $ sales amount.

![image](https://user-images.githubusercontent.com/38769913/51401016-cc761800-1b17-11e9-9069-29b34fd5bbd1.png)


### Geographic Locations: Top 10 countries by revenue

![image](https://user-images.githubusercontent.com/38769913/51400833-5d98bf00-1b17-11e9-8f8a-2fed1fcee2e1.png)

### Geographic Locations: Bottom 20 countries by revenue

![image](https://user-images.githubusercontent.com/38769913/51400800-4954c200-1b17-11e9-8bdc-7f5d0cd58b89.png)

### Analysis:
The UK alone accounts for 82% of total $ revenue, which is expected for a UK-based retailer.

### Business Strategy:
For future geographic expansion, the retail company's senior management should consider the countries (after the UK) where it already has significant sales, such as the Netherlands, EIRE, Germany & France.

--------------------------------------------------------------------------------