└── README.md

/README.md:
--------------------------------------------------------------------------------
A Comparative Analysis of Warehouse vs. Retail Sales Performance
Minor Project – INT375 | Data Science Specialization

Dataset Source
This analysis is based on a public dataset sourced from Data.gov. The dataset contains sales records over time, categorized by item type, sales channel, and supplier.

Dataset Overview
Rows: 307,000+

Columns: Year, Month, Item Type, Retail Sales, Warehouse Sales, Transfers, etc.

Item Categories: WINE, BEER, LIQUOR, NON-ALCOHOL, STR_SUPPLIES, and more

Why Exploratory Data Analysis (EDA)?
EDA is a critical initial step in any data science project. It allows us to:

Understand data distribution and structure

Detect missing or anomalous values

Discover relationships and patterns

Prepare data for further modeling or insights

Steps Performed in EDA:
Cleaned data (missing values, duplicates)

Created Date and Total Sales columns

Performed univariate and bivariate analysis

Visualized trends using histograms, bar charts, line plots, and scatter plots

Project Objectives & Analysis
1. Monthly, Quarterly & Yearly Sales Trends
Clear seasonality observed

Q2 and Q4 had the highest average sales, likely due to seasonal demand

2. Retail vs Warehouse Sales Comparison
Warehouse sales contributed ~75% of total sales

Retail showed steadier performance over time

3. Top-Selling Product Categories
WINE and LIQUOR were the top performers in sales revenue

BEER had high volume but moderate total value

4. Price vs Quantity Sold Correlation
A negative correlation suggests that as prices increase, quantity sold decreases

Reflects consumer price sensitivity

5. Peak Sales Months
November & December showed spikes in sales across categories

Likely driven by holiday-season demand

Key Visualizations
Histogram: Total Sales Distribution

Bar Chart: Sales by Item Type

Line Plot: Monthly Sales Trends

Scatter Plot: Price vs Quantity Correlation

Pie Chart: Channel-wise Contribution

Conclusion
This EDA provided valuable insights into:

How different channels (retail vs. warehouse) perform

Seasonality in sales patterns

The role of product type and pricing in revenue and quantity

Customer behavior trends that can inform future strategy

It also strengthened skills in:

Data preprocessing & cleaning

Visualization using Matplotlib and Seaborn

Analytical thinking and storytelling through data

Project Repository
You can explore the full analysis and code here:
GitHub Repository: https://github.com/ADARSH1805/warehouse-vs-retail-sales-analysis

Let’s Connect!
If you found this project interesting or helpful, feel free to ⭐️ the repo, give feedback, or connect with me on LinkedIn!
I’m open to collaboration, ideas, and data discussions!

--------------------------------------------------------------------------------
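
Illustrative sketch (not the repository's code)
The steps listed under "Steps Performed in EDA" (dropping duplicates, handling missing values, creating the Date and Total Sales columns) and the channel comparison could look roughly like the standard-library Python below. This is only a sketch under stated assumptions: the sample rows are invented for illustration and are not taken from the dataset, the helper names (to_float, load_clean_rows, warehouse_share, monthly_totals) are hypothetical, and treating a missing sales value as 0.0 is an assumed cleaning rule rather than the project's confirmed choice.

```python
import csv
import io
from collections import defaultdict
from datetime import date

# Hypothetical sample rows; the values are invented for illustration only.
SAMPLE_CSV = """Year,Month,Item Type,Retail Sales,Warehouse Sales
2020,1,WINE,10.0,30.0
2020,1,WINE,10.0,30.0
2020,1,BEER,5.0,
2020,2,LIQUOR,8.0,2.0
2020,2,BEER,,20.0
"""


def to_float(value):
    """Assumed cleaning rule: a missing (empty) sales value counts as 0.0."""
    return float(value) if value.strip() else 0.0


def load_clean_rows(text):
    """Parse CSV text, drop exact duplicate rows, and add derived columns."""
    seen = set()
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        key = tuple(raw.items())
        if key in seen:  # skip rows identical to one already kept
            continue
        seen.add(key)
        retail = to_float(raw["Retail Sales"])
        warehouse = to_float(raw["Warehouse Sales"])
        rows.append({
            # Date is set to the first day of the given Year/Month.
            "Date": date(int(raw["Year"]), int(raw["Month"]), 1),
            "Item Type": raw["Item Type"],
            "Retail Sales": retail,
            "Warehouse Sales": warehouse,
            "Total Sales": retail + warehouse,
        })
    return rows


def warehouse_share(rows):
    """Fraction of Total Sales that came through the warehouse channel."""
    total = sum(r["Total Sales"] for r in rows)
    warehouse = sum(r["Warehouse Sales"] for r in rows)
    return warehouse / total if total else 0.0


def monthly_totals(rows):
    """Sum Total Sales per month, returned in chronological order."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["Date"]] += r["Total Sales"]
    return dict(sorted(totals.items()))


if __name__ == "__main__":
    cleaned = load_clean_rows(SAMPLE_CSV)
    print(f"rows after cleaning: {len(cleaned)}")
    print(f"warehouse share of total sales: {warehouse_share(cleaned):.1%}")
    for month, total in monthly_totals(cleaned).items():
        print(month.isoformat(), total)
```

On the real dataset the same aggregations would feed the line plot of monthly trends and the pie chart of channel-wise contribution; a DataFrame library would be the more usual choice at 307,000+ rows, and the standard library is used here only to keep the sketch self-contained.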