├── mohit45.PNG
├── mohit46.PNG
├── mohit47.PNG
├── mohit48.PNG
├── mohit49.PNG
├── mohit50.PNG
├── analysis.py
└── README.md


/mohit45.PNG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mohittiwari98/Data-analysis-of-Zomato/main/mohit45.PNG

--------------------------------------------------------------------------------
/mohit46.PNG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mohittiwari98/Data-analysis-of-Zomato/main/mohit46.PNG

--------------------------------------------------------------------------------
/mohit47.PNG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mohittiwari98/Data-analysis-of-Zomato/main/mohit47.PNG

--------------------------------------------------------------------------------
/mohit48.PNG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mohittiwari98/Data-analysis-of-Zomato/main/mohit48.PNG

--------------------------------------------------------------------------------
/mohit49.PNG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mohittiwari98/Data-analysis-of-Zomato/main/mohit49.PNG

--------------------------------------------------------------------------------
/mohit50.PNG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mohittiwari98/Data-analysis-of-Zomato/main/mohit50.PNG

--------------------------------------------------------------------------------
/analysis.py:
--------------------------------------------------------------------------------
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

dataframe = pd.read_csv("Zomato data .csv")
print(dataframe)

def handleRate(value):
    # Ratings are stored as strings such as "4.1/5"; keep only the part before the slash.
    value = str(value).split('/')
    value = value[0]
    return float(value)

dataframe['rate'] = dataframe['rate'].apply(handleRate)
print(dataframe.head())

dataframe.info()

# Count plot of restaurant types
sns.countplot(x=dataframe['listed_in(type)'])
plt.xlabel("type of restaurant")
plt.show()

# Line plot of total votes per restaurant type
grouped_data = dataframe.groupby('listed_in(type)')['votes'].sum()
result = pd.DataFrame({'votes': grouped_data})
plt.plot(result, c="purple", marker="o")
plt.xlabel("Type of Restaurant", c="blue", size=20)
plt.ylabel("votes", c="red", size=20)
plt.show()

# Histogram of ratings
plt.hist(dataframe['rate'], bins=5)
plt.title("rating distribution")
plt.show()

# Count plot of approximate cost for two people
couple_data = dataframe['approx_cost(for two people)']
sns.countplot(x=couple_data)
plt.show()

# Box plot of rating by online-order availability
plt.figure(figsize=(6, 6))
sns.boxplot(x='online_order', y='rate', data=dataframe)
plt.show()

# Heatmap of restaurant counts by type and online-order availability
pivot_table = dataframe.pivot_table(index='listed_in(type)', columns='online_order', aggfunc='size', fill_value=0)
sns.heatmap(pivot_table, annot=True, cmap="YlGnBu", fmt='d')
plt.title("Heatmap")
plt.xlabel("Online Order")
plt.ylabel("Listed In (Type)")
plt.show()

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Data-analysis-of-Zomato

## Tools and Modules
![](https://img.shields.io/badge/Programming_Language-Python-cyan.svg)
![](https://img.shields.io/badge/Tool_Used-matplotlib-orange.svg)
![](https://img.shields.io/badge/Tool_Used-seaborn-darkblue.svg)
![](https://img.shields.io/badge/Tool_Used-pandas-pink.svg)
![](https://img.shields.io/badge/Tool_Used-numpy-darkpink.svg)
![](https://img.shields.io/badge/Python_Version-3.10.1-blue.svg)
![](https://img.shields.io/badge/Application-Analysis-lemon.svg)
![](https://img.shields.io/badge/Status-Complete-green.svg)

## Overview

This project analyzes a Zomato restaurant dataset to explore restaurant types, customer ratings, voting patterns, approximate costs for two people, and the availability of online ordering. The analysis uses the Python libraries Pandas, NumPy, Matplotlib, and Seaborn to process the data and generate visualizations.

## Dataset
The dataset (`Zomato data .csv`) contains information about restaurants listed on Zomato, including:
- Restaurant name
- Rating (out of 5)
- Number of votes
- Approximate cost for two people
- Type of restaurant (e.g., Dining, Cafes)
- Online order availability (Yes/No)
- Other features like location and table booking

## Analysis Objectives
- Investigate the distribution of restaurant types.
- Analyze the relationship between restaurant types and total votes.
- Examine the distribution of customer ratings.
- Explore the cost for two people across restaurants.
- Assess how ratings differ between restaurants with and without online ordering.
- Visualize how online order availability varies across restaurant types.

## Code Description
The Python script performs the following steps:
1. **Data Loading and Preprocessing**:
   - Loads the dataset using Pandas.
   - Cleans the `rate` column by extracting the numeric rating (e.g., converts "4.1/5" to 4.1).
   - Displays dataset information with `dataframe.info()` (column types and non-null counts).

2. **Exploratory Data Analysis (EDA)**:
   - Previews the first rows of the dataset with `head()`.
   - Uses the cleaned `rate` column as a numeric value in the later plots.

3. **Visualizations**:
   - **Count Plot**: Displays the distribution of restaurant types.
   - **Line Plot**: Shows the total votes per restaurant type with markers.
   - **Histogram**: Illustrates the distribution of restaurant ratings.
   - **Count Plot**: Visualizes the frequency of approximate costs for two people.
   - **Box Plot**: Compares ratings for restaurants with and without online ordering.
   - **Heatmap**: Shows the count of restaurants by type and online order availability.

## Libraries Used
- **Pandas**: For data manipulation and analysis.
- **NumPy**: For numerical computations.
- **Matplotlib**: For creating static visualizations.
- **Seaborn**: For enhanced statistical visualizations.

## How to Run
1. Ensure Python and required libraries are installed:
   ```bash
   pip install pandas numpy matplotlib seaborn
   ```
2. Place the dataset (`Zomato data .csv`) in the same directory as the script.
3. Run the Python script:
   ```bash
   python analysis.py
   ```
4. Visualizations will be displayed, and console output will show dataset details.

## Results
- Restaurant Type Distribution: A count plot shows the frequency of different restaurant types (e.g., Dining, Cafes).
- Votes by Restaurant Type: A line plot illustrates the total votes for each restaurant type, highlighting popularity.
- Rating Distribution: A histogram displays the spread of ratings, indicating common rating ranges.
- Approximate Cost for Two: A count plot shows the frequency of different cost brackets for two people.
- Online Order vs. Rating: A box plot compares ratings for restaurants with and without online ordering.
- Restaurant Type and Online Order Heatmap: A heatmap shows the count of restaurants by type and online order availability.
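
The box plot and heatmap results can also be read as plain numbers. The sketch below is only an illustration on a small made-up frame: the rows are hypothetical and not taken from `Zomato data .csv`, and only the column names match the script. It uses `pd.crosstab` as a stand-in for the script's `pivot_table(..., aggfunc='size')` call.

```python
import pandas as pd

# Hypothetical rows for illustration only; not real Zomato data.
sample = pd.DataFrame({
    "listed_in(type)": ["Dining", "Dining", "Cafes", "Cafes", "Buffet", "Dining"],
    "online_order": ["Yes", "No", "Yes", "Yes", "No", "No"],
    "rate": [4.1, 3.6, 4.3, 3.9, 3.8, 3.4],
})

# Median rating per online-order group: the centre line of each box in the box plot.
median_rate = sample.groupby("online_order")["rate"].median()

# Restaurant counts by type and online-order availability: the cells of the heatmap.
type_by_order = pd.crosstab(sample["listed_in(type)"], sample["online_order"])

print(median_rate)
print(type_by_order)
```

Running the same two calls on the real `dataframe` would give the figures that the plots only show graphically.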

## Future Improvements
- Handle missing or inconsistent data (e.g., null values or outliers).
- Add statistical analysis to validate trends (e.g., correlation tests).
- Incorporate additional features like location or table booking for deeper insights.
- Use interactive visualizations (e.g., Plotly) for enhanced user experience.

## License
This project is for educational purposes and uses a Zomato dataset. Ensure compliance with any dataset-specific licensing terms.

--------------------------------------------------------------------------------
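
As a possible starting point for the first item under Future Improvements (missing or inconsistent data), here is a minimal sketch of a more tolerant rating parser. It assumes, without confirmation from the dataset, that unrated entries may appear as placeholders such as `NEW` or `-`. The function name `parse_rate` and the sample values are hypothetical and are not part of `analysis.py`.

```python
import numpy as np
import pandas as pd


def parse_rate(value):
    """Return the numeric part of a rating like "4.1/5", or NaN if it cannot be parsed."""
    text = str(value).split("/")[0].strip()
    try:
        return float(text)
    except ValueError:
        # Placeholders such as "NEW" or "-" (assumed, not confirmed) become NaN.
        return np.nan


# Hypothetical values for illustration only.
raw = pd.Series(["4.1/5", "3.8 /5", "NEW", "-", None])
clean = raw.apply(parse_rate)

print(clean)
print("unparsed ratings:", int(clean.isna().sum()))
```

Rows left as NaN would then need to be dropped or filled before plotting, since NaN values can break the histogram call. Which option is appropriate depends on the analysis.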