├── Procfile
├── documents
│   ├── HLD Report.pdf
│   ├── LLD Report.pdf
│   ├── Wireframe Document.pdf
│   ├── Architecture Design.pdf
│   └── Insurance Premium Prediction DPR.pptx
├── images
│   ├── webapp interface-1 .png
│   └── webapp interface-2.png
├── gradient_boosting_regressor_model.pkl
├── static
│   └── css
│       └── style.css
├── app.py
├── requirements.txt
├── README.md
├── templates
│   └── index.html
├── LICENSE
├── insurance.csv
└── clean_data.csv

/Procfile:
--------------------------------------------------------------------------------
web: gunicorn app:app --preload

--------------------------------------------------------------------------------
/documents/HLD Report.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nikhilpatil44/insurance-premium-prediction/HEAD/documents/HLD Report.pdf

--------------------------------------------------------------------------------
/documents/LLD Report.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nikhilpatil44/insurance-premium-prediction/HEAD/documents/LLD Report.pdf

--------------------------------------------------------------------------------
/images/webapp interface-1 .png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nikhilpatil44/insurance-premium-prediction/HEAD/images/webapp interface-1 .png

--------------------------------------------------------------------------------
/images/webapp interface-2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nikhilpatil44/insurance-premium-prediction/HEAD/images/webapp interface-2.png

--------------------------------------------------------------------------------
/documents/Wireframe Document.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nikhilpatil44/insurance-premium-prediction/HEAD/documents/Wireframe Document.pdf

--------------------------------------------------------------------------------
/documents/Architecture Design.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nikhilpatil44/insurance-premium-prediction/HEAD/documents/Architecture Design.pdf

--------------------------------------------------------------------------------
/gradient_boosting_regressor_model.pkl:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nikhilpatil44/insurance-premium-prediction/HEAD/gradient_boosting_regressor_model.pkl

--------------------------------------------------------------------------------
/documents/Insurance Premium Prediction DPR.pptx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nikhilpatil44/insurance-premium-prediction/HEAD/documents/Insurance Premium Prediction DPR.pptx

--------------------------------------------------------------------------------
/static/css/style.css:
--------------------------------------------------------------------------------
/* Reset default spacing on every element. */
* {
    margin: 0;
    padding: 0;
    box-sizing: border-box;
}

.bg-dark {
    background-color: #75767B;
}

.mt-50 {
    margin-top: 50px;
}

--------------------------------------------------------------------------------
/app.py:
--------------------------------------------------------------------------------
from flask import Flask, request, render_template
import pandas as pd
import pickle

app = Flask(__name__)

# Load the trained model once at startup; the context manager closes the file.
with open("./gradient_boosting_regressor_model.pkl", 'rb') as file:
    model = pickle.load(file)

# The cleaned dataset supplies the category options shown in the form.
data = pd.read_csv('./clean_data.csv')


@app.route('/')
def index():
    # Populate the form's dropdowns with the distinct values in the data.
    sex = sorted(data['sex'].unique())
    smoker = sorted(data['smoker'].unique())
    region = sorted(data['region'].unique())
    return render_template('index.html', sex=sex, smoker=smoker, region=region)


@app.route('/predict', methods=['POST'])
def predict():
    # Read the submitted form fields and convert the numeric ones.
    age = int(request.form.get('age'))
    sex = request.form.get('sex')
    bmi = float(request.form.get('bmi'))
    children = int(request.form.get('children'))
    smoker = request.form.get('smoker')
    region = request.form.get('region')

    # Build a single-row DataFrame with the column names the model expects.
    prediction = model.predict(pd.DataFrame([[age, sex, bmi, children, smoker, region]],
                               columns=['age', 'sex', 'bmi', 'children', 'smoker', 'region']))

    return str(prediction[0])


if __name__ == "__main__":
    # Runs only when the file is executed directly, not under gunicorn.
    app.run(debug=True)

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
astroid==2.5.6
attrs==21.2.0
backcall==0.2.0
certifi==2021.5.30
click==8.0.1
colorama==0.4.4
cycler==0.10.0
decorator==5.0.9
Flask==2.0.1
gunicorn==20.0.4
ipykernel==5.5.5
ipython==7.23.1
ipython-genutils==0.2.0
isort==5.8.0
itsdangerous==2.0.1
jedi==0.18.0
Jinja2==3.0.1
joblib==1.0.1
jsonschema==3.2.0
jupyter-client==6.1.12
jupyter-core==4.7.1
kiwisolver==1.3.1
lazy-object-proxy==1.6.0
MarkupSafe==2.0.1
matplotlib==3.4.3
matplotlib-inline==0.1.2
mccabe==0.6.1
nbformat==5.1.3
numpy==1.21.2
pandas==1.3.2
parso==0.8.2
patsy==0.5.1
pickleshare==0.7.5
Pillow==8.3.1
plotly==5.2.2
prompt-toolkit==3.0.18
Pygments==2.9.0
pylint==2.8.3
pyparsing==2.4.7
pyrsistent==0.18.0
python-dateutil
pytz==2021.1
pyzmq==22.0.3
scikit-learn==0.24.2
scipy==1.6.3
seaborn==0.11.2
six
sklearn==0.0
statsmodels==0.12.2
tenacity==8.0.1
threadpoolctl==2.2.0
toml==0.10.2
tornado==6.1
traitlets==5.0.5
wcwidth==0.2.5
Werkzeug==2.0.1
wincertstore==0.2
wrapt==1.12.1
xgboost==1.4.2

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Insurance Premium Prediction

## Problem Statement :
The goal of this project is to give people an estimate of how much they need based on their individual health situation. After that, customers can work with any health insurance carrier and its plans and perks while keeping the projected cost from our study in mind. This can help a person concentrate on the health side of an insurance policy rather than the ineffective part.

## Dataset :
The dataset is taken from Kaggle. You can download the dataset from [here](https://www.kaggle.com/noordeen/insurance-premium-prediction)

## Approach :
Apply machine learning tasks such as Data Exploration, Data Cleaning, Feature Engineering, Model Building and Model Testing to build a solution that should be able to predict a person's health insurance premium.

- **Data Exploration :** Explored the dataset using pandas, numpy, matplotlib, plotly and seaborn.
- **Exploratory Data Analysis :** Plotted different graphs to get more insight into the dependent and independent variables/features.
- **Feature Engineering :** Scaled the numerical features and encoded the categorical features.
- **Model Building :** In this step, the dataset is split first.
After that, models are trained using different machine learning algorithms, such as:
    1) Linear Regression
    2) Decision Tree Regressor
    3) Random Forest Regressor
    4) Gradient Boosting Regressor
    5) XGBoost Regressor
    6) KNN
- **Model Selection :** Tested all the models to check the RMSE & R-squared.
- **Pickle File :** Selected the model with the best RMSE and R-squared scores and created a pickle file using the pickle library.
- **Webpage & Deployment :** Created a web application that takes all the necessary inputs from the user and shows the output. Then deployed the project on the Heroku platform.


## Deployment Link :
https://insurance-premium-prediction1.herokuapp.com/


## Web Interface :


## Libraries used :
    1) Pandas
    2) Numpy
    3) Matplotlib, Seaborn, Plotly
    4) Scikit-Learn
    5) Flask
    6) HTML
    7) CSS

## Technical Aspects :
    1) Python
    2) Front-end : HTML, CSS
    3) Back-end : Flask
    4) Deployment : Heroku

--------------------------------------------------------------------------------
/templates/index.html:
--------------------------------------------------------------------------------