├── .gitignore
├── README.md
├── Week 1
    ├── Issue Trees Exercise
    │   ├── Issue Tree Solutions.pptx
    │   ├── Issue Tree Template.pptx
    │   └── readme.md
    ├── Presentation
    │   └── readme.md
    └── Python Exercise
    │   ├── Retail Data Analysis complete.ipynb
    │   ├── Retail_Data_Analysis.ipynb
    │   ├── readme.md
    │   └── retail_data_s.xlsx
├── Week 2
    ├── Exercise 1 - Write Code
    │   └── README.md
    ├── Exercise 2 - Store Code
    │   ├── Folder Structure Template
    │   │   └── YYYY-MM-DD_Project_Name
    │   │   │   ├── 1_Resources
    │   │   │       └── .gitignore
    │   │   │   ├── 2_Original_Data
    │   │   │       └── .gitignore
    │   │   │   ├── 3_Prepared_Data
    │   │   │       └── .gitignore
    │   │   │   ├── 4_Scripts
    │   │   │       └── .gitignore
    │   │   │   └── 5_Final_Outputs
    │   │   │       └── .gitignore
    │   └── README.md
    ├── Exercise 3 - Run Code
    │   ├── README.md
    │   ├── Retail_Data_Analysis.ipynb
    │   ├── pip.sh
    │   ├── retail_data_analysis.py
    │   └── retail_data_s.xlsx
    └── Presentation
    │   └── readme.md
├── Week 3
    ├── Exercise 1 - Data Wrangling
    │   ├── Data_Wrangling_With_Python.ipynb
    │   └── readme.md
    ├── Exercise 2 - Descriptive Statistics
    │   ├── Descriptive_Statistics_With_Python.ipynb
    │   └── readme.md
    ├── Exercise 3 - Data Visualization
    │   ├── Exploratory_Plots_With_Pandas.ipynb
    │   ├── Matplotlib_Tutorial.ipynb
    │   ├── Seaborn_Tutorial.ipynb
    │   └── readme.md
    └── Presentation
    │   └── readme.md
├── Week 4
    ├── Exercises
    │   ├── Calculate_RFM_and_CLV_Values.ipynb
    │   ├── Develop_and_Interpret_a_Hierarchical_Clustering.ipynb
    │   ├── Develop_and_Interpret_a_K_Means_Clustering.ipynb
    │   ├── Perform_a_Market_Basket_Analysis.ipynb
    │   └── readme.md
    └── Presentation
    │   └── readme.md
├── Week 5
    ├── Exercises
    │   └── readme.md
    └── Presentation
    │   └── readme.md
└── Week 6
    ├── Exercise 1
        ├── Prescriptive_Analytics_Credit_Risk_Scoring_Case_Study.ipynb
        └── readme.md
    ├── Exercise 2
        └── readme.md
    └── Presentation
        └── readme.md

/.gitignore:
--------------------------------------------------------------------------------
1 |
2 | .DS_Store
3 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | Additional resources and materials for the O'Reilly Bootcamp [Business Analytics with Python](https://learning.oreilly.com/live-events/business-analytics-with-python-bootcamp/0636920081960/)
2 |
--------------------------------------------------------------------------------
/Week 1/Issue Trees Exercise/Issue Tree Solutions.pptx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tobiaszwingmann/business_analytics_with_python/51f4c217c842ac240134e97cb39da72049e79437/Week 1/Issue Trees Exercise/Issue Tree Solutions.pptx
--------------------------------------------------------------------------------
/Week 1/Issue Trees Exercise/Issue Tree Template.pptx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tobiaszwingmann/business_analytics_with_python/51f4c217c842ac240134e97cb39da72049e79437/Week 1/Issue Trees Exercise/Issue Tree Template.pptx
--------------------------------------------------------------------------------
/Week 1/Issue Trees Exercise/readme.md:
--------------------------------------------------------------------------------
1 | # Exercise 1: Issue Trees
2 |
3 | ## Step 1:
4 | * Download and open the file `Issue Tree Template.pptx` - [Google Slides Version](https://docs.google.com/presentation/d/1XihHUmb8L9qPzss8LTzdK5gew7z9zEvlYxs_3wM7hNs/edit)
5 |
6 | ## Step 2:
7 | * Create your own issue tree based on one of the examples provided.
8 |
9 |
10 | ## Step 3:
11 | * Compare your issue tree with the one in `Issue Tree Solutions.pptx`.
12 | * Is your issue tree MECE (mutually exclusive, collectively exhaustive)?
13 |
14 | *Note: There is no single true solution. Multiple approaches are possible.*
15 |
16 | ## Step 4:
17 | * Think of a problem of your own that you could break down using an issue tree.
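The MECE question in Step 3 can also be checked mechanically once the branches of a tree are written down as sets. The following is only a minimal sketch, not part of the course material: the function `is_mece`, the `orders` set, and both example splits are hypothetical names made up for illustration. It treats a set of branches as MECE when no item appears in two branches (mutually exclusive) and the branches together cover every item (collectively exhaustive).

```python
from itertools import combinations


def is_mece(universe, branches):
    """Check whether `branches` split `universe` without overlaps or gaps."""
    groups = list(branches.values())
    # Mutually exclusive: no item is claimed by two branches.
    exclusive = all(a.isdisjoint(b) for a, b in combinations(groups, 2))
    # Collectively exhaustive: together the branches cover every item.
    exhaustive = set().union(*groups) == universe
    return exclusive and exhaustive


# Hypothetical example: ways to split orders in a "why did revenue drop?" tree.
orders = {"new_online", "new_store", "returning_online", "returning_store"}

by_channel = {
    "online": {"new_online", "returning_online"},
    "store": {"new_store", "returning_store"},
}
overlapping = {
    "new customers": {"new_online", "new_store"},
    "online": {"new_online", "returning_online"},
}

print(is_mece(orders, by_channel))   # True
print(is_mece(orders, overlapping))  # False
```

The second split fails on both counts: `new_online` sits in two branches, and `returning_store` is not covered at all. Real issue trees are rarely this easy to enumerate, so this is best read as a way to make the two MECE conditions concrete.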
18 |
19 |
--------------------------------------------------------------------------------
/Week 1/Presentation/readme.md:
--------------------------------------------------------------------------------
1 | # Presentation Download
2 |
3 | You can download this week's presentation here:
4 |
5 | [Business Analytics Bootcamp - Week 1.pdf](https://drive.google.com/drive/folders/15aHxiJIGk-CFW9mlU8mCZiv9hhloDjpC?usp=share_link)
--------------------------------------------------------------------------------
/Week 1/Python Exercise/Retail Data Analysis complete.ipynb:
--------------------------------------------------------------------------------
1 | {"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"provenance":[],"authorship_tag":"ABX9TyPkAY/WaKQTn3mqLYmHJ+bO"},"kernelspec":{"name":"python3","display_name":"Python 3"},"language_info":{"name":"python"}},"cells":[{"cell_type":"markdown","source":["# Retail Data Exercise\n","\n","## Goal:\n","Compare revenue of Wholesale (B2B) vs. Non-Wholesale (B2C) Orders\n","\n","## Tasks:\n","- Perform an Exploratory Data Analysis\n","- Fix Errors in data\n","- Calculate revenue\n","- Find a criterion to flag wholesale orders\n","- Compare B2B to B2C orders"],"metadata":{"id":"NAqnc8OBHawP"}},{"cell_type":"markdown","source":["## Setup"],"metadata":{"id":"dtpRqDOYHmpb"}},{"cell_type":"code","execution_count":null,"metadata":{"id":"BK2uO23-HTHT"},"outputs":[],"source":["import pandas as pd"]},{"cell_type":"markdown","source":["## Exploratory Data Analysis"],"metadata":{"id":"IjWn5WMiH3kR"}},{"cell_type":"code","source":["df = pd.read_excel(\"retail_data_s.xlsx\", sheet_name = \"Data\")"],"metadata":{"id":"STjPHoyHH_-S"},"execution_count":null,"outputs":[]},{"cell_type":"code","source":["df"],"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":447},"id":"_yMq2LZPIIBt","executionInfo":{"status":"ok","timestamp":1675074990374,"user_tz":-60,"elapsed":32,"user":{"displayName":"Tobias 
Zwingmann","userId":"04750770048122423001"}},"outputId":"5512bc49-f80d-451d-fa5a-f6348268b29f"},"execution_count":null,"outputs":[{"output_type":"execute_result","data":{"text/plain":[" CustomerID InvoiceNo InvoiceDate StockCode Quantity \\\n","0 13047 536367 2010-12-01 08:34:00 84879 32 \n","1 13047 536367 2010-12-01 08:34:00 22745 6 \n","2 13047 536367 2010-12-01 08:34:00 22748 6 \n","3 13047 536367 2010-12-01 08:34:00 22749 8 \n","4 13047 536367 2010-12-01 08:34:00 22310 6 \n","... ... ... ... ... ... \n","91460 17581 581581 2011-12-09 12:20:00 23562 6 \n","91461 17581 581581 2011-12-09 12:20:00 23561 6 \n","91462 17581 581581 2011-12-09 12:20:00 23681 10 \n","91463 17581 581582 2011-12-09 12:21:00 23552 6 \n","91464 17581 581582 2011-12-09 12:21:00 23498 12 \n","\n"," UnitPrice \n","0 1.69 \n","1 2.10 \n","2 2.10 \n","3 3.75 \n","4 1.65 \n","... ... \n","91460 2.89 \n","91461 2.89 \n","91462 1.65 \n","91463 2.08 \n","91464 1.45 \n","\n","[91465 rows x 6 columns]"],"text/html":["\n","
\n"," "]},"metadata":{},"execution_count":54}]},{"cell_type":"markdown","source":["Note: \n","\n","* A Customer ID can have multiple InvoiceNo and an InvoiceNo can have multiple rows (orders)\n","* ID column is already there"],"metadata":{"id":"B-nxJLFVTtAB"}},{"cell_type":"code","source":["df.describe()"],"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":323},"id":"gxD02i2KI1IN","executionInfo":{"status":"ok","timestamp":1675074990375,"user_tz":-60,"elapsed":30,"user":{"displayName":"Tobias Zwingmann","userId":"04750770048122423001"}},"outputId":"78725159-9006-4cc9-a10f-7ae27fa5bcc9"},"execution_count":null,"outputs":[{"output_type":"execute_result","data":{"text/plain":[" CustomerID InvoiceNo Quantity UnitPrice\n","count 91465.000000 91465.000000 91465.000000 91465.000000\n","mean 15101.770426 560116.483114 12.352386 2.943347\n","std 1756.168325 13205.742026 40.065201 8.776144\n","min 12347.000000 536367.000000 1.000000 0.000000\n","25% 13488.000000 548739.000000 2.000000 1.250000\n","50% 14944.000000 560841.000000 6.000000 1.690000\n","75% 16628.000000 571676.000000 12.000000 3.750000\n","max 18287.000000 581582.000000 2880.000000 1241.980000"],"text/html":["\n","
\n"," "]},"metadata":{},"execution_count":55}]},{"cell_type":"code","source":["df.sort_values(\"Quantity\", ascending = False).head(20)"],"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":702},"id":"NCYR3_QNJxTV","executionInfo":{"status":"ok","timestamp":1675074990375,"user_tz":-60,"elapsed":28,"user":{"displayName":"Tobias Zwingmann","userId":"04750770048122423001"}},"outputId":"ba6b6644-b0c6-496d-d018-55ae294801c5"},"execution_count":null,"outputs":[{"output_type":"execute_result","data":{"text/plain":[" CustomerID InvoiceNo InvoiceDate StockCode Quantity \\\n","973 16754 536830 2010-12-02 17:38:00 84077 2880 \n","58793 17450 567423 2011-09-20 11:05:00 23285 1944 \n","58792 17450 567423 2011-09-20 11:05:00 23288 1944 \n","58791 17450 567423 2011-09-20 11:05:00 23286 1878 \n","82854 17857 578060 2011-11-22 15:22:00 M 1600 \n","13870 13848 543991 2011-02-15 10:17:00 16014 1500 \n","47220 15118 561638 2011-07-28 14:54:00 84568 1440 \n","1087 14156 536890 2010-12-03 11:48:00 17084R 1440 \n","58797 17450 567423 2011-09-20 11:05:00 22969 1428 \n","58800 17450 567423 2011-09-20 11:05:00 23243 1412 \n","90554 15195 581115 2011-12-07 12:20:00 22413 1404 \n","974 16754 536830 2010-12-02 17:38:00 21915 1400 \n","3680 17857 537981 2010-12-09 11:35:00 22492 1394 \n","8788 17450 540689 2011-01-11 08:43:00 22469 1356 \n","8789 17450 540689 2011-01-11 08:43:00 22470 1284 \n","76886 17857 575328 2011-11-09 13:48:00 M 1200 \n","71043 13685 572670 2011-10-25 13:14:00 22065 1200 \n","12753 17381 543192 2011-02-04 11:55:00 15036 1200 \n","41730 13082 559047 2011-07-05 15:44:00 15036 1200 \n","8787 17450 540689 2011-01-11 08:43:00 85123A 1010 \n","\n"," UnitPrice \n","973 0.18 \n","58793 1.08 \n","58792 1.08 \n","58791 1.08 \n","82854 0.25 \n","13870 0.32 \n","47220 0.17 \n","1087 0.16 \n","58797 1.70 \n","58800 5.06 \n","90554 2.75 \n","974 1.06 \n","3680 0.55 \n","8788 1.93 \n","8789 3.21 \n","76886 0.25 \n","71043 0.39 \n","12753 0.65 \n","41730 0.72 \n","8787 
3.24 "],"text/html":["\n","
\n"," "]},"metadata":{},"execution_count":56}]},{"cell_type":"markdown","source":["## Clean Data"],"metadata":{"id":"awB7GeAeJpUI"}},{"cell_type":"markdown","source":["Inspect orders with \"M\" Stock code"],"metadata":{"id":"aiQra_QvUbpp"}},{"cell_type":"code","source":["df.query(\"StockCode == 'M'\")"],"metadata":{"id":"U14ylK22UbP5"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["Remove orders with \"M\" Stock code"],"metadata":{"id":"uFT5UFj_JtVs"}},{"cell_type":"code","source":["df_clean = df.query(\"StockCode != 'M'\")"],"metadata":{"id":"p7qlTK2-J_S7"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["Remove free products"],"metadata":{"id":"Chi0SqtzKK90"}},{"cell_type":"code","source":["df_clean = df_clean.query(\"UnitPrice > 0\").copy()"],"metadata":{"id":"0EJj2yScKKrD"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["Descriptive statistics"],"metadata":{"id":"QFdifa10KTTb"}},{"cell_type":"code","source":["df_clean.describe()"],"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":323},"id":"mTcdPmyhKUrM","executionInfo":{"status":"ok","timestamp":1675074990377,"user_tz":-60,"elapsed":28,"user":{"displayName":"Tobias Zwingmann","userId":"04750770048122423001"}},"outputId":"3046bd3e-ebf4-4d1b-c26c-02ab9692203c"},"execution_count":null,"outputs":[{"output_type":"execute_result","data":{"text/plain":[" CustomerID InvoiceNo Quantity UnitPrice\n","count 91383.000000 91383.000000 91383.000000 91383.000000\n","mean 15102.159526 560119.154077 12.308055 2.875542\n","std 1755.922140 13203.645238 39.358124 4.637190\n","min 12347.000000 536367.000000 1.000000 0.040000\n","25% 13488.000000 548739.000000 2.000000 1.250000\n","50% 14944.000000 560841.000000 6.000000 1.690000\n","75% 16628.000000 571676.000000 12.000000 3.750000\n","max 18287.000000 581582.000000 2880.000000 300.000000"],"text/html":["\n","
\n"," "]},"metadata":{},"execution_count":59}]},{"cell_type":"markdown","source":["### Calculate Revenue"],"metadata":{"id":"KPjewQaXKmD9"}},{"cell_type":"code","source":["df_clean['Revenue'] = df_clean['Quantity'] * df_clean['UnitPrice']"],"metadata":{"id":"5M1U8TbhKnVd"},"execution_count":null,"outputs":[]},{"cell_type":"code","source":["df_clean"],"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":447},"id":"ww8NSxh1K2tF","executionInfo":{"status":"ok","timestamp":1675074990378,"user_tz":-60,"elapsed":27,"user":{"displayName":"Tobias Zwingmann","userId":"04750770048122423001"}},"outputId":"f2919eac-7665-40eb-8c41-2d025945f74a"},"execution_count":null,"outputs":[{"output_type":"execute_result","data":{"text/plain":[" CustomerID InvoiceNo InvoiceDate StockCode Quantity \\\n","0 13047 536367 2010-12-01 08:34:00 84879 32 \n","1 13047 536367 2010-12-01 08:34:00 22745 6 \n","2 13047 536367 2010-12-01 08:34:00 22748 6 \n","3 13047 536367 2010-12-01 08:34:00 22749 8 \n","4 13047 536367 2010-12-01 08:34:00 22310 6 \n","... ... ... ... ... ... \n","91460 17581 581581 2011-12-09 12:20:00 23562 6 \n","91461 17581 581581 2011-12-09 12:20:00 23561 6 \n","91462 17581 581581 2011-12-09 12:20:00 23681 10 \n","91463 17581 581582 2011-12-09 12:21:00 23552 6 \n","91464 17581 581582 2011-12-09 12:21:00 23498 12 \n","\n"," UnitPrice Revenue \n","0 1.69 54.08 \n","1 2.10 12.60 \n","2 2.10 12.60 \n","3 3.75 30.00 \n","4 1.65 9.90 \n","... ... ... \n","91460 2.89 17.34 \n","91461 2.89 17.34 \n","91462 1.65 16.50 \n","91463 2.08 12.48 \n","91464 1.45 17.40 \n","\n","[91383 rows x 7 columns]"],"text/html":["\n","
\n"," "]},"metadata":{},"execution_count":61}]},{"cell_type":"markdown","source":["### Find a criterion to flag wholesale orders"],"metadata":{"id":"cNR4Y-8rK9Ra"}},{"cell_type":"markdown","source":["Approach: Flag every order whose Quantity is at least 3 standard deviations (STD) above the mean as a wholesale (B2B) order"],"metadata":{"id":"I5rchhr0K_-E"}},{"cell_type":"code","source":["def flag_wholesale(quantity):\n","    # Mean and STD of Quantity, rounded from df_clean.describe()\n","    quantity_avg = 12.3\n","    quantity_std = 39.36\n","\n","    if quantity >= (quantity_avg + 3 * quantity_std):\n","        return \"B2B\"\n","    else:\n","        return \"B2C\""],"metadata":{"id":"ywsUOycnLQ2T"},"execution_count":null,"outputs":[]},{"cell_type":"code","source":["flag_wholesale(300)"],"metadata":{"id":"1ZSFuKR3WC2O"},"execution_count":null,"outputs":[]},{"cell_type":"code","source":["df_clean['Quantity'].map(flag_wholesale)"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"HgVR1O_CL-Qo","executionInfo":{"status":"ok","timestamp":1675074990379,"user_tz":-60,"elapsed":26,"user":{"displayName":"Tobias Zwingmann","userId":"04750770048122423001"}},"outputId":"006c4216-ad6b-41f4-ab02-26f6c9689aa9"},"execution_count":null,"outputs":[{"output_type":"execute_result","data":{"text/plain":["0 B2C\n","1 B2C\n","2 B2C\n","3 B2C\n","4 B2C\n"," ... 
\n","91460 B2C\n","91461 B2C\n","91462 B2C\n","91463 B2C\n","91464 B2C\n","Name: Quantity, Length: 91383, dtype: object"]},"metadata":{},"execution_count":63}]},{"cell_type":"code","source":["df_clean['Segment'] = df_clean['Quantity'].map(flag_wholesale)"],"metadata":{"id":"KZruQWlqMHQZ"},"execution_count":null,"outputs":[]},{"cell_type":"code","source":["df_clean"],"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":447},"id":"DfArme6zMM_H","executionInfo":{"status":"ok","timestamp":1675074990745,"user_tz":-60,"elapsed":10,"user":{"displayName":"Tobias Zwingmann","userId":"04750770048122423001"}},"outputId":"d9b21aa6-ed03-495f-a96a-526c36d679d2"},"execution_count":null,"outputs":[{"output_type":"execute_result","data":{"text/plain":[" CustomerID InvoiceNo InvoiceDate StockCode Quantity \\\n","0 13047 536367 2010-12-01 08:34:00 84879 32 \n","1 13047 536367 2010-12-01 08:34:00 22745 6 \n","2 13047 536367 2010-12-01 08:34:00 22748 6 \n","3 13047 536367 2010-12-01 08:34:00 22749 8 \n","4 13047 536367 2010-12-01 08:34:00 22310 6 \n","... ... ... ... ... ... \n","91460 17581 581581 2011-12-09 12:20:00 23562 6 \n","91461 17581 581581 2011-12-09 12:20:00 23561 6 \n","91462 17581 581581 2011-12-09 12:20:00 23681 10 \n","91463 17581 581582 2011-12-09 12:21:00 23552 6 \n","91464 17581 581582 2011-12-09 12:21:00 23498 12 \n","\n"," UnitPrice Revenue Segment \n","0 1.69 54.08 B2C \n","1 2.10 12.60 B2C \n","2 2.10 12.60 B2C \n","3 3.75 30.00 B2C \n","4 1.65 9.90 B2C \n","... ... ... ... \n","91460 2.89 17.34 B2C \n","91461 2.89 17.34 B2C \n","91462 1.65 16.50 B2C \n","91463 2.08 12.48 B2C \n","91464 1.45 17.40 B2C \n","\n","[91383 rows x 8 columns]"],"text/html":["\n","
\n"," "]},"metadata":{},"execution_count":65}]},{"cell_type":"markdown","source":["### Compare B2B to B2C orders"],"metadata":{"id":"axVS61HsMQQe"}},{"cell_type":"code","source":["pd.pivot_table(df_clean, index = ['Segment'], values = ['Revenue'], aggfunc = ['sum'] )"],"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":196},"id":"IXq4wOM_NXfK","executionInfo":{"status":"ok","timestamp":1675074990746,"user_tz":-60,"elapsed":10,"user":{"displayName":"Tobias Zwingmann","userId":"04750770048122423001"}},"outputId":"c32c2eb0-f5b3-4ad5-8605-e73c12f802c3"},"execution_count":null,"outputs":[{"output_type":"execute_result","data":{"text/plain":[" sum\n"," Revenue\n","Segment \n","B2B 352482.43\n","B2C 1564510.24"],"text/html":["\n","
\n"," "]},"metadata":{},"execution_count":66}]},{"cell_type":"code","source":["(pd\n",".pivot_table(df_clean, index = ['Segment'], values = ['Revenue'], aggfunc = 'sum' )\n",".plot\n",".bar());"],"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":320},"id":"pcXg72KgNLVC","executionInfo":{"status":"ok","timestamp":1675074990746,"user_tz":-60,"elapsed":8,"user":{"displayName":"Tobias Zwingmann","userId":"04750770048122423001"}},"outputId":"5e486d69-90bf-4664-f22b-4e3e8d18a87a"},"execution_count":null,"outputs":[{"output_type":"display_data","data":{"text/plain":[""],"image/png":"[base64-encoded PNG omitted: bar chart of summed Revenue by Segment (B2B vs. B2C)]"},"metadata":{"needs_background":"light"}}]},{"cell_type":"markdown","source":["**Clear all outputs!**\n","\n","**Run again!**"],"metadata":{"id":"25xpzlHYXXKk"}}]}
--------------------------------------------------------------------------------
/Week 1/Python 
Exercise/Retail_Data_Analysis.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [] 7 | }, 8 | "kernelspec": { 9 | "name": "python3", 10 | "display_name": "Python 3" 11 | }, 12 | "language_info": { 13 | "name": "python" 14 | } 15 | }, 16 | "cells": [ 17 | { 18 | "cell_type": "markdown", 19 | "source": [ 20 | "# Retail Data Exercise\n", 21 | "\n", 22 | "## Goal:\n", 23 | "Compare revenue of Wholesale (B2B) vs. Non-Wholesale (B2C) Orders\n", 24 | "\n", 25 | "## Tasks:\n", 26 | "- Perform an Exploratory Data Analysis\n", 27 | "- Fix errors in the data\n", 28 | "- Calculate revenue\n", 29 | "- Find a criterion to flag wholesale orders\n", 30 | "- Compare B2B to B2C orders" 31 | ], 32 | "metadata": { 33 | "id": "NAqnc8OBHawP" 34 | } 35 | }, 36 | { 37 | "cell_type": "markdown", 38 | "source": [ 39 | "## Setup" 40 | ], 41 | "metadata": { 42 | "id": "dtpRqDOYHmpb" 43 | } 44 | }, 45 | { 46 | "cell_type": "code", 47 | "execution_count": null, 48 | "metadata": { 49 | "id": "BK2uO23-HTHT" 50 | }, 51 | "outputs": [], 52 | "source": [ 53 | "import pandas as pd" 54 | ] 55 | }, 56 | { 57 | "cell_type": "markdown", 58 | "source": [ 59 | "## Exploratory Data Analysis" 60 | ], 61 | "metadata": { 62 | "id": "IjWn5WMiH3kR" 63 | } 64 | }, 65 | { 66 | "cell_type": "code", 67 | "source": [ 68 | "df = pd.read_excel(\"retail_data_s.xlsx\", sheet_name = \"Data\")" 69 | ], 70 | "metadata": { 71 | "id": "STjPHoyHH_-S" 72 | }, 73 | "execution_count": null, 74 | "outputs": [] 75 | }, 76 | { 77 | "cell_type": "code", 78 | "source": [ 79 | "df" 80 | ], 81 | "metadata": { 82 | "id": "_yMq2LZPIIBt" 83 | }, 84 | "execution_count": null, 85 | "outputs": [] 86 | }, 87 | { 88 | "cell_type": "markdown", 89 | "source": [ 90 | "Note: \n", 91 | "\n", 92 | "* A CustomerID can have multiple InvoiceNo values, and an InvoiceNo can have multiple rows (orders)\n", 93 | "* ID column is 
already there" 94 | ], 95 | "metadata": { 96 | "id": "B-nxJLFVTtAB" 97 | } 98 | }, 99 | { 100 | "cell_type": "code", 101 | "source": [ 102 | "df.describe()" 103 | ], 104 | "metadata": { 105 | "id": "gxD02i2KI1IN" 106 | }, 107 | "execution_count": null, 108 | "outputs": [] 109 | }, 110 | { 111 | "cell_type": "code", 112 | "source": [ 113 | "df.sort_values(\"Quantity\", ascending = False).head(20)" 114 | ], 115 | "metadata": { 116 | "id": "NCYR3_QNJxTV" 117 | }, 118 | "execution_count": null, 119 | "outputs": [] 120 | }, 121 | { 122 | "cell_type": "markdown", 123 | "source": [ 124 | "## Clean Data" 125 | ], 126 | "metadata": { 127 | "id": "awB7GeAeJpUI" 128 | } 129 | }, 130 | { 131 | "cell_type": "markdown", 132 | "source": [ 133 | "Inspect orders with stock code \"M\"" 134 | ], 135 | "metadata": { 136 | "id": "aiQra_QvUbpp" 137 | } 138 | }, 139 | { 140 | "cell_type": "code", 141 | "source": [ 142 | "df.query(\"StockCode == 'M'\")" 143 | ], 144 | "metadata": { 145 | "id": "U14ylK22UbP5" 146 | }, 147 | "execution_count": null, 148 | "outputs": [] 149 | }, 150 | { 151 | "cell_type": "markdown", 152 | "source": [ 153 | "Remove orders with stock code \"M\"" 154 | ], 155 | "metadata": { 156 | "id": "uFT5UFj_JtVs" 157 | } 158 | }, 159 | { 160 | "cell_type": "code", 161 | "source": [ 162 | "df_clean = df.query(\"StockCode != 'M'\")" 163 | ], 164 | "metadata": { 165 | "id": "p7qlTK2-J_S7" 166 | }, 167 | "execution_count": null, 168 | "outputs": [] 169 | }, 170 | { 171 | "cell_type": "markdown", 172 | "source": [ 173 | "Remove free products (UnitPrice of 0)" 174 | ], 175 | "metadata": { 176 | "id": "Chi0SqtzKK90" 177 | } 178 | }, 179 | { 180 | "cell_type": "code", 181 | "source": [ 182 | "df_clean = df_clean.query(\"UnitPrice > 0\").copy()" 183 | ], 184 | "metadata": { 185 | "id": "0EJj2yScKKrD" 186 | }, 187 | "execution_count": null, 188 | "outputs": [] 189 | }, 190 | { 191 | "cell_type": "markdown", 192 | "source": [ 193 | "Descriptive statistics" 194 | ], 195 | "metadata": { 196 | "id": 
"QFdifa10KTTb" 197 | } 198 | }, 199 | { 200 | "cell_type": "code", 201 | "source": [ 202 | "df_clean.describe()" 203 | ], 204 | "metadata": { 205 | "id": "mTcdPmyhKUrM" 206 | }, 207 | "execution_count": null, 208 | "outputs": [] 209 | }, 210 | { 211 | "cell_type": "markdown", 212 | "source": [ 213 | "### Calculate Revenue" 214 | ], 215 | "metadata": { 216 | "id": "KPjewQaXKmD9" 217 | } 218 | }, 219 | { 220 | "cell_type": "code", 221 | "source": [ 222 | "df_clean['Revenue'] = df_clean['Quantity'] * df_clean['UnitPrice']" 223 | ], 224 | "metadata": { 225 | "id": "5M1U8TbhKnVd" 226 | }, 227 | "execution_count": null, 228 | "outputs": [] 229 | }, 230 | { 231 | "cell_type": "code", 232 | "source": [ 233 | "df_clean" 234 | ], 235 | "metadata": { 236 | "id": "ww8NSxh1K2tF" 237 | }, 238 | "execution_count": null, 239 | "outputs": [] 240 | }, 241 | { 242 | "cell_type": "markdown", 243 | "source": [ 244 | "### Find a criterion to flag wholesale orders" 245 | ], 246 | "metadata": { 247 | "id": "cNR4Y-8rK9Ra" 248 | } 249 | }, 250 | { 251 | "cell_type": "markdown", 252 | "source": [ 253 | "Approach: Flag every order whose Quantity is at least the mean plus 3 standard deviations (about 130 units) as a wholesale (B2B) order" 254 | ], 255 | "metadata": { 256 | "id": "I5rchhr0K_-E" 257 | } 258 | }, 259 | { 260 | "cell_type": "code", 261 | "source": [ 262 | "def flag_wholesale(quantity):\n", 263 | "    quantity_avg = 12.3  # mean of Quantity, rounded from df_clean.describe()\n", 264 | "    quantity_std = 39.36  # standard deviation of Quantity, rounded from df_clean.describe()\n", 265 | "\n", 266 | "    if quantity >= (quantity_avg + 3 * quantity_std):\n", 267 | "        return \"B2B\"\n", 268 | "    else:\n", 269 | "        return \"B2C\"" 270 | ], 271 | "metadata": { 272 | "id": "ywsUOycnLQ2T" 273 | }, 274 | "execution_count": null, 275 | "outputs": [] 276 | }, 277 | { 278 | "cell_type": "code", 279 | "source": [ 280 | "flag_wholesale(300)" 281 | ], 282 | "metadata": { 283 | "id": "1ZSFuKR3WC2O" 284 | }, 285 | "execution_count": null, 286 | "outputs": [] 287 | }, 288 | { 289 | "cell_type": "code", 290 | "source": [ 291 | "df_clean['Quantity'].map(flag_wholesale)" 292 | ], 293 | 
"metadata": { 294 | "id": "HgVR1O_CL-Qo" 295 | }, 296 | "execution_count": null, 297 | "outputs": [] 298 | }, 299 | { 300 | "cell_type": "code", 301 | "source": [ 302 | "df_clean['Segment'] = df_clean['Quantity'].map(flag_wholesale)" 303 | ], 304 | "metadata": { 305 | "id": "KZruQWlqMHQZ" 306 | }, 307 | "execution_count": null, 308 | "outputs": [] 309 | }, 310 | { 311 | "cell_type": "code", 312 | "source": [ 313 | "df_clean" 314 | ], 315 | "metadata": { 316 | "id": "DfArme6zMM_H" 317 | }, 318 | "execution_count": null, 319 | "outputs": [] 320 | }, 321 | { 322 | "cell_type": "markdown", 323 | "source": [ 324 | "### Compare B2B to B2C orders" 325 | ], 326 | "metadata": { 327 | "id": "axVS61HsMQQe" 328 | } 329 | }, 330 | { 331 | "cell_type": "code", 332 | "source": [ 333 | "pd.pivot_table(df_clean, index = ['Segment'], values = ['Revenue'], aggfunc = ['sum'] )" 334 | ], 335 | "metadata": { 336 | "id": "IXq4wOM_NXfK" 337 | }, 338 | "execution_count": null, 339 | "outputs": [] 340 | }, 341 | { 342 | "cell_type": "code", 343 | "source": [ 344 | "(pd\n", 345 | "    .pivot_table(df_clean, index = ['Segment'], values = ['Revenue'], aggfunc = 'sum' )\n", 346 | "    .plot\n", 347 | "    .bar());" 348 | ], 349 | "metadata": { 350 | "id": "pcXg72KgNLVC" 351 | }, 352 | "execution_count": null, 353 | "outputs": [] 354 | } 355 | ] 356 | } -------------------------------------------------------------------------------- /Week 1/Python Exercise/readme.md: -------------------------------------------------------------------------------- 1 | # Exercise 2: Analyzing data with Excel and Python 2 | 3 | ## Step 1: 4 | * Open the file `retail_data_s.xlsx` and follow the instructions - [Google Sheets Version](https://docs.google.com/spreadsheets/d/1bHKzsKCOBkTdS83ILHF3oddL9KOvgFXRLXSyC9ImDa8/edit?usp=sharing) 5 | 6 | ## Step 2: 7 | * Open the file `Retail_Data_Analysis.ipynb` - [Google Colab Version](https://colab.research.google.com/drive/1P8Q_inleNvFE-Ksi7rBEJn-BZPfM4ob8?usp=sharing) 8 | 9 | ## Step 3: 
10 | * Reflect on both approaches. What was the same? What was different? 11 | -------------------------------------------------------------------------------- /Week 1/Python Exercise/retail_data_s.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tobiaszwingmann/business_analytics_with_python/51f4c217c842ac240134e97cb39da72049e79437/Week 1/Python Exercise/retail_data_s.xlsx -------------------------------------------------------------------------------- /Week 2/Exercise 1 - Write Code/README.md: -------------------------------------------------------------------------------- 1 | # Exercise 1 Tutorial 2 | 3 | ## Step 1: Install Anaconda 4 | 5 | Visit [Anaconda Download Page](https://www.anaconda.com/products/distribution) 6 | 7 | ## Step 2: Open Anaconda Navigator 8 | 9 | * Mac/Windows: Search for Anaconda Navigator 10 | 11 | ## Step 3: Launch Jupyter Notebook 12 | 13 | * Launch Jupyter Notebook from Anaconda 14 | * Create a new directory 15 | * Create a new notebook file 16 | * Create a new markdown cell 17 | * Create a new code cell 18 | * Run notebook & print "Hello World!" 
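A minimal sketch of what the first code cell of the new notebook could contain. The helper name `greet` is only an illustration and not part of the course material; a bare `print("Hello World!")` works just as well.

```python
def greet(name: str = "World") -> str:
    """Build the greeting string that the notebook cell displays."""
    return f"Hello {name}!"


# Running the cell (Shift+Enter in Jupyter) prints the greeting below the cell.
print(greet())
```

Wrapping the text in a small function is optional; it simply makes the cell easy to rerun with a different name, for example `greet("Anaconda")`.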
19 | -------------------------------------------------------------------------------- /Week 2/Exercise 2 - Store Code/Folder Structure Template/YYYY-MM-DD_Project_Name/1_Resources/.gitignore: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tobiaszwingmann/business_analytics_with_python/51f4c217c842ac240134e97cb39da72049e79437/Week 2/Exercise 2 - Store Code/Folder Structure Template/YYYY-MM-DD_Project_Name/1_Resources/.gitignore -------------------------------------------------------------------------------- /Week 2/Exercise 2 - Store Code/Folder Structure Template/YYYY-MM-DD_Project_Name/2_Original_Data/.gitignore: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tobiaszwingmann/business_analytics_with_python/51f4c217c842ac240134e97cb39da72049e79437/Week 2/Exercise 2 - Store Code/Folder Structure Template/YYYY-MM-DD_Project_Name/2_Original_Data/.gitignore -------------------------------------------------------------------------------- /Week 2/Exercise 2 - Store Code/Folder Structure Template/YYYY-MM-DD_Project_Name/3_Prepared_Data/.gitignore: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tobiaszwingmann/business_analytics_with_python/51f4c217c842ac240134e97cb39da72049e79437/Week 2/Exercise 2 - Store Code/Folder Structure Template/YYYY-MM-DD_Project_Name/3_Prepared_Data/.gitignore -------------------------------------------------------------------------------- /Week 2/Exercise 2 - Store Code/Folder Structure Template/YYYY-MM-DD_Project_Name/4_Scripts/.gitignore: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tobiaszwingmann/business_analytics_with_python/51f4c217c842ac240134e97cb39da72049e79437/Week 2/Exercise 2 - Store Code/Folder Structure Template/YYYY-MM-DD_Project_Name/4_Scripts/.gitignore 
-------------------------------------------------------------------------------- /Week 2/Exercise 2 - Store Code/Folder Structure Template/YYYY-MM-DD_Project_Name/5_Final_Outputs/.gitignore: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tobiaszwingmann/business_analytics_with_python/51f4c217c842ac240134e97cb39da72049e79437/Week 2/Exercise 2 - Store Code/Folder Structure Template/YYYY-MM-DD_Project_Name/5_Final_Outputs/.gitignore -------------------------------------------------------------------------------- /Week 2/Exercise 2 - Store Code/README.md: -------------------------------------------------------------------------------- 1 | # Exercise 2 Tutorial 2 | 3 | ## Step 1: Download GitHub Desktop 4 | 5 | Visit the [GitHub Desktop Download Page](https://desktop.github.com/) 6 | 7 | ## Step 2: Clone This Repository 8 | 9 | Repository URL: https://github.com/tobiaszwingmann/business_analytics_with_python.git 10 | 11 | ## Step 3: Copy the folder structure to your personal repo 12 | Path: `Week 2/Exercise 2 - Store Code/Folder Structure Template` 13 | 14 | -------------------------------------------------------------------------------- /Week 2/Exercise 3 - Run Code/README.md: -------------------------------------------------------------------------------- 1 | # Exercise 3 - Run Code: Tutorial 2 | 3 | ## Step 1: Open the Python Sandbox on O'Reilly 4 | 5 | 1. Log in to oreilly.com 6 | 2. Start Learning --> Interactive Learning 7 | 3. 
Sandbox --> Python Sandbox 8 | 9 | (Direct Link: https://learning.oreilly.com/scenarios/python-sandbox/9781492062844/ ) 10 | 11 | ## Step 2: Clone This Repository 12 | 13 | Clone the course repository into the sandbox using this command: 14 | 15 | ``` 16 | git clone https://github.com/tobiaszwingmann/business_analytics_with_python.git 17 | ``` 18 | 19 | ## Step 3: Install packages 20 | 21 | Type the following commands in the terminal to install the Python package manager pip and the venv module: 22 | 23 | ``` 24 | sudo apt update 25 | sudo apt install python3-venv python3-pip 26 | ``` 27 | 28 | ## Step 4: Work with VS Code! 29 | 30 | Explore the sandbox with the VS Code IDE. 31 | 32 | Try to: 33 | 34 | 1. Create and run a new script file 35 | 2. Open the notebook `Retail_Data_Analysis.ipynb` from the repository 36 | 3. Run the script `retail_data_analysis.py` from the command line 37 | -------------------------------------------------------------------------------- /Week 2/Exercise 3 - Run Code/Retail_Data_Analysis.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "id": "NAqnc8OBHawP" 7 | }, 8 | "source": [ 9 | "# Retail Data Exercise\n", 10 | "\n", 11 | "## Goal:\n", 12 | "Compare revenue of Wholesale (B2B) vs. 
Non-Wholesale (B2C) Orders\n", 13 | "\n", 14 | "## Tasks:\n", 15 | "- Perform an Exploratory Data Analysis\n", 16 | "- Fix errors in the data\n", 17 | "- Calculate revenue\n", 18 | "- Find a criterion to flag wholesale orders\n", 19 | "- Compare B2B to B2C orders" 20 | ] 21 | }, 22 | { 23 | "cell_type": "markdown", 24 | "metadata": { 25 | "id": "dtpRqDOYHmpb" 26 | }, 27 | "source": [ 28 | "## Setup" 29 | ] 30 | }, 31 | { 32 | "cell_type": "code", 33 | "execution_count": null, 34 | "metadata": {}, 35 | "outputs": [], 36 | "source": [ 37 | "%pip install pandas openpyxl matplotlib" 38 | ] 39 | }, 40 | { 41 | "cell_type": "code", 42 | "execution_count": null, 43 | "metadata": { 44 | "id": "BK2uO23-HTHT" 45 | }, 46 | "outputs": [], 47 | "source": [ 48 | "import pandas as pd" 49 | ] 50 | }, 51 | { 52 | "cell_type": "markdown", 53 | "metadata": { 54 | "id": "IjWn5WMiH3kR" 55 | }, 56 | "source": [ 57 | "## Exploratory Data Analysis" 58 | ] 59 | }, 60 | { 61 | "cell_type": "code", 62 | "execution_count": null, 63 | "metadata": { 64 | "id": "STjPHoyHH_-S" 65 | }, 66 | "outputs": [], 67 | "source": [ 68 | "df = pd.read_excel(\"retail_data_s.xlsx\", sheet_name = \"Data\")" 69 | ] 70 | }, 71 | { 72 | "cell_type": "code", 73 | "execution_count": null, 74 | "metadata": { 75 | "colab": { 76 | "base_uri": "https://localhost:8080/", 77 | "height": 447 78 | }, 79 | "id": "_yMq2LZPIIBt", 80 | "outputId": "5512bc49-f80d-451d-fa5a-f6348268b29f" 81 | }, 82 | "outputs": [ 83 | { 84 | "data": { 85 | "text/html": [ 86 | "\n", 87 | "
<table>
<tr><th></th><th>CustomerID</th><th>InvoiceNo</th><th>InvoiceDate</th><th>StockCode</th><th>Quantity</th><th>UnitPrice</th></tr>
<tr><th>0</th><td>13047</td><td>536367</td><td>2010-12-01 08:34:00</td><td>84879</td><td>32</td><td>1.69</td></tr>
<tr><th>1</th><td>13047</td><td>536367</td><td>2010-12-01 08:34:00</td><td>22745</td><td>6</td><td>2.10</td></tr>
<tr><th>2</th><td>13047</td><td>536367</td><td>2010-12-01 08:34:00</td><td>22748</td><td>6</td><td>2.10</td></tr>
<tr><th>3</th><td>13047</td><td>536367</td><td>2010-12-01 08:34:00</td><td>22749</td><td>8</td><td>3.75</td></tr>
<tr><th>4</th><td>13047</td><td>536367</td><td>2010-12-01 08:34:00</td><td>22310</td><td>6</td><td>1.65</td></tr>
<tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>
<tr><th>91460</th><td>17581</td><td>581581</td><td>2011-12-09 12:20:00</td><td>23562</td><td>6</td><td>2.89</td></tr>
<tr><th>91461</th><td>17581</td><td>581581</td><td>2011-12-09 12:20:00</td><td>23561</td><td>6</td><td>2.89</td></tr>
<tr><th>91462</th><td>17581</td><td>581581</td><td>2011-12-09 12:20:00</td><td>23681</td><td>10</td><td>1.65</td></tr>
<tr><th>91463</th><td>17581</td><td>581582</td><td>2011-12-09 12:21:00</td><td>23552</td><td>6</td><td>2.08</td></tr>
<tr><th>91464</th><td>17581</td><td>581582</td><td>2011-12-09 12:21:00</td><td>23498</td><td>12</td><td>1.45</td></tr>
</table>
<p>91465 rows × 6 columns</p>
\n", 294 | " " 295 | ], 296 | "text/plain": [ 297 | " CustomerID InvoiceNo InvoiceDate StockCode Quantity \\\n", 298 | "0 13047 536367 2010-12-01 08:34:00 84879 32 \n", 299 | "1 13047 536367 2010-12-01 08:34:00 22745 6 \n", 300 | "2 13047 536367 2010-12-01 08:34:00 22748 6 \n", 301 | "3 13047 536367 2010-12-01 08:34:00 22749 8 \n", 302 | "4 13047 536367 2010-12-01 08:34:00 22310 6 \n", 303 | "... ... ... ... ... ... \n", 304 | "91460 17581 581581 2011-12-09 12:20:00 23562 6 \n", 305 | "91461 17581 581581 2011-12-09 12:20:00 23561 6 \n", 306 | "91462 17581 581581 2011-12-09 12:20:00 23681 10 \n", 307 | "91463 17581 581582 2011-12-09 12:21:00 23552 6 \n", 308 | "91464 17581 581582 2011-12-09 12:21:00 23498 12 \n", 309 | "\n", 310 | " UnitPrice \n", 311 | "0 1.69 \n", 312 | "1 2.10 \n", 313 | "2 2.10 \n", 314 | "3 3.75 \n", 315 | "4 1.65 \n", 316 | "... ... \n", 317 | "91460 2.89 \n", 318 | "91461 2.89 \n", 319 | "91462 1.65 \n", 320 | "91463 2.08 \n", 321 | "91464 1.45 \n", 322 | "\n", 323 | "[91465 rows x 6 columns]" 324 | ] 325 | }, 326 | "execution_count": 54, 327 | "metadata": {}, 328 | "output_type": "execute_result" 329 | } 330 | ], 331 | "source": [ 332 | "df" 333 | ] 334 | }, 335 | { 336 | "cell_type": "markdown", 337 | "metadata": { 338 | "id": "B-nxJLFVTtAB" 339 | }, 340 | "source": [ 341 | "Note: \n", 342 | "\n", 343 | "* A Customer ID can have multiple InvoiceNo and an InvoiceNo can have multiple rows (orders)\n", 344 | "* ID column is already there" 345 | ] 346 | }, 347 | { 348 | "cell_type": "code", 349 | "execution_count": null, 350 | "metadata": { 351 | "colab": { 352 | "base_uri": "https://localhost:8080/", 353 | "height": 323 354 | }, 355 | "id": "gxD02i2KI1IN", 356 | "outputId": "78725159-9006-4cc9-a10f-7ae27fa5bcc9" 357 | }, 358 | "outputs": [ 359 | { 360 | "data": { 361 | "text/html": [ 362 | "\n", 363 | "
<table>
<tr><th></th><th>CustomerID</th><th>InvoiceNo</th><th>Quantity</th><th>UnitPrice</th></tr>
<tr><th>count</th><td>91465.000000</td><td>91465.000000</td><td>91465.000000</td><td>91465.000000</td></tr>
<tr><th>mean</th><td>15101.770426</td><td>560116.483114</td><td>12.352386</td><td>2.943347</td></tr>
<tr><th>std</th><td>1756.168325</td><td>13205.742026</td><td>40.065201</td><td>8.776144</td></tr>
<tr><th>min</th><td>12347.000000</td><td>536367.000000</td><td>1.000000</td><td>0.000000</td></tr>
<tr><th>25%</th><td>13488.000000</td><td>548739.000000</td><td>2.000000</td><td>1.250000</td></tr>
<tr><th>50%</th><td>14944.000000</td><td>560841.000000</td><td>6.000000</td><td>1.690000</td></tr>
<tr><th>75%</th><td>16628.000000</td><td>571676.000000</td><td>12.000000</td><td>3.750000</td></tr>
<tr><th>max</th><td>18287.000000</td><td>581582.000000</td><td>2880.000000</td><td>1241.980000</td></tr>
</table>
\n", 524 | " " 525 | ], 526 | "text/plain": [ 527 | " CustomerID InvoiceNo Quantity UnitPrice\n", 528 | "count 91465.000000 91465.000000 91465.000000 91465.000000\n", 529 | "mean 15101.770426 560116.483114 12.352386 2.943347\n", 530 | "std 1756.168325 13205.742026 40.065201 8.776144\n", 531 | "min 12347.000000 536367.000000 1.000000 0.000000\n", 532 | "25% 13488.000000 548739.000000 2.000000 1.250000\n", 533 | "50% 14944.000000 560841.000000 6.000000 1.690000\n", 534 | "75% 16628.000000 571676.000000 12.000000 3.750000\n", 535 | "max 18287.000000 581582.000000 2880.000000 1241.980000" 536 | ] 537 | }, 538 | "execution_count": 55, 539 | "metadata": {}, 540 | "output_type": "execute_result" 541 | } 542 | ], 543 | "source": [ 544 | "df.describe()" 545 | ] 546 | }, 547 | { 548 | "cell_type": "code", 549 | "execution_count": null, 550 | "metadata": { 551 | "colab": { 552 | "base_uri": "https://localhost:8080/", 553 | "height": 702 554 | }, 555 | "id": "NCYR3_QNJxTV", 556 | "outputId": "ba6b6644-b0c6-496d-d018-55ae294801c5" 557 | }, 558 | "outputs": [ 559 | { 560 | "data": { 561 | "text/html": [ 562 | "\n", 563 | "
<table>
<tr><th></th><th>CustomerID</th><th>InvoiceNo</th><th>InvoiceDate</th><th>StockCode</th><th>Quantity</th><th>UnitPrice</th></tr>
<tr><th>973</th><td>16754</td><td>536830</td><td>2010-12-02 17:38:00</td><td>84077</td><td>2880</td><td>0.18</td></tr>
<tr><th>58793</th><td>17450</td><td>567423</td><td>2011-09-20 11:05:00</td><td>23285</td><td>1944</td><td>1.08</td></tr>
<tr><th>58792</th><td>17450</td><td>567423</td><td>2011-09-20 11:05:00</td><td>23288</td><td>1944</td><td>1.08</td></tr>
<tr><th>58791</th><td>17450</td><td>567423</td><td>2011-09-20 11:05:00</td><td>23286</td><td>1878</td><td>1.08</td></tr>
<tr><th>82854</th><td>17857</td><td>578060</td><td>2011-11-22 15:22:00</td><td>M</td><td>1600</td><td>0.25</td></tr>
<tr><th>13870</th><td>13848</td><td>543991</td><td>2011-02-15 10:17:00</td><td>16014</td><td>1500</td><td>0.32</td></tr>
<tr><th>47220</th><td>15118</td><td>561638</td><td>2011-07-28 14:54:00</td><td>84568</td><td>1440</td><td>0.17</td></tr>
<tr><th>1087</th><td>14156</td><td>536890</td><td>2010-12-03 11:48:00</td><td>17084R</td><td>1440</td><td>0.16</td></tr>
<tr><th>58797</th><td>17450</td><td>567423</td><td>2011-09-20 11:05:00</td><td>22969</td><td>1428</td><td>1.70</td></tr>
<tr><th>58800</th><td>17450</td><td>567423</td><td>2011-09-20 11:05:00</td><td>23243</td><td>1412</td><td>5.06</td></tr>
<tr><th>90554</th><td>15195</td><td>581115</td><td>2011-12-07 12:20:00</td><td>22413</td><td>1404</td><td>2.75</td></tr>
<tr><th>974</th><td>16754</td><td>536830</td><td>2010-12-02 17:38:00</td><td>21915</td><td>1400</td><td>1.06</td></tr>
<tr><th>3680</th><td>17857</td><td>537981</td><td>2010-12-09 11:35:00</td><td>22492</td><td>1394</td><td>0.55</td></tr>
<tr><th>8788</th><td>17450</td><td>540689</td><td>2011-01-11 08:43:00</td><td>22469</td><td>1356</td><td>1.93</td></tr>
<tr><th>8789</th><td>17450</td><td>540689</td><td>2011-01-11 08:43:00</td><td>22470</td><td>1284</td><td>3.21</td></tr>
<tr><th>76886</th><td>17857</td><td>575328</td><td>2011-11-09 13:48:00</td><td>M</td><td>1200</td><td>0.25</td></tr>
<tr><th>71043</th><td>13685</td><td>572670</td><td>2011-10-25 13:14:00</td><td>22065</td><td>1200</td><td>0.39</td></tr>
<tr><th>12753</th><td>17381</td><td>543192</td><td>2011-02-04 11:55:00</td><td>15036</td><td>1200</td><td>0.65</td></tr>
<tr><th>41730</th><td>13082</td><td>559047</td><td>2011-07-05 15:44:00</td><td>15036</td><td>1200</td><td>0.72</td></tr>
<tr><th>8787</th><td>17450</td><td>540689</td><td>2011-01-11 08:43:00</td><td>85123A</td><td>1010</td><td>3.24</td></tr>
</table>
\n", 850 | " " 851 | ], 852 | "text/plain": [ 853 | " CustomerID InvoiceNo InvoiceDate StockCode Quantity \\\n", 854 | "973 16754 536830 2010-12-02 17:38:00 84077 2880 \n", 855 | "58793 17450 567423 2011-09-20 11:05:00 23285 1944 \n", 856 | "58792 17450 567423 2011-09-20 11:05:00 23288 1944 \n", 857 | "58791 17450 567423 2011-09-20 11:05:00 23286 1878 \n", 858 | "82854 17857 578060 2011-11-22 15:22:00 M 1600 \n", 859 | "13870 13848 543991 2011-02-15 10:17:00 16014 1500 \n", 860 | "47220 15118 561638 2011-07-28 14:54:00 84568 1440 \n", 861 | "1087 14156 536890 2010-12-03 11:48:00 17084R 1440 \n", 862 | "58797 17450 567423 2011-09-20 11:05:00 22969 1428 \n", 863 | "58800 17450 567423 2011-09-20 11:05:00 23243 1412 \n", 864 | "90554 15195 581115 2011-12-07 12:20:00 22413 1404 \n", 865 | "974 16754 536830 2010-12-02 17:38:00 21915 1400 \n", 866 | "3680 17857 537981 2010-12-09 11:35:00 22492 1394 \n", 867 | "8788 17450 540689 2011-01-11 08:43:00 22469 1356 \n", 868 | "8789 17450 540689 2011-01-11 08:43:00 22470 1284 \n", 869 | "76886 17857 575328 2011-11-09 13:48:00 M 1200 \n", 870 | "71043 13685 572670 2011-10-25 13:14:00 22065 1200 \n", 871 | "12753 17381 543192 2011-02-04 11:55:00 15036 1200 \n", 872 | "41730 13082 559047 2011-07-05 15:44:00 15036 1200 \n", 873 | "8787 17450 540689 2011-01-11 08:43:00 85123A 1010 \n", 874 | "\n", 875 | " UnitPrice \n", 876 | "973 0.18 \n", 877 | "58793 1.08 \n", 878 | "58792 1.08 \n", 879 | "58791 1.08 \n", 880 | "82854 0.25 \n", 881 | "13870 0.32 \n", 882 | "47220 0.17 \n", 883 | "1087 0.16 \n", 884 | "58797 1.70 \n", 885 | "58800 5.06 \n", 886 | "90554 2.75 \n", 887 | "974 1.06 \n", 888 | "3680 0.55 \n", 889 | "8788 1.93 \n", 890 | "8789 3.21 \n", 891 | "76886 0.25 \n", 892 | "71043 0.39 \n", 893 | "12753 0.65 \n", 894 | "41730 0.72 \n", 895 | "8787 3.24 " 896 | ] 897 | }, 898 | "execution_count": 56, 899 | "metadata": {}, 900 | "output_type": "execute_result" 901 | } 902 | ], 903 | "source": [ 904 | "df.sort_values(\"Quantity\", 
ascending = False).head(20)" 905 | ] 906 | }, 907 | { 908 | "cell_type": "markdown", 909 | "metadata": { 910 | "id": "awB7GeAeJpUI" 911 | }, 912 | "source": [ 913 | "## Clean Data" 914 | ] 915 | }, 916 | { 917 | "cell_type": "markdown", 918 | "metadata": { 919 | "id": "aiQra_QvUbpp" 920 | }, 921 | "source": [ 922 | "Inspect orders with stock code \"M\"" 923 | ] 924 | }, 925 | { 926 | "cell_type": "code", 927 | "execution_count": null, 928 | "metadata": { 929 | "id": "U14ylK22UbP5" 930 | }, 931 | "outputs": [], 932 | "source": [ 933 | "df.query(\"StockCode == 'M'\")" 934 | ] 935 | }, 936 | { 937 | "cell_type": "markdown", 938 | "metadata": { 939 | "id": "uFT5UFj_JtVs" 940 | }, 941 | "source": [ 942 | "Remove orders with stock code \"M\"" 943 | ] 944 | }, 945 | { 946 | "cell_type": "code", 947 | "execution_count": null, 948 | "metadata": { 949 | "id": "p7qlTK2-J_S7" 950 | }, 951 | "outputs": [], 952 | "source": [ 953 | "df_clean = df.query(\"StockCode != 'M'\")" 954 | ] 955 | }, 956 | { 957 | "cell_type": "markdown", 958 | "metadata": { 959 | "id": "Chi0SqtzKK90" 960 | }, 961 | "source": [ 962 | "Remove free products (UnitPrice of 0)" 963 | ] 964 | }, 965 | { 966 | "cell_type": "code", 967 | "execution_count": null, 968 | "metadata": { 969 | "id": "0EJj2yScKKrD" 970 | }, 971 | "outputs": [], 972 | "source": [ 973 | "df_clean = df_clean.query(\"UnitPrice > 0\").copy()" 974 | ] 975 | }, 976 | { 977 | "cell_type": "markdown", 978 | "metadata": { 979 | "id": "QFdifa10KTTb" 980 | }, 981 | "source": [ 982 | "Descriptive statistics" 983 | ] 984 | }, 985 | { 986 | "cell_type": "code", 987 | "execution_count": null, 988 | "metadata": { 989 | "colab": { 990 | "base_uri": "https://localhost:8080/", 991 | "height": 323 992 | }, 993 | "id": "mTcdPmyhKUrM", 994 | "outputId": "3046bd3e-ebf4-4d1b-c26c-02ab9692203c" 995 | }, 996 | "outputs": [ 997 | { 998 | "data": { 999 | "text/html": [ 1000 | "\n", 1001 | "
\n", 1002 | "
\n", 1003 | "
\n", 1004 | "\n", 1017 | "\n", 1018 | " \n", 1019 | " \n", 1020 | " \n", 1021 | " \n", 1022 | " \n", 1023 | " \n", 1024 | " \n", 1025 | " \n", 1026 | " \n", 1027 | " \n", 1028 | " \n", 1029 | " \n", 1030 | " \n", 1031 | " \n", 1032 | " \n", 1033 | " \n", 1034 | " \n", 1035 | " \n", 1036 | " \n", 1037 | " \n", 1038 | " \n", 1039 | " \n", 1040 | " \n", 1041 | " \n", 1042 | " \n", 1043 | " \n", 1044 | " \n", 1045 | " \n", 1046 | " \n", 1047 | " \n", 1048 | " \n", 1049 | " \n", 1050 | " \n", 1051 | " \n", 1052 | " \n", 1053 | " \n", 1054 | " \n", 1055 | " \n", 1056 | " \n", 1057 | " \n", 1058 | " \n", 1059 | " \n", 1060 | " \n", 1061 | " \n", 1062 | " \n", 1063 | " \n", 1064 | " \n", 1065 | " \n", 1066 | " \n", 1067 | " \n", 1068 | " \n", 1069 | " \n", 1070 | " \n", 1071 | " \n", 1072 | " \n", 1073 | " \n", 1074 | " \n", 1075 | " \n", 1076 | " \n", 1077 | " \n", 1078 | " \n", 1079 | " \n", 1080 | " \n", 1081 | " \n", 1082 | " \n", 1083 | " \n", 1084 | " \n", 1085 | "
         CustomerID      InvoiceNo      Quantity     UnitPrice
count  91383.000000   91383.000000  91383.000000  91383.000000
mean   15102.159526  560119.154077     12.308055      2.875542
std     1755.922140   13203.645238     39.358124      4.637190
min    12347.000000  536367.000000      1.000000      0.040000
25%    13488.000000  548739.000000      2.000000      1.250000
50%    14944.000000  560841.000000      6.000000      1.690000
75%    16628.000000  571676.000000     12.000000      3.750000
max    18287.000000  581582.000000   2880.000000    300.000000
\n", 1086 | "
\n", 1087 | " \n", 1097 | " \n", 1098 | " \n", 1135 | "\n", 1136 | " \n", 1160 | "
\n", 1161 | "
\n", 1162 | " " 1163 | ], 1164 | "text/plain": [ 1165 | " CustomerID InvoiceNo Quantity UnitPrice\n", 1166 | "count 91383.000000 91383.000000 91383.000000 91383.000000\n", 1167 | "mean 15102.159526 560119.154077 12.308055 2.875542\n", 1168 | "std 1755.922140 13203.645238 39.358124 4.637190\n", 1169 | "min 12347.000000 536367.000000 1.000000 0.040000\n", 1170 | "25% 13488.000000 548739.000000 2.000000 1.250000\n", 1171 | "50% 14944.000000 560841.000000 6.000000 1.690000\n", 1172 | "75% 16628.000000 571676.000000 12.000000 3.750000\n", 1173 | "max 18287.000000 581582.000000 2880.000000 300.000000" 1174 | ] 1175 | }, 1176 | "execution_count": 59, 1177 | "metadata": {}, 1178 | "output_type": "execute_result" 1179 | } 1180 | ], 1181 | "source": [ 1182 | "df_clean.describe()" 1183 | ] 1184 | }, 1185 | { 1186 | "cell_type": "markdown", 1187 | "metadata": { 1188 | "id": "KPjewQaXKmD9" 1189 | }, 1190 | "source": [ 1191 | "### Calculate Revenue" 1192 | ] 1193 | }, 1194 | { 1195 | "cell_type": "code", 1196 | "execution_count": null, 1197 | "metadata": { 1198 | "id": "5M1U8TbhKnVd" 1199 | }, 1200 | "outputs": [], 1201 | "source": [ 1202 | "df_clean['Revenue'] = df_clean['Quantity'] * df_clean['UnitPrice']" 1203 | ] 1204 | }, 1205 | { 1206 | "cell_type": "code", 1207 | "execution_count": null, 1208 | "metadata": { 1209 | "colab": { 1210 | "base_uri": "https://localhost:8080/", 1211 | "height": 447 1212 | }, 1213 | "id": "ww8NSxh1K2tF", 1214 | "outputId": "f2919eac-7665-40eb-8c41-2d025945f74a" 1215 | }, 1216 | "outputs": [ 1217 | { 1218 | "data": { 1219 | "text/html": [ 1220 | "\n", 1221 | "
\n", 1222 | "
\n", 1223 | "
\n", 1224 | "\n", 1237 | "\n", 1238 | " \n", 1239 | " \n", 1240 | " \n", 1241 | " \n", 1242 | " \n", 1243 | " \n", 1244 | " \n", 1245 | " \n", 1246 | " \n", 1247 | " \n", 1248 | " \n", 1249 | " \n", 1250 | " \n", 1251 | " \n", 1252 | " \n", 1253 | " \n", 1254 | " \n", 1255 | " \n", 1256 | " \n", 1257 | " \n", 1258 | " \n", 1259 | " \n", 1260 | " \n", 1261 | " \n", 1262 | " \n", 1263 | " \n", 1264 | " \n", 1265 | " \n", 1266 | " \n", 1267 | " \n", 1268 | " \n", 1269 | " \n", 1270 | " \n", 1271 | " \n", 1272 | " \n", 1273 | " \n", 1274 | " \n", 1275 | " \n", 1276 | " \n", 1277 | " \n", 1278 | " \n", 1279 | " \n", 1280 | " \n", 1281 | " \n", 1282 | " \n", 1283 | " \n", 1284 | " \n", 1285 | " \n", 1286 | " \n", 1287 | " \n", 1288 | " \n", 1289 | " \n", 1290 | " \n", 1291 | " \n", 1292 | " \n", 1293 | " \n", 1294 | " \n", 1295 | " \n", 1296 | " \n", 1297 | " \n", 1298 | " \n", 1299 | " \n", 1300 | " \n", 1301 | " \n", 1302 | " \n", 1303 | " \n", 1304 | " \n", 1305 | " \n", 1306 | " \n", 1307 | " \n", 1308 | " \n", 1309 | " \n", 1310 | " \n", 1311 | " \n", 1312 | " \n", 1313 | " \n", 1314 | " \n", 1315 | " \n", 1316 | " \n", 1317 | " \n", 1318 | " \n", 1319 | " \n", 1320 | " \n", 1321 | " \n", 1322 | " \n", 1323 | " \n", 1324 | " \n", 1325 | " \n", 1326 | " \n", 1327 | " \n", 1328 | " \n", 1329 | " \n", 1330 | " \n", 1331 | " \n", 1332 | " \n", 1333 | " \n", 1334 | " \n", 1335 | " \n", 1336 | " \n", 1337 | " \n", 1338 | " \n", 1339 | " \n", 1340 | " \n", 1341 | " \n", 1342 | " \n", 1343 | " \n", 1344 | " \n", 1345 | " \n", 1346 | " \n", 1347 | " \n", 1348 | " \n", 1349 | " \n", 1350 | " \n", 1351 | " \n", 1352 | " \n", 1353 | " \n", 1354 | " \n", 1355 | " \n", 1356 | " \n", 1357 | " \n", 1358 | " \n", 1359 | " \n", 1360 | " \n", 1361 | " \n", 1362 | "
       CustomerID  InvoiceNo          InvoiceDate  StockCode  Quantity  UnitPrice  Revenue
0           13047     536367  2010-12-01 08:34:00      84879        32       1.69    54.08
1           13047     536367  2010-12-01 08:34:00      22745         6       2.10    12.60
2           13047     536367  2010-12-01 08:34:00      22748         6       2.10    12.60
3           13047     536367  2010-12-01 08:34:00      22749         8       3.75    30.00
4           13047     536367  2010-12-01 08:34:00      22310         6       1.65     9.90
...           ...        ...                  ...        ...       ...        ...      ...
91460       17581     581581  2011-12-09 12:20:00      23562         6       2.89    17.34
91461       17581     581581  2011-12-09 12:20:00      23561         6       2.89    17.34
91462       17581     581581  2011-12-09 12:20:00      23681        10       1.65    16.50
91463       17581     581582  2011-12-09 12:21:00      23552         6       2.08    12.48
91464       17581     581582  2011-12-09 12:21:00      23498        12       1.45    17.40
\n", 1363 | "

91383 rows × 7 columns
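
The Revenue column shown above is the row-wise product of Quantity and UnitPrice. The following is a minimal sketch of the same cleaning and revenue steps written as one method chain. It assumes a tiny hand-made DataFrame: `sample` is hypothetical, with three rows copied from the table above plus two invented rows that the filters should drop. Using `assign` on the filtered result is one possible way to add the column without assigning into a filtered DataFrame, which can trigger pandas' SettingWithCopyWarning.

```python
import pandas as pd

# Hypothetical sample: three rows from the table above plus two invented rows
# (a manual "M" entry and a free product) that the cleaning steps should drop.
sample = pd.DataFrame({
    "StockCode": ["84879", "22745", "22749", "M", "22310"],
    "Quantity": [32, 6, 8, 1, 6],
    "UnitPrice": [1.69, 2.10, 3.75, 5.00, 0.0],
})

sample_clean = (
    sample
    .query("StockCode != 'M'")   # remove orders with "M" stock code
    .query("UnitPrice > 0")      # remove free products
    .assign(Revenue=lambda d: d["Quantity"] * d["UnitPrice"])  # new column on the result
)
print(sample_clean)
```

The chain returns a new DataFrame at each step, so the original `sample` is left unchanged.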

\n", 1364 | "
\n", 1365 | " \n", 1375 | " \n", 1376 | " \n", 1413 | "\n", 1414 | " \n", 1438 | "
\n", 1439 | "
\n", 1440 | " " 1441 | ], 1442 | "text/plain": [ 1443 | " CustomerID InvoiceNo InvoiceDate StockCode Quantity \\\n", 1444 | "0 13047 536367 2010-12-01 08:34:00 84879 32 \n", 1445 | "1 13047 536367 2010-12-01 08:34:00 22745 6 \n", 1446 | "2 13047 536367 2010-12-01 08:34:00 22748 6 \n", 1447 | "3 13047 536367 2010-12-01 08:34:00 22749 8 \n", 1448 | "4 13047 536367 2010-12-01 08:34:00 22310 6 \n", 1449 | "... ... ... ... ... ... \n", 1450 | "91460 17581 581581 2011-12-09 12:20:00 23562 6 \n", 1451 | "91461 17581 581581 2011-12-09 12:20:00 23561 6 \n", 1452 | "91462 17581 581581 2011-12-09 12:20:00 23681 10 \n", 1453 | "91463 17581 581582 2011-12-09 12:21:00 23552 6 \n", 1454 | "91464 17581 581582 2011-12-09 12:21:00 23498 12 \n", 1455 | "\n", 1456 | " UnitPrice Revenue \n", 1457 | "0 1.69 54.08 \n", 1458 | "1 2.10 12.60 \n", 1459 | "2 2.10 12.60 \n", 1460 | "3 3.75 30.00 \n", 1461 | "4 1.65 9.90 \n", 1462 | "... ... ... \n", 1463 | "91460 2.89 17.34 \n", 1464 | "91461 2.89 17.34 \n", 1465 | "91462 1.65 16.50 \n", 1466 | "91463 2.08 12.48 \n", 1467 | "91464 1.45 17.40 \n", 1468 | "\n", 1469 | "[91383 rows x 7 columns]" 1470 | ] 1471 | }, 1472 | "execution_count": 61, 1473 | "metadata": {}, 1474 | "output_type": "execute_result" 1475 | } 1476 | ], 1477 | "source": [ 1478 | "df_clean" 1479 | ] 1480 | }, 1481 | { 1482 | "cell_type": "markdown", 1483 | "metadata": { 1484 | "id": "cNR4Y-8rK9Ra" 1485 | }, 1486 | "source": [ 1487 | "### Find a criteria to flag wholesale orders" 1488 | ] 1489 | }, 1490 | { 1491 | "cell_type": "markdown", 1492 | "metadata": { 1493 | "id": "I5rchhr0K_-E" 1494 | }, 1495 | "source": [ 1496 | "Approach: Flag every order with more than 3 * STD of Quantity as Wholesale order" 1497 | ] 1498 | }, 1499 | { 1500 | "cell_type": "code", 1501 | "execution_count": null, 1502 | "metadata": { 1503 | "id": "ywsUOycnLQ2T" 1504 | }, 1505 | "outputs": [], 1506 | "source": [ 1507 | "def flag_wholesale(quantity):\n", 1508 | " quantity_avg = 12.3\n", 1509 | " 
quantity_std = 39.36\n", 1510 | "\n", 1511 | " if quantity >= (quantity_avg + 3 * quantity_std):\n", 1512 | " return \"B2B\"\n", 1513 | " else:\n", 1514 | " return \"B2C\"" 1515 | ] 1516 | }, 1517 | { 1518 | "cell_type": "code", 1519 | "execution_count": null, 1520 | "metadata": { 1521 | "id": "1ZSFuKR3WC2O" 1522 | }, 1523 | "outputs": [], 1524 | "source": [ 1525 | "flag_wholesale(300)" 1526 | ] 1527 | }, 1528 | { 1529 | "cell_type": "code", 1530 | "execution_count": null, 1531 | "metadata": { 1532 | "colab": { 1533 | "base_uri": "https://localhost:8080/" 1534 | }, 1535 | "id": "HgVR1O_CL-Qo", 1536 | "outputId": "006c4216-ad6b-41f4-ab02-26f6c9689aa9" 1537 | }, 1538 | "outputs": [ 1539 | { 1540 | "data": { 1541 | "text/plain": [ 1542 | "0 B2C\n", 1543 | "1 B2C\n", 1544 | "2 B2C\n", 1545 | "3 B2C\n", 1546 | "4 B2C\n", 1547 | " ... \n", 1548 | "91460 B2C\n", 1549 | "91461 B2C\n", 1550 | "91462 B2C\n", 1551 | "91463 B2C\n", 1552 | "91464 B2C\n", 1553 | "Name: Quantity, Length: 91383, dtype: object" 1554 | ] 1555 | }, 1556 | "execution_count": 63, 1557 | "metadata": {}, 1558 | "output_type": "execute_result" 1559 | } 1560 | ], 1561 | "source": [ 1562 | "df_clean['Quantity'].map(flag_wholesale)" 1563 | ] 1564 | }, 1565 | { 1566 | "cell_type": "code", 1567 | "execution_count": null, 1568 | "metadata": { 1569 | "id": "KZruQWlqMHQZ" 1570 | }, 1571 | "outputs": [], 1572 | "source": [ 1573 | "df_clean['Segment'] = df_clean['Quantity'].map(flag_wholesale)" 1574 | ] 1575 | }, 1576 | { 1577 | "cell_type": "code", 1578 | "execution_count": null, 1579 | "metadata": { 1580 | "colab": { 1581 | "base_uri": "https://localhost:8080/", 1582 | "height": 447 1583 | }, 1584 | "id": "DfArme6zMM_H", 1585 | "outputId": "d9b21aa6-ed03-495f-a96a-526c36d679d2" 1586 | }, 1587 | "outputs": [ 1588 | { 1589 | "data": { 1590 | "text/html": [ 1591 | "\n", 1592 | "
\n", 1593 | "
\n", 1594 | "
\n", 1595 | "\n", 1608 | "\n", 1609 | " \n", 1610 | " \n", 1611 | " \n", 1612 | " \n", 1613 | " \n", 1614 | " \n", 1615 | " \n", 1616 | " \n", 1617 | " \n", 1618 | " \n", 1619 | " \n", 1620 | " \n", 1621 | " \n", 1622 | " \n", 1623 | " \n", 1624 | " \n", 1625 | " \n", 1626 | " \n", 1627 | " \n", 1628 | " \n", 1629 | " \n", 1630 | " \n", 1631 | " \n", 1632 | " \n", 1633 | " \n", 1634 | " \n", 1635 | " \n", 1636 | " \n", 1637 | " \n", 1638 | " \n", 1639 | " \n", 1640 | " \n", 1641 | " \n", 1642 | " \n", 1643 | " \n", 1644 | " \n", 1645 | " \n", 1646 | " \n", 1647 | " \n", 1648 | " \n", 1649 | " \n", 1650 | " \n", 1651 | " \n", 1652 | " \n", 1653 | " \n", 1654 | " \n", 1655 | " \n", 1656 | " \n", 1657 | " \n", 1658 | " \n", 1659 | " \n", 1660 | " \n", 1661 | " \n", 1662 | " \n", 1663 | " \n", 1664 | " \n", 1665 | " \n", 1666 | " \n", 1667 | " \n", 1668 | " \n", 1669 | " \n", 1670 | " \n", 1671 | " \n", 1672 | " \n", 1673 | " \n", 1674 | " \n", 1675 | " \n", 1676 | " \n", 1677 | " \n", 1678 | " \n", 1679 | " \n", 1680 | " \n", 1681 | " \n", 1682 | " \n", 1683 | " \n", 1684 | " \n", 1685 | " \n", 1686 | " \n", 1687 | " \n", 1688 | " \n", 1689 | " \n", 1690 | " \n", 1691 | " \n", 1692 | " \n", 1693 | " \n", 1694 | " \n", 1695 | " \n", 1696 | " \n", 1697 | " \n", 1698 | " \n", 1699 | " \n", 1700 | " \n", 1701 | " \n", 1702 | " \n", 1703 | " \n", 1704 | " \n", 1705 | " \n", 1706 | " \n", 1707 | " \n", 1708 | " \n", 1709 | " \n", 1710 | " \n", 1711 | " \n", 1712 | " \n", 1713 | " \n", 1714 | " \n", 1715 | " \n", 1716 | " \n", 1717 | " \n", 1718 | " \n", 1719 | " \n", 1720 | " \n", 1721 | " \n", 1722 | " \n", 1723 | " \n", 1724 | " \n", 1725 | " \n", 1726 | " \n", 1727 | " \n", 1728 | " \n", 1729 | " \n", 1730 | " \n", 1731 | " \n", 1732 | " \n", 1733 | " \n", 1734 | " \n", 1735 | " \n", 1736 | " \n", 1737 | " \n", 1738 | " \n", 1739 | " \n", 1740 | " \n", 1741 | " \n", 1742 | " \n", 1743 | " \n", 1744 | " \n", 1745 | "
       CustomerID  InvoiceNo          InvoiceDate  StockCode  Quantity  UnitPrice  Revenue  Segment
0           13047     536367  2010-12-01 08:34:00      84879        32       1.69    54.08      B2C
1           13047     536367  2010-12-01 08:34:00      22745         6       2.10    12.60      B2C
2           13047     536367  2010-12-01 08:34:00      22748         6       2.10    12.60      B2C
3           13047     536367  2010-12-01 08:34:00      22749         8       3.75    30.00      B2C
4           13047     536367  2010-12-01 08:34:00      22310         6       1.65     9.90      B2C
...           ...        ...                  ...        ...       ...        ...      ...      ...
91460       17581     581581  2011-12-09 12:20:00      23562         6       2.89    17.34      B2C
91461       17581     581581  2011-12-09 12:20:00      23561         6       2.89    17.34      B2C
91462       17581     581581  2011-12-09 12:20:00      23681        10       1.65    16.50      B2C
91463       17581     581582  2011-12-09 12:21:00      23552         6       2.08    12.48      B2C
91464       17581     581582  2011-12-09 12:21:00      23498        12       1.45    17.40      B2C
\n", 1746 | "

91383 rows × 8 columns
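
The Segment column above comes from `flag_wholesale`, which hard-codes the rounded mean (12.3) and standard deviation (39.36) of Quantity, giving a cutoff of 12.3 + 3 × 39.36 = 130.38 units. As a hedged alternative, the sketch below computes the threshold from the data and applies the rule in one vectorized step. The `orders` DataFrame and its quantities are invented for illustration, and `np.where` stands in for the row-by-row `map` call; this is not the notebook's own method.

```python
import numpy as np
import pandas as pd

# Hypothetical quantities: mostly small orders plus one very large one.
orders = pd.DataFrame({"Quantity": [6] * 20 + [12] * 10 + [2880]})

# Threshold derived from the data: mean + 3 * (sample) standard deviation.
threshold = orders["Quantity"].mean() + 3 * orders["Quantity"].std()

# Vectorized version of the rule: at or above the threshold is B2B, else B2C.
orders["Segment"] = np.where(orders["Quantity"] >= threshold, "B2B", "B2C")

print(orders["Segment"].value_counts())
```

Computing the threshold from the column keeps the rule in sync with the data if the cleaning steps change, at the cost of the cutoff no longer being a fixed, documented number.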

\n", 1747 | "
\n", 1748 | " \n", 1758 | " \n", 1759 | " \n", 1796 | "\n", 1797 | " \n", 1821 | "
\n", 1822 | "
\n", 1823 | " " 1824 | ], 1825 | "text/plain": [ 1826 | " CustomerID InvoiceNo InvoiceDate StockCode Quantity \\\n", 1827 | "0 13047 536367 2010-12-01 08:34:00 84879 32 \n", 1828 | "1 13047 536367 2010-12-01 08:34:00 22745 6 \n", 1829 | "2 13047 536367 2010-12-01 08:34:00 22748 6 \n", 1830 | "3 13047 536367 2010-12-01 08:34:00 22749 8 \n", 1831 | "4 13047 536367 2010-12-01 08:34:00 22310 6 \n", 1832 | "... ... ... ... ... ... \n", 1833 | "91460 17581 581581 2011-12-09 12:20:00 23562 6 \n", 1834 | "91461 17581 581581 2011-12-09 12:20:00 23561 6 \n", 1835 | "91462 17581 581581 2011-12-09 12:20:00 23681 10 \n", 1836 | "91463 17581 581582 2011-12-09 12:21:00 23552 6 \n", 1837 | "91464 17581 581582 2011-12-09 12:21:00 23498 12 \n", 1838 | "\n", 1839 | " UnitPrice Revenue Segment \n", 1840 | "0 1.69 54.08 B2C \n", 1841 | "1 2.10 12.60 B2C \n", 1842 | "2 2.10 12.60 B2C \n", 1843 | "3 3.75 30.00 B2C \n", 1844 | "4 1.65 9.90 B2C \n", 1845 | "... ... ... ... \n", 1846 | "91460 2.89 17.34 B2C \n", 1847 | "91461 2.89 17.34 B2C \n", 1848 | "91462 1.65 16.50 B2C \n", 1849 | "91463 2.08 12.48 B2C \n", 1850 | "91464 1.45 17.40 B2C \n", 1851 | "\n", 1852 | "[91383 rows x 8 columns]" 1853 | ] 1854 | }, 1855 | "execution_count": 65, 1856 | "metadata": {}, 1857 | "output_type": "execute_result" 1858 | } 1859 | ], 1860 | "source": [ 1861 | "df_clean" 1862 | ] 1863 | }, 1864 | { 1865 | "cell_type": "markdown", 1866 | "metadata": { 1867 | "id": "axVS61HsMQQe" 1868 | }, 1869 | "source": [ 1870 | "### Compare B2B to B2C orders" 1871 | ] 1872 | }, 1873 | { 1874 | "cell_type": "code", 1875 | "execution_count": null, 1876 | "metadata": { 1877 | "colab": { 1878 | "base_uri": "https://localhost:8080/", 1879 | "height": 196 1880 | }, 1881 | "id": "IXq4wOM_NXfK", 1882 | "outputId": "c32c2eb0-f5b3-4ad5-8605-e73c12f802c3" 1883 | }, 1884 | "outputs": [ 1885 | { 1886 | "data": { 1887 | "text/html": [ 1888 | "\n", 1889 | "
\n", 1890 | "
\n", 1891 | "
\n", 1892 | "\n", 1909 | "\n", 1910 | " \n", 1911 | " \n", 1912 | " \n", 1913 | " \n", 1914 | " \n", 1915 | " \n", 1916 | " \n", 1917 | " \n", 1918 | " \n", 1919 | " \n", 1920 | " \n", 1921 | " \n", 1922 | " \n", 1923 | " \n", 1924 | " \n", 1925 | " \n", 1926 | " \n", 1927 | " \n", 1928 | " \n", 1929 | " \n", 1930 | " \n", 1931 | " \n", 1932 | " \n", 1933 | " \n", 1934 | "
                sum
            Revenue
Segment
B2B       352482.43
B2C      1564510.24
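
To compare the two segments, it can help to express the totals above as shares of overall revenue. This is a small sketch that assumes only the two sums from the pivot table; the `revenue` Series is rebuilt by hand here rather than taken from `df_clean`.

```python
import pandas as pd

# Revenue totals per segment, copied from the pivot table above.
revenue = pd.Series({"B2B": 352482.43, "B2C": 1564510.24}, name="Revenue")

# Share of total revenue contributed by each segment.
share = revenue / revenue.sum()
print(share.round(3))
```

On these totals, the B2B segment accounts for roughly 18% of revenue.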
\n", 1935 | "
\n", 1936 | " \n", 1946 | " \n", 1947 | " \n", 1984 | "\n", 1985 | " \n", 2009 | "
\n", 2010 | "
\n", 2011 | " " 2012 | ], 2013 | "text/plain": [ 2014 | " sum\n", 2015 | " Revenue\n", 2016 | "Segment \n", 2017 | "B2B 352482.43\n", 2018 | "B2C 1564510.24" 2019 | ] 2020 | }, 2021 | "execution_count": 66, 2022 | "metadata": {}, 2023 | "output_type": "execute_result" 2024 | } 2025 | ], 2026 | "source": [ 2027 | "pd.pivot_table(df_clean, index = ['Segment'], values = ['Revenue'], aggfunc = ['sum'] )" 2028 | ] 2029 | }, 2030 | { 2031 | "cell_type": "code", 2032 | "execution_count": null, 2033 | "metadata": { 2034 | "colab": { 2035 | "base_uri": "https://localhost:8080/", 2036 | "height": 320 2037 | }, 2038 | "id": "pcXg72KgNLVC", 2039 | "outputId": "5e486d69-90bf-4664-f22b-4e3e8d18a87a" 2040 | }, 2041 | "outputs": [ 2042 | { 2043 | "data": { 2044 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXQAAAEbCAYAAADKwX/cAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAV5ElEQVR4nO3dfbCedX3n8feHJBAoT7PkdIclgUQbLAF5MAd8aF3AlklQJthdHrVbqWh2pDy4th1T3eIOdWd01e1WJ9aNLU1llKfUcbOUih2WiBZwORkBSSJMRC0nZeUQIBUQSOC7f5w77L2Hk3PfCXfOnXOd92vmDPf1u37nur7nzJUPv/O7nlJVSJKmvv36XYAkqTcMdElqCANdkhrCQJekhjDQJakhDHRJaoi+BnqSa5M8nuTBLvtfkGRjkg1Jvra365OkqST9vA49yb8GngG+UlUndOi7ELgJeEdVPZXkl6vq8cmoU5Kmgr6O0KvqTuDJ9rYkr0/yzSTrk3wnya+2Vn0QWFlVT7W+1zCXpDb74hz6KuCKqloM/AHwxVb7scCxSf4hyT1JlvatQknaB83sdwHtkhwMvA24OcnO5gNa/50JLATOAOYCdyZ5Y1U9Pdl1StK+aJ8KdEb/Yni6qk4eZ90w8L2q2g78OMnDjAb8vZNZoCTtq/apKZeq+mdGw/p8gIw6qbX6G4yOzkkyh9EpmEf6Uack7Yv6fdni9cDdwBuSDCe5FHgvcGmS+4ENwLmt7rcBW5NsBO4A/rCqtvajbknaF/X1skVJUu/sU1MukqQ917eTonPmzKn58+f3a/eSNCWtX7/+iaoaGG9d3wJ9/vz5DA0N9Wv3kjQlJfnprtY55SJJDdEx0Lt5gFaSM5Lc13po1rd7W6IkqRvdjNBXA7u8zT7J4Yzenr+sqo4Hzu9NaZKk3dFxDr2q7kwyf4Iu7wG+XlX/2Oq/xw/N2r59O8PDwzz//PN7uolpa/bs2cydO5dZs2b1uxRJfdKLk6LHArOSrAMOAf6sqr4yXscky4HlAEcfffSr1g8PD3PIIYcwf/582p7log6qiq1btzI8PMyCBQv6XY6kPunFSdGZwGLgXcAS4I+THDtex6paVVWDVTU4MPDqq26ef/55jjjiCMN8NyXhiCOO8C8baZrrxQh9GNhaVc8Czya5EzgJeHhPNmaY7xl/b5J6MUL/H8CvJ5mZ5CDgzcCm
bwfOBG5k9CXFJwB/nwRgBvBYksOBQ6rq7ta3f43R8N3pjqr6OaOPgN0G/M9W+w+AE1tvr38bcHNruwAHtH3/N6rqZWBjkn/Z4x9V6shA15RXVS8B64B1SX4A/B6woare2t6vFegTeaHt88ttyy8z+m9lP+Dp1ui+0/f7dh5NOqdcNKUleUOShW1NJwObgIHWCVOSzEpyfFU9zejoe+dVMBftzr6q6p+BHyc5v7XdJDmpw7f9HDhkd/Yj7SkDXVPdwcBfJ9mY5AFgEXA1cB7w6ST3A/cxOlUCcCnw5ST3Ab8EbNvN/b0XuLS13Q2MvolnIg8ALyW535Oi2tu89V/TSpKDq+qZ1ucVwJFVdVWfy5J6wjl0TTfvSvJHjB77PwUu6W85Uu84QpekhnAOXZIawkCXpIYw0CWpIQx0SWoIA12SGuL/AvkvQ8NTXOT1AAAAAElFTkSuQmCC", 2045 | "text/plain": [ 2046 | "
" 2047 | ] 2048 | }, 2049 | "metadata": { 2050 | "needs_background": "light" 2051 | }, 2052 | "output_type": "display_data" 2053 | } 2054 | ], 2055 | "source": [ 2056 | "(pd\n", 2057 | ".pivot_table(df_clean, index = ['Segment'], values = ['Revenue'], aggfunc = 'sum' )\n", 2058 | ".plot\n", 2059 | ".bar());" 2060 | ] 2061 | }, 2062 | { 2063 | "cell_type": "markdown", 2064 | "metadata": { 2065 | "id": "25xpzlHYXXKk" 2066 | }, 2067 | "source": [ 2068 | "**Clear all outputs!**\n", 2069 | "\n", 2070 | "**Run again!**" 2071 | ] 2072 | } 2073 | ], 2074 | "metadata": { 2075 | "colab": { 2076 | "provenance": [] 2077 | }, 2078 | "kernelspec": { 2079 | "display_name": "Python 3", 2080 | "name": "python3" 2081 | }, 2082 | "language_info": { 2083 | "name": "python" 2084 | } 2085 | }, 2086 | "nbformat": 4, 2087 | "nbformat_minor": 0 2088 | } 2089 | -------------------------------------------------------------------------------- /Week 2/Exercise 3 - Run Code/pip.sh: -------------------------------------------------------------------------------- 1 | sudo apt update 2 | sudo apt install python3-venv python3-pip -------------------------------------------------------------------------------- /Week 2/Exercise 3 - Run Code/retail_data_analysis.py: -------------------------------------------------------------------------------- 1 | import pandas as pd 2 | 3 | def flag_wholesale(quantity): 4 | quantity_avg = 12.3 5 | quantity_std = 39.36 6 | 7 | if quantity >= (quantity_avg + 3 * quantity_std): 8 | return "B2B" 9 | else: 10 | return "B2C" 11 | 12 | df = pd.read_excel("retail_data_s.xlsx", sheet_name = "Data") 13 | df_clean = df.query("StockCode != 'M'") 14 | df_clean = df_clean.query("UnitPrice > 0") 15 | df_clean['Revenue'] = df_clean['Quantity'] * df_clean['UnitPrice'] 16 | df_clean['Segment'] = df_clean['Quantity'].map(flag_wholesale) 17 | 18 | print(pd.pivot_table(df_clean, index = ['Segment'], values = ['Revenue'], aggfunc = ['sum'])) 19 | 20 | (pd 21 | .pivot_table(df_clean, 
index = ['Segment'], values = ['Revenue'], aggfunc = 'sum' ) 22 | .plot 23 | .bar() 24 | .get_figure() 25 | .savefig('barchart.png')) -------------------------------------------------------------------------------- /Week 2/Exercise 3 - Run Code/retail_data_s.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tobiaszwingmann/business_analytics_with_python/51f4c217c842ac240134e97cb39da72049e79437/Week 2/Exercise 3 - Run Code/retail_data_s.xlsx -------------------------------------------------------------------------------- /Week 2/Presentation/readme.md: -------------------------------------------------------------------------------- 1 | # Presentation Download 2 | 3 | You can download this week's presentation here: 4 | 5 | [Business Analytics Bootcamp - Week 2.pdf](https://drive.google.com/drive/folders/1WC9tX70uNkzFWEWgEyxyrpwWUtkb0HCE?usp=share_link) -------------------------------------------------------------------------------- /Week 3/Exercise 1 - Data Wrangling/readme.md: -------------------------------------------------------------------------------- 1 | # Exercise 1: Data Wrangling With Python 2 | 3 | ## Tutorial 4 | 5 | Follow the steps in the Jupyter Notebook. 6 | 7 | How to run the notebook 8 | 9 | ### Option 1: Google Colab 10 | * Open this link: https://colab.research.google.com/drive/1a_-okwQUcRQwXTJLPb14USrEHZrjqAID?usp=sharing 11 | * Make a copy 12 | 13 | ### Option 2: On Your Computer 14 | * Clone this repository, e.g. with Github Desktop 15 | * Launch Jupyter Notebook, e.g. 
from Anaconda Navigator 16 | * Open the .ipynb file from this folder in Jupyter Notebook 17 | 18 | ### Option 3: On the O'Reilly Platform 19 | * Open the [Python Sandbox](https://learning.oreilly.com/scenarios/python-sandbox/9781492062844/) 20 | (Start Learning -> Interactive Learning -> Sandboxes -> Python) 21 | * Clone this repository 22 | `git clone https://github.com/tobiaszwingmann/business_analytics_with_python.git` 23 | * Execute the following commands into the terminal to install the Python package manager pip: 24 | * `sudo apt update` 25 | * `sudo apt install python3-venv python3-pip` 26 | * Type `Y` when prompted 27 | * Open the Notebook from the file browser 28 | * Click `Select Kernel` -> `Install suggested extension Python + Jupyter` 29 | 30 | ## Further Resources 31 | * [Pandas Cheat Sheet](https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf) 32 | * Book: [Effective Pandas](https://www.amazon.com/Effective-Pandas-Patterns-Manipulation-Treading/dp/B09MYXXSFM/) -------------------------------------------------------------------------------- /Week 3/Exercise 2 - Descriptive Statistics/readme.md: -------------------------------------------------------------------------------- 1 | # Exercise 2: Descriptive Statistics With Python 2 | 3 | ## Tutorial 4 | 5 | * Follow the steps in the Jupyter Notebook `Descriptive_Statistics_With_Python.ipynb`. 6 | 7 | **How to run the notebook:** 8 | 9 | ### Option 1: Google Colab 10 | * Open this link: https://colab.research.google.com/drive/1lKvv918DuDiwIHoixNkwpYZXQiE5d-IW?usp=sharing 11 | * Make a copy 12 | 13 | ### Option 2: On Your Computer 14 | * Clone this repository, e.g. with Github Desktop 15 | * Launch Jupyter Notebook, e.g. 
from Anaconda Navigator 16 | * Open the .ipynb file from this folder in Jupyter Notebook 17 | 18 | ### Option 3: On the O'Reilly Platform 19 | * Open the [Python Sandbox](https://learning.oreilly.com/scenarios/python-sandbox/9781492062844/) 20 | (Start Learning -> Interactive Learning -> Sandboxes -> Python) 21 | * Clone this repository 22 | `git clone https://github.com/tobiaszwingmann/business_analytics_with_python.git` 23 | * Execute the following commands into the terminal to install the Python package manager pip: 24 | * `sudo apt update` 25 | * `sudo apt install python3-venv python3-pip` 26 | * Type `Y` when prompted 27 | * Open the Notebook from the file browser 28 | * Click `Select Kernel` -> `Install suggested extension Python + Jupyter` 29 | 30 | ## Further Resources 31 | * [Pandas Cheat Sheet](https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf) 32 | * [Numpy Cheat Sheet](https://www.datacamp.com/cheat-sheet/numpy-cheat-sheet-data-analysis-in-python) 33 | * Book: [Effective Pandas](https://www.amazon.com/Effective-Pandas-Patterns-Manipulation-Treading/dp/B09MYXXSFM/) 34 | * Book: [Practical Statistics for Data Scientists](https://learning.oreilly.com/library/view/practical-statistics-for/9781492072935/) -------------------------------------------------------------------------------- /Week 3/Exercise 3 - Data Visualization/readme.md: -------------------------------------------------------------------------------- 1 | # Exercise 3: Data Visualization With Python 2 | 3 | ## Step 1: Exploratory Plots 4 | * Open the Notebook "Exploratory_Plots_With_Pandas.ipynb" using one of the following options: 5 | 6 | 7 | ### Option 1: Google Colab 8 | * Open this link: https://colab.research.google.com/drive/1ifV1rCzU3lcc8NU9DDINHiTnZImcXwNF?usp=sharing 9 | * Make a copy 10 | 11 | 12 | ### Option 2: On Your Computer 13 | * Clone this repository, e.g. with Github Desktop 14 | * Launch Jupyter Notebook, e.g. 
from Anaconda Navigator 15 | * Open the .ipynb file from this folder in Jupyter Notebook 16 | 17 | ### Option 3: On the O'Reilly Platform 18 | * Open the [Python Sandbox](https://learning.oreilly.com/scenarios/python-sandbox/9781492062844/) 19 | (Start Learning -> Interactive Learning -> Sandboxes -> Python) 20 | * Clone this repository 21 | `git clone https://github.com/tobiaszwingmann/business_analytics_with_python.git` 22 | * Execute the following commands into the terminal to install the Python package manager pip: 23 | * `sudo apt update` 24 | * `sudo apt install python3-venv python3-pip` 25 | * Type `Y` when prompted 26 | * Open the Notebook from the file browser 27 | * Click `Select Kernel` -> `Install suggested extension Python + Jupyter` 28 | * NOTE: Running the `sweetviz`package will not work in the sandbox 29 | 30 | ## Step 2: Matplotlib Tutorial 31 | * Repeat the same for Notebook "Matplotlib Tutorial.ipynb" 32 | * Google Colab link: https://colab.research.google.com/drive/1_2bZ9OHLIfGCg1pg0k9QUJnWLRHgUlIY?usp=sharing 33 | 34 | ## Step 3: Seaborn Tutorial 35 | * Repeat the same for Notebook "Seaborn Tutorial.ipynb" 36 | * Google Colab link: https://colab.research.google.com/drive/1yQv-XRlU2BCeaSP1-J5ez1ddMu3PYBnd?usp=sharing 37 | 38 | ## Further Resources 39 | * [Matplotlib Cheat Sheet](https://www.datacamp.com/cheat-sheet/matplotlib-cheat-sheet-plotting-in-python) 40 | * [Seaborn Cheat Sheet](https://www.datacamp.com/cheat-sheet/python-seaborn-cheat-sheet) 41 | * [Data Visualization Cheat Sheet](https://policyviz.com/2018/08/07/dataviz-cheatsheet/) 42 | * Book: [Storytelling with Data](https://learning.oreilly.com/library/view/storytelling-with-data/9781119002253/) 43 | * Book: [Fundamentals of Data Visualization](https://learning.oreilly.com/library/view/fundamentals-of-data/9781492031079/) -------------------------------------------------------------------------------- /Week 3/Presentation/readme.md: 
--------------------------------------------------------------------------------
1 | # Presentation Download
2 |
3 | You can download this week's presentation here:
4 |
5 | [Business Analytics Bootcamp - Week 3.pdf](https://drive.google.com/drive/folders/1VFoSIssgTkKR021Zjwr1aj_lM73gSR39?usp=share_link)
--------------------------------------------------------------------------------
/Week 4/Exercises/readme.md:
--------------------------------------------------------------------------------
1 | # Exercises Week 4: Diagnostic Analytics With Python
2 |
3 | ## Exercise 1: RFM Values
4 | * Complete the interactive learning lab [Calculate RFM and CLV Values](https://learning.oreilly.com/scenarios/-/9781098121747/), [Colab Link](https://colab.research.google.com/drive/1AK417G9DjQ2QycbQcshgtmtPcjX6ZfSX?usp=sharing)
5 |
6 | ## Exercise 2: Clustering
7 | * Step 1: Complete the interactive learning lab [Develop and Interpret a Hierarchical Clustering](https://learning.oreilly.com/scenarios/-/9781098121761/), [Colab Link](https://colab.research.google.com/drive/1cPvsf9aTl_Zx-8OIBwGs2eqspardoa2j?usp=sharing)
8 | * Step 2: Complete the interactive learning lab [Develop and Interpret a K-Means Clustering](https://learning.oreilly.com/scenarios/-/9781098121754/), [Colab Link](https://colab.research.google.com/drive/1mqiN0IOmgBWeSHQfi02susQiAoo29q8P?usp=sharing)
9 |
10 | ## Exercise 3: Rule Mining
11 | * Complete the interactive learning lab [Perform a Market Basket Analysis](https://learning.oreilly.com/scenarios/-/9781098121785/), [Colab Link](https://colab.research.google.com/drive/1Mv7G0CR7bIkpCJk9RBsJaC8IOw7vAr0V?usp=sharing)
12 |
--------------------------------------------------------------------------------
/Week 4/Presentation/readme.md:
--------------------------------------------------------------------------------
1 | # Presentation Download
2 |
3 | You can download this week's presentation here:
4 |
5 | [Business Analytics Bootcamp - Week 4.pdf](https://drive.google.com/drive/folders/1ne6EGUTSAv9WF5LpznozFoHsAU94LQX8?usp=share_link)
--------------------------------------------------------------------------------
/Week 5/Exercises/readme.md:
--------------------------------------------------------------------------------
1 | # Exercises Week 5: Predictive Analytics With Python
2 |
3 | ## Exercise 1: Hypothesis Testing
4 | Complete the following interactive learning labs:
5 | * [The Normal Distribution](https://learning.oreilly.com/scenarios/-/9781098111144)
6 | * [Central Limit Theorem](https://learning.oreilly.com/scenarios/-/9781098111168)
7 | * [P-values One-Tailed Test](https://learning.oreilly.com/scenarios/-/9781098111182)
8 | * [P-values Two-Tailed Test](https://learning.oreilly.com/scenarios/-/9781098117030)
9 |
10 |
11 | ## Exercise 2: Regression
12 | Complete the following interactive learning labs:
13 | * [Linear Regression: Using Scikit-Learn](https://learning.oreilly.com/scenarios/-/9781098127732/)
14 | * [Linear Regression: Calculating the Squared Error](https://learning.oreilly.com/scenarios/-/9781098127749/)
15 | * [Linear Regression: Coefficient of Determination](https://learning.oreilly.com/scenarios/linear-regression-coefficient/9781098127893/)
16 | * [Linear Regression: Statistical Significance](https://learning.oreilly.com/scenarios/-/9781098127886)
17 |
18 |
19 | ## Exercise 3: Machine Learning Basics
20 | Complete the following interactive learning labs:
21 | * [Prepare Data for Train-Test Splits](https://learning.oreilly.com/scenarios/-/9781098121662/)
22 | * [Predict Categorical Values with Tree-Based Models](https://learning.oreilly.com/scenarios/-/9781098121693/) (until step 5)
23 | * Optional: [Predict Numeric Values with Multiple Linear Regression](https://learning.oreilly.com/scenarios/-/9781098121679/) (until step 5)
24 |
25 | ## Exercise 4: Evaluating Machine Learning Models
26 | Complete the following interactive learning labs:
27 | * [Predict Categorical Values with Tree-Based Models](https://learning.oreilly.com/scenarios/-/9781098121693/) (continue from step 6)
28 | * [Calculate and Interpret Predictive Model Performance](https://learning.oreilly.com/scenarios/-/9781098121709/)
29 |
30 | ## Exercise 5: Time Series with ARIMA Models
31 | Complete the following interactive learning lab:
32 | * [Forecast Time Series with ARIMA Models](https://learning.oreilly.com/scenarios/-/9781098121686/)
33 |
34 | # Further Resources:
35 |
36 | ## Statistics:
37 | * Book: [Essential Math for Data Science](https://learning.oreilly.com/library/view/essential-math-for/9781098102920/)
38 | * Videos: [Regression - Josh Starmer on StatQuest](https://www.youtube.com/watch?v=PaFPbb66DxQ&list=PLblh5JKOoLUIzaEkCLIUxQFjPIlapw8nU)
39 |
40 | ## Machine Learning:
41 | * Book: [Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition](https://learning.oreilly.com/library/view/hands-on-machine-learning/9781492032632/)
42 | * Book: [Approaching (Almost) Any Machine Learning Problem](https://github.com/abhishekkrthakur/approachingalmost/blob/master/AAAMLP.pdf)
43 | * Course: [Machine Learning Crash Course by Google](https://developers.google.com/machine-learning/crash-course/ml-intro)
44 |
45 | ## Time Series:
46 | * Book: [Advanced Forecasting with Python](https://learning.oreilly.com/library/view/advanced-forecasting-with/9781484271506)
47 | * Blog: [How to Create an ARIMA Model for Time Series Forecasting in Python](https://machinelearningmastery.com/arima-for-time-series-forecasting-with-python/)
48 |
--------------------------------------------------------------------------------
/Week 5/Presentation/readme.md:
--------------------------------------------------------------------------------
1 | # Presentation Download
2 |
3 | You can download this week's presentation here:
4 |
5 | [Business Analytics Bootcamp - Week 5.pdf](https://drive.google.com/drive/folders/1wEZOxVsD2-z2gFyYrVjW70NO2wnke8cG?usp=share_link)
--------------------------------------------------------------------------------
/Week 6/Exercise 1/readme.md:
--------------------------------------------------------------------------------
1 | # Exercises Week 6: Prescriptive Analytics With Python
2 |
3 | ## Exercise 1: End-to-End Case Study
4 | From Descriptive to Prescriptive Analytics
5 |
6 | * Open `Prescriptive_Analytics_Credit_Risk_Scoring_Case_Study.ipynb` and complete the notebook
7 | * [Colab Link](https://colab.research.google.com/drive/1gVc3QoFdZI0p9BGrsx05lDlzZvbXO2Rl?usp=sharing)
--------------------------------------------------------------------------------
/Week 6/Exercise 2/readme.md:
--------------------------------------------------------------------------------
1 | # Exercises Week 6: Prescriptive Analytics With Python
2 |
3 | ## Exercise 2: Hands-On Recommendation Systems
4 |
5 | * [Define and Understand a Reinforcement Learning Environment](https://learning.oreilly.com/scenarios/-/9781098121587)
6 | * [Set Up Simulated User Interactions](https://learning.oreilly.com/scenarios/-/9781098121594)
7 | * [Build a Contextual Bandit Using RLlib](https://learning.oreilly.com/scenarios/-/9781098121617)
8 | * Optional: [Train a Deep Neural Network with SlateQ](https://learning.oreilly.com/scenarios/-/9781098121624)
9 |
--------------------------------------------------------------------------------
/Week 6/Presentation/readme.md:
--------------------------------------------------------------------------------
1 | # Presentation Download
2 |
3 | You can download this week's presentation here:
4 |
5 | [Business Analytics Bootcamp - Week 6.pdf](https://drive.google.com/drive/folders/1Tni8XeafkJeadQ9LaM5HzUE0uIxVApWN?usp=share_link)
--------------------------------------------------------------------------------