├── 2020-12-13 DataCamp Python Programmer #189,875.pdf
├── 2020-12-18 DataCamp Data Scientist with Python #190,835.pdf
├── 2020-12-29 DataCamp Data Analyst with Python #192,564.pdf
├── 2021-03-16 AI4I-Foundation-in-AI-Certificate-AI-MakerSpace.pdf
├── 2021-04-25 DataCamp Data Engineer with Python #214,688.pdf
├── 2021-05-09 DataCamp Machine Learning Scientist with Python #218,190.pdf
├── AI4I-1-Q-Introduction_to_Python_Quiz.txt
├── AI4I-2-Q-Libraries_and_Data_Manipulation_Quiz.txt
├── AI4I-3-Q-Exploratory_Data_Analysis_Quiz.txt
├── AI4I-4-Q-Statistical_Thinking_Quiz.txt
├── AI4I-5-Q-Supervised_Learning_Quiz.txt
├── AI4I-6-Q-Unsupervised_Learning_Quiz.txt
├── AI4I-7-Q-Deep_Learning_Quiz.txt
├── AI4I-8-Q-Other_Languages_and_Tools_to_Learn_Quiz.txt
├── AI4I-9-Q-Data_Science_Project_Lifecycle_Quiz.txt
├── AI4I_Data_Science_Project_Lifecycle.pdf
└── README.md


/2020-12-13 DataCamp Python Programmer #189,875.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/JNYH/AI4I_Data_Science_Project_Lifecycle/ab4b752d63f71d63f2fdec0b0fcfcc2190ed92ec/2020-12-13 DataCamp Python Programmer #189,875.pdf


--------------------------------------------------------------------------------
/2020-12-18 DataCamp Data Scientist with Python #190,835.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/JNYH/AI4I_Data_Science_Project_Lifecycle/ab4b752d63f71d63f2fdec0b0fcfcc2190ed92ec/2020-12-18 DataCamp Data Scientist with Python #190,835.pdf


--------------------------------------------------------------------------------
/2020-12-29 DataCamp Data Analyst with Python #192,564.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/JNYH/AI4I_Data_Science_Project_Lifecycle/ab4b752d63f71d63f2fdec0b0fcfcc2190ed92ec/2020-12-29 DataCamp Data Analyst with Python #192,564.pdf


--------------------------------------------------------------------------------
/2021-03-16 AI4I-Foundation-in-AI-Certificate-AI-MakerSpace.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/JNYH/AI4I_Data_Science_Project_Lifecycle/ab4b752d63f71d63f2fdec0b0fcfcc2190ed92ec/2021-03-16 AI4I-Foundation-in-AI-Certificate-AI-MakerSpace.pdf


--------------------------------------------------------------------------------
/2021-04-25 DataCamp Data Engineer with Python #214,688.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/JNYH/AI4I_Data_Science_Project_Lifecycle/ab4b752d63f71d63f2fdec0b0fcfcc2190ed92ec/2021-04-25 DataCamp Data Engineer with Python #214,688.pdf


--------------------------------------------------------------------------------
/2021-05-09 DataCamp Machine Learning Scientist with Python #218,190.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/JNYH/AI4I_Data_Science_Project_Lifecycle/ab4b752d63f71d63f2fdec0b0fcfcc2190ed92ec/2021-05-09 DataCamp Machine Learning Scientist with Python #218,190.pdf


--------------------------------------------------------------------------------
/AI4I-1-Q-Introduction_to_Python_Quiz.txt:
--------------------------------------------------------------------------------
 1 | AI4I-1-Q Introduction to Python Quiz
 2 | 
 3 | 1. What is the output of this code?
 4 | import numpy as np
 5 | m = np.array([2,4,5])
 6 | n = 3
 7 | print(m*n)
 8 | 
 9 | [6 15 4]
10 | [6 12 15] <-answer
11 | [5 2 7]
12 | 
13 | 
14 | 
15 | 2. Select the code to return the following output:
16 | 42
17 | 
18 | p=14
19 | q=3
20 | print(p*q)
21 | (^answer)
22 | 
23 | p=14
24 | q=3
25 | print(p-q)
26 | 
27 | p=14
28 | q=3
29 | print(p+q)
30 | 
31 | 
32 | 
33 | 3. Complete the following code snippet:
34 | import 
35 | numpy
36 |  as np
37 | np.array([19, 21, 15,26])
38 | 
39 | 
40 | 
41 | 4. The variables x and y are defined as follows:
42 | x = np.array([[1, 3, 8], [3, 2, 6]])
43 | y = np.array([[6, 4, 2], [0, 5, 1]])
44 | What is the output of this code?
45 | print(x-y)
46 | 
47 | [[7, 7, 12], [7, 13, 2]]
48 | [[6, 10, 35], [0, 42, 1]]
49 | [[-5, -1, 6], [3, -3, 5]] <-answer
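A minimal sketch of why the marked option is the result, assuming only NumPy's standard element-wise arithmetic: arrays of the same shape are subtracted entry by entry.

```python
import numpy as np

x = np.array([[1, 3, 8], [3, 2, 6]])
y = np.array([[6, 4, 2], [0, 5, 1]])

# Subtraction pairs up the entries that share the same row and column,
# e.g. the top-left entry is 1 - 6 and the bottom-right entry is 6 - 1.
diff = x - y
print(diff)
```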
50 | 
51 | 
52 | 
53 | 5. Select the code that returns the following output:
54 | nv
55 | 
56 | x = ['n', 'z', 'v', 's', 't', 'h', 'o']
57 | print(x[-7] + x[-5])
58 | (^answer)
59 | 
60 | x = ['n', 'z', 'v', 's', 't', 'h', 'o']
61 | print(x[-4] + x[-6])
62 | 
63 | x = ['n', 'z', 'v', 's', 't', 'h', 'o']
64 | print(x[-2] + x[-5])
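The negative indices in the options count from the end of the list. A short sketch of the marked answer (the names `first` and `third` are only illustrative):

```python
x = ['n', 'z', 'v', 's', 't', 'h', 'o']

# Index -1 is the last element, so -len(x) is the first one.
first = x[-7]   # same element as x[0]
third = x[-5]   # same element as x[2]

result = first + third  # string concatenation
print(result)
```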
65 | 
66 | 


--------------------------------------------------------------------------------
/AI4I-2-Q-Libraries_and_Data_Manipulation_Quiz.txt:
--------------------------------------------------------------------------------
  1 | AI4I-2-Q Libraries and Data Manipulation Quiz
  2 | 
  3 | All of the questions in this section refer to this scenario below.
  4 | 
  5 | You are working as a Data Scientist in an organization focusing on global health. Your manager has asked you to analyze a dataset to study Diabetes for a city in a developing country.
  6 | 
  7 | The first dataset you receive was given to you as a text file. When you open the file in a text editor you see the following:
  8 | 
  9 | patient_id#Pregnancies#Glucose#BloodPressure#SkinThickness#Insulin#DiabetesPedigreeFunction#Outcome
 10 | 163#0#114#80#34#285#0.167#0
 11 | 348#3#116#0#0#0#0.187#0
 12 | 395#4#158#78#0#0#0.803#1
 13 | 187#8#181#68#36#495#0.615#1
 14 | 
 15 | 
 16 | 
 17 | Assume the following library imports:
 18 | import numpy as np
 19 | import pandas as pd
 20 | import matplotlib.pyplot as plt
 21 | 
 22 | 
 23 | 1. Complete the code to read in the data using Pandas such that the DataFrame can be displayed as:
 24 |  
 25 |  	Pregnancies	Glucose	BloodPressure	SkinThickness	Insulin	DiabetesPedigreeFunction	Outcome
 26 | patient_id	 	 	 	 	 	 	 
 27 | 163	0	114	80	34	285	0.167	0
 28 | 348	3	116	0	0	0	0.187	0
 29 | 395	4	158	78	0	0	0.803	1
 30 | 187	8	181	68	36	495	0.615	1
 31 | 342	1	95	74	21	73	0.673	0
 32 |  
 33 | Complete the following code (2 answers expected)
 34 | df_main = pd.read_csv(filename, sep="#", index_col=0)  <-answers: sep, index_col
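A runnable sketch of the completed call, using the four sample rows from the scenario. Reading from an in-memory `io.StringIO` buffer instead of a file on disk is an assumption made only so the example is self-contained.

```python
import io

import pandas as pd

# The sample rows shown in the scenario, held in memory instead of a file.
raw = """patient_id#Pregnancies#Glucose#BloodPressure#SkinThickness#Insulin#DiabetesPedigreeFunction#Outcome
163#0#114#80#34#285#0.167#0
348#3#116#0#0#0#0.187#0
395#4#158#78#0#0#0.803#1
187#8#181#68#36#495#0.615#1
"""

# sep="#" splits fields on the hash character;
# index_col=0 turns the first column (patient_id) into the row index.
df_main = pd.read_csv(io.StringIO(raw), sep="#", index_col=0)
print(df_main)
```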
 39 | 
 40 | 
 41 | 
 42 | 2. Once you have loaded in the data, you proceed to do EDA (Exploratory Data Analysis) to have a better understanding of the data.
 43 | As part of this EDA, you decided to create a box and whiskers plot to look at the Glucose levels of the patients by their diabetic condition.
 44 | Assume that the DataFrame is saved as the variable df. Fill in the blank below to display the plot.
 45 | 
 46 | ________________________
 47 | plt.xlabel('Diabetic Condition')
 48 | plt.ylabel('Glucose Level')
 49 | plt.show()
 50 | 
 51 | df.boxplot(column = ['Glucose'], by=['Outcome']) <-answer
 52 | plt.boxplot(df, column = ['Glucose'], by=['Outcome'])
 53 | plt.plot(df, column = ['Glucose'], type='boxplot')
 54 | df.boxplot(column = ['Outcome'], by=['Glucose'])
 55 | 
 56 | 
 57 | 
 58 | 3. As you continue to explore the data, you then focus your attention on the diastolic Blood Pressure readings. You realise that the raw values are not useful and decide to classify (or bin) the values according to well known Blood Pressure categories.
 59 | Fill in the blank with the appropriate function to count the number of patients based on their Blood Pressure Status.
 60 | 
 61 | bins = [0, 80, 90, 120, 200]
 62 | bin_labels = ['Normal', 'High Blood Pressure 1', 'High Blood Pressure 2', 'Hypertensive']
 63 | bp_status = __________(df_main.BloodPressure, bins=bins).value_counts() 
 64 | bp_status.index = bin_labels
 65 | 
 66 | Normal                   568
 67 | High Blood Pressure 1    127
 68 | High Blood Pressure 2     37
 69 | Hypertensive               1
 70 | Name: BloodPressure, dtype: int64
 71 | 
 72 | pd.bin
 73 | pd.cut <-answer
 74 | np.cut
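A small sketch of binning with `pd.cut`, using made-up readings rather than the quiz dataset. It differs from the quiz snippet in one respect: the labels are passed to `pd.cut` directly and the counts are kept in bin order with `sort=False`, so each label stays attached to its own bin. Bins are right-closed by default, so a reading of 0 falls outside `(0, 80]` and is left uncounted.

```python
import pandas as pd

bins = [0, 80, 90, 120, 200]
bin_labels = ['Normal', 'High Blood Pressure 1', 'High Blood Pressure 2', 'Hypertensive']

# Hypothetical diastolic readings; the 0 mimics a missing measurement.
blood_pressure = pd.Series([72, 80, 85, 95, 130, 0], name='BloodPressure')

# Each value is assigned to the interval (left, right] that contains it.
binned = pd.cut(blood_pressure, bins=bins, labels=bin_labels)

# Count patients per category, keeping the categories in bin order.
bp_status = binned.value_counts(sort=False)
print(bp_status)
```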
 75 | 
 76 | 
 77 | 
 78 | 4. Your manager comes to you and says that some additional features have been made available about the patients. You read in the file into a DataFrame with the variable name df_new_features. The first few values are shown below.
 79 |  
 80 |  	BMI	Age
 81 | patient_id	 	 
 82 | 30	34.1	38
 83 | 430	35.0	43
 84 | 259	25.9	24
 85 | 103	22.5	21
 86 | 525	31.6	24
 87 |  
 88 | Combine the two DataFrames, taking care to match the information based on the patient_id. You also decide to sort the DataFrame by ascending age and BMI.
 89 | The resultant DataFrame should look like the following:
 90 |  
 91 |  	Pregnancies	Glucose	BloodPressure	SkinThickness	Insulin	DiabetesPedigreeFunction	Outcome	BMI	Age
 92 | patient_id	 	 	 	 	 	 	 	 	 
 93 | 372	0	118	64	23	89	1.731	0	0.0	21
 94 | 146	0	102	75	23	0	0.572	0	0.0	21
 95 | 61	2	84	0	0	0	0.304	0	0.0	21
 96 | 439	1	97	70	15	0	0.147	0	18.2	21
 97 | 527	1	97	64	19	82	0.299	0	18.2	21
 98 | 
 99 | Complete the following code (2 answers expected)
100 | combined = df.merge(df_new_features, on='patient_id').sort_values(by=['Age', 'BMI'])  <-answers: merge, sort_values
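A minimal sketch of the merge-then-sort step on two tiny frames. All values here are hypothetical, and the merge is written with `left_index=True, right_index=True` (an explicit variant of matching on the `patient_id` index) rather than the `on=` form used in the quiz.

```python
import pandas as pd

# Hypothetical stand-ins for the two DataFrames, both indexed by patient_id.
df = pd.DataFrame(
    {"Glucose": [114, 116, 158]},
    index=pd.Index([163, 348, 395], name="patient_id"),
)
df_new_features = pd.DataFrame(
    {"BMI": [30.1, 25.0, 22.5], "Age": [21, 30, 21]},
    index=pd.Index([395, 163, 348], name="patient_id"),
)

# Match rows on the shared index, then sort by Age first and BMI second.
combined = (
    df.merge(df_new_features, left_index=True, right_index=True)
      .sort_values(by=["Age", "BMI"])
)
print(combined)
```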
105 | 
106 | 
107 | 
108 | 5. After a discussion with your manager, you decide to focus the study on patients who are of working age. Filter the dataset to keep only patients aged 65 years (the country's official retirement age) or younger. Sort the DataFrame by the patient_id.
109 | The output DataFrame should look like the following:
110 |  
111 |  	Pregnancies	Glucose	BloodPressure	SkinThickness	Insulin	DiabetesPedigreeFunction	Outcome	BMI	Age
112 | patient_id	 	 	 	 	 	 	 	 	 
113 | 372	0	118	64	23	89	1.731	0	0.0	21
114 | 146	0	102	75	23	0	0.572	0	0.0	21
115 | 61	2	84	0	0	0	0.304	0	0.0	21
116 | 439	1	97	70	15	0	0.147	0	18.2	21
117 | 527	1	97	64	19	82	0.299	0	18.2	21
118 |  
119 | Which statement creates this result?
120 | working_age = combined.Age <= 65; combined_filtered = combined[working_age]
121 | combined_filtered = combined[combined['Age'] <= 65].sort_index()
122 | combined_filtered = combined.loc[combined.Age <= 65].sort_index()
123 | Any of the above <-answer
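A short sketch of boolean-mask filtering followed by `sort_index()`, on a hypothetical three-row frame. It shows that the bracket form and the `.loc` form select the same rows.

```python
import pandas as pd

# Hypothetical frame indexed by patient_id.
combined = pd.DataFrame(
    {"Age": [21, 70, 65]},
    index=pd.Index([372, 146, 61], name="patient_id"),
)

# A boolean Series marks the rows to keep; indexing with it drops the rest.
working_age = combined["Age"] <= 65

via_brackets = combined[working_age].sort_index()
via_loc = combined.loc[working_age].sort_index()
print(via_brackets)
```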
124 | 
125 | 


--------------------------------------------------------------------------------
/AI4I-3-Q-Exploratory_Data_Analysis_Quiz.txt:
--------------------------------------------------------------------------------
 1 | AI4I-3-Q Exploratory Data Analysis Quiz
 2 | 
 3 | All of the questions in this section refer to the same scenario.
 4 | 
 5 | You are a marketing analyst for a candy company. You have been given a dataset of survey data about candy and have been asked to analyze it.
 6 | 
 7 | Dataset
 8 | AI4I-3 Exploratory Data Analysis Quiz_Dataset
 9 | 
10 | Imports
11 | Assume the following imports.
12 | import numpy as np
13 | import pandas as pd
14 | import matplotlib.pyplot as plt
15 | import seaborn as sns
16 | import scipy
17 | from sklearn.feature_extraction.text import CountVectorizer
18 | 
19 | When you import the csv, use pd.read_csv('candyhierarchy2017 (1).csv', encoding="ISO-8859-1").
20 | 
21 | df = pd.read_csv('candyhierarchy2017 (1).csv', encoding="ISO-8859-1")
22 | df.info()
23 | <class 'pandas.core.frame.DataFrame'>
24 | RangeIndex: 2460 entries, 0 to 2459
25 | Columns: 120 entries, Internal ID to Click Coordinates (x, y)
26 | dtypes: float64(4), int64(1), object(115)
27 | memory usage: 2.3+ MB
28 | 
29 | 
30 | 
31 | 1. How many rows are there in the dataset?
32 | There are 
33 | 2460
34 |  rows in the dataset.
35 | 
36 | 
37 | 
38 | 
39 | print(df.isnull().sum())
40 | Internal ID                    0
41 | Q1: GOING OUT?               110
42 | Q2: GENDER                    41
43 | Q3: AGE                       84
44 | Q4: COUNTRY                   64
45 |                             ... 
46 | Q12: MEDIA [Daily Dish]     2375
47 | Q12: MEDIA [Science]        1098
48 | Q12: MEDIA [ESPN]           2361
49 | Q12: MEDIA [Yahoo]          2393
50 | Click Coordinates (x, y)     855
51 | Length: 120, dtype: int64
52 | 
53 | 
54 | 2. You have decided to do some analysis around the age of the respondents. After inspecting the data, you notice that there are some missing values. How many missing values does the column 'Q3: AGE' have in this dataset?
55 | There are 
56 | 84
57 |  missing values in the column.
58 | 
59 | 


--------------------------------------------------------------------------------
/AI4I-4-Q-Statistical_Thinking_Quiz.txt:
--------------------------------------------------------------------------------
 1 | AI4I-4-Q Statistical Thinking Quiz
 2 | 
 3 | All of the questions in this section refer to the below scenario.
 4 | 
 5 | Your manager has asked you to analyze some data from the HR department.
 6 | The table below shows values from some of the features available.
 7 | 
 8 | MarriedID	GenderID	DeptID	PayRate	EngagementSurvey	EmpSatisfaction	SpecialProjectsCount	DaysLateLast30	Termd
 9 | 0	1.0	0.0	1.0	28.50	2.04	2.0	6.0	0.0	0.0
10 | 1	0.0	1.0	1.0	23.00	5.00	4.0	4.0	0.0	0.0
11 | 2	0.0	1.0	1.0	29.00	3.90	5.0	5.0	0.0	0.0
12 | 3	1.0	0.0	1.0	21.50	3.24	3.0	4.0	NaN	1.0
13 | 4	0.0	0.0	1.0	16.56	5.00	3.0	5.0	0.0	0.0
14 | 
15 | Imports
16 | Assume the following library imports.
17 | import numpy as np
18 | import pandas as pd
19 | import matplotlib.pyplot as plt
20 | 
21 | 
22 | 1. One of the first things you want to look at are the Terminated employees (Termd column) and how engaged they are compared to those still under employment. You are curious about the overall data spread (exact data points are not necessary). Which plot would you use to show this?
23 | 
24 | Bee-swarm plot
25 | Box plot <-answer
26 | Histogram
27 | None of these are suitable
28 | 
29 | 
30 | 
31 | 2. You then want to look at what the employees are earning. To do that, you plot the ECDF (Empirical Cumulative Distribution Function) graph for the hourly pay rate. From the chart below, what proportion of the employees are earning $40 or less per hour?
32 | Is it:
33 | 
34 | 22%
35 | 30%
36 | 70% <-answer
37 | I cannot tell from the chart
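A sketch of how an ECDF is computed and read, assuming a made-up set of hourly pay rates (not the HR dataset). The ECDF value at $40 is the share of observations at or below $40.

```python
import numpy as np


def ecdf(data):
    """Return sorted values and the cumulative share of points at or below each."""
    x = np.sort(data)
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y


# Hypothetical hourly pay rates, for illustration only.
pay_rate = np.array([16.56, 21.5, 23.0, 28.5, 29.0, 38.0, 40.0, 45.0, 55.0, 60.0])

x, y = ecdf(pay_rate)

# Reading the ECDF at 40 is the same as taking the mean of a boolean mask.
share_at_or_below_40 = np.mean(pay_rate <= 40)
print(share_at_or_below_40)
```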
38 | 
39 | 
40 | 
41 | 3. You want to study if there exists a correlation between the pay rate and the employee engagement. To improve your confidence in the calculation, you decide to apply bootstrapping before calculating the replicate. Complete the function below:
42 | 
43 | def draw_bs_pairs (x, y, func, size=1):
44 |   # Set up array of indices
45 |   inds = np.arange(len(x))
46 |   # Initialize the array of replicates
47 |   bs_rep= np.empty(size)
48 |   for i in range(size):
49 |     bs_inds = np.random.choice(inds, size = len(inds))  # answer: choice
52 |     bs_x = x[bs_inds]
53 |     bs_y = y[bs_inds]
54 |     bs_rep[i] = func(bs_x, bs_y)
55 |   return bs_rep
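A runnable sketch of pairs bootstrapping for a correlation coefficient. The `pearson_r` helper, the random seed, and the synthetic `pay_rate` and `engagement` arrays are assumptions for illustration; they are not taken from the HR dataset.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    return np.corrcoef(x, y)[0, 1]


def draw_bs_pairs(x, y, func, size=1):
    """Resample (x, y) pairs with replacement and apply func to each resample."""
    inds = np.arange(len(x))
    bs_rep = np.empty(size)
    for i in range(size):
        # Sampling indices keeps each x value paired with its own y value.
        bs_inds = np.random.choice(inds, size=len(inds))
        bs_rep[i] = func(x[bs_inds], y[bs_inds])
    return bs_rep


# Synthetic data standing in for PayRate and EngagementSurvey.
np.random.seed(42)
pay_rate = np.random.normal(30, 8, size=50)
engagement = 3 + 0.03 * pay_rate + np.random.normal(0, 0.5, size=50)

reps = draw_bs_pairs(pay_rate, engagement, pearson_r, size=1000)

# A 95% confidence interval from the spread of the bootstrap replicates.
ci_low, ci_high = np.percentile(reps, [2.5, 97.5])
print(ci_low, ci_high)
```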
56 | 
57 | 
58 | 
59 | 4. The full dataset has 103 features in total. You decide to try a few dimension reduction techniques. Why do you want to do that?
60 | 
61 | The data will be less complex.
62 | The data will require less disk space.
63 | It will take less computation time to process the data.
64 | During modeling, there is a lower chance of overfitting.
65 | All of the above <-answer
66 | 
67 | 


--------------------------------------------------------------------------------
/AI4I-5-Q-Supervised_Learning_Quiz.txt:
--------------------------------------------------------------------------------
 1 | AI4I-5-Q: Supervised Learning Quiz
 2 | 
 3 | Q1. An overfitted model is typically characterized by …
 4 | High Bias, High Variance
 5 | High Bias, Low Variance
 6 | Low Bias, High Variance <-answer
 7 | Low Bias, Low Variance
 8 | 
 9 | 
10 | Q2. What kind of error is caused by randomness or natural variation in the data generated by a system?
11 | Bias Error
12 | Variance Error
13 | Irreducible Error <-answer
14 | 
15 | 
16 | Q3. Which two metrics are commonly used to evaluate Classification models?
17 | Accuracy
18 | Precision <-answer
19 | Coefficient of Determination
20 | Recall <-answer
21 | 
22 | 
23 | Q4. Which parameter can we add to VotingClassifier to use soft voting to predict the class labels?
24 | voting='soft' <-answer
25 | predict_using='soft'
26 | soft_voting=True
27 | None. VotingClassifier only allows for hard voting.
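A minimal sketch of soft voting with scikit-learn, assuming a synthetic dataset and two arbitrarily chosen base estimators. With `voting='soft'` the ensemble averages the predicted class probabilities, so every base estimator must implement `predict_proba`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data, for illustration only.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

clf = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("dt", DecisionTreeClassifier(random_state=0)),
    ],
    voting="soft",  # average probabilities instead of counting hard votes
)
clf.fit(X, y)

# Averaged class probabilities for the first three samples.
proba = clf.predict_proba(X[:3])
print(proba)
```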
28 | 
29 | 


--------------------------------------------------------------------------------
/AI4I-6-Q-Unsupervised_Learning_Quiz.txt:
--------------------------------------------------------------------------------
 1 | AI4I-6-Q: Unsupervised Learning Quiz
 2 | 
 3 | 1. What are the main drawbacks when using dimension reduction techniques on your data?
 4 | Some information is lost, possibly degrading the performance of the subsequent ML algorithms
 5 | Transformed features are hard to interpret
 6 | It can be computationally expensive
 7 | It adds complexity to your ML pipelines
 8 | All of the above <-answer
 9 | 
10 | 
11 | 2. Imagine performing PCA on a 1000 dimension dataset and you set the explained variance to 95%. How many dimensions will the resulting dataset have?
12 | Either 2 or 3 dimensions
13 | The dataset cannot be reduced
14 | Trick Question! Depends on the dataset <-answer
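A sketch of why the answer depends on the dataset, assuming scikit-learn's `PCA` with a fractional `n_components` (the fraction of variance to retain). The two synthetic datasets are made up: both have 20 features, but one is driven by only 3 underlying factors.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# 20 observed features generated from only 3 latent factors plus tiny noise.
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 20))
low_rank = latent @ mixing + 0.01 * rng.normal(size=(500, 20))

# 20 independent features with equal variance.
isotropic = rng.normal(size=(500, 20))

# Keep as many components as needed to explain 95% of the variance.
k_low_rank = PCA(n_components=0.95, svd_solver="full").fit(low_rank).n_components_
k_isotropic = PCA(n_components=0.95, svd_solver="full").fit(isotropic).n_components_
print(k_low_rank, k_isotropic)
```

The same 95% threshold keeps very different numbers of components for the two datasets.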
15 | 
16 | 
17 | 3. How should you NOT select the optimum number of clusters k in a K-Means Clustering technique?
18 | a) Plot inertia vs number of clusters and try to identify the ‘elbow joint’
19 | b) Look at the loss value for each of the k values and select the one with the lowest loss
20 | c) Plot the silhouette score against the number of clusters. The optimum k value should be near the peak
21 | B only <-answer
22 | 
23 | 
24 | 4. Complete the code snippet below to perform hierarchical clustering on the dataset. Assume that scipy.cluster has been imported for you.
25 | my_cluster = [Blank](dataset, method='complete')
26 | hierarchy.linkage <-answer
27 | linkage
28 | agglomerative
29 | hierarchy
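A runnable sketch of the completed snippet, assuming `from scipy.cluster import hierarchy` and a synthetic two-blob dataset. The `fcluster` call that cuts the tree into flat clusters is an extra step beyond the quiz question.

```python
import numpy as np
from scipy.cluster import hierarchy

# Two well-separated synthetic blobs of 10 points each.
rng = np.random.default_rng(0)
dataset = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(10, 2)),
    rng.normal(loc=10.0, scale=0.5, size=(10, 2)),
])

# Complete linkage merges clusters by the farthest pair of points between them.
my_cluster = hierarchy.linkage(dataset, method='complete')

# Cut the merge tree so that at most two flat clusters remain.
labels = hierarchy.fcluster(my_cluster, t=2, criterion='maxclust')
print(labels)
```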
30 | 
31 | 
32 | 5. Which statement about Non-negative Matrix Factorization (NMF) is not true?
33 | It is a dimension reduction technique
34 | NMF models are easy to interpret
35 | NMF can handle real numbers as input features <-answer
36 | None of the above
37 | 
38 | 


--------------------------------------------------------------------------------
/AI4I-7-Q-Deep_Learning_Quiz.txt:
--------------------------------------------------------------------------------
 1 | AI4I-7-Q: Deep Learning Quiz
 2 | 
 3 | You are trying to build a deep learning classifier that can distinguish between the classes in the MNIST_fashion dataset.
 4 | 
 5 | Access the answer key with this link https://colab.research.google.com/drive/1_hGdZRxwlEQzal8fw-2HY-bPa2I9jmg2?usp=sharing
 6 | 
 7 | Create a notebook in Google Colaboratory. Install and import Tensorflow 2.0 and the required libraries. Load the MNIST_fashion dataset. What is the width of each picture in pixels?
 8 | The width of each picture is 
 9 | 28
10 |  px.
11 | 
12 | 
13 | 
14 | Build a CNN to classify the different classes in the dataset. Use what you learnt about convolution layers, activation functions, and dropout.
15 | What activation should you use in the final layer?
16 | softmax
17 |  should be used in the final activation layer.
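A small NumPy sketch of what softmax does to the final layer's raw scores, assuming a made-up vector of three logits (the quiz model would have one logit per class). It turns arbitrary scores into probabilities that sum to 1, which suits single-label multi-class classification.

```python
import numpy as np


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    # Subtracting the maximum avoids overflow without changing the result.
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    return exps / np.sum(exps)


# Hypothetical raw outputs of a final layer with three classes.
logits = np.array([2.0, 1.0, 0.1])

probs = softmax(logits)
predicted_class = int(np.argmax(probs))
print(probs, predicted_class)
```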
18 | 
19 | 
20 | 
21 | What is the next step after the forward pass when you are training a neural network? (Hint: You only need to answer with one word)
22 | The next step is 
23 | backpropagation
24 | .
25 | 
26 | 
27 | 
28 | If you have a high training accuracy but a low validation accuracy, what is likely to be happening? (Hint: You only need to answer with one word)
29 | This is 
30 | overfitting
31 | .
32 | 
33 | 
34 | 
35 | Fill in the blank.
36 | When using gradient descent, we try to get as close to the 
37 | global
38 |  minimum as possible (Hint: You only need to answer with one word)
39 | 
40 | 


--------------------------------------------------------------------------------
/AI4I-8-Q-Other_Languages_and_Tools_to_Learn_Quiz.txt:
--------------------------------------------------------------------------------
 1 | AI4I-8-Q Other Languages and Tools to Learn Quiz
 2 | 
 3 | This is a series of quiz questions to test your basic understanding of SQL, Shell and Environments.
 4 | 
 5 | 1. With SQL, how do you select all the columns from a table named SALES?
 6 | SELECT ALL FROM SALES
 7 | SELECT * FROM SALES <-answer
 8 | SELECT SALES
 9 | SELECT * SALES
10 | 
11 | 
12 | 2. Which SQL keywords specify the sorting direction of the result set retrieved with the ORDER BY clause?
13 | ASC <-answer
14 | REVERSE
15 | SORT
16 | DESC <-answer
17 | 
18 | 
19 | 3. You are in the shell. We have a file called ‘sample’. We want to highlight only the lines that do not contain the character ‘a’, but the result should be in reverse order. We then want to write the resulting output to a file called ‘myoutput’.
20 | 
21 | What commands do you issue to shell? (Use Cat, Grep and Sort Commands to help you)
22 | grep a sample -v | sort -r >> myoutput <-answer
23 | grep a sample | sort >> myoutput
24 | grep a sample -v | sort -r | myoutput
25 | grep a sample | sort -r >> myoutput
26 | 
27 | 


--------------------------------------------------------------------------------
/AI4I-9-Q-Data_Science_Project_Lifecycle_Quiz.txt:
--------------------------------------------------------------------------------
 1 | AI4I-9-Q Data Science Project Lifecycle Quiz
 2 | 
 3 | 
 4 | 1. Which metric is not appropriate to measure a classification model?
 5 | RMSE <-answer
 6 | F1
 7 | Accuracy
 8 | Silhouette Score
 9 | 
10 | 
11 | 2. Which metric is appropriate to measure a classification model built with an imbalanced dataset?
12 | F1 <-answer
13 | RMSE
14 | Accuracy
15 | Silhouette Score
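A worked sketch of why accuracy can mislead on imbalanced data while F1 does not, assuming a hypothetical confusion matrix with 5 positives among 100 samples. The helper name `accuracy_and_f1` is illustrative.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Compute accuracy and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


# Hypothetical model: finds only 1 of the 5 positives and raises 1 false alarm.
accuracy, f1 = accuracy_and_f1(tp=1, fp=1, fn=4, tn=94)
print(f"accuracy={accuracy:.2f}, f1={f1:.2f}")
```

Accuracy stays high because the majority class dominates the count, whereas F1 reflects the poor recall on the minority class.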
16 | 
17 | 
18 | 3. Which is not a main principle of SCRUM methodology?
19 | Frequent sprints to get fast feedback from stakeholders
20 | Defining project goal upfront and not deviating from it. <-answer
21 | Product owners provide the direction on what product features to build
22 | 
23 | 


--------------------------------------------------------------------------------
/AI4I_Data_Science_Project_Lifecycle.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/JNYH/AI4I_Data_Science_Project_Lifecycle/ab4b752d63f71d63f2fdec0b0fcfcc2190ed92ec/AI4I_Data_Science_Project_Lifecycle.pdf


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # AI4I_Data_Science_Project_Lifecycle
 2 | This is a memo to share what I have learnt in Data Science Project Lifecycle, capturing the learning objectives as well as my personal notes. The course is by Infocomm Media Development Authority (IMDA) AI For Industry (AI4I) with 24 slides compiled by LIM Tern Poh.
 3 | 
 4 | The mission of AI Singapore is to anchor deep national capabilities in Artificial Intelligence, thereby creating social and economic impact, growing local talent, building an AI ecosystem and putting Singapore on the world map.
 5 | 
 6 | AI for Industry (AI4I) is a fully online programme to help learners PLUS-skill themselves and learn data science, machine learning, artificial intelligence and visualization in Python. The programme is hosted on the AI Makerspace online platform. DataCamp is used as a resource to support the learning required to complete the programme. 
 7 | 
 8 | The total estimated learning time is at least 140 hours, spanning at least 35 lessons and 9 quizzes that cover content from basic Python programming to Machine Learning toolkits.
 9 | 
10 | For more information: https://www.aisingapore.org/talentdevelopment/ai4i/
11 | 


--------------------------------------------------------------------------------