├── README.md
└── cheetsheet.md


/README.md:
--------------------------------------------------------------------------------
# DS-ML-Interview-Questions

---
Last update: 11/16/2018

This is a personal summary of common DS/ML interview questions and will be updated regularly. You are welcome to open a pull request or discuss the answers with me :)

To be a good DS or MLE, you have to answer most of the questions below quickly and precisely :)

Here are some books and MOOCs that may be useful for you.

MOOCs:

- Stanford CS231n
- Stanford CS224n
- Coursera Deep Learning Specialization
- Berkeley CS 188: Introduction to Artificial Intelligence
- Deep Reinforcement Learning - UC Berkeley
- Oxford Deep NLP 2017 course
- Yida Xu's Machine Learning Course
- IFT6266, Deep Learning, graduate class at U. Montreal

Books:

- Pattern Recognition and Machine Learning
- The Elements of Statistical Learning
- Machine Learning: A Probabilistic Perspective
- Deep Learning Book
- Reinforcement Learning: An Introduction
- Artificial Intelligence: A Modern Approach

### Behavior and Background
1. Introduce yourself.
2. Tell me about your past internship or some other projects.

## Machine Learning Concept
### Linear Regression and Logistic Regression
1. What is linear regression? What is logistic regression? What is the difference?
2. How do you find the best parameters for linear regression? / How do you optimize the model?
3. Please write down the closed-form solution for linear regression.
4. What is Stochastic Gradient Descent or Mini-batch Gradient Descent? What are the advantages and disadvantages?
5. What is mean squared error? What is cross entropy? What is the difference? Also, please write down the formulas of these two cost functions.
6. What is Softmax? What is the relationship between Softmax and Logistic Regression?
7. Explain and derive SGD for Logistic Regression and Softmax.
8. Can SGD reach the global optimum? Why?
9. What is the Sigmoid function? What are its characteristics? What are its advantages and disadvantages?

### Regularization, Overfitting and Model/Feature Selection/Evaluation
1. What are L1 and L2? What is the difference between them?
2. From the Bayesian point of view, what is the difference between L1 and L2?
3. What is overfitting?
4. How can you tell whether your model is overfitting?
5. What is the Bias-Variance Trade-off?
6. How do you prevent overfitting?
7. What is cross validation?
8. Suppose you are training a classifier for cat pictures taken with cell phones. You have 10k cat pictures taken by cell phone users. How would you split them into training/validation/test sets? Now you get 100k cat pictures from the Internet. Which set would you put these 100k pictures in?
9. Among the training/validation/test sets, which two most need to share the same data distribution?
10. What is data augmentation? Do you know any techniques to augment data?
11. What is the F1 score? What are recall and precision?
12. What is AUC?
13. How would you handle the data imbalance problem?

### Decision Tree
1. What is a Decision Tree?
2. What is Information Gain?
3. What is the Gini Index?
4. What are the advantages and disadvantages of ID3?
5. What is Random Forest?
6. What is Bagging?

### Boosting and Ensemble
1. What is AdaBoost, and what is the relation between AdaBoost and the exponential loss function?
2. What is Gradient Boosting?
3. What is the idea of Bagging and Stacking?
4. Do you know XGBoost? What is the idea of XGBoost?

### Naive Bayes
1. Write down the Naive Bayes equation.
2. Given an example, calculate the designated probability using the Bayes equation.

### Unsupervised Learning
1. What is clustering?
2. What is K-means? Implement K-means in Python and NumPy. What is the relationship between K-means and EM?
3. What are the pros and cons of K-means?
4. What is the complexity of K-means?
5. What are PCA and SVD? Given an SVD function, please implement PCA.
6. Do you know t-SNE? Explain it to me in simple terms.

### Graph Models
1. What is a Hidden Markov Model?
2. What assumptions does an HMM make?
3. There are three matrices in an HMM. What are they?
4. What are the three problems of HMM?
5. What is the Viterbi Algorithm? What is its complexity?
6. How do you optimize the parameters of an HMM?
7. In what situations would you like to use an HMM?
8. (Bonus) What are MEMM and CRF?

### Support Vector Machine
1. What is a support vector?
2. What is the idea of SVM?
3. Explain the idea of the kernel method.
4. What is the slack variable in SVM?
5. What is the loss function of SVM?

### EM
1. What is the idea of the EM algorithm?
2. In what cases would we like to use EM?

### Reinforcement Learning
1. What is the Markov Decision Process?
2. What is the Bellman Equation?
3. What is the Q-function?
4. What is the difference between Policy Gradient and Q-learning?

### Deep Learning
1. What is the relationship between Logistic Regression and Feedforward Neural Networks?
2. What are Sigmoid, Tanh, and ReLU? What are their pros and cons?
3. What are RNN, LSTM, and GRU? What are their pros and cons?
4. What are gradient explosion and gradient vanishing?
5. What is a CNN? Explain how a CNN works and the idea behind it.
6. What is the difference between the Momentum and Adam optimizers?
7. What are the pros and cons of TensorFlow and PyTorch?
8. What is the computational graph in deep learning frameworks?
9. Why can GPUs accelerate the computation of deep learning models?
10. Why are deep learning models powerful now?
11. What are Batch Normalization and Layer Normalization?
12. What is Dropout?
13. In what cases would you like to use transfer learning? How would you fine-tune the model?
14. Do you know any techniques to initialize deep learning models?
15. Why is zero initialization a pitfall for deep learning?
16. Implement a single-hidden-layer FNN in Python and NumPy.

### NLP and DL
1. What is TF-IDF?
2. What is word embedding and its idea? What is the difference between sampled softmax and negative sampling? (Bonus) Do you know the relationship between negative sampling and Pointwise Mutual Information?
3. What are unigrams and bigrams?
4. What is the attention mechanism?
5. (Bonus) Explain the LDA topic model.

### Computer Vision and DL
1. What is the difference between a simple CNN, Inception Net, and ResNet?
2. What are the common techniques to preprocess image data?
3. What are the kernel and pooling in a CNN?
4. What is a GAN? Describe its structure and the way we train it.

### Miscellaneous
1. What is the idea of MapReduce?


## TODO: Statistics and Probability

## Case Study
You are given a specific case and you need to give a solution for the problem.

## Coding
### SQL
1. select, group by, left, right, inner, outer join...

### Python coding and concepts
1. What is the key difference between Tuple and List?
2. What is a list comprehension?
3. Use ML packages to complete certain tasks.
4. Some medium-level (refer to Leetcode) coding questions.
5. How do you handle exceptions in Python?
6. Implement K-means, KNN, Linear Regression, Logistic Regression, and a Simple Neural Network.
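
As one possible way to approach the "Implement K-means" question above, here is a minimal sketch in Python and NumPy. It is not a reference answer: the function name `kmeans`, its parameters (`n_iters`, `seed`), the random-point initialization, and the choice to leave an empty cluster's centroid unchanged are all assumptions made for illustration.

```python
import numpy as np


def kmeans(X, k, n_iters=100, seed=0):
    """Sketch of Lloyd's algorithm; returns (centroids, labels).

    X is an (n_samples, n_features) array. All names and defaults here
    are illustrative assumptions, not part of the original question list.
    """
    rng = np.random.default_rng(seed)
    # Initialize centroids with k distinct data points chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iters):
        # Assignment step: squared Euclidean distance from every point
        # to every centroid, computed with broadcasting -> shape (n, k).
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its members.
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old centroid
                new_centroids[j] = members.mean(axis=0)
        # Stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels
```

Each iteration costs roughly O(n · k · d) for n points in d dimensions, which is a useful starting point for the complexity question in the Unsupervised Learning section. A fuller answer would probably also discuss initialization schemes and multiple restarts, which this sketch leaves out.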

### Pandas
1. Use Pandas to manipulate data.

--------------------------------------------------------------------------------
/cheetsheet.md:
--------------------------------------------------------------------------------
# DS-ML-Interview-Questions

---
Last update: 11/16/2018

To be a good DS or MLE, you have to answer most of the questions below quickly and precisely.

### Behavior and Background
1. Introduce yourself.
2. Tell me about your past internship or some other projects.

## Machine Learning Concept
### Linear Regression and Logistic Regression
1. What is linear regression? What is logistic regression? What is the difference?
2. How do you find the best parameters for linear regression? / How do you optimize the model?
3. Please write down the closed-form solution for linear regression.
4. What is Stochastic Gradient Descent or Mini-batch Gradient Descent? What are the advantages and disadvantages?
5. What is mean squared error? What is cross entropy? What is the difference? Also, please write down the formulas of these two cost functions.
6. What is Softmax? What is the relationship between Softmax and Logistic Regression?
7. Explain and derive SGD for Logistic Regression and Softmax.
8. Can SGD reach the global optimum? Why?
9. What is the Sigmoid function? What are its characteristics? What are its advantages and disadvantages?

### Regularization, Overfitting and Model/Feature Selection/Evaluation
1. What are L1 and L2? What is the difference between them?
2. From the Bayesian point of view, what is the difference between L1 and L2?
3. What is overfitting?
4. How can you tell whether your model is overfitting?
5. What is the Bias-Variance Trade-off?
6. How do you prevent overfitting?
7. What is cross validation?
8. Suppose you are training a classifier for cat pictures taken with cell phones. You have 10k cat pictures taken by cell phone users. How would you split them into training/validation/test sets? Now you get 100k cat pictures from the Internet. Which set would you put these 100k pictures in?
9. Among the training/validation/test sets, which two most need to share the same data distribution?
10. What is data augmentation? Do you know any techniques to augment data?
11. What is the F1 score? What are recall and precision?
12. What is AUC?
13. How would you handle the data imbalance problem?

### Decision Tree
1. What is a Decision Tree?
2. What is Information Gain?
3. What is the Gini Index?
4. What are the advantages and disadvantages of ID3?
5. What is Random Forest?
6. What is Bagging?

### Boosting and Ensemble
1. What is AdaBoost, and what is the relation between AdaBoost and the exponential loss function?
2. What is Gradient Boosting?
3. What is the idea of Bagging and Stacking?
4. Do you know XGBoost? What is the idea of XGBoost?

### Naive Bayes
1. Write down the Naive Bayes equation.
2. Given an example, calculate the designated probability using the Bayes equation.

### Unsupervised Learning
1. What is clustering?
2. What is K-means? Implement K-means in Python and NumPy. What is the relationship between K-means and EM?
3. What are the pros and cons of K-means?
4. What is the complexity of K-means?
5. What are PCA and SVD? Given an SVD function, please implement PCA.
6. Do you know t-SNE? Explain it to me in simple terms.

### Graph Models
1. What is a Hidden Markov Model?
2. What assumptions does an HMM make?
3. There are three matrices in an HMM. What are they?
4. What are the three problems of HMM?
5. What is the Viterbi Algorithm? What is its complexity?
6. How do you optimize the parameters of an HMM?
7. In what situations would you like to use an HMM?
8. (Bonus) What are MEMM and CRF?

### Support Vector Machine
1. What is a support vector?
2. What is the idea of SVM?
3. Explain the idea of the kernel method.
4. What is the slack variable in SVM?
5. What is the loss function of SVM?

### EM
1. What is the idea of the EM algorithm?
2. In what cases would we like to use EM?

### Reinforcement Learning
1. What is the Markov Decision Process?
2. What is the Bellman Equation?
3. What is the Q-function?
4. What is the difference between Policy Gradient and Q-learning?

### Deep Learning
1. What is the relationship between Logistic Regression and Feedforward Neural Networks?
2. What are Sigmoid, Tanh, and ReLU? What are their pros and cons?
3. What are RNN, LSTM, and GRU? What are their pros and cons?
4. What are gradient explosion and gradient vanishing?
5. What is a CNN? Explain how a CNN works and the idea behind it.
6. What is the difference between the Momentum and Adam optimizers?
7. What are the pros and cons of TensorFlow and PyTorch?
8. What is the computational graph in deep learning frameworks?
9. Why can GPUs accelerate the computation of deep learning models?
10. Why are deep learning models powerful now?
11. What are Batch Normalization and Layer Normalization?
12. What is Dropout?
13. In what cases would you like to use transfer learning? How would you fine-tune the model?
14. Do you know any techniques to initialize deep learning models?
15. Why is zero initialization a pitfall for deep learning?
16. Implement a single-hidden-layer FNN in Python and NumPy.

### NLP and DL
1. What is TF-IDF?
2. What is word embedding and its idea? What is the difference between sampled softmax and negative sampling? (Bonus) Do you know the relationship between negative sampling and Pointwise Mutual Information?
3. What are unigrams and bigrams?
4. What is the attention mechanism?

### Computer Vision and DL
1. What is the difference between a simple CNN, Inception Net, and ResNet?
2. What are the common techniques to preprocess image data?
3. What are the kernel and pooling in a CNN?
4. What is a GAN? Describe its structure and the way we train it.

## Case Study
You are given a specific case and you need to give a solution for the problem.

## Coding
### SQL
1. select, group by, left, right, inner, outer join...

### Python coding and concepts
1. What is the difference between Tuple and List?
2. What is a list comprehension?
3. Use ML packages to complete certain tasks.
4. Some medium-level (refer to Leetcode) coding questions.

### Pandas
1. Use Pandas to manipulate data.

--------------------------------------------------------------------------------