├── Hands-On Computer Vision with Tensorflow 2
│   └── Chapter 1: Computer Vision and Neural Networks.md
├── Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow
│   ├── Chapter 7: Ensemble Learning and Random Forests.md
│   ├── Chapter 8: Dimensionality Reduction.md
│   └── Readme.md
└── README.md

/Hands-On Computer Vision with Tensorflow 2/Chapter 1: Computer Vision and Neural Networks.md:
--------------------------------------------------------------------------------

## Questions ##

[1. Which of the following tasks does not belong to computer vision?]()
- [ ] A web search for images similar to a query
- [ ] A 3D scene reconstruction from image sequences
- [ ] Animation of a video character

[2. Which activation function were the original perceptrons using?]()

[3. Suppose we want to train a method to detect whether a handwritten digit is a 4 or not. How should we adapt the network that we implemented in this chapter for this task?]()

## Questions & Answers ##

1. Which of the following tasks does not belong to computer vision?
- [ ] A web search for images similar to a query
- [ ] A 3D scene reconstruction from image sequences
- [x] Animation of a video character

2. Which activation function were the original perceptrons using?

Answer:

The step function.

3. Suppose we want to train a method to detect whether a handwritten digit is a 4 or not. How should we adapt the network that we implemented in this chapter for this task?

Answer:

We combine all the other digits into one class and the digit 4 into a second class, and then train the classifier on these two classes only.
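Below is a minimal sketch of this adaptation, assuming the chapter's fully connected MNIST network is rebuilt with tf.keras; the layer sizes and training settings are illustrative assumptions, not the book's exact code.

```python
# Hypothetical sketch: relabel MNIST as "4 vs. not 4" and train a small dense network.
import numpy as np
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Two classes only: 1 if the digit is a 4, 0 otherwise.
y_train_4 = (y_train == 4).astype(np.int32)
y_test_4 = (y_test == 4).astype(np.int32)

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),  # output layer shrinks from 10 classes to 2
])
model.compile(optimizer="sgd", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train_4, epochs=5, validation_data=(x_test, y_test_4))
```

Since only about one image in ten is a 4, the classes are imbalanced, so accuracy alone can look deceptively high; precision and recall are worth checking as well.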
--------------------------------------------------------------------------------
/Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow/Chapter 7: Ensemble Learning and Random Forests.md:
--------------------------------------------------------------------------------

# Chapter 7: Ensemble Learning and Random Forests #

## Questions ##

[1. If you have trained five different models on the exact same training data, and they all achieve 95% precision, is there any chance that you can combine these models to get better results? If so, how? If not, why?](https://github.com/youssefHosni/AI-Books-Assignments-Answers/blob/main/Hands-On%20Machine%20Learning%20with%20Scikit-Learn,%20Keras,%20and%20TensorFlow/Chapter%207:%20Ensemble%20Learning%20and%20Random%20Forests.md#:~:text=1.%20If%20you%20have%20trained%20five%20different%20models%20on%20the%20exact%20same%20training%20data%2C%20and%20they%20all%20achieve%2095%25%20precision%2C%20is%20there%20any%20chance%20that%20you%20can%20combine%20these%20models%20to%20get%20better%20results%3F%20If%20so%2C%20how%3F%20If%20not%2C%20why%3F)

[2. What is the difference between hard and soft voting classifiers?](https://github.com/youssefHosni/AI-Books-Assignments-Answers/blob/main/Hands-On%20Machine%20Learning%20with%20Scikit-Learn,%20Keras,%20and%20TensorFlow/Chapter%207:%20Ensemble%20Learning%20and%20Random%20Forests.md#:~:text=2.%20What%20is%20the%20difference%20between%20hard%20and%20soft%20voting%20classifiers%3F)

[3. Is it possible to speed up training of a bagging ensemble by distributing it across multiple servers? What about pasting ensembles, boosting ensembles, random forests, or stacking ensembles?](https://github.com/youssefHosni/AI-Books-Assignments-Answers/blob/main/Hands-On%20Machine%20Learning%20with%20Scikit-Learn,%20Keras,%20and%20TensorFlow/Chapter%207:%20Ensemble%20Learning%20and%20Random%20Forests.md#:~:text=3.%20Is%20it%20possible%20to%20speed%20up%20training%20of%20a%20bagging%20ensemble%20by%20distributing%20it%20across%20multiple%20servers%3F%20What%20about%20pasting%20ensembles%2C%20boosting%20ensembles%2C%20random%20forests%2C%20or%20stacking%20ensembles%3F)

[4. What is the benefit of out-of-bag evaluation?](https://github.com/youssefHosni/AI-Books-Assignments-Answers/blob/main/Hands-On%20Machine%20Learning%20with%20Scikit-Learn,%20Keras,%20and%20TensorFlow/Chapter%207:%20Ensemble%20Learning%20and%20Random%20Forests.md#:~:text=4.%20What%20is%20the%20benefit%20of%20out%2Dof%2Dbag%20evaluation%3F)

[5. What makes Extra-Trees more random than regular Random Forests? How can this extra randomness help? Are Extra-Trees slower or faster than regular Random Forests?](https://github.com/youssefHosni/AI-Books-Assignments-Answers/blob/main/Hands-On%20Machine%20Learning%20with%20Scikit-Learn,%20Keras,%20and%20TensorFlow/Chapter%207:%20Ensemble%20Learning%20and%20Random%20Forests.md#:~:text=5.%20What%20makes%20Extra%2DTrees%20more%20random%20than%20regular%20Random%20Forests%3F%20How%20can%20this%20extra%20randomness%20help%3F%20Are%20Extra%2DTrees%20slower%20or%20faster%20than%20regular%20Random%20Forests%3F)

[6. If your AdaBoost ensemble underfits the training data, what hyperparameters should you tweak and how?]()

[7. If your Gradient Boosting ensemble overfits the training set, should you increase or decrease the learning rate?]()

[8. Load the MNIST data (introduced in Chapter 3), and split it into a training set, a validation set, and a test set (e.g., use 50,000 instances for training, 10,000 for validation, and 10,000 for testing). Then train various classifiers, such as a Random Forest classifier, an Extra-Trees classifier, and an SVM. Next, try to combine them into an ensemble that outperforms them all on the validation set, using a soft or hard voting classifier. Once you have found one, try it on the test set. How much better does it perform compared to the individual classifiers?]()

[9. Run the individual classifiers from the previous exercise to make predictions on the validation set, and create a new training set with the resulting predictions: each training instance is a vector containing the set of predictions from all your classifiers for an image, and the target is the image’s class. Train a classifier on this new training set. Congratulations, you have just trained a blender, and together with the classifiers they form a stacking ensemble! Now let’s evaluate the ensemble on the test set. For each image in the test set, make predictions with all your classifiers, then feed the predictions to the blender to get the ensemble’s predictions. How does it compare to the voting classifier you trained earlier?]()

----------------------------------------------------------------------------------------------------------------------------------------------------------------

## Questions & Answers ##

**1. If you have trained five different models on the exact same training data, and they all achieve 95% precision, is there any chance that you can combine these models to get better results? If so, how? If not, why?**

Answer:

Yes. You can combine them into an ensemble: keep each trained model and use hard voting (or soft voting, if the models can estimate class probabilities) to obtain the final prediction. You can also retrain them on different subsets of the training data using bagging or pasting, which makes the individual models more diverse and usually strengthens the ensemble.

**2. What is the difference between hard and soft voting classifiers?**

Answer:

In hard voting, the class that gets the most votes from the individual classifiers is selected. In soft voting, the class probabilities are averaged across all the classifiers in the ensemble, and the class with the highest average probability is selected; this requires every classifier to be able to estimate class probabilities.
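As a concrete illustration of the two voting modes, here is a small scikit-learn sketch; the choice of base estimators and the toy dataset are assumptions for illustration, not the book's exact setup.

```python
# Illustrative sketch: hard vs. soft voting with scikit-learn's VotingClassifier.
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

estimators = [
    ("lr", LogisticRegression(random_state=42)),
    ("rf", RandomForestClassifier(random_state=42)),
    ("svc", SVC(probability=True, random_state=42)),  # probability=True is needed for soft voting
]

hard_clf = VotingClassifier(estimators, voting="hard")  # majority vote over predicted classes
soft_clf = VotingClassifier(estimators, voting="soft")  # argmax of the averaged class probabilities

for clf in (hard_clf, soft_clf):
    clf.fit(X_train, y_train)
    print(clf.voting, "voting accuracy:", clf.score(X_test, y_test))
```

Soft voting often scores slightly higher than hard voting because it gives more weight to confident predictions.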
**3. Is it possible to speed up training of a bagging ensemble by distributing it across multiple servers? What about pasting ensembles, boosting ensembles, random forests, or stacking ensembles?**

Answer:

Yes, it is possible for bagging: each predictor is trained independently on its own bootstrap sample of the data, so the predictors can be trained in parallel on different servers. The same applies to pasting ensembles and to Random Forests, since their predictors are also independent of one another. Boosting, however, cannot easily be distributed, because every predictor is built to correct the errors of the previous one, so training is inherently sequential. In a stacking ensemble, the predictors within a given layer are independent of each other and can be trained in parallel, but each layer can only be trained after the previous layer has finished.

**4. What is the benefit of out-of-bag evaluation?**

Answer:

Out-of-bag evaluation lets you evaluate a bagging ensemble without setting aside a separate validation set: each predictor is trained on a bootstrap sample, so it can be evaluated on the training instances it never saw. This avoids losing data to a hold-out set, leaves more data for training, and still provides a fairly unbiased evaluation of the ensemble.
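The following sketch ties these two answers together, assuming scikit-learn's BaggingClassifier; `n_jobs=-1` trains the independent trees in parallel across CPU cores (the same independence is what allows distribution across servers), and `oob_score=True` switches on out-of-bag evaluation. The dataset and hyperparameters are illustrative assumptions.

```python
# Illustrative sketch: parallel bagging with out-of-bag evaluation.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=42)

bag_clf = BaggingClassifier(
    DecisionTreeClassifier(),
    n_estimators=200,
    bootstrap=True,   # sample with replacement (bagging); bootstrap=False would give pasting
    oob_score=True,   # evaluate each tree on the instances it did not see during training
    n_jobs=-1,        # the trees are independent, so they can be trained in parallel
    random_state=42,
)
bag_clf.fit(X, y)
print("Out-of-bag accuracy estimate:", bag_clf.oob_score_)
```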
**5. What makes Extra-Trees more random than regular Random Forests? How can this extra randomness help? Are Extra-Trees slower or faster than regular Random Forests?**

Answer:

Not only is the training data randomly sampled, but the thresholds used to split the data at each node are also chosen randomly instead of being searched for. This makes Extra-Trees much faster to train than regular Random Forests, because finding the best threshold for each feature at each node is where most of the training time goes. The extra randomness also trades a little more bias for lower variance, which acts as a form of regularization.

**6. If your AdaBoost ensemble underfits the training data, what hyperparameters should you tweak and how?**

Answer:

**7. If your Gradient Boosting ensemble overfits the training set, should you increase or decrease the learning rate?**

Answer:

**8. Load the MNIST data (introduced in Chapter 3), and split it into a training set, a validation set, and a test set (e.g., use 50,000 instances for training, 10,000 for validation, and 10,000 for testing). Then train various classifiers, such as a Random Forest classifier, an Extra-Trees classifier, and an SVM. Next, try to combine them into an ensemble that outperforms them all on the validation set, using a soft or hard voting classifier. Once you have found one, try it on the test set. How much better does it perform compared to the individual classifiers?**

Answer:

**9. Run the individual classifiers from the previous exercise to make predictions on the validation set, and create a new training set with the resulting predictions: each training instance is a vector containing the set of predictions from all your classifiers for an image, and the target is the image’s class. Train a classifier on this new training set. Congratulations, you have just trained a blender, and together with the classifiers they form a stacking ensemble! Now let’s evaluate the ensemble on the test set. For each image in the test set, make predictions with all your classifiers, then feed the predictions to the blender to get the ensemble’s predictions. How does it compare to the voting classifier you trained earlier?**

Answer:

--------------------------------------------------------------------------------
/Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow/Chapter 8: Dimensionality Reduction.md:
--------------------------------------------------------------------------------

### Questions ###

[1. What are the main motivations for reducing a dataset’s dimensionality? What are the main drawbacks?]()

[2. What is the curse of dimensionality?]()

[3. Once a dataset’s dimensionality has been reduced, is it possible to reverse the operation? If so, how? If not, why?]()

[4. Can PCA be used to reduce the dimensionality of a highly nonlinear dataset?]()

[5. Suppose you perform PCA on a 1,000-dimensional dataset, setting the explained variance ratio to 95%. How many dimensions will the resulting dataset have?]()

[6. In what cases would you use vanilla PCA, Incremental PCA, Randomized PCA, or Kernel PCA?]()

[7. How can you evaluate the performance of a dimensionality reduction algorithm on your dataset?]()

[8. Does it make any sense to chain two different dimensionality reduction algorithms?]()

[9. Load the MNIST dataset (introduced in Chapter 3) and split it into a training set and a test set (take the first 60,000 instances for training, and the remaining 10,000 for testing). Train a Random Forest classifier on the dataset and time how long it takes, then evaluate the resulting model on the test set. Next, use PCA to reduce the dataset’s dimensionality, with an explained variance ratio of 95%. Train a new Random Forest classifier on the reduced dataset and see how long it takes. Was training much faster? Next evaluate the classifier on the test set: how does it compare to the previous classifier?]()

[10. Use t-SNE to reduce the MNIST dataset down to two dimensions and plot the result using Matplotlib. You can use a scatterplot using 10 different colors to represent each image’s target class. Alternatively, you can write colored digits at the location of each instance, or even plot scaled-down versions of the digit images themselves (if you plot all digits, the visualization will be too cluttered, so you should either draw a random sample or plot an instance only if no other instance has already been plotted at a close distance). You should get a nice visualization with well-separated clusters of digits. Try using other dimensionality reduction algorithms such as PCA, LLE, or MDS and compare the resulting visualizations.]()

-----------------------------------------------------------------------------------------------------------------------------------------------------------------

### Answers ###

1. What are the main motivations for reducing a dataset’s dimensionality? What are the main drawbacks?

Answer:
* Increase the training speed / decrease the time complexity.
* Remove unneeded information that may hurt training and model performance (the problem known as the curse of dimensionality).
* Make it possible to visualize the data.
* Save storage/memory space.

**Some of the drawbacks:**

* It adds complexity to the machine learning pipeline.
* Depending on the size of the training dataset, it can take a lot of time and computational power.
* It is always harder to interpret results expressed in terms of transformed features.

2. What is the curse of dimensionality?

Answer:

The curse of dimensionality arises when the data has a large number of dimensions relative to the number of instances: the distances between instances become large and the data becomes sparse. This makes it difficult for a model trained on such data to capture the underlying patterns, and it increases the risk of overfitting the training set. A new instance will probably lie far away from all the training instances, so the model will struggle to predict its value or classify it correctly.

3. Once a dataset’s dimensionality has been reduced, is it possible to reverse the operation? If so, how? If not, why?

Answer:

It depends on the algorithm used. If PCA was applied, for example, you can reverse the transformation and obtain a dataset that is close to the original. However, the reconstruction cannot be perfect, because some information is lost during the transformation: only the variance covered by the retained components is preserved, and the rest is gone.
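For instance, here is a brief sketch of this reversal with scikit-learn's PCA; the dataset and the 95% variance target are illustrative assumptions.

```python
# Illustrative sketch: reducing dimensionality with PCA and (approximately) reversing it.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)    # 64-dimensional images of digits

pca = PCA(n_components=0.95)           # keep enough components to explain 95% of the variance
X_reduced = pca.fit_transform(X)
X_recovered = pca.inverse_transform(X_reduced)

# The recovered data is close to, but not exactly, the original data.
print(X.shape, "->", X_reduced.shape)
print("Mean squared reconstruction error:", np.mean(np.square(X - X_recovered)))
```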
4. Can PCA be used to reduce the dimensionality of a highly nonlinear dataset?

Answer:

Yes, PCA can be used to reduce the dimensionality of a highly nonlinear dataset, since it can at least get rid of useless dimensions. However, for a dataset like the Swiss roll, where there are no useless dimensions, it will lose a lot of information: the goal in that case is to unroll the manifold, not to squash it.

5. Suppose you perform PCA on a 1,000-dimensional dataset, setting the explained variance ratio to 95%. How many dimensions will the resulting dataset have?

Answer:

It depends on the dataset itself: if the instances lie almost perfectly along a line, a single dimension may preserve 95% of the variance, whereas for a dataset whose variance is spread almost evenly across dimensions, close to 950 dimensions may be needed.

6. In what cases would you use vanilla PCA, Incremental PCA, Randomized PCA, or Kernel PCA?

Answer:
* Vanilla PCA: the default choice when the dataset fits in memory and is not strongly nonlinear.
* Incremental PCA: a good choice when the dataset is too large to fit in memory, since it can be fed in mini-batches.
* Randomized PCA: used to cut computation time, since it finds an approximation of the first d principal components.
* Kernel PCA: useful for nonlinear datasets.
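A compact sketch of these variants in scikit-learn follows; the stand-in dataset, component counts, and kernel parameters are assumptions for illustration.

```python
# Illustrative sketch: the main PCA variants in scikit-learn.
import numpy as np
from sklearn.decomposition import PCA, IncrementalPCA, KernelPCA

X = np.random.RandomState(42).rand(10000, 200)  # stand-in high-dimensional dataset

pca = PCA(n_components=0.95)                    # vanilla PCA: keep 95% of the variance
X_pca = pca.fit_transform(X)
print("Components kept by vanilla PCA:", pca.n_components_)

rnd_pca = PCA(n_components=20, svd_solver="randomized", random_state=42)  # randomized PCA
X_rnd = rnd_pca.fit_transform(X)

inc_pca = IncrementalPCA(n_components=20)       # incremental PCA: fed mini-batches
for batch in np.array_split(X, 10):
    inc_pca.partial_fit(batch)
X_inc = inc_pca.transform(X)

kpca = KernelPCA(n_components=20, kernel="rbf", gamma=0.04)  # kernel PCA for nonlinear data
X_kpca = kpca.fit_transform(X)
```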
7. How can you evaluate the performance of a dimensionality reduction algorithm on your dataset?

Answer:

In general, a dimensionality reduction algorithm works well if it reduces the number of dimensions without losing too much important information. This can be measured in two ways, depending on the algorithm. If the algorithm supports an inverse transformation, you can measure the reconstruction error between the recovered dataset and the original one. If it does not, you can feed the reduced dataset to a predictive model and compare the model’s performance before and after the dimensionality reduction.

8. Does it make any sense to chain two different dimensionality reduction algorithms?

Answer:

Yes. For example, you can first use PCA to quickly get rid of a large number of useless dimensions, and then apply a slower algorithm such as LLE to the result. This yields a result similar to using LLE alone, but in a fraction of the time.
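A short sketch of such a chain using a scikit-learn Pipeline; the stand-in dataset and the component counts are illustrative assumptions.

```python
# Illustrative sketch: chaining PCA and LLE in a single pipeline.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.pipeline import Pipeline

X = np.random.RandomState(42).rand(2000, 100)  # stand-in high-dimensional dataset

pca_lle = Pipeline([
    ("pca", PCA(n_components=0.95)),           # cheap first pass removes most dimensions
    ("lle", LocallyLinearEmbedding(n_components=2, n_neighbors=10)),  # slower nonlinear step
])
X_2d = pca_lle.fit_transform(X)
print(X.shape, "->", X_2d.shape)
```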
9. Load the MNIST dataset (introduced in Chapter 3) and split it into a training set and a test set (take the first 60,000 instances for training, and the remaining 10,000 for testing). Train a Random Forest classifier on the dataset and time how long it takes, then evaluate the resulting model on the test set. Next, use PCA to reduce the dataset’s dimensionality, with an explained variance ratio of 95%. Train a new Random Forest classifier on the reduced dataset and see how long it takes. Was training much faster? Next evaluate the classifier on the test set: how does it compare to the previous classifier?

Answer:

10. Use t-SNE to reduce the MNIST dataset down to two dimensions and plot the result using Matplotlib. You can use a scatterplot using 10 different colors to represent each image’s target class. Alternatively, you can write colored digits at the location of each instance, or even plot scaled-down versions of the digit images themselves (if you plot all digits, the visualization will be too cluttered, so you should either draw a random sample or plot an instance only if no other instance has already been plotted at a close distance). You should get a nice visualization with well-separated clusters of digits. Try using other dimensionality reduction algorithms such as PCA, LLE, or MDS and compare the resulting visualizations.

Answer:

--------------------------------------------------------------------------------
/Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow/Readme.md:
--------------------------------------------------------------------------------

## Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow ##

* [Chapter 7: Ensemble Learning and Random Forests](https://github.com/youssefHosni/AI-Books-Assignments-Answers/blob/main/Hands-On%20Machine%20Learning%20with%20Scikit-Learn%2C%20Keras%2C%20and%20TensorFlow/Chapter%207:%20Ensemble%20Learning%20and%20Random%20Forests.md)
* [Chapter 8: Dimensionality Reduction](https://github.com/youssefHosni/AI-Books-Assignments-Answers/blob/main/Hands-On%20Machine%20Learning%20with%20Scikit-Learn%2C%20Keras%2C%20and%20TensorFlow/Chapter%208:%20Dimensionality%20Reduction.md)

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

## AI Books Assignments & Answers ##

In this repo I solve the assignments and exercises of the AI books that I am reading and upload their answers here.

[1. Hands-On Computer Vision with TensorFlow 2](https://github.com/youssefHosni/AI-Books-Assignments-Answers/tree/main/Hands-On%20Computer%20Vision%20with%20Tensorflow%202)

[2. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow](https://github.com/youssefHosni/AI-Books-Assignments-Answers/tree/main/Hands-On%20Machine%20Learning%20with%20Scikit-Learn%2C%20Keras%2C%20and%20TensorFlow)

--------------------------------------------------------------------------------