├── .github
│   ├── FUNDING.yml
│   └── desktop.ini
├── LICENSE
├── README.md
├── bernoulli_distribution_tutorial
│   ├── bernoulli_distribution.py
│   └── desktop.ini
├── convolutional-neural-networks-python
│   ├── cnns.py
│   ├── desktop.ini
│   └── masks.py
├── decision_tree_learning
│   ├── Iris.csv
│   ├── basic_decision_tree.py
│   ├── decision_tree_classificaiton.py
│   └── desktop.ini
├── deep-learning
│   ├── deep_learning_basics.ipynb
│   ├── desktop.ini
│   └── neuron.py
├── descriptive-statistics
│   ├── descriptive-statistics-pdf-book-sample.pdf
│   ├── descriptive_statistics.ipynb
│   ├── descriptive_statistics.py
│   └── desktop.ini
├── desktop.ini
├── genetic-algorithm-tutorial
│   ├── desktop.ini
│   ├── genetic_algorithm_python_tutorial.ipynb
│   └── implementation.py
├── google_colab_tutorial
│   ├── check_gpu.py
│   ├── collab_magic.py
│   ├── desktop.ini
│   └── kaggle_data_download.py
├── gradient_descent_tutorial
│   ├── data.txt
│   ├── desktop.ini
│   └── gradient_descent_tutorial.ipynb
├── k-nearest-neighbors
│   ├── desktop.ini
│   └── k_nearest_neighbor_knn_tutorial.py
├── linear-algebra-for-ml-and-deep-learning
│   ├── desktop.ini
│   ├── house_price.csv
│   ├── linear_regression.py
│   └── pca_with_python.py
├── logic
│   ├── desktop.ini
│   └── seven_planets_riddle.py
├── machine_learning_algorithms_for_beginners
│   ├── desktop.ini
│   ├── exponential_regression.py
│   ├── linear_regression_example.py
│   ├── logarithmic_regression.py
│   ├── machine_learning_algorithms_for_beginners.ipynb
│   ├── ml_algorithms_1.py
│   ├── multivariable_linear_regression.py
│   ├── polynomial_regression.py
│   └── sinusoidal_regression.py
├── moment_generating_function
│   ├── desktop.ini
│   └── moment_generating_function.py
├── monte_carlo_simulation
│   ├── desktop.ini
│   ├── monte_carlo_buffon's_needle_problem.py
│   ├── monte_carlo_casino_example.py
│   ├── monte_carlo_coin_flip.py
│   ├── monte_carlo_estimating_pi_using_circle_and_square.py
│   └── monte_carlo_monty_hall_problem.py
├── natural_language_processing
│   ├── Natural_Language_Processing_Text.txt
│   ├── circle.png
│   ├── desktop.ini
│   ├── natural_language_processing_code.py
│   ├── natural_language_processing_tutorial.ipynb
│   └── semantic-analysis.py
├── neural_networks_tutorial_part_1
│   ├── desktop.ini
│   ├── neural_network_part1_1.py
│   ├── neural_network_part1_2.py
│   ├── neural_network_part1_3.py
│   └── neural_networks_tutorial.ipynb
├── neural_networks_tutorial_part_2
│   ├── desktop.ini
│   ├── neural_networks_part2_1.py
│   ├── neural_networks_part2_2.py
│   ├── neural_networks_part2_3.py
│   └── neural_networks_tutorial_2.ipynb
├── pandas
│   ├── desktop.ini
│   ├── pd-melt.py
│   ├── pd_dropna().py
│   ├── pd_fillna().py
│   ├── pd_isna().py
│   ├── pd_isnull().py
│   ├── pd_join().py
│   ├── pd_notna().py
│   └── pd_notnull().py
├── poisson-distribution-process
│   ├── desktop.ini
│   ├── poisson.py
│   └── poisson_distribution_and_poisson_process_tutorial.ipynb
├── principal_component_analysis
│   ├── correlation_matrix_covariance_matrix.py
│   ├── desktop.ini
│   └── pca_with_python.py
├── programming
│   ├── desktop.ini
│   └── variable_swap_data_science.py
├── random-number-generator
│   ├── desktop.ini
│   ├── random_number_generator_tutorial.ipynb
│   └── random_number_generator_tutorial_with_python.py
├── recommendation_system_tutorial
│   ├── desktop.ini
│   ├── movie_titles.csv
│   ├── new_features.csv
│   └── recommendation_system_tutorial_netflix.py
├── sentiment_analysis_tutorial
│   ├── desktop.ini
│   ├── sentiment_analysis_tutorial.ipynb
│   └── women_clothing_review.csv
├── simple_linear_regression_tutorial
│   ├── Fuel_Consumption.csv
│   ├── desktop.ini
│   └── simple_linear_regression_from_scratch.py
├── support-vector-machine-svm
│   ├── desktop.ini
│   └── svm_machine_learning.py
├── survival_analysis_in_python
│   ├── desktop.ini
│   ├── lung.csv
│   ├── survival_analysis_1.py
│   ├── survival_analysis_2.py
│   └── survival_analysis_3.py
└── what-is-a-gpu
    ├── desktop.ini
    └── script.py
/.github/FUNDING.yml:
--------------------------------------------------------------------------------
1 | # These are supported funding model platforms
2 |
3 | github: # Replace with up to 4 GitHub Sponsors-enabled usernames e.g., [user1, user2]
4 | patreon: # Replace with a single Patreon username
5 | open_collective: # Replace with a single Open Collective username
6 | ko_fi: # Replace with a single Ko-fi username
7 | tidelift: # Replace with a single Tidelift platform-name/package-name e.g., npm/babel
8 | community_bridge: # Replace with a single Community Bridge project-name e.g., cloud-foundry
9 | liberapay: # Replace with a single Liberapay username
10 | issuehunt: # Replace with a single IssueHunt username
11 | otechie: # Replace with a single Otechie username
12 | custom: ['https://paypal.me/towardsai']
13 |
--------------------------------------------------------------------------------
/.github/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2020 Towards AI Co.
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Tutorials
2 |
3 | **Please note that only the code contained in this repository is under the MIT license found in [LICENSE](https://github.com/towardsai/tutorials/blob/master/LICENSE). All tutorials, articles, and books listed in this repository are the property of Towards AI, Inc.**
4 |
5 | Please feel free to contribute. We review pull requests on an ad-hoc basis.
6 | If you'd like to work with one of our data scientists on an editorial tutorial, please reach out via [editor@towardsai.net](mailto:editor@towardsai.net) and cc [pub@towardsai.net](mailto:pub@towardsai.net).
7 |
8 | To **contribute** directly to Towards AI, check out our [guidelines to get published](https://contribute.towardsai.net).
9 |
10 | Join our [AI community](https://community.towardsai.net).
11 |
12 | If you'd like to support Towards AI, please consider [buying one of our books](https://gumroad.com/towardsai) (listed below) or [sponsoring this open-source work](https://paypal.me/towardsai).
13 |
14 | Thank you for reading and for being a supporter of Towards AI!
15 |
16 | Access any [tutorial for free](https://towardsai.net/p/category/editorial).
17 |
18 | **[Terms](https://towardsai.net/terms) | [Privacy Policy](https://towardsai.net/privacy)**
19 |
20 | ________________________________________________________________________________
21 |
22 | [Machine Learning Algorithms For Beginners with Code Examples in Python](https://towardsai.net/p/machine-learning/machine-learning-algorithms-for-beginners-with-python-code-examples-ml-19c6afd60daa)
23 |
24 | [Neural Networks from Scratch with Python Code and Math in Detail— I](https://towardsai.net/p/machine-learning/building-neural-networks-from-scratch-with-python-code-and-math-in-detail-i-536fae5d7bbf)
25 |
26 | [Building Neural Networks with Python Code and Math in Detail — II](https://towardsai.net/p/machine-learning/building-neural-networks-with-python-code-and-math-in-detail-ii-bbe8accbf3d1)
27 |
28 | [Natural Language Processing (NLP) with Python — Tutorial](https://towardsai.net/p/nlp/natural-language-processing-nlp-with-python-tutorial-for-beginners-1f54e610a1a0)
29 |
30 | [Monte Carlo Simulation An In-depth Tutorial with Python](https://towardsai.net/p/machine-learning/monte-carlo-simulation-an-in-depth-tutorial-with-python-bcf6eb7856c8)
31 |
32 | [Survival Analysis with Python Tutorial — How, What, When, and Why](https://towardsai.net/p/machine-learning/survival-analysis-with-python-tutorial-how-what-when-and-why-19a5cfb3c312)
33 |
34 | [Moment Generating Function for Probability Distribution with Python](https://towardsai.net/p/data-science/moment-generating-function-for-probability-distribution-with-python-tutorial-34857e93d8f6)
35 |
36 | [Bernoulli Distribution — Probability Tutorial with Python](https://towardsai.net/p/statistics/bernoulli-distribution-probability-tutorial-with-python-90061ee078a)
37 |
38 | [Recommendation System Tutorial with Python using Collaborative Filtering](https://towardsai.net/p/machine-learning/recommendation-system-in-depth-tutorial-with-python-for-netflix-using-collaborative-filtering-533ff8a0e444)
39 |
40 | [Linear Algebra for Deep Learning and Machine Learning (ML) Python Tutorial](https://towardsai.net/p/machine-learning/basic-linear-algebra-for-deep-learning-and-machine-learning-ml-python-tutorial-444e23db3e9e)
41 |
42 | [Principal Component Analysis (PCA) with Python Examples — Tutorial](https://towardsai.net/p/data-science/principal-component-analysis-pca-with-python-examples-tutorial-67a917bae9aa)
43 |
44 | [Decision Trees in Machine Learning (ML) with Python Tutorial](https://towardsai.net/p/machine-learning/decision-trees-in-machine-learning-ml-with-python-tutorial-3bfb457bce67)
45 |
46 | [Convolutional Neural Networks (CNNs) Tutorial with Python](https://towardsai.net/p/deeplearning/convolutional-neural-networks-cnns-tutorial-with-python-417c29f0403f)
47 |
48 | [Sentiment Analysis (Opinion Mining) with Python - NLP Tutorial](https://towardsai.net/p/nlp/sentiment-analysis-opinion-mining-with-python-nlp-tutorial-d1f173ca4e3c)
49 |
50 | [Gradient Descent for Machine Learning (ML) 101 with Python Tutorial](https://towardsai.net/p/data-science/gradient-descent-algorithm-for-machine-learning-python-tutorial-ml-9ded189ec556)
51 |
52 | [Random Number Generator Tutorial with Python](https://towardsai.net/p/data-science/random-number-generator-tutorial-with-python-3b35986132c7)
53 |
54 | [What is Deep Learning?](https://towardsai.net/p/deep-learning/what-is-deep-learning-34767bb10366)
55 |
56 | [Genetic Algorithm (GA) Introduction with Example Code](https://towardsai.net/p/programming/genetic-algorithm-ga-introduction-with-example-code-e59f9bc58eaf)
57 |
58 | [K-Nearest Neighbors (KNN) Algorithm Tutorial — Machine Learning Basics](https://news.towardsai.net/knn)
59 |
60 | [What is a GPU? Are GPUs Needed for Deep Learning?](https://news.towardsai.net/gpu)
61 |
62 | # Books
63 |
64 | [Descriptive Statistics for Data-driven Decision Making with Python](https://gumroad.com/l/descriptive-statistics)
65 |
66 | # Sponsors
67 |
68 | A big thank you to C4H3I LLC for sponsoring us in June 2022!
69 |
--------------------------------------------------------------------------------
/bernoulli_distribution_tutorial/bernoulli_distribution.py:
--------------------------------------------------------------------------------
1 | #Import required libraries:
2 | from scipy.stats import bernoulli
3 | import matplotlib.pyplot as plt
4 |
5 | #Define probability of success:
6 | p = 0.7
7 |
8 | #Find the statistical values:
9 | mean, var, skew, kurt = bernoulli.stats(p, moments='mvsk')
10 |
11 | #Print mean:
12 | print("Mean = ",mean)
13 |
14 | #Print variance:
15 | print("Variance = ",var)
16 |
17 | #Print skewness:
18 | print("Skewness = ",skew)
19 |
20 | #Print kurtosis:
21 | print("Kurtosis = ",kurt)
22 |
23 | #Get only mean value:
24 | mean = bernoulli.mean(p)
25 | print("Mean = ",mean)
26 |
27 | #Get only median value:
28 | median = bernoulli.median(p)
29 | print("Median = ",median)
30 |
31 | #Get only variance value:
32 | var = bernoulli.var(p)
33 | print("Variance = ",var)
34 |
35 | #Get only standard deviation value:
36 | std = bernoulli.std(p)
37 | print("Standard Deviation = ",std)
38 |
39 | #Get Probability Mass Function(PMF):
40 | x = [0,1]
41 | p=0.7
42 | print("Probability Mass Function = ",bernoulli.pmf(x,p))
43 |
44 | #Plot the graph for Probability Mass Function(PMF):
45 | x = [0,1]
46 | p=0.7
47 | plt.scatter(x,bernoulli.pmf(x,p),label="PMF")
48 | plt.title("Probability Mass Function")
49 | plt.xlabel("Data Points")
50 | plt.ylabel("Probability")
51 | plt.legend()
52 |
53 | #Get Cumulative Distribution Function (CDF):
54 | x = [0,1]
55 | p = 0.7
56 | print("Cumulative Distribution Function = ",bernoulli.cdf(x,p))
57 |
58 | #Plot the Cumulative Distribution Function (CDF):
59 | x = [0,1]
60 | p = 0.7
61 | plt.scatter(x,bernoulli.cdf(x,p),label="CDF")
62 | plt.title("Cumulative Distribution Function")
63 | plt.xlabel("Data Points")
64 | plt.ylabel("Probability")
65 | plt.legend()
66 |
67 | #Plot the bar graph for PMF:
68 | x = [0,1]
69 | p = 0.7
70 | plt.bar(x,bernoulli.pmf(x,p),width=0.1,color=["r","b"])
71 | plt.title("Probability Mass Function")
72 | plt.xlabel("Data Points")
73 | plt.ylabel("Probability")
74 |
75 |
76 | #Plot the bar graph for CDF:
77 | x = [0,1]
78 | p = 0.7
79 | plt.bar(x,bernoulli.cdf(x,p),width=0.1,color=["r","b"])
80 | plt.title("Cumulative Distribution Function")
81 | plt.xlabel("Data Points")
82 | plt.ylabel("Probability")
83 |
84 | #Generate Output for Random Bernoulli Events:
85 | p = 0.7
86 | r = bernoulli.rvs(p, size=100)
87 | print(r)
88 |
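89 | # Sanity check (added note, not part of the original tutorial): for a Bernoulli(p) with p = 0.7,
90 | # the closed-form moments are mean = p = 0.7, variance = p*(1-p) = 0.21,
91 | # skewness = (1-2p)/sqrt(p*(1-p)) ≈ -0.8729, and excess kurtosis = (1-6p*(1-p))/(p*(1-p)) ≈ -1.2381,
92 | # which should match the values printed by bernoulli.stats above.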
--------------------------------------------------------------------------------
/bernoulli_distribution_tutorial/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/convolutional-neural-networks-python/cnns.py:
--------------------------------------------------------------------------------
1 | #Import required libraries
2 | import numpy as np
3 | import pandas as pd
4 | from keras.optimizers import SGD
5 | from keras.datasets import cifar10
6 | from keras.models import Sequential
7 | from keras.utils import np_utils as utils
8 | from keras.layers import Dropout, Dense, Flatten
9 | from keras.layers import Conv1D, Conv2D, MaxPooling1D, MaxPooling2D  # 1D variants are used by the illustrative snippets at the end
10 |
11 | #Load the CIFAR-10 dataset
12 | (X, y), (X_test, y_test) = cifar10.load_data()
13 |
14 | #Display the test dataset
15 | X_test
16 |
17 | #Normalize the data
18 | X, X_test = X.astype("float32") / 255.0, X_test.astype("float32") / 255.0
19 |
20 | #Convert to categorical
21 | y, y_test = utils.to_categorical(y, 10), utils.to_categorical(y_test, 10)
22 |
23 | #Initialize the model
24 | model = Sequential()
25 |
26 | #Add a convolutional layer with test parameters
27 | model.add(
28 | Conv2D(32, (3, 3), input_shape=(32, 32, 3), padding="same", activation="relu")
29 | )
30 |
31 | #Add the dropout rate
32 | model.add(Dropout(0.2))
33 |
34 | #Add another CNN layer with a valid padding value
35 | model.add(Conv2D(32, (3, 3), activation="relu", padding="valid"))
36 |
37 | #Add a max pooling layer
38 | model.add(MaxPooling2D(pool_size=(2, 2)))
39 |
40 | #Flatten the data
41 | model.add(Flatten())
42 |
43 | #Add a dense layer
44 | model.add(Dense(512, activation="relu"))
45 |
46 | #Add dropout
47 | model.add(Dropout(0.3))
48 |
49 | #Add the output dense layer
50 | model.add(Dense(10, activation="softmax"))
51 |
52 | #Compile the model
53 | model.compile(
54 | loss="categorical_crossentropy",
55 | optimizer=SGD(momentum=0.5, decay=0.0004),
56 | metrics=["accuracy"],
57 | )
58 |
59 | #Fit the algorithm with a number of epochs, 25 in this case
60 | model.fit(X, y, validation_data=(X_test, y_test), epochs=25, batch_size=512)
61 |
62 | #Check the accuracy of the model
63 | print("Accuracy: &2.f%%" %(model.evaluate(X_test, y_test)[1]*100))
64 |
65 | #Max pooling shape
66 | model.add(MaxPooling1D(pool_size=2))
67 |
68 | #Filter shape
69 | model.add(Conv1D(filters=32, kernel_size=3, padding="same", activation="relu"))
70 |
71 | #Number of filters
72 | model.add(Conv1D(filters=32, kernel_size=3, padding="same", activation="relu"))
73 |
74 | #Add dropout
75 | model.add(Dropout(0.2))
76 |
77 | # Early stopping for overfitting
78 | from keras.callbacks import EarlyStopping
79 |
80 | earlystop = EarlyStopping(
81 | monitor="val_loss", min_delta=0, patience=3, verbose=1, restore_best_weights=True
82 | )
83 |
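84 | # Added note: the EarlyStopping callback above is defined but not wired into training here.
85 | # One possible way to use it (a sketch, assuming the same model and data as above) is to pass
86 | # it to fit via the callbacks argument, e.g.
87 | # model.fit(X, y, validation_data=(X_test, y_test), epochs=25, batch_size=512, callbacks=[earlystop])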
--------------------------------------------------------------------------------
/convolutional-neural-networks-python/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/convolutional-neural-networks-python/masks.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import numpy.ma as ma
3 |
4 | original_array = np.array([1, 2, 3, -1, 5])
5 | original_array  # the raw array, including the invalid value -1
6 | 
7 | masked = ma.masked_array(original_array, mask=[0, 0, 0, 1, 0])  # mask out the -1 entry
8 | masked  # masked array displays as [1, 2, 3, --, 5]
9 |
--------------------------------------------------------------------------------
/decision_tree_learning/Iris.csv:
--------------------------------------------------------------------------------
1 | Id,sepal_length,sepal_width,petal_length,petal_width,species
2 | 1,5.1,3.5,1.4,0.2,Iris-setosa
3 | 2,4.9,3.0,1.4,0.2,Iris-setosa
4 | 3,4.7,3.2,1.3,0.2,Iris-setosa
5 | 4,4.6,3.1,1.5,0.2,Iris-setosa
6 | 5,5.0,3.6,1.4,0.2,Iris-setosa
7 | 6,5.4,3.9,1.7,0.4,Iris-setosa
8 | 7,4.6,3.4,1.4,0.3,Iris-setosa
9 | 8,5.0,3.4,1.5,0.2,Iris-setosa
10 | 9,4.4,2.9,1.4,0.2,Iris-setosa
11 | 10,4.9,3.1,1.5,0.1,Iris-setosa
12 | 11,5.4,3.7,1.5,0.2,Iris-setosa
13 | 12,4.8,3.4,1.6,0.2,Iris-setosa
14 | 13,4.8,3.0,1.4,0.1,Iris-setosa
15 | 14,4.3,3.0,1.1,0.1,Iris-setosa
16 | 15,5.8,4.0,1.2,0.2,Iris-setosa
17 | 16,5.7,4.4,1.5,0.4,Iris-setosa
18 | 17,5.4,3.9,1.3,0.4,Iris-setosa
19 | 18,5.1,3.5,1.4,0.3,Iris-setosa
20 | 19,5.7,3.8,1.7,0.3,Iris-setosa
21 | 20,5.1,3.8,1.5,0.3,Iris-setosa
22 | 21,5.4,3.4,1.7,0.2,Iris-setosa
23 | 22,5.1,3.7,1.5,0.4,Iris-setosa
24 | 23,4.6,3.6,1.0,0.2,Iris-setosa
25 | 24,5.1,3.3,1.7,0.5,Iris-setosa
26 | 25,4.8,3.4,1.9,0.2,Iris-setosa
27 | 26,5.0,3.0,1.6,0.2,Iris-setosa
28 | 27,5.0,3.4,1.6,0.4,Iris-setosa
29 | 28,5.2,3.5,1.5,0.2,Iris-setosa
30 | 29,5.2,3.4,1.4,0.2,Iris-setosa
31 | 30,4.7,3.2,1.6,0.2,Iris-setosa
32 | 31,4.8,3.1,1.6,0.2,Iris-setosa
33 | 32,5.4,3.4,1.5,0.4,Iris-setosa
34 | 33,5.2,4.1,1.5,0.1,Iris-setosa
35 | 34,5.5,4.2,1.4,0.2,Iris-setosa
36 | 35,4.9,3.1,1.5,0.1,Iris-setosa
37 | 36,5.0,3.2,1.2,0.2,Iris-setosa
38 | 37,5.5,3.5,1.3,0.2,Iris-setosa
39 | 38,4.9,3.1,1.5,0.1,Iris-setosa
40 | 39,4.4,3.0,1.3,0.2,Iris-setosa
41 | 40,5.1,3.4,1.5,0.2,Iris-setosa
42 | 41,5.0,3.5,1.3,0.3,Iris-setosa
43 | 42,4.5,2.3,1.3,0.3,Iris-setosa
44 | 43,4.4,3.2,1.3,0.2,Iris-setosa
45 | 44,5.0,3.5,1.6,0.6,Iris-setosa
46 | 45,5.1,3.8,1.9,0.4,Iris-setosa
47 | 46,4.8,3.0,1.4,0.3,Iris-setosa
48 | 47,5.1,3.8,1.6,0.2,Iris-setosa
49 | 48,4.6,3.2,1.4,0.2,Iris-setosa
50 | 49,5.3,3.7,1.5,0.2,Iris-setosa
51 | 50,5.0,3.3,1.4,0.2,Iris-setosa
52 | 51,7.0,3.2,4.7,1.4,Iris-versicolor
53 | 52,6.4,3.2,4.5,1.5,Iris-versicolor
54 | 53,6.9,3.1,4.9,1.5,Iris-versicolor
55 | 54,5.5,2.3,4.0,1.3,Iris-versicolor
56 | 55,6.5,2.8,4.6,1.5,Iris-versicolor
57 | 56,5.7,2.8,4.5,1.3,Iris-versicolor
58 | 57,6.3,3.3,4.7,1.6,Iris-versicolor
59 | 58,4.9,2.4,3.3,1.0,Iris-versicolor
60 | 59,6.6,2.9,4.6,1.3,Iris-versicolor
61 | 60,5.2,2.7,3.9,1.4,Iris-versicolor
62 | 61,5.0,2.0,3.5,1.0,Iris-versicolor
63 | 62,5.9,3.0,4.2,1.5,Iris-versicolor
64 | 63,6.0,2.2,4.0,1.0,Iris-versicolor
65 | 64,6.1,2.9,4.7,1.4,Iris-versicolor
66 | 65,5.6,2.9,3.6,1.3,Iris-versicolor
67 | 66,6.7,3.1,4.4,1.4,Iris-versicolor
68 | 67,5.6,3.0,4.5,1.5,Iris-versicolor
69 | 68,5.8,2.7,4.1,1.0,Iris-versicolor
70 | 69,6.2,2.2,4.5,1.5,Iris-versicolor
71 | 70,5.6,2.5,3.9,1.1,Iris-versicolor
72 | 71,5.9,3.2,4.8,1.8,Iris-versicolor
73 | 72,6.1,2.8,4.0,1.3,Iris-versicolor
74 | 73,6.3,2.5,4.9,1.5,Iris-versicolor
75 | 74,6.1,2.8,4.7,1.2,Iris-versicolor
76 | 75,6.4,2.9,4.3,1.3,Iris-versicolor
77 | 76,6.6,3.0,4.4,1.4,Iris-versicolor
78 | 77,6.8,2.8,4.8,1.4,Iris-versicolor
79 | 78,6.7,3.0,5.0,1.7,Iris-versicolor
80 | 79,6.0,2.9,4.5,1.5,Iris-versicolor
81 | 80,5.7,2.6,3.5,1.0,Iris-versicolor
82 | 81,5.5,2.4,3.8,1.1,Iris-versicolor
83 | 82,5.5,2.4,3.7,1.0,Iris-versicolor
84 | 83,5.8,2.7,3.9,1.2,Iris-versicolor
85 | 84,6.0,2.7,5.1,1.6,Iris-versicolor
86 | 85,5.4,3.0,4.5,1.5,Iris-versicolor
87 | 86,6.0,3.4,4.5,1.6,Iris-versicolor
88 | 87,6.7,3.1,4.7,1.5,Iris-versicolor
89 | 88,6.3,2.3,4.4,1.3,Iris-versicolor
90 | 89,5.6,3.0,4.1,1.3,Iris-versicolor
91 | 90,5.5,2.5,4.0,1.3,Iris-versicolor
92 | 91,5.5,2.6,4.4,1.2,Iris-versicolor
93 | 92,6.1,3.0,4.6,1.4,Iris-versicolor
94 | 93,5.8,2.6,4.0,1.2,Iris-versicolor
95 | 94,5.0,2.3,3.3,1.0,Iris-versicolor
96 | 95,5.6,2.7,4.2,1.3,Iris-versicolor
97 | 96,5.7,3.0,4.2,1.2,Iris-versicolor
98 | 97,5.7,2.9,4.2,1.3,Iris-versicolor
99 | 98,6.2,2.9,4.3,1.3,Iris-versicolor
100 | 99,5.1,2.5,3.0,1.1,Iris-versicolor
101 | 100,5.7,2.8,4.1,1.3,Iris-versicolor
102 | 101,6.3,3.3,6.0,2.5,Iris-virginica
103 | 102,5.8,2.7,5.1,1.9,Iris-virginica
104 | 103,7.1,3.0,5.9,2.1,Iris-virginica
105 | 104,6.3,2.9,5.6,1.8,Iris-virginica
106 | 105,6.5,3.0,5.8,2.2,Iris-virginica
107 | 106,7.6,3.0,6.6,2.1,Iris-virginica
108 | 107,4.9,2.5,4.5,1.7,Iris-virginica
109 | 108,7.3,2.9,6.3,1.8,Iris-virginica
110 | 109,6.7,2.5,5.8,1.8,Iris-virginica
111 | 110,7.2,3.6,6.1,2.5,Iris-virginica
112 | 111,6.5,3.2,5.1,2.0,Iris-virginica
113 | 112,6.4,2.7,5.3,1.9,Iris-virginica
114 | 113,6.8,3.0,5.5,2.1,Iris-virginica
115 | 114,5.7,2.5,5.0,2.0,Iris-virginica
116 | 115,5.8,2.8,5.1,2.4,Iris-virginica
117 | 116,6.4,3.2,5.3,2.3,Iris-virginica
118 | 117,6.5,3.0,5.5,1.8,Iris-virginica
119 | 118,7.7,3.8,6.7,2.2,Iris-virginica
120 | 119,7.7,2.6,6.9,2.3,Iris-virginica
121 | 120,6.0,2.2,5.0,1.5,Iris-virginica
122 | 121,6.9,3.2,5.7,2.3,Iris-virginica
123 | 122,5.6,2.8,4.9,2.0,Iris-virginica
124 | 123,7.7,2.8,6.7,2.0,Iris-virginica
125 | 124,6.3,2.7,4.9,1.8,Iris-virginica
126 | 125,6.7,3.3,5.7,2.1,Iris-virginica
127 | 126,7.2,3.2,6.0,1.8,Iris-virginica
128 | 127,6.2,2.8,4.8,1.8,Iris-virginica
129 | 128,6.1,3.0,4.9,1.8,Iris-virginica
130 | 129,6.4,2.8,5.6,2.1,Iris-virginica
131 | 130,7.2,3.0,5.8,1.6,Iris-virginica
132 | 131,7.4,2.8,6.1,1.9,Iris-virginica
133 | 132,7.9,3.8,6.4,2.0,Iris-virginica
134 | 133,6.4,2.8,5.6,2.2,Iris-virginica
135 | 134,6.3,2.8,5.1,1.5,Iris-virginica
136 | 135,6.1,2.6,5.6,1.4,Iris-virginica
137 | 136,7.7,3.0,6.1,2.3,Iris-virginica
138 | 137,6.3,3.4,5.6,2.4,Iris-virginica
139 | 138,6.4,3.1,5.5,1.8,Iris-virginica
140 | 139,6.0,3.0,4.8,1.8,Iris-virginica
141 | 140,6.9,3.1,5.4,2.1,Iris-virginica
142 | 141,6.7,3.1,5.6,2.4,Iris-virginica
143 | 142,6.9,3.1,5.1,2.3,Iris-virginica
144 | 143,5.8,2.7,5.1,1.9,Iris-virginica
145 | 144,6.8,3.2,5.9,2.3,Iris-virginica
146 | 145,6.7,3.3,5.7,2.5,Iris-virginica
147 | 146,6.7,3.0,5.2,2.3,Iris-virginica
148 | 147,6.3,2.5,5.0,1.9,Iris-virginica
149 | 148,6.5,3.0,5.2,2.0,Iris-virginica
150 | 149,6.2,3.4,5.4,2.3,Iris-virginica
151 | 150,5.9,3.0,5.1,1.8,Iris-virginica
152 |
--------------------------------------------------------------------------------
/decision_tree_learning/basic_decision_tree.py:
--------------------------------------------------------------------------------
1 | from sklearn.tree import DecisionTreeClassifier
2 | from sklearn.datasets import load_breast_cancer
3 | from sklearn.model_selection import train_test_split
4 |
5 | cancer = load_breast_cancer()
6 |
7 | X_train, X_test, y_train, y_test = train_test_split(cancer.data, cancer.target, stratify=cancer.target, random_state=42)
8 |
9 | tree = DecisionTreeClassifier(random_state=0)
10 | tree.fit(X_train, y_train)
11 |
12 | print("Accuracy on training set: {:.3f}".format(tree.score(X_train, y_train)))
13 | print("Accuracy on test set: {:.3f}".format(tree.score(X_test, y_test)))
--------------------------------------------------------------------------------
/decision_tree_learning/decision_tree_classificaiton.py:
--------------------------------------------------------------------------------
1 | # Commented out IPython magic to ensure Python compatibility.
2 | import numpy as np
3 | import pandas as pd
4 | import matplotlib.pyplot as plt
5 | import seaborn as sns
6 |
7 | from sklearn import tree
8 |
9 | # %matplotlib inline
10 |
11 | """**Read Iris Dataset**"""
12 |
13 | data = pd.read_csv('Iris.csv')
14 | data
15 |
16 | data.shape
17 |
18 | """**Define Columns**"""
19 |
20 | col_names = ['id', 'sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']
21 |
22 |
23 | data.columns = col_names
24 |
25 | col_names
26 |
27 | """**Drop Id Column**"""
28 |
29 | data = data.drop(['id'], axis=1)
30 |
31 | data.head()
32 |
33 | data.info()
34 |
35 | """**Checking the target categorical counts**"""
36 |
37 | data['species'].value_counts()
38 |
39 | """**Check missing values in variables**"""
40 |
41 | data.isnull().sum()
42 |
43 | target_col = ['species']
44 |
45 | X = data.drop(['species'], axis=1)
46 |
47 | y = data['species']
48 |
49 | """**Split dataset into train and test**"""
50 |
51 | from sklearn.model_selection import train_test_split
52 |
53 | X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.33, random_state = 42)
54 |
55 | """**Check datatypes**"""
56 |
57 | X_train.dtypes
58 |
59 | """**Decision Tree Classification based on Gini index criterion**"""
60 |
61 | from sklearn.tree import DecisionTreeClassifier
62 |
63 | clf_gini = DecisionTreeClassifier(criterion='gini', max_depth=3, random_state=0)
64 | clf_gini.fit(X_train, y_train)
65 |
66 | y_pred_gini = clf_gini.predict(X_test)
67 | y_pred_gini
68 |
69 | """**Check accuracy of the model**"""
70 |
71 | from sklearn.metrics import accuracy_score
72 |
73 | print('Model accuracy score with criterion gini index: {0:0.4f}'. format(accuracy_score(y_test, y_pred_gini)))# y_pred_gini are the predicted class labels in the test-set.
74 |
75 | #Compare the train-set and test-set accuracy
76 | y_pred_train_gini = clf_gini.predict(X_train)
77 |
78 | y_pred_train_gini
79 |
80 | print('Training-set accuracy score: {0:0.4f}'. format(accuracy_score(y_train, y_pred_train_gini)))
81 |
82 | #Check for overfitting and underfitting
83 |
84 | print('Training set score: {:.4f}'.format(clf_gini.score(X_train, y_train)))
85 |
86 | print('Test set score: {:.4f}'.format(clf_gini.score(X_test, y_test)))
87 |
88 | """**Pictorial representation of Decision Tree**"""
89 |
90 | plt.figure(figsize=(12,8))
91 | tree.plot_tree(clf_gini.fit(X_train, y_train))
--------------------------------------------------------------------------------
/decision_tree_learning/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/deep-learning/deep_learning_basics.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "nbformat": 4,
3 | "nbformat_minor": 0,
4 | "metadata": {
5 | "colab": {
6 | "name": "deep-learning-basics.ipynb",
7 | "provenance": [],
8 | "include_colab_link": true
9 | },
10 | "kernelspec": {
11 | "name": "python3",
12 | "display_name": "Python 3"
13 | }
14 | },
15 | "cells": [
16 | {
17 | "cell_type": "markdown",
18 | "metadata": {
19 | "id": "view-in-github",
20 | "colab_type": "text"
21 | },
22 | "source": [
23 |         ""
24 | ]
25 | },
26 | {
27 | "cell_type": "markdown",
28 | "metadata": {
29 | "id": "DrdHQbwAvYcO"
30 | },
31 | "source": [
32 | "# Neuron Implementation\r\n",
33 | "\r\n",
34 | "* Tutorial: https://towardsai.net/deep-learning\r\n",
35 | "* Github: https://github.com/towardsai/tutorials/tree/master/deep-learning"
36 | ]
37 | },
38 | {
39 | "cell_type": "code",
40 | "metadata": {
41 | "id": "29JF978gvSpE"
42 | },
43 | "source": [
44 | "import numpy as np"
45 | ],
46 | "execution_count": 1,
47 | "outputs": []
48 | },
49 | {
50 | "cell_type": "markdown",
51 | "metadata": {
52 | "id": "c814KzCsvk8x"
53 | },
54 | "source": [
55 | "**Create Sigmoid function**"
56 | ]
57 | },
58 | {
59 | "cell_type": "code",
60 | "metadata": {
61 | "id": "D_VDpun0vdJ1"
62 | },
63 | "source": [
64 | "def sigmoid(x):\r\n",
65 | " return 1/ (1 + np.exp(-x))"
66 | ],
67 | "execution_count": 2,
68 | "outputs": []
69 | },
70 | {
71 | "cell_type": "markdown",
72 | "metadata": {
73 | "id": "oDvAJGrlv-6_"
74 | },
75 | "source": [
76 | "# Creating an Artificial Neuron (AN)"
77 | ]
78 | },
79 | {
80 | "cell_type": "code",
81 | "metadata": {
82 | "id": "EzX9ccMXvnjM"
83 | },
84 | "source": [
85 | "class Neuron:\r\n",
86 | " def __init__(self, weights, bias):\r\n",
87 | " self.weights = weights\r\n",
88 | " self.bias = bias\r\n",
89 | " \r\n",
90 | " def feedforwards(self, inputs):\r\n",
91 | " total = np.dot(self.weights, inputs) + self.bias\r\n",
92 | " return sigmoid(total)\r\n"
93 | ],
94 | "execution_count": 3,
95 | "outputs": []
96 | },
97 | {
98 | "cell_type": "code",
99 | "metadata": {
100 | "colab": {
101 | "base_uri": "https://localhost:8080/"
102 | },
103 | "id": "DdAkVREWwBmT",
104 | "outputId": "1711b30b-c520-4965-9636-162ffd1814a8"
105 | },
106 | "source": [
107 | "weights = np.array([0, 1])\r\n",
108 | "bias = 4\r\n",
109 | "\r\n",
110 | "neuron = Neuron(weights, bias)\r\n",
111 | "\r\n",
112 | "x = np.array([2, 3])\r\n",
113 | "\r\n",
114 | "forward = neuron.feedforwards(x)\r\n",
115 | "\r\n",
116 | "print(forward)\r\n"
117 | ],
118 | "execution_count": 4,
119 | "outputs": [
120 | {
121 | "output_type": "stream",
122 | "text": [
123 | "0.9990889488055994\n"
124 | ],
125 | "name": "stdout"
126 | }
127 | ]
128 | }
129 | ]
130 | }
--------------------------------------------------------------------------------
/deep-learning/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/deep-learning/neuron.py:
--------------------------------------------------------------------------------
1 | # import numpy
2 | import numpy as np
3 |
4 | # define our sigmoid function
5 | def sigmoid(x):
6 | return 1/ (1 + np.exp(-x))
7 |
8 | # create the AN (artificial neuron)
9 | class Neuron:
10 | def __init__(self, weights, bias):
11 | self.weights = weights
12 | self.bias = bias
13 |
14 | def feedforwards(self, inputs):
15 | total = np.dot(self.weights, inputs) + self.bias
16 | return sigmoid(total)
17 |
18 | weights = np.array([0, 1])
19 | bias = 4
20 |
21 | neuron = Neuron(weights, bias)
22 |
23 | x = np.array([2, 3])
24 |
25 | forward = neuron.feedforwards(x)
26 |
27 | print(forward)
28 |
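29 | # Worked check: the neuron computes sigmoid(w . x + b)
30 | # = sigmoid(0*2 + 1*3 + 4) = sigmoid(7) ≈ 0.9991, which is the value printed above.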
--------------------------------------------------------------------------------
/descriptive-statistics/descriptive-statistics-pdf-book-sample.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/towardsai/tutorials/cc12fe183d50ce6095f044d7346f30d5d0522584/descriptive-statistics/descriptive-statistics-pdf-book-sample.pdf
--------------------------------------------------------------------------------
/descriptive-statistics/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/genetic-algorithm-tutorial/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
6 |
--------------------------------------------------------------------------------
/genetic-algorithm-tutorial/implementation.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """genetic-algorithm-python-tutorial.ipynb
3 |
4 | Automatically generated by Colaboratory.
5 |
6 | Original file is located at
7 | https://colab.research.google.com/drive/161ijkvn8wG_seVtQexm-p3fW3r5p8s_x
8 |
9 | # Genetic Algorithm Implementation with Python
10 |
11 | * Tutorial: https://towardsai.net/p/computer-science/genetic-algorithm-ga-introduction-with-example-code-e59f9bc58eaf
12 |
13 | * Github: https://github.com/towardsai/tutorials/tree/master/genetic-algorithm-tutorial
14 |
15 | The Genetic Algorithm is a class of evolutionary algorithms broadly inspired by biological evolution: parents are selected, they reproduce, and their offspring are mutated. The aim is to produce offspring that are fitter than their parents. A genetic algorithm is based on natural selection and simulates this process of evolution.
16 | """
17 |
18 | import numpy as np
19 | import matplotlib.pyplot as plt
20 | import copy
21 |
22 | # cost function
23 | def sphere(x):
24 | ''' This is the problem we will be
25 | optimizing, each chromosome of parent has a cost
26 | which is calculated from this cost function'''
27 |
28 | return sum(x**2)
29 |
30 | def roulette_wheel_selection(p):
31 | ''' Roulette Wheel Selection is a method of parent
32 |     selection for breeding. We take the cumulative sum of probabilities
33 |     and select the first parent whose cumulative sum is greater than
34 |     a random number'''
35 |
36 | c = np.cumsum(p)
37 | r = sum(p) * np.random.rand()
38 | ind = np.argwhere(r <= c)
39 |
40 | return ind[0][0]
41 |
42 | def crossover(p1, p2):
43 |     ''' Performing uniform crossover. Alpha is the flag
44 |     that determines which gene of each chromosome is chosen
45 |     to be inherited by the offspring. Multiply the alpha value
46 |     with each gene of every chromosome of both parents and
47 |     then add the resultant values to get the child chromosome'''
48 |
49 | c1 = copy.deepcopy(p1)
50 | c2 = copy.deepcopy(p2)
51 |
52 | # Uniform crossover
53 | alpha = np.random.uniform(0, 1, *(c1['position'].shape))
54 | c1['position'] = alpha*p1['position'] + (1-alpha)*p2['position']
55 | c2['position'] = alpha*p2['position'] + (1-alpha)*p1['position']
56 |
57 | return c1, c2
58 |
59 | def mutate(c, mu, sigma):
60 | '''
61 | c: child chromosome
62 | mu: mutation rate. % of gene to be modified
63 | sigma: step size of mutation'''
64 |
65 | y = copy.deepcopy(c)
66 |     flag = np.random.rand(*(c['position'].shape)) <= mu # array of True and False, indicating at which positions to perform mutation
67 | ind = np.argwhere(flag)
68 | y['position'][ind] += sigma * np.random.randn(*ind.shape)
69 |
70 | return y
71 |
72 | def bounds(c, varmin, varmax):
73 | ''' Defines the upper and lower bound of gene value'''
74 |
75 | c['position'] = np.maximum(c['position'], varmin)
76 | c['position'] = np.minimum(c['position'], varmax)
77 |
78 | def sort(arr):
79 |     ''' Bubble sorting the population + offspring
80 |     in every iteration to keep the best-fit individuals at the top'''
81 |
82 | n = len(arr)
83 |
84 | for i in range(n-1):
85 |
86 | for j in range(0, n-i-1):
87 | if arr[j]['cost'] > arr[j+1]['cost'] :
88 | arr[j], arr[j+1] = arr[j+1], arr[j]
89 | return arr
90 |
91 | def ga(costfunc, num_var, varmin, varmax, maxit, npop, num_children, mu, sigma, beta):
92 |
93 | # Placeholder for each individual
94 | population = {}
95 |     for i in range(npop): # each individual has position (chromosomes) and cost,
96 | population[i] = {'position': None, 'cost': None} # create individual as many as population size(npop)
97 |
98 | # Best solution found
99 | bestsol = copy.deepcopy(population)
100 | bestsol_cost = np.inf # initial best cost is infinity
101 |
102 | # Initialize population - 1st Gen
103 | for i in range(npop):
104 | population[i]['position'] = np.random.uniform(varmin, varmax, num_var) # randomly initialize the chromosomes and cost
105 | population[i]['cost'] = costfunc(population[i]['position'])
106 |
107 | if population[i]['cost'] < bestsol_cost: # if cost of an individual is less(best) than best cost,
108 | bestsol = copy.deepcopy(population[i]) # replace the best solution with that individual
109 |
110 | # Best cost of each generation/iteration
111 | bestcost = np.empty(maxit)
112 |
113 | # Main loop
114 | for it in range(maxit):
115 |
116 | # Calculating probability for roulette wheel selection
117 | costs = []
118 | for i in range(len(population)):
119 | costs.append(population[i]['cost']) # list of all the population cost
120 | costs = np.array(costs)
121 | avg_cost = np.mean(costs) # taking average of the costs
122 | if avg_cost != 0:
123 | costs = costs/avg_cost
124 |         probs = np.exp(-beta*costs) # probability is the exponential of negative beta times cost
125 |
126 |     for _ in range(num_children//2): # we will have two offspring for each crossover
127 | # hence divide number of children by 2
128 | '''
129 | -> choosing two parents randomly for mating
130 | -> we are shuffling all the 20 parent individuals and
131 | -> choosing first two of the shuffled array as our parents for mating
132 |
133 |         Randomly selecting parents by shuffling them.
134 |         But we will be using roulette wheel selection
135 |         for our algorithm
136 |
137 | q = np.random.permutation(npop)
138 | p1 = population[q[0]]
139 | p2 = population[q[1]]
140 | '''
141 |
142 | # Roulette wheel selection
143 | p1 = population[roulette_wheel_selection(probs)]
144 | p2 = population[roulette_wheel_selection(probs)]
145 |
146 | # crossover two parents
147 | c1, c2 = crossover(p1, p2)
148 |
149 | # Perform mutation
150 | c1 = mutate(c1, mu, sigma)
151 | c2 = mutate(c2, mu, sigma)
152 |
153 | # Apply bounds
154 | bounds(c1, varmin, varmax)
155 | bounds(c2, varmin, varmax)
156 |
157 |         # Evaluate first offspring
158 | c1['cost'] = costfunc(c1['position']) # calculate cost function of child 1
159 |
160 | if type(bestsol_cost) == float:
161 | if c1['cost'] < bestsol_cost: # replacing best solution in every generation/iteration
162 | bestsol_cost = copy.deepcopy(c1)
163 | else:
164 | if c1['cost'] < bestsol_cost['cost']: # replacing best solution in every generation/iteration
165 | bestsol_cost = copy.deepcopy(c1)
166 |
167 |
168 |         # Evaluate second offspring
169 | if c2['cost'] < bestsol_cost['cost']: # replacing best solution in every generation/iteration
170 | bestsol_cost = copy.deepcopy(c2)
171 |
172 | # Merge, Sort and Select
173 | population[len(population)] = c1
174 | population[len(population)] = c2
175 |
176 | population = sort(population)
177 |
178 | # Store best cost
179 | bestcost[it] = bestsol_cost['cost']
180 |
181 | # Show generation information
182 | print('Iteration {}: Best Cost = {}'. format(it, bestcost[it]))
183 |
184 |
185 | out = population
186 | Bestsol = bestsol
187 | bestcost = bestcost
188 | return (out, Bestsol, bestcost)
189 |
190 | # Problem definition
191 | costfunc = sphere
192 | num_var = 5 # number of decision variables
193 | varmin = -10 # lower bound
194 | varmax = 10 # upper bound
195 |
196 | # GA Parameters
197 | maxit = 501 # number of iterations
198 | npop = 20 # initial population size
199 | beta = 1
200 | prop_children = 1 # proportion of children to population
201 | num_children = int(np.round(prop_children * npop/2)*2) # making sure it is always an even number
202 | mu = 0.2 # mutation rate of 20%; 20% of 5 genes is 1, so 1 gene is mutated
203 | sigma = 0.1 # step size of mutation
204 |
205 |
206 | # Run GA
207 | out = ga(costfunc, num_var, varmin, varmax, maxit, npop, num_children, mu, sigma, beta)
208 |
209 | # Results
210 | #(out, Bestsol, bestcost)
211 | plt.plot(out[2])
212 | plt.xlim(0, maxit)
213 | plt.xlabel('Generations')
214 | plt.ylabel('Best Cost')
215 | plt.title('Genetic Algorithm')
216 | plt.grid(True)
217 | plt.show()
218 |
--------------------------------------------------------------------------------
/google_colab_tutorial/check_gpu.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """check_gpu.ipynb
3 |
4 | Automatically generated by Colaboratory.
5 |
6 | Original file is located at
7 | https://colab.research.google.com/drive/1FDLs0qvY17D8-_4Ds5NDslVvl1oC_i5K
8 |
9 | # Check the details of the GPU hardware accelerator in Colab
10 | """
11 |
12 | import tensorflow as tf
13 | from tensorflow.python.client import device_lib
14 |
15 | tf.test.gpu_device_name()
16 |
17 | device_lib.list_local_devices()
18 |
19 |
--------------------------------------------------------------------------------
/google_colab_tutorial/collab_magic.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """collab_magic.ipynb
3 |
4 | Automatically generated by Colaboratory.
5 |
6 | Original file is located at
7 | https://colab.research.google.com/drive/1ad_oOndoeQsxr0W_fOJ-NocZ4Y8isacy
8 |
9 | # Colab Magics
10 |
11 | **List of All Magic Commands**
12 | """
13 |
14 | # Commented out IPython magic to ensure Python compatibility.
15 | # %lsmagic
16 |
17 | """**List Local Directories**"""
18 |
19 | # Commented out IPython magic to ensure Python compatibility.
20 | # %ldir
21 |
22 | """**Get Notebook History**"""
23 |
24 | # Commented out IPython magic to ensure Python compatibility.
25 | # %history
26 |
27 | """**CPU Time**"""
28 |
29 | # Commented out IPython magic to ensure Python compatibility.
30 | # %time
31 |
32 | """**How long has the system been running?**"""
33 |
34 | !uptime
35 |
36 | """**Display available and used memory**"""
37 |
38 | !free -h
39 | print("-"*100)
40 |
41 | """**Display the CPU specification**"""
42 |
43 | !lscpu
44 | print("-"*70)
45 |
46 | """**List all running VM processes**"""
47 |
48 | # Commented out IPython magic to ensure Python compatibility.
49 | # %%sh
50 | # echo "List all running VM processes."
51 | # ps -ef
52 | # echo "Done"
53 |
54 | """**Embed HTML**"""
55 |
56 | # Commented out IPython magic to ensure Python compatibility.
57 | # %%html
58 | #
59 |
60 | #@title Personal Detail
61 | #@markdown Information.
62 |
63 | Name = 'Peter' #@param {type: "string"}
64 | Age = 25 #@param {type: "slider", min: 1, max: 100}
65 | zip = 1234 #@param {type: "number"}
66 | Date = '2020-01-26' #@param {type: "date"}
67 | Gender = "Male" #@param ['Male', 'Female', 'Other']
68 | #@markdown ---
69 | print("Submitting the form")
70 | print(Name, Age, zip, Date, Gender)
71 | print("Submitted")
72 |
73 | """# Plotting"""
74 |
75 | # Commented out IPython magic to ensure Python compatibility.
76 | # %matplotlib inline
77 | import numpy as np
78 | from matplotlib import pyplot
79 |
80 | random_data = np.random.rand(500).astype(np.float32)
81 | noise_data = np.random.normal(scale=0.5, size=len(random_data))
82 | y = np.sin(random_data * 7) + noise_data
83 | pyplot.scatter(random_data, y)
84 |
85 | """**Plot HeatMap**"""
86 |
87 | import matplotlib.pyplot as plt
88 | import numpy as np
89 | import seaborn as sns
90 |
91 | length = 10
92 | data = 5 + np.random.randn(length, length)
93 | data += np.arange(length)
94 | data += np.reshape(np.arange(length), (length, 1))
95 | sns.heatmap(data)
96 | plt.show()
--------------------------------------------------------------------------------
/google_colab_tutorial/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/google_colab_tutorial/kaggle_data_download.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """kaggle_data_download.ipynb
3 |
4 | Automatically generated by Colaboratory.
5 |
6 | Original file is located at
7 | https://colab.research.google.com/drive/1-bSWzsTCU18243Z75sY2zvs5q6uFbw6v
8 |
9 | # Directly Upload Data from Kaggle
10 |
11 | **Install Kaggle Package**
12 | """
13 |
14 | !pip install -q kaggle
15 |
16 | from google.colab import files
17 |
18 | files.upload()
19 |
20 | !mkdir ~/.kaggle
21 |
22 | !cp kaggle.json ~/.kaggle/
23 |
24 | !chmod 600 ~/.kaggle/kaggle.json
25 |
26 | !kaggle datasets list
27 |
28 | !kaggle competitions download -c competitive-data-science-predict-future-sales
--------------------------------------------------------------------------------
/gradient_descent_tutorial/data.txt:
--------------------------------------------------------------------------------
1 | 6.1101,17.592
2 | 5.5277,9.1302
3 | 8.5186,13.662
4 | 7.0032,11.854
5 | 5.8598,6.8233
6 | 8.3829,11.886
7 | 7.4764,4.3483
8 | 8.5781,12
9 | 6.4862,6.5987
10 | 5.0546,3.8166
11 | 5.7107,3.2522
12 | 14.164,15.505
13 | 5.734,3.1551
14 | 8.4084,7.2258
15 | 5.6407,0.71618
16 | 5.3794,3.5129
17 | 6.3654,5.3048
18 | 5.1301,0.56077
19 | 6.4296,3.6518
20 | 7.0708,5.3893
21 | 6.1891,3.1386
22 | 20.27,21.767
23 | 5.4901,4.263
24 | 6.3261,5.1875
25 | 5.5649,3.0825
26 | 18.945,22.638
27 | 12.828,13.501
28 | 10.957,7.0467
29 | 13.176,14.692
30 | 22.203,24.147
31 | 5.2524,-1.22
32 | 6.5894,5.9966
33 | 9.2482,12.134
34 | 5.8918,1.8495
35 | 8.2111,6.5426
36 | 7.9334,4.5623
37 | 8.0959,4.1164
38 | 5.6063,3.3928
39 | 12.836,10.117
40 | 6.3534,5.4974
41 | 5.4069,0.55657
42 | 6.8825,3.9115
43 | 11.708,5.3854
44 | 5.7737,2.4406
45 | 7.8247,6.7318
46 | 7.0931,1.0463
47 | 5.0702,5.1337
48 | 5.8014,1.844
49 | 11.7,8.0043
50 | 5.5416,1.0179
51 | 7.5402,6.7504
52 | 5.3077,1.8396
53 | 7.4239,4.2885
54 | 7.6031,4.9981
55 | 6.3328,1.4233
56 | 6.3589,-1.4211
57 | 6.2742,2.4756
58 | 5.6397,4.6042
59 | 9.3102,3.9624
60 | 9.4536,5.4141
61 | 8.8254,5.1694
62 | 5.1793,-0.74279
63 | 21.279,17.929
64 | 14.908,12.054
65 | 18.959,17.054
66 | 7.2182,4.8852
67 | 8.2951,5.7442
68 | 10.236,7.7754
69 | 5.4994,1.0173
70 | 20.341,20.992
71 | 10.136,6.6799
72 | 7.3345,4.0259
73 | 6.0062,1.2784
74 | 7.2259,3.3411
75 | 5.0269,-2.6807
76 | 6.5479,0.29678
77 | 7.5386,3.8845
78 | 5.0365,5.7014
79 | 10.274,6.7526
80 | 5.1077,2.0576
81 | 5.7292,0.47953
82 | 5.1884,0.20421
83 | 6.3557,0.67861
84 | 9.7687,7.5435
85 | 6.5159,5.3436
86 | 8.5172,4.2415
87 | 9.1802,6.7981
88 | 6.002,0.92695
89 | 5.5204,0.152
90 | 5.0594,2.8214
91 | 5.7077,1.8451
92 | 7.6366,4.2959
93 | 5.8707,7.2029
94 | 5.3054,1.9869
95 | 8.2934,0.14454
96 | 13.394,9.0551
97 | 5.4369,0.61705
98 |
--------------------------------------------------------------------------------
/gradient_descent_tutorial/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/k-nearest-neighbors/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/k-nearest-neighbors/k_nearest_neighbor_knn_tutorial.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """
3 | #K-Nearest Neighbors (KNN) Algorithm Tutorial - Machine Learning Basics
4 | * Tutorial: https://news.towardsai.net/knn
5 | * Github: https://github.com/towardsai/tutorials/tree/master/k-nearest-neighbors
6 | """
7 |
8 | import numpy as np
9 | import pandas as pd
10 | import matplotlib.pyplot as plt
11 | import seaborn as sns
12 |
13 | from sklearn.model_selection import train_test_split
14 | from sklearn.neighbors import KNeighborsClassifier
15 | from sklearn import metrics
16 |
17 | # Import the iris dataset as provided by the sklearn Python module
18 | from sklearn.datasets import load_iris
19 | iris = load_iris()
20 |
21 | type(iris)
22 |
23 | # Converting sklearn data into Pandas dataframe
24 | # target variables imply
25 | # 0.0 - Setosa
26 | # 1.0 - Versicolor
27 | # 2.0 - Virginica
28 | iris = pd.DataFrame(data= np.c_[iris['data'], iris['target']],
29 | columns= iris['feature_names'] + ['target'])
30 | iris.head()
31 |
32 | """## Checking for outliers and imbalanced data"""
33 |
34 | # data is perfectly balanced
35 | sns.countplot(x='target', data=iris)
36 |
37 | # not many outliers to be handled
38 | for feature in ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']:
39 | sns.boxplot(x='target', y=feature, data=iris)
40 | plt.show()
41 |
42 | """## Plotting a 2-D graph"""
43 |
44 | sns.scatterplot(x='sepal length (cm)', y='sepal width (cm)', data=iris, hue='target', palette="deep")
45 |
46 | """## Separating features and target"""
47 |
48 | # X variable contains flower features
49 | # Y variable contains target values
50 | X = iris.drop(['target'], axis=1)
51 | y = iris['target']
52 |
53 | """## Split the dataset into train and test sets"""
54 |
55 | # 80% of the data will be randomly selected as training data
56 | # remaining 20% as testing data
57 | X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
58 |
59 | # checking accuracy score for k-values ranging from 1 to 25
60 | k_range = list(range(1,26))
61 | scores = []
62 | 
63 | # model fitting and calculating accuracy score
64 | # for each k-value in the range 1-25
65 | for k in k_range:
66 | knn = KNeighborsClassifier(n_neighbors=k)
67 | knn.fit(X_train, y_train)
68 | y_pred = knn.predict(X_test)
69 | scores.append(metrics.accuracy_score(y_test, y_pred))
70 |
71 | plt.plot(k_range, scores)
72 | plt.xlabel('Value of k')
73 | plt.ylabel('Accuracy Score')
74 | plt.title('Accuracy Scores for different values of k')
75 | plt.show()
76 |
77 | # 60% of the data will be randomly selected as training data
78 | # remaining 40% as testing data
79 | X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=0)
80 |
81 | """## Initial model"""
82 |
83 | # Initial model with nearest neighbor as 1(k-value)
84 | # further, k will be replaced with optimal value
85 | knn = KNeighborsClassifier(n_neighbors=1)
86 |
87 | knn.fit(X_train, y_train)
88 | print(knn.score(X_test, y_test))
89 |
90 | """## Finding the right k-value"""
91 |
92 | # checking accuracy score for k-values ranging from 1 to 25
93 | k_range = list(range(1,26))
94 | scores = []
95 | 
96 | # model fitting and calculating accuracy score
97 | # for each k-value in the range 1-25
98 | for k in k_range:
99 | knn = KNeighborsClassifier(n_neighbors=k)
100 | knn.fit(X_train, y_train)
101 | y_pred = knn.predict(X_test)
102 | scores.append(metrics.accuracy_score(y_test, y_pred))
103 |
104 | plt.plot(k_range, scores)
105 | plt.xlabel('Value of k')
106 | plt.ylabel('Accuracy Score')
107 | plt.title('Accuracy Scores for different values of k')
108 | plt.show()
109 |
110 | """## Accuracy for optimal k-value"""
111 |
112 | # 11 is the optimal k-value for this dataset
113 | knn = KNeighborsClassifier(n_neighbors=11)
114 | knn.fit(X_train, y_train)
115 | print(knn.score(X_test, y_test))
116 |
117 | """## Predicting class of new data"""
118 |
119 | knn = KNeighborsClassifier(n_neighbors=11)
120 |
121 | # fitting the entire data without splitting
122 | # into train and test
123 | knn.fit(iris.drop(['target'], axis=1), iris['target'])
124 |
125 | # new data to be classified
126 | X_new = np.array([[1, 2.9, 10, 0.2]])
127 | prediction = knn.predict(X_new)
128 | print(prediction)
129 |
130 | if prediction[0] == 0.0:
131 | print('Setosa')
132 | elif prediction[0] == 1.0:
133 | print('Versicolor')
134 | else:
135 | print('Virginica')
136 |
--------------------------------------------------------------------------------
/linear-algebra-for-ml-and-deep-learning/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/linear-algebra-for-ml-and-deep-learning/house_price.csv:
--------------------------------------------------------------------------------
1 | square_feet,price
2 | 150,6450
3 | 200,7450
4 | 250,8450
5 | 300,9450
6 | 350,11450
7 | 400,15450
8 | 600,18450
9 |
--------------------------------------------------------------------------------
/linear-algebra-for-ml-and-deep-learning/linear_regression.py:
--------------------------------------------------------------------------------
1 | # import important libraries
2 |
3 | import pandas as pd
4 | import numpy as np
5 |
6 | df = pd.read_csv('house_price.csv')
7 |
8 | print(df.head())
9 |
10 | def get_mean(value):
11 | total = sum(value)
12 | length = len(value)
13 | mean = total/length
14 | return mean
15 |
16 | def get_variance(value):
17 | mean = get_mean(value)
18 | mean_difference_square = [pow((item - mean), 2) for item in value]
19 | variance = sum(mean_difference_square)/float(len(value)-1)
20 | return variance
21 |
22 | def get_covariance(value1, value2):
23 | value1_mean = get_mean(value1)
24 | value2_mean = get_mean(value2)
25 | values_size = len(value1)
26 | covariance = 0.0
27 | for i in range(0, values_size):
28 | covariance += (value1[i] - value1_mean) * (value2[i] - value2_mean)
29 |
30 | return covariance / float(values_size - 1)
31 |
32 | def linear_regression(df):
33 |
34 | X = df['square_feet']
35 | Y = df['price']
36 | m = len(X)
37 |
38 | square_feet_mean = get_mean(X)
39 | price_mean = get_mean(Y)
40 |
41 | #variance of X
42 | square_feet_variance = get_variance(X)
43 | price_variance = get_variance(Y)
44 |
45 | covariance_of_price_and_square_feet = get_covariance(X, Y)
46 | w1 = covariance_of_price_and_square_feet / float(square_feet_variance)
47 | w0 = price_mean - w1 * square_feet_mean
48 |
49 | # prediction --> Linear Equation
50 | prediction = w0 + w1 * X
51 |
52 | df['price (prediction)'] = prediction
53 | return df['price (prediction)']
54 |
55 |
56 | predicted = linear_regression(df)
57 |
58 | print(predicted)
59 |
60 |
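61 | # Added note: the coefficients above are the closed-form ordinary least squares solution
62 | # for simple linear regression: w1 = cov(X, Y) / var(X) and w0 = mean(Y) - w1 * mean(X),
63 | # so each predicted price is w0 + w1 * square_feet.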
--------------------------------------------------------------------------------
/linear-algebra-for-ml-and-deep-learning/pca_with_python.py:
--------------------------------------------------------------------------------
1 | # Import important libraries
2 | import numpy as np
3 | import pylab as plt
4 | import pandas as pd
5 | from sklearn import datasets
6 | import matplotlib.pyplot as plt
7 | from sklearn.preprocessing import StandardScaler
8 |
9 | load_iris = datasets.load_iris()
10 | iris_df = pd.DataFrame(load_iris.data, columns=[load_iris.feature_names])
11 |
12 | print(iris_df.head())
13 |
14 | print(load_iris.data.shape)
15 |
16 | standardized_x = StandardScaler().fit_transform(load_iris.data)
17 | print(standardized_x[:2])
18 |
19 | print(standardized_x.T)
20 |
21 | covariance_matrix_x = np.cov(standardized_x.T)
22 | print(covariance_matrix_x)
23 |
24 | eigenvalues, eigenvectors = np.linalg.eig(covariance_matrix_x)
25 |
26 | print(eigenvalues)
27 |
28 | print(eigenvectors)
29 |
30 | total_of_eigenvalues = sum(eigenvalues)
31 | variance_explained = [(i / total_of_eigenvalues)*100 for i in sorted(eigenvalues, reverse=True)]
32 | 
33 | print(variance_explained)
34 |
35 | eigenpairs = [(np.abs(eigenvalues[i]), eigenvectors[:,i]) for i in range(len(eigenvalues))]
36 |
37 | # Sorting from Higher values to lower value
38 | eigenpairs.sort(key=lambda x: x[0], reverse=True)
39 | print(eigenpairs)
40 |
41 | matrix_weighing = np.hstack((eigenpairs[0][1].reshape(4,1),
42 | eigenpairs[1][1].reshape(4,1)))
43 | print(matrix_weighing)
44 |
45 | Y = standardized_x.dot(matrix_weighing)
46 | print(Y)
47 |
48 | plt.figure()
49 | target_names = load_iris.target_names
50 | y = load_iris.target
51 | for c, i, target_name in zip("rgb", [0, 1, 2], target_names):
52 | plt.scatter(Y[y==i,0], Y[y==i,1], c=c, label=target_name)
53 |
54 | plt.xlabel('PCA 1')
55 | plt.ylabel('PCA 2')
56 | plt.legend()
57 | plt.title('PCA')
58 | plt.show()
59 |
60 |
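  61 | # Optional cross-check (illustrative addition, not part of the original script):
  62 | # sklearn's PCA on the same standardized data should report the same percentage
  63 | # of explained variance per component as the eigenvalue ratios printed above
  64 | # (signs of the projected coordinates may flip, which is expected for PCA).
  65 | from sklearn.decomposition import PCA
  66 | pca_check = PCA(n_components=2)
  67 | Y_check = pca_check.fit_transform(standardized_x)
  68 | print(pca_check.explained_variance_ratio_ * 100)
  69 |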
--------------------------------------------------------------------------------
/logic/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/machine_learning_algorithms_for_beginners/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/machine_learning_algorithms_for_beginners/exponential_regression.py:
--------------------------------------------------------------------------------
1 | # Import required libraries:
2 | import numpy as np
3 | import matplotlib.pyplot as plt
4 | from scipy.optimize import curve_fit
5 |
6 | # Dataset values :
7 | day = np.arange(0,8)
8 | weight = np.array([251,209,157,129,103,81,66,49])
9 |
10 | # Exponential Function :
11 | def expo_func(x, a, b):
12 | return a * b ** x
13 |
14 | #popt :Optimal values for the parameters
15 | #pcov :The estimated covariance of popt
16 |
17 | popt, pcov = curve_fit(expo_func, day, weight)
18 | weight_pred = expo_func(day,popt[0],popt[1])
19 |
20 | # Plotting the data
21 | plt.plot(day, weight_pred, 'r-')
22 | plt.scatter(day,weight,label='Day vs Weight')
23 | plt.title("Day vs Weight a*b^x")
24 | plt.xlabel('Day')
25 | plt.ylabel('Weight')
26 | plt.legend()
27 | plt.show()
28 |
29 | # Equation
30 | a=popt[0].round(4)
31 | b=popt[1].round(4)
32 | print(f'The equation of regression line is y={a}*{b}^x')
33 |
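  34 | # Optional cross-check (illustrative addition, not part of the original script):
  35 | # since y = a*b^x implies ln(y) = ln(a) + x*ln(b), a straight-line fit on the
  36 | # log of the weights should recover roughly the same parameters as curve_fit.
  37 | log_slope, log_intercept = np.polyfit(day, np.log(weight), 1)
  38 | print(f'log-space estimate: a={np.exp(log_intercept).round(4)}, b={np.exp(log_slope).round(4)}')
  39 |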
--------------------------------------------------------------------------------
/machine_learning_algorithms_for_beginners/linear_regression_example.py:
--------------------------------------------------------------------------------
1 | # Import required libraries:
2 | import pandas as pd
3 | import numpy as np
4 | import matplotlib.pyplot as plt
5 | from sklearn import linear_model
6 |
7 | # Read the CSV file :
8 | data = pd.read_csv("https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/ML0101ENv3/labs/FuelConsumptionCo2.csv")
9 | data.head()
10 |
11 | # Let's select some features to explore more :
12 | data = data[["ENGINESIZE","CO2EMISSIONS"]]
13 |
14 | # ENGINESIZE vs CO2EMISSIONS:
15 | plt.scatter(data["ENGINESIZE"] , data["CO2EMISSIONS"] , color="blue")
16 | plt.xlabel("ENGINESIZE")
17 | plt.ylabel("CO2EMISSIONS")
18 | plt.show()
19 |
20 | # Generating training and testing data from our data:
21 | # We are using 80% data for training.
22 | train = data[:(int((len(data)*0.8)))]
23 | test = data[(int((len(data)*0.8))):]
24 |
25 | # Modeling:
26 |
27 | # Using sklearn package to model data :
28 | regr = linear_model.LinearRegression()
29 | train_x = np.array(train[["ENGINESIZE"]])
30 | train_y = np.array(train[["CO2EMISSIONS"]])
31 | regr.fit(train_x,train_y)
32 |
33 | # The coefficients:
34 | print ("coefficients : ",regr.coef_) #Slope
35 | print ("Intercept : ",regr.intercept_) #Intercept
36 |
37 | # Plotting the regression line:
38 | plt.scatter(train["ENGINESIZE"], train["CO2EMISSIONS"], color='blue')
39 | plt.plot(train_x, regr.coef_*train_x + regr.intercept_, '-r')
40 | plt.xlabel("Engine size")
41 | plt.ylabel("Emission")
42 |
43 | # Predicting values:
44 |
45 | # Function for predicting future values :
46 | def get_regression_predictions(input_features,intercept,slope):
47 | predicted_values = input_features*slope + intercept
48 | return predicted_values
49 |
50 | # Predicting emission for future car:
51 | my_engine_size = 3.5
  52 | estimated_emission = get_regression_predictions(my_engine_size,regr.intercept_[0],regr.coef_[0][0])
  53 | print ("Estimated Emission :",estimated_emission)
54 |
55 | # Checking various accuracy:
56 | from sklearn.metrics import r2_score
57 | test_x = np.array(test[['ENGINESIZE']])
58 | test_y = np.array(test[['CO2EMISSIONS']])
59 | test_y_ = regr.predict(test_x)
60 | print("Mean absolute error: %.2f" % np.mean(np.absolute(test_y_ - test_y)))
61 | print("Mean sum of squares (MSE): %.2f" % np.mean((test_y_ - test_y) ** 2))
62 | print("R2-score: %.2f" % r2_score(test_y_ , test_y) )
63 |
--------------------------------------------------------------------------------
/machine_learning_algorithms_for_beginners/logarithmic_regression.py:
--------------------------------------------------------------------------------
1 | # Import required libraries:
2 | import numpy as np
3 | import matplotlib.pyplot as plt
4 | from sklearn.metrics import r2_score
5 |
6 | # Dataset:
7 | # Y = a + b*ln(X)
8 |
9 | X = np.arange(1,50,0.5)
10 | Y = 10 + 2*np.log(X)
11 |
12 | #Adding some noise to calculate error!
13 | Y_noise = np.random.rand(len(Y))
14 | Y = Y +Y_noise
15 | plt.scatter(X,Y)
16 |
17 | # 1st column of our X matrix should be 1:
18 | n = len(X)
19 | x_bias = np.ones((n,1))
20 | print (X.shape)
21 | print (x_bias.shape)
22 |
23 | # Reshaping X :
24 | X = np.reshape(X,(n,1))
25 | print (X.shape)
26 |
27 | # Going with the formula:
28 | # Y = a + b*ln(X)
29 | X_log = np.log(X)
30 |
31 | # Append the X_log to X_bias:
32 | x_new = np.append(x_bias,X_log,axis=1)
33 |
34 | # Transpose of a matrix:
35 | x_new_transpose = np.transpose(x_new)
36 |
37 | # Matrix multiplication:
38 | x_new_transpose_dot_x_new = x_new_transpose.dot(x_new)
39 |
40 | # Find inverse:
41 | temp_1 = np.linalg.inv(x_new_transpose_dot_x_new)
42 |
43 | # Matrix Multiplication:
44 | temp_2 = x_new_transpose.dot(Y)
45 |
46 | # Find the coefficient values:
47 | theta = temp_1.dot(temp_2)
48 |
49 | # Plot the data:
50 | a = theta[0]
51 | b = theta[1]
52 | Y_plot = a + b*np.log(X)
53 | plt.scatter(X,Y)
54 | plt.plot(X,Y_plot,c="r")
55 |
56 | # Check the accuracy:
57 | Accuracy = r2_score(Y,Y_plot)
58 | print (Accuracy)
59 |
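  60 | # Optional cross-check (illustrative addition, not part of the original script):
  61 | # the normal-equation solution theta = (X^T X)^-1 X^T Y computed above should
  62 | # match numpy's built-in least-squares solver on the same design matrix.
  63 | theta_check, _, _, _ = np.linalg.lstsq(x_new, Y, rcond=None)
  64 | print (theta_check)
  65 |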
--------------------------------------------------------------------------------
/machine_learning_algorithms_for_beginners/ml_algorithms_1.py:
--------------------------------------------------------------------------------
1 | # Import required libraries:
2 | import pandas as pd
3 | import numpy as np
4 | import matplotlib.pyplot as plt
5 | from sklearn import linear_model
6 |
7 | # Read the CSV file :
8 | data = pd.read_csv("https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/ML0101ENv3/labs/FuelConsumptionCo2.csv")
9 | data.head()
10 |
11 | # Let's select some features to explore more :
12 | data = data[["ENGINESIZE","CO2EMISSIONS"]]
13 |
14 | # ENGINESIZE vs CO2EMISSIONS:
15 | plt.scatter(data["ENGINESIZE"] , data["CO2EMISSIONS"] , color="blue")
16 | plt.xlabel("ENGINESIZE")
17 | plt.ylabel("CO2EMISSIONS")
18 | plt.show()
19 |
20 | # Generating training and testing data from our data:
21 | # We are using 80% data for training.
22 | train = data[:(int((len(data)*0.8)))]
23 | test = data[(int((len(data)*0.8))):]
24 |
25 | # Modeling:
26 |
27 | # Using sklearn package to model data :
28 | regr = linear_model.LinearRegression()
29 | train_x = np.array(train[["ENGINESIZE"]])
30 | train_y = np.array(train[["CO2EMISSIONS"]])
31 | regr.fit(train_x,train_y)
32 |
33 | # The coefficients:
34 | print ("coefficients : ",regr.coef_) #Slope
35 | print ("Intercept : ",regr.intercept_) #Intercept
36 |
37 | # Plotting the regression line:
38 | plt.scatter(train["ENGINESIZE"], train["CO2EMISSIONS"], color='blue')
39 | plt.plot(train_x, regr.coef_*train_x + regr.intercept_, '-r')
40 | plt.xlabel("Engine size")
41 | plt.ylabel("Emission")
42 |
43 | # Predicting values:
44 |
45 | # Function for predicting future values :
46 | def get_regression_predictions(input_features,intercept,slope):
47 | predicted_values = input_features*slope + intercept
48 | return predicted_values
49 |
50 | # Predicting emission for future car:
51 | my_engine_size = 3.5
  52 | estimated_emission = get_regression_predictions(my_engine_size,regr.intercept_[0],regr.coef_[0][0])
  53 | print ("Estimated Emission :",estimated_emission)
54 |
55 | # Checking various accuracy:
56 | from sklearn.metrics import r2_score
57 | test_x = np.array(test[['ENGINESIZE']])
58 | test_y = np.array(test[['CO2EMISSIONS']])
59 | test_y_ = regr.predict(test_x)
60 | print("Mean absolute error: %.2f" % np.mean(np.absolute(test_y_ - test_y)))
61 | print("Mean sum of squares (MSE): %.2f" % np.mean((test_y_ - test_y) ** 2))
62 | print("R2-score: %.2f" % r2_score(test_y_ , test_y) )
63 |
--------------------------------------------------------------------------------
/machine_learning_algorithms_for_beginners/multivariable_linear_regression.py:
--------------------------------------------------------------------------------
1 | # Import the required libraries:
2 | import pandas as pd
3 | import numpy as np
4 | import matplotlib.pyplot as plt
5 | from sklearn import linear_model
6 |
7 | # Read the CSV file:
8 | data = pd.read_csv("https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/ML0101ENv3/labs/FuelConsumptionCo2.csv")
9 | data.head()
10 |
11 | # Consider features we want to work on:
12 | X = data[[ 'ENGINESIZE', 'CYLINDERS', 'FUELCONSUMPTION_CITY','FUELCONSUMPTION_HWY',
13 | 'FUELCONSUMPTION_COMB','FUELCONSUMPTION_COMB_MPG']]
14 | Y = data["CO2EMISSIONS"]
15 |
16 | # Generating training and testing data from our data:
17 | # We are using 80% data for training.
18 | train = data[:(int((len(data)*0.8)))]
19 | test = data[(int((len(data)*0.8))):]
20 |
21 | #Modeling:
22 |
23 | #Using sklearn package to model data :
24 |
25 | regr = linear_model.LinearRegression()
26 | train_x = np.array(train[[ 'ENGINESIZE', 'CYLINDERS', 'FUELCONSUMPTION_CITY',
27 | 'FUELCONSUMPTION_HWY', 'FUELCONSUMPTION_COMB','FUELCONSUMPTION_COMB_MPG']])
28 | train_y = np.array(train["CO2EMISSIONS"])
29 | regr.fit(train_x,train_y)
30 | test_x = np.array(test[[ 'ENGINESIZE', 'CYLINDERS', 'FUELCONSUMPTION_CITY',
31 | 'FUELCONSUMPTION_HWY', 'FUELCONSUMPTION_COMB','FUELCONSUMPTION_COMB_MPG']])
32 | test_y = np.array(test["CO2EMISSIONS"])
33 |
34 | # print the coefficient values:
35 | coeff_data = pd.DataFrame(regr.coef_ , X.columns , columns=["Coefficients"])
36 | coeff_data
37 |
38 | #Now let's do prediction of data:
39 | Y_pred = regr.predict(test_x)
40 |
41 | # Check accuracy:
42 | from sklearn.metrics import r2_score
43 | R = r2_score(test_y , Y_pred)
44 | print ("R² :",R)
45 |
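  46 | # Optional variation (illustrative addition, not part of the original script):
  47 | # the slice above keeps the rows in file order; a shuffled split such as
  48 | # sklearn's train_test_split usually gives a less order-dependent R² estimate
  49 | # (random_state is an arbitrary seed chosen here for reproducibility).
  50 | from sklearn.model_selection import train_test_split
  51 | X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2, random_state=42)
  52 | regr_check = linear_model.LinearRegression().fit(X_train, y_train)
  53 | print ("R² with a shuffled split:", r2_score(y_test, regr_check.predict(X_test)))
  54 |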
--------------------------------------------------------------------------------
/machine_learning_algorithms_for_beginners/polynomial_regression.py:
--------------------------------------------------------------------------------
1 | # Import required libraries:
2 | import numpy as np
3 | import matplotlib.pyplot as plt
4 |
5 | # Generate datapoints:
6 | x = np.arange(-5,5,0.1)
7 | y_noise = 20 * np.random.normal(size = len(x))
8 | y = 1*(x**3) + 1*(x**2) + 1*x + 3+y_noise
9 | plt.scatter(x,y)
10 |
11 | # Make polynomial data:
12 | x1 = x
13 | x2 = np.power(x1,2)
14 | x3 = np.power(x1,3)
15 | n = len(x1)
16 |
17 | # Reshaping data:
18 | x1_new = np.reshape(x1,(n,1))
19 | x2_new = np.reshape(x2,(n,1))
20 | x3_new = np.reshape(x3,(n,1))
21 |
22 | # First column of matrix X:
23 | x_bias = np.ones((n,1))
24 |
25 | # Form the complete x matrix:
26 | x_new = np.append(x_bias,x1_new,axis=1)
27 | x_new = np.append(x_new,x2_new,axis=1)
28 | x_new = np.append(x_new,x3_new,axis=1)
29 |
30 | # Finding transpose:
31 | x_new_transpose = np.transpose(x_new)
32 |
33 | # Finding dot product of original and transposed matrix :
34 | x_new_transpose_dot_x_new = x_new_transpose.dot(x_new)
35 |
36 | # Finding Inverse:
  37 | temp_1 = np.linalg.inv(x_new_transpose_dot_x_new)
  38 | temp_2 = x_new_transpose.dot(y)  # Finding the dot product of transposed x and y
39 |
40 | # Finding coefficients:
41 | theta = temp_1.dot(temp_2)
42 | theta
43 |
44 | # Store coefficient values in different variables:
45 | beta_0 = theta[0]
46 | beta_1 = theta[1]
47 | beta_2 = theta[2]
48 | beta_3 = theta[3]
49 |
50 | # Plot the polynomial curve:
51 | plt.scatter(x,y)
52 | plt.plot(x,beta_0 + beta_1*x1 + beta_2*x2 + beta_3*x3,c="red")
53 |
54 | # Prediction function:
55 | def prediction(x1,x2,x3,beta_0,beta_1,beta_2,beta_3):
56 | y_pred = beta_0 + beta_1*x1 + beta_2*x2 + beta_3*x3
57 | return y_pred
58 |
59 | # Making predictions:
60 | pred = prediction(x1,x2,x3,beta_0,beta_1,beta_2,beta_3)
61 |
62 | # Calculate accuracy of model:
63 | def err(y_pred,y):
64 | var = (y - y_pred)
65 | var = var*var
66 | n = len(var)
67 | MSE = var.sum()
68 | MSE = MSE/n
69 |
70 | return MSE
71 |
72 | # Calculating the error:
73 | error = err(pred,y)
74 | error
75 |
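  76 | # Optional cross-check (illustrative addition, not part of the original script):
  77 | # np.polyfit with degree 3 solves the same least-squares problem, so its
  78 | # coefficients (highest power first) should be close to theta in reverse order.
  79 | coeffs_check = np.polyfit(x, y, 3)
  80 | print(coeffs_check)
  81 |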
--------------------------------------------------------------------------------
/machine_learning_algorithms_for_beginners/sinusoidal_regression.py:
--------------------------------------------------------------------------------
1 | # Import required libraries:
2 | import numpy as np
3 | import matplotlib.pyplot as plt
4 | from scipy.optimize import curve_fit
5 | from sklearn.metrics import r2_score
6 |
7 | # Generating dataset:
8 | # Y = A*sin(B(X + C)) + D
9 | # A = Amplitude
10 | # Period = 2*pi/B
11 | # Period = Length of One Cycle
12 | # C = Phase Shift (In Radian)
13 | # D = Vertical Shift
14 |
15 | X = np.linspace(0,1,100) #(Start,End,Points)
16 |
17 | # Here…
18 | # A = 1
19 | # B= 2*pi
20 | # B = 2*pi/Period
21 | # Period = 1
22 | # C = 0
23 | # D = 0
24 |
25 | Y = 1*np.sin(2*np.pi*X)
26 |
27 |
28 | # Adding some Noise :
29 | Noise = 0.4*np.random.normal(size=100)
30 | Y_data = Y + Noise
31 | plt.scatter(X,Y_data,c="r")
32 |
33 | # Calculate the value:
34 | def calc_sine(x,a,b,c,d):
35 | return a * np.sin(b* ( x + np.radians(c))) + d
36 |
37 | # Finding optimal parameters :
38 | popt,pcov = curve_fit(calc_sine,X,Y_data)
39 |
40 | # Plot the main data :
  41 | plt.scatter(X,Y_data)
  42 | plt.plot(X,calc_sine(X,*popt),c="r")  # Plot the best fit curve
43 | plt.show()
44 |
45 | # Check the accuracy :
46 | Accuracy =r2_score(Y_data,calc_sine(X,*popt))
47 | print (Accuracy)
48 |
49 | # Function to calculate the value :
50 | def calc_line(X,m,b):
51 | return b + X*m
52 |
  53 | # curve_fit returns optimized parameters for our function :
  54 | # popt stores the optimal parameters
  55 | # pcov stores the covariance between the parameters.
56 | popt,pcov = curve_fit(calc_line,X,Y_data)
57 |
58 | # Plot the main data :
59 | plt.scatter(X,Y_data)
60 |
61 | # Plot the best fit line :
62 | plt.plot(X,calc_line(X,*popt),c="r")
63 | plt.show()
64 |
65 | # Check the accuracy of model :
66 | Accuracy =r2_score(Y_data,calc_line(X,*popt))
67 | print ("Accuracy of Linear Model : ",Accuracy)
68 |
--------------------------------------------------------------------------------
/moment_generating_function/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/moment_generating_function/moment_generating_function.py:
--------------------------------------------------------------------------------
1 | #1-Dimensional Data:
2 |
3 | #Import required libraries:
4 | from scipy import stats
5 |
6 | #Dataset:
7 | d = [1,2,3,4,5]
8 |
9 | #Finding 0th moment:
10 | print("0th Moment = ",stats.moment(d,moment=0))
11 |
12 | #Finding 1st moment:
13 | print("1st Moment = ",stats.moment(d,moment=1))
14 |
15 | #Finding 2nd moment:
16 | print("2nd Moment = ",stats.moment(d,moment=2))
17 |
  18 | #Finding 3rd moment:
  19 | print("3rd Moment = ",stats.moment(d,moment=3))
20 |
21 | #Finding 4th moment:
22 | print("4th Moment = ",stats.moment(d,moment=4))
23 |
24 |
25 | #============================================================
26 |
27 |
28 | #2-Dimensional Data:
29 |
30 | #Import required libraries:
31 | from scipy import stats
32 |
33 | #Dataset:
34 | d = [[5,6,9,11,3],[21,4,8,15,2]]
35 |
36 | #Finding 0th moment:
37 | print("0th Moment = ",stats.moment(d,moment=0))
38 |
39 | #Finding 1st moment:
40 | print("1st Moment = ",stats.moment(d,moment=1))
41 |
42 | #Finding 2nd moment:
43 | print("2nd Moment = ",stats.moment(d,moment=2))
44 |
  45 | #Finding 3rd moment:
  46 | print("3rd Moment = ",stats.moment(d,moment=3))
47 |
48 | #Finding 4th moment:
49 | print("4th Moment = ",stats.moment(d,moment=4))
50 |
51 |
52 | #============================================================
53 |
54 |
55 | #2-Dimensional Data:
  56 | #Set axis=1 (Horizontal):
57 |
58 | #Import required libraries:
59 | from scipy import stats
60 |
61 | #Dataset:
62 | d = [[5,6,9,11,3],[21,4,8,15,2]]
63 |
64 | #Finding 0th moment:
65 | print("0th Moment = ",stats.moment(d,moment=0,axis=1))
66 |
67 | #Finding 1st moment:
68 | print("1st Moment = ",stats.moment(d,moment=1,axis=1))
69 |
70 | #Finding 2nd moment:
71 | print("2nd Moment = ",stats.moment(d,moment=2,axis=1))
72 |
  73 | #Finding 3rd moment:
  74 | print("3rd Moment = ",stats.moment(d,moment=3,axis=1))
75 |
76 | #Finding 4th moment:
77 | print("4th Moment = ",stats.moment(d,moment=4,axis=1))
78 |
79 |
80 | #============================================================
81 |
82 |
83 | #Multi-Dimensional Data:
84 |
85 | #Import required libraries:
86 | from scipy import stats
87 |
88 | #Dataset:
89 | d = [[5,6,9,11,3],
90 | [21,4,8,15,2],
91 | [15,23,42,1,36]]
92 |
93 | #Finding 0th moment:
94 | print("0th Moment = ",stats.moment(d,moment=0))
95 |
96 | #Finding 1st moment:
97 | print("1st Moment = ",stats.moment(d,moment=1))
98 |
99 | #Finding 2nd moment:
100 | print("2nd Moment = ",stats.moment(d,moment=2))
101 |
  102 | #Finding 3rd moment:
  103 | print("3rd Moment = ",stats.moment(d,moment=3))
104 |
105 | #Finding 4th moment:
106 | print("4th Moment = ",stats.moment(d,moment=4))
107 |
108 |
109 | #============================================================
110 |
111 |
112 | #2-Dimensional Data:
  113 | #Set axis=1 (Horizontal):
114 | #Higher Order Moments:
115 |
116 | #Import required libraries:
117 | from scipy import stats
118 |
119 | #Dataset:
120 | d = [[5,6,9,11,3],[21,4,8,15,2]]
121 |
122 | #Finding 10th moment:
123 | print("10th Moment = ",stats.moment(d,moment=10,axis=1))
124 |
125 | #Finding 12th moment:
126 | print("12th Moment = ",stats.moment(d,moment=12,axis=1))
127 |
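  128 | #Optional cross-check (illustrative addition, not part of the original script):
  129 | #scipy.stats.moment returns central moments, i.e. mean((d - mean(d))**k),
  130 | #so the 2nd moment of [1,2,3,4,5] can be reproduced by hand with numpy.
  131 | import numpy as np
  132 | d1 = np.array([1,2,3,4,5])
  133 | print("Manual 2nd central moment = ", np.mean((d1 - d1.mean())**2))
  134 |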
--------------------------------------------------------------------------------
/monte_carlo_simulation/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/monte_carlo_simulation/monte_carlo_buffon's_needle_problem.py:
--------------------------------------------------------------------------------
1 | #Import required libraries :
2 | import random
3 | import math
4 | import matplotlib.pyplot as plt
5 |
6 | #Main function to estimate PI value :
7 | def monte_carlo(runs,needles,n_length,b_width):
8 | #Empty list to store pi values :
9 | pi_values = []
10 |
11 | #Horizontal line for actual value of PI :
12 | plt.axhline(y=math.pi, color='r', linestyle='-')
13 |
14 | #For all runs :
15 | for i in range(runs):
16 | #Initialize number of hits as 0.
17 | nhits = 0
18 |
19 | #For all needles :
20 | for j in range(needles):
21 | #We will find the distance from the nearest vertical line :
22 | #Min = 0 Max = b_width/2
23 | x = random.uniform(0,b_width/2.0)
24 |
25 | #The theta value will be from 0 to pi/2 :
26 | theta = random.uniform(0,math.pi/2)
27 |
28 | #Checking if the needle crosses the line or not :
29 | xtip = x - (n_length/2.0)*math.cos(theta)
30 | if xtip < 0 :
31 | nhits += 1
32 |
33 | #Going with the formula :
34 | numerator = 2.0 * n_length * needles
35 | denominator = b_width * nhits
36 |
37 | #Append the final value of pi :
38 | pi_values.append((numerator/denominator))
39 |
40 | #Final pi value after all iterations :
41 | print(pi_values[-1])
42 |
43 | #Plotting the graph :
44 | plt.plot(pi_values)
45 |
46 | #Total number of runs :
47 | runs = 100
48 |
49 | #Total number of needles :
50 | needles = 100000
51 |
52 | #Length of needle :
53 | n_length = 2
54 |
  55 | #space between 2 vertical lines :
  56 | b_width = 2
57 |
58 | #Calling the main function :
59 | monte_carlo(runs,needles,n_length,b_width)
60 |
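  61 | #Optional note (illustrative addition, not part of the original script):
  62 | #For a needle no longer than the line spacing, the crossing probability is
  63 | #P = (2*n_length)/(pi*b_width), so with n_length = b_width = 2 the expected
  64 | #hit fraction is 2/pi; the estimator above inverts this relation to get pi.
  65 | print("Expected hit fraction 2/pi =", 2/math.pi)
  66 |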
--------------------------------------------------------------------------------
/monte_carlo_simulation/monte_carlo_casino_example.py:
--------------------------------------------------------------------------------
1 | #Import required libraries :
2 |
3 | import random
4 | import matplotlib.pyplot as plt
5 |
6 | """RULES :
7 |
8 | 1) There are chits containing numbers ranging from 1-100 in a bag.
9 | 2) Users can bet on even or odd.
10 | 3) In this game 10 and 11 are special numbers. 10 will be counted as an odd number and 11 will be counted as an even number.
  11 | 4) If you bet on even and you draw 10, you lose.
  12 | 5) If you bet on odd and you draw 11, you lose.
13 | """
14 |
15 | #Place your bet:
16 |
17 | #User can choose even or odd number :
18 | choice = input("Do you want to bet on Even number or odd number \n")
19 |
20 | #For even :
21 | if choice=="Even":
22 | def pickNote():
23 | #Get random number between 1-100.
24 | note = random.randint(1,100)
25 |
26 | #Check for our game conditions.
27 |
  28 | #Notice that 10 isn't counted as an even number.
29 | if note%2!=0 or note==10:
30 | return False
31 | elif note%2==0:
32 | return True
33 |
34 | #For odd :
35 | elif choice=="Odd":
36 | def pickNote():
37 | #Get random number between 1-100.
38 | note = random.randint(1,100)
39 |
40 | #Check for our game conditions.
41 |
  42 | #Notice that 11 isn't counted as an odd number.
43 | if note%2==0 or note==11:
44 | return False
45 | elif note%2==1:
46 | return True
47 |
48 | #Main function :
49 | def play(total_money, bet_money, total_plays):
50 |
51 | num_of_plays = []
52 | money = []
53 |
54 | #Start with play number 1
55 | play = 1
56 |
57 | for play in range(total_plays):
58 | #Win :
59 | if pickNote():
60 | #Add the money to our funds
61 | total_money = total_money + bet_money
62 | #Append the play number
63 | num_of_plays.append(play)
64 | #Append the new fund amount
65 | money.append(total_money)
66 |
67 | #Lose :
68 | else:
69 | #Add the money to our funds
70 | total_money = total_money - bet_money
71 | #Append the play number
72 | num_of_plays.append(play)
73 | #Append the new fund amount
74 | money.append(total_money)
75 |
76 | #Plot the data :
77 | plt.ylabel('Player Money in $')
78 | plt.xlabel('Number of bets')
79 | plt.plot(num_of_plays,money)
80 |
81 | #Final value after all the iterations :
82 | final_funds.append(money[-1])
83 | return(final_funds)
84 |
85 | #Create a list for calculating final funds
86 | final_funds= []
87 |
88 | #Run 10 iterations :
89 | for i in range(10):
90 | ending_fund = play(10000,100,50)
91 |
92 | print(ending_fund)
93 | print(sum(ending_fund))
94 |
95 | #Print the money the player ends with
96 | print("The player started with $10,000")
97 | print("The player left with $",str(sum(ending_fund)/len(ending_fund)))
98 |
99 | #Create a list for calculating final funds
100 | final_funds= []
101 |
102 | #Run 1000 iterations :
103 | for i in range(1000):
104 | ending_fund = play(10000,100,50)
105 |
106 | #Print the money the player ends with
107 | print("The player started with $10,000")
108 | print("The player left with $",str(sum(ending_fund)/len(ending_fund)))
109 |
110 |
111 | #Create a list for calculating final funds
112 | final_funds= []
113 |
114 | #Run 10 iterations :
115 | for i in range(10):
116 | ending_fund = play(10000,100,5)
117 |
118 | #Print the money the player ends with
119 | print("Number of bets = 5")
120 | print("The player started with $10,000")
121 | print("The player left with $",str(sum(ending_fund)/len(ending_fund)))
122 |
123 |
124 | #Create a list for calculating final funds
125 | final_funds= []
126 |
127 | #Run 10 iterations :
128 | for i in range(10):
129 | ending_fund = play(10000,100,10)
130 |
131 | #Print the money the player ends with
132 | print("Number of bets = 10")
133 | print("The player started with $10,000")
134 | print("The player left with $",str(sum(ending_fund)/len(ending_fund)))
135 |
136 |
137 | #Create a list for calculating final funds
138 | final_funds= []
139 |
140 | #Run 10 iterations :
141 | for i in range(10):
142 | ending_fund = play(10000,100,100)
143 |
144 | #Print the money the player ends with
145 | print("Number of bets = 100")
146 | print("The player started with $10,000")
147 | print("The player left with $",str(sum(ending_fund)/len(ending_fund)))
148 |
149 |
150 | #Create a list for calculating final funds
151 | final_funds= []
152 |
153 | #Run 10 iterations :
154 | for i in range(10):
155 | ending_fund = play(10000,100,1000)
156 |
157 | #Print the money the player ends with
158 | print("Number of bets = 1000")
159 | print("The player started with $10,000")
160 | print("The player left with $",str(sum(ending_fund)/len(ending_fund)))
161 |
162 |
163 | #Create a list for calculating final funds
164 | final_funds= []
165 |
166 | #Run 10 iterations :
167 | for i in range(10):
168 | ending_fund = play(10000,100,5000)
169 |
170 | #Print the money the player ends with
171 | print("Number of bets = 5000")
172 | print("The player started with $10,000")
173 | print("The player left with $",str(sum(ending_fund)/len(ending_fund)))
174 |
175 |
176 | #Create a list for calculating final funds
177 | final_funds= []
178 |
179 | #Run 10 iterations :
180 | for i in range(10):
181 | ending_fund = play(10000,100,10000)
182 |
183 | #Print the money the player ends with
184 | print("Number of bets = 10000")
185 | print("The player started with $10,000")
186 | print("The player left with $",str(sum(ending_fund)/len(ending_fund)))
187 |
188 |
189 |
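 190 | #Optional note (illustrative addition, not part of the original script):
 191 | #Betting on Even wins on the 49 even numbers other than 10 and loses on the
 192 | #other 51 outcomes, so each $100 bet is worth (49 - 51)/100 * $100 = -$2 on
 193 | #average; that 2% house edge is why the averages above drift below $10,000.
 194 | print("Expected value per $100 bet :", (49 - 51)/100 * 100)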
--------------------------------------------------------------------------------
/monte_carlo_simulation/monte_carlo_coin_flip.py:
--------------------------------------------------------------------------------
1 | #Import required libraries :
2 |
3 | import random
4 | import numpy as np
5 | import matplotlib.pyplot as plt
6 |
7 | #Coin flip function :
8 |
9 | #0 --> Heads
10 | #1 --> Tails
11 |
12 | def coin_flip():
13 | return random.randint(0,1)
14 |
15 | #Check the return value of coin_flip() :
16 |
17 | coin_flip()
18 |
19 | #Monte Carlo Simulation :
20 |
21 | #Empty list to store the probability values.
22 | list1 = []
23 |
24 | def monte_carlo(n):
25 | results = 0
26 | for i in range(n):
27 | flip_result = coin_flip()
28 | results = results + flip_result
29 |
30 | #Calculating probability value :
31 | prob_value = results/(i+1)
32 |
33 | #Append the probability values to the list :
34 | list1.append(prob_value)
35 |
36 | #Plot the results :
37 | plt.axhline(y=0.5, color='r', linestyle='-')
38 | plt.xlabel("Iterations")
39 | plt.ylabel("Probability")
40 | plt.plot(list1)
41 |
42 | return results/n
43 |
44 | #Calling the function :
45 |
46 | answer = monte_carlo(5000)
47 | print("Final value :",answer)
48 |
--------------------------------------------------------------------------------
/monte_carlo_simulation/monte_carlo_estimating_pi_using_circle_and_square.py:
--------------------------------------------------------------------------------
1 | #Import required libraries :
2 | import turtle
3 | import random
4 | import matplotlib.pyplot as plt
5 | import math
6 |
7 | #To visualize the random points :
8 | myPen = turtle.Turtle()
9 | myPen.hideturtle()
10 | myPen.speed(0)
11 |
12 | #Drawing a square :
13 | myPen.up()
14 | myPen.setposition(-100,-100)
15 | myPen.down()
16 | myPen.fd(200)
17 | myPen.left(90)
18 | myPen.fd(200)
19 |
20 | myPen.left(90)
21 | myPen.fd(200)
22 | myPen.left(90)
23 | myPen.fd(200)
24 | myPen.left(90)
25 |
26 | #Drawing a circle :
27 | myPen.up()
28 | myPen.setposition(0,-100)
29 | myPen.down()
30 | myPen.circle(100)
31 |
32 | #To count the points inside and outside the circle :
33 | in_circle = 0
34 | out_circle = 0
35 |
36 | #To store the values of PI :
37 | pi_values = []
38 |
39 | #Running for 5 times :
40 | for i in range(5):
41 | for j in range(1000):
42 |
43 | #Generate random numbers :
44 | x=random.randrange(-100,100)
45 | y=random.randrange(-100,100)
46 |
47 | #Check if the number lies outside the circle :
48 | if (x**2+y**2>100**2):
49 | myPen.color("black")
50 | myPen.up()
51 | myPen.goto(x,y)
52 | myPen.down()
53 | myPen.dot()
54 | out_circle = out_circle+1
55 |
56 | else:
57 | myPen.color("red")
58 | myPen.up()
59 | myPen.goto(x,y)
60 | myPen.down()
61 | myPen.dot()
62 | in_circle = in_circle+1
63 |
64 | #Calculating the value of PI :
65 | pi = 4.0 * in_circle / (in_circle + out_circle)
66 |
67 | #Append the values of PI in list :
68 | pi_values.append(pi)
69 |
70 | #Calculating the errors :
71 | avg_pi_errors = [abs(math.pi - pi) for pi in pi_values]
72 |
  73 |  #Print the final value of PI for each iteration :
74 | print (pi_values[-1])
75 |
76 | #Plot the PI values :
77 | plt.axhline(y=math.pi, color='g', linestyle='-')
78 | plt.plot(pi_values)
79 | plt.xlabel("Iterations")
80 | plt.ylabel("Value of PI")
81 | plt.show()
82 |
83 | #Plot the error in calculation :
84 | plt.axhline(y=0.0, color='g', linestyle='-')
85 | plt.plot(avg_pi_errors)
86 | plt.xlabel("Iterations")
87 | plt.ylabel("Error")
88 | plt.show()
89 |
90 |
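  91 | #Optional cross-check (illustrative addition, not part of the original script):
  92 | #The same pi estimate without the turtle graphics, using vectorized numpy sampling.
  93 | import numpy as np
  94 | xs = np.random.uniform(-100, 100, 100000)
  95 | ys = np.random.uniform(-100, 100, 100000)
  96 | inside = np.sum(xs**2 + ys**2 <= 100**2)
  97 | print ("Vectorized estimate of PI :", 4.0 * inside / 100000)
  98 |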
--------------------------------------------------------------------------------
/monte_carlo_simulation/monte_carlo_monty_hall_problem.py:
--------------------------------------------------------------------------------
1 | #Import required libraries :
2 | import random
3 | import matplotlib.pyplot as plt
4 |
5 | #We are going with 3 doors :
6 | #1 - Car
7 | #2 - Goats
8 | doors = ["goat","goat","car"]
9 |
10 | #Empty lists to store probability values :
11 | switch_win_probability = []
12 | stick_win_probability = []
13 |
14 | plt.axhline(y=0.66666, color='r', linestyle='-')
15 | plt.axhline(y=0.33333, color='g', linestyle='-')
16 |
17 | #Monte_Carlo Simulation :
18 | def monte_carlo(n):
19 |
20 | #Calculating switch and stick wins :
21 | switch_wins = 0
22 | stick_wins = 0
23 |
24 | for i in range(n):
25 |
26 | #Randomly placing the car and goats behind the three doors :
27 | random.shuffle(doors)
28 |
29 | #Contestant's choice :
30 | k = random.randrange(2)
31 |
32 | #If the contestant doesn't get car :
33 | if doors[k] != 'car':
34 | switch_wins += 1
35 |
36 | #If the contestant got car :
37 | else:
38 | stick_wins += 1
39 |
40 | #Updating the list values :
41 | switch_win_probability.append(switch_wins/(i+1))
42 | stick_win_probability.append(stick_wins/(i+1))
43 |
44 | #Plotting the data :
45 | plt.plot(switch_win_probability)
46 | plt.plot(stick_win_probability)
47 |
48 | #Print the probability values :
49 | print('Winning probability if you always switch:',switch_win_probability[-1])
50 | print('Winning probability if you always stick to your original choice:', stick_win_probability[-1])
51 |
52 |
53 | #Calling the function :
54 | monte_carlo(1000)
55 |
--------------------------------------------------------------------------------
/natural_language_processing/Natural_Language_Processing_Text.txt:
--------------------------------------------------------------------------------
1 | Once upon a time there was an old mother pig who had three little pigs and not enough food to feed them. So when they were old enough, she sent them out into the world to seek their fortunes.
2 |
3 | The first little pig was very lazy. He didn't want to work at all and he built his house out of straw. The second little pig worked a little bit harder but he was somewhat lazy too and he built his house out of sticks. Then, they sang and danced and played together the rest of the day.
4 |
5 | The third little pig worked hard all day and built his house with bricks. It was a sturdy house complete with a fine fireplace and chimney. It looked like it could withstand the strongest winds.
--------------------------------------------------------------------------------
/natural_language_processing/circle.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/towardsai/tutorials/cc12fe183d50ce6095f044d7346f30d5d0522584/natural_language_processing/circle.png
--------------------------------------------------------------------------------
/natural_language_processing/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/natural_language_processing/natural_language_processing_code.py:
--------------------------------------------------------------------------------
1 | #Open the text file :
2 | text_file = open("Natural_Language_Processing_Text.txt")
3 |
4 | #Read the data :
5 | text = text_file.read()
6 |
7 | #Datatype of the data read :
8 | print (type(text))
9 | print("\n")
10 |
11 | #Print the text :
12 | print(text)
13 | print("\n")
14 | #Length of the text :
15 | print (len(text))
16 |
17 | #Import required libraries :
18 | import nltk
19 | from nltk import sent_tokenize
20 | from nltk import word_tokenize
21 |
22 | #Tokenize the text by sentences :
23 | sentences = sent_tokenize(text)
24 |
25 | #How many sentences are there? :
26 | print (len(sentences))
27 |
28 | #Print the sentences :
29 | #print(sentences)
30 | print(sentences)
31 |
32 | #Tokenize the text with words :
33 | words = word_tokenize(text)
34 |
35 | #How many words are there? :
36 | print (len(words))
37 | print("\n")
38 |
39 | #Print words :
40 | print (words)
41 |
42 | #Import required libraries :
43 | from nltk.probability import FreqDist
44 |
45 | #Find the frequency :
46 | fdist = FreqDist(words)
47 |
48 | #Print 10 most common words :
49 | fdist.most_common(10)
50 |
51 | #Plot the graph for fdist :
52 | import matplotlib.pyplot as plt
53 |
54 | fdist.plot(10)
55 |
56 | #Empty list to store words:
57 | words_no_punc = []
58 |
59 | #Removing punctuation marks :
60 | for w in words:
61 | if w.isalpha():
62 | words_no_punc.append(w.lower())
63 |
  64 | #Print the words without punctuation marks :
65 | print (words_no_punc)
66 |
67 | print ("\n")
68 |
69 | #Length :
70 | print (len(words_no_punc))
71 |
72 | #Frequency distribution :
73 | fdist = FreqDist(words_no_punc)
74 |
75 | fdist.most_common(10)
76 |
77 |
  78 | #Plot the most common words on graph:
79 |
80 | fdist.plot(10)
81 |
82 | from nltk.corpus import stopwords
83 |
84 | #List of stopwords
85 | stopwords = stopwords.words("english")
86 | print(stopwords)
87 |
88 | #Empty list to store clean words :
89 | clean_words = []
90 |
91 | for w in words_no_punc:
92 | if w not in stopwords:
93 | clean_words.append(w)
94 |
95 | print(clean_words)
96 | print("\n")
97 | print(len(clean_words))
98 |
99 | #Frequency distribution :
100 | fdist = FreqDist(clean_words)
101 |
102 | fdist.most_common(10)
103 |
104 |
 105 | #Plot the most common words on graph:
106 |
107 | fdist.plot(10)
108 |
109 | #Library to form wordcloud :
110 | from wordcloud import WordCloud
111 |
112 | #Library to plot the wordcloud :
113 | import matplotlib.pyplot as plt
114 |
115 | #Generating the wordcloud :
116 | wordcloud = WordCloud().generate(text)
117 |
118 | #Plot the wordcloud :
119 | plt.figure(figsize = (12, 12))
120 | plt.imshow(wordcloud)
121 |
122 | #To remove the axis value :
123 | plt.axis("off")
124 | plt.show()
125 |
126 | #Import required libraries :
127 | import numpy as np
128 | from PIL import Image
129 | from wordcloud import WordCloud
130 |
131 | #Here we are going to use a circle image as mask :
132 | char_mask = np.array(Image.open("circle.png"))
133 |
134 | #Generating wordcloud :
135 | wordcloud = WordCloud(background_color="black",mask=char_mask).generate(text)
136 |
137 | #Plot the wordcloud :
138 | plt.figure(figsize = (8,8))
139 | plt.imshow(wordcloud)
140 |
141 | #To remove the axis value :
142 | plt.axis("off")
143 | plt.show()
144 |
145 | #Stemming Example :
146 |
147 | #Import stemming library :
148 | from nltk.stem import PorterStemmer
149 |
150 | porter = PorterStemmer()
151 |
152 | #Word-list for stemming :
153 | word_list = ["Study","Studying","Studies","Studied"]
154 |
155 | for w in word_list:
156 | print(porter.stem(w))
157 |
158 | #Stemming Example :
159 |
160 | #Import stemming library :
161 | from nltk.stem import SnowballStemmer
162 |
163 | snowball = SnowballStemmer("english")
164 |
165 | #Word-list for stemming :
166 | word_list = ["Study","Studying","Studies","Studied"]
167 |
168 | for w in word_list:
169 | print(snowball.stem(w))
170 |
171 | #Stemming Example :
172 |
173 | #Import stemming library :
174 | from nltk.stem import SnowballStemmer
175 |
176 | #Print languages supported :
177 | print(SnowballStemmer.languages)
178 |
179 | from nltk import WordNetLemmatizer
180 |
181 | lemma = WordNetLemmatizer()
182 | word_list = ["Study","Studying","Studies","Studied"]
183 |
184 | for w in word_list:
185 | print(lemma.lemmatize(w ,pos="v"))
186 |
187 | from nltk import WordNetLemmatizer
188 |
189 | lemma = WordNetLemmatizer()
190 | word_list = ["am","is","are","was","were"]
191 |
192 | for w in word_list:
193 | print(lemma.lemmatize(w ,pos="v"))
194 |
195 | from nltk.stem import PorterStemmer
196 |
197 | stemmer = PorterStemmer()
198 |
199 | print(stemmer.stem('studies'))
200 |
201 | from nltk.stem import WordNetLemmatizer
202 |
203 | lemmatizer = WordNetLemmatizer()
204 |
205 | print(lemmatizer.lemmatize('studies'))
206 |
207 |
208 | from nltk.stem import WordNetLemmatizer
209 |
210 | lemmatizer = WordNetLemmatizer()
211 | print(lemmatizer.lemmatize('studying', pos="v"))
212 | print(lemmatizer.lemmatize('studying', pos="n"))
213 | print(lemmatizer.lemmatize('studying', pos="a"))
214 | print(lemmatizer.lemmatize('studying', pos="r"))
215 |
216 | from nltk import WordNetLemmatizer
217 |
218 | lemma = WordNetLemmatizer()
219 | word_list = ["studies","leaves","decreases","plays"]
220 |
221 | for w in word_list:
222 | print(lemma.lemmatize(w))
223 |
224 | #PoS tagging :
225 | tag = nltk.pos_tag(["Studying","Study"])
226 | print (tag)
227 |
228 | #PoS tagging example :
229 |
230 | sentence = "A very beautiful young lady is walking on the beach"
231 |
232 | #Tokenizing words :
233 | tokenized_words = word_tokenize(sentence)
234 |
235 | for words in tokenized_words:
236 | tagged_words = nltk.pos_tag(tokenized_words)
237 |
238 | tagged_words
239 |
240 | #Extracting Noun Phrase from text :
241 |
242 | # ? - optional character
 243 | # * - 0 or more repetitions
244 | grammar = "NP : {
?*} "
245 | import matplotlib.pyplot as plt
246 | #Creating a parser :
247 | parser = nltk.RegexpParser(grammar)
248 |
249 | #Parsing text :
250 | output = parser.parse(tagged_words)
251 | print (output)
252 |
253 | #To visualize :
254 | #output.draw()
255 |
256 |
257 | #Chinking example :
 258 | # * - 0 or more repetitions
 259 | # + - 1 or more repetitions
260 |
261 | #Here we are taking the whole string and then
262 | #excluding adjectives from that chunk.
263 |
264 | grammar = r""" NP: {<.*>+}
 265 | }<JJ>+{"""
266 |
267 | #Creating parser :
268 | parser = nltk.RegexpParser(grammar)
269 |
270 | #parsing string :
271 | output = parser.parse(tagged_words)
272 | print(output)
273 |
274 | #To visualize :
275 | #output.draw()
276 |
277 |
278 | #Sentence for NER :
279 | sentence = "Mr. Smith made a deal on a beach of Switzerland near WHO."
280 |
281 | #Tokenizing words :
282 | tokenized_words = word_tokenize(sentence)
283 |
284 | #PoS tagging :
285 | for w in tokenized_words:
286 | tagged_words = nltk.pos_tag(tokenized_words)
287 |
288 | #print (tagged_words)
289 |
290 | #Named Entity Recognition :
291 | N_E_R = nltk.ne_chunk(tagged_words,binary=False)
292 | print(N_E_R)
293 |
294 | #To visualize :
295 | #N_E_R.draw()
296 |
297 |
298 | #Sentence for NER :
299 | sentence = "Mr. Smith made a deal on a beach of Switzerland near WHO."
300 |
301 | #Tokenizing words :
302 | tokenized_words = word_tokenize(sentence)
303 |
304 | #PoS tagging :
305 | for w in tokenized_words:
306 | tagged_words = nltk.pos_tag(tokenized_words)
307 |
308 | #print (tagged_words)
309 |
310 | #Named Entity Recognition :
311 | N_E_R = nltk.ne_chunk(tagged_words,binary=True)
312 |
313 | print(N_E_R)
314 |
315 | #To visualize :
316 | #N_E_R.draw()
317 |
318 | #Import wordnet :
319 | from nltk.corpus import wordnet
320 |
321 | for words in wordnet.synsets("Fun"):
322 | print(words)
323 |
324 | #Word meaning with definitions :
325 | for words in wordnet.synsets("Fun"):
326 | print(words.name())
327 | print(words.definition())
328 | print(words.examples())
329 |
330 | for lemma in words.lemmas():
331 | print(lemma)
332 | print("\n")
333 |
334 |
 335 | #How many different meanings :
336 | for words in wordnet.synsets("Fun"):
337 | for lemma in words.lemmas():
338 | print(lemma)
339 | print("\n")
340 |
341 |
342 | word = wordnet.synsets("Play")[0]
343 |
344 | #Checking name :
345 | print(word.name())
346 |
347 | #Checking definition :
348 | print(word.definition())
349 |
350 | #Checking examples:
351 | print(word.examples())
352 |
353 | word = wordnet.synsets("Play")[0]
354 |
355 | #Find more abstract term :
356 | print(word.hypernyms())
357 |
358 | word = wordnet.synsets("Play")[0]
359 |
360 | #Find more specific term :
361 | word.hyponyms()
362 |
363 | word = wordnet.synsets("Play")[0]
364 |
365 | #Get only name :
366 | print(word.lemmas()[0].name())
367 |
368 | #Finding synonyms :
369 |
370 | #Empty list to store synonyms :
371 | synonyms = []
372 |
373 | for words in wordnet.synsets('Fun'):
374 | for lemma in words.lemmas():
375 | synonyms.append(lemma.name())
376 |
377 | print(synonyms)
378 |
379 | #Finding antonyms :
380 |
381 | #Empty list to store antonyms :
382 | antonyms = []
383 |
384 | for words in wordnet.synsets('Natural'):
385 | for lemma in words.lemmas():
386 | if lemma.antonyms():
387 | antonyms.append(lemma.antonyms()[0].name())
388 |
389 | #Print antonyms :
390 | print(antonyms)
391 |
392 |
393 | #Finding synonyms and antonyms :
394 |
395 | #Empty lists to store synonyms/antonynms :
396 | synonyms = []
397 | antonyms = []
398 |
399 | for words in wordnet.synsets('New'):
400 | for lemma in words.lemmas():
401 | synonyms.append(lemma.name())
402 | if lemma.antonyms():
403 | antonyms.append(lemma.antonyms()[0].name())
404 |
405 | #Print lists :
406 | print(synonyms)
407 | print("\n")
408 | print(antonyms)
409 |
410 |
411 | #Similarity in words :
412 | word1 = wordnet.synsets("ship","n")[0]
413 |
414 | word2 = wordnet.synsets("boat","n")[0]
415 |
416 | #Check similarity :
417 | print(word1.wup_similarity(word2))
418 |
419 | #Similarity in words :
420 | word1 = wordnet.synsets("ship","n")[0]
421 |
422 | word2 = wordnet.synsets("bike","n")[0]
423 |
424 | #Check similarity :
425 | print(word1.wup_similarity(word2))
426 |
427 |
428 | #Import required libraries :
429 | from sklearn.feature_extraction.text import CountVectorizer
430 |
431 | #Text for analysis :
432 | sentences = ["Jim and Pam travelled by the bus:",
433 | "The train was late",
434 | "The flight was full.Travelling by flight is expensive"]
435 |
436 | #Create an object :
437 | cv = CountVectorizer()
438 |
439 | #Generating output for Bag of Words :
440 | B_O_W = cv.fit_transform(sentences).toarray()
441 |
442 | #Total words with their index in model :
443 | print(cv.vocabulary_)
444 | print("\n")
445 |
446 | #Features :
447 | print(cv.get_feature_names())
448 | print("\n")
449 |
450 | #Show the output :
451 | print(B_O_W)
452 |
453 |
454 | #Import required libraries :
455 | from sklearn.feature_extraction.text import TfidfVectorizer
456 |
457 | #Sentences for analysis :
458 | sentences = ['This is the first document','This document is the second document']
459 |
460 | #Create an object :
461 | vectorizer = TfidfVectorizer(norm = None)
462 |
463 | #Generating output for TF_IDF :
464 | X = vectorizer.fit_transform(sentences).toarray()
465 |
466 | #Total words with their index in model :
467 | print(vectorizer.vocabulary_)
468 | print("\n")
469 |
470 | #Features :
471 | print(vectorizer.get_feature_names())
472 | print("\n")
473 |
474 | #Show the output :
475 | print(X)
476 |
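 477 | #Optional check (illustrative addition, not part of the original script):
 478 | #With the default smooth_idf=True, sklearn computes idf as ln((1+n)/(1+df)) + 1,
 479 | #where n is the number of documents and df is the document frequency of a term;
 480 | #the fitted vectorizer exposes these values directly.
 481 | print(vectorizer.idf_)
 482 |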
--------------------------------------------------------------------------------
/natural_language_processing/semantic-analysis.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """
3 |
4 | #Semantic Analysis Using Python - NLP
5 |
6 | * Tutorial: http://news.towardsai.net/nls
7 | * Github: https://github.com/towardsai/tutorials/tree/master/natural_language_processing/semantic-analysis.py
8 | """
9 |
10 | import numpy as np
11 | import pandas as pd
12 | import matplotlib.pyplot as plt
13 | import seaborn as sns
14 |
15 | from sklearn.datasets import fetch_20newsgroups
16 |
17 | dataset = fetch_20newsgroups(shuffle=True, random_state=5, remove=('headers', 'footers', 'quotes'))
18 | df = dataset.data
19 | df
20 |
21 | new_df = pd.DataFrame({'document':df})
22 |
23 | # removing everything except alphabets
  24 | new_df['clean_doc'] = new_df['document'].str.replace("[^a-zA-Z#]", " ", regex=True)
25 |
26 | # removing short words
27 | new_df['clean_doc'] = new_df['clean_doc'].apply(lambda x: ' '.join([w for w in x.split() if len(w)>4]))
28 |
29 | # make all text lowercase
30 | new_df['clean_doc'] = new_df['clean_doc'].apply(lambda x: x.lower())
31 |
32 | from nltk.corpus import stopwords
33 | swords = stopwords.words('english')
34 |
35 | # tokenization
36 | tokenized_doc = new_df['clean_doc'].apply(lambda x: x.split())
37 |
38 | # remove stop-words
39 | tokenized_doc = tokenized_doc.apply(lambda x: [item for item in x if item not in swords])
40 |
41 | # de-tokenization
42 | detokenized_doc = []
  43 | for i in range(len(new_df)):
44 | t = ' '.join(tokenized_doc[i])
45 | detokenized_doc.append(t)
46 |
47 | new_df['clean_doc'] = detokenized_doc
48 |
49 | from sklearn.feature_extraction.text import TfidfVectorizer
50 |
51 | vectorizer = TfidfVectorizer(stop_words='english', max_features= 300, max_df = 0.5, smooth_idf=True)
52 |
  53 | X = vectorizer.fit_transform(new_df['clean_doc'])
54 |
55 | X.shape
56 |
57 | from sklearn.decomposition import TruncatedSVD
58 |
59 | svd_model = TruncatedSVD(n_components=20, algorithm='randomized', n_iter=120, random_state=100)
60 |
61 | svd_model.fit(X)
62 | len(svd_model.components_)
63 |
64 | terms = vectorizer.get_feature_names()
65 |
66 | for i, comp in enumerate(svd_model.components_):
67 | terms_comp = zip(terms, comp)
68 | sorted_terms = sorted(terms_comp, key= lambda x:x[1], reverse=True)[:7]
69 | print("Topic "+str(i)+": ")
70 | for t in sorted_terms:
71 | print(t[0])
72 | print(" ")
73 |
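  74 | # Optional check (illustrative addition, not part of the original script):
  75 | # the share of variance retained by the 20 SVD components gives a rough sense
  76 | # of how much of the TF-IDF matrix the low-rank LSA model preserves.
  77 | print("Total explained variance:", svd_model.explained_variance_ratio_.sum())
  78 |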
--------------------------------------------------------------------------------
/neural_networks_tutorial_part_1/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/neural_networks_tutorial_part_1/neural_network_part1_1.py:
--------------------------------------------------------------------------------
1 | #Import required libraries:
2 | import numpy as np
3 |
4 | #Define input features:
5 | input_features = np.array([[0,0],[0,1],[1,0],[1,1]])
6 | print (input_features.shape)
7 | print (input_features)
8 |
9 | #Define target output:
10 | target_output = np.array([[0,1,1,1]])
11 |
12 | #Reshaping our target output into vector:
13 | target_output = target_output.reshape(4,1)
14 | print(target_output.shape)
15 | print (target_output)
16 |
17 | #Define weights:
18 | weights = np.array([[0.1],[0.2]])
19 | print(weights.shape)
20 | print (weights)
21 |
22 | #Bias weight:
23 | bias = 0.3
24 |
25 | #Learning Rate:
26 | lr = 0.05
27 |
28 | #Sigmoid function:
29 | def sigmoid(x):
30 | return 1/(1+np.exp(-x))
31 |
32 | #Derivative of sigmoid function:
33 | def sigmoid_der(x):
34 | return sigmoid(x)*(1-sigmoid(x))
35 |
36 | #Main logic for neural network:
37 |
38 | # Running our code 10000 times:
39 | for epoch in range(10000):
40 | inputs = input_features
41 |
42 | #Feedforward input:
43 | in_o = np.dot(inputs, weights) + bias
44 |
45 | #Feedforward output:
46 | out_o = sigmoid(in_o)
47 |
48 | #Backpropogation
49 |
50 | #Calculating error
51 | error = out_o - target_output
52 |
53 | #Going with the formula:
54 | x = error.sum()
55 | print(x)
56 |
57 | #Calculating derivative:
58 | derror_douto = error
59 | douto_dino = sigmoid_der(out_o)
60 |
61 | #Multiplying individual derivatives:
62 |
63 | deriv = derror_douto * douto_dino
64 |
65 | #Multiplying with the 3rd individual derivative:
66 | #Finding the transpose of input_features:
67 | inputs = input_features.T
68 | deriv_final = np.dot(inputs,deriv)
69 |
70 | #Updating the weights values:
71 | weights -= lr * deriv_final
72 |
73 | #Updating the bias weight value:
74 | for i in deriv:
  75 |         bias -= lr * i
76 |
  77 | print (weights)  #Check the final values for weight and bias
78 | print (bias)
79 |
80 | #Taking inputs:
81 | single_point = np.array([1,0])
82 |
83 | #1st step:
84 | result1 = np.dot(single_point, weights) + bias
85 |
86 | #2nd step:
87 | result2 = sigmoid(result1)
88 |
89 | #Print final result
90 | print(result2)
91 |
92 | #Taking inputs:
93 | single_point = np.array([1,1])
94 |
95 | #1st step:
96 | result1 = np.dot(single_point, weights) + bias
97 |
98 | #2nd step:
99 | result2 = sigmoid(result1) #Print final result
100 | print(result2)
101 |
102 | #Taking inputs:
103 | single_point = np.array([0,0])
104 |
105 | #1st step:
106 | result1 = np.dot(single_point, weights) + bias
107 |
108 | #2nd step:
109 | result2 = sigmoid(result1)
110 |
111 | #Print final result
112 | print(result2)
113 |
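 114 | #Optional check (illustrative addition, not part of the original script):
 115 | #The network was trained on the OR truth table, so evaluating all four input
 116 | #combinations at once should give outputs close to the targets [0,1,1,1].
 117 | print(sigmoid(np.dot(input_features, weights) + bias))
 118 |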
--------------------------------------------------------------------------------
/neural_networks_tutorial_part_1/neural_network_part1_2.py:
--------------------------------------------------------------------------------
1 | # Import required libraries:
2 | import numpy as np
3 |
4 | # Define input features:
5 | input_features = np.array([[0,0],[0,1],[1,0],[1,1]])
6 | print (input_features.shape)
7 | print (input_features)
8 |
9 | # Define target output:
10 | target_output = np.array([[0,1,1,1]])
11 |
12 | # Reshaping our target output into vector:
13 | target_output = target_output.reshape(4,1)
14 | print(target_output.shape)
15 | print (target_output)
16 |
17 | # Define weights:
18 | weights = np.array([[0.1],[0.2]])
19 | print(weights.shape)
20 | print (weights)
21 |
22 | # Define learning rate:
23 | lr = 0.05
24 |
25 | # Sigmoid function:
26 | def sigmoid(x):
27 | return 1/(1+np.exp(-x))
28 |
29 | # Derivative of sigmoid function:
30 | def sigmoid_der(x):
31 | return sigmoid(x)*(1-sigmoid(x))
32 |
33 | # Main logic for neural network:
34 | # Running our code 10000 times:
35 |
36 | for epoch in range(10000):
37 | inputs = input_features
38 |
39 | #Feedforward input:
40 | pred_in = np.dot(inputs, weights)
41 |
42 | #Feedforward output:
43 | pred_out = sigmoid(pred_in)
44 |
45 | #Backpropogation
46 | #Calculating error
47 | error = pred_out - target_output
48 | x = error.sum()
49 |
50 | #Going with the formula:
51 | print(x)
52 |
53 | #Calculating derivative:
54 | dcost_dpred = error
55 | dpred_dz = sigmoid_der(pred_out)
56 |
57 | #Multiplying individual derivatives:
  58 |     z_delta = dcost_dpred * dpred_dz  #Multiplying with the 3rd individual derivative:
59 | inputs = input_features.T
60 | weights -= lr * np.dot(inputs, z_delta)
61 |
62 |
63 | #Taking inputs:
64 | single_point = np.array([1,0])
65 |
66 | #1st step:
67 | result1 = np.dot(single_point, weights)
68 |
69 | #2nd step:
70 | result2 = sigmoid(result1)
71 |
72 | #Print final result
73 | print(result2)
74 |
75 | #Taking inputs:
76 | single_point = np.array([0,0])
77 |
78 | #1st step:
79 | result1 = np.dot(single_point, weights)
80 |
81 | #2nd step:
82 | result2 = sigmoid(result1)
83 |
84 | #Print final result
85 | print(result2)
86 |
87 | #Taking inputs:
88 | single_point = np.array([1,1])
89 |
90 | #1st step:
91 | result1 = np.dot(single_point, weights)
92 |
93 | #2nd step:
94 | result2 = sigmoid(result1)
95 |
96 | #Print final result
97 | print(result2)
98 |
--------------------------------------------------------------------------------
/neural_networks_tutorial_part_1/neural_network_part1_3.py:
--------------------------------------------------------------------------------
1 | # Import required libraries:
2 | import numpy as np
3 |
4 | # Define input features:
5 | input_features = np.array([[1,0,0,1],[1,0,0,0],[0,0,1,1],
6 | [0,1,0,0],[1,1,0,0],[0,0,1,1],
7 | [0,0,0,1],[0,0,1,0]])
8 | print (input_features.shape)
9 | print (input_features)
10 |
11 | # Define target output:
12 | target_output = np.array([[1,1,0,0,1,1,0,0]])
13 |
14 | # Reshaping our target output into vector:
15 | target_output = target_output.reshape(8,1)
16 | print(target_output.shape)
17 | print (target_output)
18 |
19 | # Define weights:
20 | weights = np.array([[0.1],[0.2],[0.3],[0.4]])
21 | print(weights.shape)
22 | print (weights)
23 |
24 | # Bias weight:
25 | bias = 0.3
26 |
27 | # Learning Rate:
28 | lr = 0.05
29 |
30 | # Sigmoid function:
31 | def sigmoid(x):
32 | return 1/(1+np.exp(-x))
33 |
34 | # Derivative of sigmoid function:
35 | def sigmoid_der(x):
36 | return sigmoid(x)*(1-sigmoid(x))
37 |
38 | # Main logic for neural network:
39 | # Running our code 10000 times:
40 | for epoch in range(10000):
41 | inputs = input_features
42 |
43 | #Feedforward input:
44 | pred_in = np.dot(inputs, weights) + bias
45 |
46 | #Feedforward output:
47 | pred_out = sigmoid(pred_in)
48 |
49 | #Backpropogation
50 | #Calculating error
51 | error = pred_out - target_output
52 |
53 | #Going with the formula:
54 | x = error.sum()
55 | print(x)
56 |
57 | #Calculating derivative:
58 | dcost_dpred = error
59 | dpred_dz = sigmoid_der(pred_out)
60 |
61 | #Multiplying individual derivatives:
62 | z_delta = dcost_dpred * dpred_dz
63 |
64 | #Multiplying with the 3rd individual derivative:
65 | inputs = input_features.T
  66 |     weights -= lr * np.dot(inputs, z_delta)  #Updating the bias weight value:
67 | for i in z_delta:
68 | bias -= lr * i
69 |
70 | #Printing final weights:
71 |
72 | print (weights)
73 | print ("\n\n")
74 | print (bias)
75 |
76 | #Taking inputs:
77 | single_point = np.array([1,0,0,1])
78 |
79 | #1st step:
80 | result1 = np.dot(single_point, weights) + bias
81 |
82 | #2nd step:
83 | result2 = sigmoid(result1)
84 |
85 | #Print final result
86 | print(result2)
87 |
88 | #Taking inputs:
89 | single_point = np.array([0,0,1,0])
90 |
91 | #1st step:
92 | result1 = np.dot(single_point, weights) + bias
93 |
94 | #2nd step:
95 | result2 = sigmoid(result1)
96 |
97 | #Print final result
98 | print(result2)
99 |
100 | #Taking inputs:
101 | single_point = np.array([1,0,1,0])
102 |
103 | #1st step:
104 | result1 = np.dot(single_point, weights) + bias
105 |
106 | #2nd step:
107 | result2 = sigmoid(result1)
108 |
109 | #Print final result
110 | print(result2)
111 |
--------------------------------------------------------------------------------
/neural_networks_tutorial_part_1/neural_networks_tutorial.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "nbformat": 4,
3 | "nbformat_minor": 0,
4 | "metadata": {
5 | "colab": {
6 | "name": "neural-networks-tutorial.ipynb",
7 | "provenance": [],
8 | "authorship_tag": "ABX9TyMigAWqeY7WvQBiWBJFFzXD",
9 | "include_colab_link": true
10 | },
11 | "kernelspec": {
12 | "name": "python3",
13 | "display_name": "Python 3"
14 | }
15 | },
16 | "cells": [
17 | {
18 | "cell_type": "markdown",
19 | "metadata": {
20 | "id": "view-in-github",
21 | "colab_type": "text"
22 | },
23 | "source": [
24 |         ""
25 | ]
26 | },
27 | {
28 | "cell_type": "markdown",
29 | "metadata": {
30 | "id": "ScvnRKjsV0zS"
31 | },
32 | "source": [
33 | "# Neural Networks from Scratch with Python Code and Math in Detail\n",
34 | "\n",
35 | "* Tutorial: https://towardsai.net/p/machine-learning/building-neural-networks-from-scratch-with-python-code-and-math-in-detail-i-536fae5d7bbf \n",
36 | "\n",
37 | "* Github: https://github.com/towardsai/tutorials/tree/master/neural_networks_tutorial_part_1 \n",
38 | "\n"
39 | ]
40 | },
41 | {
42 | "cell_type": "code",
43 | "metadata": {
44 | "id": "TfbSw80eVNCS",
45 | "colab": {
46 | "base_uri": "https://localhost:8080/",
47 | "height": 312
48 | },
49 | "outputId": "ebfaf520-2318-4a80-ddec-8240ca14bafc"
50 | },
51 | "source": [
52 | "# Import required libraries:\n",
53 | "import numpy as np# Define input features:\n",
54 | "input_features = np.array([[0,0],[0,1],[1,0],[1,1]])\n",
55 | "print (input_features.shape)\n",
56 | "print (input_features)# Define target output:\n",
57 | "target_output = np.array([[0,1,1,1]])# Reshaping our target output into vector:\n",
58 | "target_output = target_output.reshape(4,1)\n",
59 | "print(target_output.shape)\n",
60 | "print (target_output)# Define weights:\n",
61 | "weights = np.array([[0.1],[0.2]])\n",
62 | "print(weights.shape)\n",
63 | "print (weights)# Bias weight:\n",
64 | "bias = 0.3# Learning Rate:\n",
65 | "lr = 0.05# Sigmoid function:\n",
66 | "def sigmoid(x):\n",
67 | " return 1/(1+np.exp(-x))# Derivative of sigmoid function:\n",
68 | "def sigmoid_der(x):\n",
69 | " return sigmoid(x)*(1-sigmoid(x))# Main logic for neural network:\n",
70 |     "# Running our code 10000 times:\nfor epoch in range(10000):\n",
71 | " inputs = input_features#Feedforward input:\n",
72 | " in_o = np.dot(inputs, weights) + bias #Feedforward output:\n",
73 |     "    out_o = sigmoid(in_o)    #Backpropagation\n",
74 | " #Calculating error\n",
75 | " error = out_o - target_output\n",
76 | " \n",
77 | " #Going with the formula:\n",
78 | " x = error.sum()\n",
79 | " print(x)\n",
80 | " \n",
81 | " #Calculating derivative:\n",
82 | " derror_douto = error\n",
83 | " douto_dino = sigmoid_der(out_o)\n",
84 | " \n",
85 | " #Multiplying individual derivatives:\n",
86 | " deriv = derror_douto * douto_dino #Multiplying with the 3rd individual derivative:\n",
87 | " #Finding the transpose of input_features:\n",
88 | " inputs = input_features.T\n",
89 | " deriv_final = np.dot(inputs,deriv)\n",
90 | " \n",
91 | " #Updating the weights values:\n",
92 | " weights -= lr * deriv_final #Updating the bias weight value:\n",
93 | " for i in deriv:\n",
94 |     "        bias -= lr * i\n\n#Check the final values for weight and bias:\nprint (weights)\n",
95 | " \n",
96 | "print (bias) #Taking inputs:\n",
97 | "single_point = np.array([1,0]) #1st step:\n",
98 | "result1 = np.dot(single_point, weights) + bias #2nd step:\n",
99 | "result2 = sigmoid(result1) #Print final result\n",
100 | "print(result2) #Taking inputs:\n",
101 | "single_point = np.array([1,1]) #1st step:\n",
102 | "result1 = np.dot(single_point, weights) + bias #2nd step:\n",
103 | "result2 = sigmoid(result1) #Print final result\n",
104 | "print(result2) #Taking inputs:\n",
105 | "single_point = np.array([0,0]) #1st step:\n",
106 | "result1 = np.dot(single_point, weights) + bias #2nd step:\n",
107 | "result2 = sigmoid(result1) #Print final result\n",
108 | "print(result2)"
109 | ],
110 | "execution_count": null,
111 | "outputs": [
112 | {
113 | "output_type": "stream",
114 | "text": [
115 | "(4, 2)\n",
116 | "[[0 0]\n",
117 | " [0 1]\n",
118 | " [1 0]\n",
119 | " [1 1]]\n",
120 | "(4, 1)\n",
121 | "[[0]\n",
122 | " [1]\n",
123 | " [1]\n",
124 | " [1]]\n",
125 | "(2, 1)\n",
126 | "[[0.1]\n",
127 | " [0.2]]\n",
128 | "0.3\n",
129 | "[0.59868766]\n",
130 | "[0.64565631]\n",
131 | "[0.57444252]\n"
132 | ],
133 | "name": "stdout"
134 | }
135 | ]
136 | },
137 | {
138 | "cell_type": "code",
139 | "metadata": {
140 | "id": "YrRCooGFcb0f",
141 | "colab": {
142 | "base_uri": "https://localhost:8080/",
143 | "height": 295
144 | },
145 | "outputId": "275a1ab6-cd9b-475f-9775-f198e161a9e3"
146 | },
147 | "source": [
148 | "# Import required libraries:\n",
149 | "import numpy as np# Define input features:\n",
150 | "input_features = np.array([[0,0],[0,1],[1,0],[1,1]])\n",
151 | "print (input_features.shape)\n",
152 | "print (input_features)# Define target output:\n",
153 | "target_output = np.array([[0,1,1,1]])# Reshaping our target output into vector:\n",
154 | "target_output = target_output.reshape(4,1)\n",
155 | "print(target_output.shape)\n",
156 | "print (target_output)# Define weights:\n",
157 | "weights = np.array([[0.1],[0.2]])\n",
158 | "print(weights.shape)\n",
159 | "print (weights)# Define learning rate:\n",
160 | "lr = 0.05# Sigmoid function:\n",
161 | "def sigmoid(x):\n",
162 | " return 1/(1+np.exp(-x))# Derivative of sigmoid function:\n",
163 | "def sigmoid_der(x):\n",
164 | " return sigmoid(x)*(1-sigmoid(x))# Main logic for neural network:\n",
165 |     "# Running our code 10000 times:\nfor epoch in range(10000):\n",
166 | " inputs = input_features#Feedforward input:\n",
167 | " pred_in = np.dot(inputs, weights)#Feedforward output:\n",
168 |     "    pred_out = sigmoid(pred_in)    #Backpropagation\n",
169 | " #Calculating error\n",
170 | " error = pred_out - target_output\n",
171 | " x = error.sum()\n",
172 | " \n",
173 | " #Going with the formula:\n",
174 | " print(x)\n",
175 | " \n",
176 | " #Calculating derivative:\n",
177 | " dcost_dpred = error\n",
178 | " dpred_dz = sigmoid_der(pred_out)\n",
179 | " \n",
180 | " #Multiplying individual derivatives:\n",
181 | " z_delta = dcost_dpred * dpred_dz#Multiplying with the 3rd individual derivative:\n",
182 | " inputs = input_features.T\n",
183 | " weights -= lr * np.dot(inputs, z_delta)\n",
184 | " \n",
185 | " \n",
186 | "#Taking inputs:\n",
187 | "single_point = np.array([1,0])#1st step:\n",
188 | "result1 = np.dot(single_point, weights)#2nd step:\n",
189 | "result2 = sigmoid(result1)#Print final result\n",
190 | "print(result2)#Taking inputs:\n",
191 | "single_point = np.array([0,0])#1st step:\n",
192 | "result1 = np.dot(single_point, weights)#2nd step:\n",
193 | "result2 = sigmoid(result1)#Print final result\n",
194 | "print(result2)#Taking inputs:\n",
195 | "single_point = np.array([1,1])#1st step:\n",
196 | "result1 = np.dot(single_point, weights)#2nd step:\n",
197 | "result2 = sigmoid(result1)#Print final result\n",
198 | "print(result2)"
199 | ],
200 | "execution_count": null,
201 | "outputs": [
202 | {
203 | "output_type": "stream",
204 | "text": [
205 | "(4, 2)\n",
206 | "[[0 0]\n",
207 | " [0 1]\n",
208 | " [1 0]\n",
209 | " [1 1]]\n",
210 | "(4, 1)\n",
211 | "[[0]\n",
212 | " [1]\n",
213 | " [1]\n",
214 | " [1]]\n",
215 | "(2, 1)\n",
216 | "[[0.1]\n",
217 | " [0.2]]\n",
218 | "[0.52497919]\n",
219 | "[0.5]\n",
220 | "[0.57444252]\n"
221 | ],
222 | "name": "stdout"
223 | }
224 | ]
225 | },
226 | {
227 | "cell_type": "code",
228 | "metadata": {
229 | "id": "ES5UHf2ufWXc",
230 | "colab": {
231 | "base_uri": "https://localhost:8080/",
232 | "height": 607
233 | },
234 | "outputId": "508da63f-9aaa-4bb0-8c56-294ab7fc0ce6"
235 | },
236 | "source": [
237 | "# Import required libraries:\n",
238 | "import numpy as np# Define input features:\n",
239 | "input_features = np.array([[1,0,0,1],[1,0,0,0],[0,0,1,1],\n",
240 | " [0,1,0,0],[1,1,0,0],[0,0,1,1],\n",
241 | " [0,0,0,1],[0,0,1,0]])\n",
242 | "print (input_features.shape)\n",
243 | "print (input_features)# Define target output:\n",
244 | "target_output = np.array([[1,1,0,0,1,1,0,0]])# Reshaping our target output into vector:\n",
245 | "target_output = target_output.reshape(8,1)\n",
246 | "print(target_output.shape)\n",
247 | "print (target_output)# Define weights:\n",
248 | "weights = np.array([[0.1],[0.2],[0.3],[0.4]])\n",
249 | "print(weights.shape)\n",
250 | "print (weights)# Bias weight:\n",
251 | "bias = 0.3# Learning Rate:\n",
252 | "lr = 0.05# Sigmoid function:\n",
253 | "def sigmoid(x):\n",
254 | " return 1/(1+np.exp(-x))# Derivative of sigmoid function:\n",
255 | "def sigmoid_der(x):\n",
256 | " return sigmoid(x)*(1-sigmoid(x))# Main logic for neural network:\n",
257 |     "# Running our code 10000 times:\nfor epoch in range(10000):\n",
258 | " inputs = input_features#Feedforward input:\n",
259 | " pred_in = np.dot(inputs, weights) + bias#Feedforward output:\n",
260 |     "    pred_out = sigmoid(pred_in)    #Backpropagation\n",
261 | " #Calculating error\n",
262 | " error = pred_out - target_output\n",
263 | " \n",
264 | " #Going with the formula:\n",
265 | " x = error.sum()\n",
266 | " print(x)\n",
267 | " \n",
268 | " #Calculating derivative:\n",
269 | " dcost_dpred = error\n",
270 | " dpred_dz = sigmoid_der(pred_out)\n",
271 | " \n",
272 | " #Multiplying individual derivatives:\n",
273 | " z_delta = dcost_dpred * dpred_dz#Multiplying with the 3rd individual derivative:\n",
274 | " inputs = input_features.T\n",
275 | " weights -= lr * np.dot(inputs, z_delta)#Updating the bias weight value:\n",
276 | " for i in z_delta:\n",
277 | " bias -= lr * i#Printing final weights: \n",
278 | "\n",
279 | "print (weights)\n",
280 | "print (\"\\n\\n\")\n",
281 | "print (bias)#Taking inputs:\n",
282 | "single_point = np.array([1,0,0,1])#1st step:\n",
283 | "result1 = np.dot(single_point, weights) + bias#2nd step:\n",
284 | "result2 = sigmoid(result1)#Print final result\n",
285 | "print(result2)#Taking inputs:\n",
286 | "single_point = np.array([0,0,1,0])#1st step:\n",
287 | "result1 = np.dot(single_point, weights) + bias#2nd step:\n",
288 | "result2 = sigmoid(result1)#Print final result\n",
289 | "print(result2)#Taking inputs:\n",
290 | "single_point = np.array([1,0,1,0])#1st step:\n",
291 | "result1 = np.dot(single_point, weights) + bias#2nd step:\n",
292 | "result2 = sigmoid(result1)#Print final result\n",
293 | "print(result2)"
294 | ],
295 | "execution_count": null,
296 | "outputs": [
297 | {
298 | "output_type": "stream",
299 | "text": [
300 | "(8, 4)\n",
301 | "[[1 0 0 1]\n",
302 | " [1 0 0 0]\n",
303 | " [0 0 1 1]\n",
304 | " [0 1 0 0]\n",
305 | " [1 1 0 0]\n",
306 | " [0 0 1 1]\n",
307 | " [0 0 0 1]\n",
308 | " [0 0 1 0]]\n",
309 | "(8, 1)\n",
310 | "[[1]\n",
311 | " [1]\n",
312 | " [0]\n",
313 | " [0]\n",
314 | " [1]\n",
315 | " [1]\n",
316 | " [0]\n",
317 | " [0]]\n",
318 | "(4, 1)\n",
319 | "[[0.1]\n",
320 | " [0.2]\n",
321 | " [0.3]\n",
322 | " [0.4]]\n",
323 | "[[0.1]\n",
324 | " [0.2]\n",
325 | " [0.3]\n",
326 | " [0.4]]\n",
327 | "\n",
328 | "\n",
329 | "\n",
330 | "0.3\n",
331 | "[0.68997448]\n",
332 | "[0.64565631]\n",
333 | "[0.66818777]\n"
334 | ],
335 | "name": "stdout"
336 | }
337 | ]
338 | }
339 | ]
340 | }
--------------------------------------------------------------------------------
/neural_networks_tutorial_part_2/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/neural_networks_tutorial_part_2/neural_networks_part2_1.py:
--------------------------------------------------------------------------------
1 | # Import required libraries :
2 | import numpy as np
3 |
4 | # Define input features :
5 | input_features = np.array([[0,0],[0,1],[1,0],[1,1]])
6 | print (input_features.shape)
7 | print (input_features)
8 |
9 | # Define target output :
10 | target_output = np.array([[0,1,1,1]])
11 |
12 | # Reshaping our target output into vector :
13 | target_output = target_output.reshape(4,1)
14 | print(target_output.shape)
15 | print (target_output)
16 |
17 | # Define weights :
18 | # 6 for hidden layer
19 | # 3 for output layer
20 | # 9 total
21 | weight_hidden = np.array([[0.1,0.2,0.3],
22 | [0.4,0.5,0.6]])
23 | weight_output = np.array([[0.7],[0.8],[0.9]])
24 |
25 | # Learning Rate :
26 | lr = 0.05
27 |
28 | # Sigmoid function :
29 | def sigmoid(x):
30 | return 1/(1+np.exp(-x))
31 |
32 | # Derivative of sigmoid function :
33 | def sigmoid_der(x):
34 | return sigmoid(x)*(1-sigmoid(x))
35 |
36 | for epoch in range(200000):
37 | # Input for hidden layer :
38 | input_hidden = np.dot(input_features, weight_hidden)
39 |
40 | # Output from hidden layer :
41 | output_hidden = sigmoid(input_hidden)
42 |
43 | # Input for output layer :
44 | input_op = np.dot(output_hidden, weight_output)
45 |
46 | # Output from output layer :
47 | output_op = sigmoid(input_op)
48 |
49 | #========================================================
50 |
51 | # Phase1
52 |
53 | # Calculating Mean Squared Error :
54 | error_out = ((1 / 2) * (np.power((output_op - target_output), 2)))
55 | print(error_out.sum())
56 |
57 | # Derivatives for phase 1 :
58 | derror_douto = output_op - target_output
59 | douto_dino = sigmoid_der(input_op)
60 | dino_dwo = output_hidden
61 | derror_dwo = np.dot(dino_dwo.T, derror_douto * douto_dino)
62 |
63 | #===========================================================
64 |
65 | # Phase 2
66 | # derror_w1 = derror_douth * douth_dinh * dinh_dw1
67 | # derror_douth = derror_dino * dino_douth
68 |
69 | # Derivatives for phase 2 :
70 | derror_dino = derror_douto * douto_dino
71 | dino_douth = weight_output
72 | derror_douth = np.dot(derror_dino , dino_douth.T)
73 | douth_dinh = sigmoid_der(input_hidden)
74 | dinh_dwh = input_features
75 | derror_wh = np.dot(dinh_dwh.T, douth_dinh * derror_douth)
76 |
77 | # Update Weights
78 | weight_hidden -= lr * derror_wh
79 | weight_output -= lr * derror_dwo
80 |
81 | # Final hidden layer weight values :
82 | print (weight_hidden)
83 |
84 | # Final output layer weight values :
85 | print (weight_output)
86 |
87 | # Predictions :
88 | #Taking inputs :
89 | single_point = np.array([1,1])
90 |
91 | #1st step :
92 | result1 = np.dot(single_point, weight_hidden)
93 |
94 | #2nd step :
95 | result2 = sigmoid(result1)
96 |
97 | #3rd step :
98 | result3 = np.dot(result2,weight_output)
99 |
100 | #4th step :
101 | result4 = sigmoid(result3)
102 | print(result4)
103 |
104 | #=================================================
105 |
106 | #Taking inputs :
107 | single_point = np.array([0,0])
108 |
109 | #1st step :
110 | result1 = np.dot(single_point, weight_hidden)
111 |
112 | #2nd step :
113 | result2 = sigmoid(result1)
114 |
115 | #3rd step :
116 | result3 = np.dot(result2,weight_output)
117 |
118 | #4th step :
119 | result4 = sigmoid(result3)
120 | print(result4)
121 |
122 | #=====================================================
123 |
124 | #Taking inputs :
125 | single_point = np.array([1,0])
126 |
127 | #1st step :
128 | result1 = np.dot(single_point, weight_hidden)
129 |
130 | #2nd step :
131 | result2 = sigmoid(result1)
132 |
133 | #3rd step :
134 | result3 = np.dot(result2,weight_output)
135 |
136 | #4th step :
137 | result4 = sigmoid(result3)
138 | print(result4)
139 |
--------------------------------------------------------------------------------
/neural_networks_tutorial_part_2/neural_networks_part2_2.py:
--------------------------------------------------------------------------------
1 | # Import required libraries :
2 | import numpy as np
3 |
4 | # Define input features :
5 | input_features = np.array([[0,0],[0,1],[1,0],[1,1]])
6 | print (input_features.shape)
7 | print (input_features)
8 | # Define target output :
9 | target_output = np.array([[0,1,1,1]])
10 | # Reshaping our target output into vector :
11 | target_output = target_output.reshape(4,1)
12 | print(target_output.shape)
13 | print (target_output)
14 |
15 | # Define weights :
16 | weights = np.array([[0.1],[0.2]])
17 | print(weights.shape)
18 | print (weights)
19 |
20 | # Define learning rate :
21 | lr = 0.05
22 |
23 | # Sigmoid function :
24 | def sigmoid(x):
25 | return 1/(1+np.exp(-x))
26 |
27 | # Derivative of sigmoid function :
28 | def sigmoid_der(x):
29 | return sigmoid(x)*(1-sigmoid(x))
30 |
31 | # Main logic for neural network :
32 | # Running our code 10000 times :
33 | for epoch in range(10000):
34 | inputs = input_features
35 |
36 | #Feedforward input :
37 | pred_in = np.dot(inputs, weights)
38 |
39 | #Feedforward output :
40 | pred_out = sigmoid(pred_in)
41 |
42 |     #Backpropagation
43 | #Calculating error
44 | error = pred_out - target_output
45 | x = error.sum()
46 |
47 | #Going with the formula :
48 | print(x)
49 |
50 | #Calculating derivative :
51 | dcost_dpred = error
52 | dpred_dz = sigmoid_der(pred_out)
53 |
54 | #Multiplying individual derivatives :
55 | z_delta = dcost_dpred * dpred_dz
56 |
57 | #Multiplying with the 3rd individual derivative :
58 | inputs = input_features.T
59 | weights -= lr * np.dot(inputs, z_delta)
60 |
61 | #Predictions :
62 |
63 | #Taking inputs :
64 | single_point = np.array([1,0])
65 |
66 | #1st step :
67 | result1 = np.dot(single_point, weights)
68 |
69 | #2nd step :
70 | result2 = sigmoid(result1)
71 |
72 | #Print final result
73 | print(result2)
74 |
75 | #====================================
76 |
77 | #Taking inputs :
78 | single_point = np.array([0,0])
79 |
80 | #1st step :
81 | result1 = np.dot(single_point, weights)
82 |
83 | #2nd step :
84 | result2 = sigmoid(result1)
85 |
86 | #Print final result
87 | print(result2)
88 |
89 | #===================================
90 | #Taking inputs :
91 | single_point = np.array([1,1])
92 |
93 | #1st step :
94 | result1 = np.dot(single_point, weights)
95 |
96 | #2nd step :
97 | result2 = sigmoid(result1)
98 |
99 | #Print final result
100 | print(result2)
101 |
--------------------------------------------------------------------------------
/neural_networks_tutorial_part_2/neural_networks_part2_3.py:
--------------------------------------------------------------------------------
1 | # Import required libraries :
2 | import numpy as np
3 |
4 | # Define input features :
5 | input_features = np.array([[0,0],[0,1],[1,0],[1,1]])
6 | print (input_features.shape)
7 | print (input_features)
8 |
9 | # Define target output :
10 | target_output = np.array([[0,1,1,0]])
11 |
12 | # Reshaping our target output into vector :
13 | target_output = target_output.reshape(4,1)
14 | print(target_output.shape)
15 | print (target_output)
16 |
17 | # Define weights :
18 | # 8 for hidden layer
19 | # 4 for output layer
20 | # 12 total
21 | weight_hidden = np.random.rand(2,4)
22 | weight_output = np.random.rand(4,1)
23 |
24 | # Learning Rate :
25 | lr = 0.05
26 |
27 | # Sigmoid function :
28 | def sigmoid(x):
29 | return 1/(1+np.exp(-x))
30 |
31 | # Derivative of sigmoid function :
32 | def sigmoid_der(x):
33 | return sigmoid(x)*(1-sigmoid(x))
34 |
35 | # Main logic :
36 | for epoch in range(200000):
37 |
38 | # Input for hidden layer :
39 | input_hidden = np.dot(input_features, weight_hidden)
40 |
41 | # Output from hidden layer :
42 | output_hidden = sigmoid(input_hidden)
43 |
44 | # Input for output layer :
45 | input_op = np.dot(output_hidden, weight_output)
46 |
47 | # Output from output layer :
48 | output_op = sigmoid(input_op)
49 |
50 | #========================================================================
51 | # Phase1
52 |
53 | # Calculating Mean Squared Error :
54 | error_out = ((1 / 2) * (np.power((output_op - target_output), 2)))
55 | print(error_out.sum())
56 |
57 | # Derivatives for phase 1 :
58 | derror_douto = output_op - target_output
59 | douto_dino = sigmoid_der(input_op)
60 | dino_dwo = output_hidden
61 | derror_dwo = np.dot(dino_dwo.T, derror_douto * douto_dino)
62 |
63 | # ========================================================================
64 | # Phase 2
65 | # derror_w1 = derror_douth * douth_dinh * dinh_dw1
66 | # derror_douth = derror_dino * dino_douth
67 |
68 | # Derivatives for phase 2 :
69 | derror_dino = derror_douto * douto_dino
70 | dino_douth = weight_output
71 | derror_douth = np.dot(derror_dino , dino_douth.T)
72 | douth_dinh = sigmoid_der(input_hidden)
73 | dinh_dwh = input_features
74 | derror_dwh = np.dot(dinh_dwh.T, douth_dinh * derror_douth)
75 |
76 | # Update Weights
77 | weight_hidden -= lr * derror_dwh
78 | weight_output -= lr * derror_dwo
79 |
80 | # Final values of weight in hidden layer :
81 | print (weight_hidden)
82 |
83 | # Final values of weight in output layer :
84 | print (weight_output)
85 |
86 | #Taking inputs :
87 | single_point = np.array([0,-1])
88 |
89 | #1st step :
90 | result1 = np.dot(single_point, weight_hidden)
91 |
92 | #2nd step :
93 | result2 = sigmoid(result1)
94 |
95 | #3rd step :
96 | result3 = np.dot(result2,weight_output)
97 |
98 | #4th step :
99 | result4 = sigmoid(result3)
100 | print(result4)
101 |
102 | #Taking inputs :
103 | single_point = np.array([0,5])
104 |
105 | #1st step :
106 | result1 = np.dot(single_point, weight_hidden)
107 |
108 | #2nd step :
109 | result2 = sigmoid(result1)
110 |
111 | #3rd step :
112 | result3 = np.dot(result2,weight_output)
113 |
114 | #4th step :
115 | result4 = sigmoid(result3)
116 | print(result4)
117 |
118 | #Taking inputs :
119 | single_point = np.array([1,1.2])
120 |
121 | #1st step :
122 | result1 = np.dot(single_point, weight_hidden)
123 |
124 | #2nd step :
125 | result2 = sigmoid(result1)
126 |
127 | #3rd step :
128 | result3 = np.dot(result2,weight_output)
129 |
130 | #4th step :
131 | result4 = sigmoid(result3)
132 | print(result4)
133 |
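134 | # --- Illustrative addition (not part of the original tutorial) ---
135 | # Checking the trained two-layer network against the full XOR truth table,
136 | # reusing `weight_hidden`, `weight_output` and `sigmoid` defined above.
137 | for point, target in zip(input_features, target_output):
138 |     hidden_out = sigmoid(np.dot(point, weight_hidden))
139 |     prediction = sigmoid(np.dot(hidden_out, weight_output))
140 |     print(point, "target:", target, "prediction:", prediction)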
--------------------------------------------------------------------------------
/pandas/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/pandas/pd-melt.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """
3 | #Understanding Pandas Melt - pd.melt()
4 |
5 | * Tutorial: https://news.towardsai.net/pdm
6 | * Github: https://github.com/towardsai/tutorials/tree/master/pandas/pd-melt.py
7 | """
8 |
9 | #Import Required Libraries:
10 | import pandas as pd
11 |
12 | #Raw data in form of dictionary:
13 | data = {"Person":["Alan","Berta","Charlie","Danielle"], #Name of Person
14 | "House":["A","B","A","C"], #Name of houses they live in
15 | "Age":[32,46,35,28], #Age of Person
16 | "Books":[100,30,20,40], #Number of books owned
17 |         "Movies":[10,20,80,60] #Number of movies watched
18 | }
19 |
20 | #Converting the raw data into pandas DataFrame:
21 | data_wide = pd.DataFrame(data)
22 |
23 | #Printing the pandas DataFrame:
24 | data_wide
25 |
26 | #Melting the DataFrame from wide to long format:
27 | #Without specifying any parameters:
28 |
29 | data_wide.melt()
30 |
31 | #Melting the DataFrame from wide to long format:
32 | #id_vars
33 |
34 | data_wide.melt(id_vars=["Person","House"]) #Identifier columns
35 |
36 | #Melting the DataFrame from wide to long format:
37 | #id_vars
38 | #value_vars
39 |
40 | data_wide.melt(id_vars=["Person","House"], #Identifier columns
41 | value_vars=["Age","Books","Movies"]) #Columns to be melted
42 |
43 | #Melting the DataFrame from wide to long format:
44 | #id_vars
45 | #value_vars
46 |
47 | data_wide.melt(id_vars=["Person"], #Identifier columns
48 | value_vars=["Books","Movies"]) #Columns to be melted
49 |
50 | #Melting the DataFrame from wide to long format:
51 | #id_vars
52 | #value_vars
53 | #var_name
54 | #value_name
55 |
56 | data_wide.melt(id_vars=["Person","House"], #Identifier columns
57 | value_vars=["Age","Books","Movies"], #Columns to be melted
58 | var_name="Info", #Renaming the variable column name
59 | value_name="Numerical") #Renaming the value column name
60 |
61 | #Melting the DataFrame from wide to long format:
62 | #id_vars
63 | #value_vars
64 | #var_name
65 | #value_name
66 |
67 | data_wide.melt(id_vars=["Person"], #Identifier columns
68 | value_vars=["Books","Movies"], #Columns to be melted
69 | var_name="Info", #Renaming the variable column name
70 | value_name="Numerical") #Renaming the value column name
71 |
72 | #Melting the DataFrame from wide to long format:
73 | #id_vars
74 | #value_vars
75 | #var_name
76 | #value_name
77 | #ignore_index
78 |
79 | data_wide.melt(id_vars=["Person","House"], #Identifier columns
80 | value_vars=["Age","Books","Movies"], #Columns to be melted
81 | var_name="Info", #Renaming the variable column name
82 | value_name="Numerical", #Renaming the value column name
83 | ignore_index=False) #Using the original index
84 |
85 | #Creating multiple indexes for columns:
86 | data_wide.columns = [["Person","House","Age","Books","Movies"],
87 | ["Name","Flat","Old","Text","Video"]]
88 |
89 | #Printing the DataFrame:
90 | data_wide
91 |
92 | #Melting the DataFrame from wide to long format:
93 | #id_vars
94 | #value_vars
95 | #var_name
96 | #value_name
97 | #col_level
98 |
99 | data_wide.melt(id_vars=["Person","House"], #Identifier columns
100 | value_vars=["Age","Books","Movies"], #Columns to be melted
101 | var_name="Info", #Renaming the variable column name
102 | value_name="Numerical", #Renaming the value column name
103 | col_level=0) #Using the 0th column level index
104 |
105 | #Melting the DataFrame from wide to long format:
106 | #id_vars
107 | #value_vars
108 | #var_name
109 | #value_name
110 | #col_level
111 |
112 | data_wide.melt(id_vars=["Name","Flat"], #Identifier columns
113 | value_vars=["Old","Text","Video"], #Columns to be melted
114 | var_name="Info", #Renaming the variable column name
115 | value_name="Numerical", #Renaming the value column name
116 | col_level=1) #Using the 1st column level index
117 |
118 | #Melting the DataFrame from wide to long format:
119 | #id_vars
120 | #value_vars
121 | #var_name
122 | #value_name
123 | #col_level
124 | #ignore_index
125 |
126 | data_wide.melt(id_vars=["Name","Flat"], #Identifier columns
127 | value_vars=["Old","Text","Video"], #Columns to be melted
128 | var_name="Info", #Renaming the variable column name
129 | value_name="Numerical", #Renaming the value column name
130 | ignore_index=False, #Using the original index
131 | col_level=1) #Using the 1st column level index
132 |
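133 | #Illustrative addition (not part of the original tutorial):
134 | #A melted (long) DataFrame can usually be reshaped back to wide format
135 | #with pivot(); a minimal sketch using the original data dictionary:
136 | 
137 | data_long = pd.DataFrame(data).melt(id_vars=["Person"])
138 | data_long.pivot(index="Person", columns="variable", values="value")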
--------------------------------------------------------------------------------
/pandas/pd_dropna().py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """
3 | #Handling Missing Values in Pandas
4 |
5 | * Tutorial: https://news.towardsai.net/hmv
6 | * Github: https://github.com/towardsai/tutorials/tree/master/pandas
7 | """
8 |
9 | #Import Required Libraries:
10 | import pandas as pd
11 |
12 | #Raw data in form of dictionary:
13 | info = {"Person":["Alan","Berta","Charlie","Danielle","Euler",pd.NA], #Name of Person.
14 | "Age":[32,45,35,28,30,pd.NA], #Age of Person.
15 | "Degree":["CS","Biology","Physics",pd.NA,"Physics","CS"], #Major.
16 | "Country":["USA","Mexico","USA","Canada","USA","Canada"], #Country of study.
17 | "Books":[10,pd.NA,30,40,50,60], #Books owned.
18 | "Batch Size":[200,100,50,200,50,pd.NA] #Batch Size.
19 | }
20 |
21 | #Converting the raw data into DataFrame:
22 | data = pd.DataFrame(info)
23 |
24 | #Printing the DataFrame:
25 | data
26 |
27 | #Dropping the rows where at least one element is missing.
28 |
29 | data.dropna()
30 |
31 | #Drop the rows where at least one element is missing.
32 |
33 | data.dropna(axis=0)
34 |
35 | #Drop the rows where at least one element is missing.
36 |
37 | data.dropna(axis="rows")
38 |
39 | #Drop the columns where at least one element is missing.
40 |
41 | data.dropna(axis=1)
42 |
43 | #Drop the columns where at least one element is missing.
44 |
45 | data.dropna(axis="columns")
46 |
47 | #Drop the rows where at least one element is missing.
48 |
49 | data.dropna(how="any")
50 |
51 | #Import Required Libraries:
52 | import pandas as pd
53 |
54 | #Raw data in form of dictionary:
55 | info = {"Person":["Alan","Berta",pd.NA,"Charlie","Danielle","Euler"], #Name of Person.
56 | "Age":[32,45,pd.NA,35,28,30], #Age of Person.
57 | "Degree":["CS","Biology",pd.NA,"Physics",pd.NA,"Physics"], #Major.
58 | "Country":["USA","Mexico",pd.NA,"USA","Canada","USA"], #Country of study.
59 | "Books":[10,pd.NA,pd.NA,30,40,50], #Books owned.
60 | "Batch Size":[200,100,pd.NA,50,200,50] #Batch Size.
61 | }
62 |
63 | #Converting the raw data into DataFrame:
64 | data = pd.DataFrame(info)
65 |
66 | #Printing the DataFrame:
67 | data
68 |
69 | #Drop the rows if all elements are missing.
70 |
71 | data.dropna(how="all")
72 |
73 | #Keep the rows with at least 5 non missing elements.
74 |
75 | data.dropna(thresh=5)
76 |
77 | #Import Required Libraries:
78 | import pandas as pd
79 |
80 | #Raw data in form of dictionary:
81 | info = {"Person":["Alan","Berta",pd.NA,"Charlie","Danielle","Euler"], #Name of Person.
82 | "Age":[32,pd.NA,pd.NA,35,pd.NA,30], #Age of Person.
83 | "Degree":["CS","Biology",pd.NA,"Physics",pd.NA,"Physics"], #Major.
84 | "Country":["USA",pd.NA,pd.NA,"USA","Canada","USA"], #Country of study.
85 | "Books":[10,pd.NA,pd.NA,30,40,50], #Books owned.
86 | "Batch Size":[200,100,pd.NA,50,200,50] #Batch Size.
87 | }
88 |
89 | #Converting the raw data into DataFrame:
90 | data = pd.DataFrame(info)
91 |
92 | #Printing the DataFrame:
93 | data
94 |
95 | #Define in which columns to look for missing elements.
96 |
97 | data.dropna(subset=["Person","Degree","Country"])
98 |
99 | #Import Required Libraries:
100 | import pandas as pd
101 |
102 | #Raw data in form of dictionary:
103 | info = {"Person":["Alan","Berta","Charlie","Danielle","Euler",pd.NA], #Name of Person.
104 | "Age":[32,45,35,28,30,pd.NA], #Age of Person.
105 | "Degree":["CS","Biology","Physics",pd.NA,"Physics","CS"], #Major.
106 | "Country":["USA","Mexico","USA","Canada","USA","Canada"], #Country of study.
107 | "Books":[10,pd.NA,30,40,50,60], #Books owned.
108 | "Batch Size":[200,100,50,200,50,pd.NA] #Batch Size.
109 | }
110 |
111 | #Converting the raw data into DataFrame:
112 | data = pd.DataFrame(info)
113 |
114 | #Printing the DataFrame:
115 | data
116 |
117 | #inplace=True makes the changes in the original DataFrame itself.
118 | #It returns None.
119 |
120 | data.dropna(inplace=True)
121 | data
122 |
--------------------------------------------------------------------------------
/pandas/pd_fillna().py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """
3 | #Handling Missing Values in Pandas
4 |
5 | * Tutorial: https://news.towardsai.net/hmv
6 | * Github: https://github.com/towardsai/tutorials/tree/master/pandas
7 | """
8 |
9 | #Import Required Libraries:
10 | import pandas as pd
11 |
12 | #Raw data in form of dictionary:
13 | info = {"Person":["Alan","Berta","Charlie","Danielle","Euler",pd.NA], #Name of Person.
14 | "Age":[32,45,35,28,30,pd.NA], #Age of Person.
15 | "Degree":["CS",pd.NA,pd.NA,pd.NA,"Physics",pd.NA], #Major.
16 | "Country":["USA","Mexico","USA",pd.NA,"USA",pd.NA], #Country of study.
17 | "Books":[10,pd.NA,30,pd.NA,50,60], #Books owned.
18 | "Batch Size":[200,100,50,200,50,pd.NA] #Batch Size.
19 | }
20 |
21 | #Converting the raw data into DataFrame:
22 | data = pd.DataFrame(info)
23 |
24 | #Printing the DataFrame:
25 | data
26 |
27 | #Replacing all missing values with 0:
28 |
29 | data.fillna(value=0)
30 |
31 | #Replacing values:
32 |
33 | #Values to be used for specific column:
34 | values = {"Person":"---", "Age":0, "Degree":"---","Books":0,"Batch Size":0}
35 |
36 | #Replacing the values:
37 | data.fillna(value=values)
38 |
39 | #Using method="ffill":
40 |
41 | data.fillna(method="ffill")
42 |
43 | #Using method="pad":
44 | #Same as method="ffill"
45 |
46 | data.fillna(method="pad")
47 |
48 | #Using method="ffill":
49 | #Using axis=0 (Default)
50 |
51 | data.fillna(method="ffill", axis=0)
52 |
53 | #Using method="ffill":
54 | #Using axis=1
55 |
56 | data.fillna(method="ffill", axis=1)
57 |
58 | #Using method="bfill":
59 |
60 | data.fillna(method="bfill")
61 |
62 | #Using method="backfill":
63 | #Same as method="bfill"
64 |
65 | data.fillna(method="backfill")
66 |
67 | #Using method="bfill":
68 | #Using axis=0 (Default)
69 |
70 | data.fillna(method="bfill",axis=0)
71 |
72 | #Using method="bfill":
73 | #Using axis=1:
74 |
75 | data.fillna(method="bfill",axis=1)
76 |
77 | #Using method="ffill":
78 | #Using axis=0:
79 | #Using limit=1:
80 |
81 | data.fillna(method="ffill",axis=0, limit=1)
82 |
83 | #Using method="ffill":
84 | #Using axis=1:
85 | #Using limit=1:
86 |
87 | data.fillna(method="ffill",axis=1, limit=1)
88 |
89 | #Using method="bfill":
90 | #Using axis=0:
91 | #Using limit=1:
92 |
93 | data.fillna(method="bfill",axis=0, limit=1)
94 |
95 | #Using method="bfill":
96 | #Using axis=1:
97 | #Using limit=1:
98 |
99 | data.fillna(method="bfill",axis=1, limit=1)
100 |
101 | #Import Required Libraries:
102 | import pandas as pd
103 |
104 | #Raw data in form of dictionary:
105 | info = {"Age":[32.0,45.0,35.0,28.0,30.0,40.0],
106 | "Books":[10.0,pd.NA,30.0,40.0,50.0,60.0],
107 | "Batch Size":[200,100,50,200,50,300]
108 | }
109 |
110 | #Converting the raw data into DataFrame:
111 | data = pd.DataFrame(info)
112 |
113 | #Printing the DataFrame:
114 | data
115 |
116 | data.dtypes
117 |
118 | a = data.fillna(0,downcast="infer")
119 | a
120 |
121 | a.dtypes
122 |
123 | #inplace=True makes the changes in the original DataFrame itself.
124 | #It returns None.
125 |
126 | data.fillna(value=0,inplace=True)
127 | data
128 |
129 | #Import Required Libraries:
130 | import pandas as pd
131 |
132 | #Raw data in form of dictionary:
133 | info = {"Person":["Alan","Berta","Charlie","Danielle","Euler",pd.NA], #Name of Person.
134 | "Age":[32,45,35,28,30,pd.NA], #Age of Person.
135 | "Degree":["CS",pd.NA,pd.NA,pd.NA,"Physics",pd.NA], #Major.
136 | "Country":["USA","Mexico","USA",pd.NA,"USA",pd.NA], #Country of study.
137 | "Books":[10,pd.NA,30,pd.NA,50,60], #Books owned.
138 | "Batch Size":[200,100,50,200,50,pd.NA] #Batch Size.
139 | }
140 |
141 | #Converting the raw data into DataFrame:
142 | data = pd.DataFrame(info)
143 |
144 | #Printing the DataFrame:
145 | data
146 |
147 | #Using pd.DataFrame.bfill():
148 |
149 | data.bfill()
150 |
151 | #Using pd.DataFrame.backfill():
152 |
153 | data.backfill()
154 |
155 | #Using pd.DataFrame.ffill():
156 |
157 | data.ffill()
158 |
159 | #Using pd.DataFrame.pad():
160 |
161 | data.pad()
162 |
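163 | #Illustrative addition (not part of the original tutorial):
164 | #A common alternative to constant fills is replacing missing values
165 | #with each column's mean, shown here on a small numeric DataFrame.
166 | 
167 | import numpy as np
168 | 
169 | num = pd.DataFrame({"Age":[32,45,np.nan,28],
170 |                     "Books":[10,np.nan,30,40]})
171 | num.fillna(num.mean())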
--------------------------------------------------------------------------------
/pandas/pd_isna().py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """
3 | #Handling Missing Values in Pandas
4 |
5 | * Tutorial: https://news.towardsai.net/hmv
6 | * Github
7 | """
8 |
9 | #Import Required Libraries:
10 |
11 | import numpy as np
12 | import pandas as pd
13 |
14 | #Scalar arguments:
15 | #Numerical value
16 |
17 | pd.isna(28)
18 |
19 | #Scalar arguments:
20 | #String value
21 |
22 | pd.isna("Pratik")
23 |
24 | #Scalar arguments:
25 | #Empty strings are not considered as NA values
26 |
27 | pd.isna("")
28 |
29 | #Scalar arguments:
30 | #Infinite values are not considered as NA values
31 |
32 | pd.isna(np.inf)
33 |
34 | #Scalar arguments:
35 | #NaN: Not a Number
36 |
37 | pd.isna(np.NaN)
38 |
39 | #Scalar arguments:
40 | #None
41 |
42 | pd.isna(None)
43 |
44 | #Scalar arguments:
45 | #NA: Not Available
46 |
47 | pd.isna(pd.NA)
48 |
49 | #Scalar arguments:
50 | #NaT: Not a Timestamp
51 |
52 | pd.isna(pd.NaT)
53 |
54 | #nd-arrays:
55 |
56 | arr = np.array([1,2,"Blue"])
57 | print(arr)
58 | print("\n")
59 | pd.isna(arr)
60 |
61 | #nd-arrays:
62 | #Empty strings are not considered as NA values
63 |
64 | arr = np.array([[1,2,None],
65 | [3,4,pd.NA],
66 | [5,np.NaN,6],
67 | ["",7,8],
68 | ["Blue",pd.NaT,"Red"]])
69 |
70 | print(arr)
71 | print("\n")
72 | pd.isna(arr)
73 |
74 | #For index values:
75 |
76 | id = pd.Index([1,2,np.NaN,"Blue"])
77 | print(id)
78 | print("\n")
79 | pd.isna(id)
80 |
81 | #For index values:
82 |
83 | id = pd.DatetimeIndex([pd.Timestamp("2020-10-28"),
84 | pd.Timestamp(""),
85 | None,
86 | np.NaN,
87 | pd.NA,
88 | pd.NaT])
89 |
90 | print(id)
91 | print("\n")
92 | pd.isna(id)
93 |
94 | #Series:
95 |
96 | s = pd.Series([1,2,3,None,4,np.NaN,pd.NA,pd.NaT,"Blue"])
97 | print(s)
98 | print("\n")
99 | pd.isna(s)
100 |
101 | #DataFrame:
102 |
103 | df = pd.DataFrame({"Name":["Alan","Berta","Charlie",None],
104 | "Age":[32,45,np.NaN,28],
105 | "Birthday":[pd.NaT,None,pd.Timestamp("1975-10-28"),np.NaN],
106 | "Country":["USA","","USA","Canada"]
107 | })
108 |
109 | print(df)
110 | print("\n")
111 | pd.isna(df)
112 |
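113 | #Illustrative addition (not part of the original tutorial):
114 | #pd.isna() is often combined with sum() to count missing values per column:
115 | 
116 | pd.isna(df).sum()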
--------------------------------------------------------------------------------
/pandas/pd_isnull().py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """
3 | #Handling Missing Values in Pandas
4 |
5 | * Tutorial: https://news.towardsai.net/hmv
6 | * Github
7 | """
8 |
9 | #Import Required Libraries:
10 |
11 | import numpy as np
12 | import pandas as pd
13 |
14 | #Scalar arguments:
15 | #Numerical value
16 |
17 | pd.isnull(28)
18 |
19 | #Scalar arguments:
20 | #String value
21 |
22 | pd.isnull("Pratik")
23 |
24 | #Scalar arguments:
25 | #Empty strings are not considered as null values
26 |
27 | pd.isnull("")
28 |
29 | #Scalar arguments:
30 | #Infinite values are not considered as null values
31 |
32 | pd.isnull(np.inf)
33 |
34 | #Scalar arguments:
35 | #NaN: Not a Number
36 |
37 | pd.isnull(np.NaN)
38 |
39 | #Scalar arguments:
40 | #None
41 |
42 | pd.isnull(None)
43 |
44 | #Scalar arguments:
45 | #NA: Not Available
46 |
47 | pd.isnull(pd.NA)
48 |
49 | #Scalar arguments:
50 | #NaT: Not a Timestamp
51 |
52 | pd.isnull(pd.NaT)
53 |
54 | #nd-arrays:
55 |
56 | arr = np.array([1,2,"Blue"])
57 | print(arr)
58 | print("\n")
59 | pd.isnull(arr)
60 |
61 | #nd-arrays:
62 | #Empty strings are not considered as NA values
63 |
64 | arr = np.array([[1,2,None],
65 | [3,4,pd.NA],
66 | [5,np.NaN,6],
67 | ["",7,8],
68 | ["Blue",pd.NaT,"Red"]])
69 |
70 | print(arr)
71 | print("\n")
72 | pd.isnull(arr)
73 |
74 | #For index values:
75 |
76 | id = pd.Index([1,2,np.NaN,"Blue"])
77 | print(id)
78 | print("\n")
79 | pd.isnull(id)
80 |
81 | #For index values:
82 |
83 | id = pd.DatetimeIndex([pd.Timestamp("2020-10-28"),
84 | pd.Timestamp(""),
85 | None,
86 | np.NaN,
87 | pd.NA,
88 | pd.NaT])
89 |
90 | print(id)
91 | print("\n")
92 | pd.isnull(id)
93 |
94 | #Series:
95 |
96 | s = pd.Series([1,2,3,None,4,np.NaN,pd.NA,pd.NaT,"Blue"])
97 | print(s)
98 | print("\n")
99 | pd.isnull(s)
100 |
101 | #DataFrame:
102 |
103 | df = pd.DataFrame({"Name":["Alan","Berta","Charlie",None],
104 | "Age":[32,45,np.NaN,28],
105 | "Birthday":[pd.NaT,None,pd.Timestamp("1975-10-28"),np.NaN],
106 | "Country":["USA","","USA","Canada"]
107 | })
108 |
109 | print(df)
110 | print("\n")
111 | pd.isnull(df)
112 |
--------------------------------------------------------------------------------
/pandas/pd_join().py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """
3 | #Diving Into Pandas Join - pd.join()
4 |
5 | * Tutorial: https://news.towardsai.net/xfr
6 | * Github: https://github.com/towardsai/tutorials/tree/master/pandas
7 | """
8 |
9 | #Import Required Libraries:
10 |
11 | import pandas as pd
12 | import numpy as np
13 |
14 | #Creating first DataFrame:
15 |
16 | data1 = {"Name":["Alan","Berta","Charlie","Danielle","Euler"],
17 | "Age":[32,45,35,28,30]}
18 |
19 | df1 = pd.DataFrame(data1)
20 | df1
21 |
22 | #Creating second DataFrame:
23 |
24 | data2 = {"Name":["Berta","Charlie","Danielle","Euler","Frank"],
25 | "Money":[5000,20000,15000,10000,5000]}
26 |
27 | df2 = pd.DataFrame(data2)
28 | df2
29 |
30 | #Creating Third DataFrame:
31 |
32 | data3 = {"Name":["Berta","Charlie","Danielle"],
33 | "Books":[10,20,30]}
34 |
35 | df3 = pd.DataFrame(data3)
36 | df3
37 |
38 | #Creating fourth DataFrame:
39 |
40 | data4 = {"Name":["Alan","Berta","Charlie","Danielle","Euler"],
41 | "Age":[32,45,35,28,30]}
42 |
43 | df4 = pd.DataFrame(data4)
44 | df4
45 |
46 | #Creating fifth DataFrame:
47 |
48 | data5 = {"Money":[5000,20000,15000,10000,5000],
49 | "Books":[10,20,30,40,50]}
50 |
51 | df5 = pd.DataFrame(data5)
52 | df5
53 |
54 | #Creating sixth DataFrame:
55 |
56 | data6 = {"Name":["Euler","Danielle","Charlie","Berta","Alan"],
57 | "Age":[30,28,35,45,32]}
58 |
59 | df6 = pd.DataFrame(data6)
60 | df6
61 |
62 | #Creating seventh DataFrame:
63 |
64 | data7 = {"Name":["Rose","Patrick","Euler","Danielle"],
65 | "Money":[10,20,30,40]}
66 |
67 | df7 = pd.DataFrame(data7)
68 | df7
69 |
70 | #Using the join() function:
71 |
72 | df1.join(df2)
73 |
74 | #Using the join() function:
75 |
76 | df1.join(df2,lsuffix="_Left",rsuffix="_Right")
77 |
78 | #Using the join() function:
79 |
80 | df1.join(df2,lsuffix="_Left")
81 |
82 | #Using the join() function:
83 |
84 | df1.join(df2,rsuffix="_Right")
85 |
86 | #Using join() function:
87 |
88 | df4.join(df5)
89 |
90 | #Using the join() Function:
91 |
92 | df1.set_index("Name").join(df2.set_index("Name"))
93 |
94 | #Using the join() function:
95 |
96 | df1.join(df2.set_index("Name"),on="Name")
97 |
98 | #Using the join() function:
99 |
100 | df1.set_index("Name").join([df2.set_index("Name"),df3.set_index("Name")])
101 |
102 | #Creating a Series:
103 |
104 | s = pd.Series(["A","B","C","D"],name="Initial")
105 | s
106 |
107 | #Using the join() Function:
108 |
109 | df1.join(s)
110 |
111 | #Using the join() Function:
112 |
113 | df1.set_index("Name").join(df2.set_index("Name"),how="left")
114 |
115 | #Using the join() Function:
116 |
117 | df1.set_index("Name").join(df2.set_index("Name"),how="right")
118 |
119 | #Using join() function:
120 |
121 | df1.set_index("Name").join(df2.set_index("Name"), how="outer")
122 |
123 | #Using join() function:
124 |
125 | df1.set_index("Name").join(df2.set_index("Name"), how="inner")
126 |
127 | #Default how = left:
128 | #how=left
129 | #sort=False
130 |
131 | df6.set_index("Name").join(other=df7.set_index("Name"),how="left",sort=False)
132 |
133 | #Default how = left:
134 | #how=left
135 | #sort=True
136 |
137 | df6.set_index("Name").join(other=df7.set_index("Name"),how="left",sort=True)
138 |
139 | #how=right
140 | #sort=False
141 |
142 | df6.set_index("Name").join(other=df7.set_index("Name"),how="right",sort=False)
143 |
144 | #how=right
145 | #sort=True
146 |
147 | df6.set_index("Name").join(other=df7.set_index("Name"),how="right",sort=True)
148 |
149 | #how=outer
150 | #sort=False
151 |
152 | df6.set_index("Name").join(other=df7.set_index("Name"),how="outer",sort=False)
153 |
154 | #how=outer
155 | #sort=True
156 |
157 | df6.set_index("Name").join(other=df7.set_index("Name"),how="outer",sort=True)
158 |
159 | #how=inner
160 | #sort=False
161 |
162 | df6.set_index("Name").join(other=df7.set_index("Name"),how="inner",sort=False)
163 |
164 | #how=inner
165 | #sort=True
166 |
167 | df6.set_index("Name").join(other=df7.set_index("Name"),how="inner",sort=True)
168 |
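169 | #Illustrative addition (not part of the original tutorial):
170 | #join() works on indexes; the closely related pd.merge() joins on columns
171 | #directly, so the inner join above can also be written as:
172 | 
173 | pd.merge(df6, df7, on="Name", how="inner")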
--------------------------------------------------------------------------------
/pandas/pd_notna().py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """
3 | #Handling Missing Values in Pandas
4 |
5 | * Tutorial: https://news.towardsai.net/hmv
6 | * Github
7 | """
8 |
9 | #Import Required Libraries:
10 |
11 | import numpy as np
12 | import pandas as pd
13 |
14 | #Scalar arguments:
15 | #Numerical value
16 |
17 | pd.notna(28)
18 |
19 | #Scalar arguments:
20 | #String value
21 |
22 | pd.notna("Pratik")
23 |
24 | #Scalar arguments:
25 | #Empty strings are not considered as NA values
26 |
27 | pd.notna("")
28 |
29 | #Scalar arguments:
30 | #Infinite values are not considered as NA values
31 |
32 | pd.notna(np.inf)
33 |
34 | #Scalar arguments:
35 | #NaN: Not a Number
36 |
37 | pd.notna(np.NaN)
38 |
39 | #Scalar arguments:
40 | #None
41 |
42 | pd.notna(None)
43 |
44 | #Scalar arguments:
45 | #NA: Not Available
46 |
47 | pd.notna(pd.NA)
48 |
49 | #Scalar arguments:
50 | #NaT: Not a Timestamp
51 |
52 | pd.notna(pd.NaT)
53 |
54 | #nd-arrays:
55 |
56 | arr = np.array([1,2,"Blue"])
57 | print(arr)
58 | print("\n")
59 | pd.notna(arr)
60 |
61 | #nd-arrays:
62 | #Empty strings are not considered as NA values
63 |
64 | arr = np.array([[1,2,None],
65 | [3,4,pd.NA],
66 | [5,np.NaN,6],
67 | ["",7,8],
68 | ["Blue",pd.NaT,"Red"]])
69 |
70 | print(arr)
71 | print("\n")
72 | pd.notna(arr)
73 |
74 | #For index values:
75 |
76 | id = pd.Index([1,2,np.NaN,"Blue"])
77 | print(id)
78 | print("\n")
79 | pd.notna(id)
80 |
81 | #For index values:
82 |
83 | id = pd.DatetimeIndex([pd.Timestamp("2020-10-28"),
84 | pd.Timestamp(""),
85 | None,
86 | np.NaN,
87 | pd.NA,
88 | pd.NaT])
89 |
90 | print(id)
91 | print("\n")
92 | pd.notna(id)
93 |
94 | #Series:
95 |
96 | s = pd.Series([1,2,3,None,4,np.NaN,pd.NA,pd.NaT,"Blue"])
97 | print(s)
98 | print("\n")
99 | pd.notna(s)
100 |
101 | #DataFrame:
102 |
103 | df = pd.DataFrame({"Name":["Alan","Berta","Charlie",None],
104 | "Age":[32,45,np.NaN,28],
105 | "Birthday":[pd.NaT,None,pd.Timestamp("1975-10-28"),np.NaN],
106 | "Country":["USA","","USA","Canada"]
107 | })
108 |
109 | print(df)
110 | print("\n")
111 | pd.notna(df)
112 |
113 | pd.isnull
114 |
115 | pd.notnull
116 |
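117 | #Illustrative addition (not part of the original tutorial):
118 | #isnull/notnull are aliases of isna/notna, which the checks below confirm:
119 | 
120 | print(pd.isnull is pd.isna)
121 | print(pd.notnull is pd.notna)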
--------------------------------------------------------------------------------
/pandas/pd_notnull().py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """
3 | #Handling Missing Values in Pandas
4 |
5 | * Tutorial: https://news.towardsai.net/hmv
6 | * Github
7 | """
8 |
9 | #Import Required Libraries:
10 |
11 | import numpy as np
12 | import pandas as pd
13 |
14 | #Scalar arguments:
15 | #Numerical value
16 |
17 | pd.notnull(28)
18 |
19 | #Scalar arguments:
20 | #String value
21 |
22 | pd.notnull("Pratik")
23 |
24 | #Scalar arguments:
25 | #Empty strings are not considered as null values
26 |
27 | pd.notnull("")
28 |
29 | #Scalar arguments:
30 | #Infinite values are not considered as null values
31 |
32 | pd.notnull(np.inf)
33 |
34 | #Scalar arguments:
35 | #NaN: Not a Number
36 |
37 | pd.notnull(np.NaN)
38 |
39 | #Scalar arguments:
40 | #None
41 |
42 | pd.notnull(None)
43 |
44 | #Scalar arguments:
45 | #NA: Not Available
46 |
47 | pd.notnull(pd.NA)
48 |
49 | #Scalar arguments:
50 | #NaT: Not a Timestamp
51 |
52 | pd.notnull(pd.NaT)
53 |
54 | #nd-arrays:
55 |
56 | arr = np.array([1,2,"Blue"])
57 | print(arr)
58 | print("\n")
59 | pd.notnull(arr)
60 |
61 | #nd-arrays:
62 | #Empty strings are not considered as NA values
63 |
64 | arr = np.array([[1,2,None],
65 | [3,4,pd.NA],
66 | [5,np.NaN,6],
67 | ["",7,8],
68 | ["Blue",pd.NaT,"Red"]])
69 |
70 | print(arr)
71 | print("\n")
72 | pd.notnull(arr)
73 |
74 | #For index values:
75 |
76 | id = pd.Index([1,2,np.NaN,"Blue"])
77 | print(id)
78 | print("\n")
79 | pd.notnull(id)
80 |
81 | #For index values:
82 |
83 | id = pd.DatetimeIndex([pd.Timestamp("2020-10-28"),
84 | pd.Timestamp(""),
85 | None,
86 | np.NaN,
87 | pd.NA,
88 | pd.NaT])
89 |
90 | print(id)
91 | print("\n")
92 | pd.notnull(id)
93 |
94 | #Series:
95 |
96 | s = pd.Series([1,2,3,None,4,np.NaN,pd.NA,pd.NaT,"Blue"])
97 | print(s)
98 | print("\n")
99 | pd.notnull(s)
100 |
101 | #DataFrame:
102 |
103 | df = pd.DataFrame({"Name":["Alan","Berta","Charlie",None],
104 | "Age":[32,45,np.NaN,28],
105 | "Birthday":[pd.NaT,None,pd.Timestamp("1975-10-28"),np.NaN],
106 | "Country":["USA","","USA","Canada"]
107 | })
108 |
109 | print(df)
110 | print("\n")
111 | pd.notnull(df)
112 |
--------------------------------------------------------------------------------
/poisson-distribution-process/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/poisson-distribution-process/poisson.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """poisson-distribution-and-poisson-process-tutorial.ipynb
3 |
4 | Automatically generated by Colaboratory.
5 |
6 | Original file is located at
7 | https://colab.research.google.com/drive/1u-t6oSxbMd2FrzaIMimXnPMR5dfV6Gel
8 |
9 | # Diving Into the Poisson Distribution and Poisson Process
10 |
11 | * Tutorial: https://news.towardsai.net/pd
12 | * Github: https://github.com/towardsai/tutorials/tree/master/poisson-distribution-process
13 |
14 | A TV assembly unit performs a defect analysis task to understand the number of defects that could occur in a given defective TV. From past quality and audit data, it is noted that 12 defects are marked on average for a defective TV. Calculate the following:
15 | 
16 | * The probability that a defective TV has exactly 5 defects.
17 | * The probability that a defective TV has fewer than 5 defects.
18 | """
19 |
20 | import numpy as np
21 | import scipy.stats as stats
22 | import matplotlib.pyplot as plt
23 |
24 | n = np.arange(0,30)
25 | n
26 |
27 | rate = 12
28 | poisson = stats.poisson.pmf(n,rate)
29 |
30 | poisson
31 |
32 | poisson[5]
33 |
34 | poisson[0] + poisson[1] + poisson[2] + poisson[3] + poisson[4]
35 |
36 | plt.plot(n,poisson, 'o-')
37 | plt.show()
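38 | 
39 | # --- Illustrative addition (not part of the original tutorial) ---
40 | # Cross-checking the two answers with the closed-form Poisson PMF,
41 | # P(X = k) = exp(-rate) * rate**k / k!, and the cumulative distribution.
42 | from math import exp, factorial
43 | 
44 | print(exp(-rate) * rate**5 / factorial(5))   # should match poisson[5]
45 | print(stats.poisson.cdf(4, rate))            # P(X < 5) = P(X <= 4)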
--------------------------------------------------------------------------------
/principal_component_analysis/correlation_matrix_covariance_matrix.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """
3 |
4 | # Implementation of the Correlation Matrix and Covariance Matrix
5 | 
6 | * These matrices are used with the eigenvalues and eigenvectors in PCA
7 |
8 | **Import Packages**
9 | """
10 |
11 | import pandas as pd
12 | import numpy as np
13 |
14 | matrix = np.array([[0, 3, 4], [1, 2, 4], [3, 4, 5]])
15 | print(matrix)
16 |
17 | """**Convert to covariance**"""
18 |
19 | np.cov(matrix)
20 |
21 | """**Convert to Correlation Matrix**"""
22 |
23 | matrix_a = np.array([[0.1, .32, .2, 0.4, 0.8],
24 | [.23, .18, .56, .61, .12],
25 | [.9, .3, .6, .5, .3],
26 | [.34, .75, .91, .19, .21]])
27 |
28 | pd.DataFrame(matrix_a).corr()
29 |
30 | np.corrcoef(matrix_a.T)
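31 | 
32 | """**Relation between the two matrices (illustrative addition, not in the original)**"""
33 | 
34 | # Dividing each covariance by the product of the corresponding standard
35 | # deviations reproduces the correlation matrix computed above.
36 | cov = np.cov(matrix_a.T)
37 | std = np.sqrt(np.diag(cov))
38 | cov / np.outer(std, std)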
--------------------------------------------------------------------------------
/principal_component_analysis/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/principal_component_analysis/pca_with_python.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """pca_with_python.ipynb
3 |
4 | Automatically generated by Colaboratory.
5 |
6 | 
7 | **Import Libraries**
8 | """
9 |
10 | import pandas as pd
11 | import numpy as np
12 | from sklearn.datasets import load_iris
13 | from sklearn.preprocessing import StandardScaler
14 | import matplotlib.pyplot as plt
15 |
16 | """**Load Iris Data**"""
17 |
18 | iris = load_iris()
19 |
20 | df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
21 |
22 | df['class'] = iris.target
23 | print(df)
24 |
25 | """**Get the value of x and y**"""
26 |
27 | x = df.drop(labels='class', axis=1).values
28 | y = df['class'].values
29 |
30 | print(x.shape, y.shape)
31 |
32 | """**Implementation of PCA**"""
33 |
34 | class convers_pca():
35 | def __init__(self, no_of_components):
36 | self.no_of_components = no_of_components
37 | self.eigen_values = None
38 | self.eigen_vectors = None
39 |
40 | def transform(self, x):
41 | return np.dot(x - self.mean, self.projection_matrix.T)
42 |
43 | def inverse_transform(self, x):
44 | return np.dot(x, self.projection_matrix) + self.mean
45 |
46 | def fit(self, x):
47 | self.no_of_components = x.shape[1] if self.no_of_components is None else self.no_of_components
48 | self.mean = np.mean(x, axis=0)
49 |
50 | cov_matrix = np.cov(x - self.mean, rowvar=False)
51 |
52 | self.eigen_values, self.eigen_vectors = np.linalg.eig(cov_matrix)
53 | self.eigen_vectors = self.eigen_vectors.T
54 |
55 | self.sorted_components = np.argsort(self.eigen_values)[::-1]
56 |
57 | self.projection_matrix = self.eigen_vectors[self.sorted_components[:self.no_of_components]]
58 |
59 | self.explained_variance = self.eigen_values[self.sorted_components]
60 | self.explained_variance_ratio = self.explained_variance / self.eigen_values.sum()
61 |
62 | """**Standardization**"""
63 |
64 | std = StandardScaler()
65 | transformed = StandardScaler().fit_transform(x)
66 |
67 | """**PCA with Component = 2**"""
68 |
69 | pca = convers_pca(no_of_components=2)
70 | pca.fit(transformed)
71 |
72 | """**Plotting**"""
73 |
74 | x_std = pca.transform(transformed)
75 |
76 | plt.figure()
77 | plt.scatter(x_std[:, 0], x_std[:, 1], c=y)
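78 | 
79 | """**Sanity check with scikit-learn (illustrative addition, not in the original)**"""
80 | 
81 | # The explained variance ratios of the custom PCA should roughly match
82 | # sklearn's PCA on the same standardized data (component signs may differ).
83 | from sklearn.decomposition import PCA
84 | 
85 | skl_pca = PCA(n_components=2).fit(transformed)
86 | print(skl_pca.explained_variance_ratio_)
87 | print(pca.explained_variance_ratio[:2])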
--------------------------------------------------------------------------------
/programming/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/programming/variable_swap_data_science.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """
3 | #5 Ways to Swap Two Variables in Python
4 |
5 | * Tutorial: https://news.towardsai.net/hnk
6 | * Github: https://github.com/towardsai/tutorials/tree/master/programming
7 | """
8 |
9 | #Using Naive Approach to Swap Values in Python:
10 |
11 | def swap_values(x,y):
12 | #Printing Original Values:
13 | print("Original Values")
14 | print("X:",x)
15 | print("Y:",y)
16 | print("---------------")
17 |
18 | #Swapping Values:
19 | temp = x
20 | x = y
21 | y = temp
22 |
23 | #Printing Values after Swapping:
24 | print("Values after Swapping")
25 | print("X:",x)
26 | print("Y:",y)
27 |
28 | #Function Call:
29 |
30 | #Integer Values:
31 | swap_values(10,20)
32 | print("\n")
33 |
34 | #Float Values:
35 | swap_values(10.5,20.5)
36 | print("\n")
37 |
38 | #String Values:
39 | swap_values("Pratik","Shukla")
40 |
41 | #Using comma operator to Swap Values in Python:
42 |
43 | def swap_values(x,y):
44 | #Printing Original Values:
45 | print("Original Values")
46 | print("X:",x)
47 | print("Y:",y)
48 | print("---------------")
49 |
50 | #Swapping Values:
51 | x,y = y,x
52 |
53 | #Printing Values after Swapping:
54 | print("Values after Swapping")
55 | print("X:",x)
56 | print("Y:",y)
57 |
58 | #Function Call:
59 |
60 | #Integer Values:
61 | swap_values(10,20)
62 | print("\n")
63 |
64 | #Float Values:
65 | swap_values(10.5,20.5)
66 | print("\n")
67 |
68 | #String Values:
69 | swap_values("Pratik","Shukla")
70 |
71 | #Using Arithmetic Operators (Addition and Subtraction) to Swap Values in Python:
72 |
73 | def swap_values(x,y):
74 | #Printing Original Values:
75 | print("Original Values")
76 | print("X:",x)
77 | print("Y:",y)
78 | print("---------------")
79 |
80 | #Swapping Values:
81 | x = x+y
82 | y = x-y
83 | x = x-y
84 |
85 | #Printing Values after Swapping:
86 | print("Values after Swapping")
87 | print("X:",x)
88 | print("Y:",y)
89 |
90 | #Function Call:
91 |
92 | #Integer Values:
93 | swap_values(10,20)
94 | print("\n")
95 |
96 | #Float Values:
97 | swap_values(10.5,20.5)
98 | print("\n")
99 |
100 | #String Values:
101 | #It doesn't work with Strings as it uses numerical operators.
102 | #swap_values("Pratik","Shukla")
103 |
104 | #Using Arithmetic Operators (Multiplication and Division) to Swap Values in Python:
105 |
106 | def swap_values(x,y):
107 | #Printing Original Values:
108 | print("Original Values")
109 | print("X:",x)
110 | print("Y:",y)
111 | print("---------------")
112 |
113 | #Swapping Values:
114 | x = x*y
115 | y = x/y
116 | x = x/y
117 |
118 | #Printing Values after Swapping:
119 | print("Values after Swapping")
120 | print("X:",x)
121 | print("Y:",y)
122 |
123 | #Function Call:
124 |
125 | #Integer Values:
126 | swap_values(10,20)
127 | print("\n")
128 |
129 | #Float Values:
130 | swap_values(10.5,20.5)
131 | print("\n")
132 |
133 | #String Values:
134 | #It doesn't work with Strings as it uses numerical operators.
135 | #swap_values("Pratik","Shukla")
136 |
137 | #Using the Bitwise XOR Operator to Swap Values in Python:
138 |
139 | def swap_values(x,y):
140 | #Printing Original Values:
141 | print("Original Values")
142 | print("X:",x)
143 | print("Y:",y)
144 | print("---------------")
145 |
146 | #Swapping Values:
147 | x = x^y
148 | y = x^y
149 | x = x^y
150 |
151 | #Printing Values after Swapping:
152 | print("Values after Swapping")
153 | print("X:",x)
154 | print("Y:",y)
155 |
156 | #Function Call:
157 |
158 | #Integer Values:
159 | swap_values(10,20)
160 | print("\n")
161 |
162 | #Float Values:
163 | #It only works with integers.
164 | #swap_values(10.5,20.5)
165 |
166 | #String Values:
167 | #It doesn't work with Strings as it uses numerical operators.
168 | #swap_values("Pratik","Shukla")
169 |
--------------------------------------------------------------------------------
/random-number-generator/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/random-number-generator/random_number_generator_tutorial_with_python.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """Random Number Generator Tutorial with Python.ipynb
3 |
4 | Automatically generated by Colaboratory.
5 |
6 | Original file is located at
7 | https://colab.research.google.com/drive/1qKiX3ztczxxXNSlC0IKtSr8ND4UILE-G
8 |
9 | # Random Number Generator Tutorial with Python
10 |
11 | ## Generating pseudorandom numbers with Python standard library
12 |
13 | Python has a built-in module called random for generating a variety of pseudorandom numbers. Although this module is not recommended for security-sensitive purposes such as cryptography, it is perfectly fine for machine learning and data science. It uses a PRNG called the Mersenne Twister.
14 |
15 | ### Importing module: random
16 | """
17 |
18 | import random
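
# Aside (a minimal sketch, not part of the original tutorial): for security-sensitive
# randomness such as tokens or passwords, the standard library provides the secrets
# module instead of random.
import secrets
print(secrets.randbelow(20))   # cryptographically strong integer in [0, 20)
print(secrets.token_hex(16))   # 32-character random hex string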
19 |
20 | """### Random numbers within a range"""
21 |
22 | #initialize the seed to 25
23 | random.seed(25)
24 |
25 | #generating random number between 10 and 20 (10 included, 20 excluded)
26 | print(random.randrange(10, 20))
27 |
28 | #generating random number between 10 and 20 (both included)
29 | print(random.randint(10, 20))
30 |
31 | """### Random element from a sequence"""
32 |
33 | #initialize the seed to 2
34 | random.seed(2)
35 |
36 | #setting up the sequence
37 | myseq = ['Towards', 'AI', 'is', 1]
38 |
39 | #randomly choosing an element from the sequence
40 | random.choice(myseq)
41 |
42 | """### Multiple random selections with different possibilities"""
43 |
44 | #initialize the seed to 25
45 | random.seed(25)
46 |
47 | #setting up the sequence
48 | myseq = ['Towards', 'AI', 'is', 1]
49 |
50 | #random selection of length 15
51 | #relative weight of 10 for selecting 'Towards'
52 | #relative weight of 5 for selecting 'AI'
53 | #relative weight of 2 for selecting 'is'
54 | #relative weight of 2 for selecting 1
55 | random.choices(myseq, weights=[10, 5, 2, 2], k = 15)
56 |
57 | """### Random element from a sequence without replacement"""
58 |
59 | #initialize the seed to 25
60 | random.seed(25)
61 |
62 | #setting up the sequence
63 | myseq = ['Towards', 'AI', 'is', 1]
64 |
65 | #randomly choosing 2 elements from the sequence without replacement
66 | random.sample(myseq, 2)
67 |
68 | #initialize the seed to 25
69 | random.seed(25)
70 |
71 | #setting up the sequence
72 | myseq = ['Towards', 'AI', 'is', 1]
73 |
74 | #randomly choosing elements from the sequence
75 | #trying to choose 5 random elements from a sequence of length 4
76 | #since the selection is without replacement, this is not possible and hence raises an error
77 | random.sample(myseq, 5)
78 |
79 | """### Rearrange the sequence"""
80 |
81 | #initialize the seed to 25
82 | random.seed(25)
83 |
84 | #setting up the sequence
85 | myseq = ['Towards', 'AI', 'is', 1]
86 |
87 | #rearranging the order of elements of the list
88 | random.shuffle(myseq)
89 | myseq
90 |
91 | """### Floating-point random number"""
92 |
93 | #initialize the seed to 25
94 | random.seed(25)
95 |
96 | #random float number between 0 and 1
97 | random.random()
98 |
99 | """### Real-valued distributions"""
100 |
101 | #initialize the seed to 25
102 | random.seed(25)
103 |
104 | #random float number between 10 and 20 (both included)
105 | print(random.uniform(10, 20))
106 |
107 | #random float number from a Gaussian with mean 10 and standard deviation 4
108 | print(random.gauss(10, 4))
109 |
110 | """## Generating pseudorandom numbers with Numpy"""
111 |
112 | #importing numpy; its np.random submodule provides the generators
113 | import numpy as np
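
# Aside (a minimal sketch, assuming NumPy >= 1.17): newer NumPy code usually prefers the
# Generator API over the legacy np.random.* functions shown below; a seeded generator
# gives the same reproducibility.
rng = np.random.default_rng(25)
print(rng.random())                  # uniform float in [0, 1)
print(rng.integers(10, 20, size=5))  # 5 integers in [10, 20)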
114 |
115 | """### Uniform distributed floating values"""
116 |
117 | #initialize the seed to 25
118 | np.random.seed(25)
119 |
120 | #single uniformly distributed random number
121 | np.random.rand()
122 |
123 | #initialize the seed to 25
124 | np.random.seed(25)
125 |
126 | #uniformly distributed random numbers of length 10: 1-D array
127 | np.random.rand(10)
128 |
129 | #initialize the seed to 25
130 | np.random.seed(25)
131 |
132 | #uniformly distributed random numbers of 2 rows and 3 columns: 2-D array
133 | np.random.rand(2, 3)
134 |
135 | """### Normal distributed floating values"""
136 |
137 | #initialize the seed to 25
138 | np.random.seed(25)
139 |
140 | #single normally distributed random number
141 | np.random.randn()
142 |
143 | #initialize the seed to 25
144 | np.random.seed(25)
145 |
146 | #normally distributed random numbers of length 10: 1-D array
147 | np.random.randn(10)
148 |
149 | #initialize the seed to 25
150 | np.random.seed(25)
151 |
152 | #normally distributed random numbers of 2 rows and 3 columns: 2-D array
153 | np.random.randn(2, 3)
154 |
155 | """### Uniformly distributed integers in a given range"""
156 |
157 | #initialize the seed to 25
158 | np.random.seed(25)
159 |
160 | #single uniformly distributed random integer between 10 and 20 (20 excluded)
161 | np.random.randint(10, 20)
162 |
163 | #initialize the seed to 25
164 | np.random.seed(25)
165 |
166 | #uniformly distributed random integers between 0 and 99 of length 10: 1-D array
167 | np.random.randint(100, size=(10))
168 |
169 | #initialize the seed to 25
170 | np.random.seed(25)
171 |
172 | #uniformly distributed random integers between 0 and 99 of 2 rows and 3 columns: 2-D array
173 | np.random.randint(100, size=(2, 3))
174 |
175 | """### Random elements from a defined list"""
176 |
177 | #initialize the seed to 25
178 | np.random.seed(25)
179 |
180 | #setting up the sequence
181 | myseq = ['Towards', 'AI', 'is', 1]
182 |
183 | #randomly choosing an element from the sequence
184 | np.random.choice(myseq)
185 |
186 | #initialize the seed to 25
187 | np.random.seed(25)
188 |
189 | #setting up the sequence
190 | myseq = ['Towards', 'AI', 'is', 1]
191 |
192 | #randomly choosing elements from the sequence: 2-D array
193 | np.random.choice(myseq, size=(2, 3))
194 |
195 | #initialize the seed to 25
196 | np.random.seed(25)
197 |
198 | #setting up the sequence
199 | myseq = ['Towards', 'AI', 'is', 1]
200 |
201 | #randomly choosing elements from the sequence with defined probabilities
202 | #The probability for the value to be 'Towards' is set to be 0.1
203 | #The probability for the value to be 'AI' is set to be 0.6
204 | #The probability for the value to be 'is' is set to be 0.05
205 | #The probability for the value to be 1 is set to be 0.25
206 | #0.1 + 0.6 + 0.05 + 0.25 = 1
207 | np.random.choice(myseq, p=[0.1, 0.6, 0.05, 0.25], size=(2, 3))
208 |
209 | """### Binomial distributed values"""
210 |
211 | #initialize the seed to 25
212 | np.random.seed(25)
213 |
214 | #10 trials with success probability 0.5 each; generate 10 samples
215 | np.random.binomial(n=10, p=0.5, size=10)
216 |
217 | """### Poisson Distribution values"""
218 |
219 | #initialize the seed to 25
220 | np.random.seed(25)
221 |
222 | #Poisson distribution with rate (lam) 2; generate 10 samples
223 | np.random.poisson(lam=2, size=10)
224 |
225 | """### Chi Square distribution"""
226 |
227 | #initialize the seed to 25
228 | np.random.seed(25)
229 |
230 | #degrees of freedom 2 and size (2, 3)
231 | np.random.chisquare(df=2, size=(2, 3))
232 |
233 | """Thank you
"""
--------------------------------------------------------------------------------
/recommendation_system_tutorial/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/recommendation_system_tutorial/movie_titles.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/towardsai/tutorials/cc12fe183d50ce6095f044d7346f30d5d0522584/recommendation_system_tutorial/movie_titles.csv
--------------------------------------------------------------------------------
/recommendation_system_tutorial/new_features.csv:
--------------------------------------------------------------------------------
1 | 212668,16604,3.4864864864864864,4,5,5,4,5,5.0,5.0,5.0,5.0,5.0,5.0,3.782608695652174,5
2 | 242896,16604,3.4864864864864864,4,4,3,4,3,3,3.0,3.0,3.0,3.0,3.0,3.782608695652174,3
3 | 295174,16604,3.4864864864864864,5,3,4,3,4,3,4.0,4.0,4.0,4.0,4.0,3.782608695652174,5
4 | 398239,16604,3.4864864864864864,4,4,5,3,4,2,1,2,2.5,2.5,2.5,3.782608695652174,5
5 | 605969,16604,3.4864864864864864,4,5,5,4,5,5.0,5.0,5.0,5.0,5.0,5.0,3.782608695652174,5
6 | 634715,16604,3.4864864864864864,4,5,5,4,5,3.0,3.0,3.0,3.0,3.0,3.0,3.782608695652174,3
7 | 754728,16604,3.4864864864864864,4,5,5,4,5,4.0,4.0,4.0,4.0,4.0,4.0,3.782608695652174,4
8 | 769103,16604,3.4864864864864864,3,5,5,3,4,3,3.5,3.5,3.5,3.5,3.5,3.782608695652174,4
9 | 1018210,16604,3.4864864864864864,4,4,3,4,3,4,4.0,4.0,4.0,4.0,4.0,3.782608695652174,4
10 | 1049467,16604,3.4864864864864864,5,3,4,3,4,3,4.0,4.0,4.0,4.0,4.0,3.782608695652174,5
11 | 1209983,16604,3.4864864864864864,5,5,4,3,4,2,2.5,2.5,2.5,2.5,2.5,3.782608695652174,3
12 | 1218016,16604,3.4864864864864864,3,3,3,4,4,4,3.0,3.0,3.0,3.0,3.0,3.782608695652174,2
13 | 1453970,16604,3.4864864864864864,4,5,5,4,5,5.0,5.0,5.0,5.0,5.0,5.0,3.782608695652174,5
14 | 1533963,16604,3.4864864864864864,4,5,5,4,5,4.0,4.0,4.0,4.0,4.0,4.0,3.782608695652174,4
15 | 1553211,16604,3.4864864864864864,4,5,5,4,5,4.0,4.0,4.0,4.0,4.0,4.0,3.782608695652174,4
16 | 1862911,16604,3.4864864864864864,4,5,4,5,5,2,2,2.3333333333333335,2.3333333333333335,2.3333333333333335,2.3333333333333335,3.782608695652174,3
17 | 2059366,16604,3.4864864864864864,3,3,4,4,2,4,3.5,3.5,3.5,3.5,3.5,3.782608695652174,3
18 | 2204342,16604,3.4864864864864864,4,5,5,4,5,3.0,3.0,3.0,3.0,3.0,3.0,3.782608695652174,3
19 | 2329676,16604,3.4864864864864864,4,5,5,4,5,2.0,2.0,2.0,2.0,2.0,2.0,3.782608695652174,2
20 | 2382060,16604,3.4864864864864864,3,4,5,5,2,3,3.5,3.5,3.5,3.5,3.5,3.782608695652174,4
21 | 2465999,16604,3.4864864864864864,4,5,5,4,5,4.0,4.0,4.0,4.0,4.0,4.0,3.782608695652174,4
22 | 2557596,16604,3.4864864864864864,2,3,3,4,4,5,4.0,4.0,4.0,4.0,4.0,3.782608695652174,3
23 | 2579003,16604,3.4864864864864864,4,4,3,4,3,4,4.0,4.0,4.0,4.0,4.0,3.782608695652174,4
24 | 2620858,16630,3.4864864864864864,4,3.5,3.5,3.5,3.5,3.5,3.0,3
25 | 782261,16795,3.4864864864864864,5.0,5.0,5.0,5.0,5.0,5.0,5.0,5
26 | 1570292,17135,3.4864864864864864,4,3,2,4,4.0,4.0,4.0,4.0,4.0,3.0,4
27 | 1805778,17135,3.4864864864864864,4,3,2,3,3.0,3.0,3.0,3.0,3.0,3.0,3
28 | 1862911,17135,3.4864864864864864,3,3,4,2,3,2.3333333333333335,2.3333333333333335,2.3333333333333335,2.3333333333333335,3.0,2
29 | 2382060,17135,3.4864864864864864,2,3,4,4,3.5,3.5,3.5,3.5,3.5,3.0,3
30 | 35573,17149,3.4864864864864864,5,2,5,3,4,5.0,5.0,5.0,5.0,5.0,5.0,3.4390243902439024,5
31 | 115726,17149,3.4864864864864864,5,2,5,3,4,5.0,5.0,5.0,5.0,5.0,5.0,3.4390243902439024,5
32 | 193510,17149,3.4864864864864864,5,2,5,3,4,4.0,4.0,4.0,4.0,4.0,4.0,3.4390243902439024,4
33 | 197313,17149,3.4864864864864864,5,2,5,3,4,5.0,5.0,5.0,5.0,5.0,5.0,3.4390243902439024,5
34 | 229421,17149,3.4864864864864864,5,2,5,3,4,2.0,2.0,2.0,2.0,2.0,2.0,3.4390243902439024,2
35 | 242896,17149,3.4864864864864864,4,4,4,3,2,3,3.0,3.0,3.0,3.0,3.0,3.4390243902439024,3
36 | 295174,17149,3.4864864864864864,3,2,3,3,4,5,4.0,4.0,4.0,4.0,4.0,3.4390243902439024,3
37 | 398239,17149,3.4864864864864864,3,3,2,3,4,5,2,2,2.5,2.5,2.5,3.4390243902439024,1
38 | 619230,17149,3.4864864864864864,5,2,5,3,4,3.0,3.0,3.0,3.0,3.0,3.0,3.4390243902439024,3
39 | 732495,17149,3.4864864864864864,5,2,5,3,4,4.0,4.0,4.0,4.0,4.0,4.0,3.4390243902439024,4
40 | 769103,17149,3.4864864864864864,2,3,3,3,4,4,3.5,3.5,3.5,3.5,3.5,3.4390243902439024,3
41 | 772116,17149,3.4864864864864864,5,2,5,3,4,4.0,4.0,4.0,4.0,4.0,4.0,3.4390243902439024,4
42 | 842750,17149,3.4864864864864864,5,2,5,3,4,3.0,3.0,3.0,3.0,3.0,3.0,3.4390243902439024,3
43 | 934315,17149,3.4864864864864864,5,2,5,3,4,4.0,4.0,4.0,4.0,4.0,4.0,3.4390243902439024,4
44 | 980189,17149,3.4864864864864864,5,2,5,3,4,4.0,4.0,4.0,4.0,4.0,4.0,3.4390243902439024,4
45 | 1018210,17149,3.4864864864864864,4,4,4,3,2,4,4.0,4.0,4.0,4.0,4.0,3.4390243902439024,4
46 | 1049467,17149,3.4864864864864864,3,2,3,3,4,5,4.0,4.0,4.0,4.0,4.0,3.4390243902439024,3
47 | 1180569,17149,3.4864864864864864,5,2,5,3,4,3.0,3.0,3.0,3.0,3.0,3.0,3.4390243902439024,3
48 | 1209983,17149,3.4864864864864864,3,3,3,3,4,3,2.5,2.5,2.5,2.5,2.5,3.4390243902439024,2
49 | 1217165,17149,3.4864864864864864,5,2,5,3,4,2.0,2.0,2.0,2.0,2.0,2.0,3.4390243902439024,2
50 | 1218016,17149,3.4864864864864864,5,4,3,4,4,2,3.0,3.0,3.0,3.0,3.0,3.4390243902439024,4
51 | 1235495,17149,3.4864864864864864,5,2,5,3,4,3.0,3.0,3.0,3.0,3.0,3.0,3.4390243902439024,3
52 | 1413107,17149,3.4864864864864864,5,2,5,3,4,2.0,2.0,2.0,2.0,2.0,2.0,3.4390243902439024,2
53 | 1460255,17149,3.4864864864864864,5,2,5,3,4,2.0,2.0,2.0,2.0,2.0,2.0,3.4390243902439024,2
54 | 1469739,17149,3.4864864864864864,5,2,5,3,4,4.0,4.0,4.0,4.0,4.0,4.0,3.4390243902439024,4
55 | 1473708,17149,3.4864864864864864,5,2,5,3,4,4.0,4.0,4.0,4.0,4.0,4.0,3.4390243902439024,4
56 | 1570292,17149,3.4864864864864864,4,4,3,5,3,4,4.0,4.0,4.0,4.0,4.0,3.4390243902439024,4
57 | 1615195,17149,3.4864864864864864,5,2,5,3,4,4.0,4.0,4.0,4.0,4.0,4.0,3.4390243902439024,4
58 | 1683156,17149,3.4864864864864864,5,2,5,3,4,3.0,3.0,3.0,3.0,3.0,3.0,3.4390243902439024,3
59 | 1720816,17149,3.4864864864864864,5,2,5,3,4,4.0,4.0,4.0,4.0,4.0,4.0,3.4390243902439024,4
60 | 1805778,17149,3.4864864864864864,4,4,3,5,3,3,3.0,3.0,3.0,3.0,3.0,3.4390243902439024,3
61 | 1814953,17149,3.4864864864864864,5,2,5,3,4,4.0,4.0,4.0,4.0,4.0,4.0,3.4390243902439024,4
62 | 1895168,17149,3.4864864864864864,5,2,5,3,4,3.0,3.0,3.0,3.0,3.0,3.0,3.4390243902439024,3
63 | 1991264,17149,3.4864864864864864,5,2,5,3,4,1.0,1.0,1.0,1.0,1.0,1.0,3.4390243902439024,1
64 | 2022141,17149,3.4864864864864864,5,2,5,3,4,3.0,3.0,3.0,3.0,3.0,3.0,3.4390243902439024,3
65 | 2024916,17149,3.4864864864864864,5,2,5,3,4,4.0,4.0,4.0,4.0,4.0,4.0,3.4390243902439024,4
66 | 2059366,17149,3.4864864864864864,5,3,4,4,4,3,3.5,3.5,3.5,3.5,3.5,3.4390243902439024,4
67 | 2557596,17149,3.4864864864864864,4,4,3,4,4,3,4.0,4.0,4.0,4.0,4.0,3.4390243902439024,5
68 | 2579003,17149,3.4864864864864864,4,4,4,3,2,4,4.0,4.0,4.0,4.0,4.0,3.4390243902439024,4
69 | 2580515,17149,3.4864864864864864,5,2,5,3,4,5.0,5.0,5.0,5.0,5.0,5.0,3.4390243902439024,5
70 | 2620858,17149,3.4864864864864864,3,5,2,5,3,3,3.5,3.5,3.5,3.5,3.5,3.4390243902439024,4
71 | 2086948,17177,3.4864864864864864,4.0,4.0,4.0,4.0,4.0,4.0,4.0,4
72 | 398239,17280,3.4864864864864864,2,2,5,1,2.5,2.5,2.5,2.0,2
73 | 1862911,17280,3.4864864864864864,2,3,2,2.3333333333333335,2.3333333333333335,2.3333333333333335,2.3333333333333335,2.0,2
74 | 398239,17315,3.4864864864864864,2,5,1,2.5,2.5,2.5,2.0,2
75 |
--------------------------------------------------------------------------------
/recommendation_system_tutorial/recommendation_system_tutorial_netflix.py:
--------------------------------------------------------------------------------
1 | # Recommendation System Tutorial - Netflix
2 | # URL: https://towardsai.net/recommendation-system-tutorial
3 |
4 | # Download datasets
5 | !wget https://datasets.towardsai.net/combined_data_4.txt
6 | !wget https://raw.githubusercontent.com/towardsai/tutorials/master/recommendation_system_tutorial/movie_titles.csv
7 | !wget https://raw.githubusercontent.com/towardsai/tutorials/master/recommendation_system_tutorial/new_features.csv
8 |
9 | !pip install scikit-surprise
10 |
11 | from datetime import datetime
12 | import pandas as pd
13 | import numpy as np
14 | import seaborn as sns
15 | import os
16 | import random
17 | import matplotlib
18 | import matplotlib.pyplot as plt
19 | from scipy import sparse
20 | from sklearn.metrics.pairwise import cosine_similarity
21 | from sklearn.metrics import mean_squared_error
22 |
23 | import xgboost as xgb
24 | from surprise import Reader, Dataset
25 | from surprise import BaselineOnly
26 | from surprise import KNNBaseline
27 | from surprise import SVD
28 | from surprise import SVDpp
29 | from surprise.model_selection import GridSearchCV
30 |
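# Note on the expected input format (an assumption based on how load_data() parses the file,
# matching the Netflix Prize layout): each movie block starts with a line like "4500:" giving
# the movie id, followed by one "customer_id,rating,date" line per rating, e.g. (illustrative
# values only):
#
#   4500:
#   2532865,4,2005-07-26
#   573364,3,2005-06-20
#
# load_data() flattens this into rows of "movie_id,customer_id,rating,date".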
31 | def load_data():
32 | netflix_csv_file = open("netflix_rating.csv", mode = "w")
33 | rating_files = ['combined_data_4.txt']
34 | for file in rating_files:
35 | with open(file) as f:
36 | for line in f:
37 | line = line.strip()
38 | if line.endswith(":"):
39 | movie_id = line.replace(":", "")
40 | else:
41 | row_data = []
42 | row_data = [item for item in line.split(",")]
43 | row_data.insert(0, movie_id)
44 | netflix_csv_file.write(",".join(row_data))
45 | netflix_csv_file.write('\n')
46 |
47 | netflix_csv_file.close()
48 | df = pd.read_csv('netflix_rating.csv', sep=",", names = ["movie_id","customer_id", "rating", "date"])
49 | return df
50 |
51 | netflix_rating_df = load_data()
52 | netflix_rating_df
53 | netflix_rating_df.head()
54 |
55 | netflix_rating_df.duplicated(["movie_id","customer_id", "rating", "date"]).sum()
56 |
57 | split_value = int(len(netflix_rating_df) * 0.80)
58 | train_data = netflix_rating_df[:split_value]
59 | test_data = netflix_rating_df[split_value:]
60 |
61 | plt.figure(figsize = (12, 8))
62 | ax = sns.countplot(x="rating", data=train_data)
63 |
64 | ax.set_yticklabels([num for num in ax.get_yticks()])
65 |
66 | plt.tick_params(labelsize = 15)
67 | plt.title("Count Ratings in train data", fontsize = 20)
68 | plt.xlabel("Ratings", fontsize = 20)
69 | plt.ylabel("Number of Ratings", fontsize = 20)
70 | plt.show()
71 |
72 | def get_user_item_sparse_matrix(df):
73 | sparse_data = sparse.csr_matrix((df.rating, (df.customer_id, df.movie_id)))
74 | return sparse_data
75 |
76 | train_sparse_data = get_user_item_sparse_matrix(train_data)
77 |
78 | test_sparse_data = get_user_item_sparse_matrix(test_data)
79 |
80 | global_average_rating = train_sparse_data.sum()/train_sparse_data.count_nonzero()
81 | print("Global Average Rating: {}".format(global_average_rating))
82 |
83 | def get_average_rating(sparse_matrix, is_user):
84 | ax = 1 if is_user else 0
85 | sum_of_ratings = sparse_matrix.sum(axis = ax).A1
86 | no_of_ratings = (sparse_matrix != 0).sum(axis = ax).A1
87 | rows, cols = sparse_matrix.shape
88 | average_ratings = {i: sum_of_ratings[i]/no_of_ratings[i] for i in range(rows if is_user else cols) if no_of_ratings[i] != 0}
89 | return average_ratings
90 |
91 | average_rating_user = get_average_rating(train_sparse_data, True)
92 |
93 | avg_rating_movie = get_average_rating(train_sparse_data, False)
94 |
95 | total_users = len(np.unique(netflix_rating_df["customer_id"]))
96 | train_users = len(average_rating_user)
97 | uncommonUsers = total_users - train_users
98 |
99 | print("Total no. of Users = {}".format(total_users))
100 | print("No. of Users in train data= {}".format(train_users))
101 | print("No. of Users not present in train data = {}({}%)".format(uncommonUsers, np.round((uncommonUsers/total_users)*100), 2))
102 |
103 | total_movies = len(np.unique(netflix_rating_df["movie_id"]))
104 | train_movies = len(avg_rating_movie)
105 | uncommonMovies = total_movies - train_movies
106 |
107 | print("Total no. of Movies = {}".format(total_movies))
108 | print("No. of Movies in train data= {}".format(train_movies))
109 | print("No. of Movies not present in train data = {}({}%)".format(uncommonMovies, np.round((uncommonMovies/total_movies)*100), 2))
110 |
111 | def compute_user_similarity(sparse_matrix, limit=100):
112 | row_index, col_index = sparse_matrix.nonzero()
113 | rows = np.unique(row_index)
114 | similar_arr = np.zeros(61700).reshape(617,100)  #holds the top-100 similarity scores for each sampled user (617 rows)
115 |
116 | for row in rows[:limit]:
117 | sim = cosine_similarity(sparse_matrix.getrow(row), train_sparse_data).ravel()
118 | similar_indices = sim.argsort()[-limit:]
119 | similar = sim[similar_indices]
120 | similar_arr[row] = similar
121 |
122 | return similar_arr
123 |
124 | similar_user_matrix = compute_user_similarity(train_sparse_data, 100)
125 |
126 | similar_user_matrix[0]
127 |
128 | movie_titles_df = pd.read_csv("movie_titles.csv",sep = ",",
129 | header = None, names=['movie_id', 'year_of_release', 'movie_title'],
130 | index_col = "movie_id", encoding = "iso8859_2")
131 | movie_titles_df.head()
132 |
133 | def compute_movie_similarity_count(sparse_matrix, movie_titles_df, movie_id):
134 | similarity = cosine_similarity(sparse_matrix.T, dense_output = False)
135 | no_of_similar_movies = movie_titles_df.loc[movie_id][1], similarity[movie_id].count_nonzero()  #(movie title, count of movies with non-zero similarity)
136 | return no_of_similar_movies
137 |
138 | similar_movies = compute_movie_similarity_count(train_sparse_data, movie_titles_df, 1775)
139 | print("Similar Movies = {}".format(similar_movies))
140 |
141 | def get_sample_sparse_matrix(sparse_matrix, no_of_users, no_of_movies):
142 | users, movies, ratings = sparse.find(sparse_matrix)
143 | uniq_users = np.unique(users)
144 | uniq_movies = np.unique(movies)
145 | np.random.seed(15)
146 | user = np.random.choice(uniq_users, no_of_users, replace = False)
147 | movie = np.random.choice(uniq_movies, no_of_movies, replace = True)
148 | mask = np.logical_and(np.isin(users, user), np.isin(movies, movie))
149 | sparse_matrix = sparse.csr_matrix((ratings[mask], (users[mask], movies[mask])),
150 | shape = (max(user)+1, max(movie)+1))
151 | return sparse_matrix
152 |
153 | train_sample_sparse_matrix = get_sample_sparse_matrix(train_sparse_data, 400, 40)
154 |
155 | test_sparse_matrix_matrix = get_sample_sparse_matrix(test_sparse_data, 200, 20)
156 |
157 | def create_new_similar_features(sample_sparse_matrix):
158 | global_avg_rating = get_average_rating(sample_sparse_matrix, False)
159 | global_avg_users = get_average_rating(sample_sparse_matrix, True)
160 | global_avg_movies = get_average_rating(sample_sparse_matrix, False)
161 | sample_train_users, sample_train_movies, sample_train_ratings = sparse.find(sample_sparse_matrix)
162 | new_features_csv_file = open("new_features.csv", mode = "w")
163 |
164 | for user, movie, rating in zip(sample_train_users, sample_train_movies, sample_train_ratings):
165 | similar_arr = list()
166 | similar_arr.append(user)
167 | similar_arr.append(movie)
168 | similar_arr.append(sample_sparse_matrix.sum()/sample_sparse_matrix.count_nonzero())
169 |
170 | similar_users = cosine_similarity(sample_sparse_matrix[user], sample_sparse_matrix).ravel()
171 | indices = np.argsort(-similar_users)[1:]
172 | ratings = sample_sparse_matrix[indices, movie].toarray().ravel()
173 | top_similar_user_ratings = list(ratings[ratings != 0][:5])
174 | top_similar_user_ratings.extend([global_avg_rating[movie]] * (5 - len(top_similar_user_ratings)))
175 | similar_arr.extend(top_similar_user_ratings)
176 |
177 | similar_movies = cosine_similarity(sample_sparse_matrix[:,movie].T, sample_sparse_matrix.T).ravel()
178 | similar_movies_indices = np.argsort(-similar_movies)[1:]
179 | similar_movies_ratings = sample_sparse_matrix[user, similar_movies_indices].toarray().ravel()
180 | top_similar_movie_ratings = list(similar_movies_ratings[similar_movies_ratings != 0][:5])
181 | top_similar_movie_ratings.extend([global_avg_users[user]] * (5-len(top_similar_movie_ratings)))
182 | similar_arr.extend(top_similar_movie_ratings)
183 |
184 | similar_arr.append(global_avg_users[user])
185 | similar_arr.append(global_avg_movies[movie])
186 | similar_arr.append(rating)
187 |
188 | new_features_csv_file.write(",".join(map(str, similar_arr)))
189 | new_features_csv_file.write("\n")
190 |
191 | new_features_csv_file.close()
192 | new_features_df = pd.read_csv('new_features.csv', names = ["user_id", "movie_id", "global_average", "similar_user_rating1",
193 | "similar_user_rating2", "similar_user_rating3",
194 | "similar_user_rating4", "similar_user_rating5",
195 | "similar_movie_rating1", "similar_movie_rating2",
196 | "similar_movie_rating3", "similar_movie_rating4",
197 | "similar_movie_rating5", "user_average",
198 | "movie_average", "rating"])
199 | return new_features_df
200 |
201 | train_new_similar_features = create_new_similar_features(train_sample_sparse_matrix)
202 |
203 | train_new_similar_features = train_new_similar_features.fillna(0)
204 | train_new_similar_features.head()
205 |
206 | test_new_similar_features = create_new_similar_features(test_sparse_matrix_matrix)
207 |
208 | test_new_similar_features = test_new_similar_features.fillna(0)
209 | test_new_similar_features.head()
210 |
211 | x_train = train_new_similar_features.drop(["user_id", "movie_id", "rating"], axis = 1)
212 |
213 | x_test = test_new_similar_features.drop(["user_id", "movie_id", "rating"], axis = 1)
214 |
215 | y_train = train_new_similar_features["rating"]
216 |
217 | y_test = test_new_similar_features["rating"]
218 |
219 | def error_metrics(y_true, y_pred):
220 | rmse = np.sqrt(mean_squared_error(y_true, y_pred))
221 | return rmse
222 |
223 | clf = xgb.XGBRegressor(n_estimators = 100, silent = False, n_jobs = 10)
224 | clf.fit(x_train, y_train)
225 |
226 | y_pred_test = clf.predict(x_test)
227 |
228 | rmse_test = error_metrics(y_test, y_pred_test)
229 | print("RMSE = {}".format(rmse_test))
230 |
231 | def plot_importance(model, clf):
232 | fig = plt.figure(figsize = (8, 6))
233 | ax = fig.add_axes([0,0,1,1])
234 | model.plot_importance(clf, ax = ax, height = 0.3)
235 | plt.xlabel("F Score", fontsize = 20)
236 | plt.ylabel("Features", fontsize = 20)
237 | plt.title("Feature Importance", fontsize = 20)
238 | plt.tick_params(labelsize = 15)
239 |
240 | plt.show()
241 |
242 | plot_importance(xgb, clf)
243 |
--------------------------------------------------------------------------------
/sentiment_analysis_tutorial/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/simple_linear_regression_tutorial/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/simple_linear_regression_tutorial/simple_linear_regression_from_scratch.py:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/support-vector-machine-svm/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/support-vector-machine-svm/svm_machine_learning.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """
3 | #Support Vector Machines (SVM) Introduction - Machine Learning
4 |
5 | * Tutorial: https://news.towardsai.net/svm
6 | * Github: https://github.com/towardsai/tutorials/tree/master/support-vector-machine-svm
7 | """
8 |
9 | import pandas as pd
10 | import numpy as np
11 | import matplotlib.pyplot as plt
12 | import seaborn as sns
13 |
14 | #classic datasets from sklearn library
15 | from sklearn import datasets
16 |
17 | from sklearn.model_selection import train_test_split
18 |
19 | #Support Vector Classification - wrapper around SVM
20 | from sklearn.svm import SVC
21 |
22 | #different metrics to score model performance
23 | from sklearn import metrics
24 | from sklearn.metrics import classification_report,confusion_matrix
25 |
26 | """## Load data"""
27 |
28 | #loading WINE dataset
29 | wine_data = datasets.load_wine()
30 |
31 | #converting into DataFrame
32 | df = pd.DataFrame(wine_data.data, columns = wine_data.feature_names)
33 | df['target'] = wine_data.target
34 | df.head()
35 |
36 | """## Exploratory data analysis"""
37 |
38 | #analysing target variable
39 | sns.countplot(df.target)
40 | plt.show()
41 |
42 | #visualizing the separability of the data points
43 | fig, axes = plt.subplots(4, 3, figsize=(22,14))
44 | axes = [ax for axes_rows in axes for ax in axes_rows]
45 | columns = list(df.columns)
46 | columns.remove('target')
47 | columns.remove('alcohol')
48 |
49 | #looping through every columns of data
50 | #and plotting against alcohol
51 | for i, col in enumerate(columns):
52 | sns.scatterplot(data=df, x='alcohol', y=col, hue='target', palette="deep", ax=axes[i])
53 |
54 | """## Splitting data"""
55 |
56 | #splitting data into 80:20 train test ratio
57 | X = df.drop('target', axis=1)
58 | y = df.target
59 |
60 | X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=10)
61 |
62 | """## Model training and performance evaluation"""
63 |
64 | #training SVM model with linear kernel
65 | model = SVC(kernel='linear',random_state = 10)
66 | model.fit(X_train, y_train)
67 |
68 | #predicting output for test data
69 | pred = model.predict(X_test)
70 |
71 | #building confusion matrix
72 | cm = confusion_matrix(y_test, pred)
73 |
74 | #defining the size of the canvas
75 | plt.rcParams['figure.figsize'] = [15,8]
76 |
77 | #confusion matrix to DataFrame
78 | conf_matrix = pd.DataFrame(data = cm,columns = ['Predicted:0','Predicted:1', 'Predicted:2'], index = ['Actual:0','Actual:1', 'Actual:2'])
79 |
80 | #plotting the confusion matrix
81 | sns.heatmap(conf_matrix, annot = True, fmt = 'd', cmap = 'Paired', cbar = False,
82 | linewidths = 0.1, annot_kws = {'size':25})
83 | plt.xticks(fontsize = 20)
84 | plt.yticks(fontsize = 20)
85 | plt.show()
86 |
87 | print(classification_report(y_test,pred))
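
# Optional extension (a sketch, not part of the original tutorial): the linear kernel already
# separates the wine classes well, but you could compare kernels and regularization strengths
# with a small grid search. The parameter grid below is an arbitrary illustrative choice.
from sklearn.model_selection import GridSearchCV

param_grid = {'kernel': ['linear', 'rbf'], 'C': [0.1, 1, 10]}
grid = GridSearchCV(SVC(random_state=10), param_grid, cv=5)
grid.fit(X_train, y_train)
print("Best parameters:", grid.best_params_)
print("Test accuracy:", grid.score(X_test, y_test))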
88 |
--------------------------------------------------------------------------------
/survival_analysis_in_python/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/survival_analysis_in_python/lung.csv:
--------------------------------------------------------------------------------
1 | "","inst","time","status","age","sex","ph.ecog","ph.karno","pat.karno","meal.cal","wt.loss"
2 | "1",3,306,2,74,1,1,90,100,1175,NA
3 | "2",3,455,2,68,1,0,90,90,1225,15
4 | "3",3,1010,1,56,1,0,90,90,NA,15
5 | "4",5,210,2,57,1,1,90,60,1150,11
6 | "5",1,883,2,60,1,0,100,90,NA,0
7 | "6",12,1022,1,74,1,1,50,80,513,0
8 | "7",7,310,2,68,2,2,70,60,384,10
9 | "8",11,361,2,71,2,2,60,80,538,1
10 | "9",1,218,2,53,1,1,70,80,825,16
11 | "10",7,166,2,61,1,2,70,70,271,34
12 | "11",6,170,2,57,1,1,80,80,1025,27
13 | "12",16,654,2,68,2,2,70,70,NA,23
14 | "13",11,728,2,68,2,1,90,90,NA,5
15 | "14",21,71,2,60,1,NA,60,70,1225,32
16 | "15",12,567,2,57,1,1,80,70,2600,60
17 | "16",1,144,2,67,1,1,80,90,NA,15
18 | "17",22,613,2,70,1,1,90,100,1150,-5
19 | "18",16,707,2,63,1,2,50,70,1025,22
20 | "19",1,61,2,56,2,2,60,60,238,10
21 | "20",21,88,2,57,1,1,90,80,1175,NA
22 | "21",11,301,2,67,1,1,80,80,1025,17
23 | "22",6,81,2,49,2,0,100,70,1175,-8
24 | "23",11,624,2,50,1,1,70,80,NA,16
25 | "24",15,371,2,58,1,0,90,100,975,13
26 | "25",12,394,2,72,1,0,90,80,NA,0
27 | "26",12,520,2,70,2,1,90,80,825,6
28 | "27",4,574,2,60,1,0,100,100,1025,-13
29 | "28",13,118,2,70,1,3,60,70,1075,20
30 | "29",13,390,2,53,1,1,80,70,875,-7
31 | "30",1,12,2,74,1,2,70,50,305,20
32 | "31",12,473,2,69,2,1,90,90,1025,-1
33 | "32",1,26,2,73,1,2,60,70,388,20
34 | "33",7,533,2,48,1,2,60,80,NA,-11
35 | "34",16,107,2,60,2,2,50,60,925,-15
36 | "35",12,53,2,61,1,2,70,100,1075,10
37 | "36",1,122,2,62,2,2,50,50,1025,NA
38 | "37",22,814,2,65,1,2,70,60,513,28
39 | "38",15,965,1,66,2,1,70,90,875,4
40 | "39",1,93,2,74,1,2,50,40,1225,24
41 | "40",1,731,2,64,2,1,80,100,1175,15
42 | "41",5,460,2,70,1,1,80,60,975,10
43 | "42",11,153,2,73,2,2,60,70,1075,11
44 | "43",10,433,2,59,2,0,90,90,363,27
45 | "44",12,145,2,60,2,2,70,60,NA,NA
46 | "45",7,583,2,68,1,1,60,70,1025,7
47 | "46",7,95,2,76,2,2,60,60,625,-24
48 | "47",1,303,2,74,1,0,90,70,463,30
49 | "48",3,519,2,63,1,1,80,70,1025,10
50 | "49",13,643,2,74,1,0,90,90,1425,2
51 | "50",22,765,2,50,2,1,90,100,1175,4
52 | "51",3,735,2,72,2,1,90,90,NA,9
53 | "52",12,189,2,63,1,0,80,70,NA,0
54 | "53",21,53,2,68,1,0,90,100,1025,0
55 | "54",1,246,2,58,1,0,100,90,1175,7
56 | "55",6,689,2,59,1,1,90,80,1300,15
57 | "56",1,65,2,62,1,0,90,80,725,NA
58 | "57",5,5,2,65,2,0,100,80,338,5
59 | "58",22,132,2,57,1,2,70,60,NA,18
60 | "59",3,687,2,58,2,1,80,80,1225,10
61 | "60",1,345,2,64,2,1,90,80,1075,-3
62 | "61",22,444,2,75,2,2,70,70,438,8
63 | "62",12,223,2,48,1,1,90,80,1300,68
64 | "63",21,175,2,73,1,1,80,100,1025,NA
65 | "64",11,60,2,65,2,1,90,80,1025,0
66 | "65",3,163,2,69,1,1,80,60,1125,0
67 | "66",3,65,2,68,1,2,70,50,825,8
68 | "67",16,208,2,67,2,2,70,NA,538,2
69 | "68",5,821,1,64,2,0,90,70,1025,3
70 | "69",22,428,2,68,1,0,100,80,1039,0
71 | "70",6,230,2,67,1,1,80,100,488,23
72 | "71",13,840,1,63,1,0,90,90,1175,-1
73 | "72",3,305,2,48,2,1,80,90,538,29
74 | "73",5,11,2,74,1,2,70,100,1175,0
75 | "74",2,132,2,40,1,1,80,80,NA,3
76 | "75",21,226,2,53,2,1,90,80,825,3
77 | "76",12,426,2,71,2,1,90,90,1075,19
78 | "77",1,705,2,51,2,0,100,80,1300,0
79 | "78",6,363,2,56,2,1,80,70,1225,-2
80 | "79",3,11,2,81,1,0,90,NA,731,15
81 | "80",1,176,2,73,1,0,90,70,169,30
82 | "81",4,791,2,59,1,0,100,80,768,5
83 | "82",13,95,2,55,1,1,70,90,1500,15
84 | "83",11,196,1,42,1,1,80,80,1425,8
85 | "84",21,167,2,44,2,1,80,90,588,-1
86 | "85",16,806,1,44,1,1,80,80,1025,1
87 | "86",6,284,2,71,1,1,80,90,1100,14
88 | "87",22,641,2,62,2,1,80,80,1150,1
89 | "88",21,147,2,61,1,0,100,90,1175,4
90 | "89",13,740,1,44,2,1,90,80,588,39
91 | "90",1,163,2,72,1,2,70,70,910,2
92 | "91",11,655,2,63,1,0,100,90,975,-1
93 | "92",22,239,2,70,1,1,80,100,NA,23
94 | "93",5,88,2,66,1,1,90,80,875,8
95 | "94",10,245,2,57,2,1,80,60,280,14
96 | "95",1,588,1,69,2,0,100,90,NA,13
97 | "96",12,30,2,72,1,2,80,60,288,7
98 | "97",3,179,2,69,1,1,80,80,NA,25
99 | "98",12,310,2,71,1,1,90,100,NA,0
100 | "99",11,477,2,64,1,1,90,100,910,0
101 | "100",3,166,2,70,2,0,90,70,NA,10
102 | "101",1,559,1,58,2,0,100,100,710,15
103 | "102",6,450,2,69,2,1,80,90,1175,3
104 | "103",13,364,2,56,1,1,70,80,NA,4
105 | "104",6,107,2,63,1,1,90,70,NA,0
106 | "105",13,177,2,59,1,2,50,NA,NA,32
107 | "106",12,156,2,66,1,1,80,90,875,14
108 | "107",26,529,1,54,2,1,80,100,975,-3
109 | "108",1,11,2,67,1,1,90,90,925,NA
110 | "109",21,429,2,55,1,1,100,80,975,5
111 | "110",3,351,2,75,2,2,60,50,925,11
112 | "111",13,15,2,69,1,0,90,70,575,10
113 | "112",1,181,2,44,1,1,80,90,1175,5
114 | "113",10,283,2,80,1,1,80,100,1030,6
115 | "114",3,201,2,75,2,0,90,100,NA,1
116 | "115",6,524,2,54,2,1,80,100,NA,15
117 | "116",1,13,2,76,1,2,70,70,413,20
118 | "117",3,212,2,49,1,2,70,60,675,20
119 | "118",1,524,2,68,1,2,60,70,1300,30
120 | "119",16,288,2,66,1,2,70,60,613,24
121 | "120",15,363,2,80,1,1,80,90,346,11
122 | "121",22,442,2,75,1,0,90,90,NA,0
123 | "122",26,199,2,60,2,2,70,80,675,10
124 | "123",3,550,2,69,2,1,70,80,910,0
125 | "124",11,54,2,72,1,2,60,60,768,-3
126 | "125",1,558,2,70,1,0,90,90,1025,17
127 | "126",22,207,2,66,1,1,80,80,925,20
128 | "127",7,92,2,50,1,1,80,60,1075,13
129 | "128",12,60,2,64,1,1,80,90,993,0
130 | "129",16,551,1,77,2,2,80,60,750,28
131 | "130",12,543,1,48,2,0,90,60,NA,4
132 | "131",4,293,2,59,2,1,80,80,925,52
133 | "132",16,202,2,53,1,1,80,80,NA,20
134 | "133",6,353,2,47,1,0,100,90,1225,5
135 | "134",13,511,1,55,2,1,80,70,NA,49
136 | "135",1,267,2,67,1,0,90,70,313,6
137 | "136",22,511,1,74,2,2,60,40,96,37
138 | "137",12,371,2,58,2,1,80,70,NA,0
139 | "138",13,387,2,56,1,2,80,60,1075,NA
140 | "139",1,457,2,54,1,1,90,90,975,-5
141 | "140",5,337,2,56,1,0,100,100,1500,15
142 | "141",21,201,2,73,2,2,70,60,1225,-16
143 | "142",3,404,1,74,1,1,80,70,413,38
144 | "143",26,222,2,76,1,2,70,70,1500,8
145 | "144",1,62,2,65,2,1,80,90,1075,0
146 | "145",11,458,1,57,1,1,80,100,513,30
147 | "146",26,356,1,53,2,1,90,90,NA,2
148 | "147",16,353,2,71,1,0,100,80,775,2
149 | "148",16,163,2,54,1,1,90,80,1225,13
150 | "149",12,31,2,82,1,0,100,90,413,27
151 | "150",13,340,2,59,2,0,100,90,NA,0
152 | "151",13,229,2,70,1,1,70,60,1175,-2
153 | "152",22,444,1,60,1,0,90,100,NA,7
154 | "153",5,315,1,62,2,0,90,90,NA,0
155 | "154",16,182,2,53,2,1,80,60,NA,4
156 | "155",32,156,2,55,1,2,70,30,1025,10
157 | "156",NA,329,2,69,1,2,70,80,713,20
158 | "157",26,364,1,68,2,1,90,90,NA,7
159 | "158",4,291,2,62,1,2,70,60,475,27
160 | "159",12,179,2,63,1,1,80,70,538,-2
161 | "160",1,376,1,56,2,1,80,90,825,17
162 | "161",32,384,1,62,2,0,90,90,588,8
163 | "162",10,268,2,44,2,1,90,100,2450,2
164 | "163",11,292,1,69,1,2,60,70,2450,36
165 | "164",6,142,2,63,1,1,90,80,875,2
166 | "165",7,413,1,64,1,1,80,70,413,16
167 | "166",16,266,1,57,2,0,90,90,1075,3
168 | "167",11,194,2,60,2,1,80,60,NA,33
169 | "168",21,320,2,46,1,0,100,100,860,4
170 | "169",6,181,2,61,1,1,90,90,730,0
171 | "170",12,285,2,65,1,0,100,90,1025,0
172 | "171",13,301,1,61,1,1,90,100,825,2
173 | "172",2,348,2,58,2,0,90,80,1225,10
174 | "173",2,197,2,56,1,1,90,60,768,37
175 | "174",16,382,1,43,2,0,100,90,338,6
176 | "175",1,303,1,53,1,1,90,80,1225,12
177 | "176",13,296,1,59,2,1,80,100,1025,0
178 | "177",1,180,2,56,1,2,60,80,1225,-2
179 | "178",13,186,2,55,2,1,80,70,NA,NA
180 | "179",1,145,2,53,2,1,80,90,588,13
181 | "180",7,269,1,74,2,0,100,100,588,0
182 | "181",13,300,1,60,1,0,100,100,975,5
183 | "182",1,284,1,39,1,0,100,90,1225,-5
184 | "183",16,350,2,66,2,0,90,100,1025,NA
185 | "184",32,272,1,65,2,1,80,90,NA,-1
186 | "185",12,292,1,51,2,0,90,80,1225,0
187 | "186",12,332,1,45,2,0,90,100,975,5
188 | "187",2,285,2,72,2,2,70,90,463,20
189 | "188",3,259,1,58,1,0,90,80,1300,8
190 | "189",15,110,2,64,1,1,80,60,1025,12
191 | "190",22,286,2,53,1,0,90,90,1225,8
192 | "191",16,270,2,72,1,1,80,90,488,14
193 | "192",16,81,2,52,1,2,60,70,1075,NA
194 | "193",12,131,2,50,1,1,90,80,513,NA
195 | "194",1,225,1,64,1,1,90,80,825,33
196 | "195",22,269,2,71,1,1,90,90,1300,-2
197 | "196",12,225,1,70,1,0,100,100,1175,6
198 | "197",32,243,1,63,2,1,80,90,825,0
199 | "198",21,279,1,64,1,1,90,90,NA,4
200 | "199",1,276,1,52,2,0,100,80,975,0
201 | "200",32,135,2,60,1,1,90,70,1275,0
202 | "201",15,79,2,64,2,1,90,90,488,37
203 | "202",22,59,2,73,1,1,60,60,2200,5
204 | "203",32,240,1,63,2,0,90,100,1025,0
205 | "204",3,202,1,50,2,0,100,100,635,1
206 | "205",26,235,1,63,2,0,100,90,413,0
207 | "206",33,105,2,62,1,2,NA,70,NA,NA
208 | "207",5,224,1,55,2,0,80,90,NA,23
209 | "208",13,239,2,50,2,2,60,60,1025,-3
210 | "209",21,237,1,69,1,1,80,70,NA,NA
211 | "210",33,173,1,59,2,1,90,80,NA,10
212 | "211",1,252,1,60,2,0,100,90,488,-2
213 | "212",6,221,1,67,1,1,80,70,413,23
214 | "213",15,185,1,69,1,1,90,70,1075,0
215 | "214",11,92,1,64,2,2,70,100,NA,31
216 | "215",11,13,2,65,1,1,80,90,NA,10
217 | "216",11,222,1,65,1,1,90,70,1025,18
218 | "217",13,192,1,41,2,1,90,80,NA,-10
219 | "218",21,183,2,76,1,2,80,60,825,7
220 | "219",11,211,1,70,2,2,70,30,131,3
221 | "220",2,175,1,57,2,0,80,80,725,11
222 | "221",22,197,1,67,1,1,80,90,1500,2
223 | "222",11,203,1,71,2,1,80,90,1025,0
224 | "223",1,116,2,76,1,1,80,80,NA,0
225 | "224",1,188,1,77,1,1,80,60,NA,3
226 | "225",13,191,1,39,1,0,90,90,2350,-5
227 | "226",32,105,1,75,2,2,60,70,1025,5
228 | "227",6,174,1,66,1,1,90,100,1075,1
229 | "228",22,177,1,58,2,1,80,90,1060,0
230 |
--------------------------------------------------------------------------------
/survival_analysis_in_python/survival_analysis_1.py:
--------------------------------------------------------------------------------
1 | #Import required libraries:
2 | import pandas as pd
3 | import numpy as np
4 | import matplotlib.pyplot as plt
5 | from lifelines import KaplanMeierFitter
6 |
7 | #Read the dataset:
8 | data = pd.read_csv("lung.csv")
9 | data.head()
10 |
11 | #Print the column names of our data:
12 | print(data.columns)
13 |
14 | #Additional info about our dataset:
15 | data.info()
16 |
17 | #Statistical info about our dataset:
18 | data.describe()
19 |
20 | #Plot histogram for sex of patient:
21 | print (data["sex"].hist())
22 |
23 | #Create an object of KaplanMeierFitter:
24 | kmf = KaplanMeierFitter()
25 |
26 | #Organize our data:
27 | #If status = 1 , then dead = 0
28 | #If status = 2 , then dead = 1
29 | data.loc[data.status == 1, 'dead'] = 0
30 | data.loc[data.status == 2, 'dead'] = 1
31 | data.head()
32 |
33 | #Fit the parameter values in our object:
34 | kmf.fit(durations = data["time"], event_observed = data["dead"])
35 |
36 | #Print the event table:
37 | kmf.event_table
38 | # Removed = Observed + Censored
39 | # Censored = subjects who did not experience the event (death) during the study; they still count toward the number at risk.
40 | # Observed = subjects who died (the event of interest occurred).
41 |
42 | #Calculating the survival probability for a given time:
43 | event_at_0 = kmf.event_table.iloc[0,:]
44 |
45 | #Calculate the survival probability for t=0:
46 | surv_for_0 = (event_at_0.at_risk - event_at_0.observed)/event_at_0.at_risk
47 | surv_for_0
48 |
49 | #Calculating the survival probability for a given time:
50 | event_at_5 = kmf.event_table.iloc[1,:]
51 |
52 | #Calculate the survival probability for t=5:
53 | surv_for_5 = (event_at_5.at_risk - event_at_5.observed)/event_at_5.at_risk
54 | surv_for_5
55 |
56 | #Calculating the survival probability for a given time:
57 | event_at_11 = kmf.event_table.iloc[2,:]
58 |
59 | #Calculate the survival probability for t=11:
60 | surv_for_11 = (event_at_11.at_risk - event_at_11.observed)/event_at_11.at_risk
61 | surv_for_11
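
# Added note: the pattern above is the Kaplan-Meier estimator. For each event time t_i with
# n_i subjects at risk and d_i observed deaths, the survival probability is the running product
#     S(t) = product over t_i <= t of (n_i - d_i) / n_i
# which is exactly what the manual multiplications below compute step by step.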
62 |
63 | #Calculating the actual survival probability at a given time:
64 |
65 | surv_after_0 = surv_for_0
66 | print("Survival Probability After 0 Days: ",surv_after_0)
67 |
68 | #Calculating the actual survival probability at a given time:
69 | surv_after_5 = surv_for_0 * surv_for_5
70 | print("Survival Probability After 5 Days: ",surv_after_5)
71 |
72 |
73 | surv_after_11 = surv_for_0 * surv_for_5 * surv_for_11   #Calculating the actual survival probability at a given time
74 | print("Survival Probability After 11 Days: ",surv_after_11)
75 |
76 | #Get the probability values the easy way!
77 | print("Survival probability for t=0: ",kmf.predict(0))
78 | print("Survival probability for t=5: ",kmf.predict(5))
79 | print("Survival probability for t=11: ",kmf.predict(11))
80 |
81 | #Predicting the survival probability for an array of values:
82 | kmf.predict([0,5,11,12])
83 |
84 | #To get the full list:
85 | kmf.survival_function_
86 |
87 | #Plot the graph:
88 | kmf.plot()
89 | plt.title("The Kaplan-Meier Estimate")
90 | plt.xlabel("Number of days")
91 | plt.ylabel("Probability of survival")
92 |
93 | #The median number of days:
94 | print("The median survival time: ",kmf.median_survival_time_)
95 |
96 | #Survival probability with confidence interval:
97 | kmf.confidence_interval_survival_function_
98 |
99 | #Plot survival function with confidence interval:
100 | confidence_surv_func = kmf.confidence_interval_survival_function_
101 | plt.plot(confidence_surv_func["KM_estimate_lower_0.95"],label="Lower")
102 | plt.plot(confidence_surv_func["KM_estimate_upper_0.95"],label="Upper")
103 | plt.title("Survival Function With Confidence Interval")
104 | plt.xlabel("Number of days")
105 | plt.ylabel("Survival Probability")
106 | plt.legend()
107 |
108 | #Probability of a subject dying by a given time (cumulative density):
109 | #F(1022) = p(0) + ... + p(1022)
110 | kmf.cumulative_density_
111 |
112 | #Plot the cumulative density graph:
113 | kmf.plot_cumulative_density()
114 | plt.title("Cumulative Density Plot")
115 | plt.xlabel("Number of days")
116 | plt.ylabel("Probability of person's death")
117 |
118 | #Cumulative density with confidence interval:
119 | kmf.confidence_interval_cumulative_density_
120 |
121 | #Plot cumulative density with confidence interval:
122 | confidence_cumulative_density = kmf.confidence_interval_cumulative_density_
123 | plt.plot(kmf.confidence_interval_cumulative_density_["KM_estimate_lower_0.95"],label="Lower")
124 | plt.plot(kmf.confidence_interval_cumulative_density_["KM_estimate_upper_0.95"],label="Upper")
125 | plt.title("Cumulative Density With Confidence Interval")
126 | plt.xlabel("Number of days")
127 | plt.ylabel("Cumulative Density")
128 | plt.legend()
129 |
130 | #Find cumulative density at a specific time:
131 | kmf.cumulative_density_at_times(times=1022)
132 |
133 | #Conditional median time to event of interest:
134 | kmf.conditional_time_to_event_
135 |
136 | #Conditional median time left for event:
137 | median_time_to_event = kmf.conditional_time_to_event_
138 | plt.plot(median_time_to_event,label="Median Time left")
139 | plt.title("Medain time to event")
140 | plt.xlabel("Total days")
141 | plt.ylabel("Conditional median time to event")
142 | plt.legend()
143 |
144 | #Hazard function:
145 | from lifelines import NelsonAalenFitter
146 |
147 | #Create an object of NelsonAalenFitter:
148 | naf = NelsonAalenFitter()
149 |
150 | #Fit our data into the object:
151 | naf.fit(data["time"], event_observed=data["dead"])
152 |
153 | #Print the cumulative hazard:
154 | naf.cumulative_hazard_
155 |
156 | #Plot the cumulative hazard graph:
157 | naf.plot_cumulative_hazard()
158 | plt.title("Cumulative Probability for Event of Interest")
159 | plt.xlabel("Number of days")
160 | plt.ylabel("Cumulative Probability of person's death")
161 |
162 | #We can predict the value at a certain point :
163 | print("Time = 500 days: ",naf.predict(500))
164 | print("Time = 1022 days: ",naf.predict(1022))
165 |
166 | #Cumulative hazard with confidence interval:
167 | naf.confidence_interval_
168 |
169 | #Plot cumulative hazard with confidence interval:
170 | confidence_interval = naf.confidence_interval_
171 | plt.plot(confidence_interval["NA_estimate_lower_0.95"],label="Lower")
172 | plt.plot(confidence_interval["NA_estimate_upper_0.95"],label="Upper")
173 | plt.title("Cumulative hazard With Confidence Interval")
174 | plt.xlabel("Number of days")
175 | plt.ylabel("Cumulative hazard")
176 | plt.legend()
177 |
178 | #Plot the cumulative density and the cumulative hazard together:
179 | kmf.plot_cumulative_density(label="Cumulative Density")
180 | naf.plot_cumulative_hazard(label="Cumulative Hazard")
181 | plt.xlabel("Number of Days")
182 |
--------------------------------------------------------------------------------
/survival_analysis_in_python/survival_analysis_2.py:
--------------------------------------------------------------------------------
1 | #Import required libraries:
2 | import pandas as pd
3 | import numpy as np
4 | import matplotlib.pyplot as plt
5 | from lifelines import KaplanMeierFitter
6 |
7 | #Read the dataset:
8 | data = pd.read_csv("lung.csv")
9 | data.head()
10 |
11 | #Organize our data:
12 | #If status = 1 , then dead = 0
13 | #If status = 2 , then dead = 1
14 | data.loc[data.status == 1, 'dead'] = 0
15 | data.loc[data.status == 2, 'dead'] = 1
16 | data.head()
17 |
18 | #Create two objects for groups:
19 | #kmf_m for male data:
20 | #kmf_f for female data:
21 | kmf_m = KaplanMeierFitter()
22 | kmf_f = KaplanMeierFitter()
23 |
24 | #Dividing data into groups:
25 | Male = data.query("sex == 1")
26 | Female = data.query("sex == 2")
27 |
28 | #View data of Male group:
29 | Male.head()
30 |
31 | #View data of Female group:
32 | Female.head()
33 |
34 | #Fit data into objects:
35 | kmf_m.fit(durations = Male["time"],event_observed = Male["dead"] ,label="Male")
36 | kmf_f.fit(durations = Female["time"],event_observed = Female["dead"], label="Female")
37 |
38 | #Event table for male group:
39 | kmf_m.event_table
40 |
41 | #Event table for female group:
42 | kmf_f.event_table
43 |
44 | #Predict value based on time:
45 | kmf_m.predict(11)
46 |
47 | #Predict value based on time:
48 | kmf_f.predict(11)
49 |
50 | #Get complete data of survival function for male group:
51 | kmf_m.survival_function_
52 |
53 | #Get complete data of survival function for female group:
54 | kmf_f.survival_function_
55 |
56 | #Plot the survival_function data:
57 | kmf_m.plot()
58 | kmf_f.plot()
59 | plt.xlabel("Days Passed")
60 | plt.ylabel("Survival Probability")
61 | plt.title("KMF")
62 |
63 | #Cumulative density for male group:
64 | kmf_m.cumulative_density_
65 |
66 | #Cumulative density for female group:
67 | kmf_f.cumulative_density_
68 |
69 | #Plot the graph for cumulative density for both groups:
70 | kmf_m.plot_cumulative_density()
71 | kmf_f.plot_cumulative_density()
72 | plt.title("Cumulative Density")
73 | plt.xlabel("Number of days")
74 | plt.ylabel("Probability")
75 |
76 | #Hazard Function:
77 | from lifelines import NelsonAalenFitter
78 |
79 | #Fitting the data into objects:
80 | naf_m = NelsonAalenFitter()
81 | naf_f = NelsonAalenFitter()
82 | naf_m.fit(Male["time"],event_observed = Male["dead"])
83 | naf_f.fit(Female["time"],event_observed = Female["dead"])
84 |
85 | #Cumulative hazard for male group:
86 | naf_m.cumulative_hazard_
87 |
88 | #Cumulative hazard for female group:
89 | naf_f.cumulative_hazard_
90 |
91 | #Plot the graph for cumulative hazard:
92 | naf_m.plot_cumulative_hazard(label="Male")
93 | naf_f.plot_cumulative_hazard(label="Female")
94 | plt.title("Cumulative Hazard Plot")
95 | plt.xlabel("Number of Days")
96 | plt.ylabel("Cumulative Hazard")
97 |
98 | #Conditional median time to event of interest:
99 | kmf_m.conditional_time_to_event_
100 |
101 | #Conditional median time left for event for male group:
102 | median_time_to_event = kmf_m.conditional_time_to_event_
103 | plt.plot(median_time_to_event,label="Median Time left")
104 | plt.title("Medain time to event")
105 | plt.xlabel("Total days")
106 | plt.ylabel("Conditional median time to event")
107 | plt.legend()
108 |
109 | #Conditional median time to event of interest for female group:
110 | kmf_f.conditional_time_to_event_
111 |
112 | #Conditional median time left for event for female group:
113 | median_time_to_event = kmf_f.conditional_time_to_event_
114 | plt.plot(median_time_to_event,label="Median Time left")
115 | plt.title("Medain time to event")
116 | plt.xlabel("Total days")
117 | plt.ylabel("Conditional median time to event")
118 | plt.legend()
119 |
120 | #Survival probability with confidence interval for male group:
121 | kmf_m.confidence_interval_survival_function_
122 |
123 | #Plot survival function with confidence interval for male group:
124 | confidence_surv_func = kmf_m.confidence_interval_survival_function_
125 | plt.plot(confidence_surv_func["Male_lower_0.95"],label="Lower")
126 | plt.plot(confidence_surv_func["Male_upper_0.95"],label="Upper")
127 | plt.title("Survival Function With Confidence Interval")
128 | plt.xlabel("Number of days")
129 | plt.ylabel("Survival Probability")
130 | plt.legend()
131 |
132 | #Survival probability with confidence interval for female group:
133 | kmf_f.confidence_interval_survival_function_
134 |
135 | #Plot survival function with confidence interval for female group:
136 | confidence_surv_func = kmf_f.confidence_interval_survival_function_
137 | plt.plot(confidence_surv_func["Female_lower_0.95"],label="Lower")
138 | plt.plot(confidence_surv_func["Female_upper_0.95"],label="Upper")
139 | plt.title("Survival Function With Confidence Interval")
140 | plt.xlabel("Number of days")
141 | plt.ylabel("Survival Probability")
142 | plt.legend()
143 |
144 | #Plot the cumulative_hazard and cumulative density:
145 | kmf_m.plot_cumulative_density(label="Male Density")
146 | naf_m.plot_cumulative_hazard(label="Male Hazard")
147 | plt.xlabel("Number of Days")
148 |
149 | #Plot the cumulative_hazard and cumulative density:
150 | kmf_f.plot_cumulative_density(label="Female Density")
151 | naf_f.plot_cumulative_hazard(label="Female Hazard")
152 | plt.xlabel("Number of Days")
153 |
154 | #Define variables for log-rank test:
155 | Time_A = Male['time']
156 | Event_A = Male['dead']
157 |
158 | Time_B = Female['time']
159 | Event_B = Female['dead']
160 |
161 | #Performing the Log-Rank test:
162 | from lifelines.statistics import logrank_test
163 |
164 | results = logrank_test(Time_A, Time_B, event_observed_A=Event_A, event_observed_B=Event_B)
165 | results.print_summary()
166 |
167 | #Print the P-value:
168 | print("P-value :",results.p_value)
169 |
--------------------------------------------------------------------------------
/survival_analysis_in_python/survival_analysis_3.py:
--------------------------------------------------------------------------------
1 | #Import required libraries:
2 | import pandas as pd
3 | import numpy as np
4 | import matplotlib.pyplot as plt
5 | from lifelines import KaplanMeierFitter
6 | from lifelines import CoxPHFitter
7 |
8 | #Read the data file:
9 | data = pd.read_csv("lung.csv")
10 | data = data.drop(["Unnamed: 0"],axis=1)
11 | data.head()
12 |
13 | #Columns of dataset:
14 | data.columns
15 |
16 | #Drop rows with null values:
17 | data= data.dropna(subset=['inst', 'time', 'status', 'age', 'sex','ph.ecog','ph.karno', 'pat.karno', 'meal.cal', 'wt.loss'])
18 | data.head()
19 |
20 | #Create an object:
21 | kmf = KaplanMeierFitter()
22 |
23 | #Organize the data:
24 | data.loc[data.status == 1, 'dead'] = 0
25 | data.loc[data.status == 2, 'dead'] = 1
26 | data.head()
27 |
28 | #Fit data into our object:
29 | kmf.fit(durations = data["time"], event_observed = data["dead"])
30 |
31 | #Get the event table:
32 | kmf.event_table
33 |
34 | #Get required columns from the data:
35 | data = data[[ 'time', 'age', 'sex', 'ph.ecog','ph.karno','pat.karno', 'meal.cal', 'wt.loss', 'dead']]
36 |
37 | #Get the summary using CoxPHFitter:
38 | cph = CoxPHFitter()
39 | cph.fit(data,"time",event_col="dead")
40 | cph.print_summary()
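
# Interpretation note (added): in the Cox model summary, exp(coef) is the hazard ratio for each
# covariate; values above 1 indicate an increased hazard (worse survival) as the covariate
# increases, and values below 1 indicate a decreased hazard.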
41 |
42 | #Plot the result on graph:
43 | cph.plot()
44 |
45 | data.iloc[10:15,:]
46 |
47 | #Plotting the data:
48 | d_data = data.iloc[10:15,:]
49 | cph.predict_survival_function(d_data).plot()
50 |
--------------------------------------------------------------------------------
/what-is-a-gpu/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/what-is-a-gpu/script.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """
3 | # What is a GPU? Are GPUs Needed for Deep Learning?
4 |
5 | * Tutorial: https://news.towardsai.net/gpu
6 | * Github: https://github.com/towardsai/tutorials/tree/master/what-is-a-gpu
7 |
8 | **GPU Usage on Your Local Computer / Google Colab**
9 |
10 | In this notebook you will connect to a GPU, and then run some basic TensorFlow operations on both the CPU and a GPU, observing the speedup provided by using the GPU.
11 |
12 | #### Enabling and testing the GPU
13 |
14 | First, you'll need to enable GPUs for the notebook:
15 |
16 | 1. Navigate to Edit→Notebook Settings
17 | 2. Select GPU from the Hardware Accelerator drop-down
18 |
19 | ### Installing TensorFlow GPU
20 |
21 | As a first step, we need to install *tensorflow-gpu*.
22 |
23 | If you are going to install it on your computer, you should follow these steps.
24 |
25 | After running this command, you will have the latest version of tensorflow-gpu.
26 | """
27 |
28 | !pip install tensorflow-gpu
29 |
30 | # Commented out IPython magic to ensure Python compatibility.
31 | # %tensorflow_version 2.x
32 | import tensorflow as tf
33 | device_name = tf.test.gpu_device_name()
34 | if device_name != '/device:GPU:0':
35 | raise SystemError('GPU device not found')
36 | print('Found GPU at: {}'.format(device_name))
37 |
38 | """If the TensorFlow version you want to use is specific, install it by entering the version name."""
39 |
40 | !pip install tensorflow-gpu==1.15.0
41 |
42 | # Commented out IPython magic to ensure Python compatibility.
43 | # %tensorflow_version 1.x
44 | import tensorflow as tf
45 | device_name = tf.test.gpu_device_name()
46 | if device_name != '/device:GPU:0':
47 | raise SystemError('GPU device not found')
48 | print('Found GPU at: {}'.format(device_name))
49 |
50 | """Configuration Specific GPU on TensorFlow"""
51 |
52 | import tensorflow as tf
53 | try:
54 | tf.device('/job:localhost/replica:0/task:0/device:GPU:1')
55 | except RuntimeError as e:
56 | print(e)
57 |
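# In practice, tf.device is used as a context manager so that the ops created inside the
# "with" block are placed on the chosen device. A minimal TF 2.x sketch, falling back to
# the CPU when no GPU is visible:
import tensorflow as tf

device = '/device:GPU:0' if tf.config.list_physical_devices('GPU') else '/device:CPU:0'
with tf.device(device):
    a = tf.random.uniform((1000, 1000))
    b = tf.random.uniform((1000, 1000))
    c = tf.matmul(a, b)                    # executed on the selected device
print(c.device)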
58 | """### Checking installed TensorFlow GPU version"""
59 |
60 | pip show tensorflow-gpu
61 |
62 | """As you can see, the latest version of TensorFlow has been installed. If you want to use a specific version distribution, it is necessary to install with the version name.
63 |
64 | If you are also getting warnings as above, it is because of:
65 | - Other libraries that came with the last version of tensorflow-gpu that we installed before are not uninstall, so they have version conflicts with the newly installed version. Decide on the version you want to use and use only that version distribution.
66 |
67 | To check the new TensorFlow version installed, work again with the command.
68 | ```
69 | pip show packagename
70 | ```
71 |
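As a quick sanity check, you can also print the installed version from Python itself:

```
import tensorflow as tf
print(tf.__version__)
```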
72 | ### Listing Available CPU and GPU Devices
73 |
74 | Next, we list all of the devices that are available for computation.
75 | """
76 |
77 | from tensorflow.python.client import device_lib
78 | device_lib.list_local_devices()
79 |
80 | """4 devices are shown in the list here. 2 of them are a concept excluding CPU and GPU.
81 |
82 | As mentioned in the docs, XLA stands for "accelerated linear algebra". It's Tensorflow's relatively new optimizing compiler that can further speed up your ML models' GPU operations by combining what used to be multiple CUDA kernels into one (simplifying because this isn't that important for your question).
83 |
84 | In the next step, the default device name used will be listed.
85 | """
86 |
87 | import tensorflow as tf
88 | tf.test.gpu_device_name()
89 |
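# In TF 2.x, the same kind of device information is also available through tf.config;
# for example (get_device_details requires a reasonably recent TF 2.x release):
import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
print(gpus)                                                    # GPUs visible to TensorFlow
if gpus:
    print(tf.config.experimental.get_device_details(gpus[0]))  # e.g. device name, compute capability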
90 | """### Speed Comparison in Model Tutorials for GPU and CPU"""
91 |
92 | import tensorflow as tf
93 | mnist = tf.keras.datasets.fashion_mnist   # Fashion-MNIST: 28x28 grayscale images, 10 classes
94 | (training_images, training_labels), (test_images, test_labels) = mnist.load_data()
95 | training_images = training_images / 255.0   # scale pixel values to [0, 1]
96 | test_images = test_images / 255.0
97 | model = tf.keras.models.Sequential([
98 |     tf.keras.layers.Flatten(),
99 |     tf.keras.layers.Dense(128, activation=tf.nn.relu),
100 |     tf.keras.layers.Dense(10, activation=tf.nn.softmax)
101 | ])
102 | model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
103 | model.fit(training_images, training_labels, epochs=5)
104 |
105 | test_loss = model.evaluate(test_images, test_labels)
106 |
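# To compare the GPU run above with the CPU run below using a number rather than just the
# progress-bar output, you can time the fit yourself with the standard time module
# (note: this trains the already-compiled model for another 5 epochs):
import time

start = time.time()
model.fit(training_images, training_labels, epochs=5)
print("Training took {:.1f} seconds".format(time.time() - start))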
107 | """As you can see, ETA times are very short. One minute to train almost this amount of data! Now, by following the steps below, let's choose the hardware CPU for the runtime and see how we will experience a difference in speed.
108 |
109 |
110 | ---
111 |
112 | You'll need to switch the notebook runtime to CPU:
113 |
114 |
115 | 1. Navigate to Edit→Notebook Settings
116 | 2. Select None from the Hardware Accelerator drop-down
117 | """
118 |
119 | import tensorflow as tf
120 | tf.test.gpu_device_name()
121 |
122 | """And now when we test whether it is using GPU or not, we see that the value of None comes out. After making sure we are using a CPU, we can provide the training for this as well."""
123 |
124 | import tensorflow as tf
125 | mnist = tf.keras.datasets.fashion_mnist
126 | (training_images, training_labels), (test_images, test_labels) = mnist.load_data()
127 | training_images = training_images / 255.0
128 | test_images = test_images / 255.0
129 | model = tf.keras.models.Sequential([
130 |     tf.keras.layers.Flatten(),
131 |     tf.keras.layers.Dense(128, activation=tf.nn.relu),
132 |     tf.keras.layers.Dense(10, activation=tf.nn.softmax)
133 | ])
134 | model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
135 | model.fit(training_images, training_labels, epochs=5)
136 |
137 | test_loss = model.evaluate(test_images, test_labels)
138 |
139 | """Here, as the number of data increases and the problem becomes more intense, the difference between training will become much wider.
140 |
141 | Since there is not a very large data set in this example line of code, there are no big differences between CPU and GPU when processing data. However, there will be a significant difference when processing big data.
142 |
143 |
144 | ---
145 |
146 | # A sample GPU setup on the local computer
147 |
148 | If you want to see your graphics card's specifications, you can run **dxdiag** on Windows, or run the command below in your **computer's terminal**.
149 |
150 | ### Checking the graphics card name
151 | """
152 |
153 | wmic path win32_VideoController get name
154 |
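# If you have an NVIDIA card and its drivers are installed, nvidia-smi is another way to
# see the GPU model, driver version, and current memory usage from the terminal:
#   nvidia-smi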
155 | """Since I do not want to work in the base area of the machine, I create a virtual environment and I do this with Conda. You can also use Mini Conda if you wish.
156 |
157 | ### Creating virtual environment
158 | """
159 |
160 | conda create -n virtualenv python=3.6
161 | conda activate virtualenv
162 |
163 | """### TensorFlow GPU Installation
164 |
165 | To use the GPU with TensorFlow, it is necessary to install the tensorflow-gpu library. If you install it with conda, the matching CUDA and cuDNN packages are listed and installed along with it.
166 | """
167 |
168 | conda install tensorflow-gpu==1.15.0
169 | #pip install tensorflow-gpu==1.15.0
170 |
171 | """After all these stages, TensorFlow GPU must be installed. If you wish, you can control the terminal with the following commands."""
172 |
173 | import tensorflow as tf
174 | sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))  # TF 1.x API: logs which device each op is placed on
175 |
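# On TF 2.x the equivalent of the Session above is to turn on device-placement logging:
import tensorflow as tf

tf.debugging.set_log_device_placement(True)            # log which device each op runs on
print(tf.reduce_sum(tf.random.uniform((100, 100))))    # run a couple of ops to produce log output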
176 | """### Keras Installation"""
177 |
178 | pip install keras==2.2.5
179 |
--------------------------------------------------------------------------------