├── Chapter 02
    └── code
    │   ├── confusion_matrix.py
    │   ├── data_multivar_nb.txt
    │   ├── data_multivar_regr.txt
    │   ├── data_preprocessor.py
    │   ├── data_singlevar_regr.txt
    │   ├── house_prices.py
    │   ├── income_classifier.py
    │   ├── income_data.txt
    │   ├── label_encoder.py
    │   ├── logistic_regression.py
    │   ├── naive_bayes.py
    │   ├── regressor_multivar.py
    │   ├── regressor_singlevar.py
    │   └── utilities.py
├── Chapter 03
    └── code
    │   ├── class_imbalance.py
    │   ├── data_decision_trees.txt
    │   ├── data_imbalance.txt
    │   ├── data_random_forests.txt
    │   ├── decision_trees.py
    │   ├── feature_importance.py
    │   ├── grid_search.py
    │   ├── random_forests.py
    │   ├── run_grid_search.py
    │   ├── traffic_data.txt
    │   ├── traffic_prediction.py
    │   └── utilities.py
├── Chapter 04
    └── code
    │   ├── clustering_quality.py
    │   ├── company_symbol_mapping.json
    │   ├── data_clustering.txt
    │   ├── data_quality.txt
    │   ├── gmm_classifier.py
    │   ├── kmeans.py
    │   ├── market_segmentation.py
    │   ├── mean_shift.py
    │   ├── sales.csv
    │   └── stocks.py
├── Chapter 05
    └── code
    │   ├── collaborative_filtering.py
    │   ├── compute_scores.py
    │   ├── data.txt
    │   ├── k_nearest_neighbors.py
    │   ├── movie_recommender.py
    │   ├── nearest_neighbors_classifier.py
    │   ├── pipeline_trainer.py
    │   └── ratings.json
├── Chapter 06
    └── code
    │   ├── adjacent_states.txt
    │   ├── coastal_states.txt
    │   ├── expression_matcher.py
    │   ├── family.py
    │   ├── prime.py
    │   ├── puzzle.py
    │   ├── relationships.json
    │   └── states.py
├── Chapter 07
    └── code
    │   ├── coloring.py
    │   ├── constrained_problem.py
    │   ├── greedy_search.py
    │   ├── maze.py
    │   ├── puzzle.py
    │   └── simpleai.zip
├── Chapter 08
    └── code
    │   ├── bit_counter.py
    │   ├── robot.py
    │   ├── symbol_regression.py
    │   ├── target_map.txt
    │   └── visualization.py
├── Chapter 09
    └── code
    │   ├── coins.py
    │   ├── connect_four.py
    │   ├── easyAI
    │       ├── .DS_Store
    │       ├── AI
    │       │   ├── .DS_Store
    │       │   ├── DUAL.py
    │       │   ├── DictTT.py
    │       │   ├── HashTT.py
    │       │   ├── Hashes.py
    │       │   ├── MTdriver.py
    │       │   ├── Negamax.py
    │       │   ├── SSS.py
    │       │   ├── TT.py
    │       │   ├── __init__.py
    │       │   └── solving.py
    │       ├── Player.py
    │       ├── TwoPlayersGame.py
    │       └── __init__.py
    │   ├── hexapawn.py
    │   └── tic_tac_toe.py
├── Chapter 10
    └── code
    │   ├── bag_of_words.py
    │   ├── category_predictor.py
    │   ├── data.txt
    │   ├── gender_identifier.py
    │   ├── lemmatizer.py
    │   ├── sentiment_analyzer.py
    │   ├── stemmer.py
    │   ├── text_chunker.py
    │   ├── tokenizer.py
    │   └── topic_modeler.py
├── Chapter 11
    └── code
    │   ├── crf.py
    │   ├── data_1D.txt
    │   ├── data_2D.txt
    │   ├── hmm.py
    │   ├── operator.py
    │   ├── slicer.py
    │   ├── stats_extractor.py
    │   ├── stock_market.py
    │   └── timeseries.py
├── Chapter 12
    └── code
    │   ├── audio_generator.py
    │   ├── audio_plotter.py
    │   ├── data
    │       ├── .DS_Store
    │       ├── apple
    │       │   ├── apple01.wav
    │       │   ├── apple02.wav
    │       │   ├── apple03.wav
    │       │   ├── apple04.wav
    │       │   ├── apple05.wav
    │       │   ├── apple06.wav
    │       │   ├── apple07.wav
    │       │   ├── apple08.wav
    │       │   ├── apple09.wav
    │       │   ├── apple10.wav
    │       │   ├── apple11.wav
    │       │   ├── apple12.wav
    │       │   ├── apple13.wav
    │       │   ├── apple14.wav
    │       │   └── apple15.wav
    │       ├── banana
    │       │   ├── banana01.wav
    │       │   ├── banana02.wav
    │       │   ├── banana03.wav
    │       │   ├── banana04.wav
    │       │   ├── banana05.wav
    │       │   ├── banana06.wav
    │       │   ├── banana07.wav
    │       │   ├── banana08.wav
    │       │   ├── banana09.wav
    │       │   ├── banana10.wav
    │       │   ├── banana11.wav
    │       │   ├── banana12.wav
    │       │   ├── banana13.wav
    │       │   ├── banana14.wav
    │       │   └── banana15.wav
    │       ├── kiwi
    │       │   ├── kiwi01.wav
    │       │   ├── kiwi02.wav
    │       │   ├── kiwi03.wav
    │       │   ├── kiwi04.wav
    │       │   ├── kiwi05.wav
    │       │   ├── kiwi06.wav
    │       │   ├── kiwi07.wav
    │       │   ├── kiwi08.wav
    │       │   ├── kiwi09.wav
    │       │   ├── kiwi10.wav
    │       │   ├── kiwi11.wav
    │       │   ├── kiwi12.wav
    │       │   ├── kiwi13.wav
    │       │   ├── kiwi14.wav
    │       │   └── kiwi15.wav
    │       ├── lime
    │       │   ├── lime01.wav
    │       │   ├── lime02.wav
    │       │   ├── lime03.wav
    │       │   ├── lime04.wav
    │       │   ├── lime05.wav
    │       │   ├── lime06.wav
    │       │   ├── lime07.wav
    │       │   ├── lime08.wav
    │       │   ├── lime09.wav
    │       │   ├── lime10.wav
    │       │   ├── lime11.wav
    │       │   ├── lime12.wav
    │       │   ├── lime13.wav
    │       │   ├── lime14.wav
    │       │   └── lime15.wav
    │       ├── orange
    │       │   ├── orange01.wav
    │       │   ├── orange02.wav
    │       │   ├── orange03.wav
    │       │   ├── orange04.wav
    │       │   ├── orange05.wav
    │       │   ├── orange06.wav
    │       │   ├── orange07.wav
    │       │   ├── orange08.wav
    │       │   ├── orange09.wav
    │       │   ├── orange10.wav
    │       │   ├── orange11.wav
    │       │   ├── orange12.wav
    │       │   ├── orange13.wav
    │       │   ├── orange14.wav
    │       │   └── orange15.wav
    │       ├── peach
    │       │   ├── peach01.wav
    │       │   ├── peach02.wav
    │       │   ├── peach03.wav
    │       │   ├── peach04.wav
    │       │   ├── peach05.wav
    │       │   ├── peach06.wav
    │       │   ├── peach07.wav
    │       │   ├── peach08.wav
    │       │   ├── peach09.wav
    │       │   ├── peach10.wav
    │       │   ├── peach11.wav
    │       │   ├── peach12.wav
    │       │   ├── peach13.wav
    │       │   ├── peach14.wav
    │       │   └── peach15.wav
    │       └── pineapple
    │       │   ├── pineapple01.wav
    │       │   ├── pineapple02.wav
    │       │   ├── pineapple03.wav
    │       │   ├── pineapple04.wav
    │       │   ├── pineapple05.wav
    │       │   ├── pineapple06.wav
    │       │   ├── pineapple07.wav
    │       │   ├── pineapple08.wav
    │       │   ├── pineapple09.wav
    │       │   ├── pineapple10.wav
    │       │   ├── pineapple11.wav
    │       │   ├── pineapple12.wav
    │       │   ├── pineapple13.wav
    │       │   ├── pineapple14.wav
    │       │   └── pineapple15.wav
    │   ├── feature_extractor.py
    │   ├── features
    │       ├── .DS_Store
    │       ├── __init__.py
    │       ├── base.py
    │       └── sigproc.py
    │   ├── frequency_transformer.py
    │   ├── random_sound.wav
    │   ├── speech_recognizer.py
    │   ├── spoken_word.wav
    │   ├── synthesizer.py
    │   └── tone_mapping.json
├── Chapter 13
    └── code
    │   ├── background_subtraction.py
    │   ├── camshift.py
    │   ├── colorspaces.py
    │   ├── eye_detector.py
    │   ├── face_detector.py
    │   ├── frame_diff.py
    │   ├── haar_cascade_files
    │       ├── .DS_Store
    │       ├── haarcascade_eye.xml
    │       ├── haarcascade_frontalface_default.xml
    │       └── haarcascade_mcs_nose.xml
    │   └── optical_flow.py
├── Chapter 14
    └── code
    │   ├── character_visualizer.py
    │   ├── data_perceptron.txt
    │   ├── data_simple_nn.txt
    │   ├── data_vector_quantization.txt
    │   ├── letter.data
    │   ├── multilayer_neural_network.py
    │   ├── ocr.py
    │   ├── perceptron_classifier.py
    │   ├── recurrent_neural_network.py
    │   ├── simple_neural_network.py
    │   └── vector_quantizer.py
├── Chapter 15
    └── code
    │   ├── balancer.py
    │   └── run_environment.py
├── Chapter 16
    └── code
    │   ├── cnn.py
    │   ├── linear_regession.py
    │   └── single_layer.py
├── LICENSE
└── README.md


/Chapter 02/code/confusion_matrix.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | from sklearn.metrics import confusion_matrix
 4 | from sklearn.metrics import classification_report
 5 | 
 6 | # Define sample labels
 7 | true_labels = [2, 0, 0, 2, 4, 4, 1, 0, 3, 3, 3]
 8 | pred_labels = [2, 1, 0, 2, 4, 3, 1, 0, 1, 3, 3]
 9 | 
10 | # Create confusion matrix
11 | confusion_mat = confusion_matrix(true_labels, pred_labels)
12 | 
13 | # Visualize confusion matrix
14 | plt.imshow(confusion_mat, interpolation='nearest', cmap=plt.cm.gray)
15 | plt.title('Confusion matrix')
16 | plt.colorbar()
17 | ticks = np.arange(5)
18 | plt.xticks(ticks, ticks)
19 | plt.yticks(ticks, ticks)
20 | plt.ylabel('True labels')
21 | plt.xlabel('Predicted labels')
22 | plt.show()
23 | 
24 | # Classification report
25 | targets = ['Class-0', 'Class-1', 'Class-2', 'Class-3', 'Class-4']
26 | print('\n', classification_report(true_labels, pred_labels, target_names=targets))
27 | 
28 | 


--------------------------------------------------------------------------------
/Chapter 02/code/data_preprocessor.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | from sklearn import preprocessing
 3 | 
 4 | input_data = np.array([[5.1, -2.9, 3.3],
 5 |                        [-1.2, 7.8, -6.1],
 6 |                        [3.9, 0.4, 2.1],
 7 |                        [7.3, -9.9, -4.5]])
 8 | 
 9 | # Binarize data 
10 | data_binarized = preprocessing.Binarizer(threshold=2.1).transform(input_data)
11 | print("\nBinarized data:\n", data_binarized)
12 | 
13 | # Print mean and standard deviation
14 | print("\nBEFORE:")
15 | print("Mean =", input_data.mean(axis=0))
16 | print("Std deviation =", input_data.std(axis=0))
17 | 
18 | # Remove mean
19 | data_scaled = preprocessing.scale(input_data)
20 | print("\nAFTER:")
21 | print("Mean =", data_scaled.mean(axis=0))
22 | print("Std deviation =", data_scaled.std(axis=0))
23 | 
24 | # Min max scaling
25 | data_scaler_minmax = preprocessing.MinMaxScaler(feature_range=(0, 1))
26 | data_scaled_minmax = data_scaler_minmax.fit_transform(input_data)
27 | print("\nMin max scaled data:\n", data_scaled_minmax)
28 | 
29 | # Normalize data
30 | data_normalized_l1 = preprocessing.normalize(input_data, norm='l1')
31 | data_normalized_l2 = preprocessing.normalize(input_data, norm='l2')
32 | print("\nL1 normalized data:\n", data_normalized_l1)
33 | print("\nL2 normalized data:\n", data_normalized_l2)
34 | 
35 | 


--------------------------------------------------------------------------------
/Chapter 02/code/data_singlevar_regr.txt:
--------------------------------------------------------------------------------
 1 | -0.86,4.38
 2 | 2.58,6.97
 3 | 4.17,7.01
 4 | 2.6,5.44
 5 | 5.13,6.45
 6 | 3.23,5.49
 7 | -0.26,4.25
 8 | 2.76,5.94
 9 | 0.47,4.8
10 | -3.9,2.7
11 | 0.27,3.26
12 | 2.88,6.48
13 | -0.54,4.08
14 | -4.39,0.09
15 | -1.12,2.74
16 | 2.09,5.8
17 | -5.78,0.16
18 | 1.77,4.97
19 | -7.91,-2.26
20 | 4.86,5.75
21 | -2.17,3.33
22 | 1.38,5.26
23 | 0.54,4.43
24 | 3.12,6.6
25 | -2.19,3.77
26 | -0.33,2.4
27 | -1.21,2.98
28 | -4.52,0.29
29 | -0.46,2.47
30 | -1.13,4.08
31 | 4.61,8.97
32 | 0.31,3.94
33 | 0.25,3.46
34 | -2.67,2.46
35 | -4.66,1.14
36 | -0.2,4.31
37 | -0.52,1.97
38 | 1.24,4.83
39 | -2.53,3.12
40 | -0.34,4.97
41 | 5.74,8.65
42 | -0.34,3.59
43 | 0.99,3.66
44 | 5.01,7.54
45 | -2.38,1.52
46 | -0.56,4.55
47 | -1.01,3.23
48 | 0.47,4.39
49 | 4.81,7.04
50 | 2.38,4.46
51 | -3.32,2.41
52 | -3.86,1.11
53 | -1.41,3.23
54 | 6.04,7.46
55 | 4.18,5.71
56 | -0.78,3.59
57 | -2.2,2.93
58 | 0.76,4.16
59 | 2.02,6.43
60 | 0.42,4.92
61 | 


--------------------------------------------------------------------------------
/Chapter 02/code/house_prices.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | from sklearn import datasets
 3 | from sklearn.svm import SVR
 4 | from sklearn.metrics import mean_squared_error, explained_variance_score
 5 | from sklearn.utils import shuffle
 6 | 
 7 | # Load housing data
 8 | data = datasets.load_boston() 
 9 | 
10 | # Shuffle the data
11 | X, y = shuffle(data.data, data.target, random_state=7)
12 | 
13 | # Split the data into training and testing datasets 
14 | num_training = int(0.8 * len(X))
15 | X_train, y_train = X[:num_training], y[:num_training]
16 | X_test, y_test = X[num_training:], y[num_training:]
17 | 
18 | # Create Support Vector Regression model
19 | sv_regressor = SVR(kernel='linear', C=1.0, epsilon=0.1)
20 | 
21 | # Train Support Vector Regressor
22 | sv_regressor.fit(X_train, y_train)
23 | 
24 | # Evaluate performance of Support Vector Regressor
25 | y_test_pred = sv_regressor.predict(X_test)
26 | mse = mean_squared_error(y_test, y_test_pred)
27 | evs = explained_variance_score(y_test, y_test_pred) 
28 | print("\n#### Performance ####")
29 | print("Mean squared error =", round(mse, 2))
30 | print("Explained variance score =", round(evs, 2))
31 | 
32 | # Test the regressor on test datapoint
33 | test_data = [3.7, 0, 18.4, 1, 0.87, 5.95, 91, 2.5052, 26, 666, 20.2, 351.34, 15.27]
34 | print("\nPredicted price:", sv_regressor.predict([test_data])[0])
35 | 
36 | 


--------------------------------------------------------------------------------
/Chapter 02/code/income_classifier.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | from sklearn import preprocessing
 4 | from sklearn.svm import LinearSVC
 5 | from sklearn.multiclass import OneVsOneClassifier
 6 | from sklearn import cross_validation
 7 | 
 8 | # Input file containing data
 9 | input_file = 'income_data.txt'
10 | 
11 | # Read the data
12 | X = []
13 | y = []
14 | count_class1 = 0
15 | count_class2 = 0
16 | max_datapoints = 25000
17 | 
18 | with open(input_file, 'r') as f:
19 |     for line in f.readlines():
20 |         if count_class1 >= max_datapoints and count_class2 >= max_datapoints:
21 |             break
22 | 
23 |         if '?' in line:
24 |             continue
25 | 
26 |         data = line[:-1].split(', ')
27 | 
28 |         if data[-1] == '<=50K' and count_class1 < max_datapoints:
29 |             X.append(data)
30 |             count_class1 += 1
31 | 
32 |         if data[-1] == '>50K' and count_class2 < max_datapoints:
33 |             X.append(data)
34 |             count_class2 += 1
35 | 
36 | # Convert to numpy array
37 | X = np.array(X)
38 | 
39 | # Convert string data to numerical data
40 | label_encoder = [] 
41 | X_encoded = np.empty(X.shape)
42 | for i,item in enumerate(X[0]):
43 |     if item.isdigit(): 
44 |         X_encoded[:, i] = X[:, i]
45 |     else:
46 |         label_encoder.append(preprocessing.LabelEncoder())
47 |         X_encoded[:, i] = label_encoder[-1].fit_transform(X[:, i])
48 | 
49 | X = X_encoded[:, :-1].astype(int)
50 | y = X_encoded[:, -1].astype(int)
51 | 
52 | # Create SVM classifier
53 | classifier = OneVsOneClassifier(LinearSVC(random_state=0))
54 | 
55 | # Train the classifier
56 | classifier.fit(X, y)
57 | 
58 | # Cross validation
59 | X_train, X_test, y_train, y_test = cross_validation.train_test_split(X, y, test_size=0.2, random_state=5)
60 | classifier = OneVsOneClassifier(LinearSVC(random_state=0))
61 | classifier.fit(X_train, y_train)
62 | y_test_pred = classifier.predict(X_test)
63 | 
64 | # Compute the F1 score of the SVM classifier
65 | f1 = cross_validation.cross_val_score(classifier, X, y, scoring='f1_weighted', cv=3)
66 | print("F1 score: " + str(round(100*f1.mean(), 2)) + "%")
67 | 
68 | # Predict output for a test datapoint
69 | input_data = ['37', 'Private', '215646', 'HS-grad', '9', 'Never-married', 'Handlers-cleaners', 'Not-in-family', 'White', 'Male', '0', '0', '40', 'United-States']
70 | 
71 | # Encode test datapoint
72 | input_data_encoded = [-1] * len(input_data)
73 | count = 0
74 | for i, item in enumerate(input_data):
75 |     if item.isdigit():
76 |         input_data_encoded[i] = int(input_data[i])
77 |     else:
78 |         input_data_encoded[i] = int(label_encoder[count].transform(input_data[i]))
79 |         count += 1 
80 | 
81 | input_data_encoded = np.array(input_data_encoded)
82 | 
83 | # Run classifier on encoded datapoint and print output
84 | predicted_class = classifier.predict(input_data_encoded)
85 | print(label_encoder[-1].inverse_transform(predicted_class)[0])
86 | 
87 | 


--------------------------------------------------------------------------------
/Chapter 02/code/label_encoder.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | from sklearn import preprocessing
 3 | 
 4 | # Sample input labels
 5 | input_labels = ['red', 'black', 'red', 'green', 'black', 'yellow', 'white']
 6 | 
 7 | # Create label encoder and fit the labels
 8 | encoder = preprocessing.LabelEncoder()
 9 | encoder.fit(input_labels)
10 | 
11 | # Print the mapping 
12 | print("\nLabel mapping:")
13 | for i, item in enumerate(encoder.classes_):
14 |     print(item, '-->', i)
15 | 
16 | # Encode a set of labels using the encoder
17 | test_labels = ['green', 'red', 'black']
18 | encoded_values = encoder.transform(test_labels)
19 | print("\nLabels =", test_labels)
20 | print("Encoded values =", list(encoded_values))
21 | 
22 | # Decode a set of values using the encoder
23 | encoded_values = [3, 0, 4, 1]
24 | decoded_list = encoder.inverse_transform(encoded_values)
25 | print("\nEncoded values =", encoded_values)
26 | print("Decoded labels =", list(decoded_list))
27 | 
28 | 


--------------------------------------------------------------------------------
/Chapter 02/code/logistic_regression.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | from sklearn import linear_model 
 3 | import matplotlib.pyplot as plt
 4 | 
 5 | from utilities import visualize_classifier
 6 | 
 7 | # Define sample input data
 8 | X = np.array([[3.1, 7.2], [4, 6.7], [2.9, 8], [5.1, 4.5], [6, 5], [5.6, 5], [3.3, 0.4], [3.9, 0.9], [2.8, 1], [0.5, 3.4], [1, 4], [0.6, 4.9]])
 9 | y = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3])
10 | 
11 | # Create the logistic regression classifier
12 | classifier = linear_model.LogisticRegression(solver='liblinear', C=1)
13 | #classifier = linear_model.LogisticRegression(solver='liblinear', C=100)
14 | 
15 | # Train the classifier
16 | classifier.fit(X, y)
17 | 
18 | # Visualize the performance of the classifier 
19 | visualize_classifier(classifier, X, y)
20 | 


--------------------------------------------------------------------------------
/Chapter 02/code/naive_bayes.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | from sklearn.naive_bayes import GaussianNB 
 4 | from sklearn import cross_validation
 5 | 
 6 | from utilities import visualize_classifier
 7 | 
 8 | # Input file containing data
 9 | input_file = 'data_multivar_nb.txt'
10 | 
11 | # Load data from input file
12 | data = np.loadtxt(input_file, delimiter=',')
13 | X, y = data[:, :-1], data[:, -1] 
14 | 
15 | # Create Naive Bayes classifier 
16 | classifier = GaussianNB()
17 | 
18 | # Train the classifier
19 | classifier.fit(X, y)
20 | 
21 | # Predict the values for training data
22 | y_pred = classifier.predict(X)
23 | 
24 | # Compute accuracy
25 | accuracy = 100.0 * (y == y_pred).sum() / X.shape[0]
26 | print("Accuracy of Naive Bayes classifier =", round(accuracy, 2), "%")
27 | 
28 | # Visualize the performance of the classifier
29 | visualize_classifier(classifier, X, y)
30 | 
31 | ###############################################
32 | # Cross validation 
33 | 
34 | # Split data into training and test data 
35 | X_train, X_test, y_train, y_test = cross_validation.train_test_split(X, y, test_size=0.2, random_state=3)
36 | classifier_new = GaussianNB()
37 | classifier_new.fit(X_train, y_train)
38 | y_test_pred = classifier_new.predict(X_test)
39 | 
40 | # compute accuracy of the classifier
41 | accuracy = 100.0 * (y_test == y_test_pred).sum() / X_test.shape[0]
42 | print("Accuracy of the new classifier =", round(accuracy, 2), "%")
43 | 
44 | # Visualize the performance of the classifier
45 | visualize_classifier(classifier_new, X_test, y_test)
46 | 
47 | ###############################################
48 | # Scoring functions
49 | 
50 | num_folds = 3
51 | accuracy_values = cross_validation.cross_val_score(classifier, 
52 |         X, y, scoring='accuracy', cv=num_folds)
53 | print("Accuracy: " + str(round(100*accuracy_values.mean(), 2)) + "%")
54 | 
55 | precision_values = cross_validation.cross_val_score(classifier, 
56 |         X, y, scoring='precision_weighted', cv=num_folds)
57 | print("Precision: " + str(round(100*precision_values.mean(), 2)) + "%")
58 | 
59 | recall_values = cross_validation.cross_val_score(classifier, 
60 |         X, y, scoring='recall_weighted', cv=num_folds)
61 | print("Recall: " + str(round(100*recall_values.mean(), 2)) + "%")
62 | 
63 | f1_values = cross_validation.cross_val_score(classifier, 
64 |         X, y, scoring='f1_weighted', cv=num_folds)
65 | print("F1: " + str(round(100*f1_values.mean(), 2)) + "%")
66 | 
67 | 


--------------------------------------------------------------------------------
/Chapter 02/code/regressor_multivar.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | from sklearn import linear_model
 3 | import sklearn.metrics as sm
 4 | from sklearn.preprocessing import PolynomialFeatures
 5 | 
 6 | # Input file containing data
 7 | input_file = 'data_multivar_regr.txt'
 8 | 
 9 | # Load the data from the input file
10 | data = np.loadtxt(input_file, delimiter=',')
11 | X, y = data[:, :-1], data[:, -1]
12 | 
13 | # Split data into training and testing 
14 | num_training = int(0.8 * len(X))
15 | num_test = len(X) - num_training
16 | 
17 | # Training data
18 | X_train, y_train = X[:num_training], y[:num_training]
19 | 
20 | # Test data
21 | X_test, y_test = X[num_training:], y[num_training:]
22 | 
23 | # Create the linear regressor model
24 | linear_regressor = linear_model.LinearRegression()
25 | 
26 | # Train the model using the training sets
27 | linear_regressor.fit(X_train, y_train)
28 | 
29 | # Predict the output
30 | y_test_pred = linear_regressor.predict(X_test)
31 | 
32 | # Measure performance
33 | print("Linear Regressor performance:")
34 | print("Mean absolute error =", round(sm.mean_absolute_error(y_test, y_test_pred), 2))
35 | print("Mean squared error =", round(sm.mean_squared_error(y_test, y_test_pred), 2))
36 | print("Median absolute error =", round(sm.median_absolute_error(y_test, y_test_pred), 2))
37 | print("Explained variance score =", round(sm.explained_variance_score(y_test, y_test_pred), 2))
38 | print("R2 score =", round(sm.r2_score(y_test, y_test_pred), 2))
39 | 
40 | # Polynomial regression
41 | polynomial = PolynomialFeatures(degree=10)
42 | X_train_transformed = polynomial.fit_transform(X_train)
43 | datapoint = [[7.75, 6.35, 5.56]]
44 | poly_datapoint = polynomial.fit_transform(datapoint)
45 | 
46 | poly_linear_model = linear_model.LinearRegression()
47 | poly_linear_model.fit(X_train_transformed, y_train)
48 | print("\nLinear regression:\n", linear_regressor.predict(datapoint))
49 | print("\nPolynomial regression:\n", poly_linear_model.predict(poly_datapoint))
50 | 
51 | 


--------------------------------------------------------------------------------
/Chapter 02/code/regressor_singlevar.py:
--------------------------------------------------------------------------------
 1 | import pickle
 2 | 
 3 | import numpy as np
 4 | from sklearn import linear_model
 5 | import sklearn.metrics as sm
 6 | import matplotlib.pyplot as plt
 7 | 
 8 | # Input file containing data
 9 | input_file = 'data_singlevar_regr.txt' 
10 | 
11 | # Read data
12 | data = np.loadtxt(input_file, delimiter=',')
13 | X, y = data[:, :-1], data[:, -1]
14 | 
15 | # Train and test split
16 | num_training = int(0.8 * len(X))
17 | num_test = len(X) - num_training
18 | 
19 | # Training data
20 | X_train, y_train = X[:num_training], y[:num_training]
21 | 
22 | # Test data
23 | X_test, y_test = X[num_training:], y[num_training:]
24 | 
25 | # Create linear regressor object
26 | regressor = linear_model.LinearRegression()
27 | 
28 | # Train the model using the training sets
29 | regressor.fit(X_train, y_train)
30 | 
31 | # Predict the output
32 | y_test_pred = regressor.predict(X_test)
33 | 
34 | # Plot outputs
35 | plt.scatter(X_test, y_test, color='green')
36 | plt.plot(X_test, y_test_pred, color='black', linewidth=4)
37 | plt.xticks(())
38 | plt.yticks(())
39 | plt.show()
40 | 
41 | # Compute performance metrics
42 | print("Linear regressor performance:")
43 | print("Mean absolute error =", round(sm.mean_absolute_error(y_test, y_test_pred), 2))
44 | print("Mean squared error =", round(sm.mean_squared_error(y_test, y_test_pred), 2)) 
45 | print("Median absolute error =", round(sm.median_absolute_error(y_test, y_test_pred), 2)) 
46 | print("Explain variance score =", round(sm.explained_variance_score(y_test, y_test_pred), 2))
47 | print("R2 score =", round(sm.r2_score(y_test, y_test_pred), 2))
48 | 
49 | # Model persistence
50 | output_model_file = 'model.pkl'
51 | 
52 | # Save the model
53 | with open(output_model_file, 'wb') as f:
54 |     pickle.dump(regressor, f)
55 | 
56 | # Load the model
57 | with open(output_model_file, 'rb') as f:
58 |     regressor_model = pickle.load(f)
59 | 
60 | # Perform prediction on test data
61 | y_test_pred_new = regressor_model.predict(X_test)
62 | print("\nNew mean absolute error =", round(sm.mean_absolute_error(y_test, y_test_pred_new), 2))
63 | 
64 | 


--------------------------------------------------------------------------------
/Chapter 02/code/utilities.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | 
 4 | def visualize_classifier(classifier, X, y):
 5 |     # Define the minimum and maximum values for X and Y
 6 |     # that will be used in the mesh grid
 7 |     min_x, max_x = X[:, 0].min() - 1.0, X[:, 0].max() + 1.0
 8 |     min_y, max_y = X[:, 1].min() - 1.0, X[:, 1].max() + 1.0
 9 | 
10 |     # Define the step size to use in plotting the mesh grid 
11 |     mesh_step_size = 0.01
12 | 
13 |     # Define the mesh grid of X and Y values
14 |     x_vals, y_vals = np.meshgrid(np.arange(min_x, max_x, mesh_step_size), np.arange(min_y, max_y, mesh_step_size))
15 | 
16 |     # Run the classifier on the mesh grid
17 |     output = classifier.predict(np.c_[x_vals.ravel(), y_vals.ravel()])
18 | 
19 |     # Reshape the output array
20 |     output = output.reshape(x_vals.shape)
21 | 
22 |     # Create a plot
23 |     plt.figure()
24 | 
25 |     # Choose a color scheme for the plot 
26 |     plt.pcolormesh(x_vals, y_vals, output, cmap=plt.cm.gray)
27 | 
28 |     # Overlay the training points on the plot 
29 |     plt.scatter(X[:, 0], X[:, 1], c=y, s=75, edgecolors='black', linewidth=1, cmap=plt.cm.Paired)
30 | 
31 |     # Specify the boundaries of the plot
32 |     plt.xlim(x_vals.min(), x_vals.max())
33 |     plt.ylim(y_vals.min(), y_vals.max())
34 | 
35 |     # Specify the ticks on the X and Y axes
36 |     plt.xticks((np.arange(int(X[:, 0].min() - 1), int(X[:, 0].max() + 1), 1.0)))
37 |     plt.yticks((np.arange(int(X[:, 1].min() - 1), int(X[:, 1].max() + 1), 1.0)))
38 | 
39 |     plt.show()
40 | 
41 | 


--------------------------------------------------------------------------------
/Chapter 03/code/class_imbalance.py:
--------------------------------------------------------------------------------
 1 | import sys
 2 | 
 3 | import numpy as np
 4 | import matplotlib.pyplot as plt
 5 | from sklearn.ensemble import ExtraTreesClassifier 
 6 | from sklearn import cross_validation
 7 | from sklearn.metrics import classification_report
 8 | 
 9 | from utilities import visualize_classifier
10 | 
11 | # Load input data
12 | input_file = 'data_imbalance.txt'
13 | data = np.loadtxt(input_file, delimiter=',')
14 | X, y = data[:, :-1], data[:, -1]
15 | 
16 | # Separate input data into two classes based on labels
17 | class_0 = np.array(X[y==0])
18 | class_1 = np.array(X[y==1])
19 | 
20 | # Visualize input data
21 | plt.figure()
22 | plt.scatter(class_0[:, 0], class_0[:, 1], s=75, facecolors='black', 
23 |                 edgecolors='black', linewidth=1, marker='x')
24 | plt.scatter(class_1[:, 0], class_1[:, 1], s=75, facecolors='white', 
25 |                 edgecolors='black', linewidth=1, marker='o')
26 | plt.title('Input data')
27 | 
28 | # Split data into training and testing datasets 
29 | X_train, X_test, y_train, y_test = cross_validation.train_test_split(
30 |         X, y, test_size=0.25, random_state=5)
31 | 
32 | # Extremely Random Forests classifier
33 | params = {'n_estimators': 100, 'max_depth': 4, 'random_state': 0}
34 | if len(sys.argv) > 1:
35 |     if sys.argv[1] == 'balance':
36 |         params = {'n_estimators': 100, 'max_depth': 4, 'random_state': 0, 'class_weight': 'balanced'}
37 |     else:
38 |         raise TypeError("Invalid input argument; should be 'balance'")
39 | 
40 | classifier = ExtraTreesClassifier(**params)
41 | classifier.fit(X_train, y_train)
42 | visualize_classifier(classifier, X_train, y_train, 'Training dataset')
43 | 
44 | y_test_pred = classifier.predict(X_test)
45 | visualize_classifier(classifier, X_test, y_test, 'Test dataset')
46 | 
47 | # Evaluate classifier performance
48 | class_names = ['Class-0', 'Class-1']
49 | print("\n" + "#"*40)
50 | print("\nClassifier performance on training dataset\n")
51 | print(classification_report(y_train, classifier.predict(X_train), target_names=class_names))
52 | print("#"*40 + "\n")
53 | 
54 | print("#"*40)
55 | print("\nClassifier performance on test dataset\n")
56 | print(classification_report(y_test, y_test_pred, target_names=class_names))
57 | print("#"*40 + "\n")
58 | 
59 | plt.show()
60 | 


--------------------------------------------------------------------------------
/Chapter 03/code/decision_trees.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | from sklearn.metrics import classification_report
 4 | from sklearn import cross_validation
 5 | from sklearn.tree import DecisionTreeClassifier
 6 | 
 7 | from utilities import visualize_classifier
 8 | 
 9 | # Load input data
10 | input_file = 'data_decision_trees.txt'
11 | data = np.loadtxt(input_file, delimiter=',')
12 | X, y = data[:, :-1], data[:, -1]
13 | 
14 | # Separate input data into two classes based on labels
15 | class_0 = np.array(X[y==0])
16 | class_1 = np.array(X[y==1])
17 | 
18 | # Visualize input data
19 | plt.figure()
20 | plt.scatter(class_0[:, 0], class_0[:, 1], s=75, facecolors='black', 
21 |         edgecolors='black', linewidth=1, marker='x')
22 | plt.scatter(class_1[:, 0], class_1[:, 1], s=75, facecolors='white', 
23 |         edgecolors='black', linewidth=1, marker='o')
24 | plt.title('Input data')
25 | 
26 | # Split data into training and testing datasets 
27 | X_train, X_test, y_train, y_test = cross_validation.train_test_split(
28 |         X, y, test_size=0.25, random_state=5)
29 | 
30 | # Decision Trees classifier 
31 | params = {'random_state': 0, 'max_depth': 4}
32 | classifier = DecisionTreeClassifier(**params)
33 | classifier.fit(X_train, y_train)
34 | visualize_classifier(classifier, X_train, y_train, 'Training dataset')
35 | 
36 | y_test_pred = classifier.predict(X_test)
37 | visualize_classifier(classifier, X_test, y_test, 'Test dataset')
38 | 
39 | # Evaluate classifier performance
40 | class_names = ['Class-0', 'Class-1']
41 | print("\n" + "#"*40)
42 | print("\nClassifier performance on training dataset\n")
43 | print(classification_report(y_train, classifier.predict(X_train), target_names=class_names))
44 | print("#"*40 + "\n")
45 | 
46 | print("#"*40)
47 | print("\nClassifier performance on test dataset\n")
48 | print(classification_report(y_test, y_test_pred, target_names=class_names))
49 | print("#"*40 + "\n")
50 | 
51 | plt.show()
52 | 


--------------------------------------------------------------------------------
/Chapter 03/code/feature_importance.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | from sklearn.tree import DecisionTreeRegressor
 4 | from sklearn.ensemble import AdaBoostRegressor
 5 | from sklearn import datasets
 6 | from sklearn.metrics import mean_squared_error, explained_variance_score
 7 | from sklearn import cross_validation
 8 | from sklearn.utils import shuffle
 9 | 
10 | from utilities import visualize_feature_importances
11 | 
12 | # Load housing data
13 | housing_data = datasets.load_boston() 
14 | 
15 | # Shuffle the data
16 | X, y = shuffle(housing_data.data, housing_data.target, random_state=7)
17 | 
18 | # Split data into training and testing datasets 
19 | X_train, X_test, y_train, y_test = cross_validation.train_test_split(
20 |         X, y, test_size=0.2, random_state=7)
21 | 
22 | # AdaBoost Regressor model
23 | regressor = AdaBoostRegressor(DecisionTreeRegressor(max_depth=4), 
24 |         n_estimators=400, random_state=7)
25 | regressor.fit(X_train, y_train)
26 | 
27 | # Evaluate performance of AdaBoost regressor
28 | y_pred = regressor.predict(X_test)
29 | mse = mean_squared_error(y_test, y_pred)
30 | evs = explained_variance_score(y_test, y_pred )
31 | print("\nADABOOST REGRESSOR")
32 | print("Mean squared error =", round(mse, 2))
33 | print("Explained variance score =", round(evs, 2))
34 | 
35 | # Extract feature importances
36 | feature_importances = regressor.feature_importances_
37 | feature_names = housing_data.feature_names
38 | 
39 | # Normalize the importance values 
40 | feature_importances = 100.0 * (feature_importances / max(feature_importances))
41 | 
42 | # Sort the values and flip them
43 | index_sorted = np.flipud(np.argsort(feature_importances))
44 | 
45 | # Arrange the X ticks
46 | pos = np.arange(index_sorted.shape[0]) + 0.5
47 | 
48 | # Plot the bar graph
49 | plt.figure()
50 | plt.bar(pos, feature_importances[index_sorted], align='center')
51 | plt.xticks(pos, feature_names[index_sorted])
52 | plt.ylabel('Relative Importance')
53 | plt.title('Feature importance using AdaBoost regressor')
54 | plt.show()
55 | 
56 | 


--------------------------------------------------------------------------------
/Chapter 03/code/grid_search.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | from sklearn.metrics import classification_report
 4 | from sklearn import cross_validation, grid_search
 5 | from sklearn.ensemble import ExtraTreesClassifier
 6 | from sklearn import cross_validation
 7 | from sklearn.metrics import classification_report
 8 | 
 9 | from utilities import visualize_classifier
10 | 
11 | # Load input data
12 | input_file = 'data_random_forests.txt'
13 | data = np.loadtxt(input_file, delimiter=',')
14 | X, y = data[:, :-1], data[:, -1]
15 | 
16 | # Separate input data into three classes based on labels
17 | class_0 = np.array(X[y==0])
18 | class_1 = np.array(X[y==1])
19 | class_2 = np.array(X[y==2])
20 | 
21 | # Split the data into training and testing datasets 
22 | X_train, X_test, y_train, y_test = cross_validation.train_test_split(
23 |         X, y, test_size=0.25, random_state=5)
24 | 
25 | # Define the parameter grid 
26 | parameter_grid = [ {'n_estimators': [100], 'max_depth': [2, 4, 7, 12, 16]},
27 |                    {'max_depth': [4], 'n_estimators': [25, 50, 100, 250]}
28 |                  ]
29 | 
30 | metrics = ['precision_weighted', 'recall_weighted']
31 | 
32 | for metric in metrics:
33 |     print("\n##### Searching optimal parameters for", metric)
34 | 
35 |     classifier = grid_search.GridSearchCV(
36 |             ExtraTreesClassifier(random_state=0), 
37 |             parameter_grid, cv=5, scoring=metric)
38 |     classifier.fit(X_train, y_train)
39 | 
40 |     print("\nGrid scores for the parameter grid:")
41 |     for params, avg_score, _ in classifier.grid_scores_:
42 |         print(params, '-->', round(avg_score, 3))
43 | 
44 |     print("\nBest parameters:", classifier.best_params_)
45 | 
46 |     y_pred = classifier.predict(X_test)
47 |     print("\nPerformance report:\n")
48 |     print(classification_report(y_test, y_pred))
49 | 
50 | 


--------------------------------------------------------------------------------
/Chapter 03/code/random_forests.py:
--------------------------------------------------------------------------------
 1 | import argparse 
 2 | 
 3 | import numpy as np
 4 | import matplotlib.pyplot as plt
 5 | from sklearn.metrics import classification_report
 6 | from sklearn import cross_validation
 7 | from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier
 8 | from sklearn import cross_validation
 9 | from sklearn.metrics import classification_report
10 | 
11 | from utilities import visualize_classifier
12 | 
13 | # Argument parser 
14 | def build_arg_parser():
15 |     parser = argparse.ArgumentParser(description='Classify data using \
16 |             Ensemble Learning techniques')
17 |     parser.add_argument('--classifier-type', dest='classifier_type', 
18 |             required=True, choices=['rf', 'erf'], help="Type of classifier \
19 |                     to use; can be either 'rf' or 'erf'")
20 |     return parser
21 | 
22 | if __name__=='__main__':
23 |     # Parse the input arguments
24 |     args = build_arg_parser().parse_args()
25 |     classifier_type = args.classifier_type
26 | 
27 |     # Load input data
28 |     input_file = 'data_random_forests.txt'
29 |     data = np.loadtxt(input_file, delimiter=',')
30 |     X, y = data[:, :-1], data[:, -1]
31 | 
32 |     # Separate input data into three classes based on labels
33 |     class_0 = np.array(X[y==0])
34 |     class_1 = np.array(X[y==1])
35 |     class_2 = np.array(X[y==2])
36 | 
37 |     # Visualize input data
38 |     plt.figure()
39 |     plt.scatter(class_0[:, 0], class_0[:, 1], s=75, facecolors='white', 
40 |                     edgecolors='black', linewidth=1, marker='s')
41 |     plt.scatter(class_1[:, 0], class_1[:, 1], s=75, facecolors='white', 
42 |                     edgecolors='black', linewidth=1, marker='o')
43 |     plt.scatter(class_2[:, 0], class_2[:, 1], s=75, facecolors='white', 
44 |                     edgecolors='black', linewidth=1, marker='^')
45 |     plt.title('Input data')
46 | 
47 |     # Split data into training and testing datasets 
48 |     X_train, X_test, y_train, y_test = cross_validation.train_test_split(
49 |             X, y, test_size=0.25, random_state=5)
50 | 
51 |     # Ensemble Learning classifier
52 |     params = {'n_estimators': 100, 'max_depth': 4, 'random_state': 0}
53 |     if classifier_type == 'rf':
54 |         classifier = RandomForestClassifier(**params)
55 |     else:
56 |         classifier = ExtraTreesClassifier(**params)
57 | 
58 |     classifier.fit(X_train, y_train)
59 |     visualize_classifier(classifier, X_train, y_train, 'Training dataset')
60 | 
61 |     y_test_pred = classifier.predict(X_test)
62 |     visualize_classifier(classifier, X_test, y_test, 'Test dataset')
63 | 
64 |     # Evaluate classifier performance
65 |     class_names = ['Class-0', 'Class-1', 'Class-2']
66 |     print("\n" + "#"*40)
67 |     print("\nClassifier performance on training dataset\n")
68 |     print(classification_report(y_train, classifier.predict(X_train), target_names=class_names))
69 |     print("#"*40 + "\n")
70 | 
71 |     print("#"*40)
72 |     print("\nClassifier performance on test dataset\n")
73 |     print(classification_report(y_test, y_test_pred, target_names=class_names))
74 |     print("#"*40 + "\n")
75 | 
76 |     # Compute confidence
77 |     test_datapoints = np.array([[5, 5], [3, 6], [6, 4], [7, 2], [4, 4], [5, 2]])
78 | 
79 |     print("\nConfidence measure:")
80 |     for datapoint in test_datapoints:
81 |         probabilities = classifier.predict_proba([datapoint])[0]
82 |         predicted_class = 'Class-' + str(np.argmax(probabilities))
83 |         print('\nDatapoint:', datapoint)
84 |         print('Predicted class:', predicted_class) 
85 | 
86 |     # Visualize the datapoints
87 |     visualize_classifier(classifier, test_datapoints, [0]*len(test_datapoints), 
88 |             'Test datapoints')
89 | 
90 |     plt.show()
91 | 


--------------------------------------------------------------------------------
/Chapter 03/code/run_grid_search.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | from sklearn.metrics import classification_report
 4 | from sklearn import cross_validation, grid_search
 5 | from sklearn.ensemble import ExtraTreesClassifier
 6 | from sklearn import cross_validation
 7 | from sklearn.metrics import classification_report
 8 | 
 9 | from utilities import visualize_classifier
10 | 
11 | # Load input data
12 | input_file = 'data_random_forests.txt'
13 | data = np.loadtxt(input_file, delimiter=',')
14 | X, y = data[:, :-1], data[:, -1]
15 | 
16 | # Separate input data into three classes based on labels
17 | class_0 = np.array(X[y==0])
18 | class_1 = np.array(X[y==1])
19 | class_2 = np.array(X[y==2])
20 | 
21 | # Split the data into training and testing datasets 
22 | X_train, X_test, y_train, y_test = cross_validation.train_test_split(
23 |         X, y, test_size=0.25, random_state=5)
24 | 
25 | # Define the parameter grid 
26 | parameter_grid = [ {'n_estimators': [100], 'max_depth': [2, 4, 7, 12, 16]},
27 |                    {'max_depth': [4], 'n_estimators': [25, 50, 100, 250]}
28 |                  ]
29 | 
30 | metrics = ['precision_weighted', 'recall_weighted']
31 | 
32 | for metric in metrics:
33 |     print("\n##### Searching optimal parameters for", metric)
34 | 
35 |     classifier = grid_search.GridSearchCV(
36 |             ExtraTreesClassifier(random_state=0), 
37 |             parameter_grid, cv=5, scoring=metric)
38 |     classifier.fit(X_train, y_train)
39 | 
40 |     print("\nGrid scores for the parameter grid:")
41 |     for params, avg_score, _ in classifier.grid_scores_:
42 |         print(params, '-->', round(avg_score, 3))
43 | 
44 |     print("\nBest parameters:", classifier.best_params_)
45 | 
46 |     y_pred = classifier.predict(X_test)
47 |     print("\nPerformance report:\n")
48 |     print(classification_report(y_test, y_pred))
49 | 
50 | 


--------------------------------------------------------------------------------
/Chapter 03/code/traffic_prediction.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | from sklearn.metrics import classification_report, mean_absolute_error
 4 | from sklearn import cross_validation, preprocessing
 5 | from sklearn.ensemble import ExtraTreesRegressor
 6 | from sklearn.metrics import classification_report
 7 | 
 8 | # Load input data
 9 | input_file = 'traffic_data.txt'
10 | data = []
11 | with open(input_file, 'r') as f:
12 |     for line in f.readlines():
13 |         items = line[:-1].split(',')
14 |         data.append(items)
15 | 
16 | data = np.array(data)
17 | 
18 | # Convert string data to numerical data
19 | label_encoder = [] 
20 | X_encoded = np.empty(data.shape)
21 | for i, item in enumerate(data[0]):
22 |     if item.isdigit():
23 |         X_encoded[:, i] = data[:, i]
24 |     else:
25 |         label_encoder.append(preprocessing.LabelEncoder())
26 |         X_encoded[:, i] = label_encoder[-1].fit_transform(data[:, i])
27 | 
28 | X = X_encoded[:, :-1].astype(int)
29 | y = X_encoded[:, -1].astype(int)
30 | 
31 | # Split data into training and testing datasets 
32 | X_train, X_test, y_train, y_test = cross_validation.train_test_split(
33 |         X, y, test_size=0.25, random_state=5)
34 | 
35 | # Extremely Random Forests regressor
36 | params = {'n_estimators': 100, 'max_depth': 4, 'random_state': 0}
37 | regressor = ExtraTreesRegressor(**params)
38 | regressor.fit(X_train, y_train)
39 | 
40 | # Compute the regressor performance on test data
41 | y_pred = regressor.predict(X_test)
42 | print("Mean absolute error:", round(mean_absolute_error(y_test, y_pred), 2))
43 | 
44 | # Testing encoding on single data instance
45 | test_datapoint = ['Saturday', '10:20', 'Atlanta', 'no']
46 | test_datapoint_encoded = [-1] * len(test_datapoint)
47 | count = 0
48 | for i, item in enumerate(test_datapoint):
49 |     if item.isdigit():
50 |         test_datapoint_encoded[i] = int(test_datapoint[i])
51 |     else:
52 |         test_datapoint_encoded[i] = int(label_encoder[count].transform(test_datapoint[i]))
53 |         count = count + 1 
54 | 
55 | test_datapoint_encoded = np.array(test_datapoint_encoded)
56 | 
57 | # Predict the output for the test datapoint
58 | print("Predicted traffic:", int(regressor.predict([test_datapoint_encoded])[0]))
59 | 
60 | 


--------------------------------------------------------------------------------
/Chapter 03/code/utilities.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | 
 4 | def visualize_classifier(classifier, X, y, title=''):
 5 |     # Define the minimum and maximum values for X and Y
 6 |     # that will be used in the mesh grid
 7 |     min_x, max_x = X[:, 0].min() - 1.0, X[:, 0].max() + 1.0
 8 |     min_y, max_y = X[:, 1].min() - 1.0, X[:, 1].max() + 1.0
 9 | 
10 |     # Define the step size to use in plotting the mesh grid 
11 |     mesh_step_size = 0.01
12 | 
13 |     # Define the mesh grid of X and Y values
14 |     x_vals, y_vals = np.meshgrid(np.arange(min_x, max_x, mesh_step_size), np.arange(min_y, max_y, mesh_step_size))
15 | 
16 |     # Run the classifier on the mesh grid
17 |     output = classifier.predict(np.c_[x_vals.ravel(), y_vals.ravel()])
18 | 
19 |     # Reshape the output array
20 |     output = output.reshape(x_vals.shape)
21 | 
22 |     # Create a plot
23 |     plt.figure()
24 | 
25 |     # Specify the title
26 |     plt.title(title)
27 | 
28 |     # Choose a color scheme for the plot 
29 |     plt.pcolormesh(x_vals, y_vals, output, cmap=plt.cm.gray)
30 | 
31 |     # Overlay the training points on the plot 
32 |     plt.scatter(X[:, 0], X[:, 1], c=y, s=75, edgecolors='black', linewidth=1, cmap=plt.cm.Paired)
33 | 
34 |     # Specify the boundaries of the plot
35 |     plt.xlim(x_vals.min(), x_vals.max())
36 |     plt.ylim(y_vals.min(), y_vals.max())
37 | 
38 |     # Specify the ticks on the X and Y axes
39 |     plt.xticks((np.arange(int(X[:, 0].min() - 1), int(X[:, 0].max() + 1), 1.0)))
40 |     plt.yticks((np.arange(int(X[:, 1].min() - 1), int(X[:, 1].max() + 1), 1.0)))
41 | 
42 |     plt.show()
43 | 


--------------------------------------------------------------------------------
/Chapter 04/code/clustering_quality.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | from sklearn import metrics
 4 | from sklearn.cluster import KMeans
 5 | 
 6 | # Load data from input file
 7 | X = np.loadtxt('data_quality.txt', delimiter=',')
 8 | 
 9 | # Plot input data
10 | plt.figure()
11 | plt.scatter(X[:,0], X[:,1], color='black', s=80, marker='o', facecolors='none')
12 | x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
13 | y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
14 | plt.title('Input data')
15 | plt.xlim(x_min, x_max)
16 | plt.ylim(y_min, y_max)
17 | plt.xticks(())
18 | plt.yticks(())
19 | 
20 | # Initialize variables
21 | scores = []
22 | values = np.arange(2, 10)
23 | 
24 | # Iterate through the defined range
25 | for num_clusters in values:
26 |     # Train the KMeans clustering model
27 |     kmeans = KMeans(init='k-means++', n_clusters=num_clusters, n_init=10)
28 |     kmeans.fit(X)
29 |     score = metrics.silhouette_score(X, kmeans.labels_, 
30 |                 metric='euclidean', sample_size=len(X))
31 | 
32 |     print("\nNumber of clusters =", num_clusters)
33 |     print("Silhouette score =", score)
34 |                     
35 |     scores.append(score)
36 | 
37 | # Plot silhouette scores
38 | plt.figure()
39 | plt.bar(values, scores, width=0.7, color='black', align='center')
40 | plt.title('Silhouette score vs number of clusters')
41 | 
42 | # Extract best score and optimal number of clusters
43 | num_clusters = np.argmax(scores) + values[0]
44 | print('\nOptimal number of clusters =', num_clusters)
45 | 
46 | plt.show()
47 | 


--------------------------------------------------------------------------------
/Chapter 04/code/company_symbol_mapping.json:
--------------------------------------------------------------------------------
 1 | {
 2 |     "TOT": "Total",
 3 |     "XOM": "Exxon",
 4 |     "CVX": "Chevron",
 5 |     "COP": "ConocoPhillips",
 6 |     "VLO": "Valero Energy",
 7 |     "MSFT": "Microsoft",
 8 |     "IBM": "IBM",
 9 |     "TWX": "Time Warner",
10 |     "CMCSA": "Comcast",
11 |     "CVC": "Cablevision",
12 |     "YHOO": "Yahoo",
13 |     "DELL": "Dell",
14 |     "HPQ": "HP",
15 |     "AMZN": "Amazon",
16 |     "TM": "Toyota",
17 |     "CAJ": "Canon",
18 |     "MTU": "Mitsubishi",
19 |     "SNE": "Sony",
20 |     "F": "Ford",
21 |     "HMC": "Honda",
22 |     "NAV": "Navistar",
23 |     "NOC": "Northrop Grumman",
24 |     "BA": "Boeing",
25 |     "KO": "Coca Cola",
26 |     "MMM": "3M",
27 |     "MCD": "Mc Donalds",
28 |     "PEP": "Pepsi",
29 |     "MDLZ": "Kraft Foods",
30 |     "K": "Kellogg",
31 |     "UN": "Unilever",
32 |     "MAR": "Marriott",
33 |     "PG": "Procter Gamble",
34 |     "CL": "Colgate-Palmolive",
35 |     "GE": "General Electrics",
36 |     "WFC": "Wells Fargo",
37 |     "JPM": "JPMorgan Chase",
38 |     "AIG": "AIG",
39 |     "AXP": "American express",
40 |     "BAC": "Bank of America",
41 |     "GS": "Goldman Sachs",
42 |     "AAPL": "Apple",
43 |     "SAP": "SAP",
44 |     "CSCO": "Cisco",
45 |     "TXN": "Texas instruments",
46 |     "XRX": "Xerox",
47 |     "LMT": "Lookheed Martin",
48 |     "WMT": "Wal-Mart",
49 |     "WBA": "Walgreen",
50 |     "HD": "Home Depot",
51 |     "GSK": "GlaxoSmithKline",
52 |     "PFE": "Pfizer",
53 |     "SNY": "Sanofi-Aventis",
54 |     "NVS": "Novartis",
55 |     "KMB": "Kimberly-Clark",
56 |     "R": "Ryder",
57 |     "GD": "General Dynamics",
58 |     "RTN": "Raytheon",
59 |     "CVS": "CVS",
60 |     "CAT": "Caterpillar",
61 |     "DD": "DuPont de Nemours"
62 | }
63 | 


--------------------------------------------------------------------------------
/Chapter 04/code/gmm_classifier.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | from matplotlib import patches
 4 | 
 5 | from sklearn import datasets
 6 | from sklearn.mixture import GMM
 7 | from sklearn.cross_validation import StratifiedKFold
 8 | 
 9 | # Load the iris dataset
10 | iris = datasets.load_iris()
11 | 
12 | # Split dataset into training and testing (80/20 split)
13 | indices = StratifiedKFold(iris.target, n_folds=5)
14 | 
15 | # Take the first fold
16 | train_index, test_index = next(iter(indices))
17 | 
18 | # Extract training data and labels
19 | X_train = iris.data[train_index]
20 | y_train = iris.target[train_index]
21 | 
22 | # Extract testing data and labels
23 | X_test = iris.data[test_index]
24 | y_test = iris.target[test_index]
25 | 
26 | # Extract the number of classes
27 | num_classes = len(np.unique(y_train))
28 | 
29 | # Build GMM
30 | classifier = GMM(n_components=num_classes, covariance_type='full', 
31 |         init_params='wc', n_iter=20)
32 | 
33 | # Initialize the GMM means 
34 | classifier.means_ = np.array([X_train[y_train == i].mean(axis=0)
35 |                               for i in range(num_classes)])
36 | 
37 | # Train the GMM classifier 
38 | classifier.fit(X_train)
39 | 
40 | # Draw boundaries
41 | plt.figure()
42 | colors = 'bgr'
43 | for i, color in enumerate(colors):
44 |     # Extract eigenvalues and eigenvectors
45 |     eigenvalues, eigenvectors = np.linalg.eigh(
46 |             classifier._get_covars()[i][:2, :2])
47 | 
48 |     # Normalize the first eigenvector
49 |     norm_vec = eigenvectors[0] / np.linalg.norm(eigenvectors[0])
50 | 
51 |     # Extract the angle of tilt
52 |     angle = np.arctan2(norm_vec[1], norm_vec[0])
53 |     angle = 180 * angle / np.pi 
54 | 
55 |     # Scaling factor to magnify the ellipses
56 |     # (random value chosen to suit our needs)
57 |     scaling_factor = 8
58 |     eigenvalues *= scaling_factor 
59 | 
60 |     # Draw the ellipse
61 |     ellipse = patches.Ellipse(classifier.means_[i, :2], 
62 |             eigenvalues[0], eigenvalues[1], 180 + angle, 
63 |             color=color)
64 |     axis_handle = plt.subplot(1, 1, 1)
65 |     ellipse.set_clip_box(axis_handle.bbox)
66 |     ellipse.set_alpha(0.6)
67 |     axis_handle.add_artist(ellipse)
68 | 
69 | # Plot the data 
70 | colors = 'bgr'
71 | for i, color in enumerate(colors):
72 |     cur_data = iris.data[iris.target == i]
73 |     plt.scatter(cur_data[:,0], cur_data[:,1], marker='o', 
74 |             facecolors='none', edgecolors='black', s=40, 
75 |             label=iris.target_names[i])
76 | 
77 |     test_data = X_test[y_test == i]
78 |     plt.scatter(test_data[:,0], test_data[:,1], marker='s', 
79 |             facecolors='black', edgecolors='black', s=40, 
80 |             label=iris.target_names[i])
81 | 
82 | # Compute predictions for training and testing data
83 | y_train_pred = classifier.predict(X_train)
84 | accuracy_training = np.mean(y_train_pred.ravel() == y_train.ravel()) * 100
85 | print('Accuracy on training data =', accuracy_training)
86 |          
87 | y_test_pred = classifier.predict(X_test)
88 | accuracy_testing = np.mean(y_test_pred.ravel() == y_test.ravel()) * 100
89 | print('Accuracy on testing data =', accuracy_testing)
90 | 
91 | plt.title('GMM classifier')
92 | plt.xticks(())
93 | plt.yticks(())
94 | 
95 | plt.show()
96 | 


--------------------------------------------------------------------------------
/Chapter 04/code/kmeans.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | from sklearn.cluster import KMeans
 4 | from sklearn import metrics
 5 | 
 6 | # Load input data
 7 | X = np.loadtxt('data_clustering.txt', delimiter=',')
 8 | num_clusters = 5
 9 | 
10 | # Plot input data
11 | plt.figure()
12 | plt.scatter(X[:,0], X[:,1], marker='o', facecolors='none', 
13 |         edgecolors='black', s=80)
14 | x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
15 | y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
16 | plt.title('Input data')
17 | plt.xlim(x_min, x_max)
18 | plt.ylim(y_min, y_max)
19 | plt.xticks(())
20 | plt.yticks(())
21 | 
22 | # Create KMeans object 
23 | kmeans = KMeans(init='k-means++', n_clusters=num_clusters, n_init=10)
24 | 
25 | # Train the KMeans clustering model
26 | kmeans.fit(X)
27 | 
28 | # Step size of the mesh
29 | step_size = 0.01
30 | 
31 | # Define the grid of points to plot the boundaries
32 | x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
33 | y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
34 | x_vals, y_vals = np.meshgrid(np.arange(x_min, x_max, step_size), 
35 |         np.arange(y_min, y_max, step_size))
36 | 
37 | # Predict output labels for all the points on the grid 
38 | output = kmeans.predict(np.c_[x_vals.ravel(), y_vals.ravel()])
39 | 
40 | # Plot different regions and color them 
41 | output = output.reshape(x_vals.shape)
42 | plt.figure()
43 | plt.clf()
44 | plt.imshow(output, interpolation='nearest',
45 |            extent=(x_vals.min(), x_vals.max(), 
46 |                y_vals.min(), y_vals.max()),
47 |            cmap=plt.cm.Paired, 
48 |            aspect='auto', 
49 |            origin='lower')
50 | 
51 | # Overlay input points
52 | plt.scatter(X[:,0], X[:,1], marker='o', facecolors='none', 
53 |         edgecolors='black', s=80)
54 | 
55 | # Plot the centers of clusters
56 | cluster_centers = kmeans.cluster_centers_
57 | plt.scatter(cluster_centers[:,0], cluster_centers[:,1], 
58 |         marker='o', s=210, linewidths=4, color='black', 
59 |         zorder=12, facecolors='black')
60 | 
61 | x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
62 | y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
63 | plt.title('Boundaries of clusters')
64 | plt.xlim(x_min, x_max)
65 | plt.ylim(y_min, y_max)
66 | plt.xticks(())
67 | plt.yticks(())
68 | plt.show()
69 | 


--------------------------------------------------------------------------------
/Chapter 04/code/market_segmentation.py:
--------------------------------------------------------------------------------
 1 | import csv
 2 | 
 3 | import numpy as np
 4 | import matplotlib.pyplot as plt
 5 | from sklearn.cluster import MeanShift, estimate_bandwidth
 6 | 
 7 | # Load data from input file
 8 | input_file = 'sales.csv'
 9 | file_reader = csv.reader(open(input_file, 'r'), delimiter=',')
10 | X = []
11 | for count, row in enumerate(file_reader):
12 |     if not count:
13 |         names = row[1:]
14 |         continue
15 | 
16 |     X.append([float(x) for x in row[1:]])
17 | 
18 | # Convert to numpy array
19 | X = np.array(X)
20 | 
21 | # Estimating the bandwidth of input data
22 | bandwidth = estimate_bandwidth(X, quantile=0.8, n_samples=len(X))
23 | 
24 | # Compute clustering with MeanShift
25 | meanshift_model = MeanShift(bandwidth=bandwidth, bin_seeding=True)
26 | meanshift_model.fit(X)
27 | labels = meanshift_model.labels_
28 | cluster_centers = meanshift_model.cluster_centers_
29 | num_clusters = len(np.unique(labels))
30 | 
31 | print("\nNumber of clusters in input data =", num_clusters)
32 | 
33 | print("\nCenters of clusters:")
34 | print('\t'.join([name[:3] for name in names]))
35 | for cluster_center in cluster_centers:
36 |     print('\t'.join([str(int(x)) for x in cluster_center]))
37 | 
38 | # Extract two features for visualization 
39 | cluster_centers_2d = cluster_centers[:, 1:3]
40 | 
41 | # Plot the cluster centers 
42 | plt.figure()
43 | plt.scatter(cluster_centers_2d[:,0], cluster_centers_2d[:,1], 
44 |         s=120, edgecolors='black', facecolors='none')
45 | 
46 | offset = 0.25
47 | plt.xlim(cluster_centers_2d[:,0].min() - offset * cluster_centers_2d[:,0].ptp(),
48 |         cluster_centers_2d[:,0].max() + offset * cluster_centers_2d[:,0].ptp(),)
49 | plt.ylim(cluster_centers_2d[:,1].min() - offset * cluster_centers_2d[:,1].ptp(),
50 |         cluster_centers_2d[:,1].max() + offset * cluster_centers_2d[:,1].ptp())
51 | 
52 | plt.title('Centers of 2D clusters')
53 | plt.show()
54 | 


--------------------------------------------------------------------------------
/Chapter 04/code/mean_shift.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | from sklearn.cluster import MeanShift, estimate_bandwidth
 4 | from itertools import cycle
 5 | 
 6 | # Load data from input file
 7 | X = np.loadtxt('data_clustering.txt', delimiter=',')
 8 | 
 9 | # Estimate the bandwidth of X
10 | bandwidth_X = estimate_bandwidth(X, quantile=0.1, n_samples=len(X))
11 | 
12 | # Cluster data with MeanShift
13 | meanshift_model = MeanShift(bandwidth=bandwidth_X, bin_seeding=True)
14 | meanshift_model.fit(X)
15 | 
16 | # Extract the centers of clusters
17 | cluster_centers = meanshift_model.cluster_centers_
18 | print('\nCenters of clusters:\n', cluster_centers)
19 | 
20 | # Estimate the number of clusters
21 | labels = meanshift_model.labels_
22 | num_clusters = len(np.unique(labels))
23 | print("\nNumber of clusters in input data =", num_clusters)
24 | 
25 | # Plot the points and cluster centers
26 | plt.figure()
27 | markers = 'o*xvs'
28 | for i, marker in zip(range(num_clusters), markers):
29 |     # Plot points that belong to the current cluster
30 |     plt.scatter(X[labels==i, 0], X[labels==i, 1], marker=marker, color='black')
31 | 
32 |     # Plot the cluster center
33 |     cluster_center = cluster_centers[i]
34 |     plt.plot(cluster_center[0], cluster_center[1], marker='o', 
35 |             markerfacecolor='black', markeredgecolor='black', 
36 |             markersize=15)
37 | 
38 | plt.title('Clusters')
39 | plt.show()
40 | 


--------------------------------------------------------------------------------
/Chapter 04/code/stocks.py:
--------------------------------------------------------------------------------
 1 | import datetime
 2 | import json
 3 | 
 4 | import numpy as np
 5 | import matplotlib.pyplot as plt
 6 | from sklearn import covariance, cluster
 7 | from matplotlib.finance import quotes_historical_yahoo_ochl as quotes_yahoo
 8 | 
 9 | # Input file containing company symbols 
10 | input_file = 'company_symbol_mapping.json'
11 | 
12 | # Load the company symbol map
13 | with open(input_file, 'r') as f:
14 |     company_symbols_map = json.loads(f.read())
15 | 
16 | symbols, names = np.array(list(company_symbols_map.items())).T
17 | 
18 | # Load the historical stock quotes 
19 | start_date = datetime.datetime(2003, 7, 3)
20 | end_date = datetime.datetime(2007, 5, 4)
21 | quotes = [quotes_yahoo(symbol, start_date, end_date, asobject=True) 
22 |                 for symbol in symbols]
23 | 
24 | # Extract opening and closing quotes
25 | opening_quotes = np.array([quote.open for quote in quotes]).astype(np.float)
26 | closing_quotes = np.array([quote.close for quote in quotes]).astype(np.float)
27 | 
28 | # Compute differences between opening and closing quotes 
29 | quotes_diff = closing_quotes - opening_quotes
30 | 
31 | # Normalize the data 
32 | X = quotes_diff.copy().T
33 | X /= X.std(axis=0)
34 | 
35 | # Create a graph model 
36 | edge_model = covariance.GraphLassoCV()
37 | 
38 | # Train the model
39 | with np.errstate(invalid='ignore'):
40 |     edge_model.fit(X)
41 | 
42 | # Build clustering model using Affinity Propagation model
43 | _, labels = cluster.affinity_propagation(edge_model.covariance_)
44 | num_labels = labels.max()
45 | 
46 | # Print the results of clustering
47 | print('\nClustering of stocks based on difference in opening and closing quotes:\n')
48 | for i in range(num_labels + 1):
49 |     print("Cluster", i+1, "==>", ', '.join(names[labels == i]))
50 | 
51 | 


--------------------------------------------------------------------------------
/Chapter 05/code/collaborative_filtering.py:
--------------------------------------------------------------------------------
 1 | import argparse
 2 | import json
 3 | import numpy as np
 4 | 
 5 | from compute_scores import pearson_score
 6 | 
 7 | def build_arg_parser():
 8 |     parser = argparse.ArgumentParser(description='Find users who are similar to the input user')
 9 |     parser.add_argument('--user', dest='user', required=True,
10 |             help='Input user')
11 |     return parser
12 | 
13 | # Finds users in the dataset that are similar to the input user 
14 | def find_similar_users(dataset, user, num_users):
15 |     if user not in dataset:
16 |         raise TypeError('Cannot find ' + user + ' in the dataset')
17 | 
18 |     # Compute Pearson score between input user 
19 |     # and all the users in the dataset
20 |     scores = np.array([[x, pearson_score(dataset, user, 
21 |             x)] for x in dataset if x != user])
22 | 
23 |     # Sort the scores in decreasing order
24 |     scores_sorted = np.argsort(scores[:, 1])[::-1]
25 | 
26 |     # Extract the top 'num_users' scores
27 |     top_users = scores_sorted[:num_users] 
28 | 
29 |     return scores[top_users] 
30 | 
31 | if __name__=='__main__':
32 |     args = build_arg_parser().parse_args()
33 |     user = args.user
34 | 
35 |     ratings_file = 'ratings.json'
36 | 
37 |     with open(ratings_file, 'r') as f:
38 |         data = json.loads(f.read())
39 | 
40 |     print('\nUsers similar to ' + user + ':\n')
41 |     similar_users = find_similar_users(data, user, 3) 
42 |     print('User\t\t\tSimilarity score')
43 |     print('-'*41)
44 |     for item in similar_users:
45 |         print(item[0], '\t\t', round(float(item[1]), 2))
46 | 
47 | 


--------------------------------------------------------------------------------
/Chapter 05/code/k_nearest_neighbors.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | from sklearn.neighbors import NearestNeighbors
 4 | 
 5 | # Input data
 6 | X = np.array([[2.1, 1.3], [1.3, 3.2], [2.9, 2.5], [2.7, 5.4], [3.8, 0.9], 
 7 |         [7.3, 2.1], [4.2, 6.5], [3.8, 3.7], [2.5, 4.1], [3.4, 1.9],
 8 |         [5.7, 3.5], [6.1, 4.3], [5.1, 2.2], [6.2, 1.1]])
 9 | 
10 | # Number of nearest neighbors
11 | k = 5
12 | 
13 | # Test datapoint 
14 | test_datapoint = [4.3, 2.7]
15 | 
16 | # Plot input data 
17 | plt.figure()
18 | plt.title('Input data')
19 | plt.scatter(X[:,0], X[:,1], marker='o', s=75, color='black')
20 | 
21 | # Build K Nearest Neighbors model
22 | knn_model = NearestNeighbors(n_neighbors=k, algorithm='ball_tree').fit(X)
23 | distances, indices = knn_model.kneighbors([test_datapoint])
24 | 
25 | # Print the 'k' nearest neighbors
26 | print("\nK Nearest Neighbors:")
27 | for rank, index in enumerate(indices[0][:k], start=1):
28 |     print(str(rank) + " ==>", X[index])
29 | 
30 | # Visualize the nearest neighbors along with the test datapoint 
31 | plt.figure()
32 | plt.title('Nearest neighbors')
33 | plt.scatter(X[:, 0], X[:, 1], marker='o', s=75, color='k')
34 | plt.scatter(X[indices][0][:][:, 0], X[indices][0][:][:, 1], 
35 |         marker='o', s=250, color='k', facecolors='none')
36 | plt.scatter(test_datapoint[0], test_datapoint[1],
37 |         marker='x', s=75, color='k')
38 | 
39 | plt.show()
40 | 


--------------------------------------------------------------------------------
/Chapter 05/code/movie_recommender.py:
--------------------------------------------------------------------------------
 1 | import argparse
 2 | import json
 3 | import numpy as np
 4 | 
 5 | from compute_scores import pearson_score
 6 | from collaborative_filtering import find_similar_users
 7 | 
 8 | def build_arg_parser():
 9 |     parser = argparse.ArgumentParser(description='Find the movie recommendations for the given user')
10 |     parser.add_argument('--user', dest='user', required=True,
11 |             help='Input user')
12 |     return parser
13 |  
14 | # Get movie recommendations for the input user
15 | def get_recommendations(dataset, input_user):
16 |     if input_user not in dataset:
17 |         raise TypeError('Cannot find ' + input_user + ' in the dataset')
18 | 
19 |     overall_scores = {}
20 |     similarity_scores = {}
21 | 
22 |     for user in [x for x in dataset if x != input_user]:
23 |         similarity_score = pearson_score(dataset, input_user, user)
24 | 
25 |         if similarity_score <= 0:
26 |             continue
27 |         
28 |         filtered_list = [x for x in dataset[user] if x not in \
29 |                 dataset[input_user] or dataset[input_user][x] == 0]
30 | 
31 |         for item in filtered_list: 
32 |             overall_scores.update({item: dataset[user][item] * similarity_score})
33 |             similarity_scores.update({item: similarity_score})
34 | 
35 |     if len(overall_scores) == 0:
36 |         return ['No recommendations possible']
37 | 
38 |     # Generate movie ranks by normalization 
39 |     movie_scores = np.array([[score/similarity_scores[item], item] 
40 |             for item, score in overall_scores.items()])
41 | 
42 |     # Sort in decreasing order 
43 |     movie_scores = movie_scores[np.argsort(movie_scores[:, 0])[::-1]]
44 | 
45 |     # Extract the movie recommendations
46 |     movie_recommendations = [movie for _, movie in movie_scores]
47 | 
48 |     return movie_recommendations
49 |  
50 | if __name__=='__main__':
51 |     args = build_arg_parser().parse_args()
52 |     user = args.user
53 | 
54 |     ratings_file = 'ratings.json'
55 | 
56 |     with open(ratings_file, 'r') as f:
57 |         data = json.loads(f.read())
58 | 
59 |     print("\nMovie recommendations for " + user + ":")
60 |     movies = get_recommendations(data, user) 
61 |     for i, movie in enumerate(movies):
62 |         print(str(i+1) + '. ' + movie)
63 | 
64 | 


--------------------------------------------------------------------------------
/Chapter 05/code/nearest_neighbors_classifier.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | import matplotlib.cm as cm
 4 | from sklearn import neighbors, datasets
 5 | 
 6 | # Load input data
 7 | input_file = 'data.txt'
 8 | data = np.loadtxt(input_file, delimiter=',')
 9 | X, y = data[:, :-1], data[:, -1].astype(np.int)
10 | 
11 | # Plot input data
12 | plt.figure()
13 | plt.title('Input data')
14 | marker_shapes = 'v^os'
15 | mapper = [marker_shapes[i] for i in y]
16 | for i in range(X.shape[0]):
17 |     plt.scatter(X[i, 0], X[i, 1], marker=mapper[i], 
18 |             s=75, edgecolors='black', facecolors='none')
19 | 
20 | # Number of nearest neighbors 
21 | num_neighbors = 12
22 | 
23 | # Step size of the visualization grid
24 | step_size = 0.01  
25 | 
26 | # Create a K Nearest Neighbours classifier model 
27 | classifier = neighbors.KNeighborsClassifier(num_neighbors, weights='distance')
28 | 
29 | # Train the K Nearest Neighbours model
30 | classifier.fit(X, y)
31 | 
32 | # Create the mesh to plot the boundaries
33 | x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
34 | y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
35 | x_values, y_values = np.meshgrid(np.arange(x_min, x_max, step_size), 
36 |         np.arange(y_min, y_max, step_size))
37 | 
38 | # Evaluate the classifier on all the points on the grid 
39 | output = classifier.predict(np.c_[x_values.ravel(), y_values.ravel()])
40 | 
41 | # Visualize the predicted output 
42 | output = output.reshape(x_values.shape)
43 | plt.figure()
44 | plt.pcolormesh(x_values, y_values, output, cmap=cm.Paired)
45 | 
46 | # Overlay the training points on the map
47 | for i in range(X.shape[0]):
48 |     plt.scatter(X[i, 0], X[i, 1], marker=mapper[i], 
49 |             s=50, edgecolors='black', facecolors='none')
50 | 
51 | plt.xlim(x_values.min(), x_values.max())
52 | plt.ylim(y_values.min(), y_values.max())
53 | plt.title('K Nearest Neighbors classifier model boundaries')
54 | 
55 | # Test input datapoint
56 | test_datapoint = [5.1, 3.6]
57 | plt.figure()
58 | plt.title('Test datapoint')
59 | for i in range(X.shape[0]):
60 |     plt.scatter(X[i, 0], X[i, 1], marker=mapper[i], 
61 |             s=75, edgecolors='black', facecolors='none')
62 | 
63 | plt.scatter(test_datapoint[0], test_datapoint[1], marker='x', 
64 |         linewidth=6, s=200, facecolors='black')
65 | 
66 | # Extract the K nearest neighbors
67 | _, indices = classifier.kneighbors([test_datapoint])
68 | indices = indices.astype(np.int)[0]
69 | 
70 | # Plot k nearest neighbors
71 | plt.figure()
72 | plt.title('K Nearest Neighbors')
73 | 
74 | for i in indices:
75 |     plt.scatter(X[i, 0], X[i, 1], marker=mapper[y[i]], 
76 |             linewidth=3, s=100, facecolors='black')
77 | 
78 | plt.scatter(test_datapoint[0], test_datapoint[1], marker='x', 
79 |         linewidth=6, s=200, facecolors='black')
80 | 
81 | for i in range(X.shape[0]):
82 |     plt.scatter(X[i, 0], X[i, 1], marker=mapper[i], 
83 |             s=75, edgecolors='black', facecolors='none')
84 | 
85 | print("Predicted output:", classifier.predict([test_datapoint])[0])
86 | 
87 | plt.show()
88 | 
89 | 


--------------------------------------------------------------------------------
/Chapter 05/code/pipeline_trainer.py:
--------------------------------------------------------------------------------
 1 | from sklearn.datasets import samples_generator
 2 | from sklearn.feature_selection import SelectKBest, f_regression
 3 | from sklearn.pipeline import Pipeline
 4 | from sklearn.ensemble import ExtraTreesClassifier
 5 | 
 6 | # Generate data 
 7 | X, y = samples_generator.make_classification(n_samples=150, 
 8 |         n_features=25, n_classes=3, n_informative=6, 
 9 |         n_redundant=0, random_state=7)
10 | 
11 | # Select top K features 
12 | k_best_selector = SelectKBest(f_regression, k=9)
13 | 
14 | # Initialize Extremely Random Forests classifier 
15 | classifier = ExtraTreesClassifier(n_estimators=60, max_depth=4)
16 | 
17 | # Construct the pipeline
18 | processor_pipeline = Pipeline([('selector', k_best_selector), ('erf', classifier)])
19 | 
20 | # Set the parameters
21 | processor_pipeline.set_params(selector__k=7, erf__n_estimators=30)
22 | 
23 | # Training the pipeline 
24 | processor_pipeline.fit(X, y)
25 | 
26 | # Predict outputs for the input data
27 | output = processor_pipeline.predict(X)
28 | print("\nPredicted output:\n", output)
29 | 
30 | # Print scores 
31 | print("\nScore:", processor_pipeline.score(X, y))
32 | 
33 | # Print the features chosen by the pipeline selector
34 | status = processor_pipeline.named_steps['selector'].get_support()
35 | 
36 | # Extract and print indices of selected features
37 | selected = [i for i, x in enumerate(status) if x]
38 | print("\nIndices of selected features:", ', '.join([str(x) for x in selected]))
39 | 


--------------------------------------------------------------------------------
/Chapter 05/code/ratings.json:
--------------------------------------------------------------------------------
 1 | {
 2 |     "David Smith": 
 3 |     {
 4 |         "Vertigo": 4,
 5 |         "Scarface": 4.5,
 6 |         "Raging Bull": 3.0,
 7 |         "Goodfellas": 4.5,
 8 |         "The Apartment": 1.0
 9 |     },
10 |     "Brenda Peterson": 
11 |     {
12 |         "Vertigo": 3.0,
13 |         "Scarface": 1.5,
14 |         "Raging Bull": 1.0,
15 |         "Goodfellas": 2.0,
16 |         "The Apartment": 5.0,
17 |         "Roman Holiday": 4.5 
18 |     },
19 |     "Bill Duffy": 
20 |     {
21 |         "Vertigo": 4.5,
22 |         "Scarface": 5.0,
23 |         "Goodfellas": 4.5,
24 |         "The Apartment": 1.0
25 |     },
26 |     "Samuel Miller": 
27 |     {
28 |         "Scarface": 3.5,
29 |         "Raging Bull": 5.0,
30 |         "The Apartment": 1.0,
31 |         "Goodfellas": 5.0,
32 |         "Roman Holiday": 1.0 
33 |     },
34 |     "Julie Hammel": 
35 |     {
36 |        "Scarface": 2.5,
37 |        "Roman Holiday": 4.5,
38 |        "Goodfellas": 3.0
39 |     },
40 |     "Clarissa Jackson": 
41 |     {
42 |         "Vertigo": 5.0,
43 |         "Scarface": 4.5,
44 |         "Raging Bull": 4.0,
45 |         "Goodfellas": 2.5,
46 |         "The Apartment": 1.0,
47 |         "Roman Holiday": 1.5
48 |     },
49 |     "Adam Cohen": 
50 |     {
51 |        "Vertigo": 3.5,
52 |        "Scarface": 3.0,
53 |        "The Apartment": 1.0,
54 |        "Goodfellas": 4.5,
55 |        "Roman Holiday": 3.0
56 |     },
57 |     "Chris Duncan": 
58 |     {
59 |        "The Apartment": 1.5,
60 |        "Raging Bull": 4.5
61 |     }
62 | }
63 | 


--------------------------------------------------------------------------------
/Chapter 06/code/adjacent_states.txt:
--------------------------------------------------------------------------------
 1 | Alaska
 2 | Alabama,Mississippi,Tennessee,Georgia,Florida
 3 | Arkansas,Missouri,Tennessee,Mississippi,Louisiana,Texas,Oklahoma
 4 | Arizona,California,Nevada,Utah,Colorado,New Mexico
 5 | California,Oregon,Nevada,Arizona
 6 | Colorado,Wyoming,Nebraska,Kansas,Oklahoma,New Mexico,Arizona,Utah
 7 | Connecticut,New York,Massachusetts,Rhode Island
 8 | District of Columbia,Maryland,Virginia
 9 | Delaware,Maryland,Pennsylvania,New Jersey
10 | Florida,Alabama,Georgia
11 | Georgia,Florida,Alabama,Tennessee,North Carolina,South Carolina
12 | Hawaii
13 | Iowa,Minnesota,Wisconsin,Illinois,Missouri,Nebraska,South Dakota
14 | Idaho,Montana,Wyoming,Utah,Nevada,Oregon,Washington
15 | Illinois,Indiana,Kentucky,Missouri,Iowa,Wisconsin
16 | Indiana,Michigan,Ohio,Kentucky,Illinois
17 | Kansas,Nebraska,Missouri,Oklahoma,Colorado
18 | Kentucky,Indiana,Ohio,West Virginia,Virginia,Tennessee,Missouri,Illinois
19 | Louisiana,Texas,Arkansas,Mississippi
20 | Massachusetts,Rhode Island,Connecticut,New York,New Hampshire,Vermont
21 | Maryland,Virginia,West Virginia,Pennsylvania,District of Columbia,Delaware
22 | Maine,New Hampshire
23 | Michigan,Wisconsin,Indiana,Ohio
24 | Minnesota,Wisconsin,Iowa,South Dakota,North Dakota
25 | Missouri,Iowa,Illinois,Kentucky,Tennessee,Arkansas,Oklahoma,Kansas,Nebraska
26 | Mississippi,Louisiana,Arkansas,Tennessee,Alabama
27 | Montana,North Dakota,South Dakota,Wyoming,Idaho
28 | North Carolina,Virginia,Tennessee,Georgia,South Carolina
29 | North Dakota,Minnesota,South Dakota,Montana
30 | Nebraska,South Dakota,Iowa,Missouri,Kansas,Colorado,Wyoming
31 | New Hampshire,Vermont,Maine,Massachusetts
32 | New Jersey,Delaware,Pennsylvania,New York
33 | New Mexico,Arizona,Utah,Colorado,Oklahoma,Texas
34 | Nevada,Idaho,Utah,Arizona,California,Oregon
35 | New York,New Jersey,Pennsylvania,Vermont,Massachusetts,Connecticut
36 | Ohio,Pennsylvania,West Virginia,Kentucky,Indiana,Michigan
37 | Oklahoma,Kansas,Missouri,Arkansas,Texas,New Mexico,Colorado
38 | Oregon,California,Nevada,Idaho,Washington
39 | Pennsylvania,New York,New Jersey,Delaware,Maryland,West Virginia,Ohio
40 | Rhode Island,Connecticut,Massachusetts
41 | South Carolina,Georgia,North Carolina
42 | South Dakota,North Dakota,Minnesota,Iowa,Nebraska,Wyoming,Montana
43 | Tennessee,Kentucky,Virginia,North Carolina,Georgia,Alabama,Mississippi,Arkansas,Missouri
44 | Texas,New Mexico,Oklahoma,Arkansas,Louisiana
45 | Utah,Idaho,Wyoming,Colorado,New Mexico,Arizona,Nevada
46 | Virginia,North Carolina,Tennessee,Kentucky,West Virginia,Maryland,District of Columbia
47 | Vermont,New York,New Hampshire,Massachusetts
48 | Washington,Idaho,Oregon
49 | Wisconsin,Michigan,Minnesota,Iowa,Illinois
50 | West Virginia,Ohio,Pennsylvania,Maryland,Virginia,Kentucky
51 | Wyoming,Montana,South Dakota,Nebraska,Colorado,Utah,Idaho
52 | 


--------------------------------------------------------------------------------
/Chapter 06/code/coastal_states.txt:
--------------------------------------------------------------------------------
1 | Washington,Oregon,California,Texas,Louisiana,Michigan,Alabama,Georgia,Florida,South Carolina,North Carolina,Virgin Islands,Maryland,Delaware,New Jersey,New York,Connecticut,Rhode Island,Massachusetts,Minnesota,New Hampshire


--------------------------------------------------------------------------------
/Chapter 06/code/expression_matcher.py:
--------------------------------------------------------------------------------
 1 | from logpy import run, var, fact
 2 | import logpy.assoccomm as la
 3 | 
 4 | # Define mathematical operations
 5 | add = 'addition'
 6 | mul = 'multiplication'
 7 | 
 8 | # Declare that these operations are commutative
 9 | # using the facts system
10 | fact(la.commutative, mul)
11 | fact(la.commutative, add)
12 | fact(la.associative, mul)
13 | fact(la.associative, add)
14 | 
15 | # Define some variables
16 | a, b, c = var('a'), var('b'), var('c')
17 | 
18 | # Generate expressions
19 | expression_orig = (add, (mul, 3, -2), (mul, (add, 1, (mul, 2, 3)), -1))
20 | expression1 = (add, (mul, (add, 1, (mul, 2, a)), b), (mul, 3, c))
21 | expression2 = (add, (mul, c, 3), (mul, b, (add, (mul, 2, a), 1)))
22 | expression3 = (add, (add, (mul, (mul, 2, a), b), b), (mul, 3, c)) 
23 | 
24 | # Compare expressions
25 | print(run(0, (a, b, c), la.eq_assoccomm(expression1, expression_orig)))
26 | print(run(0, (a, b, c), la.eq_assoccomm(expression2, expression_orig)))
27 | print(run(0, (a, b, c), la.eq_assoccomm(expression3, expression_orig)))
28 | 


--------------------------------------------------------------------------------
/Chapter 06/code/family.py:
--------------------------------------------------------------------------------
 1 | import json
 2 | from logpy import Relation, facts, run, conde, var, eq
 3 | 
 4 | # Check if 'x' is the parent of 'y'
 5 | def parent(x, y):
 6 |     return conde([father(x, y)], [mother(x, y)])
 7 | 
 8 | # Check if 'x' is the grandparent of 'y'
 9 | def grandparent(x, y):
10 |     temp = var()
11 |     return conde((parent(x, temp), parent(temp, y)))
12 | 
13 | # Check for sibling relationship between 'a' and 'b'  
14 | def sibling(x, y):
15 |     temp = var()
16 |     return conde((parent(temp, x), parent(temp, y)))
17 | 
18 | # Check if x is y's uncle
19 | def uncle(x, y):
20 |     temp = var()
21 |     return conde((father(temp, x), grandparent(temp, y)))
22 | 
23 | if __name__=='__main__':
24 |     father = Relation()
25 |     mother = Relation()
26 |     
27 |     with open('relationships.json') as f:
28 |         d = json.loads(f.read())
29 | 
30 |     for item in d['father']:
31 |         facts(father, (list(item.keys())[0], list(item.values())[0]))
32 | 
33 |     for item in d['mother']:
34 |         facts(mother, (list(item.keys())[0], list(item.values())[0]))
35 | 
36 |     x = var()
37 | 
38 |     # John's children
39 |     name = 'John'
40 |     output = run(0, x, father(name, x))
41 |     print("\nList of " + name + "'s children:")
42 |     for item in output:
43 |         print(item)
44 | 
45 |     # William's mother
46 |     name = 'William'
47 |     output = run(0, x, mother(x, name))[0]
48 |     print("\n" + name + "'s mother:\n" + output)
49 | 
50 |     # Adam's parents 
51 |     name = 'Adam'
52 |     output = run(0, x, parent(x, name))
53 |     print("\nList of " + name + "'s parents:")
54 |     for item in output:
55 |         print(item)
56 | 
57 |     # Wayne's grandparents 
58 |     name = 'Wayne'
59 |     output = run(0, x, grandparent(x, name))
60 |     print("\nList of " + name + "'s grandparents:")
61 |     for item in output:
62 |         print(item)
63 | 
64 |     # Megan's grandchildren 
65 |     name = 'Megan'
66 |     output = run(0, x, grandparent(name, x))
67 |     print("\nList of " + name + "'s grandchildren:")
68 |     for item in output:
69 |         print(item)
70 | 
71 |     # David's siblings 
72 |     name = 'David'
73 |     output = run(0, x, sibling(x, name))
74 |     siblings = [x for x in output if x != name]
75 |     print("\nList of " + name + "'s siblings:")
76 |     for item in siblings:
77 |         print(item)
78 | 
79 |     # Tiffany's uncles
80 |     name = 'Tiffany'
81 |     name_father = run(0, x, father(x, name))[0]
82 |     output = run(0, x, uncle(x, name))
83 |     output = [x for x in output if x != name_father]
84 |     print("\nList of " + name + "'s uncles:")
85 |     for item in output:
86 |         print(item)
87 | 
88 |     # All spouses
89 |     a, b, c = var(), var(), var()
90 |     output = run(0, (a, b), (father, a, c), (mother, b, c))
91 |     print("\nList of all spouses:")
92 |     for item in output:
93 |         print('Husband:', item[0], '<==> Wife:', item[1])
94 |      


--------------------------------------------------------------------------------
/Chapter 06/code/prime.py:
--------------------------------------------------------------------------------
 1 | import itertools as it
 2 | import logpy.core as lc
 3 | from sympy.ntheory.generate import prime, isprime
 4 | 
 5 | # Check if the elements of x are prime 
 6 | def check_prime(x):
 7 |     if lc.isvar(x):
 8 |         return lc.condeseq([(lc.eq, x, p)] for p in map(prime, it.count(1)))
 9 |     else:
10 |         return lc.success if isprime(x) else lc.fail
11 | 
12 | # Declate the variable
13 | x = lc.var()
14 | 
15 | # Check if an element in the list is a prime number
16 | list_nums = (23, 4, 27, 17, 13, 10, 21, 29, 3, 32, 11, 19)
17 | print('\nList of primes in the list:')
18 | print(set(lc.run(0, x, (lc.membero, x, list_nums), (check_prime, x))))
19 | 
20 | # Print first 7 prime numbers
21 | print('\nList of first 7 prime numbers:')
22 | print(lc.run(7, x, check_prime(x)))
23 | 


--------------------------------------------------------------------------------
/Chapter 06/code/puzzle.py:
--------------------------------------------------------------------------------
 1 | from logpy import *
 2 | from logpy.core import lall
 3 | 
 4 | # Declare the variable
 5 | people = var()
 6 | 
 7 | # Define the rules
 8 | rules = lall(
 9 |     # There are 4 people
10 |     (eq, (var(), var(), var(), var()), people),
11 | 
12 |     # Steve's car is blue
13 |     (membero, ('Steve', var(), 'blue', var()), people),
14 | 
15 |     # Person who owns the cat lives in Canada 
16 |     (membero, (var(), 'cat', var(), 'Canada'), people),
17 | 
18 |     # Matthew lives in USA
19 |     (membero, ('Matthew', var(), var(), 'USA'), people),
20 | 
21 |     # The person who has a black car lives in Australia
22 |     (membero, (var(), var(), 'black', 'Australia'), people),
23 | 
24 |     # Jack has a cat
25 |     (membero, ('Jack', 'cat', var(), var()), people),
26 | 
27 |     # Alfred lives in Australia
28 |     (membero, ('Alfred', var(), var(), 'Australia'), people),
29 | 
30 |     # Person who owns the dog lives in France 
31 |     (membero, (var(), 'dog', var(), 'France'), people),
32 | 
33 |     # Who is the owner of the rabbit? 
34 |     (membero, (var(), 'rabbit', var(), var()), people)
35 | )
36 | 
37 | # Run the solver
38 | solutions = run(0, people, rules)
39 | 
40 | # Extract the output
41 | output = [house for house in solutions[0] if 'rabbit' in house][0][0]
42 | 
43 | # Print the output
44 | print('\n' + output + ' is the owner of the rabbit')
45 | print('\nHere are all the details:')
46 | attribs = ['Name', 'Pet', 'Color', 'Country']
47 | print('\n' + '\t\t'.join(attribs))
48 | print('=' * 57)
49 | for item in solutions[0]:
50 |     print('')
51 |     print('\t\t'.join([str(x) for x in item]))
52 | 
53 | 


--------------------------------------------------------------------------------
/Chapter 06/code/relationships.json:
--------------------------------------------------------------------------------
 1 | {
 2 |       "father": 
 3 |       [
 4 |             {"John": "William"},
 5 |             {"John": "David"},
 6 |             {"John": "Adam"},
 7 |             {"William": "Chris"},
 8 |             {"William": "Stephanie"},
 9 |             {"David": "Wayne"},
10 |             {"David": "Tiffany"},
11 |             {"David": "Julie"},
12 |             {"David": "Neil"},
13 |             {"David": "Peter"},
14 |             {"Adam": "Sophia"}
15 |       ],
16 |       "mother": 
17 |       [
18 |             {"Megan": "William"},
19 |             {"Megan": "David"},
20 |             {"Megan": "Adam"},
21 |             {"Emma": "Stephanie"},
22 |             {"Emma": "Chris"},
23 |             {"Olivia": "Tiffany"},
24 |             {"Olivia": "Julie"},
25 |             {"Olivia": "Neil"},
26 |             {"Olivia": "Peter"},
27 |             {"Lily": "Sophia"}
28 |       ]
29 | }
30 | 


--------------------------------------------------------------------------------
/Chapter 06/code/states.py:
--------------------------------------------------------------------------------
 1 | from logpy import run, fact, eq, Relation, var
 2 | 
 3 | adjacent = Relation()
 4 | coastal = Relation()
 5 | 
 6 | file_coastal = 'coastal_states.txt'
 7 | file_adjacent = 'adjacent_states.txt'
 8 | 
 9 | # Read the file containing the coastal states
10 | with open(file_coastal, 'r') as f:
11 |     line = f.read()
12 |     coastal_states = line.split(',')
13 | 
14 | # Add the info to the fact base
15 | for state in coastal_states:
16 |     fact(coastal, state)
17 | 
18 | # Read the file containing the coastal states
19 | with open(file_adjacent, 'r') as f:
20 |     adjlist = [line.strip().split(',') for line in f if line and line[0].isalpha()]
21 | 
22 | # Add the info to the fact base
23 | for L in adjlist:
24 |     head, tail = L[0], L[1:]
25 |     for state in tail:
26 |         fact(adjacent, head, state)
27 | 
28 | # Initialize the variables
29 | x = var()
30 | y = var()
31 | 
32 | # Is Nevada adjacent to Louisiana?
33 | output = run(0, x, adjacent('Nevada', 'Louisiana'))
34 | print('\nIs Nevada adjacent to Louisiana?:')
35 | print('Yes' if len(output) else 'No')
36 | 
37 | # States adjacent to Oregon
38 | output = run(0, x, adjacent('Oregon', x))
39 | print('\nList of states adjacent to Oregon:')
40 | for item in output:
41 |     print(item)
42 | 
43 | # States adjacent to Mississippi that are coastal
44 | output = run(0, x, adjacent('Mississippi', x), coastal(x))
45 | print('\nList of coastal states adjacent to Mississippi:')
46 | for item in output:
47 |     print(item)
48 | 
49 | # List of 'n' states that border a coastal state
50 | n = 7
51 | output = run(n, x, coastal(y), adjacent(x, y))
52 | print('\nList of ' + str(n) + ' states that border a coastal state:')
53 | for item in output:
54 |     print(item)
55 | 
56 | # List of states that adjacent to the two given states
57 | output = run(0, x, adjacent('Arkansas', x), adjacent('Kentucky', x))
58 | print('\nList of states that are adjacent to Arkansas and Kentucky:')
59 | for item in output:
60 |     print(item)
61 | 
62 | 


--------------------------------------------------------------------------------
/Chapter 07/code/coloring.py:
--------------------------------------------------------------------------------
 1 | from simpleai.search import CspProblem, backtrack
 2 | 
 3 | # Define the function that imposes the constraint 
 4 | # that neighbors should be different
 5 | def constraint_func(names, values):
 6 |     return values[0] != values[1]  
 7 | 
 8 | if __name__=='__main__':
 9 |     # Specify the variables
10 |     names = ('Mark', 'Julia', 'Steve', 'Amanda', 'Brian', 
11 |             'Joanne', 'Derek', 'Allan', 'Michelle', 'Kelly')
12 | 
13 |     # Define the possible colors 
14 |     colors = dict((name, ['red', 'green', 'blue', 'gray']) for name in names)
15 | 
16 |     # Define the constraints 
17 |     constraints = [
18 |         (('Mark', 'Julia'), constraint_func),
19 |         (('Mark', 'Steve'), constraint_func),
20 |         (('Julia', 'Steve'), constraint_func),
21 |         (('Julia', 'Amanda'), constraint_func),
22 |         (('Julia', 'Derek'), constraint_func),
23 |         (('Julia', 'Brian'), constraint_func),
24 |         (('Steve', 'Amanda'), constraint_func),
25 |         (('Steve', 'Allan'), constraint_func),
26 |         (('Steve', 'Michelle'), constraint_func),
27 |         (('Amanda', 'Michelle'), constraint_func),
28 |         (('Amanda', 'Joanne'), constraint_func),
29 |         (('Amanda', 'Derek'), constraint_func),
30 |         (('Brian', 'Derek'), constraint_func),
31 |         (('Brian', 'Kelly'), constraint_func),
32 |         (('Joanne', 'Michelle'), constraint_func),
33 |         (('Joanne', 'Amanda'), constraint_func),
34 |         (('Joanne', 'Derek'), constraint_func),
35 |         (('Joanne', 'Kelly'), constraint_func),
36 |         (('Derek', 'Kelly'), constraint_func),
37 |     ]
38 | 
39 |     # Solve the problem
40 |     problem = CspProblem(names, colors, constraints)
41 | 
42 |     # Print the solution
43 |     output = backtrack(problem)
44 |     print('\nColor mapping:\n')
45 |     for k, v in output.items():
46 |         print(k, '==>', v)
47 |     
48 | 


--------------------------------------------------------------------------------
/Chapter 07/code/constrained_problem.py:
--------------------------------------------------------------------------------
 1 | from simpleai.search import CspProblem, backtrack, \
 2 |         min_conflicts, MOST_CONSTRAINED_VARIABLE, \
 3 |         HIGHEST_DEGREE_VARIABLE, LEAST_CONSTRAINING_VALUE
 4 | 
 5 | # Constraint that expects all the different variables 
 6 | # to have different values
 7 | def constraint_unique(variables, values):
 8 |     # Check if all the values are unique
 9 |     return len(values) == len(set(values))  
10 | 
11 | # Constraint that specifies that one variable 
12 | # should be bigger than other
13 | def constraint_bigger(variables, values):
14 |     return values[0] > values[1]
15 | 
16 | # Constraint that specifies that there should be 
17 | # one odd and one even variables in the two variables 
18 | def constraint_odd_even(variables, values):
19 |     # If first variable is even, then second should
20 |     # be odd and vice versa
21 |     if values[0] % 2 == 0:
22 |         return values[1] % 2 == 1 
23 |     else:
24 |         return values[1] % 2 == 0
25 | 
26 | if __name__=='__main__':
27 |     variables = ('John', 'Anna', 'Tom', 'Patricia')
28 | 
29 |     domains = {
30 |         'John': [1, 2, 3],
31 |         'Anna': [1, 3],
32 |         'Tom': [2, 4],
33 |         'Patricia': [2, 3, 4],
34 |     }
35 | 
36 |     constraints = [
37 |         (('John', 'Anna', 'Tom'), constraint_unique),
38 |         (('Tom', 'Anna'), constraint_bigger),
39 |         (('John', 'Patricia'), constraint_odd_even),
40 |     ]
41 | 
42 |     problem = CspProblem(variables, domains, constraints)
43 | 
44 |     print('\nSolutions:\n\nNormal:', backtrack(problem))
45 |     print('\nMost constrained variable:', backtrack(problem, 
46 |             variable_heuristic=MOST_CONSTRAINED_VARIABLE))
47 |     print('\nHighest degree variable:', backtrack(problem, 
48 |             variable_heuristic=HIGHEST_DEGREE_VARIABLE))
49 |     print('\nLeast constraining value:', backtrack(problem, 
50 |             value_heuristic=LEAST_CONSTRAINING_VALUE))
51 |     print('\nMost constrained variable and least constraining value:', 
52 |             backtrack(problem, variable_heuristic=MOST_CONSTRAINED_VARIABLE, 
53 |             value_heuristic=LEAST_CONSTRAINING_VALUE))
54 |     print('\nHighest degree and least constraining value:', 
55 |             backtrack(problem, variable_heuristic=HIGHEST_DEGREE_VARIABLE, 
56 |             value_heuristic=LEAST_CONSTRAINING_VALUE))
57 |     print('\nMinimum conflicts:', min_conflicts(problem))
58 | 
59 | 


--------------------------------------------------------------------------------
/Chapter 07/code/greedy_search.py:
--------------------------------------------------------------------------------
 1 | import argparse
 2 | import simpleai.search as ss 
 3 | 
 4 | def build_arg_parser():
 5 |     parser = argparse.ArgumentParser(description='Creates the input string \
 6 |             using the greedy algorithm')
 7 |     parser.add_argument("--input-string", dest="input_string", required=True,
 8 |             help="Input string")
 9 |     parser.add_argument("--initial-state", dest="initial_state", required=False,
10 |             default='', help="Starting point for the search")
11 |     return parser
12 | 
13 | class CustomProblem(ss.SearchProblem):
14 |     def set_target(self, target_string):
15 |         self.target_string = target_string
16 | 
17 |     # Check the current state and take the right action
18 |     def actions(self, cur_state):
19 |         if len(cur_state) < len(self.target_string):
20 |             alphabets = 'abcdefghijklmnopqrstuvwxyz'
21 |             return list(alphabets + ' ' + alphabets.upper())
22 |         else:
23 |             return []
24 | 
25 |     # Concatenate state and action to get the result
26 |     def result(self, cur_state, action):
27 |         return cur_state + action
28 | 
29 |     # Check if goal has been achieved
30 |     def is_goal(self, cur_state):
31 |         return cur_state == self.target_string
32 | 
33 |     # Define the heuristic that will be used
34 |     def heuristic(self, cur_state):
35 |         # Compare current string with target string 
36 |         dist = sum([1 if cur_state[i] != self.target_string[i] else 0
37 |                     for i in range(len(cur_state))])
38 | 
39 |         # Difference between the lengths
40 |         diff = len(self.target_string) - len(cur_state)
41 | 
42 |         return dist + diff 
43 | 
44 | if __name__=='__main__':
45 |     args = build_arg_parser().parse_args()
46 | 
47 |     # Initialize the object
48 |     problem = CustomProblem()
49 | 
50 |     # Set target string and initial state
51 |     problem.set_target(args.input_string)
52 |     problem.initial_state = args.initial_state
53 | 
54 |     # Solve the problem
55 |     output = ss.greedy(problem)
56 | 
57 |     print('\nTarget string:', args.input_string)
58 |     print('\nPath to the solution:')
59 |     for item in output.path():
60 |         print(item)
61 | 


--------------------------------------------------------------------------------
/Chapter 07/code/maze.py:
--------------------------------------------------------------------------------
  1 | import math
  2 | from simpleai.search import SearchProblem, astar
  3 | 
  4 | # Class containing the methods to solve the maze
  5 | class MazeSolver(SearchProblem):
  6 |     # Initialize the class 
  7 |     def __init__(self, board):
  8 |         self.board = board
  9 |         self.goal = (0, 0)
 10 | 
 11 |         for y in range(len(self.board)):
 12 |             for x in range(len(self.board[y])):
 13 |                 if self.board[y][x].lower() == "o":
 14 |                     self.initial = (x, y)
 15 |                 elif self.board[y][x].lower() == "x":
 16 |                     self.goal = (x, y)
 17 | 
 18 |         super(MazeSolver, self).__init__(initial_state=self.initial)
 19 | 
 20 |     # Define the method that takes actions
 21 |     # to arrive at the solution
 22 |     def actions(self, state):
 23 |         actions = []
 24 |         for action in COSTS.keys():
 25 |             newx, newy = self.result(state, action)
 26 |             if self.board[newy][newx] != "#":
 27 |                 actions.append(action)
 28 | 
 29 |         return actions
 30 | 
 31 |     # Update the state based on the action
 32 |     def result(self, state, action):
 33 |         x, y = state
 34 | 
 35 |         if action.count("up"):
 36 |             y -= 1
 37 |         if action.count("down"):
 38 |             y += 1
 39 |         if action.count("left"):
 40 |             x -= 1
 41 |         if action.count("right"):
 42 |             x += 1
 43 | 
 44 |         new_state = (x, y)
 45 | 
 46 |         return new_state
 47 | 
 48 |     # Check if we have reached the goal
 49 |     def is_goal(self, state):
 50 |         return state == self.goal
 51 | 
 52 |     # Compute the cost of taking an action
 53 |     def cost(self, state, action, state2):
 54 |         return COSTS[action]
 55 | 
 56 |     # Heuristic that we use to arrive at the solution
 57 |     def heuristic(self, state):
 58 |         x, y = state
 59 |         gx, gy = self.goal
 60 | 
 61 |         return math.sqrt((x - gx) ** 2 + (y - gy) ** 2)
 62 | 
 63 | if __name__ == "__main__":
 64 |     # Define the map
 65 |     MAP = """
 66 |     ##############################
 67 |     #         #              #   #
 68 |     # ####    ########       #   #
 69 |     #  o #    #              #   #
 70 |     #    ###     #####  ######   #
 71 |     #      #   ###   #           #
 72 |     #      #     #   #  #  #   ###
 73 |     #     #####    #    #  # x   #
 74 |     #              #       #     #
 75 |     ##############################
 76 |     """
 77 | 
 78 |     # Convert map to a list
 79 |     print(MAP)
 80 |     MAP = [list(x) for x in MAP.split("\n") if x]
 81 | 
 82 |     # Define cost of moving around the map
 83 |     cost_regular = 1.0
 84 |     cost_diagonal = 1.7
 85 | 
 86 |     # Create the cost dictionary
 87 |     COSTS = {
 88 |         "up": cost_regular,
 89 |         "down": cost_regular,
 90 |         "left": cost_regular,
 91 |         "right": cost_regular,
 92 |         "up left": cost_diagonal,
 93 |         "up right": cost_diagonal,
 94 |         "down left": cost_diagonal,
 95 |         "down right": cost_diagonal,
 96 |     }
 97 | 
 98 |     # Create maze solver object
 99 |     problem = MazeSolver(MAP)
100 | 
101 |     # Run the solver
102 |     result = astar(problem, graph_search=True)
103 | 
104 |     # Extract the path
105 |     path = [x[1] for x in result.path()]
106 | 
107 |     # Print the result
108 |     print()
109 |     for y in range(len(MAP)):
110 |         for x in range(len(MAP[y])):
111 |             if (x, y) == problem.initial:
112 |                 print('o', end='')
113 |             elif (x, y) == problem.goal:
114 |                 print('x', end='')
115 |             elif (x, y) in path:
116 |                 print('·', end='')
117 |             else:
118 |                 print(MAP[y][x], end='')
119 | 
120 |         print()
121 | 
122 | 


--------------------------------------------------------------------------------
/Chapter 07/code/puzzle.py:
--------------------------------------------------------------------------------
 1 | from simpleai.search import astar, SearchProblem
 2 | 
 3 | # Class containing methods to solve the puzzle
 4 | class PuzzleSolver(SearchProblem):
 5 |     # Action method to get the list of the possible 
 6 |     # numbers that can be moved in to the empty space 
 7 |     def actions(self, cur_state):
 8 |         rows = string_to_list(cur_state)
 9 |         row_empty, col_empty = get_location(rows, 'e')
10 | 
11 |         actions = []
12 |         if row_empty > 0:
13 |             actions.append(rows[row_empty - 1][col_empty])
14 |         if row_empty < 2:
15 |             actions.append(rows[row_empty + 1][col_empty])
16 |         if col_empty > 0:
17 |             actions.append(rows[row_empty][col_empty - 1])
18 |         if col_empty < 2:
19 |             actions.append(rows[row_empty][col_empty + 1])
20 | 
21 |         return actions
22 | 
23 |     # Return the resulting state after moving a piece to the empty space
24 |     def result(self, state, action):
25 |         rows = string_to_list(state)
26 |         row_empty, col_empty = get_location(rows, 'e')
27 |         row_new, col_new = get_location(rows, action)
28 | 
29 |         rows[row_empty][col_empty], rows[row_new][col_new] = \
30 |                 rows[row_new][col_new], rows[row_empty][col_empty]
31 | 
32 |         return list_to_string(rows)
33 | 
34 |     # Returns true if a state is the goal state
35 |     def is_goal(self, state):
36 |         return state == GOAL
37 | 
38 |     # Returns an estimate of the distance from a state to 
39 |     # the goal using the manhattan distance
40 |     def heuristic(self, state):
41 |         rows = string_to_list(state)
42 | 
43 |         distance = 0
44 | 
45 |         for number in '12345678e':
46 |             row_new, col_new = get_location(rows, number)
47 |             row_new_goal, col_new_goal = goal_positions[number]
48 | 
49 |             distance += abs(row_new - row_new_goal) + abs(col_new - col_new_goal)
50 | 
51 |         return distance
52 | 
53 | # Convert list to string
54 | def list_to_string(input_list):
55 |     return '\n'.join(['-'.join(x) for x in input_list])
56 | 
57 | # Convert string to list
58 | def string_to_list(input_string):
59 |     return [x.split('-') for x in input_string.split('\n')]
60 | 
61 | # Find the 2D location of the input element 
62 | def get_location(rows, input_element):
63 |     for i, row in enumerate(rows):
64 |         for j, item in enumerate(row):
65 |             if item == input_element:
66 |                 return i, j  
67 | 
68 | # Final result that we want to achieve
69 | GOAL = '''1-2-3
70 | 4-5-6
71 | 7-8-e'''
72 | 
73 | # Starting point
74 | INITIAL = '''1-e-2
75 | 6-3-4
76 | 7-5-8'''
77 | 
78 | # Create a cache for the goal position of each piece
79 | goal_positions = {}
80 | rows_goal = string_to_list(GOAL)
81 | for number in '12345678e':
82 |     goal_positions[number] = get_location(rows_goal, number)
83 | 
84 | # Create the solver object
85 | result = astar(PuzzleSolver(INITIAL))
86 | 
87 | # Print the results
88 | for i, (action, state) in enumerate(result.path()):
89 |     print()
90 |     if action == None:
91 |         print('Initial configuration')
92 |     elif i == len(result.path()) - 1:
93 |         print('After moving', action, 'into the empty space. Goal achieved!')
94 |     else:
95 |         print('After moving', action, 'into the empty space')
96 | 
97 |     print(state)
98 | 
99 | 


--------------------------------------------------------------------------------
/Chapter 07/code/simpleai.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 07/code/simpleai.zip


--------------------------------------------------------------------------------
/Chapter 08/code/symbol_regression.py:
--------------------------------------------------------------------------------
 1 | import operator
 2 | import math
 3 | import random
 4 | 
 5 | import numpy as np
 6 | from deap import algorithms, base, creator, tools, gp
 7 | 
 8 | # Define new functions
 9 | def division_operator(numerator, denominator):
10 |     if denominator == 0:
11 |         return 1
12 | 
13 |     return numerator / denominator 
14 | 
15 | # Define the evaluation function
16 | def eval_func(individual, points):
17 |     # Transform the tree expression in a callable function
18 |     func = toolbox.compile(expr=individual)
19 | 
20 |     # Evaluate the mean squared error
21 |     mse = ((func(x) - (2 * x**3 - 3 * x**2 + 4 * x - 1))**2 for x in points)
22 | 
23 |     return math.fsum(mse) / len(points),
24 | 
25 | # Function to create the toolbox
26 | def create_toolbox():
27 |     pset = gp.PrimitiveSet("MAIN", 1)
28 |     pset.addPrimitive(operator.add, 2)
29 |     pset.addPrimitive(operator.sub, 2)
30 |     pset.addPrimitive(operator.mul, 2)
31 |     pset.addPrimitive(division_operator, 2)
32 |     pset.addPrimitive(operator.neg, 1)
33 |     pset.addPrimitive(math.cos, 1)
34 |     pset.addPrimitive(math.sin, 1)
35 | 
36 |     pset.addEphemeralConstant("rand101", lambda: random.randint(-1,1))
37 | 
38 |     pset.renameArguments(ARG0='x')
39 | 
40 |     creator.create("FitnessMin", base.Fitness, weights=(-1.0,))
41 |     creator.create("Individual", gp.PrimitiveTree, fitness=creator.FitnessMin)
42 | 
43 |     toolbox = base.Toolbox()
44 | 
45 |     toolbox.register("expr", gp.genHalfAndHalf, pset=pset, min_=1, max_=2)
46 |     toolbox.register("individual", tools.initIterate, creator.Individual, toolbox.expr)
47 |     toolbox.register("population", tools.initRepeat, list, toolbox.individual)
48 |     toolbox.register("compile", gp.compile, pset=pset)
49 |     toolbox.register("evaluate", eval_func, points=[x/10. for x in range(-10,10)])
50 |     toolbox.register("select", tools.selTournament, tournsize=3)
51 |     toolbox.register("mate", gp.cxOnePoint)
52 |     toolbox.register("expr_mut", gp.genFull, min_=0, max_=2)
53 |     toolbox.register("mutate", gp.mutUniform, expr=toolbox.expr_mut, pset=pset)
54 | 
55 |     toolbox.decorate("mate", gp.staticLimit(key=operator.attrgetter("height"), max_value=17))
56 |     toolbox.decorate("mutate", gp.staticLimit(key=operator.attrgetter("height"), max_value=17))
57 | 
58 |     return toolbox
59 | 
60 | if __name__ == "__main__":
61 |     random.seed(7)
62 | 
63 |     toolbox = create_toolbox()
64 | 
65 |     population = toolbox.population(n=450)
66 |     hall_of_fame = tools.HallOfFame(1)
67 |     
68 |     stats_fit = tools.Statistics(lambda x: x.fitness.values)
69 |     stats_size = tools.Statistics(len)
70 | 
71 |     mstats = tools.MultiStatistics(fitness=stats_fit, size=stats_size)
72 |     mstats.register("avg", np.mean)
73 |     mstats.register("std", np.std)
74 |     mstats.register("min", np.min)
75 |     mstats.register("max", np.max)
76 | 
77 |     # Define parameters
78 |     probab_crossover = 0.4
79 |     probab_mutate = 0.2
80 |     num_generations = 60
81 | 
82 |     population, log = algorithms.eaSimple(population, toolbox, 
83 |             probab_crossover, probab_mutate, num_generations, 
84 |             stats=mstats, halloffame=hall_of_fame, verbose=True)
85 | 
86 | 


--------------------------------------------------------------------------------
/Chapter 08/code/target_map.txt:
--------------------------------------------------------------------------------
 1 | S##.............................
 2 | ..#.............................
 3 | ...######.........##########....
 4 | ........#..................#....
 5 | ........#.................#.....
 6 | ........#####.......######......
 7 | ............#......#............
 8 | ............#.......#...........
 9 | ............#.......#...........
10 | ......#######.......#...........
11 | ......#.............#...........
12 | ......#..............###........
13 | ......#................#........
14 | .......#...............#........
15 | ........#...............#.......
16 | .........#.......#......#.......
17 | .........#..............#.......
18 | ............#...........#.......
19 | ............#...#.......#.......
20 | ............#...#........#......
21 | ............#...#.........#.....
22 | ............#...#.........#.....
23 | ............#..........#........
24 | ............#..............#....
25 | ...##..#####....#..........#....
26 | .#..............#...........#...
27 | .#..............#...........#...
28 | .#......#######............#....
29 | .#.....#................#.......
30 | .......#................#.......
31 | ..####.........#.....#..........
32 | ................######..........


--------------------------------------------------------------------------------
/Chapter 09/code/coins.py:
--------------------------------------------------------------------------------
 1 | # This is a variant of the Game of Bones recipe given in the easyAI library
 2 | 
 3 | from easyAI import TwoPlayersGame, id_solve, Human_Player, AI_Player
 4 | from easyAI.AI import TT
 5 | 
 6 | class LastCoinStanding(TwoPlayersGame):
 7 |     def __init__(self, players):
 8 |         # Define the players. Necessary parameter.
 9 |         self.players = players
10 | 
11 |         # Define who starts the game. Necessary parameter.
12 |         self.nplayer = 1 
13 | 
14 |         # Overall number of coins in the pile 
15 |         self.num_coins = 25
16 | 
17 |         # Define max number of coins per move 
18 |         self.max_coins = 4 
19 | 
20 |     # Define possible moves
21 |     def possible_moves(self): 
22 |         return [str(x) for x in range(1, self.max_coins + 1)]
23 |     
24 |     # Remove coins
25 |     def make_move(self, move): 
26 |         self.num_coins -= int(move) 
27 | 
28 |     # Did the opponent take the last coin?
29 |     def win(self): 
30 |         return self.num_coins <= 0 
31 | 
32 |     # Stop the game when somebody wins 
33 |     def is_over(self): 
34 |         return self.win() 
35 | 
36 |     # Compute score
37 |     def scoring(self): 
38 |         return 100 if self.win() else 0
39 | 
40 |     # Show number of coins remaining in the pile
41 |     def show(self): 
42 |         print(self.num_coins, 'coins left in the pile')
43 | 
44 | if __name__ == "__main__":
45 |     # Define the transposition table
46 |     tt = TT()
47 | 
48 |     # Define the method
49 |     LastCoinStanding.ttentry = lambda self: self.num_coins
50 | 
51 |     # Solve the game
52 |     result, depth, move = id_solve(LastCoinStanding, 
53 |             range(2, 20), win_score=100, tt=tt)
54 |     print(result, depth, move)  
55 | 
56 |     # Start the game 
57 |     game = LastCoinStanding([AI_Player(tt), Human_Player()])
58 |     game.play() 
59 | 
60 | 


--------------------------------------------------------------------------------
/Chapter 09/code/connect_four.py:
--------------------------------------------------------------------------------
 1 | # This is a variant of the Connect Four recipe given in the easyAI library
 2 | 
 3 | import numpy as np
 4 | from easyAI import TwoPlayersGame, Human_Player, AI_Player, \
 5 |         Negamax, SSS
 6 | 
 7 | class GameController(TwoPlayersGame):
 8 |     def __init__(self, players, board = None):
 9 |         # Define the players
10 |         self.players = players
11 | 
12 |         # Define the configuration of the board
13 |         self.board = board if (board != None) else (
14 |             np.array([[0 for i in range(7)] for j in range(6)]))
15 | 
16 |         # Define who starts the game
17 |         self.nplayer = 1
18 | 
19 |         # Define the positions
20 |         self.pos_dir = np.array([[[i, 0], [0, 1]] for i in range(6)] +
21 |                    [[[0, i], [1, 0]] for i in range(7)] +
22 |                    [[[i, 0], [1, 1]] for i in range(1, 3)] +
23 |                    [[[0, i], [1, 1]] for i in range(4)] +
24 |                    [[[i, 6], [1, -1]] for i in range(1, 3)] +
25 |                    [[[0, i], [1, -1]] for i in range(3, 7)])
26 | 
27 |     # Define possible moves
28 |     def possible_moves(self):
29 |         return [i for i in range(7) if (self.board[:, i].min() == 0)]
30 | 
31 |     # Define how to make the move
32 |     def make_move(self, column):
33 |         line = np.argmin(self.board[:, column] != 0)
34 |         self.board[line, column] = self.nplayer
35 | 
36 |     # Show the current status
37 |     def show(self):
38 |         print('\n' + '\n'.join(
39 |                 ['0 1 2 3 4 5 6', 13 * '-'] +
40 |                 [' '.join([['.', 'O', 'X'][self.board[5 - j][i]]
41 |                 for i in range(7)]) for j in range(6)]))
42 | 
43 |     # Define what a loss_condition looks like 
44 |     def loss_condition(self):
45 |         for pos, direction in self.pos_dir:
46 |             streak = 0
47 |             while (0 <= pos[0] <= 5) and (0 <= pos[1] <= 6):
48 |                 if self.board[pos[0], pos[1]] == self.nopponent:
49 |                     streak += 1
50 |                     if streak == 4:
51 |                         return True
52 |                 else:
53 |                     streak = 0
54 | 
55 |                 pos = pos + direction
56 | 
57 |         return False
58 | 
59 |     # Check if the game is over
60 |     def is_over(self):
61 |         return (self.board.min() > 0) or self.loss_condition()
62 | 
63 |     # Compute the score
64 |     def scoring(self):
65 |         return -100 if self.loss_condition() else 0
66 | 
67 | if __name__ == '__main__':
68 |     # Define the algorithms that will be used
69 |     algo_neg = Negamax(5)
70 |     algo_sss = SSS(5)
71 | 
72 |     # Start the game
73 |     game = GameController([AI_Player(algo_neg), AI_Player(algo_sss)])
74 |     game.play()
75 | 
76 |     # Print the result
77 |     if game.loss_condition():
78 |         print('\nPlayer', game.nopponent, 'wins.')
79 |     else:
80 |         print("\nIt's a draw.")
81 | 
82 | 


--------------------------------------------------------------------------------
/Chapter 09/code/easyAI/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 09/code/easyAI/.DS_Store


--------------------------------------------------------------------------------
/Chapter 09/code/easyAI/AI/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 09/code/easyAI/AI/.DS_Store


--------------------------------------------------------------------------------
/Chapter 09/code/easyAI/AI/DUAL.py:
--------------------------------------------------------------------------------
 1 | #contributed by mrfesol (Tomasz Wesolowski)
 2 | 
 3 | from easyAI.AI.MTdriver import mtd
 4 | 
 5 | class DUAL:
 6 |     """
 7 |     This implements DUAL algorithm. The following example shows
 8 |     how to setup the AI and play a Connect Four game:
 9 |     
10 |         >>> from easyAI import Human_Player, AI_Player, DUAL
11 |         >>> AI = DUAL(7)
12 |         >>> game = ConnectFour([AI_Player(AI),Human_Player()])
13 |         >>> game.play()
14 |     
15 |     Parameters
16 |     -----------
17 |     
18 |     depth:
19 |       How many moves in advance should the AI think ?
20 |       (2 moves = 1 complete turn)
21 |     
22 |     scoring:
23 |       A function f(game)-> score. If no scoring is provided
24 |          and the game object has a ``scoring`` method it ill be used.
25 |     
26 |     win_score:
27 |       Score LARGER than the largest score of game, but smaller than inf. 
28 |       It's required to run algorithm.
29 |         
30 |     tt:
31 |       A transposition table (a table storing game states and moves)
32 |       scoring: can be none if the game that the AI will be given has a
33 |       ``scoring`` method.
34 |       
35 |     Notes
36 |     -----
37 |    
38 |     The score of a given game is given by
39 |     
40 |     >>> scoring(current_game) - 0.01*sign*current_depth
41 |     
42 |     for instance if a lose is -100 points, then losing after 4 moves
43 |     will score -99.96 points but losing after 8 moves will be -99.92
44 |     points. Thus, the AI will chose the move that leads to defeat in
45 |     8 turns, which makes it more difficult for the (human) opponent.
46 |     This will not always work if a ``win_score`` argument is provided.
47 |     
48 |     """
49 |     
50 |     def __init__(self, depth, scoring=None, win_score=100000, tt=None):
51 |         self.scoring = scoring        
52 |         self.depth = depth
53 |         self.tt = tt
54 |         self.win_score= win_score
55 |     
56 |     def __call__(self,game):
57 |         """
58 |         Returns the AI's best move given the current state of the game.
59 |         """
60 |         
61 |         scoring = self.scoring if self.scoring else (
62 |                        lambda g: g.scoring() ) # horrible hack
63 |         
64 |         first = -self.win_score #essence of DUAL algorithm
65 |         next = (lambda lowerbound, upperbound, bestValue: bestValue + 1) 
66 |         
67 |         self.alpha = mtd(game, 
68 |                          first, next,
69 |                          self.depth, 
70 |                          scoring,
71 |                          self.tt)
72 |         
73 |         return game.ai_move
74 | 


--------------------------------------------------------------------------------
/Chapter 09/code/easyAI/AI/DictTT.py:
--------------------------------------------------------------------------------
  1 | #contributed by mrfesol (Tomasz Wesolowski)
  2 | from easyAI.AI.HashTT import HashTT
  3 | 
  4 | class DictTT:
  5 |     """
  6 |     A DictTT implements custom dictionary,
  7 |     which can be used with transposition tables.
  8 |     """
  9 |     def __init__(self, num_buckets=1024, own_hash = None):
 10 |         """
 11 |         Initializes a dictionary with the given number of buckets.
 12 |         """
 13 |         self.dict = []
 14 |         for i in range(num_buckets):
 15 |             self.dict.append((None, None))
 16 |         self.keys = dict()
 17 |         self.hash = hash
 18 |         if own_hash != None:
 19 |             own_hash.modulo = len(self.dict)
 20 |             self.hash = own_hash.get_hash
 21 |         self.num_collisions = 0
 22 |         self.num_calls = 0
 23 |     
 24 |     def hash_key(self, key):
 25 |         """
 26 |         Given a key this will create a number and then convert it to
 27 |         an index for the dict.
 28 |         """
 29 |         self.num_calls += 1
 30 |         return self.hash(key) % len(self.dict)
 31 |     
 32 |     def get_slot(self, key, default=None):
 33 |         """
 34 |         Returns the index, key, and value of a slot found in the dict.
 35 |         Returns -1, key, and default (None if not set) when not found.
 36 |         """
 37 |         slot = self.hash_key(key)
 38 |         
 39 |         if key == self.dict[slot][0]:
 40 |             return slot, self.dict[slot][0], self.dict[slot][1]
 41 |     
 42 |         return -1, key, default
 43 |     
 44 |     def get(self, key, default=None):
 45 |         """
 46 |         Gets the value for the given key, or the default.
 47 |         """        
 48 |         i, k, v = self.get_slot(key, default=default)
 49 |         return v
 50 |     
 51 |     def set(self, key, value):
 52 |         """
 53 |         Sets the key to the value, replacing any existing value.
 54 |         """
 55 |         slot = self.hash_key(key)
 56 |         
 57 |         if self.dict[slot] != (None, None):
 58 |             self.num_collisions += 1 #collision occured
 59 |                 
 60 |         self.dict[slot] = (key, value)
 61 |      
 62 |         if self.keys.__contains__(key):
 63 |             self.keys[key] = self.keys[key] + 1
 64 |         else:
 65 |             self.keys[key] = 1
 66 |     
 67 |     def delete(self, key):
 68 |         """
 69 |         Deletes the given key from the dictionary.
 70 |         """
 71 |         
 72 |         slot = self.hash_key(key)
 73 |         self.dict[slot] = (None, None)
 74 |             
 75 |         if self.keys.__contains__(key):
 76 |             self.keys[key] = self.keys[key] - 1
 77 |             if self.keys[key] <= 0:
 78 |                 del self.keys[key]
 79 |                 
 80 |     def collisions(self):
 81 |         return self.num_collisions
 82 |                     
 83 |     def __getitem__(self, key):
 84 |         return self.get(key)
 85 |     
 86 |     def __missing__(self, key):
 87 |         return None
 88 |     
 89 |     def __setitem__(self, key, value):
 90 |         self.set(key, value)
 91 |         
 92 |     def __delitem__(self, key):
 93 |         self.delete(key)
 94 |         
 95 |     def __iter__(self):
 96 |         return iter(self.keys)
 97 |         
 98 |     def __contains__(self, key):
 99 |         return self.keys.__contains__(key)
100 |         


--------------------------------------------------------------------------------
/Chapter 09/code/easyAI/AI/HashTT.py:
--------------------------------------------------------------------------------
 1 | #contributed by mrfesol (Tomasz Wesolowski)
 2 | 
 3 | class HashTT:
 4 |     """ 
 5 |         Base Class for various types of hashes
 6 |     """    
 7 |     
 8 |     def __init__(self):
 9 |         self.modulo = 1024 #default value
10 |         
11 |     def before(self, key):
12 |         """
13 |         Returns initial value of hash.
14 |         It's also the place where you can initialize some auxiliary variables
15 |         """
16 |         return 0
17 |     
18 |     def after(self, key, hash):
19 |         """
20 |         Returns final value of hash
21 |         """
22 |         return hash
23 |         
24 |     def get_hash(self, key, depth = 0):
25 |         """
26 |         Recursively computes a hash
27 |         """
28 |         ret_hash = self.before(key)
29 |         if type(key) is int:
30 |             return self.hash_int(key)
31 |         if type(key) is str and len(key) <= 1:
32 |             return self.hash_char(key)
33 |         for v in list(key):
34 |             ret_hash = self.join(ret_hash, self.get_hash(v, depth+1)) % self.modulo
35 |         if depth == 0:
36 |             ret_hash = self.after(key, ret_hash)
37 |         return ret_hash
38 |     
39 |     def hash_int(self, number):
40 |         """
41 |         Returns hash for a number
42 |         """
43 |         return number
44 |     
45 |     def hash_char(self, string):
46 |         """
47 |         Returns hash for an one-letter string
48 |         """
49 |         return ord(string) 
50 |     
51 |     def join(self, one, two):
52 |         """
53 |         Returns combined hash from two hashes
54 |         one - existing (combined) hash so far
55 |         two - hash of new element
56 |         one = join(one, two)
57 |         """
58 |         return (one * two) % self.modulo


--------------------------------------------------------------------------------
/Chapter 09/code/easyAI/AI/SSS.py:
--------------------------------------------------------------------------------
 1 | #contributed by mrfesol (Tomasz Wesolowski)
 2 | 
 3 | from easyAI.AI.MTdriver import mtd
 4 | 
 5 | class SSS:
 6 |     """
 7 |     This implements SSS* algorithm. The following example shows
 8 |     how to setup the AI and play a Connect Four game:
 9 |     
10 |         >>> from easyAI import Human_Player, AI_Player, SSS
11 |         >>> AI = SSS(7)
12 |         >>> game = ConnectFour([AI_Player(AI),Human_Player()])
13 |         >>> game.play()
14 |     
15 |     Parameters
16 |     -----------
17 |     
18 |     depth:
19 |       How many moves in advance should the AI think ?
20 |       (2 moves = 1 complete turn)
21 |     
22 |     scoring:
23 |       A function f(game)-> score. If no scoring is provided
24 |          and the game object has a ``scoring`` method it ill be used.
25 |     
26 |     win_score:
27 |       Score LARGER than the largest score of game, but smaller than inf. 
28 |       It's required to run algorithm.
29 |         
30 |     tt:
31 |       A transposition table (a table storing game states and moves)
32 |       scoring: can be none if the game that the AI will be given has a
33 |       ``scoring`` method.
34 |       
35 |     Notes
36 |     -----
37 |    
38 |     The score of a given game is given by
39 |     
40 |     >>> scoring(current_game) - 0.01*sign*current_depth
41 |     
42 |     for instance if a lose is -100 points, then losing after 4 moves
43 |     will score -99.96 points but losing after 8 moves will be -99.92
44 |     points. Thus, the AI will chose the move that leads to defeat in
45 |     8 turns, which makes it more difficult for the (human) opponent.
46 |     This will not always work if a ``win_score`` argument is provided.
47 |     
48 |     """
49 |     
50 |     def __init__(self, depth, scoring=None, win_score=100000, tt=None):
51 |         self.scoring = scoring        
52 |         self.depth = depth
53 |         self.tt = tt
54 |         self.win_score= win_score
55 |     
56 |     def __call__(self,game):
57 |         """
58 |         Returns the AI's best move given the current state of the game.
59 |         """
60 |         
61 |         scoring = self.scoring if self.scoring else (
62 |                        lambda g: g.scoring() ) # horrible hack
63 |         
64 |         first = self.win_score #essence of SSS algorithm
65 |         next = (lambda lowerbound, upperbound, bestValue: bestValue) 
66 |         
67 |         self.alpha = mtd(game, 
68 |                          first, next,
69 |                          self.depth, 
70 |                          scoring,
71 |                          self.tt)
72 |                 
73 |         return game.ai_move
74 | 


--------------------------------------------------------------------------------
/Chapter 09/code/easyAI/AI/TT.py:
--------------------------------------------------------------------------------
 1 | """
 2 | This module implements transposition tables, which store positions
 3 | and moves to speed up the AI.
 4 | """
 5 | 
 6 | import pickle
 7 | from easyAI.AI.DictTT import DictTT
 8 | 
 9 | class TT:
10 |     """
11 |     A tranposition table made out of a Python dictionnary.
12 |     It can only be used on games which have a method
13 |     game.ttentry() -> string, or tuple
14 |     
15 |     Usage:
16 |         
17 |         >>> table = TT(DictTT(1024)) or table = TT() for default dictionary
18 |         >>> ai = Negamax(8, scoring, tt = table) # boosted Negamax !
19 |         >>> ai(some_game) # computes a move, fills the table
20 |         >>> table.to_file('saved_tt.data') # maybe save for later ?
21 |         
22 |         >>> # later...
23 |         >>> table = TT.fromfile('saved_tt.data')
24 |         >>> ai = Negamax(8, scoring, tt = table) # boosted Negamax !
25 |     
26 |     Transposition tables can also be used as an AI (``AI_player(tt)``)
27 |     but they must be exhaustive in this case: if they are asked for
28 |     a position that isn't stored in the table, it will lead to an error.
29 |     
30 |     """
31 |     
32 |     def __init__(self, own_dict = None):
33 |         self.d = own_dict if own_dict != None else dict()
34 |         
35 |     def lookup(self, game):
36 |         """ Requests the entry in the table. Returns None if the
37 |             entry has not been previously stored in the table. """
38 |         return self.d.get(game.ttentry(), None)
39 |         
40 |     def __call__(self,game):
41 |         """
42 |         This method enables the transposition table to be used
43 |         like an AI algorithm. However it will just break if it falls
44 |         on some game state that is not in the table. Therefore it is a
45 |         better option to use a mixed algorithm like
46 |         
47 |         >>> # negamax boosted with a transposition table !
48 |         >>> Negamax(10, tt= my_dictTT) 
49 |         """
50 |         return self.d[game.ttentry()]['move']
51 |         
52 |     def store(self, **data):
53 |         """ Stores an entry into the table """        
54 |         entry = data.pop("game").ttentry()
55 |         self.d[entry] = data
56 |         
57 |     def tofile(self, filename):
58 |         """ Saves the transposition table to a file. Warning: the file
59 |             can be big (~100Mo). """
60 |         with open(filename, 'w+') as f:
61 |             pickle.dump(self, f)
62 |     
63 |     @staticmethod
64 |     def fromfile(self, filename):
65 |         """ Loads a transposition table previously saved with
66 |              ``TT.tofile`` """
67 |         with open(filename, 'r') as f:
68 |             pickle.load(self, filename)
69 | 


--------------------------------------------------------------------------------
/Chapter 09/code/easyAI/AI/__init__.py:
--------------------------------------------------------------------------------
1 | from .Negamax import Negamax
2 | from .TT import TT
3 | from .solving import id_solve, df_solve
4 | from .MTdriver import mtd
5 | from .SSS import SSS
6 | from .DUAL import DUAL
7 | from .HashTT import HashTT


--------------------------------------------------------------------------------
/Chapter 09/code/easyAI/Player.py:
--------------------------------------------------------------------------------
 1 | """
 2 | This module implements the Player (Human or AI), which is basically an
 3 | object with an ``ask_move(game)`` method
 4 | """
 5 | try:
 6 |     input = raw_input
 7 | except NameError:
 8 |     pass
 9 | 
10 | 
11 | class Human_Player:
12 |     """
13 |     Class for a human player, which gets asked by text what moves
14 |     she wants to play. She can type ``show moves`` to display a list of
15 |     moves, or ``quit`` to quit the game.
16 |     """
17 | 
18 |     def __init__(self, name = 'Human'):
19 |         self.name = name
20 | 
21 |     def ask_move(self, game):
22 |         possible_moves = game.possible_moves()
23 |         # The str version of every move for comparison with the user input:
24 |         possible_moves_str = list(map(str, game.possible_moves()))
25 |         move = "NO_MOVE_DECIDED_YET"
26 |         while True:
27 |             move = input("\nPlayer %s what do you play ? "%(game.nplayer))
28 |             if move == 'show moves':
29 |                 print ("Possible moves:\n"+ "\n".join(
30 |                        ["#%d: %s"%(i+1,m) for i,m in enumerate(possible_moves)])
31 |                        +"\nType a move or type 'move #move_number' to play.")
32 | 
33 |             elif move == 'quit':
34 |                 raise KeyboardInterrupt
35 | 
36 |             elif move.startswith("move #"):
37 |                 # Fetch the corresponding move and return.
38 |                 move = possible_moves[int(move[6:])-1]
39 |                 return move
40 | 
41 |             elif str(move) in possible_moves_str:
42 |                 # Transform the move into its real type (integer, etc. and return).
43 |                 move = possible_moves[possible_moves_str.index(str(move))]
44 |                 return move
45 | 
46 | class AI_Player:
47 |     """
48 |     Class for an AI player. This class must be initialized with an
49 |     AI algortihm, like ``AI_Player( Negamax(9) )``
50 |     """
51 | 
52 |     def __init__(self, AI_algo, name = 'AI'):
53 |         self.AI_algo = AI_algo
54 |         self.name = name
55 |         self.move = {}
56 | 
57 |     def ask_move(self, game):
58 |         return self.AI_algo(game)
59 | 


--------------------------------------------------------------------------------
/Chapter 09/code/easyAI/__init__.py:
--------------------------------------------------------------------------------
 1 | __all__ = ['TwoPlayersGame', 'Human_Player', 'AI_Player',
 2 |            'Negamax', 'TT', 'id_solve', 'df_solve']
 3 | 
 4 | from .TwoPlayersGame import TwoPlayersGame
 5 | from .Player import Human_Player, AI_Player
 6 | from .AI import Negamax, id_solve, df_solve
 7 | from .AI import TT
 8 | from .AI import mtd
 9 | from .AI import SSS, DUAL
10 | from .AI import HashTT, DictTT


--------------------------------------------------------------------------------
/Chapter 09/code/hexapawn.py:
--------------------------------------------------------------------------------
 1 | # This is a variant of the Hexapawn recipe given in the easyAI library
 2 | 
 3 | from easyAI import TwoPlayersGame, AI_Player, \
 4 |         Human_Player, Negamax
 5 | 
 6 | class GameController(TwoPlayersGame):
 7 |     def __init__(self, players, size = (4, 4)):
 8 |         self.size = size
 9 |         num_pawns, len_board = size
10 |         p = [[(i, j) for j in range(len_board)] \
11 |                 for i in [0, num_pawns - 1]]
12 | 
13 |         for i, d, goal, pawns in [(0, 1, num_pawns - 1, 
14 |                 p[0]), (1, -1, 0, p[1])]:
15 |             players[i].direction = d
16 |             players[i].goal_line = goal
17 |             players[i].pawns = pawns
18 | 
19 |         # Define the players
20 |         self.players = players
21 | 
22 |         # Define who starts first
23 |         self.nplayer = 1
24 | 
25 |         # Define the alphabets
26 |         self.alphabets = 'ABCDEFGHIJ'
27 | 
28 |         # Convert B4 to (1, 3)
29 |         self.to_tuple = lambda s: (self.alphabets.index(s[0]), 
30 |                 int(s[1:]) - 1)
31 | 
32 |         # Convert (1, 3) to B4
33 |         self.to_string = lambda move: ' '.join([self.alphabets[
34 |                 move[i][0]] + str(move[i][1] + 1)
35 |                 for i in (0, 1)])
36 | 
37 |     # Define the possible moves
38 |     def possible_moves(self):
39 |         moves = []
40 |         opponent_pawns = self.opponent.pawns
41 |         d = self.player.direction
42 | 
43 |         for i, j in self.player.pawns:
44 |             if (i + d, j) not in opponent_pawns:
45 |                 moves.append(((i, j), (i + d, j)))
46 | 
47 |             if (i + d, j + 1) in opponent_pawns:
48 |                 moves.append(((i, j), (i + d, j + 1)))
49 | 
50 |             if (i + d, j - 1) in opponent_pawns:
51 |                 moves.append(((i, j), (i + d, j - 1)))
52 | 
53 |         return list(map(self.to_string, [(i, j) for i, j in moves]))
54 | 
55 |     # Define how to make a move
56 |     def make_move(self, move):
57 |         move = list(map(self.to_tuple, move.split(' ')))
58 |         ind = self.player.pawns.index(move[0])
59 |         self.player.pawns[ind] = move[1]
60 | 
61 |         if move[1] in self.opponent.pawns:
62 |             self.opponent.pawns.remove(move[1])
63 | 
64 |     # Define what a loss looks like
65 |     def loss_condition(self):
66 |         return (any([i == self.opponent.goal_line
67 |                 for i, j in self.opponent.pawns])
68 |                 or (self.possible_moves() == []) )
69 | 
70 |     # Check if the game is over
71 |     def is_over(self):
72 |         return self.loss_condition()
73 | 
74 |     # Show the current status
75 |     def show(self):
76 |         f = lambda x: '1' if x in self.players[0].pawns else (
77 |                 '2' if x in self.players[1].pawns else '.')
78 | 
79 |         print("\n".join([" ".join([f((i, j))
80 |                 for j in range(self.size[1])])
81 |                 for i in range(self.size[0])]))
82 | 
83 | if __name__=='__main__':
84 |     # Compute the score
85 |     scoring = lambda game: -100 if game.loss_condition() else 0
86 | 
87 |     # Define the algorithm
88 |     algorithm = Negamax(12, scoring)
89 | 
90 |     # Start the game
91 |     game = GameController([AI_Player(algorithm), 
92 |             AI_Player(algorithm)])
93 |     game.play()
94 |     print('\nPlayer', game.nopponent, 'wins after', game.nmove , 'turns')
95 | 
96 | 


--------------------------------------------------------------------------------
/Chapter 09/code/tic_tac_toe.py:
--------------------------------------------------------------------------------
 1 | # This is a variant of the Tic Tac Toe recipe given in the easyAI library
 2 | 
 3 | from easyAI import TwoPlayersGame, AI_Player, Negamax
 4 | from easyAI.Player import Human_Player
 5 | 
 6 | class GameController(TwoPlayersGame):
 7 |     def __init__(self, players):
 8 |         # Define the players
 9 |         self.players = players
10 | 
11 |         # Define who starts the game
12 |         self.nplayer = 1 
13 | 
14 |         # Define the board
15 |         self.board = [0] * 9
16 |     
17 |     # Define possible moves
18 |     def possible_moves(self):
19 |         return [a + 1 for a, b in enumerate(self.board) if b == 0]
20 |     
21 |     # Make a move
22 |     def make_move(self, move):
23 |         self.board[int(move) - 1] = self.nplayer
24 | 
25 |     # Does the opponent have three in a line?
26 |     def loss_condition(self):
27 |         possible_combinations = [[1,2,3], [4,5,6], [7,8,9],
28 |             [1,4,7], [2,5,8], [3,6,9], [1,5,9], [3,5,7]]
29 | 
30 |         return any([all([(self.board[i-1] == self.nopponent)
31 |                 for i in combination]) for combination in possible_combinations]) 
32 |         
33 |     # Check if the game is over
34 |     def is_over(self):
35 |         return (self.possible_moves() == []) or self.loss_condition()
36 |         
37 |     # Show current position
38 |     def show(self):
39 |         print('\n'+'\n'.join([' '.join([['.', 'O', 'X'][self.board[3*j + i]]
40 |                 for i in range(3)]) for j in range(3)]))
41 |                  
42 |     # Compute the score
43 |     def scoring(self):
44 |         return -100 if self.loss_condition() else 0
45 | 
46 | if __name__ == "__main__":
47 |     # Define the algorithm
48 |     algorithm = Negamax(7)
49 | 
50 |     # Start the game
51 |     GameController([Human_Player(), AI_Player(algorithm)]).play()
52 | 
53 | 


--------------------------------------------------------------------------------
/Chapter 10/code/bag_of_words.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | from sklearn.feature_extraction.text import CountVectorizer
 3 | from nltk.corpus import brown
 4 | from text_chunker import chunker 
 5 | 
 6 | # Read the data from the Brown corpus
 7 | input_data = ' '.join(brown.words()[:5400])
 8 | 
 9 | # Number of words in each chunk 
10 | chunk_size = 800
11 | 
12 | text_chunks = chunker(input_data, chunk_size)
13 | 
14 | # Convert to dict items
15 | chunks = []
16 | for count, chunk in enumerate(text_chunks):
17 |     d = {'index': count, 'text': chunk}
18 |     chunks.append(d)
19 | 
20 | # Extract the document term matrix
21 | count_vectorizer = CountVectorizer(min_df=7, max_df=20)
22 | document_term_matrix = count_vectorizer.fit_transform([chunk['text'] for chunk in chunks])
23 | 
24 | # Extract the vocabulary and display it
25 | vocabulary = np.array(count_vectorizer.get_feature_names())
26 | print("\nVocabulary:\n", vocabulary)
27 | 
28 | # Generate names for chunks
29 | chunk_names = []
30 | for i in range(len(text_chunks)):
31 |     chunk_names.append('Chunk-' + str(i+1))
32 | 
33 | # Print the document term matrix
34 | print("\nDocument term matrix:")
35 | formatted_text = '{:>12}' * (len(chunk_names) + 1)
36 | print('\n', formatted_text.format('Word', *chunk_names), '\n')
37 | for word, item in zip(vocabulary, document_term_matrix.T):
38 |     # 'item' is a 'csr_matrix' data structure
39 |     output = [word] + [str(freq) for freq in item.data]
40 |     print(formatted_text.format(*output))
41 | 
42 | 


--------------------------------------------------------------------------------
/Chapter 10/code/category_predictor.py:
--------------------------------------------------------------------------------
 1 | from sklearn.datasets import fetch_20newsgroups
 2 | from sklearn.naive_bayes import MultinomialNB
 3 | from sklearn.feature_extraction.text import TfidfTransformer
 4 | from sklearn.feature_extraction.text import CountVectorizer
 5 | 
 6 | # Define the category map
 7 | category_map = {'talk.politics.misc': 'Politics', 'rec.autos': 'Autos', 
 8 |         'rec.sport.hockey': 'Hockey', 'sci.electronics': 'Electronics', 
 9 |         'sci.med': 'Medicine'}
10 | 
11 | # Get the training dataset
12 | training_data = fetch_20newsgroups(subset='train', 
13 |         categories=category_map.keys(), shuffle=True, random_state=5)
14 | 
15 | # Build a count vectorizer and extract term counts 
16 | count_vectorizer = CountVectorizer()
17 | train_tc = count_vectorizer.fit_transform(training_data.data)
18 | print("\nDimensions of training data:", train_tc.shape)
19 | 
20 | # Create the tf-idf transformer
21 | tfidf = TfidfTransformer()
22 | train_tfidf = tfidf.fit_transform(train_tc)
23 | 
24 | # Define test data 
25 | input_data = [
26 |     'You need to be careful with cars when you are driving on slippery roads', 
27 |     'A lot of devices can be operated wirelessly',
28 |     'Players need to be careful when they are close to goal posts',
29 |     'Political debates help us understand the perspectives of both sides'
30 | ]
31 | 
32 | # Train a Multinomial Naive Bayes classifier
33 | classifier = MultinomialNB().fit(train_tfidf, training_data.target)
34 | 
35 | # Transform input data using count vectorizer
36 | input_tc = count_vectorizer.transform(input_data)
37 | 
38 | # Transform vectorized data using tfidf transformer
39 | input_tfidf = tfidf.transform(input_tc)
40 | 
41 | # Predict the output categories
42 | predictions = classifier.predict(input_tfidf)
43 | 
44 | # Print the outputs
45 | for sent, category in zip(input_data, predictions):
46 |     print('\nInput:', sent, '\nPredicted category:', \
47 |             category_map[training_data.target_names[category]])
48 | 
49 | 


--------------------------------------------------------------------------------
/Chapter 10/code/data.txt:
--------------------------------------------------------------------------------
 1 | The Roman empire expanded very rapidly and it was the biggest empire in the world for a long time.
 2 | An algebraic structure is a set with one or more finitary operations defined on it that satisfies a list of axioms.
 3 | Renaissance started as a cultural movement in Italy in the Late Medieval period and later spread to the rest of Europe.
 4 | The line of demarcation between prehistoric and historical times is crossed when people cease to live only in the present.
 5 | Mathematicians seek out patterns and use them to formulate new conjectures.  
 6 | A notational symbol that represents a number is called a numeral in mathematics. 
 7 | The process of extracting the underlying essence of a mathematical concept is called abstraction.
 8 | Historically, people have frequently waged wars against each other in order to expand their empires.
 9 | Ancient history indicates that various outside influences have helped formulate the culture and traditions of Eastern Europe.
10 | Mappings between sets which preserve structures are of special interest in many fields of mathematics. 


--------------------------------------------------------------------------------
/Chapter 10/code/gender_identifier.py:
--------------------------------------------------------------------------------
 1 | import random
 2 | 
 3 | from nltk import NaiveBayesClassifier
 4 | from nltk.classify import accuracy as nltk_accuracy
 5 | from nltk.corpus import names
 6 | 
 7 | # Extract last N letters from the input word
 8 | # and that will act as our "feature"
 9 | def extract_features(word, N=2):
10 |     last_n_letters = word[-N:]
11 |     return {'feature': last_n_letters.lower()}
12 | 
13 | if __name__=='__main__':
14 |     # Create training data using labeled names available in NLTK
15 |     male_list = [(name, 'male') for name in names.words('male.txt')]
16 |     female_list = [(name, 'female') for name in names.words('female.txt')]
17 |     data = (male_list + female_list)
18 | 
19 |     # Seed the random number generator
20 |     random.seed(5)
21 | 
22 |     # Shuffle the data
23 |     random.shuffle(data)
24 | 
25 |     # Create test data
26 |     input_names = ['Alexander', 'Danielle', 'David', 'Cheryl']
27 | 
28 |     # Define the number of samples used for train and test
29 |     num_train = int(0.8 * len(data))
30 | 
31 |     # Iterate through different lengths to compare the accuracy
32 |     for i in range(1, 6):
33 |         print('\nNumber of end letters:', i)
34 |         features = [(extract_features(n, i), gender) for (n, gender) in data]
35 |         train_data, test_data = features[:num_train], features[num_train:]
36 |         classifier = NaiveBayesClassifier.train(train_data)
37 | 
38 |         # Compute the accuracy of the classifier 
39 |         accuracy = round(100 * nltk_accuracy(classifier, test_data), 2)
40 |         print('Accuracy = ' + str(accuracy) + '%')
41 | 
42 |         # Predict outputs for input names using the trained classifier model
43 |         for name in input_names:
44 |             print(name, '==>', classifier.classify(extract_features(name, i)))
45 | 
46 | 


--------------------------------------------------------------------------------
/Chapter 10/code/lemmatizer.py:
--------------------------------------------------------------------------------
 1 | from nltk.stem import WordNetLemmatizer
 2 | 
 3 | input_words = ['writing', 'calves', 'be', 'branded', 'horse', 'randomize', 
 4 |         'possibly', 'provision', 'hospital', 'kept', 'scratchy', 'code']
 5 | 
 6 | # Create lemmatizer object 
 7 | lemmatizer = WordNetLemmatizer()
 8 | 
 9 | # Create a list of lemmatizer names for display
10 | lemmatizer_names = ['NOUN LEMMATIZER', 'VERB LEMMATIZER']
11 | formatted_text = '{:>24}' * (len(lemmatizer_names) + 1)
12 | print('\n', formatted_text.format('INPUT WORD', *lemmatizer_names), 
13 |         '\n', '='*75)
14 | 
15 | # Lemmatize each word and display the output
16 | for word in input_words:
17 |     output = [word, lemmatizer.lemmatize(word, pos='n'),
18 |            lemmatizer.lemmatize(word, pos='v')]
19 |     print(formatted_text.format(*output))


--------------------------------------------------------------------------------
/Chapter 10/code/sentiment_analyzer.py:
--------------------------------------------------------------------------------
 1 | from nltk.corpus import movie_reviews 
 2 | from nltk.classify import NaiveBayesClassifier
 3 | from nltk.classify.util import accuracy as nltk_accuracy
 4 |  
 5 | # Extract features from the input list of words
 6 | def extract_features(words):
 7 |     return dict([(word, True) for word in words])
 8 |  
 9 | if __name__=='__main__':
10 |     # Load the reviews from the corpus 
11 |     fileids_pos = movie_reviews.fileids('pos')
12 |     fileids_neg = movie_reviews.fileids('neg')
13 |      
14 |     # Extract the features from the reviews
15 |     features_pos = [(extract_features(movie_reviews.words(
16 |             fileids=[f])), 'Positive') for f in fileids_pos]
17 |     features_neg = [(extract_features(movie_reviews.words(
18 |             fileids=[f])), 'Negative') for f in fileids_neg]
19 |      
20 |     # Define the train and test split (80% and 20%)
21 |     threshold = 0.8
22 |     num_pos = int(threshold * len(features_pos))
23 |     num_neg = int(threshold * len(features_neg))
24 |      
25 |      # Create training and training datasets
26 |     features_train = features_pos[:num_pos] + features_neg[:num_neg]
27 |     features_test = features_pos[num_pos:] + features_neg[num_neg:]  
28 | 
29 |     # Print the number of datapoints used
30 |     print('\nNumber of training datapoints:', len(features_train))
31 |     print('Number of test datapoints:', len(features_test))
32 |      
33 |     # Train a Naive Bayes classifier 
34 |     classifier = NaiveBayesClassifier.train(features_train)
35 |     print('\nAccuracy of the classifier:', nltk_accuracy(
36 |             classifier, features_test))
37 | 
38 |     N = 15
39 |     print('\nTop ' + str(N) + ' most informative words:')
40 |     for i, item in enumerate(classifier.most_informative_features()):
41 |         print(str(i+1) + '. ' + item[0])
42 |         if i == N - 1:
43 |             break
44 | 
45 |     # Test input movie reviews
46 |     input_reviews = [
47 |         'The costumes in this movie were great', 
48 |         'I think the story was terrible and the characters were very weak',
49 |         'People say that the director of the movie is amazing', 
50 |         'This is such an idiotic movie. I will not recommend it to anyone.' 
51 |     ]
52 | 
53 |     print("\nMovie review predictions:")
54 |     for review in input_reviews:
55 |         print("\nReview:", review)
56 | 
57 |         # Compute the probabilities
58 |         probabilities = classifier.prob_classify(extract_features(review.split()))
59 | 
60 |         # Pick the maximum value
61 |         predicted_sentiment = probabilities.max()
62 | 
63 |         # Print outputs
64 |         print("Predicted sentiment:", predicted_sentiment)
65 |         print("Probability:", round(probabilities.prob(predicted_sentiment), 2))
66 | 
67 | 


--------------------------------------------------------------------------------
/Chapter 10/code/stemmer.py:
--------------------------------------------------------------------------------
 1 | from nltk.stem.porter import PorterStemmer
 2 | from nltk.stem.lancaster import LancasterStemmer
 3 | from nltk.stem.snowball import SnowballStemmer
 4 | 
 5 | input_words = ['writing', 'calves', 'be', 'branded', 'horse', 'randomize', 
 6 |         'possibly', 'provision', 'hospital', 'kept', 'scratchy', 'code']
 7 | 
 8 | # Create various stemmer objects
 9 | porter = PorterStemmer()
10 | lancaster = LancasterStemmer()
11 | snowball = SnowballStemmer('english')
12 | 
13 | # Create a list of stemmer names for display
14 | stemmer_names = ['PORTER', 'LANCASTER', 'SNOWBALL']
15 | formatted_text = '{:>16}' * (len(stemmer_names) + 1)
16 | print('\n', formatted_text.format('INPUT WORD', *stemmer_names), 
17 |         '\n', '='*68)
18 | 
19 | # Stem each word and display the output
20 | for word in input_words:
21 |     output = [word, porter.stem(word), 
22 |             lancaster.stem(word), snowball.stem(word)]
23 |     print(formatted_text.format(*output))


--------------------------------------------------------------------------------
/Chapter 10/code/text_chunker.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | from nltk.corpus import brown
 3 | 
 4 | # Split the input text into chunks, where
 5 | # each chunk contains N words
 6 | def chunker(input_data, N):
 7 |     input_words = input_data.split(' ')
 8 |     output = []
 9 | 
10 |     cur_chunk = []
11 |     count = 0
12 |     for word in input_words:
13 |         cur_chunk.append(word)
14 |         count += 1
15 |         if count == N:
16 |             output.append(' '.join(cur_chunk))
17 |             count, cur_chunk = 0, []
18 | 
19 |     output.append(' '.join(cur_chunk))
20 | 
21 |     return output 
22 | 
23 | if __name__=='__main__':
24 |     # Read the first 12000 words from the Brown corpus
25 |     input_data = ' '.join(brown.words()[:12000])
26 | 
27 |     # Define the number of words in each chunk 
28 |     chunk_size = 700
29 | 
30 |     chunks = chunker(input_data, chunk_size)
31 |     print('\nNumber of text chunks =', len(chunks), '\n')
32 |     for i, chunk in enumerate(chunks):
33 |         print('Chunk', i+1, '==>', chunk[:50])
34 | 


--------------------------------------------------------------------------------
/Chapter 10/code/tokenizer.py:
--------------------------------------------------------------------------------
 1 | from nltk.tokenize import sent_tokenize, \
 2 |         word_tokenize, WordPunctTokenizer
 3 | 
 4 | # Define input text
 5 | input_text = "Do you know how tokenization works? It's actually quite interesting! Let's analyze a couple of sentences and figure it out." 
 6 | 
 7 | # Sentence tokenizer 
 8 | print("\nSentence tokenizer:")
 9 | print(sent_tokenize(input_text))
10 | 
11 | # Word tokenizer
12 | print("\nWord tokenizer:")
13 | print(word_tokenize(input_text))
14 | 
15 | # WordPunct tokenizer
16 | print("\nWord punct tokenizer:")
17 | print(WordPunctTokenizer().tokenize(input_text))
18 | 


--------------------------------------------------------------------------------
/Chapter 10/code/topic_modeler.py:
--------------------------------------------------------------------------------
 1 | from nltk.tokenize import RegexpTokenizer  
 2 | from nltk.corpus import stopwords
 3 | from nltk.stem.snowball import SnowballStemmer
 4 | from gensim import models, corpora
 5 | 
 6 | # Load input data
 7 | def load_data(input_file):
 8 |     data = []
 9 |     with open(input_file, 'r') as f:
10 |         for line in f.readlines():
11 |             data.append(line[:-1])
12 | 
13 |     return data
14 | 
15 | # Processor function for tokenizing, removing stop 
16 | # words, and stemming
17 | def process(input_text):
18 |     # Create a regular expression tokenizer
19 |     tokenizer = RegexpTokenizer(r'\w+')
20 | 
21 |     # Create a Snowball stemmer 
22 |     stemmer = SnowballStemmer('english')
23 | 
24 |     # Get the list of stop words 
25 |     stop_words = stopwords.words('english')
26 |     
27 |     # Tokenize the input string
28 |     tokens = tokenizer.tokenize(input_text.lower())
29 | 
30 |     # Remove the stop words 
31 |     tokens = [x for x in tokens if not x in stop_words]
32 |     
33 |     # Perform stemming on the tokenized words 
34 |     tokens_stemmed = [stemmer.stem(x) for x in tokens]
35 | 
36 |     return tokens_stemmed
37 |     
38 | if __name__=='__main__':
39 |     # Load input data
40 |     data = load_data('data.txt')
41 | 
42 |     # Create a list for sentence tokens
43 |     tokens = [process(x) for x in data]
44 | 
45 |     # Create a dictionary based on the sentence tokens 
46 |     dict_tokens = corpora.Dictionary(tokens)
47 |         
48 |     # Create a document-term matrix
49 |     doc_term_mat = [dict_tokens.doc2bow(token) for token in tokens]
50 | 
51 |     # Define the number of topics for the LDA model
52 |     num_topics = 2
53 | 
54 |     # Generate the LDA model 
55 |     ldamodel = models.ldamodel.LdaModel(doc_term_mat, 
56 |             num_topics=num_topics, id2word=dict_tokens, passes=25)
57 | 
58 |     num_words = 5
59 |     print('\nTop ' + str(num_words) + ' contributing words to each topic:')
60 |     for item in ldamodel.print_topics(num_topics=num_topics, num_words=num_words):
61 |         print('\nTopic', item[0])
62 | 
63 |         # Print the contributing words along with their relative contributions 
64 |         list_of_strings = item[1].split(' + ')
65 |         for text in list_of_strings:
66 |             weight = text.split('*')[0]
67 |             word = text.split('*')[1]
68 |             print(word, '==>', str(round(float(weight) * 100, 2)) + '%')
69 | 
70 | 


--------------------------------------------------------------------------------
/Chapter 11/code/crf.py:
--------------------------------------------------------------------------------
 1 | import os
 2 | import argparse 
 3 | import string
 4 | import pickle 
 5 | 
 6 | import numpy as np
 7 | import matplotlib.pyplot as plt
 8 | from pystruct.datasets import load_letters
 9 | from pystruct.models import ChainCRF 
10 | from pystruct.learners import FrankWolfeSSVM
11 | 
12 | def build_arg_parser():
13 |     parser = argparse.ArgumentParser(description='Trains a Conditional\
14 |             Random Field classifier')
15 |     parser.add_argument("--C", dest="c_val", required=False, type=float,
16 |             default=1.0, help='C value to be used for training')
17 |     return parser
18 | 
19 | # Class to model the CRF
20 | class CRFModel(object):
21 |     def __init__(self, c_val=1.0):
22 |         self.clf = FrankWolfeSSVM(model=ChainCRF(), 
23 |                 C=c_val, max_iter=50) 
24 | 
25 |     # Load the training data
26 |     def load_data(self):
27 |         alphabets = load_letters()
28 |         X = np.array(alphabets['data'])
29 |         y = np.array(alphabets['labels'])
30 |         folds = alphabets['folds']
31 | 
32 |         return X, y, folds
33 | 
34 |     # Train the CRF
35 |     def train(self, X_train, y_train):
36 |         self.clf.fit(X_train, y_train)
37 | 
38 |     # Evaluate the accuracy of the CRF
39 |     def evaluate(self, X_test, y_test):
40 |         return self.clf.score(X_test, y_test)
41 | 
42 |     # Run the CRF on unknown data
43 |     def classify(self, input_data):
44 |         return self.clf.predict(input_data)[0]
45 | 
46 | # Convert indices to alphabets
47 | def convert_to_letters(indices):
48 |     # Create a numpy array of all alphabets
49 |     alphabets = np.array(list(string.ascii_lowercase))
50 | 
51 |     # Extract the letters based on input indices
52 |     output = np.take(alphabets, indices)
53 |     output = ''.join(output)
54 | 
55 |     return output
56 | 
57 | if __name__=='__main__':
58 |     args = build_arg_parser().parse_args()
59 |     c_val = args.c_val
60 | 
61 |     # Create the CRF model
62 |     crf = CRFModel(c_val)
63 | 
64 |     # Load the train and test data
65 |     X, y, folds = crf.load_data()
66 |     X_train, X_test = X[folds == 1], X[folds != 1]
67 |     y_train, y_test = y[folds == 1], y[folds != 1]
68 | 
69 |     # Train the CRF model
70 |     print('\nTraining the CRF model...')
71 |     crf.train(X_train, y_train)
72 | 
73 |     # Evaluate the accuracy
74 |     score = crf.evaluate(X_test, y_test)
75 |     print('\nAccuracy score =', str(round(score*100, 2)) + '%')
76 | 
77 |     indices = range(3000, len(y_test), 200)
78 |     for index in indices:
79 |         print("\nOriginal  =", convert_to_letters(y_test[index]))
80 |         predicted = crf.classify([X_test[index]])
81 |         print("Predicted =", convert_to_letters(predicted))


--------------------------------------------------------------------------------
/Chapter 11/code/hmm.py:
--------------------------------------------------------------------------------
 1 | import datetime
 2 | 
 3 | import numpy as np
 4 | import matplotlib.pyplot as plt
 5 | from hmmlearn.hmm import GaussianHMM
 6 | 
 7 | from timeseries import read_data 
 8 | 
 9 | # Load input data
10 | data = np.loadtxt('data_1D.txt', delimiter=',')
11 | 
12 | # Extract the data column (third column) for training 
13 | X = np.column_stack([data[:, 2]])
14 | 
15 | # Create a Gaussian HMM 
16 | num_components = 5
17 | hmm = GaussianHMM(n_components=num_components, 
18 |         covariance_type='diag', n_iter=1000)
19 | 
20 | # Train the HMM 
21 | print('\nTraining the Hidden Markov Model...')
22 | hmm.fit(X)
23 | 
24 | # Print HMM stats
25 | print('\nMeans and variances:')
26 | for i in range(hmm.n_components):
27 |     print('\nHidden state', i+1)
28 |     print('Mean =', round(hmm.means_[i][0], 2))
29 |     print('Variance =', round(np.diag(hmm.covars_[i])[0], 2))
30 | 
31 | # Generate data using the HMM model
32 | num_samples = 1200
33 | generated_data, _ = hmm.sample(num_samples) 
34 | plt.plot(np.arange(num_samples), generated_data[:, 0], c='black')
35 | plt.title('Generated data')
36 | 
37 | plt.show()


--------------------------------------------------------------------------------
/Chapter 11/code/operator.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import pandas as pd
 3 | import matplotlib.pyplot as plt
 4 | 
 5 | from timeseries import read_data 
 6 | 
 7 | # Input filename
 8 | input_file = 'data_2D.txt'
 9 | 
10 | # Load data
11 | x1 = read_data(input_file, 2)
12 | x2 = read_data(input_file, 3)
13 | 
14 | # Create pandas dataframe for slicing
15 | data = pd.DataFrame({'dim1': x1, 'dim2': x2})
16 | 
17 | # Plot data
18 | start = '1968'
19 | end = '1975'
20 | data[start:end].plot()
21 | plt.title('Data overlapped on top of each other')
22 | 
23 | # Filtering using conditions
24 | # - 'dim1' is smaller than a certain threshold
25 | # - 'dim2' is greater than a certain threshold
26 | data[(data['dim1'] < 45) & (data['dim2'] > 30)].plot()
27 | plt.title('dim1 < 45 and dim2 > 30')
28 | 
29 | # Adding two dataframes 
30 | plt.figure()
31 | diff = data[start:end]['dim1'] + data[start:end]['dim2']
32 | diff.plot()
33 | plt.title('Summation (dim1 + dim2)')
34 | 
35 | plt.show()


--------------------------------------------------------------------------------
/Chapter 11/code/slicer.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | import pandas as pd
 4 | 
 5 | from timeseries import read_data
 6 | 
 7 | # Load input data
 8 | index = 2
 9 | data = read_data('data_2D.txt', index)
10 | 
11 | # Plot data with year-level granularity 
12 | start = '2003'
13 | end = '2011'
14 | plt.figure()
15 | data[start:end].plot()
16 | plt.title('Input data from ' + start + ' to ' + end)
17 | 
18 | # Plot data with month-level granularity 
19 | start = '1998-2'
20 | end = '2006-7'
21 | plt.figure()
22 | data[start:end].plot()
23 | plt.title('Input data from ' + start + ' to ' + end)
24 | 
25 | plt.show()


--------------------------------------------------------------------------------
/Chapter 11/code/stats_extractor.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | import pandas as pd
 4 | 
 5 | from timeseries import read_data 
 6 | 
 7 | # Input filename
 8 | input_file = 'data_2D.txt'
 9 | 
10 | # Load input data in time series format
11 | x1 = read_data(input_file, 2)
12 | x2 = read_data(input_file, 3)
13 | 
14 | # Create pandas dataframe for slicing
15 | data = pd.DataFrame({'dim1': x1, 'dim2': x2})
16 | 
17 | # Extract max and min values
18 | print('\nMaximum values for each dimension:')
19 | print(data.max())
20 | print('\nMinimum values for each dimension:')
21 | print(data.min())
22 | 
23 | # Extract overall mean and row-wise mean values
24 | print('\nOverall mean:')
25 | print(data.mean())
26 | print('\nRow-wise mean:')
27 | print(data.mean(1)[:12])
28 | 
29 | # Plot the rolling mean using a window size of 24
30 | data.rolling(center=False, window=24).mean().plot()
31 | plt.title('Rolling mean')
32 | 
33 | # Extract correlation coefficients
34 | print('\nCorrelation coefficients:\n', data.corr())
35 | 
36 | # Plot rolling correlation using a window size of 60
37 | plt.figure()
38 | plt.title('Rolling correlation')
39 | data['dim1'].rolling(window=60).corr(other=data['dim2']).plot()
40 | 
41 | plt.show()


--------------------------------------------------------------------------------
/Chapter 11/code/stock_market.py:
--------------------------------------------------------------------------------
 1 | import datetime
 2 | import warnings
 3 | 
 4 | import numpy as np
 5 | import matplotlib.pyplot as plt
 6 | from matplotlib.finance import quotes_historical_yahoo_ochl\
 7 |         as quotes_yahoo
 8 | from hmmlearn.hmm import GaussianHMM
 9 | 
10 | # Load historical stock quotes from matplotlib package 
11 | start = datetime.date(1970, 9, 4) 
12 | end = datetime.date(2016, 5, 17)
13 | stock_quotes = quotes_yahoo('INTC', start, end) 
14 | 
15 | # Extract the closing quotes everyday
16 | closing_quotes = np.array([quote[2] for quote in stock_quotes])
17 | 
18 | # Extract the volume of shares traded everyday 
19 | volumes = np.array([quote[5] for quote in stock_quotes])[1:]
20 | 
21 | # Take the percentage difference of closing stock prices
22 | diff_percentages = 100.0 * np.diff(closing_quotes) / closing_quotes[:-1]
23 | 
24 | # Take the list of dates starting from the second value
25 | dates = np.array([quote[0] for quote in stock_quotes], dtype=np.int)[1:]
26 | 
27 | # Stack the differences and volume values column-wise for training
28 | training_data = np.column_stack([diff_percentages, volumes])
29 | 
30 | # Create and train Gaussian HMM 
31 | hmm = GaussianHMM(n_components=7, covariance_type='diag', n_iter=1000)
32 | with warnings.catch_warnings():
33 |     warnings.simplefilter('ignore')
34 |     hmm.fit(training_data)
35 | 
36 | # Generate data using the HMM model
37 | num_samples = 300 
38 | samples, _ = hmm.sample(num_samples) 
39 | 
40 | # Plot the difference percentages 
41 | plt.figure()
42 | plt.title('Difference percentages')
43 | plt.plot(np.arange(num_samples), samples[:, 0], c='black')
44 | 
45 | # Plot the volume of shares traded
46 | plt.figure()
47 | plt.title('Volume of shares')
48 | plt.plot(np.arange(num_samples), samples[:, 1], c='black')
49 | plt.ylim(ymin=0)
50 | 
51 | plt.show()
52 | 
53 | 


--------------------------------------------------------------------------------
/Chapter 11/code/timeseries.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | import pandas as pd
 4 | 
 5 | def read_data(input_file, index):
 6 |     # Read the data from the input file
 7 |     input_data = np.loadtxt(input_file, delimiter=',')
 8 | 
 9 |     # Lambda function to convert strings to Pandas date format
10 |     to_date = lambda x, y: str(int(x)) + '-' + str(int(y))
11 | 
12 |     # Extract the start date
13 |     start = to_date(input_data[0, 0], input_data[0, 1])
14 | 
15 |     # Extract the end date
16 |     if input_data[-1, 1] == 12:
17 |         year = input_data[-1, 0] + 1
18 |         month = 1
19 |     else:
20 |         year = input_data[-1, 0]
21 |         month = input_data[-1, 1] + 1
22 | 
23 |     end = to_date(year, month)
24 | 
25 |     # Create a date list with a monthly frequency
26 |     date_indices = pd.date_range(start, end, freq='M')
27 | 
28 |     # Add timestamps to the input data to create time-series data
29 |     output = pd.Series(input_data[:, index], index=date_indices)
30 | 
31 |     return output 
32 | 
33 | if __name__=='__main__':
34 |     # Input filename
35 |     input_file = 'data_2D.txt'
36 | 
37 |     # Specify the columns that need to be converted 
38 |     # into time-series data
39 |     indices = [2, 3]
40 | 
41 |     # Iterate through the columns and plot the data
42 |     for index in indices:
43 |         # Convert the column to timeseries format
44 |         timeseries = read_data(input_file, index)
45 | 
46 |         # Plot the data
47 |         plt.figure()
48 |         timeseries.plot()
49 |         plt.title('Dimension ' + str(index - 1))
50 | 
51 |     plt.show()
52 | 


--------------------------------------------------------------------------------
/Chapter 12/code/audio_generator.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | from scipy.io.wavfile import write
 4 | 
 5 | # Output file where the audio will be saved 
 6 | output_file = 'generated_audio.wav'
 7 | 
 8 | # Specify audio parameters
 9 | duration = 4  # in seconds
10 | sampling_freq = 44100  # in Hz
11 | tone_freq = 784 
12 | min_val = -4 * np.pi
13 | max_val = 4 * np.pi
14 | 
15 | # Generate the audio signal
16 | t = np.linspace(min_val, max_val, duration * sampling_freq)
17 | signal = np.sin(2 * np.pi * tone_freq * t)
18 | 
19 | # Add some noise to the signal
20 | noise = 0.5 * np.random.rand(duration * sampling_freq)
21 | signal += noise
22 | 
23 | # Scale it to 16-bit integer values
24 | scaling_factor = np.power(2, 15) - 1
25 | signal_normalized = signal / np.max(np.abs(signal))
26 | signal_scaled = np.int16(signal_normalized * scaling_factor)
27 | 
28 | # Save the audio signal in the output file 
29 | write(output_file, sampling_freq, signal_scaled)
30 | 
31 | # Extract the first 200 values from the audio signal 
32 | signal = signal[:200]
33 | 
34 | # Construct the time axis in milliseconds
35 | time_axis = 1000 * np.arange(0, len(signal), 1) / float(sampling_freq) 
36 | 
37 | # Plot the audio signal
38 | plt.plot(time_axis, signal, color='black')
39 | plt.xlabel('Time (milliseconds)')
40 | plt.ylabel('Amplitude')
41 | plt.title('Generated audio signal')
42 | plt.show()
43 | 


--------------------------------------------------------------------------------
/Chapter 12/code/audio_plotter.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | from scipy.io import wavfile
 4 | 
 5 | # Read the audio file
 6 | sampling_freq, signal = wavfile.read('random_sound.wav')
 7 | 
 8 | # Display the params
 9 | print('\nSignal shape:', signal.shape)
10 | print('Datatype:', signal.dtype)
11 | print('Signal duration:', round(signal.shape[0] / float(sampling_freq), 2), 'seconds')
12 | 
13 | # Normalize the signal 
14 | signal = signal / np.power(2, 15)
15 | 
16 | # Extract the first 50 values
17 | signal = signal[:50]
18 | 
19 | # Construct the time axis in milliseconds
20 | time_axis = 1000 * np.arange(0, len(signal), 1) / float(sampling_freq)
21 | 
22 | # Plot the audio signal
23 | plt.plot(time_axis, signal, color='black')
24 | plt.xlabel('Time (milliseconds)')
25 | plt.ylabel('Amplitude')
26 | plt.title('Input audio signal')
27 | plt.show()
28 | 


--------------------------------------------------------------------------------
/Chapter 12/code/data/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/.DS_Store


--------------------------------------------------------------------------------
/Chapter 12/code/data/apple/apple01.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/apple/apple01.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/apple/apple02.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/apple/apple02.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/apple/apple03.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/apple/apple03.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/apple/apple04.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/apple/apple04.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/apple/apple05.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/apple/apple05.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/apple/apple06.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/apple/apple06.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/apple/apple07.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/apple/apple07.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/apple/apple08.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/apple/apple08.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/apple/apple09.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/apple/apple09.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/apple/apple10.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/apple/apple10.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/apple/apple11.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/apple/apple11.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/apple/apple12.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/apple/apple12.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/apple/apple13.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/apple/apple13.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/apple/apple14.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/apple/apple14.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/apple/apple15.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/apple/apple15.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/banana/banana01.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/banana/banana01.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/banana/banana02.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/banana/banana02.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/banana/banana03.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/banana/banana03.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/banana/banana04.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/banana/banana04.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/banana/banana05.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/banana/banana05.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/banana/banana06.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/banana/banana06.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/banana/banana07.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/banana/banana07.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/banana/banana08.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/banana/banana08.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/banana/banana09.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/banana/banana09.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/banana/banana10.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/banana/banana10.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/banana/banana11.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/banana/banana11.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/banana/banana12.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/banana/banana12.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/banana/banana13.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/banana/banana13.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/banana/banana14.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/banana/banana14.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/banana/banana15.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/banana/banana15.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/kiwi/kiwi01.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/kiwi/kiwi01.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/kiwi/kiwi02.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/kiwi/kiwi02.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/kiwi/kiwi03.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/kiwi/kiwi03.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/kiwi/kiwi04.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/kiwi/kiwi04.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/kiwi/kiwi05.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/kiwi/kiwi05.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/kiwi/kiwi06.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/kiwi/kiwi06.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/kiwi/kiwi07.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/kiwi/kiwi07.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/kiwi/kiwi08.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/kiwi/kiwi08.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/kiwi/kiwi09.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/kiwi/kiwi09.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/kiwi/kiwi10.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/kiwi/kiwi10.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/kiwi/kiwi11.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/kiwi/kiwi11.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/kiwi/kiwi12.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/kiwi/kiwi12.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/kiwi/kiwi13.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/kiwi/kiwi13.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/kiwi/kiwi14.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/kiwi/kiwi14.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/kiwi/kiwi15.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/kiwi/kiwi15.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/lime/lime01.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/lime/lime01.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/lime/lime02.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/lime/lime02.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/lime/lime03.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/lime/lime03.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/lime/lime04.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/lime/lime04.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/lime/lime05.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/lime/lime05.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/lime/lime06.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/lime/lime06.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/lime/lime07.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/lime/lime07.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/lime/lime08.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/lime/lime08.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/lime/lime09.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/lime/lime09.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/lime/lime10.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/lime/lime10.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/lime/lime11.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/lime/lime11.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/lime/lime12.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/lime/lime12.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/lime/lime13.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/lime/lime13.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/lime/lime14.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/lime/lime14.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/lime/lime15.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/lime/lime15.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/orange/orange01.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/orange/orange01.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/orange/orange02.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/orange/orange02.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/orange/orange03.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/orange/orange03.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/orange/orange04.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/orange/orange04.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/orange/orange05.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/orange/orange05.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/orange/orange06.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/orange/orange06.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/orange/orange07.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/orange/orange07.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/orange/orange08.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/orange/orange08.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/orange/orange09.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/orange/orange09.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/orange/orange10.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/orange/orange10.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/orange/orange11.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/orange/orange11.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/orange/orange12.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/orange/orange12.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/orange/orange13.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/orange/orange13.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/orange/orange14.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/orange/orange14.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/orange/orange15.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/orange/orange15.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/peach/peach01.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/peach/peach01.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/peach/peach02.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/peach/peach02.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/peach/peach03.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/peach/peach03.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/peach/peach04.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/peach/peach04.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/peach/peach05.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/peach/peach05.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/peach/peach06.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/peach/peach06.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/peach/peach07.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/peach/peach07.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/peach/peach08.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/peach/peach08.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/peach/peach09.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/peach/peach09.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/peach/peach10.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/peach/peach10.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/peach/peach11.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/peach/peach11.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/peach/peach12.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/peach/peach12.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/peach/peach13.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/peach/peach13.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/peach/peach14.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/peach/peach14.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/peach/peach15.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/peach/peach15.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/pineapple/pineapple01.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/pineapple/pineapple01.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/pineapple/pineapple02.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/pineapple/pineapple02.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/pineapple/pineapple03.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/pineapple/pineapple03.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/pineapple/pineapple04.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/pineapple/pineapple04.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/pineapple/pineapple05.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/pineapple/pineapple05.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/pineapple/pineapple06.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/pineapple/pineapple06.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/pineapple/pineapple07.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/pineapple/pineapple07.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/pineapple/pineapple08.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/pineapple/pineapple08.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/pineapple/pineapple09.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/pineapple/pineapple09.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/pineapple/pineapple10.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/pineapple/pineapple10.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/pineapple/pineapple11.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/pineapple/pineapple11.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/pineapple/pineapple12.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/pineapple/pineapple12.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/pineapple/pineapple13.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/pineapple/pineapple13.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/pineapple/pineapple14.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/pineapple/pineapple14.wav


--------------------------------------------------------------------------------
/Chapter 12/code/data/pineapple/pineapple15.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/data/pineapple/pineapple15.wav


--------------------------------------------------------------------------------
/Chapter 12/code/feature_extractor.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | from scipy.io import wavfile 
 4 | from features import mfcc, logfbank
 5 | 
 6 | # Read the input audio file
 7 | sampling_freq, signal = wavfile.read('random_sound.wav')
 8 | 
 9 | # Take the first 10,000 samples for analysis
10 | signal = signal[:10000]
11 | 
12 | # Extract the MFCC features 
13 | features_mfcc = mfcc(signal, sampling_freq)
14 | 
15 | # Print the parameters for MFCC
16 | print('\nMFCC:\nNumber of windows =', features_mfcc.shape[0])
17 | print('Length of each feature =', features_mfcc.shape[1])
18 | 
19 | # Plot the features
20 | features_mfcc = features_mfcc.T
21 | plt.matshow(features_mfcc)
22 | plt.title('MFCC')
23 | 
24 | # Extract the Filter Bank features
25 | features_fb = logfbank(signal, sampling_freq)
26 | 
27 | # Print the parameters for Filter Bank 
28 | print('\nFilter bank:\nNumber of windows =', features_fb.shape[0])
29 | print('Length of each feature =', features_fb.shape[1])
30 | 
31 | # Plot the features
32 | features_fb = features_fb.T
33 | plt.matshow(features_fb)
34 | plt.title('Filter bank')
35 | 
36 | plt.show()
37 | 


--------------------------------------------------------------------------------
/Chapter 12/code/features/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/features/.DS_Store


--------------------------------------------------------------------------------
/Chapter 12/code/features/__init__.py:
--------------------------------------------------------------------------------
1 | from .base import *
2 | 


--------------------------------------------------------------------------------
/Chapter 12/code/frequency_transformer.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | from scipy.io import wavfile
 4 | 
 5 | # Read the audio file
 6 | sampling_freq, signal = wavfile.read('spoken_word.wav')
 7 | 
 8 | # Normalize the values
 9 | signal = signal / np.power(2, 15) 
10 | 
11 | # Extract the length of the audio signal
12 | len_signal = len(signal)
13 | 
14 | # Extract the half length
15 | len_half = np.ceil((len_signal + 1) / 2.0).astype(np.int)
16 | 
17 | # Apply Fourier transform
18 | freq_signal = np.fft.fft(signal)
19 | 
20 | # Normalization
21 | freq_signal = abs(freq_signal[0:len_half]) / len_signal
22 | 
23 | # Take the square
24 | freq_signal **= 2
25 | 
26 | # Extract the length of the frequency transformed signal
27 | len_fts = len(freq_signal)
28 | 
29 | # Adjust the signal for even and odd cases
30 | if len_signal % 2:
31 |     freq_signal[1:len_fts] *= 2
32 | else:
33 |     freq_signal[1:len_fts-1] *= 2
34 | 
35 | # Extract the power value in dB
36 | signal_power = 10 * np.log10(freq_signal)
37 | 
38 | # Build the X axis
39 | x_axis = np.arange(0, len_half, 1) * (sampling_freq / len_signal) / 1000.0
40 | 
41 | # Plot the figure
42 | plt.figure()
43 | plt.plot(x_axis, signal_power, color='black')
44 | plt.xlabel('Frequency (kHz)')
45 | plt.ylabel('Signal power (dB)')
46 | plt.show()
47 | 


--------------------------------------------------------------------------------
/Chapter 12/code/random_sound.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/random_sound.wav


--------------------------------------------------------------------------------
/Chapter 12/code/spoken_word.wav:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 12/code/spoken_word.wav


--------------------------------------------------------------------------------
/Chapter 12/code/synthesizer.py:
--------------------------------------------------------------------------------
 1 | import json
 2 | 
 3 | import numpy as np
 4 | import matplotlib.pyplot as plt
 5 | from scipy.io.wavfile import write
 6 | 
 7 | # Synthesize the tone based on the input parameters
 8 | def tone_synthesizer(freq, duration, amplitude=1.0, sampling_freq=44100):
 9 |     # Construct the time axis 
10 |     time_axis = np.linspace(0, duration, duration * sampling_freq)
11 | 
12 |     # Construct the audio signal
13 |     signal = amplitude * np.sin(2 * np.pi * freq * time_axis)
14 | 
15 |     return signal.astype(np.int16) 
16 | 
17 | if __name__=='__main__':
18 |     # Names of output files
19 |     file_tone_single = 'generated_tone_single.wav'
20 |     file_tone_sequence = 'generated_tone_sequence.wav'
21 | 
22 |     # Source: http://www.phy.mtu.edu/~suits/notefreqs.html
23 |     mapping_file = 'tone_mapping.json'
24 |     
25 |     # Load the tone to frequency map from the mapping file
26 |     with open(mapping_file, 'r') as f:
27 |         tone_map = json.loads(f.read())
28 |         
29 |     # Set input parameters to generate 'F' tone
30 |     tone_name = 'F'
31 |     duration = 3     # seconds
32 |     amplitude = 12000
33 |     sampling_freq = 44100    # Hz
34 | 
35 |     # Extract the tone frequency
36 |     tone_freq = tone_map[tone_name]
37 | 
38 |     # Generate the tone using the above parameters
39 |     synthesized_tone = tone_synthesizer(tone_freq, duration, amplitude, sampling_freq)
40 | 
41 |     # Write the audio signal to the output file
42 |     write(file_tone_single, sampling_freq, synthesized_tone)
43 | 
44 |     # Define the tone sequence along with corresponding durations in seconds
45 |     tone_sequence = [('G', 0.4), ('D', 0.5), ('F', 0.3), ('C', 0.6), ('A', 0.4)]
46 | 
47 |     # Construct the audio signal based on the above sequence 
48 |     signal = np.array([])
49 |     for item in tone_sequence:
50 |         # Get the name of the tone 
51 |         tone_name = item[0]
52 | 
53 |         # Extract the corresponding frequency of the tone
54 |         freq = tone_map[tone_name]
55 | 
56 |         # Extract the duration
57 |         duration = item[1]
58 | 
59 |         # Synthesize the tone
60 |         synthesized_tone = tone_synthesizer(freq, duration, amplitude, sampling_freq)
61 | 
62 |         # Append the output signal
63 |         signal = np.append(signal, synthesized_tone, axis=0)
64 | 
65 |     # Save the audio in the output file
66 |     write(file_tone_sequence, sampling_freq, signal)


--------------------------------------------------------------------------------
/Chapter 12/code/tone_mapping.json:
--------------------------------------------------------------------------------
 1 | {
 2 |     "A": 440,
 3 |     "Asharp": 466,
 4 |     "B": 494,
 5 |     "C": 523,
 6 |     "Csharp": 554,
 7 |     "D": 587,
 8 |     "Dsharp": 622,
 9 |     "E": 659,
10 |     "F": 698,
11 |     "Fsharp": 740,
12 |     "G": 784,
13 |     "Gsharp": 831
14 | }
15 | 


--------------------------------------------------------------------------------
/Chapter 13/code/background_subtraction.py:
--------------------------------------------------------------------------------
 1 | import cv2
 2 | import numpy as np
 3 | 
 4 | # Define a function to get the current frame from the webcam
 5 | def get_frame(cap, scaling_factor):
 6 |     # Read the current frame from the video capture object
 7 |     _, frame = cap.read()
 8 | 
 9 |     # Resize the image
10 |     frame = cv2.resize(frame, None, fx=scaling_factor, 
11 |             fy=scaling_factor, interpolation=cv2.INTER_AREA)
12 | 
13 |     return frame
14 | 
15 | if __name__=='__main__':
16 |     # Define the video capture object
17 |     cap = cv2.VideoCapture(0)
18 | 
19 |     # Define the background subtractor object
20 |     bg_subtractor = cv2.createBackgroundSubtractorMOG2()
21 |      
22 |     # Define the number of previous frames to use to learn. 
23 |     # This factor controls the learning rate of the algorithm. 
24 |     # The learning rate refers to the rate at which your model 
25 |     # will learn about the background. Higher value for 
26 |     # ‘history’ indicates a slower learning rate. You can 
27 |     # play with this parameter to see how it affects the output.
28 |     history = 100
29 | 
30 |     # Define the learning rate
31 |     learning_rate = 1.0/history
32 | 
33 |     # Keep reading the frames from the webcam 
34 |     # until the user hits the 'Esc' key
35 |     while True:
36 |         # Grab the current frame
37 |         frame = get_frame(cap, 0.5)
38 | 
39 |         # Compute the mask 
40 |         mask = bg_subtractor.apply(frame, learningRate=learning_rate)
41 | 
42 |         # Convert grayscale image to RGB color image
43 |         mask = cv2.cvtColor(mask, cv2.COLOR_GRAY2BGR)
44 | 
45 |         # Display the images
46 |         cv2.imshow('Input', frame)
47 |         cv2.imshow('Output', mask & frame)
48 | 
49 |         # Check if the user hit the 'Esc' key
50 |         c = cv2.waitKey(10)
51 |         if c == 27:
52 |             break
53 | 
54 |     # Release the video capture object
55 |     cap.release()
56 |     
57 |     # Close all the windows
58 |     cv2.destroyAllWindows()
59 | 


--------------------------------------------------------------------------------
/Chapter 13/code/colorspaces.py:
--------------------------------------------------------------------------------
 1 | import cv2
 2 | import numpy as np
 3 | 
 4 | # Define a function to get the current frame from the webcam
 5 | def get_frame(cap, scaling_factor):
 6 |     # Read the current frame from the video capture object
 7 |     _, frame = cap.read()
 8 | 
 9 |     # Resize the image
10 |     frame = cv2.resize(frame, None, fx=scaling_factor, 
11 |             fy=scaling_factor, interpolation=cv2.INTER_AREA)
12 | 
13 |     return frame
14 | 
15 | if __name__=='__main__':
16 |     # Define the video capture object
17 |     cap = cv2.VideoCapture(0)
18 | 
19 |     # Define the scaling factor for the images
20 |     scaling_factor = 0.5
21 | 
22 |     # Keep reading the frames from the webcam 
23 |     # until the user hits the 'Esc' key
24 |     while True:
25 |         # Grab the current frame
26 |         frame = get_frame(cap, scaling_factor) 
27 | 
28 |         # Convert the image to HSV colorspace
29 |         hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
30 | 
31 |         # Define range of skin color in HSV
32 |         lower = np.array([0, 70, 60])
33 |         upper = np.array([50, 150, 255])
34 | 
35 |         # Threshold the HSV image to get only skin color
36 |         mask = cv2.inRange(hsv, lower, upper)
37 | 
38 |         # Bitwise-AND between the mask and original image
39 |         img_bitwise_and = cv2.bitwise_and(frame, frame, mask=mask)
40 | 
41 |         # Run median blurring
42 |         img_median_blurred = cv2.medianBlur(img_bitwise_and, 5)
43 | 
44 |         # Display the input and output
45 |         cv2.imshow('Input', frame)
46 |         cv2.imshow('Output', img_median_blurred)
47 | 
48 |         # Check if the user hit the 'Esc' key
49 |         c = cv2.waitKey(5) 
50 |         if c == 27:
51 |             break
52 | 
53 |     # Close all the windows
54 |     cv2.destroyAllWindows()
55 | 


--------------------------------------------------------------------------------
/Chapter 13/code/eye_detector.py:
--------------------------------------------------------------------------------
 1 | import cv2
 2 | import numpy as np
 3 | 
 4 | # Load the Haar cascade files for face and eye
 5 | face_cascade = cv2.CascadeClassifier('haar_cascade_files/haarcascade_frontalface_default.xml')
 6 | eye_cascade = cv2.CascadeClassifier('haar_cascade_files/haarcascade_eye.xml')
 7 | 
 8 | # Check if the face cascade file has been loaded correctly
 9 | if face_cascade.empty():
10 | 	raise IOError('Unable to load the face cascade classifier xml file')
11 | 
12 | # Check if the eye cascade file has been loaded correctly
13 | if eye_cascade.empty():
14 | 	raise IOError('Unable to load the eye cascade classifier xml file')
15 | 
16 | # Initialize the video capture object
17 | cap = cv2.VideoCapture(0)
18 | 
19 | # Define the scaling factor
20 | ds_factor = 0.5
21 | 
22 | # Iterate until the user hits the 'Esc' key
23 | while True:
24 |     # Capture the current frame
25 |     _, frame = cap.read()
26 | 
27 |     # Resize the frame
28 |     frame = cv2.resize(frame, None, fx=ds_factor, fy=ds_factor, interpolation=cv2.INTER_AREA)
29 | 
30 |     # Convert to grayscale
31 |     gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
32 | 
33 |     # Run the face detector on the grayscale image
34 |     faces = face_cascade.detectMultiScale(gray, 1.3, 5)
35 | 
36 |     # For each face that's detected, run the eye detector
37 |     for (x,y,w,h) in faces:
38 |         # Extract the grayscale face ROI
39 |         roi_gray = gray[y:y+h, x:x+w]
40 | 
41 |         # Extract the color face ROI
42 |         roi_color = frame[y:y+h, x:x+w]
43 | 
44 |         # Run the eye detector on the grayscale ROI
45 |         eyes = eye_cascade.detectMultiScale(roi_gray)
46 | 
47 |         # Draw circles around the eyes
48 |         for (x_eye,y_eye,w_eye,h_eye) in eyes:
49 |             center = (int(x_eye + 0.5*w_eye), int(y_eye + 0.5*h_eye))
50 |             radius = int(0.3 * (w_eye + h_eye))
51 |             color = (0, 255, 0)
52 |             thickness = 3
53 |             cv2.circle(roi_color, center, radius, color, thickness)
54 | 
55 |     # Display the output
56 |     cv2.imshow('Eye Detector', frame)
57 | 
58 |     # Check if the user hit the 'Esc' key
59 |     c = cv2.waitKey(1)
60 |     if c == 27:
61 |         break
62 | 
63 | # Release the video capture object
64 | cap.release()
65 | 
66 | # Close all the windows
67 | cv2.destroyAllWindows()
68 | 


--------------------------------------------------------------------------------
/Chapter 13/code/face_detector.py:
--------------------------------------------------------------------------------
 1 | import cv2
 2 | import numpy as np
 3 | 
 4 | # Load the Haar cascade file
 5 | face_cascade = cv2.CascadeClassifier(
 6 |         'haar_cascade_files/haarcascade_frontalface_default.xml')
 7 | 
 8 | # Check if the cascade file has been loaded correctly
 9 | if face_cascade.empty():
10 | 	raise IOError('Unable to load the face cascade classifier xml file')
11 | 
12 | # Initialize the video capture object
13 | cap = cv2.VideoCapture(0)
14 | 
15 | # Define the scaling factor
16 | scaling_factor = 0.5
17 | 
18 | # Iterate until the user hits the 'Esc' key
19 | while True:
20 |     # Capture the current frame
21 |     _, frame = cap.read()
22 | 
23 |     # Resize the frame
24 |     frame = cv2.resize(frame, None, 
25 |             fx=scaling_factor, fy=scaling_factor, 
26 |             interpolation=cv2.INTER_AREA)
27 | 
28 |     # Convert to grayscale
29 |     gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
30 | 
31 |     # Run the face detector on the grayscale image
32 |     face_rects = face_cascade.detectMultiScale(gray, 1.3, 5)
33 | 
34 |     # Draw a rectangle around the face
35 |     for (x,y,w,h) in face_rects:
36 |         cv2.rectangle(frame, (x,y), (x+w,y+h), (0,255,0), 3)
37 | 
38 |     # Display the output
39 |     cv2.imshow('Face Detector', frame)
40 | 
41 |     # Check if the user hit the 'Esc' key
42 |     c = cv2.waitKey(1)
43 |     if c == 27:
44 |         break
45 | 
46 | # Release the video capture object
47 | cap.release()
48 | 
49 | # Close all the windows
50 | cv2.destroyAllWindows()
51 | 


--------------------------------------------------------------------------------
/Chapter 13/code/frame_diff.py:
--------------------------------------------------------------------------------
 1 | import cv2
 2 | 
 3 | # Compute the frame differences
 4 | def frame_diff(prev_frame, cur_frame, next_frame):
 5 |     # Difference between the current frame and the next frame
 6 |     diff_frames_1 = cv2.absdiff(next_frame, cur_frame)
 7 | 
 8 |     # Difference between the current frame and the previous frame
 9 |     diff_frames_2 = cv2.absdiff(cur_frame, prev_frame)
10 | 
11 |     return cv2.bitwise_and(diff_frames_1, diff_frames_2)
12 | 
13 | # Define a function to get the current frame from the webcam
14 | def get_frame(cap, scaling_factor):
15 |     # Read the current frame from the video capture object
16 |     _, frame = cap.read()
17 | 
18 |     # Resize the image
19 |     frame = cv2.resize(frame, None, fx=scaling_factor, 
20 |             fy=scaling_factor, interpolation=cv2.INTER_AREA)
21 | 
22 |     # Convert to grayscale
23 |     gray = cv2.cvtColor(frame, cv2.COLOR_RGB2GRAY)
24 | 
25 |     return gray 
26 | 
27 | if __name__=='__main__':
28 |     # Define the video capture object
29 |     cap = cv2.VideoCapture(0)
30 | 
31 |     # Define the scaling factor for the images
32 |     scaling_factor = 0.5
33 |     
34 |     # Grab the current frame
35 |     prev_frame = get_frame(cap, scaling_factor) 
36 | 
37 |     # Grab the next frame
38 |     cur_frame = get_frame(cap, scaling_factor) 
39 | 
40 |     # Grab the frame after that
41 |     next_frame = get_frame(cap, scaling_factor) 
42 | 
43 |     # Keep reading the frames from the webcam 
44 |     # until the user hits the 'Esc' key
45 |     while True:
46 |         # Display the frame difference
47 |         cv2.imshow('Object Movement', frame_diff(prev_frame, 
48 |                 cur_frame, next_frame))
49 | 
50 |         # Update the variables
51 |         prev_frame = cur_frame
52 |         cur_frame = next_frame 
53 | 
54 |         # Grab the next frame
55 |         next_frame = get_frame(cap, scaling_factor)
56 | 
57 |         # Check if the user hit the 'Esc' key
58 |         key = cv2.waitKey(10)
59 |         if key == 27:
60 |             break
61 | 
62 |     # Close all the windows
63 |     cv2.destroyAllWindows()
64 | 


--------------------------------------------------------------------------------
/Chapter 13/code/haar_cascade_files/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PacktPublishing/Artificial-Intelligence-with-Python/d930bc2d055433781559683f69e05207f0eaab13/Chapter 13/code/haar_cascade_files/.DS_Store


--------------------------------------------------------------------------------
/Chapter 14/code/character_visualizer.py:
--------------------------------------------------------------------------------
 1 | import os
 2 | import sys
 3 | 
 4 | import cv2
 5 | import numpy as np
 6 | 
 7 | # Define the input file 
 8 | input_file = 'letter.data' 
 9 | 
10 | # Define the visualization parameters 
11 | img_resize_factor = 12
12 | start = 6
13 | end = -1
14 | height, width = 16, 8
15 | 
16 | # Iterate until the user presses the Esc key
17 | with open(input_file, 'r') as f:
18 |     for line in f.readlines():
19 |         # Read the data
20 |         data = np.array([255 * float(x) for x in line.split('\t')[start:end]])
21 | 
22 |         # Reshape the data into a 2D image
23 |         img = np.reshape(data, (height, width))
24 | 
25 |         # Scale the image
26 |         img_scaled = cv2.resize(img, None, fx=img_resize_factor, fy=img_resize_factor)
27 | 
28 |         # Display the image
29 |         cv2.imshow('Image', img_scaled)
30 | 
31 |         # Check if the user pressed the Esc key
32 |         c = cv2.waitKey()
33 |         if c == 27:
34 |             break
35 | 


--------------------------------------------------------------------------------
/Chapter 14/code/data_perceptron.txt:
--------------------------------------------------------------------------------
1 | 0.38 0.19 0
2 | 0.17 0.31 0
3 | 0.29 0.54 0
4 | 0.89 0.55 1
5 | 0.78 0.36 1


--------------------------------------------------------------------------------
/Chapter 14/code/data_simple_nn.txt:
--------------------------------------------------------------------------------
 1 | 1.0 4.0 0 0
 2 | 1.1 3.9 0 0
 3 | 1.2 4.1 0 0
 4 | 0.9 3.7 0 0
 5 | 7.0 4.0 0 1
 6 | 7.2 4.1 0 1
 7 | 6.9 3.9 0 1
 8 | 7.1 4.2 0 1
 9 | 4.0 1.0 1 0
10 | 4.1 0.9 1 0
11 | 4.2 1.1 1 0
12 | 3.9 0.8 1 0
13 | 4.0 7.0 1 1
14 | 4.2 7.2 1 1
15 | 3.9 7.1 1 1
16 | 4.1 6.8 1 1
17 | 


--------------------------------------------------------------------------------
/Chapter 14/code/data_vector_quantization.txt:
--------------------------------------------------------------------------------
 1 | 0.9 5.1 1 0 0 0
 2 | 1.2 4.8 1 0 0 0
 3 | 1.0 4.9 1 0 0 0
 4 | 0.8 5.2 1 0 0 0
 5 | 8.0 4.1 0 1 0 0
 6 | 8.2 4.3 0 1 0 0
 7 | 7.9 3.8 0 1 0 0
 8 | 8.3 4.3 0 1 0 0
 9 | 5.0 1.1 0 0 1 0
10 | 5.1 0.8 0 0 1 0
11 | 5.3 1.2 0 0 1 0
12 | 4.9 0.9 0 0 1 0
13 | 5.0 7.0 0 0 0 1
14 | 5.2 7.2 0 0 0 1
15 | 4.9 7.1 0 0 0 1
16 | 5.1 6.8 0 0 0 1
17 | 


--------------------------------------------------------------------------------
/Chapter 14/code/multilayer_neural_network.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | import neurolab as nl
 4 | 
 5 | # Generate some training data
 6 | min_val = -15
 7 | max_val = 15
 8 | num_points = 130
 9 | x = np.linspace(min_val, max_val, num_points)
10 | y = 3 * np.square(x) + 5
11 | y /= np.linalg.norm(y)
12 | 
13 | # Create data and labels
14 | data = x.reshape(num_points, 1)
15 | labels = y.reshape(num_points, 1)
16 | 
17 | # Plot input data
18 | plt.figure()
19 | plt.scatter(data, labels)
20 | plt.xlabel('Dimension 1')
21 | plt.ylabel('Dimension 2')
22 | plt.title('Input data')
23 | 
24 | # Define a multilayer neural network with 2 hidden layers;
25 | # First hidden layer consists of 10 neurons
26 | # Second hidden layer consists of 6 neurons
27 | # Output layer consists of 1 neuron
28 | nn = nl.net.newff([[min_val, max_val]], [10, 6, 1])
29 | 
30 | # Set the training algorithm to gradient descent
31 | nn.trainf = nl.train.train_gd
32 | 
33 | # Train the neural network
34 | error_progress = nn.train(data, labels, epochs=2000, show=100, goal=0.01)
35 | 
36 | # Run the neural network on training datapoints
37 | output = nn.sim(data)
38 | y_pred = output.reshape(num_points)
39 | 
40 | # Plot training error
41 | plt.figure()
42 | plt.plot(error_progress)
43 | plt.xlabel('Number of epochs')
44 | plt.ylabel('Error')
45 | plt.title('Training error progress')
46 | 
47 | # Plot the output 
48 | x_dense = np.linspace(min_val, max_val, num_points * 2)
49 | y_dense_pred = nn.sim(x_dense.reshape(x_dense.size,1)).reshape(x_dense.size)
50 | 
51 | plt.figure()
52 | plt.plot(x_dense, y_dense_pred, '-', x, y, '.', x, y_pred, 'p')
53 | plt.title('Actual vs predicted')
54 | 
55 | plt.show()
56 | 


--------------------------------------------------------------------------------
/Chapter 14/code/ocr.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import neurolab as nl
 3 | 
 4 | # Define the input file
 5 | input_file = 'letter.data'
 6 | 
 7 | # Define the number of datapoints to 
 8 | # be loaded from the input file
 9 | num_datapoints = 50
10 | 
11 | # String containing all the distinct characters
12 | orig_labels = 'omandig'
13 | 
14 | # Compute the number of distinct characters
15 | num_orig_labels = len(orig_labels)
16 | 
17 | # Define the training and testing parameters
18 | num_train = int(0.9 * num_datapoints)
19 | num_test = num_datapoints - num_train
20 | 
21 | # Define the dataset extraction parameters 
22 | start = 6
23 | end = -1
24 | 
25 | # Creating the dataset
26 | data = []
27 | labels = []
28 | with open(input_file, 'r') as f:
29 |     for line in f.readlines():
30 |         # Split the current line tabwise
31 |         list_vals = line.split('\t')
32 | 
33 |         # Check if the label is in our ground truth 
34 |         # labels. If not, we should skip it.
35 |         if list_vals[1] not in orig_labels:
36 |             continue
37 | 
38 |         # Extract the current label and append it 
39 |         # to the main list
40 |         label = np.zeros((num_orig_labels, 1))
41 |         label[orig_labels.index(list_vals[1])] = 1
42 |         labels.append(label)
43 | 
44 |         # Extract the character vector and append it to the main list
45 |         cur_char = np.array([float(x) for x in list_vals[start:end]])
46 |         data.append(cur_char)
47 | 
48 |         # Exit the loop once the required dataset has been created 
49 |         if len(data) >= num_datapoints:
50 |             break
51 | 
52 | # Convert the data and labels to numpy arrays
53 | data = np.asfarray(data)
54 | labels = np.array(labels).reshape(num_datapoints, num_orig_labels)
55 | 
56 | # Extract the number of dimensions
57 | num_dims = len(data[0])
58 | 
59 | # Create a feedforward neural network
60 | nn = nl.net.newff([[0, 1] for _ in range(len(data[0]))], 
61 |         [128, 16, num_orig_labels])
62 | 
63 | # Set the training algorithm to gradient descent
64 | nn.trainf = nl.train.train_gd
65 | 
66 | # Train the network
67 | error_progress = nn.train(data[:num_train,:], labels[:num_train,:], 
68 |         epochs=10000, show=100, goal=0.01)
69 | 
70 | # Predict the output for test inputs 
71 | print('\nTesting on unknown data:')
72 | predicted_test = nn.sim(data[num_train:, :])
73 | for i in range(num_test):
74 |     print('\nOriginal:', orig_labels[np.argmax(labels[i])])
75 |     print('Predicted:', orig_labels[np.argmax(predicted_test[i])])
76 | 
77 | 


--------------------------------------------------------------------------------
/Chapter 14/code/perceptron_classifier.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | import neurolab as nl
 4 | 
 5 | # Load input data 
 6 | text = np.loadtxt('data_perceptron.txt')
 7 | 
 8 | # Separate datapoints and labels
 9 | data = text[:, :2]
10 | labels = text[:, 2].reshape((text.shape[0], 1))
11 | 
12 | # Plot input data
13 | plt.figure()
14 | plt.scatter(data[:,0], data[:,1])
15 | plt.xlabel('Dimension 1')
16 | plt.ylabel('Dimension 2')
17 | plt.title('Input data')
18 | 
19 | # Define minimum and maximum values for each dimension
20 | dim1_min, dim1_max, dim2_min, dim2_max = 0, 1, 0, 1
21 | 
22 | # Number of neurons in the output layer
23 | num_output = labels.shape[1]
24 | 
25 | # Define a perceptron with 2 input neurons (because we 
26 | # have 2 dimensions in the input data)
27 | dim1 = [dim1_min, dim1_max]
28 | dim2 = [dim2_min, dim2_max]
29 | perceptron = nl.net.newp([dim1, dim2], num_output)
30 | 
31 | # Train the perceptron using the data
32 | error_progress = perceptron.train(data, labels, epochs=100, show=20, lr=0.03)
33 | 
34 | # Plot the training progress
35 | plt.figure()
36 | plt.plot(error_progress)
37 | plt.xlabel('Number of epochs')
38 | plt.ylabel('Training error')
39 | plt.title('Training error progress')
40 | plt.grid()
41 | 
42 | plt.show()
43 | 


--------------------------------------------------------------------------------
/Chapter 14/code/recurrent_neural_network.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | import neurolab as nl
 4 | 
 5 | def get_data(num_points):
 6 |     # Create sine waveforms
 7 |     wave_1 = 0.5 * np.sin(np.arange(0, num_points))
 8 |     wave_2 = 3.6 * np.sin(np.arange(0, num_points))
 9 |     wave_3 = 1.1 * np.sin(np.arange(0, num_points))
10 |     wave_4 = 4.7 * np.sin(np.arange(0, num_points))
11 | 
12 |     # Create varying amplitudes
13 |     amp_1 = np.ones(num_points)
14 |     amp_2 = 2.1 + np.zeros(num_points) 
15 |     amp_3 = 3.2 * np.ones(num_points) 
16 |     amp_4 = 0.8 + np.zeros(num_points) 
17 | 
18 |     wave = np.array([wave_1, wave_2, wave_3, wave_4]).reshape(num_points * 4, 1)
19 |     amp = np.array([[amp_1, amp_2, amp_3, amp_4]]).reshape(num_points * 4, 1)
20 | 
21 |     return wave, amp 
22 | 
23 | # Visualize the output 
24 | def visualize_output(nn, num_points_test):
25 |     wave, amp = get_data(num_points_test)
26 |     output = nn.sim(wave)
27 |     plt.plot(amp.reshape(num_points_test * 4))
28 |     plt.plot(output.reshape(num_points_test * 4))
29 | 
30 | if __name__=='__main__':
31 |     # Create some sample data
32 |     num_points = 40
33 |     wave, amp = get_data(num_points)
34 | 
35 |     # Create a recurrent neural network with 2 layers
36 |     nn = nl.net.newelm([[-2, 2]], [10, 1], [nl.trans.TanSig(), nl.trans.PureLin()])
37 | 
38 |     # Set the init functions for each layer 
39 |     nn.layers[0].initf = nl.init.InitRand([-0.1, 0.1], 'wb')
40 |     nn.layers[1].initf = nl.init.InitRand([-0.1, 0.1], 'wb')
41 |     nn.init()
42 | 
43 |     # Train the recurrent neural network
44 |     error_progress = nn.train(wave, amp, epochs=1200, show=100, goal=0.01)
45 | 
46 |     # Run the training data through the network
47 |     output = nn.sim(wave)
48 | 
49 |     # Plot the results
50 |     plt.subplot(211)
51 |     plt.plot(error_progress)
52 |     plt.xlabel('Number of epochs')
53 |     plt.ylabel('Error (MSE)')
54 | 
55 |     plt.subplot(212)
56 |     plt.plot(amp.reshape(num_points * 4))
57 |     plt.plot(output.reshape(num_points * 4))
58 |     plt.legend(['Original', 'Predicted'])
59 | 
60 |     # Testing the network performance on unknown data 
61 |     plt.figure()
62 | 
63 |     plt.subplot(211)
64 |     visualize_output(nn, 82)
65 |     plt.xlim([0, 300])
66 | 
67 |     plt.subplot(212)
68 |     visualize_output(nn, 49)
69 |     plt.xlim([0, 300])
70 | 
71 |     plt.show()
72 | 


--------------------------------------------------------------------------------
/Chapter 14/code/simple_neural_network.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | import neurolab as nl
 4 | 
 5 | # Load input data
 6 | text = np.loadtxt('data_simple_nn.txt')
 7 | 
 8 | # Separate it into datapoints and labels
 9 | data = text[:, 0:2]
10 | labels = text[:, 2:]
11 | 
12 | # Plot input data
13 | plt.figure()
14 | plt.scatter(data[:,0], data[:,1])
15 | plt.xlabel('Dimension 1')
16 | plt.ylabel('Dimension 2')
17 | plt.title('Input data')
18 | 
19 | # Minimum and maximum values for each dimension
20 | dim1_min, dim1_max = data[:,0].min(), data[:,0].max()
21 | dim2_min, dim2_max = data[:,1].min(), data[:,1].max()
22 | 
23 | # Define the number of neurons in the output layer
24 | num_output = labels.shape[1]
25 | 
26 | # Define a single-layer neural network 
27 | dim1 = [dim1_min, dim1_max]
28 | dim2 = [dim2_min, dim2_max]
29 | nn = nl.net.newp([dim1, dim2], num_output)
30 | 
31 | # Train the neural network
32 | error_progress = nn.train(data, labels, epochs=100, show=20, lr=0.03)
33 | 
34 | # Plot the training progress
35 | plt.figure()
36 | plt.plot(error_progress)
37 | plt.xlabel('Number of epochs')
38 | plt.ylabel('Training error')
39 | plt.title('Training error progress')
40 | plt.grid()
41 | 
42 | plt.show()
43 | 
44 | # Run the classifier on test datapoints
45 | print('\nTest results:')
46 | data_test = [[0.4, 4.3], [4.4, 0.6], [4.7, 8.1]]
47 | for item in data_test:
48 |     print(item, '-->', nn.sim([item])[0])
49 | 


--------------------------------------------------------------------------------
/Chapter 14/code/vector_quantizer.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | import neurolab as nl
 4 | 
 5 | # Load input data
 6 | text = np.loadtxt('data_vector_quantization.txt')
 7 | 
 8 | # Separate it into data and labels
 9 | data = text[:, 0:2]
10 | labels = text[:, 2:]
11 | 
12 | # Define a neural network with 2 layers:
13 | # 10 neurons in input layer and 4 neurons in output layer
14 | num_input_neurons = 10
15 | num_output_neurons = 4
16 | weights = [1/num_output_neurons] * num_output_neurons
17 | nn = nl.net.newlvq(nl.tool.minmax(data), num_input_neurons, weights)
18 | 
19 | # Train the neural network
20 | _ = nn.train(data, labels, epochs=500, goal=-1)
21 | 
22 | # Create the input grid
23 | xx, yy = np.meshgrid(np.arange(0, 10, 0.2), np.arange(0, 10, 0.2))
24 | xx.shape = xx.size, 1
25 | yy.shape = yy.size, 1
26 | grid_xy = np.concatenate((xx, yy), axis=1)
27 | 
28 | # Evaluate the input grid of points
29 | grid_eval = nn.sim(grid_xy)
30 | 
31 | # Define the 4 classes
32 | class_1 = data[labels[:,0] == 1]
33 | class_2 = data[labels[:,1] == 1]
34 | class_3 = data[labels[:,2] == 1]
35 | class_4 = data[labels[:,3] == 1]
36 | 
37 | # Define X-Y grids for all the 4 classes
38 | grid_1 = grid_xy[grid_eval[:,0] == 1]
39 | grid_2 = grid_xy[grid_eval[:,1] == 1]
40 | grid_3 = grid_xy[grid_eval[:,2] == 1]
41 | grid_4 = grid_xy[grid_eval[:,3] == 1]
42 | 
43 | # Plot the outputs 
44 | plt.plot(class_1[:,0], class_1[:,1], 'ko', 
45 |         class_2[:,0], class_2[:,1], 'ko', 
46 |         class_3[:,0], class_3[:,1], 'ko', 
47 |         class_4[:,0], class_4[:,1], 'ko')
48 | plt.plot(grid_1[:,0], grid_1[:,1], 'm.',
49 |         grid_2[:,0], grid_2[:,1], 'bx',
50 |         grid_3[:,0], grid_3[:,1], 'c^', 
51 |         grid_4[:,0], grid_4[:,1], 'y+')
52 | plt.axis([0, 10, 0, 10])
53 | plt.xlabel('Dimension 1')
54 | plt.ylabel('Dimension 2')
55 | plt.title('Vector quantization')
56 | 
57 | plt.show()
58 | 
59 | 


--------------------------------------------------------------------------------
/Chapter 15/code/balancer.py:
--------------------------------------------------------------------------------
 1 | import argparse
 2 | 
 3 | import gym
 4 | 
 5 | def build_arg_parser():
 6 |     parser = argparse.ArgumentParser(description='Run an environment')
 7 |     parser.add_argument('--input-env', dest='input_env', required=True,
 8 |             choices=['cartpole', 'mountaincar', 'pendulum'], 
 9 |             help='Specify the name of the environment')
10 |     return parser
11 | 
12 | if __name__=='__main__':
13 |     args = build_arg_parser().parse_args()
14 |     input_env = args.input_env
15 | 
16 |     name_map = {'cartpole': 'CartPole-v0', 
17 |                 'mountaincar': 'MountainCar-v0',
18 |                 'pendulum': 'Pendulum-v0'}
19 | 
20 |     # Create the environment 
21 |     env = gym.make(name_map[input_env])
22 | 
23 |     # Start iterating 
24 |     for _ in range(20):
25 |         # Reset the environment
26 |         observation = env.reset()
27 | 
28 |         # Iterate 100 times
29 |         for i in range(100):
30 |             # Render the environment
31 |             env.render()
32 | 
33 |             # Print the current observation
34 |             print(observation)
35 | 
36 |             # Take action 
37 |             action = env.action_space.sample()
38 | 
39 |             # Extract the observation, reward, status and 
40 |             # other info based on the action taken
41 |             observation, reward, done, info = env.step(action)
42 |             
43 |             # Check if it's done
44 |             if done:
45 |                 print('Episode finished after {} timesteps'.format(i+1))
46 |                 break
47 | 


--------------------------------------------------------------------------------
/Chapter 15/code/run_environment.py:
--------------------------------------------------------------------------------
 1 | import argparse
 2 | 
 3 | import gym
 4 | 
 5 | def build_arg_parser():
 6 |     parser = argparse.ArgumentParser(description='Run an environment')
 7 |     parser.add_argument('--input-env', dest='input_env', required=True,
 8 |             choices=['cartpole', 'mountaincar', 'pendulum', 'taxi', 'lake'], 
 9 |             help='Specify the name of the environment')
10 |     return parser
11 | 
12 | if __name__=='__main__':
13 |     args = build_arg_parser().parse_args()
14 |     input_env = args.input_env
15 | 
16 |     name_map = {'cartpole': 'CartPole-v0', 
17 |                 'mountaincar': 'MountainCar-v0',
18 |                 'pendulum': 'Pendulum-v0',
19 |                 'taxi': 'Taxi-v1',
20 |                 'lake': 'FrozenLake-v0'}
21 | 
22 |     # Create the environment and reset it
23 |     env = gym.make(name_map[input_env])
24 |     env.reset()
25 | 
26 |     # Iterate 1000 times
27 |     for _ in range(1000):
28 |         # Render the environment
29 |         env.render()
30 | 
31 |         # take a random action
32 |         env.step(env.action_space.sample()) 
33 | 
34 | 


--------------------------------------------------------------------------------
/Chapter 16/code/linear_regession.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import matplotlib.pyplot as plt
 3 | import tensorflow as tf
 4 | 
 5 | # Define the number of points to generate
 6 | num_points = 1200
 7 | 
 8 | # Generate the data based on equation y = mx + c
 9 | data = []
10 | m = 0.2
11 | c = 0.5
12 | for i in range(num_points):
13 |     # Generate 'x' 
14 |     x = np.random.normal(0.0, 0.8)
15 | 
16 |     # Generate some noise
17 |     noise = np.random.normal(0.0, 0.04)
18 | 
19 |     # Compute 'y' 
20 |     y = m*x + c + noise 
21 | 
22 |     data.append([x, y])
23 | 
24 | # Separate x and y
25 | x_data = [d[0] for d in data]
26 | y_data = [d[1] for d in data]
27 | 
28 | # Plot the generated data
29 | plt.plot(x_data, y_data, 'ro')
30 | plt.title('Input data')
31 | plt.show()
32 | 
33 | # Generate weights and biases
34 | W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
35 | b = tf.Variable(tf.zeros([1]))
36 | 
37 | # Define equation for 'y'
38 | y = W * x_data + b
39 | 
40 | # Define how to compute the loss
41 | loss = tf.reduce_mean(tf.square(y - y_data))
42 | 
43 | # Define the gradient descent optimizer
44 | optimizer = tf.train.GradientDescentOptimizer(0.5)
45 | train = optimizer.minimize(loss)
46 | 
47 | # Initialize all the variables
48 | init = tf.initialize_all_variables()
49 | 
50 | # Start the tensorflow session and run it
51 | sess = tf.Session()
52 | sess.run(init)
53 | 
54 | # Start iterating
55 | num_iterations = 10
56 | for step in range(num_iterations):
57 |     # Run the session
58 |     sess.run(train)
59 | 
60 |     # Print the progress
61 |     print('\nITERATION', step+1)
62 |     print('W =', sess.run(W)[0])
63 |     print('b =', sess.run(b)[0])
64 |     print('loss =', sess.run(loss))
65 | 
66 |     # Plot the input data 
67 |     plt.plot(x_data, y_data, 'ro')
68 | 
69 |     # Plot the predicted output line
70 |     plt.plot(x_data, sess.run(W) * x_data + sess.run(b))
71 | 
72 |     # Set plotting parameters
73 |     plt.xlabel('Dimension 0')
74 |     plt.ylabel('Dimension 1')
75 |     plt.title('Iteration ' + str(step+1) + ' of ' + str(num_iterations))
76 |     plt.show()
77 | 
78 | 


--------------------------------------------------------------------------------
/Chapter 16/code/single_layer.py:
--------------------------------------------------------------------------------
 1 | import argparse
 2 | 
 3 | import tensorflow as tf
 4 | from tensorflow.examples.tutorials.mnist import input_data
 5 | 
 6 | def build_arg_parser():
 7 |     parser = argparse.ArgumentParser(description='Build a classifier using \
 8 |             MNIST data')
 9 |     parser.add_argument('--input-dir', dest='input_dir', type=str, 
10 |             default='./mnist_data', help='Directory for storing data')
11 |     return parser
12 | 
13 | if __name__ == '__main__':
14 |     args = build_arg_parser().parse_args()
15 | 
16 |     # Get the MNIST data
17 |     mnist = input_data.read_data_sets(args.input_dir, one_hot=True)
18 | 
19 |     # The images are 28x28, so create the input layer 
20 |     # with 784 neurons (28x28=784) 
21 |     x = tf.placeholder(tf.float32, [None, 784])
22 | 
23 |     # Create a layer with weights and biases. There are 10 distinct 
24 |     # digits, so the output layer should have 10 classes
25 |     W = tf.Variable(tf.zeros([784, 10]))
26 |     b = tf.Variable(tf.zeros([10]))
27 | 
28 |     # Create the equation for 'y' using y = W*x + b
29 |     y = tf.matmul(x, W) + b
30 | 
31 |     # Define the entropy loss and the gradient descent optimizer
32 |     y_loss = tf.placeholder(tf.float32, [None, 10])
33 |     loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(y, y_loss))
34 |     optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss)
35 | 
36 |     # Initialize all the variables 
37 |     init = tf.initialize_all_variables()
38 | 
39 |     # Create a session
40 |     session = tf.Session()
41 |     session.run(init)
42 | 
43 |     # Start training
44 |     num_iterations = 1200
45 |     batch_size = 90
46 |     for _ in range(num_iterations):
47 |         # Get the next batch of images
48 |         x_batch, y_batch = mnist.train.next_batch(batch_size)
49 | 
50 |         # Train on this batch of images
51 |         session.run(optimizer, feed_dict = {x: x_batch, y_loss: y_batch})
52 | 
53 |     # Compute the accuracy using test data
54 |     predicted = tf.equal(tf.argmax(y, 1), tf.argmax(y_loss, 1))
55 |     accuracy = tf.reduce_mean(tf.cast(predicted, tf.float32))
56 |     print('\nAccuracy =', session.run(accuracy, feed_dict = {
57 |             x: mnist.test.images, 
58 |             y_loss: mnist.test.labels}))
59 | 
60 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2017 Packt
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | 
 2 | 
 3 | 
 4 | # Artificial Intelligence with Python
 5 | This is the code repository for [Artificial Intelligence with Python](https://www.packtpub.com/big-data-and-business-intelligence/artificial-intelligence-python?utm_source=github&utm_medium=repository&utm_campaign=9781786464392), published by [Packt](https://www.packtpub.com/?utm_source=github). It contains all the supporting project files necessary to work through the book from start to finish.
 6 | ## About the Book
 7 | During the course of this book, you will find out how to make informed decisions about what algorithms to use in a given context. Starting from the basics of Artificial Intelligence, you will learn how to develop various building blocks using different data mining techniques. You will see how to implement different algorithms to get the best possible results, and will understand how to apply them to real-world scenarios. If you want to add an intelligence layer to any application that’s based on images, text, stock market, or some other form of data, this exciting book on Artificial Intelligence will definitely be your guide!
 8 | 
 9 | ##Instructions and Navigation
10 | All of the code is organized into folders. Each folder starts with a number followed by the application name. For example, Chapter02.
11 | 
12 | 
13 | 
14 | The code will look like the following:
15 | ```
16 | A block of code is set as follows:
17 | 
18 | [default]
19 | exten => s,1,Dial(Zap/1|30)
20 | exten => s,2,Voicemail(u100)
21 | exten => s,102,Voicemail(b100)
22 | exten => i,1,Voicemail(s0)
23 | ```
24 | 
25 | This book is focused on artificial intelligence in Python as opposed to the Python itself. We have used Python 3 to build various applications. We focus on how to utilize various Python libraries in the best possible way to build real world applications. In that spirit, we have tried to keep all of the code as friendly and readable as possible. We feel that this will enable our readers to easily understand the code and readily use it in different scenarios.
26 | 
27 | ## Related Products
28 | * [Deep Learning with Python [Video]](https://www.packtpub.com/big-data-and-business-intelligence/deep-learning-python-video?utm_source=github&utm_medium=repository&utm_campaign=9781785883873)
29 | 
30 | * [Learning IPython for Interactive Computing and Data Visualization](https://www.packtpub.com/big-data-and-business-intelligence/learning-ipython-interactive-computing-and-data-visualization?utm_source=github&utm_medium=repository&utm_campaign=9781782169932)
31 | 
32 | * [Python High Performance Programming](https://www.packtpub.com/application-development/python-high-performance-programming?utm_source=github&utm_medium=repository&utm_campaign=9781783288458)
33 | 
34 | ### Suggestions and Feedback
35 | [Click here](https://docs.google.com/forms/d/e/1FAIpQLSe5qwunkGf6PUvzPirPDtuy1Du5Rlzew23UBp2S-P3wB-GcwQ/viewform) if you have any feedback or suggestions.
36 | ### Download a free PDF
37 | 
38 |  <i>If you have already purchased a print or Kindle version of this book, you can get a DRM-free PDF version at no cost.<br>Simply click on the link to claim your free PDF.</i>
39 | <p align="center"> <a href="https://packt.link/free-ebook/9781839219535">https://packt.link/free-ebook/9781839219535 </a> </p>


--------------------------------------------------------------------------------