├── .gitignore
├── README.md
├── __init__.py
├── classifier
│   ├── __init__.py
│   ├── mlp.py
│   ├── rf_workflow.py
│   ├── ssmlp.py
│   ├── svm_cv_workflow.py
│   └── svm_workflow.py
├── examples
│   ├── example_pipeline.py
│   └── example_scae_svm.py
├── feature_extraction
│   ├── __init__.py
│   ├── cae.py
│   ├── mcae.py
│   └── scae.py
├── fileIO
│   ├── __init__.py
│   ├── aviris.py
│   ├── aviris_bands.hdr
│   ├── gliht.py
│   ├── hyperion.py
│   ├── hyperion_bands.csv
│   └── utils.py
├── metrics.py
├── pipeline.py
├── postprocessing.py
├── preprocessing.py
├── setup.py
├── tf_initializers.py
└── utils
    ├── __init__.py
    ├── convolution.py
    ├── train_val_split.py
    └── utils.py
/.gitignore:
--------------------------------------------------------------------------------
1 | *.hdf5
2 | *feature_extraction/pre_trained/*
3 | *checkpoint/
4 | *__pycache__
5 | *checkpoints*
6 | *checkpoint*
7 | *.png
8 | *.mat
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # EarthMapper #
2 |
3 | Project repository for EarthMapper. This is a toolbox for the semantic segmentation of non-RGB (i.e., multispectral/hyperspectral) imagery. We will work on adding more examples and better documentation.
4 |
5 |
6 |
7 |
8 |
9 | ## Description ##
10 |
11 | This is a classification pipeline from various projects that we have worked on over the past few years. Currently available options include:
12 |
13 | ### Pre-Processing ###
14 | * MinMaxScaler - Scale data (per-channel) between a given feature range (e.g., 0-1)
15 | * StandardScaler - Scale data (per-channel) to zero-mean/unit-variance
16 | * PCA - Reduce dimensionality via principal component analysis
17 | * Normalize - Scale data using the per-channel L2 norm
18 |
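As a rough illustration of what the per-channel scalers above compute, the sketch below rescales a hyperspectral cube with plain NumPy. It is not the toolbox's `preprocessing.py` API (that file is not shown here); the function names `minmax_scale_per_channel` and `standard_scale_per_channel` and the `(rows, cols, bands)` array layout are assumptions made for this example.

```python
import numpy as np


def minmax_scale_per_channel(cube, feature_range=(0.0, 1.0)):
    """Rescale each band of a (rows, cols, bands) cube to feature_range.

    Hypothetical helper for illustration only.
    """
    lo, hi = feature_range
    flat = cube.reshape(-1, cube.shape[-1]).astype(np.float64)
    band_min = flat.min(axis=0)
    band_max = flat.max(axis=0)
    # Guard against constant bands to avoid division by zero
    span = np.where(band_max > band_min, band_max - band_min, 1.0)
    scaled = (flat - band_min) / span * (hi - lo) + lo
    return scaled.reshape(cube.shape)


def standard_scale_per_channel(cube):
    """Shift each band to zero mean and scale it to unit variance.

    Hypothetical helper for illustration only.
    """
    flat = cube.reshape(-1, cube.shape[-1]).astype(np.float64)
    mean = flat.mean(axis=0)
    std = flat.std(axis=0)
    std[std == 0] = 1.0  # leave constant bands unscaled
    return ((flat - mean) / std).reshape(cube.shape)


if __name__ == "__main__":
    rng = np.random.RandomState(0)
    demo = rng.rand(4, 5, 3) * 100.0
    print(minmax_scale_per_channel(demo).reshape(-1, 3).min(axis=0))
    print(standard_scale_per_channel(demo).reshape(-1, 3).std(axis=0))
```

Statistics are computed over all pixels of one band at a time, so bands with very different dynamic ranges end up on a comparable scale before feature extraction or classification.
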
19 | ### Spatial-Spectral Feature Extraction ###
20 | * Stacked Convolutional Autoencoder (SCAE)
21 | * Stacked Multi-Loss Convolutional Autoencoder (SMCAE)
22 |
23 | ### Classifiers ###
24 | * SVMWorkflow - Support vector machine with a given training/validation split
25 | * SVMCVWorkflow - Support vector machine that uses n-fold cross-validation to find optimal hyperparameters
26 | * RandomForestWorkflow - Random Forest classifier
27 | * MLP - Multi-layer Perceptron Neural Network classifier
28 | * SSMLP - Semi-supervised MLP Neural Network classifier
29 |
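The `MLP` and `SSMLP` classifiers accept a `mu` argument. When it is set, their `fit` methods weight each class by `mu * log(N / n_c)` (with `N` the number of training samples and `n_c` the count of class `c`), floored at 1, so that rare classes contribute more to the loss. The standalone NumPy sketch below mirrors that computation; the function name `log_class_weights` is hypothetical and is not part of the toolbox.

```python
import numpy as np


def log_class_weights(y, mu=0.25, num_classes=None):
    """Log-scaled inverse-frequency class weights, floored at 1.

    Hypothetical helper that mirrors the weighting used in the
    classifiers' fit() methods; classes with no samples get weight 1.
    """
    y = np.asarray(y)
    if num_classes is None:
        num_classes = int(np.max(y)) + 1
    counts = np.bincount(y, minlength=num_classes).astype(np.float64)
    weights = np.ones(num_classes, dtype=np.float64)
    present = counts > 0
    weights[present] = mu * np.log(counts.sum() / counts[present])
    weights[weights < 1] = 1.0  # never down-weight a class
    return weights


if __name__ == "__main__":
    labels = np.array([0] * 990 + [1] * 10)
    print(log_class_weights(labels, mu=0.5))
```

With `mu=None` the classifiers fall back to uniform weights of 1 for every class.
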
30 | ### Post-Processors ###
31 | * Markov Random Field (MRF)
32 | * Fully-Connected Conditional Random Field (CRF)
33 |
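To give a feel for what MRF post-processing does to a classification map, the sketch below smooths per-pixel class probabilities with a Potts prior using iterated conditional modes (ICM). This is a deliberately simple stand-in, not the inference used by `postprocessing.py` (not shown here); the function name `icm_potts` and the parameters `beta` and `n_iter` are assumptions made for this example.

```python
import numpy as np


def icm_potts(prob, beta=1.0, n_iter=5):
    """Smooth a label map with a 4-connected Potts MRF via ICM.

    prob: (rows, cols, classes) array of class probabilities.
    Returns a (rows, cols) integer label map. Hypothetical helper.
    """
    unary = -np.log(prob + 1e-10)      # data term per pixel and class
    labels = np.argmin(unary, axis=2)  # start from the per-pixel best class
    rows, cols, _ = prob.shape
    for _ in range(n_iter):
        changed = False
        for r in range(rows):
            for c in range(cols):
                cost = unary[r, c].copy()
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        # Potts penalty: every label that differs from
                        # this neighbor's label pays beta
                        cost += beta
                        cost[labels[rr, cc]] -= beta
                best = int(np.argmin(cost))
                if best != labels[r, c]:
                    labels[r, c] = best
                    changed = True
        if not changed:
            break  # converged to a local minimum
    return labels


if __name__ == "__main__":
    p = np.zeros((5, 5, 2))
    p[..., 0], p[..., 1] = 0.9, 0.1
    p[2, 2] = (0.4, 0.6)  # one isolated, weakly confident pixel
    print(icm_potts(p, beta=1.0))
```

Larger `beta` values favor smoother maps at the cost of erasing small structures; ICM only finds a local minimum of the energy.
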
34 | ## Dependencies ##
35 | * Python 3.5 (We recommend the [Anaconda Python Distribution](https://www.anaconda.com/download/))
36 | * numpy, scipy, and matplotlib
37 | * [scikit-learn](http://scikit-learn.org/stable/)
38 | * [spectral python](http://www.spectralpython.net/)
39 | * [gdal](http://www.gdal.org/)
40 | * [tensorflow](https://www.tensorflow.org/)
41 | * [pydensecrf](https://github.com/lucasb-eyer/pydensecrf)
42 | * [gco_python](https://github.com/amueller/gco_python)
43 |
44 | ## Instructions ##
45 |
46 | ### Installation ###
47 | ```console
48 | $ python setup.py install
49 | ```
50 | ### Run example ###
51 | ```console
52 | $ python examples/example_pipeline.py
53 | ```
54 | ## Citations ##
55 |
56 | If you use our product, please cite:
57 | * Kemker, R., Gewali, U. B., Kanan, C. [EarthMapper: A Tool Box for the Semantic Segmentation of Remote Sensing Imagery
58 | ](https://arxiv.org/abs/1804.00292). In review at the IEEE Geoscience and Remote Sensing Letters (GRSL).
59 | * Kemker, R., Luu, R., and Kanan, C. (2018) [Low-Shot Learning for the Semantic Segmentation of Remote Sensing Imagery](https://arxiv.org/abs/1803.09824). IEEE Transactions on Geoscience and Remote Sensing (TGRS).
60 | * Kemker, R., and Kanan, C. (2017) [Self-Taught Feature Learning for Hyperspectral Image Classification](http://ieeexplore.ieee.org/document/7875467/). IEEE Transactions on Geoscience and Remote Sensing (TGRS), 55(5): 2693-2705. 10.1109/TGRS.2017.2651639
61 | * U. B. Gewali and S. T. Monteiro, [A tutorial on modeling and inference in undirected graphical models for hyperspectral image analysis](https://arxiv.org/abs/1801.08268), In review at the International Journal of Remote Sensing (IJRS).
62 | * U. B. Gewali and S. T. Monteiro, [Spectral angle based unary energy functions for spatial-spectral hyperspectral classification using Markov random fields](http://ieeexplore.ieee.org/abstract/document/8071716/), in Proc. IEEE Workshop on Hyperspectral Image and Signal Processing : Evolution in Remote Sensing (WHISPERS), Los Angeles, CA, USA, Aug. 2016.
63 |
64 | ## Points of Contact ##
65 | * Ronald Kemker - http://www.cis.rit.edu/~rmk6217/
66 | * Utsav Gewali - http://www.cis.rit.edu/~ubg9540/
67 | * Chris Kanan - http://www.chriskanan.com/
68 |
69 | ## Also Check Out ##
70 | * RIT-18 Dataset - https://github.com/rmkemker/RIT-18
71 | * Machine and Neuromorphic Perception Laboratory - http://klab.cis.rit.edu/
72 |
--------------------------------------------------------------------------------
/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rmkemker/EarthMapper/098392c405feebc79606c01b3c1c4bb48999f224/__init__.py
--------------------------------------------------------------------------------
/classifier/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rmkemker/EarthMapper/098392c405feebc79606c01b3c1c4bb48999f224/classifier/__init__.py
--------------------------------------------------------------------------------
/classifier/mlp.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import math
3 | import os
4 | import numpy as np
5 | import tf_initializers
6 |
7 | class CMA(object):
8 |
9 | def __init__(self):
10 | self.N = 0
11 | self.avg = 0.0
12 |
13 | def update(self, X):
14 | self.avg = (X+self.N * self.avg)/(self.N+1)
15 | self.N = self.N + 1
16 |
17 | class MLP(object):
18 |
19 | def __init__(self, layer_sizes, num_epochs=150,
20 | batch_size=200, weight_decay=None,
21 | starter_learning_rate=2e-3, ckpt_path='checkpoints',
22 | patience=50,
23 | activation='pelu',
24 | weight_initializer='glorot_normal',
25 | mu = None):
26 |
27 | '''
28 | MLP Attributes
29 | '''
30 | self.L = len(layer_sizes) - 1 # number of layers
31 | self.layer_sizes = layer_sizes
32 | self.weight_decay = weight_decay
33 | self.num_epochs = num_epochs
34 | self.batch_size = batch_size
35 | self.ckpt_path = ckpt_path
36 | self.starter_learning_rate = starter_learning_rate
37 | self.patience = patience
38 | self.lr_patience = int(patience/2)
39 | self.activation = activation
40 | self.weight_initializer = tf_initializers.get(weight_initializer)
41 | self.mu = mu
42 |
43 | with tf.Graph().as_default():
44 |
45 | inputs = tf.placeholder(tf.float32, shape=(None,self.layer_sizes[0]),name='inputs')
46 | outputs = tf.placeholder(tf.int32, name='outputs')
47 | class_weights = tf.placeholder(tf.float32, name='class_weights')
48 | learning_rate = tf.placeholder(tf.float32, name='learning_rate')
49 |
50 | training = tf.placeholder(tf.bool,name='training')
51 |
52 | shapes = list(zip(self.layer_sizes[:-1], self.layer_sizes[1:])) # shapes of linear layers
53 |
54 | weights = {'W': [self.weight_initializer(s, "W") for s in shapes],
55 | 'Wb':[self.bi(1.0, self.layer_sizes[l+1], "Wb") for l in range(self.L)],
56 | 'a1':[tf.Variable(tf.ones(1)) for l in range(self.L-1)],
57 | 'b1':[tf.Variable(tf.ones(1)) for l in range(self.L-1)]}
58 |
59 | tf.add_to_collection("inputs", inputs)
60 | tf.add_to_collection("outputs", outputs)
61 | tf.add_to_collection("training", training)
62 | tf.add_to_collection('learning_rate', learning_rate)
63 | tf.add_to_collection('class_weights', class_weights)
64 |
65 | eps = tf.constant(1e-10)
66 |
67 | #Encoder
68 | h = inputs
69 | for i in range(self.L-1):
70 | h = tf.matmul(h, weights['W'][i])
71 | h = tf.add(h, weights['Wb'][i])
72 | h = tf.layers.batch_normalization(h, training=training)
73 |
74 | if i == self.L - 2:
75 | h = self.act(h, weights["a1"][i], weights["b1"][i])
76 | h = tf.matmul(h, weights['W'][self.L-1])
77 | h = tf.add(h, weights['Wb'][self.L-1])
78 | y = tf.nn.softmax(h)
79 | else:
80 | h = self.act(h, weights["a1"][i], weights["b1"][i])
81 |
82 | tf.add_to_collection("y", y)
83 |
84 | # clipped_output = tf.clip_by_value(outputs, 1e-10, 1.0)
85 | one_hot = tf.cast(tf.one_hot(outputs, layer_sizes[-1]) , tf.float32)
86 | loss_per = tf.reduce_sum(one_hot * tf.multiply(tf.log(y+eps),class_weights), 1) # supervised cost
87 | loss = -tf.reduce_mean(loss_per)
88 |
89 | reg_loss = 0.0
90 | if self.weight_decay is not None:
91 | for i in range(0,self.L):
92 | reg_loss = reg_loss + self.weight_decay*tf.nn.l2_loss(weights['W'][i])
93 |
94 | pelu_loss = 0.0
95 | if self.activation == 'pelu':
96 | pelu_loss = pelu_loss - 1000.0 * tf.minimum( tf.reduce_min(weights['a1']), 0)
97 | pelu_loss = pelu_loss - 1000.0 * tf.minimum( tf.reduce_min(weights['b1']), 0)
98 |
99 |
100 | loss = loss + reg_loss + pelu_loss
101 | loss_per = -loss_per + reg_loss + pelu_loss
102 | tf.add_to_collection('loss_per_t', loss_per)
103 | tf.add_to_collection('loss', loss)
104 |
105 | pred = tf.argmax(y, 1)
106 | tf.add_to_collection('pred', pred)
107 | # truth = tf.argmax(one_hot, 1)
108 | correct_prediction = tf.equal(tf.cast(tf.argmax(y, 1), tf.int32), outputs) # no of correct predictions
109 | accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float")) * tf.constant(100.0)
110 | tf.add_to_collection('accuracy', accuracy)
111 |
112 | opt = tf.contrib.opt.NadamOptimizer(learning_rate)
113 | update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
114 | with tf.control_dependencies(update_ops):
115 | train_step = opt.minimize(loss)
116 |
117 | tf.add_to_collection('train_step', train_step)
118 |
119 | with tf.Session() as sess:
120 | saver = tf.train.Saver()
121 | sess.run(tf.global_variables_initializer())
122 | saver.save(sess, self.ckpt_path+'/model.ckpt', 0)
123 |
124 | def bi(self, inits, size, name):
125 | return tf.Variable(inits * tf.ones([size]), name=name)
126 |
127 | def wi(self, shape, name):
128 | return tf.Variable(tf.random_normal(shape, name=name)/ math.sqrt(shape[0]))
129 |
130 |     def act(self, z, a, b):
131 |         """Applies the activation function selected by `self.activation`.
132 |
133 |         Args:
134 |             z (tensor): data to apply activation function on.
135 |             a (tf.Variable): PELU `a` parameter (ignored by other activations).
136 |             b (tf.Variable): PELU `b` parameter (ignored by other activations).
137 |         """
138 |         if self.activation == 'relu':
139 |             return tf.nn.relu(z)
140 |         elif self.activation == 'tanh':
141 |             return tf.nn.tanh(z)
142 |         elif self.activation == 'leakyrelu':
143 |             return tf.nn.relu(z) - 0.1 * tf.nn.relu(-z)
144 |         elif self.activation == 'pelu':
145 |             positive = tf.nn.relu(z) * a / (b + 1e-9)
146 |             negative = a * (tf.exp((-tf.nn.relu(-z)) / (b + 1e-9)) - 1)
147 |             return negative + positive
148 |         elif self.activation == 'elu':
149 |             return tf.nn.elu(z)
150 |         else:
151 |             raise NotImplementedError('Only RELU, LEAKYRELU, ELU, TANH, and PELU activations are supported')
152 |
153 | def fit(self, X_train, y_train, X_val, y_val):
154 |
155 | with tf.Graph().as_default():
156 |
157 | with tf.Session() as sess:
158 | print("=== Training PFC===")
159 | ckpt = tf.train.get_checkpoint_state(self.ckpt_path+'/') # get latest checkpoint (if any)
160 | if ckpt and ckpt.model_checkpoint_path:
161 | # if checkpoint exists, restore the parameters and set epoch_n and i_iter
162 | saver = tf.train.import_meta_graph(ckpt.model_checkpoint_path+'.meta',
163 | clear_devices=True)
164 | saver.restore(sess, ckpt.model_checkpoint_path)
165 | inputs = tf.get_collection('inputs')[0]
166 | outputs = tf.get_collection('outputs')[0]
167 | training = tf.get_collection('training')[0]
168 | class_weights = tf.get_collection('class_weights')[0]
169 | train_step = tf.get_collection('train_step')[0]
170 | loss_t = tf.get_collection('loss')[0]
171 | loss_per_t = tf.get_collection('loss_per_t')[0]
172 | learning_rate = tf.get_collection('learning_rate')[0]
173 | else:
174 | raise FileNotFoundError('Cannot Find Checkpoint')
175 |
176 | #Compute initial val_loss
177 | best_loss = 1000000.0
178 | val_loss_arr = np.zeros((X_val.shape[0]), np.float32)
179 |
180 |
181 | lr = self.starter_learning_rate
182 |
183 | pc = 0 # patience counter
184 | lrc = 0 # learning rate patience counter
185 |
186 |                 num_classes = np.max(y_train)+1
187 |                 if self.mu is None:
188 |                     cw = np.ones(num_classes, dtype=np.float32)
189 |                 else:
190 |                     # Class weights: mu * log(N / class count), floored at 1
191 |                     cw = np.bincount(y_train, minlength=num_classes)
192 |                     cw = np.sum(cw)/cw
193 |                     cw[np.isinf(cw)] = 0
194 |                     cw = self.mu * np.log(cw)
195 |                     cw[cw < 1] = 1
196 |
197 | t_msg1 = '\rEpoch %d/%d (%d/%d) -- lr=%1.0e -- loss: %1.2f'
198 | v_msg1 = ' -- val=(%1.2f)'
199 |
200 | train_samples = X_train.shape[0]
201 | val_samples = X_val.shape[0]
202 | for e in range(self.num_epochs):
203 | train_cma = CMA()
204 |
205 | idx = np.arange(X_train.shape[0])
206 | np.random.shuffle(idx)
207 |
208 | c=0
209 | for i in range(0 , train_samples, self.batch_size):
210 |
211 | _, loss0 = sess.run([train_step, loss_t],
212 | feed_dict={inputs: X_train[idx[i:i+self.batch_size]],
213 | outputs: y_train[idx[i:i+self.batch_size]],
214 | training: True,
215 | learning_rate: lr,
216 | class_weights:cw})
217 |
218 | train_cma.update(loss0)
219 | print(t_msg1 % (e+1,self.num_epochs,i+1,train_samples,lr,train_cma.avg), end="")
220 |
221 |
222 | c+=1
223 | if math.isnan(loss0):
224 | break
225 | #Calculate validation loss/acc
226 | eval_mb = 256
227 | for i in range(0, val_samples, eval_mb):
228 | start = np.min([i, val_samples - eval_mb])
229 | end = np.min([i + eval_mb, val_samples])
230 | val_loss_arr[start:end] = sess.run(loss_per_t,
231 | feed_dict={inputs: X_val[start:end],
232 | outputs: y_val[start:end],
233 | training: False,
234 | class_weights:cw})
235 | val_loss = np.mean(val_loss_arr)
236 | print(v_msg1 % val_loss)
237 |
238 | if val_loss < best_loss and not math.isnan(train_cma.avg):
239 | saver.save(sess, self.ckpt_path+'/model.ckpt',e)
240 | best_loss = val_loss
241 | print(' -- best_val_loss=%1.2f' % best_loss)
242 | pc = 0
243 | lrc=0
244 | else:
245 | pc += 1
246 | lrc += 1
247 | if pc > self.patience:
248 | break
249 | elif lrc >= self.lr_patience:
250 | lrc = 0
251 | lr /= 10.0
252 | print(' -- Patience=%d/%d' % (pc, self.patience))
253 |
254 | if math.isnan(train_cma.avg):
255 | break
256 |
257 | def predict(self, X):
258 |
259 | n = X.shape[0]
260 |
261 | with tf.Graph().as_default():
262 |
263 | with tf.Session() as sess:
264 |
265 | ckpt = tf.train.get_checkpoint_state(self.ckpt_path+'/') # get latest checkpoint (if any)
266 | if ckpt and ckpt.model_checkpoint_path:
267 | # if checkpoint exists, restore the parameters and set epoch_n and i_iter
268 | saver = tf.train.import_meta_graph(ckpt.model_checkpoint_path+'.meta',
269 | clear_devices=True)
270 | saver.restore(sess, ckpt.model_checkpoint_path)
271 | inputs = tf.get_collection('inputs')[0]
272 | outputs = tf.get_collection('outputs')[0]
273 | training = tf.get_collection('training')[0]
274 | pred = tf.get_collection('pred')[0]
275 | else:
276 | raise FileNotFoundError('No checkpoint loaded')
277 |
278 | prediction = np.zeros(n, dtype=np.int32)
279 |
280 |                 for i in range(0, n, self.batch_size):
281 |                     start = np.min([i, n - self.batch_size])
282 |                     end = np.min([i + self.batch_size, n])
283 |
284 |                     X_test = X[start:end]
285 |
286 |                     # `pred` depends only on `inputs` and `training`,
287 |                     # so no labels need to be fed here.
288 |                     prediction[start:end] = sess.run(pred,
289 |                                                      feed_dict={inputs: X_test,
290 |                                                                 training: False})
291 |
292 | return prediction
293 |
294 | def predict_proba(self, X):
295 |
296 | n = X.shape[0]
297 |
298 | with tf.Graph().as_default():
299 |
300 | with tf.Session() as sess:
301 | ckpt = tf.train.get_checkpoint_state(self.ckpt_path+'/') # get latest checkpoint (if any)
302 | if ckpt and ckpt.model_checkpoint_path:
303 | # if checkpoint exists, restore the parameters and set epoch_n and i_iter
304 | saver = tf.train.import_meta_graph(ckpt.model_checkpoint_path+'.meta',
305 | clear_devices=True)
306 | saver.restore(sess, ckpt.model_checkpoint_path)
307 | inputs = tf.get_collection('inputs')[0]
308 | training = tf.get_collection('training')[0]
309 | proba = tf.get_collection('y')[0]
310 | else:
311 | raise FileNotFoundError('No checkpoint loaded')
312 |
313 | prediction = np.zeros((n, self.layer_sizes[-1]), dtype=np.float32)
314 |
315 | for i in range(0, n, self.batch_size):
316 | start = np.min([i, n - self.batch_size])
317 | end = np.min([i + self.batch_size, n])
318 |
319 | X_test = X[start:end]
320 |
321 | prediction[start:end] = sess.run(proba,
322 | feed_dict={inputs: X_test,
323 | training: False})
324 |
325 | return prediction
326 |
327 |
328 |
--------------------------------------------------------------------------------
/classifier/rf_workflow.py:
--------------------------------------------------------------------------------
1 | """
2 | @author: ubg9540
3 | """
4 |
5 | from sklearn.base import BaseEstimator, ClassifierMixin
6 | from sklearn.ensemble import RandomForestClassifier
7 | from sklearn.model_selection import GridSearchCV, PredefinedSplit
8 | from sklearn.preprocessing import StandardScaler
9 | from sklearn.decomposition import PCA
10 | import numpy as np
11 |
12 | class RandomForestWorkflow(BaseEstimator, ClassifierMixin):
13 |
14 | """
15 | RandomForest (Classifier)
16 |
17 | Args:
18 | n_jobs (int, optional): Number of jobs to run in parallel for both fit
19 | and predict. (Default: 8)
20 | verbosity (int, optional): Controls the verbosity of GridSearchCV: the
21 | higher the number, the more messages. (Default: 10)
22 | refit (bool, optional): Whether to refit during the fine-tune search.
23 | (Default: True)
24 |         scikit_args (dict, optional): Extra keyword arguments passed to RandomForestClassifier. (Default: {})
25 | """
26 |
27 | def __init__(self, n_jobs=8, verbosity = 10, refit=True, scikit_args={}):
28 | self.n_jobs = n_jobs
29 | self.verbose = verbosity
30 | self.refit = refit
31 | self.scikit_args = scikit_args
32 |
33 | def fit(self, train_data, train_labels, val_data, val_labels):
34 | """
35 | Fits to training data.
36 |
37 | Args:
38 | train_data (ndarray): Training data.
39 | train_labels (ndarray): Training labels.
40 | val_data (ndarray): Validation data.
41 | val_labels (ndarray): Validation labels.
42 | """
43 | split = np.append(-np.ones(train_labels.shape, dtype=np.float32),
44 | np.zeros(val_labels.shape, dtype=np.float32))
45 | ps = PredefinedSplit(split)
46 |
47 | sh = train_data.shape
48 | train_data = np.append(train_data, val_data , axis=0)
49 | train_labels = np.append(train_labels , val_labels, axis=0)
50 | del val_data, val_labels
51 |
52 | model = RandomForestClassifier(n_jobs=self.n_jobs,
53 | **self.scikit_args)
54 |
55 | params = {'n_estimators':np.arange(1,1001,50)}
56 | #Coarse search
57 | gs = GridSearchCV(model, params, refit=False, n_jobs=self.n_jobs,
58 | verbose=self.verbose, cv=ps)
59 | gs.fit(train_data, train_labels)
60 |
61 |         #Fine-Tune Search (lower bound clamped to 1: n_estimators must be positive)
62 |         params = {'n_estimators':np.arange(max(1, gs.best_params_['n_estimators']-50),
63 |                                            gs.best_params_['n_estimators']+50)}
64 |
65 |         self.gs = GridSearchCV(model, params, refit=self.refit, n_jobs=self.n_jobs,
66 |                                verbose=self.verbose, cv=ps)
67 |         self.gs.fit(train_data, train_labels)
68 |
69 |         if not self.refit:
70 |             model.set_params(n_estimators=self.gs.best_params_['n_estimators'])  # use the fine-tuned value
71 |             self.gs = model
72 |             self.gs.fit(train_data[:sh[0]], train_labels[:sh[0]])
73 |
74 | # return self.gs.fit(train_data, train_labels)
75 |
76 | def predict(self, test_data):
77 | """
78 | Performs classification on samples in test_data.
79 |
80 | Args:
81 | test_data (ndarray): Test data to be classified.
82 |
83 | Returns:
84 | ndarray of predicted class labels for test_data.
85 | """
86 | return self.gs.predict(test_data)
87 |
88 |
89 | def predict_proba(self, test_data):
90 | """
91 | Computes probabilities of possible outcomes for samples in test_data.
92 |
93 | Args:
94 | test_data (ndarray): Test data that will be used to compute
95 | probabilities.
96 |
97 | Returns:
98 | Array of the probability of the sample for each class in the
99 | model.
100 | """
101 | return self.gs.predict_proba(test_data)
--------------------------------------------------------------------------------
/classifier/ssmlp.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import math
3 | import os
4 | import numpy as np
5 | import tf_initializers
6 |
7 | class EMA(object):
8 | def __init__(self, decay=0.99):
9 | self.avg = None
10 | self.decay = decay
11 | self.history = []
12 |     def update(self, X):
13 |         if self.avg is None:  # first update; a running average of 0.0 must not reset
14 |             self.avg = X
15 |         else:
16 |             self.avg = self.avg * self.decay + (1-self.decay) * X
17 |         self.history.append(self.avg)
18 |
19 | class SSMLP(object):
20 |
21 | def __init__(self, layer_sizes, unsupervised_weights=None, num_epochs=150,
22 | batch_size=32, weight_decay=None,
23 | starter_learning_rate=2e-3, ckpt_path='checkpoints',
24 | patience=50,
25 | activation='pelu', unsupervised_cost_fn='mse',
26 | weight_initializer='glorot_normal',
27 | mu = 0.25, eval_batch_size=512):
28 |
29 | '''
30 | Attributes
31 | '''
32 | self.L = len(layer_sizes) - 1 # number of layers
33 | self.layer_sizes = layer_sizes
34 | self.weight_decay = weight_decay
35 | self.num_epochs = num_epochs
36 | self.batch_size = batch_size
37 | self.ckpt_path = ckpt_path
38 | self.starter_learning_rate = starter_learning_rate
39 | self.patience = patience
40 | self.lr_patience = int(patience/2)
41 | self.activation = activation
42 | self.unsup_cost_fn = unsupervised_cost_fn
43 | self.weight_initializer = tf_initializers.get(weight_initializer)
44 | self.mu = mu
45 | self.eval_batch_size = eval_batch_size
46 |
47 | if unsupervised_weights is None:
48 | self.unsup_weights = np.ones(len(layer_sizes)-2)
49 | else:
50 | self.unsup_weights = unsupervised_weights
51 |
52 | tf.reset_default_graph()
53 |
54 | inputs = tf.placeholder(tf.float32, shape=(None,self.layer_sizes[0]),name='inputs')
55 | outputs = tf.placeholder(tf.int32, name='outputs')
56 | class_weights = tf.placeholder(tf.float32, name='class_weights')
57 | learning_rate = tf.placeholder(tf.float32, name='learning_rate')
58 |
59 | training = tf.placeholder(tf.bool,name='training')
60 |
61 | shapes = list(zip(self.layer_sizes[:-1], self.layer_sizes[1:])) # shapes of linear layers
62 |
63 | weights = {'W': [self.weight_initializer(s, "W") for s in shapes],
64 | 'Wb':[self.bi(1.0, self.layer_sizes[l+1], "Wb") for l in range(self.L)],
65 | 'V': [self.weight_initializer(s[::-1], "V") for s in shapes],
66 | 'Vb':[self.bi(1.0, self.layer_sizes[l], "Vb") for l in range(self.L)],
67 | 'a1':[tf.Variable(tf.ones(1)) for l in range(self.L-1)],
68 | 'b1':[tf.Variable(tf.ones(1)) for l in range(self.L-1)],
69 | 'a2':[tf.Variable(tf.ones(1)) for l in range(self.L-1)],
70 | 'b2':[tf.Variable(tf.ones(1)) for l in range(self.L-1)]}
71 |
72 | tf.add_to_collection("inputs", inputs)
73 | tf.add_to_collection("outputs", outputs)
74 | tf.add_to_collection("training", training)
75 | tf.add_to_collection('learning_rate', learning_rate)
76 | tf.add_to_collection('class_weights', class_weights)
77 |
78 | eps = tf.constant(1e-10)
79 |
80 | #Encoder
81 | h = inputs
82 | encoder_layers = [inputs]
83 | for i in range(self.L-1):
84 | h = tf.matmul(h, weights['W'][i])
85 | h = tf.add(h, weights['Wb'][i])
86 | h = tf.layers.batch_normalization(h, training=training)
87 |
88 | if i == self.L - 2:
89 | encoded_path = self.act(h, weights["a1"][i], weights["b1"][i])
90 | h = tf.matmul(encoded_path, weights['W'][self.L-1])
91 | h = tf.add(h, weights['Wb'][self.L-1])
92 | y = tf.nn.softmax(h)
93 | else:
94 | h = self.act(h, weights["a1"][i], weights["b1"][i])
95 | encoder_layers.append(h)
96 |
97 | tf.add_to_collection("y", y)
98 |
99 | #Decoder
100 |
101 | h = encoded_path
102 | decoder_layers = []
103 | for i in range(self.L-2, -1, -1):
104 | h = tf.matmul(h, weights['V'][i])
105 | h = tf.add(h, weights['Vb'][i])
106 |
107 | h = tf.layers.batch_normalization(h, training=training)
108 |
109 | if i == 0:
110 | decoded_path = h
111 | decoder_layers.append(decoded_path)
112 | else:
113 | h = self.act(h, weights["a2"][i], weights["b2"][i])
114 | decoder_layers.append(h)
115 | h = tf.add(h, encoder_layers[i])
116 |
117 |
118 | decoder_layers = decoder_layers[::-1]
119 | u_cost = 0.0
120 | if self.unsup_cost_fn == 'mae':
121 | for i in range(len(encoder_layers)):
122 | u_cost += (self.unsup_weights[i] * tf.reduce_mean(tf.abs(encoder_layers[i] - decoder_layers[i]),1))
123 | else:
124 | for i in range(len(encoder_layers)):
125 | u_cost += (self.unsup_weights[i] * tf.reduce_mean(tf.square(encoder_layers[i] - decoder_layers[i]),1))
126 |
127 | # tf.add_to_collection("u_cost", u_cost)
128 |
129 | # clipped_output = tf.clip_by_value(outputs, 1e-10, 1.0)
130 | # cost = -tf.reduce_mean(tf.reduce_sum(outputs * tf.multiply(tf.log(y+eps),class_weights), 1)) # supervised cost
131 | # loss = cost + u_cost # total cost
132 |
133 | # tf.add_to_collection('cost', cost)
134 |
135 | # if self.weight_decay is not None:
136 | # weight_decay = tf.nn.l2_loss(weights['W'][0]) + tf.nn.l2_loss(weights['V'][0])
137 | # for i in range(1,self.L):
138 | # weight_decay = weight_decay + tf.nn.l2_loss(weights['W'][i]) + tf.nn.l2_loss(weights['V'][i])
139 | #
140 | # loss = loss + self.weight_decay * weight_decay # total cost
141 |
142 | # if self.activation == 'pelu':
143 | # loss = loss - 1000.0 * tf.minimum( tf.reduce_min(weights['a1']), 0)
144 | # loss = loss - 1000.0 * tf.minimum( tf.reduce_min(weights['b1']), 0)
145 | # loss = loss - 1000.0 * tf.minimum( tf.reduce_min(weights['a2']), 0)
146 | # loss = loss - 1000.0 * tf.minimum( tf.reduce_min(weights['b2']), 0)
147 |
148 | # tf.add_to_collection('loss', loss)
149 |
150 | one_hot = tf.cast(tf.one_hot(outputs, layer_sizes[-1]) , tf.float32)
151 | loss_per = -tf.reduce_sum(one_hot * tf.multiply(tf.log(y+eps),class_weights), 1) + u_cost# supervised cost
152 | loss = tf.reduce_mean(loss_per)
153 |
154 | reg_loss = 0.0
155 | if self.weight_decay is not None:
156 | for i in range(0,self.L):
157 | reg_loss = reg_loss + self.weight_decay*tf.nn.l2_loss(weights['W'][i])
158 | reg_loss = reg_loss + self.weight_decay*tf.nn.l2_loss(weights['V'][i])
159 |
160 | pelu_loss = 0.0
161 | if self.activation == 'pelu':
162 | pelu_loss = pelu_loss - 1000.0 * tf.minimum( tf.reduce_min(weights['a1']), 0)
163 | pelu_loss = pelu_loss - 1000.0 * tf.minimum( tf.reduce_min(weights['b1']), 0)
164 | pelu_loss = pelu_loss - 1000.0 * tf.minimum( tf.reduce_min(weights['a2']), 0)
165 | pelu_loss = pelu_loss - 1000.0 * tf.minimum( tf.reduce_min(weights['b2']), 0)
166 |
167 | loss = loss + reg_loss + pelu_loss
168 | loss_per = loss_per + reg_loss + pelu_loss
169 | tf.add_to_collection('loss_per_t', loss_per)
170 | tf.add_to_collection('loss_t', loss)
171 |
172 |
173 | pred = tf.argmax(y, 1)
174 | tf.add_to_collection('pred', pred)
175 |         truth = tf.cast(outputs, tf.int64) # labels are integer class indices, not one-hot
176 |         correct_prediction = tf.equal(tf.argmax(y, 1), truth) # no of correct predictions
177 | accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float")) * tf.constant(100.0)
178 | tf.add_to_collection('accuracy', accuracy)
179 |
180 | opt = tf.contrib.opt.NadamOptimizer(learning_rate)
181 | update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
182 | with tf.control_dependencies(update_ops):
183 | train_step = opt.minimize(loss)
184 |
185 | tf.add_to_collection('train_step', train_step)
186 |
187 | with tf.Session() as sess:
188 | saver = tf.train.Saver()
189 | sess.run(tf.global_variables_initializer())
190 | saver.save(sess, self.ckpt_path+'/model.ckpt', 0)
191 |
192 | def bi(self, inits, size, name):
193 | return tf.Variable(inits * tf.ones([size]), name=name)
194 |
195 | def wi(self, shape, name):
196 | return tf.Variable(tf.random_normal(shape, name=name)/ math.sqrt(shape[0]))
197 |
198 |     def act(self, z, a, b):
199 |         """Applies the activation function selected by `self.activation`.
200 |
201 |         Args:
202 |             z (tensor): data to apply activation function on.
203 |             a (tf.Variable): PELU `a` parameter (ignored by other activations).
204 |             b (tf.Variable): PELU `b` parameter (ignored by other activations).
205 |         """
206 |         if self.activation == 'relu':
207 |             return tf.nn.relu(z)
208 |         elif self.activation == 'tanh':
209 |             return tf.nn.tanh(z)
210 |         elif self.activation == 'leakyrelu':
211 |             return tf.nn.relu(z) - 0.1 * tf.nn.relu(-z)
212 |         elif self.activation == 'pelu':
213 |             positive = tf.nn.relu(z) * a / (b + 1e-9)
214 |             negative = a * (tf.exp((-tf.nn.relu(-z)) / (b + 1e-9)) - 1)
215 |             return negative + positive
216 |         elif self.activation == 'elu':
217 |             return tf.nn.elu(z)
218 |         else:
219 |             raise NotImplementedError('Only RELU, LEAKYRELU, ELU, TANH, and PELU activations are supported')
220 |
221 | def fit(self, X_train, y_train, X_val, y_val):
222 |
223 | if not os.path.exists(self.ckpt_path):
224 | os.makedirs(self.ckpt_path)
225 |
226 | with tf.Session() as sess:
227 | print("=== Training RonNet===")
228 | ckpt = tf.train.get_checkpoint_state(self.ckpt_path+'/') # get latest checkpoint (if any)
229 | if ckpt and ckpt.model_checkpoint_path:
230 | # if checkpoint exists, restore the parameters and set epoch_n and i_iter
231 | saver = tf.train.import_meta_graph(ckpt.model_checkpoint_path+'.meta',
232 | clear_devices=True)
233 | saver.restore(sess, ckpt.model_checkpoint_path)
234 | inputs = tf.get_collection('inputs')[0]
235 | outputs = tf.get_collection('outputs')[0]
236 | training = tf.get_collection('training')[0]
237 | class_weights = tf.get_collection('class_weights')[0]
238 | train_step = tf.get_collection('train_step')[0]
239 | loss_t = tf.get_collection('loss_t')[0]
240 | loss_per_t = tf.get_collection('loss_per_t')[0]
241 | learning_rate = tf.get_collection('learning_rate')[0]
242 | else:
243 | raise FileNotFoundError('Cannot Find Checkpoint')
244 |
245 | #Compute initial val_loss
246 | best_loss = 1000000.0
247 |
248 |
249 | lr = self.starter_learning_rate
250 |
251 | pc = 0 # patience counter
252 | lrc = 0 # learning rate patience counter
253 |
254 | num_classes= np.max(y_train)+1
255 | if self.mu is None:
256 | cw = np.ones(num_classes)
257 | else:
258 | cw = np.bincount(y_train, minlength=num_classes)
259 | cw = np.sum(cw)/cw
260 | cw[np.isinf(cw)] = 0
261 | cw = self.mu * np.log(cw)
262 | cw[cw < 1] = 1
263 |
264 | t_msg1 = '\rEpoch %d/%d (%d/%d) -- lr=%1.0e -- train=%1.2f'
265 | v_msg1 = ' -- val=%1.2f'
266 | val_loss_arr = np.zeros((X_val.shape[0]), np.float32)
267 | ema = EMA()
268 | val_samples = X_val.shape[0]
269 | idx = np.arange(X_train.shape[0])
270 | for e in range(self.num_epochs):
271 |
272 | np.random.shuffle(idx)
273 |
274 |
275 | for i in range(0, X_train.shape[0], self.batch_size):
276 | _, loss = sess.run([train_step, loss_t],
277 | feed_dict={inputs: X_train[idx[i:i+self.batch_size]],
278 | outputs: y_train[idx[i:i+self.batch_size]],
279 | training: True,
280 | learning_rate: lr,
281 | class_weights:cw})
282 | ema.update(loss)
283 | print(t_msg1 % (e+1,self.num_epochs,i+1,X_train.shape[0],lr, ema.avg), end="")
284 |
285 | #Calculate validation loss/acc
286 | eval_mb = 256
287 | for i in range(0, val_samples, eval_mb):
288 | start = np.min([i, val_samples - eval_mb])
289 | end = np.min([i + eval_mb, val_samples])
290 | val_loss_arr[start:end] = sess.run(loss_per_t,
291 | feed_dict={inputs: X_val[start:end],
292 | outputs: y_val[start:end],
293 | training: False,
294 | class_weights:cw})
295 | val_loss = np.mean(val_loss_arr)
296 |
297 |
298 | print(v_msg1 % (val_loss), end="")
299 |
300 | if val_loss < best_loss and not math.isnan(val_loss):
301 | saver.save(sess, self.ckpt_path+'/model.ckpt',e)
302 | best_loss = val_loss
303 | print(' -- best_val_loss=%1.2f' % best_loss)
304 | pc = 0
305 | lrc=0
306 | else:
307 | pc += 1
308 | lrc += 1
309 | if pc > self.patience:
310 | break
311 | elif lrc >= self.lr_patience:
312 | lrc = 0
313 | lr /= 10.0
314 | print(' -- Patience=%d/%d' % (pc, self.patience))
315 |
316 | if math.isnan(val_loss):
317 | break
318 |
319 | def predict(self, X):
320 |
321 | n = X.shape[0]
322 |
323 | with tf.Graph().as_default():
324 |
325 | with tf.Session() as sess:
326 |
327 | ckpt = tf.train.get_checkpoint_state(self.ckpt_path+'/') # get latest checkpoint (if any)
328 | if ckpt and ckpt.model_checkpoint_path:
329 | # if a checkpoint exists, restore the graph and its parameters
330 | saver = tf.train.import_meta_graph(ckpt.model_checkpoint_path+'.meta',
331 | clear_devices=True)
332 | saver.restore(sess, ckpt.model_checkpoint_path)
333 | inputs = tf.get_collection('inputs')[0]
334 | outputs = tf.get_collection('outputs')[0]
335 | training = tf.get_collection('training')[0]
336 | pred = tf.get_collection('pred')[0]
337 | else:
338 | raise FileNotFoundError('No checkpoint loaded')
339 |
340 | prediction = np.zeros(n, dtype=np.int32)
341 |
342 | for i in range(0, n, self.batch_size):
343 | start = np.min([i, n - self.batch_size])
344 | end = np.min([i + self.batch_size, n])
345 |
346 | X_test = X[start:end]
347 | y_test = X[start:end]
348 |
349 | prediction[start:end] = sess.run(pred,
350 | feed_dict={inputs: X_test,
351 | outputs: y_test,
352 | training: False})
353 |
354 | return prediction
355 |
356 | def predict_proba(self, X):
357 |
358 | n = X.shape[0]
359 |
360 | with tf.Graph().as_default():
361 |
362 | with tf.Session() as sess:
363 | ckpt = tf.train.get_checkpoint_state(self.ckpt_path+'/') # get latest checkpoint (if any)
364 | if ckpt and ckpt.model_checkpoint_path:
365 | # if a checkpoint exists, restore the graph and its parameters
366 | saver = tf.train.import_meta_graph(ckpt.model_checkpoint_path+'.meta',
367 | clear_devices=True)
368 | saver.restore(sess, ckpt.model_checkpoint_path)
369 | inputs = tf.get_collection('inputs')[0]
370 | training = tf.get_collection('training')[0]
371 | proba = tf.get_collection('y')[0]
372 | else:
373 | raise FileNotFoundError('No checkpoint loaded')
374 |
375 | prediction = np.zeros((n, self.layer_sizes[-1]), dtype=np.float32)
376 |
377 | for i in range(0, n, self.batch_size):
378 | start = np.min([i, n - self.batch_size])
379 | end = np.min([i + self.batch_size, n])
380 |
381 | X_test = X[start:end]
382 |
383 | prediction[start:end] = sess.run(proba,
384 | feed_dict={inputs: X_test,
385 | training: False})
386 |
387 | return prediction
388 |
389 |
390 |
391 |
--------------------------------------------------------------------------------
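The class-weighting step in the training loop above (inverse class frequency, compressed by a log, scaled by `mu`, and floored at 1) can be tried in isolation. This is a minimal NumPy sketch rather than the repository's code: the function name `log_class_weights` and its arguments are hypothetical, and it assumes integer labels starting at 0 and `mu > 0`.

```python
import numpy as np


def log_class_weights(y, mu=None, num_classes=None):
    """Hypothetical helper: log-scaled inverse-frequency class weights."""
    y = np.asarray(y, dtype=np.int64)
    if num_classes is None:
        num_classes = int(y.max()) + 1
    if mu is None:
        # No re-weighting requested: every class counts equally.
        return np.ones(num_classes)

    counts = np.bincount(y, minlength=num_classes).astype(np.float64)
    weights = np.ones(num_classes)
    present = counts > 0
    # Inverse frequency (total / count), compressed by a log and scaled by mu.
    weights[present] = mu * np.log(counts.sum() / counts[present])
    # Floor at 1 so frequent (or absent) classes are never down-weighted.
    return np.maximum(weights, 1.0)


labels = np.array([0] * 90 + [1] * 9 + [2] * 1)
weights = log_class_weights(labels, mu=1.0, num_classes=4)
print(weights)
```

Handling absent classes through a mask avoids the divide-by-zero and `log(0)` warnings that the in-place version triggers, while giving those classes the same weight of 1.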
/classifier/svm_cv_workflow.py:
--------------------------------------------------------------------------------
1 | """
2 | Name: svm_cv_workflow.py
3 | Author: Ronald Kemker
4 | Description: Classification pipeline for support vector machine which uses
5 | k-fold cross-validation
6 | Note:
7 | Requires scikit-learn
8 | """
9 |
10 | from sklearn.svm import LinearSVC
11 | from sklearn.svm import SVC
12 | from sklearn.model_selection import GridSearchCV
13 | from sklearn.preprocessing import StandardScaler
14 | from sklearn.decomposition import PCA
15 | import numpy as np
16 |
17 | class SVMWorkflow():
18 |
19 | def __init__(self, kernel='linear', n_jobs=8, probability=False,
20 | verbosity = 0, cv=3, max_iter=1000, scikit_args={}):
21 | self.kernel = kernel
22 | self.n_jobs = n_jobs
23 | self.probability = probability
24 | self.verbose = verbosity
25 | self.cv = cv
26 | self.max_iter = max_iter
27 | self.scikit_args = scikit_args
28 |
29 |
30 | def fit(self, train_data, train_labels, X_val=None, y_val=None):
31 |
32 | if self.kernel == 'linear':
33 | if self.probability:
34 | clf = SVC(kernel='linear', class_weight='balanced',
35 | random_state=6, decision_function_shape='ovr',
36 | max_iter=self.max_iter, probability=self.probability,
37 | **self.scikit_args)
38 | else:
39 | clf = LinearSVC(class_weight='balanced', dual=False,
40 | random_state=6, multi_class='ovr',
41 | max_iter=self.max_iter, **self.scikit_args)
42 | params = {'C': 2.0**np.arange(-9,16,2,dtype=float)}
43 |
44 | elif self.kernel == 'rbf':
45 | clf = SVC(random_state=6, class_weight='balanced', cache_size=8000,
46 | decision_function_shape='ovr',max_iter=self.max_iter,
47 | tol=1e-4, probability=self.probability, **self.scikit_args)
48 | params = {'C': 2.0**np.arange(-9,16,2,dtype=float),
49 | 'gamma': 2.0**np.arange(-15,4,2,dtype=float)}
50 |
51 | #Coarse search
52 | gs = GridSearchCV(clf, params, refit=False, n_jobs=self.n_jobs,
53 | verbose=self.verbose, cv=self.cv)
54 | gs.fit(train_data, train_labels)
55 |
56 | #Fine-Tune Search
57 | if self.kernel == 'linear':
58 | best_C = np.log2(gs.best_params_['C'])
59 | params = {'C': 2.0**np.linspace(best_C-2,best_C+2,10,
60 | dtype=float)}
61 | elif self.kernel == 'rbf':
62 | best_C = np.log2(gs.best_params_['C'])
63 | best_G = np.log2(gs.best_params_['gamma'])
64 | params = {'C': 2.0**np.linspace(best_C-2,best_C+2,10,
65 | dtype=float),
66 | 'gamma': 2.0**np.linspace(best_G-2,best_G+2,10,
67 | dtype=float)}
68 |
69 | self.gs = GridSearchCV(clf, params, refit=True, n_jobs=self.n_jobs,
70 | verbose=self.verbose, cv=self.cv)
71 |
72 | self.gs.fit(train_data, train_labels)
73 |
74 | return 1
75 |
76 |
77 | def predict(self, test_data):
78 | return self.gs.predict(test_data)
79 |
80 | def predict_proba(self, test_data):
81 | return self.gs.predict_proba(test_data)
82 |
--------------------------------------------------------------------------------
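`SVMWorkflow.fit` above tunes hyperparameters in two passes: a coarse grid over odd powers of two, then a finer grid of 10 points spanning two octaves on either side of the coarse optimum. The sketch below reproduces only the grid arithmetic with NumPy. The helper names `coarse_grid` and `fine_grid` are hypothetical and not part of the repository.

```python
import numpy as np


def coarse_grid(lo_exp=-9, hi_exp=16, step=2):
    """Powers of two from 2**lo_exp up to (but excluding) 2**hi_exp."""
    return 2.0 ** np.arange(lo_exp, hi_exp, step, dtype=float)


def fine_grid(best_value, half_width=2.0, num=10):
    """Log-spaced grid centred on the best value found by the coarse pass."""
    center = np.log2(best_value)
    return 2.0 ** np.linspace(center - half_width, center + half_width, num)


c_coarse = coarse_grid()            # candidate values for C
c_fine = fine_grid(best_value=8.0)  # refine around a coarse optimum of C = 8
print(c_coarse)
print(c_fine)
```

Searching in log2 space keeps the refinement symmetric around the coarse optimum: the fine grid reaches a factor of four below and above it.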
/classifier/svm_workflow.py:
--------------------------------------------------------------------------------
1 | """
2 | Name: svm_workflow.py
3 | Author: Ronald Kemker
4 | Description: Classification pipeline for support vector machine where train and
5 | validation folds are pre-defined.
6 | Note:
7 | Requires scikit-learn
8 | """
9 |
10 |
11 | from sklearn.svm import LinearSVC
12 | from sklearn.svm import SVC
13 | from sklearn.model_selection import GridSearchCV, PredefinedSplit
14 | import numpy as np
15 | from sklearn.base import BaseEstimator, ClassifierMixin
16 |
17 | class SVMWorkflow(BaseEstimator, ClassifierMixin):
18 |
19 | """
20 | Support Vector Machine (Classifier)
21 |
22 | Args:
23 | kernel(str, optional): Type of SVM kernel. Can be either 'linear' or
24 | 'rbf'. (Default: 'linear')
25 | n_jobs(int, optional): Number of jobs to run in parallel for
26 | GridSearchCV. (Default: 8)
27 | verbosity(int, optional): Controls the verbosity of GridSearchCV: the
28 | higher the number, the more messages. (Default: 0)
29 | probability(bool, optional): Whether to enable probability estimates.
30 | (Default: False)
31 | refit(bool, optional): For GridSearchCV. Whether to refit the best
32 | estimator with the entire dataset. (Default: True)
33 | scikit_args(dict, optional): Extra keyword arguments for the scikit-learn SVM. (Default: {})
34 | """
35 |
36 | def __init__(self, kernel='linear', n_jobs=8, verbosity = 0,
37 | probability=False, refit=True, scikit_args={}):
38 | self.kernel = kernel
39 | self.n_jobs = n_jobs
40 | self.verbosity = verbosity
41 | self.probability = probability
42 | self.scikit_args = scikit_args
43 | self.refit = refit
44 |
45 | def fit(self, train_data, train_labels, val_data, val_labels):
46 | """
47 | Fits to training data.
48 |
49 | Args:
50 | train_data (ndarray): Training data.
51 | train_labels (ndarray): Training labels.
52 | val_data (ndarray): Validation data.
53 | val_labels (ndarray): Validation labels.
54 | """
55 | split = np.append(-np.ones(train_labels.shape, dtype=np.float32),
56 | np.zeros(val_labels.shape, dtype=np.float32))
57 | ps = PredefinedSplit(split)
58 |
59 | sh = train_data.shape
60 | train_data = np.append(train_data, val_data , axis=0)
61 | train_labels = np.append(train_labels , val_labels, axis=0)
62 | del val_data, val_labels
63 |
64 | if self.kernel == 'linear':
65 | if self.probability:
66 | clf = SVC(kernel='linear', class_weight='balanced',
67 | random_state=6, decision_function_shape='ovr',
68 | max_iter=1000, probability=self.probability,
69 | **self.scikit_args)
70 | else:
71 | clf = LinearSVC(class_weight='balanced', dual=False,
72 | random_state=6, multi_class='ovr',
73 | max_iter=1000, **self.scikit_args)
74 |
75 | #Cross-validate over these parameters
76 | params = {'C': 2.0**np.arange(-9,16,2,dtype=float)}
77 | elif self.kernel == 'rbf':
78 | clf = SVC(random_state=6, class_weight='balanced', cache_size=16000,
79 | decision_function_shape='ovr',max_iter=1000, tol=1e-4,
80 | probability=self.probability, **self.scikit_args)
81 | params = {'C': 2.0**np.arange(-9,16,2,dtype=float),
82 | 'gamma': 2.0**np.arange(-15,4,2,dtype=float)}
83 |
84 | #Coarse search
85 | gs = GridSearchCV(clf, params, refit=False, n_jobs=self.n_jobs,
86 | verbose=self.verbosity, cv=ps)
87 | gs.fit(train_data, train_labels)
88 |
89 | #Fine-Tune Search
90 | if self.kernel == 'linear':
91 | best_C = np.log2(gs.best_params_['C'])
92 | params = {'C': 2.0**np.linspace(best_C-2,best_C+2,10,
93 | dtype=float)}
94 | elif self.kernel == 'rbf':
95 | best_C = np.log2(gs.best_params_['C'])
96 | best_G = np.log2(gs.best_params_['gamma'])
97 | params = {'C': 2.0**np.linspace(best_C-2,best_C+2,10,
98 | dtype=float),
99 | 'gamma': 2.0**np.linspace(best_G-2,best_G+2,10,
100 | dtype=float)}
101 |
102 | self.gs = GridSearchCV(clf, params, refit=self.refit, n_jobs=self.n_jobs,
103 | verbose=self.verbosity, cv=ps)
104 | self.gs.fit(train_data, train_labels)
105 |
106 | if not self.refit:
107 | clf.set_params(C=self.gs.best_params_['C']) # use the fine-tuned search result
108 | if self.kernel == 'rbf':
109 | clf.set_params(gamma=self.gs.best_params_['gamma'])
110 | self.gs = clf
111 | self.gs.fit(train_data[:sh[0]], train_labels[:sh[0]])
112 |
113 |
114 | def predict(self, test_data):
115 | """
116 | Performs classification on samples in test_data.
117 |
118 | Args:
119 | test_data (ndarray): Test data to be classified.
120 |
121 | Returns:
122 | ndarray of predicted class labels for test_data.
123 | """
124 | return self.gs.predict(test_data)
125 |
126 | def predict_proba(self, test_data):
127 | """
128 | Computes probabilities of possible outcomes for samples in test_data.
129 |
130 | Args:
131 | test_data (ndarray): Test data that will be used to compute
132 | probabilities.
133 |
134 | Returns:
135 | Array of the probability of the sample for each class in the
136 | model.
137 | """
138 | return self.gs.predict_proba(test_data)
139 |
140 |
--------------------------------------------------------------------------------
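`fit` above encodes the train/validation split as a single fold vector: `-1` marks samples that always stay in the training set, and `0` marks the samples of the one validation fold, which is the convention `PredefinedSplit` expects. A minimal NumPy sketch of that encoding follows. `make_test_fold` is a hypothetical helper, and the index recovery only mimics what the splitter yields rather than calling scikit-learn.

```python
import numpy as np


def make_test_fold(n_train, n_val):
    """Hypothetical helper: -1 = never held out, 0 = member of validation fold 0."""
    return np.concatenate([-np.ones(n_train, dtype=int),
                           np.zeros(n_val, dtype=int)])


# Training samples come first, validation samples are appended after them,
# matching the order in which the data arrays are concatenated.
test_fold = make_test_fold(n_train=5, n_val=3)
train_idx = np.flatnonzero(test_fold == -1)
val_idx = np.flatnonzero(test_fold == 0)
print(test_fold, train_idx, val_idx)
```

Because only fold `0` exists, a grid search driven by this vector scores every candidate on the same fixed validation set instead of rotating through k folds.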
/examples/example_pipeline.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # -*- coding: utf-8 -*-
3 | """
4 | Created on Thu Mar 29 08:23:03 2018
5 |
6 | @author: Ronald Kemker and Utsav Gewali
7 | """
8 |
9 | from fileIO.utils import read
10 | import numpy as np
11 | from metrics import Metrics, Timer
12 | import os
13 | from utils.fileIO import readENVIHeader
14 | from pipeline import Pipeline
15 | from preprocessing import ImagePreprocessor
16 | from utils.train_val_split import train_val_split
17 | from feature_extraction.scae import SCAE
18 | from classifier.svm_workflow import SVMWorkflow
19 | from classifier.rf_workflow import RandomForestWorkflow
20 |
21 | gpu_id = "0"
22 | dataset = 'ip' # dataset choices for this example are 'ip' and 'paviau'
23 | fe_type = 'scae' #feature extractor options are 'scae' and 'smcae'
24 | os.environ['CUDA_VISIBLE_DEVICES'] = gpu_id
25 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = "3"
26 | root = '/home/rmk6217/Documents/EarthMapper/'
27 | fe_path = root + 'feature_extraction/pretrained/' + fe_type
28 | data_path = root + 'datasets/indian_pines/'
29 |
30 | #Load Indian Pines or Pavia University dataset
31 | if dataset == 'paviau':
32 | data_path = root + 'datasets/pavia_university/'
33 | X = np.float32(read(data_path+'pavia_ds/pavia_ds.tif').transpose(1,2,0))[:,:340]
34 | y = np.int32(read(data_path+'PaviaU_gt.mat')['paviaU_gt']).ravel()
35 | source_bands, source_fwhm = readENVIHeader(data_path + 'University.hdr')
36 | source_bands *= 1e3
37 | source_fwhm *= 1e3
38 | num_classes = 9
39 | padding = ((3,3),(2,2),(0,0))
40 |
41 | elif dataset == 'ip':
42 | data_path = root + 'datasets/indian_pines/'
43 | X = read(data_path + 'indianpines_ds.tif').transpose(1,2,0)
44 | X = np.float32(X)
45 | y = read(data_path + 'indianPinesGT.mat')['indian_pines_gt']
46 | y = y.ravel()
47 | source_bands, source_fwhm = readENVIHeader(data_path + 'indianpines_ds_raw.hdr')
48 | num_classes = 16
49 | padding = ((4,3),(4,3),(0,0))
50 |
51 | else:
52 | raise ValueError('Invalid dataset %s: Valid entries include "ip", "paviau"' % dataset)
53 |
54 |
55 | #Create a random training/validation split
56 | y_train, y_val = train_val_split(y, 15 , 35)
57 |
58 | #Load Spatial-Spectral Feature Extractor
59 | fe = []
60 | target_bands = []
61 | target_fwhm = []
62 | fe_sensors = ['aviris'] # This can include 'aviris', 'hyperion', and 'gliht'
63 |
64 | for ds in fe_sensors:
65 | mat = read(fe_path + '/%s/%s_bands.mat' % (ds,fe_type))
66 | fe.append(SCAE(ckpt_path=fe_path + '/%s' % ds, nb_cae=5))
67 | target_bands.append(mat['bands'][0])
68 | target_fwhm.append(mat['fwhm'][0])
69 |
70 | #This pre-processes the data prior to passing it to a feature extractor
71 | #This can be a string or an ImagePreprocessor object
72 | pre_processor = ['StandardScaler', 'MinMaxScaler']
73 |
74 | #This pre-processes the extracted features prior to passing them to a classifier
75 | #This can be a string or an ImagePreprocessor object
76 | feature_scaler = ['StandardScaler', 'MinMaxScaler',
77 | ImagePreprocessor(mode='PCA', PCA_components=0.99)]
78 |
79 | #Classifier - SVM-RBF (probability=True required for CRF post-processor)
80 | clf = SVMWorkflow(probability=True, kernel='rbf')
81 | #clf = RandomForestWorkflow()
82 |
83 | #The classification pipeline
84 | pipe = Pipeline(pre_processor=pre_processor,
85 | feature_extractor=fe,
86 | feature_spatial_padding = padding,
87 | feature_scaler=feature_scaler,
88 | classifier=clf, source_bands=source_bands,
89 | source_fwhm=source_fwhm,
90 | target_bands=target_bands, target_fwhm=target_fwhm,
91 | post_processor='CRF')
92 |
93 | #Fit Data
94 | with Timer('Pipeline Fit Timer'):
95 | pipe.fit(X , y_train, X, y_val)
96 |
97 | #Generate Prediction
98 | with Timer('Pipeline Predict Timer'):
99 | pred = pipe.predict(X)
100 |
101 | #Evaluate and Print Results
102 | m = Metrics(y[y>0]-1, pred[y>0])
103 | overall_accuracy, mean_class_accuracy, kappa_statistic = m.standard_metrics()
104 | confusion_matrix = m.c
105 | print('Overall Accuracy: %1.2f%%' % (overall_accuracy * 100.0))
106 | print('Mean-Class Accuracy: %1.2f%%' % (mean_class_accuracy * 100.0))
107 | print('Kappa Statistic: %1.4f' % (kappa_statistic))
108 |
109 |
110 |
111 |
--------------------------------------------------------------------------------
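The evaluation step above drops unlabeled pixels (ground-truth value 0) and shifts the remaining labels to start at 0 before scoring. The repository's `Metrics` class is not shown in this section, so the following is only a sketch of the standard definitions of overall accuracy, mean-class accuracy, and Cohen's kappa computed from a confusion matrix. The function name `standard_metrics_sketch` and the toy arrays are assumptions made for illustration.

```python
import numpy as np


def standard_metrics_sketch(y_true, y_pred, num_classes):
    """Overall accuracy, mean-class accuracy and Cohen's kappa (textbook forms)."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(cm, (y_true, y_pred), 1)  # rows: truth, columns: prediction

    total = cm.sum()
    overall = np.trace(cm) / total

    row_sums = cm.sum(axis=1)
    seen = row_sums > 0  # skip classes with no ground-truth samples
    mean_class = np.mean(np.diag(cm)[seen] / row_sums[seen])

    # Agreement expected by chance, from the row and column marginals.
    expected = np.sum(cm.sum(axis=0) * row_sums) / total ** 2
    kappa = (overall - expected) / (1.0 - expected)
    return overall, mean_class, kappa


gt = np.array([0, 1, 1, 2, 2, 2, 0])    # 0 = unlabeled pixel
pred = np.array([1, 0, 1, 1, 1, 0, 0])  # zero-based class predictions
labeled = gt > 0
oa, aa, kappa = standard_metrics_sketch(gt[labeled] - 1, pred[labeled], 2)
print(oa, aa, kappa)
```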
/examples/example_scae_svm.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # -*- coding: utf-8 -*-
3 | """
4 | Created on Thu Mar 29 08:23:03 2018
5 |
6 | @author: Ronald Kemker and Utsav Gewali
7 | """
8 |
9 | from fileIO.utils import read
10 | import numpy as np
11 | from metrics import Metrics, Timer
12 | import os
13 | from utils.fileIO import readENVIHeader
14 | from pipeline import Pipeline
15 | from preprocessing import ImagePreprocessor, AveragePooling2D
16 | from utils.train_val_split import random_data_split, train_val_split
17 | from feature_extraction.scae import SCAE
18 | from classifier.svm_cv_workflow import SVMWorkflow
19 |
20 | np.random.seed(6)
21 | gpu_id = "0"
22 | dataset = 'paviau' # dataset choices for this example are 'ip' and 'paviau'
23 | fe_type = 'scae' #feature extractor options are 'scae' and 'smcae'
24 | os.environ['CUDA_VISIBLE_DEVICES'] = gpu_id
25 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = "3"
26 | root = '/home/rmk6217/Documents/EarthMapper/'
27 | fe_path = root + 'feature_extraction/pretrained/' + fe_type
28 | data_path = root + 'datasets/indian_pines/'
29 | num_trials = 30
30 |
31 | #Load Indian Pines or Pavia University dataset
32 | if dataset == 'paviau':
33 | data_path = root + 'datasets/pavia_university/'
34 | X = np.float32(read(data_path+'pavia_ds/pavia_ds.tif').transpose(1,2,0))[:,:340]
35 | y = np.int32(read(data_path+'PaviaU_gt.mat')['paviaU_gt']).ravel()
36 | source_bands, source_fwhm = readENVIHeader(data_path + 'University.hdr')
37 | source_bands *= 1e3
38 | source_fwhm *= 1e3
39 | num_classes = 9
40 | padding = ((3,3),(2,2),(0,0))
41 |
42 | elif dataset == 'ip':
43 | data_path = root + 'datasets/indian_pines/'
44 | X = read(data_path + 'indianpines_ds.tif').transpose(1,2,0)
45 | X = np.float32(X)
46 | y = read(data_path + 'indianPinesGT.mat')['indian_pines_gt']
47 | y = y.ravel()
48 | source_bands, source_fwhm = readENVIHeader(data_path + 'indianpines_ds_raw.hdr')
49 | num_classes = 16
50 | padding = ((4,3),(4,3),(0,0))
51 |
52 | else:
53 | raise ValueError('Invalid dataset %s: Valid entries include "ip", "paviau"' % dataset)
54 |
55 |
56 | oa_arr = np.zeros((num_trials, ))
57 | aa_arr = np.zeros((num_trials, ))
58 | kappa_arr = np.zeros((num_trials, ))
59 |
60 | for i in range(num_trials):
61 |
62 | #Create a random training/validation split
63 | y_train, y_val = random_data_split(y, train_samples=50, shuffle=True)
64 |
65 | #Load Spatial-Spectral Feature Extractor
66 | fe = []
67 | target_bands = []
68 | target_fwhm = []
69 | fe_sensors = ['hyperion'] # This can include 'aviris', 'hyperion', and 'gliht'
70 |
71 | for ds in fe_sensors:
72 | mat = read(fe_path + '/%s/%s_bands.mat' % (ds,fe_type))
73 | fe.append(SCAE(ckpt_path=fe_path + '/%s' % ds, nb_cae=3))
74 | target_bands.append(mat['bands'][0])
75 | target_fwhm.append(mat['fwhm'][0])
76 |
77 | #This pre-processes the data prior to passing it to a feature extractor
78 | #This can be a string or an ImagePreprocessor object
79 | pre_processor = ['StandardScaler']
80 |
81 | #This pre-processes the extracted features prior to passing them to a classifier
82 | #This can be a string or an ImagePreprocessor object
83 | feature_scaler = [AveragePooling2D(), 'StandardScaler', 'MinMaxScaler',
84 | ImagePreprocessor('PCA', whiten=False, PCA_components=0.999)]
85 |
86 | #Classifier - SVM-RBF
87 | # The SVM from svm_cv_workflow.py does NOT use the validation set - it does
88 | # N-fold cross-validation with the training set. This is the ONLY classifier
89 | # we have that does this.
90 | clf = SVMWorkflow(cv=7)
91 |
92 | #The classification pipeline
93 | pipe = Pipeline(pre_processor=pre_processor,
94 | feature_extractor=fe,
95 | feature_spatial_padding = padding,
96 | feature_scaler=feature_scaler,
97 | classifier=clf, source_bands=source_bands,
98 | source_fwhm=source_fwhm,
99 | target_bands=target_bands, target_fwhm=target_fwhm)
100 |
101 | #Fit Data
102 | with Timer('Pipeline Fit Timer'):
103 | pipe.fit(X, y_train, X, y_val)
104 |
105 | #Generate Prediction
106 | with Timer('Pipeline Predict Timer'):
107 | pred = pipe.predict(X)
108 |
109 | #Evaluate and Print Results
110 | m = Metrics(y[y>0]-1, pred[y>0])
111 | oa_arr[i], aa_arr[i], kappa_arr[i] = m.standard_metrics()
112 |
113 | print('Overall Accuracy: %1.2f+/-%1.2f' % (np.mean(oa_arr) * 100.0, np.std(oa_arr)*100.0))
114 | print('Mean-Class Accuracy: %1.2f+/-%1.2f' % (np.mean(aa_arr) * 100.0, np.std(aa_arr)*100.0))
115 | print('Kappa Statistic: %1.4f+/-%1.4f' % (np.mean(kappa_arr), np.std(kappa_arr)))
116 |
117 | #SCAE-Hyperion 99.9%
118 | #Overall Accuracy: 95.25+/-1.37
119 | #Mean-Class Accuracy: 97.05+/-0.54
120 | #Kappa Statistic: 0.9378+/-0.0175
--------------------------------------------------------------------------------
/feature_extraction/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rmkemker/EarthMapper/098392c405feebc79606c01b3c1c4bb48999f224/feature_extraction/__init__.py
--------------------------------------------------------------------------------
/feature_extraction/cae.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # -*- coding: utf-8 -*-
3 | """
4 | Created on Mon Oct 30 07:36:14 2017
5 |
6 | @author: rmk6217
7 | """
8 |
9 | import tensorflow as tf
10 | import numpy as np
11 | import os
12 | import math
13 |
14 |
15 | class DataSet(object):
16 |
17 | def __init__(self, images):
18 |
19 | self._num_examples = images.shape[0]
20 | self._images = images
21 | self._epochs_completed = 0
22 | self._index_in_epoch = 0
23 | self._index_array = np.arange(self._num_examples)
24 | np.random.shuffle(self._index_array)
25 |
26 | @property
27 | def images(self):
28 | return self._images
29 |
30 | @property
31 | def num_examples(self):
32 | return self._num_examples
33 |
34 |
35 | @property
36 | def epochs_completed(self):
37 | return self._epochs_completed
38 |
39 | def next_batch(self, batch_size):
40 | """Return the next `batch_size` examples from this data set."""
41 | start = self._index_in_epoch
42 | self._index_in_epoch += batch_size
43 | if self._index_in_epoch > self._num_examples:
44 | # Finished epoch
45 | self._epochs_completed += 1
46 | # Shuffle the data
47 | np.random.shuffle(self._index_array)
48 | # Start next epoch
49 | start = 0
50 | self._index_in_epoch = batch_size
51 | end = self._index_in_epoch
52 |
53 | if batch_size <= self._num_examples:
54 | return self._images[self._index_array[start:end]]
55 | else:
56 | return self._images
57 |
58 |
59 | def bn_relu(input_tensor, train_tensor, name_sub=''):
60 | bn = tf.layers.batch_normalization(input_tensor, fused=True, training=train_tensor,
61 | name='bn'+name_sub)
62 | return tf.nn.relu(bn, name='act'+name_sub)
63 |
64 | def bn_relu_conv(input_tensor, train_tensor, scope_id, num_filters,
65 | kernel_size=(3,3), strides=(1,1), l2_loss=0.0):
66 |
67 | with tf.variable_scope(scope_id):
68 | act= bn_relu(input_tensor, train_tensor)
69 | conv = tf.layers.conv2d(act, num_filters, kernel_size, strides=strides, padding='same',
70 | name='conv2d',
71 | kernel_initializer=tf.glorot_normal_initializer(),
72 | kernel_regularizer=tf.contrib.layers.l2_regularizer(l2_loss))
73 |
74 | return conv
75 |
76 | def mse_loss(x, y):
77 | diff = tf.squared_difference(x,y)
78 | return tf.reduce_mean(diff)
79 |
80 | def refinement_layer(lateral_tensor, vertical_tensor, train_tensor, num_filters,
81 | scope_id, l2_loss):
82 | with tf.variable_scope(scope_id):
83 |
84 | conv1 = tf.layers.conv2d(lateral_tensor, num_filters, kernel_size=(3,3),
85 | padding='same',name='conv2d_1',
86 | kernel_initializer=tf.glorot_normal_initializer(),
87 | kernel_regularizer=tf.contrib.layers.l2_regularizer(l2_loss))
88 | act1 = bn_relu(conv1, train_tensor, name_sub='1')
89 | conv2 = tf.layers.conv2d(act1, num_filters, kernel_size=(3,3),
90 | padding='same',name='conv2d_2',
91 | kernel_initializer=tf.glorot_normal_initializer(),
92 | kernel_regularizer=tf.contrib.layers.l2_regularizer(l2_loss))
93 | conv3 = tf.layers.conv2d(vertical_tensor, num_filters, kernel_size=(3,3),
94 | padding='same',name='conv2d_3',
95 | kernel_initializer=tf.glorot_normal_initializer(),
96 | kernel_regularizer=tf.contrib.layers.l2_regularizer(l2_loss))
97 | add = tf.add(conv2, conv3)
98 |
99 | old_shape = tf.shape(add)[1:3]
100 | up = tf.image.resize_nearest_neighbor(add, old_shape * 2 )
101 | act2 = bn_relu(up, train_tensor, name_sub='2')
102 | return act2
103 |
104 |
105 | def cae(N , input_tensor, train_tensor, nb_bands, cost_weights, l2_loss):
106 |
107 |
108 | conv1 = bn_relu_conv(input_tensor, train_tensor, scope_id='conv1',
109 | num_filters=N, l2_loss=l2_loss)
110 | pool1 = tf.layers.max_pooling2d(conv1, 2, 2, name='pool1')
111 |
112 | conv2 = bn_relu_conv(pool1, train_tensor, scope_id='conv2',
113 | num_filters=N*2, l2_loss=l2_loss)
114 |
115 | pool2 = tf.layers.max_pooling2d(conv2, 2, 2, name='pool2')
116 |
117 | conv3 = bn_relu_conv(pool2, train_tensor, scope_id='conv3',
118 | num_filters=N*2, l2_loss=l2_loss)
119 |
120 | pool3 = tf.layers.max_pooling2d(conv3, 2, 2, name='pool3')
121 |
122 | conv4 = bn_relu_conv(pool3, train_tensor, scope_id='conv4',
123 | num_filters=N*4, kernel_size=(1,1), strides=(1,1),
124 | l2_loss=l2_loss)
125 |
126 | refinement3 = refinement_layer(pool3, conv4, train_tensor, N*2,
127 | scope_id='refinement3', l2_loss=l2_loss)
128 |
129 |
130 | refinement2 = refinement_layer(pool2, refinement3, train_tensor, N*2,
131 | scope_id='refinement2', l2_loss=l2_loss)
132 |
133 | refinement1 = refinement_layer(pool1, refinement2, train_tensor, N,
134 | scope_id='refinement1', l2_loss=l2_loss)
135 |
136 | output_tensor = tf.layers.conv2d(refinement1, nb_bands, kernel_size=(1,1),
137 | padding='same',name='output_conv',
138 | kernel_initializer=tf.glorot_normal_initializer(),
139 | kernel_regularizer=tf.contrib.layers.l2_regularizer(l2_loss))
140 |
141 | loss = mse_loss(output_tensor,input_tensor)
142 |
143 | return loss, refinement1
144 |
145 |
146 | class CAE(object):
147 | """Convolutional Autoencoder with single- and multi-loss support.
148 |
149 | """
150 | def __init__(self, nb_bands, loss_weights, ckpt_path = 'checkpoints',
151 | num_epochs=1000, starting_learning_rate=2e-3, patience=10,
152 | weight_decay=0.0):
153 | """
154 | Args:
155 | nb_bands (int): Number of bands (dimensionality).
156 | loss_weights: Forwarded to `cae` as `cost_weights`; the single-loss
157 | graph built here does not use it.
158 | ckpt_path (str): Directory for model checkpoints. (Default: 'checkpoints')
159 | num_epochs (int): Maximum number of training epochs. (Default: 1000)
160 | starting_learning_rate (float): Initial learning rate.
161 | (Default: 2e-3)
162 | patience (int): Epochs without validation improvement tolerated
163 | before early stopping; the learning rate is divided by 10 after
164 | every int(patience/2) such epochs. (Default: 10)
165 | weight_decay (float): Scale of the L2 kernel regularizer. (Default: 0.0)
166 |
167 | """
168 | self.N = 64
169 | self.nb_bands = nb_bands
170 | self.ckpt_path = ckpt_path
171 | self.num_epochs = num_epochs
172 | self.starting_learning_rate = starting_learning_rate
173 | self.patience = patience
174 | self.lr_patience = int(patience/2)
175 |
176 | inputs = tf.placeholder(tf.float32, shape=(None, None,None, nb_bands),name='inputs')
177 | learning_rate = tf.placeholder(tf.float32, name='learning_rate')
178 | training = tf.placeholder(tf.bool,name='training')
179 |
180 | tf.add_to_collection("inputs", inputs)
181 | tf.add_to_collection("training", training)
182 | tf.add_to_collection('learning_rate', learning_rate)
183 |
184 | loss, hidden_output = cae(self.N , inputs, training, nb_bands, loss_weights, weight_decay)
185 | tf.add_to_collection('loss', loss)
186 | tf.add_to_collection('hidden_output', hidden_output)
187 |
188 | opt = tf.contrib.opt.NadamOptimizer(learning_rate)
189 | update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
190 | with tf.control_dependencies(update_ops):
191 | train_step = opt.minimize(loss)
192 | tf.add_to_collection('train_step', train_step)
193 |
194 | with tf.Session() as sess:
195 | saver = tf.train.Saver()
196 | sess.run(tf.global_variables_initializer())
197 | saver.save(sess, self.ckpt_path+'/model.ckpt', 0)
198 |
199 | def fit(self, X_train, X_val, batch_size=128):
200 |
201 | train_set = DataSet(X_train)
202 | val_set = DataSet(X_val)
203 |
204 | if not os.path.exists(self.ckpt_path):
205 | os.makedirs(self.ckpt_path)
206 |
207 | with tf.Session() as sess:
208 | ckpt = tf.train.get_checkpoint_state(self.ckpt_path+'/') # get latest checkpoint (if any)
209 | if ckpt and ckpt.model_checkpoint_path:
210 | # if a checkpoint exists, restore the graph and its parameters
211 | saver = tf.train.import_meta_graph(ckpt.model_checkpoint_path+'.meta',
212 | clear_devices=True)
213 | saver.restore(sess, ckpt.model_checkpoint_path)
214 | inputs = tf.get_collection('inputs')[0]
215 | training = tf.get_collection('training')[0]
216 | train_step = tf.get_collection('train_step')[0]
217 | loss = tf.get_collection('loss')[0]
218 | learning_rate = tf.get_collection('learning_rate')[0]
219 | else:
220 | raise FileNotFoundError('Cannot Find Checkpoint')
221 |
222 | iter_per_epoch = int(X_train.shape[0] / batch_size)
223 | val_iter = int(val_set.num_examples / batch_size) + 1
224 |
225 | best_loss = 100000.0
226 | loss_arr = np.zeros((iter_per_epoch),np.float32)
227 | lr = self.starting_learning_rate
228 |
229 | t_msg1 = '\rEpoch %d/%d (%d/%d) -- lr=%1.0e'
230 | t_msg2 = ' -- train=%1.2f'
231 | v_msg1 = ' -- val=%1.2f'
232 | pc = 0
233 | lrc = 0
234 | for e in range(self.num_epochs):
235 | for i in range(iter_per_epoch):
236 | print(t_msg1 % (e+1,self.num_epochs,i+1,iter_per_epoch,lr), end="")
237 | images = train_set.next_batch(batch_size)
238 | _, loss0 = sess.run([train_step, loss],
239 | feed_dict={inputs: images,
240 | training: True,
241 | learning_rate: lr})
242 | loss_arr[i] = loss0
243 | print(t_msg2 % (np.mean(loss_arr)), end="")
244 |
245 | #Calculate validation loss/acc
246 | val_losses = np.array([],dtype=np.float32)
247 | for i in range(val_iter):
248 | X = val_set.next_batch(batch_size)
249 | val_loss = sess.run([loss],
250 | feed_dict={inputs: X,
251 | training: False})
252 | val_losses = np.append(val_losses, val_loss)
253 | val_loss = np.mean(val_losses)
254 |
255 | print(v_msg1 % (val_loss), end="")
256 |
257 | if val_loss < best_loss and not math.isnan(val_loss):
258 | saver.save(sess, self.ckpt_path+'/model.ckpt',e)
259 | best_loss = val_loss
260 | print(' -- best_val_loss=%1.2f' % best_loss)
261 | pc = 0
262 | lrc=0
263 | else:
264 | pc += 1
265 | lrc += 1
266 | if pc > self.patience:
267 | break
268 | elif lrc >= self.lr_patience:
269 | lrc = 0
270 | lr /= 10.0
271 | print(' -- Patience=%d/%d' % (pc, self.patience))
272 |
273 | if math.isnan(val_loss):
274 | break
275 |
276 | def transform(self, X, batch_size=128):
277 |
278 | dataset = DataSet(X)
279 | n = dataset._num_examples
280 |
281 | if not os.path.exists(self.ckpt_path):
282 | os.makedirs(self.ckpt_path)
283 |
284 | with tf.Session() as sess:
285 | ckpt = tf.train.get_checkpoint_state(self.ckpt_path+'/') # get latest checkpoint (if any)
286 | if ckpt and ckpt.model_checkpoint_path:
287 | # if a checkpoint exists, restore the graph and its parameters
288 | saver = tf.train.import_meta_graph(ckpt.model_checkpoint_path+'.meta',
289 | clear_devices=True)
290 | saver.restore(sess, ckpt.model_checkpoint_path)
291 | inputs = tf.get_collection('inputs')[0]
292 | training = tf.get_collection('training')[0]
293 | hidden_output = tf.get_collection('hidden_output')[0]
294 | else:
295 | raise FileNotFoundError('Cannot Find Checkpoint')
296 |
297 | if n > batch_size:
298 |
299 | prediction = np.zeros((dataset._num_examples,X.shape[1], X.shape[2], self.N), dtype=np.float32)
300 |
301 | for i in range(0, n, batch_size):
302 | start = np.min([i, n - batch_size])
303 | end = np.min([i + batch_size, n])
304 |
305 | X_test = dataset.images[start:end]
306 |
307 | prediction[start:end] = sess.run(hidden_output,
308 | feed_dict={inputs: X_test,
309 | training: False})
310 |
311 | else:
312 | prediction = sess.run(hidden_output,
313 | feed_dict={inputs: X,
314 | training: False})
315 | return prediction
316 |
--------------------------------------------------------------------------------
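`CAE.fit` above uses a two-counter early-stopping rule: one counter of epochs without validation improvement stops training once it exceeds `patience`, and a second counter divides the learning rate by 10 every `lr_patience` such epochs. A minimal, framework-free sketch of that bookkeeping follows. The class name `PatienceTracker` and its interface are hypothetical, written only to make the rule easy to exercise in isolation. It starts from an infinite best loss, whereas the source starts from a large constant.

```python
import math


class PatienceTracker:
    """Hypothetical helper mirroring the early-stopping / LR-decay counters."""

    def __init__(self, lr, patience=10, lr_patience=None, factor=10.0):
        self.lr = lr
        self.patience = patience
        self.lr_patience = patience // 2 if lr_patience is None else lr_patience
        self.factor = factor
        self.best_loss = math.inf
        self.stale = 0     # epochs since the last improvement
        self.lr_stale = 0  # epochs since the last improvement or LR cut

    def step(self, val_loss):
        """Record one epoch; return True when training should stop."""
        if math.isnan(val_loss):
            return True  # a NaN loss ends training immediately
        if val_loss < self.best_loss:
            # Improvement: remember it and reset both counters.
            self.best_loss = val_loss
            self.stale = 0
            self.lr_stale = 0
            return False
        self.stale += 1
        self.lr_stale += 1
        if self.stale > self.patience:
            return True
        if self.lr_stale >= self.lr_patience:
            self.lr_stale = 0
            self.lr /= self.factor
        return False


tracker = PatienceTracker(lr=1e-2, patience=4, lr_patience=2)
losses = [1.0, 1.1, 1.2, 0.9, 1.0, 1.0, 1.0, 1.0, 1.0]
stopped_at = None
for epoch, loss in enumerate(losses):
    if tracker.step(loss):
        stopped_at = epoch
        break
print(stopped_at, tracker.lr, tracker.best_loss)
```

Keeping the counters in one object lets the rule be unit-tested without building a TensorFlow graph. In the training loop, the checkpoint save belongs in the improvement branch.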
/feature_extraction/mcae.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # -*- coding: utf-8 -*-
3 | """
4 | Created on Mon Oct 30 07:36:14 2017
5 |
6 | @author: rmk6217
7 | """
8 |
9 | import tensorflow as tf
10 | import numpy as np
11 | import os
12 | import math
13 |
14 |
15 | class EMA(object):
16 | def __init__(self, decay=0.99):
17 | self.avg = None
18 | self.decay = decay
19 | self.history = []
20 | def update(self, X):
21 | if self.avg is None:
22 | self.avg = X
23 | else:
24 | self.avg = self.avg * self.decay + (1-self.decay) * X
25 | self.history.append(self.avg)
26 |
27 | def pelu(input_tensor, a , b):
28 |
29 | positive = tf.nn.relu(input_tensor) * a / (b + 1e-9)
30 | negative = a * (tf.exp((-tf.nn.relu(-input_tensor)) / (b + 1e-9)) - 1)
31 | return (positive + negative)
32 |
33 | def bn_pelu(input_tensor, train_tensor, name_sub=''):
34 | bn = tf.layers.batch_normalization(input_tensor, fused=True, training=train_tensor,
35 | name='bn'+name_sub)
36 | a = tf.Variable(tf.ones(1), name='a'+name_sub)
37 | b = tf.Variable(tf.ones(1), name='b'+name_sub)
38 | return pelu(bn , a, b), a, b
39 |
40 | def bn_pelu_conv(input_tensor, train_tensor, scope_id, num_filters,
41 | kernel_size=(3,3), strides=(1,1), l2_loss=0.0):
42 |
43 | with tf.variable_scope(scope_id):
44 | act, a, b = bn_pelu(input_tensor, train_tensor)
45 |         conv = tf.layers.conv2d(act, num_filters, kernel_size, strides=strides,
46 |                                 padding='same', name='conv2d',
47 | kernel_initializer=tf.glorot_normal_initializer(),
48 | kernel_regularizer=tf.contrib.layers.l2_regularizer(l2_loss))
49 |
50 | return conv, a, b
51 |
52 | def mse_loss(x, y):
53 | diff = tf.squared_difference(x,y)
54 | return tf.reduce_mean(diff)
55 |
56 | def refinement_layer(lateral_tensor, vertical_tensor, train_tensor, num_filters,
57 | scope_id, l2_loss):
58 | with tf.variable_scope(scope_id):
59 |
60 | conv1 = tf.layers.conv2d(lateral_tensor, num_filters, kernel_size=(3,3),
61 | padding='same',name='conv2d_1',
62 | kernel_initializer=tf.glorot_normal_initializer(),
63 | kernel_regularizer=tf.contrib.layers.l2_regularizer(l2_loss))
64 | act1, a1, b1 = bn_pelu(conv1, train_tensor, name_sub='1')
65 | conv2 = tf.layers.conv2d(act1, num_filters, kernel_size=(3,3),
66 | padding='same',name='conv2d_2',
67 | kernel_initializer=tf.glorot_normal_initializer(),
68 | kernel_regularizer=tf.contrib.layers.l2_regularizer(l2_loss))
69 | conv3 = tf.layers.conv2d(vertical_tensor, num_filters, kernel_size=(3,3),
70 | padding='same',name='conv2d_3',
71 | kernel_initializer=tf.glorot_normal_initializer(),
72 | kernel_regularizer=tf.contrib.layers.l2_regularizer(l2_loss))
73 | add = tf.add(conv2, conv3)
74 |
75 | old_shape = tf.shape(add)[1:3]
76 | up = tf.image.resize_nearest_neighbor(add, old_shape * 2 )
77 | act2, a2, b2 = bn_pelu(up, train_tensor, name_sub='2')
78 | return act2, [a1, a2], [b1, b2]
79 |
80 |
81 | def mcae(input_tensor, train_tensor, nb_bands, cost_weights, l2_loss):
82 |
83 | a = []
84 | b = []
85 |
86 | conv1, a_t, b_t = bn_pelu_conv(input_tensor, train_tensor, scope_id='conv1',
87 | num_filters=256, l2_loss=l2_loss)
88 | a.append(a_t)
89 | b.append(b_t)
90 | pool1 = tf.layers.max_pooling2d(conv1, 2, 2, name='pool1')
91 |
92 | conv2, a_t, b_t = bn_pelu_conv(pool1, train_tensor, scope_id='conv2',
93 | num_filters=512, l2_loss=l2_loss)
94 | a.append(a_t)
95 | b.append(b_t)
96 | pool2 = tf.layers.max_pooling2d(conv2, 2, 2, name='pool2')
97 |
98 | conv3, a_t, b_t = bn_pelu_conv(pool2, train_tensor, scope_id='conv3',
99 | num_filters=512, l2_loss=l2_loss)
100 | a.append(a_t)
101 | b.append(b_t)
102 | pool3 = tf.layers.max_pooling2d(conv3, 2, 2, name='pool3')
103 |
104 | conv4, a_t, b_t = bn_pelu_conv(pool3, train_tensor, scope_id='conv4',
105 | num_filters=1024, kernel_size=(1,1), strides=(1,1),
106 | l2_loss=l2_loss)
107 | a.append(a_t)
108 | b.append(b_t)
109 |
110 | refinement3, a_t, b_t = refinement_layer(pool3, conv4, train_tensor, 512,
111 | scope_id='refinement3', l2_loss=l2_loss)
112 | a+=a_t
113 | b+=b_t
114 |
115 | refinement2, a_t, b_t = refinement_layer(pool2, refinement3, train_tensor, 512,
116 | scope_id='refinement2', l2_loss=l2_loss)
117 | a+=a_t
118 | b+=b_t
119 |
120 | refinement1, a_t, b_t = refinement_layer(pool1, refinement2, train_tensor, 256,
121 | scope_id='refinement1', l2_loss=l2_loss)
122 | a+=a_t
123 | b+=b_t
124 |
125 | output_tensor = tf.layers.conv2d(refinement1, nb_bands, kernel_size=(1,1),
126 | padding='same',name='output_conv',
127 | kernel_initializer=tf.glorot_normal_initializer(),
128 | kernel_regularizer=tf.contrib.layers.l2_regularizer(l2_loss))
129 |
130 | loss = cost_weights[0] * mse_loss(output_tensor,input_tensor)
131 | loss += (cost_weights[1] * mse_loss(conv1,refinement1))
132 | loss += (cost_weights[2] * mse_loss(conv2,refinement2))
133 | loss += (cost_weights[3] * mse_loss(conv3,refinement3))
134 |
135 | loss = loss - 1000.0 * tf.minimum( tf.reduce_min(a), 0)
136 | loss = loss - 1000.0 * tf.minimum( tf.reduce_min(b), 0)
137 |
138 | return loss, refinement1
139 |
140 |
141 | class MCAE(object):
142 | """Convolutional Autoencoder with single- and multi-loss support.
143 |
144 | """
145 | def __init__(self, nb_bands, loss_weights, ckpt_path = 'checkpoints',
146 | num_epochs=1000, starting_learning_rate=2e-3, patience=10,
147 | weight_decay=0.0):
148 | """
149 |         Args:
150 |             nb_bands (int): Number of bands (dimensionality).
151 |             loss_weights (list of float): Weights of the four MSE terms summed
152 |                 in `mcae` (reconstruction, then the three refinement levels).
153 |             ckpt_path (str): Directory where checkpoints are written.
154 |                 (Default: `checkpoints`)
155 |             num_epochs (int): Maximum number of training epochs. (Default: 1000)
156 |             starting_learning_rate (float): Initial learning rate. (Default: 2e-3)
157 |             patience (int): Epochs without validation improvement tolerated
158 |                 before training stops; the learning rate is divided by 10
159 |                 after every `int(patience/2)` such epochs. (Default: 10)
160 |             weight_decay (float): Scale of the L2 kernel regularizer.
161 |                 (Default: 0.0)
162 | """
163 | self.nb_bands = nb_bands
164 | self.ckpt_path = ckpt_path
165 | self.num_epochs = num_epochs
166 | self.starting_learning_rate = starting_learning_rate
167 | self.patience = patience
168 | self.lr_patience = int(patience/2)
169 | with tf.Graph().as_default():
170 | inputs = tf.placeholder(tf.float32, shape=(None, None,None,nb_bands),name='inputs')
171 | learning_rate = tf.placeholder(tf.float32, name='learning_rate')
172 | training = tf.placeholder(tf.bool,name='training')
173 |
174 | tf.add_to_collection("inputs", inputs)
175 | tf.add_to_collection("training", training)
176 | tf.add_to_collection('learning_rate', learning_rate)
177 |
178 | loss, hidden_output = mcae(inputs, training, nb_bands, loss_weights, weight_decay)
179 | tf.add_to_collection('loss', loss)
180 | tf.add_to_collection('hidden_output', hidden_output)
181 |
182 | opt = tf.contrib.opt.NadamOptimizer(learning_rate)
183 | update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
184 | with tf.control_dependencies(update_ops):
185 | train_step = opt.minimize(loss)
186 | tf.add_to_collection('train_step', train_step)
187 |
188 | with tf.Session() as sess:
189 | saver = tf.train.Saver()
190 | sess.run(tf.global_variables_initializer())
191 | saver.save(sess, self.ckpt_path+'/model.ckpt', 0)
192 |
193 | def fit(self, X_train, X_val, batch_size=128):
194 |
195 | if not os.path.exists(self.ckpt_path):
196 | os.makedirs(self.ckpt_path)
197 |
198 | with tf.Graph().as_default():
199 |
200 | with tf.Session() as sess:
201 | ckpt = tf.train.get_checkpoint_state(self.ckpt_path+'/') # get latest checkpoint (if any)
202 | if ckpt and ckpt.model_checkpoint_path:
203 |                     # if a checkpoint exists, restore the saved graph and its parameters
204 | saver = tf.train.import_meta_graph(ckpt.model_checkpoint_path+'.meta',
205 | clear_devices=True)
206 | saver.restore(sess, ckpt.model_checkpoint_path)
207 | inputs = tf.get_collection('inputs')[0]
208 | training = tf.get_collection('training')[0]
209 | train_step = tf.get_collection('train_step')[0]
210 | loss = tf.get_collection('loss')[0]
211 | learning_rate = tf.get_collection('learning_rate')[0]
212 | else:
213 | raise FileNotFoundError('Cannot Find Checkpoint')
214 |
215 | best_loss = 100000.0
216 | lr = self.starting_learning_rate
217 |
218 | t_msg1 = '\rEpoch %d (%d/%d) -- lr=%1.0e -- train=%1.2f'
219 | v_msg1 = ' -- val=%1.2f'
220 | pc = 0
221 | lrc = 0
222 |
223 | train_samples = X_train.shape[0]
224 | idx = np.arange(train_samples)
225 | train_loss = EMA()
226 |
227 | for e in range(self.num_epochs):
228 | np.random.shuffle(idx)
229 | for i in range(0 , train_samples, batch_size):
230 | _, loss0 = sess.run([train_step, loss],
231 | feed_dict={inputs: X_train[idx[i:i+batch_size]],
232 | training: True,
233 | learning_rate: lr})
234 | train_loss.update(loss0)
235 | print(t_msg1 % (e+1,i+1,train_samples,lr, train_loss.avg), end="")
236 |
237 | #Calculate validation loss/acc
238 | val_losses = np.array([],dtype=np.float32)
239 | for i in range(0, X_val.shape[0], batch_size):
240 | val_loss = sess.run([loss],
241 | feed_dict={inputs: X_val[i:i+batch_size],
242 | training: False})
243 | val_losses = np.append(val_losses, val_loss)
244 | val_loss = np.mean(val_losses)
245 |
246 | print(v_msg1 % (val_loss), end="")
247 |
248 | if val_loss < best_loss and not math.isnan(val_loss):
249 | saver.save(sess, self.ckpt_path+'/model.ckpt',e)
250 | best_loss = val_loss
251 | print(' -- best_val_loss=%1.2f' % best_loss)
252 | pc = 0
253 | lrc=0
254 | else:
255 | pc += 1
256 | lrc += 1
257 | if pc > self.patience:
258 | break
259 | elif lrc >= self.lr_patience:
260 | lrc = 0
261 | lr /= 10.0
262 | print(' -- Patience=%d/%d' % (pc, self.patience))
263 |
264 | if math.isnan(val_loss):
265 | break
266 |
267 | def transform(self, X, batch_size=128):
268 |
269 | n = X.shape[0]
270 | with tf.Graph().as_default():
271 | if not os.path.exists(self.ckpt_path):
272 | os.makedirs(self.ckpt_path)
273 |
274 | with tf.Session() as sess:
275 | ckpt = tf.train.get_checkpoint_state(self.ckpt_path+'/') # get latest checkpoint (if any)
276 | if ckpt and ckpt.model_checkpoint_path:
277 |                     # if a checkpoint exists, restore the saved graph and its parameters
278 | saver = tf.train.import_meta_graph(ckpt.model_checkpoint_path+'.meta',
279 | clear_devices=True)
280 | saver.restore(sess, ckpt.model_checkpoint_path)
281 | inputs = tf.get_collection('inputs')[0]
282 | training = tf.get_collection('training')[0]
283 | hidden_output = tf.get_collection('hidden_output')[0]
284 | else:
285 | raise FileNotFoundError('Cannot Find Checkpoint')
286 |
287 | if n > batch_size:
288 |
289 | prediction = np.zeros((n,X.shape[1], X.shape[2], 256), dtype=np.float32)
290 |
291 | for i in range(0, n, batch_size):
292 | start = np.min([i, n - batch_size])
293 | end = np.min([i + batch_size, n])
294 |
295 | prediction[start:end] = sess.run(hidden_output,
296 | feed_dict={inputs: X[start:end],
297 | training: False})
298 |
299 | else:
300 | prediction = sess.run(hidden_output,
301 | feed_dict={inputs: X,
302 | training: False})
303 | return prediction
304 |
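As a cross-check of `pelu` and of the two penalty terms at the end of `mcae`, the same arithmetic can be written in plain NumPy. This is only a reference sketch: the function names `pelu_reference` and `positivity_penalty` are hypothetical, and the TensorFlow functions above remain the ones the model uses.

```python
import numpy as np


def pelu_reference(x, a=1.0, b=1.0, eps=1e-9):
    """NumPy mirror of pelu(): (a/b)*x for x >= 0, a*(exp(x/b) - 1) for x < 0."""
    x = np.asarray(x, dtype=np.float64)
    # relu(x) * a / (b + eps): active only for positive inputs.
    positive = np.maximum(x, 0.0) * a / (b + eps)
    # a * (exp(-relu(-x) / (b + eps)) - 1): active only for negative inputs.
    negative = a * (np.exp(-np.maximum(-x, 0.0) / (b + eps)) - 1.0)
    return positive + negative


def positivity_penalty(params, weight=1000.0):
    """Mirror of `-1000 * minimum(reduce_min(params), 0)` in mcae().

    Zero while every parameter is non-negative; grows linearly with the
    most negative parameter otherwise.
    """
    return -weight * min(float(np.min(params)), 0.0)


y = pelu_reference([-1.0, 0.0, 2.0])
penalty_ok = positivity_penalty([1.0, 0.5])
penalty_bad = positivity_penalty([1.0, -0.01])
```

The penalty pushes the learned `a` and `b` back towards positive values, where the activation is defined as intended.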
--------------------------------------------------------------------------------
/feature_extraction/scae.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | from feature_extraction.cae import CAE
3 | from feature_extraction.mcae import MCAE
4 | import tensorflow as tf
5 |
6 | class SCAE():
7 | """Stacked Convolutional Autoencoder with single- and multi-loss support.
8 |
9 | """
10 | def __init__(self, nb_bands=None, nb_cae=None, weight_decay=0.0,
11 | ckpt_path='checkpoints', patience=10, num_epochs=1000,
12 | pre_trained=True, loss_mode='mcae'):
13 | """
14 |         Args:
15 |             nb_bands (int): Number of bands (dimensionality) of the input data.
16 |             nb_cae (int): Number of CAEs in the stack.
17 |             weight_decay (float): Weight decay passed to each CAE. (Default: 0.0)
18 |             ckpt_path (str): Root directory for checkpoints; each CAE is saved
19 |                 in its own subdirectory. (Default: `checkpoints`)
20 |             patience (int): Early-stopping patience passed to each CAE.
21 |                 (Default: 10)
22 |             num_epochs (int): Maximum number of training epochs per CAE.
23 |                 (Default: 1000)
24 |             pre_trained (bool): Currently unused. (Default: True)
25 |             loss_mode (str): Selects the autoencoder class: `cae` for CAE or
26 |                 `mcae` for the multi-loss MCAE. (Default: `mcae`)
27 |
28 | """
29 |
30 | self.N = 256
31 | self.nb_cae = nb_cae
32 | self.ckpt_path = ckpt_path
33 | loss_weights=[1.0, 1e-1, 1e-2, 1e-2]
34 |
35 | if nb_bands is not None:
36 | self.nb_bands = np.append(nb_bands ,
37 | np.ones(nb_cae-1,dtype=np.int32)*self.N)
38 |
39 | self.weight_decay= weight_decay
40 | self.patience = patience
41 | self.loss_weights = loss_weights
42 | self.num_epochs = num_epochs
43 | self.loss_mode = loss_mode
44 |
45 | def fit(self, X_train, X_val, batch_size=128):
46 |         """ Trains each CAE in the stack in turn, for at most `num_epochs` epochs each.
47 |
48 | Args:
49 | X_train (arr): Training data.
50 | X_val (arr): Validation data.
51 | batch_size (int): Number of samples per gradient update.
52 |
53 | """
54 | for i in range(self.nb_cae):
55 | print('Training CAE #%d...' % i)
56 |
57 | if self.loss_mode == 'cae':
58 |
59 | cae = CAE(self.nb_bands[i],
60 | ckpt_path=self.ckpt_path+'/cae%d' % i,
61 | loss_weights=self.loss_weights,
62 | weight_decay=self.weight_decay,
63 | patience=self.patience,
64 | num_epochs=self.num_epochs)
65 | elif self.loss_mode == 'mcae':
66 | cae = MCAE(self.nb_bands[i],
67 | ckpt_path=self.ckpt_path+'/mcae%d' % i,
68 | loss_weights=self.loss_weights,
69 | weight_decay=self.weight_decay,
70 | patience=self.patience,
71 | num_epochs=self.num_epochs)
72 | else:
73 | raise ValueError('Not a valid loss_mode.')
74 |
75 | cae.fit(X_train, X_val, batch_size=batch_size)
76 | if i < self.nb_cae-1:
77 | X_train = cae.transform(X_train)
78 | X_val = cae.transform(X_val)
79 |
80 | return 1
81 |
82 |
83 | def transform(self, X):
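84 |         """Extracts the stacked hidden-layer features of a single image.
85 | 
86 |         Args:
87 |             X (arr): The input image, shaped (rows, cols, bands).
88 | 
89 |         Returns:
90 |             Array of shape (rows, cols, N*nb_cae) holding the hidden output of every CAE, concatenated along the channel axis.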
91 |
92 | """
93 | output = np.zeros((X.shape[0], X.shape[1], self.N*self.nb_cae),
94 | dtype=np.float32)
95 | X = X[np.newaxis]
96 | for i in range(self.nb_cae):
97 |
98 | with tf.Graph().as_default():
99 | with tf.Session() as sess:
100 |                     ckpt = tf.train.get_checkpoint_state(self.ckpt_path+'/%s%d/' % (self.loss_mode, i))
101 | if ckpt and ckpt.model_checkpoint_path:
102 |                         # if a checkpoint exists, restore the saved graph and its parameters
103 | saver = tf.train.import_meta_graph(ckpt.model_checkpoint_path+'.meta',
104 | clear_devices=True)
105 | saver.restore(sess, ckpt.model_checkpoint_path)
106 | inputs = tf.get_collection('inputs')[0]
107 | training = tf.get_collection('training')[0]
108 | hidden_output = tf.get_collection('hidden_output')[0]
109 | else:
110 | raise FileNotFoundError('Cannot Find Checkpoint')
111 |
112 |
113 | X = sess.run(hidden_output, feed_dict={inputs: X,
114 | training: False})
115 | output[:,:, self.N*i:self.N*(i+1)] = X[0]
116 |
117 |
118 | return output
119 |
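The channel bookkeeping in `SCAE` is easy to lose track of: the first CAE sees the sensor bands, every later CAE sees the 256 hidden channels of its predecessor, and `transform` concatenates all hidden outputs. The NumPy sketch below reproduces only that bookkeeping with random stand-in features; the helper names are hypothetical and no network is involved.

```python
import numpy as np

N_HIDDEN = 256  # hidden channels per CAE (self.N in SCAE)


def input_bands_per_cae(nb_bands, nb_cae, n_hidden=N_HIDDEN):
    """Input dimensionality of each CAE, as built in SCAE.__init__."""
    return np.append(nb_bands, np.ones(nb_cae - 1, dtype=np.int32) * n_hidden)


def stack_features(per_cae_features):
    """Concatenate per-CAE feature maps along the channel axis."""
    return np.concatenate(per_cae_features, axis=-1)


bands = input_bands_per_cae(224, 3)

# Stand-ins for the hidden output of three CAEs on a 4 x 5 pixel image.
rng = np.random.default_rng(0)
features = [rng.standard_normal((4, 5, N_HIDDEN)).astype(np.float32)
            for _ in range(3)]
stacked = stack_features(features)
```

With three CAEs the classifier downstream therefore receives 768 features per pixel.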
--------------------------------------------------------------------------------
/fileIO/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rmkemker/EarthMapper/098392c405feebc79606c01b3c1c4bb48999f224/fileIO/__init__.py
--------------------------------------------------------------------------------
/fileIO/aviris.py:
--------------------------------------------------------------------------------
1 | """
2 | Name: aviris.py
3 | Author: Ronald Kemker
4 | Description: Pipeline for processing AVIRIS imagery
5 |
6 | Note:
7 | Requires GDAL, remote_sensing.utils, and display (optional)
8 | http://www.gdal.org/
9 | """
10 |
11 | import gdal
12 | from glob import glob
13 | import numpy as np
14 | from utils.utils import band_resample_hsi_cube
15 | from fileIO.utils import readENVIHeader
16 |
17 | class AVIRIS():
18 | """
19 | AVIRIS
20 |
21 | This processes data from the Airborne Visible/Infrared Imaging Spectrometer
22 | (AVIRIS).
23 |
24 | Parameters
25 | ----------
26 | directory : string, file directory of AVIRIS data
27 |     calibrate : boolean, convert hyperspectral cube from digital count to
28 | radiance. (Default: True)
29 | bands : numpy array [1 x num target bands], target bands for spectral
30 | resampling (Default: None -> No resampling is performed)
31 |     fwhm : numpy array [1 x num target bands], target full-width half maxes
32 |         for spectral resampling (Default: None)
33 |
34 | Attributes
35 | ----------
36 | dName : string, input "directory"
37 | cal : boolean, input "calibrate"
38 | bands : numpy array [1 x num source bands], band centers of input cube
39 | fwhm : numpy array [1 x num source bands], full width, half maxes of input
40 | cube
41 | tgt_bands : numpy array [1 x num target bands], input "bands"
42 | tgt_fwhm : numpy array [1 x num target bands], input "fwhm"
43 | data : numpy array, hyperspectral cube
44 | shape : tuple, shape of hyperspectral cube
45 | mask : numpy array, binary mask of valid data (1 is valid pixel)
46 |
47 | Notes
48 | -----
49 | Ref: https://aviris.jpl.nasa.gov/
50 |
51 | """
52 | def __init__(self , directory, calibrate = True, bands=None, fwhm=None):
53 | self.dName = directory
54 | self.cal = calibrate
55 | self.tgt_bands = bands
56 | self.tgt_fwhm = fwhm
57 | self.read()
58 | pass
59 |
60 | def read(self):
61 | """Reads orthomosaic, calculates binary mask, performs radiometric
62 | calibration (optional), and band resamples (optional)"""
63 | f = glob(self.dName+'/*ort_img')[0]
64 | self.data = gdal.Open(f).ReadAsArray().transpose(1,2,0)
65 | self.shape = self.data.shape
66 | self.mask = self.data[:,:,10] != self._mask_value()
67 | self.data[self.mask==False] = 0
68 | self.hdr_data()
69 |
70 | if self.cal:
71 | self.radiometric_cal()
72 |
73 | if self.tgt_bands is not None:
74 | self.band_resample()
75 |
76 | def radiometric_cal(self):
77 | """Performs radiometric calibration with gain file"""
78 | with open(glob(self.dName+'/*gain')[0]) as f:
79 | lines = f.read().splitlines()
80 |
81 | gain = np.zeros(224, dtype=np.float32)
82 | for i in range(len(lines)):
83 | gain[i] = np.float32(lines[i].split()[0])
84 |
85 | self.data[self.mask] = np.float32(self.data[self.mask])*gain
86 |
87 | def show_rgb(self):
88 | """Displays RGB visualization of HSI cube"""
89 | from display import imshow
90 | idx = np.array([13,19,27], dtype=np.uint8)
91 | imshow(self.data[:,:,idx])
92 |
93 | def hdr_data(self):
94 | """Finds band-centers and full-width half-maxes of hyperspectral cube"""
95 | f = glob(self.dName+'/*ort_img.hdr')[0]
96 | self.bands, self.fwhm = readENVIHeader(f)
97 |
98 | def band_resample(self):
99 | """Performs band resampling operation"""
100 | self.data = band_resample_hsi_cube(self.data,self.bands,self.tgt_bands,
101 | self.fwhm, self.tgt_fwhm,
102 | self.mask)
103 |
104 | def _mask_value(self):
105 | """Finds value of invalid orthomosaic pixels"""
106 |         return np.min([self.data[0,0],self.data[-1,0],self.data[0,-1],
107 | self.data[-1,-1]])
108 |
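The masking idea used by `read` and `_mask_value` can be tried on a synthetic cube without GDAL. The sketch below assumes, as the class does, that the no-data fill value is the smallest value found at the four image corners; the function names are hypothetical.

```python
import numpy as np


def corner_fill_value(cube):
    """Smallest value over the four corner spectra of a (rows, cols, bands) cube."""
    corners = [cube[0, 0], cube[0, -1], cube[-1, 0], cube[-1, -1]]
    return np.min(corners)


def valid_mask(cube, band=10):
    """True where the chosen band differs from the fill value."""
    return cube[:, :, band] != corner_fill_value(cube)


# Synthetic 4 x 4 cube with 12 bands: a border of fill pixels around
# a 2 x 2 block of valid data.
cube = np.full((4, 4, 12), -50, dtype=np.int16)
cube[1:3, 1:3, :] = 100

mask = valid_mask(cube)
cube[~mask] = 0  # zero out invalid pixels, as read() does
```

If a valid pixel happens to equal the fill value in the tested band it will be masked too, so the choice of band matters for real data.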
--------------------------------------------------------------------------------
/fileIO/aviris_bands.hdr:
--------------------------------------------------------------------------------
1 | ENVI
2 | description = {
3 | AVIRIS orthocorrected file, pixel size = 17.2000
4 | rotation angle = 0.000000
5 | datum = WGS-84
6 | UTM zone = 10
7 | upper left corner (1,1) (Easting) = 752834.71
8 | upper left corner (1,1) (Northing) = 4047735.4 }
9 | samples = 748
10 | lines = 1425
11 | bands = 224
12 | header offset = 0
13 | data type = 2
14 | interleave = bip
15 | byte order = 1
16 | map info ={UTM, 1, 1, 752834.710, 4047735.400, 17.200, 17.200,
17 | 10, North, WGS-84, units=Meters, rotation=0.000000}
18 | x start = 1
19 | y start = 1
20 | wavelength = {
21 | 365.9298 ,
22 | 375.5940 ,
23 | 385.2625 ,
24 | 394.9355 ,
25 | 404.6129 ,
26 | 414.2946 ,
27 | 423.9808 ,
28 | 433.6713 ,
29 | 443.3662 ,
30 | 453.0655 ,
31 | 462.7692 ,
32 | 472.4773 ,
33 | 482.1898 ,
34 | 491.9066 ,
35 | 501.6279 ,
36 | 511.3535 ,
37 | 521.0836 ,
38 | 530.8180 ,
39 | 540.5568 ,
40 | 550.3000 ,
41 | 560.0477 ,
42 | 569.7996 ,
43 | 579.5560 ,
44 | 589.3168 ,
45 | 599.0819 ,
46 | 608.8515 ,
47 | 618.6254 ,
48 | 628.4037 ,
49 | 638.1865 ,
50 | 647.9736 ,
51 | 657.7651 ,
52 | 667.5610 ,
53 | 655.2923 ,
54 | 665.0994 ,
55 | 674.9012 ,
56 | 684.6979 ,
57 | 694.4894 ,
58 | 704.2756 ,
59 | 714.0566 ,
60 | 723.8325 ,
61 | 733.6031 ,
62 | 743.3685 ,
63 | 753.1287 ,
64 | 762.8837 ,
65 | 772.6335 ,
66 | 782.3781 ,
67 | 792.1174 ,
68 | 801.8516 ,
69 | 811.5805 ,
70 | 821.3043 ,
71 | 831.0228 ,
72 | 840.7361 ,
73 | 850.4442 ,
74 | 860.1471 ,
75 | 869.8448 ,
76 | 879.5372 ,
77 | 889.2245 ,
78 | 898.9066 ,
79 | 908.5834 ,
80 | 918.2551 ,
81 | 927.9214 ,
82 | 937.5827 ,
83 | 947.2387 ,
84 | 956.8895 ,
85 | 966.5351 ,
86 | 976.1755 ,
87 | 985.8106 ,
88 | 995.4406 ,
89 | 1005.065 ,
90 | 1014.685 ,
91 | 1024.299 ,
92 | 1033.908 ,
93 | 1043.512 ,
94 | 1053.111 ,
95 | 1062.704 ,
96 | 1072.293 ,
97 | 1081.876 ,
98 | 1091.454 ,
99 | 1101.026 ,
100 | 1110.594 ,
101 | 1120.156 ,
102 | 1129.713 ,
103 | 1139.265 ,
104 | 1148.811 ,
105 | 1158.353 ,
106 | 1167.889 ,
107 | 1177.420 ,
108 | 1186.946 ,
109 | 1196.466 ,
110 | 1205.982 ,
111 | 1215.492 ,
112 | 1224.997 ,
113 | 1234.496 ,
114 | 1243.991 ,
115 | 1253.480 ,
116 | 1262.964 ,
117 | 1253.373 ,
118 | 1263.346 ,
119 | 1273.318 ,
120 | 1283.291 ,
121 | 1293.262 ,
122 | 1303.234 ,
123 | 1313.206 ,
124 | 1323.177 ,
125 | 1333.148 ,
126 | 1343.119 ,
127 | 1353.089 ,
128 | 1363.060 ,
129 | 1373.030 ,
130 | 1383.000 ,
131 | 1392.969 ,
132 | 1402.939 ,
133 | 1412.908 ,
134 | 1422.877 ,
135 | 1432.845 ,
136 | 1442.814 ,
137 | 1452.782 ,
138 | 1462.750 ,
139 | 1472.718 ,
140 | 1482.685 ,
141 | 1492.652 ,
142 | 1502.619 ,
143 | 1512.586 ,
144 | 1522.552 ,
145 | 1532.518 ,
146 | 1542.484 ,
147 | 1552.450 ,
148 | 1562.416 ,
149 | 1572.381 ,
150 | 1582.346 ,
151 | 1592.311 ,
152 | 1602.275 ,
153 | 1612.240 ,
154 | 1622.204 ,
155 | 1632.167 ,
156 | 1642.131 ,
157 | 1652.094 ,
158 | 1662.057 ,
159 | 1672.020 ,
160 | 1681.983 ,
161 | 1691.945 ,
162 | 1701.907 ,
163 | 1711.869 ,
164 | 1721.831 ,
165 | 1731.792 ,
166 | 1741.753 ,
167 | 1751.714 ,
168 | 1761.675 ,
169 | 1771.635 ,
170 | 1781.596 ,
171 | 1791.556 ,
172 | 1801.515 ,
173 | 1811.475 ,
174 | 1821.434 ,
175 | 1831.393 ,
176 | 1841.352 ,
177 | 1851.310 ,
178 | 1861.269 ,
179 | 1871.227 ,
180 | 1873.184 ,
181 | 1867.164 ,
182 | 1877.225 ,
183 | 1887.285 ,
184 | 1897.342 ,
185 | 1907.396 ,
186 | 1917.448 ,
187 | 1927.499 ,
188 | 1937.546 ,
189 | 1947.592 ,
190 | 1957.635 ,
191 | 1967.675 ,
192 | 1977.714 ,
193 | 1987.750 ,
194 | 1997.784 ,
195 | 2007.815 ,
196 | 2017.845 ,
197 | 2027.872 ,
198 | 2037.896 ,
199 | 2047.919 ,
200 | 2057.939 ,
201 | 2067.956 ,
202 | 2077.972 ,
203 | 2087.985 ,
204 | 2097.996 ,
205 | 2108.004 ,
206 | 2118.010 ,
207 | 2128.014 ,
208 | 2138.016 ,
209 | 2148.015 ,
210 | 2158.012 ,
211 | 2168.007 ,
212 | 2177.999 ,
213 | 2187.989 ,
214 | 2197.977 ,
215 | 2207.962 ,
216 | 2217.945 ,
217 | 2227.926 ,
218 | 2237.905 ,
219 | 2247.881 ,
220 | 2257.854 ,
221 | 2267.826 ,
222 | 2277.795 ,
223 | 2287.762 ,
224 | 2297.727 ,
225 | 2307.689 ,
226 | 2317.649 ,
227 | 2327.607 ,
228 | 2337.562 ,
229 | 2347.516 ,
230 | 2357.467 ,
231 | 2367.415 ,
232 | 2377.361 ,
233 | 2387.305 ,
234 | 2397.247 ,
235 | 2407.186 ,
236 | 2417.123 ,
237 | 2427.058 ,
238 | 2436.990 ,
239 | 2446.920 ,
240 | 2456.848 ,
241 | 2466.773 ,
242 | 2476.696 ,
243 | 2486.617 ,
244 | 2496.536 }
245 | fwhm = {
246 | 9.852108 ,
247 | 9.796976 ,
248 | 9.744104 ,
249 | 9.693492 ,
250 | 9.645140 ,
251 | 9.599048 ,
252 | 9.555216 ,
253 | 9.513644 ,
254 | 9.474332 ,
255 | 9.437280 ,
256 | 9.402488 ,
257 | 9.369956 ,
258 | 9.339684 ,
259 | 9.311672 ,
260 | 9.285920 ,
261 | 9.262428 ,
262 | 9.241196 ,
263 | 9.222224 ,
264 | 9.205512 ,
265 | 9.191060 ,
266 | 9.178868 ,
267 | 9.168936 ,
268 | 9.161264 ,
269 | 9.155852 ,
270 | 9.152700 ,
271 | 9.151808 ,
272 | 9.153176 ,
273 | 9.156804 ,
274 | 9.162692 ,
275 | 9.170840 ,
276 | 9.181248 ,
277 | 9.193916 ,
278 | 9.908470 ,
279 | 9.848707 ,
280 | 9.798712 ,
281 | 9.758482 ,
282 | 9.728020 ,
283 | 9.707324 ,
284 | 9.696396 ,
285 | 9.695233 ,
286 | 9.703838 ,
287 | 9.722210 ,
288 | 9.750348 ,
289 | 9.788254 ,
290 | 9.835926 ,
291 | 9.893364 ,
292 | 9.960570 ,
293 | 10.03754 ,
294 | 10.12428 ,
295 | 10.22079 ,
296 | 10.32706 ,
297 | 10.44310 ,
298 | 10.56891 ,
299 | 10.70448 ,
300 | 10.84982 ,
301 | 11.00493 ,
302 | 11.16980 ,
303 | 11.34444 ,
304 | 11.52885 ,
305 | 11.72302 ,
306 | 11.92696 ,
307 | 10.78153 ,
308 | 9.761805 ,
309 | 9.760064 ,
310 | 9.759324 ,
311 | 9.759585 ,
312 | 9.760846 ,
313 | 9.763108 ,
314 | 9.766370 ,
315 | 9.770633 ,
316 | 9.775896 ,
317 | 9.782160 ,
318 | 9.789424 ,
319 | 9.797689 ,
320 | 9.806954 ,
321 | 9.817220 ,
322 | 9.828486 ,
323 | 9.840753 ,
324 | 9.854020 ,
325 | 9.868288 ,
326 | 9.883556 ,
327 | 9.899825 ,
328 | 9.917094 ,
329 | 9.935364 ,
330 | 9.954635 ,
331 | 9.974905 ,
332 | 9.996177 ,
333 | 10.01845 ,
334 | 10.04172 ,
335 | 10.06599 ,
336 | 10.09127 ,
337 | 10.11754 ,
338 | 10.14481 ,
339 | 10.17309 ,
340 | 10.20236 ,
341 | 10.23264 ,
342 | 10.83826 ,
343 | 10.83267 ,
344 | 10.82721 ,
345 | 10.82188 ,
346 | 10.81670 ,
347 | 10.81165 ,
348 | 10.80674 ,
349 | 10.80196 ,
350 | 10.79733 ,
351 | 10.79283 ,
352 | 10.78847 ,
353 | 10.78425 ,
354 | 10.78016 ,
355 | 10.77621 ,
356 | 10.77240 ,
357 | 10.76873 ,
358 | 10.76519 ,
359 | 10.76180 ,
360 | 10.75853 ,
361 | 10.75541 ,
362 | 10.75243 ,
363 | 10.74958 ,
364 | 10.74687 ,
365 | 10.74429 ,
366 | 10.74186 ,
367 | 10.73956 ,
368 | 10.73740 ,
369 | 10.73538 ,
370 | 10.73349 ,
371 | 10.73174 ,
372 | 10.73013 ,
373 | 10.72866 ,
374 | 10.72732 ,
375 | 10.72612 ,
376 | 10.72506 ,
377 | 10.72414 ,
378 | 10.72335 ,
379 | 10.72270 ,
380 | 10.72219 ,
381 | 10.72182 ,
382 | 10.72159 ,
383 | 10.72149 ,
384 | 10.72153 ,
385 | 10.72170 ,
386 | 10.72202 ,
387 | 10.72247 ,
388 | 10.72306 ,
389 | 10.72378 ,
390 | 10.72465 ,
391 | 10.72565 ,
392 | 10.72679 ,
393 | 10.72807 ,
394 | 10.72948 ,
395 | 10.73103 ,
396 | 10.73272 ,
397 | 10.73455 ,
398 | 10.73651 ,
399 | 10.73861 ,
400 | 10.74085 ,
401 | 10.74323 ,
402 | 10.74574 ,
403 | 10.74839 ,
404 | 10.75118 ,
405 | 10.75411 ,
406 | 11.16661 ,
407 | 11.15791 ,
408 | 11.14889 ,
409 | 11.13955 ,
410 | 11.12990 ,
411 | 11.11992 ,
412 | 11.10964 ,
413 | 11.09903 ,
414 | 11.08811 ,
415 | 11.07687 ,
416 | 11.06531 ,
417 | 11.05344 ,
418 | 11.04125 ,
419 | 11.02875 ,
420 | 11.01592 ,
421 | 11.00278 ,
422 | 10.98933 ,
423 | 10.97555 ,
424 | 10.96146 ,
425 | 10.94705 ,
426 | 10.93233 ,
427 | 10.91729 ,
428 | 10.90193 ,
429 | 10.88626 ,
430 | 10.87026 ,
431 | 10.85396 ,
432 | 10.83733 ,
433 | 10.82039 ,
434 | 10.80313 ,
435 | 10.78555 ,
436 | 10.76766 ,
437 | 10.74945 ,
438 | 10.73092 ,
439 | 10.71208 ,
440 | 10.69292 ,
441 | 10.67344 ,
442 | 10.65365 ,
443 | 10.63354 ,
444 | 10.61311 ,
445 | 10.59236 ,
446 | 10.57130 ,
447 | 10.54992 ,
448 | 10.52823 ,
449 | 10.50622 ,
450 | 10.48389 ,
451 | 10.46124 ,
452 | 10.43828 ,
453 | 10.41500 ,
454 | 10.39140 ,
455 | 10.36749 ,
456 | 10.34326 ,
457 | 10.31871 ,
458 | 10.29385 ,
459 | 10.26867 ,
460 | 10.24317 ,
461 | 10.21736 ,
462 | 10.19123 ,
463 | 10.16478 ,
464 | 10.13801 ,
465 | 10.11093 ,
466 | 10.08353 ,
467 | 10.05582 ,
468 | 10.02778 ,
469 | 9.999434 }
470 |
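The `wavelength` and `fwhm` fields above are brace-delimited, comma-separated lists, which is what a header reader has to extract. The sketch below is a hypothetical regex-based parser for such fields; it is not the `readENVIHeader` function from `fileIO/utils.py`, whose implementation is not shown here.

```python
import re

import numpy as np


def parse_envi_list(header_text, key):
    """Return the numeric list stored as `key = { v1 , v2 , ... }`."""
    pattern = r'^\s*' + re.escape(key) + r'\s*=\s*\{(.*?)\}'
    match = re.search(pattern, header_text, flags=re.S | re.M)
    if match is None:
        raise KeyError(key)
    # Values may be spread over many lines; float() ignores the whitespace.
    return np.array([float(v) for v in match.group(1).split(',') if v.strip()])


# A three-band excerpt in the same layout as the header above.
header = """ENVI
bands = 3
wavelength = {
 365.9298 ,
 375.5940 ,
 385.2625 }
fwhm = {
 9.852108 ,
 9.796976 ,
 9.744104 }
"""

wavelength = parse_envi_list(header, 'wavelength')
fwhm = parse_envi_list(header, 'fwhm')
```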
--------------------------------------------------------------------------------
/fileIO/gliht.py:
--------------------------------------------------------------------------------
1 | """
2 | Name: gliht.py
3 | Author: Ronald Kemker
4 | Description: Pipeline for processing G-LiHT imagery
5 |
6 | Note:
7 | Requires GDAL, remote_sensing.utils, and display (optional)
8 | http://www.gdal.org/
9 | """
10 |
11 | import gdal
12 | from glob import glob
13 | import numpy as np
14 | from utils.utils import band_resample_hsi_cube
15 | from fileIO.utils import readENVIHeader
16 |
17 | class GLiHT():
18 | """
19 | GLiHT
20 |
21 | This processes hyperspectral data from the G-LiHT:
22 | Goddard's LiDAR, Hyperspectral & Thermal Imager
23 | (GLiHT).
24 |
25 | Parameters
26 | ----------
27 | directory : string, file directory of GLiHT data
28 |     calibrate : boolean, convert hyperspectral cube from digital count to
29 | radiance. (Default: True)
30 | bands : numpy array [1 x num target bands], target bands for spectral
31 | resampling (Default: None -> No resampling is performed)
32 |     fwhm : numpy array [1 x num target bands], target full-width half maxes
33 |         for spectral resampling (Default: None)
34 |
35 | Attributes
36 | ----------
37 | dName : string, input "directory"
38 | cal : boolean, input "calibrate"
39 | bands : numpy array [1 x num source bands], band centers of input cube
40 | fwhm : numpy array [1 x num source bands], full width, half maxes of input
41 | cube
42 | tgt_bands : numpy array [1 x num target bands], input "bands"
43 | tgt_fwhm : numpy array [1 x num target bands], input "fwhm"
44 | data : numpy array, hyperspectral cube
45 | shape : tuple, shape of hyperspectral cube
46 | mask : numpy array, binary mask of valid data (1 is valid pixel)
47 |
48 | Notes
49 | -----
50 | Ref: https://gliht.gsfc.nasa.gov/
51 |
52 | """
53 | def __init__(self , directory, calibrate = True, bands=None, fwhm=None):
54 | self.dName = directory
55 | self.cal = calibrate
56 | self.tgt_bands = bands
57 | self.tgt_fwhm = fwhm
58 | self.read()
59 | pass
60 |
61 | def read(self):
62 |         '''Reads orthomosaic, calculates binary mask, performs radiometric
63 |         calibration (optional), and band resamples (optional)'''
64 | f = glob(self.dName+'/*radiance_L1G')[0]
65 | self.data = gdal.Open(f).ReadAsArray().transpose(1,2,0)
66 | self.shape = self.data.shape
67 | self.mask = self.data[:,:,10] != 0
68 | self.hdr_data()
69 |
70 | if self.cal:
71 | self.radiometric_cal()
72 |
73 | if self.tgt_bands is not None:
74 | self.band_resample()
75 |
76 | self.data[self.mask==False] = 0
77 |
78 |
79 | def radiometric_cal(self):
80 |         """Performs radiometric calibration with a fixed gain of 1e-4"""
81 | gain = 1e-4
82 | self.data[self.mask] = np.float32(self.data[self.mask])*gain
83 |
84 | def show_rgb(self):
85 | '''Displays RGB visualization of HSI cube'''
86 | from display import imshow
87 | idx = np.array([15,33,54], dtype=np.uint8) #blue, green, red
88 | imshow(self.data[:,:,idx])
89 |
90 | def hdr_data(self):
91 | """Finds band-centers and full-width half-maxes of hyperspectral cube"""
92 | f = glob(self.dName+'/*radiance_L1G.hdr')[0]
93 | self.bands, self.fwhm = readENVIHeader(f)
94 |
95 | def band_resample(self):
96 | """Performs band resampling operation"""
97 | self.data = band_resample_hsi_cube(self.data,self.bands,self.tgt_bands,
98 | self.fwhm, self.tgt_fwhm)
99 |
100 |
--------------------------------------------------------------------------------
/fileIO/hyperion.py:
--------------------------------------------------------------------------------
1 | """
2 | Name: hyperion.py
3 | Author: Ronald Kemker
4 | Description: Pipeline for processing Hyperion imagery
5 |
6 | Note:
7 | Requires GDAL, remote_sensing.utils, and display (optional)
8 | http://www.gdal.org/
9 | """
10 |
11 | import gdal
12 | from glob import glob
13 | import numpy as np
14 | from utils.utils import band_resample_hsi_cube
15 |
16 | class Hyperion():
17 | """
18 | Hyperion
19 |
20 |     This processes data from the Hyperion hyperspectral imaging (HSI)
21 |     sensor.
22 |
23 | Parameters
24 | ----------
25 | directory : string, file directory of Hyperion data
26 |     calibrate : boolean, convert hyperspectral cube from digital count to
27 | radiance. (Default: True)
28 | bands : numpy array [1 x num target bands], target bands for spectral
29 | resampling (Default: None -> No resampling is performed)
30 |     fwhm : numpy array [1 x num target bands], target full-width half maxes
31 |         for spectral resampling (Default: None)
32 |
33 | Attributes
34 | ----------
35 | dName : string, input "directory"
36 | cal : boolean, input "calibrate"
37 | bands : numpy array [1 x num source bands], band centers of input cube
38 | fwhm : numpy array [1 x num source bands], full width, half maxes of input
39 | cube
40 | tgt_bands : numpy array [1 x num target bands], input "bands"
41 | tgt_fwhm : numpy array [1 x num target bands], input "fwhm"
42 | data : numpy array, hyperspectral cube
43 | shape : tuple, shape of hyperspectral cube
44 | mask : numpy array, binary mask of valid data (1 is valid pixel)
45 |
46 | Notes
47 | -----
48 | Ref: https://eo1.usgs.gov/sensors/hyperion
49 |
50 | """
51 | def __init__(self , directory, calibrate = True, bands=None, fwhm=None):
52 | self.dName = directory
53 | self.cal = calibrate
54 | self.tgt_bands = bands
55 | self.tgt_fwhm = fwhm
56 | self.read()
57 | pass
58 |
59 | def read(self):
60 | """Reads orthomosaic, calculates binary mask, performs radiometric
61 | calibration (optional), and band resamples (optional)"""
62 | dataFN = sorted(glob(self.dName+'/*.TIF'))
63 | self.calIdx = np.hstack([np.arange(7,57),np.arange(76,224)])
64 |
65 | tmp = gdal.Open(dataFN[self.calIdx[0]]).ReadAsArray()
66 |
67 | self.data = np.zeros((tmp.shape[0],tmp.shape[1],198),dtype=np.float32)
68 | self.shape = self.data.shape
69 | self.hdr_data()
70 |
71 | self.data[:,:,0] = tmp
72 | for i in range(1,198):
73 | self.data[:,:,i] = gdal.Open(dataFN[self.calIdx[i]]).ReadAsArray()
74 |
75 | if self.cal:
76 | self.data[:,:,0:50] = self.data[:,:,0:50]/40
77 | self.data[:,:,50:] = self.data[:,:,50:]/80
78 |
79 | self.mask = self.data[:,:,10] != self._mask_value()
80 |
81 | if self.tgt_bands is not None:
82 | self.band_resample()
83 |
84 |         self.data[~self.mask] = 0
85 |
86 | def show_rgb(self):
87 | """Displays RGB visualization of HSI cube"""
88 | from display import imshow
89 | idx = np.array([20,10,0], dtype=np.uint8)
90 | imshow(self.data[:,:,idx])
91 |
92 |     def hdr_data(self):
93 |         """Finds band-centers and full-width half-maxes of hyperspectral cube"""
94 |         import os  # local import; resolves the CSV next to this module
95 |         hdr_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'hyperion_bands.csv')
96 |         hdr = np.genfromtxt(hdr_path, delimiter=',')
97 |         self.bands, self.fwhm = hdr[:,0], hdr[:,1]
98 |
99 | def band_resample(self):
100 | """Performs band resampling operation"""
101 | self.data = band_resample_hsi_cube(self.data,self.bands,self.tgt_bands,
102 | self.fwhm, self.tgt_fwhm)
103 |
104 |     def _mask_value(self):
105 |         """Finds value of invalid orthomosaic pixels (minimum over the four corners)"""
106 |         return np.min([self.data[0,0],self.data[-1,0],self.data[0,-1],
107 |                        self.data[-1,-1]])
108 |
109 |
110 |
--------------------------------------------------------------------------------
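The calibration step in `Hyperion.read` divides the first 50 retained bands by 40 and the remaining 148 by 80. The sketch below reproduces that scaling on a synthetic cube. The function name `calibrate_hyperion` and its parameters are hypothetical and do not exist in this repository. Reading the two groups as the VNIR and SWIR spectrometers is an assumption based on the band ordering.

```python
import numpy as np

def calibrate_hyperion(cube, n_vnir=50, vnir_scale=40.0, swir_scale=80.0):
    """Convert digital counts to radiance with one scale factor per band group.

    Hypothetical helper: n_vnir, vnir_scale and swir_scale mirror the constants
    hard-coded in Hyperion.read (50 bands / 40, remaining bands / 80).
    """
    out = np.asarray(cube, dtype=np.float32).copy()  # leave the input untouched
    out[:, :, :n_vnir] /= vnir_scale   # first 50 retained bands
    out[:, :, n_vnir:] /= swir_scale   # remaining retained bands
    return out

# Synthetic cube: 2 x 3 pixels, 198 bands, every digital count equal to 80
cube = np.full((2, 3, 198), 80.0, dtype=np.float32)
radiance = calibrate_hyperion(cube)
```

Returning a copy keeps the raw digital counts available, whereas `Hyperion.read` scales `self.data` in place.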
/fileIO/hyperion_bands.csv:
--------------------------------------------------------------------------------
1 | 426.8200,11.3871
2 | 436.9900,11.3871
3 | 447.1700,11.3871
4 | 457.3400,11.3871
5 | 467.5200,11.3871
6 | 477.6900,11.3871
7 | 487.8700,11.3784
8 | 498.0400,11.3538
9 | 508.2200,11.3133
10 | 518.3900,11.2580
11 | 528.5700,11.1907
12 | 538.7400,11.1119
13 | 548.9200,11.0245
14 | 559.0900,10.9321
15 | 569.2700,10.8368
16 | 579.4500,10.7407
17 | 589.6200,10.6482
18 | 599.8000,10.5607
19 | 609.9700,10.4823
20 | 620.1500,10.4147
21 | 630.3200,10.3595
22 | 640.5000,10.3188
23 | 650.6700,10.2942
24 | 660.8500,10.2856
25 | 671.0200,10.2980
26 | 681.2000,10.3349
27 | 691.3700,10.3909
28 | 701.5500,10.4592
29 | 711.7200,10.5322
30 | 721.9000,10.6004
31 | 732.0700,10.6562
32 | 742.2500,10.6933
33 | 752.4300,10.7058
34 | 762.6000,10.7276
35 | 772.7800,10.7907
36 | 782.9500,10.8833
37 | 793.1300,10.9938
38 | 803.3000,11.1044
39 | 813.4800,11.1980
40 | 823.6500,11.2600
41 | 833.8300,11.2824
42 | 844.0000,11.2822
43 | 854.1800,11.2816
44 | 864.3500,11.2809
45 | 874.5300,11.2797
46 | 884.7000,11.2782
47 | 894.8800,11.2771
48 | 905.0500,11.2765
49 | 915.2300,11.2756
50 | 925.4100,11.2754
51 | 912.4500,11.0457
52 | 922.5400,11.0457
53 | 932.6400,11.0457
54 | 942.7300,11.0457
55 | 952.8200,11.0457
56 | 962.9100,11.0457
57 | 972.9900,11.0457
58 | 983.0800,11.0457
59 | 993.1700,11.0457
60 | 1003.3000,11.0457
61 | 1013.3000,11.0457
62 | 1023.4000,11.0451
63 | 1033.4900,11.0423
64 | 1043.5900,11.0372
65 | 1053.6900,11.0302
66 | 1063.7900,11.0218
67 | 1073.8900,11.0122
68 | 1083.9900,11.0013
69 | 1094.0900,10.9871
70 | 1104.1900,10.9732
71 | 1114.1900,10.9572
72 | 1124.2800,10.9418
73 | 1134.3800,10.9248
74 | 1144.4800,10.9065
75 | 1154.5800,10.8884
76 | 1164.6800,10.8696
77 | 1174.7700,10.8513
78 | 1184.8700,10.8335
79 | 1194.9700,10.8154
80 | 1205.0700,10.7979
81 | 1215.1700,10.7822
82 | 1225.1700,10.7663
83 | 1235.2700,10.7520
84 | 1245.3600,10.7385
85 | 1255.4600,10.7270
86 | 1265.5600,10.7174
87 | 1275.6600,10.7091
88 | 1285.7600,10.7022
89 | 1295.8600,10.6970
90 | 1305.9600,10.6946
91 | 1316.0500,10.6937
92 | 1326.0500,10.6949
93 | 1336.1500,10.6996
94 | 1346.2500,10.7058
95 | 1356.3500,10.7163
96 | 1366.4500,10.7283
97 | 1376.5500,10.7437
98 | 1386.6500,10.7612
99 | 1396.7400,10.7807
100 | 1406.8400,10.8034
101 | 1416.9400,10.8267
102 | 1426.9400,10.8534
103 | 1437.0400,10.8818
104 | 1447.1400,10.9110
105 | 1457.2300,10.9422
106 | 1467.3300,10.9743
107 | 1477.4300,11.0074
108 | 1487.5300,11.0414
109 | 1497.6300,11.0759
110 | 1507.7300,11.1108
111 | 1517.8300,11.1461
112 | 1527.9200,11.1811
113 | 1537.9200,11.2156
114 | 1548.0200,11.2496
115 | 1558.1200,11.2826
116 | 1568.2200,11.3146
117 | 1578.3200,11.3460
118 | 1588.4200,11.3753
119 | 1598.5100,11.4037
120 | 1608.6100,11.4302
121 | 1618.7100,11.4538
122 | 1628.8100,11.4760
123 | 1638.8100,11.4958
124 | 1648.9000,11.5133
125 | 1659.0000,11.5286
126 | 1669.1000,11.5404
127 | 1679.2000,11.5505
128 | 1689.3000,11.5580
129 | 1699.4000,11.5621
130 | 1709.5000,11.5634
131 | 1719.6000,11.5617
132 | 1729.7000,11.5563
133 | 1739.7000,11.5477
134 | 1749.7900,11.5346
135 | 1759.8900,11.5193
136 | 1769.9900,11.5002
137 | 1780.0900,11.4789
138 | 1790.1900,11.4548
139 | 1800.2900,11.4279
140 | 1810.3800,11.3994
141 | 1820.4800,11.3688
142 | 1830.5800,11.3366
143 | 1840.5800,11.3036
144 | 1850.6800,11.2696
145 | 1860.7800,11.2363
146 | 1870.8700,11.2007
147 | 1880.9800,11.1666
148 | 1891.0700,11.1333
149 | 1901.1700,11.1018
150 | 1911.2700,11.0714
151 | 1921.3700,11.0424
152 | 1931.4700,11.0155
153 | 1941.5700,10.9912
154 | 1951.5700,10.9698
155 | 1961.6600,10.9508
156 | 1971.7600,10.9355
157 | 1981.8600,10.9230
158 | 1991.9600,10.9139
159 | 2002.0600,10.9083
160 | 2012.1500,10.9069
161 | 2022.2500,10.9057
162 | 2032.3500,10.9013
163 | 2042.4500,10.8951
164 | 2052.4500,10.8854
165 | 2062.5500,10.8740
166 | 2072.6500,10.8591
167 | 2082.7500,10.8429
168 | 2092.8400,10.8242
169 | 2102.9400,10.8039
170 | 2113.0400,10.7820
171 | 2123.1400,10.7592
172 | 2133.2400,10.7342
173 | 2143.3400,10.7092
174 | 2153.3400,10.6834
175 | 2163.4300,10.6572
176 | 2173.5300,10.6312
177 | 2183.6300,10.6052
178 | 2193.7300,10.5803
179 | 2203.8300,10.5560
180 | 2213.9300,10.5328
181 | 2224.0300,10.5101
182 | 2234.1200,10.4904
183 | 2244.2200,10.4722
184 | 2254.2200,10.4552
185 | 2264.3200,10.4408
186 | 2274.4200,10.4285
187 | 2284.5200,10.4197
188 | 2294.6100,10.4129
189 | 2304.7100,10.4088
190 | 2314.8100,10.4077
191 | 2324.9100,10.4077
192 | 2335.0100,10.4077
193 | 2345.1100,10.4077
194 | 2355.2100,10.4077
195 | 2365.2000,10.4077
196 | 2375.3000,10.4077
197 | 2385.4000,10.4077
198 | 2395.5000,10.4077
199 |
--------------------------------------------------------------------------------
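Each row of `hyperion_bands.csv` holds a band center followed by its full-width half-max, which is how `hdr_data` splits the two columns. The sketch below parses the first three rows from an in-memory string so it runs without the file. Treating the values as nanometres and converting FWHM to a Gaussian sigma are assumptions for illustration; this file does not state how the resampling code models the band response.

```python
import io
import numpy as np

# First three rows of hyperion_bands.csv, inlined so the sketch is standalone
sample = io.StringIO("426.8200,11.3871\n436.9900,11.3871\n447.1700,11.3871\n")

table = np.genfromtxt(sample, delimiter=',')
centers, fwhm = table[:, 0], table[:, 1]   # same column split as hdr_data

# Assumption: Gaussian band response, so sigma = FWHM / (2 * sqrt(2 * ln 2))
sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
```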
/fileIO/utils.py:
--------------------------------------------------------------------------------
1 | """
2 | Name: fileIO.py
3 | Author: Ronald Kemker
4 | Description: Input/Output helper functions
5 |
6 | Note:
7 | Requires GDAL, remote_sensing.utils, and display (optional)
8 | http://www.gdal.org/
9 | """
10 |
11 | import numpy as np
12 | import pickle
13 | from scipy.io import loadmat, savemat
14 | from osgeo.gdal import Open
15 | import spectral.io.envi as envi
16 | from numba import jit
17 | from random import sample
18 |
19 | def read(fName):
20 | """File reader
21 |
22 | Parameters
23 | ----------
24 |     fName : string, file path to read in (supports .npy, .sav, .pkl, .mat,
25 |         .tif (GEOTIFF), and .pix) including file extension
26 |
27 | Returns
28 | -------
29 | output : output data (in whatever format)
30 | """
31 | ext = fName[-3:]
32 |
33 | if ext == 'npy':
34 | return np.load(fName)
35 | elif ext == 'sav' or ext == 'pkl':
36 | return pickle.load(open(fName, 'rb'))
37 | elif ext == 'mat':
38 | return loadmat(fName)
39 |     elif ext in ('tif', 'pix'):
40 | return Open(fName).ReadAsArray()
41 | else:
42 | print('Unknown filename extension')
43 | return -1
44 |
45 | def write(fName, data):
46 | """File writer
47 |
48 | Parameters
49 | ----------
50 |     fName : string, file path to write to (supports .npy, .sav, .pkl, and
51 |         .mat) including file extension
52 |     data : data to be stored (a dict of variables for .mat files)
53 |     Returns
54 |     -------
55 |     status : int, 1 on success, -1 if the file extension is unknown
56 | """
57 | ext = fName[-3:]
58 |
59 | if ext == 'npy':
60 | np.save(fName, data)
61 | return 1
62 | elif ext == 'sav' or ext == 'pkl':
63 | pickle.dump(data, open(fName, 'wb'))
64 | return 1
65 | elif ext == 'mat':
66 | savemat(fName, data)
67 | return 1
68 | else:
69 | print('Unknown filename extension')
70 | return -1
71 |
72 | def readENVIHeader(fName):
73 | """
74 | Reads envi header
75 |
76 | Parameters
77 | ----------
78 | fName : String, Path to .hdr file
79 |
80 | Returns
81 | -------
82 | centers : Band-centers
83 | fwhm : full-width half-maxes
84 | """
85 | hdr = envi.read_envi_header(fName)
86 |     centers = np.array(hdr['wavelength'],dtype=np.float64)
87 |     fwhm = np.array(hdr['fwhm'],dtype=np.float64)
88 | return centers, fwhm
89 |
90 | @jit(forceobj=True) # object mode: the body uses random.sample and Python type checks
91 | def patch_extractor_with_mask(data, mask, num_patches=50, patch_size=16):
92 | """
93 | Extracts patches inside a mask. I need to find a faster way of doing this.
94 |
95 | Parameters
96 | ----------
97 | data : 3-D input numpy array [rows x columns x channels]
98 | mask : 2-D binary mask where 1 is valid, 0 else
99 | num_patches : int, number of patches to extract (Default: 50)
100 | patch_size : int, pixel dimension of square patch (Default: 16)
101 |
102 | Returns
103 | -------
104 | output : 4-D Numpy array [num patches x rows x columns x channels]
105 | """
106 | sh = data.shape
107 | patch_arr = np.zeros((num_patches,patch_size, patch_size, sh[-1]),
108 | dtype=data.dtype)
109 |
110 | if type(patch_size) == int:
111 | patch_size = np.array([patch_size, patch_size], dtype = np.int32)
112 |
113 | #Find Valid Regions to Extract Patches
114 | valid = np.zeros(mask.shape, dtype=np.uint8)
115 | for i in range(0, sh[0]-patch_size[0]):
116 | for j in range(0, sh[1]-patch_size[1]):
117 | if np.all(mask[i:i+patch_size[0], j:j+patch_size[1]]) == True:
118 | valid[i,j] += 1
119 |
120 | idx = np.argwhere(valid > 0 )
121 | idx = idx[np.array(sample(range(idx.shape[0]), num_patches))]
122 |
123 | for i in range(num_patches):
124 | patch_arr[i] = data[idx[i,0]:idx[i,0]+patch_size[0],
125 | idx[i,1]:idx[i,1]+patch_size[1]]
126 |
127 | return patch_arr
128 |
129 | @jit(forceobj=True) # object mode: the body uses random.sample and Python type checks
130 | def patch_extractor(data, num_patches=50, patch_size=16):
131 | """
132 |     Extracts randomly located patches from an image (no mask required).
133 |
134 | Parameters
135 | ----------
136 | data : 3-D input numpy array [rows x columns x channels]
137 | num_patches : int, number of patches to extract (Default: 50)
138 | patch_size : int, pixel dimension of square patch (Default: 16)
139 |
140 | Returns
141 | -------
142 | output : 4-D Numpy array [num patches x rows x columns x channels]
143 | """
144 | sh = data.shape
145 | patch_arr = np.zeros((num_patches,patch_size, patch_size, sh[-1]),
146 | dtype=data.dtype)
147 |
148 | if type(patch_size) == int:
149 | patch_size = np.array([patch_size, patch_size], dtype = np.int32)
150 |
151 | #Find Valid Regions to Extract Patches
152 | valid = np.zeros(data.shape[:2], dtype=np.uint8)
153 | valid[0: sh[0]-patch_size[0], 0: sh[1]-patch_size[1]] = 1
154 |
155 | idx = np.argwhere(valid > 0 )
156 | idx = idx[np.array(sample(range(idx.shape[0]), num_patches))]
157 |
158 | for i in range(num_patches):
159 | patch_arr[i] = data[idx[i,0]:idx[i,0]+patch_size[0],
160 | idx[i,1]:idx[i,1]+patch_size[1]]
161 | return patch_arr
162 |
--------------------------------------------------------------------------------
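The docstring of `patch_extractor_with_mask` asks for a faster way to find valid patch origins than the nested Python loops. One possible approach, sketched below, uses an integral image (2-D cumulative sum) so that every window sum costs four array lookups. The function `valid_patch_origins` is hypothetical and not part of this repository. Unlike the original loops, it also includes the last valid row and column.

```python
import numpy as np

def valid_patch_origins(mask, patch_size):
    """Return (row, col) top-left corners whose square patch lies fully in mask.

    Hypothetical helper; mask is 2-D with non-zero meaning valid.
    """
    m = (np.asarray(mask) != 0).astype(np.int64)
    # Integral image with a zero border: ii[r, c] == m[:r, :c].sum()
    ii = np.zeros((m.shape[0] + 1, m.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = m.cumsum(axis=0).cumsum(axis=1)
    p = patch_size
    # Sum of every p x p window from four integral-image lookups
    window_sum = ii[p:, p:] - ii[:-p, p:] - ii[p:, :-p] + ii[:-p, :-p]
    # A window is valid only if all p*p pixels are inside the mask
    return np.argwhere(window_sum == p * p)

mask = np.ones((5, 5), dtype=np.uint8)
mask[0, 0] = 0                      # one invalid pixel in the corner
origins = valid_patch_origins(mask, 2)

# Draw three distinct origins, as the original does with random.sample
rng = np.random.default_rng(0)
picked = origins[rng.choice(len(origins), size=3, replace=False)]
```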
/metrics.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # -*- coding: utf-8 -*-
3 | """
4 | Created on Wed Feb 22 10:42:10 2017
5 |
6 | @author: rmkemker
7 | """
8 |
9 | from sklearn.metrics import confusion_matrix
10 | import numpy as np
11 | import time
12 |
13 | class Metrics():
14 | def __init__(self, truth, prediction):
15 | self.c = confusion_matrix(truth , prediction)
16 |
17 | def _oa(self):
18 | return np.sum(np.diag(self.c))/np.sum(self.c)
19 |
20 | def _aa(self):
21 | return np.mean(np.diag(self.c)/np.sum(self.c, axis=1))
22 |
23 | def _ca(self):
24 | return np.diag(self.c)/np.sum(self.c, axis=1)
25 |
26 | def per_class_accuracies(self):
27 | return self._ca()
28 |
29 | def mean_class_accuracy(self):
30 | return self._aa()
31 |
32 | def overall_accuracy(self):
33 | return self._oa()
34 |
35 | def _kappa(self):
36 | e = np.sum(np.sum(self.c,axis=1)*np.sum(self.c,axis=0))/np.sum(self.c)**2
37 | return (self._oa()-e)/(1-e)
38 |
39 | def standard_metrics(self):
40 | return self._oa(), self._aa(), self._kappa()
41 |
42 | class Timer(object):
43 | def __init__(self, name=None):
44 | self.name = name
45 | def __enter__(self):
46 | self.tic = time.time()
47 | def __exit__(self, type, value, traceback):
48 | if self.name:
49 | print('[%s]' % self.name)
50 | print('Elapsed: %1.3f seconds' % (time.time() - self.tic))
--------------------------------------------------------------------------------
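The `Metrics` class derives overall accuracy, mean-class accuracy, and Cohen's kappa from a confusion matrix. The worked example below repeats those formulas on a small hand-written matrix using only NumPy. The matrix values are made up for illustration.

```python
import numpy as np

# Hypothetical 2-class confusion matrix (rows: truth, columns: prediction)
c = np.array([[5, 1],
              [2, 2]])

oa = np.trace(c) / c.sum()                    # overall accuracy: 7 / 10
per_class = np.diag(c) / c.sum(axis=1)        # per-class accuracy: 5/6 and 2/4
aa = per_class.mean()                         # mean-class accuracy

# Expected chance agreement from the row and column marginals
expected = np.sum(c.sum(axis=1) * c.sum(axis=0)) / c.sum() ** 2
kappa = (oa - expected) / (1 - expected)
```

Here the row sums are 6 and 4 and the column sums are 7 and 3, so the chance agreement is (6·7 + 4·3)/100 = 0.54 and kappa is 0.16/0.46.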
/pipeline.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # -*- coding: utf-8 -*-
3 | """
4 | Created on Thu Mar 23 14:22:36 2017
5 |
6 | @author: Ronald Kemker and Utsav Gewali
7 | """
8 |
9 | import numpy as np
10 | from sklearn.base import BaseEstimator, ClassifierMixin
11 | from preprocessing import ImagePreprocessor
12 | from postprocessing import MRF, CRF
13 | from utils.utils import band_resample_hsi_cube
14 |
15 | class DummyOp(object):
16 | def __init__(self):
17 | pass
18 |
19 | def transform(self, X):
20 | return X
21 |
22 | class Pipeline(BaseEstimator, ClassifierMixin):
23 | """
24 | Pipeline to classify single-image hyperspectral image cube
25 | - Deployable Framework for Remote Sensing Classification
26 |
27 | Note: TODO Link to paper here
28 |
29 | Args:
30 | classifier (obj): A classifier object.
31 | pre_processor (str, list, obj, optional): Pre-process the image data.
32 | It can either be a string ('MinMaxScaler', 'StandardScaler',
33 |             'PCA', 'Normalize', 'Exponential', 'ConeResponse', or 'ReLU'), an
34 |             ImagePreprocessor object, or a list containing one or more of the
35 |             previous strings or ImagePreprocessor objects.
36 | feature_extractor (obj, optional): A spatial-spectral feature extractor
37 | object (e.g. MICA or SCAE).
38 | feature_scaler (str, list, obj, optional): If a feature extractor is
39 | used, you can pre-process the data again prior to classification.
40 | The input format is the same as the pre_processor object.
41 | feature_spatial_padding (2-D tuple): This is the pad_width argument to
42 | numpy.pad. This is used if spatial padding is required for feature
43 | extraction.
44 |         post_processor (str, obj, optional): The post-processing object or
45 |             string with the name of that post-processing object. The built-in
46 |             post-processors are 'MRF' and 'CRF'.
47 | source_bands (array of floats, optional): The band centers of the
48 | source data.
49 | source_fwhm (array of floats, optional): The full-width, half maxes
50 | of the source data. Should be the same shape as source_bands.
51 | target_bands (array of floats, optional): The band centers of the
52 | target data.
53 | target_fwhm (array of floats, optional): The full-width, half maxes
54 | of the target data. Should be the same shape as target_bands.
55 | semisupervised (boolean, optional): If true, the classifier will treat
56 | all labels = -1 as unlabeled training data. Labels < -1 are
57 | ignored. (Default: False)
58 |         **kwargs (optional): Keyword arguments copied onto matching attributes
59 |             of pre_processor and feature_scaler instances created from strings.
60 |             They are not forwarded to the classifier or post_processor.
61 |
62 | Attributes:
63 |
64 | """
65 | def __init__(self, classifier, pre_processor=None, feature_extractor=None,
66 | feature_scaler=[], feature_spatial_padding=None,
67 | post_processor=None,
68 | source_bands=[], source_fwhm=[], target_bands=[],
69 | target_fwhm=[], layer_sizes=None, denoising_cost=None,
70 | semisupervised = False, **kwargs):
71 |
72 | valid_pre = ['MinMaxScaler', 'StandardScaler', 'PCA',
73 | 'GlobalContrastNormalization', 'ZCAWhiten', 'Normalize',
74 | 'Exponential','ReLU','ConeResponse', 'AveragePooling2D']
75 |
76 | valid_post = ['MRF','CRF']
77 |
78 | self.classifier = classifier
79 | self.semisupervised = semisupervised
80 |
81 | self.probability = post_processor is not None
82 | kwargs['probability'] = self.probability
83 |
84 | #Pre-process the data
85 | if pre_processor is None:
86 | pre_processor = []
87 | elif type(pre_processor) is not list:
88 | pre_processor = [pre_processor]
89 |
90 | self.pre_processor = []
91 | for pre in pre_processor:
92 | if type(pre) == str:
93 | if pre in valid_pre:
94 | self.pre_processor.append(ImagePreprocessor(pre))
95 | for k in kwargs.keys():
96 | if k in self.pre_processor[-1].__dict__.keys():
97 | self.pre_processor[-1].__dict__[k] = kwargs[k]
98 | else:
99 | msg = 'Invalid pre_processor type: %s. Built-in options include %s.'
100 | raise ValueError(msg % (pre, str(valid_pre)[1:-1]))
101 | else:
102 | self.pre_processor.append(pre)
103 |
104 |
105 | #Pre-process the data after feature extraction
106 | if type(feature_scaler) is not list:
107 | feature_scaler = [feature_scaler]
108 |
109 | self.feature_scaler = []
110 | for pre in feature_scaler:
111 | if type(pre) == str:
112 | if pre in valid_pre:
113 | self.feature_scaler.append(ImagePreprocessor(pre))
114 | for k in kwargs.keys():
115 | if k in self.feature_scaler[-1].__dict__.keys():
116 | self.feature_scaler[-1].__dict__[k] = kwargs[k]
117 | else:
118 | msg = 'Invalid pre_processor type: %s. Built-in options include %s.'
119 | raise ValueError(msg % (pre, str(valid_pre)[1:-1]))
120 | else:
121 | self.feature_scaler.append(pre)
122 |
123 | if type(post_processor) == str:
124 | if post_processor in valid_post:
125 |                 self.post_processor = {'MRF': MRF, 'CRF': CRF}[post_processor]()
126 | else:
127 | msg = 'Invalid post-processor type: %s. Built-in options include %s.'
128 | raise ValueError(msg % (post_processor, str(valid_post)[1:-1]))
129 | else:
130 | self.post_processor = post_processor
131 |
132 | self.feature_extractor = feature_extractor
133 | self.source_bands = source_bands
134 | self.source_fwhm = source_fwhm
135 | self.target_bands = target_bands
136 | self.target_fwhm = target_fwhm
137 | self.pad = feature_spatial_padding
138 |
139 | def fit(self, X_train, y_train, X_val, y_val):
140 |
141 | """
142 | Fits to training data.
143 |
144 | Args:
145 | X_train (ndarray): the X training data
146 | y_train (ndarray): the y training data
147 | X_val (ndarray): the X validation data
148 | y_val (ndarray): the y validation data
149 | """
150 |
151 |         same = np.array_equal(X_train, X_val)
152 |
153 | #Data pre-processing
154 | for i in range(len(self.pre_processor)):
155 | self.pre_processor[i].fit(X_train)
156 | X_train = self.pre_processor[i].transform(X_train)
157 | if not same:
158 | X_val = self.pre_processor[i].transform(X_val)
159 |
160 | if self.pad:
161 | X_train = np.pad(X_train, self.pad, mode='constant')
162 | if not same:
163 | X_val = np.pad(X_val, self.pad, mode='constant')
164 |
165 | if not self.feature_extractor:
166 | self.feature_extractor = [DummyOp()]
167 | self.target_bands= [[]]
168 | self.target_fwhm = [[]]
169 | elif not isinstance(self.feature_extractor, list):
170 | self.feature_extractor = [self.feature_extractor]
171 | self.target_bands = [self.target_bands]
172 | self.target_fwhm = [self.target_fwhm]
173 |
174 | base_train = np.copy(X_train)
175 | final_train = np.zeros((base_train.shape[:2] + (0, )),dtype=np.float32)
176 | if not same:
177 | base_val = np.copy(X_val)
178 | final_val = np.zeros((base_val.shape[:2] + (0, )),dtype=np.float32)
179 |
180 | for i, fe in enumerate(self.feature_extractor):
181 |             #Band resampling
182 |             if len(self.target_bands[i]) and len(self.target_fwhm[i]) and \
183 |                 len(self.source_bands) and len(self.source_fwhm):
184 |                 X_train = band_resample_hsi_cube(base_train, self.source_bands,
185 |                     self.target_bands[i], self.source_fwhm, self.target_fwhm[i])
186 |                 if not same:
187 |                     X_val = band_resample_hsi_cube(base_val, self.source_bands,
188 |                         self.target_bands[i], self.source_fwhm, self.target_fwhm[i])
189 |             else: #No resampling: restart from the pre-processed cube, as predict() does
190 |                 X_train = np.copy(base_train)
191 |                 if not same:
192 |                     X_val = np.copy(base_val)
193 |             #Feature extraction
194 | X_train = fe.transform(X_train)
195 | if not same:
196 | X_val = fe.transform(X_val)
197 |
198 | final_train = np.append(final_train, X_train, axis=-1)
199 | if not same:
200 | final_val = np.append(final_val, X_val, axis=-1)
201 |
202 |
203 | X_train = np.copy(final_train)
204 | del base_train, final_train
205 | if not same:
206 | X_val = np.copy(final_val)
207 | del base_val, final_val
208 |
209 | if self.pad:
210 | X_train = X_train[self.pad[0][0]:-self.pad[0][1],
211 | self.pad[1][0]:-self.pad[1][1]]
212 | if not same:
213 | X_val = X_val[self.pad[0][0]:-self.pad[0][1],
214 | self.pad[1][0]:-self.pad[1][1]]
215 |
216 | print(X_train.shape)
217 | if len(self.feature_scaler):
218 | for feature_scaler_i in self.feature_scaler:
219 | feature_scaler_i.fit(X_train) #TODO: Combine train and val
220 | X_train = feature_scaler_i.transform(X_train)
221 | if not same:
222 | X_val = feature_scaler_i.transform(X_val)
223 |
224 | if same:
225 | X_val = X_train
226 |
227 | if self.post_processor:
228 | feature2d = X_val
229 | val_idx = y_val.ravel() > -1
230 |
231 | #Reshape to N x F
232 | if len(X_train.shape) > 2:
233 | X_train = X_train.reshape(-1, X_train.shape[-1])
234 | if len(X_val.shape) > 2:
235 | X_val = X_val.reshape(-1, X_val.shape[-1])
236 | if len(y_train.shape) > 1:
237 | y_train = y_train.ravel()
238 | if len(y_val.shape) > 1:
239 | y_val = y_val.ravel()
240 |
241 | if not self.semisupervised:
242 | X_train = X_train[y_train > -1]
243 | y_train = y_train[y_train > -1]
244 | else:
245 | X_train = X_train[y_train >= -1]
246 | y_train = y_train[y_train >= -1]
247 |
248 | X_val = X_val[y_val > -1]
249 | y_val = y_val[y_val > -1]
250 |
251 | print("Train: %d x %d" % (X_train.shape[0], X_train.shape[1]))
252 | print("Val: %d x %d" % (X_val.shape[0], X_val.shape[1]))
253 |
254 | #Fit classifier
255 | self.classifier.fit(X_train, y_train, X_val, y_val)
256 |
257 | #Fit MRF/CRF
258 | if self.post_processor:
259 | sh = feature2d.shape
260 | prob_map = self.classifier.predict_proba(feature2d.reshape(-1, sh[-1]))
261 | prob_map = prob_map.reshape(sh[0], sh[1], -1)
262 | self.post_processor.fit(prob_map+1e-50, y_val, val_idx)
263 |
264 |
265 | def predict(self, X):
266 |
267 | """
268 | Makes predictions based on training and X.
269 |
270 | Args:
271 | X (ndarray): the testing data
272 |
273 | Returns:
274 | pred (ndarray): the predictions of the y data based
275 | on training and X
276 | """
277 |
278 | if len(self.pre_processor):
279 | for j in range(len(self.pre_processor)):
280 | X = self.pre_processor[j].transform(X)
281 |
282 | if self.pad:
283 | X = np.pad(X, self.pad, mode='constant')
284 |
285 | base = np.copy(X)
286 | final = np.zeros(X.shape[:2] + (0,),dtype=np.float32)
287 |
288 | for i, fe in enumerate(self.feature_extractor):
289 | if len(self.target_bands[i]) and len(self.target_fwhm[i]) and \
290 | len(self.source_bands) and len(self.source_fwhm):
291 | X = band_resample_hsi_cube(base, self.source_bands,
292 | self.target_bands[i],
293 | self.source_fwhm,
294 | self.target_fwhm[i])
295 | else:
296 | X = np.copy(base)
297 |
298 | #Feature extraction
299 | final = np.append(final, fe.transform(X) , axis=-1)
300 |
301 |
302 | X = np.copy(final)
303 | del base, final
304 |
305 | if self.pad:
306 | #TODO: generalize this to N-dimensions
307 | X = X[self.pad[0][0]:-self.pad[0][1],
308 | self.pad[1][0]:-self.pad[1][1]]
309 |
310 | if len(self.feature_scaler):
311 | for feature_scaler_i in self.feature_scaler:
312 | X = feature_scaler_i.transform(X)
313 |
314 |
315 | sh = X.shape
316 | if len(X.shape) > 2:
317 | X = X.reshape(-1, X.shape[-1])
318 |
319 | if self.post_processor:
320 | prob = self.classifier.predict_proba(X)
321 | pred = self.post_processor.predict(prob.reshape(sh[0], sh[1], -1)+1e-50)
322 | pred = pred.ravel()
323 | else:
324 | pred = self.classifier.predict(X)
325 |
326 |
327 | return pred
328 |
329 | def predict_proba(self, X):
330 |
331 | """
332 |         Predicts class probabilities for the samples in X.
333 |
334 |         Args:
335 |             X (ndarray): the testing data
336 |
337 |         Returns: ndarray of class probabilities, one row per sample in X
338 | """
339 |
340 | if len(self.pre_processor):
341 | for j in range(len(self.pre_processor)):
342 | X = self.pre_processor[j].transform(X)
343 |
344 | if self.pad:
345 | X = np.pad(X, self.pad, mode='constant')
346 |
347 | base = np.copy(X)
348 | final = np.zeros(X.shape[:2] + (0,),dtype=np.float32)
349 |
350 | for i, fe in enumerate(self.feature_extractor):
351 | if len(self.target_bands[i]) and len(self.target_fwhm[i]) and \
352 | len(self.source_bands) and len(self.source_fwhm):
353 | X = band_resample_hsi_cube(base, self.source_bands,
354 | self.target_bands[i],
355 | self.source_fwhm,
356 | self.target_fwhm[i])
357 | else:
358 | X = np.copy(base)
359 |
360 | #Feature extraction
361 | final = np.append(final, fe.transform(X) , axis=-1)
362 |
363 |
364 | X = np.copy(final)
365 | del base, final
366 |
367 | if self.pad:
368 | #TODO: generalize this to N-dimensions
369 | X = X[self.pad[0][0]:-self.pad[0][1],
370 | self.pad[1][0]:-self.pad[1][1]]
371 |
372 | if len(self.feature_scaler):
373 | for feature_scaler_i in self.feature_scaler:
374 | X = feature_scaler_i.transform(X)
375 |
376 | if len(X.shape) > 2:
377 | X = X.reshape(-1, X.shape[-1])
378 |
379 | proba = self.classifier.predict_proba(X)
380 |
381 | return proba
382 |
--------------------------------------------------------------------------------
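`Pipeline.fit` flattens the image to N x F and then filters pixels by label: labels greater than -1 are labeled samples, -1 marks unlabeled samples kept only in semi-supervised mode, and labels below -1 are always ignored. The sketch below isolates that convention. The helper `select_training_samples` is hypothetical and not part of this repository.

```python
import numpy as np

def select_training_samples(X, y, semisupervised=False):
    """Flatten an image cube and keep only the pixels used for training.

    Hypothetical helper mirroring the label convention in Pipeline.fit:
    y > -1 labeled, y == -1 unlabeled (semi-supervised only), y < -1 ignored.
    """
    X = X.reshape(-1, X.shape[-1])   # rows x cols x F  ->  N x F
    y = y.ravel()
    keep = (y >= -1) if semisupervised else (y > -1)
    return X[keep], y[keep]

# 2 x 2 image with 3 features per pixel and one label of each kind
X = np.arange(12, dtype=np.float32).reshape(2, 2, 3)
y = np.array([[0, -1],
              [-2, 1]])

X_sup, y_sup = select_training_samples(X, y)
X_semi, y_semi = select_training_samples(X, y, semisupervised=True)
```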
/postprocessing.py:
--------------------------------------------------------------------------------
1 | """
2 | Classes for gridwise MRF and dense CRF
3 | Uses the pydensecrf (https://github.com/lucasb-eyer/pydensecrf) and gco_python (https://github.com/amueller/gco_python) libraries
4 | """
5 | import numpy as np
6 | import pydensecrf.densecrf as dcrf
7 | import pygco as gco
8 |
9 | class MRF(object):
10 | """
11 | MRF postprocessing
12 | -uses alpha-beta swap algorithm from Multilabel optimization library GCO (http://vision.csd.uwo.ca/code/)
13 | -unary energy is derived from predicted probability map
14 | -pairwise energy is Potts function (parameter optimized using grid search)
15 | """
16 |
17 | def __init__(self,inference_niter=50, multiplier=1000):
18 | self.INFERENCE_NITER = inference_niter #number of iteration of alpha-beta swap
19 | self.MULTIPLIER = multiplier #GCO uses int32 for energy therefore floating point energy is
20 | #scaled by multiplier and truncated
21 |
22 |
23 |     def fit(self, X, y, idx): #X: probability map, y: labels of the pixels selected by idx
24 | self.unaryEnergy = np.ascontiguousarray(-self.MULTIPLIER*np.log(X)).astype('int32') #energy=-log(probability)
25 | nClass = self.unaryEnergy.shape[2]
26 |
27 | params = np.logspace(-3,3,10) #grid search values for Potts compatibility
28 | best_score = 0.0
29 |
30 | print('Fitting Grid MRF...')
31 | for i in range(params.shape[0]):
32 | pairwiseEnergy = (self.MULTIPLIER*params[i]*(1-np.eye(nClass))).astype('int32')
33 | out = gco.cut_simple(unary_cost=self.unaryEnergy,\
34 | pairwise_cost=pairwiseEnergy,n_iter=self.INFERENCE_NITER,
35 | algorithm='swap').ravel()
36 |
37 | score = np.sum(out[idx]==y)/float(y.size)
38 | if score > best_score:
39 | self.cost = params[i]
40 | best_score = score
41 |
42 | params = np.logspace(np.log10(self.cost)-1,np.log10(self.cost)+1,30)
43 | best_score = 0.0
44 |
45 | print('Finetuning Grid MRF...')
46 | for i in range(params.shape[0]):
47 | pairwiseEnergy = (self.MULTIPLIER*params[i]*(1-np.eye(nClass))).astype('int32')
48 | out = gco.cut_simple(unary_cost=self.unaryEnergy,\
49 | pairwise_cost=pairwiseEnergy,n_iter=self.INFERENCE_NITER,
50 | algorithm='swap').ravel()
51 |
52 | score = np.sum(out[idx]==y)/float(y.size)
53 | if score > best_score:
54 | self.cost = params[i]
55 | best_score = score
56 |
57 | def predict(self, X):
58 | #self.cost = 2.
59 | self.unaryEnergy = np.ascontiguousarray(-self.MULTIPLIER*np.log(X)).astype('int32')
60 | nClass = self.unaryEnergy.shape[2]
61 | pairwiseEnergy = (self.MULTIPLIER*self.cost*(1-np.eye(nClass))).astype('int32')
62 | return gco.cut_simple(unary_cost=self.unaryEnergy,\
63 | pairwise_cost=pairwiseEnergy,n_iter=self.INFERENCE_NITER,algorithm='swap')
64 |
65 |
66 | class CRF(object):
67 | """
68 | CRF postprocessing
69 | -uses dense CRF with Gaussian edge potential (https://arxiv.org/abs/1210.5644)
70 | -unary energy is derived from predicted probability map
71 | -pairwise energy is Potts function (parameters optimized using grid search)
72 | """
73 |
74 | def __init__(self, inference_niter=10):
75 | self.INFERENCE_NITER = inference_niter #number of mean field iteration in dense CRF
76 | self.w1 = 1.
77 | self.scale = 1000.
78 |
79 | def fit(self, X, y, idx):
80 | [self.nRows,self.nCols,self.nClasses] = X.shape
81 | self.nPixels = self.nRows*self.nCols
82 | prob_list = X.reshape([self.nPixels,self.nClasses])
83 | self.uEnergy = -np.log(prob_list.transpose().astype('float32')) #energy=-log(probability)
84 |
85 | params = [np.logspace(-3,3,10) for i in range(2)]
86 | params = np.meshgrid(*params)
87 | params = np.vstack([x.ravel() for x in params]).T
88 |
89 | best_score = 0.0
90 |
91 | print('Fitting Fully-Connected CRF...')
92 | for i in range(params.shape[0]):
93 | crf = dcrf.DenseCRF2D(self.nRows,self.nCols,self.nClasses)
94 | crf.setUnaryEnergy(np.ascontiguousarray(self.uEnergy))
95 | crf.addPairwiseGaussian(sxy=params[i,0], compat=params[i,1])
96 |
97 | Q = crf.inference(self.INFERENCE_NITER)
98 | pred = np.argmax(Q,axis=0)
99 | score = np.sum(pred[idx]==y)/float(y.size)
100 |
101 | if score > best_score:
102 | self.w1 = params[i,1]
103 | self.scale = params[i,0]
104 | best_score = score
105 |
106 | params = [np.logspace(np.log10(x)-1,np.log10(x)+1,10) for x in [self.scale, self.w1]]
107 | params = np.meshgrid(*params)
108 | params = np.vstack([x.ravel() for x in params]).T
109 |
110 | best_score = 0.0
111 |
112 | print('Finetuning Fully-Connected CRF...')
113 | for i in range(params.shape[0]):
114 | crf = dcrf.DenseCRF2D(self.nRows,self.nCols,self.nClasses)
115 | crf.setUnaryEnergy(np.ascontiguousarray(self.uEnergy))
116 | crf.addPairwiseGaussian(sxy=params[i,0], compat=params[i,1])
117 |
118 | Q = crf.inference(self.INFERENCE_NITER)
119 | pred = np.argmax(Q,axis=0)
120 | score = np.sum(pred[idx]==y)/float(y.size)
121 |
122 | if score > best_score:
123 | self.w1 = params[i,1]
124 | self.scale = params[i,0]
125 | best_score = score
126 |
127 | return 1
128 |
129 | def predict(self, X):
130 | [self.nRows,self.nCols,self.nClasses] = X.shape
131 | self.nPixels = self.nRows*self.nCols
132 | prob_list = X.reshape([self.nPixels,self.nClasses])
133 | self.uEnergy = -np.log(prob_list.transpose().astype('float32'))
134 |
135 | crf = dcrf.DenseCRF2D(self.nRows,self.nCols,self.nClasses)
136 | crf.setUnaryEnergy(np.ascontiguousarray(self.uEnergy))
137 | crf.addPairwiseGaussian(sxy=self.scale, compat=self.w1)
138 |
139 | Q = crf.inference(self.INFERENCE_NITER)
140 | pred = np.argmax(Q,axis=0)
141 | return pred.reshape([self.nRows,self.nCols])
142 |
143 |
--------------------------------------------------------------------------------
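Both post-processors turn the classifier's probability map into a unary energy of -log(probability), and the MRF pairs it with a Potts pairwise cost scaled to int32. The sketch below builds those two arrays with NumPy alone, without calling pygco. The function `potts_energies` is hypothetical, and clipping with `eps` is a choice made here in place of the `+1e-50` offset applied in `pipeline.py`.

```python
import numpy as np

def potts_energies(prob_map, cost, multiplier=1000, eps=1e-10):
    """Build int32 unary and Potts pairwise energies from a probability map.

    Hypothetical helper; prob_map is rows x cols x classes. Clipping at eps
    (an assumption of this sketch) keeps log() finite for zero probabilities.
    """
    prob = np.clip(prob_map, eps, 1.0)
    # energy = -log(probability), scaled and truncated to integers
    unary = np.ascontiguousarray(-multiplier * np.log(prob)).astype(np.int32)
    n_class = prob.shape[-1]
    # Potts model: zero cost for equal labels, constant cost otherwise
    pairwise = (multiplier * cost * (1 - np.eye(n_class))).astype(np.int32)
    return unary, pairwise

# One row, two pixels, two classes
prob_map = np.array([[[0.5, 0.5],
                      [1.0, 0.0]]])
unary, pairwise = potts_energies(prob_map, cost=2.0)
```

The pairwise matrix has zeros on the diagonal, so neighbouring pixels pay the Potts cost only when their labels differ.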
/preprocessing.py:
--------------------------------------------------------------------------------
1 | """
2 | Name: preprocessing.py
3 | Author: Ronald Kemker
4 | Description: Image Pre-processing workflow
5 | Note:
6 | Requires scikit-learn
7 | """
8 |
9 | from sklearn.preprocessing import StandardScaler, MinMaxScaler
10 | from sklearn.decomposition import PCA
11 | from utils.convolution import mean_pooling
12 | import numpy as np
13 | from scipy.stats import expon
14 |
15 | class ConeResponse():
16 |     def __init__(self, eps=0.05):
17 |         self.eps = eps
18 |     def fit(self, X):
19 |         pass
20 |     def transform(self, X):
21 |         return (np.log(self.eps)-np.log(X + self.eps))/(np.log(self.eps)-np.log(1 + self.eps))
22 |     def fit_transform(self, X):
23 |         self.fit(X)
24 |         return self.transform(X)
25 |
26 | class ReLU():
27 |     def __init__(self):
28 |         pass
29 |     def fit(self, X):
30 |         pass
31 |     def transform(self, X):
32 |         # np.maximum returns a new array, so the caller's data is not modified in place
33 |         return np.maximum(X, 0)
34 |     def fit_transform(self, X):
35 |         self.fit(X)
36 |         return self.transform(X)
37 |
38 | class Exponential():
39 |     """
40 |     Class used to exponentially scale data by fitting and transforming it
41 |     """
42 |     def __init__(self):
43 |         pass
44 |     def fit(self, X):
45 |         """
46 |         Sets the per-channel scale based on the input data
47 |         Args:
48 |             X (array): the data to be used to set the scale
49 |         """
50 |         self.scale = np.zeros(X.shape[-1], dtype=np.float32)
51 |         for i in range(0,X.shape[-1]):
52 |             _, self.scale[i] = expon.fit(X[:,i],floc=0)
53 |     def transform(self, X):
54 |         """
55 |         Transforms data based on the scale set in the fit function
56 |         Args:
57 |             X (array): the data to be transformed according to the scale
58 |         Returns: the transformed data
59 |         """
60 |         return 1-np.exp(-np.abs(X)/self.scale)
61 |     def fit_transform(self, X):
62 |         """
63 |         Performs both the fit and the transform methods on the same input data
64 |         Args:
65 |             X (array): the data to be used to set the scale then transformed
66 |         Returns: the result of transforming the X data after fitting it
67 |         """
68 |         self.fit(X)
69 |         return self.transform(X)
70 |
71 | class Normalize():
72 |     """
73 |     Class used to normalize data by dividing each channel by its norm
74 |     Args:
75 |         norm_order: order of the norm passed to np.linalg.norm (None = L2)
76 |     """
77 |     def __init__(self, norm_order=None):
78 |         self.norm_order = norm_order
79 |     def fit(self, X):
80 |         """
81 |         Sets the per-channel scale from the input data; uses np.linalg.norm
82 |         Args:
83 |             X (array): the data to be used to set the scale
84 |         """
85 |         self.ord = np.linalg.norm(X, ord=self.norm_order, axis=0)
86 |         self.ord[self.ord==0] = 1e-7
87 |     def transform(self, X):
88 |         """
89 |         Transforms data based on the scale set in the fit function
90 |         Args:
91 |             X (array): the data to be transformed according to the scale
92 |         Returns: the transformed data
93 |         """
94 |         return X/self.ord
95 |     def fit_transform(self , X):
96 |         """
97 |         Performs both the fit and the transform methods on the same input data
98 |         Args:
99 |             X (array): the data to be used to set the scale then transformed
100 |         Returns: the result of transforming the X data after fitting it
101 |         """
102 |         self.fit(X)
103 |         return self.transform(X)
104 |
105 | class AveragePooling2D():
106 |
107 |     def __init__(self, pool_size=5):
108 |         self.pool_size = pool_size
109 |     def fit(self, X):
110 |         pass
111 |     def transform(self, X):
112 |         return mean_pooling(X, self.pool_size)
113 |     def fit_transform(self , X):
114 |         return self.transform(X)
115 |
116 | class ImagePreprocessor():
117 |
118 |     def __init__(self, mode='StandardScaler' , feature_range=(0,1),
119 |                  with_std=True, PCA_components=None, svd_solver='auto',
120 |                  norm_order=None, eps=0.05, whiten=True):
121 |         self.mode = mode.lower()
122 |         self.with_std = with_std
123 |         if self.mode == 'standardscaler':
124 |             self.sclr = StandardScaler(with_std=with_std)
125 |         elif self.mode == 'minmaxscaler':
126 |             self.sclr = MinMaxScaler(feature_range = tuple(feature_range))
127 |         elif self.mode == 'pca':
128 |             self.sclr = PCA(n_components=PCA_components,whiten=whiten,
129 |                             svd_solver=svd_solver)
130 |         elif self.mode == 'normalize':
131 |             self.sclr = Normalize(norm_order)
132 |         elif self.mode == 'exponential':
133 |             self.sclr = Exponential()
134 |         elif self.mode == 'coneresponse':
135 |             self.sclr = ConeResponse(eps)
136 |         elif self.mode == 'relu':
137 |             self.sclr = ReLU()
138 |         else:
139 |             raise ValueError('Invalid pre-processing mode: ' + mode)
140 |
141 |     def fit(self, data):
142 |         data = data.reshape(-1,data.shape[-1])
143 |         self.sclr.fit(data)
144 |     def transform(self,data):
145 |         sh = data.shape
146 |         data = data.reshape(-1,sh[-1])
147 |         data = self.sclr.transform(data)
148 |         return data.reshape(sh[:-1] + (data.shape[-1],))
149 |     def fit_transform(self, data):
150 |         sh = data.shape
151 |         data = data.reshape(-1,sh[-1])
152 |         data = self.sclr.fit_transform(data)
153 |         return data.reshape(sh[:-1] + (data.shape[-1],))
154 |     def get_params(self):
155 |         if self.mode == 'standardscaler':
156 |             if self.with_std:
157 |                 return self.sclr.mean_ , self.sclr.scale_
158 |             else:
159 |                 return self.sclr.mean_
160 |         elif self.mode == 'minmaxscaler':
161 |             return self.sclr.data_min_, self.sclr.data_max_
162 |         elif self.mode == 'pca':
163 |             return self.sclr.components_
164 |         elif self.mode == 'normalize':
165 |             return self.sclr.ord
166 |         elif self.mode == 'exponential':
167 |             return self.sclr.scale
168 |         else:
169 |             return None
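
The flatten, scale, reshape pattern used by `ImagePreprocessor` can be illustrated without scikit-learn. The following is a minimal NumPy-only sketch of per-channel L2 normalization on a synthetic cube; `normalize_cube` and the random test data are hypothetical names introduced here for illustration and are not part of this repository.

```python
import numpy as np

def normalize_cube(cube, eps=1e-7):
    """Divide each channel of an (H, W, C) cube by its L2 norm (illustrative helper)."""
    sh = cube.shape
    # Flatten the spatial dimensions so that each column holds one channel
    flat = cube.reshape(-1, sh[-1]).astype(np.float64)
    norms = np.linalg.norm(flat, axis=0)
    norms[norms == 0] = eps  # guard against division by zero
    # Scale per channel, then restore the original image shape
    return (flat / norms).reshape(sh), norms

rng = np.random.default_rng(0)
cube = rng.random((4, 5, 3))          # synthetic 4 x 5 image with 3 channels
scaled, norms = normalize_cube(cube)
channel_norms = np.linalg.norm(scaled.reshape(-1, 3), axis=0)
```

After scaling, every channel of `scaled` should have unit L2 norm, which is what `channel_norms` checks.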
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | """
2 | setup.py
3 | Downloads and installs dependencies, datasets, and pretrained filters.
4 | """
5 | import os
6 | import subprocess
7 | import sys
8 | import urllib.request
9 | import zipfile
10 |
11 |
12 | def main():
13 |     #install numpy
14 |     pip_install('numpy')
15 |
16 |     #install scipy
17 |     pip_install('scipy')
18 |
19 |     #install matplotlib
20 |     pip_install('matplotlib')
21 |
22 |     #install scikit-learn (the 'sklearn' name on PyPI is a deprecated alias)
23 |     pip_install('scikit-learn')
24 |
25 |     #install spectral python
26 |     pip_install('spectral')
27 |
28 |     #install gdal
29 |     pip_install('gdal')
30 |
31 |     #install tensorflow-gpu
32 |     pip_install('tensorflow-gpu')
33 |
34 |     #install pydensecrf from https://github.com/lucasb-eyer/pydensecrf
35 |     pip_install('git+https://github.com/lucasb-eyer/pydensecrf.git')
36 |
37 |     #install py_gco from https://github.com/amueller/gco_python
38 |     pip_install('git+https://github.com/amueller/gco_python')
39 |
40 |
41 |     #download and install datasets and pretrained filters
42 |     pretrained_filters = [['http://www.cis.rit.edu/~rmk6217/scae.zip', './feature_extraction/pretrained'],
43 |                           ['http://www.cis.rit.edu/~rmk6217/smcae.zip', './feature_extraction/pretrained']]
44 |     datasets = [['http://www.cis.rit.edu/~rmk6217/datasets.zip', './']]
45 |     to_download = pretrained_filters + datasets
46 |     for i in range(len(to_download)):
47 |         print('Downloading ' + to_download[i][0] + ' .....')
48 |         download_and_unzip(to_download[i][0],to_download[i][1])
49 |
50 | # pip.main() is not part of pip's public API (removed in pip 10), so pip is run as a subprocess
51 | def pip_install(pkg):
52 |     subprocess.check_call([sys.executable, '-m', 'pip', 'install', pkg])
53 |
54 | def download_and_unzip(url, out_folder):
55 |     tempfile_path = './tmp_dw_file.zip'
56 |     # makedirs also creates any missing parent folders
57 |     os.makedirs(out_folder, exist_ok=True)
58 |
59 |     urllib.request.urlretrieve(url, tempfile_path)
60 |     # the context manager closes the archive even if extraction fails
61 |     with zipfile.ZipFile(tempfile_path, 'r') as zip_ref:
62 |         zip_ref.extractall(out_folder)
63 |     os.remove(tempfile_path)
64 |
65 |
66 | if __name__ == '__main__':
67 |     main()
68 |
--------------------------------------------------------------------------------
/tf_initializers.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # -*- coding: utf-8 -*-
3 | """
4 | Created on Wed Aug 16 15:32:06 2017
5 |
6 | @author: Ronald Kemker
7 | """
8 |
9 | import tensorflow as tf
10 | import math
11 | import numpy as np
12 |
13 | def get(initializer):
14 |     return eval(initializer) if isinstance(initializer, str) else initializer
15 |
16 | def zeros(shape, name=None):
17 |     return tf.Variable(tf.zeros([shape]), name=name)
18 |
19 | def ones(shape, name=None):
20 |     return tf.Variable(tf.ones([shape]), name=name)
21 |
22 | def constant(shape, value=0, name=None):
23 |     return tf.Variable(value * tf.ones([shape]), name=name)
24 |
25 | def glorot_normal(shape, name=None):
26 |     return _normal_base(shape, name, stddev=math.sqrt(2 / np.sum(shape)))
27 |
28 | def he_normal(shape, name=None):
29 |     return _normal_base(shape, name, stddev=math.sqrt(2 / shape[0]))
30 |
31 | def lecun_normal(shape, name=None):
32 |     return _normal_base(shape, name, stddev=math.sqrt(1 / shape[0]))
33 |
34 | def _normal_base(shape, name, stddev):
35 |     return tf.Variable(tf.random_normal(shape, stddev=stddev), name=name)
36 |
--------------------------------------------------------------------------------
/utils/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rmkemker/EarthMapper/098392c405feebc79606c01b3c1c4bb48999f224/utils/__init__.py
--------------------------------------------------------------------------------
/utils/convolution.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # -*- coding: utf-8 -*-
3 | """
4 | Created on Wed Feb 22 10:19:42 2017
5 |
6 | @author: Ronald Kemker
7 | """
8 |
9 | import numpy as np
10 | import tensorflow as tf
11 | from scipy.ndimage import convolve
12 |
13 | def meanPooling(X, poolSize):
14 |     """
15 |     meanPooling - Average pooling filter across data
16 |     Input:  X - [height x width x channels]
17 |             poolSize - (int) Receptive field size of mean pooling filter
18 |     Output: output - [height x width x channels]
19 |     """
20 |     X = np.float32(X[np.newaxis])
21 |     with tf.Graph().as_default():
22 |         X_t = tf.placeholder(tf.float32, shape=X.shape)
23 |         pool_filter = tf.ones((poolSize, poolSize, X.shape[-1], 1), tf.float32)/poolSize**2
24 |         half = int(poolSize/2)
25 |         pad_t = tf.pad(X_t , tf.constant([[0,0],[half, half], [half, half],[0,0]]), 'REFLECT')
26 |         pool_op = tf.nn.depthwise_conv2d(pad_t, pool_filter, strides=[1,1,1,1], padding='VALID')
27 |         with tf.Session() as sess:
28 |             return sess.run(pool_op, {X_t:X})[0]
29 |
30 | def conv2d(X, w):
31 |
32 |     with tf.Graph().as_default():
33 |         X = np.float32(X[np.newaxis])
34 |         X_t = tf.placeholder(tf.float32, name='X_t')
35 |         w_t = tf.placeholder(tf.float32, name='w_t')
36 |
37 |         conv_op = tf.nn.conv2d(X_t, w_t, strides=[1,1,1,1], padding='SAME')
38 |
39 |         nb_filters = w.shape[3]
40 |
41 |         with tf.Session() as sess:
42 |
43 |             if nb_filters <= 64:
44 |                 return sess.run(conv_op, {X_t:X, w_t:w})[0]
45 |             else:
46 |                 output = np.zeros((X.shape[1], X.shape[2], nb_filters))
47 |                 for i in range(0, nb_filters, 64):
48 |                     idx = np.arange(i,np.min([i+64, nb_filters]))
49 |                     output[:,:,idx] = sess.run(conv_op, {X_t:X,w_t:w[:,:,:,idx]})[0]
50 |                 return output  # already [height x width x filters]; output[0] would drop all but the first row
51 |
52 | def mean_pooling(X, pool_size):
53 |
54 |     X = np.float32(X)
55 |     sh = X.shape
56 |     f = np.ones((pool_size, pool_size) , dtype=np.float32) / pool_size**2
57 |
58 |     result = np.zeros(sh, dtype=np.float32)
59 |
60 |     for i in range(sh[-1]):
61 |         result[:,:,i] = convolve(X[:,:,i], f, mode='reflect')
62 |     return result
63 |
64 | if __name__ == "__main__":
65 |
66 |     X = np.float32(np.random.rand(610,340,256*5*3))
67 |     result = mean_pooling(X, 5)
68 |
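
The per-band pooling in `mean_pooling` can be sketched with NumPy alone. This is an illustrative equivalent under stated assumptions, not the repository's implementation: `mean_pool_2d` is a hypothetical name, the pool size is assumed to be odd, and NumPy's `'symmetric'` padding is used because it mirrors the edge pixel in the same way as SciPy's `mode='reflect'`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def mean_pool_2d(band, pool_size):
    """Mean-pool one 2-D band with stride 1 (illustrative helper, odd pool_size assumed)."""
    half = pool_size // 2
    # Mirror the borders so that the output keeps the input's height and width
    padded = np.pad(band.astype(np.float64), half, mode='symmetric')
    # One (pool_size x pool_size) window per output pixel
    windows = sliding_window_view(padded, (pool_size, pool_size))
    return windows.mean(axis=(-2, -1))

band = np.arange(9, dtype=np.float64).reshape(3, 3)
pooled = mean_pool_2d(band, 3)
```

For this 3 x 3 input the centre window covers the whole band, so `pooled[1, 1]` is the mean of the values 0 through 8.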
--------------------------------------------------------------------------------
/utils/train_val_split.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # -*- coding: utf-8 -*-
3 | """
4 | Created on Tue Jul 25 10:58:44 2017
5 |
6 | @author: rmk6217
7 | """
8 |
9 | import numpy as np
10 |
11 | def random_data_split(y, train_samples, classwise=True, ignore_zero=True,
12 |                       shuffle=False):
13 |     """Split label data into two folds.
14 |
15 |     Args:
16 |         y (array of ints): M x N array of labels in the range [0, K]
17 |         train_samples (float, int, array): Determines how many samples go to the
18 |             training fold. A float in [0, 1] is a fraction, an int is an absolute
19 |             count, and an array gives one count per class.
20 |         classwise (boolean): if True, training samples will be picked with
21 |             respect to the classes rather than with respect to the image.
22 |         ignore_zero (boolean): if True, neither fold will contain the 0 class.
23 |         shuffle (boolean): if True, shuffle the sample indices before splitting.
24 |
25 |     Returns:
26 |         Two arrays of ints (y_train, y_test), both M x N. Samples allocated to the
27 |         other fold are set to -1. If `ignore_zero`, all labels are decremented by 1.
28 |     """
29 |
30 |     if not isinstance(train_samples, np.ndarray):
31 |         relative = isinstance(train_samples, float) and 0 <= train_samples <= 1
32 |         if not relative:
33 |             cutoff = train_samples
34 |
35 |     y = np.array(y)
36 |     sh = y.shape
37 |     y = y.ravel()
38 |
39 |     num_classes = np.max(y) + 1
40 |
41 |     y_train = np.ones(np.prod(sh), dtype=np.int32) * -1
42 |     y_test = np.ones(np.prod(sh), dtype=np.int32) * -1
43 |
44 |     if isinstance(train_samples, np.ndarray):
45 |         start = 1 if ignore_zero else 0
46 |         for i in range(start, num_classes):
47 |             idx = np.where(y == i)[0]
48 |             if shuffle:
49 |                 np.random.shuffle(idx)
50 |             y_train[idx[:train_samples[i]]] = i
51 |             y_test[idx[train_samples[i]:]] = i
52 |
53 |     elif classwise:
54 |         start = 1 if ignore_zero else 0
55 |         for i in range(start, num_classes):
56 |             idx = np.where(y == i)[0]
57 |             if relative:
58 |                 cutoff = int(round(len(idx) * train_samples))
59 |             if shuffle:
60 |                 np.random.shuffle(idx)
61 |             y_train[idx[:cutoff]] = i
62 |             y_test[idx[cutoff:]] = i
63 |     else:
64 |         if ignore_zero:
65 |             idx = np.where(y > 0)[0]
66 |         else:
67 |             idx = np.arange(len(y))
68 |         if shuffle:
69 |             np.random.shuffle(idx)
70 |         if relative:
71 |             cutoff = int(round(len(idx) * train_samples))
72 |         y_train[idx[:cutoff]] = y[idx[:cutoff]]
73 |         y_test[idx[cutoff:]] = y[idx[cutoff:]]
74 |     y_train = y_train.reshape(sh)
75 |     y_test = y_test.reshape(sh)
76 |
77 |     if ignore_zero:
78 |         y_train[y_train > 0] -= 1
79 |         y_test[y_test > 0] -= 1
80 |
81 |     return y_train, y_test
82 |
83 |
84 | def train_val_split(y, train_samples, val_samples, classwise=True, ignore_zero=True):
85 |     """Randomly split label data into two folds.
86 |
87 |     Args:
88 |         y (array of ints): M x N array of labels in the range [0, K]
89 |         train_samples (int, array): Determines how many samples will go to the
90 |             training fold. An array gives one count per class.
91 |         val_samples (int, array): Determines how many samples will go to the
92 |             validation fold. An array gives one count per class.
93 |         classwise (boolean): if True, training samples will be picked with
94 |             respect to the classes rather than with respect to the image.
95 |         ignore_zero (boolean): if True, neither fold will contain the 0 class.
96 |
97 |     Returns:
98 |         Two arrays of ints (y_train, y_val), both M x N. Samples not allocated to
99 |         a fold are set to -1. If `ignore_zero`, all labels are decremented by 1.
100 |     """
101 |
102 |
103 |     y = np.array(y)
104 |     sh = y.shape
105 |     y = y.ravel()
106 |
107 |     num_classes = np.max(y) + 1
108 |
109 |     y_train = np.ones(np.prod(sh), dtype=np.int32) * -1
110 |     y_val = np.ones(np.prod(sh), dtype=np.int32) * -1
111 |
112 |     if isinstance(train_samples, np.ndarray):
113 |
114 |         train_samples = np.int32(train_samples)
115 |         val_samples = np.int32(val_samples)
116 |         start = 1 if ignore_zero else 0
117 |         for i in range(start, num_classes):
118 |             idx = np.where(y == i)[0]
119 |             np.random.shuffle(idx)
120 |             y_train[idx[:train_samples[i]]] = i
121 |             y_val[idx[train_samples[i]:train_samples[i]+val_samples[i]]] = i
122 |
123 |     elif classwise:
124 |         start = 1 if ignore_zero else 0
125 |         train_samples = np.int32(train_samples)
126 |         val_samples = np.int32(val_samples)
127 |         for i in range(start, num_classes):
128 |             idx = np.where(y == i)[0]
129 |
130 |             np.random.shuffle(idx)
131 |             y_train[idx[:train_samples]] = i
132 |             y_val[idx[train_samples:train_samples+val_samples]] = i
133 |     else:
134 |         if ignore_zero:
135 |             idx = np.where(y > 0)[0]
136 |         else:
137 |             idx = np.arange(len(y))
138 |         np.random.shuffle(idx)
139 |
140 |         y_train[idx[:train_samples]] = y[idx[:train_samples]]
141 |         y_val[idx[train_samples:train_samples+val_samples]] = y[idx[train_samples:train_samples+val_samples]]
142 |     y_train = y_train.reshape(sh)
143 |     y_val = y_val.reshape(sh)
144 |
145 |     if ignore_zero:
146 |         y_train[y_train > 0] -= 1
147 |         y_val[y_val > 0] -= 1
148 |
149 |     return y_train, y_val
150 |
151 |
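
The class-wise branch of the split functions above can be condensed into a short standalone sketch. It assumes label 0 is background and should be ignored; `classwise_split`, its `seed` argument, and the toy label map are hypothetical and only illustrate the idea of drawing a fixed number of pixels per class.

```python
import numpy as np

def classwise_split(y, n_train, seed=0):
    """Put n_train pixels per class in the training fold and the rest in the test fold."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    flat = y.ravel()
    # -1 marks pixels that do not belong to a fold
    y_train = np.full(flat.shape, -1, dtype=np.int32)
    y_test = np.full(flat.shape, -1, dtype=np.int32)
    for c in range(1, flat.max() + 1):      # label 0 is treated as background
        idx = np.flatnonzero(flat == c)
        rng.shuffle(idx)
        # Labels are shifted down by one so that classes start at 0
        y_train[idx[:n_train]] = c - 1
        y_test[idx[n_train:]] = c - 1
    return y_train.reshape(y.shape), y_test.reshape(y.shape)

labels = np.array([[0, 1, 1, 1],
                   [2, 2, 2, 0],
                   [1, 2, 1, 2]])
y_train, y_test = classwise_split(labels, n_train=2)
```

Each of the two classes has five labelled pixels here, so two land in `y_train` and three in `y_test`, while background pixels stay at -1 in both folds.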
--------------------------------------------------------------------------------
/utils/utils.py:
--------------------------------------------------------------------------------
1 | """
2 | Name: utils.py
3 | Author: Ronald Kemker
4 | Description: Helper functions for remote sensing applications.
5 |
6 | Note:
7 | Requires SpectralPython and GDAL
8 | http://www.spectralpython.net/
9 | http://www.gdal.org/
10 | """
11 |
12 | import numpy as np
13 | from spectral import BandResampler
14 | import os
15 |
16 | def band_resample_hsi_cube(data, bands1, bands2, fwhm1,fwhm2 , mask=None):
17 |     """
18 |     band_resample_hsi_cube : Resample hyperspectral image
19 |
20 |     Parameters
21 |     ----------
22 |     data : numpy array (height x width x spectral bands)
23 |     bands1 : numpy array [1 x num source bands],
24 |         the band centers of the input hyperspectral cube
25 |     bands2 : numpy array [1 x num target bands],
26 |         the band centers of the output hyperspectral cube
27 |     fwhm1 : numpy array [1 x num source bands],
28 |         the full-width half-max of the input hyperspectral cube
29 |     fwhm2 : numpy array [1 x num target bands],
30 |         the full-width half-max of the output hyperspectral cube
31 |     mask : numpy array (height x width), optional boolean mask selecting the
32 |         pixels to resample; unselected pixels are set to zero.
33 |
34 |     Returns
35 |     -------
36 |     output : numpy array (height x width x num target bands)
37 |
38 |     """
39 |     resample = BandResampler(bands1,bands2,fwhm1,fwhm2)
40 |     dataSize = data.shape
41 |     data = data.reshape((-1,dataSize[2]))
42 |
43 |     if mask is None:
44 |         out = resample.matrix.dot(data.T).T
45 |     else:
46 |         out = np.zeros((data.shape[0], len(bands2)), dtype=data.dtype)
47 |         mask = mask.ravel().astype(bool)
48 |         out[mask] = resample.matrix.dot(data[mask].T).T
49 |
50 |     out[np.isnan(out)] = 0
51 |     return out.reshape((dataSize[0],dataSize[1],len(bands2)))
52 |
53 |
54 | def set_gpu(device=None):
55 |
56 |     if device is None:
57 |         device=""
58 |     os.environ['CUDA_VISIBLE_DEVICES'] = str(device)
59 |
--------------------------------------------------------------------------------