├── .gitignore
├── README.md
├── __init__.py
├── classifier
│   ├── __init__.py
│   ├── mlp.py
│   ├── rf_workflow.py
│   ├── ssmlp.py
│   ├── svm_cv_workflow.py
│   └── svm_workflow.py
├── examples
│   ├── example_pipeline.py
│   └── example_scae_svm.py
├── feature_extraction
│   ├── __init__.py
│   ├── cae.py
│   ├── mcae.py
│   └── scae.py
├── fileIO
│   ├── __init__.py
│   ├── aviris.py
│   ├── aviris_bands.hdr
│   ├── gliht.py
│   ├── hyperion.py
│   ├── hyperion_bands.csv
│   └── utils.py
├── metrics.py
├── pipeline.py
├── postprocessing.py
├── preprocessing.py
├── setup.py
├── tf_initializers.py
└── utils
    ├── __init__.py
    ├── convolution.py
    ├── train_val_split.py
    └── utils.py
/.gitignore:
--------------------------------------------------------------------------------
1 | *.hdf5
2 | *feature_extraction/pre_trained/*
3 | *checkpoint/
4 | *__pycache__
5 | *checkpoints*
6 | *checkpoint*
7 | *.png
8 | *.mat
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # EarthMapper #
2 |
3 | Project repository for EarthMapper. This is a toolbox for the semantic segmentation of non-RGB (i.e., multispectral/hyperspectral) imagery. We will work on adding more examples and better documentation.
4 |
5 |

6 |
7 |

8 |
9 | ## Description ##
10 |
11 | This is a classification pipeline from various projects that we have worked on over the past few years. Currently available options include:
12 |
13 | ### Pre-Processing ###
14 | * MinMaxScaler - Scale data (per-channel) between a given feature range (e.g., 0-1)
15 | * StandardScaler - Scale data (per-channel) to zero-mean/unit-variance
16 | * PCA - Reduce dimensionality via principal component analysis
17 | * Normalize - Scale data using the per-channel L2 norm
18 |
19 | ### Spatial-Spectral Feature Extraction ###
20 | * Stacked Convolutional Autoencoder (SCAE)
21 | * Stacked Multi-Loss Convolutional Autoencoder (SMCAE)
22 |
23 | ### Classifiers ###
24 | * SVMWorkflow - Support vector machine with a given training/validation split
25 | * SVMCVWorkflow - Support vector machine that uses n-fold cross-validation to find optimal hyperparameters
26 | * RandomForestWorkflow - Random Forest classifier
27 | * MLP - Multi-layer Perceptron Neural Network classifier
28 | * SSMLP - Semi-supervised MLP Neural Network classifier
29 |
30 | ### Post-Processors ###
31 | * Markov Random Field (MRF)
32 | * Fully-Connected Conditional Random Field (CRF)
33 |
34 | ## Dependencies ##
35 | * Python 3.5 (We recommend the [Anaconda Python Distribution](https://www.anaconda.com/download/))
36 | * numpy, scipy, and matplotlib
37 | * [scikit-learn](http://scikit-learn.org/stable/)
38 | * [spectral python](http://www.spectralpython.net/)
39 | * [gdal](http://www.gdal.org/)
40 | * [tensorflow](https://www.tensorflow.org/)
41 | * [pydensecrf](https://github.com/lucasb-eyer/pydensecrf)
42 | * [gco_python](https://github.com/amueller/gco_python)
43 |
44 | ## Instructions ##
45 |
46 | ### Installation ###
47 | ```console
48 | $ python setup.py
49 | ```
50 | ### Run example ###
51 | ```console
52 | $ python examples/example_pipeline.py
53 | ```
54 | ## Citations ##
55 |
56 | If you use our product, please cite:
57 | * Kemker, R., Gewali, U. B., Kanan, C. [EarthMapper: A Tool Box for the Semantic Segmentation of Remote Sensing Imagery
58 | ](https://arxiv.org/abs/1804.00292). In review at the IEEE Geoscience and Remote Sensing Letters (GRSL).
59 | * Kemker, R., Luu, R., and Kanan C. (2018) [Low-Shot Learning for the Semantic Segmentation of Remote Sensing Imagery](https://arxiv.org/abs/1803.09824). IEEE Transactions on Geoscience and Remote Sensing (TGRS).
60 | * Kemker, R., Kanan C. (2017) [Self-Taught Feature Learning for Hyperspectral Image Classification](http://ieeexplore.ieee.org/document/7875467/). IEEE Transactions on Geoscience and Remote Sensing (TGRS), 55(5): 2693-2705. 10.1109/TGRS.2017.2651639
61 | * U. B. Gewali and S. T. Monteiro, [A tutorial on modeling and inference in undirected graphical models for hyperspectral image analysis](https://arxiv.org/abs/1801.08268), In review at the International Journal of Remote Sensing (IJRS).
62 | * U. B. Gewali and S. T. Monteiro, [Spectral angle based unary energy functions for spatial-spectral hyperspectral classification using Markov random fields](http://ieeexplore.ieee.org/abstract/document/8071716/), in Proc. IEEE Workshop on Hyperspectral Image and Signal Processing : Evolution in Remote Sensing (WHISPERS), Los Angeles, CA, USA, Aug. 2016.
63 | 64 | ## Points of Contact ## 65 | * Ronald Kemker - http://www.cis.rit.edu/~rmk6217/ 66 | * Utsav Gewali - http://www.cis.rit.edu/~ubg9540/ 67 | * Chris Kanan - http://www.chriskanan.com/ 68 | 69 | ## Also Check Out ## 70 | * RIT-18 Dataset - https://github.com/rmkemker/RIT-18 71 | * Machine and Neuromorphic Perception Laboratory - http://klab.cis.rit.edu/ 72 | -------------------------------------------------------------------------------- /__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/rmkemker/EarthMapper/098392c405feebc79606c01b3c1c4bb48999f224/__init__.py -------------------------------------------------------------------------------- /classifier/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/rmkemker/EarthMapper/098392c405feebc79606c01b3c1c4bb48999f224/classifier/__init__.py -------------------------------------------------------------------------------- /classifier/mlp.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | import math 3 | import os 4 | import numpy as np 5 | import tf_initializers 6 | 7 | class CMA(object): 8 | 9 | def __init__(self): 10 | self.N = 0 11 | self.avg = 0.0 12 | 13 | def update(self, X): 14 | self.avg = (X+self.N * self.avg)/(self.N+1) 15 | self.N = self.N + 1 16 | 17 | class MLP(object): 18 | 19 | def __init__(self, layer_sizes, num_epochs=150, 20 | batch_size=200, weight_decay=None, 21 | starter_learning_rate=2e-3, ckpt_path='checkpoints', 22 | patience=50, 23 | activation='pelu', 24 | weight_initializer='glorot_normal', 25 | mu = None): 26 | 27 | ''' 28 | MLP Attributes 29 | ''' 30 | self.L = len(layer_sizes) - 1 # number of layers 31 | self.layer_sizes = layer_sizes 32 | self.weight_decay = weight_decay 33 | self.num_epochs = num_epochs 34 | self.batch_size = batch_size 35 | self.ckpt_path = ckpt_path 36 | self.starter_learning_rate = starter_learning_rate 37 | self.patience = patience 38 | self.lr_patience = int(patience/2) 39 | self.activation = activation 40 | self.weight_initializer = tf_initializers.get(weight_initializer) 41 | self.mu = mu 42 | 43 | with tf.Graph().as_default(): 44 | 45 | inputs = tf.placeholder(tf.float32, shape=(None,self.layer_sizes[0]),name='inputs') 46 | outputs = tf.placeholder(tf.int32, name='outputs') 47 | class_weights = tf.placeholder(tf.float32, name='class_weights') 48 | learning_rate = tf.placeholder(tf.float32, name='learning_rate') 49 | 50 | training = tf.placeholder(tf.bool,name='training') 51 | 52 | shapes = list(zip(self.layer_sizes[:-1], self.layer_sizes[1:])) # shapes of linear layers 53 | 54 | weights = {'W': [self.weight_initializer(s, "W") for s in shapes], 55 | 'Wb':[self.bi(1.0, self.layer_sizes[l+1], "Wb") for l in range(self.L)], 56 | 'a1':[tf.Variable(tf.ones(1)) for l in range(self.L-1)], 57 | 'b1':[tf.Variable(tf.ones(1)) for l in range(self.L-1)]} 58 | 59 | tf.add_to_collection("inputs", inputs) 60 | tf.add_to_collection("outputs", outputs) 61 | tf.add_to_collection("training", training) 62 | tf.add_to_collection('learning_rate', learning_rate) 63 | tf.add_to_collection('class_weights', class_weights) 64 | 65 | eps = tf.constant(1e-10) 66 | 67 | #Encoder 68 | h = inputs 69 | for i in range(self.L-1): 70 | h = tf.matmul(h, weights['W'][i]) 71 | h = tf.add(h, weights['Wb'][i]) 72 | h = tf.layers.batch_normalization(h, training=training) 73 | 74 | if i == self.L - 2: 75 | h 
= self.act(h, weights["a1"][i], weights["b1"][i]) 76 | h = tf.matmul(h, weights['W'][self.L-1]) 77 | h = tf.add(h, weights['Wb'][self.L-1]) 78 | y = tf.nn.softmax(h) 79 | else: 80 | h = self.act(h, weights["a1"][i], weights["b1"][i]) 81 | 82 | tf.add_to_collection("y", y) 83 | 84 | # clipped_output = tf.clip_by_value(outputs, 1e-10, 1.0) 85 | one_hot = tf.cast(tf.one_hot(outputs, layer_sizes[-1]) , tf.float32) 86 | loss_per = tf.reduce_sum(one_hot * tf.multiply(tf.log(y+eps),class_weights), 1) # supervised cost 87 | loss = -tf.reduce_mean(loss_per) 88 | 89 | reg_loss = 0.0 90 | if self.weight_decay is not None: 91 | for i in range(0,self.L): 92 | reg_loss = reg_loss + self.weight_decay*tf.nn.l2_loss(weights['W'][i]) 93 | 94 | pelu_loss = 0.0 95 | if self.activation == 'pelu': 96 | pelu_loss = pelu_loss - 1000.0 * tf.minimum( tf.reduce_min(weights['a1']), 0) 97 | pelu_loss = pelu_loss - 1000.0 * tf.minimum( tf.reduce_min(weights['b1']), 0) 98 | 99 | 100 | loss = loss + reg_loss + pelu_loss 101 | loss_per = -loss_per + reg_loss + pelu_loss 102 | tf.add_to_collection('loss_per_t', loss_per) 103 | tf.add_to_collection('loss', loss) 104 | 105 | pred = tf.argmax(y, 1) 106 | tf.add_to_collection('pred', pred) 107 | # truth = tf.argmax(one_hot, 1) 108 | correct_prediction = tf.equal(tf.cast(tf.argmax(y, 1), tf.int32), outputs) # no of correct predictions 109 | accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float")) * tf.constant(100.0) 110 | tf.add_to_collection('accuracy', accuracy) 111 | 112 | opt = tf.contrib.opt.NadamOptimizer(learning_rate) 113 | update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) 114 | with tf.control_dependencies(update_ops): 115 | train_step = opt.minimize(loss) 116 | 117 | tf.add_to_collection('train_step', train_step) 118 | 119 | with tf.Session() as sess: 120 | saver = tf.train.Saver() 121 | sess.run(tf.global_variables_initializer()) 122 | saver.save(sess, self.ckpt_path+'/model.ckpt', 0) 123 | 124 | def bi(self, inits, size, name): 125 | return tf.Variable(inits * tf.ones([size]), name=name) 126 | 127 | def wi(self, shape, name): 128 | return tf.Variable(tf.random_normal(shape, name=name)/ math.sqrt(shape[0])) 129 | 130 | def act(self, z, a, b): 131 | """Applies PELU activation function. 132 | 133 | Args: 134 | z (tensor): data to apply activation function on. 135 | a (float): `a` parameter. 136 | b (float): `b` parameter. 
137 | """ 138 | if self.activation == 'relu': 139 | return tf.nn.relu(z) 140 | elif self.activation == 'tanh': 141 | return tf.nn.tanh(z) 142 | elif self.activation == 'leakyrelu': 143 | return tf.nn.relu(z) - 0.1 * tf.nn.relu(-z) 144 | elif self.activation == 'pelu': 145 | positive = tf.nn.relu(z) * a / (b + 1e-9) 146 | negative = a * (tf.exp((-tf.nn.relu(-z)) / (b + 1e-9)) - 1) 147 | return negative + positive 148 | elif self.activation == 'elu': 149 | return tf.nn.elu(z) 150 | else: 151 | raise NotImplementedError('Only RELU, ELU, TANH, and PELU activations are supported') 152 | 153 | def fit(self, X_train, y_train, X_val, y_val): 154 | 155 | with tf.Graph().as_default(): 156 | 157 | with tf.Session() as sess: 158 | print("=== Training PFC===") 159 | ckpt = tf.train.get_checkpoint_state(self.ckpt_path+'/') # get latest checkpoint (if any) 160 | if ckpt and ckpt.model_checkpoint_path: 161 | # if checkpoint exists, restore the parameters and set epoch_n and i_iter 162 | saver = tf.train.import_meta_graph(ckpt.model_checkpoint_path+'.meta', 163 | clear_devices=True) 164 | saver.restore(sess, ckpt.model_checkpoint_path) 165 | inputs = tf.get_collection('inputs')[0] 166 | outputs = tf.get_collection('outputs')[0] 167 | training = tf.get_collection('training')[0] 168 | class_weights = tf.get_collection('class_weights')[0] 169 | train_step = tf.get_collection('train_step')[0] 170 | loss_t = tf.get_collection('loss')[0] 171 | loss_per_t = tf.get_collection('loss_per_t')[0] 172 | learning_rate = tf.get_collection('learning_rate')[0] 173 | else: 174 | raise FileNotFoundError('Cannot Find Checkpoint') 175 | 176 | #Compute initial val_loss 177 | best_loss = 1000000.0 178 | val_loss_arr = np.zeros((X_val.shape[0]), np.float32) 179 | 180 | 181 | lr = self.starter_learning_rate 182 | 183 | pc = 0 # patience counter 184 | lrc = 0 # learning rate patience counter 185 | 186 | num_classes= np.max(y_train)+1 187 | if self.mu == None: 188 | cw = np.ones(num_classes, dtype=np.float32) 189 | else: 190 | num_classes= np.max(y_train)+1 191 | cw = np.bincount(y_train, minlength=num_classes) 192 | cw = np.sum(cw)/cw 193 | cw[np.isinf(cw)] = 0 194 | cw = self.mu * np.log(cw) 195 | cw[cw < 1] = 1 196 | 197 | t_msg1 = '\rEpoch %d/%d (%d/%d) -- lr=%1.0e -- loss: %1.2f' 198 | v_msg1 = ' -- val=(%1.2f)' 199 | 200 | train_samples = X_train.shape[0] 201 | val_samples = X_val.shape[0] 202 | for e in range(self.num_epochs): 203 | train_cma = CMA() 204 | 205 | idx = np.arange(X_train.shape[0]) 206 | np.random.shuffle(idx) 207 | 208 | c=0 209 | for i in range(0 , train_samples, self.batch_size): 210 | 211 | _, loss0 = sess.run([train_step, loss_t], 212 | feed_dict={inputs: X_train[idx[i:i+self.batch_size]], 213 | outputs: y_train[idx[i:i+self.batch_size]], 214 | training: True, 215 | learning_rate: lr, 216 | class_weights:cw}) 217 | 218 | train_cma.update(loss0) 219 | print(t_msg1 % (e+1,self.num_epochs,i+1,train_samples,lr,train_cma.avg), end="") 220 | 221 | 222 | c+=1 223 | if math.isnan(loss0): 224 | break 225 | #Calculate validation loss/acc 226 | eval_mb = 256 227 | for i in range(0, val_samples, eval_mb): 228 | start = np.min([i, val_samples - eval_mb]) 229 | end = np.min([i + eval_mb, val_samples]) 230 | val_loss_arr[start:end] = sess.run(loss_per_t, 231 | feed_dict={inputs: X_val[start:end], 232 | outputs: y_val[start:end], 233 | training: False, 234 | class_weights:cw}) 235 | val_loss = np.mean(val_loss_arr) 236 | print(v_msg1 % val_loss) 237 | 238 | if val_loss < best_loss and not math.isnan(train_cma.avg): 239 | 
saver.save(sess, self.ckpt_path+'/model.ckpt',e) 240 | best_loss = val_loss 241 | print(' -- best_val_loss=%1.2f' % best_loss) 242 | pc = 0 243 | lrc=0 244 | else: 245 | pc += 1 246 | lrc += 1 247 | if pc > self.patience: 248 | break 249 | elif lrc >= self.lr_patience: 250 | lrc = 0 251 | lr /= 10.0 252 | print(' -- Patience=%d/%d' % (pc, self.patience)) 253 | 254 | if math.isnan(train_cma.avg): 255 | break 256 | 257 | def predict(self, X): 258 | 259 | n = X.shape[0] 260 | 261 | with tf.Graph().as_default(): 262 | 263 | with tf.Session() as sess: 264 | 265 | ckpt = tf.train.get_checkpoint_state(self.ckpt_path+'/') # get latest checkpoint (if any) 266 | if ckpt and ckpt.model_checkpoint_path: 267 | # if checkpoint exists, restore the parameters and set epoch_n and i_iter 268 | saver = tf.train.import_meta_graph(ckpt.model_checkpoint_path+'.meta', 269 | clear_devices=True) 270 | saver.restore(sess, ckpt.model_checkpoint_path) 271 | inputs = tf.get_collection('inputs')[0] 272 | outputs = tf.get_collection('outputs')[0] 273 | training = tf.get_collection('training')[0] 274 | pred = tf.get_collection('pred')[0] 275 | else: 276 | raise FileNotFoundError('No checkpoint loaded') 277 | 278 | prediction = np.zeros(n, dtype=np.int32) 279 | 280 | for i in range(0, n, self.batch_size): 281 | start = np.min([i, n - self.batch_size]) 282 | end = np.min([i + self.batch_size, n]) 283 | 284 | X_test = X[start:end] 285 | y_test = X[start:end] 286 | 287 | prediction[start:end] = sess.run(pred, 288 | feed_dict={inputs: X_test, 289 | outputs: y_test, 290 | training: False}) 291 | 292 | return prediction 293 | 294 | def predict_proba(self, X): 295 | 296 | n = X.shape[0] 297 | 298 | with tf.Graph().as_default(): 299 | 300 | with tf.Session() as sess: 301 | ckpt = tf.train.get_checkpoint_state(self.ckpt_path+'/') # get latest checkpoint (if any) 302 | if ckpt and ckpt.model_checkpoint_path: 303 | # if checkpoint exists, restore the parameters and set epoch_n and i_iter 304 | saver = tf.train.import_meta_graph(ckpt.model_checkpoint_path+'.meta', 305 | clear_devices=True) 306 | saver.restore(sess, ckpt.model_checkpoint_path) 307 | inputs = tf.get_collection('inputs')[0] 308 | training = tf.get_collection('training')[0] 309 | proba = tf.get_collection('y')[0] 310 | else: 311 | raise FileNotFoundError('No checkpoint loaded') 312 | 313 | prediction = np.zeros((n, self.layer_sizes[-1]), dtype=np.float32) 314 | 315 | for i in range(0, n, self.batch_size): 316 | start = np.min([i, n - self.batch_size]) 317 | end = np.min([i + self.batch_size, n]) 318 | 319 | X_test = X[start:end] 320 | 321 | prediction[start:end] = sess.run(proba, 322 | feed_dict={inputs: X_test, 323 | training: False}) 324 | 325 | return prediction 326 | 327 | 328 | -------------------------------------------------------------------------------- /classifier/rf_workflow.py: -------------------------------------------------------------------------------- 1 | """ 2 | @author: ubg9540 3 | """ 4 | 5 | from sklearn.base import BaseEstimator, ClassifierMixin 6 | from sklearn.ensemble import RandomForestClassifier 7 | from sklearn.model_selection import GridSearchCV, PredefinedSplit 8 | from sklearn.preprocessing import StandardScaler 9 | from sklearn.decomposition import PCA 10 | import numpy as np 11 | 12 | class RandomForestWorkflow(BaseEstimator, ClassifierMixin): 13 | 14 | """ 15 | RandomForest (Classifier) 16 | 17 | Args: 18 | n_jobs (int, optional): Number of jobs to run in parallel for both fit 19 | and predict. 
(Default: 8) 20 | verbosity (int, optional): Controls the verbosity of GridSearchCV: the 21 | higher the number, the more messages. (Default: 10) 22 | refit (bool, optional): Whether to refit during the fine-tune search. 23 | (Default: True) 24 | scikit_args (dict, optional) 25 | """ 26 | 27 | def __init__(self, n_jobs=8, verbosity = 10, refit=True, scikit_args={}): 28 | self.n_jobs = n_jobs 29 | self.verbose = verbosity 30 | self.refit = refit 31 | self.scikit_args = scikit_args 32 | 33 | def fit(self, train_data, train_labels, val_data, val_labels): 34 | """ 35 | Fits to training data. 36 | 37 | Args: 38 | train_data (ndarray): Training data. 39 | train_labels (ndarray): Training labels. 40 | val_data (ndarray): Validation data. 41 | val_labels (ndarray): Validation labels. 42 | """ 43 | split = np.append(-np.ones(train_labels.shape, dtype=np.float32), 44 | np.zeros(val_labels.shape, dtype=np.float32)) 45 | ps = PredefinedSplit(split) 46 | 47 | sh = train_data.shape 48 | train_data = np.append(train_data, val_data , axis=0) 49 | train_labels = np.append(train_labels , val_labels, axis=0) 50 | del val_data, val_labels 51 | 52 | model = RandomForestClassifier(n_jobs=self.n_jobs, 53 | **self.scikit_args) 54 | 55 | params = {'n_estimators':np.arange(1,1001,50)} 56 | #Coarse search 57 | gs = GridSearchCV(model, params, refit=False, n_jobs=self.n_jobs, 58 | verbose=self.verbose, cv=ps) 59 | gs.fit(train_data, train_labels) 60 | 61 | #Fine-Tune Search 62 | params = {'n_estimators':np.arange(gs.best_params_['n_estimators']-50, 63 | gs.best_params_['n_estimators']+50)} 64 | 65 | self.gs = GridSearchCV(model, params, refit=self.refit, n_jobs=self.n_jobs, 66 | verbose=self.verbose, cv=ps) 67 | self.gs.fit(train_data, train_labels) 68 | 69 | if not self.refit: 70 | model.set_params(n_estimators=gs.best_params_['n_estimators']) 71 | self.gs = model 72 | self.gs.fit(train_data[:sh[0]], train_labels[:sh[0]]) 73 | 74 | # return self.gs.fit(train_data, train_labels) 75 | 76 | def predict(self, test_data): 77 | """ 78 | Performs classification on samples in test_data. 79 | 80 | Args: 81 | test_data (ndarray): Test data to be classified. 82 | 83 | Returns: 84 | ndarray of predicted class labels for test_data. 85 | """ 86 | return self.gs.predict(test_data) 87 | 88 | 89 | def predict_proba(self, test_data): 90 | """ 91 | Computes probabilities of possible outcomes for samples in test_data. 92 | 93 | Args: 94 | test_data (ndarray): Test data that will be used to compute 95 | probabilities. 96 | 97 | Returns: 98 | Array of the probability of the sample for each class in the 99 | model. 
100 | """ 101 | return self.gs.predict_proba(test_data) -------------------------------------------------------------------------------- /classifier/ssmlp.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | import math 3 | import os 4 | import numpy as np 5 | import tf_initializers 6 | 7 | class EMA(object): 8 | def __init__(self, decay=0.99): 9 | self.avg = None 10 | self.decay = decay 11 | self.history = [] 12 | def update(self, X): 13 | if not self.avg: 14 | self.avg = X 15 | else: 16 | self.avg = self.avg * self.decay + (1-self.decay) * X 17 | self.history.append(self.avg) 18 | 19 | class SSMLP(object): 20 | 21 | def __init__(self, layer_sizes, unsupervised_weights=None, num_epochs=150, 22 | batch_size=32, weight_decay=None, 23 | starter_learning_rate=2e-3, ckpt_path='checkpoints', 24 | patience=50, 25 | activation='pelu', unsupervised_cost_fn='mse', 26 | weight_initializer='glorot_normal', 27 | mu = 0.25, eval_batch_size=512): 28 | 29 | ''' 30 | Attributes 31 | ''' 32 | self.L = len(layer_sizes) - 1 # number of layers 33 | self.layer_sizes = layer_sizes 34 | self.weight_decay = weight_decay 35 | self.num_epochs = num_epochs 36 | self.batch_size = batch_size 37 | self.ckpt_path = ckpt_path 38 | self.starter_learning_rate = starter_learning_rate 39 | self.patience = patience 40 | self.lr_patience = int(patience/2) 41 | self.activation = activation 42 | self.unsup_cost_fn = unsupervised_cost_fn 43 | self.weight_initializer = tf_initializers.get(weight_initializer) 44 | self.mu = mu 45 | self.eval_batch_size = eval_batch_size 46 | 47 | if unsupervised_weights is None: 48 | self.unsup_weights = np.ones(len(layer_sizes)-2) 49 | else: 50 | self.unsup_weights = unsupervised_weights 51 | 52 | tf.reset_default_graph() 53 | 54 | inputs = tf.placeholder(tf.float32, shape=(None,self.layer_sizes[0]),name='inputs') 55 | outputs = tf.placeholder(tf.int32, name='outputs') 56 | class_weights = tf.placeholder(tf.float32, name='class_weights') 57 | learning_rate = tf.placeholder(tf.float32, name='learning_rate') 58 | 59 | training = tf.placeholder(tf.bool,name='training') 60 | 61 | shapes = list(zip(self.layer_sizes[:-1], self.layer_sizes[1:])) # shapes of linear layers 62 | 63 | weights = {'W': [self.weight_initializer(s, "W") for s in shapes], 64 | 'Wb':[self.bi(1.0, self.layer_sizes[l+1], "Wb") for l in range(self.L)], 65 | 'V': [self.weight_initializer(s[::-1], "V") for s in shapes], 66 | 'Vb':[self.bi(1.0, self.layer_sizes[l], "Vb") for l in range(self.L)], 67 | 'a1':[tf.Variable(tf.ones(1)) for l in range(self.L-1)], 68 | 'b1':[tf.Variable(tf.ones(1)) for l in range(self.L-1)], 69 | 'a2':[tf.Variable(tf.ones(1)) for l in range(self.L-1)], 70 | 'b2':[tf.Variable(tf.ones(1)) for l in range(self.L-1)]} 71 | 72 | tf.add_to_collection("inputs", inputs) 73 | tf.add_to_collection("outputs", outputs) 74 | tf.add_to_collection("training", training) 75 | tf.add_to_collection('learning_rate', learning_rate) 76 | tf.add_to_collection('class_weights', class_weights) 77 | 78 | eps = tf.constant(1e-10) 79 | 80 | #Encoder 81 | h = inputs 82 | encoder_layers = [inputs] 83 | for i in range(self.L-1): 84 | h = tf.matmul(h, weights['W'][i]) 85 | h = tf.add(h, weights['Wb'][i]) 86 | h = tf.layers.batch_normalization(h, training=training) 87 | 88 | if i == self.L - 2: 89 | encoded_path = self.act(h, weights["a1"][i], weights["b1"][i]) 90 | h = tf.matmul(encoded_path, weights['W'][self.L-1]) 91 | h = tf.add(h, weights['Wb'][self.L-1]) 92 | y = 
tf.nn.softmax(h) 93 | else: 94 | h = self.act(h, weights["a1"][i], weights["b1"][i]) 95 | encoder_layers.append(h) 96 | 97 | tf.add_to_collection("y", y) 98 | 99 | #Decoder 100 | 101 | h = encoded_path 102 | decoder_layers = [] 103 | for i in range(self.L-2, -1, -1): 104 | h = tf.matmul(h, weights['V'][i]) 105 | h = tf.add(h, weights['Vb'][i]) 106 | 107 | h = tf.layers.batch_normalization(h, training=training) 108 | 109 | if i == 0: 110 | decoded_path = h 111 | decoder_layers.append(decoded_path) 112 | else: 113 | h = self.act(h, weights["a2"][i], weights["b2"][i]) 114 | decoder_layers.append(h) 115 | h = tf.add(h, encoder_layers[i]) 116 | 117 | 118 | decoder_layers = decoder_layers[::-1] 119 | u_cost = 0.0 120 | if self.unsup_cost_fn == 'mae': 121 | for i in range(len(encoder_layers)): 122 | u_cost += (self.unsup_weights[i] * tf.reduce_mean(tf.abs(encoder_layers[i] - decoder_layers[i]),1)) 123 | else: 124 | for i in range(len(encoder_layers)): 125 | u_cost += (self.unsup_weights[i] * tf.reduce_mean(tf.square(encoder_layers[i] - decoder_layers[i]),1)) 126 | 127 | # tf.add_to_collection("u_cost", u_cost) 128 | 129 | # clipped_output = tf.clip_by_value(outputs, 1e-10, 1.0) 130 | # cost = -tf.reduce_mean(tf.reduce_sum(outputs * tf.multiply(tf.log(y+eps),class_weights), 1)) # supervised cost 131 | # loss = cost + u_cost # total cost 132 | 133 | # tf.add_to_collection('cost', cost) 134 | 135 | # if self.weight_decay is not None: 136 | # weight_decay = tf.nn.l2_loss(weights['W'][0]) + tf.nn.l2_loss(weights['V'][0]) 137 | # for i in range(1,self.L): 138 | # weight_decay = weight_decay + tf.nn.l2_loss(weights['W'][i]) + tf.nn.l2_loss(weights['V'][i]) 139 | # 140 | # loss = loss + self.weight_decay * weight_decay # total cost 141 | 142 | # if self.activation == 'pelu': 143 | # loss = loss - 1000.0 * tf.minimum( tf.reduce_min(weights['a1']), 0) 144 | # loss = loss - 1000.0 * tf.minimum( tf.reduce_min(weights['b1']), 0) 145 | # loss = loss - 1000.0 * tf.minimum( tf.reduce_min(weights['a2']), 0) 146 | # loss = loss - 1000.0 * tf.minimum( tf.reduce_min(weights['b2']), 0) 147 | 148 | # tf.add_to_collection('loss', loss) 149 | 150 | one_hot = tf.cast(tf.one_hot(outputs, layer_sizes[-1]) , tf.float32) 151 | loss_per = -tf.reduce_sum(one_hot * tf.multiply(tf.log(y+eps),class_weights), 1) + u_cost# supervised cost 152 | loss = tf.reduce_mean(loss_per) 153 | 154 | reg_loss = 0.0 155 | if self.weight_decay is not None: 156 | for i in range(0,self.L): 157 | reg_loss = reg_loss + self.weight_decay*tf.nn.l2_loss(weights['W'][i]) 158 | reg_loss = reg_loss + self.weight_decay*tf.nn.l2_loss(weights['V'][i]) 159 | 160 | pelu_loss = 0.0 161 | if self.activation == 'pelu': 162 | pelu_loss = pelu_loss - 1000.0 * tf.minimum( tf.reduce_min(weights['a1']), 0) 163 | pelu_loss = pelu_loss - 1000.0 * tf.minimum( tf.reduce_min(weights['b1']), 0) 164 | pelu_loss = pelu_loss - 1000.0 * tf.minimum( tf.reduce_min(weights['a2']), 0) 165 | pelu_loss = pelu_loss - 1000.0 * tf.minimum( tf.reduce_min(weights['b2']), 0) 166 | 167 | loss = loss + reg_loss + pelu_loss 168 | loss_per = loss_per + reg_loss + pelu_loss 169 | tf.add_to_collection('loss_per_t', loss_per) 170 | tf.add_to_collection('loss_t', loss) 171 | 172 | 173 | pred = tf.argmax(y, 1) 174 | tf.add_to_collection('pred', pred) 175 | truth = tf.argmax(outputs, 1) 176 | correct_prediction = tf.equal(tf.argmax(y, 1), truth) # no of correct predictions 177 | accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float")) * tf.constant(100.0) 178 | tf.add_to_collection('accuracy', 
accuracy) 179 | 180 | opt = tf.contrib.opt.NadamOptimizer(learning_rate) 181 | update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) 182 | with tf.control_dependencies(update_ops): 183 | train_step = opt.minimize(loss) 184 | 185 | tf.add_to_collection('train_step', train_step) 186 | 187 | with tf.Session() as sess: 188 | saver = tf.train.Saver() 189 | sess.run(tf.global_variables_initializer()) 190 | saver.save(sess, self.ckpt_path+'/model.ckpt', 0) 191 | 192 | def bi(self, inits, size, name): 193 | return tf.Variable(inits * tf.ones([size]), name=name) 194 | 195 | def wi(self, shape, name): 196 | return tf.Variable(tf.random_normal(shape, name=name)/ math.sqrt(shape[0])) 197 | 198 | def act(self, z, a, b): 199 | """Applies PELU activation function. 200 | 201 | Args: 202 | z (tensor): data to apply activation function on. 203 | a (float): `a` parameter. 204 | b (float): `b` parameter. 205 | """ 206 | if self.activation == 'relu': 207 | return tf.nn.relu(z) 208 | elif self.activation == 'tanh': 209 | return tf.nn.tanh(z) 210 | elif self.activation == 'leakyrelu': 211 | return tf.nn.relu(z) - 0.1 * tf.nn.relu(-z) 212 | elif self.activation == 'pelu': 213 | positive = tf.nn.relu(z) * a / (b + 1e-9) 214 | negative = a * (tf.exp((-tf.nn.relu(-z)) / (b + 1e-9)) - 1) 215 | return negative + positive 216 | elif self.activation == 'elu': 217 | return tf.nn.elu(z) 218 | else: 219 | raise NotImplementedError('Only RELU, ELU, TANH, and PELU activations are supported') 220 | 221 | def fit(self, X_train, y_train, X_val, y_val): 222 | 223 | if not os.path.exists(self.ckpt_path): 224 | os.makedirs(self.ckpt_path) 225 | 226 | with tf.Session() as sess: 227 | print("=== Training RonNet===") 228 | ckpt = tf.train.get_checkpoint_state(self.ckpt_path+'/') # get latest checkpoint (if any) 229 | if ckpt and ckpt.model_checkpoint_path: 230 | # if checkpoint exists, restore the parameters and set epoch_n and i_iter 231 | saver = tf.train.import_meta_graph(ckpt.model_checkpoint_path+'.meta', 232 | clear_devices=True) 233 | saver.restore(sess, ckpt.model_checkpoint_path) 234 | inputs = tf.get_collection('inputs')[0] 235 | outputs = tf.get_collection('outputs')[0] 236 | training = tf.get_collection('training')[0] 237 | class_weights = tf.get_collection('class_weights')[0] 238 | train_step = tf.get_collection('train_step')[0] 239 | loss_t = tf.get_collection('loss_t')[0] 240 | loss_per_t = tf.get_collection('loss_per_t')[0] 241 | learning_rate = tf.get_collection('learning_rate')[0] 242 | else: 243 | raise FileNotFoundError('Cannot Find Checkpoint') 244 | 245 | #Compute initial val_loss 246 | best_loss = 1000000.0 247 | 248 | 249 | lr = self.starter_learning_rate 250 | 251 | pc = 0 # patience counter 252 | lrc = 0 # learning rate patience counter 253 | 254 | num_classes= np.max(y_train)+1 255 | if self.mu is None: 256 | cw = np.ones(num_classes) 257 | else: 258 | cw = np.bincount(y_train, minlength=num_classes) 259 | cw = np.sum(cw)/cw 260 | cw[np.isinf(cw)] = 0 261 | cw = self.mu * np.log(cw) 262 | cw[cw < 1] = 1 263 | 264 | t_msg1 = '\rEpoch %d/%d (%d/%d) -- lr=%1.0e -- train=%1.2f' 265 | v_msg1 = ' -- val=%1.2f' 266 | val_loss_arr = np.zeros((X_val.shape[0]), np.float32) 267 | ema = EMA() 268 | val_samples = X_val.shape[0] 269 | idx = np.arange(X_train.shape[0]) 270 | for e in range(self.num_epochs): 271 | 272 | np.random.shuffle(idx) 273 | 274 | 275 | for i in range(0, X_train.shape[0], self.batch_size): 276 | _, loss = sess.run([train_step, loss_t], 277 | feed_dict={inputs: X_train[idx[i:i+self.batch_size]], 
278 | outputs: y_train[idx[i:i+self.batch_size]], 279 | training: True, 280 | learning_rate: lr, 281 | class_weights:cw}) 282 | ema.update(loss) 283 | print(t_msg1 % (e+1,self.num_epochs,i+1,X_train.shape[0],lr, ema.avg), end="") 284 | 285 | #Calculate validation loss/acc 286 | eval_mb = 256 287 | for i in range(0, val_samples, eval_mb): 288 | start = np.min([i, val_samples - eval_mb]) 289 | end = np.min([i + eval_mb, val_samples]) 290 | val_loss_arr[start:end] = sess.run(loss_per_t, 291 | feed_dict={inputs: X_val[start:end], 292 | outputs: y_val[start:end], 293 | training: False, 294 | class_weights:cw}) 295 | val_loss = np.mean(val_loss_arr) 296 | 297 | 298 | print(v_msg1 % (val_loss), end="") 299 | 300 | if val_loss < best_loss and not math.isnan(val_loss): 301 | saver.save(sess, self.ckpt_path+'/model.ckpt',e) 302 | best_loss = val_loss 303 | print(' -- best_val_loss=%1.2f' % best_loss) 304 | pc = 0 305 | lrc=0 306 | else: 307 | pc += 1 308 | lrc += 1 309 | if pc > self.patience: 310 | break 311 | elif lrc >= self.lr_patience: 312 | lrc = 0 313 | lr /= 10.0 314 | print(' -- Patience=%d/%d' % (pc, self.patience)) 315 | 316 | if math.isnan(val_loss): 317 | break 318 | 319 | def predict(self, X): 320 | 321 | n = X.shape[0] 322 | 323 | with tf.Graph().as_default(): 324 | 325 | with tf.Session() as sess: 326 | 327 | ckpt = tf.train.get_checkpoint_state(self.ckpt_path+'/') # get latest checkpoint (if any) 328 | if ckpt and ckpt.model_checkpoint_path: 329 | # if checkpoint exists, restore the parameters and set epoch_n and i_iter 330 | saver = tf.train.import_meta_graph(ckpt.model_checkpoint_path+'.meta', 331 | clear_devices=True) 332 | saver.restore(sess, ckpt.model_checkpoint_path) 333 | inputs = tf.get_collection('inputs')[0] 334 | outputs = tf.get_collection('outputs')[0] 335 | training = tf.get_collection('training')[0] 336 | pred = tf.get_collection('pred')[0] 337 | else: 338 | raise FileNotFoundError('No checkpoint loaded') 339 | 340 | prediction = np.zeros(n, dtype=np.int32) 341 | 342 | for i in range(0, n, self.batch_size): 343 | start = np.min([i, n - self.batch_size]) 344 | end = np.min([i + self.batch_size, n]) 345 | 346 | X_test = X[start:end] 347 | y_test = X[start:end] 348 | 349 | prediction[start:end] = sess.run(pred, 350 | feed_dict={inputs: X_test, 351 | outputs: y_test, 352 | training: False}) 353 | 354 | return prediction 355 | 356 | def predict_proba(self, X): 357 | 358 | n = X.shape[0] 359 | 360 | with tf.Graph().as_default(): 361 | 362 | with tf.Session() as sess: 363 | ckpt = tf.train.get_checkpoint_state(self.ckpt_path+'/') # get latest checkpoint (if any) 364 | if ckpt and ckpt.model_checkpoint_path: 365 | # if checkpoint exists, restore the parameters and set epoch_n and i_iter 366 | saver = tf.train.import_meta_graph(ckpt.model_checkpoint_path+'.meta', 367 | clear_devices=True) 368 | saver.restore(sess, ckpt.model_checkpoint_path) 369 | inputs = tf.get_collection('inputs')[0] 370 | training = tf.get_collection('training')[0] 371 | proba = tf.get_collection('y')[0] 372 | else: 373 | raise FileNotFoundError('No checkpoint loaded') 374 | 375 | prediction = np.zeros((n, self.layer_sizes[-1]), dtype=np.float32) 376 | 377 | for i in range(0, n, self.batch_size): 378 | start = np.min([i, n - self.batch_size]) 379 | end = np.min([i + self.batch_size, n]) 380 | 381 | X_test = X[start:end] 382 | 383 | prediction[start:end] = sess.run(proba, 384 | feed_dict={inputs: X_test, 385 | training: False}) 386 | 387 | return prediction 388 | 389 | 390 | 391 | 
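
The `MLP` and `SSMLP` classes above expose a scikit-learn-style `fit`/`predict`/`predict_proba` interface, but no usage example ships with the classifier module. Below is a minimal, hypothetical sketch (not part of the repository): the synthetic data, layer sizes, epoch counts, and checkpoint directories are placeholder assumptions, and it assumes the repository root is on `PYTHONPATH` so that `tf_initializers` and the `classifier` package resolve.

```python
# Minimal usage sketch (assumed, not from the repository) for the MLP/SSMLP API.
import os
import numpy as np
from classifier.mlp import MLP
from classifier.ssmlp import SSMLP

n_features, n_classes = 96, 16                       # placeholder dimensions
X_train = np.random.rand(512, n_features).astype(np.float32)
y_train = np.random.randint(0, n_classes, 512)       # integer class ids, not one-hot
X_val = np.random.rand(256, n_features).astype(np.float32)
y_val = np.random.randint(0, n_classes, 256)

# The constructors save an initial graph checkpoint, so the directories must exist.
os.makedirs('checkpoints/mlp', exist_ok=True)
os.makedirs('checkpoints/ssmlp', exist_ok=True)

# layer_sizes = [input_dim, hidden..., num_classes]; fit() early-stops on validation loss.
mlp = MLP(layer_sizes=[n_features, 64, 32, n_classes], ckpt_path='checkpoints/mlp',
          num_epochs=20, batch_size=64)
mlp.fit(X_train, y_train, X_val, y_val)
labels = mlp.predict(X_val)                          # (256,) int32 class ids
scores = mlp.predict_proba(X_val)                    # (256, n_classes) softmax scores

# The semi-supervised variant adds a decoder branch with a reconstruction penalty.
ssmlp = SSMLP(layer_sizes=[n_features, 64, 32, n_classes], ckpt_path='checkpoints/ssmlp',
              num_epochs=20)
ssmlp.fit(X_train, y_train, X_val, y_val)
```
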
-------------------------------------------------------------------------------- /classifier/svm_cv_workflow.py: -------------------------------------------------------------------------------- 1 | """ 2 | Name: svm_cv_workflow.py 3 | Author: Ronald Kemker 4 | Description: Classification pipeline for support vector machine which uses 5 | k-fold cross-validation 6 | Note: 7 | Requires scikit-learn 8 | """ 9 | 10 | from sklearn.svm import LinearSVC 11 | from sklearn.svm import SVC 12 | from sklearn.model_selection import GridSearchCV 13 | from sklearn.preprocessing import StandardScaler 14 | from sklearn.decomposition import PCA 15 | import numpy as np 16 | 17 | class SVMWorkflow(): 18 | 19 | def __init__(self, kernel='linear', n_jobs=8, probability=False, 20 | verbosity = 0, cv=3, max_iter=1000, scikit_args={}): 21 | self.kernel = kernel 22 | self.n_jobs = n_jobs 23 | self.probability = probability 24 | self.verbose = verbosity 25 | self.cv = cv 26 | self.max_iter = max_iter 27 | self.scikit_args = scikit_args 28 | 29 | 30 | def fit(self, train_data, train_labels, X_val=None, y_val=None): 31 | 32 | if self.kernel == 'linear': 33 | if self.probability: 34 | clf = SVC(kernel='linear', class_weight='balanced', 35 | random_state=6, multi_class='ovr', 36 | max_iter=self.max_iter, probability=self.probability, 37 | **self.scikit_args) 38 | else: 39 | clf = LinearSVC(class_weight='balanced', dual=False, 40 | random_state=6, multi_class='ovr', 41 | max_iter=self.max_iter, **self.scikit_args) 42 | params = {'C': 2.0**np.arange(-9,16,2,dtype=np.float)} 43 | 44 | elif self.kernel == 'rbf': 45 | clf = SVC(random_state=6, class_weight='balanced', cache_size=8000, 46 | decision_function_shape='ovr',max_iter=self.max_iter, 47 | tol=1e-4) 48 | params = {'C': 2.0**np.arange(-9,16,2,dtype=np.float), 49 | 'gamma': 2.0**np.arange(-15,4,2,dtype=np.float)} 50 | 51 | #Coarse search 52 | gs = GridSearchCV(clf, params, refit=False, n_jobs=self.n_jobs, 53 | verbose=self.verbose, cv=self.cv) 54 | gs.fit(train_data, train_labels) 55 | 56 | #Fine-Tune Search 57 | if self.kernel == 'linear': 58 | best_C = np.log2(gs.best_params_['C']) 59 | params = {'C': 2.0**np.linspace(best_C-2,best_C+2,10, 60 | dtype=np.float)} 61 | elif self.kernel == 'rbf': 62 | best_C = np.log2(gs.best_params_['C']) 63 | best_G = np.log2(gs.best_params_['gamma']) 64 | params = {'C': 2.0**np.linspace(best_C-2,best_C+2,10, 65 | dtype=np.float), 66 | 'gamma': 2.0**np.linspace(best_G-2,best_G+2,10, 67 | dtype=np.float)} 68 | 69 | self.gs = GridSearchCV(clf, params, refit=True, n_jobs=self.n_jobs, 70 | verbose=self.verbose, cv=self.cv) 71 | 72 | self.gs.fit(train_data, train_labels) 73 | 74 | return 1 75 | 76 | 77 | def predict(self, test_data): 78 | return self.gs.predict(test_data) 79 | 80 | def predict_proba(self, test_data): 81 | return self.gs.predict_proba(test_data) 82 | -------------------------------------------------------------------------------- /classifier/svm_workflow.py: -------------------------------------------------------------------------------- 1 | """ 2 | Name: svm_workflow.py 3 | Author: Ronald Kemker 4 | Description: Classification pipeline for support vector machine where train and 5 | validation folds are pre-defined. 
6 | Note: 7 | Requires scikit-learn 8 | """ 9 | 10 | 11 | from sklearn.svm import LinearSVC 12 | from sklearn.svm import SVC 13 | from sklearn.model_selection import GridSearchCV, PredefinedSplit 14 | import numpy as np 15 | from sklearn.base import BaseEstimator, ClassifierMixin 16 | 17 | class SVMWorkflow(BaseEstimator, ClassifierMixin): 18 | 19 | """ 20 | Support Vector Machine (Classifier) 21 | 22 | Args: 23 | kernel(str, optional): Type of SVM kernel. Can be either 'linear' or 24 | 'rbf'. (Default: 'linear') 25 | n_jobs(int, optional): Number of jobs to run in parallel for 26 | GridSearchCV. (Default: 8) 27 | verbosity(int ,optional): Controls the verbosity of GridSearchCV: the 28 | higher the number, the more messages. (Default: 10) 29 | probability(bool, optional): Whether to enable probability estimates. 30 | (Default: False) 31 | refit(bool, optional): For GridSearchCV. Whether to refit the best 32 | estimator with the entire dataset. (Default: True) 33 | scikit_args(dict, optional) 34 | """ 35 | 36 | def __init__(self, kernel='linear', n_jobs=8, verbosity = 0, 37 | probability=False, refit=True, scikit_args={}): 38 | self.kernel = kernel 39 | self.n_jobs = n_jobs 40 | self.verbosity = verbosity 41 | self.probability = probability 42 | self.scikit_args = scikit_args 43 | self.refit = refit 44 | 45 | def fit(self, train_data, train_labels, val_data, val_labels): 46 | """ 47 | Fits to training data. 48 | 49 | Args: 50 | train_data (ndarray): Training data. 51 | train_labels (ndarray): Training labels. 52 | val_data (ndarray): Validation data. 53 | val_labels (ndarray): Validation labels. 54 | """ 55 | split = np.append(-np.ones(train_labels.shape, dtype=np.float32), 56 | np.zeros(val_labels.shape, dtype=np.float32)) 57 | ps = PredefinedSplit(split) 58 | 59 | sh = train_data.shape 60 | train_data = np.append(train_data, val_data , axis=0) 61 | train_labels = np.append(train_labels , val_labels, axis=0) 62 | del val_data, val_labels 63 | 64 | if self.kernel == 'linear': 65 | if self.probability: 66 | clf = SVC(kernel='linear', class_weight='balanced', 67 | random_state=6, decision_function_shape='ovr', 68 | max_iter=1000, probability=self.probability, 69 | **self.scikit_args) 70 | else: 71 | clf = LinearSVC(class_weight='balanced', dual=False, 72 | random_state=6, multi_class='ovr', 73 | max_iter=1000, **self.scikit_args) 74 | 75 | #Cross-validate over these parameters 76 | params = {'C': 2.0**np.arange(-9,16,2,dtype=np.float)} 77 | elif self.kernel == 'rbf': 78 | clf = SVC(random_state=6, class_weight='balanced', cache_size=16000, 79 | decision_function_shape='ovr',max_iter=1000, tol=1e-4, 80 | probability=self.probability, **self.scikit_args) 81 | params = {'C': 2.0**np.arange(-9,16,2,dtype=np.float), 82 | 'gamma': 2.0**np.arange(-15,4,2,dtype=np.float)} 83 | 84 | #Coarse search 85 | gs = GridSearchCV(clf, params, refit=False, n_jobs=self.n_jobs, 86 | verbose=self.verbosity, cv=ps) 87 | gs.fit(train_data, train_labels) 88 | 89 | #Fine-Tune Search 90 | if self.kernel == 'linear': 91 | best_C = np.log2(gs.best_params_['C']) 92 | params = {'C': 2.0**np.linspace(best_C-2,best_C+2,10, 93 | dtype=np.float)} 94 | elif self.kernel == 'rbf': 95 | best_C = np.log2(gs.best_params_['C']) 96 | best_G = np.log2(gs.best_params_['gamma']) 97 | params = {'C': 2.0**np.linspace(best_C-2,best_C+2,10, 98 | dtype=np.float), 99 | 'gamma': 2.0**np.linspace(best_G-2,best_G+2,10, 100 | dtype=np.float)} 101 | 102 | self.gs = GridSearchCV(clf, params, refit=self.refit, n_jobs=self.n_jobs, 103 | 
verbose=self.verbosity, cv=ps) 104 | self.gs.fit(train_data, train_labels) 105 | 106 | if not self.refit: 107 | clf.set_params(C=gs.best_params_['C']) 108 | if self.kernel == 'rbf': 109 | clf.set_params(gamma=gs.best_params_['gamma']) 110 | self.gs = clf 111 | self.gs.fit(train_data[:sh[0]], train_labels[:sh[0]]) 112 | 113 | 114 | def predict(self, test_data): 115 | """ 116 | Performs classification on samples in test_data. 117 | 118 | Args: 119 | test_data (ndarray): Test data to be classified. 120 | 121 | Returns: 122 | ndarray of predicted class labels for test_data. 123 | """ 124 | return self.gs.predict(test_data) 125 | 126 | def predict_proba(self, test_data): 127 | """ 128 | Computes probabilities of possible outcomes for samples in test_data. 129 | 130 | Args: 131 | test_data (ndarray): Test data that will be used to compute 132 | probabilities. 133 | 134 | Returns: 135 | Array of the probability of the sample for each class in the 136 | model. 137 | """ 138 | return self.gs.predict_proba(test_data) 139 | 140 | -------------------------------------------------------------------------------- /examples/example_pipeline.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | """ 4 | Created on Thu Mar 29 08:23:03 2018 5 | 6 | @author: Ronald Kemker and Utsav Gewali 7 | """ 8 | 9 | from fileIO.utils import read 10 | import numpy as np 11 | from metrics import Metrics, Timer 12 | import os 13 | from utils.fileIO import readENVIHeader 14 | from pipeline import Pipeline 15 | from preprocessing import ImagePreprocessor 16 | from utils.train_val_split import train_val_split 17 | from feature_extraction.scae import SCAE 18 | from classifier.svm_workflow import SVMWorkflow 19 | from classifier.rf_workflow import RandomForestWorkflow 20 | 21 | gpu_id = "0" 22 | dataset = 'ip' # dataset choices for this example are 'ip' and 'paviau' 23 | fe_type = 'scae' #feature extractor options are 'scae' and 'smcae' 24 | os.environ['CUDA_VISIBLE_DEVICES'] = gpu_id 25 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = "3" 26 | root = '/home/rmk6217/Documents/EarthMapper/' 27 | fe_path = root + 'feature_extraction/pretrained/' + fe_type 28 | data_path = root + 'datasets/indian_pines/' 29 | 30 | #Load Indian Pines or Pavia University dataset 31 | if dataset == 'paviau': 32 | data_path = root + 'datasets/pavia_university/' 33 | X = np.float32(read(data_path+'pavia_ds/pavia_ds.tif').transpose(1,2,0))[:,:340] 34 | y = np.int32(read(data_path+'PaviaU_gt.mat')['paviaU_gt']).ravel() 35 | source_bands, source_fwhm = readENVIHeader(data_path + 'University.hdr') 36 | source_bands *= 1e3 37 | source_fwhm *= 1e3 38 | num_classes = 9 39 | padding = ((3,3),(2,2),(0,0)) 40 | 41 | elif dataset == 'ip': 42 | data_path = root + 'datasets/indian_pines/' 43 | X = read(data_path + 'indianpines_ds.tif').transpose(1,2,0) 44 | X = np.float32(X) 45 | y = read(data_path + 'indianPinesGT.mat')['indian_pines_gt'] 46 | y = y.ravel() 47 | source_bands, source_fwhm = readENVIHeader(data_path + 'indianpines_ds_raw.hdr') 48 | num_classes = 16 49 | padding = ((4,3),(4,3),(0,0)) 50 | 51 | else: 52 | raise ValueError('Invalid dataset %s: Valid entries include "ip", "paviau"') 53 | 54 | 55 | #Create a random training/validation split 56 | y_train, y_val = train_val_split(y, 15 , 35) 57 | 58 | #Load Spatial-Spectral Feature Extractor 59 | fe = [] 60 | target_bands = [] 61 | target_fwhm = [] 62 | fe_sensors = ['aviris'] # This can include 'aviris', 'hyperion', and 'gliht' 
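#Illustrative alternative (not part of the original script): several
#pre-trained extractors can be ensembled by listing more than one sensor,
#e.g. fe_sensors = ['aviris', 'hyperion', 'gliht']; one SCAE checkpoint and
#its band/FWHM metadata are loaded per listed sensor in the loop below.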
63 | 64 | for ds in fe_sensors: 65 | mat = read(fe_path + '/%s/%s_bands.mat' % (ds,fe_type)) 66 | fe.append(SCAE(ckpt_path=fe_path + '/%s' % ds, nb_cae=5)) 67 | target_bands.append(mat['bands'][0]) 68 | target_fwhm.append(mat['fwhm'][0]) 69 | 70 | #This pre-processes the data prior to passing it to a feature extractor 71 | #This can be a string or a ImagePreprocessor object 72 | pre_processor = ['StandardScaler', 'MinMaxScaler'] 73 | 74 | #This pre-process the extracted features prior to passing it to a classifier 75 | #This can be a string or a ImagePreprocessor object 76 | feature_scaler = ['StandardScaler', 'MinMaxScaler', 77 | ImagePreprocessor(mode='PCA', PCA_components=0.99)] 78 | 79 | #Classifier - SVM-RBF (probability=True required for CRF post-processor) 80 | clf = SVMWorkflow(probability=True, kernel='rbf') 81 | #clf = RandomForestWorkflow() 82 | 83 | #The classification pipeline 84 | pipe = Pipeline(pre_processor=pre_processor, 85 | feature_extractor=fe, 86 | feature_spatial_padding = padding, 87 | feature_scaler=feature_scaler, 88 | classifier=clf, source_bands=source_bands, 89 | source_fwhm=source_fwhm, 90 | target_bands=target_bands, target_fwhm=target_fwhm, 91 | post_processor='CRF') 92 | 93 | #Fit Data 94 | with Timer('Pipeline Fit Timer'): 95 | pipe.fit(X , y_train, X, y_val) 96 | 97 | #Generate Prediction 98 | with Timer('Pipeline Predict Timer'): 99 | pred = pipe.predict(X) 100 | 101 | #Evaluate and Print Results 102 | m = Metrics(y[y>0]-1, pred[y>0]) 103 | overall_accuracy, mean_class_accuracy, kappa_statistic = m.standard_metrics() 104 | confusion_matrix = m.c 105 | print('Overall Accuracy: %1.2f%%' % (overall_accuracy * 100.0)) 106 | print('Mean-Class Accuracy: %1.2f%%' % (mean_class_accuracy * 100.0)) 107 | print('Kappa Statistic: %1.4f' % (kappa_statistic)) 108 | 109 | 110 | 111 | -------------------------------------------------------------------------------- /examples/example_scae_svm.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | """ 4 | Created on Thu Mar 29 08:23:03 2018 5 | 6 | @author: Ronald Kemker and Utsav Gewali 7 | """ 8 | 9 | from fileIO.utils import read 10 | import numpy as np 11 | from metrics import Metrics, Timer 12 | import os 13 | from utils.fileIO import readENVIHeader 14 | from pipeline import Pipeline 15 | from preprocessing import ImagePreprocessor, AveragePooling2D 16 | from utils.train_val_split import random_data_split, train_val_split 17 | from feature_extraction.scae import SCAE 18 | from classifier.svm_cv_workflow import SVMWorkflow 19 | 20 | np.random.seed(6) 21 | gpu_id = "0" 22 | dataset = 'paviau' # dataset choices for this example are 'ip' and 'paviau' 23 | fe_type = 'scae' #feature extractor options are 'scae' and 'smcae' 24 | os.environ['CUDA_VISIBLE_DEVICES'] = gpu_id 25 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = "3" 26 | root = '/home/rmk6217/Documents/EarthMapper/' 27 | fe_path = root + 'feature_extraction/pretrained/' + fe_type 28 | data_path = root + 'datasets/indian_pines/' 29 | num_trials = 30 30 | 31 | #Load Indian Pines or Pavia University dataset 32 | if dataset == 'paviau': 33 | data_path = root + 'datasets/pavia_university/' 34 | X = np.float32(read(data_path+'pavia_ds/pavia_ds.tif').transpose(1,2,0))[:,:340] 35 | y = np.int32(read(data_path+'PaviaU_gt.mat')['paviaU_gt']).ravel() 36 | source_bands, source_fwhm = readENVIHeader(data_path + 'University.hdr') 37 | source_bands *= 1e3 38 | source_fwhm *= 1e3 39 | num_classes = 9 
40 | padding = ((3,3),(2,2),(0,0)) 41 | 42 | elif dataset == 'ip': 43 | data_path = root + 'datasets/indian_pines/' 44 | X = read(data_path + 'indianpines_ds.tif').transpose(1,2,0) 45 | X = np.float32(X) 46 | y = read(data_path + 'indianPinesGT.mat')['indian_pines_gt'] 47 | y = y.ravel() 48 | source_bands, source_fwhm = readENVIHeader(data_path + 'indianpines_ds_raw.hdr') 49 | num_classes = 16 50 | padding = ((4,3),(4,3),(0,0)) 51 | 52 | else: 53 | raise ValueError('Invalid dataset %s: Valid entries include "ip", "paviau"') 54 | 55 | 56 | oa_arr = np.zeros((num_trials, )) 57 | aa_arr = np.zeros((num_trials, )) 58 | kappa_arr = np.zeros((num_trials, )) 59 | 60 | for i in range(num_trials): 61 | 62 | #Create a random training/validation split 63 | y_train, y_val = random_data_split(y, train_samples=50, shuffle=True) 64 | 65 | #Load Spatial-Spectral Feature Extractor 66 | fe = [] 67 | target_bands = [] 68 | target_fwhm = [] 69 | fe_sensors = ['hyperion'] # This can include 'aviris', 'hyperion', and 'gliht' 70 | 71 | for ds in fe_sensors: 72 | mat = read(fe_path + '/%s/%s_bands.mat' % (ds,fe_type)) 73 | fe.append(SCAE(ckpt_path=fe_path + '/%s' % ds, nb_cae=3)) 74 | target_bands.append(mat['bands'][0]) 75 | target_fwhm.append(mat['fwhm'][0]) 76 | 77 | #This pre-processes the data prior to passing it to a feature extractor 78 | #This can be a string or a ImagePreprocessor objectb 79 | pre_processor = ['StandardScaler'] 80 | 81 | #This pre-process the extracted features prior to passing it to a classifier 82 | #This can be a string or a ImagePreprocessor object 83 | feature_scaler = [AveragePooling2D(), 'StandardScaler', 'MinMaxScaler', 84 | ImagePreprocessor('PCA', whiten=False, PCA_components=0.999)] 85 | 86 | #Classifier - SVM-RBF 87 | # The SVM from svm_cv_workflow.py does NOT use the validation set - it does 88 | # N-fold cross-validation with the training set. This is the ONLY classifier 89 | # we have that does this. 
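# Note (not in the original script): SVMWorkflow in svm_cv_workflow.py
# defaults to kernel='linear', so the call below runs a linear SVM with
# 7-fold cross-validation; pass kernel='rbf' to get the SVM-RBF described
# in the comment above.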
90 | clf = SVMWorkflow(cv=7) 91 | 92 | #The classification pipeline 93 | pipe = Pipeline(pre_processor=pre_processor, 94 | feature_extractor=fe, 95 | feature_spatial_padding = padding, 96 | feature_scaler=feature_scaler, 97 | classifier=clf, source_bands=source_bands, 98 | source_fwhm=source_fwhm, 99 | target_bands=target_bands, target_fwhm=target_fwhm) 100 | 101 | #Fit Data 102 | with Timer('Pipeline Fit Timer'): 103 | pipe.fit(X, y_train, X, y_val) 104 | 105 | #Generate Prediction 106 | with Timer('Pipeline Predict Timer'): 107 | pred = pipe.predict(X) 108 | 109 | #Evaluate and Print Results 110 | m = Metrics(y[y>0]-1, pred[y>0]) 111 | oa_arr[i], aa_arr[i], kappa_arr[i] = m.standard_metrics() 112 | 113 | print('Overall Accuracy: %1.2f+/-%1.2f' % (np.mean(oa_arr) * 100.0, np.std(oa_arr)*100.0)) 114 | print('Mean-Class Accuracy: %1.2f+/-%1.2f' % (np.mean(aa_arr) * 100.0, np.std(aa_arr)*100.0)) 115 | print('Kappa Statistic: %1.4f+/-%1.4f' % (np.mean(kappa_arr), np.std(kappa_arr))) 116 | 117 | #SCAE-Hyperion 99.9% 118 | #Overall Accuracy: 95.25+/-1.37 119 | #Mean-Class Accuracy: 97.05+/-0.54 120 | #Kappa Statistic: 0.9378+/-0.0175 -------------------------------------------------------------------------------- /feature_extraction/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/rmkemker/EarthMapper/098392c405feebc79606c01b3c1c4bb48999f224/feature_extraction/__init__.py -------------------------------------------------------------------------------- /feature_extraction/cae.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | """ 4 | Created on Mon Oct 30 07:36:14 2017 5 | 6 | @author: rmk6217 7 | """ 8 | 9 | import tensorflow as tf 10 | import numpy as np 11 | import os 12 | import math 13 | 14 | 15 | class DataSet(object): 16 | 17 | def __init__(self, images): 18 | 19 | self._num_examples = images.shape[0] 20 | self._images = images 21 | self._epochs_completed = 0 22 | self._index_in_epoch = 0 23 | self._index_array = np.arange(self._num_examples) 24 | np.random.shuffle(self._index_array) 25 | 26 | @property 27 | def images(self): 28 | return self._images 29 | 30 | @property 31 | def num_examples(self): 32 | return self._num_examples 33 | 34 | 35 | @property 36 | def epochs_completed(self): 37 | return self._epochs_completed 38 | 39 | def next_batch(self, batch_size): 40 | """Return the next `batch_size` examples from this data set.""" 41 | start = self._index_in_epoch 42 | self._index_in_epoch += batch_size 43 | if self._index_in_epoch > self._num_examples: 44 | # Finished epoch 45 | self._epochs_completed += 1 46 | # Shuffle the data 47 | np.random.shuffle(self._index_array) 48 | # Start next epoch 49 | start = 0 50 | self._index_in_epoch = batch_size 51 | end = self._index_in_epoch 52 | 53 | if batch_size <= self._num_examples: 54 | return self._images[self._index_array[start:end]] 55 | else: 56 | return self._images 57 | 58 | 59 | def bn_relu(input_tensor, train_tensor, name_sub=''): 60 | bn = tf.layers.batch_normalization(input_tensor, fused=True, training=train_tensor, 61 | name='bn'+name_sub) 62 | return tf.nn.relu(bn, name='act'+name_sub) 63 | 64 | def bn_relu_conv(input_tensor, train_tensor, scope_id, num_filters, 65 | kernel_size=(3,3), strides=(1,1), l2_loss=0.0): 66 | 67 | with tf.variable_scope(scope_id): 68 | act= bn_relu(input_tensor, train_tensor) 69 | conv = tf.layers.conv2d(act, num_filters, kernel_size , 
padding='same', 70 | name='conv2d', 71 | kernel_initializer=tf.glorot_normal_initializer(), 72 | kernel_regularizer=tf.contrib.layers.l2_regularizer(l2_loss)) 73 | 74 | return conv 75 | 76 | def mse_loss(x, y): 77 | diff = tf.squared_difference(x,y) 78 | return tf.reduce_mean(diff) 79 | 80 | def refinement_layer(lateral_tensor, vertical_tensor, train_tensor, num_filters, 81 | scope_id, l2_loss): 82 | with tf.variable_scope(scope_id): 83 | 84 | conv1 = tf.layers.conv2d(lateral_tensor, num_filters, kernel_size=(3,3), 85 | padding='same',name='conv2d_1', 86 | kernel_initializer=tf.glorot_normal_initializer(), 87 | kernel_regularizer=tf.contrib.layers.l2_regularizer(l2_loss)) 88 | act1 = bn_relu(conv1, train_tensor, name_sub='1') 89 | conv2 = tf.layers.conv2d(act1, num_filters, kernel_size=(3,3), 90 | padding='same',name='conv2d_2', 91 | kernel_initializer=tf.glorot_normal_initializer(), 92 | kernel_regularizer=tf.contrib.layers.l2_regularizer(l2_loss)) 93 | conv3 = tf.layers.conv2d(vertical_tensor, num_filters, kernel_size=(3,3), 94 | padding='same',name='conv2d_3', 95 | kernel_initializer=tf.glorot_normal_initializer(), 96 | kernel_regularizer=tf.contrib.layers.l2_regularizer(l2_loss)) 97 | add = tf.add(conv2, conv3) 98 | 99 | old_shape = tf.shape(add)[1:3] 100 | up = tf.image.resize_nearest_neighbor(add, old_shape * 2 ) 101 | act2 = bn_relu(up, train_tensor, name_sub='2') 102 | return act2 103 | 104 | 105 | def cae(N , input_tensor, train_tensor, nb_bands, cost_weights, l2_loss): 106 | 107 | 108 | conv1 = bn_relu_conv(input_tensor, train_tensor, scope_id='conv1', 109 | num_filters=N, l2_loss=l2_loss) 110 | pool1 = tf.layers.max_pooling2d(conv1, 2, 2, name='pool1') 111 | 112 | conv2 = bn_relu_conv(pool1, train_tensor, scope_id='conv2', 113 | num_filters=N*2, l2_loss=l2_loss) 114 | 115 | pool2 = tf.layers.max_pooling2d(conv2, 2, 2, name='pool2') 116 | 117 | conv3 = bn_relu_conv(pool2, train_tensor, scope_id='conv3', 118 | num_filters=N*2, l2_loss=l2_loss) 119 | 120 | pool3 = tf.layers.max_pooling2d(conv3, 2, 2, name='pool3') 121 | 122 | conv4 = bn_relu_conv(pool3, train_tensor, scope_id='conv4', 123 | num_filters=N*4, kernel_size=(1,1), strides=(1,1), 124 | l2_loss=l2_loss) 125 | 126 | refinement3 = refinement_layer(pool3, conv4, train_tensor, N*2, 127 | scope_id='refinement3', l2_loss=l2_loss) 128 | 129 | 130 | refinement2 = refinement_layer(pool2, refinement3, train_tensor, N*2, 131 | scope_id='refinement2', l2_loss=l2_loss) 132 | 133 | refinement1 = refinement_layer(pool1, refinement2, train_tensor, N, 134 | scope_id='refinement1', l2_loss=l2_loss) 135 | 136 | output_tensor = tf.layers.conv2d(refinement1, nb_bands, kernel_size=(1,1), 137 | padding='same',name='output_conv', 138 | kernel_initializer=tf.glorot_normal_initializer(), 139 | kernel_regularizer=tf.contrib.layers.l2_regularizer(l2_loss)) 140 | 141 | loss = mse_loss(output_tensor,input_tensor) 142 | 143 | return loss, refinement1 144 | 145 | 146 | class CAE(object): 147 | """Convolutional Autoencoder with single- and multi-loss support. 148 | 149 | """ 150 | def __init__(self, nb_bands, loss_weights, ckpt_path = 'checkpoints', 151 | num_epochs=1000, starting_learning_rate=2e-3, patience=10, 152 | weight_decay=0.0): 153 | """ 154 | Args: 155 | nb_bands (int): Number of bands (dimensionality). 156 | weight_decay (float): (Default: 1e-4) 157 | gpu_list (list of int, None): List of GPU ids that the model is trained on. 158 | If None, the model is trained on all available GPUs. 
159 | model_checkpoint_path (str): Filepath to save the model file after every epoch. 160 | loss_mode (str): Specifies whether the CAE is single- or multi-loss. 161 | Supported values are `single` and `multi`. 162 | (Default: `multi`) 163 | weights (str): Filepath to the weights file to load. (Default: None) 164 | loss (str): Loss function used for MCAE. 165 | Supported functions are `mse` and `mae`. 166 | 167 | """ 168 | self.N = 64 169 | self.nb_bands = nb_bands 170 | self.ckpt_path = ckpt_path 171 | self.num_epochs = num_epochs 172 | self.starting_learning_rate = starting_learning_rate 173 | self.patience = patience 174 | self.lr_patience = int(patience/2) 175 | 176 | inputs = tf.placeholder(tf.float32, shape=(None, None,None, nb_bands),name='inputs') 177 | learning_rate = tf.placeholder(tf.float32, name='learning_rate') 178 | training = tf.placeholder(tf.bool,name='training') 179 | 180 | tf.add_to_collection("inputs", inputs) 181 | tf.add_to_collection("training", training) 182 | tf.add_to_collection('learning_rate', learning_rate) 183 | 184 | loss, hidden_output = cae(self.N , inputs, training, nb_bands, loss_weights, weight_decay) 185 | tf.add_to_collection('loss', loss) 186 | tf.add_to_collection('hidden_output', hidden_output) 187 | 188 | opt = tf.contrib.opt.NadamOptimizer(learning_rate) 189 | update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) 190 | with tf.control_dependencies(update_ops): 191 | train_step = opt.minimize(loss) 192 | tf.add_to_collection('train_step', train_step) 193 | 194 | with tf.Session() as sess: 195 | saver = tf.train.Saver() 196 | sess.run(tf.global_variables_initializer()) 197 | saver.save(sess, self.ckpt_path+'/model.ckpt', 0) 198 | 199 | def fit(self, X_train, X_val, batch_size=128): 200 | 201 | train_set = DataSet(X_train) 202 | val_set = DataSet(X_val) 203 | 204 | if not os.path.exists(self.ckpt_path): 205 | os.makedirs(self.ckpt_path) 206 | 207 | with tf.Session() as sess: 208 | ckpt = tf.train.get_checkpoint_state(self.ckpt_path+'/') # get latest checkpoint (if any) 209 | if ckpt and ckpt.model_checkpoint_path: 210 | # if checkpoint exists, restore the parameters and set epoch_n and i_iter 211 | saver = tf.train.import_meta_graph(ckpt.model_checkpoint_path+'.meta', 212 | clear_devices=True) 213 | saver.restore(sess, ckpt.model_checkpoint_path) 214 | inputs = tf.get_collection('inputs')[0] 215 | training = tf.get_collection('training')[0] 216 | train_step = tf.get_collection('train_step')[0] 217 | loss = tf.get_collection('loss')[0] 218 | learning_rate = tf.get_collection('learning_rate')[0] 219 | else: 220 | raise FileNotFoundError('Cannot Find Checkpoint') 221 | 222 | iter_per_epoch = int(X_train.shape[0] / batch_size) 223 | val_iter = int(val_set.num_examples / batch_size) + 1 224 | 225 | best_loss = 100000.0 226 | loss_arr = np.zeros((iter_per_epoch),np.float32) 227 | lr = self.starting_learning_rate 228 | 229 | t_msg1 = '\rEpoch %d/%d (%d/%d) -- lr=%1.0e' 230 | t_msg2 = ' -- train=%1.2f' 231 | v_msg1 = ' -- val=%1.2f' 232 | pc = 0 233 | lrc = 0 234 | for e in range(self.num_epochs): 235 | for i in range(iter_per_epoch): 236 | print(t_msg1 % (e+1,self.num_epochs,i+1,iter_per_epoch,lr), end="") 237 | images = train_set.next_batch(batch_size) 238 | _, loss0 = sess.run([train_step, loss], 239 | feed_dict={inputs: images, 240 | training: True, 241 | learning_rate: lr}) 242 | loss_arr[i] = loss0 243 | print(t_msg2 % (np.mean(loss_arr)), end="") 244 | 245 | #Calculate validation loss/acc 246 | val_losses = np.array([],dtype=np.float32) 247 | for i 
in range(val_iter): 248 | X = val_set.next_batch(batch_size) 249 | val_loss = sess.run([loss], 250 | feed_dict={inputs: X, 251 | training: False}) 252 | val_losses = np.append(val_losses, val_loss) 253 | val_loss = np.mean(val_losses) 254 | 255 | print(v_msg1 % (val_loss), end="") 256 | 257 | if val_loss < best_loss and not math.isnan(val_loss): 258 | saver.save(sess, self.ckpt_path+'/model.ckpt',e) 259 | best_loss = val_loss 260 | print(' -- best_val_loss=%1.2f' % best_loss) 261 | pc = 0 262 | lrc=0 263 | else: 264 | pc += 1 265 | lrc += 1 266 | if pc > self.patience: 267 | break 268 | elif lrc >= self.lr_patience: 269 | lrc = 0 270 | lr /= 10.0 271 | print(' -- Patience=%d/%d' % (pc, self.patience)) 272 | 273 | if math.isnan(val_loss): 274 | break 275 | 276 | def transform(self, X, batch_size=128): 277 | 278 | dataset = DataSet(X) 279 | n = dataset._num_examples 280 | 281 | if not os.path.exists(self.ckpt_path): 282 | os.makedirs(self.ckpt_path) 283 | 284 | with tf.Session() as sess: 285 | ckpt = tf.train.get_checkpoint_state(self.ckpt_path+'/') # get latest checkpoint (if any) 286 | if ckpt and ckpt.model_checkpoint_path: 287 | # if checkpoint exists, restore the parameters and set epoch_n and i_iter 288 | saver = tf.train.import_meta_graph(ckpt.model_checkpoint_path+'.meta', 289 | clear_devices=True) 290 | saver.restore(sess, ckpt.model_checkpoint_path) 291 | inputs = tf.get_collection('inputs')[0] 292 | training = tf.get_collection('training')[0] 293 | hidden_output = tf.get_collection('hidden_output')[0] 294 | else: 295 | raise FileNotFoundError('Cannot Find Checkpoint') 296 | 297 | if n > batch_size: 298 | 299 | prediction = np.zeros((dataset._num_examples,X.shape[1], X.shape[2], self.N), dtype=np.float32) 300 | 301 | for i in range(0, n, batch_size): 302 | start = np.min([i, n - batch_size]) 303 | end = np.min([i + batch_size, n]) 304 | 305 | X_test = dataset.images[start:end] 306 | 307 | prediction[start:end] = sess.run(hidden_output, 308 | feed_dict={inputs: X_test, 309 | training: False}) 310 | 311 | else: 312 | prediction = sess.run(hidden_output, 313 | feed_dict={inputs: X, 314 | training: False}) 315 | return prediction 316 | -------------------------------------------------------------------------------- /feature_extraction/mcae.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | """ 4 | Created on Mon Oct 30 07:36:14 2017 5 | 6 | @author: rmk6217 7 | """ 8 | 9 | import tensorflow as tf 10 | import numpy as np 11 | import os 12 | import math 13 | 14 | 15 | class EMA(object): 16 | def __init__(self, decay=0.99): 17 | self.avg = None 18 | self.decay = decay 19 | self.history = [] 20 | def update(self, X): 21 | if not self.avg: 22 | self.avg = X 23 | else: 24 | self.avg = self.avg * self.decay + (1-self.decay) * X 25 | self.history.append(self.avg) 26 | 27 | def pelu(input_tensor, a , b): 28 | 29 | positive = tf.nn.relu(input_tensor) * a / (b + 1e-9) 30 | negative = a * (tf.exp((-tf.nn.relu(-input_tensor)) / (b + 1e-9)) - 1) 31 | return (positive + negative) 32 | 33 | def bn_pelu(input_tensor, train_tensor, name_sub=''): 34 | bn = tf.layers.batch_normalization(input_tensor, fused=True, training=train_tensor, 35 | name='bn'+name_sub) 36 | a = tf.Variable(tf.ones(1), name='a'+name_sub) 37 | b = tf.Variable(tf.ones(1), name='b'+name_sub) 38 | return pelu(bn , a, b), a, b 39 | 40 | def bn_pelu_conv(input_tensor, train_tensor, scope_id, num_filters, 41 | kernel_size=(3,3), strides=(1,1), 
l2_loss=0.0): 42 | 43 | with tf.variable_scope(scope_id): 44 | act, a, b = bn_pelu(input_tensor, train_tensor) 45 | conv = tf.layers.conv2d(act, num_filters, kernel_size , padding='same', 46 | name='conv2d', 47 | kernel_initializer=tf.glorot_normal_initializer(), 48 | kernel_regularizer=tf.contrib.layers.l2_regularizer(l2_loss)) 49 | 50 | return conv, a, b 51 | 52 | def mse_loss(x, y): 53 | diff = tf.squared_difference(x,y) 54 | return tf.reduce_mean(diff) 55 | 56 | def refinement_layer(lateral_tensor, vertical_tensor, train_tensor, num_filters, 57 | scope_id, l2_loss): 58 | with tf.variable_scope(scope_id): 59 | 60 | conv1 = tf.layers.conv2d(lateral_tensor, num_filters, kernel_size=(3,3), 61 | padding='same',name='conv2d_1', 62 | kernel_initializer=tf.glorot_normal_initializer(), 63 | kernel_regularizer=tf.contrib.layers.l2_regularizer(l2_loss)) 64 | act1, a1, b1 = bn_pelu(conv1, train_tensor, name_sub='1') 65 | conv2 = tf.layers.conv2d(act1, num_filters, kernel_size=(3,3), 66 | padding='same',name='conv2d_2', 67 | kernel_initializer=tf.glorot_normal_initializer(), 68 | kernel_regularizer=tf.contrib.layers.l2_regularizer(l2_loss)) 69 | conv3 = tf.layers.conv2d(vertical_tensor, num_filters, kernel_size=(3,3), 70 | padding='same',name='conv2d_3', 71 | kernel_initializer=tf.glorot_normal_initializer(), 72 | kernel_regularizer=tf.contrib.layers.l2_regularizer(l2_loss)) 73 | add = tf.add(conv2, conv3) 74 | 75 | old_shape = tf.shape(add)[1:3] 76 | up = tf.image.resize_nearest_neighbor(add, old_shape * 2 ) 77 | act2, a2, b2 = bn_pelu(up, train_tensor, name_sub='2') 78 | return act2, [a1, a2], [b1, b2] 79 | 80 | 81 | def mcae(input_tensor, train_tensor, nb_bands, cost_weights, l2_loss): 82 | 83 | a = [] 84 | b = [] 85 | 86 | conv1, a_t, b_t = bn_pelu_conv(input_tensor, train_tensor, scope_id='conv1', 87 | num_filters=256, l2_loss=l2_loss) 88 | a.append(a_t) 89 | b.append(b_t) 90 | pool1 = tf.layers.max_pooling2d(conv1, 2, 2, name='pool1') 91 | 92 | conv2, a_t, b_t = bn_pelu_conv(pool1, train_tensor, scope_id='conv2', 93 | num_filters=512, l2_loss=l2_loss) 94 | a.append(a_t) 95 | b.append(b_t) 96 | pool2 = tf.layers.max_pooling2d(conv2, 2, 2, name='pool2') 97 | 98 | conv3, a_t, b_t = bn_pelu_conv(pool2, train_tensor, scope_id='conv3', 99 | num_filters=512, l2_loss=l2_loss) 100 | a.append(a_t) 101 | b.append(b_t) 102 | pool3 = tf.layers.max_pooling2d(conv3, 2, 2, name='pool3') 103 | 104 | conv4, a_t, b_t = bn_pelu_conv(pool3, train_tensor, scope_id='conv4', 105 | num_filters=1024, kernel_size=(1,1), strides=(1,1), 106 | l2_loss=l2_loss) 107 | a.append(a_t) 108 | b.append(b_t) 109 | 110 | refinement3, a_t, b_t = refinement_layer(pool3, conv4, train_tensor, 512, 111 | scope_id='refinement3', l2_loss=l2_loss) 112 | a+=a_t 113 | b+=b_t 114 | 115 | refinement2, a_t, b_t = refinement_layer(pool2, refinement3, train_tensor, 512, 116 | scope_id='refinement2', l2_loss=l2_loss) 117 | a+=a_t 118 | b+=b_t 119 | 120 | refinement1, a_t, b_t = refinement_layer(pool1, refinement2, train_tensor, 256, 121 | scope_id='refinement1', l2_loss=l2_loss) 122 | a+=a_t 123 | b+=b_t 124 | 125 | output_tensor = tf.layers.conv2d(refinement1, nb_bands, kernel_size=(1,1), 126 | padding='same',name='output_conv', 127 | kernel_initializer=tf.glorot_normal_initializer(), 128 | kernel_regularizer=tf.contrib.layers.l2_regularizer(l2_loss)) 129 | 130 | loss = cost_weights[0] * mse_loss(output_tensor,input_tensor) 131 | loss += (cost_weights[1] * mse_loss(conv1,refinement1)) 132 | loss += (cost_weights[2] * mse_loss(conv2,refinement2)) 
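# Together with the cost_weights[0] input-reconstruction term above, these
# weighted MSE terms tie each encoder convolution (conv1-conv3) to its
# matching refinement layer, so the intermediate feature maps are
# reconstructed as well as the input itself. The tf.minimum terms below add
# a large penalty whenever any PELU parameter in a or b drifts negative.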
133 | loss += (cost_weights[3] * mse_loss(conv3,refinement3)) 134 | 135 | loss = loss - 1000.0 * tf.minimum( tf.reduce_min(a), 0) 136 | loss = loss - 1000.0 * tf.minimum( tf.reduce_min(b), 0) 137 | 138 | return loss, refinement1 139 | 140 | 141 | class MCAE(object): 142 | """Convolutional Autoencoder with single- and multi-loss support. 143 | 144 | """ 145 | def __init__(self, nb_bands, loss_weights, ckpt_path = 'checkpoints', 146 | num_epochs=1000, starting_learning_rate=2e-3, patience=10, 147 | weight_decay=0.0): 148 | """ 149 | Args: 150 | nb_bands (int): Number of bands (dimensionality). 151 | weight_decay (float): (Default: 1e-4) 152 | gpu_list (list of int, None): List of GPU ids that the model is trained on. 153 | If None, the model is trained on all available GPUs. 154 | model_checkpoint_path (str): Filepath to save the model file after every epoch. 155 | loss_mode (str): Specifies whether the CAE is single- or multi-loss. 156 | Supported values are `single` and `multi`. 157 | (Default: `multi`) 158 | weights (str): Filepath to the weights file to load. (Default: None) 159 | loss (str): Loss function used for MCAE. 160 | Supported functions are `mse` and `mae`. 161 | 162 | """ 163 | self.nb_bands = nb_bands 164 | self.ckpt_path = ckpt_path 165 | self.num_epochs = num_epochs 166 | self.starting_learning_rate = starting_learning_rate 167 | self.patience = patience 168 | self.lr_patience = int(patience/2) 169 | with tf.Graph().as_default(): 170 | inputs = tf.placeholder(tf.float32, shape=(None, None,None,nb_bands),name='inputs') 171 | learning_rate = tf.placeholder(tf.float32, name='learning_rate') 172 | training = tf.placeholder(tf.bool,name='training') 173 | 174 | tf.add_to_collection("inputs", inputs) 175 | tf.add_to_collection("training", training) 176 | tf.add_to_collection('learning_rate', learning_rate) 177 | 178 | loss, hidden_output = mcae(inputs, training, nb_bands, loss_weights, weight_decay) 179 | tf.add_to_collection('loss', loss) 180 | tf.add_to_collection('hidden_output', hidden_output) 181 | 182 | opt = tf.contrib.opt.NadamOptimizer(learning_rate) 183 | update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) 184 | with tf.control_dependencies(update_ops): 185 | train_step = opt.minimize(loss) 186 | tf.add_to_collection('train_step', train_step) 187 | 188 | with tf.Session() as sess: 189 | saver = tf.train.Saver() 190 | sess.run(tf.global_variables_initializer()) 191 | saver.save(sess, self.ckpt_path+'/model.ckpt', 0) 192 | 193 | def fit(self, X_train, X_val, batch_size=128): 194 | 195 | if not os.path.exists(self.ckpt_path): 196 | os.makedirs(self.ckpt_path) 197 | 198 | with tf.Graph().as_default(): 199 | 200 | with tf.Session() as sess: 201 | ckpt = tf.train.get_checkpoint_state(self.ckpt_path+'/') # get latest checkpoint (if any) 202 | if ckpt and ckpt.model_checkpoint_path: 203 | # if checkpoint exists, restore the parameters and set epoch_n and i_iter 204 | saver = tf.train.import_meta_graph(ckpt.model_checkpoint_path+'.meta', 205 | clear_devices=True) 206 | saver.restore(sess, ckpt.model_checkpoint_path) 207 | inputs = tf.get_collection('inputs')[0] 208 | training = tf.get_collection('training')[0] 209 | train_step = tf.get_collection('train_step')[0] 210 | loss = tf.get_collection('loss')[0] 211 | learning_rate = tf.get_collection('learning_rate')[0] 212 | else: 213 | raise FileNotFoundError('Cannot Find Checkpoint') 214 | 215 | best_loss = 100000.0 216 | lr = self.starting_learning_rate 217 | 218 | t_msg1 = '\rEpoch %d (%d/%d) -- lr=%1.0e -- train=%1.2f' 219 | 
v_msg1 = ' -- val=%1.2f' 220 | pc = 0 221 | lrc = 0 222 | 223 | train_samples = X_train.shape[0] 224 | idx = np.arange(train_samples) 225 | train_loss = EMA() 226 | 227 | for e in range(self.num_epochs): 228 | np.random.shuffle(idx) 229 | for i in range(0 , train_samples, batch_size): 230 | _, loss0 = sess.run([train_step, loss], 231 | feed_dict={inputs: X_train[idx[i:i+batch_size]], 232 | training: True, 233 | learning_rate: lr}) 234 | train_loss.update(loss0) 235 | print(t_msg1 % (e+1,i+1,train_samples,lr, train_loss.avg), end="") 236 | 237 | #Calculate validation loss/acc 238 | val_losses = np.array([],dtype=np.float32) 239 | for i in range(0, X_val.shape[0], batch_size): 240 | val_loss = sess.run([loss], 241 | feed_dict={inputs: X_val[i:i+batch_size], 242 | training: False}) 243 | val_losses = np.append(val_losses, val_loss) 244 | val_loss = np.mean(val_losses) 245 | 246 | print(v_msg1 % (val_loss), end="") 247 | 248 | if val_loss < best_loss and not math.isnan(val_loss): 249 | saver.save(sess, self.ckpt_path+'/model.ckpt',e) 250 | best_loss = val_loss 251 | print(' -- best_val_loss=%1.2f' % best_loss) 252 | pc = 0 253 | lrc=0 254 | else: 255 | pc += 1 256 | lrc += 1 257 | if pc > self.patience: 258 | break 259 | elif lrc >= self.lr_patience: 260 | lrc = 0 261 | lr /= 10.0 262 | print(' -- Patience=%d/%d' % (pc, self.patience)) 263 | 264 | if math.isnan(val_loss): 265 | break 266 | 267 | def transform(self, X, batch_size=128): 268 | 269 | n = X.shape[0] 270 | with tf.Graph().as_default(): 271 | if not os.path.exists(self.ckpt_path): 272 | os.makedirs(self.ckpt_path) 273 | 274 | with tf.Session() as sess: 275 | ckpt = tf.train.get_checkpoint_state(self.ckpt_path+'/') # get latest checkpoint (if any) 276 | if ckpt and ckpt.model_checkpoint_path: 277 | # if checkpoint exists, restore the parameters and set epoch_n and i_iter 278 | saver = tf.train.import_meta_graph(ckpt.model_checkpoint_path+'.meta', 279 | clear_devices=True) 280 | saver.restore(sess, ckpt.model_checkpoint_path) 281 | inputs = tf.get_collection('inputs')[0] 282 | training = tf.get_collection('training')[0] 283 | hidden_output = tf.get_collection('hidden_output')[0] 284 | else: 285 | raise FileNotFoundError('Cannot Find Checkpoint') 286 | 287 | if n > batch_size: 288 | 289 | prediction = np.zeros((n,X.shape[1], X.shape[2], 256), dtype=np.float32) 290 | 291 | for i in range(0, n, batch_size): 292 | start = np.min([i, n - batch_size]) 293 | end = np.min([i + batch_size, n]) 294 | 295 | prediction[start:end] = sess.run(hidden_output, 296 | feed_dict={inputs: X[start:end], 297 | training: False}) 298 | 299 | else: 300 | prediction = sess.run(hidden_output, 301 | feed_dict={inputs: X, 302 | training: False}) 303 | return prediction 304 | -------------------------------------------------------------------------------- /feature_extraction/scae.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from feature_extraction.cae import CAE 3 | from feature_extraction.mcae import MCAE 4 | import tensorflow as tf 5 | 6 | class SCAE(): 7 | """Stacked Convolutional Autoencoder with single- and multi-loss support. 8 | 9 | """ 10 | def __init__(self, nb_bands=None, nb_cae=None, weight_decay=0.0, 11 | ckpt_path='checkpoints', patience=10, num_epochs=1000, 12 | pre_trained=True, loss_mode='mcae'): 13 | """ 14 | Args: 15 | nb_bands (int): Number of bands (dimensionality). 16 | nb_cae (int): Number of CAEs. 
17 | weight_decay (float): (Default: 1e-4) 18 | gpu_list (list of int, None): List of GPU ids that the model is trained on. 19 | If None, the model is trained on all available GPUs. 20 | model_checkpoint_path (str): Filepath to save the model file after every epoch. 21 | loss_mode (str): Specifies whether the CAE is single- or multi-loss. 22 | Supported values are `single` and `multi`. 23 | (Default: `multi`) 24 | weights (str): Filepath to the weights file to load. (Default: None) 25 | loss (str): Loss function used for MCAE. 26 | Supported functions are `mse` and `mae`. 27 | 28 | """ 29 | 30 | self.N = 256 31 | self.nb_cae = nb_cae 32 | self.ckpt_path = ckpt_path 33 | loss_weights=[1.0, 1e-1, 1e-2, 1e-2] 34 | 35 | if nb_bands is not None: 36 | self.nb_bands = np.append(nb_bands , 37 | np.ones(nb_cae-1,dtype=np.int32)*self.N) 38 | 39 | self.weight_decay= weight_decay 40 | self.patience = patience 41 | self.loss_weights = loss_weights 42 | self.num_epochs = num_epochs 43 | self.loss_mode = loss_mode 44 | 45 | def fit(self, X_train, X_val, batch_size=128): 46 | """ Trains the model on the given dataset for 500 epochs. 47 | 48 | Args: 49 | X_train (arr): Training data. 50 | X_val (arr): Validation data. 51 | batch_size (int): Number of samples per gradient update. 52 | 53 | """ 54 | for i in range(self.nb_cae): 55 | print('Training CAE #%d...' % i) 56 | 57 | if self.loss_mode == 'cae': 58 | 59 | cae = CAE(self.nb_bands[i], 60 | ckpt_path=self.ckpt_path+'/cae%d' % i, 61 | loss_weights=self.loss_weights, 62 | weight_decay=self.weight_decay, 63 | patience=self.patience, 64 | num_epochs=self.num_epochs) 65 | elif self.loss_mode == 'mcae': 66 | cae = MCAE(self.nb_bands[i], 67 | ckpt_path=self.ckpt_path+'/mcae%d' % i, 68 | loss_weights=self.loss_weights, 69 | weight_decay=self.weight_decay, 70 | patience=self.patience, 71 | num_epochs=self.num_epochs) 72 | else: 73 | raise ValueError('Not a valid loss_mode.') 74 | 75 | cae.fit(X_train, X_val, batch_size=batch_size) 76 | if i < self.nb_cae-1: 77 | X_train = cae.transform(X_train) 78 | X_val = cae.transform(X_val) 79 | 80 | return 1 81 | 82 | 83 | def transform(self, X): 84 | """Generates output reconstructions for the input samples. 85 | 86 | Args: 87 | X (arr): The input data. 88 | 89 | Returns: 90 | Reconstruction of input data. 
91 | 92 | """ 93 | output = np.zeros((X.shape[0], X.shape[1], self.N*self.nb_cae), 94 | dtype=np.float32) 95 | X = X[np.newaxis] 96 | for i in range(self.nb_cae): 97 | 98 | with tf.Graph().as_default(): 99 | with tf.Session() as sess: 100 | ckpt = tf.train.get_checkpoint_state(self.ckpt_path+'/cae%d/' % i) 101 | if ckpt and ckpt.model_checkpoint_path: 102 | # if checkpoint exists, restore the parameters and set epoch_n and i_iter 103 | saver = tf.train.import_meta_graph(ckpt.model_checkpoint_path+'.meta', 104 | clear_devices=True) 105 | saver.restore(sess, ckpt.model_checkpoint_path) 106 | inputs = tf.get_collection('inputs')[0] 107 | training = tf.get_collection('training')[0] 108 | hidden_output = tf.get_collection('hidden_output')[0] 109 | else: 110 | raise FileNotFoundError('Cannot Find Checkpoint') 111 | 112 | 113 | X = sess.run(hidden_output, feed_dict={inputs: X, 114 | training: False}) 115 | output[:,:, self.N*i:self.N*(i+1)] = X[0] 116 | 117 | 118 | return output 119 | -------------------------------------------------------------------------------- /fileIO/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/rmkemker/EarthMapper/098392c405feebc79606c01b3c1c4bb48999f224/fileIO/__init__.py -------------------------------------------------------------------------------- /fileIO/aviris.py: -------------------------------------------------------------------------------- 1 | """ 2 | Name: aviris.py 3 | Author: Ronald Kemker 4 | Description: Pipeline for processing AVIRIS imagery 5 | 6 | Note: 7 | Requires GDAL, remote_sensing.utils, and display (optional) 8 | http://www.gdal.org/ 9 | """ 10 | 11 | import gdal 12 | from glob import glob 13 | import numpy as np 14 | from utils.utils import band_resample_hsi_cube 15 | from fileIO.utils import readENVIHeader 16 | 17 | class AVIRIS(): 18 | """ 19 | AVIRIS 20 | 21 | This processes data from the Airborne Visible/Infrared Imaging Spectrometer 22 | (AVIRIS). 23 | 24 | Parameters 25 | ---------- 26 | directory : string, file directory of AVIRIS data 27 | calibrate : boolean, convert hyperspectral cube frol digital count to 28 | radiance. 
(Default: True) 29 | bands : numpy array [1 x num target bands], target bands for spectral 30 | resampling (Default: None -> No resampling is performed) 31 | fwhm : numpy array [1 x num target bands], target full-width half maxes for 32 | for spectral resampling (Default: None) 33 | 34 | Attributes 35 | ---------- 36 | dName : string, input "directory" 37 | cal : boolean, input "calibrate" 38 | bands : numpy array [1 x num source bands], band centers of input cube 39 | fwhm : numpy array [1 x num source bands], full width, half maxes of input 40 | cube 41 | tgt_bands : numpy array [1 x num target bands], input "bands" 42 | tgt_fwhm : numpy array [1 x num target bands], input "fwhm" 43 | data : numpy array, hyperspectral cube 44 | shape : tuple, shape of hyperspectral cube 45 | mask : numpy array, binary mask of valid data (1 is valid pixel) 46 | 47 | Notes 48 | ----- 49 | Ref: https://aviris.jpl.nasa.gov/ 50 | 51 | """ 52 | def __init__(self , directory, calibrate = True, bands=None, fwhm=None): 53 | self.dName = directory 54 | self.cal = calibrate 55 | self.tgt_bands = bands 56 | self.tgt_fwhm = fwhm 57 | self.read() 58 | pass 59 | 60 | def read(self): 61 | """Reads orthomosaic, calculates binary mask, performs radiometric 62 | calibration (optional), and band resamples (optional)""" 63 | f = glob(self.dName+'/*ort_img')[0] 64 | self.data = gdal.Open(f).ReadAsArray().transpose(1,2,0) 65 | self.shape = self.data.shape 66 | self.mask = self.data[:,:,10] != self._mask_value() 67 | self.data[self.mask==False] = 0 68 | self.hdr_data() 69 | 70 | if self.cal: 71 | self.radiometric_cal() 72 | 73 | if self.tgt_bands is not None: 74 | self.band_resample() 75 | 76 | def radiometric_cal(self): 77 | """Performs radiometric calibration with gain file""" 78 | with open(glob(self.dName+'/*gain')[0]) as f: 79 | lines = f.read().splitlines() 80 | 81 | gain = np.zeros(224, dtype=np.float32) 82 | for i in range(len(lines)): 83 | gain[i] = np.float32(lines[i].split()[0]) 84 | 85 | self.data[self.mask] = np.float32(self.data[self.mask])*gain 86 | 87 | def show_rgb(self): 88 | """Displays RGB visualization of HSI cube""" 89 | from display import imshow 90 | idx = np.array([13,19,27], dtype=np.uint8) 91 | imshow(self.data[:,:,idx]) 92 | 93 | def hdr_data(self): 94 | """Finds band-centers and full-width half-maxes of hyperspectral cube""" 95 | f = glob(self.dName+'/*ort_img.hdr')[0] 96 | self.bands, self.fwhm = readENVIHeader(f) 97 | 98 | def band_resample(self): 99 | """Performs band resampling operation""" 100 | self.data = band_resample_hsi_cube(self.data,self.bands,self.tgt_bands, 101 | self.fwhm, self.tgt_fwhm, 102 | self.mask) 103 | 104 | def _mask_value(self): 105 | """Finds value of invalid orthomosaic pixels""" 106 | return np.min([self.data[0,-1],self.data[-1,0],self.data[0,-1], 107 | self.data[-1,-1]]) 108 | -------------------------------------------------------------------------------- /fileIO/aviris_bands.hdr: -------------------------------------------------------------------------------- 1 | ENVI 2 | description = { 3 | AVIRIS orthocorrected file, pixel size = 17.2000 4 | rotation angle = 0.000000 5 | datum = WGS-84 6 | UTM zone = 10 7 | upper left corner (1,1) (Easting) = 752834.71 8 | upper left corner (1,1) (Northing) = 4047735.4 } 9 | samples = 748 10 | lines = 1425 11 | bands = 224 12 | header offset = 0 13 | data type = 2 14 | interleave = bip 15 | byte order = 1 16 | map info ={UTM, 1, 1, 752834.710, 4047735.400, 17.200, 17.200, 17 | 10, North, WGS-84, units=Meters, rotation=0.000000} 18 | 
x start = 1 19 | y start = 1 20 | wavelength = { 21 | 365.9298 , 22 | 375.5940 , 23 | 385.2625 , 24 | 394.9355 , 25 | 404.6129 , 26 | 414.2946 , 27 | 423.9808 , 28 | 433.6713 , 29 | 443.3662 , 30 | 453.0655 , 31 | 462.7692 , 32 | 472.4773 , 33 | 482.1898 , 34 | 491.9066 , 35 | 501.6279 , 36 | 511.3535 , 37 | 521.0836 , 38 | 530.8180 , 39 | 540.5568 , 40 | 550.3000 , 41 | 560.0477 , 42 | 569.7996 , 43 | 579.5560 , 44 | 589.3168 , 45 | 599.0819 , 46 | 608.8515 , 47 | 618.6254 , 48 | 628.4037 , 49 | 638.1865 , 50 | 647.9736 , 51 | 657.7651 , 52 | 667.5610 , 53 | 655.2923 , 54 | 665.0994 , 55 | 674.9012 , 56 | 684.6979 , 57 | 694.4894 , 58 | 704.2756 , 59 | 714.0566 , 60 | 723.8325 , 61 | 733.6031 , 62 | 743.3685 , 63 | 753.1287 , 64 | 762.8837 , 65 | 772.6335 , 66 | 782.3781 , 67 | 792.1174 , 68 | 801.8516 , 69 | 811.5805 , 70 | 821.3043 , 71 | 831.0228 , 72 | 840.7361 , 73 | 850.4442 , 74 | 860.1471 , 75 | 869.8448 , 76 | 879.5372 , 77 | 889.2245 , 78 | 898.9066 , 79 | 908.5834 , 80 | 918.2551 , 81 | 927.9214 , 82 | 937.5827 , 83 | 947.2387 , 84 | 956.8895 , 85 | 966.5351 , 86 | 976.1755 , 87 | 985.8106 , 88 | 995.4406 , 89 | 1005.065 , 90 | 1014.685 , 91 | 1024.299 , 92 | 1033.908 , 93 | 1043.512 , 94 | 1053.111 , 95 | 1062.704 , 96 | 1072.293 , 97 | 1081.876 , 98 | 1091.454 , 99 | 1101.026 , 100 | 1110.594 , 101 | 1120.156 , 102 | 1129.713 , 103 | 1139.265 , 104 | 1148.811 , 105 | 1158.353 , 106 | 1167.889 , 107 | 1177.420 , 108 | 1186.946 , 109 | 1196.466 , 110 | 1205.982 , 111 | 1215.492 , 112 | 1224.997 , 113 | 1234.496 , 114 | 1243.991 , 115 | 1253.480 , 116 | 1262.964 , 117 | 1253.373 , 118 | 1263.346 , 119 | 1273.318 , 120 | 1283.291 , 121 | 1293.262 , 122 | 1303.234 , 123 | 1313.206 , 124 | 1323.177 , 125 | 1333.148 , 126 | 1343.119 , 127 | 1353.089 , 128 | 1363.060 , 129 | 1373.030 , 130 | 1383.000 , 131 | 1392.969 , 132 | 1402.939 , 133 | 1412.908 , 134 | 1422.877 , 135 | 1432.845 , 136 | 1442.814 , 137 | 1452.782 , 138 | 1462.750 , 139 | 1472.718 , 140 | 1482.685 , 141 | 1492.652 , 142 | 1502.619 , 143 | 1512.586 , 144 | 1522.552 , 145 | 1532.518 , 146 | 1542.484 , 147 | 1552.450 , 148 | 1562.416 , 149 | 1572.381 , 150 | 1582.346 , 151 | 1592.311 , 152 | 1602.275 , 153 | 1612.240 , 154 | 1622.204 , 155 | 1632.167 , 156 | 1642.131 , 157 | 1652.094 , 158 | 1662.057 , 159 | 1672.020 , 160 | 1681.983 , 161 | 1691.945 , 162 | 1701.907 , 163 | 1711.869 , 164 | 1721.831 , 165 | 1731.792 , 166 | 1741.753 , 167 | 1751.714 , 168 | 1761.675 , 169 | 1771.635 , 170 | 1781.596 , 171 | 1791.556 , 172 | 1801.515 , 173 | 1811.475 , 174 | 1821.434 , 175 | 1831.393 , 176 | 1841.352 , 177 | 1851.310 , 178 | 1861.269 , 179 | 1871.227 , 180 | 1873.184 , 181 | 1867.164 , 182 | 1877.225 , 183 | 1887.285 , 184 | 1897.342 , 185 | 1907.396 , 186 | 1917.448 , 187 | 1927.499 , 188 | 1937.546 , 189 | 1947.592 , 190 | 1957.635 , 191 | 1967.675 , 192 | 1977.714 , 193 | 1987.750 , 194 | 1997.784 , 195 | 2007.815 , 196 | 2017.845 , 197 | 2027.872 , 198 | 2037.896 , 199 | 2047.919 , 200 | 2057.939 , 201 | 2067.956 , 202 | 2077.972 , 203 | 2087.985 , 204 | 2097.996 , 205 | 2108.004 , 206 | 2118.010 , 207 | 2128.014 , 208 | 2138.016 , 209 | 2148.015 , 210 | 2158.012 , 211 | 2168.007 , 212 | 2177.999 , 213 | 2187.989 , 214 | 2197.977 , 215 | 2207.962 , 216 | 2217.945 , 217 | 2227.926 , 218 | 2237.905 , 219 | 2247.881 , 220 | 2257.854 , 221 | 2267.826 , 222 | 2277.795 , 223 | 2287.762 , 224 | 2297.727 , 225 | 2307.689 , 226 | 2317.649 , 227 | 2327.607 , 228 | 2337.562 , 229 | 2347.516 , 230 | 2357.467 , 231 | 2367.415 
, 232 | 2377.361 , 233 | 2387.305 , 234 | 2397.247 , 235 | 2407.186 , 236 | 2417.123 , 237 | 2427.058 , 238 | 2436.990 , 239 | 2446.920 , 240 | 2456.848 , 241 | 2466.773 , 242 | 2476.696 , 243 | 2486.617 , 244 | 2496.536 } 245 | fwhm = { 246 | 9.852108 , 247 | 9.796976 , 248 | 9.744104 , 249 | 9.693492 , 250 | 9.645140 , 251 | 9.599048 , 252 | 9.555216 , 253 | 9.513644 , 254 | 9.474332 , 255 | 9.437280 , 256 | 9.402488 , 257 | 9.369956 , 258 | 9.339684 , 259 | 9.311672 , 260 | 9.285920 , 261 | 9.262428 , 262 | 9.241196 , 263 | 9.222224 , 264 | 9.205512 , 265 | 9.191060 , 266 | 9.178868 , 267 | 9.168936 , 268 | 9.161264 , 269 | 9.155852 , 270 | 9.152700 , 271 | 9.151808 , 272 | 9.153176 , 273 | 9.156804 , 274 | 9.162692 , 275 | 9.170840 , 276 | 9.181248 , 277 | 9.193916 , 278 | 9.908470 , 279 | 9.848707 , 280 | 9.798712 , 281 | 9.758482 , 282 | 9.728020 , 283 | 9.707324 , 284 | 9.696396 , 285 | 9.695233 , 286 | 9.703838 , 287 | 9.722210 , 288 | 9.750348 , 289 | 9.788254 , 290 | 9.835926 , 291 | 9.893364 , 292 | 9.960570 , 293 | 10.03754 , 294 | 10.12428 , 295 | 10.22079 , 296 | 10.32706 , 297 | 10.44310 , 298 | 10.56891 , 299 | 10.70448 , 300 | 10.84982 , 301 | 11.00493 , 302 | 11.16980 , 303 | 11.34444 , 304 | 11.52885 , 305 | 11.72302 , 306 | 11.92696 , 307 | 10.78153 , 308 | 9.761805 , 309 | 9.760064 , 310 | 9.759324 , 311 | 9.759585 , 312 | 9.760846 , 313 | 9.763108 , 314 | 9.766370 , 315 | 9.770633 , 316 | 9.775896 , 317 | 9.782160 , 318 | 9.789424 , 319 | 9.797689 , 320 | 9.806954 , 321 | 9.817220 , 322 | 9.828486 , 323 | 9.840753 , 324 | 9.854020 , 325 | 9.868288 , 326 | 9.883556 , 327 | 9.899825 , 328 | 9.917094 , 329 | 9.935364 , 330 | 9.954635 , 331 | 9.974905 , 332 | 9.996177 , 333 | 10.01845 , 334 | 10.04172 , 335 | 10.06599 , 336 | 10.09127 , 337 | 10.11754 , 338 | 10.14481 , 339 | 10.17309 , 340 | 10.20236 , 341 | 10.23264 , 342 | 10.83826 , 343 | 10.83267 , 344 | 10.82721 , 345 | 10.82188 , 346 | 10.81670 , 347 | 10.81165 , 348 | 10.80674 , 349 | 10.80196 , 350 | 10.79733 , 351 | 10.79283 , 352 | 10.78847 , 353 | 10.78425 , 354 | 10.78016 , 355 | 10.77621 , 356 | 10.77240 , 357 | 10.76873 , 358 | 10.76519 , 359 | 10.76180 , 360 | 10.75853 , 361 | 10.75541 , 362 | 10.75243 , 363 | 10.74958 , 364 | 10.74687 , 365 | 10.74429 , 366 | 10.74186 , 367 | 10.73956 , 368 | 10.73740 , 369 | 10.73538 , 370 | 10.73349 , 371 | 10.73174 , 372 | 10.73013 , 373 | 10.72866 , 374 | 10.72732 , 375 | 10.72612 , 376 | 10.72506 , 377 | 10.72414 , 378 | 10.72335 , 379 | 10.72270 , 380 | 10.72219 , 381 | 10.72182 , 382 | 10.72159 , 383 | 10.72149 , 384 | 10.72153 , 385 | 10.72170 , 386 | 10.72202 , 387 | 10.72247 , 388 | 10.72306 , 389 | 10.72378 , 390 | 10.72465 , 391 | 10.72565 , 392 | 10.72679 , 393 | 10.72807 , 394 | 10.72948 , 395 | 10.73103 , 396 | 10.73272 , 397 | 10.73455 , 398 | 10.73651 , 399 | 10.73861 , 400 | 10.74085 , 401 | 10.74323 , 402 | 10.74574 , 403 | 10.74839 , 404 | 10.75118 , 405 | 10.75411 , 406 | 11.16661 , 407 | 11.15791 , 408 | 11.14889 , 409 | 11.13955 , 410 | 11.12990 , 411 | 11.11992 , 412 | 11.10964 , 413 | 11.09903 , 414 | 11.08811 , 415 | 11.07687 , 416 | 11.06531 , 417 | 11.05344 , 418 | 11.04125 , 419 | 11.02875 , 420 | 11.01592 , 421 | 11.00278 , 422 | 10.98933 , 423 | 10.97555 , 424 | 10.96146 , 425 | 10.94705 , 426 | 10.93233 , 427 | 10.91729 , 428 | 10.90193 , 429 | 10.88626 , 430 | 10.87026 , 431 | 10.85396 , 432 | 10.83733 , 433 | 10.82039 , 434 | 10.80313 , 435 | 10.78555 , 436 | 10.76766 , 437 | 10.74945 , 438 | 10.73092 , 439 | 10.71208 , 440 | 10.69292 , 
441 | 10.67344 , 442 | 10.65365 , 443 | 10.63354 , 444 | 10.61311 , 445 | 10.59236 , 446 | 10.57130 , 447 | 10.54992 , 448 | 10.52823 , 449 | 10.50622 , 450 | 10.48389 , 451 | 10.46124 , 452 | 10.43828 , 453 | 10.41500 , 454 | 10.39140 , 455 | 10.36749 , 456 | 10.34326 , 457 | 10.31871 , 458 | 10.29385 , 459 | 10.26867 , 460 | 10.24317 , 461 | 10.21736 , 462 | 10.19123 , 463 | 10.16478 , 464 | 10.13801 , 465 | 10.11093 , 466 | 10.08353 , 467 | 10.05582 , 468 | 10.02778 , 469 | 9.999434 } 470 | -------------------------------------------------------------------------------- /fileIO/gliht.py: -------------------------------------------------------------------------------- 1 | """ 2 | Name: aviris.py 3 | Author: Ronald Kemker 4 | Description: Pipeline for processing AVIRIS imagery 5 | 6 | Note: 7 | Requires GDAL, remote_sensing.utils, and display (optional) 8 | http://www.gdal.org/ 9 | """ 10 | 11 | import gdal 12 | from glob import glob 13 | import numpy as np 14 | from utils.utils import band_resample_hsi_cube 15 | from fileIO.utils import readENVIHeader 16 | 17 | class GLiHT(): 18 | """ 19 | GLiHT 20 | 21 | This processes hyperspectral data from the G-LiHT: 22 | Goddard's LiDAR, Hyperspectral & Thermal Imager 23 | (GLiHT). 24 | 25 | Parameters 26 | ---------- 27 | directory : string, file directory of GLiHT data 28 | calibrate : boolean, convert hyperspectral cube frol digital count to 29 | radiance. (Default: True) 30 | bands : numpy array [1 x num target bands], target bands for spectral 31 | resampling (Default: None -> No resampling is performed) 32 | fwhm : numpy array [1 x num target bands], target full-width half maxes for 33 | for spectral resampling (Default: None) 34 | 35 | Attributes 36 | ---------- 37 | dName : string, input "directory" 38 | cal : boolean, input "calibrate" 39 | bands : numpy array [1 x num source bands], band centers of input cube 40 | fwhm : numpy array [1 x num source bands], full width, half maxes of input 41 | cube 42 | tgt_bands : numpy array [1 x num target bands], input "bands" 43 | tgt_fwhm : numpy array [1 x num target bands], input "fwhm" 44 | data : numpy array, hyperspectral cube 45 | shape : tuple, shape of hyperspectral cube 46 | mask : numpy array, binary mask of valid data (1 is valid pixel) 47 | 48 | Notes 49 | ----- 50 | Ref: https://gliht.gsfc.nasa.gov/ 51 | 52 | """ 53 | def __init__(self , directory, calibrate = True, bands=None, fwhm=None): 54 | self.dName = directory 55 | self.cal = calibrate 56 | self.tgt_bands = bands 57 | self.tgt_fwhm = fwhm 58 | self.read() 59 | pass 60 | 61 | def read(self): 62 | '''Reads orthomosaic, calculates binary mask, performs radiometric 63 | ibration (optional), and band resamples (optional)''' 64 | f = glob(self.dName+'/*radiance_L1G')[0] 65 | self.data = gdal.Open(f).ReadAsArray().transpose(1,2,0) 66 | self.shape = self.data.shape 67 | self.mask = self.data[:,:,10] != 0 68 | self.hdr_data() 69 | 70 | if self.cal: 71 | self.radiometric_cal() 72 | 73 | if self.tgt_bands is not None: 74 | self.band_resample() 75 | 76 | self.data[self.mask==False] = 0 77 | 78 | 79 | def radiometric_cal(self): 80 | """Performs radiometric calibration with gain file""" 81 | gain = 1e-4 82 | self.data[self.mask] = np.float32(self.data[self.mask])*gain 83 | 84 | def show_rgb(self): 85 | '''Displays RGB visualization of HSI cube''' 86 | from display import imshow 87 | idx = np.array([15,33,54], dtype=np.uint8) #blue, green, red 88 | imshow(self.data[:,:,idx]) 89 | 90 | def hdr_data(self): 91 | """Finds band-centers and full-width 
half-maxes of hyperspectral cube""" 92 | f = glob(self.dName+'/*radiance_L1G.hdr')[0] 93 | self.bands, self.fwhm = readENVIHeader(f) 94 | 95 | def band_resample(self): 96 | """Performs band resampling operation""" 97 | self.data = band_resample_hsi_cube(self.data,self.bands,self.tgt_bands, 98 | self.fwhm, self.tgt_fwhm) 99 | 100 | -------------------------------------------------------------------------------- /fileIO/hyperion.py: -------------------------------------------------------------------------------- 1 | """ 2 | Name: hyperion.py 3 | Author: Ronald Kemker 4 | Description: Pipeline for processing Hyperion imagery 5 | 6 | Note: 7 | Requires GDAL, remote_sensing.utils, and display (optional) 8 | http://www.gdal.org/ 9 | """ 10 | 11 | import gdal 12 | from glob import glob 13 | import numpy as np 14 | from utils.utils import band_resample_hsi_cube 15 | 16 | class Hyperion(): 17 | """ 18 | Hyperion 19 | 20 | This processes data from the Hyperion HSI sensor. 21 | (Hyperion). 22 | 23 | Parameters 24 | ---------- 25 | directory : string, file directory of Hyperion data 26 | calibrate : boolean, convert hyperspectral cube frol digital count to 27 | radiance. (Default: True) 28 | bands : numpy array [1 x num target bands], target bands for spectral 29 | resampling (Default: None -> No resampling is performed) 30 | fwhm : numpy array [1 x num target bands], target full-width half maxes for 31 | for spectral resampling (Default: None) 32 | 33 | Attributes 34 | ---------- 35 | dName : string, input "directory" 36 | cal : boolean, input "calibrate" 37 | bands : numpy array [1 x num source bands], band centers of input cube 38 | fwhm : numpy array [1 x num source bands], full width, half maxes of input 39 | cube 40 | tgt_bands : numpy array [1 x num target bands], input "bands" 41 | tgt_fwhm : numpy array [1 x num target bands], input "fwhm" 42 | data : numpy array, hyperspectral cube 43 | shape : tuple, shape of hyperspectral cube 44 | mask : numpy array, binary mask of valid data (1 is valid pixel) 45 | 46 | Notes 47 | ----- 48 | Ref: https://eo1.usgs.gov/sensors/hyperion 49 | 50 | """ 51 | def __init__(self , directory, calibrate = True, bands=None, fwhm=None): 52 | self.dName = directory 53 | self.cal = calibrate 54 | self.tgt_bands = bands 55 | self.tgt_fwhm = fwhm 56 | self.read() 57 | pass 58 | 59 | def read(self): 60 | """Reads orthomosaic, calculates binary mask, performs radiometric 61 | calibration (optional), and band resamples (optional)""" 62 | dataFN = sorted(glob(self.dName+'/*.TIF')) 63 | self.calIdx = np.hstack([np.arange(7,57),np.arange(76,224)]) 64 | 65 | tmp = gdal.Open(dataFN[self.calIdx[0]]).ReadAsArray() 66 | 67 | self.data = np.zeros((tmp.shape[0],tmp.shape[1],198),dtype=np.float32) 68 | self.shape = self.data.shape 69 | self.hdr_data() 70 | 71 | self.data[:,:,0] = tmp 72 | for i in range(1,198): 73 | self.data[:,:,i] = gdal.Open(dataFN[self.calIdx[i]]).ReadAsArray() 74 | 75 | if self.cal: 76 | self.data[:,:,0:50] = self.data[:,:,0:50]/40 77 | self.data[:,:,50:] = self.data[:,:,50:]/80 78 | 79 | self.mask = self.data[:,:,10] != self._mask_value() 80 | 81 | if self.tgt_bands is not None: 82 | self.band_resample() 83 | 84 | self.data[self.mask==False] = 0 85 | 86 | def show_rgb(self): 87 | """Displays RGB visualization of HSI cube""" 88 | from display import imshow 89 | idx = np.array([20,10,0], dtype=np.uint8) 90 | imshow(self.data[:,:,idx]) 91 | 92 | def hdr_data(self): 93 | """Finds band-centers and full-width half-maxes of hyperspectral cube""" 94 | hdr = 
np.genfromtxt('/home/rmk6217/Documents/grss_challenge/fileIO/hyperion_bands.csv', delimiter=',') 95 | self.bands = hdr[:,0] 96 | self.fwhm = hdr[:,1] 97 | 98 | 99 | def band_resample(self): 100 | """Performs band resampling operation""" 101 | self.data = band_resample_hsi_cube(self.data,self.bands,self.tgt_bands, 102 | self.fwhm, self.tgt_fwhm) 103 | 104 | def _mask_value(self): 105 | """Finds value of invalid orthomosaic pixels""" 106 | return np.min([self.data[0,-1],self.data[-1,0],self.data[0,-1], 107 | self.data[-1,-1]]) 108 | 109 | 110 | -------------------------------------------------------------------------------- /fileIO/hyperion_bands.csv: -------------------------------------------------------------------------------- 1 | 426.8200,11.3871 2 | 436.9900,11.3871 3 | 447.1700,11.3871 4 | 457.3400,11.3871 5 | 467.5200,11.3871 6 | 477.6900,11.3871 7 | 487.8700,11.3784 8 | 498.0400,11.3538 9 | 508.2200,11.3133 10 | 518.3900,11.2580 11 | 528.5700,11.1907 12 | 538.7400,11.1119 13 | 548.9200,11.0245 14 | 559.0900,10.9321 15 | 569.2700,10.8368 16 | 579.4500,10.7407 17 | 589.6200,10.6482 18 | 599.8000,10.5607 19 | 609.9700,10.4823 20 | 620.1500,10.4147 21 | 630.3200,10.3595 22 | 640.5000,10.3188 23 | 650.6700,10.2942 24 | 660.8500,10.2856 25 | 671.0200,10.2980 26 | 681.2000,10.3349 27 | 691.3700,10.3909 28 | 701.5500,10.4592 29 | 711.7200,10.5322 30 | 721.9000,10.6004 31 | 732.0700,10.6562 32 | 742.2500,10.6933 33 | 752.4300,10.7058 34 | 762.6000,10.7276 35 | 772.7800,10.7907 36 | 782.9500,10.8833 37 | 793.1300,10.9938 38 | 803.3000,11.1044 39 | 813.4800,11.1980 40 | 823.6500,11.2600 41 | 833.8300,11.2824 42 | 844.0000,11.2822 43 | 854.1800,11.2816 44 | 864.3500,11.2809 45 | 874.5300,11.2797 46 | 884.7000,11.2782 47 | 894.8800,11.2771 48 | 905.0500,11.2765 49 | 915.2300,11.2756 50 | 925.4100,11.2754 51 | 912.4500,11.0457 52 | 922.5400,11.0457 53 | 932.6400,11.0457 54 | 942.7300,11.0457 55 | 952.8200,11.0457 56 | 962.9100,11.0457 57 | 972.9900,11.0457 58 | 983.0800,11.0457 59 | 993.1700,11.0457 60 | 1003.3000,11.0457 61 | 1013.3000,11.0457 62 | 1023.4000,11.0451 63 | 1033.4900,11.0423 64 | 1043.5900,11.0372 65 | 1053.6900,11.0302 66 | 1063.7900,11.0218 67 | 1073.8900,11.0122 68 | 1083.9900,11.0013 69 | 1094.0900,10.9871 70 | 1104.1900,10.9732 71 | 1114.1900,10.9572 72 | 1124.2800,10.9418 73 | 1134.3800,10.9248 74 | 1144.4800,10.9065 75 | 1154.5800,10.8884 76 | 1164.6800,10.8696 77 | 1174.7700,10.8513 78 | 1184.8700,10.8335 79 | 1194.9700,10.8154 80 | 1205.0700,10.7979 81 | 1215.1700,10.7822 82 | 1225.1700,10.7663 83 | 1235.2700,10.7520 84 | 1245.3600,10.7385 85 | 1255.4600,10.7270 86 | 1265.5600,10.7174 87 | 1275.6600,10.7091 88 | 1285.7600,10.7022 89 | 1295.8600,10.6970 90 | 1305.9600,10.6946 91 | 1316.0500,10.6937 92 | 1326.0500,10.6949 93 | 1336.1500,10.6996 94 | 1346.2500,10.7058 95 | 1356.3500,10.7163 96 | 1366.4500,10.7283 97 | 1376.5500,10.7437 98 | 1386.6500,10.7612 99 | 1396.7400,10.7807 100 | 1406.8400,10.8034 101 | 1416.9400,10.8267 102 | 1426.9400,10.8534 103 | 1437.0400,10.8818 104 | 1447.1400,10.9110 105 | 1457.2300,10.9422 106 | 1467.3300,10.9743 107 | 1477.4300,11.0074 108 | 1487.5300,11.0414 109 | 1497.6300,11.0759 110 | 1507.7300,11.1108 111 | 1517.8300,11.1461 112 | 1527.9200,11.1811 113 | 1537.9200,11.2156 114 | 1548.0200,11.2496 115 | 1558.1200,11.2826 116 | 1568.2200,11.3146 117 | 1578.3200,11.3460 118 | 1588.4200,11.3753 119 | 1598.5100,11.4037 120 | 1608.6100,11.4302 121 | 1618.7100,11.4538 122 | 1628.8100,11.4760 123 | 1638.8100,11.4958 124 | 1648.9000,11.5133 125 | 
1659.0000,11.5286 126 | 1669.1000,11.5404 127 | 1679.2000,11.5505 128 | 1689.3000,11.5580 129 | 1699.4000,11.5621 130 | 1709.5000,11.5634 131 | 1719.6000,11.5617 132 | 1729.7000,11.5563 133 | 1739.7000,11.5477 134 | 1749.7900,11.5346 135 | 1759.8900,11.5193 136 | 1769.9900,11.5002 137 | 1780.0900,11.4789 138 | 1790.1900,11.4548 139 | 1800.2900,11.4279 140 | 1810.3800,11.3994 141 | 1820.4800,11.3688 142 | 1830.5800,11.3366 143 | 1840.5800,11.3036 144 | 1850.6800,11.2696 145 | 1860.7800,11.2363 146 | 1870.8700,11.2007 147 | 1880.9800,11.1666 148 | 1891.0700,11.1333 149 | 1901.1700,11.1018 150 | 1911.2700,11.0714 151 | 1921.3700,11.0424 152 | 1931.4700,11.0155 153 | 1941.5700,10.9912 154 | 1951.5700,10.9698 155 | 1961.6600,10.9508 156 | 1971.7600,10.9355 157 | 1981.8600,10.9230 158 | 1991.9600,10.9139 159 | 2002.0600,10.9083 160 | 2012.1500,10.9069 161 | 2022.2500,10.9057 162 | 2032.3500,10.9013 163 | 2042.4500,10.8951 164 | 2052.4500,10.8854 165 | 2062.5500,10.8740 166 | 2072.6500,10.8591 167 | 2082.7500,10.8429 168 | 2092.8400,10.8242 169 | 2102.9400,10.8039 170 | 2113.0400,10.7820 171 | 2123.1400,10.7592 172 | 2133.2400,10.7342 173 | 2143.3400,10.7092 174 | 2153.3400,10.6834 175 | 2163.4300,10.6572 176 | 2173.5300,10.6312 177 | 2183.6300,10.6052 178 | 2193.7300,10.5803 179 | 2203.8300,10.5560 180 | 2213.9300,10.5328 181 | 2224.0300,10.5101 182 | 2234.1200,10.4904 183 | 2244.2200,10.4722 184 | 2254.2200,10.4552 185 | 2264.3200,10.4408 186 | 2274.4200,10.4285 187 | 2284.5200,10.4197 188 | 2294.6100,10.4129 189 | 2304.7100,10.4088 190 | 2314.8100,10.4077 191 | 2324.9100,10.4077 192 | 2335.0100,10.4077 193 | 2345.1100,10.4077 194 | 2355.2100,10.4077 195 | 2365.2000,10.4077 196 | 2375.3000,10.4077 197 | 2385.4000,10.4077 198 | 2395.5000,10.4077 199 | -------------------------------------------------------------------------------- /fileIO/utils.py: -------------------------------------------------------------------------------- 1 | """ 2 | Name: fileIO.py 3 | Author: Ronald Kemker 4 | Description: Input/Output helper functions 5 | 6 | Note: 7 | Requires GDAL, remote_sensing.utils, and display (optional) 8 | http://www.gdal.org/ 9 | """ 10 | 11 | import numpy as np 12 | import pickle 13 | from scipy.io import loadmat, savemat 14 | from gdal import Open 15 | import spectral.io.envi as envi 16 | from numba import jit 17 | from random import sample 18 | 19 | def read(fName): 20 | """File reader 21 | 22 | Parameters 23 | ---------- 24 | fname : string, file path to read in (supports .npy, .sav, .pkl, and 25 | .tif (GEOTIFF)) including file extension 26 | 27 | Returns 28 | ------- 29 | output : output data (in whatever format) 30 | """ 31 | ext = fName[-3:] 32 | 33 | if ext == 'npy': 34 | return np.load(fName) 35 | elif ext == 'sav' or ext == 'pkl': 36 | return pickle.load(open(fName, 'rb')) 37 | elif ext == 'mat': 38 | return loadmat(fName) 39 | elif ext == 'tif' or 'pix': 40 | return Open(fName).ReadAsArray() 41 | else: 42 | print('Unknown filename extension') 43 | return -1 44 | 45 | def write(fName, data): 46 | """File writer 47 | 48 | Parameters 49 | ---------- 50 | fname : string, file path to read in (supports .npy, .sav, .pkl, and 51 | .tif (GEOTIFF)) including file extension 52 | data : data to be stored 53 | Returns 54 | ------- 55 | output : output data (in whatever format) 56 | """ 57 | ext = fName[-3:] 58 | 59 | if ext == 'npy': 60 | np.save(fName, data) 61 | return 1 62 | elif ext == 'sav' or ext == 'pkl': 63 | pickle.dump(data, open(fName, 'wb')) 64 | return 1 65 | elif ext == 'mat': 66 
| savemat(fName, data) 67 | return 1 68 | else: 69 | print('Unknown filename extension') 70 | return -1 71 | 72 | def readENVIHeader(fName): 73 | """ 74 | Reads envi header 75 | 76 | Parameters 77 | ---------- 78 | fName : String, Path to .hdr file 79 | 80 | Returns 81 | ------- 82 | centers : Band-centers 83 | fwhm : full-width half-maxes 84 | """ 85 | hdr = envi.read_envi_header(fName) 86 | centers = np.array(hdr['wavelength'],dtype=np.float) 87 | fwhm = np.array(hdr['fwhm'],dtype=np.float) 88 | return centers, fwhm 89 | 90 | @jit 91 | def patch_extractor_with_mask(data, mask, num_patches=50, patch_size=16): 92 | """ 93 | Extracts patches inside a mask. I need to find a faster way of doing this. 94 | 95 | Parameters 96 | ---------- 97 | data : 3-D input numpy array [rows x columns x channels] 98 | mask : 2-D binary mask where 1 is valid, 0 else 99 | num_patches : int, number of patches to extract (Default: 50) 100 | patch_size : int, pixel dimension of square patch (Default: 16) 101 | 102 | Returns 103 | ------- 104 | output : 4-D Numpy array [num patches x rows x columns x channels] 105 | """ 106 | sh = data.shape 107 | patch_arr = np.zeros((num_patches,patch_size, patch_size, sh[-1]), 108 | dtype=data.dtype) 109 | 110 | if type(patch_size) == int: 111 | patch_size = np.array([patch_size, patch_size], dtype = np.int32) 112 | 113 | #Find Valid Regions to Extract Patches 114 | valid = np.zeros(mask.shape, dtype=np.uint8) 115 | for i in range(0, sh[0]-patch_size[0]): 116 | for j in range(0, sh[1]-patch_size[1]): 117 | if np.all(mask[i:i+patch_size[0], j:j+patch_size[1]]) == True: 118 | valid[i,j] += 1 119 | 120 | idx = np.argwhere(valid > 0 ) 121 | idx = idx[np.array(sample(range(idx.shape[0]), num_patches))] 122 | 123 | for i in range(num_patches): 124 | patch_arr[i] = data[idx[i,0]:idx[i,0]+patch_size[0], 125 | idx[i,1]:idx[i,1]+patch_size[1]] 126 | 127 | return patch_arr 128 | 129 | @jit 130 | def patch_extractor(data, num_patches=50, patch_size=16): 131 | """ 132 | Extracts patches inside a mask. I need to find a faster way of doing this. 
133 | 134 | Parameters 135 | ---------- 136 | data : 3-D input numpy array [rows x columns x channels] 137 | num_patches : int, number of patches to extract (Default: 50) 138 | patch_size : int, pixel dimension of square patch (Default: 16) 139 | 140 | Returns 141 | ------- 142 | output : 4-D Numpy array [num patches x rows x columns x channels] 143 | """ 144 | sh = data.shape 145 | patch_arr = np.zeros((num_patches,patch_size, patch_size, sh[-1]), 146 | dtype=data.dtype) 147 | 148 | if type(patch_size) == int: 149 | patch_size = np.array([patch_size, patch_size], dtype = np.int32) 150 | 151 | #Find Valid Regions to Extract Patches 152 | valid = np.zeros(data.shape[:2], dtype=np.uint8) 153 | valid[0: sh[0]-patch_size[0], 0: sh[1]-patch_size[1]] = 1 154 | 155 | idx = np.argwhere(valid > 0 ) 156 | idx = idx[np.array(sample(range(idx.shape[0]), num_patches))] 157 | 158 | for i in range(num_patches): 159 | patch_arr[i] = data[idx[i,0]:idx[i,0]+patch_size[0], 160 | idx[i,1]:idx[i,1]+patch_size[1]] 161 | return patch_arr 162 | -------------------------------------------------------------------------------- /metrics.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | """ 4 | Created on Wed Feb 22 10:42:10 2017 5 | 6 | @author: rmkemker 7 | """ 8 | 9 | from sklearn.metrics import confusion_matrix 10 | import numpy as np 11 | import time 12 | 13 | class Metrics(): 14 | def __init__(self, truth, prediction): 15 | self.c = confusion_matrix(truth , prediction) 16 | 17 | def _oa(self): 18 | return np.sum(np.diag(self.c))/np.sum(self.c) 19 | 20 | def _aa(self): 21 | return np.mean(np.diag(self.c)/np.sum(self.c, axis=1)) 22 | 23 | def _ca(self): 24 | return np.diag(self.c)/np.sum(self.c, axis=1) 25 | 26 | def per_class_accuracies(self): 27 | return self._ca() 28 | 29 | def mean_class_accuracy(self): 30 | return self._aa() 31 | 32 | def overall_accuracy(self): 33 | return self._oa() 34 | 35 | def _kappa(self): 36 | e = np.sum(np.sum(self.c,axis=1)*np.sum(self.c,axis=0))/np.sum(self.c)**2 37 | return (self._oa()-e)/(1-e) 38 | 39 | def standard_metrics(self): 40 | return self._oa(), self._aa(), self._kappa() 41 | 42 | class Timer(object): 43 | def __init__(self, name=None): 44 | self.name = name 45 | def __enter__(self): 46 | self.tic = time.time() 47 | def __exit__(self, type, value, traceback): 48 | if self.name: 49 | print('[%s]' % self.name) 50 | print('Elapsed: %1.3f seconds' % (time.time() - self.tic)) -------------------------------------------------------------------------------- /pipeline.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | """ 4 | Created on Thu Mar 23 14:22:36 2017 5 | 6 | @author: Ronald Kemker and Utsav Gewali 7 | """ 8 | 9 | import numpy as np 10 | from sklearn.base import BaseEstimator, ClassifierMixin 11 | from preprocessing import ImagePreprocessor 12 | from postprocessing import MRF, CRF 13 | from utils.utils import band_resample_hsi_cube 14 | 15 | class DummyOp(object): 16 | def __init__(self): 17 | pass 18 | 19 | def transform(self, X): 20 | return X 21 | 22 | class Pipeline(BaseEstimator, ClassifierMixin): 23 | """ 24 | Pipeline to classify single-image hyperspectral image cube 25 | - Deployable Framework for Remote Sensing Classification 26 | 27 | Note: TODO Link to paper here 28 | 29 | Args: 30 | classifier (obj): A classifier object. 
31 | pre_processor (str, list, obj, optional): Pre-process the image data. 32 | It can either be a string ('MinMaxScaler', 'StandardScaler', 33 | 'PCA', 'Normalize','Exponential', 'ConeResponse, or 'ReLU'), an 34 | ImagePreprocessor object, or a list containing one or more of the 35 | previous strings or ImagePre-processor objects. 36 | feature_extractor (obj, optional): A spatial-spectral feature extractor 37 | object (e.g. MICA or SCAE). 38 | feature_scaler (str, list, obj, optional): If a feature extractor is 39 | used, you can pre-process the data again prior to classification. 40 | The input format is the same as the pre_processor object. 41 | feature_spatial_padding (2-D tuple): This is the pad_width argument to 42 | numpy.pad. This is used if spatial padding is required for feature 43 | extraction. 44 | post_processor (str, obj, optional): The post-processing object or 45 | string with the name of that post-processing object. The only 46 | supported post-processor at this time is 'CRF'. 47 | source_bands (array of floats, optional): The band centers of the 48 | source data. 49 | source_fwhm (array of floats, optional): The full-width, half maxes 50 | of the source data. Should be the same shape as source_bands. 51 | target_bands (array of floats, optional): The band centers of the 52 | target data. 53 | target_fwhm (array of floats, optional): The full-width, half maxes 54 | of the target data. Should be the same shape as target_bands. 55 | semisupervised (boolean, optional): If true, the classifier will treat 56 | all labels = -1 as unlabeled training data. Labels < -1 are 57 | ignored. (Default: False) 58 | **kwargs (optional): These are various input variables to the 59 | pre-processor, feature_scaler, classifier, and post_processor 60 | class instances. This doesn't really work well... 61 | 62 | Attributes: 63 | 64 | """ 65 | def __init__(self, classifier, pre_processor=None, feature_extractor=None, 66 | feature_scaler=[], feature_spatial_padding=None, 67 | post_processor=None, 68 | source_bands=[], source_fwhm=[], target_bands=[], 69 | target_fwhm=[], layer_sizes=None, denoising_cost=None, 70 | semisupervised = False, **kwargs): 71 | 72 | valid_pre = ['MinMaxScaler', 'StandardScaler', 'PCA', 73 | 'GlobalContrastNormalization', 'ZCAWhiten', 'Normalize', 74 | 'Exponential','ReLU','ConeResponse', 'AveragePooling2D'] 75 | 76 | valid_post = ['MRF','CRF'] 77 | 78 | self.classifier = classifier 79 | self.semisupervised = semisupervised 80 | 81 | self.probability = post_processor is not None 82 | kwargs['probability'] = self.probability 83 | 84 | #Pre-process the data 85 | if pre_processor is None: 86 | pre_processor = [] 87 | elif type(pre_processor) is not list: 88 | pre_processor = [pre_processor] 89 | 90 | self.pre_processor = [] 91 | for pre in pre_processor: 92 | if type(pre) == str: 93 | if pre in valid_pre: 94 | self.pre_processor.append(ImagePreprocessor(pre)) 95 | for k in kwargs.keys(): 96 | if k in self.pre_processor[-1].__dict__.keys(): 97 | self.pre_processor[-1].__dict__[k] = kwargs[k] 98 | else: 99 | msg = 'Invalid pre_processor type: %s. Built-in options include %s.' 
100 | raise ValueError(msg % (pre, str(valid_pre)[1:-1])) 101 | else: 102 | self.pre_processor.append(pre) 103 | 104 | 105 | #Pre-process the data after feature extraction 106 | if type(feature_scaler) is not list: 107 | feature_scaler = [feature_scaler] 108 | 109 | self.feature_scaler = [] 110 | for pre in feature_scaler: 111 | if type(pre) == str: 112 | if pre in valid_pre: 113 | self.feature_scaler.append(ImagePreprocessor(pre)) 114 | for k in kwargs.keys(): 115 | if k in self.feature_scaler[-1].__dict__.keys(): 116 | self.feature_scaler[-1].__dict__[k] = kwargs[k] 117 | else: 118 | msg = 'Invalid pre_processor type: %s. Built-in options include %s.' 119 | raise ValueError(msg % (pre, str(valid_pre)[1:-1])) 120 | else: 121 | self.feature_scaler.append(pre) 122 | 123 | if type(post_processor) == str: 124 | if post_processor in valid_post: 125 | self.post_processor = eval(post_processor)() 126 | else: 127 | msg = 'Invalid post-processor type: %s. Built-in options include %s.' 128 | raise ValueError(msg % (post_processor, str(valid_post)[1:-1])) 129 | else: 130 | self.post_processor = post_processor 131 | 132 | self.feature_extractor = feature_extractor 133 | self.source_bands = source_bands 134 | self.source_fwhm = source_fwhm 135 | self.target_bands = target_bands 136 | self.target_fwhm = target_fwhm 137 | self.pad = feature_spatial_padding 138 | 139 | def fit(self, X_train, y_train, X_val, y_val): 140 | 141 | """ 142 | Fits to training data. 143 | 144 | Args: 145 | X_train (ndarray): the X training data 146 | y_train (ndarray): the y training data 147 | X_val (ndarray): the X validation data 148 | y_val (ndarray): the y validation data 149 | """ 150 | 151 | same = np.all(X_train == X_val) 152 | 153 | #Data pre-processing 154 | for i in range(len(self.pre_processor)): 155 | self.pre_processor[i].fit(X_train) 156 | X_train = self.pre_processor[i].transform(X_train) 157 | if not same: 158 | X_val = self.pre_processor[i].transform(X_val) 159 | 160 | if self.pad: 161 | X_train = np.pad(X_train, self.pad, mode='constant') 162 | if not same: 163 | X_val = np.pad(X_val, self.pad, mode='constant') 164 | 165 | if not self.feature_extractor: 166 | self.feature_extractor = [DummyOp()] 167 | self.target_bands= [[]] 168 | self.target_fwhm = [[]] 169 | elif not isinstance(self.feature_extractor, list): 170 | self.feature_extractor = [self.feature_extractor] 171 | self.target_bands = [self.target_bands] 172 | self.target_fwhm = [self.target_fwhm] 173 | 174 | base_train = np.copy(X_train) 175 | final_train = np.zeros((base_train.shape[:2] + (0, )),dtype=np.float32) 176 | if not same: 177 | base_val = np.copy(X_val) 178 | final_val = np.zeros((base_val.shape[:2] + (0, )),dtype=np.float32) 179 | 180 | for i, fe in enumerate(self.feature_extractor): 181 | #Band resampling 182 | if len(self.target_bands[i]) and len(self.target_fwhm[i]) and \ 183 | len(self.source_bands) and len(self.source_fwhm): 184 | X_train = band_resample_hsi_cube(base_train, self.source_bands, 185 | self.target_bands[i], 186 | self.source_fwhm, 187 | self.target_fwhm[i]) 188 | if not same: 189 | X_val = band_resample_hsi_cube(base_val, self.source_bands, 190 | self.target_bands[i], 191 | self.source_fwhm, 192 | self.target_fwhm[i]) 193 | #Feature extraction 194 | X_train = fe.transform(X_train) 195 | if not same: 196 | X_val = fe.transform(X_val) 197 | 198 | final_train = np.append(final_train, X_train, axis=-1) 199 | if not same: 200 | final_val = np.append(final_val, X_val, axis=-1) 201 | 202 | 203 | X_train = np.copy(final_train) 
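# X_train (and X_val, when a separate validation cube was supplied) now holds
# the feature maps from every feature extractor concatenated along the
# channel axis; the padded working copies (base_train, final_train) can be
# released, and any spatial padding applied before feature extraction is
# cropped back off below.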
204 | del base_train, final_train 205 | if not same: 206 | X_val = np.copy(final_val) 207 | del base_val, final_val 208 | 209 | if self.pad: 210 | X_train = X_train[self.pad[0][0]:-self.pad[0][1], 211 | self.pad[1][0]:-self.pad[1][1]] 212 | if not same: 213 | X_val = X_val[self.pad[0][0]:-self.pad[0][1], 214 | self.pad[1][0]:-self.pad[1][1]] 215 | 216 | print(X_train.shape) 217 | if len(self.feature_scaler): 218 | for feature_scaler_i in self.feature_scaler: 219 | feature_scaler_i.fit(X_train) #TODO: Combine train and val 220 | X_train = feature_scaler_i.transform(X_train) 221 | if not same: 222 | X_val = feature_scaler_i.transform(X_val) 223 | 224 | if same: 225 | X_val = X_train 226 | 227 | if self.post_processor: 228 | feature2d = X_val 229 | val_idx = y_val.ravel() > -1 230 | 231 | #Reshape to N x F 232 | if len(X_train.shape) > 2: 233 | X_train = X_train.reshape(-1, X_train.shape[-1]) 234 | if len(X_val.shape) > 2: 235 | X_val = X_val.reshape(-1, X_val.shape[-1]) 236 | if len(y_train.shape) > 1: 237 | y_train = y_train.ravel() 238 | if len(y_val.shape) > 1: 239 | y_val = y_val.ravel() 240 | 241 | if not self.semisupervised: 242 | X_train = X_train[y_train > -1] 243 | y_train = y_train[y_train > -1] 244 | else: 245 | X_train = X_train[y_train >= -1] 246 | y_train = y_train[y_train >= -1] 247 | 248 | X_val = X_val[y_val > -1] 249 | y_val = y_val[y_val > -1] 250 | 251 | print("Train: %d x %d" % (X_train.shape[0], X_train.shape[1])) 252 | print("Val: %d x %d" % (X_val.shape[0], X_val.shape[1])) 253 | 254 | #Fit classifier 255 | self.classifier.fit(X_train, y_train, X_val, y_val) 256 | 257 | #Fit MRF/CRF 258 | if self.post_processor: 259 | sh = feature2d.shape 260 | prob_map = self.classifier.predict_proba(feature2d.reshape(-1, sh[-1])) 261 | prob_map = prob_map.reshape(sh[0], sh[1], -1) 262 | self.post_processor.fit(prob_map+1e-50, y_val, val_idx) 263 | 264 | 265 | def predict(self, X): 266 | 267 | """ 268 | Makes predictions based on training and X. 
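        The fitted pre-processors, band resampling (when band centers were
        supplied), feature extractors, and feature scalers are applied to X
        before classification; if a post-processor was configured, its
        spatially smoothed labels are returned as a flattened array with one
        label per pixel.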
269 | 270 | Args: 271 | X (ndarray): the testing data 272 | 273 | Returns: 274 | pred (ndarray): the predictions of the y data based 275 | on training and X 276 | """ 277 | 278 | if len(self.pre_processor): 279 | for j in range(len(self.pre_processor)): 280 | X = self.pre_processor[j].transform(X) 281 | 282 | if self.pad: 283 | X = np.pad(X, self.pad, mode='constant') 284 | 285 | base = np.copy(X) 286 | final = np.zeros(X.shape[:2] + (0,),dtype=np.float32) 287 | 288 | for i, fe in enumerate(self.feature_extractor): 289 | if len(self.target_bands[i]) and len(self.target_fwhm[i]) and \ 290 | len(self.source_bands) and len(self.source_fwhm): 291 | X = band_resample_hsi_cube(base, self.source_bands, 292 | self.target_bands[i], 293 | self.source_fwhm, 294 | self.target_fwhm[i]) 295 | else: 296 | X = np.copy(base) 297 | 298 | #Feature extraction 299 | final = np.append(final, fe.transform(X) , axis=-1) 300 | 301 | 302 | X = np.copy(final) 303 | del base, final 304 | 305 | if self.pad: 306 | #TODO: generalize this to N-dimensions 307 | X = X[self.pad[0][0]:-self.pad[0][1], 308 | self.pad[1][0]:-self.pad[1][1]] 309 | 310 | if len(self.feature_scaler): 311 | for feature_scaler_i in self.feature_scaler: 312 | X = feature_scaler_i.transform(X) 313 | 314 | 315 | sh = X.shape 316 | if len(X.shape) > 2: 317 | X = X.reshape(-1, X.shape[-1]) 318 | 319 | if self.post_processor: 320 | prob = self.classifier.predict_proba(X) 321 | pred = self.post_processor.predict(prob.reshape(sh[0], sh[1], -1)+1e-50) 322 | pred = pred.ravel() 323 | else: 324 | pred = self.classifier.predict(X) 325 | 326 | 327 | return pred 328 | 329 | def predict_proba(self, X): 330 | 331 | """ 332 | Finds probabilities of possible outcomes using samples from X. 333 | 334 | Args: 335 | X (ndarray): the testing data 336 | 337 | Returns: ndarray of possible outcomes based on samples from X 338 | """ 339 | 340 | if len(self.pre_processor): 341 | for j in range(len(self.pre_processor)): 342 | X = self.pre_processor[j].transform(X) 343 | 344 | if self.pad: 345 | X = np.pad(X, self.pad, mode='constant') 346 | 347 | base = np.copy(X) 348 | final = np.zeros(X.shape[:2] + (0,),dtype=np.float32) 349 | 350 | for i, fe in enumerate(self.feature_extractor): 351 | if len(self.target_bands[i]) and len(self.target_fwhm[i]) and \ 352 | len(self.source_bands) and len(self.source_fwhm): 353 | X = band_resample_hsi_cube(base, self.source_bands, 354 | self.target_bands[i], 355 | self.source_fwhm, 356 | self.target_fwhm[i]) 357 | else: 358 | X = np.copy(base) 359 | 360 | #Feature extraction 361 | final = np.append(final, fe.transform(X) , axis=-1) 362 | 363 | 364 | X = np.copy(final) 365 | del base, final 366 | 367 | if self.pad: 368 | #TODO: generalize this to N-dimensions 369 | X = X[self.pad[0][0]:-self.pad[0][1], 370 | self.pad[1][0]:-self.pad[1][1]] 371 | 372 | if len(self.feature_scaler): 373 | for feature_scaler_i in self.feature_scaler: 374 | X = feature_scaler_i.transform(X) 375 | 376 | if len(X.shape) > 2: 377 | X = X.reshape(-1, X.shape[-1]) 378 | 379 | proba = self.classifier.predict_proba(X) 380 | 381 | return proba 382 | -------------------------------------------------------------------------------- /postprocessing.py: -------------------------------------------------------------------------------- 1 | """ 2 | Classes for gridwise MRF and dense CRF 3 | Uses pydense (https://github.com/lucasb-eyer/pydensecrf) and python_gco (https://github.com/amueller/gco_python) libraries 4 | """ 5 | import numpy as np 6 | import pydensecrf.densecrf as dcrf 7 | 
import pygco as gco 8 | 9 | class MRF(object): 10 | """ 11 | MRF postprocessing 12 | -uses alpha-beta swap algorithm from Multilabel optimization library GCO (http://vision.csd.uwo.ca/code/) 13 | -unary energy is derived from predicted probability map 14 | -pairwise energy is Potts function (parameter optimized using grid search) 15 | """ 16 | 17 | def __init__(self,inference_niter=50, multiplier=1000): 18 | self.INFERENCE_NITER = inference_niter #number of iteration of alpha-beta swap 19 | self.MULTIPLIER = multiplier #GCO uses int32 for energy therefore floating point energy is 20 | #scaled by multiplier and truncated 21 | 22 | 23 | def fit(self, X, y, idx):#img is unused here but is needed for the pipeline 24 | self.unaryEnergy = np.ascontiguousarray(-self.MULTIPLIER*np.log(X)).astype('int32') #energy=-log(probability) 25 | nClass = self.unaryEnergy.shape[2] 26 | 27 | params = np.logspace(-3,3,10) #grid search values for Potts compatibility 28 | best_score = 0.0 29 | 30 | print('Fitting Grid MRF...') 31 | for i in range(params.shape[0]): 32 | pairwiseEnergy = (self.MULTIPLIER*params[i]*(1-np.eye(nClass))).astype('int32') 33 | out = gco.cut_simple(unary_cost=self.unaryEnergy,\ 34 | pairwise_cost=pairwiseEnergy,n_iter=self.INFERENCE_NITER, 35 | algorithm='swap').ravel() 36 | 37 | score = np.sum(out[idx]==y)/float(y.size) 38 | if score > best_score: 39 | self.cost = params[i] 40 | best_score = score 41 | 42 | params = np.logspace(np.log10(self.cost)-1,np.log10(self.cost)+1,30) 43 | best_score = 0.0 44 | 45 | print('Finetuning Grid MRF...') 46 | for i in range(params.shape[0]): 47 | pairwiseEnergy = (self.MULTIPLIER*params[i]*(1-np.eye(nClass))).astype('int32') 48 | out = gco.cut_simple(unary_cost=self.unaryEnergy,\ 49 | pairwise_cost=pairwiseEnergy,n_iter=self.INFERENCE_NITER, 50 | algorithm='swap').ravel() 51 | 52 | score = np.sum(out[idx]==y)/float(y.size) 53 | if score > best_score: 54 | self.cost = params[i] 55 | best_score = score 56 | 57 | def predict(self, X): 58 | #self.cost = 2. 59 | self.unaryEnergy = np.ascontiguousarray(-self.MULTIPLIER*np.log(X)).astype('int32') 60 | nClass = self.unaryEnergy.shape[2] 61 | pairwiseEnergy = (self.MULTIPLIER*self.cost*(1-np.eye(nClass))).astype('int32') 62 | return gco.cut_simple(unary_cost=self.unaryEnergy,\ 63 | pairwise_cost=pairwiseEnergy,n_iter=self.INFERENCE_NITER,algorithm='swap') 64 | 65 | 66 | class CRF(object): 67 | """ 68 | CRF postprocessing 69 | -uses dense CRF with Gaussian edge potential (https://arxiv.org/abs/1210.5644) 70 | -unary energy is derived from predicted probability map 71 | -pairwise energy is Potts function (parameters optimized using grid search) 72 | """ 73 | 74 | def __init__(self, inference_niter=10): 75 | self.INFERENCE_NITER = inference_niter #number of mean field iteration in dense CRF 76 | self.w1 = 1. 77 | self.scale = 1000. 
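    #w1 is the label-compatibility weight (compat) and scale is the Gaussian
    #kernel width (sxy) passed to addPairwiseGaussian; fit() re-estimates both
    #with a coarse-then-fine grid search before predict() applies them.
    #Illustrative usage (variable names assumed), mirroring how pipeline.py
    #calls this post-processor:
    #    crf = CRF(inference_niter=10)
    #    crf.fit(prob_map + 1e-50, y_val, val_idx)  #prob_map: rows x cols x classes
    #    labels = crf.predict(prob_map + 1e-50)     #labels: rows x cols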
78 | 79 | def fit(self, X, y, idx): 80 | [self.nRows,self.nCols,self.nClasses] = X.shape 81 | self.nPixels = self.nRows*self.nCols 82 | prob_list = X.reshape([self.nPixels,self.nClasses]) 83 | self.uEnergy = -np.log(prob_list.transpose().astype('float32')) #energy=-log(probability) 84 | 85 | params = [np.logspace(-3,3,10) for i in range(2)] 86 | params = np.meshgrid(*params) 87 | params = np.vstack([x.ravel() for x in params]).T 88 | 89 | best_score = 0.0 90 | 91 | print('Fitting Fully-Connected CRF...') 92 | for i in range(params.shape[0]): 93 | crf = dcrf.DenseCRF2D(self.nRows,self.nCols,self.nClasses) 94 | crf.setUnaryEnergy(np.ascontiguousarray(self.uEnergy)) 95 | crf.addPairwiseGaussian(sxy=params[i,0], compat=params[i,1]) 96 | 97 | Q = crf.inference(self.INFERENCE_NITER) 98 | pred = np.argmax(Q,axis=0) 99 | score = np.sum(pred[idx]==y)/float(y.size) 100 | 101 | if score > best_score: 102 | self.w1 = params[i,1] 103 | self.scale = params[i,0] 104 | best_score = score 105 | 106 | params = [np.logspace(np.log10(x)-1,np.log10(x)+1,10) for x in [self.scale, self.w1]] 107 | params = np.meshgrid(*params) 108 | params = np.vstack([x.ravel() for x in params]).T 109 | 110 | best_score = 0.0 111 | 112 | print('Finetuning Fully-Connected CRF...') 113 | for i in range(params.shape[0]): 114 | crf = dcrf.DenseCRF2D(self.nRows,self.nCols,self.nClasses) 115 | crf.setUnaryEnergy(np.ascontiguousarray(self.uEnergy)) 116 | crf.addPairwiseGaussian(sxy=params[i,0], compat=params[i,1]) 117 | 118 | Q = crf.inference(self.INFERENCE_NITER) 119 | pred = np.argmax(Q,axis=0) 120 | score = np.sum(pred[idx]==y)/float(y.size) 121 | 122 | if score > best_score: 123 | self.w1 = params[i,1] 124 | self.scale = params[i,0] 125 | best_score = score 126 | 127 | return 1 128 | 129 | def predict(self, X): 130 | [self.nRows,self.nCols,self.nClasses] = X.shape 131 | self.nPixels = self.nRows*self.nCols 132 | prob_list = X.reshape([self.nPixels,self.nClasses]) 133 | self.uEnergy = -np.log(prob_list.transpose().astype('float32')) 134 | 135 | crf = dcrf.DenseCRF2D(self.nRows,self.nCols,self.nClasses) 136 | crf.setUnaryEnergy(np.ascontiguousarray(self.uEnergy)) 137 | crf.addPairwiseGaussian(sxy=self.scale, compat=self.w1) 138 | 139 | Q = crf.inference(self.INFERENCE_NITER) 140 | pred = np.argmax(Q,axis=0) 141 | return pred.reshape([self.nRows,self.nCols]) 142 | 143 | -------------------------------------------------------------------------------- /preprocessing.py: -------------------------------------------------------------------------------- 1 | """ 2 | Name: preprocessing.py 3 | Author: Ronald Kemker 4 | Description: Image Pre-processing workflow 5 | Note: 6 | Requires scikit-learn 7 | """ 8 | 9 | from sklearn.preprocessing import StandardScaler, MinMaxScaler 10 | from sklearn.decomposition import PCA 11 | from utils.convolution import mean_pooling 12 | import numpy as np 13 | from scipy.stats import expon 14 | 15 | class ConeResponse(): 16 | def __init__(self, eps=0.05): 17 | self.eps = eps 18 | def fit(self, X): 19 | pass 20 | def transform(self, X): 21 | return (np.log(self.eps)-np.log(X + self.eps))/(np.log(self.eps)-np.log(1 + self.eps)) 22 | def fit_transform(self, X): 23 | self.fit(X) 24 | return self.transform(X) 25 | 26 | class ReLU(): 27 | def __init__(self): 28 | pass 29 | def fit(self, X): 30 | pass 31 | def transform(self, X): 32 | X[X<0] = 0 33 | return X 34 | def fit_transform(self, X): 35 | self.fit(X) 36 | return self.transform(X) 37 | 38 | class Exponential(): 39 | """ 40 | Class used to exponentially scale 
data by fitting and transforming it 41 | """ 42 | def __init__(self): 43 | pass 44 | def fit(self, X): 45 | """ 46 | Sets the scale based on the input data 47 | Args: 48 | X (array): the data to be used to set the scale 49 | """ 50 | self.scale = np.zeros(X.shape[-1], dtype=np.float32) 51 | for i in range(0,X.shape[-1]): 52 | _, self.scale[i] = expon.fit(X[:,i],floc=0) 53 | def transform(self, X): 54 | """ 55 | Transforms data based on the scale set in the fit function 56 | Args: 57 | X (array): the data to be transformed according to the scale 58 | Returns: the transformed data 59 | """ 60 | return 1-np.exp(-np.abs(X)/self.scale) 61 | def fit_transform(self, X): 62 | """ 63 | Performs both the fit and the transform methods on the same input data 64 | Args: 65 | X (array): the data to be used to set the scale then transformed 66 | Returns: the result of transforming the X data after fitting it 67 | """ 68 | self.fit(X) 69 | return self.transform(X) 70 | 71 | class Normalize(): 72 | """ 73 | Class used to normalize and scale data by fitting and transforming it 74 | Args: 75 | ord: Default value is None 76 | """ 77 | def __init__(self, norm_order=None): 78 | self.ord = norm_order 79 | def fit(self, X): 80 | """ 81 | Sets the scale based on the input data; uses np.linalg.norm 82 | Args: 83 | X (array): the data to be used to set the scale 84 | """ 85 | self.ord = np.linalg.norm(X, axis=0) 86 | self.ord[self.ord==0] = 1e-7 87 | def transform(self, X): 88 | """ 89 | Transforms data based on the scale set in the fit function 90 | Args: 91 | X (array): the data to be transformed according to the scale 92 | Returns: the transformed data 93 | """ 94 | return X/self.ord 95 | def fit_transform(self , X): 96 | """ 97 | Performs both the fit and the transform methods on the same input data 98 | Args: 99 | X (array): the data to be used to set the scale then transformed 100 | Returns: the result of transforming the X data after fitting it 101 | """ 102 | self.fit(X) 103 | return self.transform(X) 104 | 105 | class AveragePooling2D(): 106 | 107 | def __init__(self, pool_size=5): 108 | self.pool_size = pool_size 109 | def fit(self, X): 110 | pass 111 | def transform(self, X): 112 | return mean_pooling(X, self.pool_size) 113 | def fit_transform(self , X): 114 | return self.transform(X) 115 | 116 | class ImagePreprocessor(): 117 | 118 | def __init__(self, mode='StandardScaler' , feature_range=[0,1], 119 | with_std=True, PCA_components=None, svd_solver='auto', 120 | norm_order=None, eps=0.05, whiten=True): 121 | self.mode = mode.lower() 122 | self.with_std = with_std 123 | if self.mode == 'standardscaler': 124 | self.sclr = StandardScaler(with_std=with_std) 125 | elif self.mode == 'minmaxscaler': 126 | self.sclr = MinMaxScaler(feature_range = feature_range) 127 | elif self.mode == 'pca': 128 | self.sclr = PCA(n_components=PCA_components,whiten=whiten, 129 | svd_solver=svd_solver) 130 | elif self.mode == 'normalize': 131 | self.sclr = Normalize(norm_order) 132 | elif self.mode == 'exponential': 133 | self.sclr = Exponential() 134 | elif self.mode == 'coneresponse': 135 | self.sclr = ConeResponse(eps) 136 | elif self.mode == 'relu': 137 | self.sclr = ReLU() 138 | else: 139 | raise ValueError('Invalid pre-processing mode: ' + mode) 140 | 141 | def fit(self, data): 142 | data = data.reshape(-1,data.shape[-1]) 143 | self.sclr.fit(data) 144 | def transform(self,data): 145 | sh = data.shape 146 | data = data.reshape(-1,sh[-1]) 147 | data = self.sclr.transform(data) 148 | return data.reshape(sh[:-1] + 
(data.shape[-1],))
149 |     def fit_transform(self, data):
150 |         sh = data.shape
151 |         data = data.reshape(-1,sh[-1])
152 |         data = self.sclr.fit_transform(data)
153 |         return data.reshape(sh[:-1] + (data.shape[-1],))
154 |     def get_params(self):
155 |         if self.mode == 'standardscaler':
156 |             if self.with_std:
157 |                 return self.sclr.mean_ , self.sclr.scale_
158 |             else:
159 |                 return self.sclr.mean_
160 |         elif self.mode == 'minmaxscaler':
161 |             return self.sclr.data_min_, self.sclr.data_max_
162 |         elif self.mode == 'pca':
163 |             return self.sclr.components_
164 |         elif self.mode == 'normalize':
165 |             return self.sclr.ord
166 |         elif self.mode == 'exponential':
167 |             return self.sclr.scale
168 |         else:
169 |             return None
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | """
2 | setup.py
3 | Downloads and installs dependencies, datasets, and pretrained filters.
4 | """
5 | import pip
6 | import zipfile
7 | import urllib.request
8 | import os
9 | 
10 | 
11 | def main():
12 | 
13 |     #install numpy
14 |     pip_install('numpy')
15 | 
16 |     #install scipy
17 |     pip_install('scipy')
18 | 
19 |     #install matplotlib
20 |     pip_install('matplotlib')
21 | 
22 |     #install scikit-learn (the PyPI package name is 'scikit-learn', not 'sklearn')
23 |     pip_install('scikit-learn')
24 | 
25 |     #install spectral python
26 |     pip_install('spectral')
27 | 
28 |     #install gdal
29 |     pip_install('gdal')
30 | 
31 |     #install tensorflow-gpu
32 |     pip_install('tensorflow-gpu')
33 | 
34 |     #install pydensecrf from https://github.com/lucasb-eyer/pydensecrf
35 |     pip_install('git+https://github.com/lucasb-eyer/pydensecrf.git')
36 | 
37 |     #install py_gco from https://github.com/amueller/gco_python
38 |     pip_install('git+https://github.com/amueller/gco_python')
39 | 
40 | 
41 |     #download and install datasets and pretrained filters
42 |     pretrained_filters = [['http://www.cis.rit.edu/~rmk6217/scae.zip', './feature_extraction/pretrained'],
43 |                           ['http://www.cis.rit.edu/~rmk6217/smcae.zip', './feature_extraction/pretrained']]
44 |     datasets = [['http://www.cis.rit.edu/~rmk6217/datasets.zip', './']]
45 |     to_download = pretrained_filters + datasets
46 |     for i in range(len(to_download)):
47 |         print('Downloading ' + to_download[i][0] + ' .....')
48 |         download_and_unzip(to_download[i][0],to_download[i][1])
49 | 
50 | 
51 | def pip_install(pkg):
52 |     pip.main(['install', pkg])  #note: pip.main() was removed in pip >= 10
53 | 
54 | def download_and_unzip(url, out_folder):
55 |     tempfile_path = './tmp_dw_file.zip'
56 |     if not os.path.exists(out_folder):
57 |         os.mkdir(out_folder)
58 | 
59 |     urllib.request.urlretrieve(url, tempfile_path)
60 |     zip_ref = zipfile.ZipFile(tempfile_path, 'r')
61 |     zip_ref.extractall(out_folder)
62 |     zip_ref.close()
63 |     os.remove(tempfile_path)
64 | 
65 | 
66 | if __name__ == '__main__':
67 |     main()
68 | 
--------------------------------------------------------------------------------
/tf_initializers.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # -*- coding: utf-8 -*-
3 | """
4 | Created on Wed Aug 16 15:32:06 2017
5 | 
6 | @author: Ronald Kemker
7 | """
8 | 
9 | import tensorflow as tf
10 | import math
11 | import numpy as np
12 | 
13 | def get(initializer):
14 |     return eval(initializer) if isinstance(initializer, str) else initializer
15 | 
16 | def zeros(shape, name=None):
17 |     return tf.Variable(tf.zeros([shape]), name=name)
18 | 
19 | def ones(shape, name=None):
20 |     return tf.Variable(tf.ones([shape]), name=name)
21 | 
22 | def constant(shape, value=0, name=None):
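    #a constant-fill counterpart to zeros()/ones() above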
23 |     return tf.Variable(value * tf.ones([shape]), name=name)
24 | 
25 | def glorot_normal(shape, name=None):
26 |     return _normal_base(shape, name, stddev=math.sqrt(2 / np.sum(shape)))
27 | 
28 | def he_normal(shape, name=None):
29 |     return _normal_base(shape, name, stddev=math.sqrt(2 / shape[0]))
30 | 
31 | def lecun_normal(shape, name=None):
32 |     return _normal_base(shape, name, stddev=math.sqrt(1 / shape[0]))
33 | 
34 | def _normal_base(shape, name, stddev):
35 |     return tf.Variable(tf.random_normal(shape, stddev=stddev), name=name)
36 | 
--------------------------------------------------------------------------------
/utils/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rmkemker/EarthMapper/098392c405feebc79606c01b3c1c4bb48999f224/utils/__init__.py
--------------------------------------------------------------------------------
/utils/convolution.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # -*- coding: utf-8 -*-
3 | """
4 | Created on Wed Feb 22 10:19:42 2017
5 | 
6 | @author: Ronald Kemker
7 | """
8 | 
9 | import numpy as np
10 | import tensorflow as tf
11 | from scipy.ndimage import convolve
12 | 
13 | def meanPooling(X, poolSize):
14 |     """
15 |     meanPooling - Average pooling filter across data
16 |     Input:  X - [height x width x channels]
17 |             poolSize - (int) Receptive field size of mean pooling filter
18 |     Output: output - [height x width x channels]
19 |     """
20 |     X = np.float32(X[np.newaxis])
21 |     with tf.Graph().as_default():
22 |         X_t = tf.placeholder(tf.float32, shape=X.shape)
23 |         pool_filter = tf.ones((poolSize, poolSize, X.shape[-1], 1), tf.float32)/poolSize**2
24 |         half = int(poolSize/2)
25 |         pad_t = tf.pad(X_t , tf.constant([[0,0],[half, half], [half, half],[0,0]]), 'REFLECT')
26 |         pool_op = tf.nn.depthwise_conv2d(pad_t, pool_filter, strides=[1,1,1,1], padding='VALID')
27 |         with tf.Session() as sess:
28 |             return sess.run(pool_op, {X_t:X})[0]
29 | 
30 | def conv2d(X, w):
31 | 
32 |     with tf.Graph().as_default():
33 |         X = np.float32(X[np.newaxis])
34 |         X_t = tf.placeholder(tf.float32, name='X_t')
35 |         w_t = tf.placeholder(tf.float32, name='w_t')
36 | 
37 |         conv_op = tf.nn.conv2d(X_t, w_t, strides=[1,1,1,1], padding='SAME')
38 | 
39 |         nb_filters = w.shape[3]
40 | 
41 |         with tf.Session() as sess:
42 | 
43 |             if nb_filters <= 64:
44 |                 return sess.run(conv_op, {X_t:X, w_t:w})[0]
45 |             else:
46 |                 output = np.zeros((X.shape[1], X.shape[2], nb_filters))
47 |                 for i in range(0, nb_filters, 64):
48 |                     idx = np.arange(i,np.min([i+64, nb_filters]))
49 |                     output[:,:,idx] = sess.run(conv_op, {X_t:X,w_t:w[:,:,:,idx]})[0]
50 |                 return output  #the batch dimension was already removed above
51 | 
52 | def mean_pooling(X, pool_size):
53 | 
54 |     X = np.float32(X)
55 |     sh = X.shape
56 |     f = np.ones((pool_size, pool_size) , dtype=np.float32) / pool_size**2
57 | 
58 |     result = np.zeros(sh, dtype=np.float32)
59 | 
60 |     for i in range(sh[-1]):
61 |         result[:,:,i] = convolve(X[:,:,i], f, mode='reflect')
62 |     return result
63 | 
64 | if __name__ == "__main__":
65 | 
66 |     X = np.float32(np.random.rand(610,340,256*5*3))
67 |     result = mean_pooling(X, 5)
68 | 
--------------------------------------------------------------------------------
/utils/train_val_split.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # -*- coding: utf-8 -*-
3 | """
4 | Created on Tue Jul 25 10:58:44 2017
5 | 
6 | @author: rmk6217
7 | """
8 | 
9 | import numpy as np
10 | 
11 | def 
random_data_split(y, train_samples, classwise=True, ignore_zero=True, 12 | shuffle=False): 13 | """Split label data into two folds. 14 | 15 | Args: 16 | y (array of ints): M x N array of labels, from [0, N] 17 | train_samples (float, int): Determines how many samples will go to the 18 | training fold. Treated as percentage if float, treated as absolute 19 | if int. 20 | classwise (boolean): if True, training samples will be picked with 21 | respect to the classes rather than with respect to the image. 22 | ignore_zero (boolean): if True, neither fold will contain the 0 class. 23 | 24 | Returns: 25 | Array of ints, both M x N. Samples allocated for the other fold are 26 | set to -1. If `ignore_zero`, all values are decremented by 1. 27 | """ 28 | 29 | 30 | if not isinstance(train_samples, np.ndarray): 31 | relative = isinstance(train_samples, float) and 0 <= train_samples <= 1 32 | if not relative: 33 | cutoff = train_samples 34 | 35 | y = np.array(y) 36 | sh = y.shape 37 | y = y.ravel() 38 | 39 | num_classes = np.max(y) + 1 40 | 41 | y_train = np.ones(np.prod(sh), dtype=np.int32) * -1 42 | y_test = np.ones(np.prod(sh), dtype=np.int32) * -1 43 | 44 | if isinstance(train_samples, np.ndarray): 45 | start = 1 if ignore_zero else 0 46 | for i in range(start, num_classes): 47 | idx = np.where(y == i)[0] 48 | if shuffle: 49 | np.random.shuffle(idx) 50 | y_train[idx[:train_samples[i]]] = i 51 | y_test[idx[train_samples[i]:]] = i 52 | 53 | elif classwise: 54 | start = 1 if ignore_zero else 0 55 | for i in range(start, num_classes): 56 | idx = np.where(y == i)[0] 57 | if relative: 58 | cutoff = int(round(len(idx) * train_samples)) 59 | if shuffle: 60 | np.random.shuffle(idx) 61 | y_train[idx[:cutoff]] = i 62 | y_test[idx[cutoff:]] = i 63 | else: 64 | if ignore_zero: 65 | idx = np.where(y > 0)[0] 66 | else: 67 | idx = np.arange(len(y)) 68 | if shuffle: 69 | np.random.shuffle(idx) 70 | if relative: 71 | cutoff = int(round(len(idx) * train_samples)) 72 | y_train[idx[:cutoff]] = y[idx[:cutoff]] 73 | y_test[idx[cutoff:]] = y[idx[cutoff:]] 74 | y_train = y_train.reshape(sh) 75 | y_test = y_test.reshape(sh) 76 | 77 | if ignore_zero: 78 | y_train[y_train > 0] -= 1 79 | y_test[y_test > 0] -= 1 80 | 81 | return y_train, y_test 82 | 83 | 84 | def train_val_split(y, train_samples, val_samples, classwise=True, ignore_zero=True): 85 | """Randomly split label data into two folds. 86 | 87 | Args: 88 | y (array of ints): M x N array of labels, from [0, N] 89 | train_samples (int): Determines how many samples will go to the 90 | training fold. 91 | val_samples (int): Determines how many samples will go to the 92 | validation fold. 93 | classwise (boolean): if True, training samples will be picked with 94 | respect to the classes rather than with respect to the image. 95 | ignore_zero (boolean): if True, neither fold will contain the 0 class. 96 | 97 | Returns: 98 | Array of ints, both M x N. Samples allocated for the other fold are 99 | set to -1. If `ignore_zero`, all values are decremented by 1. 
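        For example (illustrative): train_val_split(y, 50, 25) on a label image
        with classes 0-5 places 50 randomly chosen pixels of each nonzero class
        in the training fold and a further 25 in the validation fold; all other
        pixels (including class 0) are set to -1, and the surviving labels are
        shifted down to 0-4.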
100 | """ 101 | 102 | 103 | y = np.array(y) 104 | sh = y.shape 105 | y = y.ravel() 106 | 107 | num_classes = np.max(y) + 1 108 | 109 | y_train = np.ones(np.prod(sh), dtype=np.int32) * -1 110 | y_val = np.ones(np.prod(sh), dtype=np.int32) * -1 111 | 112 | if isinstance(train_samples, np.ndarray): 113 | 114 | train_samples = np.int32(train_samples) 115 | val_samples = np.int32(val_samples) 116 | start = 1 if ignore_zero else 0 117 | for i in range(start, num_classes): 118 | idx = np.where(y == i)[0] 119 | np.random.shuffle(idx) 120 | y_train[idx[:train_samples[i]]] = i 121 | y_val[idx[train_samples[i]:train_samples[i]+val_samples[i]]] = i 122 | 123 | elif classwise: 124 | start = 1 if ignore_zero else 0 125 | train_samples = np.int32(train_samples) 126 | val_samples = np.int32(val_samples) 127 | for i in range(start, num_classes): 128 | idx = np.where(y == i)[0] 129 | 130 | np.random.shuffle(idx) 131 | y_train[idx[:train_samples]] = i 132 | y_val[idx[train_samples:train_samples+val_samples]] = i 133 | else: 134 | if ignore_zero: 135 | idx = np.where(y > 0)[0] 136 | else: 137 | idx = np.arange(len(y)) 138 | np.random.shuffle(idx) 139 | 140 | y_train[idx[:train_samples]] = y[idx[:train_samples]] 141 | y_val[idx[train_samples:train_samples+val_samples]] = y[idx[train_samples:train_samples+val_samples]] 142 | y_train = y_train.reshape(sh) 143 | y_val = y_val.reshape(sh) 144 | 145 | if ignore_zero: 146 | y_train[y_train > 0] -= 1 147 | y_val[y_val > 0] -= 1 148 | 149 | return y_train, y_val 150 | 151 | -------------------------------------------------------------------------------- /utils/utils.py: -------------------------------------------------------------------------------- 1 | """ 2 | Name: utils.py 3 | Author: Ronald Kemker 4 | Description: Helper functions for remote sensing applications. 5 | 6 | Note: 7 | Requires SpectralPython and GDAL 8 | http://www.spectralpython.net/ 9 | http://www.gdal.org/ 10 | """ 11 | 12 | import numpy as np 13 | from spectral import BandResampler 14 | import os 15 | 16 | def band_resample_hsi_cube(data, bands1, bands2, fwhm1,fwhm2 , mask=None): 17 | """ 18 | band_resample_hsi_cube : Resample hyperspectral image 19 | 20 | Parameters 21 | ---------- 22 | data : numpy array (height x width x spectral bands) 23 | bands1 : numpy array [1 x num source bands], 24 | the band centers of the input hyperspectral cube 25 | bands2 : numpy array [1 x num target bands], 26 | the band centers of the output hyperspectral cube 27 | fwhm1 : numpy array [1 x num source bands], 28 | the full-width half-max of the input hyperspectral cube 29 | fwhm2 : numpy array [1 x num target bands], 30 | the full-width half-max of the output hyperspectral cube 31 | mask : numpy array (height x width), optional mask to perform the band- 32 | resampling operation on. 
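        Only pixels where mask is True are resampled; all other pixels are
        left as zeros in the output.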
33 | 34 | Returns 35 | ------- 36 | output - numpy array (height x width x N) 37 | 38 | """ 39 | resample = BandResampler(bands1,bands2,fwhm1,fwhm2) 40 | dataSize = data.shape 41 | data = data.reshape((-1,dataSize[2])) 42 | 43 | if mask is None: 44 | out = resample.matrix.dot(data.T).T 45 | else: 46 | out = np.zeros((data.shape[0], len(bands2)), dtype=data.dtype) 47 | mask = mask.ravel() 48 | out[mask] = resample.matrix.dot(data[mask].T).T 49 | 50 | out[np.isnan(out)] = 0 51 | return out.reshape((dataSize[0],dataSize[1],len(bands2))) 52 | 53 | 54 | def set_gpu(device=None): 55 | 56 | if device is None: 57 | device="" 58 | os.environ['CUDA_VISIBLE_DEVICES'] = str(device) 59 | --------------------------------------------------------------------------------
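The sketch below shows one way the components in this repository fit together (pre-processing, classification, and CRF post-processing), mirroring the flow in pipeline.py. It is illustrative only: the `Pipeline` class name and the `classifier.svm_workflow.SVMWorkflow` import path are assumptions based on the file tree and README, constructor arguments are omitted, and random data stands in for a real scene; see examples/example_pipeline.py for the maintained example.

```python
import numpy as np
from pipeline import Pipeline                     #assumed class name in pipeline.py
from classifier.svm_workflow import SVMWorkflow   #assumed import path (classifier/svm_workflow.py)

#synthetic stand-in for a (rows x cols x bands) cube and (rows x cols) labels (-1 = ignore)
rows, cols, bands, classes = 64, 64, 32, 5
X = np.random.rand(rows, cols, bands).astype(np.float32)
y = np.random.randint(-1, classes, size=(rows, cols))

pipe = Pipeline(classifier=SVMWorkflow(),
                pre_processor='StandardScaler',   #per-channel zero-mean/unit-variance
                post_processor='CRF')             #dense-CRF spatial smoothing
pipe.fit(X, y, X, y)                              #identical train/val cubes are detected and reused
pred = pipe.predict(X)                            #flattened per-pixel class predictions
```

Passing post_processor=None skips the smoothing stage and returns the raw classifier predictions.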