├── README.md
├── LICENSE
└── MalenoV Code 5 layer CNN 65x65x65 voxels
/README.md:
--------------------------------------------------------------------------------
1 | # MalenoV_nD (MAchine LEarNing Of Voxels)
2 |
Tool for training & classifying 3D (4D, nD) SEGY seismic facies using deep neural networks
3 |
4 | For attribution nets check out https://github.com/crild/facies_net
5 |
6 | • MalenoV reads standard 3D SEGY seismic and trains a 3D neural network architecture of choice on a given set of classification data points (facies annotation / supervision). It then uses the learned weights and filters of the network to classify the seismic at any other location in the cube into the facies classes previously defined by the user. Finally, the facies classification is written out as a SEGY cube with the same dimensions as the input cube.
7 |
8 | The tool can handle n seismic input cubes (offset stacks, 4D data, attributes, etc.) and n facies training datasets. The more input cubes that are trained on / classified simultaneously, the more memory is needed and the slower training / classification becomes (linear scaling).
9 |
10 | • MalenoV was created as part of a summer intern project by Charles Rutherford Ildstad at ConocoPhillips in July 2017.
11 |
12 | • The tool was inspired by the work of Anders U. Waldeland, who showed that salt bodies can be successfully classified in seismic using 3D convolutional networks (http://earthdoc.eage.org/publication/publicationdetails/?publication=88635).
13 |
14 | • Currently a basic 5-layer 3D convolutional network is implemented, but users are free to change the architecture.
15 |
16 | • This tool is essentially an I/O framework for machine learning on seismic data. Better neural architectures for AVO, rock property prediction and fault classification are yet to be implemented (think U-Net, ResNet or GAN).
17 |
18 | • The tool is public under the GNU Lesser General Public License v3.0.
19 |
20 | • The tool has been updated to handle multiple input volumes (offset stacks, 4D seismic) for better classification results and more fun.
21 |
22 | The User Manual, seismic data, training data and set-up scripts for the tool can be found here:
23 | https://drive.google.com/drive/folders/0B7brcf-eGK8CRUhfRW9rSG91bW8
24 |
25 |
26 |
27 |
28 |
29 | Seismic training data for testing the tool:
30 |
31 | • We decided to make available the Poseidon (3500 km²) seismic dataset acquired for ConocoPhillips Australia, including near, mid and far stacks, well data and velocity data.
32 |
33 | • The seismic data is available here: https://drive.google.com/drive/folders/0B7brcf-eGK8CRUhfRW9rSG91bW8
34 | BEWARE: one 32-bit SEGY file is 100 GB of data.
35 |
36 | • There is also inline, xline, z training data for fault identification on the Poseidon survey.
37 |
38 | • The Dutch government F3 seismic dataset can also be downloaded from the same location.
39 | This dataset is only 1 GB.
40 |
41 | • Training data locations for multi-facies prediction, faults and steep dips are provided
42 | • Trained neural network models for steep dips and multi-facies are available for assessment
45 |
46 | MalenoV stands for MAchine LEarNing Of Voxels
47 |
48 | nD stands for unlimited input dimensions
49 |
50 |
51 |
54 | Improvement ideas:
55 |
56 | Priority number 1 is to improve the classification speed.
57 |
58 | One easy solution would be to classify only at a user-defined spacing, say every second inline, every third crossline and every other z sample. Once the classification is done, a SEGY cube needs to be written out with the new inline/xline/z spacing but the same origin as the input volumes. This undersampling would also be really useful for testing hyperparameters, because one would get a feel for the classification accuracy much more quickly.
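A sketch of how such a user-defined classification grid could be generated (illustrative NumPy only; `strided_grid` and its arguments are assumptions, not part of the current tool):

```python
import numpy as np

def strided_grid(inl_range, xl_range, z_range, steps):
    """Build (inline, xline, z) addresses at a coarser, user-defined
    spacing, e.g. every 2nd inline, every 3rd xline, every other z."""
    ils = np.arange(inl_range[0], inl_range[1] + 1, steps[0])
    xls = np.arange(xl_range[0], xl_range[1] + 1, steps[1])
    zs = np.arange(z_range[0], z_range[1] + 1, steps[2])
    # Cartesian product of the three axes -> one row per voxel to classify
    grid = np.stack(np.meshgrid(ils, xls, zs, indexing='ij'), axis=-1)
    return grid.reshape(-1, 3)

# Classify every 2nd inline, every 3rd crossline, every other z sample
addresses = strided_grid((100, 110), (200, 212), (1000, 1008), steps=(2, 3, 2))
```

Writing the classified values back out would then just need a SEGY cube with the new inline/xline/z increments but the same origin as the input volumes.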
59 |
60 | The second way to improve speed would be to make sure that the numpy cubes to be trained on or classified are truly 8-bit integer. Currently there is an option to switch to 8-bit, but it does not appear to work correctly.
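A minimal sketch of what a true 8-bit conversion could look like (an assumed helper, not the tool's current code): rescale into the int8 range, cast, and verify the dtype and memory footprint actually shrank:

```python
import numpy as np

def to_int8(cube):
    """Rescale a float cube to span [-127, 127] and cast to
    true 8-bit signed integers (1 byte per voxel)."""
    factor = 127.0 / np.amax(np.absolute(cube))
    return (cube * factor).astype(np.int8)

cube32 = np.random.randn(32, 32, 32).astype(np.float32)
cube8 = to_int8(cube32)
# cube8.dtype is int8, so it uses a quarter of the float32 memory
```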
61 |
62 | 2nd priority is to implement 3D augmentation, letting the user choose how many of the training cubelets are deformed in 3D (squeezed, stretched, bent, etc.) and how, or have Gaussian noise added to them. This would help build more general models.
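The Gaussian-noise half of this idea is easy to sketch with NumPy (the 3D deformation half needs real warping and is omitted; the function name and defaults here are illustrative assumptions):

```python
import numpy as np

def add_gaussian_noise(cubelets, sigma=0.05, fraction=0.5, seed=0):
    """Return a copy of the training cubelets in which a random subset
    (of size fraction * n) has zero-mean Gaussian noise added."""
    rng = np.random.default_rng(seed)
    out = cubelets.astype(np.float32).copy()
    idx = rng.choice(len(out), size=int(fraction * len(out)), replace=False)
    out[idx] += rng.normal(0.0, sigma, size=out[idx].shape).astype(np.float32)
    return out

batch = np.zeros((8, 65, 65, 65), dtype=np.float32)  # 8 toy 65x65x65 cubelets
noisy = add_gaussian_noise(batch, sigma=0.1, fraction=0.25)
```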
63 |
64 |
65 | 3rd priority is to implement TensorBoard or other visual tools to monitor how well training is going, and a scikit-learn module to get an accuracy score on withheld training examples.
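For the accuracy part, a minimal scikit-learn sketch (the addresses and labels are placeholders; in the real tool the withheld split would come from the training picks, and the predictions from the trained network):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Toy example addresses and facies labels standing in for real picks
X = np.arange(100).reshape(-1, 1)
y = np.repeat([0, 1, 2, 3], 25)

# Withhold 20% of the examples before training the network
X_train, X_held, y_train, y_held = train_test_split(X, y, test_size=0.2, random_state=7)

# After training, score predictions on the withheld set only.
# Perfect predictions are faked here just to show the call:
acc = accuracy_score(y_held, y_held)
```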
66 |
67 |
68 | 4th priority would be to implement U-Nets, GANs and other interesting architectures.
69 |
70 |
71 |
72 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | GNU LESSER GENERAL PUBLIC LICENSE
2 | Version 3, 29 June 2007
3 |
4 | Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
5 | Everyone is permitted to copy and distribute verbatim copies
6 | of this license document, but changing it is not allowed.
7 |
8 |
9 | This version of the GNU Lesser General Public License incorporates
10 | the terms and conditions of version 3 of the GNU General Public
11 | License, supplemented by the additional permissions listed below.
12 |
13 | 0. Additional Definitions.
14 |
15 | As used herein, "this License" refers to version 3 of the GNU Lesser
16 | General Public License, and the "GNU GPL" refers to version 3 of the GNU
17 | General Public License.
18 |
19 | "The Library" refers to a covered work governed by this License,
20 | other than an Application or a Combined Work as defined below.
21 |
22 | An "Application" is any work that makes use of an interface provided
23 | by the Library, but which is not otherwise based on the Library.
24 | Defining a subclass of a class defined by the Library is deemed a mode
25 | of using an interface provided by the Library.
26 |
27 | A "Combined Work" is a work produced by combining or linking an
28 | Application with the Library. The particular version of the Library
29 | with which the Combined Work was made is also called the "Linked
30 | Version".
31 |
32 | The "Minimal Corresponding Source" for a Combined Work means the
33 | Corresponding Source for the Combined Work, excluding any source code
34 | for portions of the Combined Work that, considered in isolation, are
35 | based on the Application, and not on the Linked Version.
36 |
37 | The "Corresponding Application Code" for a Combined Work means the
38 | object code and/or source code for the Application, including any data
39 | and utility programs needed for reproducing the Combined Work from the
40 | Application, but excluding the System Libraries of the Combined Work.
41 |
42 | 1. Exception to Section 3 of the GNU GPL.
43 |
44 | You may convey a covered work under sections 3 and 4 of this License
45 | without being bound by section 3 of the GNU GPL.
46 |
47 | 2. Conveying Modified Versions.
48 |
49 | If you modify a copy of the Library, and, in your modifications, a
50 | facility refers to a function or data to be supplied by an Application
51 | that uses the facility (other than as an argument passed when the
52 | facility is invoked), then you may convey a copy of the modified
53 | version:
54 |
55 | a) under this License, provided that you make a good faith effort to
56 | ensure that, in the event an Application does not supply the
57 | function or data, the facility still operates, and performs
58 | whatever part of its purpose remains meaningful, or
59 |
60 | b) under the GNU GPL, with none of the additional permissions of
61 | this License applicable to that copy.
62 |
63 | 3. Object Code Incorporating Material from Library Header Files.
64 |
65 | The object code form of an Application may incorporate material from
66 | a header file that is part of the Library. You may convey such object
67 | code under terms of your choice, provided that, if the incorporated
68 | material is not limited to numerical parameters, data structure
69 | layouts and accessors, or small macros, inline functions and templates
70 | (ten or fewer lines in length), you do both of the following:
71 |
72 | a) Give prominent notice with each copy of the object code that the
73 | Library is used in it and that the Library and its use are
74 | covered by this License.
75 |
76 | b) Accompany the object code with a copy of the GNU GPL and this license
77 | document.
78 |
79 | 4. Combined Works.
80 |
81 | You may convey a Combined Work under terms of your choice that,
82 | taken together, effectively do not restrict modification of the
83 | portions of the Library contained in the Combined Work and reverse
84 | engineering for debugging such modifications, if you also do each of
85 | the following:
86 |
87 | a) Give prominent notice with each copy of the Combined Work that
88 | the Library is used in it and that the Library and its use are
89 | covered by this License.
90 |
91 | b) Accompany the Combined Work with a copy of the GNU GPL and this license
92 | document.
93 |
94 | c) For a Combined Work that displays copyright notices during
95 | execution, include the copyright notice for the Library among
96 | these notices, as well as a reference directing the user to the
97 | copies of the GNU GPL and this license document.
98 |
99 | d) Do one of the following:
100 |
101 | 0) Convey the Minimal Corresponding Source under the terms of this
102 | License, and the Corresponding Application Code in a form
103 | suitable for, and under terms that permit, the user to
104 | recombine or relink the Application with a modified version of
105 | the Linked Version to produce a modified Combined Work, in the
106 | manner specified by section 6 of the GNU GPL for conveying
107 | Corresponding Source.
108 |
109 | 1) Use a suitable shared library mechanism for linking with the
110 | Library. A suitable mechanism is one that (a) uses at run time
111 | a copy of the Library already present on the user's computer
112 | system, and (b) will operate properly with a modified version
113 | of the Library that is interface-compatible with the Linked
114 | Version.
115 |
116 | e) Provide Installation Information, but only if you would otherwise
117 | be required to provide such information under section 6 of the
118 | GNU GPL, and only to the extent that such information is
119 | necessary to install and execute a modified version of the
120 | Combined Work produced by recombining or relinking the
121 | Application with a modified version of the Linked Version. (If
122 | you use option 4d0, the Installation Information must accompany
123 | the Minimal Corresponding Source and Corresponding Application
124 | Code. If you use option 4d1, you must provide the Installation
125 | Information in the manner specified by section 6 of the GNU GPL
126 | for conveying Corresponding Source.)
127 |
128 | 5. Combined Libraries.
129 |
130 | You may place library facilities that are a work based on the
131 | Library side by side in a single library together with other library
132 | facilities that are not Applications and are not covered by this
133 | License, and convey such a combined library under terms of your
134 | choice, if you do both of the following:
135 |
136 | a) Accompany the combined library with a copy of the same work based
137 | on the Library, uncombined with any other library facilities,
138 | conveyed under the terms of this License.
139 |
140 | b) Give prominent notice with the combined library that part of it
141 | is a work based on the Library, and explaining where to find the
142 | accompanying uncombined form of the same work.
143 |
144 | 6. Revised Versions of the GNU Lesser General Public License.
145 |
146 | The Free Software Foundation may publish revised and/or new versions
147 | of the GNU Lesser General Public License from time to time. Such new
148 | versions will be similar in spirit to the present version, but may
149 | differ in detail to address new problems or concerns.
150 |
151 | Each version is given a distinguishing version number. If the
152 | Library as you received it specifies that a certain numbered version
153 | of the GNU Lesser General Public License "or any later version"
154 | applies to it, you have the option of following the terms and
155 | conditions either of that published version or of any later version
156 | published by the Free Software Foundation. If the Library as you
157 | received it does not specify a version number of the GNU Lesser
158 | General Public License, you may choose any version of the GNU Lesser
159 | General Public License ever published by the Free Software Foundation.
160 |
161 | If the Library as you received it specifies that a proxy can decide
162 | whether future versions of the GNU Lesser General Public License shall
163 | apply, that proxy's public statement of acceptance of any version is
164 | permanent authorization for you to choose that version for the
165 | Library.
166 |
--------------------------------------------------------------------------------
/MalenoV Code 5 layer CNN 65x65x65 voxels:
--------------------------------------------------------------------------------
1 | ### Function for n-dimensional seismic facies training /classification using Convolutional Neural Nets (CNN)
2 | ### By: Charles Rutherford Ildstad (University of Trondheim), as part of a summer intern project in ConocoPhillips and private work
3 | ### Contributions from Anders U. Waldeland (University of Oslo), Chris Olsen (ConocoPhillips), Doug Hakkarinen (ConocoPhillips)
4 | ### Date: 26.10.2017
5 | ### For: ConocoPhillips, Norway
6 | ### License: GNU Lesser General Public License v3.0
7 |
8 | # Make initial package imports
9 | import segyio
10 | import random
11 | import keras
12 | import numpy as np
13 | import matplotlib.pyplot as plt
14 | import math
15 | import time
16 |
17 | from keras.models import Sequential
18 | from keras.models import Model
19 | from keras.layers import Dense, Activation, Flatten, Dropout
20 | from keras.layers import Conv3D
21 | from keras.callbacks import EarlyStopping
22 | from keras.callbacks import TensorBoard
23 | from keras.callbacks import LearningRateScheduler
24 | from matplotlib import gridspec
25 |
26 | from keras.layers.normalization import BatchNormalization
27 |
28 | from shutil import copyfile
29 |
30 | # Set random seed for reproducibility
31 | np.random.seed(7)
32 | # Confirm backend if in doubt
33 | #keras.backend.backend()
34 |
35 |
36 | ### ---- Functions for Input data(SEG-Y) formatting and reading ----
37 | # Make a function that decompresses a segy-cube and creates a numpy array, and
38 | # a dictionary with the specifications, like in-line range and time step length, etc.
39 | def segy_decomp(segy_file, plot_data = False, read_direc='xline', inp_res = np.float64):
40 | # segy_file: filename of the segy-cube to be imported
41 | # plot_data: boolean that determines if a random xline should be plotted to test the reading
42 | # read_direc: which way the SEGY-cube should be read; 'xline', or 'inline'
43 | # inp_res: input resolution, the formatting of the seismic cube (could be changed to 8-bit data)
44 |
45 | # Make an empty object to hold the output data
46 | print('Starting SEG-Y decompressor')
47 | output = segyio.spec()
48 |
49 | # open the segyfile and start decomposing it
50 | with segyio.open(segy_file, "r" ) as segyfile:
51 | # Memory map file for faster reading (especially if file is big...)
52 | segyfile.mmap()
53 |
54 | # Store some initial object attributes
55 | output.inl_start = segyfile.ilines[0]
56 | output.inl_end = segyfile.ilines[-1]
57 | output.inl_step = segyfile.ilines[1] - segyfile.ilines[0]
58 |
59 | output.xl_start = segyfile.xlines[0]
60 | output.xl_end = segyfile.xlines[-1]
61 | output.xl_step = segyfile.xlines[1] - segyfile.xlines[0]
62 |
63 | output.t_start = int(segyfile.samples[0])
64 | output.t_end = int(segyfile.samples[-1])
65 | output.t_step = int(segyfile.samples[1] - segyfile.samples[0])
66 |
67 |
68 | # Pre-allocate a numpy array that holds the SEGY-cube
69 | output.data = np.empty((segyfile.xline.len,segyfile.iline.len,\
70 | (output.t_end - output.t_start)//output.t_step+1), dtype = np.float32)
71 |
72 | # Read the entire cube line by line in the desired direction
73 | if read_direc == 'inline':
74 | # Potentially time this to find the "fast" direction
75 | #start = time.time()
76 | for il_index in range(segyfile.xline.len):
77 | output.data[il_index,:,:] = segyfile.iline[segyfile.ilines[il_index]]
78 | #end = time.time()
79 | #print(end - start)
80 |
81 | elif read_direc == 'xline':
82 | # Potentially time this to find the "fast" direction
83 | #start = time.time()
84 | for xl_index in range(segyfile.iline.len):
85 | output.data[:,xl_index,:] = segyfile.xline[segyfile.xlines[xl_index]]
86 | #end = time.time()
87 | #print(end - start)
88 |
89 | elif read_direc == 'full':
90 | ## NOTE: 'full' for some reason invokes float32 data
91 | # Potentially time this to find the "fast" direction
92 | #start = time.time()
93 | output.data = segyio.tools.cube(segy_file)
94 | #end = time.time()
95 | #print(end - start)
96 | else:
97 | print("Define reading direction (read_direc) using either 'inline', 'xline', or 'full'")
98 |
99 |
100 | # Convert the numpy array to span between -127 and 127 and convert to the desired format
101 | factor = 127/np.amax(np.absolute(output.data))
102 | if inp_res == np.float32:
103 | output.data = (output.data*factor)
104 | else:
105 | output.data = (output.data*factor).astype(dtype = inp_res)
106 |
107 | # If specified, plot a given x-line to test the read data
108 | if plot_data:
109 | # Take a given xline
110 | data = output.data[:,100,:]
111 | # Plot the read x-line
112 | plt.imshow(data.T,interpolation="nearest", cmap="gray")
113 | plt.colorbar()
114 | plt.show()
115 |
116 |
117 | # Return the output object
118 | print('Finished using the SEG-Y decompressor')
119 | return output
120 |
121 |
122 |
123 | # Make a function that adds another layer to a segy-cube
124 | def segy_adder(segy_file, inp_cube, read_direc='xline', inp_res = np.float64):
125 | # segy_file: filename of the segy-cube to be imported
126 | # inp_cube: the existing cube that we should add a layer to
128 | # read_direc: which way the SEGY-cube should be read; 'xline', or 'inline'
129 | # inp_res: input resolution, the formatting of the seismic cube (could be changed to 8-bit data)
130 |
131 | # Make a variable to hold the shape of the input cube and preallocate a data holder
132 | print('Starting SEG-Y adder')
133 | cube_shape = inp_cube.shape
134 | dataholder = np.empty(cube_shape[0:-1])
135 |
136 | # open the segyfile and start decomposing it
137 | with segyio.open(segy_file, "r" ) as segyfile:
138 | # Memory map file for faster reading (especially if file is big...)
139 | segyfile.mmap()
140 |
141 | # Read the entire cube line by line in the desired direction
142 | if read_direc == 'inline':
143 | # Potentially time this to find the "fast" direction
144 | #start = time.time()
145 | for il_index in range(segyfile.xline.len):
146 | dataholder[il_index,:,:] = segyfile.iline[segyfile.ilines[il_index]]
147 | #end = time.time()
148 | #print(end - start)
149 |
150 | elif read_direc == 'xline':
151 | # Potentially time this to find the "fast" direction
152 | #start = time.time()
153 | for xl_index in range(segyfile.iline.len):
154 | dataholder[:,xl_index,:] = segyfile.xline[segyfile.xlines[xl_index]]
155 | #end = time.time()
156 | #print(end - start)
157 |
158 | elif read_direc == 'full':
159 | ## NOTE: 'full' for some reason invokes float32 data
160 | # Potentially time this to find the "fast" direction
161 | #start = time.time()
162 | dataholder[:,:,:] = segyio.tools.cube(segy_file)
163 | #end = time.time()
164 | #print(end - start)
165 | else:
166 | print("Define reading direction (read_direc) using either 'inline', 'xline', or 'full'")
167 |
168 |
169 | # Convert the numpy array to span between -127 and 127 and convert to the desired format
170 | factor = 127/np.amax(np.absolute(dataholder))
171 | if inp_res == np.float32:
172 | dataholder = (dataholder*factor)
173 | else:
174 | dataholder = (dataholder*factor).astype(dtype = inp_res)
175 |
176 |
177 | # Return the output object
178 | print('Finished adding a SEG-Y layer')
179 | return dataholder
180 |
181 | # Convert a numpy cube and seismic specs into a csv file / numpy csv format
182 | def csv_struct(inp_numpy,spec_obj,section,inp_res=np.float64,save=False,savename='default_write.ixz'):
183 | # inp_numpy: array that should be converted to csv
# section: index bounds of the sub-volume, [inl_first, inl_last, xl_first, xl_last, z_first, z_last]
184 | # spec_obj: object containing the seismic specifications, like starting depth, inlines, etc.
185 | # inp_res: input resolution, the formatting of the seismic cube (could be changed to 8-bit data)
186 | # save: whether or not to save the output of the function
187 | # savename: what to name the newly saved csv-file
188 |
189 | # Get some initial parameters of the data
190 | (ilen,xlen,zlen) = inp_numpy.shape
191 | i = 0
192 |
193 | # Preallocate the array that we want to make
194 | full_np = np.empty((ilen*xlen*zlen,4),dtype = inp_res)
195 |
196 | # Iterate through the numpy cube and convert each trace individually to a section of csv
197 | for il in range(section[0]*spec_obj.inl_step,(section[1]+1)*spec_obj.inl_step,spec_obj.inl_step):
198 | j = 0
199 | for xl in range(section[2]*spec_obj.xl_step,(section[3]+1)*spec_obj.xl_step,spec_obj.xl_step):
200 | # Make a list of the inline number, xline number, and depth for the given trace
201 | I = (il+spec_obj.inl_start)*(np.ones((zlen,1)))
202 | X = (xl+spec_obj.xl_start)*(np.ones((zlen,1)))
203 | Z = np.expand_dims(np.arange(section[4]*spec_obj.t_step+spec_obj.t_start,\
204 | (section[5]+1)*spec_obj.t_step+spec_obj.t_start,spec_obj.t_step),\
205 | axis=1)
206 |
207 | # Store the predicted class/probability at each of the given depths of the trace
208 | D = np.expand_dims(inp_numpy[i,j,:],axis = 1)
209 |
210 | # Concatenate these lists together and insert them into the full array
211 | inp_li = np.concatenate((I,X,Z,D),axis=1)
212 | full_np[i*xlen*zlen+j*zlen:i*xlen*zlen+(j+1)*zlen,:] = inp_li
213 | j+=1
214 | i+=1
215 |
216 | # Add the option to save it as an external file
217 | if save:
218 | # save the file as the given str-name
219 | np.savetxt(savename, full_np, fmt = '%f')
220 |
221 | # Return the list of addresses and classes as a numpy array
222 | return full_np
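The per-trace row layout this function builds can be checked in isolation (toy inline/xline/time values, not a real cube):

```python
import numpy as np

# One trace: 4 samples starting at t=1000 ms with a 4 ms step,
# at inline 2500 / xline 1800 (toy values)
zlen, t_start, t_step = 4, 1000, 4
il, xl = 2500, 1800

I = il * np.ones((zlen, 1))
X = xl * np.ones((zlen, 1))
Z = np.expand_dims(np.arange(t_start, t_start + zlen * t_step, t_step), axis=1)
D = np.expand_dims(np.array([0, 1, 1, 2]), axis=1)  # fake predicted classes

# Same concatenation as in csv_struct: each row is (inline, xline, z, class)
trace_rows = np.concatenate((I, X, Z, D), axis=1)
```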
223 |
224 |
225 | ### ---- Functions for data augmentation ---- (Needs further development)
### NOTE: these functions use TensorFlow ops directly and require 'import tensorflow as tf' to run
226 | # RotationXY
227 | def randomRotationXY(X, max_rot):
228 | max_rot = 6.28318530718 / 360 * max_rot #Deg 2 rad
229 | theta = tf.random_uniform([1], minval=-max_rot, maxval=max_rot, dtype='float32')
230 | x = X[2] * tf.cos(theta) - X[1] * tf.sin(theta)
231 | y = X[2] * tf.sin(theta) + X[1] * tf.cos(theta)
232 | return tf.stack([X[0],y,x])
233 |
234 |
235 | # RotationZ
236 | def randomRotationZ(X, max_rot):
237 | max_rot = 6.28318530718 / 360 * max_rot # Deg 2 rad
238 | theta = tf.random_uniform([1], minval=-max_rot, maxval=max_rot, dtype='float32')
239 | t = X[0] * tf.cos(theta) - X[1] * tf.sin(theta)
240 | x = X[0] * tf.sin(theta) + X[1] * tf.cos(theta)
241 | return tf.stack([t,x,X[2]])
242 |
243 |
244 | # Stretching
245 | def randomStretch(window_function, stretch):
246 | return tf.cast(window_function,'float32') * (1 + tf.random_uniform([1],minval=-stretch,maxval=stretch))
247 |
248 |
249 | # Flip
250 | def randomFlip(window_function):
251 | should_flip = tf.cast(tf.random_uniform([1], 0, 2, dtype=tf.int32)[0] > 0, tf.bool)
252 | window_function = tf.reverse(window_function, tf.stack([should_flip]))
253 | return window_function
254 |
255 |
256 |
257 | ### ---- Functions for the training part of the program ----
258 | # Make a function that combines the address cubes and makes a list of class addresses
259 | def convert(file_list, save = False, savename = 'address_list', ex_adjust = False):
260 | # file_list: list of file names (strings) of addresses for the different classes
261 | # save: boolean that determines if a new ixz file should be saved with addresses and class numbers
262 | # savename: desired name of new .ixz-file
263 | # ex_adjust: boolean that determines if the amount of each class should be approximately equalized
264 |
265 | # Make an array that holds the number of examples provided for each class, if equalization is needed
266 | if ex_adjust:
267 | len_array = np.zeros(len(file_list),dtype = np.float32)
268 | for i in range(len(file_list)):
269 | len_array[i] = len(np.loadtxt(file_list[i], skiprows=0, usecols = range(3), dtype = np.float32))
270 |
271 | # Convert this array to a multiplier that determines how many times a given class set needs to be replicated
272 | len_array /= max(len_array)
273 | multiplier = 1//len_array
274 |
275 |
276 | # Preallocate space for adr_list, the output containing all the addresses and classes
277 | adr_list = np.empty([0,4], dtype = np.int32)
278 |
279 | # Iterate through the list of example addresses and store the class as an integer
280 | for i in range(len(file_list)):
281 | a = np.loadtxt(file_list[i], skiprows=0, usecols = range(3), dtype = np.int32)
282 | adr_list = np.append(adr_list,np.append(a,i*np.ones((len(a),1),dtype = np.int32),axis=1),axis=0)
283 |
284 | # If desired, replicate the entire list the number of times given by the calculated multiplier
285 | if ex_adjust:
286 | for k in range(int(multiplier[i])-1):
287 | adr_list = np.append(adr_list,np.append(a,i*np.ones((len(a),1),dtype = np.int32),axis=1),axis=0)
288 |
289 | # Add the option to save it as an external file
290 | if save:
291 | # save the file as the given str-name
292 | np.savetxt(savename + '.ixz', adr_list, fmt = '%i')
293 |
294 | # Return the list of addresses and classes as a numpy array
295 | return adr_list
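The class-equalization arithmetic used when ex_adjust is set can be checked in isolation (toy class sizes standing in for real pick counts):

```python
import numpy as np

# Number of training picks per class (toy numbers)
len_array = np.array([1000., 500., 250.], dtype=np.float32)

# Normalize by the largest class, then floor-divide to get how many
# copies of each smaller class roughly equalize the class counts
len_array /= max(len_array)
multiplier = 1 // len_array
# -> class 0 kept once, class 1 duplicated 2x, class 2 duplicated 4x
```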
296 |
297 |
298 |
299 | # Function for example creation
300 | # Outputs the example mini-cubes together with their corresponding labels
301 | def ex_create(adr_arr,seis_arr,seis_spec,num_examp,cube_incr,inp_res=np.float64,sort_adr = False,replace_illegals = True):
302 | # adr_arr: 4-column numpy matrix that holds a header in the first row, then address and class information for examples
303 | # seis_arr: 3D numpy array that holds a seismic cube
304 | # seis_spec: object that holds the specifications of the seismic cube;
305 | # num_examp: the number of output mini-cubes that should be created
306 | # cube_incr: the number of increments included in each direction from the example to make a mini-cube
307 | # inp_res: input resolution, the formatting of the seismic cube (could be changed to 8-bit data)
308 | # sort_adr: boolean; whether or not to sort the randomly drawn addresses before making the example cubes
309 | # replace_illegals: boolean; whether or not to draw a new sample in place of an illegal one
310 |
311 | # Define the cube size
312 | cube_size = 2*cube_incr+1
313 |
314 | # Define some boundary parameters given in the input object
315 | inline_start = seis_spec.inl_start
316 | inline_end = seis_spec.inl_end
317 | inline_step = seis_spec.inl_step
318 | xline_start = seis_spec.xl_start
319 | xline_end = seis_spec.xl_end
320 | xline_step = seis_spec.xl_step
321 | t_start = seis_spec.t_start
322 | t_end = seis_spec.t_end
323 | t_step = seis_spec.t_step
324 | num_channels = seis_spec.cube_num
325 |
326 | # Define the buffer zone around the edge of the cube that separates legal from illegal addresses
327 | inl_min = inline_start + inline_step*cube_incr
328 | inl_max = inline_end - inline_step*cube_incr
329 | xl_min = xline_start + xline_step*cube_incr
330 | xl_max = xline_end - xline_step*cube_incr
331 | t_min = t_start + t_step*cube_incr
332 | t_max = t_end - t_step*cube_incr
333 |
334 | # Print the buffer zone edges
335 | print('Defining the buffer zone:')
336 | print('(inl_min,','inl_max,','xl_min,','xl_max,','t_min,','t_max)')
337 | print('(',inl_min,',',inl_max,',',xl_min,',',xl_max,',',t_min,',',t_max,')')
338 | # Also give the buffer values in terms of indexes
339 | print('(',cube_incr,',',((inline_end-inline_start)//inline_step) - cube_incr,\
340 | ',',cube_incr,',',((xline_end-xline_start)//xline_step) - cube_incr,\
341 | ',',cube_incr,',',((t_end-t_start)//t_step) - cube_incr,')')
342 |
343 | # We preallocate the function outputs; a list of examples and a list of labels
344 | examples = np.empty((num_examp,cube_size,cube_size,cube_size,num_channels),dtype=inp_res)
345 | labels = np.empty(num_examp,dtype=np.int8)
346 |
347 | # If we want to stack the examples in the third dimension, we use the following example preallocation instead
348 | # examples = np.empty((num_examp*(cube_size),(cube_size),(cube_size)),dtype=inp_res)
349 |
350 | # Generate a random list of indexes to be drawn, and make sure it only takes a legal amount of examples
351 | try:
352 | max_row_idx = len(adr_arr)-1
353 | rand_idx = random.sample(range(0, max_row_idx), num_examp)
354 | # NOTE: Could be faster to sort indexes before making examples for algorithm optimization
355 | if sort_adr:
356 | rand_idx.sort()
357 | except ValueError:
358 | print('Sample size exceeded population size.')
359 |
360 | # Make an iterator for when the lists should become shorter(if we have replacement of illegals or not)
361 | n=0
362 | for i in range(num_examp):
363 | # Get a random in-line, x-line, and time value, and store the label
364 | # Make sure there is room for an example at this index
365 | for j in range(50):
366 | adr = adr_arr[rand_idx[i]]
367 | # Check that the given example is within the legal zone
368 | if (adr[0]>=inl_min and adr[0]<inl_max) and (adr[1]>=xl_min and adr[1]<xl_max) and (adr[2]>=t_min and adr[2]<t_max):