├── .gitignore ├── LICENSE.txt ├── MANIFEST.in ├── README.md ├── changes.txt ├── data └── tcremoval │ └── tcparameters.txt ├── description.md ├── figs ├── download_continuous_example.png ├── noise_xcorr_example.png └── seisgo_logo.png ├── notebooks ├── OBS_orientation_Cascadia.csv ├── download_catalog.ipynb ├── download_continuous.ipynb ├── embededfigs │ └── JaniszewskiGJI2019Fig5.png ├── seisgo_download_xcorr_demo.ipynb ├── seisgo_dvv_workflow_example.ipynb ├── seisgo_xcorr_sac.ipynb ├── tcremoval_continuous.ipynb ├── tcremoval_continuous_correctorientation.ipynb ├── tcremoval_continuous_local.ipynb └── tcremoval_earthquakes.ipynb ├── scripts ├── EQDownload.py ├── ExtractEGFsFromNoise │ ├── 1_download_MPI.py │ ├── 2_xcorr_MPI.py │ ├── 3_merge_pairs_bysources_MPI.py │ ├── 4_split_sides_bysources_MPI.py │ ├── README.md │ ├── submit_download.sh │ ├── submit_merge_pairs_bysources_MPI.sh │ ├── submit_split_sides.sh │ └── submit_xcorr_MPI.sh ├── OBS_orientation_Cascadia.csv ├── seisgo_BANX_azimuthal_anisotropy.py ├── seisgo_ccf2sac_MPI.py ├── seisgo_cleaning_MPI.py ├── seisgo_download_MPI.py ├── seisgo_download_obsdata.py ├── seisgo_stacking_MPI.py ├── seisgo_tcremoval_continuous.py ├── seisgo_tcremoval_continuous_MPI.py └── seisgo_xcorr_sac.py ├── seisgo ├── __init__.py ├── __pycache__ │ ├── __init__.cpython-37.pyc │ ├── obsmaster.cpython-37.pyc │ └── utils.cpython-37.pyc ├── anisotropy.py ├── clustering.py ├── dispersion.py ├── downloaders.py ├── helpers.py ├── monitoring.py ├── noise.py ├── obsmaster.py ├── plotting.py ├── simulation.py ├── stacking.py ├── types.py └── utils.py ├── setup.cfg └── setup.py /.gitignore: -------------------------------------------------------------------------------- 1 | .DS_Store 2 | __pycache__ 3 | -------------------------------------------------------------------------------- /LICENSE.txt: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2021 Xiaotao Yang 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
--------------------------------------------------------------------------------
/MANIFEST.in:
--------------------------------------------------------------------------------
# Include the README
include *.md

# Include the license file
include LICENSE.txt

# Include setup.py
include setup.py

# Include the data files
recursive-include data *
recursive-include scripts *
recursive-include notebooks *
recursive-include figs *
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# SeisGo
*A ready-to-go Python toolbox for seismic data analysis*

![plot1](/figs/seisgo_logo.png)

## Introduction
This package currently depends heavily on **obspy** (www.obspy.org) to handle seismic data (download, read, write, etc.). Users are referred to the **obspy** toolbox for related functions.

## Citation if using SeisGo
Please cite Yang et al. (2022) if you use SeisGo in your research.

Yang, X., Bryan, J., Okubo, K., Jiang, C., Clements, T., & Denolle, M. A. (2022). Optimal stacking of noise cross-correlation functions. Geophysical Journal International, 232(3), 1600–1618. https://doi.org/10.1093/gji/ggac410

## Available modules
This package is under active development. The currently available modules are listed here.

1. `utils`: This module contains frequently used utility functions not readily available in `obspy`.

2. `downloaders`: This module contains functions used to download earthquake waveforms, earthquake catalogs, station information, and continuous waveforms, and to read data from local files.

3. `obsmaster`: This module contains functions to get and process Ocean Bottom Seismometer (OBS) data. The functions and main processing modules for removing tilt and compliance noise are inspired by and modified from `OBStools` (https://github.com/nfsi-canada/OBStools) developed by Pascal Audet & Helen Janiszewski. The main tilt and compliance removal method is based on Janiszewski et al. (2019).

4. `noise`: This module contains functions used in ambient noise processing, including cross-correlations and monitoring. The key functions were converted from `NoisePy` (https://github.com/mdenolle/NoisePy) with heavy modifications. Inspired by `SeisNoise.jl` (https://github.com/tclements/SeisNoise.jl), we modified the cross-correlation workflow to use FFTData and CorrData objects (defined in the `types` module). The original NoisePy script for cross-correlations has been disassembled and wrapped in functions, primarily in this module. We also changed the way NoisePy handles timestamps when cross-correlating; this change retains more data, even when there are gaps. The xcorr functionality in SeisGo also requires minimal knowledge of the downloading step. We try to optimize the workflow and minimize the inputs required from the user. We added functionality to better control the temporal resolution of xcorr results.

5. `plotting`: This module contains major plotting functions for raw waveforms, cross-correlation results, and station maps.

6. `monitoring`: This module contains functions for ambient noise seismic monitoring, adapted from functions by Yuan et al. (2021).

7. `clustering`: Clustering functions for seismic data and velocity models.

8. `stacking`: Stacking of seismic data.

9. `types`: This module contains the definition of major data types and classes.
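As a quick, self-contained taste of the API, the sketch below stacks an ensemble of noisy synthetic traces with the `stacking` module. This is a minimal sketch rather than a definitive usage: it assumes that `seisstack()` (the wrapper interface around the individual stacking methods) accepts a 2-D array with one trace per row and a `method` keyword; check the module docstrings for the exact signatures.

```Python
import numpy as np
from seisgo import stacking

# Synthetic ensemble: 50 noisy copies of a Ricker-like pulse (one trace
# per row), mimicking windowed noise cross-correlation functions.
t = np.linspace(-5, 5, 1001)
pulse = (1 - 2 * (np.pi * 0.5 * t) ** 2) * np.exp(-((np.pi * 0.5 * t) ** 2))
data = pulse + 0.5 * np.random.randn(50, t.size)

# seisstack() wraps the individual methods (e.g., 'linear', 'robust',
# 'tfpws', 'clusterstack'); the signature here is an assumption.
stack_linear = stacking.seisstack(data, method='linear')
stack_robust = stacking.seisstack(data, method='robust')
```

With outlier-contaminated data, the robust stack should suppress the noise more effectively than the linear stack; see Yang et al. (2022) for a systematic comparison of the methods.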
## Installation
**SeisGo** is available on PyPI (https://pypi.org/project/seisgo/). You can install it as a regular package via `pip install seisgo`. The following instructions show how to install SeisGo in a virtual environment from the GitHub repository.

1. Create and activate the **conda** `seisgo` environment

Make sure you have a working Anaconda installed. This step is required to have all dependencies installed for the package. You can also manually install the listed packages **without** creating the `seisgo` environment, or skip this step if you already have these packages installed. **The order of the following commands MATTERS.**

```
$ conda create -n seisgo -c conda-forge jupyter numpy scipy pandas numba pycwt 'python<3.12' cartopy obspy mpi4py stockwell
$ conda activate seisgo
```

The `jupyter` package is currently not required, **unless** you plan to run the accompanying Jupyter notebooks in the **notebooks** directory. `mpi4py` is **required** to run the parallel scripts stored in the **scripts** directory. The modules have been fully tested on Python 3.7.x, but versions >= 3.6 also seem to work based on a few tests. You can create a Jupyter kernel for the environment:

```
$ conda activate seisgo
$ pip install --user ipykernel
$ python -m ipykernel install --user --name=seisgo
```

**Install PyGMT plotting functions**

Map views with geographical projections are plotted using **PyGMT** (https://www.pygmt.org/latest/). It seems that only a pip install gets the latest version [personal experience]. The following steps install the PyGMT package (please refer to the PyGMT webpage for troubleshooting and testing):

Install GMT through conda first into the `seisgo` environment:

```
conda activate seisgo
conda config --prepend channels conda-forge
conda install python pip numpy pandas xarray netcdf4 packaging gmt
```

**You may need to specify the python version available on your environment.** In ~/.bash_profile, add this line: `export GMT_LIBRARY_PATH=$SEISGOROOT/lib`, where `$SEISGOROOT` is the root directory of the `seisgo` environment. Then, run:

```
conda install pygmt
```

Test your installation by running:
```
python
> import pygmt
```

2. Download `SeisGo`

`cd` to the directory where you want to save the package files. Then,
```
$ git clone https://github.com/xtyangpsp/SeisGo.git
```

3. Install the `seisgo` package functions using `pip`

This step will install the **SeisGo** modules under the `seisgo` environment. The modules can then be imported from any working directory. Remember to rerun this command if you modify the functions/modules.

```
$ pip install .
```

If you plan to make frequent changes to the code,

```
$ pip install -e .
```

is recommended. The '-e' option installs the package in editable mode, so your changes take effect in the Python environment without running 'pip install .' every time.

4. Test the installation

Run the following commands to test your installation.
```
$ python
>>> from seisgo import obsmaster as obs
>>> tflist=obs.gettflist(help=True)
------------------------------------------------------------------
| Key    | Default  | Note                                       |
------------------------------------------------------------------
| ZP     | True     | Vertical and pressure                      |
| Z1     | True     | Vertical and horizontal-1                  |
| Z2-1   | True     | Vertical and horizontals (1 and 2)         |
| ZP-21  | True     | Vertical, pressure, and two horizontals    |
| ZH     | True     | Vertical and rotated horizontal            |
| ZP-H   | True     | Vertical, pressure, and rotated horizontal |
------------------------------------------------------------------
```

5. Troubleshooting

* Stockwell errors with numpy import errors (`numpy.core.multiarray` errors) and/or FFTW errors. Some libraries (e.g., FFTW) used by stockwell are only compatible with numpy <= 1.22.4.

Try the following steps:
```
$ conda activate seisgo
$ conda uninstall numpy stockwell
$ conda install -c conda-forge stockwell numpy==1.22.4
```
* NETCDF4 errors. You may need to manually install/reinstall netcdf4 by:

```
$ pip uninstall netcdf4
$ pip install netcdf4
```

## Update SeisGo
If you installed SeisGo through GitHub, run the following lines to update to the latest version (which may not be released on pip yet):

```
git pull
pip install . #note there is a period "." sign here, indicating the current directory
```

If you installed SeisGo through pip, you can get the latest release (GitHub always has the most recent commits) by running:

```
pip install seisgo --upgrade
```


## Structure of the package
1. **seisgo**: This directory contains the main modules.

2. **notebooks**: This directory contains the Jupyter notebooks that provide tutorials for all modules.

3. **data**: Data for testing or running the tutorials is saved under this folder.

4. **figs**: Here we put figures used in the tutorials and other places.

5. **scripts**: This directory contains example scripts for data processing using `seisgo`. Users are welcome to modify the provided example scripts to work on their own data.

## Tutorials on key functionalities
1. Download continuous waveforms for large-scale jobs (see item 3 for small jobs processed in memory). An example script using MPI is here: `scripts/seisgo_download_MPI.py`. The following lines show an example of the structure without MPI (so that you can easily test-run it in a Jupyter notebook).
```Python
import os,glob
from seisgo.utils import split_datetimestr,extract_waveform,plot_trace
from seisgo import downloaders

rootpath = "data_test" # root path for the project
DATADIR = os.path.join(rootpath,'Raw') # where to store the downloaded data
down_list = os.path.join(DATADIR,'station.txt') # CSV file for station location info

# download parameters
source='IRIS'
samp_freq = 10 # targeted sampling rate at X samples per second
rmresp = True
rmresp_out = 'DISP'

# targeted region/station information
lamin,lamax,lomin,lomax= 39,41,-88,-86 # regional box
net_list = ["TA"] # network list
chan_list = ["BHZ"]
sta_list = ["O45A","SFIN"] # station list
start_date = "2012_01_01_0_0_0" # start date of download
end_date = "2012_01_02_1_0_0" # end date of download
inc_hours = 12 # length of data for each request (in hours)
maxseischan = 1 # the maximum number of seismic channels
ncomp = maxseischan #len(chan_list)

downlist_kwargs = {"source":source, 'net_list':net_list, "sta_list":sta_list, "chan_list":chan_list, \
    "starttime":start_date, "endtime":end_date, "maxseischan":maxseischan, "lamin":lamin, \
    "lamax":lamax,"lomin":lomin, "lomax":lomax, "fname":down_list}

stalist=downloaders.get_sta_list(**downlist_kwargs)
#splitting the time range is a critical step for long-duration downloading; shown here as a demo.
all_chunk = split_datetimestr(start_date,end_date,inc_hours)

#################DOWNLOAD SECTION#######################
for ick in range(len(all_chunk)-1):
    s1= all_chunk[ick];s2=all_chunk[ick+1]
    print('time segment:'+s1+' to '+s2)
    downloaders.download(source=source,rawdatadir=DATADIR,starttime=s1,endtime=s2,\
        stationinfo=stalist,samp_freq=samp_freq)

print('downloading finished.')

#extract waveforms
tr=extract_waveform(glob.glob(os.path.join(DATADIR,"*.h5"))[0],net_list[0],sta_list[0],comp=chan_list[0])
plot_trace([tr],size=(10,4),ylabels=['displacement'],title=[net_list[0]+'.'+sta_list[0]+'.'+chan_list[0]])
```

You should see the following image showing the waveform for TA.O45A.
![plot1](/figs/download_continuous_example.png)

2. Download earthquake catalog and waveforms with given window length relative to phase arrivals

TBA.

3. Ambient noise cross-correlations
* Minimum-lines version for processing small data sets in memory. Another example is in `notebooks/seisgo_download_xcorr_demo.ipynb`.

```Python
from seisgo import downloaders
from seisgo.noise import compute_fft,correlate

# download parameters
source='IRIS' # client/data center.
# See https://docs.obspy.org/packages/obspy.clients.fdsn.html for a list of clients.
samp_freq = 10 # targeted sampling rate at X samples per second

chan_list = ["BHZ","BHZ"]
net_list = ["TA","TA"] # network list
sta_list = ["O45A","SFIN"] # stations (using a station list is way easier than specifying stations one by one)
start_date = "2012_01_01_0_0_0" # start date of download
end_date = "2012_01_02_1_0_0" # end date of download

# Download
print('downloading ...')
trall,stainv_all=downloaders.download(source=source,starttime=start_date,endtime=end_date,\
    network=net_list,station=sta_list,channel=chan_list,samp_freq=samp_freq)

print('cross-correlation ...')
cc_len = 1800 # basic unit of data length for fft (sec)
cc_step = 900 # overlap between consecutive cc_len windows (sec)
maxlag = 100 # lags of cross-correlation to save (sec)

#get FFT
fftdata1=compute_fft(trall[0],cc_len,cc_step,stainv=stainv_all[0])
fftdata2=compute_fft(trall[1],cc_len,cc_step,stainv=stainv_all[1])

#do correlation
corrdata=correlate(fftdata1,fftdata2,maxlag,substack=True)

#plot correlation results
corrdata.plot(freqmin=0.1,freqmax=1,lag=100)
```

You should get the following figure:
![plot1](/figs/noise_xcorr_example.png)

* Run large-scale jobs through MPI. For processing of large datasets, the downloaded and xcorr data are saved to disk. Example script here: `scripts/seisgo_xcorr_MPI.py`
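As a follow-up to the in-memory example above, the sketch below shows what stacking and empirical Green's function (EGF) extraction could look like on the resulting `corrdata` object. This is a hedged sketch based on methods noted in changes.txt (`CorrData.copy()`, `CorrData.stack()` with a `method` option, and `CorrData.to_egf()`); the keyword names and in-place behavior are assumptions, so consult the `types` module docstrings before relying on it.

```Python
# A hedged sketch, assuming CorrData.stack() operates in place and takes a
# 'method' keyword; these are assumptions, not verified signatures.
corrstack = corrdata.copy()       # copy first, to avoid modifying corrdata
corrstack.stack(method='robust')  # stack all substacked windows into one trace

# Convert the stacked cross-correlation to EGFs and plot, reusing the
# same plotting options as above.
egf = corrstack.to_egf()
egf.plot(freqmin=0.1, freqmax=1, lag=100)
```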
## Contribute
Any bug reports and ideas are welcome. Please file an issue through GitHub.


## References
* Bell, S. W., D. W. Forsyth, & Y. Ruan (2015), Removing Noise from the Vertical Component Records of Ocean-Bottom Seismometers: Results from Year One of the Cascadia Initiative, Bull. Seismol. Soc. Am., 105(1), 300-313, doi:10.1785/0120140054.
* Clements, T., & Denolle, M. A. (2020). SeisNoise.jl: Ambient Seismic Noise Cross Correlation on the CPU and GPU in Julia. Seismological Research Letters. https://doi.org/10.1785/0220200192
* Janiszewski, H. A., J. B. Gaherty, G. A. Abers, H. Gao, & Z. C. Eilon (2019). Amphibious surface-wave phase-velocity measurements of the Cascadia subduction zone, Geophysical Journal International, 217(3), 1929-1948. https://doi.org/10.1093/gji/ggz051
* Jiang, C., & Denolle, M. A. (2020). NoisePy: A New High-Performance Python Tool for Ambient-Noise Seismology. Seismological Research Letters. https://doi.org/10.1785/0220190364
* Tian, Y., & M. H. Ritzwoller (2017), Improving ambient noise cross-correlations in the noisy ocean bottom environment of the Juan de Fuca plate, Geophys. J. Int., 210(3), 1787-1805, doi:10.1093/gji/ggx281.
* Yuan, C., Bryan, J., & Denolle, M. (2021). Numerical comparison of time-, frequency-, and wavelet-domain methods for coda wave interferometry. Geophysical Journal International, 828–846. https://doi.org/10.1093/gji/ggab140
* Yang, X., Bryan, J., Okubo, K., Jiang, C., Clements, T., & Denolle, M. A. (2022). Optimal stacking of noise cross-correlation functions. Geophysical Journal International, 232(3), 1600–1618. https://doi.org/10.1093/gji/ggac410
--------------------------------------------------------------------------------
/changes.txt:
--------------------------------------------------------------------------------
Updates in v0.9.1
ANISOTROPY:
1. This is a new module for anisotropy analysis. Currently, only BANX (beamforming anisotropy analysis from noise cross-correlations) is implemented.

NOISE:
1. get_stationpairs(): added the option to collect station coordinates.
2. extract_corrdata(): added capability to determine the side.

HELPERS:
1. xcorr_sides(): added "o" for one-sided but unclear negative or positive, and "u" for unavailable side (e.g., empty data).

UTILS:
1. Added cart2pol(), pol2cart(), and cart2compass() to convert coordinates between Cartesian and polar.
2. Replaced cascaded_union [deprecated by shapely] with unary_union.

SETUP:
1. Removed the requirement for python<3.12. It seems to work with 3.12, at least on the Purdue cluster.

TYPES:
1. Added capability to handle more side symbols for corrdata.

=======================================================
Updates in v0.9.0:
SCRIPTS:
1. Added the scripts for the whole workflow of extracting EGFs.
2. Removed some old scripts that are incompatible with the new usage and functions.

SCRIPTS/2_XCORR_MPI.py:
1. Added a switch named correct_orientation for channel orientation correction/conversion.
2. Added pad_thre to define the maximum threshold of size gaps between channels when doing orientation correction.
3. Added a switch named do_rotation for horizontal channel rotation.
4. Added a switch named channel_pairs for specifying which channel pairs are saved.

SIMULATION MODEL:
1. Added build_vmodel() to create a layered model with linearly increasing velocity.

TYPES:
1. Added checking for corrdata.data.ndim when initializing CorrData, to make sure substack is True for ndim 2 (or >1).

OBSMASTER:
1. Moved correct_orientations() out to utils, but kept the block calling this function to be compatible with old usage.

UTILS:
1. Moved correct_orientations() here.
2. Fixed a bug where the number of traces in slicing_trace() was 1 trace less than the true number.

NOISE:
1. Added trace_info() to extract trace and orientation info.
2. Added the function to correct horizontal orientations and convert 1/2 channels to N/E channels.
3. Optimized the workflow for cross-correlation with the option to do rotation before xcorr.
3a. Added the function to do channel rotation for specified channel pairs.
3b. do_correlation(): added an option to specify which channel pairs are saved.
3c. Added assemble_raw() to assemble raw data from h5 files.
3d. Updated assemble_fft() to prepare data for FFT with rotation and specified xcorr pairs.
4. Added get_locator() to read the inventory and location id of stations.
5. Added rotation() to perform horizontal channel rotation.

=============================================================
Updates in v0.8.3:
UTILS:
1. Replaced obspy.core.util.base._get_function_from_entry_point for the
taper hann window with scipy.signal.windows.hann for compatibility with new scipy.

DOWNLOADERS:
1. Added more verbose prints in getdata().
2. Use utils.taper() for tapering the raw data, instead of obspy's taper function,
which has version errors with the scipy hann window.

NOISE:
1. Fixed minor bugs with int and float, which are now excluded from numpy.

NOTEBOOKS:
1. Updated the seisgo_download_xcorr_demo with the latest version of seisgo.

=============================================================
Updates in v0.8.1
TYPES:
1. Types.CorrData.subset: clarified documentation of the overwrite option. Default False.
2. Added a new class for plotting shaded relief based on cartopy.

NOISE:
1. In shaping_corrdata(), modified to loop through all pairs and all components if not specified by the user.

UTILS:
1. qml2list: added option to only extract locations and dates.
2. slicing_trace(): fixed a bug where the trace std was computed using the amplitudes instead of the absolute amplitudes.
3. slicing_trace(): changed to compute trace_stdS after demean and detrend;
added demean and detrend of the whole trace at the beginning.
4. Import tukey from scipy.signal.windows to be compatible with different scipy versions.

DOWNLOADERS:
1. get_events(): added depth range and search with radius in circles.

PLOTTING:
1. plot_stations: added options to specify colors and renamed it to gmt_scatters() to be more general.
The old function was kept but will throw an error.

OBSMASTER:
1. Import tukey from scipy.signal.windows to be compatible with different scipy versions.

=============================================================
Updates in v0.8.0
MONITORING:
1. get_dvv(): drop negative error data.

TYPES:
1. RawData: minor bug fix for attribute names.
2. Added CorrDataEnsemble() placeholder, to store a gather of CorrData from the same virtual source.
3. Fixed a bug when merging corrdata pairs with different channel types (when ignore_channel_type is True).
4. Fixed typos in CorrData.filter(), where highpass() was typed as lhighpass.
5. CorrData.merge(): fixed a bug where, upon errors in joining two data matrices, the time attribute would still be merged, leading
to inconsistent sizes.
6. DvvData: drop negative error data and added option to specify the maximum error in plot().

NOISE:
1. merge_pairs(): fixed minor bugs. Removed the try{} block in the stack, split, egf, and saving steps. The try block would hide errors
that might be important to address.

DOWNLOADERS:
1. download(): changed the pressure channel rmresp output to "PRESSURE", instead of hardcoding it to "VEL".

UTILS:
1. Changed np.float to np.float32.
2. save2asdf(): fixed sta_in when it is a list.
3. save2asdf(): fixed a bug when saving multiple traces with a list of station inventories.

STACKING:
1. stack: discard NaN traces before passing the data to each method.
2. robust: fixed a bug when outlier traces have huge amplitudes, which would cause the dot product
and L2 norms to be too large to be handled by the system. The final stack will be scaled back.
=================================================================
Updates in v0.7.7
TYPES:
1. CorrData.merge(): added option to ignore channel types. This is needed when stations update
their channel types, e.g., EH? to BH?. This new option allows station pairs with different
channel types to merge. Default is False.

NOISE:
1. merge_pairs(): added option of ignore_channel_type.

SETUP:
1. Changed the numpy requirement to be <1.26.0.
=================================================================
Updates in v0.7.6
UTILS:
1. Added smooth functions for 1d, 2d, and 3d grids.
2. Added matrix_in_polygon to extract 2d or 3d values within a polygon.
3. Added boundary_points() to get the boundary of a series of points.

=================================================================
Updates in v0.7.5
NOISE:
1. correlate(): discarded the lines removing the peak at zero lag (demean of the spectrum).

MONITORING:
1. get_dvv: improved plot saving. Fixed a bug where stack_method was not specified for substacking.
2. ts_dvv: trimmed data to only compare the original data window, excluding the padded zeros.
3. xc_dvv: newly added method using moving-window cross-correlations. Needs improvements and more tests.

DISPERSION:
1. get_dispersion_image: use adaptive window sizes for different periods, increasing with period.
This is optional. Added options to specify the minimum number of traces and the minimum number of wavelengths.
* Added option to specify energy type, "power_sum" or "envelope".
* Added plotting options.
* Fixed bugs when only processing one side.
2. Renamed get_dispersion_waveforms() to narrowband_waveforms to avoid using the word "dispersion".

DOWNLOADERS:
1. read_data(): minor bug fixes.
2. get_data() and get_sta_list(): added flexibility of specifying the source, which could be a Client object.
3. download(): added verbose to calling getdata().

STACKING:
1. Added handling of small data sizes, i.e., 1 trace, which will return the data without stacking.
2. Added option of using DOST for tfpws, based on the implementation by Jared Bryan. It was added as a new function: tfpws_dost.

UTILS:
1. Added two functions to get the image gradient and convert xyz to a matrix, without interpolation.
2. extract_waveform: read all data if not specifying stations.

PLOTTING:
1. Added facecolor to set the background of plots to white.
2. Added plot_dispersion_image() to plot dispersion images.

SIMULATION:
1. New module for seismic wave simulations.

===================================================================
Updates in v0.7.4
UTILS:
1. Added spectral whitening function: whiten(), for time series data.

MONITORING:
1. Added whitening option in get_dvv().

TYPES:
1. FFTData: fixed a minor bug when getting Nfft for single-trace data.
2. CorrData shaping: trim_end option to accommodate the need by FWANT.

PLOTTING:
1. plot_psd() bug fixed.

STACKING:
1. Fixed a bug in robust stacking when the trace amplitudes are anomalously small.

DOWNLOADERS:
1. ms2asdf(): added option to specify the response file.

DISPERSION:
1. get_dispersion_waveforms: use period to get the evenly-spaced period vector.
2. get_dispersion_image: new function to extract the dispersion image in the velocity-period domain/space.

==================================================================
Updates in v0.7.3
NOISE:
1. Added mpi option and fixed error in handling wrong files, in extract_corrdata().
2. Fixed error handling in get_stationpairs().
3. Added split_sides() to split corrfile sides and save them separately.
4. Added shaping_corrdata() to wrap the shaping function, convolving with a wavelet.
5. Added split option in merge_pairs().

TYPES:
1. Added CorrData.shaping() to shape the data with a wavelet.
2. Added CorrData.save() to wrap saving functions.
3. In CorrData.to_asdf(), save stack_method.
4. Fixed a bug in CorrData.to_egf() where the zero lag was not handled correctly. The negative side was wrong.

UTILS:
1. Added gaussian() and ricker() as the shaping wavelets.
2. Added box_smooth().

DOWNLOADERS:
1. In read_data(), changed default to False for getstainv.

HELPERS:
1. Added wavelet_labels().

MONITORING:
1. Added vpcluster_evaluate_kmean() to find the optimal number of clusters.

DISPERSION:
1. Improved frequency steps in disp_waveform_bp().

STACKING:
1. Synced with stackmaster functions.

CLUSTERING:
1. Automatically determine the optimal number of clusters.

==================================================================
Updates in v0.7.2
All modules: cleaned up unused functions.

TYPES:
1. CorrData.subset() changed overwrite to False by default.
2. DvvData: added get_info() to streamline and wrap up key information for other uses.
3. DvvData: added save() as a wrapper for saving functions, including to_asdf() and to_pickle().
4. DvvData: added to_pickle() to save data to pickle files. This mainly aims to
work around the problem of saving large attributes in an h5 file through asdf.
5. DvvData: added subfreq as a Boolean attribute.
6. DvvData.plot(): added option to plot when subfreq=False; only plots error bars.

MONITORING:
1. extract_dvvdata: added option to read in pickle files.
2. get_dvv(): added option to save to pickle.
3. get_dvv(): added "ts" method.
4. Modified ts_dvv() to provide a filter option and streamlined it to work as a standalone method for SeisGo.
5. Deleted unused functions.

OBSMASTER:
1. Moved some functions from utils.py to here.

NOISE:
1. Added reorganize_corrfile() to reorganize corrfiles.
2. Added option in do_correlation() to specify the flag of output_structure.
3. Added comments and descriptions to major functions.

HELPERS:
1. Added this new module to provide a summary of helper functions in SeisGo.

==================================================================
Changes in v0.7.0
STACKING:
1. Added clusterstack() and tfpws().

MONITORING:
1. Added option to specify the file name when saving to a *.h5 file.

==================================================================
Updates in v0.6.6
MONITORING:
1. Fixed a bug in single-cpu mode, where the positive side was used for the negative measurements.

STACKING:
1. Cleaned up stacking methods with simpler names in a consistent fashion.
2. Added seisstack() as a wrapper/interface to call all methods.

TYPES:
1. CorrData.to_egf(): added a statement to check whether the data is already EGFs.
2. CorrData.stack(): demean is now an option. ampcut can be turned off now. This is to
consider situations when all data need to be stacked and/or the overall trend needs to be preserved.
3. Simplified calling stacking through seisstack. This enables easier future development.

NOISE:
1. Simplified calling stacking through seisstack. This enables easier future development.

UTILS:
1. Added get_snr() to get the snr of data with distance information.
2. Added rms() to get the rms of data.
3. Fixed a bug in psd() to return half of the PSD and the corresponding frequency vector.

==================================================================
Updates in v0.6.5
SETUP:
1. Added requirements for tslearn and minisom, mainly for the clustering module.

CLUSTERING:
1. Added the new module.
2. Added clustering of velocity depth profiles with kmean and som, in two functions.

PLOTTING:
1. Added get_color_cycle() to help assign colors in plotting using matplotlib.
2. Cleaned up unused old NOISEPY functions.
3. Function to plot vmodel clustering results.

NOISE:
1. Optimized merge_chunks() to use less time getting the time range.

UTILS:
1. Added option to use a pattern in get_filelist().

==================================================================
Updates in v0.6.4
TYPES:
1. Check lower case only for method in CorrData.stack().
2. Bug fixes and improvements in DvvData and CorrData to save large attributes to asdf.
3. Added psd() method in CorrData to plot the psd of the CorrData result.
4. Added plot() method in FFTData to plot the amplitude spectrum of the FFT results.
5. Improved plot() for DvvData with a smoothing option.
6. Minor bug fixes and improvements for FFTData in merging and plotting.

DOWNLOADERS:
1. Use inventory to remove response in read_data for miniseed.
2. Added ms2asdf() to convert miniseed files to asdf files.

NOISE:
1. Optimized memory usage for merge_chunks().

PLOTTING:
1. Added plot_psd() to plot the psd of an array (works with 1-d and 2-d arrays only for now).
2. Updated plot_eventsequence() to have the option of plotting depth as the ydata.

UTILS:
1. Minor bug fixes in read_gmtlines().
2. Added psd() to get the power spectral density of an array.

==================================================================
Updates in v0.6.3.1
1. Fixed the size issue when saving "time" to an ASDF file. HDF limits the attribute
size to 64k or less. We split time into time_mean and np.float32(time) to reduce the size.
This is a temporary fix. Hopefully HDF could lift the size limit for attributes.

==================================================================
Updates in v0.6.3
NOISE
1. Removed ncomp in do_correlation(). Set up a warning message if the old usage is used.
2. Changed defaults for acorr_only and xcorr_only both to False in do_correlation().
3. Added option to stack in merging(). This option could replace do_stacking() if no
rotation. Renamed merging() to merge_pairs(). The old name is kept for compatibility.
4. Updated extract_corrdata() to read the "side" attribute if available.
5. Added merge_chunks() to merge correlation files, to reduce the number of files, with
the option for stacking.


DOWNLOADERS
1. Return inventory in get_event_waveforms().
2. Drop duplicates in get_sta_list() and fixed a minor bug where channels might be skipped.
3. Changed the default region to the globe in get_events().

UTILS
1. Added mag_duration(), modified from obspyDMT.utils.event_handler.py.
2. Renamed qml_to_event_list() to qml2list(). Added option to convert to a pandas dataframe.
3. Fixed a bug in slicing_trace(), where the index was float instead of integer.
4. Added get_filelist() and slice_list().
5. Fixed a bug in slicing_trace() when returning zeros arrays with errors; changed to return
empty arrays.

PLOTTING
1. Added plot_eventsequence() to plot events with time.

TYPES
1. Added "side" attribute in CorrData() to mark whether the corrdata is two-sided or one-sided only.
2. Revised CorrData.plot() to check the "side" attribute when plotting.
3. Added copy() method in the CorrData class to allow the user to copy the object, to avoid directly
modifying the object values.
4. Added split() method in the CorrData class to split the negative and positive sides of the data. This
is needed when the user wants to analyze the two sides separately.
5. Removed the ngood attribute from CorrData; corresponding changes have been implemented for other
functions in "noise".
6. Added subset() method in CorrData() to subset data by time range.
7. Added filter() method in CorrData() to filter corrdata.data.
8. Added DvvData class to store dvv monitoring data.

MONITORING
1. Added get_dvv() as a wrapper to measure dvv with a given CorrData object.
2. Added extract_dvvdata() to extract a DvvData object from an ASDF file.

OBSMASTER
1. Removed getdata() and deprecated getobsdata(). Data downloading is now handled entirely by downloaders.

==================================================================================
==================================================================================
Updates in v0.6.2

PLOTTING
1. Added plot_stations() to plot seismic station maps using GMT.

2. Fixed minor bugs in plot_waveform().

3. Updated plot_corrfile() to take more options, more consistent with CorrData.plot().


DOWNLOADERS
1. Fixed a bug in get_event_waveforms() where only one station was downloaded. Now it downloads
all station data.

2. Return Stream() for waveform data.

UTILS
1. Added subsetindex().
2. Added points_in_polygon() and generate_points_in_polygon().
3. Added read_gmtlines() to read in line segments in GMT style.
4. Added read_ncmodel3d to read 3-D model files in netCDF format.
5. Added read_ncmodel2d to read 2-D model files in netCDF format.
6. Added ncmodel_in_polygon to extract seismic models within polygons.

TYPES
1. Pushed cc_len and cc_step in CorrData.
2. Added in CorrData.stack() the option to stack over segmented time windows.

SETUP
1. Added requirement for shapely and netCDF4 packages.
--------------------------------------------------------------------------------
/data/tcremoval/tcparameters.txt:
--------------------------------------------------------------------------------
{'window': 3600, 'overlap': 0.2, 'taper': 0.08, 'qc_freq': [0.004, 1], 'tc_subset': ['ZP-H']}
--------------------------------------------------------------------------------
/description.md:
--------------------------------------------------------------------------------
# SeisGo
*A ready-to-go Python toolbox for seismic data analysis*

### Author: Xiaotao Yang

## Introduction
This package currently depends on **obspy** (www.obspy.org) for basic handling of seismic data (download, read, write, etc.). Users are referred to the **obspy** toolbox for related functions.

## Citation if using SeisGo
Please cite Yang et al. (2022) if you use SeisGo in your research.

Yang, X., Bryan, J., Okubo, K., Jiang, C., Clements, T., & Denolle, M. A. (2022). Optimal stacking of noise cross-correlation functions. Geophysical Journal International, 232(3), 1600–1618. https://doi.org/10.1093/gji/ggac410

## Available modules
This package is under active development. The currently available modules are listed here.

1. `utils`: This module contains frequently used utility functions not readily available in `obspy`.

2. `downloaders`: This module contains functions used to download earthquake waveforms, earthquake catalogs, station information, and continuous waveforms, and to read data from local files.

3. `obsmaster`: This module contains functions to get and process Ocean Bottom Seismometer (OBS) data. The functions and main processing modules for removing tilt and compliance noise are inspired by and modified from `OBStools` (https://github.com/nfsi-canada/OBStools) developed by Pascal Audet & Helen Janiszewski. The main tilt and compliance removal method is based on Janiszewski et al. (2019).

4. `noise`: This module contains functions used in ambient noise processing, including cross-correlations and monitoring. The key functions were converted from `NoisePy` (https://github.com/mdenolle/NoisePy) with heavy modifications. Inspired by `SeisNoise.jl` (https://github.com/tclements/SeisNoise.jl), we modified the cross-correlation workflow to use FFTData and CorrData objects (defined in the `types` module). The original NoisePy script for cross-correlations has been disassembled and wrapped in functions, primarily in this module. We also changed the way NoisePy handles timestamps when cross-correlating; this change retains more data, even when there are gaps. The xcorr functionality in SeisGo also requires minimal knowledge of the downloading step. We try to optimize the workflow and minimize the inputs required from the user. We added functionality to better control the temporal resolution of xcorr results.

5. `plotting`: This module contains major plotting functions for raw waveforms, cross-correlation results, and station maps.

6. `monitoring`: This module contains functions for ambient noise seismic monitoring, adapted from functions by Yuan et al. (2021).

7. `clustering`: Clustering functions for seismic data and velocity models.

8. `stacking`: Stacking of seismic data.

9. `types`: This module contains the definition of major data types and classes.

## Installation
1. Create and activate the **conda** `seisgo` environment

Make sure you have a working Anaconda installed. This step is required to have all dependencies installed for the package. You can also manually install the listed packages **without** creating the `seisgo` environment, or skip this step if you already have these packages installed. **The order of the following commands MATTERS.**

```
$ conda create -n seisgo -c conda-forge jupyter numpy scipy pandas numba pycwt python obspy mpi4py
$ conda activate seisgo
```

The `jupyter` package is currently not required, **unless** you plan to run the accompanying Jupyter notebooks in the **notebooks** directory. `mpi4py` is **required** to run the parallel scripts stored in the **scripts** directory. The modules have been fully tested on Python 3.7.x, but versions >= 3.6 also seem to work based on a few tests.

**Install PyGMT plotting functions**

Map views with geographical projections are plotted using **PyGMT** (https://www.pygmt.org/latest/). The following steps install the PyGMT package (please refer to the PyGMT webpage for troubleshooting and testing):

Install GMT through conda first into the `seisgo` environment:

```
conda activate seisgo
conda config --prepend channels conda-forge
conda install python pip numpy pandas xarray netcdf4 packaging gmt
```

**You may need to specify the python version available on your environment.** In ~/.bash_profile, add this line: `export GMT_LIBRARY_PATH=$SEISGOROOT/lib`, where `$SEISGOROOT` is the root directory of the `seisgo` environment. Then, run:

```
conda install pygmt
```

Test your installation by running:
```
python
> import pygmt
```

2. Install the `seisgo` package functions using `pip`

`cd` to the directory where you want to save the package files. Then,
```
$ conda activate seisgo
$ pip install seisgo
```

This step will install the **SeisGo** modules under the `seisgo` environment. The modules can then be imported from any working directory. Remember to rerun this command if you modify the functions/modules.

3. Test the installation

Run the following commands to test your installation.

```
$ python
>>> from seisgo import obsmaster as obs
>>> tflist=obs.gettflist(help=True)
------------------------------------------------------------------
| Key    | Default  | Note                                       |
------------------------------------------------------------------
| ZP     | True     | Vertical and pressure                      |
| Z1     | True     | Vertical and horizontal-1                  |
| Z2-1   | True     | Vertical and horizontals (1 and 2)         |
| ZP-21  | True     | Vertical, pressure, and two horizontals    |
| ZH     | True     | Vertical and rotated horizontal            |
| ZP-H   | True     | Vertical, pressure, and rotated horizontal |
------------------------------------------------------------------
```


## Tutorials on key functionalities
See https://github.com/xtyangpsp/SeisGo for tutorials and more detailed descriptions.


## Contribute
Any bug reports and ideas are welcome. Please file an issue through GitHub: https://github.com/xtyangpsp/SeisGo.


## References
* Bell, S. W., D. W. Forsyth, & Y. Ruan (2015), Removing Noise from the Vertical Component Records of Ocean-Bottom Seismometers: Results from Year One of the Cascadia Initiative, Bull. Seismol. Soc. Am., 105(1), 300-313, doi:10.1785/0120140054.
* Clements, T., & Denolle, M. A. (2020). SeisNoise.jl: Ambient Seismic Noise Cross Correlation on the CPU and GPU in Julia. Seismological Research Letters. https://doi.org/10.1785/0220200192
* Janiszewski, H. A., J. B. Gaherty, G. A. Abers, H. Gao, & Z. C. Eilon (2019). Amphibious surface-wave phase-velocity measurements of the Cascadia subduction zone, Geophysical Journal International, 217(3), 1929-1948. https://doi.org/10.1093/gji/ggz051
* Jiang, C., & Denolle, M. A. (2020). NoisePy: A New High-Performance Python Tool for Ambient-Noise Seismology. Seismological Research Letters. https://doi.org/10.1785/0220190364
* Tian, Y., & M. H. Ritzwoller (2017), Improving ambient noise cross-correlations in the noisy ocean bottom environment of the Juan de Fuca plate, Geophys. J. Int., 210(3), 1787-1805, doi:10.1093/gji/ggx281.
* Yuan, C., Bryan, J., & Denolle, M. (2021). Numerical comparison of time-, frequency-, and wavelet-domain methods for coda wave interferometry. Geophysical Journal International, 828–846. https://doi.org/10.1093/gji/ggab140
* Yang, X., Bryan, J., Okubo, K., Jiang, C., Clements, T., & Denolle, M. A. (2022). Optimal stacking of noise cross-correlation functions. Geophysical Journal International, 232(3), 1600–1618. https://doi.org/10.1093/gji/ggac410
--------------------------------------------------------------------------------
/figs/download_continuous_example.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xtyangpsp/SeisGo/e989041ec6297187878050e11ffeb9b77c7126a2/figs/download_continuous_example.png
--------------------------------------------------------------------------------
/figs/noise_xcorr_example.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xtyangpsp/SeisGo/e989041ec6297187878050e11ffeb9b77c7126a2/figs/noise_xcorr_example.png
--------------------------------------------------------------------------------
/figs/seisgo_logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xtyangpsp/SeisGo/e989041ec6297187878050e11ffeb9b77c7126a2/figs/seisgo_logo.png
--------------------------------------------------------------------------------
/notebooks/OBS_orientation_Cascadia.csv:
--------------------------------------------------------------------------------
1 | net,station,orientation_ch1,orientation_ch2,error,Num_measurements 2 | 7D,G03A,154,244,5,4 3 | 7D,G30A,140,230,2,5 4 | 7D,J06A,225,315,6,5 5 | 7D,J23A,114,204,11,5 6 | 7D,J28A,206,296,12,4 7 | 7D,J29A,74,164,13,7 8 | 7D,J30A,69,159,9,8 9 | 7D,J31A,5,95,11,7 10 | 7D,J37A,264,354,9,11 11 | 7D,J38A,101,191,14,7 12 | 7D,J39A,323,53,13,9 13 | 7D,J45A,12,102,10,13 14 | 7D,J46A,254,344,14,4 15 | 7D,J47A,247,337,14,9 16 | 7D,J52A,253,343,11,11 17 | 7D,J53A,149,239,13,2 18 | 7D,J54A,27,117,9,6 19 | 7D,J55A,211,301,12,4 20 | 7D,J61A,257,347,6,6 21 | 7D,J63A,165,255,8,3 22 | 7D,J67A,138,228,5,5 23 | 7D,J68A,158,248,20,4 24 | 7D,FN07A,118,208,0,12 25 | 7D,FN08A,128,218,55,4 26 | 7D,FN14A,347,77,44,17 27 | 7D,FN16A,201,291,22,15 28 | 7D,FN18A,296,26,56,8 29 | 7D,J41A,60,150,33,11 30 | 7D,J42A,83,173,5,5 31 | 7D,J49A,291,21,13,14 32 | 7D,J50A,93,183,23,65 33 | 7D,J51A,133,223,27,26 34 | 7D,J58A,190,280,13,6 35 | 7D,J59A,109,199,11,15 36 | 7D,M03A,161,251,13,11 37 | 7D,J25A,145,235,82,2 38 | 7D,J33A,12,102,24,5 39 |
7D,J35A,202,292,8,23 40 | 7D,J36A,37,127,6.9,16 41 | 7D,J43A,81.5,171.5,7.4,24 42 | 7D,J44A,207.5,297.5,9.4,9 43 | 7D,J57A,157,247,79,2 44 | 7D,J65A,51,141,12,2 45 | 7D,J73A,16,106,32,2 46 | 7D,M01A,100,190,25,5 47 | 7D,M07A,91,181,14,18 48 | 7D,M08A,8,98,7,3 49 | 7D,J48A,NaN,NaN,NaN,NaN 50 | 7D,FN12A,312,42,0,1 51 | 7D,J26A,NaN,NaN,NaN,NaN 52 | 7D,J24A,NaN,NaN,NaN,NaN 53 | 7D,J34A,NaN,NaN,NaN,NaN 54 | 7D,M06A,NaN,NaN,NaN,NaN 55 | 7D,M02A,NaN,NaN,NaN,NaN 56 | 7D,FN05A,6,96,0,1 57 | 7D,FS02B,78,168,60,6 58 | 7D,FS04B,151,241,12,12 59 | 7D,FS07B,30,120,0,1 60 | 7D,FS11B,202,292,29,3 61 | 7D,FS12B,25,115,0,1 62 | 7D,FS13B,341,71,30,5 63 | 7D,FS15B,141,231,37,4 64 | 7D,FS18B,47,137,67,4 65 | 7D,FS19B,31,121,0,1 66 | 7D,FS20B,0,90,19,9 67 | 7D,G09B,220,310,12,5 68 | 7D,G17B,324,54,60,10 69 | 7D,G18B,221,311,0,1 70 | 7D,G25B,334,64,3,5 71 | 7D,G26B,316,46,50,2 72 | 7D,G34B,151,241,17,13 73 | 7D,J17B,87,177,0,1 74 | 7D,J25B,99,189,81,7 75 | 7D,J33B,10,100,37,10 76 | 7D,M09B,98,188,4,13 77 | 7D,FS01B,346,76,15,23 78 | 7D,FS14B,164,254,51,2 79 | 7D,G02B,200,290,18,33 80 | 7D,G10B,133,223,16,24 81 | 7D,G12B,209,299,21,8 82 | 7D,G27B,212,302,17,27 83 | 7D,G28B,161,251,15,21 84 | 7D,G37B,102,192,13,40 85 | 7D,J09B,163,253,6,3 86 | 7D,J10B,89,179,10,34 87 | 7D,J18B,80,170,1,2 88 | 7D,J20B,331,61,9,35 89 | 7D,M11B,201,291,19,13 90 | 7D,M12B,235,325,15,6 91 | 7D,M14B,266,356,13,7 92 | 7D,FS05B,307,37,11,3 93 | 7D,FS06B,331,61,76,4 94 | 7D,G03B,29,119,8,4 95 | 7D,G05B,231,321,11,19 96 | 7D,G11B,39,129,15,28 97 | 7D,G13B,225,315,13,17 98 | 7D,G19B,79,189,14,12 99 | 7D,G20B,3,93,15,28 100 | 7D,G21B,250,340,11,13 101 | 7D,G22B,260,350,9,29 102 | 7D,G29B,329,59,8,15 103 | 7D,G30B,230,320,7,11 104 | 7D,G35B,333,63,13,15 105 | 7D,G36B2,195,285,14,8 106 | 7D,J06B,240,330,13,12 107 | 7D,J11B,321,51,10,19 108 | 7D,J19B,125,215,7,24 109 | 7D,J23B,315,45,9,19 110 | 7D,J27B,292,22,13,18 111 | 7D,J28B,23,113,9,16 112 | 7D,J48B,201,291,8,17 113 | 7D,J63B,145,235,7,11 114 | 7D,BB030,84,174,0,1 115 | 7D,BB060,121,211,0,1 116 | 7D,BB070,304,34,0,1 117 | 7D,BB090,18,108,0,1 118 | 7D,BB120,300,30,0,1 119 | 7D,BB130,246,336,0,1 120 | 7D,BB140,337,67,7,11 121 | 7D,BB150,29,109,0,1 122 | 7D,BB170,211,201,0,1 123 | 7D,BB180,162,252,0,1 124 | 7D,BB200,64,154,0,1 125 | 7D,BB230,158,248,0,1 126 | 7D,BB240,291,21,0,1 127 | 7D,BB260,40,130,0,1 128 | 7D,BB290,59,149,0,1 129 | 7D,BB300,303,33,0,1 130 | 7D,BB320,197,287,0,1 131 | 7D,BB330,63,153,0,1 132 | 7D,BB350,162,252,0,1 133 | 7D,BB370,63,153,0,1 134 | 7D,BB390,59,149,0,1 135 | 7D,BB410,357,87,0,1 136 | 7D,BB420,181,271,0,1 137 | 7D,BB440,303,33,0,1 138 | 7D,BB450,235,325,0,1 139 | 7D,BB480,62,152,0,1 140 | 7D,BB510,297,27,0,1 141 | 7D,BB530,301,31,0,1 142 | 7D,BB540,168,258,0,1 143 | 7D,BB550,230,320,0,1 144 | 7D,FN01C,72,162,65,2 145 | 7D,FN02C,141,231,57,5 146 | 7D,FN03C,26,116,70,38 147 | 7D,FN04C,163,253,55,35 148 | 7D,FN05C,357,87,43,15 149 | 7D,FN07C,171,261,32,7 150 | 7D,FN08C,324,54,71,8 151 | 7D,FN09C,162,252,25,19 152 | 7D,FN10C,245,335,2,2 153 | 7D,FN11C,252,342,74,4 154 | 7D,FN12C,318,48,41,7 155 | 7D,FN13C,76,166,48,10 156 | 7D,FN14C,221,311,80,16 157 | 7D,FN19C,182,272,50,20 158 | 7D,J26C,63,153,64,13 159 | 7D,J34C,325,55,56,10 160 | 7D,J41C,346,76,78,12 161 | 7D,J42C,225,315,54,17 162 | 7D,J49C,96,186,59,23 163 | 7D,J50C,339,69,42,12 164 | 7D,J57C,103,193,77,3 165 | 7D,J58C,115,205,39,14 166 | 7D,J59C,333,63,71,8 167 | 7D,M06C,359,89,29,14 168 | 7D,J25C,215,305,36,5 169 | 7D,J33C,215,305,27,11 170 | 7D,J52C,129,219,26,27 171 | 7D,J53C,301,39,27,31 172 | 
7D,J61C,173,263,17,22 173 | 7D,J65C,109,199,79,9 174 | 7D,J68C,141,231,5,24 175 | 7D,J73C,329,49,26,8 176 | 7D,M01C,138,228,6,8 177 | 7D,M02C,88,178,20,2 178 | 7D,M03C,327,67,24,15 179 | 7D,M04C,103,193,21,10 180 | 7D,M05C,194,284,15,9 181 | 7D,M08C,92,182,9,9 182 | 7D,J21C,42,132,32,10 183 | 7D,J23C,147,237,7,10 184 | 7D,J28C,186,276,12,4 185 | 7D,J29C,59,149,29,37 186 | 7D,J30C,65,155,26,29 187 | 7D,J31C,234,324,33,29 188 | 7D,J32C,97,187,3,4 189 | 7D,J35C,333,73,32,18 190 | 7D,J36C,33,123,20,16 191 | 7D,J37C,228,318,22,18 192 | 7D,J38C,245,335,36,17 193 | 7D,J39C,268,358,18,22 194 | 7D,J43C,118,208,28,19 195 | 7D,J44C,303,33,33,19 196 | 7D,J45C,109,199,18,28 197 | 7D,J46C,175,265,17,25 198 | 7D,J47C,219,309,39,21 199 | 7D,J48C,290,20,33,11 200 | 7D,J54C,84,174,14,4 201 | 7D,J55C,127,217,30,11 202 | 7D,J63C,100,190,15,9 203 | 7D,J67C,178,268,8,6 204 | 7D,J69C,37,127,6,7 205 | 7D,FS02D,123,217,55,6 206 | 7D,FS04D,326,56,42,8 207 | 7D,FS06D,67,157,5,7 208 | 7D,FS07D,259,349,6,9 209 | 7D,FS08D,202,292,77,13 210 | 7D,FS09D,263,353,9,12 211 | 7D,FS10D,42,132,6,19 212 | 7D,FS13D,275,5,11,16 213 | 7D,FS16D,80,170,5,8 214 | 7D,FS41D,183,273,16,2 215 | 7D,FS44D,74,164,8,15 216 | 7D,G01D,10,100,11,15 217 | 7D,G03D,139,229,19,28 218 | 7D,G04D,261,351,14,28 219 | 7D,G05D,315,45,25,30 220 | 7D,G09D,284,14,2,7 221 | 7D,G10D,180,270,20,19 222 | 7D,G13D,181,271,21,46 223 | 7D,G20D,311,41,9,35 224 | 7D,G21D,296,26,10,29 225 | 7D,G22D,171,261,11,35 226 | 7D,G29D,182,272,14,43 227 | 7D,G33D,315,45,8,10 228 | 7D,G35D,52,142,11,35 229 | 7D,G36D,115,205,10,19 230 | 7D,G37D,137,227,24,49 231 | 7D,J11D,98,188,14,26 232 | 7D,J19D,256,346,22,34 233 | 7D,J20D,188,278,17,45 234 | 7D,J27D,28,118,9,28 235 | 7D,J28D,211,301,8,23 236 | 7D,M16D,315,45,7,14 237 | 7D,BB630,125,215,30,78 238 | 7D,BB631,125,115,30,78 239 | 7D,BB640,267,357,16,40 240 | 7D,BB650,147,237,36,18 241 | 7D,BB660,293,23,33,10 242 | 7D,BB670,201,291,25,36 243 | 7D,BB680,33,123,26,32 244 | 7D,BB690,244,334,20,7 245 | 7D,BB700,192,282,13,34 246 | 7D,BB710,118,208,29,71 247 | 7D,BB711,118,208,29,71 248 | 7D,BB720,26,116,19,52 249 | 7D,BB721,26,116,19,52 250 | 7D,BB730,36,126,35,26 251 | 7D,BB740,279,9,17,38 252 | 7D,BB750,NaN,NaN,NaN,NaN 253 | 7D,BB830,300,30,13,27 254 | 7D,BB840,62,152,19,32 255 | 7D,BB850,330,60,23,55 256 | 7D,BB870,230,320,28,57 257 | 7D,GB030,NaN,NaN,NaN,NaN 258 | 7D,GB050,257,347,1,2 259 | 7D,GB080,99,189,47,6 260 | 7D,GB100,146,236,38,42 261 | 7D,GB101,146,236,38,42 262 | 7D,GB111,159,249,28,9 263 | 7D,GB130,205,295,83,7 264 | 7D,GB170,327,57,40,24 265 | 7D,GB171,327,57,40,24 266 | 7D,GB180,186,276,24,20 267 | 7D,GB210,279,9,21,5 268 | 7D,GB220,334,64,81,7 269 | 7D,GB230,NaN,NaN,NaN,NaN 270 | 7D,GB260,215,305,51,4 271 | 7D,GB281,189,279,17,39 272 | 7D,GB320,143,233,33,3 273 | 7D,GB321,143,233,33,3 274 | 7D,GB330,20,110,23,30 275 | 7D,GB331,20,110,23,30 276 | 7D,GB340,33,123,0,1 277 | 7D,GB341,33,123,0,1 278 | 7D,GB350,NaN,NaN,NaN,NaN 279 | 7D,GB360,88,178,25,5 280 | 7D,GB380,242,332,41,20 281 | 7D,FC03D,84,174,4,16 282 | 7D,FS11D,263,353,54,4 283 | 7D,FS12D,181,271,32,12 284 | 7D,FS14D,213,303,29,9 285 | 7D,FS15D,172,262,0,1 286 | 7D,FS17D,298,28,3,4 287 | 7D,FS42D,102,192,0,1 288 | 7D,FS43D,125,215,5,13 289 | 7D,FS45D,239,329,55,10 290 | 7D,G02D,72,162,9,10 291 | 7D,G18D,257,347,15,29 292 | 7D,G25D,NaN,NaN,NaN,NaN 293 | 7D,G26D,236,326,10,14 294 | 7D,G27D,76,166,25,11 295 | 7D,G34D,124,214,6,15 296 | 7D,J09D,283,13,10,5 297 | 7D,J10D,181,271,8,16 298 | 7D,J17D,202,292,4,6 299 | 7D,J18D,291,21,23,13 300 | 7D,J25D,131,221,49,20 
301 | 7D,J26D,173,263,5,11 302 | 7D,M13D,15,105,0,1 303 | 7D,M14D,237,327,7,3 304 | 7D,M15D,256,346,55,5 305 | 7D,M17D,56,146,15,23 306 | 7D,G11D,NaN,NaN,NaN,NaN 307 | 7D,J06D,309,39,29,32 308 | 7D,J23D,207,297,8,37 -------------------------------------------------------------------------------- /notebooks/download_continuous.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "source": [ 6 | "# Download and Save Data: Continuous\n", 7 | "### Xiaotao Yang\n", 8 | "This notebook shows examples of calling seisgo.utils.getdata() to download seismic data and save them to files." 9 | ], 10 | "metadata": {} 11 | }, 12 | { 13 | "cell_type": "markdown", 14 | "source": [ 15 | "## Step 0. Load needed packages.\n", 16 | "Some functions are imported from utils.py." 17 | ], 18 | "metadata": {} 19 | }, 20 | { 21 | "cell_type": "code", 22 | "execution_count": 1, 23 | "source": [ 24 | "#import needed packages.\n", 25 | "from seisgo import utils\n", 26 | "import sys\n", 27 | "import time\n", 28 | "import scipy\n", 29 | "import obspy\n", 30 | "import pyasdf\n", 31 | "import datetime\n", 32 | "import os, glob\n", 33 | "import numpy as np\n", 34 | "from obspy import UTCDateTime\n", 35 | "from obspy.core import Stream,Trace\n", 36 | "from IPython.display import clear_output\n", 37 | "from obspy.clients.fdsn import Client" 38 | ], 39 | "outputs": [], 40 | "metadata": {} 41 | }, 42 | { 43 | "cell_type": "markdown", 44 | "source": [ 45 | "## Step 1. Set global parameters\n", 46 | "These parameters will control the downloading procedure." 47 | ], 48 | "metadata": {} 49 | }, 50 | { 51 | "cell_type": "code", 52 | "execution_count": 2, 53 | "source": [ 54 | "\"\"\"\n", 55 | "1. Station parameters\n", 56 | "\"\"\"\n", 57 | "rawdatadir = '../data/raw_test'\n", 58 | "if not os.path.isdir(rawdatadir): os.mkdir(rawdatadir)\n", 59 | " \n", 60 | "cleantargetdir=True #change to False or remove/comment this block if needed.\n", 61 | "if cleantargetdir:\n", 62 | " dfiles1 = glob.glob(os.path.join(rawdatadir,'*.h5'))\n", 63 | " if len(dfiles1)>0:\n", 64 | " print('Cleaning up raw data directory before downloading ...')\n", 65 | " for df1 in dfiles1:\n", 66 | " os.remove(df1)\n", 67 | "\n", 68 | "source='IRIS'\n", 69 | "client=Client(source)\n", 70 | "# get data from IRIS web service\n", 71 | "inet=\"TA\"\n", 72 | "stalist=[\"G05D\",\"I04D\"]#[\"G03A\",\"J35A\",\"J44A\",\"J65A\"]\n", 73 | "chanlist=['HHZ','BHZ']\n", 74 | "\n", 75 | "starttime = \"2012_02_02_0_0_0\"\n", 76 | "endtime = \"2012_02_05_0_0_0\"\n", 77 | "inc_hours = 8\n", 78 | "\n", 79 | "\"\"\"\n", 80 | "2. Preprocessing parameters\n", 81 | "\"\"\"\n", 82 | "rmresp=True #remove instrument response\n", 83 | "# parameters for the Butterworth filter\n", 84 | "samp_freq=10\n", 85 | "pfreqmin=0.002\n", 86 | "pfreqmax=samp_freq/2\n", 87 | "\n", 88 | "# prefilter information used when removing instrument responses\n", 89 | "f1 = 0.95*pfreqmin;f2=pfreqmin\n", 90 | "if 1.05*pfreqmax > 0.48*samp_freq:\n", 91 | " f3 = 0.45*samp_freq\n", 92 | " f4 = 0.48*samp_freq\n", 93 | "else:\n", 94 | " f3 = pfreqmax\n", 95 | " f4= 1.05*pfreqmax\n", 96 | "pre_filt = [f1,f2,f3,f4]\n", 97 | "\n", 98 | "\n", 99 | "\"\"\"\n", 100 | "3. 
Download by looping through the datetime list.\n", 101 | "***** Users usually don't need to change the following lines *****\n", 102 | "\"\"\"\n", 103 | "dtlist = utils.split_datetimestr(starttime,endtime,inc_hours)\n", 104 | "print(dtlist)\n", 105 | "for idt in range(len(dtlist)-1):\n", 106 | " sdatetime = obspy.UTCDateTime(dtlist[idt])\n", 107 | " edatetime = obspy.UTCDateTime(dtlist[idt+1])\n", 108 | "\n", 109 | " fname = os.path.join(rawdatadir,dtlist[idt]+'T'+dtlist[idt+1]+'.h5')\n", 110 | "\n", 111 | " \"\"\"\n", 112 | " Start downloading.\n", 113 | " \"\"\"\n", 114 | " \n", 115 | " for ista in stalist:\n", 116 | " print('Downloading '+inet+\".\"+ista+\" ...\")\n", 117 | " \"\"\"\n", 118 | " 3a. Request data.\n", 119 | " \"\"\"\n", 120 | " for chan in chanlist:\n", 121 | " try:\n", 122 | " t0=time.time()\n", 123 | " tr,sta_inv = utils.getdata(inet,ista,sdatetime,edatetime,chan=chan,source=source,\n", 124 | " samp_freq=samp_freq,plot=False,rmresp=rmresp,rmresp_output='DISP',\n", 125 | " pre_filt=pre_filt,sacheader=True,getstainv=True)\n", 126 | " \n", 127 | " ta=time.time() - t0\n", 128 | " print(' downloaded '+inet+\".\"+ista+\".\"+chan+\" in \"+str(ta)+\" seconds.\")\n", 129 | " \"\"\"\n", 130 | " 3b. Save to ASDF file.\n", 131 | " \"\"\"\n", 132 | " tags=[]\n", 133 | " if len(tr.stats.location) == 0:\n", 134 | " tlocation='00'\n", 135 | " else:\n", 136 | " tlocation=tr.stats.location\n", 137 | "\n", 138 | " tags.append(tr.stats.channel.lower()+'_'+tlocation.lower())\n", 139 | "\n", 140 | " print(' saving to '+fname)\n", 141 | " utils.save2asdf(fname,Stream(traces=[tr]),tags,sta_inv=sta_inv) \n", 142 | " except Exception as e:\n", 143 | " print(str(e))\n", 144 | " \n", 145 | " " 146 | ], 147 | "outputs": [ 148 | { 149 | "output_type": "stream", 150 | "name": "stdout", 151 | "text": [ 152 | "['2012_02_02_00_00_00', '2012_02_02_08_00_00', '2012_02_02_16_00_00', '2012_02_03_00_00_00', '2012_02_03_08_00_00', '2012_02_03_16_00_00', '2012_02_04_00_00_00', '2012_02_04_08_00_00', '2012_02_04_16_00_00', '2012_02_05_00_00_00']\n", 153 | "Downloading 7D.G03A ...\n", 154 | "station 7D.G03A --> seismic channel: BHZ\n", 155 | " downsamping from 50 to 10\n", 156 | " removing response using inv for 7D.G03A.BHZ\n", 157 | " downloaded 7D.G03A.BHZ in 4.829556941986084 seconds.\n", 158 | " saving to ../data/raw_test/2012_02_02_00_00_00T2012_02_02_08_00_00.h5\n", 159 | "Downloading 7D.G03A ...\n", 160 | "station 7D.G03A --> seismic channel: BHZ\n", 161 | " downsamping from 50 to 10\n", 162 | " removing response using inv for 7D.G03A.BHZ\n", 163 | " downloaded 7D.G03A.BHZ in 4.547628164291382 seconds.\n", 164 | " saving to ../data/raw_test/2012_02_02_08_00_00T2012_02_02_16_00_00.h5\n", 165 | "Downloading 7D.G03A ...\n", 166 | "station 7D.G03A --> seismic channel: BHZ\n", 167 | " downsamping from 50 to 10\n", 168 | " removing response using inv for 7D.G03A.BHZ\n", 169 | " downloaded 7D.G03A.BHZ in 4.405111074447632 seconds.\n", 170 | " saving to ../data/raw_test/2012_02_02_16_00_00T2012_02_03_00_00_00.h5\n", 171 | "Downloading 7D.G03A ...\n", 172 | "station 7D.G03A --> seismic channel: BHZ\n", 173 | " downsamping from 50 to 10\n", 174 | " removing response using inv for 7D.G03A.BHZ\n", 175 | " downloaded 7D.G03A.BHZ in 4.533736944198608 seconds.\n", 176 | " saving to ../data/raw_test/2012_02_03_00_00_00T2012_02_03_08_00_00.h5\n", 177 | "Downloading 7D.G03A ...\n", 178 | "station 7D.G03A --> seismic channel: BHZ\n", 179 | " downsamping from 50 to 10\n", 180 | " removing response using inv for 
7D.G03A.BHZ\n", 181 | " downloaded 7D.G03A.BHZ in 3.4130070209503174 seconds.\n", 182 | " saving to ../data/raw_test/2012_02_03_08_00_00T2012_02_03_16_00_00.h5\n", 183 | "Downloading 7D.G03A ...\n", 184 | "station 7D.G03A --> seismic channel: BHZ\n", 185 | " downsamping from 50 to 10\n", 186 | " removing response using inv for 7D.G03A.BHZ\n", 187 | " downloaded 7D.G03A.BHZ in 3.534677028656006 seconds.\n", 188 | " saving to ../data/raw_test/2012_02_03_16_00_00T2012_02_04_00_00_00.h5\n", 189 | "Downloading 7D.G03A ...\n", 190 | "station 7D.G03A --> seismic channel: BHZ\n", 191 | " downsamping from 50 to 10\n", 192 | " removing response using inv for 7D.G03A.BHZ\n", 193 | " downloaded 7D.G03A.BHZ in 3.6696767807006836 seconds.\n", 194 | " saving to ../data/raw_test/2012_02_04_00_00_00T2012_02_04_08_00_00.h5\n", 195 | "Downloading 7D.G03A ...\n", 196 | "station 7D.G03A --> seismic channel: BHZ\n", 197 | " downsamping from 50 to 10\n", 198 | " removing response using inv for 7D.G03A.BHZ\n", 199 | " downloaded 7D.G03A.BHZ in 3.5089330673217773 seconds.\n", 200 | " saving to ../data/raw_test/2012_02_04_08_00_00T2012_02_04_16_00_00.h5\n", 201 | "Downloading 7D.G03A ...\n", 202 | "station 7D.G03A --> seismic channel: BHZ\n", 203 | " downsamping from 50 to 10\n", 204 | " removing response using inv for 7D.G03A.BHZ\n", 205 | " downloaded 7D.G03A.BHZ in 3.5488531589508057 seconds.\n", 206 | " saving to ../data/raw_test/2012_02_04_16_00_00T2012_02_05_00_00_00.h5\n" 207 | ] 208 | } 209 | ], 210 | "metadata": {} 211 | }, 212 | { 213 | "cell_type": "code", 214 | "execution_count": null, 215 | "source": [], 216 | "outputs": [], 217 | "metadata": {} 218 | } 219 | ], 220 | "metadata": { 221 | "@webio": { 222 | "lastCommId": null, 223 | "lastKernelId": null 224 | }, 225 | "kernelspec": { 226 | "display_name": "Python 3", 227 | "language": "python", 228 | "name": "python3" 229 | }, 230 | "language_info": { 231 | "codemirror_mode": { 232 | "name": "ipython", 233 | "version": 3 234 | }, 235 | "file_extension": ".py", 236 | "mimetype": "text/x-python", 237 | "name": "python", 238 | "nbconvert_exporter": "python", 239 | "pygments_lexer": "ipython3", 240 | "version": "3.7.6" 241 | } 242 | }, 243 | "nbformat": 4, 244 | "nbformat_minor": 4 245 | } -------------------------------------------------------------------------------- /notebooks/embededfigs/JaniszewskiGJI2019Fig5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xtyangpsp/SeisGo/e989041ec6297187878050e11ffeb9b77c7126a2/notebooks/embededfigs/JaniszewskiGJI2019Fig5.png -------------------------------------------------------------------------------- /scripts/EQDownload.py: -------------------------------------------------------------------------------- 1 | 2 | import pandas as pd 3 | from obspy.geodetics.base import locations2degrees 4 | from obspy.taup import TauPyModel 5 | from seisgo.downloaders import download 6 | from seisgo.downloaders import get_sta_list 7 | import numpy as np 8 | from obspy import read_events 9 | import time 10 | 11 | 12 | 13 | def sta_travel_time(event_lat, event_long, sta_lat, sta_long, depth_km): 14 | sta_t = locations2degrees(event_lat, event_long, sta_lat, sta_long) 15 | taup = TauPyModel(model="iasp91") 16 | arrivals = taup.get_travel_times(source_depth_in_km=depth_km,distance_in_degree=sta_t) 17 | travel_time = arrivals[0].time 18 | return travel_time 19 | 20 | # ========================================================================= 21 | # Download Catalog 
22 | # ========================================================================= 23 | 24 | # Catalog params 25 | start = "2011-03-11" 26 | end = "2011-03-15" 27 | minlat = 32 28 | maxlat = 41 29 | minlon = 35 30 | maxlon = 146 31 | minmag = 8 32 | maxmag = 9 33 | 34 | events = [] 35 | magstep = 1.0 36 | maglist = np.arange(minmag, maxmag + 0.5 * magstep, magstep) 37 | for i in range(len(maglist) - 1): 38 | if i > 0: 39 | minM = str(maglist[i] + 0.0001) # to avoid duplicates with intersecting magnitude thresholds 40 | else: 41 | minM = str(maglist[i]) 42 | maxM = str(maglist[i + 1]) 43 | 44 | print(minM, maxM) 45 | 46 | URL = "http://isc-mirror.iris.washington.edu/fdsnws/event/1/query?starttime=" + \ 47 | start + "&endtime=" + end + "&minmagnitude=" + minM + "&maxmagnitude=" + maxM + "&minlatitude=" + \ 48 | str(minlat) + "&maxlatitude=" + str(maxlat) + "&minlongitude=" + str(minlon) + "&maxlongitude=" + str(maxlon) + "" 49 | 50 | event = read_events(URL) 51 | print("Catalog: " + str(event.count())) 52 | events.append(event) 53 | print("Download loop #" + str(i) + " complete") 54 | 55 | catalog = events[0] 56 | for i in range(1, len(events)): # events[0] is already in catalog; append the remaining magnitude slices 57 | catalog.extend(events[i]) 58 | 59 | 60 | 61 | 62 | # ========================================================================= 63 | 64 | #Station list parameters 65 | fname = None 66 | chan_list = ["*"] 67 | net_list = ["*"] 68 | sta_list = ["*"] 69 | lamin,lamax,lomin,lomax= 27.372,46.073,126.563,150.82 70 | max_tries = 10 71 | source = 'IRIS' 72 | maxseischan = 3 73 | window_len = 120 74 | 75 | # Downloading parameters 76 | freqmin = 0.01 77 | freqmax = 100 78 | samp_freq= 200 79 | sta_fname = None 80 | download_dir= "../../EQData" 81 | 82 | for event in catalog: 83 | 84 | # Get station list using seisgo.downloaders.get_sta_list() 85 | event_t = event.origins[0].time 86 | event_long = event.origins[0].longitude 87 | event_lat = event.origins[0].latitude 88 | depth_km = event.origins[0].depth / 1000 89 | edatetime = event_t + window_len 90 | sdatetime = event_t.datetime 91 | edatetime = edatetime.datetime 92 | sta_list = get_sta_list(fname, net_list, sta_list, chan_list, 93 | sdatetime, edatetime, maxseischan, source='IRIS', 94 | lamin=lamin, lamax=lamax, lomin=lomin, lomax=lomax) 95 | 96 | try: 97 | sta_list.to_csv(sta_fname+str(event.origins[0].time)+str(event.magnitudes[0].mag)) # DataFrame.to_csv (pd.write_csv does not exist); requires sta_fname to be set 98 | except: 99 | pass 100 | sta_list = sta_list.assign(starttime=np.nan, endtime=np.nan) 101 | # Travel time correction using function defined in script 102 | for i, sta in enumerate(sta_list.iterrows()): 103 | sta_lat = sta_list.iloc[i].latitude 104 | sta_long = sta_list.iloc[i].longitude 105 | travel_time = sta_travel_time(event_lat, event_long, sta_lat, sta_long, depth_km) 106 | sta_list.at[i, 'starttime'] = event_t + travel_time 107 | sta_list.at[i, 'endtime'] = sta_list.starttime[i] + window_len 108 | # Download for each station 109 | # makes 1 asdf file per earthquake event, and appends new data 110 | # saves event info and station info with each trace 111 | 112 | download(rawdatadir=download_dir, starttime=sta_list.starttime[i], endtime=sta_list.endtime[i], 113 | network=sta_list.network[i], station=sta_list.station[i], 114 | channel=sta_list.channel[i], source=source, max_tries=max_tries, 115 | samp_freq=samp_freq, freqmax=freqmax, freqmin=freqmin, 116 | event=event) 117 | -------------------------------------------------------------------------------- /scripts/ExtractEGFsFromNoise/1_download_MPI.py: 
-------------------------------------------------------------------------------- 1 | import sys 2 | import obspy 3 | import os 4 | import time 5 | import numpy as np 6 | import pandas as pd 7 | from mpi4py import MPI 8 | from seisgo.utils import split_datetimestr 9 | from seisgo import downloaders,noise 10 | ######################################################### 11 | ################ PARAMETER SECTION ###################### 12 | ######################################################### 13 | tt0=time.time() 14 | 15 | # paths and filenames 16 | rootpath = "tempdata_craton" # root path for the project 17 | direc = os.path.join(rootpath,'Raw') # where to store the downloaded data 18 | #if not os.path.isdir(direc): os.mkdir(direc) 19 | down_list = os.path.join(direc,'station.txt') 20 | # CSV file for station location info 21 | 22 | # download parameters 23 | source='IRIS' # client/data center. see https://docs.obspy.org/packages/obspy.clients.fdsn.html for a list 24 | max_tries = 3 #maximum number of tries when downloading, in case the server returns errors. 25 | use_down_list = False # download stations from a pre-compiled list or not 26 | flag = True # print progress when running the script; recommended at the beginning 27 | samp_freq = 5 # targeted sampling rate at X samples per second 28 | rmresp = True 29 | rmresp_out = 'VEL' 30 | pressure_chan = [None] #Added by Xiaotao Yang. This is needed when downloading some special channels, e.g., pressure data. VEL output for these channels. 31 | respdir = os.path.join(rootpath,'resp') # directory where resp files are located (required if rm_resp is neither 'no' nor 'inv') 32 | freqmin = 0.002 # pre-filtering frequency bandwidth 33 | freqmax = 0.5*samp_freq 34 | # note this cannot exceed the Nyquist freq 35 | 36 | # targeted region/station information: only needed when use_down_list is False 37 | lamin,lamax,lomin,lomax= 34,48,-98,-78.5 # regional box: min lat, max lat, min lon, max lon 38 | chan_list = ["BHZ","HHZ"] 39 | net_list = ["XO","7A","CN","IM","IU","LD","LM","MU","N4","NM","OH","OK","PE","PN","SS","US","XI","Z9","ZL"] # network list 40 | sta_list = ["*"] # station (using a station list is way easier compared to specifying stations one by one) 41 | start_date = "2013_09_28_0_0_0" # start date of download 42 | end_date = "2014_01_01_0_0_0" # end date of download 43 | inc_hours = 12 # length of data for each request (in hour) 44 | maxseischan = 1 # the maximum number of seismic channels, excluding pressure channels for OBS stations. 
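# Note: only vertical channels are requested above (chan_list), so one component per station is assumed in the memory estimate below.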
45 | ncomp = maxseischan #len(chan_list) 46 | 47 | # get a rough estimate of memory needs to ensure it does not blow up during noise cross-correlations 48 | cc_len = 2*3600 # basic unit of data length for fft (s) 49 | step = 1*3600 # overlapping between each cc_len (s) 50 | MAX_MEM = 5.0 # maximum memory allowed per core in GB 51 | 52 | 53 | ################################################## 54 | # we expect no parameters need to be changed below 55 | # assemble parameters used for pre-processing waveforms in downloading 56 | prepro_para = {'rmresp':rmresp,'rmresp_out':rmresp_out,'respdir':respdir,'freqmin':freqmin,'freqmax':freqmax,\ 57 | 'samp_freq':samp_freq} 58 | 59 | downlist_kwargs = {"source":source, 'net_list':net_list, "sta_list":sta_list, "chan_list":chan_list, \ 60 | "starttime":start_date, "endtime":end_date, "maxseischan":maxseischan, "lamin":lamin, "lamax":lamax, \ 61 | "lomin":lomin, "lomax":lomax, "pressure_chan":pressure_chan, "fname":down_list} 62 | 63 | ######################################################## 64 | #################DOWNLOAD SECTION####################### 65 | ######################################################## 66 | #--------MPI--------- 67 | comm = MPI.COMM_WORLD 68 | rank = comm.Get_rank() 69 | size = comm.Get_size() 70 | 71 | if rank==0: 72 | if flag: 73 | print('station.list selected [%s] for data from %s to %s with %sh interval'%(use_down_list,start_date,end_date,inc_hours)) 74 | 75 | if not os.path.isdir(direc):os.makedirs(direc) 76 | if use_down_list: 77 | stalist=pd.read_csv(down_list) 78 | else: 79 | stalist=downloaders.get_sta_list(**downlist_kwargs) # saves station list to "down_list" file 80 | # here, file name is "station.txt" 81 | # rough estimation on memory needs (assume float32 dtype) 82 | memory_size=noise.cc_memory(inc_hours,samp_freq,len(stalist.station),ncomp,cc_len,step) 83 | if memory_size > MAX_MEM: 84 | raise ValueError('Require %5.3fG memory but only %5.3fG provided! Reduce inc_hours to avoid this issue!' % (memory_size,MAX_MEM)) 85 | 86 | # save parameters for future reference 87 | metadata = os.path.join(direc,'download_info.txt') 88 | fout = open(metadata,'w') 89 | fout.write(str({**prepro_para,**downlist_kwargs,'inc_hours':inc_hours,'ncomp':ncomp}));fout.close() 90 | 91 | all_chunk = split_datetimestr(start_date,end_date,inc_hours) 92 | if len(all_chunk)<1: 93 | raise ValueError('Abort! 
no data chunk between %s and %s' % (start_date,end_date)) 94 | splits = len(all_chunk)-1 95 | else: 96 | splits,all_chunk = [None for _ in range(2)] 97 | 98 | # broadcast the variables 99 | splits = comm.bcast(splits,root=0) 100 | all_chunk = comm.bcast(all_chunk,root=0) 101 | extra = splits % size 102 | 103 | # MPI: loop through each time chunk 104 | for ick in range(rank,splits,size): 105 | s1= all_chunk[ick] 106 | s2=all_chunk[ick+1] 107 | 108 | download_kwargs = {"source":source,"rawdatadir": direc, "starttime": s1, "endtime": s2, \ 109 | "stationinfo": down_list,**prepro_para} 110 | 111 | # Download for ick 112 | downloaders.download(**download_kwargs) 113 | 114 | tt1=time.time() 115 | print('downloading step takes %6.2f s' %(tt1-tt0)) 116 | 117 | comm.barrier() 118 | #if rank == 0: 119 | # sys.exit() 120 | -------------------------------------------------------------------------------- /scripts/ExtractEGFsFromNoise/2_xcorr_MPI.py: -------------------------------------------------------------------------------- 1 | import sys,time,os, glob 2 | from mpi4py import MPI 3 | from seisgo import noise, utils 4 | 5 | if not sys.warnoptions: 6 | import warnings 7 | warnings.simplefilter("ignore") 8 | 9 | # absolute path parameters 10 | rootpath = "test_data" # root path for this data processing 11 | CCFDIR = os.path.join(rootpath,'CCF_test') # dir to store CC data 12 | DATADIR = os.path.join(rootpath,'raw_data') # dir where noise data is located 13 | locations = os.path.join(rootpath,'station.txt') # station info including network,station,channel,latitude,longitude,elevation: only needed when input_fmt is not asdf 14 | 15 | # some control parameters 16 | freq_norm = 'rma' # 'no' for no whitening, or 'rma' for running-mean average, 'phase' for sign-bit normalization in freq domain 17 | time_norm = 'no' # 'no' for no normalization, or 'rma', 'one_bit' for normalization in time domain 18 | cc_method = 'xcorr' # 'xcorr' for pure cross correlation, 'deconv' for deconvolution; FOR "COHERENCY" PLEASE set freq_norm to "rma" and time_norm to "no" 19 | acorr_only = False # only perform auto-correlation 20 | xcorr_only = True # only perform cross-correlation or not 21 | correct_orientation = True # If True, correct orientations for horizontal channels and convert 1/2 to N/E channels 22 | rotate_raw = True # If True, rotate to the RTZ coordinate system. correct_orientation will be automatically set to True 23 | max_time_diff = 60 # (unit: datapoints) For padding horizontal traces to match sizes when doing orient correction. If None, default is 10 data points. 24 | exclude_chan = [] # Added by Xiaotao Yang. Channels in this list will be skipped. 25 | 26 | channel_pairs = ['ZZ','TT'] ### !!! IMPORTANT !!! Less than 3 pairs are strongly recommended for efficiency. Setting to 'None' will do all cross-correlations 27 | 28 | verbose = True 29 | output_structure="source" # How the output files will be named. Acceptable formats are: "raw", "source", "station-pair", "station-component-pair", 30 | # pre-processing parameters 31 | cc_len = 3600*2 # basic unit of data length for fft (sec) 32 | step = 3600*0.5 # (1-overlapping) between each cc_len (sec). The smaller, the more chunks 33 | smooth_N = 150 # Important. This value should be at least the max period of interest. 
Half moving window length for time/freq domain normalization if selected (points) 34 | 35 | # cross-correlation parameters 36 | maxlag = 1500 # lags of cross-correlation to save (sec) 37 | substack = False # sub-stack daily cross-correlation or not 38 | substack_len = cc_len # how long to stack over (for monitoring purpose): needs to be a multiple of cc_len 39 | smoothspect_N = 20 # moving window length to smooth spectrum amplitude (points) 40 | 41 | # criteria for data selection 42 | max_over_std = 10 # threshold to remove windows with tremor signals: set it to a very large value (e.g., 10**9) if you prefer not to remove them 43 | max_kurtosis = 10 # max kurtosis allowed, TO BE ADDED! 44 | 45 | # load useful download info if start from ASDF 46 | dfile = os.path.join(DATADIR,'download_info.txt') 47 | down_info = eval(open(dfile).read()) 48 | freqmin = down_info['freqmin'] 49 | freqmax = down_info['freqmax'] 50 | ################################################## 51 | # we expect no parameters need to be changed below 52 | #--------MPI--------- 53 | comm = MPI.COMM_WORLD 54 | rank = comm.Get_rank() 55 | size = comm.Get_size() 56 | 57 | if rank == 0: 58 | # make a dictionary to store all variables: also for later cc 59 | fc_para={'cc_len':cc_len,'step':step, 60 | 'freqmin':freqmin,'freqmax':freqmax,'freq_norm':freq_norm,'time_norm':time_norm, 61 | 'cc_method':cc_method,'smooth_N':smooth_N,'substack':substack,'substack_len':substack_len, 62 | 'smoothspect_N':smoothspect_N,'maxlag':maxlag,'max_over_std':max_over_std, 63 | 'max_kurtosis':max_kurtosis,'channel_correction':correct_orientation,'rotate_raw':rotate_raw,'channel_pairs':channel_pairs} 64 | # save fft metadata for future reference 65 | fc_metadata = os.path.join(CCFDIR,'fft_cc_data.txt') 66 | if not os.path.isdir(CCFDIR):os.makedirs(CCFDIR) 67 | # save metadata 68 | fout = open(fc_metadata,'w') 69 | fout.write(str(fc_para));fout.close() 70 | 71 | # set variables to broadcast 72 | tdir = sorted(glob.glob(os.path.join(DATADIR,'*.h5'))) 73 | 74 | nchunk = len(tdir) 75 | splits = nchunk 76 | 77 | if nchunk==0: raise IOError('Abort! no available seismic files for FFT') 78 | 79 | else: 80 | splits,tdir = [None for _ in range(2)] 81 | 82 | # broadcast the variables 83 | splits = comm.bcast(splits,root=0) 84 | tdir = comm.bcast(tdir,root=0) 85 | #loop through all data files. 86 | for ick in range(rank,splits,size): 87 | sfile=tdir[ick] 88 | t10=time.time() 89 | #call the correlation wrapper. 
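# Sketch of what the call below does (inferred from the parameters above and the timing prints that follow): do_correlation reads one ASDF time chunk (sfile), cross-correlates the requested channel_pairs for all station pairs, and writes CorrData to outdir=CCFDIR following output_structure; the returned total_t holds the per-stage timings printed when rotate_raw is True.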
90 | ndata,total_t=noise.do_correlation(sfile,cc_len,step,maxlag,channel_pairs,correct_orientation=correct_orientation,rotate_raw=rotate_raw,cc_method=cc_method, 91 | acorr_only=acorr_only,xcorr_only=xcorr_only,substack=substack, 92 | smoothspect_N=smoothspect_N,substack_len=substack_len, 93 | maxstd=max_over_std,freqmin=freqmin,freqmax=freqmax, 94 | time_norm=time_norm,freq_norm=freq_norm,smooth_N=smooth_N,max_time_diff=max_time_diff, 95 | exclude_chan=exclude_chan,outdir=CCFDIR,v=verbose,output_structure=output_structure) 96 | 97 | t11 = time.time() 98 | print('it takes %6.5fs to process the chunk of %s' % (t11-t10,sfile.split('/')[-1])) 99 | if rotate_raw: 100 | print('it takes %6.5fs to assemble raw source data' % (total_t[1])) 101 | print('it takes %6.5fs to assemble raw receiver data' % (total_t[5])) 102 | print('it takes %6.5fs to do rotation' % (total_t[2])) 103 | print('it takes %6.5fs to prepare data for FFT' % (total_t[3])) 104 | print('it takes %6.5fs to compute FFT' % (total_t[4])) 105 | 106 | #comm.barrier() 107 | 108 | # merge all path_array and output 109 | #if rank == 0: 110 | # sys.exit() 111 | -------------------------------------------------------------------------------- /scripts/ExtractEGFsFromNoise/3_merge_pairs_bysources_MPI.py: -------------------------------------------------------------------------------- 1 | import sys,time,os, glob 2 | from multiprocessing import Pool 3 | from seisgo import noise,utils 4 | if not sys.warnoptions: 5 | import warnings 6 | warnings.simplefilter("ignore") 7 | 8 | ''' 9 | Stacking script of SeisGo to: 10 | 1) load cross-correlation data for each station pair 11 | 2) merge all time chunks 12 | 3) save outputs in ASDF; 13 | 14 | USAGE: python merge_pairs_bysources_MPI.py [nproc source] 15 | ''' 16 | ######################################## 17 | #########PARAMETER SECTION############## 18 | ######################################## 19 | #get arguments on the number of processors 20 | 21 | # absolute path parameters 22 | def merge_wrapper(ccfiles,pair,outdir): 23 | to_egf = True #convert CCF to empirical Green's functions when merging 24 | stack = True 25 | stack_method = "robust" 26 | stack_win_len = 15*24*3600 27 | flag = True 28 | split = True 29 | taper = True 30 | taper_frac=0.005 31 | taper_maxlen=20 32 | noise.merge_pairs(ccfiles,pair,outdir=outdir,verbose=flag,to_egf=to_egf,stack=stack, 33 | stack_method=stack_method,stack_win_len=stack_win_len,split=split, 34 | taper=taper,taper_frac=taper_frac,taper_maxlen=taper_maxlen) 35 | return 0 36 | 37 | def main(): 38 | narg=len(sys.argv) 39 | selected_source=None 40 | if narg == 1: 41 | nproc=1 42 | elif narg == 2: 43 | nproc=int(sys.argv[1]) 44 | elif narg == 3: 45 | nproc=int(sys.argv[1]) 46 | selected_source=str(sys.argv[2]) 47 | 48 | ## Global parameters 49 | rootpath = "data_craton" # root path for this data processing 50 | CCFDIR = os.path.join(rootpath,'CCF_SOURCES') # dir where CC data is stored 51 | MERGEDIR = os.path.join(rootpath+'_depot','PAIRS_TWOSIDES') # dir where merged data will be saved 52 | if not os.path.isdir(MERGEDIR):os.makedirs(MERGEDIR) 53 | 54 | if selected_source is not None: 55 | selected_source = [os.path.join(CCFDIR,selected_source)] 56 | ####################################### 57 | ###########PROCESSING SECTION########## 58 | ####################################### 59 | 60 | #loop through sources; for each source, parallelize over station pairs. 
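# Expected input layout (the "bysources" convention described in the README): CCFDIR holds one sub-directory per virtual source (net.sta), each containing that source's per-chunk xcorr *.h5 files; the loop below treats each sub-directory as one source.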
61 | sources_temp=utils.get_filelist(CCFDIR) 62 | #exclude non-directory items in the list 63 | sources=[] 64 | for src in sources_temp: 65 | if os.path.isdir(src): sources.append(src) 66 | if selected_source is not None and selected_source[0] not in sources: 67 | raise ValueError(selected_source[0]+" is not found") 68 | if selected_source is None: 69 | selected_source = sources 70 | for src in selected_source: 71 | tt0=time.time() 72 | 73 | # cross-correlation files 74 | ccfiles = utils.get_filelist(src,"h5") 75 | print("assembled %d files"%(len(ccfiles))) 76 | pairs_all,netsta_all=noise.get_stationpairs(ccfiles,False) 77 | print("found %d station pairs for %d stations"%(len(pairs_all),len(netsta_all))) 78 | 79 | #loop for each station pair 80 | print("working on all pairs with %d processors."%(nproc)) 81 | if nproc < 2: 82 | results=merge_wrapper(ccfiles,pairs_all,MERGEDIR) 83 | else: 84 | p=Pool(int(nproc)) 85 | results=p.starmap(merge_wrapper,[(ccfiles,pair,MERGEDIR) for pair in pairs_all]) 86 | p.close() 87 | del results 88 | 89 | 90 | print('it takes %6.2fs to merge %s' % (time.time()-tt0,src)) 91 | # 92 | if __name__ == "__main__": 93 | main() 94 | -------------------------------------------------------------------------------- /scripts/ExtractEGFsFromNoise/4_split_sides_bysources_MPI.py: -------------------------------------------------------------------------------- 1 | import sys,time,os 2 | from multiprocessing import Pool 3 | from seisgo import noise,utils 4 | if not sys.warnoptions: 5 | import warnings 6 | warnings.simplefilter("ignore") 7 | 8 | ''' 9 | Stacking script of SeisGo to: 10 | 1) split the negative and positive sides after merging station pairs. 11 | 12 | ''' 13 | ######################################## 14 | #########PARAMETER SECTION############## 15 | ######################################## 16 | #get arguments on the number of processors 17 | 18 | # absolute path parameters 19 | def split_sides_wrapper(ccfile,outdir): 20 | taper = True 21 | taper_frac=0.01 22 | taper_maxlen=10 23 | flag = True 24 | print(ccfile) 25 | noise.split_sides(ccfile,outdir=outdir,taper=taper,taper_frac=taper_frac, 26 | taper_maxlen=taper_maxlen,verbose=flag) 27 | return 0 28 | 29 | def main(): 30 | narg=len(sys.argv) 31 | if narg == 1: 32 | nproc=1 33 | else: 34 | nproc=int(sys.argv[1]) 35 | 36 | ## Global parameters 37 | rootpath = "data_craton" # root path for this data processing 38 | MERGEDIR = os.path.join(rootpath,'MERGED_PAIRS') # dir where merged pairs from step 3 are stored (input) 39 | SPLITDIR = os.path.join(rootpath,'PAIRS_SPLIT_SIDES') # dir where the split-side outputs will be saved 40 | 41 | if not os.path.isdir(SPLITDIR):os.makedirs(SPLITDIR) 42 | ####################################### 43 | ###########PROCESSING SECTION########## 44 | ####################################### 45 | 46 | #loop through sources; for each source, process its files in parallel. 
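# As in step 3, MERGEDIR is expected to contain one folder per virtual source (net.sta) holding merged-pair *.h5 files; the outputs go to a matching folder under SPLITDIR.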
47 | sources_temp=utils.get_filelist(MERGEDIR) 48 | #exclude non-directory items in the list 49 | sources=[] 50 | for src in sources_temp: 51 | if os.path.isdir(src): sources.append(src) 52 | if nproc >=2: 53 | p=Pool(int(nproc)) 54 | for src in sources: 55 | tt0=time.time() 56 | 57 | # cross-correlation files 58 | ccfiles = utils.get_filelist(src,"h5") 59 | print("assembled %d files"%(len(ccfiles))) 60 | outdir=os.path.join(SPLITDIR,os.path.split(src)[1]) 61 | #loop for each station pair 62 | print("working on all pairs with %d processors."%(nproc)) 63 | if nproc < 2: 64 | for j in range(len(ccfiles)): 65 | results=split_sides_wrapper(ccfiles[j],outdir) 66 | else: 67 | results=p.starmap(split_sides_wrapper,[(ccfile,outdir) for ccfile in ccfiles]) 68 | del results 69 | 70 | print('it takes %6.2fs to split %s' % (time.time()-tt0,src)) 71 | # 72 | if nproc >=2: 73 | p.close() 74 | if __name__ == "__main__": 75 | main() 76 | -------------------------------------------------------------------------------- /scripts/ExtractEGFsFromNoise/README.md: -------------------------------------------------------------------------------- 1 | This folder contains scripts to extract EGFs with SeisGo. 2 | 3 | Steps to extract EGFs: 4 | 1. Download data. Script: 1_download_MPI.py 5 | 2. Compute xcorr. Script: 2_xcorr_MPI.py 6 | 3. Merge station pairs. Script: 3_merge_pairs_bysources_MPI.py 7 | 4. Split two sides (negative and positive). Script: 4_split_sides_bysources_MPI.py 8 | 9 | The above four steps produce the base data for FWANT. The following step creates the shaped data with a specific wavelet and should be run only after the simulation parameters (particularly the source time function) are finalized. 10 | 11 | Step 3, when merging, has the option of splitting; step 4 is then not needed if the data has already been split. 12 | 13 | All of these scripts can be run on a local computer or on a cluster. The tag "MPI" means the scripts are written for parallel runs with "mpirun"; the syntax can be found in the shell scripts, and an example command sequence is sketched in the comments of submit_download.sh below. When running on a cluster, the shell scripts in this folder are used to submit the jobs; their file names indicate which step they launch. 14 | 15 | The tag "bysources" means the files (input or output) are organized by folders named with the virtual source (i.e., net.station). This is necessary for processing large datasets. 16 | 17 | Please change the parameters for each step of each project. -------------------------------------------------------------------------------- /scripts/ExtractEGFsFromNoise/submit_download.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | #SBATCH -J dld #job name to remember 3 | #SBATCH -n 5 #number of CPU cores you request for the job 4 | #SBATCH -A xtyang #queue to submit the job, our lab queue. 5 | #SBATCH --mem-per-cpu 5000 #requested memory per CPU 6 | #SBATCH -t 7-10:00 #requested time day-hour:minute 7 | #SBATCH -o %x.out #path and name to save the output file. 8 | #SBATCH -e %x.err #path to save the error file. 9 | 10 | module purge #clean up the modules 11 | module load rcac #reload rcac modules. 
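# Example of running the whole workflow without a scheduler (the process counts
# are only examples; steps 3 and 4 take the number of processes as a
# command-line argument instead of using mpirun):
#   mpirun -n 4 python 1_download_MPI.py
#   mpirun -n 4 python 2_xcorr_MPI.py
#   python 3_merge_pairs_bysources_MPI.py 4
#   python 4_split_sides_bysources_MPI.py 4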
12 | module use /depot/xtyang/etc/modules #load conda module 13 | module load conda-env/seisgo-py3.7.6 #let every core activate the environment before running the job 14 | 15 | mpirun -n $SLURM_NTASKS python 1_download_MPI.py 16 | #run line, change file name 17 | -------------------------------------------------------------------------------- /scripts/ExtractEGFsFromNoise/submit_merge_pairs_bysources_MPI.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | #SBATCH -J mgP #job name to remember 3 | #SBATCH -n 30 #number of CPU cores you request for the job 4 | #SBATCH -A xtyang #queue to submit the job 5 | #SBATCH --mem-per-cpu 16000 #requested memory per CPU 6 | #SBATCH -t 4-0:00 #requested time day-hour:minute 7 | #SBATCH -o %x.out #path and name to save the output file 8 | #SBATCH -e %x.err #path to save the error file 9 | 10 | module purge #clean up the modules 11 | module load rcac #reload rcac modules. 12 | module use /depot/xtyang/etc/modules 13 | module load conda-env/seisgo-py3.7.6 14 | 15 | python 3_merge_pairs_bysources_MPI.py $SLURM_NTASKS 16 | -------------------------------------------------------------------------------- /scripts/ExtractEGFsFromNoise/submit_split_sides.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | #SBATCH -J split #job name to remember 3 | #SBATCH -n 40 #number of CPU cores you request for the job 4 | #SBATCH -A xtyang #queue to submit the job 5 | #SBATCH --mem-per-cpu 2000 #requested memory per CPU 6 | #SBATCH -t 2-0:00 #requested time day-hour:minute 7 | #SBATCH -o %x.out #path and name to save the output file 8 | #SBATCH -e %x.err #path to save the error file 9 | 10 | module purge #clean up the modules 11 | module load rcac #reload rcac modules. 
12 | module use /depot/xtyang/etc/modules 13 | module load conda-env/seisgo-py3.7.6 14 | 15 | python 4_split_sides_bysources_MPI.py $SLURM_NTASKS 16 | -------------------------------------------------------------------------------- /scripts/ExtractEGFsFromNoise/submit_xcorr_MPI.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | #SBATCH -J xc #job name to remember 3 | #SBATCH -n 30 #number of CPU cores you request for the job 4 | #SBATCH -A xtyang #queue to submit the job 5 | #SBATCH --mem-per-cpu 16000 #requested memory per CPU 6 | #SBATCH -t 5-0:00 #requested time day-hour:minute 7 | #SBATCH -o %x.out #path and name to save the output file 8 | #SBATCH -e %x.err #path to save the error file 9 | #module --force purge 10 | 11 | module load rcac 12 | module use /depot/xtyang/etc/modules 13 | module load conda-env/seisgo-py3.7.6 14 | 15 | mpirun -n $SLURM_NTASKS python 2_xcorr_MPI.py 16 | -------------------------------------------------------------------------------- /scripts/OBS_orientation_Cascadia.csv: -------------------------------------------------------------------------------- 1 | net,station,orientation_ch1,orientation_ch2,error,Num_measurements 2 | 7D,G03A,154,244,5,4 3 | 7D,G30A,140,230,2,5 4 | 7D,J06A,225,315,6,5 5 | 7D,J23A,114,204,11,5 6 | 7D,J28A,206,296,12,4 7 | 7D,J29A,74,164,13,7 8 | 7D,J30A,69,159,9,8 9 | 7D,J31A,5,95,11,7 10 | 7D,J37A,264,354,9,11 11 | 7D,J38A,101,191,14,7 12 | 7D,J39A,323,53,13,9 13 | 7D,J45A,12,102,10,13 14 | 7D,J46A,254,344,14,4 15 | 7D,J47A,247,337,14,9 16 | 7D,J52A,253,343,11,11 17 | 7D,J53A,149,239,13,2 18 | 7D,J54A,27,117,9,6 19 | 7D,J55A,211,301,12,4 20 | 7D,J61A,257,347,6,6 21 | 7D,J63A,165,255,8,3 22 | 7D,J67A,138,228,5,5 23 | 7D,J68A,158,248,20,4 24 | 7D,FN07A,118,208,0,12 25 | 7D,FN08A,128,218,55,4 26 | 7D,FN14A,347,77,44,17 27 | 7D,FN16A,201,291,22,15 28 | 7D,FN18A,296,26,56,8 29 | 7D,J41A,60,150,33,11 30 | 7D,J42A,83,173,5,5 31 | 7D,J49A,291,21,13,14 32 | 7D,J50A,93,183,23,65 33 | 7D,J51A,133,223,27,26 34 | 7D,J58A,190,280,13,6 35 | 7D,J59A,109,199,11,15 36 | 7D,M03A,161,251,13,11 37 | 7D,J25A,145,235,82,2 38 | 7D,J33A,12,102,24,5 39 | 7D,J35A,202,292,8,23 40 | 7D,J36A,37,127,6.9,16 41 | 7D,J43A,81.5,171.5,7.4,24 42 | 7D,J44A,207.5,297.5,9.4,9 43 | 7D,J57A,157,247,79,2 44 | 7D,J65A,51,141,12,2 45 | 7D,J73A,16,106,32,2 46 | 7D,M01A,100,190,25,5 47 | 7D,M07A,91,181,14,18 48 | 7D,M08A,8,98,7,3 49 | 7D,J48A,NaN,NaN,NaN,NaN 50 | 7D,FN12A,312,42,0,1 51 | 7D,J26A,NaN,NaN,NaN,NaN 52 | 7D,J24A,NaN,NaN,NaN,NaN 53 | 7D,J34A,NaN,NaN,NaN,NaN 54 | 7D,M06A,NaN,NaN,NaN,NaN 55 | 7D,M02A,NaN,NaN,NaN,NaN 56 | 7D,FN05A,6,96,0,1 57 | 7D,FS02B,78,168,60,6 58 | 7D,FS04B,151,241,12,12 59 | 7D,FS07B,30,120,0,1 60 | 7D,FS11B,202,292,29,3 61 | 7D,FS12B,25,115,0,1 62 | 7D,FS13B,341,71,30,5 63 | 7D,FS15B,141,231,37,4 64 | 7D,FS18B,47,137,67,4 65 | 7D,FS19B,31,121,0,1 66 | 7D,FS20B,0,90,19,9 67 | 7D,G09B,220,310,12,5 68 | 7D,G17B,324,54,60,10 69 | 7D,G18B,221,311,0,1 70 | 7D,G25B,334,64,3,5 71 | 7D,G26B,316,46,50,2 72 | 7D,G34B,151,241,17,13 73 | 7D,J17B,87,177,0,1 74 | 7D,J25B,99,189,81,7 75 | 7D,J33B,10,100,37,10 76 | 7D,M09B,98,188,4,13 77 | 7D,FS01B,346,76,15,23 78 | 7D,FS14B,164,254,51,2 79 | 7D,G02B,200,290,18,33 80 | 7D,G10B,133,223,16,24 81 | 7D,G12B,209,299,21,8 82 | 7D,G27B,212,302,17,27 83 | 7D,G28B,161,251,15,21 84 | 7D,G37B,102,192,13,40 85 | 7D,J09B,163,253,6,3 86 | 7D,J10B,89,179,10,34 87 | 7D,J18B,80,170,1,2 88 | 7D,J20B,331,61,9,35 89 | 7D,M11B,201,291,19,13 90 | 7D,M12B,235,325,15,6 91 | 
7D,M14B,266,356,13,7 92 | 7D,FS05B,307,37,11,3 93 | 7D,FS06B,331,61,76,4 94 | 7D,G03B,29,119,8,4 95 | 7D,G05B,231,321,11,19 96 | 7D,G11B,39,129,15,28 97 | 7D,G13B,225,315,13,17 98 | 7D,G19B,79,189,14,12 99 | 7D,G20B,3,93,15,28 100 | 7D,G21B,250,340,11,13 101 | 7D,G22B,260,350,9,29 102 | 7D,G29B,329,59,8,15 103 | 7D,G30B,230,320,7,11 104 | 7D,G35B,333,63,13,15 105 | 7D,G36B2,195,285,14,8 106 | 7D,J06B,240,330,13,12 107 | 7D,J11B,321,51,10,19 108 | 7D,J19B,125,215,7,24 109 | 7D,J23B,315,45,9,19 110 | 7D,J27B,292,22,13,18 111 | 7D,J28B,23,113,9,16 112 | 7D,J48B,201,291,8,17 113 | 7D,J63B,145,235,7,11 114 | 7D,BB030,84,174,0,1 115 | 7D,BB060,121,211,0,1 116 | 7D,BB070,304,34,0,1 117 | 7D,BB090,18,108,0,1 118 | 7D,BB120,300,30,0,1 119 | 7D,BB130,246,336,0,1 120 | 7D,BB140,337,67,7,11 121 | 7D,BB150,29,109,0,1 122 | 7D,BB170,211,201,0,1 123 | 7D,BB180,162,252,0,1 124 | 7D,BB200,64,154,0,1 125 | 7D,BB230,158,248,0,1 126 | 7D,BB240,291,21,0,1 127 | 7D,BB260,40,130,0,1 128 | 7D,BB290,59,149,0,1 129 | 7D,BB300,303,33,0,1 130 | 7D,BB320,197,287,0,1 131 | 7D,BB330,63,153,0,1 132 | 7D,BB350,162,252,0,1 133 | 7D,BB370,63,153,0,1 134 | 7D,BB390,59,149,0,1 135 | 7D,BB410,357,87,0,1 136 | 7D,BB420,181,271,0,1 137 | 7D,BB440,303,33,0,1 138 | 7D,BB450,235,325,0,1 139 | 7D,BB480,62,152,0,1 140 | 7D,BB510,297,27,0,1 141 | 7D,BB530,301,31,0,1 142 | 7D,BB540,168,258,0,1 143 | 7D,BB550,230,320,0,1 144 | 7D,FN01C,72,162,65,2 145 | 7D,FN02C,141,231,57,5 146 | 7D,FN03C,26,116,70,38 147 | 7D,FN04C,163,253,55,35 148 | 7D,FN05C,357,87,43,15 149 | 7D,FN07C,171,261,32,7 150 | 7D,FN08C,324,54,71,8 151 | 7D,FN09C,162,252,25,19 152 | 7D,FN10C,245,335,2,2 153 | 7D,FN11C,252,342,74,4 154 | 7D,FN12C,318,48,41,7 155 | 7D,FN13C,76,166,48,10 156 | 7D,FN14C,221,311,80,16 157 | 7D,FN19C,182,272,50,20 158 | 7D,J26C,63,153,64,13 159 | 7D,J34C,325,55,56,10 160 | 7D,J41C,346,76,78,12 161 | 7D,J42C,225,315,54,17 162 | 7D,J49C,96,186,59,23 163 | 7D,J50C,339,69,42,12 164 | 7D,J57C,103,193,77,3 165 | 7D,J58C,115,205,39,14 166 | 7D,J59C,333,63,71,8 167 | 7D,M06C,359,89,29,14 168 | 7D,J25C,215,305,36,5 169 | 7D,J33C,215,305,27,11 170 | 7D,J52C,129,219,26,27 171 | 7D,J53C,301,39,27,31 172 | 7D,J61C,173,263,17,22 173 | 7D,J65C,109,199,79,9 174 | 7D,J68C,141,231,5,24 175 | 7D,J73C,329,49,26,8 176 | 7D,M01C,138,228,6,8 177 | 7D,M02C,88,178,20,2 178 | 7D,M03C,327,67,24,15 179 | 7D,M04C,103,193,21,10 180 | 7D,M05C,194,284,15,9 181 | 7D,M08C,92,182,9,9 182 | 7D,J21C,42,132,32,10 183 | 7D,J23C,147,237,7,10 184 | 7D,J28C,186,276,12,4 185 | 7D,J29C,59,149,29,37 186 | 7D,J30C,65,155,26,29 187 | 7D,J31C,234,324,33,29 188 | 7D,J32C,97,187,3,4 189 | 7D,J35C,333,73,32,18 190 | 7D,J36C,33,123,20,16 191 | 7D,J37C,228,318,22,18 192 | 7D,J38C,245,335,36,17 193 | 7D,J39C,268,358,18,22 194 | 7D,J43C,118,208,28,19 195 | 7D,J44C,303,33,33,19 196 | 7D,J45C,109,199,18,28 197 | 7D,J46C,175,265,17,25 198 | 7D,J47C,219,309,39,21 199 | 7D,J48C,290,20,33,11 200 | 7D,J54C,84,174,14,4 201 | 7D,J55C,127,217,30,11 202 | 7D,J63C,100,190,15,9 203 | 7D,J67C,178,268,8,6 204 | 7D,J69C,37,127,6,7 205 | 7D,FS02D,123,217,55,6 206 | 7D,FS04D,326,56,42,8 207 | 7D,FS06D,67,157,5,7 208 | 7D,FS07D,259,349,6,9 209 | 7D,FS08D,202,292,77,13 210 | 7D,FS09D,263,353,9,12 211 | 7D,FS10D,42,132,6,19 212 | 7D,FS13D,275,5,11,16 213 | 7D,FS16D,80,170,5,8 214 | 7D,FS41D,183,273,16,2 215 | 7D,FS44D,74,164,8,15 216 | 7D,G01D,10,100,11,15 217 | 7D,G03D,139,229,19,28 218 | 7D,G04D,261,351,14,28 219 | 7D,G05D,315,45,25,30 220 | 7D,G09D,284,14,2,7 221 | 7D,G10D,180,270,20,19 222 | 7D,G13D,181,271,21,46 
223 | 7D,G20D,311,41,9,35 224 | 7D,G21D,296,26,10,29 225 | 7D,G22D,171,261,11,35 226 | 7D,G29D,182,272,14,43 227 | 7D,G33D,315,45,8,10 228 | 7D,G35D,52,142,11,35 229 | 7D,G36D,115,205,10,19 230 | 7D,G37D,137,227,24,49 231 | 7D,J11D,98,188,14,26 232 | 7D,J19D,256,346,22,34 233 | 7D,J20D,188,278,17,45 234 | 7D,J27D,28,118,9,28 235 | 7D,J28D,211,301,8,23 236 | 7D,M16D,315,45,7,14 237 | 7D,BB630,125,215,30,78 238 | 7D,BB631,125,115,30,78 239 | 7D,BB640,267,357,16,40 240 | 7D,BB650,147,237,36,18 241 | 7D,BB660,293,23,33,10 242 | 7D,BB670,201,291,25,36 243 | 7D,BB680,33,123,26,32 244 | 7D,BB690,244,334,20,7 245 | 7D,BB700,192,282,13,34 246 | 7D,BB710,118,208,29,71 247 | 7D,BB711,118,208,29,71 248 | 7D,BB720,26,116,19,52 249 | 7D,BB721,26,116,19,52 250 | 7D,BB730,36,126,35,26 251 | 7D,BB740,279,9,17,38 252 | 7D,BB750,NaN,NaN,NaN,NaN 253 | 7D,BB830,300,30,13,27 254 | 7D,BB840,62,152,19,32 255 | 7D,BB850,330,60,23,55 256 | 7D,BB870,230,320,28,57 257 | 7D,GB030,NaN,NaN,NaN,NaN 258 | 7D,GB050,257,347,1,2 259 | 7D,GB080,99,189,47,6 260 | 7D,GB100,146,236,38,42 261 | 7D,GB101,146,236,38,42 262 | 7D,GB111,159,249,28,9 263 | 7D,GB130,205,295,83,7 264 | 7D,GB170,327,57,40,24 265 | 7D,GB171,327,57,40,24 266 | 7D,GB180,186,276,24,20 267 | 7D,GB210,279,9,21,5 268 | 7D,GB220,334,64,81,7 269 | 7D,GB230,NaN,NaN,NaN,NaN 270 | 7D,GB260,215,305,51,4 271 | 7D,GB281,189,279,17,39 272 | 7D,GB320,143,233,33,3 273 | 7D,GB321,143,233,33,3 274 | 7D,GB330,20,110,23,30 275 | 7D,GB331,20,110,23,30 276 | 7D,GB340,33,123,0,1 277 | 7D,GB341,33,123,0,1 278 | 7D,GB350,NaN,NaN,NaN,NaN 279 | 7D,GB360,88,178,25,5 280 | 7D,GB380,242,332,41,20 281 | 7D,FC03D,84,174,4,16 282 | 7D,FS11D,263,353,54,4 283 | 7D,FS12D,181,271,32,12 284 | 7D,FS14D,213,303,29,9 285 | 7D,FS15D,172,262,0,1 286 | 7D,FS17D,298,28,3,4 287 | 7D,FS42D,102,192,0,1 288 | 7D,FS43D,125,215,5,13 289 | 7D,FS45D,239,329,55,10 290 | 7D,G02D,72,162,9,10 291 | 7D,G18D,257,347,15,29 292 | 7D,G25D,NaN,NaN,NaN,NaN 293 | 7D,G26D,236,326,10,14 294 | 7D,G27D,76,166,25,11 295 | 7D,G34D,124,214,6,15 296 | 7D,J09D,283,13,10,5 297 | 7D,J10D,181,271,8,16 298 | 7D,J17D,202,292,4,6 299 | 7D,J18D,291,21,23,13 300 | 7D,J25D,131,221,49,20 301 | 7D,J26D,173,263,5,11 302 | 7D,M13D,15,105,0,1 303 | 7D,M14D,237,327,7,3 304 | 7D,M15D,256,346,55,5 305 | 7D,M17D,56,146,15,23 306 | 7D,G11D,NaN,NaN,NaN,NaN 307 | 7D,J06D,309,39,29,32 308 | 7D,J23D,207,297,8,37 -------------------------------------------------------------------------------- /scripts/seisgo_BANX_azimuthal_anisotropy.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding: utf-8 3 | import os,sys,time 4 | import numpy as np 5 | import pandas as pd 6 | from seisgo import noise,utils 7 | from multiprocessing import Pool 8 | import pygmt as gmt 9 | from seisgo.anisotropy import do_BANX 10 | import warnings 11 | warnings.filterwarnings("ignore", category=DeprecationWarning) 12 | # ## BANX wrapper with control parameters 13 | def BANX_wrapper(stationdict_all, reference_site, datadir, outdir_root, receiver_box): 14 | """ 15 | Wrapper function for do_BANX function in anisotropy module. 16 | This function is called by the main script mainly for parallelization purpose. 
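All processing controls below (subarray selection, SNR, the beamforming slowness grid, and QC thresholds) are hard-coded in this wrapper and passed through to do_BANX; adjust them per project.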
17 | """ 18 | XCorrComp = 'ZZ' 19 | # Subarray parameters: 20 | Min_Stations = 10 21 | 22 | # SNR: 23 | Min_SNR = 10 24 | 25 | # Scaling factors 26 | Min_Radius_scaling = 1 27 | Max_Radius_scaling = 1.25 28 | Min_Distance_scaling = 3 29 | Signal_Extent_Scaling = 2.5 # used to set the signal window when computing the signal to noise ratio. 30 | # The window would be [predicted arrival time - Signal_Extent_Scaling*Max_Period, predicted arrival time + Signal_Extent_Scaling*Max_Period] 31 | 32 | Vel_Signal_Window = 3.2 33 | # Beamforming space: 34 | Max_Slowness = 0.5 # [s/km] Maximum slowness in the beamforming 35 | Slowness_Step = 0.005 #[s/km] Slowness interval 36 | 37 | # Beamforming limit: 38 | Vel_Reference = 3.5 # [km/s] 39 | Vel_Perturbation = 0.4 # [Percentage fraction] 0.5 = %50 40 | 41 | Taper_Length_Scaling= 7 # Taper length scaling factor. The taper length is Taper_Length_Scaling*Max_Period. 42 | 43 | AZIBIN_STEP = 6 # azimuthal bin step size in degrees used in the QC step after beamforming of all sources. 44 | # QC baz coverage 45 | Min_BAZ_measurements = 1 #minimum number of measurements in each azimuthal bin. Should be >=3. Use 1 for testing here. 46 | Min_Good_BAZBIN = 5 #minimum number of good bins with >= Min_BAZ_measurements in each azimuthal bin. Should be >=5 (recommended). 47 | 48 | Min_Beam_Sharpness = 10 #minimum beam sharpness to be considered as a good measurement. 49 | 50 | MinTime = 0.0 #start time of the xcorr data. 51 | 52 | Sampling_Rate_Target = 5 #target sampling rate. Needs to be integer times the data sampling rate to avoid resampling error. 53 | #The data will be resampled to Sampling_Rate_Target (samples per second). 54 | 55 | Period_Band = [15,30] 56 | 57 | DoubleSided = False 58 | 59 | #################################################### 60 | #### plotting controls ############################## 61 | #################################################### 62 | show_fig = False #figures will be plotted by not shown. 63 | # plot moveout of good traces 64 | plot_moveout = True 65 | moveout_scaling = 4 66 | 67 | # plot cluster map and the source 68 | plot_clustermap = True #not plot if on HPC cluster. some display issue may happend if plotting. 69 | map_engine = 'cartopy' 70 | #slowness image 71 | plot_beampower =True 72 | 73 | # plot phase velocity of the reference station 74 | plot_station_result = True 75 | 76 | ######################################################## 77 | #### Calling do_BANX function in anisotropy module. 
#### 78 | ######################################################## 79 | Beam_Local, anisotropy=do_BANX(stationdict_all, reference_site, Period_Band, Vel_Reference,datadir, 80 | outdir_root,sampling_rate=Sampling_Rate_Target,min_stations=Min_Stations, 81 | min_snr=Min_SNR, min_radius_scaling=Min_Radius_scaling, 82 | max_radius_scaling=Max_Radius_scaling, min_distance_scaling=Min_Distance_scaling, 83 | signal_window_velocity=Vel_Signal_Window, 84 | signal_extent_scaling=Signal_Extent_Scaling,max_slowness=Max_Slowness,slowness_step=Slowness_Step, 85 | velocity_perturbation=Vel_Perturbation, trace_start_time=MinTime,taper_length_scaling=Taper_Length_Scaling, 86 | azimuth_step=AZIBIN_STEP,min_baz_measurements=Min_BAZ_measurements,min_good_bazbin=Min_Good_BAZBIN, 87 | min_beam_sharpness=Min_Beam_Sharpness,doublesided=DoubleSided, cc_comp =XCorrComp, 88 | receiver_box=receiver_box,show_fig=show_fig,plot_moveout=plot_moveout, moveout_scaling=moveout_scaling, 89 | plot_clustermap=plot_clustermap, map_engine = map_engine, plot_beampower=plot_beampower, 90 | plot_station_result=plot_station_result,verbose=True) 91 | 92 | return Beam_Local, anisotropy 93 | 94 | ## Main script 95 | # This is the main script for BANX processing. 96 | def main(): 97 | """ 98 | Main script for BANX processing. 99 | """ 100 | # Read number of processors from command line. 101 | narg = len(sys.argv) 102 | if narg == 1: 103 | nproc=1 104 | else: 105 | nproc=int(sys.argv[1]) 106 | # 107 | if nproc > 1: 108 | # Create a pool of workers 109 | pool = Pool(processes=nproc) 110 | """ 111 | Most of the time, users only need to change the following parameters: 112 | """ 113 | # set root directory 114 | rootdir='.' 115 | datadir=os.path.join(rootdir,'data_craton/PAIRS_AVERAGEDSIDES_stack_robust_egf') #'data_craton/PAIRS_TWOSIDES_stack_robust') 116 | outdir_root=os.path.join(rootdir,'BANX_results') 117 | if not os.path.isdir(outdir_root):os.makedirs(outdir_root, exist_ok=True) 118 | ReceiverBox_lat=[36,39.5] 119 | ReceiverBox_lon=[-92,-84] 120 | stationinfo_file='station_info.csv' 121 | use_stationfile=True 122 | plot_station_map = True 123 | ###### 124 | ########################################################## 125 | ####### End of user parameters. ############################ 126 | ########################################################## 127 | 128 | # ## Extract the netsta list and their coordinates from the xcorr data 129 | # The coordinates for each net.sta are stored in dictionaries. 
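# A sketch of the dictionary format built below (the key shown is a hypothetical net.sta string): coord_all['TA.XYZ'] = [lat, lon, ele].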
130 | # load data 131 | if use_stationfile: 132 | station_df=pd.read_csv(stationinfo_file) 133 | coord_all=dict() 134 | for i in range(len(station_df)): 135 | coord_all[station_df['net.sta'].iloc[i]] = [station_df['lat'].iloc[i],station_df['lon'].iloc[i],station_df['ele'].iloc[i]] 136 | # remove duplicates 137 | netsta_all=list(coord_all.keys()) #sorted(set(netsta_all)) 138 | print('Read %d stations from %s'%(len(netsta_all),stationinfo_file)) 139 | else: 140 | sourcelist=utils.get_filelist(datadir) 141 | t1 = time.time() 142 | # netsta_all=[] 143 | coord_all=dict() 144 | if nproc < 2: 145 | for src in sourcelist: 146 | # srcdir=os.path.join(datadir,src) 147 | ccfiles=utils.get_filelist(src,'h5',pattern='P_stack') 148 | _,_,coord=noise.get_stationpairs(ccfiles,getcoord=True,verbose=True) 149 | # netsta_all.extend(netsta) 150 | coord_all = coord_all | coord 151 | else: 152 | #parallelization 153 | print('Using %d processes to process %d source files'%(nproc,len(sourcelist))) 154 | results=pool.starmap(noise.get_stationpairs, [(utils.get_filelist(src,'h5','P_stack'), 155 | False,False,True) for src in sourcelist]) 156 | # If running interactively, change the above line to: 157 | # results = pool.starmap(noise.get_stationpairs, [(src,True) for src in sourcelist]) 158 | # unpack results. Needed when running interactively; otherwise, the results have already been saved to files. 159 | _, _, coord_all = zip(*results) 160 | # netsta_all = [item for sublist in netsta_all for item in sublist] 161 | coord_all = {k: v for d in coord_all for k, v in d.items()} 162 | # 163 | # remove duplicates 164 | netsta_all=list(coord_all.keys()) #sorted(set(netsta_all)) 165 | print('Extracted %d net.sta from %d source files in %.2f seconds.'%(len(netsta_all),len(sourcelist),time.time()-t1)) 166 | 167 | stationfile = os.path.join(rootdir,stationinfo_file) 168 | fout = open(stationfile,'w') 169 | fout.write('net.sta,lat,lon,ele\n') 170 | for i in range(len(netsta_all)): 171 | coord0 = coord_all[netsta_all[i]] 172 | fout.write('%s,%f,%f,%f\n'%(netsta_all[i],coord0[0],coord0[1],coord0[2])) 173 | fout.close() 174 | print('Station information saved to %s'%stationfile) 175 | 176 | # ## Subset the station list for the receiver box region 177 | #set receiver region box 178 | #this is usually a smaller region than the entire dataset. 179 | # Stations within this box region are used as receivers while all 180 | # stations may be used as the sources. 181 | # set receiver box for do_BANX function. 182 | ReceiverBox = [ReceiverBox_lon[0],ReceiverBox_lon[1],ReceiverBox_lat[0],ReceiverBox_lat[1]] 183 | 184 | ReceiverList_Sites=[] #net.sta strings. 185 | ReceiverList_Coord=[] #lat, lon 186 | 187 | SourceList_Sites=[] 188 | SourceList_Coord=[] 189 | for i in range(len(netsta_all)): 190 | #Master site 191 | coord0 = coord_all[netsta_all[i]][:2] #coordinates: lat, lon in order. 192 | SourceList_Sites.append(netsta_all[i]) 193 | SourceList_Coord.append(coord0) 194 | if coord0[0] >= ReceiverBox_lat[0] and coord0[0] <= ReceiverBox_lat[1] and \ 195 | coord0[1] >= ReceiverBox_lon[0] and coord0[1] <= ReceiverBox_lon[1]: 196 | ReceiverList_Sites.append(netsta_all[i]) 197 | ReceiverList_Coord.append(coord0) 198 | # 199 | """ 200 | Plot the station map. 201 | """ 202 | if plot_station_map: 203 | #plot station map. 
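# Note on the PyGMT calls below: the region string is ordered lonmin/lonmax/latmin/latmax, and the marker style 'i0.17c' draws 0.17-cm inverted triangles.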
204 | source_coord_array=np.array(SourceList_Coord) 205 | marker_style="i0.17c" 206 | map_style="plain" 207 | projection="M3.i" 208 | frame="af" 209 | title="station map" 210 | GMT_FONT_TITLE="14p,Helvetica-Bold" 211 | lon_all,lat_all=source_coord_array[:,1],source_coord_array[:,0] 212 | 213 | region="%6.2f/%6.2f/%5.2f/%5.2f"%(np.min(lon_all),np.max(lon_all),np.min(lat_all),np.max(lat_all)) 214 | fig = gmt.Figure() 215 | gmt.config(MAP_FRAME_TYPE=map_style, FONT_TITLE=GMT_FONT_TITLE) 216 | fig.coast(region=region, resolution="f",projection=projection, 217 | water="0/180/255",frame=frame,land="240", 218 | borders=["1/1p,black", "2/0.5p,100"]) 219 | fig.basemap(frame='+t'+title) 220 | fig.plot( 221 | x=lon_all, 222 | y=lat_all, 223 | style=marker_style, 224 | pen="0.5p,red", 225 | ) 226 | #plot receiver box 227 | lon_box=[ReceiverBox_lon[0],ReceiverBox_lon[1],ReceiverBox_lon[1],ReceiverBox_lon[0],ReceiverBox_lon[0]] 228 | lat_box=[ReceiverBox_lat[0],ReceiverBox_lat[0],ReceiverBox_lat[1],ReceiverBox_lat[1],ReceiverBox_lat[0]] 229 | fig.plot( 230 | x=lon_box, 231 | y=lat_box, 232 | pen="1p,blue", 233 | ) 234 | fig.savefig(os.path.join(rootdir,'station_map.pdf')) 235 | gmt.set_display('none') 236 | fig.show() 237 | # 238 | 239 | """ 240 | Start the main loop 241 | """ 242 | ####################################### 243 | #### Loop over the reference sites #### 244 | ####################################### 245 | if nproc <2: 246 | # Beam_Local_all, anisotropy_all = [],[] 247 | for i in range(len(ReceiverList_Sites)): 248 | #Master site 249 | Ref_Site = ReceiverList_Sites[i] 250 | print('Processing reference site %s --- %d/%d'%(Ref_Site,i+1,len(ReceiverList_Sites))) 251 | 252 | # Beam_Local, anisotropy=BANX_wrapper(coord_all,Ref_Site, datadir, outdir_root, ReceiverBox) 253 | _,_=BANX_wrapper(coord_all,Ref_Site, datadir, outdir_root, ReceiverBox) 254 | #end here for debug/test. 255 | # Beam_Local_all.append(Beam_Local) 256 | # anisotropy_all.append(anisotropy) 257 | # break 258 | # 259 | else: 260 | #parallelization 261 | print('Using %d processes to process %d receiver sites'%(nproc,len(ReceiverList_Sites))) 262 | ############ 263 | 264 | pool.starmap(BANX_wrapper, [(coord_all,Ref_Site, datadir, outdir_root, ReceiverBox) for Ref_Site in ReceiverList_Sites]) 265 | # If running interactively, change the above line to: 266 | # results = pool.starmap(BANX_wrapper, [(coord_all,Ref_Site, datadir, outdir_root, ReceiverBox) for Ref_Site in ReceiverList_Sites]) 267 | pool.close() 268 | 269 | # unpack results. Needed when running interactively; otherwise, the results have already been saved to files. 270 | 271 | # Beam_Local_all, anisotropy_all = zip(*results) 272 | ##end of main 273 | 274 | ########### 275 | if __name__ == "__main__": 276 | main() 277 | 278 | 279 | 280 | 281 | -------------------------------------------------------------------------------- /scripts/seisgo_ccf2sac_MPI.py: -------------------------------------------------------------------------------- 1 | import sys,time,os,glob 2 | from mpi4py import MPI 3 | from seisgo.noise import save_corrfile_to_sac 4 | 5 | """ 6 | Saves CCF data to SAC. 
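Run under MPI, e.g. (the process count here is only an example): mpirun -n 4 python seisgo_ccf2sac_MPI.py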
7 | """ 8 | # absolute path parameters 9 | rootpath = 'data_injection' # root path for this data processing 10 | CCFDIR = os.path.join(rootpath,'CCF_test') # dir where CC data is stored 11 | SACDIR = os.path.join(rootpath,'CCF_sac') # dir where stacked data is going to 12 | 13 | #--------MPI--------- 14 | comm = MPI.COMM_WORLD 15 | rank = comm.Get_rank() 16 | size = comm.Get_size() 17 | 18 | if rank == 0: 19 | # cross-correlation files 20 | ccfiles = sorted(glob.glob(os.path.join(CCFDIR,'*.h5'))) 21 | splits = len(ccfiles) 22 | if splits==0:raise IOError('Abort! no available CCF data for converting') 23 | else: 24 | splits,ccfiles = [None for _ in range(2)] 25 | 26 | # broadcast the variables 27 | splits = comm.bcast(splits,root=0) 28 | ccfiles = comm.bcast(ccfiles,root=0) 29 | #--------End of setting up MPI parameters--------- 30 | 31 | # MPI loop: loop through each user-defined time chunck 32 | for ifile in range(rank,splits,size): 33 | save_corrfile_to_sac(ccfiles[ifile],rootdir=SACDIR,v=False) 34 | 35 | comm.barrier() 36 | 37 | # merge all path_array and output 38 | if rank == 0: 39 | sys.exit() 40 | -------------------------------------------------------------------------------- /scripts/seisgo_cleaning_MPI.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding: utf-8 3 | # This script cleans up the raw seismic waveform data with specified number of channels. 4 | 5 | # In[ ]: 6 | 7 | 8 | #import needed packages. 9 | from seisgo import utils 10 | from seisgo import obsmaster as obs 11 | import sys 12 | import time 13 | import scipy 14 | import obspy 15 | import pyasdf 16 | import datetime 17 | from mpi4py import MPI 18 | import os, glob 19 | import numpy as np 20 | # import pandas as pd 21 | # import matplotlib.pyplot as plt 22 | # from obspy import UTCDateTime 23 | from obspy.core import Stream, Trace 24 | 25 | t0=time.time() 26 | """ 27 | 1. Set global parameters. 28 | """ 29 | rootpath='../data' 30 | rawdatadir = os.path.join(rootpath,'raw') 31 | downloadexample=False #change to False or remove/comment this block if needed. 32 | 33 | #directory to save the data after TC removal 34 | outdatadir = os.path.join(rootpath,'raw_clean') 35 | cleanoutdatadir=True #If True, the program will remove all *.h5 files under `tcdatafir` before running. 36 | 37 | #parameters for orientation correcttions for horizontal components 38 | correct_obs_orient=True 39 | obs_orient_file='OBS_orientation_Cascadia.csv' 40 | 41 | #how to deal with stations with bad traces 42 | drop_if_has_badtrace=True 43 | 44 | ###################################################################### 45 | #### Normally, no changes are needed for the following processing #### 46 | ###################################################################### 47 | """ 48 | 2. Read local data and do correction 49 | We use the wrapper function for tilt and compliance corrections. The data after noise removal 50 | will be saved to the original file name BUT in different directory defined by `outdatadir`. 51 | """ 52 | #-------- Set MPI parameters -------------------------------- 53 | comm = MPI.COMM_WORLD 54 | rank = comm.Get_rank() 55 | size = comm.Get_size() 56 | if rank==0: 57 | #################################### 58 | #### Optional clean-up block #### 59 | #################################### 60 | if not os.path.isdir(rawdatadir): 61 | comm.barrier() 62 | IOError('Abort! 
Directory for raw data NOT found: '+rawdatadir) 63 | sys.exit() 64 | 65 | if not os.path.isdir(outdatadir): os.mkdir(outdatadir) 66 | dfilesTC0 = glob.glob(os.path.join(outdatadir,'*.h5')) 67 | if cleanoutdatadir and len(dfilesTC0)>0: 68 | print('Cleaning up output directory before running ...') 69 | for df0 in dfilesTC0:os.remove(df0) 70 | #################################### 71 | ##### End of clean-up block ##### 72 | #################################### 73 | 74 | dfiles0 = glob.glob(os.path.join(rawdatadir,'*.h5')) 75 | nfiles = len(dfiles0) 76 | splits0 = nfiles 77 | if nfiles < 1: 78 | raise IOError('Abort! no available seismic files in '+rawdatadir) 79 | sys.exit() 80 | else: 81 | splits0,dfiles0 = [None for _ in range(2)] 82 | 83 | # broadcast the variables 84 | splits = comm.bcast(splits0,root=0) 85 | dfiles = comm.bcast(dfiles0,root=0) 86 | #--------End of setting MPI parameters ----------------------- 87 | for ifile in range(rank,splits,size): 88 | #read obs orientation data. 89 | if correct_obs_orient: 90 | try: 91 | obs_orient_data=obs.get_orientations(obs_orient_file) 92 | except Exception as e: 93 | print(e) 94 | sys.exit() 95 | df=dfiles[ifile] 96 | print('Working on: '+df+' ... ['+str(ifile+1)+'/'+str(len(dfiles))+']') 97 | dfbase=os.path.split(df)[-1] 98 | df_out=os.path.join(outdatadir,dfbase) 99 | 100 | ds=pyasdf.ASDFDataSet(df,mpi=False,mode='r') 101 | netstalist = ds.waveforms.list() 102 | nsta = len(netstalist) 103 | 104 | tilt=[] 105 | sta_processed=[] 106 | for ista in netstalist: 107 | print(' station: '+ista) 108 | """ 109 | Get the four-component data 110 | """ 111 | try: 112 | inv = ds.waveforms[ista]['StationXML'] 113 | except Exception as e: 114 | print(' No stationxml for %s in file %s'%(ista,df)) 115 | inv = None 116 | 117 | all_tags = ds.waveforms[ista].get_waveform_tags() 118 | if len(all_tags) < 1 or len(all_tags) > 4: continue #empty waveform group. 119 | else: print(all_tags) 120 | 121 | tr1, tr2, trZ, trP=[None for _ in range(4)] 122 | #assign components by waveform tags. 123 | #This step may fail if the tags don't reflect the real channel information 124 | newtags=['-','-','-','-'] 125 | for tg in all_tags: 126 | tr_temp = ds.waveforms[ista][tg][0] 127 | chan = tr_temp.stats.channel 128 | if chan[-1].lower() == 'h':trP=tr_temp;newtags[3]=tg 129 | elif chan[-1].lower() == '1' or chan[-1].lower() == 'e':tr1=tr_temp;newtags[0]=tg 130 | elif chan[-1].lower() == '2' or chan[-1].lower() == 'n':tr2=tr_temp;newtags[1]=tg 131 | elif chan[-1].lower() == 'z':trZ=tr_temp;newtags[2]=tg 132 | 133 | #sanity check. 134 | badtrace=False 135 | hasPressure=False 136 | if isinstance(trP,Trace): 137 | hasPressure=True 138 | if not isinstance(tr1, Trace) and not isinstance(tr2, Trace) and not isinstance(trZ, Trace): 139 | print(' No seismic channels found. Drop the station: '+ista) 140 | continue 141 | for tr in [tr1, tr2, trZ]: 142 | if not isinstance(tr, Trace): 143 | print(" "+str(tr)+" is not a Trace object. "+ista) 144 | badtrace=True 145 | break 146 | elif np.sum(np.isnan(tr.data))>0: 147 | print(' NaN found in trace: '+str(tr)+". "+ista) 148 | badtrace=True 149 | break 150 | elif np.count_nonzero(tr.data) < 1: 151 | print(' All zeros in trace: '+str(tr)+". "+ista) 152 | badtrace=True 153 | break 154 | if badtrace: 155 | if not drop_if_has_badtrace: 156 | print(" Not enough good traces for TC removal! 
Save as is without processing!") 157 | outtrace=[] 158 | for tg in all_tags: 159 | outtrace.append(ds.waveforms[ista][tg][0]) 160 | utils.save2asdf(df_out,Stream(traces=outtrace),all_tags,sta_inv=inv) 161 | else: 162 | print(" Encountered bad trace for "+ista+". Skipped!") 163 | continue 164 | else: 165 | newtags_tmp=[] 166 | if isinstance(tr1, Trace) and isinstance(tr2, Trace) and correct_obs_orient and ista in obs_orient_data.keys(): 167 | #correct horizontal orientations if in the obs_orient_data list. 168 | print(" Correcting horizontal orientations for: "+ista) 169 | trE,trN = obs.correct_orientations(tr1,tr2,obs_orient_data) 170 | newtags_tmp.append(utils.get_tracetag(trE)) 171 | newtags_tmp.append(utils.get_tracetag(trN)) 172 | if isinstance(trP,Trace): 173 | outstream=Stream(traces=[trE,trN,trZ,trP]) 174 | else: 175 | outstream=Stream(traces=[trE,trN,trZ]) 176 | else: #save the station as is if it is not in the orientation database, assuming it is a land station. 177 | newtags_tmp.append(utils.get_tracetag(tr1)) 178 | newtags_tmp.append(utils.get_tracetag(tr2)) 179 | if isinstance(trP,Trace): 180 | outstream=Stream(traces=[tr1,tr2,trZ,trP]) 181 | else: 182 | outstream=Stream(traces=[tr1,tr2,trZ]) 183 | 184 | newtags_tmp.append(utils.get_tracetag(trZ)) 185 | if isinstance(trP,Trace): newtags_tmp.append(utils.get_tracetag(trP)) 186 | print(newtags_tmp) 187 | print(' Saving '+ista+' to: '+df_out) 188 | utils.save2asdf(df_out,outstream,newtags_tmp,sta_inv=inv) 189 | 190 | ############################################### 191 | comm.barrier() 192 | if rank == 0: 193 | tend=time.time() - t0 194 | print('*************************************') 195 | print('<<< Finished all files in %7.1f seconds, or %6.2f hours for %d files >>>' %(tend,tend/3600,len(dfiles))) 196 | sys.exit() 197 | -------------------------------------------------------------------------------- /scripts/seisgo_download_MPI.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import obspy 3 | import os 4 | import time 5 | import numpy as np 6 | import pandas as pd 7 | from mpi4py import MPI 8 | from seisgo.utils import split_datetimestr 9 | from seisgo import downloaders, noise #noise is needed for the memory estimate (noise.cc_memory) below 10 | # Code to run this file from within a virtual environment 11 | ######################################################### 12 | # import sys 13 | # syspath = os.path.dirname(os.path.realpath(__file__)) 14 | # print('Running code') 15 | # com = 'mpirun -n 4 ' + str(sys.executable) + " " + syspath + "/MPI_download.py" 16 | # print(com) 17 | # os.system(com) 18 | # print('Done') 19 | ######################################################### 20 | 21 | 22 | ######################################################### 23 | ################ PARAMETER SECTION ###################### 24 | ######################################################### 25 | tt0=time.time() 26 | 27 | # paths and filenames 28 | rootpath = "data_decatur" # root path for the project 29 | direc = os.path.join(rootpath,'Raw') # where to store the downloaded data 30 | #if not os.path.isdir(direc): os.mkdir(direc) 31 | down_list = os.path.join(direc,'station.txt') 32 | # CSV file for station location info 33 | 34 | # download parameters 35 | source='IRIS' # client/data center. see https://docs.obspy.org/packages/obspy.clients.fdsn.html for a list 36 | max_tries = 10 #maximum number of tries when downloading, in case the server returns errors. 
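All of the MPI scripts in this repo share the same pattern: rank 0 builds the task list, every rank receives it via broadcast, and each rank then processes every `size`-th task. A minimal mpi4py sketch of that pattern (the task names below are made up for illustration):

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

if rank == 0:
    tasks = ['chunk_%d' % i for i in range(10)]  # hypothetical task list
    splits = len(tasks)
else:
    splits, tasks = None, None

# every rank receives the same list, then works on every size-th item
splits = comm.bcast(splits, root=0)
tasks = comm.bcast(tasks, root=0)
for i in range(rank, splits, size):
    print('rank %d handles %s' % (rank, tasks[i]))

comm.barrier()
```

Run with, e.g., `mpirun -n 4 python this_script.py`; the rank-strided indexing divides the work without any explicit scheduler.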
37 | use_down_list = False # download stations from a pre-compiled list or not 38 | flag = True # print progress when running the script; recommended at the beginning 39 | samp_freq = 50 # targeted sampling rate at X samples per second 40 | rmresp = True 41 | rmresp_out = 'DISP' 42 | pressure_chan = [None] #Added by Xiaotao Yang. This is needed when downloading some special channels, e.g., pressure data. VEL output for these channels. 43 | respdir = os.path.join(rootpath,'resp') # directory where resp files are located (required if rm_resp is neither 'no' nor 'inv') 44 | freqmin = 0.01 # pre filtering frequency bandwidth 45 | freqmax = 25 46 | # note this cannot exceed the Nyquist freq 47 | 48 | # targeted region/station information: only needed when use_down_list is False 49 | lamin,lamax,lomin,lomax= 39,40,-89,-88 # regional box: min lat, max lat, min lon, max lon 50 | chan_list = ["HHZ"] 51 | net_list = ["GS"] #["7D","X9","TA","XT","UW"] # network list 52 | sta_list = ["DEC05","DEC06","DEC07","DEC10"] # station list (using a list is much easier than specifying stations one by one) 53 | start_date = "2014_01_01_0_0_0" # start date of download 54 | end_date = "2014_01_11_0_0_0" # end date of download 55 | inc_hours = 24 # length of data for each request (in hour) 56 | maxseischan = 1 # the maximum number of seismic channels, excluding pressure channels for OBS stations. 57 | ncomp = maxseischan #len(chan_list) 58 | 59 | # get a rough estimate of memory needs to ensure it does not blow up during the noise cross-correlations 60 | cc_len = 3600 # basic unit of data length for fft (s) 61 | step = 1800 # overlapping between each cc_len (s) 62 | MAX_MEM = 8.0 # maximum memory allowed per core in GB 63 | 64 | ################################################## 65 | # we expect no parameters need to be changed below 66 | # assemble parameters used for pre-processing waveforms in downloading 67 | prepro_para = {'rmresp':rmresp,'rmresp_out':rmresp_out,'respdir':respdir,'freqmin':freqmin,'freqmax':freqmax,\ 68 | 'samp_freq':samp_freq} 69 | 70 | downlist_kwargs = {"source":source, 'net_list':net_list, "sta_list":sta_list, "chan_list":chan_list, \ 71 | "starttime":start_date, "endtime":end_date, "maxseischan":maxseischan, "lamin":lamin, "lamax":lamax, \ 72 | "lomin":lomin, "lomax":lomax, "pressure_chan":pressure_chan, "fname":down_list} 73 | 74 | ######################################################## 75 | #################DOWNLOAD SECTION####################### 76 | ######################################################## 77 | #--------MPI--------- 78 | comm = MPI.COMM_WORLD 79 | rank = comm.Get_rank() 80 | size = comm.Get_size() 81 | 82 | if rank==0: 83 | if flag: 84 | print('use_down_list set to [%s]; downloading data from %s to %s with %sh intervals'%(use_down_list,start_date,end_date,inc_hours)) 85 | 86 | if not os.path.isdir(direc):os.makedirs(direc) 87 | if use_down_list: 88 | stalist=pd.read_csv(down_list) 89 | else: 90 | stalist=downloaders.get_sta_list(**downlist_kwargs) # saves station list to "down_list" file 91 | # here, file name is "station.txt" 92 | # rough estimation on memory needs (assume float32 dtype) 93 | memory_size=noise.cc_memory(inc_hours,samp_freq,len(stalist.station),ncomp,cc_len,step) 94 | if memory_size > MAX_MEM: 95 | raise ValueError('Require %5.3fG memory but only %5.3fG provided! Reduce inc_hours to avoid this issue!' 
% (memory_size,MAX_MEM)) 96 | 97 | # save parameters for future reference 98 | metadata = os.path.join(direc,'download_info.txt') 99 | fout = open(metadata,'w') 100 | fout.write(str({**prepro_para,**downlist_kwargs,'inc_hours':inc_hours,'ncomp':ncomp}));fout.close() 101 | 102 | all_chunk = split_datetimestr(start_date,end_date,inc_hours) 103 | if len(all_chunk)<1: 104 | raise ValueError('Abort! no data chunk between %s and %s' % (start_date,end_date)) 105 | splits = len(all_chunk)-1 106 | else: 107 | splits,all_chunk = [None for _ in range(2)] 108 | 109 | # broadcast the variables 110 | splits = comm.bcast(splits,root=0) 111 | all_chunk = comm.bcast(all_chunk,root=0) 112 | extra = splits % size 113 | 114 | # MPI: loop through each time chunk 115 | for ick in range(rank,splits,size): 116 | s1= all_chunk[ick] 117 | s2=all_chunk[ick+1] 118 | 119 | download_kwargs = {"source":source,"rawdatadir": direc, "starttime": s1, "endtime": s2, \ 120 | "stationinfo": stalist,**prepro_para} 121 | 122 | # Download for ick 123 | downloaders.download(**download_kwargs) 124 | 125 | tt1=time.time() 126 | print('downloading step takes %6.2f s' %(tt1-tt0)) 127 | 128 | comm.barrier() 129 | if rank == 0: 130 | sys.exit() 131 | -------------------------------------------------------------------------------- /scripts/seisgo_download_obsdata.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding: utf-8 3 | """ 4 | Download OBS date including all four components (seismic channels and the pressure channel). 5 | This light-weight script mainly aims at downloading test data for some seisgo modules, instead of a comprehensive 6 | and robust downloading wrapper. It is recommended to use NoisePy's downloading script for 7 | more comprehensive downloading (https://github.com/mdenolle/NoisePy). 8 | """ 9 | #import needed packages. 10 | from seisgo import utils 11 | from seisgo import obsmaster as obs 12 | import sys 13 | import time 14 | import scipy 15 | import obspy 16 | import pyasdf 17 | import datetime 18 | import os, glob 19 | import numpy as np 20 | import pandas as pd 21 | import matplotlib.pyplot as plt 22 | from obspy import UTCDateTime 23 | from obspy.core import Stream,Trace 24 | from obspy.clients.fdsn import Client 25 | 26 | # In[ ]: 27 | """ 28 | 1. Set parameters for downloading data. 29 | """ 30 | rawdatadir = '../data/raw' 31 | if not os.path.isdir(rawdatadir): os.mkdir(rawdatadir) 32 | 33 | source='IRIS' 34 | client=Client(source) 35 | # get data from IRIS web service 36 | net="7D" 37 | stalist=["FN07A","G30A"]#["G03A","J35A","J44A","J65A"] 38 | 39 | starttime = "2012_02_02_0_0_0" 40 | endtime = "2012_02_05_0_0_0" 41 | inc_hours = 8 42 | 43 | """ 44 | 2. Preprocessing parameters 45 | """ 46 | rmresp=True #remove instrument response 47 | # parameters for butterworth filter 48 | samp_freq=10 49 | pfreqmin=0.002 50 | pfreqmax=samp_freq/2 51 | 52 | # prefilter information used when removing instrument responses 53 | f1 = 0.95*pfreqmin;f2=pfreqmin 54 | if 1.05*pfreqmax > 0.48*samp_freq: 55 | f3 = 0.45*samp_freq 56 | f4 = 0.48*samp_freq 57 | else: 58 | f3 = pfreqmax 59 | f4= 1.05*pfreqmax 60 | pre_filt = [f1,f2,f3,f4] 61 | 62 | 63 | """ 64 | 3. Download by looping through datetime list. 
65 | ***** The users usually don't need to chance the following lines **** 66 | """ 67 | dtlist = utils.split_datetimestr(starttime,endtime,inc_hours) 68 | print(dtlist) 69 | for idt in range(len(dtlist)-1): 70 | sdatetime = obspy.UTCDateTime(dtlist[idt]) 71 | edatetime = obspy.UTCDateTime(dtlist[idt+1]) 72 | 73 | fname = os.path.join(rawdatadir,dtlist[idt]+'T'+dtlist[idt+1]+'.h5') 74 | 75 | """ 76 | Start downloading. 77 | """ 78 | for ista in stalist: 79 | print('Downloading '+net+"."+ista+" ...") 80 | t0=time.time() 81 | """ 82 | 3a. Request data. 83 | """ 84 | tr1,tr2,trZ,trP = obs.getdata(net,ista,sdatetime,edatetime,source=source,samp_freq=samp_freq, 85 | plot=False,rmresp=rmresp,pre_filt=pre_filt) 86 | sta_inv=client.get_stations(network=net,station=ista, 87 | starttime=sdatetime,endtime=edatetime, 88 | location='*',level='response') 89 | ta=time.time() - t0 90 | print(' downloaded '+net+"."+ista+" in "+str(ta)+" seconds.") 91 | """ 92 | 3b. Save to ASDF file. 93 | """ 94 | tags=[] 95 | for itr,tr in enumerate([tr1,tr2,trZ,trP],1): 96 | if len(tr.stats.location) == 0: 97 | tlocation='00' 98 | else: 99 | tlocation=tr.stats.location 100 | 101 | tags.append(tr.stats.channel.lower()+'_'+tlocation.lower()) 102 | 103 | print(' saving to '+fname) 104 | utils.save2asdf(fname,Stream(traces=[tr1,tr2,trZ,trP]),tags,sta_inv=sta_inv) 105 | -------------------------------------------------------------------------------- /scripts/seisgo_stacking_MPI.py: -------------------------------------------------------------------------------- 1 | import sys,time,os, glob 2 | import pandas as pd 3 | from mpi4py import MPI 4 | from seisgo import noise 5 | if not sys.warnoptions: 6 | import warnings 7 | warnings.simplefilter("ignore") 8 | ''' 9 | Stacking script of SeisGo to: 10 | 1) load cross-correlation data for sub-stacking (if needed) and all-time average; 11 | 2) stack data with either linear or phase weighted stacking (pws) methods (or both); 12 | 3) save outputs in ASDF; 13 | 4) rotate from a E-N-Z to R-T-Z system if needed. 
14 | 15 | Modified from NoisePy 16 | ''' 17 | tt0=time.time() 18 | ######################################## 19 | #########PARAMETER SECTION############## 20 | ######################################## 21 | # absolute path parameters 22 | rootpath = 'data_injection' # root path for this data processing 23 | CCFDIR = os.path.join(rootpath,'CCF') # dir where CC data is stored 24 | STACKDIR = os.path.join(rootpath,'STACK') # dir where stacked data is going to 25 | 26 | # define new stacking para 27 | flag = True # output intermediate args for debugging 28 | stack_method = ['linear','robust'] # linear, pws, robust or all 29 | 30 | # new rotation para 31 | rotation = False #True # rotation from E-N-Z to R-T-Z 32 | correction = False # angle correction due to mis-orientation 33 | if rotation and correction: 34 | corrfile = os.path.join(rootpath,'meso_angles.txt') # csv file containing angle info to be corrected 35 | locs = pd.read_csv(corrfile) 36 | else: locs = None 37 | 38 | # make a dictionary to store all variables: also for later cc 39 | stack_para={'rootpath':rootpath,'STACKDIR':STACKDIR,\ 40 | 'stack_method':stack_method,'rotation':rotation,'correction':correction} 41 | ####################################### 42 | ###########PROCESSING SECTION########## 43 | ####################################### 44 | #--------MPI--------- 45 | comm = MPI.COMM_WORLD 46 | rank = comm.Get_rank() 47 | size = comm.Get_size() 48 | 49 | if rank == 0: 50 | if not os.path.isdir(STACKDIR):os.mkdir(STACKDIR) 51 | # save fft metadata for future reference 52 | stack_metadata = os.path.join(STACKDIR,'stack_data.txt') 53 | fout = open(stack_metadata,'w');fout.write(str(stack_para));fout.close() 54 | 55 | # cross-correlation files 56 | ccfiles = sorted(glob.glob(os.path.join(CCFDIR,'*.h5'))) 57 | pairs_all,netsta_all=noise.get_stationpairs(ccfiles,False) 58 | splits = len(pairs_all) 59 | if len(ccfiles)==0 or splits==0: 60 | raise IOError('Abort! no available CCF data for stacking') 61 | 62 | for s in netsta_all: 63 | tmp = os.path.join(STACKDIR,s) 64 | if not os.path.isdir(tmp):os.mkdir(tmp) 65 | else: 66 | splits,ccfiles,pairs_all,ccomp_all = [None for _ in range(4)] 67 | 68 | # broadcast the variables 69 | splits = comm.bcast(splits,root=0) 70 | ccfiles = comm.bcast(ccfiles,root=0) 71 | pairs_all = comm.bcast(pairs_all,root=0) 72 | # MPI loop: loop through each user-defined time chunck 73 | for ipair in range (rank,splits,size): 74 | pair=pairs_all[ipair] 75 | if flag:print('station-pair %s'%(pair)) 76 | noise.do_stacking(ccfiles,pair,outdir=STACKDIR,method=stack_method,rotation=rotation,correctionfile=locs,flag=flag) 77 | 78 | tt1 = time.time() 79 | print('it takes %6.2fs to stack in total' % (tt1-tt0)) 80 | comm.barrier() 81 | 82 | # merge all path_array and output 83 | if rank == 0: 84 | sys.exit() 85 | -------------------------------------------------------------------------------- /scripts/seisgo_tcremoval_continuous.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding: utf-8 3 | 4 | # # Tilt and Compliance Corrections for OBS Data: Continuous 5 | # ### Xiaotao Yang @ Harvard University 6 | # This notebook contains examples of compliance corrections using local data on the disk. 7 | # The functions for tilt and compliance corrections are in module seisgo.obsmaster. 8 | 9 | # ## Step 0. Load needed packages. 10 | # Some functions are imported from the utils.py and the obsmaster.py. 11 | 12 | #import needed packages. 
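The stacking script above delegates the actual work to `noise.do_stacking` with one or more methods from `stack_method`. As a point of reference, the sketch below shows what the 'linear' option amounts to, using plain numpy on synthetic substacks rather than the SeisGo API:

```python
import numpy as np

rng = np.random.default_rng(0)
# synthetic: 30 substack windows x 2001 correlation lags
substacks = rng.standard_normal((30, 2001))
linear_stack = np.mean(substacks, axis=0)   # "linear" = plain average over windows
print(linear_stack.shape)                   # (2001,)
```

The other options ('pws', 'robust', etc.) down-weight incoherent windows instead of averaging uniformly; `seisgo.helpers.stack_methods()` lists everything available.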
13 | from seisgo import utils 14 | from seisgo import obsmaster as obs 15 | import sys 16 | import time 17 | import scipy 18 | import obspy 19 | import pyasdf 20 | import datetime 21 | import os, glob 22 | import numpy as np 23 | import pandas as pd 24 | # import matplotlib.pyplot as plt 25 | # from obspy import UTCDateTime 26 | from obspy.core import Stream, Trace 27 | 28 | """ 29 | 1. Set global data path parameters. 30 | """ 31 | rawdatadir = '../data/raw' 32 | if not os.path.isdir(rawdatadir): os.mkdir(rawdatadir) 33 | #directory to save the data after TC removal 34 | tcdatadir = '../data/tcremoval' 35 | if not os.path.isdir(tcdatadir): os.mkdir(tcdatadir) 36 | #################################### 37 | #### Optional clean-up block #### 38 | #################################### 39 | cleantartgetdir=True #change to False or remove/comment this block if needed. 40 | dfiles0 = glob.glob(os.path.join(tcdatadir,'*.h5')) 41 | if cleantartgetdir and len(dfiles0)>0: 42 | print('Cleaning up TC removal directory before running ...') 43 | for df0 in dfiles0:os.remove(df0) 44 | #################################### 45 | ##### End of clean-up block ##### 46 | #################################### 47 | 48 | #################################### 49 | #### Optional downloading block #### 50 | #################################### 51 | downloadexample=False #change to False or remove/comment this block if needed. 52 | if downloadexample: 53 | print('Cleaning up raw data directory before downloading ...') 54 | dfiles1 = glob.glob(os.path.join(rawdatadir,'*.h5')) 55 | for df1 in dfiles1:os.remove(df1) 56 | os.system('python seisgo_download_obsdata.py') 57 | #################################### 58 | ##### End of downloading block ##### 59 | #################################### 60 | 61 | """ 62 | 2. Tilt and compliance removal parameters 63 | """ 64 | window=3600 65 | overlap=0.2 66 | taper=0.08 67 | qc_freq=[0.004, 1] 68 | plot_correction=True 69 | normalizecorrectionplot=True 70 | tc_subset=['ZP-H'] 71 | #assemble all parameters into a dictionary. 72 | tcpara={'window':window,'overlap':overlap,'taper':taper,'qc_freq':qc_freq, 73 | 'tc_subset':tc_subset} 74 | print(tcpara) 75 | """ 76 | 3. Read local data and do correction 77 | We use the wrapper function for tilt and compliance corrections. 78 | 79 | Steps: 80 | a. read in file list 81 | b. loop through all files 82 | c. loop through all stations 83 | c-1. read waveform tags and list 84 | c-2. read station info if available. skip land stations or stations with only vertical. 85 | c-3. assemble all four components for OBS stations 86 | c-4. do correction work flow and plot the result if applicable 87 | c-5. save auxiliary data, e.g., tilt direction and angle, and TC removal parameters. 88 | c-6. save to original file name in different directory 89 | """ 90 | dfiles = glob.glob(os.path.join(rawdatadir,'*.h5')) 91 | nfiles = len(dfiles) 92 | splits = nfiles 93 | if nfiles==0: 94 | raise IOError('Abort! no available seismic files in '+rawdatadir) 95 | 96 | t0=time.time() 97 | for ifile in range(nfiles): 98 | df=dfiles[ifile] 99 | print('Working on: '+df+' ... 
['+str(ifile+1)+'/'+str(nfiles)+']') 100 | dfbase=os.path.split(df)[-1] 101 | df_tc=os.path.join(tcdatadir,dfbase) 102 | 103 | ds=pyasdf.ASDFDataSet(df,mpi=False,mode='r') 104 | netstalist = ds.waveforms.list() 105 | nsta = len(netstalist) 106 | 107 | tilt=[] 108 | sta_processed=[] 109 | for ista in netstalist: 110 | print(' station: '+ista) 111 | """ 112 | Get the four-component data 113 | """ 114 | try: 115 | inv = ds.waveforms[ista]['StationXML'] 116 | except Exception as e: 117 | print(' No stationxml for %s in file %s'%(ista,df)) 118 | inv = None 119 | 120 | all_tags = ds.waveforms[ista].get_waveform_tags() 121 | print(all_tags) 122 | if len(all_tags)!=4: 123 | print(" Wrong number of components. Has to be four (4) channels! Skip!") 124 | continue 125 | 126 | tr1=None 127 | tr2=None 128 | trZ=None 129 | trP=None 130 | #assign components by waveform tags. 131 | #This step may fail if the tags don't reflect the real channel information 132 | newtags=['-','-','-','-'] 133 | for tg in all_tags: 134 | tr_temp = ds.waveforms[ista][tg][0] 135 | chan = tr_temp.stats.channel 136 | if chan[-1].lower() == 'h':trP=tr_temp;newtags[3]=tg 137 | elif chan[-1].lower() == '1' or chan[-1].lower() == 'e':tr1=tr_temp;newtags[0]=tg 138 | elif chan[-1].lower() == '2' or chan[-1].lower() == 'n':tr2=tr_temp;newtags[1]=tg 139 | elif chan[-1].lower() == 'z':trZ=tr_temp;newtags[2]=tg 140 | 141 | #sanity check. 142 | for tr in [tr1, tr2, trZ, trP]: 143 | if not isinstance(tr, Trace): 144 | print(str(tr)+" is not a Trace object") 145 | 146 | """ 147 | Call correction wrapper 148 | """ 149 | spectra,transfunc,correct=obs.TCremoval_wrapper( 150 | tr1,tr2,trZ,trP,window=window,overlap=overlap,merge_taper=taper, 151 | qc_freq=qc_freq,qc_spectra=True,fig_spectra=False, 152 | save_spectrafig=False,fig_transfunc=False,correctlist=tc_subset) 153 | tilt.append(spectra['rotation'].tilt) 154 | sta_processed.append(ista) 155 | if plot_correction: 156 | obs.plotcorrection(trZ,correct,normalize=normalizecorrectionplot,freq=[0.005,0.1], 157 | size=(12,3),save=True,form='png') 158 | 159 | """ 160 | Save to ASDF file. 161 | """ 162 | trZtc,tgtemp=obs.correctdict2stream(trZ,correct,tc_subset) 163 | print(' saving to: '+df_tc) 164 | utils.save2asdf(df_tc,Stream(traces=[tr1,tr2,trZtc[0],trP]),newtags,sta_inv=inv) 165 | 166 | #save auxiliary data to file. 167 | print(' saving auxiliary data to: '+df_tc) 168 | tcpara_temp=tcpara 169 | tcpara_temp['tilt_stations']=sta_processed 170 | utils.save2asdf(df_tc,np.array(tilt),None,group='auxiliary',para={'data_type':'tcremoval', 171 | 'data_path':'tiltdir', 172 | 'parameters':tcpara_temp}) 173 | 174 | tend=time.time() - t0 175 | print('Finished all files in '+str(tend)+' seconds, or '+str(tend/3600)+' hours') 176 | 177 | # In[ ]: 178 | -------------------------------------------------------------------------------- /scripts/seisgo_tcremoval_continuous_MPI.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding: utf-8 3 | 4 | # # Tilt and Compliance Corrections for OBS Data: Continuous 5 | # ### Xiaotao Yang @ Harvard University 6 | # This notebook contains examples of compliance corrections using local data on the disk. The functions for tilt and compliance corrections are in module seisgo.obsmaster. 7 | 8 | # ## Step 0. Load needed packages. 9 | #import needed packages. 
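Both TC-removal scripts (the serial one above and the MPI one below) assign traces to component slots by the last letter of the channel code. A standalone rewrite of that rule for clarity, with a hypothetical helper name:

```python
def component_slot(channel):
    """Map a channel code to the component slot used in these scripts."""
    c = channel[-1].lower()
    if c == 'h':
        return 'P'            # pressure channel (e.g., BDH)
    elif c in ('1', 'e'):
        return '1'            # first horizontal
    elif c in ('2', 'n'):
        return '2'            # second horizontal
    elif c == 'z':
        return 'Z'            # vertical
    return None               # unrecognized; the trace is left unassigned

for chan in ['BHZ', 'BH1', 'BH2', 'BDH']:
    print(chan, '->', component_slot(chan))
```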
10 | from seisgo import utils 11 | from seisgo import obsmaster as obs 12 | import sys 13 | import time 14 | import scipy 15 | import obspy 16 | import pyasdf 17 | import datetime 18 | from mpi4py import MPI 19 | import os, glob 20 | import numpy as np 21 | # import pandas as pd 22 | # import matplotlib.pyplot as plt 23 | # from obspy import UTCDateTime 24 | from obspy.core import Stream, Trace 25 | 26 | t0=time.time() 27 | """ 28 | 1. Set global parameters. 29 | """ 30 | rootpath='../data' 31 | rawdatadir = os.path.join(rootpath,'raw') 32 | downloadexample=False #change to False or remove/comment this block if needed. 33 | 34 | #directory to save the data after TC removal 35 | tcdatadir = os.path.join(rootpath,'tcremoval') 36 | cleantcdatadir=True #If True, the program will remove all *.h5 files under `tcdatadir` before running. 37 | 38 | #parameters for orientation corrections for horizontal components 39 | correct_obs_orient=True 40 | obs_orient_file='OBS_orientation_Cascadia.csv' 41 | 42 | #how to deal with stations with bad traces 43 | drop_if_has_badtrace=True 44 | 45 | """ 46 | 2. Tilt and compliance removal parameters 47 | """ 48 | window=3600 49 | overlap=0.2 50 | taper=0.08 51 | qc_freq=[0.004, 1] 52 | plot_correction=True 53 | normalizecorrectionplot=True 54 | tc_subset=['ZP-H'] 55 | savetcpara=True #If True, the parameters are saved to a text file 56 | #in the `tcdatadir` directory. 57 | requirePressure=False 58 | for tcs in tc_subset: 59 | if 'P' in tcs: requirePressure=True; break 60 | tcparaoutfile=os.path.join(tcdatadir,'tcparameters.txt') 61 | #assemble all parameters into a dictionary. 62 | tcpara={'window':window,'overlap':overlap,'taper':taper,'qc_freq':qc_freq, 63 | 'tc_subset':tc_subset} 64 | 65 | ###################################################################### 66 | #### Normally, no changes are needed for the following processing #### 67 | ###################################################################### 68 | """ 69 | 3. Read local data and do correction 70 | We use the wrapper function for tilt and compliance corrections. The data after noise removal 71 | will be saved to the original file name BUT in a different directory defined by `tcdatadir`. 72 | """ 73 | #-------- Set MPI parameters -------------------------------- 74 | comm = MPI.COMM_WORLD 75 | rank = comm.Get_rank() 76 | size = comm.Get_size() 77 | if rank==0: 78 | #################################### 79 | #### Optional clean-up block #### 80 | #################################### 81 | if not os.path.isdir(rawdatadir): 82 | comm.barrier() 83 | raise IOError('Abort! Directory for raw data NOT found: '+rawdatadir) 84 | sys.exit() 85 | 86 | if not os.path.isdir(tcdatadir): os.mkdir(tcdatadir) 87 | dfilesTC0 = glob.glob(os.path.join(tcdatadir,'*.h5')) 88 | if cleantcdatadir and len(dfilesTC0)>0: 89 | print('Cleaning up TC removal directory before running ...') 90 | for df0 in dfilesTC0:os.remove(df0) 91 | #################################### 92 | ##### End of clean-up block ##### 93 | #################################### 94 | print(tcpara) 95 | if savetcpara: 96 | fout = open(tcparaoutfile,'w') 97 | fout.write(str(tcpara)); 98 | fout.close() 99 | dfiles0 = glob.glob(os.path.join(rawdatadir,'*.h5')) 100 | nfiles = len(dfiles0) 101 | splits0 = nfiles 102 | if nfiles < 1: 103 | raise IOError('Abort! 
no available seismic files in '+rawdatadir) 104 | sys.exit() 105 | else: 106 | splits0,dfiles0 = [None for _ in range(2)] 107 | 108 | # broadcast the variables 109 | splits = comm.bcast(splits0,root=0) 110 | dfiles = comm.bcast(dfiles0,root=0) 111 | #--------End of setting MPI parameters ----------------------- 112 | for ifile in range(rank,splits,size): 113 | #read obs orientation data. 114 | if correct_obs_orient: 115 | try: 116 | obs_orient_data=obs.get_orientations(obs_orient_file) 117 | except Exception as e: 118 | print(e) 119 | sys.exit() 120 | df=dfiles[ifile] 121 | print('Working on: '+df+' ... ['+str(ifile+1)+'/'+str(len(dfiles))+']') 122 | dfbase=os.path.split(df)[-1] 123 | df_tc=os.path.join(tcdatadir,dfbase) 124 | 125 | ds=pyasdf.ASDFDataSet(df,mpi=False,mode='r') 126 | netstalist = ds.waveforms.list() 127 | nsta = len(netstalist) 128 | 129 | tilt=[] 130 | sta_processed=[] 131 | for ista in netstalist: 132 | print(' station: '+ista) 133 | """ 134 | Get the four-component data 135 | """ 136 | try: 137 | inv = ds.waveforms[ista]['StationXML'] 138 | except Exception as e: 139 | print(' No stationxml for %s in file %s'%(ista,df)) 140 | inv = None 141 | 142 | all_tags = ds.waveforms[ista].get_waveform_tags() 143 | if len(all_tags) < 1 or len(all_tags) > 4: continue #empty waveform group. 144 | else: print(all_tags) 145 | 146 | tr1, tr2, trZ, trP=[None for _ in range(4)] 147 | #assign components by waveform tags. 148 | #This step may fail if the tags don't reflect the real channel information 149 | newtags=['-','-','-','-'] 150 | for tg in all_tags: 151 | tr_temp = ds.waveforms[ista][tg][0] 152 | chan = tr_temp.stats.channel 153 | if chan[-1].lower() == 'h':trP=tr_temp;newtags[3]=tg 154 | elif chan[-1].lower() == '1' or chan[-1].lower() == 'e':tr1=tr_temp;newtags[0]=tg 155 | elif chan[-1].lower() == '2' or chan[-1].lower() == 'n':tr2=tr_temp;newtags[1]=tg 156 | elif chan[-1].lower() == 'z':trZ=tr_temp;newtags[2]=tg 157 | 158 | #sanity check. 159 | badtrace=False 160 | hasPressure=False 161 | if isinstance(trP,Trace): 162 | hasPressure=True 163 | if not isinstance(tr1, Trace) and not isinstance(tr2, Trace) and not isinstance(trZ, Trace): 164 | print(' No seismic channels found. Drop the station: '+ista) 165 | continue 166 | for tr in [tr1, tr2, trZ]: 167 | if not isinstance(tr, Trace): 168 | print(" "+str(tr)+" is not a Trace object. "+ista) 169 | badtrace=True 170 | break 171 | elif np.sum(np.isnan(tr.data))>0: 172 | print(' NaN found in trace: '+str(tr)+". "+ista) 173 | badtrace=True 174 | break 175 | elif np.count_nonzero(tr.data) < 1: 176 | print(' All zeros in trace: '+str(tr)+". "+ista) 177 | badtrace=True 178 | break 179 | if badtrace: 180 | if not drop_if_has_badtrace: 181 | print(" Not enough good traces for TC removal! Save as is without processing!") 182 | outtrace=[] 183 | for tg in all_tags: 184 | outtrace.append(ds.waveforms[ista][tg][0]) 185 | utils.save2asdf(df_tc,Stream(traces=outtrace),all_tags,sta_inv=inv) 186 | else: 187 | print(" Encountered bad trace for "+ista+". Skipped!") 188 | continue 189 | elif requirePressure and not hasPressure: #if station doesn't have pressure channel, it might be an obs or a land station 190 | newtags_tmp=[] 191 | if isinstance(tr1, Trace) and isinstance(tr2, Trace) and correct_obs_orient and ista in obs_orient_data.keys(): 192 | #correct horizontal orientations if in the obs_orient_data list. 
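The per-trace sanity checks a few lines above (type, NaN, all-zero) also appear verbatim in the cleaning script. They distill to one small helper; `is_bad_trace` is a hypothetical name, but the tests mirror the scripts exactly:

```python
import numpy as np
from obspy.core import Trace

def is_bad_trace(tr):
    """True if tr is missing, contains NaNs, or is all zeros."""
    if not isinstance(tr, Trace):
        return True
    if np.sum(np.isnan(tr.data)) > 0:    # NaN found in trace
        return True
    if np.count_nonzero(tr.data) < 1:    # all zeros in trace
        return True
    return False
```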
193 | print(" Correctting horizontal orientations for: "+ista) 194 | trE,trN = obs.correct_orientations(tr1,tr2,obs_orient_data) 195 | newtags_tmp.append(utils.get_tracetag(trE)) 196 | newtags_tmp.append(utils.get_tracetag(trN)) 197 | print(newtags_tmp) 198 | outstream=Stream(traces=[trE,trN,trZ]) 199 | else: #save the station as is if it is not in the orientation database, assuming it is a land station. 200 | newtags_tmp.append(utils.get_tracetag(tr1)) 201 | newtags_tmp.append(utils.get_tracetag(tr2)) 202 | outstream=Stream(traces=[tr1,tr2,trZ]) 203 | newtags_tmp.append(utils.get_tracetag(trZ)) 204 | print(' Saving '+ista+'without TC removal to: '+df_tc) 205 | utils.save2asdf(df_tc,outstream,newtags_tmp,sta_inv=inv) 206 | continue 207 | 208 | """ 209 | Call correction wrapper 210 | """ 211 | try: 212 | spectra,transfunc,correct=obs.TCremoval_wrapper( 213 | tr1,tr2,trZ,trP,window=window,overlap=overlap,merge_taper=taper, 214 | qc_freq=qc_freq,qc_spectra=True,fig_spectra=False, 215 | save_spectrafig=False,fig_transfunc=False,correctlist=tc_subset) 216 | tilt.append(spectra['rotation'].tilt) 217 | sta_processed.append(ista) 218 | if plot_correction: 219 | obs.plotcorrection(trZ,correct,normalize=normalizecorrectionplot,freq=[0.005,0.1], 220 | size=(12,3),save=True,form='png') 221 | 222 | trZtc,tgtemp=obs.correctdict2stream(trZ,correct,tc_subset) 223 | if correct_obs_orient: 224 | print(" Correctting horizontal orientations for: "+ista) 225 | trE,trN = obs.correct_orientations(tr1,tr2,obs_orient_data) 226 | newtags[0]=utils.get_tracetag(trE) 227 | newtags[1]=utils.get_tracetag(trN) 228 | print(newtags) 229 | outstream=Stream(traces=[trE,trN,trZtc[0],trP]) 230 | else: 231 | outstream=Stream(traces=[tr1,tr2,trZtc[0],trP]) 232 | """ 233 | Save to ASDF file. 234 | """ 235 | print(' Saving to: '+df_tc) 236 | utils.save2asdf(df_tc,outstream,newtags,sta_inv=inv) 237 | except Exception as e: 238 | print(' Error in calling TCremoval procedures. Drop trace.') 239 | print(df+' : '+ista+' : '+str(e)) 240 | continue 241 | #save auxiliary data to file. 242 | if len(tilt) > 0: 243 | print(' saving auxiliary data to: '+df_tc) 244 | tcpara_temp=tcpara 245 | tcpara_temp['tilt_stations']=sta_processed 246 | utils.save2asdf(df_tc,np.array(tilt),None,group='auxiliary',para={'data_type':'tcremoval', 247 | 'data_path':'tiltdir', 248 | 'parameters':tcpara_temp}) 249 | ############################################### 250 | comm.barrier() 251 | if rank == 0: 252 | tend=time.time() - t0 253 | print('*************************************') 254 | print('<<< Finished all files in %7.1f seconds, or %6.2f hours for %d files >>>' %(tend,tend/3600,len(dfiles))) 255 | sys.exit() 256 | -------------------------------------------------------------------------------- /scripts/seisgo_xcorr_sac.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding: utf-8 3 | import os,sys,glob 4 | from seisgo.noise import compute_fft,correlate 5 | from seisgo import utils,downloaders 6 | 7 | rootdir='.' 8 | respdir='.' 9 | sacfiles = sorted(glob.glob(os.path.join(rootdir,'*.SAC'))) 10 | rm_resp='RESP' 11 | #for removing responses. 12 | freqmin=0.01 13 | freqmax=100 14 | 15 | tr,inv=downloaders.read_data(sacfiles,rm_resp=rm_resp,freqmin=freqmin,freqmax=freqmax,stainv=True) 16 | tr1,tr2=tr;inv1,inv2=inv 17 | 18 | #trimming is needed for this data set, which there is one sample difference in the starting time. 
19 | cstart=max([tr1.stats.starttime,tr2.stats.starttime]) 20 | cend=min([tr1.stats.endtime,tr2.stats.endtime]) 21 | tr1.trim(starttime=cstart,endtime=cend,nearest_sample=True) 22 | tr2.trim(starttime=cstart,endtime=cend,nearest_sample=True) 23 | 24 | print('cross-correlation ...') 25 | cc_len = 3600 # basic unit of data length for fft (sec) 26 | cc_step = 900 # overlapping between each cc_len (sec) 27 | maxlag = 100 # lags of cross-correlation to save (sec) 28 | freq_norm='rma' 29 | time_norm='no' 30 | 31 | #for whitening 32 | freqmin=0.02 33 | freqmax=2 34 | #get FFT, #do correlation 35 | fftdata1=compute_fft(tr1,cc_len,cc_step,stainv=inv1, 36 | freq_norm=freq_norm,freqmin=freqmin,freqmax=freqmax, 37 | time_norm=time_norm,smooth=500) 38 | fftdata2=compute_fft(tr2,cc_len,cc_step,stainv=inv2, 39 | freq_norm=freq_norm,freqmin=freqmin,freqmax=freqmax, 40 | time_norm=time_norm,smooth=500) 41 | corrdata=correlate(fftdata1,fftdata2,maxlag,substack=True) 42 | 43 | #plot xcorr result 44 | freqs=[[0.05,0.1],[0.07,0.1],[0.1,0.5],[0.1,1],[0.5,1],[1,2]] 45 | for i in range(len(freqs)): 46 | corrdata.plot(freqmin=freqs[i][0],freqmax=freqs[i][1],lag=50,stack_method='robust',save=True) 47 | 48 | corrdata.to_asdf('2020.087_xcorr.h5') 49 | 50 | corrdata.to_sac() 51 | -------------------------------------------------------------------------------- /seisgo/__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | MIT License 3 | 4 | Copyright (c) 2021 Xiaotao Yang 5 | 6 | Permission is hereby granted, free of charge, to any person obtaining a copy 7 | of this software and associated documentation files (the "Software"), to deal 8 | in the Software without restriction, including without limitation the rights 9 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 10 | copies of the Software, and to permit persons to whom the Software is 11 | furnished to do so, subject to the following conditions: 12 | 13 | The above copyright notice and this permission notice shall be included in all 14 | copies or substantial portions of the Software. 15 | 16 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 17 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 18 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 19 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 20 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 21 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 22 | SOFTWARE. 
23 | """ 24 | -------------------------------------------------------------------------------- /seisgo/__pycache__/__init__.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xtyangpsp/SeisGo/e989041ec6297187878050e11ffeb9b77c7126a2/seisgo/__pycache__/__init__.cpython-37.pyc -------------------------------------------------------------------------------- /seisgo/__pycache__/obsmaster.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xtyangpsp/SeisGo/e989041ec6297187878050e11ffeb9b77c7126a2/seisgo/__pycache__/obsmaster.cpython-37.pyc -------------------------------------------------------------------------------- /seisgo/__pycache__/utils.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xtyangpsp/SeisGo/e989041ec6297187878050e11ffeb9b77c7126a2/seisgo/__pycache__/utils.cpython-37.pyc -------------------------------------------------------------------------------- /seisgo/clustering.py: -------------------------------------------------------------------------------- 1 | ## 2 | import pickle,os,math 3 | import matplotlib.pyplot as plt 4 | import plotly.express as px 5 | import pandas as pd 6 | import numpy as np 7 | from seisgo import utils 8 | from tslearn.utils import to_time_series, to_time_series_dataset 9 | from tslearn.clustering import TimeSeriesKMeans 10 | from minisom import MiniSom 11 | from kneed import KneeLocator 12 | ###### 13 | def vpcluster_evaluate_kmean(ts,nrange,smooth=False,smooth_n=3,plot=True,njob=1, 14 | metric='euclidean',max_iter_barycenter=100, random_state=0): 15 | """ 16 | 17 | """ 18 | distortion=[] 19 | for n in nrange: 20 | distortion.append(TimeSeriesKMeans(n_clusters=int(n), n_jobs=njob,\ 21 | metric=metric, max_iter_barycenter=max_iter_barycenter, \ 22 | random_state=random_state,verbose=False).fit(ts).inertia_) 23 | 24 | if smooth: 25 | ys=utils.box_smooth(distortion,smooth_n) 26 | ys[0]=distortion[0] 27 | ys[-1]=distortion[-1] 28 | else: 29 | ys = distortion 30 | 31 | 32 | nbest=list(KneeLocator(nrange, ys, S=1, curve="convex", direction="decreasing").all_knees)[0] 33 | if plot: 34 | plt.figure(figsize=(8,4),facecolor='w') 35 | plt.plot(nrange,distortion,'o',label='data') 36 | if smooth: 37 | plt.plot(nrange,ys,'r-',label='smoothed') 38 | else: 39 | plt.plot(nrange,ys,'r-') 40 | plt.vlines(nbest,np.min(ys),np.max(ys),label='knee:'+str(nbest)) 41 | plt.xlabel('number of clusters',fontsize=12) 42 | plt.ylabel('sum of distance to center',fontsize=12) 43 | plt.xticks(nrange,fontsize=12) 44 | plt.yticks(fontsize=12) 45 | plt.legend() 46 | plt.show() 47 | 48 | return nbest,ys 49 | def vpcluster_kmean(lat, lon, dep,vmodel,ncluster=None,nrange=None,spacing=1,njob=1,zrange=None,dz=None, 50 | verbose=False,plot=True,savefig=True,figbase='kmean', 51 | metric='euclidean',max_iter_barycenter=100, random_state=0,save=True, 52 | source='vmodel',tag='v',figsize=None,evaluate_smooth=False,evaluate_plot=True): 53 | """ 54 | zrange: target depth range for clustering. Default None, will use full range. 55 | dz: depth grid interval. If given, will interpolate the depth profiles. 56 | """ 57 | if zrange is None: 58 | zrange=[np.min(dep),np.max(dep)] 59 | 60 | didx=np.where((dep>= zrange[0]) & (dep<= zrange[1]))[0] 61 | v=vmodel[didx] 62 | depth=dep[didx] 63 | 64 | if dz is not None: #do interpolation at depth direction. 
65 | print("interpolation of depth at the interval of: %6.2f km"%(dz)) 66 | zvec=np.arange(np.min(depth),np.max(depth)+0.5*dz,dz) 67 | else: 68 | zvec = depth 69 | 70 | all_v = [] 71 | lat_subidx=[int(x) for x in np.arange(0,len(lat),spacing)] 72 | lon_subidx=[int(x) for x in np.arange(0,len(lon),spacing)] 73 | lat0=[] 74 | lon0=[] 75 | count=0 76 | for i in lat_subidx: 77 | for j in lon_subidx: 78 | v0=np.ndarray((v.shape[0])) 79 | for pp in range(v.shape[0]): 80 | v0[pp]=v[pp,i,j] 81 | if not np.isnan(v0).any() : 82 | if dz is not None: 83 | vtemp=np.interp(zvec,depth,v0) 84 | all_v.append(vtemp) 85 | else: 86 | all_v.append(v0) 87 | lat0.append(lat[i]) 88 | lon0.append(lon[j]) 89 | count += 1 90 | 91 | ts = to_time_series_dataset(all_v) 92 | 93 | # determine the best number of clusters if ncluster is None. 94 | ss=[] 95 | if ncluster is None: 96 | print('ncluster is None. Determine the best. This may take a few minutes.') 97 | if nrange is None: 98 | nrange=np.arange(2,21,1) 99 | ncluster,ss = vpcluster_evaluate_kmean(ts,nrange,smooth=evaluate_smooth,smooth_n=3,plot=evaluate_plot,njob=njob, 100 | metric=metric,max_iter_barycenter=max_iter_barycenter, 101 | random_state=random_state) 102 | km = TimeSeriesKMeans(n_clusters=ncluster, n_jobs=njob,metric=metric, verbose=verbose, 103 | max_iter_barycenter=max_iter_barycenter, random_state=random_state) 104 | y_pred = km.fit_predict(ts) 105 | 106 | 107 | rows = [] 108 | for c in range(count): 109 | cluster = km.labels_[c] 110 | rows.append([lat0[c], lon0[c], cluster+1]) 111 | 112 | df = pd.DataFrame(rows, columns=['lat', 'lon', 'cluster']) 113 | cdata=[] 114 | for yi in range(ncluster): 115 | cdata.append(ts[y_pred == yi].T) 116 | 117 | outdict=dict() 118 | outdict['method']="k-means" 119 | outdict['source']=source 120 | outdict['tag']=tag 121 | outdict['depth']=zvec 122 | outdict['model']=km 123 | outdict['pred']=cdata 124 | outdict['para']={'n_clusters':ncluster,'n_jobs':njob,'metric':metric, 125 | 'max_iter_barycenter':max_iter_barycenter, 126 | 'random_state':random_state,'nrange':nrange,'sum_square':ss} 127 | outdict['cluster_map']=df 128 | 129 | outfile=figbase+"_clusters_k"+str(ncluster)+"_results.pk" 130 | if save: 131 | with open(outfile,'wb') as f: 132 | pickle.dump(outdict,f) 133 | 134 | if plot: 135 | ########### 136 | #### plotting clustered data/time series. 
137 | ########### 138 | if figsize is None: 139 | if ncluster<4: 140 | plt.figure(figsize=(13, 4),facecolor='w') 141 | else: 142 | plt.figure(figsize=(13, 9),facecolor='w') 143 | else: 144 | plt.figure(figsize=figsize,facecolor='w') 145 | for yi in range(ncluster): 146 | if ncluster<4: 147 | plt.subplot(1, ncluster, yi + 1) 148 | elif ncluster<9: 149 | plt.subplot(int(np.ceil(ncluster/2)), 2, yi + 1) 150 | elif ncluster < 16: 151 | plt.subplot(int(np.ceil(ncluster/3)), 3, yi + 1) 152 | elif ncluster < 21: 153 | plt.subplot(int(np.ceil(ncluster/4)), 4, yi + 1) 154 | elif ncluster < 26: 155 | plt.subplot(int(np.ceil(ncluster/5)), 5, yi + 1) 156 | else: 157 | plt.subplot(int(np.ceil(ncluster/6)), 6, yi + 1) 158 | for xx in cdata[yi]: 159 | plt.plot(zvec,xx, "k-", alpha=.2) 160 | plt.plot(zvec,km.cluster_centers_[yi].ravel(), "r-") 161 | plt.text(0.65, 0.15, 'Cluster %d' % (yi + 1), 162 | transform=plt.gca().transAxes) 163 | plt.title(f"Cluster {yi+1}") 164 | plt.xlabel('depth (km)') 165 | plt.ylabel('Vs (km/s)') 166 | plt.tight_layout() 167 | 168 | if savefig: 169 | plt.savefig(figbase+"_clusters_k"+str(ncluster)+".png",format="png") 170 | plt.close() 171 | else: 172 | plt.show() 173 | 174 | ##################### 175 | ######## plot map view of clusters. 176 | #################### 177 | # Create map using plotly 178 | fig = px.scatter_mapbox( 179 | df,lat="lat",lon='lon',color='cluster',size_max=13,zoom=3,width=900,height=800, 180 | ) 181 | 182 | fig.update_layout( 183 | mapbox_style="white-bg", 184 | mapbox_layers=[ 185 | { 186 | "below": 'traces', 187 | "sourcetype": "raster", 188 | "source": ["https://basemap.nationalmap.gov/arcgis/rest/services/USGSImageryOnly/MapServer/tile/{z}/{y}/{x}"] 189 | } 190 | ] 191 | ) 192 | if savefig: 193 | fig.write_image(figbase+"_clustermap_k"+str(ncluster)+".png",format="png") 194 | else: 195 | fig.show() 196 | # 197 | if not save: 198 | return outdict 199 | 200 | # 201 | def vpcluster_som(lat, lon, depth,v,grid_size=None,spacing=1,niteration=50000,sigma=0.3, 202 | rate=0.1,verbose=False,plot=True,savefig=True,figbase='som', 203 | save=True,source='vmodel',tag='v',figsize=None): 204 | all_v = [] 205 | lat_subidx=[int(x) for x in np.arange(0,len(lat),spacing)] 206 | lon_subidx=[int(x) for x in np.arange(0,len(lon),spacing)] 207 | lat0=[] 208 | lon0=[] 209 | count=0 210 | for i in lat_subidx: 211 | for j in lon_subidx: 212 | v0=np.ndarray((v.shape[0])) 213 | for pp in range(v.shape[0]): 214 | v0[pp]=v[pp,i,j] 215 | if not np.isnan(v0).any() : 216 | all_v.append(v0) 217 | lat0.append(lat[i]) 218 | lon0.append(lon[j]) 219 | count += 1 220 | 221 | if grid_size is None: 222 | som_x = som_y = math.ceil(math.sqrt(math.sqrt(len(all_v)))) 223 | else: 224 | som_x=grid_size[0] 225 | som_y=grid_size[1] 226 | som = MiniSom(som_x, som_y,len(all_v[0]), sigma=sigma, learning_rate = rate) 227 | som.random_weights_init(all_v) 228 | som.train(all_v, niteration) 229 | 230 | win_map = som.win_map(all_v) 231 | # Returns the mapping of the winner nodes and inputs 232 | 233 | rows = [] 234 | for c in range(count): 235 | wm=som.win_map([all_v[c]]) 236 | x,y=list(wm.keys())[0] 237 | cluster=x*som_y + y + 1 238 | rows.append([lat0[c], lon0[c], cluster]) 239 | df = pd.DataFrame(rows, columns=['lat', 'lon', 'cluster']) 240 | 241 | #clustered data 242 | cdata=[] 243 | 244 | for x in range(som_x): 245 | for y in range(som_y): 246 | cluster = (x,y) 247 | if cluster in win_map.keys(): 248 | cdata.append(win_map[cluster]) 249 | 250 | outdict=dict() 251 | outdict['method']="som" 252 | 
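Before the `outdict` assembly continues below, note how the SOM step above works: each profile's winning node on the som_x-by-som_y grid is flattened into a cluster index. A minimal MiniSom sketch with synthetic profiles (grid size and parameters are illustrative only):

```python
import numpy as np
from minisom import MiniSom

rng = np.random.default_rng(1)
profiles = rng.standard_normal((100, 50))   # 100 fake profiles, 50 depth samples
som = MiniSom(3, 3, 50, sigma=0.3, learning_rate=0.1)
som.random_weights_init(profiles)
som.train(profiles, 5000)

x, y = som.winner(profiles[0])              # winning node of one profile
cluster = x * 3 + y + 1                     # same flattening as x*som_y + y + 1 above
print(cluster)
```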
outdict['source']=source 253 | outdict['tag']=tag 254 | outdict['depth']=depth 255 | outdict['model']=som 256 | outdict['pred']=cdata 257 | outdict['para']={'nx':som_x,'ny':som_y,'sigma':sigma, 258 | 'niteration':niteration, 'learning_rate':rate} 259 | outdict['cluster_map']=df 260 | 261 | outfile=figbase+"_clusters_k"+str(som_x)+"x"+str(som_y)+"_results.pk" 262 | if save: 263 | with open(outfile,'wb') as f: 264 | pickle.dump(outdict,f) 265 | 266 | ########### 267 | ####### 268 | if plot: 269 | ncluster=len(cdata) 270 | if figsize is None: 271 | if ncluster<4: 272 | plt.figure(figsize=(13, 4),facecolor='w') 273 | else: 274 | plt.figure(figsize=(13, 9),facecolor='w') 275 | else: 276 | plt.figure(figsize=figsize,facecolor='w') 277 | for i in range(len(cdata)): 278 | if ncluster<4: 279 | plt.subplot(1, ncluster, i + 1) 280 | elif ncluster<9: 281 | plt.subplot(int(np.ceil(ncluster/2)), 2, i + 1) 282 | elif ncluster < 16: 283 | plt.subplot(int(np.ceil(ncluster/3)), 3, i + 1) 284 | elif ncluster < 21: 285 | plt.subplot(int(np.ceil(ncluster/4)), 4, i + 1) 286 | elif ncluster < 26: 287 | plt.subplot(int(np.ceil(ncluster/5)), 5, i + 1) 288 | else: 289 | plt.subplot(int(np.ceil(ncluster/6)), 6, i + 1) 290 | 291 | for series in cdata[i]: 292 | plt.plot(depth,series,c="gray",alpha=0.3) 293 | plt.plot(depth,np.average(np.vstack(cdata[i]),axis=0),c="red") 294 | plt.title(f"Cluster {i+1}") 295 | plt.xlabel('depth (km)') 296 | plt.ylabel('Vs (km/s)') 297 | plt.tight_layout() 298 | if savefig: 299 | plt.savefig(figbase+"_clusters_k"+str(som_x)+"x"+str(som_y)+".png",format="png") 300 | plt.close() 301 | else: 302 | plt.show() 303 | 304 | ##################### 305 | ######## plot map view of clusters. 306 | #################### 307 | # Create map using plotly 308 | fig = px.scatter_mapbox( 309 | df,lat="lat",lon='lon',color='cluster',size_max=13,zoom=3,width=900,height=800, 310 | ) 311 | 312 | fig.update_layout( 313 | mapbox_style="white-bg", 314 | mapbox_layers=[ 315 | { 316 | "below": 'traces', 317 | "sourcetype": "raster", 318 | "source": ["https://basemap.nationalmap.gov/arcgis/rest/services/USGSImageryOnly/MapServer/tile/{z}/{y}/{x}"] 319 | } 320 | ] 321 | ) 322 | if savefig: 323 | fig.write_image(figbase+"_clustermap_k"+str(som_x)+"x"+str(som_y)+".png",format="png") 324 | else: 325 | fig.show() 326 | if not save: 327 | return outdict 328 | -------------------------------------------------------------------------------- /seisgo/dispersion.py: -------------------------------------------------------------------------------- 1 | import os,glob,copy,obspy,scipy,time,pycwt,pyasdf,datetime 2 | import numpy as np 3 | import pandas as pd 4 | from scipy.signal import hilbert 5 | from obspy.signal.util import _npts2nfft 6 | from obspy.signal.invsim import cosine_taper 7 | from obspy.signal.regression import linear_regression 8 | from obspy.core.util.base import _get_function_from_entry_point 9 | from obspy.core.inventory import Inventory, Network, Station, Channel, Site 10 | from scipy.fftpack import fft,ifft,next_fast_len 11 | from obspy.signal.filter import bandpass,lowpass 12 | import matplotlib.pyplot as plt 13 | """ 14 | This is a planned module, to be developed. 
15 | """ 16 | ################################################################ 17 | ################ DISPERSION EXTRACTION FUNCTIONS ############### 18 | ################################################################ 19 | def get_dispersion_waveforms_cwt(d, dt,fmin,fmax,dj=1/12, s0=-1, J=-1, wvn='morlet'): 20 | """ 21 | Produce dispersion wavefroms with continuous wavelet tranform. 22 | 23 | ===parameters=== 24 | d: 1-d array data. 25 | df: time interval. 26 | fmin, fmax: frequency range. 27 | dj=1/12, s0=-1, J=-1, wvn='morlet': pycwt.cwt parameters. 28 | 29 | ==returns=== 30 | dout, fout: narrowband-filtered waveforms and the frequency vector. 31 | """ 32 | ds_cwt, sj, f, coi, _, _ = pycwt.cwt(d, dt, dj, s0, J, wvn) 33 | f_ind = np.where((f >= fmin) & (f <= fmax))[0] 34 | dout=[] 35 | fout=[] 36 | for ii in range(len(f_ind)): 37 | if ii>0 and ii=1/(2*dt) or f_all[ii+extend]>=1/(2*dt): continue 77 | ds_win=bandpass(din,f_all[ii],f_all[ii+extend],1/dt,corners=4, zerophase=True) 78 | dout_temp.append(ds_win/np.max(np.abs(ds_win))) 79 | fout_temp.append(np.mean([f_all[ii],f_all[ii+extend]])) #center frequency 80 | fout_temp=np.array(fout_temp) 81 | f_ind=np.where((fout_temp>=1/pmax) & (fout_temp<=1/pmin))[0] 82 | fout=fout_temp[f_ind] 83 | dout_temp=np.array(dout_temp) 84 | dout = dout_temp[f_ind] 85 | pout = 1/fout 86 | return dout, pout 87 | ## 88 | def get_dispersion_image(g,t,d,pmin,pmax,vmin,vmax,dp=1,dv=0.1,window=1,pscale='ln',pband_extend=5, 89 | verbose=False,min_trace=5,min_wavelength=1.5,energy_type='power_sum', 90 | plot=False,figsize=None,cmap='jet',clim=[0,1]): 91 | """ 92 | Uses phase-shift method. Park et al. (1998): http://www.masw.com/files/DispersionImaingScheme-1.pdf 93 | 94 | =====PARAMETERS==== 95 | g: waveform gather for all distances (traces). It should be a numpy array. 96 | t: time vector. 97 | d: distance vector corresponding to the waveforms in `g` 98 | pmin: minimum period. 99 | pmax: maximum period. 100 | vmin: minimum phase velocity to search. 101 | vmax: maximum phase velocity to search. 102 | dp: period increment. default is 1. 103 | dv: velocity increment in searching. default is 0.1 104 | window: number of wavelength when slicing the time segments in computing summed energy. default is 1. 105 | Window can be a two-element array [min,max], when the window size will be interpolated between 106 | the minimum and the maximum. 107 | pscale: period vector scale in applying narrowband filters. default is 'ln' for linear scale. 108 | pband_extend: number of period increments to extend in filtering. defult is 5. 109 | verbose: verbose mode. default False. 110 | min_trace: minimum trace to consider. default 5. 111 | min_wavelength: minimum wavelength to satisfy far-field. default 1.5. 112 | energy_type: method to compute maximum energy, 'envelope' or 'power_sum'. Default is 'power_sum' 113 | plot: plot dispersion image or not. Default is False. 114 | figsize: specify figsize. Decides automatically if not specified. 115 | cmap: colormap. Default is 'jet' 116 | clim: color value limit. Default is [0,1] 117 | 118 | =====RETURNS==== 119 | dout: dispersion information showing the normalized energy for each velocity value for each frequency. 120 | vout: velocity vector used in searching. 121 | pout: period vector. 122 | """ 123 | #validate options. 124 | energy_type_list=['power_sum','envelope'] 125 | if energy_type.lower() not in energy_type_list: 126 | raise ValueError(energy_type+" is not a recoganized energy type. 
127 |     if len(np.array(window).shape) < 1:
128 |         window=[window,window]
129 | 
130 |     dt=np.abs(t[1]-t[0])
131 |     if t[0]<-1.0*dt and t[-1]> dt: #two sides.
132 |         side='a'
133 |         zero_idx=int((len(t)-1)/2)
134 |         if figsize is None:
135 |             figsize=(10,4)
136 |     elif t[0]<-1.0*dt:
137 |         side='n'
138 |         zero_idx=len(t)-1
139 |         if figsize is None:
140 |             figsize=(5,4)
141 |     elif t[-1]>dt:
142 |         side='p'
143 |         zero_idx=0
144 |         if figsize is None:
145 |             figsize=(5,4)
146 | 
147 |     if verbose: print('working on side: '+side)
148 |     dfiltered_all=[]
149 |     dist_final=[]
150 |     for k in range(g.shape[0]):
151 |         dtemp,pout=narrowband_waveforms(g[k]/np.max(np.abs(g[k])),dt,pmin,
152 |                                         pmax,dp=dp,pscale=pscale,extend=pband_extend)
153 |         dfiltered_all.append(dtemp)
154 |     dfiltered_all=np.array(dfiltered_all)
155 |     vout=np.arange(vmin,vmax+0.5*dv,dv)
156 |     dout_n_all=[]
157 |     dout_p_all=[]
158 |     window_vector=np.linspace(window[1],window[0],len(pout))
159 |     for k in range(len(pout)):
160 |         win_length=window_vector[k]*pout[k]
161 |         win_len_samples=int(win_length/dt)+1
162 |         dout_n=[]
163 |         dout_p=[]
164 | 
165 |         d_in=dfiltered_all[:,k,:]
166 |         for i,v in enumerate(vout):
167 |             #subset by distance
168 |             mindist=min_wavelength*v*pout[k] #at least 1.5 wavelength.
169 |             dist_idx=np.where((d >= mindist))[0]
170 |             if len(dist_idx) >min_trace:
171 |                 if side=='a' or side=='n':
172 |                     dvec=[]
173 |                     for j in dist_idx: #distance, loop through traces
174 |                         tmin=d[j]/v
175 |                         tmin_idx=zero_idx - int(tmin/dt)
176 |                         dsec=d_in[j][tmin_idx - win_len_samples : tmin_idx]
177 |                         if not any(np.isnan(dsec)):
178 |                             dvec.append(dsec)
179 |                     if energy_type.lower() == 'power_sum':
180 |                         peak_energy=np.sum(np.power(np.mean(dvec,axis=1),2))
181 |                     elif energy_type.lower() == 'envelope':
182 |                         peak_energy=np.max(np.abs(hilbert(np.mean(dvec,axis=1))))
183 |                     dout_n.append(peak_energy)
184 | 
185 |                 if side=='a' or side=='p':
186 |                     dvec=[]
187 |                     for j in dist_idx: #distance, loop through traces
188 |                         tmin=d[j]/v
189 |                         tmin_idx=zero_idx + int(tmin/dt)
190 |                         dsec=d_in[j][tmin_idx : tmin_idx + win_len_samples]
191 |                         if not any(np.isnan(dsec)):
192 |                             dvec.append(dsec)
193 |                     #
194 |                     if energy_type.lower() == 'power_sum':
195 |                         peak_energy=np.sum(np.power(np.mean(dvec,axis=1),2))
196 |                     elif energy_type.lower() == 'envelope':
197 |                         peak_energy=np.max(np.abs(hilbert(np.mean(dvec,axis=1))))
198 |                     dout_p.append(peak_energy)
199 |             else:
200 |                 if side=='a' or side=='n':
201 |                     dout_n.append(np.nan)
202 | 
203 |                 if side=='a' or side=='p':
204 |                     dout_p.append(np.nan)
205 | 
206 |         if side=='a' or side=='n':
207 |             dout_n /= np.nanmax(dout_n)
208 |             dout_n_all.append(dout_n)
209 |         if side=='a' or side=='p':
210 |             dout_p /= np.nanmax(dout_p)
211 |             dout_p_all.append(dout_p)
212 |     # plot or not
213 |     if plot:
214 |         plt.figure(figsize=figsize)
215 |         if side == 'a':
216 |             plt.subplot(1,2,1)
217 |             plt.imshow(np.flip(np.array(dout_n_all).T),cmap=cmap,extent=[pout[-1],pout[0],vout[0],vout[-1]],aspect='auto')
218 |             plt.ylabel('velocity (km/s)',fontsize=12)
219 |             plt.xlabel('period (s)',fontsize=12)
220 |             # plt.xticks(np.arange(pmin,pmax+1,5),fontsize=12)
221 |             # plt.yticks(np.arange(vmin,vmax+.5,.5),fontsize=12)
222 |             plt.clim(clim)
223 |             plt.colorbar()
224 |             plt.title('dispersion from negative lag: '+energy_type,fontsize=13)
225 | 
226 |             plt.subplot(1,2,2)
227 |             plt.imshow(np.flip(np.array(dout_p_all).T),cmap=cmap,extent=[pout[-1],pout[0],vout[0],vout[-1]],aspect='auto')
228 |             plt.ylabel('velocity (km/s)',fontsize=12)
229 |             plt.xlabel('period (s)',fontsize=12)
230 |             # plt.xticks(np.arange(pmin,pmax+1,5),fontsize=12)
231 |             # plt.yticks(np.arange(vmin,vmax+.5,.5),fontsize=12)
232 |             plt.clim(clim)
233 |             plt.colorbar()
234 |             plt.title('dispersion from positive lag: '+energy_type,fontsize=13)
235 |         elif side == 'n':
236 |             plt.imshow(np.flip(np.array(dout_n_all).T),cmap=cmap,extent=[pout[-1],pout[0],vout[0],vout[-1]],aspect='auto')
237 |             plt.ylabel('velocity (km/s)',fontsize=12)
238 |             plt.xlabel('period (s)',fontsize=12)
239 |             # plt.xticks(np.arange(pmin,pmax+1,5),fontsize=12)
240 |             # plt.yticks(np.arange(vmin,vmax+.5,.5),fontsize=12)
241 |             plt.clim(clim)
242 |             plt.colorbar()
243 |             plt.title('dispersion from negative lag: '+energy_type,fontsize=13)
244 |         elif side == 'p':
245 |             plt.imshow(np.flip(np.array(dout_p_all).T),cmap=cmap,extent=[pout[-1],pout[0],vout[0],vout[-1]],aspect='auto')
246 |             plt.ylabel('velocity (km/s)',fontsize=12)
247 |             plt.xlabel('period (s)',fontsize=12)
248 |             # plt.xticks(np.arange(pmin,pmax+1,5),fontsize=12)
249 |             # plt.yticks(np.arange(vmin,vmax+.5,.5),fontsize=12)
250 |             plt.clim(clim)
251 |             plt.colorbar()
252 |             plt.title('dispersion from positive lag: '+energy_type,fontsize=13)
253 |         #
254 |         plt.show()
255 | 
256 |     if side=='a':
257 |         dout=np.squeeze(np.array([dout_n_all,dout_p_all],dtype=np.float64))
258 |     elif side == 'p':
259 |         dout=np.squeeze(np.array(dout_p_all,dtype=np.float64))
260 |     elif side == 'n':
261 |         dout=np.squeeze(np.array(dout_n_all,dtype=np.float64))
262 |     return dout,vout,pout
263 | # function to extract the dispersion curve from the image
264 | # modified from NoisePy.
265 | def extract_dispersion_curve(amp,vel):
266 |     '''
267 |     this function takes the dispersion image as input and tracks the global maximum of
268 |     the spectrum amplitude at each period.
269 | 
270 |     PARAMETERS:
271 |     ----------------
272 |     amp: 2D amplitude matrix of the wavelet spectrum
273 |     vel: velocity vector of the 2D matrix
274 |     RETURNS:
275 |     ----------------
276 |     gv: group velocity vector at each frequency
277 |     '''
278 |     nper = amp.shape[0]
279 |     gv = np.zeros(nper,dtype=np.float32)
280 |     dv = vel[1]-vel[0]
281 | 
282 |     # find the global maximum at each period
283 |     for ii in range(nper):
284 |         maxvalue = np.max(amp[ii],axis=0)
285 |         indx = list(amp[ii]).index(maxvalue)
286 |         gv[ii] = vel[indx]
287 | 
288 |     return gv
289 | 
--------------------------------------------------------------------------------
/seisgo/helpers.py:
--------------------------------------------------------------------------------
1 | #SeisGo helper functions.
2 | #
3 | 
4 | """
5 | This module contains functions that help users understand and use SeisGo.
6 | It plays a similar role to a tutorial, though it can be accessed within code. The
7 | purpose is to reduce redundancy and make the package easier to maintain and update.
8 | """
9 | 
10 | def xcorr_methods():
11 |     """
12 |     Returns available xcorr methods.
13 |     """
14 |     o=["xcorr", "deconv", "coherency"]
15 | 
16 |     return o
17 | 
18 | def stack_methods():
19 |     """
20 |     Returns available stacking methods.
21 |     """
22 |     o=["linear","pws","tf-pws","robust","acf","nroot","selective","cluster"]
23 | 
24 |     return o
25 | def dvv_methods():
26 |     """
27 |     Returns available dv/v measuring methods.
28 |     """
29 |     o=['wts','ts']
30 | 
31 |     return o
32 | 
33 | def wavelet_labels():
34 |     """
35 |     Returns the available wavelets.
36 |     """
37 |     o=["gaussian","ricker"]
38 | 
39 |     return o
40 | #
41 | def xcorr_norm_methods(mode="tf"):
42 |     """
43 |     Normalization methods for cross-correlations.
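    The labels follow NoisePy-style conventions (an assumption worth checking
    against the noise module): "rma" = running absolute mean, "one_bit" =
    sign-bit normalization, "ftn" = frequency-time normalization, and
    "phase_only" = spectral whitening keeping only the phase. mode selects
    the returned list: "t" for time domain, "f" for frequency domain, and
    anything else returns both.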
44 |     """
45 | 
46 |     fnorm=["rma","phase_only"]
47 |     tnorm=["rma","one_bit","ftn"]
48 | 
49 |     if mode=="t": return tnorm
50 |     elif mode=="f": return fnorm
51 |     else: return tnorm,fnorm
52 | 
53 | def xcorr_output_structure():
54 |     """
55 |     Options to organize xcorr output files. These options determine the subdirectory
56 |     under the root data directory.
57 | 
58 |     Available options:
59 |     raw: same as raw data, normally by time chunks for all pairs.
60 |     source: organized by subfolder named with the virtual source, with all receiver pairs in the same time chunk file.
61 |     station-pair: subfolder named by station pair. all components will be saved in the same chunk file.
62 |     station-component-pair: subfolder named by station pair, with a lower-level folder named by component pair.
63 |     """
64 |     o=["raw","source","station-pair","station-component-pair"]
65 |     o_short=["r","s","sp","scp"]
66 | 
67 |     return o,o_short
68 | 
69 | def xcorr_sides():
70 |     """
71 |     Side options/labels for xcorr data.
72 |     a: both negative and positive sides joined.
73 |     n: negative
74 |     p: positive
75 |     o: one-sided, unclear whether negative or positive.
76 |     u: not applicable
77 |     """
78 |     o=["a","n","p","o","u"]
79 | 
80 |     return o
81 | 
82 | def outdatafile_formats():
83 |     """
84 |     Formats when saving data files.
85 |     """
86 | 
87 |     o=["asdf","pickle"]
88 | 
89 |     return o
90 | 
91 | def datafile_extension():
92 |     """
93 |     File extensions for input and output data.
94 |     """
95 | 
96 |     o=["h5","pk"]
97 | 
98 |     return o
99 | 
--------------------------------------------------------------------------------
/seisgo/simulation.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | from seisgo import utils
3 | def fd1d_dx4dt4(x,dt,tmax,vmodel,rho,xsrc,xrcv,stf_freq=1,stf_shift=None,stf_type='ricker',t_interval=1):
4 |     """
5 |     Modified from Florian Wittkamp.
6 | 
7 |     Finite-difference acoustic seismic wave simulation,
8 |     discretizing the first-order acoustic wave equation.
9 | 
10 |     Temporal fourth-order accuracy $O(\Delta T^4)$
11 | 
12 |     Spatial fourth-order accuracy $O(\Delta X^4)$
13 | 
14 |     The temporal discretization is based on the Adams-Bashforth method described in:
15 | 
16 |     Bohlen, T., & Wittkamp, F. (2016).
17 |     Three-dimensional viscoelastic time-domain finite-difference seismic modelling using the staggered Adams-Bashforth time integrator.
18 | 
19 |     Geophysical Journal International, 204(3), 1781-1788.
20 | 
21 |     =====PARAMETERS=====
22 |     x: spatial vector
23 |     dt: time step for simulation.
24 |     tmax: maximum time for simulation.
25 |     vmodel: velocity model for each spatial grid.
26 |     rho: density model for each spatial grid.
27 |     xsrc: src grid index.
28 |     xrcv: receiver grid index.
29 |     stf_freq: source time function frequency parameter. Gaussian: width; Ricker: central frequency.
30 |     stf_shift: source time function shift. default is 3*stf_freq.
31 |     stf_type: "ricker" or "gaussian".
32 |     t_interval: time interval of the output waveform, in number of time steps. default: 1.
33 | 
34 |     ======RETURNS======
35 |     tout,seisout: time and seismogram output.
36 |     """
37 |     c2=0.5 # CFL-Number. Stability condition.
38 |     cmax=max(vmodel.flatten())
39 |     dt_max=np.max(np.diff(x))/(cmax)*c2 #maximum time interval/step.
40 |     if stf_shift is None:
41 |         stf_shift = 3*stf_freq
42 |     nx=len(x)
43 |     dx=np.abs(x[1]-x[0])
44 | 
45 |     if dt > dt_max:
46 |         raise ValueError('dt %f is larger than allowable %f.'%(dt,dt_max))
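    # The simulated time axis is padded by stf_shift so that the source onset
    # can be trimmed from the returned records at the end of this function.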
47 |     t=np.arange(0,tmax+stf_shift+0.5*dt,dt) # Time vector
48 |     nt=len(t)
49 |     #wavelet information.
50 |     # q0=1
51 |     wavelet = np.zeros((len(t)))
52 | 
53 |     if stf_type.lower() == 'ricker':
54 |         wlet0=utils.ricker(dt,stf_freq,stf_shift)[1]
55 |         # tau=np.pi*stf_width*(t-1.5/stf_width)
56 |         # wavelet=q0*(1.0-2.0*tau**2.0)*np.exp(-tau**2)
57 |     elif stf_type.lower() == 'gaussian' or stf_type.lower() == 'gauss':
58 |         wlet0=utils.gaussian(dt,stf_freq,stf_shift)[1]
59 |     else:
60 |         raise ValueError(stf_type+" not recognized.")
61 |     #
62 |     wavelet[:len(wlet0)]=wlet0
63 | 
64 |     # Plotting source signal
65 |     # plt.figure(figsize=(10,3))
66 |     # plt.plot(t,wavelet)
67 |     # plt.title('Source signal Ricker-Wavelet')
68 |     # plt.ylabel('Amplitude')
69 |     # plt.xlabel('Time in s')
70 |     # plt.xlim(0,10)
71 |     # plt.draw()
72 | 
73 |     # Init wavefields
74 |     vx=np.zeros(nx)
75 |     p=np.zeros(nx)
76 |     vx_x=np.zeros(nx)
77 |     p_x=np.zeros(nx)
78 |     vx_x2=np.zeros(nx)
79 |     p_x2=np.zeros(nx)
80 |     vx_x3=np.zeros(nx)
81 |     p_x3=np.zeros(nx)
82 |     vx_x4=np.zeros(nx)
83 |     p_x4=np.zeros(nx)
84 | 
85 |     # Calculate the first Lame parameter
86 |     l=rho * vmodel * vmodel
87 | 
88 |     ## Time stepping
89 | 
90 |     # Init seismogram
91 |     seisout=np.zeros((nt)); # Pressure record at the receiver
92 | 
93 |     # Calculation of some coefficients
94 |     i_dx=1.0/(dx)
95 |     kx=np.arange(0,nx-4)
96 | 
97 |     print("Starting time stepping...")
98 |     ## Time stepping
99 |     for n in range(2,nt):
100 | 
101 |         # Inject source wavelet
102 |         p[xsrc]=p[xsrc]+wavelet[n]
103 | 
104 |         # Calculating spatial derivative
105 |         p_x[kx]=i_dx*9.0/8.0*(p[kx+1]-p[kx])-i_dx*1.0/24.0*(p[kx+2]-p[kx-1])
106 | 
107 |         # Update velocity
108 |         vx[kx]=vx[kx]-dt/rho[kx]*(13.0/12.0*p_x[kx]-5.0/24.0*p_x2[kx]+1.0/6.0*p_x3[kx]-1.0/24.0*p_x4[kx])
109 | 
110 |         # Save old spatial derivatives for the Adams-Bashforth method
111 |         np.copyto(p_x4,p_x3)
112 |         np.copyto(p_x3,p_x2)
113 |         np.copyto(p_x2,p_x)
114 | 
115 |         # Calculating spatial derivative
116 |         vx_x[kx]= i_dx*9.0/8.0*(vx[kx]-vx[kx-1])-i_dx*1.0/24.0*(vx[kx+1]-vx[kx-2])
117 | 
118 |         # Update pressure
119 |         p[kx]=p[kx]-l[kx]*dt*(13.0/12.0*vx_x[kx]-5.0/24.0*vx_x2[kx]+1.0/6.0*vx_x3[kx]-1.0/24.0*vx_x4[kx])
120 | 
121 |         # Save old spatial derivatives for the Adams-Bashforth method
122 |         np.copyto(vx_x4,vx_x3)
123 |         np.copyto(vx_x3,vx_x2)
124 |         np.copyto(vx_x2,vx_x)
125 | 
126 |         # Save seismograms
127 |         seisout[n]=p[xrcv]
128 | 
129 |     print("Finished time stepping!")
130 |     #shift to account for the stf_shift.
131 | 
132 |     if t_interval >1: #downsample data in time.
133 |         tout=np.arange(0,tmax+0.5*t_interval*dt,t_interval*dt)
134 |         seisout=np.interp(tout,t-stf_shift,seisout)
135 |     else:
136 |         tout=t[int(stf_shift/dt):]
137 |         seisout=seisout[int(stf_shift/dt):]
138 | 
139 |     return tout,seisout
140 | 
141 | ###
142 | def build_vmodel(zmax,dz,nlayer,vmin,vmax,rhomin,rhomax,zmin=0,layer_dv=None):
143 |     """
144 |     Build a layered velocity model with linearly increasing velocity, with the option of specifying anomalous layers.
145 |     =========
146 |     zmax: maximum depth of the model.
147 |     dz: model grid spacing, finer than the velocity layers. this is to create a fine-grid layered model
148 |         with multiple grids within each layer.
149 |     nlayer: number of velocity layers.
150 |     vmin, vmax: velocity range.
151 |     rhomin,rhomax: density range.
152 |     zmin=0: minimum depth. default is 0.
153 |     layer_dv: fractional velocity perturbation for each layer.
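        Must have length nlayer. For example (hypothetical values), with
        nlayer=4, layer_dv=[0, 0, -0.1, 0] lowers the third layer's velocity
        by 10%, since layer velocities are scaled as v*(1+dv). Default None
        applies no perturbation.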
154 |     """
155 |     layerv=np.linspace(vmin,vmax,nlayer)
156 |     if layer_dv is None:
157 |         layer_dv = np.zeros((nlayer))
158 |     layerv = np.multiply(layerv,1+layer_dv)
159 |     layerrho=np.linspace(rhomin,rhomax,nlayer)
160 | 
161 |     z=np.arange(zmin,zmax+0.5*dz,dz)
162 | 
163 |     zlayer=np.linspace(zmin,zmax,nlayer)
164 |     v=np.zeros((len(z)))
165 |     rho=np.zeros((len(z)))
166 |     for i in range(len(z)):
167 |         zidx_all=np.where((zlayer<=z[i]))[0]
168 |         zidx=np.argmax(zlayer[zidx_all])
169 | 
170 |         v[i]=layerv[zidx]
171 |         rho[i]=layerrho[zidx]
172 |     #
173 |     return z,v,rho
174 | 
--------------------------------------------------------------------------------
/seisgo/stacking.py:
--------------------------------------------------------------------------------
1 | import os,glob,copy,obspy,scipy,time
2 | import numpy as np
3 | from seisgo.utils import rms
4 | import matplotlib.pyplot as plt
5 | from scipy.signal import hilbert
6 | from scipy.fftpack import fft,ifft,next_fast_len
7 | from stockwell import st
8 | from tslearn.utils import to_time_series, to_time_series_dataset
9 | from tslearn.clustering import TimeSeriesKMeans
10 | """
11 | Stacking functions.
12 | """
13 | def stack(d,method,par=None):
14 |     """
15 |     this is a wrapper for calling the individual stacking functions.
16 |     d: data. 2-d array
17 |     method: stacking method, one of "linear","pws","tfpws","tfpws-dost","robust","acf",
18 |         "nroot","selective","cluster"
19 |     par: dictionary containing all parameters for each stacking method. defaults will
20 |         be used if not specified.
21 | 
22 |     RETURNS:
23 |     ds: stacked data, which may be a list depending on the method.
24 |     """
25 |     #remove NaN traces.
26 |     newdata= []
27 |     if d.ndim >1:
28 |         for d2 in d:
29 |             if not np.isnan(d2).any():
30 |                 newdata.append(d2)
31 |         newdata = np.array(newdata)
32 |     else:
33 |         newdata=d.copy()
34 | 
35 |     method_list=["linear","pws","robust","acf","nroot","selective",
36 |                  "cluster","tfpws","tfpws-dost"]
37 |     if method not in method_list:
38 |         raise ValueError("%s not recognized. Use one of %s"%(method,str(method_list)))
39 |     par0={"axis":0,"p":2,"g":1,"cc_min":0.0,"epsilon":1E-5,"maxstep":10,
40 |           "win":None,"stat":False,"h":0.75,'plot':False,'normalize':True,'ref':None} #stat: if true, will return statistics.
41 |     if par is None:
42 |         par=par0
43 |     else:
44 |         par={**par0,**par} #use par values if specified. otherwise, use defaults.
45 | 
46 |     if method.lower() == 'linear':
47 |         ds = np.mean(newdata,axis=par["axis"])
48 |     elif method.lower() == 'pws':
49 |         ds = pws(newdata,p=par['p'])
50 |     elif method.lower() == 'tfpws':
51 |         ds = tfpws(newdata,p=par['p'])
52 |     elif method.lower() == 'tfpws-dost':
53 |         ds = tfpws_dost(newdata,p=par['p'])
54 |     elif method.lower() == 'robust':
55 |         ds = robust(newdata,epsilon=par['epsilon'],maxstep=par['maxstep'],win=par["win"],
56 |                     stat=par['stat'],ref=par['ref'])
57 |     elif method.lower() == 'acf':
58 |         ds = adaptive_filter(newdata,g=par['g'])
59 |     elif method.lower() == 'nroot':
60 |         ds = nroot(newdata,p=par['p'])
61 |     elif method.lower() == 'selective':
62 |         ds = selective(newdata,cc_min=par['cc_min'],epsilon=par['epsilon'],maxstep=par['maxstep'],
63 |                        stat=par['stat'],ref=par['ref'],win=par["win"])
64 |     elif method.lower() == 'cluster':
65 |         ds = clusterstack(newdata,h=par['h'],axis=par['axis'],win=par["win"],
66 |                           normalize=par['normalize'],plot=par['plot'])
67 |     #
68 |     return ds
69 | 
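# Usage sketch (illustrative only, not part of the library API): stack a 2-D
# array `ccfs` of windowed cross-correlations (rows = windows) with the robust
# method, overriding two of the par0 defaults defined in stack() above.
#   from seisgo import stacking
#   ds = stacking.stack(ccfs, "robust", par={"epsilon": 1E-6, "maxstep": 20})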
70 | def seisstack(d,method,par=None):
71 |     """
72 |     This is the same as stack(), kept for compatibility with old usage.
73 |     """
74 |     return stack(d,method=method,par=par)
75 | 
76 | def robust(d,epsilon=1E-5,maxstep=10,win=None,stat=False,ref=None):
77 |     """
78 |     this is a robust stacking algorithm described in Pavlis and Vernon 2010. Generalized
79 |     by Xiaotao Yang.
80 | 
81 |     PARAMETERS:
82 |     ----------------------
83 |     d: numpy.ndarray contains the 2D cross correlation matrix
84 |     epsilon: residual threshold to quit the iteration (a small number). Default 1E-5
85 |     maxstep: maximum iterations. default 10.
86 |     win: [start_index,end_index] used to compute the weight, instead of the entire trace. Default None.
87 |         When None, use the entire trace.
88 |     ref: reference stack, with the same length as individual data. Default: None. Use median().
89 |     RETURNS:
90 |     ----------------------
91 |     newstack: numpy vector contains the stacked cross correlation
92 |     w: weight (returned only when stat = True.)
93 |     nstep: number of iterations to produce the final stack (returned only when stat = True.)
94 |     Written by Marine Denolle
95 |     Modified by Xiaotao Yang
96 |     """
97 |     if d.ndim == 1:
98 |         print('2D matrix is needed')
99 |         return d
100 |     N,M = d.shape
101 |     res = 9E9 # residuals
102 |     w = np.ones(d.shape[0])
103 |     small_number=1E-15
104 |     max_abs_value=np.max(np.abs(d))
105 |     dcopy=d/max_abs_value #to avoid the dot product and L2 norm getting too large.
106 |     nstep=0
107 |     if N >=2:
108 |         if ref is None:
109 |             newstack = np.median(dcopy,axis=0)
110 |         else:
111 |             newstack = ref
112 |         if win is None:
113 |             win=[0,-1]
114 |         while res > epsilon and nstep <=maxstep:
115 |             stackt = newstack
116 |             for i in range(dcopy.shape[0]):
117 |                 dtemp=dcopy[i,win[0]:win[1]]
118 |                 crap = np.multiply(stackt[win[0]:win[1]],dtemp.T)
119 |                 crap_dot = np.sum(crap)
120 |                 di_norm = np.linalg.norm(dtemp)
121 |                 ri_norm = np.linalg.norm(dtemp - crap_dot*stackt[win[0]:win[1]])
122 |                 if ri_norm < small_number:
123 |                     w[i]=0
124 |                 else:
125 |                     w[i] = np.abs(crap_dot) /di_norm/ri_norm
126 |             w =w /np.sum(w)
127 |             newstack =np.sum( (w*dcopy.T).T,axis=0)#/len(cc_array[:,1])
128 |             res = np.linalg.norm(newstack-stackt,ord=1)/np.linalg.norm(newstack)/len(dcopy[:,1])
129 |             nstep +=1
130 |     else:
131 |         newstack=dcopy[0].copy()
132 |     newstack *= max_abs_value #scale the stack back.
133 |     if stat:
134 |         return newstack, w, nstep
135 |     else:
136 |         return newstack
137 | 
138 | def adaptive_filter(d,g=1):
139 |     '''
140 |     the adaptive covariance filter to enhance coherent signals. Follows the method of
141 |     Nakata et al., 2015 (Appendix B).
142 | 
143 |     the filtered signal [x1] is given by x1 = ifft(P*x1(w)), where x1(w) is the FFT'd spectrum
144 |     and P is the filter. P is constructed using the temporal covariance matrix.
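    In the implementation below, P(w) = ((S1-S2)/(S2*(N-1)))**g, where S1 sums
    all pairwise cross-spectra of the N traces (including the auto terms) and
    S2 sums only the auto-spectra, so energy that is incoherent across traces
    is downweighted before the linear stack.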
145 | 
146 |     PARAMETERS:
147 |     ----------------------
148 |     d: numpy.ndarray contains the 2D traces of daily/hourly cross-correlation functions
149 |     g: a positive number to adjust the filter harshness [default is 1]
150 |     RETURNS:
151 |     ----------------------
152 |     newstack: numpy vector contains the stacked cross correlation function
153 |     '''
154 |     if d.ndim == 1:
155 |         print('2D matrix is needed')
156 |         return d
157 |     N,M = d.shape
158 |     if N>=2:
159 |         Nfft = next_fast_len(M)
160 | 
161 |         # fft the 2D array
162 |         spec = fft(d,axis=1,n=Nfft)[:,:M]
163 | 
164 |         # make the cross-spectrum matrix
165 |         cspec = np.zeros(shape=(N*N,M),dtype=np.complex64)
166 |         for ii in range(N):
167 |             for jj in range(N):
168 |                 kk = ii*N+jj
169 |                 cspec[kk] = spec[ii]*np.conjugate(spec[jj])
170 | 
171 |         S1 = np.zeros(M,dtype=np.complex64)
172 |         S2 = np.zeros(M,dtype=np.complex64)
173 |         # construct the filter P
174 |         for ii in range(N):
175 |             mm = ii*N+ii
176 |             S2 += cspec[mm]
177 |             for jj in range(N):
178 |                 kk = ii*N+jj
179 |                 S1 += cspec[kk]
180 | 
181 |         p = np.power((S1-S2)/(S2*(N-1)),g)
182 | 
183 |         # make ifft
184 |         narr = np.real(ifft(np.multiply(p,spec),Nfft,axis=1)[:,:M])
185 |         newstack=np.mean(narr,axis=0)
186 |     else:
187 |         newstack=d[0].copy()
188 |     #
189 |     return newstack
190 | 
191 | def pws(d,p=2):
192 |     '''
193 |     Performs phase-weighted stack on an array of time series. Modified from the NoisePy function by Tim Clements.
194 |     Follows the method of Schimmel and Paulssen, 1997.
195 |     If s(t) is time series data (seismogram, or cross-correlation),
196 |     S(t) = s(t) + i*H(s(t)), where H(s(t)) is the Hilbert transform of s(t).
197 |     S(t) = s(t) + i*H(s(t)) = A(t)*exp(i*phi(t)), where
198 |     A(t) is the envelope of s(t) and phi(t) is the phase of s(t).
199 |     The phase-weighted stack, g(t), is then:
200 |     g(t) = 1/N sum_{j=1:N} s_j(t) * |1/N sum_{k=1:N} exp[i*phi_k(t)]|^v
201 |     where N is the number of traces used and v is the sharpness of the phase-weighted stack.
202 | 
203 |     PARAMETERS:
204 |     ---------------------
205 |     d: N length array of time series data (numpy.ndarray)
206 |     p: exponent for phase stack (int). default is 2
207 | 
208 |     RETURNS:
209 |     ---------------------
210 |     newstack: Phase weighted stack of time series data (numpy.ndarray)
211 |     '''
212 | 
213 |     if d.ndim == 1:
214 |         print('2D matrix is needed')
215 |         return d
216 |     N,M = d.shape
217 |     if N >=2:
218 |         analytic = hilbert(d,axis=1, N=next_fast_len(M))[:,:M]
219 |         phase = np.angle(analytic)
220 |         phase_stack = np.mean(np.exp(1j*phase),axis=0)
221 |         phase_stack = np.abs(phase_stack)**(p)
222 | 
223 |         weighted = np.multiply(d,phase_stack)
224 | 
225 |         newstack=np.mean(weighted,axis=0)
226 |     else:
227 |         newstack=d[0].copy()
228 |     return newstack
229 | 
230 | def nroot(d,p=2):
231 |     '''
232 |     this is the nth-root stacking algorithm translated from the matlab function
233 |     at https://github.com/xtyangpsp/SeisStack (by Xiaotao Yang; follows the
234 |     reference of Millet, F et al., 2019 JGR).
235 | 
236 |     Parameters:
237 |     ------------
238 |     d: numpy.ndarray contains the 2D cross correlation matrix
239 |     p: np.int, nth root for the stacking. Default is 2.
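        In the implementation below, each trace is compressed as
        sign(d)*|d|^(1/p) before averaging, and the mean is expanded back as
        dout*|dout|^(p-1); larger p suppresses incoherent noise more
        aggressively at the cost of waveform fidelity.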
240 | 
241 |     Returns:
242 |     ------------
243 |     newstack: np.ndarray, final stacked waveforms
244 | 
245 |     Written by Chengxin Jiang @ANU (May2020)
246 |     '''
247 |     if d.ndim == 1:
248 |         print('2D matrix is needed for nroot_stack')
249 |         return d
250 |     N,M = d.shape
251 |     if N >=2:
252 |         dout = np.zeros(M,dtype=np.float32)
253 | 
254 |         # construct y
255 |         for ii in range(N):
256 |             dat = d[ii,:]
257 |             dout += np.sign(dat)*np.abs(dat)**(1/p)
258 |         dout /= N
259 | 
260 |         # the final stacked waveform
261 |         newstack = dout*np.abs(dout)**(p-1)
262 |     else:
263 |         newstack=d[0].copy()
264 | 
265 |     return newstack
266 | 
267 | 
268 | def selective(d,cc_min,epsilon=1E-5,maxstep=10,win=None,stat=False,ref=None):
269 |     '''
270 |     this is a selective stacking algorithm developed by Jared Bryan/Kurama Okubo.
271 | 
272 |     PARAMETERS:
273 |     ----------------------
274 |     d: numpy.ndarray contains the 2D cross correlation matrix
276 |     cc_min: numpy.float, threshold of correlation coefficient for a trace to be selected
277 |     epsilon: residual threshold to quit the iteration (a small number). Default 1E-5
278 |     maxstep: maximum iterations. default 10.
279 |     win: [start_index,end_index] used to compute the weight, instead of the entire trace. Default None.
280 |         When None, use the entire trace.
281 |     ref: reference stack, with the same length as individual data. Default: None. Use mean().
282 |     RETURNS:
283 |     ----------------------
284 |     newstack: numpy vector contains the stacked cross correlation
285 |     nstep: np.int, total number of iterations for the stacking
286 | 
287 |     Originally written by Marine Denolle
288 |     Modified by Chengxin Jiang @Harvard (Oct2020)
289 |     '''
290 |     if d.ndim == 1:
291 |         print('2D matrix is needed for selective stacking')
292 |         return d
293 |     N,M = d.shape
294 |     if N>=2:
295 |         res = 9E9 # residuals
296 |         cof = np.zeros(N,dtype=np.float32)
297 |         if ref is None:
298 |             newstack = np.mean(d,axis=0)
299 |         else:
300 |             newstack = ref
301 | 
302 |         nstep = 0
303 |         if win is None:
304 |             win=[0,-1]
305 |         # start iteration
306 |         while res>epsilon and nstep<=maxstep:
307 |             for ii in range(N):
308 |                 cof[ii] = np.corrcoef(newstack[win[0]:win[1]], d[ii,win[0]:win[1]])[0, 1]
309 | 
310 |             # find good waveforms
311 |             indx = np.where(cof>=cc_min)[0]
312 |             nstep +=1
313 |             if not len(indx):
314 |                 newstack=np.ndarray((d.shape[1],))
315 |                 newstack.fill(np.nan)
316 |                 print('cannot find good waveforms inside selective stacking')
317 |                 break
318 |             else:
319 |                 oldstack = newstack
320 |                 newstack = np.mean(d[indx],axis=0)
321 |                 res = np.linalg.norm(newstack-oldstack)/(np.linalg.norm(newstack)*M)
322 |     else:
323 |         newstack=d[0].copy()
324 |     if stat:
325 |         return newstack, nstep
326 |     else:
327 |         return newstack
328 | #
329 | def clusterstack(d,h=0.75,win=None,axis=0,normalize=True,plot=False):
330 |     '''
331 |     Performs stack after clustering. The data will be clustered into two groups.
332 |     If the two centers of the clusters are similar (defined by corrcoef >= "h"), the original
333 |     traces associated with both clusters will be used to produce the final linear stack, weighted by
334 |     the normalized SNR (phase clarity) of each cluster. Otherwise, the one with larger phase clarity
335 |     (defined as max(abs(amplitudes))/rms(abs(amplitudes))) will be used to get the final stack.
336 | 
337 |     PARAMETERS:
338 |     ---------------------
339 |     d: N length array of time series data (numpy.ndarray)
340 |     h: corrcoeff threshold to decide which group/cluster to use. Default 0.75.
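        When the two cluster centers correlate at or above h, both clusters
        are merged with SNR weights; below h, only the cluster with the larger
        phase clarity is kept, so raising h triggers the stricter,
        single-cluster behavior more often.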
341 |     win: [start_index,end_index] used to compute the weight, instead of the entire trace. Default None.
342 |         When None, use the entire trace.
343 |     axis: which axis to stack. default 0.
344 |     normalize: Normalize the traces before clustering. This only influences the clustering.
345 |         The final stack will be produced using the original data.
346 |     plot: plot clustering results. default False.
347 | 
348 |     RETURNS:
349 |     ---------------------
350 |     newstack: final stack.
351 |     '''
352 |     ncluster=2 #DO NOT change this value.
353 |     min_trace=2 #minimum of two traces.
354 |     metric="euclidean" #metric to compute the distance in kmeans clustering.
355 |     if d.ndim == 1:
356 |         print('2D matrix is needed')
357 |         return d
358 |     N,M = d.shape
359 |     if N >= min_trace:
360 |         dataN=d.copy()
361 |         if normalize:
362 |             for i in range(N):
363 |                 dataN[i]=d[i]/np.max(np.abs(d[i]),axis=0)
364 | 
365 |         ts = to_time_series_dataset(dataN)
366 | 
367 |         km = TimeSeriesKMeans(n_clusters=ncluster, n_jobs=1,metric=metric, verbose=False,
368 |                               max_iter_barycenter=100, random_state=0)
369 |         y_pred = km.fit_predict(ts)
370 |         snr_all=[]
371 |         centers_all=[]
372 |         cidx=[]
373 |         if win is None:
374 |             win=[0,-1]
375 |         for yi in range(ncluster):
376 |             cidx.append(np.where((y_pred==yi))[0])
377 |             center=km.cluster_centers_[yi].ravel()#np.squeeze(np.mean(ts[y_pred == yi].T,axis=2))
378 |             centers_all.append(center)
379 |             snr=np.max(np.abs(center[win[0]:win[1]]))/rms(np.abs(center))
380 |             snr_all.append(snr)
381 | 
382 |         #
383 |         if plot:
384 |             plt.figure(figsize=(12,4))
385 |             for yi in range(ncluster):
386 |                 plt.subplot(1,ncluster,yi+1)
387 |                 plt.plot(np.squeeze(ts[cidx[yi]].T),'k-',alpha=0.3)
388 |                 plt.plot(centers_all[yi],'r-')
389 |                 plt.title('Cluster %d: %d'%(yi+1,len(cidx[yi])))
390 |             plt.show()
391 |         cc=np.corrcoef(centers_all[0],centers_all[1])[0,1]
392 |         if cc>= h: #use all data
393 |             snr_normalize=snr_all/np.sum(snr_all)
394 |             newstack=np.zeros((M))
395 |             for yi in range(ncluster):
396 |                 newstack += snr_normalize[yi]*np.mean(d[cidx[yi]],axis=0)
397 |         else:
398 |             goodidx=np.argmax(snr_all)
399 |             newstack=np.mean(d[cidx[goodidx]],axis=0)
400 |         del dataN,ts,y_pred
401 |     else:
402 |         newstack=d[0].copy()
403 |     #
404 |     return newstack
405 | 
406 | def tfpws(d,p=2,axis=0):
407 |     '''
408 |     Performs time-frequency domain phase-weighted stack on an array of time series.
409 | 
410 |     $C_{ps} = |(\sum{S/|S|})/M|^p$, where $C_{ps}$ is the phase weight and $S/|S|$ carries the instantaneous phase. Then
411 |     $S_{pws} = C_{ps}*S_{ls}$, where $S_{ls}$ is the S transform of the linear stack
412 |     of the whole data.
413 | 
414 |     Reference:
415 |     Schimmel, M., Stutzmann, E., & Gallart, J. (2011). Using instantaneous phase
416 |     coherence for signal extraction from ambient noise data at a local to a
417 |     global scale. Geophysical Journal International, 184(1), 494–506.
418 |     https://doi.org/10.1111/j.1365-246X.2010.04861.x
419 | 
420 |     PARAMETERS:
421 |     ---------------------
422 |     d: N length array of time series data (numpy.ndarray)
423 |     p: exponent for phase stack (int). default is 2
424 |     axis: axis to stack, default is 0.
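        Note: the S transform here comes from the `stockwell` package; each
        trace is zero-padded to a power of 2 (power2pad, defined below) before
        st.st, and the result is trimmed back to the original length after the
        inverse transform.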
425 | 
426 |     RETURNS:
427 |     ---------------------
428 |     newstack: Phase weighted stack of time series data (numpy.ndarray)
429 |     '''
430 |     if d.ndim == 1:
431 |         print('2D matrix is needed')
432 |         return d
433 |     N,M = d.shape
434 |     if N >=2:
435 |         lstack=np.mean(d,axis=axis)
436 |         #get the ST of the linear stack first
437 |         stock_ls=st.st(power2pad(lstack))
438 | 
439 |         #run a ST to get the dimension of the ST result
440 |         stock_temp=st.st(power2pad(d[0]))
441 |         phase_stack=np.zeros((stock_temp.shape[0],stock_temp.shape[1]),dtype='complex128')
442 |         for i in range(N):
443 |             if i>0: #zero index has been computed.
444 |                 stock_temp=st.st(power2pad(d[i]))
445 |             phase_stack += np.multiply(stock_temp,np.angle(stock_temp))/np.abs(stock_temp)
446 |         #
447 |         phase_stack = np.abs(phase_stack/N)**p
448 | 
449 |         pwstock=np.multiply(phase_stack,stock_ls)
450 |         recdostIn=np.real(st.ist(pwstock))
451 |         newstack=recdostIn[:M] # trim padding
452 |     else:
453 |         newstack=d[0].copy()
454 |     #
455 |     return newstack
456 | 
457 | def tfpws_dost(d,p=2,axis=0):
458 |     '''
459 |     Performs time-frequency domain phase-weighted stack on an array of time series using the DOST
460 |     (Discrete Orthonormal Stockwell Transform).
461 |     $C_{ps} = |(\sum{S/|S|})/M|^p$, where $C_{ps}$ is the phase weight. Then
462 |     $S_{pws} = C_{ps}*S_{ls}$, where $S_{ls}$ is the Discrete Orthonormal S transform
463 |     of the linear stack of the whole data.
464 | 
465 |     DOST stacking was implemented by Jared Bryan.
466 | 
467 |     Reference for tf-PWS:
468 |     Schimmel, M., Stutzmann, E., & Gallart, J. (2011). Using instantaneous phase
469 |     coherence for signal extraction from ambient noise data at a local to a
470 |     global scale. Geophysical Journal International, 184(1), 494–506.
471 |     https://doi.org/10.1111/j.1365-246X.2010.04861.x
472 | 
473 |     Reference for DOST:
474 |     U. Battisti, L. Riba, "Window-dependent bases for efficient representations of the
475 |     Stockwell transform", Applied and Computational Harmonic Analysis, 23 February 2015,
476 |     http://dx.doi.org/10.1016/j.acha.2015.02.002.
477 | 
478 |     PARAMETERS:
479 |     ---------------------
480 |     d: N length array of time series data (numpy.ndarray)
481 |     p: exponent for phase stack (int). default is 2
482 |     axis: axis to stack, default is 0.
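        Compared with the full S transform used in tfpws(), the DOST keeps only
        N orthonormal coefficients per trace instead of an N-by-N
        time-frequency matrix, so it is much lighter in memory and compute for
        long traces.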
483 | 484 | RETURNS: 485 | --------------------- 486 | newstack: Phase weighted stack of time series data (numpy.ndarray) 487 | ''' 488 | if d.ndim == 1: 489 | print('2D matrix is needed') 490 | return d 491 | N,M = d.shape 492 | if N >=2: 493 | lstack=np.mean(d,axis=axis) 494 | #get the dost of the linear stack first 495 | stock_ls_dost=DOST(lstack) # initialize dost object 496 | stock_ls=stock_ls_dost.dost(stock_ls_dost.data) # calculate the dost 497 | 498 | # calculate dost for first trace to know its shape 499 | stock_dost=DOST(d[0]) 500 | stock_temp=stock_dost.dost(stock_dost.data) 501 | # initialize stack 502 | phase_stack=np.zeros(len(stock_temp),dtype='complex128') 503 | # calculate the dost for each trace to be stacked 504 | for i in range(d.shape[0]): 505 | if i>0: # zero index has been computed 506 | stock_dost=DOST(d[i]) 507 | stock_temp=stock_dost.dost(stock_dost.data) 508 | phase_stack+=np.multiply(stock_temp,np.angle(stock_temp))/np.abs(stock_temp) 509 | 510 | phase_stack = np.abs(phase_stack/N)**p 511 | 512 | pwstock=np.multiply(phase_stack,stock_ls) 513 | recdostIn = np.real(stock_dost.idost(pwstock)) 514 | newstack = recdostIn[:M] # trim padding 515 | else: 516 | newstack=d[0].copy() 517 | # 518 | return newstack 519 | 520 | ################################# 521 | ####### stacking needed utilities. 522 | ################################ 523 | # 524 | def power2pad(data): 525 | """Zero pad data such that its length is a power of 2""" 526 | N=int(2**np.ceil(np.log2(len(data)))) 527 | pad_end=np.zeros(int(N-len(data))) 528 | 529 | return np.concatenate((data,pad_end)) 530 | class DOST: 531 | def __init__(self, data): 532 | # make sure data length is a power of 2 533 | if np.ceil(np.log2(len(data)))==np.floor(np.log2(len(data))): 534 | # length of data already a power of 2 535 | self.data=data 536 | else: 537 | # pad data to nearest power of 2 538 | self.data=self.pad(data) 539 | 540 | def pad(self,data): 541 | """Zero pad data such that its length is a power of 2""" 542 | N=int(2**np.ceil(np.log2(len(data)))) 543 | pad_end=np.zeros(int(N-len(data))) 544 | data=np.concatenate((data,pad_end)) 545 | 546 | return data 547 | 548 | def fourier(self, d): 549 | """Normalize and center fft""" 550 | fftIn=(1/np.sqrt(len(d))) * np.fft.fftshift(np.fft.fft(np.fft.ifftshift(d))) 551 | return fftIn 552 | 553 | def ifourier(self,d): 554 | """Normalize and center ifft""" 555 | ifftIn=np.sqrt(len(d)) * np.fft.fftshift(np.fft.ifft(np.fft.ifftshift(d))) 556 | return ifftIn 557 | 558 | def dostbw(self,D): 559 | """Calculate size of the DOST bandwidths""" 560 | arr=[0] 561 | arr.extend(np.arange(np.log2(D)-2, -1e-9, -1)) 562 | arr.extend([0]) 563 | arr.extend(np.arange(0, np.log2(D)-2+1e-9)) 564 | arr=2**np.array(arr) 565 | return arr 566 | 567 | def dost(self,d): 568 | """Discrete Orthonormal Stockwell Transform""" 569 | d_dost=self.fourier(d) 570 | D=len(d) 571 | bw=self.dostbw(D) 572 | k=0 573 | for i in bw: 574 | i=int(i) 575 | if i==1: 576 | k=k+i 577 | else: 578 | d_dost[k:k+i] = self.ifourier(d_dost[k:k+i]) 579 | k=k+i 580 | return d_dost 581 | 582 | def idost(self,d): 583 | """Inverse Discrete Orthonormal Stockwell Transform""" 584 | d_idost=d 585 | D=len(d) 586 | bw=self.dostbw(D) 587 | k=0 588 | for i in bw: 589 | i=int(i) 590 | if i==1: 591 | k=k+i 592 | else: 593 | d_idost[k:k+i] = self.fourier(d_idost[k:k+i]) 594 | k=k+i 595 | d_idost = self.ifourier(d_idost) 596 | return d_idost 597 | -------------------------------------------------------------------------------- /setup.cfg: 
-------------------------------------------------------------------------------- 1 | # Inside of setup.cfg 2 | [metadata] 3 | name = seisgo 4 | description = A ready-to-go Python toolbox for seismic data analysis 5 | long_description = file: README.md 6 | keywords = seismology, seismic data analysis, seismic toolbox 7 | license = MIT license 8 | classifiers= 9 | Development Status :: 4 - Beta 10 | License :: OSI Approved :: MIT License 11 | Programming Language :: Python :: 3.7 12 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | # from numpy.distutils.core import setup 2 | from setuptools import setup, find_packages 3 | import pathlib 4 | 5 | here = pathlib.Path(__file__).parent.resolve() 6 | 7 | # Get the long description from the README file 8 | long_description = (here / 'description.md').read_text(encoding='utf-8') 9 | 10 | # Arguments marked as "Required" below must be included for upload to PyPI. 11 | # Fields marked as "Optional" may be commented out. 12 | version='0.9.1' 13 | setup( 14 | name='seisgo', 15 | version=version, 16 | description='A ready-to-go Python toolbox for seismic data analysis', 17 | author='Xiaotao Yang', 18 | author_email='stcyang@gmail.com', 19 | maintainer='Xiaotao Yang', 20 | maintainer_email='stcyang@gmail.com', 21 | download_url='https://github.com/xtyangpsp/SeisGo/archive/refs/tags/v'+version+'.tar.gz', 22 | 23 | # This is an optional longer description of your project that represents 24 | # the body of text which users will see when they visit PyPI. 25 | # 26 | # Often, this is the same as your README, so you can just read it in from 27 | # that file directly (as we have already done above) 28 | # 29 | # This field corresponds to the "Description" metadata field: 30 | # https://packaging.python.org/specifications/core-metadata/#description-optional 31 | long_description=long_description, # Optional 32 | 33 | # Denotes that our long_description is in Markdown; valid values are 34 | # text/plain, text/x-rst, and text/markdown 35 | # 36 | # Optional if long_description is written in reStructuredText (rst) but 37 | # required for plain-text or Markdown; if unspecified, "applications should 38 | # attempt to render [the long_description] as text/x-rst; charset=UTF-8 and 39 | # fall back to text/plain if it is not valid rst" (see link below) 40 | # 41 | # This field corresponds to the "Description-Content-Type" metadata field: 42 | # https://packaging.python.org/specifications/core-metadata/#description-content-type-optional 43 | long_description_content_type='text/markdown', # Optional (see note above) 44 | 45 | # This should be a valid link to your project's main homepage. 46 | url='https://github.com/xtyangpsp/SeisGo', # Optional 47 | 48 | # Classifiers help users find your project by categorizing it. 49 | # 50 | # For a list of valid classifiers, see https://pypi.org/classifiers/ 51 | classifiers=[ # Optional 52 | # How mature is this project? Common values are 53 | # 3 - Alpha 54 | # 4 - Beta 55 | # 5 - Production/Stable 56 | 'Development Status :: 4 - Beta', 57 | 58 | # Pick your license as you wish 59 | 'License :: OSI Approved :: MIT License', 60 | 61 | # Specify the Python versions you support here. In particular, ensure 62 | # that you indicate you support Python 3. These classifiers are *not* 63 | # checked by 'pip install'. See instead 'python_requires' below. 
64 | 'Programming Language :: Python :: 3.7' 65 | ], 66 | 67 | # This field adds keywords for your project which will appear on the 68 | # project page. What does your project relate to? 69 | # 70 | # Note that this is a list of additional keywords, separated 71 | # by commas, to be used to assist searching for the distribution in a 72 | # larger catalog. 73 | keywords='seismology, seismic data analysis, seismic toolbox', # Optional 74 | 75 | # When your source code is in a subdirectory under the project root, e.g. 76 | # `src/`, it is necessary to specify the `package_dir` argument. 77 | #package_dir={'': 'src'}, # Optional 78 | 79 | # You can just specify package directories manually here if your project is 80 | # simple. Or you can use find_packages(). 81 | # 82 | # Alternatively, if you just want to distribute a single Python file, use 83 | # the `py_modules` argument instead as follows, which will expect a file 84 | # called `my_module.py` to exist: 85 | # 86 | # py_modules=["my_module"], 87 | # 88 | #packages=find_packages(where='src'), # Required 89 | 90 | packages=['seisgo'], 91 | include_package_data = True, 92 | package_data={"":["data","figs","notebooks"]}, 93 | # Specify which Python versions you support. In contrast to the 94 | # 'Programming Language' classifiers above, 'pip install' will check this 95 | # and refuse to install the project if the version does not match. See 96 | # https://packaging.python.org/guides/distributing-packages-using-setuptools/#python-requires 97 | python_requires='>=3.6', 98 | 99 | # This field lists other packages that your project depends on to run. 100 | # Any package you put here will be installed by pip when your project is 101 | # installed, so they must be valid existing projects. 102 | # 103 | # For an analysis of "install_requires" vs pip's requirements files see: 104 | # https://packaging.python.org/en/latest/requirements.html 105 | install_requires=['numpy', 106 | 'scipy', 107 | 'pandas', 108 | 'obspy', 109 | 'pyasdf', 110 | 'numba', 111 | 'pycwt', 112 | 'shapely', 113 | 'netCDF4', 114 | 'tslearn', 115 | 'plotly', 116 | 'kaleido', 117 | 'minisom', 118 | 'stockwell', 119 | 'kneed', 120 | 'utm' 121 | ] 122 | ) 123 | --------------------------------------------------------------------------------