├── .gitignore
├── Anaconda Prompt.lnk
├── LICENSE
├── README.md
├── clean
    ├── __init__.py
    └── clean.py
├── data
    └── .gitkeep
├── notebooks
    ├── .gitkeep
    └── 1_Demo_of_the_CLEAN_algorithm.ipynb
└── setup.py


/.gitignore:
--------------------------------------------------------------------------------
 1 | # Byte-compiled / optimized / DLL files
 2 | __pycache__/
 3 | *.py[cod]
 4 | 
 5 | # C extensions
 6 | *.so
 7 | 
 8 | # Distribution / packaging
 9 | .Python
10 | env/
11 | build/
12 | develop-eggs/
13 | dist/
14 | downloads/
15 | eggs/
16 | .eggs/
17 | lib/
18 | lib64/
19 | parts/
20 | sdist/
21 | var/
22 | *.egg-info/
23 | .installed.cfg
24 | *.egg
25 | 
26 | # PyInstaller
27 | #  Usually these files are written by a python script from a template
28 | #  before PyInstaller builds the exe, so as to inject date/other infos into it.
29 | *.manifest
30 | *.spec
31 | 
32 | # Installer logs
33 | pip-log.txt
34 | pip-delete-this-directory.txt
35 | 
36 | # Unit test / coverage reports
37 | htmlcov/
38 | .tox/
39 | .coverage
40 | .coverage.*
41 | .cache
42 | nosetests.xml
43 | coverage.xml
44 | 
45 | # Translations
46 | *.mo
47 | *.pot
48 | 
49 | # Django stuff:
50 | *.log
51 | 
52 | # Sphinx documentation
53 | docs/_build/
54 | 
55 | # PyBuilder
56 | target/
57 | 
58 | # DotEnv configuration
59 | .env
60 | 
61 | # Database
62 | *.db
63 | *.rdb
64 | 
65 | # 
66 | harm
67 | .idea
68 | 
69 | # VS Code
70 | .vscode/
71 | 
72 | # Spyder
73 | .spyproject/
74 | 
75 | # Jupyter NB Checkpoints
76 | .ipynb_checkpoints/
77 | 
78 | # exclude data from source control by default
79 | *.pyc
80 | *.swp
81 | *.py.swp
82 | /data/
83 | 


--------------------------------------------------------------------------------
/Anaconda Prompt.lnk:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/michalkalkowski/clean_algorithm/5f13eb2edd83beb6b25dbfb62c2d94d067f5bea3/Anaconda Prompt.lnk


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | 
 2 | The MIT License (MIT)
 3 | Copyright (c) 2018, Michal K. Kalkowski
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
 6 | 
 7 | The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
 8 | 
 9 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
10 | 
11 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | CLEAN
 2 | ==============================
 3 | 
 4 | An implementation and demonstration of the CLEAN algorithm for estimating times of arrival in multi-component signals.
 5 | 
 6 | This is an implementation of the CLEAN algorithm which identifies individual waveforms in a multi-component signal by simple spectral summations.
 7 | 
 8 | The details of the algorithm can be found in (and are not recalled here):
 9 | 
10 | 1. Gough, P.T., 1994. **A fast spectral estimation algorithm based on the FFT**. *IEEE Transactions on Signal Processing* 42, 1317–1322. https://doi.org/10.1109/78.286949
11 | 2. Holmes, C., Drinkwater, B.W., Wilcox, P.D., **Post-processing of the full matrix of ultrasonic transmit–receive array data for non-destructive evaluation**, 2005. *NDT & E International* 38, 701–711. https://doi.org/10.1016/j.ndteint.2005.04.002
12 | 3. Hunter, A.J., Drinkwater, B.W., Zhang, J., Wilcox, P.D., 2011. **A STUDY INTO THE EFFECTS OF AN AUSTENITIC WELD ON ULTRASONIC ARRAY IMAGING PERFORMANCE**. *Rev. QNDE* pp. 1063–1070. https://doi.org/10.1063/1.3592054
13 | 
14 | Three functions are provided. `extract_CLEAN_single` performs the iterative search and
15 | identifies individual components (wave packets) present in a single measured signal;
16 | `extract_CLEAN_matrix` does the same for a matrix of signals. `plot_components` can be used to visualise the extracted components.
17 | 
18 | Project Organization
19 | ------------
20 | 
21 |     ├── README.md          <- The top-level README for developers using this project.
22 |     ├── data               <- Original, raw, external data
23 |     │
24 |     ├── clean              <- Python module with the source
25 |     │                         code
26 |     │
27 |     ├── notebooks          <- Jupyter notebooks for exploration,
28 |     │                         demonstration, visualisation
29 |     │
30 |     └── setup.py           <- Allows for installing the module
31 | --------
32 | 
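
Usage sketch
------------

The README describes the algorithm only by reference, so a small synthetic example may help. The sketch below is not part of the repository. It builds a measured signal from two scaled, delayed copies of a transmitted toneburst and performs one envelope-peak search of the kind `extract_CLEAN_single` repeats on each residue. The helpers `analytic_signal` and `toneburst`, the sampling interval and the burst parameters are all assumptions made for illustration; the package itself uses `scipy.signal.hilbert` for the analytic signal.

```python
import numpy as np


def analytic_signal(x):
    """Analytic signal of a real 1-D array via the FFT (hypothetical helper)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep DC as is
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep the Nyquist bin as is
        weights[1:n // 2] = 2.0           # double the positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)


def toneburst(time, centre_time, frequency, n_cycles):
    """Hann-windowed toneburst centred at `centre_time` (hypothetical test signal)."""
    duration = n_cycles / frequency
    tau = time - centre_time
    window = np.where(np.abs(tau) <= duration / 2,
                      0.5 * (1 + np.cos(2 * np.pi * tau / duration)), 0.0)
    return window * np.cos(2 * np.pi * frequency * tau)


delta_t = 1e-8                            # assumed sampling interval (100 MHz)
time = np.arange(2048) * delta_t
transmitted = toneburst(time, 2e-6, 2e6, 5)
# Measured signal: two scaled and delayed copies of the transmitted burst
signal = (1.0 * toneburst(time, 6e-6, 2e6, 5)
          + 0.6 * toneburst(time, 11e-6, 2e6, 5))

# One search step: locate the dominant envelope peak and refer it
# to the envelope peak of the transmitted signal
reference_peak = np.argmax(np.abs(analytic_signal(transmitted)))
envelope = np.abs(analytic_signal(signal))
dominant_peak = np.argmax(envelope)
delay = (dominant_peak - reference_peak) * delta_t
amplitude = envelope[dominant_peak]
print(delay, amplitude)
```

With the package installed, the same `signal`, `transmitted` and `delta_t` could be passed to `extract_CLEAN_single(measured_signal=signal, original_signal=transmitted, delta_t=delta_t, threshold=0.4)` and the returned components handed to `plot_components(time, signal, components)`.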


--------------------------------------------------------------------------------
/clean/__init__.py:
--------------------------------------------------------------------------------
1 | __all__ = ['clean']
2 | 


--------------------------------------------------------------------------------
/clean/clean.py:
--------------------------------------------------------------------------------
  1 | #!/usr/bin/env python
  2 | # -*- coding: utf-8 -*-
  3 | """
  4 | ==============================================================================
  5 | Copyright (C) 2018 Michal Kalkowski (MIT License)
  6 | kalkowski.m@gmail.com
  7 | 
  8 | This is an implementation of the CLEAN algorithm which identifies individual
  9 | waveforms in a multi-component signal by simple spectral summations.
 10 | 
 11 | The details of the algorithm can be found in (and are not recalled here):
 12 | [1] Gough, P.T., 1994. A fast spectral estimation algorithm based on the FFT.
 13 |     IEEE Transactions on Signal Processing 42, 1317–1322.
 14 |     https://doi.org/10.1109/78.286949
 15 | [2] Holmes, C., Drinkwater, B.W., Wilcox, P.D., Post-processing of the full
 16 |     matrix of ultrasonic transmit–receive array data for non-destructive
 17 |     evaluation, 2005. NDT & E International 38, 701–711.
 18 |     https://doi.org/10.1016/j.ndteint.2005.04.002
 19 | [3] Hunter, A.J., Drinkwater, B.W., Zhang, J., Wilcox, P.D., 2011.
 20 |     A STUDY INTO THE EFFECTS OF AN AUSTENITIC WELD ON ULTRASONIC ARRAY
 21 |     IMAGING PERFORMANCE. pp. 1063–1070. https://doi.org/10.1063/1.3592054
 22 | 
 23 | Three functions are provided. `extract_CLEAN_single` and `extract_CLEAN_matrix`
 24 | perform the iterative search and identify individual components (wave packets)
 25 | in one signal or a matrix of signals; `plot_components` visualises the outcome.
 26 | ==============================================================================
 27 | """
 28 | import numpy as np
 29 | import matplotlib.pyplot as plt
 30 | from scipy.signal import hilbert
 31 | from tqdm import trange
 32 | 
 33 | 
 34 | def extract_CLEAN_single(measured_signal, original_signal, delta_t, threshold=0.4,
 35 |         max_iter=10):
 36 |     """
 37 |     Applies the CLEAN algorithm to extract individual components from a
 38 |     multi-component signal. The algorithm is based on the assumption that the
 39 |     measured signal is a sum of scaled, delayed and phase-shifted copies of
 40 |     the original signal.
 41 | 
 42 |     Individual components are extracted by iterating over dominant components
 43 |     of the spectrum. The algorithm starts by taking the spectrum of the
 44 |     input signal as a starting residue. It determines the amplitude, phase
 45 |     and time delay of the dominant component and subtracts the reconstructed
 46 |     spectrum of this component from the residue. A new residue is formed,
 47 |     and the search for the dominant component restarts.
 48 | 
 49 |     The loop is terminated once the amplitude of an extracted component drops
 50 |     below the given threshold, or once `max_iter` iterations are exceeded.
 51 | 
 52 |     It is assumed that both the original and the measured signals are sampled
 53 |     at the same rate.
 54 | 
 55 |     Parameters:
 56 |     ---
 57 |     measured_signal: ndarray, measured signal to decompose of size (N,) where N
 58 |                     is the number of samples (single sensor signal)
 59 |     original_signal: ndarray, original (transmitted) signal,
 60 |                               e.g. the excitation applied to the structure
 61 |                               under test
 62 |     delta_t: float, time increment
 63 |     threshold: float, amplitude threshold as a fraction of the maximum
 64 |                       amplitude; defaults to 0.4
 65 |     max_iter: int, maximum number of iterations; defaults to 10
 66 | 
 67 |     Returns:
 68 |     ---
 69 |     amplitudes: ndarray, amplitudes of the individual components
 70 |     delays: ndarray, time delays of the individual components
 71 |     phases: ndarray, phase shifts of the individual components
 72 |     components: ndarray, reconstructed time traces related to each individual
 73 |                         component
 74 |     """
 75 |     original_spectrum = np.fft.fft(original_signal)
 76 |     measured_spectrum = np.fft.fft(measured_signal)
 77 |     omega = np.fft.fftfreq(len(original_signal), delta_t)
 78 |     original_hilbert = hilbert(original_signal)
 79 |     original_argpeak = np.argmax(abs(original_hilbert))
 80 | 
 81 |     # Initialise lists
 82 |     amplitudes = []
 83 |     delays = []
 84 |     phases = []
 85 |     components = []
 86 | 
 87 |     # First residue is the spectrum of the measured signal
 88 |     residue = np.copy(measured_spectrum)
 89 | 
 90 |     # Dummy values initialising the loop
 91 |     amplitude = 1
 92 |     amp_max = 2
 93 |     iters = 0
 94 |     while amplitude > threshold*amp_max:
 95 |         if iters > max_iter:
 96 |             break
 97 |         r_t = np.fft.ifft(residue)
 98 |         # Remove numerical noise
 99 |         r_t = r_t.real
100 |         # Find the dominant wave packet
101 |         r_hilbert = hilbert(r_t)
102 |         r_argpeak = np.argmax(abs(r_hilbert))
103 |         # Extract its amplitude
104 |         amplitude = abs(r_hilbert[r_argpeak])
105 |         # Extract time delay with reference to the envelope of the original
106 |         # signal
107 |         delay = (r_argpeak - original_argpeak)*delta_t
108 |         # Extract phase shift with reference to the original signal
109 |         phase = (np.angle(r_hilbert[r_argpeak]) -
110 |                  np.angle(original_hilbert)[original_argpeak] + 2*np.pi)
111 | 
112 |         amplitudes.append(amplitude)
113 |         delays.append(delay)
114 |         phases.append(phase)
115 |         # Apply the delay and shift only to the positive half of the spectrum
116 |         if len(original_spectrum) % 2 == 0:
117 |             nyquist_index = int(len(original_spectrum)/2) + 1
118 |         else:
119 |             nyquist_index = int(np.ceil(len(original_spectrum)/2))
120 |         half_spectrum = original_spectrum[:nyquist_index]
121 |         component_spectrum = (amplitude*half_spectrum
122 |                               * np.exp(1j*(-2*np.pi*omega[:nyquist_index]*delay
123 |                                            + phase)))
124 |         # Reconstruct the double-sided spectrum
125 |         if len(original_spectrum) % 2 == 0:
126 |             reconstructed = np.concatenate((
127 |                 component_spectrum, component_spectrum.conj()[1:-1][::-1]))
128 |         else:
129 |             reconstructed = np.concatenate((
130 |                 component_spectrum, component_spectrum.conj()[1:][::-1]))
131 |         residue = residue - reconstructed
132 |         components.append(np.fft.ifft(reconstructed))
133 |         # Assign maximum amplitude if this is the first iteration
134 |         if len(amplitudes) == 1:
135 |             amp_max = amplitude
136 |         iters += 1
137 |     return (np.array(amplitudes[:-1]), np.array(delays[:-1]),
138 |             np.array(phases[:-1]), np.array(components[:-1]))
139 | 
140 | def extract_CLEAN_matrix(measured_signal, original_signal, delta_t,
141 |                          max_iter=10):
142 |     """
143 |     Applies the CLEAN algorithm to extract individual components from a matrix
144 |     of multi-component signals. The algorithm is based on the assumption that the
145 |     measured signal is a sum of scaled, delayed and phase-shifted copies of
146 |     the original signal.
147 | 
148 |     Individual components are extracted by iterating over dominant components
149 |     of the spectrum. The algorithm starts by taking the spectrum of the
150 |     input signal as a starting residue. It determines the amplitude, phase
151 |     and time delay of the dominant component and subtracts the reconstructed
152 |     spectrum of this component from the residue. A new residue is formed,
153 |     and the search for the dominant component restarts.
154 | 
155 |     The loop is terminated after a chosen number of components is extracted
156 |     from all signals.
157 | 
158 |     It is assumed that both the original and the measured signals are sampled
159 |     at the same rate.
160 | 
161 |     Parameters:
162 |     ---
163 |     measured_signal: ndarray, measured signal to decompose of size (N,...) where N
164 |                     is the number of samples, and two subsequent dimensions
165 |                     correspond to the size of the sensor signal matrix
166 |     original_signal: ndarray, original (transmitted) signal,
167 |                               e.g. the excitation applied to the structure
168 |                               under test
169 |     delta_t: float, time increment
170 |     max_iter: int, number of iterations (components extracted per signal); defaults to 10
171 | 
172 |     Returns:
173 |     ---
174 |     amplitudes: ndarray, amplitudes of the individual components
175 |     delays: ndarray, time delays of the individual components
176 |     phases: ndarray, phase shifts of the individual components
177 |     components: ndarray, reconstructed time traces related to each individual
178 |                         component
179 |     """
180 |     original_spectrum = np.fft.fft(original_signal, axis=0)
181 |     measured_spectrum = np.fft.fft(measured_signal, axis=0)
182 |     omega = np.fft.fftfreq(len(original_signal), delta_t)
183 |     original_hilbert = hilbert(original_signal, axis=0)
184 |     original_argpeak = np.argmax(abs(original_hilbert))
185 | 
186 |     # Initialise output arrays
187 |     amplitudes = np.zeros([max_iter, measured_signal.shape[1],
188 |                            measured_signal.shape[2]], 'float')
189 |     delays = np.zeros([max_iter, measured_signal.shape[1],
190 |                        measured_signal.shape[2]], 'float')
191 |     phases = np.zeros([max_iter, measured_signal.shape[1],
192 |                        measured_signal.shape[2]], 'float')
193 |     components = np.zeros([max_iter] + list(measured_signal.shape))
194 |     # First residue is the spectrum of the measured signal
195 |     residue = np.copy(measured_spectrum)
196 |     for i in trange(max_iter):
197 |         r_t = np.fft.ifft(residue, axis=0)
198 |         # Remove numerical noise
199 |         r_t = r_t.real
200 |         # Find the dominant wave packet
201 |         r_hilbert = hilbert(r_t, axis=0)
202 |         r_argpeak = np.argmax(abs(r_hilbert), axis=0)
203 |         # Extract its amplitude
204 |         # Fancy indexing
205 |         ind_2, ind_3 = np.ogrid[0:r_argpeak.shape[0], 0:r_argpeak.shape[1]]
206 |         amplitude = abs(r_hilbert[r_argpeak, ind_2, ind_3])
207 |         # Extract time delay with reference to the envelope of the original
208 |         # signal
209 |         delay = (r_argpeak - original_argpeak)*delta_t
210 |         # Extract phase shift with reference to the original signal
211 |         phase = (np.angle(r_hilbert[r_argpeak, ind_2, ind_3]) -
212 |                  np.angle(original_hilbert)[original_argpeak] + 2*np.pi)
213 | 
214 |         amplitudes[i] = amplitude
215 |         delays[i] = delay
216 |         phases[i] = phase
217 |         # Apply the delay and shift only to the positive half of the spectrum
218 |         if len(original_spectrum) % 2 == 0:
219 |             nyquist_index = int(len(original_spectrum)/2) + 1
220 |         else:
221 |             nyquist_index = int(np.ceil(len(original_spectrum)/2))
222 |         half_spectrum = original_spectrum[:nyquist_index]
223 |         component_spectrum = (amplitude[np.newaxis, :, :]
224 |                               *half_spectrum[:, np.newaxis, np.newaxis]
225 |                               * np.exp(1j*(-2*np.pi*omega[:nyquist_index,
226 |                                                           np.newaxis,
227 |                                                           np.newaxis]
228 |                                            *delay[np.newaxis, :, :]
229 |                                            + phase[np.newaxis, :, :])))
230 |         # Reconstruct the double-sided spectrum
231 |         if len(original_spectrum) % 2 == 0:
232 |             reconstructed = np.concatenate((
233 |                 component_spectrum, component_spectrum.conj()[1:-1][::-1]))
234 |         else:
235 |             reconstructed = np.concatenate((
236 |                 component_spectrum, component_spectrum.conj()[1:][::-1]))
237 |         residue = residue - reconstructed
238 |         components[i] = np.fft.ifft(reconstructed, axis=0).real
239 |     return amplitudes, delays, phases, components
240 | 
241 | def plot_components(time, measured_signal, components):
242 |     """
243 |     Plots the measured signal and individual components extracted using
244 |     the CLEAN algorithm.
245 | 
246 |     ###
247 |     Example
248 |     ###
249 |     >>> amps, dls, phss, comps = extract_CLEAN_single(
250 |     ...     measured_signal=signal, original_signal=transmitted,
251 |     ...     delta_t=delta_t, threshold=0.4)
252 |     >>> plot_components(time, signal, comps)
253 | 
254 |     Parameters:
255 |     ---
256 |     time: ndarray, time vector
257 |     measured_signal: ndarray, multi-component measured signal
258 |     components: ndarray, components returned by extract_CLEAN_single
259 |     """
260 |     plt.figure()
261 |     plt.plot(time*1e6, measured_signal, c='lightgray', lw=4, label='measured')
262 |     for i, comp in enumerate(components):
263 |         plt.plot(time*1e6, comp.real, label='component {}'.format(i))
264 |     plt.xlabel('time in us')
265 |     plt.ylabel('normalised amplitude')
266 |     plt.legend()
267 | 
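
Note on the spectrum reconstruction: both extraction functions apply the delay and phase to the non-negative half of the spectrum and then rebuild the double-sided spectrum by concatenating the conjugated, reversed half. The sketch below is an illustration, not code from this module. It suggests that the real part of that manual reconstruction agrees with `np.fft.irfft` for even and odd lengths, and that a whole-sample delay applied this way acts as a circular shift. The function name `delayed_copy` and the random test signal are hypothetical. The sketch uses `np.fft.rfftfreq`, whose Nyquist entry is positive, whereas `np.fft.fftfreq` (used in the module) returns a negative Nyquist frequency for even lengths; the two give the same result here because the delay is a whole number of samples.

```python
import numpy as np


def delayed_copy(x, delay_samples, delta_t=1.0):
    """Delay a real signal by a whole number of samples in the frequency domain.

    Returns the result computed two ways: by mirroring the half spectrum by
    hand (the slicing used in clean.py) and by letting np.fft.irfft do it.
    """
    n_samples = len(x)
    n_half = n_samples // 2 + 1          # number of non-negative frequency bins
    freqs = np.fft.rfftfreq(n_samples, delta_t)
    half = (np.fft.fft(x)[:n_half]
            * np.exp(-2j * np.pi * freqs * delay_samples * delta_t))

    # Manual reconstruction of the double-sided (Hermitian) spectrum
    if n_samples % 2 == 0:
        full = np.concatenate((half, half.conj()[1:-1][::-1]))
    else:
        full = np.concatenate((half, half.conj()[1:][::-1]))
    manual = np.fft.ifft(full).real

    # Shortcut: irfft rebuilds the negative frequencies internally
    shortcut = np.fft.irfft(half, n=n_samples)
    return manual, shortcut


rng = np.random.default_rng(0)
for n_samples in (64, 65):               # even and odd lengths
    x = rng.standard_normal(n_samples)
    manual, shortcut = delayed_copy(x, delay_samples=7)
    # Print the largest deviation between the two routes and from np.roll
    print(n_samples,
          np.max(np.abs(manual - shortcut)),
          np.max(np.abs(manual - np.roll(x, 7))))
```

If the equivalence holds for the signals of interest, the two `if`/`else` reconstruction branches could in principle be replaced by a single `irfft` call, at the cost of departing from the module's current structure.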


--------------------------------------------------------------------------------
/data/.gitkeep:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/michalkalkowski/clean_algorithm/5f13eb2edd83beb6b25dbfb62c2d94d067f5bea3/data/.gitkeep


--------------------------------------------------------------------------------
/notebooks/.gitkeep:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/michalkalkowski/clean_algorithm/5f13eb2edd83beb6b25dbfb62c2d94d067f5bea3/notebooks/.gitkeep


--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
 1 | import os
 2 | from setuptools import setup, find_packages
 3 | 
 4 | def read(fname):
 5 |     return open(os.path.join(os.path.dirname(__file__), fname)).read()
 6 | 
 7 | setup(
 8 |     name="clean",
 9 |     version="0.0.1",
10 |     description=("An implementation of the CLEAN algorithm for extracting "
11 |                  "individual components from a multi-component signal"),
12 |     author="Michal K Kalkowski",
13 |     author_email="m.kalkowski@imperial.ac.uk",
14 |     packages=find_packages(exclude=['data', 'references', 'output', 'notebooks']),
15 |     long_description=read('README.md'),
16 |     license='MIT',
17 | )
18 | 


--------------------------------------------------------------------------------