├── Paper
    ├── paper.md
    └── readme.md
├── README.md
└── functions
    ├── Energy_extraction.py
    ├── Entropy_extraction.py
    ├── SpectralEdgeFrequency_Extraction.py
    └── readme.md


/Paper/paper.md:
--------------------------------------------------------------------------------
 1 | ### Emotion Recognition
 2 | 1. Zheng W L, Lu B L. Investigating critical frequency bands and channels for EEG-based emotion recognition with deep neural networks[J]. IEEE Transactions on Autonomous Mental Development, 2015, 7(3): 162-175.  
 3 | 2. Zong C, Chetouani M. Hilbert-Huang transform based physiological signals analysis for emotion recognition[C]//Signal processing and information technology (isspit), 2009 ieee international symposium on. IEEE, 2009: 334-339  
 4 | 
 5 | ### Sleep Stage Classification
 6 | 1. Ronzhina M, Janoušek O, Kolářová J, et al. Sleep scoring using artificial neural networks[J]. Sleep medicine reviews, 2012, 16(3): 251-263.
 7 | 2. Chapotot F, Becq G. Automated sleep–wake staging combining robust feature extraction, artificial neural network classification, and flexible decision rules[J]. International Journal of Adaptive Control and Signal Processing, 2010, 24(5): 409-423.  
 8 | 3. Ebrahimi F, Mikaeili M, Estrada E, et al. Automatic sleep stage classification based on EEG signals by using neural networks and wavelet packet coefficients[C]//Engineering in Medicine and Biology Society, 2008. EMBS 2008. 30th Annual International Conference of the IEEE. IEEE, 2008: 1151-1154.  
 9 | 4. Losonczi L, Bako L, Brassai S T, et al. Hilbert-Huang transform used for EEG signal analysis[C]//The International Conference Interdisciplinarity in Engineering INTER-ENG. Editura Universitatii” Petru Maior” din Tirgu Mures, 2012: 361.  
10 | 5. Subasi A, Gursoy M I. EEG signal classification using PCA, ICA, LDA and support vector machines[J]. Expert Systems with Applications, 2010, 37(12): 8659-8666.
11 | MLA 
12 | 


--------------------------------------------------------------------------------
/Paper/readme.md:
--------------------------------------------------------------------------------
 1 | **This is a collection of useful papers in EEG signal analysis.**  
 2 | 
 3 | The papers may not be the most cited ones or those published by the top journal/conference, but I did extracted some useful ideas from them.  
 4 | 
 5 | I wrote down some comments on each paper I have read on my personal website [here](https://billbeatthepeat.github.io/2017/08/24/My-Currently-Reading-Paper/).  
 6 | 
 7 | I am still a novel student in the Machine Learning field, so there would be some mistakes or misunderstanding. Kindly contact me with any questions.
 8 | 
 9 | Regards.
10 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Python-for-EEG-analysis
2 | Currently working on analysis of EEG (Electroencephalography) signals on the emotion recognition and sleep stage classification problems. This package/file is a summary/gathering of the most frequently used packages, functions and papers of EEG analysis in Python.
3 | 


--------------------------------------------------------------------------------
/functions/Energy_extraction.py:
--------------------------------------------------------------------------------
  1 | ############################################################################
  2 | # This file is the demo code used to extract energy features for EEG analysis
  3 | # The mainly method is to use the existing package.
  4 | # However, the packages are sometimes obscure to use.
  5 | # So I wrote down this file as a reference for further work on EEG analysis.
  6 | #############################################################################
  7 | 
  8 | 
  9 | 
 10 | 
 11 | ###########################
 12 | 
 13 | ###       imported package
 14 | ####
 15 | 
 16 | import os
 17 | import pyedflib
 18 | import pandas as pd
 19 | import numpy as np
 20 | import pywt
 21 | import plotly as py
 22 | import plotly.graph_objs as go
 23 | py.offline.init_notebook_mode()
 24 | from numpy.fft import fft,ifft,fftfreq
 25 | from scipy.signal import welch,iirfilter,filtfilt
 26 | from scipy.stats import rv_continuous
 27 | from scipy.signal import savgol_filter
 28 | import xgboost as xgb
 29 | from xgboost.sklearn import XGBClassifier
 30 | from sklearn.metrics import accuracy_score
 31 | from scipy.signal import find_peaks_cwt
 32 | from scipy.interpolate import UnivariateSpline
 33 | from scipy.stats import entropy
 34 | from scipy.stats import kurtosis
 35 | from scipy.stats import skew
 36 | import pyeeg
 37 | # from spectrum import *
 38 | # from pyentrp import entropy as ent
 39 | 
 40 | 
 41 | 
 42 | ############################
 43 | 
 44 | ###				Read the data from .edf files and the labels from .txt files.
 45 | ###				Then concatenate the data and the label togther and 
 46 | ###				reshape them into the form of (n_samples, length_of_signal)
 47 | ####
 48 | 
 49 | def read_data(filename):
 50 | 
 51 | '''
 52 | Read data from the .edf and .txt files and output in pd.DataFrame.
 53 | '''
 54 | 
 55 |     f = pyedflib.EdfReader('eeg/edfs/'+filename+'.edf')
 56 |     headers = pd.DataFrame(f.getSignalHeaders())
 57 |     headers_512 = headers[ headers.sample_rate == 512][['sample_rate','label']]
 58 |     #load data
 59 |     data1 = []
 60 |     for row in headers_512.itertuples():
 61 |         idx = row.Index
 62 |         name = row.label
 63 |         data1.append(pd.DataFrame(f.readSignal(idx),columns=[name]))
 64 |     data = pd.concat(data1,axis=1)
 65 |     #load label
 66 |     label = pd.read_csv('eeg/edfs/'+filename+'.txt',header=None)
 67 |     label.columns = ['label']
 68 |     return data,label
 69 | 
 70 | 
 71 | 
 72 | def data_reshape(data,col):
 73 | 
 74 | '''
 75 | The data is the output of read_data in pd.DataFrame form.
 76 | The col is the name of the inpput signal channel.
 77 | Return in pd.DataFrame form
 78 | '''
 79 | 
 80 |     cols = [col+'_'+x for x in (map(str,range(15360)))]
 81 |     frame = pd.DataFrame(data[0][:len(data[1])*30*512][col].values.reshape((len(data[1]),15360)),columns=cols)
 82 |     return frame  
 83 | 
 84 | 
 85 | 
 86 | def map_label(label):
 87 | 
 88 | '''
 89 | Map the label from str to int. The W, N1, N2, N3, R represent each stage of sleep stages.
 90 | Return in dictionary form.
 91 | '''
 92 | 
 93 | 
 94 |     label_dic={'W':1,'N1':2,'N2':3,'N3':4,'R':5}
 95 |     return label_dic[label]
 96 | 
 97 | 
 98 | 
 99 | ###########################
100 | 
101 | ###				Filter the input signal with banpass butter filter.
102 | ###				According to the researches, we saperate the signal into 
103 | ###				5 to 7 frequency bands: alpha, delta, theta, beta, gamma, low_alpha and high_alpha
104 | ####
105 | 
106 | def Filter(frame1):
107 | 
108 | '''
109 | Filter delta,theta,alpha,beta,gamma.
110 | Input is the original signal data.
111 | Outputs are the filtered time-domain signals for each band.
112 | '''
113 |     a,b = iirfilter(1,[1.0/1024.0,4.0/1024.0],btype='bandpass',ftype='butter')
114 |     delta = filtfilt(a,b,frame1,axis=1)
115 |     a,b = iirfilter(1,[4.0/1024.0,8.0/1024.0],btype='bandpass',ftype='butter')
116 |     theta = filtfilt(a,b,frame1,axis=1)
117 |     a,b = iirfilter(1,[8.0/1024.0,14.0/1024.0],btype='bandpass',ftype='butter')
118 |     alpha = filtfilt(a,b,frame1,axis=1)
119 |     a,b = iirfilter(1,[8.5/1024.0,11.5/1024.0],btype='bandpass',ftype='butter')
120 |     low_alpha = filtfilt(a,b,frame1,axis=1)
121 |     a,b = iirfilter(1,[11.5/1024.0,15.5/1024.0],btype='bandpass',ftype='butter')
122 |     high_alpha = filtfilt(a,b,frame1,axis=1)
123 |     a,b = iirfilter(1,[14.0/1024.0,31.0/1024.0],btype='bandpass',ftype='butter')
124 |     beta = filtfilt(a,b,frame1,axis=1)
125 |     a,b = iirfilter(1,[31.0/1024.0,50.0/1024.0],btype='bandpass',ftype='butter')
126 |     gamma = filtfilt(a,b,frame1,axis=1)
127 | 
128 |     return delta, theta, alpha, beta, gamma, low_alpha, high_alpha
129 | 
130 | 
131 | 
132 | 
133 | #######################################
134 | 
135 | ### Calculate power spectual density for each frequency band (as a whole) and their relative ratio value.
136 | ### This might be the most useful and important feature for the Sleep stage classification
137 | ###
138 | ####
139 | 
140 | def energy_psd(phase, col, delta, theta, alpha, beta, gamma, low_alpha, high_alpha, dfFeature):
141 | 
142 | '''
143 | Input:
144 | 	phase: we can divide the original signal into several phase with respect with time. 
145 | 		   This variable is the label of each phase
146 | 	col: The name of each channel of bological signal.
147 | 	dfFeature: The returned df.DataFrame.
148 | '''
149 |     print "------------------------phase ", phase, "begin:"
150 |     delta_f,delta_psd = welch(delta,fs=512,scaling='density',axis=1)
151 | #     if 'Origin' == phase:
152 |     if '_phase_0' == phase:
153 |         dfFeature = pd.DataFrame((delta_psd*delta_psd).sum(axis=1))
154 |         dfFeature.columns = [col+'_delta_Energy'+phase]
155 |     else:
156 |         dfFeature[col+'_delta_Energy'+phase] = (delta_psd*delta_psd).sum(axis=1)
157 |     theta_f,theta_psd = welch(theta,fs=512,scaling='density',axis=1)
158 |     dfFeature[col+'_theta_Energy'+phase] = (theta_psd*theta_psd).sum(axis=1)
159 |     alpha_f,alpha_psd = welch(alpha,fs=512,scaling='density',axis=1)
160 |     dfFeature[col+'_alpha_Energy'+phase] = (alpha_psd*alpha_psd).sum(axis=1)
161 |     beta_f,beta_psd = welch(beta,fs=512,scaling='density',axis=1)
162 |     dfFeature[col+'_beta_Energy'+phase] = (beta_psd*beta_psd).sum(axis=1)
163 |     gamma_f,gamma_psd = welch(gamma,fs=512,scaling='density',axis=1)
164 |     dfFeature[col+'_gamma_Energy'+phase] = (gamma_psd*gamma_psd).sum(axis=1)
165 |     
166 |     low_alpha_f,low_alpha_psd = welch(low_alpha,fs=512,scaling='density',axis=1)
167 |     dfFeature[col+'_low_alpha_Energy'+phase] = (low_alpha_psd*low_alpha_psd).sum(axis=1)
168 |     high_alpha_f,high_alpha_psd = welch(high_alpha,fs=512,scaling='density',axis=1)
169 |     dfFeature[col+'_high_alpha_Energy'+phase] = (high_alpha_psd*high_alpha_psd).sum(axis=1)
170 |     
171 |     dfFeature[col+'_Energy'+phase] = dfFeature[col+'_delta_Energy'+phase]+dfFeature[col+'_theta_Energy'+phase]+dfFeature[col+'_alpha_Energy'+phase]+\
172 |                                 dfFeature[col+'_beta_Energy'+phase]+dfFeature[col+'_gamma_Energy'+phase]
173 |         
174 |     dfFeature[col+'Energyratio1'+phase] = (dfFeature[col+'_alpha_Energy'+phase])/(dfFeature[col+'_delta_Energy'+phase]+dfFeature[col+'_theta_Energy'+phase])
175 |     dfFeature[col+'Energyratio2'+phase] = (dfFeature[col+'_delta_Energy'+phase])/(dfFeature[col+'_theta_Energy'+phase]+dfFeature[col+'_alpha_Energy'+phase])
176 |     dfFeature[col+'Energyratio3'+phase] = (dfFeature[col+'_theta_Energy'+phase])/(dfFeature[col+'_delta_Energy'+phase]+dfFeature[col+'_alpha_Energy'+phase])
177 | 
178 |     dfFeature[col+'Energyrelative1'+phase] = (dfFeature[col+'_alpha_Energy'+phase])/dfFeature[col+'_Energy'+phase]
179 |     dfFeature[col+'Energyrelative2'+phase] = (dfFeature[col+'_delta_Energy'+phase])/dfFeature[col+'_Energy'+phase]   
180 |     dfFeature[col+'Energyrelative3'+phase] = (dfFeature[col+'_theta_Energy'+phase])/dfFeature[col+'_Energy'+phase]
181 |     dfFeature[col+'Energyrelative4'+phase] = (dfFeature[col+'_beta_Energy'+phase])/dfFeature[col+'_Energy'+phase]
182 |     dfFeature[col+'Energyrelative5'+phase] = (dfFeature[col+'_gamma_Energy'+phase])/dfFeature[col+'_Energy'+phase]
183 |         
184 |     return dfFeature
185 | 
186 | 
187 | 
188 | ##########################################
189 | 
190 | ### Calculate the psd with small moving windows (6 seconds with 50% overlaping)
191 | ### Extract statistic value from these psd series.
192 | ####
193 | 
194 | def small_window_psd(phase, col, delta, theta, alpha, beta, gamma, dfFeature):
195 |     
196 |     band = [delta, theta, alpha, beta, gamma]
197 |     band_name = ['delta', 'theta', 'alpha', 'beta', 'gamma']
198 |     
199 |     for i in range(5):
200 |         signal = np.array(band[i])
201 |         signal_name = band_name[i]
202 |         print "moving small windows on ", signal_name
203 |         
204 |         psd_arr = np.zeros((signal.shape[0],1 + (30-6)/3))
205 |         
206 |         for j in range(1 + (30-6)/3):
207 |             
208 |             segment = signal[::, j*3*512:(j+2)*3*512]
209 |             f,psd = welch(segment,fs=512, scaling='density', nperseg=256, noverlap=128, axis=1)
210 |             psd = pd.DataFrame(psd)
211 |             psd_arr[:,j] = psd.sum(axis=1)
212 |             
213 |         psd_arr = pd.DataFrame(psd_arr)
214 |         dfFeature[col+signal_name+'_psd_min'+phase] = psd_arr.min(axis=1)
215 |         dfFeature[col+signal_name+'_psd_max'+phase] = psd_arr.max(axis=1)
216 |         dfFeature[col+signal_name+'_psd_mean'+phase] = psd_arr.mean(axis=1)
217 |         dfFeature[col+signal_name+'_psd_median'+phase] = psd_arr.median(axis=1)
218 |         dfFeature[col+signal_name+'_psd_std'+phase] = psd_arr.std(axis=1)
219 |         dfFeature[col+signal_name+'_psd_var'+phase] = psd_arr.var(axis=1)
220 |         dfFeature[col+signal_name+'_psd_diff_max'+phase] = psd_arr.diff(axis=1).max(axis=1)
221 |         dfFeature[col+signal_name+'_psd_diff_min'+phase] = psd_arr.diff(axis=1).min(axis=1)
222 |         dfFeature[col+signal_name+'_psd_diff_mean'+phase] = psd_arr.diff(axis=1).mean(axis=1)
223 |         dfFeature[col+signal_name+'_psd_diff_std'+phase] = psd_arr.diff(axis=1).std(axis=1)
224 |         
225 |     return dfFeature
226 | 


--------------------------------------------------------------------------------
/functions/Entropy_extraction.py:
--------------------------------------------------------------------------------
  1 | ############################################################################
  2 | # This file is the demo code used to extract features for EEG analysis
  3 | # The mainly method is to use the existing package.
  4 | # However, the packages are sometimes obscure to use.
  5 | # So I wrote down this file as a reference for further work on EEG analysis.
  6 | ######################################
  7 | 
  8 | 
  9 | 
 10 | ###########################
 11 | 
 12 | ###       imported package
 13 | ####
 14 | 
 15 | import os
 16 | import pyedflib
 17 | import pandas as pd
 18 | import numpy as np
 19 | import pywt
 20 | import plotly as py
 21 | import plotly.graph_objs as go
 22 | py.offline.init_notebook_mode()
 23 | from numpy.fft import fft,ifft,fftfreq
 24 | from scipy.signal import welch,iirfilter,filtfilt
 25 | from scipy.stats import rv_continuous
 26 | from scipy.signal import savgol_filter
 27 | import xgboost as xgb
 28 | from xgboost.sklearn import XGBClassifier
 29 | from sklearn.metrics import accuracy_score
 30 | from scipy.signal import find_peaks_cwt
 31 | from scipy.interpolate import UnivariateSpline
 32 | from scipy.stats import entropy
 33 | from scipy.stats import kurtosis
 34 | from scipy.stats import skew
 35 | import pyeeg
 36 | # from spectrum import *
 37 | # from pyentrp import entropy as ent
 38 | 
 39 | 
 40 | 
 41 | def DE(phase, col, delta, theta, alpha, beta, gamma, dfFeature):
 42 |     #Differential entropy
 43 |     en = np.zeros(alpha.shape[0])
 44 |     sum_of = alpha.sum() * 1.0
 45 |     alpha = alpha / sum_of
 46 |     for i in range(alpha.shape[0]):
 47 |         en[i] = entropy(alpha[i])
 48 |     dfFeature[col+'alpha_Entropy'+phase] = en
 49 |     
 50 | #     sample_entropy = np.zeros(alpha.shape[0])
 51 | #     std_alpha = np.std(alpha, axis=1)
 52 | #     for i in range(std_alpha.shape[0]):
 53 | #         sample_entropy[i] = ent.sample_entropy(alpha, 4, 0.2*std_alpha)
 54 | #     dfFeature[col+'alpha_Sample_Entropy'+phase] = sample_entropy
 55 |     
 56 |     
 57 |     en = np.zeros(delta.shape[0])
 58 |     sum_of = delta.sum() * 1.0
 59 |     delta = delta / sum_of
 60 |     for i in range(delta.shape[0]):
 61 |         en[i] = entropy(delta[i])
 62 |     dfFeature[col+'delta_Entropy'+phase] = en
 63 |     
 64 |     en = np.zeros(theta.shape[0])
 65 |     sum_of = theta.sum() * 1.0
 66 |     theta = theta / sum_of
 67 |     for i in range(theta.shape[0]):
 68 |         en[i] = entropy(theta[i])
 69 |     dfFeature[col+'theta_Entropy'+phase] = en
 70 |     
 71 |     en = np.zeros(beta.shape[0])
 72 |     sum_of = beta.sum() * 1.0
 73 |     beta = beta / sum_of
 74 |     for i in range(beta.shape[0]):
 75 |         en[i] = entropy(beta[i])
 76 |     dfFeature[col+'beta_Entropy'+phase] = en
 77 |     
 78 |     en = np.zeros(gamma.shape[0])
 79 |     sum_of = gamma.sum() * 1.0
 80 |     gamma = gamma / sum_of
 81 |     for i in range(gamma.shape[0]):
 82 |         en[i] = entropy(gamma[i])
 83 |     dfFeature[col+'gamma_Entropy'+phase] = en
 84 |     
 85 |     
 86 | #     dfFeature[col+'delta_Entropy'+phase] = np.log(dfFeature[col+'_delta_Energy'+phase])
 87 | #     dfFeature[col+'theta_Entropy'+phase] = np.log(dfFeature[col+'_theta_Energy'+phase])
 88 | #     dfFeature[col+'alpha_Entropy'+phase] = np.log(dfFeature[col+'_alpha_Energy'+phase])
 89 | #     dfFeature[col+'beta_Entropy'+phase] = np.log(dfFeature[col+'_beta_Energy'+phase])
 90 | #     dfFeature[col+'gamma_Entropy'+phase] = np.log(dfFeature[col+'_gamma_Energy'+phase])
 91 |     return dfFeature
 92 | 
 93 | 
 94 | 
 95 | 
 96 | 
 97 | 
 98 | 
 99 | 
100 | ############################################
101 | 
102 | ### Following is the other useful functions
103 | ### Take down in the raw function from the introduction of its package
104 | ####
105 | 
106 | 
107 | ### From the package pyeeg: https://github.com/forrestbao/pyeeg/blob/master/pyeeg/__init__.py
108 | 
109 | 
110 | def spectral_entropy(X, Band, Fs, Power_Ratio=None):
111 |     """Compute spectral entropy of a time series from either two cases below:
112 |     1. X, the time series (default)
113 |     2. Power_Ratio, a list of normalized signal power in a set of frequency
114 |     bins defined in Band (if Power_Ratio is provided, recommended to speed up)
115 |     In case 1, Power_Ratio is computed by bin_power() function.
116 |     """
117 | 
118 | 
119 | def svd_entropy(X, Tau, DE, W=None):
120 |     """Compute SVD Entropy from either two cases below:
121 |     1. a time series X, with lag tau and embedding dimension dE (default)
122 |     2. a list, W, of normalized singular values of a matrix (if W is provided,
123 |     recommend to speed up.)
124 |     """
125 | 
126 | def ap_entropy(X, M, R):
127 |     """Computer approximate entropy (ApEN) of series X, specified by M and R.
128 |     Suppose given time series is X = [x(1), x(2), ... , x(N)]. We first build
129 |     embedding matrix Em, of dimension (N-M+1)-by-M, such that the i-th row of
130 |     Em is x(i),x(i+1), ... , x(i+M-1). Hence, the embedding lag and dimension
131 |     are 1 and M-1 respectively. Such a matrix can be built by calling pyeeg
132 |     function as Em = embed_seq(X, 1, M). Then we build matrix Emp, whose only
133 |     difference with Em is that the length of each embedding sequence is M + 1
134 |     Denote the i-th and j-th row of Em as Em[i] and Em[j]. Their k-th elements
135 |     are Em[i][k] and Em[j][k] respectively. The distance between Em[i] and
136 |     Em[j] is defined as 1) the maximum difference of their corresponding scalar
137 |     components, thus, max(Em[i]-Em[j]), or 2) Euclidean distance. We say two
138 |     1-D vectors Em[i] and Em[j] *match* in *tolerance* R, if the distance
139 |     between them is no greater than R, thus, max(Em[i]-Em[j]) <= R. Mostly, the
140 |     value of R is defined as 20% - 30% of standard deviation of X.
141 |     Pick Em[i] as a template, for all j such that 0 < j < N - M + 1, we can
142 |     check whether Em[j] matches with Em[i]. Denote the number of Em[j],
143 |     which is in the range of Em[i], as k[i], which is the i-th element of the
144 |     vector k. The probability that a random row in Em matches Em[i] is
145 |     \simga_1^{N-M+1} k[i] / (N - M + 1), thus sum(k)/ (N - M + 1),
146 |     denoted as Cm[i].
147 |     We repeat the same process on Emp and obtained Cmp[i], but here 0<i<N-M
148 |     since the length of each sequence in Emp is M + 1.
149 |     The probability that any two embedding sequences in Em match is then
150 |     sum(Cm)/ (N - M +1 ). We define Phi_m = sum(log(Cm)) / (N - M + 1) and
151 |     Phi_mp = sum(log(Cmp)) / (N - M ).
152 |     And the ApEn is defined as Phi_m - Phi_mp.
153 |     """
154 | 
155 | 
156 | def samp_entropy(X, M, R):
157 |     """Computer sample entropy (SampEn) of series X, specified by M and R.
158 |     SampEn is very close to ApEn.
159 |     Suppose given time series is X = [x(1), x(2), ... , x(N)]. We first build
160 |     embedding matrix Em, of dimension (N-M+1)-by-M, such that the i-th row of
161 |     Em is x(i),x(i+1), ... , x(i+M-1). Hence, the embedding lag and dimension
162 |     are 1 and M-1 respectively. Such a matrix can be built by calling pyeeg
163 |     function as Em = embed_seq(X, 1, M). Then we build matrix Emp, whose only
164 |     difference with Em is that the length of each embedding sequence is M + 1
165 |     Denote the i-th and j-th row of Em as Em[i] and Em[j]. Their k-th elements
166 |     are Em[i][k] and Em[j][k] respectively. The distance between Em[i] and
167 |     Em[j] is defined as 1) the maximum difference of their corresponding scalar
168 |     components, thus, max(Em[i]-Em[j]), or 2) Euclidean distance. We say two
169 |     1-D vectors Em[i] and Em[j] *match* in *tolerance* R, if the distance
170 |     between them is no greater than R, thus, max(Em[i]-Em[j]) <= R. Mostly, the
171 |     value of R is defined as 20% - 30% of standard deviation of X.
172 |     Pick Em[i] as a template, for all j such that 0 < j < N - M , we can
173 |     check whether Em[j] matches with Em[i]. Denote the number of Em[j],
174 |     which is in the range of Em[i], as k[i], which is the i-th element of the
175 |     vector k.
176 |     We repeat the same process on Emp and obtained Cmp[i], 0 < i < N - M.
177 |     The SampEn is defined as log(sum(Cm)/sum(Cmp))
178 |     """
179 | 
180 | 
181 | 
182 | def permutation_entropy(x, n, tau):
183 |     """Compute Permutation Entropy of a given time series x, specified by
184 |     permutation order n and embedding lag tau.
185 |     """
186 |     
187 |     
188 | ############################################
189 | 
190 | ### From the package pyEntropy: https://github.com/nikdon/pyEntropy
191 | ### Install: pip install pyentrp
192 | ### More easy to use than the pyeeg package
193 | 
194 | def multiscale_entropy(time_series, sample_length, tolerance):
195 |     n = len(time_series)
196 |     mse = np.zeros((1, sample_length))
197 | 
198 |     for i in range(sample_length):
199 |         b = int(np.fix(n / (i + 1)))
200 |         temp_ts = [0] * int(b)
201 |         for j in range(b):
202 |             num = np.sum(time_series[j * (i + 1): (j + 1) * (i + 1)])
203 |             den = i + 1
204 |             temp_ts[j] = float(num) / float(den)
205 |         se = pyentropy.sample_entropy(temp_ts, 1, tolerance)
206 |         mse[0, i] = se
207 |     
208 |     return mse[0]
209 | 
210 | def multiscale_permutation_entropy(time_series, m, delay, scale):
211 |     mspe = []
212 |     for i in range(scale):
213 |         coarse_time_series = pyentropy.util_granulate_time_series(time_series, i + 1)
214 |         pe = pyentropy.permutation_entropy(coarse_time_series, m, delay)
215 |         mspe.append(pe)
216 |     return mspe
217 | 
218 | def composite_multiscale_entropy(time_series, sample_length, scale, tolerance=None):
219 |     cmse = np.zeros((1, scale))
220 | 
221 |     for i in range(scale):
222 |         for j in range(i):
223 |             tmp = pyentropy.util_granulate_time_series(time_series[j:], i + 1)
224 |             cmse[0,i] = pyentropy.sample_entropy(tmp, sample_length, tolerance).sum() / (i + 1)
225 |     return cmse
226 | 
227 | def small_window_en(phase, col, delta, theta, alpha, beta, gamma, dfFeature):
228 |     
229 |     '''
230 |     Calculating Multi-scale entropy for the EEG signals.
231 |     Multiscale Entropy (sample entropy) and Multiscale-permutation Entropy.
232 |     Paper: Nakamura T, Adjei T, Alqurashi Y, et al. Complexity science for sleep stage classification from EE.
233 |     '''
234 |     
235 |     band = [delta, theta, alpha, beta, gamma]
236 |     band_name = ['delta', 'theta', 'alpha', 'beta', 'gamma']
237 |     
238 |     for i in range(5):
239 |         signal = pd.DataFrame(band[i])
240 |         signal_name = band_name[i]
241 |         print "calculating en with moving small window on ", signal_name
242 |         
243 |         multiscale_arr = np.zeros((signal.shape[0],30))
244 |         multipermu_arr = np.zeros((signal.shape[0],30))
245 | #         compomulti_arr = np.zeros((signal.shape[0],30))
246 | 
247 |         for index, row in signal.iterrows():
248 |             print index
249 |             multiscale_arr[index] = multiscale_entropy(row, 30, tolerance=None)
250 |             multipermu_arr[index] = multiscale_permutation_entropy(row, 5, 1, 30)
251 | #             compomulti_arr[j] = composite_multiscale_entropy(signal[j], 30, 30)
252 |         
253 |         multiscale_arr = pd.DataFrame(multiscale_arr)
254 |         dfFeature[col+signal_name+'_multiscale_min'+phase] = multiscale_arr.min(axis=1)
255 |         dfFeature[col+signal_name+'_multiscale_max'+phase] = multiscale_arr.max(axis=1)
256 |         dfFeature[col+signal_name+'_multiscale_mean'+phase] = multiscale_arr.mean(axis=1)
257 |         dfFeature[col+signal_name+'_multiscale_median'+phase] = multiscale_arr.median(axis=1)
258 |         dfFeature[col+signal_name+'_multiscale_std'+phase] = multiscale_arr.std(axis=1)
259 |         dfFeature[col+signal_name+'_multiscale_var'+phase] = multiscale_arr.var(axis=1)
260 |         dfFeature[col+signal_name+'_multiscale_diff_max'+phase] = multiscale_arr.diff(axis=1).max(axis=1)
261 |         dfFeature[col+signal_name+'_multiscale_diff_min'+phase] = multiscale_arr.diff(axis=1).min(axis=1)
262 |         dfFeature[col+signal_name+'_multiscale_diff_mean'+phase] = multiscale_arr.diff(axis=1).mean(axis=1)
263 |         dfFeature[col+signal_name+'_multiscale_diff_std'+phase] = multiscale_arr.diff(axis=1).std(axis=1)
264 |         
265 |         multipermu_arr = pd.DataFrame(multipermu_arr)
266 |         dfFeature[col+signal_name+'_multipermu_min'+phase] = multipermu_arr.min(axis=1)
267 |         dfFeature[col+signal_name+'_multipermu_max'+phase] = multipermu_arr.max(axis=1)
268 |         dfFeature[col+signal_name+'_multipermu_mean'+phase] = multipermu_arr.mean(axis=1)
269 |         dfFeature[col+signal_name+'_multipermu_median'+phase] = multipermu_arr.median(axis=1)
270 |         dfFeature[col+signal_name+'_multipermu_std'+phase] = multipermu_arr.std(axis=1)
271 |         dfFeature[col+signal_name+'_multipermu_var'+phase] = multipermu_arr.var(axis=1)
272 |         dfFeature[col+signal_name+'_multipermu_diff_max'+phase] = multipermu_arr.diff(axis=1).max(axis=1)
273 |         dfFeature[col+signal_name+'_multipermu_diff_min'+phase] = multipermu_arr.diff(axis=1).min(axis=1)
274 |         dfFeature[col+signal_name+'_multipermu_diff_mean'+phase] = multipermu_arr.diff(axis=1).mean(axis=1)
275 |         dfFeature[col+signal_name+'_multipermu_diff_std'+phase] = multipermu_arr.diff(axis=1).std(axis=1)
276 |         
277 | 
278 |         
279 |     return dfFeature
280 | 


--------------------------------------------------------------------------------
/functions/SpectralEdgeFrequency_Extraction.py:
--------------------------------------------------------------------------------
 1 | def calcNormalizedFFT(epoch,lvl,nt,fs=512):
 2 |     lseg = np.round(nt/fs*lvl).astype('int')
 3 |     D = np.absolute(np.fft.fft(epoch, n=lseg[-1], axis = 1))
 4 |     D[:,0]=0
 5 |     D /= D.sum()
 6 |     return D
 7 | 
 8 | def calcSpectralEdgeFreq(epoch, lvl, nt, nc, fs=512, percent=0.5):
 9 |     # find the spectral edge frequency
10 |     sfreq = fs
11 |     tfreq = 40
12 |     ppow = percent
13 |     
14 |     topfreq = int(round(nt/sfreq*tfreq)) + 1
15 |     D = calcNormalizedFFT(epoch, lvl, nt, fs)
16 |     A = np.cumsum(D[:topfreq, :], axis = 1)
17 |     B = A - (A.max()*ppow)
18 |     spedge = np.min(np.abs(B), axis = 1)
19 |     spedge = (spedge - 1)/(topfreq - 1)*tfreq
20 |     
21 |     return spedge
22 |   
23 | def SEF(phase, col, eeg, dfFeature):
24 | 
25 |     lvl = np.array([0.4,4,8,12,30,70,180])
26 |     [nc,nt] = eeg.shape
27 |     
28 |     sef1 = np.zeros((eeg.shape[0],281))
29 |     sef2 = np.zeros((eeg.shape[0],281))
30 |     
31 |     for index, row in eeg.iterrows():
32 |         sef1[index] = calcSpectralEdgeFreq(row.reshape(1,-1), lvl, nt, nc, fs=512, percent=0.5)
33 |         sef2[index] = calcSpectralEdgeFreq(row.reshape(1,-1), lvl, nt, nc, fs=512, percent=0.95)
34 |     sefd = sef2 - sef1
35 | 
36 |     dfFeature[col+'_SpectralEdgeFrequency95_mean_'+phase] = np.mean(sef2,axis=1)
37 |     dfFeature[col+'_SpectralEdgeFrequency95_max_'+phase] = np.max(sef2,axis=1)
38 |     dfFeature[col+'_SpectralEdgeFrequency95_min_'+phase] = np.min(sef2,axis=1)
39 |     dfFeature[col+'_SpectralEdgeFrequency95_var_'+phase] = np.var(sef2,axis=1)
40 |     dfFeature[col+'_SpectralEdgeFrequency50_mean_'+phase] = np.mean(sef1,axis=1)
41 |     dfFeature[col+'_SpectralEdgeFrequency50_max_'+phase] = np.max(sef1,axis=1)
42 |     dfFeature[col+'_SpectralEdgeFrequency50_min_'+phase] = np.min(sef1,axis=1)
43 |     dfFeature[col+'_SpectralEdgeFrequency50_var_'+phase] = np.var(sef1,axis=1)
44 |     dfFeature[col+'_SpectralEdgeFrequencyDiff_mean_'+phase] = np.mean(sefd,axis=1)
45 |     dfFeature[col+'_SpectralEdgeFrequencyDiff_mean_'+phase] = np.max(sefd,axis=1)
46 |     dfFeature[col+'_SpectralEdgeFrequencyDiff_mean_'+phase] = np.min(sefd,axis=1)
47 |     dfFeature[col+'_SpectralEdgeFrequencyDiff_mean_'+phase] = np.var(sefd,axis=1)
48 |     
49 |     return dfFeature
50 | 


--------------------------------------------------------------------------------
/functions/readme.md:
--------------------------------------------------------------------------------
 1 | ### Introduction
 2 | Collection of usful functions for Analysis of EEG signals. They are gathered from several open-source project either in Github or self-pages.  
 3 | **Sallute to their own authors!**  
 4 | In Python 2.7
 5 | 
 6 | ### Dependency
 7 | - Numpy
 8 | - Pandas
 9 | - PyEDFlib
10 | - PyWavelets (pywt)
11 | - Scipy
12 | - PyEEG
13 | - pyEntropy
14 | 


--------------------------------------------------------------------------------