├── LICENSE
├── README.md
├── code
│   ├── IHR-IPR_Accuracy.py
│   ├── Peak_Matching.py
│   └── bsqi.py
└── paper
    └── 2021_Kotzen_et_al_IOP_CinC2021_Benchmarking_PPG_peak_detection .pdf


/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2021 AIMLab.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Benchmarking PPG Peak Detection Algorithms using ECG as a Reference

- See /paper for the publication
- See /code for the Peak Matching and IHR-IPR Accuracy algorithms
- Requirements: Python 3, NumPy, pandas, SciPy

## Abstract
**Introduction:** Photoplethysmography (PPG) is fast becoming the signal of choice for the widespread monitoring of sleep metrics obtained by wearable devices. Robust peak detection is critical for the extraction of meaningful features from the PPG waveform. There is, however, no consensus on which PPG peak detection algorithms perform best on nocturnal continuous PPG recordings. We introduce two methods to benchmark the performance of PPG peak detectors.

**Methods:** We make use of data where nocturnal PPG and electrocardiogram (ECG) are measured synchronously. Within this setting, the ECG, a signal for which there are established R-peak detectors, is used as reference. The first method for benchmarking, denoted "Peak Matching", consists of forecasting the expected position of the PPG peaks using the ECG R-peaks as reference. The second technique, denoted "IHR-IPR Accuracy", compares the instantaneous pulse rate (IPR) extracted from the PPG with the instantaneous heart rate (IHR) extracted from the ECG. For benchmarking, we used the MESA dataset consisting of 2,055 overnight polysomnography recordings with a combined length of over 16,300 hours. Four open PPG peak detectors were benchmarked.

**Results:** The "Pulses" detector performed best with a Peak Matching F1-score of 0.94 and an IHR-IPR Accuracy of 89.6%.

**Discussion and conclusion:** We introduced two new methods for benchmarking PPG peak detectors. Among the four detectors evaluated, "Pulses" performed best. Benchmarking of further PPG detectors and on other data sources (e.g. daytime recordings, recordings from patients with arrhythmia) is needed.
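### Illustration: IHR-IPR agreement on toy data

The IHR-IPR comparison described in the Methods can be illustrated with a minimal, dependency-free sketch. It is not the repository's implementation (see /code/IHR-IPR_Accuracy.py for that). The helper names `instantaneous_rate` and `fraction_within`, the toy peak positions, the symmetric tolerance and the beat-by-beat comparison (no interpolation, resampling or windowing) are all assumptions made for illustration.

```python
def instantaneous_rate(peaks, fs):
    """Convert sorted peak positions [samples] into beat-to-beat rates [BPM]."""
    return [60.0 * fs / (b - a) for a, b in zip(peaks, peaks[1:])]


def fraction_within(ipr, ihr, tolerance_bpm):
    """Fraction of beats whose IPR lies within +/- tolerance_bpm of the IHR."""
    hits = sum(1 for p, h in zip(ipr, ihr) if abs(p - h) <= tolerance_bpm)
    return hits / len(ihr)


fs = 256  # Hz, hypothetical sample rate
ecg_peaks = [0, 256, 512, 768, 1024]    # toy R-peaks, one beat per second (60 BPM)
ppg_peaks = [115, 371, 627, 871, 1139]  # toy PPG peaks, roughly 0.45 s later

ihr = instantaneous_rate(ecg_peaks, fs)  # [60.0, 60.0, 60.0, 60.0]
ipr = instantaneous_rate(ppg_peaks, fs)  # the last two beats deviate by about 3 BPM

print(fraction_within(ipr, ihr, tolerance_bpm=5))  # 1.0
print(fraction_within(ipr, ihr, tolerance_bpm=1))  # 0.5
```

Tightening the tolerance from 5 BPM to 1 BPM halves the agreement here, which is why the repository reports agreement at several tolerances.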


--------------------------------------------------------------------------------
/code/IHR-IPR_Accuracy.py:
--------------------------------------------------------------------------------
import numpy as np
import pandas as pd
from scipy import signal, interpolate

def calculate_itervals_forwards(points):
    """
    Forward difference, equivalent to numpy.diff with a NaN appended at the end to maintain shape.
    :param points: A numpy array of sorted fiducial point positions
    :return: The beat-to-beat intervals
    """
    return np.append((points[1:] - points[0:-1]), np.nan)

def nan_helper(y):
    """
    Finds all np.nan in a numpy array
    :param y: A numpy array to search through
    :return: A boolean mask where NaN is True, and a function that converts a mask to indices for interpolation
    """
    return np.isnan(y), lambda z: z.nonzero()[0]

def moving_average_filter(ibi, win_samples, percent):
    """
    Outlier detection and removal. Adapted from the Physiozoo filtrr moving average filter.
    https://github.com/physiozoo/mhrv/blob/2f67075e3db11120b92dd29c869a3ef4a527a2c2/%2Bmhrv/%2Brri/filtrr.m
    :param ibi: A numpy array of Inter Beat Intervals
    :param win_samples: The number of consecutive IBIs on each side to include in the moving average filter
    :param percent: The percentage above/below the average to use for filtering
    :return: Filtered ibi, with outliers replaced by linear interpolation
    """
    # Symmetric FIR kernel that averages the neighbours and excludes the centre sample.
    b_fir = 1 / (2 * win_samples) * np.append(np.ones(win_samples), np.append(0, np.ones(win_samples)))
    points_moving_average = signal.filtfilt(b_fir, 1, ibi)
    points_filtered = ibi.copy()
    # Mark every interval outside +/- percent of the moving average as NaN, then interpolate over the NaNs.
    points_filtered[
        ~((ibi < (1 + percent / 100) * points_moving_average) & (ibi > (1 - percent / 100) * points_moving_average))] = np.nan
    nans, x = nan_helper(points_filtered)
    points_filtered[nans] = np.interp(x(nans), x(~nans), points_filtered[~nans])
    return points_filtered

def find_closest_smaller_value(find_value, list_of_values):
    """
    Finds the closest value in a sorted list of values that is smaller than find_value
    :param find_value: The value we are searching for
    :param list_of_values: The sorted list of values
    :return: The index of the closest value in the list. Returns -1 if not found.
    """
    # Search from the end and include the last element.
    for i in reversed(range(len(list_of_values))):
        if (list_of_values[i] < find_value):
            return i
    return -1

def find_closest_bigger_value(value, list_of_values):
    """
    Finds the closest value in a sorted list of values that is bigger than value
    :param value: The value we are searching for
    :param list_of_values: The sorted list of values
    :return: The index of the closest value in the list. Returns -1 if not found.
    """
    # Search from the start and include the last element.
    for i in range(len(list_of_values)):
        if (list_of_values[i] > value):
            return i
    return -1

def calculate_windowed_IHR_IPR_agreement(ppg_peaks, ecg_peaks, fs=256, window=30, ptt=0.45, max_HR_detla=5):
    """
    Compares the instantaneous pulse rate (IPR) from the PPG with the instantaneous heart rate (IHR) from the ECG.
    :param ppg_peaks: A numpy array of PPG fiducial point positions when sampled at 'fs'
    :param ecg_peaks: A numpy array of ECG R-Peaks when sampled at 'fs'
    :param fs: Sample rate [Hz] of PPG and ECG peaks
    :param window: Window size [Seconds]
    :param ptt: Approximate Pulse Transit Time [Seconds]
    :param max_HR_detla: Size of the IHR window [BPM]. Currently unused: agreement is always reported for 1 to 5 BPM.
    :return: A Pandas DataFrame with the IHR-IPR agreement at 1 to 5 BPM for each window.
    """

    # 1) Shift the ECG peaks by the approximate PTT
    ecg_peaks = ecg_peaks + int(ptt * fs)

    # 2) Limit the signal ranges to one another.
    start_arg, end_arg = 0, len(ppg_peaks)
    if ppg_peaks[-1] > ecg_peaks[-1]:
        end_arg = find_closest_smaller_value(ecg_peaks[-1], ppg_peaks) + 1
    if ppg_peaks[0] < ecg_peaks[0]:
        start_arg = find_closest_bigger_value(ecg_peaks[0], ppg_peaks)

    ppg_peaks = ppg_peaks[start_arg:end_arg]

    # 3) Calculate the RR interval and filter out really bad points. Convert to HR estimate
    RR = calculate_itervals_forwards(ecg_peaks) / fs
    # Moving average window of 10 beats. Filter at 50% from the moving average.
    RR_filt = moving_average_filter(RR, win_samples=10, percent=50)
    HR_RR = 60 / (RR_filt)

    # 4) Calculate the PP intervals for the patient. Convert to HR estimate
    PP = calculate_itervals_forwards(ppg_peaks) / fs
    HR_PP = 60 / PP

    # 5) Build the continuous IHR and IPR functions
    HR_RR_continuous = interpolate.interp1d(ecg_peaks, HR_RR)
    HR_PP_continuous = interpolate.interp1d(ppg_peaks, HR_PP)

    # 6) Resample all the HRs to 2Hz
    resample_2Hz = np.arange(ppg_peaks[0], ppg_peaks[-1], fs / 2)
    HR_RR = HR_RR_continuous(resample_2Hz)
    HR_PP = HR_PP_continuous(resample_2Hz)

    # 7) Calculate the agreement inside windows
    fs_2hz = 2
    window_2hz = window*fs_2hz
    len_ppg_in_s = ppg_peaks[-1]/fs
    len_ppg_at_2hz = len_ppg_in_s*fs_2hz
    windows = np.arange(0, len_ppg_at_2hz, window_2hz)
    # Collect one row per window. DataFrame.append was removed in pandas 2.0.
    window_rows = []

    for i in (range(windows.shape[0] - 1)):
        window_HR_RR = HR_RR[i*window_2hz:(i+1)*window_2hz]
        window_HR_PP = HR_PP[i*window_2hz:(i+1)*window_2hz]
        row = {'Epoch': i}
        for bpm in range(1, 6):
            # Samples where the reference IHR is NaN are counted as agreement.
            in_band = (window_HR_PP < window_HR_RR+bpm) & (window_HR_PP >= window_HR_RR-bpm)
            row['Agreement %dBPM' % bpm] = np.sum(in_band | np.isnan(window_HR_RR)) / len(window_HR_RR)
        window_rows.append(row)
    return pd.DataFrame(window_rows)
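A minimal sketch of the outlier idea behind `moving_average_filter`, kept dependency-free so it can be run on its own. It is a simplification and not the code above: it uses a plain centred mean of the neighbouring intervals instead of the zero-phase `filtfilt` FIR filter, it only flags outliers rather than interpolating over them, and the name `flag_outlier_intervals` and the toy data are hypothetical.

```python
def flag_outlier_intervals(ibi, win_samples, percent):
    """Flag intervals deviating more than `percent` % from the mean of their neighbours."""
    flags = []
    for i, value in enumerate(ibi):
        # Up to win_samples neighbours on each side, excluding the interval itself.
        neighbours = ibi[max(0, i - win_samples):i] + ibi[i + 1:i + 1 + win_samples]
        local_mean = sum(neighbours) / len(neighbours)
        low = (1 - percent / 100) * local_mean
        high = (1 + percent / 100) * local_mean
        flags.append(not (low < value < high))
    return flags


# Toy RR intervals in seconds; the 2.1 s interval mimics a missed beat.
rr = [1.0, 1.0, 1.0, 2.1, 1.0, 1.0, 1.0]
flags = flag_outlier_intervals(rr, win_samples=2, percent=50)
print(flags)  # [False, False, False, True, False, False, False]
```

Excluding the centre interval from its own average, as the FIR kernel above also does, keeps a single large outlier from raising the threshold it is tested against.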


--------------------------------------------------------------------------------
/code/Peak_Matching.py:
--------------------------------------------------------------------------------
import numpy as np
import pandas as pd
from bsqi import bsqi

from scipy import signal

def nan_helper(y):
    """
    Finds all np.nan in a numpy array
    :param y: A numpy array to search through
    :return: A boolean mask where NaN is True, and a function that converts a mask to indices for interpolation
    """
    return np.isnan(y), lambda z: z.nonzero()[0]

def calculate_ptt(ppg_peaks, ecg_peaks, fs=256, max_ptt=0.54, min_ptt=0.20, smoothing_length=300):
    """
    Calculates the Pulse Transit Time (PTT) from the ECG R-Peaks and PPG Systolic Peaks.
    :param ppg_peaks: A numpy array of PPG systolic peak positions when sampled at fs [Hz]
    :param ecg_peaks: A numpy array of ECG R-Peaks when sampled at fs [Hz]
    :param fs: Sample rate [Hz] of PPG and ECG peaks
    :param max_ptt: The maximum time [Seconds] between the R-Peak and its PPG-Peak
    :param min_ptt: The minimum time [Seconds] between the R-Peak and its PPG-Peak
    :param smoothing_length: Number of data points used to smooth the PTT signal.
    :return: The PTT in samples (at fs [Hz]) for each ECG R-Peak
    """
    midpoint = (max_ptt * fs + min_ptt * fs) / 2
    ptt = np.zeros_like(ecg_peaks).astype(np.float32)
    ptt[:] = midpoint

    # The PTT of each beat is the delay to the first PPG peak inside its RR interval.
    for i in range(len(ecg_peaks) - 1):
        try:
            start = ecg_peaks[i]
            end = ecg_peaks[i + 1]
            times = ppg_peaks[(ppg_peaks > start) & (ppg_peaks < end)]
            if len(times) > 0:
                ptt[i] = times[0] - start
            else:
                ptt[i] = np.nan
        except Exception:
            print("failed")
            if i > 0:
                ptt[i] = np.nan

    # Fill beats without a PPG peak by linear interpolation.
    nans, x = nan_helper(ptt)
    ptt[nans] = np.interp(x(nans), x(~nans), ptt[~nans])

    # Replace out-of-range values with the mean of the in-range values, then smooth.
    ptt[(ptt > max_ptt * fs) | (ptt < min_ptt * fs)] = np.mean(ptt[(ptt < max_ptt * fs) & (ptt > min_ptt * fs)])
    ptt = signal.filtfilt(np.ones(smoothing_length) / smoothing_length, 1, ptt, padlen=smoothing_length)
    ptt = np.array(ptt[0:len(ecg_peaks)]).astype(int)

    return ptt


def calculate_delayed_ecg(ppg_peaks, ecg_peaks, fs=256):
    """
    Forecasts the expected position of PPG fiducial points from the ECG R-Peaks, taking PTT into consideration
    :param ppg_peaks: A numpy array of PPG fiducial point positions when sampled at fs [Hz]
    :param ecg_peaks: A numpy array of ECG R-Peaks when sampled at fs [Hz]
    :param fs: Sample rate [Hz] of PPG and ECG peaks
    :return: Forecast of PPG fiducial point positions.
    """
    return ecg_peaks + calculate_ptt(ppg_peaks, ecg_peaks, fs=fs)

def calculate_windowed_delayed_ppg_ecg_bsqi(ppg_peaks, ecg_peaks, len_ppg=None, fs=256, window=30, agw=0.15):
    """
    Calculates the peak matching statistics for each window of length 'window' [Seconds]
    :param ppg_peaks: A numpy array of PPG fiducial point positions when sampled at 'fs'
    :param ecg_peaks: A numpy array of ECG R-Peaks when sampled at 'fs'
    :param len_ppg: Length of the PPG recording [Samples]. Must be provided; the None default is not handled.
    :param fs: Sample rate [Hz] of PPG and ECG peaks
    :param window: Window size [Seconds] in which to calculate results
    :param agw: Maximum time [Seconds] between a detected PPG fiducial point and its forecast position
    :return: A Pandas DataFrame with the peak matching statistics, including the F1 score, for each window.
    """

    # Limit the peaks to the ECG reference.
    ppg_peaks = ppg_peaks[ppg_peaks < ecg_peaks[-1]]
    ppg_peaks = ppg_peaks[ppg_peaks > ecg_peaks[0]]

    # Delay the ECG peaks by the PTT to forecast the PPG peak positions
    delayed_ecg_peaks = calculate_delayed_ecg(ppg_peaks, ecg_peaks, fs=fs)

    # Window the results
    window_fs = fs * window
    windows = np.arange(0, len_ppg, window_fs)
    # Collect one row per window. DataFrame.append was removed in pandas 2.0.
    window_rows = []

    for i in (range(windows.shape[0] - 1)):
        window_ppg_peaks = ppg_peaks[(ppg_peaks >= window_fs*i) & (ppg_peaks < window_fs*(i+1))]
        window_delayed_ecg_peaks = delayed_ecg_peaks[(delayed_ecg_peaks >= window_fs*i) & (delayed_ecg_peaks < window_fs*(i+1))]
        window_rows.append({'Epoch': i, **bsqi(window_delayed_ecg_peaks, window_ppg_peaks, fs=fs, agw=agw, return_dict=True)})

    return pd.DataFrame(window_rows)
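The per-beat step at the core of `calculate_ptt` can be sketched without NumPy. This is an illustrative simplification rather than the function above: it skips the interpolation of missing beats, the range check against `min_ptt`/`max_ptt` and the smoothing, and the name `estimate_ptt` and the toy peak positions are hypothetical.

```python
from bisect import bisect_right


def estimate_ptt(ppg_peaks, ecg_peaks):
    """For each RR interval, return the delay [samples] to the first PPG peak inside it, or None."""
    ptt = []
    for start, end in zip(ecg_peaks, ecg_peaks[1:]):
        k = bisect_right(ppg_peaks, start)  # index of the first PPG peak strictly after `start`
        if k < len(ppg_peaks) and ppg_peaks[k] < end:
            ptt.append(ppg_peaks[k] - start)
        else:
            ptt.append(None)  # no PPG peak was detected for this beat
    return ptt


ecg_peaks = [0, 256, 512, 768]  # toy R-peaks [samples]
ppg_peaks = [100, 360, 870]     # toy PPG peaks; the beat starting at 512 has none

ptt = estimate_ptt(ppg_peaks, ecg_peaks)
print(ptt)  # [100, 104, None]

# Forecast PPG peak positions for the beats where a PTT is available.
forecast = [r + d for r, d in zip(ecg_peaks, ptt) if d is not None]
print(forecast)  # [100, 360]
```

Beats without a PPG peak are the ones `calculate_ptt` fills by interpolation, so that every R-peak still receives a forecast position.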


--------------------------------------------------------------------------------
/code/bsqi.py:
--------------------------------------------------------------------------------
from scipy.spatial import cKDTree
import numpy as np

def bsqi(refqrs, testqrs, agw=0.05, fs=200, return_dict=False):

    """
    This function is based on the following paper:
        Li, Qiao, Roger G. Mark, and Gari D. Clifford.
        "Robust heart rate estimation from multiple asynchronous noisy sources
        using signal quality indices and a Kalman filter."
        Physiological measurement 29.1 (2007): 15.

    The implementation itself is based on:
        Behar, J., Oster, J., Li, Q., & Clifford, G. D. (2013).
        ECG signal quality during arrhythmia and its application to false alarm reduction.
        IEEE transactions on biomedical engineering, 60(6), 1660-1666.

    :param refqrs:                  Annotation of the reference peak detector (Indices of the peaks).
    :param testqrs:                 Annotation of the test peak detector (Indices of the peaks).
    :param agw:                     Agreement window size (in seconds)
    :param fs:                      Sampling frequency [Hz]
    :param return_dict:             If True, returns a dictionary of the metrics. Else returns F1
    :returns F1 or metrics-dict:    The 'bsqi' score, between 0 and 1.
    """

    # Convert the agreement window from seconds to samples.
    agw *= fs
    if len(refqrs) > 0 and len(testqrs) > 0:
        NB_REF = len(refqrs)
        NB_TEST = len(testqrs)

        # Match each test peak to its nearest reference peak.
        tree = cKDTree(refqrs.reshape(-1, 1))
        Dist, IndMatch = tree.query(testqrs.reshape(-1, 1))
        IndMatchInWindow = IndMatch[Dist < agw]
        # Each reference peak counts as a true positive at most once.
        NB_MATCH_UNIQUE = len(np.unique(IndMatchInWindow))
        TP = NB_MATCH_UNIQUE
        FN = NB_REF-TP
        FP = NB_TEST-TP
        Se  = TP / (TP+FN)
        PPV = TP / (FP+TP)
        if (Se+PPV) > 0:
            F1 = 2 * Se * PPV / (Se+PPV)
            # Mean distance [Seconds] of the matched peaks. Computed but not returned.
            _, ind_plop = np.unique(IndMatchInWindow, return_index=True)
            Dist_thres = np.where(Dist < agw)[0]
            meanDist = np.mean(Dist[Dist_thres[ind_plop]]) / fs
        else:
            if return_dict:
                return {'TP': TP, 'FN': FN, 'FP':FP, 'Se': 0, 'PPV': 0, 'F1':0}
            else:
                return 0
    else:
        # No peaks in the reference or in the test annotation.
        if return_dict:
            return {'TP': 0, 'FN': 0, 'FP': 0, 'Se': 0, 'PPV': 0, 'F1': 0}
        else:
            return 0

    if return_dict:
        return {'TP': TP, 'FN': FN, 'FP':FP, 'Se': Se, 'PPV': PPV, 'F1':F1}
    else:
        return F1
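A small worked example of the matching arithmetic in `bsqi`, written without SciPy so it runs on its own. It swaps the `cKDTree` query for a binary search over the sorted reference peaks, which finds the same nearest neighbour in one dimension. The name `match_peaks`, the toy peak positions and the 10-sample tolerance are assumptions for illustration, and the sketch expects non-empty inputs.

```python
from bisect import bisect_left


def match_peaks(ref, test, tolerance):
    """Match each test peak to its nearest reference peak; return (TP, FN, FP, Se, PPV, F1)."""
    matched = set()
    for t in test:
        k = bisect_left(ref, t)
        # Candidates are the reference peaks just before and just after t.
        candidates = [j for j in (k - 1, k) if 0 <= j < len(ref)]
        nearest = min(candidates, key=lambda j: abs(ref[j] - t))
        if abs(ref[nearest] - t) < tolerance:
            matched.add(nearest)  # each reference peak counts at most once
    tp = len(matched)
    fn = len(ref) - tp
    fp = len(test) - tp
    se = tp / (tp + fn)
    ppv = tp / (tp + fp)
    f1 = 2 * se * ppv / (se + ppv) if se + ppv > 0 else 0.0
    return tp, fn, fp, se, ppv, f1


ref = [100, 300, 500, 700]   # toy reference peaks [samples]
test = [102, 305, 420, 698]  # toy test peaks; 420 has no reference within tolerance
tolerance = 10               # samples, i.e. 0.05 s at a hypothetical 200 Hz

result = match_peaks(ref, test, tolerance)
print(result)  # (3, 1, 1, 0.75, 0.75, 0.75)
```

Three of the four reference peaks are matched, so one reference peak is missed (FN = 1) and one test peak is spurious (FP = 1), giving Se = PPV = F1 = 0.75.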


--------------------------------------------------------------------------------
/paper/2021_Kotzen_et_al_IOP_CinC2021_Benchmarking_PPG_peak_detection .pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aim-lab/Benchmarking-PPG-Peak-Detection-Algorithms-with-Reference-ECG/0e07fd1719d8c6bc3d6a7f2caeb6df8e22f22590/paper/2021_Kotzen_et_al_IOP_CinC2021_Benchmarking_PPG_peak_detection .pdf


--------------------------------------------------------------------------------