Auscultation, the action of listening to the internal sounds of the body using a stethoscope, is a simple and effective way to diagnosis of cardiac dysfunction and cardiovascular disease. Moreover, auscultation has been widely used for fundamental cardiac diagnoses because of its non-invasive and simple diagnostic characteristics. The major purpose of auscultation is listening to the heart sound caused by the closing of the heart valves. Heart sound contains many signals related to magnitude and frequency content, splitting, and the presence of additional systolic or diastolic murmur components. There are two dominant features of heart sounds: the first heart sound (S1 ) and the second heart sound (S2 ). S1 reflects the closure of mitral and the tricuspid valves, and S2 marks the closure of the aortic and pulmonary valves.
Heart sounds are usually investigated with S1 and S2 analysis. Because S1 and S2 occur at the onset of ventricular contraction and the end of ventricular systole, respectively, the left ventricular ejection time (LVET) and the preejection period (PEP) have been estimated by heart sound analysis . Moreover, the diagnosis of valve dysfunction and hemodynamic research are widely known applications of heart sound analysis. For example, cardiac valve disorder has been classified by a neural network classifier  and blood pressure has been estimated with pattern analysis  using heart sounds. Furthermore, the relationship between S2 and aortic blood pressure has been shown . Although heart sounds are a useful tool clinically, they are hard to analyze from the non-stationary characteristic. That is, making a cardiac disorder diagnosis based on heart sounds is still a considerable challenge [5-12]. Thus, the accurate localization of S1 and S2 is an important issue for heart sound analysis and clinical applications.
Previous localization methods for heart sounds can be summarized as filtering, transform-based de-noising, and feature classification. In particular, wavelet de-noising has been researched to enable accurate heart sound localization [13-15]. There have been several approaches to S1 and S2 localization, such as spectral tracking , Shannon energy [17, 18], boundary modification method , time-delayed neural network , ECG-gated method , entropy based detection , homomorphic filtering , complexity based segmentation , hidden Markov model , and wavelet transform [24, 25]. Recently, wavelet de-noising and ECG-gated peak-peaking methods have been improved by an ensemble average method for more accurate localization . Also, in 2012, S1 /S2 heart sounds detection based on music beat tracking has been researched .
Most heart sound localization methods are based on the energy of the features. A problem with energy-based detection methods is the low amplitude of S1 and S2 components; they can easily be buried in background noise, murmurs, or lung sounds, and they are not reflected in the energy envelope of the heart sound . Moreover, it has been shown that the amplitude of the heart sound waveform is changed by respiration ; the amplitude of the heart sound waveform is attenuated by inspiration. The ECG-gated localization method, a commonly used technique for the localization, is based on ECG features, such as the R-wave. ECG-gated localization provides relatively stable localization . However, ECG should also be measured as a reference in the ECG-gated localization method. Moreover, for a variety of pathological conditions, there may be detection errors because of inconsistencies between the electrical and mechanical activities of the heart .
The aim of the present study is the localization of S1 and S2 in heart sounds. In this study, features were discriminated from a phonocardiogram only. To improve localization performance, cascade integrator comb (CIC) filtering and an energy- and temporal-based localization method were developed. An experiment was performed in spontaneous and a 0.1 Hz or 0.25 Hz respiration-controlled environment to produce practical respiration effects and distortion. Statistical methods were used for a quantitative evaluation of the proposed algorithm.
2. Experimental Setup
Eighteen young, healthy subjects (11 males, 7 females, mean age is 24.1, range 17-30 years old) participated. Prior to the experiment, the subjects were requested to provide information on their physical condition. None reported any taking medicine or signs of cardiovascular or respiratory disease. Each experiment was performed in a typical sports medicine laboratory at ambient room temperature, from 11 am to 6 pm. Drinking and smoking were prohibited during 24 hours and 2 hours before experiments, respectively. Anthropometric data, such as height and weight, were measured and are summarized in Table 1.
Table 1.Subject demographic data (a Systolic Blood Pressure, b Diastolic blood pressure, c Heart rate (beats per minute)
2.2 Devices and software
Heart sounds were recorded with a MP150 (Biopac Systems, inc., CA, USA), DA 100C acquisition module, and SS17L contact microphone. Heart sounds were measured on the chest surface between the 3rd and 4th ribs, and the ECG was measured on the chest surface with a Lead II configuration (Fig. 1). Acknowledge 3.9.1 (BIOPAC, USA) was used for real-time monitoring and data storage. For signal conditioning, localization, and statistical analyses, MATLAB 2008b (Mathworks, Inc., Natick, MA, USA) was used and an HEM-907(Omron, Kyoto, Japan) was used for blood pressure measurements.
Fig. 1.Location of heart auscultation and ECG measurements
2.3 Experimental protocol
Heart sounds were measured, including spontaneous and controlled respiration effects. Respiration was controlled at 0.1 Hz (low rate) and 0.25 Hz (high rate) in a supine position using a metronome. In the sitting position, there was no respiration controlling. Experiments were carried out sequentially in the following order: spontaneous, 0.1 Hz controlled, and 0.25 Hz controlled respiration in the supine position, and spontaneous respiration in the sitting position. Subjects had a 2-min rest period between experiments. Data were sampled at a 1 kHz sampling rate for a 5-min period.
3. Signal Conditioning and Localization
3.1 Cascade integrator comb filter
CIC filtering has been used for biomedical signal conditioning and feature enhancement . A CIC filter, which is a linear-phase filter, was implemented for signal conditioning of the heart sounds. Linear phase-filtering, which prevents phase and timing distortion, is important because the heart sound has non-stationary characteristic. A CIC filter is composed of a high-pass and a low-pass filter and its transfer function is described in (1) and (2). Nhpf, Nlpf, and M indicate the high-pass filter order, low-pass filter order, and number of cascaded stages, respectively; their values were 16, 8, and 2, respectively, in the filter design. Nhpf, Nlpf, and M had a 43.2 Hz center frequency and pass-band between 26.6 Hz and 63 Hz (3 dB point). The frequency characteristics of the CIC filter are presented in Fig. 2(a).
3.2 Moving average filter
A general heart sound is described with a high-frequency signal. However, specific features, such as S1 and S2, are decided by the value of the envelope. From the natural characteristics of heart sounds, the envelope detected by the Hilbert transform contains many tiny fluctuations that could have an effect on specific feature extraction. These fluctuations are also regarded as high frequency noise in feature extraction; thus, a moving-average filter (MAF) was used for envelope smoothing as a kind of low-pass filter. The frequency response of MAF is shown in (3). L is the MAF length and it was of 0.15 s duration (cut-off frequency is ~3 Hz). Fig. 2(b) shows the frequency characteristics of MAF.
Fig. 2.Frequency characteristic of filters: (a) Cascade integrator comb filter. The center frequency is 43.2 Hz and the pass-band is 26.6 to 63 Hz; (b) Moving average filter (cut-off frequency is ~3 Hz)
3.3 Hilbert transform
The Hilbert transform is proper for time-varying signal analysis, used frequently for envelope detection with heart sounds . The analytic function of z[n] contains a real-value function x[n] and a complex value function , which is the Hilbert transform pair of x[n] . The analytical signal and envelope detection equations are described in (4) and (5), respectively.
3.4 S1 and S2 Localization
The heart sound signal is processed sequentially, with CIC filtering, the Hilbert transformation, and movingaverage filtering. Then, the S1 and S2 peaks are detected from the filtered envelope of the heart sound. S1 peaks were detected by a zero-crossing method, based on signal derivation. For S2 localization, an energy based detection method was used within the S1 intervals. The S1-gated window was determined from the (n-1)th position of S1 (PS1n-1) to the nth position of S1 (PS1n). A 0.2-s refractory period was also defined to reduce misdetection for the former and latter S1 location. Thus, the scanning window had a range between PS1n-1 + 0.2 and PS1n - 0.2. Within the scanning window, S2 was localized by taking the maximum value of the energy waveform of the heart sound.
The reference peak position was detected by an ECG-gated detection method and the Hilbert transformation. The S1 location was decided within a 250 ms detection window (P1) following the R peak of the ECG waveform. The S2 location was detected by the maximum energy within the specified range (P2). P2 starts from the S1 location to the next R peak of the ECG. Each detected S1 and S2 was corrected manually (Fig. 3).
Fig. 3.Localization method of the reference S1 and S2 positions
3.5 Peak correction
The envelope of the heart sound has two distinguishable peaks, at S1 and S2. In using time difference-based classification method, the detection of both S1 and S2 should precede S1 and S2 classification. The detected peaks are typically located at the S1 and S2 positions. However, redundant peaks or missed peaks may still remain. In the pre-processing stage of S1 localization, mis-detected peak correction is directly related to detection accuracy. The proposed localization method is based on time difference characteristics between S1 and S2. Because the S2-S1 interval is generally longer than the S1-S2 interval, the difference in peak time interval (PTI) can be described with an up-and-down curve. If we assume that every peak was detected correctly, then the PTI curve should be repeated with a regular up-and-down pattern. However, a PTI curve could show an irregular trend from missed or redundant peaks. In the missed peak case, PTI becomes higher than the normal upper limit (6) and redundant peaks causes a lower PTI value than the other lower peaks (7). To remove these irregularities, a peak correction method was used. Peak correction was carried out with two processes: False Positive Peak correction (FPPC) and False Negative Peak correction (FNPC). FPP correction modifies PTI values using (8); a redundant peak time interval at the nth PTI, PTIn, is removed by comparison between the former and the latter PTI value. In the FNP correction, in contrast, a new PTI is added using (9). The estimated S1 and S2 intervals are decided by average values of 50% higher or lower.
Moreover, both S1 and S2 should be detected after the signal conditioning procedures and classified with PTI analysis for S1 localization. Fig. 4 shows the PTI analysis procedure. PTI can be represented with an up-and-down plot in the general case, and the up-to-down breakpoint indicates the S1 location.
Fig. 4.Peak time interval (PTI) analysis and S1 localization: (a) Detected peaks and intervals between S1 and S2; (b) PTI and S1 localization. S1 is located at the up-to-down breakpoint.
3.6 Replace, add, and delete algorithm
As noted previously, a representative detection problem can be summarized with FPP and FNP. In S1 classification, a similar problem occurs. Typically, FNP means missed peaks or confusing S1 and S2. Moreover, FPP means that there are redundant peaks around the actual S1 location. To improve this detection problem, a replace, add, and delete (RAD) algorithm is proposed.
Replace, add, and delete procedures are defined by PTI range. In the replace procedure, when S2 is detected as a substitute of S1, PTI is between Limith1 and Limith2 (Limith1 < PTI < Limith2). If PTI is longer than the replace interval (Limith2 < PTI), it is regarded as an add procedure. The delete procedure is invoked when PTI is shorter than Limitl (Limitl > PTI). Classification criteria, Limith1, Limith2, and Limitl , are determined using (10), (11) and (12), respectively, with empirical constant, and they are adaptively changed by the previous S1 interval.
Fig. 5 shows three detection problems and correction methods such as ‘Replace’, ‘Add’ and ‘Delete’ algorithm. In Fig. 5, dotted box shows the detection problem with S1 detection. Fig. 5(a) indicates a false detection of S1 when PTI is between Limith1 and Limith2. Fig. 5(b) shows the corrected result of the ‘replace’ procedure for (a). A new S1 is detected with the mean beat rate (MBR) and detection range (DR). MBR is decided with the average value of the previous S1 time difference and DR is 0.2 s, empirically. The second detection problem is shown in Fig. 5(c). A S1 is missed when PTI is longer than Limith2. Fig. 5(d) is the result of the ‘add’ procedure. In add procedure, S1 is added by the same rule of replace method. However, in add procedure, the next S1 would not be deleted because a S1 is missed. The last detection problem is represented in Fig. 5(e). S1 is detected again when PTI is shorter than Limitl. Fig. 5(f) shows the result of the ‘delete’ procedure. In this procedure, the latter S1 is deleted between two adjacent S1, and the former S1 is retained.
Fig. 5.S1 detection problems and results of the Replace, Add, and Delete (RAD) algorithm. A solid circle indicates a newly added S1 and a dotted circle indicates a deleted S1: (a) False detection of S1; (b) The correction result by the ‘Replace’ procedure of (a); (c) Detection failure of S1; (d) The corrected result by the ‘Add’ procedure of (c); (e) Duplicated S1 detection; (f) The result of the ‘Delete’ procedure of (e).
Fig. 6 shows a flowchart of the RAD algorithm and S2 detection. The RAD algorithm is repeated until every S1 is corrected. The S2 localization depends on S1 detection and it is represented in Fig. 5(b). S2 is determined by the maximum value position between two adjacent S1.
Fig. 6.Flowchart of the RAD algorithm and S2 detection: (a) RAD algorithm for S1 correction; (b) S2 localization procedure. MBR = mean beat rate, PI = peak interval, PPn = n-th peak position.
For localization performance evaluation, the true positive (TP), false positive (FP), false negative (FN) and failed detection (FD) were calculated. Then, we derived the sensitivity (SE), positive predictivity (+P), and failed detection rate (FDR) to evaluate the accuracy of the detection algorithm. Detected peaks were verified with a 200-ms approval range from the reference peak positions that were obtained by ECG-gated detection. Sensitivity was calculated using (13), +P using (14), and FDR using (15). Evaluations were carried out for both S1 and S2 localizations.
5.1 Signal conditioning
The signal conditioning results are shown in Fig. 7. Fig. 7(a) represents the original heart sound waveform, and (b), (c), and (d) are the CIC-filtered signal, Hilbert-transformed signal, and moving-average-filtered signal, respectively. In Fig. 7(d), the detected S1 and S2 peaks are described.
Fig. 7.Signal conditioning procedure and digital filtering: (a) Original heart sound; (b) CIC-filtered result (M = 2, Nlpf = 16, Nhpf = 32); (c) Hilbert-transformation result; (d) Moving average integrator result (window size=0.15 s). ‘▼’ indicates a detected peak.
5.2 Peak correction and classification
The initially detected peaks after signal conditioning are processed with the FPPC and FNPC methods. The FPPC and FNPC results are represented in Fig. 8. Initially detected peaks are shown in Fig. 8(a), and Fig. 8(b) shows the PTI of (a). FPP and FNP could be detected with detection criteria (1) and (2), respectively, described as relatively low and high PTI values in Fig. 8(b). The corrected peak and its PTI are presented in Figs. 8(c) and 8(d), respectively. After peak correction, the S1 and S2 locations are decided at the up-or-down peaks of the PTI curve.
Fig. 8.Correction and classification of the detected peaks after signal conditioning: (a) Detected peaks by differential method; (b) Peak time intervals of (a). False positive peaks (FPP) and false negative peaks (FNP) were detected in this stage; (c) Corrected peaks. (d) Peak time intervals of (c). A longer time interval (UP peak) means a S1 and shorter time interval (Down peak) means a S2 peak.
5.3 Replace, Add, and delete correction
Classified S1 peaks were also corrected with the RAD algorithm. Fig. 9 shows an example of S1 correction with the RAD algorithm. Fig. 9(a) and (b) show the initially detected S1 and its PTI. RAD correction was carried out using (8), (9), and (10). Figs. 9(c) and (d), Figs. 9(e) and (f), and Figs. 9(g) and (h) show the replace, add, and delete procedures and the PTI, respectively.
Fig. 9.An example of the S1 correction method: (a) S1 detection result before RAD algorithm. There is a misdetected peak between the 3rd and 5th peaks; (b) Peak time interval (PTI) of (a). It was found to be a PTI>LimitHIGH1 peak; (c) ‘Replace’ procedure; (d) PTI of (c); (e) ‘Add’ procedure; (f) PTI of (e); (g) ‘Delete’ procedure; (h) PTI of (g).
5.4 Statistical evaluation
In the statistical analysis, SE, +P, and FDR were calculated to evaluate the classification performance at each stage. The results of S1 and S2 localization are presented in Tables 2 and 3. In S1 localization, SE was increased, 87.57% to 97.27%, and +P was increased, 99.72% to 99.94%. FDR decreased, 14.48% to 2.86%. Except for slight fluctuation in +P, each performance evaluation index was increased in every correction stage.
Table 2.a Total number of reference S1 b Total number of detected peaks
Table 3.a Total number of reference S2 b Total number of detected peaks
Performance for S2 localization was evaluated in the same way as S1. In S2 localization, SE increased, 83.94% to 94.9%, and +P was 100% in each case. FDR decreased, 19.13% to 5.27%. Each performance evaluation index of S2 also increased in every correction stage. Overall localization performance for S1 was better than for S2 except in terms of +P.
6.1 Performance evaluation
The proposed method was validated by a quantitative study with 27,082 pulses, posture changes, and respiratory variation. Localization performance was most improved by CIC filtering; SE and FDR were improved by 6.21% and 7.8%, respectively, in the S1 case. In the S2 case, they were also improved, by 6.1% and 8.07%. +P was not changed significantly. Remaining misdetected peaks after frequency domain-filtering were corrected by peak correction and the RAD algorithm. These algorithms improved FDR by 1.05% and 1.95% in the S1 case. These results could be considered as slight improvement; however, it is meaningful to recover from missing or extremely distorted peaks. Because S2detection depends largely on S1 results, S2 detection performance had a lower value than S1 results; however, +P was always 100%.
A time-dependent classification algorithm is hard to adapt to an irregular heartbeat. For example, the heart sound of an arrhythmia patient could generate more errors because a correction criterion is affected by the previous PTI and specific constant. An error could even be propagated in certain cases.
In S2 localization, the S2 detection rate depends on the S1 detection result. Thus, failed detection of S1 could be a reason of the failure of S2 localization. The proposed algorithm focuses on a post-peak correction method of heart sounds; however, it does not provide a noise reduction method for the inherent problems of heart sounds. Major heart sound problems, such as lung sounds, murmurs, respiration effects, and signal distortion in female subjects, were not addressed significantly in this paper.
6.3 Concluding remarks
In this study, only heart sound data were used to localize S1 and S2 features, with no other reference signal. This is a novel study in heart sound localization because it may provide an efficient and accurate method for heart sound applications. From the research presented, it appears that the proposed first and second heart sound localization method can improve localization performance. It may be a useful method for robust peak detection in various applications, such as LVET/PEP detection and cardiac disorder diagnoses. Although there are still unsolved problem related to abnormal heart activity and optimization for robust detection, we believe that the proposed method can provide simple and reliable heart sound feature detection.