• Title/Summary/Keyword: Spectrogram Analysis

Feasibility of Deep Learning-Based Analysis of Auscultation for Screening Significant Stenosis of Native Arteriovenous Fistula for Hemodialysis Requiring Angioplasty

  • Jae Hyon Park;Insun Park;Kichang Han;Jongjin Yoon;Yongsik Sim;Soo Jin Kim;Jong Yun Won;Shina Lee;Joon Ho Kwon;Sungmo Moon;Gyoung Min Kim;Man-deuk Kim
    • Korean Journal of Radiology / v.23 no.10 / pp.949-958 / 2022
  • Objective: To investigate the feasibility of using a deep learning-based analysis of auscultation data to predict significant stenosis of arteriovenous fistulas (AVF) in patients undergoing hemodialysis requiring percutaneous transluminal angioplasty (PTA). Materials and Methods: Forty patients (24 male and 16 female; median age, 62.5 years) with dysfunctional native AVF were prospectively recruited. Digital sounds from the AVF shunt were recorded using a wireless electronic stethoscope before (pre-PTA) and after PTA (post-PTA), and the audio files were subsequently converted to mel spectrograms, which were used to construct various deep convolutional neural network (DCNN) models (DenseNet201, EfficientNetB5, and ResNet50). The performance of these models for diagnosing ≥ 50% AVF stenosis was assessed and compared. The ground truth for the presence of ≥ 50% AVF stenosis was obtained using digital subtraction angiography. Gradient-weighted class activation mapping (Grad-CAM) was used to produce visual explanations for DCNN model decisions. Results: Eighty audio files were obtained from the 40 recruited patients and pooled for the study. Mel spectrograms of "pre-PTA" shunt sounds showed patterns corresponding to abnormal high-pitched bruits with systolic accentuation observed in patients with stenotic AVF. The ResNet50 and EfficientNetB5 models yielded an area under the receiver operating characteristic curve of 0.99 and 0.98, respectively, at optimized epochs for predicting ≥ 50% AVF stenosis. However, Grad-CAM heatmaps revealed that only ResNet50 highlighted areas relevant to AVF stenosis in the mel spectrogram. Conclusion: Mel spectrogram-based DCNN models, particularly ResNet50, successfully predicted the presence of significant AVF stenosis requiring PTA in this feasibility study and may potentially be used in AVF surveillance.
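
The audio-to-mel-spectrogram step described in this abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' pipeline: the abstract does not state the sampling rate, window length, hop size, or number of mel bands, so `n_fft`, `hop`, and `n_mels` are assumed values, and the HTK-style mel formula and unnormalized triangular filters are one common convention among several.

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style mel scale (an assumed convention)
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters spaced evenly on the mel scale from 0 Hz to sr/2."""
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, ctr, hi = edges[i:i + 3]
        rising = (fft_freqs - lo) / (ctr - lo)
        falling = (hi - fft_freqs) / (hi - ctr)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def mel_spectrogram(x, sr, n_fft=512, hop=128, n_mels=40):
    """Return a (n_mels, n_frames) mel power spectrogram of a 1-D signal."""
    x = np.asarray(x, dtype=float)
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # shape (n_frames, n_bins)
    return mel_filterbank(sr, n_fft, n_mels) @ power.T
```

In practice the result is usually converted to decibels and rendered as an image before being passed to a convolutional network; that step is omitted here.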

Detection of Arousal in Patients with Respiratory Sleep Disorder Using Single Channel EEG (단일 채널 뇌전도를 이용한 호흡성 수면 장애 환자의 각성 검출)

  • Cho, Sung-Pil;Choi, Ho-Seon;Lee, Kyoung-Joung
    • The Transactions of the Korean Institute of Electrical Engineers D / v.55 no.5 / pp.240-247 / 2006
  • Frequent arousals during sleep degrade sleep quality and cause sleep fragmentation. Visually inspecting physiological signals to detect arousal events is cumbersome and time-consuming. The purpose of this study is to develop an automatic algorithm for detecting arousal events. The proposed method is based on time-frequency analysis and a support vector machine classifier using a single-channel electroencephalogram (EEG). To extract features, we first computed six indices that capture information about a subject's sleep state. Next, the power in each of four frequency bands was computed from the spectrogram of the arousal region. Finally, we computed the variations in EEG frequency power to detect arousals. Performance was assessed using polysomnographic (PSG) recordings of twenty patients with sleep apnea, snoring, and excessive daytime sleepiness (EDS). The method achieved a sensitivity of 79.65% and a specificity of 89.52% on these data sets, showing that it is effective for detecting arousal events.
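
The band-power feature described in this abstract can be illustrated with a short sketch. The abstract says four frequency bands were used but does not name them, so the band edges in `BANDS` are hypothetical (the conventional delta, theta, alpha, and beta ranges), and the window length and hop size are likewise assumed.

```python
import numpy as np

# Hypothetical band edges in Hz; the abstract does not specify its four bands.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
}

def spectrogram(x, fs, n_fft=256, hop=128):
    """Hann-windowed power spectrogram; returns (freqs, power[n_frames, n_bins])."""
    x = np.asarray(x, dtype=float)
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    return freqs, power

def band_powers(x, fs):
    """Sum spectrogram power inside each band, giving one value per time frame."""
    freqs, power = spectrogram(x, fs)
    result = {}
    for name, (lo, hi) in BANDS.items():
        mask = (freqs >= lo) & (freqs < hi)
        result[name] = power[:, mask].sum(axis=1)
    return result
```

Frame-to-frame changes in these values (for example `np.diff(band_powers(x, fs)["alpha"])`) would be one way to express the power variations the abstract mentions, before passing features to a classifier.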

Speech Problems of English Laterals by Korean Learners based on the acoustic Characteristics (한국인 영어 학습자의 설측음 발화의 문제점: 음향음성학적 특성을 중심으로)

  • Kim, Chong-Gu;Kim, Hyun-Gi;Jeon, Byung-Man
    • Speech Sciences / v.7 no.3 / pp.127-138 / 2000
  • The aim of this paper is to identify the problems Korean learners have in producing English laterals and to contribute to effective pronunciation teaching by visualizing pronunciation. We analyzed 18 words containing lateral sounds, divided by position into initial, initial consonant cluster, intervocalic, final consonant cluster, and final. The words were analyzed with a high-speed speech analysis system. We examined the acoustic characteristics of English laterals in spectrograms using voice sustained time (ms), FL1, FL2, and FL3. We had expected the mother tongue to interfere in the final sounds, because Korean has similar sounds. The results showed that, in initial position, voice sustained time showed many more differences between Korean and native pronunciation. Korean speakers also applied the syllable structure of their mother tongue. For instance, in the initial consonant cluster CCVC, Koreans often produced CC as one syllable and VC as another, which is attributable to mother tongue interference. For the same reason, differences between Korean and native speakers also appeared in intervocalic and final positions. We therefore recommend adopting a visualized analysis system in pronunciation instruction.

A Comparative Study of the Speech Signal Parameters for the Consonants of Pyongyang and Seoul Dialects - Focused on "ㅅ/ㅆ" (평양 지역어와 서울 지역어의 자음에 대한 음성신호 파라미터들의 비교 연구 - "ㅅ/ ㅆ"을 중심으로)

  • So, Shin-Ae;Lee, Kang-Hee;You, Kwang-Bock;Lim, Ha-Young
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.8 no.6 / pp.927-937 / 2018
  • This paper presents a comparative study of the consonants of the Pyongyang and Seoul dialects of Korean from the perspective of signal processing, which can be regarded as the basis of engineering applications. Until now, most speech signal studies have focused primarily on vowels, which play an important role in language evolution. In any language, however, the number of consonants is greater than the number of vowels, so research on consonants is also important. Building on a study of Pyongyang dialect vowels conducted with phonological and experimental phonetic methods, this paper examines consonants using an engineering approach. The alveolar consonants, which show many differences in phonetic value between the Pyongyang and Seoul dialects, were used as the experimental data. The major parameters of speech signal analysis (formant frequency, pitch, and spectrogram) were measured. The phonetic values of the two dialects were compared with respect to /시/ and /씨/ in Korean. This study can serve as a basis for future voice recognition and voice synthesis.

A Study on 3-Dimensional Near-Field Source Localization Using Interference Pattern Matching in Shallow Water Environments (천해에서 간섭패턴 정합을 이용한 근거리 음원의 3차원 위치추정 기법연구)

  • Kim, Se-Young;Chun, Seung-Yong;Son, Yoon-Jun;Kim, Ki-Man
    • The Journal of the Acoustical Society of Korea / v.28 no.4 / pp.318-327 / 2009
  • In this paper, we propose a 3-D geometric localization method for a near-field broadband source in shallow water environments. According to waveguide invariant theory, the slope of the interference pattern seen in a sensor spectrogram is directly proportional to the range of the source. The ratio of the ranges between the source and two sensors was estimated by matching the two interference patterns in their spectrograms. This ratio is then applied to the circle of Apollonius, which is the locus of source positions whose range ratio to two sensors is constant. Two circles of Apollonius obtained from three sensors intersect at a point that gives the horizontal range and azimuth angle of the source. This intersection point does not change with source depth. The source depth can therefore be estimated using a 3-D hyperboloid equation, on which the range difference from two sensors is constant. To evaluate the performance of the proposed localization algorithm, a simulation was performed using an acoustic propagation program, and the localization error was analyzed. In the simulation results, the estimation errors for range and depth were within 50 m and 15 m, respectively.
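
The circle of Apollonius used in this abstract has a closed form that a short sketch can make concrete. For sensors at A and B and a range ratio k = |PA| / |PB| with k ≠ 1, the locus of P is a circle with center (A − k²B) / (1 − k²) and radius k|A − B| / |1 − k²|. The function name and the 2-D tuple representation below are illustrative choices, not taken from the paper.

```python
import math

def apollonius_circle(a, b, k):
    """Circle of points P in the plane with |PA| / |PB| = k (k > 0, k != 1).

    a, b: (x, y) sensor positions; returns ((cx, cy), radius).
    """
    if k <= 0 or math.isclose(k, 1.0):
        # k == 1 gives the perpendicular bisector of AB, which is a line
        raise ValueError("k must be positive and different from 1")
    k2 = k * k
    cx = (a[0] - k2 * b[0]) / (1.0 - k2)
    cy = (a[1] - k2 * b[1]) / (1.0 - k2)
    radius = k * math.dist(a, b) / abs(1.0 - k2)
    return (cx, cy), radius
```

For example, with sensors at (0, 0) and (3, 0) and k = 2, the circle is centered at (4, 0) with radius 2; the point (2, 0) on it lies 2 units from the first sensor and 1 unit from the second. Intersecting two such circles from three sensors would give the horizontal position described in the abstract.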

The Utility of Perturbation, Non-linear dynamic, and Cepstrum measures of dysphonia according to Signal Typing (음성 신호 분류에 따른 장애 음성의 변동률 분석, 비선형 동적 분석, 캡스트럼 분석의 유용성)

  • Choi, Seong Hee;Choi, Chul-Hee
    • Phonetics and Speech Sciences / v.6 no.3 / pp.63-72 / 2014
  • The current study assessed the utility of the acoustic analyses most commonly used in routine clinical voice assessment, including perturbation analysis, nonlinear dynamic analysis, and spectral/cepstral analysis, based on signal typing of dysphonic voices, and investigated the applicability of these clinical acoustic analysis methods. A total of 70 dysphonic voice samples were classified by signal type using narrowband spectrograms. The traditional parameters %jitter, %shimmer, and signal-to-noise ratio (SNR) were calculated for the signals using TF32, and the correlation dimension (D2), a nonlinear dynamic parameter, and spectral/cepstral measures including mean CPP, CPP_sd, CPPf0, CPPf0_sd, L/H ratio, and L/H ratio_sd were also calculated with ADSV (Analysis of Dysphonia in Speech and Voice™). Auditory-perceptual analysis was performed by two blinded speech-language pathologists using GRBAS. The results showed that the nearly periodic Type 1 signals were all from functional dysphonia, whereas Type 4 signals comprised neurogenic and organic voice disorders. Only Type 1 voice signals were reliable for perturbation analysis in this study. Significant differences related to signal type were found in all acoustic and auditory-perceptual measures. SNR, CPP, and L/H ratio values for Type 4 were significantly lower than those of the other voice signals, and significantly higher %jitter and %shimmer were observed in Type 4 voice signals (p<.001). Additionally, as signal type increased, D2 values increased significantly and more complex, nonlinear patterns appeared. Nevertheless, D2 could not be obtained for voice signals with a high noise component associated with breathiness. In particular, CPP was more sensitive to the voice qualities 'G', 'R', and 'B' than any other acoustic measure. Thus, spectral and cepstral analyses may be applied to more severely dysphonic voices such as Type 4 signals, and CPP may be a more accurate and predictive acoustic marker of voice quality and severity in dysphonia.
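
The perturbation measures named in this abstract can be illustrated with the common "local" definitions: mean absolute difference between consecutive cycles, divided by the mean, expressed as a percentage. This is a sketch under that assumption; the abstract does not say which formula TF32 uses, and extracting the cycle periods and peak amplitudes from a recording (the hard part for aperiodic signals) is taken as already done.

```python
import numpy as np

def percent_perturbation(values):
    """Mean absolute cycle-to-cycle difference as a percentage of the mean."""
    v = np.asarray(values, dtype=float)
    if v.size < 2:
        raise ValueError("need at least two cycles")
    return 100.0 * np.mean(np.abs(np.diff(v))) / np.mean(v)

def percent_jitter(periods):
    """Local %jitter from consecutive glottal cycle durations (any time unit)."""
    return percent_perturbation(periods)

def percent_shimmer(peak_amplitudes):
    """Local %shimmer from consecutive cycle peak amplitudes."""
    return percent_perturbation(peak_amplitudes)
```

Both measures depend on reliable cycle boundaries, which is consistent with the abstract's finding that perturbation analysis was reliable only for nearly periodic Type 1 signals.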

An Analysis of Preference for Korean Pop Music By Applying Acoustic Signal Analysis Techniques (음향신호분석 기술을 적용한 한국가요의 시대별 선호도 분석)

  • Cho, Dong-Uk;Kim, Bong-Hyun
    • The KIPS Transactions:PartD / v.19D no.3 / pp.211-220 / 2012
  • K-pop has recently gained sensational worldwide popularity and is no longer limited to the domestic pop music scene. One of the main causes may be that most K-pop songs are "hook songs" that rely on the "hook effect": a certain melody and/or rhythm is repeated up to 70 times in one song so that it hooks the listener's ear. The visual effects of K-pop dance groups are also thought to contribute to this popularity. In this paper, we propose a method that traces changes in preference for Korean pop music over time and investigates their causes using acoustic signal analysis. To this end, acoustic signal analysis experiments were performed on Korean pop music by popular female singers from the 1960s to the present. The experimental results show that acoustic signal processing techniques make it possible to discriminate between periods on the basis of scientific evidence. In addition, the techniques yield quantitative, objective, numerical data, in contrast to the subjective and statistical data of existing methods.

Visual.Auditory.Acoustic Study on Singing Vowels of Korean Lyric Songs (시각과 청각 및 음향적 관점에서의 노랫말 모음 연구)

  • Lee Jai Kang
    • Proceedings of the KSPS conference / 1996.10a / pp.362-366 / 1996
  • This paper is divided into two parts. The first is a study of the vowels in Korean singers' lyric songs in terms of Daniel Jones' Cardinal Vowels. The second is an acoustic study of the vowels in the author's own singing of Korean lyric songs. The analysis data are a KBS concert video tape and CSL .NSP files of the author's singing; the informants are famous singers (3 sopranos, 1 mezzo, 2 tenors, and 1 baritone) and the author. The aim of the analysis is to determine the quality of the eight Korean vowels ([equation omitted]) in singing. The vowels are described in terms of close, half-close, half-open, and open vowels, rounded and unrounded vowels, and formants. The first study was carried out by watching the monitor screen and stopping at the scene to be analyzed. The second was carried out by analyzing spectrograms converted from the CSL .SP files. The results are as follows. Visually and auditorily, Korean vowel quality in singing shows three tendencies: the vowels are more rounded than usual Korean vowels, they are centralized toward the center point of the Cardinal Vowel chart, and they are diverse in quality. The acoustic analysis was based on four formants. F1 and F2 show a pattern similar to that in speech. F1 has the same formant values, which seems to be because the vocal organs perceive the singing situation. The width of F3 is the widest of all, so F3 may be characteristic of singing. In conclusion, the vowels in Korean lyric songs tend to show rounding, centralization toward the center point of the Cardinal Vowel chart, diversity in vowel quality, and the widest F3 width compared with usual Korean vowels.

Fault Diagnosis and Analysis Based on Transfer Learning and Vibration Signals (전이 학습과 진동 신호를 이용한 설비 고장 진단 및 분석)

  • Yun, Jong Pil;Kim, Min Su;Koo, Gyogwon;Shin, Crino
    • IEMEK Journal of Embedded Systems and Applications / v.14 no.6 / pp.287-294 / 2019
  • With the automation of production lines in the manufacturing industry, real-time fault diagnosis of facilities is becoming increasingly important. In this paper, we propose a deep learning-based fault diagnosis algorithm for LM (Linear Motion) guides that uses vibration signals. In general, a sufficient amount of data is needed to guarantee the performance of deep learning, but in manufacturing it is often difficult to obtain enough data because of physical and time constraints. To solve this problem, we propose a convolutional neural network (CNN) model based on transfer learning. In addition, a spectrogram image is used as input to the CNN to reflect how the frequency characteristics of the vibration signals change over time. Fault diagnosis performance under various load conditions and transfer learning methods was compared and evaluated experimentally. The results showed that the proposed algorithm exhibited excellent performance.

The Effect of Tonsillectomy and Adenoidectomy on Acoustic Factors (구개편도 및 아데노이드 절제술이 음향학적 자질에 미치는 영향)

  • 임성태;손진호;유정운;강지원;이현석;신승헌;박재율
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.9 no.1 / pp.38-42 / 1998
  • It has been reported that tonsillectomy and adenoidectomy (T & A) change the voice through direct structural changes to the vocal tract. We studied the effect of T & A on patients' voices by comparing the pre-operative and post-operative voice. The analysis was performed using a Computerized Speech Lab (CSL50), which is currently used as a method for voice analysis. Forty-five patients who underwent T & A, aged 3 to 42 years, took part in the study and were evaluated for voice changes and the degree of formant change in four basic vowels, /a/, /i/, /o/, and /u/. They were evaluated pre-operatively and one month post-operatively using the MDVP and CSL programs of the CSL50. The results were as follows. With MDVP, there were some differences between pre-operative and post-operative shimmer measures within the normal range, but the other acoustic measures (F0, jitter, NHR) showed no significant differences (p>0.05). In the CSL spectrograms of patients who had tonsillectomy only, F3 of /a/ and /o/ decreased significantly (p<0.05), and F2 and F3 of /i/ increased (p>0.05). For the patients who had T & A, F1 and F3 of /a/, F3 of /i/, and F1, F2, and F3 of /o/ decreased, with a significant increase in F1 and F2 of /i/ (p<0.05).