• Title/Summary/Keyword: 3-D Spectrogram

Study on Discrimination between Natural Earthquakes and Man-made Explosions using Wonju KSRS Data (원주 KSRS 자료를 이용한 자연지진과 인공지진 구별에 관한 연구)

  • Kang, Ik-Bum;Kim, Sung-Bae;Suh, Man-Cheol;Jun, Myung-Soon
    • Journal of the Korean Geophysical Society / v.3 no.1 / pp.25-36 / 2000
  • 3-D spectrograms were drawn for 22 events to determine whether they were earthquakes or explosions. In general, the P-phase amplitude is more dominant for explosions than for earthquakes. After free-surface effects are removed from the three-component (U-D, N-S, E-W) seismograms, the logarithm of the P (Pn, Pg)/Lg spectral ratio is $-1.2{\sim}-0.9$ for earthquakes and $-0.7{\sim}-0.1$ for explosions. This is consistent with previous research (Kim Park, 1997), which suggested that a log P/Lg spectral ratio of -0.6 may serve as the criterion for discriminating between earthquakes and explosions in Korea. In addition, complexity was applied to two events as another discrimination method. The complexity value of the explosion is much smaller than that of the earthquake, which may be due to the well-developed P wave of the explosion compared with that of the earthquake. This result agrees with that of the 3-D spectrogram.
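
The threshold rule described in this abstract can be sketched as follows. This is only an illustration, not the authors' procedure: the function names are hypothetical, the logarithm is assumed to be base 10 (the abstract does not say), and the amplitudes are made-up values chosen to land in the reported ranges.

```python
import math

# Assumed threshold: the abstract cites -0.6 as a possible criterion for Korea.
LOG_RATIO_THRESHOLD = -0.6

def log_p_lg_ratio(p_amplitude, lg_amplitude):
    """Return log10 of the P/Lg spectral amplitude ratio (base 10 is an assumption)."""
    if p_amplitude <= 0 or lg_amplitude <= 0:
        raise ValueError("spectral amplitudes must be positive")
    return math.log10(p_amplitude / lg_amplitude)

def classify_event(p_amplitude, lg_amplitude, threshold=LOG_RATIO_THRESHOLD):
    """Label an event 'explosion' when its log P/Lg ratio exceeds the threshold."""
    ratio = log_p_lg_ratio(p_amplitude, lg_amplitude)
    return "explosion" if ratio > threshold else "earthquake"

# Hypothetical amplitudes, not data from the paper.
print(classify_event(0.1, 1.0))  # log ratio is -1.0, inside the earthquake range
print(classify_event(0.5, 1.0))  # log ratio is about -0.3, inside the explosion range
```

A single scalar threshold is the simplest reading of the criterion; the paper's free-surface correction and phase windowing are not reproduced here.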

Classification of infant cries using 3D feature vectors (3D 특징 벡터를 이용한 영아 울음소리 분류)

  • Park, JeongHyeon;Kim, MinSeo;Choi, HyukSoon;Moon, Nammee
    • Proceedings of the Korea Information Processing Society Conference / 2022.11a / pp.597-599 / 2022
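  • Infants express all of their needs through crying, a nonverbal means of communication. However, it is difficult to identify what an infant's cry means, and many studies have been conducted to interpret infant cries. In this paper, we propose classifying infant cries using 3D feature vectors. The Donate-a-corpus-cry dataset is used, with data classified into five classes: belly pain, burping, discomfort, hunger, and tiredness. The data are augmented through tempo adjustment, which changes the audio to 90% and 110% of its original speed. After the audio is converted into Spectrogram, Mel-Spectrogram, and MFCC feature vectors, the individual 2D feature vectors are bundled into a 3D feature vector. The 3D feature vectors are then used to train ResNet and EfficientNet models. As a result, the 2D feature vectors achieved 0.89 (F1) and the 3D feature vectors 0.98 (F1), a performance improvement of 0.09.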
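
Bundling several 2D feature maps into one 3D input can be sketched as below. This is a minimal illustration under stated assumptions: the abstract does not say how maps of different sizes are brought to a common shape, so nearest-neighbour resizing is assumed here, and the toy arrays merely stand in for a real Spectrogram, Mel-Spectrogram, and MFCC matrix. All function names are hypothetical.

```python
def resize_nearest(feature, rows, cols):
    """Resize a 2D list to rows x cols with nearest-neighbour sampling (assumed step)."""
    src_rows, src_cols = len(feature), len(feature[0])
    return [
        [feature[r * src_rows // rows][c * src_cols // cols] for c in range(cols)]
        for r in range(rows)
    ]

def stack_features(features, rows, cols):
    """Stack several 2D feature maps into one (channels, rows, cols) nested list."""
    return [resize_nearest(f, rows, cols) for f in features]

# Hypothetical toy inputs standing in for the three feature types.
spectrogram = [[float(r + c) for c in range(8)] for r in range(6)]
mel_spectrogram = [[float(r * c) for c in range(8)] for r in range(4)]
mfcc = [[float(r - c) for c in range(8)] for r in range(2)]

tensor = stack_features([spectrogram, mel_spectrogram, mfcc], rows=4, cols=8)
print(len(tensor), len(tensor[0]), len(tensor[0][0]))  # 3 4 8
```

The channel-first layout mirrors how image models such as ResNet usually take three-channel input, which is one plausible reason for stacking exactly three feature types.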

Speech Denoising via Low-Rank and Sparse Matrix Decomposition

  • Huang, Jianjun;Zhang, Xiongwei;Zhang, Yafei;Zou, Xia;Zeng, Li
    • ETRI Journal / v.36 no.1 / pp.167-170 / 2014
  • In this letter, we propose an unsupervised framework for speech noise reduction based on the recent development of low-rank and sparse matrix decomposition. The proposed framework directly separates the speech signal from noisy speech by decomposing the noisy speech spectrogram into three submatrices: the noise structure matrix, the clean speech structure matrix, and the residual noise matrix. Evaluations on the Noisex-92 dataset show that the proposed method achieves a signal-to-distortion ratio approximately 2.48 dB and 3.23 dB higher than that of the robust principal component analysis method and the non-negative matrix factorization method, respectively, when the input SNR is -5 dB.

Development of Realtime Phonetic Typewriter (실시간 음성타자 시스템 구현)

  • Cho, W.Y.;Choi, D.I.
    • Proceedings of the KIEE Conference / 1999.11c / pp.727-729 / 1999
  • We have developed a real-time phonetic typewriter implemented on an IBM PC with a sound card running Windows 95. In this system, speech signal analysis, neural network training, labeling of output neurons, and visualization of recognition results are performed in real time. The development environment for speech processing was built by adding functions such as editing, saving, and loading speech data and 3-D or gray-level display of spectrograms. Recognition experiments on Korean phones yielded an accuracy of 71.42% for the 13 basic consonants and 90.01% for the 7 basic vowels.

Some Notational Problems of the Translation of Japanese Stops [k, t] and Affricates [ts, tʃ] into Korean (일본어 파열음[k, t]과 파찰음[ts, tʃ]의 국어 표기상의 문제점)

  • Lee, Young-Hee
    • Proceedings of the KSPS conference / 2007.05a / pp.187-192 / 2007
  • The purpose of this paper is to show that the current notation of Japanese proper names in Korean has some problems: it cannot represent the distinction between voiced and voiceless sounds. A further purpose is to propose a more accurate notation that is coherent and efficient. After introducing some general background on the phonemes of Japanese, I measured the voice onset time of the stops [k, t] at the beginning, in the middle, and at the end of a word, and compared the spectrograms of affricates with those of fricatives. In conclusion, Japanese voiceless [k, t, tʃ] should be written as [ㅋ,ㅌ,ㅊ], voiced [g, d, dʒ] as [ㄱ,ㄷ,ㅈ], and the affricate [ts] as [ㅊ] in Korean.

A Study on 3-Dimensional Near-Field Source Localization Using Interference Pattern Matching in Shallow Water Environments (천해에서 간섭패턴 정합을 이용한 근거리 음원의 3차원 위치추정 기법연구)

  • Kim, Se-Young;Chun, Seung-Yong;Son, Yoon-Jun;Kim, Ki-Man
    • The Journal of the Acoustical Society of Korea / v.28 no.4 / pp.318-327 / 2009
  • In this paper, we propose a 3-D geometric localization method for a near-field broadband source in shallow water environments. According to waveguide invariant theory, the slope of the interference pattern seen in a sensor spectrogram is directly proportional to the range of the source. The ratio of the ranges between the source and the sensors is estimated by matching the two interference patterns in the spectrograms. This ratio is then applied to the Apollonius circle, which is the locus of a source whose range ratio from two sensors is constant. Two Apollonius circles obtained from three sensors intersect at a point that gives the horizontal range and azimuth angle of the source. This intersection point does not vary with source depth. The source depth can therefore be estimated using the 3-D hyperboloid equation, on which the range difference from two sensors is constant. To evaluate the performance of the proposed localization algorithm, a simulation is performed using an acoustic propagation program and the localization error is analyzed. The simulation results show that the estimation errors for range and depth are within 50 m and 15 m, respectively.
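
The horizontal step of this geometry can be sketched as below. This is an illustration under stated assumptions, not the authors' algorithm: the function names and sensor coordinates are hypothetical, and the range ratios are synthesized from a known source position instead of being estimated from interference patterns. For sensors A and B and ratio k = |PA|/|PB| with k ≠ 1, the Apollonius circle has center (A − k²B)/(1 − k²) and radius k|A − B|/|1 − k²|.

```python
import math

def apollonius_circle(a, b, k):
    """Circle of points P with |PA| / |PB| = k (k > 0, k != 1); returns (center, radius)."""
    if k <= 0 or math.isclose(k, 1.0):
        raise ValueError("k must be positive and different from 1")
    k2 = k * k
    center = ((a[0] - k2 * b[0]) / (1 - k2), (a[1] - k2 * b[1]) / (1 - k2))
    radius = k * math.dist(a, b) / abs(1 - k2)
    return center, radius

def circle_intersections(c1, r1, c2, r2):
    """Return the intersection points of two circles (empty list if none)."""
    d = math.dist(c1, c2)
    if d == 0 or d > r1 + r2 or d < abs(r1 - r2):
        return []
    along = (r1 * r1 - r2 * r2 + d * d) / (2 * d)  # distance from c1 to the chord
    h = math.sqrt(max(r1 * r1 - along * along, 0.0))  # half the chord length
    ux, uy = (c2[0] - c1[0]) / d, (c2[1] - c1[1]) / d
    mx, my = c1[0] + along * ux, c1[1] + along * uy
    return [(mx - h * uy, my + h * ux), (mx + h * uy, my - h * ux)]

# Hypothetical sensor layout in metres; the source is used only to synthesize ratios.
s1, s2, s3 = (0.0, 0.0), (100.0, 0.0), (0.0, 100.0)
source = (30.0, 40.0)
k12 = math.dist(source, s1) / math.dist(source, s2)
k13 = math.dist(source, s1) / math.dist(source, s3)
c12, r12 = apollonius_circle(s1, s2, k12)
c13, r13 = apollonius_circle(s1, s3, k13)
candidates = circle_intersections(c12, r12, c13, r13)
print(candidates)  # one of the candidates coincides with the source position
```

Two circles generally meet at two points, so the sketch returns both candidates and leaves the choice between them, as well as the depth estimate from the hyperboloid, out of scope.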

Sources separation of passive sonar array signal using recurrent neural network-based deep neural network with 3-D tensor (3-D 텐서와 recurrent neural network기반 심층신경망을 활용한 수동소나 다중 채널 신호분리 기술 개발)

  • Sangheon Lee;Dongku Jung;Jaesok Yu
    • The Journal of the Acoustical Society of Korea / v.42 no.4 / pp.357-363 / 2023
  • In underwater signal processing, separating individual signals from mixed signals has long been a challenge due to low signal quality. The common method using Short-time Fourier transform for spectrogram analysis has faced criticism for its complex parameter optimization and loss of phase data. We propose a Triple-path Recurrent Neural Network, based on the Dual-path Recurrent Neural Network's success in long time series signal processing, to handle three-dimensional tensors from multi-channel sensor input signals. By dividing input signals into short chunks and creating a 3D tensor, the method accounts for relationships within and between chunks and channels, enabling local and global feature learning. The proposed technique demonstrates improved Root Mean Square Error and Scale Invariant Signal to Noise Ratio compared to the existing method.
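
The chunking step that turns multi-channel input into a 3D tensor can be sketched as follows. This is a rough illustration only: the chunk length, the hop, the zero-padding of the last chunk, and the (channels, chunks, samples) ordering are all assumptions, and the function names are hypothetical. The recurrent network itself is not shown.

```python
def chunk_signal(signal, chunk_len, hop):
    """Split a 1D sequence into overlapping chunks, zero-padding the last one."""
    if chunk_len <= 0 or hop <= 0:
        raise ValueError("chunk_len and hop must be positive")
    chunks = []
    start = 0
    while True:
        chunk = list(signal[start:start + chunk_len])
        chunk += [0.0] * (chunk_len - len(chunk))  # pad a short final chunk
        chunks.append(chunk)
        if start + chunk_len >= len(signal):
            break
        start += hop
    return chunks

def build_tensor(channels, chunk_len, hop):
    """Return a nested list shaped (num_channels, num_chunks, chunk_len)."""
    return [chunk_signal(ch, chunk_len, hop) for ch in channels]

# Hypothetical two-channel input of 10 samples each.
channels = [[float(i) for i in range(10)], [float(-i) for i in range(10)]]
tensor = build_tensor(channels, chunk_len=4, hop=2)
print(len(tensor), len(tensor[0]), len(tensor[0][0]))  # 2 4 4
```

With this layout a model can process samples within a chunk, chunks along time, and channels as three separate paths, which is one way to read the "triple-path" idea in the abstract.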

Multiple damage detection of maglev rail joints using time-frequency spectrogram and convolutional neural network

  • Wang, Su-Mei;Jiang, Gao-Feng;Ni, Yi-Qing;Lu, Yang;Lin, Guo-Bin;Pan, Hong-Liang;Xu, Jun-Qi;Hao, Shuo
    • Smart Structures and Systems / v.29 no.4 / pp.625-640 / 2022
  • Maglev rail joints are vital components serving as connections between adjacent F-type rail sections in a maglev guideway. Damage to maglev rail joints, such as bolt looseness, may result in rough suspension gap fluctuation, failure of suspension control, and even a sudden clash between the electromagnets and the F-type rail. Condition monitoring of maglev rail joints is therefore highly desirable for maintaining safe maglev operation. In this paper, an online damage detection approach based on a three-dimensional (3D) convolutional neural network (CNN) and time-frequency characterization is developed for simultaneous detection of multiple types of damage in maglev rail joints. The training and testing data used for condition evaluation consist of two months of acceleration recordings, acquired in situ from different rail joints by an integrated online monitoring system while a maglev train was running on a test line. The short-time Fourier transform (STFT) is applied to transform the raw monitoring data into time-frequency spectrograms (TFS). Three CNN architectures, i.e., small-sized CNN (S-CNN), middle-sized CNN (M-CNN), and large-sized CNN (L-CNN), are configured for trial calculation, and the M-CNN model, with excellent prediction accuracy and high computational efficiency, is finally selected for multiple damage detection of maglev rail joints. Results show that rail joints in three different conditions (bolt-looseness-caused rail step, misalignment-caused lateral dislocation, and normal condition) are successfully identified by the proposed approach, even when using data collected from rail joints that contributed no data to the CNN training. The capability of the proposed method is further examined using data collected after the loosened bolts were replaced. In addition, by comparison with the results of a CNN using the frequency spectrum and a traditional neural network using TFS, the proposed TFS-CNN framework is shown to be more accurate and robust for multiple damage detection of maglev rail joints.
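
The STFT step that turns a raw recording into a time-frequency spectrogram can be sketched as below. This is a minimal standard-library illustration, not the paper's pipeline: the Hann window, frame length, hop, and the synthetic 50 Hz test signal are all assumptions, and the function name is hypothetical. The CNN stage is not shown.

```python
import cmath
import math

def stft_magnitude(signal, frame_len, hop):
    """Return a list of frames, each a list of magnitudes for bins 0..frame_len//2."""
    # Periodic Hann window (an assumed choice) to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        segment = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of the windowed frame at bin k.
            acc = sum(segment[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames

# Hypothetical test signal: a 50 Hz sine sampled at 400 Hz for one second.
fs = 400
signal = [math.sin(2 * math.pi * 50 * t / fs) for t in range(fs)]
tfs = stft_magnitude(signal, frame_len=64, hop=32)
peak_bin = max(range(len(tfs[0])), key=lambda k: tfs[0][k])
print(len(tfs), len(tfs[0]), peak_bin * fs / 64)  # frames, bins, peak frequency in Hz
```

The direct DFT keeps the example dependency-free; for real recordings an FFT-based routine would be the practical choice.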

A Study of response Spectrums and characteristics of Time-Frequency Domain of Microearthquakes in the Central Part of South Korea (남한 중부지역 미소지진들의 응답 스펙트럼 및 시간-주파수 영역에서의 특성에 관한 연구)

  • 이전희
    • Proceedings of the Earthquake Engineering Society of Korea Conference / 1999.10a / pp.72-82 / 1999
  • The microearthquake and explosion events recorded by the KNUE (Korea National University of Education) seismic network were analyzed. The seismic data were recorded from Dec. 1997 to Dec. 1998. A total of 118 records from 24 earthquake and 4 explosion events were obtained at 6 stations. Spectral values increase with magnitude, and the predominant frequency band expands toward the low-frequency zone as magnitude increases. Three-dimensional spectrograms (time, frequency, amplitude) were also synthesized in order to discriminate between microearthquakes and artificial underground explosions. For the waves from microearthquakes, the frequency content of the dominant amplitude appears above 10 Hz, and the discrimination can be performed over almost the entire frequency domain of the 3-D spectrogram.

The comparative Study of the Acoustic Representation between Pansori singer's and Spasmodic dysphonia patient's Voice (병적인 소리 떨림증과 소리꾼 떨림증의 음향학적인 비교연구)

  • Hong, K.H.;Kim, H.G.;Lee, J.K.;Choi, J.S.
    • Proceedings of the KSPS conference / 2007.05a / pp.143-145 / 2007
  • Muscle groups located in and around the vocal tract can produce audible changes in the frequency and/or intensity of the voice. Vocal vibrato is a characteristic feature of the singing of performers trained in the Western classical tradition, and it is generally considered to result from modulation of frequency, amplitude, and timbre. Vocal tremor is likewise characterized by periodic fluctuations in voice frequency or intensity, and it is a symptom of neurological diseases such as spasmodic dysphonia and Parkinson's disease. Vocal vibrato and vocal tremor may share many of the same origins and mechanisms in the voice production system. The purpose of this study is to identify the acoustic characteristics of the vibrato of singers of Pansori, a Korean traditional song form, and of the vocal tremor of spasmodic dysphonia patients. Twelve Pansori singers and seven spasmodic dysphonia patients participated in this study. Power spectra and real-time spectrograms were used to analyze the acoustic characteristics of Pansori singing and of the voices of spasmodic dysphonia patients. The results are as follows. First, the vowel formant differences between Pansori singing and spasmodic dysphonia patients' voices appear as higher F1 and F3. Second, the vibrato rate differs between Pansori singing and spasmodic dysphonia patients: $4{\sim}6/sec$ and $5{\sim}6/sec$, respectively. The vibrato rate of pitch is 5.7 Hz ${\sim}$ 42.4 Hz for Pansori singing and 3.8 Hz ${\sim}$ 27.9 Hz for spasmodic dysphonia patients; the vibrato rate of intensity range is 0.07 dB ${\sim}$ 8.26 dB for Pansori singing and 0.07 dB ${\sim}$ 4.81 dB for spasmodic dysphonia patients.
