• Title/Summary/Keyword: Sound spectrogram

Aurally Relevant Analysis by Synthesis - VIPER a New Approach to Sound Design -

  • Daniel, Peter;Pischedda, Patrice
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2003.05a / pp.1009-1009 / 2003
  • VIPER, a new tool for the VIsual PERception of sound quality and for sound design, is presented. Visualizing sound quality requires a signal analysis that models the information processing of the ear. The first step of the signal processing implemented in VIPER calculates an auditory spectrogram using a filter bank adapted to the time and frequency resolution of the human ear. The second step removes redundant information by extracting time and frequency contours from the auditory spectrogram, in analogy to contours in the visual system. In a third step, the contours and/or the auditory spectrogram can be resynthesised, confirming that only aurally relevant information was extracted. The visualization of the contours in VIPER makes it possible to grasp the important components of a signal intuitively. The contribution of parts of a signal to the overall quality can easily be auralized by editing and resynthesising the contours or the underlying auditory spectrogram. Resynthesis of the time contours alone makes it possible, for example, to auralize impulsive components separately from tonal components. Further processing of the contours determines tonal parts in the form of tracks. Audible differences between two versions of a sound can be inspected visually in VIPER with the help of auditory distance spectrograms. Applications are shown for the sound design of several car interior noises.

Significance of Nasometer and First Formant for Nasal Patency After Septoplasty and Turbinoplasty (비중격 성형술 및 하비잡개 절제술 후 비개존도 측정을 위한 Nasometer와 제1포만트 측정의 유용성)

  • 진성민;강현국;이경철;박상욱;이성채;이용배
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.8 no.2 / pp.161-165 / 1997
  • Background: Rhinomanometry and acoustic rhinometry can assess the nasal passage dynamically and statically. Recently, analytic methods such as the nasometer and the sound spectrogram have been gaining wide attention as objective measures of nasality. Objectives: First, to determine whether there is a relationship between the new methods and nasal airway resistance, and second, to establish whether the measurement of nasalance and sound spectrum could be used as an alternative to rhinomanometry and acoustic rhinometry. Materials and Methods: Thirty-two patients who underwent septoplasty and turbinectomy for nasal obstruction were studied. Their ages ranged from 15 to 45 years, with an average of 26.1 years. Rhinomanometry, nasometry and sound spectrography were performed preoperatively and 4 weeks postoperatively. Results: After the operation, subjective symptoms and rhinomanometric results improved significantly, but the nasalance and slope for the nana, mama and mamma passages showed no meaningful change. Significant changes were noted in the nasalance and first nasal formant frequency of the velar nasal consonant (angang). Conclusion: The nasometer and sound spectrogram had limitations as measures of nasal patency.

A Study on Partial Discharge Diagnostic System for Power Cable using RLCR

  • Park, Keeyoung;Choi, Hyungkee;Lee, Chulhee;Hong, Soomi
    • KEPCO Journal on Electric Power and Energy / v.2 no.1 / pp.43-47 / 2016
  • This system diagnoses whether a partial discharge is occurring in a power cable. It distinguishes normal sound from abnormal PD (Partial Discharge) sound by analyzing the RLCR (Relative Level Crossing Rate) and a spectrogram energy algorithm. The partial discharge diagnostic system stores PD sound and analyzes the data. The waveform of PD sound resembles noise and is generated systematically by partial discharge. Therefore, in this paper, we discriminate between normal and abnormal cases using the relative level crossing rate (RLCR) and the frequency energy rate of the spectrogram.
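
The abstract does not define RLCR, so the following is only a minimal sketch under an assumed definition: the fraction of adjacent sample pairs that cross a threshold set relative to the peak amplitude of the signal. The function name and the `relative_level` parameter are hypothetical and are not taken from the paper.

```python
def relative_level_crossing_rate(samples, relative_level=0.5):
    """Assumed RLCR: share of adjacent sample pairs that cross a threshold
    placed at relative_level * max(|x|). Hypothetical definition."""
    if len(samples) < 2:
        return 0.0
    peak = max(abs(x) for x in samples)
    if peak == 0:
        return 0.0
    level = relative_level * peak
    crossings = 0
    for prev, cur in zip(samples, samples[1:]):
        # A crossing occurs when the two samples lie on opposite sides of the level.
        if (prev < level) != (cur < level):
            crossings += 1
    return crossings / (len(samples) - 1)
```

Under this assumption, a noise-like signal that repeatedly swings through the threshold yields a higher rate than a steady signal, which is the kind of contrast a normal/abnormal decision could be based on.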

A Study on the Correlation between Sound Spectrogram and Sasang Constitution (성문(聲紋)과 사상체질(四象體質)과의 상관성(相關性)에 관(關)한 연구(硏究))

  • Yang, Seung-hyun;Kim, Dal Lae
    • Journal of Sasang Constitutional Medicine / v.8 no.2 / pp.191-202 / 1996
  • Classification of Sasang constitution is a very important subject, and many medical practitioners have studied it, but there is no established method for objective classification. The purpose of this study is to aid Sasang constitution classification through its correlation with the sound spectrogram. The study was conducted under the assumption that Sasang constitution is correlated with the sound spectrogram. The following results were obtained by comparing and analyzing the pitch and reading speed of the Sasang constitutions: 1. There was a similar tendency in composition reading speed among taeeumin, soeumin and soyangin. 2. In the pitch graph and the graph fitted with a normal curve, the center for taeeumin was lower than for soeumin and soyangin, while soeumin and soyangin showed a similar tendency. 3. There was a similar tendency in the width of the pitch graph among all constitutions. 4. In the mean pitch of the three constitutions, there was a significant difference between taeeumin and soeumin, meaning that taeeumin use a lower voice than soeumin. According to these results, there appears to be a correlation between the pitch of the sound spectrogram and Sasang constitution. Sasang constitution classification through sound spectrogram analysis can serve as an auxiliary method for making the classification more objective.

Comparison of environmental sound classification performance of convolutional neural networks according to audio preprocessing methods (오디오 전처리 방법에 따른 콘벌루션 신경망의 환경음 분류 성능 비교)

  • Oh, Wongeun
    • The Journal of the Acoustical Society of Korea / v.39 no.3 / pp.143-149 / 2020
  • This paper examines the effect of the feature extraction methods used in audio preprocessing on the classification performance of Convolutional Neural Networks (CNN). We extract the mel spectrogram, log mel spectrogram, Mel Frequency Cepstral Coefficients (MFCC), and delta MFCC from the UrbanSound8K dataset, which is widely used in environmental sound classification studies. Then we scale the data to 3 distributions. Using the data, we test four CNNs, VGG16, and MobileNetV2 networks for performance assessment according to the audio features and scaling. The highest recognition rate is achieved when using the unscaled log mel spectrum as the audio feature. Although this result does not apply to all audio recognition problems, it is useful for classifying the environmental sounds included in UrbanSound8K.
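
As a rough illustration of what separates a mel spectrogram from a log mel spectrogram, the sketch below converts between Hz and mel, places band edges evenly on the mel axis, and applies decibel compression. It assumes the common 2595 * log10(1 + f/700) mel formula; the abstract does not say which variant or library the author used, and all function names here are illustrative.

```python
import math

def hz_to_mel(f_hz):
    # Widely used mel formula (an assumption; the paper's variant is not stated).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_edges(f_min, f_max, n_bands):
    """Edge frequencies (Hz) of n_bands filters spaced evenly on the mel axis."""
    m_lo, m_hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_hi - m_lo) / (n_bands + 1)
    return [mel_to_hz(m_lo + i * step) for i in range(n_bands + 2)]

def power_to_db(power, floor=1e-10):
    """Log compression that turns mel band powers into log mel values (dB)."""
    return [10.0 * math.log10(max(p, floor)) for p in power]
```

The band edges grow further apart in Hz as frequency rises, and the decibel step compresses the large dynamic range of band powers before they reach a classifier.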

A Method of Sound Segmentation in Time-Frequency Domain Using Peaks and Valleys in Spectrogram for Speech Separation (음성 분리를 위한 스펙트로그램의 마루와 골을 이용한 시간-주파수 공간에서 소리 분할 기법)

  • Lim, Sung-Kil;Lee, Hyon-Soo
    • The Journal of the Acoustical Society of Korea / v.27 no.8 / pp.418-426 / 2008
  • In this paper, we propose an algorithm for frequency channel segmentation using peaks and valleys in the spectrogram. A frequency channel segment is a local group of channels in the frequency domain that could have arisen from the same sound source. The proposed algorithm is based on the smoothed spectrum of the input sound. Peaks and valleys in the smoothed spectrum are used to determine the centers and boundaries of segments, respectively. To evaluate the suitability of the proposed segmentation algorithm before the grouping stage is applied, we compare the results synthesized using an ideal mask with those of the proposed algorithm. Simulations are performed on speech signals mixed with narrowband noise, wideband noise and other speech signals.
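
The peak-and-valley idea can be sketched for a single spectrum frame as follows. This is a simplified illustration, not the authors' algorithm: the moving-average smoother, the local-extremum rules and the function names are all assumptions.

```python
def smooth(spectrum, width=3):
    """Moving-average smoothing; the window shrinks at the edges."""
    half = width // 2
    out = []
    for i in range(len(spectrum)):
        window = spectrum[max(0, i - half):min(len(spectrum), i + half + 1)]
        out.append(sum(window) / len(window))
    return out

def segment_channels(spectrum, width=3):
    """Return (left_boundary, center, right_boundary) channel indices.
    Peaks of the smoothed spectrum act as segment centers and the nearest
    valleys on either side act as segment boundaries."""
    s = smooth(spectrum, width)
    inner = range(1, len(s) - 1)
    peaks = [i for i in inner if s[i - 1] < s[i] >= s[i + 1]]
    valleys = [i for i in inner if s[i - 1] > s[i] <= s[i + 1]]
    segments = []
    for p in peaks:
        left = max((v for v in valleys if v < p), default=0)
        right = min((v for v in valleys if v > p), default=len(s) - 1)
        segments.append((left, p, right))
    return segments
```

For a frame with two spectral bumps separated by a dip, this yields two segments that share the dip as their common boundary.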

Watermarking System That Inserts Copyright Holder's Logo (저작권자의 로고를 워터 마킹하는 장치)

  • 남상엽;이천우;김형배;이상원;박인정
    • Proceedings of the IEEK Conference / 2003.07d / pp.1487-1490 / 2003
  • This paper presents a watermarking system that inserts the copyright holder's logo into a music file. In other words, a sound file can carry image information such as a logo or letters. The watermarking system converts a sound file into an image file using a spectrogram. In the spectrogram domain, the logo is inserted using spread spectrum. The proposed technique verifies copyright better than the method using a PN sequence.

Evaluation of Stimulus Strategy for Cochlear Implant Using Neurogram (Neurogram을 이용한 인공와우 자극기법 평가 연구)

  • Yang, Hyejin;Woo, Jihwan
    • Journal of Biomedical Engineering Research / v.34 no.2 / pp.47-54 / 2013
  • Electrical stimulation is delivered to the auditory nerve (AN) through the electrodes of a cochlear implant system. A neurogram is a spectrogram that contains information on the neural response to electrical stimulation. We hypothesized that the similarity between a neurogram and an input-sound spectrogram could show how well a cochlear implant system works. In this study, we evaluated the electrical stimulus configuration of the CIS strategy using a computational model. The computational model includes the stochastic properties and anatomical features of the cat auditory nerve fiber. To evaluate the similarity between a neurogram and an input-sound spectrogram, we calculated the Structural Similarity Index (SSIM). The results show that the dynamic range and the stimulation rate per channel influenced the SSIM. Finally, we suggest the optimal configuration within the given CIS stimulus. We expect that the results and the evaluation procedure can be used to improve the performance of a cochlear implant system.
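
To make the similarity measure concrete, the sketch below computes SSIM over two equally sized 2-D arrays treated as a single global window. This is a simplification: SSIM is usually averaged over local sliding windows, and the abstract does not state the window or constants used, so the defaults (k1 = 0.01, k2 = 0.03) and the function name are assumptions.

```python
def global_ssim(a, b, dynamic_range=1.0, k1=0.01, k2=0.03):
    """SSIM of two 2-D arrays (lists of rows) using one global window.
    Simplified sketch; standard SSIM averages over local windows."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    if not xs or len(xs) != len(ys):
        raise ValueError("inputs must be non-empty and the same size")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    c1 = (k1 * dynamic_range) ** 2  # stabilizes the luminance term
    c2 = (k2 * dynamic_range) ** 2  # stabilizes the contrast/structure term
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

A value of 1 means the two arrays are identical; lower values indicate that the neurogram preserves less of the structure of the input-sound spectrogram.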

CNN based Complex Spectrogram Enhancement in Multi-Rotor UAV Environments (멀티로터 UAV 환경에서의 CNN 기반 복소 스펙트로그램 향상 기법)

  • Kim, Young-Jin;Kim, Eun-Gyung
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.4 / pp.459-466 / 2020
  • Sound collected by a multi-rotor unmanned aerial vehicle (UAV) includes ego noise generated by the motor or propeller and wind noise generated during flight, so its quality is greatly impaired. In a multi-rotor UAV environment, both the magnitude and the phase of the target sound are heavily corrupted, so the sound must be enhanced with both magnitude and phase in mind. However, it is difficult to improve the phase because it does not exhibit structural characteristics. In this study, we propose a CNN-based complex spectrogram enhancement method that removes noise based on the complex spectrogram, which can represent both magnitude and phase. Experimental results show that the proposed method improves enhancement performance by considering both the magnitude and the phase of the complex spectrogram.
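
The sketch below shows only what "complex spectrogram" means here: each time-frequency bin is a complex number that carries both magnitude and phase. It does not implement the CNN enhancement. The frame length, hop size, Hann window and function names are assumptions chosen to keep the example small.

```python
import cmath
import math

def stft(signal, frame_len=8, hop=4):
    """Naive short-time Fourier transform returning complex bins per frame."""
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        # Periodic Hann window to reduce spectral leakage.
        windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len))
                    for n, x in enumerate(frame)]
        spectrum = [sum(windowed[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
                    for k in range(frame_len // 2 + 1)]
        frames.append(spectrum)
    return frames

def to_magnitude_phase(spec):
    """Split a complex spectrogram into magnitude and phase arrays."""
    magnitude = [[abs(c) for c in frame] for frame in spec]
    phase = [[cmath.phase(c) for c in frame] for frame in spec]
    return magnitude, phase

def from_magnitude_phase(magnitude, phase):
    """Rebuild the complex spectrogram from magnitude and phase."""
    return [[cmath.rect(m, p) for m, p in zip(m_row, p_row)]
            for m_row, p_row in zip(magnitude, phase)]
```

A magnitude-only method would modify `magnitude` and reuse the noisy `phase`, whereas a method that works on the complex values directly can change both at once.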

Recognition of Overlapped Sound and Influence Analysis Based on Wideband Spectrogram and Deep Neural Networks (광역 스펙트로그램과 심층신경망에 기반한 중첩된 소리의 인식과 영향 분석)

  • Kim, Young Eon;Park, Gooman
    • Journal of Broadcast Engineering / v.23 no.3 / pp.421-430 / 2018
  • Many voice recognition systems use methods such as MFCC and HMM to recognize the human voice. This recognition approach is designed to analyze only a targeted sound, which normally occurs between a human and a device. However, recognition capability is limited for a group of sounds spread over a wider frequency range, such as dog barking and indoor sounds. The frequency content of overlapped sound spans a wide range, up to 20 kHz, which is higher than that of a voice. This paper proposes a new recognition method that covers a wider frequency range by combining the Wideband Sound Spectrogram (WSS) and the Keras Sequential Model (KSM) based on DNN. The WSS is adopted to analyze and verify diverse sounds over a wide frequency range, as it is designed to extract and classify features. The KSM is used for pattern recognition on the features extracted from the WSS to improve sound recognition quality. The experiment verified that the proposed WSS and KSM classified the targeted sound well in noisy environments with overlapped sounds such as dog barking and indoor sounds. Furthermore, the paper presents a stage-by-stage analysis and comparison of how the factors influence recognition and its characteristics at various noise levels.