Search | Korea Science

On the Classification of Normal, Benign, Malignant Speech Using Neural Network and Cepstral Method (Cepstrum 방법과 신경회로망을 이용한 정상, 양성종양, 악성종양 상태의 식별에 관한 연구)

조철우
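- Proceedings of the Acoustical Society of Korea Conference / 1998.06e / pp.399-402 / 1998
This paper reports an experiment that classifies patients' speech as normal, benign tumor, or malignant tumor, using source separation based on cepstral parameters together with a neural network. Existing pathological speech databases contained only normal and benign cases and covered only patients from other countries, so it was difficult to predict what results they would give when applied directly to Korean patients. By analyzing a database of normal, benign, and malignant cases recently collected in Korea by the otolaryngology team at Pusan National University and classifying it with a neural network, laryngeal disease could be identified from the speech signal alone. As classification parameters, the experiment used HNRR, a source ratio obtained from the cepstrum of the linear prediction error signal of the speech, together with jitter and shimmer. The neural network was a multilayer network with input and output layers and one hidden layer, and classification was done in two stages: first normal versus abnormal, then abnormal into benign versus malignant [1].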
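
The jitter and shimmer parameters named in this abstract can be illustrated with a minimal sketch. The abstract does not say which variant the author used, so the code below assumes the common "local" definition (mean absolute difference between consecutive cycles, divided by the mean). The function name and the sample values are hypothetical.

```python
def local_perturbation(values):
    """Mean absolute difference between consecutive values, divided by the mean value.

    Applied to pitch periods this gives a local jitter estimate;
    applied to per-cycle peak amplitudes it gives a local shimmer estimate.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


# Hypothetical measurements for a few consecutive glottal cycles.
periods_ms = [8.0, 8.2, 7.9, 8.1]      # pitch periods in milliseconds
peak_amps = [0.90, 0.85, 0.92, 0.88]   # peak amplitude of each cycle

jitter = local_perturbation(periods_ms)
shimmer = local_perturbation(peak_amps)
print(f"jitter={jitter:.4f} shimmer={shimmer:.4f}")
```

In a pipeline like the one described, values of this kind would be stacked with HNRR into the feature vector fed to the classifier.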

A Study on the Analysis and Recognition of Korean Speech Signal using the Phoneme (음소를 이용한 한국어 음성 신호의 분석과 인식에 관한 연구)

Kim Y. I.;Hwang Y. S.;Youn D. H.;Cha I. W.
- The Journal of the Acoustical Society of Korea / v.8 no.5 / pp.70-77 / 1989
This paper studies Korean speech recognition based on phonemes. The experiment divides 545 isolated words into phonemes. Using linear prediction coefficients, the recognition rates for consonants, vowels, and end-consonants are 87.3%, 91.0%, and 91.7%, respectively. The recognition rate for isolated words assembled from the recognized phonemes is 71.4%. The Itakura-Saito distortion measure is used for both phoneme segmentation and phoneme recognition.
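
As a rough illustration of the Itakura-Saito distortion measure mentioned in this abstract, the sketch below evaluates its standard discrete form on two sampled power spectra. The paper works from linear prediction coefficients; how it derives the spectra is not stated here, so the inputs are assumed to be precomputed, strictly positive power values, and the function name is hypothetical.

```python
import math


def itakura_saito(p, p_hat):
    """Mean Itakura-Saito distortion between a reference power spectrum p
    and a model power spectrum p_hat (same length, all values > 0)."""
    if len(p) != len(p_hat):
        raise ValueError("spectra must have the same length")
    total = 0.0
    for ref, model in zip(p, p_hat):
        ratio = ref / model
        # Each term is zero when ratio == 1 and positive otherwise.
        total += ratio - math.log(ratio) - 1.0
    return total / len(p)


# Hypothetical three-bin spectra.
reference = [1.0, 4.0, 2.0]
candidate = [1.2, 3.5, 2.1]
print(itakura_saito(reference, candidate))
```

The measure is not symmetric, so the order of the two arguments matters when comparing an input frame against stored phoneme templates.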

High-Band Codec for Bandwidth Scalable Wideband Speech Codec (대역폭 계층 구조의 광대역 음성 부호화기를 위한 상위 대역 부호화기 연구)

Kim Youngvo;Jeong Byounghak;Son Chang-Yong;Sung Ho-Sang;Park Hochong
- The Journal of the Acoustical Society of Korea / v.24 no.7 / pp.395-401 / 2005
This paper proposes a high-band codec for a bandwidth-scalable wideband speech codec. The wideband input speech signal is split into a low-band signal and a high-band signal; the low-band signal is encoded by a standard narrow-band speech codec and the high-band signal by the proposed codec. In the high-band codec, the signal is transformed into the frequency domain by the MLT on a subframe basis, and the MLT coefficients are split into magnitude and sign for quantization. The magnitudes of the MLT coefficients are arranged into several time-frequency bands, and each band is quantized in the 2D-DCT domain, where low-band information is used to improve performance. The signs of the MLT coefficients are quantized through a priority selection process based on a weighting measure. The objective and subjective performance of the wideband speech codec including the proposed high-band codec was measured, and the proposed codec was confirmed to perform better than G.722.1 at 32 kbps.

Reverberation Characterization and Suppression by Means of Low Rank Approximation (낮은 계수 근사법을 이용한 표준 잔향음 신호 획득 및 제거 기법)

윤관섭;최지웅;나정열
- The Journal of the Acoustical Society of Korea / v.21 no.5 / pp.494-502 / 2002
This paper applies the Low Rank Approximation (LRA) method to suppress interference from temporally fluctuating signals. The reverberation signals and the temporally fluctuating signals are separated from the measured data using the LRA. Singular value decomposition (SVD) is applied to extract the low-rank component, and the temporally stable reverberation is obtained from the LRA. Reverberation suppression is then performed on the LRA residual, which is obtained by removing the approximated reverberation signals. Overall, the method can be applied to the suppression of reverberation in active sonar systems as well as to the modeling of reverberation.
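
The low-rank idea in this abstract can be sketched in a few lines. The paper uses SVD; to keep the example dependency-free, the sketch below swaps in power iteration to obtain only the leading singular component (a rank-1 approximation). The data layout (rows as pings, columns as time samples), the function names, and the toy matrix are all assumptions for illustration.

```python
import math


def rank1_approximation(X, iterations=50):
    """Best rank-1 approximation of matrix X (list of rows) via power iteration."""
    rows, cols = len(X), len(X[0])
    v = [1.0 / math.sqrt(cols)] * cols  # assumed start vector, not orthogonal to the answer
    for _ in range(iterations):
        u = [sum(X[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(X[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("power iteration collapsed to zero")
        v = [x / norm for x in w]
    # u now holds sigma times the left singular vector.
    u = [sum(X[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return [[u[i] * v[j] for j in range(cols)] for i in range(rows)]


def residual(X, L):
    """Element-wise difference X - L: what remains after removing the low-rank part."""
    return [[x - l for x, l in zip(row_x, row_l)] for row_x, row_l in zip(X, L)]


# Toy data: four pings sharing one stable decay, plus a transient in ping 2.
stable = [4.0, 3.0, 2.0, 1.0]
pings = [list(stable) for _ in range(4)]
pings[2][1] += 0.5  # hypothetical fluctuating component

low_rank = rank1_approximation(pings)
fluctuating = residual(pings, low_rank)
```

Here `low_rank` plays the role of the temporally stable reverberation estimate and `fluctuating` the LRA residual; a higher-rank approximation would need a full SVD routine.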

Speech Recognition Performance Improvement using Gamma-tone Feature Extraction Acoustic Model (감마톤 특징 추출 음향 모델을 이용한 음성 인식 성능 향상)

Ahn, Chan-Shik;Choi, Ki-Ho
- Journal of Digital Convergence / v.11 no.7 / pp.209-214 / 2013
To improve the recognition performance of speech recognition systems, methods that build human auditory abilities into the system have been used. In noisy environments, the desired speech signal is selected by separating speech from noise. In practice, however, several factors degrade performance: as the recognition environment changes because of noise, speech detection becomes inaccurate and the trained model no longer matches the input. To improve speech recognition, this paper proposes feature extraction using gammatone filters and a learning model based on an acoustic model. In the proposed method, feature extraction based on auditory scene analysis, which reflects human auditory perception, is incorporated into the process of training the recognition model. In the performance evaluation, noise was removed from signals with -10 dB and -5 dB noise, and SNR improvements of 3.12 dB and 2.04 dB, respectively, were confirmed.
https://doi.org/10.14400/JDPM.2013.11.7.209

Suppression of side lobe and grating lobe in ultrasound medical imaging system (의료용 초음파 영상 시스템에서 부엽과 격자엽의 억제)

Jeong, Mok Kun
- The Journal of the Acoustical Society of Korea / v.41 no.5 / pp.525-533 / 2022
We propose an effective method for suppressing both side lobes and grating lobes by applying a 2-D Fourier transform to the received channel data during the receive focusing process of an ultrasound imaging system. When the signal from the image point is focused, the channel signals have the same DC value across the channels. Echoes from outside the imaging point, however, appear after focusing with different spatial frequencies depending on their incident angles. Therefore, after the receive focusing delay is applied, a 2-D Fourier transform is performed on the time-channel data to separate the DC component from the other frequency components in the spectral domain, and a weighting value is defined as the ratio of the two. The side lobes and grating lobes were suppressed by multiplying the ultrasound image by this weighting value. Ultrasound images at a frequency of 5 MHz were simulated for a 64-channel linear array. The grating lobe appearing in the ultrasound image was completely removed by the proposed method. In addition, the side lobes were reduced and the lateral resolution was greatly improved. Computer simulation results on an image mimicking a human organ show that the proposed method can aid lesion diagnosis by increasing image contrast.
https://doi.org/10.7776/ASK.2022.41.5.525

UTS Designs and Experiments according to a Stand-off Technique using the Magnetostrictive Ultrasonic (자왜 초음파를 이용한 Stand-off 기술에 따른 UTS 설계 및 실험)

Koo Kil-Mo;Kim Sang-Baik;Kim Hee-Dong;Kang Hee-Young;Joung Young-Moo;Park Chi-Seong
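- Proceedings of the Acoustical Society of Korea Conference / spring / pp.257-262 / 2004
Building on a method for measuring ultra-high temperatures from ultrasonic wavelet delay times, this paper takes the second step of designing an ultrasonic temperature sensor (UTS: Ultrasonic Temperature Sensor) that remains durable at melt temperatures, with the aim of measuring the temperature inside an experimental furnace up to about $2300^{\circ}C$. A key factor in the UTS design is that the two tungsten surfaces, the outer surface of the sensor rod and the inner surface of the sheath, must not touch. If the two come into contact at high temperature, acoustic shunting occurs. The sensor must therefore be designed to suppress this phenomenon physically, and the first requirement for a successful design is a technique that structurally suppresses acoustic shunting inside the sensor. To prevent contact within the sensor's internal structure, a standoff, a tungsten part that provides structural separation within the small gap, was newly fabricated and installed. In this experiment, the signal-to-noise ratio of the output signal with the proposed standoff showed only a slight potential for improvement, so the design and fabrication of a variety of standoffs should continue.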

Blind Rhythmic Source Separation (블라인드 방식의 리듬 음원 분리)

Kim, Min-Je;Yoo, Ji-Ho;Kang, Kyeong-Ok;Choi, Seung-Jin
- The Journal of the Acoustical Society of Korea / v.28 no.8 / pp.697-705 / 2009
An unsupervised (blind) method is proposed for extracting rhythmic sources from commercial polyphonic music whose number of channels is limited to one. Commercial music signals are not usually provided with more than two channels, yet they often contain multiple instruments, including singing voice. Therefore, instead of relying on conventional modeling of mixing environments or statistical characteristics, other source-specific characteristics must be introduced to separate or extract sources in underdetermined environments. This paper concentrates on extracting rhythmic sources from a mixture that also contains harmonic sources. An extension of nonnegative matrix factorization (NMF), called nonnegative matrix partial co-factorization (NMPCF), is used to analyze multiple relationships between spectral and temporal properties in the given input matrices. In addition, the temporal repeatability of rhythmic sound sources is treated as a common rhythmic property shared among segments of the input mixture signal. The proposed method gives acceptable, though not superior, separation quality compared with the referenced drum source separation systems that rely on prior knowledge, but it is more widely applicable because it is blind, for example when no prior information is available or the target rhythmic source is irregular.
https://doi.org/10.7776/ASK.2009.28.8.697

The Comparison of features for Speech/Music Discrimination (음성/음악 분류를 위한 특징 비교)

Lee Kyong Rok;Seo Bong Su;Kim Jin Young
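- Proceedings of the Acoustical Society of Korea Conference / spring / pp.157-160 / 2000
This paper presents speech/music classification experiments, which serve as a preprocessing stage for audio indexing, the part of multimedia indexing that extracts desired information from multimedia data. In audio indexing, the speech/music classifier separates the information-bearing speech portions from the original audio signal. The experiments used the mel cepstrum, normalized log energy, and zero-crossings, all widely used in speech/music classification, as feature parameters [1, 2, 3]. The feature space was modeled with a GMM (Gaussian Mixture Model), and audio signals were classified under both a three-class scheme (speech, music, speech+music) and a two-class scheme (speech, music). In both the three-class and two-class settings, the mel cepstrum gave the best results.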
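
Two of the features compared in this abstract, zero-crossings and normalized log energy, are simple enough to sketch per frame. The abstract does not specify the normalization, so the version below assumes each frame's log energy is offset by the maximum over the clip. Function names and the sample frames are hypothetical.

```python
import math


def zero_crossings(frame):
    """Count sign changes between consecutive samples (zero is treated as positive)."""
    signs = [x >= 0.0 for x in frame]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def log_energy(frame, floor=1e-12):
    """Natural log of the mean squared amplitude; floor avoids log(0) on silence."""
    return math.log(sum(x * x for x in frame) / len(frame) + floor)


def normalized_log_energies(frames):
    """Log energy of each frame relative to the loudest frame (assumed normalization)."""
    energies = [log_energy(f) for f in frames]
    peak = max(energies)
    return [e - peak for e in energies]


# Hypothetical frames: one rapidly alternating, one slowly varying and quieter.
frames = [[0.5, -0.4, 0.6, -0.5], [0.10, 0.12, 0.11, 0.09]]
print([zero_crossings(f) for f in frames])
print(normalized_log_energies(frames))
```

Per-frame values like these would be collected into feature vectors before fitting one GMM per class.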

Study on formant transition for improvement of speech synthesis (음성 합성의 개선을 위한 포만트 변경에 관한 연구)

Lee Sang-hyun;Yang Sung-il;Kwon Y.
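- Proceedings of the Acoustical Society of Korea Conference / autumn / pp.41-44 / 2001
This paper proposes a method for modifying the formants of the preceding and following speech units in order to reduce the unnatural synthesized sound caused by formant mismatch at vowel junctions when speech units are concatenated during synthesis. In the corpus-based approach studied recently, units are selected on the basis of energy, pitch, phoneme duration, and similar criteria and then concatenated, but spectral mismatch still occurs. This spectral mismatch degrades sound quality. Sound quality was therefore improved by shifting the formants in a fixed portion of the junction region of the preceding unit and of the following unit so that they match. The speech signal is transformed with the FFT and separated into magnitude and phase; the magnitudes are then moved toward target values obtained by linear interpolation between the magnitude at the junction of the preceding unit and that of the following unit, and magnitude and phase are recombined to restore the signal.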
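
The magnitude/phase procedure outlined in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it uses a naive DFT in place of an FFT, operates on a single frame, and takes the target magnitude spectrum and the interpolation weight `alpha` as given. All function names are hypothetical.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft_real(spec):
    """Inverse DFT, keeping only the real part of each output sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]


def shift_magnitude(frame, target_mag, alpha):
    """Move the frame's magnitude spectrum a fraction alpha toward target_mag,
    keep the original phase, and return the resynthesized frame."""
    shifted = []
    for coeff, target in zip(dft(frame), target_mag):
        mag = (1.0 - alpha) * abs(coeff) + alpha * target  # linear interpolation
        shifted.append(cmath.rect(mag, cmath.phase(coeff)))
    return idft_real(shifted)


# Hypothetical junction frame and a target spectrum twice as large in every bin.
frame = [0.0, 1.0, 0.0, -1.0]
target = [2.0 * abs(c) for c in dft(frame)]
halfway = shift_magnitude(frame, target, 0.5)
```

Frames nearer the junction would use a larger `alpha`, so the spectra of the two units meet at the boundary.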

Search Result 137, Processing Time 0.032 seconds
