• Title/Summary/Keyword: 음질 인식 (sound quality recognition)


BLIND AUDIO WATERMARKING TECHNIQUE USING SPECIFIC FREQUENCY SIGNAL (특정된 주파수 신호를 이용한 오디오 워터마킹)

  • Piao, Cheng-Ri;Han, Seung-Soo;Choi, Jong-Uk
    • Proceedings of the KIEE Conference / 2002.07d / pp.2368-2372 / 2002
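  • Watermarking is widely studied and used across multimedia fields as a technology for copyright protection. Its advantages are that the embedded information does not perceptibly degrade the content for the consumer and that, because the information is hidden in the content itself, it always stays with the content. Advances in compression technologies such as MP3 have improved data transmission over networks, which in turn has stimulated the distribution of digital data. As a result, the circulation of illegally copied content is harming producers' profits. To protect ownership of digital audio content and to improve robustness against compression, this paper uses the masking effect of the human auditory system to embed watermark information carried at a specific frequency into the audio signal in the time domain. The chosen frequency must lie in a band that survives compression, and sound quality must be considered at the same time. For extraction, an FFT is applied and the watermark is recovered in the frequency domain. So that the copyright information can be verified easily, a binary image is embedded as the watermark information.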

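The embed-in-time, extract-by-FFT idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' method: the sample rate, frame length, marker frequency `MARK_HZ`, and amplitude `ALPHA` are all assumed values, the psychoacoustic masking model and the compression step are omitted, and a single DFT bin is computed directly in place of a full FFT.

```python
import cmath
import math

SAMPLE_RATE = 8000   # Hz, assumed
FRAME = 400          # samples carrying one watermark bit, assumed
MARK_HZ = 1000.0     # hypothetical "specific frequency" (50 whole cycles per frame)
ALPHA = 0.02         # embedding amplitude, small relative to the host signal


def embed(host, bits):
    """Add a faint MARK_HZ tone in the time domain to every frame whose bit is 1."""
    marked = list(host)
    for i, bit in enumerate(bits):
        if bit:
            for n in range(i * FRAME, (i + 1) * FRAME):
                marked[n] += ALPHA * math.sin(2 * math.pi * MARK_HZ * n / SAMPLE_RATE)
    return marked


def bin_magnitude(frame, freq_hz):
    """Normalized magnitude of one DFT bin, the value an FFT would give at that bin."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * n / SAMPLE_RATE)
              for n, x in enumerate(frame))
    return abs(acc) / len(frame)


def extract(marked, n_bits, threshold=ALPHA / 4):
    """Blind extraction: only the marked signal is needed, not the original."""
    return [int(bin_magnitude(marked[i * FRAME:(i + 1) * FRAME], MARK_HZ) > threshold)
            for i in range(n_bits)]


bits = [1, 0, 1, 1, 0, 0, 1, 0]   # e.g. one row of a binary logo image
host = [0.5 * math.sin(2 * math.pi * 220.0 * n / SAMPLE_RATE)
        for n in range(FRAME * len(bits))]
recovered = extract(embed(host, bits), len(bits))
```

A tone of amplitude `ALPHA` that fills a frame yields a normalized bin magnitude of `ALPHA / 2`, so a threshold of `ALPHA / 4` separates marked from unmarked frames as long as the host has little energy at `MARK_HZ`. A real system would also have to verify that the tone survives the target codec.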

A Robust Speech Recognition Method Combining the Model Compensation Method with the Speech Enhancement Algorithm (음질향상 기법과 모델보상 방식을 결합한 강인한 음성인식 방식)

  • Kim, Hee-Keun;Chung, Yong-Joo;Bae, Keun-Seung
    • Speech Sciences / v.14 no.2 / pp.115-126 / 2007
  • Many research efforts have sought to improve the performance of speech recognizers in noisy conditions. Among them, model compensation and speech enhancement have been used widely. In this paper, we propose combining the two approaches to further improve recognition rates for noisy speech. For speech enhancement, the minimum mean square error short-time spectral amplitude (MMSE-STSA) estimator is adopted, and parallel model combination (PMC) and Jacobian adaptation (JA) are used as the model compensation approaches. The experimental results show that the hybrid approach, which applies the model compensation methods to the enhanced speech, produces better results than either approach used alone.

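As a rough illustration of model compensation, the sketch below applies the log-add approximation often used with PMC to combine a clean-speech mean and a noise mean in the log-spectral domain. It is a simplification under stated assumptions: only means are combined, the cepstral-to-log-spectral transforms and variance handling of full PMC are omitted, and the function name and `gain` parameter are hypothetical rather than taken from the paper.

```python
import math


def pmc_log_add(speech_log_mean, noise_log_mean, gain=1.0):
    """Combine per-band log-spectral means: log(exp(speech) + gain * exp(noise))."""
    return [math.log(math.exp(s) + gain * math.exp(n))
            for s, n in zip(speech_log_mean, noise_log_mean)]


# Hypothetical three-band model means (log power).
clean = [2.0, 0.5, -1.0]
noise = [-3.0, 0.5, 1.0]
noisy = pmc_log_add(clean, noise)
```

Bands where the noise lies far below the speech are left almost unchanged, while bands dominated by noise move toward the noise mean. That is the adjustment a compensated model needs in order to match noisy or residually noisy input.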

A Study on the Perception of Foreign Undergraduates on Online Lecture

  • Kim, Yoon-Hee;Lim, Eun-jin
    • Journal of the Korea Society of Computer and Information / v.25 no.9 / pp.203-212 / 2020
  • The purpose of this study is to analyze how foreign learners perceive non-face-to-face online undergraduate lectures, to identify problems with online lectures, and to suggest improvements. To this end, the perceptions of foreign undergraduate students who took online lectures at universities A and B were surveyed and analyzed. On that basis, the study explores the design direction and complementary measures for online lectures to be held at Korean universities in the future. The results show that non-real-time lectures delivered through E campus were seen as advantageous because students could study the material repeatedly and listen to lectures at home, while real-time lectures using Zoom were seen as advantageous because professors and learners could communicate. Both types of online lecture involved many assignments, and students had difficulty concentrating until the end of the lecture. The findings indicate that the amount of lecture content and the number of assignments should be reduced, and that the picture and sound quality of the lecture video should be improved. As for evaluation, students preferred online to offline evaluation and relative to absolute evaluation. The results give a close view of how learners perceive online lectures and identify the points that need to be improved when conducting them. They are expected to contribute to the design of online lectures and the development of online content at each university.

An Improvement of Stochastic Feature Extraction for Robust Speech Recognition (강인한 음성인식을 위한 통계적 특징벡터 추출방법의 개선)

  • 김회린;고진석
    • The Journal of the Acoustical Society of Korea / v.23 no.2 / pp.180-186 / 2004
  • Noise in speech signals degrades the performance of recognition systems when the training and test environments are mismatched. To make a speech recognizer robust, these mismatches must be compensated for. In this paper, we study an improvement of stochastic feature extraction based on band-SNR for robust speech recognition. First, we propose a modified version of the multi-band spectral subtraction (MSS) method, which adjusts the subtraction level of the noise spectrum according to the band-SNR. In the proposed method, referred to as M-MSS, a noise normalization factor is newly introduced to finely control the over-estimation factor depending on the band-SNR. Second, we modify the architecture of the stochastic feature extraction (SFE) method: performance was better when spectral subtraction was applied in the power spectrum domain than in the mel-scale domain. This method is denoted M-SFE. Last, we apply the M-MSS method to the modified stochastic feature extraction structure, which is denoted the MMSS-MSFE method. The proposed methods were evaluated on isolated word recognition under various noise environments. Relative to the ordinary spectral subtraction (SS) method, the average error rates of the M-MSS, M-SFE, and MMSS-MSFE methods were reduced by 18.6%, 15.1%, and 33.9%, respectively. From these results, we conclude that the proposed methods are good candidates for robust feature extraction in noisy speech recognition.
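To make the band-SNR idea concrete, here is a minimal sketch of spectral subtraction in which the over-subtraction factor depends on the SNR of the band. It uses the widely cited piecewise-linear rule for the factor, not the paper's M-MSS: the noise normalization factor introduced there is not reproduced, and the spectral floor value and function names are assumptions.

```python
import math


def over_subtraction(snr_db):
    """Piecewise-linear factor: subtract more aggressively when the band SNR is low."""
    if snr_db < -5.0:
        return 4.75
    if snr_db > 20.0:
        return 1.0
    return 4.0 - 0.15 * snr_db


def band_snr_db(noisy_power, noise_power):
    """SNR of one band, from the summed noisy and estimated noise power."""
    return 10.0 * math.log10(sum(noisy_power) / sum(noise_power))


def subtract_band(noisy_power, noise_power, floor=0.002):
    """Power-domain subtraction for one band, with a spectral floor (assumed value)."""
    alpha = over_subtraction(band_snr_db(noisy_power, noise_power))
    return [max(y - alpha * d, floor * y) for y, d in zip(noisy_power, noise_power)]


high_snr_band = subtract_band([10.0, 10.0], [1.0, 1.0])   # band SNR = 10 dB
low_snr_band = subtract_band([1.0, 1.0], [1.0, 1.0])      # band SNR = 0 dB
```

Subtraction is done here on power spectra, matching the abstract's observation that the power spectrum domain worked better than the mel-scale domain. The floor keeps bins from going negative when the over-subtracted noise exceeds the noisy power.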

Performance Analysis of a Class of Single Channel Speech Enhancement Algorithms for Automatic Speech Recognition (자동 음성 인식기를 위한 단채널 음질 향상 알고리즘의 성능 분석)

  • Song, Myung-Suk;Lee, Chang-Heon;Lee, Seok-Pil;Kang, Hong-Goo
    • The Journal of the Acoustical Society of Korea / v.29 no.2E / pp.86-99 / 2010
  • This paper analyzes the performance of various single-channel speech enhancement algorithms when they are applied as a preprocessor to automatic speech recognition (ASR) systems. The functional modules of speech enhancement systems are first divided into four major modules: a gain estimator, a noise power spectrum estimator, an a priori signal-to-noise ratio (SNR) estimator, and a speech absence probability (SAP) estimator. We investigate the relationship between speech recognition accuracy and the role of each module. Simulation results show that the Wiener filter outperforms other gain functions, such as the minimum mean square error short-time spectral amplitude (MMSE-STSA) and minimum mean square error log-spectral amplitude (MMSE-LSA) estimators, when a perfect noise estimator is applied. When the performance of the noise estimator degrades, however, MMSE methods that include the decision-directed module for estimating the a priori SNR and the SAP estimation module help to improve the performance of the enhancement algorithm for speech recognition systems.
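Two of the modules named above, the gain estimator and the a priori SNR estimator, can be sketched for a single frequency bin as follows. The sketch assumes a known, constant noise power (the "perfect noise estimator" case), uses the standard decision-directed recursion with an assumed smoothing constant `a = 0.98`, and leaves out the SAP estimator. The function names are illustrative.

```python
def wiener_gain(xi):
    """Wiener gain as a function of the a priori SNR xi."""
    return xi / (1.0 + xi)


def enhance_bin(noisy_power, noise_power, a=0.98):
    """Track one frequency bin over frames and return the gain applied in each frame."""
    gains = []
    prev_clean_power = 0.0
    for y2 in noisy_power:
        gamma = y2 / noise_power                      # a posteriori SNR
        xi = (a * prev_clean_power / noise_power      # decision-directed a priori SNR
              + (1.0 - a) * max(gamma - 1.0, 0.0))
        gain = wiener_gain(xi)
        prev_clean_power = gain * gain * y2           # clean-power estimate for next frame
        gains.append(gain)
    return gains


speech_onset = enhance_bin([100.0, 100.0, 100.0], noise_power=1.0)
noise_only = enhance_bin([1.0, 1.0], noise_power=1.0)
```

The recursion smooths the SNR estimate across frames, so the gain rises gradually after a speech onset instead of jumping, and stays at zero for frames whose power equals the noise power.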

Possibility of Motor Speech Improvement in People With Spinocerebellar Ataxia via Intensive Speech Treatment (집중치료를 통한 소뇌운동실조증 환자의 말운동개선 가능성)

  • Park, Youngmi
    • The Journal of the Korea Contents Association / v.18 no.11 / pp.634-642 / 2018
  • People with spinocerebellar ataxia (SCA), a hereditary and progressive neurogenic disorder, suffer from ataxic dysarthria due to cerebellar dystrophy. This study was designed to examine whether intensive motor speech treatment yields improvement in progressive ataxic dysarthria and, if so, to investigate the magnitude of the therapeutic effect. SPEAK OUT!® was provided to a 55-year-old female diagnosed with SCA to improve her motor speech functions. The therapeutic effect was large for changes in maximum phonation time (MPT) and vocal intensity across speech tasks. The effect size was small for changes in fundamental frequency, but a large therapeutic effect was observed for changes in frequency range. In addition, vocal quality based on jitter, shimmer, and harmonics-to-noise ratio (HNR) improved with a large effect size, and the vowel space expanded, particularly due to F1. Lastly, Voice Handicap Index (VHI) scores decreased. The intensive motor speech treatment SPEAK OUT!® was effective enough to produce improvement in vocal intensity, frequency range, and vocal quality, to expand the vowel space, and to lower VHI scores. Based on the results of this case study, further evaluation of the efficacy of SPEAK OUT!® for improving progressive ataxic dysarthria in people with SCA is required.

Speech Dereverberation using Improved Linear Prediction Residual (개선된 선형예측 잔여를 이용한 음성의 잔향음 제거)

  • Park, Chan-Sub;Kim, Ki-Man;Kang, Suk-Youb
    • Journal of the Korea Institute of Information and Communication Engineering / v.11 no.10 / pp.1845-1851 / 2007
  • Background noise and room reverberation are two causes of speech degradation in listening situations. Many algorithms have been developed to enhance reverberant speech. In this paper, we propose a dereverberation method that enhances speech in reverberant room conditions using a modified linear prediction (LP) residual. The proposed dereverberation method is based on the fact that the significant excitation of the vocal tract system takes place at the instant of glottal closure in voiced speech. The method uses delay information from each sensor and requires reverberant signals from three sensors. A new LP residual signal is obtained from a modified LP residual combination, derived by weighting the LP residual and the Hilbert transform of the LP residual. The coherently added Hilbert envelope contains several large-amplitude spikes caused by noise and reverberation. The resulting estimate of the clean-speech residual is used to excite the time-varying all-pole filter to obtain the enhanced speech. We simulated the proposed algorithm to analyze its performance in a reverberant environment. The proposed algorithm substantially improves the quality of reverberant speech.
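The LP residual at the centre of this method can be sketched in a few lines. The code below covers only the single-channel basics, under stated assumptions: autocorrelation-method linear prediction solved with the Levinson-Durbin recursion, then inverse filtering to obtain the residual. The three-sensor delay alignment, the Hilbert transform weighting, and the resynthesis filter described in the abstract are not reproduced, and the test signal is synthetic.

```python
def autocorrelation(signal, max_lag):
    """Autocorrelation r[0..max_lag] of a finite signal."""
    return [sum(signal[n] * signal[n + lag] for n in range(len(signal) - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Predictor coefficients a[1..order], where s_hat[n] = sum_k a[k] * s[n - k]."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        err *= 1.0 - k * k
    return a[1:]


def lp_residual(signal, coeffs):
    """Inverse filtering: residual e[n] = s[n] - sum_k a[k] * s[n - k]."""
    residual = []
    for n, s in enumerate(signal):
        prediction = sum(c * signal[n - k]
                         for k, c in enumerate(coeffs, start=1) if n - k >= 0)
        residual.append(s - prediction)
    return residual


# Synthetic example: one excitation impulse driving a one-pole "vocal tract".
signal = [0.9 ** n for n in range(200)]
coeffs = levinson_durbin(autocorrelation(signal, 2), 2)
residual = lp_residual(signal, coeffs)
```

For this synthetic signal the residual is concentrated at the sample where the excitation occurred. This is the property the method relies on when it looks for large residual amplitudes around glottal closure instants.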

Search speed improved minimum audio fingerprinting using the difference of Gaussian (가우시안의 차를 이용하여 검색속도를 향상한 최소 오디오 핑거프린팅)

  • Kwon, Jin-Man;Ko, Il-Ju;Jang, Dae-Sik
    • Journal of the Korea Society of Computer and Information / v.14 no.12 / pp.75-87 / 2009
  • This paper describes a method for creating audio fingerprints and comparing them with audio data, and shows how music can be identified from the characteristics of the audio data. The method applies the Difference of Gaussian (DoG), generally used for image recognition, to the audio data to extract the points where the music changes sharply and to define the fingerprint locations. The resulting fingerprint is insensitive to changes in the sound, and the same fingerprint locations as in the original can be extracted from only a portion of the music data. By reducing the amount of fingerprint data and computation, the system is more efficient than existing systems that operate in the frequency domain. With this method, it is possible to identify copyrighted music distributed on the internet or to provide users with meta information about the music.
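A one-dimensional Difference of Gaussians is easy to sketch. The example below smooths a synthetic loudness envelope with a narrow and a wide Gaussian, subtracts the two, and takes the strongest response as a candidate fingerprint location. The sigmas, the envelope, and the function names are assumptions, and the paper's actual fingerprint construction and matching are not shown.

```python
import math


def gaussian_kernel(sigma):
    """Normalized, truncated Gaussian kernel (radius of about 3 sigma)."""
    radius = int(3 * sigma) + 1
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, sigma):
    """Gaussian smoothing with edge samples repeated at the boundaries."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(signal) - 1
    return [sum(w * signal[min(max(i + k - radius, 0), last)]
                for k, w in enumerate(kernel))
            for i in range(len(signal))]


def difference_of_gaussians(signal, sigma_narrow=1.0, sigma_wide=2.0):
    """Narrow-smoothed minus wide-smoothed signal; near zero where nothing changes."""
    narrow = smooth(signal, sigma_narrow)
    wide = smooth(signal, sigma_wide)
    return [a - b for a, b in zip(narrow, wide)]


def strongest_change(signal):
    """Index of the largest absolute DoG response."""
    dog = difference_of_gaussians(signal)
    return max(range(len(dog)), key=lambda i: abs(dog[i]))


envelope = [1.0] * 50 + [5.0] * 50   # synthetic loudness envelope with one abrupt change
peak = strongest_change(envelope)
```

The response is close to zero over steady passages and peaks within a few samples of the abrupt change. This works directly on a time-domain envelope, so locations can be found without a frequency-domain transform.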

The Error Pattern Analysis of the HMM-Based Automatic Phoneme Segmentation (HMM기반 자동음소분할기의 음소분할 오류 유형 분석)

  • Kim, Min-Je;Lee, Jung-Chul;Kim, Jong-Jin
    • The Journal of the Acoustical Society of Korea / v.25 no.5 / pp.213-221 / 2006
  • Phone segmentation of the speech waveform is especially important for concatenative text-to-speech synthesis, which uses segmented corpora to construct synthesis units, because the quality of the synthesized speech depends critically on the accuracy of the segmentation. Originally, phone segmentation was performed manually, but this requires great effort and causes long delays. HMM-based approaches adopted from automatic speech recognition are the most widely used for automatic segmentation in speech synthesis, providing a consistent and accurate phone labeling scheme. Even though the HMM-based approach has been successful, it may place a phone boundary at a different position than expected. In this paper, we categorized adjacent phoneme pairs and analyzed the mismatches between hand-labeled transcriptions and HMM-based labels. We then describe the dominant error patterns that must be corrected for speech synthesis. For the experiment, a hand-labeled standard Korean speech DB from ETRI was used as the reference DB. A time difference larger than 20 ms between the hand-labeled phoneme boundary and the auto-aligned boundary is treated as an automatic segmentation error. Our experimental results for the female speaker revealed that plosive-vowel, affricate-vowel, and vowel-liquid pairs showed high accuracies of 99%, 99.5%, and 99%, respectively, whereas stop-nasal, stop-liquid, and nasal-liquid pairs showed very low accuracies of 45%, 50%, and 55%. Results for the male speaker showed a similar tendency.
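The evaluation criterion described above, a boundary counting as an error when it deviates from the hand label by more than 20 ms, can be sketched as a per-pair accuracy computation. The boundary times and the data layout below are made up for illustration and are not taken from the ETRI DB.

```python
from collections import defaultdict

TOLERANCE_MS = 20   # deviations larger than this count as segmentation errors


def accuracy_by_pair(boundaries):
    """boundaries: iterable of (pair_class, hand_label_ms, auto_aligned_ms)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for pair, hand_ms, auto_ms in boundaries:
        totals[pair] += 1
        if abs(hand_ms - auto_ms) <= TOLERANCE_MS:
            hits[pair] += 1
    return {pair: hits[pair] / totals[pair] for pair in totals}


# Hypothetical boundaries, in milliseconds.
sample = [
    ("plosive-vowel", 100, 105),
    ("plosive-vowel", 300, 320),
    ("stop-nasal", 500, 530),
    ("stop-nasal", 700, 712),
]
accuracy = accuracy_by_pair(sample)
```

Grouping by phoneme-pair class is what lets the dominant error patterns stand out. A single overall accuracy would hide the gap between well-segmented and poorly segmented pair types.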

Patterns of Mother-of-Pearl Craftwork Sketches and the Way of Supply and Demand of the Works in Modern and Contemporary Times (근·현대 나전도안과 공예품의 수급(需給)형태 - 중요무형문화재 제10호 나전장 송방웅 소장 나전도안을 중심으로 -)

  • Lee, Yeon Jae
    • Korean Journal of Heritage: History & Science / v.43 no.3 / pp.334-365 / 2010
  • A mother-of-pearl craftwork sketch covers the whole process of making a piece of work, and therefore records the type, form, size, and patterns of the work. Some sketches still record when and by whom the works were made and who ordered them. This paper seeks to identify the popular types and patterns of the works in each period, as well as the forms of their supply and demand, by examining a collection of approximately 1,700 mother-of-pearl craftwork sketches from the period of Japanese colonization up to the present, owned by Mr. Song Bang-wung, holder of Important Intangible Cultural Heritage No. 10. Typical patterns in the sketches are the hua-jo (花鳥: flowers and birds), the Sakunja (四君子: four gracious plants), cultural treasures, figures from folk tales, 'Su-bok (壽福)' characters, and landscapes. The pattern sketches have changed with the circumstances of Korean society. During the period of Japanese colonization, from the 1920s to the 1940s, the manufacture, supply, and demand of mother-of-pearl craftworks were controlled by the Japanese government. As a result, many of the patterns were adjusted to Japanese taste, and most customers were also Japanese. During the 1950s, after Independence, the American military forces appeared as new customers because of the Korean War, and traditional Korean patterns decorating the accessories favored by American soldiers gained popularity. Foreign mother-of-pearl was imported from the late 1960s to the 1970s. It was bigger and more colorful than Korean mother-of-pearl, which allowed larger sketches and more varied patterns. The most popular pattern in this period was that of cultural treasures, such as images of Buddha, metalcraft works, porcelain, and pagodas. New techniques, such as engraving and rusting, were also introduced. Demand for mother-of-pearl craftworks was high in the 1970s, as people were greatly interested in them. The works were made entirely to order, and there was large demand from diverse organizations, furniture dealers, and individuals. Mother-of-pearl craftwork flourished in the 1970s thanks to the country's economic development and the growth of national income. Mass production of the works became possible, and professional designers who drew patterns were active in this period. The popularity of mother-of-pearl craftworks declined in the 1980s, as built-in and Western-style furniture became prevalent with the shift of housing to apartments. Manufacture nevertheless appears to have revived for a time once the technique of Kunum-jil (끊음질: cutting and attaching) became popular in Tong-young (統營). After the 1990s, however, the making of mother-of-pearl craftworks gradually declined as demand decreased. Today it is barely kept alive by a few artisans.