Search | Korea Science

Performance Enhancement of SBC for Voice Signal Using Adaptive Postfiltering at the Medium Bit Rate (중간 전송율에서 적응 포스트 필터링을 이용한 음성용 SBC의 성능 향상)

김원구;이남걸;윤대희;차일환
- The Journal of Korean Institute of Communications and Information Sciences / v.17 no.2 / pp.121-131 / 1992
In this paper, three methods are studied to enhance the performance of SBC (Sub-Band Coding) schemes for voice signals at medium bit rates between 12 kbps and 16 kbps, and adaptive postfiltering using human auditory characteristics is done at the decoder output. First, GQMF (Generalized Quadrature Mirror Filter) is used instead of QMF (Quadrature Mirror Filter) to obtain better performance. Second, by adaptive bit allocation to each sub-band, speech quality is enhanced and variable rate coding is possible. Third, a comparison study of the coder performance using APCM (Adaptive Pulse Code Modulation) and ADPCM (Adaptive Differential Pulse Code Modulation) indicates that SB-APCM performs better than the other. Adaptive postfiltering at the decoder output enhances the quality of the coded speech. The two proposed postfiltering methods decrease the noise sufficiently at the expense of only a low computational load.
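
The adaptive bit allocation mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state the allocation rule, so the greedy scheme below is an assumption: it uses the common rule of thumb that each extra bit lowers a band's quantization noise power by about 6 dB (a factor of 4). The function name `allocate_bits` and the parameter `max_bits_per_band` are hypothetical and do not come from the paper.

```python
def allocate_bits(band_energies, total_bits, max_bits_per_band=8):
    """Greedy bit allocation across sub-bands (illustrative assumption).

    At each step one bit goes to the band whose estimated quantization
    noise is currently largest; that band's noise estimate is then
    divided by 4 (about 6 dB per bit).
    """
    bits = [0] * len(band_energies)
    # With zero bits, the noise estimate of a band equals its energy.
    noise = [float(e) for e in band_energies]
    for _ in range(total_bits):
        candidates = [i for i in range(len(bits)) if bits[i] < max_bits_per_band]
        if not candidates:
            break  # every band is already at its cap
        worst = max(candidates, key=lambda i: noise[i])
        bits[worst] += 1
        noise[worst] /= 4.0
    return bits


if __name__ == "__main__":
    # Hypothetical energies for three sub-bands, with 4 bits to distribute.
    print(allocate_bits([16, 4, 1], total_bits=4))
```

Bands with more energy receive more bits, and a band with very little energy may receive none, which is what allows the total rate to vary with the signal.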

Music Genre Classification using Spikegram and Deep Neural Network (스파이크그램과 심층 신경망을 이용한 음악 장르 분류)

Jang, Woo-Jin;Yun, Ho-Won;Shin, Seong-Hyeon;Cho, Hyo-Jin;Jang, Won;Park, Hochong
- Journal of Broadcast Engineering / v.22 no.6 / pp.693-701 / 2017
In this paper, we propose a new method for music genre classification using spikegram and deep neural network. The human auditory system encodes the input sound in the time and frequency domain in order to maximize the amount of sound information delivered to the brain using minimum energy and resource. Spikegram is a method of analyzing waveform based on the encoding function of auditory system. In the proposed method, we analyze the signal using spikegram and extract a feature vector composed of key information for the genre classification, which is to be used as the input to the neural network. We measure the performance of music genre classification using the GTZAN dataset consisting of 10 music genres, and confirm that the proposed method provides good performance using a low-dimensional feature vector, compared to the current state-of-the-art methods.
https://doi.org/10.5909/JBE.2017.22.6.693 KSCI KPUBS

Pornographic Content Detection Scheme Using Bi-directional Relationships in Audio Signals (음향 신호의 양방향적 연관성을 고려한 유해 콘텐츠 검출 기법)

Song, KwangHo;Kim, Yoo-Sung
- The Journal of the Korea Contents Association / v.20 no.5 / pp.1-10 / 2020
In this paper, we propose a new pornographic content detection scheme that uses bi-directional relationships between neighboring auditory signals in order to accurately detect sound-centered obscene content that is rapidly spreading via the Internet. To capture the bi-directional relationships between neighboring signals, we design a multilayered bi-directional dilated-causal convolution network by stacking several dilated-causal convolution blocks, each of which performs bi-directional dilated-causal convolution operations. To verify the performance of the proposed scheme, we compare its accuracy with that of two previous schemes: one uses simple auditory feature vectors with a support vector machine, and the other uses only the forward relationships in audio signals through a stack of dilated-causal convolution layers. As a result, the proposed scheme achieves an accuracy of up to 84.38%, which is up to 25.80% higher than the two comparison schemes.
https://doi.org/10.5392/JKCA.2020.20.05.001 KSCI
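
As a rough illustration of the bi-directional dilated-causal convolution named in the abstract above, the sketch below applies one dilated-causal filter to a 1-D signal in the forward direction and again on the time-reversed signal. The abstract does not describe how the two directions are combined, how many blocks are stacked, or which nonlinearities are used, so summing the two outputs is an assumption, and both function names are hypothetical.

```python
def dilated_causal_conv(x, weights, dilation):
    """y[t] = sum_k weights[k] * x[t - k * dilation], zero-padded on the left.

    Causal: the output at time t depends only on x[t] and earlier samples.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # samples before the start count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def bidirectional_dilated_conv(x, weights, dilation):
    """Combine a forward pass with a pass over the reversed signal.

    Summing the two directions is an assumption made for this sketch.
    """
    forward = dilated_causal_conv(x, weights, dilation)
    # Running the causal filter on the reversed signal and reversing the
    # result makes each output depend on the current and *later* samples.
    backward = dilated_causal_conv(x[::-1], weights, dilation)[::-1]
    return [f + b for f, b in zip(forward, backward)]


if __name__ == "__main__":
    print(bidirectional_dilated_conv([1, 2, 3, 4], [1, 1], dilation=2))
```

The forward pass alone corresponds to the forward-only baseline the abstract compares against; the backward pass adds context from later samples.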

Comparative Study of Functional Magnetic Resonance Imaging by Global Scaling Analysis (Global Scaling 분석방법에 따른 기능적 자기공명영상의 비교 연구)

Yoo, Dong-Soo
- Investigative Magnetic Resonance Imaging / v.10 no.1 / pp.26-31 / 2006
Purpose: To evaluate the effect of global scaling analysis on brain activation in sensory and motor functional MR imaging studies. Materials and methods: Four normal subjects with no history of neurological abnormality were included. Arm extension-flexion movement was used for motor function and 1 kHz pure tone stimulation was used for auditory function. Functional magnetic resonance imaging was performed on a 3T MRI system (GE, Milwaukee, USA) using the BOLD-EPI technique, and SPM2 was employed for data analysis. In the data analysis, brain activation images were obtained with and without global scaling while other parameters such as motion correction and realignment were held fixed. Results: The difference in brain activation between no scaling and global scaling was not large in the case of right upper extremity movement (p<0.000001). For the auditory test, brain activation with global scaling was larger than without global scaling (p<0.05). Conclusion: Caution is needed when analyzing functional imaging data with global scaling, especially in functional studies with small local BOLD signal changes.

A Perceptual Audio Coder Based on Temporal-Spectral Structure (시간-주파수 구조에 근거한 지각적 오디오 부호화기)

김기수;서호선;이준용;윤대희
- Journal of Broadcast Engineering / v.1 no.1 / pp.67-73 / 1996
In general, high quality audio coding (HQAC) combines conventional data compression techniques with models of human perception. The primary auditory characteristic applied to HQAC is the masking effect in the spectral domain. Therefore, spectral techniques such as subband coding or transform coding are widely used[1][2]. However, no effort has yet been made to apply the temporal masking effect and temporal redundancy removal in HQAC. The audio data compression method proposed in this paper eliminates statistical and perceptual redundancies in both the temporal and spectral domains. The transformed audio signal is divided into packets, each of which consists of 6 frames. A packet contains 1536 samples ($256 \times 6$), and the redundancies in a packet reside in both the temporal and spectral domains. Both redundancies are eliminated at the same time in each packet. The psychoacoustic model has been improved to give finer results by taking into account temporal masking as well as fine spectral masking. For quantization, each packet is divided into subblocks designed to resemble the nonlinear critical bands and to reflect the temporal auditory characteristics. Consequently, high quality of the reconstructed audio is preserved at low bit rates.

Construction and Operation of a 37-channel Hemispherical Magnetoencephalogram System (37채널 반구형 뇌자도 측정장치 제작 및 동작)

이용호;김진목;권혁찬;김기웅;박용기;강찬석;이순걸
- Journal of Biomedical Engineering Research / v.24 no.3 / pp.159-165 / 2003
We developed a 37-channel magnetoencephalogram (MEG) measurement system based on low-noise superconducting quantum interference device (SQUID) magnetometers, and operated the system to measure MEG signals. By using double relaxation oscillation SQUIDs with high flux-to-voltage transfers, the SQUID outputs could be measured directly by room temperature preamplifiers, and compact readout circuits were used for SQUID operation. The average field noise level of the magnetometers is about 3 fT/√Hz in the white region, low enough for MEG measurements when operated inside a magnetically shielded room. The 37 magnetometers were distributed on a hemispherical surface having a radius of 125 mm. In addition to the 37 sensing channels, 11 reference channels were installed to pick up external noise and to form software gradiometers. A low-noise liquid helium dewar was fabricated with a liquid capacity of 30 L and a boil-off rate of 4 L/d. The signal processing software consists of digital filtering, software gradiometer, isofield mapping and source localization. By using the developed system, we measured auditory-evoked fields and localized the current dipoles, demonstrating the effectiveness of the system.
KSCI

Abstract of the Invited Lecture by Professor John Wells

Wells, John
- MALSORI / no.15_18 / pp.71-80 / 1989
It is an honour to be speaking on phonetics at the invitation of the Phonetic Society of Korea. Through the Korean Hangul script, invented in the fifteenth century at the instigation of the great King Sejong, and the work Hunminjeongeum which describes it, this country has an important place in the world history of phonetics. Phonetics is the description and analysis of pronunciation. Spoken language can be investigated at three points: in the speaker (articulatory phonetics), in the hearer (auditory phonetics), and in the physical speech signal (acoustic phonetics)... Beginners in English whose mother tongue is Korean have to learn to make the sound 'f' as in "coffee", which is a voiceless labio-dental fricative, lip on upper teeth. They also have to learn to make the [θ] sound in "think", a voiceless dental fricative.

Robust Audio Watermarking Using HAS and Neural Network (신경망과 HAS을 이용한 강인한 오디오 워터마킹 알고리즘)

Jung, Se-Won;Piao, Cheng-Ri;Han, Seung-Soo
- Proceedings of the KIEE Conference / 2006.07d / pp.2101-2102 / 2006
In this paper, a new digital audio watermarking algorithm is presented. The proposed algorithm embeds a watermark into an audio signal based on the human auditory system (HAS). It is a blind audio watermarking method, which does not require any prior information during the watermark extraction process. The algorithm finds the watermarking position using the time-domain masking effect. First we insert the watermark in the wavelet domain, and then we use a back-propagation neural network (BPN) to learn the characteristics of the relationship between the watermark and the watermarked audio. Due to the learning and adaptive capabilities of the BPN, false recovery of the watermark can be greatly reduced by the trained BPN. Experimental results show that the proposed method has good inaudibility and high robustness to common audio processing attacks.

The Implementation of Real-time Broadcast Synchronizing System Using Audio Watermark (오디오 워터마크를 이용한 실시간 방송동기화시스템의 구현)

Shin Dong-Hwan;Kim Jong-Weon
- The Transactions of the Korean Institute of Electrical Engineers D / v.54 no.12 / pp.716-722 / 2005
In this paper, we propose an audio watermarking algorithm based on the critical bands of the HAS (human auditory system) that does not audibly affect the quality of the watermarked audio, and we implement the detection algorithm on a BSS (broadcast synchronizing system) to test the proposed algorithm. According to the objective audio quality test, the SNR (signal to noise ratio) of the watermarked audio is above 66 dB. In the robustness test, the proposed algorithm detects more than 90% of the watermarks after various compression schemes (MP3, AAC), A/D and D/A conversions, sampling rate conversions and, in particular, asynchronization attacks. The BSS automatically switches programs between the key station and the local station in a broadcasting system. In a reliability test of the implemented system using real broadcast audio, no false positive error occurred during 30 days. Because detection is performed once every 0.5 seconds, we judge that false positive errors do not occur.
KSCI

H/W Implementation of Speech Processor for Cochlear Implant (청각보철장치용 어음발췌기의 하드웨어 구현)

Shin, J.I.;Park, S.H.
- Proceedings of the KOSOMBE Conference / v.1998 no.11 / pp.161-162 / 1998
In this paper, a speech processor, which is the most important part of a cochlear implant, is developed to restore auditory ability to patients with sensorineural disorders whose inner ear is damaged. The system consists of an analog and a digital signal processing part, which perform the pre-processing and the main processing, respectively. The main processing is performed in software on a DSP processor (TMS320C31-40). Because the processing is implemented as a program, the system can be adapted very easily to the individual condition of each patient.

Search Result 176, Processing Time 0.019 seconds