Search | Korea Science

A Method of Frequency Response Normalization of Smart Phones Based on Deep Neural Networks for Virtual Reality Sound Reconstruction (가상현실 음향 재구성을 위한 심층신경망 기반 스마트폰의 주파수응답 정규화 방법)

Choi, Jaegyu;Yun, Deokgyu;Choi, Seung Ho
- Proceedings of the Korean Society of Broadcast Engineers Conference
- /
- 2017.11a
- /
- pp.19-20
- /
- 2017
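This paper studies frequency response normalization of smartphones for virtual reality (VR) sound reconstruction using user created contents (UCC). When sounds acquired with different smartphones are concatenated, the result sounds unnatural, mainly because each smartphone has a different frequency response. To solve this problem, the frequency responses of the smartphones must be normalized, and this study proposes a method that uses a deep neural network. The input to the deep neural network is the spectrum of the smartphone sound to be processed, and the output is the ratio between this spectrum and the spectrum of the reference smartphone sound. Experiments confirmed, through objective and subjective evaluation, that the naturalness of the sound improved when sound signals acquired with different smartphones were concatenated.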
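As a rough illustration of the input/output pair this abstract describes, the sketch below computes a per-bin spectral ratio for one frame. It is not the authors' implementation: the function names, the Hann window, the FFT size, and the direction of the ratio (reference divided by device) are all assumptions, and the neural network that would learn to predict the ratio is omitted.

```python
import numpy as np

def magnitude_spectrum(frame, n_fft=512):
    """Magnitude spectrum of one Hann-windowed frame (window and n_fft are assumptions)."""
    frame = np.asarray(frame, dtype=float)
    window = np.hanning(len(frame))
    return np.abs(np.fft.rfft(frame * window, n=n_fft))

def ratio_target(device_frame, reference_frame, n_fft=512, eps=1e-8):
    """Return (network input, training target) for a pair of time-aligned frames.

    Assumption: the target is reference / device, so that multiplying the
    device spectrum by the ratio moves it toward the reference phone.
    """
    device_mag = magnitude_spectrum(device_frame, n_fft)
    reference_mag = magnitude_spectrum(reference_frame, n_fft)
    return device_mag, reference_mag / (device_mag + eps)

def apply_ratio(device_mag, ratio):
    """Normalize a device spectrum with a (predicted or measured) ratio."""
    return device_mag * ratio
```

With a measured ratio, `apply_ratio` recovers the reference magnitude spectrum almost exactly. In the setting the abstract describes, a trained network would supply the ratio when no reference recording is available.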

Performance Improvement of Korean Connected Digit Recognition Based on Acoustic Parameters (음향학적 파라메터를 이용한 한국어 연결숫자인식의 성능개선)

Kim Seunghi;Kim Hyung Soon
- Proceedings of the Acoustical Society of Korea Conference
- /
- spring
- /
- pp.44-47
- /
- 1999
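This paper proposes using acoustic parameters to raise the recognition rate in Korean connected digit recognition by improving the discriminability between models. Based on phonetic knowledge, the proposed method uses the logarithm of energy ratios between appropriate frequency bands as additional feature parameters. Experiments confirmed that the proposed method reduced the error rate by up to about $46\%$ compared with the baseline recognition system. Applying a channel compensation technique together with it yielded an error rate reduction of about $69\%$.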

A Study on Determining the Transmission Loss of Water-Borne Noise Silencer in a Sea-Connected Piping System (해수연결 배관계 소음감소기의 투과손실 측정에 관한 연구)

Park, Kyung-Hoon
- The Journal of the Acoustical Society of Korea
- /
- v.26 no.6
- /
- pp.286-292
- /
- 2007
The dominant source of noise in a sea-connected piping system is usually the seawater cooling pump, which circulates seawater so that onboard equipment can operate normally, so its water-borne noise, which contains tonal frequency components, should be reduced with suitable silencers. To obtain the transmission loss of water-borne noise silencers experimentally, this paper proposes a transfer function technique in which the acoustic wave in the piping system is decomposed into its incident and transmitted components when reflection exists at the termination of the system. Good agreement between theory and the proposed technique in the frequency range of interest demonstrates the validity of the technique.
https://doi.org/10.7776/ASK.2007.26.6.286 KSCI

An algorithm of the Non-uniform synthesis unit selection for concatenative speech synthesis system (연결형 합성시스템을 위한 문맥종속 단위 기반의 비정형 합성단위 추출 알고리즘)

김영일
- Proceedings of the Acoustical Society of Korea Conference
- /
- 1998.06e
- /
- pp.273.2-277
- /
- 1998
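For phoneme-based non-uniform concatenative synthesis, this paper designs a model that predicts the boundary strength between neighboring phonemes and a phoneme-level longest-match search algorithm for retrieving synthesis units, so that formant discontinuities at the junctions are minimized. To minimize signal distortion at the joins between synthesis units, models were built for the case where a consonant is voiced in the “_C_” environment, the case where a vowel is devoiced in the “_V_” environment, and the formant frequency difference between voiced sounds, so that points where the articulatory strength between phonemes is weak are set as synthesis unit boundaries. Once the synthesis unit boundaries are determined, candidates are selected from the corpus using only the context information of the given sentence. To measure the connectivity between the selected candidates, the phonetic characteristics and formant transition characteristics of the phonemes before and after the synthesis boundary were considered. The experiments were based on 200 K-ToBI-labeled sentences: one sentence was selected from the corpus as the target pattern, and the optimal sequence of synthesis units was extracted by computing the unit cost between the target pattern and the candidates and the concatenation cost between candidates. This paper reports on this synthesis unit extraction algorithm based on context-dependent units and on the experimental results.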

Acoustic characteristics of sound field in partially opened rooms - Emphasis on horizontal coupling of diffuse and non-diffuse field - (실내공간의 부분적 개방에 따른 음향특성변화 I - 확산음장과 자유음장의 수평적 결합을 중심으로 -)

Jeong, Dae-Up;Choi, Young-Ji;Kim, Ji-Young;Oh, Yang-Ki;Choi, Seok-Ju
- Journal of Korean Association for Spatial Structures
- /
- v.7 no.2 s.24
- /
- pp.105-114
- /
- 2007
Large span spaces have been widely used for various purposes including sports events in other countries. Due to increasing demand for multi-purpose use of such spaces, recently built large span sports facilities such as dome stadiums have been required to accommodate performing events as well as sports events. Retractable ceilings and/or walls have also generally been adopted in those spaces to provide various event conditions. It seems obvious that openings between diffuse fields and the free field may cause difficulties in the acoustic design of such spaces. In the present work, the acoustic characteristics of the non-diffuse field were investigated using 1/10 acoustic scale models. It was found that RTs at low and mid frequencies decayed faster than those at high frequencies as the percentage of opening area increased. The decay rate of RTs at high frequencies was not influenced by increasing the area of an opening. A high dependence of EDT on the percentage of opening area was also observed in all frequency ranges. D50 was not improved by increasing the area of an opening up to 12.5%, and then increased sharply. Applying Schroeder integration to evaluate the reverberation characteristics of coupled spaces may not be appropriate, since the non-exponential decay process of those spaces, with double or triple decays, cannot be properly defined. EDT seems more appropriate for predicting reverberation at the early acoustic design stage of acoustically coupled spaces.
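For readers unfamiliar with the two quantities compared in this abstract, the following is a minimal sketch of Schroeder backward integration and of an early decay time (EDT) estimate taken from the first 10 dB of the decay curve. It is a generic textbook formulation, not the authors' measurement procedure; the function names and the simple least-squares fit are assumptions.

```python
import numpy as np

def schroeder_decay_db(impulse_response):
    """Energy decay curve in dB via Schroeder backward integration."""
    energy = np.asarray(impulse_response, dtype=float) ** 2
    # Integrate the squared response from each instant to the end.
    edc = np.cumsum(energy[::-1])[::-1]
    # Floor the normalized curve so trailing zeros do not produce log(0).
    return 10.0 * np.log10(np.maximum(edc / edc[0], 1e-30))

def early_decay_time(impulse_response, fs):
    """EDT in seconds: slope of the 0 to -10 dB range, extrapolated to -60 dB."""
    edc_db = schroeder_decay_db(impulse_response)
    t = np.arange(len(edc_db)) / fs
    mask = edc_db >= -10.0
    slope, _ = np.polyfit(t[mask], edc_db[mask], 1)  # slope in dB per second
    return -60.0 / slope
```

For a single exponential decay, the EDT matches the reverberation time. In a coupled space with double or triple decay slopes, the straight-line fit over different parts of the curve gives different values, which is the difficulty the abstract points to.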

Performance Improvement of Korean Connected Digit Recognition Based on Acoustic Parameters (음향학적 파라메터를 이용한 한국어 연결숫자인식의 성능개선)

김승희;김형순
- The Journal of the Acoustical Society of Korea
- /
- v.18 no.5
- /
- pp.58-62
- /
- 1999
This paper proposes the use of acoustic parameters to improve the discriminability among digit models in Korean connected digit recognition. Based on acoustic-phonetic knowledge, the proposed method uses the logarithm of the energy ratio between predetermined frequency bands as additional feature parameters. Experimental results show that the proposed method reduced the error rate by 46% compared with the baseline system. Incorporating a channel compensation technique into the proposed method yielded an error reduction of about 69%.
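The feature described in this abstract, the logarithm of an energy ratio between two frequency bands, can be sketched for a single frame as below. The band edges, the Hamming window, and the function names are illustrative assumptions; the abstract does not state which bands the authors chose.

```python
import numpy as np

def band_energy(power_spectrum, freqs, low_hz, high_hz):
    """Sum of power-spectrum bins whose frequency lies in [low_hz, high_hz)."""
    mask = (freqs >= low_hz) & (freqs < high_hz)
    return power_spectrum[mask].sum()

def log_band_energy_ratio(frame, fs, band_a, band_b, eps=1e-12):
    """log(E_a / E_b) for one frame; band_a and band_b are (low_hz, high_hz) tuples."""
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    e_a = band_energy(power, freqs, *band_a)
    e_b = band_energy(power, freqs, *band_b)
    # eps keeps the ratio finite when a band is silent.
    return np.log((e_a + eps) / (e_b + eps))
```

A frame dominated by low-frequency energy gives a positive value for `band_a=(0, 1000)` and `band_b=(2000, 4000)`, and swapping the bands flips the sign. In a recognizer, such values would be appended to the usual per-frame feature vector.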

The Optimal and Complete Prompts Lists Generation Algorithm for Connected Spoken Word Speech Corpus (연결 단어 음성 인식기 학습용 음성DB 녹음을 위한 최적의 대본 작성 알고리즘)

유하진
- The Journal of the Acoustical Society of Korea
- /
- v.23 no.2
- /
- pp.187-191
- /
- 2004
This paper describes an efficient algorithm to generate compact and complete prompts lists for connected spoken words speech corpus. In building a connected spoken digit recognizer, we have to acquire speech data in various contexts. However, in many speech databases the lists are made by using random generators. We provide an efficient algorithm that can generate compact and complete lists of digits in various contexts. This paper includes the proof of optimality and completeness of the algorithm.
KSCI

Residual Convolutional Recurrent Neural Network-Based Sound Event Classification Applicable to Broadcast Captioning Services (자막방송을 위한 잔차 합성곱 순환 신경망 기반 음향 사건 분류)

Kim, Nam Kyun;Kim, Hong Kook;Ahn, Chung Hyun
- Proceedings of the Korean Society of Broadcast Engineers Conference
- /
- 2021.06a
- /
- pp.26-27
- /
- 2021
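This paper proposes a sound event classification technique based on a residual convolutional recurrent neural network as a way of understanding broadcast content in order to provide captioned broadcasting. The proposed technique connects a residual convolutional neural network to a recurrent neural network. Mel-filterbank features are used as the network input, and the residual convolutional neural network consists of one stem block and five residual convolutional networks. Each residual convolutional network combines a convolutional network built with residual learning and a convolutional block attention module, which improves the representational power of the feature maps compared with a conventional convolutional network. The extracted feature maps are fed to the recurrent neural network and finally to a fully connected layer that extracts the sound event class and timing information. The proposed model was trained on the basis of a mean teacher model, which can make use of unlabeled data. The DCASE 2020 Challenge Task 4 dataset was used to evaluate the proposed model, which achieved an event-based F1-score of 46.8%.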

A Study on the High-speed Processing of Connectionless Data in BISDN (광대역 정보통신망에서 비연결형 데이터의 고속처리에 관한연구)

이완범;김종협;김환용
- The Journal of the Acoustical Society of Korea
- /
- v.17 no.5
- /
- pp.63-68
- /
- 1998
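A streaming-mode connectionless server that uses the direct provision method suited to the broadband integrated services digital network (B-ISDN) must carry out cell transmission, reception, and lookup within the transmission time of a single cell. Because of this time constraint, it has the drawback of losing many cells when burst traffic occurs. This paper therefore proposes DBLCAM so that the streaming-mode connectionless server of an ATM network can process data at high speed and reduce cell loss. It also designs a forwarding table, the VPC map, which outputs the connection number for an input VPI (Virtual Path Identifier)/VCI (Virtual Channel Identifier), using the proposed DBLCAM and dual-port SRAM.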

Performance Improvement of Connected Digit Recognition by Considering Phoneme Variations in Korean Digit (한국어 숫자음에서의 음운변화를 고려한 연결숫자 인식의 성능향상)

Song Myung Gyu;Kim Hyung Soon
- Proceedings of the Acoustical Society of Korea Conference
- /
- autumn
- /
- pp.105-108
- /
- 2001
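Each Korean digit is a single syllable, and when digits are spoken continuously, coarticulation between adjacent digits changes the inherent pronunciation of each digit and also blurs the boundaries between them. Meanwhile, although the recognition system expects continuous digit utterances, some users pronounce the digits in isolation. This degrades performance in a recognition system that considers only the phonological phenomena of connected digits. To improve connected digit recognition, this paper defines allophone groups that take the phoneme variations of Korean digits into account and examines ways of constructing the recognition network so that it can absorb the varied phonological changes arising from users' different speaking styles. In speaker-independent recognition experiments on four-digit strings recorded over the telephone network, recognition performance for '이' (two), '오' (five), and '일' (one), which are frequently misrecognized among Korean digits, improved by $4.2\%$, $4.2\%$, and $2.9\%$ respectively, and recognition speed also improved by $33\%$.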

