• Title/Abstract/Keywords: Voice signal

433 search results (processing time: 0.029 s)

Study of Information Transmission System for Visually Impaired

  • 이서영;최종엽;안상준;김정훈;박용욱
    • 한국전자통신학회논문지 / Vol. 12, No. 6 / pp.1227-1232 / 2017
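  • This study investigates a system that uses pressure sensors and ultrasonic sensors to reduce the inconvenience visually impaired people face when using traffic information and to lower the accident rate at intersections. A pressure sensor lights a strip LED so that drivers can recognize pedestrians, which reduces the accident rate. In addition to pedestrian signal information, bus location information is sent to an application via an ultrasonic sensor and Bluetooth so that it can be heard as voice.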

Semantic Ontology Speech Recognition Performance Improvement using ERB Filter

  • 이종섭
    • 디지털융복합연구 / Vol. 12, No. 10 / pp.265-270 / 2014
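  • Existing speech recognition algorithms do not fix the order among vocabulary items and detect speech inaccurately because of noise introduced by changes in the recognition environment, while search systems fail to recognize accurate information because keywords carry diverse meanings. This study proposes an event-based semantic ontology inference model and, within the proposed system, builds a model that uses an ERB filter to extract speech recognition features. For performance evaluation, subway station and subway noise were used. Noise removal was performed on signals at SNRs of -10 dB and -5 dB and the distortion was measured, confirming performance improvements of 2.17 dB and 1.31 dB, respectively.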
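
The abstract does not say which ERB formulation the authors use. As a rough illustration only, the sketch below assumes the widely used Glasberg-Moore approximation of the equivalent rectangular bandwidth (ERB) and spaces filter centre frequencies evenly on the ERB-rate scale. The function names and the 100-4000 Hz range are hypothetical choices, not taken from the paper.

```python
import math


def erb_bandwidth_hz(center_hz: float) -> float:
    """ERB (Hz) at a centre frequency, Glasberg-Moore approximation (assumed)."""
    return 24.7 * (4.37 * center_hz / 1000.0 + 1.0)


def erb_rate(center_hz: float) -> float:
    """Map a frequency in Hz to the ERB-rate scale."""
    return 21.4 * math.log10(4.37 * center_hz / 1000.0 + 1.0)


def erb_rate_to_hz(rate: float) -> float:
    """Inverse of erb_rate."""
    return (10.0 ** (rate / 21.4) - 1.0) * 1000.0 / 4.37


def erb_center_frequencies(low_hz: float, high_hz: float, n: int) -> list:
    """Return n >= 2 centre frequencies equally spaced on the ERB-rate scale."""
    if n < 2:
        raise ValueError("need at least two filters")
    lo, hi = erb_rate(low_hz), erb_rate(high_hz)
    step = (hi - lo) / (n - 1)
    return [erb_rate_to_hz(lo + i * step) for i in range(n)]


if __name__ == "__main__":
    # Hypothetical 8-band filter bank covering a speech-like range.
    for fc in erb_center_frequencies(100.0, 4000.0, 8):
        print(f"fc = {fc:8.1f} Hz, ERB = {erb_bandwidth_hz(fc):6.1f} Hz")
```

Spacing the bands on the ERB-rate scale gives narrower filters at low frequencies and wider ones at high frequencies, which is the usual motivation for auditory-style feature extraction.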

A Study on the Audio Routing Processing for Aircraft Intercom Considering Reusability

  • 이승목
    • 항공우주시스템공학회지 / Vol. 11, No. 6 / pp.1-9 / 2017
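  • Aircraft intercom equipment mixes and distributes the audio transmitted by various LRUs and plays back situational awareness messages, and thus plays a major role in helping pilots carry out their missions smoothly. In particular, audio routing, which mixes and distributes the received audio, applies On/Off control to each received audio channel and sends audio to the interfacing LRUs. It is a critical function for mission performance because it disseminates and shares mission status. Audio routing involves a wide variety of interface signals, which produce many combinations and complicate exception handling. This lowers cohesion and raises coupling, which in turn reduces maintainability and reusability. To prevent this, the paper presents an efficient way to process audio routing for aircraft intercoms that minimizes the impact of software changes and improves reusability and maintainability.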

The Investigation of the Leased Line Modem Usability in the Wireless Internet Protocol Network

  • 박민호;백해현;금동원;최형석;이종성
    • 한국군사과학기술학회지 / Vol. 18, No. 4 / pp.423-431 / 2015
  • The usability of a leased line modem was evaluated and investigated in a wireless internet protocol (IP) network. The modem signal in the circuit switching network was translated into IP packets using several voice codecs (PCM, G.711 A-law, G.711 μ-law, etc.) and transmitted through the wireless IP network. The wireless IP network was simulated with the Tactical information and communication network Modeling and simulation Software (TMS). The performance and usability of the leased line modem were simulated using the system-in-the-loop (SITL) function of TMS with respect to packet delay, jitter, packet discard ratio, codecs, and wireless link BER.
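
To make the codec step concrete, here is a minimal sketch of the continuous μ-law companding curve with μ = 255. It is an illustration under stated assumptions: the actual G.711 μ-law codec uses a piecewise-linear 8-bit approximation of this curve, and the function names below are hypothetical, not part of TMS or the paper.

```python
import math

MU = 255.0  # companding parameter used by mu-law telephony


def mu_law_compress(x: float) -> float:
    """Compress a sample in [-1, 1] with the continuous mu-law curve."""
    if not -1.0 <= x <= 1.0:
        raise ValueError("sample must lie in [-1, 1]")
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y: float) -> float:
    """Invert mu_law_compress for a value in [-1, 1]."""
    if not -1.0 <= y <= 1.0:
        raise ValueError("value must lie in [-1, 1]")
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


if __name__ == "__main__":
    # Small amplitudes get a much larger share of the output range.
    for sample in (0.01, 0.1, 0.5, 1.0):
        print(f"{sample:5.2f} -> {mu_law_compress(sample):.4f}")
```

The logarithmic curve allocates more resolution to quiet samples, which suits speech. Whether it suits modem signals carried over the same codec is the kind of question the study examines.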

A Study on Traffic Light Detection (TLD) as an Advanced Driver Assistance System (ADAS) for Elderly Drivers

  • Roslan, Zhafri Hariz;Cho, Myeon-gyun
    • International Journal of Contents / Vol. 14, No. 2 / pp.24-29 / 2018
  • In this paper, we propose an efficient traffic light detection (TLD) method as an advanced driver assistance system (ADAS) for elderly drivers. Because the rise in traffic accidents is associated with an aging population, and the growing number of elderly drivers poses a serious social problem, providing ADAS for older drivers via TLD is becoming a necessary public service. We therefore propose an economical TLD method that can be implemented with a simple black box (built-in camera) and a smartphone in the near future. The system uses a color pre-processing method to differentiate between the stop and go signals. A mathematical morphology algorithm further enhances the traffic light detection, and a circular Hough transform is used to detect the traffic light correctly. Simulation results of the proposed computer vision and image processing algorithm in Matlab show that the proposed TLD method can detect the stop and go signals from traffic lights not only in the daytime but also at night. In the future, it will be possible to reduce the traffic accident rate by recognizing the traffic signal and informing elderly drivers by voice of how to drive.
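
The color pre-processing step can be pictured with a small sketch. The paper's implementation is in Matlab and its thresholds are not given in the abstract, so everything below is an assumption for illustration: the HSV hue ranges, the saturation and brightness limits, and the function names are hypothetical. The morphology and circular Hough stages are not shown.

```python
import colorsys
from collections import Counter


def classify_pixel(r, g, b, min_saturation=0.5, min_value=0.5):
    """Label an 8-bit RGB pixel as 'stop', 'go', or None (assumed thresholds)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    if s < min_saturation or v < min_value:
        return None  # too grey or too dark to be a lit lamp
    hue_deg = h * 360.0
    if hue_deg <= 20.0 or hue_deg >= 340.0:
        return "stop"  # reddish hues
    if 90.0 <= hue_deg <= 180.0:
        return "go"  # greenish hues
    return None


def dominant_signal(pixels):
    """Return the most common label among the pixels, or None if none qualify."""
    votes = Counter(
        label for label in (classify_pixel(*p) for p in pixels) if label
    )
    return votes.most_common(1)[0][0] if votes else None


if __name__ == "__main__":
    candidate_region = [(250, 20, 30), (240, 10, 10), (40, 40, 40)]
    print(dominant_signal(candidate_region))
```

Working in HSV rather than RGB separates hue from brightness, which is one common way to keep a color test usable in both daytime and night scenes.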

Effective timing synchronization methods for femtocell

  • 신준효;김정훈;정석종
    • 한국정보통신설비학회:학술대회논문집 / 한국정보통신설비학회 2008년도 정보통신설비 학술대회 / pp.237-241 / 2008
  • Femtocells are cellular access points that connect to a mobile operator's network using residential DSL or cable broadband connections. They have been developed to work with a range of cellular standards, including CDMA, GSM, and UMTS. As with legacy base stations, frequency accuracy and phase alignment are necessary at the femtocell to ensure quality of service (QoS) for applications such as voice, real-time video, wireless hand-off, and data over a converged access medium. However, GPS is problematic for femtocells because it is difficult to set up, depends on satellite conditions, and is very expensive. Several techniques have therefore been discussed as alternatives to legacy GPS. NTP, PTP, and Synchronous Ethernet use Ethernet to synchronize distributed clocks in packet networks. A-GPS provides more reliable position information than legacy GPS in poor signal conditions. These methods also have some problems, so hybrid timing methods such as A-GPS+PTP and TV+GPS were developed to make up for the weak points of GPS. This paper introduces each method, compares them, and proposes a better solution for timing synchronization at the femtocell.

A study imitating human auditory system for tracking the position of sound source

  • 배진만;조선호;박종국
    • 대한전기학회:학술대회논문집 / 대한전기학회 2003년도 학술회의 논문집 정보 및 제어부문 B / pp.878-881 / 2003
  • To acquire a designated speaker's clear voice signal from a surveillance camera, video conference, or hands-free microphone while eliminating interfering noise, the speaker's position must first be estimated automatically. The basic algorithm for sound source localization measures the TDOA (Time Difference Of Arrival) of the same signal reaching two microphones. This work uses ADF (Adaptive Delay Filter) [4] and CPS (Cross Power Spectrum) [5], two of the most important TDOA analysis methods. Based on these analyses, it proposes real-time sound source localization and an improved model, NI-ADF, that makes it possible to estimate the source position in both directions. NI-ADF builds on the observation that the human auditory system accepts a sound through activated nerves when it exceeds a certain level at a certain frequency. NI-ADF also provides a practical algorithm for real-time localization in both directions: when the microphones are mounted on a given system, it uses the sound level difference caused by diffraction of sound around that system. Existing bidirectional adaptive filter algorithms require more than twice as many computations as measuring in one direction. To remedy this weak point, this work proposes an improved algorithm for real-time estimation in both directions.
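
The TDOA idea can be sketched with a plain time-domain cross-correlation between two microphone signals. This is a simpler stand-in, not the ADF, CPS, or NI-ADF methods named in the abstract. The sampling rate, microphone spacing, speed of sound, and function names are all assumptions chosen for illustration.

```python
import math


def estimate_delay_samples(x, y, max_lag):
    """Return the lag (in samples) of y relative to x that maximizes
    the cross-correlation, searched over [-max_lag, max_lag]."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, xi in enumerate(x):
            j = i + lag
            if 0 <= j < len(y):
                score += xi * y[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_to_angle_deg(lag, sample_rate_hz, mic_distance_m, speed_of_sound=343.0):
    """Convert a delay to a far-field arrival angle from broadside."""
    ratio = speed_of_sound * (lag / sample_rate_hz) / mic_distance_m
    ratio = max(-1.0, min(1.0, ratio))  # guard against rounding past +/-1
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    mic_a = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
    mic_b = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0]  # same pulse, arriving later
    lag = estimate_delay_samples(mic_a, mic_b, max_lag=5)
    print(lag, delay_to_angle_deg(lag, 48000, 0.2))
```

A two-microphone delay of this kind gives an angle but cannot by itself distinguish front from back, which is the sort of ambiguity the level-difference cue described in the abstract is meant to address.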

A PERFORMANCE STUDY OF SPEECH CODERS FOR TELEPHONE CONFERENCING IN DIGITAL MOBILE COMMUNICATION NETWORKS

  • Lee, M.S.;Lee, G.C.;Kim, K.C.;Lee, H.S.;Lyu, D.S.;Shin, D.J.;Lee, Hun
    • 한국음향학회:학술대회논문집 / 한국음향학회 1994년도 FIFTH WESTERN PACIFIC REGIONAL ACOUSTICS CONFERENCE SEOUL KOREA / pp.899-903 / 1994
  • This paper describes two methods to assess the output speech quality of vocoders for telephone conferencing in digital mobile communication networks. The proposed methods are the sentence discrimination method and the modified degraded mean opinion score (MDMOS) test. We apply these two methods to Qualcomm code excited linear prediction (QCELP), vector sum excited linear prediction (VSELP), and regular pulse excited-long term prediction (RPE-LTP) vocoders to evaluate which vocoding algorithm better processes a mixed voice signal from two speakers for telephone conferencing. The experiments show that the VSELP vocoding algorithm delivers output speech quality superior to the other two.

Audio-based COVID-19 diagnosis using separable transformer

  • 강승태;장길진
    • 한국음향학회지 / Vol. 42, No. 3 / pp.221-225 / 2023
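  • This study proposes an efficient method for rapidly diagnosing coronavirus disease (COVID-19) from voice alone. To relax the computation time and large training data requirements of existing deep learning-based methods, it improves the structure of the Separable Transformer (SepTr) and proposes a new Strided Convolution Separable Transformer (SC-SepTr) that greatly reduces the number of parameters and enables fast diagnosis. Experiments on Coswara, a public audio dataset, show that the proposed method can diagnose quickly while maintaining Area Under the Curve (AUC) performance even with relatively small training data.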
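
For readers unfamiliar with the AUC metric mentioned above, the sketch below computes the area under the ROC curve as the fraction of positive/negative pairs that a classifier ranks correctly, with ties counted as half. The labels and scores are made-up illustration values, not results from the paper, and the function name is hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical labels (1 = positive case) and model scores.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Because AUC depends only on how samples are ranked, it does not require choosing a decision threshold, which makes it a common choice for screening-style classifiers.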

Prediction of Closed Quotient During Vocal Phonation using GRU-type Neural Network with Audio Signals

  • Hyeonbin Han;Keun Young Lee;Seong-Yoon Shin;Yoseup Kim;Gwanghyun Jo;Jihoon Park;Young-Min Kim
    • Journal of information and communication convergence engineering / Vol. 22, No. 2 / pp.145-152 / 2024
  • Closed quotient (CQ) represents the time ratio for which the vocal folds remain in contact during voice production. Because analyzing CQ values serves as an important reference point in vocal training for professional singers, these values have been measured mechanically or electrically by either inverse filtering of airflows captured by a circumferentially vented mask or post-processing of electroglottography waveforms. In this study, we introduced a novel algorithm to predict the CQ values only from audio signals. This has eliminated the need for mechanical or electrical measurement techniques. Our algorithm is based on a gated recurrent unit (GRU)-type neural network. To enhance the efficiency, we pre-processed an audio signal using the pitch feature extraction algorithm. Then, GRU-type neural networks were employed to extract the features. This was followed by a dense layer for the final prediction. The Results section reports the mean square error between the predicted and real CQ. It shows the capability of the proposed algorithm to predict CQ values.
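
Two quantities in this abstract are easy to state in code: the closed quotient itself, as the share of each glottal cycle during which the vocal folds are in contact, and the mean square error used to compare predicted and measured values. The sketch below is illustrative only. The function names and sample numbers are hypothetical, and the GRU network is not reproduced.

```python
def closed_quotient(contact_time_s: float, period_s: float) -> float:
    """Fraction of one glottal cycle during which the vocal folds are in contact."""
    if period_s <= 0.0:
        raise ValueError("period must be positive")
    if not 0.0 <= contact_time_s <= period_s:
        raise ValueError("contact time must lie within the period")
    return contact_time_s / period_s


def mean_squared_error(predicted, actual) -> float:
    """Average squared difference between paired predictions and references."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


if __name__ == "__main__":
    # Hypothetical cycle: 4 ms of contact in an 8 ms period.
    print(closed_quotient(0.004, 0.008))
    # Hypothetical predicted versus measured CQ values.
    print(mean_squared_error([0.5, 0.6], [0.4, 0.8]))
```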