• Title/Summary/Keyword: 음성적 거리 (phonetic distance)

Communications System between Android Platform and PC for the Visually Impaired Person (시각 장애인을 위한 Android Platform과 PC간의 1:1 통신 구현)

  • Lee, Jon-Hwey;Kim, Young-Kil;Oh, Jae-Gyun
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2012.05a / pp.213-215 / 2012
  • Current navigation systems for visually impaired people have a short, limited communication range and cannot receive enough information from the user to provide direct assistance. In addition, because routes are often dangerous and incomplete for visually impaired people, walking with a white cane remains inconvenient and hazardous. To address this problem, we implement a communication system that sends and receives video, voice, and location information between the visually impaired person's Android platform and an assistant's PC.

A Study on DTW Reference Pattern Creation Using Genetic Algorithm (유전자 알고리듬을 이용한 DTW 참조패턴 생성에 관한 연구)

  • 서광석
    • Proceedings of the Acoustical Society of Korea Conference / 1998.06e / pp.385-388 / 1998
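  • In DTW-based speech recognition, the reference pattern has a decisive effect on the recognition rate, so generating the most suitable reference pattern is an important factor. One way to improve the recognition rate is therefore to use multiple reference patterns, but this approach has been criticized for its excessive computation and increased memory use. In this paper, to obtain a high recognition rate while reducing the number of reference patterns, we use a genetic algorithm to generate better reference patterns and apply them to speech recognition. To generate reference patterns, we ran experiments with two methods: one compares the training data with one another and selects the item whose accumulated DTW distance is smallest, and the other selects the pattern using a genetic algorithm. The minimum-accumulated-distance method achieved a recognition rate of 98.33%, whereas the genetic algorithm achieved a speaker-dependent recognition rate of 100%.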
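
The baseline selection rule in this abstract (pick the training item whose accumulated DTW distance to the others is smallest) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `dtw_distance` and `select_reference`, the use of 1-D sequences, and the absolute-difference local cost are all assumptions made here. Real systems would compare sequences of feature vectors.

```python
import math


def dtw_distance(a, b):
    """Classic DTW between two 1-D sequences (assumed local cost: |x - y|)."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # step in a only
                cost[i][j - 1],      # step in b only
                cost[i - 1][j - 1],  # step in both
            )
    return cost[n][m]


def select_reference(utterances):
    """Index of the utterance with the smallest summed DTW distance to all others."""
    totals = [sum(dtw_distance(u, v) for v in utterances) for u in utterances]
    return min(range(len(utterances)), key=totals.__getitem__)


if __name__ == "__main__":
    training = [[0, 0, 0], [1, 1, 1], [5, 5, 5]]
    print(select_reference(training))
```

The genetic-algorithm variant described in the abstract would replace `select_reference` with a search over candidate patterns; the abstract gives no details of that search, so it is not sketched here.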

A Efficient Design of Naval Tactical Video Transmission (효과적인 해상 전술영상 전송에 관한 연구)

  • Hwang, Jaeryong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2015.10a / pp.1125-1126 / 2015
  • Naval tactical video has been used to implement network-based remote C2 systems, but transmitting video intelligence data between a tactical combat team and the base station is difficult because of distance and sea conditions. Therefore, we first examine the naval tactical video transmission system and identify its limitations. Finally, we present an efficient method for increasing the transmission rate of the naval tactical video transmission system.

Improving Fidelity of Synthesized Voices Generated by Using GANs (GAN으로 합성한 음성의 충실도 향상)

  • Back, Moon-Ki;Yoon, Seung-Won;Lee, Sang-Baek;Lee, Kyu-Chul
    • KIPS Transactions on Software and Data Engineering / v.10 no.1 / pp.9-18 / 2021
  • Although Generative Adversarial Networks (GANs) have gained great popularity in computer vision and related fields, a method for generating audio signals independently has yet to be presented. Unlike images, an audio signal is a sampled signal consisting of discrete samples, so it is not easy to learn such signals with CNN architectures, which are widely used in image generation tasks. To overcome this difficulty, GAN researchers proposed applying time-frequency representations of audio to existing image-generating GANs. Following this strategy, we propose an improved method for increasing the fidelity of synthesized audio signals generated by GANs. Our method is demonstrated on a public speech dataset and evaluated by Fréchet Inception Distance (FID). With our method, the FID was 10.504, compared with 11.973 for the existing state-of-the-art method (a lower FID indicates better fidelity).
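
For context, the FID figures quoted above follow the standard definition of the metric: the Fréchet distance between two Gaussians fitted to feature embeddings of real and generated samples. The symbols below are notation assumed here, not taken from the abstract: $\mu_r, \Sigma_r$ are the mean and covariance of the real-sample features, and $\mu_g, \Sigma_g$ those of the generated samples.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Both terms are non-negative and vanish when the two Gaussians coincide, which is why a lower value indicates generated samples that are statistically closer to the real ones.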

Implementation of an HDSL Interface Function in ATM and Services over Copper Lines (ATM에 HDSL 정합 기능 구현과 동 선로를 이용한 서비스)

  • 양충렬;김진태
    • Information and Communications Magazine / v.14 no.8 / pp.104-117 / 1997
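  • This article implements a four-wire HDSL interface function in an ATM (Asynchronous Transfer Mode) switch and evaluates the subscriber service loop distance and cell loss performance for providing HDSL (High-rate Digital Subscriber Line) service entirely over copper lines, the existing medium for voice telephone subscriber service. The evaluation used an HDSL interface unit that supports full-duplex 2B1Q transmission in a CSA (Carrier Serving Area) environment that accounts for the major transmission impairments present on copper telephone lines, such as crosstalk and impulse noise. Using 26-gauge (0.4 mm) and 24-gauge (0.5 mm) pair lines, we evaluated the subscriber service loop distance that satisfies a cell loss ratio of $10^{-7}$ when transmitting E1-rate (2.048 Mbps) data. In addition, we demonstrated representative services, including MPEG-1-class video on demand, video conferencing, and high-speed Internet access, in a pilot environment under the ATM system.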

Isolated-Word Speech Recognition using Variable-Frame Length Normalization (가변프레임 길이정규화를 이용한 단어음성인식)

  • Sin, Chan-Hu;Lee, Hui-Jeong;Park, Byeong-Cheol
    • The Journal of the Acoustical Society of Korea / v.6 no.4 / pp.21-30 / 1987
  • Length normalization by variable frame size is proposed as a novel approach to the problem that variation in the length of spoken words lowers recognition accuracy. Compared with length normalization by a fixed frame size, this method shortens recognition time in the recognition stage because it reduces the number of frames that make up a word. In this paper, variable frame length normalization is applied to multisection vector quantization, and the efficiency of the method is assessed in terms of recognition time and accuracy through practical recognition experiments.
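
One plausible reading of "length normalization by variable frame size" is that every word is cut into the same number of frames, with the frame size scaling with the word's duration. The sketch below illustrates only that reading; the abstract does not give the actual procedure, and the function name `variable_length_frames` and the contiguous, non-overlapping framing are assumptions made here.

```python
def variable_length_frames(samples, num_frames):
    """Split `samples` into exactly `num_frames` contiguous frames.

    The frame size adapts to the word length, so words of different
    durations all yield the same number of frames (assumed reading of
    the method; not taken from the paper).
    """
    n = len(samples)
    if num_frames <= 0 or n < num_frames:
        raise ValueError("need at least one sample per frame")
    # Integer boundaries spread as evenly as possible over the word.
    bounds = [i * n // num_frames for i in range(num_frames + 1)]
    return [samples[bounds[i]:bounds[i + 1]] for i in range(num_frames)]


if __name__ == "__main__":
    short_word = list(range(10))
    long_word = list(range(100))
    print([len(f) for f in variable_length_frames(short_word, 4)])
    print([len(f) for f in variable_length_frames(long_word, 4)])
```

Under this reading, a feature vector computed per frame gives a fixed-length representation for every word, so no time alignment is needed before comparison.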

Determinants of Safety and Satisfaction with In-Vehicle Voice Interaction : With a Focus of Agent Persona and UX Components (자동차 음성인식 인터랙션의 안전감과 만족도 인식 영향 요인 : 에이전트 퍼소나와 사용자 경험 속성을 중심으로)

  • Kim, Ji-hyun;Lee, Ka-hyun;Choi, Jun-ho
    • The Journal of the Korea Contents Association / v.18 no.8 / pp.573-585 / 2018
  • Services for navigation and entertainment through AI-based voice user interface (VUI) devices are becoming popular in connected car systems. Given that VUI agent developers can be classified as IT companies or automakers, this study explores the attributes of agent persona and user experience that affect the driver's perceived safety and satisfaction. Participants in a car simulator experiment performed entertainment and navigation tasks and then rated their perceived safety and satisfaction. Regression analysis showed that the credibility of the agent developer, the warmth and attractiveness of the agent persona, and efficiency and care in the UX dimension had a significant impact on perceived safety. The determinants of perceived satisfaction were unity of auto-agent makers and gender as predisposing factors, distance in the agent persona, and convenience, efficiency, ease of use, and care in the UX dimension. The contribution of this study lies in identifying the factors required for developing conversational VUIs for the autonomous driving environment.

A Study on the Audio Compensation System (음향 보상 시스템에 관한 연구)

  • Jeoung, Byung-Chul;Won, Chung-Sang
    • The Journal of the Acoustical Society of Korea / v.32 no.6 / pp.509-517 / 2013
  • In this paper, we study a method for building a good acoustic-speech system that uses digital signal processing with a dynamic microphone as the transducer. A good acoustic-speech system should convert the original sound input into an electrical signal without distortion. We measure the frequency response of the microphone and obtain an adjustment factor for each frequency band by comparing the measured data with the microphone's standard frequency response. Applying the adjustment factors derived from the microphone and speaker frequency responses through digital signal processing yields final sound levels that match the original sound levels. We then minimize the changes in frequency response and level caused by variation in the source-to-microphone distance, using frequency responses measured at different distances.
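
The per-band comparison described in this abstract can be illustrated with a small sketch. It assumes, hypothetically, that responses are expressed in dB per frequency band and that the adjustment factor is the simple difference between the standard and measured levels; the abstract does not state the actual formula, and the function names and band values below are invented for illustration.

```python
def adjustment_factors_db(measured_db, standard_db):
    """Per-band correction (dB) that maps the measured response onto the standard one.

    Assumption: a purely additive dB correction per band.
    """
    return {band: standard_db[band] - measured_db[band] for band in standard_db}


def apply_adjustment(levels_db, factors_db):
    """Add each band's correction to the corresponding level (all in dB)."""
    return {band: levels_db[band] + factors_db[band] for band in levels_db}


if __name__ == "__main__":
    # Hypothetical band centre frequencies (Hz) and levels (dB).
    measured = {125: -3.0, 1000: 0.0, 8000: -6.0}
    standard = {125: 0.0, 1000: 0.0, 8000: 0.0}
    factors = adjustment_factors_db(measured, standard)
    print(factors)
    print(apply_adjustment(measured, factors))
```

Compensating for source-to-microphone distance, as the abstract describes, would presumably mean keeping one such factor table per measured distance; that step is not shown here.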

A Driving Information Centric Information Processing Technology Development Based on Image Processing (영상처리 기반의 운전자 중심 정보처리 기술 개발)

  • Yang, Seung-Hoon;Hong, Gwang-Soo;Kim, Byung-Gyu
    • Convergence Security Journal / v.12 no.6 / pp.31-37 / 2012
  • Today, the core technology of the automobile is shifting toward IT-based convergence system technology. To cope with many kinds of situations and provide convenience for drivers, various IT technologies are being integrated into automobile systems. In this paper, we propose a convergence system, called the Augmented Driving System (ADS), that provides a high level of safety and convenience for drivers based on image information processing. Image data is acquired from an imaging sensor and processed by the proposed methods to give the distance from the car ahead, the lane, and traffic sign panels. A converged interface technology that uses a camera for gesture recognition and a microphone for speech recognition is also provided. With this kind of system technology, car accidents can be reduced even when drivers fail to recognize dangerous situations, because the system can recognize the situation or user context and draw attention to the front view. In experiments, the proposed methods achieved a recognition rate of over 90% for traffic sign detection, lane detection, and measurement of the distance from the car ahead.

Design of a Three Dimensional Audio System for Multicast Conferencing (멀티캐스트 화상회의를 위한 3-D 음향시스템 설계)

  • 김영오;고대식
    • The Journal of Korean Institute of Communications and Information Sciences / v.25 no.1B / pp.71-76 / 2000
  • In a multimedia teleconferencing system with many participants, the participants' faces can be perceived through the visual image. However, distinguishing each participant's voice and conveying a sense of spaciousness are very difficult because the voices of all participants are processed as one-dimensional data. In this paper, we implemented a three-dimensional audio rendering system using the HRTF (Head-Related Transfer Function) and a distance-sense reproduction method, and determined the optimal locations of the participants for a teleconferencing system. In listening tests using elevation and azimuth angles, directional perception was better for azimuth angles than for elevation angles. In particular, placing participants using the HRTFs at azimuth angles of 10°, 90°, 270°, and 350° was efficient in a teleconferencing system with four participants. We also propose using a distance cue to enhance realism and to place more than five participants.
