• Title/Summary/Keyword: Voice problem


The Design of a CTI System for reliable video-conference (신뢰성있는 화상회의를 위한 CTI System 설계)

  • 이종열;정현우;박원배
    • Proceedings of the IEEK Conference / 2000.06a / pp.225-228 / 2000
  • This paper proposes a design for a reliable video-conference system using CTI (Computer Telephony Integration) technology. When a video conference runs over the existing Internet, transmission delay for voice traffic occurs frequently. Transmitting real-time voice data efficiently over the Internet requires complex algorithms such as a CODEC (coder/decoder), which add processing delay that can degrade overall performance. In a CTI system, by contrast, voice traffic is usually carried over the reliable PSTN (Public Switched Telephone Network). The paper therefore proposes a new architecture that uses the PSTN for voice traffic and the Internet for video traffic at the same time, instead of the Internet alone, to relieve these video-conference problems.


Heterogeneous Study of Voice Communication Delay According to Connection Delay Difference of Heterogeneous Radios (이종 무전기의 통신접속지연차에 따른 음성통신성능 개선 연구)

  • Park, Jin-Hee;Lee, Soon-Hwa
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.13 no.6 / pp.29-35 / 2013
  • Disaster management agencies in Korea use heterogeneous emergency communication radios for response activities in the event of a disaster. Because these radios differ in communication method and delay, seamless voice communication requires a method that compensates for the difference in connection time. In this paper, we suggest a solution to the problems of voice transmission opportunity and data loss.

Screening of Voice Disorder using Source Parameter Model and Artificial Neural Network (음원 파라미터 모델과 인공신경망을 이용한 음성장애 검출)

  • Chytil, Pavel;Jo, Cheol-Woo;Pavel, Misha
    • Speech Sciences / v.15 no.2 / pp.89-97 / 2008
  • There are a number of clinical conditions that directly or indirectly affect the physical properties of the vocal folds and thereby the pressure waveforms of elicited sounds. If the relationships between the clinical conditions and voice quality are sufficiently reliable, it should be possible to detect these diseases or disorders. The focus of this paper is to determine the set of features, and their values, that characterize the state of the speaker's vocal folds. To the extent that these features capture the anatomical, physiological, and neurological aspects of the speaker, they can potentially be used to support an unobtrusive approach to diagnosis. We present a new approach to this problem, supported by results obtained from two disordered voice corpora.


A Security-Enhanced Storing Method for the Voice Data in the Aircraft (항공기에서 보안 강화된 음성 데이터 저장 방식)

  • Cho, Seung Hoon;Suh, Jeong Bae;Moon, Yong Ho
    • IEMEK Journal of Embedded Systems and Applications / v.6 no.4 / pp.255-261 / 2011
  • In this paper, we propose a security-enhanced method for storing the voice data obtained during flight. When an emergency occurs during flight, the flight data held in a storage device such as a DTS or black box can be exposed to an adversary or enemy. A zeroize function is currently embedded in these devices to prevent this. However, it cannot be operated if the system malfunctions or the pilot is wounded in the emergency. To solve this problem, the proposed method encrypts the ADPCM-compressed voice data using a combination of the AES algorithm and a reordering method. The simulation results show that the proposed method further enhances the security of the voice data.

Data augmentation in voice spoofing problem (데이터 증강기법을 이용한 음성 위조 공격 탐지모형의 성능 향상에 대한 연구)

  • Choi, Hyo-Jung;Kwak, Il-Youp
    • The Korean Journal of Applied Statistics / v.34 no.3 / pp.449-460 / 2021
  • ASVspoof 2017 deals with the detection of replay attacks and aims to distinguish real human voices from fake voices. A spoofed voice here is one that reproduces the original voice through different types of microphones and speakers. Data augmentation has been studied actively for image data, and several studies have attempted data augmentation on voice. However, there have been few attempts to augment data for voice replay attacks, so this paper explores how audio modification through data augmentation techniques affects the detection of replay attacks. A total of seven data augmentation techniques were applied, and among them dynamic value change (DVC) and pitch techniques helped improve performance. DVC and pitch improved the base model EER by about 8%, and DVC in particular showed a noticeable improvement in accuracy in some environments among the 57 replay configurations. The greatest increase was achieved in RC53, where DVC led to an approximately 45% improvement in base model accuracy. High-end recording and playback devices that were previously difficult to detect were identified well. Based on this study, we found that the DVC and pitch data augmentation techniques help improve performance in the voice spoofing detection problem.
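
The abstract above reports results as EER (equal error rate), the operating point at which the false acceptance rate and the false rejection rate are equal. The following is a minimal sketch of how an EER could be computed from detector scores. It is not the authors' evaluation code: the function names, the score convention (higher means more likely genuine), and the reading of "about 8%" as a relative reduction are all assumptions made for illustration.

```python
def equal_error_rate(genuine_scores, spoof_scores):
    """Approximate the EER by scanning every observed score as a threshold.

    Assumption: a higher score means "more likely a genuine voice".
    Returns (eer, threshold) at the point where FAR and FRR are closest.
    """
    best = None
    for t in sorted(set(genuine_scores) | set(spoof_scores)):
        # False rejection: a genuine sample scored below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # False acceptance: a spoofed sample scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


def relative_improvement(base_eer, new_eer):
    """Relative EER reduction (hypothetical helper; assumes a relative reading)."""
    return (base_eer - new_eer) / base_eer


# Toy scores, invented for illustration only.
genuine = [0.9, 0.8, 0.7, 0.4]
spoof = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine, spoof)
print(eer, threshold)
```

With real data the scores would come from the trained spoofing detector, and a finer threshold sweep or interpolation would normally be used when FAR and FRR do not cross exactly at an observed score.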

Development of tangible language content system based on voice recording (음성녹음 기반의 실감형 어학시스템 콘텐츠 개발)

  • Na, Jong-Won
    • Journal of Advanced Navigation Technology / v.17 no.2 / pp.234-239 / 2013
  • Existing language-learning content suffers from poor learner concentration, and because such systems cannot judge performance themselves, assessment has depended on the decisions of many teachers. To address this, the proposed system combines voice recording with ubiquitous technology and virtual reality technology. A projector is installed in the classroom, an RFID reader is attached in each classroom, and RFID tags are attached to student ID cards so that English learning content matching each student's grade can be presented. Students hold question-and-answer sessions with foreigners shown in realistic virtual three-dimensional video content, while voice recording technology simultaneously checks pronunciation and intonation and judges whether the level is passed or failed. Student education data is saved to a database on a central server and is then provided as information through a feedback process. This study analyzes the problems common to existing language content and uses voice recording technology to address the problem of level-based classes, which existing language content did not solve.

Mobile Voice Web Browser for the Low Vision (저시력자를 위한 모바일 보이스 웹 브라우저 개발)

  • Park, Joo Hyun;Lee, Han Na;Shin, Ji Eun;Dong, Suh-Yeon;Lim, Soon-Bum
    • Journal of Korea Multimedia Society / v.23 no.11 / pp.1418-1427 / 2020
  • The web has become indispensable in our daily lives. We communicate with others, study, and get information through the web, and this behavior continues in the smartphone environment. The biggest problem is that the small display of a smartphone makes it harder for people with low vision to select or manipulate content accurately. To compensate, voice guidance services that combine touch and voice, such as VoiceOver and TalkBack, are currently provided on smartphones. However, these services have restrictions in the GUI, in TTS control, and in content enlargement and selection. In addition, unnecessary content is also output by voice, which fatigues users with low vision. In this study, we propose a mobile web browser interface that selects and enlarges a desired area of the browser and its content, or outputs it as voice, so that people with low vision can use the mobile web browser easily. We propose a context-selective focusing function that enables selection of each element of web content, and we intend to develop a mobile voice web browser that can enlarge the selected content or output it by voice.

Tracheoesophageal Shunt Voice in Total Laryngectomee (후두 전 절제 환자에서 음성재활을 위한 기관식도발성)

  • Wang, Soo-Geun;Jang, Sun-Mi
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.19 no.1 / pp.21-27 / 2008
  • Total laryngectomy is the most useful procedure for advanced laryngopharyngeal cancer, but it leaves major problems such as loss of voice. Voice restoration is essential for every patient who undergoes a total laryngectomy. An ideal voice rehabilitation method should satisfy three conditions. First, every laryngectomee should be able to produce a voice sufficient for communication; second, every patient should be able to use both hands freely during phonation; and last, the method should be easy and safe, without complications during or after treatment. The various voice rehabilitation procedures performed during or after total laryngectomy can be divided into electronic and pneumatic methods. Pneumatic methods are further divided into pulmonary air and non-pulmonary air methods. The non-pulmonary air methods include esophageal speech, buccal speech, and pharyngeal speech. Pulmonary air methods are divided into surgical methods and non-surgical methods such as the pneumatic speech aid. The surgical methods include the neoglottic operation, tracheopharyngeal shunt, and tracheoesophageal shunt operations. Recently, the tracheoesophageal shunt, with or without a prosthesis, has come to be recognized as the most effective method. The Blom-Singer low-pressure prosthesis, the Panje button, and Provox are well-known types of prosthesis used in the tracheoesophageal shunt operation. The Amatsu method is a well-known tracheoesophageal shunt method that does not use a prosthesis. The authors reviewed published articles to evaluate the effectiveness and problems of the tracheoesophageal shunt operation with or without a prosthesis. In conclusion, an indwelling type of prosthesis, together with pharyngeal myotomy and plexus neurectomy, is recommended for a higher success rate in the tracheoesophageal puncture procedure. Moreover, the Amatsu method is also one of the recommended voice rehabilitation procedures during total laryngectomy. In this situation as well, pharyngeal myotomy and plexus neurectomy may help achieve more fluent communication.


Voice Conversion using Generative Adversarial Nets conditioned by Phonetic Posterior Grams (Phonetic Posterior Grams에 의해 조건화된 적대적 생성 신경망을 사용한 음성 변환 시스템)

  • Lim, Jin-su;Kang, Cheon-seong;Kim, Dong-Ha;Kim, Kyung-sup
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.10a / pp.369-372 / 2018
  • This paper proposes a non-parallel voice conversion network that converts voice between an unmapped pair of source and target voices. Conventional voice conversion research has used learning methods that minimize the distance error between spectrograms. Such methods not only lose spectrogram resolution because they average pixels, but also rely on parallel data, which is hard to collect. This research uses PPGs, which represent the phonetic content of the input voice, together with a GAN learning method to generate clearer voices. To evaluate the proposed method, we conducted a MOS test against a GMM-based model and found that performance improved compared with the conventional methods.


Implementation of the automatic switching device for the voice communications between heterogeneous devices (이종 기기 간 음성통신을 위한 자동전환장치의 구현)

  • Lew, Chang-Guk;Lee, Bae-Ho
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.10 no.12 / pp.1321-1328 / 2015
  • A radio is a half-duplex voice communication device that uses PTT (Push To Talk) and occupies a single line during transmission. For voice communication between heterogeneous devices, such as a telephone and a UHF or VHF radio, an interface device that automatically switches between the two is required. The loss of the transmitted audio signal therefore depends significantly on how well the voice switching apparatus detects the voice to be transmitted in the input signal. The conventional method sets a threshold simply on the amplitude of the input signal, that is, on its energy level, and so has the problem of responding to noise. This paper uses audio signal processing techniques to discriminate voice within the input signal and implements a device for automatic voice transmission between heterogeneous devices. The proposal was confirmed to improve the performance of the automatic voice switching device and to enable lossless transmission of voice between heterogeneous devices.
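
The conventional approach criticized in the abstract above, thresholding the energy level of the input signal, can be sketched in a few lines. This is an illustrative sketch only, not the device described in the paper: the function names, the 8 kHz sampling rate, the 160-sample frame length, and the threshold value are all assumptions. The toy signal shows the weakness the abstract points out, since a loud non-speech burst passes the threshold just as a voiced frame does.

```python
import math


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def energy_vad(samples, frame_len, threshold):
    """Flag a frame as 'voice' whenever its energy exceeds the threshold."""
    return [e > threshold for e in frame_energies(samples, frame_len)]


# Toy input at an assumed 8 kHz rate: a quiet frame, a 200 Hz tone standing in
# for voiced speech, and a loud alternating burst standing in for noise.
FRAME = 160
quiet = [0.01] * FRAME
tone = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(FRAME)]
burst = [0.8 if n % 2 else -0.8 for n in range(FRAME)]

flags = energy_vad(quiet + tone + burst, FRAME, threshold=0.01)
print(flags)  # the noise burst is flagged as voice along with the tone
```

A detector meant to avoid this behaviour would need features beyond raw energy. The abstract does not say which audio signal processing techniques the authors used, so none are assumed here.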