• Title/Summary/Keyword: Voice detecting

Emotion Detecting Method Based on Various Attributes of Human Voice

  • MIYAJI Yutaka;TOMIYAMA Ken
    • Science of Emotion and Sensibility / v.8 no.1 / pp.1-7 / 2005
  • This paper reports several emotion detection methods based on various attributes of the human voice, all developed at our Engineering Systems Laboratory. All of the proposed methods use only the prosodic information in the voice for emotion recognition; semantic information is not used. Different types of neural networks (NNs) are used for detection depending on the type of voice parameters. Earlier approaches used linear prediction coefficients (LPCs) and pitch time-series data separately, whereas later studies combined them. The proposed methods are explained first, and then the evaluation experiments for the individual methods and their emotion detection performance are presented and compared.

A Study on Improved Method of Voice Recognition Rate (음성 인식률 개선방법에 관한 연구)

  • Kim, Young-Po;Lee, Han-Young
    • The Journal of the Korea institute of electronic communication sciences / v.8 no.1 / pp.77-83 / 2013
  • In this paper, we propose and study a method for improving the voice recognition rate. Voices are generally detected with the most widely used method, the HMM (Hidden Markov Model) algorithm. For voice detection, the zero-crossing rate was calculated per voice unit before the presence of data was determined. For voice recognition, the patterns in the form of the voice signal were analyzed and then compared with previously learned patterns. In the experiments, the existing HMM algorithm achieved a recognition rate of 80%, whereas the proposed algorithm, based on recognizing the patterns in the form of the voice signal, achieved 92%, an improvement of about 12 percentage points.

iVisher: Real-Time Detection of Caller ID Spoofing

  • Song, Jaeseung;Kim, Hyoungshick;Gkelias, Athanasios
    • ETRI Journal / v.36 no.5 / pp.865-875 / 2014
  • Voice phishing (vishing) uses social engineering, based on people's trust in telephone services, to trick people into divulging financial data or transferring money to a scammer. In a vishing attack, a scammer often modifies the telephone number that appears on the victim's phone to mislead the victim into believing that the phone call is coming from a trusted source, since people typically judge a caller's legitimacy by the displayed phone number. We propose a system named iVisher for detecting a concealed incoming number (that is, caller ID) in Session Initiation Protocol-based Voice-over-Internet Protocol initiated phone calls. Our results demonstrate that iVisher is capable of detecting a concealed caller ID without significantly impacting upon the overall call setup time.

Implementation of Human and Computer Interface for Detecting Human Emotion Using Neural Network (인간의 감정 인식을 위한 신경회로망 기반의 휴먼과 컴퓨터 인터페이스 구현)

  • Cho, Ki-Ho;Choi, Ho-Jin;Jung, Seul
    • Journal of Institute of Control, Robotics and Systems / v.13 no.9 / pp.825-831 / 2007
  • In this paper, an interface between a human and a computer is presented. The human-computer interface (HCI) is one area of human-machine interfaces. Our HCI uses voice recognition and image recognition to detect human emotions. The idea is that the computer recognizes the present emotional state of the human operator and entertains him or her in various ways, such as playing music, searching the web, and talking. For the image recognition process, the human face is captured, and the eyes and mouth are selected from the facial image for recognition. To train on images of the mouth, we use a Hopfield net. The results show 88% to 92% recognition of the emotion. For voice recognition, the neural network achieves 80% to 98% recognition.

QoS-Sensitive Admission Policy for Non-Real Time Data Packets in Voice/Data CDMA Systems (음성/데이터 CDMA 시스템에서의 서비스 품질을 고려한 비 실시간 데이터 패킷 전송 제어 정책)

  • Seungjae Bahng;Insoo Koo;Jeongrok Yang;Kim, Kiseon
    • Proceedings of the IEEK Conference / 1999.11a / pp.125-128 / 1999
  • In this paper, we propose a QoS-sensitive admission threshold method for transmitting non-real-time data packets such that the quality of service for both voice and data traffic is maintained at the required level. The active voice traffic is detected during the current time slot, and non-real-time data packets are then transmitted up to an admission threshold level during the next time slot. We found that, with 15 connected voice users, the optimum admission threshold is four voice traffic resources below the maximum allowable threshold in order to keep the outage probability within 1%.

Robust Voice Activity Detection Using the Spectral Peaks of Vowel Sounds

  • Yoo, In-Chul;Yook, Dong-Suk
    • ETRI Journal / v.31 no.4 / pp.451-453 / 2009
  • This letter proposes the use of vowel sound detection for voice activity detection. Vowels have distinctive spectral peaks, which are likely to remain higher than their surroundings even after severe corruption. Therefore, by developing a method of detecting the spectral peaks of vowel sounds in corrupted signals, voice activity can be detected even in low signal-to-noise ratio (SNR) conditions. Experimental results indicate that the proposed algorithm performs reliably under various noise and low SNR conditions. This method is suitable for mobile environments where the characteristics of noise may not be known in advance.

A Study on Precise Control of Autonomous Travelling Robot Based on RVR (RVR에 의한 자율주행로봇의 정밀제어에 관한연구)

  • Shim, Byoung-Kyun;Cong, Nguyen Huu;Kim, Jong-Soo;Ha, Eun-Tae
    • Journal of the Korean Society of Industry Convergence / v.17 no.2 / pp.42-53 / 2014
  • Robust voice recognition (RVR) is essential for a robot to communicate with people. One of the main problems with RVR for robots is that robots are inevitably exposed to real-environment noise. The microphones capture this noise at high power because the noise sources are close to the microphones, so the signal-to-noise ratio of the input voice becomes quite low. However, it is possible to estimate the noise by using information on the robot's own motions and postures, because a given type of motion or gesture produces almost the same noise pattern every time it is performed. In this paper, we propose an RVR system that can robustly recognize the voices of adults and children in noisy environments. We evaluate the RVR system in a communication robot placed in a real noisy environment. Voice is captured using a wireless microphone. The navigation strategy comprises obstacle detection and a local map, the design of goal-seeking and avoidance behaviors, a fuzzy decision maker, and a lower-level controller. The final hypothesis is selected based on posterior probability. We then select the task from the motion task library. In the motion control, we also integrate obstacle avoidance control using ultrasonic sensors, which are effective for detecting obstacles with a simple algorithm.

Implementation of the automatic switching device for the voice communications between heterogeneous devices (이종 기기 간 음성통신을 위한 자동전환장치의 구현)

  • Lew, Chang-Guk;Lee, Bae-Ho
    • The Journal of the Korea institute of electronic communication sciences / v.10 no.12 / pp.1321-1328 / 2015
  • A radio is a half-duplex voice communication method using PTT (Push To Talk), occupying a single line during transmission. For voice communication between heterogeneous devices, with an interface between the telephone and UHF and VHF radios, a device that automatically switches between the two devices is required. The performance of the voice switching device, which detects the voice to be transmitted in the input signal, therefore has a significant influence on the loss of the transmitted audio signal. The conventional method sets a level simply on the amplitude of the input signal, in other words its energy level, and therefore has difficulty coping with noise. In this paper, we use audio signal processing techniques to discriminate the voice within the input signal and implement a device for automatic voice transmission between heterogeneous devices. We confirmed that the proposed approach improves the performance of the automatic voice switching device and enables lossless voice transmission between heterogeneous devices.
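
The conventional approach mentioned in the abstract above, thresholding the energy level of the input signal, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the frame length, the threshold value, and the synthetic test signal are all assumptions made for the example.

```python
import numpy as np

def frame_energy(x, frame_len):
    """Mean-square energy of each non-overlapping frame of length frame_len."""
    n_frames = len(x) // frame_len
    frames = x[: n_frames * frame_len].reshape(n_frames, frame_len)
    return np.mean(frames ** 2, axis=1)

def energy_detector(x, frame_len, threshold):
    """Flag a frame as 'voice' whenever its energy exceeds a fixed threshold."""
    return frame_energy(x, frame_len) > threshold

# Hypothetical test signal: near-silence, a voice-like tone, then loud noise.
fs = 8000                                    # assumed sampling rate in Hz
rng = np.random.default_rng(0)
t = np.arange(800) / fs
quiet = 0.01 * rng.standard_normal(800)      # background hiss
tone = 0.5 * np.sin(2 * np.pi * 200 * t)     # stand-in for voiced speech
loud_noise = 0.5 * rng.standard_normal(800)  # non-speech interference
signal = np.concatenate([quiet, tone, loud_noise])

flags = energy_detector(signal, frame_len=200, threshold=0.01)
print(flags)
```

In this sketch the loud-noise frames are flagged just like the tone frames, which illustrates why a pure energy threshold struggles with noise.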

On a Study of Detecting First Formant Using Autocorrelation Method (자기상관법을 이용한 제 1 포만트 검출법에 관한 연구)

  • 강은영;민소연;배명진
    • Proceedings of the IEEK Conference / 2001.06d / pp.285-288 / 2001
  • In speech analysis, it is very important to estimate the formant center frequencies accurately. If we know the formant frequencies, we can predict which sound was uttered. Generally, in voiced speech the magnitude of the first formant is 10 dB greater than that of the other formants, so the shape of the voice signal in the time domain is affected mainly by the first formant. We can therefore obtain the first formant frequency roughly by using the ZCR (zero-crossing rate). In this paper, we propose an improved method for obtaining the first formant frequency using the ZCR: we compute the autocorrelation before measuring the ZCR. This step smooths the voice signal, so the first formant in the signal is emphasized. As a result, we obtain a more accurate ZCR and first formant frequency. Conventional formant estimation is done in the frequency domain, whereas the proposed method works in the time domain and is therefore very simple.
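
The idea in the abstract above, taking the autocorrelation before measuring the ZCR, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' procedure: the function names, the 8 kHz sampling rate, the synthetic two-tone frame standing in for voiced speech, and the choice to keep the first half of the lags are all hypothetical. The frequency estimate relies on the fact that a sinusoid crosses zero twice per period.

```python
import numpy as np

def zero_crossings(x):
    """Count sign changes between consecutive samples."""
    negative = np.signbit(x)
    return int(np.count_nonzero(negative[1:] != negative[:-1]))

def zcr_frequency(x, fs):
    """Frequency implied by the zero-crossing rate (two crossings per period)."""
    return zero_crossings(x) * fs / (2.0 * len(x))

def autocorrelation(x):
    """Non-negative lags of the unnormalized autocorrelation of x."""
    full = np.correlate(x, x, mode="full")
    return full[len(x) - 1:]

# Hypothetical frame: strong 500 Hz component (stand-in for the first formant),
# weaker 1500 Hz component, and additive white noise.
fs = 8000
n = np.arange(400)
rng = np.random.default_rng(0)
frame = (np.sin(2 * np.pi * 500 * n / fs)
         + 0.3 * np.sin(2 * np.pi * 1500 * n / fs)
         + 0.2 * rng.standard_normal(n.size))

raw_estimate = zcr_frequency(frame, fs)       # ZCR measured on the noisy frame
acf = autocorrelation(frame)[: n.size // 2]   # keep the better-estimated lags
acf_estimate = zcr_frequency(acf, fs)         # ZCR measured after autocorrelation

print(raw_estimate, acf_estimate)
```

Because the autocorrelation weights each component by its squared amplitude and averages out uncorrelated noise, the dominant component stands out more clearly in `acf` than in `frame`, which is consistent with the smoothing effect the abstract describes.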

Limitations of Spectrogram Analysis for Smartphone Voice Recording File Forgery Detection (스마트폰 음성 녹음 파일 위변조 검출을 위한 스펙트로그램 분석의 한계점)

  • Sangmin Han;Yeongmin Son;Jae Wan Park
    • The Journal of the Convergence on Culture Technology / v.9 no.2 / pp.545-551 / 2023
  • As digital information is readily available to everyone today, the adoption of digital evidence is increasing. However, with the spread of various voice file editing tools, it is virtually impossible to determine whether a voice recording file that has gone through a sophisticated editing process has been forged. This study aims to show that forgery that is difficult to distinguish from the original file is possible by applying insertion, deletion, linking, and synthetic editing techniques to voice recording files. It demonstrates the difficulty of detecting forgery when a forged voice file is encoded with the same extension as the original. In addition, for the experiments in which detectable features did appear, forgery detection was shown to be impossible once additional transition-band deletion and secondary encoding were performed. This study is thus expected to contribute to the establishment of more stringent admissibility criteria for adopting voice recording files as digital evidence.