• Title/Summary/Keyword: Environment sound recognition

Search Result 55

Comparison of Speech Intelligibility & Performance of Speech Recognition in Real Driving Environments (자동차 주행 환경에서의 음성 전달 명료도와 음성 인식 성능 비교)

  • Lee Kwang-Hyun;Choi Dae-Lim;Kim Young-Il;Kim Bong-Wan;Lee Yong-Ju
    • MALSORI / no.50 / pp.99-110 / 2004
  • The normal transmission characteristics of sound are difficult to obtain in a moving car because of various noises and structural factors. The resulting channel distortion of the source sound recorded by microphones seriously degrades speech recognition performance in real driving environments. In this paper, we analyze intelligibility under the channel distortions that arise at different driving speeds, using the speech transmission index (STI), and compare the STI with speech recognition rates. We examine the correlation between intelligibility measures, which depend on the sound pick-up pattern, and speech recognition performance, and from this we consider the optimal microphone location in a single-channel environment. The experiments show a high correlation between STI and speech recognition rates.


Active Slope Weighted-Constraints Based DTW Algorithm for Environmental Sound Recognition System (능동형 기울기 가중치 제약에 기반한 환경소리 인식시스템용 DTW 알고리듬)

  • Jung, Young-Jin;Lee, Yun-Jung;Kim, Pil-Un;Kim, Myoung-Nam
    • Journal of Korea Multimedia Society / v.11 no.4 / pp.471-480 / 2008
  • Deaf people cannot perceive useful sound information such as alarms, doorbells, sirens, car horns, and phone rings because of their hearing impairment. To solve this problem, portable hearing-assistive devices with suitable environmental sound recognition methods are needed. In this paper, a DTW algorithm with a new active slope weighting constraint is proposed for a sound recognition system. The environmental sound recognition method consists of three processes: extraction of the start and end points using the frequency and amplitude of the sound, feature extraction, and classification of the features for the given segments. In the experiments, the recognition rate of the proposed method is over 90%, about 20% higher than that of the conventional algorithm. Portable assistive devices that use the proposed method to recognize environmental sounds could therefore make daily life more convenient for hearing-impaired people.


A Emergency Sound Detecting Method for Smarter City (스마트 시티에서의 이머전시 사운드 감지방법)

  • Cho, Young-Im
    • Journal of Institute of Control, Robotics and Systems / v.16 no.12 / pp.1143-1149 / 2010
  • Because noise is the main cause of degraded speech recognition performance, the place or environment matters greatly in speech recognition. To improve speech recognition performance in real situations where extraneous noise is abundant, a novel combination of FIR and Wiener filters is proposed and tested. The combination improved accuracy and reduced processing time, enabling fast analysis and response in emergency situations. City life involves many dangerous situations, so a smarter city needs to detect many types of sound in various environments. This paper therefore addresses how to detect many types of sound in a real city, especially from CCTV. It aims to realize the smarter city by detecting many types of sounds and picking out an emergency sound from the sound stream, so that the emergency or dangerous situation can be handled.
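
The abstract does not say how the FIR and Wiener filters are combined. The sketch below is only one plausible reading: a short FIR smoothing stage followed by a local-statistics (adaptive) Wiener stage. The function names, the cascade order, the tap values, and the window size are all assumptions made for illustration, not the paper's method.

```python
def fir_filter(x, taps):
    """Causal FIR filter: y[n] = sum_k taps[k] * x[n - k], zero initial state."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * x[n - k]
        y.append(acc)
    return y


def local_wiener(x, window=5, noise_var=None):
    """Adaptive Wiener filter based on local mean and variance.

    If noise_var is None, it is estimated as the average local variance.
    """
    half = window // 2
    means, variances = [], []
    for n in range(len(x)):
        seg = x[max(0, n - half): n + half + 1]
        mu = sum(seg) / len(seg)
        means.append(mu)
        variances.append(sum((v - mu) ** 2 for v in seg) / len(seg))
    if noise_var is None:
        noise_var = sum(variances) / len(variances)
    out = []
    for v, mu, var in zip(x, means, variances):
        if var <= noise_var:
            # Region looks like pure noise: fall back to the local mean.
            out.append(mu)
        else:
            # Keep the part of the deviation not explained by noise.
            out.append(mu + (1.0 - noise_var / var) * (v - mu))
    return out


def denoise(x, taps=(0.25, 0.5, 0.25), window=5):
    """Hypothetical cascade: FIR smoothing first, then the Wiener stage."""
    return local_wiener(fir_filter(x, taps), window=window)


if __name__ == "__main__":
    noisy = [0.0, 0.9, 0.1, 1.1, -0.1, 1.0, 0.2, 0.8]  # made-up samples
    print(denoise(noisy))
```

A real system would more likely estimate the noise variance from a known noise-only segment than from the whole signal, as the default here does.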

Improvement of Environment Recognition using Multimodal Signal (멀티 신호를 이용한 환경 인식 성능 개선)

  • Park, Jun-Qyu;Baek, Seong-Joon
    • The Journal of the Korea Contents Association / v.10 no.12 / pp.27-33 / 2010
  • In this study, we conducted classification experiments with a GMM (Gaussian Mixture Model) on combined features extracted from a microphone, a gyro sensor, and an acceleration sensor in 9 different environment types. Existing context-awareness studies tried to recognize the environment mainly from environmental sound data captured with a microphone, but recognition was limited by the structural characteristics of environmental sound, which is composed of combinations of various noises. We therefore propose additionally using gyro sensor and acceleration sensor data to reflect the movement of the recognizing agent. According to the experimental results, combining acceleration sensor data with the existing environmental sound features improves recognition performance by more than 5% compared with existing methods that use only the sound features from the microphone.
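
The feature-combination idea can be illustrated with a small sketch. It assumes feature-level fusion (concatenating the audio and accelerometer feature vectors) and, for brevity, fits one diagonal-covariance Gaussian per class, which is the single-component special case of a GMM rather than a full mixture. The class labels, feature values, and function names are invented for illustration and do not come from the paper.

```python
import math


def fuse(audio_feat, accel_feat):
    """Feature-level fusion: concatenate the two feature vectors."""
    return list(audio_feat) + list(accel_feat)


def fit_gaussian(rows):
    """Per-dimension mean and variance (diagonal covariance) of the rows."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[d] for r in rows) / n for d in range(dims)]
    # A small floor keeps the variance strictly positive.
    variances = [
        max(sum((r[d] - means[d]) ** 2 for r in rows) / n, 1e-6)
        for d in range(dims)
    ]
    return means, variances


def log_likelihood(x, model):
    """Log density of x under a diagonal Gaussian."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, means, variances)
    )


def train(samples):
    """samples maps each class label to a list of fused feature vectors."""
    return {label: fit_gaussian(rows) for label, rows in samples.items()}


def predict(x, models):
    """Pick the class whose Gaussian gives x the highest log-likelihood."""
    return max(models, key=lambda label: log_likelihood(x, models[label]))


# Toy data: the audio features overlap, the accelerometer feature separates.
samples = {
    "bus": [fuse([1.0, 2.0], [0.9]), fuse([1.2, 1.8], [1.1]), fuse([0.9, 2.1], [1.0])],
    "office": [fuse([1.1, 2.0], [0.1]), fuse([1.0, 1.9], [0.0]), fuse([1.0, 2.2], [0.2])],
}
models = train(samples)

if __name__ == "__main__":
    print(predict(fuse([1.05, 2.0], [0.95]), models))
```

In this toy setup the two classes are nearly indistinguishable from the audio dimensions alone, which mirrors the limitation of sound-only recognition described in the abstract.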

The research on the MEMS device improvement which is necessary for the noise environment in the speech recognition rate improvement (잡음 환경에서 음성 인식률 향상에 필요한 MEMS 장치 개발에 관한 연구)

  • Yang, Ki-Woong;Lee, Hyung-keun
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.12 / pp.1659-1666 / 2018
  • When the input mixes voice with other sounds, the noise lowers the voice recognition rate. To overcome the limits of software processing, this work improves the speech recognition rate by improving the MEMS device, that is, the hardware. The MEMS microphone is a voice input device that is implemented and used in various shapes. Conventional MEMS microphones generally perform well, but in special environments such as noisy ones, processing performance deteriorates because voice and other sounds are mixed. To overcome these problems, we developed a newly designed MEMS device that can detect voice characteristics at the initial input device.

A study on Recognition of Inpatient Room Acoustic Pattern for Hospital safety (병원안전을 위한 입원실 음향패턴 인식 관한 연구)

  • Ryu, Han-Sul;Ahn, Jong-Young
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.21 no.3 / pp.169-173 / 2021
  • Safety accidents continue to occur in hospitals. In particular, accidents involving elderly patients with weak immunity, such as those in nursing hospitals, keep occurring, and countermeasures are needed. Most accidents are caused by patient movement. To reduce safety accidents by analyzing and recognizing the sounds that patient movement produces in the inpatient room, this paper classifies sound patterns in the hospital inpatient room using DTW (Dynamic Time Warping), an algorithm applicable to time-series pattern recognition, and analyzes the method by applying it to the inpatient room environment.
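
For readers unfamiliar with DTW, the following minimal sketch shows the classic dynamic-programming form and a nearest-template classifier built on it. It uses 1-D sequences with an absolute-difference local cost. The template labels and values are hypothetical, and the paper's actual features and any path constraints are not given in the abstract.

```python
from math import inf


def dtw_distance(a, b):
    """Classic DTW distance between two 1-D sequences, local cost |x - y|."""
    n, m = len(a), len(b)
    # cost[i][j] is the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b[j-1] is reused
                cost[i][j - 1],      # b advances, a[i-1] is reused
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def classify(query, templates):
    """Return the label of the template with the smallest DTW distance."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


# Hypothetical amplitude-envelope templates, invented for this example.
templates = {
    "footsteps": [0, 1, 0, 1, 0, 1, 0],
    "fall": [0, 0, 5, 1, 0, 0, 0],
}

if __name__ == "__main__":
    query = [0, 0, 0, 5, 5, 1, 0]  # a time-stretched version of "fall"
    print(classify(query, templates))
```

Because DTW lets one sample align with several samples of the other sequence, the stretched query matches the "fall" template at zero cost even though the two differ sample by sample.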

Recognition of Overlapped Sound and Influence Analysis Based on Wideband Spectrogram and Deep Neural Networks (광역 스펙트로그램과 심층신경망에 기반한 중첩된 소리의 인식과 영향 분석)

  • Kim, Young Eon;Park, Gooman
    • Journal of Broadcast Engineering / v.23 no.3 / pp.421-430 / 2018
  • Many voice recognition systems use methods such as MFCC and HMM to recognize the human voice. These methods are designed to analyze only a targeted sound, normally one exchanged between a human and a device. Their recognition capability is limited, however, for group sounds that span a wider frequency range, such as dog barking and indoor sounds. Overlapped sound occupies a wide frequency range, up to 20 kHz, which is higher than that of voice. This paper proposes a new recognition method that covers a wider frequency range by combining the Wideband Sound Spectrogram (WSS) with the Keras Sequential Model (KSM) based on a DNN. The WSS is adopted to analyze and verify diverse sounds across a wide frequency range, since it is designed to extract features and also to classify. The KSM performs pattern recognition on the features extracted by the WSS to improve sound recognition quality. The experiment verified that the proposed WSS and KSM classified the targeted sound well in noisy environments with overlapped sounds such as dog barking and indoor sounds. The paper also presents a stage-by-stage analysis and comparison of how each factor influences recognition, and of its characteristics at various noise levels.

Development of the Mechanical Timer's Gear Sound Recognition system (기계식 타이머의 치차음 인식 시스템 개발)

  • 서영호;이돈진;안중환
    • Proceedings of the Korean Society of Precision Engineering Conference / 2001.04a / pp.217-220 / 2001
  • We have developed a gear sound recognition system for mechanical timers. A mechanical timer is more durable than an electronic timer, so it is reliable under severe operating environments. Because it is assembled from several kinds of gears, it emits mechanical gear sounds when it operates. We chose a microphone to detect the gear sound, since a microphone is a low-cost, non-contact sensor and is therefore more efficient and convenient than other sensors. For ease of measurement, we designed real-time processing software with a graphical user interface.


Lip Region Extraction by Gaussian Classifier (가우스 분류기를 이용한 입술영역 추출)

  • Kim, Jeong Yeop
    • Journal of Korea Multimedia Society / v.20 no.2 / pp.108-114 / 2017
  • Lip reading is a field of image processing that assists sound recognition. In some environments, the captured sound signal contains significant noise, so the recognition rate of the sound signal decreases. Lip reading can provide a good feature for increasing recognition rates. Many conventional lip extraction methods have been proposed. Maia et al. proposed a method based on the sum of Cr and Cb. However, it has two problems: the point with maximum saturation is not always regarded as the lip region, and inner parts of the mouth such as the oral cavity and teeth can be classified as lips. To solve these problems, this paper proposes a method that adopts a histogram-based classifier to extract the lip region. The proposed method consists of two stages, learning and test. The amount of computation is minimized because the method requires no color conversion. The proposed method achieves a detection rate of 66.8%, compared with 28% for conventional methods.

Voice Recognition Performance Improvement using a convergence of Voice Energy Distribution Process and Parameter (음성 에너지 분포 처리와 에너지 파라미터를 융합한 음성 인식 성능 향상)

  • Oh, Sang-Yeob
    • Journal of Digital Convergence / v.13 no.10 / pp.313-318 / 2015
  • Traditional speech enhancement methods distort the sound spectrum through their estimation of the remaining noise, or leave invalid noise, which lowers speech recognition performance. In this paper, we propose a speech detection method that combines sound energy distribution processing with sound energy parameters. The proposed method reduces the influence of noise in order to maximize the voice energy. In addition, the log-energy features of low-energy intervals are treated relative to those of high-energy regions so that they resemble the log-energy features of the noisy voice signal, which reduces the mismatch between the training and recognition environments. Recognition experiments confirmed improved recognition performance compared with the conventional method. In the car noise environment, the pause hit rate showed accuracies of 97.1% and 97.3% in the low-SNR region (0 dB and 5 dB), and 98.3% and 98.6% in the high-SNR region (10 dB and 15 dB).