Search | Korea Science

A Study on the Noisy Speech Recognition Based on the Data-Driven Model Parameter Compensation (직접데이터 기반의 모델적응 방식을 이용한 잡음음성인식에 관한 연구)

Chung, Yong-Joo
- Speech Sciences
- /
- v.11 no.2
- /
- pp.247-257
- /
- 2004
There have been many research efforts to overcome the problems of speech recognition in noisy conditions. Among them, model-based compensation methods such as parallel model combination (PMC) and vector Taylor series (VTS) have been found to perform well compared with earlier speech enhancement methods and feature-based approaches. In this paper, a data-driven model compensation approach that adapts the HMM (hidden Markov model) parameters for noisy speech recognition is proposed. Instead of relying on statistical approximations as conventional model-based methods such as PMC do, the statistics necessary for HMM parameter adaptation are estimated directly using the Baum-Welch algorithm. The proposed method has shown improved results compared with PMC for noisy speech recognition.

AI-based language tutoring systems with end-to-end automatic speech recognition and proficiency evaluation

Byung Ok Kang;Hyung-Bae Jeon;Yun Kyung Lee
- ETRI Journal
- /
- v.46 no.1
- /
- pp.48-58
- /
- 2024
This paper presents the development of language tutoring systems for non-native speakers by leveraging advanced end-to-end automatic speech recognition (ASR) and proficiency evaluation. Given the frequent errors in non-native speech, high-performance spontaneous speech recognition must be applied. Our systems accurately evaluate pronunciation and speaking fluency and provide feedback on errors by relying on precise transcriptions. End-to-end ASR is implemented and enhanced by using diverse non-native speaker speech data for model training. For performance enhancement, we combine semi-supervised and transfer learning techniques using labeled and unlabeled speech data. Automatic proficiency evaluation is performed by a model trained to maximize the statistical correlation between the fluency score manually determined by a human expert and a calculated fluency score. We developed an English tutoring system for Korean elementary students called EBS AI Peng-Talk and a Korean tutoring system for foreigners called KSI Korean AI Tutor. Both systems were deployed by South Korean government agencies.
https://doi.org/10.4218/etrij.2023-0322

Automatic classification of failure patterns in semiconductor EDS Test using pattern recognition (반도체 EDS공정에서의 패턴인식기법을 이용한 불량 유형 자동 분류 방법 연구)

한영신;황미영;이칠기
- Proceedings of the IEEK Conference
- /
- 2003.07b
- /
- pp.703-706
- /
- 2003
Yield enhancement in semiconductor fabrication is important. Ideally, all failures would be prevented. However, when a failure occurs, it is important to quickly identify the process stage that caused it and take countermeasures. Automatic extraction of failure patterns from a fail bit map reduces analysis time and facilitates yield enhancement. This paper describes techniques to automatically classify a failure pattern using a fail bit map, a new simple schema which facilitates failure analysis.

Performance Enhancement of Marker Detection and Recognition using SVM and LDA (SVM과 LDA를 이용한 마커 검출 및 인식의 성능 향상)

Kang, Sun-Kyoung;So, In-Mi;Kim, Young-Un;Lee, Sang-Seol;Jung, Sung-Tae
- Journal of Korea Multimedia Society
- /
- v.10 no.7
- /
- pp.923-933
- /
- 2007
In this paper, we present a method for enhancing the performance of a marker detection system by using SVM (Support Vector Machine) and LDA (Linear Discriminant Analysis). The method converts the input image to a binary image and extracts the contours of objects in the binary image. It then approximates the contours as a list of line segments and finds quadrangles by using geometrical features extracted from the approximated line segments. It normalizes the shape of each extracted quadrangle into an exact square by using a warping technique and scale transformation. It extracts feature vectors from the square image by using principal component analysis and then checks whether the square image is a marker image or a non-marker image by using an SVM classifier. After that, it computes feature vectors by using LDA for the extracted marker images and calculates the distance between the feature vector of the input marker image and those of the standard markers. Finally, it recognizes the marker by using the minimum distance method. Experimental results show that the proposed method improves the recognition rate with smaller feature vectors by using LDA and can decrease false detection errors by using SVM.
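The final recognition step described in this abstract, choosing the standard marker whose feature vector is closest to that of the input, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the marker names, the three-dimensional feature vectors, and the use of Euclidean distance are assumptions made only for illustration, since the abstract does not state which distance measure is used.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize_marker(feature, standard_markers):
    """Return (marker_id, distance) for the closest standard marker."""
    if not standard_markers:
        raise ValueError("no standard markers given")
    best_id = min(
        standard_markers,
        key=lambda k: euclidean_distance(feature, standard_markers[k]),
    )
    return best_id, euclidean_distance(feature, standard_markers[best_id])


# Hypothetical low-dimensional feature vectors (assumed values, not from the paper).
STANDARD_MARKERS = {
    "marker_A": [0.0, 0.0, 1.0],
    "marker_B": [1.0, 0.0, 0.0],
    "marker_C": [0.0, 1.0, 0.0],
}

# Feature vector of an input marker image (assumed values).
marker_id, dist = recognize_marker([0.9, 0.1, 0.05], STANDARD_MARKERS)
print(marker_id, round(dist, 3))
```

In a real system the vectors would come from the LDA projection of the normalized square image, and a distance threshold could be added to reject inputs that are far from every standard marker.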

Distortion Invariant Vehicle License Plate Extraction and Recognition Algorithm (왜곡 불변 차량 번호판 검출 및 인식 알고리즘)

Kim, Jin-Ho
- The Journal of the Korea Contents Association
- /
- v.11 no.3
- /
- pp.1-8
- /
- 2011
Automatic vehicle license plate recognition technology is widely used in gate control, parking control, and police enforcement against illegal vehicles. However, the inherent geometric information of the license plate can be distorted in vehicle images by the slant of the plate and by sunlight or lighting conditions. In this paper, a distortion-invariant vehicle license plate extraction and recognition algorithm is proposed. First, a binary image that preserves clean character strokes is obtained by using a DoG filter. The plate area is then extracted by using the location of consecutive digits, which preserves the distortion-invariant characteristic. The license plate is recognized by using neural networks after geometric distortion correction and image enhancement. Simulation results show that the proposed algorithm achieves an accuracy of 98.4% and an average speed of 0.05 seconds in the recognition of 6,200 vehicle images obtained with a commercial LPR system.
https://doi.org/10.5392/JKCA.2011.11.3.001 KSCI

Gain Compensation Method for Codebook-Based Speech Enhancement (코드북 기반 음성향상 기법을 위한 게인 보상 방법)

Jung, Seungmo;Kim, Moo Young
- Journal of the Institute of Electronics and Information Engineers
- /
- v.51 no.9
- /
- pp.165-170
- /
- 2014
Speech enhancement techniques that remove surrounding noise are emphasized as a preprocessor for speech recognition. Among the various speech enhancement techniques, Codebook-based Speech Enhancement (CBSE) operates efficiently in non-stationary noise environments. However, CBSE can estimate inaccurate gains if a mismatch occurs between the input noisy signal and the trained speech/noise codevectors. In this paper, a Normalized Weighting Factor (NWF) is calculated by a long-term noise estimation algorithm based on the Signal-to-Noise Ratio and is used to compensate the conventional inaccurate gains. The proposed CBSE shows better performance than conventional CBSE.
https://doi.org/10.5573/ieie.2014.51.9.165 KSCI

Region-Based Reconstruction Method for Resolution Enhancement of Low-Resolution Facial Image (저해상도 얼굴 영상의 해상도 개선을 위한 영역 기반 복원 방법)

Park, Jeong-Seon
- Journal of KIISE:Software and Applications
- /
- v.34 no.5
- /
- pp.476-486
- /
- 2007
This paper proposes a resolution enhancement method which can reconstruct high-resolution facial images from single-frame, low-resolution facial images. The proposed method is derived from example-based reconstruction methods and the morphable face model. In order to improve the performance of example-based reconstruction, we propose a region-based reconstruction method which can maintain the characteristics of local facial regions. Also, in order to apply the capability of the morphable face model to face resolution enhancement problems, we define an extended morphable face model in which an extended face is composed of a low-resolution face, its interpolated high-resolution face, and the high-resolution equivalent, and the extended face is then separated into an extended shape vector and an extended texture vector. The encouraging results show that the proposed methods can be used to improve the performance of face recognition systems, particularly to enhance the resolution of facial images captured by visual surveillance systems.
KSCI

3D image processing using laser slit beam and CCD camera (레이저 슬릿빔과 CCD 카메라를 이용한 3차원 영상인식)

김동기;윤광의;강이석
- 제어로봇시스템학회:학술대회논문집
- /
- 1997.10a
- /
- pp.40-43
- /
- 1997
This paper presents a 3D object recognition method for generating a 3D environmental map or for obstacle recognition by mobile robots. An active light source projects a stripe pattern of light onto the object surface, while the camera observes the projected pattern from an offset point. The system consists of a laser unit and a camera on a pan/tilt device. A line segment in the 2D camera image implies an object surface plane. Scaling, filtering, edge extraction, object extraction, and line thinning are used to enhance the light stripe image. Faithful depth information about the object surface can be obtained from the interpretation of the line segments. The performance of the proposed method is demonstrated in detail through experiments on various types of objects. Experimental results show that the method has good position accuracy, effectively eliminates optical noise in the image, greatly reduces the memory requirement, and greatly cuts down the image processing time for 3D object recognition compared with conventional object recognition.

Real-time and reconfigurable hardware filter for face recognition (얼굴 인식을 위한 실시간 재구성형 하드웨어 필터)

송민규;송승민;동성수;이종호;이필규
- Proceedings of the IEEK Conference
- /
- 2003.07c
- /
- pp.2645-2648
- /
- 2003
In this paper, a real-time and reconfigurable hardware filter for face recognition is proposed and implemented on an FPGA chip using Verilog HDL. In general, face recognition is considerably difficult because it is influenced by noise and variations in illumination. Some commonly used filters, such as the histogram equalization filter and the contrast stretching filter for image enhancement, and an illumination compensation filter are proposed for realizing more effective illumination compensation. The filter proposed in this paper was designed and verified by debugging and simulating on hardware. Experimental results show that the proposed filter system can generate a selective set of real-time reconfigurable hardware filters suitable for face recognition in various situations.
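For readers unfamiliar with the two image enhancement filters named in this abstract, the following Python sketch shows the textbook forms of contrast stretching and histogram equalization on a flat list of 8-bit grayscale pixels. It is a software illustration only: the paper implements its filters in hardware on an FPGA, and the function names and sample pixel values here are assumptions, not taken from the paper.

```python
def contrast_stretch(pixels, out_min=0, out_max=255):
    """Linearly map the input intensity range onto [out_min, out_max]."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # A flat image has no contrast to stretch.
        return list(pixels)
    scale = (out_max - out_min) / (hi - lo)
    return [round(out_min + (p - lo) * scale) for p in pixels]


def histogram_equalize(pixels, levels=256):
    """Remap intensities through the normalized cumulative histogram."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution of pixel counts per intensity level.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        # Only one intensity level is present.
        return list(pixels)
    return [
        round((cdf[p] - cdf_min) / (total - cdf_min) * (levels - 1))
        for p in pixels
    ]


# Hypothetical dim image with intensities clustered in a narrow range.
dim_image = [50, 60, 60, 70, 80, 100]
stretched = contrast_stretch(dim_image)
equalized = histogram_equalize(dim_image)
print(stretched)
print(equalized)
```

Contrast stretching keeps the relative spacing of intensities, while histogram equalization redistributes them according to how often each level occurs, which is why the two outputs differ for the same input.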

A study on Voice Recognition using Model Adaptation HMM for Mobile Environment (모델적응 HMM을 이용한 모바일환경에서의 음성인식에 관한 연구)

Ahn, Jong-Young;Kim, Sang-Bum;Kim, Su-Hoon;Hur, Kang-In
- The Journal of the Institute of Internet, Broadcasting and Communication
- /
- v.11 no.3
- /
- pp.175-179
- /
- 2011
In this paper, we propose the MA (Model Adaptation) HMM, which uses speech enhancement and feature compensation. Normally, voice reference data does not take real noise data into account. The proposed method uses real-life environmental noise data rather than estimated noise. We applied this noise-contaminated data to build a recognition reference model suited to noisy environments. The MAHMM is combined with surrounding noise when the reference pattern is generated. Using the MAHMM, we improved the voice recognition rate in a mobile environment.
https://doi.org/10.7236/JIWIT.2011.11.3.175 KSCI

Search Result 362, Processing Time 0.026 seconds