• Title/Abstract/Keywords: Recognition Enhancement

361 results found (processing time: 0.023 s)

A User-friendly Remote Speech Input Method in Spontaneous Speech Recognition System

  • Suh, Young-Joo;Park, Jun;Lee, Young-Jik
    • The Journal of the Acoustical Society of Korea
    • /
    • Vol. 17, No. 2E
    • /
    • pp.38-46
    • /
    • 1998
  • In this paper, we propose a remote speech input device, a new method of user-friendly speech input for spontaneous speech recognition systems. We focus user friendliness on hands-free operation and microphone independence in speech recognition applications. Our method adopts two algorithms: automatic speech detection and microphone array delay-and-sum beamforming (DSBF)-based speech enhancement. The automatic speech detection algorithm is composed of two stages: detection of a speech portion candidate, and classification of speech versus nonspeech using pitch information for the detected candidate. The DSBF algorithm adopts the time-domain cross-correlation method for time delay estimation. In the performance evaluation, the speech detection algorithm shows within-200 ms start point accuracy of 93%, 99% under 15dB, 20dB, and 25dB signal-to-noise ratio (SNR) environments, respectively, and the end point accuracies are 72%, 89%, and 93% for the corresponding environments. The classification of speech and nonspeech for the start-point-detected region of the input signal is performed by the pitch information-based method. The percentages of correct classification for speech and nonspeech input are 99% and 90%, respectively. The eight-microphone array-based speech enhancement using the DSBF algorithm shows a maximum SNR gain of 6dB over a single microphone and an error reduction of more than 15% in the spontaneous speech recognition domain.

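The delay-and-sum idea in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the white-noise "speech", the four-microphone setup, the per-microphone delays, and the function names `estimate_delay` and `delay_and_sum` are all assumptions made for the example. Delays are estimated by a brute-force time-domain cross-correlation against the first channel.

```python
import random

def estimate_delay(ref, sig, max_lag):
    """Lag (in samples) maximizing the time-domain cross-correlation
    of ref and sig; a positive lag means sig arrives later than ref."""
    n = len(ref)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += ref[i] * sig[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def delay_and_sum(channels, max_lag=16):
    """Align every channel to channel 0, then average the aligned channels."""
    ref = channels[0]
    n = len(ref)
    out = [0.0] * n
    for ch in channels:
        lag = estimate_delay(ref, ch, max_lag)
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                out[i] += ch[j]
    return [v / len(channels) for v in out]

# Synthetic data (all values hypothetical): one source, four noisy microphones.
random.seed(0)
clean = [random.gauss(0.0, 1.0) for _ in range(400)]
delays = [0, 3, 7, 5]
channels = [
    [(clean[i - d] if i >= d else 0.0) + random.gauss(0.0, 0.5) for i in range(400)]
    for d in delays
]
enhanced = delay_and_sum(channels)
```

Averaging aligned channels keeps the common source while uncorrelated noise partially cancels, which is where the SNR gain over a single microphone comes from.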

Image Enhancement Method Research for Face Detection

  • 전인자;정경용
    • 한국콘텐츠학회논문지
    • /
    • Vol. 9, No. 10
    • /
    • pp.13-21
    • /
    • 2009
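  • This paper studies image quality enhancement for accurate face region detection. Typical recognition systems apply a fixed image processing pipeline to every input image. When a fixed image processing filter is applied to face images acquired under varied environmental conditions, the face region cannot be detected accurately. To turn images containing complex backgrounds and lighting into images suitable for detection, we propose a category-based image enhancement method built on sub-windows. When an image is acquired for processing, the mean value is computed from its sub-windows and compared against pre-built categories, and the image processing method applicable to the input image is then applied selectively. Detection rates obtained through face image registration were compared between images processed with whole-image methods such as histogram equalization and gamma transformation and images processed with the proposed method; the proposed method yielded markedly improved registration results.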
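The selection step described in this abstract (sub-window mean, category lookup, selective processing) might look roughly like the sketch below. The category boundaries, the gamma values, the choice of gamma correction as the only enhancement, and applying it per sub-window are all assumptions for illustration; the abstract does not give the actual categories or filters.

```python
def subwindow_means(img, win):
    """Mean gray level (0-255) of each win x win sub-window.
    img is a list of rows; keys are the (top, left) corner of each window."""
    h, w = len(img), len(img[0])
    means = {}
    for top in range(0, h, win):
        for left in range(0, w, win):
            block = [img[y][x]
                     for y in range(top, min(top + win, h))
                     for x in range(left, min(left + win, w))]
            means[(top, left)] = sum(block) / len(block)
    return means

# Hypothetical categories: (exclusive upper bound on the mean, gamma to apply).
CATEGORIES = [(85, 0.5), (170, 1.0), (256, 2.0)]

def pick_gamma(mean):
    """Return the gamma of the first category whose bound exceeds the mean."""
    for upper, gamma in CATEGORIES:
        if mean < upper:
            return gamma
    return 1.0

def enhance(img, win=2):
    """Apply the category's gamma correction to each sub-window."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for (top, left), mean in subwindow_means(img, win).items():
        gamma = pick_gamma(mean)
        for y in range(top, min(top + win, h)):
            for x in range(left, min(left + win, w)):
                out[y][x] = round(255 * (img[y][x] / 255) ** gamma)
    return out

# Toy image: a dark sub-window on the left, a bright one on the right.
img = [[10, 20, 240, 250],
       [30, 40, 230, 220]]
out = enhance(img, win=2)
```

With these assumed categories the dark window is brightened (gamma below 1) and the bright window is darkened (gamma above 1), while mid-range windows pass through unchanged.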

EAR: Enhanced Augmented Reality System for Sports Entertainment Applications

  • Mahmood, Zahid;Ali, Tauseef;Muhammad, Nazeer;Bibi, Nargis;Shahzad, Imran;Azmat, Shoaib
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 11, No. 12
    • /
    • pp.6069-6091
    • /
    • 2017
  • Augmented Reality (AR) overlays virtual information on real-world data, such as displaying useful information on videos or images of a scene. This paper presents an Enhanced AR (EAR) system that displays players' statistical information on captured images of a sports game. We focus on the situation where the input image is degraded by strong sunlight. The proposed EAR system includes an image enhancement technique to improve the accuracy of subsequent player and face detection. The image enhancement is followed by player and face detection, face recognition, and display of the players' statistics. First, an algorithm based on multi-scale retinex is proposed for image enhancement. Then, to detect players and faces, we use adaptive boosting and Haar features for feature extraction and classification. The player face recognition algorithm uses boosted linear discriminant analysis to select features and a nearest neighbor classifier for classification. The system can be adjusted to work in different types of sports where the input is an image and the desired output is a display of information near the recognized players. Simulations are carried out on 2096 different images that contain players in diverse conditions. The proposed EAR system demonstrates the great potential of computer vision-based approaches for developing AR applications.

Multi-resolution DenseNet based acoustic models for reverberant speech recognition

  • 박순찬;정용원;김형순
    • 말소리와 음성과학
    • /
    • Vol. 10, No. 1
    • /
    • pp.33-38
    • /
    • 2018
  • Although deep neural network-based acoustic models have greatly improved the performance of automatic speech recognition (ASR), reverberation still degrades the performance of distant speech recognition in indoor environments. In this paper, we adopt the DenseNet, which has shown great performance results in image classification tasks, to improve the performance of reverberant speech recognition. The DenseNet enables the deep convolutional neural network (CNN) to be effectively trained by concatenating feature maps in each convolutional layer. In addition, we extend the concept of multi-resolution CNN to multi-resolution DenseNet for robust speech recognition in reverberant environments. We evaluate the performance of reverberant speech recognition on the single-channel ASR task in reverberant voice enhancement and recognition benchmark (REVERB) challenge 2014. According to the experimental results, the DenseNet-based acoustic models show better performance than do the conventional CNN-based ones, and the multi-resolution DenseNet provides additional performance improvement.
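The feature-map concatenation mentioned in this abstract can be illustrated with a toy dense block. This sketch only shows the connectivity pattern: `toy_layer` is a hypothetical stand-in for a real convolutional layer, and the channel counts and growth rate are arbitrary example values, not the paper's configuration.

```python
def toy_layer(inputs, growth_rate):
    """Placeholder for a convolutional layer (not a real convolution):
    emits growth_rate new channels, each a scaled mean of all input channels."""
    n = len(inputs[0])
    mean = [sum(ch[i] for ch in inputs) / len(inputs) for i in range(n)]
    return [[(k + 1) * v for v in mean] for k in range(growth_rate)]

def dense_block(x, num_layers, growth_rate):
    """Each layer sees the concatenation of the block input and the outputs
    of every earlier layer, and appends its own output to that list."""
    features = list(x)
    for _ in range(num_layers):
        new_maps = toy_layer(features, growth_rate)
        features = features + new_maps  # concatenate along the channel axis
    return features

# Three input channels of length 4 (arbitrary example values).
x = [[1.0, 2.0, 3.0, 4.0],
     [0.0, 1.0, 0.0, 1.0],
     [2.0, 2.0, 2.0, 2.0]]
y = dense_block(x, num_layers=4, growth_rate=2)
```

The output carries the original 3 channels plus 2 new channels per layer, 11 in total, so early feature maps stay directly available to every later layer.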

The Speech Recognition Using the Diffusion Network

  • 허만택
    • 한국음향학회:학술대회논문집
    • /
    • Acoustical Society of Korea, 1996 Youngnam Chapter Symposium Proceedings
    • /
    • pp.70-75
    • /
    • 1996
  • In this paper, a pre-processing method for recognizing single vowels using the spectrum envelope is presented. We use a new method of extracting the spectrum envelope with a diffusion filter bank. We reduced the total processing time and obtained better discrimination. Having obtained an average recognition rate of 88.3% for single vowels of real voice in computer simulation, we confirmed that the method is useful for speech recognition that applies spectrum analysis to voice signals with many frequency components.


A study on Quality Management in Small and Medium Enterprises

  • 박노국
    • 대한안전경영과학회지
    • /
    • Vol. 8, No. 1
    • /
    • pp.131-144
    • /
    • 2006
  • Quality system management adopted by small and medium enterprises in Kangwon province to enhance competitiveness was studied. Variance analysis was performed on several questionnaire answers. Motives for acquiring the accreditation, such as product export, adjustment to international trends, enhancement of brand/product recognition, change in the CEO's mindset, and management innovation, differed significantly among business types. Mindset changes after accreditation were setting quality as the company's first priority, enhanced recognition of compliance with in-house standards and regulations, and employees performing with an awareness of quality. Among the problems in maintaining the accreditation were difficulties in maintaining recognition of the company's quality management, the added labor of maintaining the ISO 9000 enforcement team, and the financial burden of keeping the accreditation. Quality recognition after accreditation improved significantly in setting quality as the company's first priority, enhanced recognition of compliance with in-house standards and regulations, and employees performing with an awareness of quality.

Methods for Video Caption Extraction and Extracted Caption Image Enhancement

  • 김소명;곽상신;최영우;정규식
    • 한국정보과학회논문지:소프트웨어및응용
    • /
    • Vol. 29, No. 4
    • /
    • pp.235-247
    • /
    • 2002
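  • To index and retrieve digital video effectively, research is needed on extracting and recognizing video captions, which concisely express the content of the video. This paper proposes methods for extracting caption regions from Korean and English captions artificially inserted into uncompressed movie video, and for enhancing the extracted caption images. The distinguishing feature of the proposed method is that it automatically locates the frames that carry the same caption and combines those frames to first remove some or all of the noise contained in the background. It then applies resolution enhancement, histogram equalization, stroke-based binarization, and smoothing to the resulting image in sequence, raising it to a recognizable quality. Applying the proposed method to video made it possible to extract caption images in groups of frames sharing the same caption, and yielded caption images with the noise removed and the strokes of complex graphemes preserved. Identifying the start and end positions of frames with the same caption can be useful for indexing and retrieving video. Applying the method to Korean and English movie captions gave improved character recognition results.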
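Combining frames that share one caption can be illustrated with a small sketch. The abstract does not say which combination operator the authors use, so two things here are assumptions: captions are taken to be bright and static, so a pixel-wise minimum across frames is used, and a fixed global threshold stands in for the paper's stroke-based binarization.

```python
def integrate_frames(frames):
    """Pixel-wise minimum over grayscale frames that carry the same caption.
    Static bright caption pixels survive; background pixels that are dark
    in at least one frame are suppressed. (Assumed operator.)"""
    h, w = len(frames[0]), len(frames[0][0])
    return [[min(f[y][x] for f in frames) for x in range(w)] for y in range(h)]

def binarize(img, threshold=128):
    """Global threshold, a simple substitute for stroke-based binarization."""
    return [[1 if v >= threshold else 0 for v in row] for row in img]

# Three toy 2x3 frames: the middle column is the caption (always 250),
# the other pixels are moving background of varying brightness.
frames = [
    [[200, 250, 20], [30, 250, 210]],
    [[40, 250, 220], [190, 250, 35]],
    [[25, 250, 30], [220, 250, 15]],
]
caption_mask = binarize(integrate_frames(frames))
single_frame_mask = binarize(frames[0])
```

Binarizing a single frame leaves bright background pixels in the mask, whereas the integrated image keeps only the caption column.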

Iris Image Enhancement for the Recognition of Non-ideal Iris Images

  • Sajjad, Mazhar;Ahn, Chang-Won;Jung, Jin-Woo
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 10, No. 4
    • /
    • pp.1904-1926
    • /
    • 2016
  • Iris recognition for biometric personal identification has gained much interest owing to the increasing concern with security today. Image quality plays a major role in the performance of iris recognition systems. When capturing an iris image under uncontrolled conditions and dealing with non-cooperative people, the chance of getting non-ideal images is very high owing to poor focus, off-angle gaze, noise, motion blur, occlusion by eyelashes and eyelids, and glasses. To improve the accuracy of iris recognition on non-ideal iris images, we propose a novel algorithm that improves the quality of degraded iris images. First, the iris is localized to obtain accurate iris boundary detection, and the iris image is then normalized to a fixed size. Second, the valid region (iris region) is extracted from the segmented iris image so that only the iris region remains. Third, to get a well-distributed texture image, bilinear interpolation is applied to the segmented valid iris gray image. Contrast-limited adaptive histogram equalization (CLAHE) then enhances the low contrast of the interpolated image. The CLAHE result is further improved by stretching the minimum and maximum values to 0-255 with a histogram-stretching technique. The gray texture information is extracted by 1D Gabor filters, and the Hamming distance is chosen as the metric for recognition. The NICE-II training dataset taken from UBIRIS.v2 was used for the experiment. The proposed method outperformed other methods in terms of equal error rate (EER).
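Two of the steps named in this abstract, histogram stretching to the 0-255 range and Hamming-distance matching, are simple enough to sketch directly. The function names and sample values are assumptions for illustration, the Hamming distance is shown in its normalized (fractional) form, and occlusion masks are omitted.

```python
def stretch(pixels):
    """Linearly map the minimum gray level to 0 and the maximum to 255."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        return [0 for _ in pixels]  # flat image: nothing to stretch
    return [round((p - lo) * 255 / (hi - lo)) for p in pixels]

def hamming_distance(code_a, code_b):
    """Fraction of differing bits between two equal-length binary iris codes."""
    if len(code_a) != len(code_b):
        raise ValueError("iris codes must have the same length")
    return sum(a != b for a, b in zip(code_a, code_b)) / len(code_a)

stretched = stretch([50, 90, 150])
distance = hamming_distance([1, 0, 1, 1], [1, 1, 1, 0])
```

A smaller distance means the two codes more likely come from the same iris; the decision threshold is application-specific and is not given in the abstract.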

Proactive Personality, Knowledge Sharing Behavior, Job Characteristics, and Organizational Recognition: An Application of Costly Signaling Theory

  • 박지성;채희선
    • 한국산학기술학회논문지
    • /
    • Vol. 19, No. 12
    • /
    • pp.128-137
    • /
    • 2018
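  • Drawing on costly signaling theory and the self-enhancement motive, this paper examines the relationship between proactive personality and knowledge sharing behavior and, further, their relationship with organizational recognition. In addition to individual personality traits, job characteristics are considered as situational factors: we propose a moderated mediation model in which job complexity and variety moderate the positive relationships among proactive personality, knowledge sharing behavior, and organizational recognition. A supervisor-subordinate dyad survey of Korean firms yielded 166 dyads for empirical analysis. As predicted, employees with a more proactive personality showed more knowledge sharing behavior, and this behavior ultimately raised the organizational recognition rated by their supervisors. Moreover, the positive relationship between proactive personality and organizational recognition, mediated by knowledge sharing behavior, was stronger when job complexity and variety were high than when they were low, supporting the moderation hypotheses as predicted. By clarifying the motives behind knowledge sharing and the boundary conditions that activate it, these results offer theoretical and practical implications for knowledge management.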

Robust Speech Recognition in the Car Interior Environment having Car Noise and Audio Output

  • 박철호;배재철;배건성
    • 대한음성학회지:말소리
    • /
    • No. 62
    • /
    • pp.85-96
    • /
    • 2007
  • In this paper, we carried out recognition experiments on noisy speech containing various levels of car noise and audio system output, using the speech interface. The speech interface consists of three parts: pre-processing, an acoustic echo canceller, and post-processing. First, a high-pass filter is employed as the pre-processing part to remove some engine noise. Then, an echo canceller implemented with an FIR-type filter and an NLMS adaptive algorithm removes the music or speech coming from the car audio system. Finally, the MMSE-STSA based speech enhancement method is applied to the output of the echo canceller to further remove the residual noise. For the recognition experiments, we generated test signals by adding music to the car-noise speech from the Aurora 2 database. An HTK-based continuous HMM system is constructed as the recognition system. Experimental results show that the proposed speech interface is very promising for robust speech recognition in a noisy car environment.

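The echo-cancellation stage in this abstract, an FIR filter adapted with NLMS, can be sketched as below. This is a generic textbook NLMS loop, not the authors' system: the filter length, step size, three-tap echo path, and white-noise far-end signal are all hypothetical, and no near-end speech or car noise is present in the toy data.

```python
import random

def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-6):
    """Adapt an FIR estimate of the echo path with NLMS.
    Returns (residual signal after echo removal, final filter weights)."""
    w = [0.0] * taps       # adaptive FIR coefficients
    buf = [0.0] * taps     # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # estimated echo
        e = d - y                                    # echo-cancelled output
        norm = sum(b * b for b in buf) + eps         # input power normalization
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        residual.append(e)
    return residual, w

# Toy data (hypothetical): the microphone hears only the audio system's echo.
random.seed(1)
echo_path = [0.6, 0.3, -0.2]
far_end = [random.gauss(0.0, 1.0) for _ in range(2000)]
mic = [sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
       for n in range(2000)]
residual, weights = nlms_echo_cancel(far_end, mic)
```

In a real cabin the residual would still contain the talker's speech and car noise, which is why the abstract follows the echo canceller with a separate speech enhancement stage.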