• Title/Summary/Keyword: recognition-rate


Face Recognition Using First Moment of Image and Eigenvectors (영상의 1차 모멘트와 고유벡터를 이용한 얼굴인식)

  • Cho Yong-Hyun
    • Journal of Korea Multimedia Society
    • /
    • v.9 no.1
    • /
    • pp.33-40
    • /
    • 2006
  • This paper presents an efficient face recognition method using both the first moment of the image and eigenvectors. The first moment locates the centroid of the image; it is applied to exclude needless background in face recognition by shifting to the centroid of the face image. The eigenvectors, which serve as the basis images for face features, are extracted by principal component analysis (PCA). This improves recognition performance by removing the redundancy corresponding to the second-order statistics of the face image. The proposed method has been applied to the problem of recognizing 60 face images (15 persons × 4 scenes) of 320×243 pixels. Three distances (city-block, Euclidean, and negative angle) are used as measures when matching probe images to the nearest gallery images. For the 45 face images, the experimental results show that the recognition rate of the proposed method is about 1.6 times, and its classification rate about 5.6 times, higher than conventional PCA without preprocessing. The city-block distance achieved relatively more accurate classification than the Euclidean distance or the negative angle.
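
As a rough sketch of the two ingredients above, the first moment (centroid) of a grayscale image and city-block nearest-gallery matching can be written as follows. This is an illustrative reconstruction, not the paper's code; the image is a plain list of rows, and the gallery entries are hypothetical (name, feature-vector) pairs.

```python
def centroid(img):
    # First moment: intensity-weighted centre of a 2-D grayscale image,
    # used to shift the face window and drop needless background.
    total = sum(sum(row) for row in img)
    ybar = sum(y * sum(row) for y, row in enumerate(img)) / total
    xbar = sum(x * v for row in img for x, v in enumerate(row)) / total
    return ybar, xbar

def city_block(a, b):
    # L1 distance between two PCA feature vectors.
    return sum(abs(p - q) for p, q in zip(a, b))

def nearest_gallery(probe, gallery):
    # Match the probe vector to the closest gallery entry (name, vector).
    return min(gallery, key=lambda entry: city_block(probe, entry[1]))[0]
```

The Euclidean and negative-angle measures would simply replace `city_block` inside `nearest_gallery`.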


Feature Extraction by Optimizing the Cepstral Resolution of Frequency Sub-bands (주파수 부대역의 켑스트럼 해상도 최적화에 의한 특징추출)

  • 지상문;조훈영;오영환
    • The Journal of the Acoustical Society of Korea
    • /
    • v.22 no.1
    • /
    • pp.35-41
    • /
    • 2003
  • Feature vectors for conventional speech recognition are usually extracted over the full frequency band, so each sub-band contributes equally to the final recognition result. In this paper, feature vectors are extracted independently in each sub-band, and the cepstral resolution of each sub-band feature is controlled for optimal speech recognition. For this purpose, cepstral vectors of different dimensions are extracted for each sub-band, based on the multi-band approach, which extracts a feature vector independently for each sub-band. Speech recognition rates and clustering quality are suggested as criteria for finding the optimal combination of sub-band feature dimensions. In connected-digit recognition experiments on the TIDIGITS database, the proposed method gave a string accuracy of 99.125%, percent correct of 99.775%, and percent accuracy of 99.705%, which are relative error-rate reductions of 38%, 32%, and 37% over the baseline full-band feature vector, respectively.
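
A minimal sketch of per-band cepstral extraction with controllable resolution: each sub-band's log filter-bank energies are mapped to cepstra by a DCT-II, truncated to a per-band coefficient count, then concatenated. The band layout and dimension choices here are hypothetical, not the paper's optimized configuration.

```python
import math

def dct_cepstrum(log_energies, n_coeff):
    # DCT-II of log filter-bank energies, truncated to n_coeff coefficients
    # (the per-band "cepstral resolution").
    n = len(log_energies)
    return [sum(e * math.cos(math.pi * k * (i + 0.5) / n)
                for i, e in enumerate(log_energies))
            for k in range(n_coeff)]

def subband_features(bands, dims):
    # One cepstral vector per sub-band, each with its own dimension,
    # concatenated into the final feature vector.
    feat = []
    for band, dim in zip(bands, dims):
        feat.extend(dct_cepstrum(band, dim))
    return feat
```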

The Amur starfishes recognition using the adaptive filter (적응형 필터를 이용한 아무르 불가사리 인식)

  • Kim, Jong Ik;Shim, Hyun Bo;Kim, Sung Rak
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.17 no.4
    • /
    • pp.922-934
    • /
    • 2013
  • Amur starfish, known to cause huge damage to shellfish farms, are currently removed over wide areas only by divers and fishing nets, so higher starfish recognition rates are urgently needed to remove large numbers of Amur starfish effectively. Meanwhile, current technology for imaging starfish distributions is limited because visible light cannot produce usable color images in dark seabed areas. In this research, we used infrared light, which penetrates well and is robust against noise from floating objects in the water, to solve these problems. As a result, we acquired better images for analyzing starfish recognition rates, and achieved a satisfactory recognition rate of 88.7% for Amur starfish by adopting the most suitable adaptive filter method.
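
The abstract does not spell out which adaptive filter was chosen, so the following is only a generic illustration of the idea: a Wiener-style local noise-reduction filter that smooths strongly in flat regions, where the local variance is close to an assumed noise variance, and weakly near edges such as starfish outlines.

```python
def adaptive_filter(img, noise_var):
    # Wiener-style local filter on a 2-D grayscale image (list of rows):
    # out = pixel - min(noise_var / local_var, 1) * (pixel - local_mean).
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            win = [img[j][i] for j in (y - 1, y, y + 1)
                             for i in (x - 1, x, x + 1)]
            mean = sum(win) / 9.0
            var = sum((v - mean) ** 2 for v in win) / 9.0
            ratio = min(noise_var / var, 1.0) if var > 0 else 1.0
            out[y][x] = img[y][x] - ratio * (img[y][x] - mean)
    return out
```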

Emergency dispatching based on automatic speech recognition (음성인식 기반 응급상황관제)

  • Lee, Kyuwhan;Chung, Jio;Shin, Daejin;Chung, Minhwa;Kang, Kyunghee;Jang, Yunhee;Jang, Kyungho
    • Phonetics and Speech Sciences
    • /
    • v.8 no.2
    • /
    • pp.31-39
    • /
    • 2016
  • In emergency dispatching at the 119 Command & Dispatch Center, inconsistencies between the 'standard emergency aid system' and the 'dispatch protocol,' both of which are mandatory to follow, cause inefficiency in the dispatcher's performance. If an emergency dispatch system uses automatic speech recognition (ASR) to process the dispatcher's protocol speech during case registration, it can instantly extract and provide the information required by the 'standard emergency aid system,' making the rescue command more efficient. For this purpose, we developed a Korean large-vocabulary continuous speech recognition system with 400,000 words for the emergency dispatch system. The 400,000 words cover vocabulary from news, SNS, blogs, and the emergency rescue domain. The acoustic model is trained on 1,300 hours of telephone-call (8 kHz) speech, and the language model on a 13 GB text corpus. From the transcribed corpus of 6,600 real telephone calls, call logs with emergency rescue command class and identified major symptom are extracted in connection with the rescue activity log and the National Emergency Department Information System (NEDIS). ASR is applied to the dispatcher's repetition utterances about the patient information, and the emergency patient information is extracted based on the Levenshtein distance between the ASR result and the template information. Experimental results show a word error rate of 9.15% for speech recognition and an emergency response detection rate of 95.8% for the emergency dispatch system.
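
The Levenshtein-distance template-matching step can be sketched in a few lines; the token sequences and the template set below are hypothetical, not the actual dispatch-protocol templates.

```python
def levenshtein(a, b):
    # Edit distance between two sequences (characters or word tokens).
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def best_template(asr_hypothesis, templates):
    # Pick the protocol template closest to the ASR output.
    return min(templates, key=lambda t: levenshtein(asr_hypothesis, t))
```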

Visual Observation Confidence based GMM Face Recognition robust to Illumination Impact in a Real-world Database

  • TRA, Anh Tuan;KIM, Jin Young;CHAUDHRY, Asmatullah;PHAM, The Bao;Kim, Hyoung-Gook
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.10 no.4
    • /
    • pp.1824-1845
    • /
    • 2016
  • The GMM is a conventional approach that has recently been applied in many face recognition studies. However, how to deal with illumination changes while maintaining high performance is still a challenge, especially with real-world databases. In this paper, we propose a Visual Observation Confidence (VOC) measure for face recognition robust to illumination changes. The VOC value combines three measurements: a Flatness Measure (FM), a Centrality Measure (CM), and an Illumination Normality Measure (IM). While FM measures the discriminative ability of a face, IM represents the degree of illumination impact on that face. In addition, we introduce CM as a centrality measure that helps FM reduce errors from unnecessary areas such as the hair, neck, or background. The VOC then accompanies the feature vectors in the EM process to estimate the optimal models by modified-GMM training. In the experiments, we introduce a real-world database, called KoFace, in addition to public databases such as the Yale and ORL databases. The KoFace database is composed of 106 face subjects under diverse illumination effects, including shadows and highlights. The results show that our proposed approach gives a higher Face Recognition Rate (FRR) than the GMM baseline on the indoor and outdoor datasets of the real-world KoFace database (94% and 85%, respectively) and on the ORL and Yale databases (97% and 100%, respectively).
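
The modified-GMM idea of letting a per-sample confidence scale the EM statistics can be illustrated with a one-dimensional, two-component toy model: the update below simply multiplies each responsibility by the sample's VOC weight. This is a sketch of the general weighted-EM mechanism, not the paper's exact formulation.

```python
import math

def gauss(x, mu, var):
    # 1-D Gaussian density.
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def weighted_em_step(data, voc, params):
    # One EM iteration for a 1-D GMM in which each sample's responsibility
    # is scaled by its observation-confidence weight voc[i].
    weights, mus, variances = params
    k_comp = len(weights)
    resp = []
    for x in data:
        p = [weights[k] * gauss(x, mus[k], variances[k]) for k in range(k_comp)]
        s = sum(p)
        resp.append([pk / s for pk in p])
    new_w, new_mu, new_var = [], [], []
    for k in range(k_comp):
        nk = sum(voc[i] * resp[i][k] for i in range(len(data)))
        mu = sum(voc[i] * resp[i][k] * data[i] for i in range(len(data))) / nk
        var = sum(voc[i] * resp[i][k] * (data[i] - mu) ** 2
                  for i in range(len(data))) / nk
        new_w.append(nk)
        new_mu.append(mu)
        new_var.append(max(var, 1e-6))  # floor the variance for stability
    total = sum(new_w)
    return [nk / total for nk in new_w], new_mu, new_var
```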

A Study on a Violence Recognition System with CCTV (CCTV에서 폭력 행위 감지 시스템 연구)

  • Shim, Young-Bin;Park, Hwa-Jin
    • Journal of Digital Contents Society
    • /
    • v.16 no.1
    • /
    • pp.25-32
    • /
    • 2015
  • With the increased frequency of crimes such as assault and sexual violence, reliance on CCTV for arresting criminals has increased as well. However, CCTV must be monitored by human operators at all times, which imposes limits in budget and manpower, so interest in intelligent surveillance systems is growing. Extending the object-behavior recognition techniques of previous studies, we propose a system that detects forms of violence between two or three objects in images obtained from CCTV. Objects are detected using the difference operation against the background image followed by morphological operations. Criteria for defining violent behavior are suggested, and verifiable decision-metric values are derived from measurements of the number of violent conditions. Experiments with these threshold values showed a recognition success rate of more than 80%. Recognition of abnormal behavior in crowded scenes remains for future research.
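
The detection front end described above (background difference followed by a morphological clean-up) can be sketched as below; images are plain lists of rows, and the threshold is a hypothetical tuning parameter.

```python
def background_diff(frame, background, threshold):
    # Foreground mask: 1 where the frame differs from the background model.
    return [[1 if abs(f - b) > threshold else 0
             for f, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]

def erode(mask):
    # 3x3 binary erosion: keep a pixel only if its whole neighbourhood is
    # foreground, removing isolated noise pixels before blob analysis.
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = int(all(mask[j][i]
                                for j in (y - 1, y, y + 1)
                                for i in (x - 1, x, x + 1)))
    return out
```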

Acoustic model training using self-attention for low-resource speech recognition (저자원 환경의 음성인식을 위한 자기 주의를 활용한 음향 모델 학습)

  • Park, Hosung;Kim, Ji-Hwan
    • The Journal of the Acoustical Society of Korea
    • /
    • v.39 no.5
    • /
    • pp.483-489
    • /
    • 2020
  • This paper proposes acoustic model training using self-attention for low-resource speech recognition. In low-resource settings, it is difficult for the acoustic model to distinguish certain phones, for example the plosives /d/ and /t/, the plosives /g/ and /k/, and the affricates /z/ and /ch/. During acoustic model training, the self-attention mechanism generates attention weights from the deep neural network model, and in this study these weights are used to handle such similar-pronunciation errors. When the proposed method was applied to a Time Delay Neural Network-Output gate Projected Gated Recurrent Unit (TDNN-OPGRU)-based acoustic model, the proposed model showed a word error rate of 5.98%, an absolute improvement of 0.74% over the baseline TDNN-OPGRU model.
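
The attention-weight computation at the heart of the method can be illustrated with plain scaled dot-product self-attention over acoustic frame vectors; the actual model embeds this inside a TDNN-OPGRU network, so the sketch below only shows how the weights are formed.

```python
import math

def softmax(xs):
    # Numerically stable softmax.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(frames):
    # Scaled dot-product self-attention: each output frame is a weighted
    # mix of all frames, weighted by scaled frame similarity.
    d = len(frames[0])
    weights = [softmax([sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                        for k in frames])
               for q in frames]
    outputs = [[sum(w * f[j] for w, f in zip(row, frames)) for j in range(d)]
               for row in weights]
    return outputs, weights
```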

Development of a Recognition System of Smile Facial Expression for Smile Treatment Training (웃음 치료 훈련을 위한 웃음 표정 인식 시스템 개발)

  • Li, Yu-Jie;Kang, Sun-Kyung;Kim, Young-Un;Jung, Sung-Tae
    • Journal of the Korea Society of Computer and Information
    • /
    • v.15 no.4
    • /
    • pp.47-55
    • /
    • 2010
  • In this paper, we propose a smile-expression recognition system for smile treatment training. The proposed system detects face candidate regions from camera images using Haar-like features, then verifies whether a detected candidate region is a face using SVM (Support Vector Machine) classification. For the detected face image, it applies illumination normalization based on histogram matching to minimize the effect of illumination changes. In the facial-expression recognition step, it computes a facial feature vector using PCA (Principal Component Analysis) and recognizes the smile expression using a multilayer perceptron neural network. The system lets the user practice smiling by recognizing the user's smile expression in real time and displaying the amount of smile expression. Experimental results show that the proposed system improves the recognition rate by using SVM-based face-region verification and histogram-matching-based illumination normalization.
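
The illumination-normalization step can be sketched as classic histogram matching: map each source gray level to the reference level whose cumulative distribution value is closest. Images here are plain lists of rows of integer gray levels; this illustrates the standard technique, not the paper's code.

```python
def histogram_match(src, ref, levels=256):
    # Remap src gray levels so their histogram matches ref's, via the
    # two cumulative distribution functions (CDFs).
    def cdf(img):
        hist = [0] * levels
        for row in img:
            for v in row:
                hist[v] += 1
        total = sum(hist)
        out, acc = [], 0
        for h in hist:
            acc += h
            out.append(acc / total)
        return out
    cs, cr = cdf(src), cdf(ref)
    # Look-up table: for each source level, the reference level whose
    # CDF value is closest.
    lut = [min(range(levels), key=lambda r: abs(cr[r] - cs[lv]))
           for lv in range(levels)]
    return [[lut[v] for v in row] for row in src]
```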

Same music file recognition method by using similarity measurement among music feature data (음악 특징점간의 유사도 측정을 이용한 동일음원 인식 방법)

  • Sung, Bo-Kyung;Chung, Myoung-Beom;Ko, Il-Ju
    • Journal of the Korea Society of Computer and Information
    • /
    • v.13 no.3
    • /
    • pp.99-106
    • /
    • 2008
  • Recently, digital music retrieval has come into use in many fields (web portals, audio service sites, etc.). In these fields, music metadata are used for retrieval; if the metadata are wrong or missing, it is hard to obtain highly accurate results. Content-based information retrieval, which uses the music itself, has been studied to solve this problem. In this paper, we propose a same-music recognition method using similarity measurement. Feature data are extracted from the music waveform using simplified MFCC (Mel Frequency Cepstral Coefficients). Similarity between digital music files is measured using DTW (Dynamic Time Warping), which is widely used in vision and speech recognition. To prove the proposed method, all 500 trials succeeded on 1,000 songs randomly collected from the same genre; the 500 test files were generated from 60 audio sources by mixing different compression codecs and bit rates. We proved that similarity measurement using DTW can recognize identical music.
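
The DTW similarity measurement can be sketched as the standard dynamic-programming recurrence; the inputs would be the simplified-MFCC sequences of two files, shown here as plain numeric sequences for brevity.

```python
def dtw(a, b, dist=lambda x, y: abs(x - y)):
    # Dynamic Time Warping cost between two feature sequences, allowing
    # local stretching/compression of the time axis.
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = dist(a[i - 1], b[j - 1]) + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Two encodings of the same song yield a near-zero DTW cost, so same-music recognition reduces to thresholding this value.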


A Study on the Extraction of an Individual Character and Chinese Characters Recognition on the Off-line Documents (오프라인 문서에서 개별 문자 추출과 한자 인식에 관한 연구)

  • Kim, Ui-Jeong;Kim, Tae-Gyun
    • The Transactions of the Korea Information Processing Society
    • /
    • v.4 no.5
    • /
    • pp.1277-1288
    • /
    • 1997
  • In this paper, a method for extracting individual characters and a method for recognizing printed Chinese characters in off-line documents are discussed. The preprocessing includes a technique to extract characters that are difficult to handle, such as touching or overlapping characters. Existing segmentation methods generally apply projection and edge detection; in this paper, however, an individual character is extracted using connected pixels with a single projection after string extraction. The Maximum Block Method (MBM) is used for recognition: it enlarges a block up to the last pixel found during projection. The maximum blocks are skeletonized after division into straight-line blocks and oblique-line blocks. In particular, for the recognition of Chinese characters, the method showed an improved recognition rate compared with existing methods.
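
The single-projection segmentation step can be sketched with a vertical projection profile: columns containing no foreground pixels separate candidate characters (touching characters would then need the connected-pixel refinement the paper describes). The binary-image representation below is a hypothetical simplification.

```python
def vertical_projection(img):
    # Column-wise count of foreground (1) pixels in a binary text-line image.
    return [sum(row[x] for row in img) for x in range(len(img[0]))]

def segment_characters(img):
    # Split the line into (start, end) column spans at empty columns.
    proj = vertical_projection(img)
    spans, start = [], None
    for x, p in enumerate(proj):
        if p > 0 and start is None:
            start = x
        elif p == 0 and start is not None:
            spans.append((start, x))
            start = None
    if start is not None:
        spans.append((start, len(proj)))
    return spans
```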
