• Title/Summary/Keyword: 검출 모델 (detection model)

Facial Features and Motion Recovery using multi-modal information and Paraperspective Camera Model (다양한 형식의 얼굴정보와 준원근 카메라 모델해석을 이용한 얼굴 특징점 및 움직임 복원)

  • Kim, Sang-Hoon
    • The KIPS Transactions:PartB / v.9B no.5 / pp.563-570 / 2002
  • This paper describes robust extraction of 3D facial features and global motion information from a 2D image sequence for MPEG-4 SNHC face model encoding. Facial regions are detected in the image sequence using a multi-modal fusion technique that combines range, color, and motion information. Twenty-three facial features from the MPEG-4 FDP (Face Definition Parameters) are extracted automatically inside the facial region using color transforms (GSCD, BWCD) and morphological processing. The extracted facial features are used to recover the 3D shape and global motion of the object using a paraperspective camera model and an SVD (Singular Value Decomposition) factorization method. A 3D synthetic object is designed and tested to show the performance of the proposed algorithm. The recovered 3D motion information is transformed into the global motion parameters of the MPEG-4 FAP (Face Animation Parameters) to synchronize a generic face model with a real face.

Improvement of Keyword Spotting Performance Using Normalized Confidence Measure (정규화 신뢰도를 이용한 핵심어 검출 성능향상)

  • Kim, Cheol;Lee, Kyoung-Rok;Kim, Jin-Young;Choi, Seung-Ho
    • The Journal of the Acoustical Society of Korea / v.21 no.4 / pp.380-386 / 2002
  • Conventional post-processing, such as the confidence measure (CM) proposed by Rahim, calculates each phone's CM using the likelihood between the phoneme model and the anti-model, and then obtains the word's CM by averaging the phone-level CMs [1]. In the conventional method, the CMs of some specific keywords are very low and those keywords are usually rejected. The reason is that the statistics of phone-level CMs are not consistent: phone-level CMs have a different probability density function (pdf) for each phone, especially for each tri-phone. To overcome this problem, we propose a normalized confidence measure (NCM). Our approach transforms the CM pdf of each tri-phone to the same pdf under the assumption that CM pdfs are Gaussian. To evaluate the method, we use a common keyword spotting system in which context-dependent HMMs model keyword utterances and context-independent HMMs model non-keyword utterances. The experimental results show that the proposed NCM reduced the FAR (false alarm rate) from 0.44 to 0.33 FA/KW/HR (false alarms/keyword/hour) when the MDR is about 8%, a 25% improvement in FAR.
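
The normalization idea in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the abstract says each tri-phone's CM pdf is mapped to a common pdf under a Gaussian assumption, and a z-score (subtract the tri-phone mean, divide by its standard deviation) is one simple way to do that. The paper's exact transform is not given here, and the function names, tri-phone labels, and statistics are hypothetical.

```python
from statistics import mean


def normalize_cm(cm, mu, sigma):
    """Map a raw phone-level CM onto a standard-normal scale (z-score).

    Assumption: the tri-phone's CM distribution is Gaussian with
    mean `mu` and standard deviation `sigma`.
    """
    return (cm - mu) / sigma


def word_ncm(phone_cms, stats):
    """Average the normalized phone-level CMs into a word-level score.

    phone_cms: list of (triphone_id, raw_cm) pairs for one keyword
    stats:     dict mapping triphone_id -> (mean, std), estimated beforehand
    """
    return mean(normalize_cm(cm, *stats[tp]) for tp, cm in phone_cms)


# Hypothetical per-tri-phone statistics and raw CM values.
stats = {"k-a+t": (2.0, 0.5), "a-t+s": (-1.0, 2.0)}
phone_cms = [("k-a+t", 2.5), ("a-t+s", 1.0)]

score = word_ncm(phone_cms, stats)
print(score)  # both phones sit one standard deviation above their means
```

After normalization, two tri-phones with very different raw CM ranges contribute on the same scale, so a single rejection threshold can be applied to the word-level score.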

Analysis of Oscillometric Model based on Shape of Arterial Pressure (동맥압 형태를 고려한 오실로메트릭 모델분석)

  • 임성수;이경중
    • Journal of Biomedical Engineering Research / v.21 no.4 / pp.411-417 / 2000
  • This paper analyzes the oscillometric method based on the shape of arterial pressure and proposes, through computer simulation, a new algorithm for estimating blood pressure. First, we designed an arterial pressure model that can control the shape of the arterial pressure, and then simulated the oscillometric model using both the existing exponential model of the static arterial pressure-volume relation and the designed arterial pressure model. By analyzing the correlation of the characteristic ratio based on the shape of the arterial pressure, we found that the characteristic ratio is not the only standard parameter for estimating systolic and diastolic pressure. We were able to estimate the shape of the arterial pressure by computing the correlation of the arterial pressure shape with the oscillation shape. Finally, we proposed an algorithm that estimates systolic and diastolic pressure according to a pressure (Pp) table constructed from the relation between the maximum oscillation amplitude and the arterial pressure shape. We tested 60 arterial pressure waveforms with various arterial pressure shapes and pulses. As a result, the average absolute deviations of the estimated systolic, diastolic, and mean pressures were 1.62%, 2.40%, and 2.20%, respectively. In conclusion, the proposed algorithm showed potential usefulness for estimating blood pressure.

Performance analysis of weakly-supervised sound event detection system based on the mean-teacher convolutional recurrent neural network model (평균-교사 합성곱 순환 신경망 모델을 이용한 약지도 음향 이벤트 검출 시스템의 성능 분석)

  • Lee, Seokjin
    • The Journal of the Acoustical Society of Korea / v.40 no.2 / pp.139-147 / 2021
  • This paper introduces and implements a Sound Event Detection (SED) system based on weakly-supervised learning, where only part of the data is labeled, and analyzes the effect of its parameters. The SED system estimates the classes and onset/offset times of events in the acoustic signal. To train the model, all information on the event class and onset/offset times must be provided. Unfortunately, the onset/offset times are hard to label exactly. Therefore, in the weakly-supervised task, the SED model is trained on "strongly labeled data" that include the event class and activations, "weakly labeled data" that include only the event class, and "unlabeled data" without any label. Recently, SED systems using the mean-teacher model have been widely used for this task, and they involve several parameters. These parameters should be chosen carefully because they may affect performance. In this paper, performance is analyzed with respect to parameters such as the feature, the moving average parameter, the weight of the consistency cost function, the ramp-up length, and the maximum learning rate, using the DCASE 2020 Task 4 data. The effects and optimal values of the parameters are discussed.
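
Two of the parameters named in the abstract above, the moving average parameter and the ramp-up length, can be illustrated with a minimal sketch. It assumes the usual mean-teacher formulation, in which the teacher's weights are an exponential moving average of the student's, and a sigmoid-shaped ramp-up for the consistency weight, as found in many mean-teacher implementations. Whether the paper uses exactly this schedule is not stated in the abstract, and all names and values below are hypothetical.

```python
import math


def ema_update(teacher, student, alpha):
    """Update teacher weights as an exponential moving average of the student.

    alpha is the moving average parameter: values near 1 make the
    teacher change slowly.
    """
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


def sigmoid_rampup(step, rampup_length):
    """Scale factor in (0, 1] that grows from exp(-5) to 1 over rampup_length steps.

    Assumed schedule: exp(-5 * (1 - step / rampup_length) ** 2).
    """
    if rampup_length == 0:
        return 1.0
    phase = 1.0 - min(max(step, 0), rampup_length) / rampup_length
    return math.exp(-5.0 * phase * phase)


# Toy weights standing in for model parameters.
teacher = [0.0, 1.0]
student = [1.0, 3.0]
teacher = ema_update(teacher, student, alpha=0.9)

# Consistency weight at a given step: max_weight * ramp-up factor.
max_consistency_weight = 2.0  # hypothetical value
weight_at_start = max_consistency_weight * sigmoid_rampup(0, 100)
weight_at_end = max_consistency_weight * sigmoid_rampup(100, 100)
print(teacher, weight_at_start, weight_at_end)
```

A short ramp-up applies the consistency cost at nearly full weight while the teacher's targets are still unreliable, which is one reason the ramp-up length and the consistency weight are analyzed together.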

ACMs-based Human Shape Extraction and Tracking System for Human Identification (개인 인증을 위한 활성 윤곽선 모델 기반의 사람 외형 추출 및 추적 시스템)

  • Park, Se-Hyun;Kwon, Kyung-Su;Kim, Eun-Yi;Kim, Hang-Joon
    • Journal of Korea Society of Industrial Information Systems / v.12 no.5 / pp.39-46 / 2007
  • Research on human identification in ubiquitous environments has recently attracted a lot of attention. Within this research, gait recognition is an efficient method of identifying a person at a distance using the physical features of their walk. In this paper, we present a human shape extraction and tracking system for gait recognition using geodesic active contour models (GACMs) combined with the mean shift algorithm. Active contour models (ACMs) are very effective for non-rigid objects because of their elastic property. However, their performance depends mainly on the initial curve. To overcome this problem, we combine the mean shift algorithm with traditional GACMs. The main idea is simple: before evolution with the level set method, the initial curve in each frame is re-localized near the human region and resized to include the target region. This mechanism reduces the number of iterations and handles large object motion. The proposed system is composed of human region detection and human shape tracking modules. In the human region detection module, the silhouette of a walking person is extracted by background subtraction and morphological operations. The human shape is then obtained by the GACMs with the mean shift algorithm. Experimental results show that the proposed method efficiently extracts and tracks an accurate shape for gait recognition.

Gesture Spotting by Web-Camera in Arbitrary Two Positions and Fuzzy Garbage Model (임의 두 지점의 웹 카메라와 퍼지 가비지 모델을 이용한 사용자의 의미 있는 동작 검출)

  • Yang, Seung-Eun
    • KIPS Transactions on Software and Data Engineering / v.1 no.2 / pp.127-136 / 2012
  • Much research on vision-based hand gesture recognition has been conducted to let users operate various electronic devices more easily. To recognize hand gestures accurately, 3D position calculation and classification of meaningful gestures from similar gestures must be performed. This paper describes a simple and cost-effective method of 3D position calculation and gesture spotting (the task of recognizing meaningful gestures among similar meaningless gestures). The 3D position is obtained by calculating the relative position of two cameras through a pan/tilt module and a marker, regardless of where the cameras are placed. A fuzzy garbage model is proposed to provide a variable reference value for deciding whether a user gesture is a command gesture. The reference is obtained from a fuzzy command gesture model and a fuzzy garbage model, which return scores showing the degree of belonging to the command gesture and the garbage gesture, respectively. A two-stage user adaptation is proposed to enhance performance: off-line (batch) adaptation for inter-personal differences and on-line (incremental) adaptation for intra-personal differences. The experiment is conducted with 5 different users. The command recognition rate (discriminating command gestures) is more than 95% when only one command-like meaningless gesture exists and more than 85% when the command is mixed with many other similar gestures.

Smoke color analysis of the standard color models for fire video surveillance (화재 영상감시를 위한 표준 색상모델의 연기색상 분석)

  • Lee, Yong-Hun;Kim, Won-Ho
    • Journal of the Korea Academia-Industrial cooperation Society / v.14 no.9 / pp.4472-4477 / 2013
  • This paper describes the color features of smoke in each standard color model in order to identify the most suitable color model for smoke detection in video surveillance systems. The histogram intersection technique is used to analyze the difference between the color of smoke and the color of non-smoke. The standard color models considered are RGB, YCbCr, CIE-Lab, and HSV. If the calculated histogram intersection value is large for a color model, smoke is poorly separated from non-smoke in that model; if the value is small, smoke is well separated. The analysis shows that the RGB and HSV color models are the most suitable for color-model-based smoke detection, with histogram intersection values of 0.14 and 0.156, respectively.
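
The histogram intersection measure used in the abstract above can be sketched as follows. This is a minimal illustration that assumes the common definition (the sum of bin-wise minima of two normalized histograms, giving a value between 0 and 1). The abstract does not specify the paper's binning or channels, so the sample values, bin count, and function names are hypothetical.

```python
def histogram(values, bins, lo, hi):
    """Return a normalized histogram (bin fractions summing to 1)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp so that v == hi falls into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts]


def histogram_intersection(h1, h2):
    """Sum of bin-wise minima: 0 means no overlap, 1 means identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Hypothetical single-channel intensities for smoke and non-smoke pixels.
smoke = [200, 210, 220, 230]
non_smoke = [10, 20, 30, 220]

h_smoke = histogram(smoke, bins=4, lo=0, hi=256)
h_other = histogram(non_smoke, bins=4, lo=0, hi=256)
overlap = histogram_intersection(h_smoke, h_other)
print(overlap)  # small values mean the two classes are easier to separate
```

Under this definition, a lower intersection value for a color model means the smoke and non-smoke color distributions overlap less in that model.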

Layered Object Detection using Adaptive Gaussian Mixture Model in the Complex and Dynamic Environment (혼잡한 환경에서 적응적 가우시안 혼합 모델을 이용한 계층적 객체 검출)

  • Lee, Jin-Hyung;Cho, Seong-Won;Kim, Jae-Min;Chung, Sun-Tae
    • Journal of the Korean Institute of Intelligent Systems / v.18 no.3 / pp.387-391 / 2008
  • Background subtraction methods are widely used for detecting moving objects. When the background varies, it must be updated in real time for reliable detection of foreground objects. The Gaussian mixture model (GMM) combined with probabilistic learning is one of the most popular methods for real-time background updating. However, it does not work well in complex and dynamic backgrounds with high-traffic regions. In this paper, we propose a new method, based on probabilistic learning and layered processing, for modeling and updating complex and dynamic backgrounds more reliably.

1D CNN and Machine Learning Methods for Fall Detection (1D CNN과 기계 학습을 사용한 낙상 검출)

  • Kim, Inkyung;Kim, Daehee;Noh, Song;Lee, Jaekoo
    • KIPS Transactions on Software and Data Engineering / v.10 no.3 / pp.85-90 / 2021
  • In this paper, fall detection using individual wearable devices for older people is considered. To design a low-cost wearable device for reliable fall detection, we present a comprehensive analysis of two representative models. One is a machine learning model composed of a decision tree, random forest, and Support Vector Machine(SVM). The other is a deep learning model relying on a one-dimensional(1D) Convolutional Neural Network(CNN). By considering data segmentation, preprocessing, and feature extraction methods applied to the input data, we also evaluate the considered models' validity. Simulation results verify the efficacy of the deep learning model showing improved overall performance.

Kubernetes-based Framework for Improving Traffic Light Recognition Performance: Convergence Vision AI System based on YOLOv5 and C-RNN with Visual Attention (신호등 인식 성능 향상을 위한 쿠버네티스 기반의 프레임워크: YOLOv5와 Visual Attention을 적용한 C-RNN의 융합 Vision AI 시스템)

  • Cho, Hyoung-Seo;Lee, Min-Jung;Han, Yeon-Jee
    • Annual Conference of KIPS / 2022.11a / pp.851-853 / 2022
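  • As the population ages, the number of drivers aged 65 and over is growing rapidly and the share of traffic accidents involving elderly drivers is increasing, making this an urgent social problem. This study therefore proposes a Kubernetes-based framework that combines object detection and recognition models, recognizes traffic lights, and announces them through Text-To-Speech (TTS). In the object detection stage, the performance of several YOLOv5 models was compared and the models were applied; in the object recognition stage, a C-RNN-based attention-OCR model was used. This approach recognizes the entire image rather than the internal LED region of the traffic light, which reduces sources of false detection and raises the recognition rate. On 1,628 test images, the framework achieved an accuracy of 0.997 and an F1-score of 0.991, demonstrating its validity. In follow-up research, this work allows models from various fields to be incorporated rather than restricting deep learning models to a specific domain, and it can help prevent traffic accidents caused by elderly drivers and signal violations.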