• Title/Summary/Keyword: Hierarchical feature extraction


Head Gesture Recognition using Facial Pose States and Automata Technique (얼굴의 포즈 상태와 오토마타 기법을 이용한 헤드 제스처 인식)

  • Oh, Seung-Taek;Jun, Byung-Hwan
    • Journal of KIISE: Software and Applications / v.28 no.12 / pp.947-954 / 2001
  • In this paper, we propose a method for recognizing various head gestures by applying an automata technique to sequences of facial pose states. Facial regions are detected using the optimal facial color of the I component in the YIQ model together with adaptively selected difference images, and eye regions are extracted using the Sobel operator, projection, and the geometric location of the eyes. Hierarchical feature analysis is used to classify facial states, and the automata technique is applied to the sequence of facial pose states to recognize 13 gestures: Gaze Upward, Downward, Leftward, Rightward, Forward, Backward, Left Wink, Right Wink, Left Double Wink, Right Double Wink, Yes, and No. In experiments with a total of 1,488 frames acquired from 8 persons, the method shows a 99.3% extraction rate for facial regions, a 95.3% extraction rate for eye regions, a 94.1% recognition rate for facial states, and finally a 99.3% recognition rate for head gestures.

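The automata step described above can be sketched as a small finite-state machine that consumes facial pose labels. The state names, transition table, and the definition of a "Yes" nod below are illustrative assumptions for exposition, not the paper's actual automaton.

```python
# Illustrative sketch (not the paper's exact automaton): a deterministic
# automaton that consumes facial pose states and accepts a "Yes" gesture,
# defined here as two Forward -> Downward nod alternations.
NOD_TRANSITIONS = {
    ("start", "Forward"): "f1",
    ("f1", "Downward"): "d1",
    ("d1", "Forward"): "f2",
    ("f2", "Downward"): "yes",  # accepting state for the "Yes" gesture
}

def run_automaton(transitions, poses, start="start"):
    """Feed a pose sequence through the automaton; poses with no
    matching transition leave the current state unchanged."""
    state = start
    for pose in poses:
        state = transitions.get((state, pose), state)
    return state

def is_yes_gesture(poses):
    return run_automaton(NOD_TRANSITIONS, poses) == "yes"
```

Each of the 13 gestures would get its own transition table in this scheme; ignoring unmatched poses (rather than rejecting) is one possible design choice for tolerating noisy pose classifications.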

Video Scene Detection using Shot Clustering based on Visual Features (시각적 특징을 기반한 샷 클러스터링을 통한 비디오 씬 탐지 기법)

  • Shin, Dong-Wook;Kim, Tae-Hwan;Choi, Joong-Min
    • Journal of Intelligence and Information Systems / v.18 no.2 / pp.47-60 / 2012
  • Video data is unstructured and has a complex structure. As the importance of efficient management and retrieval of video data increases, studies on video parsing based on the visual features contained in video content have been conducted to reconstruct video data into a meaningful structure. Early studies on video parsing focused on splitting video data into shots, but detecting shot boundaries defined by physical boundaries does not consider the semantic associations in video data. Recently, studies that structure semantically associated video shots into video scenes, defined by semantic boundaries, by utilizing clustering methods have been actively pursued. Previous studies on video scene detection try to detect scenes with clustering algorithms based on a similarity measure between video shots that depends mainly on color features. However, correctly identifying a video shot or scene and detecting gradual transitions such as dissolves, fades, and wipes is difficult, because the color features of video data are noisy and change abruptly when an unexpected object intervenes. In this paper, to solve these problems, we propose the Scene Detector using Color histogram, corner Edge and Object color histogram (SDCEO), which detects video scenes by clustering similar shots organizing the same event based on visual features including the color histogram, the corner edge, and the object color histogram. The SDCEO is noteworthy in that it uses the edge feature together with the color feature and, as a result, effectively detects gradual as well as abrupt transitions. The SDCEO consists of the Shot Bound Identifier and the Video Scene Detector. The Shot Bound Identifier comprises the Color Histogram Analysis step and the Corner Edge Analysis step.
In the Color Histogram Analysis step, SDCEO uses the color histogram feature to organize shot boundaries. The color histogram, which records the percentage of each quantized color among all pixels in a frame, is chosen for its good performance, as also reported in other work on content-based image and video analysis. To organize shot boundaries, SDCEO joins associated sequential frames into shot boundaries by measuring the similarity of the color histograms between frames. In the Corner Edge Analysis step, SDCEO identifies the final shot boundaries by using the corner edge feature: it detects associated shot boundaries by comparing the corner edge feature between the last frame of the previous shot boundary and the first frame of the next one. In the Key-frame Extraction step, SDCEO compares each frame with all other frames, measures their similarity using the Euclidean distance between histograms, and then selects as the key-frame the frame most similar to all frames contained in the same shot boundary. The Video Scene Detector clusters associated shots organizing the same event by utilizing hierarchical agglomerative clustering based on visual features including the color histogram and the object color histogram. After detecting video scenes, SDCEO organizes the final video scenes by repeated clustering until the similarity distance between shot boundaries falls below a threshold h. In this paper, we construct a prototype of SDCEO and carry out experiments with manually constructed baseline data; the experimental results, a precision of 93.3% for shot boundary detection and 83.3% for video scene detection, are satisfactory.
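The Color Histogram Analysis step above can be sketched as follows. The Euclidean histogram distance matches the measure named in the abstract, but the bin count and the boundary threshold are illustrative assumptions, as is the frame representation (HxWx3 uint8 arrays).

```python
import numpy as np

def color_histogram(frame, bins=8):
    """Quantized, normalized RGB color histogram of a frame.

    frame: HxWx3 uint8 array; each channel is quantized into `bins` bins,
    giving a bins**3-dimensional descriptor summing to 1.
    """
    hist, _ = np.histogramdd(frame.reshape(-1, 3).astype(float),
                             bins=(bins,) * 3, range=((0, 256),) * 3)
    return hist.ravel() / hist.sum()

def detect_shot_boundaries(frames, threshold=0.5):
    """Mark frame index i as a shot boundary when the Euclidean distance
    between the histograms of frame i-1 and frame i exceeds `threshold`
    (an illustrative value; a real system would tune it)."""
    boundaries = []
    prev = color_histogram(frames[0])
    for i in range(1, len(frames)):
        cur = color_histogram(frames[i])
        if np.linalg.norm(cur - prev) > threshold:
            boundaries.append(i)
        prev = cur
    return boundaries
```

The Corner Edge Analysis step would then re-examine only the candidate boundaries found here, comparing edge features across each boundary to discard false cuts caused by lighting or object motion.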

Improve the Performance of People Detection using Fisher Linear Discriminant Analysis in Surveillance (서베일런스에서 피셔의 선형 판별 분석을 이용한 사람 검출의 성능 향상)

  • Kang, Sung-Kwan;Lee, Jung-Hyun
    • Journal of Digital Convergence / v.11 no.12 / pp.295-302 / 2013
  • Many reported methods assume that the people in an image or an image sequence have already been identified and localized. People detection is one of the most important factors affecting a system's performance, as it is a basis technology for detecting other objects, for interaction between people and computers, and for motion recognition. In this paper, we present an efficient linear discriminant for multi-view people detection. Our approach is based on the Fisher linear discriminant, with which we define the training data for an efficient learning method. People detection is considerably difficult because it is influenced by people's poses and by changes in illumination. The proposed idea solves the multi-view scale and people detection problem quickly and efficiently, which makes it suitable for detecting people automatically. We extract people using a Fisher linear discriminant built on hierarchical models invariant to pose and background, and we estimate the pose of the detected people. The purpose of this paper is to classify people and non-people using the Fisher linear discriminant.
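The Fisher linear discriminant used above for the people/non-people decision can be sketched in a few lines. The feature vectors, class labels, and regularization term below are illustrative assumptions; only the projection formula w = Sw^-1 (mu1 - mu2) and the midpoint threshold are the standard FLD construction.

```python
import numpy as np

def fisher_discriminant(X_people, X_nonpeople):
    """Fit a two-class Fisher linear discriminant.

    Returns the projection direction w = Sw^-1 (mu_p - mu_n) and a
    decision threshold at the projected midpoint of the class means.
    """
    mu_p, mu_n = X_people.mean(axis=0), X_nonpeople.mean(axis=0)
    # Within-class scatter (sum of per-class covariances), lightly
    # regularized so the solve is stable for near-singular scatter.
    Sw = np.cov(X_people, rowvar=False) + np.cov(X_nonpeople, rowvar=False)
    w = np.linalg.solve(Sw + 1e-6 * np.eye(Sw.shape[0]), mu_p - mu_n)
    threshold = w @ (mu_p + mu_n) / 2
    return w, threshold

def classify(w, threshold, x):
    """Label a feature vector as 'people' or 'non-people'."""
    return "people" if w @ x > threshold else "non-people"
```

In a detector, `x` would be a descriptor computed from a candidate image window (the paper does not specify the features, so synthetic vectors stand in here).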

Robust Estimation of Hand Poses Based on Learning (학습을 이용한 손 자세의 강인한 추정)

  • Kim, Sul-Ho;Jang, Seok-Woo;Kim, Gye-Young
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.12 / pp.1528-1534 / 2019
  • Recently, the popularization of 3D depth cameras has opened new research directions and opportunities beyond work conducted on RGB images, but the estimation of human hand pose is still classified as one of the difficult topics. In this paper, we propose a robust method for estimating human hand pose from various input 3D depth images using a learning algorithm. The proposed approach first generates a skeleton-based hand model and then aligns the generated hand model with three-dimensional point cloud data. Then, using a random forest-based learning algorithm, the hand pose is robustly estimated from the aligned hand model. Experimental results in this paper show that the proposed hierarchical approach yields robust and fast estimation of human hand posture from input depth images captured in various indoor and outdoor environments.
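The random-forest stage of such a pipeline can be sketched as a multi-output regressor from depth-derived descriptors to joint parameters of the aligned hand model. This is a minimal sketch under assumptions: scikit-learn is available, the feature and target layouts are invented for illustration, and the paper's actual forest configuration is not specified.

```python
# Illustrative sketch of a random-forest pose regressor (features, targets,
# and hyperparameters are assumptions, not the paper's configuration).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def train_pose_regressor(depth_features, joint_params, n_trees=50, seed=0):
    """depth_features: (N, D) per-hand descriptors from the depth image;
    joint_params: (N, J) joint parameters of the aligned skeleton model."""
    model = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
    model.fit(depth_features, joint_params)
    return model

def estimate_pose(model, depth_feature):
    """Predict the joint-parameter vector for one hand descriptor."""
    return model.predict(depth_feature.reshape(1, -1))[0]
```

A forest is a natural fit here: each tree votes on the joint parameters, the averaged prediction is robust to noisy depth pixels, and inference is fast enough for the interactive use the abstract targets.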