• Title/Summary/Keyword: Visual Feature Extraction

Door Recognition using Visual Fuzzy System in Indoor Environments (시각 퍼지 시스템을 이용한 실내 문 인식)

  • Yi, Chu-Ho;Lee, Sang-Heon;Jeong, Seung-Do;Suh, Il-Hong;Choi, Byung-Uk
    • Journal of the Institute of Electronics Engineers of Korea SP / v.47 no.1 / pp.73-82 / 2010
  • Doors are important objects for understanding an environment and can be used to distinguish corridors from rooms. They are widely used as natural landmarks for localization and navigation in mobile robotics. However, most camera-based door recognition algorithms are difficult to apply in real time because feature extraction and matching are computationally expensive. This paper proposes a method for recognizing a door in a corridor. First, we use the Hough transform to extract distinctive lines that are likely to belong to a door. Then, we detect candidate door regions by applying the extracted lines to a first-stage visual fuzzy system. Finally, door regions are determined by verifying the knob region within each candidate using a second-stage visual fuzzy system.
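
The line-extraction step can be illustrated with a minimal Hough voting sketch. This is not the authors' implementation: the function name `hough_lines`, the 1-degree angular resolution, and the synthetic edge pixels are assumptions made for illustration only.

```python
import math
from collections import Counter

def hough_lines(points, n_theta=180, top_k=1):
    """Vote in (theta index, rho) space and return the top_k strongest cells."""
    votes = Counter()
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            # Normal form of a line: rho = x*cos(theta) + y*sin(theta)
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(t, rho)] += 1
    return votes.most_common(top_k)

# Hypothetical edge pixels: a vertical edge at x = 5, such as a door-frame side.
edge_points = [(5, y) for y in range(100)]
(best_theta_deg, best_rho), count = hough_lines(edge_points)[0]
print(best_theta_deg, best_rho, count)
```

With 180 angle bins, the theta index is the line-normal angle in degrees, so a vertical edge collects all of its votes in the cell with theta 0 and rho equal to its x position.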

MPEG-7 Texture Descriptor (MPEG-7 질감 기술자)

  • 강호경;정용주;유기원;노용만;김문철;김진웅
    • Journal of Broadcast Engineering / v.5 no.1 / pp.10-22 / 2000
  • In this paper, we present a texture description method for the standardization of multimedia content description. Like color, shape, object, and camera motion information, texture is one of the most important kinds of information in the visual part of the international standard for multimedia content description (MPEG-7). The current MPEG-7 texture descriptor has been designed to fit the human visual system (HVS). Many psychophysical experiments give evidence that the brain decomposes the spectra into perceptual channels that are bands in spatial frequency. The MPEG-7 texture description method employs the Radon transform, which fits HVS behavior. The texture descriptor is generated by taking the average energy and energy deviation of the HVS channels. To test the performance of the current texture descriptor, experiments are performed with the MPEG-7 texture data sets T1 to T7. Results show that the current MPEG-7 texture descriptor gives a better retrieval rate and a fast extraction time for texture features.
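
The "average energy and energy deviation" step can be sketched as follows. This is a simplified illustration, not the MPEG-7 reference computation: it assumes the per-channel filter responses are already available as plain lists, and the helper name `channel_descriptor` is hypothetical.

```python
import math

def channel_descriptor(channels):
    """Return [mean_1, dev_1, mean_2, dev_2, ...] over per-channel energies."""
    descriptor = []
    for responses in channels:
        energies = [r * r for r in responses]  # energy = squared response
        mean = sum(energies) / len(energies)
        dev = math.sqrt(sum((e - mean) ** 2 for e in energies) / len(energies))
        descriptor += [mean, dev]
    return descriptor

# Two made-up channels: one flat response, one alternating response.
channels = [[1.0, 1.0, 1.0, 1.0], [0.0, 2.0, 0.0, 2.0]]
descriptor = channel_descriptor(channels)
print(descriptor)
```

Each channel contributes two numbers, so a bank of N channels yields a descriptor of length 2N.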

A Hybrid Proposed Framework for Object Detection and Classification

  • Aamir, Muhammad;Pu, Yi-Fei;Rahman, Ziaur;Abro, Waheed Ahmed;Naeem, Hamad;Ullah, Farhan;Badr, Aymen Mudheher
    • Journal of Information Processing Systems / v.14 no.5 / pp.1176-1194 / 2018
  • Classifying objects from image content is a major challenge in computer vision. Superpixel information can be used to detect and classify objects in an image based on their locations. In this paper, we propose a methodology to detect and classify the locations of image pixels using an enhanced bag of words (BOW). It calculates the initial position of each image segment using superpixels and then ranks the segments according to their region score. This information is then used to extract local and global features using a hybrid of the Scale Invariant Feature Transform (SIFT) and GIST, respectively. To enhance classification accuracy, a feature fusion technique combines the local and global feature vectors through a weight parameter. A support vector machine, a supervised learning algorithm, is used as the classifier to evaluate the proposed methodology. The Pascal Visual Object Classes Challenge 2007 (VOC2007) dataset is used in the experiments. The proposed approach produced high-quality class-independent object locations, with a mean average best overlap (MABO) of 0.833 at 1,500 locations, resulting in a better detection rate. Compared with previous approaches, it gave better classification results for the non-rigid classes.
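
One common way to fuse local and global feature vectors through a weight parameter is a weighted concatenation. The sketch below assumes that form; the abstract does not give the exact fusion rule, and the names `fuse` and `w` as well as the toy vectors are illustrative assumptions.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit Euclidean length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def fuse(local_feat, global_feat, w=0.5):
    """Concatenate normalized local and global features, weighted by w and 1 - w."""
    local_n = l2_normalize(local_feat)
    global_n = l2_normalize(global_feat)
    return [w * x for x in local_n] + [(1 - w) * x for x in global_n]

# Toy stand-ins for a SIFT-based local vector and a GIST global vector.
fused = fuse([3.0, 4.0], [0.0, 2.0], w=0.25)
print(fused)
```

Normalizing each vector before weighting keeps one feature type from dominating simply because of its scale.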

Lip Reading Method Using CNN for Utterance Period Detection (발화구간 검출을 위해 학습된 CNN 기반 입 모양 인식 방법)

  • Kim, Yong-Ki;Lim, Jong Gwan;Kim, Mi-Hye
    • Journal of Digital Convergence / v.14 no.8 / pp.233-243 / 2016
  • Because speech recognition degrades in noisy environments, Audio Visual Speech Recognition (AVSR) systems, which combine speech and visual information, have been proposed since the mid-1990s, and lip reading has played a significant role in them. This study aims to improve the recognition rate of uttered words using only lip shape detection, for an efficient AVSR system. After preprocessing for lip region detection, Convolutional Neural Network (CNN) techniques are applied for utterance period detection and lip shape feature vector extraction, and Hidden Markov Models (HMMs) are then used for recognition. Utterance period detection achieves a 91% success rate, which is higher than that of general threshold methods. In lip reading recognition, the user-dependent experiment records a recognition rate of 88.5% and the user-independent experiment 80.2%, both improvements over previous studies.

COVID-19 Diagnosis from CXR images through pre-trained Deep Visual Embeddings

  • Khalid, Shahzaib;Syed, Muhammad Shehram Shah;Saba, Erum;Pirzada, Nasrullah
    • International Journal of Computer Science & Network Security / v.22 no.5 / pp.175-181 / 2022
  • COVID-19 is an acute respiratory syndrome that affects the host's breathing and respiratory system. The first case of the novel disease was reported in 2019; it created a worldwide state of emergency and was declared a global pandemic within months of the first case. The disease also created elements of a global socioeconomic crisis. The emergency has made it imperative for professionals to take the measures needed to diagnose the disease early. The conventional diagnosis for COVID-19 is through Polymerase Chain Reaction (PCR) testing. However, in many rural communities these tests are unavailable or take a long time to return results. Hence, we propose a COVID-19 classification system based on machine learning and transfer learning models. The proposed approach identifies individuals with COVID-19 and distinguishes them from healthy individuals with the help of Deep Visual Embeddings (DVE). Five state-of-the-art models (VGG-19, ResNet50, Inceptionv3, MobileNetv3, and EfficientNetB7) were used in this study, along with five different pooling schemes, to perform deep feature extraction. In addition, the features are normalized using standard scaling, and 4-fold cross-validation is used to validate performance over multiple versions of the validation data. The best results of 88.86% UAR, 88.27% specificity, 89.44% sensitivity, 88.62% accuracy, 89.06% precision, and 87.52% F1-score were obtained using ResNet50 with average pooling and logistic regression with class weights as the classifier.
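
The reported metrics follow the usual binary-classification definitions, sketched below from confusion-matrix counts. The counts are made up for illustration and are not the paper's data; UAR (unweighted average recall) is taken here as the mean of sensitivity and specificity, which is consistent with the reported 88.86% given 89.44% sensitivity and 88.27% specificity.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary metrics from true/false positive/negative counts."""
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    precision = tp / (tp + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "uar": (sensitivity + specificity) / 2,
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }

# Hypothetical counts for one validation fold.
metrics = binary_metrics(tp=90, fp=20, tn=80, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In a cross-validated study each metric is typically averaged over folds, so the averaged F1-score need not equal the harmonic mean of the averaged precision and sensitivity.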

Facial Feature Detection and Facial Contour Extraction using Snakes (얼굴 요소의 영역 추출 및 Snakes를 이용한 윤곽선 추출)

  • Lee, Kyung-Hee;Byun, Hye-Ran
    • Journal of KIISE:Software and Applications / v.27 no.7 / pp.731-741 / 2000
  • This paper proposes a method to detect the facial region and extract facial features, which is crucial for the visual recognition of human faces. We extract the Minimum Enclosing Rectangle (MER) of the face and of each facial component using projection analysis on both the edge image and the binary image. We use an active contour model (snakes) to extract the contours of the eyes, mouth, eyebrows, and face, so that the contours reflect individual differences in facial shape and converge quickly. The choice of initial contour is very important for the performance of snakes. We therefore detect the MER of each facial component and then determine the initial contour from the general shape of that component within the boundary of the obtained MER. Experimental results show that MER extraction of the eyes, mouth, and face was successful, but MER extraction of the eyebrows performed poorly on images with bright eyebrows. Contour extraction was good and reflected individual differences in facial shape. In particular, for eye contour extraction, we combined edges from a first-order derivative operator with zero crossings from a second-order derivative operator in the snake energy function, and achieved good eye contours. For face contour extraction, we used both edges and the grey-level intensity of pixels in the energy function, and good face contours were extracted as well.
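
Projection analysis for finding an MER can be sketched on a tiny binary image. This is a simplified illustration that assumes a single foreground component and a clean binary mask; the function name and the toy image are hypothetical, and the paper's handling of edge images and multiple facial components is not reproduced.

```python
def minimum_enclosing_rectangle(binary):
    """Return (top, left, bottom, right) of the foreground, or None if empty."""
    row_proj = [sum(row) for row in binary]        # horizontal projection
    col_proj = [sum(col) for col in zip(*binary)]  # vertical projection
    rows = [i for i, v in enumerate(row_proj) if v > 0]
    cols = [j for j, v in enumerate(col_proj) if v > 0]
    if not rows:
        return None
    return rows[0], cols[0], rows[-1], cols[-1]

# Toy binary mask: 1 marks foreground pixels.
image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
mer = minimum_enclosing_rectangle(image)
print(mer)
```

The first and last nonzero entries of each projection give the rectangle bounds, which could then seed an initial snake contour.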

HMM-based Intent Recognition System using 3D Image Reconstruction Data (3차원 영상복원 데이터를 이용한 HMM 기반 의도인식 시스템)

  • Ko, Kwang-Enu;Park, Seung-Min;Kim, Jun-Yeup;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems / v.22 no.2 / pp.135-140 / 2012
  • The mirror neuron system in the cerebrum handles imitative learning based on visual information. By observing the activity of an observer's mirror neuron system, the intention behind an action can be inferred from the progression of neural activation over a specific range, even when that range is partially hidden. The goal of this paper is to apply imitative learning to a 3D vision-based intelligent system. In our previous research, we restored 3D images acquired with a stereo camera using optical flow and an unscented Kalman filter. Here, the 3D input is a continuous image sequence that includes partially hidden ranges. We use a Hidden Markov Model (HMM) to recognize the intention behind an action from the restored hidden ranges. The HMM's dynamic inference over sequential input data is well suited to tasks such as hand gesture recognition that involve hidden ranges. Having already simulated object outline and feature extraction in the previous research, we generate temporally continuous feature vectors from the extracted features and apply them to the HMM to simulate hand gesture classification according to intention pattern. The hand gesture classification results are obtained as posterior probabilities and demonstrate high accuracy.
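
Classifying a sequence by HMM posterior probability can be sketched with the forward algorithm: score the sequence under one HMM per class, then normalize. The two gesture models, their parameters, the class names, and the equal class priors below are all hypothetical; the paper's feature vectors and trained models are not reproduced.

```python
def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) for a discrete HMM via the forward algorithm."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
            for s in range(n)
        ]
    return sum(alpha)

# Hypothetical two-state models: (start, transition, emission) per gesture class.
models = {
    "grasp": ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]]),
    "wave": ([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]], [[0.5, 0.5], [0.5, 0.5]]),
}
obs = [0, 0, 1, 1]  # quantized feature symbols over four frames

likelihoods = {name: forward_likelihood(obs, *m) for name, m in models.items()}
total = sum(likelihoods.values())
posteriors = {name: lik / total for name, lik in likelihoods.items()}  # equal priors
print(posteriors)
```

For long sequences the forward variables underflow, so practical implementations rescale them or work in log space.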

A Object-Based Image Retrieval Using Feature Analysis and Fractal Dimension (특징 분석과 프랙탈 차원을 이용한 객체 기반 영상검색)

  • 이정봉;박장춘
    • Journal of Korea Multimedia Society / v.7 no.2 / pp.173-186 / 2004
  • This paper proposes a content-based image retrieval system that extracts features of semantically significant objects, based on the characteristics of the human visual system. To detect the object region of interest first, a region that is comparatively large, differs greatly from the background color, and lies in the middle of the image is judged to be the major meaningful object. To obtain the features of the image, the total length of the object contour is partitioned into normalized segments, and the cumulative sum of the declination difference vectors of the contour segments and the signature of the bipartite object are extracted, in a form that accommodates rotation and size changes of the object. Starting from this form feature, retrieval that is robust to translation, rotation, and scaling is achieved by combining information on the texture sample, color, and eccentricity and measuring the degree of similarity. The method is also less sensitive to distortion of the object features caused by partial change or damage of the region. In addition, weighting the similarity of image features according to the relative complexity of the measured objects, using the fractal dimension computed as the box-counting dimension, minimized incorrect retrievals and yielded a more efficient retrieval rate.
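
The box-counting dimension mentioned above can be estimated by counting occupied boxes at several scales and fitting the slope of log(count) against log(1/size). The sketch below assumes the object is given as a set of integer pixel coordinates; the function name, box sizes, and test shapes are illustrative assumptions.

```python
import math

def box_counting_dimension(points, box_sizes):
    """Least-squares slope of log(N(s)) versus log(1/s) over the given box sizes."""
    xs, ys = [], []
    for s in box_sizes:
        boxes = {(x // s, y // s) for x, y in points}  # occupied boxes of side s
        xs.append(math.log(1 / s))
        ys.append(math.log(len(boxes)))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

square = [(x, y) for x in range(8) for y in range(8)]  # filled region
line = [(x, 0) for x in range(8)]                      # straight contour
dim_square = box_counting_dimension(square, [1, 2, 4])
dim_line = box_counting_dimension(line, [1, 2, 4])
print(dim_square, dim_line)
```

A filled region gives a dimension near 2 and a straight line near 1; more complex contours fall in between, which is what makes the value usable as a complexity measure.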

Co-registration of PET-CT Brain Images using a Gaussian Weighted Distance Map (가우시안 가중치 거리지도를 이용한 PET-CT 뇌 영상정합)

  • Lee, Ho;Hong, Helen;Shin, Yeong-Gil
    • Journal of KIISE:Software and Applications / v.32 no.7 / pp.612-624 / 2005
  • In this paper, we propose a surface-based registration method using a Gaussian weighted distance map for PET-CT brain image fusion. Our method is composed of three main steps: extraction of feature points, generation of a Gaussian weighted distance map, and weight-based similarity measurement. First, we segment the head using inverse region growing and remove noise segmented along with the head using region growing-based labeling, in the PET and CT images respectively. We then extract the feature points of the head using a sharpening filter. Second, a Gaussian weighted distance map is generated from the feature points in the CT images; this map leads the feature points to converge robustly on the optimal location even under a large geometrical displacement. Third, weight-based cross-correlation searches for the optimal location using the Gaussian weighted distance map of the CT images and the feature points extracted from the PET images. In our experiments, we generate a software phantom dataset to evaluate the accuracy and robustness of our method, and use a clinical dataset to evaluate computation time and for visual inspection. The accuracy test evaluates the root-mean-square error on an arbitrarily transformed software phantom dataset. The robustness test evaluates whether the weight-based cross-correlation reaches its maximum at the optimal location in a software phantom dataset with a large geometrical displacement and noise. Experimental results show that our method is more accurate and converges more robustly than conventional surface-based registration.
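
The idea of a Gaussian weighted distance map guiding registration can be sketched in 2D. This is a heavily simplified illustration, not the authors' method: it assumes translation-only alignment, a brute-force nearest-feature distance, and a plain sum of map weights as the similarity score. All names, the value of `sigma`, and the toy points are hypothetical.

```python
import math

def gaussian_weighted_distance_map(features, width, height, sigma=1.0):
    """Weight each pixel by exp(-d^2 / (2*sigma^2)), d = distance to nearest feature."""
    gmap = {}
    for x in range(width):
        for y in range(height):
            d2 = min((x - fx) ** 2 + (y - fy) ** 2 for fx, fy in features)
            gmap[(x, y)] = math.exp(-d2 / (2 * sigma ** 2))
    return gmap

def best_shift(moving, gmap, max_shift=2):
    """Exhaustively search integer translations that maximize the summed weight."""
    def score(shift):
        dx, dy = shift
        return sum(gmap.get((x + dx, y + dy), 0.0) for x, y in moving)
    shifts = [(dx, dy)
              for dx in range(-max_shift, max_shift + 1)
              for dy in range(-max_shift, max_shift + 1)]
    return max(shifts, key=score)

ct_features = [(2, 2), (2, 3), (3, 2)]                   # reference feature points
pet_features = [(x - 1, y + 1) for x, y in ct_features]  # same shape, displaced
gmap = gaussian_weighted_distance_map(ct_features, 6, 6)
shift = best_shift(pet_features, gmap)
print(shift)
```

Because the weights decay smoothly away from the reference features, the score still provides a gradient toward the correct alignment when the initial displacement is large.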

A Fast Vision-based Head Tracking Method for Interactive Stereoscopic Viewing

  • Putpuek, Narongsak;Chotikakamthorn, Nopporn
    • 제어로봇시스템학회:학술대회논문집 / 2004.08a / pp.1102-1105 / 2004
  • In this paper, the problem of tracking a viewer's head in a desktop-based interactive stereoscopic display system is considered. A fast and low-cost approach to the problem is important for such a computing environment. The system under consideration utilizes a shuttle glass for stereoscopic display. The proposed method makes use of an image taken from a single low-cost video camera. Using a simple feature extraction algorithm, the points corresponding to the image of the user-worn shuttle glass are used to estimate the glass center, its local 'yaw' angle, measured with respect to the glass center, and its global 'yaw' angle, measured with respect to the camera location. The stereoscopic image synthesis program uses these estimates to interactively adjust the two-view stereoscopic image pair displayed on a computer screen. The adjustment is carried out so that the resulting stereoscopic picture, when viewed from the current user position, provides close-to-real perspective and depth perception. However, because the algorithm and device are designed for fast computation, the estimates are typically not precise enough to provide flicker-free interactive viewing. An error concealment method is therefore proposed to alleviate the problem. This concealment method should be sufficient for applications that do not require a high degree of visual realism and interaction.
