• Title/Summary/Keyword: Object Feature Extraction


Normalization of Face Images Subject to Directional Illumination using Linear Model (선형모델을 이용한 방향성 조명하의 얼굴영상 정규화)

  • 고재필;김은주;변혜란
    • Journal of KIISE:Software and Applications
    • /
    • v.31 no.1
    • /
    • pp.54-60
    • /
    • 2004
  • Face recognition is one of the problems addressed by appearance-based matching techniques. However, the appearance of a face image is very sensitive to variation in illumination. One of the easiest ways to improve performance is to collect more training samples acquired under varied lighting, but this is not practical in the real world. In object recognition, it is desirable to focus on feature extraction or normalization techniques rather than on the classifier. This paper presents a simple approach to normalizing faces subject to directional illumination, which is one of the significant sources of error in the face recognition process. The proposed method, ICR (Illumination Compensation based on Multiple Linear Regression), finds the plane that best fits the intensity distribution of the face image using multiple linear regression, then uses this plane to normalize the face image. The method is simple and practical: the planar approximation of a face image is mathematically defined by a simple linear model. We provide experimental results to demonstrate the performance of the proposed ICR method on public face databases and our own database. The experimental results show a significant improvement in recognition accuracy.
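
The plane-fitting step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `fit_plane` and `normalize_illumination` are hypothetical, the image is a plain 2-D list of intensities at least 2x2 in size, and the normalization shown (subtract the fitted plane, then restore the mean intensity) is one plausible choice that the abstract does not spell out.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(image):
    """Least-squares fit of z = a*x + b*y + c to a 2-D list of intensities."""
    n = sx = sy = sxx = syy = sxy = sz = sxz = syz = 0.0
    for y, row in enumerate(image):
        for x, z in enumerate(row):
            n += 1
            sx += x
            sy += y
            sxx += x * x
            syy += y * y
            sxy += x * y
            sz += z
            sxz += x * z
            syz += y * z
    # Normal equations of the multiple linear regression: A @ [a, b, c] = rhs.
    A = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    d = det3(A)
    coeffs = []
    for col in range(3):
        # Cramer's rule: replace one column of A with the right-hand side.
        m = [r[:] for r in A]
        for r in range(3):
            m[r][col] = rhs[r]
        coeffs.append(det3(m) / d)
    return tuple(coeffs)


def normalize_illumination(image):
    """Subtract the best-fit plane and restore the mean intensity (assumed rule)."""
    a, b, c = fit_plane(image)
    h, w = len(image), len(image[0])
    mean = sum(map(sum, image)) / (h * w)
    return [[image[y][x] - (a * x + b * y + c) + mean for x in range(w)]
            for y in range(h)]
```

For an image whose brightness rises linearly from left to right, the fitted plane absorbs the gradient and the normalized image becomes flat at the mean intensity.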

A Study of Feature-Extraction from the Specifically Intended Product Designs (제품의 특성추출을 통한 디자인 적용 방법에 관한 연구)

  • Hyoung, Sung-Eun;Cho, Un-Dea;Cho, Kwang-Soo
    • Science of Emotion and Sensibility
    • /
    • v.10 no.1
    • /
    • pp.87-98
    • /
    • 2007
  • The aim of this study is to identify the features of an object that reveal its specific purpose, and to apply them to the product concept and design form when designers develop products. For this study, the subjects of the experiment were asked to fill out a basic questionnaire, and an image analysis of the responses was performed. After the analysis, the functional design elements of the subjects were extracted and coded. The correlation between the results of the image analysis and the characteristics of the subjects was then verified. The questionnaire was carried out to determine the characteristics of the subjects. The features of specific products extracted through this experiment can be used as basic data for analyzing consumer needs and for better understanding the products being designed. This can be useful fundamental data enabling designers to understand products easily and to establish concepts for their designs. For the MP3 player examined in this study, the image analysis yielded six features: sound quality, compatibility, portability, employment, interface, and personality. Their respective related features were investigated as well, and the features important for designing an MP3 player are presented. This fundamental study makes it possible to understand consumers' needs more effectively, which can serve as a basis for various fields of design.


Image Registration and Fusion between Passive Millimeter Wave Images and Visual Images (수동형 멀리미터파 영상과 가시 영상과의 정합 및 융합에 관한 연구)

  • Lee, Hyoung;Lee, Dong-Su;Yeom, Seok-Won;Son, Jung-Young;Guschin, Vladmir P.;Kim, Shin-Hwan
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.36 no.6C
    • /
    • pp.349-354
    • /
    • 2011
  • Passive millimeter wave imaging can detect objects concealed under clothing. It can also produce interpretable images under low-visibility conditions such as rain, fog, smoke, and dust. However, the image quality is often degraded by low spatial resolution, low signal level, and low temperature resolution. This paper addresses image registration and fusion between passive millimeter wave images and visual images. The goal of this study is to combine and visualize two different types of information together: the human subject's identity and concealed objects. The image registration process is composed of body boundary detection and an affine transform that maximizes the cross-correlation coefficient of two edge images. The image fusion process comprises three stages: a discrete wavelet transform for image decomposition, a fusion rule for merging the coefficients, and the inverse transform for image synthesis. In the experiments, various types of metallic and non-metallic objects, such as a knife, gel or liquid type beauty aids, and a phone, are detected by passive millimeter wave imaging. The registration and fusion process can visualize the meaningful information from two different types of sensors.
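
The three fusion stages named in this abstract (decomposition, coefficient merging, inverse transform) can be illustrated with a one-level Haar transform. This is only a sketch under assumptions: the abstract does not state which wavelet or fusion rule was used, so the Haar basis, the rule "average the approximation band, keep the larger-magnitude detail coefficient", and the function names `haar2d`, `ihaar2d`, and `fuse` are all illustrative choices. The two inputs are assumed to be already registered, equally sized 2-D lists with even height and width.

```python
def haar2d(img):
    """One-level 2-D Haar transform; returns the (LL, LH, HL, HH) sub-bands."""
    bands = ([], [], [], [])
    for y in range(0, len(img), 2):
        rows = ([], [], [], [])
        for x in range(0, len(img[0]), 2):
            a, b = img[y][x], img[y][x + 1]
            c, d = img[y + 1][x], img[y + 1][x + 1]
            rows[0].append((a + b + c + d) / 2)  # LL: approximation
            rows[1].append((a - b + c - d) / 2)  # LH: horizontal differences
            rows[2].append((a + b - c - d) / 2)  # HL: vertical differences
            rows[3].append((a - b - c + d) / 2)  # HH: diagonal differences
        for band, row in zip(bands, rows):
            band.append(row)
    return bands


def ihaar2d(ll, lh, hl, hh):
    """Inverse of haar2d: rebuild the image from its four sub-bands."""
    img = []
    for y in range(len(ll)):
        top, bottom = [], []
        for x in range(len(ll[0])):
            s, h, v, d = ll[y][x], lh[y][x], hl[y][x], hh[y][x]
            top += [(s + h + v + d) / 2, (s - h + v - d) / 2]
            bottom += [(s + h - v - d) / 2, (s - h - v + d) / 2]
        img += [top, bottom]
    return img


def fuse(img_a, img_b):
    """Fuse two registered images in the wavelet domain (assumed fusion rule)."""
    bands_a, bands_b = haar2d(img_a), haar2d(img_b)
    # Approximation band: average the two sources.
    ll = [[(p + q) / 2 for p, q in zip(ra, rb)]
          for ra, rb in zip(bands_a[0], bands_b[0])]
    # Detail bands: keep whichever coefficient has the larger magnitude.
    details = [
        [[p if abs(p) >= abs(q) else q for p, q in zip(ra, rb)]
         for ra, rb in zip(band_a, band_b)]
        for band_a, band_b in zip(bands_a[1:], bands_b[1:])
    ]
    return ihaar2d(ll, *details)
```

Fusing an image with itself returns the image unchanged, which is a quick sanity check that the forward and inverse transforms match.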

An Automatic ROI Extraction and Its Mask Generation based on Wavelet of Low DOF Image (피사계 심도가 낮은 이미지에서 웨이블릿 기반의 자동 ROI 추출 및 마스크 생성)

  • Park, Sun-Hwa;Seo, Yeong-Geon;Lee, Bu-Kweon;Kang, Ki-Jun;Kim, Ho-Yong;Kim, Hyung-Jun;Kim, Sang-Bok
    • Journal of the Korea Society of Computer and Information
    • /
    • v.14 no.3
    • /
    • pp.93-101
    • /
    • 2009
  • This paper proposes a new algorithm that automatically and quickly searches for the Region of Interest (ROI) using the edge information of the high-frequency subbands produced by a wavelet transform. The proposed method runs a block-wise search for the object boundary in four directions using the edge information, and detects ROIs. The whole image is split into blocks of $64{\times}64$ or $32{\times}32$, and each block is classified as an ROI block or a background block according to whether it contains edges. The four directional searches scan the image from the outside toward the center, exploiting the property that a low-DOF image has edges toward its center. After finding all the edges, the method regards the blocks enclosed by the edges as the ROI, builds the ROI masks, and sends them to the server. This is a dynamic ROI method. Existing methods suffer from complicated filtering and region merging, which this method considerably alleviates. Because processing is done block by block, the method can also be applied to applications requiring real-time processing.
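
One way to read the four-direction block search is sketched below. It is an interpretation, not the published algorithm: the input is assumed to be a grid of booleans, one per block, that is already marked `True` where the block contains wavelet-domain edges, and the function name `roi_mask` and the rule "a block is ROI if it lies between the first edge blocks met from the left and right on its row and from the top and bottom on its column" are assumptions.

```python
def roi_mask(edge_blocks):
    """Mark blocks enclosed by the outermost edge blocks in all four directions."""
    h, w = len(edge_blocks), len(edge_blocks[0])

    # Left-to-right and right-to-left scans: first edge block met on each row.
    row_span = []
    for row in edge_blocks:
        cols = [x for x, has_edge in enumerate(row) if has_edge]
        row_span.append((cols[0], cols[-1]) if cols else None)

    # Top-to-bottom and bottom-to-top scans: first edge block met on each column.
    col_span = []
    for x in range(w):
        rows = [y for y in range(h) if edge_blocks[y][x]]
        col_span.append((rows[0], rows[-1]) if rows else None)

    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            rs, cs = row_span[y], col_span[x]
            if rs is not None and cs is not None:
                # Inside the horizontal span of its row and the vertical span of its column.
                mask[y][x] = rs[0] <= x <= rs[1] and cs[0] <= y <= cs[1]
    return mask
```

With a ring of edge blocks around the center of a 5x5 grid, the mask covers the ring and the block it encloses, while the outer border stays background.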

Reproducing Summarized Video Contents based on Camera Framing and Focus

  • Hyung Lee;E-Jung Choi
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.10
    • /
    • pp.85-92
    • /
    • 2023
  • In this paper, we propose a method for automatically generating story-based abbreviated summaries from long-form dramas and movies. The basic premise is that, from the shooting stage, frames are composed with an illusion of depth that takes the golden ratio into account and with the focus on the object of interest, so as to direct the viewer's attention to the content being delivered. To extract appropriate frames on this basis, we used elemental techniques from previous work on scene and shot detection, as well as work on identifying focus-related blur. After converting videos shared on YouTube into individual frames, we divided each frame into the entire frame and three partial regions for feature extraction, applied the Laplacian operator and the FFT to each region, and chose the FFT for its relative consistency and robustness. By comparing the values calculated for the entire frame with those calculated for the three regions, target frames were selected on the condition that relatively sharp regions could be identified. The final frames were then extracted by combining the selected results with the output of an offline change point detection method, to ensure the continuity of the frames within a shot, and an edit decision list was constructed to produce an abbreviated summary of 62.77% of the footage with an F1-score of 75.9%.
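
The comparison between a whole-frame sharpness value and per-region values can be sketched as follows. Note the substitution: the abstract reports that the authors finally chose an FFT-based measure, whereas this sketch scores sharpness with the variance of a 4-neighbor Laplacian, the other operator the abstract mentions, because it is short to write without external libraries. The split into three vertical thirds, the `ratio` threshold, and all function names are assumptions; the abstract does not define the three regions or the selection condition precisely. Frames are assumed to be grayscale 2-D lists at least 3 rows high and 9 columns wide.

```python
def laplacian_variance(gray):
    """Variance of the 4-neighbor Laplacian over interior pixels (higher = sharper)."""
    h, w = len(gray), len(gray[0])
    vals = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)


def split_vertical_thirds(gray):
    """Cut the frame into left, middle, and right regions (assumed layout)."""
    w = len(gray[0])
    third = w // 3
    bounds = [(0, third), (third, 2 * third), (2 * third, w)]
    return [[row[a:b] for row in gray] for a, b in bounds]


def has_sharp_region(gray, ratio=1.5):
    """True if some region is clearly sharper than the frame as a whole."""
    whole = laplacian_variance(gray)
    return any(
        laplacian_variance(region) > ratio * whole
        for region in split_vertical_thirds(gray)
    )
```

A frame that is flat on the left and finely textured on the right passes the check, while a uniformly flat frame does not.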

Automatic gasometer reading system using selective optical character recognition (관심 문자열 인식 기술을 이용한 가스계량기 자동 검침 시스템)

  • Lee, Kyohyuk;Kim, Taeyeon;Kim, Wooju
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.2
    • /
    • pp.1-25
    • /
    • 2020
  • In this paper, we propose an application system architecture that provides accurate, fast, and efficient automatic gasometer reading. The system captures a gasometer image using a mobile device camera, transmits the image to a cloud server over a private LTE network, and analyzes the image to extract the device ID and gas usage amount by selective optical character recognition based on deep learning. In general, an image contains many types of characters, and optical character recognition extracts all of the character information in the image. Some applications, however, need to ignore characters that are not of interest and focus only on specific types of characters. For example, an automatic gasometer reading system only needs to extract the device ID and gas usage amount from gasometer images in order to bill users. Character strings that are not of interest, such as device type, manufacturer, manufacturing date, and specification, are not valuable to the application. Thus, the application has to analyze only the region of interest and specific types of characters to extract the valuable information. We adopted CNN (Convolutional Neural Network) based object detection and CRNN (Convolutional Recurrent Neural Network) technology for selective optical character recognition, which analyzes only the region of interest for selective character information extraction. We built three neural networks for the application system. 
The first is a convolutional neural network that detects the regions of interest containing the gas usage amount and device ID character strings; the second is another convolutional neural network that transforms the spatial information of the region of interest into spatial sequential feature vectors; and the third is a bi-directional long short-term memory network that converts the spatial sequential information into character strings using a time-series mapping from feature vectors to character strings. In this research, the character strings of interest are the device ID and the gas usage amount. The device ID consists of 12 Arabic numerals and the gas usage amount consists of 4 to 5 Arabic numerals. All system components are implemented in the Amazon Web Services (AWS) cloud with an Intel Xeon E5-2686 v4 CPU and an NVIDIA Tesla V100 GPU. The system architecture adopts a master-slave processing structure for efficient and fast parallel processing, coping with about 700,000 requests per day. The mobile device captures a gasometer image and transmits it to the master process in the AWS cloud. The master process runs on the Intel Xeon CPU and pushes the reading request from the mobile device to an input queue with a FIFO (First In First Out) structure. The slave process consists of the three types of deep neural networks that conduct the character recognition process, and runs on the NVIDIA GPU module. The slave process continually polls the input queue for recognition requests. If there are requests from the master process in the input queue, the slave process converts the image in the input queue into the device ID character string, the gas usage amount character string, and the position information of the strings, returns this information to the output queue, and switches back to idle mode to poll the input queue. The master process gets the final information from the output queue and delivers it to the mobile device. We used a total of 27,120 gasometer images for training, validation, and testing of the three types of deep neural networks. 
Of these, 22,985 images were used for training and validation, and 4,135 images were used for testing. We randomly split the 22,985 images at an 8:2 ratio into training and validation sets for each training epoch. The 4,135 test images were categorized into 5 types (normal, noise, reflex, scale, and slant). Normal means clean image data, noise means images with a noise signal, reflex means images with light reflection in the gasometer region, scale means images with a small object size due to long-distance capturing, and slant means images that are not horizontally level. The final character string recognition accuracies for the device ID and gas usage amount on normal data are 0.960 and 0.864, respectively.
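
The queue-based request flow described in this abstract can be sketched with Python's standard library. This is a simplified, single-machine illustration under assumptions, not the deployed AWS system: `recognize` is a hypothetical stub standing in for the three neural networks, `serve` and `worker` are invented names, a thread stands in for the GPU process, and a blocking `get()` replaces explicit polling of the input queue.

```python
import queue
import threading

STOP = object()  # sentinel telling the worker to shut down


def recognize(image):
    """Hypothetical stand-in for the detection + CRNN recognition networks."""
    return f"recognized:{image}"


def worker(input_q, output_q):
    """Recognition process: take requests from the input queue, emit results."""
    while True:
        request = input_q.get()  # blocks until the master enqueues something
        if request is STOP:
            break
        request_id, image = request
        output_q.put((request_id, recognize(image)))


def serve(images):
    """Master process: enqueue requests in FIFO order and collect the results."""
    input_q = queue.Queue()   # queue.Queue is first in, first out
    output_q = queue.Queue()
    t = threading.Thread(target=worker, args=(input_q, output_q))
    t.start()
    for request_id, image in enumerate(images):
        input_q.put((request_id, image))
    input_q.put(STOP)
    t.join()  # wait until every request has been processed
    results = {}
    while not output_q.empty():
        request_id, text = output_q.get()
        results[request_id] = text
    return results
```

Calling `serve(["meter_a.jpg", "meter_b.jpg"])` returns one result per request, keyed by request order. Starting several worker threads against the same pair of queues would be the natural way to extend the sketch toward parallel processing.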