• Title/Summary/Keyword: Image Feature Vector


A Study on Stroke Extraction for Handwritten Korean Character Recognition (필기체 한글 문자 인식을 위한 획 추출에 관한 연구)

  • Choi, Young-Kyoo;Rhee, Sang-Burm
    • The KIPS Transactions:PartB / v.9B no.3 / pp.375-382 / 2002
  • Handwritten character recognition is classified into on-line and off-line handwritten character recognition. On-line handwritten character recognition has achieved remarkable results compared to off-line handwritten character recognition, because it can acquire dynamic writing information, such as the writing order and the position of a stroke, from a pen-based electronic input device such as a tablet. In contrast, no dynamic information can be acquired in off-line handwritten character recognition, and there is extreme overlapping between consonants and vowels as well as heavy noise between strokes, so recognition performance changes with the result of the preprocessing. This paper proposes a method that effectively extracts strokes, including the dynamic information of characters, for off-line handwritten Korean character recognition. First, as preprocessing, the method enhances and binarizes the input handwritten character image using the watershed algorithm. Next, the skeleton is extracted using a modified version of Lu and Wang's thinning algorithm, and segment pixel arrays are extracted by detecting the feature points of the characters. Vectorization is then performed with a maximum permission error method. When several strokes are bound in one segment, the segment pixel array is divided into two or more segment vectors. To reconstruct complete strokes from the extracted segment vectors, the directional component of each vector is modified using a right-hand writing coordinate system. By combining adjacent segment vectors that can be merged, complete strokes suitable for character recognition are reconstructed. Experiments verify that the proposed method is suitable for handwritten Korean character recognition.
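
The abstract does not spell out the "maximum permission error" vectorization, so the following is only a minimal sketch of one common way to turn a skeleton pixel chain into segment vectors under an error bound. It uses the Ramer-Douglas-Peucker split as a stand-in; the function names and the `max_error` parameter are assumptions for illustration, not the authors' implementation.

```python
import math


def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        # Degenerate chord (a == b): fall back to point-to-point distance.
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / length


def vectorize(pixels, max_error):
    """Approximate an ordered pixel chain by a polyline whose vertices are
    chain pixels and whose deviation from the chain is at most max_error."""
    if len(pixels) < 3:
        return list(pixels)
    a, b = pixels[0], pixels[-1]
    # Find the interior pixel that deviates most from the chord a-b.
    worst_i, worst_d = 0, 0.0
    for i in range(1, len(pixels) - 1):
        d = point_line_distance(pixels[i], a, b)
        if d > worst_d:
            worst_i, worst_d = i, d
    if worst_d <= max_error:
        return [a, b]  # one segment vector is enough
    # Otherwise split at the worst pixel and vectorize both halves.
    left = vectorize(pixels[:worst_i + 1], max_error)
    right = vectorize(pixels[worst_i:], max_error)
    return left[:-1] + right


# Hypothetical L-shaped pixel chain: it is split into two segment vectors.
chain = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (3, 3)]
corners = vectorize(chain, max_error=0.5)
```

Here the split at the corner pixel mirrors the case described above in which one segment pixel array is divided into two or more segment vectors.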

A Robust Hand Recognition Method to Variations in Lighting (조명 변화에 안정적인 손 형태 인지 기술)

  • Choi, Yoo-Joo;Lee, Je-Sung;You, Hyo-Sun;Lee, Jung-Won;Cho, We-Duke
    • The KIPS Transactions:PartB / v.15B no.1 / pp.25-36 / 2008
  • In this paper, we present a hand recognition approach that is robust to sudden illumination changes. The proposed approach constructs a background model with respect to hue and hue gradient in HSI color space and extracts a foreground hand region from an input image using the background subtraction method. Eighteen features are defined for a hand pose, and a multi-class SVM (Support Vector Machine) approach is applied to learn and classify hand poses based on these eighteen features. The proposed approach robustly extracts the contour of a hand under variations in illumination by applying the hue gradient to the background subtraction. A hand pose is defined by two eigenvalues, which are normalized by the size of the OBB (Object-Oriented Bounding Box), and sixteen feature values, which represent the number of hand contour points included in each subrange of the OBB. We compared RGB-based background subtraction, hue-based background subtraction, and the proposed approach under sudden illumination changes and demonstrated the robustness of the proposed approach. In the experiment, we built a hand pose training model from 2,700 sample hand images of six subjects, which represent the nine numbers from one to nine. Using the training model, our implementation achieves a recognition rate of 92.6% on 1,620 hand images under various lighting conditions.

Vehicle Detection and Tracking using Billboard Sweep Stereo Matching Algorithm (빌보드 스윕 스테레오 시차정합 알고리즘을 이용한 차량 검출 및 추적)

  • Park, Min Woo;Won, Kwang Hee;Jung, Soon Ki
    • Journal of Korea Multimedia Society / v.16 no.6 / pp.764-781 / 2013
  • In this paper, we propose a highly precise vehicle detection method with a low false alarm rate using billboard sweep stereo matching and multi-stage hypothesis generation. First, we capture stereo images from cameras mounted at the front of the vehicle and obtain a disparity map in which the regions of the ground plane or background are removed using the billboard sweep stereo matching algorithm. Then, we perform vehicle detection and tracking on the labeled disparity map. The vehicle detection and tracking consists of three steps. In the learning step, an SVM (support vector machine) classifier is trained using features extracted with a Gabor filter. The second step is vehicle detection, which performs Sobel edge detection on the image of the left camera and extracts vehicle candidates using the edge image and the billboard sweep stereo disparity map. The final step is vehicle tracking using template matching in the next frame. Removing the tracking regions from the vehicle candidate regions in succeeding frames improves system performance.

An Algorithm of Fingerprint Image Restoration Based on an Artificial Neural Network (인공 신경망 기반의 지문 영상 복원 알고리즘)

  • Jang, Seok-Woo;Lee, Samuel;Kim, Gye-Young
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.8 / pp.530-536 / 2020
  • The use of minutiae by fingerprint readers is robust against presentation attacks, but one weakness is that the mismatch rate is high. Therefore, minutiae tend to be used together with skeleton images. There have been many studies on security vulnerabilities in the characteristics of minutiae, but studies on the vulnerability of the skeleton are scarce, so this study analyzes the vulnerability of the skeleton to presentation attacks. To this end, we propose a skeleton-based method that recovers the original fingerprint using a learning algorithm. The proposed method includes a new learning model, Pix2Pix, which adds a latent vector to the existing Pix2Pix model, thereby generating a natural fingerprint. In the experiments, the original fingerprint is restored using the proposed machine learning method, and the restored fingerprint is then used as input to the fingerprint reader, where it achieves a good recognition rate. Thus, this study verifies that fingerprint readers using the skeleton are vulnerable to presentation attacks. The approach presented in this paper is expected to be useful in a variety of applications concerning fingerprint restoration, video security, and biometrics.

Video Analysis System for Action and Emotion Detection by Object with Hierarchical Clustering based Re-ID (계층적 군집화 기반 Re-ID를 활용한 객체별 행동 및 표정 검출용 영상 분석 시스템)

  • Lee, Sang-Hyun;Yang, Seong-Hun;Oh, Seung-Jin;Kang, Jinbeom
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.89-106 / 2022
  • Recently, the amount of video data collected from smartphones, CCTVs, black boxes, and high-definition cameras has increased rapidly, and with it the requirements for analysis and utilization. Because many industries lack skilled manpower to analyze videos, machine learning and artificial intelligence are actively used to assist. As a result, the demand for computer vision technologies such as object detection and tracking, action detection, emotion detection, and Re-ID has also increased rapidly. However, object detection and tracking suffers from many difficulties that degrade performance, such as occlusion and re-appearance after the object leaves the recorded location. Accordingly, action and emotion detection models built on object detection and tracking models also have difficulty extracting data for each object. In addition, deep learning architectures composed of multiple models suffer from performance degradation due to bottlenecks and lack of optimization. In this study, we propose a video analysis system consisting of a YOLOv5-based DeepSORT object tracking model, a SlowFast-based action recognition model, a Torchreid-based Re-ID model, and AWS Rekognition, an emotion recognition service. The proposed model uses single-linkage hierarchical clustering-based Re-ID and processing methods that maximize hardware throughput. It achieves higher accuracy than a re-identification model using simple metrics, delivers near real-time processing performance, and prevents tracking failures due to object departure and re-emergence, occlusion, etc. By continuously linking the action and facial emotion detection results of each object to the same object, videos can be analyzed efficiently.
The re-identification model extracts a feature vector from the bounding box of each object image detected by the object tracking model in each frame, and applies single-linkage hierarchical clustering to the feature vectors extracted from past frames to identify the same object whose tracking failed. Through this process, an object whose tracking failed because of occlusion or re-appearance after leaving the recorded location can be re-tracked. As a result, the action and facial emotion detection results of an object newly recognized after a tracking failure can be linked to those of the object that appeared in the past. To improve processing performance, we introduce the Bounding Box Queue by Object and Feature Queue methods, which reduce RAM requirements while maximizing GPU memory throughput. We also introduce the IoF (Intersection over Face) algorithm, which allows facial emotions recognized through AWS Rekognition to be linked with object tracking information. The academic significance of this study is that, thanks to these processing techniques, the two-stage re-identification model can achieve real-time performance even in a high-cost environment that performs action and facial emotion detection, without sacrificing accuracy by resorting to simple metrics. The practical implication of this study is that industrial fields that require action and facial emotion detection but struggle with object tracking failures can analyze videos effectively with the proposed model. The proposed model, which has high re-tracking accuracy and processing performance, can be used in fields such as intelligent monitoring, observation services, and behavioral or psychological analysis services, where the integration of tracking information and extracted metadata creates great industrial and business value.
In the future, to measure object tracking performance more precisely, an experiment using the MOT Challenge dataset, which is used by many international conferences, is needed. We will investigate the problems that the IoF algorithm cannot solve in order to develop an additional complementary algorithm. In addition, we plan to conduct further research to apply this model to datasets from various fields related to intelligent video analysis.
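
The abstract names the IoF (Intersection over Face) algorithm without defining it. The sketch below assumes the most literal reading: the share of a detected face box that lies inside a tracked object's bounding box, with the face assigned to the track that covers it best. The box layout `(x1, y1, x2, y2)`, the function names, and the `min_iof` threshold are all hypothetical.

```python
def intersection_over_face(face_box, object_box):
    """Fraction of the face box area covered by the object box.
    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (assumed layout)."""
    fx1, fy1, fx2, fy2 = face_box
    ox1, oy1, ox2, oy2 = object_box
    inter_w = max(0.0, min(fx2, ox2) - max(fx1, ox1))
    inter_h = max(0.0, min(fy2, oy2) - max(fy1, oy1))
    face_area = (fx2 - fx1) * (fy2 - fy1)
    return (inter_w * inter_h) / face_area if face_area > 0 else 0.0


def assign_face(face_box, tracks, min_iof=0.5):
    """Return the id of the track with the highest IoF for this face,
    or None when no track reaches min_iof (an assumed threshold)."""
    best_id, best_score = None, 0.0
    for track_id, box in tracks.items():
        score = intersection_over_face(face_box, box)
        if score > best_score:
            best_id, best_score = track_id, score
    return best_id if best_score >= min_iof else None


# Hypothetical tracker output: track id -> bounding box.
tracks = {1: (0, 0, 50, 100), 2: (15, 0, 60, 100)}
owner = assign_face((10, 10, 20, 20), tracks)
```

Dividing by the face area rather than by the union (as IoU does) keeps the score high when a small face sits entirely inside a much larger body box, which is presumably why a face-specific ratio is used.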

Counterfeit Money Detection Algorithm based on Morphological Features of Color Printed Images and Supervised Learning Model Classifier (컬러 프린터 영상의 모폴로지 특징과 지도 학습 모델 분류기를 활용한 위변조 지폐 판별 알고리즘)

  • Woo, Qui-Hee;Lee, Hae-Yeoun
    • KIPS Transactions on Software and Data Engineering / v.2 no.12 / pp.889-898 / 2013
  • Due to the popularization of high-performance capturing equipment and the emergence of powerful image-editing software, it is easy to make high-quality counterfeit money. However, the probability that the general public detects counterfeit money is extremely low, and detection devices are expensive. In this paper, a counterfeit money detection algorithm using a general-purpose scanner and computer system is proposed. First, the printing features of color printers are calculated using morphological operations and a gray-level co-occurrence matrix. Then, these features are used to train a support vector machine classifier. The trained classifier is applied to classify money as original or counterfeit. In the experiment, we measured the detection rate between original and counterfeit money. The printing source was also identified. The proposed algorithm was compared with an algorithm that uses a Wiener filter to identify the color printing source. The accuracy for identifying counterfeit money was 91.92%. The accuracy for identifying the printing source was over 94.5%. The results indicate that the proposed algorithm performs better than previous work.
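
As an illustration of the gray-level co-occurrence matrix (GLCM) mentioned above, here is a minimal pure-Python sketch. The pixel offset, the number of gray levels, and the choice of contrast and energy as texture features are assumptions; the abstract does not say which GLCM statistics the authors feed to the classifier.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix: p[i][j] is the probability that a
    pixel with gray level i has a neighbor with gray level j at (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        return [[0.0] * levels for _ in range(levels)]
    return [[n / total for n in row] for row in counts]


def contrast(p):
    """Weights each co-occurrence by the squared gray-level difference."""
    return sum((i - j) ** 2 * v for i, row in enumerate(p) for j, v in enumerate(row))


def energy(p):
    """Sum of squared probabilities; high for uniform textures."""
    return sum(v * v for row in p for v in row)


# Hypothetical 4-level image patch; values must lie in range(levels).
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(patch, levels=4)
features = [contrast(p), energy(p)]  # could be passed to an SVM as inputs
```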

Regional Projection Histogram Matching and Linear Regression based Video Stabilization for a Moving Vehicle (영역별 수직 투영 히스토그램 매칭 및 선형 회귀모델 기반의 차량 운행 영상의 안정화 기술 개발)

  • Heo, Yu-Jung;Choi, Min-Kook;Lee, Hyun-Gyu;Lee, Sang-Chul
    • Journal of Broadcast Engineering / v.19 no.6 / pp.798-809 / 2014
  • Video stabilization removes unexpected shaky and irregular motion from a video. It is often used as preprocessing for robust feature tracking and matching in video. Typical video stabilization algorithms are developed to compensate for motion in surveillance video or outdoor recordings captured by a hand-held camera. However, since vehicle video contains rapid changes of motion and local features, typical video stabilization algorithms are hard to apply as they are. In this paper, we propose a novel approach to compensate for shaky and irregular motion in vehicle video using a linear regression model and vertical projection histogram matching. To this end, we perform vertical projection histogram matching in each sub-region of an input frame, and then fit a linear regression model to the estimated regional vertical movement vectors to extract vertical translation and rotation parameters. Multiple binarization with sub-region analysis for generating the linear regression model is effective in typical recording environments where rapid changes of motion and local features occur. We demonstrated the effectiveness of our approach on black box videos and showed that employing the linear regression model achieved robust estimation of motion parameters and generated stabilized video in a fully automatic manner.
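
The two steps described above (per-region projection matching, then a linear fit over the regional shifts) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' method: the projection is taken as one sum per pixel row, matching uses the mean absolute difference, and the fitted slope and intercept are read as small-angle rotation and vertical translation. All function names are hypothetical.

```python
def row_profile(region):
    """Projection of a binarized region onto the vertical axis (one sum per row)."""
    return [sum(row) for row in region]


def best_vertical_shift(prev_profile, curr_profile, max_shift):
    """Shift s (in rows) that minimizes the mean absolute difference
    between prev_profile[i] and curr_profile[i + s]."""
    n = len(prev_profile)
    best_s, best_cost = 0, float("inf")
    for s in range(-max_shift, max_shift + 1):
        pairs = [(prev_profile[i], curr_profile[i + s])
                 for i in range(n) if 0 <= i + s < n]
        if not pairs:
            continue
        cost = sum(abs(a - b) for a, b in pairs) / len(pairs)
        if cost < best_cost:
            best_s, best_cost = s, cost
    return best_s


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct x positions")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical profiles of one sub-region in two consecutive frames:
# the content moved down by two rows.
prev = [0, 0, 5, 9, 5, 0, 0, 0]
curr = [0, 0, 0, 0, 5, 9, 5, 0]
shift = best_vertical_shift(prev, curr, max_shift=3)

# Hypothetical regional shifts at horizontal offsets from the image center.
slope, intercept = fit_line([-2, -1, 0, 1, 2], [1.0, 2.0, 3.0, 4.0, 5.0])
```

With the horizontal positions measured from the image center, the intercept approximates the frame's vertical translation and the slope approximates the tangent of a small in-plane rotation.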

Real-Time Vehicle License Plate Recognition System Using Adaptive Heuristic Segmentation Algorithm (적응 휴리스틱 분할 알고리즘을 이용한 실시간 차량 번호판 인식 시스템)

  • Jin, Moon Yong;Park, Jong Bin;Lee, Dong Suk;Park, Dong Sun
    • KIPS Transactions on Software and Data Engineering / v.3 no.9 / pp.361-368 / 2014
  • License plate recognition (LPR) systems have been developed for efficient control of complex traffic environments and are currently used in many places. However, because of lighting, noise, background changes, environmental changes, and damaged plates, they work only in limited environments and are difficult to use in real time. This paper presents a heuristic segmentation algorithm that is robust to noise and illumination changes and introduces a real-time license plate recognition system based on it. In the first step, we detect the plate using Haar-like features and AdaBoost. This method enables rapid detection through the use of an integral image and a cascade structure. In the second step, we determine the type of license plate using adaptive histogram equalization and bilateral filtering for denoising, and segment the characters accurately based on adaptive thresholding, pixel projection, and prior knowledge. The last step is character recognition, which uses histogram of oriented gradients (HOG) and a multi-layer perceptron (MLP) for number recognition and a support vector machine (SVM) as the number and Korean character classifier, respectively. The experimental results show a license plate detection rate of 94.29% and a license plate false alarm rate of 2.94%. In character segmentation, the character hit rate is 97.23% and the character false alarm rate is 1.37%. In character recognition, the average character recognition rate is 98.38%. The total average running time of the proposed method is 140 ms. The system can therefore operate in real time with efficiency and robustness.
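
The integral image is what makes Haar-like features cheap to evaluate: any rectangle sum costs four lookups regardless of its size. The following is a minimal sketch of that idea only; the function names and the particular two-rectangle feature are illustrative assumptions, not the detector used in the paper.

```python
def integral_image(img):
    """ii[r][c] holds the sum of img over rows < r and columns < c,
    so the table has one extra row and column of zeros."""
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def box_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Haar-like feature: left half minus right half of the window.
    A large magnitude indicates a vertical edge inside the window."""
    half = width // 2
    return (box_sum(ii, top, left, height, half)
            - box_sum(ii, top, left + half, height, half))


# Hypothetical 2x4 patch with a bright left half and a dark right half.
patch = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
response = two_rect_feature(integral_image(patch), 0, 0, 2, 4)
```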

Hand Motion Recognition Algorithm Using Skin Color and Center of Gravity Profile (피부색과 무게중심 프로필을 이용한 손동작 인식 알고리즘)

  • Park, Youngmin
    • The Journal of the Convergence on Culture Technology / v.7 no.2 / pp.411-417 / 2021
  • HCI (human-computer interaction) is the academic field that studies how humans and computers communicate with each other and recognize information. This study addresses hand gesture recognition for human interaction. It examines the problems of existing recognition methods and proposes an algorithm to improve the recognition rate. The hand region is extracted based on skin color information from an image containing the shape of a human hand, and the center of gravity profile is calculated using principal component analysis. We propose a method to increase the recognition rate of hand gestures by comparing the obtained information with predefined shapes. The existing center of gravity profile produces incorrect hand gesture recognition when the hand is deformed by rotation; in this study, the center of gravity profile is still used, but the contour point farthest from the center of gravity is taken as the starting point. Thus, a robust algorithm is proposed by improving the center of gravity profile. No gloves or special markers attached to sensors are used for hand gesture recognition, and no separate blue screen is installed. To this end, the feature vector at the nearest distance is found to resolve misrecognition, and an appropriate threshold is obtained to distinguish between success and failure.
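
A center of gravity profile that starts at the farthest contour point can be sketched as below. This is a simplified illustration: the centroid is taken as the mean of the contour points rather than derived through principal component analysis, profiles are assumed to have been resampled to equal length, and the names `centroid_profile`, `nearest_template`, and `threshold` are hypothetical.

```python
import math


def centroid_profile(contour):
    """Distances from the contour's centroid to each contour point,
    re-ordered to start at the farthest point and scaled to a peak of 1."""
    n = len(contour)
    cx = sum(x for x, _ in contour) / n
    cy = sum(y for _, y in contour) / n
    dists = [math.hypot(x - cx, y - cy) for x, y in contour]
    start = dists.index(max(dists))  # farthest point becomes the origin
    rotated = dists[start:] + dists[:start]
    peak = rotated[0]
    return [d / peak for d in rotated] if peak > 0 else rotated


def nearest_template(profile, templates, threshold):
    """Label of the closest template profile (Euclidean distance),
    or None when even the closest one is farther than threshold."""
    best_label, best_dist = None, float("inf")
    for label, template in templates.items():
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(profile, template)))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= threshold else None


# The same hypothetical contour traversed from two different start points
# yields the same profile, which is the point of fixing the start.
shape = [(0, 0), (2, 0), (6, 3), (2, 2), (0, 2)]
shifted = shape[3:] + shape[:3]
profile_a = centroid_profile(shape)
profile_b = centroid_profile(shifted)
```

Anchoring the profile at the farthest point removes the dependence on where contour tracing happens to begin, which is one way to read the rotation robustness claimed above.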

Label Embedding for Improving Classification Accuracy Using AutoEncoder with Skip-Connections (다중 레이블 분류의 정확도 향상을 위한 스킵 연결 오토인코더 기반 레이블 임베딩 방법론)

  • Kim, Museong;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.175-197 / 2021
  • Recently, with the development of deep learning technology, research on unstructured data analysis has been actively conducted and is showing remarkable results in various fields such as classification, summarization, and generation. Among the various text analysis fields, text classification is the most widely used technology in academia and industry. Text classification includes binary classification with one label among two classes, multi-class classification with one label among several classes, and multi-label classification with multiple labels among several classes. In particular, multi-label classification requires a different training method from binary and multi-class classification because each instance can have multiple labels. In addition, since the number of labels to be predicted grows as the number of labels and classes increases, prediction becomes harder and performance improvement is difficult. To overcome these limitations, research on label embedding is being actively conducted; label embedding (i) compresses the initially given high-dimensional label space into a low-dimensional latent label space, (ii) trains a model to predict the compressed label, and (iii) restores the predicted label to the original high-dimensional label space. Typical label embedding techniques include Principal Label Space Transformation (PLST), Multi-Label Classification via Boolean Matrix Decomposition (MLC-BMaD), and Bayesian Multi-Label Compressed Sensing (BML-CS). However, since these techniques consider only the linear relationship between labels or compress the labels by random transformation, they cannot capture the non-linear relationship between labels and therefore cannot create a latent label space that sufficiently contains the information of the original labels.
Recently, there have been increasing attempts to improve performance by applying deep learning technology to label embedding. A representative example is label embedding using an autoencoder, a deep learning model that is effective for data compression and restoration. However, traditional autoencoder-based label embedding suffers from a large amount of information loss when compressing a high-dimensional label space with a myriad of classes into a low-dimensional latent label space. This can be attributed to the gradient loss problem that occurs in the backpropagation process of learning. To solve this problem, the skip connection was devised: by adding the input of a layer to its output, it prevents gradient loss during backpropagation, so efficient learning is possible even when the network is deep. Skip connections are mainly used for image feature extraction in convolutional neural networks, but studies using skip connections in autoencoders or in the label embedding process are still lacking. Therefore, in this study, we propose an autoencoder-based label embedding methodology in which skip connections are added to both the encoder and the decoder to form a low-dimensional latent label space that reflects the information of the high-dimensional label space well. In addition, the proposed methodology was applied to actual paper keywords to derive the high-dimensional keyword label space and the low-dimensional latent label space. Using these, we conducted an experiment to predict the compressed keyword vector in the latent label space from the paper abstract and to evaluate multi-label classification by restoring the predicted keyword vector to the original label space. As a result, the accuracy, precision, recall, and F1 score used as performance indicators were far superior for multi-label classification based on the proposed methodology compared to traditional multi-label classification methods.
This suggests that the low-dimensional latent label space derived through the proposed methodology reflects the information of the high-dimensional label space well, which ultimately improved the performance of the multi-label classification itself. In addition, the utility of the proposed methodology was examined by comparing its performance across domain characteristics and across the number of dimensions of the latent label space.
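
The skip connection described above ("adding the input of the layer to the output") can be illustrated with a single dense layer. This is a minimal sketch under the assumption that the layer keeps the dimension unchanged so that input and output can be added directly; how the authors match dimensions inside an encoder that compresses the label space is not stated in the abstract, and the function names are hypothetical.

```python
def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(values):
    return [max(0.0, v) for v in values]


def residual_block(x, weights, bias):
    """y = relu(W x + b) + x. The added x is the skip connection; it gives
    gradients a direct path around the layer during backpropagation."""
    hidden = relu(dense(x, weights, bias))
    return [h + xi for h, xi in zip(hidden, x)]


# With all-zero weights the block reduces to the identity mapping, which is
# why stacking such blocks does not erase the input signal.
x = [1.0, -2.0]
zero_w = [[0.0, 0.0], [0.0, 0.0]]
identity_w = [[1.0, 0.0], [0.0, 1.0]]
passthrough = residual_block(x, zero_w, [0.0, 0.0])
combined = residual_block(x, identity_w, [0.0, 0.0])
```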