• Title/Summary/Keyword: feature similarity

Face Recognition Method using Geometric Feature and PCA/LDA in Wavelet Domain (웨이브릿 영역에서 기하학적 특징과 PCA/LDA를 사용한 얼굴 인식 방법)

  • 송영준;김영길
    • The Journal of the Korea Contents Association / v.4 no.3 / pp.107-113 / 2004
  • This paper improves the performance of a face recognition system that uses the PCA/LDA hybrid method by adding facial geometric features and the wavelet transform. Because previous PCA/LDA methods measured similarity according to the formal dispersion, they could not reflect facial boundaries exactly. To remedy this defect, the paper proposes a method that uses the distance between the eyes and the mouth. If the difference between the distances measured on the query and training images exceeds a given threshold, the method reorders the candidate images according to the energy feature vectors of the eyes, nose, and chin. To evaluate the proposed method, computer simulations were performed with four hundred facial images in the ORL database. The results show that the method improves the recognition rate by about 4% over the previous PCA/LDA method.

Selective Feature Extraction Method Between Markov Transition Probability and Co-occurrence Probability for Image Splicing Detection (접합 영상 검출을 위한 마르코프 천이 확률 및 동시발생 확률에 대한 선택적 특징 추출 방법)

  • Han, Jong-Goo;Eom, Il-Kyu;Moon, Yong-Ho;Ha, Seok-Wun
    • Journal of the Korea Institute of Information and Communication Engineering / v.20 no.4 / pp.833-839 / 2016
  • In this paper, we propose an algorithm that selects between Markov transition probability features and co-occurrence probability features for effective image splicing detection. The features used in our method are composed of the difference values between DCT coefficients in adjacent blocks, and the Kullback-Leibler divergence (KLD) is calculated to evaluate the difference between the distributions of original image features and spliced image features. The KLD value is an efficient measure for selecting the Markov feature or the co-occurrence feature because it expresses the dissimilarity of the two distributions. After training an SVM classifier on the extracted feature vectors, we determine whether image splicing forgery is present. To verify our algorithm, we used grid search and 6-fold cross-validation. The experimental results show that the proposed method achieves good detection performance with a limited number of features compared to conventional methods.
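The KLD-based selection described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the histogram inputs, the `eps` guard, and the "larger divergence wins" rule are all assumptions, since the abstract does not state the exact selection criterion.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return D_KL(P || Q) in nats for two discrete distributions.

    p and q are equal-length sequences of non-negative weights (for example
    histogram counts); both are normalized to sum to 1 before comparison.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    p_total, q_total = sum(p), sum(q)
    if p_total <= 0 or q_total <= 0:
        raise ValueError("each distribution needs positive total mass")
    divergence = 0.0
    for p_i, q_i in zip(p, q):
        p_i /= p_total
        q_i /= q_total
        if p_i > 0:  # terms with p_i == 0 contribute nothing
            # eps guards against division by zero when q_i == 0 (assumption)
            divergence += p_i * math.log(p_i / max(q_i, eps))
    return divergence


def select_feature_family(markov_pair, cooccurrence_pair):
    """Pick the family whose original/spliced histograms diverge more.

    Each argument is a hypothetical (original_hist, spliced_hist) tuple.
    The "larger KLD wins" rule is an assumption, not the paper's stated rule.
    """
    markov_kld = kl_divergence(*markov_pair)
    cooccurrence_kld = kl_divergence(*cooccurrence_pair)
    return "markov" if markov_kld >= cooccurrence_kld else "cooccurrence"
```

Because KLD is asymmetric, swapping the original and spliced histograms generally changes the value; a symmetric variant could be substituted if the comparison should not depend on argument order.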

A Deep Learning Application for Automated Feature Extraction in Transaction-based Machine Learning (트랜잭션 기반 머신러닝에서 특성 추출 자동화를 위한 딥러닝 응용)

  • Woo, Deock-Chae;Moon, Hyun Sil;Kwon, Suhnbeom;Cho, Yoonho
    • Journal of Information Technology Services / v.18 no.2 / pp.143-159 / 2019
  • Machine learning (ML) is a method of fitting given data to a mathematical model to derive insights or make predictions. In the age of big data, where the amount of available data increases exponentially with the development of information technology and smart devices, ML shows high prediction performance because it detects patterns without bias. Feature engineering, which generates the features that can explain the problem to be solved, has a great influence on ML performance, and its importance is continuously emphasized. Despite this importance, it is still considered a difficult task because it requires a thorough understanding of the domain characteristics and the source data as well as an iterative procedure. Therefore, we propose methods that apply deep learning to reduce the complexity and difficulty of feature extraction and to improve the performance of ML models. The most common reason given for the superior performance of deep learning on complex unstructured data is that it can extract features from the source data itself. To apply this advantage to business problems, we propose deep learning based methods that automatically extract features from transaction data or directly predict and classify target variables. In particular, we apply techniques that show high performance in text processing, based on the structural similarity between transaction data and text data, and we verify the suitability of each method according to the characteristics of the transaction data. Our study makes it possible not only to explore automated feature extraction but also to obtain a benchmark model that shows a certain level of performance before a human performs the feature extraction task. In addition, it is expected to provide guidelines for choosing a suitable deep learning model based on the business problem and the data characteristics.

A New Similarity Measure based on RMF and Its Application to Linguistic Approximation (상대적 소수 함수에 기반을 둔 새로운 유사성 측도와 언어 근사에의 응용)

  • Choe, Dae-Yeong
    • The KIPS Transactions:PartB / v.8B no.5 / pp.463-468 / 2001
  • We propose a new similarity measure based on the relative membership function (RMF). The RMF is introduced to represent the relativity between fuzzy subsets easily. Since the shape of the RMF is determined by the values of its parameters, the relativity between fuzzy subsets can be represented by adjusting only those values. Hence, relativity among individuals or cultural differences can easily be reflected when subjectivity is represented with fuzzy subsets. In this case, the parameters may be regarded as feature points that determine the structure of a fuzzy subset. Consequently, the degree of similarity between fuzzy subsets can be computed quickly from the parameters of the RMF. We use the Euclidean distance to compute the degree of similarity between fuzzy subsets represented by the RMF. We also present a new linguistic approximation method as an application of the proposed similarity measure and give a numerical example.

A Document Ranking Method by Document Clustering Using Bayesian SOM and Bootstrap (베이지안 SOM과 붓스트랩을 이용한 문서 군집화에 의한 문서 순위조정)

  • Choe, Jun-Hyeok;Jeon, Seong-Hae;Lee, Jeong-Hyeon
    • The Transactions of the Korea Information Processing Society / v.7 no.7 / pp.2108-2115 / 2000
  • Although conventional Boolean retrieval systems based on the vector space model can return retrieval results quickly, they cannot exactly reflect the user's retrieval purpose, including semantic information. Consequently, the retrieval results are very different from those users expect, which forces users to waste much time finding the expected documents among the retrieved ones. In this paper, we design a Bayesian SOM (Self-Organizing feature Map) that combines Bayesian statistical methods with the Kohonen network, a kind of unsupervised learning, and then classify documents in real time according to their semantic similarity to the user query. If statistical characteristics are difficult to observe because there are fewer than 30 documents for clustering, the number of documents must be increased to at least 50. Also, to give a high rank to the documents that are semantically most similar to the user query among the generalized classifications for generalized clusters, we compute the similarity by means of the Kohonen centroid of each document classification and adjust the secondary rank according to that similarity.

Similarity-Based Subsequence Search in Image Sequence Databases (이미지 시퀀스 데이터베이스에서의 유사성 기반 서브시퀀스 검색)

  • Kim, In-Bum;Park, Sang-Hyun
    • The KIPS Transactions:PartD / v.10D no.3 / pp.501-512 / 2003
  • This paper proposes an indexing technique for fast retrieval of similar image subsequences using the multi-dimensional time warping distance. The time warping distance is a more suitable similarity measure than the Lp distance in many applications where sequences may have different lengths and/or different sampling rates. Our indexing scheme employs a disk-based suffix tree as the index structure and uses a lower-bound distance function to filter out dissimilar subsequences without false dismissals. It applies normalization for easier control of the relative weighting of feature dimensions and discretization to compress the index tree. Experiments on medical and synthetic image sequences verify that the proposed method significantly outperforms the naive method and scales well to large image sequence databases.
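As a rough illustration of why a time warping distance tolerates sequences of different lengths, the following sketch computes the classic dynamic-programming time warping distance between two sequences of feature vectors. It is not the paper's method: the function name is hypothetical, the Euclidean base distance between frames is an assumption, and the suffix-tree index and lower-bound filtering are not shown.

```python
import math


def time_warping_distance(seq_a, seq_b):
    """Dynamic-programming time warping distance between two sequences.

    Each sequence is a list of equal-dimension feature vectors (tuples of
    floats). The per-element cost is the Euclidean distance, an assumption
    made for this sketch.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # table[i][j] = best cumulative cost aligning seq_a[:i] with seq_b[:j]
    table = [[inf] * (m + 1) for _ in range(n + 1)]
    table[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = math.dist(seq_a[i - 1], seq_b[j - 1])
            table[i][j] = cost + min(
                table[i - 1][j],      # seq_b element repeated
                table[i][j - 1],      # seq_a element repeated
                table[i - 1][j - 1],  # both sequences advance
            )
    return table[n][m]
```

For example, a sequence and a copy of it with one frame repeated have a time warping distance of zero, whereas an Lp distance is not even defined for sequences of unequal length.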

B-Corr Model for Bot Group Activity Detection Based on Network Flows Traffic Analysis

  • Hostiadi, Dandy Pramana;Wibisono, Waskitho;Ahmad, Tohari
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.10 / pp.4176-4197 / 2020
  • A botnet is a dangerous type of malware. A botnet attack in which a collection of bots attacks a similar target with a similar activity pattern is called a bot group activity. Existing intrusion detection models can detect only single-bot activities and cannot detect the behavioral relations among bots in a bot group attack. Detecting bot group activities could help network administrators isolate the activity or access of a bot group attack and determine the relations between bots by measuring their correlation. This paper proposes a new model, called the B-Corr model, that measures the similarity between bot activities using the intersections-probability concept in order to define bot group activities. The B-Corr model consists of several stages, including feature extraction from bot activity flows, measurement of intersections between bots, and production of similarity values. The model groups similar bots with a similar target to identify bot group activities. To give a more comprehensive view, it visualizes the similarity values between bots in the form of a similar-bot graph. Extensive experiments on real botnet datasets show high detection accuracy in various scenarios.

A Study on the Color Functions of the Textile Design System based on CAD using Image Analysis Methods (텍스타일 디자인 캐드 시스템의 색정리 기능에 대한 정량적 분석 연구)

  • Choi, Kyung-Me;Kim, Jong-Jun
    • Journal of Fashion Business / v.15 no.4 / pp.43-54 / 2011
  • The printing process has long been a major sector of the textile industry. With the advent of digital textile printing, the complex procedures of printing preparation and after-treatment have been streamlined. For designing the motifs of images to be printed, image handling software such as Photoshop (Adobe) has been of prime importance. Even though such software is extremely useful and functionally versatile, it involves many laborious steps for the specific textile printing process. A CAD-based textile printing function may help streamline the complex processing stages of textile printing. The image quality of the output designs was compared objectively with the aid of several image similarity evaluation schemes, including the SSIM and FSIM index methods.

Short-Term Prediction Model of Postal Parcel Traffic based on Self-Similarity (자기 유사성 기반 소포우편 단기 물동량 예측모형 연구)

  • Kim, Eunhye;Jung, Hoon
    • Journal of Korean Society of Industrial and Systems Engineering / v.43 no.4 / pp.76-83 / 2020
  • Postal logistics organizations are characterized by high labor intensity and short response times. These characteristics, along with rapid changes in mail volume, make load scheduling a fundamental concern. Load analysis of major postal infrastructure such as post offices, sorting centers, exchange centers, and delivery stations is required for optimal postal logistics operation. In particular, accurate mail traffic forecasting is essential for optimizing resource operation through accurate load analysis. This paper addresses a postal parcel traffic forecasting problem that arises at the delivery stations of Korea Post. Its main purpose is to describe a method for predicting short-term postal parcel traffic based on self-similarity analysis and to introduce an application of the traffic prediction model to the postal logistics system. The proposed scheme develops multiple regression models for the clusters that result from feature engineering, together with individual models for delivery stations, to improve prediction accuracy. An experiment with data supplied by the main postal delivery stations shows the advantage in prediction performance. Compared with other techniques, the proposed method improves accuracy by up to 45.8%.

AI Performance Based On Learning-Data Labeling Accuracy (인공지능 학습데이터 라벨링 정확도에 따른 인공지능 성능)

  • Ji-Hoon Lee;Jieun Shin
    • Journal of Industrial Convergence / v.22 no.1 / pp.177-183 / 2024
  • The study investigates the impact of data quality on the performance of artificial intelligence (AI). To this end, the impact of labeling error levels on the performance of artificial intelligence was compared and analyzed through simulation, taking into account the similarity of data features and the imbalance of class composition. As a result, data with high similarity between characteristic variables were found to be more sensitive to labeling accuracy than data with low similarity between characteristic variables. It was observed that artificial intelligence accuracy tended to decrease rapidly as class imbalance increased. This will serve as the fundamental data for evaluating the quality criteria and conducting related research on artificial intelligence learning data.