• Title/Summary/Keyword: Query Extraction

Search Result 109, Processing Time 0.038 seconds

Feature Selection with PCA based on DNS Query for Malicious Domain Classification (비정상도메인 분류를 위한 DNS 쿼리 기반의 주성분 분석을 이용한 성분추출)

  • Lim, Sun-Hee;Cho, Jaeik;Kim, Jong-Hyun;Lee, Byung Gil
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.1 no.1
    • /
    • pp.55-60
    • /
    • 2012
  • Recent botnets are widely using the DNS services at the connection of C&C server in order to evade botnet's detection. It is necessary to study on DNS analysis in order to counteract anomaly-based technique using the DNS. This paper studies collection of DNS traffic for experimental data and supervised learning for DNS traffic-based malicious domain classification such as query of domain name corresponding to C&C server from zombies. Especially, this paper would aim to determine significant features of DNS-based classification system for malicious domain extraction by the Principal Component Analysis(PCA).

A Term Weight Mensuration based on Popularity for Search Query Expansion (검색 질의 확장을 위한 인기도 기반 단어 가중치 측정)

  • Lee, Jung-Hun;Cheon, Suh-Hyun
    • Journal of KIISE:Software and Applications
    • /
    • v.37 no.8
    • /
    • pp.620-628
    • /
    • 2010
  • With the use of the Internet pervasive in everyday life, people are now able to retrieve a lot of information through the web. However, exponential growth in the quantity of information on the web has brought limits to online search engines in their search performance by showing piles and piles of unwanted information. With so much unwanted information, web users nowadays need more time and efforts than in the past to search for needed information. This paper suggests a method of using query expansion in order to quickly bring wanted information to web users. Popularity based Term Weight Mensuration better performance than the TF-IDF and Simple Popularity Term Weight Mensuration to experiments without changes of search subject. When a subject changed during search, Popularity based Term Weight Mensuration's performance change is smaller than others.

Keyword Spotting on Hangul Document Images Using Character Feature Models (문자 별 특징 모델을 이용한 한글 문서 영상에서 키워드 검색)

  • Park, Sang-Cheol;Kim, Soo-Hyung;Choi, Deok-Jai
    • The KIPS Transactions:PartB
    • /
    • v.12B no.5 s.101
    • /
    • pp.521-526
    • /
    • 2005
  • In this Paper, we propose a keyword spotting system as an alternative to searching system for poor quality Korean document images and compare the Proposed system with an OCR-based document retrieval system. The system is composed of character segmentation, feature extraction for the query keyword, and word-to-word matching. In the character segmentation step, we propose an effective method to remove the connectivity between adjacent characters and a character segmentation method by making the variance of character widths minimum. In the query creation step, feature vector for the query is constructed by a combination of a character model by typeface. In the matching step, word-to-word matching is applied base on a character-to-character matching. We demonstrated that the proposed keyword spotting system is more efficient than the OCR-based one to search a keyword on the Korean document images, especially when the quality of documents is quite poor and point size is small.

A PageRank based Data Indexing Method for Designing Natural Language Interface to CRM Databases (분석 CRM 실무자의 자연어 질의 처리를 위한 기업 데이터베이스 구성요소 인덱싱 방법론)

  • Park, Sung-Hyuk;Hwang, Kyeong-Seo;Lee, Dong-Won
    • CRM연구
    • /
    • v.2 no.2
    • /
    • pp.53-70
    • /
    • 2009
  • Understanding consumer behavior based on the analysis of the customer data is one essential part of analytic CRM. To do this, the analytic skills for data extraction and data processing are required to users. As a user has various kinds of questions for the consumer data analysis, the user should use database language such as SQL. However, for the firm's user, to generate SQL statements is not easy because the accuracy of the query result is hugely influenced by the knowledge of work-site operation and the firm's database. This paper proposes a natural language based database search framework finding relevant database elements. Specifically, we describe how our TableRank method can understand the user's natural query language and provide proper relations and attributes of data records to the user. Through several experiments, it is supported that the TableRank provides accurate database elements related to the user's natural query. We also show that the close distance among relations in the database represents the high data connectivity which guarantees matching with a search query from a user.

  • PDF

Adaptive Image Content-Based Retrieval Techniques for Multiple Queries (다중 질의를 위한 적응적 영상 내용 기반 검색 기법)

  • Hong Jong-Sun;Kang Dae-Seong
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.42 no.3 s.303
    • /
    • pp.73-80
    • /
    • 2005
  • Recently there have been many efforts to support searching and browsing based on the visual content of image and multimedia data. Most existing approaches to content-based image retrieval rely on query by example or user based low-level features such as color, shape, texture. But these methods of query are not easy to use and restrict. In this paper we propose a method for automatic color object extraction and labelling to support multiple queries of content-based image retrieval system. These approaches simplify the regions within images using single colorizing algorithm and extract color object using proposed Color and Spatial based Binary tree map(CSB tree map). And by searching over a large of number of processed regions, a index for the database is created by using proposed labelling method. This allows very fast indexing of the image by color contents of the images and spatial attributes. Futhermore, information about the labelled regions, such as the color set, size, and location, enables variable multiple queries that combine both color content and spatial relationships of regions. We proved our proposed system to be high performance through experiment comparable with another algorithm using 'Washington' image database.

An Object-Based Image Retrieval Techniques using the Interplay between Cortex and Hippocampus (해마와 피질의 상호 관계를 이용한 객체 기반 영상 검색 기법)

  • Hong Jong-Sun;Kang Dae-Seong
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.42 no.4 s.304
    • /
    • pp.95-102
    • /
    • 2005
  • In this paper, we propose a user friendly object-based image retrieval system using the interaction between cortex and hippocampus. Most existing ways of queries in content-based image retrieval rely on query by example or query by sketch. But these methods of queries are not adequate to needs of people's various queries because they are not easy for people to use and restrict. We propose a method of automatic color object extraction using CSB tree map(Color and Spatial based Binary をn map). Extracted objects were transformed to bit stream representing information such as color, size and location by region labelling algorithm and they are learned by the hippocampal neural network using the interplay between cortex and hippocampus. The cells of exciting at peculiar features in brain generate the special sign when people recognize some patterns. The existing neural networks treat each attribute of features evenly. Proposed hippocampal neural network makes an adaptive fast content-based image retrieval system using excitatory learning method that forwards important features to long-term memories and inhibitory teaming method that forwards unimportant features to short-term memories controlled by impression.

Representative Feature Extraction of Objects using VQ and Its Application to Content-based Image Retrieval (VQ를 이용한 영상의 객체 특징 추출과 이를 이용한 내용 기반 영상 검색)

  • Jang, Dong-Sik;Jung, Seh-Hwan;Yoo, Hun-Woo;Sohn, Yong--Jun
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.7 no.6
    • /
    • pp.724-732
    • /
    • 2001
  • In this paper, a new method of feature extraction of major objects to represent an image using Vector Quantization(VQ) is proposed. The principal features of the image, which are used in a content-based image retrieval system, are color, texture, shape and spatial positions of objects. The representative color and texture features are extracted from the given image using VQ(Vector Quantization) clustering algorithm with a general feature extraction method of color and texture. Since these are used for content-based image retrieval and searched by objects, it is possible to search and retrieve some desirable images regardless of the position, rotation and size of objects. The experimental results show that the representative feature extraction time is much reduced by using VQ, and the highest retrieval rate is given as the weighted values of color and texture are set to 0.5 and 0.5, respectively, and the proposed method provides up to 90% precision and recall rate for 'person'query images.

  • PDF

A Study on Semantic Based Indexing and Fuzzy Relevance Model (의미기반 인덱스 추출과 퍼지검색 모델에 관한 연구)

  • Kang, Bo-Yeong;Kim, Dae-Won;Gu, Sang-Ok;Lee, Sang-Jo
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2002.04b
    • /
    • pp.238-240
    • /
    • 2002
  • If there is an Information Retrieval system which comprehends the semantic content of documents and knows the preference of users. the system can search the information better on the Internet, or improve the IR performance. Therefore we propose the IR model which combines semantic based indexing and fuzzy relevance model. In addition to the statistical approach, we chose the semantic approach in indexing, lexical chains, because we assume it would improve the performance of the index term extraction. Furthermore, we combined the semantic based indexing with the fuzzy model, which finds out the exact relevance of the user preference and index terms. The proposed system works as follows: First, the presented system indexes documents by the efficient index term extraction method using lexical chains. And then, if a user tends to retrieve the information from the indexed document collection, the extended IR model calculates and ranks the relevance of user query. user preference and index terms by some metrics. When we experimented each module, semantic based indexing and extended fuzzy model. it gave noticeable results. The combination of these modules is expected to improve the information retrieval performance.

  • PDF

Speech Query Recognition for Tamil Language Using Wavelet and Wavelet Packets

  • Iswarya, P.;Radha, V.
    • Journal of Information Processing Systems
    • /
    • v.13 no.5
    • /
    • pp.1135-1148
    • /
    • 2017
  • Speech recognition is one of the fascinating fields in the area of Computer science. Accuracy of speech recognition system may reduce due to the presence of noise present in speech signal. Therefore noise removal is an essential step in Automatic Speech Recognition (ASR) system and this paper proposes a new technique called combined thresholding for noise removal. Feature extraction is process of converting acoustic signal into most valuable set of parameters. This paper also concentrates on improving Mel Frequency Cepstral Coefficients (MFCC) features by introducing Discrete Wavelet Packet Transform (DWPT) in the place of Discrete Fourier Transformation (DFT) block to provide an efficient signal analysis. The feature vector is varied in size, for choosing the correct length of feature vector Self Organizing Map (SOM) is used. As a single classifier does not provide enough accuracy, so this research proposes an Ensemble Support Vector Machine (ESVM) classifier where the fixed length feature vector from SOM is given as input, termed as ESVM_SOM. The experimental results showed that the proposed methods provide better results than the existing methods.

An Efficient Feature Point Extraction and Comparison Method through Distorted Region Correction in 360-degree Realistic Contents

  • Park, Byeong-Chan;Kim, Jin-Sung;Won, Yu-Hyeon;Kim, Young-Mo;Kim, Seok-Yoon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.24 no.1
    • /
    • pp.93-100
    • /
    • 2019
  • One of critical issues in dealing with 360-degree realistic contents is the performance degradation in searching and recognition process since they support up to 4K UHD quality and have all image angles including the front, back, left, right, top, and bottom parts of a screen. To solve this problem, in this paper, we propose an efficient search and comparison method for 360-degree realistic contents. The proposed method first corrects the distortion at the less distorted regions such as front, left and right parts of the image excluding severely distorted regions such as upper and lower parts, and then it extracts feature points at the corrected region and selects the representative images through sequence classification. When the query image is inputted, the search results are provided through feature points comparison. The experimental results of the proposed method shows that it can solve the problem of performance deterioration when 360-degree realistic contents are recognized comparing with traditional 2D contents.