• Title/Summary/Keyword: pattern similarity search (패턴 유사성 검색)

Pattern Analysis-Based Query Expansion for Enhancing Search Convenience (검색 편의성 향상을 위한 패턴 분석 기반 질의어 확장)

  • Jeon, Seo-In;Park, Gun-Woo;Nam, Kwang-Woo;Ryu, Keun-Ho
    • Journal of Korea Society of Industrial Information Systems, v.17 no.2, pp.65-72, 2012
  • As information resources continue to grow, information search systems play a critical role in helping users easily acquire the information they need from the web. Searching the web effectively generally requires enough prior knowledge and skill to identify the right keywords. However, most users search without sufficient prior knowledge and spend a lot of time coming up with keywords related to the information they need. Furthermore, although many search engines support keyword search, they only offer collections of similar words and do not give the user search terms that are precisely related to the keywords. This paper therefore proposes a method that analyzes user query patterns to offer expanded, related search keywords, giving users a system that conveniently supports their information searches.

Recognition of Shape Similarity using Shape Pattern Representation for Design Computation (컴퓨터를 이용한 디자인 프로세스에 있어서 형태패턴의 스키마적 표현을 이용한 건축형태의 유사성 판단에 관한 연구)

  • 차명열
    • Archives of design research, v.15 no.4, pp.337-346, 2002
  • Among the many design processes, such as learning, storing, retrieving, and applying, the process of learning design knowledge is very important for producing creative results that meet design goals in design computation. The computer should have a human-like ability to learn design knowledge. It should recognize not only physical properties but also the high-level design knowledge constructed from those first-level physical properties. High-level design knowledge is recognized in terms of isometric translation relationships. This paper explains the properties of isometric translation and how a computer can recognize high-level shape design knowledge using shape pattern representation.

XML Document Analysis based on Similarity (유사성 기반 XML 문서 분석 기법)

  • Lee, Jung-Won;Lee, Ki-Ho
    • Journal of KIISE:Software and Applications, v.29 no.6, pp.367-376, 2002
  • XML allows users to define elements using arbitrary words and to organize them in a nested structure. These features offer both challenges and opportunities for information retrieval and document management. In this paper, we propose a new methodology for computing similarity that considers XML semantics: the meanings of the elements and the nested structures of XML documents. Using a thesaurus, we generate extended-element vectors that normalize synonyms, compound words, and abbreviations, build a similarity matrix from them, and then compute the similarity between XML elements. We also discover and minimize XML structures using automata (NFA, nondeterministic finite automata, and DFA, deterministic finite automata). We compute the similarity between XML structures using the similarity matrix between elements and the minimized XML structures. Our methodology shows 100% accuracy in identifying the category of real documents from an online bookstore.
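
The thesaurus-based normalization of element names described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the `THESAURUS` table, the function names, and the use of Jaccard overlap in place of the paper's extended-element vectors and similarity matrix are all assumptions made for illustration, and the automata-based structure minimization is not covered.

```python
import xml.etree.ElementTree as ET

# Hypothetical thesaurus: maps synonyms, compound words, and abbreviations
# to one canonical element name. A real system would use a full thesaurus.
THESAURUS = {"writer": "author", "auth": "author",
             "cost": "price", "booktitle": "title"}

def normalized_elements(xml_text):
    """Return the set of canonical element names used in a document."""
    root = ET.fromstring(xml_text)
    return {THESAURUS.get(el.tag.lower(), el.tag.lower()) for el in root.iter()}

def element_similarity(xml_a, xml_b):
    """Jaccard overlap of normalized element names (a stand-in measure)."""
    a, b = normalized_elements(xml_a), normalized_elements(xml_b)
    return len(a & b) / len(a | b)

book_1 = "<book><title>T</title><author>A</author><price>1</price></book>"
book_2 = "<book><BookTitle>T</BookTitle><writer>A</writer><cost>1</cost></book>"
cd = "<cd><artist>X</artist><price>2</price></cd>"

print(element_similarity(book_1, book_2))  # same vocabulary after normalization
print(element_similarity(book_1, cd))      # only <price> is shared
```

Without the normalization step the two bookstore documents would share only the `book` element, which is the problem the thesaurus is meant to address.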

Indexing and Retrieval Mechanism using Variation Patterns of Theme Melodies in Content-based Music Information Retrievals (내용 기반 음악 정보 검색에서 주제 선율의 변화 패턴을 이용한 색인 및 검색 기법)

  • 구경이;신창환;김유성
    • Journal of KIISE:Databases, v.30 no.5, pp.507-520, 2003
  • This paper proposes an automatic method for constructing a theme melody index for large music databases, and an associative content-based music retrieval mechanism that relies mainly on this index to improve response time for users. First, the system automatically extracts the theme melody from a music file using a graphical clustering algorithm based on the similarities between motifs of the music. To place an extracted theme melody in the metric space of an M-tree, we chose the average length variation and the average pitch variation of the theme melody as the major features. To increase the precision of retrieval results, we added a pitch signature and a length signature, which summarize the pitch variation pattern and the length variation pattern of a theme melody, respectively. The associative retrieval mechanism uses the k-nearest-neighbor and range search algorithms of the M-tree to select melodies similar to the user's query melody from the theme melody index. To improve user satisfaction, the retrieval mechanism includes ranking and relevance feedback functions. We also implemented the proposed mechanisms as the essential components of a content-based music retrieval system to verify their usefulness.
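
As a rough illustration of the two indexing features named in this abstract, the sketch below computes an average length variation and an average pitch variation and runs a k-nearest-neighbor lookup. The abstract does not define the features, so taking them as mean absolute differences between consecutive notes is an assumption, as are the function names and the `(MIDI pitch, duration)` note format. A brute-force scan stands in for the M-tree, and the pitch and length signatures are omitted.

```python
import heapq
import math

def variation_features(notes):
    """notes: list of (midi_pitch, duration) with at least two notes.

    Returns (average length variation, average pitch variation), taken here
    as mean absolute differences between consecutive notes (an assumption).
    """
    pairs = list(zip(notes, notes[1:]))
    pitch_var = sum(abs(b[0] - a[0]) for a, b in pairs) / len(pairs)
    length_var = sum(abs(b[1] - a[1]) for a, b in pairs) / len(pairs)
    return (length_var, pitch_var)

def k_nearest(index, query_features, k):
    """Brute-force k-NN over {name: features}; an M-tree would prune this scan."""
    return heapq.nsmallest(
        k, index, key=lambda name: math.dist(index[name], query_features))

index = {
    "stepwise": variation_features([(60, 1), (62, 1), (64, 1)]),
    "leaping": variation_features([(60, 1), (72, 2), (60, 1)]),
}
query = variation_features([(65, 1), (67, 1), (69, 1)])
print(k_nearest(index, query, 1))
```

Because both features are differences between neighboring notes, a melody and its transposition map to the same point, which is why the query above lands on the stepwise theme.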

Construction of Theme Melody Index by Transforming Melody to Time-series Data for Content-based Music Information Retrieval (내용기반 음악정보 검색을 위한 선율의 시계열 데이터 변환을 이용한 주제선율색인 구성)

  • Ha, Jin-Seok;Ku, Kyong-I;Park, Jae-Hyun;Kim, Yoo-Sung
    • The KIPS Transactions:PartD, v.10D no.3, pp.547-558, 2003
  • Because a music melody has features similar to time-series data, the melody is transformed into a time series with normalization and corrections, and the similarity between melodies is defined as the Euclidean distance between the transformed time series. Based on this similarity, the melodies of a music object are clustered, and the representative of each cluster is extracted as one of the theme melodies of the music. To construct the theme melody index, each theme melody is represented as a point in the multidimensional metric space of an M-tree. For retrieval, the user's query melody is transformed into a time series in the same way as in the indexing phase, and a range query search algorithm retrieves the melodies similar to the query melody from the theme melody index. We show the effectiveness of the proposed methods by implementing a prototype system that uses the proposed theme melody index.
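
The pipeline in this abstract (melody to time series, Euclidean distance, range query) can be sketched as follows. The details are assumptions, since the abstract does not specify them: notes are `(MIDI pitch, duration)` pairs, the time series samples the sounding pitch at equal time steps, normalization subtracts the mean pitch, and a linear scan replaces the M-tree. All function names are hypothetical.

```python
import math

def melody_to_series(notes, length=8):
    """Resample a melody to `length` pitch samples and subtract the mean pitch."""
    total = sum(duration for _, duration in notes)
    series = []
    for i in range(length):
        t = (i + 0.5) * total / length  # midpoint of the i-th time step
        elapsed = 0.0
        for pitch, duration in notes:
            elapsed += duration
            if t < elapsed:
                series.append(float(pitch))
                break
        else:
            series.append(float(notes[-1][0]))  # guard against rounding
    mean = sum(series) / len(series)
    return [p - mean for p in series]  # makes the series transposition-invariant

def range_query(index, query_series, radius):
    """Return names of indexed melodies within `radius`, nearest first."""
    hits = [(math.dist(series, query_series), name)
            for name, series in index.items()]
    return [name for dist, name in sorted(hits) if dist <= radius]

index = {
    "theme_a": melody_to_series([(60, 1), (62, 1), (64, 1), (65, 1)]),
    "theme_b": melody_to_series([(60, 2), (60, 2)]),
}
# The query is theme_a transposed up two semitones.
query = melody_to_series([(62, 1), (64, 1), (66, 1), (67, 1)])
print(range_query(index, query, radius=1.0))
```

Resampling to a fixed length is what lets melodies of different note counts be compared as points of the same dimension, which an M-tree index also requires.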

Incremental Clustering of XML Documents based on Similar Structures (유사 구조 기반 XML 문서의 점진적 클러스터링)

  • Hwang Jeong Hee;Ryu Keun Ho
    • Journal of KIISE:Databases, v.31 no.6, pp.699-709, 2004
  • XML is increasingly important in data exchange and information management. The starting point for retrieving document structure and integrating documents efficiently is to cluster documents that have similar structures, because documents can then be retrieved more flexibly and faster than by processing the whole collection of differently structured documents. In this paper, we therefore propose an incremental clustering method based on similar structures that is useful for retrieving the structure of XML documents and integrating them. As a novel approach, we use a clustering algorithm for transactional data that can handle large volumes of data, which differs considerably from existing methods that measure the similarity between documents using vectors. We first extract the representative structures of XML documents using a sequential pattern algorithm, and then cluster documents by similar structure, treating each document as a transaction and its representative structures as the items of the transaction. In addition, we define cluster cohesion and inter-cluster similarity, and analyze the efficiency of the proposed method by comparing it experimentally with the existing method.
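
The idea of treating each document as a transaction whose items are its structures can be sketched as below. This is a simplified stand-in, not the paper's algorithm: root-to-element tag paths replace the representative structures found by sequential pattern mining, a Jaccard threshold replaces the cluster cohesion and inter-cluster similarity criteria, and all names and the threshold value are assumptions.

```python
import xml.etree.ElementTree as ET

def path_items(xml_text):
    """Return the set of root-to-element tag paths (the transaction's items)."""
    items = set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        items.add(path)
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return items

def jaccard(a, b):
    return len(a & b) / len(a | b)

def incremental_cluster(documents, threshold=0.5):
    """Assign each arriving document to its most similar cluster, or open a new one."""
    clusters = []  # each cluster: {"items": set of paths, "members": [doc ids]}
    for doc_id, xml_text in enumerate(documents):
        items = path_items(xml_text)
        best = max(clusters, key=lambda c: jaccard(c["items"], items), default=None)
        if best is not None and jaccard(best["items"], items) >= threshold:
            best["members"].append(doc_id)
            best["items"] |= items  # the cluster profile grows with each member
        else:
            clusters.append({"items": set(items), "members": [doc_id]})
    return [c["members"] for c in clusters]

docs = [
    "<book><title/><author/></book>",
    "<book><title/><author/><price/></book>",
    "<cd><artist/><track/></cd>",
]
print(incremental_cluster(docs))
```

Each document is compared only against cluster profiles, never against every earlier document, which is what makes the procedure incremental.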

A Study of Practical Search System for Information Retrieval on the Web (웹 상의 정보검색을 위한 지능형 검색시스템의 연구)

  • Park, Beung-Raul;Lim, Jong-Tae
    • Proceedings of the Korea Information Processing Society Conference, 2002.11c, pp.1737-1740, 2002
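  • The search system is a composite system built by combining a classification system with a knowledge discovery system, and it gives general users priority access to data on the information they work with. On the surface it resembles an ordinary search engine, but internally it incorporates the various required functions, search techniques, and knowledge discovery techniques. The system uses a document classification technique, a method for finding associations between documents and search terms, and a search-pattern discovery technique based on sequential events across documents. Together these bring the system's search and classification results closer to artificial intelligence than before.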

XML Document Clustering Based on Sequential Pattern (순차패턴에 기반한 XML 문서 클러스터링)

  • Hwang, Jeong-Hee;Ryu, Keun-Ho
    • The KIPS Transactions:PartD, v.10D no.7, pp.1093-1102, 2003
  • As Internet use grows, the amount of information is increasing rapidly, and XML, a standard for web data, offers flexible data representation. Web-based electronic document systems such as EDMS (Electronic Document Management System) and ebXML (e-business extensible Markup Language) have therefore been adopting XML as the method for exchanging documents and as the document standard. Research is thus needed on methods that can manage and search structured XML documents effectively. In this paper, we propose a clustering method based on the structural similarity among many XML documents, using typical structures extracted from each document by sequential pattern mining in a pre-clustering process. The proposed algorithm improves clustering accuracy by computing a cost that considers cluster cohesion and inter-cluster similarity.

Inflow Forecasting Using Fuzzy-Grey Model (Fuzzy-Grey 모형을 이용한 유입량 예측)

  • Kim, Yong;Yi, Choong Sung;Kim, Hung Soo;Shim, Myung Pil
    • Proceedings of the Korea Water Resources Association Conference, 2004.05b, pp.759-764, 2004
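  • This study forecasts the monthly inflow to Seomjingang Dam using the Grey model proposed by Deng (1989) and presents the method. Compared with time-series and other models, the Grey model has the advantages of requiring relatively few data and consisting of simple formulas, but because it uses so few data, the increasing or decreasing trend of the input data easily produces errors. To overcome this forecasting error, we constructed a Fuzzy-Grey model that incorporates a fuzzy system, and used a genetic algorithm (GA), an optimization technique, to estimate the parameters the fuzzy system requires. The fuzzy system coupled with the Grey model infers the forecasting error of the current input data from the past data whose pattern is most similar to that of the current input. To infer the error, patterns similar to the current input are retrieved from past monthly inflow data using the Grey relational grade; a norm is used to select the patterns with higher similarity, and the search space of the genetic algorithm is restricted. Using the resulting Fuzzy-Grey model, we forecast the monthly inflow to Seomjingang Dam for 1992, 1988, and 2001, which were nationwide drought years. Errors of similar magnitude occurred, in the order 1982, 2001, and 1988. Analysis of the results shows that errors were large when monthly inflow changed abruptly, but the model is judged to have predicted the trend of monthly inflow fairly well despite the high uncertainty of monthly inflow in drought years. Because the Fuzzy-Grey model applied here forecasts from few data and feeds each forecast back in as input data, an error that is not fully corrected carries over to the next result, which shows that error correction is very important. To correct errors more effectively, increasing the number of fuzzy rules used in fuzzy control and using a two-variable Fuzzy-Grey model linked with rainfall, which directly affects inflow, should enable more accurate inflow forecasting.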
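
For orientation, the Grey model that this abstract builds on is commonly used in its GM(1,1) form, which fits an exponential trend to the running sum of a short series. The sketch below assumes that form (the abstract does not state which variant the authors used) and covers only the base forecast, not the fuzzy error correction, the Grey relational pattern search, or the genetic algorithm. The function name and the sample values are illustrative, not data from the study.

```python
import math

def gm11_forecast(series, steps=1):
    """Fit GM(1,1) to a short positive series and forecast `steps` values ahead."""
    n = len(series)
    # 1. Accumulated generating operation: the running sum smooths the raw data.
    x1, total = [], 0.0
    for value in series:
        total += value
        x1.append(total)
    # 2. Background values: means of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = series[1:]
    # 3. Least squares for a (development coefficient) and b (grey input)
    #    in the grey equation series[k] + a * z[k] = b.
    m = n - 1
    s_z, s_y = sum(z), sum(y)
    s_zz = sum(v * v for v in z)
    s_zy = sum(zv * yv for zv, yv in zip(z, y))
    det = m * s_zz - s_z * s_z
    a = (s_z * s_y - m * s_zy) / det
    b = (s_zz * s_y - s_z * s_zy) / det

    # 4. Time response of the accumulated series; differencing restores
    #    the original scale. Fails if a == 0 (a perfectly flat series).
    def x1_hat(k):
        return (series[0] - b / a) * math.exp(-a * k) + b / a

    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]

# Illustrative series growing by 10% per step (not inflow data from the paper).
print(gm11_forecast([100.0, 110.0, 121.0, 133.1], steps=2))
```

The model needs only a handful of observations, which matches the advantage the abstract cites, and it extrapolates whatever trend those few points carry, which is the source of the error that the fuzzy correction is meant to reduce.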

Clustering of Web Objects with Similar Popularity Trends (유사한 인기도 추세를 갖는 웹 객체들의 클러스터링)

  • Loh, Woong-Kee
    • The KIPS Transactions:PartD, v.15D no.4, pp.485-494, 2008
  • Huge numbers of web items such as keywords, images, and web pages are being made widely available on the Web. The popularity of such items changes continuously over time, and mining temporal patterns in their popularity is an important problem that is useful for several web applications. For example, temporal patterns in the popularity of search keywords help web search enterprises predict future popular keywords, enabling them to make pricing decisions when marketing search keywords to advertisers. However, the presence of millions of web items makes it difficult to scale up previous techniques for this problem. This paper proposes an efficient method for mining temporal patterns in the popularity of web items. We treat the popularity of each web item as a time series and propose the gap measure to quantify the similarity between the popularities of two web items. To reduce the computational overhead of this measure, we present an efficient method using the Fast Fourier Transform (FFT). We assume that the popularities of web items do not necessarily follow any probability distribution and are not necessarily periodic. To find clusters of web items with similar popularity trends, we propose using a density-based clustering algorithm based on the gap measure. Our experiments using the popularity trends of search keywords obtained from the Google Trends web site illustrate the scalability and usefulness of the proposed approach in real-world applications.