• Title/Summary/Keyword: Similarity retrieval

A Semantic Similarity Measure for Retrieving Software Components (소프트웨어 부품의 검색을 위한 의미 유사도 측정)

  • Kim, Tae-Hee;Kang, Moon-Seol
    • The Transactions of the Korea Information Processing Society / v.3 no.6 / pp.1443-1452 / 1996
  • In this paper, we propose a semantic similarity measure for reusable software components. It aims to automate the classification of reusable components stored in a software library and to provide an efficient method for retrieving components that satisfy the user's requirements. We identified facets that represent component characteristics by extracting information from component descriptions written in natural language, composed software component identifiers from the automatically extracted terms corresponding to each facet, and stored the classified components in the nearest locations according to their semantic similarity. To retrieve components satisfying a user's requirements, we measured the semantic similarity between queries and the components stored in the software library. Using semantic similarity for retrieval, we could not only retrieve the set of components satisfying the user's queries but also reduce the retrieval time. We further improved overall retrieval efficiency by ranking the retrieved components by relevance according to the degree of query satisfaction.


A Study on the Optimal Search Keyword Extraction and Retrieval Technique Generation Using Word Embedding (워드 임베딩(Word Embedding)을 활용한 최적의 키워드 추출 및 검색 방법 연구)

  • Jeong-In Lee;Jin-Hee Ahn;Kyung-Taek Koh;YoungSeok Kim
    • Journal of the Korean Geosynthetics Society / v.22 no.2 / pp.47-54 / 2023
  • In this paper, we propose a technique for optimal search keyword extraction and retrieval for news article classification. The proposed technique was verified using the example of identifying trends related to North Korean construction. BigKinds, a representative Korean media platform, was used to select sample articles and extract keywords. The extracted keywords were vectorized using word embedding, and the similarity between them was examined using cosine similarity. In addition, words with a similarity of 0.5 or higher were clustered based on the top 10 frequencies. Following the BigKinds search form, keywords inside a cluster were joined with 'OR' and clusters were joined with 'AND'. In-depth analysis confirmed that meaningful articles appropriate for the original purpose were extracted. This paper is significant in that it makes it possible to classify news articles suited to a user's specific purpose without modifying the existing classification system and search form.
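
The clustering and query-building steps described in this abstract can be illustrated with a minimal Python sketch. The toy vectors, the greedy grouping rule, and the function names are assumptions made for illustration; the abstract does not specify how the paper forms its clusters beyond the 0.5 cosine threshold and the OR/AND search form.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def cluster_keywords(vectors, threshold=0.5):
    """Greedy grouping (an assumption): a keyword joins the first cluster
    whose first member it resembles with cosine >= threshold."""
    clusters = []
    for word, vec in vectors.items():
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(word)
                break
        else:
            clusters.append([word])
    return clusters

def build_query(clusters):
    """OR inside each cluster, AND between clusters."""
    return " AND ".join("(" + " OR ".join(c) + ")" for c in clusters)

# Hypothetical 2-dimensional embeddings, not taken from the paper.
vectors = {
    "construction": [1.0, 0.1],
    "building": [0.9, 0.2],
    "Pyongyang": [0.1, 1.0],
    "DPRK": [0.2, 0.9],
}
clusters = cluster_keywords(vectors)
query = build_query(clusters)
print(query)  # (construction OR building) AND (Pyongyang OR DPRK)
```

With real embeddings the vectors would come from a trained word-embedding model; only the thresholding and the boolean query assembly are shown here.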

A framework for similarity recognition of CAD models

  • Zehtaban, Leila;Elazhary, Omar;Roller, Dieter
    • Journal of Computational Design and Engineering / v.3 no.3 / pp.274-285 / 2016
  • A designer is mainly supported by two essential factors in design decisions: intelligence and experience, which aid the designer in predicting the interconnections between the required design parameters. Through classification of product data and similarity recognition between new and existing designs, it is partially possible to substitute for the experience that an inexperienced designer lacks. In this context, this paper presents a framework for recognition and flexible retrieval of similar models in product design. The idea is to establish an infrastructure for transferring design know-how as well as the required PLM (Product Lifecycle Management) know-how to the design phase of product development in order to reduce design time. Furthermore, such a method can also serve as a brainstorming aid for new and creative product development. The proposed framework has been tested and benchmarked, with promising results.

Performance Improvement of Web Information Retrieval Using Sentence-Query Similarity (문장-질의 유사성을 이용한 웹 정보 검색의 성능 향상)

  • Park Eui-Kyu;Ra Dong-Yul;Jang Myung-Gil
    • Journal of KIISE:Software and Applications / v.32 no.5 / pp.406-415 / 2005
  • The growth of the Internet has led to a web containing a huge number of documents. Increasing importance is therefore given to web information retrieval technology that can provide users with documents containing the information they want. This paper proposes several techniques that are effective for improving web information retrieval. Similarity between a document and the query is a major source of information exploited by conventional systems. However, we suggest a technique that makes use of the similarity between a sentence and the query. We introduce a technique to compute an approximate sentence-query similarity score even without mature natural language processing technology. The amount of computation for this task was shown to be linear in the number of documents in the total collection, which implies that practical systems can make use of this technique. The next important technique proposed in this paper is to use stratification of documents when re-ranking the documents to output, which was shown to lead to significant improvement in performance. We furthermore showed that using hyperlinks, anchor texts, and titles can enhance performance. To justify the proposed techniques, we developed a large-scale web information retrieval system and used it for experiments.
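
The abstract does not say how the approximate sentence-query score is computed. As a rough sketch of the general idea only, the Python below scores each sentence by the fraction of query terms it contains and keeps the best sentence per document. The tokenizer, the sentence splitter, and the scoring formula are all assumptions rather than the authors' method.

```python
import re

def tokenize(text):
    """Lowercase alphanumeric tokens; a stand-in for real preprocessing."""
    return re.findall(r"[a-z0-9]+", text.lower())

def sentence_query_score(document, query):
    """Best per-sentence fraction of query terms covered (assumed formula)."""
    query_terms = set(tokenize(query))
    if not query_terms:
        return 0.0
    # Naive split on sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", document)
    best = 0.0
    for sentence in sentences:
        overlap = len(query_terms & set(tokenize(sentence)))
        best = max(best, overlap / len(query_terms))
    return best

query = "web retrieval similarity"
doc_a = "Similarity search is common. Web retrieval ranks documents by query similarity."
doc_b = "Web pages are many. Retrieval is hard. Query logs grow. Similarity varies."

score_a = sentence_query_score(doc_a, query)  # one sentence holds all three terms
score_b = sentence_query_score(doc_b, query)  # terms are spread over sentences
print(score_a, score_b)
```

Both documents contain every query term, so a document-level match cannot separate them, while the sentence-level score prefers the document where the terms co-occur in one sentence. Each document is scanned once, which fits the linear cost mentioned in the abstract.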

The Weight Decision of Multi-dimensional Features using Fuzzy Similarity Relations and Emotion-Based Music Retrieval (퍼지 유사관계를 이용한 다차원 특징들의 가중치 결정과 감성기반 음악검색)

  • Lim, Jee-Hye;Lee, Joon-Whoan
    • Journal of the Korean Institute of Intelligent Systems / v.21 no.5 / pp.637-644 / 2011
  • Now that music is digitized, it can be easily purchased and delivered to users. However, it is still difficult to find music that fits someone's taste using traditional music information search based on musician, genre, title, album title, and so on. To reduce this difficulty, content-based and emotion-based music retrieval have been proposed and developed. In this paper, we propose a new method to determine the importance of MPEG-7 low-level audio descriptors, which are multi-dimensional vectors, for emotion-based music retrieval. We measured the mutual similarities of music pieces that represent a pair of emotions with opposite meanings in terms of each multi-dimensional descriptor. Then the rough approximation and the inter- and intra-similarity ratio obtained from the similarity relation are used to determine the importance of each descriptor. The set of weights based on this importance defines the aggregated similarity measure by which emotion-based music retrieval is achieved. In an experiment on emotion-based retrieval built on content-based search, the proposed method gives better results than the previous method in terms of the average number of satisfactory music pieces.
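
As a simplified illustration of weighting descriptors by an intra/inter similarity ratio and then aggregating, consider the Python sketch below. The ratio formula, the normalization, the descriptor names, and the numbers are assumptions; the rough-approximation step mentioned in the abstract is not modeled.

```python
def mean(values):
    return sum(values) / len(values)

def descriptor_weights(intra, inter):
    """Weight each descriptor by (mean within-emotion similarity) /
    (mean between-emotion similarity), normalized to sum to 1.
    This ratio is an assumed stand-in for the paper's importance measure."""
    ratios = {d: mean(intra[d]) / mean(inter[d]) for d in intra}
    total = sum(ratios.values())
    return {d: r / total for d, r in ratios.items()}

def aggregated_similarity(per_descriptor_sim, weights):
    """Weighted sum of the per-descriptor similarities of two pieces."""
    return sum(weights[d] * per_descriptor_sim[d] for d in weights)

# Hypothetical similarities in [0, 1] for two unnamed descriptors.
intra = {"descriptor_a": [0.9, 0.9], "descriptor_b": [0.6, 0.6]}
inter = {"descriptor_a": [0.3, 0.3], "descriptor_b": [0.6, 0.6]}

weights = descriptor_weights(intra, inter)
score = aggregated_similarity({"descriptor_a": 0.8, "descriptor_b": 0.4}, weights)
print(weights, score)
```

Here `descriptor_a` separates the two opposite emotions well (high similarity within an emotion, low across), so it receives the larger weight, while `descriptor_b` does not discriminate and contributes less to the aggregated score.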

Performance Improvement of Image Retrieval System by Presenting Query based on Human Perception (인간의 인지도에 근거한 질의를 통한 영상 검색의 성능 향상)

  • 유헌우;장동식;오근태
    • Journal of KIISE:Computing Practices and Letters / v.9 no.2 / pp.158-165 / 2003
  • Image similarity is often decided by computing the distance between two feature vectors. Unfortunately, the feature vector cannot always reflect the notion of similarity in human perception. Therefore, most current image retrieval systems use weights measuring the importance of each feature. In this paper, new initial weight selection and update rules are proposed for image retrieval. To this end, database images are first divided into groups based on human perception, inner and outer queries are performed, and then optimal feature weights for each database image are computed by searching for the group to which the result images among the retrieved images belong. Experimental results on 2000 images show the performance of the proposed algorithm.
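
To show why per-feature weights matter, the following Python sketch ranks images by a weighted Euclidean distance and shows that changing the weights changes the ranking. The feature values and weights are hypothetical, and the paper's weight selection and update rules are not reproduced.

```python
import math

def weighted_distance(x, y, weights):
    """Euclidean distance with a non-negative weight per feature."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))

def rank(query, database, weights):
    """Image names ordered from most to least similar to the query."""
    return sorted(database, key=lambda name: weighted_distance(query, database[name], weights))

# Hypothetical 2-feature vectors (for example one color and one texture feature).
database = {"img1": [0.9, 0.5], "img2": [0.5, 0.8]}
query = [0.5, 0.5]

equal_ranking = rank(query, database, [1.0, 1.0])   # img2 is closer
tuned_ranking = rank(query, database, [0.2, 1.0])   # first feature down-weighted
print(equal_ranking, tuned_ranking)
```

With equal weights `img2` ranks first; once the first feature is down-weighted, `img1` overtakes it. A perception-based scheme would aim to choose weights so that the ranking agrees with human grouping.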

Adaptive User Profile for Information Retrieval from the Web

  • Srinil, Phaitoon;Pinngern, Ouen
    • 제어로봇시스템학회:학술대회논문집 (Institute of Control, Robotics and Systems: Conference Proceedings) / 2003.10a / pp.1986-1989 / 2003
  • This paper proposes an improvement to information retrieval for the Web that uses the structure and hyperlinks of HTML documents along with a user profile. The method is based on the rationale that terms appearing in different structural parts of a document may have different significance in identifying the document. The method partitions the occurrences of terms in a document collection into six classes according to the tags in which the terms occur (such as Title, H1-H6 and Anchor). We use a genetic algorithm to determine class importance values and to expand the user query. We also use these values in similarity computation and to update the user profile. A genetic algorithm is then used again to select terms from the user profile to expand the original query. Lastly, the search engine uses the expanded query for searching, and its results are scored by the similarity between each result and the user profile. The vector space model is used, and the weighting schemes of traditional information retrieval are extended to include class importance values. Test results show a precision of up to 81.5%.
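
The idea of extending term weights with class importance values can be sketched in Python as follows. The four class names and their fixed importance values are placeholders: the paper uses six classes and tunes the values with a genetic algorithm, neither of which is reproduced here.

```python
import math

# Hypothetical class importance values; the paper learns these with a genetic algorithm.
CLASS_IMPORTANCE = {"title": 3.0, "heading": 2.0, "anchor": 1.5, "body": 1.0}

def weighted_tf(term_occurrences):
    """Sum class importance over every (term, tag_class) occurrence."""
    vector = {}
    for term, tag_class in term_occurrences:
        vector[term] = vector.get(term, 0.0) + CLASS_IMPORTANCE[tag_class]
    return vector

def cosine(u, v):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# A toy document: "retrieval" occurs in the title and body, "genetic" in the body.
doc = [("retrieval", "title"), ("retrieval", "body"), ("genetic", "body")]
doc_vec = weighted_tf(doc)
profile_vec = {"retrieval": 1.0}
score = cosine(doc_vec, profile_vec)
print(doc_vec, score)
```

A term in the title counts three times as much as the same term in the body, so documents that mention a profile term in prominent positions score higher in the vector space comparison.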


VRTEC : Multi-step Retrieval Model for Content-based Video Query (VRTEC : 내용 기반 비디오 질의를 위한 다단계 검색 모델)

  • 김창룡
    • Journal of the Korean Institute of Telematics and Electronics T / v.36T no.1 / pp.93-102 / 1999
  • In this paper, we propose a data model and a retrieval method for content-based video query. After a video is partitioned into frame sets of the same length, called video-windows, each video-window can be mapped to a point in a multidimensional space. A video can then be represented as a trajectory by connecting neighboring video-windows in the multidimensional space. The similarity between two video-windows is defined as the Euclidean distance between two points in the multidimensional space, and the similarity between two video segments of arbitrary length is obtained by comparing the corresponding trajectories. A new retrieval method with filtering and refinement steps is developed, which returns correct results and increases retrieval speed by approximately 4.7 times compared with a method without the filtering and refinement steps.
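
A filter-and-refine search over trajectories can be sketched in Python as below. The trajectory distance (mean distance between corresponding windows of equal-length segments) and the filter (a lower bound computed from the first coordinate only) are assumptions chosen for illustration; the abstract does not describe the paper's actual filter.

```python
import math

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def trajectory_distance(t1, t2):
    """Mean distance between corresponding video-windows (equal lengths assumed)."""
    return sum(euclidean(p, q) for p, q in zip(t1, t2)) / len(t1)

def search(query, videos, eps):
    """Return names of trajectories within eps of the query."""
    results = []
    for name, trajectory in videos.items():
        # Filter: a single-coordinate difference never exceeds the Euclidean
        # distance, so this mean is a lower bound on the full distance and
        # discarding on it cannot drop a true match.
        lower = sum(abs(p[0] - q[0]) for p, q in zip(query, trajectory)) / len(query)
        if lower > eps:
            continue
        # Refinement: compute the exact distance for the survivors.
        if trajectory_distance(query, trajectory) <= eps:
            results.append(name)
    return results

# Hypothetical 2-dimensional video-window points.
query = [(0.0, 0.0), (1.0, 1.0)]
videos = {
    "a": [(0.0, 0.1), (1.0, 1.1)],  # true match
    "b": [(5.0, 5.0), (6.0, 6.0)],  # removed by the filter
    "c": [(0.0, 3.0), (1.0, 4.0)],  # passes the filter, removed by refinement
}
matches = search(query, videos, eps=0.5)
print(matches)  # ['a']
```

Because the filter is a lower bound, the result set equals that of an exhaustive scan, while the exact distance is only computed for candidates that survive the cheap test.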


Image Retrieval Using Entropy-Based Image Segmentation (엔트로피에 기반한 영상분할을 이용한 영상검색)

  • Jang, Dong-Sik;Yoo, Hun-Woo;Kang, Ho-Jueng
    • Journal of Institute of Control, Robotics and Systems / v.8 no.4 / pp.333-337 / 2002
  • A content-based image retrieval method using color, texture, and shape features is proposed in this paper. A region segmentation technique using PIM (Picture Information Measure) entropy is used for similarity indexing. For segmentation, a color image is first transformed to a gray image and divided into $n \times n$ non-overlapping blocks. Entropy using PIM is obtained from each block. Adequate variance to perform good segmentation of images in the database is obtained heuristically. As variance increases up to some bound, objects within the image can be easily segmented from the background. Therefore, variance is a good indicator of adequate image segmentation. A high-variance image is segmented into two regions, a high-entropy region and a low-entropy region. In the high-entropy region, hue-saturation-intensity and Canny edge histograms are used for image similarity calculation. An image with lower variance is well represented by global texture information. Experiments show that the proposed method displayed similar images at an average rank of 4th in the top-10 retrieval case.
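
The per-block PIM computation can be sketched in Python, assuming the common definition of PIM as the number of pixels in a block minus the count of its most frequent gray level, and assuming the variance in the abstract is taken over the per-block PIM values. Both assumptions, the block size, and the toy image are illustrative only.

```python
from collections import Counter

def pim(block):
    """PIM of a flat list of gray levels: pixel count minus the largest
    histogram bin (assumed definition). A uniform block gives 0."""
    counts = Counter(block)
    return len(block) - max(counts.values())

def block_pims(image, n):
    """PIM of every non-overlapping n x n block, scanned row by row."""
    height, width = len(image), len(image[0])
    values = []
    for r in range(0, height - height % n, n):
        for c in range(0, width - width % n, n):
            block = [image[r + i][c + j] for i in range(n) for j in range(n)]
            values.append(pim(block))
    return values

def variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

# Hypothetical 4 x 4 gray image split into 2 x 2 blocks.
image = [
    [5, 5, 1, 2],
    [5, 5, 3, 4],
    [7, 7, 7, 7],
    [7, 7, 8, 7],
]
pims = block_pims(image, 2)
print(pims, variance(pims))  # [0, 3, 0, 1] 1.5
```

Flat blocks score 0 and busy blocks score high, so a large spread of block scores suggests the image contains both textured objects and a smooth background that can be separated.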

Image Retrieval Scheme using Spatial Similarity and Annotation (공간 유사도와 주석을 이용한 이미지 검색 기법)

  • 이수철;황인준
    • Journal of KIISE:Databases / v.30 no.2 / pp.134-144 / 2003
  • Spatial relationships among objects are one of the important ingredients for expressing the constraints of an image in image or multimedia retrieval systems. In this paper, we propose a unified image retrieval scheme using spatial relationships among objects and their features. The proposed scheme is especially effective in computing the similarity between a query image and images in the database. Objects and their spatial relationships are also captured and annotated in XML, which can give better precision and flexibility in retrieving images from the database. Finally, we implemented a prototype system for retrieving images based on the proposed technique and present some experimental results.