• Title/Summary/Keyword: Query Extraction


Extraction of Query Information and Generation of Identifier for Effective Component Classification and Retrieval (효율적인 컴포넌트 분류와 검색을 위한 질의정보 추출 및 식별자 생성)

  • Park, Jea-Youn;Song, Young-Jae
    • Proceedings of the Korea Information Processing Society Conference / 2003.05c / pp.1753-1756 / 2003
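  • Component-based software development is being pursued as a way to improve software productivity and quality. To reuse a software component library, reusable components must be efficiently collected, classified, stored, and retrieved. Existing requirements formalization techniques focus on expressing the semantic relationships among requirements and are therefore not well suited to component retrieval. In this study, we instead specify the use case diagram of the system to be developed through syntactic analysis and extract query information from it. Existing informal, natural-language-based component specifications are converted, through parsing and abstraction stages, into a formalized intermediate specification from which the information needed for component retrieval and assembly can be obtained efficiently, and components are then retrieved using the proposed similarity measure. In addition, the overview and detailed specifications yield the information needed for component retrieval, while the aspects of a component provide the information needed for component assembly. A second-round query improves retrieval accuracy, and abstracting the specification improves retrieval recall.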

Query Related Issue Detection using Related Term Extraction (연관 어휘 추출을 통한 질의어 관련 이슈 탐지)

  • Kim, Je-Sang;Kim, Dong-Sung;Jo, Hyo-Geun;Lee, Hyun-Ah
    • Annual Conference on Human and Language Technology / 2013.10a / pp.133-136 / 2013
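  • Recently, much research has addressed detecting issues such as public interests and trends on social network services (SNS) such as Twitter and Facebook. This paper proposes a method for extracting issues or topics related to a search term from Twitter by extracting terms associated with that search term. Expecting highly related words to occur close to each other, we propose a word association measure that grows as the distance between words decreases and as their co-occurrence frequency increases. Terms whose association value exceeds a threshold are treated as related terms, and the related issues are presented in the form of a network.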

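The association measure described in the abstract above can be illustrated with a small sketch. The abstract does not give the formula, so the scoring rule here (summing `1 / distance` over every co-occurrence inside a window) is an assumption chosen only because it grows with both proximity and co-occurrence frequency. The function names, the `window` parameter, and the threshold value are hypothetical.

```python
from collections import defaultdict


def association_scores(tweets, query, window=5):
    """Score each term by proximity-weighted co-occurrence with `query`.

    Assumed rule (not from the paper): every co-occurrence within
    `window` tokens adds 1 / distance to the term's score, so closer
    and more frequent co-occurrences both raise the score.
    """
    scores = defaultdict(float)
    for tokens in tweets:
        query_positions = [i for i, tok in enumerate(tokens) if tok == query]
        for i in query_positions:
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if tokens[j] != query:
                    scores[tokens[j]] += 1.0 / abs(i - j)
    return dict(scores)


def related_terms(scores, threshold):
    """Keep the terms whose association score exceeds the threshold."""
    return sorted(term for term, score in scores.items() if score > threshold)


# Toy, already-tokenized tweets (illustrative data only).
tweets = [
    ["storm", "hits", "seoul"],
    ["seoul", "storm", "warning"],
]
scores = association_scores(tweets, "storm")
print(related_terms(scores, threshold=1.2))
```

The terms that pass the threshold could then be linked to the query term as nodes of a graph, which corresponds to the network presentation the abstract mentions.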

Shape Feature Extraction technique for Content-Based Image Retrieval in Multimedia Databases

  • Kim, Byung-Gon;Han, Joung-Woon;Lee, Jaeho;Haechull Lim
    • Proceedings of the IEEK Conference / 2000.07b / pp.869-872 / 2000
  • Although many content-based image retrieval systems that use shape features have tried to achieve rotation, position, and scale invariance between images, achieving all three kinds of invariance at the same time has been a problem. In this paper, we introduce a new approach for extracting shape features from an image using the MBR (Minimum Bounding Rectangle). The proposed method scans the image to extract MBR information and, based on that information, computes contour information consisting of 16 points. The extracted information is converted to specific values by normalization and rotation. The proposed method can provide all three kinds of invariance at the same time. We implemented our method and carried out experiments: we constructed an R*-tree indexing structure, performed k-nearest neighbor search from a query image, and demonstrated the capability and usefulness of our method.

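As a minimal sketch of the first step named in the abstract above, the following scans a binary image and returns its bounding rectangle. It assumes an axis-aligned rectangle over a mask stored as nested lists of 0/1 values. The paper's 16-point contour extraction, normalization, and rotation steps are not reproduced, and the function name is hypothetical.

```python
def minimum_bounding_rectangle(mask):
    """Return (top, left, bottom, right) of the non-zero pixels in `mask`.

    `mask` is a list of rows of 0/1 values. The rectangle is
    axis-aligned and its bounds are inclusive. Returns None when the
    mask contains no foreground pixel.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))


# A small L-shaped object inside a 4x4 image (illustrative data only).
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
print(minimum_bounding_rectangle(mask))
```

Dividing contour coordinates by the rectangle's width and height is one common way to remove dependence on position and scale, though whether the paper normalizes in exactly this way is not stated in the abstract.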

Query Expansion based on Knowledge Extraction and Latent Dirichlet Allocation for Clinical Decision Support (의학 문서 검색을 위한 지식 추출 및 LDA 기반 질의 확장)

  • Jo, Seung-Hyeon;Lee, Kyung-Soon
    • Annual Conference on Human and Language Technology / 2015.10a / pp.31-34 / 2015
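  • This paper proposes an LDA-based query expansion method for clinical decision support that extracts knowledge using UMLS and Wikipedia and makes use of query type information. The query consists of the symptoms that the patient is experiencing. Using UMLS and Wikipedia, we extract disease names together with the symptoms, diagnostic tests, and treatments associated with each disease. The medical information extracted from UMLS and Wikipedia is then used to identify the disease names related to the query. Based on those disease names, additional symptom, test, and treatment information is selected as expansion terms. In addition, after running LDA, we extract the clusters related to the query from the Word-Topic clusters and the clusters highly related to the initial retrieval results from the Document-Topic clusters. Among the extracted Word-Topic and Document-Topic clusters, we find those that share the same cluster number. Medical terms are then extracted from these Word-Topic clusters and selected as expansion terms. To verify the effectiveness of the proposed method, we carry out a comparative evaluation on the TREC Clinical Decision Support (CDS) 2014 test collection.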

A Study on the Performance Analysis of Entity Name Recognition Techniques Using Korean Patent Literature

  • Gim, Jangwon
    • Journal of Advanced Information Technology and Convergence / v.10 no.2 / pp.139-151 / 2020
  • Entity name recognition is a part of information extraction that extracts entity names from documents and classifies the types of extracted entity names. Entity name recognition technologies are widely used in natural language processing, such as information retrieval, machine translation, and query response systems. Various deep learning-based models exist to improve entity name recognition performance, but studies that compared and analyzed these models on Korean data are insufficient. In this paper, we compare and analyze the performance of CRF, LSTM-CRF, BiLSTM-CRF, and BERT, which are actively used to identify entity names using Korean data. Also, we compare and evaluate whether embedding models, which are variously used in recent natural language processing tasks, can affect the entity name recognition model's performance improvement. As a result of experiments on patent data and Korean corpus, it was confirmed that the BiLSTM-CRF using FastText method showed the highest performance.

COLORNET: Importance of Color Spaces in Content based Image Retrieval

  • Judy Gateri;Richard Rimiru;Micheal Kimwele
    • International Journal of Computer Science & Network Security / v.23 no.5 / pp.33-40 / 2023
  • The mainstay of current image retrieval frameworks is Content-Based Image Retrieval (CBIR). The most distinctive retrieval method involves submitting a query image, after which the system extracts visual characteristics such as shape, color, and texture from the images. Most techniques extract and classify images in the RGB color space because it is the images' default color space and the techniques do not convert the images to another one. To determine the most effective color space for retrieving images, this research discusses the transformation of RGB to different color spaces, feature extraction, and the use of Convolutional Neural Networks for retrieval.

Design and Implementation of Matching Engine for QbSH System Based on Polyphonic Music (다성음원 기반 QbSH 시스템을 위한 매칭엔진의 설계 및 구현)

  • Park, Sung-Joo;Chung, Kwang-Sue
    • Journal of Korea Multimedia Society / v.15 no.1 / pp.18-31 / 2012
  • This paper proposes a matching engine for a query-by-singing/humming (QbSH) system, which retrieves the most similar music information by comparing the input data with feature information extracted from polyphonic music such as MP3 files. The feature sequences transcribed from polyphonic music may contain many errors. To reduce the influence of these errors and improve performance, the matching engine adopts a chroma-scale representation, compensation, and asymmetric DTW (Dynamic Time Warping). The performance of various distance metrics is also investigated. In our experiment, the proposed QbSH system achieves an MRR (Mean Reciprocal Rank) of 0.718 for 1000 singing/humming queries when searching a database of 450 polyphonic music pieces.
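Two notions named in the abstract above, DTW and MRR, can be sketched briefly. The DTW below is the classic symmetric formulation with an absolute-difference local cost, not the asymmetric variant or the chroma-scale compensation used in the paper, whose details the abstract does not give. The function names and toy inputs are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences.

    Uses |x - y| as the local cost and allows match, insertion, and
    deletion steps. This is a textbook symmetric DTW, shown only to
    illustrate the idea of elastic sequence alignment.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # step in a only
                cost[i][j - 1],      # step in b only
                cost[i - 1][j - 1],  # step in both
            )
    return cost[n][m]


def mean_reciprocal_rank(ranks):
    """Average of 1 / rank over all queries.

    `ranks` holds the 1-based rank of the correct item for each query,
    or None when the correct item was not retrieved (contributing 0).
    """
    if not ranks:
        return 0.0
    return sum(1.0 / r for r in ranks if r) / len(ranks)


# A hummed query that holds one note longer than the reference melody.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
# Ranks of the correct song for four queries; the third one missed.
print(mean_reciprocal_rank([1, 2, None, 4]))
```

An MRR of 0.718, as reported in the abstract, means the reciprocal rank of the correct piece averages 0.718 over the queries, so the correct piece is ranked first for a large share of them.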

A Centroid-based Image Retrieval Scheme Using Centroid Situation Vector (Centroid 위치벡터를 이용한 영상 검색 기법)

  • 방상배;남재열;최재각
    • Journal of Broadcast Engineering / v.7 no.2 / pp.126-135 / 2002
  • An image contains various features such as color, shape, texture, and location information. When only one of those features is used to retrieve an image, it is difficult to achieve satisfactory retrieval efficiency, and this happens especially often in very large databases. Therefore, the efficiency of a content-based image retrieval (CBIR) system can be improved by using more features. This paper proposes a technique that uses centroid situation vectors to consider the location of specific colors in an image as well as the color information itself. Centroid situation vectors are calculated for specific colors of the query image. Location similarity is then determined by comparing the distances between the centroid situation vectors extracted from the query image and from the target image in the database. Simulation results show that the proposed method is robust for zoomed-in or zoomed-out images and improves discrimination ability for flipped or rotated images. In addition, the suggested method reduces computational complexity by overlapping information extraction and improves retrieval speed by using an efficient index file.

An Efficient Video Clip Matching Algorithm Using the Cauchy Function (커쉬함수를 이용한 효율적인 비디오 클립 정합 알고리즘)

  • Kim Sang-Hyul
    • Journal of the Institute of Convergence Signal Processing / v.5 no.4 / pp.294-300 / 2004
  • With the development of digital media technologies, various algorithms for video clip matching have been proposed to match video sequences efficiently. A large number of video search methods have focused on frame-wise queries, whereas relatively few algorithms have been presented for video clip matching or video shot matching. In this paper, we propose an efficient algorithm to index video sequences and to retrieve them for a video clip query. To improve the accuracy and performance of video sequence matching, we employ the Cauchy function as a similarity measure between histograms of consecutive frames, which yields high performance compared with conventional measures. The key frames extracted from segmented video shots can be used not only for video shot clustering but also for video sequence matching or browsing, where a key frame is defined as a frame that is significantly different from the previous frames. Experimental results with color video sequences show that the proposed method yields high matching performance and accuracy with a low computational load compared with conventional algorithms.

A visual query database system for the Sample Research DB of the National Health Insurance Service (국민건강보험공단의 표본연구DB를 위한 비주얼 쿼리 데이터베이스 시스템 개발 연구)

  • Cho, Sang-Hoon;Kim, HeeChan;Kang, Gunseog
    • The Korean Journal of Applied Statistics / v.30 no.1 / pp.13-24 / 2017
  • The Sample Cohort DB supplied by the National Health Insurance Service is a valuable resource for statistical studies as well as for health and medical studies. Because the Cohort DB is large, extracting data from it takes significant time and effort. We therefore introduce a database system, called the National Health Insurance Service Cohort DB Extract Tool (NICE Tool), which supports several useful operations for managing the Cohort DB effectively and efficiently. For example, researchers can extract the variables and cases relevant to their study simply by clicking a computer mouse, without any prior knowledge of the SAS DATA step or SQL. We expect that NICE Tool will speed up data extraction and eventually lead to more active use of the Cohort DB for research purposes.