Search | Korea Science

Design and Implementation of an XQuery Builder (XQuery 작성기 설계 및 구현)

김태권
- Proceedings of the Korean Information Science Society Conference / 2004.10b / pp.22-24 / 2004
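XML can effectively organize and represent structured and semi-structured data as well as relational data. XQuery is a query language for retrieving needed information from structured XML data. Unlike SQL, which operates on flat tables, XQuery queries are difficult to write without knowledge of the data's internal structure. This paper designs and implements an XQuery builder that displays the structure of the target XML data as a tree and lets the user specify the required path expressions, making queries easier to write.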
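The idea in this abstract (show the document structure as a tree, then let the user pick a path expression) can be sketched as follows. This is a minimal illustration, not the paper's builder: the sample document, the function names `collect_paths` and `build_query`, and the file name `library.xml` are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; any well-formed XML would do.
SAMPLE = """<library>
  <book><title>A</title><author>X</author></book>
  <book><title>B</title><author>Y</author></book>
</library>"""


def collect_paths(xml_text):
    """Return the distinct element paths of the document, in first-seen order."""
    root = ET.fromstring(xml_text)
    paths = []

    def walk(node, prefix):
        path = f"{prefix}/{node.tag}"
        if path not in paths:
            paths.append(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths


def build_query(path, doc_name="library.xml"):
    """Turn a selected path into a simple XQuery FLWOR expression (as a string)."""
    return f'for $x in doc("{doc_name}"){path} return $x'


if __name__ == "__main__":
    # Print the structure as an indented tree, one line per distinct path.
    for p in collect_paths(SAMPLE):
        depth = p.count("/") - 1
        print("  " * depth + p.rsplit("/", 1)[-1])
    print(build_query("/library/book/title"))
```

The user never types the path by hand: it is chosen from the list that `collect_paths` derives from the data, which is the difficulty the abstract says the builder addresses.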

KTARSQI: The Annotation of Temporal and Event Expressions in Korean Text (KTARSQI: 한국어 텍스트의 시간 및 사건 표현 주석)

Im, Seohyun;Kim, Yoon-Shin;Jo, Yoomi;Jang, Hayun;Ko, Minsoo;Nam, Seungho;Shin, Hyopil
- Annual Conference on Human and Language Technology / 2009.10a / pp.130-135 / 2009
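Extracting information about time and events is an important part of natural language processing applications such as information extraction and question-answering systems. Nevertheless, this line of research has not yet begun in earnest for Korean natural language processing applications. Building on the results of the U.S. TARSQI project, the KTARSQI project was started in 2008 with the goal of developing a specification language (KTimeML), an annotated corpus (KTimeBank), and an automatic tagging system (KTarsqi Toolkit: KTTK) for the annotation, extraction, and inference of temporal and event expressions in Korean text. This paper gives an overall introduction to the goals and tasks of the KTARSQI project and, as the result of the work done so far, discusses the specification and annotation of event tags.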

Indexing and Storage Schemes for Keyword-based Query Processing over Semantic Web Data (시맨틱 웹 데이터의 키워드 질의 처리를 위한 인덱싱 및 저장 기법)

Kim, Youn-Hee;Shin, Hye-Yeon;Lim, Hae-Chull;Chong, Kyun-Rak
- Journal of the Korea Society of Computer and Information / v.12 no.5 / pp.93-102 / 2007
Metadata and ontologies make it possible to retrieve related information on the Semantic Web more accurately and simply through inference. RDF and RDF Schema are general languages for representing metadata and ontologies. Handling the enormous number of keywords on the Semantic Web is important for practical applications because most users prefer to search with keywords. In this paper, we treat a resource as the unit of a query result. We classify queries with keyword conditions into three patterns and propose indexing techniques for keyword search that consider both metadata and ontology. Our index maintains resources that contain a keyword indirectly, through conceptual relationships between resources, as well as resources that contain it directly. Thus, when a user searches for resources that contain a certain keyword, all such resources are retrieved using our keyword index. We also propose a table structure for storing RDF Schema information that is labeled using some simple methods.

Adaptive Path Index for Efficient XML Query Processing (효율적인 XML 질의 처리를 위한 적응형 경로 인덱스)

민준기;심규석;정진완
- Journal of KIISE:Databases / v.31 no.1 / pp.61-71 / 2004
XML can describe a wide range of data, from regular to irregular and from flat to deeply nested. Because it supports efficient data exchange and integration, XML is rapidly emerging as the de facto standard format for Web documents. Several XML query languages have been proposed to retrieve data represented in XML. XML query languages such as XPath and XQuery use path expressions to traverse irregularly structured data composed of XML elements. Various path indexes have been proposed to evaluate path expressions. However, traditional path indexes are constructed using only the XML data structure. In this paper, we therefore propose an adaptive path index that uses query workloads as well as the XML data structure. To improve query performance, the adaptive path index manages the frequently used paths and the structural summary of the XML data using a hash tree and a graph structure. Experimental results show that the adaptive path index typically improves query performance by a factor of 2 to 69 compared with the existing indexes.
KSCI

An Efficient Storage Schema Construction and Retrieval Technique for Querying OWL Data (OWL 데이타 검색을 위한 효율적인 저장 스키마 구축 및 질의 처리 기법)

Woo, Eun-Mii;Park, Myung-Jae;Chung, Chin-Wan
- Journal of KIISE:Databases / v.34 no.3 / pp.206-216 / 2007
For the Semantic Web, which was proposed to overcome the limitations of the Web, OWL has been recommended as the ontology language used to give well-defined meaning to diverse data. OWL is the representative ontology language suggested by W3C. Efficient retrieval of OWL data requires a well-constructed storage schema. In this paper, we propose a storage schema construction technique that supports more efficient query processing. A retrieval technique corresponding to the proposed storage schema is also introduced. OWL data includes inheritance information of classes and properties, so hierarchy information must be considered when OWL data is extracted. For this reason, an additional XML document is created to preserve hierarchy information and is stored in an XML database system. An existing numbering scheme is used to extract ancestor/descendant relationships, and the order information of nodes is added as attribute values of elements in the XML document. Thus, subclasses and subproperties can be retrieved quickly and easily. The improved query performance observed in experiments shows the effectiveness of the proposed storage schema construction and retrieval method.
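The abstract does not say which numbering scheme is used. As one common possibility, the sketch below assumes interval (start, end) numbering assigned by depth-first traversal, where a node is a descendant of another exactly when its interval lies strictly inside the other's. The class hierarchy and the names `number_tree`, `is_descendant`, and `subclasses` are hypothetical.

```python
def number_tree(children, root):
    """Assign each node a (start, end) interval by depth-first traversal."""
    labels = {}
    counter = 0

    def visit(node):
        nonlocal counter
        counter += 1
        start = counter
        for child in children.get(node, []):
            visit(child)
        counter += 1
        labels[node] = (start, counter)

    visit(root)
    return labels


def is_descendant(labels, node, ancestor):
    """True if node's interval is strictly contained in ancestor's interval."""
    start, end = labels[node]
    a_start, a_end = labels[ancestor]
    return a_start < start and end < a_end


def subclasses(labels, ancestor):
    """All descendants of a class, found by comparing numbers only."""
    return sorted(n for n in labels if is_descendant(labels, n, ancestor))


# Hypothetical class hierarchy: parent -> list of direct subclasses.
HIERARCHY = {"Thing": ["Animal", "Plant"], "Animal": ["Dog", "Cat"]}
LABELS = number_tree(HIERARCHY, "Thing")
```

With such labels stored as attribute values, a subclass lookup becomes a pair of numeric comparisons instead of a recursive walk over the hierarchy.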
KSCI

Development of a Regulatory Q&A System for KAERI Utilizing Document Search Algorithms and Large Language Model (거대언어모델과 문서검색 알고리즘을 활용한 한국원자력연구원 규정 질의응답 시스템 개발)

Hongbi Kim;Yonggyun Yu
- Journal of Korea Society of Industrial Information Systems / v.28 no.5 / pp.31-39 / 2023
The evolution of Natural Language Processing (NLP) and the rise of large language models (LLMs) like ChatGPT have paved the way for specialized question-answering (QA) systems tailored to specific domains. This study outlines a system that combines an LLM with document search algorithms to interpret and address user inquiries using documents from the Korea Atomic Energy Research Institute (KAERI). Initially, the system refines multiple documents for optimized search and analysis, breaking the content into manageable paragraphs suitable for the language model's processing. Each paragraph's content is converted into a vector via an embedding model and archived in a database. Upon receiving a user query, the system matches the vectors extracted from the question against the stored vectors, pinpointing the most pertinent content. The chosen paragraphs, combined with the user's query, are then processed by the language generation model to formulate a response. Tests encompassing a spectrum of questions verified the system's proficiency in discerning question intent, understanding diverse documents, and delivering rapid and precise answers.
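The pipeline described here (split into paragraphs, embed, store, match the query, assemble the prompt) can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's system: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the in-memory list stands in for the database, the sample regulation text is invented, and no language model is actually called.

```python
import math
import re
from collections import Counter


def split_paragraphs(text):
    """Break a document into paragraphs on blank lines."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def embed(text):
    """Toy stand-in for an embedding model: bag-of-words term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, store, k=1):
    """Return the k stored paragraphs most similar to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query, paragraphs):
    """Combine retrieved paragraphs with the query for a generation model."""
    context = "\n".join(paragraphs)
    return f"Context:\n{context}\n\nQuestion: {query}"


# Invented sample text standing in for institutional regulations.
DOCUMENT = """Annual leave must be requested five days in advance.

Travel expenses are reimbursed within thirty days of submission."""

# In-memory stand-in for the vector database: (paragraph, vector) pairs.
STORE = [(p, embed(p)) for p in split_paragraphs(DOCUMENT)]
```

For example, `retrieve("When are travel expenses reimbursed?", STORE)` selects the travel-expenses paragraph, and `build_prompt` then produces the text that would be passed to the language generation model.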
https://doi.org/10.9723/jksiis.2023.28.5.031

Implementation of an Internet Homepage Retrieval System and Improvement of Retrieval Efficiency (인터넷 홈페이지 검색시스템 구현과 검색효율 향상)

Park, Hyun-Joo;Choi, Jae-Duck;Kang, Sang-Bae;Park, Seung;Park, Yong-Uk;Kwon, Hyuk-Chul
- Annual Conference on Human and Language Technology / 1997.10a / pp.227-232 / 1997
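This paper presents the Mirinae system, an information retrieval system for searching Internet homepages. Taking the characteristics of web documents into account, it extends the functions of the robot and automates indexing, registration, modification, deletion, and classification to improve management efficiency. The paper presents the problems that arise from automation and their solutions. Beyond Boolean query search, it improves retrieval efficiency in natural-language query search by using web page link attribute search and relevance feedback as query expansion methods.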

A Technique for wrapping Java applications in EJBs (자바 어플리케이션을 EJB로 래핑 하기 위한 기법)

김동관;정효택;양영종
- Proceedings of the Korean Information Science Society Conference / 2003.04c / pp.130-132 / 2003
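The Java language, which appeared in the early 1990s, spread rapidly among programmers, and the emergence of the Internet accelerated this further. Because it has also adapted quickly to new computing environments such as wireless platforms, it is hard to predict where the Java language will end. Early Java applications were developed as single-tier applications, and changes in the environment have created a need to connect such applications over a network. Java offers Enterprise JavaBeans (EJB) [1] as its solution for distributed computing environments. This paper provides techniques for wrapping previously developed Java applications in EJBs. Classes containing core business logic can be converted to EJBs by hand, but this paper uses a semi-automated method to increase conversion efficiency and minimize the errors that can occur during conversion. The EJB wrapping technique consists of session bean [1] wrapping and entity bean [1] wrapping. Session bean wrapping wraps those Java classes of an application that contain no queries. Entity bean wrapping wraps Java classes that contain queries. Reflection [2] and delegation mechanisms are used for EJB wrapping.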

Identification of Characteristics of a Concept through Linguistic Analysis (언어학적 분석을 통한 개념의 특성 정보 인식)

Paik, Hae-Seung;Kang, Young-Soo;Choi, Key-Sun
- Annual Conference on Human and Language Technology / 2001.10d / pp.233-238 / 2001
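A concept is a unit of knowledge formed by combining the characteristics that represent it, and each characteristic can be defined as an abstraction of the properties of the objects belonging to the concept [4]. This paper analyzes encyclopedia definition texts, classifies the information needed to constitute a concept into several representative characteristics, builds these as the concept's characteristic information, and shows that characteristic information can be recognized by applying it to documents on related concepts. The study starts from the assumption that an encyclopedia implicitly expresses world knowledge as a whole, and through manual analysis of a small amount of data it obtained meaningful results comparable to those of analyzing a large corpus. In an experiment on "disease," one of the many concepts expressed in the encyclopedia, the characteristic information of diseases (cause, symptom, and treatment) was automatically recognized with an average accuracy of 81%. Recognition of a concept's element information can be applied to fields such as information [...] and question answering.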

Query Analysis Using Information Extraction (정보추출을 이용한 질의분석)

Jung, Han-Min;Min, Kyung-Koo;Sung, Won-Kyung;Park, Dong-In
- Annual Conference on Human and Language Technology / 2004.10d / pp.290-295 / 2004
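This paper uses information extraction as a method for analyzing natural-language queries in the navigation domain. Information extraction, introduced to handle goal-oriented dialogue, lets the system lead the dialogue by filling in the values of predefined fields. The technique uses lexico-semantic pattern-based language processing together with extraction, filtering, and ranking rules, which makes it robust and makes ambiguity easy to handle. An experiment in the navigation domain shows 97% sentence-level accuracy on a set of 256 dialogues with users about traveling to a destination.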

