• Title/Summary/Keyword: query generation (질의 생성)


Automatic Training Corpus Generation Method of Named Entity Recognition Using Knowledge-Bases (개체명 인식 코퍼스 생성을 위한 지식베이스 활용 기법)

  • Park, Youngmin;Kim, Yejin;Kang, Sangwoo;Seo, Jungyun
    • Korean Journal of Cognitive Science / v.27 no.1 / pp.27-41 / 2016
  • Named entity recognition classifies elements in text into predefined categories and is used in various applications that receive natural language input. In this paper, we propose a method that automatically generates a named entity training corpus using knowledge bases. We apply two different corpus generation methods depending on the knowledge base. One method attaches named entity labels to text data using Wikipedia. The other crawls data from the web and labels named entities in the web text using Freebase. We conduct two experiments to evaluate the quality of the corpora and the proposed method for automatically generating a named entity recognition corpus. We randomly extract sentences from the two corpora, called the Wikipedia corpus and the Web corpus, and label them to validate both automatically labeled corpora. We also report the performance of a named entity recognizer trained on the corpus generated by the proposed method. The results show that the proposed method adapts well to new corpora that reflect diverse sentence structures and the newest entities.
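
The abstract does not give the labeling procedure in detail, so the following is only a minimal sketch of how knowledge-base entries could be projected onto raw text. It assumes a small hypothetical `gazetteer` dictionary standing in for entries drawn from Wikipedia or Freebase, and uses longest-match lookup to produce BIO tags; the function name `label_tokens` and the `max_len` parameter are illustrative, not taken from the paper.

```python
def label_tokens(tokens, gazetteer, max_len=3):
    """Assign BIO tags by longest-match lookup against a gazetteer.

    gazetteer maps an entity surface form to its type; here it is a
    hypothetical stand-in for entries taken from a knowledge base.
    """
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest span first so multi-word entities win.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            label = gazetteer.get(" ".join(tokens[i:i + n]))
            if label is not None:
                tags[i] = "B-" + label
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + label
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return tags


gazetteer = {"Seoul": "LOC", "Sogang University": "ORG"}  # assumed entries
tokens = "Sogang University is in Seoul".split()
print(list(zip(tokens, label_tokens(tokens, gazetteer))))
```

A dictionary lookup of this kind labels every occurrence of a known surface form, which is why validating a random sample of the generated sentences, as the abstract describes, matters for judging corpus quality.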


Range Stabbing Technique for Continuous Queries on RFID Streaming Data (RFID 스트리밍 데이타의 연속질의를 위한 영역 스태빙 기법)

  • Park, Jae-Kwan;Hong, Bong-Hee;Lee, Ki-Han
    • Journal of KIISE:Databases / v.36 no.2 / pp.112-122 / 2009
  • EPCglobal, which leads the development of RFID standards, proposed the Event Cycle Specification (ECSpec) and Event Cycle Reports (ECReports) as the standard for the RFID middleware interface. An ECSpec is a specification for filtering and collecting RFID tag data and is treated as a Continuous Query (CQ) that is processed repeatedly at fixed time intervals. An ECReport is a specification for describing the results after an ECSpec is processed. It is therefore efficient to apply the Query Indexing technique designed for continuous query processing. For efficiency, this query index treats ECSpecs as data and tag events as queries. In a logistics environment, similar or identical products are transported together. Moreover, when the RFID tags attached to the products are read, acquisition events occur in large numbers over a short period. Because of these properties, it is inefficient to process the massive number of events one by one. In this paper, we propose a technique that reduces repeated similar searches by treating the tag events collected during the report period of an ECSpec as a range query. For this group processing, we suggest a queuing method for collecting tag events efficiently and a structure for generating range queries from the queues. The experiments show that the proposed methods improve performance.
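
The grouping idea can be illustrated with a small sketch. It assumes tag IDs are plain integers and that each ECSpec filter is an inclusive ID interval; both are simplifications, and the names `group_into_ranges` and `match_specs` are hypothetical. A linear scan over the filters stands in for the paper's query index.

```python
def group_into_ranges(tag_ids):
    """Merge the tag IDs collected in one report period into runs of
    consecutive IDs, each run acting as one range query."""
    ranges = []
    for t in sorted(set(tag_ids)):
        if ranges and t - ranges[-1][1] <= 1:
            ranges[-1][1] = t          # extend the current run
        else:
            ranges.append([t, t])      # start a new run
    return [tuple(r) for r in ranges]


def match_specs(ranges, specs):
    """Probe each range once against every filter interval.

    specs maps a filter name to an inclusive (low, high) interval.
    Returns, per filter, the sub-ranges of tag IDs that it accepts.
    """
    hits = {}
    for lo, hi in ranges:
        for name, (s_lo, s_hi) in specs.items():
            if lo <= s_hi and s_lo <= hi:  # the intervals overlap
                hits.setdefault(name, []).append((max(lo, s_lo), min(hi, s_hi)))
    return hits


events = [105, 101, 102, 103, 200, 104, 201]             # assumed tag reads
specs = {"spec_a": (100, 103), "spec_b": (150, 250)}     # assumed filters
print(match_specs(group_into_ranges(events), specs))
```

Here seven tag events collapse into two range probes, which is the kind of saving the group processing aims for when similar products pass a reader together.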

Mineralogical Characteristics and Genetic Environment of Zeolitic Bentonite in Yeongil Area (영일 지역 제올라이트질 벤토나이트의 광물특성 및 생성환경)

  • 노진환;고상모
    • Journal of the Mineralogical Society of Korea / v.17 no.2 / pp.135-145 / 2004
  • Zeolitic bentonite, which has a whitish appearance and contains considerable amounts (nearly > 5%) of zeolites, frequently occurs as thin beds less than 1 m thick in the Yeongil area. The bentonites are mostly found in close association with zeolite beds in the Nuldaeri Tuff and Coal-bearing formations of the Janggi Group. A discordant occurrence of the bentonite against the bedding plane is also found locally. Montmorillonite, the major mineral constituent of the bentonite, is mostly associated with clinoptilolite as the zeolite. However, instead of clinoptilolite, mordenite is sometimes present in the more silicic bentonite, and heulandite in the less silicic one. Characteristically, the mordenite is accompanied by abundant opal-CT in the silicic bentonite. SEM observations indicate that these authigenic phases, especially the montmorillonite and zeolite, coexist as mixtures without forming fine-scale zoning. The zeolitic bentonite seems to have formed in a comparatively silicic pore fluid under alkaline conditions accompanied by pH fluctuation. Compared to the zeolite-free normal bentonite, the zeolitic types exhibit somewhat higher REE abundance. These chemical characteristics, together with the modes of occurrence and authigenic mineral associations, may suggest that the zeolitic bentonite is not merely a diagenetic product and that possible hydrothermal alteration cannot be excluded in the bentonite genesis.

Dynamic Management of Equi-Join Results for Multi-Keyword Searches (다중 키워드 검색에 적합한 동등조인 연산 결과의 동적 관리 기법)

  • Lim, Sung-Chae
    • The KIPS Transactions:PartA / v.17A no.5 / pp.229-236 / 2010
  • With an increasing number of documents on the Internet and in enterprises, it becomes crucial to efficiently support users' queries on those documents. In that situation, the full-text search technique is generally accepted, because it can answer uncontrolled ad-hoc queries by automatically indexing all the keywords found in the documents. The size of the index files built for full-text searches grows with the number of indexed documents, and thus the disk cost of processing multi-keyword queries against those enlarged index files may become too large. To solve this problem, we propose both an index file structure and a management scheme suitable for processing multi-keyword queries against a large volume of index files. To this end, we adopt the inverted-file structure, which is widely used in multi-keyword searches, as the basic index structure and modify it into a hierarchical structure for the join and ranking operations performed during query processing. To save disk costs with that index structure, we dynamically store in main memory the results of join operations between two keywords if they are highly likely to appear in users' queries. We also compare performance using a disk cost model to show the performance advantage of the proposed scheme.
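
A minimal sketch of keeping equi-join results for keyword pairs in memory is given below. It is not the paper's hierarchical index: posting lists are plain sorted lists of document IDs, the set of "hot" pairs is fixed in advance rather than managed dynamically, and the class name `PairCachedIndex` and the counter `disk_joins` are hypothetical names used only to make the saving visible.

```python
class PairCachedIndex:
    """Inverted index that keeps join results for selected keyword pairs."""

    def __init__(self, postings, hot_pairs):
        self.postings = {k: sorted(v) for k, v in postings.items()}
        self.disk_joins = 0   # joins that had to read both posting lists
        self.cache = {}       # frozenset({a, b}) -> joined document IDs
        for a, b in hot_pairs:
            self.cache[frozenset((a, b))] = self._join(a, b)

    def _join(self, a, b):
        # Merge-style equi-join (intersection) of two sorted posting lists.
        self.disk_joins += 1
        left, right = self.postings.get(a, []), self.postings.get(b, [])
        i = j = 0
        out = []
        while i < len(left) and j < len(right):
            if left[i] == right[j]:
                out.append(left[i])
                i += 1
                j += 1
            elif left[i] < right[j]:
                i += 1
            else:
                j += 1
        return out

    def search(self, a, b):
        # Serve a cached pair from memory; otherwise join the lists.
        key = frozenset((a, b))
        if key in self.cache:
            return self.cache[key]
        return self._join(a, b)


index = PairCachedIndex(
    {"query": [1, 2, 5, 9], "index": [2, 3, 5, 8], "join": [5, 9]},
    hot_pairs=[("query", "index")],   # assumed to be a frequent pair
)
print(index.search("index", "query"), index.search("query", "join"))
```

Keying the cache on an unordered pair means the same stored result answers both keyword orders, so only pairs outside the cache pay the cost of reading two posting lists.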

Mechanism of Surface Film Formation on Graphite Negative Electrodes and Its Correlation with Electrolyte in Lithium Secondary Batteries (리튬 이차전지의 흑연 음극 표면피막 생성기구와 전해질과의 상관성)

  • Jeong, Soon-Ki
    • Journal of the Korean Electrochemical Society / v.13 no.1 / pp.19-33 / 2010
  • The surface film that forms on graphite negative electrodes during initial charging is a key component in lithium secondary batteries. The battery reactions are strongly affected by the nature of the surface film, so it is very important to understand its physicochemical properties. At the same time, surface film formation is a very complicated interfacial phenomenon occurring at the graphite/electrolyte interface. In studies of electrode surfaces in lithium secondary batteries, in-situ experimental techniques are very important because the surface film is highly reactive and unstable in air. In this respect, electrochemical atomic force microscopy (ECAFM) is a useful tool for directly visualizing electrode/solution interfaces at which various electrochemical reactions occur under potential control. In the present review, the mechanism of surface film formation and its correlation with the electrolyte are summarized on the basis of in-situ ECAFM studies, in order to understand the nature of the surface film on graphite negative electrodes.

Efficient dummy generation for protecting location privacy in location based services (위치기반 서비스에서 위치 프라이버시를 보호하기 위한 효율적인 더미 생성)

  • Cai, Tian-yuan;Youn, Ji-hye;Song, Doo-hee;Park, Kwang-jin
    • Journal of Internet Computing and Services / v.18 no.5 / pp.23-30 / 2017
  • To enjoy the convenience provided by location based services (LBS), the user needs to submit his or her location and query to the LBS server. There is therefore a possibility that an untrusted LBS server may expose the user's ID, location, and other information. Many approaches to protecting user privacy have been proposed in the literature. Recently, approaches that use dummies have become popular. However, there are a number of things to consider when generating a dummy. For example, we have to take obstacles and the distance between dummies into account in order to improve the privacy level. Thus, in this paper we propose an efficient dummy generation algorithm that achieves k-anonymity and protects the user's privacy in LBS. Evaluation results show that the algorithm significantly improves the privacy level compared with other approaches.
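
The two constraints named in the abstract, avoiding obstacles and keeping dummies apart, can be sketched with simple rejection sampling. This is not the paper's algorithm: the square grid, the Euclidean distance threshold `min_dist`, and the function name `generate_dummies` are all assumptions made for illustration.

```python
import math
import random


def generate_dummies(user, k, obstacles, min_dist, size=20, seed=0, max_tries=10000):
    """Return k locations (the real one plus k - 1 dummies) on a size x size grid.

    A candidate is rejected if it falls on an obstacle cell or lies closer
    than min_dist to any location already chosen.
    """
    rng = random.Random(seed)
    chosen = [user]
    tries = 0
    while len(chosen) < k:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("could not place enough dummies")
        cand = (rng.randrange(size), rng.randrange(size))
        if cand in obstacles:
            continue  # a dummy on an obstacle is easy to rule out
        if all(math.dist(cand, p) >= min_dist for p in chosen):
            chosen.append(cand)
    rng.shuffle(chosen)  # do not reveal the real location by position
    return chosen


obstacles = {(3, 3), (4, 4), (10, 12)}   # assumed unreachable cells
print(generate_dummies((10, 10), k=5, obstacles=obstacles, min_dist=4.0))
```

Sending all k locations with the query means the server cannot tell which one is real, which is the k-anonymity property the abstract refers to; spreading the dummies out keeps them from collapsing into one small region around the user.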

A Neighbor Selection Technique for Improving Efficiency of Local Search in Load Balancing Problems (부하평준화 문제에서 국지적 탐색의 효율향상을 위한 이웃해 선정 기법)

  • 강병호;조민숙;류광렬
    • Journal of KIISE:Software and Applications / v.31 no.2 / pp.164-172 / 2004
  • For a local search algorithm to find a better quality solution, it must generate and evaluate a sufficiently large number of candidate solutions as neighbors at each iteration, which demands a considerable amount of CPU time. This paper presents a method of selectively generating only promising candidate neighbors, so that the number of neighbors can be kept low to improve search efficiency. In our method, a newly generated candidate solution is probabilistically selected to become a neighbor based on a quality estimate determined heuristically by a very simple evaluation of the candidate. Experimental results on the problem of load balancing for production scheduling show that our candidate selection method outperforms other random or greedy selection methods in terms of solution quality given the same amount of CPU time.
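
The selection step can be sketched on a toy load balancing problem: jobs with processing times are assigned to machines and the goal is a small makespan. The cheap quality estimate used here (how much heavier the source machine is than the destination) and the acceptance probability `0.1 + 0.9 * score` are assumptions chosen for illustration, not the heuristic from the paper.

```python
import random


def makespan(assign, jobs, m):
    """Full evaluation: the load of the most heavily loaded machine."""
    loads = [0] * m
    for job, mach in enumerate(assign):
        loads[mach] += jobs[job]
    return max(loads)


def select_neighbors(assign, jobs, m, n_neighbors, rng):
    """Generate candidate moves (job, destination) and keep each one with a
    probability that grows with a cheap estimate of its quality."""
    loads = [0] * m
    for job, mach in enumerate(assign):
        loads[mach] += jobs[job]
    span = (max(loads) - min(loads)) or 1
    neighbors = []
    while len(neighbors) < n_neighbors:
        job, dst = rng.randrange(len(jobs)), rng.randrange(m)
        src = assign[job]
        if dst == src:
            continue
        # Cheap estimate: moving from a heavy to a light machine looks good.
        score = max(0.0, (loads[src] - loads[dst]) / span)
        if rng.random() < 0.1 + 0.9 * score:
            neighbors.append((job, dst))
    return neighbors


def local_search(jobs, m, iters=50, n_neighbors=5, seed=0):
    rng = random.Random(seed)
    assign = [0] * len(jobs)          # start with every job on machine 0
    best = makespan(assign, jobs, m)
    for _ in range(iters):
        scored = []
        for job, dst in select_neighbors(assign, jobs, m, n_neighbors, rng):
            cand = assign[:]
            cand[job] = dst
            scored.append((makespan(cand, jobs, m), cand))  # costly step
        cost, cand = min(scored)
        if cost < best:
            assign, best = cand, cost
    return assign, best


print(local_search([7, 5, 4, 3, 3, 2], m=3))
```

Only the few candidates that pass the probabilistic filter reach the full `makespan` evaluation, which is how the neighborhood stays small without falling back to purely random or purely greedy selection.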

Memory Efficient Query Processing over Dynamic XML Fragment Stream (동적 XML 조각 스트림에 대한 메모리 효율적 질의 처리)

  • Lee, Sang-Wook;Kim, Jin;Kang, Hyun-Chul
    • The KIPS Transactions:PartD / v.15D no.1 / pp.1-14 / 2008
  • This paper concerns query processing on mobile devices where memory capacity is limited. When a query against a large volume of XML data is processed on such a device, techniques for fragmenting the XML data into chunks and for streaming and processing them are required. Such techniques make it possible to process queries without materializing the XML data in its entirety. Previous schemes such as XFrag[4], XFPro[5], and XFLab[6] do not scale as the size of the XML data increases because they lack proper memory management. Once information on XML fragments needed for query processing is stored, it is not deleted even after it is no longer of use. As a result, when XML fragments are dynamically generated and streamed indefinitely, there is no guarantee that query processing completes normally. In this paper, we address the scalability of query processing over a dynamic XML fragment stream and extend the previous schemes with techniques for deleting the information on XML fragments accumulated during query processing. Performance experiments with an implementation showed that our extended schemes considerably outperform the previous ones in memory efficiency and in scalability with respect to the size of the XML data.

Answer Constraints Extraction on User Question for Wikipedia QA (위키피디아 QA를 위한 질의문의 정답제약 추출)

  • Wang, JiHyun;Heo, Jeong;Lee, Hyungjik;Bae, Yongjin;Kim, Hyunki
    • Annual Conference on Human and Language Technology / 2017.10a / pp.248-250 / 2017
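  • We define nine answer constraints for the Wikipedia domain that constrain answers in a question answering system, and propose a method for extracting constraint expressions from question sentences. To extract multi-word answer constraint expressions, we generate answer constraint candidates from the results of language analysis and present features for learning answer constraint expressions at the candidate level. Using a machine learning method, we classify the answer constraint tag of each candidate and thereby extract the answer constraint expressions. In the performance experiments, we evaluated the F1-score for each answer constraint tag.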


Design of a Contextual Lexical Knowledge Graph Extraction Algorithm (맥락적 어휘 지식 그래프 추출 알고리즘의 설계)

  • Nam, Sangha;Choi, Gyuhyeon;Hahm, Younggyun;Choi, Key-Sun
    • 한국어정보학회:학술대회논문집 / 2016.10a / pp.147-151 / 2016
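  • This paper presents a Korean open information extraction method for extracting reified triples. In the Semantic Web field, knowledge is commonly represented as RDF triples, but natural language sentences consist of relations between multiple predicates and their arguments. For this reason, a new open information extraction system is needed that follows the triple, the representative knowledge representation of the Semantic Web, while also reflecting the dependency structure of the sentence so that the relations between multiple predicates and arguments are captured as knowledge. We propose a new open information extraction method that takes into account a consistent transformation of sentence structure, together with a reified triple extraction method that can represent entity-centric and event-centric knowledge at the same time. To demonstrate the superiority and effectiveness of the proposed method, we conducted experiments measuring the amount and accuracy of the knowledge extracted from the body text of featured articles in Korean Wikipedia, and we introduce a pseudo-SPARQL query generation module that applies the proposed method.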
