• Title/Summary/Keyword: 웹분류 (web classification)

Query Expansion Based on Word Graphs Using Pseudo Non-Relevant Documents and Term Proximity (잠정적 부적합 문서와 어휘 근접도를 반영한 어휘 그래프 기반 질의 확장)

  • Jo, Seung-Hyeon;Lee, Kyung-Soon
    • The KIPS Transactions:PartB / v.19B no.3 / pp.189-194 / 2012
  • In this paper, we propose a query expansion method based on word graphs that uses pseudo-relevant and pseudo non-relevant documents to improve information retrieval performance. An initially retrieved document is assigned to the core cluster if it contains the core query terms, which are extracted using query term combinations and the degree of query term proximity; otherwise, it is assigned to the non-core cluster. Documents in the core cluster are treated as pseudo-relevant documents, and documents in the non-core cluster as pseudo non-relevant documents. Each cluster is represented as a graph of nodes and edges: each node represents a term, and each edge represents the proximity between that term and a query term. A term's weight is calculated by subtracting its weight in the non-core cluster graph from its weight in the core cluster graph, so a term with a high weight in the non-core cluster graph is unlikely to be chosen as an expansion term. Expansion terms are selected according to these term weights. Experimental results on the TREC WT10g test collection show that the proposed method achieves a 9.4% improvement in mean average precision over the language model.
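
The weighting step described in this abstract can be illustrated with a minimal sketch. The abstract does not give the proximity measure or the graph construction details, so the sketch assumes a simple inverse-distance proximity within a fixed window; the names `graph_term_weights`, `expansion_terms`, `window`, and `k`, and the toy documents, are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def graph_term_weights(docs, query_terms, window=5):
    """Weight each non-query term by its summed proximity to query terms.

    Assumption: proximity is 1/distance inside a fixed token window.
    The abstract does not specify the actual measure.
    """
    weights = defaultdict(float)
    query = set(query_terms)
    for tokens in docs:
        query_positions = [i for i, t in enumerate(tokens) if t in query]
        for i, term in enumerate(tokens):
            if term in query:
                continue
            for q in query_positions:
                dist = abs(i - q)  # never 0: term is not a query term
                if dist <= window:
                    weights[term] += 1.0 / dist
    return weights


def expansion_terms(core_docs, non_core_docs, query_terms, k=3, window=5):
    """Rank terms by (core graph weight - non-core graph weight)."""
    core = graph_term_weights(core_docs, query_terms, window)
    non_core = graph_term_weights(non_core_docs, query_terms, window)
    scores = {t: w - non_core.get(t, 0.0) for t, w in core.items()}
    # Sort by descending score, then alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, s in ranked[:k] if s > 0]


# Toy data (hypothetical): documents are already tokenized.
core_docs = [["jaguar", "car", "speed"], ["jaguar", "engine", "car"]]
non_core_docs = [["jaguar", "cat", "zoo"], ["car", "jaguar", "cat"]]
print(expansion_terms(core_docs, non_core_docs, ["jaguar"], k=2))
# → ['engine', 'car']
```

In this toy run "car" is penalized because it also appears near the query term in the non-core documents, which is the effect the subtraction is meant to produce.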

Service Plan of National R&D Report System Using KANO Model (KANO모형을 이용한 국가R&D보고서 시스템의 서비스 방안)

  • Park, Man-Hee
    • The Journal of the Korea Contents Association / v.14 no.1 / pp.364-373 / 2014
  • The relationship between the services an information system provides and user satisfaction is regarded as an important factor in developing new services for that system. In this study, twelve new key services applicable to the national R&D report system were derived from changes in the web environment that have accompanied developments in information technology. The twelve services are: semantic search service for national R&D reports, associated report service, RSS service, mash-up service, topic-map service, open API service, personalized service, collective intelligence service, SNS service, unstructured data service, detailed search service, and mailing service. A survey was performed to assess the quality attributes of these twelve services. Based on the satisfaction coefficient and the results of the service classification, a stepwise service plan for the national R&D report system is proposed. In the first step, the unstructured data service, personalized service, associated report service, topic-map service, open API service, and collective intelligence service should be developed; in the second step, the RSS service, mash-up service, semantic search service for national R&D reports, mailing service, detailed search service, and SNS service should be developed.
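
The abstract refers to a satisfaction coefficient and a service classification but does not state the formulas. As a rough illustration, the sketch below assumes the customer satisfaction coefficients commonly used with the Kano model (often attributed to Timko): Better = (A+O)/(A+O+M+I) and Worse = -(O+M)/(A+O+M+I), where A, O, M, and I count the respondents who rated a service attractive, one-dimensional, must-be, or indifferent. The function name `kano_summary` and the survey counts are hypothetical.

```python
def kano_summary(counts):
    """Return (dominant category, Better, Worse) for one service.

    counts maps Kano categories to respondent counts:
    A = attractive, O = one-dimensional, M = must-be, I = indifferent.
    Assumption: reverse (R) and questionable (Q) answers, if present,
    are ignored, as in the commonly used coefficient formulas.
    """
    a, o, m, i = (counts.get(c, 0) for c in "AOMI")
    total = a + o + m + i
    if total == 0:
        raise ValueError("no usable responses")
    better = (a + o) / total      # satisfaction gained if provided
    worse = -(o + m) / total      # dissatisfaction if not provided
    category = max("AOMI", key=lambda c: counts.get(c, 0))
    return category, better, worse


# Hypothetical counts for a single service, not data from the study.
print(kano_summary({"A": 40, "O": 30, "M": 10, "I": 20}))
# → ('A', 0.7, -0.4)
```

A service with a high Better value and a Worse value close to zero adds satisfaction without being missed when absent, which is one way a stepwise plan could prioritize services.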

Building Transparency on the Total System Performance Assessment of Radioactive Repository through the Development of the Cyber R&D Platform; Application for Development of Scenario and Input of TSPA Data through QA Procedures (Cyber R&D Platform개발을 통한 방사성폐기물 처분종합성능평가(TSPA) 투명성 증진에 관한 연구; 시나리오 도출 과정과 TSPA 데이터 입력에서의 품질보증 적용 사례)

  • Seo, Eun-Jin;Hwang, Yong-Soo;Kang, Chul-Hyung
    • Journal of Nuclear Fuel Cycle and Waste Technology(JNFCWT) / v.4 no.1 / pp.65-75 / 2006
  • Transparency in the Total System Performance Assessment (TSPA) is a key issue in enhancing public acceptance of a radioactive waste repository. To achieve it, all TSPA work must be performed under Quality Assurance (QA). KAERI has developed the integrated Cyber R&D Platform using the T2R3 principles, which apply to five major steps in R&D: planning, research work, documentation, and internal and external audits. The system is web-based so that all TSPA participants can access it. It is composed of three sub-systems: FEAS (FEp to Assessment through Scenario development), which shows the systematic approach from the FEPs to the assessment methods as a flow chart; PAID (Performance Assessment Input Databases), which is designed for easy search and review of field data for TSPA; and the QA system, which contains the administrative system for QA on the five key R&D steps in addition to approval and disapproval processes, corrective actions, and permanent record keeping. All information recorded in the QA system under the T2R3 principles is integrated into the Cyber R&D Platform so that any data in the system can be checked whenever necessary. In the next R&D phase, the Cyber R&D Platform will be connected to the TSPA assessment tool so that all information can be searched in one unified system.

Automatic Training Corpus Generation Method of Named Entity Recognition Using Knowledge-Bases (개체명 인식 코퍼스 생성을 위한 지식베이스 활용 기법)

  • Park, Youngmin;Kim, Yejin;Kang, Sangwoo;Seo, Jungyun
    • Korean Journal of Cognitive Science / v.27 no.1 / pp.27-41 / 2016
  • Named entity recognition classifies elements in text into predefined categories and is used in various applications that receive natural language input. In this paper, we propose a method that automatically generates a named entity training corpus using knowledge bases. We apply two different corpus generation methods depending on the knowledge base. One attaches named entity labels to text data using Wikipedia. The other crawls data from the web and labels named entities in the web text using Freebase. We conduct two experiments to evaluate the corpus quality and the proposed method for automatically generating a named entity recognition corpus. First, we randomly extract sentences from the two corpora, called the Wikipedia corpus and the Web corpus, and label them to validate the automatically labeled corpora. Second, we report the performance of a named entity recognizer trained on the corpus generated by the proposed method. The results show that the proposed method adapts well to a new corpus that reflects diverse sentence structures and the newest entities.
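
The general idea of labeling text from a knowledge base can be sketched as a longest-match lookup that emits BIO tags. This is only an illustration of the technique, not the authors' pipeline: the abstract does not describe how Wikipedia or Freebase entries are matched, and the function name `label_with_gazetteer`, the `max_len` parameter, and the small gazetteer are hypothetical.

```python
def label_with_gazetteer(tokens, gazetteer, max_len=4):
    """Assign BIO tags by greedy longest match against a gazetteer.

    gazetteer maps a tuple of tokens (an entity surface form) to an
    entity type, e.g. ("Barack", "Obama") -> "PER". In practice the
    entries would be drawn from a knowledge base; here they are toy data.
    """
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        match = None
        # Try the longest candidate span first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(tokens[i:i + n])
            if span in gazetteer:
                match = (n, gazetteer[span])
                break
        if match is None:
            i += 1
            continue
        n, entity_type = match
        tags[i] = "B-" + entity_type
        for j in range(i + 1, i + n):
            tags[j] = "I-" + entity_type
        i += n
    return tags


gazetteer = {
    ("Barack", "Obama"): "PER",
    ("Obama",): "PER",
    ("Seoul",): "LOC",
}
tokens = ["Barack", "Obama", "visited", "Seoul"]
print(list(zip(tokens, label_with_gazetteer(tokens, gazetteer))))
# → [('Barack', 'B-PER'), ('Obama', 'I-PER'), ('visited', 'O'), ('Seoul', 'B-LOC')]
```

Greedy longest match keeps "Barack Obama" as one entity instead of tagging "Obama" separately; exact string matching like this is noisy on ambiguous names, which is why the quality of an automatically labeled corpus needs to be validated.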

Development of Evaluation Framework and Professional Evaluation of Health Information Predictability (건강정보의 예보성 평가준거를 활용한 전문가 평가결과 분석연구)

  • Kang, Min-Sug;Lee, Moo-Sik;Hong, Jee-Young;Kim, Sang-Ha
    • Journal of the Korea Academia-Industrial cooperation Society / v.10 no.10 / pp.2966-2973 / 2009
  • This article proposes effective strategies for improving Predictive Health Care. In a qualitative study of health information, the evaluation criteria scored as follows, from highest to lowest: whether health information is scientifically sound ($3.7\pm0.5$), whether people can easily understand health information ($3.6\pm0.5$), whether health information reflects the public's concerns ($3.5\pm0.5$), and whether health information includes enough information to satisfy the public ($2.9\pm0.6$). The most pressing reforms for effective Predictive Health Care are to provide enough health information and to collect information regularly, because Predictive Health Care has not provided enough information, authoritative information has rarely been offered, and methodological limitations on producing and applying predictive information have not been addressed. Although Predictive Health Care provides online services such as a web-based epidemic reporting system, it needs to extend its services from epidemic information to general health information, because it has been insufficiently promoted and the information offered so far lacks credibility. Lastly, Predictive Health Care needs to strengthen its efforts to collect information, establish common ground between the information and the public's concerns, clarify its information classification system, and offer the public an easy way to use the information.

Cascade Composition of Translation Rules for the Ontology Interoperability of Simple RDF Message (단순 RDF 메시지의 온톨로지 상호 운용성을 위한 변환 규칙들의 연쇄 조합)

  • Kim, Jae-Hoon;Park, Seog
    • Journal of KIISE:Databases / v.34 no.6 / pp.528-545 / 2007
  • Ontology has recently become an attractive technology as businesses pursue strategies for providing more, and more intelligent, services. The essential problem in application domains that use ontologies is that all members, agents, and application programs in a domain must share the same ontology concepts. However, the variety of mobile devices, sensing devices, and network components manufactured by different companies, together with the variety of common carriers and content providers, makes it likely that multiple heterogeneous ontologies will coexist. Much past research has addressed this semantic interoperability problem. The methods can be broadly classified into by-mapping, by-merging, and by-translation. In this research, we focus on by-translation, which, like OntoMorph, uses a translation rule written directly between the data of two heterogeneous ontologies. However, composing direct translation rules manually is inconvenient in itself, and if there are N ontologies, the direct method has a rule composition complexity of $O(N^2)$ in the worst case. Therefore, in this paper we introduce the cascade composition of translation rules, based on web openness, in order to reduce this complexity. The research identified two important factors in an ontology translation system: the speed of translation and the convenience of translation rule composition. Experiments and a comparative analysis with existing methods showed that our cascade method is more convenient while ensuring speed and correctness.
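
The contrast between direct and cascaded rules can be sketched as follows. With N ontologies, writing a direct rule for every ordered pair requires up to N(N-1) rules, whereas chaining existing rules along a path needs far fewer. The abstract does not describe the paper's rule language or composition algorithm, so this sketch makes strong simplifying assumptions: a rule is a plain term-to-term dictionary, a message is a single (subject, predicate, object) triple, and the chain is found by breadth-first search. All names (`find_rule_chain`, `translate`, the ontology labels, the terms) are hypothetical.

```python
from collections import deque


def find_rule_chain(rules, src, dst):
    """Breadth-first search for a shortest chain of rules from src to dst.

    rules maps (source_ontology, target_ontology) -> term mapping dict.
    Returns a list of (source, target) steps, or None if no chain exists.
    """
    graph = {}
    for a, b in rules:
        graph.setdefault(a, []).append(b)
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return list(zip(path, path[1:]))
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def translate(triple, rules, src, dst):
    """Translate one triple by applying each rule in the chain in turn."""
    chain = find_rule_chain(rules, src, dst)
    if chain is None:
        raise ValueError(f"no rule chain from {src} to {dst}")
    s, p, o = triple
    for step in chain:
        mapping = rules[step]
        # Terms without a mapping (e.g. literals) pass through unchanged.
        s, p, o = (mapping.get(x, x) for x in (s, p, o))
    return (s, p, o)


# Two hypothetical rules; no direct A -> C rule has been written.
rules = {
    ("A", "B"): {"a:name": "b:label"},
    ("B", "C"): {"b:label": "c:title"},
}
print(translate(("doc1", "a:name", "Hello"), rules, "A", "C"))
# → ('doc1', 'c:title', 'Hello')
```

The trade-off in a scheme like this is that each extra hop adds translation time and a chance of losing meaning, which is consistent with the abstract naming translation speed and correctness as factors to verify.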

Strategies on Text Screen Design Of The Electronic Textbook For Focused Attention Using Automatic Text Scroll (자동 스크롤 가능을 이용한 주의력 집중을 위한 웹기반 전자교과서 텍스트 화면 설계전략)

  • Kwon, Hyunggyu
    • The Journal of Korean Association of Computer Education / v.5 no.4 / pp.134-145 / 2002
  • The purpose of this study is to present functional and technical solutions for text learning in a web-based textbook in which each letter has its own focal point. The solutions help learners keep their main focus when the eye moves to the next letter or line. The text screen of the electronic textbook automatically scrolls the text up and down or left and right, in the direction preassigned by the learner, without any mouse or keyboard operation. The learner can change the scroll speed and type at any time during scrolling. The automatic text scroll function is a solution for controlling data and the screen to reflect personal preferences and abilities. It covers the content structure of the text (characteristics, categorizations, etc.), the appearance of the text (density, size, font, etc.), scroll options (scroll, speed, etc.), program control type (RAM-resident program, etc.), and the application of screen design principles (legibility, etc.). To resolve these functional problems, eight technical phases are provided: environment setting, scroll option setting, copy, data analysis, scroll coding, centered focus coding, left and right focus coding, and implementation. Learners can focus on the text without distraction because the text focal points stay in a fixed area of the screen. They read the text according to their preferences for fonts, sizes, line spacing, and so on.

A Study on the UCC Copyright which uses the Broadcasting Contents and the ODR(Online Dispute Resolution) through the Online Technical embodiment : Focusing on the CCL as the Conversational law Approach (방송콘텐츠를 이용한 UCC의 저작권 문제와 온라인 기술 구현을 통한 ODR(Online Dispute Resolution)의 가능성에 관한 연구 : Conversational Law 접근으로써 CCL을 중심으로)

  • Kim, Mi-Sun;Yu, Sae-Kyung
    • Proceedings of the HCI Society of Korea Conference (한국HCI학회:학술대회논문집) / 2008.02b / pp.558-564 / 2008
  • This study examines copyright issues for UCC (User Created Contents) that use broadcasting content. UCC are classified into UGC (User Generated Contents), UMC (User Modified Contents), and URC (User Recreated Contents). UMC and URC in particular raise copyright problems. An investigation by the Copyright Protection Center in 2006 reported that 83.7% of UCC infringe copyright. Although the UCC copyright problem is significant, no concrete resolution exists. It is also difficult to apply offline legal rules because of the online nature of UCC. The study reviews UCC copyright disputes involving broadcasting content and investigates ways to resolve them. Considering the nature of online media, it analyzes the CCL (Creative Commons License) as a form of ODR (Online Dispute Resolution). The study is meaningful in that it explores the possibility of resolving the UCC copyright problem through online technical implementation.

PDA Personalized Agent System (PDA용 개인화 에이전트 시스템)

  • 표석진;박영택
    • Proceedings of the Korea Intelligent Information System Society Conference / 2002.11a / pp.345-352 / 2002
  • Users of the wireless Internet face communication costs that grow with the time needed to transfer larger amounts of information, so they want a personalized agent that provides services according to the user's interests, delivers customized information, and predicts information in a knowledge-based manner. In this paper, we build a PDA personalized agent system for such wireless Internet users. To build the system, we use a profile-based agent engine and a knowledge-based approach that uses user profiles. The system monitors the actions a user performs on web pages to identify the documents the user is interested in, analyzes the documents obtained through information retrieval, and serves them divided according to each user's interests. To analyze the monitored documents effectively, we use Cobweb, an unsupervised clustering machine learning method. The unsupervised learner uses a conceptual approach to cluster the retrieved information by the user's fields of interest. The clustering results are then passed through machine learning again to generate a profile of the user's documents of interest. The resulting profile is turned into rules, which serve as the basis for the service provided to the user. Because these rules are derived from monitoring the user, they are updated periodically. The proposed system targets news documents that Internet newspapers and webzines produce to deliver news to users; it performs searches using user information and delivers the resulting information, after classification, to the user via a PDA or mobile phone.

A Study on the Data Organization of Specification Information for reference of Design Information (설계정보 참조를 위한 시방정보의 자료구조화에 관한 연구)

  • Kim Jae-hyun;Song Younk-Kyou;Kim Uk
    • Korean Journal of Construction Engineering and Management / v.2 no.3 s.7 / pp.92-100 / 2001
  • Architectural drawings, construction project specifications, and similar documents are included in the contract documents. However, the construction project specification, despite being part of this documentation, is not widely utilized. The reason is that it is difficult to collect specification information in relation to the architectural drawings, the material finishing list, and other architectural information. Therefore, an integrated model that can be associated with other architectural information is needed, and a DB based on this integrated model must be established so that it can be utilized in design, construction, and management. The DB established through this process must be updated whenever the design or construction is modified. Furthermore, the specification must be available as a document on the web for reference. Consequently, this research introduces the structure of the integrated model and makes it possible to search and prepare the integrated model on the Internet, using the specification information DB established for mutual reference among DBs. This system is expected to improve construction project specification standards. It will also improve claim prevention and the quality of design, construction, and management. Furthermore, it will make information more convenient to use in practical business such as ordering agencies, design services, and building sites.
