• Title/Summary/Keyword: information collection and extraction

Construction of Test Collection for Automatically Extracting Technological Knowledge (기술 지식 자동 추출을 위한 테스트 컬렉션 구축)

  • Shin, Sung-Ho;Choi, Yun-Soo;Song, Sa-Kwang;Choi, Sung-Pil;Jung, Han-Min
    • The Journal of the Korea Contents Association / v.12 no.7 / pp.463-472 / 2012
  • Over the last decade, the amount of information has increased rapidly because of advances in the internet and computing technology, mobile devices and sensors, and social networks such as Facebook and Twitter. People who want to gain important knowledge from databases have been frustrated by their size. Many studies on automatically extracting meaningful knowledge from large databases have been carried out. Automatic knowledge extraction has therefore become highly significant in the information technology field, but many challenges remain. To improve the effectiveness and efficiency of knowledge extraction systems, a test collection is essential. In this research, we introduce a test collection for automatic knowledge extraction. We name the test collection KEEC/KREC (KISTI Entity Extraction Collection/KISTI Relation Extraction Collection) and present the process and guidelines for building it, as well as its features. Its main feature is that tagging is done by experts to guarantee the quality of the collection. The experts read documents and tag entities and the relations between entities using a tagging tool. KEEC/KREC is being used in research to evaluate system performance and will continue to contribute to future research.

The Design and Implementation of Parameter Extraction System for Analyzing Internet Using SNMP (SNMP를 이용한 인터넷 분석 파라미터 추출 시스템의 설계 및 구현)

  • Sin, Sang-Cheol;An, Seong-Jin;Jeong, Jin-Uk
    • The Transactions of the Korea Information Processing Society / v.6 no.3 / pp.710-721 / 1999
  • In this paper, we design and implement a parameter extraction system for analyzing the Internet using SNMP. The extraction system has two modules: a collection request module and an analysis request module. The collection request module generates a polling script, which is used to collect management information from the managed system periodically. From the collected data, the analysis request module extracts analysis parameters: traffic flow analysis, interface traffic analysis, packet traffic analysis, and management traffic analysis parameters. For management activity, we introduce a two-step analysis view. The first step is the Summary-View, which is used to find a malfunctioning system among all the managed systems. The second is the Specific-View, with which a specific system can be analyzed using all of the analysis parameters. The extracted data can serve as indicators for line capacity planning, network redesign, decisions on performance upgrades for network devices, and similar tasks.
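
As an illustration of the kind of interface traffic parameter such a system could derive from periodic polling, here is a minimal sketch. The abstract does not give the paper's formulas, so this is an assumption: it uses the common approach of differencing two readings of an octet counter (for example the standard IF-MIB object `ifInOctets`, a 32-bit counter) and dividing by the link speed (`ifSpeed`). The function names are hypothetical and no SNMP library is involved; the readings are passed in as plain numbers.

```python
COUNTER32_MODULUS = 2 ** 32  # SNMP Counter32 values wrap around at 2^32


def counter_delta(previous, current, modulus=COUNTER32_MODULUS):
    """Difference between two counter readings, allowing for one wrap-around."""
    return (current - previous) % modulus


def interface_utilization(prev_octets, curr_octets, interval_seconds, if_speed_bps):
    """Fraction of link capacity used between two polls (assumed formula).

    prev_octets / curr_octets: two successive octet-counter readings.
    interval_seconds: time between the two polls.
    if_speed_bps: nominal link speed in bits per second.
    """
    if interval_seconds <= 0 or if_speed_bps <= 0:
        raise ValueError("interval and link speed must be positive")
    bits_transferred = counter_delta(prev_octets, curr_octets) * 8
    return bits_transferred / (interval_seconds * if_speed_bps)
```

For example, 1,250,000 octets received over a 10-second polling interval on a 10 Mbit/s link corresponds to a utilization of 0.1. A summary view could flag interfaces whose utilization stays above a chosen threshold, which is one plausible input to capacity planning.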

A Development Method of Framework for Collecting, Extracting, and Classifying Social Contents

  • Cho, Eun-Sook
    • Journal of the Korea Society of Computer and Information / v.26 no.1 / pp.163-170 / 2021
  • As big data is used in more industries, the big data market is expanding from hardware to infrastructure software to service software. In particular, it is growing into a huge platform market that provides applications for holistic and intuitive visualization, covering the interpretation of big data's meaning, its understandability, and analysis results. Demand for big data extraction and analysis using social media such as SNS is strong among both companies and individuals. However, despite this high demand for collecting and analyzing social media data for user trend analysis and marketing, little research addresses the difficulty of dynamic interlocking and the complexity of building and operating software platforms, both caused by the heterogeneity of social media service interfaces. In this paper, we propose a method for developing a framework that handles the process from collection through extraction to classification of social media data. The proposed framework addresses the heterogeneity of social media data collection channels through the adapter pattern, and improves the accuracy of social topic extraction and classification through semantic association-based extraction techniques and topic association-based classification techniques.
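
The adapter pattern mentioned in this abstract can be sketched as follows. This is a generic illustration of the pattern, not the paper's framework: every class name, the two stand-in clients, and their method names are hypothetical and exist only to show how differently shaped service interfaces can be hidden behind one collection interface.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass
class Post:
    """Common record that the rest of the pipeline works with."""
    source: str
    author: str
    text: str


class SocialMediaAdapter(ABC):
    """Uniform collection interface (hypothetical)."""

    @abstractmethod
    def fetch_posts(self, keyword: str) -> List[Post]:
        ...


class FakeMicroblogClient:
    """Hypothetical stand-in for one service API: returns dicts."""

    def search(self, q):
        return [{"user": "alice", "body": "news about " + q}]


class FakeForumClient:
    """Hypothetical stand-in for another service API: returns tuples."""

    def find_threads(self, term):
        return [("bob", "question on " + term)]


class MicroblogAdapter(SocialMediaAdapter):
    def __init__(self, client):
        self._client = client

    def fetch_posts(self, keyword):
        # Translate the dict-shaped results into the common Post record.
        return [Post("microblog", item["user"], item["body"])
                for item in self._client.search(keyword)]


class ForumAdapter(SocialMediaAdapter):
    def __init__(self, client):
        self._client = client

    def fetch_posts(self, keyword):
        # Translate the tuple-shaped results into the common Post record.
        return [Post("forum", author, text)
                for author, text in self._client.find_threads(keyword)]


def collect(adapters, keyword):
    """Gather posts from every channel through the same interface."""
    posts = []
    for adapter in adapters:
        posts.extend(adapter.fetch_posts(keyword))
    return posts
```

With this shape, supporting a new channel means writing one more adapter class; the `collect` step and any downstream extraction or classification code stay unchanged.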

Construction of Test Collection for Extraction of Biomedical PLOT & Relations (생의학분야 PLOT 및 관계추출을 위한 테스트컬렉션 구축)

  • Choi, Yun-Soo;Choi, Sung-Pil;Jeong, Chang-Hoo
    • Proceedings of the Korea Contents Association Conference / 2010.05a / pp.425-427 / 2010
  • Large-scale information extraction consists of named-entity recognition, terminology extraction, and relation extraction. Because these elementary technologies have so far been studied independently, the test collections for the related machine learning models have also been constructed independently. As a result, it is difficult to process scientific documents so as to extract both named entities and technical terms at once. In this study, we integrate named entities and terminology as PLOT (Person, Location, Organization, Terminology) in the biomedical domain and construct a test collection of PLOT and of the relations between PLOTs.

Construction of Test Collection for Evaluation of Scientific Relation Extraction System (과학기술분야 용어 간 관계추출 시스템의 평가를 위한 테스트컬렉션 구축)

  • Choi, Yun-Soo;Choi, Sung-Pil;Jeong, Chang-Hoo;Yoon, Hwa-Mook;You, Beom-Jong
    • Proceedings of the Korea Contents Association Conference / 2009.05a / pp.754-758 / 2009
  • Extracting information from large-scale document collections would be very useful not only for information retrieval but also for question answering and summarization. Although relation extraction is a very important area, it is difficult to develop and evaluate a machine learning-based system without a test collection. This study shows how to build a test collection (KREC2008) for relation extraction systems. We extracted technology terms from journal abstracts and selected several candidate relations between them using WordNet. Judges who were well trained in the evaluation process assigned a relation from the candidates. The process provides a method by which even non-experts can build a test collection easily. KREC2008 is open to researchers and developers and will be used for the development and evaluation of relation extraction systems.

Extraction of Protein-Protein Interactions based on Convolutional Neural Network (CNN) (Convolutional Neural Network (CNN) 기반의 단백질 간 상호 작용 추출)

  • Choi, Sung-Pil
    • KIISE Transactions on Computing Practices / v.23 no.3 / pp.194-198 / 2017
  • In this paper, we propose a revised Deep Convolutional Neural Network (DCNN) model to extract Protein-Protein Interactions (PPIs) from the scientific literature. The proposed method improves performance by applying various global features in addition to the simple lexical features used in conventional relation extraction approaches. In experiments using AIMed, the best-known collection for PPI extraction, the proposed model achieves a state-of-the-art score (78.0 F-score), the best performance reported so far in this domain. The paper also shows that convolutional neural networks with embeddings can achieve superior PPI extraction (PPIE) performance without feature engineering based on complicated language processing.
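
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how an F-score is computed from counts of true positives, false positives, and false negatives. The abstract does not say which variant was used, so the balanced F1 (`beta=1.0`) is an assumption here, and the function name and any counts passed to it are illustrative, not figures from the paper.

```python
def f_score(true_pos, false_pos, false_neg, beta=1.0):
    """F-beta score from raw counts; beta=1.0 gives the balanced F1."""
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    beta_sq = beta * beta
    # Weighted harmonic mean of precision and recall.
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)
```

Because it is a harmonic mean, the score is pulled toward the lower of precision and recall, so a system cannot reach a high F-score by extracting every candidate pair or by extracting almost none.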

A study of the preparation And procedures by Smartphone Mobile Forensic evidence collection and analysis (스마트폰 모바일 포렌식 증거 수집 분석을 위한 준비사항 및 절차 연구)

  • Lee, Jae-Hyun;Park, Dea-Woo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2011.10a / pp.269-272 / 2011
  • Lawsuits increasingly involve smartphones, and smartphone data is increasingly being presented as evidence in court. Research is therefore needed on forensic procedures for extracting data and collecting evidence from smartphones used illegally. This paper proposes a smartphone forensic procedure for data extraction. It also describes how to collect forensic evidence from smartphones while ensuring the integrity of digital evidence, and how this can help resolve the cases under investigation. This study is expected to contribute to the development of smartphone forensics.

Collection and Extraction Algorithm of Field-Associated Terms (분야연상어의 수집과 추출 알고리즘)

  • Lee, Sang-Kon;Lee, Wan-Kwon
    • The KIPS Transactions:PartB / v.10B no.3 / pp.347-358 / 2003
  • A field-associated term (FT) is a single or compound word that occurs in documents and makes it possible to recognize the field of a text using common human knowledge. For example, a reader recognizes the field of a document on encountering a word such as 'pitcher' or 'election'. We propose an efficient method for constructing FTs, specialized by field, in order to determine the field of a text. A document classification scheme can be fixed from a well-classified document database or corpus. Focusing on a field, we discuss the levels and stability ranks of FTs. To construct a balanced FT collection, we construct a collection of single FTs. From these collections, the levels and stability ranks of FTs can be constructed automatically. We propose a new FT extraction algorithm for document classification that uses the concentration rate and occurrence frequency of FTs.
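
The idea of a concentration rate can be illustrated with a small sketch. The abstract does not define the measure, so this assumes one plausible reading: the share of a term's total occurrences that fall in a single field, with a term treated as associated with a field when that share reaches a threshold. The function names, the threshold value, and the field labels in the usage note are all hypothetical.

```python
from collections import Counter


def concentration_rates(term, field_term_counts):
    """Share of a term's occurrences per field (assumed definition).

    field_term_counts maps a field name to a Counter of term frequencies
    taken from documents already classified into that field.
    """
    freqs = {field: counts.get(term, 0)
             for field, counts in field_term_counts.items()}
    total = sum(freqs.values())
    if total == 0:
        return {}
    return {field: freq / total for field, freq in freqs.items()}


def associated_field(term, field_term_counts, threshold=0.9):
    """Return the field a term concentrates in, or None if it is too spread out."""
    rates = concentration_rates(term, field_term_counts)
    if not rates:
        return None
    field, rate = max(rates.items(), key=lambda item: item[1])
    return field if rate >= threshold else None
```

With illustrative counts such as `{"sports": Counter({"pitcher": 9}), "politics": Counter({"pitcher": 1, "election": 10})}`, the term 'pitcher' concentrates in the hypothetical "sports" field and 'election' in "politics", while a term split evenly between fields would not be assigned to either.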

The Discontinuities Extraction and Analysis of Rock Slope by 3D Image (3차원영상에 의한 암반사면의 불연속면 추출 및 분석)

  • 강준묵;김위현;박준규
    • Proceedings of the Korean Society of Surveying, Geodesy, Photogrammetry, and Cartography Conference / 2003.10a / pp.163-167 / 2003
  • Because digital photogrammetry can acquire large amounts of three-dimensional data quickly and with uniform accuracy, and the data can be used in modelling, its practical use as a data collection method for GIS is increasing in various fields. This study aimed to create a 3D image with a coordinate system and to use it to acquire position information for objects. To test the practical use of the three-dimensional image and examine its measurement accuracy, it was applied to the extraction and measurement of discontinuities in a rock slope. In this way, the feasibility of creating three-dimensional images and acquiring spatial information was examined.

Automatic Extraction of Fractures and Their Characteristics in Rock Masses by LIDAR System and the Split-FX Software (LIDAR와 Split-FX 소프트웨어를 이용한 암반 절리면의 자동추출과 절리의 특성 분석)

  • Kim, Chee-Hwan;Kemeny, John
    • Tunnel and Underground Space / v.19 no.1 / pp.1-10 / 2009
  • Site characterization for structural stability in rock masses mainly involves the collection of joint property data, and in current practice much of this data is collected by hand directly at exposed slopes and outcrops. There are many issues with collecting this data in the field, including safety, slope access, field time, insufficient data quantity, reusability of data, and human bias. It is shown that information on joint orientation, spacing, and roughness in rock masses can be automatically extracted from LIDAR (light detection and ranging) point clouds using the currently available Split-FX point cloud processing software, thereby reducing processing time and mitigating safety and human bias issues.