• Title/Summary/Keyword: Data extraction

A Comparison of Data Extraction Techniques and an Implementation of Data Extraction Technique using Index DB -S Bank Case- (원천 시스템 환경을 고려한 데이터 추출 방식의 비교 및 Index DB를 이용한 추출 방식의 구현 -ㅅ 은행 사례를 중심으로-)

  • 김기운
    • Korean Management Science Review / v.20 no.2 / pp.1-16 / 2003
  • Previous research on data extraction and integration for data warehousing has concentrated mainly on relational DBMSs and, to a lesser extent, on object-oriented DBMSs. It mostly addresses change data (delta) capture and incremental update using the triggering technique of active database systems. Little attention has been paid to data extraction from other types of source systems, such as hierarchical DBMSs, or from source systems without triggering capability. This paper argues, from a practical point of view, that finding appropriate data extraction techniques for different source systems requires considering not only the types of information sources and the capabilities of ETT tools but also other factors of the source systems: operational characteristics (i.e., whether they support a DBMS log, a user log, no log, or timestamps) and DBMS characteristics (i.e., whether they have triggering capability). Having applied several data extraction techniques (e.g., DBMS log, user log, triggering, timestamp-based extraction, file comparison) to S bank's source systems (e.g., IMS, DB2, ORACLE, and SAM files), we found that the data extraction techniques available in a commercial ETT tool do not completely support data extraction from the DBMS log of an IMS system. For such IMS systems, we propose a new data extraction technique that first creates an Index database and then updates the data warehouse using the Index database. We illustrate this technique with an example application.
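
Of the techniques listed in this abstract, file comparison is the simplest to illustrate. The sketch below is not the paper's implementation; it assumes two snapshots of a source table held in memory as dictionaries keyed by primary key, and the function name `snapshot_diff` and the sample account data are hypothetical.

```python
from typing import Any, Dict, Tuple

Row = Dict[str, Any]
Snapshot = Dict[int, Row]  # primary key -> row contents (assumed layout)


def snapshot_diff(old: Snapshot, new: Snapshot) -> Tuple[Snapshot, Snapshot, Snapshot]:
    """Compare two snapshots and return (inserts, updates, deletes)."""
    # Keys present only in the new snapshot are inserts.
    inserts = {k: v for k, v in new.items() if k not in old}
    # Keys present only in the old snapshot are deletes.
    deletes = {k: v for k, v in old.items() if k not in new}
    # Keys present in both snapshots with different contents are updates.
    updates = {k: v for k, v in new.items() if k in old and old[k] != v}
    return inserts, updates, deletes


# Hypothetical account balances from two consecutive extraction runs.
old = {1: {"balance": 100}, 2: {"balance": 50}, 3: {"balance": 75}}
new = {1: {"balance": 100}, 2: {"balance": 65}, 4: {"balance": 10}}
inserts, updates, deletes = snapshot_diff(old, new)
```

Only the three delta sets would then be applied to the warehouse, which is what distinguishes incremental update from a full reload. Comparing full snapshots needs no log or trigger support from the source system, at the cost of reading both snapshots in full.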

Extraction of Expert Knowledge Based on Hybrid Data Mining Mechanism (하이브리드 데이터마이닝 메커니즘에 기반한 전문가 지식 추출)

  • Kim, Jin-Sung
    • Journal of the Korean Institute of Intelligent Systems / v.14 no.6 / pp.764-770 / 2004
  • This paper presents a hybrid data mining mechanism that extracts expert knowledge from historical data and extends the reasoning capabilities of expert systems by using a fuzzy neural network (FNN)-based learning and rule extraction algorithm. The mechanism combines association rule extraction, FNN learning, and a fuzzy rule extraction algorithm. Most traditional data mining mechanisms depend on association rule extraction algorithms. However, basic association rule-based data mining systems have no learning ability, so it is difficult to extend the knowledge base adaptively. In addition, sequential patterns of association rules cannot represent the complicated fuzzy logic of the real world. To resolve these problems, we propose a hybrid data mining mechanism consisting of four phases. First, we use a general association rule mining mechanism to develop an initial rule base. Second, we adopt the FNN learning algorithm to extract the hidden relationships or patterns embedded in the historical data. Third, after the FNN has been trained, the fuzzy rule extraction algorithm is used to extract the implicit knowledge from the FNN. Fourth, we combine the association rules (the initial rule base) with the fuzzy rules. Implementation results show that the hybrid data mining mechanism can reflect both association rule-based knowledge extraction and FNN-based knowledge extension.
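
The first phase builds an initial rule base from association rules, which are usually ranked by support and confidence. The sketch below shows only those two standard measures on hypothetical transactions; it is not the paper's algorithm, and the FNN phases are not covered.

```python
from typing import Iterable, List, Set


def support(transactions: List[Set[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(transactions: List[Set[str]],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """Support of (antecedent and consequent) divided by support of antecedent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical transactions, for illustration only.
transactions = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
sup_ab = support(transactions, {"a", "b"})         # 2 of 4 transactions
conf_a_b = confidence(transactions, {"a"}, {"b"})  # 2 of the 3 containing "a"
```

A rule such as a → b would enter the initial rule base only if both values exceed thresholds chosen by the analyst.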

AUTOMATIC ROAD NETWORK EXTRACTION USING LIDAR RANGE AND INTENSITY DATA

  • Kim, Moon-Gie;Cho, Woo-Sug
    • Proceedings of the KSRS Conference / 2005.10a / pp.79-82 / 2005
  • The demand for road data continues to grow in industrial society, and roads are being repaired and newly constructed in many areas. As national, city, and regional development proceeds, acquiring and updating road data for GIS (Geographical Information System) becomes essential. In this study, road extraction uses a fusion of range data (3D ground coordinate data) and intensity data from stand-alone LiDAR data, after which digital image processing methods are applied. LiDAR intensity data is still an active subject of study, and this study shows the feasibility of road extraction using intensity data. Because intensity and range data are acquired at the same time, LiDAR avoids the problems of multi-sensor data fusion. Intensity data has the further advantages that it is already geocoded, is at real-world scale, and can be used to produce an ortho-photo. Finally, quantitative and qualitative analyses are presented by comparing the extracted road image with a 1:1,000 digital map.

Extraction Transformation Transportation (ETT) system Design and implementation for extracting heterogeneous Data on Data Warehouse (데이터웨어하우스에서 이질적 형태를 가진 데이터의 추출을 위한 Extraction Transformation Transportation(ETT) 시스템 설계 및 구현)

  • 여성주;왕지남
    • Journal of Korean Society of Industrial and Systems Engineering / v.24 no.67 / pp.49-60 / 2001
  • A data warehouse (DW) manages all the information in an enterprise and offers specific information to users. However, it can be difficult to develop an effective DW system because of the variety of computing facilities, databases, and operating systems. Heterogeneous system environments make it harder to extract data and to provide proper information to users in real time. Data inconsistency in non-integrated legacy systems is also common, which requires effective and efficient control of the data extraction flow as well as data cleansing. We design an integrated automatic ETT (Extraction Transformation Transportation) system to control the data extraction flow and suggest an implementation methodology. Detailed analysis and design are given to specify the proposed ETT approach, together with a real implementation.

DATA MINING-BASED MULTIDIMENSIONAL EXTRACTION METHOD FOR INDICATORS OF SOCIAL SECURITY SYSTEM FOR PEOPLE WITH DISABILITIES

  • BATYHA, RADWAN M.
    • Journal of applied mathematics & informatics / v.40 no.1_2 / pp.289-303 / 2022
  • This article examines the multidimensional index extraction method of the disability social security system based on data mining. While creating the data warehouse of the social security system for the disabled, we need to know the elements of the social security indicators for the disabled. In this context, a clustering algorithm was used to extract the indicators of the social security system for the disabled by investigating the historical dimension of social security for the disabled. The simulation results show that the index extraction method has high coverage, sensitivity and reliability. In this paper, a multidimensional extraction method is introduced to extract the indicators of the social security system for the disabled based on data mining. The simulation experiments show that the method presented in this paper is more reliable, and the indicators of social security system for the disabled extracted are more effective in practical application.

A Study on Ion Extraction Characteristics of Ceramics by Cleaning Agents (보존처리용 세척제에 대한 토기의 이온용출 특성연구)

  • Park, Dae-Woo;Kang, Hyun-Mi;Nam, Byeong-Jik;Jang, Sung-Yoon;Ham, Chul-Hee
    • 보존과학연구 / s.31 / pp.43-57 / 2010
  • This study provides quantitative data on the extraction characteristics of the major elements of earthenware, obtained by soaking tests in cleaning agents. It aims to provide basic data for stability assessment when cleaning agents are applied in the conservation of earthenware. The data are derived from an analysis of the correlation between the physical characteristics and the ion extraction characteristics. XRD analysis showed that AT-1, AT-2 and AT-3, which did not generate mullite, were fired below 1,000 °C, whereas AT-3 and AT-5, which included mullite, were fired above 1,000 °C. The degree of absorption was in the order AT-4 > AT-2 > AT-1 > AT-3 > AT-5, and the degree of absorption was positively correlated with the firing temperature of the earthenware. The extraction amount for oxalic acid, which is used to remove iron oxide, was in the order AT-1 > AT-2 AT-4 > AT-3 > AT-5, and the ion extraction data showed a positive correlation with absorption level. However, AT-1 and AT-2, which were fired at lower temperatures, showed no correlation between ion extraction characteristics and absorption level. Citric acid extracted fewer ions than oxalic acid and caused less damage to the earthenware when applied. The ion extraction levels in the absorption test showed that Fe was extracted at higher levels than Si and Al with both oxalic acid and citric acid. Based on regression analysis of the data from previous studies, the physical characteristics of the earthenware, and the ion extraction levels, further studies will address techniques for predicting the extraction characteristics of the major elements of earthenware samples for conservation.

Distinctive point extraction and recognition algorithm for counters for the various kinds of bank notes

  • Joe, Yong-won;An, Eung-seop;Lee, Jae-kang;Kim, Il-hwan
    • 제어로봇시스템학회:학술대회논문집 / 2002.10a / pp.90.1-90 / 2002
  • Counters for the various kinds of bank notes require high-speed distinctive point extraction and recognition for notes. In this paper we propose a new point extraction and recognition algorithm for bank notes. For distinctive point extraction we use a coordinate data extraction method from specific parts of a bank note representing the same color. The recognition algorithm uses a back-propagation neural network that has coordinate data input. The proposed algorithm is designed to minimize recognition time.

Solvent Extraction of Cobalt Chloride from Strong Hydrochloric Acid Solutions by Alamine336 (진한 염산용액에서 Alamine336에 의한 염화코발트의 용매추출)

  • Lee, Man-seung;Lee, Jin-Young
    • Korean Journal of Metals and Materials / v.46 no.4 / pp.227-232 / 2008
  • The solvent extraction reaction of cobalt by Alamine336 from strong hydrochloric acid solution was identified by analyzing solvent extraction data reported in the literature. Graphical analysis of the data revealed that Alamine336 took part in the solvent extraction reaction as a monomer in the concentration ranges [Co(II)]: 0.0169-0.102 M, [Alamine336]: 0.02-1.75 M, and [HCl]: 5-10 M. The following solvent extraction reaction and equilibrium constant were obtained from the experimental data by considering the activity coefficients of the chemical species present in the aqueous phase: $Co^{2+}+2Cl^{-}+R_3NHCl_{org}=CoCl_3\;R_3NH_{org}$, $K_{ex}=2.21$. The distribution coefficients of cobalt predicted in this study agreed well with those reported in the literature.

An Efficient Damage Information Extraction from Government Disaster Reports

  • Shin, Sungho;Hong, Seungkyun;Song, Sa-Kwang
    • Journal of Internet Computing and Services / v.18 no.6 / pp.55-63 / 2017
  • One of the purposes of Information Technology (IT) is to support the human response to natural and social problems, such as natural disasters and the spread of disease, and to improve the quality of human life. Climate change is now occurring worldwide, natural disasters threaten the quality of life, and human safety is no longer guaranteed. IT must be able to support tasks related to disaster response and, more importantly, should be used to predict and minimize future damage. In South Korea, damage data is collected by each local government and then aggregated by the federal government. This data is included in the disaster reports that the federal government discloses for each disaster case, but raw damage data is difficult to obtain, even for research purposes. To obtain the data, information extraction can be applied to the disaster reports. In the field of information extraction, most extraction targets are web documents, commercial reports, SNS text, and so on, and there is little research on information extraction from government disaster reports. These reports are mostly text, but the structure of each sentence is very different from that of news articles and commercial reports, so the features of government disaster reports must be carefully considered. This paper presents an information extraction method for South Korean government reports in the word format. The method is based on patterns and dictionaries and provides some additional ideas for tokenizing the damage representations in the text. In the experiment, the method achieves an F1 score of 80.2 on the test set, which is close to the state-of-the-art information extraction performance that preceded recent deep learning algorithms.

Automatic Building Extraction Using LIDAR Data

  • Cho, Woo-Sug;Jwa, Yoon-Seok
    • Proceedings of the KSRS Conference / 2003.11a / pp.1137-1139 / 2003
  • This paper proposes a practical method for building detection and extraction using airborne laser scanning data. The proposed method consists mainly of two processes: a low-level process and a high-level process. The major distinction from previous approaches is that we introduce the concept of a pseudo-grid (or binning) into raw laser scanning data to avoid the loss of information and accuracy due to interpolation, to define the adjacency of neighboring laser points, and to speed up processing. The approach proceeds through pseudo-grid generation, noise removal, segmentation, grouping for building detection, linearization and simplification of building boundaries, and building extraction in 3D vector format. For efficient processing, each step changes the domain of the input data (point or pseudo-grid) accordingly. The experimental results show that the proposed method is promising.
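
The pseudo-grid idea can be sketched as plain binning: each raw point is stored, uninterpolated, in the cell that contains it, and adjacency is defined between cells. The code below is an illustrative assumption rather than the authors' implementation; the function names, the cell size, and the sample points are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]  # (x, y, z)
Cell = Tuple[int, int]              # (column, row) index of a grid cell


def build_pseudo_grid(points: List[Point], cell_size: float) -> Dict[Cell, List[Point]]:
    """Bin raw points into square cells, keeping the original coordinates."""
    grid: Dict[Cell, List[Point]] = defaultdict(list)
    for x, y, z in points:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        grid[cell].append((x, y, z))
    return grid


def occupied_neighbors(grid: Dict[Cell, List[Point]], cell: Cell) -> List[Cell]:
    """Return the occupied cells among the 8 cells surrounding `cell`."""
    cx, cy = cell
    return [
        (cx + dx, cy + dy)
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0) and (cx + dx, cy + dy) in grid
    ]


# Hypothetical laser points, binned with a 1.0-unit cell size.
points = [(0.2, 0.3, 10.0), (0.8, 0.1, 10.5), (1.4, 0.2, 3.0), (5.0, 5.0, 2.0)]
grid = build_pseudo_grid(points, 1.0)
```

Because the points are kept as measured, no elevation is interpolated, and a neighbor lookup touches at most eight cells rather than searching the whole point set.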
