• Title/Summary/Keyword: Data-Warehouse

Explanation-based Data Mining in Data Warehouse (데이터 웨어하우스 환경에서의 설명기반 데이터 마이닝)

  • 김현수;이창호
    • Proceedings of the Korea Intelligent Information System Society Conference / 1999.03a / pp.115-123 / 1999
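  • Years of operating information systems across industry have led to the accumulation of large volumes of data. A variety of data mining techniques have been studied to extract useful knowledge from such data. In particular, the emergence of the data warehouse provides the data-supply environment that data mining requires. However, data mining results that have not passed through an expert's judgment and interpretation can produce countless findings that are trivial, spurious, or irrelevant. Therefore, even when data mining results are statistically significant, a verification process and methodology for their validity and usefulness must be established. The most difficult aspect of data mining is that, to eliminate inductive errors, a person must directly interpret and judge the results and also suggest new directions for exploration. This paper proposes a technique for verifying results obtained by association rule mining, one of the data mining techniques, by judging whether they can be explained. It also presents an architecture that generalizes the knowledge verified in this way to generate new hypotheses and verifies the resulting association rules against the data warehouse. We first present the need for explaining data mining results, briefly describe data warehouses and data mining techniques, give the definition and method of association rule mining, and show the data warehouse schema for the target domain. Next, we propose Relational Predicate Logic as a knowledge representation method for expressing both domain knowledge and the results obtained through association rule mining. To explain the mined results, we show that the association rules obtained by association rule mining are explained by domain knowledge expressed in Relational Predicate Logic. Based on this explanation, we also present an iterative Explanation-based Data Mining Architecture that generalizes verified knowledge to deductively generate new hypotheses, verifies them through association rule mining, and thereby obtains new knowledge. The significance of this study is that it guards against inductive errors in inductive knowledge generation through data mining by showing that results can be explained by domain knowledge, and that it uses such explanations to deductively generate new hypothetical knowledge and verify it by hypothesis testing, thereby presenting a data mining approach that integrates inductive and deductive approaches.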
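The verification step described in this abstract can be illustrated with a minimal sketch. It assumes single-item association rules scored by the usual support and confidence measures, and it stands in for the paper's Relational Predicate Logic with a plain set of (antecedent, consequent) pairs; the function names `mine_rules` and `explainable`, the toy transactions, and the thresholds are all hypothetical and not taken from the paper.

```python
from itertools import combinations

def mine_rules(transactions, min_support, min_confidence):
    """Return (lhs, rhs, support, confidence) for single-item rules lhs -> rhs."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})

    def support(itemset):
        # Fraction of transactions that contain every item in itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    rules = []
    for a, b in combinations(items, 2):
        for lhs, rhs in ((a, b), (b, a)):
            s_both = support({lhs, rhs})
            s_lhs = support({lhs})
            if s_lhs == 0:
                continue
            confidence = s_both / s_lhs
            if s_both >= min_support and confidence >= min_confidence:
                rules.append((lhs, rhs, s_both, confidence))
    return rules

def explainable(rule, domain_knowledge):
    """Assumed stand-in for logical explanation: the rule is listed as known."""
    lhs, rhs, _, _ = rule
    return (lhs, rhs) in domain_knowledge

# Hypothetical toy data, not from the paper.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"butter", "milk"},
]
domain_knowledge = {("butter", "bread")}

rules = mine_rules(transactions, min_support=0.5, min_confidence=0.6)
verified = [r for r in rules if explainable(r, domain_knowledge)]
```

Here every mined rule clears the statistical thresholds, but only the one backed by domain knowledge is kept, which is the distinction between statistical significance and explainability that the abstract draws.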

Analysis of prescription frequency of herbs in traditional Korean medicine hospital using electronic medical records

  • Lee, Byung-Wook;Cho, Hyun-Woo;Hwang, Eui-Hyoung;Heo, In;Shin, Byung-Cheul;Hwang, Man-Suk
    • The Journal of Korean Medicine / v.40 no.4 / pp.29-40 / 2019
  • Objectives: To analyze the prescription frequency of various herbs as either individual or major herbs (in terms of dosage) and their usage patterns in the treatment of different diseases for standardization of traditional Korean medicine. Methods: We analyzed the prescription database of patients at the Pusan National University Korean Medicine Hospital from the date of establishment of the hospital to February 2013. The complete prescription data were extracted from the electronic medical records of patients, and the prescription frequencies of individual herbs, particularly of major herbs, were analyzed in terms of gender, age, and International Classification of Diseases (ICD) code. Results: The prescription frequency of individual herbs based on age and gender showed a similar pattern. Herbal mixtures were also distributed in a similar manner. The use of some herbs differed according to age and gender (Table 1). The herbs that were used at high frequencies for a given ICD code had similar usage patterns in different categories. However, some major herbs in the "Jun (King)" category were used uniquely for a given ICD code (Table 2). There was a significant difference between males and females for ICD codes E and N, whereas the other ICD codes showed only small differences. The ratio of herbal medicine by gender showed different usage patterns in each gender. Conclusions: The findings of our study provide fundamental data that reflect the real clinical conditions in South Korea and can therefore contribute to the standardization of TKM.

Informally Patients Prediction Model of Admission Patients (입원환자 데이터를 이용한 예약부도환자 이탈방지 모형 연구)

  • Kim, Eun-Yeob;Ham, Sung-Woo
    • Journal of the Korea Academia-Industrial cooperation Society / v.10 no.11 / pp.3465-3472 / 2009
  • This study uses a medical record data warehouse collected from hospital information systems. Of the patients, 2,118 (60.5%) were continuous patients and 1,385 (39.5%) were informally patients. The factors used were sex, age, area, insurance, admission course, medical treatment, out-patient lesson, out-patient form, conference diagnosis, operation, cancer, and medical reservation. In predictive modeling of informally patients with data mining techniques, the fitness was 66.0% with logistic regression, 66.72% with a neural network, and 63.25% with CHAID. The predictive model for informally patients is expected to help the hospital through continuous patient management and greater trust in the hospital.

Sensitivity Analysis of Wind Resource Micrositing at the Antarctic King Sejong Station (남극 세종기지에서의 풍력자원 국소배치 민감도 분석)

  • Kim, Seok-Woo;Kim, Hyun-Goo
    • Journal of the Korean Solar Energy Society / v.27 no.4 / pp.1-9 / 2007
  • A sensitivity analysis of wind resource micrositing was performed through an application case at the Antarctic King Sejong station using the most representative micrositing software packages: WAsP, WindSim, and Meteodyn WT. Wind data obtained from two met-masts 625 m apart were applied as the climatology input condition for micro-scale wind mapping. The tower shading effect on the met-mast installed 20 m from the warehouse was assessed with the CFD software Fluent, which confirmed a negligible influence on wind speed measurement. Theoretically, micro-scale wind maps generated from the two sets of met-data, which are located within the same wind system and are strongly correlated meteor-statistically, should be identical if nothing but orography influenced the wind prediction. They nevertheless show discrepancies due to nonlinear effects induced by the surrounding complex terrain. In the sensitivity comparison, Meteodyn WT, which employs a 1-equation turbulence model, showed a 68% higher RMSE in wind speed prediction than WindSim, which uses the $\kappa$-$\epsilon$ turbulence model, while the linear-theoretical model WAsP showed a 21% higher error. Consequently, the CFD model WindSim would predict the wind field over complex terrain more reliably and with less sensitivity to climatology input data than the other micrositing models. The auto-validation method proposed in this paper and the evaluation results for the micrositing software are expected to serve as a good reference for wind resource assessments in complex terrain.

Location reference technique of ITS Space Database supporting interoperability (상호운용성을 지원하는 ITS 공간 데이터베이스의 위치참조 기법)

  • Kim, Suk-Hee;Choi, Kee-Choo;Jang, Jeong-Ah
    • Journal of Korean Society for Geospatial Information Science / v.12 no.1 s.28 / pp.45-53 / 2004
  • The purpose of this paper is to study a scheme for ITS services that enables the sharing of data (spatial, non-spatial, and image) among heterogeneous systems (various environments) by employing the concept of object orientation, and to present a location reference technique for the ITS spatial DB that supports interoperability. A data warehouse service, query object service, interface object service, and naming object service have been identified for this purpose. In addition, a system framework based on a metadata management object service and a persistent object service has been devised. The proposed skeletal framework is expected to function well in an ITS data sharing environment and to support interoperability.

An Efficient ROLAP Cube Generation Scheme (효율적인 ROLAP 큐브 생성 방법)

  • Kim, Myung;Song, Ji-Sook
    • Journal of KIISE:Databases / v.29 no.2 / pp.99-109 / 2002
  • ROLAP (Relational Online Analytical Processing) is a process and methodology for multidimensional data analysis that is essential for extracting desired data and deriving value-added information from an enterprise data warehouse. To speed up query processing, most ROLAP systems pre-compute summary tables. This process is called 'cube generation', and it mostly involves intensive table sorting stages. (1) showed that it is much faster to generate ROLAP summary tables indirectly using a MOLAP (multidimensional OLAP) cube generation algorithm. In this paper, we present such an indirect ROLAP cube generation algorithm that is fast and scalable. High memory utilization is achieved by slicing the input fact table along one or more dimensions before generating summary tables. High speed is achieved by producing each summary table from its smallest parent. We demonstrate the efficiency of our algorithm through experiments.
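The "smallest parent" idea mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' algorithm: it uses in-memory dictionaries and a SUM measure, it omits the fact-table slicing step, and the names `aggregate`, `build_cube`, and the sample rows are hypothetical.

```python
from itertools import combinations

def aggregate(table, parent_dims, child_dims):
    """Roll up a summary table keyed by parent_dims to the coarser child_dims."""
    idx = [parent_dims.index(d) for d in child_dims]
    out = {}
    for key, total in table.items():
        child_key = tuple(key[i] for i in idx)
        out[child_key] = out.get(child_key, 0) + total
    return out

def build_cube(rows, dims):
    """rows: tuples of (dimension values..., measure). Returns {dims: summary}."""
    base = {}
    for *key, measure in rows:
        base[tuple(key)] = base.get(tuple(key), 0) + measure
    cube = {tuple(dims): base}
    # Work from fine to coarse so every child has its parents available.
    for size in range(len(dims) - 1, -1, -1):
        for child in combinations(dims, size):
            parents = [p for p in cube
                       if len(p) == size + 1 and set(child) <= set(p)]
            # Pick the parent with the fewest rows to minimize work.
            smallest = min(parents, key=lambda p: len(cube[p]))
            cube[child] = aggregate(cube[smallest], smallest, child)
    return cube

# Hypothetical fact rows: (region, product, year, sales).
rows = [
    ("kr", "tv", 2001, 10),
    ("kr", "tv", 2002, 5),
    ("kr", "pc", 2001, 7),
    ("us", "tv", 2001, 3),
]
cube = build_cube(rows, ("region", "product", "year"))
```

Each coarser summary is derived from an already computed summary rather than from the fact table, so the cost of a group-by depends on the size of its chosen parent rather than on the size of the raw input.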

Latent mobility pattern analysis of bus passengers with LDA (LDA 기법을 이용한 버스 승객의 잠재적 이동패턴 분석)

  • Cho, Ah;Lee, Kyung Hee;Cho, Wan Sup
    • Journal of the Korean Data and Information Science Society / v.26 no.5 / pp.1061-1069 / 2015
  • Recently, big data generated in the transportation sector has been widely used in transportation policy making and efficient system management. Bus passengers' mobility patterns provide useful insight for transportation policy makers seeking to optimize bus lines and time intervals in a city. We propose a new methodology for discovering mobility patterns from transportation card data. We first estimate the bus stations where passengers get off, because transportation card data do not include get-off information in most cities. We then apply LDA (Latent Dirichlet Allocation), the most representative topic modeling technique, to discover mobility patterns of bus passengers in Cheong-Ju city. To understand the discovered patterns, we construct a data warehouse and perform multi-dimensional analysis by bus route, region, time period, and mobility pattern (get-on/get-off station). In the case of Cheong-Ju, we discovered mobility pattern 1 from suburban areas to Cheong-Ju terminal, mobility pattern 2 from residential areas to commercial areas, and mobility pattern 3 from school areas to commercial areas.

Localization of a Mobile Robot Using Ceiling Image with Identical Features (동일한 형태의 특징점을 갖는 천장 영상 이용 이동 로봇 위치추정)

  • Noh, Sung Woo;Ko, Nak Yong;Kuc, Tae Yong
    • Journal of the Korean Institute of Intelligent Systems / v.26 no.2 / pp.160-167 / 2016
  • This paper reports a localization method for a mobile robot using ceiling images. The ceiling has landmarks that are not distinguishable from one another. The location of every landmark in the map is given a priori, while the correspondence between a detected landmark and a landmark in the map is not given. Only the initial pose of the robot relative to the landmarks is given. The method uses a particle filter approach for localization. Along with estimating the robot pose, the method also associates each landmark detected in the ceiling image with a landmark in the map. The method is tested in an indoor environment that has circular landmarks on the ceiling. The test verifies the feasibility of the method in an environment where range data to walls or beacons are unavailable or severely corrupted by noise. The method is useful for localization in a warehouse where measurements from a laser range finder and range data to RF or ultrasonic beacons have large uncertainty.

Full-text databases as a means for resource sharing (자원공유 수단으로서의 전문 데이터베이스)

  • 노진구
    • Journal of Korean Library and Information Science Society / v.24 / pp.45-79 / 1996
  • Rising publication costs and declining financial resources have renewed librarians' interest in resource sharing. Although the idea of sharing resources is not new, there is a sense of urgency not seen in the past. Driven by rising publication costs and static, often shrinking budgets, librarians are embracing resource sharing as an idea whose time may finally have come. Resource sharing in electronic environments is shifting the concept of the library from a warehouse of print-based collections to a point of access to needed information. Much of the library's material will be delivered in electronic form or printed. In this new paradigm, libraries cannot be expected to support research from their own collections alone. These changes, along with improved communications, computerization of administrative functions, fax and digital delivery of articles, and advances in data storage technologies, are improving the procedures and means for delivering needed information to library users. For resource sharing to be truly effective and efficient, however, automation and data communication are essential. This article examines the possibility of using full-text online databases as a supplement to, and a means of, interlibrary loan for document delivery. The findings of the study can be summarized as follows. First, the turn-around time and cost of getting a hard copy of a journal article from online full-text databases were comparable to those of other document delivery services. Second, the use of full-text online databases should be considered as a method for promoting interlibrary loan services, as it is more cost-effective and labour-saving. Third, for full-text databases to work as a document delivery system, the databases must contain as many periodicals as possible and be loaded on as many systems as possible. Fourth, to include many scholarly research journals in full-text databases, we need guidelines covering electronic document delivery and electronic reserves. Fifth, for a database to be truly full-text, more advanced information technologies are needed.

Structural Safety Analysis and Reinforcement for Weak Area of the Coal Silo Tunnel using Finite Elements Analysis (유한요소해석을 이용한 Coal Silo Tunnel 취약부위의 구조안전성 분석 및 구조보강)

  • Lee, Hyun-Woo;Jung, Sung-Yuen;Song, Se-Arm;Kim, Min-Soo;Kim, Jin-Hyung;Kim, Chul
    • Journal of the Korean Society for Precision Engineering / v.29 no.4 / pp.461-468 / 2012
  • A silo is a warehouse for storing granular materials such as grain, cement, petroleum compounds, and coal. Compared with other warehouses, a silo uses space efficiently. A coal silo consists of a silo, a tunnel, and an extractor. Of these, the tunnel lacks sufficient studies and design data, and its design depends heavily on trial and error by field engineers with several years of experience. Recently, silos have been built at large sizes, and the tunnel is in danger of severe cracking and collapse under the huge load of coal. It is therefore necessary to analyze the structural safety of the tunnel. In this study, the problems of the tunnel are analyzed using field data, and structurally weak areas are reinforced using FE analysis to design a tunnel that satisfies structural safety. From the FE analysis, a reinforced model that does not exceed the yield strength of the material has been proposed.