• Title/Summary/Keyword: datawarehouse

21 search results

A Study on Top-down and Bottom-up Approaches to Datawarehouse Construction - An Approach through Methodological Review (데이터 웨어하우스 구축을 위한 상향식·하향식 접근 방법에 관한 연구 - 방법론적 고찰을 통한 접근)

  • 박주석;전대성;박진휘
    • Proceedings of the Korean Operations and Management Science Society Conference / 1998.10a / pp.15-18 / 1998
  • Recently, the domestic datawarehouse market has accounted for a major share of information technology investment, alongside ERP (Enterprise Resource Planning) and EC (Electronic Commerce) construction projects, despite the business slowdown caused by the IMF crisis. Accordingly, datawarehouses have been constructed or are under construction in industries such as finance, communications, manufacturing, and distribution, and success stories have emerged from the enterprises that built them early. This study discusses datawarehouse construction methodologies in order to compare and analyze the top-down and bottom-up approaches used in datawarehouses built by actual enterprises, and is intended to serve as an index and direction for future empirical studies.


Improvement of Datawarehouse Development Process by Applying the Configuration Management of CMMI (CMMI의 형상관리를 적용한 데이터웨어하우스 개발 프로세스의 개선)

  • Park Jong-Mo;Cho Kyung-San
    • The KIPS Transactions:PartD / v.13D no.4 s.107 / pp.625-632 / 2006
  • A datawarehouse, which extracts massive amounts of analysis data from operational servers and stores it, is a decision support tool for which data quality and processing time are critical. It is therefore necessary to standardize and improve the datawarehouse development process in order to stabilize data quality and raise productivity. We propose a novel improved process for datawarehouse development that applies the configuration management practices of CMMI (Capability Maturity Model Integration), which has become a major force in software process improvement. In addition, we specify metrics for evaluating the datawarehouse development process. Through comparative analysis with existing processes, we show that our proposal is more efficient in cost and productivity and also improves data quality and reusability.
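
One core configuration-management practice the abstract invokes is baseline control of development artifacts. A minimal sketch of that idea applied to ETL scripts (hypothetical names and structure; the paper does not publish code) is auditing current artifacts against a recorded baseline of checksums:

```python
import hashlib

def checksum(text: str) -> str:
    """Hash an ETL artifact's content for baseline comparison."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def audit(baseline: dict, current: dict) -> dict:
    """Compare current artifacts against a recorded baseline.

    baseline maps artifact name -> checksum; current maps name -> content.
    Returns each artifact's status: 'ok', 'modified', 'missing', or 'added'.
    """
    report = {}
    for name, digest in baseline.items():
        if name not in current:
            report[name] = "missing"
        elif checksum(current[name]) != digest:
            report[name] = "modified"
        else:
            report[name] = "ok"
    for name in current.keys() - baseline.keys():
        report[name] = "added"   # new artifact not yet under baseline control
    return report

# Hypothetical example: a load script changed after its baseline was taken
baseline = {"load_sales.sql": checksum("INSERT INTO dw_sales SELECT * FROM ods_sales;")}
current = {"load_sales.sql": "INSERT INTO dw_sales SELECT * FROM ods_sales WHERE dt = :dt;"}
print(audit(baseline, current))
```

Such an audit is only one small piece of CMMI configuration management, but it illustrates the kind of change control the proposed process standardizes.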

Control of metadata schema conflicts for internet datawarehouse (인터넷 데이터웨어하우스 구축을 위한 메타데이터 스키마 충돌 제어)

  • Kim, Byung-Gon
    • Journal of Digital Contents Society / v.8 no.4 / pp.499-507 / 2007
  • With increasing user demand for Internet web services, the importance of Internet datawarehouses that support users' decision making is growing. Early Internet datawarehouses were studied in forms that used existing databases and XML; however, because of the limited expressive power of that approach, systems have gradually shifted to metadata schemas such as RDFS. Given the distributed environment of the Internet, integrating the individual metadata schemas into one global schema and storing it is important. However, semantic and structural conflicts can arise between different schemas in this situation, and they must be controlled. In this paper, we analyze the conflicts that occur when distributed metadata schemas are integrated, and we propose a conflict resolution technique for efficient Internet datawarehouse query processing.
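
The two conflict classes named in the abstract can be made concrete with a small sketch (hypothetical representation and synonym table, not the paper's technique): schemas as property-to-datatype maps, a semantic conflict when two schemas name the same concept differently, and a structural conflict when a shared property carries different datatypes:

```python
# Assumed synonym table mapping local property names to a canonical name
SYNONYMS = {"cost": "price", "qty": "quantity"}

def normalize(name: str) -> str:
    """Resolve a property name to its canonical form via the synonym table."""
    return SYNONYMS.get(name, name)

def find_conflicts(schema_a: dict, schema_b: dict) -> list:
    """Detect semantic (naming) and structural (datatype) conflicts
    between two metadata schemas given as {property: datatype} maps."""
    conflicts = []
    norm_b = {normalize(k): (k, v) for k, v in schema_b.items()}
    for prop, dtype in schema_a.items():
        match = norm_b.get(normalize(prop))
        if match is None:
            continue  # property exists in only one schema: no conflict
        other_name, other_type = match
        if prop != other_name:
            conflicts.append(("semantic", prop, other_name))
        if dtype != other_type:
            conflicts.append(("structural", prop, f"{dtype} vs {other_type}"))
    return conflicts

a = {"price": "float", "quantity": "int"}
b = {"cost": "decimal", "quantity": "int"}
print(find_conflicts(a, b))
# "price"/"cost" clash semantically; their datatypes also clash structurally
```

Detection is only the first half of the paper's problem; its contribution is the resolution policy applied once such conflicts are found.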


Design and Analysis of Metrics for Enhancing Productivity of Datawarehouse (데이터웨어하우스의 개발생산성 향상을 위한 측정지표의 설계 및 분석)

  • Park, Jong-Mo;Cho, Kyung-San
    • Journal of Internet Computing and Services / v.8 no.5 / pp.151-160 / 2007
  • A datawarehouse, which extracts and stores massive amounts of analysis data, is used for marketing and business decision support. However, because it integrates vast amounts of data from distributed environments, a datawarehouse suffers from increasing processing time and cost and carries a high risk of process errors. We therefore propose metrics for measurement in the areas of productivity, process quality, and data quality. Through an evaluation using the proposed metrics, we also show that our proposal yields productivity enhancement and process improvement.
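
The three measurement areas the abstract names can be illustrated with a toy metric set (illustrative formulas and names only; the paper's actual metric definitions are not reproduced here):

```python
def dw_metrics(rows_loaded: int, load_hours: float,
               failed_jobs: int, total_jobs: int, bad_rows: int) -> dict:
    """Toy metrics in the three areas the paper covers:
    productivity, process quality, and data quality."""
    return {
        # productivity: how much data the process moves per unit time
        "productivity_rows_per_hour": rows_loaded / load_hours,
        # process quality: fraction of ETL jobs that completed successfully
        "process_quality": 1 - failed_jobs / total_jobs,
        # data quality: fraction of loaded rows that passed validation
        "data_quality": 1 - bad_rows / rows_loaded,
    }

m = dw_metrics(rows_loaded=1_200_000, load_hours=4,
               failed_jobs=2, total_jobs=50, bad_rows=600)
print(m)
```

Tracking such ratios release over release is what turns a one-off measurement into the process-improvement baseline the paper argues for.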


Design and Application of Metadata Schema in Datawebhouse System (데이터웹하우스 시스템에서 메타데이터 스키마의 설계 및 활용)

  • Park, Jong-Mo;Cho, Kyung-San
    • The KIPS Transactions:PartD / v.14D no.6 / pp.701-706 / 2007
  • A datawebhouse combines web log analysis, used for customer management, with a datawarehouse, used for decision support. However, a datawebhouse requires complex management operations to transform and integrate data from heterogeneous data sources and distributed systems. We propose a metadata schema that enables the data integration and data management essential in datawebhouse environments. We show that the proposed schema supports datawebhouse development and enables integrated asset management of business information, and that ETL metadata for web log extraction improves the processing time of web logs.
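
The web-log extraction step that the ETL metadata drives can be sketched as follows (a hypothetical illustration assuming a common combined-log format; the paper's own metadata schema is not reproduced):

```python
import re

# Assumed extraction rule for one web-log source, the kind of detail
# an ETL metadata record would describe (field names, order, format)
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)

def extract(line: str):
    """Parse one raw log line into named fields, or None if malformed."""
    m = LOG_PATTERN.match(line)
    return m.groupdict() if m else None

line = '10.0.0.1 - - [21/Nov/2007:10:12:01 +0900] "GET /index.html HTTP/1.1" 200 5120'
rec = extract(line)
print(rec)
```

Keeping the pattern in metadata rather than in code means a new log source only needs a new metadata record, which is the maintainability argument the abstract makes.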

Cross Compressed Replication Scheme for Large-Volume Column Storages (대용량 컬럼 저장소를 위한 교차 압축 이중화 기법)

  • Byun, Siwoo
    • Journal of the Korea Academia-Industrial cooperation Society / v.14 no.5 / pp.2449-2456 / 2013
  • Column-oriented database storage is a well-suited model for large-volume data analysis systems because of its superior I/O performance. Traditional data storage uses a row-oriented layout, in which the attributes of a record are placed contiguously on disk for fast write operations; for search-mostly datawarehouse systems, however, column-oriented storage has become the more appropriate model because of its superior read performance. Recently, solid state drives based on MLC flash memory have become widely recognized as the preferred storage media for high-speed data analysis systems. In this paper, we introduce a fast column-oriented data storage model and then propose a new storage management scheme that uses cross compressed replication for high-speed column-oriented datawarehouse systems. Our scheme, based on two MLC SSDs, achieves high performance and reliability by cross-replicating an uncompressed segment and a compressed segment under heavy CPU and I/O workloads. Based on the results of our performance evaluation, we conclude that the scheme outperforms the traditional scheme in update throughput and in the response time of column segments.
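
The cross-replication idea can be sketched in a few lines (a simplified Python illustration with in-memory dicts standing in for the two SSDs; the alternation rule and names are assumptions, not the paper's implementation):

```python
import zlib

class CrossReplicatedStore:
    """Sketch of cross compressed replication: each column segment is kept
    uncompressed on one of two devices and as a compressed replica on the
    other, alternating by segment id so neither device holds only replicas."""

    def __init__(self):
        self.devices = [dict(), dict()]  # stand-ins for two MLC SSDs

    def write(self, seg_id: int, data: bytes) -> None:
        primary = seg_id % 2
        self.devices[primary][seg_id] = data                      # fast-read copy
        self.devices[1 - primary][seg_id] = zlib.compress(data)   # compact replica

    def read(self, seg_id: int, prefer_device: int = None) -> bytes:
        primary = seg_id % 2
        dev = primary if prefer_device is None else prefer_device
        raw = self.devices[dev][seg_id]
        # reading from the replica device requires decompression
        return raw if dev == primary else zlib.decompress(raw)

store = CrossReplicatedStore()
store.write(0, b"col-segment-0" * 100)
```

The trade-off the scheme exploits: the uncompressed copy serves reads with no CPU cost, while the compressed replica provides redundancy at reduced space and write-bandwidth cost.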

Design of Datawarehouse Real-Time Cleansing System using XMDR (XMDR을 이용한 데이터웨어하우스 실시간 데이터 정제 시스템 설계)

  • Song, Hong-Youl;Jung, Kye-Dong;Choi, Young-Keum
    • Journal of the Korea Institute of Information and Communication Engineering / v.14 no.8 / pp.1861-1867 / 2010
  • A datawarehouse is generally used in organizations for decision and policy making. In a distributed environment, however, adding a new system requires considerable time and cost because of the differences between systems. To solve this, first, heterogeneous data structures are handled by creating abstract queries according to the standard schema and splitting those queries using XMDR (eXtended Master Data Registry). Second, a metadata dictionary, which defines synonyms of metadata and methods of data expression, is used to overcome differences in how data is defined and expressed. In particular, the work presented in this paper uses XMDR to provide standardized information for data integration in distributed environments, minimizing the effect of integration on local systems while creating datawarehouse information in real time.
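
The query-splitting step can be illustrated with a toy sketch (hypothetical dictionary contents and system names; a stand-in for, not a reproduction of, the XMDR mechanism): a column list written against the standard schema is rewritten into each local system's vocabulary:

```python
# Hypothetical metadata dictionary: standard column -> per-system local name
DICTIONARY = {
    "customer_id": {"crm": "cust_no", "billing": "client_id"},
    "amount":      {"crm": "amt",     "billing": "charge_amount"},
}

def split_query(columns: list, systems: list) -> dict:
    """Rewrite one abstract (standard-schema) column list into a
    per-system column list, resolving naming heterogeneity."""
    return {
        sys: [DICTIONARY[col][sys] for col in columns]
        for sys in systems
    }

print(split_query(["customer_id", "amount"], ["crm", "billing"]))
```

Because only the dictionary knows the local names, adding a new source system means adding dictionary entries rather than changing existing systems, which is where the claimed cost and time savings come from.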

Search Performance Improvement of Column-oriented Flash Storages using Segmented Compression Index (분할된 압축 인덱스를 이용한 컬럼-지향 플래시 스토리지의 검색 성능 개선)

  • Byun, Siwoo
    • Journal of the Korea Academia-Industrial cooperation Society / v.14 no.1 / pp.393-401 / 2013
  • Most traditional databases use a record-oriented storage model in which the attributes of a record are placed contiguously on disk to achieve high-performance writes. For search-mostly datawarehouse systems, however, column-oriented storage has become the more appropriate model because of its superior read performance. Today, flash memory is widely recognized as the preferred storage medium for high-speed database systems. In this paper, we introduce a fast column-oriented database model and then propose a new column-aware index management scheme for high-speed column-oriented datawarehouse systems. Our index management scheme, based on an enhanced $B^+$-Tree, achieves high search performance through an embedded flash index and compression of unused space in internal and leaf nodes. Based on the results of our performance evaluation, we conclude that the scheme outperforms the traditional scheme in search throughput and response time.
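
The benefit of a segmented column index can be shown with a simplified sketch (min/max summaries over sorted segments; this mimics the pruning effect of segmentation but is not the paper's enhanced $B^+$-Tree):

```python
from bisect import bisect_left

class SegmentedColumnIndex:
    """Sketch: a column is split into fixed-size segments, each kept sorted
    with a (min, max) summary, so a search probes only segments whose
    range can contain the key."""

    def __init__(self, values: list, seg_size: int = 4):
        self.segments = [sorted(values[i:i + seg_size])
                         for i in range(0, len(values), seg_size)]
        self.summaries = [(s[0], s[-1]) for s in self.segments]

    def contains(self, key) -> bool:
        for (lo, hi), seg in zip(self.summaries, self.segments):
            if lo <= key <= hi:          # prune segments that cannot match
                i = bisect_left(seg, key)
                if i < len(seg) and seg[i] == key:
                    return True
        return False

idx = SegmentedColumnIndex([8, 3, 5, 1, 40, 22, 31, 27])
print(idx.contains(22), idx.contains(9))
```

On flash, pruning matters doubly: each skipped segment is a read the device never has to serve, which is the kind of saving the paper's search-throughput results measure.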

Design of Metrics for Datawarehouse Process (데이터웨어하우스 개발 프로세스를 위한 측정지표의 설계)

  • Park, Jong-Mo;An, Hyo-Beom;Kim, Heung-Jun
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2007.06a / pp.807-810 / 2007
  • In today's business environment, marketing based on data analysis is an important part of corporate competitiveness, and a datawarehouse that extracts and stores analysis information is used to support it. However, because a datawarehouse processes large volumes of data from various kinds of business systems, it requires much time and cost. As a solution to this problem, continuous improvement of the datawarehouse process makes it possible to reduce time and cost. For process improvement, this study proposes metrics in the areas of productivity, process quality, and data quality. These metrics provide a basis for process improvement and control.


Design of data cleansing system based on XMDR for Datawarehouse (데이터웨어하우스를 위한 XMDR 기반의 데이터 정제시스템 설계)

  • Song, Hong-Youl;Ayush, Tsend;Jung, Kye-Dong;Choi, Young-Keun
    • Annual Conference of KIPS / 2010.04a / pp.180-182 / 2010
  • A datawarehouse is used to determine corporate policy. However, when a new system is added, data integration requires much cost and time because of the various heterogeneous characteristics of the systems involved. To resolve these heterogeneities, heterogeneity of data structure and of data representation is handled by creating abstract queries using XMDR (eXtended Master Data Registry) and splitting the queries to fit XMDR. In particular, this paper uses XMDR to minimize the effect on local systems when integrating distributed systems, and provides standardized information for data integration in distributed environments so that datawarehouse information can be created in real time. In addition, data is integrated without changing existing systems, which reduces cost and time, and real-time data extraction and cleansing produce consistent real-time information, improving the quality of that information.