Integrated Search | Korea Science

Development of the Design Methodology for Large-scale Data Warehouse based on MongoDB

Lee, Junho;Joo, Kyungsoo
- 한국컴퓨터정보학회논문지, Vol. 23, No. 3, pp. 49-54, 2018
A data warehouse is a system that collectively manages and integrates a company's data and provides the basis for decision making on management strategy. Analysis data volumes are now reaching a critical size that challenges traditional data warehousing approaches. Currently implemented solutions are mainly based on relational databases, which are no longer suited to these data volumes. NoSQL solutions allow us to consider new approaches to data warehousing, especially from the point of view of multidimensional data management. In this paper, we extend the data warehouse design methodology based on relational databases with a star schema, and develop a consistent design methodology, from information requirement analysis to data warehouse construction, for building large-scale data warehouses on MongoDB, a NoSQL database.
https://doi.org/10.9708/jksci.2018.23.03.049
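
As a rough illustration of the kind of mapping a star-schema-to-document design has to handle, the sketch below folds one fact row and the dimension rows it references into a single nested document of the sort a document store such as MongoDB holds. The table names, keys, values, and the choice to embed dimensions are all assumptions made for illustration; this is not the methodology proposed in the paper.

```python
# Hypothetical star-schema rows; all names and values are illustrative only.
dim_product = {1: {"name": "widget", "category": "tools"}}
dim_date = {20180301: {"year": 2018, "month": 3, "day": 1}}
fact_sales = [{"product_id": 1, "date_id": 20180301, "amount": 120.0}]

# Assumed mapping: foreign-key column -> (embedded field name, dimension table).
dimensions = {
    "product_id": ("product", dim_product),
    "date_id": ("date", dim_date),
}


def to_document(fact, dimensions):
    """Embed each referenced dimension row into the fact, yielding one nested dict."""
    doc = {}
    for key, value in fact.items():
        if key in dimensions:
            field_name, table = dimensions[key]
            doc[field_name] = dict(table[value])  # copy the dimension row
        else:
            doc[key] = value  # measures are kept as plain fields
    return doc


documents = [to_document(fact, dimensions) for fact in fact_sales]
```

Embedding trades storage for join-free reads; a design that keeps dimensions in separate collections and references them by key would be an equally plausible choice.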

TATS: an Efficient Technique for Computing Temporal Aggregates for Data Warehousing

Shin, Young-Ok;Park, Sung-Kong;Baik, Doo-Kwon;Ryu, Keun-Ho
- ETRI Journal, Vol. 22, No. 3, pp. 41-51, 2000
An important use of data warehousing is to provide temporal views over the history of source data. Nearly all data warehouses depend on relational database technology, yet relational databases provide little or no real support for temporal data. It is therefore difficult to obtain accurate information about time-varying data. In this paper, we design a temporal data warehouse that supports time-varying data efficiently. For this purpose, we present a method that supports temporal queries by combining a temporal query processing layer with the relational database used as the source database in an existing data warehouse. We introduce the Temporal Aggregate Tree Strategy (TATS) and propose its algorithm for aggregating time-varying data that changes at the time the temporal view is created. In addition, we evaluate TATS against the materialized view creation method of the existing data warehouse. The results show that TATS reduces the size of the fact table and performs well on the comparison factors when processing queries over time-varying data.
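
To make the notion of a temporal aggregate concrete, here is a minimal sketch that counts how many tuples are valid during each span between consecutive interval endpoints. It is a naive baseline written for illustration, not the TATS algorithm from the paper; the half-open interval convention and the function name are assumptions.

```python
def temporal_count(intervals):
    """Given half-open [start, end) valid-time intervals, return
    (start, end, count) for each span between consecutive endpoints
    during which at least one interval is valid."""
    endpoints = sorted({t for start, end in intervals for t in (start, end)})
    result = []
    for lo, hi in zip(endpoints, endpoints[1:]):
        # An interval covers the span when it starts at or before lo
        # and ends at or after hi.
        count = sum(1 for start, end in intervals if start <= lo and hi <= end)
        if count:
            result.append((lo, hi, count))
    return result


spans = temporal_count([(1, 5), (3, 8)])
```

This version rescans every interval for every span, so its cost grows quadratically; tree-based strategies exist precisely to avoid that rescan.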

Modeling and Implementation for Generic Spatio-Temporal Incorporated Information

이우기
- Journal of Information Technology Applications and Management, Vol. 12, No. 1, pp. 35-48, 2005
An architectural framework is developed for integrating geospatial and temporal data with relational information from which a spatio-temporal data warehouse (STDW) system is built. In order to implement the STDW, a generic conceptual model was designed that accommodated six dimensions: spatial (map object), temporal (time), agent (contractor), management (e.g. planting) and tree species (specific species) that addressed the 'where', 'when', 'who', 'what', 'why' and 'how' (5W1H) of the STDW information, respectively. A formal algebraic notation was developed based on a triplet schema that corresponded with spatial, temporal, and relational data type objects. Spatial object structures and spatial operators (spatial selection, spatial projection, and spatial join) were defined and incorporated along with other database operators having interfaces via the generic model.

Mining Information in Automated Relational Databases for Improving Reliability in Forest Products Manufacturing

Young, Timothy M.;Guess, Frank M.
- International Journal of Reliability and Applications, Vol. 3, No. 4, pp. 155-164, 2002
This paper focuses on how modern data mining can be integrated with real-time relational databases and commercial data warehouses to improve reliability in real time. An important issue for many manufacturers is the development of relational databases that link key product attributes with real-time process parameters. Helpful data for key product attributes in manufacturing may be derived from destructive reliability testing. Destructive samples are taken at periodic time intervals during manufacturing, which might create a long time gap between key product attributes and real-time process data. A case study is briefly summarized for the medium density fiberboard (MDF) industry. MDF is a wood composite that is used extensively by the home building and furniture manufacturing industries around the world. The cost of unacceptable MDF was as large as 5% to 10% of total manufacturing costs. Prevention through better information systems can save millions of US dollars.

A Study on Selecting Bitmap Join Index to Speed up Complex Queries in Relational Data Warehouses

안형근;고재진
- 정보처리학회논문지D, Vol. 19D, No. 1, pp. 1-14, 2012
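Because data warehouses are very large, index selection has a considerable effect on query processing efficiency. Indexes reduce query processing cost, but they incur storage cost and the cost of maintaining them as the database changes. In a data warehouse, bitmap join indexes are well suited to optimizing star join queries, which join one fact table with several dimension tables, and selections on dimension tables. Because bitmap join indexes are represented in binary, their storage cost is low, but many candidate attributes for indexing are generated, so choosing which attributes to index is a difficult task. Index selection first reduces the number of candidate attributes and then selects indexes from among them. In this paper, we use a data mining method to reduce the number of candidate attributes in the bitmap join index selection problem. Existing methods reduce the number of candidate attributes based on the frequency of attributes in queries; this paper uses attribute frequency and also considers the size of the dimension tables, the tuple size of the dimension tables, the disk page size, and similar factors. We then mine frequent itemsets with a data mining technique to reduce the number of candidate attributes effectively. We apply a cost function to the bitmap join indexes of the candidate attribute sets to obtain the bitmap join indexes of the attribute sets that have minimum cost and satisfy the storage constraint. To evaluate the efficiency of the proposed method, we compare and analyze it against existing methods.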
https://doi.org/10.3745/KIPSTD.2012.19D.1.001
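
For readers unfamiliar with the structure being selected, the sketch below builds a toy bitmap join index: for each value of a dimension attribute, it keeps a bitmap (here a Python integer) whose bit i is set when fact row i joins to a dimension row holding that value. The tables, column names, and helper functions are hypothetical, and the sketch covers only the index itself, not the paper's selection method or cost function.

```python
from collections import defaultdict

# Hypothetical tables; names and values are illustrative only.
dim_customer = {10: {"region": "east"}, 11: {"region": "west"}}
fact_sales = [
    {"customer_id": 10, "amount": 5},
    {"customer_id": 11, "amount": 7},
    {"customer_id": 10, "amount": 3},
]


def build_bitmap_join_index(facts, dim, fk, attr):
    """Map each dimension attribute value to an int whose bit i is set
    when fact row i joins to a dimension row holding that value."""
    index = defaultdict(int)
    for row_no, fact in enumerate(facts):
        index[dim[fact[fk]][attr]] |= 1 << row_no
    return dict(index)


def select_rows(facts, bitmap):
    """Return the fact rows whose bit is set in the bitmap."""
    return [fact for i, fact in enumerate(facts) if (bitmap >> i) & 1]


region_index = build_bitmap_join_index(
    fact_sales, dim_customer, "customer_id", "region"
)
east_total = sum(f["amount"] for f in select_rows(fact_sales, region_index["east"]))
```

A query that filters on `region` can then read the matching fact rows from the bitmap without performing the join, and predicates on several indexed attributes can be combined with bitwise AND and OR.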

Supporting XML Materialized Views Using Materialized Views of RDBMS

김승훈;강현철
- 한국전자거래학회지, Vol. 11, No. 4, pp. 33-48, 2006
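Since XML emerged as the standard for data exchange on the Web, XML warehouse technology has been needed to support Web-based business applications such as e-Commerce efficiently. When a relational DBMS is used as the repository of an XML warehouse, the XML materialized views of the warehouse can be provided using the relational materialized views of the relational DBMS. Because XML documents are stored as relational tuples, the XML query that defines an XML materialized view is translated into SQL. If a relational materialized view is defined with the translated SQL statement, the XML materialized view can be obtained simply by XML tagging of the tuples that make up that relational materialized view. The main advantage of this technique is that, whenever a source XML document changes, the relational DBMS maintains the consistency of the XML materialized view, except for the XML tagging. In this paper, we present this XML materialized view technique and implement it in Java, using a commercial relational DBMS with materialized view support on Windows 2000 Professional. Performance experiments were conducted on the XML documents of TPC-W, a Web e-Commerce benchmark. The experimental results show that the proposed XML materialized view technique is very efficient.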

An Efficient Search Space Generation Technique for Optimal Materialized Views Selection in Data Warehouse Environment

이태희;장재영;이상구
- 한국정보과학회논문지:데이타베이스, Vol. 31, No. 6, pp. 585-595, 2004
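Because analytical queries in a data warehouse generally involve complex operations, query processing is very important. A common way to improve performance in data warehouses is to build materialized views. The choice of which materialized views to build has an important effect on the query processing performance and maintenance cost of the entire data warehouse. The materialized view selection problem is to select the optimal materialized views in consideration of these query processing and maintenance costs. In this paper, we present an efficient solution for constructing such optimal materialized views. Although the optimal materialized view selection problem is generally NP-hard, we propose a method that drastically reduces the search space by considering the characteristics of the join, selection, grouping, and aggregation operations used in relational databases.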

Web Information Extraction and Multidimensional Analysis Using XML

박병권
- 한국멀티미디어학회논문지, Vol. 11, No. 5, pp. 567-578, 2008
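To analyze the vast number of Web pages on the Internet, the information embedded in those pages must be extracted. In this paper, we propose a method of extracting information from Web pages, converting it into XML documents, and analyzing it multidimensionally. To extract information from Web pages, we propose two languages. One describes Web information extraction rules based on an object-oriented model; the other describes, as regular expressions, the HTML tag patterns used to locate the information to be extracted. For multidimensional analysis of XML documents, we propose a method of building a warehouse, as is done for relational data, and generating various cubes from it. Finally, we demonstrate the validity of the proposed method through an example that applies it to U.S. patent Web pages.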

Optimized Entity Attribute Value Model: A Search Efficient Representation of High Dimensional and Sparse Data

Paul, Razan;Latiful Hoque, Abu Sayed Md.
- Interdisciplinary Bio Central, Vol. 3, No. 3, pp. 9.1-9.5, 2011
Entity Attribute Value (EAV) is a widely used solution for representing high dimensional and sparse data, but EAV is not search efficient for knowledge extraction. In this paper, we propose a search efficient data model, Optimized Entity Attribute Value (OEAV), for the physical representation of high dimensional and sparse data as an alternative to the widely used EAV. We implemented both the EAV and OEAV models in a data warehousing environment and ran different relational and warehouse queries on both models. The experimental results show that OEAV is dramatically more search efficient and occupies less storage space than EAV.
https://doi.org/10.4051/ibc.2011.3.3.0009
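
The baseline EAV layout the abstract refers to can be sketched as follows: sparse records are flattened into (entity, attribute, value) triples, so absent attributes cost no storage, but an attribute lookup has to scan the triples. The record contents and function names are hypothetical, and the sketch shows plain EAV only; it makes no attempt to reproduce the OEAV model.

```python
# Hypothetical sparse records; attribute names and values are illustrative only.
records = {
    1: {"glucose": 5.4, "smoker": True},
    2: {"height_cm": 171},
}


def to_eav(records):
    """Flatten sparse records into (entity, attribute, value) triples."""
    return [
        (entity, attribute, value)
        for entity, attrs in sorted(records.items())
        for attribute, value in sorted(attrs.items())
    ]


def find_entities(triples, attribute):
    """Return the entities that have the attribute, using a linear scan
    over every triple (no index is assumed)."""
    return sorted({entity for entity, attr, _ in triples if attr == attribute})


triples = to_eav(records)
smokers = find_entities(triples, "smoker")
```

Only the attributes that are actually present produce rows, which is what makes EAV attractive for sparse data, while the unindexed scan in `find_entities` hints at why search can become a bottleneck.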

Multidimensional Analysis of XML Documents using XML Cubes

박병권
- 한국정보시스템학회:학술대회논문집, 한국정보시스템학회 2005년도 춘계학술대회 발표 논문집, pp. 65-78, 2005
Large numbers of XML documents are now available on the Internet, so we need to analyze them multidimensionally in the same way as relational data. In this paper, we propose a new framework for multidimensional analysis of XML documents, which we call XML-OLAP. We base XML-OLAP on XML warehouses in which all fact data as well as dimension data are stored as XML documents. We build XML cubes from XML warehouses. We propose a new multidimensional expression language for XML cubes, which we call XML-MDX. XML-MDX statements target XML cubes and use XQuery expressions to designate the measure data. They specify text mining operators for aggregating the text that constitutes the measure data. We evaluate XML-OLAP by applying it to a U.S. patent XML warehouse. The XML-MDX queries we run demonstrate that XML-OLAP is effective for multidimensional analysis of U.S. patents.
