• Title/Summary/Keyword: Large data

Large Sample Tests for Independence and Symmetry in the Bivariate Weibull Model under Random Censorship

  • Cho, Jang-Sik;Ko, Jeong-Hwan;Kang, Sang-Kil
    • Journal of the Korean Data and Information Science Society
    • /
    • v.14 no.2
    • /
    • pp.405-412
    • /
    • 2003
  • In this paper, we consider a two-component system whose lifetimes follow a bivariate Weibull distribution with randomly censored data. The censoring time is assumed to be independent of the component lifetimes. We construct large-sample tests for independence and symmetry between the two components, based on maximum likelihood estimators and natural estimators. We also present a numerical study.

ROBUST REGRESSION ESTIMATION BASED ON DATA PARTITIONING

  • Lee, Dong-Hee;Park, You-Sung
    • Journal of the Korean Statistical Society
    • /
    • v.36 no.2
    • /
    • pp.299-320
    • /
    • 2007
  • We introduce a high-breakdown-point estimator, referred to as the data partitioning robust regression estimator (DPR). Because the DPR is obtained by partitioning the observations into a finite number of subsets, it avoids the computational problems of previous robust regression estimators. Empirical and extensive simulation studies show that the DPR outperforms previous robust estimators, especially in large samples.

A Study on Damage Evaluations of Truss for Large Structure Health Monitoring (대형 구조물 상태평가를 위한 트러스 구조물 손상 평가에 관한 연구)

  • Lee, Jong-Ho;Kim, Seon-Gyu
    • Proceedings of the Korean Institute of Building Construction Conference
    • /
    • 2016.10a
    • /
    • pp.130-131
    • /
    • 2016
  • This study was performed to apply a Structural Health Monitoring system to large structures. To evaluate structural damage, strain data of truss members, which change with damage, are obtained from an FEM analysis program. These data are used to train an Artificial Neural Network (ANN), and the trained ANN can then be used to analyze strain data and evaluate damage to the truss members.

A High-rate GPS Data Processing for Large-scale Structure Monitoring (대형구조물 모니터링을 위한 high-rate GPS 자료처리)

  • Bae, Tea-Suk
    • Proceedings of the Korean Society of Surveying, Geodesy, Photogrammetry, and Cartography Conference
    • /
    • 2010.04a
    • /
    • pp.181-182
    • /
    • 2010
  • For real-time displacement monitoring of large-scale structures, high-rate (>1 Hz) GPS data processing is necessary, which is not possible even with scientific GPS data processing software. Since the baseline is generally very short in this case, most atmospheric effects are removed, leaving position and integer ambiguity as the unknowns. The number of unknowns in real-time kinematic GPS positioning makes positioning impossible with the usual approach, so a two-step approach is tested in this study.

Efficient Binary Join Processing for Large Data Streams (대용량 데이터 스트림을 처리하기 위한 효율적 이진 조인 처리 기법)

  • Park, Hong-Kyu;Lee, Won-Suk
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2008.06a
    • /
    • pp.189-192
    • /
    • 2008
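  • Recently, attention has shifted from bounded data sets to the real-time processing of large data streams, such as sensor data and the analysis of transaction logs like web server logs and telephone records. Interest in join processing over data streams in particular is growing. In this paper, we study an efficient hash structure and join method for processing join operations quickly, and we verify the proposed method in various environments.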
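
The abstract does not describe the proposed hash structure, so the following is only a minimal sketch of a generic symmetric hash join over two streams with a count-based sliding window, not the authors' method. The class name `SymmetricHashJoin`, the side labels `"L"` and `"R"`, and the `window_size` parameter are assumptions made for illustration.

```python
from collections import defaultdict, deque


class SymmetricHashJoin:
    """Generic sliding-window symmetric hash join (illustrative sketch only)."""

    def __init__(self, window_size):
        # window_size: hypothetical limit on tuples retained per stream.
        self.window_size = window_size
        # One hash table and one arrival-order queue per stream side.
        self.tables = {"L": defaultdict(list), "R": defaultdict(list)}
        self.order = {"L": deque(), "R": deque()}

    def insert(self, side, key, value):
        """Add a tuple arriving on `side`; return the join results it produces."""
        other = "R" if side == "L" else "L"
        # Probe the opposite stream's table before storing the new tuple.
        partners = self.tables[other].get(key, [])
        if side == "L":
            matches = [(key, value, p) for p in partners]
        else:
            matches = [(key, p, value) for p in partners]
        # Store the new tuple so later arrivals on the other side can match it.
        self.tables[side][key].append(value)
        self.order[side].append(key)
        # Evict the oldest tuple on this side once the window overflows.
        if len(self.order[side]) > self.window_size:
            old_key = self.order[side].popleft()
            bucket = self.tables[side][old_key]
            bucket.pop(0)  # values per key are stored in arrival order
            if not bucket:
                del self.tables[side][old_key]
        return matches
```

Each arriving tuple probes the other stream's table and is then stored in its own, so results are produced incrementally without waiting for either stream to end. The window bounds memory, at the cost of missing matches with tuples that have already been evicted.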

CAPTURE OF YELLOW DUST BLOW BY MODIS DATA

  • Song, Jie;Park, Jong-Geol;Yasuda, Yoshizumi
    • Proceedings of the KSRS Conference
    • /
    • 2003.11a
    • /
    • pp.920-922
    • /
    • 2003
  • Large plumes of yellow sand, or yellow dust, blow out over the Sea of Japan and the Japanese archipelago from mainland China. This study discusses a methodology for capturing an overall view of a large yellow dust storm using MODIS data. MODIS data obtained on April 8, 2002 were used as a typical image of yellow sand.

The BIOWAY System: A Data Warehouse for Generalized Representation & Visualization of Bio-Pathways

  • Kim, Min Kyung;Seo, Young Joo;Lee, Sang Ho;Song, Eun Ha;Lee, Ho Il;Ahn, Chang Shin;Choi, Eun Chung;Park, Hyun Seok
    • Genomics & Informatics
    • /
    • v.2 no.4
    • /
    • pp.191-194
    • /
    • 2004
  • The exponential growth of biopathway data in recent years provides a means to elucidate the large-scale modular organization of the cell. Given the existing information on metabolic and regulatory networks, inferring biopathway information through scientific reasoning or data mining of large-scale array or proteomics data has received great attention. Naturally, there is a need for a user-friendly system that lets the user combine large and diverse pathway data sets from different resources. We built a data warehouse, BIOWAY, for analyzing and visualizing biological pathways by integrating and customizing resources. We have collected many different types of pathway data, including metabolic pathway data from KEGG/LIGAND, signaling pathway data from BIND, and protein information from SWISS-PROT. In addition to providing a general data retrieval mechanism, a successful user interface should provide a convenient visualization mechanism, since biological pathway data are difficult to conceptualize without graphical representations. The visual interfaces of previous systems, at best, use static images only for specific categorized pathways, which makes it difficult to cope with more complex pathways. In the BIOWAY system, all pathway data can be displayed as computer-generated graphical networks rather than manually drawn images. Furthermore, the system is designed so that all pathway maps can be expanded or collapsed by introducing the concept of a super node. A subtle graph layout algorithm has been applied to best display the pathway data.

A Research on Development of Bills of Material Using Web Grid for Product Lifecycle Management

  • Yoo, Ji-Hyun
    • Journal of the Korea Society of Computer and Information
    • /
    • v.22 no.12
    • /
    • pp.131-136
    • /
    • 2017
  • PLM (Product Lifecycle Management) is an information management system that can integrate data, processes, business systems, and human resources throughout the enterprise. The BOM (Bill of Materials) is key data for design, material purchasing, and manufacturing planning and management, and is fundamental to product development throughout the product life cycle. In this paper, we propose an efficient system that increases data loading and processing speed when working with large BOM data. We present the performance and usability of an IMDG (In-Memory Data Grid) for data processing when loading large amounts of data. In the UI, using a pure JavaScript web grid instead of the existing data loading method can improve data management performance.

Environmental Consciousness Data Modeling by Association Rules

  • Park, Hee-Chang;Cho, Kwang-Hyun
    • Proceedings of the Korean Data and Information Science Society Conference
    • /
    • 2004.10a
    • /
    • pp.115-124
    • /
    • 2004
  • Data mining is a method for finding useful information in large amounts of data in a database. It is used to discover hidden knowledge, unexpected patterns, and new rules in massive data. Data mining methods include association rules, decision trees, clustering, neural networks, and so on. Association rule mining searches for interesting relationships among items in a given large data set. Association rules are frequently used by retail stores to assist in marketing, advertising, floor placement, and inventory control. There are three primary quality measures for an association rule: support, confidence, and lift. We analyze Gyeongnam social indicator survey data using the association rule technique to discover environmental information. The resulting association rules can be used for environmental preservation and environmental improvement.
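
The three quality measures named above can be illustrated with a short sketch. It uses the standard textbook definitions of support, confidence, and lift. The function name `rule_measures` and the survey-style item names are hypothetical and are not taken from the paper or its data.

```python
def rule_measures(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    transactions: list of item collections (hypothetical example data).
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    # Count transactions containing the antecedent, the consequent, and both.
    count_a = sum(1 for t in transactions if antecedent <= set(t))
    count_c = sum(1 for t in transactions if consequent <= set(t))
    count_both = sum(1 for t in transactions if (antecedent | consequent) <= set(t))

    support = count_both / n                                  # P(A and C)
    confidence = count_both / count_a if count_a else 0.0     # P(C | A)
    lift = confidence / (count_c / n) if count_c else 0.0     # P(C | A) / P(C)
    return support, confidence, lift


# Hypothetical responses; each set lists behaviours one respondent reported.
responses = [
    {"recycles", "saves_water"},
    {"recycles", "saves_water", "public_transit"},
    {"recycles"},
    {"public_transit"},
]
support, confidence, lift = rule_measures(responses, {"recycles"}, {"saves_water"})
```

A lift above 1 suggests that the antecedent and consequent occur together more often than they would if they were independent. A lift near 1 suggests the rule carries little information beyond the consequent's base rate.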

Dynamic Replication Based on Availability and Popularity in the Presence of Failures

  • Meroufel, Bakhta;Belalem, Ghalem
    • Journal of Information Processing Systems
    • /
    • v.8 no.2
    • /
    • pp.263-278
    • /
    • 2012
  • The data grid provides geographically distributed resources for large-scale applications and generates large sets of data. Replicating this data across several sites of the grid is an effective way to achieve good performance. In this paper, we propose an approach to dynamic replication in a hierarchical grid that takes crash failures in the system into account. The replication decision is based on two parameters: the availability and the popularity of the data. The administrator requires a minimum rate of availability for each piece of data according to its access history in previous periods, but this availability may increase if demand for the data is high. We also propose a strategy to keep the desired availability respected even in case of a failure or rarity (lack of popularity) of the data. The simulation results show the effectiveness of our replication strategy in terms of response time, the unavailability of requests, and availability.
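
The abstract does not state how availability is computed, so the sketch below rests on a common simplifying assumption rather than the authors' model: every replica sits on a node that is up independently with the same probability, so data with n replicas is available with probability 1 - (1 - p)^n. The function names and the `max_replicas` cap are hypothetical.

```python
def replica_availability(node_availability, replicas):
    """Probability that at least one replica is reachable.

    Assumes independent node failures with identical per-node availability.
    """
    return 1.0 - (1.0 - node_availability) ** replicas


def min_replicas(node_availability, required, max_replicas=64):
    """Smallest replica count whose availability meets the required rate."""
    if not 0.0 < node_availability <= 1.0:
        raise ValueError("node_availability must be in (0, 1]")
    if not 0.0 < required <= 1.0:
        raise ValueError("required must be in (0, 1]")
    for n in range(1, max_replicas + 1):
        if replica_availability(node_availability, n) >= required:
            return n
    raise ValueError("required availability not reachable within max_replicas")
```

Under this assumption, raising the required availability for popular data translates directly into more replicas. For example, `min_replicas(0.5, 0.9)` needs four replicas, where a requirement of 0.7 would need only two.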