• Title/Summary/Keyword: methods of data analysis


Influence of asphalt removal on operational modal analysis of Egebækvej Bridge

  • Umut Yildirim
    • Smart Structures and Systems / v.31 no.2 / pp.171-181 / 2023
  • Using up-to-date system identification methods in both the time and frequency domains, this investigation examines dynamic monitoring data from the reinforced concrete Egebækvej Bridge near Holte, Denmark. The bridge was built in the 1960s and was still standing during the test campaign, which took place before its demolition. ARTeMIS Modal was used to derive the modal parameters from ambient vibration data. Several Operational Modal Analysis (OMA) approaches were applied, including Frequency Domain Decomposition (FDD), Enhanced Frequency Domain Decomposition (EFDD), and Curve-fit Frequency Domain Decomposition (CFDD). Afterward, the Principal Component (SSI-PC) and Unweighted Principal Component (SSI-UPC) Stochastic Subspace Identification methods were used. The Danish engineering consulting company COWI, with the permission of the bridge contractor BARSLUND, allowed the researcher to conduct this experimental test to demonstrate the impact of OMA applications.

A Hybrid Data Mining Technique Using Error Pattern Modeling (오차 패턴 모델링을 이용한 Hybrid 데이터 마이닝 기법)

  • Hur, Joon;Kim, Jong-Woo
    • Journal of the Korean Operations Research and Management Science Society / v.30 no.4 / pp.27-43 / 2005
  • This paper presents a new hybrid data mining technique that uses error pattern modeling to improve classification accuracy when the target variable is binary. The proposed method increases prediction accuracy by combining two different supervised learning methods. Specifically, the algorithm extracts the subset of training cases that the two methods predict inconsistently and models error patterns from those cases. Based on the error pattern model, the predictions of the two methods are merged to generate the final prediction. The proposed method was tested on 10 real-world data sets. The results show that the proposed method outperforms existing methods such as artificial neural networks and decision tree induction.
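
The merging step described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' algorithm: "predicted inconsistently" is read as "the two classifiers disagree", the two base classifiers are passed in as plain callables, and the error pattern model is assumed to be a 1-nearest-neighbour arbiter over the disagreement cases. The names `fit_error_pattern` and `hybrid_predict` and the toy data are hypothetical.

```python
# Sketch only: the 1-nearest-neighbour arbiter is an assumption made for
# illustration; the abstract does not say how error patterns are modelled.


def fit_error_pattern(X, y, model_a, model_b):
    """Collect training cases on which the two classifiers disagree.

    Each stored case is (features, a_was_right). With a binary target,
    exactly one of two disagreeing predictions matches the label.
    """
    cases = []
    for x, label in zip(X, y):
        pred_a, pred_b = model_a(x), model_b(x)
        if pred_a != pred_b:
            cases.append((tuple(x), pred_a == label))
    return cases


def hybrid_predict(x, model_a, model_b, cases):
    """Merge the two predictions, consulting the arbiter on disagreement."""
    pred_a, pred_b = model_a(x), model_b(x)
    if pred_a == pred_b or not cases:
        return pred_a
    # Find the most similar disagreement case (squared Euclidean distance).
    nearest = min(
        cases, key=lambda c: sum((u - v) ** 2 for u, v in zip(c[0], x))
    )
    return pred_a if nearest[1] else pred_b


# Toy data (made up): the third feature decides which classifier is reliable.
model_a = lambda x: int(x[0] > 0)
model_b = lambda x: int(x[1] > 0)
X_train = [(1, -1, 0), (-1, 1, 0), (1, -1, 1), (-1, 1, 1), (1, 1, 0)]
y_train = [1, 0, 0, 1, 1]
cases = fit_error_pattern(X_train, y_train, model_a, model_b)
```

When the two classifiers agree, the shared prediction is returned unchanged; the arbiter is consulted only on disagreement, which is where a binary target guarantees that exactly one of them is correct.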

Reliability Analysis Using Parametric and Nonparametric Input Modeling Methods (모수적·비모수적 입력모델링 기법을 이용한 신뢰성 해석)

  • Kang, Young-Jin;Hong, Jimin;Lim, O-Kaung;Noh, Yoojeong
    • Journal of the Computational Structural Engineering Institute of Korea / v.30 no.1 / pp.87-94 / 2017
  • Reliability analysis (RA) and reliability-based design optimization (RBDO) require statistical modeling of the input random variables, which is carried out parametrically or nonparametrically from experimental data. For the parametric approach, goodness-of-fit (GOF) tests and model selection methods are widely used, and a sequential statistical modeling (SSM) method that combines the merits of the two has recently been proposed. Kernel density estimation (KDE) is often used as a nonparametric method, and it describes a distribution function well when the number of data is small or the density function is multimodal. Although accurate statistical models are needed to obtain accurate RA and RBDO results, accurate statistical modeling is difficult when the number of data is small. In this study, the accuracy of two statistical modeling methods, SSM and KDE, was compared for different numbers of data. Through numerical examples, the RA results obtained with the input models from the two methods were compared, and an appropriate modeling method was proposed according to the number of data.
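
As a rough illustration of the nonparametric side of this comparison, the sketch below evaluates a one-dimensional Gaussian kernel density estimate in plain Python. The Gaussian kernel, the normal-reference bandwidth rule, the function name `gaussian_kde`, and the sample values are all assumptions chosen for illustration; the abstract does not state which kernel or bandwidth the authors used.

```python
import math
import statistics


def gaussian_kde(samples, bandwidth=None):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    n = len(samples)
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumption, not from the paper).
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Average of Gaussian bumps centred on each observation.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# A small, made-up bimodal sample with clusters near 1 and 5.
data = [1.0, 1.2, 0.8, 1.1, 4.9, 5.2, 5.0, 4.8]
pdf = gaussian_kde(data)
```

With this sample the estimate is lower between the two clusters than at either cluster, a shape that a single unimodal parametric family cannot reproduce.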

Comparison of covariance thresholding methods in gene set analysis

  • Park, Sora;Kim, Kipoong;Sun, Hokeun
    • Communications for Statistical Applications and Methods / v.29 no.5 / pp.591-601 / 2022
  • In gene set analysis with microarray expression data, a group of genes such as a gene regulatory pathway or a signaling pathway is often tested for whether it contains differentially expressed (DE) or differentially co-expressed (DC) genes between two biological conditions. Recently, a statistical test based on covariance estimation has been proposed to identify DC genes. In particular, covariance regularization by hard thresholding improved the power of the test when the proportion of DC genes within a biological pathway is relatively small. In this article, we compare covariance thresholding methods using four different regularization penalties: lasso, hard, smoothly clipped absolute deviation (SCAD), and minimax concave plus (MCP). In our extensive simulation studies, we found that both the SCAD and MCP thresholding methods can outperform the hard thresholding method when the proportion of DC genes is extremely small and the number of genes in a biological pathway is much greater than the sample size. We also applied the four thresholding methods to three microarray gene expression data sets, related to mutant p53 transcriptional activity and to epithelium and stroma breast cancer, to compare the genetic pathways identified by each method.
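
The four thresholding rules named in this abstract can be sketched with their commonly used closed forms, applied entrywise to a sample covariance matrix. This is a minimal sketch under assumptions: the defaults `a=3.7` and `gamma=3.0` are conventional choices rather than values from the paper, leaving the diagonal untouched is an assumption, and the helper `threshold_covariance` and the example matrix are hypothetical.

```python
import math


def soft_threshold(z, lam):
    """Lasso (soft) thresholding: shrink toward zero by lam."""
    return math.copysign(max(abs(z) - lam, 0.0), z)


def hard_threshold(z, lam):
    """Hard thresholding: keep z unchanged if |z| > lam, else zero."""
    return z if abs(z) > lam else 0.0


def scad_threshold(z, lam, a=3.7):
    """SCAD thresholding (requires a > 2)."""
    if abs(z) <= 2.0 * lam:
        return soft_threshold(z, lam)
    if abs(z) <= a * lam:
        return ((a - 1.0) * z - math.copysign(a * lam, z)) / (a - 2.0)
    return z


def mcp_threshold(z, lam, gamma=3.0):
    """MCP (firm) thresholding (requires gamma > 1)."""
    if abs(z) <= gamma * lam:
        return soft_threshold(z, lam) / (1.0 - 1.0 / gamma)
    return z


def threshold_covariance(cov, rule, lam):
    """Apply a rule to off-diagonal entries; leave variances untouched."""
    p = len(cov)
    return [
        [cov[i][j] if i == j else rule(cov[i][j], lam) for j in range(p)]
        for i in range(p)
    ]


# Made-up 3 x 3 sample covariance matrix.
sample_cov = [[1.0, 0.05, 0.6], [0.05, 1.0, -0.3], [0.6, -0.3, 1.0]]
hard = threshold_covariance(sample_cov, hard_threshold, 0.2)
soft = threshold_covariance(sample_cov, soft_threshold, 0.2)
```

All four rules set small entries to zero. They differ in how they treat large entries: soft thresholding shrinks every surviving entry by `lam`, hard thresholding leaves survivors unchanged, and SCAD and MCP interpolate between the two so that sufficiently large entries are not shrunk at all.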

Investigating the performance of different decomposition methods in rainfall prediction from LightGBM algorithm

  • Narimani, Roya;Jun, Changhyun;Nezhad, Somayeh Moghimi;Parisouj, Peiman
    • Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.150-150 / 2022
  • This study investigates how decomposition methods contribute to the accuracy of daily rainfall prediction with the light gradient boosting machine (LightGBM) algorithm. Empirical mode decomposition (EMD) and singular spectrum analysis (SSA) were used to decompose and reconstruct the input time series into trend terms, fluctuating terms, and noise components. The time series decomposed by EMD and SSA were used as input data for the LightGBM algorithm in two hybrid models, the empirical mode-based light gradient boosting machine (EMDGBM) and the singular spectrum analysis-based light gradient boosting machine (SSAGBM), respectively. A total of four daily-scale parameters (temperature, humidity, wind speed, and rainfall) from 2003 to 2017 were used as input data for daily rainfall prediction. Statistical performance indicators show that the SSAGBM model performs better than both the EMDGBM model and the original LightGBM algorithm without decomposition. This suggests that the SSA method improved the accuracy of the LightGBM algorithm in rainfall prediction when a multivariate dataset was used.


Bayesian Analysis in Generalized Log-Gamma Censored Regression Model

  • Younshik Chung;Yoomi Kang
    • Communications for Statistical Applications and Methods / v.5 no.3 / pp.733-742 / 1998
  • For industrial and medical lifetime data, the generalized log-gamma regression model is considered. The Bayesian analysis of the generalized log-gamma regression with censored data is then explained, and, following the data augmentation approach (Tanner and Wong, 1987), the censored data are replaced by simulated data. To overcome the complicated Bayesian computation, the Markov chain Monte Carlo (MCMC) method is employed. Some modified algorithms are then proposed to implement MCMC. Finally, one example is presented.


A Comparative Analysis of Areal Interpolation Methods for Representing Spatial Distribution of Population Subgroups (하위인구집단의 분포 재현을 위한 에어리얼 인터폴레이션의 비교 분석)

  • Cho, Daeheon
    • Spatial Information Research / v.22 no.3 / pp.35-46 / 2014
  • Population data are usually provided at administrative spatial units in Korea, so areal interpolation is needed for fine-grained analysis. This study aims to compare various methods of areal interpolation for population subgroups rather than the total population. We estimated the number of elderly people and single-person households for small areal units from Dong data by the different interpolation methods using 2010 census data of Seoul, and compared the estimates to actual values. As a result, the performance of areal interpolation methods varied between the total population and subgroup populations as well as between different population subgroups. It turned out that the method using GWR (geographically weighted regression) and building type data outperformed other methods for the total population and households. However, the OLS regression method using building type data performed better for the elderly population, and the OLS regression method based on land use data was the most effective for single-person households. Based on these results, spatial distribution of the single elderly was represented at small areal units, and we believe that this approach can contribute to effective implementation of urban policies.

A New Study on Vibration Data Acquisition and Intelligent Fault Diagnostic System for Aero-engine

  • Ding, Yongshan;Jiang, Dongxiang
    • Proceedings of the Korean Society of Propulsion Engineers Conference / 2008.03a / pp.16-21 / 2008
  • An aero-engine, as a type of rotating machinery with a complex structure and high rotating speed, is subject to complicated vibration faults. A condition monitoring and fault diagnosis system is therefore very important for aircraft safety. In this paper, a vibration data acquisition and intelligent fault diagnosis system is introduced. First, the vibration data acquisition part is described in detail. This part consists of hardware acquisition modules and software analysis modules, which provide real-time data acquisition and analysis, off-line data analysis, trend analysis, fault simulation, and graphical display of results. The acquired vibration data are prepared for the subsequent intelligent fault diagnosis. Second, two advanced artificial intelligence (AI) methods, one mapping-based and one rule-based, are discussed. The first is the artificial neural network (ANN), which is well suited to aero-engine fault diagnosis and has a strong ability to learn complex nonlinear functions. The second is data mining, which has the advantages of discovering knowledge from massive data and automatically extracting diagnostic rules. Third, a large amount of historical data is used to train the ANN and to extract rules by data mining. Real-time data are then fed into the trained ANN for mapping-based fault diagnosis. At the same time, the extracted rules are revised using expert experience and used for rule-based fault diagnosis. The experimental results show that both AI methods are effective for aero-engine vibration fault diagnosis, while each has its own characteristics. The whole system can be deployed for local vibration monitoring and real-time fault diagnosis of aero-engines.


Current Status of Tree Height Estimation from Airborne LiDAR Data

  • Hwang, Se-Ran;Lee, Im-Pyeong
    • Korean Journal of Remote Sensing / v.27 no.3 / pp.389-401 / 2011
  • Most nations have expressed significant concern about climate change caused by a rapid increase in greenhouse gases and have reached an international agreement to control the total amount of these gases in order to mitigate global warming. As the most important absorbers of carbon dioxide, one of the major greenhouse gases, forest resources should be managed more tightly, with a means of measuring their total amount, the forest biomass, efficiently and accurately. Forest biomass is closely related to forest area and tree height. Airborne LiDAR data help extract biophysical properties of forest resources such as tree height more efficiently by providing detailed spatial information about the ground surface over a wide area. Many researchers have therefore developed various methods for estimating tree height from LiDAR data, whose performance and characteristics differ depending on the forest environment and the data characteristics. In this study, we investigated these techniques for estimating tree height, discussed their advantages and limitations, and suggested future research directions. We first examined the characteristics of LiDAR data applied to forest studies and then analyzed filtering methods, a prerequisite step for tree height estimation. We classified the tree height estimation methods into two categories, individual tree-based and regression-based, and described the representative methods in each category with a summary of their analysis results. Finally, we reviewed techniques for fusing LiDAR with other remote sensing data as a direction for future work.

Model-Data Based Small Area Estimation

  • Shin, Key-Il;Lee, Sang Eun
    • Communications for Statistical Applications and Methods / v.10 no.3 / pp.637-645 / 2003
  • Small area estimation has been studied using data-based methods such as direct, indirect, and synthetic methods. Recently, however, model-based methods, such as those based on regression or time series estimation, have been applied. In this paper, we investigate a model-data based small area estimation that takes into account the spatial relation among the areas. Data from the 2001 Economically Active Population Survey are used for the analysis, and the results of the model-based and model-data based estimation are compared using the mean squared error (MSE), mean absolute error (MAE), and mean bias (MB).
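
The three comparison criteria named in this abstract can be computed as in the short sketch below. The sign convention for the bias (estimate minus reference value), the function name `error_metrics`, and the numbers are assumptions for illustration; they are not taken from the paper.

```python
def error_metrics(estimates, truths):
    """Return (MSE, MAE, MB) for paired estimates and reference values."""
    # Assumed convention: error = estimate - reference value.
    errors = [e - t for e, t in zip(estimates, truths)]
    n = len(errors)
    mse = sum(err ** 2 for err in errors) / n   # mean squared error
    mae = sum(abs(err) for err in errors) / n   # mean absolute error
    mb = sum(errors) / n                        # mean bias (signed)
    return mse, mae, mb


# Made-up small-area estimates and reference values.
estimates = [10.0, 12.0, 9.0, 11.0]
truths = [11.0, 11.0, 10.0, 10.0]
mse, mae, mb = error_metrics(estimates, truths)
```

In this toy case the errors alternate in sign, so the mean bias is zero even though the MSE and MAE are not, which is why the three criteria are usually reported together.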