• Title/Abstract/Keywords: Multivariate Dataset

66 search results; processing time 0.022 s

Multivariate assessment of the occurrence of compound Hazards at the pan-Asian region

  • Davy Jean Abella;Kuk-Hyun Ahn
    • Korea Water Resources Association: Conference Proceedings / Korea Water Resources Association 2023 Conference / pp.166-166 / 2023
  • Compound hazards (CHs) are two or more extreme climate events that occur simultaneously in the same region. Compared with individual hazards, the combination of hazards in a CH can result in greater economic losses and more deaths. Although several extreme climate events have been recorded across Asia over the past decades, many studies have focused on only a single hazard. In this study, we assess the spatiotemporal pattern of dry compound hazards, which include drought, heatwave, fire and wind, across Asia over the last 42 years (1980-2021) using historical data from the ERA5 Reanalysis dataset. We use daily spatial data for each climate event to assess the occurrence of such compound hazards on a daily basis. Heatwave, fire and wind hazard occurrences are identified using daily percentile-based thresholds, while a pre-defined SPI threshold is applied for drought occurrence. The occurrence of each type of compound hazard is then obtained by overlaying the maps of daily occurrences of the single hazards. Lastly, a multivariate assessment is conducted to quantify the occurrence frequency, hotspots and trends of each type of compound hazard across Asia. By conducting a multivariate analysis of the occurrence of these compound hazards, we identify the relationships and interactions among droughts, heatwaves, fires, and winds in dry compound hazards, ultimately supporting better-informed decisions and strategies in natural risk management.
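
The thresholding-and-overlap step described in this abstract can be illustrated for a single grid cell. The following is only a minimal sketch, not the authors' implementation: the 80th-percentile cutoff, the SPI cutoff of -1.0, the function names, and the sample series are all assumptions made for illustration.

```python
from statistics import quantiles

def percentile_threshold(values, pct):
    """Return the pct-th percentile (1..99) of values, with linear interpolation."""
    cuts = quantiles(values, n=100, method="inclusive")
    return cuts[pct - 1]

def exceeds(values, threshold):
    """Flag each day whose value is strictly above the threshold."""
    return [v > threshold for v in values]

def at_or_below(values, threshold):
    """Flag each day whose value is at or below the threshold (used for SPI)."""
    return [v <= threshold for v in values]

def compound_days(*flags):
    """A day is a compound-hazard day only if every single hazard is flagged."""
    return [all(day) for day in zip(*flags)]

# Hypothetical daily series for one grid cell (11 days).
temp = [20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 40]
wind = [1, 30, 3, 4, 5, 6, 7, 8, 9, 2, 25]
spi = [0.3, -1.4, 0.1, 0.0, -0.2, 0.5, 0.4, -0.1, 0.2, -0.6, -1.3]

heat = exceeds(temp, percentile_threshold(temp, 80))    # percentile-based
windy = exceeds(wind, percentile_threshold(wind, 80))   # percentile-based
drought = at_or_below(spi, -1.0)                        # assumed fixed SPI cutoff

compound = compound_days(heat, windy, drought)
print([day for day, flag in enumerate(compound) if flag])
```

Applied to gridded data, the same logical AND would be taken cell by cell for each day, which corresponds to overlaying the daily occurrence maps.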

Diabetes Prediction with the TCN-Prophet model using UCI Machine Learning Repository

  • 탄텐보;조인휘
    • Korea Information Processing Society: Conference Proceedings / Korea Information Processing Society 2023 Spring Conference / pp.325-327 / 2023
  • Diabetes is a common chronic disease that threatens human life and health, and its prevalence remains high because its mechanisms are complex and its etiology remains unclear. According to the International Diabetes Federation (IDF), there are 463 million cases of diabetes in adults worldwide, and the number is growing. This study aims to explore the potential influencing factors of diabetes by learning from the UCI diabetes dataset, which is a multivariate time series dataset. In this paper, we propose the TCN-Prophet model for diabetes prediction. The experimental results show that the prediction of insulin concentration by the TCN-Prophet model achieves a high degree of consistency compared with the existing LSTM model.

Using Structural Changes to support the Neural Networks based on Data Mining Classifiers: Application to the U.S. Treasury bill rates

  • 오경주
    • Korean Data and Information Science Society: Conference Proceedings / Korean Data and Information Science Society 2003 Fall Conference / pp.57-72 / 2003
  • This article presents integrated neural network models for interest rate forecasting using change-point detection. The model is composed of three phases. The first phase detects successive structural changes in the interest rate dataset. The second phase forecasts the change-point group with data mining classifiers. The final phase forecasts the interest rate with BPN. Based on this structure, we propose three integrated neural network models that differ in the data mining classifier used: (1) a multivariate discriminant analysis (MDA)-supported neural network model, (2) a case-based reasoning (CBR)-supported neural network model and (3) a backpropagation neural network (BPN)-supported neural network model. We then compare these models with a neural network model alone and, in addition, determine which of the three classifiers (MDA, CBR and BPN) performs better. For interest rate forecasting, this study then examines the predictability of integrated neural network models that represent the structural change.

Artificial Neural Networks for Interest Rate Forecasting based on Structural Change : A Comparative Analysis of Data Mining Classifiers

  • Oh, Kyong-Joo
    • Journal of the Korean Data and Information Science Society / Vol. 14, No. 3 / pp.641-651 / 2003
  • This study proposes hybrid models for interest rate forecasting using structural changes (or change points). The basic concept of the proposed model is to obtain significant intervals caused by change points, to identify them as change-point groups, and to reflect them in interest rate forecasting. The model is composed of three phases. The first phase detects successive structural changes in the U.S. Treasury bill rate dataset. The second phase forecasts the change-point groups with data mining classifiers. The final phase forecasts interest rates with backpropagation neural networks (BPN). Based on this structure, we propose three hybrid models that differ in the data mining classifier used: (1) a multivariate discriminant analysis (MDA)-supported model, (2) a case-based reasoning (CBR)-supported model, and (3) a BPN-supported model. We then compare these models with a neural network model alone and, in addition, determine which of the three classifiers (MDA, CBR and BPN) performs better. For interest rate forecasting, this study then examines the prediction ability of hybrid models that reflect the structural change.

Estimating the AUC of the MROC curve in the presence of measurement errors

  • G, Siva;R, Vishnu Vardhan;Kamath, Asha
    • Communications for Statistical Applications and Methods / Vol. 29, No. 5 / pp.533-545 / 2022
  • Collecting data on several variables, especially in the field of medicine, gives rise to the problem of measurement errors. The presence of such measurement errors may influence the outcomes or the parameter estimates of the model. In a classification scenario, measurement errors affect the intrinsic and summary measures of the Receiver Operating Characteristic (ROC) curve. In the context of the ROC curve, only a few researchers have studied the problem of measurement errors in estimating the area under the ROC curve, and only in the univariate setting. In this paper, we work on estimating the area under the multivariate ROC curve in the presence of measurement errors. The proposed work is supported with a real dataset and simulation studies. Results show that the proposed bias-corrected estimator helps in correcting the AUC with minimum bias and minimum mean square error.

Exploiting Neural Network for Temporal Multi-variate Air Quality and Pollutant Prediction

  • Khan, Muneeb A.;Kim, Hyun-chul;Park, Heemin
    • Journal of Korea Multimedia Society / Vol. 25, No. 2 / pp.440-449 / 2022
  • In recent years, air pollution and the Air Quality Index (AQI) have been a focal point for researchers because of their effect on human health. Various studies have addressed AQI prediction, but most of them either lack dense temporal data or cover only one or two air pollutant elements. In this paper, a hybrid convolutional neural approach integrated with a recurrent neural network architecture (CNN-LSTM) is presented to infer air pollution using a multivariate air pollutant elements dataset. The aim of this research is to design a robust and real-time air pollutant forecasting system by exploiting a neural network. The proposed approach is implemented on a 24-month dataset from Seoul, Republic of Korea. The predicted results are cross-validated with the real dataset and compared with state-of-the-art techniques to evaluate robustness and performance. The proposed model outperforms the SVM, SVM-Polynomial, ANN, and RF models by 60.17%, 68.99%, 14.6%, and 6.29%, respectively. The model outperforms SVM and SVM-Polynomial in predicting O3 by 78.04% and 83.79%, respectively. Overall performance of the model is measured in terms of Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE) and Root Mean Square Error (RMSE).
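
For reference, the three error measures named in this abstract follow their standard definitions. The sketch below uses made-up pollutant readings, and the function and variable names are hypothetical; it does not reproduce the paper's evaluation code.

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent. Assumes no actual value is zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Square Error: square root of the mean squared difference."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical observed and forecast concentrations.
observed = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]

print(mae(observed, forecast), mape(observed, forecast), rmse(observed, forecast))
```

MAE and RMSE are in the units of the measured quantity, while MAPE is scale-free, which is one reason all three are often reported together.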

A Survey on Unsupervised Anomaly Detection for Multivariate Time Series

  • 임주완;이재구
    • Journal of the Korea Institute of Information Security & Cryptology / Vol. 33, No. 1 / pp.1-12 / 2023
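  • In multivariate time series anomaly detection, obtaining labeled data is very time-intensive. Accordingly, many recent studies have examined unsupervised learning methods, which require no labels. However, there has been no in-depth discussion of the key architectures and detailed characteristics specific to multivariate time series anomaly detection. In this paper, we comprehensively analyze and categorize unsupervised multivariate time series anomaly detection models and their strengths. We trained the models on real-world datasets such as power grid and Cyber Physical System (CPS) data while considering realistic anomaly scenarios, and compared the quantitative performance of each model based on the experimental results. Precision, recall, and F1 score were used as the performance metrics.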
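
The metrics named in this abstract can be computed from point-wise anomaly labels as follows. This is a minimal sketch with made-up labels; the survey may use a different evaluation protocol (for example, segment-level adjustment), and the function name is hypothetical.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for binary labels (1 = anomaly)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical ground truth and detector output for eight time steps.
y_true = [0, 0, 1, 1, 0, 1, 0, 0]
y_pred = [0, 1, 1, 0, 0, 1, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```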

Scientific and Technical Visualization for Ocean Process Simulations

  • 최병호
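    • Korean Society for Computational Fluids Engineering: Conference Proceedings / Korean Society for Computational Fluids Engineering 1999 Spring Conference Proceedings / pp.1-10 / 1999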
  • This paper briefly introduces the work done over the twenty years up to 1998 on numerical modeling of ocean processes, focusing on the seas neighbouring the Korean Peninsula. Modeling of global ocean dynamics has also been performed as a pathway to understanding the regional ocean dynamics. Ocean simulation produces a vast multidimensional, multivariate dataset; therefore, adopting scientific and technical visualization techniques was essential to properly understand the physics involved.

Bivariate regional frequency analysis of extreme rainfalls in Korea

  • 신주영;정창삼;안현준;허준행
    • Journal of Korea Water Resources Association / Vol. 51, No. 9 / pp.747-759 / 2018
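  • Multivariate regional frequency analysis, which combines the advantages of multivariate frequency analysis and regional frequency analysis, can provide more information about hydrological phenomena by considering multiple variables and can yield highly accurate results because of the large amount of available data. To date, multivariate regional frequency analysis has not been attempted with rainfall data from Korea, so its applicability to domestic rainfall data needs to be examined. In this study, focusing on parameter estimation, selection of the best-fit distribution, and estimation of the quantile growth curve in multivariate regional frequency analysis, we evaluated the applicability of bivariate regional frequency analysis to bivariate hydrological data, namely annual maximum rainfall depth and duration. The analysis was carried out for 71 stations of the Korea Meteorological Administration. The Frank and Gumbel copula models were selected as the best copula models for the regional rainfall data, and for the marginal distributions, various distributions such as the Gumbel and lognormal distributions were selected as the best fit depending on the region. In terms of relative root mean square error, regional frequency analysis estimated the quantile curves more stably and accurately than at-site frequency analysis. Applying regional frequency analysis to bivariate rainfall analysis is expected to support stable design criteria for hydraulic structures and modeling of the rainfall-duration relationship.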

Fault Detection Method for Multivariate Process using ICA

  • 정승환;김민석;이한수;김종근;김성신
    • Journal of the Korea Institute of Information and Communication Engineering / Vol. 24, No. 2 / pp.192-197 / 2020
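  • Multivariate processes such as large-scale power plants and chemical processes operate in very hazardous environments, so a fault can cause serious human and material losses. Online monitoring technology that can detect system faults in advance is therefore essential. In this paper, we applied ICA to three different multivariate process datasets to perform fault detection and compared its performance with that of PCA. The ICA-based fault detection procedure is broadly divided into an offline stage and an online stage. In the offline stage, a threshold for fault determination is set using data measured while the system is in a normal state. In the online stage, a statistic is computed for the query vector measured in real time, and the computed statistic is compared with the predefined threshold to determine whether a fault has occurred. In experiments on the three multivariate process datasets used in this paper, the ICA-based fault detection method detected system faults in advance and showed better fault detection performance than PCA.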
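
The offline/online split described in this abstract can be sketched as below. Note the substitution: instead of an ICA-based statistic, this sketch uses a simple sum of squared standardized deviations as a stand-in monitoring statistic, and it takes the largest statistic seen in normal data as the threshold. The function names, the `margin` parameter, and the sample data are assumptions for illustration only.

```python
from statistics import mean, pstdev

def monitoring_statistic(x, means, stds):
    """Stand-in statistic: sum of squared standardized deviations (not ICA)."""
    return sum(((v - m) / s) ** 2 for v, m, s in zip(x, means, stds))

def fit_offline(normal_data, margin=1.0):
    """Offline stage: learn scaling and a threshold from normal-operation data."""
    columns = list(zip(*normal_data))
    means = [mean(c) for c in columns]
    stds = [pstdev(c) or 1.0 for c in columns]  # guard against zero spread
    stats = [monitoring_statistic(row, means, stds) for row in normal_data]
    threshold = max(stats) * margin  # assumed rule: largest normal statistic
    return means, stds, threshold

def is_fault(query, means, stds, threshold):
    """Online stage: compare the query vector's statistic with the threshold."""
    return monitoring_statistic(query, means, stds) > threshold

# Hypothetical two-variable measurements taken while the system is healthy.
normal_data = [[1.0, 10.0], [1.2, 10.5], [0.8, 9.5], [1.0, 10.0]]
means, stds, threshold = fit_offline(normal_data)

print(is_fault([1.0, 10.1], means, stds, threshold))  # close to normal operation
print(is_fault([2.0, 14.0], means, stds, threshold))  # far from normal operation
```

In an ICA- or PCA-based monitor, the statistic would instead be computed on the extracted components, but the structure (threshold fitted offline on normal data, comparison performed online per query vector) stays the same.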