• Title/Summary/Keyword: data-based model

Design of Grinding Database by Taking Frame-Based Model (후레임 모델에 의한 연삭가공용 데이터 베이스의 설계)

  • 김건희
    • Journal of the Korean Society for Precision Engineering / v.15 no.2 / pp.107-113 / 1998
  • In grinding operations it is difficult to supply every user with both the qualitative knowledge of skilled experts and the quantitative data they need. A grinding database designed on a frame-based model is a more effective way to make use of empirical and qualitative knowledge. This paper describes a basic strategy for developing a database for grinding, a process that depends strongly on experience and intuition, using a frame-based model. The frame-based grinding database, including the interaction and inference among slots, is implemented with an object-oriented paradigm.

Ensemble Gene Selection Method Based on Multiple Tree Models

  • Mingzhu Lou
    • Journal of Information Processing Systems / v.19 no.5 / pp.652-662 / 2023
  • Identifying highly discriminating genes is a critical step in tumor recognition tasks based on microarray gene expression profile data and machine learning. Gene selection based on tree models has been the subject of several studies. However, these methods rely on a single tree model and are often not robust to ultra-high-dimensional microarray datasets, resulting in the loss of useful information and unsatisfactory classification accuracy. Motivated by the limitations of single-tree-based gene selection, this study examined ensemble gene selection methods based on multiple tree models to improve the classification performance of tumor identification. Specifically, we selected the three most representative tree models: ID3, random forest, and gradient boosting decision tree. Each tree model selects the top-n genes from the microarray dataset based on its intrinsic mechanism. Subsequently, three ensemble gene selection methods were investigated, namely multiple-tree model intersection, multiple-tree module union, and multiple-tree module cross-union. Experimental results on five public benchmark microarray gene expression datasets showed that the multiple-tree module union is significantly superior in classification accuracy to gene selection based on a single tree model and to other competitive gene selection methods.
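
The intersection and union steps described in this abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the importance scores, gene names, and helper functions (`top_n_genes`, `ensemble_intersection`, `ensemble_union`) are all hypothetical, and the cross-union variant is omitted because the abstract does not define it.

```python
from typing import Dict, List, Set


def top_n_genes(importance: Dict[str, float], n: int) -> Set[str]:
    """Return the n genes with the highest importance score.

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(importance, key=lambda gene: (-importance[gene], gene))
    return set(ranked[:n])


def ensemble_intersection(selections: List[Set[str]]) -> Set[str]:
    """Keep only the genes selected by every tree model."""
    return set.intersection(*selections) if selections else set()


def ensemble_union(selections: List[Set[str]]) -> Set[str]:
    """Keep every gene selected by at least one tree model."""
    return set.union(*selections) if selections else set()


# Hypothetical per-model importance scores (assumed for illustration only;
# in practice these would come from fitted ID3, random forest, and GBDT models).
id3_scores = {"g1": 0.9, "g2": 0.7, "g3": 0.1, "g4": 0.0}
rf_scores = {"g1": 0.5, "g2": 0.1, "g3": 0.6, "g4": 0.2}
gbdt_scores = {"g1": 0.8, "g2": 0.05, "g3": 0.1, "g4": 0.4}

selections = [top_n_genes(s, 2) for s in (id3_scores, rf_scores, gbdt_scores)]
common = ensemble_intersection(selections)
combined = ensemble_union(selections)

print("intersection:", sorted(common))
print("union:", sorted(combined))
```

The intersection tends to give a small, conservative gene set, while the union keeps genes that any single model considers informative; which one classifies better is an empirical question that the paper addresses.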

Estimation of Optimal Training Period for the Deep-Learning LSTM Model to Forecast CMIP5-based Streamflow (CMIP5 기반 하천유량 예측을 위한 딥러닝 LSTM 모형의 최적 학습기간 산정)

  • Chun, Beom-Seok;Lee, Tae-Hwa;Kim, Sang-Woo;Lim, Kyoung-Jae;Jung, Young-Hun;Do, Jong-Won;Shin, Yong-Chul
    • Journal of The Korean Society of Agricultural Engineers / v.64 no.1 / pp.39-50 / 2022
  • This study proposes an optimal training period for predicting streamflow with a deep-learning LSTM (Long Short-Term Memory) model under CMIP5 (the fifth phase of the Coupled Model Intercomparison Project) future climate scenarios. The Jinan-gun (Seongsan-ri) site was selected to validate the performance of the LSTM model. We confirmed that the LSTM-based streamflow was highly comparable to the measurements during the calibration (2000 to 2002/2014 to 2015) and validation (2003 to 2005/2016 to 2017) periods. Additionally, we compared the LSTM-based streamflow to the SWAT-based output during the calibration (2000~2015) and validation (2016~2019) periods. The results indicated that the LSTM model also performed well in simulating streamflow over the long-term period, although small uncertainties remain. The SWAT-based daily streamflow was then forecast for 2011~2100 using CMIP5 climate scenario forcing data. We tested and determined the optimal training period for the LSTM model by comparing the LSTM-based and SWAT-based streamflow under various scenarios. Note that the SWAT-based streamflow values were treated as observations because no measurements exist for the future period (2011~2100). The LSTM-based streamflow was similar to the SWAT-based streamflow when more than 30 years of training data were used. These findings indicate that training periods longer than 30 years are required to obtain reliable LSTM-based streamflow forecasts under climate change scenarios.

A Comparative Study of Estimation by Analogy using Data Mining Techniques

  • Nagpal, Geeta;Uddin, Moin;Kaur, Arvinder
    • Journal of Information Processing Systems / v.8 no.4 / pp.621-652 / 2012
  • Software estimations provide an inclusive set of directives for software project developers, project managers, and management to produce more realistic estimates from deficient, uncertain, and noisy data. A range of estimation models is being explored in industry as well as in academia, but choosing the best model is difficult. Estimation by Analogy (EbA) is a form of case-based reasoning that uses fuzzy logic, grey system theory, machine-learning techniques, and similar methods for optimization. This research compares the estimation accuracy of several conventional data mining models with that of a hybrid model. The data mining models considered include linear regression models such as ordinary least squares and ridge regression, and nonlinear models such as neural networks, support vector machines, and multivariate adaptive regression splines. A precise and comprehensible predictive model based on the integration of GRA and regression is introduced and compared. Empirical results show that regression combined with GRA gives outstanding results, indicating that the methodology has great potential and can be used as a candidate approach for software effort estimation.

Assessment of Feasibility of Rainfall-Runoff Simulation Using SRTM-DEM Based on SWMM (SWMM 기반 SRTM-DEM을 활용한 강우-유출 모의 가능성 평가)

  • Mirae Kim;Junsuk Kang
    • Journal of Environmental Science International / v.33 no.7 / pp.443-452 / 2024
  • The recent increase in impermeable surfaces due to urbanization and the occurrence of concentrated heavy rainfall events caused by climate change have led to an increase in urban flooding. To predict and prepare for flood damage, a convenient and highly accurate simulation of rainfall-runoff based on geospatial information is essential. In this study, the storm water management model (SWMM) was applied to simulate rainfall-runoff in the Bangbae-dong area of Seoul using two sets of topographical data: the conventional topographic digital elevation model (TOPO-DEM) and the proposed shuttle radar topography mission (SRTM)-DEM. To evaluate the applicability of the SRTM-DEM for rainfall-runoff modeling, two DEMs were constructed for the study area, and rainfall-runoff simulations were performed. The constructed terrain data generally reflected the topographical characteristics of the study area. Quantitative evaluation of the rainfall-runoff simulation results indicated that the outcomes were similar to those obtained using the existing TOPO-DEM. Based on these results, we propose the use of SRTM-DEM, a more convenient terrain dataset, in rainfall-runoff studies, rather than asserting the superiority of any specific geospatial dataset.

Bayesian Estimation of the Normal Means under Model Perturbation

  • Kim, Dal-Ho;Han, Seung-Cheol
    • Journal of the Korean Data and Information Science Society / v.17 no.3 / pp.1009-1019 / 2006
  • In this paper, we consider the simultaneous estimation problem for normal means. We set up the model structure using several different error distributions in order to observe how perturbing the model for the error terms affects the empirical Bayes and hierarchical Bayes estimators. We compare the performance of these estimators under model perturbation through a simulation study.

Development of Performance Analysis Methodology for Nuclear Power Plant Turbine Cycle Using Validation Model of Performance Measurements (원전 터빈사이클 성능 데이터의 검증 모델에 의한 성능분석 기법의 개발)

  • Kim, Seong-Geun;Choe, Gwang-Hui
    • Transactions of the Korean Society of Mechanical Engineers B / v.24 no.12 / pp.1625-1634 / 2000
  • Verification of measurements is required for precise evaluation of turbine cycle performance in a nuclear power plant. We assumed that the plant's initial acceptance data and design data could provide correlation information among the performance data. These data can serve as sample sets for a model that estimates the correct measurement values. In practice, the model was built as a regression model based on plant design data, plant acceptance data, and verified plant performance data from a domestic nuclear power plant. With this validation scheme, a more robust performance analysis system can be constructed for an operating nuclear power plant.

A Lifelog Common Data Reference Model for the Healthcare Ecosystem (디지털 헬스케어 생태계 활성화를 위한 라이프로그 공통데이터 참조모델)

  • Lee, Young-joo;Ko, Yoon-seok
    • Knowledge Management Research / v.19 no.4 / pp.149-170 / 2018
  • The healthcare lifelog, a personal record relating to disease treatment and healthcare, plays an important role in the healthcare paradigm shift in which medical and information technology converge. Healthcare services based on various healthcare lifelogs are being launched domestically by large corporations and by small and medium enterprises alike. However, each service is built on an individual platform that depends on its own company. As a result, the terms used for lifelog data differ and the measurement specifications are not uniform. This study proposes a reference model for the minimum common data required for sharing and utilizing healthcare lifelogs. A literature study and an expert survey yielded 3 domains, 17 essential items, and 51 sub-items. The model provides the definition, measurement data format, measurement method, and precautions for each detailed measurement item, and provides the guidelines needed for designing and building data and services for healthcare. This study is significant as basic research that supports activation of the ecosystem by ensuring data interoperability between heterogeneous healthcare devices linked to a digital healthcare platform.

Smart contract research for data outlier detection and processing of ARIMA model

  • Min, Youn-A
    • International Journal of Internet, Broadcasting and Communication / v.14 no.4 / pp.140-147 / 2022
  • In this study, to efficiently detect data patterns and outliers in time series data, outlier detection is performed for each section by a smart contract during data preprocessing, and the parameters of the ARIMA model are determined by generating and reflecting parameters related to the significance and outliers of the data. These parameters were applied to a modified arithmetic expression to reduce data abnormality. To evaluate performance, the normality of the data was compared between the parameters of the general ARIMA model and those of the ARIMA model proposed in this study, and a performance improvement of more than 6% was confirmed.

The Analysis Framework for User Behavior Model using Massive Transaction Log Data (대규모 로그를 사용한 유저 행동모델 분석 방법론)

  • Lee, Jongseo;Kim, Songkuk
    • The Journal of Bigdata / v.1 no.2 / pp.1-8 / 2016
  • User activity logs contain a great deal of hidden information, but they are unstructured and too massive to process easily, so much of that information remains uncovered. In particular, the logs include time series data, which can reveal much of it. However, log data cannot be used directly to analyze user behavior: analyzing a user activity model requires a transformation process through a separate framework. We therefore first need to establish an analysis framework for the user activity model and then access the data. In this paper, we propose a novel framework model for analyzing user activity models effectively. It includes a MapReduce process for analyzing massive data quickly in a distributed environment and a data architecture design for analyzing the user activity model. We also explain the data model in detail, based on the log design of a real online service. Through this process, we describe which analysis model fits a specific data model. This improves the understanding of how to process massive logs and design analysis models.