• Title/Summary/Keyword: data-based model

On the Development of an initial Hull Structural CAD System based on the Semantic Product Data Model (의미론적 제품 데이터 모델 기반 초기 선체 구조 CAD 시스템 개발)

  • 이원준;이규열;노명일;권오환
    • Korean Journal of Computational Design and Engineering / v.7 no.3 / pp.157-169 / 2002
  • In the initial stages of ship design, designers represent the geometry, arrangement, and dimensions of hull structures with 2D geometric primitives such as points, lines, arcs, and drawing symbols. However, this design information (the 2D geometric primitives), defined on the drawing sheet, has to be interpreted again by designers in the following design stages. As a result, design semantics can be lost and subsequent design processes can be delayed. Commercial 3D CAD systems, which were developed for the detail and production design stages, are not easy to adopt in the initial design stages because they require detailed input for geometry definition. In this study, a semantic product model data structure was proposed, and an initial structural CAD system was developed based on it. The contents of the proposed data structure (product model data and design knowledge) are filled in with minimal designer input, and a 3D solid model and production material information can then be generated automatically on demand. Finally, the applicability of the proposed data structure and the developed CAD system was verified by applying them to the product modeling procedure of a 300,000-ton deadweight VLCC (Very Large Crude oil Carrier).

Variable selection and prediction performance of penalized two-part regression with community-based crime data application

  • Seong-Tae Kim;Man Sik Park
    • Communications for Statistical Applications and Methods / v.31 no.4 / pp.441-457 / 2024
  • Semicontinuous data are characterized by a mixture of a point probability mass at zero and a continuous distribution of positive values. This type of data is often modeled using a two-part model, where the first part models the probability of the dichotomous outcome (zero or positive) and the second part models the distribution of the positive values. Despite the two-part model's popularity, variable selection in this model has not been fully addressed, especially for high-dimensional data. The objective of this study is to investigate the variable selection and prediction performance of penalized regression methods in two-part models. The performance of the selected techniques in the two-part model is evaluated via simulation studies. Our findings show that LASSO and ENET tend to select more predictors than SCAD and MCP. Consequently, MCP and SCAD outperform LASSO and ENET for β-specificity, whereas LASSO and ENET perform better than MCP and SCAD with respect to the mean squared error. We find similar results when applying the penalized regression methods to the prediction of crime incidents using community-based data.
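The four penalties compared in this abstract differ in how strongly they penalize a single coefficient. The sketch below writes out their standard textbook forms for one coefficient. The function names, the default shape parameters (a = 3.7, gamma = 3.0), and the glmnet-style elastic net parameterization are assumptions made for illustration and are not taken from the paper.

```python
def lasso_penalty(beta, lam):
    """LASSO: grows linearly with |beta| without bound."""
    return lam * abs(beta)


def enet_penalty(beta, lam, alpha=0.5):
    """Elastic net (glmnet-style mix of L1 and L2; alpha is assumed here)."""
    return lam * (alpha * abs(beta) + (1.0 - alpha) * beta * beta / 2.0)


def scad_penalty(beta, lam, a=3.7):
    """SCAD: linear near zero, quadratic transition, then constant."""
    b = abs(beta)
    if b <= lam:
        return lam * b
    if b <= a * lam:
        return (2.0 * a * lam * b - b * b - lam * lam) / (2.0 * (a - 1.0))
    return lam * lam * (a + 1.0) / 2.0


def mcp_penalty(beta, lam, gamma=3.0):
    """MCP: penalty rate decays to zero, constant beyond gamma * lam."""
    b = abs(beta)
    if b <= gamma * lam:
        return lam * b - b * b / (2.0 * gamma)
    return gamma * lam * lam / 2.0
```

SCAD and MCP level off for large coefficients while LASSO and elastic net keep growing, which is the usual explanation for their different selection behavior.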

Model-Based Survival Estimates of Female Breast Cancer Data

  • Khan, Hafiz Mohammad Rafiqullah;Saxena, Anshul;Gabbidon, Kemesha;Rana, Sagar;Ahmed, Nasar Uddin
    • Asian Pacific Journal of Cancer Prevention / v.15 no.6 / pp.2893-2900 / 2014
  • Background: Statistical methods are very important for precisely measuring breast cancer patient survival times for healthcare management. Previous studies considered basic statistics to measure survival times without incorporating statistical modeling strategies. The objective of this study was to develop a data-based statistical probability model from female breast cancer patients' survival times, using the Bayesian approach to make predictive inferences about future survival times. Materials and Methods: A random sample of 500 female patients was selected from the Surveillance Epidemiology and End Results cancer registry database. For goodness of fit, the standard model building criteria were used. The Bayesian approach was used to obtain the predictive survival times from the data-based Exponentiated Exponential Model. The Markov Chain Monte Carlo method was used to obtain the summary results for predictive inference. Results: The highest number of female breast cancer patients was found in California and the lowest in New Mexico. The majority of them were married. The mean (SD) age at diagnosis (in years) was 60.92 (14.92). The mean (SD) survival time (in months) for female patients was 90.33 (83.10). The Exponentiated Exponential Model fit the female survival times better than the Exponentiated Weibull Model. The Bayesian method was used to obtain predictive inference for future survival times. Conclusions: The findings with the proposed modeling strategy will assist healthcare researchers and providers in precisely predicting future survival estimates, as the growing challenges of analyzing healthcare data have created new demand for model-based survival estimates. The application of the Bayesian approach will produce precise estimates of future survival times.
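For orientation, the exponentiated exponential distribution named in this abstract has the cumulative distribution function F(x) = (1 - exp(-lam * x))^alpha for x > 0. The sketch below evaluates its density, survival function, and quantiles. The function names are hypothetical, the parameter values in the usage note are arbitrary and are not the paper's estimates, and the Bayesian and MCMC steps are not reproduced.

```python
import math


def ee_cdf(x, alpha, lam):
    """CDF of the exponentiated exponential distribution."""
    if x <= 0:
        return 0.0
    return (1.0 - math.exp(-lam * x)) ** alpha


def ee_survival(x, alpha, lam):
    """Probability of surviving beyond time x."""
    return 1.0 - ee_cdf(x, alpha, lam)


def ee_pdf(x, alpha, lam):
    """Density, obtained by differentiating the CDF."""
    if x <= 0:
        return 0.0
    e = math.exp(-lam * x)
    return alpha * lam * e * (1.0 - e) ** (alpha - 1.0)


def ee_quantile(p, alpha, lam):
    """Inverse CDF for 0 < p < 1, e.g. p = 0.5 gives the median."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return -math.log(1.0 - p ** (1.0 / alpha)) / lam
```

With alpha = 1 the model reduces to the ordinary exponential distribution, so `ee_survival(x, 1.0, lam)` equals `exp(-lam * x)`. A call such as `ee_quantile(0.5, 2.0, 0.01)` returns the median survival time under those arbitrary parameters.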

Prediction Model of User Physical Activity using Data Characteristics-based Long Short-term Memory Recurrent Neural Networks

  • Kim, Joo-Chang;Chung, Kyungyong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.4 / pp.2060-2077 / 2019
  • Recently, mobile healthcare services have attracted significant attention because of the emerging development and supply of diverse wearable devices. Smartwatches and health bands are the most common type of mobile-based wearable devices and their market size is increasing considerably. However, simple value comparisons based on accumulated data have revealed certain problems, such as the standardized nature of health management and the lack of personalized health management service models. The convergence of information technology (IT) and biotechnology (BT) has shifted the medical paradigm from continuous health management and disease prevention to the development of a system that can be used to provide ground-based medical services regardless of the user's location. Moreover, the IT-BT convergence has necessitated the development of lifestyle improvement models and services that utilize big data analysis and machine learning to provide mobile healthcare-based personal health management and disease prevention information. Users' health data, which are specific as they change over time, are collected by different means according to the users' lifestyle and surrounding circumstances. In this paper, we propose a prediction model of user physical activity that uses data characteristics-based long short-term memory (DC-LSTM) recurrent neural networks (RNNs). To provide personalized services, the characteristics and surrounding circumstances of data collectable from mobile host devices were considered in the selection of variables for the model. The data characteristics considered were ease of collection, which represents whether or not variables are collectable, and frequency of occurrence, which represents whether or not changes made to input values constitute significant variables in terms of activity. 
The variables selected for providing personalized services were activity, weather, temperature, mean daily temperature, humidity, UV, fine dust, asthma and lung disease probability index, skin disease probability index, cadence, travel distance, mean heart rate, and sleep hours. The selected variables were classified according to the data characteristics. To predict activity, an LSTM RNN was built that uses the classified variables as input data and learns the dynamic characteristics of time series data. LSTM RNNs resolve the vanishing gradient problem that occurs in existing RNNs. They are classified into three different types according to data characteristics and constructed through connections among the LSTMs. The constructed neural network learns from training data and predicts user activity. To evaluate the proposed model, the root mean square error (RMSE) was used to compare it with user physical activity prediction methods based on an autoregressive integrated moving average (ARIMA) model, a convolutional neural network (CNN), and an RNN. The results show that the proposed DC-LSTM RNN method yields an excellent mean RMSE value of 0.616. The proposed method can be used with existing standardized activity prediction services to predict significant activity, taking the surrounding circumstances and user status into account. It can also be used to predict user physical activity and provide personalized healthcare based on the data collectable from mobile host devices.
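The evaluation metric in this abstract, RMSE, is the square root of the mean squared difference between observed and predicted values. A minimal sketch follows; the function name is hypothetical and the numbers in the usage note are made up, not data from the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

For example, `rmse([0, 0], [3, 4])` is the square root of 12.5. A lower value means the predictions track the observations more closely, which is how competing models such as ARIMA, CNN, and RNN baselines would be ranked.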

Artificial neural network for classifying with epilepsy MEG data (뇌전증 환자의 MEG 데이터에 대한 분류를 위한 인공신경망 적용 연구)

  • Yujin Han;Junsik Kim;Jaehee Kim
    • The Korean Journal of Applied Statistics / v.37 no.2 / pp.139-155 / 2024
  • This study performed a multi-class classification task to distinguish patients with mesial temporal lobe epilepsy with left hippocampal sclerosis (left mTLE), patients with mesial temporal lobe epilepsy with right hippocampal sclerosis (right mTLE), and healthy controls (HC) using magnetoencephalography (MEG) data. We applied various artificial neural networks and compared the results. Among the models built with convolutional neural networks (CNN), recurrent neural networks (RNN), and graph neural networks (GNN), the average k-fold accuracy was highest for the CNN-based model, followed by the GNN-based model and then the RNN-based model. The wall time was best for the RNN-based model, followed by the GNN-based model and then the CNN-based model. The graph neural network, which performs well in both accuracy and wall time and scales well to network data, is considered the most suitable model for future brain research.
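The average k-fold accuracy used to rank the models can be sketched as follows. This is a generic illustration, not the authors' pipeline: the function names and the `fit` and `predict` callables are hypothetical, and the folds are taken as contiguous blocks without shuffling or stratification, which a real study would usually add.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    if not 1 < k <= n_samples:
        raise ValueError("k must satisfy 1 < k <= n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def mean_k_fold_accuracy(features, labels, k, fit, predict):
    """Average test accuracy over k folds for a user-supplied model."""
    accuracies = []
    for train, test in k_fold_indices(len(labels), k):
        model = fit([features[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(model, features[i]) == labels[i] for i in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / k
```

Here `fit(train_features, train_labels)` returns any model object and `predict(model, feature)` returns a label, so the same loop can wrap a CNN, RNN, or GNN training routine.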

Demand Analysis for Community-based Tourism Using Count Data Models (가산자료모형을 이용한 지역사회기반형 관광수요 분석)

  • Yun, Hee-Jeong
    • The Korean Journal of Community Living Science / v.22 no.2 / pp.247-255 / 2011
  • This study analyzed the demand for community-based tourism sites using four count data models: a Poisson model, a negative binomial model, a truncated Poisson model, and a truncated negative binomial model. For this purpose, questionnaire surveys of 406 tourists were conducted at 5 community-based tourism sites in Chuncheon city, and the responses were analyzed using the STATA program. The fit of all four models was significant (p=0.0000) according to a likelihood ratio test. The results suggest that, in the Poisson and truncated Poisson models, visitor demand for community-based tourism sites was influenced by previous visiting experience, recognition of sustainable tourism, visitation of downtown, purchase of souvenirs or farm produce, conversation with regional residents, regional harmony, preservation of natural resources, and sex. However, visitation of downtown, preservation of natural resources, and sex were not significant in the negative binomial model, and visitation of downtown and preservation of natural resources were not significant in the truncated negative binomial model. These results on visiting demand can provide information for sustainable regional development strategies.
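The count data models named in this abstract differ in their probability mass functions. The sketch below gives the Poisson, zero-truncated Poisson, and negative binomial (mean and dispersion form, variance mu + alpha * mu^2) mass functions. It assumes truncation at zero, which is the usual choice when a survey can only observe people with at least one visit; the abstract does not state the truncation point. Function names and parameter values are hypothetical, and no regression on covariates is attempted.

```python
import math


def poisson_pmf(y, lam):
    """P(Y = y) for a Poisson count with mean lam."""
    return math.exp(-lam) * lam ** y / math.factorial(y)


def truncated_poisson_pmf(y, lam):
    """Zero-truncated Poisson: condition on Y >= 1."""
    if y < 1:
        return 0.0
    return poisson_pmf(y, lam) / (1.0 - math.exp(-lam))


def negbin_pmf(y, mu, alpha):
    """Negative binomial with mean mu and dispersion alpha > 0."""
    r = 1.0 / alpha
    log_p = (math.lgamma(y + r) - math.lgamma(y + 1.0) - math.lgamma(r)
             + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))
    return math.exp(log_p)


def truncated_negbin_pmf(y, mu, alpha):
    """Zero-truncated negative binomial: condition on Y >= 1."""
    if y < 1:
        return 0.0
    return negbin_pmf(y, mu, alpha) / (1.0 - negbin_pmf(0, mu, alpha))
```

The negative binomial adds the dispersion parameter alpha to allow a variance larger than the mean, which can change which covariates appear significant compared with the Poisson model.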

Model Updating Using the Closed-loop Natural Frequency (폐루프 공진 주파수를 이용한 모델 개선법)

  • Jung Hunsang;Park Youngjin
    • Transactions of the Korean Society for Noise and Vibration Engineering / v.14 no.9 s.90 / pp.801-810 / 2004
  • Parameter modification of a linear finite element model (FEM) based on the modal sensitivity matrix is usually performed by matching FEM modal data to experimental data. However, there are cases where this method cannot be applied successfully, such as a lack of reliable modal data or ill-conditioning of the modal sensitivity matrix. In this research, a novel concept of introducing feedback loops into the conventional modal test setup is proposed. This method uses closed-loop natural frequency data for parameter modification to overcome the problems associated with the conventional method based on the modal sensitivity matrix. We propose the complete procedure of parameter modification using the closed-loop natural frequency data, including the modal sensitivity modification and the controller design method. The proposed controller design method is efficient in changing modes. A numerical simulation of parameter estimation based on time-domain input/output data is provided to demonstrate the estimation performance of the proposed method.
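The idea that feedback loops supply additional frequency data can be illustrated with a deliberately small toy case, which is not the authors' procedure. Assume an undamped single-degree-of-freedom system m x'' + k x = u with proportional displacement feedback u = -g x, so the closed-loop natural frequency is sqrt((k + g) / m). Measuring that frequency for two known gains gives two equations in the two unknowns m and k. All names and values below are hypothetical.

```python
import math


def closed_loop_frequency(mass, stiffness, gain):
    """Natural frequency (rad/s) with displacement feedback u = -gain * x."""
    return math.sqrt((stiffness + gain) / mass)


def estimate_mass_and_stiffness(gain_1, omega_1, gain_2, omega_2):
    """Solve omega_i^2 * m = k + g_i for m and k from two closed-loop tests."""
    if omega_1 == omega_2:
        raise ValueError("the two tests must give different frequencies")
    mass = (gain_2 - gain_1) / (omega_2 ** 2 - omega_1 ** 2)
    stiffness = omega_1 ** 2 * mass - gain_1
    return mass, stiffness
```

With a single open-loop frequency only the ratio k / m is identifiable; the second, closed-loop measurement separates the two parameters. The method in the paper addresses multi-degree-of-freedom finite element models, where this toy algebra does not apply directly.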

A Study of Data Mining Optimization Model for the Credit Evaluation

  • Kim, Kap-Sik;Lee, Chang-Soon
    • Journal of the Korean Data and Information Science Society / v.14 no.4 / pp.825-836 / 2003
  • Based on customer information and financing processes in the capital market, we derived individual models by applying multi-layered perceptrons, MDA, and decision trees. Further, the results from the existing single models were compared with the results from an integrated model developed using a genetic algorithm. This study contributes not only to verifying the existing individual models but also to overcoming the limitations of the existing approaches. Existing approaches have relied on comparing individual models and searching for the best-fit model. In contrast, this study presents a methodology for building an integrated data mining model using a genetic algorithm.

A Channel Equalization Algorithm Using Neural Network Based Data Least Squares (뉴럴네트웍에 기반한 Data Least Squares를 사용한 채널 등화기 알고리즘)

  • Lim, Jun-Seok;Pyeon, Yong-Kuk
    • The Journal of the Acoustical Society of Korea / v.26 no.2E / pp.63-68 / 2007
  • Using the neural network model for oriented principal component analysis (OPCA), we propose a solution to the data least squares (DLS) problem, in which the error is assumed to lie in the data matrix only. In this paper, we applied this neural network model to channel equalization. Simulations show that the neural network based DLS outperforms ordinary least squares in channel equalization problems.
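The difference between ordinary least squares and data least squares can be seen in the simplest case of one unknown. Because DLS places all the error in the data matrix, it minimizes ||A x - b||^2 / ||x||^2. For a single column a and scalar x this has the closed form x = (b . b) / (a . b), compared with (a . b) / (a . a) for ordinary least squares. The sketch below shows only this scalar case with hypothetical function names; it is not the OPCA neural network solver described in the paper.

```python
def dot(u, v):
    """Inner product of two equally long sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def ols_scalar(a, b):
    """Ordinary least squares: all error is attributed to b."""
    return dot(a, b) / dot(a, a)


def dls_scalar(a, b):
    """Data least squares: all error is attributed to the data column a."""
    return dot(b, b) / dot(a, b)
```

With noise-free data, such as `a = [1, 2, 3]` and `b = [2, 4, 6]`, both estimators return 2. When only `a` is perturbed, the two estimates differ because they attribute the mismatch to different sides of the equation.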

Design of Distributed Processing Framework Based on H-RTGL One-class Classifier for Big Data (빅데이터를 위한 H-RTGL 기반 단일 분류기 분산 처리 프레임워크 설계)

  • Kim, Do Gyun;Choi, Jin Young
    • Journal of Korean Society for Quality Management / v.48 no.4 / pp.553-566 / 2020
  • Purpose: The purpose of this study was to design a framework for generating a one-class classification algorithm based on the Hyper-Rectangle (H-RTGL) in a distributed environment connected by a network. Methods: First, we devised a one-class classifier based on H-RTGL that can be executed by distributed computing nodes, considering model and data parallelism. Then, we designed facilitating components for the execution of distributed processing. Finally, we validated both the effectiveness and the efficiency of the classifier obtained from the proposed framework through a numerical experiment using data sets obtained from the UCI machine learning repository. Results: We designed a distributed processing framework capable of one-class classification based on H-RTGL in a distributed environment consisting of physically separated computing nodes. It includes components for implementing model and data parallelism, which enable distributed generation of the classifier. In the numerical experiment, a statistical test showed no significant change in classification performance, and the elapsed time was reduced by distributed processing for data sets of considerable size. Conclusion: Based on these results, we conclude that applying distributed processing to classifier generation preserves classification performance and improves the efficiency of classification algorithms. In addition, we suggest directions for future research and discuss the limitations of our work.
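Data parallelism for a hyper-rectangle classifier can be sketched in a simplified form: each node fits an axis-aligned bounding box to its own partition of the target-class data, and the boxes are then merged by taking coordinate-wise minima and maxima. This is only an illustration under strong assumptions. The actual H-RTGL classifier may use several hyper-rectangles and a different generation rule, and every name below is hypothetical.

```python
def fit_box(points):
    """Fit one axis-aligned bounding box (lows, highs) to a list of points."""
    dims = range(len(points[0]))
    lows = [min(p[d] for p in points) for d in dims]
    highs = [max(p[d] for p in points) for d in dims]
    return lows, highs


def merge_boxes(boxes):
    """Combine boxes fitted on separate partitions into one enclosing box."""
    dims = range(len(boxes[0][0]))
    lows = [min(box[0][d] for box in boxes) for d in dims]
    highs = [max(box[1][d] for box in boxes) for d in dims]
    return lows, highs


def is_target_class(box, point):
    """Classify a point as target class if it lies inside the box."""
    lows, highs = box
    return all(lo <= x <= hi for lo, hi, x in zip(lows, highs, point))
```

Because minima and maxima are associative, merging the per-partition boxes gives exactly the box that a single node would fit on the full data set. In this simplified setting, that is why distribution does not change the classification result.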