• Title/Summary/Keyword: Data-based model

Conditional Variational Autoencoder-based Generative Model for Gene Expression Data Augmentation (유전자 발현량 데이터 증대를 위한 Conditional VAE 기반 생성 모델)

  • Hyunsu Bong;Minsik Oh
    • Journal of Broadcast Engineering / v.28 no.3 / pp.275-284 / 2023
  • Gene expression data can be utilized in various studies, including the prediction of disease prognosis. However, there are challenges associated with collecting enough data due to cost constraints. In this paper, we propose a gene expression data generation model based on Conditional Variational Autoencoder. Our results demonstrate that the proposed model generates synthetic data with superior quality compared to two other state-of-the-art models for gene expression data generation, namely the Wasserstein Generative Adversarial Network with Gradient Penalty based model and the structured data generation models CTGAN and TVAE.

Role of Scientific Reasoning in Elementary School Students' Construction of Food Pyramid Prediction Models (초등학생들의 먹이 피라미드 예측 모형 구성에서 과학적 추론의 역할)

  • Han, Moonhyun
    • Journal of Korean Elementary Science Education / v.38 no.3 / pp.375-386 / 2019
  • This study explores how elementary school students construct food pyramid prediction models using scientific reasoning. Thirty small groups of sixth-grade students in the Kyoungki province (n=138) participated in this study; each small group constructed a food pyramid prediction model based on scientific reasoning, utilizing prior knowledge on topics such as biotic and abiotic factors, food chains, food webs, and food pyramid concepts. To understand the scientific reasoning applied by the students during the modeling process, three forms of qualitative data were collected and analyzed: each small group's discourse, their representation, and the researcher's field notes. Based on these data, the researcher categorized the students' model patterns into three categories and identified how the students used scientific reasoning in their model patterns. The study found that the model patterns consisted of the population number variation model, the biotic and abiotic factors change model, and the equilibrium model. In the population number variation model, students used phenomenon-based reasoning and relation-based reasoning to predict variations in the number of producers and consumers. In the biotic and abiotic factors change model, students used relation-based reasoning to predict the effects on producers and consumers as well as on decomposers and abiotic factors. In the equilibrium model, students predicted that "the food pyramid would reach equilibrium," using relation-based reasoning and model-based reasoning. This study demonstrates that elementary school students can systematically elaborate on complicated ecology concepts using scientific reasoning and modeling processes.

Pavement Performance Model Development Using Bayesian Algorithm (베이지안 기법을 활용한 공용성 모델개발 연구)

  • Mun, Sungho
    • International Journal of Highway Engineering / v.18 no.1 / pp.91-97 / 2016
  • PURPOSES: The objective of this paper is to develop a pavement performance model based on the Bayesian algorithm, and to compare the measured and predicted performance data. METHODS: In this paper, several pavement types such as SMA (stone mastic asphalt), PSMA (polymer-modified stone mastic asphalt), PMA (polymer-modified asphalt), SBS (styrene-butadiene-styrene) modified asphalt, and DGA (dense-graded asphalt) are modeled in terms of the performance evaluation of pavement structures, using the Bayesian algorithm. RESULTS: From case studies related to the performance model development, the statistical parameters of the mean value and standard deviation can be obtained through the Bayesian algorithm, using the initial performance data of two different pavement cases. Furthermore, an accurate performance model can be developed, based on the comparison between the measured and predicted performance data. CONCLUSIONS: Based on the results of the case studies, it is concluded that the determined coefficients of the nonlinear performance models can be used to accurately predict the long-term performance behaviors of DGA and modified asphalt concrete pavements. In addition, the developed models were evaluated through comparison studies between the initial measurement and prediction data, as well as between the final measurement and prediction data. In the model development, the initial measured data were used.

Precision Evaluation of Recent Global Geopotential Models based on GNSS/Leveling Data on Unified Control Points

  • Lee, Jisun;Kwon, Jay Hyoun
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.38 no.2 / pp.153-163 / 2020
  • Since the launch of GOCE (Gravity Field and Steady-State Ocean Circulation Explorer), which obtains high-frequency gravity signals using a gravity gradiometer, many research institutes have concentrated on developing GGMs (Global Geopotential Models) based on GOCE data and evaluating their precision. The precision of some GGMs has also been evaluated in Korea. However, some of those studies dealt with GGMs constructed from initial GOCE data, and others applied only a part of the GNSS (Global Navigation Satellite System)/Leveling data on UCPs (Unified Control Points) for the precision evaluation. GGMs of a higher degree than EGM2008 (Earth Gravitational Model 2008) are now available, and the UCPs were fully established at the end of 2019. Thus, EIGEN-6C4 (European Improved Gravity Field of the Earth by New techniques - 6C4), GECO (GOCE and EGM2008 Combined model), XGM2016 (Experimental Gravity Field Model 2016), SGG-UGM-1, and XGM2019e_2159 were collected along with EGM2008, and their precision was assessed based on the GNSS/Leveling data on UCPs. Among the GGMs, XGM2019e_2159 showed the minimum difference with respect to a total of 5,313 points of GNSS/Leveling data. This is an improvement of about 1.5 cm and 0.6 cm compared to EGM2008 and EIGEN-6C4, respectively. In particular, the local biases in the northern part of Gyeonggi-do and on Jeju island seen in EGM2008 were removed, so that both the mean and the standard deviation of the difference between XGM2019e_2159 and the GNSS/Leveling data are homogeneous regardless of region (mountainous or plain area). NGA (National Geospatial-Intelligence Agency) is currently developing EGM2020, and XGM2019e_2159 is the experimentally published model of EGM2020. Therefore, an improved GGM is expected to be available shortly, and the precision of new GGMs needs to be verified consistently.
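
The precision evaluation described in this abstract can be sketched roughly as follows. This is a minimal illustration, assuming the standard relation N = h - H (geoid height = ellipsoidal height minus orthometric height) and comparing it with the geoid height from a GGM. The function names and the three control points are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def geoid_differences(points, model_geoid):
    """Return GNSS/leveling geoid height minus model geoid height per point.

    points: list of (ellipsoidal_height_m, orthometric_height_m) tuples.
    model_geoid: geoid heights (m) from a GGM evaluated at the same points.
    """
    diffs = []
    for (h_ellipsoidal, h_orthometric), n_model in zip(points, model_geoid):
        n_observed = h_ellipsoidal - h_orthometric  # N = h - H
        diffs.append(n_observed - n_model)
    return diffs


def summarize(diffs):
    """Mean and population standard deviation of the differences."""
    return mean(diffs), pstdev(diffs)


# Hypothetical values for three control points (illustration only).
points = [(75.30, 50.00), (130.10, 105.00), (48.55, 23.50)]
model_geoid = [25.28, 25.13, 25.02]

diffs = geoid_differences(points, model_geoid)
bias, spread = summarize(diffs)
print(f"mean = {bias:.3f} m, std = {spread:.3f} m")
```

Running the same summary per region (for example mountainous versus plain points) would be one way to check whether the mean and standard deviation are homogeneous, as the abstract reports for XGM2019e_2159.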

Prediction Model of Real Estate Transaction Price with the LSTM Model based on AI and Bigdata

  • Lee, Jeong-hyun;Kim, Hoo-bin;Shim, Gyo-eon
    • International Journal of Advanced Culture Technology / v.10 no.1 / pp.274-283 / 2022
  • Korea is facing a number of difficulties arising from rising housing prices. As 'housing' takes the lion's share of personal assets, many difficulties are expected to arise from fluctuating housing prices. The purpose of this study is to create a housing price prediction model that helps prevent such risks and encourages reasonable real estate purchases. The study made several attempts to understand real estate instability and to create an appropriate housing price prediction model. Housing prices were predicted and validated using LSTM, a deep learning technique. LSTM is a recurrent network that adds a cell state, which acts as a conveyor belt, to the hidden state of a conventional RNN and computes both states recursively. The real sale prices of apartments in autonomous districts from January 2006 to December 2019 were collected through the Ministry of Land, Infrastructure, and Transport's real sale price open system, and basic apartment and commercial district information was collected through the Public Data Portal and the Seoul Metropolitan City Data. The collected real sale price data were scaled based on the monthly average sale price, and a total of 168 data points were organized by preprocessing the respective data based on address. To predict prices, the LSTM was implemented with a training period of 29 months (April 2015 to August 2017), a validation period of 13 months (September 2017 to September 2018), and a test period of 13 months (December 2018 to December 2019), following the time series data set. The results are as follows. First, the study obtained a prediction similarity of 76 percent. We sought to design a prediction model of real estate transaction prices with an LSTM model based on AI and big data. The final prediction model was built from the collected time series data, showing that a model with 76 percent prediction similarity can be achieved. This supports the reliability of predicting the rate of return with the LSTM method.
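
The recursion over cell state and hidden state mentioned in this abstract can be illustrated with a single-unit LSTM cell. This is a minimal sketch of the standard LSTM update equations in plain Python, not the authors' implementation. The weights and the scaled price sequence are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM cell.

    params maps each gate name to (w_x, w_h, b): input weight,
    recurrent weight, and bias.
    """
    def pre_activation(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre_activation("forget"))       # how much old cell state to keep
    i = sigmoid(pre_activation("input"))        # how much new candidate to admit
    o = sigmoid(pre_activation("output"))       # how much cell state to expose
    g = math.tanh(pre_activation("candidate"))  # candidate cell content
    c = f * c_prev + i * g                      # cell state: the "conveyor belt"
    h = o * math.tanh(c)                        # hidden state
    return h, c


def run_lstm(sequence, params):
    """Feed a sequence through the cell, carrying (h, c) from step to step."""
    h, c = 0.0, 0.0
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
    return h, c


# Hypothetical weights and a hypothetical scaled monthly price sequence.
params = {
    "forget": (0.5, 0.1, 0.0),
    "input": (0.4, 0.2, 0.0),
    "output": (0.3, 0.1, 0.0),
    "candidate": (0.6, 0.2, 0.0),
}
scaled_prices = [0.10, 0.15, 0.12, 0.20]
h_final, c_final = run_lstm(scaled_prices, params)
print(h_final, c_final)
```

In a trained model the weights would be learned from the training period and the final hidden state would feed a prediction layer; both are omitted here.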

Incorporating BERT-based NLP and Transformer for An Ensemble Model and its Application to Personal Credit Prediction

  • Sophot Ky;Ju-Hong Lee;Kwangtek Na
    • Smart Media Journal / v.13 no.4 / pp.9-15 / 2024
  • Tree-based algorithms have been the dominant methods used to build prediction models for tabular data, including personal credit data. However, they are limited to categorical and numerical data, and they do not capture the relationships between features. In this work, we propose an ensemble model using the Transformer architecture that includes text features and harnesses the self-attention mechanism to address the feature-relationship limitation. We describe a text formatter module that converts the original tabular data into sentence data, which is fed into FinBERT along with other text features. Furthermore, we employ an FT-Transformer trained on the original tabular data. We compare this multi-modal approach with two popular tree-based algorithms, Random Forest and Extreme Gradient Boosting (XGBoost), and with TabTransformer. Our proposed method shows superior Default Recall, F1 score, and AUC results across two public data sets. Our results are significant for financial institutions seeking to reduce the risk of financial loss from defaulters.
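
The abstract does not give the template used by the text formatter module. The following is a minimal sketch of the general idea, assuming a simple "column is value" template. The function name and the example record are hypothetical.

```python
def row_to_sentence(row):
    """Render one tabular record as a single sentence.

    row: dict mapping column names to values. Underscores in column
    names are replaced with spaces (an assumed convention).
    """
    parts = [
        f"{column.replace('_', ' ')} is {value}"
        for column, value in row.items()
    ]
    return ", ".join(parts) + "."


# Hypothetical credit record (illustration only).
row = {"age": 35, "annual_income": 52000, "loan_purpose": "car"}
sentence = row_to_sentence(row)
print(sentence)
```

A sentence produced this way could then be tokenized and passed to a language model such as FinBERT; that step is not shown here.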

The Lower Flash Points of the n-Butanol+n-Decane System

  • Dong-Myeong Ha;Yong-Chan Choi;Sung-Jin Lee
    • Fire Science and Engineering / v.17 no.2 / pp.50-55 / 2003
  • The lower flash points for the binary system n-butanol+n-decane were measured with a Pensky-Martens closed cup tester. The experimental results showed a minimum in the flash point versus composition curve. The experimental data were compared with the values calculated by the reduced model under an ideal solution assumption and by the flash point prediction models based on the Van Laar and Wilson equations. The predictive curve based on the reduced model deviated from the experimental data for this system. The experimental results were in good agreement with the predictive curves that use the Van Laar and Wilson equations to estimate activity coefficients. However, the predictive curve of the flash point prediction model based on the Wilson equation described the experimental data more effectively than that of the model based on the Van Laar equation.
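
As an illustration of the activity coefficient step mentioned in this abstract, the sketch below evaluates the standard two-parameter Van Laar equations for a binary mixture. The parameter values are hypothetical placeholders, not fitted values for n-butanol+n-decane, and the flash point calculation itself is not shown.

```python
import math


def van_laar_gammas(x1, a12, a21):
    """Activity coefficients (gamma1, gamma2) from the Van Laar equations.

    ln(gamma1) = a12 * (a21 * x2 / (a12 * x1 + a21 * x2)) ** 2
    ln(gamma2) = a21 * (a12 * x1 / (a12 * x1 + a21 * x2)) ** 2
    x1 is the mole fraction of component 1, with x2 = 1 - x1.
    """
    x2 = 1.0 - x1
    denom = a12 * x1 + a21 * x2
    ln_gamma1 = a12 * (a21 * x2 / denom) ** 2
    ln_gamma2 = a21 * (a12 * x1 / denom) ** 2
    return math.exp(ln_gamma1), math.exp(ln_gamma2)


# Hypothetical Van Laar parameters (illustration only).
A12, A21 = 1.2, 0.8

for x1 in (0.0, 0.25, 0.5, 0.75, 1.0):
    g1, g2 = van_laar_gammas(x1, A12, A21)
    print(f"x1 = {x1:.2f}: gamma1 = {g1:.4f}, gamma2 = {g2:.4f}")
```

Under an ideal solution assumption both coefficients equal 1 at every composition, which corresponds to the reduced model that the abstract reports as deviating from the measurements.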

Biomechanical analysis of human foot using the computer graphic-based model during walking (컴퓨터 그래픽 모델을 통한 보행 시 발의 생체역학적 해석)

  • 최현기;김시열;이범현
    • Proceedings of the Korean Society of Precision Engineering Conference / 2002.10a / pp.1088-1092 / 2002
  • The purpose of this investigation was to study the kinematics of the joints between foot segments using a computer graphic-based model during the stance phase of walking. In the model, all joints were assumed to act as monocentric, single degree of freedom hinge joints. The motion of the foot was captured by a video collection system using four cameras. The model, fitted to an individual subject, was simulated with this motion data. The kinematic data of the tarsometatarsal joints and the metatarsophalangeal joint were quantitatively similar to previously reported data. Therefore, our method using the computer graphic-based model is considered useful.

Hierarchical Flow-Based Anomaly Detection Model for Motor Gearbox Defect Detection

  • Younghwa Lee;Il-Sik Chang;Suseong Oh;Youngjin Nam;Youngteuk Chae;Geonyoung Choi;Gooman Park
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.6 / pp.1516-1529 / 2023
  • In this paper, a motor gearbox fault-detection system based on a hierarchical flow-based model is proposed. The proposed system is used for the anomaly detection of a motion sound-based actuator module. The proposed flow-based model, which is a generative model, learns by directly modeling a data distribution function. As the objective function is the maximum likelihood value of the input data, the training is stable and simple to use for anomaly detection. The operation sound of a car's side-view mirror motor is converted into a Mel-spectrogram image, consisting of a folding signal and an unfolding signal, and used as training data in this experiment. The proposed system is composed of an encoder and a decoder. The data extracted from the layer of the pretrained feature extractor are used as the decoder input data in the encoder. This information is used in the decoder by performing an interlayer cross-scale convolution operation. The experimental results indicate that the context information of various dimensions extracted from the interlayer hierarchical data improves the defect detection accuracy. This paper is notable because it uses acoustic data and a normalizing flow model to detect outliers based on the features of experimental data.
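
The likelihood-based scoring idea in this abstract can be shown with the simplest possible flow: a one-dimensional affine map to a standard normal base distribution. This is a deliberately reduced sketch, far simpler than the hierarchical model in the paper. The feature values and function names are hypothetical.

```python
import math
from statistics import mean, pstdev


def fit_affine_flow(samples):
    """Maximum-likelihood fit of z = (x - mu) / sigma with a N(0, 1) base."""
    return mean(samples), pstdev(samples)


def log_likelihood(x, mu, sigma):
    """Change of variables: log p(x) = log N(z; 0, 1) + log |dz/dx|."""
    z = (x - mu) / sigma
    log_base = -0.5 * (z * z + math.log(2.0 * math.pi))
    log_det = -math.log(sigma)  # dz/dx = 1 / sigma
    return log_base + log_det


def anomaly_score(x, mu, sigma):
    """Higher score means the sample is less likely under the fitted flow."""
    return -log_likelihood(x, mu, sigma)


# Hypothetical scalar features extracted from normal operation sounds.
normal_features = [0.90, 1.00, 1.10, 1.00, 0.95, 1.05]
mu, sigma = fit_affine_flow(normal_features)

print(anomaly_score(1.0, mu, sigma))  # typical sample
print(anomaly_score(3.0, mu, sigma))  # sample far from the training data
```

Training only on normal data and thresholding the negative log-likelihood is the general pattern; the choice of threshold is left open here.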

The Life Cycle Model Considering Legal and Technical Characteristics of Personal Data (개인정보의 법적·기술적 특성을 고려한 라이프 사이클(Life Cycle) 모델)

  • Jang, Jae-Young;Park, Tae-Hwan;Kim, Beom-Soo
    • The Journal of Society for e-Business Studies / v.17 no.3 / pp.43-60 / 2012
  • This study reviews life cycle models that consider the legal and the technical characteristics of personal data, respectively. Based on these reviews, this research proposes a 'consent and management based model of personal data' that is applicable to domestic IT companies. The model suggested in this paper positively considers the 'Consent' and 'Management' factors, which are overlooked in the other models. The validity of the model is examined in two ways: by demonstrating its advantages in contrast with the other models, and by verifying that the 'consent' and 'management' factors cover all life cycle processes. Using this model, IT companies can better analyze personal data utilization and develop IT system protection.