• Title/Summary/Keyword: Inference models

A Study of Pre-trained Language Models for Korean Language Generation (한국어 자연어생성에 적합한 사전훈련 언어모델 특성 연구)

  • Song, Minchae;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.28 no.4 / pp.309-328 / 2022
  • This study empirically analyzed Korean pre-trained language models (PLMs) designed for natural language generation. The performance of two PLMs, BART and GPT, on the task of abstractive text summarization was compared. To investigate how performance depends on the characteristics of the inference data, ten different document types, containing six types of informational content and creation content, were considered. It was found that BART (which can both generate and understand natural language) performed better than GPT (which can only generate). Upon more detailed examination of the effect of inference data characteristics, the performance of GPT was found to be proportional to the length of the input text. However, even for the longest documents (where GPT performed best), BART still outperformed GPT, suggesting that the greatest influence on downstream performance is not the size of the training data or the number of PLM parameters but the structural suitability of the PLM for the applied downstream task. The performance of the PLMs was also compared by analyzing part-of-speech (POS) shares. BART's performance was inversely related to the proportion of prefixes, adjectives, adverbs, and verbs but positively related to that of nouns. This result emphasizes the importance of taking the inference data's characteristics into account when fine-tuning a PLM for its intended downstream task.

A Bayesian Method to Semiparametric Hierarchical Selection Models (준모수적 계층적 선택모형에 대한 베이지안 방법)

  • 정윤식;장정훈
    • The Korean Journal of Applied Statistics / v.14 no.1 / pp.161-175 / 2001
  • Meta-analysis refers to quantitative methods for combining results from independent studies in order to draw overall conclusions. Hierarchical models, including selection models, are introduced and shown to be useful in Bayesian meta-analysis. Semiparametric hierarchical models are proposed using the Dirichlet process prior. This rich class of models combines the information from independent studies, allowing investigation of the variability both between and within studies and of the weight function. Here we investigate the sensitivity of results to unobserved studies by considering a hierarchical selection model that includes an unknown weight function, and we use Markov chain Monte Carlo methods to develop inference for the parameters of interest. Using the Bayesian method, this model is applied to a meta-analysis of twelve studies comparing the effectiveness of two different types of fluoride in preventing cavities. A clinically informative prior is assumed. Summaries and plots of model parameters are analyzed to address questions of interest.

Bayesian logit models with auxiliary mixture sampling for analyzing diabetes diagnosis data (보조 혼합 샘플링을 이용한 베이지안 로지스틱 회귀모형 : 당뇨병 자료에 적용 및 분류에서의 성능 비교)

  • Rhee, Eun Hee;Hwang, Beom Seuk
    • The Korean Journal of Applied Statistics / v.35 no.1 / pp.131-146 / 2022
  • Logit models are commonly used to predict and classify categorical response variables. Most Bayesian approaches to logit models are implemented based on the Metropolis-Hastings algorithm. However, this algorithm suffers from slow convergence and from the difficulty of ensuring an adequate proposal distribution. Therefore, we use the auxiliary mixture sampler proposed by Frühwirth-Schnatter and Frühwirth (2007) to estimate logit models. This method introduces two sequences of auxiliary latent variables so that the logit model satisfies normality and linearity. As a result, the logit model can be easily implemented by Gibbs sampling. We applied the proposed method to diabetes data from the Community Health Survey (2020) of the Korea Disease Control and Prevention Agency and compared its performance with that of the Metropolis-Hastings algorithm. In addition, we showed that the logit model using auxiliary mixture sampling achieves classification performance comparable to that of machine learning models.

Metaheuristic models for the prediction of bearing capacity of pile foundation

  • Kumar, Manish;Biswas, Rahul;Kumar, Divesh Ranjan;T., Pradeep;Samui, Pijush
    • Geomechanics and Engineering / v.31 no.2 / pp.129-147 / 2022
  • The properties of soil are naturally highly variable; thus, to ensure proper safety and reliability, a large number of samples must be tested across the length and depth. For pile foundations, conducting field tests is highly expensive, and the traditional empirical relations have proven to perform poorly. This study proposes state-of-the-art Particle Swarm Optimization (PSO) hybridizations of an Artificial Neural Network (ANN), an Extreme Learning Machine (ELM), and an Adaptive Neuro Fuzzy Inference System (ANFIS), and presents a comparative analysis of these metaheuristic models (ANN-PSO, ELM-PSO, ANFIS-PSO) for predicting the bearing capacity of pile foundations, trained and tested on a dataset of nearly 300 dynamic pile tests from the literature. A novel ensemble of the three hybrid models is constructed to combine and enhance the predictions of the individual models. The authenticity of the dataset is confirmed using descriptive statistics, a correlation matrix, and sensitivity analysis. Ram weight and pile diameter are found to be the most influential input parameters. The comparative analysis reveals that ANFIS-PSO is the best-performing model in the testing phase (R² = 0.85, RMSE = 0.01) and ELM-PSO performs best in the training phase (R² = 0.88, RMSE = 0.08), while the ensemble provides the best overall performance based on the rank score. The performance of ANN-PSO is the least satisfactory of the three hybrid models. The findings were confirmed using a Taylor diagram, an error matrix, and uncertainty analysis. Based on the results, ELM-PSO and ANFIS-PSO are proposed for predicting the bearing capacity of piles, and the ensemble learning method of combining the outputs of individual models is encouraged. The study has the potential to assist geotechnical engineers in the design phase of civil engineering projects.

Roles of Models in Abductive Reasoning: A Schematization through Theoretical and Empirical Studies (귀추적 사고 과정에서 모델의 역할 -이론과 경험 연구를 통한 도식화-)

  • Oh, Phil Seok
    • Journal of The Korean Association For Science Education / v.36 no.4 / pp.551-561 / 2016
  • The purpose of this study is to investigate, both theoretically and empirically, the roles of models in abductive reasoning for scientific problem solving. The context of the study is design-based research, the goal of which is to develop inquiry learning programs in the domain of earth science, and the current article deals with an early process of redesigning an abductive inquiry activity in geology. In the theoretical study, an extensive review was conducted of the literature addressing abduction and modeling together as research methods characterizing earth science. The result led to a tentative scheme for modeling-based abductive inference, which represented relationships among evidence, resource models, and explanatory models. This scheme was improved by the empirical study, in which experts' reasoning for solving a geological problem was analyzed. The new scheme included the roles of critical evidence, critical resource models, and a scientifically sound explanatory model. Pedagogical implications for supporting student reasoning in modeling-based abductive inquiry in earth science were discussed.

Web access prediction based on parallel deep learning

  • Togtokh, Gantur;Kim, Kyung-Chang
    • Journal of the Korea Society of Computer and Information / v.24 no.11 / pp.51-59 / 2019
  • Due to the exponential growth of access information on the web, the need to predict web users' next access has increased. Various models, such as Markov models, deep neural networks, support vector machines, and fuzzy inference models, have been proposed to handle web access prediction. For deep learning based on neural network models, training on large-scale web usage data takes a very long time. To address this problem, deep neural network models are trained in parallel on a cluster of computers. In this paper, we investigated the impact of several important Spark parameters related to data partitions, shuffling, compression, and locality (the basic Spark parameters) on training a Multi-Layer Perceptron model on a Spark standalone cluster. Based on this investigation, we tuned the basic Spark parameters and applied the tuned configuration when training a Multi-Layer Perceptron model for web access prediction. Through experiments, we showed the accuracy of our proposed web access prediction model. In addition, we showed that our tuning of the basic Spark parameters improves training time for the Multi-Layer Perceptron model over the default Spark parameter configuration.
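
The four parameter groups named in this abstract (data partitions, shuffling, compression, locality) correspond to standard Spark configuration keys. The following is a minimal sketch of how such a configuration might be written down and passed to `spark-submit`; the values are hypothetical placeholders, not the tuned settings reported in the paper, and the helper name `to_submit_args` is an assumption made for illustration.

```python
# Hypothetical starting values; the paper's tuned settings are not given in the abstract.
spark_conf = {
    "spark.default.parallelism": "64",     # data partitions
    "spark.shuffle.compress": "true",      # shuffling
    "spark.io.compression.codec": "lz4",   # compression
    "spark.locality.wait": "3s",           # locality
}


def to_submit_args(conf):
    """Render a config dict as a flat list of spark-submit `--conf key=value` arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    print(" ".join(to_submit_args(spark_conf)))
```

Keeping the settings in one dictionary makes it easy to rerun the same training job under the default and the tuned configurations and compare training time.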

Twin models for high-resolution visual inspections

  • Seyedomid Sajedi;Kareem A. Eltouny;Xiao Liang
    • Smart Structures and Systems / v.31 no.4 / pp.351-363 / 2023
  • Visual structural inspections are an inseparable part of post-earthquake damage assessments. With unmanned aerial vehicles (UAVs) establishing a new frontier in visual inspections, there are major computational challenges in processing the massive amounts of high-resolution visual data collected. We propose twin deep learning models that can efficiently provide accurate high-resolution segmentation masks for structural components and damage. The traditional approach to coping with high memory and computational demands is either to uniformly downsample the raw images, at the price of losing fine local details, or to crop smaller parts of the images, leading to a loss of global contextual information. Therefore, our twin models, comprising the Trainable Resizing for high-resolution Segmentation Network (TRS-Net) and DmgFormer, approach the global and local semantics from different perspectives. TRS-Net is a compound, high-resolution segmentation architecture equipped with learnable downsampler and upsampler modules to minimize information loss for optimal performance and efficiency. DmgFormer utilizes a transformer backbone and a convolutional decoder head with skip connections on a grid of crops, aiming for high-precision learning without downsizing. An augmented inference technique is used to boost performance further and reduce the possible loss of context due to grid cropping. Comprehensive experiments have been performed on the synthetic environments of 3D physics-based graphics models (PBGMs) in the QuakeCity dataset. The proposed framework is evaluated using several metrics on three segmentation tasks: component type, component damage state, and global damage (crack, rebar, spalling). The models were developed as part of the 2nd International Competition for Structural Health Monitoring.
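
To make the "grid of crops" idea concrete, the sketch below computes crop boxes that tile a high-resolution image at full resolution. It is only an illustration of grid cropping in general, not the DmgFormer implementation; the function names, the overlap controlled by `stride`, and the rule that the last crop is placed flush with the image border are all assumptions.

```python
def axis_starts(length, crop, stride):
    """Start offsets along one axis so that crops of size `crop` cover [0, length)."""
    if crop >= length:
        return [0]
    starts = list(range(0, length - crop + 1, stride))
    if starts[-1] != length - crop:
        starts.append(length - crop)  # final crop sits flush with the border
    return starts


def grid_crop_boxes(width, height, crop, stride):
    """Return (left, top, right, bottom) boxes tiling a width x height image."""
    boxes = []
    for top in axis_starts(height, crop, stride):
        for left in axis_starts(width, crop, stride):
            right = min(left + crop, width)
            bottom = min(top + crop, height)
            boxes.append((left, top, right, bottom))
    return boxes


if __name__ == "__main__":
    # Hypothetical 1000 x 500 image, 500-pixel crops, 400-pixel stride (overlapping crops).
    for box in grid_crop_boxes(1000, 500, 500, 400):
        print(box)
```

Choosing a stride smaller than the crop size makes neighbouring crops overlap, which is one common way to soften the loss of context at crop borders.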

Performance Evaluation of Efficient Vision Transformers on Embedded Edge Platforms (임베디드 엣지 플랫폼에서의 경량 비전 트랜스포머 성능 평가)

  • Minha Lee;Seongjae Lee;Taehyoun Kim
    • IEMEK Journal of Embedded Systems and Applications / v.18 no.3 / pp.89-100 / 2023
  • Recently, on-device artificial intelligence (AI) solutions using mobile devices and embedded edge devices have emerged in various fields, such as computer vision, to address network traffic burdens, low-energy operations, and security problems. Although vision transformer deep learning models have outperformed conventional convolutional neural network (CNN) models in computer vision, they require more computations and parameters than CNN models. Thus, they are not directly applicable to embedded edge devices with limited hardware resources. Many researchers have proposed model compression methods or lightweight architectures for vision transformers; however, only a few studies have evaluated the effects of model compression techniques on the performance of vision transformers. To address this gap, this paper presents a performance evaluation of vision transformers on embedded platforms. We investigated the behaviors of three vision transformers: DeiT, LeViT, and MobileViT. Each model's performance was evaluated in terms of accuracy and inference time on edge devices using the ImageNet dataset. We assessed the effects of the quantization method applied to the models on latency enhancement and accuracy degradation by profiling the proportion of response time occupied by major operations. In addition, we evaluated the performance of each model on GPU and EdgeTPU-based edge devices. In our experimental results, LeViT showed the best performance on CPU-based edge devices, and DeiT-small showed the highest performance improvement on GPU-based edge devices. In addition, only MobileViT models showed performance improvement on EdgeTPU. Summarizing the profiling results, the degree of performance improvement of each vision transformer model was highly dependent on the proportion of parts that could be optimized on the target edge device. In summary, to apply vision transformers to on-device AI solutions, proper operation composition and optimizations specific to the target edge devices must be considered.

Slope stability prediction using ANFIS models optimized with metaheuristic science

  • Gu, Yu-tian;Xu, Yong-xuan;Moayedi, Hossein;Zhao, Jian-wei;Le, Binh Nguyen
    • Geomechanics and Engineering / v.31 no.4 / pp.339-352 / 2022
  • Studying slope stability is an important branch of civil engineering. To this end, engineers have employed machine learning models because of their high efficiency in complex calculations. This paper examines the robustness of various novel optimization schemes, namely equilibrium optimizer (EO), Harris hawks optimization (HHO), water cycle algorithm (WCA), biogeography-based optimization (BBO), dragonfly algorithm (DA), grey wolf optimization (GWO), and teaching learning-based optimization (TLBO), for enhancing the performance of the adaptive neuro-fuzzy inference system (ANFIS) in slope stability prediction. The hybrid models estimate the factor of safety (FS) of a cohesive soil-footing system. The role of these algorithms lies in finding the optimal parameters of the membership functions in the fuzzy system. By examining the convergence behavior of the proposed hybrids, the best population sizes are selected, and the corresponding results are compared to those of the typical ANFIS. Accuracy assessments via root mean square error, mean absolute error, mean absolute percentage error, and Pearson correlation coefficient showed that all models can reliably capture and reproduce the FS behavior. Moreover, applying the WCA, EO, GWO, and TLBO reduced both the learning and prediction errors of the ANFIS. An efficiency comparison also showed the WCA-ANFIS to be the most accurate hybrid, while the GWO-ANFIS was the fastest of the promising models. Overall, the findings of this research support the suitability of improved intelligent models for practical slope stability evaluations.
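
The four accuracy measures named in this abstract have standard definitions, sketched below in plain Python. The function names and the sample factor-of-safety values are illustrative assumptions; they are not data from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


def mape(y_true, y_pred):
    """Mean absolute percentage error (in percent); assumes no true value is zero."""
    n = len(y_true)
    return 100.0 / n * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


if __name__ == "__main__":
    # Made-up factor-of-safety values, for illustration only.
    observed = [1.20, 1.45, 0.95, 1.80]
    predicted = [1.25, 1.40, 1.00, 1.70]
    print(rmse(observed, predicted), mae(observed, predicted))
    print(mape(observed, predicted), pearson_r(observed, predicted))
```

RMSE and MAE are in the units of the target, MAPE is scale-free, and the Pearson coefficient measures linear agreement rather than error size, so the four are usually read together.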

Flood Forecasting and Warning Using Neuro-Fuzzy Inference Technique (Neuro-Fuzzy 추론기법을 이용한 홍수 예.경보)

  • Yi, Jae-Eung;Choi, Chang-Won
    • Journal of Korea Water Resources Association / v.41 no.3 / pp.341-351 / 2008
  • As damage from torrential rain has recently increased due to climate change and global warming, flood forecasting and warning has become important for medium and small streams as well as for large rivers. In existing flood forecasting and warning methods, diverse errors occur and accumulate through the preprocessing and main processes for estimating runoff, so the outcome contains these errors. In addition, estimating the parameters needed for runoff models requires a lot of data, and the processes involve various uncertainties. To overcome the difficulties of the existing flood forecasting and warning system and the uncertainty problem, the ANFIS (Adaptive Neuro-Fuzzy Inference System) technique is presented in this study. ANFIS, a data-driven model that combines fuzzy inference theory with a neural network, can forecast the stream level using only the precipitation and stream level data in a catchment, without the large amount of physical data required by existing physical models. Time series data for precipitation and stream level are used as input, and stream levels at t+1, t+2, and t+3 are forecast with this model. The applicability and appropriateness of the model are examined using actual rainfall and stream level data from 2003 to 2005 in the Tancheon catchment area. The results of applying ANFIS to the actual data for the Tancheon catchment area show that the stream level can be simulated without large error.
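
The data setup described here, with past precipitation and stream level as input and stream levels at t+1, t+2, and t+3 as targets, can be sketched as a sliding-window transformation. This is a minimal illustration only: the function name, the number of lagged steps `n_lags`, and the sample values are assumptions, and the abstract does not state how many past time steps the model uses.

```python
def make_lagged_samples(rain, level, n_lags=2, horizons=(1, 2, 3)):
    """Build (features, targets) pairs from two aligned time series.

    features: the last `n_lags` rainfall values followed by the last `n_lags`
              stream levels, up to and including time t
    targets:  the stream level at t + h for each horizon h
    """
    if len(rain) != len(level):
        raise ValueError("rain and level must have the same length")
    samples = []
    last_t = len(level) - max(horizons) - 1
    for t in range(n_lags - 1, last_t + 1):
        window = slice(t - n_lags + 1, t + 1)
        features = list(rain[window]) + list(level[window])
        targets = [level[t + h] for h in horizons]
        samples.append((features, targets))
    return samples


if __name__ == "__main__":
    # Made-up series, for illustration only.
    rain = [0, 1, 2, 3, 4, 5]
    level = [10, 11, 12, 13, 14, 15]
    for features, targets in make_lagged_samples(rain, level):
        print(features, "->", targets)
```

Each resulting pair could then be fed to any data-driven regressor; one model per horizon or a single multi-output model are both plausible arrangements.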