• Title/Summary/Keyword: Inference models


DeNERT: Named Entity Recognition Model using DQN and BERT

  • Yang, Sung-Min;Jeong, Ok-Ran
    • Journal of the Korea Society of Computer and Information / v.25 no.4 / pp.29-35 / 2020
  • In this paper, we propose DeNERT, a new named entity recognition model. Natural language processing research has recently been driven by language representation models pre-trained on large corpora. Named entity recognition, one of the subfields of natural language processing, typically relies on supervised learning, which requires large training datasets and substantial computation. Reinforcement learning learns through trial-and-error experience without initial data and is closer to the human learning process than other machine learning methodologies, but it has not yet been widely applied to natural language processing; it is mostly used in simulation environments such as Atari games and AlphaGo. BERT is a general-purpose language model developed by Google that is pre-trained with a large corpus and a large amount of computation. It has recently shown high performance in natural language processing research and high accuracy on many downstream tasks. The proposed DeNERT model combines two deep learning models, DQN and BERT. It is trained by building the reinforcement learning environment on the language representations that are the strength of a general-purpose language model. Trained in this way, DeNERT achieves faster inference and higher performance with a small amount of training data. We also validate the model's named entity recognition performance through experiments.
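
The abstract does not describe how DeNERT defines its states, actions, or rewards, so the following is only a minimal sketch of the generic one-step Q-learning target that a DQN is trained toward, r + gamma * max Q(s', a'). Treating the actions as candidate entity tags for a token is an assumption made for illustration, and the function names `td_target` and `epsilon_greedy` are hypothetical.

```python
import random


def td_target(reward, next_q_values, gamma=0.99, done=False):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if done:
        # No future return once the episode has ended.
        return reward
    return reward + gamma * max(next_q_values)


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the best-valued one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Hypothetical example: Q-values for three candidate tags of one token.
q_next = [0.5, 2.0, 1.0]
target = td_target(reward=1.0, next_q_values=q_next, gamma=0.9)
action = epsilon_greedy([0.1, 0.7, 0.3], epsilon=0.0)  # purely greedy choice
```

In a full DQN the Q-values would come from a neural network (here, presumably one fed with BERT representations), and the target would drive a regression loss on the network output for the chosen action.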

Development of a surrogate model based on temperature for estimation of evapotranspiration and its use for drought index applicability assessment (증발산 산정을 위한 온도기반의 대체모형 개발 및 가뭄지수 적용성 평가)

  • Kim, Ho-Jun;Kim, Kyoungwook;Kwon, Hyun-Han
    • Journal of Korea Water Resources Association / v.54 no.11 / pp.969-983 / 2021
  • Evapotranspiration, one of the hydrometeorological components, is an important variable for water resource planning and management and is primarily used as input data for hydrological models such as water balance models. The FAO56 PM method has been recommended as a standard approach for estimating reference evapotranspiration with relatively high accuracy. However, the FAO56 PM method is often difficult to apply because it requires many hydrometeorological variables. For this reason, the temperature-based Hargreaves equation has been widely adopted to estimate reference evapotranspiration. In this study, a set of parameters of the Hargreaves equation was calibrated with relatively long-term data within a Bayesian framework. Statistical indices (CC, RMSE, IoA) were used to validate the model. For the validation period, the RMSE of the monthly results was reduced from 7.94 ~ 24.91 mm/month to 7.94 ~ 24.91 mm/month. The results confirmed that the accuracy was significantly improved compared to the existing Hargreaves equation. Further, the evaporative demand drought index (EDDI), based on the evaporative demand (E0), was proposed. To confirm the effectiveness of the EDDI, this study evaluated the estimated EDDI for the recent drought events of 2014 to 2015 and 2018, along with precipitation and SPI. In the evaluation of the Han River watershed in 2018, the weekly EDDI rose above 2, confirming that EDDI detects the onset of drought caused by heatwaves more effectively. EDDI can be used as a drought index alongside SPI, particularly for monitoring heatwave-driven flash droughts.
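
As a rough illustration of the quantities named in this abstract, the sketch below evaluates the Hargreaves equation with its conventional coefficients (0.0023, 17.8, and exponent 0.5) and two of the validation indices. The calibrated parameter values from the study are not given in the abstract, so the defaults here are the textbook ones, exposed as arguments that a calibration could replace. Reading IoA as Willmott's index of agreement is an assumption, and all function names are hypothetical.

```python
import math


def hargreaves_et0(t_max, t_min, ra, coef=0.0023, offset=17.8, exponent=0.5):
    """Reference evapotranspiration (mm/day) from the Hargreaves equation.

    t_max, t_min: daily maximum and minimum air temperature (deg C).
    ra: extraterrestrial radiation as equivalent evaporation (mm/day).
    coef, offset, exponent: conventional defaults; a calibration would replace them.
    """
    t_mean = (t_max + t_min) / 2.0
    t_range = max(t_max - t_min, 0.0)
    return coef * ra * (t_mean + offset) * t_range ** exponent


def rmse(obs, sim):
    """Root mean square error between observed and simulated series."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def index_of_agreement(obs, sim):
    """Willmott's index of agreement (assumed meaning of IoA); 1.0 is a perfect match."""
    o_bar = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((abs(s - o_bar) + abs(o - o_bar)) ** 2 for o, s in zip(obs, sim))
    return 1.0 - num / den


# Hypothetical inputs: a warm day with Ra equivalent to 15 mm/day.
et0 = hargreaves_et0(t_max=30.0, t_min=20.0, ra=15.0)
```

Calibrating the equation amounts to choosing `coef`, `offset`, and `exponent` so that the estimates track a reference series such as FAO56 PM, with RMSE and IoA measuring the remaining gap.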

Clarifying the Meaning of 'Scientific Explanation' for Science Teaching and Learning (과학 학습지도를 위한 '과학적 설명'의 의미 명료화)

  • Jongwon Park;Hye-Gyoung Yoon;Insun Lee
    • Journal of The Korean Association For Science Education / v.43 no.6 / pp.509-520 / 2023
  • Scientific explanation is the main goal of scientists' scientific practice, and the science curriculum also includes developing students' abilities to construct scientific explanations as a major goal. Thus, clarifying its meaning is an important issue in the science education community. In this paper, the researchers identified three perspectives on 'scientific explanation' based on the scoping review method (Deductive-Nomological, Probabilistic, and Pragmatic explanation models). We argued that it is important to clarify and distinguish the meanings of 'scientific explanation' from other concepts used in science education, such as 'description', 'prediction', 'hypothesis', and 'argument' based on a review of the literature. It is also pointed out that there is a difference between 'scientific explanation' as a product and 'explaining scientifically' as communication, and several ways to revise achievement standard statements in the science curriculum are suggested, to guide students to construct scientific explanations and to help students to explain scientifically. By adopting the three scientific explanation models, the important factors to be considered were classified and organized, and examples of science learning activities for scientific explanation considering such factors were suggested. It is hoped that the discussion in this study will help establish clearer learning goals in science learning related to scientific explanation and aid the design of more appropriate learning activities accordingly.

Literature Review of AI Hallucination Research Since the Advent of ChatGPT: Focusing on Papers from arXiv (챗GPT 등장 이후 인공지능 환각 연구의 문헌 검토: 아카이브(arXiv)의 논문을 중심으로)

  • Park, Dae-Min;Lee, Han-Jong
    • Informatization Policy / v.31 no.2 / pp.3-38 / 2024
  • Hallucination is a significant barrier to the utilization of large-scale language models and multimodal models. In this study, we collected 654 computer science papers with "hallucination" in the abstract that were posted to arXiv from December 2022 to January 2024, following the advent of ChatGPT, and conducted frequency analysis, knowledge network analysis, and a literature review to explore the latest trends in hallucination research. The results showed that research was active in the fields of "Computation and Language," "Artificial Intelligence," "Computer Vision and Pattern Recognition," and "Machine Learning." We then analyzed the research trends in these four major fields by focusing on the main authors and dividing the work into data, hallucination detection, and hallucination mitigation. The main research trends included hallucination mitigation through supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF), inference enhancement via "chain of thought" (CoT), and growing interest in hallucination mitigation within multimodal AI. This study provides insights into the latest developments in hallucination research through a technology-oriented literature review. By clarifying these trends, it is expected to help subsequent research in engineering as well as in the humanities and social sciences.

Intentionality Judgement in the Criminal Case: The Role of Moral Character (형사사건에서의 고의성 판단: 도덕적 특성의 역할)

  • Choi, Seung-Hyuk;Hur, Taekyun
    • Korean Journal of Culture and Social Issue / v.26 no.1 / pp.25-45 / 2020
  • Intentionality judgment in criminal cases is a core area of fact-finding that underlies judgments of guilt and sentencing for the defendant. However, a third party cannot be certain of intentionality because it reflects the subjective state of the agent. Thus, the mechanism behind intentionality judgment needs to be properly understood by academia and the criminal justice system. Previous studies of intentionality judgment models, however, have shown inconsistent results. Mental-state models propose the agent's foreseeability (belief) and desire at the time of the offence as the key factors in intentionality judgment. These factors are consistent with the central elements of intentionality judgment in criminal law. In moral-evaluation models, by contrast, the key factors are the blameworthiness of the agent and the badness of the outcome, which reflect the consequences of the act. More recently, the deep-self concordance model has emerged, suggesting that the important factor in intentionality judgment is neither mental states nor moral evaluations but the individual's deep self. However, these models are limited in that they do not consider important features of criminal cases: the consequence of the case is inevitably negative, and therefore the actor, who faces legal punishment, rarely expresses his or her mental state at the time of the act. Therefore, based on existing intentionality judgment studies and the characteristics of criminal cases, this study suggests that inferences about what kind of person the agent originally was play a key role in judging intentionality in criminal cases. This is the moral-character model. Furthermore, this study discusses what the media and criminal justice institutions should keep in mind, as well as directions for future research.

A Study of Anomaly Detection for ICT Infrastructure using Conditional Multimodal Autoencoder (ICT 인프라 이상탐지를 위한 조건부 멀티모달 오토인코더에 관한 연구)

  • Shin, Byungjin;Lee, Jonghoon;Han, Sangjin;Park, Choong-Shik
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.57-73 / 2021
  • Maintaining ICT infrastructure and preventing failures through anomaly detection is becoming increasingly important. System monitoring data is multidimensional time series data, and it is difficult to account for the characteristics of multidimensional data and of time series data at the same time. With multidimensional data, the correlation between variables must be considered. Existing probability-based, linear, and distance-based methods degrade because of the curse of dimensionality. In addition, time series data is usually preprocessed with sliding windows and time series decomposition for autocorrelation analysis. These techniques increase the dimension of the data, so they need to be supplemented. Anomaly detection is a long-established research field; statistical methods and regression analysis were used in its early days, and machine learning and artificial neural networks are now being actively applied. Statistical methods are difficult to apply when the data is non-homogeneous and do not detect local outliers well. Regression-based methods learn a regression formula based on parametric statistics and detect anomalies by comparing predicted values with actual values. Their performance drops when the model is not robust or when the data contains noise or outliers, so they carry the restriction that training data free of noise and outliers should be used. The autoencoder, built on artificial neural networks, is trained to produce output as similar as possible to its input. It has many advantages over existing probability and linear models, cluster analysis, and supervised learning. It can be applied to data that does not satisfy probability distribution or linearity assumptions, and it can be trained in an unsupervised manner, without labeled data. However, identifying local outliers in multidimensional data remains a limitation in anomaly detection, and the characteristics of time series data greatly increase the data dimension. In this study, we propose a Conditional Multimodal Autoencoder (CMAE) that improves anomaly detection performance by taking local outliers and time series characteristics into account. First, we applied a Multimodal Autoencoder (MAE) to overcome the limitation in identifying local outliers in multidimensional data. Multimodal models are commonly used to learn from different types of input, such as voice and image. The different modalities share the autoencoder's bottleneck, through which correlations are learned. In addition, a Conditional Autoencoder (CAE) was used to learn the characteristics of time series data effectively without increasing the data dimension. Conditional inputs are usually categorical variables, but in this study time was used as the condition in order to learn periodicity. The proposed CMAE model was verified by comparison with a Unimodal Autoencoder (UAE) and a Multimodal Autoencoder (MAE). The reconstruction performance of the autoencoders on 41 variables was examined for the proposed model and the comparison models. Reconstruction performance differs by variable; the loss is small for the Memory, Disk, and Network modalities in all three autoencoder models, indicating that reconstruction works well there. The Process modality showed no significant difference among the three models, and the CPU modality showed excellent performance in CMAE. ROC curves were drawn to evaluate anomaly detection performance, and AUC, accuracy, precision, recall, and F1-score were compared. On all indicators, performance ranked in the order CMAE, MAE, and UAE. In particular, recall was 0.9828 for CMAE, confirming that it detects almost all anomalies. The model's accuracy also improved to 87.12%, and the F1-score was 0.8883, which is considered suitable for anomaly detection. From a practical standpoint, the proposed model has advantages beyond the performance improvement. Techniques such as time series decomposition and sliding windows add procedures that must be managed, and the dimensional increase they cause can slow down inference. The proposed model is easy to apply to practical tasks in terms of inference speed and model management.
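
Autoencoder-based anomaly detection is commonly scored by thresholding the reconstruction error and then comparing the flags with ground-truth labels. The sketch below shows that generic procedure and the accuracy, precision, recall, and F1 computations; it is not the CMAE implementation, and the threshold, sample values, and function names are hypothetical.

```python
def flag_anomalies(errors, threshold):
    """Mark a sample as anomalous when its reconstruction error exceeds the threshold."""
    return [e > threshold for e in errors]


def detection_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 from boolean labels and predictions."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    tn = sum((not t) and (not p) for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical reconstruction errors for four samples and their true labels.
errors = [0.1, 0.9, 0.2, 0.8]
labels = [False, True, True, True]
predictions = flag_anomalies(errors, threshold=0.5)
metrics = detection_metrics(labels, predictions)
```

Sweeping the threshold over the range of observed errors and recording the true and false positive rates at each value yields the ROC curve from which AUC is computed.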

A Future Study Agenda Applying Service Research Framework (서비스 연구 프레임워크 관점에서의 향후 연구과제)

  • Lee, JeungSun;Ahn, Jinho;Kim, Hyunsoo
    • Journal of Service Research and Studies / v.7 no.1 / pp.83-96 / 2017
  • The importance of service science is emphasized in the modern economy, and the value and necessity of service research are still increasing. Since the service research framework was proposed, services have been studied from various perspectives and incorporated into a single framework, service research. This established the direction of service research and a new baseline for it. However, the modern economic and social environment has changed drastically with the arrival of a new era, the Fourth Industrial Revolution, and more systematic research on services has become necessary. Therefore, this study analyzed the field of service research within the existing framework and suggested how service research could broaden its horizon by studying the 'what'. To do this, we analyzed recent service research trends by theme. We also identified the shortcomings of previous studies on services and suggested directions and themes for future research. Based on this study, we developed a general approach to creating new models from the viewpoint of service science, as well as to areas such as service innovation, service inference, service solution, and service design leverage. In addition, service research and business models need to be extended to the utilization of service technology. This approach could contribute to forming the basis of future service development and to using social media to create new value for innovative companies. The results of this study could contribute to deepening and expanding service research.

A Knowledge-assisted Hybrid System for effectively Supporting Personalization of a Web Customer (웹 고객의 개인화를 지원하는 지식기반 통합시스템)

  • Kim, Chul-Soo
    • The KIPS Transactions:PartB / v.9B no.1 / pp.1-6 / 2002
  • Many customers consult the Internet before purchasing goods and using content. Internet systems can store large amounts of data and classify it into information that captures the relationship between a company and its customers. To that end, we consider a knowledge-assisted hybrid system that uses each customer's preferences to find an optimal solution for his or her decision making. The knowledge derived from these preferences is used to select a domain set appropriate to the customer's business, and the selection process provides the customer with clear benefits: it eliminates the discomfort caused by unknown information and reduces the cost and search time of forming a suitable domain set. To effectively adopt individual customers' preferences and actively adapt to changes in the business situation, this study proposes a system architecture that includes rule representations and an inference engine and that integrates a knowledge-based component with a quadratic programming component. The experimental results show that a knowledge-assisted hybrid system implemented on this idea is more flexible than existing systems in extending knowledge about a customer's preferences and goes beyond traditional models.

The Analysis and Design of Advanced Neurofuzzy Polynomial Networks (고급 뉴로퍼지 다항식 네트워크의 해석과 설계)

  • Park, Byeong-Jun;O, Seong-Gwon
    • Journal of the Institute of Electronics Engineers of Korea CI / v.39 no.3 / pp.18-31 / 2002
  • In this study, we introduce the concept of advanced neurofuzzy polynomial networks (ANFPN), a hybrid modeling architecture combining neurofuzzy networks (NFN) and polynomial neural networks (PNN). These networks are highly nonlinear rule-based models. The development of the ANFPN draws on the technologies of Computational Intelligence (CI), namely fuzzy sets, neural networks, and genetic algorithms. NFN contributes to the formation of the premise part of the rule-based structure of the ANFPN, while the consequence part is designed using PNN. In the premise part, NFN uses both simplified fuzzy inference and the error back-propagation learning rule. The parameters of the membership functions, the learning rates, and the momentum coefficients are adjusted by genetic optimization. As the consequence structure of the ANFPN, PNN is a flexible network architecture whose structure (topology) is developed through learning. In particular, the numbers of layers and nodes of the PNN are not fixed in advance but are generated dynamically. We introduce two kinds of ANFPN architectures, the basic one and the modified one, which depend on the number of input variables and the order of the polynomial in each layer of the PNN structure. Owing to the specific features of the two combined architectures, it is possible to capture the nonlinear characteristics of process systems and to obtain better output performance with superb predictive ability. The availability and feasibility of the ANFPN are discussed and illustrated with two representative numerical examples. The results show that the proposed ANFPN produces models with higher accuracy and predictive ability than previously presented methods.
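
Simplified fuzzy inference, mentioned above for the premise part, usually means that each rule has a constant consequent and the output is the firing-strength-weighted average of those constants. The sketch below illustrates only that step for a single input with triangular membership functions. In the ANFPN the consequence part is a PNN, not a constant, so this is an assumption-laden simplification, and the function names and rule values are hypothetical.

```python
def triangular(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def simplified_fuzzy_inference(x, rules):
    """Weighted average of constant consequents.

    rules: list of ((a, b, c), consequent) pairs, one per fuzzy rule.
    """
    weights = [triangular(x, *mf) for mf, _ in rules]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("input is not covered by any membership function")
    return sum(w * c for w, (_, c) in zip(weights, rules)) / total


# Hypothetical rule base: "x is SMALL -> 0" and "x is LARGE -> 10".
rules = [((-1.0, 0.0, 1.0), 0.0), ((0.0, 1.0, 2.0), 10.0)]
y = simplified_fuzzy_inference(0.25, rules)
```

Because the output is differentiable in the consequents and in the membership parameters, a network of this form can be tuned with error back-propagation.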

Exchange Rate Pass-Through and Market Response: Competition between Korea and Japan in the US Steel Market (환율전이와 시장의 반응: 미국 철강시장에서의 한국과 일본의 경쟁)

  • Tcha, MoonJoong;Kim, Jae H.
    • KDI Journal of Economic Policy / v.26 no.2 / pp.281-314 / 2004
  • This paper theoretically formulates and empirically explores the relationship between exchange rate pass-through (ERPT) for the (average) market price and for an individual country's price, using steel products data in the US market, with special reference to two major steel exporting countries, Korea and Japan. It was found that the direction of market ERPT can differ from that of the individual ERPT that each exporter experiences, due to strategic interactions among producers and different parameters. Vector error correction (VEC) models and impulse response analysis were used for the short run, with statistical inference based on the bootstrap-after-bootstrap of Kilian (1998), and the fully modified estimation of Phillips and Hansen (1990) was used for the long run. Empirical results indicate that market ERPT in the US market due to changes in the Korea-US exchange rate differs from that due to changes in the Japan-US exchange rate. The framework developed in this study indicates that this phenomenon arises because either (i) the two countries have individual ERPTs of different magnitudes and directions for the products in the US market, or (ii) the pricing strategies of the other exporters to the US steel market respond differently depending on whether the price of the product from Korea or that from Japan changes. As each exporter's ERPT can be significantly different, and the market response to each country's ERPT can also differ, this study concludes that it is crucial for an exporter to understand how competitors in the market respond to changes in its price, as well as how its price changes when the relevant exchange rate fluctuates.
