• Title/Summary/Keyword: action probability learning

Multiple Behaviors Learning and Prediction in Unknown Environment

  • Song, Wei;Cho, Kyung-Eun;Um, Ky-Hyun
    • Journal of Korea Multimedia Society / v.13 no.12 / pp.1820-1831 / 2010
  • When interacting with unknown environments, an autonomous agent needs to decide which action or action order can result in a good state and determine the transition probability based on the current state and the action taken. The traditional multiple sequential learning model requires predefined state transition probabilities. This paper proposes a multiple sequential learning and prediction system with a definition of autonomous states to enhance the automatic performance of existing AI algorithms. In the sequence learning process, the sensed states are classified into several groups by a set of proposed motivation filters to reduce the learning computation. In the prediction process, the learning agent makes decisions based on an estimate of each state's cost in order to obtain a high payoff from the given environment. The proposed learning and prediction algorithms improve the automatic planning of the autonomous agent when interacting with a dynamic unknown environment. The model was tested in a virtual library.
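
The abstract above says the agent must determine transition probabilities itself instead of receiving them predefined. One common way to do that, shown here only as a minimal sketch and not as the authors' method, is to count observed transitions and normalize. The class name `TransitionModel` and the state and action labels are hypothetical.

```python
from collections import Counter, defaultdict


class TransitionModel:
    """Count-based estimate of P(next_state | state, action).

    Hypothetical illustration; the paper's actual estimator is not
    described in the abstract.
    """

    def __init__(self):
        # Maps (state, action) to a Counter of observed next states.
        self.counts = defaultdict(Counter)

    def observe(self, state, action, next_state):
        """Record one observed transition."""
        self.counts[(state, action)][next_state] += 1

    def probability(self, state, action, next_state):
        """Relative frequency of next_state after (state, action)."""
        seen = self.counts.get((state, action))
        if not seen:
            return 0.0  # No observations yet for this state-action pair.
        return seen[next_state] / sum(seen.values())


# Made-up observations from a virtual-library-like setting.
model = TransitionModel()
model.observe("at_shelf", "pick_book", "holding_book")
model.observe("at_shelf", "pick_book", "holding_book")
model.observe("at_shelf", "pick_book", "at_shelf")

p_success = model.probability("at_shelf", "pick_book", "holding_book")
```

With these three observations, `p_success` is the fraction of `pick_book` attempts that ended in `holding_book`. Grouping states before counting, as the motivation filters in the abstract do, would shrink the number of `(state, action)` keys the table has to hold.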

Motivation based Behavior Sequence Learning for an Autonomous Agent in Virtual Reality

  • Song, Wei;Cho, Kyung-Eun;Um, Ky-Hyun
    • Journal of Korea Multimedia Society / v.12 no.12 / pp.1819-1826 / 2009
  • To enhance the automatic performance of existing prediction and planning algorithms that require predefined state transition probabilities, this paper proposes a multiple sequence generation system. When interacting with unknown environments, a virtual agent needs to decide which action or action order can result in a good state and determine the transition probability based on the current state and the action taken. We describe a sequential behavior generation method motivated by changes in the agent's state in order to help the virtual agent learn how to adapt to unknown environments. In the sequence learning process, the sensed states are grouped by a set of proposed motivation filters in order to reduce the learning computation over the large state space. To accomplish a goal with a high payoff, the learning agent makes decisions based on the observed state transitions. The proposed multiple sequence behavior generation system increases the complexity and improves the automatic planning of the virtual agent when interacting with a dynamic unknown environment. The model was tested in a virtual library to illustrate how the system works.

A hidden anti-jamming method based on deep reinforcement learning

  • Wang, Yifan;Liu, Xin;Wang, Mei;Yu, Yu
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.9 / pp.3444-3457 / 2021
  • In the field of anti-jamming based on dynamic spectrum, most methods try to improve the ability to avoid jamming and seldom consider whether the jammer can perceive the user's signal. Although these existing methods work in some anti-jamming scenarios, their long-term performance may degrade when intelligent jammers can learn the user's waveform or decision information from the user's historical activities. Hence, we propose a hidden anti-jamming method that addresses this problem by reducing the jammer's sense probability. In the proposed method, the action correlation between the user and the jammer is used to evaluate the hiding effect of the user's actions. A deep reinforcement learning framework, including a specific action correlation calculation and an iterative learning algorithm, is designed to maximize the user's hiding and communication performance simultaneously. Simulation results show that the proposed algorithm significantly reduces the jammer's sense probability and slightly improves the user's anti-jamming performance compared with existing algorithms based on jamming avoidance.

A Learning AI Algorithm for Poker with Embedded Opponent Modeling

  • Kim, Seong-Gon;Kim, Yong-Gi
    • International Journal of Fuzzy Logic and Intelligent Systems / v.10 no.3 / pp.170-177 / 2010
  • Poker is a game of imperfect information in which competing players must deal with multiple risk factors stemming from unknown information while making the best decision to win, which makes it an interesting test-bed for artificial intelligence research. This paper introduces a new learning AI algorithm with embedded opponent modeling that can be used in such situations, and we apply it to a poker program. The new AI is based on several graphs whose nodes represent inputs; the algorithm learns the optimal decision by updating the weights of the edges connecting these nodes and returns a probability for each action the graphs represent.
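
The abstract above describes edge weights that are updated during learning and then turned into one probability per action. The sketch below shows one simple way this could work, assuming a single weight per action and plain normalization; the abstract does not give the authors' update rule, so `reinforce`, `action_probabilities`, the learning rate, and the reward value are all hypothetical.

```python
def action_probabilities(weights):
    """Normalize non-negative edge weights into a probability per action."""
    total = sum(weights.values())
    if total <= 0:
        # Degenerate case: fall back to a uniform distribution.
        return {action: 1.0 / len(weights) for action in weights}
    return {action: w / total for action, w in weights.items()}


def reinforce(weights, action, reward, rate=0.1, floor=1e-6):
    """Return a copy of weights with the chosen action's edge adjusted.

    Positive rewards strengthen the edge and negative rewards weaken it;
    the floor keeps every weight strictly positive. This rule is an
    assumption made for illustration only.
    """
    updated = dict(weights)
    updated[action] = max(floor, updated[action] + rate * reward)
    return updated


# Start with equal weights, then reward a successful raise.
weights = {"fold": 1.0, "call": 1.0, "raise": 1.0}
weights = reinforce(weights, "raise", reward=10.0)
probs = action_probabilities(weights)
```

After the single update the `raise` edge carries half of the total weight, so it is selected with probability 0.5 and the other two actions with 0.25 each.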

User Identification Using Real Environmental Human Computer Interaction Behavior

  • Wu, Tong;Zheng, Kangfeng;Wu, Chunhua;Wang, Xiujuan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.6 / pp.3055-3073 / 2019
  • In this paper, a new user identification method is presented that uses real environmental human-computer-interaction (HCI) behavior data to improve the method's usability. User behavior data in this paper are collected continuously without constraining the experimental setting, such as text length or number of actions. To illustrate the characteristics of real environmental HCI data, the probability density distribution and performance of keyboard and mouse data are analyzed through the random sampling method and the Support Vector Machine (SVM) algorithm. Based on the analysis of HCI behavior data in a real environment, the Multiple Kernel Learning (MKL) method is first used for user HCI behavior identification because of the heterogeneity of keyboard and mouse data. All possible kernel methods are compared to determine the MKL algorithm's parameters and ensure the robustness of the algorithm. Data analysis results show that keyboard data have a narrower range of probability density distribution than mouse data. Keyboard data perform best with a 1-min time window, while mouse data perform best with a 10-min time window. Finally, experiments using the MKL algorithm with three global polynomial kernels and ten local Gaussian kernels achieve a user identification accuracy of 83.03% on a real environmental HCI dataset, which demonstrates that the proposed method achieves encouraging performance.

Training-Free sEMG Pattern Recognition Algorithm: A Case Study of A Patient with Partial-Hand Amputation (무학습 근전도 패턴 인식 알고리즘: 부분 수부 절단 환자 사례 연구)

  • Park, Seongsik;Lee, Hyun-Joo;Chung, Wan Kyun;Kim, Keehoon
    • The Journal of Korea Robotics Society / v.14 no.3 / pp.211-220 / 2019
  • Surface electromyogram (sEMG), a bio-electrical signal originating from the action potentials of nerves and muscle fibers activated by motor neurons, has been widely used to recognize the motion intention of amputees using robotic prostheses because it lets users operate a device intuitively without any artificial, additional effort. In this paper, we propose a training-free unsupervised sEMG pattern recognition algorithm. It is useful for gesture recognition for amputees from whom we cannot obtain the motion labels required by previous supervised pattern recognition algorithms. Using the proposed algorithm, we can classify sEMG signals for gesture recognition, and the calculated threshold probability value can be used as a sensitivity parameter for pattern registration. The proposed algorithm was verified in a case study of a patient with partial-hand amputation.

Generalized LR Parser with Conditional Action Model(CAM) using Surface Phrasal Types (표층 구문 타입을 사용한 조건부 연산 모델의 일반화 LR 파서)

  • 곽용재;박소영;황영숙;정후중;이상주;임해창
    • Journal of KIISE:Software and Applications / v.30 no.1_2 / pp.81-92 / 2003
  • Generalized LR (GLR) parsing is an enhanced LR parsing method that overcomes the limitation of the one-way linear stack of the traditional LR parser by using a graph-structured stack, and it has served as a firm starting point for other variants of natural-language parsing equipped with various mechanisms. In this paper, we propose a Conditional Action Model that can solve the problems of conventional probabilistic GLR methods. Previous probabilistic GLR parsers have used relatively limited contextual information for disambiguation because of the high complexity of the internal GLR stack. Our proposed model uses Surface Phrasal Types, which represent the structural characteristics of the parse, as additional contextual information, so that more specific structural preferences can be reflected in the parser. Experimental results show that our GLR parser with the proposed Conditional Action Model outperforms previous methods by about 6-7% without any lexical information, and that our model can exploit the rich stack information for syntactic disambiguation in probabilistic LR parsing.

The Construction and Internal Validation of Lifelong Education ISD Model (평생교육 교수체제설계 모형 개발 및 내적 타당화)

  • Yun, Gyuwon;Kim, Moon-Seup;Kim, Jin-Sook
    • The Journal of the Convergence on Culture Technology / v.8 no.3 / pp.213-219 / 2022
  • The purpose of this study is to construct and internally validate a DT ISD model that incorporates the Design Thinking process into Instructional Systems Design (ISD). The study proceeded in three steps. First, a literature review was conducted to examine the process and components of Instructional Systems Design models such as ADDIE. Design Thinking models were also reviewed to determine the possibility of integrating Design Thinking with ISD. Second, the DT ISD model was constructed by applying three principles of Design Thinking to the ISD process. Third, internal validation was conducted through a three-round Delphi study, and the DT ISD model was finally validated by experts in instructional technology and lifelong education. The DT ISD model is based on constructionism in learning theory rather than on the behavioral or cognitive learning theory underlying models such as ADDIE. Hence, the DT ISD model is an effective instructional design model for adult lifelong education programs. Action research is needed to examine the external validity of the DT ISD model.

Anomaly Detection for User Action with Generative Adversarial Networks (적대적 생성 모델을 활용한 사용자 행위 이상 탐지 방법)

  • Choi, Nam woong;Kim, Wooju
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.43-62 / 2019
  • In the past, anomaly detection was dominated by methods that determined whether an abnormality existed based on statistics derived from specific data. This approach was feasible because data were low-dimensional, so classical statistical methods worked effectively. However, as data have become more complex in the era of big data, it has become more difficult to accurately analyze and predict the data generated across industry in the conventional way. Supervised learning algorithms based on SVM and decision trees were therefore adopted. However, a supervised model can predict test data accurately only when the classes are balanced, and most data generated in industry have an imbalanced class distribution, so the predictions of a supervised model are not always valid. To overcome these drawbacks, many studies now use unsupervised models that are not influenced by class distribution, such as autoencoders or generative adversarial networks. In this paper, we propose a method to detect anomalies using generative adversarial networks. AnoGAN, introduced by Thomas et al. (2017), is a model that detects anomalies in medical images. It is built from convolutional neural networks and has been used for anomaly detection. By contrast, anomaly detection for sequence data using generative adversarial networks has been studied far less than for image data. Li et al. (2018) proposed a model based on LSTM, a type of recurrent neural network, to classify anomalies in numerical sequence data, but it has not been applied to categorical sequence data, nor has the feature matching method of Salimans et al. (2016). This suggests that much remains to be explored in anomaly classification of sequence data with generative adversarial networks. To learn sequence data, the generative adversarial network is built from LSTMs: the generator is a two-layer stacked LSTM with a 32-dimensional and a 64-dimensional hidden layer, and the discriminator is an LSTM with a 64-dimensional hidden layer. Existing work on anomaly detection for sequence data derives anomaly scores from the entropy of the probability of the actual data, whereas this paper, as mentioned above, derives anomaly scores using feature matching. In addition, the process of optimizing the latent variables was designed with LSTM to improve model performance. The modified generative adversarial model outperformed the autoencoder in precision in all experiments and was approximately 7% higher in accuracy. The generative adversarial network was also more robust than the autoencoder: because it learns the data distribution from real categorical sequence data, it is not affected by a single normal sample, whereas the autoencoder is. The robustness test showed an accuracy of 92% for the autoencoder and 96% for the generative adversarial network, and a sensitivity of 40% and 51%, respectively. Experiments were also conducted to show how performance changes with the structure used to optimize the latent variables; sensitivity improved by about 1%. These results offer a new perspective on latent variable optimization, which had previously received relatively little attention.
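
The abstract above derives anomaly scores with feature matching rather than entropy. As a rough sketch only, assuming feature matching here means comparing intermediate discriminator features of a real sequence with those of its closest generated sequence, the score could be a mean squared difference. The function names, the threshold, and the feature values are hypothetical; the paper's exact scoring formula is not given in the abstract.

```python
def feature_matching_score(real_features, generated_features):
    """Mean squared difference between two discriminator feature vectors.

    A larger score means the generator could not reproduce the input
    well, which is read as evidence of an anomaly. Illustrative only.
    """
    if not real_features or len(real_features) != len(generated_features):
        raise ValueError("feature vectors must be non-empty and equal length")
    squared = [(r - g) ** 2 for r, g in zip(real_features, generated_features)]
    return sum(squared) / len(squared)


def is_anomalous(score, threshold):
    """Flag a sequence whose score exceeds a chosen threshold."""
    return score > threshold


# Made-up 3-dimensional feature vectors for one user-action sequence.
real = [0.5, 1.0, -0.5]
generated = [0.5, 0.0, 0.5]

score = feature_matching_score(real, generated)
flagged = is_anomalous(score, threshold=0.5)
```

In a full pipeline the generated features would come from the generator output for an optimized latent variable, and the threshold would be tuned on validation data; both steps are outside this sketch.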