• Title/Summary/Keyword: Q learning

Search Results: 426

Study on Q-value prediction ahead of tunnel excavation face using recurrent neural network (순환인공신경망을 활용한 터널굴착면 전방 Q값 예측에 관한 연구)

  • Hong, Chang-Ho;Kim, Jin;Ryu, Hee-Hwan;Cho, Gye-Chun
    • Journal of Korean Tunnelling and Underground Space Association
    • /
    • v.22 no.3
    • /
    • pp.239-248
    • /
    • 2020
  • Exact rock classification helps select suitable support patterns. Face mapping is usually conducted to classify the rock mass using RMR (Rock Mass Rating) or Q values. There have been several attempts to predict the grade of the rock mass by applying deep learning to the mechanical data of jumbo drills or probe drills and to photographs of excavation surfaces. However, these approaches took a long time or could not determine the rock grade ahead of the tunnel face. In this study, a method to predict the Q value ahead of the excavation face is developed using a recurrent neural network (RNN), and it is compared with the Q values from face mapping for verification. Among the Q values from over 4,600 tunnel faces, 70% of the data was used for training and the rest was used for verification. Training was repeated while varying the number of training iterations and the number of previous excavation faces used as input. The agreement between the predicted and actual Q values was measured with the root mean square error (RMSE). The lowest RMSE was obtained with 600 training iterations and 2 prior excavation faces. Although the results may vary with the input data set, they help to understand how past ground conditions affect future ground conditions and to predict the Q value ahead of the tunnel excavation face.
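
The paper's setup (a window of prior faces predicting the next Q value, a 70/30 split, and an RMSE check) can be sketched as follows. A least-squares linear predictor on a synthetic Q-value sequence stands in for the paper's RNN, so everything except the data layout is an assumption:

```python
import numpy as np

def make_windows(q, n_prev=2):
    """Stack n_prev consecutive Q values as features for the next face."""
    X = np.array([q[i:i + n_prev] for i in range(len(q) - n_prev)])
    X = np.hstack([X, np.ones((len(X), 1))])   # bias column
    y = q[n_prev:]
    return X, y

def train_and_eval(q, n_prev=2, train_frac=0.7):
    """Fit on the first 70% of faces, report RMSE on the held-out rest."""
    X, y = make_windows(q, n_prev)
    n_train = int(train_frac * len(X))
    w, *_ = np.linalg.lstsq(X[:n_train], y[:n_train], rcond=None)
    pred = X[n_train:] @ w
    return float(np.sqrt(np.mean((pred - y[n_train:]) ** 2)))

q = 10.0 + 5.0 * np.sin(0.3 * np.arange(200))  # synthetic Q-value sequence
rmse = train_and_eval(q)
```

The synthetic sinusoid obeys an exact two-term recurrence, so the window-of-2 predictor recovers it almost perfectly; real face-mapping data would of course give a nonzero RMSE.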

The Hidden Object Searching Method for Distributed Autonomous Robotic Systems

  • Yoon, Han-Ul;Lee, Dong-Hoon;Sim, Kwee-Bo
    • Institute of Control, Robotics and Systems (ICROS) Conference Proceedings
    • /
    • 2005.06a
    • /
    • pp.1044-1047
    • /
    • 2005
  • In this paper, we present a strategy of object search for distributed autonomous robotic systems (DARS). DARS are systems that consist of multiple autonomous robotic agents among which the required functions are distributed. For instance, the agents should recognize the surroundings where they are located and generate rules to act upon by themselves. In this paper, we introduce a strategy for multiple DARS robots to search for a hidden object in an unknown area. First, we present an area-based action making process to determine the direction changes of the robots during their maneuvers. Second, we present a Q-learning adaptation to enhance the area-based action making process. Third, we introduce a coordinate system to represent a robot's current location. At the end of this paper, we show experimental results of using hexagon-based Q-learning to find the hidden object.

  • PDF
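
A hexagon-based grid such as the one this search runs on is commonly represented with axial coordinates; the sketch below is a generic hexagonal-grid representation (six movement directions per cell), not the paper's own coordinate system:

```python
# Axial (q, r) coordinates for a hexagonal grid: each cell has six neighbours.
HEX_DIRS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]

def neighbors(q, r):
    """The six cells a robot can move to from (q, r)."""
    return [(q + dq, r + dr) for dq, dr in HEX_DIRS]

def hex_distance(a, b):
    """Number of hexagonal steps between two axial coordinates."""
    dq, dr = a[0] - b[0], a[1] - b[1]
    return (abs(dq) + abs(dr) + abs(dq + dr)) // 2
```

With this representation, a hexagon-based Q-table simply indexes states by (q, r) and actions by the six directions.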

Reinforcement Learning-based Duty Cycle Interval Control in Wireless Sensor Networks

  • Akter, Shathee;Yoon, Seokhoon
    • International journal of advanced smart convergence
    • /
    • v.7 no.4
    • /
    • pp.19-26
    • /
    • 2018
  • One of the distinct features of Wireless Sensor Networks (WSNs) is the duty cycling mechanism, which is used to conserve energy and extend the network lifetime. A large duty cycle interval lowers energy consumption but lengthens the end-to-end (E2E) delay. In this paper, we introduce an energy consumption minimization problem for duty-cycled WSNs. We apply a Q-learning algorithm to obtain the maximum duty cycle interval that supports various delay requirements and a given delay success ratio (DSR), i.e., the required probability of packets arriving at the sink before a given delay bound. Our approach only requires the sink to compute the Q-learning, which makes it practical to implement. In the proposed method, nodes in different groups have different duty cycle intervals, and nodes do not need any information about neighboring nodes. Performance results show that our proposed scheme outperforms existing algorithms in terms of energy efficiency while assuring the required delay bound and DSR.
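
A minimal sketch of the idea, assuming a toy delay model, a hypothetical set of candidate intervals, and a hand-made reward: stateless Q-learning at the sink learns the largest interval that still meets the delay bound.

```python
import random

INTERVALS = [100, 200, 400, 800]   # candidate duty-cycle intervals (ms), assumed
DELAY_BOUND = 500.0                # delay requirement (ms), assumed

def e2e_delay(interval):
    """Toy channel model: E2E delay grows with the duty cycle interval."""
    return 0.8 * interval

def reward(interval):
    """Favour long intervals (energy saving) only while the bound holds."""
    return interval / 800.0 if e2e_delay(interval) <= DELAY_BOUND else -1.0

def q_learn(episodes=2000, alpha=0.1, eps=0.1, seed=0):
    rng = random.Random(seed)
    Q = {a: 0.0 for a in INTERVALS}
    for _ in range(episodes):
        # epsilon-greedy choice, then a stateless (bandit-style) Q update
        a = rng.choice(INTERVALS) if rng.random() < eps else max(Q, key=Q.get)
        Q[a] += alpha * (reward(a) - Q[a])
    return max(Q, key=Q.get)

best = q_learn()
```

Under this toy model the learner settles on 400 ms, the largest interval whose delay stays under the bound; the real scheme conditions the choice on delay requirements per node group.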

Semi-supervised Cross-media Feature Learning via Efficient L2,q Norm

  • Zong, Zhikai;Han, Aili;Gong, Qing
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.3
    • /
    • pp.1403-1417
    • /
    • 2019
  • With the rapid growth of multimedia data, research on cross-media feature learning has significance in many applications, such as multimedia search and recommendation. Existing methods are sensitive to noise and edge information in multimedia data. In this paper, we propose a semi-supervised method for cross-media feature learning by means of $L_{2,q}$ norm to improve the performance of cross-media retrieval, which is more robust and efficient than the previous ones. In our method, noise and edge information have less effect on the results of cross-media retrieval and the dynamic patch information of multimedia data is employed to increase the accuracy of cross-media retrieval. Our method can reduce the interference of noise and edge information and achieve fast convergence. Extensive experiments on the XMedia dataset illustrate that our method has better performance than the state-of-the-art methods.
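
The $L_{2,q}$ norm itself has a standard definition: the $L_2$ norms of the rows, raised to the power $q$, summed, and taken to the power $1/q$. A minimal sketch (the matrix values are illustrative):

```python
import numpy as np

def l2q_norm(A, q):
    """Row-wise L2,q norm: (sum_i ||A_i||_2^q)^(1/q); q < 1 promotes row sparsity."""
    row_norms = np.linalg.norm(A, axis=1)   # L2 norm of each row
    return float((row_norms ** q).sum() ** (1.0 / q))

val = l2q_norm(np.array([[3.0, 4.0], [0.0, 0.0]]), q=1.0)
```

Choosing q below 2 makes the norm less sensitive to rows with large values, which is the robustness-to-noise property the abstract appeals to.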

Reinforcement Learning Based Evolution and Learning Algorithm for Cooperative Behavior of Swarm Robot System (군집 로봇의 협조 행동을 위한 강화 학습 기반의 진화 및 학습 알고리즘)

  • Seo, Sang-Wook;Kim, Ho-Duck;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.17 no.5
    • /
    • pp.591-597
    • /
    • 2007
  • In swarm robot systems, each robot must behave by itself according to its state and environment and, if necessary, cooperate with other robots in order to carry out a given task. It is therefore essential that each robot has both learning and evolution abilities to adapt to dynamic environments. In this paper, a new polygon-based Q-learning algorithm and a distributed genetic algorithm are proposed for behavior learning and evolution of collective autonomous mobile robots. Through the distributed genetic algorithm, which exchanges chromosomes acquired under different environments by communication, each robot can improve its behavioral ability. In particular, in order to improve the performance of evolution, selective crossover using the characteristics of reinforcement learning is adopted in this paper. We verify the effectiveness of the proposed method by applying it to a cooperative search problem.
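
The distributed-evolution part can be sketched as follows, with a OneMax-style fitness standing in for behavior quality and a fitness-biased crossover standing in for the paper's reinforcement-guided selective crossover; population sizes, rates, and the migration schedule are all illustrative:

```python
import random

GENES = 20

def fitness(ch):
    """OneMax stand-in for a behaviour score learned by each robot."""
    return sum(ch)

def selective_crossover(p1, p2, rng):
    # Bias each gene toward the fitter parent (stand-in for the paper's
    # selective crossover guided by reinforcement-learning results).
    f1, f2 = fitness(p1), fitness(p2)
    bias = 0.5 if f1 == f2 else (0.7 if f1 > f2 else 0.3)
    return [a if rng.random() < bias else b for a, b in zip(p1, p2)]

def evolve(n_robots=4, pop=10, gens=30, seed=1):
    rng = random.Random(seed)
    pops = [[[rng.randint(0, 1) for _ in range(GENES)] for _ in range(pop)]
            for _ in range(n_robots)]
    for g in range(gens):
        if g % 5 == 0:  # migration: each robot receives a neighbour's best
            bests = [max(p, key=fitness) for p in pops]
            for i, p in enumerate(pops):
                p[rng.randrange(pop)] = list(bests[(i + 1) % n_robots])
        for i, p in enumerate(pops):
            nxt = [max(p, key=fitness)]            # elitism
            while len(nxt) < pop:
                p1, p2 = rng.sample(p, 2)
                child = selective_crossover(p1, p2, rng)
                if rng.random() < 0.05:            # mutation
                    j = rng.randrange(GENES)
                    child[j] ^= 1
                nxt.append(child)
            pops[i] = nxt
    return max(fitness(max(p, key=fitness)) for p in pops)

best = evolve()
```

The chromosome exchange every few generations is what lets experience gathered in one robot's environment spread to the others.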

Research Trends in Wi-Fi Performance Improvement in Coexistence Networks with Machine Learning (기계학습을 활용한 이종망에서의 Wi-Fi 성능 개선 연구 동향 분석)

  • Kang, Young-myoung
    • Journal of Platform Technology
    • /
    • v.10 no.3
    • /
    • pp.51-59
    • /
    • 2022
  • Machine learning, which has recently developed in innovative ways, has become an important technology for solving various optimization problems. In this paper, we introduce the latest research papers that solve the problem of channel sharing in heterogeneous networks using machine learning, analyze the characteristics of the mainstream approaches, and present a guide to future research directions. Existing studies have generally adopted Q-learning since it supports fast learning in both online and offline environments. However, conventional studies have either not considered various coexistence scenarios or lacked consideration of the location of the machine learning controllers, which can have a significant impact on network performance. One powerful way to overcome these disadvantages is to selectively use a machine learning algorithm according to changes in the network environment, based on the logical network architecture for machine learning proposed by the ITU.

A Study on Machine Learning and Basic Algorithms (기계학습 및 기본 알고리즘 연구)

  • Kim, Dong-Hyun;Lee, Tae-ho;Lee, Byung-Jun;Kim, Kyung-Tae;Youn, Hee-Yong
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2018.07a
    • /
    • pp.35-36
    • /
    • 2018
  • In this paper, we examine machine learning and, among machine learning techniques, reinforcement learning based on the Markov Decision Process (MDP). Reinforcement learning is a type of machine learning in which an agent in a given environment recognizes its current state and selects, from the set of possible actions, the action that maximizes reward. Unlike general machine learning, reinforcement learning requires no prior knowledge, so iterative learning is possible even in uncertain environments. In this study, we briefly describe reinforcement learning in general and Q-learning, the most widely used reinforcement learning technique.

  • PDF
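
The Q-learning update the abstract refers to, Q(s,a) ← Q(s,a) + α[r + γ·max Q(s',·) − Q(s,a)], can be demonstrated on a toy chain MDP; the chain layout and hyperparameters are illustrative:

```python
import random

# States 0..4 in a chain; action 0 = left, 1 = right; reaching state 4 pays 1.
N, GOAL = 5, 4

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(N - 1, s + 1)
    r = 1.0 if s2 == GOAL else 0.0
    return s2, r, s2 == GOAL

def train(episodes=500, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N)]
    for _ in range(episodes):
        s = rng.randrange(GOAL)        # exploring starts: begin anywhere but goal
        for _ in range(50):
            # epsilon-greedy action, then the tabular Q-learning update
            a = rng.randrange(2) if rng.random() < eps else int(Q[s][1] > Q[s][0])
            s2, r, done = step(s, a)
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) * (not done) - Q[s][a])
            s = s2
            if done:
                break
    return Q

Q = train()
policy = [int(q[1] > q[0]) for q in Q]   # greedy policy per state
```

After training, the greedy policy moves right from every non-goal state, which is optimal on this chain.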

Reinforcement Learning-Based Intelligent Decision-Making for Communication Parameters

  • Xie, Xia;Dou, Zheng;Zhang, Yabin
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.16 no.9
    • /
    • pp.2942-2960
    • /
    • 2022
  • The core of cognitive radio is the problem of intelligent decision-making for communication parameters, the objective of which is to find the most appropriate parameter configuration to optimize transmission performance. Current algorithms have the disadvantages of high dependence on prior knowledge, a large amount of calculation, and high complexity. We propose a new decision-making model that makes full use of the interactivity of reinforcement learning (RL) and applies the Q-learning algorithm. By simplifying the decision-making process, we avoid large-scale RL, reduce complexity, and improve timeliness. The proposed model is able to find the optimal waveform parameter configuration for the communication system in complex channels without prior knowledge, and it is more flexible than previous decision-making models. The simulation results demonstrate the effectiveness of our model: it not only exhibits better decision-making performance in AWGN channels than the traditional method but also makes reasonable decisions in fading channels.
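
A toy stand-in for this kind of decision problem: choose a modulation order per channel condition with stateless Q-learning. The channel states, the action set, and the reward shape are all assumptions for illustration, not the paper's model:

```python
import math
import random

STATES = ["awgn", "fading"]        # channel conditions (assumed)
ACTIONS = [2, 4, 16, 64]           # candidate modulation orders (assumed)

def reward(state, m):
    # Throughput grows with log2(m), but dense constellations are
    # heavily penalized in the fading channel.
    r = math.log2(m)
    return r if state == "awgn" or m <= 4 else r - 6.0

def train(steps=4000, alpha=0.2, eps=0.1, seed=0):
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(steps):
        s = rng.choice(STATES)               # channel observed this slot
        if rng.random() < eps:
            a = rng.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[(s, x)])
        Q[(s, a)] += alpha * (reward(s, a) - Q[(s, a)])
    return {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in STATES}

policy = train()
```

The learned policy picks the densest constellation in the clean channel and backs off in fading, mirroring the paper's claim that the model adapts across channel types without prior knowledge of them.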

A biologically inspired model based on a multi-scale spatial representation for goal-directed navigation

  • Li, Weilong;Wu, Dewei;Du, Jia;Zhou, Yang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.11 no.3
    • /
    • pp.1477-1491
    • /
    • 2017
  • Inspired by the multi-scale nature of hippocampal place cells, a biologically inspired model based on a multi-scale spatial representation for goal-directed navigation is proposed in order to achieve robotic spatial cognition and autonomous navigation. First, a map of place cells is constructed at different scales, which is used to encode the spatial environment. Then, the firing rate of the place cells in each layer is calculated by a Gaussian function as the input to the Q-learning process. The robot decides its next direction of movement from several candidate actions according to the rules of action selection. After several training trials, the robot accumulates experiential knowledge and thus learns an appropriate navigation policy to find its goal. The simulation results show that, in contrast to the other two methods (G-Q, S-Q), the multi-scale model presented in this paper is not only in line with the multi-scale nature of place cells but also learns faster to find the optimized path to the goal. Additionally, this method performs well on the goal-directed navigation task in large spaces and in environments with obstacles.
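
The Gaussian firing rate mentioned above has a standard form: a place cell's rate peaks when the agent is at the cell's centre and falls off with distance, with one width per scale layer. A minimal sketch, with illustrative centres and widths:

```python
import numpy as np

def firing_rates(pos, centers, sigmas):
    """rate_i = exp(-||pos - c_i||^2 / (2 * sigma_i^2)) for each place cell."""
    d2 = ((centers - pos) ** 2).sum(axis=1)   # squared distance to each centre
    return np.exp(-d2 / (2.0 * sigmas ** 2))

centers = np.array([[0.0, 0.0], [0.0, 0.0]])  # same centre, two scale layers
sigmas = np.array([1.0, 4.0])                 # fine-scale and coarse-scale widths
rates = firing_rates(np.array([2.0, 0.0]), centers, sigmas)
```

At the same offset, the coarse-scale cell fires more strongly than the fine-scale one, which is what lets the multi-scale map cover large spaces while keeping precision near the goal.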

A Case Study of Flipped Learning of Cooking Practice Subject of University Students (대학생 조리실무 교과목의 플립드러닝(Flipped learning) 적용사례 연구)

  • Kim, Hak-Ju;Kim, Chan-Woo
    • The Journal of the Korea Contents Association
    • /
    • v.20 no.9
    • /
    • pp.129-139
    • /
    • 2020
  • This study was conducted to analyze the subjective perception types of college students majoring in cooking by applying the flipped learning teaching and learning method to a cooking practice subject, in order to improve the educational efficiency of cooking-related classes. To study the students' subjective perceptions, we used the Q methodology to identify the common structures in their subjective attitudes and perceptions, and the analysis yielded four types. Type 1 (N=5): problem-solving ability effect; Type 2 (N=6): self-directed learning effect; Type 3 (N=3): mutual cooperation practice effect; Type 4 (N=6): theory learning effect. Each type was analyzed for its unique features. Flipped learning applied to cooking practice classes is a learner-centered form of education that departs from the traditional teaching method, and it was found to have a very positive effect on learners' opinion sharing and learning outcomes. However, it was revealed that additional solutions are still needed for problems such as the operation plan for flipped learning and the evaluation of free riding in group learning.