Reinforcement Learning-based Duty Cycle Interval Control in Wireless Sensor Networks

  • Akter, Shathee (Department of Electrical and Computer Engineering, University of Ulsan) ;
  • Yoon, Seokhoon (Department of Electrical and Computer Engineering, University of Ulsan)
  • Received : 2018.09.02
  • Accepted : 2018.09.13
  • Published : 2018.12.31

Abstract

One of the distinct features of Wireless Sensor Networks (WSNs) is the duty cycling mechanism, which is used to conserve energy and extend the network lifetime. A large duty cycle interval lowers energy consumption but lengthens end-to-end (E2E) delay. In this paper, we introduce an energy consumption minimization problem for duty-cycled WSNs. We apply the Q-learning algorithm to obtain the maximum duty cycle interval that supports various delay requirements and a given Delay Success Ratio (DSR), i.e., the required probability of packets arriving at the sink before a given delay bound. Our approach requires only the sink to compute Q-learning, which makes it practical to implement. In our proposed method, nodes in different groups have different duty cycle intervals, and nodes do not need to know information about neighboring nodes. Performance results show that our proposed scheme outperforms existing algorithms in terms of energy efficiency while assuring the required delay bound and DSR.
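The idea described in the abstract can be sketched with a small tabular Q-learning loop in which the actions are candidate duty cycle intervals and the reward favors long intervals (less energy) that still meet the delay bound. This is only an illustrative sketch: the interval values, one-state-per-group formulation, and reward shape below are assumptions for illustration, not the authors' actual definitions.

```python
import random

# Illustrative sketch (assumed, not the paper's exact formulation): the sink
# runs tabular Q-learning to pick a duty cycle interval per node group.
INTERVALS = [100, 200, 400, 800]      # hypothetical candidate intervals (ms)
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1     # learning rate, discount, exploration

def reward(interval, delay_ok):
    # Favor longer intervals (lower energy) but penalize missed delay bounds.
    return interval / max(INTERVALS) if delay_ok else -1.0

def choose(q, state):
    # Epsilon-greedy selection over the candidate intervals.
    if random.random() < EPS:
        return random.randrange(len(INTERVALS))
    return max(range(len(INTERVALS)), key=lambda a: q[state][a])

def update(q, s, a, r, s_next):
    # Standard Q-learning update: Q(s,a) += alpha * (r + gamma*max Q(s',.) - Q(s,a))
    q[s][a] += ALPHA * (r + GAMMA * max(q[s_next]) - q[s][a])

# Toy run with 3 node groups; assume the delay bound is met iff interval <= 400 ms.
random.seed(0)
q = [[0.0] * len(INTERVALS) for _ in range(3)]
for episode in range(2000):
    for group in range(3):
        a = choose(q, group)
        delay_ok = INTERVALS[a] <= 400
        update(q, group, a, reward(INTERVALS[a], delay_ok), group)

best = INTERVALS[max(range(len(INTERVALS)), key=lambda a: q[0][a])]
print(best)  # converges to the largest interval that meets the delay bound
```

Under these toy assumptions the learned policy settles on 400 ms, the largest interval that still satisfies the delay constraint, mirroring the paper's goal of maximizing the interval subject to the delay bound and DSR.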


Figure 1. An example of network model

Figure 2. (a): Total energy consumption; (b): Maximum energy consumption of a node in the network

Figure 3. (a): Total energy consumption; (b): Maximum energy consumption of a node in the network

References

  1. H. Wang, N. Agoulmine, M. Ma and Y. Jin, "Network lifetime optimization in wireless sensor networks", IEEE Journal on Selected Areas in Communications, Vol. 28, No. 7, pp. 1127-1137, 2010. DOI: http://dx.doi.org/10.1109/JSAC.2010.100917
  2. T. Rault, A. Bouabdallah and Y. Challal, "Energy efficiency in wireless sensor networks: A top-down survey", Computer Networks, Vol. 67, pp. 104-122, 2014. DOI: https://doi.org/10.1016/j.comnet.2014.03.027
  3. R. Alberola and D. Pesch, "Duty Cycle Learning Algorithm (DCLA) for IEEE 802.15.4 Beacon-enabled Wireless Sensor Networks", Ad Hoc Networks, Vol. 10, No. 4, pp. 664-679, 2012. DOI: https://doi.org/10.1016/j.adhoc.2011.06.006
  4. V. D. Son and S. Yoon, "Duty Cycle Scheduling considering Delay Time Constraints in Wireless Sensor Networks", The Journal of The Institute of Internet, Broadcasting and Communication (IIBC), Vol. 18, No. 2, pp. 169-176, Apr. 30, 2018. DOI: https://doi.org/10.3390/electronics7110306
  5. T. N. Dao, S. Yoon, and J. Kim, "A deadline-aware scheduling and forwarding scheme in wireless sensor networks", Sensors, vol. 16, no. 1, 2016. DOI: https://doi.org/10.3390/s16010059
  6. R. Sutton and A. Barto, "Reinforcement Learning: An Introduction", MIT Press, Cambridge, MA, 1998.
  7. D. White, "Real applications of Markov decision processes", Interfaces, Vol. 15, no. 6, pp. 73-83, 1985. DOI: https://doi.org/10.1287/inte.15.6.73
  8. C. J. C. H. Watkins, Learning from Delayed Rewards, PhD Thesis, University of Cambridge, England, 1989.
  9. D. Bertsekas and J. Tsitsiklis, "Neuro-Dynamic Programming", Athena Scientific, Belmont, MA, 1996.
  10. J. Tsitsiklis, "Asynchronous stochastic approximation and Q-learning", Machine Learning, Vol. 16, pp. 185-202, 1994. DOI: https://doi.org/10.1023/A:102268912504
  11. T. Jaakkola, M. Jordan, and S. Singh, "On the convergence of stochastic iterative dynamic programming algorithms", Neural Computation, Vol. 6, pp. 1185-1201, 1994. DOI: https://doi.org/10.1162/neco.1994.6.6.1185
  12. C. Watkins and P. Dayan, "Q-learning", Machine Learning, Vol. 8, pp. 279-292, 1992. DOI: https://doi.org/10.1007/BF00992698
  13. R. S. Sutton, Temporal Credit Assignment in Reinforcement Learning, PhD Thesis, University of Massachusetts, Amherst, MA, 1984.
  14. R. Sutton, "Learning to predict by the methods of temporal differences", Machine Learning, Vol. 3, pp. 9-44, 1988. DOI: https://doi.org/10.1007/BF00115009