Balancing the Tradeoffs Between Exploration and Exploitation

A Study on Balancing the Tradeoff Between Exploration and Exploitation

  • Published: 2005.11.01

Abstract

As auctions become popular, developing good agent bidding strategies has become an important focus of agent-based electronic commerce research. The agent bidding strategy has particular practical significance in continuous double auctions, where no single dominant strategy is known. This paper introduces an adaptive agent strategy for the continuous double auction. The central idea is to let the agent figure out at run time when the sophisticated strategy (called the p-strategy) is beneficial and when a simpler strategy is better. The balance between exploration and exploitation is achieved with a heuristic exploration function that trades off the expected profit of each strategy against the number of times it has been tried. We have experimentally evaluated the performance of the adaptive strategy in a wide variety of environments. The experimental results indicate that the adaptive strategy outperforms the plain p-strategy when the p-strategy performs poorly, and performs similarly to the p-strategy when the p-strategy dominates the other, simpler strategies.

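As auctions become more active in electronic commerce, research on auction agents and on bidding strategies for those agents has become an important focus. In particular, developing agent strategies for complex auction environments in which no dominant strategy is known has practical significance. This paper introduces an "adaptive strategy" for the continuous double auction (CDA), an environment in which no optimal strategy exists. The main idea of the adaptive strategy is to add run-time adaptability to the existing P-strategy. The adaptive strategy seeks a balance between exploitation, which keeps using the strategy that has performed well so far among several known strategies, and exploration, which looks for a strategy suited to an environment that may have changed. It decides between the two with a heuristic exploration function that considers the tradeoff between each strategy's expected profit and the number of times it has been run. The experimental analysis shows that the adaptive strategy (1) earns higher profit than the P-strategy in environments where the P-strategy does not work well, and (2) earns profit similar to that of the P-strategy in environments where the P-strategy outperforms the other, simpler strategies.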
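
The abstract does not give the form of the heuristic exploration function, so the following Python sketch is only one plausible reading, not the authors' method. It assumes a simple optimistic rule: a strategy that has been tried fewer than min_tries times is valued at an optimistic profit, and afterwards at its observed mean profit. The names StrategyStats, exploration_value, AdaptiveSelector, optimistic_profit, and min_tries, as well as the profit figures in the usage loop, are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class StrategyStats:
    """Running profit record for one bidding strategy."""
    total_profit: float = 0.0
    tries: int = 0

    @property
    def mean_profit(self) -> float:
        # Observed average profit; 0.0 before the strategy has been tried.
        return self.total_profit / self.tries if self.tries else 0.0


def exploration_value(mean_profit: float, tries: int,
                      optimistic_profit: float, min_tries: int) -> float:
    """Assumed rule: optimistic value while under-tried, observed mean afterwards."""
    return optimistic_profit if tries < min_tries else mean_profit


class AdaptiveSelector:
    """Chooses among named strategies by their exploration value."""

    def __init__(self, names, optimistic_profit: float, min_tries: int):
        self.stats = {name: StrategyStats() for name in names}
        self.optimistic_profit = optimistic_profit
        self.min_tries = min_tries

    def _value(self, name: str) -> float:
        s = self.stats[name]
        return exploration_value(s.mean_profit, s.tries,
                                 self.optimistic_profit, self.min_tries)

    def choose(self) -> str:
        # Highest exploration value wins; ties go to the strategy listed first.
        return max(self.stats, key=self._value)

    def record(self, name: str, profit: float) -> None:
        # Update the running totals after one auction round.
        s = self.stats[name]
        s.total_profit += profit
        s.tries += 1


# Illustrative run with made-up profits: the p-strategy earns 1.0 per round,
# the simple strategy earns 3.0 per round.
selector = AdaptiveSelector(["p_strategy", "simple"],
                            optimistic_profit=10.0, min_tries=2)
for _ in range(4):
    name = selector.choose()
    selector.record(name, 1.0 if name == "p_strategy" else 3.0)
```

In this run each strategy is tried twice, after which the selector settles on the one with the higher observed mean. A rule of this kind stops exploring once every strategy has reached min_tries, so an agent facing an environment that changes over time would presumably need some additional mechanism, such as discounting old observations.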
