Title: The role of information structures in game-theoretic multi-agent learning

Journal Article · Annual Reviews in Control

Multi-agent learning (MAL) studies how agents learn to behave optimally and adaptively from their experience when interacting with other agents in dynamic environments. The outcome of a MAL process is jointly determined by all agents’ decision-making. Hence, each agent needs to think strategically about others’ sequential moves when planning future actions. The strategic interactions among agents make MAL more than a direct extension of single-agent learning to multiple agents. Through strategic thinking, each agent aims to build a subjective model of others’ decision-making using its observations. Such modeling is directly influenced by what the agent perceives during the learning process, which is called the information structure of the agent’s learning. Because it determines the input to MAL processes, the information structure plays a significant role in the learning mechanisms of the agents. This review creates a taxonomy of MAL and establishes a unified and systematic way to understand MAL from the perspective of information structures. We define three fundamental components of MAL: the information structure (i.e., what the agent can observe), the belief generation (i.e., how the agent forms a belief about others based on the observations), and the policy generation (i.e., how the agent generates its policy based on its belief). In addition, this taxonomy enables the classification of a wide range of state-of-the-art algorithms into four categories based on the belief-generation mechanisms of the opponents: stationary, conjectured, calibrated, and sophisticated opponents. We introduce Value of Information (VoI) as a metric to quantify the impact of different information structures on MAL. Finally, we discuss the strengths and limitations of algorithms from different categories and point to promising avenues of future research.
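As a rough illustration of the three components named in the abstract, the sketch below uses classical fictitious play in a two-player matrix game, which treats the opponent as stationary. It is not taken from the article: the function names (`observe`, `update_belief`, `best_response`, `fictitious_play`), the uniform prior counts, and the matching-pennies payoffs are all assumptions chosen for the example.

```python
import numpy as np


def observe(opponent_action):
    """Information structure (assumed): the agent sees the opponent's realized action."""
    return opponent_action


def update_belief(counts, observation):
    """Belief generation: empirical frequency of observed actions (stationary-opponent model)."""
    new_counts = counts.copy()
    new_counts[observation] += 1.0
    return new_counts, new_counts / new_counts.sum()


def best_response(payoff, belief):
    """Policy generation: pure action maximizing expected payoff under the belief.

    payoff[i, j] is the agent's payoff for its own action i against opponent action j.
    Ties are broken toward the lowest index by np.argmax.
    """
    return int(np.argmax(payoff @ belief))


def fictitious_play(payoff_row, payoff_col, rounds=1000):
    """Run two-player fictitious play; both payoff matrices are indexed [row action, column action]."""
    n_row, n_col = payoff_row.shape
    counts_about_col = np.ones(n_col)  # row player's counts of column actions (uniform prior, assumed)
    counts_about_row = np.ones(n_row)  # column player's counts of row actions (uniform prior, assumed)
    belief_about_col = counts_about_col / counts_about_col.sum()
    belief_about_row = counts_about_row / counts_about_row.sum()
    for _ in range(rounds):
        a_row = best_response(payoff_row, belief_about_col)
        # Transpose so the column player's own action indexes the rows.
        a_col = best_response(payoff_col.T, belief_about_row)
        counts_about_col, belief_about_col = update_belief(counts_about_col, observe(a_col))
        counts_about_row, belief_about_row = update_belief(counts_about_row, observe(a_row))
    return belief_about_row, belief_about_col


if __name__ == "__main__":
    # Matching pennies (zero-sum): the row player wins on a match, the column player on a mismatch.
    pennies = np.array([[1.0, -1.0], [-1.0, 1.0]])
    row_freq, col_freq = fictitious_play(pennies, -pennies, rounds=5000)
    print("empirical row frequencies:", row_freq)
    print("empirical column frequencies:", col_freq)
```

Changing `observe` (for example, to return a noisy or delayed signal) changes the information structure while leaving the belief and policy steps untouched, which is the kind of separation the taxonomy describes.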

Research Organization:
The Ohio State University, Columbus, OH (United States)
Sponsoring Organization:
USDOE Office of Nuclear Energy (NE); National Science Foundation (NSF); Army Research Office (ARO)
Grant/Contract Number:
NE0008986; SES-1541164; ECCS-1847056; CNS-2027884; BCS-2122060; 20-19829; W911NF-19-1-0041
OSTI ID:
1976878
Journal Information:
Annual Reviews in Control, Vol. 53, Issue C; ISSN 1367-5788
Publisher:
International Federation of Automatic Control - Elsevier
Country of Publication:
United States
Language:
English
