Decentralized learning in control and optimization for networks and dynamic games

Module Description

Over the past decade, decentralized optimization and games have played an important role in the design of algorithms for control, coordination, and resource sharing in networks. Examples include resource management in wireless systems, routing and congestion control in data networks, control of epidemic-like dynamics over networks, and control in social networks. Motivated by the emergence of large-scale networks and systems that lack a centralized coordinator, researchers have recently developed novel techniques for designing distributed algorithms that either efficiently reach a socially optimal operating point of the system or converge to a Nash equilibrium. These techniques, and hence the resulting algorithms, differ according to the type of actions (discrete or continuous) available to the agents composing the network or system, the nature of the information exchanged among agents, and the underlying network dynamics (deterministic or stochastic). This course presents some of the key recent techniques for decentralized optimization and learning in games in such systems.

Syllabus

  • Part I: Centralized optimization
    • I-A Gradient-free optimization
    • I-A.1 Random global search: Hit-and-Run algorithm
    • I-A.2 Random local search: Generating Set Search algorithm
    • I-A.3 Simulated annealing
    • I-A.4 Gradient-estimator methods
    • I-B Gradient-based optimization
    • I-B.1 Convex analysis
    • I-B.2 Unconstrained smooth optimization
    • I-B.3 Constrained smooth optimization
    • I-B.4 Lagrange duality and dual descent algorithm
    • I-B.5 Fixed point iteration
  • Part II: Distributed optimization
    • II-A Internet congestion control. Distributed gradient methods: Separable objectives (see the sketch after the syllabus)
    • II-B Two miracles in resource allocation in wireless networks (Power control, Carrier Sense Multiple Access). Distributed gradient methods: Non-separable objectives
    • II-C The parallel computation model
    • II-D Distributed combinatorial optimization: coloring problems
    • II-E Simulation-based distributed optimization
  • Part III: Bandits and adversarial optimization
    • III-A Multi-armed bandit problems
    • III-B Stochastic bandit problems
    • III-B.1 IID setting
    • III-B.2 Lower bound on regret
    • III-B.3 UCB policies, finite-time analysis (see the sketch after the syllabus)
    • III-B.4 Asymptotically optimal policies: KL-UCB
    • III-C Adversarial bandit problems
    • III-D Online convex optimization
  • Part IV: Dynamics in games
    • IV-A Nash dynamics
    • IV-B Fictitious play (see the sketch after the syllabus)
    • IV-C No-regret algorithms
    • IV-D Learning by trial and error
    • IV-E Equilibrium selection
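
Illustrative sketches

To give a flavour of the material in Parts I-B.4 and II-A, the following minimal sketch (in Python/NumPy; the helper name dual_descent_num and the two-link example network are illustrative, not part of the course material) applies dual descent to network utility maximization with logarithmic utilities: each link adjusts a price by projected gradient ascent on the Lagrange dual, and each source independently picks the rate that is optimal at the current price of its path.

    import numpy as np

    def dual_descent_num(R, c, steps=20_000, gamma=1e-3):
        """Dual decomposition for network utility maximization:
        maximize sum_s log(x_s) subject to R x <= c, where R[l, s] = 1
        if source s uses link l.  Each link l keeps a price p_l updated
        by projected gradient ascent on the dual; each source s picks
        the rate maximizing log(x_s) - x_s * (price of its path),
        i.e. x_s = 1 / sum_l R[l, s] p_l."""
        p = np.ones(R.shape[0])                            # link prices (dual variables)
        for _ in range(steps):
            x = 1.0 / (R.T @ p)                            # sources' rate updates
            p = np.maximum(p + gamma * (R @ x - c), 1e-8)  # link price updates
        return x, p

    # Two-link line network: source 0 uses both links, sources 1 and 2
    # use one link each; both links have capacity 1.
    R = np.array([[1.0, 1.0, 0.0],
                  [1.0, 0.0, 1.0]])
    c = np.array([1.0, 1.0])
    x, p = dual_descent_num(R, c)
    print(np.round(x, 3))   # proportionally fair rates, close to [1/3, 2/3, 2/3]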
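Similarly, for Part III-B.3, the sketch below (Bernoulli rewards; the function name ucb1 and the arm parameters are arbitrary choices for illustration) implements the UCB1 index policy: after pulling every arm once, it repeatedly selects the arm with the largest empirical mean plus exploration bonus sqrt(2 log t / n_a).

    import math
    import random

    def ucb1(arm_means, horizon, seed=0):
        """UCB1 on a Bernoulli bandit: returns the pull counts and the
        cumulative pseudo-regret over the given horizon."""
        rng = random.Random(seed)
        n_arms = len(arm_means)
        counts = [0] * n_arms        # number of pulls of each arm
        sums = [0.0] * n_arms        # sum of rewards observed on each arm
        best_mean = max(arm_means)
        regret = 0.0
        for t in range(1, horizon + 1):
            if t <= n_arms:
                arm = t - 1          # initialization: pull each arm once
            else:
                # index = empirical mean + exploration bonus
                arm = max(range(n_arms),
                          key=lambda a: sums[a] / counts[a]
                          + math.sqrt(2.0 * math.log(t) / counts[a]))
            reward = 1.0 if rng.random() < arm_means[arm] else 0.0
            counts[arm] += 1
            sums[arm] += reward
            regret += best_mean - arm_means[arm]
        return counts, regret

    counts, regret = ucb1([0.3, 0.5, 0.7], horizon=10_000)
    print(counts, round(regret, 1))   # most pulls go to the best arm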
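Finally, for Part IV-B, here is a minimal sketch of discrete-time fictitious play in a two-player matrix game (the function name fictitious_play and the matching-pennies example are again only illustrative): each player best-responds to the empirical frequency of the opponent's past actions, and in this zero-sum example the empirical frequencies converge to the mixed Nash equilibrium.

    import numpy as np

    def fictitious_play(A, B, n_steps=5_000):
        """Discrete-time fictitious play in a bimatrix game with payoff
        matrices A (row player) and B (column player).  Each player
        best-responds to the empirical distribution of the opponent's
        past actions."""
        n, m = A.shape
        counts1 = np.ones(n)   # empirical action counts, uniform start
        counts2 = np.ones(m)
        for _ in range(n_steps):
            a1 = int(np.argmax(A @ (counts2 / counts2.sum())))  # best response of player 1
            a2 = int(np.argmax((counts1 / counts1.sum()) @ B))  # best response of player 2
            counts1[a1] += 1
            counts2[a2] += 1
        return counts1 / counts1.sum(), counts2 / counts2.sum()

    # Matching pennies: empirical frequencies converge to the mixed
    # Nash equilibrium (1/2, 1/2) for both players.
    A = np.array([[1.0, -1.0], [-1.0, 1.0]])
    x, y = fictitious_play(A, -A)
    print(np.round(x, 2), np.round(y, 2))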