Learning NP-Hard Multi-Agent Assignment Planning using GNN: Inference on a Random Graph and Provable Auction-Fitted Q-learning

May 29, 2019 · Declared Dead · 🏛 Neural Information Processing Systems

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Hyunwook Kang, Taehwan Kwon, Jinkyoo Park, James R. Morrison arXiv ID 1905.12204 Category cs.LG: Machine Learning Cross-listed cs.AI, cs.MA, cs.RO, stat.ML Citations 4 Venue Neural Information Processing Systems Last Checked 4 months ago

Abstract

This paper explores the possibility of near-optimally solving multi-agent, multi-task NP-hard planning problems with time-dependent rewards using a learning-based algorithm. In particular, we consider a class of robot/machine scheduling problems called the multi-robot reward collection problem (MRRC). Such MRRC problems well model ride-sharing, pickup-and-delivery, and a variety of related problems. In representing the MRRC problem as a sequential decision-making problem, we observe that each state can be represented as an extension of probabilistic graphical models (PGMs), which we refer to as random PGMs. We then develop a mean-field inference method for random PGMs. We then propose (1) an order-transferable Q-function estimator and (2) an order-transferability-enabled auction to select a joint assignment in polynomial time. These result in a reinforcement learning framework with at least $1-1/e$ optimality. Experimental results on solving MRRC problems highlight the near-optimality and transferability of the proposed methods. We also consider identical parallel machine scheduling problems (IPMS) and minimax multiple traveling salesman problems (minimax-mTSP).