Learning Neural Search Policies for Classical Planning

November 27, 2019 · Declared Dead · 🏛 International Conference on Automated Planning and Scheduling

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Pawel Gomoluch, Dalal Alrajeh, Alessandra Russo, Antonio Bucchiarone
arXiv ID: 1911.12200
Category: cs.AI (Artificial Intelligence)
Citations: 11
Venue: International Conference on Automated Planning and Scheduling
Last Checked: 4 months ago

Abstract

Heuristic forward search is currently the dominant paradigm in classical planning. Forward search algorithms typically rely on a single, relatively simple variation of best-first search and remain fixed throughout the process of solving a planning problem. Existing work combining multiple search techniques usually aims at supporting best-first search with an additional exploratory mechanism, triggered using a handcrafted criterion. A notable exception is very recent work which combines various search techniques using a trainable policy. It is, however, confined to a discrete action space comprising several fixed subroutines. In this paper, we introduce a parametrized search algorithm template which combines various search techniques within a single routine. The template's parameter space defines an infinite space of search algorithms, including, among others, BFS, local and random search. We further introduce a neural architecture for designating the values of the search parameters given the state of the search. This enables expressing neural search policies that change the values of the parameters as the search progresses. The policies can be learned automatically, with the objective of maximizing the planner's performance on a given distribution of planning problems. We consider a training setting based on a stochastic optimization algorithm known as the cross-entropy method (CEM). Experimental evaluation of our approach shows that it is capable of finding effective distribution-specific search policies, outperforming the relevant baselines.
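
To make the training setting mentioned in the abstract more concrete, the following is a minimal sketch of the generic cross-entropy method, not the authors' implementation. It assumes a diagonal Gaussian over a flat parameter vector and maximizes a toy objective that merely stands in for "planner performance on a distribution of problems". The names `cross_entropy_method`, `toy_score`, and `TARGET`, along with all hyperparameter values, are hypothetical and chosen only for illustration.

```python
import random
import statistics


def cross_entropy_method(objective, dim, iterations=40, population=100,
                         elite_fraction=0.2, init_std=2.0, seed=0):
    """Maximize `objective` over R^dim with a diagonal-Gaussian CEM (sketch)."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [init_std] * dim
    n_elite = max(2, int(population * elite_fraction))

    for _ in range(iterations):
        # Sample candidate parameter vectors from the current Gaussian.
        candidates = [
            [rng.gauss(mean[i], std[i]) for i in range(dim)]
            for _ in range(population)
        ]
        # Keep the highest-scoring candidates (the "elite" set).
        candidates.sort(key=objective, reverse=True)
        elite = candidates[:n_elite]
        # Refit the sampling distribution to the elite set.
        for i in range(dim):
            column = [c[i] for c in elite]
            mean[i] = statistics.fmean(column)
            # Small floor on the spread so sampling never fully collapses.
            std[i] = statistics.pstdev(column) + 1e-3
    return mean


# Hypothetical optimum of the toy objective (an assumption, not from the paper).
TARGET = [0.3, -1.2, 2.0]


def toy_score(params):
    # Higher is better; the score peaks at TARGET.
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))


best = cross_entropy_method(toy_score, dim=len(TARGET))
print([round(b, 2) for b in best])
```

In a setting like the one the abstract describes, the parameter vector would presumably hold the weights of the neural search policy and the objective would be computed by running the planner on sampled training problems. Both of those pieces are outside the scope of this sketch.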