Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation

October 03, 2023 · Declared Dead · 🏛 arXiv.org

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Eric Zelikman, Eliana Lorch, Lester Mackey, Adam Tauman Kalai arXiv ID 2310.02304 Category cs.CL: Computation & Language Cross-listed cs.AI, cs.LG, stat.ML Citations 84 Venue arXiv.org Last Checked 4 months ago

Abstract

Several recent advances in AI systems solve problems by providing a "scaffolding" program that structures multiple calls to language models (LMs) to generate better outputs. A scaffolding program is written in a programming language such as Python. In this work, we use a language-model-infused scaffolding program to improve itself. We start with a seed "improver" that improves an input program according to a given utility function by querying an LM several times and returning the best solution. We then run this seed improver to improve itself. Across a small set of downstream tasks, the resulting improved improver generates programs with significantly better performance than its seed improver. A variety of self-improvement strategies are proposed by the language model, including beam search, genetic algorithms, and simulated annealing. Since the language models themselves are not altered, this is not full recursive self-improvement. Nonetheless, it demonstrates that a modern language model, GPT-4 in our experiments, is capable of writing code that can call itself to improve itself. We consider concerns around the development of self-improving technologies and evaluate the frequency with which the generated code bypasses a sandbox.