Cristina Cornelio, Judy Goldsmith, et al.
JAIR
Chain-of-thought (CoT) reasoning, a key capability of large language models, decomposes complex tasks into multiple intermediate steps. Recent studies have characterized CoT as a composition of in-context filtering and learning. This paper proposes a unified framework for CoT optimization that exploits this nested problem structure by formulating training as a multilevel optimization, in which each intermediate reasoning step constitutes a distinct optimization level. We develop an epigraph-based multilevel optimization (EMO) method that iteratively finds the optimal solution for this class of problems. Experiments with GPT-2 show that EMO achieves the lowest generalization error at every intermediate step compared with state-of-the-art baselines, highlighting the importance of nested optimization approaches for CoT reasoning.
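To illustrate the epigraph idea behind multilevel optimization, here is a minimal two-level toy sketch (not the paper's EMO method; the problem, functions, and tolerance are hypothetical illustrations): the lower-level problem is replaced by an epigraph-style constraint that the lower-level objective not exceed its value function, so both levels can be handled by a single constrained solver.

```python
import numpy as np
from scipy.optimize import minimize

# Toy bilevel problem (hypothetical illustration, not the paper's EMO):
#   upper level: min_{x,y} F(x, y) = (x - 1)^2 + (y - 1)^2
#   lower level: y*(x) = argmin_y f(x, y) = (y - x)^2, i.e. y*(x) = x
# With y forced to be lower-level optimal, the solution is x* = y* = 1.

def value_function(x):
    # v(x) = min_y f(x, y); solved numerically (here the minimum is 0 at y = x).
    res = minimize(lambda y: (y - x) ** 2, x0=[0.0])
    return res.fun

def solve_epigraph(x0=0.0, y0=0.0, tol=1e-6):
    # Epigraph-style reformulation: minimize F(x, y) subject to
    # f(x, y) <= v(x) + tol, which forces y toward lower-level optimality.
    def upper(z):
        x, y = z
        return (x - 1) ** 2 + (y - 1) ** 2

    cons = {"type": "ineq",
            "fun": lambda z: value_function(z[0]) + tol - (z[1] - z[0]) ** 2}
    res = minimize(upper, x0=[x0, y0], constraints=cons, method="SLSQP")
    return res.x

x_opt, y_opt = solve_epigraph()
```

The small slack `tol` keeps the feasible set non-degenerate; without it, the constraint `f(x, y) <= v(x)` holds only exactly at the lower-level optimum, which numerical solvers handle poorly.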
Pavel Klavík, A. Cristiano I. Malossi, et al.
Philos. Trans. R. Soc. A
Erik Altman, Jovan Blanusa, et al.
NeurIPS 2023
Conrad Albrecht, Jannik Schneider, et al.
CVPR 2025