Erik Altman, Jovan Blanusa, et al.
NeurIPS 2023
Lakehouse systems enable the same data to be queried with multiple execution engines. However, selecting the engine best suited to run a SQL query still requires a priori knowledge of the query’s computational requirements and each engine’s capabilities, a complex, manual task that only becomes more difficult as new engines and workloads emerge. In this paper, we address this limitation by proposing a cross-engine optimizer that automates engine selection for diverse SQL queries by means of a learned cost model. A query plan, optimized with hints, is used for query cost prediction and routing. Cost prediction is formulated as a multi-task learning problem, and the model architecture uses multiple predictor heads, each corresponding to a different engine and provisioning. This effectively eliminates the need to train engine-specific models and allows new engines to be added flexibly at a minimal fine-tuning cost.
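The multi-head idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the class name `MultiHeadCostModel`, the two-element plan feature vector, the engine names, and all weights are hypothetical, and a single fixed ReLU layer stands in for whatever shared encoder the authors actually train. The sketch only shows the shape of the approach: one shared representation of the query plan, one predictor head per engine/provisioning, and routing to the engine with the lowest predicted cost.

```python
from dataclasses import dataclass, field
from typing import Dict, List


def dot(a: List[float], b: List[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


@dataclass
class MultiHeadCostModel:
    """Toy cost model: a shared encoder plus one linear head per engine.

    All weights here are fixed by hand for illustration; a real model
    would learn them from observed query runtimes.
    """
    # One row per hidden unit; each row has one weight per plan feature.
    shared_weights: List[List[float]]
    # Maps an engine/provisioning name to its head's weights.
    heads: Dict[str, List[float]] = field(default_factory=dict)

    def encode(self, plan_features: List[float]) -> List[float]:
        # Shared representation of the query plan (linear layer + ReLU).
        return [max(0.0, dot(row, plan_features)) for row in self.shared_weights]

    def add_head(self, engine: str, weights: List[float]) -> None:
        # Supporting a new engine only adds a head; the encoder is reused.
        self.heads[engine] = weights

    def predict(self, plan_features: List[float]) -> Dict[str, float]:
        # One predicted cost per engine, all computed from the same encoding.
        hidden = self.encode(plan_features)
        return {engine: dot(w, hidden) for engine, w in self.heads.items()}

    def route(self, plan_features: List[float]) -> str:
        # Send the query to the engine with the lowest predicted cost.
        costs = self.predict(plan_features)
        return min(costs, key=costs.get)


# Hypothetical setup: two plan features and two engines.
model = MultiHeadCostModel(shared_weights=[[1.0, 0.0], [0.0, 1.0]])
model.add_head("engine_a", [1.0, 3.0])
model.add_head("engine_b", [2.0, 1.0])

chosen = model.route([1.0, 2.0])
```

In this toy setup, adding a third engine is a single `add_head` call that leaves the shared encoder untouched, which mirrors the abstract's point that new engines can be added without training a separate engine-specific model.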
Conrad Albrecht, Jannik Schneider, et al.
CVPR 2025
Yidi Wu, Thomas Bohnstingl, et al.
ICML 2025
Gosia Lazuka, Andreea Simona Anghel, et al.
SC 2024