Fine-tuning on multiple datasets? Static mixing with pre-determined percentages often leads to overfitting and demands extensive ablations to find the right mix. Dynamic data mixing addresses this using signals/rewards such as training loss. While this has been studied in research (aclanthology.org/2024.emnlp-main.787), full-fledged tooling is limited. In this session, we present a PyTorch-native (built on DataLoader and IterableDataset), online, reward-based data mixing framework that is: (a) composable with existing training loops with minimal code changes, (b) plug-and-play with user-defined mixing strategies and rewards, and (c) compatible with distributed training. We demonstrate its flexibility through five reward-driven data mixing recipes and its scalability via a large-scale multi-GPU experiment, with insights on mixing. We believe our session will motivate PyTorch developers to adopt our framework for use cases involving multiple fine-tuning datasets. The code is available at github.com/foundation-model-stack/fms-hf-tuning/tree/online-dyn-reward-data-mixing.
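To make the idea concrete, here is a minimal sketch of how online, reward-based mixing can be expressed with PyTorch's IterableDataset and DataLoader. The class and method names (RewardMixedDataset, update_rewards) and the softmax-over-rewards strategy are illustrative assumptions for this sketch, not the framework's actual API.

```python
# Sketch: interleave several source datasets, sampling each according to
# mixing weights that are updated online from per-dataset rewards.
# Names and the reward-to-weight rule are assumptions for illustration.
from typing import Iterator, Sequence

import torch
from torch.utils.data import DataLoader, IterableDataset


class RewardMixedDataset(IterableDataset):
    """Samples from multiple sources with dynamically updated mixing weights."""

    def __init__(self, sources: Sequence[IterableDataset], temperature: float = 1.0):
        self.sources = list(sources)
        self.temperature = temperature
        # Start from a uniform mix; weights change as rewards arrive.
        self.weights = torch.ones(len(self.sources)) / len(self.sources)

    def update_rewards(self, rewards: Sequence[float]) -> None:
        # Example strategy (assumed): softmax over per-source rewards,
        # e.g. recent training loss, so higher-reward sources are sampled more.
        r = torch.tensor(rewards, dtype=torch.float32)
        self.weights = torch.softmax(r / self.temperature, dim=0)

    def __iter__(self) -> Iterator:
        iterators = [iter(s) for s in self.sources]
        while True:  # infinite stream; the training loop decides when to stop
            idx = int(torch.multinomial(self.weights, 1).item())
            try:
                yield next(iterators[idx])
            except StopIteration:
                # Restart an exhausted source so mixing can continue.
                iterators[idx] = iter(self.sources[idx])
                yield next(iterators[idx])


# Usage sketch: wrap the mixed dataset in a standard DataLoader and
# periodically push per-dataset reward signals from the training loop.
# mixed = RewardMixedDataset([ds_a, ds_b, ds_c])
# loader = DataLoader(mixed, batch_size=8)
# ... inside the loop: mixed.update_rewards([loss_a, loss_b, loss_c])
```

Because the wrapper is itself an IterableDataset, it drops into an existing DataLoader-based loop with minimal code changes; the reward-to-weight rule above is just one of many user-defined strategies such a framework could accept.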