Gradient Transformer: Learning to Generate Updates for LLMs

arXiv:2605.27591v1

Abstract: Many organizations lack computational resources to fine-tune large language models (LLMs) on private (unshareable) data for better utility, while fine-tuning tiny language models (TinyLMs) alone performs poorly. To address this bottleneck, we propose a data-free knowledge distillation framework that generates LLM update vectors based on TinyLMs fine-tuned on private data. An update vector is a vector of parameter changes from an initial model to i
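The abstract is cut off mid-definition in this excerpt, so the following is only a minimal sketch of what "a vector of parameter changes" could look like in practice. It assumes, without confirmation from the source, that the update vector is the element-wise difference between a model's parameters after fine-tuning and its initial parameters. The names `Params`, `flatten`, and `update_vector` are hypothetical and are not taken from the paper, and plain Python lists stand in for real model weights.

```python
from typing import Dict, List

# Hypothetical stand-in for a model checkpoint: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def flatten(params: Params) -> List[float]:
    """Concatenate all parameters into one flat list, in sorted name order."""
    flat: List[float] = []
    for name in sorted(params):
        flat.extend(params[name])
    return flat


def update_vector(initial: Params, tuned: Params) -> List[float]:
    """Return tuned - initial as one flat vector.

    Assumption (not confirmed by the truncated abstract): the update vector
    is the element-wise parameter difference after fine-tuning.
    """
    if initial.keys() != tuned.keys():
        raise ValueError("models must have the same parameter names")
    for name in initial:
        if len(initial[name]) != len(tuned[name]):
            raise ValueError(f"shape mismatch for parameter {name!r}")
    before = flatten(initial)
    after = flatten(tuned)
    return [a - b for a, b in zip(after, before)]


# Toy example: only the "w" parameter changes during fine-tuning.
initial_model = {"w": [1.0, 2.0], "b": [0.5]}
tuned_model = {"w": [1.5, 1.0], "b": [0.5]}
delta = update_vector(initial_model, tuned_model)
print(delta)
```

Under this reading, adding `delta` back to the flattened initial parameters recovers the fine-tuned model. How the proposed framework turns TinyLM update vectors into LLM update vectors is not described in the visible part of the abstract and is not sketched here.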
