TECHNOLOGIES

Fine-Tuning

Continuing a model's training on your own examples to change how it behaves, which is the right tool for style and format and the wrong tool for injecting fresh facts.

Last reviewed: 2026-06-02 byKevin Riedl wiki β†—

Fine-tuning takes a pretrained model and trains it further on a curated set of your own examples, adjusting the model’s weights so it leans toward the behavior you demonstrated. You use it to bake in a consistent tone, a specific output format, a domain vocabulary, or a task the base model handles awkwardly. It changes how the model responds, not what facts it has access to.

The decision that actually matters is fine-tuning versus RAG, and people get it backwards constantly. RAG injects fresh, changing facts at runtime by retrieving from your data, so the answer is only as current as your last document update. Fine-tuning teaches durable behavior but freezes the knowledge at training time, so it is the wrong tool when your facts change. The rule of thumb: tune for form, retrieve for facts. Many real systems use both, a fine-tuned model that answers in your house style, grounded by retrieval for the live data.

The cost reality is less scary than it used to be but still real. You need a clean, labeled dataset (usually hundreds to thousands of examples), the training run itself, and the ongoing cost of re-tuning every time the base model improves or your requirements shift. That last cost is the one teams forget. A fine-tune is not a one-time project, it is a maintenance commitment.

When does fine-tuning win outright? When prompting plus retrieval cannot reliably produce the format or behavior you need, and you have enough high-quality examples to teach it. When in doubt, exhaust prompting and RAG first, because they are cheaper to change. We make this build-versus-tune call under Artificial Intelligence.

// FAQ

FAQs

FAQs

Tune for form, retrieve for facts. Fine-tuning changes how the model behaves (tone, format, task). RAG injects current facts at runtime. If your data changes often, you want RAG, not a fine-tune that freezes knowledge.
More than the training run suggests. Budget for building a clean labeled dataset, the training itself, and the recurring cost of re-tuning whenever the base model or your requirements change. The maintenance is the part teams underestimate.
When you need fresh or frequently changing facts in the answers. Fine-tuning bakes knowledge in at training time. For live data, use retrieval. Also wrong if prompting and RAG already get you there cheaper.