Fine-tuning language models
Adapt open-weight models to your domain, data and terms of use. LoRA / QLoRA for fast iteration, full fine-tunes when warranted, preference tuning (DPO) where the data supports it.
- Domain-specific instruction tuning
- Evaluation harness with your tasks
- Self-hosted or managed inference