Over the past week I’ve gone deep on something that’s shifting how I think about AI in production: you don’t need a frontier-scale model to get frontier-level results — you need the right model trained on the right knowledge.
Here’s what I’ve been building
I’m developing a self-hosted algorithmic trading platform (Secant) and I needed an AI brain that genuinely understands trading — circuit breakers, regulatory compliance, risk management, market sentiment, MCP tool orchestration. Not a general assistant. A specialist.
So instead of paying per-call API costs for GPT or Claude, I took LFM 2.5 8B, Liquid AI's open-weight mixture-of-experts model, and fine-tuned 8 separate LoRA adapters, each a domain expert:
- Sentiment analysis
- Risk management
- Safety & regulatory compliance (SAR filings, OFAC screening, adversarial prompt detection)
- Strategy development
- Platform debugging
- Customer service
- Broker operations
- MCP tool control
Each adapter adds ~1.7M trainable parameters on top of the 8B base, roughly 0.02% of the model. Tiny. Fast. Self-hosted. No ongoing API cost.
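To put that size in perspective, here is a rough sketch of the arithmetic. LoRA trains two low-rank matrices per adapted weight, so a layer with shape d_out x d_in at rank r adds r * (d_in + d_out) parameters. The rank and layer shape below are illustrative assumptions, not my actual adapter config; only the ~1.7M and 8B figures come from the numbers above.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    # LoRA factors the update as B @ A, with A of shape (rank, d_in)
    # and B of shape (d_out, rank).
    return rank * d_in + d_out * rank


def trainable_fraction(adapter_params: float, base_params: float) -> float:
    """Share of the full model that actually receives gradient updates."""
    return adapter_params / base_params


# Hypothetical example: a rank-8 adapter on a single 2048 x 2048 projection.
per_matrix = lora_params(d_in=2048, d_out=2048, rank=8)

# Figures from the post: ~1.7M trainable parameters on an 8B base.
fraction = trainable_fraction(1.7e6, 8e9)

print(f"{per_matrix:,} parameters per adapted matrix (assumed shape and rank)")
print(f"about {fraction * 100:.2f}% of the base model is trainable")
```

The point of the exercise: the adapter count grows with rank and with the number of layers you target, so the same base can carry many specialists at a small storage cost each.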
What I’m learning
Training your own model forces you to understand things that using an API hides from you: tokenization, loss curves, learning rate schedules, checkpoint recovery, GPU memory tradeoffs. This week I debugged PyTorch/TRL version incompatibilities, learned why BF16 matters for MoE architectures, and rented H100s on Vast.ai to run 4 training jobs in parallel.
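One of those GPU memory tradeoffs is easy to see with back-of-the-envelope math. This is a minimal sketch that counts only the memory needed to hold the base weights at a given precision. It deliberately ignores activations, optimizer state, and framework overhead, so treat the numbers as a floor rather than a real budget.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2}


def weight_memory_gb(n_params: float, dtype: str) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


# 8B parameters, matching the base model size mentioned in the post.
for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gb(8e9, dtype):.0f} GB")
# prints:
# fp32: 32 GB
# bf16: 16 GB
```

Halving the bytes per weight is what leaves headroom on a single card for everything else a training run needs.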
I’m also using Claude Code as my AI pair programmer throughout — not just for code, but for architecture decisions, debugging sessions, and reasoning through tradeoffs in real time. It’s a fundamentally different development experience.
The combination of open-weight models + curated domain datasets + efficient fine-tuning is genuinely democratizing what a solo developer can build. A week ago this felt like research. Today I’m watching 8 specialist models train in parallel at 4.6 seconds per step on H100s I’m renting for $1.60/hr.
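Those two numbers are enough for a quick cost estimate. This is a sketch only: the 4.6 s/step and $1.60/hr figures are from above, while the 1,000-step run length is a hypothetical placeholder, and the math assumes one job per GPU with no idle or setup time billed.

```python
SECONDS_PER_HOUR = 3600


def training_cost_usd(steps: int, seconds_per_step: float, usd_per_hour: float) -> float:
    """Rental cost of one training job on one GPU, ignoring setup and idle time."""
    gpu_hours = steps * seconds_per_step / SECONDS_PER_HOUR
    return gpu_hours * usd_per_hour


# Hypothetical run length; the post does not state a step count.
ASSUMED_STEPS = 1000

cost = training_cost_usd(ASSUMED_STEPS, seconds_per_step=4.6, usd_per_hour=1.60)
print(f"{ASSUMED_STEPS} steps: about ${cost:.2f} per adapter")
```

Swap in your own step count and hourly rate to see how the estimate moves.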
The era of “one big model for everything” is giving way to something more interesting: fleets of small, precise models that know their lane deeply.