Together AI
Access leading open-source models including Llama, DeepSeek, Kimi, and GLM through a unified OpenAI-compatible API. Cost-effective inference for open models.
Together AI is a cloud inference platform specializing in open-source models. They host the most popular open models — Llama, DeepSeek, Kimi, GLM, Qwen, Mistral, and more — on optimized infrastructure, giving you cloud-quality inference for open-source models without managing your own GPUs.
The pricing is compelling: Llama 3.3 70B at $0.88/MTok (input and output), DeepSeek V3.1 at $0.49/$0.99, and even free models like Kimi K2.5. Together also offers DeepSeek R1 with thinking support at $3/$7 per MTok — expensive for open-source but still competitive for a reasoning model. The platform supports over 100 models with an OpenAI-compatible API, making integration with OpenClaw straightforward.
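These per-MTok rates translate directly into per-request costs. A minimal estimator, using the prices quoted above (check Together's pricing page for current rates):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Cost in dollars, given per-million-token (MTok) prices."""
    return (input_tokens / 1e6) * input_price + (output_tokens / 1e6) * output_price

# Llama 3.3 70B: $0.88/MTok for both input and output
llama = request_cost(50_000, 10_000, 0.88, 0.88)
# DeepSeek V3.1: $0.49 input / $0.99 output
deepseek = request_cost(50_000, 10_000, 0.49, 0.99)
print(f"Llama 3.3 70B: ${llama:.4f}, DeepSeek V3.1: ${deepseek:.4f}")
# → Llama 3.3 70B: $0.0528, DeepSeek V3.1: $0.0344
```

Even a heavy 50K-in/10K-out request costs only a few cents on either model, which is what makes these models viable as always-on agent backends.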
What sets Together apart from running models locally is infrastructure. They run models on clusters of NVIDIA H100/H200 GPUs with optimized serving stacks, delivering inference speeds that are difficult to match on consumer hardware. For models like DeepSeek R1 (671B parameters) that can't run on a single machine, cloud inference is the only practical option.
Together also offers fine-tuning, dedicated endpoints for production workloads, and a Batch API for non-time-sensitive tasks. For OpenClaw users, it's the go-to option when you want open-source model quality without the hassle of local infrastructure.
Tags: open-source, inference, openai-compatible, cost-effective
Use Cases
- Budget-friendly OpenClaw agent using open-source models at cloud speed
- Running large reasoning models (DeepSeek R1 671B) that can't fit on local hardware
- High-throughput batch processing with open-source models at discounted rates
- Zero-cost agent tasks using free model tier (Kimi K2.5)
- Embeddings provider for OpenClaw memory and RAG features
- Model evaluation — quickly test multiple open-source models without local setup
Tips
- Llama 3.3 70B at $0.88/MTok is the best bang-for-buck for general OpenClaw tasks that don't need frontier intelligence.
- Use DeepSeek V3.1 ($0.49/$0.99) as an ultra-budget daily driver — surprisingly capable for the price.
- Free Kimi K2.5 is excellent for heartbeats and low-priority cron jobs where you want zero cost.
- Together's embeddings models work well for OpenClaw memory features at very low cost.
- Use the Batch API (discounted pricing) for bulk data processing, enrichment, or analysis tasks.
- Check Together's /models endpoint for current model availability and pricing — it changes frequently.
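The model-availability check in the last tip is a single authenticated GET. A sketch, assuming Together's standard OpenAI-style base URL (`https://api.together.xyz/v1`) and using only the standard library:

```python
import urllib.request

# Endpoint path assumed from Together's OpenAI-compatible API surface.
MODELS_URL = "https://api.together.xyz/v1/models"

def build_models_request(api_key: str) -> urllib.request.Request:
    """Build a GET request for Together's model catalog."""
    return urllib.request.Request(
        MODELS_URL,
        headers={"Authorization": f"Bearer {api_key}"},
    )

req = build_models_request("your-together-api-key")
# models = json.load(urllib.request.urlopen(req))  # uncomment with a real key
```

Polling this endpoint periodically is a cheap way to catch deprecations before a hardcoded model name starts failing.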
Known Issues & Gotchas
- Open-source models are still less capable than frontier models (Claude Opus, GPT-5.4) for complex reasoning and nuanced instruction-following.
- Free models may have availability issues during peak demand. Don't rely on them for critical workflows.
- DeepSeek R1 at $3/$7 per MTok is expensive for an open-source model — comparable to Claude Sonnet pricing. Consider whether you actually need extended reasoning before paying that premium over cheaper alternatives.
- Together exposes the OpenAI Chat Completions API — features specific to the Anthropic Messages API won't work.
- Model availability can change. Models may be deprecated or replaced without long notice periods.
- Batch API is available but processing times can vary significantly. Not suitable for time-sensitive work.
Alternatives
- Ollama (Local)
- Hugging Face Inference
- OpenRouter
- Anthropic/OpenAI (Direct)
Community Feedback
Together AI's pricing for Llama 3.3 70B is hard to beat — $0.88/MTok for a model that handles most tasks well. Cheaper than running your own GPU server.
— Reddit r/LocalLLaMA
Together's inference speed is genuinely impressive. DeepSeek V3 at 100+ tokens/sec throughput. You're not getting that on your home setup.
— Reddit r/MachineLearning
The free models on Together are surprisingly decent for simple tasks. Kimi K2.5 at $0 is a steal for OpenClaw cron jobs.
— Reddit r/selfhosted
Configuration Examples
Basic Together AI setup
providers:
  together:
    apiKey: your-together-api-key
    model: together/meta-llama/Llama-3.3-70B-Instruct-Turbo
Together with free model
providers:
  together:
    apiKey: your-together-api-key
    model: together/moonshotai/Kimi-K2.5
    # Free model — great for heartbeats and cron
Together as budget alternative
providers:
  anthropic:
    apiKey: sk-ant-xxxxx
    model: anthropic/claude-sonnet-4-6
  together:
    apiKey: your-together-api-key
    model: together/deepseek-ai/DeepSeek-V3.1
    # $0.49/MTok input — use for bulk/simple tasks
# Switch with: /model together/deepseek-ai/DeepSeek-V3.1
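Because the API is OpenAI-compatible, the same setup translates to direct HTTP calls without any vendor SDK. A minimal request-builder sketch, assuming the standard base URL (`https://api.together.xyz/v1`) and the OpenAI chat-completions payload shape:

```python
import json
import urllib.request

TOGETHER_BASE_URL = "https://api.together.xyz/v1"  # assumed standard endpoint

def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion POST for Together."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{TOGETHER_BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = build_chat_request("your-together-api-key",
                         "meta-llama/Llama-3.3-70B-Instruct-Turbo",
                         "Hello")
# resp = json.load(urllib.request.urlopen(req))      # uncomment with a real key
# print(resp["choices"][0]["message"]["content"])
```

Any OpenAI-compatible client library should work the same way by pointing its base URL at Together's endpoint.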