Together AI

Access leading open-source models including Llama, DeepSeek, Kimi, and GLM through a unified OpenAI-compatible API. Cost-effective inference for open models.

Together AI is a cloud inference platform specializing in open-source models. They host the most popular open models — Llama, DeepSeek, Kimi, GLM, Qwen, Mistral, and more — on optimized infrastructure, giving you cloud-quality inference without managing your own GPUs. The pricing is compelling: Llama 3.3 70B at $0.88/MTok (input and output), DeepSeek V3.1 at $0.49/$0.99, and even free models like Kimi K2.5. Together also offers DeepSeek R1 with thinking support at $3/$7 per MTok — expensive for open-source but still competitive for a reasoning model. The platform supports over 100 models behind an OpenAI-compatible API, making integration with OpenClaw straightforward.

What sets Together apart from running models locally is infrastructure. They run models on clusters of NVIDIA H100/H200 GPUs with optimized serving stacks, delivering inference speeds that are difficult to match on consumer hardware. For models like DeepSeek R1 (671B parameters) that can't run on a single machine, cloud inference is the only practical option.

Together also offers fine-tuning, dedicated endpoints for production workloads, and a Batch API for non-time-sensitive tasks. For OpenClaw users, it's the go-to option when you want open-source model quality without the hassle of local infrastructure.
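Because the API is OpenAI-compatible, calling it needs nothing beyond standard HTTP. A minimal sketch with the Python standard library, assuming Together's documented base URL (https://api.together.xyz/v1) and an API key in the TOGETHER_API_KEY environment variable:

```python
import json
import os
import urllib.request

# Together's OpenAI-compatible base URL (per their public docs).
TOGETHER_BASE_URL = "https://api.together.xyz/v1"

def build_chat_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build a chat-completions request against the OpenAI-compatible endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{TOGETHER_BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def chat(model: str, prompt: str) -> str:
    """Send the request and return the first completion's text."""
    req = build_chat_request(model, prompt, os.environ["TOGETHER_API_KEY"])
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Any client that speaks the OpenAI Chat Completions schema (including the official openai SDK pointed at this base URL) works the same way; only the model id and key differ.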

Tags: open-source, inference, openai-compatible, cost-effective

Use Cases

  • Budget-friendly OpenClaw agent using open-source models at cloud speed
  • Running large reasoning models (DeepSeek R1 671B) that can't fit on local hardware
  • High-throughput batch processing with open-source models at discounted rates
  • Zero-cost agent tasks using free model tier (Kimi K2.5)
  • Embeddings provider for OpenClaw memory and RAG features
  • Model evaluation — quickly test multiple open-source models without local setup
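For the embeddings use case, Together also exposes an OpenAI-compatible /embeddings endpoint. A hedged sketch, again using only the standard library; the model id (BAAI/bge-large-en-v1.5) is one Together has hosted, but check their current catalog before depending on it:

```python
import json
import os
import urllib.request

TOGETHER_BASE_URL = "https://api.together.xyz/v1"

def build_embedding_request(model: str, texts: list, api_key: str) -> urllib.request.Request:
    """Build a request for the OpenAI-compatible /embeddings endpoint."""
    payload = {"model": model, "input": texts}
    return urllib.request.Request(
        f"{TOGETHER_BASE_URL}/embeddings",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def embed(texts: list) -> list:
    """Return one embedding vector per input text."""
    # Model id is an assumption — verify against Together's model list.
    req = build_embedding_request("BAAI/bge-large-en-v1.5", texts,
                                  os.environ["TOGETHER_API_KEY"])
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)["data"]
    return [item["embedding"] for item in data]
```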

Tips

  • Llama 3.3 70B at $0.88/MTok is the best bang-for-buck for general OpenClaw tasks that don't need frontier intelligence.
  • Use DeepSeek V3.1 ($0.49/$0.99) as an ultra-budget daily driver — surprisingly capable for the price.
  • Free Kimi K2.5 is excellent for heartbeats and low-priority cron jobs where you want zero cost.
  • Together's embeddings models work well for OpenClaw memory features at very low cost.
  • Use the Batch API (discounted pricing) for bulk data processing, enrichment, or analysis tasks.
  • Check Together's /models endpoint for current model availability and pricing — it changes frequently.
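Since availability and pricing shift often, the /models check in the last tip is worth automating. A small sketch that fetches the catalog and filters it by keyword; the GET /v1/models endpoint is part of the OpenAI-compatible surface, though the exact response fields beyond the model id may vary:

```python
import json
import urllib.request

def fetch_models(api_key: str) -> list:
    """GET /v1/models and return the parsed model list."""
    req = urllib.request.Request(
        "https://api.together.xyz/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def find_models(models: list, keyword: str) -> list:
    """Filter a model list by a case-insensitive substring of the model id."""
    return [m["id"] for m in models if keyword.lower() in m["id"].lower()]
```

Usage would look like `find_models(fetch_models(key), "deepseek")` to see which DeepSeek variants are currently live before hardcoding one into your OpenClaw config.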

Known Issues & Gotchas

  • Open-source models are still less capable than frontier models (Claude Opus, GPT-5.4) for complex reasoning and nuanced instruction-following.
  • Free models may have availability issues during peak demand. Don't rely on them for critical workflows.
  • DeepSeek R1 at $3/$7 per MTok is expensive for an open-source model — comparable to Claude Sonnet. Consider if you really need reasoning vs cheaper alternatives.
  • Together exposes an OpenAI-compatible Chat Completions API, so features specific to the Anthropic Messages API won't work.
  • Model availability can change. Models may be deprecated or replaced without long notice periods.
  • Batch API is available but processing times can vary significantly. Not suitable for time-sensitive work.
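The free-tier availability and model-deprecation gotchas above suggest never pinning a single model. One way to cope is a preference-ordered fallback chain, sketched here as plain logic (the chain itself is just an example using models named in this doc):

```python
def pick_model(preferred: list, available: set) -> str:
    """Return the first model in the preference chain that is
    currently available; raise if none of them are."""
    for model in preferred:
        if model in available:
            return model
    raise RuntimeError("no configured model is available")

# Example chain: free tier first, then cheap paid fallbacks.
CHAIN = [
    "moonshotai/Kimi-K2.5",                      # free, may drop out at peak demand
    "deepseek-ai/DeepSeek-V3.1",                 # $0.49/$0.99 per MTok
    "meta-llama/Llama-3.3-70B-Instruct-Turbo",   # $0.88 per MTok
]
```

Feed `pick_model` the ids returned by the /models endpoint so a deprecated or overloaded model degrades to the next-cheapest option instead of failing the workflow.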

Alternatives

  • Ollama (Local)
  • Hugging Face Inference
  • OpenRouter
  • Anthropic/OpenAI (Direct)

Community Feedback

Together AI's pricing for Llama 3.3 70B is hard to beat — $0.88/MTok for a model that handles most tasks well. Cheaper than running your own GPU server.

— Reddit r/LocalLLaMA

Together's inference speed is genuinely impressive. DeepSeek V3 at 100+ tokens/sec throughput. You're not getting that on your home setup.

— Reddit r/MachineLearning

The free models on Together are surprisingly decent for simple tasks. Kimi K2.5 at $0 is a steal for OpenClaw cron jobs.

— Reddit r/selfhosted

Configuration Examples

Basic Together AI setup

providers:
  together:
    apiKey: your-together-api-key
    model: together/meta-llama/Llama-3.3-70B-Instruct-Turbo

Together with free model

providers:
  together:
    apiKey: your-together-api-key
    model: together/moonshotai/Kimi-K2.5
    # Free model — great for heartbeats and cron

Together as budget alternative

providers:
  anthropic:
    apiKey: sk-ant-xxxxx
    model: anthropic/claude-sonnet-4-6
  together:
    apiKey: your-together-api-key
    model: together/deepseek-ai/DeepSeek-V3.1
    # $0.49/MTok — use for bulk/simple tasks
    # Switch with: /model together/deepseek-ai/DeepSeek-V3.1