Together AI

Access leading open-source models including Llama, DeepSeek, Kimi, and GLM through a unified OpenAI-compatible API. Cost-effective inference for open models.

Together AI is a cloud inference platform specializing in open-source models. They host the most popular open models — Llama, DeepSeek, Kimi, GLM, Qwen, Mistral, and more — on optimized infrastructure, giving you cloud-quality inference without managing your own GPUs. The pricing is compelling: Llama 3.3 70B at $0.88/MTok (input and output), DeepSeek V3.1 at $0.49/$0.99, and even free models like Kimi K2.5. Together also offers DeepSeek R1 with thinking support at $3/$7 per MTok — expensive for open-source but still competitive for a reasoning model. The platform supports over 100 models behind an OpenAI-compatible API, making integration with OpenClaw straightforward.

What sets Together apart from running models locally is infrastructure. They run models on clusters of NVIDIA H100/H200 GPUs with optimized serving stacks, delivering inference speeds that are difficult to match on consumer hardware. For models like DeepSeek R1 (671B parameters) that can't run on a single machine, cloud inference is the only practical option.

Together also offers fine-tuning, dedicated endpoints for production workloads, and a Batch API for non-time-sensitive tasks. For OpenClaw users, it's the go-to option when you want open-source model quality without the hassle of local infrastructure.
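Because the API is OpenAI-compatible, calling it needs nothing beyond standard HTTP. A minimal sketch with the Python standard library, assuming Together's documented base URL (https://api.together.xyz/v1) and an API key in the TOGETHER_API_KEY environment variable:

```python
import json
import os
import urllib.request

# Together's OpenAI-compatible base URL (per their public docs).
TOGETHER_BASE_URL = "https://api.together.xyz/v1"

def build_chat_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build a chat-completions request against the OpenAI-compatible endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{TOGETHER_BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def chat(model: str, prompt: str) -> str:
    """Send the request and return the first completion's text."""
    req = build_chat_request(model, prompt, os.environ["TOGETHER_API_KEY"])
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Any client that speaks the OpenAI Chat Completions schema (including the official openai SDK pointed at this base URL) works the same way; only the model id and key differ.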

Tags: open-source, inference, openai-compatible, cost-effective

Use Cases

  • Budget-friendly OpenClaw agent using open-source models at cloud speed
  • Running large reasoning models (DeepSeek R1 671B) that can't fit on local hardware
  • High-throughput batch processing with open-source models at discounted rates
  • Zero-cost agent tasks using free model tier (Kimi K2.5)
  • Embeddings provider for OpenClaw memory and RAG features
  • Model evaluation — quickly test multiple open-source models without local setup
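For the embeddings use case, Together also exposes an OpenAI-compatible /embeddings endpoint. A hedged sketch, again using only the standard library; the model id (BAAI/bge-large-en-v1.5) is one Together has hosted, but check their current catalog before depending on it:

```python
import json
import os
import urllib.request

TOGETHER_BASE_URL = "https://api.together.xyz/v1"

def build_embedding_request(model: str, texts: list, api_key: str) -> urllib.request.Request:
    """Build a request for the OpenAI-compatible /embeddings endpoint."""
    payload = {"model": model, "input": texts}
    return urllib.request.Request(
        f"{TOGETHER_BASE_URL}/embeddings",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def embed(texts: list) -> list:
    """Return one embedding vector per input text."""
    # Model id is an assumption — verify against Together's model list.
    req = build_embedding_request("BAAI/bge-large-en-v1.5", texts,
                                  os.environ["TOGETHER_API_KEY"])
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)["data"]
    return [item["embedding"] for item in data]
```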

Tips

  • Llama 3.3 70B at $0.88/MTok is the best bang-for-buck for general OpenClaw tasks that don't need frontier intelligence.
  • Use DeepSeek V3.1 ($0.49/$0.99) as an ultra-budget daily driver — surprisingly capable for the price.
  • Free Kimi K2.5 is excellent for heartbeats and low-priority cron jobs where you want zero cost.
  • Together's embeddings models work well for OpenClaw memory features at very low cost.
  • Use the Batch API (discounted pricing) for bulk data processing, enrichment, or analysis tasks.
  • Check Together's /models endpoint for current model availability and pricing — it changes frequently.
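Since availability and pricing shift often, the /models check in the last tip is worth automating. A small sketch that fetches the catalog and filters it by keyword; the GET /v1/models endpoint is part of the OpenAI-compatible surface, though the exact response fields beyond the model id may vary:

```python
import json
import urllib.request

def fetch_models(api_key: str) -> list:
    """GET /v1/models and return the parsed model list."""
    req = urllib.request.Request(
        "https://api.together.xyz/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def find_models(models: list, keyword: str) -> list:
    """Filter a model list by a case-insensitive substring of the model id."""
    return [m["id"] for m in models if keyword.lower() in m["id"].lower()]
```

Usage would look like `find_models(fetch_models(key), "deepseek")` to see which DeepSeek variants are currently live before hardcoding one into your OpenClaw config.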

Known Issues & Gotchas

  • Open-source models are still less capable than frontier models (Claude Opus, GPT-5.4) for complex reasoning and nuanced instruction-following.
  • Free models may have availability issues during peak demand. Don't rely on them for critical workflows.
  • DeepSeek R1 at $3/$7 per MTok is expensive for an open-source model — comparable to Claude Sonnet. Consider if you really need reasoning vs cheaper alternatives.
  • Together exposes an OpenAI-compatible Chat Completions API, so features specific to the Anthropic Messages API won't work.
  • Model availability can change. Models may be deprecated or replaced without long notice periods.
  • Batch API is available but processing times can vary significantly. Not suitable for time-sensitive work.
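The free-tier availability and model-deprecation gotchas above suggest never pinning a single model. One way to cope is a preference-ordered fallback chain, sketched here as plain logic (the chain itself is just an example using models named in this doc):

```python
def pick_model(preferred: list, available: set) -> str:
    """Return the first model in the preference chain that is
    currently available; raise if none of them are."""
    for model in preferred:
        if model in available:
            return model
    raise RuntimeError("no configured model is available")

# Example chain: free tier first, then cheap paid fallbacks.
CHAIN = [
    "moonshotai/Kimi-K2.5",                      # free, may drop out at peak demand
    "deepseek-ai/DeepSeek-V3.1",                 # $0.49/$0.99 per MTok
    "meta-llama/Llama-3.3-70B-Instruct-Turbo",   # $0.88 per MTok
]
```

Feed `pick_model` the ids returned by the /models endpoint so a deprecated or overloaded model degrades to the next-cheapest option instead of failing the workflow.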

Alternatives

  • Ollama (Local)
  • Hugging Face Inference
  • OpenRouter
  • Anthropic/OpenAI (Direct)

Community Feedback

Together AI's pricing for Llama 3.3 70B is hard to beat — $0.88/MTok for a model that handles most tasks well. Cheaper than running your own GPU server.

— Reddit r/LocalLLaMA

Together's inference speed is genuinely impressive. DeepSeek V3 at 100+ tokens/sec throughput. You're not getting that on your home setup.

— Reddit r/MachineLearning

The free models on Together are surprisingly decent for simple tasks. Kimi K2.5 at $0 is a steal for OpenClaw cron jobs.

— Reddit r/selfhosted

Configuration Examples

Basic Together AI setup

providers:
  together:
    apiKey: your-together-api-key
    model: together/meta-llama/Llama-3.3-70B-Instruct-Turbo

Together with free model

providers:
  together:
    apiKey: your-together-api-key
    model: together/moonshotai/Kimi-K2.5
    # Free model — great for heartbeats and cron

Together as budget alternative

providers:
  anthropic:
    apiKey: sk-ant-xxxxx
    model: anthropic/claude-sonnet-4-6
  together:
    apiKey: your-together-api-key
    model: together/deepseek-ai/DeepSeek-V3.1
    # $0.49/MTok — use for bulk/simple tasks
    # Switch with: /model together/deepseek-ai/DeepSeek-V3.1