Z.AI (GLM Models)

Z.AI (Zhipu AI) is the official API platform for the GLM model family. It offers GLM-5, GLM-4.7, and GLM-4.6 with tool_stream for streaming tool calls, plus vision models, image/video generation, and free Flash tiers. Authentication uses a bearer token (API key).

Z.AI (formerly Zhipu AI / BigModel) is one of China's leading AI companies and the creator of the GLM (General Language Model) family. The platform provides a comprehensive API through both z.ai (international) and open.bigmodel.cn (China), offering text models, vision models, image generation (GLM-Image, CogView-4), video generation (CogVideoX-3, Vidu), speech recognition (GLM-ASR), and specialized agents.

The GLM lineup is extensive. GLM-5 is the flagship model with strong reasoning and tool use capabilities, scoring 50 on Artificial Analysis benchmarks (tied with MiniMax M2.7 for second among Chinese models). GLM-5-Turbo offers enhanced performance at slightly higher cost, while GLM-5-Code is specifically optimized for coding tasks. The GLM-4.7 series provides a strong mid-tier option, and GLM-4.7 FlashX delivers impressive quality at just $0.07/$0.40 per MTok (input/output).

A standout feature for OpenClaw integration is Z.AI's tool_stream capability: streaming tool calls that enable real-time feedback during function execution. Not every provider offers this. The platform also offers free Flash tiers (GLM-4.7-Flash and GLM-4.5-Flash) with unlimited free usage, making it excellent for experimentation and low-cost agent operations.

Z.AI's pricing is competitive, with cached input priced at an 80% discount (e.g., GLM-5: $0.20 cached vs $1.00 standard input per MTok), which makes repeated-context scenarios significantly cheaper. The platform also includes built-in web search ($0.01 per use) that models can invoke during inference.

For OpenClaw users, Z.AI is particularly valuable as a cost-effective provider with genuine free tiers. GLM-4.7-Flash provides competent free inference for heartbeats, cron jobs, and light tasks. For production workloads, GLM-5 at $1.00/$3.20 per MTok offers strong reasoning at well below frontier pricing. Zhipu AI raised prices 30% in early 2026, reflecting the models' growing competitive quality.

Tags: glm, zhipu-ai, chinese-ai, tool-streaming, vision, image-generation, video-generation, free-tier, coding

Use Cases

  • Free-tier agent operations with GLM-4.7-Flash for heartbeats and lightweight tasks
  • Cost-effective reasoning and tool use with GLM-5 at $1.00/$3.20 per MTok
  • Coding tasks with the specialized GLM-5-Code model
  • Vision processing with GLM-4.6V at budget pricing
  • Real-time tool streaming for responsive agent workflows
  • Budget-conscious OpenClaw setups using FlashX ($0.07 input) for daily operations
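
The use cases above amount to a task-to-model mapping. A minimal sketch of that routing in Python follows. It is an illustration, not part of OpenClaw: the task names and the pick_model helper are hypothetical, and the identifiers "zai/glm-5-code" and "zai/glm-4.7-flashx" are assumed spellings modeled on the "zai/<model>" pattern in the configuration examples.

```python
# Hypothetical task-to-model table. "zai/glm-5", "zai/glm-4.7-flash" and
# "zai/glm-4.6v" appear in the configuration examples; the other two
# identifiers are assumed spellings and should be checked against the provider.
MODEL_BY_TASK = {
    "heartbeat": "zai/glm-4.7-flash",   # free tier, lightweight tasks
    "cron": "zai/glm-4.7-flash",
    "daily": "zai/glm-4.7-flashx",      # assumed ID for GLM-4.7 FlashX
    "coding": "zai/glm-5-code",         # assumed ID for GLM-5-Code
    "vision": "zai/glm-4.6v",
    "reasoning": "zai/glm-5",
}

DEFAULT_MODEL = "zai/glm-5"


def pick_model(task: str) -> str:
    """Return the model ID for a task label, falling back to the flagship."""
    return MODEL_BY_TASK.get(task, DEFAULT_MODEL)


print(pick_model("heartbeat"))
print(pick_model("something-else"))
```

Unknown task labels fall back to GLM-5 here; a budget-focused setup might prefer to fall back to a Flash model instead.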

Tips

  • Use GLM-4.7-Flash (free) for OpenClaw heartbeats, cron jobs, and basic tasks. Zero cost.
  • GLM-4.7 FlashX at $0.07/$0.40 is excellent value: a near-free price with much better reasoning than the free Flash tier.
  • Leverage cached input pricing (80% discount) for repeated prompts. This works well for agent loops with stable system prompts.
  • GLM-5-Code is purpose-built for coding. Use it for code generation tasks instead of generic GLM-5.
  • GLM-4.6V for vision tasks at $0.30/$0.90 is significantly cheaper than GPT-4o or Claude vision.
  • Built-in web search ($0.01/use) lets GLM models fetch real-time information during inference.
  • Also available through Synthetic and other aggregators if you prefer a unified API key.
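
To see what the cached input discount means in practice, the arithmetic can be sketched in Python. The prices are the GLM-5 figures quoted on this page ($1.00 input, $0.20 cached input, $3.20 output per MTok) and may be out of date; the token counts and the estimate_cost helper are hypothetical.

```python
# USD per million tokens (MTok), as quoted on this page for GLM-5.
# These are assumptions for illustration; check current pricing before use.
GLM5_INPUT = 1.00
GLM5_CACHED_INPUT = 0.20
GLM5_OUTPUT = 3.20

TOKENS_PER_MTOK = 1_000_000


def estimate_cost(fresh_input_tokens: int, cached_input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from its token counts."""
    total = (
        fresh_input_tokens * GLM5_INPUT
        + cached_input_tokens * GLM5_CACHED_INPUT
        + output_tokens * GLM5_OUTPUT
    )
    return total / TOKENS_PER_MTOK


# Hypothetical agent turn: 8,000-token stable system prompt,
# 500 new input tokens, 300 output tokens.
without_cache = estimate_cost(8_500, 0, 300)
with_cache = estimate_cost(500, 8_000, 300)
print(f"no cache: ${without_cache:.5f}  cached: ${with_cache:.5f}")
```

With these example numbers the cached turn costs roughly a third of the uncached one, because the large system prompt dominates the input.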

Known Issues & Gotchas

  • Two platforms: z.ai (international) and open.bigmodel.cn (China). API keys may not be interchangeable between them.
  • Pricing increased 30% in early 2026 — check current rates at docs.z.ai/guides/overview/pricing.
  • Free Flash tiers (GLM-4.7-Flash, GLM-4.5-Flash) have lower quality than paid models. Good for basic tasks, not complex reasoning.
  • Vision is only available on GLM-4.6V and GLM-4.6V-Flash. The flagship GLM-5 does not support vision.
  • Cached input storage is 'limited-time free' and may become paid in the future. Build with that expectation.
  • tool_stream is specific to Z.AI's API and is not a standard OpenAI API feature. OpenClaw handles the translation.
  • GLM models are primarily trained on Chinese data. English quality is good but may lag behind Western-focused models.
  • Video generation (CogVideoX, Vidu) and image generation are separate APIs — not usable through OpenClaw's chat interface.
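
For readers handling tool streaming outside OpenClaw, the core client-side task is merging argument fragments into complete tool calls. The sketch below assumes the fragments arrive in an OpenAI-like delta shape (an index, an optional id, and a function object with partial name and arguments strings). That shape is an assumption, not confirmed by this page, so check Z.AI's documentation for the actual wire format. The accumulate_tool_calls helper and the sample data are hypothetical.

```python
import json


def accumulate_tool_calls(deltas):
    """Merge streamed tool-call fragments into complete calls, keyed by index."""
    calls = {}
    for delta in deltas:
        for frag in delta.get("tool_calls") or []:
            call = calls.setdefault(
                frag["index"], {"id": None, "name": "", "arguments": ""}
            )
            if frag.get("id"):
                call["id"] = frag["id"]
            fn = frag.get("function") or {}
            call["name"] += fn.get("name") or ""
            call["arguments"] += fn.get("arguments") or ""
    # The argument string is only valid JSON once every fragment has arrived.
    return [
        {"id": c["id"], "name": c["name"], "arguments": json.loads(c["arguments"] or "{}")}
        for _, c in sorted(calls.items())
    ]


# Hypothetical fragments, as they might arrive across three stream chunks.
sample_deltas = [
    {"tool_calls": [{"index": 0, "id": "call_1",
                     "function": {"name": "get_weather", "arguments": '{"ci'}}]},
    {"tool_calls": [{"index": 0, "function": {"arguments": 'ty": "Bei'}}]},
    {"tool_calls": [{"index": 0, "function": {"arguments": 'jing"}'}}]},
]
result = accumulate_tool_calls(sample_deltas)
print(result)
```

A real-time UI would render the partial argument string as it grows and parse it only after the stream ends, which is why the JSON decoding happens in the final step.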

Alternatives

  • DeepSeek
  • Qwen (via DashScope)
  • Moonshot (Kimi)
  • Synthetic

Community Feedback

GLM-5 is genuinely competitive. Scored 50 on Artificial Analysis — same tier as MiniMax M2.7. The tool_stream feature is great for real-time agent workflows.

— Reddit r/LocalLLaMA

Zhipu's free Flash tiers are generous. GLM-4.7-Flash for free is hard to beat for basic tasks. Just don't expect frontier-level quality.

— Hacker News

The 30% price hike on GLM-5 in early 2026 was notable. They went from being the cheapest Chinese option to mid-range. Quality justified it, but it stung.

— Reddit r/MachineLearning

Configuration Examples

Basic Z.AI setup with GLM-5

providers:
  zai:
    apiKey: your-zai-api-key
    model: zai/glm-5

Free tier with GLM-4.7-Flash

providers:
  zai:
    apiKey: your-zai-api-key
    model: zai/glm-4.7-flash
    # Completely free — great for heartbeats and cron

Z.AI with vision model

providers:
  zai:
    apiKey: your-zai-api-key
    model: zai/glm-4.6v
    # Vision support at $0.30/$0.90 per MTok