Z.AI (GLM Models)
Z.AI (Zhipu AI) is the official API platform for the GLM model family. It offers GLM-5, GLM-4.7, and GLM-4.6 with tool_stream for streaming tool calls, plus vision models, image/video generation, and free Flash tiers. Authentication uses a Bearer token carrying your API key.
Tags: glm, zhipu-ai, chinese-ai, tool-streaming, vision, image-generation, video-generation, free-tier, coding
Use Cases
- Free-tier agent operations with GLM-4.7-Flash for heartbeats and lightweight tasks
- Cost-effective reasoning and tool use with GLM-5 at $1.00/$3.20 per MTok (input/output)
- Coding tasks with the specialized GLM-5-Code model
- Vision processing with GLM-4.6V at budget pricing
- Real-time tool streaming for responsive agent workflows
- Budget-conscious OpenClaw setups using GLM-4.7 FlashX ($0.07 per MTok input) for daily operations
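The use cases above amount to a routing table from task type to model. A minimal sketch, assuming the "zai/<model>" id form shown in this entry's configuration examples; the ids for GLM-5-Code and GLM-4.7 FlashX are guesses extrapolated from that pattern, and `pick_model` is a hypothetical helper, not part of OpenClaw or the Z.AI API:

```python
# Task-to-model routing based on the use cases listed in this entry.
# Ids marked "assumed" do not appear in the entry's config examples.
MODEL_BY_TASK = {
    "heartbeat": "zai/glm-4.7-flash",   # free tier
    "daily": "zai/glm-4.7-flashx",      # assumed id for GLM-4.7 FlashX
    "reasoning": "zai/glm-5",
    "coding": "zai/glm-5-code",         # assumed id for GLM-5-Code
    "vision": "zai/glm-4.6v",
}

VISION_MODEL = "zai/glm-4.6v"
DEFAULT_MODEL = "zai/glm-4.7-flash"


def pick_model(task: str, has_images: bool = False) -> str:
    """Return a model id for the task, falling back to the free Flash tier."""
    # Per this entry, GLM-5 does not accept images, so any request
    # carrying images is routed to the vision model regardless of task.
    if has_images:
        return VISION_MODEL
    return MODEL_BY_TASK.get(task, DEFAULT_MODEL)
```

Unknown task names fall back to the free tier, so a typo costs nothing rather than silently hitting a paid model.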
Tips
- Use GLM-4.7-Flash (free) for OpenClaw heartbeats, cron jobs, and basic tasks. Zero cost.
- GLM-4.7 FlashX at $0.07/$0.40 per MTok is excellent value: near-free pricing with much better reasoning than the free Flash tier.
- Leverage cached input pricing (80% discount) for repeated prompts — great for agent loops with stable system prompts.
- GLM-5-Code is purpose-built for coding. Use it for code generation tasks instead of generic GLM-5.
- GLM-4.6V handles vision tasks at $0.30/$0.90 per MTok, significantly cheaper than GPT-4o or Claude vision.
- Built-in web search ($0.01/use) lets GLM models fetch real-time information during inference.
- Also available through Synthetic and other aggregators if you prefer a unified API key.
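To see what the cached input discount means for an agent loop, the per-request cost can be estimated from the prices quoted in this entry. This is a rough sketch, not an official calculator: it assumes "80% discount" means cached input tokens are billed at 20% of the normal input rate, the model keys are illustrative labels, and the prices should be checked against the current rate card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    input_per_mtok: float   # USD per million input tokens
    output_per_mtok: float  # USD per million output tokens


# Prices as quoted in this entry; keys are illustrative labels.
PRICES = {
    "glm-5": Price(1.00, 3.20),
    "glm-4.7-flashx": Price(0.07, 0.40),
    "glm-4.6v": Price(0.30, 0.90),
    "glm-4.7-flash": Price(0.0, 0.0),  # free tier
}

# Assumption: cached input is billed at (1 - 0.80) of the input rate.
CACHE_DISCOUNT = 0.80


def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    cached_tokens is the part of input_tokens served from the prompt cache.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    price = PRICES[model]
    fresh_tokens = input_tokens - cached_tokens
    cached_rate = price.input_per_mtok * (1 - CACHE_DISCOUNT)
    total = (fresh_tokens * price.input_per_mtok
             + cached_tokens * cached_rate
             + output_tokens * price.output_per_mtok)
    return total / 1_000_000


if __name__ == "__main__":
    # A GLM-5 call with a 4,000-token cached system prompt, 500 fresh
    # input tokens, and 300 output tokens, with and without caching.
    print(estimate_cost("glm-5", 4500, 300, cached_tokens=4000))
    print(estimate_cost("glm-5", 4500, 300))
```

Under these assumptions the example call costs about $0.0023 with caching versus about $0.0055 without, which is why a stable system prompt matters in long-running loops.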
Known Issues & Gotchas
- Two platforms: z.ai (international) and open.bigmodel.cn (China). API keys may not be interchangeable between them.
- Pricing increased 30% in early 2026 — check current rates at docs.z.ai/guides/overview/pricing.
- Free Flash tiers (GLM-4.7-Flash, GLM-4.5-Flash) have lower quality than paid models. Good for basic tasks, not complex reasoning.
- Vision is only available on GLM-4.6V and GLM-4.6V-Flash. The flagship GLM-5 does not support vision.
- Cache input storage is 'limited-time free' — may become paid in the future. Build with that expectation.
- tool_stream is specific to Z.AI's API and is not a standard OpenAI API feature. OpenClaw handles the translation.
- GLM models are primarily trained on Chinese data. English quality is good but may lag behind Western-focused models.
- Video generation (CogVideoX, Vidu) and image generation are separate APIs — not usable through OpenClaw's chat interface.
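For anyone calling the API directly rather than through OpenClaw, several of the gotchas above (two platforms, Bearer auth, tool_stream) show up when constructing the request. The sketch below only builds the request and does not send it. The base URLs, the `/chat/completions` path, and the shape of `tool_stream` as a boolean body field alongside `stream` are assumptions to verify against the documentation for the platform that issued your key; `build_chat_request` is a hypothetical helper.

```python
import json
import urllib.request

# Assumed base URLs for the two platforms; keys may not work across them.
BASE_URLS = {
    "international": "https://api.z.ai/api/paas/v4",
    "china": "https://open.bigmodel.cn/api/paas/v4",
}


def build_chat_request(api_key, model, messages, tools=None,
                       platform="international"):
    """Build (but do not send) a streaming chat completion request."""
    body = {"model": model, "messages": messages, "stream": True}
    if tools:
        body["tools"] = tools
        # Z.AI-specific flag for streaming tool calls (assumed field shape).
        body["tool_stream"] = True
    return urllib.request.Request(
        url=f"{BASE_URLS[platform]}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping construction separate makes it easy to inspect the payload and to switch `platform` if a key is rejected on one endpoint.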
Alternatives
- DeepSeek
- Qwen (via DashScope)
- Moonshot (Kimi)
- Synthetic
Community Feedback
GLM-5 is genuinely competitive. Scored 50 on Artificial Analysis — same tier as MiniMax M2.7. The tool_stream feature is great for real-time agent workflows.
— Reddit r/LocalLLaMA
Zhipu's free Flash tiers are generous. GLM-4.7-Flash for free is hard to beat for basic tasks. Just don't expect frontier-level quality.
— Hacker News
The 30% price hike on GLM-5 in early 2026 was notable. They went from being the cheapest Chinese option to mid-range. Quality justified it, but it stung.
— Reddit r/MachineLearning
Configuration Examples
Basic Z.AI setup with GLM-5
providers:
  zai:
    apiKey: your-zai-api-key
model: zai/glm-5

Free tier with GLM-4.7-Flash
providers:
  zai:
    apiKey: your-zai-api-key
model: zai/glm-4.7-flash
# Completely free, great for heartbeats and cron

Z.AI with vision model
providers:
  zai:
    apiKey: your-zai-api-key
model: zai/glm-4.6v
# Vision support at $0.30/$0.90 per MTok