FreeInference Documentation
Free LLM inference for coding agents and IDEs
FreeInference provides free access to state-of-the-art language models for coding agents and AI-powered IDEs, with a particularly smooth setup path for Kilo Code.
Quick Links
Quick Start - Get started in 5 minutes
IDE & Coding Agent Integrations - Configure with Kilo Code, Cursor, Roo Code, and other coding agents
Available Models - View available models
API Headers Reference - API headers reference
Key Features
- Free Access
Free inference for coding agents and development tools
- Multiple Models
Access GLM, Qwen, MiniMax, and other powerful models
- IDE Integration
Easy setup with Kilo Code, Cursor, Roo Code, and more
- Kilo-Friendly Setup
Detailed Kilo Code instructions for a fast OpenAI-compatible configuration
Getting Started
Get your API key - Register at https://freeinference.org and create your API key
Choose your IDE:
Configure and start coding!
See the Quick Start guide for detailed setup instructions.
Available Models
Model |
Context Length |
Best For |
|---|---|---|
GLM-5.1 |
200K tokens |
Latest GLM-5 generation |
GLM-5 Turbo recommended |
200K tokens |
Faster GLM-5 variant |
GLM-4.7 |
200K tokens |
Long context, bilingual |
MiniMax M2.5 |
205K tokens |
Ultra-long context, multimodal |
MiniMax M2.7 |
205K tokens |
Large codebases |
Qwen3.6 35B fastest |
262K tokens |
Strong reasoning and coding intelligence |
See the complete Available Models list for all available models.
Support
Need help? Check out:
IDE & Coding Agent Integrations - IDE setup guides
Available Models - Available models
GitHub Issues - Report bugs or request features