FreeInference Documentation

Free LLM inference for coding agents and IDEs

FreeInference provides free access to state-of-the-art language models for coding agents and AI-powered IDEs, with a particularly smooth setup path for Kilo Code.

Key Features

Free Access

Free inference for coding agents and development tools

Multiple Models

Access GLM, Qwen, MiniMax, and other powerful models

IDE Integration

Easy setup with Kilo Code, Cursor, Roo Code, and more

Kilo-Friendly Setup

Detailed Kilo Code instructions for a fast OpenAI-compatible configuration

Getting Started

  1. Get your API key - Register at https://freeinference.org and create your API key

  2. Choose your IDE:

    • Kilo Code - Kilo Code setup with recommended models

    • Cursor - AI-powered code editor

  3. Configure and start coding!

See the Quick Start guide for detailed setup instructions.

Available Models

Model

Context Length

Best For

GLM-5.1

200K tokens

Latest GLM-5 generation

GLM-5 Turbo recommended

200K tokens

Faster GLM-5 variant

GLM-4.7

200K tokens

Long context, bilingual

MiniMax M2.5

205K tokens

Ultra-long context, multimodal

MiniMax M2.7

205K tokens

Large codebases

Qwen3.6 35B fastest

262K tokens

Strong reasoning and coding intelligence

See the complete Available Models list for all available models.

Support

Need help? Check out: