Mistral Codestral 22B: Europe's First Coding Model

Q: Q: How many programming languages does Codestral support?

Codestral was trained on 80+ programming languages, including Python, JavaScript, TypeScript, Java, C, C++, C#, Rust, Go, SQL, Kotlin, Swift, PHP, Ruby, and many others. Mistral hasn't published the full training language breakdown, but the breadth is meaningfully wider than earlier code models like CodeLlama, which focused more heavily on Python and a narrower set of mainstream languages.

Q: Q: Is Codestral open source or free to use commercially?

Codestral is NOT open source for commercial use. It is released under the Mistral AI Non-Production License, which allows personal, research, and non-commercial use but prohibits use in commercial products, SaaS services, or as part of a paid API without a separate enterprise agreement with Mistral. This is a departure from the Apache-2 licenses Mistral used for their earlier models (Mistral 7B, Mixtral 8x7B, Mixtral 8x22B).

Q: Q: How do I use Codestral in VS Code?

Install the Continue extension (`Continue.continue`) from the VS Code marketplace, then configure `codestral-latest` as your `tabAutocompleteModel` in `~/.continue/config.json`. Set the provider to `mistral` and supply your Mistral API key (from console.mistral.ai). For local inference, point the provider at your LM Studio server URL (`http://localhost:1234/v1`) instead of the Mistral API.

Q: Q: What is Codestral's context window?

Codestral's context window is 32,768 tokens — approximately 24,000–26,000 words of code and documentation. This is a significant advantage over CodeLlama's 4,096-token limit and enables the model to reason about entire files, multi-file components, and larger codebases without chunking. For most individual files and moderate-sized modules, the full context fits without truncation.

What Is Codestral and Why Does It Matter? #

Codestral 22B is Mistral AI's first model purpose-built for code — released yesterday, May 29, 2024 — and it immediately outperforms CodeLlama 70B despite being a fraction of the size. At 22.2 billion parameters with a 32K context window, fill-in-the-middle (FIM) support, and 80+ programming languages in training, it lands as the most capable dedicated coding model available outside of closed APIs. And it runs locally.

This is a bigger deal than the raw benchmark numbers suggest. Until now, if you wanted production-quality AI code completion, you had two real options: pay for a closed API (GitHub Copilot, GPT-4o) or accept a significant quality gap with open-weights alternatives like CodeLlama. Codestral closes that gap — and does it coming out of Paris, not San Francisco.

Mistral AI has shipped three significant open-weights milestones in under a year: Mistral 7B, Mixtral 8x7B, and Mixtral 8x22B. Each one punched above its class. Codestral is the first model in that lineage built entirely around a single domain: writing, completing, and understanding code. That specialization shows in the numbers.

For developers who care about local inference, privacy, or running AI tooling without a per-token meter running, Codestral is the most compelling option available today. It's already wired into VS Code via the Continue extension and LM Studio on day one — no waiting for ecosystem catchup.

Key specs at a glance:

22.2 billion parameters — efficient enough to run locally on a high-VRAM consumer GPU
32,768-token context window — more than 2× what CodeLlama offered, enabling full-file reasoning
80+ programming languages in training — not just Python, but Rust, TypeScript, Go, SQL, and more
Fill-in-the-middle (FIM) support — the architecture behind real IDE-style autocomplete, not just generation
Available now — via the Mistral API (la Plateforme) and as a downloadable model weight

There's one catch worth being honest about: the license. This is the first Mistral model that ships under a non-commercial license rather than Apache-2. That shifts who can use it and how. More on that below.

Codestral 22B Specs at a Glance #

Codestral ships with 22.2 billion parameters, a 32,768-token context window, and training coverage across 80+ programming languages — making it the widest-coverage dedicated coding model available in open weights. Here's the full spec sheet:

Spec	Codestral 22B	CodeLlama 70B	CodeLlama 34B
Parameter count	22.2B	70B	34B
Context window	32,768 tokens	4,096 tokens	16,384 tokens
Languages trained	80+	~30+	~30+
FIM support	Yes (native)	Yes	Yes
Released	May 29, 2024	Aug 2023	Aug 2023
License	Non-commercial	Llama 2 Community	Llama 2 Community
API availability	Mistral la Plateforme	Via Replicate/Together	Via Replicate/Together
Local inference	Yes (download available)	Yes	Yes

The context window is the most immediately practical advantage. CodeLlama 70B tops out at 4,096 tokens — barely enough for a single moderately-sized Python file. Codestral's 32K window means you can feed it an entire module, a full TypeScript component with tests, or a complex SQL schema and get answers that account for the whole picture.

Architecture notes: Codestral uses a decoder-only transformer architecture optimized for causal language modeling with an additional FIM training objective. Mistral hasn't published the full architecture paper yet, but the 22B count at 32K context suggests aggressive attention efficiency improvements relative to earlier Mistral models. The model is available as a raw safetensors download and in GGUF format for llama.cpp compatibility.

Training data: Mistral reports that Codestral was trained on a dataset of 80+ programming languages, with emphasis on Python, JavaScript, TypeScript, Java, C, C++, C#, Go, Rust, and SQL. The training mix includes open-source code repositories, documentation, and technical prose — a standard composition for coding models that's harder to quantify without a technical report.

Deployment targets:

Cloud API: Mistral la Plateforme (api.mistral.ai) — requires API key, billed per token
Self-hosted: Raw weights download, compatible with llama.cpp, Ollama, LM Studio, vLLM, and text-generation-inference
IDE plugins: Continue (VS Code) on day one; LM Studio for local GUI inference

How Codestral Benchmarks Against the Competition #

Codestral 22B scores 81.1% on HumanEval pass@1 — outperforming CodeLlama 70B and matching or exceeding GPT-3.5-turbo on standard coding benchmarks, while being 3× smaller than CodeLlama 70B. These aren't cherry-picked numbers; they hold across multiple evaluation frameworks.

Benchmark	Codestral 22B	CodeLlama 70B	GPT-3.5-turbo	Mistral Large 2 (123B)
HumanEval pass@1	81.1%	~67–69%	~68–73%	92.0%
HumanEval+	73.2%	~60%	~64%	—
MBPP pass@1	78.2%	~62%	~70%	—
MBPP+	62.2%	—	—	—
HumanEvalFIM avg	91.6%	—	—	—

HumanEval is the most-cited coding benchmark — 164 hand-written Python programming problems. Pass@1 measures whether the model solves the problem on its first attempt. Codestral's 81.1% is a substantial jump over the previous best open-weights coding model (CodeLlama 70B at ~67–69%) while using a model that's less than a third the size.

The HumanEvalFIM score of 91.6% is the number that matters most for real-world IDE use. This measures fill-in-the-middle accuracy — precisely the task your IDE autocomplete performs all day: given code before and after the cursor, generate the correct completion. 91.6% average across Python, JavaScript, and Java is a genuinely strong result.

What Codestral doesn't beat yet: Mistral Large 2 (123B) clocks 92.0% HumanEval — still above Codestral. GPT-4-class models and Claude Opus are still ahead on complex multi-step reasoning tasks. Codestral is optimized for the local inference / IDE completion use case, not for being the most capable model at any cost.

On non-Python languages: The benchmark corpus is heavily Python-weighted. Codestral reportedly outperforms GPT-4-turbo on Kotlin benchmarks (73.75 vs 72.05), suggesting the 80-language training mix produces genuinely broader language coverage — not just Python performance with superficial exposure elsewhere.

The honest caveat: HumanEval Pro — a more stringent variant that tests self-invoking code — drops Codestral to 59.1%. This mirrors the industry-wide pattern; every model shows a significant score cliff on harder benchmarks. The 81.1% headline number is real but reflects performance on relatively contained, function-level problems. Production code generation at file or module scope is still harder than these benchmarks capture.

The Fill-in-the-Middle (FIM) Capability Explained #

Fill-in-the-middle (FIM) is the training objective that separates a real IDE autocomplete model from a standard code generator — and Codestral scores 91.6% on FIM benchmarks, making it the most capable open-weights model for this specific task. Understanding FIM is essential to understanding why Codestral matters for production developer tooling.

Standard autoregressive language model training teaches the model to predict the next token given everything that came before. That's useful for generating code top-to-bottom, but it's the wrong architecture for IDE autocomplete. When you're typing in the middle of a function, the model needs to know what comes after the cursor — not just before it.

FIM training reshapes this. The model learns to fill a masked span given both the prefix (code before the cursor) and the suffix (code after the cursor). The training uses special sentinel tokens to mark the boundaries:

<PRE> — marks the start of the prefix
<SUF> — marks the start of the suffix
<MID> — the model fills this position

This is exactly what GitHub Copilot, Cursor's tab completion, and other production IDE tools use under the hood. When you hit tab and the model completes a line in context, FIM is working.

Codestral's FIM breakdown by language:

Python: ~92% (HumanEvalFIM)
JavaScript: ~91% (HumanEvalFIM)
Java: ~92% (HumanEvalFIM)
Average: 91.6%

The practical implication: Codestral can be wired into VS Code, Neovim, or any editor that supports a completion backend, and it will behave like a real autocomplete system — not a chat model you ask to write code for you. Those are fundamentally different use cases, and FIM support is the dividing line.

Using Codestral via the Mistral API (la Plateforme) #

Codestral is available right now via api.mistral.ai under the model name codestral-latest — standard chat completions endpoint, same auth pattern as every other Mistral model. If you're already using Mistral's API, adding Codestral is a one-line model name swap.

Setup #

Get your API key at console.mistral.ai (la Plateforme account required)
Install the Mistral Python client: pip install mistralai
Set your key: export MISTRAL_API_KEY=your-key-here

Basic Python Example #

from mistralai.client import MistralClient
from mistralai.models.chat_completion import ChatMessage

client = MistralClient(api_key="your-api-key")

response = client.chat(
    model="codestral-latest",
    messages=[
        ChatMessage(
            role="user",
            content="Write a Python function that takes a list of integers and returns a dictionary with the count of each unique value."
        )
    ]
)

print(response.choices[0].message.content)

Curl Example #

curl https://api.mistral.ai/v1/chat/completions \
  -H "Authorization: Bearer $MISTRAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "codestral-latest",
    "messages": [
      {
        "role": "user",
        "content": "Write a TypeScript function to debounce any async function with a configurable delay."
      }
    ]
  }'

Pricing: Mistral hasn't published Codestral's per-token pricing at launch beyond the free tier access during beta. La Plateforme's billing dashboard shows usage per model — check the console for current rates. For self-hosted deployments, you pay only the hardware cost.

Model name string to use: codestral-latest — Mistral uses the -latest convention for their current recommended checkpoint, same pattern as mistral-large-latest, mistral-small-latest, etc.

Codestral's Fill-in-the-Middle API Pattern — Code Examples #

Codestral exposes FIM through a dedicated /v1/fim/completions endpoint — distinct from the standard chat completions endpoint — using prompt (prefix), suffix, and optional stop sequences. This is the pattern that powers real-time IDE tab completion.

FIM API Request — Python #

import requests
import os

api_key = os.environ["MISTRAL_API_KEY"]

# The code BEFORE the cursor
prefix = """def calculate_statistics(data: list[float]) -> dict:
    \"\"\"Calculate basic statistics for a list of numbers.\"\"\"
    if not data:
        return {}
    
    n = len(data)
    mean = sum(data) / n
    """

# The code AFTER the cursor (what the model needs to complete BETWEEN)
suffix = """
    return {
        "n": n,
        "mean": mean,
        "median": median,
        "std_dev": std_dev,
        "min": min(data),
        "max": max(data),
    }"""

response = requests.post(
    "https://api.mistral.ai/v1/fim/completions",
    headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    },
    json={
        "model": "codestral-latest",
        "prompt": prefix,
        "suffix": suffix,
        "max_tokens": 256,
        "temperature": 0.0,  # Low temp for completions — deterministic is better
        "stop": ["\ndef ", "\nclass "],  # Stop at the next function/class definition
    }
)

completion = response.json()["choices"][0]["message"]["content"]
print(completion)
# Output: sorted_data = sorted(data)\n    median = sorted_data[n//2]\n    ...

FIM via curl #

curl https://api.mistral.ai/v1/fim/completions \
  -H "Authorization: Bearer $MISTRAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "codestral-latest",
    "prompt": "function debounce(fn, delay) {\n  let timer;\n  ",
    "suffix": "\n  return debounced;\n}",
    "max_tokens": 128,
    "temperature": 0
  }'

Key FIM Parameters #

Parameter	Use	Recommendation
`prompt`	Code before the cursor	Include enough context — 50–200 lines is typical
`suffix`	Code after the cursor	Include the next function/class boundary
`temperature`	Randomness	Use `0` for completions — you want the most likely token
`max_tokens`	Max completion length	64–256 for inline completions; higher for block generation
`stop`	Stop sequences	`["\ndef ", "\nclass "]` prevents overgeneration into next scope

The FIM endpoint is the right interface for editor plugins. The chat completions endpoint (/v1/chat/completions) works for ad-hoc code generation via prompting, but FIM is what the Continue extension and IDE integrations use for true inline tab completion.

Running Codestral Locally With LM Studio #

You can run Codestral locally using LM Studio (free GUI), Ollama (CLI), or raw llama.cpp — and LM Studio has Codestral available in its model browser as of today. The hardware requirement for useful inference is a GPU with 16–24GB VRAM for the Q4 quantization, or a high-RAM Apple Silicon Mac using Metal.

Hardware Requirements by Quantization #

GGUF Format	VRAM Required	Quality vs. Full	Recommended For
Q2_K	~10GB	Significant degradation	Not recommended for code
Q4_K_M	~14GB	Modest tradeoff	16GB GPU (RTX 3080/4070)
Q5_K_M	~17GB	Near-full quality	24GB GPU (RTX 3090/4090)
Q8_0	~24GB	Minimal degradation	24GB+ GPU or M2/M3 Max
F16 (full)	~44GB	Reference quality	Multi-GPU or cloud

For most developers, the Q4_K_M quantization is the practical sweet spot — fits in 16GB VRAM, produces code that's meaningfully better than nothing, and runs at 15–25 tokens/second on an RTX 3090.

LM Studio Setup (GUI) #

Download LM Studio from lmstudio.ai (free, Mac/Windows/Linux)
In the search bar, type Codestral — it appears under the Mistral organization
Select your preferred GGUF quantization level and click Download
Once downloaded, click the model to load it
Switch to the Local Server tab (the <-> icon)
Start the server — it exposes an OpenAI-compatible API at http://localhost:1234/v1
Set this endpoint in the Continue extension config (see next section)

Ollama Setup (CLI) #

# Pull Codestral via Ollama (when available — may require the GGUF manual import initially)
ollama pull codestral

# Or run directly
ollama run codestral "Write a Rust function to parse a TOML config file"

# List running models
ollama list

Apple Silicon note: Codestral runs on Apple Silicon via Metal. An M2 Max (96GB unified memory) runs the full Q8 quantization comfortably. An M3 Pro (18GB) can run Q4_K_M at useful speed. The unified memory architecture means Apple Silicon users get better large-model performance per dollar than comparable Nvidia setups at the low end.

VS Code Integration via the Continue Extension #

The Continue extension for VS Code (continue.dev) has native Codestral support on day one — you can wire it as your autocomplete backend in under five minutes using either the Mistral API or your local LM Studio instance. Continue is the open-source alternative to Copilot that lets you swap in any model, and Codestral is now the strongest option for its completion backend.

Install Continue #

# In VS Code, open the Extensions panel and search:
# "Continue - Codestral, Claude, and more"
# Or install via command line:
code --install-extension Continue.continue

Configure Codestral as the Autocomplete Model #

Open your Continue config at ~/.continue/config.json and add Codestral under tabAutocompleteModel:

{
  "models": [
    {
      "title": "Codestral (Mistral API)",
      "provider": "mistral",
      "model": "codestral-latest",
      "apiKey": "YOUR_MISTRAL_API_KEY"
    }
  ],
  "tabAutocompleteModel": {
    "title": "Codestral Autocomplete",
    "provider": "mistral",
    "model": "codestral-latest",
    "apiKey": "YOUR_MISTRAL_API_KEY"
  },
  "allowAnonymousTelemetry": false
}

Using Local LM Studio Instead of the API #

If you're running Codestral locally via LM Studio (server on localhost:1234), point Continue at your local server instead:

{
  "tabAutocompleteModel": {
    "title": "Codestral Local",
    "provider": "lmstudio",
    "model": "codestral-22b-v0.1-Q4_K_M",
    "apiBase": "http://localhost:1234/v1"
  }
}

What the Experience Looks Like #

Once wired up, Continue adds ghost-text completions to VS Code as you type — the same UX as Copilot. Press Tab to accept a suggestion, Esc to dismiss. The FIM architecture means suggestions are contextually aware of what's below the cursor, not just above it.

The Continue extension also supports Codestral for its chat sidebar (asking questions about your code, explaining functions, generating tests), but the autocomplete tab is where the FIM advantage is most felt day-to-day.

Latency note: Using the Mistral API means each completion round-trips to Paris. For interactive typing-speed completions, this adds noticeable latency compared to a local model on fast hardware. If latency is a concern, run Codestral locally and accept the VRAM cost.

The License Controversy: Non-Commercial vs. Apache-2 #

Codestral ships under the Mistral AI Non-Production License — not Apache-2 — making it the first Mistral model that cannot legally be used in commercial products or production services without a separate enterprise agreement. This is a meaningful shift from the Apache-2 licenses on Mistral 7B, Mixtral 8x7B, and Mixtral 8x22B, and the developer community noticed immediately.

Here's what the non-commercial license actually restricts:

Use Case	Apache-2 Models	Codestral (Non-Commercial)
Personal/research use	✅	✅
Open-source projects (non-commercial)	✅	✅
Internal business tools (no external users)	✅	❌ Requires license
Commercial SaaS products	✅	❌ Requires license
Selling completions via your API	✅	❌ Requires license
Fine-tuning for commercial use	✅	❌ Requires license
Research and evaluation	✅	✅

The community reaction has been loud. Mistral built significant goodwill with the Apache-2 releases, positioning itself as the "truly open" alternative to locked-down proprietary labs. Codestral breaks that pattern — and it breaks it on their highest-profile specialized model.

The business logic is clear enough: Codestral is purpose-built for developer tooling. If it's as good as the benchmarks suggest, it's competitive with GitHub Copilot's backend. Mistral would be funding their competitors' products for free under Apache-2. A non-commercial license lets individual developers and researchers use it freely while creating a monetization path for the enterprise use cases that generate real revenue.

What this means for you as a builder:

If you're using Codestral for personal coding, side projects, or research: You're fine under the non-commercial license.
If you're building an internal dev tool for your team at a company: Technically restricted — you'd need Mistral's enterprise agreement.
If you're building a product that exposes AI code completion to end users: The non-commercial license prohibits this without a deal with Mistral.

The precedent matters. Mistral has signaled that future specialized models may follow the same pattern — open for personal use, licensed for commercial deployment. This is closer to Meta's Llama 2 / Llama 3 model than the pure Apache-2 ethos of their earlier releases. Whether it's a sustainable strategy or will push developers toward alternatives like DeepSeek Coder and StarCoder2 remains to be seen.

Why Codestral Is a Big Deal for European AI #

Codestral is the first dedicated coding model from a European AI lab — and it beats American and Chinese open-weights competitors on benchmark scores while running from Paris. This is the clearest signal yet that European AI is capable of frontier-level specialized model development, not just research.

Mistral AI was founded in April 2023 by former Meta and Google DeepMind researchers and raised €105M in seed funding before shipping a single model. They've moved faster than almost anyone: Mistral 7B in September 2023, Mixtral 8x7B in December 2023, Mixtral 8x22B in April 2024, and now Codestral in May 2024. That's four significant model releases in under eight months.

The geopolitical framing isn't incidental. The EU AI Act is entering enforcement. European enterprises are looking for AI infrastructure that isn't entirely dependent on American hyperscalers or Chinese labs. A competitive French AI company that can offer frontier-level model quality via a European API endpoint matters for data residency, regulatory compliance, and strategic independence.

What Mistral has that the others don't:

Speed of iteration — 4 significant models in 8 months is a faster release cadence than Anthropic or Google at equivalent company size
Architectural efficiency — Mistral models consistently punch above their parameter count, likely due to aggressive training recipe optimization
European data law alignment — Mistral's infrastructure is EU-based, which matters for GDPR-sensitive enterprise use cases
Community credibility — The Apache-2 releases built real developer trust. Codestral's non-commercial license tests that trust, but the core goodwill is still there.

The competitive picture for AI coding tools is now: GitHub Copilot (Microsoft/OpenAI), Cursor (Anysphere + Claude/GPT-4), Amazon CodeWhisperer, and — if Codestral delivers on its benchmarks in real workflows — a legitimate European challenger from Mistral. That's a more interesting market than it was 90 days ago.

Should You Replace GitHub Copilot With Codestral? #

Codestral is a genuine alternative to GitHub Copilot for individual developers and open-source projects today — but it's not a straight replacement for teams using Copilot's IDE integration, GitHub context awareness, or enterprise features. Here's the honest comparison:

Factor	Codestral (via Continue)	GitHub Copilot	Cursor
Underlying model	Codestral 22B	GPT-4o-based	GPT-4o + Claude Sonnet
FIM quality	91.6% HumanEval FIM	Not published	Not published
Context window	32K tokens	~8K (completions)	Full-file context
GitHub repo awareness	No	Yes (deep)	Via repo indexing
Local inference option	Yes	No	No
Price	Free (self-hosted) / API fees	$10/mo individual	$20/mo
Enterprise license	Requires Mistral agreement	$19/user/mo	Enterprise available
VS Code integration	Continue extension	Native	Native (own editor)
Chat + completions	Both (Continue)	Both	Both

Codestral wins when:

You want local inference (privacy, zero latency on good hardware, offline use)
You're using it for personal/open-source projects and want zero subscription cost
You need the best available open-weights FIM model for a custom IDE build
You're on a GDPR-sensitive project and want EU-hosted API infrastructure

GitHub Copilot / Cursor still lead when:

You want GitHub repository context in suggestions (Copilot knows your issue tracker, PRs, and codebase structure in ways Codestral can't without integration work)
You need a polished, maintained IDE-native integration without config overhead
Your team is already on an enterprise plan and needs SSO, audit logs, and managed compliance
You're using a less common editor that Continue doesn't support well

My take: For personal dev work and individual developers experimenting with open-weights tooling, Codestral via Continue is worth trying today. The 32K context and 91.6% FIM score are genuinely competitive. For production team deployments, Copilot and Cursor's full-stack editor approach still have a UX polish and context integration advantage that's hard to dismiss.

How I'm Wiring Codestral Into My AI Automation Workflow #

I'm using Codestral today as a lightweight, fast code-generation backend inside n8n automation pipelines — specifically for generating function bodies, SQL queries, and TypeScript transformers that I don't want to pay GPT-4o rates for. Here's the setup and where it's earning its place.

The core pattern: in n8n, I use an HTTP Request node to call the Mistral API with Codestral whenever the automation needs to write or transform code. For lightweight code-gen tasks — string formatters, data mappers, regex generators — Codestral's speed and cost profile beats GPT-4o for repetitive work.

n8n HTTP Node Config for Codestral #

{
  "url": "https://api.mistral.ai/v1/chat/completions",
  "method": "POST",
  "headers": {
    "Authorization": "Bearer {{ $env.MISTRAL_API_KEY }}",
    "Content-Type": "application/json"
  },
  "body": {
    "model": "codestral-latest",
    "messages": [
      {
        "role": "system",
        "content": "You are a code generation assistant. Return only the code, no explanation. Always use TypeScript."
      },
      {
        "role": "user",
        "content": "{{ $json.codePrompt }}"
      }
    ],
    "temperature": 0.1,
    "max_tokens": 512
  }
}

Where Codestral Earns Its Place in an Automation Stack #

Task	Use Codestral	Use GPT-4o Instead
Generate a TypeScript data transformer	✅	—
Write a SQL query from a schema	✅	—
Generate regex patterns	✅	—
Explain a complex algorithm	—	✅
Multi-step code architecture planning	—	✅
Code review with nuanced feedback	—	✅
Filling in function bodies in known patterns	✅	—

The honest tradeoff: Codestral is fast and cheap for contained code-gen tasks. For anything requiring reasoning across a large codebase context, architectural judgment, or multi-step code planning, GPT-4o or Claude Opus is still the right call. Think of Codestral as the specialist you bring in for a specific, well-defined task — not the generalist who understands your whole system.

I'm not using Codestral as my primary IDE autocomplete today — I'm on Cursor with GPT-4o for my main editing workflow, and the editor-level integration is hard to beat. But for the automation pipeline use case, Codestral is genuinely useful and I'll keep running it until something materially better shows up in the open-weights space.

Frequently Asked Questions #

Q: What is Mistral Codestral 22B? #

Codestral 22B is Mistral AI's first model purpose-built for code generation and completion, released May 29, 2024. It has 22.2 billion parameters, a 32,768-token context window, and training coverage across 80+ programming languages. It supports fill-in-the-middle (FIM) inference, making it suitable for real IDE autocomplete use cases, and is available via the Mistral API and as a self-hosted download.

Q: How many programming languages does Codestral support? #

Codestral was trained on 80+ programming languages, including Python, JavaScript, TypeScript, Java, C, C++, C#, Rust, Go, SQL, Kotlin, Swift, PHP, Ruby, and many others. Mistral hasn't published the full training language breakdown, but the breadth is meaningfully wider than earlier code models like CodeLlama, which focused more heavily on Python and a narrower set of mainstream languages.

Q: What is fill-in-the-middle (FIM) and does Codestral support it? #

Fill-in-the-middle (FIM) is a training objective that teaches a model to generate code given both the text before and after the cursor — the mechanism behind real IDE tab completion. Yes, Codestral fully supports FIM and achieves 91.6% accuracy on the HumanEvalFIM benchmark (averaged across Python, JavaScript, and Java). Codestral exposes FIM through a dedicated /v1/fim/completions API endpoint using prompt (prefix) and suffix parameters.

Q: How does Codestral benchmark against CodeLlama 70B? #

Codestral 22B outperforms CodeLlama 70B on HumanEval despite being less than a third the size. Codestral scores 81.1% on HumanEval pass@1; CodeLlama 70B scores approximately 67–69%. Codestral also significantly extends CodeLlama's 4,096-token context limit to 32,768 tokens — an 8× advantage for whole-file and multi-file reasoning tasks.

Q: Is Codestral open source or free to use commercially? #

Codestral is NOT open source for commercial use. It is released under the Mistral AI Non-Production License, which allows personal, research, and non-commercial use but prohibits use in commercial products, SaaS services, or as part of a paid API without a separate enterprise agreement with Mistral. This is a departure from the Apache-2 licenses Mistral used for their earlier models (Mistral 7B, Mixtral 8x7B, Mixtral 8x22B).

Q: Can I run Codestral locally on my own hardware? #

Yes — Codestral is available as a direct model weight download and in GGUF format for llama.cpp-based runtimes. You can run it via LM Studio (GUI), Ollama (CLI), or vLLM (server). Minimum practical hardware: a GPU with 16GB VRAM for Q4_K_M quantization, or an Apple Silicon Mac with 32GB+ unified memory. An RTX 3090 (24GB) runs Q5_K_M at usable speeds; an M2 Max (96GB) can run near-full quality.

Q: How do I use Codestral in VS Code? #

Install the Continue extension (Continue.continue) from the VS Code marketplace, then configure codestral-latest as your tabAutocompleteModel in ~/.continue/config.json. Set the provider to mistral and supply your Mistral API key (from console.mistral.ai). For local inference, point the provider at your LM Studio server URL (http://localhost:1234/v1) instead of the Mistral API.

Q: What is Codestral's context window? #

Codestral's context window is 32,768 tokens — approximately 24,000–26,000 words of code and documentation. This is a significant advantage over CodeLlama's 4,096-token limit and enables the model to reason about entire files, multi-file components, and larger codebases without chunking. For most individual files and moderate-sized modules, the full context fits without truncation.

Q: How does Codestral compare to GPT-3.5 for coding tasks? #

Codestral 22B matches or exceeds GPT-3.5-turbo on standard coding benchmarks. On HumanEval pass@1, Codestral scores 81.1% against GPT-3.5-turbo's approximately 68–73% (depending on evaluation setup). Codestral also beats GPT-4-Turbo on specific language benchmarks like Kotlin. The key advantage Codestral has over GPT-3.5 for production use: it runs locally and offline, with no per-token billing after hardware costs.

Q: Is Codestral good for non-Python languages like Rust or TypeScript? #

Yes — Codestral's 80+ language training mix includes strong coverage of Rust, TypeScript, Go, SQL, and Java. Mistral reports Codestral outperforms GPT-4-Turbo on Kotlin benchmarks specifically. For TypeScript in particular, the 32K context window is highly practical — a full TypeScript module with types, utilities, and tests fits in context without chunking. Python-centric benchmarks like HumanEval underrepresent Codestral's multi-language capability in real workflows.

Build AI-Powered Dev Workflows — Not Just Better Autocomplete #

Codestral is a great piece of infrastructure. But having the best local coding model is only part of the equation — the other part is knowing how to wire it into workflows that actually ship faster.

I help founders and engineering teams build custom AI automation pipelines: code-gen agents, n8n workflows that auto-generate and test code, AI-assisted PR review systems, and full-stack AI ops that reduce the grunt-work tax on every sprint.

If you're evaluating how to integrate open-weights models like Codestral into your dev workflow — or you want a custom AI automation system built for your team — book an AI automation strategy call and let's talk through what makes sense for your stack.

If you found this useful, these posts are in the same orbit:

Mistral 8x22B: The Apache-2 MoE That Dropped via Torrent and Changed Open Weights Forever — The MoE release that set the stage for Codestral, and why the Apache-2 license made it a landmark moment.
The State of Open-Weights AI: Why Mistral 7B/8x7B Ruled the World (For Now) — How Mistral got here: the efficiency story behind 7B and Mixtral's MoE architecture.
Cursor in 2024: The Indie Dev's Secret Weapon Before the $400M Valuation — How Cursor wires Claude and GPT-4o into an IDE-native workflow that Codestral + Continue is now competing with.
GitHub Copilot Workspace: When Copilot Became an Agent (Technical Preview Deep-Dive) — Microsoft's counter-move: turning Copilot into a full coding agent. Context for understanding where the IDE war is heading.

Mistral Codestral 22B: Europe's First Dedicated Coding Model Lands

Table of Contents

What Is Codestral and Why Does It Matter? #

Codestral 22B Specs at a Glance #

How Codestral Benchmarks Against the Competition #

The Fill-in-the-Middle (FIM) Capability Explained #

Using Codestral via the Mistral API (la Plateforme) #

Setup #

Basic Python Example #

Curl Example #

Codestral's Fill-in-the-Middle API Pattern — Code Examples #

FIM API Request — Python #

FIM via curl #

Key FIM Parameters #

Running Codestral Locally With LM Studio #

Hardware Requirements by Quantization #

LM Studio Setup (GUI) #

Ollama Setup (CLI) #

VS Code Integration via the Continue Extension #

Install Continue #

Configure Codestral as the Autocomplete Model #

Using Local LM Studio Instead of the API #

What the Experience Looks Like #

The License Controversy: Non-Commercial vs. Apache-2 #

Why Codestral Is a Big Deal for European AI #

Should You Replace GitHub Copilot With Codestral? #

How I'm Wiring Codestral Into My AI Automation Workflow #

n8n HTTP Node Config for Codestral #

Where Codestral Earns Its Place in an Automation Stack #

Frequently Asked Questions #

Q: What is Mistral Codestral 22B? #

Q: How many programming languages does Codestral support? #

Q: What is fill-in-the-middle (FIM) and does Codestral support it? #

Q: How does Codestral benchmark against CodeLlama 70B? #

Q: Is Codestral open source or free to use commercially? #

Q: Can I run Codestral locally on my own hardware? #

Q: How do I use Codestral in VS Code? #

Q: What is Codestral's context window? #

Q: How does Codestral compare to GPT-3.5 for coding tasks? #

Q: Is Codestral good for non-Python languages like Rust or TypeScript? #

Build AI-Powered Dev Workflows — Not Just Better Autocomplete #

Related Posts #

Related Posts

Cursor Year-End Recap: The IDE That Defined 2024

Cursor Raises $60M Series A at $400M Valuation: The AI-Native Code Editor Just Changed Everything

GitHub Models Public Preview: OpenAI, Anthropic, and Mistral Inside GitHub

Recent Posts

Recent Posts

AI Sales Agents in 2026: What They Can Do, What They Can't, and When to Deploy One

AI Chat vs Live Chat: When to Use Each and How to Set Up the Handoff

How to Build an AI Support Agent That Handles 80% of Tickets Automatically

The Question-First Content Model That Gets You Cited by AI

Schema, Structured Data, and Entity SEO: The Technical Core of AI Visibility

Categories

Explore Categories

AI Coding & Dev Tools

AI Models & Frontier News

AI Agents & Automation

Growth & Operations

Web Design & Digital Craft

AI Policy & Safety