Distillation

External LLM integration for data generation and code fixing

The distillation pipeline uses external LLMs (Claude, GPT-4) to generate high-quality training data and fix compilation errors.

Why Distillation?

| Approach | Pros | Cons |
|---|---|---|
| Local RAFT only | Fast, no API costs | Limited by local model quality |
| External LLM only | High-quality output | Expensive, can't fine-tune |
| Distillation | Best of both | Requires pipeline setup |

The key insight: Use expensive inference to generate data, then train your local model on verified winners.

External LLM → Generate samples
                    │
                    ▼
            Your verification pipeline
                    │
                    ▼
            Filter to verified samples
                    │
                    ▼
            SFT your local model on verified samples
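The filter step above can be sketched in a few lines. The `Sample` fields here are illustrative assumptions, not the actual malagent schema:

```python
from dataclasses import dataclass

@dataclass
class Sample:
    """Hypothetical minimal sample record (field names are assumptions)."""
    prompt: str
    completion: str
    verified: bool

def filter_verified(samples):
    """Keep only samples that passed the verification pipeline."""
    return [s for s in samples if s.verified]

def to_sft_records(samples):
    """Convert verified samples into prompt/completion pairs for SFT."""
    return [{"prompt": s.prompt, "completion": s.completion} for s in samples]

samples = [
    Sample("p1", "good code", True),
    Sample("p2", "broken code", False),
]
# Only the verified sample survives the filter.
records = to_sft_records(filter_verified(samples))
```

The point of the filter is that the local model only ever trains on outputs the verification pipeline has confirmed, regardless of how many raw samples the external LLM produced.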

Components

1. External Generators

Connect to Claude or GPT-4 APIs for code generation:

import os

from malagent.generators import create_generator

generator = create_generator(
    provider="anthropic",
    model="claude-sonnet-4-20250514",
    api_key=os.environ["ANTHROPIC_API_KEY"]
)

sample = await generator.generate(prompt)
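Because `generate()` is async, several prompts can be fanned out concurrently. A sketch using a stub in place of the real generator (the string return type of `generate()` is an assumption here):

```python
import asyncio

class StubGenerator:
    """Stand-in for a malagent generator; the generate() signature is an assumption."""
    async def generate(self, prompt: str) -> str:
        await asyncio.sleep(0)  # pretend to call the provider API
        return f"// code for: {prompt}"

async def generate_batch(generator, prompts):
    """Fan out one API call per prompt and gather completions in order."""
    return await asyncio.gather(*(generator.generate(p) for p in prompts))

completions = asyncio.run(generate_batch(StubGenerator(), ["alpha", "beta"]))
```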

2. Compilation Fix Agent

Automatically fix compilation errors using an external LLM:

from malagent.generators import CompileFixAgent

fix_agent = CompileFixAgent(
    generator=generator,
    verifier=verifier,
    max_attempts=3
)

fixed_sample = await fix_agent.fix(
    original_code=broken_code,
    compiler_error=error_message
)
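The `max_attempts` behavior can be pictured as a simple retry loop; `propose_fix` and `compiles` below are stand-ins for the LLM call and the compiler check, not real malagent APIs:

```python
def fix_with_retries(code, error, propose_fix, compiles, max_attempts=3):
    """Ask the (stubbed) LLM for a fix up to max_attempts times;
    return the first candidate that compiles, else None."""
    for attempt in range(1, max_attempts + 1):
        candidate = propose_fix(code, error, attempt)
        if compiles(candidate):
            return candidate
    return None

# Stub fixer that only "succeeds" on the second attempt.
propose = lambda code, err, n: code.replace("BROKEN", "ok") if n >= 2 else code
compiles = lambda c: "BROKEN" not in c

fixed = fix_with_retries("BROKEN main()", "C2065", propose, compiles)
```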

3. Distillation Pipeline

End-to-end pipeline combining generation, fixing, and verification:

from malagent.distillation import DistillationPipeline

pipeline = DistillationPipeline(
    generator=generator,
    verifier=verifier,
    fix_agent=fix_agent
)

results = await pipeline.run(
    prompts=prompts,
    samples_per_prompt=5,
    budget_usd=10.0
)

Workflow

┌─────────────────────────────────────────────────────────────┐
│                   DISTILLATION PIPELINE                      │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  1. GENERATE      2. VERIFY        3. FIX         4. STORE  │
│      │                │               │               │      │
│      ▼                ▼               ▼               ▼      │
│  External LLM    Compile +        Fix Agent       Save to    │
│  (Claude/GPT)    Detect           (if failed)     Dataset    │
│                                                              │
│  Prompt ────► Code ────► Result ────► Fixed? ────► Sample   │
│                                                              │
└─────────────────────────────────────────────────────────────┘
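The four stages map onto a loop roughly like the following. All components are stubbed, and the `(ok, error)` verifier return shape is an assumption for illustration:

```python
import asyncio

async def distill_one(prompt, generate, verify, fix, store):
    """One pass through the four stages; the callables are stand-ins."""
    code = await generate(prompt)          # 1. GENERATE
    ok, error = await verify(code)         # 2. VERIFY
    if not ok:                             # 3. FIX only failed samples
        code = await fix(code, error)
        ok, error = await verify(code)
    if ok:                                 # 4. STORE verified samples only
        store(code)
    return ok

# Stub components: the first draft fails to "compile", the fix succeeds.
async def generate(prompt):
    return f"BAD {prompt}"

async def verify(code):
    return ("BAD" not in code, "compile error" if "BAD" in code else None)

async def fix(code, error):
    return code.replace("BAD ", "")

dataset = []
ok = asyncio.run(distill_one("task", generate, verify, fix, dataset.append))
```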

Use Cases

1. Bootstrap Training Data

Generate initial high-quality samples before any local training:

malagent distill run \
    --provider anthropic \
    --model claude-sonnet-4-20250514 \
    --prompts data/prompts/techniques.jsonl \
    --output distillation_output \
    --samples 5 \
    --budget 20.0

2. Fix Compilation Failures

Convert failing RAFT samples into successes:

# In RAFT cycle, use fix agent for failed samples
for sample in failed_samples:
    if sample.error_type == "compile":
        fixed = await fix_agent.fix(sample.code, sample.error)
        if fixed.success:
            training_samples.append(fixed)

3. Analyze Failures

Use external LLM to understand why samples fail or get detected:

analysis = await generator.generate(
    f"Analyze why this code was detected by EDR:\n"
    f"Code: {sample.code}\n"
    f"Detection: {sample.detection_rule}\n"
    f"Explain the detection and suggest improvements."
)

CLI Commands

# Run distillation pipeline
malagent distill run \
    --provider anthropic \
    --model claude-sonnet-4-20250514 \
    --prompts prompts.jsonl \
    --output ./output \
    --budget 10.0

# Analyze results
malagent distill analyze --samples-dir ./output/samples

# Export verified samples to SFT format
malagent distill export \
    --samples-dir ./output/samples \
    --output sft_dataset.jsonl \
    --min-reward 0.5
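The export step plausibly amounts to a reward-thresholded JSONL write, sketched below. The field names (`reward`, `prompt`, `completion`) are assumptions about the sample schema, not the actual export format:

```python
import io
import json

def export_sft(samples, min_reward, fh):
    """Write samples with reward >= min_reward as one JSON object per line."""
    kept = 0
    for s in samples:
        if s["reward"] >= min_reward:
            record = {"prompt": s["prompt"], "completion": s["completion"]}
            fh.write(json.dumps(record) + "\n")
            kept += 1
    return kept

buf = io.StringIO()
n = export_sft(
    [{"prompt": "p", "completion": "c", "reward": 0.9},
     {"prompt": "q", "completion": "d", "reward": 0.2}],
    min_reward=0.5,
    fh=buf,
)
```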

Cost Management

External LLM calls incur API costs. The pipeline includes budget controls:

pipeline = DistillationPipeline(
    generator=generator,
    budget_usd=10.0,           # Stop when budget exceeded
    cost_warning_threshold=0.8  # Warn at 80% budget
)

Note: Actual costs depend on your provider, model selection, token usage, and current pricing. Check your provider’s pricing page for current rates.
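How the two budget knobs interact can be sketched with a small tracker; this is an illustration of the semantics, not the malagent implementation:

```python
class BudgetTracker:
    """Sketch of budget accounting: warn at a fraction of budget, stop at 100%."""
    def __init__(self, budget_usd, warning_threshold=0.8):
        self.budget_usd = budget_usd
        self.warning_threshold = warning_threshold
        self.spent_usd = 0.0

    def record(self, cost_usd):
        """Add one call's cost; return 'stop', 'warn', or 'ok'."""
        self.spent_usd += cost_usd
        if self.spent_usd >= self.budget_usd:
            return "stop"
        if self.spent_usd >= self.warning_threshold * self.budget_usd:
            return "warn"
        return "ok"

tracker = BudgetTracker(budget_usd=10.0)
# Four $3 calls: the third crosses the 80% warning line, the fourth the budget.
statuses = [tracker.record(3.0) for _ in range(4)]
```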

Sample Attribution

All samples track their origin for training transparency:

sample = GeneratedSample(
    prompt=prompt,
    completion=code,
    generator="anthropic/claude-sonnet-4-20250514",
    was_fixed=True,
    fix_iterations=2,
    original_completion=original_broken_code,
    compile_errors_fixed=["C2065", "C2143"]
)

This enables:

  • Understanding which samples were fixed vs. original
  • Tracking fix agent effectiveness
  • Filtering by sample origin for experiments
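With attribution fields attached, that filtering is a one-liner. A minimal sketch using stand-in records with only the relevant fields:

```python
from dataclasses import dataclass

@dataclass
class Record:
    """Minimal stand-in for GeneratedSample; only attribution fields shown."""
    generator: str
    was_fixed: bool

records = [
    Record("anthropic/claude-sonnet-4-20250514", was_fixed=True),
    Record("anthropic/claude-sonnet-4-20250514", was_fixed=False),
]

# Split fixed vs. original samples for a training-data experiment.
fixed = [r for r in records if r.was_fixed]
originals = [r for r in records if not r.was_fixed]
fix_rate = len(fixed) / len(records)
```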