Distillation

External LLM integration for data generation and code fixing

The distillation pipeline uses external LLMs (Claude, GPT-4) to generate high-quality training data and fix compilation errors.

Why Distillation?

| Approach | Pros | Cons |
|---|---|---|
| Local RAFT only | Fast, no API costs | Limited by local model quality |
| External LLM only | High-quality output | Expensive, can't fine-tune |
| Distillation | Best of both | Requires pipeline setup |

The key insight: Use expensive inference to generate data, then train your local model on verified winners.

External LLM → Generate samples
                    │
                    ▼
            Your verification pipeline
                    │
                    ▼
            Filter to verified samples
                    │
                    ▼
            SFT your local model on verified samples
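The filter step above can be sketched in a few lines. The `Sample` fields here are illustrative assumptions, not the actual malagent schema:

```python
from dataclasses import dataclass

@dataclass
class Sample:
    """Hypothetical minimal sample record (field names are assumptions)."""
    prompt: str
    completion: str
    verified: bool

def filter_verified(samples):
    """Keep only samples that passed the verification pipeline."""
    return [s for s in samples if s.verified]

def to_sft_records(samples):
    """Convert verified samples into prompt/completion pairs for SFT."""
    return [{"prompt": s.prompt, "completion": s.completion} for s in samples]

samples = [
    Sample("p1", "good code", True),
    Sample("p2", "broken code", False),
]
# Only the verified sample survives the filter.
records = to_sft_records(filter_verified(samples))
```

The point of the filter is that the local model only ever trains on outputs the verification pipeline has confirmed, regardless of how many raw samples the external LLM produced.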

Components

1. External Generators

Connect to Claude or GPT-4 APIs for code generation:

import os

from malagent.generators import create_generator

generator = create_generator(
    provider="anthropic",
    model="claude-sonnet-4-20250514",
    api_key=os.environ["ANTHROPIC_API_KEY"]
)

sample = await generator.generate(prompt)
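Because `generate()` is async, several prompts can be fanned out concurrently. A sketch using a stub in place of the real generator (the string return type of `generate()` is an assumption here):

```python
import asyncio

class StubGenerator:
    """Stand-in for a malagent generator; the generate() signature is an assumption."""
    async def generate(self, prompt: str) -> str:
        await asyncio.sleep(0)  # pretend to call the provider API
        return f"// code for: {prompt}"

async def generate_batch(generator, prompts):
    """Fan out one API call per prompt and gather completions in order."""
    return await asyncio.gather(*(generator.generate(p) for p in prompts))

completions = asyncio.run(generate_batch(StubGenerator(), ["alpha", "beta"]))
```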

2. Compilation Fix Agent

Automatically fix compilation errors using an external LLM:

from malagent.generators import CompileFixAgent

fix_agent = CompileFixAgent(
    generator=generator,
    verifier=verifier,
    max_attempts=3
)

fixed_sample = await fix_agent.fix(
    original_code=broken_code,
    compiler_error=error_message
)
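The `max_attempts` behavior can be pictured as a simple retry loop; `propose_fix` and `compiles` below are stand-ins for the LLM call and the compiler check, not real malagent APIs:

```python
def fix_with_retries(code, error, propose_fix, compiles, max_attempts=3):
    """Ask the (stubbed) LLM for a fix up to max_attempts times;
    return the first candidate that compiles, else None."""
    for attempt in range(1, max_attempts + 1):
        candidate = propose_fix(code, error, attempt)
        if compiles(candidate):
            return candidate
    return None

# Stub fixer that only "succeeds" on the second attempt.
propose = lambda code, err, n: code.replace("BROKEN", "ok") if n >= 2 else code
compiles = lambda c: "BROKEN" not in c

fixed = fix_with_retries("BROKEN main()", "C2065", propose, compiles)
```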

3. Distillation Pipeline

End-to-end pipeline combining generation, fixing, and verification:

from malagent.distillation import DistillationPipeline

pipeline = DistillationPipeline(
    generator=generator,
    verifier=verifier,
    fix_agent=fix_agent
)

results = await pipeline.run(
    prompts=prompts,
    samples_per_prompt=5,
    budget_usd=10.0
)

Workflow

┌─────────────────────────────────────────────────────────────┐
│                   DISTILLATION PIPELINE                      │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  1. GENERATE      2. VERIFY        3. FIX         4. STORE  │
│      │                │               │               │      │
│      ▼                ▼               ▼               ▼      │
│  External LLM    Compile +        Fix Agent       Save to    │
│  (Claude/GPT)    Detect           (if failed)     Dataset    │
│                                                              │
│  Prompt ────► Code ────► Result ────► Fixed? ────► Sample   │
│                                                              │
└─────────────────────────────────────────────────────────────┘
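The four stages map onto a loop roughly like the following. All components are stubbed, and the `(ok, error)` verifier return shape is an assumption for illustration:

```python
import asyncio

async def distill_one(prompt, generate, verify, fix, store):
    """One pass through the four stages; the callables are stand-ins."""
    code = await generate(prompt)          # 1. GENERATE
    ok, error = await verify(code)         # 2. VERIFY
    if not ok:                             # 3. FIX only failed samples
        code = await fix(code, error)
        ok, error = await verify(code)
    if ok:                                 # 4. STORE verified samples only
        store(code)
    return ok

# Stub components: the first draft fails to "compile", the fix succeeds.
async def generate(prompt):
    return f"BAD {prompt}"

async def verify(code):
    return ("BAD" not in code, "compile error" if "BAD" in code else None)

async def fix(code, error):
    return code.replace("BAD ", "")

dataset = []
ok = asyncio.run(distill_one("task", generate, verify, fix, dataset.append))
```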

Use Cases

1. Bootstrap Training Data

Generate initial high-quality samples before any local training:

malagent distill run \
    --provider anthropic \
    --model claude-sonnet-4-20250514 \
    --prompts data/prompts/techniques.jsonl \
    --output distillation_output \
    --samples 5 \
    --budget 20.0

2. Fix Compilation Failures

Convert failing RAFT samples into successes:

# In RAFT cycle, use fix agent for failed samples
for sample in failed_samples:
    if sample.error_type == "compile":
        fixed = await fix_agent.fix(sample.code, sample.error)
        if fixed.success:
            training_samples.append(fixed)

3. Analyze Failures

Use external LLM to understand why samples fail or get detected:

analysis = await generator.generate(
    f"Analyze why this code was detected by EDR:\n"
    f"Code: {sample.code}\n"
    f"Detection: {sample.detection_rule}\n"
    f"Explain the detection and suggest improvements."
)

CLI Commands

# Run distillation pipeline
malagent distill run \
    --provider anthropic \
    --model claude-sonnet-4-20250514 \
    --prompts prompts.jsonl \
    --output ./output \
    --budget 10.0

# Analyze results
malagent distill analyze --samples-dir ./output/samples

# Export verified samples to SFT format
malagent distill export \
    --samples-dir ./output/samples \
    --output sft_dataset.jsonl \
    --min-reward 0.5
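The export step plausibly amounts to a reward-thresholded JSONL write, sketched below. The field names (`reward`, `prompt`, `completion`) are assumptions about the sample schema, not the actual export format:

```python
import io
import json

def export_sft(samples, min_reward, fh):
    """Write samples with reward >= min_reward as one JSON object per line."""
    kept = 0
    for s in samples:
        if s["reward"] >= min_reward:
            record = {"prompt": s["prompt"], "completion": s["completion"]}
            fh.write(json.dumps(record) + "\n")
            kept += 1
    return kept

buf = io.StringIO()
n = export_sft(
    [{"prompt": "p", "completion": "c", "reward": 0.9},
     {"prompt": "q", "completion": "d", "reward": 0.2}],
    min_reward=0.5,
    fh=buf,
)
```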

Cost Management

External LLM calls incur API costs. The pipeline includes budget controls:

pipeline = DistillationPipeline(
    generator=generator,
    budget_usd=10.0,           # Stop when budget exceeded
    cost_warning_threshold=0.8  # Warn at 80% budget
)

Note: Actual costs depend on your provider, model selection, token usage, and current pricing. Check your provider’s pricing page for current rates.
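How the two budget knobs interact can be sketched with a small tracker; this is an illustration of the semantics, not the malagent implementation:

```python
class BudgetTracker:
    """Sketch of budget accounting: warn at a fraction of budget, stop at 100%."""
    def __init__(self, budget_usd, warning_threshold=0.8):
        self.budget_usd = budget_usd
        self.warning_threshold = warning_threshold
        self.spent_usd = 0.0

    def record(self, cost_usd):
        """Add one call's cost; return 'stop', 'warn', or 'ok'."""
        self.spent_usd += cost_usd
        if self.spent_usd >= self.budget_usd:
            return "stop"
        if self.spent_usd >= self.warning_threshold * self.budget_usd:
            return "warn"
        return "ok"

tracker = BudgetTracker(budget_usd=10.0)
# Four $3 calls: the third crosses the 80% warning line, the fourth the budget.
statuses = [tracker.record(3.0) for _ in range(4)]
```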

Sample Attribution

All samples track their origin for training transparency:

sample = GeneratedSample(
    prompt=prompt,
    completion=code,
    generator="anthropic/claude-sonnet-4-20250514",
    was_fixed=True,
    fix_iterations=2,
    original_completion=original_broken_code,
    compile_errors_fixed=["C2065", "C2143"]
)

This enables:

  • Understanding which samples were fixed vs. original
  • Tracking fix agent effectiveness
  • Filtering by sample origin for experiments
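With attribution fields attached, that filtering is a one-liner. A minimal sketch using stand-in records with only the relevant fields:

```python
from dataclasses import dataclass

@dataclass
class Record:
    """Minimal stand-in for GeneratedSample; only attribution fields shown."""
    generator: str
    was_fixed: bool

records = [
    Record("anthropic/claude-sonnet-4-20250514", was_fixed=True),
    Record("anthropic/claude-sonnet-4-20250514", was_fixed=False),
]

# Split fixed vs. original samples for a training-data experiment.
fixed = [r for r in records if r.was_fixed]
originals = [r for r in records if not r.was_fixed]
fix_rate = len(fixed) / len(records)
```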