# Distillation

External LLM integration for data generation and code fixing.

The distillation pipeline uses external LLMs (Claude, GPT-4) to generate high-quality training data and fix compilation errors.
## In This Section
- Compilation Fix Agent — Auto-fix compilation errors
- Prompt Engineering — Effective prompts for external LLMs
## Why Distillation?
| Approach | Pros | Cons |
|---|---|---|
| Local RAFT only | Fast, no API costs | Limited by local model quality |
| External LLM only | High quality output | Expensive, can’t fine-tune |
| Distillation | Best of both | Requires pipeline setup |
**The key insight:** use expensive inference to generate data, then train your local model on the verified winners.

```
External LLM → Generate samples
        │
        ▼
Your verification pipeline
        │
        ▼
Filter to verified samples
        │
        ▼
SFT your local model on verified samples
```
## Components

### 1. External Generators

Connect to the Claude or GPT-4 APIs for code generation:
```python
import os

from malagent.generators import create_generator

generator = create_generator(
    provider="anthropic",
    model="claude-sonnet-4-20250514",
    api_key=os.environ["ANTHROPIC_API_KEY"],
)

sample = await generator.generate(prompt)
```
### 2. Compilation Fix Agent

Automatically fix compilation errors using an external LLM:
```python
from malagent.generators import CompileFixAgent

fix_agent = CompileFixAgent(
    generator=generator,
    verifier=verifier,
    max_attempts=3,
)

fixed_sample = await fix_agent.fix(
    original_code=broken_code,
    compiler_error=error_message,
)
```
### 3. Distillation Pipeline

An end-to-end pipeline combining generation, fixing, and verification:
```python
from malagent.distillation import DistillationPipeline

pipeline = DistillationPipeline(
    generator=generator,
    verifier=verifier,
    fix_agent=fix_agent,
)

results = await pipeline.run(
    prompts=prompts,
    samples_per_prompt=5,
    budget_usd=10.0,
)
```
## Workflow
```
┌─────────────────────────────────────────────────────────────┐
│                    DISTILLATION PIPELINE                    │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   1. GENERATE     2. VERIFY      3. FIX        4. STORE     │
│        │              │             │              │        │
│        ▼              ▼             ▼              ▼        │
│   External LLM    Compile +     Fix Agent     Save to       │
│   (Claude/GPT)    Detect        (if failed)   Dataset       │
│                                                             │
│   Prompt ────► Code ────► Result ────► Fixed? ────► Sample  │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```
## Use Cases

### 1. Bootstrap Training Data

Generate initial high-quality samples before any local training:
```bash
malagent distill run \
  --provider anthropic \
  --model claude-sonnet-4-20250514 \
  --prompts data/prompts/techniques.jsonl \
  --output distillation_output \
  --samples 5 \
  --budget 20.0
```
### 2. Fix Compilation Failures

Convert failing RAFT samples into successes:
```python
# In a RAFT cycle, run the fix agent over failed samples
for sample in failed_samples:
    if sample.error_type == "compile":
        fixed = await fix_agent.fix(
            original_code=sample.code,
            compiler_error=sample.error,
        )
        if fixed.success:
            training_samples.append(fixed)
```
### 3. Analyze Failures

Use an external LLM to understand why samples fail or get detected:
```python
analysis = await generator.generate(
    "Analyze why this code was detected by EDR:\n"
    f"Code: {sample.code}\n"
    f"Detection: {sample.detection_rule}\n"
    "Explain the detection and suggest improvements."
)
```
## CLI Commands
```bash
# Run the distillation pipeline
malagent distill run \
  --provider anthropic \
  --model claude-sonnet-4-20250514 \
  --prompts prompts.jsonl \
  --output ./output \
  --budget 10.0

# Analyze results
malagent distill analyze --samples-dir ./output/samples

# Export verified samples to SFT format
malagent distill export \
  --samples-dir ./output/samples \
  --output sft_dataset.jsonl \
  --min-reward 0.5
```
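The export step filters verified samples by reward before writing SFT-ready prompt/completion pairs. A minimal standalone sketch of that filtering logic (the `verified` and `reward` field names are assumptions here; the actual sample schema may differ):

```python
import json


def export_sft(samples_path, output_path, min_reward=0.5):
    """Filter verified samples by reward and write prompt/completion pairs as JSONL."""
    kept = 0
    with open(samples_path) as src, open(output_path, "w") as dst:
        for line in src:
            sample = json.loads(line)
            # Hypothetical fields: adjust to the real sample schema.
            if sample.get("verified") and sample.get("reward", 0.0) >= min_reward:
                dst.write(json.dumps({
                    "prompt": sample["prompt"],
                    "completion": sample["completion"],
                }) + "\n")
                kept += 1
    return kept
```

Keeping the filter as a thin JSONL pass makes it easy to re-export the same raw samples at different reward thresholds without rerunning generation.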
## Cost Management

External LLM calls incur API costs. The pipeline includes budget controls:
```python
pipeline = DistillationPipeline(
    generator=generator,
    budget_usd=10.0,             # Stop when the budget is exceeded
    cost_warning_threshold=0.8,  # Warn at 80% of budget
)
```
Note: Actual costs depend on your provider, model selection, token usage, and current pricing. Check your provider’s pricing page for current rates.
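Budget enforcement of this kind reduces to a running cost accumulator checked around each API call. A minimal sketch of that logic (the `BudgetTracker` class is illustrative, not the library's actual API):

```python
class BudgetTracker:
    """Track cumulative API spend against a hard budget with a warning threshold."""

    def __init__(self, budget_usd, warning_threshold=0.8):
        self.budget_usd = budget_usd
        self.warning_threshold = warning_threshold
        self.spent_usd = 0.0

    def record(self, cost_usd):
        """Record the cost of one call; return True once the warning level is reached."""
        self.spent_usd += cost_usd
        return self.spent_usd >= self.budget_usd * self.warning_threshold

    @property
    def exhausted(self):
        """True once spend meets or exceeds the hard budget (stop issuing calls)."""
        return self.spent_usd >= self.budget_usd
```

A caller would check `exhausted` before each generation request and log a warning the first time `record()` returns True.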
## Sample Attribution

All samples track their origin for training transparency:
```python
sample = GeneratedSample(
    prompt=prompt,
    completion=code,
    generator="anthropic/claude-sonnet-4-20250514",
    was_fixed=True,
    fix_iterations=2,
    original_completion=original_broken_code,
    compile_errors_fixed=["C2065", "C2143"],
)
```
This enables:
- Understanding which samples were fixed vs. original
- Tracking fix agent effectiveness
- Filtering by sample origin for experiments
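For example, the attribution metadata makes it straightforward to split a dataset for ablations, such as comparing models trained with and without fixed samples. A sketch assuming samples are plain dicts carrying the fields shown above (the helper names are hypothetical):

```python
def split_by_origin(samples):
    """Partition samples into originals and fix-agent outputs."""
    original = [s for s in samples if not s.get("was_fixed")]
    fixed = [s for s in samples if s.get("was_fixed")]
    return original, fixed


def fix_agent_stats(samples):
    """Summarize fix-agent effectiveness: fixed share and mean fix iterations."""
    if not samples:
        return {"fixed_fraction": 0.0, "mean_fix_iterations": 0.0}
    _, fixed = split_by_origin(samples)
    mean_iters = (
        sum(s["fix_iterations"] for s in fixed) / len(fixed) if fixed else 0.0
    )
    return {
        "fixed_fraction": len(fixed) / len(samples),
        "mean_fix_iterations": mean_iters,
    }
```

Training one model on `original` only and another on `original + fixed` is a cheap way to measure whether fix-agent output actually helps downstream quality.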