Quick Start
Get malagent running in 30 minutes
Prerequisites
Before starting, ensure you have:
- Training Host: Linux with an AMD GPU (Strix Halo recommended)
- Windows DEVBOX: Windows 10/11 with MSVC Build Tools
- Elastic Security: For detection-based rewards (not required in MVR mode)
- Proxmox Server: For VM pool orchestration (optional)
Installation
1. Clone the Repository
git clone https://github.com/professor-moody/malagent.git
cd malagent
2. Build the Toolbox
The custom ROCm toolbox provides the ML environment:
cd toolbox
./build.sh
3. Create and Enter the Toolbox
# Remove the old toolbox if it exists
toolbox rm -f malagent || true
# Create the toolbox
toolbox create malagent --image localhost/malagent:latest
# Enter the toolbox
toolbox enter malagent
4. Install malagent
pip install -e .
Configuration
Interactive Setup Wizard (Recommended)
The easiest way to configure malagent is with the interactive setup wizard:
malagent setup
This guides you through:
- Mode Selection: Minimal (MVR), Standard (Elastic), or Full (Proxmox)
- Windows Build Server: SSH connection for MSVC compilation
- Elastic Security: Kibana API and detection rules (if selected)
- Proxmox: VM orchestration and template discovery (if selected)
Quick Setup Options
# Minimal setup (Windows build server only)
malagent setup --minimal
# Full setup (all infrastructure)
malagent setup --full
Manual Configuration
If you prefer manual configuration:
cp configs/raft_config.yaml.example configs/raft_config.yaml
Edit the file with your connection details:
# Windows build server
windows:
  host: "10.0.0.152"
  username: "keys"
  key_file: "~/.ssh/win"
# Verifier mode
verifier:
  mode: mvr  # or "elastic" for full detection
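Before running anything against the build server, it can help to sanity-check the edited file. The sketch below is not part of malagent: `check_config` is a hypothetical helper that takes the already-parsed configuration as a dictionary (for example, the result of PyYAML's `yaml.safe_load`) and reports missing values. It uses only the key names from the example above, and the set of accepted modes is an assumption based on the two values that example mentions.

```python
from typing import Any

# Key names taken from the example configuration above; the real schema may
# contain more fields. ALLOWED_MODES is an assumption based on the comment
# in that example.
REQUIRED_WINDOWS_KEYS = ("host", "username", "key_file")
ALLOWED_MODES = ("mvr", "elastic")


def check_config(cfg: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    windows = cfg.get("windows")
    if not isinstance(windows, dict):
        windows = {}  # section missing, empty, or not a mapping
    for key in REQUIRED_WINDOWS_KEYS:
        if not windows.get(key):
            problems.append(f"windows.{key} is missing or empty")
    verifier = cfg.get("verifier")
    mode = verifier.get("mode") if isinstance(verifier, dict) else None
    if mode not in ALLOWED_MODES:
        problems.append(f"verifier.mode is {mode!r}, expected one of {ALLOWED_MODES}")
    return problems


example = {
    "windows": {"host": "10.0.0.152", "username": "keys", "key_file": "~/.ssh/win"},
    "verifier": {"mode": "mvr"},
}
print(check_config(example))  # prints []
```

Because YAML nesting is indentation-sensitive, a file where `host` is not indented under `windows` parses with an empty `windows` section, and the check reports all three keys as missing.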
Verify Connection
malagent test --level smoke
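If the smoke test fails, a plain TCP check can help separate network problems from configuration problems. This is a minimal sketch that is independent of malagent: `port_is_open` is a hypothetical helper, and port 22 is assumed only because the build server is reached over SSH (your server may listen on a different port).

```python
import socket


def port_is_open(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For the example configuration, `port_is_open("10.0.0.152")` returning `False` would point to a network, firewall, or SSH service problem rather than a malagent setting.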
First Training Run
MVR Mode (Compilation Only)
Start with MVR mode for faster iteration:
malagent raft train --mode mvr --prompts data/prompts/mvr_prompt_v2.jsonl
This will:
- Load prompts from the dataset
- Generate code samples
- Compile each sample via SSH
- Filter by compilation success
- Train on successful samples
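The five steps above amount to a generate, verify, filter, train loop. The following is a schematic sketch of one cycle, not malagent's implementation: `generate`, `score`, and `train` are hypothetical callables standing in for the model, the verifier, and the fine-tuning step. The defaults of 8 samples per prompt and a reward threshold of 0.5 mirror the sample output shown in this guide.

```python
from typing import Callable, Iterable


def raft_cycle(
    prompts: Iterable[str],
    generate: Callable[[str, int], list[str]],  # hypothetical: (prompt, n) -> n samples
    score: Callable[[str], float],              # hypothetical: sample -> reward
    train: Callable[[list[tuple[str, str]]], None],  # hypothetical fine-tuning step
    samples_per_prompt: int = 8,
    min_reward: float = 0.5,
) -> list[tuple[str, str]]:
    """Run one generate -> verify -> filter -> train cycle; return the kept pairs."""
    kept = []
    for prompt in prompts:
        for sample in generate(prompt, samples_per_prompt):
            # Keep only samples whose reward clears the threshold.
            if score(sample) >= min_reward:
                kept.append((prompt, sample))
    # Train once per cycle on everything that survived filtering.
    train(kept)
    return kept
```

Passing the steps in as callables keeps the loop identical across verifier modes; only the `score` function would change.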
Monitor Progress
Watch the training output:
RAFT CYCLE 1/3
==============
Generating samples... 569 prompts × 8 samples
Verifying 4552 samples...
Compiled: 1823 (40.0%)
Failed: 2729
Filtering samples...
Kept: 912 samples (reward >= 0.5)
Training on filtered samples...
Loss: 0.856 → 0.342
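The counts in this output can be cross-checked with a little arithmetic: the number of samples is prompts times samples per prompt, and the failures are the samples that did not compile. The snippet below is only a worked check of the figures shown; `cycle_summary` is a hypothetical helper name.

```python
def cycle_summary(prompts: int, per_prompt: int, compiled: int, kept: int) -> dict:
    """Derive totals and rates from the counts reported for one cycle."""
    total = prompts * per_prompt
    return {
        "total": total,
        "failed": total - compiled,
        "compile_rate": compiled / total,
        "kept_rate": kept / total,
    }


summary = cycle_summary(prompts=569, per_prompt=8, compiled=1823, kept=912)
print(summary["total"])   # 4552
print(summary["failed"])  # 2729
```

In this run roughly two in five samples compile, and about one in five of all generated samples is kept for training.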
Elastic Mode (Full Detection)
Once you have Elastic Security configured:
malagent raft train --mode elastic --config configs/elastic_verifier.yaml
CLI Commands Reference
malagent setup                 # Interactive setup wizard
malagent info                  # Show environment info
malagent test --level full     # Run full validation tests
malagent raft train            # Run RAFT training
malagent sft train             # Run SFT training
malagent verify --code x.cpp   # Verify a single file
malagent proxmox status        # Show VM status
malagent elastic rules         # Check detection rules
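These commands can also be chained from a script, for example as a pre-flight check before a long training run. The sketch below is an illustration, not something malagent ships: `run_all` and `PREFLIGHT` are hypothetical names, the commands are copied from this guide, and the code assumes the CLI signals failure through a non-zero exit status.

```python
import subprocess

# Commands copied from this guide, in a plausible pre-flight order.
PREFLIGHT = [
    ["malagent", "info"],
    ["malagent", "test", "--level", "smoke"],
]


def run_all(commands: list[list[str]]) -> bool:
    """Run each command in order; stop and return False at the first failure."""
    for cmd in commands:
        try:
            # Assumes a non-zero exit status means the step failed.
            result = subprocess.run(cmd)
        except FileNotFoundError:
            print(f"not found: {cmd[0]}")
            return False
        if result.returncode != 0:
            print(f"failed: {' '.join(cmd)}")
            return False
    return True
```

Calling `run_all(PREFLIGHT)` inside the toolbox would then return `True` only if both steps succeed.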
Next Steps
- Set up Elastic Controller: Enable detection-based rewards
- Configure VM Pool: For sample execution
- Understand the Pipeline: Full training workflow