Safest open source harness for autonomous AI agents.
Agents can't cheat, leak, or wreak havoc.
Write feature specs, and let containerized agents iterate in a zero-trust sandbox until the code passes your checks and tests.
Safe AI Factory (SAIFAC) is a spec-driven software factory.
Language-agnostic. Use with any agentic CLI. Safe by design.
Code doesn't leave the sandbox until it survives the Gauntlet.
Most AI coding tools put you in the loop. You prompt, you review, you fix, you prompt again. You are the quality gate.
SAIFAC replaces that loop with a deterministic, multi-stage pipeline. The AI iterates inside a locked-down sandbox, getting rejected by your own rules - linters, type-checkers, adversarial reviewer, and hidden tests - until the code actually works. You only see a PR when it has already passed everything.
Proposal
Start with your raw intent - a GitHub issue, a Jira ticket, or just a loose description of the problem. SAIFAC reads it, then hands it to the Spec Agent. You can edit or override the proposal before the next step runs.
Every artifact is inspectable: proposal.md → specification.md → tests.json → PR.
You can read, edit, or override any artifact before the next step begins.
Three things SAIFAC guarantees. Mechanically.
The AI builds exactly what you asked for.
It is locked in a loop and physically cannot stop until your TDD tests pass.
The AI can't break previously-built features.
All features built with SAIFAC are protected by tests the AI cannot modify. Regressions are impossible.
The AI touches nothing outside its sandbox.
Your codebase, your secrets, your machine. All are safe.
AI only sees files tracked by git.
The proof is in the work:
Don't take our word for it.
Test SAIFAC against your own codebase. Give it an issue you already know the answer to and see for yourself.
saifac prove docs →
Built for every layer of your engineering org
SAIFAC isn't just a tool for one person. It's infrastructure for the whole team - the engineer who uses it daily, the manager who relies on it for predictable delivery, the CTO who needs to know it's secure, and the security team that has to sign off on it.
Focus on architectural design. Let the agent do the grinding.
AI agents should speed up your workflow, not leave you with an indecipherable black box.
When things go wrong, most tools leave you a cryptic terminal error and a broken state you have to manually untangle.
SAIFAC does the opposite: when an agent hits its limit, it saves the exact state - the partial diff, the last error, the stack trace.
A VSCode Remote Container opens with everything intact. You fix the blocker and resume.
You own delivery. You can't own what you can't see.
If your team runs agents on their laptops, you have no visibility. Shadow compute. Unknown API spend. Rogue loops burning budget over the weekend while nobody's watching.
The solution: SAIFAC's centralized orchestration plane.
- Live terminal logs from every active agent run across the org
- API spend per team, per feature, per user - mapped to your org structure
- Agent iteration limits - a stuck agent halts automatically
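As a sketch of what these controls could look like in practice, here is a hypothetical configuration fragment. The key names and structure are illustrative assumptions, not SAIFAC's actual config schema:

```yaml
# Hypothetical orchestration config - key names are illustrative,
# not SAIFAC's actual schema.
limits:
  max_iterations: 25        # a stuck agent halts after 25 attempts
budgets:
  team: platform
  monthly_cap_usd: 500      # hard cap on LLM API spend
  alert_threshold: 0.8      # notify owners at 80% of the cap
```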
Don't bet on one tool. Own the workflow instead.
A new AI tool arrives every month. When a better one drops, you're either locked in, or you're rebuilding your workflow from scratch. SAIFAC is a verification engine, not a coding agent. Swap agents and models with ease:
SAIFAC is language-agnostic. Reuse SAIFAC across projects or teams without changing your AI workflow.
You need to trust what runs in your infrastructure.
Most autonomous agents have full access to the developer's machine by default. They can read .env files, access ~/.aws credentials, or call external endpoints. And these tools ask you to trust them...
SAIFAC assumes you won't. And it's built accordingly.
- Ephemeral Docker containers governed by Cedar access policies
- Agent physically blocked from reading secrets or hidden tests
- No persistent state, no backdoors, no lingering access
SAIFAC is open source. Audit the Dockerfiles, review the Cedar policies, inspect the data flow.
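To give a flavor of what such a policy can express, here is a sketch in Cedar's syntax. The entity types, actions, and resource names are illustrative assumptions, not SAIFAC's real policy schema:

```cedar
// Illustrative only - entity and action names are assumptions, not
// SAIFAC's real schema. Cedar is default-deny: anything not
// explicitly permitted is blocked.
permit (
  principal == Agent::"coder",
  action == Action::"NetworkConnect",
  resource == Endpoint::"registry.npmjs.org"
);

// Forbid reads of anything tagged as a secret, even if a broader
// permit exists - in Cedar, forbid always wins over permit.
forbid (principal, action == Action::"ReadFile", resource)
when { resource.is_secret };
```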
Where you need it. How you need it.
SAIFAC is designed to start on your laptop and scale to your entire org. Adopt one ticket at a time, or deploy a full fleet. The factory runs anywhere your Docker daemon does.
Local CLI
Start on your laptop, today
Open source. Runs on your laptop via Docker Compose. Zero infrastructure overhead. Zero config beyond an API key. Pick a ticket, write a proposal, let it run while you work on something else.
View the Docs
Self-Hosted VPC
Full control inside your own infrastructure
Deploy the SAIFAC Control Server inside your own infrastructure via Kubernetes (Helm). Your codebase never leaves your network. Full identity-aware cost attribution, RBAC, org-wide budget caps, and a centralized fleet dashboard.
Managed Cloud
Zero infrastructure overhead
We host the orchestration. You bring your own API keys. Get the full enterprise control plane - fleet observability, budget caps, team management - without standing up a Kubernetes cluster.
In all three tiers, the agents run in ephemeral containers, secrets are never exposed to the agent workspace, and every run is signed with a verifiable Agent Identity.
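One practical payoff of a dedicated Agent Identity is that AI-authored commits can be filtered with plain git. A minimal sketch - the repo and commits here are fabricated for demonstration, and the author name follows the saifac-agent[run-id] convention:

```shell
# Throwaway repo demonstrating author-based filtering of AI commits.
# The saifac-agent[...] author name is illustrative.
git init -q audit-demo && cd audit-demo
git -c user.name='saifac-agent[run-a1b2c3]' -c user.email='agent@saifac.invalid' \
    commit --allow-empty -q -m 'feat: implement user rate limiting'
git -c user.name='Jane Dev' -c user.email='jane@example.com' \
    commit --allow-empty -q -m 'docs: update README'

# Only the AI-authored commit matches:
git log --author='saifac-agent' --format='%an: %s'
```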
SAIFAC treats every AI agent as an insider threat. So should you.
An autonomous coding agent has the same access as a developer with a grudge. It can read your secrets, exfiltrate your codebase, install malicious dependencies, and rewrite test files to fake a passing build - all while looking like it's just doing its job.
The industry has responded with vibes. "We sandboxed it. We trust the model."
SAIFAC responded with architecture. Every security property below is enforced in code.
| The Attack Vector | How SAIFAC Physically Prevents It |
|---|---|
| Agent escapes the sandbox via shared memory, Docker socket access, or by leaving malicious state that persists between iterations | Five independent isolation layers. |
| Agent uses prompt injection to manipulate SAIFAC's own AI components into approving the code | The agent never controls any input that reaches SAIFAC's internal AI agents: neither code changes nor test failures are forwarded to them, so the coding agent cannot inject instructions into any part of the pipeline it doesn't own. The only exception is the AI Reviewer, which checks the semantic correctness of the code; it lives directly in the agent's container and reads the agent's code. Bypassing the Reviewer does NOT guarantee success. |
| Agent modifies or deletes test files to force a passing build (reward hacking) | Every SAIFAC feature has two test sets: the TDD tests the agent iterates against, and hidden holdout tests it can never see or modify. You write these tests yourself; the agent never writes its own tests. |
| Agent reads ~/.aws, .env, or SSH keys and leaks credentials | Secrets are never exposed to the agent. The agent can see only source-controlled files in the project directory. The agent's container cannot see the host filesystem at all. |
| Agent exfiltrates proprietary source code by making outbound HTTP calls to an attacker-controlled endpoint | Prevented in a single step: define network access with Cedar policies. Every outbound connection attempt is intercepted and logged. Traffic to unapproved endpoints is blocked at the syscall level. |
| Agent hallucinates a package name; an attacker has typosquatted it on npm; malware is installed silently | Preventable in three steps. |
SAIFAC is fully open source. Read the Dockerfiles, audit the Cedar policies, inspect every data flow, and verify these properties before deploying a single agent. Read the Full Security Architecture →
The guardrails that let you actually sleep at night.
Security keeps the attacker out. Reliability means an agent stuck in a loop won't burn through your budget. SAIFAC takes care of both.
| The Operational Risk | How SAIFAC Handles It |
|---|---|
| An agent accidentally deletes your staging or production database | The agent never sees your real database. Its Docker network is physically isolated. Instead of a real database, define ephemeral mock services in docker-compose.yml. If an ephemeral service crashes, SAIFAC detects it via health checks and halts the run immediately. |
| An agent gets stuck in a retry loop Friday evening and burns through the team's API budget by Saturday morning | Configure max attempts as a hard circuit breaker. When the limit is reached, SAIFAC halts execution, saves the state, and sends an alert. Set budget caps at the user, team, and department level to limit LLM spend. |
| Your team is running agents on their laptops and nobody has any idea what's happening, what it's costing, or whether any of it is working | SAIFAC has a centralized dashboard (self-hosted or managed cloud). Every run flows through it, regardless of where it was launched. Live terminal logs, API spend per team or per feature, health metrics, and a real-time map of the entire swarm. No shadow compute. No surprise bills. |
| AI-generated code is merged under a developer's identity, making it impossible to distinguish human from AI work in audit logs | Every SAIFAC commit is signed with a dedicated, verifiable Agent Identity (saifac-agent[run-id]). It never inherits your developer's git config. Human commits and AI commits are always distinguishable in your Git history. |
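The mock-service pattern from the first row of this table might be sketched in docker-compose.yml like this. The service name, image, and health check are illustrative, not a SAIFAC-mandated layout:

```yaml
# Illustrative docker-compose.yml fragment: an ephemeral mock database
# on an isolated network, with a health check that orchestration
# tooling can watch to halt the run if the service crashes.
services:
  mock-db:
    image: postgres:16-alpine
    environment:
      POSTGRES_PASSWORD: throwaway   # mock data only, never real creds
    networks: [agent-net]
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      retries: 3

networks:
  agent-net:
    internal: true   # no route to the outside world or your real DB
```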
Your entire AI factory, without leaving your editor.
The SAIFAC CLI is powerful. But you live in your IDE. Context-switching to a terminal to check on a running agent, triage a failed run, or kick off a debug session breaks your flow. The SAIFAC VSCode Extension brings the entire factory into your sidebar. No terminal required.
Launch & monitor runs
Click Run on any feature. Watch the live agent log stream directly in the sidebar. Pause, cancel, or resume without leaving your editor.
One-click debug
When a run fails, click Debug. A VSCode Remote Container opens with the agent's exact state. Fix the blocker and resume - all inside the same window you were already working in.
Manage your feature backlog
Create features and write proposals directly from the sidebar tree view. Your SAIFAC feature backlog lives alongside your code, versioned in Git.
SAIFAC shows its work. That's the point.
Engineers don't trust AI tools that work perfectly on the first try. Neither do we.
What you actually want from an AI agent isn't magic - it's evidence. You want to know what it tried, why it failed, how it corrected itself, and what exactly it proved before opening a PR. SAIFAC attaches a full run log to every PR it opens. To prove it did the work.
feat: implement user rate limiting
opened by saifac-agent[run-a1b2c3] · 3 commits · 14 tests added
Coder Agent implemented Redis sliding window. Holdout tests failed: race condition in test_concurrent_requests. Concurrent writes not atomic. Container reset.
Agent added distributed lock. Gate ✓ Reviewer ✓. Failure: memory_profiler_threshold - lock introduced memory leak under load. Container reset.
Agent simplified implementation. Removed redundant lock layer. All 14 tests green. Gate: ✓ Reviewer: ✓ Holdout Tests: ✓
saifac-agent[run-a1b2c3] · Runtime: 43 min · API cost: $1.42 · Full log: saifac-run-log.md
Yes, SAIFAC took 43 minutes and failed twice before getting it right. That's not a bug - that's the system doing its job. It found a race condition and a memory leak before they reached your PR queue. That's what deterministic verification looks like. It's slower than a magic button. It's faster than your current review cycle.
Ready to stop reviewing broken AI PRs?
Your team is spending hours per week reviewing code that never should have reached the PR queue - wrong architecture, alien patterns, missing edge cases, failing tests caught too late.
SAIFAC enforces rigor before the PR exists.
Don't take our word for it.
Test SAIFAC against your own codebase. Give it an issue you already know the answer to and see for yourself.
saifac prove docs →