So far, the workflow has been straightforward. We collect context from the repository, ask the model for a structured patch, apply that patch with git, rerun tests, and write a report. That loop is powerful because it is grounded in real execution and real exit codes.
When automation can edit files, the main risk is not that it “fails loudly.” The risk is that it succeeds in a way you didn’t intend. Safety patterns are about keeping the loop useful while making surprises unlikely.
In this lesson, we’ll apply a simple safety mindset to the same engineer loop you already have, using the same Excalidraw repo layout you’ve been working with.
A safe agent is not one that never makes mistakes. A safe agent is one that has a small, clear operating area, and stops early when it tries to step outside it.
That leads to three practical ideas.
First, automation should be bounded. If you only want code fixes, the agent should only be allowed to touch code and tests, not workflows, lockfiles, or configuration.
Second, safety should be deterministic. A CI job should not hang waiting for a prompt, and a script should not behave differently depending on whether it has a terminal attached. If something needs approval, approval should be explicit and machine-checkable.
Third, safety should be reviewable. Every run should leave behind artifacts that let you answer, quickly and confidently, what changed and why.
Now we’ll apply these ideas to our case: the engineer loop that generates a unified diff and applies it with git apply.
This repository is a monorepo. Source files and tests typically live under packages/<name>/src/ and packages/<name>/tests/. That gives us a natural boundary for automated edits.
The most reliable place to enforce that boundary is the patch itself. A unified diff contains file headers such as +++ b/packages/math/src/point.ts, which tell us exactly which paths the patch wants to modify. We can extract those paths and reject the patch before git apply ever runs.
This kind of check is intentionally strict. It doesn’t try to be clever, and it doesn’t “fix up” the patch. It simply refuses to apply it when it targets paths outside the allowed space.
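The check above can be sketched in a few lines. This is a minimal illustration, not the course’s actual utility: the names ALLOWED_PREFIXES, ALLOWED_SUBDIRS, and check_patch_bounds are hypothetical, and the allowlist assumes the packages/<name>/src/ and packages/<name>/tests/ layout described earlier.

```python
import re

# Hypothetical allowlist matching the monorepo layout described above.
ALLOWED_PREFIXES = ("packages/",)
ALLOWED_SUBDIRS = ("/src/", "/tests/")

def patch_paths(diff_text: str) -> list[str]:
    """Extract target paths from '+++ b/...' headers in a unified diff."""
    return re.findall(r"^\+\+\+ b/(.+)$", diff_text, flags=re.MULTILINE)

def check_patch_bounds(diff_text: str) -> list[str]:
    """Return the out-of-bounds paths; an empty list means the patch is allowed."""
    bad = []
    for path in patch_paths(diff_text):
        in_bounds = path.startswith(ALLOWED_PREFIXES) and any(
            sub in path for sub in ALLOWED_SUBDIRS
        )
        if not in_bounds:
            bad.append(path)
    return bad
```

A caller would refuse to run git apply whenever check_patch_bounds returns a non-empty list, which keeps the failure mode a loud, early rejection rather than a silent edit.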
Even if a patch only touches allowed directories, it can still be too broad. When a “minimal fix” request turns into many files changed, review becomes harder and risk goes up.
Instead of prompting with input(), which is fragile in automation, we use an environment-variable approval. That keeps the behavior consistent locally and in CI, and it makes approval visible in logs and configuration.
This gate is not trying to measure how “good” a change is. It’s a checkpoint that says, “this is bigger than expected, so we require explicit consent.”
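A gate like this can be as simple as a threshold plus an environment check. The threshold value, the variable name AGENT_APPROVE_WIDE_PATCH, and the function name are all assumptions for illustration; the real loop may choose different ones.

```python
import os

# Hypothetical threshold: patches touching more files than this need approval.
MAX_FILES_WITHOUT_APPROVAL = 3

def require_approval_if_wide(changed_paths: list[str]) -> None:
    """Raise unless the change is small or explicitly approved via env var."""
    if len(changed_paths) <= MAX_FILES_WITHOUT_APPROVAL:
        return
    if os.environ.get("AGENT_APPROVE_WIDE_PATCH") == "1":
        return  # Explicit, machine-checkable consent was given.
    raise RuntimeError(
        f"Patch touches {len(changed_paths)} files "
        f"(limit {MAX_FILES_WITHOUT_APPROVAL}); "
        "set AGENT_APPROVE_WIDE_PATCH=1 to approve."
    )
```

Because the decision comes from the environment rather than a terminal prompt, the same run behaves identically in CI and locally, and the approval itself shows up in the job’s configuration.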
After the patch is applied and tests are rerun, you want a durable record of what actually happened. The simplest and most useful record is the final git diff HEAD, captured after the run. That diff is the truth of the working tree, regardless of what the plan said.
Your reporting stage already captures git diff HEAD. The safety mindset here is to treat that as non-optional: every run should produce a report that includes the final diff so a human can review it quickly.
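Capturing that record is a single subprocess call. The helper name below is hypothetical; the git diff HEAD command itself is the real one the reporting stage runs.

```python
import subprocess

def capture_final_diff(repo_dir: str = ".") -> str:
    """Capture the true final state of the working tree as a unified diff."""
    result = subprocess.run(
        ["git", "diff", "HEAD"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # fail loudly if git itself errors
    )
    return result.stdout
```

The returned text can be embedded verbatim in the run’s report so a reviewer sees exactly what changed, independent of what the model claimed it would change.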
These checks belong at the boundary between “model output” and “real side effects.” In the engineer loop, that boundary is the moment right before apply_patch(...).
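To make the placement concrete, here is one way the checkpoint could be wired, with both checks inlined immediately before the apply step. Everything here is a self-contained sketch under assumed names (the prefix, the limit, the env var); only the git apply invocation is the real command the loop uses.

```python
import os
import re
import subprocess

# Assumed policy values for this sketch.
ALLOWED_PREFIX = "packages/"
MAX_FILES_WITHOUT_APPROVAL = 3

def guarded_apply(diff_text: str, repo_dir: str = ".") -> None:
    """Run the safety checks at the boundary, then hand the patch to git."""
    paths = re.findall(r"^\+\+\+ b/(.+)$", diff_text, flags=re.MULTILINE)

    # Bound check: refuse patches that escape the allowed directories.
    outside = [p for p in paths if not p.startswith(ALLOWED_PREFIX)]
    if outside:
        raise RuntimeError(f"Patch targets out-of-bounds paths: {outside}")

    # Width check: large patches need explicit, machine-checkable consent.
    if (len(paths) > MAX_FILES_WITHOUT_APPROVAL
            and os.environ.get("AGENT_APPROVE_WIDE_PATCH") != "1"):
        raise RuntimeError(
            f"Patch touches {len(paths)} files; "
            "set AGENT_APPROVE_WIDE_PATCH=1 to approve."
        )

    # Only after both checks pass do real side effects happen.
    subprocess.run(
        ["git", "apply"],
        input=diff_text,
        text=True,
        cwd=repo_dir,
        check=True,
    )
```

Note that both rejections happen before any subprocess runs, so a bad patch never touches the working tree.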
From here, the loop proceeds exactly as before. You verify using the test exit code through verify_fix(), and you write REPORT.md through generate_report(...), which includes the final git diff HEAD.
You now have a simple safety layer that matches the way the rest of this course works. The model can still propose fixes, but the patch cannot escape the directories you intended, large changes cannot slip through without explicit approval, and every run produces an artifact that shows the real final diff.
Next, you’ll implement these checks in the engineer loop utilities and see how they behave when the model proposes changes that are out of bounds or unexpectedly wide.
