So far, the workflow has been straightforward. We collect context from the repository, ask the model for a structured patch, apply that patch with git, rerun tests, and write a report. That loop is powerful because it is grounded in real execution and real exit codes.
When automation can edit files, the main risk is not that it “fails loudly.” The risk is that it succeeds in a way you didn’t intend. Safety patterns are about keeping the loop useful while making surprises unlikely.
In this lesson, we’ll apply a simple safety mindset to the same engineer loop you already have, using the same Excalidraw repo layout you’ve been working with.
A safe agent is not one that never makes mistakes. A safe agent is one that has a small, clear operating area, and stops early when it tries to step outside it.
That leads to three practical ideas.
First, automation should be bounded. If you only want code fixes, the agent should only be allowed to touch code and tests, not workflows, lockfiles, or configuration.
Second, safety should be deterministic. A CI job should not hang waiting for a prompt, and a script should not behave differently depending on whether it has a terminal attached. If something needs approval, approval should be explicit and machine-checkable.
Third, safety should be reviewable. Every run should leave behind artifacts that let you answer, quickly and confidently, what changed and why.
Now we’ll apply these ideas to our case: the engineer loop that generates a unified diff and applies it with git apply.
This repository is a monorepo. Source files and tests typically live under packages/<name>/src/ and packages/<name>/tests/. That gives us a natural boundary for automated edits.
The most reliable place to enforce that boundary is the patch itself. A unified diff contains file headers such as +++ b/packages/math/src/point.ts, which tell us exactly which paths the patch wants to modify. We can extract those paths and reject the patch before git apply ever runs.
This kind of check is intentionally strict. It doesn’t try to be clever, and it doesn’t “fix up” the patch. It simply refuses to apply it when it targets paths outside the allowed space.
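The check above can be sketched in a few lines. This is a minimal illustration, not the course’s actual utility: the names ALLOWED_PREFIXES, ALLOWED_SUBDIRS, and check_patch_bounds are hypothetical, and the allowlist assumes the packages/<name>/src/ and packages/<name>/tests/ layout described earlier.

```python
import re

# Hypothetical allowlist matching the monorepo layout described above.
ALLOWED_PREFIXES = ("packages/",)
ALLOWED_SUBDIRS = ("/src/", "/tests/")

def patch_paths(diff_text: str) -> list[str]:
    """Extract target paths from '+++ b/...' headers in a unified diff."""
    return re.findall(r"^\+\+\+ b/(.+)$", diff_text, flags=re.MULTILINE)

def check_patch_bounds(diff_text: str) -> list[str]:
    """Return the out-of-bounds paths; an empty list means the patch is allowed."""
    bad = []
    for path in patch_paths(diff_text):
        in_bounds = path.startswith(ALLOWED_PREFIXES) and any(
            sub in path for sub in ALLOWED_SUBDIRS
        )
        if not in_bounds:
            bad.append(path)
    return bad
```

A caller would refuse to run git apply whenever check_patch_bounds returns a non-empty list, which keeps the failure mode a loud, early rejection rather than a silent edit.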
Even if a patch only touches allowed directories, it can still be too broad. When a “minimal fix” request turns into many files changed, review becomes harder and risk goes up.
Instead of prompting with input(), which is fragile in automation, we use an environment-variable approval. That keeps the behavior consistent locally and in CI, and it makes approval visible in logs and configuration.
This gate is not trying to measure how “good” a change is. It’s a checkpoint that says, “this is bigger than expected, so we require explicit consent.”
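A gate like this can be as simple as a threshold plus an environment check. The threshold value, the variable name AGENT_APPROVE_WIDE_PATCH, and the function name are all assumptions for illustration; the real loop may choose different ones.

```python
import os

# Hypothetical threshold: patches touching more files than this need approval.
MAX_FILES_WITHOUT_APPROVAL = 3

def require_approval_if_wide(changed_paths: list[str]) -> None:
    """Raise unless the change is small or explicitly approved via env var."""
    if len(changed_paths) <= MAX_FILES_WITHOUT_APPROVAL:
        return
    if os.environ.get("AGENT_APPROVE_WIDE_PATCH") == "1":
        return  # Explicit, machine-checkable consent was given.
    raise RuntimeError(
        f"Patch touches {len(changed_paths)} files "
        f"(limit {MAX_FILES_WITHOUT_APPROVAL}); "
        "set AGENT_APPROVE_WIDE_PATCH=1 to approve."
    )
```

Because the decision comes from the environment rather than a terminal prompt, the same run behaves identically in CI and locally, and the approval itself shows up in the job’s configuration.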
After the patch is applied and tests are rerun, you want a durable record of what actually happened. The simplest and most useful record is the final git diff HEAD, captured after the run. That diff is the truth of the working tree, regardless of what the plan said.
Your reporting stage already captures git diff HEAD. The safety mindset here is to treat that as non-optional: every run should produce a report that includes the final diff so a human can review it quickly.
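Capturing that record is a single subprocess call. The helper name below is hypothetical; the git diff HEAD command itself is the real one the reporting stage runs.

```python
import subprocess

def capture_final_diff(repo_dir: str = ".") -> str:
    """Capture the true final state of the working tree as a unified diff."""
    result = subprocess.run(
        ["git", "diff", "HEAD"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # fail loudly if git itself errors
    )
    return result.stdout
```

The returned text can be embedded verbatim in the run’s report so a reviewer sees exactly what changed, independent of what the model claimed it would change.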
These checks belong at the boundary between “model output” and “real side effects.” In the engineer loop, that boundary is the moment right before apply_patch(...).
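To make the placement concrete, here is one way the checkpoint could be wired, with both checks inlined immediately before the apply step. Everything here is a self-contained sketch under assumed names (the prefix, the limit, the env var); only the git apply invocation is the real command the loop uses.

```python
import os
import re
import subprocess

# Assumed policy values for this sketch.
ALLOWED_PREFIX = "packages/"
MAX_FILES_WITHOUT_APPROVAL = 3

def guarded_apply(diff_text: str, repo_dir: str = ".") -> None:
    """Run the safety checks at the boundary, then hand the patch to git."""
    paths = re.findall(r"^\+\+\+ b/(.+)$", diff_text, flags=re.MULTILINE)

    # Bound check: refuse patches that escape the allowed directories.
    outside = [p for p in paths if not p.startswith(ALLOWED_PREFIX)]
    if outside:
        raise RuntimeError(f"Patch targets out-of-bounds paths: {outside}")

    # Width check: large patches need explicit, machine-checkable consent.
    if (len(paths) > MAX_FILES_WITHOUT_APPROVAL
            and os.environ.get("AGENT_APPROVE_WIDE_PATCH") != "1"):
        raise RuntimeError(
            f"Patch touches {len(paths)} files; "
            "set AGENT_APPROVE_WIDE_PATCH=1 to approve."
        )

    # Only after both checks pass do real side effects happen.
    subprocess.run(
        ["git", "apply"],
        input=diff_text,
        text=True,
        cwd=repo_dir,
        check=True,
    )
```

Note that both rejections happen before any subprocess runs, so a bad patch never touches the working tree.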
From here, the loop proceeds exactly as before. You verify using the test exit code through verify_fix(), and you write REPORT.md through generate_report(...), which includes the final git diff HEAD.
You now have a simple safety layer that matches the way the rest of this course works. The model can still propose fixes, but the patch cannot escape the directories you intended, large changes cannot slip through without explicit approval, and every run produces an artifact that shows the real final diff.
Next, you’ll implement these checks in the engineer loop utilities and see how they behave when the model proposes changes that are out of bounds or unexpectedly wide.
