Welcome back! In the previous lessons, we built a complete Repo Navigator skill in a single SKILL.md file. It worked great: role definition, discovery process, output format, and constraints all in one place. For learning, that single-file approach was perfect—you could see the whole picture.
But here's what happens in production: that skill grows. Your team adds Rust detection. Someone tweaks the output format. Another developer adds verification steps. Each change touches the same file, increasing the risk of breaking something. Plus, when you want to reuse the output format in another skill or share the safety constraints across multiple skills, you end up duplicating content.
This lesson shows you how to evolve that single-file skill into a multi-file knowledge system where different concerns live in different places. We'll take the exact skill you built and restructure it so it's easier to maintain, safer to modify, and more powerful to use.
Before we dive into restructuring, let's understand the deeper principle at work here. When we talk about "skills as knowledge systems," we're drawing on a fundamental idea from software engineering: separation of concerns.
Think about how a well-designed application works. You don't put database queries, business logic, and HTML templates all in one giant function. Instead, you separate them into layers: data access, domain logic, presentation. Each layer has a clear responsibility, and changes to one layer don't ripple through to others. This separation makes the system maintainable, testable, and comprehensible.
Skills follow the same principle. A skill isn't just one thing—it's actually several distinct kinds of knowledge working together:
- Procedural knowledge: "How do I detect what languages are in this repo?"
- Output contracts: "What structure should my results follow?"
- Safety knowledge: "What rules prevent me from making mistakes?"
- Automation knowledge: "What repetitive checks can be scripted?"
- Reference knowledge: "What does good output look like?"
When these live in one file, they're entangled. Changing how you detect languages might accidentally affect your output format. Adding a safety rule might break an example. But when each type of knowledge lives in its own file, you can evolve each aspect independently.
The philosophy here is that complexity should be managed through modularity, not through clever organization within a single file. As your skill grows more sophisticated—handling more repository types, providing richer output, ensuring stronger safety guarantees—you don't want that complexity concentrated in one place. You want it distributed across focused components that each do one thing well.
This isn't just about keeping files small. It's about creating clear interfaces between different types of knowledge. When a checklist references a script, that's an interface. When SKILL.md points to a template, that's an interface. Interfaces let different parts evolve independently while maintaining compatibility. They let different people own different aspects. They make the system resilient to change.
Let's look at what we built in the previous lesson. Our SKILL.md contained:
This single file mixes four different concerns: procedures (how to discover things), templates (how to format output), safety rules (what not to do), and the role/task definition. When you need to update the table format, you're scrolling past discovery steps. When adding a new language, you might accidentally modify the output template. The boundaries aren't clear.
Here's what happens in practice: You want to add TypeScript detection. You scroll to the language detection section, add a line, and save. Later, someone reports the output format broke. Turns out while scrolling, you accidentally deleted a line from the Output Format section. Or someone adds a new constraint but places it in the middle of discovery steps, making the procedure hard to follow. Or you want to reuse the output format in another skill, so you copy-paste the template—now you're maintaining two copies.
These aren't hypothetical problems. They're what teams experience with growing single-file skills.
Here's the key insight: SKILL.md should orchestrate, not contain. We'll restructure our Repo Navigator like this:
Now each concern lives in its own file. This structure reflects our philosophy: different types of knowledge get different homes. Procedures live in checklists/, contracts in templates/, automation in scripts/, and reference material in examples/.
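As a rough sketch of that layout, the shell function below scaffolds empty placeholder files under a skill directory. The file names come from this lesson; the function name `scaffold_skill` and the `.codex/skills/repo-navigator` location in the usage note are assumptions made for illustration.

```shell
# Hypothetical helper: creates the multi-file layout described in this lesson.
# File names follow the lesson; the function itself is illustrative only.
scaffold_skill() {
  skill_dir="$1"

  # One directory per kind of knowledge.
  mkdir -p "$skill_dir/checklists" "$skill_dir/templates" \
           "$skill_dir/scripts" "$skill_dir/examples"

  # touch creates missing files and leaves existing content untouched.
  touch "$skill_dir/SKILL.md" \
        "$skill_dir/checklists/discovery.md" \
        "$skill_dir/checklists/honesty-guardrails.md" \
        "$skill_dir/templates/output-table.md" \
        "$skill_dir/scripts/scan-languages.sh" \
        "$skill_dir/examples/node-monorepo.example.md"

  # List the resulting files so the layout is visible at a glance.
  (cd "$skill_dir" && find . -type f | sort)
}
```

Calling `scaffold_skill .codex/skills/repo-navigator` would create the directories and list the six files. Because `touch` does not overwrite content, it should be safe to rerun on a skill that is already partly built.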
Before we build this structure, let's address the operational questions you'll face when running multi-file skills in practice. Understanding how Codex activates skills and resolves file references is crucial for debugging when things don't work as expected.
When you ask "What's the tech stack?", how do you know if the Repo Navigator skill actually activated? You have several ways to check:
Ask Codex directly:
Asked which skill it is using, Codex will usually answer explicitly, for example "I'm using the repo-navigator skill" or "No specific skill activated; I'm using general knowledge."
Check Codex's response structure:
If the output matches your template exactly (the table format from templates/output-table.md), that's strong evidence the skill fired. If Codex responds with prose or a different structure, it probably didn't activate.
Examine the .codex/logs directory (if available): Some Codex implementations maintain activation logs showing which skills triggered for each request. Check your specific Codex environment's documentation for log locations.
Common failure modes:
- Trigger too narrow: Your trigger says "when the user asks about repository structure" but the user said "show me the project layout", which is close but doesn't match
- Missing frontmatter: If `trigger:` isn't in the YAML frontmatter, Codex won't know when to activate
- Skill not registered: The skill file exists but isn't in a location Codex scans (typically `.codex/skills/*/SKILL.md`)
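The two file-level failure modes above (missing frontmatter key, skill not registered) can be checked mechanically before you start debugging prompts. The function below is a minimal sketch, assuming the frontmatter is the block between the first two `---` lines and that `trigger:` starts at the beginning of a line. The name `check_skill_file` is hypothetical and is not a Codex command.

```shell
# Hypothetical pre-flight check for one skill entrypoint.
# Prints a diagnosis and returns non-zero when a problem is found.
check_skill_file() {
  skill_file="$1"

  # Failure mode: the file is not where we expect it to be.
  if [ ! -f "$skill_file" ]; then
    echo "not registered: $skill_file does not exist"
    return 1
  fi

  # Failure mode: no trigger key inside the frontmatter block.
  # awk prints only the lines between the opening and closing --- fences.
  if ! awk '
      NR == 1 && $0 != "---" { exit }          # no frontmatter at all
      $0 == "---" { if (++n == 2) exit; next } # stop at the closing fence
      n == 1 { print }
    ' "$skill_file" | grep -q '^trigger:'; then
    echo "missing frontmatter: no trigger: key in $skill_file"
    return 1
  fi

  echo "ok: $skill_file"
}
```

A loop such as `for f in .codex/skills/*/SKILL.md; do check_skill_file "$f"; done` would cover every skill in the typical location. A trigger that is too narrow cannot be detected this way; that still needs a test prompt.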
Codex automatically reads referenced files when the skill activates. You don't need to explicitly include their contents in SKILL.md.
When SKILL.md says "follow the checklist at checklists/discovery.md", Codex:
- Locates that file relative to the skill directory
- Reads its entire contents
- Incorporates that knowledge into its working context
- Follows the checklist as if it were written directly in SKILL.md
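Because references are resolved relative to the skill directory, a mistyped path is an easy way to break a skill silently. The sketch below is not how Codex resolves files; it is a hypothetical lint, assuming references are written as plain relative paths under the four subdirectories used in this lesson. The function name `list_missing_references` is illustrative.

```shell
# Hypothetical lint: print every file that SKILL.md mentions but that does
# not exist relative to the skill directory. Prints nothing when all resolve.
list_missing_references() {
  skill_dir="$1"

  # Pull out paths such as checklists/discovery.md, drop a trailing
  # sentence period, de-duplicate, then test each one on disk.
  grep -oE '(checklists|templates|scripts|examples)/[A-Za-z0-9._-]+' \
      "$skill_dir/SKILL.md" |
    sed 's/[.]$//' |
    sort -u |
    while IFS= read -r ref; do
      [ -f "$skill_dir/$ref" ] || echo "$ref"
    done
}
```

Running `list_missing_references .codex/skills/repo-navigator` after a rename or a move gives a quick signal that SKILL.md still points at real files.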
Now let's build the structure. First, we slim down our entrypoint. Here's what our new SKILL.md looks like:
That's it! From 80+ lines down to about 25. The role and trigger stay (they're unique to this skill), but everything else is referenced. Notice how the Task section tells Codex where to find instructions rather than duplicating them.
This is the orchestration layer. It says "You're a repository analyst. Follow these checklists. Use these templates. Check this example." It doesn't try to contain all the knowledge—it points to it.
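To make the orchestration idea concrete, here is one way such an entrypoint might read, written out by a small shell helper. This is a sketch, not the lesson's exact file: the `name:` key, the section headings, and the wording of each step are assumptions. The `trigger:` key, the repository analyst role, and the referenced file names come from this lesson.

```shell
# Hypothetical helper: writes a slim, orchestrating SKILL.md.
# Frontmatter keys other than trigger: and all section wording are assumptions.
write_entrypoint() {
  skill_dir="$1"
  mkdir -p "$skill_dir"

  # Quoted heredoc delimiter: the content is written literally, no expansion.
  cat > "$skill_dir/SKILL.md" <<'EOF'
---
name: repo-navigator
trigger: when the user asks about repository structure
---

# Role
You are a repository analyst.

# Task
1. Follow the checklist at checklists/discovery.md.
2. Verify every finding per checklists/honesty-guardrails.md.
3. Format the result using templates/output-table.md.
4. Compare against examples/node-monorepo.example.md before responding.
EOF
}
```

The file states who the skill is and where its knowledge lives, and nothing else. Every procedure, rule, and format is one reference away rather than inline.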
Take all those discovery steps from the original SKILL.md and move them to checklists/discovery.md:
Now when you need to add Rust detection or Bazel support, you edit one focused file. The "Tooling (optional)" section hints that a script can help, but manual checks are always available. This embodies our philosophy: automation enhances but doesn't replace procedural knowledge.
Extract all your constraints into checklists/honesty-guardrails.md:
These safety rules are a distinct type of knowledge: constraints that prevent mistakes. Putting them in their own file makes them much harder to delete or modify by accident while you work on procedures. And critically, other skills can reference this same file. When you build a "Dependency Analyzer" skill or a "Security Checker" skill, they can use honesty-guardrails.md too, ensuring consistent safety standards across your entire skill library.
Extract the table structure into templates/output-table.md:
This template is an output contract. It promises that every invocation of this skill will produce data in this exact structure, with these exact fields, following these exact conventions. Tools that parse skill output can depend on this contract. When you need to add a "CI/CD" row, you update one template and every subsequent invocation of the skill picks up the new format.
You can also create templates/output-example.json for tools that prefer JSON output. The philosophy: provide multiple presentation formats while keeping the underlying data gathering consistent.
Here's where we introduce optional automation. Scripts represent automation knowledge—understanding what can be mechanized safely. Create scripts/scan-languages.sh:
The line `set -euo pipefail` makes the script fail fast and predictably: `-e` stops execution if a command fails, `-u` treats unset variables as errors, and `pipefail` ensures a pipeline fails if any command in it fails. This makes helper scripts safer and easier to debug because they don't silently continue after something goes wrong.
The final line uses jq, a lightweight command-line tool for working with JSON. Here, jq converts each detected language into a JSON string and then collects those strings into a JSON array, so the script outputs structured data rather than plain text. If jq isn't available in the environment, that's fine—the discovery checklist still provides manual detection steps, and the skill can continue without the script.
This script is read-only (no file modifications), stdout-only (just prints results), and fail-safe (if it errors, the checklist's manual steps still work). The philosophy: scripts are power-ups, not requirements. They accelerate work when available but don't block progress when absent.
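A script with the properties described above might look like the following sketch. It is a reconstruction under assumptions, not the lesson's original file: the marker files and language names are illustrative, and the hand-built fallback for environments without `jq` is an addition so the sketch still prints something useful there.

```shell
#!/usr/bin/env bash
# Sketch of scripts/scan-languages.sh. Marker files below are assumptions.
# Read-only and stdout-only: it tests for files and prints a JSON array.
set -euo pipefail

scan_languages() {
  local root="${1:-.}"
  local langs=()

  # Each marker file hints at one language or ecosystem.
  if [[ -f "$root/package.json" ]]; then langs+=("JavaScript"); fi
  if [[ -f "$root/tsconfig.json" ]]; then langs+=("TypeScript"); fi
  if [[ -f "$root/Cargo.toml" ]]; then langs+=("Rust"); fi
  if [[ -f "$root/go.mod" ]]; then langs+=("Go"); fi
  if [[ -f "$root/pyproject.toml" || -f "$root/requirements.txt" ]]; then
    langs+=("Python")
  fi

  # Nothing detected: emit an empty array rather than a blank string.
  if (( ${#langs[@]} == 0 )); then
    echo '[]'
    return 0
  fi

  if command -v jq >/dev/null 2>&1; then
    # -R turns each input line into a JSON string; -s collects the strings
    # into one array; -c prints that array on a single line.
    printf '%s\n' "${langs[@]}" | jq -R . | jq -s -c .
  else
    # Assumed fallback: safe only because the names above need no escaping.
    local joined
    joined=$(printf '"%s",' "${langs[@]}")
    printf '[%s]\n' "${joined%,}"
  fi
}

scan_languages "${1:-.}"
```

Run from a repository root as `scripts/scan-languages.sh`, or pass a path as the first argument. If the script is missing or exits with an error, the manual steps in the discovery checklist cover the same ground.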
Create examples/node-monorepo.example.md:
Examples represent reference knowledge—what good looks like in practice. They serve multiple purposes: documentation showing expected behavior, informal tests (if the skill can't produce this output, something's broken), and training material for new team members. As your skill encounters more repository types, you can add more examples that capture hard-won understanding about edge cases and variations.
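The "informal test" role of an example can be made slightly more formal. The sketch below assumes both the skill output and the example contain a single markdown table whose first column holds the row label; it compares only those labels, since the values will differ from repository to repository. The function names are hypothetical.

```shell
# Hypothetical helpers for comparing a skill's output against an example.

# Print the first cell of every body row in a markdown table.
table_row_labels() {
  awk -F'|' '
    /^\|/ && $2 !~ /^[ :-]+$/ {      # table lines, minus the |---| separator
      gsub(/^ +| +$/, "", $2)        # trim padding around the label
      if (seen_header) print $2      # the first table line is the header
      seen_header = 1
    }' "$1"
}

# Succeed when output and example have the same row labels in the same order.
matches_example() {
  out_labels=$(table_row_labels "$1")
  example_labels=$(table_row_labels "$2")
  [ "$out_labels" = "$example_labels" ]
}
```

For instance, `matches_example out.md examples/node-monorepo.example.md` would fail if the output dropped or reordered a row, which is the kind of drift an example is meant to catch.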
Let's trace what happens when someone asks "What's the tech stack?":
- SKILL.md activates and establishes the repository analyst role
- Opens `checklists/discovery.md` and follows steps sequentially
- Optionally runs `scripts/scan-languages.sh` for faster detection
- Verifies results per `checklists/honesty-guardrails.md` (checks files actually exist)
- Formats output using `templates/output-table.md` structure
- Compares against the example to ensure output matches expected patterns
Each file plays its role. SKILL.md orchestrates. Checklists guide behavior. Scripts accelerate. Templates enforce structure. Examples validate. This is modularity in action: separate components with clear interfaces cooperating to achieve a shared goal.
Why is this better than our single-file version?
Isolated changes: Update language detection in discovery.md without touching output formatting. Change table structure in templates/output-table.md without affecting discovery logic. Add safety rules in honesty-guardrails.md without scrolling through procedures. Each change has a clear, limited scope.
Easy reuse: Multiple skills can reference the same honesty-guardrails.md. Different skills can use the same output template. Scripts can be shared across related skills. You build a library of components, not isolated monoliths. This is the compound interest of good architecture.
Team-friendly: Different people can own different parts. Your security expert maintains guardrails. Your documentation specialist owns templates. Your automation engineer writes scripts. Because edits land in separate files, everyone contributes with fewer merge conflicts and less stepping on toes.
Natural growth: When you discover a new repository pattern, add it to the checklist. When automation opportunities emerge, add a script. When edge cases appear, add an example. The structure supports continuous improvement without rewrites. Complexity doesn't accumulate in one file—it distributes across focused components.
Multi-file skills transform single-file procedures into knowledge systems where concerns are separated: SKILL.md orchestrates, checklists/ hold procedures, templates/ define output contracts, scripts/ add optional automation, and examples/ validate behavior.
We took the Repo Navigator skill you built in previous lessons and restructured it into focused components. The core logic didn't change—it's still detecting languages, finding commands, and producing structured reports. But now it's easier to maintain (isolated changes), safer to modify (clear boundaries), and more powerful (reusable components).
This structure reflects a deeper philosophy: complexity should be managed through modularity. Different types of knowledge get different homes, creating clear interfaces that make skills resilient to change and friendly to teams. The structure emerges naturally as skills mature. You don't need to build it upfront; extract components when the need becomes clear.
