Welcome back to the Capstone Project course! In the previous unit, we mapped out the architecture for our Documentation Quality Automation System. Now, in this second unit, we're ready to roll up our sleeves and build it.
This lesson walks through the complete implementation, from writing the first line of configuration to deploying a production-ready system. We'll create each component step by step: project context, specialized skills, automation hooks, the agent application, and CI/CD integration. By the end, you'll have a working system that automatically reviews documentation quality in pull requests.
The implementation follows a logical progression, starting with foundational pieces and building up to the integrated system. Let's begin by establishing our implementation strategy.
Before writing any code, let's outline our implementation approach. Building a multi-component system requires careful sequencing to avoid rework and integration headaches.
We'll follow a seven-phase implementation plan:
- Phase 1: Establish project context through CLAUDE.md
- Phase 2: Define quality standards via Skills
- Phase 3: Configure bash hooks for local automation
- Phase 4: Build the Agent SDK application
- Phase 5: Set up the GitHub Actions workflow
- Phase 6: Test the complete system locally
- Phase 7: Package for deployment with Docker
This sequence ensures each layer builds on the previous one. Skills reference the context from CLAUDE.md. The agent application uses Skills. CI/CD runs the agent. Each phase delivers a testable increment, allowing us to validate as we build.
Our first step is creating CLAUDE.md to provide the project context and documentation standards. This file serves as the knowledge base for all subsequent automation components.
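A minimal sketch of such a file (the exact wording, directory layout, and requirement list here are illustrative):

```markdown
# Project Context

This repository contains Docusaurus-based product documentation in `docs/`.

## Documentation Quality Requirements

1. Clear, concise writing (verifiable via readability checks)
2. All internal and external links must be valid
3. Every code block declares a language for syntax highlighting
4. Every page opens with a brief introduction
5. All images include descriptive alt text

## Docusaurus MDX Conventions

- Front matter must include `title` and `description`
- Use relative paths for internal links
- Import MDX components before using them
```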
This context establishes what matters for this project. We define five quality requirements that any documentation change must meet. The Docusaurus MDX conventions specify technical requirements specific to the platform. Notice how we keep standards concrete and actionable; "clear, concise writing" is measurable through readability checks, and "all links valid" can be automatically verified.
With project context established, we now define Skills that encode our quality standards. We'll create three specialized Skills, each focusing on a different aspect of documentation quality.
First, the docs-quality skill:
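A sketch of what this skill's SKILL.md might contain (the frontmatter fields follow the usual Skill format; the specific checklist items are illustrative):

```markdown
---
name: docs-quality
description: Structural quality checklist for documentation pages
---

# Documentation Quality Checklist

- [ ] Page opens with a brief introduction section
- [ ] Headings follow a logical hierarchy with no skipped levels
- [ ] Every code block declares a language for syntax highlighting
- [ ] All images include descriptive alt text
- [ ] Terminology is used consistently across the page
```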
This skill provides a concrete checklist for structural quality. Each item is verifiable: we can scan for introduction sections, check code blocks for language tags, and validate that images include alt text. The checklist format makes it easy for the agent to systematically evaluate documentation.
Next, we create the seo-standards skill to ensure search engine optimization:
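A sketch of this skill (the specific thresholds are illustrative defaults, not requirements):

```markdown
---
name: seo-standards
description: Search engine optimization requirements for documentation
---

# SEO Standards

- Front matter includes a `description` between 50 and 160 characters
- Exactly one H1 per page, containing the primary topic keyword
- Page titles are descriptive and under 60 characters
- Link text is meaningful (no bare "click here")
```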
The link-validation skill defines how we check links:
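A sketch of this skill's instructions (procedure details are illustrative):

```markdown
---
name: link-validation
description: Procedure for verifying internal and external links
---

# Link Validation

- Resolve relative links against `docs/` and confirm the target file exists
- Fetch external URLs and flag any non-2xx response
- Check that anchor fragments match a heading in the target page
```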
These two Skills complement the docs-quality skill by addressing specific technical requirements. SEO standards ensure discoverability, while link validation prevents the frustrating experience of broken links. Together, our three Skills form a comprehensive quality framework.
Now we configure bash hooks to provide immediate feedback during local development. These hooks automatically log documentation file creations and edits for audit purposes.
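A sketch of such a hook script; the log path and the use of `python3` for JSON parsing are implementation choices made here for portability, not requirements:

```bash
#!/usr/bin/env bash
# PostToolUse hook: audit-log Write/Edit operations on documentation files.
input=$(cat)  # Claude Code passes hook data as JSON on stdin

# Extract the tool name and target file path from the JSON payload.
tool_name=$(printf '%s' "$input" | python3 -c \
  'import json,sys; print(json.load(sys.stdin).get("tool_name",""))' 2>/dev/null)
file_path=$(printf '%s' "$input" | python3 -c \
  'import json,sys; print(json.load(sys.stdin).get("tool_input",{}).get("file_path",""))' 2>/dev/null)

if [ "$tool_name" = "Write" ] || [ "$tool_name" = "Edit" ]; then
  case "$file_path" in
    *.md|*.mdx)
      mkdir -p .claude
      echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) - $file_path" >> .claude/doc-audit.log
      ;;
  esac
fi

exit 0  # never block the operation
```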
This hook captures both Write and Edit operations on documentation files. We parse the incoming JSON to extract tool_name and the target file path. If the tool is Write or Edit and the file is markdown or MDX, we append an audit log entry with the timestamp and filename. The exit 0 ensures the hook doesn't block the operation.
To activate our audit logging hook, we update .claude/settings.json:
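Assuming the script lives at `.claude/hooks/audit-docs.sh`, the relevant settings fragment might look like this:

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [
          {
            "type": "command",
            "command": ".claude/hooks/audit-docs.sh"
          }
        ]
      }
    ]
  }
}
```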
This configuration tells Claude Code to run our bash script after every Write or Edit tool invocation. The matcher filters for documentation-changing operations specifically, ensuring we log file creations and modifications without capturing irrelevant tool uses.
With configuration in place, we now build the core agent application using the Agent SDK. This Python application orchestrates the complete documentation review process.
We start by importing the Agent SDK components and creating our bot class. The constructor retrieves the Anthropic API key from environment variables, failing fast if it's missing. This ensures we catch configuration issues before attempting any API calls.
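A sketch of the class skeleton and constructor (the class and attribute names are illustrative):

```python
import os


class DocReviewBot:
    """Reviews documentation files against the project's quality Skills."""

    def __init__(self) -> None:
        # Fail fast: surface a missing key at startup, not mid-review.
        self.api_key = os.environ.get("ANTHROPIC_API_KEY")
        if not self.api_key:
            raise ValueError("ANTHROPIC_API_KEY environment variable is not set")
```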
The agent includes a Python callback hook for detailed logging:
This Python hook complements our bash hook from earlier. While the bash hook logs to files, this Python callback provides real-time console output during agent execution. We use a PreToolUse hook with strongly-typed parameters to extract the tool_name and print it with a clear prefix. Returning an empty dictionary satisfies the hook interface without modifying the agent's behavior.
The heart of our agent is the review_file method:
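A sketch of the method's configuration half. The option and class names follow the claude-agent-sdk Python package, but treat the exact fields as assumptions; the import is deferred so the sketch loads even without the SDK installed, and `log_tool_use` is our illustrative name for the logging callback:

```python
async def review_file(self, filepath: str) -> dict:
    # Deferred import so this sketch can be loaded without the SDK present.
    from claude_agent_sdk import ClaudeAgentOptions, HookMatcher

    options = ClaudeAgentOptions(
        # Read-only review: read files, search content, validate links, use Skills.
        allowed_tools=["Read", "Grep", "WebFetch", "Skill"],
        # Explicitly forbid writes and command execution.
        disallowed_tools=["Write", "Edit", "Bash"],
        system_prompt=(
            "You are a documentation quality reviewer. Apply the docs-quality, "
            "seo-standards, and link-validation Skills to the file under review "
            "and report findings with concrete fixes. Do not modify any files."
        ),
        # Log every tool invocation just before it executes.
        hooks={"PreToolUse": [HookMatcher(hooks=[log_tool_use])]},
    )
    # ...the query itself then runs with these options.
    ...
```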
This method configures the agent with three key aspects. First, we explicitly allow Read, Grep, WebFetch, and Skill tools; the agent needs to read files, search content, validate external links, and access our quality Skills. We also use disallowed_tools to prevent any write operations or command execution, ensuring the review process is read-only. Second, the system_prompt explicitly references our three Skills and defines what to check. Third, we attach our Python logging hook to capture tool invocations before they execute.
We complete the review method by executing the query and processing results:
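The accumulation logic can be sketched as a small helper that consumes the message stream; it is duck-typed here so it does not depend on exact SDK classes, whereas in the real method this loop sits directly inside `review_file` around the `query` call:

```python
async def collect_review(messages):
    """Accumulate review text and the final API cost from an agent message stream."""
    review_text = ""
    cost = 0.0
    async for message in messages:
        # Assistant messages carry a role and a list of content blocks.
        if hasattr(message, "role"):
            for block in getattr(message, "content", []) or []:
                if hasattr(block, "text"):
                    review_text += block.text
        # The final ResultMessage reports the run's total API cost.
        if hasattr(message, "total_cost_usd"):
            cost = message.total_cost_usd or 0.0
    return review_text, cost
```

The review method would then bundle these values as `{"filepath": filepath, "review": review_text, "cost": cost}`.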
The query function returns an async iterator, yielding messages as the agent processes the request. We accumulate text content from messages that have a role attribute, concatenating text blocks as they arrive. The SDK automatically calculates the total API cost and exposes it on the final ResultMessage via the total_cost_usd attribute. Our return value bundles the filepath, review text, and API cost for reporting purposes.
To make results human-readable, we add a format_report method:
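Sketched here as a plain function (in the bot it is a method taking `self`); the exact wording and emoji are stylistic choices:

```python
def format_report(result: dict) -> str:
    """Render a review result as a markdown report for PR comments or files."""
    return (
        f"## 📄 Documentation Review: {result['filepath']}\n\n"
        f"{result['review']}\n\n"
        f"**API cost:** ${result['cost']:.4f}\n"
    )
```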
This method generates markdown-formatted reports suitable for GitHub comments or local files. We include the filename as a header, the review results, and the API cost rounded to four decimal places. The emoji and formatting make reports scannable and professional.
Finally, we add a main function to enable command-line usage:
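A sketch of the entry point; the `DocReviewBot` class name and the report filename are assumptions, and in the script this would be wired up with the usual `if __name__ == "__main__": main()` guard:

```python
import asyncio
import sys


def main() -> None:
    if len(sys.argv) != 2:
        print("Usage: python doc_review_bot.py <path/to/doc.md>")
        raise SystemExit(1)
    filepath = sys.argv[1]
    bot = DocReviewBot()  # hypothetical bot class holding review_file/format_report
    result = asyncio.run(bot.review_file(filepath))
    report = bot.format_report(result)
    print(report)  # console output for interactive use
    # Persist the report for later reference or CI/CD consumption.
    with open("review_report.md", "w", encoding="utf-8") as f:
        f.write(report)
```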
This CLI interface accepts a file path as an argument, instantiates our bot, runs the review, and outputs results both to the console and to a markdown file. The saved report provides a persistent record of the review for later reference or CI/CD integration.
With our agent application complete, we configure GitHub Actions to run reviews automatically on pull requests:
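The trigger section of such a workflow might look like this (the path globs assume documentation lives under `docs/`):

```yaml
name: Documentation Quality Check

on:
  pull_request:
    paths:
      - "docs/**/*.md"
      - "docs/**/*.mdx"

jobs:
  docs-review:
    runs-on: ubuntu-latest
```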
This workflow triggers only when pull requests modify markdown or MDX files in the docs directory. We use path filters to avoid running checks unnecessarily on non-documentation changes. The job runs on Ubuntu, providing a consistent Linux environment.
The workflow installs dependencies and runs our agent:
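Continuing the job definition, the steps might look like this; `requirements.txt`, the script name, and the hardcoded sample path are assumptions:

```yaml
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install dependencies
        run: pip install -r requirements.txt

      - name: Run documentation review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: python doc_review_bot.py docs/intro.md
```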
We check out the repository, set up Python 3.11, install our agent's dependencies, and run the quality check. The API key comes from GitHub secrets, keeping sensitive credentials secure. For a production workflow, we'd likely iterate over all changed files rather than hardcoding a single file path.
Before deploying, we test the complete system locally to verify all components work together. Running the agent produces detailed output:
This generates the following output showing our hooks and review in action:
The SDK hooks log each tool invocation (Read and Grep), confirming our Python callback works. The review report shows three checks: clarity passed, SEO needs attention (missing meta description), and links validated successfully. The cost tracking helps monitor API usage.
We also verify our bash audit logging hook by examining the log file after making manual edits to documentation:
This shows our bash hook captured documentation changes made during development:
Note that these audit entries come from manual file edits or writes made during development, not from the bot's read-only review process. Since our bot uses only Read, Grep, and WebFetch tools, it won't trigger the Write/Edit audit hook. Instead, the audit trail captures when developers act on the bot's feedback by manually updating documentation files with Claude Code.
Finally, we package the agent for consistent deployment across environments using a Dockerfile:
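A sketch of such a Dockerfile; the file names mirror the assumptions used throughout this lesson:

```dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install only the runtime dependencies the agent needs.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Agent code, Claude Code configuration, and project context.
COPY doc_review_bot.py .
COPY .claude/ .claude/
COPY CLAUDE.md .

# Default command; override with a specific file path at runtime.
CMD ["python", "doc_review_bot.py", "docs/intro.md"]
```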
This Dockerfile creates a minimal Python environment with only the dependencies we need. We copy the agent code, Claude Code configuration (.claude/ directory), and project context (CLAUDE.md). The slim base image keeps the container size manageable. The default command runs our bot, though in practice we'd override this with specific file paths.
We've now built the complete Documentation Quality Automation System from the ground up. Starting with project context in CLAUDE.md, we created three specialized Skills to encode quality standards, configured bash hooks for audit logging, developed a full Agent SDK application with Python callbacks, and integrated everything with GitHub Actions and Docker.
This implementation brings together skills from all previous courses: configuration management, hooks, Skills, and the Agent SDK. More importantly, we've seen how these components work together as a cohesive system. The bash hooks provide local feedback and an audit trail for documentation file creations and edits, the Skills define standards, the Agent SDK orchestrates reviews, and CI/CD ensures consistency across the team.
Now it's time to put this knowledge into practice. In the upcoming exercises, you'll implement each component yourself, test the integrations, and experience firsthand how all the pieces fit together to create production-ready automation!
