Introduction: TDD and AI Guardrails

In the previous lesson, we acted as architects. We defined the requirements for our Movie Recommendation System and used Codex to generate a detailed design document. Now, we are ready to start building. However, instead of jumping straight into writing the application code, we are going to take a different approach: Test-Driven Development (TDD).

TDD is a software development process in which you write tests before you write the actual software. You might wonder, "How can I test code that doesn't exist yet?" The answer is that each test describes the behavior you want: it fails at first (because the code doesn't exist), and you then write just enough code to make it pass.

When working with an AI assistant like Codex, TDD becomes even more powerful. Think of tests as a strict set of instructions or a "contract."

  1. Clarity: Writing a test forces you to clarify exactly what the API should do (e.g., "When I send a POST request to /register, I expect a 201 status code").
  2. Guardrails: AI models can sometimes "hallucinate" or drift away from your original design. If you ask Codex to "build the auth system," it might make assumptions you didn't want. But if you provide a test suite and say, "Write code to pass these tests," the AI is constrained to produce exactly what you need.

In this lesson, we will use Codex to generate a comprehensive test suite using pytest. These tests will serve as the foundation for the rest of the course.

The Test Environment: Pytest and Fixtures

To test a FastAPI application, we use a library called pytest. It is the industry standard for testing in Python. We also need httpx to act as a web client that sends requests to our API. On CodeSignal, these libraries are pre-installed for you.

The heart of a pytest suite is a file named conftest.py. This file contains "fixtures." Fixtures are reusable pieces of code that set up the environment before a test runs.

Creating the App Fixture

First, we need a fixture that creates an instance of our application. Since we haven't built the app yet, we will define a factory function create_app that we expect to exist later.

  • @pytest.fixture: This decorator tells pytest that this function is a fixture.
  • scope="session": This means the app is created once and used for all tests, which saves time.

Ensuring Test Isolation (Database Transactions)

A common challenge in TDD is ensuring that tests don't interfere with each other. For example, if one test creates a user, a subsequent test shouldn't fail because that user already exists in the database. We solve this through isolation.

In a production-grade test suite, we use a db_session fixture that wraps every single test in a database transaction.

  1. Start: Before the test starts, we open a transaction.
  2. Execute: The test runs its logic (adding users, rating movies).
  3. Rollback: After the test finishes, we "roll back" the transaction. This undoes every database change made during the test, leaving the database perfectly clean for the next test.
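The three steps can be demonstrated with Python's built-in sqlite3 module, standing in here for the real database. In conftest.py this logic would live inside a db_session fixture that yields the connection between steps 1 and 3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT)")
conn.commit()

# 1. Start: sqlite3 opens a transaction implicitly on the first write.
# 2. Execute: the "test" inserts a user.
conn.execute("INSERT INTO users VALUES ('alice@example.com')")
count_during = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

# 3. Rollback: undo every change the test made.
conn.rollback()
count_after = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

During the test the user is visible (`count_during` is 1); after the rollback the table is empty again (`count_after` is 0), so the next test starts from a clean slate.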

Seeding Data

Sometimes, a test needs existing data to work (e.g., you can't test "Get Movie Details" if there are no movies). We use seeding fixtures for this. A fixture can insert a set of "dummy" movies into the database before the test runs, providing a predictable environment.
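A seeding helper follows the same pattern; it is sketched below with sqlite3 again so the snippet stands alone. The `seed_movies` function and the movie titles are illustrative; in conftest.py this would be wrapped in a pytest fixture that depends on `db_session`.

```python
import sqlite3


def seed_movies(conn):
    # Hypothetical seeding helper: inserts predictable "dummy" rows
    # that tests can rely on being present.
    movies = [(1, "The Matrix"), (2, "Spirited Away"), (3, "Parasite")]
    conn.executemany("INSERT INTO movies VALUES (?, ?)", movies)
    return movies


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT)")
seed_movies(conn)
titles = [row[0] for row in conn.execute("SELECT title FROM movies ORDER BY id")]
```

Because seeding runs inside the same transaction as the test, the rollback in the db_session fixture removes the dummy rows too.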

Creating the Test Client

Next, we need a "client." This is a tool that acts like a web browser: it sends requests (GET, POST) to our app. Since FastAPI is asynchronous (it can handle many requests concurrently), we use httpx's AsyncClient.

  • ASGITransport: This connects the client directly to the FastAPI app code without needing to start a real web server.
  • yield: This provides the client to the test function.

When you prompt Codex, you can ask: "Create a conftest.py file with fixtures for an async FastAPI app, a database session that rolls back after each test for isolation, and an AsyncClient."

Testing Basic Endpoints: Auth and Shows

We start by identifying the key behaviors and edge cases for our basic endpoints: Authentication and the Catalog.

Authentication

For user registration and login, consider the following tests:

  • Successful Registration: Register a new user with valid email, password, and display name.
  • Duplicate Registration: Attempt to register with an email that already exists.
  • Invalid Email Format: Try registering with an improperly formatted email address.
  • Weak Password: Register with a password that does not meet security requirements.
  • Missing Fields: Omit required fields (e.g., password or email) and verify the error response.
  • Successful Login: Log in with correct credentials and expect a valid token or session.
  • Invalid Login: Attempt login with incorrect password or non-existent email.

Catalog

For the movie catalog, consider these tests:

  • List All Shows: Retrieve the full list of shows and verify the response structure.
  • Show Details: Request details for a specific show by ID and check for all expected fields (e.g., title, directors, actors).
  • Non-existent Show: Request a show with an invalid or non-existent ID and expect a 404 error.
  • Search Functionality: Search for shows by title, genre, or other attributes and verify correct filtering.
  • Empty Catalog: Handle the case where no shows exist in the database.
  • Pagination: Test listing shows with pagination parameters (e.g., page size, page number) and verify correct slicing of results.
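The pagination test, for instance, boils down to verifying slicing. A pure-Python sketch of the expected behavior (the `paginate` helper and its 1-indexed `page` parameter are illustrative, not part of the design document):

```python
def paginate(items, page: int, page_size: int):
    # 1-indexed pages: page 1 returns items[0:page_size], and so on.
    start = (page - 1) * page_size
    return items[start:start + page_size]


shows = [{"id": i, "title": f"Show {i}"} for i in range(1, 26)]  # 25 shows

page_two = paginate(shows, page=2, page_size=10)    # ids 11..20
last_page = paginate(shows, page=3, page_size=10)   # ids 21..25 (partial)
empty_page = paginate(shows, page=4, page_size=10)  # past the end
```

A good test suite checks exactly these three cases: a full page, the shorter final page, and a page past the end of the data.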

Testing Complex Interactions: Reviews and Recommendations

For more advanced features, focus on the following behaviors and edge cases.

Reviews and Views

  • Add Review: Submit a review for a show and verify it is stored and returned correctly.
  • Duplicate Review: Attempt to submit multiple reviews for the same show by the same user.
  • Edit Review: Update an existing review and check that changes are reflected.
  • Delete Review: Remove a review and ensure it no longer appears in the show’s reviews.
  • Invalid Review Data: Submit a review with missing or invalid fields (e.g., rating out of bounds).
  • Mark as Viewed: Mark a show as viewed and verify the status is updated.
  • Unmark as Viewed: Remove the viewed status and check the change is reflected.
  • View Non-existent Show: Attempt to mark a non-existent show as viewed.
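The "rating out of bounds" case is typically enforced by the request model itself. A hedged sketch using a Pydantic model (the field names and the 1–5 range are assumptions for illustration):

```python
from pydantic import BaseModel, Field, ValidationError


class ReviewIn(BaseModel):
    # Ratings outside 1-5 are rejected before they ever reach the database.
    rating: int = Field(ge=1, le=5)
    comment: str = ""


valid = ReviewIn(rating=4, comment="Loved it")

try:
    ReviewIn(rating=6)  # out of bounds
    rejected = False
except ValidationError:
    rejected = True
```

In a FastAPI endpoint, that `ValidationError` surfaces as a 422 response, which is exactly what the "Invalid Review Data" test should assert.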

Recommendations

  • Get Recommendations: Retrieve a list of recommended shows for a user and verify the response structure.
  • Exclude Viewed: Ensure that shows marked as viewed by the user do not appear in recommendations.
  • No Recommendations Available: Handle the case where there are no suitable recommendations.
  • Personalization: Verify that recommendations differ for users with different viewing histories.
  • Edge Case – All Shows Viewed: If a user has viewed all shows, ensure the recommendations list is empty or an appropriate message is returned.
  • Invalid User: Attempt to get recommendations for a non-existent or unauthorized user.
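The "Exclude Viewed" and "All Shows Viewed" cases share one core rule, sketched here in pure Python (the `recommend` helper is a hypothetical stand-in for the real recommendation logic):

```python
def recommend(all_shows, viewed_ids, limit=5):
    # Core rule: never suggest a show the user has already viewed.
    return [s for s in all_shows if s["id"] not in viewed_ids][:limit]


catalog = [{"id": i} for i in range(1, 6)]  # shows 1..5

fresh = recommend(catalog, viewed_ids={2, 4})          # 2 and 4 excluded
none_left = recommend(catalog, viewed_ids={1, 2, 3, 4, 5})  # everything viewed
```

The tests then assert two things: viewed shows never appear in the result, and a user who has seen everything gets an empty list rather than an error.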

These lists will help you define a comprehensive test suite that covers both standard and edge-case behaviors for your API.

Summary and Practice Overview

In this lesson, we established the foundation for our application using Test-Driven Development.

  1. Design: We started with the requirements from our design document.
  2. Test: We learned how to write tests that define exactly how our API should behave and how to ensure database isolation so tests remain independent.
  3. Implement: In future lessons, we will write the code to make these tests pass.

By writing these tests first, we have created a strict guide for Codex. We can now confidently ask it to generate implementation code, knowing that our tests will catch any mistakes or hallucinations.

Good luck with the practices!
