You've learned that tasks should be "atomic" - small, focused units. But there's a tension:
Too Large:
- AI loses focus across many files
- Context overflow leads to errors
- Difficult to review massive changes
- Hard to isolate bugs when things break
Too Small:
- Excessive context switching overhead
- Each task requires re-reading specs and understanding codebase
- Integration overhead connecting tiny pieces
- Workflow becomes unwieldy with 50+ micro-tasks
The Question: Where's the sweet spot? How do you know if a task is "just right"?
An atomic task has these properties working together:
1. Clear, Testable Completion Criteria
✅ GOOD (Observable, Measurable):
- CommentRepository has create() method returning Comment
- Unit tests pass: pytest tests/unit/test_comment_repository.py
- Coverage ≥90%
- Type checking passes: mypy src/repositories/comment_repository.py
❌ BAD (Vague, Subjective):
- Repository works well
- Code is clean
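The checklist above can be encoded directly as observable assertions. A minimal sketch, assuming hypothetical `Comment` and `CommentRepository` stand-ins; the real repository would take a DB session rather than an in-memory dict:

```python
# Acceptance criteria as observable assertions. Comment and
# CommentRepository are hypothetical minimal stand-ins for the
# project's real classes.
from dataclasses import dataclass
from typing import Dict

@dataclass
class Comment:
    id: int
    content: str

class CommentRepository:
    def __init__(self) -> None:
        self._store: Dict[int, Comment] = {}
        self._next_id = 1

    def create(self, content: str) -> Comment:
        """Criterion 1: create() returns a Comment."""
        comment = Comment(id=self._next_id, content=content)
        self._store[comment.id] = comment
        self._next_id += 1
        return comment

def test_create_returns_comment() -> None:
    # Criterion 2: what "pytest tests/unit/test_comment_repository.py"
    # would verify - no human judgment required.
    repo = CommentRepository()
    comment = repo.create("First!")
    assert isinstance(comment, Comment)
    assert comment.id == 1
```

Each ✅ bullet maps to something a tool can check: pytest runs the test, while coverage and mypy gate the rest. "Code is clean" has no such check, which is exactly why it fails as a completion criterion.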
2. Reasonable Implementation Scope (45-120 minutes)
✅ REASONABLE (60 min):
- Implement CommentRepository with 5 CRUD methods
- Write 8 unit tests with mocked DB
❌ TOO LARGE (>120 min):
- Implement entire commenting system (8 files, 6 hours)
❌ TOO SMALL (<45 min):
- Add one method to existing repository (15 minutes)
Why 45-120 minutes?
- Below 45 min: Setup overhead exceeds implementation time
- Above 120 min: Fatigue sets in, AI context degrades, too much to review
3. Logical Cohesion (Complete Capability)
✅ COHESIVE: [T001] Implement Comment Creation
- Model + Repository + Service + API endpoint + Tests
- Result: Users CAN create comments (working feature)
❌ INCOHERENT: [T001] Create All Model Definitions
- Comment, Attachment, Notification, Tag models
- Result: 4 models exist but NO working features
4. Independent or Explicit Dependencies
✅ EXPLICIT DEPENDENCIES: [T005] Create Comment API Endpoints
- Depends on: T003 (CommentService), T004 (CommentSchema)
- Integration: Import from src/services/comment_service.py
❌ UNCLEAR: [T005] Create API Endpoints
- Depends on: "backend stuff"
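Explicit dependencies can also be declared as data and checked mechanically. A minimal sketch using the task IDs from the example above; the dict layout is an illustration, not a prescribed format:

```python
# Task dependencies declared as data (IDs from the example above).
# A hypothetical structure - real tooling might load this from a
# task file instead of hard-coding it.
tasks = {
    "T003": [],                # CommentService
    "T004": [],                # CommentSchema
    "T005": ["T003", "T004"],  # Comment API endpoints
}

def ready(task_id: str, done: set) -> bool:
    """A task is ready to start once all declared dependencies are done."""
    return all(dep in done for dep in tasks[task_id])
```

With "backend stuff" as the dependency, no such check is possible; with explicit IDs, `ready("T005", {"T003"})` is False until T004 also lands.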
Why split tasks at all? Four benefits:
1. Scope Management: Break 8-hour features into 2-hour chunks for fresh AI context and reviewable PRs
2. Parallel Work: Enable multiple developers to work simultaneously on independent tasks
3. Risk Isolation: Test complex areas (like external API integrations) separately before integration
4. Clear Milestones: Demonstrate incremental progress with working features
Example:
❌ ONE BIG TASK: [T001] Build Complete Commenting System (8 hours)
- No demo until the very end; if anything breaks, the entire feature is blocked
✅ PHASED TASKS:
[T001] Comment Model + Repository (2 hours)
[T002] Comment Service + Validation (2 hours)
[T003] Comment API + Integration Tests (2 hours)
[T004] Authorization + E2E Tests (2 hours)
- 4 reviewable PRs, 4 milestones, can ship T001-T002 early
But splitting isn't free. Every new task requires:
- AI re-reading specification
- Analyzing codebase for patterns
- Reconstructing mental model
- Integration overhead
Time Cost: ~10-15 minutes setup per task
Example:
❌ OVER-SPLIT (4 tasks):
- 75 min implementation + 40 min setup = 115 min (53% overhead)
✅ VERTICAL SLICE (1 task):
- 75 min implementation + 10 min setup = 85 min (13% overhead)
Savings: 30 minutes (26% faster)
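The overhead arithmetic above can be reproduced in a few lines (the 10-minutes-per-task setup cost is the estimate from this section):

```python
# Reproducing the context-switching arithmetic from the example above.
SETUP_PER_TASK = 10  # minutes; the per-task setup estimate from this section

def total_time(impl_min: int, tasks: int) -> int:
    """Implementation time plus per-task setup overhead, in minutes."""
    return impl_min + tasks * SETUP_PER_TASK

over_split = total_time(75, tasks=4)  # 115 min (40 min setup, 53% of impl time)
vertical = total_time(75, tasks=1)    # 85 min (10 min setup, 13% of impl time)
savings = over_split - vertical       # 30 min, ~26% faster than over-splitting
```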
Key Insight: Only split tasks when benefits (scope management, parallelism, risk isolation) exceed context switching costs.
Split a task when:
1. Scope Genuinely Large (>120 min)
- Split 6-hour feature into three 2-hour tasks
- Each task delivers testable milestone
2. Natural Feature Boundaries
[T001] Comment Creation (90 min) - Users can add comments
[T002] Comment Deletion (75 min) - Users can delete comments
[T003] Comment Editing (60 min) - Users can edit comments
Each is independently valuable and testable
3. Technical Complexity Warrants Isolation
[T001] Payment Model + Validation (90 min) - Low risk
[T002] Stripe Integration (2 hours) - Complex, worth isolating
[T003] Payment API (90 min) - Integrate tested components
4. Parallel Opportunities
AFTER: Comment Model exists
[T002] CommentRepository (90 min) - Developer A
[T003] CommentSchema (45 min) - Developer B
Both can run in parallel (45 min savings)
Do NOT split a task when:
1. Implementing Technical Layers Separately
❌ BAD (By Layer):
[T001] Add Comment model (20 min)
[T002] Add CommentRepository (30 min)
[T003] Add CommentService (30 min)
[T004] Add Comment API (40 min)
→ No working feature until T004, 4× context switching
✅ GOOD (Vertical Slice):
[T001] Implement Comment Creation (90 min)
→ Working feature, end-to-end testable immediately
2. Adding Single Field
❌ BAD: Split into 5 tasks (model, schema, repository, API, tests)
✅ GOOD: One task "Add Task Priority Field Throughout" (60 min)
3. Simple CRUD
❌ BAD: One task per endpoint (GET, POST, PATCH, DELETE)
✅ GOOD: One task "Implement Task CRUD API" (90 min)
Well-sized tasks in practice:
Example 1: Vertical Slice
[T001] Implement Comment Creation (60-90 min)
FILES: model, repository, service, schema, API, tests
ACCEPTANCE:
- POST /api/tasks/{task_id}/comments creates comment
- Content validation (1-5000 chars)
- Authorization (user must own task)
- Tests pass with 90%+ coverage
RESULT: Users CAN create comments (working feature)
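The content-validation criterion (1-5000 characters) might look like this. A dependency-free sketch with a hypothetical function name; a real FastAPI project would more likely express it as a Pydantic schema constraint:

```python
# Content validation for comment creation: 1-5000 characters after
# trimming whitespace. The length bound comes from the acceptance
# criteria above; the function name is a hypothetical stand-in.
MAX_CONTENT_LEN = 5000

def validate_comment_content(content: str) -> str:
    stripped = content.strip()
    if not 1 <= len(stripped) <= MAX_CONTENT_LEN:
        raise ValueError(
            f"content must be 1-{MAX_CONTENT_LEN} characters, got {len(stripped)}"
        )
    return stripped
```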
Example 2: Related Functionality
[T002] Implement Comment Listing (60 min)
FILES: repository methods, API endpoints, tests
ACCEPTANCE:
- GET /api/tasks/{task_id}/comments returns list
- GET /api/comments/{id} returns single comment
- Pagination with skip/limit
- Authorization verified
RESULT: Users CAN view comments (working feature)
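The skip/limit pagination named in the acceptance criteria, sketched over an in-memory list; a real repository would translate skip and limit into OFFSET/LIMIT on the database query:

```python
# Skip/limit pagination over an in-memory list - a stand-in for the
# repository method that would apply OFFSET/LIMIT in SQL.
from typing import List, TypeVar

T = TypeVar("T")

def paginate(items: List[T], skip: int = 0, limit: int = 20) -> List[T]:
    if skip < 0 or limit < 1:
        raise ValueError("skip must be >= 0 and limit must be >= 1")
    return items[skip : skip + limit]
```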
Example 3: Feature + Security
[T003] Implement Comment Deletion (75 min)
FILES: service, API endpoint, tests
ACCEPTANCE:
- DELETE /api/comments/{id} removes comment
- Author can delete (authorization)
- Task owner can delete any comment
- Non-owners get 403 Forbidden
- Comprehensive auth tests
RESULT: Secure deletion (working feature)
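The authorization rules above reduce to a single predicate: the comment's author or the task's owner may delete, everyone else is rejected. A sketch with hypothetical ID parameters standing in for the project's real models:

```python
# Deletion authorization: the comment author or the task owner may
# delete; anyone else is rejected. Plain IDs are hypothetical
# stand-ins for the real User/Comment/Task models.
def can_delete_comment(user_id: int,
                       comment_author_id: int,
                       task_owner_id: int) -> bool:
    return user_id in (comment_author_id, task_owner_id)
```

The endpoint would call this before deleting and map a False result to 403 Forbidden, which is what the auth tests in the acceptance criteria exercise.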
Over-split anti-patterns:
Example 1: Over-Split Model
❌ BAD TASKS (8 tasks, 45 min + 8× setup):
[T001] Create Comment class (10 min)
[T002] Add id field (5 min)
[T003] Add task_id field (5 min)
[T004] Add user_id field (5 min)
[T005] Add content field (5 min)
[T006] Add created_at field (5 min)
[T007] Add task relationship (5 min)
[T008] Add author relationship (5 min)
Problems: Ridiculous granularity, 8× context switching
✅ BETTER (1 task): [T001] Create Comment Model (45 min)
- Complete model with all fields and relationships
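The single-task version produces the whole model in one pass. Sketched as a dataclass so it stays self-contained; the real project would likely use a SQLAlchemy declarative model with ForeignKey and relationship() instead:

```python
# The complete Comment model in one task: all fields and both
# relationships at once. Dataclass form is a runnable stand-in for a
# SQLAlchemy declarative model with ForeignKey/relationship.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Comment:
    id: int
    task_id: int   # relationship to the parent Task
    user_id: int   # relationship to the author User
    content: str
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
```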
Example 2: Over-Split Endpoint
❌ BAD TASKS (7 tasks, 100 min + 7× setup):
[T001] Create router file (5 min)
[T002] Add route signature (10 min)
[T003] Add request validation (15 min)
[T004] Add business logic (20 min)
[T005] Add response formatting (10 min)
[T006] Add error handling (15 min)
[T007] Add tests (25 min)
Problems: Splitting ONE endpoint, can't test until T007
✅ BETTER (1 task): [T001] Implement POST /api/comments Endpoint (90 min)
- Complete endpoint with validation, logic, errors, tests
Aim for 45-120 minute tasks that deliver complete, working features. Avoid over-splitting (excessive context switching) and under-splitting (loss of focus). When in doubt, favor vertical slices over horizontal layers.
