ADR-001: Golden Testing Strategy for Output Verification

Date: 2025-12-12

Status

Implemented (originally Proposed)

Context

DocBuilder’s core value is transforming documentation from Git repositories into Hugo-rendered sites. Currently, we lack comprehensive integration tests that verify the entire pipeline: repository retrieval → content discovery → Hugo generation → theme-specific transformations → final rendered output.

Current State

Unit tests: Cover individual components (git, docs, hugo packages)
CLI integration tests: Test command execution but not output correctness
Manual verification: Developers run builds and visually inspect Hugo sites
Regression risk: Refactoring can break transformations without detection
Feature validation: No systematic way to verify new content transformations

Problems

No end-to-end validation: Can’t verify the complete pipeline produces correct output
Refactoring fear: Large-scale changes (such as the pipeline refactor) lack a safety net
Bug reproduction: No standard way to create minimal reproducible test cases
Feature documentation: Supported transformations are not demonstrated in tests
Configuration validation: Can’t systematically verify Relearn theme configuration

Decision

Implement a golden testing framework that:

Uses test repositories stored as directory structures in test/testdata/repos/
Creates temporary Git repos from these structures during test execution
Runs the full build pipeline with test-specific configurations
Verifies output against stored golden files (Hugo config, content structure, rendered samples)
Supports an update workflow via the -update-golden flag for intentional changes

Architecture

```text
test/
  testdata/
    repos/                           # Source repository structures
      transforms/
        frontmatter-injection/       # editURL and metadata injection
        cross-repo-links/            # Link transformation
        image-paths/                 # Asset path handling
      edge-cases/
        empty-docs/                  # Repository with no markdown
        malformed-frontmatter/       # Invalid YAML handling
      regression/
        issue-XXX/                   # Specific bug reproductions

    configs/                         # Test configurations
      basic-build.yaml
      multi-repo.yaml
      custom-params.yaml

    golden/                          # Verified output snapshots
      basic-build/
        hugo-config.golden.yaml      # Generated Hugo configuration
        content-structure.golden.json # File structure with front matter
        rendered-samples.golden.json  # Selected HTML pages (optional)
      multi-repo/
        hugo-config.golden.yaml
        content-structure.golden.json

  integration/                       # Integration test package
    golden_test.go                   # Golden test framework
    helpers.go                       # Test repo setup, verification
    testdata_test.go                 # Tests using testdata repos
```

Golden File Format

Hugo Config (hugo-config.golden.yaml):

```yaml
# Complete hugo.yaml as generated by DocBuilder
baseURL: http://localhost:1313/
title: Test Documentation
theme: relearn
# ... full config
```

Hugo Config with Relearn theme settings (hugo-config.golden.yaml):

```yaml
# Hugo configuration
baseURL: "http://localhost:1313/"
title: "Test Documentation"
theme: "relearn"  # Always Relearn
module:
  imports:
    - path: "github.com/McShelby/hugo-theme-relearn"

params:
  themeVariant: "relearn-light"
  disableSearch: false
  # ... other Relearn params
```

Content Structure (content-structure.golden.json):

```json
{
  "files": {
    "content/_index.md": {
      "frontmatter": {
        "title": "Home"
      },
      "contentHash": "sha256:abc123..."
    },
    "content/test-docs/guide.md": {
      "frontmatter": {
        "title": "Guide",
        "editURL": "https://github.com/org/repo/blob/main/docs/guide.md"
      },
      "contentHash": "sha256:def456..."
    }
  },
  "structure": {
    "content/": {
      "_index.md": {},
      "test-docs/": {
        "_index.md": {},
        "guide.md": {}
      }
    }
  }
}
```

Rendered Samples (rendered-samples.golden.json - optional):

```json
{
  "public/test-docs/guide/index.html": {
    "selectors": {
      "h1": "Guide",
      ".edit-link": "https://github.com/org/repo/blob/main/docs/guide.md"
    }
  }
}
```

Test Implementation Pattern

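```go
package integration

import (
	"context"
	"flag"
	"testing"

	"github.com/stretchr/testify/require"
	// Also import DocBuilder's build package; its import path is not given in this ADR.
)

// updateGolden switches the verify helpers from comparing to rewriting golden files.
var updateGolden = flag.Bool("update-golden", false, "rewrite golden files with current output")

// setupTestRepo, cleanupTestRepo, loadGoldenConfig, fileExists and the verify*
// helpers are expected to live in helpers.go.
func TestGolden_BasicBuild(t *testing.T) {
	if testing.Short() {
		t.Skip("Skipping golden test in short mode")
	}

	// go test runs with the package directory (test/integration) as the working
	// directory, so test/testdata is one level up.
	// Create a temporary Git repo from the testdata structure.
	repoPath := setupTestRepo(t, "../testdata/repos/basic-build")
	defer cleanupTestRepo(t, repoPath)

	// Load the test configuration and point it at the temporary repo.
	cfg := loadGoldenConfig(t, "../testdata/configs/basic-build.yaml")
	cfg.Repositories[0].URL = repoPath

	// Execute the full build.
	outputDir := t.TempDir()
	svc := build.NewDefaultService(cfg)
	err := svc.Build(context.Background(), outputDir)
	require.NoError(t, err)

	// Verify outputs against the golden files.
	goldenDir := "../testdata/golden/basic-build"
	verifyHugoConfig(t, outputDir, goldenDir+"/hugo-config.golden.yaml")
	verifyContentStructure(t, outputDir, goldenDir+"/content-structure.golden.json")

	// Optional: verify rendered HTML when a samples file exists or is being created.
	if *updateGolden || fileExists(goldenDir+"/rendered-samples.golden.json") {
		verifyRenderedSamples(t, outputDir, goldenDir+"/rendered-samples.golden.json")
	}
}
```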

Update Workflow

```bash
# Update all golden files
go test ./test/integration -update-golden

# Update specific test
go test ./test/integration -run TestGolden_BasicBuild -update-golden

# Run without rendering (faster)
go test ./test/integration -skip-render
```

Rationale

Why Golden Testing?

Deterministic output: Hugo builds are reproducible with fixed input
Comprehensive verification: Catches subtle regressions in transformations
Living documentation: Test repos demonstrate supported features
Regression confidence: Safe to refactor with verified snapshots
Bug reproduction: Minimal test cases for issue reports

Why Not Full HTML Snapshots?

Size: Complete Hugo sites are large (MBs per test)
Brittleness: Relearn theme updates frequently change the generated HTML
Diffability: Binary archives are hard to review in Git
Focus: We care about our transformations, not theme rendering

Why Separate Levels?

Hugo config: Always verify (our direct output)
Content structure: Lightweight; estimated to cover about 90% of bugs
Rendered HTML: Optional, only for transformation features (math, callouts, etc.)

Why JSON for Structure?

Human-readable diffs in Git
Easy to update with -update-golden
Programmatic verification (no string matching)
Selective verification (check only relevant fields)
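
The last two points can be illustrated with the standard library alone. This is a sketch under assumptions: `equalJSON` and `frontmatterField` are hypothetical helpers, and the field layout follows the content-structure example in this ADR.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// equalJSON reports whether two documents decode to the same value,
// ignoring whitespace and object key order.
func equalJSON(a, b []byte) (bool, error) {
	var va, vb any
	if err := json.Unmarshal(a, &va); err != nil {
		return false, fmt.Errorf("first document: %w", err)
	}
	if err := json.Unmarshal(b, &vb); err != nil {
		return false, fmt.Errorf("second document: %w", err)
	}
	return reflect.DeepEqual(va, vb), nil
}

// frontmatterField reads a single front matter value for one file, so a test
// can check only the field it cares about.
func frontmatterField(doc []byte, file, field string) (any, bool, error) {
	var parsed struct {
		Files map[string]struct {
			Frontmatter map[string]any `json:"frontmatter"`
		} `json:"files"`
	}
	if err := json.Unmarshal(doc, &parsed); err != nil {
		return nil, false, err
	}
	entry, ok := parsed.Files[file]
	if !ok {
		return nil, false, nil
	}
	value, ok := entry.Frontmatter[field]
	return value, ok, nil
}
```

Comparing decoded values keeps the check stable when a golden file is reformatted or its keys are reordered, while the field accessor lets a regression test ignore unrelated entries.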

Consequences

Positive

Refactoring confidence: Pipeline changes and configuration updates are checked against verified snapshots
Bug prevention: Regressions caught before merge
Feature documentation: Tests show what’s supported
Developer productivity: Fast feedback on changes
CI integration: Automated verification on every PR

Negative

Test maintenance: Golden files need updates when output intentionally changes
Initial investment: Writing helpers and first test cases takes time
Storage: Git repo grows with golden files (mitigated by JSON, not archives)
Test duration: Full builds are slower than unit tests (use -short to skip them)

Neutral

Test location: New test/integration/ package separate from unit tests
Naming convention: TestGolden_* prefix for golden tests
Flag dependencies: Requires -update-golden flag support

Implementation Status

Implemented: the golden testing framework is in place and actively used.

Notes

Single Relearn theme simplifies testing (no theme variation needed)
Focus on transform pipeline and configuration generation
Integration tests cover full build workflow

Alternatives Considered

Alternative 1: Snapshot Entire Hugo Site

Rejected: Too large, brittle, not Git-friendly

Alternative 2: Compare Only File Counts

Rejected: Too coarse, misses content errors

Alternative 3: String Matching on hugo.yaml

Rejected: Brittle to formatting, whitespace

Alternative 4: No Integration Tests

Rejected: Refactoring risk too high, manual testing unreliable

References

Testing Go Applications Using Golden Files
Go Wiki: Table Driven Tests
DocBuilder: internal/hugo/*_golden_test.go (existing pattern)
Hugo: Testing Documentation
