Architecture Overview
DocBuilder implements a staged pipeline to turn multiple Git repositories into a unified Hugo documentation site.
Pipeline Flow
The Transform Content stage executes the fixed transform pipeline (see Content Generation Details).
Each stage records duration, outcome, and issues for observability.
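As a rough illustration of per-stage observability, the sketch below runs stages in order and records a duration, an outcome, and a list of issues for each. The `Stage` and `StageResult` types, the stage names, and the rule that later stages are skipped after a failure are all assumptions made for this example, not DocBuilder's actual types or behavior.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// StageResult is a hypothetical record of one stage execution.
type StageResult struct {
	Name     string
	Duration time.Duration
	Outcome  string // "success" or "failed" (assumed labels)
	Issues   []string
}

// Stage is a hypothetical named unit of pipeline work.
type Stage struct {
	Name string
	Run  func() (issues []string, err error)
}

// runStages executes stages in order and records timing, outcome, and issues.
// Assumption: a failing stage stops the pipeline.
func runStages(stages []Stage) []StageResult {
	var results []StageResult
	for _, s := range stages {
		start := time.Now()
		issues, err := s.Run()
		res := StageResult{Name: s.Name, Duration: time.Since(start), Outcome: "success", Issues: issues}
		if err != nil {
			res.Outcome = "failed"
			res.Issues = append(res.Issues, err.Error())
			results = append(results, res)
			break
		}
		results = append(results, res)
	}
	return results
}

// demoStages returns illustrative stages; the names are placeholders.
func demoStages() []Stage {
	return []Stage{
		{Name: "clone", Run: func() ([]string, error) { return nil, nil }},
		{Name: "discover", Run: func() ([]string, error) { return []string{"empty docs dir"}, nil }},
		{Name: "transform", Run: func() ([]string, error) { return nil, errors.New("bad front matter") }},
		{Name: "generate", Run: func() ([]string, error) { return nil, nil }},
	}
}

func main() {
	for _, r := range runStages(demoStages()) {
		fmt.Printf("%-10s %-8s issues=%d took=%s\n", r.Name, r.Outcome, len(r.Issues), r.Duration)
	}
}
```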
Key Components
| Component | Responsibility | Location |
|---|---|---|
| Config Loader | Parse YAML, expand ${ENV} variables, apply defaults. | internal/config/ |
| Build Service | Orchestrate build pipeline execution. | internal/build/ |
| Git Client | Clone/update repositories with auth strategies (token, ssh, basic). | internal/git/ |
| Discovery | Walk configured doc paths, filter markdown, build DocFile list. | internal/docs/ |
| Hugo Generator | Emit hugo.yaml, content tree, index pages, theme params. | internal/hugo/ |
| Transform Pipeline | Fixed-order content processing pipeline with direct mutation. | internal/hugo/pipeline/ |
| Relearn Theme | Theme configuration hardcoded for Relearn theme. | internal/hugo/ (config_writer.go) |
| Forge Integration | GitHub/GitLab/Forgejo API clients. | internal/forge/ |
| Error Foundation | Classified error system with retry strategies. | internal/foundation/errors/ |
| Report | Aggregate metrics & fingerprints for external tooling. | internal/hugo/ |
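
The Config Loader row above mentions `${ENV}` expansion followed by defaults. A minimal sketch of that order of operations, using the standard library's `os.Expand`, might look like the following. The `Config` fields and the `./site` default are hypothetical, and YAML parsing is omitted so the example stays standard-library only.

```go
package main

import (
	"fmt"
	"os"
)

// Config is a hypothetical, heavily reduced stand-in for the real config type.
type Config struct {
	OutputDir string
	Token     string
}

// expandAndDefault expands ${VAR} references via lookup, then fills defaults.
// Variables missing from lookup expand to the empty string.
func expandAndDefault(raw Config, lookup func(string) string) Config {
	cfg := Config{
		OutputDir: os.Expand(raw.OutputDir, lookup),
		Token:     os.Expand(raw.Token, lookup),
	}
	if cfg.OutputDir == "" {
		cfg.OutputDir = "./site" // assumed default, for illustration only
	}
	return cfg
}

func main() {
	// A map stands in for the process environment; os.Getenv could be passed instead.
	env := map[string]string{"GIT_TOKEN": "s3cret"}
	lookup := func(key string) string { return env[key] }

	cfg := expandAndDefault(Config{Token: "${GIT_TOKEN}"}, lookup)
	fmt.Printf("%+v\n", cfg)
}
```

Passing the lookup function explicitly, rather than reading the environment directly, keeps the expansion step easy to test.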
Namespacing Logic
Forge namespacing (conditional content/<forge>/<repo>/...) prevents path collisions between repositories and keeps the URL design scalable. Auto mode activates only when more than one forge type exists.
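The auto-mode rule can be sketched as below. The `Repo` type and function names are hypothetical, and the unprefixed layout `content/<repo>/...` is an assumption about what happens when the forge segment is omitted.

```go
package main

import (
	"fmt"
	"path"
)

// Repo is a hypothetical, minimal description of a source repository.
type Repo struct {
	Forge string // forge type, e.g. "github" or "gitlab"
	Name  string
}

// useForgePrefix implements the auto-mode rule: add the forge segment
// only when more than one distinct forge type is present.
func useForgePrefix(repos []Repo) bool {
	seen := map[string]struct{}{}
	for _, r := range repos {
		seen[r.Forge] = struct{}{}
	}
	return len(seen) > 1
}

// contentPath builds content/<forge>/<repo>/<rel> when withForge is set,
// and content/<repo>/<rel> otherwise (the latter layout is assumed).
func contentPath(r Repo, rel string, withForge bool) string {
	if withForge {
		return path.Join("content", r.Forge, r.Name, rel)
	}
	return path.Join("content", r.Name, rel)
}

func main() {
	// Two repositories share the name "api" but live on different forges.
	repos := []Repo{{Forge: "github", Name: "api"}, {Forge: "gitlab", Name: "api"}}
	prefix := useForgePrefix(repos)
	for _, r := range repos {
		fmt.Println(contentPath(r, "guide/_index.md", prefix))
	}
}
```

With both forges present, the two `api` repositories land in separate directories instead of overwriting each other.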
Idempotence & Change Detection
- Repository update strategy (clone_strategy) avoids unnecessary reclones.
- Combined check: when repository heads are unchanged and the doc file set is identical, the build logs this and can optionally exit early (when the existing output is already valid).
- doc_files_hash offers external determinism for CI/CD.
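
A minimal sketch of the combined check follows. It assumes the hash is a SHA-256 over the sorted doc file paths; the real `doc_files_hash` may be computed over different inputs or with a different algorithm, and every function name here is hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// docFilesHash returns an order-independent fingerprint of a doc file set.
// Assumption: SHA-256 over sorted paths, each followed by a NUL separator.
func docFilesHash(paths []string) string {
	sorted := append([]string(nil), paths...) // copy so the caller's slice is untouched
	sort.Strings(sorted)
	h := sha256.New()
	for _, p := range sorted {
		h.Write([]byte(p))
		h.Write([]byte{0}) // separator keeps {"a","bc"} distinct from {"ab","c"}
	}
	return hex.EncodeToString(h.Sum(nil))
}

// headsUnchanged reports whether every repository points at the same commit as before.
func headsUnchanged(prev, cur map[string]string) bool {
	if len(prev) != len(cur) {
		return false
	}
	for repo, head := range cur {
		if old, ok := prev[repo]; !ok || old != head {
			return false
		}
	}
	return true
}

// canExitEarly combines the checks: same heads, same doc file set, valid output.
func canExitEarly(prevHeads, curHeads map[string]string, prevHash, curHash string, outputValid bool) bool {
	return headsUnchanged(prevHeads, curHeads) && prevHash == curHash && outputValid
}

func main() {
	a := docFilesHash([]string{"docs/b.md", "docs/a.md"})
	b := docFilesHash([]string{"docs/a.md", "docs/b.md"})
	heads := map[string]string{"api": "abc123"}
	fmt.Println("hashes equal:", a == b)
	fmt.Println("early exit:", canExitEarly(heads, heads, a, b, true))
}
```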
Error & Retry Model
Error Classification (internal/foundation/errors):
Severity Levels:
- Fatal - Stops execution completely
- Error - Fails the current operation
- Warning - Continues with degraded functionality
- Info - Informational, no impact
Retry Strategies:
- RetryNever - Permanent failure
- RetryImmediate - Retry immediately
- RetryBackoff - Exponential backoff
- RetryRateLimit - Wait for rate limit window
- RetryUserAction - Requires user intervention
Error Categories:
- User-facing - Config, Validation, Auth, NotFound
- External - Network, Git, Forge
- Build - Build, Hugo, FileSystem
- Runtime - Runtime, Daemon, Internal
Classifying an error as transient guides the retry policy; examples include network issues during clone/update and certain Hugo invocations.
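To make the model concrete, here is a hedged sketch of a classified error and a retry decision. The severity and retry names mirror the lists above, but the `ClassifiedError` struct, its fields, and the `shouldRetry` and `simulateClone` helpers are hypothetical and do not reflect the actual API of internal/foundation/errors.

```go
package main

import (
	"errors"
	"fmt"
)

type Severity int

const (
	SeverityInfo Severity = iota
	SeverityWarning
	SeverityError
	SeverityFatal
)

type RetryStrategy int

const (
	RetryNever RetryStrategy = iota
	RetryImmediate
	RetryBackoff
	RetryRateLimit
	RetryUserAction
)

// ClassifiedError is a hypothetical error carrying category, severity, and retry hints.
type ClassifiedError struct {
	Category string
	Severity Severity
	Retry    RetryStrategy
	Err      error
}

func (e *ClassifiedError) Error() string { return e.Category + ": " + e.Err.Error() }
func (e *ClassifiedError) Unwrap() error { return e.Err }

// shouldRetry reports whether err can be retried without user intervention.
// Unclassified errors are treated as permanent (an assumption of this sketch).
func shouldRetry(err error) bool {
	var ce *ClassifiedError
	if !errors.As(err, &ce) {
		return false
	}
	switch ce.Retry {
	case RetryImmediate, RetryBackoff, RetryRateLimit:
		return true
	default:
		return false
	}
}

// simulateClone returns a transient network error wrapped with call-site context.
func simulateClone() error {
	netErr := &ClassifiedError{Category: "Network", Severity: SeverityError, Retry: RetryBackoff, Err: errors.New("connection reset")}
	return fmt.Errorf("clone failed: %w", netErr)
}

func main() {
	cfgErr := &ClassifiedError{Category: "Config", Severity: SeverityFatal, Retry: RetryUserAction, Err: errors.New("missing output dir")}
	fmt.Println("retry clone:", shouldRetry(simulateClone()))
	fmt.Println("retry config:", shouldRetry(cfgErr))
}
```

Because `shouldRetry` uses `errors.As`, the classification survives wrapping with `%w`, so callers can add context without losing the retry hint.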
Content Generation Details
Transform Pipeline (internal/hugo/pipeline/):
Each markdown file passes through a fixed-order transform pipeline:
- parseFrontMatter - Extract YAML front matter from markdown
- normalizeIndexFiles - Rename README.md → _index.md for Hugo
- buildBaseFrontMatter - Generate default fields (title, type, date)
- extractIndexTitle - Extract H1 as title for index pages
- stripHeading - Remove H1 from content when appropriate
- rewriteRelativeLinks - Fix markdown links (.md → /, directory-style)
- rewriteImageLinks - Fix image paths to content root
- generateFromKeywords - Create new documents from keywords (, etc.)
- addRepositoryMetadata - Inject repository/forge/commit metadata
- addEditLink - Generate editURL for source links
- serializeDocument - Output final YAML + markdown
Pipeline Features:
- Fixed execution order (explicit, no dependency resolution needed)
- Direct document mutation (no patch merge complexity)
- Document type with all fields accessible
- Generators create missing index files before transforms run
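
The fixed-order, direct-mutation design can be sketched roughly as follows. Only three of the transforms listed above are shown, their bodies are simplified guesses at the behavior (for example, the H1 is always stripped here), and the `Document` fields and `Transform` signature are assumptions rather than the real types in internal/hugo/pipeline/.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// Document is a hypothetical stand-in for the pipeline's document type.
type Document struct {
	Path        string
	FrontMatter map[string]any
	Content     string
}

// Transform mutates the document in place.
type Transform func(doc *Document) error

// normalizeIndexFiles renames README.md to _index.md.
func normalizeIndexFiles(doc *Document) error {
	if path.Base(doc.Path) == "README.md" {
		doc.Path = path.Join(path.Dir(doc.Path), "_index.md")
	}
	return nil
}

// extractIndexTitle uses a leading H1 as the title of an index page.
func extractIndexTitle(doc *Document) error {
	if path.Base(doc.Path) != "_index.md" {
		return nil
	}
	first, _, _ := strings.Cut(doc.Content, "\n")
	if strings.HasPrefix(first, "# ") {
		if doc.FrontMatter == nil {
			doc.FrontMatter = map[string]any{}
		}
		doc.FrontMatter["title"] = strings.TrimSpace(strings.TrimPrefix(first, "# "))
	}
	return nil
}

// stripHeading removes a leading H1 from the body (simplified: always strips).
func stripHeading(doc *Document) error {
	first, rest, _ := strings.Cut(doc.Content, "\n")
	if strings.HasPrefix(first, "# ") {
		doc.Content = strings.TrimLeft(rest, "\n")
	}
	return nil
}

// runPipeline applies transforms in slice order, stopping at the first error.
func runPipeline(doc *Document, transforms []Transform) error {
	for _, t := range transforms {
		if err := t(doc); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	doc := &Document{Path: "docs/README.md", Content: "# Guide\n\nHello.\n"}
	// Order matters: the title must be extracted before the heading is stripped.
	order := []Transform{normalizeIndexFiles, extractIndexTitle, stripHeading}
	if err := runPipeline(doc, order); err != nil {
		fmt.Println("pipeline failed:", err)
		return
	}
	fmt.Printf("%s title=%v content=%q\n", doc.Path, doc.FrontMatter["title"], doc.Content)
}
```

Because the order is an explicit slice, the dependency between `extractIndexTitle` and `stripHeading` is visible at the call site, with no resolver needed.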
Theme Integration:
- Supported themes use Hugo Modules (no local theme directory needed)
- Theme-specific configuration for Relearn
- Index template override search order ensures safe customization
- Front matter includes forge, repository, section, editURL for theme logic
Pruning Strategy
Optional top-level pruning removes non-doc directories to shrink the workspace footprint. Allow/deny precedence rules control it, to avoid accidental removal of required assets.
Design Rationale Highlights
| Concern | Approach |
|---|---|
| Cross-repo collisions | Conditional forge prefix + repository segmentation. |
| Performance | Incremental fetch + pruning + shallow clones. |
| Theming | Module-based imports; param injection per theme. |
| Observability | Structured build report + issue taxonomy + stage timing. |
| Reproducibility | Environment expansion + explicit config + stable hashing. |
Extensibility Points
- Add new transform: Create function in internal/hugo/pipeline/transforms.go and add to defaultTransforms() list
- Add new generator: Create function in internal/hugo/pipeline/generators.go and add to defaultGenerators() list
- Add new theme: Implement Theme interface in internal/hugo/theme/themes/
- Additional issue codes: Augment taxonomy without breaking consumers
- Future caching: Leverage doc_files_hash for selective downstream regeneration
Non-Goals
- Rendering arbitrary SSGs other than Hugo.
- Full-text search indexing logic (delegated to Hugo theme or external indexing).