@jrepp jrepp commented Nov 7, 2025

Summary

This PR implements pluggable upstream authentication for Goblet, enabling different upstream URLs to use different authentication credentials and token types. This is critical for multi-organization deployments and GitHub Enterprise compatibility.

Key Features:

  • URL-based token generation for org-specific authentication
  • Dynamic token type support (Bearer vs Basic for GitHub Enterprise)
  • Comprehensive test coverage (1,148 lines)
  • Streamlined documentation with better time-to-value

Merged Upstream PRs

PR #11: Pass upstream URL to token generation

Source: google#11
Author: @mdehoog

Enables custom token generation mechanisms for different upstreams. This matters when Goblet caches repositories from different organizations that each require a separate authentication token (e.g., GitHub App installation tokens).

Changes:

  • Changed the TokenSource type from oauth2.TokenSource to func(*url.URL) (*oauth2.Token, error)
  • Updated all token generation call sites to pass upstream URL
  • Allows per-upstream token customization
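A minimal sketch of what a URL-aware token source could look like, assuming tokens are selected by upstream host. The local Token struct is a stand-in for oauth2.Token so the example runs with the standard library only; newHostTokenSource, tokenTypeFor, errNoCredentials, and the host names are hypothetical and not part of this PR.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// Token is a local stand-in for oauth2.Token (assumption: only these two
// fields matter for this sketch).
type Token struct {
	AccessToken string
	TokenType   string
}

// TokenSource mirrors the function shape described in the change list above.
type TokenSource func(upstream *url.URL) (*Token, error)

// errNoCredentials is a hypothetical error for hosts without a configured token.
var errNoCredentials = errors.New("no credentials configured for upstream host")

// newHostTokenSource returns a TokenSource that picks a token by upstream host.
func newHostTokenSource(byHost map[string]Token) TokenSource {
	return func(upstream *url.URL) (*Token, error) {
		if upstream == nil {
			return nil, errors.New("nil upstream URL")
		}
		tok, ok := byHost[upstream.Hostname()]
		if !ok {
			return nil, fmt.Errorf("%w: %s", errNoCredentials, upstream.Hostname())
		}
		return &tok, nil
	}
}

// tokenTypeFor parses raw, asks ts for a token, and reports the token type.
func tokenTypeFor(ts TokenSource, raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	tok, err := ts(u)
	if err != nil {
		return "", err
	}
	return tok.TokenType, nil
}

func main() {
	ts := newHostTokenSource(map[string]Token{
		"github.com":      {AccessToken: "token-a", TokenType: "Bearer"},
		"ghe.example.com": {AccessToken: "token-b", TokenType: "Basic"},
	})
	for _, raw := range []string{
		"https://github.com/acme-corp/repo",
		"https://ghe.example.com/team/repo",
	} {
		typ, err := tokenTypeFor(ts, raw)
		fmt.Println(raw, typ, err)
	}
}
```

Returning an error for unknown hosts, rather than an empty token, is a design choice of this sketch; a deployment that also proxies public repositories might prefer to return an empty token instead.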

PR #10: Use token type for authentication

Source: google#10
Author: @mdehoog

Respects OAuth2 token type (Bearer vs Basic) for authentication. GitHub Enterprise expects personal access tokens to use Basic auth instead of Bearer tokens.

Changes:

  • Use token.Type() instead of a hardcoded "Bearer" in git fetch commands
  • Combined with the empty-token check, so no Authorization header is sent when the token is empty
  • No impact on existing users (Bearer remains the default)
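The behavior listed above can be sketched roughly as follows. This is an illustration under assumptions, not the code in managed_repository.go: Token is a local stand-in for oauth2.Token, and typeOrDefault and authorizationHeader are hypothetical helpers that mimic a "use the token's type, default to Bearer, skip empty tokens" rule.

```go
package main

import (
	"fmt"
	"strings"
)

// Token is a local stand-in for oauth2.Token.
type Token struct {
	AccessToken string
	TokenType   string
}

// typeOrDefault normalizes the token type and falls back to "Bearer" when unset.
func typeOrDefault(tokenType string) string {
	switch {
	case tokenType == "" || strings.EqualFold(tokenType, "bearer"):
		return "Bearer"
	case strings.EqualFold(tokenType, "basic"):
		return "Basic"
	default:
		// Custom types are passed through unchanged.
		return tokenType
	}
}

// authorizationHeader returns the header value and whether a header should be
// sent at all. Empty tokens (e.g. for public repositories) produce no header.
func authorizationHeader(tok *Token) (string, bool) {
	if tok == nil || tok.AccessToken == "" {
		return "", false
	}
	return typeOrDefault(tok.TokenType) + " " + tok.AccessToken, true
}

func main() {
	for _, tok := range []*Token{
		{AccessToken: "abc"},
		{AccessToken: "abc", TokenType: "Basic"},
		{},
	} {
		value, ok := authorizationHeader(tok)
		fmt.Printf("send=%v value=%q\n", ok, value)
	}
}
```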

Documentation

RFC-002: GitHub OAuth and Multi-Tenancy Architecture

Comprehensive RFC providing:

  • Current state analysis with authentication flow diagrams
  • GitHub authentication models (GitHub.com vs Enterprise)
  • Multi-tenancy threat model (CVSS 8.1 for cross-tenant access)
  • Security requirements and technical architecture
  • 5-phase implementation strategy (12-16 weeks)
  • Recommendations for GitHub Apps and cache partitioning

See RFC-002

Streamlined Documentation

README improvements:

  • Removed verbose content and emojis (40% size reduction)
  • Better time-to-value with focused Quick Start section
  • Generalized automation tool references (not Terraform-specific)
  • Moved detailed examples to dedicated documentation

New documentation:

  • docs/operations/offline-mode.md - Comprehensive offline mode guide with configuration examples, monitoring, testing, and best practices

Test Coverage

New Test Files (1,148 lines)

managed_repository_auth_test.go (475 lines)

Integration tests verifying TokenSource with managed_repository:

  • TokenSource called with correct upstream URL
  • Different token types (Bearer, Basic, custom) applied to HTTP requests
  • Empty token handling (no Authorization header for public repos)
  • Error propagation from TokenSource
  • Multiple token calls for refresh scenarios
  • URL verification (GitHub, GitHub Enterprise, GitLab, custom servers)
  • Concurrent token requests
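For readers who want to reproduce the header checks outside the test suite, here is a rough sketch using net/http/httptest. It is not taken from managed_repository_auth_test.go; Token is a local stand-in for oauth2.Token and observedAuthHeader is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Token is a local stand-in for oauth2.Token.
type Token struct {
	AccessToken string
	TokenType   string
}

// observedAuthHeader sends one request to a throwaway test server and returns
// the Authorization header the server received. An empty token sends no header.
func observedAuthHeader(tok *Token) (string, error) {
	seen := make(chan string, 1) // buffered so the handler never blocks
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		seen <- r.Header.Get("Authorization")
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	req, err := http.NewRequest(http.MethodGet, srv.URL, nil)
	if err != nil {
		return "", err
	}
	if tok != nil && tok.AccessToken != "" {
		typ := tok.TokenType
		if typ == "" {
			typ = "Bearer" // assumed default, matching the PR description
		}
		req.Header.Set("Authorization", typ+" "+tok.AccessToken)
	}

	resp, err := srv.Client().Do(req)
	if err != nil {
		return "", err
	}
	resp.Body.Close()
	return <-seen, nil
}

func main() {
	for _, tok := range []*Token{
		{AccessToken: "abc", TokenType: "Basic"},
		{AccessToken: "abc"},
		{},
	} {
		got, err := observedAuthHeader(tok)
		fmt.Printf("header=%q err=%v\n", got, err)
	}
}
```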

testing/upstream_auth_test.go (673 lines)

Unit tests for TokenSource function behavior:

  • URL-based token selection for different upstreams
  • Token type handling (Bearer, Basic, empty type defaulting to Bearer)
  • Organization-specific token mapping from URLs
  • Error handling (connection failures, missing config)
  • Concurrent calls (50+ simultaneous requests)
  • Empty tokens for public repositories
  • GitHub App installation token patterns
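The concurrency scenario in the list above might look something like the sketch below. It is an assumption-laden illustration rather than an excerpt from testing/upstream_auth_test.go: Token stands in for oauth2.Token, and countingSource and callConcurrently are hypothetical names.

```go
package main

import (
	"fmt"
	"net/url"
	"sync"
	"sync/atomic"
)

// Token is a local stand-in for oauth2.Token.
type Token struct {
	AccessToken string
	TokenType   string
}

// TokenSource mirrors the function signature introduced by this PR.
type TokenSource func(upstream *url.URL) (*Token, error)

// countingSource returns a fixed token and counts invocations atomically.
func countingSource(calls *int64) TokenSource {
	return func(_ *url.URL) (*Token, error) {
		atomic.AddInt64(calls, 1)
		return &Token{AccessToken: "test-token", TokenType: "Bearer"}, nil
	}
}

// callConcurrently invokes ts from n goroutines and returns how many calls
// failed or produced an empty token.
func callConcurrently(ts TokenSource, raw string, n int) (int64, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return 0, err
	}
	var failures int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			tok, err := ts(u)
			if err != nil || tok == nil || tok.AccessToken == "" {
				atomic.AddInt64(&failures, 1)
			}
		}()
	}
	wg.Wait()
	return failures, nil
}

func main() {
	var calls int64
	failures, err := callConcurrently(countingSource(&calls), "https://github.com/acme-corp/repo", 50)
	fmt.Println("calls:", calls, "failures:", failures, "err:", err)
}
```

Running a check like this with the race detector enabled (go test -race or go run -race) is what makes it useful, since a TokenSource backed by a shared cache is the likely place for data races.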

Test Results

✓ All tests passing with race detection
✓ Coverage: 84.7% of statements
✓ 14 new test functions
✓ All lint checks passing (golangci-lint, staticcheck, go vet)

What This Enables

Multi-Organization Authentication

TokenSource: func(upstreamURL *url.URL) (*oauth2.Token, error) {
    // Extract the GitHub org from the URL path,
    // e.g. github.com/acme-corp/repo → acme-corp
    org := extractGitHubOrg(upstreamURL)

    // Generate an org-specific token (e.g., a GitHub App installation token)
    return tokenManager.GetToken(upstreamURL, org)
}
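The snippet above leaves extractGitHubOrg undefined. One possible shape, assuming the organization is always the first path segment of a URL that includes a scheme, is sketched here; orgFromRawURL is a hypothetical convenience wrapper added only for the example.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// extractGitHubOrg returns the first path segment of the upstream URL, which
// for URLs shaped like https://host/<org>/<repo> is the organization.
// It returns "" when the URL is nil or has no path segments.
func extractGitHubOrg(upstream *url.URL) string {
	if upstream == nil {
		return ""
	}
	trimmed := strings.Trim(upstream.Path, "/")
	if trimmed == "" {
		return ""
	}
	org, _, _ := strings.Cut(trimmed, "/")
	return org
}

// orgFromRawURL parses raw and applies extractGitHubOrg.
func orgFromRawURL(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	return extractGitHubOrg(u), nil
}

func main() {
	for _, raw := range []string{
		"https://github.com/acme-corp/repo",
		"https://github.com/acme-corp/repo.git/info/refs",
		"https://github.com/",
	} {
		org, err := orgFromRawURL(raw)
		fmt.Printf("%s -> %q (err=%v)\n", raw, org, err)
	}
}
```

Note that a scheme-less string such as github.com/acme-corp/repo parses with an empty host, so the first path segment would be the host name; this sketch assumes Goblet hands the token source a fully qualified URL.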

GitHub Enterprise Support

// Respects token type for GHE
token := &oauth2.Token{
    AccessToken: "ghp_enterprise_token",
    TokenType:   "Basic",  // GHE requires Basic
}
// Result: "Authorization: Basic ghp_enterprise_token"
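Because the access token is placed after the token type verbatim, a server that enforces standard HTTP Basic authentication (RFC 7617) would expect the value to be the base64 encoding of "user:secret". The sketch below shows one way to build such a token; whether a particular GitHub Enterprise instance needs this form, and which user name it expects, is an assumption to verify against your deployment. Token is a local stand-in for oauth2.Token and basicToken is a hypothetical helper.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// Token is a local stand-in for oauth2.Token.
type Token struct {
	AccessToken string
	TokenType   string
}

// basicToken builds a token whose AccessToken is the base64-encoded
// "user:secret" pair defined by RFC 7617, with the type set to "Basic".
func basicToken(user, secret string) *Token {
	cred := base64.StdEncoding.EncodeToString([]byte(user + ":" + secret))
	return &Token{AccessToken: cred, TokenType: "Basic"}
}

func main() {
	tok := basicToken("user", "pass")
	// The header is assembled as "<type> <access token>".
	fmt.Println("Authorization: " + tok.TokenType + " " + tok.AccessToken)
}
```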

Files Changed

Core functionality:

  • goblet.go - Updated TokenSource type signature
  • goblet-server/main.go - Adapted to new TokenSource function signature
  • managed_repository.go - Updated all token calls to pass upstream URL and use dynamic token type
  • testing/test_proxy_server.go - Updated test helpers

Documentation:

  • README.md - Streamlined, removed emojis, better structure
  • SECURITY.md - Generalized automation tool references
  • docs/architecture/rfc-002-github-oauth-multi-tenancy.md - New comprehensive RFC (1200+ lines)
  • docs/operations/offline-mode.md - New comprehensive guide with examples

Tests:

  • managed_repository_auth_test.go - Integration tests (475 lines)
  • testing/upstream_auth_test.go - Unit tests (673 lines)

Validation

All checks passing:

  • ✓ Unit tests with race detection
  • ✓ Linters (golangci-lint, staticcheck, go vet)
  • ✓ Documentation link validation (276 links across 47 files)

Use Cases

Multi-Organization Deployments:

  • Different GitHub organizations requiring separate authentication
  • GitHub App installation tokens per organization
  • Enterprise customers with multiple tenants

GitHub Enterprise:

  • Respects Basic auth for personal access tokens
  • Compatible with GitHub Enterprise Server authentication

Infrastructure as Code:

  • CI/CD pipelines accessing multiple organizations
  • Terraform/Ansible/Pulumi automation with org-specific credentials
  • Security scanning across multiple repositories

Breaking Changes

Minor API Change:

The TokenSource field in ServerConfig changed from oauth2.TokenSource to func(*url.URL) (*oauth2.Token, error).

Migration:

// Before:
TokenSource: ts,

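// After: wrap the existing source. The upstream URL is ignored here,
// so every upstream keeps using the same token as before.
TokenSource: func(_ *url.URL) (*oauth2.Token, error) {
    return ts.Token()
},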

Security Implications

What This PR Provides:

  • ✓ Foundation for org-specific tokens
  • ✓ GitHub Enterprise compatibility
  • ✓ Enables secure multi-tenant architecture
  • ✓ Comprehensive test coverage

What Still Needs Implementation:

  • Authorization layer for repo-level access control
  • Token manager with GitHub App support
  • Cache partitioning for tenant isolation

See RFC-002 for complete security analysis and implementation roadmap.

Related Issues

Addresses needs for:

  • Multi-organization GitHub deployments
  • GitHub Enterprise compatibility
  • Custom token generation per upstream
  • Foundation for secure multi-tenant caching

Credit: Upstream features by @mdehoog (google#11, google#10)
Total Changes: +2,516 lines (code + tests + documentation)

mdehoog and others added 4 commits December 23, 2021 13:33
Merges google/goblet PR #11 by @mdehoog
google#11

Allows custom token generation mechanisms for different upstreams.
This is useful when Goblet caches repos from different organizations
where each needs its own token (e.g., GitHub app installation tokens).

Changes:
- Modified TokenSource from oauth2.TokenSource to a function accepting upstream URL
- Updated all token generation calls to pass upstream URL parameter

Merges google/goblet PR #10 by @mdehoog
google#10

Respects the token type (Bearer vs Basic) for authentication.
GitHub Enterprise expects personal access tokens using basic auth
instead of bearer. This change uses the token type from the token
itself rather than hardcoding 'Bearer'.

Changes:
- Changed hardcoded 'Bearer' to use t.Type() in git fetch commands
- Combined with empty token check from previous merge
- Already working for lsRefsUpstream via SetAuthHeader
- No impact on existing users (Bearer is the default)

Conflicts resolved:
- managed_repository.go: Combined t.Type() usage with empty token checks
@jrepp jrepp added the enhancement New feature or request label Nov 7, 2025
jrepp added 2 commits November 7, 2025 02:19
Comprehensive analysis of GitHub Enterprise and public GitHub OAuth
support with respect to multi-tenancy isolation concerns.

Covers:
- Current state analysis of authentication flows
- GitHub authentication models (Apps, PATs, OAuth Apps)
- Multi-tenancy isolation requirements and threat model
- Technical architecture for secure multi-tenant operation
- Implementation strategy (5 phases)
- Tradeoffs and recommendations
- Migration path from current to full implementation

Key findings:
- PR #7 provides critical foundation (URL-aware tokens, dynamic type)
- Complete solution requires: authorization layer + token manager + cache partitioning
- GitHub Apps recommended for production multi-tenant (automatic rotation, org-scoped)
- Estimated 12-16 weeks for full implementation

Related: PR #7, RFC-001

Update repository URLs from github-cache-daemon to goblet to match
upstream naming convention.

Changes:
- Updated RFC-002 PR link reference
- Updated CHANGELOG unreleased comparison link
jrepp added a commit that referenced this pull request Nov 7, 2025
jrepp added 4 commits November 7, 2025 23:36
- Format goblet-server/main.go with gofmt
- Update test_proxy_server.go to use new TokenSource function signature with adapter

…-fixes

# Conflicts:
#	docs/architecture/rfc-002-github-oauth-multi-tenancy.md

Add extensive test coverage for the URL-based TokenSource functionality
that enables different upstream URLs to use different authentication
credentials and token types.

New test files:
- managed_repository_auth_test.go: Integration tests for TokenSource
  with managed_repository, including token type handling (Bearer/Basic),
  URL passing, error propagation, and concurrent access
- testing/upstream_auth_test.go: Unit tests for TokenSource function
  behavior, including URL-based selection, org-specific tokens, GitHub
  App patterns, error handling, and concurrency

Test coverage includes:
- URL-based token selection for different upstreams (GitHub, GitLab, etc)
- Token type handling (Bearer, Basic, custom types)
- Organization-specific token mapping from URLs
- GitHub App installation token patterns
- Error handling and propagation
- Concurrent token requests (50+ concurrent calls)
- Empty token handling for public repositories
- Integration with managed_repository HTTP requests

All tests passing with 14 new test cases covering the pluggable
authentication feature.

- Add error checking for w.Write() calls in managed_repository_auth_test.go
- Add nolint directives for intentional test patterns in upstream_auth_test.go
- All lint checks now passing
@jrepp jrepp merged commit 14886a2 into main Nov 10, 2025
12 checks passed