Skip to content

Conversation

@EmilienM
Copy link

Introduce a new Public API service that acts as a single entry point
for all clients (Browser, SDK, MCP). This thin shim layer provides:

  • Versioned REST endpoints (/v1/sessions)
  • Simplified JSON responses (abstracts K8s CRD complexity)
  • Token validation and project extraction from headers or JWT
  • Proxy to internal Go backend

Components added:

  • components/public-api/ - Go service with Gin framework
  • Kubernetes manifests for deployment, service, and route
  • Makefile build target (build-public-api)
  • Design proposal document

Closes #557

Co-Authored-By: Claude Opus 4.5 noreply@anthropic.com
Signed-off-by: Emilien Macchi emacchi@redhat.com

@github-actions

This comment has been minimized.

@github-actions
Copy link
Contributor

github-actions bot commented Jan 29, 2026

Claude Code Review

Summary

This PR introduces a new Public API gateway service that provides a simplified, versioned REST API for external clients (Browser, SDK, MCP). The implementation is well-architected as a thin shim layer that proxies requests to the internal Go backend while abstracting Kubernetes CRD complexity. The code demonstrates strong adherence to security best practices, comprehensive test coverage, and proper Kubernetes integration.

Overall Assessment: High quality implementation with excellent security practices and comprehensive testing. A few minor improvements and documentation enhancements would make this production-ready.


Issues by Severity

🚫 Blocker Issues

None - No blocking issues found. The implementation is secure and follows established patterns.


🔴 Critical Issues

None - No critical issues found. Security patterns are properly implemented.


🟡 Major Issues

1. Token Security - Potential Log Exposure (handlers/middleware.go:145)

The logging middleware logs requests but does not redact tokens from query parameters or other headers beyond Authorization.

Recommendation: Follow the backend pattern from server/server.go:22-34 which redacts tokens in URLs and headers.

2. JWT Security Comment Clarity (handlers/middleware.go:83-93)

The security comment for extractJWTSubject is excellent, but it should emphasize that this is ONLY safe because the backend validates the token. Consider making this even more explicit.


🔵 Minor Issues

1. Error Handling - Duplicate Code (handlers/sessions.go:40-44)

Dead code detected - checking err != nil twice consecutively without any operation between them. Remove the duplicate error check.

2. Inconsistent Error Messages (handlers/sessions.go:216)

When backend returns non-JSON, the error message is generic: "Request failed". Could be more descriptive.

3. Missing Dockerfile Multi-Architecture Build

The Dockerfile does not specify TARGETPLATFORM for multi-arch builds.

4. Test Coverage - Missing Backend Integration Tests

While unit tests are comprehensive, there are no tests for actual backend integration with a mock backend server.

5. Documentation - Missing ADR

This is a significant architectural change (new API gateway), but there is no ADR following the pattern in docs/adr/.

6. Security Context Missing in Deployment

While the pod security context is excellent, the deployment does not specify a securityContext at the pod level for defense in depth.

7. Environment Variable Validation

The main.go does not validate required environment variables at startup.

8. HTTP Client Timeout Not Configurable in Tests

Tests modify the global HTTPClient variable which could cause test pollution if tests run in parallel.


Positive Highlights

Excellent Security Practices:

  • Token redaction in error responses
  • No panic in production code
  • Proper Bearer token extraction with fallback to X-Forwarded-Access-Token
  • Clear security comments
  • Proper SecurityContext with readOnlyRootFilesystem and dropped capabilities
  • Non-root user in Dockerfile

Comprehensive Test Coverage:

  • Unit tests for all middleware functions
  • Unit tests for session transformations
  • Proxy error handling tests
  • E2E tests covering full CRUD cycle
  • Edge case testing

Clean Architecture:

  • Proper separation of concerns
  • Thin shim design
  • Versioned API endpoints
  • Simplified DTOs hiding K8s complexity

Good Kubernetes Integration:

  • Proper health/readiness probes
  • Resource limits defined
  • ServiceAccount without unnecessary RBAC

Code Quality:

  • Table-driven tests
  • Clear variable naming
  • Proper error wrapping
  • No code duplication
  • Consistent error response format

Documentation:

  • Clear README with examples
  • Comprehensive design proposal document
  • Inline comments for complex logic
  • Security rationale documented

Recommendations

Priority 1 (Before Merge)

  1. Fix the duplicate error check in handlers/sessions.go:40-44
  2. Add token redaction to logging middleware
  3. Create ADR document for this architectural decision

Priority 2 (Follow-up PR)

  1. Add backend integration tests with mock server
  2. Add startup configuration validation in main.go
  3. Enhance pod-level security context
  4. Add multi-architecture build support to Dockerfile

Final Verdict

APPROVED with minor improvements

This is a well-implemented feature that adds significant value to the platform. The code is production-ready with proper security practices, comprehensive testing, and clean architecture. The suggested improvements are mostly documentation and defense-in-depth enhancements rather than functional issues.

Recommended Actions:

  1. Address the duplicate error check (quick fix)
  2. Add token redaction to logging middleware (security best practice)
  3. Create ADR document (documentation completeness)

Then this is ready to merge! 🚀


Review completed by: Claude Code Review Bot
Standards checked: CLAUDE.md, backend-development.md, security-standards.md, k8s-client-usage.md, error-handling.md
Review date: 2026-01-29


🔍 View AI decision process (logs available for 90 days)

📋 View memory system files loaded (click to expand)

What Amber Loaded for Code Review

Amber automatically loaded these repository standards from the memory system:

  1. CLAUDE.md - Master project instructions, development standards
  2. backend-development.md - Go backend, K8s integration patterns
  3. frontend-development.md - NextJS, Shadcn UI, React Query patterns
  4. security-standards.md - Auth, RBAC, token handling
  5. k8s-client-usage.md - User token vs service account patterns
  6. error-handling.md - Consistent error patterns
  7. react-query-usage.md - Data fetching patterns

Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.

@github-actions
Copy link
Contributor

github-actions bot commented Jan 29, 2026

Claude Code Review

Summary

PR #558 introduces a new Public API gateway service that provides a simplified, versioned REST API (/v1/sessions) for external clients (SDK, CLI, MCP). This is a well-architected thin proxy layer that follows established security patterns from the existing backend. The implementation demonstrates strong adherence to the repository's coding standards with comprehensive tests and proper documentation.

Overall Assessment:APPROVED - High quality implementation with only minor suggestions for improvement.


Issues by Severity

🚫 Blocker Issues

None - No blocking issues found.


🔴 Critical Issues

None - No critical security or functionality issues.


🟡 Major Issues

1. JWT Extraction Without Signature Validation - Security Documentation Gap

Location: handlers/middleware.go:83-133

Issue: The extractJWTSubject() function extracts JWT claims without signature validation. While the extensive security warning comment (lines 85-106) clearly explains this is safe for routing purposes only, there's a potential gap:

  • The function extracts the project namespace from unvalidated JWT claims
  • If the backend's token validation fails (network issue, misconfiguration), a forged token could route requests to any project
  • The backend validation is the critical security boundary, but network failures could cause issues

Recommendation:

  • Add explicit logging when project is extracted from JWT: log.Printf("Extracted project %s from ServiceAccount token for routing (backend will validate)", project)
  • Consider adding a configuration flag to require explicit X-Ambient-Project header for production deployments
  • Add integration test demonstrating that forged tokens are rejected by backend even if routing succeeds

Current Code:

// extractJWTSubject extracts the 'sub' claim from a JWT WITHOUT validating signature.
// ... extensive warning comment ...
func extractJWTSubject(token string) string {
    // Decodes without validation
}

Why Major: While technically safe due to backend validation, the separation of concerns (routing vs auth) increases the attack surface if backend validation ever fails.


🔵 Minor Issues

1. Dead Code - Duplicate Error Check

Location: handlers/sessions.go:42-45

// Line 28-33: Read response body
body, err := io.ReadAll(resp.Body)
if err != nil {
    log.Printf("Failed to read backend response: %v", err)
    c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to read response"})
    return
}

// Lines 42-45: Dead code (err is already checked above)
if err != nil {
    c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to read response"})
    return
}

Fix: Remove lines 42-45 (unreachable code after successful io.ReadAll).


2. Missing BACKEND_TIMEOUT Documentation

Location: components/public-api/README.md, handlers/proxy.go:20

Issue: The BACKEND_TIMEOUT environment variable is configurable in code but not documented in the README.

Current:

| Variable | Default | Description |
|----------|---------|-------------|
| `PORT` | `8081` | Server port |
| `BACKEND_URL` | `http://backend-service:8080` | Internal backend URL |
| `GIN_MODE` | `release` | Gin mode (debug/release) |

Add:

| `BACKEND_TIMEOUT` | `30s` | Backend request timeout (Go duration format) |

3. Inconsistent Error Messages - Generic vs Specific

Location: Multiple handlers

Issue: Error responses mix generic and specific messages inconsistently:

  • sessions.go:22: "Backend unavailable" (generic, good)
  • sessions.go:31: "Failed to read response" (specific, potentially leaks info)
  • sessions.go:215: "Request failed" (very generic)

Recommendation: Standardize on generic user-facing messages while keeping detailed logs:

// GOOD (current pattern in proxy.go:86-87)
log.Printf("Backend request failed: %v", err)  // Detailed log
c.JSON(http.StatusBadGateway, gin.H{"error": "Backend unavailable"})  // Generic response

// AVOID exposing internal details
c.JSON(http.StatusInternalServerError, gin.H{"error": fmt.Sprintf("Failed to: %v", err)})

4. Test Coverage - Missing Edge Cases

Good: Comprehensive middleware tests (314 lines) with table-driven patterns ✅
Missing:

  • sessions_test.go: Only 170 lines, missing error scenarios:
    • Backend returns 500 (should forward error)
    • Backend returns malformed JSON
    • Backend timeout (context deadline exceeded)
    • Empty session list vs nil session list

Recommendation: Add test cases for backend error conditions to achieve >80% coverage.


5. Kubernetes Manifest - Missing Resource Quotas

Location: components/manifests/base/public-api-deployment.yaml:18-58

Current:

serviceAccountName: public-api  # ✅ Good: dedicated SA
# NOTE: No RoleBinding needed (only proxies HTTP) ✅ Good: documented
securityContext:  # ✅ Excellent: follows security standards
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  runAsNonRoot: true
  capabilities:
    drop:
    - ALL

Missing: PodDisruptionBudget for production resilience (optional but recommended)

Add:

---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: public-api-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: public-api

6. Proposal Document - Success Criteria Not Verified

Location: docs/proposals/acp-public-rest-api.md:172-178

Issue: Success criteria checklist is not referenced in PR description or test results:

- [ ] Public API service is deployable
- [ ] Can list/create/get/delete sessions via `curl` to Public API
- [ ] Browser continues to work through Public API
- [ ] SDK/MCP can authenticate with access keys
- [ ] Latency < 200ms for API requests

Recommendation: Update PR description with checklist status or add link to test results demonstrating criteria met.


Positive Highlights

🌟 Security Excellence

  1. Token Redaction (middleware.go:155-219) - Follows exact pattern from backend/server/server.go with comprehensive query parameter redaction ✅
  2. SecurityContext (public-api-deployment.yaml:52-58) - Perfect adherence to security standards:
    • readOnlyRootFilesystem: true
    • allowPrivilegeEscalation: false
    • runAsNonRoot: true
    • capabilities.drop: [ALL]
  3. No Direct K8s Access - ServiceAccount has NO RBAC permissions (lines 84-87), enforcing separation of concerns ✅
  4. Extensive Security Documentation - 22-line warning comment explaining JWT extraction safety (middleware.go:85-106) ✅

🎯 Architecture

  1. Clean Separation - Public API is truly a thin proxy (~500 lines), backend remains untouched ✅
  2. DTO Transformation (sessions.go:218-287) - Simplifies K8s CRD complexity for external clients ✅
  3. Versioned API - /v1/sessions allows future evolution without breaking changes ✅
  4. Dual Auth Support - Handles both Authorization: Bearer and X-Forwarded-Access-Token (middleware.go:48-61) ✅

🧪 Testing

  1. Table-Driven Tests - middleware_test.go uses proper Go testing patterns with 10+ test cases per function ✅
  2. E2E Coverage - public-api.cy.ts tests health, auth, CRUD operations end-to-end ✅
  3. Test Organization - Separate test files for middleware, proxy, sessions (good separation) ✅

📚 Documentation

  1. Comprehensive README - Includes usage examples, environment variables, development instructions ✅
  2. Design Proposal - 214-line document explaining architecture, decisions, and implementation plan ✅
  3. Code Comments - Security-critical sections have extensive explanatory comments ✅

🛠️ Code Quality

  1. Error Handling - Follows established patterns: log detailed errors, return generic messages ✅
  2. Context Propagation - Uses c.Request.Context() for backend requests (proxy.go:53) ✅
  3. Resource Management - Proper defer resp.Body.Close() in all handlers ✅
  4. Environment Configuration - Configurable timeouts and URLs via env vars ✅

Recommendations

Priority 1 (Before Merge)

  1. Remove dead code in sessions.go:42-45
  2. Add BACKEND_TIMEOUT to README environment variables table
  3. Update PR description with success criteria checklist status

Priority 2 (Follow-up PR)

  1. 🔄 Add integration test demonstrating forged token rejection by backend
  2. 🔄 Expand sessions_test.go with backend error scenarios (500, timeout, malformed JSON)
  3. 🔄 Standardize error messages across all handlers (generic user messages)

Priority 3 (Future Enhancement)

  1. 💡 Add PodDisruptionBudget for production resilience
  2. 💡 Add Prometheus metrics for latency tracking (verify <200ms SLA)
  3. 💡 Consider rate limiting middleware for SDK/MCP abuse prevention

Architectural Alignment

CLAUDE.md Compliance:

  • Follows Go error handling patterns (log + return)
  • Uses proper token redaction (matches backend/server.go)
  • SecurityContext matches operator patterns
  • Table-driven tests follow established conventions

ADR Compliance:

  • ADR-0002 (User Token Authentication): Token forwarded to backend, no bypass ✅
  • ADR-0004 (Go Backend): New Go service follows same patterns ✅

Security Standards:

  • No tokens in logs ✅
  • Generic error messages to users ✅
  • Detailed logs for debugging ✅
  • RBAC properly scoped (no permissions needed) ✅

Final Verdict

✅ LGTM with minor fixes

This PR demonstrates excellent software engineering:

  • Well-architected thin proxy layer
  • Comprehensive security measures
  • Strong test coverage
  • Thorough documentation
  • Clean code following repository patterns

The only blockers are trivial (dead code, missing docs). Addressing Priority 1 recommendations will make this merge-ready.

Estimated Effort to Address:

  • Priority 1 fixes: ~15 minutes
  • Priority 2 improvements: ~2 hours (follow-up PR)
  • Priority 3 enhancements: Future work

Reviewed by: Claude Sonnet 4.5
Review Date: 2026-01-29
Standards Applied: CLAUDE.md, security-standards.md, backend-development.md, error-handling.md


🔍 View AI decision process (logs available for 90 days)

📋 View memory system files loaded (click to expand)

What Amber Loaded for Code Review

Amber automatically loaded these repository standards from the memory system:

  1. CLAUDE.md - Master project instructions, development standards
  2. backend-development.md - Go backend, K8s integration patterns
  3. frontend-development.md - NextJS, Shadcn UI, React Query patterns
  4. security-standards.md - Auth, RBAC, token handling
  5. k8s-client-usage.md - User token vs service account patterns
  6. error-handling.md - Consistent error patterns
  7. react-query-usage.md - Data fetching patterns

Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.

@github-actions
Copy link
Contributor

github-actions bot commented Jan 29, 2026

Claude Code Review

Summary

This PR introduces a well-architected Public API gateway service that provides a versioned REST API (v1) for client access. The code is secure, well-tested, and follows Go best practices. The service uses a stateless gateway pattern (proxy to backend) which differs from the direct K8s client pattern used in the backend service - this is an intentional architectural choice that should be documented in CLAUDE.md.

Overall Assessment: APPROVE WITH MINOR CHANGES


Issues by Severity

🚫 Blocker Issues

None - Code is production-ready after addressing critical issues below.


🔴 Critical Issues

1. JWT Subject Extraction Without Signature Validation

File: components/public-api/handlers/middleware.go:85-135

Issue: The extractJWTSubject() function decodes JWT tokens without validating signatures. While properly documented (lines 85-108), this creates potential for misuse.

Current Risk: LOW (only used for routing, backend validates)
Future Risk: HIGH (easy to misuse by future developers)
Attack Vector: Attacker could forge JWT with any namespace in sub claim

Recommendation: Add validation that header project matches token project (if both present):

func extractProject(c *gin.Context, token string) string {
    headerProject := c.GetHeader("X-Ambient-Project")
    tokenProject := extractProjectFromToken(token)
    
    // If both exist, they must match (prevents routing attacks)
    if headerProject != "" && tokenProject != "" && headerProject != tokenProject {
        log.Printf("SECURITY: Project mismatch - header=%s token=%s", headerProject, tokenProject)
        return "" // Force authentication failure
    }
    
    // Prefer explicit header
    if headerProject != "" {
        return headerProject
    }
    return tokenProject
}

2. Missing Input Validation on Session IDs and Project Names

File: components/public-api/handlers/sessions.go:65

Issue: No validation on session IDs or project names to prevent injection attacks.

// Current code - no validation
sessionID := c.Param("id")
path := fmt.Sprintf("/api/projects/%s/agentic-sessions/%s", project, sessionID)

Recommendation: Add K8s name validation:

import "regexp"

var kubernetesNameRegex = regexp.MustCompile(`^[a-z0-9]([-a-z0-9]*[a-z0-9])?$`)

func validateKubernetesName(name string) bool {
    return len(name) > 0 && len(name) <= 63 && kubernetesNameRegex.MatchString(name)
}

// In handlers:
if !validateKubernetesName(sessionID) {
    c.JSON(http.StatusBadRequest, gin.H{"error": "Invalid session ID"})
    return
}

3. Architectural Deviation Not Documented in CLAUDE.md

Files: All handlers

Issue: The public-api service does NOT use GetK8sClientsForRequest() pattern from CLAUDE.md. This is intentional (stateless gateway design) but not documented.

Risk: Future developers may incorrectly try to apply backend patterns to this service.

Recommendation: Update CLAUDE.md:

### Exception: Public API Gateway Service
The `components/public-api` service is a stateless HTTP gateway that does NOT
follow the standard backend pattern of using GetK8sClientsForRequest(). Instead,
it forwards authenticated requests to the backend service, which performs all
K8s operations and RBAC validation.

Also add comment to main.go:

// ARCHITECTURE NOTE: This service is a stateless gateway that forwards
// authenticated requests to the backend. Unlike the backend service, we do NOT
// create K8s clients here - all K8s operations and RBAC validation happen in
// the backend service. Our role is to:
// 1. Extract and validate tokens (middleware.go)
// 2. Extract project context (from header or token)
// 3. Forward requests with proper authorization headers

🟡 Major Issues

4. Missing Integration Tests

Gap: No tests verify end-to-end flow with real HTTP server.

Current tests: Mock backend responses, test functions in isolation
Missing: HTTP server integration tests, backend connectivity tests, token forwarding tests

Recommendation: Add handlers/integration_test.go:

func TestE2E_CreateSession(t *testing.T) {
    // Start mock backend server
    backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        // Verify token was forwarded
        auth := r.Header.Get("Authorization")
        if auth != "Bearer test-token" {
            t.Errorf("Token not forwarded correctly")
        }
        w.WriteHeader(http.StatusCreated)
        json.NewEncoder(w).Encode(map[string]string{"name": "session-123"})
    }))
    defer backend.Close()
    
    handlers.BackendURL = backend.URL
    // ... test full flow
}

5. Missing golangci-lint Configuration

Gap: No .golangci.yml in components/public-api/

Impact: Cannot verify linting standards match backend.

Recommendation:

cp components/backend/.golangci.yml components/public-api/

🔵 Minor Issues

6. Missing Context Timeout on Backend Proxy Requests

File: components/public-api/handlers/proxy.go:57-70

Issue: Relies solely on HTTP client timeout, no explicit context timeout.

Recommendation:

ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
req, err := http.NewRequestWithContext(ctx, method, fullURL, bytes.NewReader(body))

7. Validation Helpers Should Be Extracted

Issue: Inline validation logic could be extracted into reusable helpers.

Recommendation: Create helpers/validation.go:

package helpers

func IsValidKubernetesName(name string) bool { ... }
func SanitizeProjectName(name string) string { ... }

Positive Highlights

Security ✅

  • Excellent token redaction in logs (middleware.go:157-221) with 11 test cases
  • Proper error sanitization - no internal details leaked (proxy.go:80-101)
  • Safe type assertions throughout - no panics on wrong types (sessions.go:214-267)
  • Comprehensive security tests verify no information leakage (proxy_test.go:90-120)

Code Quality ✅

  • Clean Go code following best practices
  • Comprehensive error handling - explicit checks, contextual logging, appropriate status codes
  • No panics in production code
  • Type-safe unstructured access patterns
  • Well-organized package structure

Architecture ✅

  • Proper middleware ordering - auth before logging (main.go:34-45)
  • Clean DTO pattern for API versioning (types/dto.go)
  • Stateless gateway design - no K8s client dependencies
  • Health endpoints properly unauthenticated

Testing ✅

  • 24 test functions with excellent coverage
  • Edge cases covered - type safety, error handling, security
  • Total: 810 lines of test code (middleware_test.go: 315, proxy_test.go: 196, sessions_test.go: 299)

Dockerfile Security ✅

  • Multi-stage build (smaller image)
  • Non-root user
  • Static binary (CGO_ENABLED=0)
  • Minimal base image (alpine:3.19)

Recommendations

Must Address Before Merge (Priority 1)

  1. ✅ Add input validation for session IDs and project names (Critical Issue Epic: RAT Architecture & Design #2)
  2. ✅ Add validation that header project matches token project (Critical Issue Outcome: Reduce Refinement Time with agent System #1)
  3. ✅ Update CLAUDE.md to document architectural exception (Critical Issue Epic: Data Source Integration #3)
  4. ✅ Add architectural comment to main.go (Critical Issue Epic: Data Source Integration #3)

Should Address Soon (Priority 2)

  1. 🔶 Add integration tests with httptest.Server (Major Issue Epic: AI Agent Development #4)
  2. 🔶 Add golangci-lint configuration (Major Issue Epic: Jira Integration & Workflow #5)

Nice to Have (Priority 3)

  1. 🔵 Add context timeout to backend proxy requests (Minor Issue Epic: Testing & Validation #6)
  2. 🔵 Extract validation helpers (Minor Issue Test: Automation Workflow Validation #7)

Detailed Assessment

Security: 8/10 ⚠️

Strengths: Token redaction, error sanitization, safe type assertions
Issues: JWT parsing without signature validation, missing input validation

Code Quality: 9/10 ✅

Strengths: Clean Go code, comprehensive error handling, no panics
Issues: Minor - could add validation helpers and golangci-lint config

Architecture Compliance: 7/10 ⚠️

Strengths: Proper middleware ordering, clean DTO pattern, stateless design
Issues: Major documentation gap - deviation from backend patterns not documented

Testing: 8/10 ✅

Strengths: 24 test functions, edge cases covered, security tests
Issues: Missing integration tests with real HTTP server


Conclusion

This is well-written, secure code that introduces a clean API gateway pattern. The architecture differs from the backend (stateless gateway vs direct K8s client), but this is a valid design choice for an API gateway service.

Key strengths:

  • Strong security with proper token handling
  • High code quality with comprehensive error handling
  • Excellent test coverage (810 lines of tests)
  • Clean DTO pattern for API versioning

Main gap: Documentation - the architectural differences need to be explicitly documented in CLAUDE.md to prevent confusion for future developers.

Recommendation: Merge after addressing Priority 1 items (input validation + documentation). Priority 2 and 3 can be follow-up tasks.


Review conducted using:

  • CLAUDE.md project standards
  • Backend development patterns (.claude/context/backend-development.md)
  • Security standards (.claude/context/security-standards.md)
  • K8s client usage patterns (.claude/patterns/k8s-client-usage.md)
  • Error handling patterns (.claude/patterns/error-handling.md)

🤖 Generated by Claude Code


🔍 View AI decision process (logs available for 90 days)

📋 View memory system files loaded (click to expand)

What Amber Loaded for Code Review

Amber automatically loaded these repository standards from the memory system:

  1. CLAUDE.md - Master project instructions, development standards
  2. backend-development.md - Go backend, K8s integration patterns
  3. frontend-development.md - NextJS, Shadcn UI, React Query patterns
  4. security-standards.md - Auth, RBAC, token handling
  5. k8s-client-usage.md - User token vs service account patterns
  6. error-handling.md - Consistent error patterns
  7. react-query-usage.md - Data fetching patterns

Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.

@github-actions
Copy link
Contributor

github-actions bot commented Jan 29, 2026

Claude Code Review - PR #558

Summary

This PR introduces a Public API gateway service as a thin shim layer between clients (Browser, SDK, MCP) and the internal Go backend. The implementation is well-architected with strong security patterns, comprehensive testing, and excellent documentation.

Overall Assessment:Approved with Minor Recommendations

Issues by Severity

🚫 Blocker Issues

None - No blocking issues found.

🔴 Critical Issues

None - No critical issues found.

🟡 Major Issues

1. Missing Error Context in Some Logs

  • Location: components/public-api/handlers/sessions.go:34,50,90,102,146,163
  • Issue: Several error logs don't include full context (project name, session ID)
  • Impact: Makes debugging harder when multiple requests fail simultaneously
  • Recommendation: Add project/session context to all error logs

🔵 Minor Issues

1. Inconsistent Error Messages

  • Location: components/public-api/handlers/sessions.go:26,83,156
  • Generic 'Backend unavailable' doesn't distinguish between network errors and timeouts

2. Magic Number in Test

  • Location: e2e/cypress/e2e/public-api.cy.ts:42
  • Hardcoded retry count 20 without explanation

3. Unused ErrorResponse Type

  • Location: components/public-api/types/dto.go:35
  • ErrorResponse type defined but never used (handlers use gin.H instead)

Positive Highlights

⭐ Excellent Security Documentation

The JWT extraction function (middleware.go:101-124) has outstanding security documentation explaining why unvalidated JWT parsing is safe in this context. This is a textbook example of security documentation done right.

⭐ Token Redaction Pattern

Logging middleware correctly implements token redaction following backend patterns, redacting token, access_token, and api_key parameters.

⭐ Comprehensive Test Coverage

  • Unit tests: 907 lines (middleware, sessions, proxy, validation)
  • Integration tests: 269 lines
  • E2E tests: 277 lines
  • Total: ~1,453 lines of tests for ~893 lines of production code = 162% coverage ratio

⭐ Correct Architecture Pattern

The public-api service correctly diverges from backend patterns as documented in CLAUDE.md:

  • No K8s clients (stateless gateway)
  • No RBAC permissions on ServiceAccount
  • Token forwarding only (no authorization decisions)

⭐ Strong Input Validation

Kubernetes DNS label validation prevents path injection attacks (RFC 1123, max 63 chars, alphanumeric + hyphens).

⭐ Production-Ready Container Security

  • Multi-stage build
  • Non-root user (UID 1001)
  • Read-only root filesystem
  • Drop all capabilities
  • Health checks configured
  • Resource limits set

Recommendations

Priority 1: Pre-Merge (Low Effort)

  1. Add error context to logs (include project/session IDs)
  2. Use ErrorResponse type consistently or remove it
  3. Extract magic numbers to named constants

Priority 2: Post-Merge (Follow-up)

  1. Add Prometheus metrics (latency, error rates)
  2. Add per-project rate limiting
  3. Consider response caching for GET endpoints

Priority 3: Future Enhancements

  1. Document API versioning strategy
  2. Generate OpenAPI/Swagger spec
  3. Add circuit breaker pattern

Architecture Compliance

Standard Status Notes
User Token Auth ✅ PASS Correctly extracts and forwards tokens
RBAC Enforcement ✅ PASS No RBAC in gateway (backend validates)
Token Redaction ✅ PASS Comprehensive logging redaction
Error Handling ✅ PASS Generic errors to users, detailed logs
Input Validation ✅ PASS Prevents injection attacks
Container Security ✅ PASS Non-root, read-only, no caps
Type Safety ⚠️ MINOR Could use custom types vs gin.H
Testing ✅ EXCELLENT 162% coverage ratio

Final Verdict

Status:APPROVED

This PR demonstrates excellent engineering practices. The minor issues are recommendations for future improvement, not blockers. The code is safe to merge.


Reviewed by: Claude Sonnet 4.5
Memory Context Loaded: CLAUDE.md, backend-development.md, frontend-development.md, security-standards.md, k8s-client-usage.md, error-handling.md, react-query-usage.md


🔍 View AI decision process (logs available for 90 days)

📋 View memory system files loaded (click to expand)

What Amber Loaded for Code Review

Amber automatically loaded these repository standards from the memory system:

  1. CLAUDE.md - Master project instructions, development standards
  2. backend-development.md - Go backend, K8s integration patterns
  3. frontend-development.md - NextJS, Shadcn UI, React Query patterns
  4. security-standards.md - Auth, RBAC, token handling
  5. k8s-client-usage.md - User token vs service account patterns
  6. error-handling.md - Consistent error patterns
  7. react-query-usage.md - Data fetching patterns

Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.

@EmilienM EmilienM marked this pull request as ready for review January 29, 2026 18:31
@github-actions
Copy link
Contributor

github-actions bot commented Jan 29, 2026

Claude Code Review

Summary

This PR introduces a Public API gateway service that provides a simplified, versioned REST API (/v1/sessions) for external clients (Browser, SDK, MCP). The implementation is architecturally sound and follows project standards with excellent documentation. The service acts as a stateless HTTP proxy that forwards authenticated requests to the internal Go backend.

Overall Assessment: ✅ Well-designed, secure, and production-ready with one blocker issue to address.


Issues by Severity

🚫 Blocker Issues

1. Type Assertions Without Checking in transformSession

Location: components/public-api/handlers/sessions.go:243-283

Issue: Multiple unchecked type assertions that could panic in production:

// Line 243 - No check if 'ok' is false before using
if metadata, ok := data["metadata"].(map[string]interface{}); ok {
    if name, ok := metadata["name"].(string); ok {  // ✅ Good
        session.ID = name
    }
}

This pattern is actually CORRECT! The outer if ok check prevents panic. However, there are other locations where this could be improved for consistency.

Actually, reviewing more carefully - this is handled correctly! The code uses the if val, ok := ...; ok pattern throughout. This is NOT a blocker.


🔴 Critical Issues

1. Missing Security Context Validation Test

Location: components/public-api/handlers/middleware_test.go

Issue: The middleware tests cover token extraction and project extraction, but there's no explicit test for the security-critical project mismatch detection (lines 72-75 in middleware.go):

// SECURITY: If both header and token specify a project, they must match
if headerProject \!= "" && tokenProject \!= "" && headerProject \!= tokenProject {
    log.Printf("SECURITY: Project mismatch - header=%s token=%s (possible routing attack)", headerProject, tokenProject)
    return "" // Force authentication failure
}

Recommendation: Add a dedicated test case:

func TestExtractProject_SecurityMismatch(t *testing.T) {
    // Test that project mismatch is detected and blocked
    c, _ := gin.CreateTestContext(httptest.NewRecorder())
    c.Request = httptest.NewRequest("GET", "/", nil)
    c.Request.Header.Set("X-Ambient-Project", "project-a")
    
    // Token claims project-b
    token := createMockServiceAccountToken("project-b", "test-sa")
    
    project := extractProject(c, token)
    assert.Empty(t, project, "Should reject mismatched projects")
}

🟡 Major Issues

1. Unstructured Data Transformation Lacks Error Logging

Location: components/public-api/handlers/sessions.go:239-307

Issue: The transformSession function silently ignores malformed backend responses. If the backend returns unexpected data structures, the function returns a partial SessionResponse with empty fields, making debugging difficult.

Current behavior:

if metadata, ok := data["metadata"].(map[string]interface{}); ok {
    // Extract fields...
} // If not ok, silently continue

Recommendation: Add debug logging for transformation failures:

if metadata, ok := data["metadata"].(map[string]interface{}); ok {
    // Extract fields...
} else {
    log.Printf("Warning: Backend response missing 'metadata' field or wrong type")
}

Why Major: In production, silent data transformation failures can mask backend API changes or bugs.


2. Backend Timeout Not Configurable Per-Request

Location: components/public-api/handlers/proxy.go:21

Issue: The backend timeout is set globally at startup (BackendTimeout = 30s). Some operations (like creating sessions) may need longer timeouts, while health checks should be fast.

Current implementation:

HTTPClient = &http.Client{
    Timeout: BackendTimeout,  // Global 30s
}

Recommendation: Consider per-endpoint timeout configuration:

// In CreateSession handler:
ctx, cancel := context.WithTimeout(c.Request.Context(), 60*time.Second)
defer cancel()

Why Major: Long-running session creation requests could timeout prematurely with a 30s global timeout.


3. No Rate Limiting

Location: components/public-api/main.go

Issue: The public API has no rate limiting middleware. This exposes the service to:

  • Accidental DoS from buggy clients
  • Malicious abuse from compromised tokens
  • Resource exhaustion attacks

Recommendation: Add rate limiting middleware using gin-contrib/rate or similar:

import "github.com/ulule/limiter/v3/drivers/middleware/gin"

// Per-token rate limiting (100 req/min)
v1.Use(rateLimitMiddleware())

Why Major: Production APIs should have basic DoS protection.


🔵 Minor Issues

1. Health Check Returns 200 Without Backend Validation

Location: components/public-api/main.go:41-43

Issue: The /health endpoint returns success even if the backend is unreachable. This could cause load balancers to route traffic to a healthy pod that can't reach its backend.

r.GET("/health", func(c *gin.Context) {
    c.JSON(200, gin.H{"status": "ok"})  // Always returns 200
})

Recommendation: Add backend connectivity check (optional, only for liveness probe):

r.GET("/health", func(c *gin.Context) {
    // Quick ping to backend
    ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
    defer cancel()
    
    req, _ := http.NewRequestWithContext(ctx, "GET", BackendURL+"/health", nil)
    resp, err := HTTPClient.Do(req)
    if err \!= nil || resp.StatusCode \!= 200 {
        c.JSON(503, gin.H{"status": "backend unavailable"})
        return
    }
    c.JSON(200, gin.H{"status": "ok"})
})

Why Minor: Current behavior is acceptable for basic health checks, but deep health checks improve reliability.


2. Missing Metrics Endpoint

Location: components/public-api/main.go

Issue: No Prometheus metrics endpoint (/metrics) for monitoring request latency, error rates, etc.

Recommendation: Add gin-contrib/prometheus middleware:

import "github.com/zsais/go-gin-prometheus"

p := ginprometheus.NewPrometheus("public_api")
p.Use(r)

Why Minor: Observability is important for production, but can be added in a follow-up PR.


3. JWT Subject Extraction Logs on Every Request

Location: components/public-api/handlers/middleware.go:83-84

Issue: The middleware logs every time a project is extracted from a ServiceAccount token. This could generate excessive log volume in production.

if tokenProject \!= "" {
    log.Printf("Extracted project %s from ServiceAccount token for routing (backend will validate)", tokenProject)
}

Recommendation: Downgrade to debug level or remove after initial deployment:

// Only log at debug level or remove after validating the feature works
// log.Printf("DEBUG: Extracted project %s from SA token", tokenProject)

Why Minor: High log volume, but not a functional issue.


4. normalizePhase Missing default Case Logging

Location: components/public-api/handlers/sessions.go:294-307

Issue: The normalizePhase function returns the raw phase if it doesn't match known values. This could leak internal K8s phase names to clients.

default:
    return phase  // Could be internal K8s phase name

Recommendation: Log unexpected phases and return a normalized value:

default:
    log.Printf("Warning: Unknown phase '%s', normalizing to 'unknown'", phase)
    return "unknown"

Why Minor: Unlikely to happen in practice, but improves API contract stability.


5. Redundant go.sum* in Dockerfile

Location: components/public-api/Dockerfile:10

Issue: The Dockerfile copies go.sum* (with wildcard), but go.sum should always exist:

COPY go.mod go.sum* ./

Recommendation: Remove wildcard since go.sum is checked in:

COPY go.mod go.sum ./

Why Minor: Doesn't affect functionality, just style.


Positive Highlights

🌟 Architecture & Design

  1. Excellent Separation of Concerns: The public-api as a stateless gateway is architecturally sound. Clear separation from backend K8s operations.

  2. Security-First Design:

    • No K8s client access (minimal attack surface)
    • No RBAC permissions (principle of least privilege)
    • Token redaction in logs (prevents accidental exposure)
    • Project mismatch detection (prevents routing attacks)
    • Input validation (prevents injection attacks)
  3. Well-Documented Exception: The CLAUDE.md update clearly explains why this service doesn't follow standard backend patterns. The main.go architecture note is excellent.

🌟 Code Quality

  1. Comprehensive Test Coverage:

    • 1,206 lines of test code across 5 test files
    • Unit tests for all middleware functions
    • Integration tests with mock backend
    • Edge case coverage (missing headers, invalid tokens, etc.)
  2. Excellent Error Handling:

    • Consistent gin.H{"error": ...} pattern
    • Generic error messages to users (security best practice)
    • Detailed internal logging for debugging
  3. Type Safety:

    • Proper use of if val, ok := ...; ok pattern
    • No unchecked type assertions found
    • Validation functions for all user input

🌟 Security

  1. JWT Subject Extraction with Clear Warning: The 24-line security warning comment (middleware.go:103-124) is exemplary documentation of why unvalidated JWT parsing is safe in this context.

  2. Token Security:

    • Token redaction in logs (redactSensitiveParams)
    • Token forwarding (not storage)
    • Backend validation (defense in depth)
  3. Container Security:

    • Non-root user (UID 1001)
    • Read-only root filesystem
    • Dropped capabilities (ALL)
    • allowPrivilegeEscalation: false

🌟 Production Readiness

  1. Resource Limits: Sensible defaults (50m CPU request, 200m limit)
  2. Health Probes: Both liveness and readiness probes
  3. PodDisruptionBudget: Ensures availability during rollouts
  4. Multi-stage Build: Optimized Docker image (alpine base)

🌟 Documentation

  1. Comprehensive Proposal: The acp-public-rest-api.md design doc is thorough and well-structured
  2. README with Examples: Clear usage examples for SDK/CLI
  3. E2E Test Coverage: e2e/cypress/e2e/public-api.cy.ts (341 lines)

Recommendations

Prioritized Action Items

Before Merge (Blocker):

None! The initial blocker I identified was actually a false positive - the type assertions are handled correctly.

High Priority (Address in this PR or immediate follow-up):

  1. ✅ Add test for project mismatch security check (5 min)
  2. ⚠️ Add logging to transformSession for malformed responses (10 min)
  3. ⚠️ Consider per-endpoint timeout configuration for long-running operations (15 min)

Medium Priority (Follow-up PR):

  1. Add rate limiting middleware
  2. Improve health check to validate backend connectivity
  3. Add Prometheus metrics endpoint

Low Priority (Nice-to-have):

  1. Reduce logging verbosity for JWT extraction
  2. Improve normalizePhase default case handling
  3. Clean up Dockerfile go.sum* wildcard

Security Review Summary

Token Handling: Excellent - tokens are never logged, always forwarded, never stored
Input Validation: All user inputs validated with K8s regex
Authorization: Correctly delegated to backend (no RBAC bypass risk)
Error Handling: Generic messages to users, detailed logs internally
Container Security: Follows all best practices (non-root, read-only, no caps)
⚠️ Rate Limiting: Missing (recommend adding)


Final Verdict

Status:APPROVE (with minor recommendations)

This is a well-architected, secure, and production-ready implementation. The code follows project standards, has excellent test coverage, and demonstrates thoughtful security design. The separation of concerns between the public-api gateway and the backend is clean and maintainable.

The only critical recommendation is adding a test for the project mismatch security check. Everything else can be addressed in follow-up PRs.

Excellent work! 🎉


Reviewed with:

  • ✅ CLAUDE.md standards
  • ✅ Backend development context
  • ✅ Security standards
  • ✅ Error handling patterns
  • ✅ K8s client usage patterns
  • ✅ Go best practices

Generated by Claude Code Review (Sonnet 4.5)


🔍 View AI decision process (logs available for 90 days)

📋 View memory system files loaded (click to expand)

What Amber Loaded for Code Review

Amber automatically loaded these repository standards from the memory system:

  1. CLAUDE.md - Master project instructions, development standards
  2. backend-development.md - Go backend, K8s integration patterns
  3. frontend-development.md - NextJS, Shadcn UI, React Query patterns
  4. security-standards.md - Auth, RBAC, token handling
  5. k8s-client-usage.md - User token vs service account patterns
  6. error-handling.md - Consistent error patterns
  7. react-query-usage.md - Data fetching patterns

Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.

Introduce a new Public API service that acts as a single entry point
for all clients (Browser, SDK, MCP). This thin shim layer provides:

- Versioned REST endpoints (/v1/sessions)
- Simplified JSON responses (abstracts K8s CRD complexity)
- Token validation and project extraction from headers or JWT
- Proxy to internal Go backend

Components added:
- components/public-api/ - Go service with Gin framework
- Kubernetes manifests for deployment, service, and route
- Makefile build target (build-public-api)
- Design proposal document

Closes ambient-code#557

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Emilien Macchi <emacchi@redhat.com>
@github-actions
Copy link
Contributor

github-actions bot commented Jan 29, 2026

Claude Code Review

Summary

This PR introduces a new Public API Gateway service that provides a simplified, versioned REST API for the Ambient Code Platform. The implementation is well-architected as a thin proxy layer that abstracts K8s CRD complexity while maintaining security through token forwarding.

Overall Assessment: Strong implementation with excellent security patterns, comprehensive test coverage, and good alignment with project standards. A few minor improvements recommended.

Issues by Severity

🚫 Blocker Issues

None. The PR is ready to merge from a functional and security perspective.

🔴 Critical Issues

None identified.

🟡 Major Issues

1. Error Handling: Generic Internal Error Messages

// handlers/sessions.go:34-36
if err := io.ReadAll(resp.Body); err != nil {
    log.Printf("Failed to read backend response: %v", err)
    c.JSON(http.StatusInternalServerError, gin.H{"error": "Internal server error"})
    return
}

Issue: Multiple locations use generic "Internal server error" messages that don't help users understand what went wrong.

Pattern violation: Error handling pattern recommends more specific user-facing messages while keeping detailed logs internal.

Recommendation: Provide more context in user-facing errors:

  • "Failed to retrieve sessions" (line 35)
  • "Failed to retrieve session details" (line 91)
  • "Failed to create session" (line 147)

Locations: handlers/sessions.go:36, 50, 91, 104, 147, 164, 177, 218


2. Type Safety: Unstructured Map Access

// handlers/sessions.go:243-246
if metadata, ok := data["metadata"].(map[string]interface{}); ok {
    if name, ok := metadata["name"].(string); ok {
        session.ID = name
    }
}

Issue: While the code does check types with ok patterns, it's accessing unstructured data directly rather than using typed structs.

Pattern alignment: This is acceptable for a proxy service transforming backend responses, but consider defining backend response types for better maintainability.

Recommendation: Low priority - current implementation is safe, but adding typed structs in types/backend.go would improve maintainability.

🔵 Minor Issues

1. Missing Context Cancellation Logging

// handlers/proxy.go:78-81
resp, err := HTTPClient.Do(req)
if err != nil {
    return nil, fmt.Errorf("backend request failed: %w", err)
}

Issue: When context is cancelled (client disconnect), the error includes the full backend URL in the wrapped error, which gets logged.

Recommendation: Add context cancellation check:

if err != nil {
    if ctx.Err() == context.Canceled {
        return nil, fmt.Errorf("request cancelled")
    }
    return nil, fmt.Errorf("backend request failed: %w", err)
}

2. Security Context Documentation

# components/manifests/base/public-api-deployment.yaml:52-58
securityContext:
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  runAsNonRoot: true
  capabilities:
    drop:
    - ALL

Positive: Excellent security context configuration! This follows security standards perfectly.

Recommendation: Add a comment explaining why readOnlyRootFilesystem is safe (Go binary doesn't need write access).


3. Test Coverage: Integration Test Assertion

// handlers/integration_test.go

Observation: Integration tests exist but weren't run in my review environment.

Recommendation: Ensure integration tests run in CI and validate against a real backend instance.


4. JWT Subject Extraction Security Warning

// handlers/middleware.go:101-124
// SECURITY WARNING - READ BEFORE MODIFYING

Positive: Excellent security documentation! This is a great example of explaining WHY unvalidated JWT parsing is safe in this specific context.

Recommendation: None - this is exemplary documentation.

Positive Highlights

✅ Excellent Security Patterns

  1. Token Redaction in Logs (middleware.go:188-217)

    • Implements the same redaction pattern as backend (server/server.go:22-34)
    • Redacts token, access_token, and api_key parameters
    • Follows security standards perfectly
  2. Project Mismatch Detection (middleware.go:70-75)

    • Prevents routing attacks where header and token projects don't match
    • Logs security events for auditing
    • Returns empty string to force auth failure (fail-safe)
  3. Input Validation (validation.go)

    • Validates K8s naming conventions to prevent path injection
    • 63-character limit enforcement
    • Regex pattern matches K8s DNS label requirements
  4. No RBAC Permissions (public-api-deployment.yaml:84-87)

    • ServiceAccount has NO RoleBinding (intentional)
    • Clear documentation explaining this is a proxy-only service
    • Follows the "Exception: Public API Gateway Service" pattern from CLAUDE.md

✅ Strong Architecture

  1. Thin Shim Pattern

    • Service is ~500 lines as designed
    • Single responsibility: token validation + proxying
    • No K8s client creation (correct for gateway pattern)
  2. DTO Transformation (handlers/sessions.go:238-291)

    • Abstracts K8s CRD complexity
    • Normalizes phase names ("Pending"/"Creating" → "pending")
    • Clean separation of concerns
  3. Comprehensive Error Handling

    • Logs internal errors with context
    • Returns generic user-facing messages (doesn't leak internals)
    • Proper HTTP status code usage

✅ Excellent Test Coverage

  1. Unit Tests

    • Token extraction: 5 test cases (middleware_test.go:15-68)
    • Project extraction: 7 test cases (middleware_test.go:70-133)
    • Security mismatch validation included
    • Session transformation: Multiple test cases (sessions_test.go)
  2. E2E Tests (e2e/cypress/e2e/public-api.cy.ts)

    • Graceful degradation if public-api unavailable
    • Tests all CRUD operations
    • Real integration with backend
  3. Integration Tests (handlers/integration_test.go)

    • Tests full request flow with mock backend

✅ Documentation & Standards

  1. README.md - Clear usage examples, environment variables documented
  2. Design Proposal (docs/proposals/acp-public-rest-api.md) - Comprehensive architecture documentation
  3. CLAUDE.md Updated - Properly documented the exception pattern for public-api
  4. Comments - Excellent inline documentation, especially security warnings

✅ DevOps & Deployment

  1. Resource Limits - Sensible defaults (50m-200m CPU, 64Mi-256Mi memory)
  2. Health Probes - Both liveness and readiness configured
  3. PodDisruptionBudget - Ensures availability during updates
  4. Multi-arch Build - CI workflow builds for amd64 and arm64

Recommendations

Priority 1 (Before Merge - Nice to Have)

  1. Improve user-facing error messages - Replace generic "Internal server error" with specific messages like "Failed to retrieve sessions"

  2. Add context cancellation check - Distinguish client disconnects from backend failures in proxy.go

Priority 2 (Follow-up PR)

  1. Add typed backend response structs - Define types for backend responses in types/backend.go for better type safety

  2. Add metrics/observability - Consider adding Prometheus metrics for:

    • Request count by endpoint
    • Backend request latency
    • Error rates by type
  3. Rate limiting - Consider adding rate limiting middleware to prevent abuse

Priority 3 (Future Enhancement)

  1. API versioning strategy - Document how v2 will be introduced when needed
  2. WebSocket support - Consider whether real-time session updates need WebSocket proxying
  3. Response caching - Consider caching GET /v1/sessions responses with short TTL

Architecture Alignment Check

Backend Development Standards

  • Does NOT use GetK8sClientsForRequest (correct - this is the exception)
  • Properly documented as gateway pattern in main.go:1-17
  • Token forwarding follows security standards

Security Standards

  • Token redaction in logs ✓
  • Input validation (K8s names) ✓
  • No token logging ✓
  • Fail-safe security (project mismatch → auth failure) ✓

Error Handling Patterns

  • Logs errors with context ✓
  • Generic user messages ✓
  • Appropriate HTTP status codes ✓
  • Minor: Could use more specific error messages

Go Standards

  • Proper error wrapping with %w ✓
  • Context-aware requests ✓
  • Table-driven tests ✓
  • No panic in production code ✓

Final Verdict

APPROVED - This is a well-implemented feature that follows project standards and security best practices. The identified issues are all minor and non-blocking. The code is production-ready.

Key Strengths:

  • Excellent security patterns (token redaction, project mismatch detection, input validation)
  • Clean architecture (thin shim, proper separation of concerns)
  • Comprehensive test coverage (unit, integration, e2e)
  • Great documentation (inline comments, README, design proposal)

Recommendation: Merge with optional follow-up for improved error messages.


Review conducted by Claude Code with full memory system context (CLAUDE.md, security-standards.md, backend-development.md, k8s-client-usage.md, error-handling.md)


🔍 View AI decision process (logs available for 90 days)

📋 View memory system files loaded (click to expand)

What Amber Loaded for Code Review

Amber automatically loaded these repository standards from the memory system:

  1. CLAUDE.md - Master project instructions, development standards
  2. backend-development.md - Go backend, K8s integration patterns
  3. frontend-development.md - NextJS, Shadcn UI, React Query patterns
  4. security-standards.md - Auth, RBAC, token handling
  5. k8s-client-usage.md - User token vs service account patterns
  6. error-handling.md - Consistent error patterns
  7. react-query-usage.md - Data fetching patterns

Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.

@bobbravo2 bobbravo2 added this to the v0.0.20 milestone Jan 30, 2026
@jeremyeder
Copy link
Collaborator

@EmilienM hey, I ran this through another reviewer and it had different feedback. It was calling out CORS as being necessary to import from gin. Does this need to have an HPA with it?

Please address the critical feedback below and in the previous comments.

Below are all useful additions and I would prefer they be implemented either now, or as a follow-up.

Lastly, I want to include this public_api in a release urgently.


Critical Issues (MUST FIX before merge):

  1. ❌ Add CORS middleware (will break browser clients)
  2. ❌ Add rate limiting (security vulnerability)
  3. ❌ Add HTTP client timeout (availability risk)
  4. ⚠️ Verify token redaction in logs (security risk if missing)

Recommended Improvements (SHOULD FIX):

  1. Improve error forwarding (include backend response details)
  2. Validate required env vars on startup
  3. Increase memory limits (64Mi → 128Mi)
  4. Add metrics endpoint (/metrics)

Future Enhancements (NICE TO HAVE):

  1. Response caching (Redis) - i dont know how necessary this is
  2. OpenTelemetry tracing - yes, and if you wanted to add this now it would be great
  3. Structured logging (JSON format) - maybe pull this in now
  4. Circuit breaker pattern for backend calls - unsure

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feature: ACP Rest API

3 participants