fix(google-mcp): Fix Google Workspace MCP authentication and connection [RHOAIENG-46519] #562
Fixes RHOAIENG-46519 - Google Workspace MCP shows authenticated but fails when used due to expired tokens.

Google Workspace MCP integration showed "Authenticated" in the UI but failed when agents attempted Drive operations. The status check only verified that credentials.json existed, not whether tokens were valid/unexpired.

Symptoms:
- Status showed "Authenticated" with stale/expired credentials
- Tool calls failed with placeholder email (user@example.com)
- Disconnect/reconnect temporarily fixed (until tokens expired again)

1. _check_mcp_authentication() only checked file existence, not token validity
2. USER_GOOGLE_EMAIL environment variable was never set from actual credentials
3. No detection of expired access tokens or missing refresh tokens
4. Status showed "authenticated" with invalid/expired credentials

- Replace file existence check with actual token validation
- Parse and check token_expiry timestamps
- Validate required fields (access_token, refresh_token) are non-empty
- Return three-state status: True (valid), False (invalid), None (needs refresh)
- Reject placeholder email user@example.com
- Extract user email from credentials.json on setup/refresh
- Set USER_GOOGLE_EMAIL environment variable with actual email
- Update email when credentials are refreshed
- Check MCP authentication status before each agent run
- Display user-visible warnings for auth issues
- Distinguish between "not configured", "expired", and "needs refresh"
- Update type signature: authenticated: boolean | null
- Show "Needs Refresh" badge for null state (expired with refresh token)
- Amber warning icon for uncertain auth states
- Unit tests (test_google_auth.py): All token validation scenarios
- E2E tests (test_google_drive_e2e.py): Complete authentication flow
- Backwards compatible with Jira/Atlassian MCP integration

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
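The three-state status described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from main.py: the function names `parse_expiry` and `check_google_auth`, the message strings, and the assumption that `token_expiry` is an ISO 8601 string are all hypothetical. Naive timestamps are treated as UTC so the comparison never mixes offset-naive and offset-aware values.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional, Tuple

PLACEHOLDER_EMAIL = "user@example.com"


def parse_expiry(raw: Any) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp (assumed format); treat naive values as UTC."""
    try:
        parsed = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    except (ValueError, AttributeError):
        return None
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def check_google_auth(
    creds: Dict[str, Any], email: str, now: Optional[datetime] = None
) -> Tuple[Optional[bool], str]:
    """Return True (valid), False (invalid), or None (needs refresh), plus a message."""
    now = now or datetime.now(timezone.utc)
    if not creds:
        return False, "not configured"
    if not email or email == PLACEHOLDER_EMAIL:
        return False, "placeholder or missing email"
    # Both tokens are required to be non-empty.
    if not creds.get("access_token") or not creds.get("refresh_token"):
        return False, "missing access or refresh token"
    expiry = parse_expiry(creds.get("token_expiry", ""))
    if expiry is None:
        return None, "token expiry format invalid"
    if expiry > now:
        return True, "authenticated"
    # Expired, but a refresh token is present, so a refresh can recover.
    return None, "expired, needs refresh"
```

A caller would map `None` to the "Needs Refresh" badge and `False` to a hard failure; only `True` should be shown as "Authenticated".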
RHOAIENG-46519

This commit fixes multiple issues preventing the Google Workspace MCP server from connecting and authenticating properly in session pods.

## Root Causes Identified and Fixed

### 1. Workspace Directory Permissions (hydrate.sh)
MCP servers spawn as subprocesses and need write access to their working directory. Changed permissions from 755 to 777 for /workspace/artifacts, /workspace/file-uploads, and /workspace/repos. The chown to uid 1001 is attempted first but fails on SELinux-restricted hosts, so 777 is the fallback.

### 2. Credentials Format Mismatch (sessions.go)
The workspace-mcp server expects a flat JSON format with specific fields: token, refresh_token, token_uri, client_id, client_secret, scopes, expiry. The operator was creating a nested format. Fixed to produce correct flat format.

### 3. Read-Only Credentials Mount (sessions.go)
K8s secrets are mounted read-only, but workspace-mcp needs to write updated tokens during refresh. Added postStart lifecycle hook to copy credentials from read-only /app/.google_workspace_mcp/credentials/ to writable /workspace/.google_workspace_mcp/credentials/.

### 4. Invalid MCP Config Field (.mcp.json)
Removed invalid "type": "stdio" field from google-workspace config. Claude Code SDK doesn't recognize this field and was failing to start the MCP server.

### 5. Missing Python Imports (main.py)
Added missing `from pathlib import Path` and `from datetime import datetime, timezone` imports that caused NameError crashes.

### 6. Timezone-Naive Datetime Comparison (main.py)
Fixed _parse_token_expiry() to always return timezone-aware datetime. Was causing "can't compare offset-naive and offset-aware datetimes" error when checking token expiry.

### 7. Environment Variable Overwrite (adapter.py)
Fixed _set_google_user_email() to not overwrite the operator-set USER_GOOGLE_EMAIL with incorrect values from the new flat credentials format.

### 8. OAuth Callback 404 (frontend route.ts)
Added /oauth2callback route to frontend to proxy OAuth callbacks to backend. Without this, Google's redirect would 404.

### 9. Secret Key Sanitization (oauth.go, sessions.go)
Added sanitizeSecretKey() to handle userIDs containing invalid K8s secret key characters (@ : /).

## Files Changed
- components/runners/state-sync/hydrate.sh - 777 permissions
- components/operator/internal/handlers/sessions.go - credentials format, postStart hook, USER_GOOGLE_EMAIL env var
- components/runners/claude-code-runner/.mcp.json - remove invalid field
- components/runners/claude-code-runner/main.py - imports, timezone fix
- components/runners/claude-code-runner/adapter.py - preserve env var
- components/backend/handlers/oauth.go - secret key sanitization
- components/frontend/src/app/oauth2callback/route.ts - OAuth proxy
- components/runners/claude-code-runner/README.md - documentation
- docs/integrations/google-workspace.md - documentation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
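To illustrate root cause 9, here is a rough Python rendering of what a key sanitizer could look like. The real `sanitizeSecretKey()` lives in Go and is not shown in this PR text, so the function name `sanitize_secret_key` and the choice of `-` as the replacement character are assumptions. The one firm constraint is Kubernetes' own rule that Secret data keys may contain only alphanumeric characters, `-`, `_`, and `.`.

```python
import re

# Kubernetes Secret data keys may contain only alphanumerics, '-', '_' and '.'.
_INVALID_KEY_CHARS = re.compile(r"[^A-Za-z0-9._-]")


def sanitize_secret_key(user_id: str, replacement: str = "-") -> str:
    """Replace characters that are invalid in a Secret key (e.g. '@', ':', '/').

    The replacement character is an assumption made for this sketch.
    """
    return _INVALID_KEY_CHARS.sub(replacement, user_id)


print(sanitize_secret_key("alice@example.com"))
print(sanitize_secret_key("system:serviceaccount:ns/name"))
```

One design caveat with any lossy mapping like this: distinct userIDs such as `a@b` and `a:b` would collapse to the same key, so a real implementation may need an encoding or hash suffix if collisions matter.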
This commit addresses findings from the code review of the Google
Workspace MCP authentication fix.
## Security Fix
### XSS Vulnerability in OAuth Callback Route
- Added `escapeHtml()` function to sanitize error messages before HTML
interpolation in oauth2callback/route.ts
- Prevents reflected XSS attacks where malicious content in backend
error responses could be executed in the user's browser
- Escapes &, <, >, ", and ' characters
## Code Quality
### Remove DEBUG Logging Statements
- Removed two `logging.info("DEBUG: ...")` statements from adapter.py
- These were left in production code and added unnecessary log noise
### Fix Test Assertion
- Updated test_google_auth.py to expect "not configured" instead of
"empty" for empty credentials file test
- Matches actual implementation behavior where empty files are treated
as not configured
## Files Changed
- components/frontend/src/app/oauth2callback/route.ts - XSS fix
- components/runners/claude-code-runner/adapter.py - DEBUG removal
- components/runners/claude-code-runner/tests/test_google_auth.py - assertion fix
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
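The escaping fix above is implemented in TypeScript in route.ts; the same idea can be sketched in Python with the standard library's `html.escape`, which covers the same five characters (&, <, >, ", and '). The `render_oauth_error` helper and its HTML template are hypothetical and only demonstrate escaping before interpolation.

```python
from html import escape


def render_oauth_error(message: str) -> str:
    """Interpolate an untrusted error message into HTML only after escaping it."""
    safe = escape(message, quote=True)  # quote=True also escapes " and '
    return f"<html><body><h1>Authentication failed</h1><p>{safe}</p></body></html>"


page = render_oauth_error('<script>alert("x")</script>')
print(page)
```

The key point is that escaping happens at the boundary where the string enters the HTML, so a backend error body containing markup is displayed as text rather than executed.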
Resolved conflict in hydrate.sh by:
- Keeping our chown addition (sets ownership to runner user 1001)
- Accepting main's permission approach (755 for artifacts/file-uploads, 777 for repos)

This combines both approaches: chown ensures the runner can write as owner, and the permissions follow main's security rationale.
Codecov Report
✅ All modified and coverable lines are covered by tests.
Claude Code Review

Summary
This PR comprehensively fixes Google Workspace MCP authentication across 10 root causes spanning backend, operator, runner, frontend, and init container. The fix has been validated end-to-end and includes 386 lines of test coverage. The code demonstrates strong attention to security (XSS fix, sanitization) and follows established patterns. A few critical security and code quality issues must be addressed before merge.

Issues by Severity

🚫 Blocker Issues
None - All critical issues from initial review have been addressed in commit 286369c.

🔴 Critical Issues
1. Frontend Type Definition Inconsistency
2. Unused Test Fixture
3. Environment Variable Mutation in Tests

🟡 Major Issues
4. Permissions 777 Security Trade-off
5. Incomplete Python Imports in Tests
    import os
    import sys
6. Type Annotation Inconsistency
    def _read_google_credentials(workspace_path: Path, secret_path: Path) -> dict | None:
7. Missing Error Handling in Frontend Route

🔵 Minor Issues
8. Redundant Comment
    // Google MCP credentials directory - uses writable workspace location
    // Credentials are copied from read-only secret mount by postStart lifecycle hook
9. Inconsistent String Formatting
    return None, f"Google OAuth authenticated as {user_email} (token expiry format invalid)"
10. Test Organization

Positive Highlights
✅ Comprehensive Root Cause Analysis: 10 distinct issues identified and fixed systematically

Recommendations (Prioritized)
Must Fix Before Merge
Should Fix Before Merge
Nice to Have

Testing Gaps
While test coverage is excellent overall, consider:

Architecture Compliance
✅ Backend: User token auth pattern followed correctly

Final Verdict
Approval Status: ✅ Approved with Required Changes
This is high-quality work that systematically addresses a complex multi-component issue. The three critical issues above must be fixed before merge, but they are straightforward. Once addressed, this PR will be production-ready.
Estimated Time to Fix: ~30 minutes for all critical issues
Recommended Merge Strategy: Squash and merge (already has 3 clean commits)

🤖 Generated with Claude Code

What Amber Loaded for Code Review
Amber automatically loaded these repository standards from the memory system:
Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.
Testing
- Fix type annotations in main.py: use Dict[str, Any] instead of bare dict for _read_google_credentials() and _validate_google_token() return types
- Add error handling in oauth2callback route: wrap response.text() in try-catch to handle potential body read failures gracefully
- Fix McpServer type: make authenticated field optional (authenticated?) to maintain backwards compatibility with existing API responses
- Add context to operator error: include userID in "credentials missing email field" error message for debugging
- Use pytest monkeypatch in tests: properly clean up USER_GOOGLE_EMAIL environment variable after each test to prevent test pollution

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
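The test-pollution fix in the last bullet relies on pytest's `monkeypatch` fixture. To keep the illustration free of third-party imports, the sketch below swaps in the standard library's `unittest.mock.patch.dict`, which gives the same restore-on-exit behaviour for `os.environ`. The helper `set_google_user_email` is a hypothetical stand-in for the code under test, not the function from adapter.py.

```python
import os
from unittest import mock


def set_google_user_email(email: str) -> None:
    """Hypothetical stand-in for the code under test: exports the email."""
    os.environ["USER_GOOGLE_EMAIL"] = email


def run_isolated(email: str) -> str:
    """Run the code under test, then restore os.environ to its prior state."""
    # patch.dict snapshots os.environ and restores it when the block exits,
    # including removing keys that were added inside the block.
    with mock.patch.dict(os.environ, {}, clear=False):
        set_google_user_email(email)
        return os.environ["USER_GOOGLE_EMAIL"]


before = os.environ.get("USER_GOOGLE_EMAIL")
seen = run_isolated("someone@company.test")
after = os.environ.get("USER_GOOGLE_EMAIL")
```

In a pytest suite the equivalent would be `monkeypatch.setenv` or `monkeypatch.delenv` inside the test, with the fixture undoing the change at teardown.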
Claude Code Review

Summary
This PR fixes 10 root causes preventing Google Workspace MCP from authenticating. The fix is comprehensive, touching init container, operator, runner (Python), backend, and frontend. The approach is sound, but there are critical security issues that must be addressed before merge.

🚫 Blocker Issues (MUST FIX)

1. Critical Security: 777 Permissions in Production
    # hydrate.sh:81-82, 144, 148
    chmod 777 /workspace/repos
    chmod -R 777 /workspace/repos
Problem: World-writable directories (777) are a severe security vulnerability in multi-tenant environments.
Solution: Use fsGroup in PodSecurityContext to propagate group ownership automatically, eliminating the need for 777.
Impact: High - code in customer repos could be modified by other pods on the same node.

2. XSS Vulnerability Not Fully Fixed
While frontend correctly escapes errors, backend HTML responses in oauth.go are NOT escaped (lines 249, 346, 365, 372, 383).
Attack: OAuth provider returns malicious error_description with script tags
Solution: Use html.EscapeString() for ALL user-controlled strings in HTML responses
Impact: High - credential theft, session hijacking via XSS

3. Missing CSRF Validation on Cluster-Level OAuth (oauth.go:266)
Cluster-level OAuth flow skips HMAC signature verification that session-specific flow requires.
Solution: Validate HMAC signature BEFORE checking cluster flag
Impact: Critical - CSRF attack allowing unauthorized credential storage

Full Review
See complete analysis with code examples, security assessment, and prioritized recommendations at: #562 (comment)

Verdict: ⏸️ DO NOT MERGE until 3 blocker security issues are resolved.
Estimated fix time: 2-4 hours
Great work on comprehensive debugging! Once security issues are addressed, this will be a solid contribution.
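Blocker 3 asks for the HMAC check to run before the cluster flag is consulted. A minimal sketch of that ordering is shown below in Python; the actual handler is Go code in oauth.go, and the state payload format, the SHA-256 choice, and the names `sign_state`, `verify_state`, and `handle_callback` are all assumptions made for illustration.

```python
import hashlib
import hmac


def sign_state(payload: str, secret: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature over the OAuth state payload."""
    return hmac.new(secret, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_state(payload: str, signature: str, secret: bytes) -> bool:
    """Constant-time comparison of the expected and presented signatures."""
    expected = sign_state(payload, secret)
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


def handle_callback(payload: str, signature: str, secret: bytes, is_cluster: bool) -> str:
    # Verify the signature first, for every flow type.
    if not verify_state(payload, signature, secret):
        return "rejected"
    # Only after verification does the cluster flag select a code path.
    return "cluster" if is_cluster else "session"


secret = b"example-secret"
sig = sign_state("user=alice", secret)
print(handle_callback("user=alice", sig, secret, is_cluster=True))
print(handle_callback("user=alice", "forged", secret, is_cluster=True))
```

Placing the verification ahead of the branch means there is no flow, cluster-level or session-specific, in which an unsigned or tampered state can reach the credential-storage step.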
The Google credentials directory is created by the container user (1001) via mkdir -p, so it already has appropriate ownership and permissions. The MCP server runs as the same user and can read/write without chmod. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Claude Code Review

Summary
This PR addresses RHOAIENG-46519 by fixing multiple root causes preventing Google Workspace MCP server authentication. Comprehensive fix touching backend, frontend, operator, and runner. Changes follow established patterns well, with good test coverage.

Critical Issues
Major Issues
Minor Issues

Positive Highlights
✅ Excellent security awareness - XSS vulnerability found and fixed

Recommendations
Priority 1 (Before Merge): Verify lifecycle hook SecurityContext, add error logging, extract sanitizeSecretKey

Final Assessment
High-quality work with strong security awareness. Multi-root-cause analysis is thorough. Critical security issues addressed. Remaining issues are refinements.
Recommendation: Approve with minor follow-ups

🤖 Automated review by Claude Code
user_creds = creds[user_email]
# Map new flat format to expected field names
user_creds = {
    "access_token": creds.get("token", ""),
put some debug here - make sure it gets to langfuse
system_prompt_config = {
    "type": "preset",
    "preset": "claude_code",
    "append": workspace_prompt,
this needs to be validated. we just put that claude_code thing in there this week. i dont want to lose it - what does preset do?
│   └── {workflow-name}/            # Individual workflow
│       ├── artifacts/              # Output files created by Claude
│       ├── file-uploads/           # User-uploaded files
│       └── .google_workspace_mcp/  # Google OAuth credentials
how does this work if they have the same path
- **Timeout Protection**: Operations have configurable timeouts
- **User Context Validation**: User IDs and names are sanitized
- **Read-only Workflow Directories**: Workflows are read-only, outputs go to artifacts
- **OAuth Credentials Isolation**: Google credentials are stored in session-specific secrets, copied to writable storage only within the container
we need to verify this - i want to know whether other users or sessions can see other users or sessions within the same workspace.
and of course across workspaces should currently have zero leakage
- Ensures Claude has both general coding capabilities and workspace knowledge
- `ClaudeAgentOptions.system_prompt` expects `str | SystemPromptPreset | None`
- The list format was invalid and caused `'list' object has no attribute 'get'` error
- `SystemPromptPreset` uses `type="preset"`, `preset="claude_code"`, and optional `append`
hmm, did they change this...ok. we just have to validate
Merging this now bc i have a pr that builds on this and resolves some of the concerns |
…time credential fetching
- Operator no longer fetches/sets USER_GOOGLE_EMAIL (runner handles this)
- Runner sets USER_GOOGLE_EMAIL from backend API response
- Supersedes PR ambient-code#562's volume mounting with runtime API fetching
- Simpler architecture: no syncing, mounting, or postStart hooks needed
Summary
Fixes RHOAIENG-46519: Google Workspace MCP server failing to authenticate and connect in session pods.
This was a multi-root-cause issue requiring fixes across the init container, operator, runner, backend, and frontend. The fix has been validated
end-to-end in a fresh kind cluster.
Root Causes Fixed
1. Workspace Directory Permissions (hydrate.sh)
MCP servers spawn as subprocesses and need write access to their working directory. Changed permissions from 755 to 777 for `/workspace/artifacts`, `/workspace/file-uploads`, and `/workspace/repos`. The `chown` to uid 1001 is attempted first but fails on SELinux-restricted hosts.
2. Credentials Format Mismatch (sessions.go)
The workspace-mcp server expects a flat JSON format with specific fields (token, refresh_token, token_uri, client_id, client_secret, scopes, expiry). The operator was creating a nested format.
3. Read-Only Credentials Mount (sessions.go)
K8s secrets are mounted read-only, but workspace-mcp needs to write updated tokens during refresh. Added postStart lifecycle hook to copy credentials to writable location.
4. Invalid MCP Config Field (.mcp.json)
Removed invalid `"type": "stdio"` field from google-workspace config. Claude Code SDK doesn't recognize this field.
5. Missing Python Imports (main.py)
Added missing `Path` and `datetime` imports that caused NameError crashes.
6. Timezone-Naive Datetime Comparison (main.py)
Fixed `_parse_token_expiry()` to always return timezone-aware datetime.
7. Environment Variable Overwrite (adapter.py)
Fixed `_set_google_user_email()` to not overwrite operator-set `USER_GOOGLE_EMAIL`.
8. OAuth Callback 404 (route.ts)
Added `/oauth2callback` route to frontend to proxy OAuth callbacks to backend.
9. Secret Key Sanitization (oauth.go, sessions.go)
Added `sanitizeSecretKey()` to handle userIDs with invalid K8s secret key characters.
10. XSS Vulnerability (route.ts)
Added `escapeHtml()` to sanitize error messages in OAuth callback HTML responses.
Files Changed
- `hydrate.sh`
- `sessions.go`
- `.mcp.json`
- `main.py`
- `adapter.py`
- `oauth.go`
- `oauth2callback/route.ts`
- `README.md`, `google-workspace.md`
Test Plan
Security Notes
🤖 Generated with Claude Code