OIDC/Keycloak authentication with RBAC and scope-based permissions #930

@mrunalp

Problem Statement

OpenShell currently authenticates CLI-to-gateway communication via mTLS client certificates or Cloudflare Access JWTs. It doesn't support standard enterprise identity providers (Keycloak, Entra ID, Okta) or fine-grained, per-method access control. As a result, organizations currently cannot:

  • Authenticate users against their existing identity provider
  • Restrict automation tokens to specific operations (e.g., sandbox-only CI bots)
  • Enforce admin vs. user role separation on provider and config management

Proposed Design

Add OAuth2/OIDC as a third authentication mode alongside mTLS and Cloudflare Access, with two-layer authorization: role-based access control (RBAC) and opt-in scope-based fine-grained permissions.

Authentication (server-side):

  • JWT validation against a configurable OIDC issuer's JWKS endpoint (--oidc-issuer)
  • JWKS key caching with TTL and key rotation handling
  • Method classification: unauthenticated (health/reflection), sandbox-secret (supervisor RPCs), dual-auth (UpdateConfig, GetSandboxConfig), Bearer-authenticated (everything else)
  • Anti-spoofing: internal auth-source headers stripped from inbound requests
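The method classification and header stripping above could be sketched roughly as follows. This is an illustration only, not the actual middleware: UpdateConfig, GetSandboxConfig, and GetInferenceBundle come from this proposal, while the health/reflection method names and the `x-openshell-auth-source` header name are placeholders chosen for the example.

```rust
/// How a gRPC method is authenticated (mirrors the four classes listed above).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum AuthClass {
    Unauthenticated,
    SandboxSecret,
    DualAuth,
    Bearer,
}

/// Classify a method by its bare name.
pub fn classify(method: &str) -> AuthClass {
    match method {
        // Placeholder names standing in for health and reflection endpoints.
        "Check" | "ServerReflectionInfo" => AuthClass::Unauthenticated,
        // Supervisor RPCs authenticate with the sandbox secret, not OIDC.
        "GetInferenceBundle" => AuthClass::SandboxSecret,
        // Callable by either a bearer-token user or a sandbox supervisor.
        "UpdateConfig" | "GetSandboxConfig" => AuthClass::DualAuth,
        // Default: anything not listed needs a valid bearer token.
        _ => AuthClass::Bearer,
    }
}

/// Hypothetical name for an internal header set by the auth layer itself.
pub const INTERNAL_AUTH_SOURCE: &str = "x-openshell-auth-source";

/// Drop internal auth-source headers from an inbound request so a client
/// cannot claim to be already authenticated.
pub fn strip_internal_headers(headers: &mut Vec<(String, String)>) {
    headers.retain(|(name, _)| !name.eq_ignore_ascii_case(INTERNAL_AUTH_SOURCE));
}
```

Defaulting unknown methods to Bearer means a newly added RPC is protected unless it is explicitly classified otherwise.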

Authorization — RBAC:

  • Configurable admin/user role names extracted from a configurable JWT claim path (--oidc-roles-claim)
  • Admin methods: provider CRUD, config mutations, draft policy approvals, inference write
  • User methods: sandbox lifecycle, provider read, policy/settings read
  • Auth-only mode: empty role names skip RBAC (for providers like GitHub Actions that don't emit roles)
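A minimal sketch of the RBAC layer, under stated assumptions: the `Claim` enum is a stand-in for decoded JSON claims (a real server would use a JSON library), the role names in the usage note are hypothetical, and "auth-only mode" is interpreted here as both role names being empty. It also assumes admins may call user-level methods, which the list above does not state explicitly.

```rust
use std::collections::HashMap;

/// Minimal stand-in for decoded JWT claims.
#[derive(Debug, Clone)]
pub enum Claim {
    Str(String),
    List(Vec<Claim>),
    Map(HashMap<String, Claim>),
}

/// Walk a dot-separated claim path such as "realm_access.roles" and collect
/// the string entries found there. Missing or mistyped claims yield no roles.
pub fn roles_at(claims: &Claim, path: &str) -> Vec<String> {
    let mut node = claims;
    for key in path.split('.') {
        match node {
            Claim::Map(m) => match m.get(key) {
                Some(next) => node = next,
                None => return Vec::new(),
            },
            _ => return Vec::new(),
        }
    }
    match node {
        Claim::List(items) => items
            .iter()
            .filter_map(|c| match c {
                Claim::Str(s) => Some(s.clone()),
                _ => None,
            })
            .collect(),
        // A single string claim is treated as one role.
        Claim::Str(s) => vec![s.clone()],
        Claim::Map(_) => Vec::new(),
    }
}

#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Required {
    User,
    Admin,
}

pub struct RbacConfig {
    pub admin_role: String,
    pub user_role: String,
}

pub fn role_allows(cfg: &RbacConfig, roles: &[String], required: Required) -> bool {
    // Auth-only mode: with no role names configured, any authenticated caller passes.
    if cfg.admin_role.is_empty() && cfg.user_role.is_empty() {
        return true;
    }
    let is_admin = !cfg.admin_role.is_empty() && roles.iter().any(|r| r == &cfg.admin_role);
    let is_user = !cfg.user_role.is_empty() && roles.iter().any(|r| r == &cfg.user_role);
    match required {
        Required::Admin => is_admin,
        // Assumption: admins may also call user-level methods.
        Required::User => is_user || is_admin,
    }
}
```

With Keycloak, realm roles appear under `realm_access.roles`, so a token carrying a hypothetical `openshell-user` role would pass `Required::User` but fail `Required::Admin`.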

Authorization — Scopes (opt-in):

  • When --oidc-scopes-claim is set, the server enforces per-method scopes on top of roles
  • 8 scopes (sandbox:read, sandbox:write, provider:read, provider:write, config:read, config:write, inference:read, inference:write), plus the openshell:all wildcard
  • Exhaustive scope map — unlisted methods require openshell:all
  • Scopes cannot escalate past role gates
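The scope rules above can be illustrated with a small sketch. The scope strings are the ones listed in this proposal; the method names in the map are hypothetical examples, and the real map would be exhaustive.

```rust
pub const WILDCARD: &str = "openshell:all";

/// Scope required by a method. Only a few hypothetical method names are
/// mapped here; anything unlisted falls back to the wildcard.
pub fn required_scope(method: &str) -> &'static str {
    match method {
        "GetSandbox" | "ListSandboxes" => "sandbox:read",
        "CreateSandbox" | "DeleteSandbox" => "sandbox:write",
        "GetProvider" => "provider:read",
        "CreateProvider" => "provider:write",
        _ => WILDCARD,
    }
}

/// A caller passes the scope layer if it holds the wildcard or the exact
/// scope the method requires.
pub fn scope_allows(method: &str, granted: &[&str]) -> bool {
    let needed = required_scope(method);
    granted.iter().any(|s| *s == WILDCARD || *s == needed)
}

/// Both layers must pass: scopes narrow what a role can do, never widen it.
/// `scopes_enforced` corresponds to --oidc-scopes-claim being set.
pub fn authorize(role_ok: bool, scopes_enforced: bool, method: &str, granted: &[&str]) -> bool {
    role_ok && (!scopes_enforced || scope_allows(method, granted))
}
```

Because the role check is ANDed with the scope check, a token with openshell:all but only the user role is still rejected on admin methods.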

CLI:

  • Browser-based Authorization Code + PKCE flow (gateway login)
  • Client Credentials flow for CI/automation (OPENSHELL_OIDC_CLIENT_SECRET)
  • Token storage with automatic refresh, logout command
  • --oidc-scopes flag to request specific scopes during login
  • OIDC bearer token injected over mTLS transport for local K3s gateways

Sandbox supervisor auth:

  • Sandbox-to-server RPCs authenticated via x-sandbox-secret header (not OIDC)
  • Sandbox-secret callers restricted to sandbox-scoped policy sync on UpdateConfig
  • GetInferenceBundle classified as sandbox-secret for credential isolation
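As a rough illustration of the sandbox-secret path, the sketch below checks the x-sandbox-secret header and applies the UpdateConfig restriction. The `Caller` and `UpdateTarget` types are hypothetical and do not reflect the actual request shape; they only show the shape of the rule.

```rust
/// Compare two byte strings without stopping at the first mismatching byte.
/// Note that a length mismatch still returns early.
pub fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}

/// True if the request carries an x-sandbox-secret header matching `expected`.
pub fn sandbox_secret_matches(headers: &[(&str, &str)], expected: &str) -> bool {
    headers
        .iter()
        .find(|(name, _)| name.eq_ignore_ascii_case("x-sandbox-secret"))
        .map_or(false, |(_, value)| {
            constant_time_eq(value.as_bytes(), expected.as_bytes())
        })
}

/// Who is calling a dual-auth method (hypothetical type).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Caller {
    Bearer,
    Sandbox,
}

/// What an UpdateConfig request is trying to change (hypothetical type).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum UpdateTarget {
    SandboxPolicySync,
    Other,
}

pub fn update_config_allowed(caller: Caller, target: UpdateTarget) -> bool {
    match caller {
        // Bearer callers are still subject to role and scope checks elsewhere.
        Caller::Bearer => true,
        // Sandbox-secret callers may only sync sandbox-scoped policy.
        Caller::Sandbox => target == UpdateTarget::SandboxPolicySync,
    }
}
```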

Deployment:

  • Full plumbing through DeployOptions, Docker env vars, Helm values/templates, HelmChart manifest, cluster-entrypoint.sh, and bootstrap scripts
  • OPENSHELL_OIDC_* env vars for all configuration
  • Gateway metadata preserves OIDC config across stop/start cycles

Keycloak dev environment:

  • mise run keycloak starts a pre-configured Keycloak with test realm, users (admin@test, user@test), roles, PKCE client, CI client, and OpenShell client scopes
  • Built-in OIDC scopes (openid, profile, email, roles) configured as defaults so role and profile claims are always present

Provider compatibility:

  • Tested with Keycloak. The configurable roles claim path and role names are intended to support Entra ID, Okta, and GitHub Actions as well.
  • Scope enforcement supports space-delimited strings (Keycloak, Entra) and JSON arrays (Okta)
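Handling both claim shapes might look like the sketch below. `ScopeClaim` is a hypothetical type standing in for the decoded JSON value; a real implementation would branch on the JSON type of the claim.

```rust
/// The two wire shapes mentioned above for a scopes claim.
pub enum ScopeClaim {
    /// e.g. "sandbox:read sandbox:write"
    Delimited(String),
    /// e.g. ["sandbox:read", "sandbox:write"]
    Array(Vec<String>),
}

/// Normalize either shape into a sorted, de-duplicated list of scope strings.
pub fn normalize_scopes(claim: &ScopeClaim) -> Vec<String> {
    let mut out: Vec<String> = match claim {
        ScopeClaim::Delimited(s) => s.split_whitespace().map(str::to_string).collect(),
        ScopeClaim::Array(items) => items
            .iter()
            .map(|s| s.trim().to_string())
            .filter(|s| !s.is_empty())
            .collect(),
    };
    out.sort();
    out.dedup();
    out
}
```

Normalizing up front lets the per-method scope check work on a single representation regardless of which provider issued the token.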

Alternatives Considered

mTLS-only with cert-based roles: Would require a custom CA that encodes roles in certificate OUs. Complex PKI management, no standard IdP integration, no token expiry or refresh.

API keys / static tokens: Simple but no identity, no role hierarchy, no integration with enterprise identity providers, no audit trail of who did what.

Full openidconnect crate: Evaluated the openidconnect Rust crate for both server and CLI. Recommend the lighter oauth2 crate for CLI flows (PKCE, token exchange, refresh). Server-side JWT validation stays with jsonwebtoken because the openidconnect crate's IdTokenVerifier is designed for ID tokens, not access tokens, and the JWKS cache with thundering-herd protection needs to remain custom.
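To make the custom JWKS cache concrete, here is a minimal std-only sketch of TTL expiry, rotation handling, and thundering-herd protection. It is not the proposed implementation: the `fetch` closure stands in for the HTTP request to the issuer's JWKS endpoint, keys are plain strings, and `min_refetch` is an assumed knob that rate-limits refetches triggered by unknown key IDs.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::time::{Duration, Instant};

struct State {
    keys: HashMap<String, String>,
    fetched_at: Option<Instant>,
}

pub struct JwksCache<F: Fn() -> HashMap<String, String>> {
    fetch: F,
    ttl: Duration,
    min_refetch: Duration,
    state: Mutex<State>,
}

impl<F: Fn() -> HashMap<String, String>> JwksCache<F> {
    pub fn new(fetch: F, ttl: Duration, min_refetch: Duration) -> Self {
        let state = Mutex::new(State { keys: HashMap::new(), fetched_at: None });
        JwksCache { fetch, ttl, min_refetch, state }
    }

    /// Look up key material by key ID, refetching when the cache is stale or
    /// the kid is unknown. `now` is passed in so the logic is easy to test.
    pub fn key_for(&self, kid: &str, now: Instant) -> Option<String> {
        // Holding the lock across the fetch means concurrent callers wait for
        // one refresh instead of each hitting the issuer.
        let mut st = self.state.lock().unwrap();
        let age = st.fetched_at.map(|t| now.saturating_duration_since(t));
        let expired = age.map_or(true, |a| a >= self.ttl);
        // An unknown kid may signal key rotation; refetch, but no more often
        // than `min_refetch` so bogus kids cannot force a fetch per request.
        let rotated = !st.keys.contains_key(kid) && age.map_or(true, |a| a >= self.min_refetch);
        if expired || rotated {
            st.keys = (self.fetch)();
            st.fetched_at = Some(now);
        }
        st.keys.get(kid).cloned()
    }
}
```

In an async server, a blocking `std::sync::Mutex` held across a network call would not be appropriate; an async-aware lock or a single-flight primitive would play the same role.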

Agent Investigation

  • Explored the existing auth boundary in multiplex.rs — the AuthGrpcRouter tower middleware was the natural insertion point for JWT validation before request dispatch
  • Reviewed the Cloudflare Access auth flow in auth.rs and tls.rs to understand the existing EdgeAuthInterceptor pattern, which was extended for OIDC bearer token injection
  • Reviewed the gateway metadata and token storage patterns in openshell-bootstrap (edge_token.rs, metadata.rs) to model OIDC token persistence consistently
  • Reviewed the Keycloak realm export format to build the pre-configured dev realm with proper client scope assignments and built-in OIDC scope preservation
  • Reviewed the oauth2 crate API (v5) for type-state safety and RequestBody vs BasicAuth client authentication semantics

Checklist

  • I've reviewed existing issues and the architecture docs
  • This is a design proposal, not a "please build this" request
