Skip to content

FEAT: Inline Container App secret + KV lockdown for GUI deployments#1721

Open
varunj-msft wants to merge 1 commit into
microsoft:mainfrom
varunj-msft:varunj-msft/GUI-Deployment-Inline-Secret
Open

FEAT: Inline Container App secret + KV lockdown for GUI deployments#1721
varunj-msft wants to merge 1 commit into
microsoft:mainfrom
varunj-msft:varunj-msft/GUI-Deployment-Inline-Secret

Conversation

@varunj-msft
Copy link
Copy Markdown
Contributor

Description

Migrates the CoPyRIT GUI from a Key Vault secretRef runtime model to an inline Container App secret, which lets us fully lock down the per-instance Key Vault for SFI/NS221 compliance without breaking the running app.

Why this approach:

The S360 alert on copyrit-demo-kv requires publicNetworkAccess=Disabled. I tested the fix (pna=Disabled + defaultAction=Deny + bypass=AzureServices) on a throwaway and it broke the deploy at container startup with unable to fetch secret 'env-file' using Managed identity. Root cause: Azure Container Apps is not on Key Vault's "trusted services" allowlist, so KV-backed secret references fail at runtime against a locked-down vault. App Service is on that list; ACA isn't.

The fix is to remove the runtime KV dependency entirely. The .env content is now passed inline to the Container App via the Bicep envFileContents @secure() parameter and stored in ACA's encrypted secrets store. The KV is still created, populated with the .env as backup/audit, and locked down, but never read at runtime.

Validated end-to-end on a throwaway (kvtest5 in centralus): deploy succeeded, KV ended up locked down, container app came up, login worked, chat worked.

Files touched (9):

  • infra/main.bicep + infra/main.json: replaced envSecretName (KV ref) with @secure() envFileContents (inline value).
  • infra/deploy_instance.py: applies SFI lockdown to KV after backup secret upload; removed the now-unneeded KV Secrets User MI grant; passes envFileContents via temp parameters JSON file (avoids ps exposure).
  • gui-deploy.yml: same temp-parameters-file pattern; strict validation that fails loudly if enablePrivateEndpoint, enableOtel, or envFileContents ADO variables are unresolved.
  • infra/parameters.example.json + infra/parameters.demo.json: switched from envSecretName to envFileContents; added warnings about the @file silent-fallback footgun.
  • infra/README.md + infra/DEPLOY_NEW_INSTANCE.md: rewrote secret-rotation guidance (use az containerapp secret set --secrets env-file=@./updated.env with pre-flight test -f check; explicitly anti-patterns the unsafe alternatives); both bash and PowerShell examples.
  • docker/start.sh: 1-line comment update.

What this does NOT do:

  • Does NOT touch the live copyrit-demo deployment. It still runs the OLD pattern (KV reference). The S360 alert on copyrit-demo-kv is not cleared by this PR.
  • Does NOT consolidate the standalone-script and ADO-pipeline deployment paths (separate scope).

Follow-up after merge:

  1. Add envFileContents SECURE variable to ADO library variable groups copyrit-gui-test and copyrit-gui-prod. Without it, the next pipeline run fails fast with a clear error (the new strict validation catches it).
  2. Redeploy the live copyrit-demo with the new inline-secret model (this implies a brief revision cutover for active users).
  3. After the demo is on the new model, apply the lockdown to copyrit-demo-kv to clear the S360 alert.

Tests and Documentation

No code-level unit tests added - the change is to deployment infrastructure (Bicep, ADO pipeline, deploy script). Validation done:

  • Live throwaway test on kvtest5 (centralus): full deploy succeeded; container started; chat worked end-to-end against the locked-down KV pattern.
  • az deployment group validate against real copyrit-demo resource IDs: succeeded.
  • Strict-validation unit checks on the pipeline's parameter-generation Python (8 scenarios per stage): all pass; both stages byte-identical.
  • pre-commit run on all 9 modified files: clean.
  • az bicep build: zero new warnings.
  • infra/main.json matches compiled infra/main.bicep.

JupyText not applicable - no doc/ changes.

@varunj-msft
Copy link
Copy Markdown
Contributor Author

varunj-msft commented May 12, 2026

@romanlutz tagging for review!

cc @behnam-o

Comment thread gui-deploy.yml
"keyVaultResourceId": {"value": os.environ["KEY_VAULT_RESOURCE_ID"]},
"acrResourceId": {"value": os.environ["ACR_RESOURCE_ID"]},
"enablePrivateEndpoint": {"value": parse_bool("ENABLE_PRIVATE_ENDPOINT")},
"enableOtel": {"value": parse_bool("ENABLE_OTEL")},
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we doing otel? We should at some point, just didn't realize it's part of this PR

Comment thread gui-deploy.yml
PARAMS_FILE="$(mktemp --suffix=.parameters.json)"
trap 'rm -f "$PARAMS_FILE"' EXIT

python3 - <<'PY' > "$PARAMS_FILE"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be an inline script? Could it just be in the repo somewhere (scripts dir?)

--name env-global \
--file ./updated.env
# Pre-flight: fail fast if the file is missing
test -f ./updated.env || { echo "ERROR: ./updated.env not found"; exit 1; }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is updated.env?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants