Kubenew/SovereignStack


⬡ SOVEREIGN AI INFRASTRUCTURE STANDARD ⬡

SovereignStack

Drop-in sovereign replacement for public AI platforms — air-gapped, OASA-compliant, OpenAI-compatible.

OASA L1 Compatible OASA L2 Verified OASA L3 Certified CI Status


Try It in 2 Minutes

# 1. One-command install (auto-detects GPU, RAM, TPM)
curl -sSL https://install.sovereignstack.ai | bash

# 2. Download a local model
curl -Lo playground/models/model.gguf \
  https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf/resolve/main/Phi-3-mini-4k-instruct-q4.gguf

# 3. Launch the stack
docker compose up --build -d

# 4. Chat (OpenAI-compatible API — just change the base URL)
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer mock-valid-token" \
  -d '{
    "model":"Qwen/Qwen2.5-7B-Instruct",
    "messages":[{"role":"user","content":"What is digital sovereignty?"}],
    "oasa_compliance_lock":true
  }'

# Or, from any OpenAI client (Python shown here), change just three lines:
import openai
openai.api_key = "mock-valid-token"                          # was: sk-...
openai.base_url = "http://localhost:8080/v1"                  # was: https://api.openai.com/v1
openai.default_headers = {"oasa_compliance_lock": "true"}     # was: nothing

That's it. Once the model file is downloaded, prompts and responses stay on your network: no third-party API keys and no cloud dependency.


Benchmarks

Inference Performance (vLLM, INT4 AWQ/GGUF, Batch=1)

| Model | Quantization | VRAM | Tokens/sec | TTFT | Hardware |
|---|---|---|---|---|---|
| Llama 3.1 8B | INT4 AWQ | 8 GB | 142 tok/s | 45 ms | RTX 4090 |
| Llama 3.1 70B | INT4 AWQ | 28 GB | 39 tok/s | 120 ms | 2x RTX 6000 |
| Mistral 7B | INT4 GGUF | 6 GB | 68 tok/s | 55 ms | RTX 3090 |
| Qwen 2.5 7B | INT4 AWQ | 8 GB | 134 tok/s | 48 ms | RTX 4090 |
| Phi-3 Mini | INT4 GGUF | 4 GB | 22 tok/s | 95 ms | CPU-only (M3) |
| DeepSeek-Coder 33B | INT4 AWQ | 18 GB | 56 tok/s | 88 ms | A100 40GB |

Benchmarks run with tools/benchmark.py on isolated hardware. See Benchmarking Guide.

Cost Comparison: Cloud vs. Sovereign (3-Year TCO)

| Scenario | Public Cloud | SovereignStack | Savings |
|---|---|---|---|
| 10 users, GPT-4 class | $360K | $12K (RTX 4090) | 97% |
| 50 users, GPT-4 class | $1.8M | $45K (2x A100) | 97.5% |
| 200 users, mixed models | $7.2M | $150K (4-node cluster) | 98% |
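
The Savings column follows from the two cost columns. As a quick sanity check, the sketch below recomputes it from the figures in the table; the function name is illustrative and not part of the repository.

```python
def savings_pct(cloud_cost: float, sovereign_cost: float) -> float:
    """Percentage saved when sovereign_cost replaces cloud_cost."""
    return (1 - sovereign_cost / cloud_cost) * 100


# (scenario, public cloud 3-year cost in USD, SovereignStack 3-year cost in USD)
SCENARIOS = [
    ("10 users, GPT-4 class", 360_000, 12_000),
    ("50 users, GPT-4 class", 1_800_000, 45_000),
    ("200 users, mixed models", 7_200_000, 150_000),
]

for name, cloud, sovereign in SCENARIOS:
    print(f"{name}: {savings_pct(cloud, sovereign):.1f}% saved")
```

The unrounded results are roughly 96.7%, 97.5%, and 97.9%, which match the table after rounding.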

Architecture

                   ┌───────────────────────────────────┐
                   │        CLIENT APPLICATION         │
                   │  OpenAI SDK / LangChain / Custom  │
                   └─────────────────┬─────────────────┘
                                     │
                        ┌────────────▼────────────┐
                        │    SOVEREIGN GATEWAY    │
                        │   :8080 — OIDC + OPA    │
                        │  Auth → Policy → Audit  │
                        └────────────┬────────────┘
                                     │
           ┌─────────────────────────┼─────────────────────────┐
           │                         │                         │
┌──────────▼──────────┐   ┌──────────▼──────────┐   ┌──────────▼──────────┐
│  vLLM ENGINE        │   │  MEMORY SERVICE     │   │  INGEST SERVICE     │
│  PagedAttention     │   │  TurboMemory        │   │  pdf2struct         │
│  INT4/AWQ/FP8       │   │  AES-256 Vector DB  │   │  PDF/DOCX → JSON    │
│  FlashAttention     │   │  KV Cache Isolation │   │  VOLATILE RAM Only  │
└──────────┬──────────┘   └──────────┬──────────┘   └──────────┬──────────┘
           │                         │                         │
           └─────────────────────────┼─────────────────────────┘
                                     │
                        ┌────────────▼────────────┐
                        │  IDENTITY & ACCESS      │
                        │  Keycloak (OIDC)        │
                        │  Open Policy Agent      │
                        │  OpenTelemetry          │
                        │  Prometheus             │
                        └─────────────────────────┘

Key Design Principle

When local compute fails, the oasa_compliance_lock ensures the gateway returns 503 Service Unavailable rather than silently forwarding data to external APIs. A 503 is inconvenient; a GDPR fine of up to 4% of global annual turnover is catastrophic.
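
The fail-closed behaviour described here can be illustrated with a minimal sketch. This is not the gateway's actual code: `handle_chat`, `GatewayResponse`, and the two callables are hypothetical names, and treating a missing flag as locked is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class GatewayResponse:
    status: int
    body: dict


def handle_chat(
    request: dict,
    local_infer: Callable[[dict], dict],
    external_infer: Optional[Callable[[dict], dict]] = None,
) -> GatewayResponse:
    """Serve from local compute; never fall back externally while locked."""
    try:
        return GatewayResponse(200, local_infer(request))
    except Exception as exc:
        # Assumption: anything other than an explicit False counts as locked,
        # so a missing or malformed flag fails closed.
        locked = request.get("oasa_compliance_lock", True) is not False
        if locked or external_infer is None:
            return GatewayResponse(
                503,
                {"error": "local inference unavailable", "detail": str(exc)},
            )
        # Only reached when the caller explicitly opted out of the lock.
        return GatewayResponse(200, external_infer(request))
```

The point of the design is that the fallback path needs an explicit opt-out; the default outcome of any local failure is the 503.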


OASA Compliance Program

SovereignStack implements the Open Architecture for Sovereign AI (OASA) — a three-tier conformance certification program:

| Level | Badge | Requirements | Use Case |
|---|---|---|---|
| L1 Compatible | L1 | Schema validation, YAML manifest, JSON schemas, basic tooling | Evaluation & dev |
| L2 Verified | L2 | L1 + compliance lock enforcement, TPM binding, encrypted memory, audit logs, blocked exfiltration domains | Production single-node |
| L3 Certified | L3 | L2 + runtime memory protection, Helm lint, comprehensive report, hardware attestation | Regulated enterprise |

Full certification specification →


Features

Identity & Access (OIDC + RBAC)

# Get a token from Keycloak
curl -X POST http://localhost:8083/realms/sovereign/protocol/openid-connect/token \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "client_id=sovereign-gateway" \
  -d "username=sovereign-admin" \
  -d "password=admin123" \
  -d "grant_type=password"

# Use the token
curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer <access_token>" \
  -H "Content-Type: application/json" \
  -d '{"model":"Qwen/Qwen2.5-7B-Instruct","messages":[{"role":"user","content":"Hello"}],"oasa_compliance_lock":true}'
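
The same two-step flow can be scripted with the Python standard library alone. The sketch below only mirrors the two curl calls above; the helper names are illustrative, and it assumes the Keycloak and gateway ports shown in this section.

```python
import json
import urllib.parse
import urllib.request

KEYCLOAK_TOKEN_URL = (
    "http://localhost:8083/realms/sovereign/protocol/openid-connect/token"
)
GATEWAY_CHAT_URL = "http://localhost:8080/v1/chat/completions"


def build_token_request(username, password, client_id="sovereign-gateway"):
    """Form-encoded password-grant request, as in the first curl call."""
    form = urllib.parse.urlencode({
        "client_id": client_id,
        "username": username,
        "password": password,
        "grant_type": "password",
    }).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return urllib.request.Request(KEYCLOAK_TOKEN_URL, data=form, headers=headers)


def build_chat_request(access_token, prompt, model="Qwen/Qwen2.5-7B-Instruct"):
    """JSON chat request carrying the bearer token and the compliance lock."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "oasa_compliance_lock": True,
    }
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(GATEWAY_CHAT_URL, data=body, headers=headers)


def send_json(request, timeout=30.0):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the stack running, `send_json(build_token_request("sovereign-admin", "admin123"))["access_token"]` yields the token to pass to `build_chat_request`.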

Role-based access: inference:write, inference:read, audit:read — enforced at the gateway.
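
As a rough illustration of gateway-side enforcement, the sketch below maps routes to the three roles and checks them against decoded token claims. The route table is an assumption (in particular the `/v1/audit` path is hypothetical), and it assumes roles arrive in Keycloak's `realm_access.roles` claim; the real gateway may differ on both counts.

```python
# Hypothetical route-to-role table; only the role names come from this README.
REQUIRED_ROLE = {
    ("POST", "/v1/chat/completions"): "inference:write",
    ("GET", "/v1/models"): "inference:read",
    ("GET", "/v1/audit"): "audit:read",
}


def roles_from_claims(claims: dict) -> set:
    """Extract realm roles from already-verified JWT claims."""
    return set(claims.get("realm_access", {}).get("roles", []))


def authorize(method: str, path: str, claims: dict) -> int:
    """Return the HTTP status the gateway would answer with."""
    required = REQUIRED_ROLE.get((method.upper(), path))
    if required is None:
        return 404  # unknown route
    if required in roles_from_claims(claims):
        return 200
    return 403  # authenticated, but missing the required role


reader = {"realm_access": {"roles": ["inference:read"]}}
print(authorize("GET", "/v1/models", reader))             # allowed
print(authorize("POST", "/v1/chat/completions", reader))  # denied
```

Signature and expiry validation of the token is assumed to have happened before `authorize` is called.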

Policy Engine (OPA)

Data Loss Prevention, prompt injection blocking, and role-based model budgets — all governed by Open Policy Agent Rego policies at policies/inference.rego.

Observability

  • OpenTelemetry — Trace propagation across all services (x-trace-id, x-span-id)
  • Prometheus — Metrics scraping at /metrics on all services
  • Audit Log — Immutable append-only JSON log with jurisdiction tags
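
To make the audit log bullet concrete, here is a minimal sketch of appending one JSON record per line with a jurisdiction tag and trace ID. The field names and the function are assumptions, not the project's schema. Opening with `O_APPEND` only guarantees that this writer appends; true immutability needs support from the storage layer.

```python
import json
import os
import time


def append_audit_record(path, event, jurisdiction, trace_id):
    """Append one JSON line to the audit log and return the record written."""
    record = {
        "ts": time.time(),
        "event": event,
        "jurisdiction": jurisdiction,  # e.g. "EU"
        "trace_id": trace_id,          # value of the x-trace-id header
    }
    line = json.dumps(record, sort_keys=True) + "\n"
    # O_APPEND makes every write land at the end of the file.
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    try:
        os.write(fd, line.encode("utf-8"))
    finally:
        os.close(fd)
    return record
```

Each call adds exactly one line, so the file can be tailed or parsed line by line without loading it whole.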

Air-Gapped Deployment

# Docker Compose (all traffic on internal: true bridge)
docker compose up --build -d

# Kubernetes with Helm (strict NetworkPolicies, gVisor sandboxing)
helm install sovereign-stack ./charts/sovereignstack \
  --namespace sovereign-stack --create-namespace \
  --set vllm.model.name="Qwen/Qwen2.5-7B-Instruct" \
  --set global.air_gapped=true

Documentation

| Document | Description |
|---|---|
| ARCHITECTURE.md | System topology, layers, subsystems, trust boundaries |
| CONFORMANCE.md | OASA certification specification (L1/L2/L3) |
| OASA.md | Full OASA protocol specification |
| Architecture Guide | Trust boundaries, data flow, identity flow |
| Deployment Guide | Deployment profiles, Docker Compose, Helm |
| Deployment Profiles | Personal, Edge, Air-Gapped, Datacenter |
| Docker Compose | Local stack with Keycloak, vLLM, OTel, Prometheus |
| Helm Chart | Kubernetes deployment with NetworkPolicies & gVisor |
| Threat Model | STRIDE threat catalogue, attack surface, compliance mapping |
| RFCs | Standards evolution (Runtime Spec, RFC Process) |
| Specifications | Formal subsystem and protocol specifications |
| Governance | Project roles, decision-making, release model |
| Roadmap | Development phases and milestones |
| Contributing | How to contribute |
| Security | Vulnerability disclosure, threat model, compliance |
| Code of Conduct | Community standards |

Tooling

# VRAM estimation
python tools/vram_calculator.py --params 70B --quant INT4 --context 8192

# Compliance validation
python tools/sovereign_stack.py validate sovereign-stack.yaml --audit-host

# Performance benchmarking
python tools/benchmark.py --url http://localhost:8080/v1 --model sovereign-llama3

# Runtime exfiltration watchdog
python tools/runtime_shield.py --interval 10

# Compliance report generator
python tools/generate_compliance_report.py --level L2 --output report.md
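
For intuition about what a VRAM estimate involves, the sketch below adds quantized weight size to KV-cache size. It is a back-of-the-envelope formula, not necessarily what tools/vram_calculator.py computes, and it ignores activations, quantization scales, and runtime overhead. The example uses Llama 3.1 8B's published shape (32 layers, 8 KV heads, head dimension 128) with the parameter count rounded to 8 billion.

```python
def estimate_vram_gib(
    n_params: float,
    weight_bits: int,
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    context_len: int,
    kv_bytes: int = 2,  # FP16 KV cache
    batch: int = 1,
) -> float:
    """Rough lower bound: quantized weights plus KV cache, in GiB."""
    weight_bytes = n_params * weight_bits / 8
    # Keys and values (factor 2) for every layer, KV head, and cached token.
    kv_cache_bytes = (
        2 * n_layers * n_kv_heads * head_dim * kv_bytes * context_len * batch
    )
    return (weight_bytes + kv_cache_bytes) / 2**30


# Llama 3.1 8B at INT4 with an 8192-token context.
estimate = estimate_vram_gib(8e9, 4, 32, 8, 128, 8192)
print(f"{estimate:.2f} GiB")
```

Under these assumptions the KV cache alone comes to exactly 1 GiB at 8192 tokens, so context length matters even for small models; real deployments need headroom on top of this figure.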

Regulatory Compliance Matrix

| Regulation | Jurisdiction | Coverage |
|---|---|---|
| GDPR | EU | Zero exfiltration, jurisdictional routing, immutable audit logs |
| HIPAA | US | AES-256-GCM encryption, air-gapped compute, access logging |
| NIS2 | EU | Hardware security (TPM), immutable audit trail, incident response isolation |
| EU AI Act | EU | Local model control, transparency logging, human oversight |
| DORA | EU | Operational resilience via air-gapped orchestration |
| SOX | US | Deterministic ingestion, tamper-evident logs, financial data isolation |

Commercialization

SovereignStack is structured for enterprise adoption:

  • Enterprise Support (SLA) — 24/7 incident response, deployment audits, custom integration
  • SovereignNode Appliances — Turnkey air-gapped hardware with K3s, vLLM, encrypted Qdrant
  • OASA Certification — Compliance badges and third-party audit reports
  • Dedicated Training — On-site workshops for regulated deployments

Roadmap

See ROADMAP.md for the full phased roadmap through 2027+.

| Phase | Highlights |
|---|---|
| 2026.1 | Helm chart, vLLM, OPA, CI/CD, Keycloak OIDC, Threat Model |
| 2026.2 🚧 | RFC Process, Architecture Docs, Deployment Profiles, Governance |
| 2026.3 📅 | Merkle-Tree Auditing, Federated Memory, Mesh Networking |
| 2027.1 📅 | Hardware Enclaves, SBOM/Cosign, SPIFFE/SPIRE, Sovereign Node OS |
| 2027.2 📅 | Agent Orchestration, Multi-Model Routing, Federated Agents |
| 2027.3+ 🔮 | Autonomous Infrastructure, Certification Program, Enterprise Platform |

Contributing

See CONTRIBUTING.md and our Code of Conduct.

License

Apache 2.0 — see LICENSE.

About

**Open Architecture Specification for Autonomous and Sovereign AI (OASA)**
