Zeph is a Rust-native AI agent built for work that cannot fit into one chat window: coding sessions, operations, research loops, document RAG, scheduled jobs, and multi-agent workflows. It keeps short-term context sharp, persists long-term memory, builds a relationship graph from decisions and entities, and routes each task to the cheapest provider that can handle it.
Unlike single-session assistants, Zeph is designed to remember why a decision happened, not just the last messages around it.
| If you want... | Zeph gives you... |
|---|---|
| An agent that survives long projects | SQLite conversation history, semantic recall, graph memory, session digests, trajectory memory, and goal-aware compaction. |
| Lower infrastructure cost | A default SQLite vector backend, local Ollama defaults, feature-gated bundles, and provider routing for simple vs. hard tasks. |
| More than keyword memory | Typed graph facts, BFS recall, SYNAPSE spreading activation, MMR reranking, temporal decay, and write-quality gates. See graph memory concepts. |
| Provider freedom | Ollama, Claude, OpenAI, Gemini, Candle, any OpenAI-compatible endpoint, and distributed inference networks (Gonka, Cocoon TEE) for cost-sensitive or privacy-sensitive workloads. |
| Agent-grade safety | Age-encrypted vault secrets, sandboxed tool execution, MCP injection detection, SSRF guards, PII filtering, and exfiltration checks. |
| Daily operator ergonomics | CLI, TUI dashboard, MCP tools, plugins, skills, sub-agents, ACP for IDEs, A2A, scheduler, and JSON output modes. |
Install the latest release:

```sh
curl -fsSL https://github.com/bug-ops/zeph/releases/latest/download/install.sh | sh
```

Or install from crates.io:

```sh
cargo install zeph
```

Initialize and run:

```sh
zeph init
zeph doctor
zeph --tui
```

> [!IMPORTANT]
> Zeph requires Rust 1.95 or later when building from source. Pre-built binaries do not require a Rust toolchain.
For a local-first setup, run Ollama and pull the default lightweight models:

```sh
ollama pull qwen3:8b
ollama pull qwen3-embedding
zeph init
zeph
```

Long-running agents are the worst-case workload for centralized API providers: thousands of calls per session, rate limits that pause mid-task, and costs that compound across every tool loop, memory retrieval, and sub-agent spawn.
Distributed inference networks change the economics. Compute is supplied by independent nodes rather than a single data center — which means no shared rate ceiling, no single vendor dependency, and in hardware-attested networks, provable isolation of your prompts from the node operator.
Zeph treats distributed networks as first-class providers alongside Ollama and cloud APIs, participating in the same adaptive routing — you can send cheap extraction and embedding work to a distributed node while reserving TEE-isolated compute for steps that touch sensitive context.
| Network | Provider type | Characteristic |
|---|---|---|
| Gonka | `gonka` / `compatible` | High-capacity distributed nodes, signed transport, OpenAI-compatible gateway |
| Cocoon | `cocoon` | Hardware TEE isolation: node operators cannot read prompts or weights |
Both plug into the standard provider declaration:

```toml
[[llm.providers]]
name = "distributed"
type = "gonka" # or "cocoon", or "compatible" for gateway mode
model = "qwen3-235b"
default = true
```

Run `zeph init` to configure either network interactively through the setup wizard.
Most agents treat messaging apps as a thin input channel — user sends text, agent replies. Zeph's Telegram integration flips that model: the messenger becomes a coordination layer where agents serve public audiences, accept tasks from orchestrators, and talk to other bots.
Guest Mode removes the assumption that every user is a registered Telegram account. A transparent local proxy intercepts Bot API 10.0 guest queries and routes them to the agent without opening a second getUpdates connection (no 409 conflicts). The agent responds via answerGuestQuery: one call, no extra infrastructure. This makes it practical to deploy public-facing agents that handle anonymous or unauthenticated requests.
Bot-to-Bot communication lets Zeph register as a managed bot via setManagedBotAccessSettings and accept tasks from other bots in a controlled chain. Consecutive bot replies are tracked per-chat, depth is capped at max_bot_chain_depth, and each inbound bot is validated against an allowlist — so the agent participates in multi-agent pipelines without becoming a relay for arbitrary bots.
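The depth-cap and allowlist behavior described above can be sketched roughly as follows. This is an illustrative model only; the type and field names (`BotChainGuard`, `depth_per_chat`) are hypothetical, not Zeph's actual internals:

```rust
use std::collections::HashMap;

/// Hypothetical guard for inbound bot-to-bot messages: enforces an
/// allowlist and caps consecutive bot replies per chat.
struct BotChainGuard {
    allowed_bots: Vec<String>,
    max_bot_chain_depth: u32,
    depth_per_chat: HashMap<i64, u32>,
}

impl BotChainGuard {
    fn accept(&mut self, chat_id: i64, sender_bot: &str) -> bool {
        // Reject senders that are not explicitly allowlisted.
        if !self.allowed_bots.iter().any(|b| b == sender_bot) {
            return false;
        }
        // Track consecutive bot replies in this chat and cap the depth,
        // so the agent never becomes a relay for unbounded bot chains.
        let depth = self.depth_per_chat.entry(chat_id).or_insert(0);
        if *depth >= self.max_bot_chain_depth {
            return false;
        }
        *depth += 1;
        true
    }

    /// A human message resets the chain depth for the chat.
    fn reset(&mut self, chat_id: i64) {
        self.depth_per_chat.insert(chat_id, 0);
    }
}

fn main() {
    let mut guard = BotChainGuard {
        allowed_bots: vec!["orchestrator_bot".into()],
        max_bot_chain_depth: 2,
        depth_per_chat: HashMap::new(),
    };
    assert!(guard.accept(1, "orchestrator_bot"));  // depth 1
    assert!(guard.accept(1, "orchestrator_bot"));  // depth 2
    assert!(!guard.accept(1, "orchestrator_bot")); // chain capped
    assert!(!guard.accept(1, "random_bot"));       // not allowlisted
}
```

The key property is that both checks run before the message reaches the agent loop, so a disallowed or over-deep chain costs no model calls.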
Voice input via Cocoon STT. The Telegram adapter detects voice and audio messages, downloads the file, and passes it to the configured speech-to-text provider. With type = "cocoon" and stt_model set, transcription runs inside a hardware TEE — audio bytes never leave the isolated enclave unencrypted. This makes voice-driven agentic workflows practical for sensitive use cases: a voice note becomes a task, without the audio touching a third-party transcription API.
Configurable streaming interval (stream_interval_ms, default 3 s, minimum 500 ms) fixes a silent data-loss bug in the original hardcoded delay: responses that completed within a single interval window were discarded before Telegram saw them. Now the agent flushes on completion regardless of the timer.
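A minimal sketch of the corrected flush policy: buffer chunks, emit on the interval timer, and always flush once the stream completes, even when the interval never elapsed. Illustrative only, not Zeph's actual Telegram adapter code:

```rust
use std::time::{Duration, Instant};

/// Illustrative flush policy for streamed message edits.
struct Flusher {
    interval: Duration,
    last_flush: Instant,
    buffer: String,
}

impl Flusher {
    /// Buffer a streamed chunk; emit only when the interval has elapsed.
    fn push(&mut self, chunk: &str, now: Instant) -> Option<String> {
        self.buffer.push_str(chunk);
        if now.duration_since(self.last_flush) >= self.interval {
            self.last_flush = now;
            return Some(std::mem::take(&mut self.buffer));
        }
        None
    }

    /// Completion flush: whatever remains goes out unconditionally.
    /// Without this, a response finishing inside one interval window
    /// would be silently dropped (the bug described above).
    fn finish(&mut self) -> Option<String> {
        (!self.buffer.is_empty()).then(|| std::mem::take(&mut self.buffer))
    }
}

fn main() {
    let start = Instant::now();
    let mut f = Flusher {
        interval: Duration::from_millis(1500),
        last_flush: start,
        buffer: String::new(),
    };
    // Response completes within a single interval window:
    assert!(f.push("Hello, ", start).is_none());
    assert!(f.push("world!", start).is_none());
    // The completion flush still delivers the text.
    assert_eq!(f.finish().as_deref(), Some("Hello, world!"));
}
```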
```toml
[telegram]
guest_mode = true
bot_to_bot = true
allowed_bots = ["orchestrator_bot", "scheduler_bot"]
max_bot_chain_depth = 3
stream_interval_ms = 1500

[[llm.providers]]
name = "stt"
type = "cocoon"
stt_model = "whisper-large-v3" # transcribes Telegram voice messages inside TEE
```

See the Telegram guide for full configuration and Bot API 10.0 details.
Zeph combines several memory layers instead of treating recall as a side feature:
| Layer | Purpose |
|---|---|
| Working context | Keeps the current task coherent under context pressure. See context budgets. |
| Semantic memory | Stores conversations, tool outputs, documents, and summaries for retrieval. See semantic memory guide. |
| Graph memory | Records entities, decisions, relationships, causality, temporal links, and hierarchy. See graph memory. |
| Episodic memory | Preserves session-level scenes, digests, goals, and trajectories. |
| Quality gates | Reject noisy writes, validate compaction, and log retrieval failures for later improvement. See quality self-check. |
Ask "Why did we choose PostgreSQL?" and Zeph can traverse decision edges instead of searching raw chat text.
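As an illustration of graph-based recall, here is a minimal BFS over typed edges. The data shapes are hypothetical (Zeph's real graph store is richer), but the idea is the same: answer "why" questions by walking decision and causality edges rather than searching raw text:

```rust
use std::collections::{HashMap, HashSet, VecDeque};

// Hypothetical typed fact: (subject, relation, object), e.g.
// ("backend", "decided_on", "PostgreSQL").
type Fact = (&'static str, &'static str, &'static str);

/// BFS from a starting entity across typed edges, up to `max_hops`,
/// collecting the facts reached in traversal order.
fn bfs_recall(facts: &[Fact], start: &str, max_hops: usize) -> Vec<Fact> {
    // Index facts by subject for O(1) edge lookup.
    let mut adj: HashMap<&str, Vec<Fact>> = HashMap::new();
    for &f in facts {
        adj.entry(f.0).or_default().push(f);
    }
    let mut seen = HashSet::new();
    let mut out = Vec::new();
    let mut queue = VecDeque::from([(start, 0usize)]);
    while let Some((node, hops)) = queue.pop_front() {
        if hops >= max_hops || !seen.insert(node) {
            continue;
        }
        for &f in adj.get(node).map(Vec::as_slice).unwrap_or(&[]) {
            out.push(f);
            queue.push_back((f.2, hops + 1));
        }
    }
    out
}

fn main() {
    let facts = [
        ("backend", "decided_on", "PostgreSQL"),
        ("PostgreSQL", "because", "transactional JSONB requirements"),
        ("frontend", "uses", "TypeScript"), // unrelated, never visited
    ];
    // Two hops from "backend" recover the decision *and* its rationale.
    let recalled = bfs_recall(&facts, "backend", 3);
    assert_eq!(recalled.len(), 2);
}
```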
Zeph does not require a heavyweight stack to be useful:
- The default vector backend is embedded SQLite.
- Qdrant is optional for larger semantic and graph workloads.
- The default local chat model is `qwen3:8b` through Ollama.
- Feature bundles let you build only what you need: `desktop`, `ide`, `server`, `chat`, `ml`, or `full`.
- Release builds are optimized for small native binaries.
Declare providers once in `[[llm.providers]]`, then route work by complexity, cost, latency, and reliability:

```toml
[[llm.providers]]
name = "fast"
type = "ollama"
model = "qwen3:8b"
embedding_model = "qwen3-embedding"
embed = true

[[llm.providers]]
name = "quality"
type = "claude"
model = "claude-sonnet-4-6"
default = true

[llm]
routing = "bandit"
```

Use local models for extraction, embeddings, routing, and summarization. Keep expensive models for planning, code generation, and expert reasoning.
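The `"bandit"` routing mode suggests multi-armed-bandit provider selection. A toy epsilon-greedy sketch of the general idea (not Zeph's actual router; randomness is passed in explicitly to keep the example deterministic):

```rust
/// One routing "arm" per provider, with running reward statistics.
/// Reward could be success weighted by cost and latency.
struct Arm {
    name: &'static str,
    pulls: u32,
    reward_sum: f64,
}

/// Epsilon-greedy: with probability `epsilon` explore an arbitrary
/// provider; otherwise exploit the best mean reward observed so far.
/// `roll` stands in for a random draw in [0, 1).
fn pick<'a>(arms: &'a [Arm], epsilon: f64, roll: f64, explore_idx: usize) -> &'a str {
    if roll < epsilon {
        // Explore: try a provider regardless of current estimates.
        arms[explore_idx % arms.len()].name
    } else {
        // Exploit: highest mean reward so far.
        arms.iter()
            .max_by(|a, b| {
                let ma = a.reward_sum / a.pulls.max(1) as f64;
                let mb = b.reward_sum / b.pulls.max(1) as f64;
                ma.partial_cmp(&mb).unwrap()
            })
            .map(|a| a.name)
            .unwrap_or("")
    }
}

fn main() {
    let arms = [
        Arm { name: "fast", pulls: 10, reward_sum: 7.0 },    // mean 0.7
        Arm { name: "quality", pulls: 10, reward_sum: 9.0 }, // mean 0.9
    ];
    assert_eq!(pick(&arms, 0.1, 0.50, 0), "quality"); // exploit branch
    assert_eq!(pick(&arms, 0.1, 0.05, 0), "fast");    // explore branch
}
```

The exploration term is what keeps a cheap provider in rotation: the router periodically re-tests it instead of locking onto the early winner.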
Secrets live in the Zeph age vault, not in .env files or shell profiles. Tool execution goes through trust gates, command filters, sandboxing, audit logs, and redaction paths. MCP tools are discovered and exposed without dropping the injection and authorization checks.
```sh
zeph init                     # generate config through the wizard
zeph doctor                   # run preflight checks
zeph --tui                    # launch the dashboard
zeph ingest ./docs            # ingest documents into semantic memory
zeph skill list               # inspect installed skills
zeph plugin list --overlay    # inspect plugin config overlays
zeph router stats             # inspect adaptive provider routing
zeph memory export dump.json  # export memory snapshot
zeph project purge --dry-run  # preview local state cleanup
```

Install script:

```sh
curl -fsSL https://github.com/bug-ops/zeph/releases/latest/download/install.sh | sh
```

From crates.io:

```sh
cargo install zeph
cargo install zeph --features desktop
```

Docker:

```sh
docker pull ghcr.io/bug-ops/zeph:latest
```

From source:

```sh
git clone https://github.com/bug-ops/zeph.git
cd zeph
cargo build --release --features full
./target/release/zeph init
```

| Area | Highlights |
|---|---|
| Memory | SQLite/PostgreSQL history, embedded SQLite vectors or Qdrant, graph memory, SYNAPSE, SleepGate, APEX-MEM write-quality gates, BeliefMem probabilistic edge layer, MemCoT Zoom-In/Out recall views, document RAG. |
| Context | Goal-aware compaction, TypedPage assembler pipeline, TACO output compression, tool-output archive, session recap, active-goal injection. |
| Skills | SKILL.md registry, hot reload, BM25 + embedding matching, trust levels, self-learning skill improvement. |
| Providers | Ollama, Claude, OpenAI, Gemini, OpenAI-compatible APIs, Gonka native inference, Cocoon decentralized TEE inference, Candle local inference, adaptive routing. |
| Tools | Shell, file, web, MCP, tool quotas, approval gates, audit trail, sandboxing, output compression, speculative dispatch, ShadowSentinel safety probes, TrajectorySentinel capability governance. |
| Interfaces | CLI, TUI, Telegram (with Guest Mode and Bot-to-Bot), Discord, Slack, ACP, A2A, HTTP gateway, scheduler daemon. |
| Code intelligence | Tree-sitter indexing, semantic repo map, LSP diagnostics and hover context through MCP. |
| Observability | Debug dumps, JSONL mode, Prometheus metrics, OpenTelemetry traces, profiling builds. |
See the architecture overview and crates reference for full details.
```
zeph
  src/                        CLI, bootstrap, init wizard, command handlers
  crates/zeph-core            agent loop and runtime orchestration
  crates/zeph-config          TOML schema, migration, provider registry
  crates/zeph-llm             provider abstraction and model backends
  crates/zeph-memory          semantic, graph, episodic, and document memory
  crates/zeph-skills          skill registry, matching, trust, learning
  crates/zeph-tools           tool executors, sandboxing, policy, audit
  crates/zeph-mcp             MCP client and tool lifecycle
  crates/zeph-tui             ratatui dashboard
  crates/zeph-acp             IDE integration through Agent Client Protocol
  crates/zeph-a2a             agent-to-agent protocol support
  crates/zeph-subagent        sub-agent definitions, spawning, transcripts
  crates/zeph-orchestration   DAG planning, scheduling, verification
```
- Full documentation
- Installation guide
- Configuration recipes
- Memory concepts — graph, semantic, episodic layers
- Graph memory
- Adaptive inference routing
- MCP integration
- Security model
- Feature flags
- CLI reference
Zeph draws from published work on parallel tool execution, temporal knowledge graphs, agentic memory linking, failure-driven compression, retrieval quality, and multi-model routing. See References & Inspirations for the full list.
See CONTRIBUTING.md, CODE_OF_CONDUCT.md, and SECURITY.md.

