release: v0.15.0 - BM25 symbol search / AppContext split / multi-backend Embedding / Health extension / invariant CI by juice094 · Pull Request #8 · juice094/devbase

juice094 · 2026-05-11T01:17:54Z

v0.15.0 fully delivered, including the documentation consistency fixes (commit 684e18a).

Verification:
- cargo check --workspace: 0 errors
- cargo test --workspace: 503 passed / 0 failed / 4 ignored

Change summary:
- Cargo.toml / workspace.package version bump 0.14.3 → 0.15.0
- README badge + Roadmap table timeline reordered chronologically
- AGENTS.md version field synchronized
- Added RELEASE_NOTES_v0.15.0.md
- Archived outdated status reports to docs/_audit/

…rmissions

softprops/action-gh-release@v1 consistently fails with 403 / Too many retries on this repository. Switch to native gh release create CLI which uses the same GITHUB_TOKEN but has better error handling. Also add explicit permissions: contents: write.

…branches

Phase 1: AGENTS.md version hallucination fix
- v0.16.1 -> v0.14.3, commit hash 5928499 -> 2867811
- Test counts: 456 -> 490+, integration 9/11 -> 11/11
- Clippy: 1 warning -> 0 warning

Phase 2: Remove dead code KnowledgeRepository::generate_report
- Zero callers confirmed by grep across src/
- Actual devkit_knowledge_report uses oplog_analytics::generate_report

Phase 3: Remote feature branches already absent (cleaned previously)

feat(cli): add --json to scan/workflow list, knowledge-report command

CLI parity fixes (Agent-driven testing):
- scan: add --json flag (underlying run_json already existed)
- workflow list: add --json flag with {success, count, workflows[]} output
- knowledge-report: new CLI subcommand matching MCP devkit_knowledge_report

Health query performance optimization:
- Add EnvVersionCache (30s TTL) to AppContext; eliminates 5 sequential subprocess spawns on cached calls
- Parallel subprocess spawns via tokio::join! on cold cache
- Batch get_health_batch() query replacing N+1 individual SELECTs
- Dedup repo.head() call in analyze_repo -> calc_ahead_behind

Benchmarks (Windows):
- health --json median: 122ms -> 68ms (-44%)
- Agent loop median: 168ms -> 114ms (-32%)
- 408 tests pass, 0 regression

…le batch

- Add EmbeddingProvider::encode_batch() trait method for future GPU/ONNX providers
- Implement encode_batch_with_candle using true batch forward (pad + single pass)
- Revert generate_and_save_embeddings to rayon par_iter single encoding: Candle CPU BERT batch=32 forward takes ~1.7s vs ~10ms single; total 88s vs 16s
- Add --skip-embeddings CLI flag: index drops from ~16s to ~250ms
- Keep unified AST walk (extract_symbols_and_calls) as prior cleanup

BREAKING: run_index/run_index_with_progress signatures gain skip_embeddings bool
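
To illustrate the encode_batch() addition, here is a minimal sketch of a trait with a batched method whose default falls back to per-item encoding. The real EmbeddingProvider signatures are not shown in this PR, so the method shapes, the String error type, and ToyProvider are assumptions made for the example only.

```rust
/// Hypothetical, simplified stand-in for the crate's `EmbeddingProvider` trait.
pub trait EmbeddingProvider {
    /// Encode one text into a vector.
    fn encode(&self, text: &str) -> Result<Vec<f32>, String>;

    /// Encode many texts. The default issues one `encode` call per text and
    /// stops at the first error, so only providers with a real batched
    /// forward pass (GPU/ONNX) would need to override it.
    fn encode_batch(&self, texts: &[&str]) -> Result<Vec<Vec<f32>>, String> {
        texts.iter().map(|t| self.encode(t)).collect()
    }
}

/// Toy provider used only to exercise the trait:
/// it "embeds" a text as [byte length, word count].
pub struct ToyProvider;

impl EmbeddingProvider for ToyProvider {
    fn encode(&self, text: &str) -> Result<Vec<f32>, String> {
        if text.is_empty() {
            return Err("empty input".to_string());
        }
        Ok(vec![
            text.len() as f32,
            text.split_whitespace().count() as f32,
        ])
    }
}
```

With a default like this, callers can always use encode_batch(), and the choice between single and batched execution stays inside each provider, which fits the measured result above that batching was slower on CPU Candle.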

…mbeddings

…pipeline reference

- docs/_audit/2026-04-26-embedding-research.md: add 2026-05-04 supplement with batch encoding failure data (88s vs 16s), --skip-embeddings path, and external distill-knowledge Skill pipeline reference
- AGENTS.md: cross-reference to audit doc for knowledge distillation spec

Based on architecture governance methodology from external research (Kimi session e9f2965f-b949-46a5-9d7c-afd6d4d9232c):
- docs/architecture/adr-template.md: ADR template + completed ADR-001/002
- docs/architecture/invariants.md: global + tiered invariants + extraction drill checklist
- AGENTS.md: add Architecture Governance subsection linking to new docs
- docs/README.md: register new architecture docs in navigation

Replace 24 unwrap/expect occurrences across 7 production files:
- src/search.rs: 12× schema.get_field expect → ? (functions return TantivyError)
- src/workflow/scheduler.rs: 4× expect → ok_or_else + ? (anyhow::Result)
- src/search/hybrid.rs: 2× expect → Vec::remove(0) (len==1 verified)
- src/discovery_engine.rs: 2× expect → ok_or_else + ? (anyhow::Result)
- src/semantic_index/mod.rs: 2× expect/unwrap + signature changes
  - index_repo_full → anyhow::Result<(Vec<CodeSymbol>, Vec<CodeCall>)>
  - index_repo → anyhow::Result<Vec<CodeSymbol>>
  - callers: knowledge_engine/index.rs + semantic_index/mod.rs updated
- src/query.rs: 1× expect → ? (Option propagation)
- src/test_utils.rs: 1× expect removed, return type → anyhow::Result
- benches/semantic_index.rs: add .unwrap() to bench call (benchmark exempt)

Plan documented in plans/rf6-unwrap-audit-plan.md

All checks pass:
- cargo test --all-targets: 408 passed / 0 failed / 3 ignored
- cargo clippy --all-targets -- -D warnings: 0 warnings
- cargo fmt --check: 0 diff
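
The expect → ok_or_else + ? pattern used in this commit can be sketched with the standard library alone. The PR propagates TantivyError and anyhow::Error; the MissingField error type, the HashMap standing in for a schema, and the field_id helper below are hypothetical and exist only to keep the example dependency-free.

```rust
use std::collections::HashMap;
use std::fmt;

/// Hypothetical error type; the PR itself propagates `TantivyError` / `anyhow::Error`.
#[derive(Debug, PartialEq)]
pub struct MissingField(pub String);

impl fmt::Display for MissingField {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "schema field `{}` not found", self.0)
    }
}

impl std::error::Error for MissingField {}

/// Before: `schema.get(name).copied().expect("field exists")` panics on a mismatch.
/// After: the missing field becomes an `Err` that the caller can handle or report.
pub fn field_id(schema: &HashMap<String, u32>, name: &str) -> Result<u32, MissingField> {
    let id = schema
        .get(name)
        .copied()
        .ok_or_else(|| MissingField(name.to_string()))?;
    Ok(id)
}
```

The closure passed to ok_or_else only runs on the failure path, so the error string is not allocated when the lookup succeeds.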

- docs/README.md: version v0.13.0 -> v0.14.3, test count 389->408, add RF-6 and crates metrics
- AGENTS.md: update current phase description, version ref, completed milestones

- init_db() global paths: no external callers found, fully migrated to AppContext
- Feature flags: tui/watch/mcp/embedding all optional, --no-default-features compiles

…/go)

Add optional dependencies + feature flags for all 4 tree-sitter grammars:
- Cargo.toml: tree-sitter-{rust,python,typescript,go} -> optional
- New features: lang-rust, lang-python, lang-js-ts, lang-go
- Default features include all 4 for backward compatibility
- semantic_index/mod.rs: #[cfg] on Lang enum variants + from_ext/parser_language
- semantic_index/symbol.rs: #[cfg] on grammar match arms + test guard
- semantic_index/call_graph.rs: #[cfg] on lang match arms

This allows --no-default-features + selective languages to reduce compile time by skipping unused grammar C compilation.

All checks pass:
- cargo test --all-targets: 408 passed / 0 failed / 3 ignored
- cargo clippy --all-targets -- -D warnings: 0 warnings
- cargo fmt --check: 0 diff

Feature-gated grammar crates: lang-rust/python/js-ts/go features added, allowing selective compilation to reduce build time.

Extend repair_tantivy_consistency to detect the reverse gap: SQLite entities missing from Tantivy index (silent search gap).

Changes:
- Add RepairResult { orphans, missing_from_index }
- Convert tantivy_ids to HashSet for O(1) lookup
- After orphan cleanup, iterate all SQLite repo IDs and warn for any not found in Tantivy index
- Update repair_tantivy_consistency_at signature
- Update test_repair_tantivy_consistency_detects_orphan

Does NOT trigger re-index at startup (too heavy); only logs warnings for operator visibility.

All checks pass:
- cargo test --all-targets: 408 passed / 0 failed / 3 ignored
- cargo clippy --all-targets -- -D warnings: 0 warnings
- cargo fmt --check: 0 diff
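
The two-directional check described here amounts to a pair of set differences. The following is a rough sketch under stated assumptions: the field types of RepairResult, the diff_ids helper, and the use of plain string IDs are all hypothetical, and the real function also performs orphan cleanup and logging that this sketch omits.

```rust
use std::collections::HashSet;

/// Mirrors the `RepairResult { orphans, missing_from_index }` shape named in
/// the commit; the field types here are assumptions.
#[derive(Debug, PartialEq, Default)]
pub struct RepairResult {
    /// IDs present in the search index but absent from SQLite.
    pub orphans: Vec<String>,
    /// IDs present in SQLite but absent from the search index.
    pub missing_from_index: Vec<String>,
}

/// Hypothetical helper: compare both ID sets in each direction.
/// Results are sorted so the output is deterministic.
pub fn diff_ids(sqlite_ids: &[&str], tantivy_ids: &[&str]) -> RepairResult {
    let sqlite: HashSet<&str> = sqlite_ids.iter().copied().collect();
    let tantivy: HashSet<&str> = tantivy_ids.iter().copied().collect();

    let mut orphans: Vec<String> = tantivy
        .difference(&sqlite)
        .map(|id| id.to_string())
        .collect();
    let mut missing_from_index: Vec<String> = sqlite
        .difference(&tantivy)
        .map(|id| id.to_string())
        .collect();
    orphans.sort();
    missing_from_index.sort();

    RepairResult { orphans, missing_from_index }
}
```

Building both sides as a HashSet keeps each membership test O(1) on average, which is the reason the commit gives for converting tantivy_ids.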

Reverse consistency check landed in fe14c81. Short-term detection gap closed; long-term transaction coordination still open.

…onflict resolved)

When system clock drifts backward or checked_at is in the future, signed_duration_since returns a negative value. Previously this would incorrectly satisfy elapsed < ttl_seconds, causing the cache to never refresh. Change elapsed < ttl_seconds to elapsed >= 0 && elapsed < ttl_seconds to force a refresh when elapsed is negative.
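
The fix can be shown as a pair of predicates over a signed elapsed time. This is a minimal sketch: the function names and the use of plain i64 seconds are assumptions, and the old predicate is kept only to make the failure mode visible.

```rust
/// `elapsed_secs` is "now minus checked_at" in seconds. It can be negative
/// when the clock moved backward or `checked_at` lies in the future.
/// A negative value is treated as stale, which forces a refresh.
pub fn is_fresh(elapsed_secs: i64, ttl_seconds: i64) -> bool {
    elapsed_secs >= 0 && elapsed_secs < ttl_seconds
}

/// The pre-fix predicate: any negative elapsed time counts as fresh,
/// so the cache entry is never refreshed.
pub fn is_fresh_buggy(elapsed_secs: i64, ttl_seconds: i64) -> bool {
    elapsed_secs < ttl_seconds
}
```

For example, with a 30 second TTL an elapsed time of -5 seconds is fresh under the old predicate and stale under the new one.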

Replace SQLite LIKE fallback in keyword_search_symbols with Tantivy BM25 via a dedicated symbol_index.
- New src/search/symbol_index.rs: schema (repo_id, name, signature, file_path, line_start), add_symbols, search_symbols, delete_repo_symbols
- StorageBackend trait: add symbol_index_path() (default + Temp + 4 test impls)
- knowledge_engine/index.rs: index symbols into Tantivy after SQLite persist
- hybrid.rs: keyword_search_symbols now primary-path BM25, fallback to SQLite LIKE for repos without symbol index (backward compatible)
- search.rs: pub mod symbol_index

Tests: 410 passed / 0 failed / 4 ignored

- New tools/invariant-checks/run-checks.ps1:
  - G5 (RF-6): detect new unwrap/expect/panic in production code via git diff
  - T11: detect direct rusqlite::Connection usage in mcp/tools
  - T12: detect write operations in tui/render production code
  - Module extraction drill: verify README.md + Cargo.toml presence
  - Known exceptions for legacy T11 violations
- CI integration: add 'Architecture Invariants' job to ci.yml
- Fix crates/devbase-embedding/src/lib.rs: encode_with_candle .unwrap() -> .ok_or_else() (RF-6 compliance)
- Add plans/appcontext-refactor-design.md (P2 design doc)

Tests: 410 passed / 0 failed / 4 ignored

…age.rs

Split AppContext's 6 Client trait implementations from storage.rs into their respective domain modules (zero behavior change):
- ScanClient -> scan.rs
- HealthClient -> health.rs
- SyncClient -> sync.rs
- DigestClient -> digest.rs
- KnowledgeClient -> knowledge_engine/mod.rs
- RegistryClient -> registry.rs

Result: storage.rs reduced from ~860 lines to ~430 lines (-50%).

Tests: 410 passed / 0 failed / 4 ignored

- Move query_code_symbols SQL logic to registry::code_symbols with CodeSymbolRow
- Move query_dead_code SQL logic to registry::dead_code with DeadCodeRow
- Simplify RegistryClient impl to pure delegation + JSON wrapping
- Add unit tests for both new query functions using in-memory DB
- Preserve re-exports for backward compatibility with src/repository/symbol.rs

Zero behavioral change; SQL strings identical to pre-extraction.

- EnvVersionCache: add python, bun, zig, java fields
- refresh_env_cache: detect 9 tools in parallel (was 5)
- get_tool_version: fallback to stderr when stdout empty (Java)
- fmt_version: handle quoted versions (Java), Docker version, Python
- JSON/CLI output: all 9 tools displayed
- tests: 5 new fmt_version tests for new tools
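
The stderr fallback and quoted-version handling listed above could look roughly like the sketch below. The helper names pick_output and extract_version are hypothetical and are not the project's get_tool_version or fmt_version; the parsing rules are an assumption chosen to cover the three output shapes the commit mentions.

```rust
/// Prefer stdout; fall back to stderr when stdout is empty or whitespace
/// (`java -version` writes its banner to stderr).
pub fn pick_output<'a>(stdout: &'a str, stderr: &'a str) -> &'a str {
    if stdout.trim().is_empty() {
        stderr.trim()
    } else {
        stdout.trim()
    }
}

/// Look only at the first line. If it contains a double-quoted span, return
/// the text inside the quotes (Java style). Otherwise return the first
/// whitespace-separated token that contains a digit, with any trailing
/// comma removed (Docker / Python style).
pub fn extract_version(raw: &str) -> Option<String> {
    let line = raw.lines().next()?;
    if let Some(start) = line.find('"') {
        let rest = &line[start + 1..];
        if let Some(end) = rest.find('"') {
            return Some(rest[..end].to_string());
        }
    }
    line.split_whitespace()
        .find(|tok| tok.chars().any(|c| c.is_ascii_digit()))
        .map(|tok| tok.trim_end_matches(',').to_string())
}
```

A caller would feed the captured stdout and stderr of a tool through pick_output first and then pass the result to extract_version; the sample banners used to exercise it are illustrative strings, not output captured from this project.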

- devbase-embedding: add OllamaProvider (HTTP /api/embed via ureq)
- devbase-embedding: add create_provider(backend, model, base_url, timeout)
- Config default: model 'nomic-embed-text' -> 'all-minilm' (384-dim, candle-compatible)
- Config docs: explain candle vs ollama backend choice
- Tests: 3 create_provider tests (candle, ollama, unknown fallback)
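
A backend-selecting factory in the spirit of create_provider(backend, model, base_url, timeout) might be sketched as follows. The commit confirms only the parameter list and that unknown backends hit a fallback; the Provider enum, the parameter types, and in particular the choice of Candle as the fallback target are assumptions of this sketch.

```rust
use std::time::Duration;

/// Hypothetical description of a selected backend (not the crate's real types).
#[derive(Debug, PartialEq)]
pub enum Provider {
    /// Local inference; needs only a model name.
    Candle { model: String },
    /// Remote inference over HTTP; needs an endpoint and a timeout.
    Ollama {
        model: String,
        base_url: String,
        timeout: Duration,
    },
}

/// ASSUMPTION: any backend name other than "ollama" (including unknown
/// names) selects the local Candle provider.
pub fn create_provider_sketch(
    backend: &str,
    model: &str,
    base_url: &str,
    timeout: Duration,
) -> Provider {
    match backend {
        "ollama" => Provider::Ollama {
            model: model.to_string(),
            base_url: base_url.to_string(),
            timeout,
        },
        _ => Provider::Candle {
            model: model.to_string(),
        },
    }
}
```

Falling back to a local provider rather than returning an error keeps a misspelled config value from disabling search entirely, at the cost of hiding the typo; whether the project makes the same trade-off is not stated in the PR.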

- src/embedding.rs: generate_query_embedding now reads EmbeddingConfig via OnceLock-cached provider (first call loads config, rest reuse)
- crates/devbase-embedding: remove generate_query_embedding (moved to main crate where Config is accessible)
- All 5 call sites (skill search, index, TUI search, MCP search x2) automatically use configured backend without code changes
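
The "first call loads config, rest reuse" behaviour maps directly onto std::sync::OnceLock. In the sketch below, CachedProvider, load_provider_from_config, and the LOADS counter are hypothetical stand-ins; the counter exists only so the single initialization can be observed.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how many times the (hypothetical) config loader actually ran.
pub static LOADS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for a configured embedding backend.
#[derive(Debug)]
pub struct CachedProvider {
    pub backend: String,
}

/// Stand-in for reading the embedding config and building a provider.
fn load_provider_from_config() -> CachedProvider {
    LOADS.fetch_add(1, Ordering::SeqCst);
    CachedProvider {
        backend: "candle".to_string(),
    }
}

/// The first call runs the loader; every later call returns the same instance.
pub fn provider() -> &'static CachedProvider {
    static PROVIDER: OnceLock<CachedProvider> = OnceLock::new();
    PROVIDER.get_or_init(load_provider_from_config)
}
```

Because every call site goes through one accessor, switching backends becomes a config change rather than a code change, which matches the commit's note about the 5 call sites. One consequence of this design is that a config edit takes effect only after the process restarts.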

- AGENTS.md: mark P1~P5 complete, update commit ref to e230b6b
- devbase-embedding/README.md: 50-word extraction drill doc

P1~P5 all delivered:
- BM25 symbol search (Tantivy)
- Multi-backend Embedding (Candle + Ollama)
- AppContext split Phase 1/2
- Health toolchain extension (9 tools)
- Architecture invariants CI (G5/T11/T12)
- TTL negative-value bugfix

- Cargo.toml: main crate & workspace.package version 0.14.3 → 0.15.0
- README: version badge 0.14.3 → 0.15.0, tests badge 390 → 490+
- README: roadmap table reordered chronologically; corrected the status of v0.15.0/v0.16.0/v0.14.3
  - v0.15.0 description aligned with CHANGELOG (P1-P5), marked ✅ current
  - v0.16.0 description aligned with docs/ROADMAP.md, marked 📋 in progress
  - v0.14.3 changed from ✅ current to ✅ released
- AGENTS.md: current version description changed from v0.15.0-in-progress to v0.15.0 released
- Added RELEASE_NOTES_v0.15.0.md (makes up for the missing v0.11.0~v0.14.x notes)
- Archived outdated status reports:
  - PROJECT_STATUS_2026-04-29.md → docs/_audit/
  - STAGE_REPORT_2026-04-09.md → docs/_audit/

Verification: cargo check --workspace 0 errors; cargo test --workspace 503 passed / 0 failed / 4 ignored

juice094 added 29 commits May 6, 2026 20:58

style: cargo fmt across health, index, storage, workflow, embedding

9f846e4

docs(AGENTS): record 2026-05-04 embedding batch experiment & --skip-e…

7242b80

…mbeddings

docs: update project status to v0.14.3

f836303

docs(AGENTS): mark init_db and feature flags debts as resolved

3708d78

docs(AGENTS): mark tree-sitter compile cost as resolved

8320299

docs(AGENTS): update Tantivy+SQLite debt status

ca433fc

chore: merge origin/main into fix/project-health-cleanup (AGENTS.md c…

8983c93

…onflict resolved)

docs(plan): add v0.15.0 development directions research

4f8e473

docs: update AGENTS.md progress + add devbase-embedding README

0db7c8b

release: v0.15.0 changelog

4e8d882

style: cargo fmt — fix rustfmt check failures on v0.15.0 branch

d4aa3e7

juice094 merged commit 0a6758d into main May 11, 2026
6 checks passed

juice094 merged 29 commits into main from fix/project-health-cleanup
