feat(code-index): add Semble as a local on-the-fly embedding provider#399
feat(code-index): add Semble as a local on-the-fly embedding provider#399navedmerchant wants to merge 11 commits into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughIntegrates Semble as a local CLI-backed code index provider: adds types/schemas, SembleCLI wrapper, downloader with checksum and extraction, SembleProvider implementation, CodeIndexManager/service-factory routing, UI/platform detection and i18n, embedding model profile, and extensive Vitest tests. ChangesSemble local indexing support
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Not ready for review yet, the code is being reviewed and evolved |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (4)
src/services/code-index/interfaces/manager.ts (1)
78-87: ⚡ Quick winConsider consolidating the duplicated EmbedderProvider type definition.
EmbedderProvideris defined both here and inpackages/types/src/embedding.ts. Different parts of the codebase import from different locations (e.g.,config-manager.tsimports from this file, whileservice-factory.tsimports from@roo-code/types). While currently in sync, maintaining two sources of truth creates risk of drift and inconsistency.♻️ Suggested consolidation approach
Consider removing the local type definition and importing from the shared package instead:
+import { EmbedderProvider } from "`@roo-code/types`" import { VectorStoreSearchResult } from "./vector-store" import * as vscode from "vscode" /** * Interface for the code index manager */ export interface ICodeIndexManager { // ... rest of interface -export type IndexingState = "Standby" | "Indexing" | "Indexed" | "Error" | "Stopping" -export type EmbedderProvider = - | "openai" - | "ollama" - | "openai-compatible" - | "gemini" - | "mistral" - | "vercel-ai-gateway" - | "bedrock" - | "openrouter" - | "semble"This ensures a single source of truth and prevents accidental divergence.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/services/code-index/interfaces/manager.ts` around lines 78 - 87, Duplicate EmbedderProvider type exists locally and in the shared package; remove the local type definition (EmbedderProvider) from this file and replace usages/imports (e.g., in config-manager.ts) to import the type from the canonical package (`@roo-code/types` or packages/types/src/embedding.ts), update any export or re-export as needed so service-factory.ts and other consumers use the single source of truth, run TypeScript build/typecheck to ensure no unresolved imports remain, and delete the now-unused local declaration to avoid drift.webview-ui/src/components/chat/CodeIndexPopover.tsx (1)
774-776: ⚡ Quick winConsider adding test coverage for semble provider UI.
Per coding guidelines for
webview-ui/src/**/*.{ts,tsx}, prefer local webview-ui tests for React/webview behavior. Consider adding tests underwebview-ui/src/**/__tests__to cover:
- Selecting semble from the provider dropdown
- Qdrant URL/API key fields hidden when semble is selected
- Validation passes with only
codebaseIndexEnabledfor semble- Semble path input field renders and updates state correctly
- Installation instructions panel renders
As per coding guidelines: "Prefer local
webview-uitests for React/webview behavior such as component rendering, local state, hooks, form dirty-state, validation, or prop wiring."Also applies to: 1446-1482, 1484-1537
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@webview-ui/src/components/chat/CodeIndexPopover.tsx` around lines 774 - 776, Add local webview-ui tests for the CodeIndexPopover component to cover the semble provider UI: create tests under webview-ui/src/**/__tests__ that render CodeIndexPopover, select the SelectItem with value "semble", assert that Qdrant URL/API key fields are hidden, confirm validation passes when only codebaseIndexEnabled is set for semble, verify the semble path input field appears and updates component state on change, and assert the installation instructions panel is rendered; use the existing test utilities (render, fireEvent/userEvent, assertions) consistent with other webview-ui tests to locate and interact with the provider dropdown and semble-specific inputs.webview-ui/src/i18n/locales/en/settings.json (1)
204-204: 💤 Low valueConsider adding a Windows path example.
The
semblePathDescriptionprovides a helpful Unix path example ("/usr/local/bin/semble"), but users on Windows might benefit from seeing a Windows path example as well, such as"C:\\Python310\\Scripts\\semble.exe".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@webview-ui/src/i18n/locales/en/settings.json` at line 204, The translation entry semblePathDescription currently shows only a Unix example; update its text to include a Windows path example (e.g., "C:\\Python310\\Scripts\\semble.exe") so Windows users have guidance—locate the semblePathDescription key in the locales/en/settings.json and add a short Windows example alongside the existing Unix example, keeping escaping consistent for backslashes and quotes.src/services/code-index/semble/semble-cli.ts (1)
226-256: ⚡ Quick winConsider adding runtime validation for CLI output structure.
Lines 242 and 247 use type assertions (
as SembleSearchResult[]) without runtime validation. If the semble CLI output format changes or contains unexpected data, this could cause runtime errors downstream when code expects the SembleSearchResult shape.Consider adding a simple validation helper or at least checking for required fields before casting.
🛡️ Example validation approach
// Handle successful response: {query, results: [{chunk, score}]} if (parsed.results && Array.isArray(parsed.results)) { - return parsed.results as SembleSearchResult[] + // Basic validation: check that results have expected shape + return parsed.results.filter((r: any) => + r?.chunk?.file_path && + r?.chunk?.content && + typeof r?.score === 'number' + ) as SembleSearchResult[] } // Fallback: if it's a flat array (older format) if (Array.isArray(parsed)) { - return parsed as SembleSearchResult[] + return parsed.filter((r: any) => + r?.chunk?.file_path && + r?.chunk?.content && + typeof r?.score === 'number' + ) as SembleSearchResult[] }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/services/code-index/semble/semble-cli.ts` around lines 226 - 256, The _parseOutput method casts parsed.results and parsed arrays to SembleSearchResult[] without runtime checks; add a lightweight validator (e.g., validateSembleResult) and use it to filter/transform parsed.results and parsed before returning to ensure each item has required properties (e.g., chunk is string, score is number, and any other SembleSearchResult fields); replace the direct casts in _parseOutput with Array.isArray checks plus items.filter(validateSembleResult) (or map-to-safe-shape) so only well-formed SembleSearchResult objects are returned and malformed entries are dropped or logged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/services/code-index/semble/semble-cli.ts`:
- Around line 150-157: The spawn call in _spawn uses an unsupported maxBuffer
option and silences the TS error with an as any cast; remove the maxBuffer
property from the options passed to spawn(this.semblePath, args, { shell: false,
timeout: options.timeout, stdio: [...] }) and drop the as any cast, and if you
need to limit output implement manual buffering and size checks inside
child.stdout?.on("data", ...) and child.stderr?.on("data", ...) (e.g.,
accumulate into stdout/stderr strings, enforce a MAX_BUFFER threshold, kill the
child and reject if exceeded) while keeping the rest of _spawn (child handling,
resolve/reject) intact.
---
Nitpick comments:
In `@src/services/code-index/interfaces/manager.ts`:
- Around line 78-87: Duplicate EmbedderProvider type exists locally and in the
shared package; remove the local type definition (EmbedderProvider) from this
file and replace usages/imports (e.g., in config-manager.ts) to import the type
from the canonical package (`@roo-code/types` or packages/types/src/embedding.ts),
update any export or re-export as needed so service-factory.ts and other
consumers use the single source of truth, run TypeScript build/typecheck to
ensure no unresolved imports remain, and delete the now-unused local declaration
to avoid drift.
In `@src/services/code-index/semble/semble-cli.ts`:
- Around line 226-256: The _parseOutput method casts parsed.results and parsed
arrays to SembleSearchResult[] without runtime checks; add a lightweight
validator (e.g., validateSembleResult) and use it to filter/transform
parsed.results and parsed before returning to ensure each item has required
properties (e.g., chunk is string, score is number, and any other
SembleSearchResult fields); replace the direct casts in _parseOutput with
Array.isArray checks plus items.filter(validateSembleResult) (or
map-to-safe-shape) so only well-formed SembleSearchResult objects are returned
and malformed entries are dropped or logged.
In `@webview-ui/src/components/chat/CodeIndexPopover.tsx`:
- Around line 774-776: Add local webview-ui tests for the CodeIndexPopover
component to cover the semble provider UI: create tests under
webview-ui/src/**/__tests__ that render CodeIndexPopover, select the SelectItem
with value "semble", assert that Qdrant URL/API key fields are hidden, confirm
validation passes when only codebaseIndexEnabled is set for semble, verify the
semble path input field appears and updates component state on change, and
assert the installation instructions panel is rendered; use the existing test
utilities (render, fireEvent/userEvent, assertions) consistent with other
webview-ui tests to locate and interact with the provider dropdown and
semble-specific inputs.
In `@webview-ui/src/i18n/locales/en/settings.json`:
- Line 204: The translation entry semblePathDescription currently shows only a
Unix example; update its text to include a Windows path example (e.g.,
"C:\\Python310\\Scripts\\semble.exe") so Windows users have guidance—locate the
semblePathDescription key in the locales/en/settings.json and add a short
Windows example alongside the existing Unix example, keeping escaping consistent
for backslashes and quotes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 332e2e21-894e-4a2d-9e0f-464ce90a3d05
📒 Files selected for processing (19)
packages/types/src/codebase-index.tspackages/types/src/embedding.tspackages/types/src/vscode-extension-host.tssrc/core/webview/ClineProvider.tssrc/core/webview/webviewMessageHandler.tssrc/services/code-index/config-manager.tssrc/services/code-index/interfaces/config.tssrc/services/code-index/interfaces/manager.tssrc/services/code-index/manager.tssrc/services/code-index/semble/__tests__/provider.spec.tssrc/services/code-index/semble/__tests__/semble-cli.spec.tssrc/services/code-index/semble/index.tssrc/services/code-index/semble/provider.tssrc/services/code-index/semble/semble-cli.tssrc/services/code-index/semble/types.tssrc/services/code-index/service-factory.tssrc/shared/embeddingModels.tswebview-ui/src/components/chat/CodeIndexPopover.tsxwebview-ui/src/i18n/locales/en/settings.json
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
|
So users have to install python, hope to their diety that they did it right and pip works, and then install semble like that. So that still "a lot" of local fiddling. It's still a fragile solution. The only thing this removes, is the requirement for a text embedding model, and a Qdrant instance. It doesn't remove fragility. It's not specifically in #200, but iyam it would be in the spirit of the requested feature to make code indexing a simple checkbox. Just tick it, and done. No local installs of anything. Ticking the checkbox does all the black magic that needs to happen. |
I had the same vision when trying to implement this (and even had a version implemented) but decided against it for two reasons
In my testing this is a pretty simple implementation which makes it easier for users to enable codebase indexing. It works quite well and adds value without adding too much complexity inside zoo. Setting up semble with pip is a lot simpler than setting up qdrant and gives pretty good indexing performance But you have a valid point, we should make codebase indexing as seamless as possible, and it should be “built in” we will revisit this in the future. |
|
It would be cool if we could add a companion PR to the read me in the semble repo instructions to install zoo code https://github.com/MinishLab/semble |
|
@thany I took your advice and now Zoo Code will automatically download a prebuilt semble from https://github.com/navedmerchant/sembleexec for the current platform, making it seamless! all you need to do is tick the box and it should "just work" @taltas that would not be necessary anymore, it should work out of the box for all users. |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/services/code-index/semble/provider.ts (1)
137-145:⚠️ Potential issue | 🟠 Major | ⚡ Quick winSearch should initialize Semble on first use.
searchIndex()currently returns[]untilinitialize()/startIndexing()has already run. That contradicts the on-the-fly contract and can make the first codebase search silently fail. Initialize here and only bail if startup leaves the provider in"Error".♻️ Minimal fix
async searchIndex(query: string, directoryPrefix?: string): Promise<VectorStoreSearchResult[]> { if (!this._isInitialized) { - console.warn("[SembleProvider] searchIndex called before initialization") - return [] + await this.initialize() } - if (this._state === "Error") { + if (!this._isInitialized || this._state === "Error") { return [] }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/services/code-index/semble/provider.ts` around lines 137 - 145, searchIndex currently bails out when !_isInitialized causing first-use searches to silently fail; modify searchIndex in SembleProvider to call and await this.initialize() (or this.startIndexing() if that is the intended boot path) when !_isInitialized, then proceed with the search flow and only return [] if after initialization this._state === "Error". Locate the searchIndex method and replace the early return on !_isInitialized with an awaited initialization attempt, preserve the existing check for this._state === "Error" after initialization, and ensure any initialization errors are handled the same way the rest of the provider handles startup failures.
🧹 Nitpick comments (2)
webview-ui/src/components/chat/CodeIndexPopover.tsx (2)
179-183: ⚡ Quick winAdd local webview-ui tests for Semble-specific behavior.
This change adds provider-gated rendering and provider-specific validation rules; please add/update Vitest coverage in
webview-ui/src/**/__tests__for: supported vs unsupported platform option visibility, Semble validation path, and Qdrant fields hidden for Semble.As per coding guidelines, "webview-ui/src/**/*.{ts,tsx}: Prefer local
webview-uitests for React/webview behavior ... Add or update Vitest coverage underwebview-ui/src/**/__tests__instead of reaching forapps/vscode-e2e."Also applies to: 206-209, 774-778, 1449-1450
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@webview-ui/src/components/chat/CodeIndexPopover.tsx` around lines 179 - 183, The PR added Semble-specific UI and validation in the CodeIndexPopover component but lacks local Vitest coverage; add/update tests under webview-ui/src/**/__tests__ that mount CodeIndexPopover (or the relevant exported component) and assert: (1) provider-gated rendering shows supported platform options and hides unsupported ones, (2) when provider is "semble" the form uses the Semble validation path (e.g., emits/validates only codebaseIndexEnabled) and rejects missing required fields for other providers, and (3) Qdrant-related fields/inputs are not rendered when provider === "semble" (cover the other provider case where Qdrant fields are visible). Ensure tests use React testing library utilities already used in the repo and target the component names/functions from CodeIndexPopover to keep tests local to webview-ui.
206-209: ⚡ Quick winAvoid duplicating Semble platform support matrices in UI.
SEMBLE_SUPPORTED_PLATFORMSis hardcoded here, which can drift from the backend support source and hide/show the provider incorrectly over time. Prefer consuming a single source of truth from extension state (or a shared constant) instead of duplicating the list in the component.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@webview-ui/src/components/chat/CodeIndexPopover.tsx` around lines 206 - 209, The hardcoded SEMBLE_SUPPORTED_PLATFORMS array in CodeIndexPopover.tsx (and the derived isSembleSupported check) duplicates backend/extension state and may drift; replace the literal array with a single source of truth by importing the supported-platforms constant or reading the Semble support flag from the extension state/feature-flags exposed to the UI (e.g., use the shared exported constant or a selector from the extension state used elsewhere), then compute isSembleSupported using that imported value along with platform and arch so the UI always reflects the canonical support matrix.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/services/code-index/semble/__tests__/semble-downloader.spec.ts`:
- Around line 27-41: The test currently reuses shared EventEmitter instances
(mockRequest and mockResponse) across test cases which can leak listeners;
update the https.get mock implementation used in the test (the vi.mock block) so
it creates new EventEmitter instances inside the get callback each invocation
(i.e., instantiate mockRequest and mockResponse within the vi.fn for "get") or
alternatively add a beforeEach that calls mockRequest.removeAllListeners() and
mockResponse.removeAllListeners() and reinitializes any spy functions; target
the mock definitions referenced as mockRequest, mockResponse and the mocked
"https".get to ensure fresh emitters per test.
In `@src/services/code-index/semble/semble-downloader.ts`:
- Around line 226-234: The Windows branch that spawns PowerShell embeds
archivePath and destDir inside single-quoted literals for Expand-Archive (in the
spawn call where process.platform === "win32"), which breaks when paths contain
apostrophes; to fix, escape any single quotes in archivePath and destDir
(replace ' with '' per PowerShell rules) before building the command string or
avoid single-quoted literals by using & { } with properly escaped arguments, and
update the spawn invocation that calls "powershell" with the corrected,
safely-escaped path variables so Expand-Archive receives valid paths.
---
Outside diff comments:
In `@src/services/code-index/semble/provider.ts`:
- Around line 137-145: searchIndex currently bails out when !_isInitialized
causing first-use searches to silently fail; modify searchIndex in
SembleProvider to call and await this.initialize() (or this.startIndexing() if
that is the intended boot path) when !_isInitialized, then proceed with the
search flow and only return [] if after initialization this._state === "Error".
Locate the searchIndex method and replace the early return on !_isInitialized
with an awaited initialization attempt, preserve the existing check for
this._state === "Error" after initialization, and ensure any initialization
errors are handled the same way the rest of the provider handles startup
failures.
---
Nitpick comments:
In `@webview-ui/src/components/chat/CodeIndexPopover.tsx`:
- Around line 179-183: The PR added Semble-specific UI and validation in the
CodeIndexPopover component but lacks local Vitest coverage; add/update tests
under webview-ui/src/**/__tests__ that mount CodeIndexPopover (or the relevant
exported component) and assert: (1) provider-gated rendering shows supported
platform options and hides unsupported ones, (2) when provider is "semble" the
form uses the Semble validation path (e.g., emits/validates only
codebaseIndexEnabled) and rejects missing required fields for other providers,
and (3) Qdrant-related fields/inputs are not rendered when provider === "semble"
(cover the other provider case where Qdrant fields are visible). Ensure tests
use React testing library utilities already used in the repo and target the
component names/functions from CodeIndexPopover to keep tests local to
webview-ui.
- Around line 206-209: The hardcoded SEMBLE_SUPPORTED_PLATFORMS array in
CodeIndexPopover.tsx (and the derived isSembleSupported check) duplicates
backend/extension state and may drift; replace the literal array with a single
source of truth by importing the supported-platforms constant or reading the
Semble support flag from the extension state/feature-flags exposed to the UI
(e.g., use the shared exported constant or a selector from the extension state
used elsewhere), then compute isSembleSupported using that imported value along
with platform and arch so the UI always reflects the canonical support matrix.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 22c81bb0-3664-464b-a867-3c12e1e8caf7
📒 Files selected for processing (34)
packages/types/src/codebase-index.tspackages/types/src/vscode-extension-host.tssrc/core/webview/ClineProvider.tssrc/services/code-index/__tests__/config-manager.spec.tssrc/services/code-index/__tests__/service-factory.spec.tssrc/services/code-index/config-manager.tssrc/services/code-index/manager.tssrc/services/code-index/semble/__tests__/provider.spec.tssrc/services/code-index/semble/__tests__/semble-cli.spec.tssrc/services/code-index/semble/__tests__/semble-downloader.spec.tssrc/services/code-index/semble/index.tssrc/services/code-index/semble/provider.tssrc/services/code-index/semble/semble-cli.tssrc/services/code-index/semble/semble-downloader.tssrc/services/code-index/semble/types.tswebview-ui/src/components/chat/CodeIndexPopover.tsxwebview-ui/src/i18n/locales/ca/settings.jsonwebview-ui/src/i18n/locales/de/settings.jsonwebview-ui/src/i18n/locales/en/settings.jsonwebview-ui/src/i18n/locales/es/settings.jsonwebview-ui/src/i18n/locales/fr/settings.jsonwebview-ui/src/i18n/locales/hi/settings.jsonwebview-ui/src/i18n/locales/id/settings.jsonwebview-ui/src/i18n/locales/it/settings.jsonwebview-ui/src/i18n/locales/ja/settings.jsonwebview-ui/src/i18n/locales/ko/settings.jsonwebview-ui/src/i18n/locales/nl/settings.jsonwebview-ui/src/i18n/locales/pl/settings.jsonwebview-ui/src/i18n/locales/pt-BR/settings.jsonwebview-ui/src/i18n/locales/ru/settings.jsonwebview-ui/src/i18n/locales/tr/settings.jsonwebview-ui/src/i18n/locales/vi/settings.jsonwebview-ui/src/i18n/locales/zh-CN/settings.jsonwebview-ui/src/i18n/locales/zh-TW/settings.json
💤 Files with no reviewable changes (2)
- packages/types/src/codebase-index.ts
- src/services/code-index/config-manager.ts
✅ Files skipped from review due to trivial changes (9)
- webview-ui/src/i18n/locales/pl/settings.json
- webview-ui/src/i18n/locales/en/settings.json
- webview-ui/src/i18n/locales/nl/settings.json
- webview-ui/src/i18n/locales/ja/settings.json
- webview-ui/src/i18n/locales/it/settings.json
- webview-ui/src/i18n/locales/de/settings.json
- webview-ui/src/i18n/locales/tr/settings.json
- webview-ui/src/i18n/locales/zh-TW/settings.json
- webview-ui/src/i18n/locales/id/settings.json
🚧 Files skipped from review as they are similar to previous changes (2)
- src/services/code-index/semble/index.ts
- src/services/code-index/manager.ts
|
Verified this PR work on MacOS arm, windows x64 and linux x64. In my testing it makes zoo find relevant context and answer questions about 20-30percent faster |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/services/code-index/semble/semble-downloader.ts`:
- Around line 167-171: The code currently skips checksum verification when
SEMBLE_SHA256 lacks an entry for the current platform, allowing extraction to
proceed without integrity checks; update the block that computes
platformKey/expectedChecksum so that if expectedChecksum is missing (falsy) you
immediately abort with a hard error (throw an Error or call process.exit(1)) and
include identifying info (platformKey and SEMBLE_VERSION) in the message; keep
the subsequent await verifyChecksum(archivePath, expectedChecksum) unchanged so
verified paths still run as before.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 549c7ede-f2e7-4f8b-ab0e-e9d0ebdb1efd
📒 Files selected for processing (2)
src/services/code-index/semble/__tests__/semble-downloader.spec.tssrc/services/code-index/semble/semble-downloader.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- src/services/code-index/semble/tests/semble-downloader.spec.ts
|
This PR will set the precedent for allowing the extension to download, configure, install, and run third party executables. I know we haven’t set any governance principles but my preference would be for users to input a PATH location for where to find the CLI if they want to configure semble. |
I guess this isn’t automatic, the user still needs to interact with zoo code to get it installed. I’ll review a bit more. |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/services/code-index/semble/semble-downloader.ts (1)
149-157:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftUpgrade deletes old binary before verifying new one.
If the download or extraction fails after deleting the old installation, the user loses their working Semble binary. Consider extracting to a temporary location first, then atomically replacing the old installation only after the new one is verified.
🔧 Suggested approach
// Version mismatch — remove old installation before downloading new one if (installedVersion && installedVersion !== SEMBLE_VERSION) { console.log(`[SembleDownloader] Version changed from ${installedVersion} to ${SEMBLE_VERSION}, updating...`) - try { - await fs.rm(extractDir, { recursive: true, force: true }) - } catch { - // ignore cleanup errors - } + // Keep old installation until new one is verified - see extractDir usage below }Then modify the extraction flow to use a staging directory (e.g.,
semble-new) and only rename tosembleafter successful verification, removing the old directory at that point.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/services/code-index/semble/semble-downloader.ts` around lines 149 - 157, The current upgrade flow deletes the existing installation (using installedVersion, SEMBLE_VERSION and fs.rm on extractDir) before downloading/extracting the new binary; change this to extract into a staging directory (e.g., extractDir + "-new" or "semble-new"), run verification there, and only after successful validation atomically replace the old install (rename/move staging to extractDir) and then remove the old directory; update the logic around the version check in the code that uses installedVersion, SEMBLE_VERSION and extractDir so that fs.rm is only used to remove the old installation after the new staged install is verified (or to cleanup failed staging), and ensure error paths leave the original extractDir intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@src/services/code-index/semble/semble-downloader.ts`:
- Around line 149-157: The current upgrade flow deletes the existing
installation (using installedVersion, SEMBLE_VERSION and fs.rm on extractDir)
before downloading/extracting the new binary; change this to extract into a
staging directory (e.g., extractDir + "-new" or "semble-new"), run verification
there, and only after successful validation atomically replace the old install
(rename/move staging to extractDir) and then remove the old directory; update
the logic around the version check in the code that uses installedVersion,
SEMBLE_VERSION and extractDir so that fs.rm is only used to remove the old
installation after the new staged install is verified (or to cleanup failed
staging), and ensure error paths leave the original extractDir intact.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 1a4fad23-6a88-432c-a30e-65a760f90d61
📒 Files selected for processing (1)
src/services/code-index/semble/semble-downloader.ts
Related GitHub Issue
Closes: #200
Description
Semble is a local code search provider that requires no API keys or external services. When a user selects "Semble - Local" as their code index provider, the extension handles everything automatically:
Searches are performed on-the-fly — semble indexes as it searches, so there's no separate indexing step or background process.
Supported platforms:
linux-x64,linux-arm64,darwin-arm64,win32-x64. The semble option only appears in the UI on these platforms.Architecture:
semble-downloader.ts— Downloads archive from GitHub releases, extracts via systemtar/powershell, caches inglobalStorageUri/semble/semble-cli.ts— Thin wrapper that spawns the binary with array args (no shell) for safe argument passingprovider.ts— Orchestrates download → verify → search. Implements the samesearchIndexinterface as other code index providersThe binary is downloaded once and reused across sessions. No user configuration needed.
Test Procedure
semble-downloader.spec.ts— Platform detection, download/skip logic, redirect handling, extraction, cleanup on failuresemble-cli.spec.ts— CLI spawn behavior, argument construction, JSON output parsingprovider.spec.ts— Download flow, platform check, error states, search result conversionpnpm --filter zoo-code exec vitest run services/code-index/semble/Pre-Submission Checklist
Screenshots / Videos
Documentation Updates
Additional Notes
darwin-x64) is not supported as the tree sitter dependency does not support it.Get in Touch
navedmerchant
Summary by CodeRabbit
New Features
Improvements
Tests