Turn raw images into clean, model-ready annotations — powered by local AI, reviewed by you.
ImageTagger is a desktop tool for building and curating image caption and tag datasets. It pairs a fast visual browser with an AI-assisted fixup workflow, so you can generate, validate, and correct annotations at scale without ever leaving your machine.
Built with PyQt6. Runs on Windows, Linux, and macOS.
Main window — browse your dataset, review AI-generated tags and descriptions, filter by annotation status, and launch batch operations:
Merge dialog — a side-by-side diff view for reviewing and accepting AI-proposed fixes before they land in your files:
- Dual-purpose annotations. Tags and short captions for diffusion training (LoRA, SDXL, Flux). High-density VLM captions with chain-of-thought reasoning for vision language model training — ready to drop into an Unsloth fine-tuning dataset.
- Local and private. Works with Ollama or any OpenAI-compatible API. When the server runs on your own machine, your images never leave it.
- Batch-first. Select any number of images and run Generate, Validate, or AI Find in one shot.
- Human in the loop. The fixup workflow shows exactly what the AI wants to change. You accept, reject, or edit each suggestion before it is written.
- Smart filtering. An expression-based filter lets you find exactly what needs attention: `fixup & "bird"`, `!validated & resolution > 5`, `untagged | 'blurry'`.
- Annotation status at a glance. Each image in the list carries status badges, so you always know what still needs work: ⚖️ fixup pending (from Validate), ✨ vision/refine data ready (from Generate with Refine), 🔍 matched by AI Find, and ✅ validated (from Validate or user merge). Hover a ✅ image to see who validated it and when: model name or "user", plus the date.
- Customizable prompts. Edit every workflow prompt (Tags, Description, Validation, AI Search, Vision, Refine) directly in the app. Each tab has an agent role field ("You are ..."), a Test button to see the full rendered prompt and model response against any image, and Apply / Save / Reset controls.
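
To make the filter syntax above concrete, here is a minimal sketch of how such expressions could be parsed and evaluated. It is not ImageTagger's actual implementation: the `ImageInfo` record, the `compile_filter` function, the precedence (`!` binds tighter than `&`, which binds tighter than `|`), and the reading of `resolution` as megapixels are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ImageInfo:
    """Hypothetical record; ImageTagger's real data model is not shown here."""
    tags: List[str] = field(default_factory=list)
    description: str = ""
    fixup: bool = False
    validated: bool = False
    megapixels: float = 0.0  # assumed unit for `resolution`


Predicate = Callable[[ImageInfo], bool]

TOKEN = re.compile(
    r"""\s*(?:(?P<tag>"[^"]*")|(?P<text>'[^']*')|(?P<num>\d+(?:\.\d+)?)"""
    r"""|(?P<word>[A-Za-z_]+)|(?P<op>[!&|()<>]))"""
)


def tokenize(expr: str) -> List[Tuple[str, str]]:
    tokens, pos, expr = [], 0, expr.strip()
    while pos < len(expr):
        m = TOKEN.match(expr, pos)
        if m is None:
            raise ValueError(f"cannot parse filter at: {expr[pos:]!r}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens


def _both(a: Predicate, b: Predicate) -> Predicate:
    return lambda img: a(img) and b(img)


def _either(a: Predicate, b: Predicate) -> Predicate:
    return lambda img: a(img) or b(img)


def _negate(a: Predicate) -> Predicate:
    return lambda img: not a(img)


def compile_filter(expr: str) -> Predicate:
    """Recursive-descent parser that turns an expression into a predicate."""
    tokens = tokenize(expr)
    pos = 0

    def peek() -> str:
        return tokens[pos][1] if pos < len(tokens) else ""

    def take() -> Tuple[str, str]:
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of filter")
        pos += 1
        return tokens[pos - 1]

    def parse_or() -> Predicate:
        left = parse_and()
        while peek() in ("|", "OR"):
            take()
            left = _either(left, parse_and())
        return left

    def parse_and() -> Predicate:
        left = parse_not()
        while peek() in ("&", "AND"):
            take()
            left = _both(left, parse_not())
        return left

    def parse_not() -> Predicate:
        if peek() in ("!", "NOT"):
            take()
            return _negate(parse_not())
        return parse_atom()

    def parse_atom() -> Predicate:
        kind, value = take()
        if value == "(":
            inner = parse_or()
            if take()[1] != ")":
                raise ValueError("expected ')'")
            return inner
        if kind == "tag":   # "bird": exact tag match (assumed semantics)
            return lambda img: value[1:-1] in img.tags
        if kind == "text":  # 'blurry': substring of the description (assumed)
            return lambda img: value[1:-1].lower() in img.description.lower()
        if value == "resolution":
            op, number = take()[1], float(take()[1])
            if op not in ("<", ">"):
                raise ValueError("expected '<' or '>' after resolution")
            if op == "<":
                return lambda img: img.megapixels < number
            return lambda img: img.megapixels > number
        flags = {
            "fixup": lambda img: img.fixup,
            "validated": lambda img: img.validated,
            "untagged": lambda img: not img.tags,
        }
        if value in flags:
            return flags[value]
        raise ValueError(f"unknown filter term: {value!r}")

    result = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token: {peek()!r}")
    return result
```

With this sketch, `compile_filter('fixup & "bird"')` returns a function that can be applied to each `ImageInfo` in a list to decide whether it stays visible.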
| Feature | Details |
|---|---|
| Generate | Batch-generate tags, descriptions, and/or VLM captions for selected images using a vision LLM |
| Vision / Refine | Generate high-density captions with chain-of-thought reasoning for VLM training — stored separately from diffusion tags, exportable to Unsloth datasets |
| Validate | Run the AI over existing annotations and write fixup data into each image's sidecar for anything that needs correction |
| AI Find | Search your dataset by concept — the AI scans selected images and marks matches |
| Fixup / Merge | Visual diff dialog for reviewing AI-proposed changes before accepting them |
| Expression filter | `fixup`, `untagged`, `resolution <` / `resolution >`, `"tag"`, `'text'`, NOT / AND / OR, parentheses |
| Prompt editor | Per-tab prompt editing with Apply, Save, Reset, and Test buttons |
| Regenerate in-dialog | Switch model or prompt mid-review and regenerate fresh candidates without leaving the merge dialog |
| External editors | Open With context menu auto-detects installed image editors |
| Auto thread mode | Adaptive parallelism for Ollama — ramps up threads as GPU headroom allows |
| Cross-platform | Windows, Linux, macOS — one-step install and run scripts included |
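
As an illustration of the Fixup / Merge idea in the table above, the following sketch computes what a proposed fix would add or remove and applies only the suggestions a reviewer accepts. The function names (`tag_changes`, `description_diff`, `apply_accepted`) are hypothetical and are not taken from the ImageTagger codebase; the real dialog may compute and present its diff differently.

```python
import difflib
from typing import Dict, List, Set


def tag_changes(current: List[str], proposed: List[str]) -> Dict[str, List[str]]:
    """Split tags into added / removed / kept, preserving the original order."""
    cur, new = set(current), set(proposed)
    return {
        "added": [t for t in proposed if t not in cur],
        "removed": [t for t in current if t not in new],
        "kept": [t for t in current if t in new],
    }


def description_diff(current: str, proposed: str) -> List[str]:
    """Word-level diff; lines start with '  ' (same), '- ' (old) or '+ ' (new)."""
    diff = difflib.ndiff(current.split(), proposed.split())
    return [line for line in diff if not line.startswith("?")]


def apply_accepted(
    current: List[str], changes: Dict[str, List[str]], accepted: Set[str]
) -> List[str]:
    """Apply only the accepted removals and additions; reject everything else."""
    removed = set(changes["removed"]) & accepted
    result = [t for t in current if t not in removed]
    result += [t for t in changes["added"] if t in accepted]
    return result
```

Keeping the diff computation separate from the apply step mirrors the human-in-the-loop flow: nothing is written until the reviewer has chosen which suggestions to accept.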
Prerequisite: Python 3.9 or newer on your PATH.
Windows:

    install.bat   # first time only
    run.bat

Linux / macOS:

    chmod +x install.sh run.sh update.sh
    ./install.sh   # first time only
    ./run.sh

To update dependencies later: update.bat or ./update.sh.
- File → Open Folder — point ImageTagger at a directory of images.
- Connect to an LLM server:
  - Ollama on http://localhost:11434 (default), or
  - any OpenAI-compatible server (e.g. vLLM) on a custom port.
- Fetch models and select one. Recommended: Qwen3-VL-8B for the best performance/quality balance.
- Select one or more images in the list.
- Hit Generate to produce tags and descriptions, then Validate to catch issues.
- Open Fixup to review and merge AI suggestions image by image.
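
Before connecting from the app, it can help to confirm that the server is reachable and see which models it offers. The sketch below queries Ollama's `GET /api/tags` endpoint or, for OpenAI-compatible servers, `GET /v1/models`. The helper names `parse_model_names` and `fetch_models` are made up for this example, and this is not necessarily how ImageTagger itself fetches models.

```python
import json
from typing import List
from urllib.request import urlopen


def parse_model_names(payload: dict) -> List[str]:
    """Extract model names from either response shape."""
    if "models" in payload:
        # Ollama: {"models": [{"name": ...}, ...]}
        return [m["name"] for m in payload["models"]]
    # OpenAI-compatible: {"object": "list", "data": [{"id": ...}, ...]}
    return [m["id"] for m in payload.get("data", [])]


def fetch_models(
    base_url: str = "http://localhost:11434",
    openai_compatible: bool = False,
    timeout: float = 5.0,
) -> List[str]:
    """Return the model names the server reports; raises URLError if unreachable."""
    path = "/v1/models" if openai_compatible else "/api/tags"
    with urlopen(base_url.rstrip("/") + path, timeout=timeout) as resp:
        return parse_model_names(json.load(resp))
```

Calling `fetch_models()` with no arguments targets the default Ollama address; for a vLLM server, pass its base URL and `openai_compatible=True`.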
Supported formats: jpg, jpeg, png, bmp, gif, webp.
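
For scripting around a dataset folder, the supported extensions above can be used to collect image files. This is only a sketch: `find_images` is a hypothetical helper, and whether ImageTagger matches extensions case-insensitively or scans subfolders is an assumption (this version ignores case and does not recurse).

```python
from pathlib import Path
from typing import List

# Extensions taken from the supported formats list.
SUPPORTED = {".jpg", ".jpeg", ".png", ".bmp", ".gif", ".webp"}


def find_images(folder: str) -> List[Path]:
    """Return supported image files directly inside `folder`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED  # assumed: case-insensitive
    )
```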
- Usage guide — workflows, filter syntax, keyboard shortcuts, and more
- Keyboard shortcuts — full cross-platform shortcut reference
- Docs index
This project is heavily inspired by TagGUI, which deserves full credit for the core UI layout direction and practical workflow ideas.
For transparency, this codebase was 100% AI-generated with GitHub Copilot.

