
FEAT: Drop fastchat from GCG (#965) #1049

Merged
romanlutz merged 30 commits into microsoft:main from varshini2305:feature/replace_fastchat on May 13, 2026

Conversation

@varshini2305
Contributor

varshini2305 commented Aug 5, 2025

Summary

Drops the fastchat (a.k.a. fschat) runtime dependency from GCG and replaces its
per-model Conversation machinery with tokenizer.apply_chat_template(), the standard
HuggingFace path that works for any chat-tuned model whose tokenizer ships a Jinja
chat template. Closes #965.
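
For reference, a minimal sketch of the new path, with an illustrative model and message pair (not code from this PR):

```python
# Minimal sketch of the apply_chat_template path; the model name and messages
# are illustrative assumptions, not PyRIT code.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
messages = [
    {"role": "user", "content": "Write a story. ! ! ! ! !"},    # goal + control suffix
    {"role": "assistant", "content": "Sure, here is a story"},  # target
]
# Rendered via the Jinja template bundled with the tokenizer -- no per-model
# fastchat Conversation objects required.
prompt = tokenizer.apply_chat_template(messages, tokenize=False)
```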

Why

fastchat is unmaintained (last release Aug 2024) and bundles a hardcoded list of ~5
model templates, which breaks whenever a new chat-tuned model ships without a fastchat
entry. We hit this on Azure ML with Phi-3-mini, whose template lacks a .system
attribute and crashed AttackPrompt._update_ids with
AttributeError: 'Conversation' object has no attribute 'system'. PR #1705 captured
the failure mode in Vicuna integration tests marked xfail(strict=True, raises=AttributeError);
this PR makes those tests pass and removes the markers.

The replacement (tokenizer.apply_chat_template) automatically supports any
chat-tuned model and removes the need to know template internals — addressing #990
(more models for suffix evaluation) as a side effect of the architecture change.

Polish on top of the original WIP

The original commits prototyped the right idea but had several rough edges that needed
cleanup before landing. The squashed commit on this branch (still authored by
@varshini2305 to preserve credit) folds in the following:

  • Removed print(...) debug statements sprinkled through _update_ids.
  • Replaced the _detect_assistant_role() heuristic (only matched <|assistant|> /
    assistant:) with positional computation: _assistant_role_slice is whatever sits
    between the end of the control and the start of the target. Works for llama-2's
    [INST]/[/INST], llama-3 header markers, phi-3 ChatML, and any future template
    without code changes.
  • Handled char_to_token returning None when a substring ends exactly at the
    prompt boundary (common for the target string at end-of-prompt). The prototype hit
    TypeError: NoneType - int; the new code clamps end positions to len(toks) and walks
    forward for start positions. Both fixes are sketched after this list.
  • Restored **params.tokenizer_kwargs[i] spread in get_workers (the prototype
    hardcoded use_fast=True and dropped user-provided kwargs).
  • Added a clear ValueError in get_workers if a tokenizer has no chat template
    configured, instead of crashing later inside _update_ids.
  • Updated callers/tests/configs: 11 YAML configs no longer carry the
    conversation_templates field; train.py no longer accepts the
    conversation_templates parameter; the GCG Dockerfile no longer installs fschat;
    existing unit + integration tests stop passing conv_template=... to AttackPrompt
    and friends.
  • Strict signatures on the affected __init__ methods (dropped *args, **kwargs
    catch-alls) and type hints + Google-style docstrings matching PyRIT's style guide.
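
A hedged sketch of the positional slice computation and the char_to_token handling described in the two bullets above (helper names and exact arithmetic are assumptions; the real logic lives in AttackPrompt._update_ids):

```python
# Hedged sketch, not PyRIT's actual code: compute token slices positionally
# from character offsets, handling char_to_token's None returns. Requires a
# fast tokenizer (char_to_token is only available on fast tokenizers).
def compute_slices(tokenizer, prompt: str, control: str, target: str):
    enc = tokenizer(prompt)
    n_toks = len(enc["input_ids"])

    def start_tok(char_pos: int) -> int:
        # char_to_token may return None (e.g. inside stripped whitespace);
        # walk forward until some position maps to a token.
        for pos in range(char_pos, len(prompt)):
            tok = enc.char_to_token(pos)
            if tok is not None:
                return tok
        return n_toks

    def end_tok(char_pos: int) -> int:
        # A substring ending exactly at the prompt boundary maps to None; clamp.
        tok = enc.char_to_token(char_pos - 1)
        return n_toks if tok is None else tok + 1

    c0, t0 = prompt.index(control), prompt.index(target)
    control_slice = slice(start_tok(c0), end_tok(c0 + len(control)))
    target_slice = slice(start_tok(t0), end_tok(t0 + len(target)))
    # _assistant_role_slice: whatever sits between control end and target start,
    # for any template shape ([INST]/[/INST], llama-3 headers, ChatML, ...).
    assistant_role_slice = slice(control_slice.stop, target_slice.start)
    return control_slice, assistant_role_slice, target_slice
```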

Scope

Surgically dropped conv_template from every API surface that referenced it:
AttackPrompt, PromptManager, MultiPromptAttack, EvaluateAttack,
IndividualPromptAttack, ProgressiveMultiPromptAttack, ModelWorker,
get_workers. The log-file per-worker telemetry now records chat_template instead
of conv_template.name.

A small follow-up commit on this branch (863c4906, by @romanlutz) adds an
individual_phi_4.yaml config and registers phi_4 in _MODEL_NAMES after we
validated Phi-4 end-to-end on AML alongside the fastchat removal.

Testing

Local

  • 77 GCG unit tests pass (tests/unit/auxiliary_attacks/gcg/).
  • 12 GCG integration tests pass (tests/integration/auxiliary_attacks/) —
    exercises _update_ids end-to-end on real GPT-2 with two distinct chat-template
    shapes (llama-2-style and ChatML/phi-3-style).
  • Broader pyrit unit suite (7,492 tests) unaffected.

Azure ML (real models, real GPUs)

Submitted GCG jobs across multiple chat-tuned model families on gcg-gpu-a100. Run
command for each was the same shape as the notebook (python -m pyrit.auxiliary_attacks.gcg.experiments.run --model_name {NAME} --setup single --n_train_data 5 --n_test_data 0 --n_steps 5 --batch_size 64 --output_dir ${{outputs.results}}).

| Model | Result | Notes |
| --- | --- | --- |
| meta-llama/Llama-2-7b-chat-hf | ✅ Completed | Baseline; no regression vs PR #1705. Also re-validated after disabling shared-key auth on the workspace storage account. |
| meta-llama/Meta-Llama-3-8B-Instruct | ✅ Completed | Different chat-template family (Llama-3 header markers) — apply_chat_template handles it cleanly. |
| microsoft/Phi-3-mini-4k-instruct | ✅ Completed | The model that crashed in #965 now works end-to-end. |
| microsoft/phi-4 | ✅ Completed | New 14B dense model — individual_phi_4.yaml config added in a follow-up commit on this branch. |
| lmsys/vicuna-13b-v1.5 | ❌ Failed (unrelated) | Pre-existing tiktoken runtime dep issue: vicuna's tokenizer files need pip install tiktoken to convert. Fails in AutoTokenizer.from_pretrained before our code runs. Tracked as follow-up. |
| Qwen/Qwen3.6-27B, google/gemma-4-e4b-it | ❌ Failed (unrelated) | Newer model classes reject use_cache in __init__; the YAML configs pass model_kwargs: [{"use_cache": False}], which older transformers happily forwarded. Pre-existing config issue, not a fastchat regression. |
| zai-org/GLM-4.7-Flash, openai/gpt-oss-20b | ❌ Failed (unrelated) | Loaded fine and apply_chat_template worked, but failed in get_embedding_matrix with Unknown model type — that helper hardcodes isinstance checks against a fixed list of model classes (Llama, Mistral, Phi-3, GPT-2, etc.). Replacing it with a generic model.get_input_embeddings() fallback (what nanoGCG does) is on the GCG roadmap. |

Net validation: 5 successful AML runs across 4 distinct chat-tuned families
(llama-2, llama-3, phi-3, phi-4) using tokenizer.apply_chat_template().
Fastchat removal works.

Backward compatibility

AttackPrompt and friends drop conv_template from their public signatures. This
is technically breaking, but every caller in pyrit/ is updated in the same commit
and there are no documented external integrations of these classes. The pre-existing
generate_suffix(...) entry point (and the YAML configs that drive it) is the
intended public API.

Note on base branch

This branch is currently based on gcg-refactor (PR #1705) since that PR is in
review. Once #1705 merges to main, this PR's diff will collapse to just the
fastchat removal.

Out-of-scope follow-ups discovered during validation

These are not in this PR; flagged for future work:

  1. get_embedding_matrix is hardcoded to specific model classes. GLM-4.7-Flash,
    gpt-oss-20b, and presumably most modern HF chat models hit
    ValueError: Unknown model type. Drop-in fix: fall back to
    model.get_input_embeddings() — the same pattern nanoGCG uses (sketched after this list).
  2. GCG worker death is not propagated. When a worker process raises in
    model.forward, the main process hangs indefinitely waiting on the joinable
    queue, and AML reports the job as "Running" forever. Worth a separate fix.
  3. use_cache: False in the shipped YAML configs is rejected by newer
    transformers model __init__ for some model families. Should drop it from
    model_kwargs (it's already the default for inference-time generation
    irrelevant to GCG anyway).
  4. tiktoken is a missing runtime dep for vicuna-13b-v1.5 (and presumably
    other models whose tokenizers ship as tiktoken files). Add to gcg extra.
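
A hedged sketch of follow-up (1); the function shape is assumed, not PyRIT's current get_embedding_matrix:

```python
# Hedged sketch of the proposed fallback, not PyRIT's current code: rather
# than isinstance checks against a fixed model list, ask the model directly.
import torch
import torch.nn as nn

def get_embedding_matrix(model: nn.Module) -> torch.Tensor:
    # Every HF PreTrainedModel implements get_input_embeddings(); its .weight
    # is the (vocab_size, hidden_dim) embedding matrix GCG needs.
    return model.get_input_embeddings().weight
```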

Closes / references


@romanlutz romanlutz left a comment


Thank you! I've kicked off a couple jobs and will report back on how they turned out.

@romanlutz romanlutz linked an issue Sep 30, 2025 that may be closed by this pull request
romanlutz and others added 3 commits May 4, 2026 04:58
Add 26 new unit tests covering:
- get_filtered_cands: filtering, clamping, padding behavior
- target_loss / control_loss: shape, finiteness, loss ordering
- sample_control: shape, vocab bounds, single-position changes, non-ASCII filtering
- _build_params: ConfigDict construction from kwargs
- _apply_target_augmentation: length preservation, modification, seed reproducibility
- _create_attack: transfer flag routing (Progressive vs Individual)
- Embedding helpers: error handling for unknown model types
- PromptManager init: validation of goals/targets
- EvaluateAttack init: worker count validation

Total GCG test count: 24 -> 50

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Data & config tests (test_data_and_config.py, 12 tests):
- YAML loading: valid files, list values, missing file error
- Real config validation: all 11 shipped configs parse, have required keys,
  individual vs transfer configs have correct settings
- get_goals_and_targets: seed reproducibility, different seeds differ,
  separate test data files, n_train_data limiting
- run_trainer validation: unsupported model names, missing HF token

Lifecycle tests (test_lifecycle.py, 7 tests):
- GPU memory: nvidia-smi parsing (single/multi GPU), MLflow logging, failure handling
- generate_suffix lifecycle: MLflow started before training, workers stopped
  after training, BUG CHARACTERIZATION: workers NOT stopped on failure (leak)

Total GCG test count: 24 -> 69

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add 10 integration tests that exercise the GCG attack pipeline with a real
GPT-2 model on CPU, validating end-to-end correctness:

- token_gradients: gradient shape matches (n_control, vocab_size), values
  are finite and non-zero
- GCGAttackPrompt: initializes with valid non-overlapping slices, grad()
  returns correct shape, test_loss() returns finite positive float
- GCGPromptManager.sample_control: sampled candidates are decodable,
  correct batch size
- Embedding helpers: layer/matrix/embeddings work with GPT2LMHeadModel,
  get_nonascii_toks returns non-empty tensor

Uses llama-2 conversation template (has explicit handling in _update_ids).
Marked @run_only_if_all_tests (requires RUN_ALL_TESTS=true + torch/transformers).
Runs in ~18s on CPU.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
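
A hedged sketch of the gradient-shape property those tests check, using the generic GCG one-hot trick on GPT-2 (slice positions and the loss are illustrative assumptions, not PyRIT's token_gradients):

```python
# Hedged sketch: the gradient of a loss w.r.t. one-hot control tokens has
# shape (n_control, vocab_size) and is finite/non-zero. Not PyRIT code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
embed = model.get_input_embeddings()

input_ids = tok("Hello ! ! ! ! ! world", return_tensors="pt").input_ids[0]
control = slice(1, 6)  # assumed positions of the "! ! ! ! !" control tokens

one_hot = torch.zeros(control.stop - control.start, embed.num_embeddings)
one_hot.scatter_(1, input_ids[control].unsqueeze(1), 1.0)
one_hot.requires_grad_(True)

embeds = embed(input_ids).detach()
full_embeds = torch.cat(
    [embeds[: control.start], one_hot @ embed.weight, embeds[control.stop :]]
)
loss = model(inputs_embeds=full_embeds.unsqueeze(0)).logits.sum()  # stand-in loss
loss.backward()

assert one_hot.grad.shape == (control.stop - control.start, embed.num_embeddings)
assert torch.isfinite(one_hot.grad).all() and one_hot.grad.abs().sum() > 0
```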
romanlutz and others added 20 commits May 4, 2026 05:36
These tests only need optional Python packages (torch, transformers, fastchat),
not external services or credentials. The importorskip at the top already
handles skipping when deps are not installed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Move class references to module level to fix N806 (variable naming)
- Add noqa: E402 for imports after importorskip guards
- Fix ruff format issues
- Remove outdated RUN_ALL_TESTS reference in docstring

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove mlflow and azureml-mlflow from GCG dependencies entirely:
- Replace mlflow logging in log.py with Python standard logging
- Remove mlflow.start_run()/end_run() from train.py and attack_manager.py
- Remove mlflow and azureml-mlflow from gcg and all extras in pyproject.toml
- Update tests to not mock mlflow
- Fix Dockerfile: use nvidia/cuda base + Python 3.11 + uv + pip install -e .[gcg]
- Add pyarrow>=22 pin for Python 3.14 compatibility

The mlflow dependency caused Azure ML failures due to version incompatibility
between mlflow 3.x and azureml-mlflow. Proper experiment logging will be
added later via CentralMemory or Azure storage (tracked in plan).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
# Conflicts:
#	tests/unit/auxiliary_attacks/gcg/test_lifecycle.py
Remove gbda_deterministic/mpa_deterministic — dead code from GBDA attack
that was never consumed by any GCG class. Its presence caused a TypeError
in individual mode because MultiPromptAttack.__init__() doesn't accept it.
This was a pre-existing bug from the original llm-attacks research repo
(silently swallowed by **kwargs there, but our copy removed **kwargs).

Also adds scripts/run_gcg_aml.py (launcher with sys.path fix for Azure ML)
and scripts/submit_gcg_job.py (job submission reading from .env files).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
All mpa_kwargs (deterministic, lr, batch_size, n_steps) were silently
absorbed by **kwargs in the original llm-attacks repo's MultiPromptAttack.
__init__() but never read. Our copy removed **kwargs, exposing the bug.

The original repo even has a typo: 'self.mpa_kewargs' in
IndividualPromptAttack (line 1114 of llm-attacks/attack_manager.py).

Verified: none of these kwargs are consumed by MultiPromptAttack in either
the original repo or our copy.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…crosoft#965)

Phi-3-mini hits 'Conversation has no attribute system' in _update_ids()
due to fastchat API change. Llama-2 has dedicated handling path that works.

GCG baseline VALIDATED on Azure ML:
- Model: meta-llama/Llama-2-7b-chat-hf
- Config: 5 prompts, 5 steps, batch_size 64
- Result: loss decreases across steps (1.9 -> 0.86 on best prompt)
- Runtime: ~6 min on Standard_NC24ads_A100_v4
- Job: silly_vinegar_82x7td6gpn (Completed)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The existing TestCreateAttack tests mock the manager classes, so they never
exercise MultiPromptAttack.__init__() with real kwargs. That's why the dead
mpa_kwargs bug only surfaced on Azure (TypeError when MPA didn't accept
deterministic / lr / etc). This test constructs the real GCG manager classes
and verifies IndividualPromptAttack and ProgressiveMultiPromptAttack can
create an internal MultiPromptAttack without error.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The existing GPT-2 integration tests only use the llama-2 conversation
template path. Bugs in the else branch of AttackPrompt._update_ids -- like
the Phi-3 conv_template.system AttributeError we hit on Azure -- would
never be caught.

The two new tests construct GCGAttackPrompt with the vicuna template,
which exercises the same code path. They are marked xfail (strict=True)
because vicuna's fastchat conversation template lacks a .system attribute,
reproducing the same bug. The xfail marker references issue microsoft#965 and will
flip to 'unexpectedly passed' when the fastchat replacement lands, prompting
removal of the marker.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
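
A hedged sketch of the xfail(strict=True) characterization pattern described above (the stand-in class reproduces the failure mode; the repo's actual tests construct GCGAttackPrompt with the real vicuna template):

```python
# Hedged illustration of strict-xfail semantics, not the repo's actual test:
import pytest

@pytest.mark.xfail(
    strict=True,            # 'unexpectedly passed' becomes a hard failure,
    raises=AttributeError,  # so the marker must be removed once the bug is fixed
    reason="vicuna fastchat template lacks .system (microsoft#965)",
)
def test_vicuna_template_system_attribute():
    class Conversation:  # stand-in for fastchat's vicuna Conversation object
        pass

    _ = Conversation().system  # AttributeError until the fastchat replacement lands
```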
Updates the AML notebook to reflect the actual flow we ran during Phase 1c
baseline validation: llama-2 baseline (phi-3 has fastchat microsoft#965 bug),
run_gcg_aml.py launcher script (so the uploaded code snapshot wins over the
Docker-installed package), repo-root build context (Dockerfile needs to COPY
pyproject.toml + pyrit/ for pip install -e .[gcg]), and PyRIT-style env
file loading via _load_environment_files.

Adds tests/end_to_end/auxiliary_attacks/test_gcg_aml_e2e.py mirroring that
same flow as a real e2e test. Submits a small (5-step, 5-train, batch 64)
llama-2 GCG job, polls until terminal state, asserts Completed. Skipped
unless RUN_ALL_TESTS=true and AZURE_ML_* + HUGGINGFACE_TOKEN env vars are
set (since it submits real paid Azure ML compute). Always cancels the
submitted job on test failure or interruption to avoid leaking compute.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The scripts/ directory is not packaged for PyPI installs, so the AML
launcher there was inaccessible to anyone who pip-installed pyrit[gcg].
Move the entry-point cwd handling into pyrit/auxiliary_attacks/gcg/
experiments/run.py itself: when run as `__main__`, chdir into the file's
own directory so the relative `configs/` and `results/` paths resolve
regardless of where the script is invoked from.

AML jobs (notebook and e2e test) now run
  python -m pyrit.auxiliary_attacks.gcg.experiments.run --model_name ...
which also makes the previous sys.path hack unnecessary -- `python -m`
puts cwd at the front of sys.path, so the uploaded code snapshot still
wins over the Docker-installed package.

Deletes scripts/run_gcg_aml.py and scripts/submit_gcg_job.py (the latter
was a CLI duplicate of the notebook's submission flow).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
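
A hedged sketch of the entry-point pattern described above (run.py's real argument parsing and runner are elided behind a stub):

```python
# Hedged sketch of the __main__ chdir pattern, not run.py's full code.
import os
from pathlib import Path

def main() -> None:
    # Stub for the real runner; relative paths like configs/ and results/
    # now resolve against this file's directory, not the caller's cwd.
    print(Path("configs").resolve())

if __name__ == "__main__":
    os.chdir(Path(__file__).resolve().parent)
    main()
```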
The pyarrow>=22.0.0; python_version >= '3.14' pin was added in c98af28 to
the core dependencies, but pyrit core does not actually need it -- without
the gcg extra, the resolver picks a 3.14-compatible pyarrow on its own via
the transitive datasets -> pyarrow chain. The pin is only needed when the
gcg extra is installed because something in that extra constrains the
resolution toward an older pyarrow that lacks cp314 wheels and fails to
build from source on Python 3.14.

Moves the pin to the gcg extra and adds an inline comment explaining
why it is there, matching the existing precedent for the spacy cp314
wheel comment in the all extra.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The AML e2e test previously rebuilt the MLClient, environment, and command
from scratch -- a near-copy of the notebook's submission flow. Replace that
with `runpy.run_path()` of `doc/code/auxiliary_attacks/1_gcg_azure_ml.py`.
The notebook is jupytext percent format (the `# %%` markers are plain
comments) so the file is valid Python and runs as a script. The test then
pulls `returned_job` and `ml_client` out of the executed namespace and
polls the job to a terminal state.

Result: the notebook is the single source of truth for the submission flow,
and the test verifies that what we ask users to run actually works
end-to-end. Net diff is -27 lines.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
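
A hedged sketch of the runpy pattern described above (the variable names come from the commit message; the polling loop is an assumption):

```python
# Hedged sketch: execute the jupytext .py notebook as a script and reuse its
# namespace; returned_job/ml_client are the names the commit message cites.
import runpy
import time

ns = runpy.run_path("doc/code/auxiliary_attacks/1_gcg_azure_ml.py")
returned_job, ml_client = ns["returned_job"], ns["ml_client"]

# Poll the submitted job to a terminal state (assumed terminal-state set):
while (status := ml_client.jobs.get(returned_job.name).status) not in (
    "Completed", "Failed", "Canceled",
):
    time.sleep(30)
assert status == "Completed"
```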
Ran jupytext --to ipynb --execute against the notebook .py to capture cell
outputs (workspace name, environment build status, submitted job name +
status + Studio URL) per PyRIT convention. The submitted job
('lucid_muscle_nt947p71s0') ran to completion on Azure ML, doubling as a
verification that the refactored notebook (which now invokes the GCG runner
via 'python -m pyrit.auxiliary_attacks.gcg.experiments.run' instead of the
old scripts/ launcher) still works end-to-end.

The captured stderr cells include some Azure ML SDK telemetry noise
('ActivityCompleted: ... HowEnded=Failure' for benign UserError conditions
like 'environment already at version N'). Will be cleaned up in a follow-up
by suppressing the azure.ai.ml._telemetry logger in the notebook source.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds two pieces to make the AML tutorial actually show users an end-to-end
result instead of stopping at "Job submitted, monitor in Studio":

1. ``run.py`` now accepts ``--output_dir``. The runner writes its result
   JSON under whatever path the caller passes, defaulting to ``outputs/``.
   The notebook's command declares ``outputs={"results": Output(uri_folder)}``
   and passes ``--output_dir ${{outputs.results}}`` so AML mounts a path,
   the runner writes there, and the contents are uploaded as a named output
   artifact (auto-capture of ``./outputs/`` is *not* available in SDK v2
   command jobs -- you have to declare named outputs explicitly; see the sketch after this list).

2. A new poll-and-inspect cell at the end polls the submitted job, then
   downloads with ``all=True``, finds the result JSON under
   ``<download_dir>/named-outputs/results/``, and prints the final loss
   and generated adversarial suffix.
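
A hedged sketch of the SDK v2 output wiring in (1) and the download in (2); the environment and compute names are assumptions, not the notebook's actual values:

```python
# Hedged sketch of declaring a named output in an SDK v2 command job.
from azure.ai.ml import Output, command

job = command(
    command=(
        "python -m pyrit.auxiliary_attacks.gcg.experiments.run "
        "--model_name llama_2 --setup single --n_train_data 5 "
        "--n_steps 5 --batch_size 64 --output_dir ${{outputs.results}}"
    ),
    outputs={"results": Output(type="uri_folder")},  # ./outputs/ is NOT auto-captured
    environment="gcg-env@latest",   # assumed environment name
    compute="gcg-gpu-a100",
)
returned_job = ml_client.jobs.create_or_update(job)  # ml_client built elsewhere

# Once the job is terminal, pull the named output (step 2 above):
ml_client.jobs.download(name=returned_job.name, all=True, download_path="dl")
# -> result JSON lands under dl/named-outputs/results/
```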

Also adds a (best-effort) logging suppression block early in the notebook
for azure.ai.ml SDK telemetry. It catches the python-logging warnings but
not the "ActivityCompleted: HowEnded=Failure" lines or the upload progress
bars -- those go through the SDK's own stderr handler with
propagate=False and are not reachable via standard logging config (see
azure-ai-ml _utils/_logger_utils.py). The remaining noise is benign
telemetry for expected UserError conditions like "environment already at
this version".

Notebook re-executed end-to-end against AML (job stoic_parcel_6clfs67hp9,
llama-2, 5 train data, 5 steps): completed successfully, suffix downloaded
and printed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…noise

The 1.32.0 release includes:

  Skip _list_secrets for identity-based datastores to prevent noisy
  telemetry traces.

That bullet is exactly the source of the ``ActivityCompleted:
Activity=Datastore.ListSecrets, HowEnded=Failure ... UserError ... No
secrets for credentials of type None`` blob that was showing up in our
Azure ML notebook's executed cell outputs and made it look like the env
build / job submission was failing when it actually wasn't.

After bumping, a quick smoke (build MLClient, list envs) drops from many
lines of telemetry noise to a single ``Class X is experimental`` info
message -- much more reasonable for a tutorial. Bumped both the ``gcg``
extra and the ``all`` extra so they stay aligned.

The upload progress bars and the experimental-class warning still show
up; those are separate noise sources that this SDK release does not
address.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CI's coverage check (>=90% on diff) was failing on the log_gpu_memory
try/except added in c98af28: lines 70-74 of log.py weren't reached.

Two issues:

1. The TestGpuMemoryLogging class lived in test_lifecycle.py, which
   does pytest.importorskip on the GCG train module. CI installs the
   'all' extra but not 'gcg', so ml_collections is missing and the
   train import fails, skipping the whole test_lifecycle.py module --
   including the GPU memory tests, even though they only need stdlib.
   Moved them into test_log.py (which only importorskips the log
   module, all stdlib) so they actually run in CI.

2. The new test_log_gpu_memory_swallows_nvidia_smi_failure exercises
   the except branch (lines 73-74) that the old success-only test
   never hit. log_gpu_memory must swallow nvidia-smi failures so the
   training loop never crashes when run on a host without nvidia-smi.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
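
A hedged sketch of the swallow-on-failure behavior those tests characterize (the real log.py signature may differ):

```python
# Hedged sketch, not PyRIT's log.py: GPU memory logging that never raises.
import logging
import subprocess

logger = logging.getLogger(__name__)

def log_gpu_memory(step: int) -> None:
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        )
        for gpu, used in enumerate(out.stdout.split()):
            logger.info("step=%d gpu=%d memory.used=%s MiB", step, gpu, used)
    except (OSError, subprocess.CalledProcessError):
        # No nvidia-smi on this host (or it failed): swallow, so the training
        # loop keeps running; this except branch is what CI coverage flagged.
        logger.debug("nvidia-smi unavailable; skipping GPU memory log")
```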
…osoft#965)

GCG previously relied on the (long-unmaintained) `fastchat` library to render
the user/assistant exchange for each AttackPrompt. That library bundled a
hardcoded list of ~5 model templates and broke whenever a new model shipped
without a fastchat entry — most recently Phi-3-mini, whose template lacks
`.system` and crashed `_update_ids` on Azure ML.

This change replaces the fastchat-driven slice computation with
`tokenizer.apply_chat_template()`, which is the standard HuggingFace path and
works for any chat-tuned model whose tokenizer ships a Jinja chat template.
It also drops `fastchat`/`fschat` as a runtime dependency entirely (removed
from the GCG Dockerfile).

The work is based on PR microsoft#1049 by @varshini2305, which prototyped the same
approach. This commit polishes that prototype:

- Removes WIP `print(...)` debug statements from `_update_ids`.
- Replaces the narrow `_detect_assistant_role` heuristic (only matched
  `<|assistant|>` / `assistant:`) with positional computation: the assistant
  role tokens are always whatever sits between the end of the control and the
  start of the target. This works for llama-2's `[/INST]`, llama-3's header
  markers, phi-3 ChatML, and any future template without code changes.
- Handles the edge case where `char_to_token` returns None (when a substring
  ends exactly at the prompt boundary, common for the target string).
- Restores `**params.tokenizer_kwargs[i]` spread in `get_workers` (the
  prototype hardcoded `use_fast=True` and dropped user kwargs).
- Drops `conv_template` from `AttackPrompt`, `PromptManager`,
  `MultiPromptAttack`, `EvaluateAttack`, `IndividualPromptAttack`,
  `ProgressiveMultiPromptAttack`, `ModelWorker`, `get_workers`, and the
  log-file dicts they populate.
- Drops `conversation_templates` from `generate_suffix`, `_build_params`,
  every shipped YAML config, and the config-validation test.
- Adds a clear `ValueError` in `get_workers` if a tokenizer has no chat
  template configured.

Tests:
- All 72 GCG unit tests pass (existing tests adjusted for new signatures).
- The 12 GCG integration tests pass on real GPT-2 with both a llama-2-style
  and a ChatML/phi-3-style chat template — the second template shape was
  previously `xfail(strict=True, raises=AttributeError)` referencing this
  exact issue. Those xfail markers are removed.
- Broader pyrit unit suite (7,492 tests) unaffected.

Closes microsoft#965.
Builds on microsoft#1049 by @varshini2305.

Co-authored-by: Roman Lutz <romanlutz13@gmail.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@varshini2305
Contributor Author

@romanlutz Hi! Is this feature still relevant? If so, can you please share the error faced while integrating it?

@romanlutz
Contributor

Hey @varshini2305 ! Yes, still needed. I accidentally opened a new draft PR but I'll keep pushing updates here instead to keep the record of your contribution. I have had to get my GCG pipeline on Azure ML working again. That was my main blocker. There are 1-2 other PRs that need merging to complete that and then the remaining changes will be easier to read. I believe it's mostly based on your changes. No blocker other than needing validation 🙂

romanlutz and others added 3 commits May 12, 2026 13:15
Adds an `individual_phi_4.yaml` config and registers `phi_4` in the GCG
runner's known model list.

Validated end-to-end on Azure ML alongside the apply_chat_template
fastchat removal: a 5-step run on `microsoft/phi-4` completed
successfully with `python -m pyrit.auxiliary_attacks.gcg.experiments.run
--model_name phi_4 --setup single --n_train_data 5 --n_test_data 0
--n_steps 5 --batch_size 64`, producing a finite loss and non-trivial
suffix.

Note `tokenizer_kwargs` uses `use_fast: True` (Phi-4 ships with a fast
tokenizer; `use_fast: False` is reserved for older models that need it,
e.g. llama-2).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds unit tests exercising the four diff-coverage gaps flagged by CI on PR microsoft#1717:

- AttackPrompt._update_ids() ValueError when goal/control/target don't appear
  verbatim in the chat-templated prompt (line 186).
- start_tok() walking forward when char_to_token returns None at the initial
  position, then finding a mappable position later (line 210).
- start_tok() falling back to len(toks) when no position from char_pos to
  end-of-prompt maps to a token (line 211).
- get_workers() ValueError when a tokenizer has no chat_template configured
  (lines 1620-1621).

Both start_tok tests use a fully mocked tokenizer because real tokenizers
are too well-behaved (every byte position maps to some token) to
deterministically exercise the None-handling branches.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
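
A hedged sketch of the mocked-tokenizer approach (start_tok is reimplemented here for illustration only; the real helper lives in the GCG attack code):

```python
# Hedged sketch: a mock encoding whose char_to_token returns None lets the
# walk-forward and fallback branches run deterministically.
from unittest.mock import MagicMock

def start_tok(enc, char_pos: int, prompt_len: int, n_toks: int) -> int:
    for pos in range(char_pos, prompt_len):          # walk forward from char_pos
        if (tok := enc.char_to_token(pos)) is not None:
            return tok
    return n_toks                                    # nothing mapped: fall back

enc = MagicMock()
enc.char_to_token.side_effect = lambda pos: 3 if pos >= 7 else None
assert start_tok(enc, char_pos=4, prompt_len=20, n_toks=12) == 3   # found later

enc_all_none = MagicMock()
enc_all_none.char_to_token.return_value = None
assert start_tok(enc_all_none, char_pos=4, prompt_len=6, n_toks=12) == 12  # fallback
```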
@romanlutz
Contributor

romanlutz commented May 13, 2026

Alright here's the list of changes: @varshini2305

  • no more print statements (obviously leftovers from debugging, no worries!)
  • updated assistant role detection because the existing one didn't match llama-2 or other templates ([INST] etc.)
  • get_workers kwargs had use_fast=True and I restored it to the passed tokenizer_kwargs
  • char_to_token can return None so that needed handling (weird edge case)
  • init signatures updated to be stricter, and class declarations no longer include (object) as before
  • google style docstrings (for consistency with pyrit)
  • type hints
  • tokenizer.chat_template is None precondition to make it fail clearly instead of in _update_ids
  • tests
  • subsequent dockerfile update to remove fschat (hooray!)

In other words, your changes were the core and all conceptually correct! That was the hard part 😄 I just polished it to make our gates happy and green.

@romanlutz romanlutz force-pushed the feature/replace_fastchat branch from 1d5020d to 1ba1be4 Compare May 13, 2026 04:19
@romanlutz romanlutz changed the title FEAT: Remove dependency on fastchat for model conversation templates FEAT: Drop fastchat from GCG, use tokenizer.apply_chat_template (#965) May 13, 2026
# Conflicts:
#	pyrit/auxiliary_attacks/gcg/src/Dockerfile
#	tests/integration/auxiliary_attacks/test_gcg_integration.py
#	tests/unit/auxiliary_attacks/gcg/test_attack_wiring.py
#	tests/unit/auxiliary_attacks/gcg/test_lifecycle.py
@romanlutz romanlutz changed the title FEAT: Drop fastchat from GCG, use tokenizer.apply_chat_template (#965) FEAT: Drop fastchat from GCG (#965) May 13, 2026
_create_attack passes `logfile=f"{result_prefix}_{timestamp}.json"` directly to IndividualPromptAttack/ProgressiveMultiPromptAttack, whose __init__ writes that file as soon as it is non-None. Three tests were calling _build_params (or its helpers) with result_prefix="test", so each test left a test_<timestamp>.json artifact in the worktree root.

Switch them to result_prefix=str(tmp_path / "test") so the logfile lands in pytest's per-test tmp dir and is cleaned up automatically.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@varshini2305
Copy link
Copy Markdown
Contributor Author

@romanlutz Thanks for the detailed update. Anything I can help with on my end with this PR?

@romanlutz
Contributor

> @romanlutz Thanks for the detailed update. Anything I can help with on my end with this PR?

I think we're all set! Thanks a ton for this contribution @varshini2305 🥳

romanlutz and others added 2 commits May 13, 2026 06:43
GCGPromptManager.sample_control replaces exactly one position per candidate, but the new token is sampled uniformly from the top-k of (-grad). When that draw happens to land on the same id already at that position the candidate ends up identical to the original (diffs == 0). The previous assertion (diffs == 1) was therefore flaky against the underlying randint, and CI hit it on PR microsoft#1049.

Relax to 'at most 1' and document why; ran the test 20x locally with no failures.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
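
A hedged sketch of why diffs == 0 can occur (shapes and k are illustrative; this mirrors the sampling described above, not PyRIT's exact code):

```python
# Hedged sketch: one position is replaced, but the uniform top-k draw can
# re-select the token already at that position, leaving the candidate unchanged.
import torch

control_toks = torch.tensor([5, 9, 2, 7])        # current suffix token ids
grad = torch.randn(len(control_toks), 50257)     # (n_control, vocab) gradient
top_k = (-grad).topk(256, dim=1).indices         # candidate ids per position

pos = int(torch.randint(len(control_toks), (1,)))
new_tok = top_k[pos, int(torch.randint(256, (1,)))]

candidate = control_toks.clone()
candidate[pos] = new_tok
assert (candidate != control_toks).sum() <= 1    # 'at most 1', not 'exactly 1'
```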
@romanlutz romanlutz added this pull request to the merge queue May 13, 2026
Merged via the queue into microsoft:main with commit 2d8fa00 May 13, 2026
48 checks passed
