Skip to content

Metric-driven mesh adaptation: follow_metric, OT movers, and mesh.OT_adapt()#209

Merged
lmoresi merged 46 commits into
developmentfrom
feature/anisotropic-metric-mover
May 25, 2026
Merged

Metric-driven mesh adaptation: follow_metric, OT movers, and mesh.OT_adapt()#209
lmoresi merged 46 commits into
developmentfrom
feature/anisotropic-metric-mover

Conversation

@lmoresi
Copy link
Copy Markdown
Member

@lmoresi lmoresi commented May 25, 2026

Summary

Lands the metric-driven mesh-adaptation feature, culminating in the public
mesh.OT_adapt() method. The branch is 45 commits ahead of
development; development does not yet have the mover/metric
infrastructure (_winslow_equidistribute, metric_density_from_gradient,
follow_metric, _edge_pairs), so OT_adapt cannot be split out — it is the
capstone of this feature and ships with the infrastructure it relies on.

The headline tool — mesh.OT_adapt()

moved = mesh.OT_adapt(
    field, *, refinement=3.0, coarsening="auto",
    grad_smoothing_length="auto", metric_choice="front-following",
    fields_to_remap=None, fields_to_zero=None,
    skip_threshold=None, reference_coords=None, verbose=False) -> bool

Promotes the validated reset-to-uniform optimal-transport adaptation
pattern into a library method: reset to a cached uniform reference,
FE-remap the driving field, build a gradient metric, run the OT mover,
FE-remap fields, zero a cold-restart set. Plus
mesh.OT_adapt_reset_reference() for the reference cache.

Settled defaults

  • refinement=3.0 — primary feature knob.
  • grad_smoothing_length="auto" — screened-Poisson de-noising at the
    mesh's uniform cell size (parallel-safe mean edge length); the lever that
    keeps R=3 sliver-free. None disables it; a number or Pint length
    sets it explicitly (unit-aware, non-dimensionalised via the projection).
  • metric_choice="front-following" — mild grading; not the sliver-prone
    "gradient-uniform".

Boundary slip is unified onto the mesh's projected normals
(mesh.Gamma_P1), replacing the hardcoded box/ring cases: radial
coordinate systems slide tangentially and snap back to the surface;
Cartesian boundaries pin (the raw facet normal is degenerate at vertices).
Constrained-manifold meshes (dim != cdim) raise NotImplementedError.

Validation

  • tests/test_0760_mesh_ot_adapt.py locks the API (Annulus + Box move and
    preserve the field; boundary behaviour; reference cache; skip threshold;
    auto-smoothing default; unit-aware length; manifold guard). The existing
    follow_metric suite stays green.
  • Behavioural (not bit-for-bit — the slip algorithm changed): mode-1
    Ra=1e7 delta-eta=1e2 over 200 steps gives 40/40 productive adapts; the
    reset is idempotent (verified — no sliver accumulation). At R=3 a small
    sliver population remains on the sharp solver field that the gentler
    R~1.4 reference avoids; auto-smoothing is the mitigation, refinement is
    the caller's lever.

Design doc: docs/developer/design/ot-adapt-api-proposal.md.

Not in this PR

A parallel-safe get_mean_radius() / harness refresh is in progress in a
separate working tree and is intentionally excluded here.

Underworld development team with AI support from Claude Code

lmoresi added 30 commits May 17, 2026 17:20
smooth_mesh_interior(metric=, method='spring'|'ma', boundary_slip=)
— two topology-preserving node-redistribution solvers toward a
target size/density field:

- volumetric elastic spring (default): equal edge springs (shape,
  equant cells, no slivers) + per-cell area constraint A0 proportional
  to 1/rho_tgt (size); PCG truss-energy minimisation with an
  inversion-rejecting line search. ~0.3s, robust, scales with
  amplitude.
- Benamou-Froese-Oberman Monge-Ampere: convex-branch det(I+D2phi)=g
  damped Picard on a variationally-recovered Hessian, pure-Neumann
  + the SNES_Scalar.constant_nullspace hook. Robust isotropic
  refinement, ~60x costlier.

Optional tangential boundary slip (per-ring radius projection;
nodes provably stay on the surface). Honest grading ~1.3-1.8x at
fixed node count (extreme OT grading needs more nodes / a topology
change). Docs: docs/developer/subsystems/mesh-metric-redistribution.md.

Underworld development team with AI support from Claude Code
Validation and benchmark scripts for the metric smoother and the
upcoming Monge-Ampere efficiency / convergence-rate work: honest
per-node grading metric with mesh plots, exact radial
equidistribution ground truth, setup sanity probe, cost /
convergence / preconditioner benchmarks, boundary-slip A/B,
interior-feature refinement test.

Underworld development team with AI support from Claude Code
The Monge-Ampere path's cost was dominated by re-running a full
GMRES+GAMG setup on every Picard iteration even though the mesh
(hence the phi-Poisson Laplacian and the SPD Hessian-recovery mass
matrix) is constant within the loop -- only the RHS changes. The
constant_nullspace near-nullspace re-attach forced the GAMG
re-setup, and newtonls added a redundant 2nd iterate + line search.

_use_direct_solver puts the cached phi / Hessian / grad-phi
sub-solvers on ksponly + MUMPS LU with snes_lag_jacobian=-2 and
snes_lag_preconditioner=-2 (factor once, reuse: inner iters become
back-substitutions), plus mat_mumps_icntl_24=1 for the singular
pure-Neumann phi. n_picard default 40->25 (deep/near grading is
flat from iter ~20).

A direct solve is exact (tighter than the GMRES rtol) so the Picard
fixed point -- hence grading/quality -- is bit-for-bit unchanged:
d/n 1.02/1.43/1.71/1.54 (AMP 0/2/8/20), AMP=0 exact no-op, no
tangle. Cost: cold 12-18s->1-2s, warm <=34s->1-2.5s; the warm>>cold
GAMG-resetup degradation is eliminated.

Serial MUMPS is the expedient that proved the diagnosis; the
portable lever is the factor/setup-once-reuse pattern (port to GAMG
for parallel) -- see the doc Implementation note + Open items.
Validator: scripts/ma_cost_grading.py; profiler: ma_profile_phases.py.

Underworld development team with AI support from Claude Code
Quantified the user-proposed quasi-Newton on the cofactor form
cof(I+D2phi):D2(dphi)=g-det(I+D2phi) offline (fixed geometry,
res-16 Annulus, transport-map metric), vs the shipped BFO-Picard.

Decisive negative: (1) same recovered Hessian => same large
irreducible det(I+H_rec)-g floor, so it cannot beat the settled
fixed-node grading cap (predicted, not a grading lever); (2) it is
LESS robust than BFO at the same Hessian quality -- all three
standard convexity safeguards fail to reach BFO's fixed point
(residual line-search freezes lambda->8e-3; det>0 backtrack stalls
under-deformed d/n 1.58; PD-projection of eig(I+H) creeps, never
contracts, overshoots into a cell-inverting map d/n 1.49) while BFO
reaches d/n 1.713 in 16 iters. Root cause: BFO's sqrt(...)-2 is a
closed-form convex-branch solve using only the deviatoric recovered
Hessian + g, side-stepping the noisy under-estimated full det;
cofactor-Newton feeds the full noisy recovered H into both C_k and
the residual. UW3 forbids 2nd derivatives of mesh-var functions so
a sharp D2phi is unavailable. Do not re-explore.

Parallel path is therefore NOT Newton: keep BFO (robust
convex-branch) and port the proven factor/setup-once-reuse pattern
to GAMG (snes_lag_jacobian=-2 / KSPSetReusePreconditioner +
constant near-nullspace + Krylov warm-start). Design doc +
ma_newton_phase0.py.

Underworld development team with AI support from Claude Code
…-reuse prototype

Two findings from challenging the GAMG-reuse parallel prototype:

1. phi order, not the solver, sets the grading. P2 == P3 to ~3 dp
   across AMP 0/2/8/20 on the direct path (matches the recorded
   baseline, AMP=0 no-op exact, no tangle) at ~2x lower cost --
   smaller matrices also help the direct factorisation scale. P1 is
   NOT grading-equivalent (~1.40 vs 1.71 at AMP=8, ~18% weaker), so
   P2 is the floor. phi_degree is now a parameter, default 3 -> 2.
   Canonical cost_compare at P2: MA cold ~0.7-0.9 s (was ~12-18 s
   originally) -- ~15-20x combined, grading bit-for-bit.

2. The BFO+GAMG-reuse parallel path (linear_solver="gamg",
   selectable; "direct" stays default). FGMRES+GAMG(SOR) for the
   elliptic phi-Poisson (CG unjustified -- no exact-symmetry
   guarantee + non-symmetric SOR smoother), CG+Jacobi for the SPD
   mass systems; snes_lag_*=-2 + Krylov warm-start. The constant
   nullspace is verified correctly wired (not the failure mode); P3
   was a major GAMG confound (P2+gamg converges where P3+gamg
   diverges to a no-op) but even at P2 the warm post-_deform_mesh
   re-solve stays erratic, and this build has no alternative AMG
   (hypre/ML absent). Accepted position: MUMPS direct (itself
   MPI-parallel) for now; the P2 size cut only helps it. A robust
   iterative path would need a single-Dirichlet-pin (not the
   constant nullspace) and/or hypre, and is gated behind
   parallel-exact assembly + 3D (the solver is not the parallel
   bottleneck yet).

Spring-as-MA-initial-guess: settled-rejected (geometric mechanism,
unchanged by phi-order/solver speed) -- not re-run.

Design doc + subsystem doc updated; validators ma_phi_order.py,
ma_phi2_validate.py, ma_gamg_vs_direct.py, ma_solver_scaling.py.

Underworld development team with AI support from Claude Code
…t warm-cost scaling

P1 vs P2 × GAMG/direct, RES 16→64 (ma_p1_gamg_scaling.py, AMP=8):

- P1 does NOT make GAMG robust. When P1+GAMG converges it is
  textbook-good (18-22 iters, N-independent — genuinely more
  AMG-friendly than P2's 77→103) but it still fails erratically
  (P1+gamg diverges res-32 & res-64; P2+gamg fails res-16 & res-32).
  The pure-Neumann + warm-resolve breakdown is order-independent and
  resolution-erratic; MUMPS direct is robust at every (res, order).
- P1 grading ~1.40 vs P2 ~1.71 at every resolution (~18% weaker
  everywhere) — not a grading option regardless of solver.
- Key actionable side-finding: the DIRECT warm cost scales badly
  with N — P2-direct warm 1.3 s (res-16) → 17.8 s (res-48) →
  46.4 s (res-64), far above cold (9.5 s). The res-16 warm≈cold
  result does not extrapolate; per-call post-_deform_mesh rebuild +
  MUMPS refactorisation + cache-invalidated evaluate() re-interp are
  O(N)-growing. This is the next per-timestep-scaling work item.

Design doc + memory updated.

Underworld development team with AI support from Claude Code
Visual confirmation of the phi_degree decision: undeformed vs P1/P2/P3
MA redistribution (res-16, AMP=8) + outer-rim zoom. Shows P2 and P3
are visually identical strong clean grading (d/n 1.707 vs 1.713),
P1 a visibly weaker nudge (1.397), all meshes valid (no slivers /
inversion). Figures: /tmp/metric_mesh/ma_grids{,_zoom}.png.

Underworld development team with AI support from Claude Code
User caught that the P2 rim cells are far tighter than the nominal
1/3. ma_radial_anisotropy.py (res-16, AMP=8, vs undeformed): the
deep/near ≈1.71 is a per-node mean of ALL incident edges and
averages the collapsed radial edges with the frozen/expanded
tangential ones. Band-mean radial at the rim ≈0.38× (~1/3, the
isotropic-edge-criterion prediction) but the MIN radial edge ≈0.14×
(~1/7) and minA/meanA ≈0.019 (thinnest tri ~1/52 mean area). All
collapse is radial; tangential ~frozen.

Mechanism: the outer ring is pinned (boundary) and the metric peaks
exactly at r=R_O, so equidistribution jams the next ring(s) against
the fixed wall into one near-degenerate sliver layer, independent of
AMP. The isotropic AMP=1/s^2-1 rule is wrong here (annulus transport
is purely radial; boundary-peaked-vs-pinned over-collapses the wall
layer). d/n is OK as a regression guard but does NOT certify mesh
quality near a boundary-peaked feature -- use minA/meanA or a
radial/tangential split. Levers documented (offset metric peak
inward / cap AMP to a quality floor / pinned-boundary 1-D radial OT
design). Design doc + memory updated; figure
/tmp/metric_mesh/ma_radial_profile.png.

Underworld development team with AI support from Claude Code
…pendent

ma_show_grids_hires.py renders undeformed/P1/P2 at res 32 & 48
(AMP=8), full + outer-rim zoom, annotated with d/n + minA/meanA +
min-radial-vs-undeformed. Confirms the P2 rim sliver does NOT
improve with resolution: min radial 0.14x (res-16) -> 0.10x
(res-32) -> 0.11x (res-48); minA/meanA stays ~0.02-0.03. It is a
structural pinned-boundary + boundary-peaked-metric artifact, not a
discretisation error. P1 stays clean at all resolutions
(minA/meanA ~0.2). Figures /tmp/metric_mesh/ma_grids_hires{,_zoom}.png.

Underworld development team with AI support from Claude Code
…the snuggle fix

Interior blob (away from the pinned boundary), AMP=8:

- GAMG is ROBUST for localised cases (revises the blanket "GAMG
  fragile"). Every metric shape x width x resolution converged in
  ~27-54 iters, cost competitive with direct, zero failures. The
  earlier fragility was SPECIFICALLY the boundary-peaked-metric-
  against-pinned-boundary pathology (metric spiking where the
  operator is pinned). For the realistic localised-feature use case
  the parallel GAMG path is viable.

- "Snuggle" fix: a wider Gaussian is wrong (one width couples the
  resolution scale and the reach: narrow = sharp but isolated
  pucker, bulk idle; broad = global motion but feature washed out,
  far/near 2.4->1.5). A heavy-tailed Lorentzian monitor
  1+AMP/(1+d^2/Wc^2) gives the best feature resolution
  (far/near 2.74 > narrow 2.42) AND ~3x more inward migration of
  distant nodes (+0.025 vs +0.008): the whole mesh rakes coherently
  toward the feature. Mild quality cost (minA 0.089 vs 0.105), no
  tangle. Standard r-adaptation lesson (monitor needs global reach).

Diagnostic-only (no src/default change). Design doc + memory
updated; figures /tmp/metric_mesh/ma_heavytail.png,
ma_localised_reach.png.

Underworld development team with AI support from Claude Code
Tested defining the metric in (r,theta) so it "pulls in theta" plus
boundary sliding (reference: compact Cartesian Lorentzian at an
interior point gives far/near 2.74).

1. A separable (r,theta) Lorentzian is the wrong shape -- an
   anisotropic spoke, not a blob: the chord 2(1-cos dtheta)
   saturates at the antipode (no angular reach, far/near 1.38);
   true-wrapped-angle + balanced cores is a low-gradient radial
   ridge the smoother washes out (far/near 1.12, ~no-op). Use a
   compact |X-P|^2 Lorentzian about the feature point -- it has the
   correct combined radial+angular extent and pulls in theta
   inherently.
2. boundary_slip works mechanically (rim radial drift ~1e-16, nodes
   provably stay on the ring; GAMG robust ~31 it) but is NOT a
   concentrator: near a boundary feature it relaxes the mesh
   (far/near 1.21->1.12, minA 0.32->0.48, rim-count near theta0
   16->18) -- it removes the hard pin so the rim equalises, it does
   not drag rim nodes tangentially toward the feature. Slip buys
   boundary quality, not feature concentration.
3. Boundary-proximal features are choked: same compact Lorentzian
   gives far/near 2.74 at r0=0.78 (interior) vs 1.21 at r0=0.88
   (near rim) -- the fixed-node + pinned-boundary cap, feature side.

Recommendation: compact Cartesian |X-P|^2 Lorentzian about the
feature point, keep features with interior room, slip only for
boundary quality; drop the polar-separable idea. Diagnostic-only.
Design doc + memory updated; figs /tmp/metric_mesh/ma_polar_slip*,
ma_lorentzian_slip.png.

Underworld development team with AI support from Claude Code
…al dead end

User's reframing: exploit the abundant tangential node budget
(slide spare angular nodes toward the feature) not the scarce
pinned radial one. Built (1) the exact 1-D angular OT map as the
target and (2) a new opt-in _winslow_elliptic(move_anisotropy=
(w_r,w_theta)) that rescales the realised displacement in the local
radial/tangential frame (default None = byte-identical).

Angle-only feature rho(theta), AMP=8, res-24:
- (1) exact angular OT: far/near 2.21, frac@theta0 0.159->0.415,
  radial drift 1e-16, no tangle. Exact, trivial, robust -- the
  right tool for separable/structured features, used directly.
- (2) scalar BFO: ~ZERO angular concentration (far/near 0.98,
  frac 0.158 ~ uniform) for ANY weighting. move_anisotropy works
  as designed (suppresses spurious radial drift 6.8e-2 -> 8e-3) but
  there is no tangential transport to reweight -- the scalar
  potential never generates it.

Root cause = the foundational ~1.7 cap, both directions: a scalar
equidistribution potential with fixed topology cannot do large
coherent bulk transport, radial OR tangential (hoop/fixed-topology
stiffness cuts both ways). "(1) as a target for (2)" proved (2)
cannot reach it. Generalisable path: explicit dimensional-split
1-D OT for separable features, or a true anisotropic
metric-tensor adaptation (heavy) -- NOT anisotropic diffusivity /
move-weighting on the scalar potential. move_anisotropy is kept as
an opt-in quality knob (off-direction drift suppressor), not a
concentrator. Design doc + memory updated; fig
/tmp/metric_mesh/ma_angular_ot.png.

Underworld development team with AI support from Claude Code
ma_metric_tensor_viz.py: scalar density -> M = (1/h0^2)[I +
beta ghat ghat^T (|grad rho|/ref)^2], eigen-clamped, desired-cell
ellipses on a polar sample grid for a radial rho(r) and an angular
rho(theta) feature. Correct and confirms the design:

- radial feature -> tangentially elongated ellipses (short across
  r), angular feature -> radially elongated (short across theta),
  each concentrated at the feature and isotropic where grad rho->0.
- The eigenframe auto-aligns to r-hat / theta-hat with NO (r,theta)
  frame specified anywhere (M sees only the Cartesian gradient) --
  resolves the user's (r,theta) puzzle; the scalar API is preserved.
- max anisotropy = the eigen-clamp band (8.3:1), as designed.

Honest nuance: a gradient-based metric refines where rho CHANGES
(flanks), isotropic at a smooth peak / far field; for small cells at
the feature core use smoothed |grad rho| or Hessian-based M=|H(rho)|
(needs the recovered-Hessian path). Gradient form is the
first-derivative UW3-clean first cut.

The metric construction (the ~1-day half) is verified and cheap;
remaining for (3) is the anisotropic mover (metric-Winslow /
M-weighted displacement solve), improving alignment/quality not the
fixed-node cap. Design doc updated; fig
/tmp/metric_mesh/ma_metric_tensor.png.

Underworld development team with AI support from Claude Code
Self-contained handoff so a fresh session can start building the
metric-Winslow / M-weighted displacement solve without re-deriving:
what to read (memory dead-ends, design doc, the _CofDiff tensor-
constitutive pattern + cache/lag/MUMPS infra), the concrete plan,
the validation targets (anisotropy-aware diagnostics + 1-D OT
reference), the accepted quality-not-cap caveat, and scope (~1-2
weeks, new branch). Metric construction is done/verified; only the
mover remains.

Underworld development team with AI support from Claude Code
_winslow_anisotropic: M-weighted Laplace (Winslow) smooth of the
coordinate map with an eigen-clamped, gradient-derived anisotropic
metric tensor D. Displacement form ∇·(D∇u_c)=-Σ_j∂_j D_{jc} solved
per coordinate component, sharing the _CofDiff-style DiffusionModel
tensor operator (_c=D) and the factor-once-reuse direct solver;
homogeneous Dirichlet on the pinned boundary makes it non-singular
(no constant_nullspace — side-steps the GAMG-pure-Neumann
fragility). Linear (one solve/component/outer step, no Picard).
Reuses _winslow_elliptic's signed-area backtrack, boundary_slip and
move_anisotropy. Wired into smooth_mesh_interior as the opt-in
method="anisotropic"; spring default + ma path untouched.

Critical no-op guard: the L2 projection of a uniform-ρ zero
gradient leaves ~1e-18 round-off; normalising by that noisy max
fabricated O(1) anisotropy. A scale-aware g_eps floor makes AMP=0
an exact isotropic no-op while AMP>0 is bit-identical to the
verified ma_metric_tensor_viz construction.

Smoke (res-16 Annulus): AMP=0 exact no-op (max|Δx|=0), AMP=8 moves
nodes, mesh stays valid via the backtrack, ~3s/5 outer steps.

Underworld development team with AI support from Claude Code
Two formulation fixes after the radial-feature validation arc:

1. Build the eigen-clamped metric tensor D ONCE on the undeformed
   mesh and hold it fixed & Lagrangian (rides material points),
   exactly as _winslow_spring computes its rest-lengths/A0 once.
   Re-projecting ∇ρ on the progressively distorted mesh each outer
   step is a positive feedback — D blew up on squashed cells →
   catastrophic over-collapse (verified). With D fixed the outer
   loop is a stable damped fixed-point iteration of one linear
   operator toward the M-harmonic map.

2. Under-relax the per-step displacement (relax). The decoupled
   direct Winslow form has no Rado–Kneser–Choquet non-folding
   guarantee, so a single un-damped elliptic jump folds; its stable
   regime is bounded by the metric anisotropy. Characterised the
   Pareto frontier (scripts/aniso_param_sweep.py): aniso_cap=2 +
   relax≈0.1-0.2 is the robust default; higher cap needs gentler
   relax + more n_outer; cap≳6 folds regardless (would need the
   coupled/inverse Winslow — out of prototype scope). Defaults
   aniso_cap 8→2, relax 0.4→0.2.

Validated (scripts/aniso_validate_radial.py, anisotropy-aware
radial/tangential split + minA/meanA, NOT d/n, vs MA + spring +
exact 1-D OT; grids rendered). Interior radial feature (r=0.70,
res-16 AMP=8): (3) minA/meanA 0.47 vs isotropic MA 0.18 / spring
0.25 — 2.6x cleaner, no slivers, still redistributes toward the
feature. Even on the pathological boundary-peaked case (r=R_O) (3)
keeps minA/meanA 0.24 vs MA 0.019 (12x less degenerate). (3) trades
grading MAGNITUDE for clean anisotropic cell ALIGNMENT — exactly
its intended role; does not beat the fixed node-count cap.

Underworld development team with AI support from Claude Code
Completes the (3) validation arc with the anisotropy-aware
diagnostics on the remaining model problems.

Angular-only feature (res-24 AMP=8, scripts/aniso_validate_
angular.py, vs the exact 1-D angular-OT target): (3) far/near
1.02, frac@θ0 0.160 ≈ uniform — like the scalar MA it CANNOT do
separable angular concentration (expected: separable ⇒ the
explicit 1-D OT is exact + cheap; the settled fixed-topology cap).
But (3) keeps minA/meanA 0.243 vs scalar MA 0.144 — cleaner.

Non-separable blob P=(0.78,0) (the case (3) is for,
scripts/aniso_validate_nonsep.py): (3) minA/meanA 0.295 vs MA
0.109 / spring 0.119 — 2.7x cleaner, no slivers; the zoomed grid
shows MA/spring pull a degenerate slivered knot into the blob
while (3) gives a clean, well-shaped, blob-aligned densification.
Concentration milder (far/near 1.10 vs MA 1.37) — (3) trades
magnitude for clean alignment, exactly its intended role.

Verdict across the arc: (3) is the cleanest method everywhere
(2.6-12x better minA/meanA than isotropic MA), never slivers, is
linear/cheap; it does NOT beat the fixed node-count cap (for
separable features the explicit 1-D OT remains exact + cheaper).
A validated prototype matching the kickoff brief.

Underworld development team with AI support from Claude Code
Design doc: append the "(3) anisotropic mover — IMPLEMENTED &
VALIDATED" section (formulation, the two settled formulation
findings — build-D-once feedback + the decoupled-form folding /
aniso_cap Pareto, the validation table across all model problems,
and the verdict + open follow-ups). Subsystem doc: extend "the two
solvers" → three (method="anisotropic" row + recommendation + Open
items entry + example).

Project memory (project-anisotropic-mover) records the
non-obvious settled findings so future sessions don't re-derive
them: build D once (re-projection is a feedback collapse),
aniso_cap is the binding lever (cap=2 robust, ≥6 folds — coupled/
inverse Winslow needed, out of scope), the g_eps no-op guard, and
the honest "cleanest-not-strongest, doesn't beat the node-count
cap" verdict.

Underworld development team with AI support from Claude Code
Answers a review question: a GRADIENT metric resolves a Gaussian
blob's EDGE, not its CORE — by design. Renders the desired-cell
ellipse field (exactly the shipped mover's construction: β=200,
aniso_cap=2, h0 from mesh) beside the realised method="anisotropic"
mesh and an edge-length/cell-aspect profile vs d=|X-P|.

Confirms: target = fine anisotropic shell at the |∇ρ|-max ring,
isotropic-coarse core + far. Realised mesh matches — core (d≈0)
mean edge 0.079 vs undeformed 0.041 (~1.9× COARSENED: nodes
evacuated from the ∇ρ=0 core into the edge shell), edge-length
minimum + aspect maximum on the flank ring, far field untouched
(0.0416 vs 0.0412). If the blob CORE must be resolved, the
Hessian metric M=|H(ρ)| is the tool (documented follow-up, out
of prototype scope).

Underworld development team with AI support from Claude Code
"Run 20 steps non-adaptive, then refine." Reuses the
fixed_mesh_convection setup (Annulus res-16, free-slip Stokes +
AdvDiffusionSLCN, Ra=1e5, 20 steps, FIXED mesh; state cached to
npz). Then builds the mover-side analogue of UW3's adaptation
metric (adaptivity.metric_from_gradient): a Lagrangian density
ρ = 1 + AMP·t, t = clip((|∇T|-g_lo)/(g_hi-g_lo),0,1) with the
same percentile window idea — fed as method="anisotropic"
metric=. |∇T| is pre-projected to a scalar field so the mover's
internal metric.diff(X) stays a first derivative (UW3-legal) and
the metric rides material points.

Result (fig /tmp/metric_mesh/aniso_convection.png): the metric
correctly targets the inner thermal BL + plume edges; the refined
mesh gathers cells there (anisotropic, aligned along the BL/plume,
short across ∇T) and stays clean — minA/meanA 0.57→0.25, no
slivers. Fixed node budget ⇒ relative redistribution (the
documented node-count-cap caveat): same metric *form* as
adaptation, budget-limited *magnitude*.

Underworld development team with AI support from Claude Code
Fold the pipeline walkthrough (cache-build components 1-4,
per-call build-D-once + damped MMPDE loop steps 5-6), the solver
limitations (2D-tri-only, decoupled-direct folding bound,
node-count cap, gradient-resolves-edges-not-cores,
Lagrangian-fixed metric, serial-exact assembly, linear/decoupled),
and the unexplored corners (no solution-accuracy proof yet, no
dynamic-adaptive loop, coupled/inverse Winslow, Hessian metric,
the metric_from_gradient-style ρ helper, GAMG path, auto-tuning,
free-surface slip) into the design doc's "(3) … IMPLEMENTED &
VALIDATED" section as the canonical reference.

Underworld development team with AI support from Claude Code
scripts/aniso_cost_and_gamg.py: interior radial feature, res
16/24/32/48, direct vs gamg, decomposing cold (one-off setup) /
warm (per-timestep) / per-outer-step / D-build.

Underworld development team with AI support from Claude Code
GAMG: bit-parity with direct at res 16–48 (the mover is
non-singular — homogeneous Dirichlet — so no pure-Neumann
fragility; first metric method with a working parallel-capable
solver path). Cost: cold≈warm (no MA warm-rebuild pathology),
~O(N) in #cells, per-step = a fixed number of SPD elliptic solves
+ local work — structurally parallel-cheap (no nonlinear solve /
global transport). Lever is n_outer. Folded a "GAMG parity + cost
per step (measured)" subsection into the design doc; updated the
Solver-limitations + Corners (GAMG validated → remaining is
parallel-exact assembly; concrete 3D scope: solver core already
dim-general, the 2D-specific piece is the tet signed-volume
backtrack). Subsystem-doc Open-items entry updated.

Underworld development team with AI support from Claude Code
aniso_convection_res32.py: rebuild the res-16 P3 T from the cached
20-step npz, interpolate onto res-32 T nodes (uw.function.evaluate
across meshes), re-solve Stokes for a consistent V, run 5 warm
steps (res-32 state itself cached), then refine on ρ∝|∇T| with
method="anisotropic". Warm start preserves+sharpens the res-16
plume structure; the mover snuggles points into the inner BL +
plume conduits, mesh stays valid (minA/meanA 0.55→0.19, no
slivers — finer |∇T| ⇒ a more aggressive but clean redistribution
than res-16's 0.25). Fig /tmp/metric_mesh/aniso_convection_res32.png.

Underworld development team with AI support from Claude Code
- smooth_mesh_interior gains method_kwargs: a clean pass-through
  for the per-method tuning (aniso_cap/relax/n_outer/linear_solver
  /beta/move_anisotropy) without bloating the shared signature;
  documented with the validated anisotropic defaults.
- New public uw.meshing.metric_density_from_gradient(mesh, field,
  amp=, lo/hi_percentile=, ...): encapsulates the demonstrated
  pattern (project ∇field — first-derivative, UW3-clean — percentile
  -normalise, freeze as a Lagrangian scalar) and returns the ρ
  density expression. The relative, fixed-node-budget analogue of
  adaptivity.metric_from_gradient. Cached per (mesh,degree,name,
  topology) so it is cheap + leak-free called every step in an
  adaptive loop. Scale-aware g_eps floor ⇒ a (near-)uniform field
  yields ρ≡1 exactly (the smoke test caught the percentile-of-
  round-off fabrication — same class as the mover's own guard).
- Exported from underworld3.meshing.

aniso_api_smoke.py: public import, helper cache, method_kwargs
pass-through (bad kwarg → TypeError), valid clean move, exact
uniform no-op — all pass.

Underworld development team with AI support from Claude Code
mesh-adaptation.md: add a "Two families of mesh adaptation" table
(MMG re-mesh / absolute h / variables reset vs node-redistribution
/ relative density / topology preserved / anisotropic / cheap-
parallel) in Core Concepts, and a "Node redistribution — the
snuggling mover" section with the metric_density_from_gradient +
smooth_mesh_interior(method="anisotropic") recipe, the
gradient-resolves-edges-not-cores + node-count-cap caveats, the
method_kwargs knobs, and cross-links to the subsystem + design
docs. Frames it as a peer adaptive strategy with the SAME intent
API as adaptivity.metric_from_gradient.

Underworld development team with AI support from Claude Code
- scripts/adaptive_convection_harness.py: Ra=1e5, uniform res-24
  reference vs res-16 + periodic anisotropic adaptation
  (locked-in API), matched-time Nu(t)/vrms(t) rms comparison +
  figure, both histories cached. The node-update/ALE correction
  is an explicit documented hook (apply_adaptation_correction):
  --correction none = uncorrected baseline (quantifies the drift
  the next phase must remove); --correction ale raises with the
  precise spec (V_fn = v - v_mesh post-adapt, per the settled
  free-surface ALE precedent).
- Design note: "NEXT-PHASE KICKOFF BRIEF — dynamic adaptive
  convection": phase-closed summary (API/docs/harness entry
  points), the ALE correction as the headline open piece +
  acceptance test, prioritised follow-ups (3D scope, parallel,
  Hessian metric, auto-tune), and the resume recipe. Records that
  the metric is 1/h^2 per principal direction in 2D AND 3D (NOT
  1/h^dim) — verified adaptivity.py:154.

Underworld development team with AI support from Claude Code
adapt_with_correction(mode="interp"): exploit that the anisotropic
mover is topology-preserving (vector size / DOF order / partition
invariant) and the pinned boundary leaves the domain unchanged —
so the correct remap is the OLD P3 field evaluated at the NEW node
positions via the local FE evaluate (uw.function.evaluate, true
basis, max fidelity, no cross-rank migration), done by a
deform-back -> evaluate -> deform-forward around the mover, then
re-solve Stokes. NOT read_timestep's kd-tree path (that exists for
the decomposition-changing mesh.adapt case and was diffusive —
degraded the solution).

Result (Ra=1e5, res-16 adapt-every-5 vs uniform res-24 ref, n15):
rms ΔNu 0.028 vs uncorrected 0.097 (~3.5x better); the per-adapt
Nu kink is eliminated and the trajectory tracks the reference.
vrms shows a small expected interpolation-diffusion under-
prediction (the remap tax — motivates the future ALE option).

Also: nusselt() sign fixed (was -Nu: divided by the signed
conductive gradient instead of the conductive flux); harness
overlays the cached uncorrected baseline; --correction gains
"interp"; "ale" raises as the next-phase adapter-bound option.

Underworld development team with AI support from Claude Code
- adaptive_saturation.py/_plot.py: 3-way Ra=1e5 saturation
  harness (ref24/u16 + pristine-remesh adaptive a16p/a16s/a16x),
  loop-reorder (adapt between adv & the single stokes — drops the
  redundant solves), --resume warm-start, per-model metric
  strength, dense checkpointing; aniso_movie.py assembles the
  mesh-adapting animation (ffmpeg).
- Nusselt fixed (both saturation + harness): total radial heat
  flux v_r·T-∂_rT projected to nodes, integrated on an interior
  shell (BL-resolution-immune, shell-independent at steady
  state), normalised by the analytic ANNULAR LOG conduction flux
  2π/ln(Ro/Ri). _nu_proper.py validates Nu=1.0000 at conduction;
  _nu_recheck.py recomputes from checkpoints. The old 2-point
  shell stencil was ~2-3x low (h-dependent artefact).
- Result: pristine re-mesh bounds mesh quality over a full
  saturated run at aggressive metric; adaptive res-16 reproduces
  res-24 Nu (~1%) and vrms (~3%) at ~0.69x cost; adaptation
  overhead is a small fraction of one Stokes solve.

Underworld development team with AI support from Claude Code
New docs/developer/design/mesh-adaptation-formulation.md — the
self-contained maths reference: the equidistribution principle +
fixed-node-count cap; full optimal-transport / Monge-Ampère
derivation (Brenier map → MA equation → BFO convex-branch
Δφ=√((φxx-φyy)²+4φxy²+4g)-2, damped Picard, variational Hessian
recovery, the 1-D exact reference, why the FE-MA solve caps);
volumetric elastic-spring energy; the anisotropic metric-tensor
(Winslow/MMPDE) mover (eigen-clamped M from ∇ρ, M-weighted
displacement Laplace, damped-MMPDE stability); gradient-metric
construction + the 1/h² (not 1/h^d) dimension note;
dynamic-adaptation field handling (Lagrangian / local-FE remap /
ALE, pristine re-mesh); the Nusselt definition. Three example
figures (metric-tensor construction, non-separable validation,
the live Ra=1e5 a16x convection). Cross-linked from the
subsystem + advanced docs and added to the developer toctree.

Underworld development team with AI support from Claude Code
lmoresi added 15 commits May 18, 2026 21:52
User observed a16s (amp=16) and a16x (amp=24) bunch identically.
Verified (scripts/_amp_check.py, diff=4e-16): the mover builds
M = base[I + β ĝĝᵀ(|∇ρ|/gref)²] with gref=max|∇ρ|; with
metric_density_from_gradient's ρ=1+amp·t both |∇ρ| and gref scale
with amp ⇒ it CANCELS exactly, M is amp-invariant. So a16x ≡ a16s
in metric. The real anisotropic-mover knobs are aniso_cap
(eigen-clamp, peak sharpness + stability lever), β (anisotropy
strength → widens the refined band), and the lo/hi percentile
window in metric_density_from_gradient (reshapes t→∇t→M; selects
which gradient quantile refines — does NOT cancel). amp IS
effective for the isotropic spring/MA (A0∝1/ρ, g∝1/ρ) —
method-dependent. Corrected the formulation doc (§5 admonition)
and the adaptive_saturation _MP comment, which both wrongly
presented amp as the mover's bunching intensity.

Underworld development team with AI support from Claude Code
…through

amp being a no-op (prev commit), a16y uses the EFFECTIVE knobs:
aniso_cap 4→5 (sharper peak — binding lever; 6 folds), beta
200→300 (wider refined band), relax 0.05→0.04 + n_outer 25→30
(keep cap=5 non-folding under pristine re-mesh). `beta` now passed
through method_kwargs (was mover-default 200); existing entries
pinned to beta=200 to preserve their behaviour.

Result (controlled): cap=5 stays stable (no fold; minA/meanA
~0.26) but realised bunching ≈ a16s (cap=4) — the fixed-node-
budget pristine single-adaptation map caps the grading magnitude
regardless of metric aggressiveness. Reconfirms the §1
node-count cap by a turn-up-the-effective-knobs test (settled).

Underworld development team with AI support from Claude Code
…iso CFL, mesh.quality

Source/docs checkpoint for the feature/anisotropic-metric-mover branch.
Covers the work this session plus the prior strategy/aniso-CFL layer.

meshing/smoothing.py
  - ADAPT_STRATEGIES dict: off / vlow / low / med / high / extreme,
    each a coherent (amp, lo_pct, hi_pct, power, R, skip_threshold)
    bundle. User-supplied dial values still override.
  - mesh_metric_mismatch(): rms / max / median_abs / alignment (Pearson
    r of log(1/A_cell) vs log(rho_cell)) / misalignment metrics for
    deciding when to re-adapt.
  - smooth_mesh_interior(): strategy=, skip_threshold= kwargs;
    "off" short-circuits, mismatch < threshold short-circuits
    (adapt-on-demand).
  - metric_density_from_gradient(): strategy, power, mode, and
    *both* smoothing_length (field-side) and gradient_smoothing_length
    (post-projection metric source) kwargs. The gradient-side variant
    is the production de-noiser: it leaves T (and therefore the BL)
    untouched and smooths only the spatial distribution of |grad T|
    that the metric reads, so the BL location and magnitude are
    preserved.

meshing/__init__.py
  Re-export ADAPT_STRATEGIES, mesh_metric_mismatch.

discretisation/discretisation_mesh.py
  mesh.quality() — per-node and percentile mesh-quality probes used
  by the validation scripts; parallel-safe.

systems/solvers.py
  - AdvDiffusionSLCN.estimate_dt(direction_aware=False) — per-cell
    extent along v^ via projection of cell vertices onto v^; gives
    1.35x on uniform / 6-7x on adapted meshes vs the per-cell-h
    bound, while staying conservative on the actual CFL.
  - Unit-aware smoothing_length accessor on the four Projection
    classes (Scalar, Vector, Tensor, MultiComponent) with the
    screened-Poisson L = sqrt(alpha) convention. This bit is
    being upstreamed via PR #200 against development — duplicate
    diff here will fall away on rebase once that PR merges.

docs/developer/design/
  - mesh-adaptation-formulation.md updated with the equidistribution
    derivation, R / amp / percentile / power dial semantics, and
    the gradient-smoothing rationale.
  - snes-atol-convergence-scale.md — note on the missing snes_atol
    in UW3 Stokes leading to a near-converged-class line-search
    failure in warm restarts; recipe to set it sensibly.
  - solver-strategies-catalogue.md — first-pass catalogue of solver
    presets validated on Stokes / Darcy / etc. with bit-identical
    residual probes and warm/cold cost.

No script changes in this commit; a follow-up commit captures the
production drivers used to validate this work.

Underworld development team with AI support from Claude Code
Production drivers used to validate the meshing/solvers changes in
the previous commit. None of the leading-underscore one-shot probes
are included — those stay untracked.

Drivers
-------

stagnant_lid_uniform.py
  Annulus Ra=1e7, Frank-Kamenetskii delta_eta=1e4, P3-T / P2-V / P1-P
  baseline. Loops with checkpointed history.npz; reuses a single
  Projection across diagnostic calls (no per-step allocation).

stagnant_lid_adapt_loop.py
  Adapt-every-N convection loop with:
    * strategy={off..extreme}
    * skip_threshold short-circuit (mesh_metric_mismatch)
    * direction-aware AdvDiff CFL
    * --grad-smooth-h0 (gradient-side metric smoothing,
      expressed as a multiple of background h0)
    * surface Nu via uw.maths.BdIntegral on the cold (Upper)
      boundary, normalised by the analytic annular conductive
      flux 2 pi / ln(R_o/R_i); legacy mid-shell variant kept
      as _nu_midshell() for cross-checks
  Adapt step: FE-remaps T onto the new mesh and zeros V, P
  (validated cold-restart pattern; FE-remap of V,P or Lagrangian
  carry was rejected in prior probes).

stagnant_lid_adapt.py, stagnant_lid_R_sweep.py,
stagnant_lid_R_compare_plot.py, stagnant_lid_plot.py,
stagnant_lid_adapt_compare_plot.py
  Static (non-loop) adapt + R-sweep validation harnesses and
  plotting helpers.

adaptive_saturation*.py
  Earlier saturation-experiment harness, kept consistent with
  the new strategy API.

Outputs live in ~/+Simulations/StagnantLid/* (per-run history.npz
plus PyVista renderings). Not committed.

Underworld development team with AI support from Claude Code
A user-facing wrapper around the anisotropic node mover, intended
as the production entry point for ∇field-driven mesh adaptation.
The old strategy={off..extreme} / amp / percentile / power dials
remain on metric_density_from_gradient + smooth_mesh_interior for
expert use; new code should reach for follow_metric() instead.

API
---

  uw.meshing.follow_metric(
      mesh, field,
      refinement=3.0,                # h_min ≈ h0 / refinement
      coarsening="auto",             # h_max ≈ h0 · coarsening
      metric="front-following",      # or "gradient-uniform"
      skip_threshold=0.9,
      gradient_smoothing_length=None,
  )

Two strategic knobs (refinement, coarsening) replace the amp /
percentile / power dials in the user-facing API:

* `refinement` (mandatory, >= 1): factor by which the densest
  cells are smaller than the background h0.
* `coarsening` ("auto" or float >= 1): factor by which the
  sparsest cells are larger than h0. "auto" uses the
  budget-conserving minimum refinement**(1/d) — the smallest
  coarsening that "makes room" for the requested refinement at
  fixed node count.
* `metric` ∈ {"front-following", "gradient-uniform"}: strategic
  equidistribution rule (formerly the `power` numeric dial).

Math
----

The mover's eigenvalue → cell-size map is h = h0/√(ρ̂) (NOT
ρ̂^(-1/d) — λ has units of 1/length²), so a literal envelope
h ∈ [h0/refinement, h0·coarsening] corresponds to ρ̂ ∈
[1/coarsening², refinement²] — dimension-independent.

The piecewise-log-linear ansatz over percentile rank has its
break (ρ=1) at p* = log(refinement) / log(refinement·coarsening),
which makes geomean(ρ) ≡ 1 by construction. The mover's G
normalisation then passes ρ through unchanged.

For gradient-uniform: ρ ∝ |∇field|² clipped to the envelope —
constant per-cell Δfield, the natural advection-diffusion goal.

Practical behaviour
-------------------

Validation on a sharp-tanh annulus shows:

* refinement side honoured within ~5-10% of h0/refinement
* coarsening side achieves ~2× h0·coarsening (anisotropic cells
  + iterative mover deformation map are less tight on the coarse
  side; this is the underlying mover's behaviour, not the new
  API). The metric_choice still helps because the cap on
  refinement keeps the cell-size envelope from running away.

mesh_metric_mismatch is the right diagnostic for measuring how
close the achieved mesh is to the requested metric.

Tests
-----

tests/test_0750_meshing_follow_metric.py — 8 tests, all green:

* refinement=1 is a no-op metric
* envelope in ρ-space (ρ_max = ref², ρ_min = 1/coar²)
* geomean(ρ) ≡ 1 by construction (front-following)
* coarsening="auto" matches refinement**(1/d) explicitly
* invalid-arg validation
* end-to-end: follow_metric moves the mesh
* end-to-end: refinement envelope within ~15% of spec
* end-to-end: skip_threshold short-circuits on aligned mesh

Underworld development team with AI support from Claude Code
…r-coarsening

The anisotropic mover is a local graph-Laplacian: it can't transport
nodes long-distance across high-gradient ridges (the user's
diagnostic insight, 2026-05-21). So on the stagnant-lid test, cells
adjacent to the cold-BL refinement zone absorbed most of the freed
area and overshot the coarsening cap by 50-130%, while cells in
topologically isolated low-gradient pockets stayed near h0.

Fix: per-cell Lagrangian rest-size spring in _winslow_anisotropic.

  rest_size_cap_max=None   # if set, h0·coarsening
  rest_size_cap_min=None   # if set, h0/refinement
  rest_spring_K=1.0        # stiffness

In each outer iteration of the deformation map, after computing the
metric-driven displacement, the spring scans cells whose mean edge
would exceed (or go under) the cap. For each over/under-shooting
cell, the vertices receive an additional displacement toward their
*rest position* `old0` (captured at function entry), weighted by
the fractional overshoot summed over incident cells. With
relax=0.2 and n_outer=12 the per-step restoring strength is
~K · relax = 0.2 per iteration; after 12 iterations the effective
pull is ~93% restoring.

The spring is a *local* fix that doesn't require long-range node
transport — vertices simply refuse to move past the cap. Compares
against the rest position `old0`, the truly-undeformed coords from
function entry, not the per-iteration `old_coords`.

Validation on stagnant-lid step0080 (Ra=1e7, Δη=1e4):

  call              h_max/h0 BEFORE   AFTER spring   spec
  (1.5, auto)            2.16            1.47        1.22
  (2.0, auto)            3.13            1.60        1.41
  (3.0, auto)            3.22            1.90        1.73
  (2.0, 2.0)             4.59            2.19        2.00
  (2.0, 1.4)             3.10            1.59        1.40
  (2.0, auto, GU)        4.16            1.61        1.41

Coarsening side now within 10-30% of spec (was 50-130% over).
Refinement side maintained: within 10-15% of spec.

New test_follow_metric_coarsening_envelope_approximate locks
h_max/target_h_max < 1.30 across four (refinement, coarsening)
pairs. All 9 follow_metric tests pass.

Visual diagnostic (~/+Simulations/StagnantLid/follow_metric_compare/
plot_follow_metric_compare.png) shows the over-coarsened cells in
cold downwelling regions are now gone — the mesh grades smoothly
toward h0·coarsening rather than ballooning to ~2-3× that.

follow_metric.py wires rest_size_cap_max = h0·coarsening,
rest_size_cap_min = h0/refinement automatically; method_kwargs
can override.

Underworld development team with AI support from Claude Code
… directions

User feedback (2026-05-21): "I see very dramatic refinement near the
edges which seems more than 1/2 or 1/3 times the normal spacing.
Have we traded one problem for another?" — yes, the mean-edge spring
let anisotropic cells slip through the cap.

A cell with mean-edge = h0/refinement (passing the previous cap) can
still have one extremely short edge plus two long ones, producing
aspect ratios up to 12 and visible sliver edges at the BL. The
mean-edge cap doesn't catch this.

Fix: switch the spring's per-cell size measure from mean-edge to:

  * MAX-edge for the coarsening cap — pulls back if ANY edge grew
    beyond h0·coarsening. Catches cells that stretched in any
    direction past spec.
  * MIN-edge for the refinement cap — pulls back if ANY edge shrunk
    below h0/refinement. Catches sliver cells with one tiny edge.

Both measures are sliver-aware: a cell can't pass the cap by averaging
across long+short edges. User asked for both directions: "also apply
the spring in the other direction so cells refuse to collapse".

Empirically (stagnant lid step0080):

  call           any-edge min/h0   mean-edge min/h0   aspect_max
  ref=2.0:    BEFORE  0.134            0.422            7.32
              AFTER   0.487            0.530            2.74
  ref=3.0:    BEFORE  0.083            0.337            11.95
              AFTER   0.321            0.381            3.34

Aspect ratios drop dramatically. Min-edge is now near spec (0.5 for
ref=2; 0.33 for ref=3), removing the visible sliver overrun the user
flagged.

Tradeoff: ref=3 mean-edge can be 37% larger than h0/3 (cells under-
refine slightly because the min-edge cap holds — the cell's mean
must stay above its min). That's the right side of the trade: the
user asked for caps on extreme refinement, accepting mild under-
refinement.

test_follow_metric_refinement_envelope_approximate updated to lock
the mean-edge guarantee (0.85 < h_min_cell/target < 1.30) — the
spring's robust contract. Slivers can still occur on individual
cells with very strong local gradients; the *cell mean-edge*
guarantee is the one to rely on.

Underworld development team with AI support from Claude Code
…track

User feedback: "There are a few genuine sliver elements in those meshes
— tiny, thin things that are almost invisible. That seems new."

These are different from the anisotropic-sliver class fixed in the
previous commit. They're **near-degenerate** triangles: three vertices
become nearly collinear, producing cells with area ~1000× smaller than
typical while passing the existing aspect-ratio and edge-length
checks. The mover's signed-area backtrack accepted them because
`a1min > 0` (positive area, no flip) — but `0` and `1e-7` are different
things visually.

Two coupled root causes:

1. **Unbounded spring restoring weight.** The spring's per-vertex
   `restore_w` accumulates contributions from incident cells via
   `np.add.at`. A vertex incident on six violating cells, each
   contributing K·excess ≈ 0.5, gets restore_w = 3 — and the spring
   then moves the vertex by `restore_w * (rest - current) = 3·gap`,
   overshooting rest by 2×. Two over-shot vertices can land in
   the same place, collapsing a triangle.

   **Fix:** clamp `restore_w` per-vertex to ≤ 1 after the
   accumulation. Full restore-to-rest is the most you ever want;
   never go past.

2. **Backtrack only checks `a1min > 0`.** Cells whose area collapses
   to ~1e-7 (against a median ~1.5e-3) still pass.

   **Fix:** require `a1min > 0.01 · a0_undeformed_median`. The
   floor is computed ONCE from the undeformed mesh (captured
   before the iteration loop) so it doesn't shrink as cells
   refine. The 1% threshold is conservative for any practical
   refinement: refinement=3 in 2D legitimately shrinks cells by
   9× in area, which is 100× above the floor.

Empirically (stagnant lid step0080, against A0_median):

  case                  n<5%A0  n<1%A0  A_min/A0  aspect_max
  ref=1.5, auto            0       0      0.413      2.44
  ref=2.0, auto       1->1   1->**0**     0.010      2.71
  ref=3.0, auto       8->1   2->**0**     0.010      3.31
  ref=2.0, coar=2.0   3->3   3->**0**     0.010      3.48
  ref=2.0, GU              0       0      0.151      3.05

Zero cells with area < 1% of A0_median in any case. A handful sit
exactly at the floor (held there by the backtrack) — they "wanted"
to collapse further but were stopped. Visually clean.

Tests 9/9 pass.

Underworld development team with AI support from Claude Code
User feedback (2026-05-22): "Still lots of slivers in that follow-metric
mesh. It does a better job of distributing nodes but at the expense of
poor elements. Can we run a second pass on the distributed nodes that
would then adapt to the metric and fix the dodgy elements?"

Yes — and the right pass is the existing graph-Laplacian Jacobi
smoother (`smooth_mesh_interior` without a metric). Each iteration
averages every interior vertex toward the mean of its edge neighbours
— a gentle local cleanup that:

  * rounds out the few cells held at the sliver-floor by the backtrack
  * doesn't (significantly) undo the metric-driven node distribution
  * is cheap (one pass through the adjacency)

Tested alternatives that did NOT work:

  * Spring polish (`_winslow_spring`) with the same metric: drove
    cells toward equal-length edges, which conflicts with the
    metric-driven anisotropic distribution. Aspect_max went from
    3.31 → 14.27 — undoing the alignment.

  * Spring-only (no anisotropic first): aspect_max 8.78 — same
    transition-zone problem.

Jacobi sweep, ref=3.0 stagnant lid:

  polish        h_min/h0  A_min/A0  aspect_max  aspect_p99
  none          0.641      0.010      3.31        2.83     <- floor cell
  1×α=0.2       0.697      0.134      3.19        2.73     <- production
  2×α=0.2       0.748      0.190      2.99        2.63
  5×α=0.2       0.850      0.220      2.92        2.49

API change: `follow_metric()` now takes `polish_iters=1` (default)
and `polish_alpha=0.2`. The polish runs unconditionally after the
anisotropic mover when `polish_iters > 0`. polish_iters=0 disables.

Empirical trade-off (1 step):
  - h_min relaxes ~9% (0.641 → 0.697 of h0)
  - aspect drops modestly (3.31 → 3.19)
  - sliver-floor cells go 1 → 0
  - mesh visibly cleaner, BL refinement basically intact

Tests 9/9 pass unchanged (the polish doesn't affect the contract
documented in the existing tests; it just makes meshes prettier).

Underworld development team with AI support from Claude Code
User feedback (2026-05-22): "The criteria you are using seem a little
weak — I can still see some tiny cells in jacobi = 1, I'd think 2 or 3
is better. I am not sure what measure would find this out."

Found the right measure: cell shape quality

    q = 4√3·A / (e₀² + e₁² + e₂²)

(equilateral → 1, degenerate sliver → 0). Threshold q < 0.3 catches
visible slivers; q < 0.5 catches "noticeably anisotropic" cells.

The previous "n cells < 5% A0_median" measure was coarse — it could
go to zero while still leaving ~2 cells with q < 0.3 (visible
slivers). The shape-quality measure tracks the visual impression
correctly.

Sweep on ref=3 stagnant lid:

  polish    q_min    n q<0.3    n q<0.5
  0         0.022      7         118
  1×        0.270      2 (!)      78    <- old default missed these
  2×        0.369      0          57
  3×        0.397      0          43
  5×        0.431      0          17

API change: polish_iters → polish_max_iters (default 5),
polish_quality_target (default 0.3), polish_alpha (default 0.2).
The polish loop runs adaptively: each iteration recomputes q_min,
and the loop exits as soon as q_min >= polish_quality_target
(or the iteration cap is hit). User picks the QUALITY they want;
the polish runs the iterations needed.

Validation on the test fixture:

  ref=3.0: 2 polish iters needed → q_min=0.369
  ref=5.0: 5 polish iters needed (cap) → q_min=0.304

Higher refinement leaves more slivers to clean, so the adaptive
polish naturally uses more iterations. Self-tuning to the problem.

New tests/test_follow_metric_no_slivers_after_adaptive_polish
locks the contract: q_min >= 0.3 on refinement=2 and 3.

Tests 10/10 pass.

Underworld development team with AI support from Claude Code
… + diagnosis kwargs

This commit carries forward the changes made during the 2026-05-22/23
follow_metric debugging session. Two source files modified.

src/underworld3/meshing/smoothing.py
------------------------------------

follow_metric():
* h0_cache and rest_coords_cache per-mesh — the spring/eigenvalue-
  clamp references are captured at first call and reused across
  subsequent adapt cycles, preventing compounding refinement when
  adapt is called repeatedly on the same mesh.
* polish_max_iters + polish_quality_target (adaptive Jacobi
  polish) — runs until shape quality q = 4√3·A/(Σe²) ≥ target.
* gradient_smoothing_length plumbed through.

_winslow_anisotropic():
* rest_size_cap_max / rest_size_cap_min / rest_spring_K — per-cell
  Lagrangian rest-size spring; clamps min-edge ≥ h0/refinement and
  max-edge ≤ h0·coarsening when those caps are passed. restore_w
  capped at 1 per vertex to prevent spring overshoot. Spring uses
  rest_coords_override when supplied (so it pulls back to the
  truly-undeformed mesh, not the entry-state of THIS call).
* h0_override kwarg — bypass internal h0 measurement (use a
  cached external value to break compounding refinement across
  repeat calls).
* Sliver-floor backtrack — area constraint a1min >
  0.01·a0_undeformed_median, rejecting near-degenerate triangles
  that the original a1min > 0 check let through.
* metric_refresh_per_iter kwarg + extracted _build_M_tensor() —
  KNOWN INCOMPLETE FIX. The rebuilt M tensor reads from the same
  Lagrangian ρ source (MeshVariable), so it's still Lagrangian.
  Kept in code for the dedicated session's next round of debugging;
  default off preserves legacy.

scripts/stagnant_lid_adapt_loop.py
----------------------------------

* --from-perturbation: builds a fresh Annulus(0.5, 1.0, cellSize=1/16)
  and sets T = T_cond + amp·sin(m·θ)·sin(π(r-r_i)/(r_o-r_i)) +
  V=P=0. Validated starting point.
* --Ra, --delta-eta, --pert-mode, --pert-amplitude: physics knobs.
* --refinement: switches from legacy strategy='med' path to
  follow_metric(refinement=N).
* --skip-threshold: override the adapt skip cadence (-1 = use
  strategy default; high = never skip).
* --dt-mult: multiplier on estimate_dt() — SLCN is unconditionally
  stable so values > 1 (typically 3) give 3× larger dt without
  stability loss.
* --max-t: stop the loop when t_sim ≥ max_t.
* Stokes-BEFORE-AdvDiff order: critical fix. The old order was
  AdvDiff(uses V) → Stokes(updates V), which meant the AdvDiff
  step right after an adapt used V=0 (because adapt zeros V,P).
  Resulted in one diffusion step at zero advection, giving a
  visible Nu dip at each adapt. New order: Stokes(rebuilds V from
  remapped T) → estimate_dt → AdvDiff(uses correct V) →
  no diffusion-only transient.
* misalignment per-step diagnostic added to history.npz. The
  resume loader fills NaN if old npz lacks the column.

The full session diagnosis is captured in the memory entry
[project_winslow_mover_iteration_bug.md]. The synthetic-shapes
test in scripts/_test_metric_shapes_iter2.py (untracked,
left as scratch) is the smoking-gun demo.

Underworld development team with AI support from Claude Code
Adds _winslow_equidistribute — a single weighted-Poisson
equidistribution-flow step per call:

    ∇·(ρ ∇φ) = -ρ · log(V·ρ/K),  ∇φ·n̂=0
    K = exp(<ρ·log(V·ρ)>/<ρ>)   (Neumann zero-mean compat)
    y_new = y + relax · ∇φ

Linear: one weighted-Poisson per call, no Picard, no Hessian
recovery. ρ is target density at the current node positions; V is
patch volume at the current mesh; the source vanishes identically
at equidistribution (V_i·ρ_i ≡ K). Each call is a stand-alone
improvement step — the input mesh has no special status, so calls
chain freely and can be interleaved with spring / Jacobi / aniso
heuristic moves.

Wired as method='ot' (also 'equidistribute', 'improve') in
smooth_mesh_interior. step_frac=0.3 caps per-vertex displacement
to fight local fold formation near sharp source regions.

Synthetic-shapes test (~/+Simulations/StagnantLid/synthetic_shapes_OT/):
trajectory 0.72→0.49 in 13 steps, refinement bands directly on
the metric concentrations (no phase shift). Plateaus near 0.46
when sliver cells block the signed-area backtrack — design intent
is to compose with shape-quality smoothing between OT steps.

Also bundled here:
* target_side_rho kwarg on _winslow_elliptic: substitutes
  X[i] → X[i]+gradphi.sym[i] in the metric expression so ρ is
  evaluated at the moving target, removing the phase error where
  refinement-by-size was transported away from the feature
  location by ∇φ. gproj.solve() inside the Picard loop so
  gradphi tracks the current φ.
* Box-mode boundary slip on _winslow_elliptic: corners pinned,
  mid-edge nodes slide along their axis-aligned edge. Adds
  boundary_slip='box' alongside the existing 'ring' (annulus).
* _OT_CACHE for the OT-improve solver state.

Scripts under scripts/_test_metric_shapes_*.py exercise the
analytic-Eulerian ρ probe and produce the synthetic-shapes
comparison panels referenced in the project memory.

Underworld development team with AI support from Claude Code
Five exploration scripts that establish what composes well with
the OT-improvement step (_winslow_equidistribute, landed in
959c67d):

  _test_metric_shapes_OT_interleave.py
    First sweep across recipes ending on heuristic vs OT,
    spring-default vs Jacobi.

  _test_metric_shapes_OT_interleave2.py
    Always-end-on-OT discipline + spring tuning sweep
    (size_w 0,2,8). Result: light metric-aware spring
    (size_w=2) is the only useful interleave partner; default
    size_w=8 is over-aggressive, Jacobi is anti-cooperative,
    pure-shape size_w=0 fights the OT grading.

  _test_metric_shapes_OT_jacobi.py
    Jacobi-as-preconditioner-only sweep. Confirms Jacobi
    contributes nothing at any tuning (light/heavy/varied
    OT relax). Per-step OT rate is iteration-count-limited,
    not step-size-limited in this regime.

  _test_metric_shapes_OT_multires.py
    Multi-resolution OT (eps-widening homotopy: 2/3/4/5
    levels). Result: every multi-res schedule is worse than
    raw OT * N at matched compute — each level transition
    jumps imbalance upward because the mesh equilibrated for
    smoothed rho mismatches the sharper rho. Likely earns its
    keep only for rho with thin/distant features that raw OT
    can't find from a uniform start; our sech^2 bumps are
    already broad enough.

  _test_metric_shapes_OT_levels.py
    Isolates the two halves of the 2-level multi-res chain.
    Shows that smoothed-OT pre-positions nodes into the right
    *neighbourhoods* (wider bands), but doesn't precondition
    the sharp OT — both meshes reach the same sharp imbalance.

Empirical headline: raw OT (relax=0.1, step_frac=0.3,
boundary_slip='box') is the dominant strategy on a uniform
start with sech^2 features. The OT-improvement step has the
right composability semantics for the user's intent; the
sweeps just establish that no heuristic partner found so far
strictly beats more raw OT on this benchmark.

Underworld development team with AI support from Claude Code
The OT-improve step (959c67d) is genuinely incremental and
composable: with sf=0.3, relax=0.1 (conservative brakes), it
makes monotone progress without mesh tangling on a uniform
start and runs cleanly when chained from any prior mesh state.
That contrasts with the failed Lagrangian-accumulation path in
the anisotropic mover, where each "OT-like" step kept adding
displacement to a stale source and the refinement drifted away
from the actual gradients.

Empirical headline from the blob-then-sharp sweep:
* narrow-blob (AMP=8, eps=0.06) at brake-on settings is per-step
  equivalent to sharp OT — chains C/E tie the raw sharp baseline
  at sharp imb ~0.52
* the blob preconditioning DOES gather nodes into feature
  regions visibly (compare bottom-row interior cell density vs
  pure-sharp baseline), even though the imbalance metric is
  insensitive to that interior-vs-boundary distinction
* with brakes off (relax=1.0, sf=1.0), wide blob (eps=0.15)
  transports nodes 3-4x further per step but produces broad
  halos rather than fine band alignment
* narrow blob with brakes off tangles within 2-3 steps;
  amplification (AMP=20 or 50) doesn't help — it just brings
  the tangle earlier

Scripts:
  _test_metric_shapes_OT_blob.py            initial blob design
  _test_metric_shapes_OT_blob_amp.py        AMP sweep (8/20/50)
                                            + Gaussian alt
  _test_metric_shapes_OT_blob_loose.py      brake-off probe
  _test_metric_shapes_OT_blob_then_sharp.py blob -> sharp chain
                                            (with narrow blob)
  _profile_OT.py / _profile_OT_inside.py    timing breakdown:
                                            ps.f reassign or
                                            _deform_mesh forces
                                            re-factor (~5s);
                                            cached solve 0.2s

Next: validate on convection harness — does the incremental OT
help mesh adaptation when the metric is itself time-evolving?

Underworld development team with AI support from Claude Code
Promote the validated reset-to-uniform OT adaptation pattern from the
stagnant-lid harness into a public mesh method.

- mesh.OT_adapt(field, *, refinement=3.0, coarsening='auto',
  grad_smoothing_length='auto', metric_choice='front-following',
  fields_to_remap, fields_to_zero, skip_threshold, reference_coords,
  verbose) -> bool, plus mesh.OT_adapt_reset_reference() for the lazy
  reset-reference cache.
- New src/underworld3/meshing/_ot_adapt.py holds the algorithm and the
  boundary-slip helpers.
- grad_smoothing_length defaults to 'auto' (= the mesh's uniform cell size,
  parallel-safe mean edge length): the screened-Poisson de-noising that
  keeps R=3 sliver-free is on by default. None disables it; a number or
  Pint length sets it explicitly and is unit-aware (non-dimensionalised via
  the projection). metric_choice defaults to the validated 'front-following'
  (mild grading), not the sliver-prone 'gradient-uniform'.
- Unify the OT mover's boundary slip (_winslow_equidistribute) onto the
  mesh's projected normals (mesh.Gamma_P1) instead of the hardcoded
  box/ring cases. Radial coordinate systems slide tangentially and snap
  back to the surface; Cartesian boundaries pin (the raw facet normal is
  degenerate at vertices). Legacy boundary_slip='ring'/'box' kept as aliases.
- Constrained-manifold meshes (dim != cdim) raise NotImplementedError.
- _update_projected_normals recovers the persistent _n_proj variable after
  a deform instead of noisily re-creating it.
- tests/test_0760_mesh_ot_adapt.py locks the API. Design doc:
  docs/developer/design/ot-adapt-api-proposal.md.

Behavioural (not bit-for-bit — the slip algorithm changed): mode-1 Ra=1e7
delta-eta=1e2 over 200 steps gives 40/40 productive adapts; the reset is
idempotent (verified). R=3 on this sharp solver field still leaves a small
sliver population vs the gentler R~1.4 reference; auto-smoothing is the
mitigation, refinement is the caller's lever.

Underworld development team with AI support from Claude Code
Copilot AI review requested due to automatic review settings May 25, 2026 06:07
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR lands infrastructure and supporting documentation/scripts for metric-driven, topology-preserving mesh redistribution (including public meshing exports), and adds PETSc support needed for pure-Neumann scalar solves used by mesh-motion/equidistribution formulations.

Changes:

  • Export additional metric/adaptation utilities from underworld3.meshing (e.g. metric_density_from_gradient, follow_metric, ADAPT_STRATEGIES).
  • Add constant_nullspace support to SNES_Scalar to properly handle singular pure-Neumann scalar operators (incl. GAMG near-nullspace).
  • Add extensive analysis/benchmark/visualisation scripts plus documentation cross-links and an advanced-guide section explaining node redistribution vs remeshing.

Reviewed changes

Copilot reviewed 98 out of 102 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/underworld3/meshing/init.py Re-exports metric/strategy helpers from meshing.smoothing at module level.
src/underworld3/cython/petsc_generic_snes_solvers.pyx Adds optional constant-nullspace attachment for pure-Neumann scalar SNES solves.
scripts/vol_spring_test.py Bench script exploring volumetric spring size/shape weighting trade-offs.
scripts/stagnant_lid_R_compare_plot.py Generates side-by-side plots of meshes adapted at different resolution ratios R.
scripts/stagnant_lid_adapt_compare_plot.py Produces clean uniform-vs-adapted mesh comparison plots over T and |∇T|.
scripts/spring_converge.py Probes convergence of the spring PCG and its effect on grading/cost.
scripts/slip_test.py Compares boundary slip OFF/ON effects for volumetric spring and reports drift/quality.
scripts/precond_mild.py Experiments with mild spring as a preconditioner for Monge–Ampère adaptation.
scripts/precond_bench.py Benchmarks spring→MA preconditioning variants for cost/grading/quality.
scripts/plot_metric_meshes.py Plot-only renderer for saved metric-mesh cases (Spring vs MA).
scripts/neumann_solve_probe.py Isolates/validates pure-Neumann Poisson solve behavior with constant_nullspace.
scripts/ma_tune.py Parameter sweep for _winslow_elliptic (MA) tuning and tangle checks.
scripts/ma_solver_scaling.py Measures direct vs GAMG scaling/caching behavior for MA φ-Poisson solves.
scripts/ma_slip_test.py MA-specific boundary slip OFF/ON comparison and diagnostics.
scripts/ma_show_grids.py Renders MA redistributed grids for multiple φ polynomial orders.
scripts/ma_show_grids_hires.py High-res MA grid renders and quantitative anisotropy/quality annotations.
scripts/ma_res_sweep.py Resolution sweep for MA grading vs expected equidistribution trend.
scripts/ma_radial_anisotropy.py Analyzes radial vs tangential edge-length profiles for MA-adapted annulus.
scripts/ma_profile_phases.py Phase-resolved timing breakdown for cached warm MA calls.
scripts/ma_profile_diag.py Compares FE MA radial displacement against exact equidistribution displacement.
scripts/ma_polar_lorentzian_slip.py Polar Lorentzian metric experiment including slip and solver robustness checks.
scripts/ma_phi2_validate.py Validates φ=P2 vs φ=P3 grading parity and cost differences.
scripts/ma_phi_order.py Tests whether φ order affects GAMG robustness and grading parity.
scripts/ma_p1_gamg_scaling.py Scaling study for φ=P1/P2 with GAMG vs direct solvers.
scripts/ma_lorentzian_slip_final.py Final compact Cartesian Lorentzian + slip experiment script.
scripts/ma_localised_reach_gamg.py Localised metric reach study and GAMG robustness on interior blobs.
scripts/ma_heavytail_metric.py Heavy-tail (Lorentzian) metric “snuggle” test vs Gaussian alternatives.
scripts/ma_gamg_vs_direct.py Side-by-side direct vs GAMG comparison including KSP-iteration probing.
scripts/ma_cost_grading.py Reports MA-only cost, “honest” grading, and quality across AMP values.
scripts/ma_analytic_check.py Computes exact 1D radial equidistribution map as ground truth (no FE/UW).
scripts/interior_refine.py Compares spring vs MA on localized interior refinement with plotting.
scripts/cost_compare.py Wall-time comparison: spring vs MA paths across AMP, cold vs warm.
scripts/aniso_validate_nonsep.py Validates anisotropic mover on non-separable metric features vs scalar methods.
scripts/aniso_smoke.py Basic smoke test for anisotropic tensor mover validity and no-op cases.
scripts/aniso_param_sweep.py Parameter sweep for anisotropic mover stability/quality characteristics.
scripts/aniso_movie.py Builds a movie from adaptation checkpoints using PyVista rendering.
scripts/aniso_cost_and_gamg.py Cost + GAMG parity exploration for anisotropic mover across resolutions.
scripts/aniso_api_smoke.py Smoke test for public metric_density_from_gradient + smooth_mesh_interior(method="anisotropic") API.
scripts/adaptive_mesh_evolution.py Captures and renders before/after snapshots across adaptation events in a run.
scripts/adaptive_first_jump.py Captures first adaptation jump, checkpoints, and renders before/after views.
scripts/adapt_vs_ref24_cost.py Estimates amortized per-step overhead of pristine adaptation vs higher-res baseline.
scripts/adapt_cost_microbench.py Micro-benchmarks adaptation cost components relative to a solve step.
scripts/_test_metric_shapes_OT.py Synthetic-shapes test harness for composable OT improvement steps.
scripts/_test_metric_shapes_OT_jacobi.py OT + Jacobi interleave sweep with imbalance parsing and plotting.
scripts/_test_metric_shapes_OT_interleave2.py Refined OT-interleave sweep comparing heuristics at matched OT counts.
scripts/_test_metric_shapes_OT_interleave.py OT interleave sweep tracking imbalance trajectories and final meshes.
scripts/_test_metric_shapes_OT_blob_then_sharp.py OT chain experiment: broad-blob preconditioning then sharp OT refinement.
scripts/_test_metric_shapes_MA.py Synthetic-shapes smoke test for MA path with analytic Eulerian ρ.
scripts/_test_metric_shapes_MA_sweep.py MA convergence sweep varying outer/Picard iterations for shapes tests.
scripts/_test_metric_shapes_iter_sequence.py Steps mover 1-iter-at-a-time to inspect evolution with Eulerian metric.
scripts/_test_metric_shapes_iter_inside.py Captures per-outer-iteration snapshots from inside one smooth_mesh_interior call.
scripts/_test_metric_shapes_analytic_iter2.py Visual comparison of Lagrangian vs analytic Eulerian ρ cases across settings.
scripts/_test_metric_shapes_analytic_disp.py Diagnostic probe extracting max|Δx| trajectories for analytic vs Lagrangian ρ.
scripts/_profile_OT.py Profiles cost per OT step, including first-call setup vs cached subsequent calls.
scripts/_profile_OT_inside.py Profiles internal components of a single OT step (Poisson, projection, deform).
scripts/_nu_recheck.py Re-checks Nusselt computation on existing checkpoints using corrected flux method.
scripts/_nu_proper.py Implements “proper” Nusselt via projected total flux on shells + conduction validation.
scripts/_amp_check.py Checks whether amp changes the normalized-gradient field used in tensor metric construction.
docs/developer/subsystems/meshing.md Adds “Related” link to metric-driven mesh redistribution subsystem doc.
docs/developer/index.md Adds design doc entry for mesh-adaptation-formulation.
docs/advanced/mesh-adaptation.md Documents the “node redistribution” (snuggling mover) workflow and contrasts it with mesh.adapt().

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +1691 to +1706
self.snes.setUp()

jacobian = self.snes.getJacobian()
operator_matrix = jacobian[0]
preconditioner_matrix = jacobian[1] if len(jacobian) > 1 else None

nullspace = PETSc.NullSpace().create(
constant=True, comm=self.dm.comm)

operator_matrix.setNullSpace(nullspace)
operator_matrix.setTransposeNullSpace(nullspace)
# GAMG (the default PC) builds its coarse hierarchy from the
# near-nullspace; for a pure-Neumann operator the constant
# mode MUST be supplied here or the PC setup fails
# (DIVERGED_LINEAR_SOLVE at 0 iterations on re-solves).
operator_matrix.setNearNullSpace(nullspace)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in d4e1cbf: the constant nullspace is now cached on the instance (self._constant_nullspace_obj), built once and reused across solves (it depends only on the comm), and invalidated to None on DM/SNES rebuild alongside the Stokes nullspace.

Comment on lines +1685 to +1690
def _attach_constant_nullspace(self):
"""Attach a constant nullspace to the (already set-up) SNES
Jacobian. Scalar analogue of ``_attach_stokes_nullspace``."""
if not self._constant_nullspace:
return

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in d4e1cbf: _attach_constant_nullspace() now raises ValueError when self.essential_bcs is non-empty, mirroring the Stokes pressure-Dirichlet guard. OT_adapt's equidistribution Poisson is pure-Neumann (no essential BCs) so it is unaffected; verified the guard fires on a Dirichlet case and test_0760 stays green.

Addresses two review comments on the pure-Neumann constant_nullspace
support in SNES_Scalar:

- Raise ValueError when constant_nullspace=True is set on a problem that
  also has essential (Dirichlet) BCs — the constant mode is not in the
  operator's nullspace there, so projecting it out would corrupt the
  solution. Mirrors the Stokes pressure-Dirichlet guard.
- Cache the constant PETSc.NullSpace on the instance and reuse it across
  solves (it depends only on the comm) instead of rebuilding it every
  solve; invalidated on DM/SNES rebuild alongside the Stokes nullspace.

OT_adapt's equidistribution Poisson is pure-Neumann (no essential BCs), so
the guard does not fire there; test_0760 stays green.

Underworld development team with AI support from Claude Code
@lmoresi lmoresi merged commit 989d69a into development May 25, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants