Metric-driven mesh adaptation: follow_metric, OT movers, and mesh.OT_adapt()#209
Conversation
smooth_mesh_interior(metric=, method='spring'|'ma', boundary_slip=) — two topology-preserving node-redistribution solvers toward a target size/density field: - volumetric elastic spring (default): equal edge springs (shape, equant cells, no slivers) + per-cell area constraint A0 proportional to 1/rho_tgt (size); PCG truss-energy minimisation with an inversion-rejecting line search. ~0.3s, robust, scales with amplitude. - Benamou-Froese-Oberman Monge-Ampere: convex-branch det(I+D2phi)=g damped Picard on a variationally-recovered Hessian, pure-Neumann + the SNES_Scalar.constant_nullspace hook. Robust isotropic refinement, ~60x costlier. Optional tangential boundary slip (per-ring radius projection; nodes provably stay on the surface). Honest grading ~1.3-1.8x at fixed node count (extreme OT grading needs more nodes / a topology change). Docs: docs/developer/subsystems/mesh-metric-redistribution.md. Underworld development team with AI support from Claude Code
Validation and benchmark scripts for the metric smoother and the upcoming Monge-Ampere efficiency / convergence-rate work: honest per-node grading metric with mesh plots, exact radial equidistribution ground truth, setup sanity probe, cost / convergence / preconditioner benchmarks, boundary-slip A/B, interior-feature refinement test. Underworld development team with AI support from Claude Code
The Monge-Ampere path's cost was dominated by re-running a full GMRES+GAMG setup on every Picard iteration even though the mesh (hence the phi-Poisson Laplacian and the SPD Hessian-recovery mass matrix) is constant within the loop -- only the RHS changes. The constant_nullspace near-nullspace re-attach forced the GAMG re-setup, and newtonls added a redundant 2nd iterate + line search. _use_direct_solver puts the cached phi / Hessian / grad-phi sub-solvers on ksponly + MUMPS LU with snes_lag_jacobian=-2 and snes_lag_preconditioner=-2 (factor once, reuse: inner iters become back-substitutions), plus mat_mumps_icntl_24=1 for the singular pure-Neumann phi. n_picard default 40->25 (deep/near grading is flat from iter ~20). A direct solve is exact (tighter than the GMRES rtol) so the Picard fixed point -- hence grading/quality -- is bit-for-bit unchanged: d/n 1.02/1.43/1.71/1.54 (AMP 0/2/8/20), AMP=0 exact no-op, no tangle. Cost: cold 12-18s->1-2s, warm <=34s->1-2.5s; the warm>>cold GAMG-resetup degradation is eliminated. Serial MUMPS is the expedient that proved the diagnosis; the portable lever is the factor/setup-once-reuse pattern (port to GAMG for parallel) -- see the doc Implementation note + Open items. Validator: scripts/ma_cost_grading.py; profiler: ma_profile_phases.py. Underworld development team with AI support from Claude Code
Quantified the user-proposed quasi-Newton on the cofactor form cof(I+D2phi):D2(dphi)=g-det(I+D2phi) offline (fixed geometry, res-16 Annulus, transport-map metric), vs the shipped BFO-Picard. Decisive negative: (1) same recovered Hessian => same large irreducible det(I+H_rec)-g floor, so it cannot beat the settled fixed-node grading cap (predicted, not a grading lever); (2) it is LESS robust than BFO at the same Hessian quality -- all three standard convexity safeguards fail to reach BFO's fixed point (residual line-search freezes lambda->8e-3; det>0 backtrack stalls under-deformed d/n 1.58; PD-projection of eig(I+H) creeps, never contracts, overshoots into a cell-inverting map d/n 1.49) while BFO reaches d/n 1.713 in 16 iters. Root cause: BFO's sqrt(...)-2 is a closed-form convex-branch solve using only the deviatoric recovered Hessian + g, side-stepping the noisy under-estimated full det; cofactor-Newton feeds the full noisy recovered H into both C_k and the residual. UW3 forbids 2nd derivatives of mesh-var functions so a sharp D2phi is unavailable. Do not re-explore. Parallel path is therefore NOT Newton: keep BFO (robust convex-branch) and port the proven factor/setup-once-reuse pattern to GAMG (snes_lag_jacobian=-2 / KSPSetReusePreconditioner + constant near-nullspace + Krylov warm-start). Design doc + ma_newton_phase0.py. Underworld development team with AI support from Claude Code
…-reuse prototype Two findings from challenging the GAMG-reuse parallel prototype: 1. phi order, not the solver, sets the grading. P2 == P3 to ~3 dp across AMP 0/2/8/20 on the direct path (matches the recorded baseline, AMP=0 no-op exact, no tangle) at ~2x lower cost -- smaller matrices also help the direct factorisation scale. P1 is NOT grading-equivalent (~1.40 vs 1.71 at AMP=8, ~18% weaker), so P2 is the floor. phi_degree is now a parameter, default 3 -> 2. Canonical cost_compare at P2: MA cold ~0.7-0.9 s (was ~12-18 s originally) -- ~15-20x combined, grading bit-for-bit. 2. The BFO+GAMG-reuse parallel path (linear_solver="gamg", selectable; "direct" stays default). FGMRES+GAMG(SOR) for the elliptic phi-Poisson (CG unjustified -- no exact-symmetry guarantee + non-symmetric SOR smoother), CG+Jacobi for the SPD mass systems; snes_lag_*=-2 + Krylov warm-start. The constant nullspace is verified correctly wired (not the failure mode); P3 was a major GAMG confound (P2+gamg converges where P3+gamg diverges to a no-op) but even at P2 the warm post-_deform_mesh re-solve stays erratic, and this build has no alternative AMG (hypre/ML absent). Accepted position: MUMPS direct (itself MPI-parallel) for now; the P2 size cut only helps it. A robust iterative path would need a single-Dirichlet-pin (not the constant nullspace) and/or hypre, and is gated behind parallel-exact assembly + 3D (the solver is not the parallel bottleneck yet). Spring-as-MA-initial-guess: settled-rejected (geometric mechanism, unchanged by phi-order/solver speed) -- not re-run. Design doc + subsystem doc updated; validators ma_phi_order.py, ma_phi2_validate.py, ma_gamg_vs_direct.py, ma_solver_scaling.py. Underworld development team with AI support from Claude Code
…t warm-cost scaling P1 vs P2 × GAMG/direct, RES 16→64 (ma_p1_gamg_scaling.py, AMP=8): - P1 does NOT make GAMG robust. When P1+GAMG converges it is textbook-good (18-22 iters, N-independent — genuinely more AMG-friendly than P2's 77→103) but it still fails erratically (P1+gamg diverges res-32 & res-64; P2+gamg fails res-16 & res-32). The pure-Neumann + warm-resolve breakdown is order-independent and resolution-erratic; MUMPS direct is robust at every (res, order). - P1 grading ~1.40 vs P2 ~1.71 at every resolution (~18% weaker everywhere) — not a grading option regardless of solver. - Key actionable side-finding: the DIRECT warm cost scales badly with N — P2-direct warm 1.3 s (res-16) → 17.8 s (res-48) → 46.4 s (res-64), far above cold (9.5 s). The res-16 warm≈cold result does not extrapolate; per-call post-_deform_mesh rebuild + MUMPS refactorisation + cache-invalidated evaluate() re-interp are O(N)-growing. This is the next per-timestep-scaling work item. Design doc + memory updated. Underworld development team with AI support from Claude Code
Visual confirmation of the phi_degree decision: undeformed vs P1/P2/P3
MA redistribution (res-16, AMP=8) + outer-rim zoom. Shows P2 and P3
are visually identical strong clean grading (d/n 1.707 vs 1.713),
P1 a visibly weaker nudge (1.397), all meshes valid (no slivers /
inversion). Figures: /tmp/metric_mesh/ma_grids{,_zoom}.png.
Underworld development team with AI support from Claude Code
User caught that the P2 rim cells are far tighter than the nominal 1/3. ma_radial_anisotropy.py (res-16, AMP=8, vs undeformed): the deep/near ≈1.71 is a per-node mean of ALL incident edges and averages the collapsed radial edges with the frozen/expanded tangential ones. Band-mean radial at the rim ≈0.38× (~1/3, the isotropic-edge-criterion prediction) but the MIN radial edge ≈0.14× (~1/7) and minA/meanA ≈0.019 (thinnest tri ~1/52 mean area). All collapse is radial; tangential ~frozen. Mechanism: the outer ring is pinned (boundary) and the metric peaks exactly at r=R_O, so equidistribution jams the next ring(s) against the fixed wall into one near-degenerate sliver layer, independent of AMP. The isotropic AMP=1/s^2-1 rule is wrong here (annulus transport is purely radial; boundary-peaked-vs-pinned over-collapses the wall layer). d/n is OK as a regression guard but does NOT certify mesh quality near a boundary-peaked feature -- use minA/meanA or a radial/tangential split. Levers documented (offset metric peak inward / cap AMP to a quality floor / pinned-boundary 1-D radial OT design). Design doc + memory updated; figure /tmp/metric_mesh/ma_radial_profile.png. Underworld development team with AI support from Claude Code
…pendent
ma_show_grids_hires.py renders undeformed/P1/P2 at res 32 & 48
(AMP=8), full + outer-rim zoom, annotated with d/n + minA/meanA +
min-radial-vs-undeformed. Confirms the P2 rim sliver does NOT
improve with resolution: min radial 0.14x (res-16) -> 0.10x
(res-32) -> 0.11x (res-48); minA/meanA stays ~0.02-0.03. It is a
structural pinned-boundary + boundary-peaked-metric artifact, not a
discretisation error. P1 stays clean at all resolutions
(minA/meanA ~0.2). Figures /tmp/metric_mesh/ma_grids_hires{,_zoom}.png.
Underworld development team with AI support from Claude Code
…the snuggle fix Interior blob (away from the pinned boundary), AMP=8: - GAMG is ROBUST for localised cases (revises the blanket "GAMG fragile"). Every metric shape x width x resolution converged in ~27-54 iters, cost competitive with direct, zero failures. The earlier fragility was SPECIFICALLY the boundary-peaked-metric- against-pinned-boundary pathology (metric spiking where the operator is pinned). For the realistic localised-feature use case the parallel GAMG path is viable. - "Snuggle" fix: a wider Gaussian is wrong (one width couples the resolution scale and the reach: narrow = sharp but isolated pucker, bulk idle; broad = global motion but feature washed out, far/near 2.4->1.5). A heavy-tailed Lorentzian monitor 1+AMP/(1+d^2/Wc^2) gives the best feature resolution (far/near 2.74 > narrow 2.42) AND ~3x more inward migration of distant nodes (+0.025 vs +0.008): the whole mesh rakes coherently toward the feature. Mild quality cost (minA 0.089 vs 0.105), no tangle. Standard r-adaptation lesson (monitor needs global reach). Diagnostic-only (no src/default change). Design doc + memory updated; figures /tmp/metric_mesh/ma_heavytail.png, ma_localised_reach.png. Underworld development team with AI support from Claude Code
Tested defining the metric in (r,theta) so it "pulls in theta" plus boundary sliding (reference: compact Cartesian Lorentzian at an interior point gives far/near 2.74). 1. A separable (r,theta) Lorentzian is the wrong shape -- an anisotropic spoke, not a blob: the chord 2(1-cos dtheta) saturates at the antipode (no angular reach, far/near 1.38); true-wrapped-angle + balanced cores is a low-gradient radial ridge the smoother washes out (far/near 1.12, ~no-op). Use a compact |X-P|^2 Lorentzian about the feature point -- it has the correct combined radial+angular extent and pulls in theta inherently. 2. boundary_slip works mechanically (rim radial drift ~1e-16, nodes provably stay on the ring; GAMG robust ~31 it) but is NOT a concentrator: near a boundary feature it relaxes the mesh (far/near 1.21->1.12, minA 0.32->0.48, rim-count near theta0 16->18) -- it removes the hard pin so the rim equalises, it does not drag rim nodes tangentially toward the feature. Slip buys boundary quality, not feature concentration. 3. Boundary-proximal features are choked: same compact Lorentzian gives far/near 2.74 at r0=0.78 (interior) vs 1.21 at r0=0.88 (near rim) -- the fixed-node + pinned-boundary cap, feature side. Recommendation: compact Cartesian |X-P|^2 Lorentzian about the feature point, keep features with interior room, slip only for boundary quality; drop the polar-separable idea. Diagnostic-only. Design doc + memory updated; figs /tmp/metric_mesh/ma_polar_slip*, ma_lorentzian_slip.png. Underworld development team with AI support from Claude Code
…al dead end User's reframing: exploit the abundant tangential node budget (slide spare angular nodes toward the feature) not the scarce pinned radial one. Built (1) the exact 1-D angular OT map as the target and (2) a new opt-in _winslow_elliptic(move_anisotropy= (w_r,w_theta)) that rescales the realised displacement in the local radial/tangential frame (default None = byte-identical). Angle-only feature rho(theta), AMP=8, res-24: - (1) exact angular OT: far/near 2.21, frac@theta0 0.159->0.415, radial drift 1e-16, no tangle. Exact, trivial, robust -- the right tool for separable/structured features, used directly. - (2) scalar BFO: ~ZERO angular concentration (far/near 0.98, frac 0.158 ~ uniform) for ANY weighting. move_anisotropy works as designed (suppresses spurious radial drift 6.8e-2 -> 8e-3) but there is no tangential transport to reweight -- the scalar potential never generates it. Root cause = the foundational ~1.7 cap, both directions: a scalar equidistribution potential with fixed topology cannot do large coherent bulk transport, radial OR tangential (hoop/fixed-topology stiffness cuts both ways). "(1) as a target for (2)" proved (2) cannot reach it. Generalisable path: explicit dimensional-split 1-D OT for separable features, or a true anisotropic metric-tensor adaptation (heavy) -- NOT anisotropic diffusivity / move-weighting on the scalar potential. move_anisotropy is kept as an opt-in quality knob (off-direction drift suppressor), not a concentrator. Design doc + memory updated; fig /tmp/metric_mesh/ma_angular_ot.png. Underworld development team with AI support from Claude Code
ma_metric_tensor_viz.py: scalar density -> M = (1/h0^2)[I + beta ghat ghat^T (|grad rho|/ref)^2], eigen-clamped, desired-cell ellipses on a polar sample grid for a radial rho(r) and an angular rho(theta) feature. Correct and confirms the design: - radial feature -> tangentially elongated ellipses (short across r), angular feature -> radially elongated (short across theta), each concentrated at the feature and isotropic where grad rho->0. - The eigenframe auto-aligns to r-hat / theta-hat with NO (r,theta) frame specified anywhere (M sees only the Cartesian gradient) -- resolves the user's (r,theta) puzzle; the scalar API is preserved. - max anisotropy = the eigen-clamp band (8.3:1), as designed. Honest nuance: a gradient-based metric refines where rho CHANGES (flanks), isotropic at a smooth peak / far field; for small cells at the feature core use smoothed |grad rho| or Hessian-based M=|H(rho)| (needs the recovered-Hessian path). Gradient form is the first-derivative UW3-clean first cut. The metric construction (the ~1-day half) is verified and cheap; remaining for (3) is the anisotropic mover (metric-Winslow / M-weighted displacement solve), improving alignment/quality not the fixed-node cap. Design doc updated; fig /tmp/metric_mesh/ma_metric_tensor.png. Underworld development team with AI support from Claude Code
Self-contained handoff so a fresh session can start building the metric-Winslow / M-weighted displacement solve without re-deriving: what to read (memory dead-ends, design doc, the _CofDiff tensor- constitutive pattern + cache/lag/MUMPS infra), the concrete plan, the validation targets (anisotropy-aware diagnostics + 1-D OT reference), the accepted quality-not-cap caveat, and scope (~1-2 weeks, new branch). Metric construction is done/verified; only the mover remains. Underworld development team with AI support from Claude Code
_winslow_anisotropic: M-weighted Laplace (Winslow) smooth of the
coordinate map with an eigen-clamped, gradient-derived anisotropic
metric tensor D. Displacement form ∇·(D∇u_c)=-Σ_j∂_j D_{jc} solved
per coordinate component, sharing the _CofDiff-style DiffusionModel
tensor operator (_c=D) and the factor-once-reuse direct solver;
homogeneous Dirichlet on the pinned boundary makes it non-singular
(no constant_nullspace — side-steps the GAMG-pure-Neumann
fragility). Linear (one solve/component/outer step, no Picard).
Reuses _winslow_elliptic's signed-area backtrack, boundary_slip and
move_anisotropy. Wired into smooth_mesh_interior as the opt-in
method="anisotropic"; spring default + ma path untouched.
Critical no-op guard: the L2 projection of a uniform-ρ zero
gradient leaves ~1e-18 round-off; normalising by that noisy max
fabricated O(1) anisotropy. A scale-aware g_eps floor makes AMP=0
an exact isotropic no-op while AMP>0 is bit-identical to the
verified ma_metric_tensor_viz construction.
Smoke (res-16 Annulus): AMP=0 exact no-op (max|Δx|=0), AMP=8 moves
nodes, mesh stays valid via the backtrack, ~3s/5 outer steps.
Underworld development team with AI support from Claude Code
Two formulation fixes after the radial-feature validation arc: 1. Build the eigen-clamped metric tensor D ONCE on the undeformed mesh and hold it fixed & Lagrangian (rides material points), exactly as _winslow_spring computes its rest-lengths/A0 once. Re-projecting ∇ρ on the progressively distorted mesh each outer step is a positive feedback — D blew up on squashed cells → catastrophic over-collapse (verified). With D fixed the outer loop is a stable damped fixed-point iteration of one linear operator toward the M-harmonic map. 2. Under-relax the per-step displacement (relax). The decoupled direct Winslow form has no Rado–Kneser–Choquet non-folding guarantee, so a single un-damped elliptic jump folds; its stable regime is bounded by the metric anisotropy. Characterised the Pareto frontier (scripts/aniso_param_sweep.py): aniso_cap=2 + relax≈0.1-0.2 is the robust default; higher cap needs gentler relax + more n_outer; cap≳6 folds regardless (would need the coupled/inverse Winslow — out of prototype scope). Defaults aniso_cap 8→2, relax 0.4→0.2. Validated (scripts/aniso_validate_radial.py, anisotropy-aware radial/tangential split + minA/meanA, NOT d/n, vs MA + spring + exact 1-D OT; grids rendered). Interior radial feature (r=0.70, res-16 AMP=8): (3) minA/meanA 0.47 vs isotropic MA 0.18 / spring 0.25 — 2.6x cleaner, no slivers, still redistributes toward the feature. Even on the pathological boundary-peaked case (r=R_O) (3) keeps minA/meanA 0.24 vs MA 0.019 (12x less degenerate). (3) trades grading MAGNITUDE for clean anisotropic cell ALIGNMENT — exactly its intended role; does not beat the fixed node-count cap. Underworld development team with AI support from Claude Code
Completes the (3) validation arc with the anisotropy-aware diagnostics on the remaining model problems. Angular-only feature (res-24 AMP=8, scripts/aniso_validate_ angular.py, vs the exact 1-D angular-OT target): (3) far/near 1.02, frac@θ0 0.160 ≈ uniform — like the scalar MA it CANNOT do separable angular concentration (expected: separable ⇒ the explicit 1-D OT is exact + cheap; the settled fixed-topology cap). But (3) keeps minA/meanA 0.243 vs scalar MA 0.144 — cleaner. Non-separable blob P=(0.78,0) (the case (3) is for, scripts/aniso_validate_nonsep.py): (3) minA/meanA 0.295 vs MA 0.109 / spring 0.119 — 2.7x cleaner, no slivers; the zoomed grid shows MA/spring pull a degenerate slivered knot into the blob while (3) gives a clean, well-shaped, blob-aligned densification. Concentration milder (far/near 1.10 vs MA 1.37) — (3) trades magnitude for clean alignment, exactly its intended role. Verdict across the arc: (3) is the cleanest method everywhere (2.6-12x better minA/meanA than isotropic MA), never slivers, is linear/cheap; it does NOT beat the fixed node-count cap (for separable features the explicit 1-D OT remains exact + cheaper). A validated prototype matching the kickoff brief. Underworld development team with AI support from Claude Code
Design doc: append the "(3) anisotropic mover — IMPLEMENTED & VALIDATED" section (formulation, the two settled formulation findings — build-D-once feedback + the decoupled-form folding / aniso_cap Pareto, the validation table across all model problems, and the verdict + open follow-ups). Subsystem doc: extend "the two solvers" → three (method="anisotropic" row + recommendation + Open items entry + example). Project memory (project-anisotropic-mover) records the non-obvious settled findings so future sessions don't re-derive them: build D once (re-projection is a feedback collapse), aniso_cap is the binding lever (cap=2 robust, ≥6 folds — coupled/ inverse Winslow needed, out of scope), the g_eps no-op guard, and the honest "cleanest-not-strongest, doesn't beat the node-count cap" verdict. Underworld development team with AI support from Claude Code
Answers a review question: a GRADIENT metric resolves a Gaussian blob's EDGE, not its CORE — by design. Renders the desired-cell ellipse field (exactly the shipped mover's construction: β=200, aniso_cap=2, h0 from mesh) beside the realised method="anisotropic" mesh and an edge-length/cell-aspect profile vs d=|X-P|. Confirms: target = fine anisotropic shell at the |∇ρ|-max ring, isotropic-coarse core + far. Realised mesh matches — core (d≈0) mean edge 0.079 vs undeformed 0.041 (~1.9× COARSENED: nodes evacuated from the ∇ρ=0 core into the edge shell), edge-length minimum + aspect maximum on the flank ring, far field untouched (0.0416 vs 0.0412). If the blob CORE must be resolved, the Hessian metric M=|H(ρ)| is the tool (documented follow-up, out of prototype scope). Underworld development team with AI support from Claude Code
"Run 20 steps non-adaptive, then refine." Reuses the fixed_mesh_convection setup (Annulus res-16, free-slip Stokes + AdvDiffusionSLCN, Ra=1e5, 20 steps, FIXED mesh; state cached to npz). Then builds the mover-side analogue of UW3's adaptation metric (adaptivity.metric_from_gradient): a Lagrangian density ρ = 1 + AMP·t, t = clip((|∇T|-g_lo)/(g_hi-g_lo),0,1) with the same percentile window idea — fed as method="anisotropic" metric=. |∇T| is pre-projected to a scalar field so the mover's internal metric.diff(X) stays a first derivative (UW3-legal) and the metric rides material points. Result (fig /tmp/metric_mesh/aniso_convection.png): the metric correctly targets the inner thermal BL + plume edges; the refined mesh gathers cells there (anisotropic, aligned along the BL/plume, short across ∇T) and stays clean — minA/meanA 0.57→0.25, no slivers. Fixed node budget ⇒ relative redistribution (the documented node-count-cap caveat): same metric *form* as adaptation, budget-limited *magnitude*. Underworld development team with AI support from Claude Code
Fold the pipeline walkthrough (cache-build components 1-4, per-call build-D-once + damped MMPDE loop steps 5-6), the solver limitations (2D-tri-only, decoupled-direct folding bound, node-count cap, gradient-resolves-edges-not-cores, Lagrangian-fixed metric, serial-exact assembly, linear/decoupled), and the unexplored corners (no solution-accuracy proof yet, no dynamic-adaptive loop, coupled/inverse Winslow, Hessian metric, the metric_from_gradient-style ρ helper, GAMG path, auto-tuning, free-surface slip) into the design doc's "(3) … IMPLEMENTED & VALIDATED" section as the canonical reference. Underworld development team with AI support from Claude Code
scripts/aniso_cost_and_gamg.py: interior radial feature, res 16/24/32/48, direct vs gamg, decomposing cold (one-off setup) / warm (per-timestep) / per-outer-step / D-build. Underworld development team with AI support from Claude Code
GAMG: bit-parity with direct at res 16–48 (the mover is non-singular — homogeneous Dirichlet — so no pure-Neumann fragility; first metric method with a working parallel-capable solver path). Cost: cold≈warm (no MA warm-rebuild pathology), ~O(N) in #cells, per-step = a fixed number of SPD elliptic solves + local work — structurally parallel-cheap (no nonlinear solve / global transport). Lever is n_outer. Folded a "GAMG parity + cost per step (measured)" subsection into the design doc; updated the Solver-limitations + Corners (GAMG validated → remaining is parallel-exact assembly; concrete 3D scope: solver core already dim-general, the 2D-specific piece is the tet signed-volume backtrack). Subsystem-doc Open-items entry updated. Underworld development team with AI support from Claude Code
aniso_convection_res32.py: rebuild the res-16 P3 T from the cached 20-step npz, interpolate onto res-32 T nodes (uw.function.evaluate across meshes), re-solve Stokes for a consistent V, run 5 warm steps (res-32 state itself cached), then refine on ρ∝|∇T| with method="anisotropic". Warm start preserves+sharpens the res-16 plume structure; the mover snuggles points into the inner BL + plume conduits, mesh stays valid (minA/meanA 0.55→0.19, no slivers — finer |∇T| ⇒ a more aggressive but clean redistribution than res-16's 0.25). Fig /tmp/metric_mesh/aniso_convection_res32.png. Underworld development team with AI support from Claude Code
- smooth_mesh_interior gains method_kwargs: a clean pass-through for the per-method tuning (aniso_cap/relax/n_outer/linear_solver /beta/move_anisotropy) without bloating the shared signature; documented with the validated anisotropic defaults. - New public uw.meshing.metric_density_from_gradient(mesh, field, amp=, lo/hi_percentile=, ...): encapsulates the demonstrated pattern (project ∇field — first-derivative, UW3-clean — percentile -normalise, freeze as a Lagrangian scalar) and returns the ρ density expression. The relative, fixed-node-budget analogue of adaptivity.metric_from_gradient. Cached per (mesh,degree,name, topology) so it is cheap + leak-free called every step in an adaptive loop. Scale-aware g_eps floor ⇒ a (near-)uniform field yields ρ≡1 exactly (the smoke test caught the percentile-of- round-off fabrication — same class as the mover's own guard). - Exported from underworld3.meshing. aniso_api_smoke.py: public import, helper cache, method_kwargs pass-through (bad kwarg → TypeError), valid clean move, exact uniform no-op — all pass. Underworld development team with AI support from Claude Code
mesh-adaptation.md: add a "Two families of mesh adaptation" table (MMG re-mesh / absolute h / variables reset vs node-redistribution / relative density / topology preserved / anisotropic / cheap- parallel) in Core Concepts, and a "Node redistribution — the snuggling mover" section with the metric_density_from_gradient + smooth_mesh_interior(method="anisotropic") recipe, the gradient-resolves-edges-not-cores + node-count-cap caveats, the method_kwargs knobs, and cross-links to the subsystem + design docs. Frames it as a peer adaptive strategy with the SAME intent API as adaptivity.metric_from_gradient. Underworld development team with AI support from Claude Code
- scripts/adaptive_convection_harness.py: Ra=1e5, uniform res-24 reference vs res-16 + periodic anisotropic adaptation (locked-in API), matched-time Nu(t)/vrms(t) rms comparison + figure, both histories cached. The node-update/ALE correction is an explicit documented hook (apply_adaptation_correction): --correction none = uncorrected baseline (quantifies the drift the next phase must remove); --correction ale raises with the precise spec (V_fn = v - v_mesh post-adapt, per the settled free-surface ALE precedent). - Design note: "NEXT-PHASE KICKOFF BRIEF — dynamic adaptive convection": phase-closed summary (API/docs/harness entry points), the ALE correction as the headline open piece + acceptance test, prioritised follow-ups (3D scope, parallel, Hessian metric, auto-tune), and the resume recipe. Records that the metric is 1/h^2 per principal direction in 2D AND 3D (NOT 1/h^dim) — verified adaptivity.py:154. Underworld development team with AI support from Claude Code
adapt_with_correction(mode="interp"): exploit that the anisotropic mover is topology-preserving (vector size / DOF order / partition invariant) and the pinned boundary leaves the domain unchanged — so the correct remap is the OLD P3 field evaluated at the NEW node positions via the local FE evaluate (uw.function.evaluate, true basis, max fidelity, no cross-rank migration), done by a deform-back -> evaluate -> deform-forward around the mover, then re-solve Stokes. NOT read_timestep's kd-tree path (that exists for the decomposition-changing mesh.adapt case and was diffusive — degraded the solution). Result (Ra=1e5, res-16 adapt-every-5 vs uniform res-24 ref, n15): rms ΔNu 0.028 vs uncorrected 0.097 (~3.5x better); the per-adapt Nu kink is eliminated and the trajectory tracks the reference. vrms shows a small expected interpolation-diffusion under- prediction (the remap tax — motivates the future ALE option). Also: nusselt() sign fixed (was -Nu: divided by the signed conductive gradient instead of the conductive flux); harness overlays the cached uncorrected baseline; --correction gains "interp"; "ale" raises as the next-phase adapter-bound option. Underworld development team with AI support from Claude Code
- adaptive_saturation.py/_plot.py: 3-way Ra=1e5 saturation harness (ref24/u16 + pristine-remesh adaptive a16p/a16s/a16x), loop-reorder (adapt between adv & the single stokes — drops the redundant solves), --resume warm-start, per-model metric strength, dense checkpointing; aniso_movie.py assembles the mesh-adapting animation (ffmpeg). - Nusselt fixed (both saturation + harness): total radial heat flux v_r·T-∂_rT projected to nodes, integrated on an interior shell (BL-resolution-immune, shell-independent at steady state), normalised by the analytic ANNULAR LOG conduction flux 2π/ln(Ro/Ri). _nu_proper.py validates Nu=1.0000 at conduction; _nu_recheck.py recomputes from checkpoints. The old 2-point shell stencil was ~2-3x low (h-dependent artefact). - Result: pristine re-mesh bounds mesh quality over a full saturated run at aggressive metric; adaptive res-16 reproduces res-24 Nu (~1%) and vrms (~3%) at ~0.69x cost; adaptation overhead is a small fraction of one Stokes solve. Underworld development team with AI support from Claude Code
New docs/developer/design/mesh-adaptation-formulation.md — the self-contained maths reference: the equidistribution principle + fixed-node-count cap; full optimal-transport / Monge-Ampère derivation (Brenier map → MA equation → BFO convex-branch Δφ=√((φxx-φyy)²+4φxy²+4g)-2, damped Picard, variational Hessian recovery, the 1-D exact reference, why the FE-MA solve caps); volumetric elastic-spring energy; the anisotropic metric-tensor (Winslow/MMPDE) mover (eigen-clamped M from ∇ρ, M-weighted displacement Laplace, damped-MMPDE stability); gradient-metric construction + the 1/h² (not 1/h^d) dimension note; dynamic-adaptation field handling (Lagrangian / local-FE remap / ALE, pristine re-mesh); the Nusselt definition. Three example figures (metric-tensor construction, non-separable validation, the live Ra=1e5 a16x convection). Cross-linked from the subsystem + advanced docs and added to the developer toctree. Underworld development team with AI support from Claude Code
User observed a16s (amp=16) and a16x (amp=24) bunch identically. Verified (scripts/_amp_check.py, diff=4e-16): the mover builds M = base[I + β ĝĝᵀ(|∇ρ|/gref)²] with gref=max|∇ρ|; with metric_density_from_gradient's ρ=1+amp·t both |∇ρ| and gref scale with amp ⇒ it CANCELS exactly, M is amp-invariant. So a16x ≡ a16s in metric. The real anisotropic-mover knobs are aniso_cap (eigen-clamp, peak sharpness + stability lever), β (anisotropy strength → widens the refined band), and the lo/hi percentile window in metric_density_from_gradient (reshapes t→∇t→M; selects which gradient quantile refines — does NOT cancel). amp IS effective for the isotropic spring/MA (A0∝1/ρ, g∝1/ρ) — method-dependent. Corrected the formulation doc (§5 admonition) and the adaptive_saturation _MP comment, which both wrongly presented amp as the mover's bunching intensity. Underworld development team with AI support from Claude Code
…through amp being a no-op (prev commit), a16y uses the EFFECTIVE knobs: aniso_cap 4→5 (sharper peak — binding lever; 6 folds), beta 200→300 (wider refined band), relax 0.05→0.04 + n_outer 25→30 (keep cap=5 non-folding under pristine re-mesh). `beta` now passed through method_kwargs (was mover-default 200); existing entries pinned to beta=200 to preserve their behaviour. Result (controlled): cap=5 stays stable (no fold; minA/meanA ~0.26) but realised bunching ≈ a16s (cap=4) — the fixed-node- budget pristine single-adaptation map caps the grading magnitude regardless of metric aggressiveness. Reconfirms the §1 node-count cap by a turn-up-the-effective-knobs test (settled). Underworld development team with AI support from Claude Code
…iso CFL, mesh.quality
Source/docs checkpoint for the feature/anisotropic-metric-mover branch.
Covers the work this session plus the prior strategy/aniso-CFL layer.
meshing/smoothing.py
- ADAPT_STRATEGIES dict: off / vlow / low / med / high / extreme,
each a coherent (amp, lo_pct, hi_pct, power, R, skip_threshold)
bundle. User-supplied dial values still override.
- mesh_metric_mismatch(): rms / max / median_abs / alignment (Pearson
r of log(1/A_cell) vs log(rho_cell)) / misalignment metrics for
deciding when to re-adapt.
- smooth_mesh_interior(): strategy=, skip_threshold= kwargs;
"off" short-circuits, mismatch < threshold short-circuits
(adapt-on-demand).
- metric_density_from_gradient(): strategy, power, mode, and
*both* smoothing_length (field-side) and gradient_smoothing_length
(post-projection metric source) kwargs. The gradient-side variant
is the production de-noiser: it leaves T (and therefore the BL)
untouched and smooths only the spatial distribution of |grad T|
that the metric reads, so the BL location and magnitude are
preserved.
meshing/__init__.py
Re-export ADAPT_STRATEGIES, mesh_metric_mismatch.
discretisation/discretisation_mesh.py
mesh.quality() — per-node and percentile mesh-quality probes used
by the validation scripts; parallel-safe.
systems/solvers.py
- AdvDiffusionSLCN.estimate_dt(direction_aware=False) — per-cell
extent along v^ via projection of cell vertices onto v^; gives
1.35x on uniform / 6-7x on adapted meshes vs the per-cell-h
bound, while staying conservative on the actual CFL.
- Unit-aware smoothing_length accessor on the four Projection
classes (Scalar, Vector, Tensor, MultiComponent) with the
screened-Poisson L = sqrt(alpha) convention. This bit is
being upstreamed via PR #200 against development — duplicate
diff here will fall away on rebase once that PR merges.
docs/developer/design/
- mesh-adaptation-formulation.md updated with the equidistribution
derivation, R / amp / percentile / power dial semantics, and
the gradient-smoothing rationale.
- snes-atol-convergence-scale.md — note on the missing snes_atol
in UW3 Stokes leading to a near-converged-class line-search
failure in warm restarts; recipe to set it sensibly.
- solver-strategies-catalogue.md — first-pass catalogue of solver
presets validated on Stokes / Darcy / etc. with bit-identical
residual probes and warm/cold cost.
No script changes in this commit; a follow-up commit captures the
production drivers used to validate this work.
Underworld development team with AI support from Claude Code
Production drivers used to validate the meshing/solvers changes in
the previous commit. None of the leading-underscore one-shot probes
are included — those stay untracked.
Drivers
-------
stagnant_lid_uniform.py
Annulus Ra=1e7, Frank-Kamenetskii delta_eta=1e4, P3-T / P2-V / P1-P
baseline. Loops with checkpointed history.npz; reuses a single
Projection across diagnostic calls (no per-step allocation).
stagnant_lid_adapt_loop.py
Adapt-every-N convection loop with:
* strategy={off..extreme}
* skip_threshold short-circuit (mesh_metric_mismatch)
* direction-aware AdvDiff CFL
* --grad-smooth-h0 (gradient-side metric smoothing,
expressed as a multiple of background h0)
* surface Nu via uw.maths.BdIntegral on the cold (Upper)
boundary, normalised by the analytic annular conductive
flux 2 pi / ln(R_o/R_i); legacy mid-shell variant kept
as _nu_midshell() for cross-checks
Adapt step: FE-remaps T onto the new mesh and zeros V, P
(validated cold-restart pattern; FE-remap of V,P or Lagrangian
carry was rejected in prior probes).
stagnant_lid_adapt.py, stagnant_lid_R_sweep.py,
stagnant_lid_R_compare_plot.py, stagnant_lid_plot.py,
stagnant_lid_adapt_compare_plot.py
Static (non-loop) adapt + R-sweep validation harnesses and
plotting helpers.
adaptive_saturation*.py
Earlier saturation-experiment harness, kept consistent with
the new strategy API.
Outputs live in ~/+Simulations/StagnantLid/* (per-run history.npz
plus PyVista renderings). Not committed.
Underworld development team with AI support from Claude Code
A user-facing wrapper around the anisotropic node mover, intended
as the production entry point for ∇field-driven mesh adaptation.
The old strategy={off..extreme} / amp / percentile / power dials
remain on metric_density_from_gradient + smooth_mesh_interior for
expert use; new code should reach for follow_metric() instead.
API
---
uw.meshing.follow_metric(
mesh, field,
refinement=3.0, # h_min ≈ h0 / refinement
coarsening="auto", # h_max ≈ h0 · coarsening
metric="front-following", # or "gradient-uniform"
skip_threshold=0.9,
gradient_smoothing_length=None,
)
Two strategic knobs (refinement, coarsening) replace the amp /
percentile / power dials in the user-facing API:
* `refinement` (mandatory, >= 1): factor by which the densest
cells are smaller than the background h0.
* `coarsening` ("auto" or float >= 1): factor by which the
sparsest cells are larger than h0. "auto" uses the
budget-conserving minimum refinement**(1/d) — the smallest
coarsening that "makes room" for the requested refinement at
fixed node count.
* `metric` ∈ {"front-following", "gradient-uniform"}: strategic
equidistribution rule (formerly the `power` numeric dial).
Math
----
The mover's eigenvalue → cell-size map is h = h0/√(ρ̂) (NOT
ρ̂^(-1/d) — λ has units of 1/length²), so a literal envelope
h ∈ [h0/refinement, h0·coarsening] corresponds to ρ̂ ∈
[1/coarsening², refinement²] — dimension-independent.
The piecewise-log-linear ansatz over percentile rank has its
break (ρ=1) at p* = log(refinement) / log(refinement·coarsening),
which makes geomean(ρ) ≡ 1 by construction. The mover's G
normalisation then passes ρ through unchanged.
For gradient-uniform: ρ ∝ |∇field|² clipped to the envelope —
constant per-cell Δfield, the natural advection-diffusion goal.
Practical behaviour
-------------------
Validation on a sharp-tanh annulus shows:
* refinement side honoured within ~5-10% of h0/refinement
* coarsening side achieves ~2× h0·coarsening (anisotropic cells
+ iterative mover deformation map are less tight on the coarse
side; this is the underlying mover's behaviour, not the new
API). The metric_choice still helps because the cap on
refinement keeps the cell-size envelope from running away.
mesh_metric_mismatch is the right diagnostic for measuring how
close the achieved mesh is to the requested metric.
Tests
-----
tests/test_0750_meshing_follow_metric.py — 8 tests, all green:
* refinement=1 is a no-op metric
* envelope in ρ-space (ρ_max = ref², ρ_min = 1/coar²)
* geomean(ρ) ≡ 1 by construction (front-following)
* coarsening="auto" matches refinement**(1/d) explicitly
* invalid-arg validation
* end-to-end: follow_metric moves the mesh
* end-to-end: refinement envelope within ~15% of spec
* end-to-end: skip_threshold short-circuits on aligned mesh
Underworld development team with AI support from Claude Code
…r-coarsening The anisotropic mover is a local graph-Laplacian: it can't transport nodes long-distance across high-gradient ridges (the user's diagnostic insight, 2026-05-21). So on the stagnant-lid test, cells adjacent to the cold-BL refinement zone absorbed most of the freed area and overshot the coarsening cap by 50-130%, while cells in topologically isolated low-gradient pockets stayed near h0. Fix: per-cell Lagrangian rest-size spring in _winslow_anisotropic. rest_size_cap_max=None # if set, h0·coarsening rest_size_cap_min=None # if set, h0/refinement rest_spring_K=1.0 # stiffness In each outer iteration of the deformation map, after computing the metric-driven displacement, the spring scans cells whose mean edge would exceed (or go under) the cap. For each over/under-shooting cell, the vertices receive an additional displacement toward their *rest position* `old0` (captured at function entry), weighted by the fractional overshoot summed over incident cells. With relax=0.2 and n_outer=12 the per-step restoring strength is ~K · relax = 0.2 per iteration; after 12 iterations the effective pull is ~93% restoring. The spring is a *local* fix that doesn't require long-range node transport — vertices simply refuse to move past the cap. Compares against the rest position `old0`, the truly-undeformed coords from function entry, not the per-iteration `old_coords`. Validation on stagnant-lid step0080 (Ra=1e7, Δη=1e4): call h_max/h0 BEFORE AFTER spring spec (1.5, auto) 2.16 1.47 1.22 (2.0, auto) 3.13 1.60 1.41 (3.0, auto) 3.22 1.90 1.73 (2.0, 2.0) 4.59 2.19 2.00 (2.0, 1.4) 3.10 1.59 1.40 (2.0, auto, GU) 4.16 1.61 1.41 Coarsening side now within 10-30% of spec (was 50-130% over). Refinement side maintained: within 10-15% of spec. New test_follow_metric_coarsening_envelope_approximate locks h_max/target_h_max < 1.30 across four (refinement, coarsening) pairs. All 9 follow_metric tests pass. Visual diagnostic (~/+Simulations/StagnantLid/follow_metric_compare/ plot_follow_metric_compare.png) shows the over-coarsened cells in cold downwelling regions are now gone — the mesh grades smoothly toward h0·coarsening rather than ballooning to ~2-3× that. follow_metric.py wires rest_size_cap_max = h0·coarsening, rest_size_cap_min = h0/refinement automatically; method_kwargs can override. Underworld development team with AI support from Claude Code
… directions
User feedback (2026-05-21): "I see very dramatic refinement near the
edges which seems more than 1/2 or 1/3 times the normal spacing.
Have we traded one problem for another?" — yes, the mean-edge spring
let anisotropic cells slip through the cap.
A cell with mean-edge = h0/refinement (passing the previous cap) can
still have one extremely short edge plus two long ones, producing
aspect ratios up to 12 and visible sliver edges at the BL. The
mean-edge cap doesn't catch this.
Fix: switch the spring's per-cell size measure from mean-edge to:
* MAX-edge for the coarsening cap — pulls back if ANY edge grew
beyond h0·coarsening. Catches cells that stretched in any
direction past spec.
* MIN-edge for the refinement cap — pulls back if ANY edge shrunk
below h0/refinement. Catches sliver cells with one tiny edge.
Both measures are sliver-aware: a cell can't pass the cap by averaging
across long+short edges. User asked for both directions: "also apply
the spring in the other direction so cells refuse to collapse".
Empirically (stagnant lid step0080):
call any-edge min/h0 mean-edge min/h0 aspect_max
ref=2.0: BEFORE 0.134 0.422 7.32
AFTER 0.487 0.530 2.74
ref=3.0: BEFORE 0.083 0.337 11.95
AFTER 0.321 0.381 3.34
Aspect ratios drop dramatically. Min-edge is now near spec (0.5 for
ref=2; 0.33 for ref=3), removing the visible sliver overrun the user
flagged.
Tradeoff: ref=3 mean-edge can be 37% larger than h0/3 (cells under-
refine slightly because the min-edge cap holds — the cell's mean
must stay above its min). That's the right side of the trade: the
user asked for caps on extreme refinement, accepting mild under-
refinement.
test_follow_metric_refinement_envelope_approximate updated to lock
the mean-edge guarantee (0.85 < h_min_cell/target < 1.30) — the
spring's robust contract. Slivers can still occur on individual
cells with very strong local gradients; the *cell mean-edge*
guarantee is the one to rely on.
Underworld development team with AI support from Claude Code
…track User feedback: "There are a few genuine sliver elements in those meshes — tiny, thin things that are almost invisible. That seems new." These are different from the anisotropic-sliver class fixed in the previous commit. They're **near-degenerate** triangles: three vertices become nearly collinear, producing cells with area ~1000× smaller than typical while passing the existing aspect-ratio and edge-length checks. The mover's signed-area backtrack accepted them because `a1min > 0` (positive area, no flip) — but `0` and `1e-7` are different things visually. Two coupled root causes: 1. **Unbounded spring restoring weight.** The spring's per-vertex `restore_w` accumulates contributions from incident cells via `np.add.at`. A vertex incident on six violating cells, each contributing K·excess ≈ 0.5, gets restore_w = 3 — and the spring then moves the vertex by `restore_w * (rest - current) = 3·gap`, overshooting rest by 2×. Two over-shot vertices can land in the same place, collapsing a triangle. **Fix:** clamp `restore_w` per-vertex to ≤ 1 after the accumulation. Full restore-to-rest is the most you ever want; never go past. 2. **Backtrack only checks `a1min > 0`.** Cells whose area collapses to ~1e-7 (against a median ~1.5e-3) still pass. **Fix:** require `a1min > 0.01 · a0_undeformed_median`. The floor is computed ONCE from the undeformed mesh (captured before the iteration loop) so it doesn't shrink as cells refine. The 1% threshold is conservative for any practical refinement: refinement=3 in 2D legitimately shrinks cells by 9× in area, which is 100× above the floor. Empirically (stagnant lid step0080, against A0_median): case n<5%A0 n<1%A0 A_min/A0 aspect_max ref=1.5, auto 0 0 0.413 2.44 ref=2.0, auto 1->1 1->**0** 0.010 2.71 ref=3.0, auto 8->1 2->**0** 0.010 3.31 ref=2.0, coar=2.0 3->3 3->**0** 0.010 3.48 ref=2.0, GU 0 0 0.151 3.05 Zero cells with area < 1% of A0_median in any case. A handful sit exactly at the floor (held there by the backtrack) — they "wanted" to collapse further but were stopped. Visually clean. Tests 9/9 pass. Underworld development team with AI support from Claude Code
User feedback (2026-05-22): "Still lots of slivers in that follow-metric
mesh. It does a better job of distributing nodes but at the expense of
poor elements. Can we run a second pass on the distributed nodes that
would then adapt to the metric and fix the dodgy elements?"
Yes — and the right pass is the existing graph-Laplacian Jacobi
smoother (`smooth_mesh_interior` without a metric). Each iteration
averages every interior vertex toward the mean of its edge neighbours
— a gentle local cleanup that:
* rounds out the few cells held at the sliver-floor by the backtrack
* doesn't (significantly) undo the metric-driven node distribution
* is cheap (one pass through the adjacency)
Tested alternatives that did NOT work:
* Spring polish (`_winslow_spring`) with the same metric: drove
cells toward equal-length edges, which conflicts with the
metric-driven anisotropic distribution. Aspect_max went from
3.31 → 14.27 — undoing the alignment.
* Spring-only (no anisotropic first): aspect_max 8.78 — same
transition-zone problem.
Jacobi sweep, ref=3.0 stagnant lid:
polish h_min/h0 A_min/A0 aspect_max aspect_p99
none 0.641 0.010 3.31 2.83 <- floor cell
1×α=0.2 0.697 0.134 3.19 2.73 <- production
2×α=0.2 0.748 0.190 2.99 2.63
5×α=0.2 0.850 0.220 2.92 2.49
API change: `follow_metric()` now takes `polish_iters=1` (default)
and `polish_alpha=0.2`. The polish runs unconditionally after the
anisotropic mover when `polish_iters > 0`. polish_iters=0 disables.
Empirical trade-off (1 step):
- h_min relaxes ~9% (0.641 → 0.697 of h0)
- aspect drops modestly (3.31 → 3.19)
- sliver-floor cells go 1 → 0
- mesh visibly cleaner, BL refinement basically intact
Tests 9/9 pass unchanged (the polish doesn't affect the contract
documented in the existing tests; it just makes meshes prettier).
Underworld development team with AI support from Claude Code
User feedback (2026-05-22): "The criteria you are using seem a little
weak — I can still see some tiny cells in jacobi = 1, I'd think 2 or 3
is better. I am not sure what measure would find this out."
Found the right measure: cell shape quality
q = 4√3·A / (e₀² + e₁² + e₂²)
(equilateral → 1, degenerate sliver → 0). Threshold q < 0.3 catches
visible slivers; q < 0.5 catches "noticeably anisotropic" cells.
The previous "n cells < 5% A0_median" measure was coarse — it could
go to zero while still leaving ~2 cells with q < 0.3 (visible
slivers). The shape-quality measure tracks the visual impression
correctly.
Sweep on ref=3 stagnant lid:
polish q_min n q<0.3 n q<0.5
0 0.022 7 118
1× 0.270 2 (!) 78 <- old default missed these
2× 0.369 0 57
3× 0.397 0 43
5× 0.431 0 17
API change: polish_iters → polish_max_iters (default 5),
polish_quality_target (default 0.3), polish_alpha (default 0.2).
The polish loop runs adaptively: each iteration recomputes q_min,
and the loop exits as soon as q_min >= polish_quality_target
(or the iteration cap is hit). User picks the QUALITY they want;
the polish runs the iterations needed.
Validation on the test fixture:
ref=3.0: 2 polish iters needed → q_min=0.369
ref=5.0: 5 polish iters needed (cap) → q_min=0.304
Higher refinement leaves more slivers to clean, so the adaptive
polish naturally uses more iterations. Self-tuning to the problem.
New tests/test_follow_metric_no_slivers_after_adaptive_polish
locks the contract: q_min >= 0.3 on refinement=2 and 3.
Tests 10/10 pass.
Underworld development team with AI support from Claude Code
… + diagnosis kwargs This commit carries forward the changes made during the 2026-05-22/23 follow_metric debugging session. Two source files modified. src/underworld3/meshing/smoothing.py ------------------------------------ follow_metric(): * h0_cache and rest_coords_cache per-mesh — the spring/eigenvalue- clamp references are captured at first call and reused across subsequent adapt cycles, preventing compounding refinement when adapt is called repeatedly on the same mesh. * polish_max_iters + polish_quality_target (adaptive Jacobi polish) — runs until shape quality q = 4√3·A/(Σe²) ≥ target. * gradient_smoothing_length plumbed through. _winslow_anisotropic(): * rest_size_cap_max / rest_size_cap_min / rest_spring_K — per-cell Lagrangian rest-size spring; clamps min-edge ≥ h0/refinement and max-edge ≤ h0·coarsening when those caps are passed. restore_w capped at 1 per vertex to prevent spring overshoot. Spring uses rest_coords_override when supplied (so it pulls back to the truly-undeformed mesh, not the entry-state of THIS call). * h0_override kwarg — bypass internal h0 measurement (use a cached external value to break compounding refinement across repeat calls). * Sliver-floor backtrack — area constraint a1min > 0.01·a0_undeformed_median, rejecting near-degenerate triangles that the original a1min > 0 check let through. * metric_refresh_per_iter kwarg + extracted _build_M_tensor() — KNOWN INCOMPLETE FIX. The rebuilt M tensor reads from the same Lagrangian ρ source (MeshVariable), so it's still Lagrangian. Kept in code for the dedicated session's next round of debugging; default off preserves legacy. scripts/stagnant_lid_adapt_loop.py ---------------------------------- * --from-perturbation: builds a fresh Annulus(0.5, 1.0, cellSize=1/16) and sets T = T_cond + amp·sin(m·θ)·sin(π(r-r_i)/(r_o-r_i)) + V=P=0. Validated starting point. * --Ra, --delta-eta, --pert-mode, --pert-amplitude: physics knobs. * --refinement: switches from legacy strategy='med' path to follow_metric(refinement=N). * --skip-threshold: override the adapt skip cadence (-1 = use strategy default; high = never skip). * --dt-mult: multiplier on estimate_dt() — SLCN is unconditionally stable so values > 1 (typically 3) give 3× larger dt without stability loss. * --max-t: stop the loop when t_sim ≥ max_t. * Stokes-BEFORE-AdvDiff order: critical fix. The old order was AdvDiff(uses V) → Stokes(updates V), which meant the AdvDiff step right after an adapt used V=0 (because adapt zeros V,P). Resulted in one diffusion step at zero advection, giving a visible Nu dip at each adapt. New order: Stokes(rebuilds V from remapped T) → estimate_dt → AdvDiff(uses correct V) → no diffusion-only transient. * misalignment per-step diagnostic added to history.npz. The resume loader fills NaN if old npz lacks the column. The full session diagnosis is captured in the memory entry [project_winslow_mover_iteration_bug.md]. The synthetic-shapes test in scripts/_test_metric_shapes_iter2.py (untracked, left as scratch) is the smoking-gun demo. Underworld development team with AI support from Claude Code
Adds _winslow_equidistribute — a single weighted-Poisson
equidistribution-flow step per call:
∇·(ρ ∇φ) = -ρ · log(V·ρ/K), ∇φ·n̂=0
K = exp(<ρ·log(V·ρ)>/<ρ>) (Neumann zero-mean compat)
y_new = y + relax · ∇φ
Linear: one weighted-Poisson per call, no Picard, no Hessian
recovery. ρ is target density at the current node positions; V is
patch volume at the current mesh; the source vanishes identically
at equidistribution (V_i·ρ_i ≡ K). Each call is a stand-alone
improvement step — the input mesh has no special status, so calls
chain freely and can be interleaved with spring / Jacobi / aniso
heuristic moves.
Wired as method='ot' (also 'equidistribute', 'improve') in
smooth_mesh_interior. step_frac=0.3 caps per-vertex displacement
to fight local fold formation near sharp source regions.
Synthetic-shapes test (~/+Simulations/StagnantLid/synthetic_shapes_OT/):
trajectory 0.72→0.49 in 13 steps, refinement bands directly on
the metric concentrations (no phase shift). Plateaus near 0.46
when sliver cells block the signed-area backtrack — design intent
is to compose with shape-quality smoothing between OT steps.
Also bundled here:
* target_side_rho kwarg on _winslow_elliptic: substitutes
X[i] → X[i]+gradphi.sym[i] in the metric expression so ρ is
evaluated at the moving target, removing the phase error where
refinement-by-size was transported away from the feature
location by ∇φ. gproj.solve() inside the Picard loop so
gradphi tracks the current φ.
* Box-mode boundary slip on _winslow_elliptic: corners pinned,
mid-edge nodes slide along their axis-aligned edge. Adds
boundary_slip='box' alongside the existing 'ring' (annulus).
* _OT_CACHE for the OT-improve solver state.
Scripts under scripts/_test_metric_shapes_*.py exercise the
analytic-Eulerian ρ probe and produce the synthetic-shapes
comparison panels referenced in the project memory.
Underworld development team with AI support from Claude Code
Five exploration scripts that establish what composes well with the OT-improvement step (_winslow_equidistribute, landed in 959c67d): _test_metric_shapes_OT_interleave.py First sweep across recipes ending on heuristic vs OT, spring-default vs Jacobi. _test_metric_shapes_OT_interleave2.py Always-end-on-OT discipline + spring tuning sweep (size_w 0,2,8). Result: light metric-aware spring (size_w=2) is the only useful interleave partner; default size_w=8 is over-aggressive, Jacobi is anti-cooperative, pure-shape size_w=0 fights the OT grading. _test_metric_shapes_OT_jacobi.py Jacobi-as-preconditioner-only sweep. Confirms Jacobi contributes nothing at any tuning (light/heavy/varied OT relax). Per-step OT rate is iteration-count-limited, not step-size-limited in this regime. _test_metric_shapes_OT_multires.py Multi-resolution OT (eps-widening homotopy: 2/3/4/5 levels). Result: every multi-res schedule is worse than raw OT * N at matched compute — each level transition jumps imbalance upward because the mesh equilibrated for smoothed rho mismatches the sharper rho. Likely earns its keep only for rho with thin/distant features that raw OT can't find from a uniform start; our sech^2 bumps are already broad enough. _test_metric_shapes_OT_levels.py Isolates the two halves of the 2-level multi-res chain. Shows that smoothed-OT pre-positions nodes into the right *neighbourhoods* (wider bands), but doesn't precondition the sharp OT — both meshes reach the same sharp imbalance. Empirical headline: raw OT (relax=0.1, step_frac=0.3, boundary_slip='box') is the dominant strategy on a uniform start with sech^2 features. The OT-improvement step has the right composability semantics for the user's intent; the sweeps just establish that no heuristic partner found so far strictly beats more raw OT on this benchmark. Underworld development team with AI support from Claude Code
The OT-improve step (959c67d) is genuinely incremental and composable: with sf=0.3, relax=0.1 (conservative brakes), it makes monotone progress without mesh tangling on a uniform start and runs cleanly when chained from any prior mesh state. That contrasts with the failed Lagrangian-accumulation path in the anisotropic mover, where each "OT-like" step kept adding displacement to a stale source and the refinement drifted away from the actual gradients. Empirical headline from the blob-then-sharp sweep: * narrow-blob (AMP=8, eps=0.06) at brake-on settings is per-step equivalent to sharp OT — chains C/E tie the raw sharp baseline at sharp imb ~0.52 * the blob preconditioning DOES gather nodes into feature regions visibly (compare bottom-row interior cell density vs pure-sharp baseline), even though the imbalance metric is insensitive to that interior-vs-boundary distinction * with brakes off (relax=1.0, sf=1.0), wide blob (eps=0.15) transports nodes 3-4x further per step but produces broad halos rather than fine band alignment * narrow blob with brakes off tangles within 2-3 steps; amplification (AMP=20 or 50) doesn't help — it just brings the tangle earlier Scripts: _test_metric_shapes_OT_blob.py initial blob design _test_metric_shapes_OT_blob_amp.py AMP sweep (8/20/50) + Gaussian alt _test_metric_shapes_OT_blob_loose.py brake-off probe _test_metric_shapes_OT_blob_then_sharp.py blob -> sharp chain (with narrow blob) _profile_OT.py / _profile_OT_inside.py timing breakdown: ps.f reassign or _deform_mesh forces re-factor (~5s); cached solve 0.2s Next: validate on convection harness — does the incremental OT help mesh adaptation when the metric is itself time-evolving? Underworld development team with AI support from Claude Code
Promote the validated reset-to-uniform OT adaptation pattern from the stagnant-lid harness into a public mesh method. - mesh.OT_adapt(field, *, refinement=3.0, coarsening='auto', grad_smoothing_length='auto', metric_choice='front-following', fields_to_remap, fields_to_zero, skip_threshold, reference_coords, verbose) -> bool, plus mesh.OT_adapt_reset_reference() for the lazy reset-reference cache. - New src/underworld3/meshing/_ot_adapt.py holds the algorithm and the boundary-slip helpers. - grad_smoothing_length defaults to 'auto' (= the mesh's uniform cell size, parallel-safe mean edge length): the screened-Poisson de-noising that keeps R=3 sliver-free is on by default. None disables it; a number or Pint length sets it explicitly and is unit-aware (non-dimensionalised via the projection). metric_choice defaults to the validated 'front-following' (mild grading), not the sliver-prone 'gradient-uniform'. - Unify the OT mover's boundary slip (_winslow_equidistribute) onto the mesh's projected normals (mesh.Gamma_P1) instead of the hardcoded box/ring cases. Radial coordinate systems slide tangentially and snap back to the surface; Cartesian boundaries pin (the raw facet normal is degenerate at vertices). Legacy boundary_slip='ring'/'box' kept as aliases. - Constrained-manifold meshes (dim != cdim) raise NotImplementedError. - _update_projected_normals recovers the persistent _n_proj variable after a deform instead of noisily re-creating it. - tests/test_0760_mesh_ot_adapt.py locks the API. Design doc: docs/developer/design/ot-adapt-api-proposal.md. Behavioural (not bit-for-bit — the slip algorithm changed): mode-1 Ra=1e7 delta-eta=1e2 over 200 steps gives 40/40 productive adapts; the reset is idempotent (verified). R=3 on this sharp solver field still leaves a small sliver population vs the gentler R~1.4 reference; auto-smoothing is the mitigation, refinement is the caller's lever. Underworld development team with AI support from Claude Code
There was a problem hiding this comment.
Pull request overview
This PR lands infrastructure and supporting documentation/scripts for metric-driven, topology-preserving mesh redistribution (including public meshing exports), and adds PETSc support needed for pure-Neumann scalar solves used by mesh-motion/equidistribution formulations.
Changes:
- Export additional metric/adaptation utilities from
underworld3.meshing(e.g.metric_density_from_gradient,follow_metric,ADAPT_STRATEGIES). - Add
constant_nullspacesupport toSNES_Scalarto properly handle singular pure-Neumann scalar operators (incl. GAMG near-nullspace). - Add extensive analysis/benchmark/visualisation scripts plus documentation cross-links and an advanced-guide section explaining node redistribution vs remeshing.
Reviewed changes
Copilot reviewed 98 out of 102 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| src/underworld3/meshing/init.py | Re-exports metric/strategy helpers from meshing.smoothing at module level. |
| src/underworld3/cython/petsc_generic_snes_solvers.pyx | Adds optional constant-nullspace attachment for pure-Neumann scalar SNES solves. |
| scripts/vol_spring_test.py | Bench script exploring volumetric spring size/shape weighting trade-offs. |
| scripts/stagnant_lid_R_compare_plot.py | Generates side-by-side plots of meshes adapted at different resolution ratios R. |
| scripts/stagnant_lid_adapt_compare_plot.py | Produces clean uniform-vs-adapted mesh comparison plots over T and |∇T|. |
| scripts/spring_converge.py | Probes convergence of the spring PCG and its effect on grading/cost. |
| scripts/slip_test.py | Compares boundary slip OFF/ON effects for volumetric spring and reports drift/quality. |
| scripts/precond_mild.py | Experiments with mild spring as a preconditioner for Monge–Ampère adaptation. |
| scripts/precond_bench.py | Benchmarks spring→MA preconditioning variants for cost/grading/quality. |
| scripts/plot_metric_meshes.py | Plot-only renderer for saved metric-mesh cases (Spring vs MA). |
| scripts/neumann_solve_probe.py | Isolates/validates pure-Neumann Poisson solve behavior with constant_nullspace. |
| scripts/ma_tune.py | Parameter sweep for _winslow_elliptic (MA) tuning and tangle checks. |
| scripts/ma_solver_scaling.py | Measures direct vs GAMG scaling/caching behavior for MA φ-Poisson solves. |
| scripts/ma_slip_test.py | MA-specific boundary slip OFF/ON comparison and diagnostics. |
| scripts/ma_show_grids.py | Renders MA redistributed grids for multiple φ polynomial orders. |
| scripts/ma_show_grids_hires.py | High-res MA grid renders and quantitative anisotropy/quality annotations. |
| scripts/ma_res_sweep.py | Resolution sweep for MA grading vs expected equidistribution trend. |
| scripts/ma_radial_anisotropy.py | Analyzes radial vs tangential edge-length profiles for MA-adapted annulus. |
| scripts/ma_profile_phases.py | Phase-resolved timing breakdown for cached warm MA calls. |
| scripts/ma_profile_diag.py | Compares FE MA radial displacement against exact equidistribution displacement. |
| scripts/ma_polar_lorentzian_slip.py | Polar Lorentzian metric experiment including slip and solver robustness checks. |
| scripts/ma_phi2_validate.py | Validates φ=P2 vs φ=P3 grading parity and cost differences. |
| scripts/ma_phi_order.py | Tests whether φ order affects GAMG robustness and grading parity. |
| scripts/ma_p1_gamg_scaling.py | Scaling study for φ=P1/P2 with GAMG vs direct solvers. |
| scripts/ma_lorentzian_slip_final.py | Final compact Cartesian Lorentzian + slip experiment script. |
| scripts/ma_localised_reach_gamg.py | Localised metric reach study and GAMG robustness on interior blobs. |
| scripts/ma_heavytail_metric.py | Heavy-tail (Lorentzian) metric “snuggle” test vs Gaussian alternatives. |
| scripts/ma_gamg_vs_direct.py | Side-by-side direct vs GAMG comparison including KSP-iteration probing. |
| scripts/ma_cost_grading.py | Reports MA-only cost, “honest” grading, and quality across AMP values. |
| scripts/ma_analytic_check.py | Computes exact 1D radial equidistribution map as ground truth (no FE/UW). |
| scripts/interior_refine.py | Compares spring vs MA on localized interior refinement with plotting. |
| scripts/cost_compare.py | Wall-time comparison: spring vs MA paths across AMP, cold vs warm. |
| scripts/aniso_validate_nonsep.py | Validates anisotropic mover on non-separable metric features vs scalar methods. |
| scripts/aniso_smoke.py | Basic smoke test for anisotropic tensor mover validity and no-op cases. |
| scripts/aniso_param_sweep.py | Parameter sweep for anisotropic mover stability/quality characteristics. |
| scripts/aniso_movie.py | Builds a movie from adaptation checkpoints using PyVista rendering. |
| scripts/aniso_cost_and_gamg.py | Cost + GAMG parity exploration for anisotropic mover across resolutions. |
| scripts/aniso_api_smoke.py | Smoke test for public metric_density_from_gradient + smooth_mesh_interior(method="anisotropic") API. |
| scripts/adaptive_mesh_evolution.py | Captures and renders before/after snapshots across adaptation events in a run. |
| scripts/adaptive_first_jump.py | Captures first adaptation jump, checkpoints, and renders before/after views. |
| scripts/adapt_vs_ref24_cost.py | Estimates amortized per-step overhead of pristine adaptation vs higher-res baseline. |
| scripts/adapt_cost_microbench.py | Micro-benchmarks adaptation cost components relative to a solve step. |
| scripts/_test_metric_shapes_OT.py | Synthetic-shapes test harness for composable OT improvement steps. |
| scripts/_test_metric_shapes_OT_jacobi.py | OT + Jacobi interleave sweep with imbalance parsing and plotting. |
| scripts/_test_metric_shapes_OT_interleave2.py | Refined OT-interleave sweep comparing heuristics at matched OT counts. |
| scripts/_test_metric_shapes_OT_interleave.py | OT interleave sweep tracking imbalance trajectories and final meshes. |
| scripts/_test_metric_shapes_OT_blob_then_sharp.py | OT chain experiment: broad-blob preconditioning then sharp OT refinement. |
| scripts/_test_metric_shapes_MA.py | Synthetic-shapes smoke test for MA path with analytic Eulerian ρ. |
| scripts/_test_metric_shapes_MA_sweep.py | MA convergence sweep varying outer/Picard iterations for shapes tests. |
| scripts/_test_metric_shapes_iter_sequence.py | Steps mover 1-iter-at-a-time to inspect evolution with Eulerian metric. |
| scripts/_test_metric_shapes_iter_inside.py | Captures per-outer-iteration snapshots from inside one smooth_mesh_interior call. |
| scripts/_test_metric_shapes_analytic_iter2.py | Visual comparison of Lagrangian vs analytic Eulerian ρ cases across settings. |
| scripts/_test_metric_shapes_analytic_disp.py | Diagnostic probe extracting max|Δx| trajectories for analytic vs Lagrangian ρ. |
| scripts/_profile_OT.py | Profiles cost per OT step, including first-call setup vs cached subsequent calls. |
| scripts/_profile_OT_inside.py | Profiles internal components of a single OT step (Poisson, projection, deform). |
| scripts/_nu_recheck.py | Re-checks Nusselt computation on existing checkpoints using corrected flux method. |
| scripts/_nu_proper.py | Implements “proper” Nusselt via projected total flux on shells + conduction validation. |
| scripts/_amp_check.py | Checks whether amp changes the normalized-gradient field used in tensor metric construction. |
| docs/developer/subsystems/meshing.md | Adds “Related” link to metric-driven mesh redistribution subsystem doc. |
| docs/developer/index.md | Adds design doc entry for mesh-adaptation-formulation. |
| docs/advanced/mesh-adaptation.md | Documents the “node redistribution” (snuggling mover) workflow and contrasts it with mesh.adapt(). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| self.snes.setUp() | ||
|
|
||
| jacobian = self.snes.getJacobian() | ||
| operator_matrix = jacobian[0] | ||
| preconditioner_matrix = jacobian[1] if len(jacobian) > 1 else None | ||
|
|
||
| nullspace = PETSc.NullSpace().create( | ||
| constant=True, comm=self.dm.comm) | ||
|
|
||
| operator_matrix.setNullSpace(nullspace) | ||
| operator_matrix.setTransposeNullSpace(nullspace) | ||
| # GAMG (the default PC) builds its coarse hierarchy from the | ||
| # near-nullspace; for a pure-Neumann operator the constant | ||
| # mode MUST be supplied here or the PC setup fails | ||
| # (DIVERGED_LINEAR_SOLVE at 0 iterations on re-solves). | ||
| operator_matrix.setNearNullSpace(nullspace) |
There was a problem hiding this comment.
Addressed in d4e1cbf: the constant nullspace is now cached on the instance (self._constant_nullspace_obj), built once and reused across solves (it depends only on the comm), and invalidated to None on DM/SNES rebuild alongside the Stokes nullspace.
| def _attach_constant_nullspace(self): | ||
| """Attach a constant nullspace to the (already set-up) SNES | ||
| Jacobian. Scalar analogue of ``_attach_stokes_nullspace``.""" | ||
| if not self._constant_nullspace: | ||
| return | ||
|
|
There was a problem hiding this comment.
Addressed in d4e1cbf: _attach_constant_nullspace() now raises ValueError when self.essential_bcs is non-empty, mirroring the Stokes pressure-Dirichlet guard. OT_adapt's equidistribution Poisson is pure-Neumann (no essential BCs) so it is unaffected; verified the guard fires on a Dirichlet case and test_0760 stays green.
Addresses two review comments on the pure-Neumann constant_nullspace support in SNES_Scalar: - Raise ValueError when constant_nullspace=True is set on a problem that also has essential (Dirichlet) BCs — the constant mode is not in the operator's nullspace there, so projecting it out would corrupt the solution. Mirrors the Stokes pressure-Dirichlet guard. - Cache the constant PETSc.NullSpace on the instance and reuse it across solves (it depends only on the comm) instead of rebuilding it every solve; invalidated on DM/SNES rebuild alongside the Stokes nullspace. OT_adapt's equidistribution Poisson is pure-Neumann (no essential BCs), so the guard does not fire there; test_0760 stays green. Underworld development team with AI support from Claude Code
Summary
Lands the metric-driven mesh-adaptation feature, culminating in the public
mesh.OT_adapt()method. The branch is 45 commits ahead ofdevelopment;developmentdoes not yet have the mover/metricinfrastructure (
_winslow_equidistribute,metric_density_from_gradient,follow_metric,_edge_pairs), so OT_adapt cannot be split out — it is thecapstone of this feature and ships with the infrastructure it relies on.
The headline tool —
mesh.OT_adapt()Promotes the validated reset-to-uniform optimal-transport adaptation
pattern into a library method: reset to a cached uniform reference,
FE-remap the driving field, build a gradient metric, run the OT mover,
FE-remap fields, zero a cold-restart set. Plus
mesh.OT_adapt_reset_reference()for the reference cache.Settled defaults
refinement=3.0— primary feature knob.grad_smoothing_length="auto"— screened-Poisson de-noising at themesh's uniform cell size (parallel-safe mean edge length); the lever that
keeps R=3 sliver-free.
Nonedisables it; a number or Pint lengthsets it explicitly (unit-aware, non-dimensionalised via the projection).
metric_choice="front-following"— mild grading; not the sliver-prone"gradient-uniform".Boundary slip is unified onto the mesh's projected normals
(
mesh.Gamma_P1), replacing the hardcoded box/ring cases: radialcoordinate systems slide tangentially and snap back to the surface;
Cartesian boundaries pin (the raw facet normal is degenerate at vertices).
Constrained-manifold meshes (
dim != cdim) raiseNotImplementedError.Validation
tests/test_0760_mesh_ot_adapt.pylocks the API (Annulus + Box move andpreserve the field; boundary behaviour; reference cache; skip threshold;
auto-smoothing default; unit-aware length; manifold guard). The existing
follow_metricsuite stays green.Ra=1e7 delta-eta=1e2 over 200 steps gives 40/40 productive adapts; the
reset is idempotent (verified — no sliver accumulation). At R=3 a small
sliver population remains on the sharp solver field that the gentler
R~1.4 reference avoids; auto-smoothing is the mitigation, refinement is
the caller's lever.
Design doc:
docs/developer/design/ot-adapt-api-proposal.md.Not in this PR
A parallel-safe
get_mean_radius()/ harness refresh is in progress in aseparate working tree and is intentionally excluded here.
Underworld development team with AI support from Claude Code