fix(consensus,framework,actuator): use Locale.ROOT for case-insensitive by halibobo1205 · Pull Request #6698 · tronprotocol/java-tron

halibobo1205 · 2026-04-23T03:44:34Z

Summary

Fix a locale-dependent code hygiene issue: no-arg String.toLowerCase() / toUpperCase() calls use the JVM default locale, and under a minority of locales (tr/az) the folding of 'I' diverges from every other locale. Switching uniformly to Locale.ROOT decouples the behavior from the deployment environment, backed by a compile-time guard and a one-time data normalization.

This PR does three things:

Correctness. Replace every no-arg toLowerCase() / toUpperCase() call with its Locale.ROOT counterpart so that string folding is independent of the JVM's startup locale.
Defense-in-depth. Enable ErrorProne's StringCaseLocaleUsage checker at ERROR level, and add a custom StringCaseLocaleUsageMethodRef checker to cover the String::toLowerCase method-reference form that upstream misses. Any future no-arg call fails compilation.
Data hygiene. MigrateTurkishKeyHelper runs once at Manager.init() to normalize any non-ROOT legacy keys that may exist in AccountIdIndexStore.

What this fix really does

Make the case-insensitive index behavior a pure function of the input bytes, no longer implicitly dependent on the JVM's startup locale. Before this fix, AccountIdIndexStore was, in principle, "case-insensitive relative to whichever locale this JVM happens to use" rather than truly locale-independent. After the fix it is the latter.

This is a data-structure-level correctness convergence — remove an implicit environment dependency and align the implementation with the intended semantics.

Scope

Only one site reaches persistent storage: AccountIdIndexStore.getLowerCaseAccountId — the lowercased result becomes a DB key. All other touched call sites (DB engine selection, disabled-API list, log topic classification, hex display, OS detection, etc.) are in-process runtime comparisons — they neither write to disk nor cross nodes.

The input domain itself is tightly constrained: input to setAccountId is gated by validReadableBytes to printable ASCII (0x21–0x7E). Within this domain, the locale divergence in lowercase only manifests on a single character; every other character (letters, digits, symbols) folds identically under any locale.

Auditability of the migration scope

The "only one site reaches persistent storage" claim above can be independently verified by inspection. Every other touched call site falls into one of:

(a) Runtime engine selection — dbEngine.toUpperCase() in TronDatabase, TronStoreWithRevoking, TxCacheDB. Locale only affects which DB engine the running process selects; it is never written to disk.
(b) In-memory config/membership checks — disabledApiList, HttpMethod string comparisons in Util / access filters, Account.accountType switch. Comparison results are consumed in-process and never persisted.
(c) Display/formatting — Hex.toHexString().toUpperCase() in DataWord, help-text formatting in Args. Output to logs/console, never read back.
(d) Host-environment probing — os.name / os.arch / java.vendor reads in Arch, WalletUtils.getDefaultKeyDirectory, KeystoreFactory. Local environment branching never crosses nodes.

None of these reach disk or cross-node boundaries, so the migration scope (AccountIdIndexStore only) is exact, not approximate.

Why `MigrateTurkishKeyHelper` is bundled in

A code-only fix does not fully cover "nodes that were ever started under a non-default locale" — such nodes may carry legacy-format keys in their DB. The cleanest way to keep the code fix and the data state in sync is a one-shot normalization at startup.

Design notes:

Follows the MoveAbiHelper pattern, gated by a TURKISH_KEY_MIGRATION_DONE flag in DynamicPropertiesStore so it runs at most once.
Full-table scan (AccountIdIndexStore is a sparse index with very low cardinality — only 14 entries on mainnet today; scan cost is negligible).
For each non-ROOT-form key: write a new entry under its ROOT-form key, then delete the legacy one.
Conflict policy: if the ROOT-form key already exists, keep the ROOT entry and drop the legacy value — simple, auditable, and equivalent to the state of a node that was never affected.

For nodes that were never affected, the scan yields zero candidates and the helper is a no-op.

Why the simplified normalization strategy

In principle one could enumerate every possible legacy variant of each ROOT-canonical key and merge them, but that enters a combinatorial space (per-key variant count can reach 2^k, where k is the occurrence count of the divergent character). Even with full enumeration, picking the correct value across conflicts requires knowing the historical insertion order — information the DB layer does not retain.

The conservative strategy chosen here:

Simple and auditable
Post-migration local semantics equal those of a node that only ever ran under ROOT
In the extremely rare conflict case, one legacy value may be dropped — an acceptable cost under case-insensitive index semantics

Impact

Mainnet / Nile. Safe to include in any normal release. The code changes produce no byte-level behavioral difference in the expected locale environment; the migration scans a trivially small table and is a no-op for nodes that were never affected.
Node operators. You'll see a one-time migration line in the startup log; it won't run again on subsequent restarts.
Long term. The ErrorProne rule ensures no future patch can reintroduce no-arg lowercase/uppercase.

Note for downstream fork chains

A downstream fork chain needs to treat this PR as a non-transparent upgrade only if both of the following hold:

All of its consensus nodes have historically run under the same minority locale that diverges in lowercase (e.g. all on tr/az).
Its history includes a committed SetAccountId transaction whose accountId contains a character that triggers the locale divergence (e.g. I).

On such a chain, the AccountIdIndexStore historical state was produced under non-ROOT folding; after this PR, has() lookups for the same accountId may yield different verdicts, which requires a coordinated hard-fork activation height and operator-defined historical data migration strategy.

Mainnet and Nile are not in this category: condition (1) has never held historically.

Release scope

Ship as a normal release — no extra coordination required.
The ErrorProne rule takes effect with this PR; any no-arg lowercase/uppercase in future patches will be blocked at compile time.

lxcmyf

Review findings：

yanghang8612 · 2026-04-27T02:37:54Z

The three-layer fix is well-shaped: Locale.ROOT substitution is byte-for-byte neutral within the input domain established by the PR description (the only ASCII codepoint that diverges across locales is I, only on tr/az), the ErrorProne StringCaseLocaleUsage rule plus the custom StringCaseLocaleUsageMethodRef checker close the regression door at compile time (including the method-reference form upstream misses), and MigrateTurkishKeyHelper — gated by TURKISH_KEY_MIGRATION_DONE, modelled on MoveAbiHelper, with a documented conflict policy — is the right shape for the only persisted site.

One angle worth recording in the PR body for future readers: the "only one site reaches persistent storage" claim is independently verifiable by inspection. Every other touched call site is one of (a) runtime engine selection (dbEngine.toUpperCase() in TronDatabase / TronStoreWithRevoking / TxCacheDB), (b) in-memory config or membership checks (disabledApiList, HttpMethod comparisons, accountType switch in Account), (c) display/formatting (Hex.toHexString().toUpperCase() in DataWord, help-text formatting in Args), or (d) host-environment probing (os.name / os.arch / java.vendor in Arch, WalletUtils, KeystoreFactory). None of these reach disk or cross nodes, so the migration scope is genuinely exact rather than approximate — worth a one-liner in the description so the audit is auditable later without re-running the sweep.

LGTM once the remaining NITs resolve.

halibobo1205 · 2026-04-27T03:18:47Z

@yanghang8612 Thanks for the detailed audit and the clean (a)–(d) categorization — updated the PR description accordingly.

String.toLowerCase()/toUpperCase() without an explicit Locale uses Locale.getDefault(), which on Turkish (tr) or Azerbaijani (az) systems folds 'I' to dotless-ı (U+0131) instead of 'i' (U+0069). Changes: - Fix all toLowerCase()/toUpperCase() calls to use Locale.ROOT - Enable the ErrorProne StringCaseLocaleUsage checker at ERROR level to prevent future regressions at compile time - Add one-time data migration (MigrateTurkishKeyHelper) to normalize all Turkish legacy keys (ı → i) at startup.

github-actions Bot requested review from 317787106, lvs0075, lxcmyf, xxo1shine and yanghang8612 April 23, 2026 03:44

halibobo1205 added topic:vm VM, smart contract Improvement topic:DB Database topic:consensus labels Apr 23, 2026

halibobo1205 added this to the GreatVoyage-v4.8.2 milestone Apr 23, 2026

halibobo1205 assigned lxcmyf, yanghang8612 and 317787106 Apr 23, 2026

alan-eth reviewed Apr 23, 2026

View reviewed changes

Comment thread framework/src/test/java/org/tron/core/db/AccountIdIndexStoreTest.java

lxcmyf reviewed Apr 23, 2026

View reviewed changes

xxo1shine reviewed Apr 23, 2026

View reviewed changes

Comment thread framework/src/main/java/org/tron/core/db/api/MigrateTurkishKeyHelper.java Outdated

xxo1shine reviewed Apr 23, 2026

View reviewed changes

Comment thread errorprone/src/main/java/errorprone/StringCaseLocaleUsageMethodRef.java Outdated

Comment thread errorprone/src/main/java/errorprone/StringCaseLocaleUsageMethodRef.java Outdated

halibobo1205 requested review from lxcmyf and xxo1shine April 24, 2026 07:04

halibobo1205 added 3 commits April 28, 2026 13:56

refactor: consolidate Turkish key migration loops and use ASTHelpers

718c6e2

test(toolkit): fix toLowerCase

49bb2c1

halibobo1205 force-pushed the fix/locale-sensitive-case branch from 888cbc2 to 49bb2c1 Compare April 28, 2026 06:13

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(consensus,framework,actuator): use Locale.ROOT for case-insensitive#6698

fix(consensus,framework,actuator): use Locale.ROOT for case-insensitive#6698
halibobo1205 wants to merge 3 commits intotronprotocol:developfrom
halibobo1205:fix/locale-sensitive-case

halibobo1205 commented Apr 23, 2026 •

edited

Loading

Uh oh!

Uh oh!

lxcmyf left a comment •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

yanghang8612 commented Apr 27, 2026

Uh oh!

halibobo1205 commented Apr 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

Conversation

halibobo1205 commented Apr 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What this fix really does

Scope

Auditability of the migration scope

Why MigrateTurkishKeyHelper is bundled in

Why the simplified normalization strategy

Impact

Note for downstream fork chains

Release scope

Uh oh!

Uh oh!

lxcmyf left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

yanghang8612 commented Apr 27, 2026

Uh oh!

halibobo1205 commented Apr 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

halibobo1205 commented Apr 23, 2026 •

edited

Loading

Why `MigrateTurkishKeyHelper` is bundled in

lxcmyf left a comment •

edited

Loading