perf: make StringProtocol.range(of:) scan UTF-8 bytes#325
Conversation
The previous implementation walked the receiver Character by Character, recomputing `distance(from:to:)` on every outer iteration and comparing grapheme clusters one at a time. That made it O(n²) in the receiver length and extremely slow for long strings — a hot callsite in Tuist's XCActivityLog parser (three `range(of:)` calls per diagnostic) was pinning dirty CPU scheduler threads in production, taking ~16 ms per call on ~700 character messages. Scan the UTF-8 byte view instead. UTF-8 is self-synchronizing, so byte-level matches are Unicode-scalar-aligned and the returned `String.Index` values remain valid on the receiver. All index operations on `String.UTF8View` are O(1), and byte comparisons are dramatically cheaper than Character comparisons. Benchmark (6000 realistic diagnostic messages × 3 patterns = 18 000 calls, release build, Apple Silicon): before: 286.04 s (~16 ms per call) after: 0.044 s (~2.4 µs per call) ~6500× speed-up on the benchmark and full test suite (98 tests) still passes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
52ba687 to
c801c75
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c801c750d8
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| let haystack = utf8 | ||
| let needle = string.utf8 | ||
| let nCount = needle.count |
There was a problem hiding this comment.
Preserve canonical-equivalent Unicode matches
Switching range(of:) to raw UTF-8 byte comparison changes semantics from character equality to byte equality, which regresses canonical-equivalent matching. For example, a haystack containing decomposed "Cafe\u{301}" used to match a precomposed needle "Café" (via Character equality) but now returns nil. This can cause real substring misses on normalized-vs-decomposed Unicode inputs even though the text is canonically the same.
Useful? React with 👍 / 👎.
Three public API-shape hazards removed because they win overload resolution against Foundation or stdlib and silently replace caller intent, the same pattern that caused the O(n^2) range(of:) regression (PR #325): - String.init(_ bytes: [UInt8]) and String.init(_ bytes: ArraySlice<UInt8>) in Binary.Serializable — any call site doing String(someByteArray) silently resolves to these lossy UTF-8 decoders (String(decoding: bytes, as: UTF8.self)), swallowing invalid bytes as U+FFFD without the caller opting in. - Sequence.count(where:) — duplicate of the Swift 6 stdlib built-in Sequence.count(where:); a competing declaration creates overload-resolution ambiguity on Swift 6 consumers with no functional benefit. - Sequence.max(count:) and Sequence.min(count:) — same signature as swift-algorithms, causing ambiguous-overload errors for consumers that depend on both packages. The implementation is also O(N * K log K) due to a full sort inside the per-element loop instead of a partial-sort/heap, which is a footgun for any large-N caller. The only internal use of String.init(_ bytes: [UInt8]) was in SetCharacter+INCITS_4_1986.swift to build the CRLF Character, which is rewritten to call String(decoding:as:) explicitly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
StandardLibraryExtensions.StringProtocol.range(of:)previously walked the receiverCharacterbyCharacter, callingdistance(from:to:)on every outer iteration — effectively O(n²) in the receiver length and dominated by grapheme-cluster comparison cost.processor/native/xcactivitylog_nif/Sources/XCActivityLogParser/XCActivityLogParser.swift:181-183) calls it three times per diagnostic ("warning: ","error: "," ^~"). On builds with thousands of diagnostics (Depop ~6800, Pinterest ~700, monday ~280) this was pinning every dirty CPU scheduler thread (ERL_NIF_DIRTY_JOB_CPU_BOUNDinnif_bridge.c:101) in production, matching theperf/gdbhotspot reported:StandardLibraryExtensions.range(of:)→String.distance(from:to:).String.Indexvalues from theutf8view remain valid on the receiver.String.UTF8Viewindex ops are O(1), so the inner loop is a tight byte-comparison scan instead of Character compares plus repeateddistance(...)calls.StandardLibraryExtensionsTeststarget (no test coverage existed forStandardLibraryExtensionsbefore) with correctness tests covering: empty pattern, pattern-longer-than-receiver, match at start/middle/end, non-matching, backtrack ("aaab".range(of: "aab")), Substring receivers, ASCII pattern inside a Unicode haystack, and Unicode pattern.Benchmark
Unit test
test_benchmark_many_issue_messages— 6,000 realistic diagnostic messages (~700 chars each, warning + error + plain templates) × 3range(of:)calls = 18,000 calls, release build, Apple Silicon:~6,500× faster. Full package test suite (98 tests) passes.
Extrapolating to the worst offender in the report — Depop
ae9124af-...with ~6,805 issues — pre-fix that one build burned ~5.4 minutes of CPU insiderange(of:)alone; post-fix it's ~50 ms.Test plan
swift test -c release— 98/98 passStringProtocolRangeOfTests— 11/11 pass (ASCII, Unicode, Substring)🤖 Generated with Claude Code