Reproducer
Tested against tomd master at commit ad567e3.
Reproduce
tomd p3984r0.pdf --outdir out/
sed -n '/^### 6\. References/,/^### 7/p' out/p3984r0.md | head -20
The References section is rendered as a single line with • separators rather than as a vertical list.
Symptom
Bibliography entries that should be on individual lines are joined onto a single line. Excerpt:
• [BS21] B. Stroustrup: [Type-and-resource safety in modern C++](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2410r0.pdf) • [BS22b] B. Stroustrup and G. Dos Reis: [Design Alternatives for Type-and-Resource Safe C++](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2687r0.pdf). WG21 P2687R0. 2022. • [BS22c] B. Stroustrup: A Tour of C++ (3rd Edition). Addison-Wesley 2022. ISBN 978-0136816485. • [BS23] B. Stroustrup and G. Dos Reis: [Safety Profiles: Type-and-resource Safe](url) [programming in ISO Standard C++](url). P2816R0. 2023-02-16 • [BS23a] B. Stroustrup and G. Dos Reis: [Safety Profiles: Type-and-resource Safe](url) [programming in ISO Standard C++](url). P2816R0. 2023-02-16. • ...
Expected
Each bibliography entry on its own line, prefixed with - (or • if preserving the original bullet character). docling on the same PDF produces:
- [BS21] B. Stroustrup: Type-and-resource safety in modern C++
- [BS22b] B. Stroustrup and G. Dos Reis: Design Alternatives for Type-and-Resource Safe C++. WG21 P2687R0. 2022.
- [BS22c] B. Stroustrup: A Tour of C++ (3rd Edition). Addison-Wesley 2022. ISBN 9780136816485.
- [BS23] B. Stroustrup and G. Dos Reis: Safety Profiles: Type-and-resource Safe programming in ISO Standard C++. P2816R0. 2023-02-16
- [BS23a] B. Stroustrup and G. Dos Reis: Safety Profiles: Type-and-resource Safe programming in ISO Standard C++. P2816R0. 2023-02-16.
- [BS23b] B. Stroustrup: Concrete suggestions for initial Profiles. P3038R0. 2023-12-16.
Impact
- Duplicate-entry detection becomes structurally harder when entries cannot be compared line by line. For p3984r0, a reviewer (PeterTurcan) confirmed two real bibliography defects: [BS23] and [BS23a] are identical entries, and two [BS24] entries point at different works. The discovery LLM missed both when scanning the flattened paragraph that tomd produced.
- Any downstream tool that walks bibliography entries by line will fail.
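To illustrate why per-entry structure matters, here is a minimal Python sketch of the duplicate check that becomes straightforward once entries are separated. It is not part of tomd. The function names, the normalization rule (case-insensitive, ignoring a trailing full stop), and the [X1] placeholder entries are assumptions made for illustration. It also assumes that • never occurs inside an entry.

```python
import re
from collections import defaultdict

# Matches a leading citation key such as "[BS23a]" followed by the entry body.
ENTRY = re.compile(r"^\[([^\]]+)\]\s*(.*)$", re.DOTALL)


def split_entries(flattened: str) -> list[str]:
    """Split a flattened References line on the bullet separator."""
    parts = (p.strip() for p in flattened.split("•"))
    return [p for p in parts if p]


def normalize(body: str) -> str:
    """Collapse whitespace, drop a trailing full stop, and lowercase."""
    return re.sub(r"\s+", " ", body).strip().rstrip(".").lower()


def find_duplicates(entries: list[str]):
    """Return (groups of keys sharing one body, keys reused for different bodies)."""
    by_body = defaultdict(list)
    by_key = defaultdict(set)
    for entry in entries:
        m = ENTRY.match(entry)
        if not m:
            continue  # not a "[KEY] text" entry; skipped in this sketch
        key, body = m.group(1), normalize(m.group(2))
        by_body[body].append(key)
        by_key[key].add(body)
    same_body = [keys for keys in by_body.values() if len(keys) > 1]
    reused_keys = [k for k, bodies in by_key.items() if len(bodies) > 1]
    return same_body, reused_keys


# [BS23]/[BS23a] are taken from the expected output above; [X1] is a
# hypothetical placeholder standing in for a key reused for two works.
sample = (
    "• [BS23] B. Stroustrup and G. Dos Reis: Safety Profiles: Type-and-resource "
    "Safe programming in ISO Standard C++. P2816R0. 2023-02-16 "
    "• [BS23a] B. Stroustrup and G. Dos Reis: Safety Profiles: Type-and-resource "
    "Safe programming in ISO Standard C++. P2816R0. 2023-02-16. "
    "• [X1] Placeholder work A. • [X1] Placeholder work B."
)

same_body, reused_keys = find_duplicates(split_entries(sample))
print(same_body)
print(reused_keys)
```

With one entry per line, the split step would reduce to reading lines, and the same comparison would work on any converter's output without knowing its separator.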
Uncertainty signal
p3984r0.prompts.md is written for this paper (1KB), but it flags the Acknowledgements section on page 11 — not the References section. The bibliography flattening is not surfaced as an uncertain region.
Hypothesis on root cause
This looks like one of three symptoms of the same classifier-confidence bug. See the two companion issues filed alongside this one (bullets becoming deep headings; definition prose wrapped word-by-word in backticks). All three involve over-aggressive structural classification of ambiguous content.