Unfold continuation lines before UTF-8 decode #159

Open
peter-boden wants to merge 1 commit into fedora-java:master from peter-boden:fix_continuation

@peter-boden

I ran into a manifest in which line folding split a UTF-8 multibyte sequence across a continuation line. Decoding the raw content before unfolding then fails, and the parser crashes with a UnicodeDecodeError.

How to reproduce:

podman run --rm quay.io/ovirt/buildcontainer:el10stream bash -lc 'set -e; VERSION=1.9.18; mvn -q dependency:get -Dartifact=org.apache.maven.resolver:maven-resolver-util:${VERSION} -Dmaven.repo.local=/tmp/m2; python3 -c "from javapackages.common.manifest import Manifest; Manifest(\"/tmp/m2/org/apache/maven/resolver/maven-resolver-util/${VERSION}/maven-resolver-util-${VERSION}.jar\")"'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.12/site-packages/javapackages/common/manifest.py", line 48, in __init__
    self._manifest = self._read_manifest()
                     ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/javapackages/common/manifest.py", line 70, in _read_manifest
    return content.decode("utf-8")
           ^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc3 in position 1767: invalid continuation byte
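
The same failure can be shown without the container. The snippet below is a minimal sketch, not code from javapackages: the `Name:` header and the value are made up for illustration, with the two-byte "é" (0xC3 0xA9) split by a CRLF plus space continuation.

```python
# Hypothetical folded manifest line: the value "Boué" whose two-byte "é"
# (0xC3 0xA9) has been split by a CRLF + single-space continuation.
folded = b"Name: Bou\xc3\r\n \xa9\r\n"

error = None
try:
    # Decoding before unfolding sees 0xC3 followed by "\r", which is not
    # a valid UTF-8 continuation byte.
    folded.decode("utf-8")
except UnicodeDecodeError as exc:
    error = exc
    print(f"byte {folded[exc.start]:#x} at position {exc.start}: {exc.reason}")
```

The position differs from the traceback above because the sample is much shorter than the real MANIFEST.MF, but the offending byte and the reason match.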

Changes introduced:

  • process MANIFEST.MF as bytes and join continuation lines before decode
  • simplify _normalize_manifest to line split/strip only
  • add regression tests

Fix UnicodeDecodeError when manifest continuation folding splits UTF-8
multibyte sequences across line boundaries (e.g. "Bou\xc3\r\n \xa9").

- process MANIFEST.MF as bytes and join continuation lines before decode
- simplify _normalize_manifest to line split/strip only
- add regression tests
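
For reviewers, the idea behind the change can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `unfold_and_decode` and `normalize` are hypothetical names, and the sketch assumes a continuation is a line break (CRLF, CR, or LF) followed by exactly one space.

```python
import re

# Assumed continuation marker: a line break followed by a single space.
# The alternation tries CRLF first so a "\r\n " pair is consumed whole.
_CONTINUATION = re.compile(rb"(?:\r\n|\r|\n) ")


def unfold_and_decode(raw: bytes) -> str:
    """Join continuation lines on the raw bytes, then decode as UTF-8."""
    return _CONTINUATION.sub(b"", raw).decode("utf-8")


def normalize(text: str) -> list[str]:
    """Split into lines and strip them, dropping empty lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# Hypothetical manifest content with "é" split across a continuation.
raw = b"Manifest-Version: 1.0\r\nBuilt-By: Bou\xc3\r\n \xa9\r\n"
lines = normalize(unfold_and_decode(raw))
print(lines)
```

Because the unfolding happens on bytes, the two halves of the multibyte sequence are adjacent again by the time `decode` runs, so the later normalization step only has to split and strip lines.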