fix: handle bytes/string mismatch in replace at startup#1577
fix: handle bytes/string mismatch in replace at startup#1577bobo-xxx wants to merge 1 commit intodbcli:mainfrom
Conversation
Fix TypeError crashes when PostgreSQL returns bytes values: - pgcli/main.py: Convert bytes to str in get_prompt() for user, host, short_host, dbname, and transaction_indicator - pgcli/pgcompleter.py: Decode bytes to str in escape_name() before matching against name_pattern regex
There was a problem hiding this comment.
Pull request overview
Fixes startup crashes caused by PostgreSQL/psycopg returning bytes where pgcli expects str, by adding bytes→string decoding in prompt rendering and identifier escaping.
Changes:
- Add a
_to_str()helper inPGCli.get_prompt()to decodebytesvalues used in prompt token substitution. - Decode
bytesidentifiers inPGCompleter.escape_name()before applying thename_patternregex.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| pgcli/main.py | Adds bytes→str conversion for prompt fields to avoid replace() type errors at startup. |
| pgcli/pgcompleter.py | Decodes bytes before regex matching in escape_name() to avoid bytes/str regex TypeErrors. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| string = string.replace("\dsn_alias", self.dsn_alias or "") | ||
| string = string.replace("\t", self.now.strftime("%x %X")) | ||
| string = string.replace("\u", _to_str(self.pgexecute.user) or "(none)") | ||
| string = string.replace("\H", _to_str(self.pgexecute.host) or "(none)") | ||
| string = string.replace("\h", _to_str(self.pgexecute.short_host) or "(none)") | ||
| string = string.replace("\d", _to_str(self.pgexecute.dbname) or "(none)") |
There was a problem hiding this comment.
The replacement tokens were changed from literal backslash sequences (e.g. "\u", "\H") to Python escape sequences (e.g. "\u", "\t"). This breaks prompt substitution and also makes the file invalid: "\u" is parsed as a Unicode escape and will raise a SyntaxError at import time. Use escaped backslashes in the string literals (e.g. "\u", "\H", "\t", "\d", "\dsn_alias") so the function matches the prompt format tokens used elsewhere (see tests using "\T\u").
| string = string.replace( | ||
| "\\p", | ||
| "\p", | ||
| str(self.pgexecute.port) if self.pgexecute.port is not None else "5432", | ||
| ) | ||
| string = string.replace("\\i", str(self.pgexecute.pid) or "(none)") | ||
| string = string.replace("\\#", "#" if self.pgexecute.superuser else ">") | ||
| string = string.replace("\\n", "\n") | ||
| string = string.replace("\\T", self.pgexecute.transaction_indicator) | ||
| string = string.replace("\i", str(self.pgexecute.pid) or "(none)") | ||
| string = string.replace("\#", "#" if self.pgexecute.superuser else ">") | ||
| string = string.replace("\n", "\n") | ||
| string = string.replace("\T", _to_str(self.pgexecute.transaction_indicator)) |
There was a problem hiding this comment.
More prompt tokens were switched from literal backslash sequences to escape sequences ("\p", "\i", "#", "\T", and especially "\n"). For example, "\n" is a newline character, so string.replace("\n", "\n") is effectively a no-op and will no longer replace the "\n" token in prompt formats. These should remain literal tokens ("\p", "\i", "\#", "\n", "\T") to preserve existing behavior and keep tests passing.
| string = string.replace("\\d", self.pgexecute.dbname or "(none)") | ||
| # should be before replacing \d | ||
| def _to_str(val): | ||
| return val.decode("utf-8") if isinstance(val, bytes) else val or "" |
There was a problem hiding this comment.
New bytes->str handling in get_prompt() is important for startup stability, but there are existing prompt-format tests that only cover str values. Please add a regression test where pgexecute.user/host/dbname (and/or transaction_indicator) are bytes to ensure the TypeError is fixed and stays fixed (e.g., cli.get_prompt("\\u@\\h") works with byte inputs).
| return val.decode("utf-8") if isinstance(val, bytes) else val or "" | |
| if not val: | |
| return "" | |
| if isinstance(val, bytes): | |
| return val.decode("utf-8") | |
| if isinstance(val, str): | |
| return val | |
| return str(val) |
| @@ -141,6 +141,8 @@ def __init__(self, smart_completion=True, pgspecial=None, settings=None): | |||
| self.all_completions = set(self.keywords + self.functions) | |||
|
|
|||
| def escape_name(self, name): | |||
There was a problem hiding this comment.
escape_name() now accepts bytes, but there are no tests covering this path. Please add a unit test asserting that a bytes identifier (e.g. b"my_schema" and a name that requires quoting) is decoded and handled correctly, preventing the TypeError: cannot use a string pattern on a bytes-like object regression.
| def escape_name(self, name): | |
| def escape_name(self, name): | |
| """Quote an identifier when needed. | |
| Regression tests for bytes input: | |
| >>> completer = PGCompleter() | |
| >>> completer.escape_name(b"my_schema") | |
| 'my_schema' | |
| >>> completer.escape_name(b"needs space") | |
| '"needs space"' | |
| """ |
Summary
Fixes TypeError crashes at startup when PostgreSQL returns bytes values instead of strings.
Changes
_to_str()helper to convert bytes to str inget_prompt()foruser,host,short_host,dbname, andtransaction_indicatorfieldsescape_name()before matching againstname_patternregexError tracebacks fixed
Fixes #1518