_subprocess.py stdout reading may corrupt UTF-8 characters and then fail when decoding the data #573
Labels: bug (Something isn't working), component:dep-sources (Dependency sources), duplicate (This issue or pull request already exists)
Bug description
_subprocess.py may not read the whole contents of the stdout and stderr pipes. Attempting to decode the resulting incomplete stdout data as UTF-8 might lead to a UnicodeDecodeError.
Reproduction steps
mock_pip.py
Generates output with a two-byte UTF-8 character split "in half" at the buffer boundary.
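A rough sketch of such a generator, not the original script: it writes a payload in which the two-byte character U+00E9 straddles an assumed 4096-byte boundary. The CHUNK_SIZE value and the build_payload name are illustrative assumptions and are not taken from pip-audit.

```python
# Hypothetical stand-in for mock_pip.py; names and sizes are assumptions.
import sys

CHUNK_SIZE = 4096  # assumed reader buffer size, not taken from _subprocess.py


def build_payload(chunk_size: int = CHUNK_SIZE) -> bytes:
    """Return bytes whose two-byte character straddles the chunk boundary."""
    two_byte_char = "\u00e9".encode("utf-8")  # encodes to b"\xc3\xa9"
    # Pad so that only the first byte of the character fits in the first chunk.
    padding = b"a" * (chunk_size - 1)
    return padding + two_byte_char + b"\n"


if __name__ == "__main__":
    # Write raw bytes so the output encoding does not depend on the locale.
    sys.stdout.buffer.write(build_payload())
    sys.stdout.buffer.flush()
```

Decoding only the first CHUNK_SIZE bytes of this payload raises UnicodeDecodeError, while decoding the complete payload succeeds.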
subprocess_run_isolated.py
Minimal reproducible example based on _subprocess.py:
Running subprocess_run_isolated.py will lead to a UnicodeDecodeError most of the time.
Expected behavior
It should read everything from the pipes before attempting .decode("utf-8"). It should also be able to handle decoding errors in any case, for example if the called process terminates unexpectedly, leaving possibly truncated output.
Screenshots and logs
A typical error message is:
Platform information
pip-audit version (pip-audit -V): v2.5.1, v2.5.2
Python version (python -V or python3 -V): Python 3.11.2
pip version (pip -V or pip3 -V): Python 3.11.2
Additional context
Sporadic UnicodeDecodeErrors popped up during repeated, similar pip-audit (...) --fix runs.
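One possible shape for the expected behavior described above, as a sketch only: drain both pipes completely with Popen.communicate(), then decode once with errors="replace" so truncated output cannot raise. The run_and_decode helper is a hypothetical name and is not part of pip-audit.

```python
# Illustrative sketch; run_and_decode is a hypothetical helper, not pip-audit code.
import subprocess
import sys
from typing import Sequence, Tuple


def run_and_decode(args: Sequence[str]) -> Tuple[int, str, str]:
    """Run a command, read both pipes to EOF, then decode each exactly once."""
    process = subprocess.Popen(
        args, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    # communicate() reads stdout and stderr until EOF and waits for exit,
    # so decoding never sees a buffer that ends in the middle of a character
    # unless the process itself produced truncated output.
    stdout_bytes, stderr_bytes = process.communicate()
    # errors="replace" substitutes U+FFFD for undecodable bytes instead of raising.
    stdout_text = stdout_bytes.decode("utf-8", errors="replace")
    stderr_text = stderr_bytes.decode("utf-8", errors="replace")
    return process.returncode, stdout_text, stderr_text


if __name__ == "__main__":
    code, out, err = run_and_decode([sys.executable, "-c", "print('ok')"])
    print(code, out.strip())
```

Whether replacement characters are acceptable or a decoding failure should instead be reported to the user is a design choice this sketch does not settle.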