Improve `pathname2url()` and `url2pathname()` docs #127125
These functions have long sown confusion among Python developers. Even in the urllib implementation and tests, they seem to be used in contradictory ways. A test helper named `sanepathname2url()` has been with us since 2004!

The existing documentation says that these functions deal with URL path components. But that doesn't fit the evidence on Windows:

    >>> pathname2url(r'C:\foo')
    '///C:/foo'
    >>> pathname2url(r'\\server\share')
    '////server/share'  # or '//server/share' as of quite recently

If these were URL path components, they would imply complete URLs like `file://///C:/foo` and `file://////server/share`. Clearly this isn't right.

The conclusion I draw is that these functions operate on everything after the `file:` prefix, which may include an authority section.
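To make the "everything after the `file:` prefix" reading concrete, here is a minimal sketch of a round trip built on that assumption. The helper names `path_to_file_url()` and `file_url_to_path()` are hypothetical and not part of urllib; the example path is also made up. The exact URL text depends on the platform and Python version, so the sketch only checks that the round trip recovers the original path.

```python
import os
from urllib.request import pathname2url, url2pathname


def path_to_file_url(path):
    """Hypothetical helper: prepend only 'file:' to the pathname2url() result."""
    return "file:" + pathname2url(path)


def file_url_to_path(url):
    """Hypothetical helper: strip 'file:' and hand the rest to url2pathname()."""
    prefix = "file:"
    if not url.startswith(prefix):
        raise ValueError(f"not a file URL: {url!r}")
    return url2pathname(url[len(prefix):])


# An absolute path containing a space, so that quoting is exercised.
local_path = os.path.abspath(os.path.join("some dir", "notes.txt"))

url = path_to_file_url(local_path)
round_tripped = file_url_to_path(url)

print(url)
print(round_tripped == local_path)
```

Note that the helpers never add or remove slashes themselves; any authority section is left entirely to `pathname2url()` and `url2pathname()`.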
I think the confusion came about because the original [...] Edit: actually, it might be due to a 90s-era misunderstanding between two devs.
LGTM.
Thanks @barneygale for the PR 🌮🎉.. I'm working now to backport this PR to: 3.12, 3.13.
These functions have long sown confusion among Python developers. The existing documentation says they deal with URL path components, but that doesn't fit the evidence on Windows:

    >>> pathname2url(r'C:\foo')
    '///C:/foo'
    >>> pathname2url(r'\\server\share')
    '////server/share'  # or '//server/share' as of quite recently

If these were URL path components, they would imply complete URLs like `file://///C:/foo` and `file://////server/share`. Clearly this isn't right. Yet the implementation in `nturl2path` is deliberate, and the `url2pathname()` function correctly inverts it.

On non-Windows platforms, the behaviour until quite recently is to simply quote/unquote the path without adding or removing any leading slashes. This behaviour is compatible with *both* interpretations -- 1) the value is a URL path component (existing docs), and 2) the value is everything following `file:` (this commit)

The conclusion I draw is that these functions operate on everything after the `file:` prefix, which may include an authority section. This is the only explanation that fits both the Windows and non-Windows behaviour. It's also a better match for the function names.

(cherry picked from commit 307c633)

Co-authored-by: Barney Gale <[email protected]>
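The two interpretations can be compared by parsing the URLs each one implies. This is only an illustrative sketch: the Windows value is the one quoted in the commit message, the POSIX value `/tmp/a%20b` is a made-up example of a quoted path, and the function names are hypothetical. `urllib.parse.urlsplit()` is used purely as a neutral parser.

```python
from urllib.parse import urlsplit

# Value quoted in the commit message for pathname2url(r'C:\foo').
windows_value = "///C:/foo"
# Hypothetical quoted POSIX path, with no leading slashes added or removed.
posix_value = "/tmp/a%20b"


def as_path_component(value):
    """Interpretation 1: the value is a URL path, placed after 'file://'."""
    return urlsplit("file://" + value)


def as_after_scheme(value):
    """Interpretation 2: the value is everything following 'file:'."""
    return urlsplit("file:" + value)


# The POSIX value yields the same path under both interpretations.
print(as_path_component(posix_value).path)   # /tmp/a%20b
print(as_after_scheme(posix_value).path)     # /tmp/a%20b

# The Windows value only yields a sensible path under interpretation 2.
print(as_path_component(windows_value).path)  # ///C:/foo
print(as_after_scheme(windows_value).path)    # /C:/foo
```

Under interpretation 2 the Windows value parses as an empty authority followed by the path `/C:/foo`, which is consistent with the conclusion drawn above.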
GH-127232 is a backport of this pull request to the 3.13 branch.
GH-127233 is a backport of this pull request to the 3.12 branch.
…#127232) Improve `pathname2url()` and `url2pathname()` docs (GH-127125) (cherry picked from commit 307c633) Co-authored-by: Barney Gale <[email protected]>
…#127233) Improve `pathname2url()` and `url2pathname()` docs (GH-127125) (cherry picked from commit 307c633) Co-authored-by: Barney Gale <[email protected]>
📚 Documentation preview 📚: https://cpython-previews--127125.org.readthedocs.build/