Download fails with 400, message='Can not decode content-encoding: gzip' #121
Can you give me a full printout of the HTTP response headers? (Either with httpie/curl or debug logging?)
If it helps, using aiohttp directly works:

import aiohttp
import asyncio

async def main():
    async with aiohttp.ClientSession() as session:
        async with session.get('https://sohoftp.nascom.nasa.gov/solarsoft/sdo/aia/response/aia_V9_20200706_215452_response_table.txt') as response:
            print("Status:", response.status)
            print([f"{key}: {response.headers[key]}" for key in response.headers])
            html = await response.text()
            print("Body:", html, "...")

asyncio.run(main())
The filename in my original post was actually not the right one (fixed now), but I believe both show the same issue.
Here's the curl request for a gzip-ed response, piped to gunzip to decompress it:
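Presumably something along these lines (the exact flags and URL are assumed here, since the original command wasn't quoted; Accept-Encoding has to be sent explicitly for the server to gzip the body):

curl -s -H 'Accept-Encoding: gzip' https://sohoftp.nascom.nasa.gov/solarsoft/sdo/aia/response/aia_V9_20200706_215452_response_table.txt | gunzip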
That is, the server appears to be returning a valid gzip-ed response, which is consistent with the fact that aiohttp by itself appears to have no problem.
Now that I've added a stream handler to the parfive logger so that I can actually see the parfive debug logging, it's clearer what is going on. The parfive error is related to the fact that parfive is splitting this already tiny file into super-tiny 72-byte range requests. There is no error if you explicitly tell parfive not to split the download. Here's example parfive debug output:
Note that it appeared to error on just the second 72-byte chunk here. Seemingly that chunk sometimes succeeds and a different chunk errors instead; I'm not sure why some chunks fail only some of the time.
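For reference, surfacing parfive's debug logging only needs a handler from the standard logging module; a minimal sketch (assuming the library's loggers live under the name 'parfive'):

import logging

# Attach a stream handler so parfive's debug messages reach the console.
log = logging.getLogger('parfive')
log.setLevel(logging.DEBUG)
log.addHandler(logging.StreamHandler())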
Ah, it's actually failing on any chunk other than the first one. Here's what I believe is happening, and it traces back to aiohttp. Since the server serves this file gzip-compressed, the requested byte ranges apply to the compressed representation, so the bytes in a given range aren't the same as the corresponding bytes of the plain text. Setting aside the fact that the bytes aren't the same, the problem is that each individual chunk is not a complete gzip-compressed payload, but rather just a partial segment of one. However, aiohttp sees Content-Encoding: gzip on each ranged response and tries to decompress every chunk on its own, which can only (partially) work for the first chunk.
Using curl to request bytes 0–71, then piped to gunzip:
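Reconstructed command (the exact invocation and URL are assumptions; the first 72 bytes begin with the gzip header, so gunzip can at least start decoding them):

curl -s -H 'Accept-Encoding: gzip' -H 'Range: bytes=0-71' https://sohoftp.nascom.nasa.gov/solarsoft/sdo/aia/response/aia_V3_error_table.txt | gunzip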
Using curl to request a later chunk, then piped to gunzip, to mimic parfive and aiohttp:
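Again reconstructed, using the same later range that appears in the Python example below; this chunk starts mid-stream, so there is no gzip header for gunzip to work with:

curl -s -H 'Accept-Encoding: gzip' -H 'Range: bytes=142-213' https://sohoftp.nascom.nasa.gov/solarsoft/sdo/aia/response/aia_V3_error_table.txt | gunzip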
Here's the modified version of @alasdairwilson's example to show that the problem is with aiohttp, not with parfive:

>>> import asyncio
>>> import aiohttp
>>>
>>> async def main():
... async with aiohttp.ClientSession(headers={'Range': 'bytes=142-213'}) as session:
... async with session.get('https://sohoftp.nascom.nasa.gov/solarsoft/sdo/aia/response/aia_V3_error_table.txt') as response:
... print("\n".join([f"{key}: {response.headers[key]}" for key in response.headers]))
... html = await response.text()
... print(f"Body:\n{html}")
...
>>> asyncio.run(main())
Date: Wed, 11 Jan 2023 21:13:24 GMT
Server: Apache
Strict-Transport-Security: max-age=31536000; includeSubdomains;
Last-Modified: Thu, 27 Sep 2012 13:22:00 GMT
Etag: "72d-4caaed300ae00-gzip"
Accept-Ranges: bytes
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Range: bytes 142-213/356
Content-Length: 72
Content-Type: text/plain
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\ayshih\AppData\Local\mambaforge\envs\test4\lib\asyncio\runners.py", line 44, in run
return loop.run_until_complete(main)
File "C:\Users\ayshih\AppData\Local\mambaforge\envs\test4\lib\asyncio\base_events.py", line 646, in run_until_complete
return future.result()
File "<stdin>", line 5, in main
File "C:\Users\ayshih\AppData\Local\mambaforge\envs\test4\lib\site-packages\aiohttp\client_reqrep.py", line 1081, in text
await self.read()
File "C:\Users\ayshih\AppData\Local\mambaforge\envs\test4\lib\site-packages\aiohttp\client_reqrep.py", line 1037, in read
self._body = await self.content.read()
File "C:\Users\ayshih\AppData\Local\mambaforge\envs\test4\lib\site-packages\aiohttp\streams.py", line 349, in read
raise self._exception
aiohttp.client_exceptions.ClientPayloadError: 400, message='Can not decode content-encoding: gzip'
Okay, okay, okay. It is possible to turn off the automatic decompression of gzip-compressed responses by aiohttp by specifying (surprise) auto_decompress=False when creating the ClientSession.
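A minimal sketch of that approach for ranged downloads (illustrative only; the specific ranges and the reassembly logic are assumptions, not code from this thread; the 356-byte total comes from the Content-Range header shown above):

import asyncio
import gzip

import aiohttp

URL = "https://sohoftp.nascom.nasa.gov/solarsoft/sdo/aia/response/aia_V3_error_table.txt"

async def main():
    # Disable aiohttp's automatic gzip decoding so each ranged chunk comes
    # back as raw compressed bytes instead of being decompressed on its own.
    async with aiohttp.ClientSession(auto_decompress=False) as session:
        chunks = []
        for start, end in [(0, 177), (178, 355)]:  # example split of the 356 compressed bytes
            headers = {"Accept-Encoding": "gzip", "Range": f"bytes={start}-{end}"}
            async with session.get(URL, headers=headers) as response:
                chunks.append(await response.read())
        # Only the reassembled stream is a complete gzip payload.
        print(gzip.decompress(b"".join(chunks)).decode())

asyncio.run(main())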
Does this also happen when you set …?

Just tried it out and that works (it does not throw an exception).
The download succeeds with that setting.
When trying to download a (seemingly) plain text file, the download fails and I get the error in the title: 400, message='Can not decode content-encoding: gzip'
The code to reproduce this is:
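Roughly the following (reconstructed; the exact snippet and the file URL from the original report are assumptions):

from parfive import Downloader

# Queue the text file and download it into the current directory.
dl = Downloader()
dl.enqueue_file(
    "https://sohoftp.nascom.nasa.gov/solarsoft/sdo/aia/response/aia_V9_20200706_215452_response_table.txt",
    path=".",
)
files = dl.download()
print(files)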
parfive version: 2.0.2
Python version: 3.9
OS: macOS 12.6.2
Solution detailed here: #121 (comment)