Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

auto_decompress not supported as keyword argument for aiohttp<3.9 #716

Open
sebastianmika opened this issue Apr 4, 2024 · 5 comments
Open

Comments

@sebastianmika
Copy link

With the latest release of gcloud-aio-auth/storage (5.3.0) the following breaks when using aiohttp<3.9, which is allowed since the minimal required version is currently aiohttp>=3.3.0.

Potential fixes:

  • Force aiohttp>=3.9
  • Set allow_redirects=allow_redirects when setting up the self.ClientSession object in line 556 of gcloud/aio/storage/storage.py and remove from the .get calls (and probably other places)

Code to reproduce with aiohttp==3.8.6:

from aiohttp import ClientSession
from gcloud.aio.storage import Storage

async with ClientSession() as session:
    async with Storage(session=session) as client:
         await client.download("test", "test")

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[1], line 6
      4 async with ClientSession() as session:
      5     async with Storage(session=session) as client:
----> 6          await client.download("test", "test")

File ~/venvs/lib/python3.11/site-packages/gcloud/aio/storage/storage.py:329, in Storage.download(self, bucket, object_name, headers, timeout, session)
    323 async def download(
    324     self, bucket: str, object_name: str, *,
    325     headers: Optional[Dict[str, Any]] = None,
    326     timeout: int = DEFAULT_TIMEOUT,
    327     session: Optional[Session] = None,
    328 ) -> bytes:
--> 329     return await self._download(
    330         bucket, object_name, headers=headers,
    331         timeout=timeout, params={'alt': 'media'},
    332         session=session,
    333     )

File ~/venvs/lib/python3.11/site-packages/gcloud/aio/storage/storage.py:557, in Storage._download(self, bucket, object_name, params, headers, timeout, session)
    554 headers.update(await self._headers())
    556 s = AioSession(session) if session else self.session
--> 557 response = await s.get(
    558     url, headers=headers, params=params or {},
    559     timeout=timeout,
    560 )
    562 # N.B. the GCS API sometimes returns 'application/octet-stream' when a
    563 # string was uploaded. To avoid potential weirdness, always return a
    564 # bytes object.
    565 try:

File ~/venvs/lib/python3.11/site-packages/gcloud/aio/auth/session.py:214, in AioSession.get(self, url, headers, timeout, params, stream, auto_decompress)
    208 if stream is not None:
    209     log.warning(
    210         'passed unused argument stream=%s to AioSession: '
    211         'this argument is only used by SyncSession',
    212         stream,
    213     )
--> 214 resp = await self.session.get(
    215     url, headers=headers,
    216     timeout=timeout, params=params,
    217     auto_decompress=auto_decompress,
    218 )
    219 await _raise_for_status(resp)
    220 return resp

File ~/venvs/lib/python3.11/site-packages/aiohttp/client.py:922, in ClientSession.get(self, url, allow_redirects, **kwargs)
    917 def get(
    918     self, url: StrOrURL, *, allow_redirects: bool = True, **kwargs: Any
    919 ) -> "_RequestContextManager":
    920     """Perform HTTP GET request."""
    921     return _RequestContextManager(
--> 922         self._request(hdrs.METH_GET, url, allow_redirects=allow_redirects, **kwargs)
    923     )

TypeError: ClientSession._request() got an unexpected keyword argument 'auto_decompress'
@fbalboaDialpad
Copy link
Contributor

Hi @sebastianmika, thanks for bringing this up. Previous to the addition of the auto_decompress param in aiohttp 3.9, there's no way of supporting downloading gzipped files without decompressing them. I consider this to be a valid use case, as you may want to download a gzip file from GCS and serve it from your own server, or save it as-is to disk for later processing or whatever you may want to do with it. Automatically decompressing it just to compress it again makes no sense, so that's why we added the option to fetch it compressed in #714.
@TheKevJames thoughts on forcing aiohttp>=3.9?

@sebastianmika
Copy link
Author

I am seeing your point and the advantage. My challenge (and I am likely not alone) is that I have plenty of aiohttp<3.9 dependencies (internal and external), why forcing 3.9 would come at quite some cost.

Looking at the recent PR I was wondering whether a very similar result couldn't have been achieved by making the user pass in a ClientSession(auto_decompress=False) session into storage = Storage(session=session) explicitly (i.e. the PR maybe wasn't needed?). The only scenario where that would be annoying is when I need to change this behaviour on a call by call basis. But I would not find that very likely to happen in reality.

@fbalboaDialpad
Copy link
Contributor

Yes, that would be a valid option too. I was trying to achieve it on call-by-call basis as we usually have a single storage object for the whole application, and you could have different use cases within it. Still, I see that we do accept a session parameter on both Storage.download and Blob.download, so maybe there's a possible workaround that's not super painful. I also have to take into account how this will work with the requests.Session object, as the codebase for gcloud-aio/rest modules is shared.
I'll take a look at this next week, for the moment I suggest pinning the version to 9.2.0

@TheKevJames
Copy link
Member

TheKevJames commented Apr 6, 2024

Ah, good catch on this. I would rather avoid upgrading the aiohttp pin too much, if it's not necessary -- it looks like 3.9 only added the auto_decompress parameter to the request method, but it was supported in the client itself before that as early as 3.3.0 (original commit), so we can avoid needing to bump dependency ranges by doing some patching.

I plan to do the following:

  • release v5.3.1 of auth with aiohttp >= 3.9.0, to make sure there's a working latest version
  • yank v5.3.0
  • open a PR with the patched solution, to be reviewed/merged early next week and released as v3.9.2. That PR will return to the current aiohttp >= 3.3.0 lower bound.

Proposed long-term fixes at #717 and #718.

@TheKevJames
Copy link
Member

Looking at the recent PR I was wondering whether a very similar result couldn't have been achieved by making the user pass in a ClientSession(auto_decompress=False) session into storage = Storage(session=session) explicitly (i.e. the PR maybe wasn't needed?).

That would have worked for aio (barring the call-by-call changes you called out), but unfortunately not for rest, since our internal usage of the stream parameter also needs to change based on that setting. I don't think we can fully offload this while mataining feature parity between the async and sync builds of this library, though if anyone can think of a way to do so I'd love to be proven wrong!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

3 participants