The documentation says: "The class attribute async_impl can be used to test whether an implementation is async or not."
DirFileSystem inherits from AsyncFileSystem, so it sets async_impl = True. As a wrapper filesystem, it must match its sync/async operation mode to that of the underlying filesystem. That mode is reflected by the dirfs.asynchronous attribute (true if it was set when the instance was created).
Therefore, the language above is misleading: async_impl indicates whether the class has async capabilities. But to test whether a specific instance is sync/async, one needs to test the dirfs.asynchronous attribute.
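To illustrate the distinction between capable and enabled, here is a minimal sketch. `ToyAsyncFS` is a hypothetical stand-in written for this example, not an fsspec class; only the attribute names `async_impl` and `asynchronous` come from the discussion above.

```python
class ToyAsyncFS:
    """Hypothetical stand-in for an async-capable filesystem class."""

    # Class level: the implementation provides coroutine methods ("capable").
    async_impl = True

    def __init__(self, asynchronous=False):
        # Instance level: how this particular instance was created ("enabled").
        self.asynchronous = asynchronous


sync_instance = ToyAsyncFS()
async_instance = ToyAsyncFS(asynchronous=True)

# Both instances report the same class-level capability...
print(sync_instance.async_impl, async_instance.async_impl)
# ...but differ in the per-instance mode.
print(sync_instance.asynchronous, async_instance.asynchronous)
```

Under these assumptions, checking `async_impl` alone cannot tell the two instances apart; only `asynchronous` does.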
But even that can be tricky: CachingFileSystem (the base for all caches) is not capable of async operation, yet it does not set async_impl = False, and it does not check whether its cache.asynchronous attribute matches that of the underlying filesystem. Instead, it reports both async_impl and asynchronous from the underlying filesystem. This is not only misleading, it also means that using e.g. open_async would bypass the cache implementation and invoke the underlying filesystem's method instead (if that filesystem is async).
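The bypass described above can be reproduced with a toy wrapper. This is only a sketch: `ToyAsyncBackend` and `ToyCachingWrapper` are hypothetical classes, and forwarding through `__getattr__` is an assumption made for illustration; the real CachingFileSystem may forward attributes by a different mechanism.

```python
import asyncio


class ToyAsyncBackend:
    """Hypothetical async-capable backend."""

    async_impl = True

    def __init__(self, asynchronous=False):
        self.asynchronous = asynchronous

    def open(self, path):
        return f"backend.open({path})"

    async def open_async(self, path):
        return f"backend.open_async({path})"


class ToyCachingWrapper:
    """Hypothetical wrapper: overrides open(), forwards everything else."""

    def __init__(self, fs):
        self.fs = fs

    def open(self, path):
        # The only code path that goes through the "cache".
        return f"cache.open({path}) -> " + self.fs.open(path)

    def __getattr__(self, name):
        # Called only when normal lookup fails, so async_impl, asynchronous
        # and open_async all resolve on the wrapped filesystem.
        return getattr(self.fs, name)


cached = ToyCachingWrapper(ToyAsyncBackend(asynchronous=True))
flags = (cached.async_impl, cached.asynchronous)
sync_result = cached.open("a.txt")
async_result = asyncio.run(cached.open_async("a.txt"))

print(flags)
print(sync_result)
print(async_result)
```

In this sketch the wrapper reports the backend's flags as its own, and the result of `open_async` never passes through the wrapper's caching code.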
So to recap:
- The documentation should be improved to clarify the difference between capable and enabled.
- Wrapper filesystems (like the caching variants) that do not support async should report both async_impl and asynchronous as false, should not accept an asynchronous = True argument, should complain if the underlying filesystem has asynchronous = True, and should consider overriding the underlying filesystem's async implementation with NotImplementedError. (Alternatively, add support for async operation to CachingFileSystem.)
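A rough sketch of what such a strict wrapper could look like follows. All class names here are hypothetical, the choice of `ValueError` is an assumption, and `open_async` stands in for whichever async methods the wrapper would need to block; this is not a proposed patch to CachingFileSystem.

```python
class ToyAsyncBackend:
    """Hypothetical async-capable backend."""

    async_impl = True

    def __init__(self, asynchronous=False):
        self.asynchronous = asynchronous


class StrictSyncWrapper:
    """Hypothetical wrapper that is explicit about not supporting async."""

    # Report the wrapper's own capability, not the backend's.
    async_impl = False
    asynchronous = False

    def __init__(self, fs, asynchronous=False):
        if asynchronous:
            raise ValueError("this wrapper does not support asynchronous=True")
        if getattr(fs, "asynchronous", False):
            raise ValueError(
                "the underlying filesystem must be created with asynchronous=False"
            )
        self.fs = fs

    async def open_async(self, path, **kwargs):
        # Block the backend's async entry point instead of forwarding to it.
        raise NotImplementedError("async operation is not supported by this wrapper")


wrapper = StrictSyncWrapper(ToyAsyncBackend())
print(wrapper.async_impl, wrapper.asynchronous)

try:
    StrictSyncWrapper(ToyAsyncBackend(asynchronous=True))
except ValueError as exc:
    print("rejected:", exc)
```

The design choice here is to fail at construction time, so a mismatch is reported before any I/O is attempted.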
asynchronous=True doesn't mean that this is an async filesystem, but that it is being initialised and run from within an async context (coroutine).
Most async implementations translate something like cp_file() to sync(_cp_file()), where _cp_file is a coroutine. dirfs is special in that cp_file calls the internal filesystem's cp_file, so sync() only happens if the inner fs is async. If it is sync, then the inner _cp_file doesn't exist at all.
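The pattern described above can be sketched as follows. Everything here is a toy: `sync` is a simplified stand-in built on `asyncio.run` (the real helper has a different signature and manages its own event loop), and the three classes are hypothetical.

```python
import asyncio


def sync(coro):
    """Toy stand-in: run a coroutine to completion from sync code."""
    return asyncio.run(coro)


class ToyAsyncFS:
    async_impl = True

    async def _cp_file(self, src, dst):
        return f"async copy {src} -> {dst}"

    def cp_file(self, src, dst):
        # The sync method is a thin shim over the coroutine.
        return sync(self._cp_file(src, dst))


class ToySyncFS:
    async_impl = False

    def cp_file(self, src, dst):
        # No _cp_file coroutine exists on a purely sync implementation.
        return f"sync copy {src} -> {dst}"


class ToyDirWrapper:
    """Hypothetical dirfs-like wrapper: prefixes paths, then delegates."""

    def __init__(self, prefix, fs):
        self.prefix = prefix
        self.fs = fs

    def _join(self, path):
        return f"{self.prefix}/{path}"

    def cp_file(self, src, dst):
        # Calls the inner cp_file, so sync() runs only if the inner fs is async.
        return self.fs.cp_file(self._join(src), self._join(dst))


over_async = ToyDirWrapper("root", ToyAsyncFS()).cp_file("a", "b")
over_sync = ToyDirWrapper("root", ToySyncFS()).cp_file("a", "b")
print(over_async)
print(over_sync)
```

The wrapper's `cp_file` works with either backend, which is why its effective sync or async behaviour depends on the filesystem it wraps.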
In summary, I think it's reasonable to define .async_impl on dirfs instances (not the class) to depend on whatever the backend implementation is.