The method (awswrangler.s3.to_csv) supports a "mode" argument and **pandas_kwargs. The "mode" argument is not passed through to Pandas, but consumed in the awswrangler method, which also expects dataset=True to use "mode". In some cases, it would be useful to pass this argument through to Pandas.
If there is already a way to pass "mode" to Pandas, a documentation update would resolve this issue:
To avoid a breaking change we can consider introducing a one-off parameter (pandas_mode for example) that is re-labeled as mode and passed to the underlying pandas method. I can open a PR and the team can discuss if this is how we'd like to move forward.
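A minimal sketch of that relabeling, for discussion only. The helper name `relabel_pandas_mode` and the `pandas_mode` parameter are hypothetical and are not part of awswrangler; the sketch only shows how the value could be moved into the keyword arguments that are forwarded to pandas, leaving awswrangler's own `mode` alone.

```python
from typing import Any, Dict, Optional


def relabel_pandas_mode(
    pandas_kwargs: Dict[str, Any], pandas_mode: Optional[str] = None
) -> Dict[str, Any]:
    """Return a copy of pandas_kwargs with the hypothetical pandas_mode forwarded as mode."""
    forwarded = dict(pandas_kwargs)  # copy so the caller's dict is not mutated
    if pandas_mode is not None:
        if "mode" in forwarded:
            # Refuse ambiguous input instead of silently picking one value.
            raise ValueError("pass either pandas_mode or mode, not both")
        forwarded["mode"] = pandas_mode
    return forwarded


# Only the relabeled value would reach the pandas writer; awswrangler's own
# mode argument ("append" together with dataset=True) stays a separate parameter.
print(relabel_pandas_mode({"header": False}, pandas_mode="a"))
```

Keeping the two names separate means existing calls that pass awswrangler's `mode` would behave exactly as before.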
This may not be necessary at all if there is a way to append to an existing file. For example, this seems to fail silently (it doesn't append the dataframe "df" to the s3 file "target" and does not raise an exception):
Is there a different way to append to an existing file? We use pandas "append" mode primarily with pandas "chunksize" for large text files. If there's a different way to do this using awswrangler, "chunksize" and "mode" for pandas are not necessary. We're looking for a way to append to an existing file rather than writing a new file in the target directory.
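For context, a sketch of what "append to an existing file" usually has to mean on object storage. As far as I understand, an object in a standard S3 bucket cannot be extended in place, so an append is emulated by reading the object, adding the new rows, and writing the whole object back. The `fake_bucket` dict and the `append_rows` helper are hypothetical stand-ins so the example runs without AWS; they are not awswrangler or boto3 APIs.

```python
import csv
import io
from typing import Dict, List

# Hypothetical stand-in for a bucket: maps object keys to object bytes.
fake_bucket: Dict[str, bytes] = {}


def append_rows(
    store: Dict[str, bytes], key: str, header: List[str], rows: List[List[str]]
) -> None:
    """Emulate an append by rewriting the whole object with the new rows added."""
    existing = store.get(key, b"").decode("utf-8")
    buffer = io.StringIO(newline="")
    buffer.write(existing)  # start from the current object contents
    writer = csv.writer(buffer, lineterminator="\n")
    if not existing:
        writer.writerow(header)  # header only when the object is first created
    writer.writerows(rows)
    store[key] = buffer.getvalue().encode("utf-8")  # full overwrite, not a true append


append_rows(fake_bucket, "target.csv", ["id", "value"], [["1", "a"]])
append_rows(fake_bucket, "target.csv", ["id", "value"], [["2", "b"]])
print(fake_bucket["target.csv"].decode("utf-8"))
```

Note that this pattern rereads the full object on every call, so it does not give the memory benefit that pandas "chunksize" with "mode" gives for a local file.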
Describe the bug
The method (awswrangler.s3.to_csv) supports a "mode" argument and **pandas_kwargs. The "mode" argument is not passed through to Pandas, but consumed in the awswrangler method, which also expects dataset=True to use "mode". In some cases, it would be useful to pass this argument through to Pandas.
If there is already a way to pass "mode" to Pandas, a documentation update would resolve this issue:
How to Reproduce
import awswrangler as wr
...
# load a DataFrame and name it df
...
# Pandas "mode"
wr.s3.to_csv(df, "some_test_file_name", mode="a", header=False)
# awswrangler expects mode="append", dataset=True
Expected behavior
No response
Your project
No response
Screenshots
No response
OS
AWS Lambda x86_64 Architecture
Python version
3.10
AWS SDK for pandas version
3.2.0
Additional context
No response