Fix reads from local dir that changes directory #880

Open
wants to merge 1 commit into
base: main
from

Conversation

phofl
Collaborator

@phofl phofl commented Feb 14, 2024

If the working directory is not part of the token, we think a read of the same relative path from a different directory is the same operation and silently read the wrong data.

The example might be a little contrived, but if we start caching more, this can happen in other scenarios as well.
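
To illustrate the failure mode, here is a minimal standalone sketch. It is not dask-expr code: `token_without_cwd`, `token_with_cwd`, and `tokens_in_two_dirs` are hypothetical helpers, and a SHA-256 hash of the path string stands in for the real tokenization. It only shows that a token built from a relative path alone cannot tell two directories apart, while a token that also includes the working directory can.

```python
import hashlib
import os
import tempfile


def token_without_cwd(path: str) -> str:
    # Hypothetical token: hashes only the path string as the user typed it.
    return hashlib.sha256(path.encode()).hexdigest()


def token_with_cwd(path: str) -> str:
    # Hypothetical token: mixes in the working directory for relative paths,
    # so the same relative path in another directory yields another token.
    cwd = "" if os.path.isabs(path) else os.getcwd()
    return hashlib.sha256((cwd + "::" + path).encode()).hexdigest()


def tokens_in_two_dirs(token_func, path="data.csv"):
    """Compute the token for the same relative path from two directories."""
    original = os.getcwd()
    tokens = []
    try:
        with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
            for directory in (a, b):
                os.chdir(directory)
                tokens.append(token_func(path))
            # Leave the temporary directories before they are cleaned up.
            os.chdir(original)
    finally:
        os.chdir(original)
    return tokens


if __name__ == "__main__":
    first, second = tokens_in_two_dirs(token_without_cwd)
    print("without cwd, tokens collide:", first == second)
    first, second = tokens_in_two_dirs(token_with_cwd)
    print("with cwd, tokens collide:", first == second)
```

With the path-only token, anything keyed on the token would treat both reads as one operation, which is the situation described above.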

@@ -449,6 +450,7 @@ class ReadParquet(PartitionsFiltered, BlockwiseIO):
 "_partitions": None,
 "_series": False,
 "_dataset_info_cache": None,
+"_cwd": None,
Member

I'm pretty sure the parquet reader does not need this, since we're actually computing a checksum for all the files. If that doesn't work, we should make sure that the checksum is reliable, and we may want the same or a similar mechanism for CSV instead of relying on the CWD. Relying on the CWD feels odd when working with remote storage.
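
A rough sketch of what a checksum-based token could look like for local files, as an alternative to including the CWD. This is an assumption for illustration only: `checksum_token` is a hypothetical helper that hashes the file bytes with the standard library, and the checksum the parquet reader actually computes may be derived differently (for example from file system metadata).

```python
import hashlib


def checksum_token(path: str) -> str:
    """Hypothetical token derived from the file contents, not from the path or CWD."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Because the token depends on the data itself, two files named `data.csv` in different directories get different tokens whenever their contents differ, regardless of where the process is running. Hashing full contents is expensive for large or remote files, so a real implementation would likely rely on something cheaper.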
