Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: Can't update inside of a metadata context when iterating over a different query #541

Merged
merged 1 commit into from
Oct 28, 2024

Conversation

kbolashev
Copy link
Member

@kbolashev kbolashev commented Oct 28, 2024

Reproduction test (assuming a datasource with 1.txt..5.txt datapoints):

def test_add_in_a_different_context(datasource: "Datasource"):
    path = "1.txt"
    field = "field"

    with datasource.metadata_context() as ctx:
        ctx.update_metadata(path, {field: "old"})

    with datasource.metadata_context() as ctx:
        for dp in datasource[field].is_null().all():
            dp[field] = "new"

    res = [dp.metadata.get(field, None) for dp in datasource.all()]
    assert res == ["old", "new", "new", "new", "new"]

Right now this is failing, because res == ["old", None, None, None, None]

Happens because datasource[field].is_null().all() ends up creating a new datasource object, and so closing the original datasource's context doesn't end up uploading the implicit datapoints.

To fix this I extracted all implicit contexts in a global dictionary keyed by the datasource ID, that way no matter if you're doing subqueries, you're always getting the metadata context of this datasource that ends up being uploaded at some point.

Made sure that all the backend E2E tests are passing after this.

@kbolashev kbolashev added the bug Something isn't working label Oct 28, 2024
@kbolashev kbolashev requested a review from sdafni October 28, 2024 09:55
@kbolashev kbolashev self-assigned this Oct 28, 2024
Copy link

dagshub bot commented Oct 28, 2024

@@ -158,6 +158,9 @@ def to_dict(self, ds: "Datasource") -> Dict[str, Any]:
return res_dict


_metadata_contexts: Dict[Union[int, str], "MetadataContextManager"] = {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

too bad we need a global thing, but i can't think of any other solution

@sdafni
Copy link
Collaborator

sdafni commented Oct 28, 2024

where is this test than?
separate PR?

@kbolashev kbolashev merged commit 99ec378 into master Oct 28, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants