Create HudiTableFactory implementing DataFusion's TableProviderFactory #150

Closed
1 task done
matthewmturner opened this issue Sep 24, 2024 · 11 comments · Fixed by #162

@matthewmturner

Is there an existing issue for this?

  • I have searched the existing issues

Description of the bug

This isn't a bug report. It's a feature request, but I didn't see a way to submit a feature request.

I would like to be able to register Hudi tables with DataFusion like so:

CREATE EXTERNAL TABLE my_table STORED AS HUDITABLE LOCATION '/path/to/table';

If hudi-rs provided a TableProviderFactory, we could register and use it. (This is how I currently register Delta Lake tables, and I would like to do something similar for Hudi.)
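
To illustrate the mechanism being requested, here is a minimal, std-only sketch of how a `STORED AS <format>` keyword could be dispatched to a registered factory. Everything in it (`Table`, `TableFactory`, `HudiFactory`, `Registry`) is a hypothetical stand-in invented for this example; it does not use DataFusion's or hudi-rs's actual types or signatures, and only models the idea of looking up a factory by format name.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for a table provider: it only records the location.
#[derive(Debug, PartialEq)]
struct Table {
    location: String,
}

/// Hypothetical stand-in for a table provider factory trait (not DataFusion's API).
trait TableFactory {
    fn create(&self, location: &str) -> Result<Table, String>;
}

/// Hypothetical factory for Hudi tables.
struct HudiFactory;

impl TableFactory for HudiFactory {
    fn create(&self, location: &str) -> Result<Table, String> {
        if location.is_empty() {
            return Err("LOCATION must not be empty".to_string());
        }
        Ok(Table { location: location.to_string() })
    }
}

/// Minimal registry mapping the `STORED AS` keyword to a factory.
#[derive(Default)]
struct Registry {
    factories: HashMap<String, Arc<dyn TableFactory>>,
}

impl Registry {
    /// Registers a factory under an upper-cased format keyword.
    fn register(&mut self, file_type: &str, factory: Arc<dyn TableFactory>) {
        self.factories.insert(file_type.to_uppercase(), factory);
    }

    /// Models `CREATE EXTERNAL TABLE ... STORED AS <file_type> LOCATION '<location>'`.
    fn create_external_table(&self, file_type: &str, location: &str) -> Result<Table, String> {
        let factory = self
            .factories
            .get(&file_type.to_uppercase())
            .ok_or_else(|| format!("no table factory registered for {file_type}"))?;
        factory.create(location)
    }
}

fn main() {
    let mut registry = Registry::default();
    // The keyword is whatever string the factory is registered under.
    registry.register("HUDITABLE", Arc::new(HudiFactory));
    let table = registry
        .create_external_table("HUDITABLE", "/path/to/table")
        .expect("a factory is registered for HUDITABLE");
    println!("{table:?}");
}
```

In this sketch an unregistered keyword yields an error rather than a table, which is the behaviour one would expect before a Hudi factory is registered.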

Steps To Reproduce

Not a bug

Expected behavior

I can register tables with DataFusion like so:

CREATE EXTERNAL TABLE my_table STORED AS HUDITABLE LOCATION '/path/to/table';

Screenshots / Logs

No response

Software information

N/A

Additional context

No response

@matthewmturner matthewmturner added the bug Something isn't working label Sep 24, 2024
@xushiyan xushiyan added feature and removed bug Something isn't working labels Oct 12, 2024
@xushiyan
Member

@matthewmturner thanks for raising this! would you be able to review the linked PR by @kazdy please?

@xushiyan xushiyan added this to the release-0.2.0 milestone Oct 12, 2024
@matthewmturner
Author

@xushiyan sure, checking it out

@matthewmturner
Author

matthewmturner commented Oct 13, 2024

Looks good, but when I try integrating it in my app I get the error below (which isn't surprising, since we aren't on the same version of DataFusion). Of course, that's not a blocker for this though :). Excited to see this!

[image: screenshot of the error]

@xushiyan
Member

@matthewmturner we just landed this feature in main. Can you please test it out with your PR? Note that we chose HUDI as the name to stay consistent. See #162 (comment). Thanks.

@matthewmturner
Author

@xushiyan sure, will do. Do you have any recommendations for what test data to use? I just need something simple to test a basic query.

@xushiyan
Member

@matthewmturner yes we keep a bunch of zipped test tables in https://github.com/apache/hudi-rs/tree/main/crates/tests/data/tables

you can unzip and use them directly.

@matthewmturner
Author

Okay, going to work on this tonight / tomorrow. I may ping you for a review if you don't mind.

@matthewmturner
Author

I must be doing something wrong. I'm getting an error that hoodie.properties doesn't exist, even though it does.

[image: screenshot of the error]

will pick this up tomorrow.

@kazdy
Contributor

kazdy commented Oct 14, 2024

Hi Matt,

I cloned your branch and got this:
[image: screenshot]
Could it be something related to your file permissions?
Edit: permissions look OK and are the same as mine.

@matthewmturner
Author

Sigh, I got it to work. The repo used to be called datafusion-tui, and my local checkout still has that name while pointing at datafusion-dft, so my path was wrong.

@matthewmturner
Author

And now the test passes :)
