Support for schemas smaller than the dataframe #12

Open
diegoquintanav opened this issue Aug 1, 2018 · 2 comments

Comments

@diegoquintanav
Contributor

This is not the same as the existing support for passing a list of columns.

In general terms, given

  • a schema that contains, say, 13 validations, one per column
  • a dataset with 30 columns, of which the schema's columns are a subset

running validate() without passing the specific columns fails with something like

Invalid number of columns. The schema specifies 13, but the data frame has 30

I believe validate() should instead validate the subset, that is, only the 13 columns that are in the schema.
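
To make the request concrete, here is a minimal sketch of the column selection I have in mind. It is plain Python and does not touch PandasSchema itself; the helper name `columns_to_validate` and the `col_N` column names are made up for illustration, and raising on a schema column that is missing from the data frame is my own assumption about what should happen in that case.

```python
from typing import Iterable, List


def columns_to_validate(schema_columns: Iterable[str],
                        frame_columns: Iterable[str]) -> List[str]:
    """Return the schema's column names, in schema order.

    Hypothetical helper, not part of PandasSchema. Extra data frame
    columns are ignored; a schema column that is absent from the data
    frame is treated as an error (an assumption of this sketch).
    """
    wanted = list(schema_columns)
    available = set(frame_columns)
    missing = [name for name in wanted if name not in available]
    if missing:
        raise ValueError(
            f"Columns in the schema but not in the data frame: {missing}"
        )
    return wanted


# A 13-column schema checked against a 30-column data frame.
schema_columns = [f"col_{i}" for i in range(13)]
frame_columns = [f"col_{i}" for i in range(30)]

selected = columns_to_validate(schema_columns, frame_columns)
print(len(selected))  # 13
```

The resulting list is the kind of thing one currently has to build by hand and pass to validate(); the request is for validate() to do the equivalent on its own when the schema is smaller than the data frame.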

@multimeric
Owner

I think it's reasonable that PandasSchema throws an error here by default, but I agree there should be a flag to disable this. I'll look into adding this once I get time (or you can, if you'd like).

In the meantime you can pass in a list of columns to validate, as you say.

@abhilasharevur

Any update? I see this is still an open request. Are there any plans to add this to master?
