Context
Polaris uses Pydantic as its data validation library. Beyond enforcing types, we also use Pydantic validators to standardize data to a single format, to enforce constraints, and to dynamically infer good defaults. For example:
Standardize data: If metrics are specified as str, we convert them to Metric objects.
Enforce constraints: We validate that the train and test partitions of a benchmark do not overlap.
Infer defaults: If the target types of a benchmark are not specified, we automatically infer them.
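To make the three categories concrete, here is a minimal, library-agnostic sketch using only the standard library. It is not the Polaris implementation: `BenchmarkSketch`, the `Metric` enum members, and the rule used to infer target types are all hypothetical stand-ins chosen for illustration. In Pydantic, each step would typically live in its own validator.

```python
from dataclasses import dataclass, field
from enum import Enum


class Metric(Enum):
    # Hypothetical stand-in for the Metric objects mentioned above.
    MEAN_ABSOLUTE_ERROR = "mean_absolute_error"
    ACCURACY = "accuracy"


@dataclass
class BenchmarkSketch:
    # Hypothetical, simplified benchmark; field names are assumptions.
    metrics: list
    train_indices: list
    test_indices: list
    targets: dict  # target column name -> list of values
    target_types: dict = field(default_factory=dict)

    def __post_init__(self):
        # 1. Standardize data: accept plain strings and convert them to Metric.
        self.metrics = [m if isinstance(m, Metric) else Metric(m) for m in self.metrics]

        # 2. Enforce constraints: train and test must not share any index.
        overlap = set(self.train_indices) & set(self.test_indices)
        if overlap:
            raise ValueError(f"Train and test overlap on indices: {sorted(overlap)}")

        # 3. Infer defaults: guess a target type when none was given.
        #    The float-based rule below is an assumption made for this sketch.
        for name, values in self.targets.items():
            if name not in self.target_types:
                has_float = any(isinstance(v, float) for v in values)
                self.target_types[name] = "regression" if has_float else "classification"
```

Note that only the second step is a pure check. The first and third change the data, which is why they cannot be dropped without affecting the resulting object.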
Description
These validations can get slow; see for example #148 and #154. Currently, we validate the data model not only when an artifact is initially created, but also whenever we load an artifact from the Hub. Since we can assume that any data coming from the Hub is valid, we can skip some of these checks when downloading artifacts from the Hub to speed up the process.
However, we cannot simply disable data validation altogether. Of the three categories listed above, the validators that standardize data are still needed to deserialize the data sent by the Hub.
Acceptance Criteria
We can selectively skip time-consuming validation checks when downloading data from the Hub.
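One possible shape for this, sketched with the standard library only: each validator is registered with a flag that says whether it may be skipped for trusted data. All names here (`validate`, `VALIDATORS`, the `trusted` parameter, the payload keys) are hypothetical and not part of Polaris. In Pydantic v2, a comparable flag could be passed through the `context` argument of `model_validate` and read from `info.context` inside a validator; whether that route fits Polaris is left open here.

```python
from enum import Enum


class Metric(Enum):
    # Hypothetical stand-in for the Metric objects mentioned above.
    ACCURACY = "accuracy"


def standardize_metrics(data: dict) -> None:
    # Standardizing validator: always needed, since serialized payloads
    # carry metrics as plain strings.
    data["metrics"] = [m if isinstance(m, Metric) else Metric(m) for m in data["metrics"]]


def check_no_overlap(data: dict) -> None:
    # Constraint validator: potentially expensive, safe to skip for trusted data.
    overlap = set(data["train_indices"]) & set(data["test_indices"])
    if overlap:
        raise ValueError(f"Train and test overlap on indices: {sorted(overlap)}")


# Each entry pairs a validator with whether it may be skipped for trusted data.
VALIDATORS = [
    (standardize_metrics, False),
    (check_no_overlap, True),
]


def validate(data: dict, trusted: bool = False) -> dict:
    """Run all validators, skipping the skippable ones when `trusted` is True."""
    result = dict(data)  # shallow copy so the caller's dict is not modified
    for validator, skippable in VALIDATORS:
        if trusted and skippable:
            continue
        validator(result)
    return result
```

With this layout, a loader for Hub data would call `validate(payload, trusted=True)`, while artifacts created locally would go through the default path with every check enabled.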