pydantic-spark

This library converts a pydantic class to a Spark schema, or generates Python code from a Spark schema.

Install

pip install pydantic-spark

Pydantic class to spark schema

import json
from typing import Optional

from pydantic_spark.base import SparkBase

class TestModel(SparkBase):
    key1: str
    key2: int
    key3: Optional[str]

schema_dict: dict = TestModel.spark_schema()
print(json.dumps(schema_dict))
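
For orientation, Spark's JSON schema representation of a struct is a dict with "type": "struct" and a "fields" list, where each field carries a name, a type, a nullable flag and a metadata dict. The standalone sketch below does not use pydantic-spark and is not its implementation; it only shows how type hints could map onto that layout. The helper name hints_to_spark_schema and the type table SPARK_TYPES are hypothetical, and the types the library actually emits may differ.

```python
from typing import Optional, Union, get_args, get_origin, get_type_hints

# Hypothetical mapping from Python types to Spark type names (an assumption,
# not taken from pydantic-spark).
SPARK_TYPES = {str: "string", int: "long", float: "double", bool: "boolean"}


def hints_to_spark_schema(cls) -> dict:
    """Build a Spark-style JSON schema dict from a class's type hints."""
    fields = []
    for name, hint in get_type_hints(cls).items():
        nullable = False
        # Optional[X] is Union[X, None]: unwrap it and mark the field nullable.
        if get_origin(hint) is Union and type(None) in get_args(hint):
            nullable = True
            hint = next(a for a in get_args(hint) if a is not type(None))
        fields.append(
            {
                "name": name,
                "type": SPARK_TYPES[hint],
                "nullable": nullable,
                "metadata": {},
            }
        )
    return {"type": "struct", "fields": fields}


class Example:
    key1: str
    key2: int
    key3: Optional[str]


schema = hints_to_spark_schema(Example)
```

If PySpark is installed, a dict in this layout can usually be turned into a schema object with pyspark.sql.types.StructType.fromJson.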

Coerce type

pydantic-spark provides a coerce_type option for type coercion. When it is set on a field, the column's data type in the generated schema is converted to the specified coercion type.

import json
from pydantic import Field
from pydantic_spark.base import SparkBase, CoerceType

class TestModel(SparkBase):
    key1: str = Field(extra_json_schema={"coerce_type": CoerceType.integer})

schema_dict: dict = TestModel.spark_schema()
print(json.dumps(schema_dict))
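
As a rough illustration of the idea only, coercion can be thought of as overriding the type inferred from the annotation with the type that was explicitly requested. The sketch below is standalone: CoerceTo and field_entry are hypothetical stand-ins, not part of pydantic-spark, and the library's actual output format is not confirmed here.

```python
from enum import Enum
from typing import Optional


class CoerceTo(Enum):
    """Hypothetical stand-in for the library's CoerceType enum."""

    integer = "integer"
    string = "string"


def field_entry(
    name: str, inferred_type: str, coerce_type: Optional[CoerceTo] = None
) -> dict:
    """Return a Spark-style field dict, preferring the coercion type if given."""
    spark_type = coerce_type.value if coerce_type is not None else inferred_type
    return {"name": name, "type": spark_type, "nullable": False, "metadata": {}}


# Without coercion the inferred type is kept.
plain = field_entry("key1", "string")
# With coercion the requested type replaces the inferred one.
coerced = field_entry("key1", "string", CoerceTo.integer)
```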

Install for developers

Install package

Requirement: Poetry 1.*

poetry install

Run unit tests

pytest
coverage run -m pytest  # with coverage
# or, depending on your local environment:
poetry run pytest
poetry run coverage run -m pytest  # with coverage

Run linting

Linting is checked in the GitHub workflow. To fix and review issues, run:

black .   # Auto fix all issues
isort .   # Auto fix all issues
pflake8 .  # Only display issues, fixing is manual
