merged develop
bcdurak committed Nov 29, 2024
2 parents 88e247c + d16e3a4 commit 4cd155a
Showing 10 changed files with 55 additions and 27 deletions.
@@ -10,7 +10,7 @@ By default, ZenML uses colorful logging to make it easier to read logs. However,
 ZENML_LOGGING_COLORS_DISABLED=true
 ```

-Note that setting this on the [client environment](../configure-python-environments/README.md#client-environment-or-the-runner-environment) (e.g. your local machine which runs the pipeline) will automatically disable colorful logging on remote pipeline runs. If you wish to only disable it locally, but turn on for remote pipeline runs, you can set the `ZENML_LOGGING_COLORS_DISABLED` environment variable in your pipeline runs environment as follows:
+Note that setting this on the [client environment](../../infrastructure-deployment/configure-python-environments/README.md#client-environment-or-the-runner-environment) (e.g. your local machine which runs the pipeline) will automatically disable colorful logging on remote pipeline runs. If you wish to only disable it locally, but turn on for remote pipeline runs, you can set the `ZENML_LOGGING_COLORS_DISABLED` environment variable in your pipeline runs environment as follows:

 ```python
 docker_settings = DockerSettings(environment={"ZENML_LOGGING_COLORS_DISABLED": "false"})
@@ -12,20 +12,22 @@ Currently, the following visualization types are supported:
 * **Image:** Visualizations of image data such as Pillow images (e.g. `PIL.Image`) or certain numeric numpy arrays,
 * **CSV:** Tables, such as the pandas DataFrame `.describe()` output,
 * **Markdown:** Markdown strings or pages.
+* **JSON:** JSON strings or objects.

 There are three ways how you can add custom visualizations to the dashboard:

-* If you are already handling HTML, Markdown, or CSV data in one of your steps, you can have them visualized in just a few lines of code by casting them to a [special class](#visualization-via-special-return-types) inside your step.
+* If you are already handling HTML, Markdown, CSV or JSON data in one of your steps, you can have them visualized in just a few lines of code by casting them to a [special class](#visualization-via-special-return-types) inside your step.
 * If you want to automatically extract visualizations for all artifacts of a certain data type, you can define type-specific visualization logic by [building a custom materializer](#visualization-via-materializers).
 * If you want to create any other custom visualizations, you can [create a custom return type class with corresponding materializer](#how-to-think-about-creating-a-custom-visualization) and build and return this custom return type from one of your steps.

 ## Visualization via Special Return Types

-If you already have HTML, Markdown, or CSV data available as a string inside your step, you can simply cast them to one of the following types and return them from your step:
+If you already have HTML, Markdown, CSV or JSON data available as a string inside your step, you can simply cast them to one of the following types and return them from your step:

 * `zenml.types.HTMLString` for strings in HTML format, e.g., `"<h1>Header</h1>Some text"`,
 * `zenml.types.MarkdownString` for strings in Markdown format, e.g., `"# Header\nSome text"`,
 * `zenml.types.CSVString` for strings in CSV format, e.g., `"a,b,c\n1,2,3"`.
+* `zenml.types.JSONString` for strings in JSON format, e.g., `{"key": "value"}`.

 ### Example:

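Note on the documentation change above: the cast it describes can be sketched as follows. This is a minimal illustration, not the example from the ZenML docs (that part of the file is collapsed in this view). The `JSONString` class here is a local stand-in that mirrors the `str` subclass added in `src/zenml/types.py`, and `build_report` is a hypothetical function name. In a real pipeline the function would presumably be a ZenML step that imports `JSONString` from `zenml.types`.

```python
import json


class JSONString(str):
    """Local stand-in mirroring the str subclass this commit adds to zenml.types."""


def build_report(metrics: dict) -> JSONString:
    """Serialize a dict and tag the result as JSON (hypothetical helper)."""
    # json.dumps produces the string; the cast only changes the type tag,
    # which is what a materializer can later dispatch on.
    return JSONString(json.dumps(metrics, sort_keys=True))


report = build_report({"accuracy": 0.93, "epochs": 5})
print(type(report).__name__, report)
```

Because `JSONString` subclasses `str`, the value still behaves like an ordinary string everywhere else; only the type carries the extra meaning.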
10 changes: 1 addition & 9 deletions examples/quickstart/quickstart.ipynb
@@ -149,7 +149,7 @@
 "\n",
 "assert zenml_server_url\n",
 "\n",
-"!zenml connect --url $zenml_server_url"
+"!zenml login $zenml_server_url"
 ]
 },
 {
@@ -722,14 +722,6 @@
 "* If you have questions or feedback... join our [**Slack Community**](https://zenml.io/slack) and become part of the ZenML family!\n",
 "* If you want to quickly get started with ZenML, check out [ZenML Pro](https://zenml.io/pro)."
 ]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"id": "c560354d-9e78-4061-aaff-2e6213229911",
-"metadata": {},
-"outputs": [],
-"source": []
 }
 ],
 "metadata": {
1 change: 1 addition & 0 deletions src/zenml/enums.py
@@ -60,6 +60,7 @@ class VisualizationType(StrEnum):
     HTML = "html"
     IMAGE = "image"
     MARKDOWN = "markdown"
+    JSON = "json"


 class ZenMLServiceType(StrEnum):
2 changes: 1 addition & 1 deletion src/zenml/integrations/feast/__init__.py
@@ -31,7 +31,7 @@ class FeastIntegration(Integration):

     NAME = FEAST
     # click is added to keep the feast click version in sync with ZenML's click
-    REQUIREMENTS = ["feast", "click>=8.0.1,<8.1.4"]
+    REQUIREMENTS = ["feast>=0.12.0", "click>=8.0.1,<8.1.4"]
     REQUIREMENTS_IGNORED_ON_UNINSTALL = ["click", "pandas"]

     @classmethod
22 changes: 13 additions & 9 deletions src/zenml/integrations/feast/feature_stores/feast_feature_store.py
@@ -16,7 +16,7 @@
 from typing import Any, Dict, List, Union, cast

 import pandas as pd
-from feast import FeatureStore  # type: ignore
+from feast import FeatureService, FeatureStore  # type: ignore
 from feast.infra.registry.base_registry import BaseRegistry  # type: ignore

 from zenml.feature_stores.base_feature_store import BaseFeatureStore
@@ -43,14 +43,14 @@ def config(self) -> FeastFeatureStoreConfig:
     def get_historical_features(
         self,
         entity_df: Union[pd.DataFrame, str],
-        features: List[str],
+        features: Union[List[str], FeatureService],
         full_feature_names: bool = False,
     ) -> pd.DataFrame:
         """Returns the historical features for training or batch scoring.

         Args:
             entity_df: The entity DataFrame or entity name.
-            features: The features to retrieve.
+            features: The features to retrieve or a FeatureService.
             full_feature_names: Whether to return the full feature names.

         Raise:
@@ -70,14 +70,14 @@ def get_historical_features(
     def get_online_features(
         self,
         entity_rows: List[Dict[str, Any]],
-        features: List[str],
+        features: Union[List[str], FeatureService],
         full_feature_names: bool = False,
     ) -> Dict[str, Any]:
         """Returns the latest online feature data.

         Args:
             entity_rows: The entity rows to retrieve.
-            features: The features to retrieve.
+            features: The features to retrieve or a FeatureService.
             full_feature_names: Whether to return the full feature names.

         Raise:
@@ -118,17 +118,21 @@ def get_entities(self) -> List[str]:
         fs = FeatureStore(repo_path=self.config.feast_repo)
         return [ds.name for ds in fs.list_entities()]

-    def get_feature_services(self) -> List[str]:
-        """Returns the feature service names.
+    def get_feature_services(self) -> List[FeatureService]:
+        """Returns the feature services.

         Raise:
             ConnectionError: If the online component (Redis) is not available.

         Returns:
-            The feature service names.
+            The feature services.
         """
         fs = FeatureStore(repo_path=self.config.feast_repo)
-        return [ds.name for ds in fs.list_feature_services()]
+        feature_services: List[FeatureService] = list(
+            fs.list_feature_services()
+        )
+
+        return feature_services

     def get_feature_views(self) -> List[str]:
         """Returns the feature view names.
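Note on the Feast changes above: `get_feature_services` now returns `FeatureService` objects instead of names, and both retrieval methods accept either a list of feature references or a single `FeatureService`. The sketch below shows what that could mean for calling code. It makes several assumptions: `FeatureService` is a local stand-in that models only the `name` attribute (the attribute the removed line read via `ds.name`), and `describe_features`, `driver_activity` and `rider_stats` are hypothetical names used for illustration only.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class FeatureService:
    """Local stand-in for feast.FeatureService; only `name` is modeled."""

    name: str


def describe_features(features: Union[List[str], FeatureService]) -> str:
    """Accept either input form, as the widened `features` annotation does."""
    if isinstance(features, FeatureService):
        return f"feature service '{features.name}'"
    return "features: " + ", ".join(features)


# Pretend this list came back from get_feature_services() after the change.
services = [FeatureService("driver_activity"), FeatureService("rider_stats")]

# Code that relied on the old List[str] return value can recover the names.
service_names = [service.name for service in services]

print(service_names)
print(describe_features(services[0]))
print(describe_features(["driver_stats:conv_rate"]))
```

Callers that previously treated the return value as a list of strings would need an adjustment along these lines, since the method no longer extracts the names itself.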
19 changes: 18 additions & 1 deletion src/zenml/materializers/built_in_materializer.py
@@ -28,7 +28,7 @@
 )

 from zenml.artifact_stores.base_artifact_store import BaseArtifactStore
-from zenml.enums import ArtifactType
+from zenml.enums import ArtifactType, VisualizationType
 from zenml.logger import get_logger
 from zenml.materializers.base_materializer import BaseMaterializer
 from zenml.materializers.materializer_registry import materializer_registry
@@ -415,6 +415,23 @@ def save(self, data: Any) -> None:
                 self.artifact_store.rmtree(entry["path"])
             raise e

+    # save dict type objects to JSON file with JSON visualization type
+    def save_visualizations(self, data: Any) -> Dict[str, "VisualizationType"]:
+        """Save visualizations for the given data.
+
+        Args:
+            data: The data to save visualizations for.
+
+        Returns:
+            A dictionary of visualization URIs and their types.
+        """
+        # dict/list type objects are always saved as JSON files
+        # doesn't work for non-serializable types as they
+        # are saved as list of lists in different files
+        if _is_serializable(data):
+            return {self.data_path: VisualizationType.JSON}
+        return {}
+
     def extract_metadata(self, data: Any) -> Dict[str, "MetadataType"]:
         """Extract metadata from the given built-in container object.

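Note on `save_visualizations` above: the method reuses the already-saved JSON file as the visualization when the container is serializable, and registers nothing otherwise. The body of `_is_serializable` is not part of this diff, so the sketch below substitutes a hypothetical check based on `json.dumps`; the real helper may apply different rules. `is_json_serializable` and `visualizations_for` are illustrative names, and the plain string `"json"` stands in for `VisualizationType.JSON`.

```python
import json
from typing import Any, Dict


def is_json_serializable(data: Any) -> bool:
    """Hypothetical stand-in for _is_serializable: True if json.dumps accepts it."""
    try:
        json.dumps(data)
    except (TypeError, ValueError):
        return False
    return True


def visualizations_for(data: Any, data_path: str) -> Dict[str, str]:
    """Map the stored JSON file to the "json" visualization type, or to nothing."""
    if is_json_serializable(data):
        # The data file itself doubles as the visualization; no extra file is written.
        return {data_path: "json"}
    # Non-serializable containers are stored element by element, so there is
    # no single JSON file to point the dashboard at.
    return {}


print(visualizations_for({"a": [1, 2]}, "data.json"))
print(visualizations_for([object()], "data.json"))
```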
11 changes: 8 additions & 3 deletions src/zenml/materializers/structured_string_materializer.py
@@ -19,22 +19,23 @@
 from zenml.enums import ArtifactType, VisualizationType
 from zenml.logger import get_logger
 from zenml.materializers.base_materializer import BaseMaterializer
-from zenml.types import CSVString, HTMLString, MarkdownString
+from zenml.types import CSVString, HTMLString, JSONString, MarkdownString

 logger = get_logger(__name__)


-STRUCTURED_STRINGS = Union[CSVString, HTMLString, MarkdownString]
+STRUCTURED_STRINGS = Union[CSVString, HTMLString, MarkdownString, JSONString]

 HTML_FILENAME = "output.html"
 MARKDOWN_FILENAME = "output.md"
 CSV_FILENAME = "output.csv"
+JSON_FILENAME = "output.json"


 class StructuredStringMaterializer(BaseMaterializer):
     """Materializer for HTML or Markdown strings."""

-    ASSOCIATED_TYPES = (CSVString, HTMLString, MarkdownString)
+    ASSOCIATED_TYPES = (CSVString, HTMLString, MarkdownString, JSONString)
     ASSOCIATED_ARTIFACT_TYPE = ArtifactType.DATA_ANALYSIS

     def load(self, data_type: Type[STRUCTURED_STRINGS]) -> STRUCTURED_STRINGS:
@@ -94,6 +95,8 @@ def _get_filepath(self, data_type: Type[STRUCTURED_STRINGS]) -> str:
             filename = HTML_FILENAME
         elif issubclass(data_type, MarkdownString):
             filename = MARKDOWN_FILENAME
+        elif issubclass(data_type, JSONString):
+            filename = JSON_FILENAME
         else:
             raise ValueError(
                 f"Data type {data_type} is not supported by this materializer."
@@ -120,6 +123,8 @@ def _get_visualization_type(
             return VisualizationType.HTML
         elif issubclass(data_type, MarkdownString):
             return VisualizationType.MARKDOWN
+        elif issubclass(data_type, JSONString):
+            return VisualizationType.JSON
         else:
             raise ValueError(
                 f"Data type {data_type} is not supported by this materializer."
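Note on the two `issubclass` chains above: each new structured string type currently needs a branch in both `_get_filepath` and `_get_visualization_type`. The sketch below shows the same dispatch driven by a single table. It is an alternative shape for illustration, not the code in this commit. The four classes are local stand-ins for the `zenml.types` classes, `lookup_format` is a hypothetical name, the filenames are the constants from the diff, and the visualization values are plain strings standing in for `VisualizationType` members (the `"csv"` value is assumed, since that member is not shown in this diff).

```python
from typing import Dict, Tuple, Type


class CSVString(str):
    """Stand-in for zenml.types.CSVString."""


class HTMLString(str):
    """Stand-in for zenml.types.HTMLString."""


class MarkdownString(str):
    """Stand-in for zenml.types.MarkdownString."""


class JSONString(str):
    """Stand-in for zenml.types.JSONString."""


# One row per structured string type: (filename, visualization type).
FORMATS: Dict[Type[str], Tuple[str, str]] = {
    CSVString: ("output.csv", "csv"),
    HTMLString: ("output.html", "html"),
    MarkdownString: ("output.md", "markdown"),
    JSONString: ("output.json", "json"),
}


def lookup_format(data_type: Type[str]) -> Tuple[str, str]:
    """Return (filename, visualization type) for a structured string type."""
    for known_type, fmt in FORMATS.items():
        # issubclass keeps user-defined subclasses working, as in the original.
        if issubclass(data_type, known_type):
            return fmt
    raise ValueError(
        f"Data type {data_type} is not supported by this materializer."
    )


print(lookup_format(JSONString))
```

With a table like this, supporting another format would mean adding one row rather than two branches, at the cost of a small indirection.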
4 changes: 4 additions & 0 deletions src/zenml/types.py
@@ -33,3 +33,7 @@ class MarkdownString(str):

 class CSVString(str):
     """Special string class to indicate a CSV string."""
+
+
+class JSONString(str):
+    """Special string class to indicate a JSON string."""
5 changes: 4 additions & 1 deletion src/zenml/utils/visualization_utils.py
@@ -13,9 +13,10 @@
 # permissions and limitations under the License.
 """Utility functions for dashboard visualizations."""

+import json
 from typing import TYPE_CHECKING, Optional

-from IPython.core.display import HTML, Image, Markdown, display
+from IPython.core.display import HTML, JSON, Image, Markdown, display

 from zenml.artifacts.utils import load_artifact_visualization
 from zenml.enums import VisualizationType
@@ -63,6 +64,8 @@ def visualize_artifact(
         assert isinstance(visualization.value, str)
         table = format_csv_visualization_as_html(visualization.value)
         display(HTML(table))
+    elif visualization.type == VisualizationType.JSON:
+        display(JSON(json.loads(visualization.value)))
     else:
         display(visualization.value)

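Note on the new JSON branch above: the stored visualization is decoded with `json.loads` before being handed to the notebook display object, so malformed content would raise at display time. The sketch below isolates the decoding step and adds a guard that returns `None` on invalid input. That guard is an assumption for illustration and is not what the commit does. `parse_json_visualization` is a hypothetical helper name, and the IPython `display(JSON(...))` call is left out so the snippet runs with the standard library alone.

```python
import json
from typing import Any, Optional, Union


def parse_json_visualization(value: Union[str, bytes]) -> Optional[Any]:
    """Decode a stored JSON visualization, or return None if it is not valid JSON."""
    try:
        # json.loads accepts both str and UTF-8 encoded bytes.
        return json.loads(value)
    except ValueError:
        # json.JSONDecodeError and UnicodeDecodeError both subclass ValueError.
        return None


print(parse_json_visualization('{"key": "value"}'))
print(parse_json_visualization(b"[1, 2]"))
print(parse_json_visualization("not json"))
```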
