Releases: bentoml/BentoML
BentoML-0.9.0
What's New
TLDR;
- New input/output adapter design that lets users choose between batch and non-batch implementations
- Speed up the API model server docker image build time
- Changed the recommended import path of artifact classes; artifact classes should now be imported from bentoml.frameworks.*
- Improved python pip package management
- Huggingface/Transformers support!!
- Managed packaged models with Labels API
- Support GCS(Google Cloud Storage) as model storage backend in YataiService
- Current Roadmap for feedback: #1128
New Input/Output adapter design
A massive refactoring of BentoML's inference API and input/output adapter design, led by @bojiang with help from @akainth015.
BREAKING CHANGE: API definition now requires declaring if it is a batch API or non-batch API:
from typing import List

from bentoml import env, artifacts, api, BentoService
from bentoml.adapters import JsonInput
from bentoml.frameworks.sklearn import SklearnModelArtifact
from bentoml.types import JsonSerializable  # type annotations are optional

@env(infer_pip_packages=True)
@artifacts([SklearnModelArtifact('classifier')])
class MyPredictionService(BentoService):

    @api(input=JsonInput(), batch=True)
    def predict_batch(self, parsed_json_list: List[JsonSerializable]):
        results = self.artifacts.classifier([j['text'] for j in parsed_json_list])
        return results

    @api(input=JsonInput())  # default batch=False
    def predict_non_batch(self, parsed_json: JsonSerializable):
        results = self.artifacts.classifier([parsed_json['text']])
        return results[0]
For APIs with batch=True, the user-defined API function is required to process a list of input items at a time and return a list of results of the same length. By contrast, @api defaults to batch=False, which processes one input item at a time. Implementing a batch API allows your workload to benefit from BentoML's adaptive micro-batching mechanism when serving online traffic, and also speeds up offline batch inference jobs. We recommend batch=True if performance and throughput are a concern. Non-batch APIs are usually easier to implement, good for quick POCs, simple use cases, and deploying on serverless platforms such as AWS Lambda, Azure Functions, and Google Knative.
Read more about this change and example usage here: https://docs.bentoml.org/en/latest/api/adapters.html
BREAKING CHANGE: For DataframeInput and TfTensorInput users, it is now required to add batch=True. DataframeInput and TfTensorInput are special input types that only accept a batch of input at a time.
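For example, a minimal sketch of the required change for a DataframeInput API (the service name is a placeholder, and the artifact import assumes the new bentoml.frameworks path recommended in this release):
from bentoml import env, artifacts, api, BentoService
from bentoml.adapters import DataframeInput
from bentoml.frameworks.sklearn import SklearnModelArtifact

@env(infer_pip_packages=True)
@artifacts([SklearnModelArtifact('model')])
class MyDataframeService(BentoService):

    # DataframeInput always receives a whole batch (a pandas DataFrame),
    # so batch=True must now be declared explicitly
    @api(input=DataframeInput(), batch=True)
    def predict(self, df):
        return self.artifacts.model.predict(df)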
Input data validation while handling batch input
When the API function receives a list of inputs, it is now possible to reject a subset of the input data and return an error code to the client if the input data is invalid or malformed. Users can do this via the InferenceTask#discard API; here's an example:
from typing import List

from bentoml import env, artifacts, api, BentoService
from bentoml.adapters import JsonInput
from bentoml.frameworks.sklearn import SklearnModelArtifact
from bentoml.types import JsonSerializable, InferenceTask  # type annotations are optional

@env(infer_pip_packages=True)
@artifacts([SklearnModelArtifact('classifier')])
class MyPredictionService(BentoService):

    @api(input=JsonInput(), batch=True)
    def predict_batch(self, parsed_json_list: List[JsonSerializable], tasks: List[InferenceTask]):
        model_input = []
        for json, task in zip(parsed_json_list, tasks):
            if "text" in json:
                model_input.append(json['text'])
            else:
                task.discard(http_status=400, err_msg="input json must contain `text` field")

        results = self.artifacts.classifier(model_input)
        return results
The number of discarded tasks plus the length of the returned results list should equal the length of the input list; this allows BentoML to match results back to the tasks that have not been discarded.
Allow fine-grained control of the HTTP response, CLI inference job output, etc. E.g.:
import bentoml
from bentoml.adapters import JsonInput
from bentoml.types import JsonSerializable, InferenceTask, InferenceResult, InferenceError  # type annotations are optional

class MyService(bentoml.BentoService):

    @bentoml.api(input=JsonInput(), batch=False)
    def predict(self, parsed_json: JsonSerializable, task: InferenceTask) -> InferenceResult:
        if task.http_headers['Accept'] == "application/json":
            predictions = self.artifacts.model.predict([parsed_json])
            return InferenceResult(
                data=predictions[0],
                http_status=200,
                http_headers={"Content-Type": "application/json"},
            )
        else:
            return InferenceError(err_msg="application/json output only", http_status=400)
Or when batch=True:
from typing import List

import bentoml
from bentoml.adapters import JsonInput
from bentoml.types import JsonSerializable, InferenceTask, InferenceResult, InferenceError  # type annotations are optional

class MyService(bentoml.BentoService):

    @bentoml.api(input=JsonInput(), batch=True)
    def predict(self, parsed_json_list: List[JsonSerializable], tasks: List[InferenceTask]) -> List[InferenceResult]:
        rv = []
        predictions = self.artifacts.model.predict(parsed_json_list)
        for task, prediction in zip(tasks, predictions):
            if task.http_headers['Accept'] == "application/json":
                rv.append(
                    InferenceResult(
                        data=prediction,
                        http_status=200,
                        http_headers={"Content-Type": "application/json"},
                    ))
            else:
                rv.append(InferenceError(err_msg="application/json output only", http_status=400))
                # or task.discard(err_msg="application/json output only", http_status=400)
        return rv
Other adapter changes:
- Added 3 base adapters for implementing advanced adapters: FileInput, StringInput, MultiFileInput
- Implementing new adapters that support micro-batching is a lot easier now: https://github.com/bentoml/BentoML/blob/v0.9.0.pre/bentoml/adapters/base_input.py
- Per inference task prediction log #1089
- More adapters now support launching a batch inference job from the BentoML CLI run command; see the API reference for detailed examples (https://docs.bentoml.org/en/latest/api/adapters.html) and the sketch below
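For example, a hedged sketch of launching a batch inference job from the CLI for the JsonInput service defined above (the exact input flags depend on the adapter; see the API reference linked above):
$ bentoml run MyPredictionService:latest predict_batch --input '[{"text": "hello"}, {"text": "world"}]'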
Docker Build Improvements
- Optimize docker image build time (#1081) kudos to @ZeyadYasser!!
- Per python minor version base image to speed up image building #1101 #1096, thanks @gregd33!!
- Add "latest" tag to all user-facing docker base images (#1046)
Improved pip package management
Setting pip install options in the BentoService @env specification
As suggested in #1036 (comment) - thanks @danield137 for suggesting the pip_extra_index_url option!
@env(
    auto_pip_dependencies=True,
    pip_index_url='my_pypi_host_url',
    pip_trusted_host='my_pypi_host_url',
    pip_extra_index_url='extra_pypi_index_url'
)
@artifacts([SklearnModelArtifact('model')])
class IrisClassifier(BentoService):
    ...
BREAKING CHANGE: Due to this change, we have removed the previous docker build args PIP_INDEX_URL and PIP_TRUSTED_HOST, as they may conflict with settings in the base image #1036
- Support passing a conda environment.yml file to @env, as suggested in #725
- When a version is not specified in the pip_packages list, it is pinned to the version found in the current Python session. This now also applies to packages added from adapter and artifact classes.
- Support specifying package requirement ranges, e.g.:
  @env(pip_packages=["abc==1.3", "foo>1.2,<=1.4"])
  It can be any pip version requirement specifier: https://pip.pypa.io/en/stable/reference/pip_install/#requirement-specifiers
- Renamed pip_dependencies to pip_packages and auto_pip_dependencies to infer_pip_packages; the old API still works but will eventually be deprecated (see the sketch below)
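A minimal sketch using the new parameter names (combining both options in one @env call is illustrative; the old names are noted in comments):
@env(
    infer_pip_packages=True,                  # previously: auto_pip_dependencies=True
    pip_packages=["pandas", "scikit-learn"],  # previously: pip_dependencies=[...]
)
@artifacts([SklearnModelArtifact('model')])
class IrisClassifier(BentoService):
    ...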
GCS support in YataiService
Adds Google Cloud Storage (GCS) support in YataiService as the storage backend. This is an alternative to AWS S3, MinIO, or a POSIX file system. #1017 - Thank you @Korusuke @PrabhanshuAttri for creating the GCS support!
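For example, a hedged sketch of pointing YataiService at a GCS bucket, assuming the existing --repo-base-url option accepts gs:// URLs (the bucket name is a placeholder):
$ bentoml yatai-service-start --repo-base-url gs://my_bucket/my_prefix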
YataiService Labels API for model management
Managed packaged models in YataiService with labels API implemented in #1064
- Add labels to BentoService.save:
  svc = MyBentoService()
  svc.save(labels={'my_key': 'my_value', 'test': 'passed'})
- Add label query for CLI commands: bentoml get BENTO_NAME, bentoml list, bentoml deployment list, bentoml lambda list, bentoml sagemaker list, bentoml azure-functions list
- Label query supports the =, !=, In, NotIn, Exists, DoesNotExist operators, e.g. key1=value1, key2!=value2, env In (prod, staging), Key Exists, Another_key DoesNotExist (usage sketch below)
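A hedged usage sketch, assuming the label query is passed via a --labels option on these commands (keys and values are placeholders):
$ bentoml list --labels "env In (prod, staging), Key Exists"
$ bentoml deployment list --labels "my_key=my_value, test!=failed"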
(Screenshots: simple key/value label selector; using the In operator.)
BentoML-0.8.6
What's New
Yatai service helm chart for Kubernetes deployment #945 @jackyzha0
Helm chart offers a convenient way to deploy YataiService to a Kubernetes cluster
# Download BentoML source
$ git clone https://github.com/bentoml/BentoML.git
$ cd BentoML
# 1. Install an ingress controller if your cluster doesn't already have one, Yatai helm chart installs nginx-ingress by default:
$ helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx && helm dependencies build helm/YataiService
# 2. Install YataiService helm chart to the Kubernetes cluster:
$ helm install -f helm/YataiService/values/postgres.yaml yatai-service YataiService
# 3. To uninstall the YataiService from your cluster:
$ helm uninstall yatai-service
@jackyzha0 added a great tutorial about YataiService helm chart deployment. You can find the guide at https://docs.bentoml.org/en/latest/guides/helm.html
[Experimental] AnnotatedImageInput adapter for image plus additional JSON data #973 @ecrows
The AnnotatedImageInput adapter is designed for the common use case of an image input accompanied by additional information, such as object detection bounding boxes or segmentation masks, for prediction. This new adapter significantly improves the developer experience over the previous workaround solution.
Warning: Input adapters are currently under refactoring #1002, we may change the API for AnnotatedImageInput in future releases.
import bentoml
from bentoml.adapters import AnnotatedImageInput
from bentoml.artifact import TensorflowSavedModelArtifact

CLASS_NAMES = ['cat', 'dog']

@bentoml.artifacts([TensorflowSavedModelArtifact('classifier')])
class PetClassification(bentoml.BentoService):

    @bentoml.api(input=AnnotatedImageInput)
    def predict(self, image, annotations):
        cropped_pets = some_pet_finder(image, annotations)
        results = self.artifacts.classifier.predict(cropped_pets)
        return [CLASS_NAMES[r] for r in results]
Making a request using curl
$ curl -F image=@image.png -F annotations=@annotations.json http://localhost:5000/predict
You can find the current API reference at https://docs.bentoml.org/en/latest/api/adapters.html#annotatedimageinput
Improvements:
- #992 Make the prediction and feedback loggers log to console by default - @jackyzha0
- #952 Add tutorial for deploying BentoService to Azure SQL server to the documentation @yashika51
Bug Fixes:
- #987 & #991 Better AWS IAM roles handling for SageMaker deployment - @dinakar29
- #995 Fix an edge case causing RecursionError when running the gunicorn server with --enable-microbatch on macOS - @bojiang
- #1012 Fix ruamel.yaml missing issue when using containerized BentoService with conda - @parano
Internal & Testing:
- #983 Move CI tests to Github Actions
Contributors:
Thank you, everyone, for contributing to this exciting release!
@bojiang @jackyzha0 @ecrows @dinakar29 @yashika51 @akainth015
BentoML-0.8.5
BentoML-0.8.4
What's New
Breaking Change: JsonInput migrating to batch API #860, #953
We are officially changing JsonInput to use the batch-oriented syntax. As of this release (0.8.4), all input adapters in BentoML have migrated to this design. The main difference is that the input parameter of the user-defined API function is now a list of JSONSerializable objects (dict, list, int, float, str) instead of a single JSONSerializable object, and the expected return value is an iterable of the same length. This makes it possible for API endpoints using the JsonInput adapter to take advantage of BentoML's adaptive micro-batching capability.
Here is an example of how JsonInput (formerly JsonHandler) used to work:
@bentoml.api(input=LegacyJsonInput())
def predict(self, parsed_json):
    results = self.artifacts.classifier([parsed_json['text']])
    return results[0]
And here is an example with the new JsonInput class:
@bentoml.api(input=JsonInput())
def predict(self, parsed_json_list):
    texts = [j['text'] for j in parsed_json_list]
    return self.artifacts.classifier(texts)
The old non-batching JsonInput is still available to help with the transition: simply use from bentoml.adapters import LegacyJsonInput as JsonInput to replace the JsonInput or JsonHandler in code written before BentoML 0.8.4. The LegacyJsonInput behaves exactly the same as JsonInput in previous releases. We will keep supporting it until BentoML version 1.0.
Custom Web UI support in API Server (#839)
Custom web UI can be added to your API server now! Here is an example project: https://github.com/bentoml/gallery/tree/master/scikit-learn/iris-classifier
Add your web frontend project directory to your BentoService class and BentoML will automatically bundle all the web UI files and host them when starting the API server:
from bentoml import env, artifacts, api, web_static_content, BentoService
from bentoml.adapters import DataframeInput
from bentoml.artifact import SklearnModelArtifact

@env(auto_pip_dependencies=True)
@artifacts([SklearnModelArtifact('model')])
@web_static_content('./static')
class IrisClassifier(BentoService):

    @api(input=DataframeInput())
    def predict(self, df):
        return self.artifacts.model.predict(df)
Artifact packing & loading workflow #911, #921, #949
We have refactored the Artifact API, which brings more flexibility to how users package their trained models with BentoML's API.
The most noticeable change is that users can now separate the model training job from BentoService development: use the Artifact API to save a trained model from the training job, then load it later when creating the BentoService class for model serving, e.g.:
Step 1, model training:
from sklearn import svm
from sklearn import datasets

from bentoml.artifact import SklearnModelArtifact

if __name__ == "__main__":
    # Load training data
    iris = datasets.load_iris()
    X, y = iris.data, iris.target

    # Model Training
    clf = svm.SVC(gamma='scale')
    clf.fit(X, y)

    # save just the trained model with the SklearnModelArtifact to a specific directory
    btml_model_artifact = SklearnModelArtifact('model')
    btml_model_artifact.pack(clf)
    btml_model_artifact.save('/tmp/temp_bentoml_artifact')
Step 2: Build BentoService class with the saved artifact:
from bentoml import env, artifacts, api, BentoService
from bentoml.adapters import DataframeInput
from bentoml.artifact import SklearnModelArtifact

@env(auto_pip_dependencies=True)
@artifacts([SklearnModelArtifact('model')])
class IrisClassifier(BentoService):

    @api(input=DataframeInput())
    def predict(self, df):
        # Optional pre-processing, post-processing code goes here
        return self.artifacts.model.predict(df)

if __name__ == "__main__":
    # Create an IrisClassifier service instance
    iris_classifier_service = IrisClassifier()

    # load the previously saved artifact
    iris_classifier_service.artifacts.get('model').load('/tmp/temp_bentoml_artifact')

    saved_path = iris_classifier_service.save()
This workflow makes developing and debugging BentoService code a lot easier; users no longer need to retrain their models every time they change something in the BentoService class definition and want to try it out.
- Note that the old BentoService class method 'pack' has now been deprecated in this release #915
Add bentoml containerize command (#847, #884, #941)
$ bentoml containerize --help
Usage: bentoml containerize [OPTIONS] BENTO
Containerizes given Bento into a ready-to-use Docker image.
Options:
-p, --push
-t, --tag TEXT Optional image tag. If not specified, Bento will
generate one from the name of the Bento.
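Typical usage might look like the following (the image tag is a placeholder; the containerized service serves on port 5000 by default):
$ bentoml containerize IrisClassifier:latest -t iris-classifier:latest
$ docker run -p 5000:5000 iris-classifier:latest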
Support multiple images in the same request (#828)
A new input adapter class MultiImageInput (https://docs.bentoml.org/en/latest/api/adapters.html#multiimageinput) has been added. It is designed for prediction services that require multiple image files as their input:
import bentoml
from bentoml import BentoService
from bentoml.adapters import MultiImageInput

class MyService(BentoService):

    @bentoml.api(input=MultiImageInput(input_names=('imageX', 'imageY')))
    def predict(self, image_groups):
        for image_group in image_groups:
            image_array_x = image_group['imageX']
            image_array_y = image_group['imageY']
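Requests are then sent as multipart form data with one file field per input name; a hedged curl sketch (file names are placeholders):
$ curl -F imageX=@image_x.png -F imageY=@image_y.png http://localhost:5000/predict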
Add FileInput adapter (#734)
A new input adapter class FileInput for handling arbitrary binary files as the input for your prediction service: https://github.com/bentoml/BentoML/blob/v0.8.4/bentoml/adapters/file_input.py#L33
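A hedged sketch of a service using FileInput, assuming the batch-style API function receives a list of binary file-like objects (the PickleArtifact model is a placeholder):
from bentoml import BentoService, api, artifacts, env
from bentoml.adapters import FileInput
from bentoml.artifact import PickleArtifact

@env(auto_pip_dependencies=True)
@artifacts([PickleArtifact('model')])
class BinaryFileService(BentoService):

    @api(input=FileInput())
    def predict(self, file_streams):
        # each element is a binary file-like object uploaded by the client
        raw_payloads = [fs.read() for fs in file_streams]
        return self.artifacts.model.predict(raw_payloads)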
Added Ngrok support (#917)
Expose your local development model API server over a public URL endpoint, using Ngrok under the hood. To try it out, simply add the --run-with-ngrok flag to your bentoml serve CLI command, e.g.:
bentoml serve IrisClassifier:latest --run-with-ngrok
Add support for CoreML (#939)
Serving CoreML models on macOS is now supported! Users can also convert models trained with other frameworks to the CoreML format, for better performance on Apple platforms. Here's an example of serving a PyTorch model with CoreML and BentoML:
import torch
from torch import nn

class PytorchModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(5, 1, bias=False)
        torch.nn.init.ones_(self.linear.weight)

    def forward(self, x):
        x = self.linear(x)
        return x

# ------
import numpy
import pandas as pd
import coremltools as ct
from coremltools.models import MLModel  # pylint: disable=import-error

import bentoml
from bentoml.adapters import DataframeInput
from bentoml.artifact import CoreMLModelArtifact

# example input matching the model's 5 input features
# (placeholder added so this snippet is self-contained)
test_df = pd.DataFrame([[1.0, 1.0, 1.0, 1.0, 1.0]])

@bentoml.env(auto_pip_dependencies=True)
@bentoml.artifacts([CoreMLModelArtifact('model')])
class CoreMLClassifier(bentoml.BentoService):

    @bentoml.api(input=DataframeInput())
    def predict(self, df: pd.DataFrame) -> float:
        model: MLModel = self.artifacts.model
        input_data = df.to_numpy().astype(numpy.float32)
        output = model.predict({"input": input_data})
        return next(iter(output.values())).item()

def convert_pytorch_to_coreml(pytorch_model: PytorchModel) -> ct.models.MLModel:
    """CoreML is not for training ML models but rather for converting pretrained models
    and running them on Apple devices. Therefore, in this example we convert the
    pretrained PytorchModel (adapted from the tests.integration.test_pytorch_model_artifact
    module) into a CoreML model."""
    pytorch_model.eval()
    traced_pytorch_model = torch.jit.trace(pytorch_model, torch.Tensor(test_df.values))
    model: MLModel = ct.convert(
        traced_pytorch_model, inputs=[ct.TensorType(name="input", shape=test_df.shape)]
    )
    return model

# ------
if __name__ == '__main__':
    svc = CoreMLClassifier()
    pytorch_model = PytorchModel()
    model = convert_pytorch_to_coreml(pytorch_model)
    svc.pack('model', model)
    svc.save()
Breaking Change: Remove CLI --with-conda option #898
Running inference jobs within an automatically generated conda environment seemed like a good idea at first, but we realized it introduces more problems than it solves. We are removing this option and encourage users to use docker for running inference jobs instead.
Improvements:
- #966, #968 Faster save by improving Python local module parsing code
- #878, #879 Faster import bentoml with lazy module loader
- #872 Add BentoService API name validation
- #887 Set a smaller page limit for bentoml list
- #916 Do not cache pip requirements in Dockerfile
- #918 Improve error handling when the micro-batching service is unavailable
- #925 Artifact refactoring: set_dependencies method
- #932 Add warning for SavedBundle Python version mismatch
- #904 JsonInput handling of AWS Lambda events should ignore the content type header
- #951 Add openjdk to H2O artifact default conda dependencies
- #958 Fix typo in CLI default argument help message
Bug fixes:
- #864 Fix decode headers with latin1
- #867 Fix DataFrameInput passing NaN values over HTTP JSON request
- #869 Change the default mb_max_latency value to avoid flaky micro-batching initialization
- #897 Fix yatai web client import
- #907 Fix CORS option in AWS Lambda SAM config
- #922 Fix lambda deployment when using AWS assumed-role ARN
- #959 Fix RecursionError: maximum recursion depth exceeded when saving BentoService bundle
- #969 Fix error in CLI command bentoml --version
Internal & Testing
BentoML-0.8.3
BentoML-0.8.2
What's New?
- Support Debian-slim docker images for containerizing the model server, #822 by @jackyzha0. Users can opt in with:
  @env(
      auto_pip_dependencies=True,
      docker_base_image="bentoml/model-server:0.8.2-slim-py37"
  )
- New bentoml retrieve command for downloading a saved bundle from the remote YataiService model registry, #810 by @iancoffey
  bentoml retrieve ModelServe:20200610145522_D08399 --target_dir /tmp/modelserve
- Added --print-location option to the bentoml get command to print the saved path, #825 by @jackyzha0
  $ bentoml get IrisClassifier:20200625114130_F3480B --print-location
  /Users/chaoyu/bentoml/repository/IrisClassifier/20200625114130_F3480B
- Support the DataFrame input JSON format orient parameter. DataframeInput now supports all pandas JSON orient options: records, columns, values, split, index. #809 #815, by @bojiang
  For example, with orient="records":
  @api(input=DataframeInput(orient="records"))
  def predict(self, df):
      ...
  The API endpoint will expect HTTP requests with a JSON payload in the following format:
  [{"col 1":"a","col 2":"b"},{"col 1":"c","col 2":"d"}]
  Or with orient="index":
  '{"row 1":{"col 1":"a","col 2":"b"},"row 2":{"col 1":"c","col 2":"d"}}'
  See pandas's documentation on the orient option of the to_json/from_json functions for more detail: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_json.html
- Support Azure Functions deployment (beta). A new fully automated cloud deployment option that BentoML provides in addition to AWS SageMaker and AWS Lambda. See the usage documentation here: https://docs.bentoml.org/en/latest/deployment/azure_functions.html
- ModelServer API Swagger schema improvements, including the ability to specify an example HTTP request, #807 by @Korusuke
- Add prediction logging when deploying with AWS Lambda, #790 by @jackyzha0
- Fixed micro-batching parameters (max latency and max batch size) not being applied, #818 by @bojiang
- Fixed issue with handling CSV file input by following RFC 4180, #814 by @bojiang
- Fixed TfTensorOutput casting floats as ints #813, in #823 by @bojiang
Announcements:
- The BentoML team has created a new mailing list for future announcements and community-related discussions. Join now here!
- For those interested in contributing to BentoML, there are new contributing docs now, be sure to check them out.
- We are starting a bi-weekly community meeting for community members to demo new features they are building, discuss the roadmap and gather feedback, etc. More details will be announced soon.
BentoML-0.8.1
What's New?
- Service API Input/Output adapter #783 #784 #789, by @bojiang
  - A new API for defining service input and output data types and configs
  - The new InputAdapter is essentially the API Handler concept in BentoML prior to the 0.8.x release
  - The old API Handler syntax is being deprecated; it will continue to be supported until version 1.0
  - The main motivation for this change is to enable us to build features such as new API output types (such as file/image as service output), gRPC support, better OpenAPI support, and more performance optimizations in online serving down the line
- Model server docker image build improvements #761
  - Reduced docker build time by using a pre-built BentoML model server docker image as the base image
  - Removed the dependency on apt-get and conda from the custom docker base image
  - Added an alpine-based docker image for model server deployment
- Improved Image Input handling:
  - Add micro-batching support for ImageInput (former ImageHandler) #717, by @bojiang
  - Add support for using a list of images as input from CLI prediction run #731, by @bojiang
  - In the new Input Adapter API introduced in 0.8.0, the LegacyImageInput is identical to the previous ImageHandler
  - The new ImageInput works only for single image input, unlike the old ImageHandler
  - For users using the old ImageHandler, we recommend migrating to the new ImageInput if it is only used to handle single image input
  - For users using ImageHandler for multiple image inputs, wait until the MultiImageInput is added, which will be a separate input adapter type
- Added JsonArtifact for storing configuration and JsonSerializable data #746, by @lemontheme (see the sketch below)
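A minimal sketch of using JsonArtifact, with the artifact name and config contents as placeholders:
from bentoml import BentoService, api, artifacts, env
from bentoml.adapters import DataframeInput
from bentoml.artifact import JsonArtifact

@env(auto_pip_dependencies=True)
@artifacts([JsonArtifact('config')])
class ConfiguredService(BentoService):

    @api(input=DataframeInput())
    def predict(self, df):
        # the packed JSON-serializable object is available as self.artifacts.config
        threshold = self.artifacts.config.get('threshold', 0.5)
        return (df.iloc[:, 0] > threshold).tolist()

# pack the configuration dict alongside the model artifacts before saving
svc = ConfiguredService()
svc.pack('config', {'threshold': 0.8})
svc.save()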
Bug Fixes & Improvements:
- Fixed SageMaker deployment ModuleNotFoundError due to wrong gevent version #785 by @flosincapite
- Fixed SpacyModelArtifact not exposed in bentoml.artifacts #782, by @docteurZ
- Fixed errors when inheriting handler #767, by @bojiang
- Removed future statements for py2 support, #756, by @jjmachan
- Fixed bundled_pip_dependencies installation on AWS Lambda deployment #794
- Removed aws.region config, use AWS CLI's own config instead #740
- Fixed SageMaker deployment CLI: delete deployment with namespace specified #741
- Removed pandas from the BentoML dependencies list, it is only required when using DataframeInput #738
Internal, CI, Testing:
- Added docs watch script for Linux #781, by @akainth015
- Improved build bash scripts #774, by @akainth015, @flosincapite
- Fixed YataiService end-to-end tests #773
- Added PyTorch integration tests #762, by @jjmachan
- Added ONNX integration tests #726, by @yubozhao
- Added linter and formatting check to Travis CI
- Codebase cleanup, reorganized deployment and repository module #768 #769 #771
Announcements:
- The BentoML team is planning to start a bi-weekly community meeting to demo new features, discuss the roadmap and gather feedback. Join the BentoML slack channel for more details: click to join BentoML slack.
- There are a few issues with the PyPI release 0.8.0 that made it unusable. The newer 0.8.1 release has those issues fixed. Please do not use version 0.8.0.
BentoML-0.7.8
What's New?
- ONNX model support with onnxruntime backend. More example notebooks and tutorials are coming soon!
- Added Python 3.8 support
Documentation:
- BentoML API Server architecture overview https://docs.bentoml.org/en/latest/guides/micro_batching.html
- Deploying YataiService behind Nginx https://docs.bentoml.org/en/latest/guides/yatai_service.html
Internal:
- [benchmark] moved benchmark notebooks to a separate repo: https://github.com/bentoml/benchmark
- [CI] Enabled Linting style check test on Travis CI, contributed by @kautukkundan
- [CI] Fixed all existing linting errors in bentoml and tests module, contributed by @kautukkundan
- [CI] Enabled Python 3.8 on Travis CI
Announcements:
- There will be breaking changes in the coming 0.8.0 release, around ImageHandler, custom Handler and custom Artifacts. If you're using those features in production, please reach out.
- Help us promote BentoML on Twitter @bentomlai and Linkedin Page!
- Be sure to join the BentoML slack channel for roadmap discussions and development updates, click to join BentoML slack.
BentoML-0.7.7
What's New?
- Support custom docker base image, contributed by @withsmilo
- Improved model saving & loading with YataiService backed by S3 storage, contributed by @withsmilo, BentoML now works with custom s3-like services such as a MinIO deployment
Improvements & Bug Fixes
- Fixed a number of issues that are breaking Windows OS support, contributed by @bojiang
- [YataiService] Fixed an issue where the deployment namespace configured on the server-side will be ignored
Internal:
- [CI] Added Windows test environment in BentoML's CI test setup on Travis
Announcements:
- Help us promote BentoML on Twitter @bentomlai and Linkedin Page!
- Be sure to join the BentoML slack channel for roadmap discussions and development updates, click to join BentoML slack.
BentoML-0.7.6
What's New?
- Added Spacy Support, contributed by @spotter (#641)
- Support custom s3_endpoint_url in BentoML’s model registry component(YataiService) (#656)
- YataiService client can now connect via secure gRPC (#650)
Improvements & Bug Fixes
- Micro-batching server performance optimization & troubleshoot back pressure (#630)
- [YataiService] Included the required PostgreSQL dependency in the YataiService docker image by default
- [Documentation] New fastest example project
- [Bug Fix] Fixed overwriting pip_dependencies specified through @env (#657 #642)
Internal:
- [Benchmark] released newly updated benchmark notebook with latest changes in micro batching server
- [Benchmark] notebook updates and count dropped requests (#645)
- [e2e test] Added e2e test using dockerized YataiService gRPC server