CHANGELOG.md

File metadata and controls

1564 lines (1416 loc) · 172 KB

Change Log

Full Changelog

Fixed bugs:

  • [gcp] v0.3.0-rc.1 cert manager can't get the SSL certificate #1666

Closed issues:

  • [GCP] Update credentials filename #1715
  • jsonnet test is failing but no jsonnet files changed in the PR. #1707
  • Jupyterlab service account token #1648
  • Create 0.3. Release #1541

Merged pull requests:

v0.3.0 (2018-10-04)

Full Changelog

Fixed bugs:

  • v0.3.0-rc.1: ERROR no prototype names matched 'pytorch-operator' #1663

Closed issues:

  • Accessing custom metrics in our Python model #1681
  • [v0.3.0-rc.1] "/" doesn't redirect to the centraldashboard UI #1670
  • Run tests periodically on the release branch starting with 0.3 #1603
  • Kubebench-job default param mainJobConfig points to a wrong path #1596
  • Release process for kubebench #1510

Merged pull requests:

v0.3.0-rc.3 (2018-10-02)

Full Changelog

Closed issues:

  • [v0.3.1-rc.1] JupyterHub spawner is missing TF images for 1.9 and 1.10 #1672
  • [v0.3.0-rc.1] Only TF 1.8 CPU shows up in the list of prepopulated Jupyter images #1671
  • [GCP] Click to deploy doesn't create any K8s resources when version is 0.2.5 #1631
  • Update Katib image in 0.3 branch #1604
  • central dashboard image build workflow doesn't work #1575
  • [Image Auto Release] process has lots of issues; should we use prow? #1574
  • Image Auto Release Cron Job is failing #1563
  • ack_guide.md out of date #1293
  • GKE: can not read from google cloud storage in Jupyter notebook #1249
  • Friction log for bootstrapper documentation #927
  • Create a minimal release process for our ksonnet configs #215

Merged pull requests:

v0.3.0-rc.2 (2018-09-30)

Full Changelog

Fixed bugs:

  • Deploy kfctl.sh apply k8s : Service "ambassador" is invalid ? #1566

Merged pull requests:

v0.3.0-rc.1 (2018-09-28)

Full Changelog

Closed issues:

  • [GCP] Deploy script fails with unsupported k8s version 1.9.7-gke.5 #1641
  • Update CentralUI image used at head and 0.3 branch #1435
  • kfctl.sh needs to get initial cluster version based on get-server-config #1359

Merged pull requests:

  • Cherrypick (#1657) tag image and change libsonnet for centraldashboard image #1660 (swiftdiaries)
  • Tag and update centraldashboard image #1657 (swiftdiaries)
  • Cherry-pick #1589 to v0.3 #1656 (lluunn)
  • Cherrypick #1650; automatically set master version to supported version. #1654 (jlewi)
  • Update initial cluster version in cluster.jinja based on what gcloud get-server-config returns #1650 (ashahba)
  • Enable periodic prow tests #1649 (richardsliu)
  • Build image for centralui part of presubmit #1623 (swiftdiaries)

v0.2.6 (2018-09-28)

Full Changelog

Fixed bugs:

  • Ambassador Version 0.34.0 causing DNS Issues on Worker Node #945

Closed issues:

  • Update document on Tf Serving #1634
  • [test flake] pre and postsubmit failures deploying mnist #1617
  • tensorboard prototypes should include the optionalParams available in that prototype #1610
  • Katib ksonnet component needs an E2E test #1607
  • Update Kubebench images on 0.3. branch #1602
  • Update Jupyter images on 0.3. branch #1601
  • Update PyTorch Job image on 0.3 branch #1600
  • Update TFJob on 0.3 Branch #1599
  • Update Seldon to 0.2.3 on 0.3 release branch #1598
  • Envoy unable to read config #1588
  • deploying kubeflow with bootstrapper failed #1586
  • How can we modify jupyterhub configuration to use GitHub Authentication? #1585
  • Jupyter Image Builds Are failing; Dependency issue related to TFMA? #1576
  • bump gke version to 1.10.7-gke.2 #1572
  • presubmit build is failing with a quota error #1562
  • refactor tf-job-operator to match style-guide of libsonnet #1534
  • Test to verify we can deploy Katib #1483
  • [test flake] mnist gpu test is very flaky; not enough GPU; autoscaling not enabled for the GPU pool #1436
  • Make TF serving component more readable and extendable #1264
  • E2e test of TF Serving using built-in HTTP api #1258
  • [Discussion] TF serving image: should we just keep Dockerfile for the latest TF version? #1089
  • bootstrapper should support push the ksonnet app to a source repo #912
  • Investigate using tensorflow serving's built-in http server #896

Merged pull requests:

4e7f4ed (2018-09-19)

Full Changelog

Closed issues:

  • kfctl.sh needs to call get-server-config to get GKE version #1570
  • Certificate not working #1567
  • "Patching IAM bindings" halt during deployment. #1559
  • cert-manager missing clusterrole #1554
  • Jupyter Notebooks for TF 1.9 and 1.10 #1546
  • test_jsonnet is failing in postsubmit #1543
  • Directory ${KFAPP} already exists #1530
  • Move kubebench package to kubeflow repo #1513
  • cloud endpoint prototype breaks on master. #1507
  • Error: Failed to apply app: find objects: RUNTIME ERROR: Field does not exist: v1 #1506
  • PR shows review is not required to merge #1503
  • update tensorboard to use the same pattern as kubeflow/core/prototypes #1500
  • No prototype names matched 'kubeflow-core' #1492
  • cannot list namespace on tfjob dashboard #1491
  • Issue installing on GKE with deploy script #1489
  • Installation fails on Amazon EKS #1488
  • add ability to only generate parts of a component in the jsonnet file #1486
  • Prometheus for seldon models #1484
  • Multiple issues with gke/deploy.sh #1481
  • Katib StudyJob failed to mount directory #1480
  • New image release for pytorch operator #1479
  • simplify tensorboard as separate aws, gcp prototypes #1477
  • Cut release 0.2.5 #1476
  • do you have a performance benchmarks when run Horovod with your openmpi component? #1461
  • Review/extend jovyan permissions in TF notebooks #1438
  • [gcp] VM account should have GCS read only scope to support pulling from GCR #1432
  • kfctl.sh unable to find component ambassador #1429
  • [kfctl] support specify registries & version in "/kfctl/apps/create" request. #1417
  • standardize remaining <component>.{jsonnet,libsonnet} files #1414
  • Update 0.2 blog with new deployment script #1390
  • Update E2E test to use kfctl.sh and delete gke/deploy.sh; #1331
  • Docker image building workflows are failing #1135
  • [bootstrap] Fail to update role kubeflow.jupyter-role #1076
  • camelCase for some recently fixed params #1050
  • Add document on Stackdriver agents #997
  • Suggest using simple port forwarding instead of LoadBalancer for cloud deploy in User Guide #860
  • Need docs for TFJobs UI #573
  • Update docs to mention known issues with ksonnet and windows #501
  • Need: User facing website for Kubeflow that details how to choose a stack #213
  • Tutorial(s) that correspond to CUJs #85

Merged pull requests:

v0.2.5 (2018-09-04)

Full Changelog

Fixed bugs:

  • JupyterHub Version Mismatch #1393

Closed issues:

  • Error setting up kubeflow in minikube #1459
  • Current documentation for setting up Kubeflow in minikube not working #1455
  • presubmit failure for jsonnet test: name must be set #1453
  • What is image registry.opensource.zalan.do/teapot/external-dns ? #1446
  • What's the function of tfReplicaType: Master ? #1442
  • Bootstrapper fails in docker-for-desktop #1430
  • tf_job_simple_test results not being reported #1426
  • deploy.sh should be restart-aware in terms of directory structure #1422
  • kfctl.sh should not assume uuidgen is present #1415
  • How to spawn the jupyter container as a root user #1412
  • Don't use DM for IAM policy management #1401
  • Trigger minikube E2E test on presubmit when minikube test is modified #1350
  • GKE version "foo" is unsupported. #1348
  • CentralDashboard returns 404; Ambassador can't parse the route #1306
  • When creating releases we should pin the version of source.tar.gz used in the deploy.sh #1239
  • provide the ability to add imagePullSecrets to different ServiceAccounts so that private images can be fetched #1231
  • Restrict privilege of Kubeflow services accounts such as tf-job-operator to namespace level #1213
  • Minikube deploy script should start minikube #1153
  • [gcp] Use BackendConfig to enable IAP #1146
  • Make jupyterlab discoverable/default #1124
  • Enable test for tf-job-simple prototype for v1alpha2 #1048
  • Initial report for spartakus metrics #351
  • Batch Prediction using GPUs with local runner #251

Merged pull requests:

v0.2.4-rc.0 (2018-08-21)

Full Changelog

v0.2.4 (2018-08-21)

Full Changelog

Closed issues:

  • The testing/install_minikube.sh script assumes the host OS is Ubuntu. #1383
  • Add Argo UI to Ambassador and Central UI #1310

Merged pull requests:

v0.2.3-rc.0 (2018-08-17)

Full Changelog

v0.2.3 (2018-08-17)

Full Changelog

Features and improvements:

  • Extra packages in jupyterhub-image #1175
  • Use GKE auto-scaling when configuring node pools in the provided Deployment Manager configs #1033

Fixed bugs:

  • apparent issue with 0.2.0 tag re: ks registry #1115
  • Ambassador crashing on minikube #734

Closed issues:

  • Jupyter-role error applying kubeflow-core component with ksonnet #1353
  • Jupyter notebook Connection failed because Ambassador doesn't enable websockets #1344
  • Need to update glide's ksonnet version to ^0.11.0 #1340
  • PS still running after tfjob is complete. #1334
  • Central UI should include a link to Kubeflow docs website #1318
  • Istio integration doc: Point to kubeflow/website documentation for #1315
  • Build and debug improvements for bootstrapper #1312
  • Fix incorrect links to user_guide in kubeflow.org #1300
  • JupyterHub login unauthorized (401) #1296
  • Katib apply fail with error: Field does not exist: modeldbDatabaseImage #1291
  • ambassador crashing on node with wrong DNS resolver address due to misconfigured kubelet #1289
  • Move Katib documentation to kubeflow website #1286
  • How can we change the tensorflow image in kubeflow? #1285
  • [gcp] deploy.sh should support rerunning deploy.sh when DM configs and ks app already exist #1284
  • TFJob operator v1alpha2 doesn't work with TF.Estimator API for TF <=1.6 #1283
  • [GCP] deploy.sh fails; can't create filestore because network is legacy #1282
  • [GCP] deploy.sh filestore API not enabled for project #1280
  • [GCP] deploy.sh gcloud error Invalid choice: 'filestore'. #1279
  • [gcp-deployer] Set up "cors-anywhere" proxy service for k8s api requests from web app #1276
  • Deploy argo by default; add it to deploy.sh scripts #1268
  • [Test Flake] simple tf job failing; Job not found waiting for job #1266
  • ERROR no prototype names matched 'kubeflow/core' #1263
  • TF Serving GPU test failing #1262
  • Keras training in Kubeflow on GKE gets "Killed" #1261
  • Better installation guide on kubeflow.org #1257
  • Create a batch predict example #1250
  • GPU support on GKE not available #1246
  • TF-job package missing #1245
  • Spawning Jupyter failed; user jovyan does not have permission to write to default storage class #1241
  • "Getting Involved" in README.md should point to kubeflow.org #1237
  • scripts/gke/deploy.sh fails when kubeflow_deployment_manager_configs/ exists #1233
  • [openmpi]- NodeSelector not working. #1230
  • [GCP] deploy.sh - don't show error if deployment doesn't exist #1222
  • Can't start jupyter-notebook pod in kubeflow version 0.2.1 #1221
  • [Test Failure] TFJob test failure; no module named py #1218
  • Error from server (NotFound): tfjobs.kubeflow.org "mycnnjob" not found #1217
  • Getting started error: No such file or directory: 'cluster-kubeflow.yaml' #1206
  • Getting started error #1205
  • README.md QuickStart should refer to kubeflow.org Getting Start #1202
  • "getting-started-gke" installer fails #1201
  • [gcp] deploy.sh shouldn't download secrets to the same directory as the DM configs #1197
  • deploy.sh is broken; wrong directory for the unpack? #1193
  • unable to spawn jupyter notebook - volume name is too long #1177
  • Delete old GCP configs #1171
  • Test Flake gke teardown failed; insufficient quota #1166
  • Create Prometheus Component in Kubeflow Core. #1160
  • Jupyter image suitable for running the examples/codelabs #1157
  • Create links on kubeflow.org to redirect to ksonnet tarballs and deploy scripts #1156
  • Make it easy to customizes PVs attached to Jupyter pods #1125
  • Split all prototype into separate prototypes #1107
  • Support for AVX2 when using deployment manager #1082
  • [gcp-click-to-deploy] Deploy click to deploy web app #1056
  • Bootstrapper release instructions need to explain how to build at appropriate commit #1053
  • Don't check in vendor for bootstrap; it adds 160M which slows down cloning the registry as part of Kubeflow deployment #1051
  • unflake TF serving testing #1031
  • Support and testing different versions of TF serving images #1005
  • Serving path should support logging request input/output #1000
  • 'ks delete ${KF_ENV} -c kubeflow-core' doesn't take down user notebook pods #968
  • Bootstrapper should support file and http registries in a consistent manner #962
  • Click-to-deploy UI upgrade. #959
  • Remove "alpha" in deployment manager config when gke-1.10.2 is public #821
  • TF Serving test flaky #815
  • TFJobs UI doesn't work behind IAP; React APP needs support IAP? #574
  • Can not launch TensorFlow Serving because AVX not available on VM #421
  • Trigger rebuild of TF serving image in E2E test only when files change #371
  • HTTP Proxy and TFServing should not use the same resource defaults #360
  • Add script to export inception into SaveModel from checkpoints #229
  • ks apply -f tf-job.jsonnet, -f is not a valid flag #201
  • e2e test for http-proxy #198
  • Make https://hub.docker.com/r/kubeflow/jupyterhub/ a community resource #197
  • How to port model developed in Jupyter notebook to TFJobs #110

Merged pull requests:

v0.2.2-rc.0 (2018-07-13)

Full Changelog

v0.2.2 (2018-07-13)

Full Changelog

Closed issues:

  • TFMA plots don't render; GET tfma_widget_js.js returns 404 #1130

Merged pull requests:

v0.2.1-rc.1 (2018-07-12)

Full Changelog

v0.2.1 (2018-07-12)

Full Changelog

Closed issues:

  • Use PV by default mounted at /home/jovyan #1187
  • Central UI image needs to be updated in 0.2.1 release; it is too old. #1147
  • metrics_collector should emit K8s events to indicate when Kubeflow is ready #1142
  • Jupyter images in 0.2.1 need to be upgraded #1129

Merged pull requests:

v0.2.1-rc.0 (2018-07-11)

Full Changelog

Closed issues:

  • Test Flake deploy.sh fails trying to enable the deployment service #1158
  • Make downloading our ksonnet registry for getting started efficient #1154
  • kubeflow cluster cannot pull image from GCR within same Project. #1139
  • [Test Flake] vm_util.wait_for_operation needs to retry on socket error #1137
  • Ambassador pod failed to run because kube-dns not running #1134
  • [Test Flake] tf_job_simple_test needs retries for ks init to deal with git connection issues #1128
  • PyTorch job prototype should contain the full job spec #1114
  • Finalize release 0.2.0 #1070
  • [gcp] GKE setup; do as much as possible in deploy.sh #1068
  • [gcp click-to-deploy] Need to build docker container #1055
  • Cherry-pick for release v0.2.0-rc.1 #1024
  • Cut a 0.2 release branch #964
  • [gcp] monitoring agent need to emit events/status information particularly related to IAP #955
  • Include GPU daemonset in GKE configs? #288
  • KubeFlow or Kubeflow? #44
  • Tooling to manage configuration and deployment #23

Merged pull requests:

  • Cherry pick upgrading the central UI to v0.2.1 #1182 (jlewi)
  • Make the deploy scripts more efficient and other fixes. (#1174) #1180 (jlewi)
  • Cherry Pick: Don't check in bootstrap/vendor. (#1152) #1176 (jlewi)
  • Make the deploy scripts more efficient and other fixes. #1174 (jlewi)
  • Make GKE VM service account storage.objectViewer to have read access of gcr #1164 #1172 (jlewi)
  • Cherrypick: Tag the latest Jupyter images with v0.2.1 (#1144) #1170 (jlewi)
  • Delete kubeflow namespace before deleting the cluster #1167 (ankushagarwal)
  • Make GKE VM service account storage.objectViewer #1164 (kunmingg)
  • Give KF_USER_NAME service account roles/cloudbuild.builds.editor role #1163 (ankushagarwal)
  • Skip project setup during deployment. #1162 (jlewi)
  • Don't check in bootstrap/vendor. #1152 (jlewi)
  • Make Katib work with Ambassador. (#1103) #1150 (jlewi)
  • Fix the makefile so that we tag the image with the commit. #1149 (jlewi)
  • Tag the latest Jupyter images with v0.2.1 #1144 (jlewi)
  • [gcp-deployer] Use Material UI components and fonts #1138 (yebrahim)
  • Set requests and limits for RAM and CPU in TF notebook image releaser. #1136 (jlewi)
  • Make the test robust to test flakes due to problems initializing the ksonnet app #1133 (jlewi)
  • Fix TFMA jupyter extensions. #1131 (jlewi)
  • Fix typos #1127 (idealhack)
  • add service monitor prototype for monitoring deployment status #1123 (kunmingg)
  • Add Dockerfile and Makefile to build docker images #1122 (ankushagarwal)
  • Pin and fix katib images. (#1113) #1120 (jlewi)
  • Put the full PyTorch prototype in the jsonnet file. #1119 (jlewi)
  • Pin and fix katib images. #1113 (jlewi)
  • Create a deployment script for gke and minikube #1111 (ankushagarwal)
  • Add a jupyter-notebook-role and use it for notebooks in jupyterhub #1110 (ankushagarwal)
  • [gcp-deployer] Rudimentary progress in logs #1108 (yebrahim)
  • add katib releaser #1102 (YujiOshima)
  • Create a script to update a ksonnet app to the latest Kubeflow package #1100 (jlewi)
  • Tfjob create fails in tfjob UI #1099 (kkasravi)

v0.2.0 (2018-06-29)

Full Changelog

Fixed bugs:

  • [gcp click-to-deploy] hostname field is not editable #1101
  • [gcp click-to-deploy] Click to deploy web app crashes #1072

Closed issues:

  • Wrong comment in setting default CleanPodPolicy #1081
  • Deprecate tfserving http-proxy? #1080
  • user guide disappear? #1078
  • user_guide link is dead #1075
  • TFJob prototype's default TFVersion should be v1alpha2 #1049
  • TF-Serving 1.8 Images #845

Merged pull requests:

  • [gcp-deployer] Fix hostname typo bug #1105 (yebrahim)
  • cherry pick 3 commits #1104 (kunmingg)
  • Make Katib work with Ambassador. #1103 (jlewi)
  • add seldon to ks config; pre install all pkg #1098 (kunmingg)
  • Create a version of echo-server to echo headers. #1097 (jlewi)
  • add chainer-job/chainer-operator ksonnet package #1095 (everpeace)
  • Install gpu driver in deployment manager #1094 (lluunn)
  • Improvements to deploy.sh #1093 (jlewi)
  • Delete the tf-job package. #1091 (jlewi)
  • cherry-pick: update tf job default version (#1086) #1087 (kunmingg)
  • update tf job default version #1086 (kunmingg)
  • Add katib tag to images #1085 (inc0)
  • fix for file_cache is unavailable when using oauth2client >= 4.0.0 #1084 (kkasravi)
  • add a readme file to the mpi-job ksonnet component #1079 (rongou)
  • fix #1075, user_guide link is dead #1077 (theofpa)
  • TFJobs UI doesn't work behind IAP #1073 (kkasravi)
  • point bootstrapper to v0.2.0 in release branch #1069 (kunmingg)
  • Tooling to make it easier to tag images and update the ksonnet prototypes #1066 (jlewi)
  • Create a script to update some of the docker images in the prototypes #1063 (jlewi)
  • [gcp-deployer] Add Gapi manager class, more typings and fixes #1054 (yebrahim)

v0.2.0-rc.1 (2018-06-22)

Full Changelog

Closed issues:

  • Cross-Origin Resource Sharing with TF Jobs Dashboard #1046
  • nvidia-smi fails for TFJob's but not for similarly configured Job's #1042
  • Jupyter can't start pod; the default spawner image is way too old #1041
  • TFJob pods deleted on completion/failure impairing debugging #1039
  • Invalid value: "v1alpha2": field is immutable #1029
  • Update PyTorch Image to officially released one #1020
  • Bootstrapper in 0.2.0-RC 0 doesn't set tfJobsVersion to v1alpha2 by default #1018
  • TensorFlow 1.8 not included in stock Jupyter Images #1014
  • Investigate supporting of TF serving 1.7 gpu #1009
  • Parameter names for Ambassador images improperly named; have extra tf prefix #994
  • Bootstrapper: Friction log from minikube on macOS #981
  • Enable TFJob v1alpha2 by default in 0.2 release #977
  • Central UI sometimes doesn't render if screen too small #957
  • [gcp] cluster-kubeflow.yaml isn't tested #950
  • [central UI] needs a link to JupyterHub #810
  • Verify Central UI is working #805
  • Batch Prediction Beam Library #662

Merged pull requests:

v0.2.0-rc.0 (2018-06-16)

Full Changelog

Fixed bugs:

  • JupyterHub authenticates login but doesn't redirect to user home #430

Closed issues:

  • Bootstrapper should apply components in an order #1006
  • Make Katib work with reverse proxy and ambassador #991
  • ambassador memory leak statsd:0.30.1 #986
  • presubmit failing: couldn't open import "X": no match locally or in the Jsonnet library path #983
  • Verify Katib is working #973
  • Issues with image release workflow app #970
  • ks error: ERROR open /home/jlewi/app.yaml: no such file or directory #966
  • Bootstrapper throwing error when deployed on GKE #961
  • [gcp] bootstrapper fails to create CloudEndpoints resource #954
  • Nightly builds of TensorFlow notebook images do not have the same commit hash in the tag #943
  • Create ksonnet prototype for auto image release #941
  • How to configure a local volume for Jupyterhub_spawner.py #933
  • Unknown variable error when applying kubeflow-core prototypes #932
  • GCP Deployment Manager needs to delete IAM roles when DM is deleted #910
  • Unable to use image-releaser for tf-notebook-workflow #909
  • Regression Jupyter notebooks no longer include python2 runtime #906
  • Deadlocks configuring envoy for IAP #903
  • IAP setup script needs permission to list backend services #902
  • bootstrapper should not crash loop on error #901
  • openmpi-controller:0.0.3 image pull error. #887
  • deployment manager should allow setting IAM roles in YAML config #883
  • Bootstrapper should support packages from other registries #880
  • Deployment manager config should create service accounts and set IAM roles #878
  • Deployment manager config should enable Cloud Endpoints API #876
  • dev.kubeflow.org envoy; iap sidecar stuck waiting for backend #871
  • Presubmit unhealthy? Networking issues #869
  • Failed to run TfCnn example from User Guide: ERROR find objects: parse jsonnet snippet: params.libsonnet:22:5-13 Expected a comma before next field. #863
  • unknown node type issue related to mycnnjob.jsonnet #858
  • Verify PVC for /home/jovyan works #854
  • E2E test for TFJob v1alpha2 ksonnet package #852
  • Include ssh in notebook image to support authenticated git push #850
  • Jupyter image for TensorFlow 1.8 #846
  • Some questions about tf-serving on NFS #844
  • E2E test verifies Kubeflow installed via Deployment Manager #836
  • JupyterHub GitHub OAuth Setup missing manifest/config.yaml files #835
  • GCP deployment manager test handle internal errors #833
  • bootstrapper error trying to create the PyTorch Operator #832
  • Unify and dedup release workflows #830
  • Bootstrapper should optionally use config map to specify ksonnet parameters #829
  • horovod error #826
  • Give release service account access to kubeflow-images-public #824
  • Ambassador failed to start up #811
  • IAP Envoy route should map / to central dashboard #809
  • [Central UI] Remove the box "There is nothing to display here" #808
  • Bootstrapper should deploy new Stackdriver agents #807
  • deployment manager should disable legacy Stackdriver agents #806
  • Invalid envoy config duplicate value IAP #804
  • On GCP trigger bootstrapper with deployment manager #802
  • TF serving GPU test failing #794
  • [openmpi] support to run a job with non-root users #793
  • Minikube E2E test timeout waiting for VM to be ready. #788
  • Spawner options error(401) #784
  • jupyterNotebookPVCMount does not work in v0.1.2 #770
  • Swift Jupyter Integration #763
  • It didn't return [dead loop] when Python 3 is being used. #760
  • Support using bootstrapper image as a kube deployment. #758
  • Support install kubeflow by Deployment Manager #757
  • the tfserving prototype missing label after generation #746
  • E2E test for bootstrapper #742
  • E2E test for Argo Package #740
  • cloud-endpoints fails when using bootstrapper #735
  • Use Pod Preset to add environment variables and volume mounts to pods #732
  • Bootstrapper fails if no ~/.kube/config is present #722
  • openmpi controller exited with error #718
  • [openmpi] Upload trained models to persistent storage #713
  • Create individual prototypes for just TFJob #669
  • Nightly (regular) build of container images #666
  • Bootstrapper should enable IAP on GKE #665
  • Bootstrapper should check if user has appropriate permissions and if not create cluster role #664
  • Bootstrapper should create namespace if it doesn't exist #663
  • Tests are failing because we are running out of PD quota in us-east1 #618
  • ksonnet packages for Pachyderm #611
  • Friction log for TFX Chicago taxi cab example on minikube #594
  • TFJob prototype should contain the full TFJob spec so that ks generate is mostly just copying the prototype #564
  • Can we put all of /home on PV for Jupyter notebooks #561
  • Create gcr.io/kubeflow-images-public #534
  • Central UI Ambassador Integration #528
  • Have the JupyterHub spawner report issues with spawning the user's server #505
  • Recommended minikube setup #502
  • TF Serving component logging #495
  • JupyterHub Spawner complains notebook is already spawning; upgrade JupyterHub to 0.9 #479
  • Build TFServing images for different TF versions #468
  • [Discussion] ISTIO and TFServing #464
  • E2E test for tf-job-simple prototype #462
  • Unable to mount volumes for pod "jupyter-... #424
  • multiple Matplotlib libraries #423
  • ksonnet style guide #403
  • Reformat only modified files #395
  • Establish a pattern for creating/using secrets used by multiple kubeflow prototypes #372
  • ks delete default fails #364
  • Python script to set parameters for ksonnet prototypes #322
  • Flakes building TF Serving image; problems downloading Ubuntu packages #310
  • ksonnet prototypes for TensorBoard #297
  • Enforce formatting for all jsonnet files #282
  • Assess usability on vanilla dockers #267
  • Discussion: Eventing model/solution #263
  • Recommended Kubeflow Setup on GKE #241
  • Recommended setup for different K8s Solutions #240
  • Use Ambassador/Envoy as proxy for JupyterHub #239
  • ks apply cloud -c newjob silently fails #217
  • Tracking Central UI #199
  • Kubeflow logo? #187
  • [discussion] Support PyTorch distributed training? #179
  • Add kubeflow into tensorflow/ecosystem #177
  • Central UI #141
  • Investigate file server connection errors with Jupyter and IAP #140
  • TfJob controllers are not namespace scoped so tests aren't isolated #134
  • Add resource request and limit fields to tf-job ksonnet prototype #116
  • Figure out ksonnet repo organization #106
  • Add Argo package to our ksonnet registry #21

Merged pull requests:

v0.1.3 (2018-04-26)

Full Changelog

Closed issues:

  • all training run on the first worker #721
  • Jupyter images pruned too aggressively? #719
  • cert-manager in "setting-up-iap-on-gke" not working #709
  • New repo for benchmarking #708
  • minikube VM unavailable for e2e kubeflow-presubmit check #707
  • Kubeflow Jupyter notebook images are large because of all the extra Python libs; can we shrink to improve pull times? #568

Merged pull requests:

  • CP to v0.1-branch: restore some important NB pkgs; http_timeout; bool fixes for tf-serving #726 (pdmack)
  • update ksonnet version to v0.10.0-alpha.3 #725 (kunmingg)
  • Add a few packages to the jupyter notebook image #724 (ankushagarwal)
  • add kunming to reviewer #715 (kunmingg)
  • controller bug fix #714 (kunmingg)
  • [openmpi] Add OWNERS file #711 (jiezhang)
  • Adding ksonnet components for tensorboard files. Issue #297 #710 (abkosar)
  • Point instructions to 0.1.2 release #706 (ankushagarwal)
  • CP to v0.1-branch: Update http_timeout to 5 minutes in jupyterhub (#691) #705 (pdmack)
  • [openmpi] Introduce a sidecar container for inter-pod synchronization #704 (jiezhang)
  • openmpi: namespaced resource names should be prefixed with component name #698 (everpeace)

v0.1.2 (2018-04-21)

Full Changelog

Fixed bugs:

  • Segfault in bootstrapper while processing .kube #657

Merged pull requests:

  • CP to v0.1-branch: TF notebook slimming, jovyan pip installs, and new gcr.io locations #703 (pdmack)

v0.1.1 (2018-04-20)

Full Changelog

Fixed bugs:

  • tf-cnn: no matches for kind "TFJob" in version... #643

Closed issues:

  • ks 0.10.0-alpha.2 does not work with install instructions #686
  • Bootstrap image not found #682
  • Unable to get kubeflow working on hyperkube in a circleci vm #674
  • Cannot install pip packages from jupyter notebook #668
  • Expose istio dashboards #644
  • Timeout waiting for simple-tfjob-gke #636
  • Refactor tf-serving image build for multiple TF versions #632
  • Some individual tests not showing up in test-grid #631
  • User guide sprawl #629
  • inception server not working #621
  • Add Link to Intel's tutorial #612
  • Add auto-generated TOC for all docs #603
  • Error occurs when following user guide #598
  • [GKE] Use Cloud Endpoints to provision domain #586
  • kubeflow version #578
  • Presubmit failure: No such file or directory XXX/.kube/config #562
  • Make IAP config robust to updating the Ingress #550
  • TF MPI support #535
  • TF serving param deployHttpProxy needs to be transformed from string to bool #531
  • Cut a 0.1 release #506
  • TF serving component monitoring #496
  • TF serving GPU e2e test is flaky #484
  • Build more efficient tensorflow-notebook images #472
  • [Jupyter/Azure] Drivers are not mounted when spawning a JupyterHub with GPU #435
  • Make it easy to get started with Kubeflow #105
  • Size of gcr.io/kubeflow/tensorflow-notebook-* #37

Merged pull requests:

  • permission bug fix and image tag update #699 (kunmingg)
  • Update the hub spawner dropdown for latest NB images #697 (pdmack)
  • openmpi: master/workers sync mechanism is replaced with k8s api from redis #696 (everpeace)
  • Migrate images to kubeflow-images-public #695 (ankushagarwal)
  • Pin instructions to ks 0.9.2 (for now) #694 (pdmack)
  • add export to ksonnet github token instructions #693 (mattf)
  • openmpi: slots clause should be generated when gpus '> 0' #692 (everpeace)
  • Update http_timeout to 5 minutes in jupyterhub #691 (ankushagarwal)
  • openmpi: fix failing installing redis-tools in init.sh #690 (everpeace)
  • Refactor tensorflow-notebook-image/Dockerfile #689 (ankushagarwal)
  • [openmpi] Add GPU support #685 (jiezhang)
  • openmpi: make 'schedulerName' configurable to use custom schedulers. #683 (everpeace)
  • create rolebinding within namespace to guarantee permission #680 (kunmingg)
  • Support rollout new model with istio #679 (lluunn)
  • Add documentation for exposing grafana dashboard #678 (lluunn)
  • openmpi package doesn't work on kubernetes cluster having custom dns. #676 (everpeace)
  • Add batch support to openmpi package #671 (jiezhang)
  • make Dockerfile & Makefile cross platform #670 (kunmingg)
  • Update ksonnet_packages.md #661 (pdmack)
  • Include tensor2tensor in jupyter notebook image #659 (ankushagarwal)
  • Add willingc to reviewers #656 (willingc)
  • [Azure] Jupyter spawner: driver volumes for azure #655 (wbuchwalter)
  • Bootstrapper polish: #654 (kunmingg)
  • add link to Intel's tutorial #652 (raddaoui)
  • [Azure] Update nvidia driver volumes for AKS #650 (wbuchwalter)
  • Remove zjj2wry as a reviewer. #648 (jlewi)
  • Remove the outdated YAML specs for TFCnn job. #647 (jlewi)
  • update user_guide #646 (lluunn)
  • Add components to work with GCP #645 (jlewi)
  • Change naming from jupyter to jupyterhub when referring to hub #642 (willingc)
  • Troubleshooting note on cluster-admin privileges for tf-job-operator #641 (tmckayus)
  • Use dashes not underscores in junit file names. #638 (jlewi)
  • Update various images in kubeflow to kubeflow-images-public #635 (ankushagarwal)
  • Create a kubeflow version file: version.txt and a configmap kubeflow-version #634 (ankushagarwal)
  • Cleaned up README.md #628 (ddutta)
  • TF Serving + Istio #627 (lluunn)
  • Improve user guide wrt exposing notebook in non-cloud setup #625 (xyhuang)
  • Add openmpi package #624 (jiezhang)
  • Fix setup/teardown of VM for minikube. #620 (jlewi)
  • Update and clarify JupyterHub README #616 (willingc)
  • images: Add the link #614 (gaocegege)
  • Improve documentation #613 (jlewi)
  • Initial ksonnet package for Pachyderm. #610 (jlewi)
  • Provide documentation about adding ksonnet packages to Kubeflow. #609 (jlewi)
  • Upgrade tf-serving version to 1.6.0 #608 (pdmack)
  • import of cloud-endpoints component and support in iap-ingress component #605 (danisla)
  • docs(TOC): add auto-generated TOC for all docs #604 (DjangoPeng)
  • docs(troubleshooting): add a new item into troubleshooting #602 (DjangoPeng)
  • Update the docs to point to the v0.1.0 release now that it is cut. #601 (jlewi)
  • Create a doc with information about our docker images. #600 (jlewi)
  • Typo in UG #599 (pdmack)
  • Create a POC of an app to simplify deployment of Kubeflow #595 (jlewi)
  • CP: Install graphviz in tensorflow notebook image (#583) #585 (pdmack)
  • Enable TF serving gpu test #558 (lluunn)
  • Branching and tagging policy for releases #519 (willb)

v0.1.0-rc.4 (2018-04-04)

Full Changelog

v0.1.0 (2018-04-04)

Full Changelog

Closed issues:

  • Explicitly specified version of tensorflow being replaced in the python 2 environment with the latest version from PyPI #571
  • Jupyter pod stuck at ContainerCreating when spawning #336

Merged pull requests:

  • Cherry pick : Update tensorflow notebook version to v20180403-1f854c44 (#589) #590 (ankushagarwal)
  • Update tensorflow notebook version to v20180403-1f854c44 #589 (ankushagarwal)
  • README: Add link for tf-operator #588 (gaocegege)
  • Clean up minor formatting errors in README #587 (pdmack)
  • Correct typo for param defaultHttpProxyImage (#556) #584 (jlewi)
  • Install graphviz in tensorflow notebook image #583 (ankushagarwal)
  • Disable PVC by default (#577) #582 (inc0)
  • Fix the tensorflow version in prebuilt images for python 2.7. #580 (ojarjur)
  • Disable PVC by default #577 (inc0)
  • Add pdmack to OWNERS #565 (pdmack)
  • Correct typo for param defaultHttpProxyImage #556 (pdmack)
  • sidecar for envoy pod to keep IAP up to date #552 (danisla)
  • Update instructions about releasing TFJob #532 (jlewi)

v0.1.0-rc.3 (2018-04-03)

Full Changelog

Closed issues:

  • IAP should not store client secret in ksonnet #569
  • other tf-operators? #539

Merged pull requests:

  • Cherrypick #570 and #572 into v0.1-branch #575 (ankushagarwal)
  • Moved OAuth secret from param to named secret #572 (danisla)
  • Update tf jupyter notebook images to include tfma (from #544) #570 (ankushagarwal)
  • Retry up to 3 times while building tensorflow notebook images #559 (ankushagarwal)
  • Add the "tensorflow-model-analysis" package to the notebook images #544 (ojarjur)
  • add willb to approvers #542 (willb)
  • Docs should show how to pull a particular release #524 (jlewi)
  • Changes to support running E2E tests on minikube. #523 (jlewi)

v0.1.0-rc.2 (2018-04-02)

Full Changelog

Closed issues:

  • Tensorflow no longer working in the GPU image (if you build from HEAD) #549
  • TFJob UI not included in ksonnet configs #546
  • dev.kubeflow.org 502s #545

Merged pull requests:

v0.1.0-rc.1 (2018-03-30)

Full Changelog

Closed issues:

  • Upgrade ksonnet in our Jupyter images to 0.9.2 #490
  • JavaScript widgets don't work in JupyterLab #489
  • Option to disable use of PVC for Jupyter #365
  • Katacoda Demo Scenario Python incompatibility - Warning #70

Merged pull requests:

v0.1.0-rc.0 (2018-03-27)

Fixed bugs:

  • tf-cnn fails to create #458
  • Presubmit failure; cannot import k.libsonnet #447
  • Presubmit shows succeeded, but some test actually failed. #436

Closed issues:

  • Tf serving component should provide default HTTP image #511
  • Continuous testing for release branches #507
  • construct base object: Failed to filter components; the following components don't exist: [ 'kubeflow-core' ] #481
  • Document GITHUB_TOKEN #478
  • ks env set default --namespace=kubeflow doesn't change the namespace #477
  • Build Jupyter Notebook images for supported versions of TF #467
  • [ERROR] Can not ks apply default -c kubeflow-core with ks 0.9.5 #453
  • JupyterHub terminates with 500 Internal Server error #433
  • Proposal: We should have basic tests for every ksonnet prototype #432
  • Remove tensorboard link from Jupyter notebooks (hub?) #428
  • jovyan user cannot sudo in terminal #425
  • Cannot run tensorboard from Jupyter notebooks #422
  • IPyWidgets not displaying when using a Python 2 kernel #419
  • normalize libsonnets so their "all(params)" is only called from kubeflow/core/all.libsonnet #417
  • libcublas.so.9.0: cannot open shared object file: No such file or directory #414
  • Executing child process 'start-notebook.sh' failed: 'Permission denied' #412
  • Use local NFS server as PersistentVolume #410
  • Create a ksonnet component to deploy seldon-core models #405
  • Replace two envoy containers with one in the envoy pod #404
  • [ question ] when I follow the setup tutorial, I got cannot parse dnsName problem #397
  • Build Envoy Container with JWT validation #394
  • [ support ] github.com set request rate limit #391
  • cnn tfjob status never change to completed #389
  • TFServing prototype for using GCS with service account key #385
  • Jupyter notebook image should pin TF version #375
  • ClusterRole's and ClusterRoleBindings should include namespace in the name #374
  • Ambassador can't watch services at the cluster level. #373
  • Missing TFServing monitoring features? #369
  • Set userid and group id in TF Serving GPU container #367
  • Reviewable is blocking automatic merging #356
  • Add tensorflow-serving-api to jupyter image #355
  • Build http proxy for TF serving as part of our release workflow #353
  • Better docs for http proxy #352
  • Failed to pull image "gcr.io/kubeflow/tf-benchmarks-gpu:.... #348
  • autoformat_jsonnet.sh unknown predicate -E #345
  • ambassadors are crashed and cannot be created #344
  • How to login the Jupyterhub on the remote server? #343
  • [ksonnet] RUNTIME ERROR: Field does not exist: core #340
  • Build TFServing GPU Docker Image #338
  • Automatic PR merge #331
  • Add additional links and information about JupyterHub to README #329
  • Jupyter pod can't start; jinja exception Encountered unknown tag 'trans'. #325
  • Avoid code duplication in Dockerfiles for Jupyter notebook images #321
  • Kubeflow not deployed #318
  • Investigate notebook container sidecar to enable in-notebook container builds #312
  • JupyterHub Spawner should have a UI element that specifies default docker images #309
  • Need an OWNERS file #308
  • [Discussion] Refactor the Registry? #306
  • option naming inconsistencies #303
  • Test the TFServing configs with default image #301
  • Configure JupyterHub spawner to allow root operations by default #300
  • TFServing docs are incorrect; we no longer assign a public IP by default #296
  • Add an option to set service type to TFServing deployment #295
  • Example/Docs for using IAP with TFServing #293
  • TFServing deployment should support GPUs #292
  • E2E Test for TFServing with GPUs #291
  • TFServing component should add an ambassador route #290
  • How to add persistent volume in jupyter_spawner.py file ? #285
  • Prow jobs should use a common docker image #276
  • TFJob test is failing #273
  • Presubmit failing: Time out waiting for Workflows #272
  • Better identity management in K8s #266
  • JupyterLab not working in latest notebook images #262
  • JupyterHub spawner: upstream request timeout #261
  • [bug] TFJob dashboard UI not showing up through Ambassador reverse proxy #260
  • IAP component should use cert-manager to get a signed certificate #255
  • Run util_test.jsonnet added in #246 as part of E2E tests #254
  • tf.transform libraries in our notebooks #244
  • user guide - tf-serving: ClusterIP instead of LoadBalancer service #233
  • Typo in argo-ui rolebinding #230
  • JupyterHub fails to load image properly, but starts a notebook anyway #226
  • Move troubleshooting guide into user_guide #223
  • Build and publish Docker images using Argo #221
  • E2E tests need to verify that we can submit a TFJob #207
  • Use kubeflow/testing to run Argo workflows #205
  • KubeFlow on AWS - tf-hub-lb not created and cannot change jupyterHubServiceType to loadbalancer #203
  • ERROR Server is unable to handle tensorflow.org/v1alpha1, Kind=TFJob #200
  • Increase rate limit for installing kubeflow packages #195
  • Add GPU Support for k8s-model-server on Kubeflow #194
  • TFJob controller cannot terminate job #193
  • Remove tensorflow/k8s as a submodule #190
  • Delete components/tf-controller #189
  • Error from server (Forbidden): error when creating "deploy_crd.yaml": clusterroles.rbac.authorization.k8s.io "tf-job-operator" is forbidden #188
  • tf-job.libsonnet is issuing the wrong CRD job type #186
  • Test failures #176
  • Create a repository for examples: kubeflow/examples #174
  • Create a testing repository #173
  • Change google/kubeflow links to kubeflow/kubeflow #168
  • Setup prow for the new org #165
  • Add TFJob CRD creation to ksonnet components #164
  • ksonnet component for Seldon.io deployment #159
  • E2E Solution for GitHub Issue Summarization On Kubeflow #157
  • CreateSession still waiting for response from worker: /job:ps/replica:0/task:0 #153
  • Test cluster is unhealthy no ready nodes #150
  • ClusterRoleBinding.rbac.authorization.k8s.io "tf-job-operator" is invalid #148
  • worker restart result in the TFJOB can not finish expectedly #147
  • PVC created but not added to volume list #145
  • Code of conduct #143
  • Add ksonnet to our Jupyter notebook Docker images #138
  • What's up with Tensorflow? #135
  • Postsubmits appear to be broken #133
  • Make ArgoUI for our tests publicly accessible #131
  • tf-cnn template doesn't set TerminationPolicy correctly #129
  • Copy Jupyter notebook Dockerfiles into Kubeflow repo #126
  • tf-job-operator service account is missing roles #125
  • JupyterHub should not open up external IP by default #123
  • where is the Dockerfile of gcr.io/kubeflow/tensorflow-notebook-cpu? #122
  • Default service type for ModelServer should not be loadbalancer #121
  • Add tensorBoard field to tf-job prototype #113
  • po/tf-job-operator logs said user cannot get endpoints in the namespace "default" #109
  • Typo in tfjob.jsonnet #107
  • Problems connecting to Jupyter Kernel with IAP #104
  • Create only one service for JupyterHub and make type a parameter #99
  • Delete inception.tar.gz #98
  • Postsubmits need to use registry components at commit being tested #96
  • Presubmits need to pull registry from the PR branch #95
  • link to ksonnet might be confusing #91
  • release .yaml manifest #90
  • Incorporate tf.transform #88
  • ProwTestCase library #83
  • Set suitable defaults for JupyterHub service type and make it configurable? #80
  • if i use kubeflow on local cluster with gpu #79
  • Failed to connect to my Hub at http://tf-hub-0:8081/hub/api (attempt 1/5). Is it running? #78
  • Support for other Deep Learning Libraries #74
  • Setup PR Dashboard #73
  • TfServing uber tracking bug #64
  • tf-job missing from ksonnet registry yaml #62
  • Add missing features to TfJob controller ksonnet component #61
  • Support IAP on GKE #60
  • Syntax error of juypterhub.yaml for Kubernetes 1.6 #58
  • Proposal: Include very basic tracking of usage by default #55
  • Should Kubeflow publish and maintain TF Serving Docker Images? #50
  • tf-cnn prototype doesn't add GPU resource requests #48
  • Remove namespace as a package parameter #43
  • Clean up repo after switching to ksonnet #41
  • Doc gen for kubeflow ksonnet registry #39
  • E2E Testing For Kubeflow. #38
  • Proposal: Discuss Kubeflow organization and community #35
  • Proposal: our expectation on KubeFlow #33
  • Add LICENSE file: Apache License, Version 2.0 #27
  • Fault tolerant storage for JupyterHub #19
  • Tensorboard support #17
  • python model server #15

Merged pull requests:

* This Change Log was automatically generated by github_changelog_generator