Model Metadata

Vincent Roseberry edited this page Jul 23, 2024 · 13 revisions

A full model is composed of three types of entities:

  1. The model
  2. The instances (also called variations)
  3. The instance versions (also called versions of a variation)

Let's take the example of efficientnet to explain these entities.

A model like efficientnet contains multiple instances.

An instance is a specific variation of the model (e.g. B0, B1, ...) with a certain framework (e.g. TensorFlow2).
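The hierarchy described above can be pictured as nested records. The following is only a minimal sketch for illustration: the class and attribute names are hypothetical and are not part of the Kaggle API.

```python
from dataclasses import dataclass, field


@dataclass
class InstanceVersion:
    """One version of a variation (hypothetical illustrative class)."""
    version_number: int


@dataclass
class Instance:
    """A specific variation of a model for a given framework."""
    variation_slug: str
    framework: str
    versions: list[InstanceVersion] = field(default_factory=list)


@dataclass
class Model:
    """The top-level model, which groups its instances."""
    slug: str
    instances: list[Instance] = field(default_factory=list)


# efficientnet with two variations (B0 and B1), both for TensorFlow2.
efficientnet = Model(
    slug="efficientnet",
    instances=[
        Instance("b0", "tensorFlow2", [InstanceVersion(1)]),
        Instance("b1", "tensorFlow2", [InstanceVersion(1), InstanceVersion(2)]),
    ],
)

for instance in efficientnet.instances:
    print(instance.variation_slug, instance.framework, len(instance.versions))
```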

You can find more documentation on how to name your model and variations here.

Model

To create a model, a special model-metadata.json file must be specified.

Here's a basic example for model-metadata.json:

{
  "ownerSlug": "INSERT_OWNER_SLUG_HERE",
  "title": "INSERT_TITLE_HERE",
  "slug": "INSERT_SLUG_HERE",
  "subtitle": "",
  "isPrivate": true,
  "description": "Model Card Markdown, see below",
  "publishTime": "",
  "provenanceSources": ""
}

You can also use the API command kaggle models init -p /path/to/model to have the API create this file for you for a new model. If you wish to get the metadata for an existing model, you can use kaggle models get username/model-slug.
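If you drive the CLI from a script, the two commands above can be wrapped as follows. This is a sketch only: the helper names are hypothetical, it assumes the kaggle executable is installed and configured, and it uses no flags beyond those shown above.

```python
import shutil
import subprocess


def kaggle_models_init(folder):
    """Build the command that creates model-metadata.json in `folder`."""
    return ["kaggle", "models", "init", "-p", folder]


def kaggle_models_get(model):
    """Build the command that fetches metadata for `owner/model-slug`."""
    return ["kaggle", "models", "get", model]


def run_if_available(command):
    """Run the command only if its executable is found on PATH.

    Returns None when the executable is missing, otherwise the
    CompletedProcess from subprocess.run.
    """
    if shutil.which(command[0]) is None:
        return None
    return subprocess.run(command, check=False, capture_output=True, text=True)


# Only print the commands here; call run_if_available(...) to execute them.
print(" ".join(kaggle_models_init("/path/to/model")))
print(" ".join(kaggle_models_get("username/model-slug")))
```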

Contents

We currently support the following metadata fields for models.

  • ownerSlug: the slug of the user or organization
  • title: the model's title
  • slug: the model's slug (unique per owner)
  • subtitle: the model's subtitle
  • isPrivate: whether the model should be private (visible only to its owners). Defaults to true if not specified
  • description: the model's card in markdown syntax (see the template below)
  • publishTime: the original publishing time of the model
  • provenanceSources: the provenance of the model
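As an alternative to writing the file by hand, the fields above can be assembled in a short script. The sketch below is an illustration, not part of the Kaggle API: the function names are hypothetical, and treating ownerSlug, title, and slug as the only required values is an assumption. The isPrivate default follows the field list above.

```python
import json
import tempfile
from pathlib import Path

# Field names taken from the list above.
MODEL_FIELDS = (
    "ownerSlug", "title", "slug", "subtitle",
    "isPrivate", "description", "publishTime", "provenanceSources",
)


def build_model_metadata(owner_slug, title, slug, **optional):
    """Return a metadata dict, rejecting field names not listed above."""
    unknown = set(optional) - set(MODEL_FIELDS)
    if unknown:
        raise ValueError(f"unsupported fields: {sorted(unknown)}")
    metadata = {
        "ownerSlug": owner_slug,
        "title": title,
        "slug": slug,
        "subtitle": "",
        "isPrivate": True,  # documented default when not specified
        "description": "",
        "publishTime": "",
        "provenanceSources": "",
    }
    metadata.update(optional)
    return metadata


def write_model_metadata(folder, metadata):
    """Write model-metadata.json into `folder` and return its path."""
    path = Path(folder) / "model-metadata.json"
    path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return path


# Example: write the file into a temporary folder and show its contents.
with tempfile.TemporaryDirectory() as folder:
    example = build_model_metadata("INSERT_OWNER_SLUG_HERE", "EfficientNet", "efficientnet")
    print(write_model_metadata(folder, example).read_text(encoding="utf-8"))
```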

Description

You can find a template of the model card on this wiki page: https://github.com/Kaggle/kaggle-api/wiki/Model-Card

Model Instance (a.k.a. Model Variation)

To create a model instance, a special model-instance-metadata.json file must be specified.

Here's a basic example for model-instance-metadata.json:

{
  "ownerSlug": "INSERT_OWNER_SLUG_HERE",
  "modelSlug": "INSERT_EXISTING_MODEL_SLUG_HERE",
  "instanceSlug": "INSERT_INSTANCE_SLUG_HERE",
  "framework": "INSERT_FRAMEWORK_HERE",
  "overview": "",
  "usage": "Usage Markdown, see below",
  "licenseName": "Apache 2.0",
  "fineTunable": false,
  "trainingData": []
}

You can also use the API command kaggle models instances init -p /path/to/model-instance to have the API create this file for you for a new model instance.

Contents

We currently support the following metadata fields for model instances.

  • ownerSlug: the slug of the user or organization of the model
  • modelSlug: the existing model's slug
  • instanceSlug: the slug of the instance
  • framework: the instance's framework (see the list below)
  • overview: a short overview of the instance
  • usage: the instance's usage in markdown syntax (see more info below)
  • licenseName: the name of the license (see the list below)
  • fineTunable: whether the instance is fine-tunable
  • trainingData: a list of training data sources given as strings (URLs, Kaggle Datasets, etc.)

Frameworks

  • tensorFlow1
  • tensorFlow2
  • tfLite
  • tfJs
  • pyTorch
  • jax
  • flax
  • pax
  • maxText
  • gemmaCpp
  • ggml
  • gguf
  • coral
  • scikitLearn
  • mxnet
  • onnx
  • keras
  • transformers
  • api
  • other
  • tensorRtLlm
  • triton
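Before uploading, it can help to check a model-instance-metadata.json file locally. The following is a minimal sketch, not the validation Kaggle itself performs: the function name is hypothetical, and it assumes that every listed field must be present and that framework names are matched case-sensitively against the list above.

```python
import json

# Framework names copied from the list above.
FRAMEWORKS = frozenset({
    "tensorFlow1", "tensorFlow2", "tfLite", "tfJs", "pyTorch", "jax",
    "flax", "pax", "maxText", "gemmaCpp", "ggml", "gguf", "coral",
    "scikitLearn", "mxnet", "onnx", "keras", "transformers", "api",
    "other", "tensorRtLlm", "triton",
})

# Field names copied from the instance metadata field list.
INSTANCE_FIELDS = (
    "ownerSlug", "modelSlug", "instanceSlug", "framework", "overview",
    "usage", "licenseName", "fineTunable", "trainingData",
)


def check_instance_metadata(text):
    """Return a list of problems found in the JSON text (empty if none)."""
    try:
        metadata = json.loads(text)
    except json.JSONDecodeError as error:
        # Python-style True/False is not valid JSON, for example.
        return [f"invalid JSON: {error.msg}"]
    if not isinstance(metadata, dict):
        return ["top-level value must be a JSON object"]

    problems = [f"missing field: {name}" for name in INSTANCE_FIELDS if name not in metadata]
    framework = metadata.get("framework")
    if framework is not None and framework not in FRAMEWORKS:
        problems.append(f"unknown framework: {framework!r}")
    if not isinstance(metadata.get("fineTunable", False), bool):
        problems.append("fineTunable must be true or false")
    if not isinstance(metadata.get("trainingData", []), list):
        problems.append("trainingData must be a list")
    return problems


example = json.dumps({
    "ownerSlug": "INSERT_OWNER_SLUG_HERE",
    "modelSlug": "efficientnet",
    "instanceSlug": "b0",
    "framework": "tensorFlow2",
    "overview": "",
    "usage": "",
    "licenseName": "Apache 2.0",
    "fineTunable": False,
    "trainingData": [],
})
print(check_instance_metadata(example))
```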

Licenses

Here is a list of the available licenses for models (you can either use the "Name" or the "Abbreviation"):

Name | Abbreviation | URL
Attribution 4.0 International (CC BY 4.0) | CC BY 4.0 | link
Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) | CC BY-NC-ND 4.0 | link
Attribution 3.0 IGO (CC BY 3.0 IGO) | CC BY 3.0 IGO | link
CC BY-NC-SA 4.0 | CC BY-NC-SA 4.0 | link
Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) | CC BY-NC 4.0 | link
Attribution-NoDerivatives 4.0 International (CC BY-ND 4.0) | CC BY-ND 4.0 | link
CC0: Public Domain | CC0 1.0 | link
CC BY-SA 3.0 | CC BY-SA 3.0 | link
Attribution 3.0 Unported (CC BY 3.0) | CC BY 3.0 | link
Attribution-NonCommercial-ShareAlike 3.0 IGO (CC BY-NC-SA 3.0 IGO) | CC BY-NC-SA 3.0 IGO | link
CC BY-SA 4.0 | CC BY-SA 4.0 | link
GPL 3 | gpl-3 | link
GNU Free Documentation License 1.3 | FDL 1.3 | link
GNU Affero General Public License 3.0 | AGPL 3.0 | link
GNU Lesser General Public License 3.0 | LGPL 3.0 | link
GPL 2 | GPL 2 | link
ODC Attribution License (ODC-By) | ODC-BY 1.0 | link
ODC Public Domain Dedication and Licence (PDDL) | PDDL | link
BSD-3-Clause | bsd-3-clause | link
Community Data License Agreement - Sharing - Version 1.0 | CDLA Sharing 1.0 | link
Community Data License Agreement - Permissive - Version 1.0 | CDLA Permissive 1.0 | link
Apache 2.0 | apache-2.0 | link
MIT | mit | link
AI Pubs Open RAIL-M License | RAIL-M | link
AI Pubs Research-Use RAIL-M License | AIPubs Research-Use RAIL-M | link
BigScience Open RAIL-M License | BigScience OpenRAIL-M | link
RAIL (specified in description) | RAIL | link
Gemma | Gemma | link
Llama 2 Community License | Llama 2 | link
Llama 3 Community License | Llama 3 | link
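Because licenseName accepts either the name or the abbreviation, a script that prepares metadata may want to normalize the value first. The sketch below covers only a handful of entries from the list above and matches them exactly. Whether the API tolerates differences in case or spacing is not stated here, so exact matching is an assumption, and the function name is hypothetical.

```python
# A subset of the licenses listed above, as (name, abbreviation) pairs.
LICENSES = [
    ("Apache 2.0", "apache-2.0"),
    ("MIT", "mit"),
    ("GPL 3", "gpl-3"),
    ("BSD-3-Clause", "bsd-3-clause"),
    ("CC0: Public Domain", "CC0 1.0"),
]

# Map every accepted spelling (name or abbreviation) to the full name.
_LOOKUP = {}
for name, abbreviation in LICENSES:
    _LOOKUP[name] = name
    _LOOKUP[abbreviation] = name


def resolve_license(value):
    """Return the full license name for a name or abbreviation."""
    try:
        return _LOOKUP[value]
    except KeyError:
        raise ValueError(f"unknown license: {value!r}") from None


print(resolve_license("apache-2.0"))
print(resolve_license("CC0: Public Domain"))
```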

Usage

Code snippets, written in Markdown, that show how to use the model.

The following template variables can be used in this markdown:

  • ${VERSION_NUMBER} is replaced by the version number when rendered
  • ${VARIATION_SLUG} is replaced by the variation slug when rendered
  • ${FRAMEWORK} is replaced by the framework name
  • ${PATH} is replaced by /kaggle/input/<model_slug>/<framework>/<variation_slug>/<version>
  • ${FILEPATH} is replaced by /kaggle/input/<model_slug>/<framework>/<variation_slug>/<version>/<filename>. This value is only defined if the databundle contains a single file
  • ${URL} is replaced by the absolute URL of the model
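To preview how a usage snippet might look once these variables are filled in, the substitution can be approximated locally with Python's string.Template, which uses the same ${NAME} syntax. This is only a sketch of the behavior described above, not Kaggle's rendering code: the function name is hypothetical, and reusing the framework string as the <framework> path segment is an assumption.

```python
from string import Template


def render_usage(usage_markdown, model_slug, framework, variation_slug,
                 version_number, filename=None, url=""):
    """Fill the documented template variables into a usage snippet."""
    path = f"/kaggle/input/{model_slug}/{framework}/{variation_slug}/{version_number}"
    values = {
        "VERSION_NUMBER": str(version_number),
        "VARIATION_SLUG": variation_slug,
        "FRAMEWORK": framework,
        "PATH": path,
        "URL": url,
    }
    # FILEPATH is only defined when the databundle contains a single file.
    if filename is not None:
        values["FILEPATH"] = f"{path}/{filename}"
    # safe_substitute leaves undefined placeholders untouched.
    return Template(usage_markdown).safe_substitute(values)


usage = "Load version ${VERSION_NUMBER} of ${VARIATION_SLUG} from ${PATH}"
print(render_usage(usage, "efficientnet", "tensorFlow2", "b0", 1))
```

Using safe_substitute rather than substitute means a snippet that mentions ${FILEPATH} still renders when no single file is available; the placeholder is simply left as written.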