Release v3.0.0 · nod-ai/shark-ai

This release marks public availability for the SHARK AI project, with a focus on serving the Stable Diffusion XL model on AMD Instinct™ MI300X Accelerators.

Highlights

shark-ai

The shark-ai package is the recommended entry point for using the project. This meta-package includes compatible versions of all relevant sub-projects.

shortfin

The shortfin sub-project is SHARK's high-performance inference library and serving engine.

Key features:

Fast inference using ahead-of-time model compilation powered by IREE
Throughput optimization via request batching and support for flexible device topologies
Asynchronous execution and efficient threading
Example applications for supported models
APIs available in Python and C
Detailed profiling support

For this release, shortfin uses precompiled programs built by the SHARK team using the sharktank sub-project. Future releases will streamline the model conversion process, add user guides, and enable adventurous users to bring their own custom models.

Current shortfin system requirements:

Python 3.11+
An AMD Instinct™ MI300X Accelerator
A compatible version of Linux and ROCm (see the ROCm compatibility matrix)
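
A quick preflight check can catch an unsupported interpreter or operating system before installation. The sketch below is illustrative only: `check_requirements` is a hypothetical helper, not part of shark-ai, and it covers just the Python version and Linux items from the list above. It does not detect the accelerator or verify the ROCm version.

```python
import platform
import sys


def check_requirements(version_info=None, system=None):
    """Return a list of problems; an empty list means the basic checks pass.

    Hypothetical helper (not part of shark-ai). Only the Python version and
    the operating system are checked; the GPU and ROCm are not inspected.
    """
    version_info = sys.version_info if version_info is None else version_info
    system = platform.system() if system is None else system

    problems = []
    # shortfin currently requires Python 3.11 or newer.
    if tuple(version_info[:2]) < (3, 11):
        problems.append(
            f"Python 3.11+ is required, found {version_info[0]}.{version_info[1]}"
        )
    # shortfin currently requires Linux.
    if system != "Linux":
        problems.append(f"Linux is required, found {system}")
    return problems
```

Calling `check_requirements()` with no arguments inspects the running interpreter and host. Passing explicit values, such as `check_requirements((3, 10, 0), "Linux")`, makes the check easy to exercise without changing environments.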

Serving Stable Diffusion XL (SDXL) on MI300X

See the user guide for the latest instructions.

To serve the Stable Diffusion XL model, which generates output images given input text prompts:

# Set up a Python virtual environment.
python -m venv .venv
source .venv/bin/activate
# Optional: faster installation of torch with just CPU support.
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
# Install shark-ai, including extra dependencies for apps.
pip install "shark-ai[apps]"

# Start the server then wait for it to download artifacts.
python -m shortfin_apps.sd.server \
  --device=amdgpu --device_ids=0 --topology="spx_single" \
  --build_preference=precompiled
# (wait for setup to complete)
# INFO - Application startup complete.
# INFO - Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)

# Run the interactive client, sending text prompts and receiving generated images back.
python -m shortfin_apps.sd.simple_client --interactive
# Enter a prompt: a single cybernetic shark jumping out of the waves set against a technicolor sunset
# Sending request with prompt: ['a single cybernetic shark jumping out of the waves set against a technicolor sunset']
# Sending request batch # 0
# Saving response as image...
# Saved to gen_imgs/shortfin_sd_output_2024-11-15_16-30-30_0.png
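
When scripting these steps instead of watching the server log by hand, one option is to poll the server's port before starting the client. The following is a minimal sketch, not part of shortfin: `wait_for_port` is a hypothetical helper, the port 8000 default is taken from the Uvicorn log line above, and it assumes the port starts accepting connections only once setup has finished, as the log order suggests.

```python
import socket
import time


def wait_for_port(host="127.0.0.1", port=8000, timeout=300.0, interval=1.0):
    """Poll until a TCP connection to host:port succeeds.

    Hypothetical helper (not part of shortfin). Returns True once a
    connection is accepted, or False if `timeout` seconds pass first.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A wrapper script could call `wait_for_port()` after launching the server and start the client only when it returns True. The generous default timeout allows for the artifact download on first launch; the right value depends on network speed.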

Roadmap

This release is just the start of a longer journey. The SHARK platform is fully open source, so stay tuned for future developments. Here is a taste of what we have planned:

Support for a wider range of ML models, including popular LLMs
Performance improvements and optimized implementations for supported models across a wider range of devices
Integrations with other popular frameworks and APIs
General availability and user guides for the sharktank model development toolkit
