leejet 36330724bd
feat: add module backend assignment support (#1500)
Co-authored-by: Stéphane du Hamel <stephduh@live.fr>
2026-05-16 20:27:06 +08:00

123 lines
4.5 KiB
Markdown

# Backend selection
`stable-diffusion.cpp` has two backend assignments:
- `--backend` selects the runtime backend used to execute model graphs.
- `--params-backend` selects the backend used to allocate model parameters.
If `--params-backend` is not set, parameters use the same backend as their module runtime backend.
## Syntax
A backend assignment can be a single backend name:
```shell
sd-cli -m model.safetensors -p "a cat" --backend cpu
```
This applies to every module that does not have a more specific assignment.
Assignments can also target individual modules:
```shell
sd-cli -m model.safetensors -p "a cat" --backend te=cpu,vae=cuda0,diffusion=vulkan0
```
The same syntax is used for parameter placement:
```shell
sd-cli -m model.safetensors -p "a cat" --backend cuda0 --params-backend te=cpu,vae=cpu
```
Module names are case-insensitive. Hyphens and underscores in module names are ignored, so `clip_vision`, `clip-vision`, and `clipvision` are equivalent.
`all=`, `default=`, and `*=` can be used to set the default backend inside a mixed assignment:
```shell
sd-cli -m model.safetensors -p "a cat" --backend all=cuda0,te=cpu
```
## Modules
| Module | Purpose | Accepted names |
| --- | --- | --- |
| `diffusion` | UNet, DiT, MMDiT, Flux, Wan, Qwen Image, and other diffusion models | `diffusion`, `model`, `unet`, `dit` |
| `te` | Text encoders and conditioners | `te`, `clip`, `text`, `textencoder`, `textencoders`, `conditioner`, `cond`, `llm`, `t5`, `t5xxl` |
| `clip_vision` | CLIP vision encoder | `clip_vision`, `clipvision`, `clip-vision`, `vision` |
| `vae` | VAE and TAE | `vae`, `firststage`, `autoencoder`, `tae` |
| `controlnet` | ControlNet | `controlnet`, `control` |
| `photomaker` | PhotoMaker ID encoder and PhotoMaker LoRA | `photomaker`, `photomakerid`, `pmid`, `photo` |
| `upscaler` | ESRGAN upscaler | `upscaler`, `esrgan`, `hires` |
`te` is the preferred module name for text encoders. `clip` is kept as an accepted alias because many existing commands and model names use CLIP terminology.
## Backend names
Backend names are resolved against the GGML backend device list. Matching is case-insensitive and accepts exact names or unique prefixes, so common values include names such as:
- `cpu`
- `cuda0`
- `vulkan0`
- `metal`
The special values `auto`, `default`, and an empty backend name select the default backend. The default preference is GPU, then integrated GPU, then CPU.
The special value `gpu` selects the first GPU backend, falling back to the first integrated GPU backend.
## Runtime backend vs. parameter backend
The runtime backend controls where graph execution runs. The parameter backend controls where model weights are allocated.
For example:
```shell
sd-cli -m model.safetensors -p "a cat" --backend cuda0 --params-backend cpu
```
This runs all modules on `cuda0`, but stores parameters in CPU RAM. During execution, parameters are moved to the runtime backend as needed.
Per-module assignments can be mixed:
```shell
sd-cli -m model.safetensors -p "a cat" --backend diffusion=cuda0,te=cpu,vae=cpu --params-backend diffusion=cuda0,te=cpu,vae=cpu
```
This keeps text encoding and VAE execution on CPU while the diffusion model runs on GPU.
## Backend sharing and lifetime
Backends are managed by `SDBackendManager`.
Within one manager, backend instances are cached by resolved backend device name. If multiple modules request the same backend, they share the same `ggml_backend_t`.
For example:
```shell
--backend te=cpu,vae=cpu
```
uses one shared CPU backend for both `te` and `vae` runtime execution.
Runtime and parameter assignments also share the same backend cache. If `--backend diffusion=cuda0` and `--params-backend diffusion=cuda0` resolve to the same device, both use the same backend instance.
`SDBackendManager` owns the backend instances and frees them when the context or upscaler is destroyed. Model runners receive non-owning runtime and parameter backend pointers and do not free them.
## Compatibility flags
The older CPU placement flags are still supported:
- `--clip-on-cpu`
- `--vae-on-cpu`
- `--control-net-cpu`
- `--offload-to-cpu`
`--clip-on-cpu`, `--vae-on-cpu`, and `--control-net-cpu` affect runtime backend assignment only when `--backend` is not set. They map to `te=cpu`, `vae=cpu`, and `controlnet=cpu`.
`--offload-to-cpu` affects parameter backend assignment only when `--params-backend` is not set. It is equivalent to:
```shell
--params-backend cpu
```
Explicit `--backend` and `--params-backend` assignments are preferred for new commands.