mirror of
https://github.com/leejet/stable-diffusion.cpp.git
synced 2025-12-12 13:28:37 +00:00
docs: add kontext doc
parent c9b5735116
commit 884e23eeeb
18
README.md
@@ -13,7 +13,7 @@ Inference of Stable Diffusion and Flux in pure C/C++
- SD1.x, SD2.x, SDXL and [SD3/SD3.5](./docs/sd3.md) support
    - !!!The VAE in SDXL encounters NaN issues under FP16, but unfortunately, the ggml_conv_2d only operates under FP16. Hence, a parameter is needed to specify the VAE that has fixed the FP16 NaN issue. You can find it here: [SDXL VAE FP16 Fix](https://huggingface.co/madebyollin/sdxl-vae-fp16-fix/blob/main/sdxl_vae.safetensors).
- [Flux-dev/Flux-schnell Support](./docs/flux.md)
- [FLUX.1-Kontext-dev](./docs/kontext.md)
- [SD-Turbo](https://huggingface.co/stabilityai/sd-turbo) and [SDXL-Turbo](https://huggingface.co/stabilityai/sdxl-turbo) support
- [PhotoMaker](https://github.com/TencentARC/PhotoMaker) support.
- 16-bit, 32-bit float support
@@ -220,7 +220,7 @@ arguments:
  -m, --model [MODEL]                path to full model
  --diffusion-model                  path to the standalone diffusion model
  --clip_l                           path to the clip-l text encoder
  --clip_g                           path to the clip-l text encoder
  --clip_g                           path to the clip-g text encoder
  --t5xxl                            path to the t5xxl text encoder
  --vae [VAE]                        path to vae
  --taesd [TAESD_PATH]               path to taesd. Using Tiny AutoEncoder for fast decoding (low quality)
@@ -231,26 +231,32 @@ arguments:
  --normalize-input                  normalize PHOTOMAKER input id images
  --upscale-model [ESRGAN_PATH]      path to esrgan model. Upscale images after generate, just RealESRGAN_x4plus_anime_6B supported by now
  --upscale-repeats                  Run the ESRGAN upscaler this many times (default 1)
  --type [TYPE]                      weight type (f32, f16, q4_0, q4_1, q5_0, q5_1, q8_0, q2_k, q3_k, q4_k)
  --type [TYPE]                      weight type (examples: f32, f16, q4_0, q4_1, q5_0, q5_1, q8_0, q2_K, q3_K, q4_K)
                                     If not specified, the default is the type of the weight file
  --lora-model-dir [DIR]             lora model directory
  -i, --init-img [IMAGE]             path to the input image, required by img2img
  --mask [MASK]                      path to the mask image, required by img2img with mask
  --control-image [IMAGE]            path to image condition, control net
  -r, --ref_image [PATH]             reference image for Flux Kontext models (can be used multiple times)
  -o, --output OUTPUT                path to write result image to (default: ./output.png)
  -p, --prompt [PROMPT]              the prompt to render
  -n, --negative-prompt PROMPT       the negative prompt (default: "")
  --cfg-scale SCALE                  unconditional guidance scale: (default: 7.0)
  --guidance SCALE                   guidance scale for img2img (default: 3.5)
  --slg-scale SCALE                  skip layer guidance (SLG) scale, only for DiT models: (default: 0)
                                     0 means disabled, a value of 2.5 is nice for sd3.5 medium
  --eta SCALE                        eta in DDIM, only for DDIM and TCD: (default: 0)
  --skip-layers LAYERS               Layers to skip for SLG steps: (default: [7,8,9])
  --skip-layer-start START           SLG enabling point: (default: 0.01)
  --skip-layer-end END               SLG disabling point: (default: 0.2)
                                     SLG will be enabled at step int([STEPS]*[START]) and disabled at int([STEPS]*[END])
                                     SLG will be enabled at step int([STEPS]*[START]) and disabled at int([STEPS]*[END])
  --strength STRENGTH                strength for noising/unnoising (default: 0.75)
  --style-ratio STYLE-RATIO          strength for keeping input identity (default: 20%)
  --control-strength STRENGTH        strength to apply Control Net (default: 0.9)
                                     1.0 corresponds to full destruction of information in init image
  -H, --height H                     image height, in pixel space (default: 512)
  -W, --width W                      image width, in pixel space (default: 512)
  --sampling-method {euler, euler_a, heun, dpm2, dpm++2s_a, dpm++2m, dpm++2mv2, ipndm, ipndm_v, lcm}
  --sampling-method {euler, euler_a, heun, dpm2, dpm++2s_a, dpm++2m, dpm++2mv2, ipndm, ipndm_v, lcm, ddim_trailing, tcd}
                                     sampling method (default: "euler_a")
  --steps STEPS                      number of sample steps (default: 20)
  --rng {std_default, cuda}          RNG (default: cuda)
@@ -267,7 +273,7 @@ arguments:
                                     This might crash if it is not supported by the backend.
  --control-net-cpu                  keep controlnet in cpu (for low vram)
  --canny                            apply canny preprocessor (edge detection)
  --color                            Colors the logging tags according to level
  --color                            colors the logging tags according to level
  -v, --verbose                      print extra info
```
BIN
assets/flux/kontext1_dev_output.png
Normal file
Binary file not shown. Size: 496 KiB
39
docs/kontext.md
Normal file
@@ -0,0 +1,39 @@
# How to Use

You can run Kontext using stable-diffusion.cpp with a GPU that has 6GB or even 4GB of VRAM, without needing to offload to RAM.

## Download weights

- Download Kontext
    - If you don't want to do the conversion yourself, download the preconverted gguf model from [FLUX.1-Kontext-dev-GGUF](https://huggingface.co/QuantStack/FLUX.1-Kontext-dev-GGUF)
    - Otherwise, download FLUX.1-Kontext-dev from https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev/blob/main/flux1-kontext-dev.safetensors
- Download vae from https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/ae.safetensors
- Download clip_l from https://huggingface.co/comfyanonymous/flux_text_encoders/blob/main/clip_l.safetensors
- Download t5xxl from https://huggingface.co/comfyanonymous/flux_text_encoders/blob/main/t5xxl_fp16.safetensors

## Convert Kontext weights

You can download the preconverted gguf weights from [FLUX.1-Kontext-dev-GGUF](https://huggingface.co/QuantStack/FLUX.1-Kontext-dev-GGUF); this way you don't have to do the conversion yourself.

```
.\bin\Release\sd.exe -M convert -m ..\..\ComfyUI\models\unet\flux1-kontext-dev.safetensors -o ..\models\flux1-kontext-dev-q8_0.gguf -v --type q8_0
```
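
The conversion command above can also be assembled from a script, for instance when several weight types are wanted. The following is a minimal Python sketch and not part of stable-diffusion.cpp: the helper name `convert_command`, the `./bin/sd` binary path, and the output naming scheme are assumptions made for illustration. It uses only the flags that appear in the command above (`-M convert`, `-m`, `-o`, `-v`, `--type`).

```python
import shlex
from pathlib import Path


def convert_command(sd_bin, src, out_dir, weight_type="q8_0"):
    """Build the argument list for an `sd -M convert` call.

    The output name `<source stem>-<weight_type>.gguf` follows the naming
    used in the example command; it is a convention assumed here, not a
    requirement of the tool.
    """
    src = Path(src)
    out = Path(out_dir) / f"{src.stem}-{weight_type}.gguf"
    return [
        str(sd_bin), "-M", "convert",
        "-m", str(src),
        "-o", str(out),
        "-v",
        "--type", weight_type,
    ]


if __name__ == "__main__":
    # Hypothetical paths; adjust to the local build and model locations.
    for weight_type in ("q8_0", "q4_0"):
        cmd = convert_command("./bin/sd", "flux1-kontext-dev.safetensors",
                              "models", weight_type)
        # Print only; nothing is executed here.
        print(shlex.join(cmd))
```

To run a conversion rather than print it, the returned list can be passed to `subprocess.run(cmd, check=True)`.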

## Run

- `--cfg-scale` is recommended to be set to 1.

### Example
For example:

```
.\bin\Release\sd.exe -M edit -r .\flux1-dev-q8_0.png --diffusion-model ..\models\flux1-kontext-dev-q8_0.gguf --vae ..\models\ae.sft --clip_l ..\models\clip_l.safetensors --t5xxl ..\models\t5xxl_fp16.safetensors -p "change 'flux.cpp' to 'kontext.cpp'" --cfg-scale 1.0 --sampling-method euler -v
```
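
The README argument list in this commit states that `-r` can be given more than once. As a hedged illustration of how such a call might be assembled, the Python sketch below builds the same `-M edit` command with one `-r` per reference image. The helper name `edit_command`, the `./bin/sd` path, and the second reference file are hypothetical; the defaults mirror the example above (`--cfg-scale 1.0`, `--sampling-method euler`).

```python
import shlex


def edit_command(sd_bin, diffusion_model, vae, clip_l, t5xxl, prompt,
                 ref_images, cfg_scale=1.0, sampling_method="euler"):
    """Build the argument list for an `sd -M edit` call (Kontext)."""
    cmd = [sd_bin, "-M", "edit"]
    # One `-r` flag per reference image, in the order given.
    for ref in ref_images:
        cmd += ["-r", ref]
    cmd += [
        "--diffusion-model", diffusion_model,
        "--vae", vae,
        "--clip_l", clip_l,
        "--t5xxl", t5xxl,
        "-p", prompt,
        "--cfg-scale", str(cfg_scale),
        "--sampling-method", sampling_method,
        "-v",
    ]
    return cmd


if __name__ == "__main__":
    cmd = edit_command(
        "./bin/sd",                           # assumed binary location
        "models/flux1-kontext-dev-q8_0.gguf",
        "models/ae.sft",
        "models/clip_l.safetensors",
        "models/t5xxl_fp16.safetensors",
        "change 'flux.cpp' to 'kontext.cpp'",
        ["flux1-dev-q8_0.png", "second_ref.png"],  # second file is hypothetical
    )
    # Print only; nothing is executed here.
    print(shlex.join(cmd))
```

Building the command as a list keeps the prompt's quotes intact without shell escaping; the list can be handed to `subprocess.run(cmd, check=True)` when the binary and model files are in place.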

| ref_image | prompt | output |
| ---- | ---- | ---- |
|  | change 'flux.cpp' to 'kontext.cpp' | |