diff --git a/README.md b/README.md
index 553fb7f..e30afe5 100644
--- a/README.md
+++ b/README.md
@@ -13,7 +13,7 @@ Inference of Stable Diffusion and Flux in pure C/C++
 - SD1.x, SD2.x, SDXL and [SD3/SD3.5](./docs/sd3.md) support
 - !!!The VAE in SDXL encounters NaN issues under FP16, but unfortunately, the ggml_conv_2d only operates under FP16. Hence, a parameter is needed to specify the VAE that has fixed the FP16 NaN issue. You can find it here: [SDXL VAE FP16 Fix](https://huggingface.co/madebyollin/sdxl-vae-fp16-fix/blob/main/sdxl_vae.safetensors).
 - [Flux-dev/Flux-schnell Support](./docs/flux.md)
-
+- [FLUX.1-Kontext-dev](./docs/kontext.md)
 - [SD-Turbo](https://huggingface.co/stabilityai/sd-turbo) and [SDXL-Turbo](https://huggingface.co/stabilityai/sdxl-turbo) support
 - [PhotoMaker](https://github.com/TencentARC/PhotoMaker) support.
 - 16-bit, 32-bit float support
@@ -220,7 +220,7 @@ arguments:
   -m, --model [MODEL]                path to full model
   --diffusion-model                  path to the standalone diffusion model
   --clip_l                           path to the clip-l text encoder
-  --clip_g                           path to the clip-l text encoder
+  --clip_g                           path to the clip-g text encoder
   --t5xxl                            path to the the t5xxl text encoder
   --vae [VAE]                        path to vae
   --taesd [TAESD_PATH]               path to taesd. Using Tiny AutoEncoder for fast decoding (low quality)
@@ -231,26 +231,32 @@ arguments:
   --normalize-input                  normalize PHOTOMAKER input id images
   --upscale-model [ESRGAN_PATH]      path to esrgan model. Upscale images after generate, just RealESRGAN_x4plus_anime_6B supported by now
   --upscale-repeats                  Run the ESRGAN upscaler this many times (default 1)
-  --type [TYPE]                      weight type (f32, f16, q4_0, q4_1, q5_0, q5_1, q8_0, q2_k, q3_k, q4_k)
+  --type [TYPE]                      weight type (examples: f32, f16, q4_0, q4_1, q5_0, q5_1, q8_0, q2_K, q3_K, q4_K)
                                      If not specified, the default is the type of the weight file
   --lora-model-dir [DIR]             lora model directory
   -i, --init-img [IMAGE]             path to the input image, required by img2img
+  --mask [MASK]                      path to the mask image, required by img2img with mask
   --control-image [IMAGE]            path to image condition, control net
+  -r, --ref_image [PATH]             reference image for Flux Kontext models (can be used multiple times)
   -o, --output OUTPUT                path to write result image to (default: ./output.png)
   -p, --prompt [PROMPT]              the prompt to render
   -n, --negative-prompt PROMPT       the negative prompt (default: "")
   --cfg-scale SCALE                  unconditional guidance scale: (default: 7.0)
+  --guidance SCALE                   guidance scale for img2img (default: 3.5)
+  --slg-scale SCALE                  skip layer guidance (SLG) scale, only for DiT models: (default: 0)
+                                     0 means disabled, a value of 2.5 is nice for sd3.5 medium
+  --eta SCALE                        eta in DDIM, only for DDIM and TCD: (default: 0)
   --skip-layers LAYERS               Layers to skip for SLG steps: (default: [7,8,9])
   --skip-layer-start START           SLG enabling point: (default: 0.01)
   --skip-layer-end END               SLG disabling point: (default: 0.2)
-                                     SLG will be enabled at step int([STEPS]*[START]) and disabled at int([STEPS]*[END])
+                                     SLG will be enabled at step int([STEPS]*[START]) and disabled at int([STEPS]*[END])
   --strength STRENGTH                strength for noising/unnoising (default: 0.75)
   --style-ratio STYLE-RATIO          strength for keeping input identity (default: 20%)
   --control-strength STRENGTH        strength to apply Control Net (default: 0.9)
                                      1.0 corresponds to full destruction of information in init image
   -H, --height H                     image height, in pixel space (default: 512)
   -W, --width W                      image width, in pixel space (default: 512)
-  --sampling-method {euler, euler_a, heun, dpm2, dpm++2s_a, dpm++2m, dpm++2mv2, ipndm, ipndm_v, lcm}
+  --sampling-method {euler, euler_a, heun, dpm2, dpm++2s_a, dpm++2m, dpm++2mv2, ipndm, ipndm_v, lcm, ddim_trailing, tcd}
                                      sampling method (default: "euler_a")
   --steps STEPS                      number of sample steps (default: 20)
   --rng {std_default, cuda}          RNG (default: cuda)
@@ -267,7 +273,7 @@ arguments:
                                      This might crash if it is not supported by the backend.
   --control-net-cpu                  keep controlnet in cpu (for low vram)
   --canny                            apply canny preprocessor (edge detection)
-  --color                            Colors the logging tags according to level
+  --color                            colors the logging tags according to level
   -v, --verbose                      print extra info
 ```
diff --git a/assets/flux/kontext1_dev_output.png b/assets/flux/kontext1_dev_output.png
new file mode 100644
index 0000000..4fa5e38
Binary files /dev/null and b/assets/flux/kontext1_dev_output.png differ
diff --git a/docs/kontext.md b/docs/kontext.md
new file mode 100644
index 0000000..5197525
--- /dev/null
+++ b/docs/kontext.md
@@ -0,0 +1,39 @@
+# How to Use
+
+You can run FLUX.1-Kontext-dev with stable-diffusion.cpp on a GPU with 6 GB or even 4 GB of VRAM, without offloading to RAM.
+
+## Download weights
+
+- Download Kontext
+    - If you don't want to do the conversion yourself, download the preconverted gguf model from [FLUX.1-Kontext-dev-GGUF](https://huggingface.co/QuantStack/FLUX.1-Kontext-dev-GGUF)
+    - Otherwise, download FLUX.1-Kontext-dev from https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev/blob/main/flux1-kontext-dev.safetensors
+- Download vae from https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/ae.safetensors
+- Download clip_l from https://huggingface.co/comfyanonymous/flux_text_encoders/blob/main/clip_l.safetensors
+- Download t5xxl from https://huggingface.co/comfyanonymous/flux_text_encoders/blob/main/t5xxl_fp16.safetensors
+
+## Convert Kontext weights
+
+If you downloaded the preconverted gguf weights from [FLUX.1-Kontext-dev-GGUF](https://huggingface.co/QuantStack/FLUX.1-Kontext-dev-GGUF), you can skip this step. Otherwise, convert the safetensors file yourself (a lower-VRAM variant of this command is sketched at the end of this page):
+
+```
+.\bin\Release\sd.exe -M convert -m ..\..\ComfyUI\models\unet\flux1-kontext-dev.safetensors -o ..\models\flux1-kontext-dev-q8_0.gguf -v --type q8_0
+```
+
+## Run
+
+- Setting `--cfg-scale` to 1 is recommended.
+
+### Example
+
+```
+.\bin\Release\sd.exe -M edit -r .\flux1-dev-q8_0.png --diffusion-model ..\models\flux1-kontext-dev-q8_0.gguf --vae ..\models\ae.safetensors --clip_l ..\models\clip_l.safetensors --t5xxl ..\models\t5xxl_fp16.safetensors -p "change 'flux.cpp' to 'kontext.cpp'" --cfg-scale 1.0 --sampling-method euler -v
+```
+
+| ref_image | prompt | output |
+| ---- | ---- | ---- |
+| ![](../assets/flux/flux1-dev-q8_0.png) | change 'flux.cpp' to 'kontext.cpp' | ![](../assets/flux/kontext1_dev_output.png) |
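+
+The example above uses Windows paths. Below is a minimal Linux/macOS sketch of the same edit; it assumes a CMake build that placed the binary at `./build/bin/sd`, and the model paths are placeholders for wherever you stored the files from the download step.
+
+```
+# Sketch only: adjust the binary location and model paths to your setup.
+./build/bin/sd -M edit \
+  -r ./flux1-dev-q8_0.png \
+  --diffusion-model ./models/flux1-kontext-dev-q8_0.gguf \
+  --vae ./models/ae.safetensors \
+  --clip_l ./models/clip_l.safetensors \
+  --t5xxl ./models/t5xxl_fp16.safetensors \
+  -p "change 'flux.cpp' to 'kontext.cpp'" \
+  --cfg-scale 1.0 --sampling-method euler -v
+```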
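+
+If q8_0 is still too large for your card, the conversion step from the Convert section can target a smaller quantization. This is only a sketch: it assumes a CMake build that placed the binary at `./build/bin/sd`, placeholder model paths, and that the `q4_K` weight type listed in the main README is acceptable quality for your use case.
+
+```
+# Sketch only: smaller k-quant for low-VRAM cards; expect some quality loss vs q8_0.
+./build/bin/sd -M convert \
+  -m ./models/flux1-kontext-dev.safetensors \
+  -o ./models/flux1-kontext-dev-q4_K.gguf \
+  -v --type q4_K
+```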