103 Commits

Author SHA1 Message Date
wuhei
a908436729
docs: update download link for Stable Diffusion v1.5 (#1063) 2025-12-09 22:06:16 +08:00
leejet
c9005337a8 docs: update logo 2025-12-07 14:56:21 +08:00
leejet
2f0bd31a84
feat: add ovis image support (#1057) 2025-12-07 12:32:56 +08:00
leejet
710169df5c docs: update news 2025-12-01 22:46:15 +08:00
leejet
34a6fd4e60
feat: add z-image support (#1020)
* add z-image support

* use flux_latent_rgb_proj for z-image

* fix qwen3 rope type

* add support for qwen3 4b gguf

* add support for diffusers format lora

* fix nan issue that occurs when using CUDA with k-quants weights

* add z-image docs
2025-12-01 22:39:43 +08:00
leejet
3c1187ce83 docs: correct the time of adding flux2 support 2025-11-30 12:40:56 +08:00
leejet
ab7e8d285e docs: update news 2025-11-30 11:51:23 +08:00
leejet
52b67c538b
feat: add flux2 support (#1016)
* add flux2 support

* rename qwenvl to llm

* add Flux2FlowDenoiser

* update docs
2025-11-30 11:32:56 +08:00
leejet
742a7333c3
feat: add cpu rng (#977) 2025-11-16 14:48:15 +08:00
Wagner Bruna
945d9a9ee3
docs: add Koboldcpp as an available UI (#930) 2025-11-02 02:03:01 +08:00
Wagner Bruna
353e708844
docs: update ggml and llama.cpp URLs (#931) 2025-11-02 02:02:44 +08:00
leejet
9e28be6479
feat: add chroma radiance support (#910)
* add chroma radiance support

* fix ci

* simply generate_init_latent

* workaround: avoid ggml cuda error

* format code

* add chroma radiance doc
2025-10-25 23:56:14 +08:00
akleine
062490aa7c
feat: add SSD1B and tiny-sd support (#897)
* feat: add code and doc for running SSD1B models

* Added some more lines to support SD1.x with TINY U-Nets too.

* support SSD-1B.safetensors

* fix sdv1.5 diffusers format loader

---------

Co-authored-by: leejet <leejet714@gmail.com>
2025-10-25 23:35:54 +08:00
leejet
0585e2609d docs: split README sections (build, performance, etc.) into separate docs 2025-10-16 23:22:06 +08:00
leejet
40a6a8710e
fix: resolve precision issues in SDXL VAE under fp16 (#888)
* fix: resolve precision issues in SDXL VAE under fp16

* add --force-sdxl-vae-conv-scale option

* update docs
2025-10-15 23:01:00 +08:00
Daniele
e3702585cb
feat: added prediction argument (#334) 2025-10-15 23:00:10 +08:00
leejet
2e9242e37f
feat: add Qwen Image Edit support (#877)
* add ref latent support for qwen image

* optimize clip_preprocess and fix get_first_stage_encoding

* add qwen2vl vit support

* add qwen image edit support

* fix qwen image edit pipeline

* add mmproj file support

* support dynamic number of Qwen image transformer blocks

* set prompt_template_encode_start_idx every time

* to_add_out precision fix

* to_out.0 precision fix

* update docs
2025-10-13 23:17:18 +08:00
leejet
beb99a2de2
feat: add Qwen Image support (#851)
* add qwen tokenizer

* add qwen2.5 vl support

* mv qwen.hpp -> qwenvl.hpp

* add qwen image model

* add qwen image t2i pipeline

* fix qwen image flash attn

* add qwen image i2i pipline

* change encoding of vocab_qwen.hpp to utf8

* fix get_first_stage_encoding

* apply jeffbolz f32 patch

https://github.com/leejet/stable-diffusion.cpp/pull/851#issuecomment-3335515302

* fix the issue that occurs when using CUDA with k-quants weights

* optimize the handling of the FeedForward precision fix

* to_add_out precision fix

* update docs
2025-10-12 23:23:19 +08:00
Wagner Bruna
5b261b9cee
feat: add a stand-alone upscale mode (#865)
* feat: add a stand-alone upscale mode

* fix prompt option check

* format code

* update README.md

---------

Co-authored-by: leejet <leejet714@gmail.com>
2025-10-12 23:10:02 +08:00
Sharuzzaman Ahmat Raslan
b451728b2f
docs: update README.md (#866) 2025-10-11 00:11:10 +08:00
Stefan-Olt
98ba155fc6
docs: HipBLAS / ROCm build instruction fix (#843) 2025-09-25 00:03:05 +08:00
leejet
fd693ac6a2
refactor: remove unused --normalize-input parameter (#835) 2025-09-18 00:12:53 +08:00
rmatif
8376dfba2a
feat: add sgm_uniform scheduler, simple scheduler, and support for NitroFusion (#675)
* feat: Add timestep shift and two new schedulers

* update readme

* fix spaces

* format code

* simplify SGMUniformSchedule

* simplify shifted_timestep logic

* avoid conflict

---------

Co-authored-by: leejet <leejet714@gmail.com>
2025-09-16 22:42:09 +08:00
leejet
0ebe6fe118
refactor: simplify the logic of pm id image loading (#827) 2025-09-14 22:50:21 +08:00
leejet
52a97b3ac1
feat: add vace support (#819)
* add wan vace t2v support

* add --vace-strength option

* add vace i2v support

* fix the processing of vace_context

* add vace v2v support

* update docs
2025-09-14 16:57:33 +08:00
stduhpf
2c9b1e2594
feat: add VAE encoding tiling support and adaptive overlap (#484)
* implement  tiling vae encode support

* Tiling (vae/upscale): adaptative overlap

* Tiling: fix edge case

* Tiling: fix crash when less than 2 tiles per dim

* remove extra dot

* Tiling: fix edge cases for adaptative overlap

* tiling: fix edge case

* set vae tile size via env var

* vae tiling: refactor again, base on smaller buffer for alignment

* Use bigger tiles for encode (to match compute buffer size)

* Fix edge case when tile is bigger than latent

* non-square VAE tiling (#3)

* refactor tile number calculation

* support non-square tiles

* add env var to change tile overlap

* add safeguards and better error messages for SD_TILE_OVERLAP

* add safeguards and include overlapping factor for SD_TILE_SIZE

* avoid rounding issues when specifying SD_TILE_SIZE as a factor

* lower SD_TILE_OVERLAP limit

* zero-init empty output buffer

* Fix decode latent size

* fix encode

* tile size params instead of env

* Tiled vae parameter validation (#6)

* avoid crash with invalid tile sizes, use 0 for default

* refactor default tile size, limit overlap factor

* remove explicit parameter for relative tile size

* limit encoding tile to latent size

* unify code style and format code

* update docs

* fix get_tile_sizes in decode_first_stage

---------

Co-authored-by: Wagner Bruna <wbruna@users.noreply.github.com>
Co-authored-by: leejet <leejet714@gmail.com>
2025-09-14 16:00:29 +08:00
leejet
288e2d63c0 docs: update docs 2025-09-14 14:24:24 +08:00
Wagner Bruna
c607fc3ed4
feat: use Euler sampling by default for SD3 and Flux (#753)
Thank you for your contribution.
2025-09-14 12:34:41 +08:00
Markus Hartung
abb115cd02
fix: clarify lora quant support and small fixes (#792) 2025-09-08 22:39:25 +08:00
stduhpf
c587a43c99
feat: support incrementing ref image index (omni-kontext) (#755)
* kontext: support  ref images indices

* lora: support x_embedder

* update help message

* Support for negative indices

* support for OmniControl (offsets at index 0)

* c++11 compat

* add --increase-ref-index option

* simplify the logic and fix some issues

* update README.md

* remove unused variable

---------

Co-authored-by: leejet <leejet714@gmail.com>
2025-09-07 22:35:16 +08:00
leejet
d7f430cd69 docs: update docs and help message 2025-09-07 02:26:44 +08:00
leejet
cb1d975e96
feat: add wan2.1/2.2 support (#778)
* add wan vae suppport

* add wan model support

* add umt5 support

* add wan2.1 t2i support

* make flash attn work with wan

* make wan a little faster

* add wan2.1 t2v support

* add wan gguf support

* add offload params to cpu support

* add wan2.1 i2v support

* crop image before resize

* set default fps to 16

* add diff lora support

* fix wan2.1 i2v

* introduce sd_sample_params_t

* add wan2.2 t2v support

* add wan2.2 14B i2v support

* add wan2.2 ti2v support

* add high noise lora support

* sync: update ggml submodule url

* avoid build failure on linux

* avoid build failure

* update ggml

* update ggml

* fix sd_version_is_wan

* update ggml, fix cpu im2col_3d

* fix ggml_nn_attention_ext mask

* add cache support to ggml runner

* fix the issue of illegal memory access

* unify image loading processing

* add wan2.1/2.2 FLF2V support

* fix end_image mask

* update to latest ggml

* add GGUFReader

* update docs
2025-09-06 18:08:03 +08:00
SmallAndSoft
f0fa7ddc40
docs: add compile option needed by Ninja (#770) 2025-09-01 21:35:25 +08:00
Daniele
5b8996f74a
Conv2D direct support (#744)
* Conv2DDirect for VAE stage

* Enable only for Vulkan, reduced duplicated code

* Cmake option to use conv2d direct

* conv2d direct always on for opencl

* conv direct as a flag

* fix merge typo

* Align conv2d behavior to flash attention's

* fix readme

* add conv2d direct for controlnet

* add conv2d direct for esrgan

* clean code, use enable_conv2d_direct/get_all_blocks

* format code

---------

Co-authored-by: leejet <leejet714@gmail.com>
2025-08-03 01:25:17 +08:00
Ettore Di Giacinto
fb86bf4cb0
docs: add LocalAI to README's UIs (#741) 2025-07-24 22:39:26 +08:00
Daniele
26f3f61d37
docs: add sd.cpp-webui as an available frontend (#738) 2025-07-23 23:51:57 +08:00
leejet
ca0bd9396e
refactor: update c api (#728) 2025-07-13 18:48:42 +08:00
Binozo
7a8ff2e819
docs: add golang cgo bindings to README (#635) 2025-07-02 23:19:49 +08:00
rmatif
0927e8e322
docs: add Android app to README (#647) 2025-07-02 23:18:16 +08:00
rmatif
d42fd59464
feat: add OpenCL backend support (#680) 2025-06-30 23:32:23 +08:00
leejet
d6c87dce5c docs: add chroma doc 2025-06-29 23:58:15 +08:00
leejet
884e23eeeb docs: add kontext doc 2025-06-29 10:35:31 +08:00
piallai
99609761dc
docs: fix typo in readme (#574) 2025-02-22 21:30:28 +08:00
piallai
5eb15ef4d0
docs: add CLI-GUI to list (#546) 2025-01-18 13:16:54 +08:00
leejet
dcf91f9e0f chore: change SD_CUBLAS/SD_USE_CUBLAS to SD_CUDA/SD_USE_CUDA 2024-12-28 13:27:51 +08:00
piallai
b5cc1422da
fix: fix typo for skip layers parameters (#492) 2024-12-28 13:12:08 +08:00
R0CKSTAR
5cc74d1f09
feat: support Moore Threads GPU (#529)
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2024-12-28 13:08:36 +08:00
Erik Scholz
1c168d98a5
fix: repair flash attention support (#386)
* repair flash attention in _ext
this does not fix the currently broken fa behind the define, which is only used by VAE

Co-authored-by: FSSRepo <FSSRepo@users.noreply.github.com>

* make flash attention in the diffusion model a runtime flag
no support for sd3 or video

* remove old flash attention option and switch vae over to attn_ext

* update docs

* format code

---------

Co-authored-by: FSSRepo <FSSRepo@users.noreply.github.com>
Co-authored-by: leejet <leejet714@gmail.com>
2024-11-23 12:39:08 +08:00
William Murray
ea9b647080
docs: update readme, add python bindings (#423) 2024-11-23 11:52:33 +08:00
Flavio Bizzarri
b99cbfe4dc
docs: update README.md (#452) 2024-11-23 11:46:50 +08:00