354 Commits

Author SHA1 Message Date
stduhpf
8ecdf053ac
feat: add image preview support (#522) master-354-8ecdf05 2025-11-10 00:12:02 +08:00
leejet
ee89afc878
fix: resolve issue with pmid (#957) master-353-ee89afc 2025-11-09 22:47:53 +08:00
akleine
d2d3944f50
feat: add support for SD2.x with TINY U-Nets (#939) master-352-d2d3944 2025-11-09 22:47:37 +08:00
akleine
0fa3e1a383
fix: prevent core dump in PM V2 in case of incomplete cmd line (#950) master-351-0fa3e1a 2025-11-09 22:36:43 +08:00
leejet
c2d8ffc22c
fix: compatibility for models with modified tensor shapes (#951) master-350-c2d8ffc 2025-11-07 23:04:41 +08:00
stduhpf
fb748bb8a4
fix: TAE encoding (#935) master-349-fb748bb 2025-11-07 22:58:59 +08:00
leejet
8f6c5c217b
refactor: simplify the model loading logic (#933)
* remove String2GGMLType

* remove preprocess_tensor

* fix clip init

* simplify the logic for reading weights
master-348-8f6c5c2
2025-11-03 21:21:34 +08:00
leejet
6103d86e2c
refactor: introduce GGMLRunnerContext (#928)
* introduce GGMLRunnerContext

* add Flash Attention enable control through GGMLRunnerContext

* add conv2d_direct enable control through GGMLRunnerContext
master-347-6103d86
2025-11-02 02:11:04 +08:00
stduhpf
c42826b77c
fix: resolve multiple inpainting issues (#926)
* Fix inpainting masked image being broken by side effect

* Fix unet inpainting concat not being set

* Fix Flex.2 inpaint mode crash (+ use scale factor)
master-346-c42826b
2025-11-02 02:10:32 +08:00
Wagner Bruna
945d9a9ee3
docs: add Koboldcpp as an available UI (#930) 2025-11-02 02:03:01 +08:00
Wagner Bruna
353e708844
docs: update ggml and llama.cpp URLs (#931) 2025-11-02 02:02:44 +08:00
leejet
dd75fc081c
refactor: unify the naming style of ggml extension functions (#921) master-343-dd75fc0 2025-10-28 23:26:48 +08:00
stduhpf
77eb95f8e4
docs: fix taesd direct download link (#917) 2025-10-28 23:26:23 +08:00
Wagner Bruna
8a45d0ff7f
chore: clean up stb includes (#919) master-341-8a45d0f 2025-10-28 23:25:45 +08:00
leejet
9e28be6479
feat: add chroma radiance support (#910)
* add chroma radiance support

* fix ci

* simply generate_init_latent

* workaround: avoid ggml cuda error

* format code

* add chroma radiance doc
master-340-9e28be6
2025-10-25 23:56:14 +08:00
akleine
062490aa7c
feat: add SSD1B and tiny-sd support (#897)
* feat: add code and doc for running SSD1B models

* Added some more lines to support SD1.x with TINY U-Nets too.

* support SSD-1B.safetensors

* fix sdv1.5 diffusers format loader

---------

Co-authored-by: leejet <leejet714@gmail.com>
master-339-062490a
2025-10-25 23:35:54 +08:00
stduhpf
faabc5ad3c
feat: allow models to run without all text encoder(s) (#645) master-338-faabc5a 2025-10-25 22:00:56 +08:00
leejet
69b9511ce9 sync: update ggml 2025-10-24 00:32:45 +08:00
stduhpf
917f7bfe99
fix: support --flow-shift for flux models with default pred (#913) master-336-917f7bf 2025-10-23 21:35:18 +08:00
leejet
48e0a28ddf
feat: add shift factor support (#903) master-335-48e0a28 2025-10-23 01:20:29 +08:00
leejet
d05e46ca5e
chore: add .clang-tidy configuration and apply modernize checks (#902) master-334-d05e46c 2025-10-18 23:23:40 +08:00
Wagner Bruna
64a7698347
chore: report number of Qwen layers as info (#901) master-333-64a7698 2025-10-18 23:22:01 +08:00
leejet
0723ee51c9
refactor: optimize option printing (#900) master-332-0723ee5 2025-10-18 17:50:30 +08:00
leejet
90ef5f8246
feat: add auto-resize support for reference images (was Qwen-Image-Edit only) (#898) master-331-90ef5f8 2025-10-18 16:37:09 +08:00
leejet
db6f4791b4
feat: add wtype stat (#899) master-330-db6f479 2025-10-17 23:40:32 +08:00
leejet
b25785bc10 sync: update ggml 2025-10-17 21:46:39 +08:00
leejet
0585e2609d docs: split README sections (build, performance, etc.) into separate docs 2025-10-16 23:22:06 +08:00
leejet
683d6d08a8 chore: add github issue template 2025-10-16 21:04:41 +08:00
leejet
40a6a8710e
fix: resolve precision issues in SDXL VAE under fp16 (#888)
* fix: resolve precision issues in SDXL VAE under fp16

* add --force-sdxl-vae-conv-scale option

* update docs
master-326-40a6a87
2025-10-15 23:01:00 +08:00
Daniele
e3702585cb
feat: added prediction argument (#334) master-325-e370258 2025-10-15 23:00:10 +08:00
cmdr2
a7d6d296c7
chore: allow building ggml as a separate shared lib (#468) master-324-a7d6d29 2025-10-15 22:10:26 +08:00
leejet
2e9242e37f
feat: add Qwen Image Edit support (#877)
* add ref latent support for qwen image

* optimize clip_preprocess and fix get_first_stage_encoding

* add qwen2vl vit support

* add qwen image edit support

* fix qwen image edit pipeline

* add mmproj file support

* support dynamic number of Qwen image transformer blocks

* set prompt_template_encode_start_idx every time

* to_add_out precision fix

* to_out.0 precision fix

* update docs
master-323-2e9242e
2025-10-13 23:17:18 +08:00
Wagner Bruna
c64994dc1d
fix: better progress display for second-order samplers (#834) master-322-c64994d 2025-10-13 22:12:48 +08:00
Wagner Bruna
5436f6b814
fix: correct canny preprocessor (#861) master-321-5436f6b 2025-10-13 22:02:35 +08:00
leejet
1c32fa03bc
fix: avoid generating black images when running T5 on the GPU (#882) master-320-1c32fa0 2025-10-13 00:01:06 +08:00
Wagner Bruna
9727c6bb98
fix: resolve VAE tiling problem in Qwen Image (#873) master-319-9727c6b 2025-10-12 23:45:53 +08:00
leejet
beb99a2de2
feat: add Qwen Image support (#851)
* add qwen tokenizer

* add qwen2.5 vl support

* mv qwen.hpp -> qwenvl.hpp

* add qwen image model

* add qwen image t2i pipeline

* fix qwen image flash attn

* add qwen image i2i pipline

* change encoding of vocab_qwen.hpp to utf8

* fix get_first_stage_encoding

* apply jeffbolz f32 patch

https://github.com/leejet/stable-diffusion.cpp/pull/851#issuecomment-3335515302

* fix the issue that occurs when using CUDA with k-quants weights

* optimize the handling of the FeedForward precision fix

* to_add_out precision fix

* update docs
master-318-beb99a2
2025-10-12 23:23:19 +08:00
Wagner Bruna
aa68b875b9
refactor: deal with default img-cfg-scale at the library level (#869) master-317-aa68b87 2025-10-12 23:17:52 +08:00
Wagner Bruna
5b261b9cee
feat: add a stand-alone upscale mode (#865)
* feat: add a stand-alone upscale mode

* fix prompt option check

* format code

* update README.md

---------

Co-authored-by: leejet <leejet714@gmail.com>
master-316-5b261b9
2025-10-12 23:10:02 +08:00
Pedrito
e70d0205ca
feat: add support for more esrgan models & x2 & x1 models (#855) master-315-e70d020 2025-10-12 22:53:31 +08:00
leejet
02af48a97f
chore: fix vulkan ci (#878) master-314-02af48a 2025-10-11 00:40:57 +08:00
leejet
e12d5e0aaf
fix: ensure directory iteration results are sorted by filename (#858) 2025-10-11 00:18:39 +08:00
Serkan Sahin
940a2018e1
chore: fix dockerfile libgomp1 dependency + improvements (#852) 2025-10-11 00:17:45 +08:00
Sharuzzaman Ahmat Raslan
b451728b2f
docs: update README.md (#866) 2025-10-11 00:11:10 +08:00
stduhpf
11f436c483
feat: add support for Flux Controls and Flex.2 (#692) 2025-10-11 00:06:57 +08:00
leejet
35843c77ea
fix: optimize the handling of embedding weight (#859) master-309-35843c7 2025-09-25 23:09:59 +08:00
leejet
6ad46bb700 sync: update ggml 2025-09-25 21:57:43 +08:00
leejet
1ba30ce005 sync: update ggml 2025-09-25 00:38:38 +08:00
leejet
2abe9451c4
fix: optimize the handling of CLIP embedding weight (#840) master-306-2abe945 2025-09-25 00:28:20 +08:00
Wagner Bruna
f3140eadbb
fix: tensor loading thread count (#854) master-305-f3140ea 2025-09-25 00:26:38 +08:00