Pedrito
1ac5a616de
feat: support custom upscale tile size ( #896 )
2025-12-10 22:25:19 +08:00
leejet
d939f6e86a
refactor: optimize the handling of LoRA models ( #1070 )
2025-12-10 00:26:07 +08:00
Wagner Bruna
e72aea796e
feat: embed version string and git commit hash ( #1008 )
2025-12-09 22:38:54 +08:00
leejet
96c3e64057
refactor: optimize the handling of embedding ( #1068 )
...
* optimize the handling of embedding
* support case-insensitive embedding names
2025-12-08 23:59:04 +08:00
leejet
985aedda32
refactor: optimize the handling of pred type ( #1048 )
2025-12-04 23:31:55 +08:00
Wagner Bruna
118683de8a
fix: correct preview method selection ( #1038 )
2025-12-04 22:43:16 +08:00
leejet
5865b5e703
refactor: split SDParams to SDCliParams/SDContextParams/SDGenerationParams ( #1032 )
2025-12-03 22:31:46 +08:00
Wagner Bruna
e4c50f1de5
chore: add sd_ prefix to a few functions ( #967 )
2025-12-01 22:43:52 +08:00
Wagner Bruna
0249509a30
refactor: add user data pointer to the image preview callback ( #1001 )
2025-11-30 11:34:17 +08:00
leejet
52b67c538b
feat: add flux2 support ( #1016 )
...
* add flux2 support
* rename qwenvl to llm
* add Flux2FlowDenoiser
* update docs
2025-11-30 11:32:56 +08:00
leejet
20345888a3
refactor: optimize the handling of sample method ( #999 )
2025-11-22 14:00:25 +08:00
akleine
490c51d963
feat: report success/failure when saving PNG/JPG output ( #912 )
2025-11-22 13:57:44 +08:00
Wagner Bruna
45c46779af
feat: add LCM scheduler ( #983 )
2025-11-22 13:53:31 +08:00
leejet
869d023416
refactor: optimize the handling of scheduler ( #998 )
2025-11-22 12:48:53 +08:00
Wagner Bruna
b542894fb9
fix: avoid crash on default video preview path ( #997 )
...
Co-authored-by: masamaru-san
2025-11-22 12:46:27 +08:00
rmatif
a14e2b321d
feat: add easycache support ( #940 )
2025-11-19 23:19:32 +08:00
leejet
d5b05f70c6
feat: support independent sampler rng ( #978 )
2025-11-16 17:11:02 +08:00
Wagner Bruna
199e675cc7
feat: support for --tensor-type-rules on generation modes ( #932 )
2025-11-16 17:07:32 +08:00
leejet
742a7333c3
feat: add cpu rng ( #977 )
2025-11-16 14:48:15 +08:00
Wagner Bruna
e8eb3791c8
fix: typo in --lora-apply-mode help ( #972 )
2025-11-16 14:48:00 +08:00
leejet
347710f68f
feat: support applying LoRA at runtime ( #969 )
2025-11-13 21:48:44 +08:00
stduhpf
8ecdf053ac
feat: add image preview support ( #522 )
2025-11-10 00:12:02 +08:00
leejet
d05e46ca5e
chore: add .clang-tidy configuration and apply modernize checks ( #902 )
2025-10-18 23:23:40 +08:00
leejet
0723ee51c9
refactor: optimize option printing ( #900 )
2025-10-18 17:50:30 +08:00
leejet
90ef5f8246
feat: add auto-resize support for reference images (was Qwen-Image-Edit only) ( #898 )
2025-10-18 16:37:09 +08:00
leejet
0585e2609d
docs: split README sections (build, performance, etc.) into separate docs
2025-10-16 23:22:06 +08:00
leejet
40a6a8710e
fix: resolve precision issues in SDXL VAE under fp16 ( #888 )
...
* fix: resolve precision issues in SDXL VAE under fp16
* add --force-sdxl-vae-conv-scale option
* update docs
2025-10-15 23:01:00 +08:00
Daniele
e3702585cb
feat: added prediction argument ( #334 )
2025-10-15 23:00:10 +08:00
leejet
2e9242e37f
feat: add Qwen Image Edit support ( #877 )
...
* add ref latent support for qwen image
* optimize clip_preprocess and fix get_first_stage_encoding
* add qwen2vl vit support
* add qwen image edit support
* fix qwen image edit pipeline
* add mmproj file support
* support dynamic number of Qwen image transformer blocks
* set prompt_template_encode_start_idx every time
* to_add_out precision fix
* to_out.0 precision fix
* update docs
2025-10-13 23:17:18 +08:00
leejet
beb99a2de2
feat: add Qwen Image support ( #851 )
...
* add qwen tokenizer
* add qwen2.5 vl support
* mv qwen.hpp -> qwenvl.hpp
* add qwen image model
* add qwen image t2i pipeline
* fix qwen image flash attn
* add qwen image i2i pipeline
* change encoding of vocab_qwen.hpp to utf8
* fix get_first_stage_encoding
* apply jeffbolz f32 patch
https://github.com/leejet/stable-diffusion.cpp/pull/851#issuecomment-3335515302
* fix the issue that occurs when using CUDA with k-quants weights
* optimize the handling of the FeedForward precision fix
* to_add_out precision fix
* update docs
2025-10-12 23:23:19 +08:00
Wagner Bruna
aa68b875b9
refactor: deal with default img-cfg-scale at the library level ( #869 )
2025-10-12 23:17:52 +08:00
Wagner Bruna
5b261b9cee
feat: add a stand-alone upscale mode ( #865 )
...
* feat: add a stand-alone upscale mode
* fix prompt option check
* format code
* update README.md
---------
Co-authored-by: leejet <leejet714@gmail.com>
2025-10-12 23:10:02 +08:00
leejet
e12d5e0aaf
fix: ensure directory iteration results are sorted by filename ( #858 )
2025-10-11 00:18:39 +08:00
stduhpf
11f436c483
feat: add support for Flux Controls and Flex.2 ( #692 )
2025-10-11 00:06:57 +08:00
leejet
fd693ac6a2
refactor: remove unused --normalize-input parameter ( #835 )
2025-09-18 00:12:53 +08:00
rmatif
8376dfba2a
feat: add sgm_uniform scheduler, simple scheduler, and support for NitroFusion ( #675 )
...
* feat: Add timestep shift and two new schedulers
* update readme
* fix spaces
* format code
* simplify SGMUniformSchedule
* simplify shifted_timestep logic
* avoid conflict
---------
Co-authored-by: leejet <leejet714@gmail.com>
2025-09-16 22:42:09 +08:00
leejet
0ebe6fe118
refactor: simplify the logic of pm id image loading ( #827 )
2025-09-14 22:50:21 +08:00
leejet
52a97b3ac1
feat: add vace support ( #819 )
...
* add wan vace t2v support
* add --vace-strength option
* add vace i2v support
* fix the processing of vace_context
* add vace v2v support
* update docs
2025-09-14 16:57:33 +08:00
stduhpf
2c9b1e2594
feat: add VAE encoding tiling support and adaptive overlap ( #484 )
...
* implement tiling vae encode support
* Tiling (vae/upscale): adaptive overlap
* Tiling: fix edge case
* Tiling: fix crash when less than 2 tiles per dim
* remove extra dot
* Tiling: fix edge cases for adaptive overlap
* tiling: fix edge case
* set vae tile size via env var
* vae tiling: refactor again, based on smaller buffer for alignment
* Use bigger tiles for encode (to match compute buffer size)
* Fix edge case when tile is bigger than latent
* non-square VAE tiling (#3)
* refactor tile number calculation
* support non-square tiles
* add env var to change tile overlap
* add safeguards and better error messages for SD_TILE_OVERLAP
* add safeguards and include overlapping factor for SD_TILE_SIZE
* avoid rounding issues when specifying SD_TILE_SIZE as a factor
* lower SD_TILE_OVERLAP limit
* zero-init empty output buffer
* Fix decode latent size
* fix encode
* tile size params instead of env
* Tiled vae parameter validation (#6)
* avoid crash with invalid tile sizes, use 0 for default
* refactor default tile size, limit overlap factor
* remove explicit parameter for relative tile size
* limit encoding tile to latent size
* unify code style and format code
* update docs
* fix get_tile_sizes in decode_first_stage
---------
Co-authored-by: Wagner Bruna <wbruna@users.noreply.github.com>
Co-authored-by: leejet <leejet714@gmail.com>
2025-09-14 16:00:29 +08:00
Wagner Bruna
c607fc3ed4
feat: use Euler sampling by default for SD3 and Flux ( #753 )
...
Thank you for your contribution.
2025-09-14 12:34:41 +08:00
Erik Scholz
49d6570c43
feat: add SmoothStep Scheduler ( #813 )
2025-09-11 23:17:46 +08:00
Markus Hartung
abb115cd02
fix: clarify lora quant support and small fixes ( #792 )
2025-09-08 22:39:25 +08:00
stduhpf
c587a43c99
feat: support incrementing ref image index (omni-kontext) ( #755 )
...
* kontext: support ref images indices
* lora: support x_embedder
* update help message
* Support for negative indices
* support for OmniControl (offsets at index 0)
* c++11 compat
* add --increase-ref-index option
* simplify the logic and fix some issues
* update README.md
* remove unused variable
---------
Co-authored-by: leejet <leejet714@gmail.com>
2025-09-07 22:35:16 +08:00
leejet
675208dcb6
chore: update to c++17
2025-09-07 12:04:17 +08:00
leejet
d7f430cd69
docs: update docs and help message
2025-09-07 02:26:44 +08:00
stduhpf
141a4b4113
feat: add flow shift parameter (for SD3 and Wan) ( #780 )
...
* Add flow shift parameter (for SD3 and Wan)
* unify code style and fix some issues
---------
Co-authored-by: leejet <leejet714@gmail.com>
2025-09-07 02:16:59 +08:00
stduhpf
21ce9fe2cf
feat: add support for timestep boundary based automatic expert routing in Wan MoE ( #779 )
...
* Wan MoE: Automatic expert routing based on timestep boundary
* unify code style and fix some issues
---------
Co-authored-by: leejet <leejet714@gmail.com>
2025-09-07 01:44:10 +08:00
leejet
cb1d975e96
feat: add wan2.1/2.2 support ( #778 )
...
* add wan vae support
* add wan model support
* add umt5 support
* add wan2.1 t2i support
* make flash attn work with wan
* make wan a little faster
* add wan2.1 t2v support
* add wan gguf support
* add offload params to cpu support
* add wan2.1 i2v support
* crop image before resize
* set default fps to 16
* add diff lora support
* fix wan2.1 i2v
* introduce sd_sample_params_t
* add wan2.2 t2v support
* add wan2.2 14B i2v support
* add wan2.2 ti2v support
* add high noise lora support
* sync: update ggml submodule url
* avoid build failure on linux
* avoid build failure
* update ggml
* update ggml
* fix sd_version_is_wan
* update ggml, fix cpu im2col_3d
* fix ggml_nn_attention_ext mask
* add cache support to ggml runner
* fix the issue of illegal memory access
* unify image loading processing
* add wan2.1/2.2 FLF2V support
* fix end_image mask
* update to latest ggml
* add GGUFReader
* update docs
2025-09-06 18:08:03 +08:00
Wagner Bruna
2eb3845df5
fix: typo in the verbose long flag ( #783 )
2025-09-04 00:49:01 +08:00
stduhpf
4c6475f917
feat: show usage on unknown arg ( #767 )
2025-09-01 21:38:34 +08:00