* introduce GGMLRunnerContext
* add Flash Attention enable control through GGMLRunnerContext
* add conv2d_direct enable control through GGMLRunnerContext (see the sketch after this commit message)
* add wan vae support
* add wan model support
* add umt5 support
* add wan2.1 t2i support
* make flash attn work with wan
* make wan a little faster
* add wan2.1 t2v support
* add wan gguf support
* add support for offloading params to CPU
* add wan2.1 i2v support
* crop image before resize
* set default fps to 16
* add diff lora support
* fix wan2.1 i2v
* introduce sd_sample_params_t
* add wan2.2 t2v support
* add wan2.2 14B i2v support
* add wan2.2 ti2v support
* add high noise lora support
* sync: update ggml submodule url
* avoid build failure on linux
* avoid build failure
* update ggml
* update ggml
* fix sd_version_is_wan
* update ggml, fix cpu im2col_3d
* fix ggml_nn_attention_ext mask
* add cache support to ggml runner
* fix an illegal memory access issue
* unify image loading processing
* add wan2.1/2.2 FLF2V support
* fix end_image mask
* update to latest ggml
* add GGUFReader
* update docs
* mmdit-x
* add support for sd3.5 medium
* add skip layer guidance support (mmdit only)
* ignore slg if slg_scale is zero (optimization)
* init out_skip once
* slg support for flux (experimental)
* warn if version doesn't support slg
* refactor slg cli args
* set default slg_scale to 0 (oops)
* format code
---------
Co-authored-by: leejet <leejet714@gmail.com>
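
A minimal sketch of the enable-control idea referenced above (field and function names here are assumptions for illustration, not the actual stable-diffusion.cpp API): optional kernels such as flash attention and direct conv2d are toggled through a context object that model blocks consult while building their graphs, rather than through globals.

```cpp
#include <cstdio>

// Hypothetical runner context carrying per-run feature toggles.
struct GGMLRunnerContext {
    bool flash_attn_enabled    = false;  // assumed flag: use fused flash attention when available
    bool conv2d_direct_enabled = false;  // assumed flag: use direct conv2d instead of im2col + matmul
};

// A model block consults the context when it builds its compute graph.
void build_attention(const GGMLRunnerContext& ctx) {
    if (ctx.flash_attn_enabled) {
        printf("building graph with flash attention\n");
    } else {
        printf("building graph with the plain attention path\n");
    }
}

int main() {
    GGMLRunnerContext ctx;
    ctx.flash_attn_enabled = true;  // e.g. set from a CLI flag
    build_attention(ctx);
    return 0;
}
```

One plausible motivation for routing the flags through a context like this is that each runner (VAE, text encoder, diffusion model) can make its own choice instead of sharing a single global setting.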
* add flux support
* avoid build failures in non-CUDA environments
* fix schnell support
* add k quants support
* add support for applying lora to quantized tensors
* add inplace conversion support for f8_e4m3 (#359)
done the same way as for bf16: just as bf16 converts losslessly to fp32,
f8_e4m3 converts losslessly to fp16 (see the sketch at the end of this message)
* add xlabs flux comfy converted lora support
* update docs
---------
Co-authored-by: Erik Scholz <Green-Sky@users.noreply.github.com>
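
A hedged sketch of why that f8_e4m3 to fp16 step is lossless (illustration only, not the code from #359; the helper name is made up): e4m3 packs a sign bit, a 4-bit exponent (bias 7) and a 3-bit mantissa, while fp16 has a 5-bit exponent (bias 15) and a 10-bit mantissa, so every finite e4m3 value, including its subnormals, has an exact fp16 bit pattern, much like bf16 expands to fp32 by zero-padding.

```cpp
#include <cstdint>
#include <cstdio>

// Illustrative only: expand one f8_e4m3 (e4m3fn) byte to IEEE fp16 bits.
// e4m3fn has no infinities; exponent 0b1111 with mantissa 0b111 is NaN.
static uint16_t f8_e4m3_to_fp16_bits(uint8_t x) {
    uint16_t sign = (uint16_t)(x & 0x80) << 8;  // sign bit moves from bit 7 to bit 15
    int exp  = (x >> 3) & 0x0F;                 // 4-bit exponent, bias 7
    int mant = x & 0x07;                        // 3-bit mantissa

    if (exp == 0x0F && mant == 0x07) {
        return sign | 0x7E00;                   // NaN stays NaN
    }
    if (exp == 0) {
        if (mant == 0) {
            return sign;                        // signed zero
        }
        // f8 subnormal (mant * 2^-9): always a normal fp16 number, so renormalize
        int k = (mant >= 4) ? 2 : (mant >= 2) ? 1 : 0;           // position of the leading 1
        uint16_t fexp  = (uint16_t)(k + 6);                      // (k - 9) + fp16 bias 15
        uint16_t fmant = (uint16_t)((mant << (10 - k)) & 0x3FF); // bits after the leading 1
        return sign | (uint16_t)(fexp << 10) | fmant;
    }
    // normal value: (1 + mant/8) * 2^(exp - 7); re-bias the exponent, widen the mantissa
    return sign | (uint16_t)((exp + 8) << 10) | (uint16_t)(mant << 7);
}

int main() {
    // 0x7E = 0 1111 110 -> (1 + 6/8) * 2^8 = 448, the e4m3fn maximum
    printf("f8 0x7E -> fp16 bits 0x%04X\n", f8_e4m3_to_fp16_bits(0x7E));
    return 0;
}
```

The "inplace" part presumably refers to expanding the loaded bytes within the destination buffer during weight loading; since each fp16 value is twice as wide as the source byte, such a conversion would walk the buffer from the end so values are not overwritten before they are read.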