28 Commits

Author SHA1 Message Date
stduhpf
bcc9c0d0b3
feat: handle ggml compute failures without crashing the program (#1003)
* Feat: handle compute failures more gracefully

* fix Unreachable code after return

Co-authored-by: idostyle <idostyl3@googlemail.com>

* adjust z_image.hpp

---------

Co-authored-by: idostyle <idostyl3@googlemail.com>
Co-authored-by: leejet <leejet714@gmail.com>
2025-12-04 22:04:27 +08:00
leejet
52b67c538b
feat: add flux2 support (#1016)
* add flux2 support

* rename qwenvl to llm

* add Flux2FlowDenoiser

* update docs
2025-11-30 11:32:56 +08:00
leejet
20345888a3
refactor: optimize the handling of sample method (#999) 2025-11-22 14:00:25 +08:00
Wagner Bruna
45c46779af
feat: add LCM scheduler (#983) 2025-11-22 13:53:31 +08:00
leejet
869d023416
refactor: optimize the handling of scheduler (#998) 2025-11-22 12:48:53 +08:00
leejet
dd75fc081c
refactor: unify the naming style of ggml extension functions (#921) 2025-10-28 23:26:48 +08:00
leejet
d05e46ca5e
chore: add .clang-tidy configuration and apply modernize checks (#902) 2025-10-18 23:23:40 +08:00
vmobilis
97ad3e7ff9
refactor: simplify DPM++ (2S) Ancestral (#667) 2025-09-16 23:05:25 +08:00
rmatif
8376dfba2a
feat: add sgm_uniform scheduler, simple scheduler, and support for NitroFusion (#675)
* feat: Add timestep shift and two new schedulers

* update readme

* fix spaces

* format code

* simplify SGMUniformSchedule

* simplify shifted_timestep logic

* avoid conflict

---------

Co-authored-by: leejet <leejet714@gmail.com>
2025-09-16 22:42:09 +08:00
Erik Scholz
49d6570c43
feat: add SmoothStep Scheduler (#813) 2025-09-11 23:17:46 +08:00
stduhpf
141a4b4113
feat: add flow shift parameter (for SD3 and Wan) (#780)
* Add flow shift parameter (for SD3 and Wan)

* unify code style and fix some issues

---------

Co-authored-by: leejet <leejet714@gmail.com>
2025-09-07 02:16:59 +08:00
leejet
cb1d975e96
feat: add wan2.1/2.2 support (#778)
* add wan vae suppport

* add wan model support

* add umt5 support

* add wan2.1 t2i support

* make flash attn work with wan

* make wan a little faster

* add wan2.1 t2v support

* add wan gguf support

* add offload params to cpu support

* add wan2.1 i2v support

* crop image before resize

* set default fps to 16

* add diff lora support

* fix wan2.1 i2v

* introduce sd_sample_params_t

* add wan2.2 t2v support

* add wan2.2 14B i2v support

* add wan2.2 ti2v support

* add high noise lora support

* sync: update ggml submodule url

* avoid build failure on linux

* avoid build failure

* update ggml

* update ggml

* fix sd_version_is_wan

* update ggml, fix cpu im2col_3d

* fix ggml_nn_attention_ext mask

* add cache support to ggml runner

* fix the issue of illegal memory access

* unify image loading processing

* add wan2.1/2.2 FLF2V support

* fix end_image mask

* update to latest ggml

* add GGUFReader

* update docs
2025-09-06 18:08:03 +08:00
Wagner Bruna
76c72628b1
fix: fix a few typos on cli help and error messages (#714) 2025-07-04 22:15:41 +08:00
leejet
7dac89ad75 refector: reuse some code 2025-07-01 23:33:50 +08:00
stduhpf
9251756086
feat: add CosXL support (#683) 2025-07-01 23:13:04 +08:00
leejet
45d0ebb30c style: format code 2025-06-29 23:40:55 +08:00
yslai
19d876ee30
feat: implement DDIM with the "trailing" timestep spacing and TCD (#568) 2025-02-22 21:34:22 +08:00
leejet
ac54e00760
feat: add sd3.5 support (#445) 2024-10-24 21:58:03 +08:00
Daniele
dc0882cdc9
feat: add exponential scheduler (#346)
* feat: added exponential scheduler

* updated README

* improved exponential formatting

---------

Co-authored-by: leejet <leejet714@gmail.com>
2024-08-28 00:13:35 +08:00
Daniele
d00c94844d
feat: add ipndm and ipndm_v samplers (#344) 2024-08-28 00:03:41 +08:00
Daniele
2d4a2f7982
feat: add GITS scheduler (#343) 2024-08-28 00:02:17 +08:00
leejet
c837c5d9cc style: format code 2024-08-25 00:19:37 +08:00
leejet
64d231f384
feat: add flux support (#356)
* add flux support

* avoid build failures in non-CUDA environments

* fix schnell support

* add k quants support

* add support for applying lora to quantized tensors

* add inplace conversion support for f8_e4m3 (#359)

in the same way it is done for bf16
like how bf16 converts losslessly to fp32,
f8_e4m3 converts losslessly to fp16

* add xlabs flux comfy converted lora support

* update docs

---------

Co-authored-by: Erik Scholz <Green-Sky@users.noreply.github.com>
2024-08-24 14:29:52 +08:00
leejet
73c2176648
feat: add sd3 support (#298) 2024-07-28 15:44:08 +08:00
leejet
f9f0d4685b fix: sample_k_diffusion should be static 2024-06-10 23:04:02 +08:00
leejet
08f5b41956 refector: make the sampling module more independent 2024-06-10 22:42:15 +08:00
Grauho
ce1bcc74a6
feat: add AYS(Align Your Steps) scheduler (#241)
Added NVIDEA's new "Align Your Steps" style scheduler in accordance with their
quick start guide. Currently has handling for SD1.5, SDXL, and SVD, using the
noise levels from their paper to generate the sigma values. Can be selected
using the --schedule ays command line switch. Updates the main.cpp help
message and README to reflect this option, also they now inform the user
of the --color switch as well.

---------

Co-authored-by: leejet <leejet714@gmail.com>
2024-04-29 23:21:32 +08:00
leejet
2e79a82f85
refactor: reorganize code and use c api (#133) 2024-01-01 16:22:18 +08:00