85 Commits

Author SHA1 Message Date
leejet
e2a3a406b6 add wan2.1/2.2 FLF2V support 2025-08-31 23:49:47 +08:00
leejet
33ff442c1d unify image loading processing 2025-08-31 18:40:04 +08:00
leejet
50f921119e fix the issue of illegal memory access 2025-08-31 12:41:16 +08:00
leejet
b05b2b29a3 Merge branch 'master' into wan 2025-08-29 23:08:32 +08:00
leejet
2410ce3dee avoid build failure on linux 2025-08-29 22:20:16 +08:00
leejet
079b393b6e add wan2.2 t2v support 2025-08-25 00:10:16 +08:00
leejet
afef8cef9e introduce sd_sample_params_t 2025-08-24 17:20:41 +08:00
leejet
e69195d22f set default fps to 16 2025-08-23 14:41:14 +08:00
leejet
459fd4dbbc crop image before resize 2025-08-23 13:59:05 +08:00
leejet
d83867b8e9 add wan2.1 i2v support 2025-08-23 12:37:15 +08:00
leejet
9b29de27a8 add offload params to cpu support 2025-08-17 03:13:16 +08:00
leejet
3a2840f9fb add wan2.1 t2v support 2025-08-15 00:37:30 +08:00
leejet
1d9ccea41a add wan2.1 t2i support 2025-08-10 17:07:17 +08:00
leejet
bace0a08c4 add umt5 support 2025-08-09 16:07:04 +08:00
leejet
5f7d98884c add wan model support 2025-08-06 00:29:53 +08:00
Daniele
5b8996f74a
Conv2D direct support (#744)
* Conv2DDirect for VAE stage

* Enable only for Vulkan, reduced duplicated code

* Cmake option to use conv2d direct

* conv2d direct always on for opencl

* conv direct as a flag

* fix merge typo

* Align conv2d behavior to flash attention's

* fix readme

* add conv2d direct for controlnet

* add conv2d direct for esrgan

* clean code, use enable_conv2d_direct/get_all_blocks

* format code

---------

Co-authored-by: leejet <leejet714@gmail.com>
2025-08-03 01:25:17 +08:00
leejet
e3f9366857 add wan vae support 2025-08-02 11:00:33 +08:00
Wagner Bruna
7eb30d00e5
feat: add missing models and parameters to image metadata (#743)
* feat: add new scheduler types, clip skip and vae to image embedded params

- If a non default scheduler is set, include it in the 'Sampler' tag in the data
embedded into the final image.
- If a custom VAE path is set, include the vae name (without path and extension)
in embedded image params under a `VAE:` tag.
- If a custom Clip skip is set, include that Clip skip value in embedded image
params under a `Clip skip:` tag.

* feat: add separate diffusion and text models to metadata

---------

Co-authored-by: one-lithe-rune <skapusniak@lithe-runes.com>
2025-07-28 22:00:27 +08:00
stduhpf
59080d3ce1
feat: change image dimensions requirement for DiT models (#742) 2025-07-28 21:58:17 +08:00
leejet
0739361bfe fix: avoid macOS build failure 2025-07-13 20:18:10 +08:00
leejet
ca0bd9396e
refactor: update c api (#728) 2025-07-13 18:48:42 +08:00
stduhpf
a772dca27a
feat: add Instruct-Pix2pix/CosXL-Edit support (#679)
* Instruct-p2p support

* support 2 conditionings cfg

* Do not re-encode the exact same image twice

* fixes for 2-cfg

* Fix pix2pix latent inputs + improve inpainting a bit + fix naming

* prepare for other pix2pix-like models

* Support sdxl ip2p

* fix reference image embeddings

* Support 2-cond cfg properly in cli

* fix typo in help

* Support masks for ip2p models

* unify code style

* delete unused code

* use edit mode

* add img_cond

* format code

---------

Co-authored-by: leejet <leejet714@gmail.com>
2025-07-12 15:36:45 +08:00
Wagner Bruna
6d84a30c66
feat: overriding quant types for specific tensors on model conversion (#724) 2025-07-08 00:11:38 +08:00
leejet
b9e4718fac fix: correct --chroma-enable-t5-mask argument 2025-07-06 11:11:47 +08:00
Wagner Bruna
76c72628b1
fix: fix a few typos on cli help and error messages (#714) 2025-07-04 22:15:41 +08:00
leejet
a28d04dd81 fix: fix the issue in parsing --chroma-disable-dit-mask 2025-06-29 23:52:36 +08:00
leejet
45d0ebb30c style: format code 2025-06-29 23:40:55 +08:00
stduhpf
b1cc40c35c
feat: add Chroma support (#696)
---------

Co-authored-by: Green Sky <Green-Sky@users.noreply.github.com>
Co-authored-by: leejet <leejet714@gmail.com>
2025-06-29 23:36:42 +08:00
stduhpf
c9b5735116
feat: add FLUX.1 Kontext dev support (#707)
* Kontext support
* add edit mode

---------

Co-authored-by: leejet <leejet714@gmail.com>
2025-06-29 10:08:53 +08:00
vmobilis
81556f3136
chore: silence some warnings about precision loss (#620) 2025-03-09 12:22:39 +08:00
leejet
30b3ac8e62 fix: avoid potential dangling pointer problem 2025-03-01 16:58:26 +08:00
yslai
19d876ee30
feat: implement DDIM with the "trailing" timestep spacing and TCD (#568) 2025-02-22 21:34:22 +08:00
lalala
f27f2b2aa2
docs: add missing --mask and --guidance options to print_usage (#572) 2025-02-22 21:32:37 +08:00
vmobilis
d46ed5e184
feat: support JPEG compression (#583) 2025-02-05 16:18:02 +08:00
piallai
b5cc1422da
fix: fix typo for skip layers parameters (#492) 2024-12-28 13:12:08 +08:00
stduhpf
8f4ab9add3
feat: support Inpaint models (#511) 2024-12-28 13:04:49 +08:00
stduhpf
9148b980be
feat: remove type restrictions (#489) 2024-11-30 14:22:15 +08:00
stduhpf
7ce63e740c
feat: flexible model architecture for dit models (Flux & SD3) (#490)
* Refactor: wtype per tensor

* Fix default args

* refactor: fix flux

* Refactor photomaker v2 support

* unet: refactor the refactoring

* Refactor: fix controlnet and tae

* refactor: upscaler

* Refactor: fix runtime type override

* upscaler: use fp16 again

* Refactor: Flexible sd3 arch

* Refactor: Flexible Flux arch

* format code

---------

Co-authored-by: leejet <leejet714@gmail.com>
2024-11-30 14:18:53 +08:00
stduhpf
53b415f787
fix: remove default variables in c headers (#478) 2024-11-24 18:10:25 +08:00
Erik Scholz
1c168d98a5
fix: repair flash attention support (#386)
* repair flash attention in _ext
this does not fix the currently broken fa behind the define, which is only used by VAE

Co-authored-by: FSSRepo <FSSRepo@users.noreply.github.com>

* make flash attention in the diffusion model a runtime flag
no support for sd3 or video

* remove old flash attention option and switch vae over to attn_ext

* update docs

* format code

---------

Co-authored-by: FSSRepo <FSSRepo@users.noreply.github.com>
Co-authored-by: leejet <leejet714@gmail.com>
2024-11-23 12:39:08 +08:00
Plamen Minev
8c7719fe9a
fix: typo in clip-g encoder arg (#472) 2024-11-23 11:46:00 +08:00
stduhpf
65fa646684
feat: add sd3.5 medium and skip layer guidance support (#451)
* mmdit-x

* add support for sd3.5 medium

* add skip layer guidance support (mmdit only)

* ignore slg if slg_scale is zero (optimization)

* init out_skip once

* slg support for flux (experimental)

* warn if version doesn't support slg

* refactor slg cli args

* set default slg_scale to 0 (oops)

* format code

---------

Co-authored-by: leejet <leejet714@gmail.com>
2024-11-23 11:15:31 +08:00
leejet
ac54e00760
feat: add sd3.5 support (#445) 2024-10-24 21:58:03 +08:00
stduhpf
f4c937cb94
fix: add some missing cli args to usage (#363) 2024-08-28 00:17:46 +08:00
Daniele
0362cc4874
fix: fix some typos (#361) 2024-08-28 00:15:37 +08:00
Daniele
dc0882cdc9
feat: add exponential scheduler (#346)
* feat: added exponential scheduler

* updated README

* improved exponential formatting

---------

Co-authored-by: leejet <leejet714@gmail.com>
2024-08-28 00:13:35 +08:00
Daniele
d00c94844d
feat: add ipndm and ipndm_v samplers (#344) 2024-08-28 00:03:41 +08:00
Daniele
2d4a2f7982
feat: add GITS scheduler (#343) 2024-08-28 00:02:17 +08:00
leejet
64d231f384
feat: add flux support (#356)
* add flux support

* avoid build failures in non-CUDA environments

* fix schnell support

* add k quants support

* add support for applying lora to quantized tensors

* add inplace conversion support for f8_e4m3 (#359)

in the same way it is done for bf16:
just as bf16 converts losslessly to fp32,
f8_e4m3 converts losslessly to fp16

* add xlabs flux comfy converted lora support

* update docs

---------

Co-authored-by: Erik Scholz <Green-Sky@users.noreply.github.com>
2024-08-24 14:29:52 +08:00
leejet
73c2176648
feat: add sd3 support (#298) 2024-07-28 15:44:08 +08:00