152 Commits

Author SHA1 Message Date
leejet
f440ad9c29
fix: avoid writable mmap for read-only weights (#1698) 2026-06-23 00:39:31 +08:00
stduhpf
41f7acbfb0
feat: support guidance_schedule (#1684) 2026-06-23 00:05:55 +08:00
leejet
b395a6972d
refactor: add Flux VAE version helper (#1696) 2026-06-22 22:39:42 +08:00
fszontagh
787d229d84
perf: --eager-load to pre-load params at model-load time (#1687) 2026-06-22 22:10:09 +08:00
leejet
b12098f5d0
feat: add boogu image support (#1688) 2026-06-22 00:36:17 +08:00
Daniele
e9e952462f
fix: workaround for Ernie with Vulkan and Flash Attention (#1680) 2026-06-22 00:21:38 +08:00
Wagner Bruna
e8e012eef2
fix: workaround for Anima with Vulkan and Flash Attention (#1678) 2026-06-22 00:20:00 +08:00
leejet
7f0e728b7d
fix: normalize CLIP prompts before special-token splitting (#1670) 2026-06-17 00:33:00 +08:00
Wagner Bruna
710bc91c8f
fix: correct conversion from sd_type_t to ggml_type (#1519) 2026-06-16 23:54:42 +08:00
Wagner Bruna
5a34bc7f6e
feat: support for cancelling generations (#1124)
* feat: support for canceling the ongoing generation

* return partial image batches on cancel

---------

Co-authored-by: leejet <leejet714@gmail.com>
2026-06-16 00:36:38 +08:00
leejet
146b6cc49e
fix: simplify PuLID ID extraction setup (#1664) 2026-06-15 23:55:38 +08:00
RapidMark
93527fda74
feat: add PuLID-Flux identity-injection support (#1595) 2026-06-15 23:33:50 +08:00
leejet
6e66a1a4a4
fix: allow oversized Vulkan parameter tensors (#1662) 2026-06-15 23:18:52 +08:00
leejet
bb90bfa00f
feat: support backend-specific max-vram budgets 2026-06-14 22:46:32 +08:00
stduhpf
c2df4e1228
feat: add RPC support (#1629) 2026-06-14 17:30:23 +08:00
leejet
9838264c49
refactor: simplify ControlNet output caching (#1655) 2026-06-14 16:58:37 +08:00
leejet
5db680c2c7
refactor: route cpu placement through backend specs (#1654) 2026-06-14 15:52:24 +08:00
leejet
749186c0eb
refactor: remove vae_decode_only context flag (#1653) 2026-06-14 15:23:29 +08:00
leejet
bdb431ad95
feat: support disk params backend (#1651) 2026-06-14 14:48:50 +08:00
leejet
276025e054
fix: mark LoKR w2_a tensor as applied (#1650) 2026-06-14 02:11:02 +08:00
leejet
8d4c7af95b
refactor: route all runner params through model manager (#1649) 2026-06-14 02:05:23 +08:00
leejet
9b0fceb41b
refactor: manage upscaler params through model manager (#1645) 2026-06-13 15:39:57 +08:00
leejet
563137a592
refactor: centralize runner weight staging and cleanup (#1644) 2026-06-13 13:19:13 +08:00
Wyatt Caldwell
3a54597776
fix: SD3 conditioning crash when clip_l text encoder is missing (#1638) 2026-06-13 13:16:59 +08:00
Cyberhan123
1fb6b22850
feat: add free_sd_images function to manage memory for C API (#1633) 2026-06-13 13:08:14 +08:00
stduhpf
c20769b2c8
feat: add circular RoPE support for ideogram4 (#1627) 2026-06-13 13:06:34 +08:00
RapidMark
1b702a51e7
fix: correct mask shape for masked flash attention (#1625) 2026-06-13 13:01:20 +08:00
RapidMark
19bdfe22d2
feat: set tensor names on block params (#1622) 2026-06-08 23:25:52 +08:00
stduhpf
138da14cc3
apg: normalize diff_norm calculation by tensor size (#1620) 2026-06-08 21:56:15 +08:00
fszontagh
17a2b4a315
perf: cap planner budget when model dwarfs the streaming budget (#1612) 2026-06-08 21:53:54 +08:00
leejet
b3d56d0ba1
refactor: split model loader from model definitions (#1619) 2026-06-07 23:20:12 +08:00
leejet
2a07540c2a
refactor: move photomaker into generation extension (#1618) 2026-06-07 22:40:02 +08:00
Wagner Bruna
81abfb2548
chore: rename and reformat gits_noise.inl (#1617) 2026-06-07 22:30:20 +08:00
leejet
f3fd359b58
refactor: reorganize src model layout (#1615) 2026-06-07 03:21:12 +08:00
leejet
dfb2390dd4
refactor: extract Wan VAE implementation (#1614) 2026-06-07 01:33:49 +08:00
leejet
cfbc19d186
refactor: unify model config detection (#1613) 2026-06-07 01:05:12 +08:00
leejet
b9254dda0d
feat: add ideogram4 support (#1609) 2026-06-06 16:34:16 +08:00
fszontagh
0648f4426b
perf: ratchet streaming budget so plan stops re-merging every step (#1611) 2026-06-06 16:32:03 +08:00
fszontagh
064001b524
perf: allocate CPU-offloaded params from runtime device pinned host buffer (#1601) 2026-06-06 16:22:18 +08:00
leejet
1f9ee88e09
fix: zero Wan2.2 TI2V timesteps for fixed frames (#1604) 2026-06-03 23:32:31 +08:00
fszontagh
a7f2e03da4
perf: keep chunk-K residency engaged with runtime LoRA (#1598) 2026-06-03 23:12:00 +08:00
stduhpf
4513e3fda9
refactor: img-cond->img_uncond (#1594)
* refactor: img-cond->img_uncond

* align APG and CFG++ with img-uncond CFG

* set default img_cfg to 1.f

---------

Co-authored-by: leejet <leejet714@gmail.com>
2026-06-03 22:57:42 +08:00
leejet
2d40a8b2ad
feat: make Wan2.2 5B FLF2V work (#1110) 2026-06-02 23:16:09 +08:00
fszontagh
ed74577c40
feat: --stream-layers for streaming weights from CPU during generation (#1576) 2026-06-02 22:35:28 +08:00
Wagner Bruna
02f06370a7
refactor: call CPU backend functions dynamically (#1591)
Co-authored-by: leejet <leejet714@gmail.com>
2026-06-01 23:41:21 +08:00
stduhpf
f8935d6f25
feat: support img-cfg for edit models (#929)
Co-authored-by: leejet <leejet714@gmail.com>
2026-06-01 22:54:25 +08:00
stduhpf
be65ac7511
feat: add support for APG (adaptive projected guidance) + unconditional SLG (#593) 2026-06-01 00:55:49 +08:00
leejet
20901f6d8e
fix: remove kv padding from flash attention wrapper (#1453) 2026-05-31 23:23:19 +08:00
leejet
0982807139
feat: add PiD support (#1585) 2026-05-31 22:38:39 +08:00
leejet
d2797b8667
fix: correct Gemma3 rope settings and vram limit propagation (#1583) 2026-05-30 22:23:49 +08:00