fszontagh
|
064001b524
|
perf: allocate CPU-offloaded params from runtime device pinned host buffer (#1601)
|
2026-06-06 16:22:18 +08:00 |
|
fszontagh
|
a7f2e03da4
|
perf: keep chunk-K residency engaged with runtime LoRA (#1598)
|
2026-06-03 23:12:00 +08:00 |
|
fszontagh
|
ed74577c40
|
feat: --stream-layers for streaming weights from CPU during generation (#1576)
|
2026-06-02 22:35:28 +08:00 |
|
Wagner Bruna
|
02f06370a7
|
refactor: call CPU backend functions dynamically (#1591)
Co-authored-by: leejet <leejet714@gmail.com>
|
2026-06-01 23:41:21 +08:00 |
|
leejet
|
20901f6d8e
|
fix: remove kv padding from flash attention wrapper (#1453)
|
2026-05-31 23:23:19 +08:00 |
|
stduhpf
|
a397e03488
|
feat: add Longcat-Image / Longcat-Image-Edit support (#1053)
Co-authored-by: leejet <leejet714@gmail.com>
|
2026-05-24 02:02:02 +08:00 |
|
stduhpf
|
adaa599a3b
|
Feat: Temporal tile custom size with overlap (#1510)
* Temporal tile size + overlap
* add --extra-tiling-args support
Co-authored-by: leejet <leejet714@gmail.com>
|
2026-05-21 23:44:12 +08:00 |
|
stduhpf
|
47d8198b69
|
feat: add taeltx2_3_wide support (#1535)
|
2026-05-21 22:34:12 +08:00 |
|
leejet
|
67dda3f897
|
feat: add ltx2.3 support (#1463)
* add GemmaTokenizer
* add basic ltx2.3 support
* change vocab file encoding
* fix ci
* fix ubuntu build
* add temporal tiling support
* add ltx audio support
* update ggml submodule url
* fix generate_video
* add i2v support
* minify bundled Gemma tokenizer vocab sources
* pass video fps into temporal rope embeddings
* fix av_ca_timestep_scale_multiplier
* add LTX2Scheduler support
* update docs
* fix ci
|
2026-05-17 16:46:20 +08:00 |
|
leejet
|
36330724bd
|
feat: add module backend assignment support (#1500)
Co-authored-by: Stéphane du Hamel <stephduh@live.fr>
|
2026-05-16 20:27:06 +08:00 |
|
Wagner Bruna
|
686856edca
|
chore: do not report the fake VAE "allocation" as an error (#1494)
|
2026-05-16 16:08:31 +08:00 |
|
leejet
|
0665a7f8bf
|
feat: add hidream o1 image support (#1485)
|
2026-05-15 00:40:21 +08:00 |
|
Wagner Bruna
|
57ff2eb0f4
|
feat: support for memory-mapping model weights (#1414)
Co-authored-by: Piotr Wilkin <piotr.wilkin@syndatis.com>
Co-authored-by: Junmo Kim <me@junmo.kim>
Co-authored-by: leejet <leejet714@gmail.com>
|
2026-05-15 00:30:03 +08:00 |
|
leejet
|
90e87bc846
|
feat: add max-vram based segmented param offload (#1476)
|
2026-05-06 21:56:02 +08:00 |
|
Wagner Bruna
|
b8079e253d
|
feat: transition from compile-time to runtime backend discovery (#1448)
Co-authored-by: Stéphane du Hamel <stephduh@live.fr>
Co-authored-by: Cyberhan123 <255542417@qq.com>
Co-authored-by: leejet <leejet714@gmail.com>
|
2026-04-29 23:26:57 +08:00 |
|
akleine
|
970c4a3312
|
chore: replace some NULL with nullptr + use "%zu" for printing some size_t data (#1457)
|
2026-04-27 22:42:57 +08:00 |
|
leejet
|
f16a110f87
|
refactor: migrate generation pipeline to sd::Tensor (#1373)
|
2026-03-30 00:19:25 +08:00 |
|
leejet
|
84cbd88df1
|
style: remove redundant struct qualifiers for consistent C/C++ type usage (#1349)
|
2026-03-16 22:17:22 +08:00 |
|
leejet
|
acc3bf1fdc
|
refactor: optimize the VAE architecture (#1345)
|
2026-03-15 16:57:42 +08:00 |
|
stduhpf
|
3d33caaef8
|
fix: make tiling work better when using circular (#1299)
|
2026-03-08 00:25:07 +08:00 |
|
leejet
|
ba35dd734e
|
refactor: introduce ggml_ext_zeros_like/ggml_ext_ones_like (#1312)
|
2026-03-04 00:36:52 +08:00 |
|
leejet
|
28ef93c0e1
|
refactor: reorganize the file structure (#1266)
|
2026-02-10 23:13:35 +08:00 |
|