449 Commits

Author SHA1 Message Date
Wagner Bruna
cc107714d7
fix: consistently pass 2nd-order samplers half steps as negatives (#1095) master-449-cc10771 2025-12-27 15:54:18 +08:00
leejet
37c9860b79
fix: handle redirected UTF-8 output correctly on Windows (#1147) master-448-37c9860 2025-12-27 15:43:19 +08:00
leejet
ccb6b0ac9d
feat: add __index_timestep_zero__ support (#1146) master-447-ccb6b0a 2025-12-26 22:07:40 +08:00
Weiqi Gao
df4efe26bd
feat: add png sequence output for vid_gen (#1117) master-446-df4efe2 2025-12-26 22:06:13 +08:00
leejet
860a78e248
fix: avoid crash when using taesd for preview only (#1141) master-445-860a78e 2025-12-24 23:30:12 +08:00
leejet
a0adcfb148
feat: add support for qwen image edit 2511 (#1096) master-444-a0adcfb 2025-12-24 23:00:08 +08:00
leejet
3d5fdd7b37
feat: add support for more underline loras (#1135) master-443-3d5fdd7 2025-12-24 22:59:23 +08:00
Weiqi Gao
3e6c428c27
chore: use Ninja on Windows to speed up build process (#1120) master-442-3e6c428 2025-12-24 22:53:17 +08:00
张春乔
96fcb13fc0
feat: add --serve-html-path option to example server (#1123) 2025-12-24 22:43:09 +08:00
leejet
3e812460cf
fix: correct ggml_pad_ext (#1133) master-440-3e81246 2025-12-23 21:37:07 +08:00
leejet
98916e8256
docs: update README.md 2025-12-22 23:58:28 +08:00
rmatif
298b11069f
feat: add more caching methods (#1066) master-438-298b110 2025-12-22 23:52:11 +08:00
leejet
30a91138f8
fix: add the missing } master-437-30a9113 2025-12-21 21:53:38 +08:00
leejet
c6937ba44a
fix: correct the parsing of --convert-name option 2025-12-21 21:47:50 +08:00
leejet
ca5b1969a8
feat: do not convert tensor names by default in convert mode (#1122) master-435-ca5b196 2025-12-21 18:40:10 +08:00
Phylliida Dev
50ff966445
feat: add seamless texture generation support (#914)
* global bool

* reworked circular to global flag

* cleaner implementation of tiling support in sd cpp

* cleaned rope

* working simplified but still need wraps

* Further clean of rope

* resolve flux conflict

* switch to pad op circular only

* Set ggml to most recent

* Revert ggml temp

* Update ggml to most recent

* Revert unneeded flux change

* move circular flag to the GGMLRunnerContext

* Pass through circular param in all places where conv is called

* fix of constant and minor cleanup

* Added back --circular option

* Conv2d circular in vae and various models

* Fix temporal padding for qwen image and other vaes

* Z Image circular tiling

* x and y axis seamless only

* First attempt at chroma seamless x and y

* refactor into pure x and y, almost there

* Fix crash on chroma

* Refactor into cleaner variable choices

* Removed redundant set_circular_enabled

* Sync ggml

* simplify circular parameter

* format code

* no need to perform circular pad on the clip

* simplify circular_axes setting

* unify function naming

* remove unnecessary member variables

* simplify rope

---------

Co-authored-by: Phylliida <phylliidadev@gmail.com>
Co-authored-by: leejet <leejet714@gmail.com>
master-434-50ff966
2025-12-21 18:06:47 +08:00
leejet
88ec9d30b1
feat: add scale_rope support (#1121) master-433-88ec9d3 2025-12-21 15:40:21 +08:00
stduhpf
60abda56e0
feat: select vulkan device with env variable (#629) master-432-60abda5 2025-12-21 15:35:38 +08:00
stduhpf
23fce0bd84
feat: add support for Chroma Radiance x0 (#1091)
* Add x0 Flux pred (+prepare for others)

* Fix convert models with empty tensors

* patch_32 exp support attempt

* improve support for patch_32

* follow official pipeline

---------

Co-authored-by: leejet <leejet714@gmail.com>
master-431-23fce0b
2025-12-20 00:55:57 +08:00
Wagner Bruna
7c88c4765c
chore: give feedback about cfg values smaller than 1 (#1088) master-430-7c88c47 2025-12-19 23:41:52 +08:00
Weiqi Gao
1f77545cf8
docs: document usage of tae for VRAM reduction using wan (#1108) 2025-12-19 23:31:09 +08:00
leejet
8e9f3a4d9e
feat: add support for underline style lora of flux (#1103)
* feat: add support for underline style lora of flux

* add support for underline style lora of t5

* add more protected tokens
2025-12-18 21:44:16 +08:00
Wagner Bruna
78e15bd4af
feat: default to LCM scheduler for LCM sampling (#1109)
* feat: default to LCM scheduler for LCM sampling

* fix bug and attempt to get default scheduler for vid_gen when none is set

---------

Co-authored-by: leejet <leejet714@gmail.com>
master-427-78e15bd
2025-12-18 21:43:39 +08:00
Daniele
97cf2efe45
feat: add KL Optimal scheduler (#1098) master-426-97cf2ef 2025-12-18 21:02:55 +08:00
leejet
bda7fab9f2
chore: remove unused debug code master-425-bda7fab 2025-12-17 23:43:37 +08:00
leejet
c2e18c86e8
fix: make flash attn work with high noise diffusion model (#1111) master-424-c2e18c8 2025-12-17 23:28:59 +08:00
leejet
c3ad6a13e1
refactor: optimize the printing of version log (#1102) master-423-c3ad6a1 2025-12-16 23:11:27 +08:00
leejet
ebe9d26a72
feat: supports correct UTF-8 printing on windows (#1101) master-422-ebe9d26 2025-12-16 23:00:41 +08:00
stduhpf
9fa7f415df
feat: add taehv support for Wan/Qwen (#937) master-421-9fa7f41 2025-12-16 22:57:34 +08:00
akleine
a23262dfde
fix: added a clean exit in ModelLoader::load_tensors if OOM (#1097) master-420-a23262d 2025-12-16 22:45:10 +08:00
Wagner Bruna
e687913bf1
chore: remove lora_model_dir parameter (#1100) master-419-e687913 2025-12-16 22:37:45 +08:00
Wagner Bruna
200cb6f2ca
fix: avoid crash with VAE tiling and certain image sizes (#1090) master-418-200cb6f 2025-12-15 23:51:40 +08:00
leejet
43a70e819b
fix: add lora info to image metadata (#1086) master-417-43a70e8 2025-12-14 01:24:15 +08:00
Kirill A. Korinsky
614f8736df
sync: update ggml (#1082) master-416-614f873 2025-12-14 01:23:34 +08:00
stduhpf
d96b4152d6
perf: optimize ggml_ext_chunk (#1084) master-415-d96b415 2025-12-14 01:22:41 +08:00
rmatif
8f05f5bc6e
feat: add support for custom scheduler (#694)
---------

Co-authored-by: leejet <leejet714@gmail.com>
master-414-8f05f5b
2025-12-13 16:20:02 +08:00
leejet
15d0f82760
feat(server): do not parse lora from client-side prompts (#1083) master-413-15d0f82 2025-12-13 14:27:47 +08:00
xxnuo
6888fcb581
feat: server add default_gen_params to override default args (#1050) master-412-6888fcb 2025-12-13 14:22:32 +08:00
leejet
2aecdd57ca
feat: simple openai image generation api compatible server (#1037) master-411-2aecdd5 2025-12-13 13:53:21 +08:00
leejet
11ab095230
fix: resolve embedding loading issue when calling generate_image multiple times (#1078) master-410-11ab095 2025-12-12 23:08:12 +08:00
Wagner Bruna
a3a88fc9b2
fix: avoid crash loading LoRAs with bf16 weights (#1077) master-409-a3a88fc 2025-12-12 22:36:54 +08:00
leejet
8823dc48bc
feat: align the spatial size to the corresponding multiple (#1073) master-408-8823dc4 2025-12-10 23:15:08 +08:00
Pedrito
1ac5a616de
feat: support custom upscale tile size (#896) master-407-1ac5a61 2025-12-10 22:25:19 +08:00
leejet
d939f6e86a
refactor: optimize the handling of LoRA models (#1070) master-406-d939f6e 2025-12-10 00:26:07 +08:00
Wagner Bruna
e72aea796e
feat: embed version string and git commit hash (#1008) master-405-e72aea7 2025-12-09 22:38:54 +08:00
wuhei
a908436729
docs: update download link for Stable Diffusion v1.5 (#1063) 2025-12-09 22:06:16 +08:00
stduhpf
583a02e29e
feat: add Flux.2 VAE proj matrix for previews (#1017) master-403-583a02e 2025-12-09 22:00:45 +08:00
leejet
96c3e64057
refactor: optimize the handling of embedding (#1068)
* optimize the handling of embedding

* support case-insensitive embedding names
master-402-96c3e64
2025-12-08 23:59:04 +08:00
Weiqi Gao
0392273e10
chore: add compute kernels to Windows CUDA build (#1062)
* Fix syntax for CUDA architecture definitions

* Extend CUDA support to GTX 10 Series to RTX 50 Series

* update cuda installer step version to install cuda 12.8.1

* Remove unsupported compute capability
master-401-0392273
2025-12-07 22:12:50 +08:00
leejet
bf1a388b44
docs: update logo 2025-12-07 15:09:32 +08:00