402 Commits

Author SHA1 Message Date
leejet
96c3e64057
refactor: optimize the handling of embedding (#1068)
* optimize the handling of embedding

* support case-insensitive embedding names
master-402-96c3e64
2025-12-08 23:59:04 +08:00
Weiqi Gao
0392273e10
chore: add compute kernels to Windows CUDA build (#1062)
* Fix syntax for CUDA architecture definitions

* Extend CUDA support to GTX 10 Series to RTX 50 Series

* update cuda installer step version to install cuda 12.8.1

* Remove unsupported compute capability
master-401-0392273
2025-12-07 22:12:50 +08:00
leejet
bf1a388b44 docs: update logo 2025-12-07 15:09:32 +08:00
leejet
c9005337a8 docs: update logo 2025-12-07 14:56:21 +08:00
leejet
2f0bd31a84
feat: add ovis image support (#1057) master-398-2f0bd31 2025-12-07 12:32:56 +08:00
leejet
bfbb929790
feat: do not convert bf16 to f32 (#1055) master-397-bfbb929 2025-12-06 23:55:51 +08:00
leejet
689e44c9a8
fix: correct ggml_ext_silu_act (#1056) master-396-689e44c 2025-12-06 23:55:28 +08:00
leejet
985aedda32
refactor: optimize the handling of pred type (#1048) master-395-985aedd 2025-12-04 23:31:55 +08:00
leejet
3f3610b5cd
chore: optimize lora log (#1047) master-394-3f3610b 2025-12-04 22:44:58 +08:00
Wagner Bruna
118683de8a
fix: correct preview method selection (#1038) master-393-118683d 2025-12-04 22:43:16 +08:00
stduhpf
bcc9c0d0b3
feat: handle ggml compute failures without crashing the program (#1003)
* Feat: handle compute failures more gracefully

* fix Unreachable code after return

Co-authored-by: idostyle <idostyl3@googlemail.com>

* adjust z_image.hpp

---------

Co-authored-by: idostyle <idostyl3@googlemail.com>
Co-authored-by: leejet <leejet714@gmail.com>
master-392-bcc9c0d
2025-12-04 22:04:27 +08:00
leejet
5865b5e703
refactor: split SDParams to SDCliParams/SDContextParams/SDGenerationParams (#1032) master-391-5865b5e 2025-12-03 22:31:46 +08:00
stduhpf
edf2cb3846
fix: fix CosXL not being detected (#989) master-390-edf2cb3 2025-12-03 22:25:02 +08:00
Wagner Bruna
99e17232a4
fix: prevent NaN issues with Z-Image on certain ROCm setups (#1034) 2025-12-03 22:19:34 +08:00
leejet
710169df5c docs: update news 2025-12-01 22:46:15 +08:00
Wagner Bruna
e4c50f1de5
chore: add sd_ prefix to a few functions (#967) master-387-e4c50f1 2025-12-01 22:43:52 +08:00
rmatif
0743a1b3b5
fix: fix vae tiling for flux2 (#1025) master-386-0743a1b 2025-12-01 22:41:56 +08:00
leejet
34a6fd4e60
feat: add z-image support (#1020)
* add z-image support

* use flux_latent_rgb_proj for z-image

* fix qwen3 rope type

* add support for qwen3 4b gguf

* add support for diffusers format lora

* fix nan issue that occurs when using CUDA with k-quants weights

* add z-image docs
master-385-34a6fd4
2025-12-01 22:39:43 +08:00
leejet
3c1187ce83 docs: correct the time of adding flux2 support 2025-11-30 12:40:56 +08:00
leejet
20eb674100
fix: avoid crash when the lora file is not found using immediately mode (#1022) master-383-20eb674 2025-11-30 12:19:37 +08:00
leejet
bc80225336
fix: make the immediate LoRA apply mode work better when using Vulkan (#1021) master-382-bc80225 2025-11-30 12:08:25 +08:00
leejet
ab7e8d285e docs: update news 2025-11-30 11:51:23 +08:00
Wagner Bruna
673dbdda17
fix: add missing line cleanup for s/it progress display (#891) master-380-673dbdd 2025-11-30 11:45:30 +08:00
Wagner Bruna
0249509a30
refactor: add user data pointer to the image preview callback (#1001) master-379-0249509 2025-11-30 11:34:17 +08:00
leejet
52b67c538b
feat: add flux2 support (#1016)
* add flux2 support

* rename qwenvl to llm

* add Flux2FlowDenoiser

* update docs
master-378-52b67c5
2025-11-30 11:32:56 +08:00
leejet
20345888a3
refactor: optimize the handling of sample method (#999) master-377-2034588 2025-11-22 14:00:25 +08:00
akleine
490c51d963
feat: report success/failure when saving PNG/JPG output (#912) master-376-490c51d 2025-11-22 13:57:44 +08:00
Wagner Bruna
45c46779af
feat: add LCM scheduler (#983) master-375-45c4677 2025-11-22 13:53:31 +08:00
leejet
869d023416
refactor: optimize the handling of scheduler (#998) 2025-11-22 12:48:53 +08:00
akleine
e9bc3b6c06
fix: check the PhotoMaker id_embeds tensor ONLY in PhotoMaker V2 mode (#987) master-373-e9bc3b6 2025-11-22 12:47:40 +08:00
Wagner Bruna
b542894fb9
fix: avoid crash on default video preview path (#997)
Co-authored-by: masamaru-san
master-372-b542894
2025-11-22 12:46:27 +08:00
leejet
5498cc0d67
feat: add Wan2.1-I2V-1.3B(SkyReels) support (#988) master-371-5498cc0 2025-11-19 23:56:46 +08:00
stduhpf
aa2b8e0ca5
fix: patch 1x1 conv weights at runtime (#986) master-370-aa2b8e0 2025-11-19 23:27:23 +08:00
rmatif
a14e2b321d
feat: add easycache support (#940) master-369-a14e2b3 2025-11-19 23:19:32 +08:00
leejet
28ffb6c13d
fix: resolve issue with concat multiple LoRA output diffs at runtime (#985) master-368-28ffb6c 2025-11-17 22:56:07 +08:00
leejet
b88cc32346
fix: avoid using same type but diff instances for rng and sampler_rng (#982) master-367-b88cc32 2025-11-16 23:37:14 +08:00
leejet
f532972d60
fix: avoid precision issues on vulkan backend (#980) master-366-f532972 2025-11-16 20:57:08 +08:00
leejet
d5b05f70c6
feat: support independent sampler rng (#978) master-365-d5b05f7 2025-11-16 17:11:02 +08:00
akleine
6d6dc1b8ed
fix: make PhotoMakerV2 more robust by image count check (#970) master-364-6d6dc1b 2025-11-16 17:10:48 +08:00
Wagner Bruna
199e675cc7
feat: support for --tensor-type-rules on generation modes (#932) master-363-199e675 2025-11-16 17:07:32 +08:00
leejet
742a7333c3
feat: add cpu rng (#977) master-362-742a733 2025-11-16 14:48:15 +08:00
Wagner Bruna
e8eb3791c8
fix: typo in --lora-apply-mode help (#972) master-361-e8eb379 2025-11-16 14:48:00 +08:00
Wagner Bruna
aa44e06890
fix: avoid crash with LoRAs and type override (#974) master-360-aa44e06 2025-11-16 14:47:36 +08:00
Daniele
6448430dbb
feat: add break pseudo token support (#422)
---------

Co-authored-by: Urs Ganse <urs.ganse@helsinki.fi>
master-359-6448430
2025-11-16 14:45:20 +08:00
leejet
347710f68f
feat: support applying LoRA at runtime (#969) master-358-347710f 2025-11-13 21:48:44 +08:00
lcy
59ebdf0bb5
chrore: enable Windows ROCm(HIP) build release (#956)
* build: fix missing commit sha in macOS and Ubuntu build zip name

The build workflows for macOS and Ubuntu incorrectly check for the
"main" branch instead of "master" when retrieving the commit hash for
naming the build artifacts.

* build: correct Vulkan SDK installation condition in build workflow

* build: Enable Windows ROCm(HIP) build release

Refer to the build workflow of llama.cpp to add a Windows ROCm (HIP)
build release to the workflow.
Since there are many differences between the HIP build and other
builds, this commit add a separate "windows-latest-cmake-hip" job,
instead of enabling the ROCm matrix entry in the existing Windows
build job.

Main differences include:

- Install ROCm SDK from AMD official installer.
- Add a cache step for ROCm installation and a ccache step for build
  processing, since the HIP build takes much longer time than other
  builds.
- Include the ROCm/HIP artifact in the release assets.
master-357-59ebdf0
2025-11-12 00:28:55 +08:00
Flavio Bizzarri
4ffcbcaed7
fix: specify enum modifier in sd_set_preview_callback signature (#959) master-356-4ffcbca 2025-11-12 00:27:23 +08:00
leejet
694f0d9235
refactor: optimize the logic for name conversion and the processing of the LoRA model (#955) master-355-694f0d9 2025-11-10 00:12:20 +08:00
stduhpf
8ecdf053ac
feat: add image preview support (#522) master-354-8ecdf05 2025-11-10 00:12:02 +08:00
leejet
ee89afc878
fix: resolve issue with pmid (#957) master-353-ee89afc 2025-11-09 22:47:53 +08:00