160 Commits

Author SHA1 Message Date
fszontagh
07585448ad
docs: update readme (#462) 2024-11-23 11:42:12 +08:00
stduhpf
6ea812256e
feat: add flux 1 lite 8B (freepik) support (#474)
* Flux Lite (Freepik) support

* format code

---------

Co-authored-by: leejet <leejet714@gmail.com>
master-6ea8122
2024-11-23 11:41:30 +08:00
stduhpf
9b1d90bc23
fix: improve clip text_projection support (#397) master-9b1d90b 2024-11-23 11:19:27 +08:00
stduhpf
65fa646684
feat: add sd3.5 medium and skip layer guidance support (#451)
* mmdit-x

* add support for sd3.5 medium

* add skip layer guidance support (mmdit only)

* ignore slg if slg_scale is zero (optimization)

* init out_skip once

* slg support for flux (experimental)

* warn if version doesn't support slg

* refactor slg cli args

* set default slg_scale to 0 (oops)

* format code

---------

Co-authored-by: leejet <leejet714@gmail.com>
master-65fa646
2024-11-23 11:15:31 +08:00
leejet
ac54e00760
feat: add sd3.5 support (#445) master-ac54e00 2024-10-24 21:58:03 +08:00
stduhpf
14206fd488
fix: fix clip tokenizer (#383) master-14206fd 2024-09-02 22:31:46 +08:00
zhentaoyu
e410aeb534
sync: update ggml to fix large image generation with SYCL backend (#380)
* turn off fast-math on host in SYCL backend

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

* update ggml for sync some sycl ops

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

* update sycl readme and ggml

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

---------

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>
master-e410aeb
2024-09-02 22:29:35 +08:00
leejet
58d54738e2 docs: add star history 2024-08-28 00:27:54 +08:00
leejet
4f87b232c2 docs: add Vulkan build command 2024-08-28 00:25:31 +08:00
Erik Scholz
e71ddcedad
fix: improve VAE tiling (#372)
* fix and improve: VAE tiling
- properly handle the upper left corner interpolating both x and y
- refactor out lerp
- use smootherstep to preserve more detail and spend less area blending

* actually fix vae tile merging

Co-authored-by: stduhpf <stephduh@live.fr>

* remove the now unused lerp function

---------

Co-authored-by: stduhpf <stephduh@live.fr>
master-e71ddce
2024-08-28 00:21:12 +08:00
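
Note on #372: the commit message mentions replacing plain linear interpolation with smootherstep when blending overlapping VAE tiles. A minimal sketch of that weighting follows, assuming the standard smootherstep polynomial 6t^5 - 15t^4 + 10t^3. The function names `smootherstep`, `blend`, and `blend_corner` are illustrative and are not taken from the repository.

```python
def smootherstep(t: float) -> float:
    """Standard smootherstep polynomial 6t^5 - 15t^4 + 10t^3, clamped to [0, 1]."""
    t = min(max(t, 0.0), 1.0)
    return t * t * t * (t * (t * 6.0 - 15.0) + 10.0)


def blend(old: float, new: float, t: float) -> float:
    """Blend two overlapping pixel values; t is the position across the overlap (0..1)."""
    w = smootherstep(t)
    return old * (1.0 - w) + new * w


def blend_corner(old: float, new: float, tx: float, ty: float) -> float:
    """Hypothetical corner case: weight the new tile by both the x and y positions."""
    w = smootherstep(tx) * smootherstep(ty)
    return old * (1.0 - w) + new * w
```

Compared with a linear ramp, the curve is flat near both ends of the overlap, so most of the transition happens in a narrower band, which fits the stated goal of spending less area blending.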
stduhpf
f4c937cb94
fix: add some missing cli args to usage (#363) master-f4c937c 2024-08-28 00:17:46 +08:00
Daniele
0362cc4874
fix: fix some typos (#361) master-0362cc4 2024-08-28 00:15:37 +08:00
Yu Xing
6c88ad3fd6
fix: resolve naming conflict while llama.cpp and sd.cpp both build (#351) master-6c88ad3 2024-08-28 00:14:41 +08:00
Daniele
dc0882cdc9
feat: add exponential scheduler (#346)
* feat: added exponential scheduler

* updated README

* improved exponential formatting

---------

Co-authored-by: leejet <leejet714@gmail.com>
master-dc0882c
2024-08-28 00:13:35 +08:00
Daniele
d00c94844d
feat: add ipndm and ipndm_v samplers (#344) master-d00c948 2024-08-28 00:03:41 +08:00
Daniele
2d4a2f7982
feat: add GITS scheduler (#343) master-2d4a2f7 2024-08-28 00:02:17 +08:00
Tim Miller
353ee93e2d
fix: add enum type to sd_type_t (#293) master-353ee93 2024-08-27 23:57:24 +08:00
soham
2027b16fda
feat: add vulkan backend support (#291)
* Fix includes and init vulkan the same as llama.cpp

* Add Windows Vulkan CI

* Updated ggml submodule

* support epsilon as a parameter for ggml_group_norm

---------

Co-authored-by: Cloudwalk <cloudwalk@icculus.org>
Co-authored-by: Oleg Skutte <00.00.oleg.00.00@gmail.com>
Co-authored-by: leejet <leejet714@gmail.com>
master-2027b16
2024-08-27 23:56:09 +08:00
leejet
8847114abf fix: fix issue when applying lora master-8847114 2024-08-25 22:39:39 +08:00
leejet
5c561eab31 feat: do not convert more flux tensors master-5c561ea 2024-08-25 16:01:36 +08:00
leejet
f5997a1951 fix: do not force using f32 for some flux layers
This sometimes leads to worse result
master-f5997a1
2024-08-25 14:07:22 +08:00
leejet
1bdc767aaf feat: force using f32 for some layers master-1bdc767 2024-08-25 13:53:16 +08:00
leejet
79c9fe9556 feat: do not convert some tensors master-79c9fe9 2024-08-25 13:37:37 +08:00
leejet
28a614769a docs: update docs/flux.md 2024-08-25 13:11:34 +08:00
leejet
c837c5d9cc style: format code master-c837c5d 2024-08-25 00:19:37 +08:00
leejet
d08d7fa632 docs: update README.md 2024-08-24 14:38:44 +08:00
leejet
64d231f384
feat: add flux support (#356)
* add flux support

* avoid build failures in non-CUDA environments

* fix schnell support

* add k quants support

* add support for applying lora to quantized tensors

* add inplace conversion support for f8_e4m3 (#359)

in the same way it is done for bf16
like how bf16 converts losslessly to fp32,
f8_e4m3 converts losslessly to fp16

* add xlabs flux comfy converted lora support

* update docs

---------

Co-authored-by: Erik Scholz <Green-Sky@users.noreply.github.com>
master-64d231f
2024-08-24 14:29:52 +08:00
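
Note on #359: the message claims that f8_e4m3 converts losslessly to fp16. The sketch below checks that claim by brute force. It assumes the "fn" flavour of E4M3 (1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits, no infinities, a single NaN mantissa pattern), which is an assumption and is not stated in the log. The helper names are illustrative, and this is not the repository's conversion code.

```python
import math
import struct


def f8_e4m3_to_float(byte: int) -> float:
    """Decode one E4M3 byte (assumed 'fn' variant) to a Python float."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0x0F
    man = byte & 0x07
    if exp == 0x0F and man == 0x07:
        return math.nan                        # the only NaN pattern in this variant
    if exp == 0:
        return sign * (man / 8.0) * 2.0 ** -6  # subnormal values
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)


def roundtrips_through_fp16(x: float) -> bool:
    """True if x survives a pack/unpack cycle through IEEE half precision."""
    return struct.unpack("<e", struct.pack("<e", x))[0] == x


# Every finite 8-bit pattern should be exactly representable in fp16.
finite = [f8_e4m3_to_float(b) for b in range(256)]
finite = [v for v in finite if not math.isnan(v)]
all_lossless = all(roundtrips_through_fp16(v) for v in finite)
```

Under that assumption the largest finite value is 448 and the smallest non-zero magnitude is 2^-9, both inside the fp16 normal range, and 3 mantissa bits fit in fp16's 10.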
zhentaoyu
697d000f49
feat: add SYCL Backend Support for Intel GPUs (#330)
* update ggml and add SYCL CMake option

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

* hacky CMakeLists.txt for updating ggml in cpu backend

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

* rebase and clean code

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

* add sycl in README

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

* rebase ggml commit

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

* refine README

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

* update ggml for supporting sycl tsembd op

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

---------

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>
master-697d000
2024-08-10 13:42:50 +08:00
leejet
5b8d16aa68 docs: reorganize README.md 2024-08-03 12:06:34 +08:00
leejet
3d854f7917 sync: update ggml submodule url master-3d854f7 2024-08-03 11:42:12 +08:00
leejet
4a6e36edc5 sync: update ggml master-4a6e36e 2024-07-28 18:30:35 +08:00
leejet
73c2176648
feat: add sd3 support (#298) master-73c2176 2024-07-28 15:44:08 +08:00
Phu Tran
9c51d8787f
chore: fix cuda CI (#286) master-9c51d87 2024-06-12 23:13:24 +08:00
leejet
f9f0d4685b fix: sample_k_diffusion should be static 2024-06-10 23:04:02 +08:00
leejet
8d2050a5cf sync: update ggml 2024-06-10 22:59:36 +08:00
leejet
08f5b41956 refactor: make the sampling module more independent 2024-06-10 22:42:15 +08:00
Eugene
b6daf5c55b
fix: use PRI64 instead of %i for some log (#269) 2024-06-01 14:01:58 +08:00
leejet
be6cd1a4bf sync: update ggml 2024-06-01 13:44:09 +08:00
Justine Tunney
e1384defca
perf: make crc32 100x faster on x86-64 (#278)
This change makes checkpoints load significantly faster by optimizing
pkzip's cyclic redundancy check. This code was developed by Intel and
Google and Mozilla. See Chromium's zlib codebase for further details.
master-e1384de
2024-06-01 12:58:30 +08:00
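
Note on #278: for readers unfamiliar with the checksum being optimized, the sketch below is a deliberately slow bit-by-bit reference for the CRC-32 used by zip archives, cross-checked against Python's `zlib.crc32`. It is not the optimized code from the commit; it only shows what value that code has to reproduce. The function name `crc32_bitwise` is illustrative.

```python
import zlib


def crc32_bitwise(data: bytes) -> int:
    """Reference CRC-32 (reflected polynomial 0xEDB88320), one bit at a time."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right; fold in the polynomial when the low bit was set.
            crc = (crc >> 1) ^ (0xEDB88320 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


sample = b"123456789"
matches_zlib = crc32_bitwise(sample) == zlib.crc32(sample)
```

A faster implementation must return the same 32-bit value for every input, so a comparison like this can serve as a basic regression check.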
Phu Tran
814280343c
chore: update artifact actions (#267) master-8142803 2024-06-01 12:33:13 +08:00
leejet
1d2af5ca3f fix: set n_dims of tensor storage to 1 when it's 0 master-1d2af5c 2024-05-14 23:06:52 +08:00
Grauho
ce1bcc74a6
feat: add AYS(Align Your Steps) scheduler (#241)
Added NVIDIA's new "Align Your Steps" style scheduler in accordance with their
quick start guide. Currently has handling for SD1.5, SDXL, and SVD, using the
noise levels from their paper to generate the sigma values. Can be selected
using the --schedule ays command line switch. Updates the main.cpp help
message and README to reflect this option, also they now inform the user
of the --color switch as well.

---------

Co-authored-by: leejet <leejet714@gmail.com>
master-ce1bcc7
2024-04-29 23:21:32 +08:00
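
Note on #241: the message says the sigma values are generated from a fixed table of noise levels. One plausible way to stretch such a table to an arbitrary step count is linear interpolation in log space. The sketch below assumes that approach, which the log does not confirm. The input levels are made-up placeholders, not the values from the paper, and `loglinear_interp` is an illustrative name.

```python
import math


def loglinear_interp(levels, n):
    """Resample positive noise levels to n points (n >= 2), linear in log space."""
    logs = [math.log(v) for v in levels]
    out = []
    for i in range(n):
        pos = i * (len(logs) - 1) / (n - 1)  # fractional index into the table
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(logs) - 1)
        frac = pos - lo
        out.append(math.exp(logs[lo] * (1.0 - frac) + logs[hi] * frac))
    return out


# Placeholder table (NOT the published values), resampled from 3 to 5 points.
placeholder_levels = [8.0, 2.0, 0.5]
sigmas = loglinear_interp(placeholder_levels, 5)
```

Interpolating in log space keeps the ratio between neighbouring sigmas smooth, which suits noise levels that span several orders of magnitude.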
Eugene
760cfaa618
fix: ignore tensors with the particular dim while loading (#233) master-760cfaa 2024-04-29 23:04:27 +08:00
Eugene
6d16f6853e
fix: correct upscale progressbar (#232) master-6d16f68 2024-04-29 22:59:46 +08:00
leejet
036ba9e6d8 feat: enable controlnet and photo maker for img2img mode master-036ba9e 2024-04-14 16:36:08 +08:00
leejet
ec82d5279a refactor: remove some useless code master-ec82d52 2024-04-14 14:04:52 +08:00
bssrdf
afea457eda
fix: support more SDXL LoRA names (#216)
* apply pmid lora only once for multiple txt2img calls

* add better support for SDXL LoRA

* fix for some sdxl lora, like lcm-lora-xl

---------

Co-authored-by: bssrdf <bssrdf@gmail.com>
Co-authored-by: leejet <leejet714@gmail.com>
master-afea457
2024-04-06 17:12:03 +08:00
null-define
646e77638e
fix: fix tiles_ctx not freed in sd_tiling (#219) master-646e776 2024-04-06 16:51:48 +08:00
leejet
3ac48ea1a7 fix: use static implementation of stb_image_resize master-3ac48ea 2024-04-06 16:37:08 +08:00
Phu Tran
607e39489f
docs: add Jellybox as UI using sd.cpp (#214) 2024-04-02 12:31:54 +08:00