34 Commits

Author SHA1 Message Date
Wagner Bruna
e72aea796e
feat: embed version string and git commit hash (#1008) 2025-12-09 22:38:54 +08:00
cmdr2
a7d6d296c7
chore: allow building ggml as a separate shared lib (#468) 2025-10-15 22:10:26 +08:00
clibdev
6bbaf161ad
chore: add install() support in CMakeLists.txt (#540) 2025-09-11 22:24:16 +08:00
leejet
675208dcb6 chore: update to c++17 2025-09-07 12:04:17 +08:00
Wagner Bruna
f7f05fb185
chore: avoid setting GGML_MAX_NAME when building against external ggml (#751)
An external ggml will most likely have been built with the default
GGML_MAX_NAME value (64), which would be inconsistent with the value
set by our build (128). That would be an ODR violation, and it could
easily cause memory corruption issues due to the different
sizeof(struct ggml_tensor) values.

For now, when linking against an external ggml, we demand it has been
patched with a bigger GGML_MAX_NAME, since we can't check against a
value defined only at build time.
2025-08-03 01:24:40 +08:00
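
The size mismatch described in the commit message above (#751) can be illustrated with a minimal sketch. It assumes a simplified, hypothetical stand-in for struct ggml_tensor: the names tensor_like, tensor_external, tensor_ours and both helper functions are invented for illustration and are not part of ggml or of this project. Only the values 64 and 128 come from the commit message.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for ggml_tensor. The real struct has many more
// fields; the inline name buffer sized by a macro is the relevant part.
template <std::size_t MaxName>
struct tensor_like {
    int   type;
    char  name[MaxName];  // sized by GGML_MAX_NAME in the real header
    void* extra;          // any field placed after the name buffer
};

using tensor_external = tensor_like<64>;   // layout seen by a default-built ggml
using tensor_ours     = tensor_like<128>;  // layout seen by a build that sets 128

// Number of bytes by which the two views of the "same" struct disagree.
constexpr std::size_t size_mismatch_bytes() {
    return sizeof(tensor_ours) - sizeof(tensor_external);
}

// True only if an object laid out with the larger name buffer would fit in
// storage allocated by code that assumed the smaller one.
constexpr bool fits_in_external_allocation() {
    return sizeof(tensor_ours) <= sizeof(tensor_external);
}

// Compile-time check: the two layouts really do differ.
static_assert(size_mismatch_bytes() != 0,
              "different name sizes must give different struct sizes");

// Runtime form of the same check, for use in a debug build.
inline void check_layouts_differ() {
    assert(!fits_in_external_allocation());
}
```

In a real build both layouts would carry the same type name in different translation units, which is what makes this an ODR violation rather than two distinct types. Code compiled with the larger value would then read and write past the end of objects allocated by the library.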
Seas0
6167e2927a
feat: support build against system installed GGML library (#749) 2025-08-02 11:03:18 +08:00
rmatif
d42fd59464
feat: add OpenCL backend support (#680) 2025-06-30 23:32:23 +08:00
Meng, Hengyu
838beb9b5e
chore: add global SYCL compile flags (#597) 2025-02-22 21:23:58 +08:00
R0CKSTAR
a3cbdf6dcb
chore: SD_USE_CUBLAS => SD_USE_CUDA for MUSA backend (#578)
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-02-05 16:11:26 +08:00
null-define
b70aaa672a
chore: fix amd rocm build (#571) 2025-01-18 13:11:39 +08:00
leejet
dcf91f9e0f chore: change SD_CUBLAS/SD_USE_CUBLAS to SD_CUDA/SD_USE_CUDA 2024-12-28 13:27:51 +08:00
R0CKSTAR
5cc74d1f09
feat: support Moore Threads GPU (#529)
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2024-12-28 13:08:36 +08:00
Erik Scholz
1c168d98a5
fix: repair flash attention support (#386)
* repair flash attention in _ext
this does not fix the currently broken fa behind the define, which is only used by VAE

Co-authored-by: FSSRepo <FSSRepo@users.noreply.github.com>

* make flash attention in the diffusion model a runtime flag
no support for sd3 or video

* remove old flash attention option and switch vae over to attn_ext

* update docs

* format code

---------

Co-authored-by: FSSRepo <FSSRepo@users.noreply.github.com>
Co-authored-by: leejet <leejet714@gmail.com>
2024-11-23 12:39:08 +08:00
zhentaoyu
e410aeb534
sync: update ggml to fix large image generation with SYCL backend (#380)
* turn off fast-math on host in SYCL backend

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

* update ggml for sync some sycl ops

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

* update sycl readme and ggml

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

---------

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>
2024-09-02 22:29:35 +08:00
Yu Xing
6c88ad3fd6
fix: resolve naming conflict while llama.cpp and sd.cpp both build (#351) 2024-08-28 00:14:41 +08:00
soham
2027b16fda
feat: add vulkan backend support (#291)
* Fix includes and init vulkan the same as llama.cpp

* Add Windows Vulkan CI

* Updated ggml submodule

* support epsilon as a parameter for ggml_group_norm

---------

Co-authored-by: Cloudwalk <cloudwalk@icculus.org>
Co-authored-by: Oleg Skutte <00.00.oleg.00.00@gmail.com>
Co-authored-by: leejet <leejet714@gmail.com>
2024-08-27 23:56:09 +08:00
zhentaoyu
697d000f49
feat: add SYCL Backend Support for Intel GPUs (#330)
* update ggml and add SYCL CMake option

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

* hacky CMakeLists.txt for updating ggml in cpu backend

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

* rebase and clean code

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

* add sycl in README

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

* rebase ggml commit

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

* refine README

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

* update ggml for supporting sycl tsembd op

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

---------

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>
2024-08-10 13:42:50 +08:00
leejet
be6cd1a4bf sync: update ggml 2024-06-01 13:44:09 +08:00
Phu Tran
1ce9470f27
fix: fix building shared library (#188) 2024-03-03 13:24:59 +08:00
Cyberhan123
b7870a0f89
chore: improve ci (#150)
---------

Co-authored-by: leejet <leejet714@gmail.com>
2024-02-26 22:01:34 +08:00
leejet
b6368868d9
feat: introduce GGMLBlock and implement SVD(Broken) (#159)
* introduce GGMLBlock and implement SVD(Broken)

* add sdxl vae warning
2024-02-24 20:06:39 +08:00
Steward Garcia
36ec16ac99
feat: Control Net support + Textual Inversion (embeddings) (#131)
* add controlnet to pipeline

* add cli params

* control strength cli param

* cli param keep controlnet in cpu

* add Textual Inversion

* add canny preprocessor

* refactor: change ggml_type_sizef to ggml_row_size

* process hint once time

* ignore the embedding name case

---------

Co-authored-by: leejet <leejet714@gmail.com>
2024-01-29 22:38:51 +08:00
旺旺碎冰冰
c6071fa82f
feat: add hipBlas support (#94) 2024-01-14 11:53:42 +08:00
leejet
7fb8a51318 chore: make SD_BUILD_DLL visible only to SD_LIB 2024-01-02 22:31:40 +08:00
leejet
2c5f3fc53a chore: add support for building shared library 2024-01-02 21:05:44 +08:00
leejet
2e79a82f85
refactor: reorganize code and use c api (#133) 2024-01-01 16:22:18 +08:00
Steward Garcia
004dfbef27
feat: implement ESRGAN upscaler + Metal Backend (#104)
* add esrgan upscaler

* add sd_tiling

* support metal backend

* add clip_skip

---------

Co-authored-by: leejet <leejet714@gmail.com>
2023-12-28 23:46:48 +08:00
leejet
d7af2c2ba9
feat: load weights from safetensors and ckpt (#101) 2023-12-03 15:47:20 +08:00
Steward Garcia
8124588cf1
feat: ggml-alloc integration and gpu acceleration (#75)
* set ggml url to FSSRepo/ggml

* ggml-alloc integration

* offload all functions to gpu

* gguf format + native converter

* merge custom vae to a model

* full offload to gpu

* improve pretty progress

---------

Co-authored-by: leejet <leejet714@gmail.com>
2023-11-26 19:02:36 +08:00
leejet
09cab2a2ae chore: set default BUILD_SHARED_LIBS to OFF 2023-10-22 14:59:03 +08:00
Erik Scholz
844351c417
feat: cmake improvements and simple ci (#9)
* move main and stb-libs to subfolder

* cmake : general additions

* ci : add simple building

---------

Co-authored-by: leejet <31925346+leejet@users.noreply.github.com>
2023-08-17 21:09:57 +08:00
leejet
58735a2813
feat: add img2img mode (#5) 2023-08-16 01:48:07 +08:00
Georgi Gerganov
a08cae6d95
fix: minor build fixes (#2)
* cmake : fix C++11 build

* gitignore : ignore .cache
2023-08-14 08:12:04 +08:00
leejet
3aca342e60 Initial commit 2023-08-13 16:00:22 +08:00