An external ggml will most likely have been built with the default
GGML_MAX_NAME value (64), which is inconsistent with the value set by
our build (128). That would be an ODR violation, and it could easily
cause memory corruption, since the two builds disagree on
sizeof(struct ggml_tensor).

For now, when linking against an external ggml, we require that it has
been patched with a larger GGML_MAX_NAME, since we cannot check against
a value that is only defined at build time.
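To make the failure mode concrete, here is a minimal sketch (not the
real ggml definition, whose fields differ) of how GGML_MAX_NAME ends up
inside sizeof(struct ggml_tensor):

    /* GGML_MAX_NAME sizes an inline array inside struct ggml_tensor: */
    #define GGML_MAX_NAME 128        /* our build; unpatched ggml uses 64 */

    struct ggml_tensor_sketch {
        /* ... type, dims, op, data pointer, ... */
        char name[GGML_MAX_NAME];    /* struct size depends on the macro  */
        /* ... any trailing fields shift along with it ...               */
    };

    /* A library built with 64 and a caller built with 128 walk the same
     * allocation with two different layouts, so reads and writes land
     * past field boundaries. */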
* repair flash attention in _ext
this does not fix the currently broken flash attention behind the
compile-time define, which is only used by the VAE
Co-authored-by: FSSRepo <FSSRepo@users.noreply.github.com>
* make flash attention in the diffusion model a runtime flag
no support for SD3 or video (runtime switch sketched below)
* remove old flash attention option and switch vae over to attn_ext
* update docs
* format code
---------
Co-authored-by: FSSRepo <FSSRepo@users.noreply.github.com>
Co-authored-by: leejet <leejet714@gmail.com>
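For reference, a rough sketch of what the new runtime flash-attention
flag switches between. This is assumed wiring, not the project's actual
code: the exact ggml_flash_attn_ext signature has changed across ggml
revisions, and the naive path assumes v has already been permuted so the
row dimensions line up.

    #include "ggml.h"
    #include <math.h>

    static struct ggml_tensor * attn(struct ggml_context * ctx,
                                     struct ggml_tensor * q,
                                     struct ggml_tensor * k,
                                     struct ggml_tensor * v,
                                     int64_t d_head,
                                     bool flash_attn /* runtime flag */) {
        const float scale = 1.0f / sqrtf((float)d_head);
        if (flash_attn) {
            /* fused kernel; newer ggml revisions take extra parameters */
            return ggml_flash_attn_ext(ctx, q, k, v, /*mask*/ NULL,
                                       scale, /*max_bias*/ 0.0f);
        }
        /* naive path: out = V * softmax(scale * K^T Q) */
        struct ggml_tensor * kq = ggml_mul_mat(ctx, k, q);
        kq = ggml_soft_max_ext(ctx, kq, /*mask*/ NULL, scale, /*max_bias*/ 0.0f);
        return ggml_mul_mat(ctx, v, kq);
    }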
* Fix includes and initialize Vulkan the same way llama.cpp does
* Add Windows Vulkan CI
* Updated ggml submodule
* support epsilon as a parameter for ggml_group_norm (example below)
---------
Co-authored-by: Cloudwalk <cloudwalk@icculus.org>
Co-authored-by: Oleg Skutte <00.00.oleg.00.00@gmail.com>
Co-authored-by: leejet <leejet714@gmail.com>
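The ggml_group_norm change moves the epsilon out of the op and into the
call site. A minimal sketch, assuming ctx and x come from the
surrounding graph code:

    /* before: epsilon was hardcoded inside the op
     *   cur = ggml_group_norm(ctx, x, 32);
     * after: the caller chooses it per layer */
    struct ggml_tensor * cur = ggml_group_norm(ctx, x, /*n_groups*/ 32, /*eps*/ 1e-6f);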
* add controlnet to pipeline
* add cli params
* control strength cli param
* cli param to keep controlnet on cpu
* add Textual Inversion
* add canny preprocessor
* refactor: change ggml_type_sizef to ggml_row_size (sketch below)
* process hint only once
* match the embedding name case-insensitively
---------
Co-authored-by: leejet <leejet714@gmail.com>
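On the ggml_type_sizef -> ggml_row_size refactor: ggml_type_sizef
returned a fractional bytes-per-element figure that callers multiplied
out in floating point, while ggml_row_size computes the exact byte count
of a row of ne elements, which is what matters for block-quantized
types. A sketch with hypothetical dimensions ne0/ne1:

    /* before: float math, easy to get subtly wrong for quantized blocks
     *   size_t sz = (size_t)(ggml_type_sizef(GGML_TYPE_Q4_0) * ne0 * ne1);
     * after: exact integer math per row */
    size_t sz = ggml_row_size(GGML_TYPE_Q4_0, ne0) * ne1;  /* ne0 elems/row, ne1 rows */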
* set ggml url to FSSRepo/ggml
* ggml-alloc integration
* offload all functions to gpu
* gguf format + native converter (loading sketched below)
* merge a custom vae into the model
* full offload to gpu
* improve pretty progress display
---------
Co-authored-by: leejet <leejet714@gmail.com>
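For the gguf items, a minimal sketch of opening a model through ggml's
gguf API (the file name is hypothetical; in trees of this vintage the
gguf functions are declared in ggml.h):

    #include "ggml.h"
    #include <stdio.h>

    struct ggml_context * meta = NULL;
    struct gguf_init_params params = {
        /*.no_alloc =*/ true,      /* read metadata only, skip tensor data */
        /*.ctx      =*/ &meta,
    };
    struct gguf_context * gctx = gguf_init_from_file("model.gguf", params);
    if (gctx != NULL) {
        for (int i = 0; i < gguf_get_n_tensors(gctx); i++) {
            printf("%s\n", gguf_get_tensor_name(gctx, i));
        }
        gguf_free(gctx);
    }
    if (meta) ggml_free(meta);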
* move main and stb-libs to a subfolder
* cmake : general additions
* ci : add simple build
---------
Co-authored-by: leejet <31925346+leejet@users.noreply.github.com>