Daniele
5b8996f74a
Conv2D direct support (#744)
* Conv2DDirect for VAE stage
* Enable only for Vulkan, reduced duplicated code
* Cmake option to use conv2d direct
* conv2d direct always on for opencl
* conv direct as a flag
* fix merge typo
* Align conv2d behavior to flash attention's
* fix readme
* add conv2d direct for controlnet
* add conv2d direct for esrgan
* clean code, use enable_conv2d_direct/get_all_blocks
* format code
---------
Co-authored-by: leejet <leejet714@gmail.com>
2025-08-03 01:25:17 +08:00
leejet
f6b9aa1a43
refactor: optimize the usage of tensor_types
2025-07-28 23:18:29 +08:00
rmatif
d42fd59464
feat: add OpenCL backend support (#680)
2025-06-30 23:32:23 +08:00
stduhpf
7ce63e740c
feat: flexible model architecture for dit models (Flux & SD3) (#490)
* Refactor: wtype per tensor
* Fix default args
* refactor: fix flux
* Refactor photomaker v2 support
* unet: refactor the refactoring
* Refactor: fix controlnet and tae
* refactor: upscaler
* Refactor: fix runtime type override
* upscaler: use fp16 again
* Refactor: Flexible sd3 arch
* Refactor: Flexible Flux arch
* format code
---------
Co-authored-by: leejet <leejet714@gmail.com>
2024-11-30 14:18:53 +08:00
leejet
73c2176648
feat: add sd3 support (#298)
2024-07-28 15:44:08 +08:00
leejet
be6cd1a4bf
sync: update ggml
2024-06-01 13:44:09 +08:00
leejet
ec82d5279a
refactor: remove some useless code
2024-04-14 14:04:52 +08:00
leejet
b6368868d9
feat: introduce GGMLBlock and implement SVD(Broken) (#159)
* introduce GGMLBlock and implement SVD(Broken)
* add sdxl vae warning
2024-02-24 20:06:39 +08:00
leejet
349439f239
style: format code
2024-01-29 23:05:18 +08:00
leejet
2b6ec97fe2
sync: update ggml (#134)
2024-01-05 23:18:41 +08:00
leejet
2e79a82f85
refactor: reorganize code and use c api (#133)
2024-01-01 16:22:18 +08:00