* global bool
* reworked circular to global flag
* cleaner implementation of tiling support in sd.cpp
* cleaned rope
* working simplified version, but wraps are still needed
* Further cleanup of rope
* resolve flux conflict
* switch to pad op circular only
* Set ggml to most recent
* Temporarily revert ggml
* Update ggml to most recent
* Revert unneeded flux change
* move circular flag to the GGMLRunnerContext
* Pass through circular param in all places where conv is called
* Fix constant and minor cleanup
* Added back --circular option
* Conv2d circular in VAE and various models
* Fix temporal padding for Qwen Image and other VAEs
* Z Image circular tiling
* x- and y-axis seamless only (see the padding sketch after this list)
* First attempt at Chroma seamless x and y
* refactor into pure x and y, almost there
* Fix crash on Chroma
* Refactor into cleaner variable choices
* Removed redundant set_circular_enabled
* Sync ggml
* simplify circular parameter
* format code
* no need to perform circular padding on the CLIP model
* simplify circular_axes setting
* unify function naming
* remove unnecessary member variables
* simplify rope
---------
Co-authored-by: Phylliida <phylliidadev@gmail.com>
Co-authored-by: leejet <leejet714@gmail.com>
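For context on the circular commits above, here is a minimal sketch of what "circular" conv padding means: out-of-range reads wrap around the image instead of being zero-filled, so a convolution kernel sees the left edge continuing from the right edge and the output tiles seamlessly. This is illustrative only, not the stable-diffusion.cpp implementation; `pad_circular` and its parameters are hypothetical.

```cpp
#include <cstdio>
#include <vector>

// Illustrative circular ("wrap") padding. Each axis can wrap independently,
// matching the x- and y-axis seamless split in the commits above; an axis
// that does not wrap falls back to zero padding.
static std::vector<float> pad_circular(const std::vector<float>& img,
                                       int w, int h, int pad,
                                       bool wrap_x, bool wrap_y) {
    const int W = w + 2 * pad;
    const int H = h + 2 * pad;
    std::vector<float> out(W * H, 0.0f);
    for (int y = 0; y < H; ++y) {
        for (int x = 0; x < W; ++x) {
            int sx = x - pad;
            int sy = y - pad;
            if (wrap_x) sx = ((sx % w) + w) % w;  // wrap column into [0, w)
            if (wrap_y) sy = ((sy % h) + h) % h;  // wrap row into [0, h)
            if (sx < 0 || sx >= w || sy < 0 || sy >= h) {
                continue;                         // non-wrapped axis: zero pad
            }
            out[y * W + x] = img[sy * w + sx];
        }
    }
    return out;
}

int main() {
    const std::vector<float> img = {1, 2, 3, 4};  // 2x2 test image
    const auto padded = pad_circular(img, 2, 2, 1, true, true);
    for (int y = 0; y < 4; ++y) {                 // first row prints 4 3 4 3:
        for (int x = 0; x < 4; ++x) {             // the bottom row wrapped
            printf("%g ", padded[y * 4 + x]);     // above the top edge
        }
        printf("\n");
    }
}
```

Wrapping only one axis gives outputs that tile horizontally or vertically but not both, which is the distinction the x/y refactor commits are drawing.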
* introduce GGMLRunnerContext (the flag-threading pattern is sketched after this list)
* add Flash Attention enable control through GGMLRunnerContext
* add conv2d_direct enable control through GGMLRunnerContext
* repair flash attention in _ext
this does not fix the currently broken FA behind the define, which is only used by the VAE
Co-authored-by: FSSRepo <FSSRepo@users.noreply.github.com>
* make flash attention in the diffusion model a runtime flag
no support for SD3 or video
* remove old flash attention option and switch the VAE over to attn_ext
* update docs
* format code
---------
Co-authored-by: FSSRepo <FSSRepo@users.noreply.github.com>
Co-authored-by: leejet <leejet714@gmail.com>
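A rough sketch of the pattern the GGMLRunnerContext commits describe: runtime toggles such as flash attention and conv2d_direct travel with one context object that graph-building call sites consult, replacing a compile-time define. The struct and function names below are hypothetical stand-ins, not the actual stable-diffusion.cpp API.

```cpp
#include <cstdio>

// Hypothetical stand-in for the runner context: runtime flags live in one
// place and are threaded to every attention/conv build site.
struct RunnerContext {
    bool flash_attn_enabled    = false;  // previously a build-time option
    bool conv2d_direct_enabled = false;
    bool circular              = false;  // seamless-tiling padding mode
};

// Build sites read the context instead of a #define or a member variable.
static void build_attention(const RunnerContext& ctx) {
    if (ctx.flash_attn_enabled) {
        printf("building flash-attention path\n");
    } else {
        printf("building vanilla attention path\n");
    }
}

static void build_conv2d(const RunnerContext& ctx) {
    printf("conv2d: direct=%d circular=%d\n",
           ctx.conv2d_direct_enabled, ctx.circular);
}

int main() {
    RunnerContext ctx;
    ctx.flash_attn_enabled = true;  // flipped at runtime, e.g. by a CLI flag
    build_attention(ctx);
    build_conv2d(ctx);
}
```

This is what makes "flash attention in the diffusion model a runtime flag" possible: the decision moves from the preprocessor to a value the runner carries around.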
* add flux support
* avoid build failures in non-CUDA environments
* fix schnell support
* add k quants support
* add support for applying lora to quantized tensors
* add inplace conversion support for f8_e4m3 (#359)
in the same way it is done for bf16: just as bf16 converts losslessly to fp32, f8_e4m3 converts losslessly to fp16 (see the widening sketch after this list)
* add xlabs flux comfy converted lora support
* update docs
---------
Co-authored-by: Erik Scholz <Green-Sky@users.noreply.github.com>
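Why the f8_e4m3 conversion is lossless: fp16's 5 exponent and 10 mantissa bits strictly contain e4m3's 4 and 3 bits, so widening is pure bit manipulation, analogous to widening bf16 to fp32 with a 16-bit shift. Below is a hedged sketch for the E4M3FN variant (which has no infinities); the helper name is hypothetical, not the sd.cpp code.

```cpp
#include <cstdint>
#include <cstdio>

// Widen f8_e4m3 (1 sign / 4 exponent / 3 mantissa bits, bias 7) to fp16
// (1 / 5 / 10 bits, bias 15). Every finite e4m3 value is exactly
// representable in fp16, so no rounding ever happens.
static uint16_t f8_e4m3_to_f16_bits(uint8_t x) {
    const uint16_t sign = (uint16_t)(x & 0x80) << 8;  // sign moves to bit 15
    uint32_t exp = (x >> 3) & 0x0F;                   // 4-bit exponent field
    uint32_t man = x & 0x07;                          // 3-bit mantissa field

    if (exp == 0x0F && man == 0x07) {                 // E4M3FN NaN encoding
        return sign | 0x7E00;                         // quiet NaN in fp16
    }
    if (exp == 0) {
        if (man == 0) {
            return sign;                              // signed zero
        }
        // Subnormal: value = man * 2^-9. Normalize it into an fp16 normal.
        int s = 0;
        while ((man & 0x08) == 0) { man <<= 1; ++s; }
        return sign | (uint16_t)((9 - s) << 10) | (uint16_t)((man & 0x07) << 7);
    }
    // Normal: rebias the exponent (7 -> 15) and left-align the mantissa.
    return sign | (uint16_t)((exp + 8) << 10) | (uint16_t)(man << 7);
}

int main() {
    printf("%04x\n", f8_e4m3_to_f16_bits(0x38));  // 1.0  -> 3c00
    printf("%04x\n", f8_e4m3_to_f16_bits(0x01));  // 2^-9 -> 1800
}
```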
* add ControlNet to the pipeline
* add CLI params
* add control strength CLI param
* add CLI param to keep ControlNet on the CPU
* add Textual Inversion
* add Canny preprocessor
* refactor: change ggml_type_sizef to ggml_row_size
* process hint only once
* ignore the embedding name case (see the lookup sketch after this list)
---------
Co-authored-by: leejet <leejet714@gmail.com>
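One detail from the list above, sketched: ignoring the embedding name case usually means keying the textual-inversion table on a lowercased name so lookups match regardless of capitalization. The names below are illustrative, not the actual sd.cpp code.

```cpp
#include <algorithm>
#include <cctype>
#include <cstdio>
#include <map>
#include <string>

// Key the embedding table on a lowercased name so "MyEmbedding" and
// "myembedding" resolve to the same entry.
static std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return (char)std::tolower(c); });
    return s;
}

static std::map<std::string, int> embeddings;  // lowercased name -> slot id

static void register_embedding(const std::string& name, int slot) {
    embeddings[to_lower(name)] = slot;
}

static int lookup_embedding(const std::string& name) {
    const auto it = embeddings.find(to_lower(name));
    return it == embeddings.end() ? -1 : it->second;
}

int main() {
    register_embedding("BadDream", 0);
    printf("%d\n", lookup_embedding("baddream"));  // prints 0: case ignored
}
```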