* introduce GGMLRunnerContext
* add Flash Attention enable control through GGMLRunnerContext
* add conv2d_direct enable control through GGMLRunnerContext
* add ref latent support for qwen image
* optimize clip_preprocess and fix get_first_stage_encoding
* add qwen2vl vit support
* add qwen image edit support
* fix qwen image edit pipeline
* add mmproj file support
* support dynamic number of Qwen image transformer blocks
* set prompt_template_encode_start_idx every time
* to_add_out precision fix
* to_out.0 precision fix
* update docs