I understand that the GGML OpenCL backend has many limitations, but it's much faster than the CPU backend on my phone. With Conv2D direct, SDXL generation now takes only 15-20 seconds per iteration, which is almost usable, or would be with an accelerator LoRA. However, I can't seem to load any LoRA with this backend.
This works with the CPU backend, but not with the OpenCL one:
sdcpp --model "sd_xl_base_1.0.safetensors" --vae "sdxl_vae.safetensors" --prompt "<lora:dmd2_sdxl_4step_lora_fp16:1.0>A dumb cat"
ggml_opencl: selected platform: 'QUALCOMM Snapdragon(TM)'
ggml_opencl: device: 'QUALCOMM Adreno(TM) 830 (OpenCL 3.0 Adreno(TM) 830)'
ggml_opencl: OpenCL driver: OpenCL 3.0 QUALCOMM build: 0800.35 Compiler E031.47.18.28
ggml_opencl: vector subgroup broadcast support: true
ggml_opencl: device FP16 support: true
ggml_opencl: mem base addr align: 128
ggml_opencl: max mem alloc size: 1024 MB
ggml_opencl: SVM coarse grain buffer support: true
ggml_opencl: SVM fine grain buffer support: true
ggml_opencl: SVM fine grain system support: false
ggml_opencl: SVM atomics support: true
ggml_opencl: flattening quantized weights representation as struct of arrays (GGML_OPENCL_SOA_Q)
ggml_opencl: using kernels optimized for Adreno (GGML_OPENCL_USE_ADRENO_KERNELS)
ggml_opencl: loading OpenCL kernels...........................................................
ggml_opencl: default device: 'QUALCOMM Adreno(TM) 830 (OpenCL 3.0 Adreno(TM) 830)'
[INFO ] stable-diffusion.cpp:192 - loading model from 'sd_xl_base_1.0.safetensors'
[INFO ] model.cpp:1013 - load sd_xl_base_1.0.safetensors using safetensors format
[INFO ] stable-diffusion.cpp:231 - loading vae from 'sdxl_vae.safetensors'
[INFO ] model.cpp:1013 - load sdxl_vae.safetensors using safetensors format
[INFO ] stable-diffusion.cpp:243 - Version: SDXL
[INFO ] stable-diffusion.cpp:272 - Weight type: f16
[INFO ] stable-diffusion.cpp:273 - Conditioner weight type: f16
[INFO ] stable-diffusion.cpp:274 - Diffusion model weight type: f16
[INFO ] stable-diffusion.cpp:275 - VAE weight type: f16
|==================================================| 2641/2641 - 250.00it/s
|==================> | 959/2641 - 0.00it/s
[INFO ] stable-diffusion.cpp:569 - total params memory size = 6751.89MB (VRAM 6751.89MB, RAM 0.00MB): clip 1757.36MB(VRAM), unet 4900.07MB(VRAM), vae 94.47MB(VRAM), controlnet 0.00MB(VRAM), pmid 0.00MB(VRAM)
[INFO ] stable-diffusion.cpp:573 - loading model from 'sd_xl_base_1.0.safetensors' completed, taking 9.31s
[INFO ] stable-diffusion.cpp:615 - running in eps-prediction mode
[INFO ] stable-diffusion.cpp:2028 - TXT2IMG
[INFO ] stable-diffusion.cpp:749 - Attempting to apply 1 LoRAs
[INFO ] model.cpp:1013 - load dmd2_sdxl_4step_lora_fp16.safetensors using safetensors format
[INFO ] lora.hpp:117 - loading LoRA from 'dmd2_sdxl_4step_lora_fp16.safetensors'
|==================================================| 2364/2364 - 0.00it/s
/data/data/com.termux/files/home/.termux-build/sdcpp-opencl/src/ggml/src/ggml-opencl/ggml-opencl.cpp:3697: GGML_ASSERT(src0->type == src1->type) failed
0: 0x607554fcc8
1: 0x607554fc78
2: 0x6075561ed8
3: 0x607552ad90
4: 0x607552848c
5: 0x607554ac04
6: 0x6075564aa0
7: 0x6075434d6c
8: 0x60754975d8
9: 0x607541d704
10: 0x60753fddd0
11: 0x6075403768
12: 0x6075386b3c
13: 0x74f114d8e8 __libc_init
The process was killed by SIGABRT: Aborted
@Green-Sky @stduhpf Is this an OpenCL backend issue, or am I doing something wrong? I don't have experience with any of this, which is why I'm asking. If it is a backend issue, is it possible to apply the LoRA to the model with some other program to create a new merged model, or is that not how it works at all?
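To make the last question concrete: as far as I understand, "applying" a LoRA means adding a low-rank update to each targeted weight matrix, so in principle it could be baked into a new checkpoint ahead of time. Below is a minimal sketch of that arithmetic in plain Python, assuming the common formulation W' = W + multiplier * (alpha / rank) * (up @ down). The function names and toy matrices are hypothetical, and this is not meant to describe how stable-diffusion.cpp or any specific merging tool implements it.

```python
# Illustrative only: toy matrices as lists of rows, no real model files involved.

def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m)."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(weight, lora_up, lora_down, alpha, multiplier=1.0):
    """Return weight + multiplier * (alpha / rank) * (lora_up @ lora_down).

    Assumed shapes: weight is out x in, lora_up is out x rank,
    lora_down is rank x in.
    """
    rank = len(lora_down)
    scale = multiplier * alpha / rank
    delta = matmul(lora_up, lora_down)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


if __name__ == "__main__":
    weight = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # 2 x 3 base weight
    lora_up = [[1.0], [2.0]]                      # 2 x 1 (rank 1)
    lora_down = [[0.5, 0.0, 1.0]]                 # 1 x 3
    print(merge_lora(weight, lora_up, lora_down, alpha=1.0))
```

If this is roughly right, a merged checkpoint would contain only ordinary weights, so no LoRA would need to be applied at load time on the OpenCL backend.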