
Problem using VULKAN version ~8kb black image (sometimes not), maybe conditioner bug #748

@phil2sat


I don't know what it is; I tried almost every option. Here's a good run:
EDIT: I know exactly what it is...

While writing this I tried several prompts. Simple ones like "a cat" do work;
longer prompts with weights end up weird, like:

..., ..., masterpiece, best quality, amazing quality, realistic, photorealistic, hyper-realistic, photo, high-resolution, 8K ultra-detailed RAW photograph, Fujifilm XT3, soft even lighting, realistic texture, natural shadows, high-end magazine photoshoot', 1], ]

will become something like this and generate nothing:

..., ..., masterpiece, best quality, amazing quality, realistic, photorealistic, hyper-realistic, photo, high-resolution, 8K ultra-detailed RAW photograph, Fujifilm XT3, soft even lighting, realistic texture, natural shadows, high-end magazine photoshoot qual <--- HERE IS SOMETHING WRONG --> ', 1], ] <-- MISSING

If I remove just the word "quality", it generates a picture as it should.

Next thing: if I add a negative prompt like:

blurry, low resolution, low quality, bad anatomy, poorly drawn face, missing limbs, extra limbs, duplicate body parts, cropped, jpeg artifacts, bad hands, malformed feet, deformed pu**y, unrealistic proportions, watermark, text, sketch, cartoon, CGI render, anime, painting, 3D render, noise, bad composition, tanlines <-- from forge exactly 77 tokens

or

blurry, low resolution, low quality, bad anatomy, poorly drawn face, missing limbs, extra limbs, duplicate body parts, cropped, jpeg artifacts

it also generates no image,

while "blurry" alone works. Weird...

At least with this knowledge I can generate images, but it took a week to figure out.
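
The word-by-word elimination described above could be scripted. This is only a minimal sketch of a greedy reduction over the comma-separated prompt terms. Assumptions: `minimize_failing_terms`, `split_prompt` and the `fails` callback are hypothetical names, not part of stable-diffusion.cpp. In real use `fails` would have to run `sd` with the candidate prompt and decide whether the output is the black image; here a stand-in predicate is used purely to show the mechanics and says nothing about the actual cause.

```python
from typing import Callable, List


def split_prompt(prompt: str) -> List[str]:
    """Split a comma-separated prompt into stripped, non-empty terms."""
    return [t.strip() for t in prompt.split(",") if t.strip()]


def minimize_failing_terms(
    terms: List[str], fails: Callable[[List[str]], bool]
) -> List[str]:
    """Greedily drop terms while the remaining prompt still fails.

    Returns a reduced list in which removing any single term makes
    the failure go away (according to the `fails` callback).
    """
    if not fails(terms):
        raise ValueError("the full prompt does not fail, nothing to reduce")
    current = list(terms)
    i = 0
    while i < len(current):
        candidate = current[:i] + current[i + 1:]
        if candidate and fails(candidate):
            current = candidate  # term i is not needed to reproduce the failure
        else:
            i += 1  # term i is needed, keep it and move on
    return current


if __name__ == "__main__":
    # Stand-in predicate for demonstration only (an assumption, not the real bug):
    # pretend the failure needs both "quality" terms to be present.
    def fake_fails(terms: List[str]) -> bool:
        return "best quality" in terms and "amazing quality" in terms

    prompt = "masterpiece, best quality, amazing quality, realistic, photo"
    print(minimize_failing_terms(split_prompt(prompt), fake_fails))
```

Each call to `fails` would cost one full generation, so on slow hardware it may be worth lowering the step count while reducing, assuming the failure still reproduces at fewer steps.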

Thanks for your hard work and effort. This is the only way to get Stable Diffusion running on my rig:
4790K, Radeon R9 290 4GB, 16GB RAM.

sd -M img_gen -p "a cat" -n "" --sampling-method ddim_trailing --steps 20 --schedule discrete -W 512 -H 512 -b 1 --cfg-scale 7 -s -1 --clip-skip -1 --embd-dir /home/phil2sat/sd.cpp-webui/models/embeddings/ --lora-model-dir /home/phil2sat/sd.cpp-webui/models/loras/ -t 0 --rng cuda -o /home/phil2sat/sd.cpp-webui/outputs/txt2img/14.png --model /home/phil2sat/sd.cpp-webui/models/checkpoints/ponyRealism_V22_q4_k.safetensors --vae /home/phil2sat/sd.cpp-webui/models/vae/fixFP16ErrorsSDXLLowerMemoryUse_v10.safetensors --color --diffusion-fa --diffusion-conv-direct --vae-conv-direct -v


Option:
    n_threads:         4
    mode:              img_gen
    model_path:        /home/phil2sat/sd.cpp-webui/models/checkpoints/ponyRealism_V22_q4_k.safetensors
    wtype:             unspecified
    clip_l_path:
    clip_g_path:
    t5xxl_path:
    diffusion_model_path:
    vae_path:          /home/phil2sat/sd.cpp-webui/models/vae/fixFP16ErrorsSDXLLowerMemoryUse_v10.safetensors
    taesd_path:
    esrgan_path:
    control_net_path:
    embedding_dir:   /home/phil2sat/sd.cpp-webui/models/embeddings/
    stacked_id_embed_dir:
    input_id_images_path:
    style ratio:       20.00
    normalize input image :  false
    output_path:       /home/phil2sat/sd.cpp-webui/outputs/txt2img/14.png
    init_img:
    mask_img:
    control_image:
    ref_images_paths:
    clip on cpu:       false
    controlnet cpu:    false
    vae decoder on cpu:false
    diffusion flash attention:true
    strength(control): 0.90
    prompt:            a cat
    negative_prompt:
    min_cfg:           1.00
    cfg_scale:         7.00
    img_cfg_scale:     7.00
    slg_scale:         0.00
    guidance:          3.50
    eta:               0.00
    clip_skip:         -1
    width:             512
    height:            512
    sample_method:     ddim_trailing
    schedule:          discrete
    sample_steps:      20
    strength(img2img): 0.75
    rng:               cuda
    seed:              854311355
    batch_count:       1
    vae_tiling:        false
    upscale_repeats:   1
    chroma_use_dit_mask:   true
    chroma_use_t5_mask:    false
    chroma_t5_mask_pad:    1
System Info:
    SSE3 = 1
    AVX = 1
    AVX2 = 1
    AVX512 = 0
    AVX512_VBMI = 0
    AVX512_VNNI = 0
    FMA = 1
    NEON = 0
    ARM_FMA = 0
    F16C = 1
    FP16_VA = 0
    WASM_SIMD = 0
    VSX = 0
[DEBUG] stable-diffusion.cpp:145  - Using Vulkan backend
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon R9 200 Series (RADV HAWAII) (radv) | uma: 0 | fp16: 0 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 0 | matrix cores: none
[INFO ] stable-diffusion.cpp:192  - loading model from '/home/phil2sat/sd.cpp-webui/models/checkpoints/ponyRealism_V22_q4_k.safetensors'
[INFO ] model.cpp:1010 - load /home/phil2sat/sd.cpp-webui/models/checkpoints/ponyRealism_V22_q4_k.safetensors using gguf format
[DEBUG] model.cpp:1027 - init from '/home/phil2sat/sd.cpp-webui/models/checkpoints/ponyRealism_V22_q4_k.safetensors'
[INFO ] stable-diffusion.cpp:231  - loading vae from '/home/phil2sat/sd.cpp-webui/models/vae/fixFP16ErrorsSDXLLowerMemoryUse_v10.safetensors'
[INFO ] model.cpp:1013 - load /home/phil2sat/sd.cpp-webui/models/vae/fixFP16ErrorsSDXLLowerMemoryUse_v10.safetensors using safetensors format
[DEBUG] model.cpp:1088 - init from '/home/phil2sat/sd.cpp-webui/models/vae/fixFP16ErrorsSDXLLowerMemoryUse_v10.safetensors'
[INFO ] stable-diffusion.cpp:243  - Version: SDXL
[INFO ] stable-diffusion.cpp:277  - Weight type:                 q4_K
[INFO ] stable-diffusion.cpp:278  - Conditioner weight type:     q4_K
[INFO ] stable-diffusion.cpp:279  - Diffusion model weight type: q4_K
[INFO ] stable-diffusion.cpp:280  - VAE weight type:             f32
[DEBUG] stable-diffusion.cpp:282  - ggml tensor size = 400 bytes
[INFO ] stable-diffusion.cpp:330  - Using flash attention in the diffusion model
[DEBUG] clip.hpp:171  - vocab size: 49408
[DEBUG] clip.hpp:182  -  trigger word img already in vocab
[DEBUG] ggml_extend.hpp:1216 - clip params backend buffer size =  191.00 MB(VRAM) (196 tensors)
[DEBUG] ggml_extend.hpp:1216 - clip params backend buffer size =  587.42 MB(VRAM) (517 tensors)
[DEBUG] ggml_extend.hpp:1216 - unet params backend buffer size =  1960.49 MB(VRAM) (1680 tensors)
[DEBUG] ggml_extend.hpp:1216 - vae params backend buffer size =  94.47 MB(VRAM) (140 tensors)
[DEBUG] stable-diffusion.cpp:459  - loading weights
[DEBUG] model.cpp:1891 - loading tensors from /home/phil2sat/sd.cpp-webui/models/checkpoints/ponyRealism_V22_q4_k.safetensors

  |==================================================| 2641/2641 - 500.00it/s
[DEBUG] model.cpp:1891 - loading tensors from /home/phil2sat/sd.cpp-webui/models/vae/fixFP16ErrorsSDXLLowerMemoryUse_v10.safetensors

  |==================>                               | 959/2641 - 0.00it/s
[INFO ] stable-diffusion.cpp:543  - total params memory size = 2833.38MB (VRAM 2833.38MB, RAM 0.00MB): clip 778.42MB(VRAM), unet 1960.49MB(VRAM), vae 94.47MB(VRAM), controlnet 0.00MB(VRAM), pmid 0.00MB(VRAM)
[INFO ] stable-diffusion.cpp:562  - loading model from '/home/phil2sat/sd.cpp-webui/models/checkpoints/ponyRealism_V22_q4_k.safetensors' completed, taking 1.54s
[INFO ] stable-diffusion.cpp:604  - running in eps-prediction mode
[INFO ] stable-diffusion.cpp:610  - running with discrete schedule
[DEBUG] stable-diffusion.cpp:648  - finished loaded file
[DEBUG] stable-diffusion.cpp:1887 - generate_image 512x512
[INFO ] stable-diffusion.cpp:2017 - TXT2IMG
[DEBUG] stable-diffusion.cpp:1557 - prompt after extract and remove lora: "a cat"
[INFO ] stable-diffusion.cpp:738  - Attempting to apply 0 LoRAs
[INFO ] stable-diffusion.cpp:1562 - apply_loras completed, taking 0.00s
[DEBUG] conditioner.hpp:358  - parse 'a cat' to [['a cat', 1], ]
[DEBUG] clip.hpp:311  - token length: 77
[DEBUG] ggml_extend.hpp:1168 - clip compute buffer size: 1.40 MB(VRAM)
[DEBUG] ggml_extend.hpp:1168 - clip compute buffer size: 2.33 MB(VRAM)
[DEBUG] ggml_extend.hpp:1168 - clip compute buffer size: 2.33 MB(VRAM)
[DEBUG] conditioner.hpp:486  - computing condition graph completed, taking 213 ms
[DEBUG] conditioner.hpp:358  - parse '' to [['', 1], ]
[DEBUG] clip.hpp:311  - token length: 77
[DEBUG] ggml_extend.hpp:1168 - clip compute buffer size: 1.40 MB(VRAM)
[DEBUG] ggml_extend.hpp:1168 - clip compute buffer size: 2.33 MB(VRAM)
[DEBUG] ggml_extend.hpp:1168 - clip compute buffer size: 2.33 MB(VRAM)
[DEBUG] conditioner.hpp:486  - computing condition graph completed, taking 211 ms
[INFO ] stable-diffusion.cpp:1696 - get_learned_condition completed, taking 426 ms
[INFO ] stable-diffusion.cpp:1719 - sampling using DDIM "trailing" method
[INFO ] stable-diffusion.cpp:1768 - generating image: 1/1 - seed 854311355
[DEBUG] stable-diffusion.cpp:865  - Sample
[DEBUG] ggml_extend.hpp:1168 - unet compute buffer size: 123.46 MB(VRAM)

  |==================================================| 20/20 - 4.93s/it
[INFO ] stable-diffusion.cpp:1806 - sampling completed, taking 75.23s
[INFO ] stable-diffusion.cpp:1814 - generating 1 latent images completed, taking 75.24s
[INFO ] stable-diffusion.cpp:1817 - decoding 1 latents
[DEBUG] ggml_extend.hpp:1168 - vae compute buffer size: 1664.00 MB(VRAM)
[DEBUG] stable-diffusion.cpp:1182 - computing vae [mode: DECODE] graph completed, taking 42.42s
[INFO ] stable-diffusion.cpp:1827 - latent 1 decoded, taking 42.42s
[INFO ] stable-diffusion.cpp:1831 - decode_first_stage completed, taking 42.42s
[INFO ] stable-diffusion.cpp:2088 - generate_image completed in 118.08s
save result PNG image to '/home/phil2sat/sd.cpp-webui/outputs/txt2img/14.png'
