Insufficient context pool memory; Large SDXL images + "high" batch numbers #842

@Green-Sky

[INFO ] ggml_extend.hpp:1648 - vae offload params ( 94.47 MB, 140 tensors) to runtime backend (CUDA0), taking 0.07s
[INFO ] stable-diffusion.cpp:2179 - latent 42 decoded, taking 10.33s
[INFO ] ggml_extend.hpp:1648 - vae offload params ( 94.47 MB, 140 tensors) to runtime backend (CUDA0), taking 0.07s
[INFO ] stable-diffusion.cpp:2179 - latent 43 decoded, taking 10.21s
[INFO ] ggml_extend.hpp:1648 - vae offload params ( 94.47 MB, 140 tensors) to runtime backend (CUDA0), taking 0.07s
[INFO ] stable-diffusion.cpp:2179 - latent 44 decoded, taking 10.34s
[WARN ] ggml_extend.hpp:68   - ggml_new_object: not enough space in the context's memory pool (needed 1076015744, available 1073741824)
/build/cxph0ma2xq8falxnwjs2yadqlnzdynvr-source/ggml/src/ggml.c:1663: GGML_ASSERT(obj_new) failed

Which was caused by:

$ result/bin/sd -m models/CyberRealisticXLPlay_V7.0-vae_f16-q8_0.gguf --cfg-scale 7.5 --steps 36 --scheduler karras --sampling-method dpm++2m -n "lowres, bad quality, worst quality, artistic error, wrong hand, wrong foot, bad anatomy, bad feet, bad reflection, bad perspective, bad arm, bad proportions, bad hands, bad leg, watermark, signature, dated, artist name, copyright name, sketch, censored, facial mark, scan" -W 1280 -H 1280 --diffusion-fa --clip-on-cpu --clip-skip 2 --lora-model-dir models/loras/sdxl/ --offload-to-cpu --vae-conv-direct -p "a lovely cat, natural light.<lora:epiCRealnessRC1:-0.75>" -b 64

So: a batch of 64 with an f16+q8_0 SDXL model, --offload-to-cpu and --vae-conv-direct, built with a custom ggml patch (one of the PRs; not sure whether this matters).

If need be, I can rerun with -v (oops), but it takes forever.
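
For scale, here is a quick check of the two numbers in the ggml_new_object warning above. This is only arithmetic on the logged values, not a statement about how the pool is sized in the code; the variable names are made up for illustration.

```python
# Values copied verbatim from the ggml_new_object warning in the log.
needed = 1_076_015_744      # bytes the context would have to hold
available = 1_073_741_824   # bytes in the context's memory pool

MIB = 1024 ** 2
GIB = 1024 ** 3

# How far past the end of the pool the failing allocation would reach.
shortfall = needed - available

print(f"pool size: {available / GIB:.2f} GiB")
print(f"shortfall: {shortfall} bytes ({shortfall / MIB:.2f} MiB)")
```

So the pool is exactly 1 GiB, and the failing allocation would overshoot it by about 2.17 MiB.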
