
Commit 4c86a7e

RyanJDick authored and hipsterusername committed
Update Low-VRAM docs guidance around max_cache_vram_gb.
1 parent b9f9d1c commit 4c86a7e

1 file changed: +10 -8 lines changed

docs/features/low-vram.md

Lines changed: 10 additions & 8 deletions
@@ -86,24 +86,26 @@ But, if your GPU has enough VRAM to hold models fully, you might get a perf boos
 # As an example, if your system has 32GB of RAM and no other heavy processes, setting the `max_cache_ram_gb` to 28GB
 # might be a good value to achieve aggressive model caching.
 max_cache_ram_gb: 28
+
 # The default max cache VRAM size is adjusted dynamically based on the amount of available VRAM (taking into
 # consideration the VRAM used by other processes).
-# You can override the default value by setting `max_cache_vram_gb`. Note that this value takes precedence over the
-# `device_working_mem_gb`.
-# It is recommended to set the VRAM cache size to be as large as possible while leaving enough room for the working
-# memory of the tasks you will be doing. For example, on a 24GB GPU that will be running unquantized FLUX without any
-# auxiliary models, 18GB might be a good value.
-max_cache_vram_gb: 18
+# You can override the default value by setting `max_cache_vram_gb`.
+# CAUTION: Most users should not manually set this value. See warning below.
+max_cache_vram_gb: 16
 ```
 
-!!! tip "Max safe value for `max_cache_vram_gb`"
+!!! warning "Max safe value for `max_cache_vram_gb`"
+
+Most users should not manually configure the `max_cache_vram_gb`. This configuration value takes precedence over the `device_working_mem_gb` and any operations that explicitly reserve additional working memory (e.g. VAE decode). As such, manually configuring it increases the likelihood of encountering out-of-memory errors.
 
-To determine the max safe value for `max_cache_vram_gb`, subtract `device_working_mem_gb` from your GPU's VRAM. As described below, the default for `device_working_mem_gb` is 3GB.
+For users who wish to configure `max_cache_vram_gb`, the max safe value can be determined by subtracting `device_working_mem_gb` from your GPU's VRAM. As described below, the default for `device_working_mem_gb` is 3GB.
 
 For example, if you have a 12GB GPU, the max safe value for `max_cache_vram_gb` is `12GB - 3GB = 9GB`.
 
 If you had increased `device_working_mem_gb` to 4GB, then the max safe value for `max_cache_vram_gb` is `12GB - 4GB = 8GB`.
 
+Most users who override `max_cache_vram_gb` are doing so because they wish to use significantly less VRAM, and should be setting `max_cache_vram_gb` to a value significantly less than the 'max safe value'.
+
 ### Working memory
 
 Invoke cannot use _all_ of your VRAM for model caching and loading. It requires some VRAM to use as working memory for various operations.
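
Reviewer note: the "max safe value" rule in the updated warning is plain subtraction, so a minimal Python sketch may help when sanity-checking a config by hand. The function name `max_safe_cache_vram_gb` is hypothetical and not part of Invoke. The 3GB default is the `device_working_mem_gb` default quoted in the docs text above. Clamping the result at zero is an assumption of this sketch, not documented behavior.

```python
def max_safe_cache_vram_gb(gpu_vram_gb: float, device_working_mem_gb: float = 3.0) -> float:
    """Return the largest max_cache_vram_gb the docs consider safe.

    Hypothetical helper: implements only the documented rule
    (GPU VRAM minus device_working_mem_gb). It does not query the GPU.
    """
    if gpu_vram_gb <= 0:
        raise ValueError("gpu_vram_gb must be positive")
    if device_working_mem_gb < 0:
        raise ValueError("device_working_mem_gb must not be negative")
    # Assumption of this sketch: never report a negative cache size.
    return max(float(gpu_vram_gb) - float(device_working_mem_gb), 0.0)


if __name__ == "__main__":
    # The two worked examples from the docs text, for a 12GB GPU.
    print(max_safe_cache_vram_gb(12))     # 9.0 (default 3GB working memory)
    print(max_safe_cache_vram_gb(12, 4))  # 8.0 (working memory raised to 4GB)
```

Per the last added paragraph of the warning, this result is an upper bound. Users who override `max_cache_vram_gb` to reduce VRAM use would pick a value well below it.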
