
GPU VRAM Calculator

Estimate the GPU VRAM needed to run or train a model.

Privacy note. Runs in your browser. The values you enter stay on this page and aren't sent anywhere.
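Model size: enter the parameter count in billions. For example, a 7B model is 7.
Precision: lower precision (quantization) uses fewer bytes per parameter.
KV cache bytes per element: 2 for an fp16 KV cache, 1 for 8-bit.
Activation overhead: extra memory for activations and runtime buffers.
Training mode: adds gradients and Adam optimizer state on top of the weights.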
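
As a rough illustration of what training mode can add, the sketch below assumes one gradient value per parameter stored at 2 bytes (fp16) and two fp32 Adam moments per parameter (8 bytes in total). These byte counts and the name `training_extra_gib` are assumptions for illustration; the calculator may use different values.

```python
GIB = 1024 ** 3  # bytes per GiB


def training_extra_gib(params_billion, grad_bytes=2.0, adam_bytes=8.0):
    """Return (gradients_gib, adam_state_gib) for a model of the given size.

    Assumptions (not taken from the calculator): one gradient value per
    parameter stored in grad_bytes, and two fp32 Adam moments per
    parameter (4 + 4 = 8 bytes).
    """
    params = params_billion * 1e9
    gradients = params * grad_bytes / GIB
    adam_state = params * adam_bytes / GIB
    return gradients, adam_state


grads, adam = training_extra_gib(7)
print(f"gradients ~{grads:.1f} GiB, Adam state ~{adam:.1f} GiB")
```

Under these assumptions a 7B model needs roughly 13 GiB for gradients and 52 GiB for Adam state, in addition to the weights themselves.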
These are rough estimates. Real VRAM use depends on the framework, attention kernel, and runtime; always verify before provisioning hardware.
Model weights: 13 GiB
KV cache: 2 GiB
Activation overhead: 3 GiB
Estimated total VRAM: 18 GiB

Frequently asked questions

Weights come from the parameter count times the bytes per parameter for your chosen precision (fp16 = 2, 8-bit = 1, 4-bit = 0.5). On top of that we add the KV cache (which grows with hidden size, layers, context length, and batch size) plus a configurable activation overhead. Training mode also adds gradients and Adam optimizer state.
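
The inference part of this estimate can be sketched in a few lines of Python. The function name, the default layer count, hidden size, context length, and the 20% overhead applied to weights plus KV cache are all assumptions chosen for illustration, and the KV cache term assumes standard multi-head attention (one key and one value vector of width `hidden_size` per layer per token). The calculator's actual formula may differ.

```python
GIB = 1024 ** 3  # bytes per GiB


def estimate_inference_vram(
    params_billion,
    bytes_per_param=2.0,    # fp16 = 2, 8-bit = 1, 4-bit = 0.5
    num_layers=32,          # assumed default, typical of a 7B model
    hidden_size=4096,       # assumed default
    context_length=4096,    # assumed default
    batch_size=1,
    kv_bytes=2.0,           # 2 for fp16 KV cache, 1 for 8-bit
    overhead_fraction=0.20, # assumed activation overhead setting
):
    """Return a dict of estimated memory components in GiB."""
    weights = params_billion * 1e9 * bytes_per_param
    # Factor of 2 covers keys and values, stored for every layer and token.
    kv_cache = (2 * num_layers * hidden_size * context_length
                * batch_size * kv_bytes)
    # Assumption: overhead is a fraction of weights plus KV cache.
    overhead = overhead_fraction * (weights + kv_cache)
    total = weights + kv_cache + overhead
    return {
        "weights": weights / GIB,
        "kv_cache": kv_cache / GIB,
        "overhead": overhead / GIB,
        "total": total / GIB,
    }


result = estimate_inference_vram(7)
for name, gib in result.items():
    print(f"{name}: {gib:.1f} GiB")
```

With these assumed defaults, a 7B fp16 model comes out at about 13 GiB of weights, 2 GiB of KV cache, 3 GiB of overhead, and 18 GiB in total. Halving `bytes_per_param` to 1 (8-bit) halves the weights term but leaves the KV cache unchanged unless `kv_bytes` is lowered as well.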

Last updated 2026-06-23.