Best VPS for Stable Diffusion 2026: Generate AI Images on Your Own Server
Find the best VPS for running Stable Diffusion. Compare GPU servers for AI image generation, SDXL, Flux, and ComfyUI hosting in 2026.
Best VPS for Stable Diffusion in 2026
Running Stable Diffusion on your own server means unlimited image generation with no per-image fees, no content filters you didn’t ask for, and full control over models and workflows. If you’re also interested in text generation, check out our guide on the best VPS for LLM hosting. The catch? You need a GPU. Here’s what actually works.
Why Self-Host Stable Diffusion?
Services like Midjourney and DALL-E charge per image. At scale — generating product shots, game assets, marketing materials, or training LoRAs — costs pile up fast. A self-hosted setup flips the economics:
- Unlimited generations for a fixed monthly cost
- No content restrictions beyond what you choose
- Custom models and LoRAs — fine-tune on your own data
- API access for automation and pipelines (pair with n8n for workflow automation)
- Privacy — your prompts and outputs stay on your hardware
The math: Midjourney Pro costs $60/month for ~900 fast images. A GPU VPS at $50-80/month gives you thousands of images per day, 24/7.
What Hardware Does Stable Diffusion Need?
Stable Diffusion is a GPU workload. CPU generation exists but is painfully slow — a single 512x512 image can take 5+ minutes versus 3 seconds on a decent GPU.
VRAM Requirements by Model
| Model | Minimum VRAM | Recommended VRAM | Image Size |
|---|---|---|---|
| SD 1.5 | 4GB | 8GB | 512x512 |
| SDXL | 8GB | 12GB | 1024x1024 |
| Flux.1 Dev | 12GB | 16GB+ | 1024x1024 |
| Flux.1 Schnell | 8GB | 12GB | 1024x1024 |
| SD 3.5 Medium | 8GB | 12GB | 1024x1024 |
| SD 3.5 Large | 12GB | 16GB+ | 1024x1024 |
Key insight: VRAM is the bottleneck, not CPU or system RAM. A 16GB GPU handles every current model comfortably. 24GB gives you room for large batch sizes and inpainting workflows.
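As a quick sanity check, the table above can be encoded as a small lookup. This is a sketch with hypothetical names (`MIN_VRAM_GB`, `fits_in_vram`) that aren't part of any real tool:

```python
# Minimum VRAM (GB) per model, taken from the table above.
MIN_VRAM_GB = {
    "sd15": 4,
    "sdxl": 8,
    "flux1-dev": 12,
    "flux1-schnell": 8,
    "sd35-medium": 8,
    "sd35-large": 12,
}

def fits_in_vram(model: str, gpu_vram_gb: int) -> bool:
    """Return True if the GPU meets the model's minimum VRAM requirement."""
    return gpu_vram_gb >= MIN_VRAM_GB[model]

# A 16GB card clears every current model's minimum.
print(all(fits_in_vram(m, 16) for m in MIN_VRAM_GB))  # True
```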
Speed Expectations
Generation speed depends on the GPU, model, resolution, and steps. Rough benchmarks for a single 1024x1024 SDXL image at 30 steps:
| GPU | ~Time per Image | Monthly Cost (Hetzner) |
|---|---|---|
| NVIDIA A100 (40GB) | 2-3 sec | ~€320/mo |
| NVIDIA L40S (48GB) | 3-5 sec | ~€250/mo |
| NVIDIA A40 (48GB) | 4-6 sec | ~€200/mo |
| NVIDIA RTX 4090 (24GB) | 2-4 sec | Not widely available |
| NVIDIA A10 (24GB) | 6-10 sec | ~€100/mo |
| NVIDIA T4 (16GB) | 15-25 sec | ~€50/mo |
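To translate per-image times into daily throughput, a rough back-of-the-envelope calculation helps. This hypothetical helper ignores queue overhead, model loads, and idle time unless you pass a `utilization` factor:

```python
def images_per_day(seconds_per_image: float, utilization: float = 1.0) -> int:
    """Rough daily throughput for a GPU generating around the clock."""
    return int(86_400 * utilization / seconds_per_image)

# Midpoints from the table above:
print(images_per_day(8.0))   # A10 at ~8 s/image: 10800 images/day
print(images_per_day(20.0))  # T4 at ~20 s/image: 4320 images/day
```

Even the slowest card in the table produces thousands of images per day, which is the basis for the break-even math later in this article.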
Best GPU VPS for Stable Diffusion
1. Hetzner GPU Servers — Best Overall Value
Hetzner offers dedicated GPU servers with NVIDIA A100 and L40S cards at prices that make cloud giants look predatory.
Why Hetzner wins:
- Hourly billing — spin up when you need it, shut down when you don’t
- European data centers with excellent connectivity
- Dedicated GPUs, not shared — predictable performance
- Competitive pricing compared to AWS/GCP/Azure
Best config for Stable Diffusion:
- Budget: EX44-GPU (NVIDIA A10, 24GB VRAM) — handles SDXL and Flux comfortably
- Performance: GEX44 (NVIDIA A100, 40GB VRAM) — fast generations, large batches
Quick start:
```bash
# Install NVIDIA drivers + CUDA
sudo apt update && sudo apt install -y nvidia-driver-535 nvidia-cuda-toolkit

# Run ComfyUI with Docker
docker run -d --gpus all -p 8188:8188 \
  -v comfyui-data:/workspace \
  ghcr.io/ai-dock/comfyui:latest
```
2. Vultr Cloud GPU — Best Global Coverage
Vultr offers NVIDIA A100, A40, and L40S GPUs across data centers worldwide. If you need a GPU server close to your users — Asia, South America, or multiple US regions — Vultr has the best geographic spread.
Standout features:
- 32 data center locations
- Hourly billing with no commitment
- A100, A40, L40S options
- Good API for automation
Best for: Teams needing GPU servers in specific regions, automated image pipelines with geographic requirements.
3. Lambda Cloud — Built for AI Workloads
From $0.50/hr | NVIDIA A10, A100, H100 options
Lambda specializes in machine learning infrastructure. Their instances come pre-loaded with CUDA, PyTorch, and common ML libraries. Less setup, more generating.
Best for: ML engineers who want a ready-to-go environment without configuring CUDA drivers.
4. Hostinger VPS — Budget Entry Point
Hostinger doesn’t offer GPU servers, but their high-RAM VPS plans can run Stable Diffusion on CPU for occasional use or serve as a front-end for a GPU-accelerated backend.
Use case: Host the ComfyUI web interface and API proxy on Hostinger, route heavy generation to a GPU instance. This way you keep a cheap always-on endpoint while only paying for GPU time when generating.
5. Contabo — Cheap Storage for Models
Running Stable Diffusion means storing models. A single SDXL checkpoint is 6-7GB. Add LoRAs, VAEs, ControlNet models, and upscalers — you’re easily looking at 50-100GB+ of model files.
Contabo’s strength is massive storage at low prices. Use a Contabo instance as your model repository and pair it with a GPU server for inference.
Best Software for Self-Hosted Stable Diffusion
ComfyUI — The Power User’s Choice
ComfyUI is a node-based workflow editor for Stable Diffusion. It’s the most flexible option, supports every model and technique, and has an active extension ecosystem.
```bash
# Docker setup (recommended)
docker run -d --gpus all \
  -p 8188:8188 \
  -v /models:/workspace/ComfyUI/models \
  ghcr.io/ai-dock/comfyui:latest

# Or manual install
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt
python main.py --listen 0.0.0.0
```
Why ComfyUI:
- Node-based workflows for complex pipelines
- Supports SD 1.5, SDXL, SD3, Flux, and more
- ControlNet, IP-Adapter, InstantID built-in
- API mode for automation
- Lower VRAM usage than alternatives
Automatic1111 (Forge) — The Classic
A1111 with the Forge backend is still popular for its simplicity. Good for users who want a traditional web UI with extensions.
```bash
git clone https://github.com/lllyasviel/stable-diffusion-webui-forge.git
cd stable-diffusion-webui-forge
./webui.sh --listen --api
```
InvokeAI — The Polished Option
InvokeAI offers a clean, modern UI with good workflow management. Best for artists who want a more curated experience.
```bash
pip install invokeai
invokeai-web --host 0.0.0.0
```
Production Setup: ComfyUI Behind a Reverse Proxy
Don’t expose ComfyUI directly to the internet. Use Caddy or Nginx as a reverse proxy with authentication:
```
# Caddyfile
sd.yourdomain.com {
    basicauth {
        admin $2a$14$your_hashed_password
    }
    reverse_proxy localhost:8188
}
```
For API-only access (feeding images to your apps):
```bash
# Generate via ComfyUI API
curl -X POST http://localhost:8188/prompt \
  -H "Content-Type: application/json" \
  -d '{"prompt": {...your_workflow_json...}}'
```
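The same call can be made from Python with only the standard library. The workflow JSON itself is whatever ComfyUI exports via "Save (API Format)"; the helper names below (`build_prompt_request`, `queue_prompt`) are hypothetical, but the `/prompt` endpoint and the `{"prompt": ...}` envelope are ComfyUI's real API shape:

```python
import json
import urllib.request

def build_prompt_request(
    workflow: dict, host: str = "http://localhost:8188"
) -> urllib.request.Request:
    """Wrap an exported workflow in the envelope ComfyUI's /prompt endpoint expects."""
    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"{host}/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

def queue_prompt(workflow: dict) -> bytes:
    """Send the request and return ComfyUI's raw JSON response."""
    with urllib.request.urlopen(build_prompt_request(workflow)) as resp:
        return resp.read()
```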
Cost Comparison: Self-Hosted vs Services
| | Midjourney Pro | DALL-E 3 | Self-Hosted (A10) | Self-Hosted (A100) |
|---|---|---|---|---|
| Monthly cost | $60 | Pay per image | ~$100/mo | ~$320/mo |
| Images/month | ~900 fast | ~1,000 ($0.04 each) | Unlimited | Unlimited |
| Custom models | No | No | Yes | Yes |
| LoRA training | No | No | Yes | Yes |
| API access | Limited | Yes | Full | Full |
| Content restrictions | Yes | Yes | None | None |
| Quality (subjective) | Excellent | Good | Depends on model | Depends on model |
Break-even: If you generate more than ~2,500 images per month, self-hosting on an A10 is cheaper than DALL-E. For Midjourney-level quality with Flux models, the break-even is even lower.
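The break-even figure is simple arithmetic. A sketch using the prices quoted in the comparison table (the helper name is made up for illustration):

```python
def break_even_images(server_monthly_usd: float, per_image_usd: float) -> int:
    """Number of pay-per-image generations that cost as much as a month of server."""
    return round(server_monthly_usd / per_image_usd)

# DALL-E 3 at ~$0.04/image vs. an A10 server at ~$100/month:
print(break_even_images(100.0, 0.04))  # 2500
```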
Optimizing Performance
1. Use the Right Precision
FP16 (half precision) is standard. FP8 cuts VRAM usage further with minimal quality loss on supported GPUs (RTX 40-series, A100):
ComfyUI supports FP8 natively: select `fp8_e4m3fn` as the weight type in the checkpoint loader node.
2. Enable xFormers or Flash Attention
```bash
# For A1111/Forge
./webui.sh --xformers
```

ComfyUI ships with optimized attention enabled by default, so no flag is needed.
3. Use Tiled VAE for High-Resolution
Generating images above 2048x2048 can OOM. Tiled VAE decoding prevents this:
In ComfyUI, use the "VAE Decode (Tiled)" node; in A1111, enable the "Tiled VAE" extension.
4. Batch Processing
For bulk generation, queue multiple prompts and let them process sequentially. ComfyUI’s API mode handles this natively.
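A bulk run usually means submitting the same workflow many times with a different seed each time. A sketch of the queue-building step, assuming a hypothetical workflow stub (`BASE_WORKFLOW`) in place of a real ComfyUI "Save (API Format)" export:

```python
import copy

# A minimal workflow stub -- a real one comes from ComfyUI's "Save (API Format)".
BASE_WORKFLOW = {
    "3": {"class_type": "KSampler", "inputs": {"seed": 0, "steps": 30}},
}

def batch_workflows(prompt_count: int, base: dict = BASE_WORKFLOW) -> list[dict]:
    """Produce one workflow per image, bumping the seed so outputs differ."""
    batch = []
    for i in range(prompt_count):
        wf = copy.deepcopy(base)
        wf["3"]["inputs"]["seed"] = i
        batch.append(wf)
    return batch

# Each workflow would be POSTed to /prompt in turn; ComfyUI queues them
# and processes them sequentially on the GPU.
jobs = batch_workflows(4)
print(len(jobs), [w["3"]["inputs"]["seed"] for w in jobs])  # 4 [0, 1, 2, 3]
```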
5. Model Caching
Keep frequently used models loaded in VRAM. Switching models takes 5-15 seconds depending on size. If you mostly use one model, keep it warm.
Security Considerations
- Authentication — Never expose ComfyUI/A1111 without auth. Anyone with access can generate anything. See our VPS security guide for hardening tips.
- Firewall — only open port 443 (HTTPS via the reverse proxy). Block direct access to 8188/7860.
- Storage — Generated images can fill disks fast. Set up auto-cleanup or external storage.
- Updates — Keep ComfyUI and models updated. Security patches matter.
- Resource limits — Set max resolution and batch size limits to prevent GPU memory exhaustion.
Our Recommendation
For serious image generation: Hetzner GPU servers with ComfyUI. An A10 (24GB) handles everything from SDXL to Flux at a reasonable price. Scale up to an A100 if you need speed or are serving multiple users.
For occasional use: Rent GPU time hourly from Hetzner or Lambda. Spin up when you need to generate, shut down when you’re done. A few hours of A100 time costs less than a coffee.
For teams and production: Vultr or Lambda for geographic flexibility and pre-configured environments. Pair with a Hostinger VPS as your always-on API gateway.
For experimenting: Start with a Hostinger VPS and run CPU inference to learn the tools. It’s slow, but it’s cheap and teaches you the workflow before committing to GPU costs.
Self-hosted Stable Diffusion gives you unlimited creative power at a fixed cost. The tools are mature, the models are incredible, and a $100/month GPU server replaces thousands in API fees. Pick a provider, install ComfyUI, and start generating.
Ready to get started?
Get the best VPS hosting deal today. Hostinger offers 4GB RAM VPS starting at just $4.99/mo.
Get Hostinger VPS from $4.99/mo (up to 75% off, free domain included)
Andrius Putna
I am Andrius Putna. Geek. In love with tinkering with web technologies since the early 2000s, now AI. Bridging business and technology to drive meaningful impact, combining expertise in customer experience, technology, and business strategy to deliver valuable insights. Father, open-source contributor, investor, 2x Ironman, MBA graduate.
// last updated: March 5, 2026. Disclosure: This article may contain affiliate links.