Unified GPU Homelab: AI, Media & Gaming

A multi-purpose homelab running on a single node: one VM with PCI-e GPU passthrough handles AI inference, media transcoding, and cloud gaming workloads.

Overview

Aspect	Details
Host	Dell Precision 3680
CPU	Intel Core i9-14900K
RAM	128 GB
GPU	NVIDIA RTX 3090 Ti (PCI-e passthrough)
Hypervisor	Proxmox VE 8.4.5
VM	16 vCPUs, 40 GB RAM, Ubuntu 24.04.3 LTS

Workload Breakdown

AI Inference & Language Tasks

Ollama - Local LLM inference
Whisper / Faster-Whisper - Speech-to-text
Piper TTS - Text-to-speech
LibreTranslate - Machine translation
Stable Diffusion - Image generation
Viseron - AI-powered video analysis
Immich - AI-powered photo management

Ollama API Integrations

Open WebUI (ChatGPT-style interface)
Nextcloud AI features
Perplexica (AI search)
n8n workflow automation
Home Assistant voice control
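
Each of the integrations above talks to Ollama over its HTTP API. As a rough illustration, the sketch below sends a single non-streaming prompt to Ollama's /api/generate endpoint. The address (localhost on Ollama's default port 11434) and the model name used in the usage note are assumptions, not details taken from this setup.

```python
import json
import urllib.request

# Assumed address: Ollama's default port on the local machine.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str, url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a POST request for a single, non-streaming completion."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, url: str = OLLAMA_URL, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    request = build_generate_request(model, prompt, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

With a model already pulled (the name "llama3" here is only a placeholder), calling generate("llama3", "Say hello in one sentence.") would return the reply as a string. Clients such as Open WebUI or n8n make equivalent calls on the user's behalf.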

Media Streaming & Transcoding

Jellyfin - Media server with hardware transcoding
Viseron - NVR with AI object detection
Immich - Photo/video library with ML features

Cloud Gaming Stack

Sunshine - Game streaming server
Xorg - Display server
Steam with Proton + Wine
Tested: Cyberpunk 2077
Clients: Moonlight (Android TV), Xbox controllers
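
Before pairing a Moonlight client, it can help to confirm that the Sunshine host is reachable over the network. The sketch below probes a handful of TCP ports. The port numbers are Sunshine's documented defaults as far as I know, so treat them as an assumption and adjust them if the base port was changed. The host name in the usage note is hypothetical.

```python
import socket

# Assumed Sunshine default TCP ports; change these if the base port differs.
SUNSHINE_TCP_PORTS = {
    "https": 47984,
    "http": 47989,
    "web_ui": 47990,
    "rtsp": 48010,
}


def tcp_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_sunshine(host: str) -> dict:
    """Map each assumed Sunshine service name to its reachability."""
    return {name: tcp_port_open(host, port) for name, port in SUNSHINE_TCP_PORTS.items()}
```

Calling check_sunshine("gpu-vm.lan") returns a dictionary of booleans, one per service. This only covers TCP, so the UDP streams used for video and audio would need a separate check.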

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                  Dell Precision 3680                             │
│              i9-14900K | 128 GB RAM                             │
│                   Proxmox VE 8.4.5                               │
└─────────────────────────────────────────────────────────────────┘
                            │
                            │ PCI-e Passthrough
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│                   GPU VM (Ubuntu 24.04.3)                        │
│                  16 vCPUs | 40 GB RAM                           │
│                   RTX 3090 Ti (24GB VRAM)                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │                    AI Workloads                          │   │
│  │  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐   │   │
│  │  │  Ollama  │ │ Whisper  │ │  Piper   │ │ Stable   │   │   │
│  │  │   LLM    │ │   STT    │ │   TTS    │ │Diffusion │   │   │
│  │  └──────────┘ └──────────┘ └──────────┘ └──────────┘   │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                  │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │                  Media Services                          │   │
│  │  ┌──────────┐ ┌──────────┐ ┌──────────┐                │   │
│  │  │ Jellyfin │ │ Viseron  │ │  Immich  │                │   │
│  │  │ (NVENC)  │ │  (NVR)   │ │ (Photos) │                │   │
│  │  └──────────┘ └──────────┘ └──────────┘                │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                  │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │                  Cloud Gaming                            │   │
│  │  ┌──────────┐ ┌──────────┐ ┌──────────┐                │   │
│  │  │ Sunshine │ │  Steam   │ │   Xorg   │                │   │
│  │  │(Streamer)│ │ (Proton) │ │(Display) │                │   │
│  │  └──────────┘ └──────────┘ └──────────┘                │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
                            │
                            │ Network
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│                      Clients                                     │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐          │
│  │  Open WebUI  │  │Home Assistant│  │   Moonlight  │          │
│  │  (AI Chat)   │  │ (Voice Ctrl) │  │(Game Stream) │          │
│  └──────────────┘  └──────────────┘  └──────────────┘          │
└─────────────────────────────────────────────────────────────────┘

Key Features

Efficient resource consolidation - Single GPU serves multiple workloads
PCI-e passthrough - Near-native GPU performance inside the VM
Hardware transcoding - NVENC for media streaming
Local AI - No cloud dependencies, complete privacy
Cloud gaming - Stream AAA games to any device
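
Because a single GPU serves every workload, it is useful to see how much VRAM is in use inside the VM at any moment. The sketch below wraps nvidia-smi's CSV query mode and parses its output. It assumes the NVIDIA driver and nvidia-smi are installed in the guest, and the sample figures in the usage note are illustrative rather than measured.

```python
import subprocess

# nvidia-smi CSV query: one line per GPU, memory values in MiB without units.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.total,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_line(line: str) -> dict:
    """Parse one 'name, total, used' CSV line into a dict with free memory added."""
    name, total, used = [field.strip() for field in line.rsplit(",", 2)]
    total_mib, used_mib = int(total), int(used)
    return {
        "name": name,
        "total_mib": total_mib,
        "used_mib": used_mib,
        "free_mib": total_mib - used_mib,
    }


def query_gpus() -> list:
    """Run nvidia-smi and return one parsed dict per visible GPU."""
    output = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return [parse_gpu_line(line) for line in output.splitlines() if line.strip()]
```

For an input line such as "NVIDIA GeForce RTX 3090 Ti, 24564, 1024", parse_gpu_line reports 23540 MiB free. If query_gpus() returns an empty list or raises inside the VM, the passthrough or driver setup is the first thing to check.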

Use Cases

AI Chat & Assistance

Open WebUI provides a local ChatGPT-style interface on top of Ollama, and Home Assistant uses the same Ollama API for voice control.

Photo Management

Immich provides a Google Photos-like experience, with local ML for face recognition, object detection, and smart search.

Media Server

Jellyfin offloads video encoding to NVENC on the GPU, so it can serve multiple simultaneous streams with minimal CPU overhead.
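
To make the idea of GPU-offloaded transcoding concrete, the sketch below assembles an ffmpeg command line that decodes with CUDA and encodes with the h264_nvenc encoder. This is not the command Jellyfin generates internally. It is a hand-written illustration, and the file names and bitrate in the usage note are hypothetical.

```python
def nvenc_transcode_args(src: str, dst: str, bitrate: str = "8M") -> list:
    """Build an ffmpeg argument list for a GPU-accelerated H.264 transcode."""
    return [
        "ffmpeg",
        "-hwaccel", "cuda",    # decode on the GPU where the codec allows it
        "-i", src,
        "-c:v", "h264_nvenc",  # encode video with NVENC instead of the CPU
        "-b:v", bitrate,       # target video bitrate
        "-c:a", "copy",        # pass the audio stream through untouched
        dst,
    ]
```

The returned list can be passed to subprocess.run, for example subprocess.run(nvenc_transcode_args("input.mkv", "output.mp4"), check=True). It requires an ffmpeg build with NVENC support.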

Gaming

Stream PC games to an Android TV box or any other Moonlight client with low latency.

Results

This setup shows that high-performance versatility is achievable without a full-scale server rack. By running AI, media, and gaming workloads in a single GPU-accelerated VM, it maximizes hardware utilization while keeping those workloads isolated from the Proxmox host.

Skills Demonstrated

Proxmox virtualization
PCI-e GPU passthrough
Docker containerization
Large Language Models (Ollama)
NVIDIA CUDA/NVENC
Linux system administration
Media server configuration
Game streaming setup

A practical blueprint for power users pushing the limits of homelab engineering.
