A multi-purpose homelab running on a single workstation-class node, with PCIe GPU passthrough into one VM that handles AI inference, media transcoding, and cloud gaming workloads.
Overview
| Aspect | Details |
|---|---|
| Host | Dell Precision 3680 |
| CPU | Intel Core i9-14900K |
| RAM | 128 GB |
| GPU | NVIDIA RTX 3090 Ti, 24 GB VRAM (PCIe passthrough) |
| Hypervisor | Proxmox VE 8.4.5 |
| VM | 16 vCPUs, 40 GB RAM, Ubuntu 24.04.3 LTS |
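GPU passthrough, as listed in the table above, generally requires the GPU (and its HDMI audio function) to sit in an IOMMU group that can be handed to the VM as a whole. The sketch below is one way to inspect the groups on the Proxmox host by reading sysfs; it is an illustration rather than the procedure used for this build, and the `root` parameter exists only so the function can be pointed at a different directory.

```python
from pathlib import Path


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the PCI addresses it contains."""
    groups = {}
    base = Path(root)
    if not base.is_dir():
        # A missing directory usually means the IOMMU is not enabled.
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if not group_dir.name.isdigit() or not devices_dir.is_dir():
            continue
        groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    return groups


def group_of(pci_address, groups):
    """Return the group number holding pci_address, or None if it is absent."""
    for number, devices in groups.items():
        if pci_address in devices:
            return number
    return None


if __name__ == "__main__":
    for number, devices in sorted(list_iommu_groups().items()):
        print(f"group {number}: {', '.join(devices)}")
```

If the GPU shares a group with unrelated devices, those devices would have to be passed through together with it, so the output is worth checking before configuring the VM.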
Workload Breakdown
AI Inference & Language Tasks
- Ollama - Local LLM inference
- Whisper / Faster-Whisper - Speech-to-text
- Piper TTS - Text-to-speech
- LibreTranslate - Machine translation
- Stable Diffusion - Image generation
- Viseron - AI-powered video analysis
- Immich - AI-powered photo management
Ollama API Integrations
- Open WebUI (ChatGPT-style interface)
- Nextcloud AI features
- Perplexica (AI search)
- n8n workflow automation
- Home Assistant voice control
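The integrations above all talk to Ollama over its HTTP API. As a rough sketch of what such a client does, the snippet below builds and sends a non-streaming request to the `/api/generate` endpoint using only the standard library. The host, the default port 11434, and the model tag `llama3` are assumptions for illustration, not values taken from this setup.

```python
import json
import urllib.request

# Assumed values: Ollama's default port and a hypothetical model tag.
OLLAMA_URL = "http://localhost:11434"
MODEL = "llama3"


def build_generate_request(prompt, model=MODEL, base_url=OLLAMA_URL):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, timeout=120, **kwargs):
    """Send the prompt and return the generated text from the JSON reply."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.loads(reply.read())["response"]
```

With the server running and the model pulled, `generate("Why is the sky blue?")` would return the completion as a string. Splitting request construction from sending keeps the payload easy to inspect without a live server.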
Media Streaming & Transcoding
- Jellyfin - Media server with hardware transcoding
- Viseron - NVR with AI object detection
- Immich - Photo/video library with ML features
Cloud Gaming Stack
- Sunshine - Game streaming server
- Xorg - Display server
- Steam with Proton + Wine
- Tested: Cyberpunk 2077
- Clients: Moonlight on Android TV, with Xbox controllers for input
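When a Moonlight client cannot find the host, a quick reachability check of Sunshine's TCP ports can narrow the problem down. The sketch below assumes Sunshine's default port layout (47984, 47989, 47990, 48010); these are defaults as far as I know, not values confirmed for this setup, and they shift if the base port is changed. The video, audio, and control streams use UDP and are not covered by this check.

```python
import socket

# Assumed Sunshine defaults for TCP; adjust if the base port was changed.
SUNSHINE_TCP_PORTS = {"https": 47984, "http": 47989, "web_ui": 47990, "rtsp": 48010}


def tcp_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ports(host, ports=SUNSHINE_TCP_PORTS):
    """Map each named port to whether it is reachable on host."""
    return {name: tcp_port_open(host, port) for name, port in ports.items()}
```

Calling `check_ports("gpu-vm.lan")` (a hypothetical hostname) returns a dictionary such as `{"https": True, ...}` that shows which services answer.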
Architecture
┌─────────────────────────────────────────────────────────────────┐
│                       Dell Precision 3680                       │
│                     i9-14900K | 128 GB RAM                      │
│                        Proxmox VE 8.4.5                         │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 │ PCIe passthrough
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                     GPU VM (Ubuntu 24.04.3)                     │
│                      16 vCPUs | 40 GB RAM                       │
│                    RTX 3090 Ti (24 GB VRAM)                     │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │                      AI Workloads                       │   │
│   │   ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐   │   │
│   │   │  Ollama  │ │ Whisper  │ │  Piper   │ │  Stable  │   │   │
│   │   │   LLM    │ │   STT    │ │   TTS    │ │Diffusion │   │   │
│   │   └──────────┘ └──────────┘ └──────────┘ └──────────┘   │   │
│   └─────────────────────────────────────────────────────────┘   │
│                                                                 │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │                     Media Services                      │   │
│   │   ┌──────────┐ ┌──────────┐ ┌──────────┐                │   │
│   │   │ Jellyfin │ │ Viseron  │ │  Immich  │                │   │
│   │   │ (NVENC)  │ │  (NVR)   │ │ (Photos) │                │   │
│   │   └──────────┘ └──────────┘ └──────────┘                │   │
│   └─────────────────────────────────────────────────────────┘   │
│                                                                 │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │                      Cloud Gaming                       │   │
│   │   ┌──────────┐ ┌──────────┐ ┌──────────┐                │   │
│   │   │ Sunshine │ │  Steam   │ │   Xorg   │                │   │
│   │   │(Streamer)│ │ (Proton) │ │(Display) │                │   │
│   │   └──────────┘ └──────────┘ └──────────┘                │   │
│   └─────────────────────────────────────────────────────────┘   │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 │ Network
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                             Clients                             │
│    ┌───────────────┐   ┌───────────────┐   ┌───────────────┐    │
│    │  Open WebUI   │   │Home Assistant │   │   Moonlight   │    │
│    │   (AI Chat)   │   │(Voice Control)│   │ (Game Stream) │    │
│    └───────────────┘   └───────────────┘   └───────────────┘    │
└─────────────────────────────────────────────────────────────────┘
Key Features
- Efficient resource consolidation - A single GPU serves AI, media, and gaming workloads
- PCIe passthrough - Near-native GPU performance inside the VM
- Hardware transcoding - NVENC offloads video encoding for media streaming
- Local AI - Models run on local hardware with no cloud dependency, so data stays on the home network
- Cloud gaming - Stream AAA games to any device that runs a Moonlight client
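Because one GPU serves every workload, VRAM is the resource most likely to run short when several services are active at once. The sketch below shows one possible way to check headroom before starting another GPU job by parsing `nvidia-smi` query output. The `reserve_mib` parameter and the `fits` helper are hypothetical conveniences, not part of the original setup.

```python
import subprocess

QUERY = ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"]


def parse_memory(csv_line):
    """Parse one 'used, total' line (values in MiB) into a pair of integers."""
    used, total = (int(field.strip()) for field in csv_line.split(","))
    return used, total


def vram_headroom_mib(reserve_mib=0):
    """Return free VRAM in MiB on the first GPU, minus an optional reserve."""
    output = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    used, total = parse_memory(output.splitlines()[0])
    return total - used - reserve_mib


def fits(required_mib, free_mib):
    """Return True if a workload needing required_mib fits into free_mib."""
    return required_mib <= free_mib
```

A scheduler script could call `fits(needed, vram_headroom_mib(reserve_mib=2048))` before loading a large model, keeping a reserve for the transcoder or a game session.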
Use Cases
AI Chat & Assistance
A local ChatGPT-style interface through Open WebUI, backed by Ollama, which also serves Home Assistant for voice control.
Photo Management
Immich provides a Google Photos-like experience, with local ML for face recognition, object detection, and smart search.
Media Server
Jellyfin with hardware transcoding supports multiple simultaneous streams with minimal CPU overhead, because video encoding is offloaded to NVENC.
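Jellyfin drives ffmpeg on its own, so nothing like the following needs to be written by hand. As an illustration of what GPU-offloaded transcoding looks like at the ffmpeg level, the sketch below builds a command that decodes with CUDA and encodes H.264 on NVENC. The file names, bitrate, and preset are assumed values, and this is not Jellyfin's actual command line.

```python
import shlex


def nvenc_transcode_cmd(src, dst, bitrate="8M", preset="p5"):
    """Build an ffmpeg argv that decodes on the GPU and encodes H.264 via NVENC."""
    return [
        "ffmpeg",
        "-hwaccel", "cuda",      # decode on the GPU where the codec allows it
        "-i", src,
        "-c:v", "h264_nvenc",    # encode on the GPU's NVENC engine
        "-preset", preset,       # p1 (fastest) to p7 (best quality)
        "-b:v", bitrate,
        "-c:a", "copy",          # leave the audio stream untouched
        dst,
    ]


if __name__ == "__main__":
    print(shlex.join(nvenc_transcode_cmd("input.mkv", "output.mkv")))
```

Returning the arguments as a list keeps the command safe to pass to `subprocess.run` without shell quoting issues.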
Gaming
Stream PC games to an Android TV box or any other Moonlight client with low latency.
Results
This setup shows that a single workstation can cover workloads that might otherwise call for a full server rack. By running AI, media, and gaming services in one GPU-accelerated VM, it keeps the GPU well utilized while the VM boundary isolates those services from the Proxmox host.
Skills Demonstrated
- Proxmox virtualization
- PCI-e GPU passthrough
- Docker containerization
- Large Language Models (Ollama)
- NVIDIA CUDA/NVENC
- Linux system administration
- Media server configuration
- Game streaming setup
A practical blueprint for power users pushing the limits of homelab engineering.
