One Dell Precision 3680 with an RTX 3090 Ti passed through to an Ubuntu VM. It runs local LLMs, transcodes media, and streams PC games — all from the same GPU, depending on what I need at the time.

Overview

AspectDetails
HostDell Precision 3680
CPUIntel Core i9-14900K
RAM128 GB
GPUNVIDIA RTX 3090 Ti (PCI-e passthrough)
HypervisorProxmox VE 8.4.5
VM16 vCPUs, 40 GB RAM, Ubuntu 24.04.3 LTS

Workload Breakdown

AI Inference & Language Tasks

  • Ollama - Local LLM inference
  • Whisper / Faster-Whisper - Speech-to-text
  • Piper TTS - Text-to-speech
  • LibreTranslate - Machine translation
  • Stable Diffusion - Image generation
  • Viseron - AI-powered video analysis
  • Immich - AI-powered photo management

Ollama API Integrations

  • Open WebUI (ChatGPT-style interface)
  • Nextcloud AI features
  • Perplexica (AI search)
  • n8n workflow automation
  • Home Assistant voice control

Media Streaming & Transcoding

  • Jellyfin - Media server with hardware transcoding
  • Viseron - NVR with AI object detection
  • Immich - Photo/video library with ML features

Cloud Gaming Stack

  • Sunshine - Game streaming server
  • Xorg - Display server
  • Steam with Proton + Wine
  • Tested: Cyberpunk 2077
  • Clients: Moonlight (Android TV), Xbox controllers

Architecture

┌─────────────────────────────────────────────────────────────────┐
                  Dell Precision 3680                             
              i9-14900K | 128 GB RAM                             
                   Proxmox VE 8.4.5                               
└─────────────────────────────────────────────────────────────────┘
                            
                             PCI-e Passthrough
                            
┌─────────────────────────────────────────────────────────────────┐
                   GPU VM (Ubuntu 24.04.3)                        
                  16 vCPUs | 40 GB RAM                           
                   RTX 3090 Ti (24GB VRAM)                        
├─────────────────────────────────────────────────────────────────┤
                                                                  
  ┌─────────────────────────────────────────────────────────┐   
                      AI Workloads                             
    ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐      
      Ollama    Whisper     Piper     Stable         
       LLM        STT        TTS     Diffusion       
    └──────────┘ └──────────┘ └──────────┘ └──────────┘      
  └─────────────────────────────────────────────────────────┘   
                                                                  
  ┌─────────────────────────────────────────────────────────┐   
                    Media Services                             
    ┌──────────┐ ┌──────────┐ ┌──────────┐                   
     Jellyfin   Viseron     Immich                     
     (NVENC)     (NVR)     (Photos)                    
    └──────────┘ └──────────┘ └──────────┘                   
  └─────────────────────────────────────────────────────────┘   
                                                                  
  ┌─────────────────────────────────────────────────────────┐   
                    Cloud Gaming                               
    ┌──────────┐ ┌──────────┐ ┌──────────┐                   
     Sunshine    Steam       Xorg                      
    (Streamer)  (Proton)  (Display)                    
    └──────────┘ └──────────┘ └──────────┘                   
  └─────────────────────────────────────────────────────────┘   
                                                                  
└─────────────────────────────────────────────────────────────────┘
                            
                             Network
                            
┌─────────────────────────────────────────────────────────────────┐
                      Clients                                     
  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐          
    Open WebUI    Home Assistant     Moonlight            
    (AI Chat)     (Voice Control)  (Game Stream)           
  └──────────────┘  └──────────────┘  └──────────────┘          
└─────────────────────────────────────────────────────────────────┘

What It Does

The GPU is shared across three types of work:

AI — Open WebUI gives me a local ChatGPT-style interface. Ollama handles inference, Whisper does speech-to-text, Piper does text-to-speech. Home Assistant uses it for voice control. Immich uses it for face recognition and photo search. Nothing leaves my network.

Media — Jellyfin uses NVENC for hardware transcoding, so multiple streams don’t touch the CPU. Viseron runs AI object detection on the security cameras. Immich handles the photo library.

Gaming — Sunshine streams games to Moonlight clients. I’ve played Cyberpunk 2077 on an Android TV box in the living room with an Xbox controller. Latency is low enough that I forgot it wasn’t local.


One GPU, three jobs. The VM handles the scheduling — I just use whatever I need.