Ollama

One Dell Precision 3680 with an RTX 3090 Ti passed through to an Ubuntu VM. It runs local LLMs, transcodes media, and streams PC games, all from the same GPU, depending on what I need at the time.

Overview

| Aspect     | Details                                 |
|------------|-----------------------------------------|
| Host       | Dell Precision 3680                     |
| CPU        | Intel Core i9-14900K                    |
| RAM        | 128 GB                                  |
| GPU        | NVIDIA RTX 3090 Ti (PCI-e passthrough)  |
| Hypervisor | Proxmox VE 8.4.5                        |
| VM         | 16 vCPUs, 40 GB RAM, Ubuntu 24.04.3 LTS |

Workload Breakdown

AI Inference & Language Tasks

- Ollama - Local LLM inference
- Whisper / Faster-Whisper - Speech-to-text
- Piper TTS - Text-to-speech
- LibreTranslate - Machine translation
- Stable Diffusion - Image generation
- Viseron - AI-powered video analysis
- Immich - AI-powered photo management

Ollama API Integrations

- Open WebUI (ChatGPT-style interface)
- Nextcloud AI features
- Perplexica (AI search)
- n8n workflow automation
- Home Assistant voice control

Media Streaming & Transcoding

- Jellyfin - Media server with hardware transcoding
- Viseron - NVR with AI object detection
- Immich - Photo/video library with ML features

Cloud Gaming Stack

- Sunshine - Game streaming server
- Xorg - Display server
- Steam with Proton + Wine
- Tested: Cyberpunk 2077
- Clients: Moonlight (Android TV), Xbox controllers

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                       Dell Precision 3680                       │
│                     i9-14900K | 128 GB RAM                      │
│                        Proxmox VE 8.4.5                         │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 │ PCI-e Passthrough
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                     GPU VM (Ubuntu 24.04.3)                     │
│                      16 vCPUs | 40 GB RAM                       │
│                     RTX 3090 Ti (24GB VRAM)                     │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │                      AI Workloads                       │   │
│   │   ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐   │   │
│   │   │  Ollama  │ │ Whisper  │ │  Piper   │ │  Stable  │   │   │
│   │   │   LLM    │ │   STT    │ │   TTS    │ │Diffusion │   │   │
│   │   └──────────┘ └──────────┘ └──────────┘ └──────────┘   │   │
│   └─────────────────────────────────────────────────────────┘   │
│                                                                 │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │                     Media Services                      │   │
│   │   ┌──────────┐ ┌──────────┐ ┌──────────┐                │   │
│   │   │ Jellyfin │ │ Viseron  │ │  Immich  │                │   │
│   │   │ (NVENC)  │ │  (NVR)   │ │ (Photos) │                │   │
│   │   └──────────┘ └──────────┘ └──────────┘                │   │
│   └─────────────────────────────────────────────────────────┘   │
│                                                                 │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │                      Cloud Gaming                       │   │
│   │   ┌──────────┐ ┌──────────┐ ┌──────────┐                │   │
│   │   │ Sunshine │ │  Steam   │ │   Xorg   │                │   │
│   │   │(Streamer)│ │ (Proton) │ │(Display) │                │   │
│   │   └──────────┘ └──────────┘ └──────────┘                │   │
│   └─────────────────────────────────────────────────────────┘   │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 │ Network
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                             Clients                             │
│    ┌───────────────┐   ┌───────────────┐   ┌───────────────┐    │
│    │  Open WebUI   │   │Home Assistant │   │   Moonlight   │    │
│    │   (AI Chat)   │   │(Voice Control)│   │ (Game Stream) │    │
│    └───────────────┘   └───────────────┘   └───────────────┘    │
└─────────────────────────────────────────────────────────────────┘

What It Does

The GPU is shared across three types of work: ...
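The Ollama API integrations listed in this post all talk to the same HTTP endpoint on the GPU VM. As a rough illustration of what such a client sends, here is a minimal Python sketch against Ollama's `/api/generate` route. The host and port (`localhost:11434`, Ollama's default), the model tag `gemma`, and the helper names are assumptions for illustration, not details taken from the post.

```python
import json
import urllib.request

# Assumption: Ollama listening on its default port on the same machine.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt, model="gemma", url=OLLAMA_URL):
    """Build a non-streaming POST request for Ollama's /api/generate route.

    The model tag "gemma" is a placeholder; use whatever model is pulled locally.
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="gemma", timeout=120):
    """Send the prompt and return the generated text from the JSON reply."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read())
    # With "stream": False, Ollama returns one JSON object whose
    # "response" field holds the full completion.
    return body["response"]
```

Calling `generate("Summarize this paragraph: ...")` from any machine that can reach the VM would return the completion as a string, provided the URL is adjusted to the VM's address.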

An n8n workflow that watches my blog RSS feed, summarizes new posts with a local LLM (Gemma on Ollama), and cross-posts to Facebook, LinkedIn, X, and Instagram. I write the post once; the workflow handles the rest.

Overview

| Aspect      | Details                       |
|-------------|-------------------------------|
| Platform    | Self-hosted n8n               |
| LLM         | Gemma via Ollama (local GPU)  |
| Social APIs | Postiz                        |
| Scheduling  | Cron trigger (every 10 min)   |
| Hardware    | RTX 3090 Ti for inference     |

How It Works

1. Fetch - RSS feed polling for new blog posts
2. Summarize - Ollama (Gemma) generates platform-specific summaries
3. Extract - Parse HTML for images, download and resize
4. Generate - AI creates relevant hashtags
5. Publish - Postiz API posts to all platforms
6. Dedupe - Hash-based tracking prevents duplicates

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                          Hugo RSS Feed                          │
└────────────────────────────────┬────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                          n8n Workflow                           │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │                  Cron Trigger (10 min)                  │   │
│   └────────────────────────────┬────────────────────────────┘   │
│                                │                                │
│   ┌────────────────────────────▼────────────────────────────┐   │
│   │              RSS Feed Node (Fetch Latest)               │   │
│   └────────────────────────────┬────────────────────────────┘   │
│                                │                                │
│   ┌────────────────────────────▼────────────────────────────┐   │
│   │            Hash Check (Duplicate Prevention)            │   │
│   └────────────────────────────┬────────────────────────────┘   │
│                                │                                │
│   ┌────────────────────────────▼────────────────────────────┐   │
│   │                 Ollama Node (Gemma LLM)                 │   │
│   │           • Summarize (platform char limits)            │   │
│   │           • Generate hashtags                           │   │
│   └────────────────────────────┬────────────────────────────┘   │
│                                │                                │
│   ┌────────────────────────────▼────────────────────────────┐   │
│   │              Image Extraction & Processing              │   │
│   │           • HTML parsing                                │   │
│   │           • Download & resize                           │   │
│   └────────────────────────────┬────────────────────────────┘   │
│                                │                                │
│   ┌────────────────────────────▼────────────────────────────┐   │
│   │                       Postiz API                        │   │
│   │  ┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐  │   │
│   │  │Facebook │   │LinkedIn │   │    X    │   │Instagram│  │   │
│   │  └─────────┘   └─────────┘   └─────────┘   └─────────┘  │   │
│   └─────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘

What Was Harder Than Expected

Getting the LLM to respect character limits was the recurring headache. X caps posts at 280 characters; LinkedIn allows more room. The prompt engineering for consistent output length took more iteration than the actual workflow logic. ...
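Two pieces of this workflow are easy to show outside n8n: the hash-based duplicate check and a hard backstop for character limits when the model overshoots. The Python sketch below illustrates the idea only; it is not the workflow's actual node logic. The function names, the choice of SHA-256 over link plus title, the in-memory `seen` set, and the LinkedIn figure are all assumptions. Only X's 280-character limit comes from the post.

```python
import hashlib

# Only the 280 for X is stated in the post; the LinkedIn value is a placeholder.
PLATFORM_LIMITS = {"x": 280, "linkedin": 3000}


def post_hash(link, title):
    """Stable fingerprint for a feed item (assumed key: link plus title)."""
    return hashlib.sha256(f"{link}\n{title}".encode("utf-8")).hexdigest()


def is_new(link, title, seen):
    """Return True once per item; record its hash so later runs skip it."""
    digest = post_hash(link, title)
    if digest in seen:
        return False
    seen.add(digest)
    return True


def fit_to_limit(text, limit):
    """Collapse whitespace and trim to at most `limit` characters (limit >= 1).

    Over-long text is cut at a word boundary where possible and ends
    with an ellipsis so the truncation is visible.
    """
    text = " ".join(text.split())
    if len(text) <= limit:
        return text
    cut = text[: limit - 1]  # leave one character for the ellipsis
    if " " in cut:
        cut = cut.rsplit(" ", 1)[0]
    return cut.rstrip(" ,;:.") + "…"


if __name__ == "__main__":
    seen = set()  # a real workflow would persist this between runs
    if is_new("https://example.com/posts/demo/", "Demo post", seen):
        summary = "A summary from the model that may run long. " * 10
        print(fit_to_limit(summary, PLATFORM_LIMITS["x"]))
```

A post-processing trim like this does not replace prompt work, since a truncated summary reads worse than one written to length, but it guarantees the publish step never receives text over the limit.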


Unified GPU Homelab: AI, Media & Gaming

Auto-Summarize Blog Posts to Social Media