# Project Context: Self-Hosted AI Smart Speaker (The "Brain" Project)

**Role for AI:** You are an expert Linux System Administrator and AI Engineer. We are building a high-performance, local-first smart speaker system designed to replace cloud assistants with 100% private, GPU-accelerated intelligence.

## 🏗️ Hardware Environment

* **Hypervisor:** Proxmox VE.
* **Physical Server:** High-performance build with 32GB System RAM.
* **GPU:** NVIDIA GeForce RTX 3080 Ti (12GB VRAM).
* **I/O Configuration:** Intel VT-d enabled; `intel_iommu=on` configured in GRUB.

## 🐧 Virtual Machine Architecture (The "Brain" VM)

* **Guest OS:** Ubuntu 24.04 LTS (Noble Numbat).
* **BIOS/Firmware:** SeaBIOS (Chosen specifically to bypass UEFI/Secure Boot/MOK signature complexities for NVIDIA drivers).
* **CPU Configuration:** 1 Socket, 4 Cores, Type: `host` (for maximum instruction set compatibility).
* **Memory:** 16GB RAM, Ballooning and KSM disabled (to ensure deterministic performance for AI workloads).
* **Storage/Disk:** VirtIO SCSI Single controller; LVM-based disk management with expansion capability.
* **Networking:** VirtIO (paravirtualized) for low-latency communication.

## 🛠️ Software Stack & Completed Milestones

1. **GPU Passthrough:** Successfully isolated the RTX 3080 Ti from Proxmox using `vfio-pci` and assigned it to the Ubuntu VM via PCI Passthrough.
2. **Driver Layer:** Installed NVIDIA Driver version `580.126.09` (and CUDA 13.0) directly on the Ubuntu Guest.
3. **Containerization:** Docker Engine installed.
4. **The "Bridge" (Crucial):** Successfully configured the **NVIDIA Container Toolkit**.
    * *Note:* We had to use a workaround for Ubuntu 24.04 by pointing the `apt` repository to the `ubuntu22.04` stable path because the `noble` path was missing/broken on NVIDIA's servers.
5. **Orchestration:** Deployed a `docker-compose` stack containing:
    * **Ollama:** Running as the LLM engine (GPU-accelerated).
    * **Open WebUI:** Running as the frontend interface for text-based testing.

## 🎯 Current Objective & Next Steps

We have successfully verified that `nvidia-smi` works inside a Docker container. The "Text-to-Text" pipeline is functional and running on the RTX 3080 Ti.

**The next phases are:**

1. **Phase 7 (Audio Input):** Integrating a microphone array/stream into the Linux environment.
2. **Phase 8 (ASR - The Ears):** Deploying `faster-whisper` in a Docker container to transcribe audio to text.
3. **Phase 9 (The Logic):** Writing the Python "Glue" code to pipe audio from the mic $\rightarrow$ Whisper $\rightarrow$ Ollama $\rightarrow$ Home Assistant API for automation execution.

**Current Task:** Verify the text-based interaction in Open WebUI and begin planning the integration of the Whisper ASR engine.
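
As a planning aid for Phase 9, the mic $\rightarrow$ Whisper $\rightarrow$ Ollama $\rightarrow$ Home Assistant chain could be sketched roughly as below. This is a minimal sketch under stated assumptions, not the project's actual glue code: the two base URLs, the model name `llama3`, the JSON intent format, and every function name are hypothetical placeholders. It relies only on Ollama's `/api/generate` endpoint and Home Assistant's `/api/services/<domain>/<service>` REST endpoint, and it takes the transcription step as an injected callable so that the flow can be exercised before the Whisper container exists.

```python
import json
import urllib.request

# Assumptions (not from the project notes): both URLs, the model name and
# the JSON intent format below are placeholders for illustration only.
OLLAMA_URL = "http://localhost:11434"        # Ollama's default port
HA_URL = "http://homeassistant.local:8123"   # hypothetical Home Assistant address
MODEL = "llama3"                             # hypothetical model name

INTENT_PROMPT = (
    "Convert the user's request into JSON with the keys "
    '"domain", "service" and "entity_id" for Home Assistant. '
    "Request: "
)


def post_json(url, payload, headers=None):
    """POST a JSON payload and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", **(headers or {})},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))


def ask_ollama(text, post=post_json):
    """Ask Ollama to turn transcribed text into an intent dict."""
    body = {
        "model": MODEL,
        "prompt": INTENT_PROMPT + text,
        "stream": False,   # one JSON object instead of a token stream
        "format": "json",  # ask Ollama to constrain the reply to valid JSON
    }
    reply = post(f"{OLLAMA_URL}/api/generate", body)
    return json.loads(reply["response"])


def call_home_assistant(intent, token, post=post_json):
    """Invoke a Home Assistant service through its REST API."""
    url = f"{HA_URL}/api/services/{intent['domain']}/{intent['service']}"
    headers = {"Authorization": f"Bearer {token}"}
    return post(url, {"entity_id": intent["entity_id"]}, headers)


def handle_utterance(audio_path, transcribe, token, post=post_json):
    """Run one pass: audio file -> text -> intent -> Home Assistant call."""
    text = transcribe(audio_path)
    intent = ask_ollama(text, post)
    return call_home_assistant(intent, token, post)
```

In Phase 8, the `transcribe` argument would likely wrap `faster-whisper` (for example, joining the `.text` of the segments returned by `WhisperModel.transcribe`), and `token` would be a Home Assistant long-lived access token. Passing `post` as a parameter is a deliberate choice so the chain can be tested with a stub before any service is reachable.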