Homelab/_Notes/AI_Project_2.md
2026-04-28 18:52:06 -05:00


Project Context: Self-Hosted AI Smart Speaker (The "Brain" Project)

Role for AI: You are an expert Linux System Administrator and AI Engineer. We are building a high-performance, local-first smart speaker system designed to replace cloud assistants with 100% private, GPU-accelerated intelligence.

🏗️ Hardware Environment

  • Hypervisor: Proxmox VE.
  • Physical Server: High-performance build with 32GB System RAM.
  • GPU: NVIDIA GeForce RTX 3080 Ti (12GB VRAM).
  • I/O Configuration: Intel VT-d enabled; intel_iommu=on configured in GRUB.
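A quick way to confirm the GRUB setting took effect is to inspect the running kernel's command line on the Proxmox host. The sketch below is a minimal check, assuming the parameter appears verbatim as intel_iommu=on; the helper names are illustrative and not part of any existing tooling.

```python
from pathlib import Path


def iommu_enabled(cmdline: str) -> bool:
    """Return True if the kernel command line contains intel_iommu=on."""
    # Kernel parameters are whitespace-separated tokens.
    return "intel_iommu=on" in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with."""
    return Path(path).read_text().strip()
```

On the host, `iommu_enabled(read_cmdline())` should return True once the GRUB change has been applied and the machine rebooted.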

🐧 Virtual Machine Architecture (The "Brain" VM)

  • Guest OS: Ubuntu 24.04 LTS (Noble Numbat).
  • BIOS/Firmware: SeaBIOS (Chosen specifically to bypass UEFI/Secure Boot/MOK signature complexities for NVIDIA drivers).
  • CPU Configuration: 1 Socket, 4 Cores, Type: host (exposes the physical CPU's full instruction set to the guest).
  • Memory: 16GB RAM, Ballooning and KSM disabled (to ensure deterministic performance for AI workloads).
  • Storage/Disk: VirtIO SCSI Single controller; LVM-based disk management with expansion capability.
  • Networking: VirtIO (paravirtualized) for low-latency communication.
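The settings above can be checked against the VM's Proxmox config file (normally /etc/pve/qemu-server/<vmid>.conf, a plain "key: value" text file). The following is a sketch only: the expected values are taken from the list above, the function names are hypothetical, and Proxmox may omit keys that are left at their defaults, so a key reported as None needs a manual look rather than being treated as an error. KSM is a host-level setting and is not covered here.

```python
# Expected values derived from the architecture list above (assumed key names
# follow the Proxmox qemu-server config format).
EXPECTED = {
    "bios": "seabios",
    "sockets": "1",
    "cores": "4",
    "cpu": "host",
    "memory": "16384",  # 16GB expressed in MiB
    "balloon": "0",     # 0 disables ballooning
    "scsihw": "virtio-scsi-single",
}


def parse_vm_config(text: str) -> dict:
    """Parse 'key: value' lines, stopping at the first snapshot section."""
    config = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            break  # snapshot sections follow the live config
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        config[key.strip()] = value.strip()
    return config


def find_mismatches(config: dict, expected: dict = EXPECTED) -> dict:
    """Map each differing key to (expected, actual); actual is None if absent."""
    return {
        key: (want, config.get(key))
        for key, want in expected.items()
        if config.get(key) != want
    }


def uses_virtio_nic(config: dict) -> bool:
    """Return True if net0 is a VirtIO (paravirtualized) interface."""
    return config.get("net0", "").startswith("virtio")
```

An empty result from `find_mismatches` means every listed key matches the values in these notes.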

🛠️ Software Stack & Completed Milestones

  1. GPU Passthrough: Successfully isolated the RTX 3080 Ti from Proxmox using vfio-pci and assigned it to the Ubuntu VM via PCI Passthrough.
  2. Driver Layer: Installed NVIDIA Driver version 580.126.09 (and CUDA 13.0) directly on the Ubuntu Guest.
  3. Containerization: Docker Engine installed.
  4. The "Bridge" (Crucial): Successfully configured the NVIDIA Container Toolkit.
    • Note: We had to use a workaround for Ubuntu 24.04 by pointing the apt repository to the ubuntu22.04 stable path because the noble path was missing/broken on NVIDIA's servers.
  5. Orchestration: Deployed a docker-compose stack containing:
    • Ollama: Running as the LLM engine (GPU-accelerated).
    • Open WebUI: Running as the frontend interface for text-based testing.
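To re-check milestones 1, 2, and 4 from a script (inside the guest, or inside a GPU-enabled container), nvidia-smi can be queried in CSV mode and its output parsed. This is a minimal sketch: the function names are illustrative, and it assumes nvidia-smi is on the PATH wherever the script runs.

```python
import subprocess

# Ask nvidia-smi for a few fields in headerless CSV form.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.total,driver_version",
    "--format=csv,noheader",
]


def parse_gpu_lines(output: str) -> list:
    """Turn 'name, memory, driver' CSV lines into a list of dicts."""
    gpus = []
    for line in output.strip().splitlines():
        name, memory, driver = [field.strip() for field in line.split(",")]
        gpus.append(
            {"name": name, "memory_total": memory, "driver_version": driver}
        )
    return gpus


def query_gpus() -> list:
    """Run nvidia-smi and return the parsed GPU list (raises on failure)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_lines(result.stdout)
```

If passthrough, the driver, and the container toolkit are all working, `query_gpus()` should return a single entry for the RTX 3080 Ti.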

🎯 Current Objective & Next Steps

We have successfully verified that nvidia-smi works inside a Docker container. The "Text-to-Text" pipeline is functional and running on the RTX 3080 Ti.
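The text-to-text path can also be exercised without Open WebUI by calling Ollama's HTTP API directly. The sketch below assumes Ollama is reachable on its default port 11434 on localhost and that the model name passed in has already been pulled; both are assumptions, since these notes do not record the port mapping or the model in use.

```python
import json
import urllib.request

# Assumption: default Ollama port, published on localhost by the compose stack.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt: str, model: str, url: str = OLLAMA_URL):
    """Build a non-streaming POST request for Ollama's generate endpoint."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_ollama(prompt: str, model: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]
```

Calling `ask_ollama("Say hello", model="<your-model>")` with a model that exists on the server should return a short completion; the same function can later serve as the LLM step of the voice pipeline.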

The next phases are:

  1. Phase 7 (Audio Input): Integrating a microphone array/stream into the Linux environment.
  2. Phase 8 (ASR - The Ears): Deploying faster-whisper in a Docker container to transcribe audio to text.
  3. Phase 9 (The Logic): Writing the Python "Glue" code to pipe audio from the mic → Whisper → Ollama → Home Assistant API for automation execution.
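As a starting point for Phases 8 and 9, the pipeline can be sketched as three pluggable steps: transcribe, ask the LLM, execute. Everything below is an illustration under stated assumptions, not a finished design: the Whisper model size, the JSON action format the LLM is asked to emit, and all function names are hypothetical. The faster-whisper call and the Home Assistant service endpoint (POST /api/services/<domain>/<service> with a bearer token) follow those projects' documented usage.

```python
import json
import urllib.request


def transcribe_with_faster_whisper(audio_path: str, model_size: str = "small") -> str:
    """Transcribe an audio file on the GPU (model size is an assumption)."""
    from faster_whisper import WhisperModel  # imported lazily; optional dependency

    model = WhisperModel(model_size, device="cuda", compute_type="float16")
    segments, _info = model.transcribe(audio_path)
    return " ".join(segment.text.strip() for segment in segments)


def parse_action(reply: str):
    """Hypothetical convention: the LLM answers with JSON such as
    {"domain": "light", "service": "turn_on", "entity_id": "light.kitchen"}.
    Returns the dict, or None if the reply does not match."""
    try:
        action = json.loads(reply)
    except json.JSONDecodeError:
        return None
    if not isinstance(action, dict):
        return None
    if not all(key in action for key in ("domain", "service", "entity_id")):
        return None
    return action


def build_ha_service_request(base_url: str, token: str, action: dict):
    """Build the Home Assistant REST call for one service action."""
    url = f"{base_url.rstrip('/')}/api/services/{action['domain']}/{action['service']}"
    body = json.dumps({"entity_id": action["entity_id"]}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def run_pipeline(audio_path, transcribe, ask_llm, execute):
    """Mic recording -> text -> LLM reply -> action; returns the action or None."""
    text = transcribe(audio_path)
    if not text:
        return None  # nothing was said
    action = parse_action(ask_llm(text))
    if action is not None:
        execute(action)
    return action
```

Passing the steps in as callables keeps each phase testable on its own: the transcriber can be swapped for a stub while the Ollama and Home Assistant pieces are developed, and `execute` can wrap `urllib.request.urlopen(build_ha_service_request(...))` once a long-lived access token is available.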

Current Task: Verify the text-based interaction in Open WebUI and begin planning the integration of the Whisper ASR engine.