🎙️ Project Summary: Self-Hosted High-Performance AI Assistant
🎯 The Ultimate Goal
To build a fully local, privacy-focused smart speaker and home automation engine that uses high-end hardware to achieve low-latency, near-instantaneous responses without relying on cloud-based processing.
🏗️ The Software Pipeline (The "Stack")
- Wake Word Detection: Porcupine/Picovoice (Running on CPU, acting as the gatekeeper).
- ASR (Speech-to-Text): faster-whisper (Running on RTX 3080 Ti via CUDA for high-speed transcription).
- NLU (The Brain): Local LLM via Ollama (e.g., Llama 3 or Mistral) to parse intent from text into JSON commands.
- Execution Layer: Home Assistant (Receiving JSON webhooks to trigger physical smart home devices).
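As a rough illustration of the last two stages (LLM reply to JSON command to Home Assistant webhook), here is a minimal Python sketch using only the standard library. The base URL, the webhook ID, and the `domain`/`service`/`entity_id` schema are assumptions made for the example, not settings from this project; the `/api/webhook/<id>` path is Home Assistant's standard webhook endpoint.

```python
import json
import urllib.request

# Hypothetical values: adjust to your own Home Assistant instance.
HA_BASE_URL = "http://homeassistant.local:8123"
WEBHOOK_ID = "voice-assistant-intent"  # assumed webhook ID, not from the project

# Assumed command schema; the real prompt/schema is up to you.
REQUIRED_KEYS = ("domain", "service", "entity_id")


def parse_intent(llm_output: str) -> dict:
    """Parse the LLM's reply as JSON and check the assumed required keys."""
    command = json.loads(llm_output)  # raises ValueError on invalid JSON
    if not isinstance(command, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in command]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return command


def build_webhook_request(command: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for a Home Assistant webhook."""
    return urllib.request.Request(
        url=f"{HA_BASE_URL}/api/webhook/{WEBHOOK_ID}",
        data=json.dumps(command).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    reply = '{"domain": "light", "service": "turn_on", "entity_id": "light.kitchen"}'
    request = build_webhook_request(parse_intent(reply))
    print(request.full_url)
    # To actually send it: urllib.request.urlopen(request, timeout=5)
```

Validating the reply before building the request keeps malformed LLM output from ever reaching the execution layer.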
🖥️ The Hardware & Infrastructure
- Host Hypervisor: Proxmox VE.
- Physical Resources: 32GB System RAM, NVIDIA RTX 3080 Ti (12GB VRAM), Intel CPU with VT-d enabled.
- The "Brain" VM Configuration:
  - OS: Ubuntu 24.04 LTS (Regular version).
  - CPU: 1 Socket, 4 Cores, Type: host (Crucial for AI workloads: it exposes the physical CPU's instruction set extensions, such as AVX2, to the guest).
  - Memory: 16GB RAM, KSM and Ballooning disabled (To ensure stability and prevent latency jitter).
  - Storage: VirtIO SCSI Single controller using io_uring for high-performance asynchronous I/O.
  - GPU Passthrough: Completed. The GPU is isolated from Proxmox using vfio-pci (bypassing the nouveau driver) and passed directly to the Ubuntu VM.
  - Networking: VirtIO (paravirtualized) for low-latency communication.
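One way to confirm from inside the Ubuntu VM that the host CPU type took effect is to look for vector-instruction flags in `/proc/cpuinfo`. The sketch below is a minimal check; the flag list in `WANTED_FLAGS` is an assumption about what an inference stack typically benefits from, not a requirement stated by this project.

```python
from pathlib import Path

# Assumed list of flags to look for; adjust to what your inference stack needs.
WANTED_FLAGS = frozenset({"avx", "avx2", "fma", "f16c"})


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the flag set from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text: str, wanted=WANTED_FLAGS) -> set[str]:
    """Return the wanted flags that the guest CPU does not advertise."""
    return set(wanted) - cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        missing = missing_flags(cpuinfo.read_text())
        print("missing flags:", sorted(missing) if missing else "none")
    else:
        print("/proc/cpuinfo not found (not a Linux guest?)")
```

If flags such as `avx2` show up as missing, the VM is probably still running with a generic emulated CPU type rather than host.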
🚩 Current Progress & Status
- Proxmox IOMMU/VT-d configuration finalized.
- GPU Isolation and VFIO configuration completed.
- Ubuntu VM creation and storage architecture (LVM) finalized.
- Ubuntu installation completed.
- CURRENT TASK: Post-install "Day Zero" tasks: SSH access, system updates, and installing NVIDIA Drivers + Docker + NVIDIA Container Toolkit.
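Once the NVIDIA driver is installed, a quick sanity check is to ask `nvidia-smi` for the GPU name and total memory. The Python sketch below wraps that query; the sample line used in the comment is illustrative, and the script assumes `nvidia-smi` is on the PATH inside the VM.

```python
import subprocess


def parse_gpu_line(line: str) -> tuple[str, int]:
    """Split one 'name, memory.total' CSV row into (name, memory in MiB)."""
    name, memory = (field.strip() for field in line.rsplit(",", 1))
    return name, int(memory.split()[0])  # e.g. "12288 MiB" -> 12288


def query_gpus() -> list[tuple[str, int]]:
    """Run nvidia-smi and return (name, MiB) for each visible GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return [parse_gpu_line(line) for line in out.splitlines() if line.strip()]


if __name__ == "__main__":
    try:
        for name, mib in query_gpus():
            print(f"{name}: {mib} MiB")
    except FileNotFoundError:
        print("nvidia-smi not found: driver not installed yet?")
    except subprocess.CalledProcessError as err:
        print(f"nvidia-smi failed: {err.stderr.strip()}")
```

An empty result or an error here points at the passthrough or driver step rather than at Docker or the NVIDIA Container Toolkit, which helps narrow down where to look.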
Instructions for New Chat: Paste this block into a new chat and say: "I am working on the project described in this summary. I have finished the installation and am ready to begin the 'Day Zero' tasks."