things
@@ -0,0 +1,35 @@
$ffprobe = 'D:\yt-dlp\ffprobe.exe'
$FolderToReview = 'Q:\Web-RegulationPodcast\Unsorted'

if (-not (Test-Path $ffprobe)) {
    Write-Error "ffprobe.exe not found at '$ffprobe'. Please verify the path."
    exit 1
}

$results = Get-ChildItem -Path $FolderToReview -Recurse -Filter *.mp4 | ForEach-Object {
    $file = $_.FullName
    Write-Verbose "Processing: $file" -Verbose

    # Run ffprobe on the first video stream; discard stderr, capture stdout (e.g. "1920x1080")
    $rawOutput = & $ffprobe -v error `
        -select_streams v:0 `
        -show_entries stream=width,height `
        -of csv=p=0:s=x `
        $file 2>$null

    # Normalize to a single string; ffprobe may return nothing ($null) or several lines
    $resolution = ($rawOutput | Out-String).Trim()
    if ([string]::IsNullOrWhiteSpace($resolution)) {
        $resolution = 'No video stream or unreadable'
    }

    [PSCustomObject]@{
        File       = $file
        Resolution = $resolution
    }
}

$results | Format-Table -AutoSize

# Optional: Export to CSV
# $results | Export-Csv -Path 'Q:\Web-RegulationPodcast\video_resolutions.csv' -NoTypeInformation
@@ -0,0 +1,32 @@
# 🎙️ Project Summary: Self-Hosted High-Performance AI Assistant

### 🎯 The Ultimate Goal
To build a fully local, privacy-focused smart speaker and home automation engine that uses high-end hardware to achieve low-latency, near-instantaneous responses without relying on cloud-based processing.

### 🏗️ The Software Pipeline (The "Stack")
1. **Wake Word Detection:** Porcupine/Picovoice (running on the CPU, acting as the gatekeeper).
2. **ASR (Speech-to-Text):** `faster-whisper` (running on the **RTX 3080 Ti via CUDA** for high-speed transcription).
3. **NLU (The Brain):** Local LLM via **Ollama** (e.g., Llama 3 or Mistral) to parse intent from text into JSON commands.
4. **Execution Layer:** **Home Assistant** (receiving JSON webhooks to trigger physical smart home devices).
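
The hand-off between steps 3 and 4 could look roughly like the sketch below: the LLM's reply is validated as a JSON intent, then wrapped in a POST to a Home Assistant webhook. The base URL, the webhook ID, and the three-key intent schema are all assumptions made for illustration; none of them are defined in this summary.

```python
import json
import urllib.request

# Hypothetical values for illustration; neither is defined in the project notes.
HA_BASE_URL = "http://homeassistant.local:8123"
WEBHOOK_ID = "voice-assistant-intent"

# Assumed intent schema: the keys the LLM is prompted to return.
REQUIRED_KEYS = ("domain", "action", "entity_id")


def parse_intent(llm_reply: str) -> dict:
    """Validate the LLM's reply as a JSON intent; raise ValueError if malformed."""
    try:
        intent = json.loads(llm_reply)
    except json.JSONDecodeError as exc:
        raise ValueError(f"LLM reply is not valid JSON: {exc}") from exc
    if not isinstance(intent, dict):
        raise ValueError("Intent must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in intent]
    if missing:
        raise ValueError(f"Intent is missing keys: {missing}")
    return intent


def build_webhook_request(intent: dict) -> urllib.request.Request:
    """Build (but do not send) a POST to Home Assistant's webhook endpoint."""
    return urllib.request.Request(
        url=f"{HA_BASE_URL}/api/webhook/{WEBHOOK_ID}",
        data=json.dumps(intent).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending the request is then a single `urllib.request.urlopen(request, timeout=5)` call. Keeping validation separate from delivery means a malformed LLM reply is rejected before anything reaches a physical device.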

### 🖥️ The Hardware & Infrastructure
* **Host Hypervisor:** Proxmox VE.
* **Physical Resources:** 32GB System RAM, NVIDIA RTX 3080 Ti (12GB VRAM), Intel CPU with VT-d enabled.
* **The "Brain" VM Configuration:**
    * **OS:** Ubuntu 24.04 LTS (Regular version).
    * **CPU:** 1 Socket, 4 Cores, **Type: Host** (crucial because it exposes the host CPU's full instruction set to the guest, including the vector extensions AI runtimes rely on).
    * **Memory:** 16GB RAM, **KSM and Ballooning disabled** (to ensure stability and prevent latency jitter).
    * **Storage:** VirtIO SCSI Single controller using `io_uring` for high-performance asynchronous I/O.
    * **GPU Passthrough:** Completed. The GPU is isolated from Proxmox using `vfio-pci` (so the host's `nouveau` driver never claims it) and passed directly to the Ubuntu VM.
    * **Networking:** VirtIO (paravirtualized) for low-latency communication.

### 🚩 Current Progress & Status
* [x] Proxmox IOMMU/VT-d configuration finalized.
* [x] GPU isolation and VFIO configuration completed.
* [x] Ubuntu VM creation and storage architecture (LVM) finalized.
* [x] Ubuntu installation completed.
* [ ] **CURRENT TASK:** Post-install "Day Zero" tasks: SSH access, system updates, and installing NVIDIA drivers, Docker, and the NVIDIA Container Toolkit.

***
**Instructions for New Chat:**
Paste this block into a new chat and say: *"I am working on the project described in this summary. I have finished the installation and am ready to begin the 'Day Zero' tasks."*
@@ -0,0 +1,37 @@
# Project Context: Self-Hosted AI Smart Speaker (The "Brain" Project)

**Role for AI:** You are an expert Linux System Administrator and AI Engineer. We are building a high-performance, local-first smart speaker system designed to replace cloud assistants with 100% private, GPU-accelerated intelligence.

## 🏗️ Hardware Environment
* **Hypervisor:** Proxmox VE.
* **Physical Server:** High-performance build with 32GB System RAM.
* **GPU:** NVIDIA GeForce RTX 3080 Ti (12GB VRAM).
* **I/O Configuration:** Intel VT-d enabled; `intel_iommu=on` configured in GRUB.

## 🐧 Virtual Machine Architecture (The "Brain" VM)
* **Guest OS:** Ubuntu 24.04 LTS (Noble Numbat).
* **BIOS/Firmware:** SeaBIOS (Chosen specifically to bypass UEFI/Secure Boot/MOK signature complexities for NVIDIA drivers).
* **CPU Configuration:** 1 Socket, 4 Cores, Type: `host` (for maximum instruction set compatibility).
* **Memory:** 16GB RAM, Ballooning and KSM disabled (to ensure deterministic performance for AI workloads).
* **Storage/Disk:** VirtIO SCSI Single controller; LVM-based disk management with expansion capability.
* **Networking:** VirtIO (paravirtualized) for low-latency communication.

## 🛠️ Software Stack & Completed Milestones
1. **GPU Passthrough:** Successfully isolated the RTX 3080 Ti from Proxmox using `vfio-pci` and assigned it to the Ubuntu VM via PCI Passthrough.
2. **Driver Layer:** Installed NVIDIA Driver version `580.126.09` (and CUDA 13.0) directly on the Ubuntu Guest.
3. **Containerization:** Docker Engine installed.
4. **The "Bridge" (Crucial):** Successfully configured the **NVIDIA Container Toolkit**.
    * *Note:* We had to use a workaround for Ubuntu 24.04 by pointing the `apt` repository to the `ubuntu22.04` stable path because the `noble` path was missing/broken on NVIDIA's servers.
5. **Orchestration:** Deployed a `docker-compose` stack containing:
    * **Ollama:** Running as the LLM engine (GPU-accelerated).
    * **Open WebUI:** Running as the frontend interface for text-based testing.

## 🎯 Current Objective & Next Steps
We have successfully verified that `nvidia-smi` works inside a Docker container. The "Text-to-Text" pipeline is functional and running on the RTX 3080 Ti.
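
The same text-to-text path can also be exercised without the web frontend by calling Ollama's HTTP API directly. The sketch below assumes the container publishes Ollama's default port 11434 on localhost and that a model tagged `llama3` has been pulled; both are assumptions, since the compose file is not shown here.

```python
import json
import urllib.request

# Assumptions: default Ollama port published on localhost, and this model tag pulled.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "llama3"


def build_generate_request(prompt: str, model: str = MODEL) -> urllib.request.Request:
    """Build a non-streaming /api/generate request for Ollama."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, timeout: float = 60.0) -> str:
    """Send the prompt and return the model's full text reply."""
    request = build_generate_request(prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        # With "stream": False, Ollama returns one JSON object with a "response" field.
        return json.loads(resp.read())["response"]
```

Calling `generate("Say hello in five words.")` while watching `nvidia-smi` on the guest is a quick way to confirm that inference is actually landing on the GPU.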

**The next phases are:**
1. **Phase 7 (Audio Input):** Integrating a microphone array/stream into the Linux environment.
2. **Phase 8 (ASR - The Ears):** Deploying `faster-whisper` in a Docker container to transcribe audio to text.
3. **Phase 9 (The Logic):** Writing the Python "Glue" code to pipe audio from the mic → Whisper → Ollama → Home Assistant API for automation execution.
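
One possible shape for the Phase 9 glue code is sketched below: each stage is an injected callable, so the chain can be tested with stand-ins before the real Whisper, Ollama, and Home Assistant clients exist. The class and method names are hypothetical and are not taken from the project.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class VoicePipeline:
    """Chains the three stages; every stage is injected (hypothetical design)."""

    transcribe: Callable[[bytes], str]    # audio -> text (Whisper)
    infer_intent: Callable[[str], dict]   # text -> intent (Ollama)
    execute: Callable[[dict], None]       # intent -> action (Home Assistant)

    def handle_utterance(self, audio: bytes) -> Optional[dict]:
        """Run one utterance through the chain; return the intent, or None on silence."""
        text = self.transcribe(audio).strip()
        if not text:
            # Nothing was recognized, so skip the LLM call and the automation.
            return None
        intent = self.infer_intent(text)
        self.execute(intent)
        return intent
```

A dry run needs no hardware: `VoicePipeline(lambda a: "lights on", lambda t: {"text": t}, print).handle_utterance(b"")` prints the intent and returns it. The early return on empty text avoids spending GPU time on the LLM when the ASR stage heard nothing.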

**Current Task:** Verify the text-based interaction in Open WebUI and begin planning the integration of the Whisper ASR engine.
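
As a starting point for that planning, a minimal `faster-whisper` call might look like the sketch below. The `small` model size and `float16` compute type are assumptions, not project decisions, and the import is done lazily so the helper module still loads on machines where the library is not installed.

```python
def join_segments(segments) -> str:
    """Concatenate the text of each transcription segment into one string."""
    return " ".join(seg.text.strip() for seg in segments).strip()


def load_model(model_size: str = "small"):
    """Load a Whisper model onto the GPU (model size is an assumption)."""
    # Lazy import: requires the faster-whisper package and a working CUDA setup.
    from faster_whisper import WhisperModel

    return WhisperModel(model_size, device="cuda", compute_type="float16")


def transcribe_file(model, path: str) -> str:
    """Transcribe one audio file and return the full transcript."""
    # transcribe() returns a lazy generator of segments plus an info object;
    # decoding only happens as the segments are iterated.
    segments, _info = model.transcribe(path, beam_size=5)
    return join_segments(segments)
```

Loading the model once with `load_model()` and reusing it across calls to `transcribe_file` keeps the weights resident in VRAM, which matters for the low-latency goal.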