Architecture Document

Personal AI Twin

Complete system architecture for writing style LoRA + voice clone TTS on M4 Pro Mac Mini

1. High-Level Architecture

The system has two independent AI pipelines that share infrastructure:

PipelineInputModelOutput
Writing StylePrompt (draft request)Qwen 2.5 7B + LoRA via OllamaStyled text
Voice CloneText + Reference audioOpenVoice V2 + MeloTTS via FastAPIAudio waveform (WAV/MP3)
Unified PipelinePrompt + Reference audioBoth above in sequenceStyled text + spoken audio

2. Component Architecture

2.1 Training Pipeline

๐Ÿ“ง Email Export
Formats: .mbox, .eml, Gmail Takeout JSON
Personal emails exported from Apple Mail, Gmail, or Outlook. Only 1:1 personal emails; distribution lists, auto-replies, and BCC excluded.
โ†“ Parse locally
๐Ÿ’ฌ WhatsApp Export
Format: Plain text .txt ยท Pattern: timestamp + sender + message
Chat export from WhatsApp iOS/Android. Grouped by consecutive sender into conversational turns.
โ†“ Parse + format
๐Ÿง  Dataset Construction Pipeline
Tool: Python ยท Format: ChatML JSONL ยท Deduplication: MinHash LSH
Python scripts parse raw exports โ†’ instruction pairs with system prompt, user prompt, and Thota's actual response as assistant output. Deduplicated at 0.85 similarity threshold. Quality filtered (20-char min length, langdetect for English).
โ†“ 500โ€“1,000 curated samples
๐Ÿ”ง LLaMA Factory (Training)
Platform: macOS + Metal GPU (M4 Pro) ยท Framework: PyTorch MPS + Unsloth kernels
QLoRA fine-tuning of Qwen 2.5 7B Instruct. Rank=16, targets q_proj+v_proj, LR=2e-4, 1โ€“3 epochs on 500โ€“1,000 samples. Output: LoRA adapter weights in GGUF or .safetensors format.
โ†“ LoRA adapter weights
๐ŸŽฏ Ollama Model (Inference-Ready)
File: thota-style-lora.gguf ยท Port: 11434 ยท API: OpenAI-compatible REST
Modelfile with base Qwen 2.5 7B + ADAPTER directive pointing to LoRA weights. Registered as "thota-writing" model in Ollama. Ready for inference.

2.2 Voice Recording Pipeline

๐ŸŽ™๏ธ Voice Recording Sessions
Duration: ~1 hour total ยท Emotions: 6โ€“10 contexts ยท Format: 24kHz WAV
Thota records 5โ€“10 minute sessions per emotional context (neutral, happy, sad, angry, surprised, whispered, authoritative, tired, playful). Same microphone, same room. Raw audio files stored in FileVault-encrypted directory.
โ†“ Normalize + clean
๐Ÿ”Š Audio Pre-Processing
Tool: Python (librosa, scipy.signal) ยท Sample rate: 24kHz โ†’ 22050Hz normalized
Level normalization, silence removal, breathing artifact removal, consistent sample rate. Organized by emotion tag in separate folders.
โ†“ Cleaned + tagged audio
๐Ÿ—ฃ๏ธ OpenVoice V2 (Instant Clone)
Checkpoint: checkpoints_v2_0417.zip ยท Base TTS: MeloTTS ยท License: MIT
Reference audio (10โ€“30 sec) passed to OpenVoice tone color cloner. No fine-tuning required for basic clone โ€” instant voice embedding from reference. Fine-tuning mode available for enhanced quality (2โ€“4 hours on 1hr audio).

2.3 Inference Stack (Production)

๐ŸŒ SvelteKit + Deno Backend
Framework: SvelteKit with Deno adapter ยท Port: 3000 ยท Routes: /api/*
Server-side web backend. Handles HTTP requests, orchestrates Ollama + FastAPI calls, returns JSON or audio responses. All AI calls happen server-side (localhost) โ€” no secrets exposed to client.
โ†“ server-side fetch (localhost)
๐Ÿง  Ollama Server
Model: thota-writing (Qwen 2.5 7B + LoRA) ยท Port: 11434
Metal GPU-accelerated inference via llama.cpp. Receives styled prompts, generates text in Thota's voice. Streaming support via SSE. Zero outbound network calls.
โ†‘ styled text
๐Ÿ”Š FastAPI (Python) โ€” TTS Server
Port: 8000 ยท Workers: 1 ยท Framework: FastAPI + uvicorn
Wraps OpenVoice V2 + MeloTTS. Receives text + reference audio path, synthesizes speech with cloned voice. Keeps models warm in memory. Serves WAV/MP3 responses.
โ†“ audio data
๐ŸŽง Audio Output
Formats: WAV (lossless, 22050 Hz) ยท MP3 (compressed)
Served as file download or streamed via chunked transfer encoding. Browser audio player or downloadable file.

3. Data Flow Diagrams

3.1 Writing Style Generation

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  Client                                                         โ”‚
โ”‚  "Draft a reply thanking my colleague"                          โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                         โ”‚ POST /api/tts/lora/generate
                         โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  SvelteKit Backend (Deno)                                      โ”‚
โ”‚  1. Validate request                                          โ”‚
โ”‚  2. Build messages array with system prompt + user prompt     โ”‚
โ”‚  3. Server-side fetch to Ollama localhost:11434               โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                         โ”‚ POST /api/chat  { model:"thota-writing" }
                         โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  Ollama (Metal GPU, M4 Pro)                                    โ”‚
โ”‚  Base: Qwen 2.5 7B Q4_KM  +  LoRA: thota-style-lora.gguf        โ”‚
โ”‚  System: "You are Thota's writing assistant..."                โ”‚
โ”‚  Output: "Nice one โ€” thanks for flagging this..."              โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                         โ”‚ JSON { message.content }
                         โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  SvelteKit Backend                                              โ”‚
โ”‚  Return styled text to client                                  โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

3.2 Voice Clone TTS Pipeline

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  Client                                                         โ”‚
โ”‚  POST /api/tts/pipeline  { text, referenceAudio }              โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                         โ”‚
                         โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  SvelteKit Backend (Deno)                                      โ”‚
โ”‚  1. Call Ollama โ†’ get styled text                              โ”‚
โ”‚  2. Base64-encode reference audio (or use stored voice ID)     โ”‚
โ”‚  3. Call FastAPI localhost:8000/tts                           โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                         โ”‚
            โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
            โ”‚ POST /tts { text, reference }  โ”‚
            โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  FastAPI Python Server (Mac Mini, port 8000)                  โ”‚
โ”‚  1. Load reference audio                                      โ”‚
โ”‚  2. MeloTTS: synthesize base audio (text โ†’ waveform)         โ”‚
โ”‚  3. OpenVoice: clone tone color from reference                 โ”‚
โ”‚  4. Return WAV audio                                          โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                         โ”‚ audio/wav
                         โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  SvelteKit Backend                                              โ”‚
โ”‚  Stream audio back to client as chunked response              โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

4. Network & Access Architecture

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  Mac Mini M4 Pro (Home Network)                                  โ”‚
โ”‚                                                                   โ”‚
โ”‚  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”    โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”    โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”   โ”‚
โ”‚  โ”‚  Ollama    โ”‚    โ”‚  FastAPI    โ”‚    โ”‚  SvelteKit/Deno    โ”‚   โ”‚
โ”‚  โ”‚  :11434    โ”‚    โ”‚  :8000      โ”‚    โ”‚  :3000              โ”‚   โ”‚
โ”‚  โ”‚  (localhost)โ”‚    โ”‚  (localhost)โ”‚    โ”‚  (localhost)        โ”‚   โ”‚
โ”‚  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”˜    โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜    โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜   โ”‚
โ”‚         โ”‚                โ”‚                        โ”‚               โ”‚
โ”‚         โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜               โ”‚
โ”‚                           โ”‚                                      โ”‚
โ”‚                     localhost only                               โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                            โ”‚ outbound tunnel
                            โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  Cloudflare Edge Network                                         โ”‚
โ”‚  Tunnel: cloudflared (persistent, outbound-only)               โ”‚
โ”‚  Public URL: voice-api.yourdomain.com โ†’ Mac Mini :3000          โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

All services bind to localhost only. Only Cloudflare Tunnel connects outward from the Mac Mini. No inbound ports opened on router.

5. Storage Architecture

/Users/
โ””โ”€โ”€ thota/
    โ”œโ”€โ”€ ai-models/                          # Model weights (50GB+ free space)
    โ”‚   โ”œโ”€โ”€ qwen2.5-7b/                    # Base Qwen 2.5 7B Instruct
    โ”‚   โ”œโ”€โ”€ thota-style-lora/              # Trained LoRA adapter
    โ”‚   โ””โ”€โ”€ openvoice-v2/                  # OpenVoice V2 checkpoints
    โ”‚       โ””โ”€โ”€ checkpoints_v2/
    โ”œโ”€โ”€ voice-references/                   # ๐Ÿ”’ FileVault encrypted
    โ”‚   โ”œโ”€โ”€ neutral/
    โ”‚   โ”œโ”€โ”€ happy/
    โ”‚   โ”œโ”€โ”€ authoritative/
    โ”‚   โ””โ”€โ”€ ... (by emotion)
    โ”œโ”€โ”€ datasets/
    โ”‚   โ”œโ”€โ”€ email-parsed.jsonl             # Parsed email instruction pairs
    โ”‚   โ”œโ”€โ”€ whatsapp-parsed.jsonl          # Parsed WhatsApp pairs
    โ”‚   โ””โ”€โ”€ combined-dataset.jsonl         # Deduplicated + merged
    โ””โ”€โ”€ scripts/
        โ”œโ”€โ”€ parse_emails.py
        โ”œโ”€โ”€ parse_whatsapp.py
        โ””โ”€โ”€ train_lora.py

6. Model Files & Checkpoints

Model / FileSizeLocationSource
Qwen 2.5 7B Instruct (Q4_KM)~4โ€“5 GB~/.cache/huggingface/HuggingFace
thota-style-lora.gguf100โ€“500 MBai-models/thota-style-lora/Trained output
OpenVoice V2 checkpoints~400 MBopenvoice-v2/checkpoints_v2/myshell-ai/S3
MeloTTS (EN base)~300 MBopenvoice-v2/myshell-ai/MeloTTS
Reference audio (1hr)~1 GB (24kHz)voice-references/Thota's recordings

7. Process Management

ServiceManagerRestart PolicyCommand / Config
Ollama serverlaunchd or tmuxRestart on crash, start on bootlaunchctl start com.ollama.server
FastAPI TTS serverlaunchd or tmuxRestart on crash, start on bootuvicorn main:app --host 0.0.0.0 --port 8000 --workers 1
SvelteKit/DenolaunchdRestart on crash, start on bootdeno task start
Cloudflare TunnellaunchdRestart on crash, reconnect on networkcloudflared tunnel run --token <TOKEN>

8. Privacy Architecture

๐Ÿ”’ Defense in Depth
  • Layer 1 โ€” Local only: All training and inference happens on Mac Mini. Ollama makes zero outbound requests during inference.
  • Layer 2 โ€” Encrypted storage: FileVault full-disk encryption. Encrypted DMG for sensitive voice samples.
  • Layer 3 โ€” Network isolation: All services on localhost. Cloudflare Tunnel is outbound-only. No ports on router.
  • Layer 4 โ€” SSH hardening: Key-only auth, no password auth, non-standard port optional.
  • Layer 5 โ€” Dataset curation: Deduplication + quality filtering prevents memorization of exact phrasing.
  • Layer 6 โ€” API auth: FastAPI middleware adds API key check for any external tunnel requests.