GitHub AI Boom: My Homelab Goes Local
Published: November 20, 2023 (retrospective)
November 2023’s generative AI explosion on GitHub—repos tripled to 65k+—pushed me from cloud experimentation to Proxmox homelab builds. ChatGPT Enterprise handled scripting, but token costs and privacy concerns demanded local LLMs. Enter Ollama’s early hype.
From Cloud to Containers
GitHub’s AI project surge validated my pivot: Ollama promised free, private inference on Proxmox VMs. First homelab setup:
- Llama2 on RTX 3090 GPU passthrough
- QNAP NFS for model storage
- Uptime Kuma monitoring from day one
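Once the VM was up, querying the model was a single HTTP call. A minimal sketch against Ollama's default `/api/generate` endpoint on port 11434 (the model name and prompt here are illustrative, and the request is built but not sent since this assumes no server is running):

```python
import json
import urllib.request

# Ollama listens on port 11434 by default
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # one JSON response instead of a token stream
    }).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )

req = build_request("llama2", "Summarize this Proxmox backup log.")
# urllib.request.urlopen(req) would send it; skipped here without a live server.
```

No API keys, no per-token billing: the same three-field payload works for every model pulled onto the box.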
| Model | Avg latency | Cost | Verdict |
|---|---|---|---|
| GPT-4 (cloud) | 2.1s | $0.03/1k tokens | Reference |
| Ollama Llama2 | 1.8s | $0 | Winner |
| Local Router (planned) | 1.2s | $0 | Future |
First month: 87% cost savings by routing simple queries locally, with cloud reserved for complex reasoning tasks only.
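The savings figure is simple arithmetic: if local inference costs nothing, savings equal the fraction of tokens kept off the cloud API. A quick sketch (the 2M-token monthly volume is illustrative, not my actual usage; the $0.03/1k rate matches the table above):

```python
def monthly_cost(total_tokens: int, local_fraction: float,
                 cloud_rate_per_1k: float = 0.03) -> tuple[float, float]:
    """Return (all-cloud baseline cost, cost after routing local_fraction locally).

    Assumes local inference is free, so only the cloud remainder is billed.
    """
    baseline = total_tokens / 1000 * cloud_rate_per_1k
    with_routing = (1 - local_fraction) * baseline
    return baseline, with_routing

baseline, routed = monthly_cost(2_000_000, 0.87)
savings = 1 - routed / baseline  # savings track the locally-routed fraction
```

The model is deliberately naive (it ignores electricity and GPU amortization), but even with those folded in, a paid-off RTX 3090 beats per-token pricing quickly.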
Early Router Sketches
The LocalLLM-Router concept was born here: route 70% of queries to Ollama, reserve cloud API for edge cases. This became the foundation of Control Tower’s cost engine a year later.
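The first router sketch was nothing fancier than a length check plus a keyword heuristic. A hypothetical version of that early logic (the marker list and threshold are illustrative, not the actual Control Tower cost engine):

```python
from dataclasses import dataclass

@dataclass
class Route:
    backend: str  # "ollama" or "cloud"
    model: str

# Phrases that flag queries needing deeper reasoning (illustrative heuristic)
COMPLEX_MARKERS = ("prove", "multi-step", "compliance audit", "architecture review")

def route_query(prompt: str, max_local_len: int = 2000) -> Route:
    """Send short, simple prompts to local Llama2; escalate the rest to cloud."""
    text = prompt.lower()
    if len(prompt) > max_local_len or any(m in text for m in COMPLEX_MARKERS):
        return Route("cloud", "gpt-4")
    return Route("ollama", "llama2")
```

Crude as it is, a heuristic like this is enough to keep the bulk of day-to-day IT queries on the homelab while letting genuinely hard prompts escalate.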
Lessons
- Proxmox GPU passthrough is powerful but finicky—document everything.
- Model size matters: 7B handles most IT tasks; 13B for nuanced cybersecurity analysis.
- Privacy wins: Sensitive client data never leaves the homelab.
Want to build your own AI homelab? Let’s talk.
Next: M365 Copilot GA forces enterprise governance thinking (Jan 2024).