A standalone, decentralized, AI-first systems platform for data science, network engineering, and LLM DevOps.
Zeus is not a server. It is a protocol. A self-contained infrastructure node designed to think, act, and evolve like an AI engineer — without ever needing the internet.
Where traditional AI platforms demand cloud dependency and API costs, Zeus runs offline-first, stays sovereign, and optimizes for autonomy. It unifies five domains into one resilient stack:
| Principle | Description |
|---|---|
| Offline-First | All AI/LLM ops run locally: Ollama, sentence-transformers, Qdrant — no external APIs required. |
| Sovereign Data | Embeddings, uploads, and queries never leave the host. No data leakage. No SaaS. |
| Self-Validating | zeus-validator.sh enforces infrastructure SLAs — runs before and after every deploy. |
| Modular by Design | Docker Compose services replaced independently — no vendor lock-in. |
| Observability-First | Prometheus + Grafana + cAdvisor form a triple-layer telemetry fabric: service → container → host. |
All services operate on network_mode: host — no NAT, no proxy, no latency penalty. Just raw, efficient inter-process communication.
| Service | Role | Status |
|---|---|---|
open-webui | LLM Inference & RAG Frontend | ✅ Validated |
qdrant | Vector Database (gRPC + HTTP) | ✅ Validated |
pytorch-notebook | Research Environment (Jupyter + CUDA) | ✅ Validated |
unsloth | Efficient LoRA Fine-tuning | ✅ Validated |
n8n | Workflow Orchestrator | ✅ Validated |
tika | Document Intelligence Engine | ✅ Validated |
prometheus + grafana + cadvisor | Observability Triad | ✅ Validated |
adguard + searxng | Edge Security & Search | ✅ External VMs |
| Capability | Status | Details |
|---|---|---|
| Secrets Management | ✅ Production | All API keys/tokens moved to .env — zero plaintext in compose |
| Infrastructure Validation | ✅ Production | zeus-validator.sh runs all checks, reports all failures, exits correctly |
| Resource Governance | ✅ Production | Memory limits enforced, GPU reservations correct, no overcommit |
| gRPC + RAG | ✅ Production | Qdrant gRPC (http://localhost:6334), hybrid RAG with RAG_TOP_K=10 |
| Observability Stack | ✅ Production | Prometheus, Grafana, cAdvisor — local metrics, no cloud |
| Network Stability | ✅ Production | Bridged LAN (br0) on enp3s0 — no NAT, no packet loss |
Low-risk, high-value enhancements — only where they improve stability, security, or usability.
| Goal | Priority | Notes |
|---|---|---|
| Automated Deploy+Validate Loop | Medium | Alias dcup — deploys + validates in one command (no breaking changes) |
| GPU Memory Monitoring | Low | Extend zeus-validator.sh to check VRAM usage — optional for users with high GPU load |
| Remote Access Gateway | Medium | Optional: Secure web gateway (HTTPS, auth) — still no external dependency, just better TLS |
| Multi-Node Sync (Future) | Long-term | qdrant-data sync over gRPC between Zeus nodes — not needed for single-node Zeus |
localhost. No cloud, no SaaS..env — All API keys/tokens moved to file-based secrets.| Metric | Config | Result |
|---|---|---|
| Embedding Inference (32 docs) | RAG_EMBEDDING_BATCH_SIZE=32, GPU:0 | ~18 ms/doc |
| RAG Query (Top-K=10, Hybrid) | qdrant-data: ~42GB, gRPC | P95: 22ms |
| Document Ingestion | n8n pipeline (PDF → Tika → Qdrant) | ~1.2 docs/sec, 98% success |
| LoRA Fine-tuning | memory: 8G, cpus: 2.0 | ~350 tokens/sec, VRAM < 18GB |
DESIGN, ARCHITECTURE, AND DEPLOYMENT BY
Stephen Sargent
Chief Architect, Sovereign AI Infrastructure | steve@adminsnet.net | Phoenix, AZ
— 38 Years of No-Nonsense Systems & Network Design —
Licence & Attribution
This specification describes a self-hosted, open-architecture system built with Docker, Prometheus, and community LLMs. No trade secrets — only engineering choices. No external APIs — all data stays in-house.
© Zeus Operating Collective — for internal documentation and publication. Not a trademark. Yet.