Under the hood

Every component is open-source and replaceable. No vendor lock-in, no proprietary dependencies. Your infrastructure, your rules.

The pipeline

Four steps from voice to response, all on your own infrastructure.

1

Capture

Tap the device. Audio buffers to RAM while Wi-Fi reconnects. TLS keeps the connection encrypted end-to-end.

2

Transcribe

Audio is sent over an encrypted tunnel to faster-whisper for local speech-to-text. Your voice never reaches an external service.

3

Reason

A Python orchestrator routes each request to the right agent. Ollama generates the response locally on your Mac Mini.

4

Respond

EdgeTTS converts text to natural speech. Audio streams back to your wearable device over the encrypted tunnel.
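
The four steps above can be pictured as a simple chain. This is a minimal sketch rather than the actual orchestrator: the class name, the stage signatures and the stand-in stages are all assumptions made for illustration, and a real deployment would plug in faster-whisper, Ollama and EdgeTTS clients where the lambdas are.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Chains the server-side stages; each stage is an injected callable."""
    transcribe: Callable[[bytes], str]   # audio -> text (speech-to-text)
    reason: Callable[[str], str]         # text -> reply text (language model)
    synthesise: Callable[[str], bytes]   # reply text -> audio (text-to-speech)

    def handle(self, captured_audio: bytes) -> bytes:
        # Step 1 (capture) happens on the device; the server receives the bytes.
        if not captured_audio:
            raise ValueError("no audio captured")
        text = self.transcribe(captured_audio)   # step 2
        reply = self.reason(text)                # step 3
        return self.synthesise(reply)            # step 4


# Stand-in stages so the sketch runs without any services (hypothetical).
pipeline = VoicePipeline(
    transcribe=lambda audio: audio.decode("utf-8"),
    reason=lambda text: f"You said: {text}",
    synthesise=lambda reply: reply.encode("utf-8"),
)
```

Because every stage is passed in, any one of them can be swapped without touching the others, which is the property the stack is built around.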

Open-source stack

Every component is replaceable. No vendor lock-in, no proprietary dependencies.

Ollama, Python, FastAPI, Redis, cron, faster-whisper, EdgeTTS, ESP-IDF, TLS / HTTPS, ESP32-C6, ESP32-S3, Nginx, Signal, Mac Mini M4

The stack in detail

What each component does and why it was chosen.

Ollama

Local Intelligence

Runs AI models directly on your Mac Mini. Your prompts and context never leave the machine. Supports swapping models without code changes.
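
As a rough illustration of what "swapping models without code changes" can look like, the sketch below reads the model tag from an environment variable and calls Ollama's HTTP API on its default local port. The variable names `OPENRAIN_MODEL` and `OLLAMA_URL` and the example model tag are assumptions, not part of the project.

```python
import json
import os
import urllib.request

# Ollama listens on port 11434 by default; override via a (hypothetical) env var.
OLLAMA_URL = os.environ.get("OLLAMA_URL", "http://127.0.0.1:11434")
# Hypothetical variable name; the model tag is only an example.
OLLAMA_MODEL = os.environ.get("OPENRAIN_MODEL", "llama3.2")


def build_generate_request(prompt: str, model: str = OLLAMA_MODEL) -> urllib.request.Request:
    """Build a non-streaming request for Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = OLLAMA_MODEL) -> str:
    """Send the prompt to the local Ollama server and return the reply text."""
    with urllib.request.urlopen(build_generate_request(prompt, model), timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

Changing the model then means changing one environment variable and restarting the service; the calling code stays the same.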

Python Orchestrator

Multi-Agent Gateway

A custom FastAPI service that routes requests to the right agent, manages tool permissions, and enforces security policies. The brain that connects everything together.
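
The routing and permission checks described here might look something like the following. It is a sketch under stated assumptions: keyword matching stands in for whatever routing logic the orchestrator really uses, and the agent names and tool names are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Agent:
    name: str
    keywords: frozenset[str]
    # Empty by default: an agent is chat-only until tools are granted.
    allowed_tools: frozenset[str] = field(default_factory=frozenset)


# Hypothetical agents; the last entry acts as the fallback.
AGENTS = [
    Agent("calendar", frozenset({"remind", "schedule", "meeting"}), frozenset({"calendar.read"})),
    Agent("general", frozenset()),
]


def route(text: str, agents: list[Agent] = AGENTS) -> Agent:
    """Pick the first agent whose keywords appear in the request, else the fallback."""
    words = set(text.lower().split())
    for agent in agents:
        if agent.keywords & words:
            return agent
    return agents[-1]


def check_tool(agent: Agent, tool: str) -> None:
    """Raise PermissionError unless the tool was explicitly granted to this agent."""
    if tool not in agent.allowed_tools:
        raise PermissionError(f"agent {agent.name!r} may not use {tool!r}")
```

Keeping the permission check in the gateway, rather than inside each agent, means a misbehaving agent cannot grant itself a tool.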

Redis

Shared State

Shared memory between agent processes. Handles message passing, conversation context, and real-time state. Keeps agents coordinated without tight coupling.

cron

Scheduled Tasks

Morning briefings, reminder delivery, evening summaries, and periodic maintenance. Simple, reliable, battle-tested Unix scheduling.

faster-whisper

Speech to Text

Transcribes your voice to text locally. Based on OpenAI's Whisper model, optimised for speed with CTranslate2. Runs as a simple HTTP service.
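
For orientation, a transcription call with the faster-whisper Python package looks roughly like this. The model size, device and compute type are assumptions chosen for the example, not the project's settings, and the HTTP wrapper around it is left out.

```python
from typing import Iterable


def join_segments(segments: Iterable) -> str:
    """Concatenate the text of each transcribed segment into one string."""
    return " ".join(seg.text.strip() for seg in segments).strip()


def load_model(model_size: str = "small"):
    """Load the model once at service start-up (settings are assumptions)."""
    # Imported lazily so join_segments works without the package installed.
    from faster_whisper import WhisperModel

    return WhisperModel(model_size, device="cpu", compute_type="int8")


def transcribe_file(model, path: str) -> str:
    """Transcribe one audio file; transcribe() yields segments lazily."""
    segments, _info = model.transcribe(path, beam_size=5)
    return join_segments(segments)
```

A service would call `load_model()` once and reuse the model for every request, since loading the weights is far slower than transcribing a short utterance.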

EdgeTTS

Text to Speech

Converts agent responses to natural-sounding speech. Different voices per agent so you always know who's talking. Also runs as a local HTTP service.

HTTPS + Device Auth

Encrypted Transport

Every device connects over HTTPS with its own unique API key. TLS encrypts all traffic. No VPN required — standard, auditable, works everywhere.

ESP-IDF

Firmware Framework

Custom firmware built from scratch with Espressif's official framework. Not Arduino. Full control over the audio pipeline, power management, and OTA updates.

Signal (via signal-cli)

Notifications

Reminders, follow-ups, and structured data delivered to your phone via Signal. Headless client on the server — no GUI, no companion app needed.

Nginx

Web Server

Serves the OpenRain web interface and reverse-proxies internal services. Handles TLS termination, keeping everything behind a single secure entry point.

Web Interface

Command Centre

A private web dashboard for managing your agents, browsing the library, viewing reports, and seeing full transparency logs of every agent's decisions and actions.

Mac Mini M4

Server Hardware

16GB Apple Silicon running the entire stack. Sits in a colocation data centre — your own hardware, on your own terms. Quiet, efficient, and always on.

Hardware

Purpose-built voice gateways. One firmware, multiple form factors.

ESP32-C6 Watch

Primary Wearable

1.69" capacitive touchscreen, built-in mic and speaker, Wi-Fi 6, RISC-V architecture.

  • 240x280 display
  • ES8311 audio codec
  • 512KB SRAM (~5-8s audio buffer)
  • Battery powered, watch form factor
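
The ~5-8s figure can be sanity-checked with simple arithmetic. The audio format is not stated here, so the sketch assumes 16 kHz, 16-bit mono, a common format for speech recognition. It also assumes that only about 160-256 KB of the 512KB SRAM is free for audio once firmware, Wi-Fi and display buffers have taken their share.

```python
def buffer_seconds(buffer_bytes: int, sample_rate_hz: int = 16_000,
                   bytes_per_sample: int = 2, channels: int = 1) -> float:
    """Seconds of uncompressed PCM audio that fit in a buffer of the given size."""
    bytes_per_second = sample_rate_hz * bytes_per_sample * channels
    return buffer_bytes / bytes_per_second


KIB = 1024

# Assumed usable share of the ESP32-C6's SRAM (not a measured figure).
low = buffer_seconds(160 * KIB)    # 5.12 s
high = buffer_seconds(256 * KIB)   # 8.192 s

# The same calculation for 8MB of PSRAM, as on the ESP32-S3 board.
psram = buffer_seconds(8 * KIB * KIB)  # 262.144 s, a little over four minutes
```

Under these assumptions the whole 512KB would hold about 16 seconds, so the quoted range implies that roughly a third to a half of SRAM is available for audio.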

ESP32-S3 Board

Desk / Development

1.69" touchscreen, 8MB PSRAM for extended audio. Needs external mic and speaker modules.

  • 240x280 display
  • INMP441 mic + MAX98357A amp
  • 8MB PSRAM (minutes of audio, effectively unlimited for voice requests)
  • USB powered, desk use

Networking

Secure tunnels, over-the-air updates, and zero cloud dependency.

HTTPS with device keys

Each device authenticates to the server with its own unique API key over HTTPS. TLS encrypts everything in transit. No VPN stack needed on constrained hardware — just standard, reliable web protocols.

OTA updates

Flash once via USB, then all subsequent firmware updates are delivered over the air. Triggered by voice command: "Update all devices." Essential for managing multiple devices.

Multi-SSID support

Multiple Wi-Fi networks baked in at compile time. Devices connect to whichever known network is available. Unknown Wi-Fi? Tether to your phone.
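
The selection logic can be sketched as follows. The real firmware is written against ESP-IDF in C; Python is used here only to show the idea. The network names and passphrases are placeholders, and preferring the strongest signal among known networks is an assumption about how "whichever is available" is resolved.

```python
from typing import Optional

# Hypothetical compiled-in list of (SSID, passphrase) pairs.
KNOWN_NETWORKS = [
    ("home-wifi", "example-passphrase-1"),
    ("office-wifi", "example-passphrase-2"),
    ("phone-hotspot", "example-passphrase-3"),
]


def pick_network(visible: dict[str, int]) -> Optional[tuple[str, str]]:
    """Choose the known network with the strongest signal.

    `visible` maps each scanned SSID to its RSSI in dBm (closer to 0 is stronger).
    Returns (ssid, passphrase), or None if no known network is in range.
    """
    candidates = [
        (visible[ssid], ssid, key)
        for ssid, key in KNOWN_NETWORKS
        if ssid in visible
    ]
    if not candidates:
        return None
    _rssi, ssid, key = max(candidates)
    return ssid, key
```

Including the phone's hotspot in the compiled-in list is what makes tethering work on unknown Wi-Fi without reflashing.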

Power-aware connectivity

Wi-Fi enters light sleep between interactions. On tap, audio buffers to RAM while Wi-Fi reconnects. The user perceives no delay.
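
The buffering behaviour can be modelled in a few lines. This is an illustrative sketch in Python, not the firmware itself, and the policy of rejecting new audio once the buffer is full (so the start of the utterance is kept) is an assumption.

```python
class CaptureBuffer:
    """Holds captured audio in memory until the network link is ready."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self._chunks: list[bytes] = []
        self._size = 0

    def push(self, chunk: bytes) -> bool:
        """Store a chunk; return False and drop it if it would overflow the buffer."""
        if self._size + len(chunk) > self.capacity_bytes:
            return False
        self._chunks.append(chunk)
        self._size += len(chunk)
        return True

    def drain(self) -> bytes:
        """Return everything captured so far and empty the buffer."""
        data = b"".join(self._chunks)
        self._chunks.clear()
        self._size = 0
        return data
```

Recording starts on tap and calls `push` for each audio chunk; once Wi-Fi is back, `drain` hands the backlog to the uploader, so reconnection time overlaps with speaking time instead of adding to it.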

Your data never leaves

Every layer is designed to keep your information exactly where it belongs.

Local Intelligence

Ollama runs on your Mac Mini. Your prompts, your context, your answers: all processed on a machine that you physically own. No listening. No recording. No always-on microphone.

Encrypted Transport

TLS secures every connection between your wearable devices and your server. Unique API key per device. No shared secrets. No plaintext.

No Cloud Dependency

No subscriptions to cancel. No terms of service to change. No vendor who can read your data or shut down the API.

Tap-Only Activation

No always-on microphone. No wake word listening. The device records only when you deliberately tap it. Intent is always clear.

Security model

Defence in depth. Every layer adds protection; no single layer is trusted alone.

Per-device keys

Every device gets its own unique API key. Compromising one device doesn't compromise the others. Keys can be rotated individually via the web UI.
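
One way to implement per-device keys with individual rotation is sketched below. Storing only a hash of each key and comparing in constant time are design choices assumed for the example; the source does not say how keys are stored or checked.

```python
import hashlib
import hmac
import secrets


class DeviceKeyStore:
    """Maps device id -> SHA-256 digest of its API key (raw keys are not kept)."""

    def __init__(self) -> None:
        self._digests: dict[str, bytes] = {}

    @staticmethod
    def _digest(key: str) -> bytes:
        return hashlib.sha256(key.encode("utf-8")).digest()

    def issue(self, device_id: str) -> str:
        """Create or rotate the key for one device and return it once."""
        key = secrets.token_urlsafe(32)
        self._digests[device_id] = self._digest(key)
        return key

    def verify(self, device_id: str, presented_key: str) -> bool:
        """Check a presented key against the stored digest in constant time."""
        expected = self._digests.get(device_id)
        if expected is None:
            return False
        return hmac.compare_digest(expected, self._digest(presented_key))
```

Calling `issue` again for the same device replaces its key and leaves every other device untouched, which is what makes individual rotation cheap.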

Least privilege agents

All agents start chat-only with no tool access. Tools are enabled incrementally per-agent. An agent that reads email cannot send email: reading and sending are handled by separate agents, with human confirmation in between.

Read/write separation

Agents that read external data (email, web) are never given write or send capabilities on those same systems. This limits the blast radius of prompt injection attacks.
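
A rule like this can be checked mechanically whenever an agent's tool set is configured. The sketch assumes a hypothetical `<system>.<action>` naming scheme for tools and an illustrative split of actions into read and write groups.

```python
# Hypothetical tool naming: "<system>.<action>", e.g. "email.read".
READ_ACTIONS = {"read", "search", "fetch"}
WRITE_ACTIONS = {"send", "delete", "post", "write"}


def violates_separation(tools: set[str]) -> set[str]:
    """Return the systems on which a tool set mixes read and write actions."""
    readers: set[str] = set()
    writers: set[str] = set()
    for tool in tools:
        system, _, action = tool.partition(".")
        if action in READ_ACTIONS:
            readers.add(system)
        elif action in WRITE_ACTIONS:
            writers.add(system)
    return readers & writers
```

An empty result means the configuration is acceptable; anything else names the systems where an injected instruction read from external data could also be acted upon.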

Confirmation gates

Configurable per-agent. Agents that take outward-facing actions (send, delete, post) require explicit voice confirmation before proceeding. Chat-only agents don't need it.
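
In code, a confirmation gate amounts to a check placed in front of the action. The function and parameter names below are assumptions; `confirm` stands in for whatever asks the user for a spoken yes or no.

```python
from typing import Callable

# Outward-facing actions, taken from the examples in the text.
OUTWARD_ACTIONS = frozenset({"send", "delete", "post"})


def run_action(action: str,
               perform: Callable[[], str],
               confirm: Callable[[str], bool],
               gated_actions: frozenset[str] = OUTWARD_ACTIONS) -> str:
    """Run `perform`, asking for confirmation first if the action is gated.

    `gated_actions` can differ per agent; chat-only agents can pass an empty set.
    """
    if action in gated_actions and not confirm(f"Confirm {action}?"):
        return "cancelled"
    return perform()
```

The gate sits in the caller rather than in the agent, so an agent cannot skip it by deciding that confirmation is unnecessary.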

Encrypted drives

Server storage is encrypted at rest. Even if the hardware is physically lost or stolen, your private data is not exposed. Your conversations and files remain unreadable without the decryption key.

Full transparency

Every action taken by every agent is logged. The web interface lets you see exactly what each agent can see, how it makes decisions, and what it did. No hidden behaviour. Total transparency of thought and intent.
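
A log of this kind is often kept as one JSON record per line, which is easy to append to and easy for a web interface to read back. The record fields below are assumptions chosen for illustration, not the project's actual schema.

```python
import io
import json
import time
from typing import Optional, TextIO


def log_action(stream: TextIO, agent: str, action: str, detail: dict,
               timestamp: Optional[float] = None) -> dict:
    """Append one JSON line describing an agent action and return the record."""
    record = {
        "ts": time.time() if timestamp is None else timestamp,
        "agent": agent,
        "action": action,
        "detail": detail,
    }
    stream.write(json.dumps(record, sort_keys=True) + "\n")
    return record


# Usage with an in-memory stream; a service would open a file in append mode.
demo_log = io.StringIO()
log_action(demo_log, "mail-reader", "email.read", {"count": 3}, timestamp=0.0)
```

Writing the record before the action is performed, and never rewriting earlier lines, keeps the log useful even when an action fails halfway.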