NEONVIL & Confidential B2B SaaS Client
A Case Study: An Event-Driven Multi-Agent Office on Serverless Firestore

Business Outcome
A multi-agent 'virtual office' on Firebase — eight specialised AI agents plan, execute, and audit work on top of Firestore as both message bus and real-time UI feed. MVP feature-complete and QA-hardened, scheduled for pilot launch.
Event-Driven Pipeline
Eight agents orchestrated by Firestore triggers — no queue, no WebSocket, no polling.
Self-Auditing Outputs
A dedicated auditor agent reviews every labor-unit output against the original plan before it reaches the user.
Zero-Trust Multi-Tenant
LLM keys in Secret Manager, tenant-scoped Firestore rules, and admin-only writes to agent configs and system defaults.
The numbers behind the platform:
8
Specialised AI Agents
29
Executable Skills in Registry
Hot-Swappable LLM Providers
5
Firestore Triggers Drive the Pipeline
Context & Challenges
The client wanted to replace ad-hoc prompt-engineering — users copy-pasting into ChatGPT, then copy-pasting results back into Gmail, Telegram, or Calendar — with a single natural-language inbox where a team of specialised AI agents autonomously plans, executes, audits, and delivers outcomes, with human approval gates on anything high-impact. The constraint: no queue, no WebSocket, no polling — everything event-driven on Firestore.
Five critical engineering challenges:
01
Unbounded LLM cost and hallucinated outputs with no second pair of eyes: a dedicated auditor agent runs after every labor-unit output, with a retry cap of 3 before escalating for human review.
02
Heterogeneous LLM providers per agent: only Secret Manager secret IDs persist in Firestore; raw keys resolve at function runtime and are stripped before any agent config is served to the client.
03
Stale tasks on a serverless pipeline: a scheduled function runs every 12 hours, uses a collection-group query to find tasks stuck in 'executing' for longer than 9 minutes, and nudges the user.
04
Firestore is not a queue: idempotent triggers guarded by status transitions (pending → executing → awaiting_audit) prevent double-fired audits on concurrent onUpdate events.
05
Strict multi-tenant isolation with zero client-side trust: Firestore rules lock agents and system defaults to deny-all; every mutation flows through Cloud Functions using the Admin SDK.
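The status-transition guard from challenge 04 can be sketched as a pure function over the before and after states that an update trigger receives. This is a minimal illustration, not the project's code: the TaskSnapshot shape and the function names are assumptions, and only the three status values come from the case study.

```typescript
// Status values named in the case study; the real pipeline may define more.
type TaskStatus = "pending" | "executing" | "awaiting_audit";

// Hypothetical minimal view of a task document.
interface TaskSnapshot {
  status: TaskStatus;
}

// An update handler sees both the previous and the new document state.
// The audit starts only on the write that moves the task from "executing"
// to "awaiting_audit". Updates that leave the status unchanged, or that
// touch a task already awaiting audit, do not start a second audit.
function shouldStartAudit(before: TaskSnapshot, after: TaskSnapshot): boolean {
  return before.status === "executing" && after.status === "awaiting_audit";
}

// Allowed forward transitions; every other status change is rejected.
const NEXT_STATUS: Record<TaskStatus, TaskStatus | null> = {
  pending: "executing",
  executing: "awaiting_audit",
  awaiting_audit: null,
};

function isValidTransition(from: TaskStatus, to: TaskStatus): boolean {
  return NEXT_STATUS[from] === to;
}
```

A guard like this filters out unrelated field updates on the same document. Protecting against the same event being delivered twice would additionally need something like a transactional claim on the task, which is outside this sketch.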
Project Goals
Kanban Frontend
A real-time Kanban SPA in React 19 + MUI 7 + Redux Toolkit, with @dnd-kit drag-and-drop and Firestore listeners — no WebSocket infrastructure.
Agent Orchestration
An event-driven Cloud Functions Gen 2 backend with 8 specialised agents, a plan-execute-audit loop, and tenant-scoped Secret Manager for LLM keys.
Skills Registry
A 29-skill registry that agents invoke for Gmail, Calendar, Telegram, WhatsApp, web search, content, and memory operations.
Our Solution
Three layers, one substrate. The frontend drops missions into Firestore. The orchestration backend plans, executes, and audits via triggers. The skills layer reaches into external tools through a dispatcher registry. Firestore is both the message bus and the real-time UI feed.
Kanban Frontend
Drag-and-drop missions, real-time agent feedback
Real-time Kanban (Intake / In Progress / Needs Approval / Done) via Firestore listeners
@dnd-kit for WCAG-compliant drag-and-drop (keyboard and screen-reader)
Outcome dashboard: hours saved, LLM cost, worker credits
Virtualised long lists via react-virtuoso; markdown via react-markdown
Agent Orchestration
Plan, execute, audit — no queue, no WebSocket
The orchestrator produces a JSON plan with human-in-the-loop flags; the auditor validates every output against that plan
5 Firestore triggers drive the pipeline end-to-end — zero queue infrastructure
Auditor veto loop retries up to 3 times before escalating for human review
A monitor agent scans for stale tasks every 12 hours and nudges the user
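The auditor veto loop above can be read as a small decision function: an approved output is delivered, a rejected output is retried, and after the third retry the task is escalated. The sketch below assumes a two-valued verdict and hypothetical names (AuditVerdict, NextStep, nextStepAfterAudit); only the cap of 3 comes from the case study.

```typescript
// Retry cap stated in the case study.
const MAX_AUDIT_RETRIES = 3;

// Hypothetical verdict and outcome labels for illustration.
type AuditVerdict = "approved" | "rejected";
type NextStep = "deliver" | "retry" | "escalate_to_human";

// retriesSoFar counts the re-executions already performed for this task.
// A rejection triggers another retry while fewer than MAX_AUDIT_RETRIES
// have been used; once the cap is reached the task goes to human review.
function nextStepAfterAudit(verdict: AuditVerdict, retriesSoFar: number): NextStep {
  if (verdict === "approved") {
    return "deliver";
  }
  return retriesSoFar < MAX_AUDIT_RETRIES ? "retry" : "escalate_to_human";
}
```

Keeping the decision pure makes the cap easy to unit-test independently of any LLM call or Firestore write.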
Skills & Integrations
The hands that reach into external tools
29-skill registry with a dispatcher pattern — agents request skills, not services
Multi-provider LLM: Gemini planning, GPT-4o extraction, Claude Sonnet long-form
API keys live in Secret Manager, never in Firestore — only secret IDs persist
Gmail OAuth2, Telegram webhook, WhatsApp Cloud API — platform and per-tenant
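The dispatcher pattern behind the skills registry can be illustrated with a name-to-handler map: agents ask for a skill by name and never import the underlying service. Everything in this sketch is assumed for illustration, including the SkillRegistry class, the result shape, and the "memory.echo" skill name. Real handlers would likely be asynchronous; they are kept synchronous here for brevity.

```typescript
// Hypothetical input and result shapes for a skill invocation.
type SkillInput = Record<string, unknown>;
type SkillResult = { ok: true; output: string } | { ok: false; error: string };
type SkillHandler = (input: SkillInput) => SkillResult;

class SkillRegistry {
  private readonly handlers = new Map<string, SkillHandler>();

  // Registering the same name twice is treated as a programming error.
  register(name: string, handler: SkillHandler): void {
    if (this.handlers.has(name)) {
      throw new Error(`Skill already registered: ${name}`);
    }
    this.handlers.set(name, handler);
  }

  // Unknown skills produce an error result instead of throwing, so an
  // agent that names a missing skill fails in a controlled way.
  dispatch(name: string, input: SkillInput): SkillResult {
    const handler = this.handlers.get(name);
    if (!handler) {
      return { ok: false, error: `Unknown skill: ${name}` };
    }
    return handler(input);
  }
}

// Example registration with a made-up skill that echoes its input text.
const registry = new SkillRegistry();
registry.register("memory.echo", (input) => ({
  ok: true,
  output: String(input["text"] ?? ""),
}));
```

With this shape, adding a skill means registering one more handler; the agents and the dispatcher stay unchanged.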
Effort Allocation
Business Logic (35%)
Codebase (22%)
Requirements & Spec (15%)
Infrastructure (12%)
Security (10%)
QA & Reliability (6%)
Infrastructure & Technologies
Firestore (Native)
Chosen so writes become events and client listeners become pub/sub — one substrate covers the agent message bus and the real-time UI feed, with zero queue or WebSocket infrastructure.
Cloud Functions Gen 2
Chosen because labor-unit agents chain LLM calls and fan out to external APIs for longer than 9 minutes. Gen 1's 540-second cap was a blocker; Gen 2's Cloud Run base allows up to 60 minutes for HTTP-triggered functions.
Secret Manager (GCP)
Chosen to keep LLM provider keys out of Firestore — only secret IDs persist, raw keys resolve at function runtime, and the API never ships them to the client.
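One way to make sure keys never reach the client is to build the served config from an explicit allow-list of fields. The sketch below is an assumption-heavy illustration: the field names (agentId, provider, model, apiKeySecretId, apiKey) are hypothetical, and dropping the secret ID along with the raw key is a choice made here, not something the case study states.

```typescript
// Hypothetical server-side view of an agent config after the key has been
// resolved from Secret Manager at function runtime.
interface ResolvedAgentConfig {
  agentId: string;
  provider: string;
  model: string;
  apiKeySecretId: string; // Secret Manager secret ID (persisted in Firestore)
  apiKey: string;         // raw key, exists only in function memory
}

// What the client is allowed to see in this sketch.
interface PublicAgentConfig {
  agentId: string;
  provider: string;
  model: string;
}

// Copy only allow-listed fields. A field added to the resolved config
// later stays private until it is added here on purpose.
function toPublicAgentConfig(config: ResolvedAgentConfig): PublicAgentConfig {
  return {
    agentId: config.agentId,
    provider: config.provider,
    model: config.model,
  };
}
```

An allow-list is used instead of deleting known secret fields so that the default for any new field is "not served".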
Shared Types Workspace
Chosen to eliminate type drift between backend and frontend — a single npm workspace package defines Mission, Task, AgentConfig, and Plan, so schema changes break compilation on both sides rather than failing silently at runtime.
Zod
Chosen because LLM output is unstructured by nature — every plan from the orchestrator and every verdict from the auditor is Zod-parsed at ingest, so malformed responses fail loud rather than poisoning the pipeline.
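A minimal sketch of parsing an orchestrator plan at ingest with Zod follows. The plan shape (steps with agent, instruction, and requiresApproval) is an assumption made for illustration; the case study only says that the plan is JSON and carries human-in-the-loop flags.

```typescript
import { z } from "zod";

// Hypothetical plan schema; field names are illustrative.
const PlanStepSchema = z.object({
  agent: z.string().min(1),
  instruction: z.string().min(1),
  requiresApproval: z.boolean(), // human-in-the-loop flag
});

const PlanSchema = z.object({
  steps: z.array(PlanStepSchema).min(1),
});

type Plan = z.infer<typeof PlanSchema>;

// Parse raw LLM text into a typed Plan, throwing on anything malformed
// so that a bad response stops here instead of flowing downstream.
function parsePlan(rawLlmText: string): Plan {
  let candidate: unknown;
  try {
    candidate = JSON.parse(rawLlmText);
  } catch {
    throw new Error("Orchestrator response is not valid JSON");
  }
  const result = PlanSchema.safeParse(candidate);
  if (!result.success) {
    throw new Error(`Orchestrator plan failed validation: ${result.error.message}`);
  }
  return result.data;
}
```

Deriving the Plan type from the schema with z.infer keeps the runtime check and the compile-time type from drifting apart.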
Redux Toolkit + RTK Query
Chosen because the Kanban holds complex cross-slice state (drag × status × auth × agent config) that useState and useContext couldn't keep coherent — a custom Firebase base query unifies REST calls and Firestore listeners behind one cache.
@dnd-kit
Chosen because WCAG-compliant drag-and-drop (keyboard + screen-reader) was a hard requirement; react-dnd doesn't clear that bar.
Multi-Provider LLM
Chosen because no single model wins every role — Gemini 2.0 Flash plans fast, GPT-4o extracts structured data, Claude Sonnet handles long-form. Providers are selectable per agent per tenant, hot-swappable with no code change.
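Per-agent, per-tenant provider selection can be sketched as a lookup that prefers a tenant override and falls back to a system default. The role names, the override structure, and the resolveModel function are assumptions, and "claude-sonnet" is a placeholder string, not a real model ID.

```typescript
type Provider = "gemini" | "openai" | "anthropic";

interface ModelChoice {
  provider: Provider;
  model: string;
}

// Hypothetical system defaults keyed by agent role.
const SYSTEM_DEFAULTS: Partial<Record<string, ModelChoice>> = {
  orchestrator: { provider: "gemini", model: "gemini-2.0-flash" },
  extractor: { provider: "openai", model: "gpt-4o" },
  writer: { provider: "anthropic", model: "claude-sonnet" }, // placeholder ID
};

// Overrides are data (for example, loaded from tenant config), so swapping
// a provider is a config change and needs no code change.
type TenantOverrides = Partial<Record<string, Partial<Record<string, ModelChoice>>>>;

function resolveModel(
  tenantId: string,
  agentRole: string,
  overrides: TenantOverrides,
): ModelChoice {
  const choice = overrides[tenantId]?.[agentRole] ?? SYSTEM_DEFAULTS[agentRole];
  if (!choice) {
    throw new Error(`No model configured for agent role: ${agentRole}`);
  }
  return choice;
}
```

For example, a tenant that overrides only the writer role keeps the system defaults for every other agent.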
