Theo Tardigrade
A production AI assistant I built to give our team natural language access to every business system we run - HubSpot, Gong, Zendesk, PandaDoc, BigQuery, and more - at a fraction of the cost of enterprise alternatives.
Theo Tardigrade — Enterprise AI Assistant
What It Is
Theo Tardigrade is a full production AI assistant built from the ground up to give employees natural language access to every major business system in one place. It connects to HubSpot CRM, Gong call intelligence, Zendesk support, PandaDoc documents, Campfire accounting, BigQuery data warehouse, and an internal knowledge base — all through a custom architecture I designed and built. It has been in daily production use since December 2025, and is currently on version 3.2.0.
The Problem
1. Business AI tools are expensive and siloed
Every business platform now ships with an AI feature. HubSpot has AI. Gong has AI. Zendesk has AI. But each one only knows about its own data, each one comes with a premium price tag, and none of them can answer a question that requires crossing system boundaries. A sales rep asking "what did we discuss on that call last month, and where does the deal stand today?" has to open Gong, then open HubSpot, then piece it together manually. The AI in each product can't help with the other one.
Enterprise-grade tools that attempt to solve this — unified AI platforms, AI copilots for revenue teams — are priced in the hundreds of thousands of dollars per year. That's before customization, before integration work, before seat licensing scales up.
2. Employees were solving it themselves — badly
Without a sanctioned tool, employees found their own solutions. Data was being copied into free consumer AI tools — deal notes, call transcripts, customer records, financial summaries — pasted into ChatGPT to get a summary or a draft. This was happening regularly and created real risk: company data in a consumer product, no visibility into what was shared, no way to audit or control it, no way to get it back.
This isn't unusual. It's happening in most organizations right now. But it needed to be solved, not just forbidden.
3. We were already in Google's ecosystem
The entire organization runs on Google Workspace. Our infrastructure was already in Google Cloud. Using that foundation — Gemini instead of a paid API, Cloud Run instead of managed AI platforms, Google's IAM and OAuth instead of a separate identity layer — meant we could build something enterprise-grade without enterprise pricing.
The Solution
I built Theo Tardigrade: a custom AI assistant that employees access through a modern chat interface, with real connections to every system they actually use. It is the company's sanctioned AI tool for working with internal data. It is not a wrapper around an API. It is a full system with its own architecture, security model, access controls, deployment pipeline, and operational tooling.
The core design decision was to use Model Context Protocol (MCP) — a structured communication pattern that lets the AI agent call tools across independently deployed services, rather than embedding all integrations into a single monolithic application. Each business system gets its own MCP server: isolated, independently deployable, with its own authentication, rate limiting, and caching logic. The agent doesn't know how to talk to HubSpot directly — it knows how to call the HubSpot MCP server, which handles everything about that integration. This means adding a new data source doesn't require touching the agent.
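The decoupling described above can be sketched as a tool registry per server. This is an illustrative sketch of the pattern, not the actual Theo codebase or the official MCP SDK; the names (`MCPServer`, `get_deal`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    description: str
    handler: Callable[..., dict]

class MCPServer:
    """One independently deployed integration (e.g. HubSpot, Gong)."""
    def __init__(self, name: str):
        self.name = name
        self._tools: dict[str, Tool] = {}

    def tool(self, name: str, description: str):
        def register(fn):
            self._tools[name] = Tool(name, description, fn)
            return fn
        return register

    def call(self, tool_name: str, **kwargs) -> dict:
        return self._tools[tool_name].handler(**kwargs)

    def list_tools(self) -> list[str]:
        return sorted(self._tools)

hubspot = MCPServer("hubspot")

@hubspot.tool("get_deal", "Fetch a deal by ID")
def get_deal(deal_id: str) -> dict:
    # A real server would call the HubSpot API with the user's OAuth token.
    return {"id": deal_id, "stage": "negotiation"}

# The agent only sees (server, tool name) pairs -- never the HubSpot API.
print(hubspot.list_tools())                               # ['get_deal']
print(hubspot.call("get_deal", deal_id="D-42")["stage"])  # negotiation
```

Because the agent's contract is only "call tool X on server Y," a new integration is just a new server registered alongside the others.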
Systems Architecture
*(Architecture diagram omitted.)*
Every layer is independently deployed and independently scalable. A failure in the Campfire MCP server does not affect the agent's ability to answer a HubSpot question. New integrations are added by deploying a new MCP server — zero changes to the agent or frontend.
LLM Flexibility
The system is not locked to a single model. The frontend exposes a model selector, and users can switch between available LLMs per conversation. Gemini 2.0 Flash is the default and primary model — it is fast, cost-effective, and deeply integrated with Google's infrastructure. Claude (Anthropic) is also available today. The architecture is designed so that additional models can be added as the LLM market evolves, without changes to the MCP layer or the integrations beneath it. The choice of which model to use is kept at the application layer, not buried in the infrastructure — meaning we are never dependent on a single provider's pricing, availability, or capability trajectory.
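Keeping model choice at the application layer can be as simple as a registry behind one interface. A minimal sketch under stated assumptions — the client names and lambdas here are stand-ins, not the production wiring:

```python
class ModelRegistry:
    """Maps model names to completion callables; the rest of the app
    only ever calls registry.complete()."""
    def __init__(self):
        self._models = {}
        self.default = None

    def register(self, name, complete_fn, default=False):
        self._models[name] = complete_fn
        if default or self.default is None:
            self.default = name

    def complete(self, prompt, model=None):
        return self._models[model or self.default](prompt)

registry = ModelRegistry()
registry.register("gemini-2.0-flash", lambda p: f"[gemini] {p}", default=True)
registry.register("claude", lambda p: f"[claude] {p}")

# A per-conversation model switch is just a parameter, not an
# infrastructure change -- MCP servers never see which model is used.
print(registry.complete("hello"))                  # [gemini] hello
print(registry.complete("hello", model="claude"))  # [claude] hello
```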
Integrations
| System | Tools | What Theo Can Do |
|---|---|---|
| HubSpot CRM | 15 | Query contacts, companies, deals, tickets, notes. Pipeline analysis, deal health, stage velocity. |
| Gong | 6 | Search calls by date or participant. Pull full speaker-separated transcripts. Get AI-generated summaries, topics, and action items. |
| PandaDoc | 7 | List documents across workspaces (Sales, Legal, HR, Finance, Operations). Check signing status and progress. Browse templates. |
| Zendesk | 8 | Search tickets by status, priority, or Zendesk Query Language. View full comment history. Look up customers and organizations. |
| Campfire | 13 | View vendors, bills, invoices, chart of accounts, journal entries. Generate balance sheets and income statements. |
| BigQuery | 6 | Natural language data warehouse queries. Preview cost and validate SQL before executing. Per-user permissions enforced at query time. |
| RAG Knowledge Base | 5 | Semantic + keyword search over internal documents synced from Google Drive. Supports PDF, Docs, Sheets, Slides, DOCX, Markdown. |
Total: 7 MCP servers, 60 tools accessible in a single conversation.
Access Control & Security
Because the organization runs entirely on Google Workspace, the access control model is built on top of infrastructure that already exists and that IT already manages — no new identity provider, no new directory, no additional vendor.
Authentication is Google OAuth. Employees sign in with their Google Workspace account. No separate credentials to manage.
Authorization uses a layered model:
- Google Groups control which employees can access the system at all. Adding or removing access is a group membership change in Google Admin — something any IT admin already knows how to do.
- Identity-Aware Proxy (IAP) sits in front of every MCP server. Requests that do not originate from authenticated service accounts are rejected before they reach application code. This means the HubSpot server, the Gong server, and every other integration is invisible to the public internet.
- Per-user OAuth tokens are stored encrypted at rest using Google Cloud KMS. The agent calls each system using the authenticated user's own OAuth token — so a user can only see data their account has access to in that system. Theo cannot return data from HubSpot that the user couldn't access by logging in directly.
- Permission profiles in the application layer provide granular control over which tools and admin features each user can access (`tools.hubspot.read`, `tools.bigquery.execute`, `admin.*`, etc.).
- Team hierarchy allows permission profiles to be assigned at the team level, with hierarchical inheritance.
- Session fingerprinting detects session hijacking by validating IP address and user agent on every request. New devices require Google re-authentication before they are approved.
- BigQuery uses a two-step confirmation pattern — the agent previews the query and estimated cost before the user confirms execution. All BigQuery queries run under the user's own credentials with read-only scope.
The security model follows the principle that Theo has no more access than the person using it.
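The two-step BigQuery confirmation above can be sketched as a plain flow. In production the estimate would come from a BigQuery dry run; here the dry-run and execute steps are injected callables so the pattern itself is visible, and the per-TiB price is an assumption for illustration.

```python
PRICE_PER_TIB_USD = 6.25  # assumption: on-demand query pricing, for illustration

def estimate_cost_usd(bytes_processed: int) -> float:
    return round(bytes_processed / 2**40 * PRICE_PER_TIB_USD, 4)

def run_query(sql, dry_run_bytes, confirm, execute):
    """Step 1: preview the estimated cost. Step 2: execute only after
    the user confirms -- never run on the first pass."""
    cost = estimate_cost_usd(dry_run_bytes(sql))
    if not confirm(sql, cost):
        return {"status": "cancelled", "estimated_cost_usd": cost}
    return {"status": "done", "estimated_cost_usd": cost, "rows": execute(sql)}

result = run_query(
    "SELECT 1",
    dry_run_bytes=lambda sql: 5 * 2**40,    # pretend the scan covers 5 TiB
    confirm=lambda sql, cost: cost < 10.0,  # stand-in for the user's decision
    execute=lambda sql: [(1,)],
)
print(result["status"])  # cancelled -- estimated $31.25 exceeds the $10 cap
```

The same shape works for any destructive or expensive tool: the agent surfaces the preview, and execution is gated on an explicit confirmation.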
Intelligence Features
ReAct Thinking Pipeline
The agent doesn't just call tools — it reasons through them. Each response involves a visible thinking pipeline: form a plan, identify tools needed, call them, evaluate the result, adjust if needed, repeat until the answer is complete. This is streamed in real time so users can see the reasoning, not just the output.
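The loop described above — plan, call a tool, evaluate, repeat — can be sketched in a few lines. The "LLM" here is a scripted stand-in so the control flow is visible; this is not the production agent.

```python
def react_loop(ask_llm, tools, question, max_steps=5):
    """Alternate between reasoning steps and tool calls until the model
    decides it can answer, or the step budget runs out."""
    transcript = []
    observation = None
    for _ in range(max_steps):
        step = ask_llm(question, observation)  # plan / decide next action
        transcript.append(step["thought"])
        if step["action"] == "finish":
            return step["answer"], transcript
        observation = tools[step["action"]](**step["args"])  # act, then observe
    return None, transcript

def scripted_llm(question, observation):
    # Stand-in for a real model: first fetch data, then answer with it.
    if observation is None:
        return {"thought": "Need the deal stage", "action": "get_deal",
                "args": {"deal_id": "D-42"}}
    return {"thought": "Have the data, answering", "action": "finish",
            "answer": f"Deal D-42 is in {observation['stage']}."}

answer, trace = react_loop(
    scripted_llm,
    {"get_deal": lambda deal_id: {"stage": "negotiation"}},
    "Where does deal D-42 stand?",
)
print(answer)  # Deal D-42 is in negotiation.
```

Streaming each `thought` as it is appended is what makes the reasoning visible to the user in real time.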
Semantic Memory
Every conversation is summarized and embedded using Vertex AI text-embedding-004. When a new conversation starts, the system runs a semantic search over past conversations and injects the most relevant context into the agent's system prompt. The agent remembers what matters without requiring users to repeat themselves.
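The retrieval step amounts to a nearest-neighbor search over stored summary embeddings. A toy sketch with 3-dimensional vectors standing in for text-embedding-004 output (which is much higher-dimensional); the summaries are invented examples:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def top_k(query_vec, memories, k=2):
    """Return the k most relevant past-conversation summaries."""
    scored = sorted(memories, key=lambda m: cosine(query_vec, m["vec"]),
                    reverse=True)
    return [m["summary"] for m in scored[:k]]

memories = [
    {"summary": "Discussed Q3 pipeline review", "vec": (0.9, 0.1, 0.0)},
    {"summary": "Debugged Zendesk webhook",     "vec": (0.0, 0.2, 0.9)},
    {"summary": "Planned HubSpot deal cleanup", "vec": (0.8, 0.3, 0.1)},
]

# Embed the new conversation's opening, find the closest past summaries,
# and inject them into the system prompt.
context = top_k((1.0, 0.2, 0.0), memories)
print(context)  # ['Discussed Q3 pipeline review', 'Planned HubSpot deal cleanup']
```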
Skills System
Users and admins can define reusable workflows in YAML:
- Workflow skills chain MCP tool calls with variable passing — e.g., look up a contact, pass their ID to fetch associated deals, then pull Gong calls for those deals, all triggered by `/contact-summary [name]`
- Instruction skills inject guidance text into the LLM prompt for domain-specific behavior
- Knowledge skills load full reference documents contextually, based on semantic relevance to the current request
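The variable-passing in a workflow skill can be sketched as below. The dict stands in for a parsed YAML skill, and the `{steps.N.field}` placeholder syntax and tool names are illustrative assumptions, not the shipped schema:

```python
import re

skill = {  # what a parsed YAML workflow skill might look like
    "trigger": "/contact-summary",
    "steps": [
        {"tool": "hubspot.find_contact", "args": {"name": "{input}"}},
        {"tool": "hubspot.get_deals", "args": {"contact_id": "{steps.0.id}"}},
    ],
}

def run_skill(skill, user_input, tools):
    """Execute steps in order, substituting earlier outputs into later args."""
    results = []
    for step in skill["steps"]:
        args = {}
        for key, template in step["args"].items():
            def resolve(match):
                path = match.group(1)
                if path == "input":
                    return user_input
                _, idx, field = path.split(".")  # e.g. "steps.0.id"
                return str(results[int(idx)][field])
            args[key] = re.sub(r"\{([^}]+)\}", resolve, template)
        results.append(tools[step["tool"]](**args))
    return results

tools = {
    "hubspot.find_contact": lambda name: {"id": "C-7", "name": name},
    "hubspot.get_deals": lambda contact_id: {"deals": [f"{contact_id}:D-1"]},
}
out = run_skill(skill, "Ada Lovelace", tools)
print(out[1]["deals"])  # ['C-7:D-1']
```

The contact ID returned by step 0 flows into step 1 without the user (or the LLM) having to restate it.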
Personality Modes
Each conversation is created with a focus mode — Sales, Engineering, Finance, or General — that shapes the agent's priorities, tone, and reasoning. A sales conversation focuses on deal health and pipeline velocity. An engineering conversation focuses on blockers and capacity. The mode is stored with the conversation so it's consistent when resumed.
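Mechanically, a focus mode can be as simple as a system-prompt fragment keyed by the mode stored with the conversation. The wording below is illustrative, not the shipped prompts:

```python
MODE_PROMPTS = {
    "sales": "Prioritize deal health, pipeline velocity, and next steps.",
    "engineering": "Prioritize blockers, capacity, and technical risk.",
    "finance": "Prioritize cash position, invoices, and spend anomalies.",
    "general": "Balance all business contexts evenly.",
}

def system_prompt(mode: str) -> str:
    # Unknown or missing modes fall back to the general persona.
    base = "You are Theo, the company's AI assistant. "
    return base + MODE_PROMPTS.get(mode, MODE_PROMPTS["general"])

print(system_prompt("sales"))
```

Because the mode is persisted with the conversation, resuming it rebuilds the same prompt and keeps the agent's behavior consistent.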
RAG Knowledge Base
Internal documents synced from a Google Drive shared folder are chunked, embedded (Voyage AI), and stored in PostgreSQL with pgvector. Search combines vector similarity with BM25 keyword matching. The agent calls the knowledge base mid-conversation when a query benefits from internal documentation. Admins can force re-indexing of specific files on demand.
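One common way to combine the two rankings is reciprocal rank fusion; the production scoring may differ, so treat this as a sketch of the idea with made-up document IDs:

```python
def rrf(rankings, k=60):
    """Fuse multiple ranked lists of doc ids into one ranking.
    Each doc scores 1/(k + rank) per list it appears in; k=60 is the
    commonly used damping constant."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["doc_a", "doc_c", "doc_b"]  # ranked by pgvector similarity
bm25_hits = ["doc_c", "doc_a", "doc_d"]    # ranked by BM25 keyword score
print(rrf([vector_hits, bm25_hits]))
```

Documents that rank well in both lists rise to the top even when neither signal alone is decisive, which is why hybrid search tends to beat either pure vector or pure keyword retrieval on internal documents.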
Cost
Enterprise AI platforms that offer a comparable capability — unified AI access across CRM, support, finance, and communications data, with role-based access control and an organization-wide interface — are priced in the hundreds of thousands of dollars per year.
Theo Tardigrade runs for under $100 per month in most months. Most of that is Cloud Run compute across ten services. The LLM itself (Gemini 2.0 Flash) is free under our existing Google Workspace Enterprise agreement.
This is possible because:
- Gemini 2.0 Flash is included in Google Workspace Enterprise at no additional per-query cost
- Cloud Run scales to zero when idle — we pay only for actual usage
- Google's infrastructure (IAP, Secret Manager, Cloud KMS, Firestore, Cloud DNS) replaces what would otherwise require separate vendor contracts
- The custom MCP architecture means we are not paying platform fees to a middleware vendor
- There is no per-seat AI licensing — every employee in the org can use Theo at no marginal cost
The people doing the math on enterprise AI tools are looking at $50–100k+ per year before they've answered a single question. We answered that question differently.
Operational Maturity
This is not a prototype that got deployed. It was built with production operations in mind from an early stage.
CI/CD: Every push to main runs 600+ tests, enforces code quality (Black, Ruff, MyPy), checks coverage, and deploys all ten services to Cloud Run in parallel. Push to production, live in under twelve minutes. A separate beta branch deploys to a staging environment for testing new integrations before they reach production.
Observability: Structured JSON logs across all services with end-to-end request ID correlation. Google Cloud Error Reporting for automatic error grouping. An admin log viewer in the UI lets operators filter logs by service, severity, time range, and user without touching the GCP console.
Analytics: All tool calls are tracked — success rate, duration, usage by user and server. The admin dashboard shows a seven-day view of the most-used tools, per-user activity, and feedback (thumbs up/down) on responses.
Testing: 600+ unit tests, integration tests covering OAuth token lifecycle, and security tests. Coverage enforced as a CI gate.
Outcome
- Production since December 2025, v3.2.0, 32 releases shipped
- 7 business systems connected through a single chat interface
- 60 tools available across 7 independently deployed MCP servers
- Full user and team management with granular permission profiles
- Semantic cross-session memory — the AI builds context over time
- RAG knowledge base over internal documents, always current
- User-defined workflow automation via the skills system
- Operating cost under $100/month — a fraction of comparable enterprise tools
- Employees no longer need to paste company data into consumer AI tools
Want to chat it over?
These are complex projects with many variables operating in a dynamic world. Things change. If you have any questions, need a hand, or there's anything else I can help with, please reach out!