On-Prem Deployment

Bolti is built to be self-hostable. Every component of our managed cloud — the API backend, the agent runtime, the realtime telephony stack, the dashboards, the observability layer — is a standard container that can run inside your own infrastructure.

On-prem is the right choice when:

  • You have a regulatory or contractual requirement that calls cannot leave your environment
  • Your security team requires that all customer audio and transcripts stay on networks you control
  • You want to peer Bolti directly with internal-only systems (databases, CRMs, telephony gateways) over private networking
  • You want full control over upgrade timing and component versions

For most teams, the managed cloud is faster and cheaper. On-prem is a deliberate choice for organizations that need it.

Engagement-required

On-prem is delivered as part of an enterprise contract, not as a self-serve product. We work with your team on architecture review, secret management, and upgrade processes. Talk to us before deploying.

What "on-prem" actually means

A full Bolti deployment is three node groups running in containers via Docker Compose (or Kubernetes, on request):

┌──────────────────────┐    ┌──────────────────────┐    ┌──────────────────────┐
│ App node             │    │ Realtime node        │    │ Observability node   │
│ ──────────────────── │    │ ──────────────────── │    │ ──────────────────── │
│ • Backend API        │    │ • Realtime audio     │    │ • Metrics dashboards │
│ • Agent worker       │◀──▶│ • SIP gateway        │◀──▶│ • Log aggregator     │
│ • Ingress            │    │ • Recording worker   │    │ • Metrics store      │
│ • Log/metric shippers│    │ • Session cache      │    │ • Ingress            │
└──────────┬───────────┘    └──────────┬───────────┘    └──────────────────────┘
           │                           │
           ▼                           ▼
   ┌───────────────┐         ┌───────────────────┐
   │ PostgreSQL    │         │ Object storage    │
   │ (your DB)     │         │ (recordings)      │
   └───────────────┘         └───────────────────┘

Plus two external dependencies that you provide (managed or self-hosted):

| Dependency | Purpose | Compatible with |
| --- | --- | --- |
| PostgreSQL | Primary application database: agents, calls, tools, members, billing. | Any PostgreSQL ≥ 14, including managed (AWS RDS, Cloud SQL, Aiven) or self-hosted. |
| S3-compatible object storage | Stores call recordings (audio/video) with signed playback URLs. | AWS S3, Cloudflare R2, MinIO, E2E Object Storage, GCS, Azure Blob, Wasabi, Backblaze. |
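
The PostgreSQL ≥ 14 requirement can be checked before deployment. The sketch below is illustrative only and is not part of Bolti's tooling: it assumes you have already read the integer that PostgreSQL reports for `SHOW server_version_num;` (for version 10 and later, the major version times 10000 plus the minor version). The function names are hypothetical.

```python
def postgres_major_version(server_version_num: int) -> int:
    """Return the major version from PostgreSQL's integer server_version_num.

    For PostgreSQL 10 and later the value is major * 10000 + minor,
    so 140005 means 14.5.
    """
    return server_version_num // 10000


def meets_minimum(server_version_num: int, minimum_major: int = 14) -> bool:
    """True if the server satisfies the 'PostgreSQL >= 14' requirement."""
    return postgres_major_version(server_version_num) >= minimum_major


if __name__ == "__main__":
    # Example values as they would come back from `SHOW server_version_num;`.
    for reported in (130012, 140005, 160002):
        verdict = "ok" if meets_minimum(reported) else "too old"
        print(reported, verdict)
```

Running the same check against a managed instance (RDS, Cloud SQL, Aiven) works identically, since the setting is part of PostgreSQL itself.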

The whole stack is plain containers with environment-variable config. If your platform team has run Kubernetes, Docker Swarm, or Compose before, the operational shape is familiar.

A standard deployment uses three Linux hosts, the same shape we run in Bolti's managed cloud. The sizes below are starting points; scale them based on call concurrency.

| Node | Purpose | Suggested size for ~100 concurrent calls |
| --- | --- | --- |
| App node | Backend API, agent worker, ingress | 4 vCPU / 8 GB RAM |
| Realtime node | Realtime audio service, SIP gateway, recording worker, session cache | 8 vCPU / 16 GB RAM (audio mixing is CPU-heavy) |
| Observability node | Metrics, logs, dashboards | 2 vCPU / 4 GB RAM |

Single-node deployments are also supported for dev and small production workloads — every container can run on one host. For higher concurrency we recommend splitting realtime onto its own host.
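
For a rough first estimate at other concurrency levels, the baseline sizes can be extrapolated. The sketch below assumes linear scaling from the ~100-call starting points and never goes below them. That linearity is an assumption made for illustration, not a Bolti sizing guarantee, so treat the output as a conversation starter for the architecture review rather than a capacity plan.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSize:
    vcpu: int
    ram_gb: int


# Starting points from the sizing table, per ~100 concurrent calls.
BASELINE_CALLS = 100
BASELINE = {
    "app": NodeSize(vcpu=4, ram_gb=8),
    "realtime": NodeSize(vcpu=8, ram_gb=16),
    "observability": NodeSize(vcpu=2, ram_gb=4),
}


def estimate(concurrent_calls: int) -> dict[str, NodeSize]:
    """Scale each node linearly with concurrency (ASSUMPTION, not measured).

    The factor is clamped at 1.0 so small deployments keep the baseline size.
    """
    factor = max(1.0, concurrent_calls / BASELINE_CALLS)
    return {
        name: NodeSize(
            vcpu=math.ceil(size.vcpu * factor),
            ram_gb=math.ceil(size.ram_gb * factor),
        )
        for name, size in BASELINE.items()
    }


if __name__ == "__main__":
    for name, size in estimate(250).items():
        print(f"{name}: {size.vcpu} vCPU / {size.ram_gb} GB RAM")
```

In practice the realtime node is the one most likely to need more than this estimate, since audio mixing is CPU-heavy.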

What you need to provide

Compute

Linux hosts with Docker Engine ≥ 24. We provide tested Compose files for app, realtime, and observability stacks. Kubernetes Helm charts are available on request for customers running k8s natively.

Networking

  • Public ingress for the dashboard and API (or internal-only — works either way)
  • Public ingress for realtime WebRTC traffic (UDP) and SIP (UDP/TCP): these ports usually have to be reachable by telephony providers and end-user browsers
  • Outbound access to your chosen STT, LLM, and TTS providers — or self-hosted alternatives if you want a fully closed network

For air-gapped deployments, we'll work with you on locally-hosted models (e.g. self-hosted Whisper for STT, a local Llama variant for LLM, a TTS engine of your choice). This adds complexity and quality tradeoffs worth discussing up front.
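
Outbound access to providers can be verified from each host before the stack is installed. The following is a minimal sketch using only the Python standard library. It tests TCP reachability only, so it cannot confirm the UDP paths that WebRTC and SIP need, and the function names are hypothetical.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def check_endpoints(
    endpoints: dict[str, tuple[str, int]], timeout: float = 3.0
) -> dict[str, bool]:
    """Map each named endpoint to whether it accepted a TCP connection."""
    return {
        name: tcp_reachable(host, port, timeout)
        for name, (host, port) in endpoints.items()
    }
```

A call such as `check_endpoints({"llm": ("llm.example.com", 443)})` returns a name-to-boolean mapping; the hostname here is a placeholder for whichever STT, LLM, or TTS provider (or self-hosted endpoint) you use.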

Persistent storage

  • A PostgreSQL instance you operate (or a managed Postgres of your choice)
  • An S3-compatible bucket for recordings
  • Local volumes on each host for log files and config

Secrets

Bolti reads its config from environment variables. We support sourcing those from any of:

  • Plain .env files (simple, fine for dev / small prod)
  • Doppler (what Bolti's managed cloud uses)
  • HashiCorp Vault, AWS Secrets Manager, Google Secret Manager, Azure Key Vault — pull at runtime via a sidecar or your existing tooling
  • A custom KMS — anything that produces env vars is compatible
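
Because every option above ends up as environment variables, a pre-flight check for missing values is easy to add to a deployment script. The sketch below is illustrative: the variable names are hypothetical placeholders, not Bolti's actual configuration keys, and the `.env` parser handles only simple `KEY=VALUE` lines.

```python
import os
from collections.abc import Mapping

# HYPOTHETICAL variable names, for illustration only.
REQUIRED_VARS = (
    "DATABASE_URL",
    "S3_ENDPOINT",
    "S3_BUCKET",
    "S3_ACCESS_KEY_ID",
    "S3_SECRET_ACCESS_KEY",
)


def parse_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Remove surrounding quote characters from the value, if any.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_vars(
    env: Mapping[str, str] = os.environ,
    required: tuple[str, ...] = REQUIRED_VARS,
) -> list[str]:
    """Return the required variables that are unset or empty."""
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    sample = """
    # hypothetical values
    DATABASE_URL=postgresql://bolti:secret@db.internal:5432/bolti
    S3_BUCKET="recordings"
    S3_ENDPOINT=
    """
    print(missing_vars(parse_env_file(sample)))
```

With a secret manager such as Vault or Doppler, the same `missing_vars()` call can run against `os.environ` after the sidecar or CLI has injected the values.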

Identity

By default Bolti uses Firebase Authentication for user login. For on-prem you can also:

  • Run Firebase Authentication in a Firebase project that you own (works for most teams)
  • Plug in OIDC / SAML through a wrapper for SSO with Okta, Azure AD, Google Workspace, etc. — discussed during engagement.

What stays the same vs. cloud

Everything about how you build agents stays identical to the managed product:

  • The dashboard, the wizard, the agent settings tabs — same code, same UX
  • The REST API, the MCP server, the SDKs — same endpoints, same behavior
  • The agent runtime (STT → LLM → TTS pipeline) — same pipeline
  • Call logs, recordings, transcripts, billing reports — same data model

The only differences are:

  • The data lives on your infrastructure
  • You control the upgrade cadence (we publish versioned images; you pull when ready)
  • You set up the provider keys (LLM, STT, TTS, telephony) directly — no Bolti-managed credentials

Upgrades and support

  • We publish versioned, signed container images to a registry you can mirror locally
  • We provide a documented upgrade path with database migration scripts (Alembic) and a tested staging rollout procedure
  • Enterprise contracts include a direct support channel with the engineering team (Slack Connect or shared Jira)
  • We run quarterly architecture reviews so your deployment doesn't drift away from supported configurations
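
As a rough illustration of what a single upgrade step might look like with Compose, the sketch below builds the command sequence without executing anything. The ordering (pull, migrate, restart), the Compose file name, and the `backend` service name are all assumptions made for this example; the deployment guide you receive during the engagement is the authoritative procedure.

```python
import shlex


def upgrade_plan(compose_file: str, migration_service: str) -> list[list[str]]:
    """Return the commands for one upgrade, in order, without running them.

    ASSUMPTION: migrations run via `alembic upgrade head` inside a service
    container; the real invocation may differ per deployment.
    """
    compose = ["docker", "compose", "-f", compose_file]
    return [
        compose + ["pull"],  # fetch the image versions pinned in the file
        compose + ["run", "--rm", migration_service, "alembic", "upgrade", "head"],
        compose + ["up", "-d"],  # recreate containers on the new images
    ]


if __name__ == "__main__":
    # Hypothetical file and service names.
    for command in upgrade_plan("docker-compose.app.yml", "backend"):
        print(shlex.join(command))
```

Printing the plan first makes it easy to review in staging before anything is run against production.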

What's not included in on-prem

So you can plan around it:

  • Bolti-managed phone numbers — bring your own SIP/PSTN provider (Twilio, Plivo, your local carrier, an enterprise SBC). See Telephony → SIP Trunking.
  • Bolti-managed STT/LLM/TTS billing — you contract with the providers directly (or self-host)
  • Bolti's hosted MCP — you run the MCP server yourself; same binary, same tools

Hybrid deployments

Many organizations don't need full on-prem — they need audio on-prem but are happy with the rest in the cloud. Common hybrid shapes:

  • Realtime on-prem, control plane in cloud: calls happen on your infrastructure (audio never leaves your network); the dashboard, API, and database stay in Bolti's cloud.
  • Self-hosted LLM, everything else cloud: you run a local LLM endpoint (often for compliance with internal "no third-party LLM" policies) and Bolti's cloud points to it via Custom LLMs.
  • Bolti cloud + dedicated VPC peer: Bolti runs in a dedicated VPC peered to yours, with private endpoints. From a networking standpoint this looks hybrid, but you don't have to operate the stack.

We typically recommend a hybrid deployment for first rollouts: it solves the most common compliance constraints with a fraction of the operational overhead.

How to get started

  1. Talk to us. On-prem starts with a call. We need to understand your scale, regulatory posture, and existing infrastructure before recommending a shape.
  2. Architecture review. We'll share a deployment guide tailored to your environment (cloud provider, region, network topology, identity stack).
  3. Staging deployment. Stand up the stack in a non-production environment first. We help you validate it end-to-end.
  4. Production rollout. Once staging is green, we cut over and document the operational runbook with your team.

Typical engagement timeline: 2–6 weeks from kickoff to production traffic, depending on how much of the dependency stack (Postgres, object storage, identity) is already in place.

To start an on-prem conversation, reach out via your account team or hello@bolti.co.in.