On-Prem Deployment

Bolti is built to be self-hostable. Every component of our managed cloud — the API backend, the agent runtime, the realtime telephony stack, the dashboards, the observability layer — is a standard container that can run inside your own infrastructure.

On-prem is the right choice when:

You have a regulatory or contractual requirement that calls cannot leave your environment
Your security team requires that all customer audio and transcripts stay on networks you control
You want to peer Bolti directly with internal-only systems (databases, CRMs, telephony gateways) over private networking
You want full control over upgrade timing and component versions

For most teams, the managed cloud is faster and cheaper. On-prem is a deliberate choice for organizations that need it.

Engagement required

On-prem is delivered as part of an enterprise contract, not as a self-serve product. We work with your team on architecture review, secret management, and upgrade processes. Talk to us before deploying.

What "on-prem" actually means

A full Bolti deployment is three node groups running in containers via Docker Compose (or Kubernetes, on request):

┌──────────────────────┐    ┌──────────────────────┐    ┌──────────────────────┐
│   App node           │    │   Realtime node      │    │   Observability node │
│ ──────────────────── │    │ ──────────────────── │    │ ──────────────────── │
│ • Backend API        │    │ • Realtime audio     │    │ • Metrics dashboards │
│ • Agent worker       │◀──▶│ • SIP gateway        │◀──▶│ • Log aggregator     │
│ • Ingress            │    │ • Recording worker   │    │ • Metrics store      │
│ • Log/metric shippers│    │ • Session cache      │    │ • Ingress            │
└──────────┬───────────┘    └──────────┬───────────┘    └──────────────────────┘
           │                           │
           ▼                           ▼
   ┌───────────────┐          ┌───────────────────┐
   │  PostgreSQL   │          │  Object storage   │
   │  (your DB)    │          │ (recordings)      │
   └───────────────┘          └───────────────────┘

Plus two external dependencies that you provide:

Dependency	Purpose	Compatible with
PostgreSQL	Primary application database — agents, calls, tools, members, billing.	Any PostgreSQL ≥ 14, including managed (AWS RDS, Cloud SQL, Aiven) or self-hosted.
S3-compatible object storage	Stores call recordings (audio/video) with signed playback URLs.	AWS S3, Cloudflare R2, MinIO, E2E Object Storage, GCS, Azure Blob, Wasabi, Backblaze.
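
As a minimal sketch of how the PostgreSQL ≥ 14 requirement in the table above could be checked before a deployment, the snippet below interprets the integer that PostgreSQL reports for `SHOW server_version_num;`. The function names are illustrative and not part of Bolti's tooling.

```python
MIN_MAJOR = 14  # minimum major version, taken from the dependency table


def parse_server_version_num(version_num: int) -> tuple[int, int]:
    """Split PostgreSQL's integer server_version_num into (major, minor).

    Since PostgreSQL 10 the value is major * 10000 + minor,
    so 140005 means 14.5 and 160002 means 16.2.
    """
    return version_num // 10000, version_num % 10000


def is_supported(version_num: int, min_major: int = MIN_MAJOR) -> bool:
    """Return True if the reported server version meets the minimum major."""
    major, _minor = parse_server_version_num(version_num)
    return major >= min_major
```

To use it, run `SHOW server_version_num;` against your database (for example from `psql`) and pass the resulting integer to `is_supported`.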

The whole stack is plain containers with environment-variable config. If your platform team has run Kubernetes, Docker Swarm, or Compose before, the operational shape is familiar.

Reference architecture (recommended)

Three Linux hosts, the same shape we run in Bolti's managed cloud. Sizes are starting points — scale based on call concurrency.

Node	Purpose	Suggested size for ~100 concurrent calls
App node	Backend API, agent worker, ingress	4 vCPU / 8 GB RAM
Realtime node	Realtime audio service, SIP gateway, recording worker, session cache	8 vCPU / 16 GB RAM (audio mixing is CPU-heavy)
Observability node	Metrics, logs, dashboards	2 vCPU / 4 GB RAM

Single-node deployments are also supported for dev and small production workloads — every container can run on one host. For higher concurrency we recommend splitting realtime onto its own host.
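
For a rough first-pass estimate at other concurrency levels, the sketch below scales the table's ~100-call baseline linearly. Linear scaling is an assumption made here for illustration only; the source gives just the baseline figures, and real sizing should come out of the architecture review and load testing. The names `NodeSize` and `estimate` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSize:
    vcpu: int
    ram_gb: int


# Baseline sizes from the reference table, for roughly 100 concurrent calls.
BASELINE_CALLS = 100
BASELINE = {
    "app": NodeSize(4, 8),
    "realtime": NodeSize(8, 16),
    "observability": NodeSize(2, 4),
}


def estimate(node: str, concurrent_calls: int) -> NodeSize:
    """Scale a baseline linearly with concurrency (ASSUMPTION), rounding up.

    The result never drops below the documented baseline.
    """
    factor = max(1.0, concurrent_calls / BASELINE_CALLS)
    base = BASELINE[node]
    return NodeSize(
        math.ceil(base.vcpu * factor),
        math.ceil(base.ram_gb * factor),
    )
```

For example, `estimate("realtime", 250)` gives a starting point for the realtime node at 250 concurrent calls. Treat the result as a number to validate, not a guarantee.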

What you need to provide

Compute

Linux hosts with Docker Engine ≥ 24. We provide tested Compose files for app, realtime, and observability stacks. Kubernetes Helm charts are available on request for customers running k8s natively.

Networking

Public ingress for the dashboard and API (or internal-only — works either way)
Public ingress for realtime WebRTC traffic (UDP) and SIP (UDP/TCP) — these usually have to be reachable for telephony providers and end-user browsers
Outbound access to your chosen STT, LLM, and TTS providers — or self-hosted alternatives if you want a fully closed network
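
The outbound-access requirement can be spot-checked from each host before installation. The sketch below is one possible preflight check using only the Python standard library. It tests TCP reachability only, so it says nothing about the UDP paths that WebRTC and SIP need. The endpoint names and hostnames in the usage note are placeholders, not real Bolti or provider addresses.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def preflight(endpoints: dict[str, tuple[str, int]]) -> dict[str, bool]:
    """Check each named endpoint and report which ones are reachable."""
    return {
        name: can_connect(host, port)
        for name, (host, port) in endpoints.items()
    }
```

A call such as `preflight({"stt": ("stt.example.com", 443), "llm": ("llm.example.com", 443)})` returns a dictionary of name to reachability, which is easy to log or fail a deployment script on.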

For air-gapped deployments, we'll work with you on locally hosted models (e.g. self-hosted Whisper for STT, a local Llama variant for LLM, a TTS engine of your choice). This adds complexity and quality tradeoffs worth discussing up front.

Persistent storage

A PostgreSQL instance you operate (or a managed Postgres of your choice)
An S3-compatible bucket for recordings
Local volumes on each host for log files and config

Secrets

Bolti reads its config from environment variables. We support sourcing those from any of:

Plain .env files (simple, fine for dev / small prod)
Doppler (what Bolti's managed cloud uses)
HashiCorp Vault, AWS Secrets Manager, Google Secret Manager, Azure Key Vault — pull at runtime via a sidecar or your existing tooling
A custom KMS — anything that produces env vars is compatible
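
Because every secrets backend listed above ends up producing environment variables, a startup check can be written once and reused regardless of where the values come from. The sketch below shows one way to do that. The variable names are hypothetical examples chosen for illustration; the actual names Bolti expects are covered in the deployment guide shared during the engagement.

```python
import os
from typing import Mapping, Sequence

# HYPOTHETICAL variable names, for illustration only.
REQUIRED_VARS = (
    "DATABASE_URL",
    "S3_ENDPOINT_URL",
    "S3_BUCKET",
    "S3_ACCESS_KEY_ID",
    "S3_SECRET_ACCESS_KEY",
)


def missing_vars(
    env: Mapping[str, str],
    required: Sequence[str] = REQUIRED_VARS,
) -> list[str]:
    """Return the required names that are unset or blank in env."""
    return [name for name in required if not env.get(name, "").strip()]


def check_env(env: Mapping[str, str] = os.environ) -> None:
    """Raise a single error listing every missing variable at once."""
    missing = missing_vars(env)
    if missing:
        raise RuntimeError(
            "Missing required configuration: " + ", ".join(missing)
        )
```

Reporting all missing names in one error, rather than failing on the first, tends to shorten the back-and-forth when a secrets sidecar or `.env` file is only partially populated.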

Identity

By default Bolti uses Firebase Authentication for user login. For on-prem you can also:

Use Firebase Authentication from a Firebase project that you own (works for most teams)
Plug in OIDC / SAML through a wrapper for SSO with Okta, Azure AD, Google Workspace, etc. (discussed during engagement)

What stays the same vs. cloud

Everything about how you build agents stays identical to the managed product:

The dashboard, the wizard, the agent settings tabs — same code, same UX
The REST API, the MCP server, the SDKs — same endpoints, same behavior
The agent runtime (STT → LLM → TTS pipeline) — same pipeline
Call logs, recordings, transcripts, billing reports — same data model

The only differences are:

The data lives on your infrastructure
You control the upgrade cadence (we publish versioned images; you pull when ready)
You set up the provider keys (LLM, STT, TTS, telephony) directly — no Bolti-managed credentials

Upgrades and support

We publish versioned, signed container images to a registry you can mirror locally
A documented upgrade path with database migration scripts (Alembic) and a tested staging rollout procedure
Direct support channel with the engineering team (Slack Connect or shared Jira) included with enterprise contracts
Quarterly architecture reviews so your deployment doesn't drift away from supported configurations
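
Controlling upgrade timing only works if deployments reference a specific image version rather than a floating tag. As a hedged sketch, the helper below flags image references that are not pinned to a digest or an explicit non-`latest` tag. The image names in the usage note are made up for illustration, and the check is a simplification of the full container image reference grammar.

```python
def is_pinned(image_ref: str) -> bool:
    """Return True if the reference has a digest or an explicit tag
    other than 'latest'."""
    if "@sha256:" in image_ref:
        return True
    # Only the last path segment can carry a tag; earlier segments may
    # contain a registry host with a port, e.g. registry.example.com:5000.
    last_segment = image_ref.rsplit("/", 1)[-1]
    _name, sep, tag = last_segment.partition(":")
    return bool(sep) and tag not in ("", "latest")


def unpinned(image_refs: list[str]) -> list[str]:
    """Return the references that would float to whatever is newest."""
    return [ref for ref in image_refs if not is_pinned(ref)]
```

Running `unpinned([...])` over the image references in your Compose files before a rollout, for instance `registry.example.com/bolti/backend:1.8.2` next to `registry.example.com/bolti/worker:latest`, would report only the second one.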

What's not included in on-prem

So you can plan around it:

Bolti-managed phone numbers — bring your own SIP/PSTN provider (Twilio, Plivo, your local carrier, an enterprise SBC). See Telephony → SIP Trunking.
Bolti-managed STT/LLM/TTS billing — you contract with the providers directly (or self-host)
Bolti's hosted MCP — you run the MCP server yourself; same binary, same tools

Hybrid deployments

Many organizations don't need full on-prem — they need audio on-prem but are happy with the rest in the cloud. Common hybrid shapes:

Realtime on-prem, control plane in cloud: calls happen on your infrastructure (audio never leaves your network); the dashboard, API, and database stay in Bolti's cloud.
Self-hosted LLM, everything else cloud: you run a local LLM endpoint (often for compliance with internal "no third-party LLM" policies) and Bolti's cloud points to it via Custom LLMs.
Bolti cloud + dedicated VPC peer: Bolti runs in a dedicated VPC peered to yours, with private endpoints. From a networking standpoint this looks hybrid, but you do not have to operate the stack.

We typically recommend a hybrid for first deployments — it solves the most common compliance constraints with a fraction of the operational overhead.

How to get started

1. Talk to us. On-prem starts with a call. We need to understand your scale, regulatory posture, and existing infrastructure before recommending a shape.
2. Architecture review. We'll share a deployment guide tailored to your environment (cloud provider, region, network topology, identity stack).
3. Staging deployment. Stand up the stack in a non-production environment first. We help you validate it end-to-end.
4. Production rollout. Once staging is green, we cut over and document the operational runbook with your team.

Typical engagement timeline: 2–6 weeks from kickoff to production traffic, depending on how much of the dependency stack (Postgres, object storage, identity) is already in place.

Related

Data Residency — controlling where your data physically lives
PII Data Protection — masking sensitive content before LLM calls
Custom LLMs — point Bolti at a self-hosted LLM endpoint
SIP Trunking — bring your own telephony

To start an on-prem conversation, reach out via your account team or hello@bolti.co.in.
