Curiosity

Installation

Curiosity Workspace ships as a single container image (curiosityai/curiosity) plus a Windows installer. This page helps you pick the right delivery model for your environment and links to the platform-specific guide that takes you the rest of the way.

For a runnable local install in under five minutes, jump to Quickstart. For a complete end-to-end developer build, see Build your first enterprise AI app.

Decision tree

If you need… | Use…
Local development on a laptop | Docker: a single docker run or docker compose up.
A demo or evaluation on a Windows VM | Windows installer, which installs and uninstalls cleanly as a service.
A staging environment on a single VM | Docker with Compose, behind your existing reverse proxy.
A production deployment on Kubernetes | Kubernetes. Consult the cloud-specific entries below.
A production deployment on AWS | AWS: EC2/EKS, EBS, ALB.
A production deployment on Azure | Azure: VM/AKS, Azure Disk, Entra ID.
A production deployment on Google Cloud | GCP: Compute Engine/GKE, Persistent Disk.
A production deployment on OpenShift | OpenShift.
An air-gapped or on-prem deployment | Docker or Kubernetes with a private registry mirror.

Decisions you should make before installing

Environment tier

Local, staging, and production each impose different defaults.

  • Local: bind to loopback, set a generated admin password via MSK_ADMIN_PASSWORD, and persist to a real volume so you don't lose your work between restarts.
  • Staging: a production-shaped manifest with smaller capacity and isolated secrets; restore drills are allowed here.
  • Production: TLS, a secrets manager, monitoring, backups, anti-affinity, and isolation from noisy neighbors.

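
The local defaults above can be sketched as a small helper that assembles a docker run command. This is an illustration under stated assumptions, not the documented invocation: the helper name and the container data path (/data) are hypothetical, and only the image name, port 8080, and MSK_ADMIN_PASSWORD come from this page.

```python
import secrets


def local_docker_run_args(data_dir, port=8080, container_data_path="/data"):
    """Build a `docker run` argv for a loopback-only local install.

    container_data_path is an assumption; check the Docker guide for the
    directory the image actually persists to.
    """
    admin_password = secrets.token_urlsafe(24)  # generated, never a default
    args = [
        "docker", "run", "--detach",
        # Publish on 127.0.0.1 only, so the workspace is not reachable from the network.
        "--publish", f"127.0.0.1:{port}:{port}",
        "--env", f"MSK_ADMIN_PASSWORD={admin_password}",
        # Host directory (or named volume) so state survives restarts.
        "--volume", f"{data_dir}:{container_data_path}",
        "curiosityai/curiosity",
    ]
    return args, admin_password


if __name__ == "__main__":
    argv, password = local_docker_run_args("/srv/curiosity-data")
    print(" ".join(argv))
```

Returning the argv as a list keeps the command easy to inspect before handing it to subprocess.run, and the generated password is returned separately so it can be stored somewhere safer than shell history.
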
Storage

The graph and its indexes are I/O sensitive. Always provision SSD or better.

  • Capacity = (sum of indexed text fields, in bytes) × ~1.5 + (embedded fields, in bytes) × (embedding dimensions × 4 / chunk size) + journal headroom.
  • A starter PVC of 200 GB is sufficient for hundreds of thousands of documents.
  • Point MSK_GRAPH_BACKUP_FOLDER at a different volume so a corrupted graph volume doesn't take the backups with it.

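
The capacity rule of thumb above can be turned into a quick calculation. The sketch below assumes the chunk size is measured in bytes and reads the factor of 4 as four bytes per embedding dimension (32-bit floats); the function and parameter names are illustrative, not part of the product.

```python
def estimate_capacity_bytes(indexed_text_bytes, embedded_text_bytes,
                            embedding_dims, chunk_size_bytes,
                            journal_headroom_bytes):
    """Apply the sizing formula: text x ~1.5 + embedding storage + journal headroom."""
    text_part = indexed_text_bytes * 1.5
    # Each chunk of chunk_size_bytes yields one vector of embedding_dims * 4 bytes
    # (assumption: 4 bytes per dimension).
    embedding_bytes_per_text_byte = embedding_dims * 4 / chunk_size_bytes
    embedding_part = embedded_text_bytes * embedding_bytes_per_text_byte
    return text_part + embedding_part + journal_headroom_bytes


if __name__ == "__main__":
    total = estimate_capacity_bytes(
        indexed_text_bytes=20e9,      # 20 GB of indexed text fields
        embedded_text_bytes=10e9,     # 10 GB of embedded fields
        embedding_dims=1024,
        chunk_size_bytes=2048,
        journal_headroom_bytes=10e9,  # headroom chosen by the operator
    )
    print(f"{total / 1e9:.1f} GB")  # prints "60.0 GB"
```

With these example inputs the text term contributes 30 GB, the embedding term 20 GB, and the journal 10 GB, which fits comfortably in the 200 GB starter volume.
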
Access model

Decide before opening the workspace to anyone.

  • Internal only: workspace reachable through a VPN or private network.
  • Public, but authenticated: TLS-terminated reverse proxy, SSO via your IdP, no admin/admin default.
  • See Security.

Identity

Plan how users will sign in before ingesting production data, so ACLs can be ingested against the right teams.

Observability and backups

  • Centralized logs (stdout collector, or MSK_LOG_PATH on a mounted volume).
  • Backups to off-host storage, with a documented restore drill. See Backup and restore.
  • Alerts on liveness, latency regressions, and ingestion failures. See Monitoring.

Prerequisites

  • CPU: 4 cores minimum (8+ recommended for production with embeddings).
  • RAM: 8 GB minimum (16 GB+ recommended; embeddings indexes are memory-resident).
  • Storage: SSD with enough space for graph + indexes + 1.5× headroom for backups.
  • Network: TCP 8080 (or your chosen MSK_PORT) reachable by clients. TLS terminated by a proxy or in-container.
  • License: an MSK_LICENSE token if you have a commercial license.
  • For AI features: an LLM/embedding provider key (OpenAI, Azure OpenAI, Anthropic, or a local OpenAI-compatible server).
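
A preflight check against the CPU, RAM, and storage minimums above might look like the following sketch. The function name is hypothetical, and the required disk figure is whatever your own capacity estimate produced; the thresholds of 4 cores and 8 GB come from the list above.

```python
import os
import shutil

MIN_CORES = 4
MIN_RAM_GB = 8


def prerequisite_warnings(cores, ram_gb, free_disk_gb, required_disk_gb):
    """Return human-readable warnings for every minimum that is not met."""
    warnings = []
    if cores < MIN_CORES:
        warnings.append(f"CPU: {cores} cores, need at least {MIN_CORES}")
    if ram_gb < MIN_RAM_GB:
        warnings.append(f"RAM: {ram_gb} GB, need at least {MIN_RAM_GB} GB")
    if free_disk_gb < required_disk_gb:
        warnings.append(
            f"Storage: {free_disk_gb} GB free, need {required_disk_gb} GB"
        )
    return warnings


if __name__ == "__main__":
    cores = os.cpu_count() or 0
    free_gb = shutil.disk_usage("/").free / 1e9
    # RAM detection is platform-specific, so it is passed in by hand here.
    for line in prerequisite_warnings(cores, ram_gb=16, free_disk_gb=free_gb,
                                      required_disk_gb=200):
        print("WARNING:", line)
```

The check only covers the minimums; for production with embeddings, the recommended 8+ cores and 16 GB+ of RAM still apply.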

First-boot checklist

After the service is running for the first time:

  • Open the UI and complete the setup wizard.
  • Rotate admin credentials if you didn't already set MSK_ADMIN_PASSWORD (never leave defaults in any environment beyond your laptop).
  • Set MSK_JWT_KEY explicitly so tokens survive restarts.
  • Set MSK_GRAPH_MASTER_KEY and back it up; you cannot decrypt content without it.
  • Configure SSO before inviting real users.
  • Create an API token for ingestion connectors and store it in a secret manager.
  • Confirm persistence by restarting the service and verifying the workspace state remains.
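
Several items in this checklist come down to environment variables being set explicitly rather than autogenerated. As a minimal sketch (the helper is hypothetical and only checks presence, not the format the service expects), a deployment script could refuse to proceed when any of them is missing:

```python
import os

# Variable names are taken from the checklist above.
REQUIRED_SETTINGS = ("MSK_ADMIN_PASSWORD", "MSK_JWT_KEY", "MSK_GRAPH_MASTER_KEY")


def missing_settings(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        raise SystemExit("Set these before first boot: " + ", ".join(missing))
    print("All required settings are present.")
```

Running this in CI or as a container entrypoint wrapper catches a lost or unset MSK_GRAPH_MASTER_KEY before any encrypted content is written.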

Post-install validation checklist

  • Web UI loads at your workspace URL (using your MSK_PUBLIC_ADDRESS from a client machine).
  • You can log in with an admin account and, if you configured one, through your IdP.
  • TLS is correct end to end (browser shows the expected certificate; HSTS header present if enabled).
  • Storage is persistent across restarts.
  • Background tasks (indexing/parsing) can run.
  • Backup runs successfully and the resulting snapshot restores in a separate environment.
  • Logs reach your aggregator.
  • Monitoring shows the workspace as healthy.
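
The first and third checks can be partly automated from a client machine. The sketch below uses only the Python standard library; it requests the workspace URL (the value you set as MSK_PUBLIC_ADDRESS) and reports whether an HSTS header came back. The function names are illustrative, and no dedicated health endpoint is assumed.

```python
import urllib.request


def hsts_present(header_names):
    """True if a Strict-Transport-Security header is among the names."""
    return any(name.lower() == "strict-transport-security"
               for name in header_names)


def check_ui(url, timeout=10):
    """Fetch the workspace URL; return (HTTP status, HSTS header present).

    urlopen validates the TLS certificate for https URLs and raises
    URLError/HTTPError on certificate problems or 4xx/5xx responses.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status, hsts_present(response.headers.keys())


if __name__ == "__main__":
    import sys
    status, hsts = check_ui(sys.argv[1])
    print(f"status={status} hsts={hsts}")
```

A script like this complements, rather than replaces, checking the certificate in a browser and logging in through the IdP.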

Common installation pitfalls

  • Ephemeral storage: running with a non-persistent volume will lose data on restart.
  • Reverse proxies and origins: when behind a proxy, set MSK_PUBLIC_ADDRESS consistently; otherwise generated links (SSO callbacks, email links) will be wrong.
  • Ports and binding: confirm the service binds to the expected interface (127.0.0.1 for local, 0.0.0.0 for a proxy-fronted deployment).
  • :latest in production: pin to a versioned image tag (curiosityai/curiosity:vX.Y.Z) so upgrades are explicit.
  • Missing master key: encrypted properties can't be read after a restart if MSK_GRAPH_MASTER_KEY was autogenerated and then lost.
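
The :latest pitfall is easy to catch mechanically before a deployment goes out. A minimal sketch, assuming image references follow the usual Docker name[:tag][@digest] form; the helper name and the v1.2.3 tag are placeholders, not a real release:

```python
def is_pinned(image_ref):
    """True if the image reference names an explicit version or digest."""
    if "@sha256:" in image_ref:
        return True  # digest references are immutable
    # Only the last path segment can carry a tag; a colon earlier in the
    # reference belongs to a registry port (e.g. registry.local:5000/...).
    last_segment = image_ref.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # no tag means Docker resolves to :latest
    tag = last_segment.split(":", 1)[1]
    return tag not in ("", "latest")


if __name__ == "__main__":
    for ref in ("curiosityai/curiosity:v1.2.3", "curiosityai/curiosity:latest"):
        print(ref, "->", "pinned" if is_pinned(ref) else "NOT pinned")
```

A check like this fits naturally into a CI step that lints Compose files or Kubernetes manifests for production.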

© 2026 Curiosity. All rights reserved.