Deploying on AWS
Curiosity Workspace runs unchanged on AWS — the deployment shape is the same as the Docker or Kubernetes guide; what's specific to AWS is the surrounding infrastructure (compute, storage, ingress, secrets, monitoring).
Reference architecture (production)
| Concern | AWS service | Notes |
|---|---|---|
| Compute | EC2 (m6i.2xlarge or larger) or EKS | EKS for fleet; single EC2 for small deployments. |
| Container runtime | Docker / containerd | Same image as everywhere else. |
| Persistence | EBS gp3 | 200 GB starter, iops=3000. Enable EBS snapshots. |
| Backups | EBS snapshots + cross-region copy | Schedule via Data Lifecycle Manager. |
| Ingress + TLS | Application Load Balancer + ACM cert | Or AWS WAF in front of an Ingress on EKS. |
| Private access | VPC private subnet, NAT for egress | Workspace egresses to LLM provider; ingress through ALB only. |
| Identity | IAM Roles for Service Accounts (EKS) or instance role (EC2) | For pulling secrets from Secrets Manager. |
| Secrets | AWS Secrets Manager | Inject as env vars via MSK_*_FILE references or sidecars. |
| Logs | CloudWatch Logs via FireLens / Fluent Bit | Or set MSK_LOG_PATH for file-based collection. |
| Metrics | CloudWatch Container Insights | Plus the workspace's internal /api/endpoints/metrics. |
| DNS / CDN | Route 53 → ALB | Optional CloudFront for static caching. |
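
The backup row in the table above can be made concrete with a Data Lifecycle Manager policy. The sketch below only builds a `PolicyDetails` document in Python and does not call AWS. The `app=curiosity` tag, the 03:00 UTC schedule, the retention values, and the `us-west-2` copy target are illustrative assumptions, and the field names should be checked against the DLM API reference before use.

```python
import json


def build_dlm_policy_details(tag_key: str, tag_value: str,
                             copy_region: str, retain_count: int = 7) -> dict:
    """Build a DLM PolicyDetails document for daily EBS snapshots.

    Every value here is an illustrative assumption, not a workspace requirement.
    """
    if retain_count < 1:
        raise ValueError("retain_count must be at least 1")
    return {
        "ResourceTypes": ["VOLUME"],
        # Volumes carrying this tag are the ones that get snapshotted.
        "TargetTags": [{"Key": tag_key, "Value": tag_value}],
        "Schedules": [{
            "Name": "daily-snapshots",
            "CopyTags": True,
            # One snapshot every 24 hours, starting at 03:00 UTC.
            "CreateRule": {"Interval": 24, "IntervalUnit": "HOURS", "Times": ["03:00"]},
            # Keep the most recent N snapshots in the source Region.
            "RetainRule": {"Count": retain_count},
            # Copy each snapshot to a second Region and keep it N days there.
            "CrossRegionCopyRules": [{
                "Target": copy_region,
                "Encrypted": True,
                "RetainRule": {"Interval": retain_count, "IntervalUnit": "DAYS"},
            }],
        }],
    }


policy = build_dlm_policy_details("app", "curiosity", "us-west-2")
print(json.dumps(policy, indent=2))
```

The printed JSON is meant to be saved to a file and supplied as the policy details when creating the lifecycle policy, together with an execution role that is allowed to create and copy snapshots.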
Option A: Single EC2 instance (small / staging)
- Launch the instance: Ubuntu 22.04 or Amazon Linux 2023, m6i.2xlarge (8 vCPU, 32 GB) or larger.
- Attach EBS storage: a separate `gp3` volume (≥ 200 GB) mounted at `/srv/curiosity`. See AWS Documentation: Make an Amazon EBS volume available for use.
- Install Docker (Get Docker).
- Pull secrets from Secrets Manager:

  ```bash
  aws secretsmanager get-secret-value --secret-id curiosity/prod \
    --query SecretString --output text > /etc/curiosity/.env
  chmod 600 /etc/curiosity/.env
  ```
- Run with Docker Compose following the Docker page, pointing `volumes:` at `/srv/curiosity` and `env_file:` at `/etc/curiosity/.env`.
- Front with an ALB that terminates TLS using an ACM certificate, with the target group pointing at port `8080` on the instance.
- Schedule EBS snapshots via Data Lifecycle Manager.
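
The secrets step above writes `SecretString` to the env file verbatim, which works when the secret is stored as plain `KEY=value` text. If the secret was instead created with the console's key/value editor, `SecretString` is a JSON object and needs converting first. The following is a minimal sketch of that conversion, assuming a flat JSON object; the function name and the example variable names are hypothetical.

```python
import json
import re

# Conservative pattern for environment variable names.
_ENV_KEY = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def secret_json_to_env(secret_string: str) -> str:
    """Convert a flat JSON object into KEY=value lines for an env file."""
    data = json.loads(secret_string)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object of key/value pairs")
    lines = []
    for key, value in data.items():
        if not _ENV_KEY.match(key):
            raise ValueError(f"invalid variable name: {key!r}")
        # Strings are written as-is; other JSON scalars use their JSON form.
        text = value if isinstance(value, str) else json.dumps(value)
        if "\n" in text:
            raise ValueError(f"value for {key} contains a newline")
        lines.append(f"{key}={text}")
    return "\n".join(lines) + "\n"


# Hypothetical payload; these are placeholder names, not real workspace settings.
example = '{"EXAMPLE_API_KEY": "abc123", "EXAMPLE_PORT": 8080}'
print(secret_json_to_env(example), end="")
```

Rejecting multi-line values keeps the sketch simple; a secret such as a PEM key is usually better mounted as a file than squeezed into an env file.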
Option B: EKS
- Provision the cluster with managed node groups in private subnets.
- Install the EBS CSI driver (docs).
- Install the AWS Load Balancer Controller to manage ALBs declaratively, or use NGINX Ingress with an NLB.
- Use External Secrets Operator to project Secrets Manager values into the `curiosity-secrets` Secret.
- Apply the manifest from Kubernetes, setting:
  - `storageClassName: gp3`
  - `volumeClaimTemplates.resources.requests.storage: 200Gi`
  - `ingressClassName: alb` (or your NGINX class), with cert-manager pointing at ACM.
- Configure backups:
  - EBS `VolumeSnapshot`s via the CSI driver, or `MSK_GRAPH_BACKUP_FOLDER` on a sidecar PVC mirrored to S3.
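
The `storageClassName: gp3` setting above only resolves if the cluster has a StorageClass with that name. The sketch below generates one for the EBS CSI driver as JSON, which `kubectl apply -f` accepts alongside YAML. It assumes the cluster does not already define a `gp3` class; the encryption and reclaim settings are illustrative choices rather than requirements stated in this guide.

```python
import json


def gp3_storage_class(name: str = "gp3", iops: int = 3000) -> dict:
    """Build a StorageClass manifest backed by the EBS CSI driver."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "ebs.csi.aws.com",
        # StorageClass parameters must be strings, including numbers.
        "parameters": {"type": "gp3", "iops": str(iops), "encrypted": "true"},
        # Delay provisioning until a pod is scheduled, so the volume is
        # created in the same Availability Zone as the node that uses it.
        "volumeBindingMode": "WaitForFirstConsumer",
        "allowVolumeExpansion": True,
        # Assumption: keep the volume if the claim is deleted by mistake.
        "reclaimPolicy": "Retain",
    }


storage_class = gp3_storage_class()
print(json.dumps(storage_class, indent=2))
```

`WaitForFirstConsumer` matters here because EBS volumes are AZ-local: binding immediately could place the volume in a zone where the pod cannot be scheduled.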
Fargate caveat
EKS on Fargate has no EBS. You must use EFS (ReadWriteMany), and accept that EFS is slower than block storage. Use Fargate only for non-prod environments. (Fargate storage)
Identity (SSO) on AWS
If your users are in Microsoft Entra ID, Google Workspace, or Okta, follow the matching SSO guide. For AWS IAM Identity Center (formerly AWS SSO), wire it up as a custom SAML or OIDC provider.
Observability
- Logs: ship container stdout via FireLens → CloudWatch Logs, or set `MSK_LOG_PATH` and use a Fluent Bit DaemonSet on EKS.
- Metrics: scrape the workspace's `/api/endpoints/metrics` and `/api/chatai/tools/metrics` (admin token) from a CloudWatch agent or Managed Prometheus.
- Alarms: ALB target health, EBS IOPS saturation, container restart rate, scheduled-task failure rate.
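
As one example of the alarms listed above, the ALB target-health alarm can be described as a set of CloudWatch alarm parameters. The sketch below only builds the parameter dictionary and makes no AWS calls. The alarm name, period, evaluation window, and the identifiers in the usage lines are assumptions for illustration.

```python
def alb_unhealthy_hosts_alarm(target_group: str, load_balancer: str,
                              topic_arn: str) -> dict:
    """Build CloudWatch alarm parameters for unhealthy ALB targets.

    target_group and load_balancer are the dimension values CloudWatch uses,
    i.e. the trailing part of each ARN ("targetgroup/..." and "app/...").
    """
    if not target_group.startswith("targetgroup/"):
        raise ValueError("target_group should look like 'targetgroup/<name>/<id>'")
    if not load_balancer.startswith("app/"):
        raise ValueError("load_balancer should look like 'app/<name>/<id>'")
    return {
        "AlarmName": "curiosity-alb-unhealthy-hosts",
        "Namespace": "AWS/ApplicationELB",
        "MetricName": "UnHealthyHostCount",
        "Dimensions": [
            {"Name": "TargetGroup", "Value": target_group},
            {"Name": "LoadBalancer", "Value": load_balancer},
        ],
        "Statistic": "Maximum",
        "Period": 60,            # seconds per datapoint
        "EvaluationPeriods": 3,  # alarm after three consecutive bad minutes
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",
        # Treat missing datapoints as a failure rather than as healthy.
        "TreatMissingData": "breaching",
        "AlarmActions": [topic_arn],
    }


# Hypothetical identifiers, for illustration only.
alarm = alb_unhealthy_hosts_alarm(
    "targetgroup/curiosity/0123456789abcdef",
    "app/curiosity-alb/0123456789abcdef",
    "arn:aws:sns:eu-west-1:123456789012:ops-alerts",
)
print(alarm["AlarmName"], alarm["MetricName"])
```

With boto3 installed, a dictionary of this shape could be passed as keyword arguments to the CloudWatch client's `put_metric_alarm` call; the same fields map onto CloudFormation or Terraform alarm resources.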
Production checklist (AWS-specific)
- Pinned image tag, not `:latest`.
- EBS `gp3` volume in the same AZ as the pod/instance; volumes are AZ-local.
- EBS snapshots scheduled and tested by restoring to a sandbox account.
- Secrets in AWS Secrets Manager, never in the container image or task definition.
- ALB with HTTP/2, TLS 1.2+, ACM-managed certificate, optional WAF.
- Egress through a NAT gateway with a documented allowlist (LLM provider, Docker registry, NuGet).
- CloudWatch alarms for liveness, latency, container restarts.
- Restore drill completed and dated.
See the broader Deployment checklist.
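
Two of the checklist items, the pinned image tag and the AZ match between volume and node, are easy to verify mechanically before a rollout. This is a minimal sketch of such a pre-flight check; the function names, image references, and zone names are hypothetical, and in practice the inputs would come from your manifest and from the EC2 API.

```python
def is_pinned_image(image: str) -> bool:
    """Return True when the reference has a digest or a tag other than latest."""
    if "@" in image:
        return True  # digest references are immutable
    # Only the last path segment can carry a tag; a colon earlier in the
    # reference belongs to a registry port such as localhost:5000.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # no tag means the runtime falls back to :latest
    tag = last_segment.rsplit(":", 1)[1]
    return tag not in ("", "latest")


def checklist_findings(image: str, volume_az: str, node_az: str) -> list[str]:
    """Collect human-readable problems; an empty list means both checks pass."""
    findings = []
    if not is_pinned_image(image):
        findings.append(f"image {image!r} is not pinned to a tag or digest")
    if volume_az != node_az:
        findings.append(f"volume is in {volume_az} but the node is in {node_az}")
    return findings


# Hypothetical image name and zones, for illustration only.
for finding in checklist_findings(
        "registry.example.com/curiosity:latest", "eu-west-1a", "eu-west-1b"):
    print(finding)
```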
See also
- Docker for the single-host case.
- Kubernetes for the EKS case.
- Configuration reference for every variable referenced above.
- Backup and restore for the snapshot procedure.