Curiosity

Migration: Elasticsearch + vector DB + LangChain

If your current enterprise-AI stack is something like Elasticsearch for text search, Pinecone/Weaviate/Qdrant for vector search, LangChain for orchestration, and a thin REST API on top, this page maps each piece onto Curiosity Workspace and shows what you save and what changes.

What maps onto what

| Stitched stack component | Curiosity Workspace equivalent |
|---|---|
| Elasticsearch / OpenSearch text index | Built-in text search |
| Pinecone / Weaviate / Qdrant vector DB | Built-in vector search |
| BM25 / hybrid ranker / reciprocal rank fusion | Built-in hybrid search |
| Neo4j / a "knowledge graph" microservice | Built-in graph engine |
| LangChain retriever chains | Custom endpoints |
| LangChain tool / function calling | AI tools |
| LangChain LCEL orchestration | A C# custom endpoint that orchestrates retrieval → LLM → response |
| Per-tenant ACL filtering in a pre_filter Elastic query | ReBAC graph + CreateSearchAsUserAsync |
| Airflow / Prefect ingestion DAGs | Connectors + scheduled tasks |
| Custom embedding pipeline (HF + a queue) | Built-in embedding pipeline driven by the schema |
| Custom UI built on a chat library | Custom front-end with Tesserae |
| FastAPI / Flask gateway | Workspace's built-in HTTP gateway |
| Per-service auth, JWT, secret management | Workspace token scopes and MSK_* env vars |
| Per-service observability stack | Built-in monitoring |
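For orientation, the fusion step that the built-in hybrid search replaces is usually reciprocal rank fusion: each candidate is scored by summing 1/(k + rank) across the text and vector result lists. This is a generic sketch of standard RRF, not Workspace's internal implementation:

```python
def rrf(rankings, k=60):
    """Reciprocal rank fusion: merge several ranked lists into one.

    rankings: ranked lists of doc ids, e.g. one from BM25 and one from
    the vector index. Each doc scores sum(1 / (k + rank)) over the
    lists it appears in; k=60 is the commonly used constant.
    """
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["doc-a", "doc-b", "doc-c"]   # text-search ranking
vector_hits = ["doc-b", "doc-c", "doc-a"]  # vector-search ranking
print(rrf([bm25_hits, vector_hits]))  # doc-b wins: ranks 2 and 1
```

In a stitched stack this function (and the ACL filtering around it) lives in your glue code; in Workspace it is part of the engine.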

What you save

  • Operating one system instead of four to six. One container, one storage volume, one secrets surface, one upgrade procedure.
  • Coherent permissions. ACLs ingested once apply to text search, vector search, graph traversal, AND AI tools. No more "the chat tool can see what the search API can't".
  • One identity for tokens and users. No need to bridge auth between the gateway, the index, the vector DB, and the LLM router.
  • No glue code for retrieval pipelines. What was 200 lines of LangChain is typically 20 lines of a custom endpoint.
  • Sandbox-friendly local dev. A laptop runs the whole stack via docker run.
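To make the "no glue code" bullet concrete, this stub-level Python sketch shows the coordination a typical stitched retrieval pipeline does by hand; every function name here is illustrative (real stacks wire these to Elasticsearch, a vector DB, and a per-service ACL filter):

```python
# Stub components standing in for the stitched stack's services.
def text_search(query):           # Elasticsearch / OpenSearch
    return ["doc-1", "doc-2"]

def vector_search(query):         # Pinecone / Weaviate / Qdrant
    return ["doc-2", "doc-3"]

def acl_filter(user, doc_ids):    # per-tenant pre_filter, reimplemented per service
    allowed_for_user = {"doc-1", "doc-2"}
    return [d for d in doc_ids if d in allowed_for_user]

def retrieve(user, query):
    # Fan out, merge, dedupe, then re-apply ACLs -- the glue an
    # integrated engine (search + permissions in one place) removes.
    merged = dict.fromkeys(text_search(query) + vector_search(query))
    return acl_filter(user, list(merged))

print(retrieve("alice", "reset password"))  # ['doc-1', 'doc-2']
```

The point is not the line count of this toy, but that each stub is a separate service with its own auth, failure modes, and deployment in the stitched stack.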

What changes

  • You write C# in the workspace for endpoints and AI tools. If your team is 100% Python, the Curiosity.Library.Python SDK covers connectors, but endpoints and tools are C#.
  • You give up some flexibility at the lowest layers — you can't swap in a different vector index or text-analyzer plugin. The tradeoff is that the integrated layers stay correct together.
  • You move ingestion from DAG-shaped Python to connector-shaped C#. The work is similar; the runtime is different.
  • Schema-first. Where Elastic happily accepts arbitrary JSON, Curiosity expects you to declare node and edge schemas.
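The schema-first discipline can be illustrated generically: declare the node's fields up front and reject ingests that don't match, instead of indexing arbitrary JSON. This sketch is not the Curiosity API (schemas there are declared in C# via the SDK); the names are purely illustrative:

```python
# Hypothetical schema declaration: field name -> required Python type.
TICKET_SCHEMA = {"key": str, "title": str, "status": str}

def validate_node(node_type, fields, schema):
    """Reject a node whose fields don't match the declared schema."""
    missing = [f for f in schema if f not in fields]
    wrong = [f for f, v in fields.items()
             if f in schema and not isinstance(v, schema[f])]
    if missing or wrong:
        raise ValueError(f"{node_type}: missing={missing}, wrong_type={wrong}")
    return fields

# A conforming ticket passes; one with a missing field would raise.
validate_node("Ticket", {"key": "T-1", "title": "Login fails", "status": "open"},
              TICKET_SCHEMA)
```

The upside of paying this cost at ingestion time is that search, embeddings, and graph edges can all be derived from the same declared structure.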

A practical migration plan

A staged migration that doesn't require a big-bang cutover:

  1. Stand up Workspace in parallel. Local first, then a staging environment (see the Installation guide).
  2. Mirror one entity type end-to-end. Pick the highest-value one (often "tickets" or "documents"). Build the connector, configure search and embeddings, expose one retrieval endpoint.
  3. Run both stacks side-by-side. A/B the retrieval endpoint against your existing LangChain retriever for a week. Capture precision and latency for both.
  4. Cut one user-facing surface (the chat assistant is usually the cleanest) over to Workspace. Keep search on the old stack until you're ready.
  5. Migrate ACLs. The hardest part of any migration. Audit the source-side permissions; build a connector that ingests them as RestrictAccessTo*. Test with non-admin accounts.
  6. Cut the remaining surfaces.
  7. Retire the old stack. Keep snapshots for whatever retention period your compliance requires.
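Step 3 is the one teams most often under-specify, so here is a minimal harness for it: run the same golden queries against both retrievers and record hit rate and average latency. The retrievers are stubs; swap in real HTTP calls to your LangChain retriever and your Workspace endpoint (both names here are placeholders):

```python
import time

# Golden-query set: query -> ids of known-relevant documents.
golden = {"reset password": {"doc-1"}, "vpn setup": {"doc-7"}}

def old_retriever(q):   # stand-in for the existing LangChain retriever
    return ["doc-1", "doc-4"] if "password" in q else ["doc-2"]

def new_retriever(q):   # stand-in for the Workspace retrieval endpoint
    return ["doc-1"] if "password" in q else ["doc-7", "doc-9"]

def evaluate(retriever, k=5):
    """Return (hit-rate@k, avg latency in ms) over the golden set."""
    hits, start = 0, time.perf_counter()
    for query, relevant in golden.items():
        if relevant & set(retriever(query)[:k]):
            hits += 1
    avg_ms = (time.perf_counter() - start) * 1000 / len(golden)
    return hits / len(golden), avg_ms

for name, r in [("old", old_retriever), ("new", new_retriever)]:
    rate, ms = evaluate(r)
    print(f"{name}: hit-rate@5={rate:.2f} avg-latency={ms:.2f}ms")
```

Run this daily during the side-by-side week so the cutover decision in step 4 rests on numbers rather than impressions.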

What stays the same

  • The data sources don't change. Your existing JSON, your existing PDFs, your existing Confluence — same inputs.
  • The model providers don't change. Bring your existing OpenAI / Azure / Anthropic / local-model keys.
  • The evaluation set doesn't change. Reuse your golden queries to A/B retrieval quality.
  • The user mental model — search box, chat panel, citations — stays the same.

Common surprises

  • Embeddings don't transfer. Vectors from one model aren't comparable to vectors from a different model. Plan to re-embed.
  • Workspace is a single writer. If your current architecture relies on multi-master ingestion, you'll need to consolidate.
  • Reranking is built in. If you were running a separate cross-encoder reranker, evaluate whether hybrid search alone is now sufficient before porting the reranker.
  • Graph traversals are first-class. Patterns that took JOINs and post-filters in the old stack become natural Q().StartAt(...).Out(...) chains.
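The re-embedding plan from the first bullet is mechanical but worth scripting: every document goes back through the new model in batches. A hedged sketch, where embed_fn is a stand-in for your provider's client (OpenAI, Azure, or a local model):

```python
def reembed(doc_texts, embed_fn, batch_size=64):
    """Re-embed all documents with the new model.

    doc_texts: mapping of doc id -> text. Old vectors are discarded
    entirely, since embeddings from different models aren't comparable.
    """
    vectors = {}
    ids = list(doc_texts)
    for i in range(0, len(ids), batch_size):
        batch = ids[i:i + batch_size]
        embeddings = embed_fn([doc_texts[d] for d in batch])
        vectors.update(zip(batch, embeddings))
    return vectors

# Toy embed function for illustration; a real one calls the model API.
fake_embed = lambda texts: [[float(len(t))] for t in texts]
print(reembed({"a": "hi", "b": "hello"}, fake_embed))
```

Budget for this: re-embedding a large corpus is usually the longest-running and most expensive single step of the migration.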
© 2026 Curiosity. All rights reserved.