Migration: legacy enterprise search
If you're moving from a traditional enterprise-search platform (Coveo, Algolia for enterprise, Endeca/Oracle, Sinequa, Mindbreeze, GSA, FAST/SharePoint Search, …), this page maps the patterns you already know onto their Curiosity Workspace equivalents.
What maps onto what
| Legacy enterprise search | Curiosity Workspace |
|---|---|
| Crawler / source connector | Connector using Curiosity.Library |
| Out-of-the-box source plugins (Confluence, SharePoint, Box, etc.) | Built-in integrations under Settings → Integrations, plus custom connectors for what isn't covered |
| Document-level ACL ingestion ("crawl-time security") | RestrictAccessToTeam, RestrictAccessToUser, ReBAC |
| Late-binding security (filter at query time using user's groups) | Built-in — CreateSearchAsUserAsync applies the user's ACL filter at query time |
| Tuning UI for relevance | Search Optimization, boosts per indexed field, hybrid retrieval |
| Synonyms, dictionaries, stop words | NLP pipelines, language-specific analyzers |
| Faceted search | Property facets and graph-relationship facets — the latter is the differentiator |
| Connectors marketplace | Smaller built-in set + custom connectors via the SDK |
| Query-rewriting layer | A custom endpoint in front of search |
| Personalized re-ranking | Custom endpoint that re-ranks results using user-context signals from the graph |
| Multilingual analyzer packs | Built-in internationalization |
| Search UI widgets | Tesserae components in a custom front-end |
| Analytics / query log dashboards | Workspace monitoring + your own custom analytics endpoint |
What you gain
- Knowledge graph alongside search. Most legacy platforms have no concept of entities and relationships. Curiosity gives you both, with consistent permissions across them.
- Built-in AI. No need to bolt on a separate LLM gateway, vector DB, or RAG framework. Embeddings, chat, and citations are part of the platform.
- One license, one image, one deployment. Legacy enterprise search often comes with separately licensed crawlers, indexes, analyzers, and admin UIs.
- Modern auth. Native OIDC / Entra ID / Okta / Auth0 / SAML. No proprietary identity adapters.
What changes
- Crawler vs connector mindset. Legacy crawlers are configured to "discover" content; Curiosity connectors are explicit programs that map source records into typed nodes and edges. This is more code but produces a cleaner data model.
- ACL model. Most legacy platforms store flat ACLs on each document. Curiosity uses a graph-based ReBAC model — user → team → owns → resource. Migration usually means mapping source-system groups onto Workspace teams.
- Configuration as code. Legacy platforms expose configuration through admin UIs that are hard to version. Curiosity's schema, endpoints, and tools are C# you commit to git.
- No proprietary query language. No XQuery, no Coveo query syntax, no Endeca rules. Curiosity exposes a small set of built-in operators and lets you implement complex behaviors as endpoints.
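To make the connector mindset concrete, here is a minimal Python sketch of mapping source records into typed nodes and edges. It deliberately does not use Curiosity.Library (real connectors are C# programs). The Node and Edge classes, the record fields, and the edge labels are all hypothetical names chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    type: str  # e.g. "Document", "Author" (hypothetical node types)
    key: str   # stable key, so re-running the connector does not create duplicates


@dataclass(frozen=True)
class Edge:
    source: Node
    label: str
    target: Node


def map_record(record: dict) -> tuple[set[Node], set[Edge]]:
    """Map one source record (assumed shape) into typed nodes and edges."""
    doc = Node("Document", record["id"])
    nodes, edges = {doc}, set()
    for name in record.get("authors", []):
        # Normalise the key so "Ada" and "ada " resolve to the same node.
        author = Node("Author", name.strip().lower())
        nodes.add(author)
        edges.add(Edge(doc, "AuthoredBy", author))
    if record.get("space"):
        space = Node("Space", record["space"])
        nodes.add(space)
        edges.add(Edge(doc, "InSpace", space))
    return nodes, edges


records = [
    {"id": "wiki-1", "authors": ["Ada", "Grace"], "space": "ENG"},
    {"id": "wiki-2", "authors": ["ada "], "space": "ENG"},
]
all_nodes: set[Node] = set()
all_edges: set[Edge] = set()
for record in records:
    nodes, edges = map_record(record)
    all_nodes |= nodes
    all_edges |= edges
```

Because nodes are identified by type plus stable key, the two records share one Author node and one Space node. That is the "cleaner data model" the extra connector code buys you.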
A practical migration plan
- Catalog your current sources. Which connectors are in use? Which are critical, which are optional?
- Stand up Workspace in parallel (Installation).
- Map identities first. Configure SSO and the group → team mapping. Permissions must work before content goes in.
- Migrate one source end-to-end. Pick the highest-value source (usually the one with the most user queries). Build the connector, configure search and ACLs, validate with real users.
- Run both stacks in parallel. Run an A/B comparison for at least a week against a real query log; compare precision, latency, and user behaviour.
- Add the AI layer. This is what was missing from the legacy platform. Build the first chat endpoint and watch user behaviour shift toward it.
- Migrate the remaining sources one at a time. Retire the legacy platform when the last source is cut over.
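The parallel-run comparison in the plan above could be scored along these lines. This is a sketch under stated assumptions: the log format, the "legacy" and "workspace" keys, and the relevance judgments are hypothetical, and you would fill them by replaying your own query log against both stacks.

```python
from __future__ import annotations

from statistics import mean, median


def precision_at_k(results: list[str], relevant: set[str], k: int = 5) -> float:
    """Fraction of the top-k slots filled with a relevant document."""
    hits = sum(1 for doc in results[:k] if doc in relevant)
    return hits / k


def compare(log: list[dict], k: int = 5) -> dict:
    """Summarise both stacks over the same replayed query log."""
    summary = {}
    for stack in ("legacy", "workspace"):
        summary[stack] = {
            "precision": mean(
                precision_at_k(q[stack]["results"], q["relevant"], k) for q in log
            ),
            "median_latency_ms": median(q[stack]["latency_ms"] for q in log),
        }
    return summary


# Two replayed queries with hand-labelled relevant documents (made-up data).
log = [
    {
        "relevant": {"a", "b"},
        "legacy": {"results": ["a", "x", "y", "z"], "latency_ms": 120},
        "workspace": {"results": ["a", "b", "x", "y"], "latency_ms": 90},
    },
    {
        "relevant": {"c"},
        "legacy": {"results": ["x", "c"], "latency_ms": 200},
        "workspace": {"results": ["c", "x"], "latency_ms": 110},
    },
]
summary = compare(log, k=2)
```

Precision and latency are only two of the three signals the plan names. User behaviour (click-through, reformulated queries) needs its own instrumentation.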
Migrating ACLs specifically
Document-level ACLs are the most error-prone part of a search migration. Two patterns work well:
- Group → team mapping. For most platforms the right move is: each source-system group becomes a Workspace _AccessGroup. The connector mirrors group membership and calls RestrictAccessToTeam(doc, group).
- Per-document override. When a document has a unique ACL (a one-off share), model it with a per-user restriction. Use sparingly; teams scale better.
Test with a non-admin account before declaring the migration done. The most common bug is the connector running as system context and accidentally making everything visible to everyone.
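The two patterns and the non-admin check can be prototyped offline before any connector code exists. The sketch below is plain Python, not the Curiosity SDK. The "group:<name>" / "user:<name>" principal encoding and every function name are hypothetical. Treating a document with no recorded restriction as hidden is a choice made for this check only, not a statement about Workspace defaults.

```python
from __future__ import annotations


def plan_acl_migration(
    doc_acls: dict[str, list[str]], group_members: dict[str, set[str]]
) -> tuple[dict[str, set[str]], dict[str, set[str]], dict[str, set[str]]]:
    """Turn flat source ACLs into teams plus per-document restrictions."""
    # Pattern 1: one team per source-system group, membership mirrored as-is.
    teams = {group: set(members) for group, members in group_members.items()}
    team_restrictions: dict[str, set[str]] = {}
    user_restrictions: dict[str, set[str]] = {}
    for doc, principals in doc_acls.items():
        for principal in principals:
            kind, _, name = principal.partition(":")
            if kind == "group":
                if name not in teams:
                    raise KeyError(f"ACL on {doc} references unknown group {name!r}")
                team_restrictions.setdefault(doc, set()).add(name)
            elif kind == "user":
                # Pattern 2: per-document override for a one-off share.
                user_restrictions.setdefault(doc, set()).add(name)
            else:
                raise ValueError(f"Unrecognised principal {principal!r} on {doc}")
    return teams, team_restrictions, user_restrictions


def can_see(user, doc, teams, team_restrictions, user_restrictions) -> bool:
    """Simulate a non-admin lookup; no matching restriction means hidden."""
    if user in user_restrictions.get(doc, set()):
        return True
    return any(user in teams[team] for team in team_restrictions.get(doc, set()))


doc_acls = {
    "spec.docx": ["group:eng"],
    "offer.pdf": ["group:hr", "user:bob"],  # bob has a one-off share
}
group_members = {"eng": {"alice", "bob"}, "hr": {"carol"}}
plan = plan_acl_migration(doc_acls, group_members)

visible_to_alice = sorted(d for d in doc_acls if can_see("alice", d, *plan))
visible_to_bob = sorted(d for d in doc_acls if can_see("bob", d, *plan))
```

Comparing these per-user visibility lists against what the legacy platform returns for the same users is a cheap way to catch the "everything visible to everyone" bug before real users do.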
Common surprises
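- Search "loses" results. Almost always a permission bug or an analyzer mismatch. Sign in as admin to bisect: if the admin can find the document, the problem is permissions; if not, look at the analyzer or at ingestion.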
- Result counts differ. Legacy platforms often counted duplicates from multiple sources. Curiosity dedupes by stable key.
- Boost values don't transfer. Different scoring math. Re-tune against your evaluation set.
- Crawl-time vs query-time security drift. Legacy stacks sometimes index ACLs and forget to refresh them. Curiosity ReBAC is graph-based, so a membership change is reflected on the next user request.
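The admin-versus-user bisect for "lost" results can be automated over a list of documents the legacy platform used to return. This is a sketch only: the function name and verdict labels are hypothetical, and the result sets are assumed to come from running the same query once as an admin and once as the affected user.

```python
from __future__ import annotations


def classify_missing(
    expected: set[str], admin_results: set[str], user_results: set[str]
) -> dict[str, str]:
    """Label each expected document by where it drops out of the results."""
    verdicts = {}
    for doc in sorted(expected):
        if doc in user_results:
            verdicts[doc] = "ok"
        elif doc in admin_results:
            # Indexed and matched by the query, but hidden from this user.
            verdicts[doc] = "permission"
        else:
            # Even the admin cannot find it: check the analyzer or ingestion.
            verdicts[doc] = "not matched"
    return verdicts


verdicts = classify_missing(
    expected={"handbook", "payroll", "résumé-template"},
    admin_results={"handbook", "payroll"},
    user_results={"handbook"},
)
```

Here "payroll" is hidden by permissions, which may well be correct, while "résumé-template" never matches at all, which points at the analyzer (accent handling, for instance) or at a document that was never ingested.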
Related
- Architecture overview.
- Access Control Model.
- From Elasticsearch + vector DB + LangChain — the sibling migration page for teams coming from a stitched modern stack.