#
Schema Design
Schema design is the most important step in building a successful Curiosity Workspace application. It determines:
- how users navigate and explore data
- how search is scoped and filtered
- how AI features can ground and enrich results
- how connectors keep data consistent over time
#
The three layers of a good schema
- Entities (nodes): the “things” in your domain
- Relationships (edges): the meaningful links between those things
- Attributes (properties): the descriptive fields used for display, filtering, and retrieval
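As a rough illustration of how the three layers fit together, the sketch below models them as plain Python data classes. The class names (`Node`, `Edge`) and the sample fields are assumptions made for this example; they are not the Curiosity Workspace API.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Node:
    """An entity: one 'thing' in the domain."""
    node_type: str                  # e.g. "Customer" or "Ticket"
    key: str                        # stable identity within the node type
    properties: dict[str, Any] = field(default_factory=dict)  # attributes


@dataclass(frozen=True)
class Edge:
    """A relationship: a typed, directed link between two nodes."""
    edge_type: str
    source: tuple[str, str]         # (node_type, key) of the origin node
    target: tuple[str, str]         # (node_type, key) of the destination node


# Hypothetical sample data, for illustration only.
customer = Node("Customer", "cust-001", {"name": "Acme Corp", "region": "EMEA"})
ticket = Node("Ticket", "tick-042", {"subject": "Login fails", "priority": "high"})
has_ticket = Edge(
    "HasTicket",
    (customer.node_type, customer.key),
    (ticket.node_type, ticket.key),
)
```

Entities carry identity, edges carry meaning between entities, and properties stay local to the node they describe.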
#
Start from user journeys
Ask these questions before you write your first node schema:
- What are the top 5 questions users ask?
- What are the top 5 workflows users execute?
- Which objects do they search for first?
- What do they click on next?
Those answers typically map directly to:
- primary node types
- the edges between them
- the filters and facets you must support
#
Keys: pick stable identity early
For each node type, define a stable key:
- Prefer stable IDs from the source system.
- If not available, use a deterministic key strategy (canonicalization + hash).
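- Avoid random IDs unless you will never need to re-run ingestion: a random key changes on every run, so re-ingesting the same record creates a duplicate instead of updating it.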
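One possible reading of "canonicalization + hash" is sketched below. The helper names (`canonicalize`, `deterministic_key`), the choice of SHA-256, and the 16-character truncation are assumptions for illustration, not a prescribed strategy.

```python
import hashlib
import unicodedata


def canonicalize(value: str) -> str:
    """Normalize a value so trivially different spellings compare equal."""
    normalized = unicodedata.normalize("NFKC", value)
    # Case-fold and collapse all runs of whitespace to a single space.
    return " ".join(normalized.casefold().split())


def deterministic_key(node_type: str, *parts: str) -> str:
    """Derive a repeatable key from the fields that identify a record."""
    # The unit separator keeps ("ab", "c") distinct from ("a", "bc").
    canonical = "\x1f".join(canonicalize(part) for part in parts)
    payload = f"{node_type}\x1f{canonical}".encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()
    # Assumption: 16 hex characters (64 bits) is enough for this data set;
    # keep more of the digest if collisions are a concern.
    return f"{node_type}:{digest[:16]}"


# The same logical record yields the same key on every ingestion run.
key_a = deterministic_key("Customer", "  Acme  Corp ", "DE")
key_b = deterministic_key("Customer", "acme corp", "de")
```

Because the key depends only on the canonicalized identifying fields, re-running ingestion updates existing nodes rather than creating new ones. Which fields count as identifying is a per-source decision.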
#
When to make something a node vs a property
Use a property when:
- the value is only displayed or filtered on the current node
- you do not need to navigate to it as an entity
Use a node + edge when:
- you need cross-cutting filters (e.g., status across multiple types)
- you need navigation and context building (“show all tickets for this customer”)
- the value should have its own metadata over time
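The trade-off can be seen in a small in-memory sketch. The data, the `Task` node type, and the tuple layout are hypothetical and exist only to contrast the two options.

```python
from collections import defaultdict

# Option A: status as a property. Filtering works, but only within one node type.
tickets = {
    "tick-1": {"subject": "Login fails", "status": "Open"},
    "tick-2": {"subject": "Slow export", "status": "Closed"},
}
open_tickets = [key for key, props in tickets.items() if props["status"] == "Open"]

# Option B: status as a shared node, reached through a HasStatus edge.
# Each edge is (source, edge_type, target); nodes are (node_type, key) pairs.
edges = [
    (("Ticket", "tick-1"), "HasStatus", ("Status", "Open")),
    (("Ticket", "tick-2"), "HasStatus", ("Status", "Closed")),
    (("Task", "task-7"), "HasStatus", ("Status", "Open")),
]

# Index edges by target so one lookup answers "everything that is Open",
# regardless of the source node type.
by_status = defaultdict(list)
for source, edge_type, target in edges:
    if edge_type == "HasStatus":
        by_status[target].append(source)

open_items = by_status[("Status", "Open")]
```

With the property, each node type needs its own filter. With the shared `Status` node, a single lookup spans tickets and tasks, and the status itself can carry metadata of its own.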
#
Relationship modeling patterns
Common patterns:
- Ownership / membership: Customer -> HasTicket -> Ticket
- Attribution as node: Ticket -> HasStatus -> Status
- Mentions / linking: Document -> Mentions -> Entity
- Bipartite linking: avoid duplicating properties by linking to shared nodes
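To make these patterns concrete, here is a minimal sketch of a directed graph that indexes edges in both directions. The `Graph` class and the string node identifiers are assumptions for this example and do not reflect how Curiosity Workspace stores edges.

```python
from collections import defaultdict


class Graph:
    """Tiny in-memory graph with typed, directed edges."""

    def __init__(self):
        self._out = defaultdict(list)  # (source, edge_type) -> [target, ...]
        self._in = defaultdict(list)   # (target, edge_type) -> [source, ...]

    def link(self, source, edge_type, target):
        self._out[(source, edge_type)].append(target)
        self._in[(target, edge_type)].append(source)

    def outgoing(self, source, edge_type):
        return list(self._out.get((source, edge_type), []))

    def incoming(self, target, edge_type):
        return list(self._in.get((target, edge_type), []))


g = Graph()
# Ownership / membership
g.link("Customer/acme", "HasTicket", "Ticket/1")
g.link("Customer/acme", "HasTicket", "Ticket/2")
# Attribution as node: both tickets point at one shared Status node
g.link("Ticket/1", "HasStatus", "Status/Open")
g.link("Ticket/2", "HasStatus", "Status/Open")
# Mentions / linking
g.link("Document/runbook", "Mentions", "Customer/acme")

tickets_for_customer = g.outgoing("Customer/acme", "HasTicket")
everything_open = g.incoming("Status/Open", "HasStatus")
docs_mentioning_customer = g.incoming("Customer/acme", "Mentions")
```

The shared `Status/Open` node is the bipartite case in miniature: the value exists once, and every ticket links to it instead of storing its own copy.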
#
Schema evolution
Expect schema evolution in real systems:
- add properties as new data becomes available
- introduce new node/edge types for new workflows
- backfill or reparse content when pipelines change
Operational advice:
- version your connector logic and treat schema updates as deployments
- plan reindex and reparse windows for large changes
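One way to act on this advice is to stamp each stored record with the connector version that produced it, then backfill stale records in batches. The sketch below assumes a hypothetical `CONNECTOR_VERSION` constant and `StoredRecord` shape; neither is part of any real connector API.

```python
from dataclasses import dataclass

# Hypothetical: bumped whenever parsing logic or the schema mapping changes.
CONNECTOR_VERSION = 3


@dataclass
class StoredRecord:
    key: str
    parsed_with: int  # connector version that last produced this record


def needs_reparse(record: StoredRecord, current_version: int = CONNECTOR_VERSION) -> bool:
    """A record is stale if an older connector version produced it."""
    return record.parsed_with < current_version


def reparse_batches(records, batch_size):
    """Yield stale records in fixed-size batches so a backfill fits a planned window."""
    stale = [record for record in records if needs_reparse(record)]
    for start in range(0, len(stale), batch_size):
        yield stale[start:start + batch_size]


records = [
    StoredRecord("a", 1),
    StoredRecord("b", 3),
    StoredRecord("c", 2),
    StoredRecord("d", 1),
]
batches = [[record.key for record in batch] for batch in reparse_batches(records, 2)]
```

Here `batches` contains the keys of the three stale records split into groups of at most two. Records already at the current version are skipped, which keeps a large reparse incremental and restartable.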
#
Next steps
- Learn ingestion patterns that keep schemas consistent: Ingestion Pipelines
- Tune search based on schema decisions: Search Optimization