# Graph Model

Curiosity Workspace represents data as a labeled property graph:

  • Nodes represent entities (e.g., User, Ticket, Device, Policy)
  • Edges represent relationships (e.g., Created, Mentions, BelongsTo)
  • Properties store structured fields on nodes (and potentially edges)
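
As a rough illustration of the three building blocks, the sketch below models nodes, edges, and properties with plain Python dataclasses. The class and field names (`Node`, `Edge`, `key`, `source`, `target`) are assumptions made for this example and are not the Workspace API.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Node:
    type: str   # label, e.g. "User" or "Ticket"
    key: str    # stable identifier within the type
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    type: str                 # relationship label, e.g. "Created"
    source: tuple[str, str]   # (node type, node key) of the start node
    target: tuple[str, str]   # (node type, node key) of the end node
    properties: dict[str, Any] = field(default_factory=dict)


# Hypothetical sample data: a user who created a ticket.
user = Node("User", "alice@example.com", {"name": "Alice"})
ticket = Node("Ticket", "T-1001", {"status": "open"})
created = Edge("Created", (user.type, user.key), (ticket.type, ticket.key))
```

Referencing nodes by `(type, key)` rather than by object identity keeps edges valid even when a node's properties are later updated.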

The graph model is the backbone for:

  • exploration (“show me related things”)
  • context building (neighbors, clusters, paths)
  • graph-constrained search (“search within this customer’s accounts”)

# Schemas: types and constraints

Workspaces are schema-driven:

  • Node schemas define node types and their keys/properties.
  • Edge schemas define relationship types and, optionally, direction conventions.

Why schemas matter:

  • schemas make data predictable for applications and endpoints
  • schemas reduce ambiguity in search configuration (which fields exist?)
  • schemas enable governance (what is allowed to be ingested?)
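
To make the governance point concrete, here is a minimal sketch of checking a record against a node schema before ingestion. The schema layout (`NODE_SCHEMAS`) and the `validate_node` helper are hypothetical and only illustrate the idea; the real schema definition format may differ.

```python
# Hypothetical schema registry: node type -> key field and allowed properties.
NODE_SCHEMAS = {
    "Ticket": {
        "key": "ticket_id",
        "properties": {"ticket_id": str, "title": str, "priority": int},
    },
}


def validate_node(node_type, record):
    """Return a list of problems; an empty list means the record conforms."""
    schema = NODE_SCHEMAS.get(node_type)
    if schema is None:
        return [f"unknown node type: {node_type}"]
    problems = []
    if schema["key"] not in record:
        problems.append(f"missing key field: {schema['key']}")
    for name, value in record.items():
        expected = schema["properties"].get(name)
        if expected is None:
            problems.append(f"unexpected property: {name}")
        elif not isinstance(value, expected):
            problems.append(f"{name} should be {expected.__name__}")
    return problems
```

A connector could call `validate_node` on each record and skip or log anything that returns a non-empty list, so that only schema-conforming data reaches the graph.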

# Keys: identity and deduplication

Every node type should have a stable key (or a deterministic ID strategy). Keys determine:

  • whether ingestion updates an existing node or creates a new one
  • how external systems reference nodes
  • how you build safe connectors (idempotency)
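
The update-or-create behavior can be sketched with an in-memory store keyed by `(type, key)`. `InMemoryGraph` and `upsert_node` are illustrative names chosen for this example, not the Workspace ingestion API.

```python
class InMemoryGraph:
    """Toy node store used only to illustrate key-based upserts."""

    def __init__(self):
        self.nodes = {}  # (node type, key) -> properties dict

    def upsert_node(self, node_type, key, properties):
        """Create the node if (type, key) is new, otherwise merge properties.

        Returns True when a new node was created.
        """
        node_id = (node_type, key)
        created = node_id not in self.nodes
        self.nodes.setdefault(node_id, {}).update(properties)
        return created


graph = InMemoryGraph()
first = graph.upsert_node("Ticket", "T-1001", {"status": "open"})
second = graph.upsert_node("Ticket", "T-1001", {"status": "closed"})
```

Because both calls use the same key, the second call updates the existing node instead of adding a duplicate. Re-running a connector over the same source data is therefore safe.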

Common patterns:

  • Natural key: a stable domain ID (ticket_id, email, device_serial)
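  • Synthetic key: a generated ID that is stored in the external system and reused on every ingestion run
  • Deterministic hash: a stable hash of a canonical record (the hash changes whenever the hashed fields or their canonical form change, so take care during schema evolution)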
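
One way to build a deterministic hash key is to hash a canonical JSON form of a fixed set of identity fields. The sketch below assumes SHA-256 and a hypothetical `deterministic_key` helper; the field names are made up for the example.

```python
import hashlib
import json


def deterministic_key(record, identity_fields):
    """Hash only the chosen identity fields, in a canonical serialization."""
    canonical = {name: record[name] for name in identity_fields}
    # sort_keys and fixed separators make the output independent of dict order.
    payload = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


IDENTITY_FIELDS = ["vendor", "serial"]  # hypothetical identity fields

key_a = deterministic_key({"vendor": "acme", "serial": "SN-9"}, IDENTITY_FIELDS)
key_b = deterministic_key(
    {"serial": "SN-9", "vendor": "acme", "location": "Berlin"}, IDENTITY_FIELDS
)
```

Hashing an explicit field list rather than the whole record means that adding a new non-identity property later (such as `location` here) does not change the key.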

# Relationships: modeling decisions

Edges can be used for:

  • navigation: user → tickets → product
  • faceting/filtering: ticket → status, ticket → team
  • enrichment: entity mentions → resolved entities

A useful modeling rule:

  • Properties store attributes (string/number/date values)
  • Edges store associations (connect one entity to another entity)
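
The rule can be illustrated with a small example in which a ticket's title and creation date are properties, while its team is a separate node reached through a `BelongsTo` edge. The data layout and the `tickets_for_team` helper are assumptions made for this sketch.

```python
# Attributes live on the node as properties.
nodes = {
    ("Ticket", "T-1001"): {"title": "VPN drops", "created": "2024-05-01"},
    ("Ticket", "T-1002"): {"title": "Wi-Fi slow", "created": "2024-05-03"},
    ("Ticket", "T-1003"): {"title": "Badge reader offline", "created": "2024-05-04"},
    ("Team", "network"): {"name": "Network"},
    ("Team", "facilities"): {"name": "Facilities"},
}

# Associations between entities live on edges: (source, edge type, target).
edges = [
    (("Ticket", "T-1001"), "BelongsTo", ("Team", "network")),
    (("Ticket", "T-1002"), "BelongsTo", ("Team", "network")),
    (("Ticket", "T-1003"), "BelongsTo", ("Team", "facilities")),
]


def tickets_for_team(team_key):
    """Follow BelongsTo edges backwards from a team to its tickets."""
    return sorted(
        source[1]
        for source, edge_type, target in edges
        if edge_type == "BelongsTo" and target == ("Team", team_key)
    )
```

Because the team is a node, the same edge serves both navigation (team → tickets) and filtering (tickets of one team). A free-text `team` property would support neither without string matching.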

# Querying the graph

Graph queries typically follow a pattern like this:

  • Start at a node type or a known node
  • Traverse in/out along specific edge types
  • Filter by property, type, or timestamp
  • Return nodes, paths, or aggregates
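
The start, traverse, filter, return pattern can be sketched over a tiny in-memory graph. This is plain Python for illustration only and is not the Workspace query language; `traverse_out` and `where` are hypothetical helpers.

```python
from collections import defaultdict

# Hypothetical sample graph.
nodes = {
    "u1": {"type": "User", "name": "Alice"},
    "t1": {"type": "Ticket", "status": "open"},
    "t2": {"type": "Ticket", "status": "closed"},
    "d1": {"type": "Device", "serial": "SN-9"},
}
edges = [("u1", "Created", "t1"), ("u1", "Created", "t2"), ("t1", "Mentions", "d1")]

# Index outgoing edges by (source, edge type) for fast traversal.
out_index = defaultdict(list)
for source, edge_type, target in edges:
    out_index[(source, edge_type)].append(target)


def traverse_out(start_ids, edge_type):
    """Follow outgoing edges of one type from every start node."""
    return [t for s in start_ids for t in out_index[(s, edge_type)]]


def where(node_ids, **conditions):
    """Keep nodes whose properties match all the given values."""
    return [
        n for n in node_ids
        if all(nodes[n].get(k) == v for k, v in conditions.items())
    ]


# Start at a known node, traverse, filter, traverse again, return nodes.
open_tickets = where(traverse_out(["u1"], "Created"), status="open")
devices = traverse_out(open_tickets, "Mentions")
```

Each step takes and returns a list of node IDs, so steps compose in the same order as the pattern describes.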

See Reference → Graph Query Language.

# Common pitfalls

  • Over-normalizing: turning every string into a node can bloat the graph; reserve nodes for values that need navigation/filtering.
  • Under-linking: keeping everything as properties removes graph value; add edges where users will navigate or constrain search.
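  • Unstable keys: the most common cause of duplicate nodes during ingestion; when a key changes between runs, ingestion creates a new node instead of updating the existing one.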
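
A common source of unstable keys is inconsistent formatting of the same value across source systems. The sketch below assumes email addresses are used as keys and that trimming whitespace and lowercasing is an acceptable canonical form for your data; `normalize_email_key` is a hypothetical helper.

```python
def normalize_email_key(raw):
    """Canonicalize an email before using it as a node key.

    Assumed rule for this example: trim whitespace and lowercase.
    """
    return raw.strip().lower()


# The same person as seen by three hypothetical source systems.
raw_values = ["Alice@Example.com", " alice@example.com ", "alice@example.com"]

keys_without_normalization = set(raw_values)
keys_with_normalization = {normalize_email_key(v) for v in raw_values}
```

Without normalization the three variants would produce three distinct keys and therefore three nodes. With it, they collapse to a single key.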
