# Semantic Similarity

Semantic similarity answers the question “how related are these two items by meaning?”, even when the items share no keywords.

In Curiosity Workspace, semantic similarity is usually powered by:

  • embeddings (vector representations)
  • vector indexes for fast nearest-neighbor lookup
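To make the two building blocks concrete, here is a minimal sketch in plain Python. It is not Curiosity Workspace code: the function names, the item ids, and the three-dimensional toy vectors are all hypothetical, and cosine similarity is assumed as the comparison measure. The brute-force scan stands in for what a vector index does far more efficiently on large collections.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as unrelated to everything
    return dot / (norm_a * norm_b)

def nearest_neighbors(query, vectors, k=3):
    """Brute-force top-k lookup: score every stored vector, keep the best k."""
    scored = [(item_id, cosine_similarity(query, vec)) for item_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical toy embeddings; real embeddings have hundreds of dimensions.
vectors = {
    "case-1": [0.9, 0.1, 0.0],
    "case-2": [0.8, 0.2, 0.1],
    "case-3": [0.0, 0.1, 0.9],
}
top = nearest_neighbors([1.0, 0.0, 0.0], vectors, k=2)
print([item_id for item_id, _ in top])
```

With these toy vectors, "case-1" and "case-2" point in nearly the same direction as the query, so they rank ahead of "case-3".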

# Common use cases

  • “Show similar cases/documents”
  • “Find duplicates or near-duplicates”
  • “Recommend related entities”
  • “Cluster items into themes”

# Similarity in context

Similarity is most useful when constrained by context, such as:

  • only compare within a given customer or project
  • only compare within a product line
  • only compare within a time window

You can enforce context via:

  • facets (property filters)
  • graph traversal (related-to constraints)
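The facet-based approach can be sketched as "filter first, then rank". The example below is an illustration in plain Python, not the Curiosity Workspace API: the item dictionaries, the `customer` and `created` properties, and the `similar_in_context` helper are all assumptions made for the sketch.

```python
import math
from datetime import date

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def similar_in_context(query_vec, items, customer=None, since=None, k=5):
    """Apply property filters (facets) first, then rank the survivors by similarity."""
    candidates = [
        item for item in items
        if (customer is None or item["customer"] == customer)
        and (since is None or item["created"] >= since)
    ]
    ranked = sorted(
        candidates,
        key=lambda item: cosine_similarity(query_vec, item["vector"]),
        reverse=True,
    )
    return [item["id"] for item in ranked[:k]]

# Hypothetical items with two filterable properties and a toy embedding.
items = [
    {"id": "A", "customer": "acme", "created": date(2024, 5, 1), "vector": [0.9, 0.1]},
    {"id": "B", "customer": "acme", "created": date(2023, 1, 10), "vector": [1.0, 0.0]},
    {"id": "C", "customer": "globex", "created": date(2024, 6, 2), "vector": [1.0, 0.0]},
    {"id": "D", "customer": "acme", "created": date(2024, 7, 15), "vector": [0.1, 0.9]},
]
result = similar_in_context([1.0, 0.0], items, customer="acme", since=date(2024, 1, 1))
print(result)
```

Items "B" and "C" match the query vector exactly, yet neither is returned: "B" falls outside the time window and "C" belongs to another customer. This is the point of constraining by context: the most similar item overall is not always the most relevant one.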

# Practical tuning tips

  • Choose the right field(s) to embed (often Summary + first message, or Title + body).
  • Use chunking for long fields, so that each embedding covers a focused span of text.
  • Tune similarity cutoffs (minimum scores) so “similar” is meaningful for your domain.
  • Evaluate with known pairs (positive and negative examples).
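Two of these tips lend themselves to a short sketch: chunking a long field, and checking a cutoff against known pairs. Everything below is hypothetical and independent of any Curiosity Workspace API: the helper names, the word-window chunking strategy, the chunk sizes, and the scored pairs are illustrative assumptions only.

```python
def chunk_words(text, size=50, overlap=10):
    """Split text into overlapping word windows, one candidate embedding per window."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks

def precision_recall(scored_pairs, cutoff):
    """Evaluate a cutoff on (similarity, is_truly_similar) pairs."""
    tp = sum(1 for score, label in scored_pairs if score >= cutoff and label)
    fp = sum(1 for score, label in scored_pairs if score >= cutoff and not label)
    fn = sum(1 for score, label in scored_pairs if score < cutoff and label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical evaluation set: similarity score plus a human judgement per pair.
pairs = [
    (0.92, True), (0.85, True), (0.80, False),
    (0.60, True), (0.40, False), (0.30, False),
]
for cutoff in (0.5, 0.75, 0.9):
    precision, recall = precision_recall(pairs, cutoff)
    print(f"cutoff={cutoff:.2f} precision={precision:.2f} recall={recall:.2f}")

print(chunk_words("a b c d e f g h i j", size=4, overlap=1))
```

Raising the cutoff generally trades recall for precision. Sweeping a few values over labelled positive and negative pairs from your own data is one simple way to pick a value that matches what "similar" should mean in your domain.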

# Next steps