Semantic Similarity

Semantic similarity answers "how related are these two items by meaning?" without depending on shared keywords. It's the foundation for "similar docs," "duplicate detection," "related cases," and "you might also like" surfaces.

This page is the practitioner's guide: when to use it, how to call it, how to set thresholds, and what to put on the UI. For provider setup, see LLM configuration. For the API surface, see Embeddings API.

Use cases

| Surface | Pattern |
| --- | --- |
| "Similar documents" panel | Top-N similar by embedding, filtered by type. |
| "Similar tickets" auto-suggest | Top-N similar tickets to the open one, scoped to the same product line. |
| Product recommendations | Similar products by description, scoped to the user's region. |
| Duplicate detection | Pairs above a high similarity threshold (e.g. > 0.95) flagged for review or merge. |
| Theme clustering | All items embedded → group by similarity → label clusters. |
| Lookalike audiences | Find customers semantically similar to a seed set. |

"Similar docs" — the canonical pattern

// Naive attempt: start the query at the article node itself.
var seedUid = Node.GetUID("Article", articleId);

var candidates = Q()
    .StartAt("Article", articleId)
    .EmitWithScores();   // returns only the seed node — not what we want; see below

The right shape: seed the similarity index with text from the article rather than starting at the node itself.

return Q().StartAtSimilarText(
              text:        article.Body.FirstParagraph(),
              count:       50,
              nodeTypes:   new[] { "Article" },
              applyCutoff: true)
          .Where(n => n.UID != seedUid)   // exclude the seed
          .Take(10)
          .EmitWithScores();

For "more like this UID" without recomputing the embedding, call the REST endpoint:

POST /api/embeddings/similar/{seedUid}?count=10

It returns the top-N UIDs directly. See Embeddings API → /similar/{uid}.
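
From C#, a minimal sketch of that call. The base address and the response shape (assumed here to be a JSON array of UID strings) are assumptions; verify them against the Embeddings API reference.

using System.Net.Http.Json;   // for ReadFromJsonAsync

// Sketch only: base URL and response shape are assumptions.
using var http = new HttpClient { BaseAddress = new Uri("https://your-workspace.example.com") };
var response = await http.PostAsync($"/api/embeddings/similar/{seedUid}?count=10", content: null);
response.EnsureSuccessStatusCode();
var similarUids = await response.Content.ReadFromJsonAsync<string[]>();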

Similar tickets / cases

Constrain by the dimensions that matter — product, customer, time window — before scoring.

return Q().StartAtSimilarText(
              text:        openTicket.Summary,
              count:       500,
              nodeTypes:   new[] { "SupportCase" })
          .IsRelatedTo(Node.GetUID("Product", openTicket.ProductId))
          .Where(n => n.GetTime(N.SupportCase.OpenedAt) > DateTimeOffset.UtcNow.AddYears(-1))
          .Take(5)
          .EmitWithScores();

The graph filter (IsRelatedTo) is much cheaper than the vector branch, so push as much constraint as possible into the graph.

Similar products

Embed the product description (and optionally category + features) once during ingestion. At query time:

return Q().StartAtSimilarText(text, count: 100, nodeTypes: new[] { "Product" })
          .Where(n => n.GetString(N.Product.Region) == userRegion)
          .Where(n => n.GetFloat(N.Product.Price) <= maxPrice)
          .Take(20)
          .EmitWithScores();
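
At ingestion time, the text to embed can be composed from several fields. A minimal sketch, assuming a hypothetical Product shape; how the composed text reaches the similarity index depends on your ingestion pipeline:

using System.Linq;

record Product(string Description, string Category, string[] Features);

// Sketch: concatenate description, category, and features into one
// embeddable text. Skip empty fields so they don't dilute the vector.
static string BuildEmbedText(Product p) =>
    string.Join("\n", new[] { p.Description, p.Category, string.Join(", ", p.Features) }
        .Where(s => !string.IsNullOrWhiteSpace(s)));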

UX nudge: surface 3–8 results, not 20. Past 10, similarity scores fall off and users start ignoring the list.

Threshold guidance

A similarity score is meaningful only relative to your data. The thresholds below are starting points — calibrate against a labeled set (see Relevance evaluation).

| Score (cosine, normalized to [0, 1]) | Typical interpretation |
| --- | --- |
| ≥ 0.95 | Probable duplicate. Flag for review. |
| 0.85 – 0.95 | Very similar. Safe to suggest as related. |
| 0.70 – 0.85 | Related. Show in a "you might also like" rail. |
| 0.50 – 0.70 | Loosely related. Useful for browsing only. |
| < 0.50 | Probably noise. Don't surface. |

Set applyCutoff = true and configure the per-index threshold to filter the long tail server-side. Without a cutoff, a query about "marine biology" against a corpus of marketing docs still returns its top-N — just lower-scored.
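
For the UI, a minimal mapping from score to the buckets above (a sketch; the thresholds are the uncalibrated starting points from the table):

// Sketch: bucket a normalized cosine score into a UI label.
// Calibrate the thresholds against labeled data before shipping.
static string SimilarityLabel(float score) => score switch
{
    >= 0.95f => "Probable duplicate",
    >= 0.85f => "Very similar",
    >= 0.70f => "Related",
    >= 0.50f => "Loosely related",
    _        => "Not surfaced"   // below cutoff: don't show at all
};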

UX recommendations

  • Show 3–8 items, not 50. Relevance falls off past 10.
  • Label the score, don't show the number. "Very similar," "Related," "Loosely related" — bucketed labels beat decimals for non-technical users.
  • Anchor to the seed. Always show what the recommendations are similar to so the user can recalibrate.
  • Diversity matters. If the top 5 are near-duplicates, group them. Variety beats redundancy.
  • Refresh boundaries. "Similar items, last 12 months" beats "similar items, all time" for time-sensitive corpora.
  • Cite the dimension. "Similar by description" vs "Similar by who bought it" — the user should know which one they're seeing.

Duplicate detection workflow

  1. Generate pairs. For each new node, query top-K similar with a high threshold (≥ 0.95).
  2. Score the pair. Combine embedding similarity with cheap signals: shared keys, identical hash on a normalized field, identical author + day.
  3. Decide (see the sketch after this list).
    • All signals agree → auto-merge.
    • Embedding-high, signals-weak → queue for human review.
    • Embedding-weak → skip.
  4. Audit. Keep the merge log; embeddings drift after re-embeds, so review old merges quarterly.
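
A minimal sketch of the decision in step 3, assuming the cheap signals have already been computed as booleans (the parameter names are illustrative, not a real API):

enum DupAction { AutoMerge, HumanReview, Skip }

// Sketch: combine embedding similarity with cheap signals.
static DupAction Decide(float similarity, bool sharedKeys, bool sameContentHash, bool sameAuthorAndDay)
{
    if (similarity < 0.95f)
        return DupAction.Skip;                        // embedding-weak → skip

    var agreeing = (sharedKeys ? 1 : 0) + (sameContentHash ? 1 : 0) + (sameAuthorAndDay ? 1 : 0);

    return agreeing == 3
        ? DupAction.AutoMerge                         // all signals agree → auto-merge
        : DupAction.HumanReview;                      // embedding-high, signals weak → review
}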

Theme clustering

For small corpora (≤ 100k items), pull all embeddings, run HDBSCAN or k-means in Python, label clusters by sampling top items. For larger corpora, use the workspace's UMAP projection (/api/embeddings/projected) as the input and cluster the projected points.
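
For illustration, a tiny in-process k-means over raw embedding vectors (a sketch; for real workloads use a library implementation as noted above):

using System;
using System.Linq;

// Sketch: minimal k-means over embedding vectors. Returns, for each
// point, the index of the cluster it was assigned to.
static int[] KMeans(float[][] points, int k, int iterations = 20)
{
    var rnd = new Random(42);
    var centroids = points.OrderBy(_ => rnd.Next())   // seed with k random points
                          .Take(k)
                          .Select(p => (float[])p.Clone())
                          .ToArray();
    var assignment = new int[points.Length];

    for (var iter = 0; iter < iterations; iter++)
    {
        // Assign each point to its nearest centroid (squared Euclidean).
        for (var i = 0; i < points.Length; i++)
            assignment[i] = Enumerable.Range(0, k)
                                      .OrderBy(c => Distance2(points[i], centroids[c]))
                                      .First();

        // Move each centroid to the mean of its assigned points.
        for (var c = 0; c < k; c++)
        {
            var members = points.Where((_, i) => assignment[i] == c).ToArray();
            if (members.Length == 0) continue;        // keep an empty cluster's old centroid
            for (var d = 0; d < centroids[c].Length; d++)
                centroids[c][d] = members.Average(m => m[d]);
        }
    }
    return assignment;
}

static float Distance2(float[] a, float[] b)
{
    var sum = 0f;
    for (var i = 0; i < a.Length; i++) { var diff = a[i] - b[i]; sum += diff * diff; }
    return sum;
}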

Common pitfalls

  • One embedding for everything. Different surfaces need different fields embedded. A title-only embedding is great for "similar headlines" and bad for "similar narratives."
  • Forgetting the seed. "Similar to article 42" returning article 42 looks broken. Always exclude.
  • Stale embeddings. Switching models without re-embedding silently degrades every similarity surface. See Reindexing and re-embedding.
  • No cutoff in production. Every query returns count results regardless. Set the threshold.