# Semantic Similarity
Semantic similarity answers “how related are these two items by meaning?” even when they do not share keywords.
In Curiosity Workspace, semantic similarity is usually powered by:
- embeddings (vector representations)
- vector indexes for fast nearest-neighbor lookup
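As a minimal sketch of the two pieces above, the following brute-force nearest-neighbor lookup ranks items by cosine similarity over precomputed embedding vectors. The item names and vectors are invented for illustration; a real setup would use the configured embedding model and a vector index instead of a linear scan.

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product normalized by vector lengths.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy precomputed embeddings (in practice these come from an embedding model).
vectors = {
    "refund request": [0.9, 0.1, 0.0],
    "money back inquiry": [0.8, 0.2, 0.1],
    "password reset": [0.0, 0.1, 0.9],
}

def most_similar(query_vec, k=2):
    # Score every item against the query and return the top-k names.
    scored = sorted(vectors.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [name for name, _ in scored[:k]]

# "refund request" and "money back inquiry" share no keywords,
# but their vectors are close, so both rank above "password reset".
print(most_similar([0.85, 0.15, 0.05]))
```

A vector index (e.g. HNSW-based) replaces the linear scan with an approximate lookup so this scales beyond a few thousand items.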
## Common use cases
- “Show similar cases/documents”
- “Find duplicates or near-duplicates”
- “Recommend related entities”
- “Cluster items into themes”
## Similarity in context
Similarity is most useful when constrained by context, such as:
- only compare within a given customer or project
- only compare within a product line
- only compare within a time window
You can enforce context via:
- facets (property filters)
- graph traversal (related-to constraints)
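The facet-filter approach can be sketched as: restrict the candidate set first, then rank the survivors by similarity. The item records, `customer` facet, and vectors below are illustrative assumptions.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Hypothetical items, each with an embedding and a "customer" facet.
items = [
    {"id": "A", "customer": "acme",   "vec": [0.9, 0.1]},
    {"id": "B", "customer": "acme",   "vec": [0.1, 0.9]},
    {"id": "C", "customer": "globex", "vec": [0.88, 0.12]},  # very similar, wrong customer
]

def similar_within(query_vec, customer):
    # Apply the facet filter first, then rank only the matching items.
    candidates = [it for it in items if it["customer"] == customer]
    return sorted(candidates,
                  key=lambda it: cosine(query_vec, it["vec"]),
                  reverse=True)

hits = similar_within([0.9, 0.1], "acme")
print([h["id"] for h in hits])  # "C" is excluded despite high similarity
```

Filtering before ranking is what keeps "similar" meaningful: without the constraint, item "C" would outrank "B" even though it belongs to a different customer.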
## Practical tuning tips
- Choose the right field(s) to embed (often Summary + first message, or Title + body).
- Use chunking for long fields.
- Tune cutoffs so “similar” is meaningful for your domain.
- Evaluate with known pairs (positive and negative examples).
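The last two tips can be combined into a simple evaluation loop: score labeled pairs at several candidate cutoffs and keep the cutoff that classifies them best. The labeled pairs and vectors below are made-up toy data.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Hypothetical labeled pairs: (vec1, vec2, is_related).
pairs = [
    ([1.0, 0.0], [0.9, 0.1], True),
    ([1.0, 0.0], [0.8, 0.3], True),
    ([1.0, 0.0], [0.1, 0.9], False),
    ([0.0, 1.0], [0.2, 0.8], True),
    ([0.0, 1.0], [0.9, 0.2], False),
]

def accuracy(cutoff):
    # A pair is predicted "related" when its similarity meets the cutoff;
    # accuracy is the fraction of pairs where prediction matches the label.
    correct = sum((cosine(a, b) >= cutoff) == label for a, b, label in pairs)
    return correct / len(pairs)

for cutoff in (0.5, 0.9, 0.95):
    print(cutoff, accuracy(cutoff))
```

On this toy data a 0.95 cutoff starts rejecting true positives, which is exactly the failure mode this kind of check is meant to catch before it reaches users.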
## Next steps
- Configure embeddings: NLP → Embeddings
- Apply similarity to retrieval: Search → Vector Search