Embeddings API

Reference for the embeddings surface in Curiosity Workspace: how to encode text, query for similar nodes, and project embeddings for visualization. For the conceptual model, see NLP → Embeddings. For tuning retrieval, see Vector search.

The two layers

Layer       | Use                                                                                                       | Surface
----------- | --------------------------------------------------------------------------------------------------------- | -----------------------------------
Graph query | "Find nodes similar to text X, optionally filtered by graph or type."                                      | Q().StartAtSimilarText(...) in C#.
REST API    | External callers — get a vector for a text, get vectors for nodes, find similar nodes, project to 2D/3D.   | /api/embeddings/*

The graph query is the common path. Reach for REST when you need raw vectors (for example, to mirror them into another store or to feed a UMAP visualization).

StartAtSimilarText (graph query)

return Q().StartAtSimilarText(
              text:        "screen flicker after sleep",
              count:       50,
              nodeTypes:   new[] { "SupportCase" },
              indexUID:    default,
              applyCutoff: false)
          .EmitWithScores();
Parameter   | Type     | Default    | Notes
----------- | -------- | ---------- | -----------------------------------------------------------------------------
text        | string   | (required) | The text to compare against.
count       | int      | 50         | Maximum nearest neighbors to return.
nodeTypes   | string[] | null       | If set, restrict candidates to these types.
indexUID    | IndexUID | default    | Pick a specific similarity index (when multiple are configured for a type).
applyCutoff | bool     | false      | If true, drop hits below the index's configured similarity threshold.

The result is a graph query — you can chain IsRelatedTo, Where, Take, etc. before terminating with EmitWithScores() (most common) or Emit("N").
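
For example, a minimal sketch of constraining the vector hits with graph structure before emitting (the string-based IsRelatedTo overload and the Product type here are assumptions for illustration):

// Over-fetch vector hits, narrow by graph structure, then cap the result.
// The IsRelatedTo overload shown here is assumed for illustration.
return Q().StartAtSimilarText(
              text:        "screen flicker after sleep",
              count:       200,
              nodeTypes:   new[] { "SupportCase" },
              applyCutoff: true)
          .IsRelatedTo("Product")   // keep only cases linked to a Product node
          .Take(25)
          .EmitWithScores();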

REST endpoints

All routes live under /api/embeddings/ and are admin- or token-scoped. Responses default to MessagePack for vector payloads (fast and compact); JSON is used for control-plane responses.

POST /api/embeddings/uids

Get vectors for a set of nodes.

{
  "UIDs":     ["…", "…"],
  "NodeType": "SupportCase",
  "IndexUID": "…"
}

Response: a MessagePack-serialized LabelledVector[], where each item carries a UID and a float[] vector. The shape is { UID: UID128, Vector: float[] } repeated.

Use this when you need to mirror embeddings into another vector store, or when you're building a custom analysis tool.
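
A sketch of calling the endpoint from C# with the MessagePack-CSharp package; the host, token, and UID values are placeholders, and the string-keyed LabelledVector layout below is an assumption based on the shape described above:

using System;
using System.Net.Http;
using System.Net.Http.Json;
using MessagePack;

var http = new HttpClient { BaseAddress = new Uri("https://workspace.example.com") }; // placeholder host
http.DefaultRequestHeaders.Authorization = new("Bearer", "<token>");                  // placeholder token

var response = await http.PostAsJsonAsync("/api/embeddings/uids", new
{
    UIDs     = new[] { "<uid-1>", "<uid-2>" },   // placeholder UIDs
    NodeType = "SupportCase",
    IndexUID = "<index-uid>"                     // placeholder index UID
});
response.EnsureSuccessStatusCode();

byte[] payload = await response.Content.ReadAsByteArrayAsync();
var vectors = MessagePackSerializer.Deserialize<LabelledVector[]>(payload);

// Assumed client-side mirror of the documented { UID, Vector } item;
// UID is modeled as a string here, while the server side uses UID128.
[MessagePackObject(keyAsPropertyName: true)]
public class LabelledVector
{
    public string UID { get; set; }
    public float[] Vector { get; set; }
}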

POST /api/embeddings/similar/

Find nodes most similar to a single node.

Query param | Type | Default | Notes
----------- | ---- | ------- | -------------------
count       | int  | 100     | Maximum neighbors.

Response: JSON array of similar UIDs. Cheap when you already have a UID and only need IDs back — StartAtSimilarText is the better choice when starting from free text.
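
A short sketch, reusing the HttpClient from the /uids example above; the trailing path segment is assumed to be the source node's UID (it is elided in this reference):

// POST with no body; count rides on the query string.
var res = await http.PostAsync("/api/embeddings/similar/<uid>?count=25", content: null); // <uid> is a placeholder
res.EnsureSuccessStatusCode();

// Assumed to deserialize as plain UID strings.
string[] similar = await res.Content.ReadFromJsonAsync<string[]>();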

POST /api/embeddings/projected

Run UMAP over the embeddings for visualization.

{
  "Dimensions":             2,
  "MaximumNumberOfPoints":  10000,
  "NumberOfEpochsOverride": 200
}

Response: MessagePack containing the projected points. Use for cluster visualizations and "embedding atlas" UIs. Expensive — cache results.
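
A sketch of the call, again reusing the HttpClient from above; the exact MessagePack shape of the response is not specified here, so deserializing to one float[] per point is an assumption:

var res = await http.PostAsJsonAsync("/api/embeddings/projected", new
{
    Dimensions             = 2,
    MaximumNumberOfPoints  = 10000,
    NumberOfEpochsOverride = 200
});
res.EnsureSuccessStatusCode();

// Assumed payload shape: one float[Dimensions] per projected point.
byte[] bytes = await res.Content.ReadAsByteArrayAsync();
float[][] points = MessagePackSerializer.Deserialize<float[][]>(bytes);
// Cache the result: the projection is expensive to recompute.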

GET /api/embeddings/availablefor/{nodeType}

List the embedding indexes configured for a node type. Returns a JSON list of IndexUIDs, each with the model name and field. Useful when a type has multiple indexes (e.g. summary + body embedded separately).

GET /api/embeddings/available

List every node type that has at least one embedding index. JSON { nodeType: [indexUID, …] }.

Encoding text directly

From inside an endpoint or scheduled task you can ask the graph to encode a string:

float[] vector = await Graph.EncodeAsync("screen flicker", indexUID: default);

Or pick a specific encoder model:

float[] vector = await Graph.EncodeAsync("screen flicker", SentenceEncoderModel.MiniLM);

These calls use the active configured provider (see below). They count against your provider quota if the provider is external.
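
Raw vectors can be compared directly, for instance with plain cosine similarity (the helper below is ordinary math, not a workspace API). Both strings must be encoded against the same index, since vectors from different models are not comparable:

float[] a = await Graph.EncodeAsync("screen flicker", indexUID: default);
float[] b = await Graph.EncodeAsync("display flashes after wake", indexUID: default);

float similarity = Cosine(a, b);   // close to 1.0 means near-identical meaning

// Standard cosine similarity over two equal-length vectors.
static float Cosine(float[] x, float[] y)
{
    float dot = 0, nx = 0, ny = 0;
    for (int i = 0; i < x.Length; i++)
    {
        dot += x[i] * y[i];
        nx  += x[i] * x[i];
        ny  += y[i] * y[i];
    }
    return dot / (MathF.Sqrt(nx) * MathF.Sqrt(ny));
}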

Batching

External providers support batch encoding — pay the network cost once for a list of strings:

float[][] vectors = await encoder.EncodeAsync(texts, cancellationToken);

Batch sizes are bounded by the provider's per-request token limit. The encoder splits large lists internally; you don't need to chunk at the call site.

Provider configuration

The active encoder is set on the index, not per request. Models supported:

Encoder model | Notes
------------- | -------------------------------------------------------------------
MiniLM        | Local, small, fast. Reasonable quality. No external network calls.
ArcticXS      | Local, higher quality than MiniLM. Larger memory footprint.
External      | Use a hosted provider (see below). Requires API key and network.
None          | Embedding indexing disabled for this index.

When SentenceEncoderModel = External, the index uses:

Setting                | Notes
---------------------- | -----------------------------------------------------------
ExternalProviderName   | OpenAi, Anthropic, Cohere, Google, or AzureOpenAi.
ExternalProviderModel  | The provider's model name (e.g. text-embedding-3-large).
ExternalProviderApiKey | Secret. Store in a secret manager, not in source.
ExternalProviderUrl    | Custom URL — for self-hosted or proxied providers.

Switching models requires re-embedding the entire corpus. Vectors from different models are not comparable — searches will silently degrade until the rebuild completes. See Reindexing and re-embedding.

Vector index configuration (per field)

Embedding indexes are per (nodeType, field). A single type can carry several indexes — for example, an Article with one index over title and another over body. Configure them in the admin UI under the type's index settings; the schema layer is not aware of them.

The query API picks the right index automatically based on nodeTypes. If you need to disambiguate (multiple indexes on the same type), pass indexUID to StartAtSimilarText.
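
A sketch of disambiguating; the Article type and the variable holding the index UID are illustrative, and in practice the UID would come from GET /api/embeddings/availablefor/Article:

// Placeholder: replace with the UID of the body-field index as
// returned by the index listing for this type.
IndexUID bodyIndexUID = default;

return Q().StartAtSimilarText(
              text:      "battery drains overnight",
              count:     50,
              nodeTypes: new[] { "Article" },
              indexUID:  bodyIndexUID)   // targets the body index, not the title index
          .EmitWithScores();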

Errors and rate limits

  • HTTP 401 / 403 — token missing or wrong scope. See Token scopes.
  • HTTP 404 — availablefor/{nodeType} returns 404 if no embedding index exists for that type.
  • HTTP 429 — exceeded the external provider's rate limit. The workspace retries with exponential backoff but eventually surfaces the 429 to the caller. Tune your external-provider quota or fall back to a local model for high-volume ingestion.
  • HTTP 503 — embeddings are still indexing. Retry; the result will become available once the build completes.

For the full error code list, see Error codes.

Best practices

  • Choose fields with intent. Embed long, descriptive fields. Don't embed status codes — they're better as facets.
  • Chunk long content. Documents over a few thousand characters should be chunked. See Vector search → chunking.
  • Constrain by graph. A two-step "vector retrieve, graph filter" query is almost always better than a pure vector search with post-hoc filtering.
  • Treat applyCutoff as the default for production. Without a cutoff every query returns count hits, even when nothing is actually similar.
  • Re-embed after model changes. Schedule the rebuild outside business hours; the corpus is unavailable for vector search until it completes.