# Connectors

Connectors are the most flexible way to ingest data into Curiosity Workspace. A connector is a program (or integration component) that:

  • reads from a source system (files, databases, APIs, event streams)
  • maps source records into node schemas and edge schemas
  • commits changes into the workspace graph

# Why connectors matter

Connectors are where you encode the “truth” of how your source systems map into your graph model:

  • stable keys and deduplication
  • relationship creation
  • incremental updates and deletes
  • enrichment (aliases, normalization, derived fields)

# Minimal connector mapping example (C#)

The demo repository illustrates a common ingestion structure: define schemas, upsert nodes, link edges, then commit.

```csharp
// Define schemas once (or validate they exist)
await graph.CreateNodeSchemaAsync<Device>();
await graph.CreateNodeSchemaAsync<Part>();
await graph.CreateEdgeSchemaAsync(typeof(Edges));

// Upsert nodes by key
var deviceNode = graph.TryAdd(new Device { Name = "iPhone 14 Pro Max" });
var partNode   = graph.TryAdd(new Part   { Name = "Loudspeaker" });

// Link nodes with an edge (and optionally the inverse edge name)
graph.Link(deviceNode, partNode, Edges.HasPart, Edges.PartOf);

await graph.CommitPendingAsync();
```

The same define, upsert, link, commit structure carries over to production connectors once you add batching, incremental cursors, and observability.

# Connector responsibilities

At minimum, a connector should:

  • create/update schemas (or validate schemas exist)
  • upsert nodes by stable keys
  • create/update edges between nodes
  • commit in batches and handle retries
  • log ingestion progress and failures
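
The last two responsibilities (batched commits with retries, plus progress logging) can be sketched independently of any particular SDK. In the example below, `commitBatch` is a hypothetical delegate standing in for "upsert these records, then commit"; `BatchedIngestion`, `RetryAsync`, and `IngestAsync` are illustrative names, and the retry counts and delays are arbitrary assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class BatchedIngestion
{
    // Runs `action`, retrying with exponential backoff. Once `maxAttempts`
    // is reached, the last exception propagates to the caller.
    public static async Task RetryAsync(Func<Task> action, int maxAttempts = 3, int baseDelayMs = 200)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                await action();
                return;
            }
            catch (Exception ex) when (attempt < maxAttempts)
            {
                var delayMs = baseDelayMs * (1 << (attempt - 1)); // 200, 400, 800, ...
                Console.Error.WriteLine($"Attempt {attempt} failed ({ex.Message}); retrying in {delayMs} ms");
                await Task.Delay(delayMs);
            }
        }
    }

    // Splits records into fixed-size batches, commits each batch with retries,
    // and logs cumulative progress. Returns the number of records committed.
    public static async Task<int> IngestAsync<T>(
        IEnumerable<T> records, int batchSize, Func<IReadOnlyList<T>, Task> commitBatch)
    {
        var committed = 0;
        foreach (var batch in records.Chunk(batchSize)) // Enumerable.Chunk requires .NET 6+
        {
            await RetryAsync(() => commitBatch(batch));
            committed += batch.Length;
            Console.WriteLine($"Committed {committed} records so far");
        }
        return committed;
    }
}
```

This sketch retries on any exception for brevity. A real connector would usually retry only transient failures (timeouts, throttling) and fail fast on data errors, and because a retried batch may be partially applied, upserts by stable key are what make the retry safe.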

# Designing connector-friendly schemas

Good connector design starts with good schema design:

  • each node type has a stable key
  • relationships are explicit edges
  • large text fields are properties (later indexed for search/embeddings)

See Schema Design.

# Incremental ingestion patterns

Choose one:

  • Full refresh: rebuild everything on each run (simple, expensive)
  • Incremental: ingest only changes since last run (recommended for production)
  • Event-driven: apply changes as they happen (fastest, most complex)

For incremental ingestion, track:

  • source cursor/watermark (timestamp, sequence)
  • deletes (tombstones or periodic reconciliation)
  • schema evolution strategy (backfills)
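
As a rough illustration of watermark tracking, the sketch below selects the changes newer than the stored watermark, separates tombstones from upserts, and computes the next watermark. The `SourceChange` record and its fields are hypothetical; how a real source exposes modification times and deletions will differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical change record; IsDeleted marks a tombstone in the source.
public record SourceChange(string Key, DateTimeOffset ModifiedAt, bool IsDeleted);

public static class IncrementalSync
{
    // Returns the upserts and deletes that happened after `watermark`,
    // plus the watermark to store once this run has committed successfully.
    public static (List<SourceChange> Upserts, List<SourceChange> Deletes, DateTimeOffset NextWatermark)
        Plan(IEnumerable<SourceChange> changes, DateTimeOffset watermark)
    {
        var pending = changes
            .Where(c => c.ModifiedAt > watermark)
            .OrderBy(c => c.ModifiedAt)
            .ToList();

        var upserts = pending.Where(c => !c.IsDeleted).ToList();
        var deletes = pending.Where(c => c.IsDeleted).ToList();

        // If nothing changed, keep the previous watermark.
        var next = pending.Count > 0 ? pending[^1].ModifiedAt : watermark;
        return (upserts, deletes, next);
    }
}
```

Persist `NextWatermark` only after the corresponding commit succeeds, so a failed run is retried from the old position. The strict `>` comparison can miss records that share the watermark's exact timestamp but were written later; a sequence number, or a small overlap window combined with idempotent upserts, avoids that.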

# Testing and validation

After running a connector, validate:

  • counts per node type
  • edge completeness
  • no duplicate keys
  • search indexes include the intended fields (if configured)
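
Two of these checks (duplicate keys and edge completeness) are easy to automate once you have the ingested keys and edge endpoints in hand. The sketch below assumes you have already exported them as plain strings and key pairs; how you obtain them from the workspace is outside its scope, and `IngestionChecks` is an illustrative name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IngestionChecks
{
    // Keys that occur more than once within a single node type.
    public static List<string> DuplicateKeys(IEnumerable<string> keys) =>
        keys.GroupBy(k => k)
            .Where(g => g.Count() > 1)
            .Select(g => g.Key)
            .ToList();

    // Edges whose source or target key does not match any ingested node key.
    public static List<(string From, string To)> DanglingEdges(
        IEnumerable<(string From, string To)> edges, IEnumerable<string> nodeKeys)
    {
        var known = new HashSet<string>(nodeKeys);
        return edges
            .Where(e => !known.Contains(e.From) || !known.Contains(e.To))
            .ToList();
    }
}
```

Running both checks at the end of each connector run, and failing the run when either list is non-empty, catches unstable keys and missing nodes before they surface as broken navigation.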

# Common pitfalls

  • Unstable keys cause duplicates and broken links.
  • Missing edges make the graph unusable for navigation and graph-filtered search.
  • Ingesting everything as text reduces the value of schemas and facets.

# Next steps