Connectors
A connector is an external program (typically a long-running C# console app) that reads from a source system and writes nodes/edges/ACLs into a Curiosity Workspace using the Curiosity.Library SDK. Connectors are the canonical way to keep a workspace in sync with the truth in your source systems.
If your data lives somewhere standard and you don't need custom mapping, the built-in integrations in the UI may be sufficient. Connectors give you full control over schemas, keys, edges, and ACL ingestion — which you'll want as soon as the data shape matters.
Connector lifecycle
A well-formed connector run does five things:
- Authenticate to the workspace with an API token scoped to ingestion.
- Register schemas (idempotent).
- Read deltas from the source (initial sync = full; subsequent runs = incremental).
- Upsert nodes and edges with stable keys.
- Commit in bounded batches; record a cursor for the next run.
Available built-in integrations
Configurable from Settings → Integrations, no code required:
- Filesystem — local files and network shares.
- Web — crawl and index public/internal websites.
- Database — JDBC-style connections to PostgreSQL, MySQL, SQL Server, etc.
- SaaS connectors — popular business systems (Slack, Jira, ServiceNow, Confluence, Microsoft 365, Google Drive, and others depending on your license).
For systems not in the list — or when you need custom mapping, custom keys, or ACL ingestion that the built-in connector doesn't model — build a custom connector.
Minimal connector (C#)
The smallest end-to-end connector that ingests typed entities with edges and ACLs:
using Curiosity.Library;
[Node]
public class Customer
{
[Key] public string Id { get; set; }
[Property] public string Name { get; set; }
[Property] public string Tier { get; set; }
}
[Node]
public class Ticket
{
[Key] public string Id { get; set; }
[Property] public string Subject { get; set; }
[Property] public string Body { get; set; }
[Timestamp] public DateTimeOffset CreatedAt { get; set; }
}
public static class Edges
{
public const string HasTicket = nameof(HasTicket);
public const string TicketOf = nameof(TicketOf);
}
using var workspace = await Workspace.ConnectAsync(
baseUrl: Environment.GetEnvironmentVariable("WORKSPACE_URL"),
apiToken: Environment.GetEnvironmentVariable("WORKSPACE_TOKEN"));
var graph = workspace.Graph;
await graph.CreateNodeSchemaAsync<Customer>();
await graph.CreateNodeSchemaAsync<Ticket>();
await graph.CreateEdgeSchemaAsync(typeof(Edges));
var enterprise = await graph.CreateTeamAsync("Enterprise Support", "Enterprise customers");
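// 'source' and 'lastCursor' below are placeholders for your own source-system client and its persisted sync cursor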
await foreach (var row in source.StreamSinceAsync(lastCursor))
{
var customer = graph.TryAdd(new Customer { Id = row.CustomerId, Name = row.CustomerName, Tier = row.Tier });
var ticket = graph.TryAdd(new Ticket { Id = row.TicketId, Subject = row.Subject, Body = row.Body, CreatedAt = row.CreatedAt });
graph.Link(customer, ticket, Edges.HasTicket, Edges.TicketOf);
if (row.Tier == "Enterprise")
graph.RestrictAccessToTeam(ticket, enterprise);
if (row.Index % 500 == 0)
await graph.CommitPendingAsync();
}
await graph.CommitPendingAsync();
await source.SaveCursorAsync();
Run it with a scoped API token:
export WORKSPACE_URL=http://localhost:8080
export WORKSPACE_TOKEN=<ingestion-scoped token>
dotnet run --project FirstApp.Connector
For an end-to-end developer walkthrough (with NLP extraction, embeddings, and a UI), see Build your first enterprise AI app.
Connector responsibilities
A production-grade connector needs to do all of these. None are optional in real environments:
| Responsibility | What "good" looks like |
|---|---|
| Schemas | Registered once at startup; evolution handled with versioned migrations. |
| Keys | Stable IDs from source. Never random. Never depend on row order. |
| Edges | Created explicitly with named edge types. Both directions when readability matters. |
| ACLs | RestrictAccessToTeam / RestrictAccessToUser mirroring source-system permissions. |
| Commits | Batched (100–500 items). One CommitPendingAsync() per batch; one final flush. |
| Cursors | A persistent watermark (timestamp + sequence) so reruns are idempotent. |
| Deletes | Tombstone propagation, or periodic reconciliation against source. |
| Observability | Per-batch counts, durations, error counts. Failures surface clearly. |
| Retries | Exponential backoff on transient failures (network, 429, 5xx); see the sketch after this table. |
| Secrets | API token + source credentials in a secret manager, never in source. |
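For the retries row above, a minimal backoff wrapper, assuming transient failures surface as exceptions; the isTransient predicate and the IsTransientError name in the usage line are your own code, not part of the SDK:
// Minimal exponential-backoff wrapper (1s, 2s, 4s, ...) around any async call, e.g. a batch commit.
// 'isTransient' is your own check for retryable failures (network errors, HTTP 429/5xx).
static async Task RetryAsync(Func<Task> action, Func<Exception, bool> isTransient, int maxAttempts = 5)
{
    for (var attempt = 1; ; attempt++)
    {
        try { await action(); return; }
        catch (Exception ex) when (attempt < maxAttempts && isTransient(ex))
        {
            await Task.Delay(TimeSpan.FromSeconds(Math.Pow(2, attempt - 1)));
        }
    }
}
// Usage: await RetryAsync(() => graph.CommitPendingAsync(), IsTransientError);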
Permission ingestion patterns
ACLs are why a workspace connector is fundamentally different from simply pumping data into a search index. You typically have one of three permission shapes:
Source-mirrored ACLs (recommended)
Read the source's permission model (groups, sharing rules, projects) and call RestrictAccessToTeam / RestrictAccessToUser to mirror it. Membership changes flow on the next run.
foreach (var share in row.Shares)
{
var team = await graph.CreateTeamAsync(share.GroupName, share.GroupDescription);
graph.RestrictAccessToTeam(ticket, team);
}
Tier-based ACLs
The source doesn't have a permission model, but you have a known segmentation rule (free vs paid, region, business unit).
if (row.Tier == "Enterprise")
graph.RestrictAccessToTeam(ticket, enterpriseTeam);
Public-by-default with explicit private overrides
Most content is public; a small subset is restricted.
if (row.IsConfidential)
graph.RestrictAccessToTeam(ticket, restrictedTeam);
// else: default visibility = public
See Access Control Model and Permission model architecture.
Incremental sync patterns
| Pattern | When to use | Notes |
|---|---|---|
| Full refresh | Small datasets, weekly runs | Simplest. Expensive at scale. |
| Watermark-based incremental | Sources with reliable timestamps | Pull updated_after = <last cursor>. Most common pattern. |
| Change-feed / webhook | Sources with native change feeds | Near-real-time. Most complex; needs idempotent writers. |
| Reconciliation pass | Anywhere deletes are critical | Periodic full scan that tombstones missing records. |
Whatever pattern you pick, the writes must be idempotent: re-running the connector should not create duplicates or change node counts.
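For the watermark pattern, a minimal sketch of loading and saving the cursor between runs; the SyncCursor shape, file location, and JSON format are illustrative choices, not SDK requirements:
using System.Text.Json;

// A watermark cursor: last-seen timestamp plus a sequence number to break ties.
public record SyncCursor(DateTimeOffset UpdatedAfter, long Sequence);

public static class CursorStore
{
    public static SyncCursor Load(string path) =>
        File.Exists(path)
            ? JsonSerializer.Deserialize<SyncCursor>(File.ReadAllText(path))!
            : new SyncCursor(DateTimeOffset.MinValue, 0);

    public static void Save(string path, SyncCursor cursor) =>
        File.WriteAllText(path, JsonSerializer.Serialize(cursor));
}

// Only advance the cursor after a successful commit; because keys are stable,
// replaying the same window on a re-run cannot create duplicate nodes.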
Delete handling
The graph engine doesn't auto-delete data when source rows disappear. You have to do it explicitly. Three workable approaches:
- Tombstone column in source. Soft-delete in graph (row.IsDeleted = true).
- Reconciliation pass comparing source primary keys to graph nodes; delete the difference (sketched below).
- Audit-driven deletes triggered by source webhooks.
Hard-delete nodes with graph.RemoveNode(uid) if you need them gone (vs. soft-deleted). Both forms remove them from search.
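A sketch of the reconciliation approach, assuming two hypothetical helpers (GetSourceIdsAsync and GetIngestedTicketsAsync) that you would implement yourself, for example by querying the source and keeping a record of ingested keys and node UIDs next to the cursor:
// Reconciliation pass: anything ingested earlier that no longer exists in the source is removed.
var sourceIds = await GetSourceIdsAsync();        // hypothetical: primary keys currently in the source
var ingested  = await GetIngestedTicketsAsync();  // hypothetical: (sourceId, nodeUid) pairs written so far

foreach (var (sourceId, nodeUid) in ingested)
{
    if (!sourceIds.Contains(sourceId))
        graph.RemoveNode(nodeUid);                // hard delete; or flag a tombstone property instead
}
await graph.CommitPendingAsync();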
Connector testing checklist
- Schema registration is idempotent (run twice; no errors).
- Re-running ingestion does not change node/edge counts.
- Cursor advances forward only.
- Source credentials and the workspace API token are read from env vars (or a secret manager), never embedded.
- Deletes in source are reflected in the workspace within one run cycle.
- An end-user test account sees the data it should and only that data.
- Failure modes (source down, API token expired, body parse error) surface as exceptions with enough context to debug.
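For the first two checks, a sketch of a smoke test against a disposable test workspace; RunConnectorAsync and CountNodesAsync are hypothetical helpers standing in for your connector entry point and whatever counting query you have available:
// Run the connector twice over the same source data and assert the second run changes nothing.
await RunConnectorAsync();                        // hypothetical: one full connector run
var first = await CountNodesAsync("Ticket");      // hypothetical: count nodes of a given type

await RunConnectorAsync();                        // second run over unchanged source data
var second = await CountNodesAsync("Ticket");

if (first != second)
    throw new Exception($"Ingestion is not idempotent: {first} -> {second} Ticket nodes");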
Connector packaging
- Local dev: dotnet run.
- CI / scheduled job: package as a self-contained dotnet publish or a small Docker image.
- Inside Kubernetes: a CronJob or a long-running Deployment with a sidecar.
- From within the workspace: as a Scheduled Task. For light, periodic ingestion, this avoids deploying a separate service.
Common pitfalls
- Unstable keys cause duplicate nodes on every run. The single most common ingestion bug.
- Missing edges make the graph unusable for navigation, faceting, and graph-scoped search.
- Ingesting unstructured text into one giant property — split into appropriate fields so search and embeddings can do their job.
- No ACL ingestion — every user sees every record. Set RestrictAccessTo* from day one.
- Unbounded commits — calling CommitPendingAsync() once at the end of a million-row run will use too much memory. Commit in batches.
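For the first pitfall, a minimal illustration of unstable vs. stable keys, reusing the Ticket entity from the example above:
// Bad: a fresh key every run, so every run creates new duplicate nodes.
graph.TryAdd(new Ticket { Id = Guid.NewGuid().ToString(), Subject = row.Subject });

// Good: the source system's own primary key, stable across runs, so re-ingestion hits the same node.
graph.TryAdd(new Ticket { Id = row.TicketId, Subject = row.Subject });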
Next steps
- Walk a complete tutorial: Build your first enterprise AI app.
- Design your schema first: Schema Design.
- Operationalize ingestion: Ingestion Pipelines and Pipeline Orchestration.
- Implement a custom connector with the SDK: Workspace Customization → Data Connector.
- Reference: Token scopes for the right API token, Schema Reference for the schema attributes.