
Idempotency

Idempotent ingestion means running the connector twice with the same input is a no-op for the graph. Without it, every retry creates duplicates, every crash leaves orphan edges, and every backfill drifts.

Two of the rules below carry most of the weight: use stable keys, and choose between AddOrUpdate and TryAdd deliberately.

Rule 1: stable keys from the source

The [Key] property must be deterministic — same source record, same key, every time. Use real source identifiers:

// GOOD — derived from the source's primary key.
var caseNode = graph.AddOrUpdate(new SupportCase
{
    Id      = sourceCase.ReferenceNumber,
    Summary = sourceCase.Summary,
});

Anti-patterns to avoid:

// BAD: random GUID; new key every run, duplicates forever.
Id = Guid.NewGuid().ToString();

// BAD: local clock; the value differs on every run, and clock skew between connectors makes it worse.
Id = DateTime.UtcNow.Ticks.ToString();

// BAD: local counter or auto-increment; the key depends on insertion order, not on the record.
Id = $"case-{++counter}";

Rule 2: hash if you have to

When the source doesn't expose a stable identifier, derive one from the immutable fields:

static string KeyFor(SupportCaseSource src) =>
    HashUtils.ComputeMD5($"{src.CustomerEmail}|{src.OpenedAt:o}|{src.Subject}");

Two rules for the hash:

  1. Only immutable fields. Anything that can change between runs (status, last-updated, content) must stay out of the hash, or re-runs will create new nodes.
  2. Document the recipe. Comment which fields the hash uses; future-you will need to recompute it during a migration.
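
The two hash rules can be sketched with nothing but the .NET base class library. This is a minimal illustration, not the library's method: it uses SHA-256 from System.Security.Cryptography in place of HashUtils.ComputeMD5, the StableKey helper is hypothetical, and the normalization steps (trimming, lower-casing the email, converting the timestamp to UTC) are assumptions about what counts as "the same record".

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Recipe (hypothetical): SHA-256 over "email|openedAt|subject".
//   email    -> trimmed, lower-cased
//   openedAt -> converted to UTC, round-trip ("o") format
//   subject  -> trimmed
// Changing any part of this recipe changes every key, so treat it like a schema.
static string StableKey(string customerEmail, DateTimeOffset openedAt, string subject)
{
    var canonical = string.Join("|",
        customerEmail.Trim().ToLowerInvariant(),
        openedAt.ToUniversalTime().ToString("o"),
        subject.Trim());

    var digest = SHA256.HashData(Encoding.UTF8.GetBytes(canonical));
    return Convert.ToHexString(digest);
}

var opened = new DateTimeOffset(2024, 3, 1, 9, 30, 0, TimeSpan.Zero);

// Same record, formatted differently by two runs: padded email, different UTC offset.
var first  = StableKey("Ana@Example.com ", opened, "Printer offline");
var second = StableKey("ana@example.com", opened.ToOffset(TimeSpan.FromHours(2)), "Printer offline");

Console.WriteLine(first == second); // True
```

Normalizing before hashing matters as much as the choice of fields: a source that returns the same timestamp with a different offset, or the same email with different casing, would otherwise produce a new key.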

Rule 3: pick the right upsert

Method       Use when
TryAdd       Reference data that doesn't change once seeded (manufacturers, statuses, types).
AddOrUpdate  Source data that mutates (cases, articles, tickets, profiles).

AddOrUpdate reads the existing node and writes the incoming values; TryAdd writes only when the key is absent. Picking TryAdd for mutable data means later changes in the source are silently ignored. Picking AddOrUpdate for everything is safe, but each call pays for the extra read.
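
The difference is easiest to see on a toy model. The sketch below uses a plain Dictionary as a stand-in for the graph; TryAddModel and AddOrUpdateModel are hypothetical names that only imitate the semantics described above and are not the library's methods.

```csharp
using System;
using System.Collections.Generic;

// Toy stand-in for the graph: key -> summary. Not the real API.
var store = new Dictionary<string, string>();

// Writes only when the key is absent; otherwise returns false and keeps the old value.
bool TryAddModel(string key, string value) => store.TryAdd(key, value);

// Always converges the stored value to the latest source value.
void AddOrUpdateModel(string key, string value) => store[key] = value;

TryAddModel("CASE-1001", "Printer offline");
TryAddModel("CASE-1001", "Printer offline (resolved)");   // ignored: key already exists
Console.WriteLine(store["CASE-1001"]);                     // Printer offline

AddOrUpdateModel("CASE-1001", "Printer offline (resolved)");
Console.WriteLine(store["CASE-1001"]);                     // Printer offline (resolved)
```

Both calls are idempotent in the sense that repeating them adds no nodes; only the second one keeps the graph in step with a source that changes.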

Rule 4: derived keys for child entities

Conversation messages, line items, comments — anything that's "n per parent" — needs deterministic child keys.

var i = 0;
foreach (var msg in src.Messages)
{
    var key = $"{src.ReferenceNumber}#{i++}";   // index-based — order must be stable
    graph.AddOrUpdate(new SupportCaseMessage
    {
        Id   = key,
        Body = msg.Body,
        Time = msg.Time,
    });
}

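If the source can reorder messages between runs, replace the index with a content-derived suffix such as HashUtils.ComputeMD5($"{msg.Time:o}|{msg.Body}"). Format the timestamp explicitly and separate the fields with a delimiter: plain concatenation of msg.Body + msg.Time formats the time with the current culture, so the same message can hash differently on another machine. This only works if the hashed fields never change after the message is created.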
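
A minimal sketch of order-independent child keys, assuming message time and body are immutable. MessageKey is a hypothetical helper, and SHA-256 from the base class library stands in for HashUtils.ComputeMD5.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: the suffix depends only on the message itself, never on its position.
// Recipe: SHA-256 over "time (UTC, round-trip format)|body".
static string MessageKey(string parentKey, DateTimeOffset time, string body)
{
    var canonical = time.ToUniversalTime().ToString("o") + "|" + body;
    var digest = SHA256.HashData(Encoding.UTF8.GetBytes(canonical));
    return parentKey + "#" + Convert.ToHexString(digest);
}

var start = new DateTimeOffset(2024, 3, 1, 9, 30, 0, TimeSpan.Zero);
var messages = new[]
{
    (Time: start,               Body: "Printer is offline"),
    (Time: start.AddMinutes(5), Body: "Any update?"),
};

// The same messages, delivered in two different orders.
var oldestFirst = messages.OrderBy(m => m.Time)
                          .Select(m => MessageKey("CASE-1001", m.Time, m.Body)).ToHashSet();
var newestFirst = messages.OrderByDescending(m => m.Time)
                          .Select(m => MessageKey("CASE-1001", m.Time, m.Body)).ToHashSet();

Console.WriteLine(oldestFirst.SetEquals(newestFirst)); // True
```

One trade-off to keep in mind: two messages with an identical time and body collapse into a single key. If the source can produce such pairs, add another immutable field (for example the author) to the recipe.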

Rule 5: ACLs are part of the record

When you re-run the connector, ACLs need to converge to whatever the source currently says. The connector should:

  1. Compute the intended set of restrictions from the source record.
  2. Apply them with RestrictAccessToTeam / RestrictAccessToUser — these are themselves idempotent (re-applying does nothing).
  3. Remove any restrictions that no longer apply if you want strict convergence (use Unrestrict* calls; see Access control).

Skipping step 3 means revoked permissions linger. For most sources that's a security bug, not a feature.
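
The three steps amount to a set difference between what the source says and what is currently applied. The sketch below shows only that bookkeeping with plain sets; the team identifiers are made up, and how the connector learns the currently applied restrictions is an assumption this page does not cover.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical inputs: restrictions the source asks for vs. restrictions already on the node.
var intended = new HashSet<string> { "team-support", "team-billing" };
var current  = new HashSet<string> { "team-support", "team-legal" };

// Step 2: restrictions still missing. Re-applying existing ones is harmless, so this is an optimization.
var toApply = intended.Except(current).OrderBy(t => t, StringComparer.Ordinal).ToList();

// Step 3: restrictions the source no longer lists. Skipping these leaves stale restrictions in place.
var toRemove = current.Except(intended).OrderBy(t => t, StringComparer.Ordinal).ToList();

Console.WriteLine(string.Join(",", toApply));   // team-billing
Console.WriteLine(string.Join(",", toRemove));  // team-legal
```

In a real connector, each entry in toApply would go to the matching RestrictAccessTo* call and each entry in toRemove to the matching Unrestrict* call.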

Validating idempotency

A simple test: run the connector twice on the same input, then check that node counts didn't change.

await RunConnectorAsync();   // first run populates the graph
var before = await graph.QueryAsync(q => q.StartAt(nameof(SupportCase)).EmitCount("C"));
await RunConnectorAsync();   // second run on the same input must change nothing
var after  = await graph.QueryAsync(q => q.StartAt(nameof(SupportCase)).EmitCount("C"));

Assert.AreEqual(before.GetEmittedCount("C"), after.GetEmittedCount("C"));

Wire this into your CI alongside any integration tests.
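
Equal counts do not prove that node contents stayed the same. A stricter check compares a snapshot of keys and values between runs. The sketch below shows the idea against an in-memory dictionary; the toy graph, the source records, and RunConnector are all hypothetical stand-ins, not the real graph API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy graph: key -> serialized properties. A stand-in, not the real graph API.
var graph = new Dictionary<string, string>();

var source = new[]
{
    (ReferenceNumber: "CASE-1001", Summary: "Printer offline"),
    (ReferenceNumber: "CASE-1002", Summary: "Login loop"),
};

// Hypothetical connector: upserts every source record under its stable key.
void RunConnector()
{
    foreach (var record in source)
        graph[record.ReferenceNumber] = record.Summary;
}

RunConnector();
var snapshot = new Dictionary<string, string>(graph);   // state after the first run

RunConnector();                                          // second run, same input

// Compare keys and values, not just counts.
var unchanged = graph.Count == snapshot.Count
    && graph.All(kv => snapshot.TryGetValue(kv.Key, out var old) && old == kv.Value);

Console.WriteLine(unchanged); // True
```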

© 2026 Curiosity. All rights reserved.