# Full-stack RAG app
End-to-end build of a permission-aware retrieval-augmented chat app on Curiosity Workspace: ingest, model, retrieve, generate, cite, audit. Every layer is wired together — graph schema, ACL connector, vector + hybrid retrieval, custom endpoint, audit log, and a Tesserae UI.
If you've done Build your first enterprise AI app, this is the next rung up — same domain (technical support knowledge), more depth on the parts that matter in production.
Estimated time: 2–3 hours, end to end.
## What you'll build
The pieces:
| Layer | What it does | Key APIs |
|---|---|---|
| Connector | Ingest articles with ACLs, idempotent re-runs | AddOrUpdate, RestrictAccessToTeam |
| Schema | KbArticle, Author, Team | [Node], [Key], [Property] |
| Retrieval | Hybrid search constrained to caller's permissions | CreateSearchAsUserAsync, SearchRequest |
| Endpoint | Retrieve → generate → cite → audit | ChatAI.CompleteAsync, Ok() |
| Audit | One node per chat call, linked to cited articles | Graph.AddOrUpdate, Graph.Link |
| Front-end | Chat + citations panel in Tesserae | Endpoints.CallAsync, Defer |
## Prerequisites
- A running workspace with AI Settings configured for an LLM and embedding provider.
- A connector token (`CURIOSITY_TOKEN`) — see Custom connector from scratch.
- The h5 compiler and Curiosity CLI installed for the front-end (see Custom front-end).
- An LLM credit balance — the chat endpoint will call `ChatAI.CompleteAsync` on each request.
## Step 1 — schema
```csharp
using System;
using Curiosity.Library;

namespace KB.Schema;

[Node]
public class Author
{
    [Key] public string Login { get; set; }
    [Property] public string Name { get; set; }
}

[Node]
public class KbArticle
{
    [Key] public string Id { get; set; }
    [Property] public string Title { get; set; }
    [Property] public string Body { get; set; }
    [Property] public string SourceUrl { get; set; }
    [Timestamp] public DateTimeOffset Updated { get; set; }
}

[Node]
public class ChatAuditEntry
{
    [Key] public string Id { get; set; }
    [Property] public string Question { get; set; }
    [Property] public string Answer { get; set; }
    [Property] public string UserLogin { get; set; }
    [Timestamp] public DateTimeOffset Asked { get; set; }
}

public static class Edges
{
    public const string Wrote = nameof(Wrote);
    public const string WrittenBy = nameof(WrittenBy);
    public const string Cited = nameof(Cited);
    public const string CitedBy = nameof(CitedBy);
}
```
Enable chunked embeddings on `KbArticle.Body` in Settings → AI Settings → Embeddings so long articles stay retrievable in pieces.
## Step 2 — connector with ACLs
```csharp
using var graph = Graph.Connect(/* endpoint, token, connectorName */);

await graph.CreateNodeSchemaAsync<Author>();
await graph.CreateNodeSchemaAsync<KbArticle>();
await graph.CreateNodeSchemaAsync<ChatAuditEntry>();
await graph.CreateEdgeSchemaAsync(typeof(Edges));

graph.SetAutoCommitCost(everyNodes: 5_000);

foreach (var src in await FetchArticles(since: await ReadCheckpoint()))
{
    var author = await graph.CreateUserAsync(src.AuthorLogin, /* ... */);
    var article = graph.AddOrUpdate(new KbArticle
    {
        Id = src.Id,
        Title = src.Title,
        Body = src.Body,
        SourceUrl = src.Url,
        Updated = src.Updated,
    });
    graph.Link(author, article, Edges.Wrote, Edges.WrittenBy);

    // Mirror ACLs.
    foreach (var teamName in src.AccessTeams)
    {
        var team = await graph.CreateTeamAsync(teamName);
        graph.RestrictAccessToTeam(article, team);
    }
}

await graph.CommitPendingAsync();
await WriteCheckpointAsync(...);
```
For the full checkpoint loop, see Custom connector from scratch. Production deployments persist the cursor outside the process (S3, a workspace node, a database).
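For a local dev run, the checkpoint pair the loop above calls can be as small as a round-tripped timestamp in a file. A minimal sketch — `ReadCheckpointAsync`/`WriteCheckpointAsync` match the names assumed by the connector loop, but the file-based storage here is an assumption; swap in S3, a workspace node, or a database for production:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class Checkpoint
{
    private const string Path = "kb-connector.checkpoint";

    // Returns the last successfully committed cursor, or a safe
    // default (DateTimeOffset.MinValue) on the very first run.
    public static async Task<DateTimeOffset> ReadCheckpointAsync()
    {
        if (!File.Exists(Path)) return DateTimeOffset.MinValue;
        var text = await File.ReadAllTextAsync(Path);
        return DateTimeOffset.Parse(text);
    }

    // Call only after CommitPendingAsync succeeds, so a crashed run
    // re-ingests from the old cursor instead of skipping articles.
    public static Task WriteCheckpointAsync(DateTimeOffset cursor) =>
        File.WriteAllTextAsync(Path, cursor.ToString("o"));
}
```

Writing the cursor strictly after the commit keeps re-runs idempotent: the worst case after a crash is re-ingesting a window of articles, which `AddOrUpdate` absorbs.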
## Step 3 — RAG endpoint
```csharp
// Endpoint path: kb-chat
// Mode: Sync (switch to Pooling if completions take > 30s)
// Read Only: false // writes ChatAuditEntry

public record ChatRequest(string Question, int K = 5);
public record Citation(string ArticleId, string Title, string SourceUrl, double Score);
public record ChatReply(string Answer, Citation[] Citations, string AuditId);

var req = Body.FromJson<ChatRequest>();
if (string.IsNullOrWhiteSpace(req.Question))
    return BadRequest("Question is required.");

// 1. Retrieve — hybrid, ACL-filtered for caller.
await RelayStatusAsync("Searching the knowledge base...");
var search = SearchRequest.For(req.Question);
search.BeforeTypesFacet = new HashSet<string> { "KbArticle" };
search.HybridSearch = true;

var hits = (await Graph.CreateSearchAsUserAsync(search, CurrentUser, CancellationToken))
    .Take(req.K)
    .EmitWithScores()
    .ToList();

if (hits.Count == 0)
    return Ok(new ChatReply("I couldn't find anything in the knowledge base on that.", Array.Empty<Citation>(), null!));

// 2. Build the closed-set prompt.
var sb = new StringBuilder();
sb.AppendLine("Answer the user's question using ONLY the articles below.");
sb.AppendLine("Cite articles inline as [KB-####]. If the articles don't answer the question, say so.");
sb.AppendLine();
foreach (var (article, score) in hits)
{
    sb.AppendLine($"[{article["Id"].AsString()}] {article["Title"].AsString()} (score {score:F2})");
    sb.AppendLine(article["Body"].AsString());
    sb.AppendLine("---");
}
sb.AppendLine();
sb.AppendLine($"User question: {req.Question}");

// 3. Generate.
await RelayStatusAsync("Composing answer...");
var answer = await ChatAI.CompleteAsync(sb.ToString(), CancellationToken);

// 4. Audit — link the audit node to every citation.
var auditId = Guid.NewGuid().ToString("n");
var audit = Graph.AddOrUpdate(new ChatAuditEntry
{
    Id = auditId,
    Question = req.Question,
    Answer = answer,
    UserLogin = (await Graph.GetUserByUidAsync(CurrentUser))?.Login ?? "unknown",
    Asked = DateTimeOffset.UtcNow,
});
foreach (var (article, _) in hits)
    Graph.Link(audit, article, Edges.Cited, Edges.CitedBy);
await Graph.CommitPendingAsync();

return Ok(new ChatReply(
    Answer: answer,
    Citations: hits.Select(h => new Citation(
        ArticleId: h.Node["Id"].AsString(),
        Title: h.Node["Title"].AsString(),
        SourceUrl: h.Node["SourceUrl"].AsString(),
        Score: h.Score)).ToArray(),
    AuditId: auditId));
```
What's load-bearing in this endpoint:

- `CreateSearchAsUserAsync(_, CurrentUser)` — no retrieval-time leaks, ever.
- Closed-set prompt — the model can't cite an article that wasn't retrieved.
- Audit node linked to citations — every answer is reproducible from graph state alone.
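The closed-set prompt is also the easiest part to regression-test if you split it out as a pure function. A hypothetical refactor of step 2 of the endpoint — the `(Id, Title, Body, Score)` tuple shape is an assumption, but the prompt text is exactly the one used above:

```csharp
using System.Collections.Generic;
using System.Text;

public static class PromptBuilder
{
    // Pure function: retrieved hits in, prompt string out.
    // Unit-testable without a workspace or an LLM call.
    public static string BuildClosedSetPrompt(
        string question,
        IEnumerable<(string Id, string Title, string Body, double Score)> hits)
    {
        var sb = new StringBuilder();
        sb.AppendLine("Answer the user's question using ONLY the articles below.");
        sb.AppendLine("Cite articles inline as [KB-####]. If the articles don't answer the question, say so.");
        sb.AppendLine();
        foreach (var h in hits)
        {
            sb.AppendLine($"[{h.Id}] {h.Title} (score {h.Score:F2})");
            sb.AppendLine(h.Body);
            sb.AppendLine("---");
        }
        sb.AppendLine();
        sb.AppendLine($"User question: {question}");
        return sb.ToString();
    }
}
```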
## Step 4 — Tesserae UI
Inside the front-end project (downloaded from Management → Interfaces → Download template), wire a chat panel that calls the endpoint. Tesserae provides the layout primitives and Defer for async content.
```csharp
public class KbChatView : IComponent
{
    private readonly Observable<string> _question = new("");
    private readonly Observable<ChatReply?> _reply = new(null);

    public dom.HTMLElement Render() =>
        HStack().Children(
            // Left column: input + answer
            VStack().Grow(2).Children(
                TextBox().Placeholder("Ask the knowledge base...")
                    .Bind(_question),
                Button("Ask").OnClick(async () =>
                {
                    var reply = await Mosaik.API.Endpoints.CallAsync<ChatReply>(
                        "kb-chat", new { Question = _question.Value });
                    _reply.Value = reply;
                }),
                Defer(_reply, r => r is null
                    ? TextBlock("(no answer yet)")
                    : TextBlock(r.Answer).MaxWidth(64.rem()))
            ),
            // Right column: citations panel
            VStack().Grow(1).Children(
                TextBlock("Sources").SemiBold(),
                Defer(_reply, r => r?.Citations is null or { Length: 0 }
                    ? TextBlock("(none)")
                    : VStack().Children(r.Citations.Select(c =>
                        Link(c.SourceUrl).Children(
                            TextBlock($"[{c.ArticleId}] {c.Title}"),
                            TextBlock($"score {c.Score:F2}").Small()))))
            )
        ).Render();
}
```
Deploy with:

```bash
curiosity-cli upload-front-end -s https://workspace.example.com -t $CURIOSITY_TOKEN -p ./bin/Debug/netstandard2.0/h5
```

Under the hood, the CLI calls `UploadNewApplicationInterfaceAsync(path)` against the workspace.
## Step 5 — eval and monitor
A small set of fixed Q&A pairs makes prompt and retrieval changes safe to ship.
```csharp
// Endpoint: kb-chat-eval
var questions = new[]
{
    new { Q = "How do I reset a Tier-2 password?", ExpectedCites = new[] { "KB-0042" } },
    new { Q = "MacBook screen flickers after sleep, fixes?", ExpectedCites = new[] { "KB-0157" } },
    new { Q = "What is our SLA for hardware swaps?", ExpectedCites = new[] { "KB-0008" } },
};

var pass = 0;
foreach (var q in questions)
{
    var reply = await RunEndpointAsync<ChatReply>("kb-chat", new { Question = q.Q });
    if (reply.Citations.Any(c => q.ExpectedCites.Contains(c.ArticleId)))
        pass++;
}

return new { pass, total = questions.Length };
```
Wire this through the evaluation framework so a regression on retrieval triggers an alert.
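As a sketch of what that wiring can look like, a gate run from CI or a scheduled job — `EvalResult` mirrors the anonymous `{ pass, total }` object the eval endpoint returns, and `RunEndpointAsync`/`AlertAsync` are assumed helpers, not workspace APIs:

```csharp
using System.Threading.Tasks;

public record EvalResult(int Pass, int Total);

public static class EvalGate
{
    // Fails loudly when any fixed question stops citing its expected article.
    public static async Task RunAsync()
    {
        var result = await RunEndpointAsync<EvalResult>("kb-chat-eval", new { });
        if (result.Pass < result.Total)
            await AlertAsync(
                $"kb-chat retrieval regression: {result.Pass}/{result.Total} expected citations found.");
    }
}
```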
Operational metrics to watch (see Monitoring):
- Endpoint latency p50 / p95 (retrieval + LLM)
- Empty-citation rate (model didn't find anything to cite)
- `ChatAI` cost per call
- Audit-node growth rate
## Security checklist
- `CreateSearchAsUserAsync` is used; no code path calls `CreateSearchAsync` from a user endpoint.
- Connector restricts every article that the source restricted.
- `CurrentUser == default` (endpoint-token caller) returns `Forbid()` from `kb-chat`.
- Prompt template forbids the model from inventing article IDs (closed-set prompt).
- Audit nodes are read-only for non-admins.
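The `CurrentUser == default` item translates to a one-line guard at the top of the endpoint body — a sketch reusing the `CurrentUser` and `Forbid()` names from the endpoint above:

```csharp
// Reject callers with no user identity (e.g. raw endpoint-token calls):
// without a user there is nothing to scope retrieval to, and falling
// through would defeat the permission-aware search below.
if (CurrentUser == default)
    return Forbid();
```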
## Cross-links
- Permission-aware search — the access-control deep dive
- Custom connector from scratch
- Custom endpoint from scratch
- Production deployment checklist
- AI tools — turn the endpoint into a chat tool
- Evaluation framework