# Full-stack RAG app
End-to-end build of a permission-aware retrieval-augmented chat app on Curiosity Workspace: ingest, model, retrieve, generate, cite, audit. Every layer is wired together — graph schema, ACL connector, vector + hybrid retrieval, custom endpoint, audit log, and a Tesserae UI.
If you've done Build your first enterprise AI app, this is the next rung up — same domain (technical support knowledge), more depth on the parts that matter in production.
Estimated time: 2–3 hours, end to end.
## What you'll build
The pieces:
| Layer | What it does | Key APIs |
|---|---|---|
| Connector | Ingest articles with ACLs, idempotent re-runs | AddOrUpdate, RestrictAccessToTeam |
| Schema | KbArticle, Author, Team | [Node], [Key], [Property] |
| Retrieval | Hybrid search constrained to caller's permissions | CreateSearchAsUserAsync, SearchRequest |
| Endpoint | Retrieve → generate → cite → audit | ChatAI.CompleteAsync, Ok() |
| Audit | One node per chat call, linked to cited articles | Graph.AddOrUpdate, Graph.Link |
| Front-end | Chat + citations panel in Tesserae | Endpoints.CallAsync, Defer |
## Prerequisites
- A running workspace with AI Settings configured for an LLM and embedding provider.
- A connector token (`CURIOSITY_TOKEN`) — see Custom connector from scratch.
- The h5 compiler and Curiosity CLI installed for the front-end (see Custom front-end).
- An LLM credit balance — the chat endpoint will call `ChatAI.CompleteAsync` on each request.
## Step 1 — schema
```csharp
using System;
using Curiosity.Library;

namespace KB.Schema;

[Node]
public class Author
{
    [Key] public string Login { get; set; }
    [Property] public string Name { get; set; }
}

[Node]
public class KbArticle
{
    [Key] public string Id { get; set; }
    [Property] public string Title { get; set; }
    [Property] public string Body { get; set; }
    [Property] public string SourceUrl { get; set; }
    [Timestamp] public DateTimeOffset Updated { get; set; }
}

[Node]
public class ChatAuditEntry
{
    [Key] public string Id { get; set; }
    [Property] public string Question { get; set; }
    [Property] public string Answer { get; set; }
    [Property] public string UserLogin { get; set; }
    [Timestamp] public DateTimeOffset Asked { get; set; }
}

public static class Edges
{
    public const string Wrote = nameof(Wrote);
    public const string WrittenBy = nameof(WrittenBy);
    public const string Cited = nameof(Cited);
    public const string CitedBy = nameof(CitedBy);
}
```
Enable chunked embeddings on `KbArticle.Body` in Settings → AI Settings → Embeddings so long articles stay retrievable in pieces.
## Step 2 — connector with ACLs
```csharp
using var graph = Graph.Connect(/* endpoint, token, connectorName */);

await graph.CreateNodeSchemaAsync<Author>();
await graph.CreateNodeSchemaAsync<KbArticle>();
await graph.CreateNodeSchemaAsync<ChatAuditEntry>();
await graph.CreateEdgeSchemaAsync(typeof(Edges));

graph.SetAutoCommitCost(everyNodes: 5_000);

foreach (var src in await FetchArticles(since: await ReadCheckpoint()))
{
    var author = await graph.CreateUserAsync(src.AuthorLogin, /* ... */);
    var article = graph.AddOrUpdate(new KbArticle
    {
        Id = src.Id,
        Title = src.Title,
        Body = src.Body,
        SourceUrl = src.Url,
        Updated = src.Updated,
    });
    graph.Link(author, article, Edges.Wrote, Edges.WrittenBy);

    // Mirror ACLs.
    foreach (var teamName in src.AccessTeams)
    {
        var team = await graph.CreateTeamAsync(teamName);
        graph.RestrictAccessToTeam(article, team);
    }
}

await graph.CommitPendingAsync();
await WriteCheckpointAsync(...);
```
For the full checkpoint loop, see Custom connector from scratch. Production deployments persist the cursor outside the process (S3, a workspace node, a database).
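For a local dev run, the checkpoint pair the loop above calls can be as small as a round-tripped timestamp in a file. A minimal sketch — `ReadCheckpointAsync`/`WriteCheckpointAsync` match the names assumed by the connector loop, but the file-based storage here is an assumption; swap in S3, a workspace node, or a database for production:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class Checkpoint
{
    private const string Path = "kb-connector.checkpoint";

    // Returns the last successfully committed cursor, or a safe
    // default (DateTimeOffset.MinValue) on the very first run.
    public static async Task<DateTimeOffset> ReadCheckpointAsync()
    {
        if (!File.Exists(Path)) return DateTimeOffset.MinValue;
        var text = await File.ReadAllTextAsync(Path);
        return DateTimeOffset.Parse(text);
    }

    // Call only after CommitPendingAsync succeeds, so a crashed run
    // re-ingests from the old cursor instead of skipping articles.
    public static Task WriteCheckpointAsync(DateTimeOffset cursor) =>
        File.WriteAllTextAsync(Path, cursor.ToString("o"));
}
```

Writing the cursor strictly after the commit keeps re-runs idempotent: the worst case after a crash is re-ingesting a window of articles, which `AddOrUpdate` absorbs.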
## Step 3 — RAG endpoint
```csharp
// Endpoint path: kb-chat
// Mode: Sync (switch to Pooling if completions take > 30s)
// Read Only: false // writes ChatAuditEntry

public record ChatRequest(string Question, int K = 5);
public record Citation(string ArticleId, string Title, string SourceUrl, double Score);
public record ChatReply(string Answer, Citation[] Citations, string AuditId);

var req = Body.FromJson<ChatRequest>();
if (string.IsNullOrWhiteSpace(req.Question))
    return BadRequest("Question is required.");

// 1. Retrieve — hybrid, ACL-filtered for caller.
await RelayStatusAsync("Searching the knowledge base...");
var search = SearchRequest.For(req.Question);
search.BeforeTypesFacet = new HashSet<string> { "KbArticle" };
search.HybridSearch = true;

var hits = (await Graph.CreateSearchAsUserAsync(search, CurrentUser, CancellationToken))
    .Take(req.K)
    .EmitWithScores()
    .ToList();

if (hits.Count == 0)
    return Ok(new ChatReply("I couldn't find anything in the knowledge base on that.", Array.Empty<Citation>(), null!));

// 2. Build the closed-set prompt.
var sb = new StringBuilder();
sb.AppendLine("Answer the user's question using ONLY the articles below.");
sb.AppendLine("Cite articles inline as [KB-####]. If the articles don't answer the question, say so.");
sb.AppendLine();
foreach (var (article, score) in hits)
{
    sb.AppendLine($"[{article["Id"].AsString()}] {article["Title"].AsString()} (score {score:F2})");
    sb.AppendLine(article["Body"].AsString());
    sb.AppendLine("---");
}
sb.AppendLine();
sb.AppendLine($"User question: {req.Question}");

// 3. Generate.
await RelayStatusAsync("Composing answer...");
var answer = await ChatAI.CompleteAsync(sb.ToString(), CancellationToken);

// 4. Audit — link the audit node to every citation.
var auditId = Guid.NewGuid().ToString("n");
var audit = Graph.AddOrUpdate(new ChatAuditEntry
{
    Id = auditId,
    Question = req.Question,
    Answer = answer,
    UserLogin = (await Graph.GetUserByUidAsync(CurrentUser))?.Login ?? "unknown",
    Asked = DateTimeOffset.UtcNow,
});
foreach (var (article, _) in hits)
    Graph.Link(audit, article, Edges.Cited, Edges.CitedBy);
await Graph.CommitPendingAsync();

return Ok(new ChatReply(
    Answer: answer,
    Citations: hits.Select(h => new Citation(
        ArticleId: h.Node["Id"].AsString(),
        Title: h.Node["Title"].AsString(),
        SourceUrl: h.Node["SourceUrl"].AsString(),
        Score: h.Score)).ToArray(),
    AuditId: auditId));
```
What's load-bearing in this endpoint:

- `CreateSearchAsUserAsync(_, CurrentUser)` — no retrieval-time leaks, ever.
- Closed-set prompt — the model can't cite an article that wasn't retrieved.
- Audit node linked to citations — every answer is reproducible from graph state alone.
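The closed-set prompt is also the easiest part to regression-test if you split it out as a pure function. A hypothetical refactor of step 2 of the endpoint — the `(Id, Title, Body, Score)` tuple shape is an assumption, but the prompt text is exactly the one used above:

```csharp
using System.Collections.Generic;
using System.Text;

public static class PromptBuilder
{
    // Pure function: retrieved hits in, prompt string out.
    // Unit-testable without a workspace or an LLM call.
    public static string BuildClosedSetPrompt(
        string question,
        IEnumerable<(string Id, string Title, string Body, double Score)> hits)
    {
        var sb = new StringBuilder();
        sb.AppendLine("Answer the user's question using ONLY the articles below.");
        sb.AppendLine("Cite articles inline as [KB-####]. If the articles don't answer the question, say so.");
        sb.AppendLine();
        foreach (var h in hits)
        {
            sb.AppendLine($"[{h.Id}] {h.Title} (score {h.Score:F2})");
            sb.AppendLine(h.Body);
            sb.AppendLine("---");
        }
        sb.AppendLine();
        sb.AppendLine($"User question: {question}");
        return sb.ToString();
    }
}
```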
## Step 4 — Tesserae UI
Inside the front-end project (downloaded from Management → Interfaces → Download template), wire a chat panel that calls the endpoint. Tesserae provides the layout primitives and Defer for async content.
```csharp
public class KbChatView : IComponent
{
    private readonly Observable<string> _question = new("");
    private readonly Observable<ChatReply?> _reply = new(null);

    public dom.HTMLElement Render() =>
        HStack().Children(
            // Left column: input + answer
            VStack().Grow(2).Children(
                TextBox().Placeholder("Ask the knowledge base...")
                    .Bind(_question),
                Button("Ask").OnClick(async () =>
                {
                    var reply = await Mosaik.API.Endpoints.CallAsync<ChatReply>(
                        "kb-chat", new { Question = _question.Value });
                    _reply.Value = reply;
                }),
                Defer(_reply, r => r is null
                    ? TextBlock("(no answer yet)")
                    : TextBlock(r.Answer).MaxWidth(64.rem()))
            ),
            // Right column: citations panel
            VStack().Grow(1).Children(
                TextBlock("Sources").SemiBold(),
                Defer(_reply, r => r?.Citations is null or { Length: 0 }
                    ? TextBlock("(none)")
                    : VStack().Children(r.Citations.Select(c =>
                        Link(c.SourceUrl).Children(
                            TextBlock($"[{c.ArticleId}] {c.Title}"),
                            TextBlock($"score {c.Score:F2}").Small()))))
            )
        ).Render();
}
```
Deploy with:

```bash
curiosity-cli upload-front-end -s https://workspace.example.com -t $CURIOSITY_TOKEN -p ./bin/Debug/netstandard2.0/h5
```

Under the hood, the CLI calls `UploadNewApplicationInterfaceAsync(path)` against the workspace.
## Step 5 — eval and monitor
A small set of fixed Q&A pairs makes prompt and retrieval changes safe to ship.
```csharp
// Endpoint: kb-chat-eval
var questions = new[]
{
    new { Q = "How do I reset a Tier-2 password?", ExpectedCites = new[] { "KB-0042" } },
    new { Q = "MacBook screen flickers after sleep, fixes?", ExpectedCites = new[] { "KB-0157" } },
    new { Q = "What is our SLA for hardware swaps?", ExpectedCites = new[] { "KB-0008" } },
};

var pass = 0;
foreach (var q in questions)
{
    var reply = await RunEndpointAsync<ChatReply>("kb-chat", new { Question = q.Q });
    if (reply.Citations.Any(c => q.ExpectedCites.Contains(c.ArticleId)))
        pass++;
}

return new { pass, total = questions.Length };
```
Wire this through the evaluation framework so a regression on retrieval triggers an alert.
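As a sketch of what that wiring can look like, a gate run from CI or a scheduled job — `EvalResult` mirrors the anonymous `{ pass, total }` object the eval endpoint returns, and `RunEndpointAsync`/`AlertAsync` are assumed helpers, not workspace APIs:

```csharp
using System.Threading.Tasks;

public record EvalResult(int Pass, int Total);

public static class EvalGate
{
    // Fails loudly when any fixed question stops citing its expected article.
    public static async Task RunAsync()
    {
        var result = await RunEndpointAsync<EvalResult>("kb-chat-eval", new { });
        if (result.Pass < result.Total)
            await AlertAsync(
                $"kb-chat retrieval regression: {result.Pass}/{result.Total} expected citations found.");
    }
}
```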
Operational metrics to watch (see Monitoring):
- Endpoint latency p50 / p95 (retrieval + LLM)
- Empty-citation rate (model didn't find anything to cite)
- `ChatAI` cost per call
- Audit-node growth rate
## Security checklist
- `CreateSearchAsUserAsync` is used; no code path calls `CreateSearchAsync` from a user endpoint.
- Connector restricts every article that the source restricted.
- `CurrentUser == default` (endpoint-token caller) returns `Forbid()` from `kb-chat`.
- Prompt template forbids the model from inventing article IDs (closed-set prompt).
- Audit nodes are read-only for non-admins.
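The `CurrentUser == default` item translates to a one-line guard at the top of the endpoint body — a sketch reusing the `CurrentUser` and `Forbid()` names from the endpoint above:

```csharp
// Reject callers with no user identity (e.g. raw endpoint-token calls):
// without a user there is nothing to scope retrieval to, and falling
// through would defeat the permission-aware search below.
if (CurrentUser == default)
    return Forbid();
```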
## Cross-links
- Permission-aware search — the access-control deep dive
- Custom connector from scratch
- Custom endpoint from scratch
- Production deployment checklist
- AI tools — turn the endpoint into a chat tool
- Evaluation framework