LLM Agents and Integration
An LLM agent in Curiosity Workspace is the tool-use loop wrapped around the workspace's graph and search APIs. The LLM reasons about which tool to call next; deterministic endpoints do the work; the agent loops until it can answer.
This page is about the architecture. For the prompt templates that drive each step, see Prompting patterns → Tool use.
Architecture
Each component is a separate concern with a well-defined contract:
| Component | Responsibility | Stateless? |
|---|---|---|
| User turn | One question or instruction from the user. | n/a |
| LLM | Picks the next tool call or composes the final answer. | yes |
| Tool router | Validates the model's tool request, calls the right endpoint, returns JSON. | yes |
| Tool endpoints | Deterministic graph/search/fetch operations. Permission-aware. | mostly |
| Final answer | LLM-written summary with citation UIDs and links back to the source nodes. | n/a |
The edge that matters most is the loop: tool result → LLM → next tool call. Cap it (8 calls is a reasonable default) so a confused model can't loop forever.
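The loop can be sketched in a few lines. This is a language-agnostic illustration, not Curiosity code; `call_model` and `call_tool` are hypothetical stand-ins for the LLM provider and the tool router.

```python
# Minimal agent loop with an iteration cap. A confused model that keeps
# requesting tools runs out of budget instead of looping forever.
MAX_CALLS = 8

def run_agent(user_turn, call_model, call_tool):
    messages = [{"role": "user", "content": user_turn}]
    for _ in range(MAX_CALLS):
        step = call_model(messages)            # model picks the next action
        if step["type"] == "final_answer":     # loop exit: answer composed
            return step["content"]
        result = call_tool(step["tool"], step["args"])  # deterministic work
        messages.append({"role": "tool", "name": step["tool"], "content": result})
    return "Tool-call budget exhausted; escalating to a human."
```

The cap is the only piece of control flow the model can't override, which is exactly the point.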
Tool registry
Each tool is a custom endpoint registered with the chat AI. The registration gives the model a name, a description, and a JSON-schema input. Keep tools:
- Small. One tool = one verb. Not "search and summarize"; that's two tools.
- Composable. A tool that returns UIDs is more useful than one that returns prose.
- Idempotent for reads. Calling the same read tool twice with the same args should return the same thing.
- Authenticated. Pass the user's identity through; never grant the agent more permissions than the user has.
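A registration following these rules might look like the sketch below. The exact registration API is product-specific (see AI tools); only the shape — name, description, JSON-schema input — follows the contract described above.

```python
# Hypothetical registration payload for a `search` tool.
# One verb, UID-returning, schema-validated inputs.
search_tool = {
    "name": "search",
    "description": "Full-text search over the graph. Returns UIDs, not prose.",
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "type":  {"type": "string", "description": "Optional node type filter."},
            "k":     {"type": "integer", "default": 5},
        },
        "required": ["query"],
    },
}
```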
Typical tool set for a support assistant:
| Tool | Inputs | Returns |
|---|---|---|
| `search(query, type, k)` | string, optional type, optional k | list of `{uid, title, snippet, score}` |
| `get_node(uid)` | uid | full properties of the node |
| `get_neighbors(uid, edge, k)` | uid, edge type, k | list of neighbor nodes |
| `find_similar(uid, k)` | uid, k | list of nodes similar by embedding |
| `ask_human(question)` | string | terminates the loop, surfaces the question to the UI |
The chat AI ships with these built-in; add domain-specific ones (open_ticket, lookup_invoice, …) as custom endpoints. See AI tools.
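Because tools return UIDs rather than prose, calls chain naturally. A sketch with stub data (the stubs stand in for the real endpoints, which these are not):

```python
# UID-based composition: search returns UIDs, and those UIDs feed
# get_neighbors with no parsing in between.
def search(query, type=None, k=5):
    return [{"uid": "case_42", "title": "Overnight drain", "score": 0.91}]

def get_neighbors(uid, edge, k=1):
    return [{"uid": f"res_{uid}", "type": edge}]

hits = search("battery drain", type="SupportCase", k=5)
resolutions = [n for hit in hits
                 for n in get_neighbors(hit["uid"], edge="Resolution", k=1)]
```

A prose-returning search would force the model to re-extract identifiers from text before the second call, which is exactly the failure mode the "composable" rule avoids.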
Permission handling
The user's identity threads through every tool call. The graph layer's ACL enforcement runs at retrieval time — CreateSearchAsUserAsync and Q().AsUser(uid) apply the user's permission view before results come back. The model never sees data the user can't see, which means it can't accidentally leak it.
If you build a custom tool, mirror this:
return await Graph.CreateSearchAsUserAsync(request, User.Id);
Never use the unscoped variant inside an AI tool.
Error handling
The agent loop must distinguish three error classes:
| Class | Example | Loop response |
|---|---|---|
| Tool error | Endpoint 5xx, validation fail. | Return error JSON; let the model try another approach. |
| Empty result | Search returned nothing. | Return {count: 0}; the model decides whether to retry differently. |
| Hard guardrail | User asked for restricted action. | Short-circuit the loop and return the refusal message directly. |
Don't catch and silently fix tool errors in the router — that hides genuine issues. Surface them to the model as structured errors and let it decide.
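The three classes map to three router behaviors. A minimal sketch, assuming hypothetical `call_endpoint` and `is_restricted` helpers (not Curiosity APIs):

```python
# The router never raises into the loop: tool errors and empty results
# become structured JSON the model can react to, and only the hard
# guardrail short-circuits.
def route(tool_name, args, call_endpoint, is_restricted):
    if is_restricted(tool_name, args):               # hard guardrail
        return {"halt": True, "message": "That action isn't permitted."}
    try:
        result = call_endpoint(tool_name, args)      # deterministic work
    except Exception as exc:                         # tool error: surface it
        return {"error": type(exc).__name__, "detail": str(exc)}
    if not result:                                   # empty result
        return {"count": 0}
    return result
```

The `halt` flag is what distinguishes the guardrail case: the loop returns the message directly instead of handing it back to the model.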
Cost and latency
Every loop iteration is a model call plus a tool call. A 5-step agent is 5 model calls + 5 endpoint calls. Two consequences:
- Pick the model carefully. A haiku-class model for routing and a sonnet-class model for the final answer is a common split.
- Cache aggressively. If the same `search("battery drain", "SupportCase")` runs repeatedly, cache the read; key the cache by the caller's permission scope so filtered results can't leak across users.
- Stream the final answer. The user shouldn't watch a spinner while the model generates the summary.
See LLM configuration for provider setup and Metrics reference for the tool-call observability surface.
Worked example: support assistant
Trigger. A user opens a ticket in the support UI and asks "Why is my MBA-2024 battery draining overnight?"
Step 1. LLM calls search("battery drain MBA-2024", type="SupportCase", k=5).
Step 2. The tool returns 5 cases. LLM picks the most relevant 2 by reading snippets.
Step 3. LLM calls get_neighbors(uid=case_42, edge="Resolution", k=1) for each.
Step 4. The tool returns the resolutions: "firmware update v3.2 fixes overnight drain."
Step 5. LLM composes the answer: "Two recent cases for the MacBook Air 2024 ([case_42], [case_77]) report the same overnight drain. Both were resolved by updating to firmware v3.2."
Step 6. Loop ends. The router renders citations and posts the answer.
Total: 5 model calls, 4 tool calls, ~3 seconds. Well within the budget for a chat experience.
Anti-patterns
- Tools that return prose. Always return data. Prose belongs to the final answer step.
- Tools with side effects in the read path. A `search` that also logs to a CRM is two tools fused; split them.
- Tools that bypass permissions. No exceptions — even admin tools take the user's identity and enforce admin-role checks against it.
- Unbounded loops. Always cap iterations.
- Hidden state. Each tool call should carry everything it needs. If you need state, persist it as graph nodes the model can fetch.
Where to go next
- Prompting patterns — templates for the LLM steps.
- AI tools — registering custom tools.
- LLM configuration — picking and configuring providers.
- Custom endpoints — building the deterministic side.
- Grounded answer evaluation — measuring quality.