Curiosity

Manufacturing knowledge assistant

A blueprint for industrial / engineering knowledge: parts, assemblies, finished products, engineering documents, and the engineers who maintain them. Built for the case where engineers ask questions like "how does this part interact with the cooling assembly?" and answers must reference specific, versioned documents.

The graph

flowchart LR
    Document -->|DescribesPart| Part
    Document -->|RevisedFrom| Document
    Document -->|AuthoredBy| Engineer
    Part -->|PartOf| Assembly
    Assembly -->|InProduct| Product
    Engineer -->|MemberOf| Team
    Document -->|OwnedBy| Team

| Node | Key | Notes |
| --- | --- | --- |
| Document | DocNumber | Versioned via RevisedFrom edge to predecessor |
| Part | PartNumber | Canonical part |
| Assembly | AssemblyNumber | Sub-assemblies modeled recursively |
| Product | Sku | The shippable product |
| Engineer | Email | Mapped from SSO |
| Team | Name | SSO-mapped team for ACL |

What this demonstrates

  • Recursive part-of relationships — assemblies of assemblies of parts.
  • Versioned documents via a chained RevisedFrom edge — no schema duplication.
  • Team-based ACL ingestion — sensitive engineering docs (drawings, specs) restricted to specific engineering teams.
  • Long-document RAG — the heavy text fields are engineering PDFs that get OCR'd and chunked for retrieval.
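
The recursive part-of relationship can be illustrated with a small in-memory sketch. This is not the Curiosity graph API: the `PART_OF` mapping, the `add_part_of` and `ancestors` helpers, and the identifiers (`P-100`, `A-10`, `A-1`) are all hypothetical and only show how a part resolves to every assembly that contains it, directly or transitively.

```python
from collections import defaultdict

# Hypothetical PartOf edges, stored as child -> set of parents.
# In the real graph these would be PartOf edges between Part and
# Assembly nodes (and between nested assemblies).
PART_OF = defaultdict(set)


def add_part_of(child, parent):
    """Record that `child` is a component of `parent`."""
    PART_OF[child].add(parent)


def ancestors(node):
    """Return every assembly containing `node`, directly or transitively."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for parent in PART_OF.get(current, ()):
            if parent not in seen:  # guards against accidental cycles
                seen.add(parent)
                stack.append(parent)
    return seen


# Made-up data: part P-100 sits in sub-assembly A-10, which sits in A-1.
add_part_of("P-100", "A-10")
add_part_of("A-10", "A-1")
add_part_of("P-200", "A-1")

print(sorted(ancestors("P-100")))
```

The `seen` set keeps the walk finite even if source data accidentally contains a loop, which is worth guarding against when the hierarchy is imported from an external system.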

Retrieval

  • Text search on DocNumber, PartNumber, Product.Sku — engineers know exact identifiers.
  • Hybrid search on Document.Body (PDF-extracted text) — for paraphrased questions.
  • Graph-scoped search: "find docs related to this part" via Part → In(DescribesPart) → Document.
  • Facets: Product, Assembly, Team, DocumentRevision (newest/all/specific revision).
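
As a rough illustration of graph-scoped search, the sketch below first follows the inverse of DescribesPart from a part to its documents, then applies a naive keyword match to the body text of only those documents. The data, the `docs_for_part` and `scoped_search` helpers, and the keyword matching are assumptions for illustration; a real deployment would use the platform's own graph traversal and hybrid search rather than this in-memory stand-in.

```python
# Hypothetical DescribesPart edges as (document, part) pairs.
DESCRIBES_PART = [
    ("DOC-1", "P-100"),
    ("DOC-2", "P-100"),
    ("DOC-3", "P-200"),
]

# Hypothetical PDF-extracted body text, keyed by document number.
BODY = {
    "DOC-1": "Torque limits for the pump housing bolts.",
    "DOC-2": "Coolant flow rates through the pump housing.",
    "DOC-3": "Coolant hose routing for the radiator.",
}


def docs_for_part(part_number):
    """Part -> In(DescribesPart) -> Document, as a sorted list."""
    return sorted(doc for doc, part in DESCRIBES_PART if part == part_number)


def scoped_search(part_number, query):
    """Keyword match restricted to documents that describe the part."""
    terms = query.lower().split()
    return [
        doc
        for doc in docs_for_part(part_number)
        if all(term in BODY[doc].lower() for term in terms)
    ]


print(scoped_search("P-100", "coolant"))
```

Scoping first and searching second means DOC-3 never competes for ranking here, even though it also mentions coolant, because it describes a different part.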

AI

A "part-aware assistant" with three tools:

  • FindDocsForPart(partNumber) — returns documents describing a part and its parent assemblies.
  • GetDocumentSummary(docNumber) — returns the doc's summary with a citation pointer.
  • LatestRevisionOf(docNumber) — walks RevisedFrom to find the head revision.

Prompt template: "Answer using only the cited revisions. If a document has a newer revision, prefer it. Cite with [1], [2]."
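
The revision walk behind a tool like LatestRevisionOf might look like the following sketch. It assumes each revision is its own Document node, that RevisedFrom points from the newer revision to its predecessor, and that the chain is linear (at most one successor per revision). The `REVISED_FROM` data, the document numbers, and the `latest_revision_of` function are hypothetical and not the tool's actual implementation.

```python
# Hypothetical RevisedFrom edges: newer revision -> its predecessor.
REVISED_FROM = {
    "SPEC-7-B": "SPEC-7-A",
    "SPEC-7-C": "SPEC-7-B",
}


def latest_revision_of(doc_number, revised_from=REVISED_FROM):
    """Follow RevisedFrom edges in reverse until no newer revision exists."""
    # Invert the edges so we can step from a predecessor to its successor.
    # Assumes a linear chain: one successor per revision.
    successor = {old: new for new, old in revised_from.items()}
    current = doc_number
    seen = {current}
    while current in successor:
        current = successor[current]
        if current in seen:
            raise ValueError(f"revision cycle detected at {current}")
        seen.add(current)
    return current


print(latest_revision_of("SPEC-7-A"))
```

A document with no RevisedFrom edges is returned unchanged, since it is already the head of its own one-element chain.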

Permissions

Drawings and specs are typically restricted by business unit rather than per-document — set up Workspace teams that mirror the engineering org, and call RestrictAccessToTeam(doc, ownerTeam) during ingestion. Read-only access for executives can be modeled as additional team memberships.
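
The team-based model can be pictured with a small in-memory sketch. The `restrict_access_to_team` function below only stands in for the ingestion call named above; its signature, the `add_member` and `can_read` helpers, the team names, and the email addresses are all assumptions for illustration. The sketch denies access by default, which is one possible policy, not necessarily the platform's behavior.

```python
from collections import defaultdict

# Hypothetical stores: which teams may read a document, and which
# teams each user belongs to.
doc_teams = defaultdict(set)
user_teams = defaultdict(set)


def restrict_access_to_team(doc_number, team_name):
    """Stand-in for the ingestion-time restriction call."""
    doc_teams[doc_number].add(team_name)


def add_member(email, team_name):
    """Record a team membership, keyed by lower-cased email."""
    user_teams[email.lower()].add(team_name)


def can_read(email, doc_number):
    """Allow access only if the user shares a team with the document."""
    allowed = doc_teams.get(doc_number, set())
    return bool(allowed & user_teams.get(email.lower(), set()))


# Made-up data: a drawing owned by the Cooling team.
restrict_access_to_team("DRW-1", "Cooling")
add_member("ana@example.com", "Cooling")
add_member("raj@example.com", "Chassis")
# Read access for an executive modeled as an extra team membership.
add_member("exec@example.com", "Cooling")

print(can_read("ana@example.com", "DRW-1"), can_read("raj@example.com", "DRW-1"))
```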

Connector

Sources:

  • PLM (Teamcenter, Windchill, ENOVIA, …) for the Part/Assembly/Product graph.
  • Document management (SharePoint, network shares) for the heavy PDFs.
  • HR / SSO (Entra ID, Okta) for Engineer/Team membership.

The connector merges those three sources — IDs from PLM are authoritative; documents are matched to parts by metadata field; engineers are matched by email.
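
A minimal sketch of that merge step, under stated assumptions: PLM supplies the set of authoritative part numbers, each document record carries a `part_number` metadata field and an `author_email`, and SSO supplies engineer emails. The record shapes, field names, and the `merge_sources` function are hypothetical and do not reflect the actual connector API.

```python
def merge_sources(plm_parts, dms_docs, sso_emails):
    """Link documents to PLM parts and SSO engineers; report misses."""
    known_parts = set(plm_parts)  # PLM identifiers are authoritative
    known_emails = {email.lower() for email in sso_emails}
    edges = []
    unmatched_docs = []
    for doc in dms_docs:
        doc_number = doc["doc_number"]
        part = doc.get("part_number")
        if part in known_parts:
            edges.append((doc_number, "DescribesPart", part))
        else:
            # Keep the miss visible instead of inventing a part node.
            unmatched_docs.append(doc_number)
        author = doc.get("author_email", "").lower()
        if author in known_emails:
            edges.append((doc_number, "AuthoredBy", author))
    return edges, unmatched_docs


# Made-up sample records.
edges, unmatched = merge_sources(
    plm_parts=["P-100", "P-200"],
    dms_docs=[
        {"doc_number": "DOC-1", "part_number": "P-100",
         "author_email": "Ana@Example.com"},
        {"doc_number": "DOC-2", "part_number": "P-999"},
    ],
    sso_emails=["ana@example.com"],
)
print(edges)
print(unmatched)
```

Returning the unmatched documents separately makes metadata mismatches easy to review, rather than silently dropping them or creating parts that PLM does not know about.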

Deployment

  • Production-scale corpora (millions of parts, hundreds of thousands of documents) typically need 64 GB of RAM or more.
  • OCR is heavy; provision parser CPUs accordingly.
  • Embedding choice: a local model is common for IP-sensitive corpora.

Common pitfalls in this domain

  • Stale document revisions dominating retrieval — solve by always preferring the latest revision in the response, or by filtering older revisions out at search time.
  • Drawing metadata that doesn't match part metadata — invest in a canonicalization pass during ingestion.
  • Permission inflation ("everyone in engineering can see everything") — start restrictive, expand only on request.
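
Two of these mitigations can be sketched briefly: dropping superseded revisions from a result list at search time, and a canonicalization pass for part numbers. The normalization rules (upper-casing and treating spaces, underscores, dots, slashes, and hyphens as one separator), the sample identifiers, and both function names are assumptions; real part-numbering schemes will need their own rules.

```python
import re

# Hypothetical RevisedFrom edges: newer revision -> its predecessor.
REVISED_FROM = {
    "SPEC-7-B": "SPEC-7-A",
    "SPEC-7-C": "SPEC-7-B",
}


def drop_superseded(hits, revised_from=REVISED_FROM):
    """Remove any hit that some newer revision points back to."""
    superseded = set(revised_from.values())
    return [doc for doc in hits if doc not in superseded]


def canonical_part_number(raw):
    """Normalize case and separators so metadata variants compare equal."""
    collapsed = re.sub(r"[\s_./-]+", "-", raw.strip())
    return collapsed.upper()


print(drop_superseded(["SPEC-7-A", "SPEC-7-C", "DOC-9"]))
print(canonical_part_number(" p_100 "))
```

This normalization does not reconcile variants that omit the separator altogether (for example `P100` versus `P-100`); handling those would require knowledge of the specific numbering scheme.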