Knowledge and grounding

The current prototype ingests public hotel data, normalizes it into a structured knowledge file, and generates embeddings for semantic search. When the snapshot covers a question only thinly, it can optionally verify the answer against live first-party or trusted sources.

This page covers

Knowledge pipeline: Aurelia starts from structured hotel records, not raw pages alone.
Embeddings and retrieval: Aurelia mixes lexical signals with semantic similarity for hotel search.
Live verification strategy: Not every answer needs live lookup, but some do.
Trust and answer transparency: Aurelia should be explicit about what is known and what is not.

Page details

Audience

Technical stakeholders focused on retrieval, answer quality, and trustworthiness.

Read time

8 min

Focus

See how Aurelia builds and searches the hotel knowledge base, embeddings, and live source checks.

Knowledge pipeline

Aurelia starts from structured hotel records, not raw pages alone.

The prototype already includes an ingest pipeline that collects public hotel information into a normalized JSON knowledge base. Each record carries hotel identity, destination, collections, amenities, activities, room details, dining details, spa details, location details, and supporting links.

Normalized knowledge records create a stable retrieval surface for planning and hotel questions.
Search text is built from structured fields rather than relying on raw HTML at answer time.
Images, booking links, ratings, and section links can be attached to support richer answer widgets.
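
As an illustration of how search text can be derived from structured fields, here is a minimal TypeScript sketch. The `HotelRecord` shape, its field names, and the `buildSearchText` helper are assumptions made for this example, based only on the categories listed above; they are not the prototype's actual schema. The sample hotel is fictional.

```typescript
// Hypothetical record shape: field names are assumptions, not the real schema.
interface HotelRecord {
  id: string;
  name: string;
  destination: string;
  collections: string[];
  amenities: string[];
  activities: string[];
  rooms: string;
  dining: string;
  spa: string;
  location: string;
  links: { label: string; url: string }[];
}

// Join descriptive fields into one normalized string that can feed both
// lexical matching and embedding generation. Links are deliberately left out
// so URLs do not pollute the search text.
function buildSearchText(record: HotelRecord): string {
  const parts: string[] = [
    record.name,
    record.destination,
    ...record.collections,
    ...record.amenities,
    ...record.activities,
    record.rooms,
    record.dining,
    record.spa,
    record.location,
  ];
  return parts
    .map((part) => part.trim())
    .filter((part) => part.length > 0) // skip empty sections
    .join(" ")
    .replace(/\s+/g, " ") // collapse repeated whitespace
    .toLowerCase();
}

// Fictional sample record used only to exercise the helper.
const example: HotelRecord = {
  id: "example-harbor",
  name: "Example Harbor Hotel",
  destination: "Exampleville",
  collections: ["City"],
  amenities: ["Rooftop pool", "Spa"],
  activities: ["Sailing"],
  rooms: "48 rooms with harbor views",
  dining: "",
  spa: "Thermal circuit",
  location: "Old town waterfront",
  links: [{ label: "Booking", url: "https://example.com/book" }],
};

console.log(buildSearchText(example));
```

Keeping links, images, and ratings outside the search text means they can still travel with the record and be attached to answer widgets without affecting ranking.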

Embeddings and retrieval

Aurelia mixes lexical signals with semantic similarity for hotel search.

The current repo generates embeddings for hotel records and uses them alongside lexical scoring. That allows Aurelia to rank hotels based on both literal phrase matches and broader semantic fit, which is especially useful when guests describe a vibe, occasion, or requirement rather than a hotel name.

The current prototype provides npm scripts for ingest and embedding generation:

npm run ingest:preferred
npm run embed:preferred
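
The blend of lexical and semantic signals described in this section can be sketched as follows. This is an illustrative example, not the repo's ranking code: the term-overlap lexical score, the equal default weighting (`alpha = 0.5`), and the `IndexedHotel` shape are all assumptions, and the query embedding is passed in rather than produced by any particular model.

```typescript
// Hypothetical indexed record: search text plus a precomputed embedding.
interface IndexedHotel {
  name: string;
  searchText: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid division by zero
  return dot / Math.sqrt(normA * normB);
}

function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0);
}

// Fraction of distinct query terms that literally appear in the search text.
function lexicalScore(query: string, searchText: string): number {
  const queryTerms = tokenize(query).filter(
    (term, index, all) => all.indexOf(term) === index
  );
  if (queryTerms.length === 0) return 0;
  const docTerms = tokenize(searchText);
  const hits = queryTerms.filter((term) => docTerms.indexOf(term) !== -1).length;
  return hits / queryTerms.length;
}

// Weighted blend: alpha controls how much literal matches count relative to
// semantic fit. Negative cosine values are clamped to zero.
function rankHotels(
  query: string,
  queryEmbedding: number[],
  hotels: IndexedHotel[],
  alpha: number = 0.5
): { hotel: IndexedHotel; score: number }[] {
  return hotels
    .map((hotel) => {
      const lexical = lexicalScore(query, hotel.searchText);
      const semantic = Math.max(0, cosineSimilarity(queryEmbedding, hotel.embedding));
      return { hotel, score: alpha * lexical + (1 - alpha) * semantic };
    })
    .sort((left, right) => right.score - left.score);
}
```

A query that describes a vibe rather than a name tends to score low on the lexical term and relies on the semantic term, which is why both signals are kept in the sketch.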

Why this matters for docs and product

The same pattern used for the hotel demo can be extended to docs search: structured content, searchable passages, retrieval ranking, and grounded answer generation instead of a free-form chat with no corpus behind it.

Live verification strategy

Not every answer needs live lookup, but some do.

Freshness-sensitive planning questions may require a current source check.
Policy and capability questions should escalate to official sources when the snapshot is thin.
Local-area and spatial questions can use trusted local or map-like sources when hotel data is not enough.
Grounding metadata should make it clear whether the answer came from snapshot data alone or from verified live sources.
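
One way to express the escalation rules above is a small routing function that also produces the grounding metadata. The sketch below is illustrative only: the question categories, the `snapshotCoverage` estimate, the `thinThreshold` default, and the choice to always check a live first-party source for freshness-sensitive questions are assumptions, not the prototype's behavior.

```typescript
type QuestionKind = "freshness" | "policy" | "local_area" | "general";
type SourceTier = "snapshot" | "official" | "trusted_local";

// Metadata that travels with the answer so the UI can show where it came from.
interface Grounding {
  source: SourceTier;
  liveVerified: boolean;
  reason: string;
}

// snapshotCoverage is a hypothetical 0..1 estimate of how well the stored
// snapshot answers the question; values below thinThreshold count as "thin".
function planVerification(
  kind: QuestionKind,
  snapshotCoverage: number,
  thinThreshold: number = 0.5
): Grounding {
  const thin = snapshotCoverage < thinThreshold;
  switch (kind) {
    case "freshness":
      // Assumption for this sketch: freshness-sensitive questions always
      // trigger a live check against a first-party source.
      return { source: "official", liveVerified: true, reason: "freshness-sensitive question" };
    case "policy":
      return thin
        ? { source: "official", liveVerified: true, reason: "snapshot too thin for a policy answer" }
        : { source: "snapshot", liveVerified: false, reason: "snapshot covers the policy" };
    case "local_area":
      return thin
        ? { source: "trusted_local", liveVerified: true, reason: "hotel data not enough for a local-area answer" }
        : { source: "snapshot", liveVerified: false, reason: "snapshot covers the local area" };
    default:
      return { source: "snapshot", liveVerified: false, reason: "no live lookup needed" };
  }
}

console.log(planVerification("policy", 0.2));
```

Returning the `reason` alongside the source tier keeps the decision auditable: an operator can see why a live lookup did or did not happen.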

Trust and answer transparency

Aurelia should be explicit about what is known and what is not.

A grounded assistant becomes more credible when it names the limits of the available evidence. If a capability is not confirmed, the answer should not bluff. If live research was required, the experience should expose the source clearly enough that a guest or operator can understand where the claim came from.
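
To make this concrete, the following sketch renders an answer so that unconfirmed claims are labeled as such and confirmed claims carry their source. The `Evidence` shape and the output wording are hypothetical and chosen only for illustration.

```typescript
// Hypothetical evidence item attached to each claim in an answer.
interface Evidence {
  claim: string;
  confirmed: boolean;
  sourceLabel?: string; // e.g. "snapshot" or the name of a live source
  sourceUrl?: string;
}

// Render one line per claim. Unconfirmed claims are flagged instead of being
// stated as fact; confirmed claims always show where they came from.
function renderAnswer(evidence: Evidence[]): string {
  return evidence
    .map((item) => {
      if (!item.confirmed) {
        return "Not confirmed: " + item.claim;
      }
      const label = item.sourceLabel || "snapshot";
      const source = item.sourceUrl ? label + ", " + item.sourceUrl : label;
      return item.claim + " [source: " + source + "]";
    })
    .join("\n");
}

console.log(
  renderAnswer([
    { claim: "The hotel has a rooftop pool", confirmed: true },
    { claim: "Pets are allowed", confirmed: false },
  ])
);
```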
