HYVE Docs
Dump Features

Dump's RAG (Retrieval-Augmented Generation) interface lets you ask questions in natural language and get AI-generated answers grounded in your vault's content.

How It Works

Query Embedding

Your question is vectorized using Gemini text-embedding-004 to find semantically relevant content.

pgvector approximate nearest neighbor (ANN) search finds the most relevant items. Falls back to full-text search if vector search returns no results.
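The search-then-fallback flow can be sketched as a small TypeScript function. The names (`searchVault`, `vectorSearch`, `fullTextSearch`, `VaultItem`) are illustrative, not the actual implementation; the real queries would run against Postgres with pgvector's distance operators and `to_tsvector` full-text search.

```typescript
// Illustrative item shape; the real vault schema may differ.
interface VaultItem {
  id: string;
  title: string;
  content: string;
}

type SearchFn = (query: string, limit: number) => Promise<VaultItem[]>;

// Run ANN search first; fall back to full-text search only when the
// vector pass returns no results, as described above.
async function searchVault(
  query: string,
  limit: number,
  vectorSearch: SearchFn,   // e.g. a pgvector `<=>` distance query
  fullTextSearch: SearchFn, // e.g. a Postgres `to_tsvector` query
): Promise<VaultItem[]> {
  const hits = await vectorSearch(query, limit);
  if (hits.length > 0) return hits;
  return fullTextSearch(query, limit);
}
```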

Context Building

Up to 4,000 characters of relevant content are assembled from the top matches, with source attribution.
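A minimal sketch of that assembly step, assuming each match is prefixed with its source title for attribution (the `Match` shape and `buildContext` name are illustrative):

```typescript
interface Match {
  title: string;
  content: string;
}

// Concatenate top matches until the ~4,000-character budget is reached,
// prefixing each snippet with its source title for attribution.
function buildContext(matches: Match[], budget = 4000): string {
  let context = "";
  for (const m of matches) {
    const block = `[${m.title}]\n${m.content}\n\n`;
    if (context.length + block.length > budget) {
      // Take whatever still fits of this block, then stop.
      context += block.slice(0, budget - context.length);
      break;
    }
    context += block;
  }
  return context;
}
```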

Streaming Response

Gemini generates a response based on the context, streamed back via Server-Sent Events (SSE).

API

POST /api/rag

Request Body

{
  query: string  // Your question (required)
  limit?: number // Max context items (1-20, default 10)
}
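The validation rules above (required `query`, `limit` clamped to 1-20 with a default of 10) can be expressed directly. This is a client-side sketch, not the endpoint's actual code; `normalizeRagRequest` is a hypothetical name:

```typescript
interface RagRequest {
  query: string;
  limit?: number;
}

// Normalize a request per the documented contract: `query` is required,
// `limit` is clamped to 1-20 and defaults to 10.
function normalizeRagRequest(body: RagRequest): { query: string; limit: number } {
  if (!body.query) throw new Error("query is required");
  const limit = Math.min(20, Math.max(1, body.limit ?? 10));
  return { query: body.query, limit };
}
```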

Response (SSE Stream)

The response is a stream of Server-Sent Events:

data: {"status": "embedding"}
data: {"status": "searching"}
data: {"status": "thinking"}
data: {"text": "Based on your vault..."}
data: {"text": " the article mentions..."}
data: {"done": true, "sources": [...]}
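A client consumes this stream by splitting on newlines and decoding each `data:` line as JSON. A minimal parser sketch (the `RagEvent` union and `parseSSELine` name are illustrative, not part of the API):

```typescript
// The three event shapes shown in the stream above.
type RagEvent =
  | { status: string }
  | { text: string }
  | { done: true; sources: unknown[] };

// Decode one SSE line; non-data lines (comments, keepalives) yield null.
function parseSSELine(line: string): RagEvent | null {
  if (!line.startsWith("data:")) return null;
  return JSON.parse(line.slice(5).trim()) as RagEvent;
}
```

A full client would read the response body as a `ReadableStream`, buffer partial lines, and accumulate `text` chunks until a `done` event arrives.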

Status Events

Status      Description
embedding   Vectorizing your question
searching   Finding relevant content via pgvector
thinking    Gemini is generating the answer

Source Attribution

The final done event includes sources used for the answer:

{
  "done": true,
  "sources": [
    {
      "id": "uuid",
      "title": "Article Title",
      "url": "https://example.com",
      "source_type": "article"
    }
  ]
}

System Prompt

The RAG assistant (HYVER) follows these rules:

  • Answers based only on provided vault context
  • Cites sources by title
  • Responds "Nao encontrei isso no vault" ("I didn't find that in the vault") when the answer isn't in context
  • Responds in the same language as the question

Model Selection

Dump tries models in order of preference:

  1. gemini-3.1-pro-preview
  2. gemini-2.0-flash (fallback)

If both fail, the error is streamed back to the client.
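The fallback loop can be sketched as follows. `generateWithFallback` and the `Generate` signature are illustrative stand-ins for the actual Gemini client call; only the model order and the error-propagation behavior come from the text above:

```typescript
type Generate = (model: string, prompt: string) => Promise<string>;

// Preference order as documented above.
const MODEL_PREFERENCE = ["gemini-3.1-pro-preview", "gemini-2.0-flash"];

// Try each model in order; rethrow the last error if all fail, so the
// route handler can stream it back to the client.
async function generateWithFallback(
  generate: Generate,
  prompt: string,
  models: string[] = MODEL_PREFERENCE,
): Promise<string> {
  let lastError: unknown;
  for (const model of models) {
    try {
      return await generate(model, prompt);
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}
```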
