HYVE Docs

Dump combines PostgreSQL full-text search with vector similarity for hybrid results, re-ranked by a composite scoring formula.

How It Works

Query Processing

The search query is sent to two systems in parallel: PostgreSQL tsvector full-text search and Gemini text-embedding-004 for semantic embedding.

Supabase textSearch finds items matching the query terms. If full-text search fails, the query falls back to an ilike search.

Composite Re-Ranking

Results are scored using a weighted formula combining position, recency, source quality, and semantic similarity.

RAG Complement

If embeddings are available, a secondary pgvector ANN search adds semantically related items not found by text search.
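The "not found by text search" step can be pictured as a set difference over the two result lists. The sketch below is only an illustration: the `id` field and the function name `complementWithRag` are assumptions, since the docs do not say how items are identified or where the filtering happens.

```typescript
// Hypothetical minimal item shape: the docs do not name the identifier field.
interface Identified {
  id: string;
}

// Keep only the vector-search hits that the text search did not already return.
function complementWithRag<T extends Identified>(
  textResults: T[],
  vectorResults: T[],
): T[] {
  const seen = new Set(textResults.map((item) => item.id));
  return vectorResults.filter((item) => !seen.has(item.id));
}
```

Using a Set keeps the filter linear in the combined size of the two lists.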

Ranking Formula

When embeddings are available:

score = baseRank * 0.4 + recency * 0.2 + sourceWeight * 0.1 + semanticSim * 0.3

When embeddings are not available:

score = baseRank * 0.6 + recency * 0.25 + sourceWeight * 0.15
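Both formulas can be written as one function that switches on whether a semantic similarity value is present. This is a minimal sketch, not the actual implementation: the `ScoreInputs` type and the function name are hypothetical, and it assumes every component is already scaled to the 0 to 1 range.

```typescript
// Hypothetical input shape; each component is assumed to lie in [0, 1].
interface ScoreInputs {
  baseRank: number;
  recency: number;
  sourceWeight: number;
  semanticSim?: number; // left undefined when no embedding is available
}

function compositeScore(s: ScoreInputs): number {
  if (s.semanticSim !== undefined) {
    // Embeddings available: weights 0.4 / 0.2 / 0.1 / 0.3
    return (
      s.baseRank * 0.4 +
      s.recency * 0.2 +
      s.sourceWeight * 0.1 +
      s.semanticSim * 0.3
    );
  }
  // No embeddings: weights 0.6 / 0.25 / 0.15
  return s.baseRank * 0.6 + s.recency * 0.25 + s.sourceWeight * 0.15;
}
```

In both variants the weights sum to 1.0, so if the inputs stay within 0 to 1, the score does too.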

Source Weights

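| Source | Weight | Normalized |
| --- | --- | --- |
| PDF | 1.3 | 1.0 |
| Article | 1.2 | 0.86 |
| YouTube | 1.1 | 0.71 |
| Reddit | 1.05 | 0.64 |
| Twitter/X | 1.0 | 0.57 |
| LinkedIn | 0.9 | 0.43 |
| Instagram | 0.8 | 0.29 |
| Text | 0.7 | 0.14 |
| Image | 0.6 | 0.0 |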
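The Normalized values are consistent with min-max scaling of the raw weights between 0.6 (Image) and 1.3 (PDF). The docs do not state this, so the sketch below is an inference from the numbers; the constant and function names are hypothetical.

```typescript
// Raw weights copied from the Source Weights table.
const SOURCE_WEIGHTS: Record<string, number> = {
  PDF: 1.3,
  Article: 1.2,
  YouTube: 1.1,
  Reddit: 1.05,
  "Twitter/X": 1.0,
  LinkedIn: 0.9,
  Instagram: 0.8,
  Text: 0.7,
  Image: 0.6,
};

// Min-max scaling: (w - min) / (max - min), giving values in [0, 1].
function normalizeWeight(weight: number): number {
  const values = Object.values(SOURCE_WEIGHTS);
  const min = Math.min(...values);
  const max = Math.max(...values);
  return (weight - min) / (max - min);
}
```

For example, Article works out to (1.2 - 0.6) / (1.3 - 0.6), which rounds to the 0.86 listed for it.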

Recency Score

recency = 1 / (1 + daysSinceCreation)

Items created today score about 1.0, while items from one year ago score about 0.003 (1 / 366).
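A small sketch of the recency formula follows. It assumes that daysSinceCreation is measured in fractional days and that timestamps in the future are clamped to zero days. Neither detail is specified in the docs, and the names are hypothetical.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// recency = 1 / (1 + daysSinceCreation)
function recencyScore(createdAt: Date, now: Date = new Date()): number {
  const elapsedMs = now.getTime() - createdAt.getTime();
  // Assumption: clamp negative ages (clock skew, future dates) to 0 days.
  const days = Math.max(0, elapsedMs / MS_PER_DAY);
  return 1 / (1 + days);
}
```

The decay is steep at the start: an item exactly one day old already scores 0.5, and a week-old item scores 0.125.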

API

GET /api/search

Query Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| q | string | Yes | Search query |
| source_type | string | No | Filter by source type |
| category | string | No | Filter by category |
| limit | number | No | Max results (default 20, max 100) |
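As an illustration of how these parameters combine, the sketch below builds a request URL with the standard URL API. The base URL, the `SearchParams` type, and the client-side clamping of limit to the range 1 to 100 are assumptions for the example; the server may validate differently.

```typescript
// Hypothetical client-side parameter shape mirroring the documented query parameters.
interface SearchParams {
  q: string;
  source_type?: string;
  category?: string;
  limit?: number;
}

function buildSearchUrl(baseUrl: string, params: SearchParams): string {
  const url = new URL("/api/search", baseUrl);
  url.searchParams.set("q", params.q);
  if (params.source_type) url.searchParams.set("source_type", params.source_type);
  if (params.category) url.searchParams.set("category", params.category);
  if (params.limit !== undefined) {
    // Assumption: clamp to the documented maximum of 100 and a minimum of 1.
    const limit = Math.min(100, Math.max(1, Math.floor(params.limit)));
    url.searchParams.set("limit", String(limit));
  }
  return url.toString();
}
```

Omitting limit leaves the server default of 20 in effect.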

Response

{
  "items": [...],
  "total": 42,
  "rag": [...]
}

The items array contains ranked results with a _score field. The rag array contains additional semantically related items from the RAG index, deduplicated against the main results.
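A client might describe and validate this response shape as follows. Only the `items`, `total`, `rag`, and `_score` names come from the docs; the type names, the parser, and the choice to reject responses that lack any of the three top-level fields are assumptions.

```typescript
// Item fields other than _score are not documented, so they stay open-ended.
type RankedItem = Record<string, unknown> & { _score: number };
type RagItem = Record<string, unknown>;

interface SearchResponse {
  items: RankedItem[];
  total: number;
  rag: RagItem[];
}

// Parse a response body and check the documented top-level shape.
function parseSearchResponse(body: string): SearchResponse {
  const data: unknown = JSON.parse(body);
  if (typeof data !== "object" || data === null) {
    throw new Error("search response is not an object");
  }
  const { items, total, rag } = data as Record<string, unknown>;
  if (!Array.isArray(items) || !Array.isArray(rag) || typeof total !== "number") {
    throw new Error("search response is missing items, total, or rag");
  }
  return { items: items as RankedItem[], total, rag: rag as RagItem[] };
}
```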
