Search

Dump combines PostgreSQL full-text search with vector similarity for hybrid results, re-ranked by a composite scoring formula.

How It Works

Query Processing

The search query is sent to two systems in parallel: PostgreSQL tsvector full-text search and Gemini text-embedding-004 for semantic embedding.

Full-Text Search

Supabase textSearch finds items matching the query terms. If full-text search fails, the query falls back to an ilike search.
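The fallback can be sketched as a small wrapper. This is a minimal sketch, not Dump's actual code: the Item shape, the injected fullText and ilike functions, and the toIlikePattern helper are all hypothetical. It also assumes "fails" means the full-text call throws; the source does not say whether an empty result set triggers the fallback too.

```typescript
// Hypothetical item shape; the source does not list the item fields.
type Item = { id: string; title: string };
type SearchFn = (input: string) => Promise<Item[]>;

// Escape LIKE wildcards (%, _) and the escape character itself so user
// input is matched literally, then wrap the term in %...% for a substring match.
function toIlikePattern(query: string): string {
  const escaped = query.replace(/[\\%_]/g, (ch) => "\\" + ch);
  return `%${escaped}%`;
}

// Try full-text search first; use the ilike search only if it throws.
// (Assumption: a thrown error is what counts as "failure".)
async function searchWithFallback(
  query: string,
  fullText: SearchFn,
  ilike: SearchFn,
): Promise<Item[]> {
  try {
    return await fullText(query);
  } catch {
    return ilike(toIlikePattern(query));
  }
}
```

In a Supabase-backed version, fullText and ilike would wrap the client's textSearch and ilike filters on whichever table and column hold the searchable text.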

Composite Re-Ranking

Results are scored using a weighted formula combining full-text position (baseRank), recency, source quality (sourceWeight), and semantic similarity (semanticSim).

RAG Complement

If embeddings are available, a secondary pgvector approximate nearest neighbor (ANN) search adds semantically related items that the text search did not find.

Ranking Formula

When embeddings are available:

score = baseRank * 0.4 + recency * 0.2 + sourceWeight * 0.1 + semanticSim * 0.3

When embeddings are not available:

score = baseRank * 0.6 + recency * 0.25 + sourceWeight * 0.15
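The two formulas can be expressed as a single scoring function. This is a minimal sketch under stated assumptions: the Signals interface and the compositeScore and rerank names are illustrative, each input is assumed to be scaled to [0, 1], and sourceWeight is assumed to be the normalized source weight rather than the raw one.

```typescript
// Signals for one result; each is assumed to lie in [0, 1].
interface Signals {
  baseRank: number;     // score derived from full-text position
  recency: number;      // 1 / (1 + daysSinceCreation)
  sourceWeight: number; // assumed: the normalized source weight
  semanticSim?: number; // left undefined when no embedding is available
}

// Apply the four-term formula when semanticSim is present,
// otherwise the three-term formula.
function compositeScore(s: Signals): number {
  if (s.semanticSim !== undefined) {
    return s.baseRank * 0.4 + s.recency * 0.2 + s.sourceWeight * 0.1 + s.semanticSim * 0.3;
  }
  return s.baseRank * 0.6 + s.recency * 0.25 + s.sourceWeight * 0.15;
}

// Return a copy of the results with a _score field, sorted best first.
function rerank<T extends Signals>(results: T[]): (T & { _score: number })[] {
  return results
    .map((r) => ({ ...r, _score: compositeScore(r) }))
    .sort((a, b) => b._score - a._score);
}
```

Both weight sets sum to 1, so the score stays in [0, 1] whenever the inputs do.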

Source Weights

Source	Weight	Normalized
PDF	1.3	1.0
Article	1.2	0.86
YouTube	1.1	0.71
Reddit	1.05	0.64
Twitter/X	1.0	0.57
LinkedIn	0.9	0.43
Instagram	0.8	0.29
Text	0.7	0.14
Image	0.6	0.0
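The Normalized column is consistent with min-max scaling of the raw weights, (weight - min) / (max - min), rounded to two decimals. The source does not state this rule; it is inferred from the table values. The sketch below reproduces the column under that assumption, and the normalizeWeights name is illustrative.

```typescript
// Raw weights copied from the table above.
const SOURCE_WEIGHTS: Record<string, number> = {
  PDF: 1.3,
  Article: 1.2,
  YouTube: 1.1,
  Reddit: 1.05,
  "Twitter/X": 1.0,
  LinkedIn: 0.9,
  Instagram: 0.8,
  Text: 0.7,
  Image: 0.6,
};

// Min-max scale the weights to [0, 1], rounded to two decimals.
// (Assumption: this is how the Normalized column was produced.)
function normalizeWeights(weights: Record<string, number>): Record<string, number> {
  const sources = Object.keys(weights);
  const values = sources.map((source) => weights[source]);
  const min = Math.min(...values);
  const range = Math.max(...values) - min;
  const normalized: Record<string, number> = {};
  for (const source of sources) {
    // Guard against a zero range when every weight is identical.
    const scaled = range === 0 ? 0 : (weights[source] - min) / range;
    normalized[source] = Math.round(scaled * 100) / 100;
  }
  return normalized;
}
```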

Recency Score

recency = 1 / (1 + daysSinceCreation)

Items created today score ~1.0; items from one year ago score ~0.003.
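A minimal sketch of the recency formula follows. The recencyScore name is illustrative, and two details are assumptions rather than documented behavior: age is measured in fractional days, and timestamps in the future are clamped to an age of zero.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// recency = 1 / (1 + daysSinceCreation)
function recencyScore(createdAt: Date, now: Date = new Date()): number {
  // Assumption: fractional days, with future timestamps clamped to age 0.
  const days = Math.max(0, (now.getTime() - createdAt.getTime()) / MS_PER_DAY);
  return 1 / (1 + days);
}
```

The decay is steep: an item that is one day old scores 0.5, and a week-old item scores 0.125.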

API

GET /api/search

Query Parameters

Parameter	Type	Required	Description
`q`	string	Yes	Search query
`source_type`	string	No	Filter by source type
`category`	string	No	Filter by category
`limit`	number	No	Max results (default 20, max 100)
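A client might assemble the request URL as in the sketch below. The SearchParams interface and buildSearchUrl function are illustrative. Clamping limit on the client is an assumption (the source does not say how the server treats out-of-range values), as is the lower bound of 1.

```typescript
// Mirrors the query parameters listed in the table above.
interface SearchParams {
  q: string;
  source_type?: string;
  category?: string;
  limit?: number;
}

function buildSearchUrl(params: SearchParams): string {
  // Default 20, max 100 per the table; the floor of 1 is an assumption.
  const requested = params.limit ?? 20;
  const limit = Number.isFinite(requested)
    ? Math.min(100, Math.max(1, Math.trunc(requested)))
    : 20;

  const qs = new URLSearchParams({ q: params.q });
  if (params.source_type) qs.set("source_type", params.source_type);
  if (params.category) qs.set("category", params.category);
  qs.set("limit", String(limit));
  return `/api/search?${qs.toString()}`;
}
```

For example, buildSearchUrl({ q: "vector search" }) yields /api/search?q=vector+search&limit=20.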

Response

{
  "items": [...],
  "total": 42,
  "rag": [...]
}

The items array contains ranked results with a _score field. The rag array contains additional semantically related items from the RAG index, deduplicated against the main results.
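The response shape and the deduplication step might look like the following sketch. The items, total, rag, and _score names come from the response above; the id field and the dedupeRag function are hypothetical, since the source does not say which key is used to detect duplicates.

```typescript
// `id` is a hypothetical key; the source does not name the item fields.
interface RankedItem {
  id: string;
  _score: number;
}

interface RagItem {
  id: string;
}

interface SearchResponse {
  items: RankedItem[];
  total: number;
  rag: RagItem[];
}

// Drop RAG items whose id already appears in the ranked results.
function dedupeRag(items: RankedItem[], rag: RagItem[]): RagItem[] {
  const seen = new Set(items.map((item) => item.id));
  return rag.filter((candidate) => {
    if (seen.has(candidate.id)) return false;
    seen.add(candidate.id); // also collapses duplicates within rag itself
    return true;
  });
}
```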
