Dump is HYVE's knowledge vault — a tool to capture, process, and retrieve content from across the web. Paste a URL or text, and Dump extracts, categorizes, and indexes it for instant search and AI-powered retrieval.
Key Features
9 Source Types
Ingest from Twitter/X, YouTube, Instagram, LinkedIn, Reddit, articles, PDFs, images, and plain text.
AI Categorization
Gemini-powered automatic categorization, tagging, and subcategory assignment.
Hybrid Search
Full-text + semantic vector search for fast, relevant results.
RAG Interface
Ask questions about your knowledge base using retrieval-augmented generation.
Supported Sources
| Source | Detection | Example |
|---|---|---|
| Twitter/X | twitter.com, x.com | Tweet threads, profiles |
| YouTube | youtube.com, youtu.be | Video transcripts |
| Instagram | instagram.com | Posts, carousels |
| LinkedIn | linkedin.com | Posts, articles |
| Reddit | reddit.com, redd.it | Posts, comments |
| PDF | .pdf URLs | Documents, papers |
| Article | Any other URL | Web pages, blogs |
| Image | .png, .jpg, .webp | Image files |
| Text | Plain text input | Notes, snippets |
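The detection rules in the table can be sketched as a single function. This is a minimal illustration, not Dump's actual extractor code: the `SourceType` names, the `detectSourceType` helper, and the precedence of host rules over file extensions are all assumptions made for the example.

```typescript
// Hypothetical type names; the real identifiers in Dump may differ.
type SourceType =
  | "twitter" | "youtube" | "instagram" | "linkedin" | "reddit"
  | "pdf" | "image" | "article" | "text";

// Host-based rules, taken from the Detection column of the table.
const HOST_RULES: Array<[SourceType, string[]]> = [
  ["twitter", ["twitter.com", "x.com"]],
  ["youtube", ["youtube.com", "youtu.be"]],
  ["instagram", ["instagram.com"]],
  ["linkedin", ["linkedin.com"]],
  ["reddit", ["reddit.com", "redd.it"]],
];

// Match the domain itself or any subdomain, so "www.x.com" matches "x.com"
// but "netflix.com" does not.
function hostMatches(host: string, domain: string): boolean {
  return host === domain || host.endsWith("." + domain);
}

function detectSourceType(input: string): SourceType {
  let url: URL;
  try {
    url = new URL(input.trim());
  } catch (_err) {
    return "text"; // not parseable as a URL: treat as plain text
  }
  // Inputs like "Note: buy milk" parse as a URL with a custom scheme,
  // so only http(s) counts as a web source here.
  if (url.protocol !== "http:" && url.protocol !== "https:") return "text";

  const host = url.hostname.toLowerCase();
  for (const [type, domains] of HOST_RULES) {
    if (domains.some((d) => hostMatches(host, d))) return type;
  }

  // Assumed precedence: known hosts first, then file extensions.
  const path = url.pathname.toLowerCase();
  if (path.endsWith(".pdf")) return "pdf";
  if (/\.(png|jpg|webp)$/.test(path)) return "image";
  return "article"; // any other URL
}
```

Calling `detectSourceType("https://youtu.be/abc")` yields `"youtube"`, while a string that is not an http(s) URL falls through to `"text"`.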
Architecture
Dump runs as a Next.js app in the HYVE monorepo. Content processing flows through:
Extraction
Source-specific extractors pull content from URLs based on auto-detected source type.
AI Processing
Gemini categorizes content, assigns tags/subcategories, and generates vector embeddings.
Storage
Supabase stores items with full-text indexes and pgvector for semantic search.
Retrieval
Hybrid search combines text matching with semantic similarity for relevant results.
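To make the retrieval step concrete, here is one common way to blend a full-text score with vector similarity. It is a sketch under stated assumptions, not Dump's actual ranking: the `Candidate` shape, the `hybridRank` function, the `textWeight` parameter, and the linear weighting are hypothetical, and the text score is assumed to be already normalized to the range 0 to 1.

```typescript
// Hypothetical shape of a search candidate; field names are assumptions.
interface Candidate {
  id: string;
  textScore: number;   // full-text relevance, assumed normalized to [0, 1]
  embedding: number[]; // stored vector for the item
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / Math.sqrt(normA * normB);
}

// Linear blend of the two signals, sorted best first.
// textWeight = 1 ranks by text only; textWeight = 0 ranks by vectors only.
function hybridRank(
  queryEmbedding: number[],
  candidates: Candidate[],
  textWeight: number = 0.5,
): Array<{ id: string; score: number }> {
  return candidates
    .map((c) => ({
      id: c.id,
      score:
        textWeight * c.textScore +
        (1 - textWeight) * cosineSimilarity(queryEmbedding, c.embedding),
    }))
    .sort((x, y) => y.score - x.score);
}
```

In a Supabase setup this blending would more likely happen inside a SQL function next to the full-text index and pgvector column; the in-memory version only illustrates how the two scores interact.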
API Endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/ingest | Ingest a URL, text, or image |
| GET | /api/items | List all items (paginated) |
| GET | /api/items/:id | Get a single item |
| GET | /api/items/related | Find related items |
| GET | /api/items/counts | Get item counts by category |
| GET | /api/search | Hybrid text + semantic search |
| POST | /api/rag | Ask questions (RAG) |
| GET | /api/categories | List all categories |
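A small client sketch shows how the ingest and search endpoints might be called. Only the methods and paths come from the table above. The JSON body fields (`url`, `text`), the query parameters (`q`, `limit`), and the helper names are assumptions for illustration, so check the route handlers for the real request shapes.

```typescript
// Assumed payload: either a URL or raw text. The real schema may differ.
type IngestPayload = { url: string } | { text: string };

interface JsonRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build a POST /api/ingest request without sending it.
function buildIngestRequest(baseUrl: string, payload: IngestPayload): JsonRequest {
  return {
    url: new URL("/api/ingest", baseUrl).toString(),
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
    },
  };
}

// Build a GET /api/search URL; "q" and "limit" are hypothetical parameter names.
function buildSearchUrl(baseUrl: string, query: string, limit: number = 10): string {
  const url = new URL("/api/search", baseUrl);
  url.searchParams.set("q", query);
  url.searchParams.set("limit", String(limit));
  return url.toString();
}

// Send an ingest request and return the parsed JSON response.
async function ingest(baseUrl: string, payload: IngestPayload): Promise<unknown> {
  const req = buildIngestRequest(baseUrl, payload);
  const res = await fetch(req.url, req.init);
  if (!res.ok) throw new Error(`Ingest failed with status ${res.status}`);
  return res.json();
}
```

Separating request construction from the `fetch` call keeps the assumed shapes in one place, so they are easy to adjust once the actual API contract is known.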