Dump
Prerequisites
Dump is part of the HYVE monorepo. You need the full monorepo set up before running Dump.
- Node.js 22+
- pnpm 9+
- Supabase project (for database + auth + storage)
- Google AI API key (Gemini, for categorization and embeddings)
Environment Variables
Never commit your .env.local file. It contains secrets that should stay local.
Create apps/dump/.env.local with:
```
NEXT_PUBLIC_SUPABASE_URL=https://your-project.supabase.co
NEXT_PUBLIC_SUPABASE_ANON_KEY=your-anon-key
SUPABASE_SERVICE_ROLE_KEY=your-service-role-key
GOOGLE_AI_API_KEY=your-gemini-api-key
```

Installation
Clone the monorepo:

```
git clone https://github.com/THE-HYVE-COMPANY/hyve-os.git
cd hyve-os
```

Install dependencies:

```
pnpm install
```

Your First Ingest
Once Dump is running, you can ingest content via the UI or the API.
Via API
```
curl -X POST http://localhost:3106/api/ingest \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/article"}'
```

The ingest endpoint accepts three input types:
```
{
  url?: string     // URL to extract content from
  text?: string    // Plain text to store directly
  image?: string   // Image URL or base64
  force?: boolean  // Re-ingest even if URL exists
}
```

At least one of url, text, or image must be provided. The source type is auto-detected from the URL pattern.
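The at-least-one-of rule can be sketched as a small client-side check before posting to the endpoint. This is a sketch, not Dump's actual validation code: the field names come from the schema above, but the `validateIngestInput` helper is hypothetical.

```typescript
// Mirrors the ingest request shape shown above.
interface IngestRequest {
  url?: string;
  text?: string;
  image?: string;
  force?: boolean;
}

// Hypothetical helper: returns an error message, or null when the
// request satisfies the at-least-one-of rule.
function validateIngestInput(req: IngestRequest): string | null {
  if (!req.url && !req.text && !req.image) {
    return "Provide at least one of: url, text, image";
  }
  return null;
}
```

For example, `validateIngestInput({})` returns the error message, while `validateIngestInput({ url: "https://example.com/article" })` returns `null` and the request can be sent.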
Via UI
- Open Dump in your browser
- Paste a URL into the input field
- Click Ingest — Dump auto-detects the source type
- Watch real-time progress via SSE streaming
- Once complete, the item appears in your collection
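The real-time progress in step 4 arrives as server-sent events. A minimal sketch of handling them, assuming a JSON payload with a `stage` field (the actual event shape and endpoint path may differ):

```typescript
// Assumed progress payload; Dump's actual event shape may differ.
interface IngestProgress {
  stage: string;
  message?: string;
}

// Parse one raw SSE "data:" line into a progress event, or null if
// it is not a data line (comments, blank keep-alive lines, etc.).
function parseSseData(line: string): IngestProgress | null {
  if (!line.startsWith("data:")) return null;
  return JSON.parse(line.slice(5).trim()) as IngestProgress;
}

// In the browser, EventSource handles the framing for you
// (the "/api/ingest/progress" path here is hypothetical):
//   const es = new EventSource("/api/ingest/progress");
//   es.onmessage = (e) => console.log(JSON.parse(e.data).stage);
```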
What Happens During Ingestion
When you ingest content, Dump runs through this pipeline:
- Detection — URL pattern determines source type (twitter, youtube, article, etc.)
- Extraction — Source-specific extractor pulls title, content, author, and media
- Categorization — Gemini assigns a category, subcategories, and tags
- Embedding — Content is vectorized for semantic search
- Storage — Item is saved to Supabase with full-text and vector indexes
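The detection step above can be sketched as a host-pattern lookup. The source types named here come from the pipeline description; the specific host patterns and the fallback to `article` are assumptions for illustration.

```typescript
type SourceType = "twitter" | "youtube" | "article";

// Hypothetical sketch of step 1: map well-known hosts to a source
// type, falling back to the generic article extractor.
function detectSourceType(url: string): SourceType {
  const host = new URL(url).hostname.replace(/^www\./, "");
  if (host === "twitter.com" || host === "x.com") return "twitter";
  if (host === "youtube.com" || host === "youtu.be") return "youtube";
  return "article";
}
```

Each detected type then selects the matching extractor in step 2, which is why a plain blog URL and a YouTube URL produce differently shaped items.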