Smart Document Ingestion for Recursive Hybrid RAG with Google Drive & Supabase
This n8n workflow automates the ingestion of documents from Google Drive into a Supabase vector database. It uses a Recursive Hybrid RAG strategy with OpenAI to contextualize document chunks, so that retrievals for your AI applications are more accurate and relevant.
About This Workflow
This workflow turns raw documents into context-aware chunks for your Retrieval-Augmented Generation (RAG) systems. It continuously monitors a designated Google Drive folder for new PDF or text files and automatically downloads and processes their content. The core of the workflow is the "Basic LLM Chain" node, which applies a Recursive RAG approach: OpenAI's gpt-4o-mini enriches each document chunk with broader context from the full document. Fragmented information, such as incomplete numbers or table entries, is corrected and integrated before embedding, which improves the quality and relevance of the embeddings stored in your Supabase vector database for AI search and response generation.
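The contextualization step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the workflow's actual prompt or node configuration: the prompt wording, the chunk size and overlap, and the names `split_into_chunks`, `build_context_prompt`, and `contextualize_chunks` are all hypothetical, and the `llm` argument stands in for a call to a chat model such as gpt-4o-mini.

```python
from typing import Callable, List

# Hypothetical prompt wording; the workflow's real prompt lives in the
# "Basic LLM Chain" node and may differ.
PROMPT_TEMPLATE = (
    "<document>\n{document}\n</document>\n\n"
    "<chunk>\n{chunk}\n</chunk>\n\n"
    "Write a succinct context that situates this chunk within the document "
    "and completes any truncated numbers or table entries."
)


def split_into_chunks(text: str, size: int = 400, overlap: int = 50) -> List[str]:
    """Split text into fixed-size character windows that overlap slightly."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
        start += size - overlap
    return chunks


def build_context_prompt(document: str, chunk: str) -> str:
    """Combine the full document and one chunk into a single prompt."""
    return PROMPT_TEMPLATE.format(document=document, chunk=chunk)


def contextualize_chunks(
    document: str,
    llm: Callable[[str], str],
    size: int = 400,
    overlap: int = 50,
) -> List[str]:
    """Prefix every chunk with the context the model generates for it."""
    enriched: List[str] = []
    for chunk in split_into_chunks(document, size, overlap):
        context = llm(build_context_prompt(document, chunk)).strip()
        # The enriched text (context + chunk) is what would be embedded.
        enriched.append(f"{context}\n\n{chunk}")
    return enriched
```

Passing the model as a plain callable keeps the sketch independent of any particular client library; in practice `llm` would wrap whatever chat completion call you use.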
Key Features
- Automated Google Drive Monitoring: Automatically triggers upon new file creation (PDF or text) in a specified Google Drive folder.
- Intelligent Document Processing: Downloads and converts Google Docs to plain text, handling various document types.
- Advanced Recursive Hybrid RAG: Utilizes an LLM to contextualize document chunks, correcting incomplete data and providing a succinct context for improved search retrieval.
- OpenAI LLM Integration: Leverages gpt-4o-mini for sophisticated chunk contextualization and OpenAI embeddings for vectorization.
- Supabase Vector Database Integration: Seamlessly stores enriched document chunks and their embeddings in a designated Supabase table for efficient RAG.
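As a rough sketch of the file handling listed above, the function below decides what to do with a new file based on its MIME type. The MIME type strings are the ones Google Drive uses for Google Docs, PDF, and plain text files; the function name `plan_ingestion` and the action labels are hypothetical and do not correspond to n8n node settings.

```python
# MIME type Google Drive assigns to native Google Docs files.
GOOGLE_DOC = "application/vnd.google-apps.document"

# File types that can be downloaded as-is (assumed from the PDF/text
# file types this listing mentions).
DIRECT_DOWNLOAD = {"application/pdf", "text/plain"}


def plan_ingestion(mime_type: str) -> str:
    """Return a hypothetical action label for a newly created file."""
    if mime_type == GOOGLE_DOC:
        # Native Google Docs have no binary content and must be exported.
        return "export_as_text"
    if mime_type in DIRECT_DOWNLOAD:
        return "download"
    # Anything else is outside the scope described in this listing.
    return "skip"
```

For example, `plan_ingestion("application/pdf")` returns `"download"`, while an image file would be skipped.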
How To Use
- Configure Google Drive Credentials: Set up your Google Drive OAuth2 API credentials. In the "Google Drive Trigger" node, select or create your credential and specify the "Folder To Watch" where new files will be uploaded.
- Set Up OpenAI Credentials: Provide your OpenAI API key for both the "OpenAI Chat Model" and "Embeddings OpenAI" nodes.
- Integrate Supabase: Configure your Supabase API credentials. In the "Supabase Vector Store" node, select your credential and specify the "Table Name" (e.g., documents) where vectors will be stored.
- Review LLM Chain Prompt: The "Basic LLM Chain" node contains a sophisticated prompt for contextualization. While pre-configured, you can review and adjust it if your specific document types require different contextualization rules.
- Test and Activate: Upload a PDF or text file to your specified Google Drive folder to test the workflow. Once satisfied, activate the workflow to enable continuous monitoring and ingestion.
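When testing, it can help to know roughly what a stored record looks like. The sketch below builds one row per enriched chunk. It assumes a table with `content`, `metadata`, and `embedding` columns, a layout commonly used for Supabase vector tables; your table may be defined differently, so check its schema. The function name `build_row` and the metadata keys `file_id` and `chunk_index` are hypothetical.

```python
from typing import Any, Dict, List


def build_row(
    content: str,
    embedding: List[float],
    file_id: str,
    chunk_index: int,
) -> Dict[str, Any]:
    """Assemble one record for the vector table (assumed column names)."""
    if not embedding:
        raise ValueError("embedding must not be empty")
    return {
        "content": content,  # enriched chunk text (context + chunk)
        "metadata": {  # hypothetical keys for tracing a chunk to its file
            "file_id": file_id,
            "chunk_index": chunk_index,
        },
        "embedding": embedding,  # vector returned by the embeddings model
    }
```

After uploading a test file, comparing a row in your table against this shape is a quick way to confirm that the text and its vector were both stored.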
Workflow JSON
{
  "id": "ab3f8839-4bd4-419f-b22e-5dc9ea68b60c",
  "name": "Smart Document Ingestion for Recursive Hybrid RAG with Google Drive & Supabase",
  "nodes": 5,
  "category": "Operations",
  "status": "active",
  "version": "1.0.0"
}
Note: This is a sample preview. The full workflow JSON contains node configurations, credentials placeholders, and execution logic.
About the Author
N8N_Community_Pick
Curator
Hand-picked high quality workflows from the global community.