Evaluate RAG Document Relevance for AI Quality
Ensure your RAG (Retrieval-Augmented Generation) applications deliver precise answers by automatically evaluating the relevance of retrieved documents. This n8n workflow calculates a critical metric, 'retrieved document relevance,' to validate that the information your AI system pulls from its knowledge base directly addresses user questions.
About This Workflow
This n8n workflow helps developers and product managers evaluate the retrieval step of their Retrieval-Augmented Generation (RAG) systems by checking whether the documents the AI application retrieves are actually relevant to the user's query. It reads test datasets from Google Sheets, uses OpenAI for embeddings, and automates the measurement of 'retrieved document relevance.' The resulting score lets you quantify and improve retrieval accuracy, tune your knowledge base, and make data-driven decisions. To save cost, the workflow calculates metrics only while an evaluation is running.
Key Features
- Automated Relevance Scoring: Automatically calculates a similarity score to determine how relevant retrieved documents are to a given question.
- Flexible Triggering: Supports evaluation via pre-defined test datasets from Google Sheets or real-time chat message inputs for continuous monitoring.
- Integrated Knowledge Base Management: Ingests, cleans, splits, and embeds documents from Google Sheets into a temporary vector store using OpenAI for comprehensive RAG testing.
- Cost-Optimized Evaluation: Intelligently checks if evaluation is active, reducing unnecessary processing and API costs when not in metric-gathering mode.
- Scalable with External Data: Easily connect to external datasets and APIs for dynamic testing and continuous improvement of your RAG applications.
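The relevance scoring described above can be sketched in a few lines. This is a minimal illustration, not the workflow's actual node logic: it assumes the similarity score is a cosine similarity between the question embedding and each retrieved document embedding, averaged over the retrieved documents. The function names and the toy vectors are hypothetical; real embeddings would come from the 'Embeddings OpenAI' node.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: a zero vector counts as "no similarity"
    return dot / (norm_a * norm_b)


def retrieved_document_relevance(
    question_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
) -> float:
    """Mean similarity between the question and each retrieved document.

    Averaging is an assumption made for this sketch; other aggregations
    (max, top-k mean) are equally plausible.
    """
    if not doc_vecs:
        return 0.0
    scores = [cosine_similarity(question_vec, d) for d in doc_vecs]
    return sum(scores) / len(scores)


# Toy 3-dimensional vectors standing in for real embeddings.
question = [1.0, 0.0, 0.0]
retrieved = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
relevance = retrieved_document_relevance(question, retrieved)
print(round(relevance, 3))  # prints 0.5
```

One retrieved document matches the question exactly (similarity 1.0) and the other is orthogonal to it (similarity 0.0), so the averaged relevance is 0.5. A low average like this would suggest that the retriever is returning off-topic documents alongside relevant ones.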
How To Use
- Configure Data Ingestion: Set up the 'Get dataset' (Google Sheets) node to point to your document source. Ensure your spreadsheet contains a column with the document content (e.g., document_text).
- Set Up Embeddings: Ensure your 'Embeddings OpenAI' node is correctly configured with your OpenAI API credentials.
- Define Your Test Data: For evaluation, configure the 'When fetching a dataset row' node to link to a Google Sheet containing your test questions (e.g., in a question column). This dataset will be used to trigger evaluation runs.
- Connect Your RAG Logic: Integrate your RAG chain (vector store search, LLM call) between the input (chat or dataset question) and the 'Set metrics' node. The workflow expects a message.content.score output from your RAG chain for the similarity metric.
- Enable/Disable Evaluation: The 'Evaluating?' node allows you to control when metrics are calculated. For live use, bypass the 'Set metrics' branch; for evaluation, ensure this branch is active.
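The last two steps can be pictured as a small gate in front of the metric. The sketch below is an assumption about the shape of that logic, not the workflow's real node code: `extract_score`, `set_metrics`, and the metric name `similarity` are hypothetical, and only the `message.content.score` path comes from the description above.

```python
from typing import Any, Dict, Optional


def extract_score(rag_output: Dict[str, Any]) -> Optional[float]:
    """Read message.content.score from a RAG chain output, if present."""
    node: Any = rag_output
    for key in ("message", "content", "score"):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    # Reject booleans explicitly, since bool is a subclass of int in Python.
    if isinstance(node, bool) or not isinstance(node, (int, float)):
        return None
    return float(node)


def set_metrics(rag_output: Dict[str, Any], evaluating: bool) -> Dict[str, float]:
    """Return metrics only while an evaluation run is active.

    Mirrors the idea of the 'Evaluating?' check: outside evaluation,
    no metric work is done and an empty dict is returned.
    """
    if not evaluating:
        return {}
    score = extract_score(rag_output)
    if score is None:
        raise ValueError("RAG output has no numeric message.content.score")
    return {"similarity": score}  # hypothetical metric name


sample_output = {"message": {"content": {"score": 0.82}}}
print(set_metrics(sample_output, evaluating=True))   # {'similarity': 0.82}
print(set_metrics(sample_output, evaluating=False))  # {}
```

Failing loudly when the score is missing during an evaluation run is a design choice for this sketch: a silent default would hide a misconnected RAG chain and skew the averaged metric.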
Workflow JSON
{
"id": "371e9acc-07da-4217-9ea5-e001d0704f77",
"name": "Evaluate RAG Document Relevance for AI Quality",
"nodes": 18,
"category": "Operations",
"status": "active",
"version": "1.0.0"
}
Note: This is a sample preview. The full workflow JSON contains node configurations, credentials placeholders, and execution logic.
About the Author
DevOps_Master_X
Infrastructure Expert
Specializing in CI/CD pipelines, Docker, and Kubernetes automations.