Effortlessly Compare LLM Performance Side-by-Side in Google Sheets
Streamline your LLM selection process by directly comparing outputs from OpenAI and other providers within a familiar Google Sheet. This workflow automates the side-by-side evaluation of AI model responses, making it easier to choose the best fit for your project.
About This Workflow
Choosing the right Large Language Model (LLM) is difficult, in part because model outputs are non-deterministic: the same prompt can produce different answers on different runs. This n8n workflow simplifies evaluation by letting you compare outputs from different LLMs, such as several OpenAI models or models available through providers like OpenRouter, directly within a Google Sheet. The workflow receives a chat message, duplicates it for each LLM, and then logs the user's input, each model's response, and the conversation context into a designated Google Sheet. The result is a structured dataset suited to manual review, team collaboration, or automated evaluation by a more capable LLM, so you can see side by side which model performs best for your specific needs.
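The fan-out described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the workflow's actual node logic: the function names, the row keys, and the `fake_ask` stand-in for a real model call are all assumptions made for the example.

```python
from typing import Callable, Dict, List


def fan_out(chat_input: str, models: List[str]) -> List[Dict[str, str]]:
    """Duplicate one chat message into one item per model."""
    return [{"model": model, "chatInput": chat_input} for model in models]


def compare(
    chat_input: str,
    models: List[str],
    ask: Callable[[str, str], str],
    context: str = "",
) -> Dict[str, str]:
    """Build one sheet row: the user input, the context, and one column per model."""
    # "input" and "context" are hypothetical column names, not the template's.
    row = {"input": chat_input, "context": context}
    for item in fan_out(chat_input, models):
        row[item["model"]] = ask(item["model"], item["chatInput"])
    return row


def fake_ask(model: str, prompt: str) -> str:
    """Stand-in for a real LLM call so the sketch runs offline."""
    return f"[{model}] reply to: {prompt}"


row = compare("Hello", ["openai/gpt-4.1", "mistralai/mistral-large"], fake_ask)
for column, value in row.items():
    print(f"{column}: {value}")
```

Keeping one row per user message, with one column per model, is what makes the sheet readable side by side.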
Key Features
- Side-by-Side LLM Comparison: Directly compare the outputs of two or more LLMs for the same prompt.
- Google Sheets Integration: Automatically log all conversation data, including user input and model responses, into a Google Sheet for easy analysis.
- Contextual Memory: Each LLM maintains its own conversational memory, ensuring context is preserved during individual evaluations.
- Flexible Model Selection: Easily configure and switch between different LLMs available through providers like OpenRouter or OpenAI.
- Streamlined Evaluation: Reduce the manual effort required to test and select the optimal LLM for your AI applications.
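To illustrate the per-model memory feature, here is a minimal sketch of how conversation history could be kept separate for each model. It assumes that each model's session ID is derived from a shared base ID plus the model name; that derivation, the class name, and the method names are hypothetical and are not taken from the workflow itself.

```python
from typing import Dict, List, Tuple


class PerModelMemory:
    """Keeps one independent message history per model."""

    def __init__(self, session_id_base: str) -> None:
        self.session_id_base = session_id_base
        self._history: Dict[str, List[Tuple[str, str]]] = {}

    def session_id(self, model: str) -> str:
        # Assumed scheme: base ID plus model name gives each model its own session.
        return f"{self.session_id_base}:{model}"

    def append(self, model: str, role: str, text: str) -> None:
        self._history.setdefault(self.session_id(model), []).append((role, text))

    def context(self, model: str) -> str:
        """Return the history for one model as plain text."""
        turns = self._history.get(self.session_id(model), [])
        return "\n".join(f"{role}: {text}" for role, text in turns)


memory = PerModelMemory("abc123")
memory.append("openai/gpt-4.1", "user", "Hi")
memory.append("openai/gpt-4.1", "assistant", "Hello from model A")
memory.append("mistralai/mistral-large", "user", "Hi")
print(memory.context("openai/gpt-4.1"))
```

Because the histories never mix, one model's earlier answers cannot influence the other model's later replies.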
How To Use
- Duplicate Google Sheet Template: Make a copy of the provided Google Sheets template to start logging your comparisons.
- Configure LLM Providers: Set up your credentials for your chosen LLM providers (e.g., OpenAI, OpenRouter) within n8n.
- Define Models for Comparison: In the "Define Models to Compare" sticky note (or the "Set model, sessionId, chatInput, sessionIdBase" node), specify the exact model IDs you wish to evaluate (e.g., ["openai/gpt-4.1", "mistralai/mistral-large"]).
- Customize System Prompt & Tools: Adjust the "AI Agent" node's system prompt and any associated tools to align with your specific use case and evaluation criteria.
- Activate Workflow: Trigger the workflow by sending a message to the chat interface defined in the "When chat message received" node.
- Review & Evaluate: Monitor the Google Sheet for logged responses and perform your analysis. Optionally, extend the workflow to automate evaluation with a more powerful LLM.
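For the optional automated evaluation step, one possible approach is to turn each logged row into a prompt for a judge model and then parse its verdict. The sketch below assumes a row with the hypothetical keys `input`, `model_a`, `answer_a`, `model_b`, and `answer_b`; the actual column names in your copy of the sheet may differ, and the call to the judge model itself is left out.

```python
from typing import Dict


def build_judge_prompt(row: Dict[str, str]) -> str:
    """Format one logged comparison as a prompt for a judge model."""
    lines = [
        "You are comparing two answers to the same user message.",
        f"User message: {row['input']}",
        f"Answer A ({row['model_a']}): {row['answer_a']}",
        f"Answer B ({row['model_b']}): {row['answer_b']}",
        "Reply with exactly one of: A, B, TIE.",
    ]
    return "\n".join(lines)


def parse_verdict(reply: str) -> str:
    """Normalize the judge's reply; anything unexpected is flagged, not guessed."""
    verdict = reply.strip().upper()
    return verdict if verdict in {"A", "B", "TIE"} else "UNPARSED"


# Hypothetical row, shaped like one line of the comparison sheet.
example_row = {
    "input": "Summarize our refund policy in one sentence.",
    "model_a": "openai/gpt-4.1",
    "answer_a": "Refunds are available within 30 days.",
    "model_b": "mistralai/mistral-large",
    "answer_b": "You can get your money back.",
}
prompt = build_judge_prompt(example_row)
print(prompt)
```

Flagging unparseable verdicts rather than defaulting to a winner keeps the automated results honest and leaves those rows for manual review.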
Workflow JSON
{
"id": "0fc1301c-8f9b-4cc4-91e8-ad20d5418140",
"name": "Effortlessly Compare LLM Performance Side-by-Side in Google Sheets",
"nodes": 12,
"category": "Marketing",
"status": "active",
"version": "1.0.0"
}
Note: This is a sample preview. The full workflow JSON contains node configurations, credentials placeholders, and execution logic.
About the Author
DevOps_Master_X
Infrastructure Expert
Specializing in CI/CD pipelines, Docker, and Kubernetes automations.