AI Vision-Based Web Scraper with Google Sheets, ScrapingBee, and Gemini
Automate web scraping using AI vision. This workflow leverages ScrapingBee for screenshots and Google Gemini for data extraction, outputting results to Google Sheets.
About This Workflow
Overview
This n8n workflow scrapes data from websites using a vision-based AI agent. It first reads a list of URLs from a Google Sheet. For each URL, it uses ScrapingBee to capture a full-page screenshot, then passes the screenshot to a Google Gemini model that is prompted to extract specific fields (such as product title, price, and brand). If visual extraction fails, the agent can fall back to an HTML scraping tool. Finally, an Output Parser structures the extracted data, which is saved back to a Google Sheet.
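The vision-first, HTML-fallback flow described in the overview can be sketched in plain Python. This is only an illustration of the control flow, not the workflow's actual implementation: the three callables and the field names (`product_title`, `price`, `brand`) are hypothetical stand-ins, and the explicit completeness check is a simplification, since in the workflow the AI agent itself decides when to use the HTML tool.

```python
from typing import Callable, Dict, Optional

# Hypothetical field names, based on the examples given in the overview.
REQUIRED_FIELDS = ("product_title", "price", "brand")

Record = Dict[str, Optional[str]]


def is_complete(record: Record) -> bool:
    """Return True when every required field has a non-empty value."""
    return all(record.get(field) for field in REQUIRED_FIELDS)


def scrape_url(
    url: str,
    take_screenshot: Callable[[str], bytes],
    extract_from_image: Callable[[bytes], Record],
    extract_from_html: Callable[[str], Record],
) -> Record:
    """Try visual extraction first; use the HTML tool only to fill gaps."""
    record = extract_from_image(take_screenshot(url))
    if not is_complete(record):
        fallback = extract_from_html(url)
        # Keep values found visually and fill missing ones from the HTML result.
        record = {f: record.get(f) or fallback.get(f) for f in REQUIRED_FIELDS}
    return {"url": url, **record}
```

Passing the three steps in as functions keeps the sketch independent of any particular screenshot service or model client, so each step can be swapped or stubbed out separately.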
Key Features
- Vision-based AI scraping using Google Gemini's multimodal capabilities.
- Utilizes ScrapingBee for capturing full-page screenshots.
- Includes a fallback HTML scraping tool for cases where visual extraction fails.
- Structured output parsing for easy data integration.
- Seamless integration with Google Sheets for input URLs and output results.
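To illustrate the structured output feature listed above, here is a minimal Python sketch that normalizes a model's JSON reply into rows with a fixed set of columns. The example schema and its field names are assumptions based on the fields mentioned in the overview; the real 'Structured Output Parser' node is configured inside n8n and may behave differently.

```python
import json
from typing import Dict, List

# Hypothetical example, in the spirit of the parser's jsonSchemaExample.
SCHEMA_EXAMPLE = [
    {"product_title": "Example product", "price": "19.99", "brand": "Example brand"}
]


def parse_model_output(text: str) -> List[Dict[str, str]]:
    """Parse the model's JSON reply into rows containing only the expected keys."""
    data = json.loads(text)
    if isinstance(data, dict):
        data = [data]  # accept a single object as a one-row result
    expected = list(SCHEMA_EXAMPLE[0].keys())
    rows = []
    for item in data:
        if not isinstance(item, dict):
            raise ValueError("each extracted item must be a JSON object")
        # Missing or null fields become empty strings; unexpected keys are dropped.
        row = {}
        for key in expected:
            value = item.get(key)
            row[key] = "" if value is None else str(value)
        rows.append(row)
    return rows
```

Rows shaped this way map directly onto spreadsheet columns, which is why a fixed schema makes the final write to the 'Results' sheet straightforward.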
How To Use
- Set up Google Sheets: Create a Google Sheets spreadsheet with a sheet named 'List of URLs' containing the URLs you want to scrape. A second sheet named 'Results' will be populated with the scraped data.
- Configure ScrapingBee: Obtain an API key from ScrapingBee and replace <your_scrapingbee_apikey> in the 'ScrapingBee - Get page screenshot' node.
- Configure Google Gemini: Connect your Google Gemini API credentials in the 'Google Gemini Chat Model' node.
- Customize AI Prompts: Adjust the system and user prompts within the 'Vision-Based Scraping AI Agent' node to define the data you want to extract and how.
- Define Output Structure: Modify the jsonSchemaExample in the 'Structured Output Parser' node to match the data fields you intend to extract.
- Trigger Workflow: Manually trigger the workflow using the 'Test workflow' button or set up a different trigger of your choice.
Workflow JSON
{
"id": "a0070a12-a04d-47e5-b5de-673f7e615f75",
"name": "AI Vision-Based Web Scraper with Google Sheets, ScrapingBee, and Gemini",
"nodes": 0,
"category": "Web Scraping & AI Automation",
"status": "active",
"version": "1.0.0"
}
Note: This is a sample preview. The full workflow JSON contains node configurations, credential placeholders, and execution logic.
About the Author
AI_Workflow_Bot
LLM Specialist
Building complex chains with OpenAI, Claude, and LangChain.