Generate Structured Metadata from Trustpilot Reviews
Scrape Trustpilot reviews, process them, store them in a vector database, and generate structured metadata for analysis.
About This Workflow
This workflow automates the process of gathering customer feedback from Trustpilot. It scrapes reviews for a specified company, extracts key information, stores these reviews in a Qdrant vector database for similarity search, and then uses clustering to identify common themes and sentiments. Finally, it exports processed data to Google Sheets and can trigger further analysis workflows.
Key Features
- Automated Scraping: Fetches Trustpilot reviews for a given company.
- Data Extraction: Parses review data including author, rating, text, dates, and country.
- Vector Storage: Stores reviews and their embeddings in Qdrant for efficient similarity search.
- Text Splitting & Embedding: Prepares review text for embedding and processing.
- Clustering Analysis: Groups similar reviews using vector embeddings to identify common feedback patterns.
- Data Export: Outputs processed insights and raw reviews to Google Sheets.
- Dynamic Date Filtering: Allows filtering reviews by a specific date range.
- Workflow Triggering: Enables calling this workflow as a sub-workflow or a trigger.
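The date filtering and company lookup described above can be sketched as follows. This is a minimal illustration, not the workflow's actual node configuration: the helper names `month_bounds` and `build_review_filter` and the payload keys `metadata.company_id` and `metadata.review_date` are assumptions. The dictionary follows the general shape of a Qdrant filter (`must` with `match` and `range` conditions), using the current month as the default range.

```python
from datetime import date


def month_bounds(today: date) -> tuple[date, date]:
    """Return the first day of today's month and the first day of the next month."""
    start = today.replace(day=1)
    # Roll over to January of the following year when the month is December.
    if start.month == 12:
        end = start.replace(year=start.year + 1, month=1)
    else:
        end = start.replace(month=start.month + 1)
    return start, end


def build_review_filter(company_id: str, today: date) -> dict:
    """Build a Qdrant-style filter: match one company, restrict to the current month."""
    start, end = month_bounds(today)
    return {
        "must": [
            # Hypothetical payload keys; the real workflow's keys may differ.
            {"key": "metadata.company_id", "match": {"value": company_id}},
            {
                "key": "metadata.review_date",
                # Half-open range: from the 1st of this month up to, not including, the 1st of the next.
                "range": {"gte": start.isoformat(), "lt": end.isoformat()},
            },
        ]
    }
```

Using a half-open range (`gte` start, `lt` next month) avoids having to compute the last day of each month.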
How To Use
- Trigger the workflow: This workflow can be triggered manually by clicking 'Test workflow' or via a workflow trigger.
- Set Company ID: The 'Set Variables' node (ID: f0ea6b63-c96d-4b3f-8a21-d0f2dbb4efc3) defines the companyId for which reviews will be scraped. Ensure this is set correctly.
- Initial Data Cleaning: The 'Clear Existing Reviews' node (ID: 1f60c3a5-a47a-4313-9b29-8ea652d573f7) ensures a clean slate in the Qdrant database for the specified company.
- Scrape Reviews: The 'Get TrustPilot Page' node (ID: 139ccadd-9135-4681-b2eb-403b8d8bd710) scrapes reviews from Trustpilot. The maxRequests: 3 parameter limits scraping to the most recent 3 pages.
- Extract Review Data: The 'Extract Reviews' node (ID: 9290e116-c001-49d5-ae4c-d91cd246f2c2) parses the HTML content to extract individual review details.
- Structure Review Data: The 'Zip Entries' node (ID: 00de989c-d9e9-4b42-b5db-7097800a6017) consolidates extracted review fields into a structured array.
- Store in Vector Database: The 'Qdrant Vector Store' node (ID: a4f82a1b-5a76-46b6-a7a3-84ab09b46699) inserts the structured review data and their generated embeddings into the 'trustpilot_reviews' collection in Qdrant.
- Find Reviews for Analysis: The 'Find Reviews' node (ID: 85cb48b1-0ab9-4f88-88f3-82fcfb041ebe) retrieves reviews from Qdrant based on companyId and a specified date range (current month by default).
- Cluster Reviews: The workflow then proceeds to cluster these reviews to identify common themes and points of interest. This involves fetching points from Qdrant ('Get Payload of Points', 'Clusters To List', 'Only Clusters With 3+ points'), likely using a code node (not explicitly shown but implied by Sticky Note4) to perform the clustering algorithm.
- Prepare Output: The 'Prep Output For Export' node (ID: 69bbd197-c78f-4dae-9300-fe23d4d49855) formats the clustered insights and raw review data for export.
- Export to Sheets: The 'Export To Sheets' node (ID: d77daa23-6acf-4daa-bf4c-33da4d05a54c) appends the prepared data to a Google Sheet.
- Trigger Sub-workflow (Optional): The 'Trigger Insights' node (ID: 61c3117c-757c-45dd-b9d5-1122b793be30) can be used to trigger another workflow for further analysis using the generated insights.
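Two of the steps above, 'Zip Entries' and 'Only Clusters With 3+ points', are simple data transformations that can be illustrated in a few lines. The sketch below is an assumption about what those nodes do, written in Python rather than taken from the workflow itself: the function names, the per-field list input, and the convention that a negative label marks an unclustered point are all hypothetical. The clustering algorithm that produces the labels is not shown in the source and is not reproduced here.

```python
from collections import defaultdict


def zip_entries(fields: dict[str, list]) -> list[dict]:
    """Turn parallel per-field lists (authors, ratings, ...) into one dict per review."""
    keys = list(fields)
    # zip stops at the shortest list, so a field with missing values truncates the result.
    return [dict(zip(keys, values)) for values in zip(*fields.values())]


def clusters_with_min_points(labels: list[int], min_points: int = 3) -> dict[int, list[int]]:
    """Group point indices by cluster label, keeping clusters with at least min_points members."""
    groups: dict[int, list[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        # Assumption: negative labels mark noise or unclustered points and are skipped.
        if label >= 0:
            groups[label].append(index)
    return {label: members for label, members in groups.items() if len(members) >= min_points}
```

Here `zip_entries` takes one list per extracted field and returns one record per review, and `clusters_with_min_points` takes one cluster label per review and returns only the groups large enough to count as a recurring theme.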
Workflow JSON
{
"id": "fef39898-009f-4560-9956-dd098279b599",
"name": "Generate Structured Metadata from Trustpilot Reviews",
"nodes": 13,
"category": "Data Processing",
"status": "active",
"version": "1.0.0"
}
Note: This is a sample preview. The full workflow JSON contains node configurations, credentials placeholders, and execution logic.
About the Author
SaaS_Connector
Integration Guru
Connecting CRM, Notion, and Slack to automate your life.