DNB Company Search & Extract with Bright Data and OpenAI 4o mini
Automates DNB company data extraction using Bright Data's MCP Client and OpenAI for structured output in English and Chinese.
About This Workflow
This workflow leverages Bright Data's MCP Client and OpenAI's GPT-4o mini to search for companies on DNB, scrape their web pages, and extract structured company profile information. The extracted data can be saved to disk or sent via webhook.
Note: This template requires a self-hosted n8n instance because it uses the community node 'MCP Client'.
Key Features
- DNB Company Search: Utilizes Bright Data's MCP Client to perform searches on DNB.
- Web Scraping: Employs Bright Data's scrape_as_markdown tool to extract content from DNB company URLs.
- LLM for URL Extraction: Uses OpenAI GPT-4o mini to identify and extract relevant DNB company URLs from search results.
- Structured Data Extraction: Employs OpenAI GPT-4o mini with a detailed JSON schema to extract structured company profile data.
- Multi-language Support: Designed to generate structured metadata that can be used for English and Chinese outputs (though the current LLM prompt is English-centric for extraction).
- File Output: Option to write the structured JSON data to a local file.
- Webhook Notification: Ability to send the extracted structured data to a specified webhook URL.
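The structured-extraction feature relies on a JSON schema that this page does not reproduce. As a minimal sketch, assuming hypothetical field names (`company_name`, `website`, `industry`) that are not taken from the template, the LLM's reply could be checked before it is written to disk or sent to a webhook:

```python
import json

# Hypothetical required fields and types; the template's actual schema
# is not shown on this page, so these names are assumptions.
REQUIRED_FIELDS = {"company_name": str, "website": str, "industry": str}


def parse_company_profile(raw_reply: str) -> dict:
    """Parse an LLM reply and verify it carries the assumed required fields."""
    profile = json.loads(raw_reply)
    if not isinstance(profile, dict):
        raise ValueError("expected a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in profile:
            raise ValueError(f"missing field: {field}")
        if not isinstance(profile[field], expected_type):
            raise ValueError(f"field {field} should be {expected_type.__name__}")
    return profile
```

A check like this fails early on a malformed reply instead of passing incomplete data to the file or webhook step.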
How To Use
- Prerequisites: Ensure you have a self-hosted n8n instance and the necessary OpenAI API credentials configured.
- Credentials: Set up your 'MCP Client (STDIO) account' and 'OpenAi account' credentials in n8n.
- Workflow Trigger: Click 'Test workflow' to initiate the process.
- Input Fields: Update the search parameter in the 'Set input fields' node with your desired DNB company search query (e.g., "dnb starbucks url").
- Webhook URL: Configure the webhook_notification_url in the 'Set input fields' node if you intend to use webhook notifications.
- Execution: The workflow will:
- Search for company information using Bright Data's MCP Client.
- Extract company URLs using OpenAI.
- Scrape the DNB page content using Bright Data.
- Extract structured company profile data using OpenAI.
- Optionally, save the structured data to a file (d:\DNB_Info.json).
- Optionally, send the structured data via a webhook notification.
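The execution order above can be sketched outside n8n as a plain function. This is an illustration under stated assumptions, not the workflow's actual node logic: the four callables (`search_fn`, `extract_url_fn`, `scrape_fn`, `extract_profile_fn`) are hypothetical stand-ins for the Bright Data MCP Client and OpenAI steps, and only the `search` and `webhook_notification_url` names come from the 'Set input fields' node.

```python
import json
import urllib.request
from pathlib import Path
from typing import Callable, Optional


def run_dnb_pipeline(
    search: str,
    search_fn: Callable[[str], str],            # hypothetical: Bright Data search
    extract_url_fn: Callable[[str], str],       # hypothetical: LLM URL extraction
    scrape_fn: Callable[[str], str],            # hypothetical: scrape_as_markdown call
    extract_profile_fn: Callable[[str], dict],  # hypothetical: LLM structured extraction
    output_path: Optional[Path] = None,
    webhook_notification_url: Optional[str] = None,
) -> dict:
    """Run the four extraction steps, then the two optional output steps."""
    search_results = search_fn(search)
    company_url = extract_url_fn(search_results)
    page_markdown = scrape_fn(company_url)
    profile = extract_profile_fn(page_markdown)

    # Optional: write the structured data to a local file.
    if output_path is not None:
        output_path.write_text(
            json.dumps(profile, ensure_ascii=False, indent=2), encoding="utf-8"
        )

    # Optional: POST the structured data to the configured webhook.
    if webhook_notification_url:
        request = urllib.request.Request(
            webhook_notification_url,
            data=json.dumps(profile).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(request, timeout=30) as response:
            response.read()

    return profile
```

Passing the steps in as callables keeps the sketch runnable with stubs, so the ordering and the optional outputs can be exercised without any credentials.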
Workflow JSON
{
"id": "b4ad5933-22f8-4933-bf7e-916dbe10ffed",
"name": "DNB Company Search & Extract with Bright Data and OpenAI 4o mini",
"nodes": 5,
"category": "Data Extraction",
"status": "active",
"version": "1.0.0"
}
Note: This is a sample preview. The full workflow JSON contains node configurations, credentials placeholders, and execution logic.
About the Author
N8N_Community_Pick
Curator
Hand-picked high quality workflows from the global community.