Automated Hacker News Comment Ingestion for AI Applications
This n8n workflow extracts, flattens, and vectorizes the comments of a specified Hacker News story ID. It embeds the comment text with OpenAI and stores the vectors in a Qdrant Vector Database, producing a dataset for AI applications such as semantic search or Retrieval-Augmented Generation (RAG).
About This Workflow
Hacker News comment threads are deeply nested, which makes them awkward to analyze directly. This n8n workflow fetches a specific Hacker News story together with its nested comments and flattens the hierarchy into a flat, searchable list. Each comment's text is converted into a vector embedding using OpenAI, enriched with metadata (author, story ID, comment ID), and stored in a Qdrant Vector Database. The resulting collection can serve as the foundation for semantic search, content recommendations, or Retrieval-Augmented Generation (RAG) systems.
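To make the flattening step concrete, here is a minimal Python sketch of one way to do it outside n8n. It is not the workflow's own node code. It assumes the story arrives as a nested dictionary in which each node carries `id`, `author`, `text`, and a `children` list (the shape returned by the Algolia Hacker News items endpoint); the function name and the `parent_id` and `depth` output fields are illustrative additions.

```python
from typing import Any, Dict, List


def flatten_comments(story: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten a nested comment tree into a list, in depth-first order."""
    story_id = story["id"]
    flat: List[Dict[str, Any]] = []
    # Stack entries are (node, parent_id, depth). Children are pushed in
    # reverse so they are popped in their original order.
    stack = [(child, story_id, 1) for child in reversed(story.get("children", []))]
    while stack:
        node, parent_id, depth = stack.pop()
        text = node.get("text")
        if text:  # skip deleted or empty comments, but still visit replies
            flat.append({
                "comment_id": node["id"],
                "story_id": story_id,
                "parent_id": parent_id,
                "author": node.get("author"),
                "depth": depth,
                "text": text,
            })
        for child in reversed(node.get("children", [])):
            stack.append((child, node["id"], depth + 1))
    return flat
```

An explicit stack is used instead of recursion so that very deep threads cannot hit Python's recursion limit. Keeping `parent_id` and `depth` lets the thread structure be rebuilt later if needed.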
Key Features
- Automated Hacker News Comment Extraction: Pulls articles and all nested comments for a given story ID, ensuring comprehensive data capture.
- Intelligent Comment Flattening: Transforms complex, multi-level nested comment threads into a flat, digestible list, making data processing straightforward.
- Advanced OpenAI Embeddings: Generates high-quality vector representations of comment text using OpenAI's text-embedding-3-small model for semantic understanding.
- Robust Qdrant Vector Database Integration: Stores vectorized comments along with crucial metadata (author, story ID, comment ID) into a dedicated Qdrant collection for efficient similarity search.
- Idempotent Data Ingestion: Clears existing comments for a specific story before ingesting new ones, ensuring data freshness and preventing duplicates during updates.
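The clear-then-insert behavior described in the last feature can be sketched as a pair of Qdrant REST request bodies. This is an illustration under assumptions, not the workflow's actual configuration: it assumes each stored point keeps its metadata under a `metadata` payload key (so the story ID is addressable as `metadata.story_id`), that each flattened comment is a dictionary with `comment_id`, `story_id`, `author`, and `text` keys, and that the Hacker News comment ID doubles as the Qdrant point ID. The function names are hypothetical.

```python
from typing import Any, Dict, List, Sequence


def delete_request(story_id: int) -> Dict[str, Any]:
    """Body for POST /collections/hn_comments/points/delete.

    Assumption: the story ID is stored under payload key "metadata.story_id".
    """
    return {
        "filter": {
            "must": [{"key": "metadata.story_id", "match": {"value": story_id}}]
        }
    }


def upsert_request(
    comments: Sequence[Dict[str, Any]],
    vectors: Sequence[List[float]],
) -> Dict[str, Any]:
    """Body for PUT /collections/hn_comments/points."""
    if len(comments) != len(vectors):
        raise ValueError("each comment needs exactly one vector")
    points = []
    for comment, vector in zip(comments, vectors):
        points.append({
            "id": comment["comment_id"],  # integer IDs are valid Qdrant point IDs
            "vector": vector,
            "payload": {
                "content": comment["text"],
                "metadata": {
                    "author": comment["author"],
                    "story_id": comment["story_id"],
                    "comment_id": comment["comment_id"],
                },
            },
        })
    return {"points": points}
```

Sending the delete body before the upsert body for the same story means a re-run replaces that story's points instead of accumulating copies. The delete also removes comments that have disappeared from the thread since the previous run, which an upsert alone would leave behind.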
How To Use
- Set up Credentials: Ensure you have valid OpenAiApi and QdrantApi credentials configured in n8n.
- Define Story ID: In the "Set Variables" node, update the story_id value to the ID of the Hacker News story you wish to process.
- Configure Qdrant Collection: Verify the "Qdrant Vector Store" node is configured to the correct hn_comments collection. If the collection doesn't exist, it is typically created upon first insertion.
- Run Workflow: Execute the "When clicking ‘Test workflow’" node to start the ingestion process. The workflow will clear existing data for the specified story, fetch new comments, embed them, and store them in Qdrant.
- Extend Functionality (Optional): Connect the output of the "Qdrant Vector Store" node to a retrieval or LLM chain (like the disconnected OpenAI Chat Model and associated nodes) to query the ingested data.
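For readers who want to reproduce the embedding step of the run outside n8n, the following Python sketch batches comment texts into request bodies for OpenAI's embeddings endpoint. The model name is the one the workflow description names. The helper name and the default batch size are illustrative assumptions, and the sketch builds the request bodies only; it makes no network calls.

```python
from typing import Any, Dict, Iterator, List

EMBEDDING_MODEL = "text-embedding-3-small"  # model named in the workflow description


def embedding_requests(
    texts: List[str],
    batch_size: int = 100,  # illustrative default, not taken from the workflow
) -> Iterator[Dict[str, Any]]:
    """Yield one embeddings request body per batch of comment texts."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield {
            "model": EMBEDDING_MODEL,
            "input": texts[start:start + batch_size],
        }
```

Each yielded body would be sent as JSON in a POST to https://api.openai.com/v1/embeddings with an `Authorization: Bearer` header carrying the API key. Sending several texts per request keeps the number of calls low for stories with many comments.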
Workflow JSON
{
"id": "ee5d191d-79b3-4bd7-a30f-02a676a8e140",
"name": "Automated Hacker News Comment Ingestion for AI Applications",
"nodes": 20,
"category": "DevOps",
"status": "active",
"version": "1.0.0"
}
Note: This is a sample preview. The full workflow JSON contains node configurations, credentials placeholders, and execution logic.
About the Author
DevOps_Master_X
Infrastructure Expert
Specializing in CI/CD pipelines, Docker, and Kubernetes automations.