Generate Multilingual Image Captions and Overlay
This workflow generates captions for images using Google Gemini and overlays them onto the image, with support for multilingual output.
About This Workflow
This workflow demonstrates how to use n8n's LangChain integration with Google Gemini to generate descriptive captions for images. An editImage node then overlays each caption onto the original image at a calculated position. The workflow generates English captions by default and can be extended to other languages by changing the prompt and, if needed, the output schema.
Key Features
- Multimodal AI: Utilizes Google Gemini for image understanding and caption generation.
- Structured Output: Employs outputParserStructured to ensure consistent caption formatting.
- Image Manipulation: Uses the editImage node for resizing, drawing backgrounds, and applying text overlays.
- Dynamic Positioning: A code node calculates optimal text placement based on image dimensions and caption length.
- Extensible for Multilingual: The core logic can be adapted to generate captions in different languages by adjusting the AI prompts and potentially the output parser schema.
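To make the structured-output idea concrete, here is a minimal sketch of the kind of caption object the parser could enforce, plus a small check that a downstream step might run. The key names caption_title and caption_text and the helper validateCaption are assumptions for illustration; the workflow's actual schema may differ.

```javascript
// Hypothetical caption shape: the key names are assumptions, not the
// workflow's confirmed schema.
const exampleCaption = {
  caption_title: "Golden Hour",
  caption_text: "A lighthouse stands against an orange sky at dusk.",
};

// Returns a trimmed copy of the caption, or throws if a field is missing.
function validateCaption(value) {
  if (typeof value !== "object" || value === null) {
    throw new TypeError("caption must be an object");
  }
  for (const key of ["caption_title", "caption_text"]) {
    if (typeof value[key] !== "string" || value[key].trim() === "") {
      throw new TypeError(`caption is missing a non-empty "${key}"`);
    }
  }
  return {
    caption_title: value.caption_title.trim(),
    caption_text: value.caption_text.trim(),
  };
}

console.log(validateCaption(exampleCaption));
```

A check like this lets the image-editing steps assume that both fields exist before any text is drawn.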
How To Use
- Import Image: The Get Image node fetches an image from a URL. Replace this with your desired image source.
- Prepare for AI: The Resize For AI node ensures the image is in a suitable format for the AI model.
- Generate Caption (English): The Image Captioning Agent node, configured with Google Gemini Chat Model and Structured Output Parser, prompts Gemini to generate a caption with a title and text. The prompt is carefully crafted to guide the AI's output format.
- Calculate Caption Position: The Calculate Positioning code node determines the ideal position for the caption on the image, considering factors like image size, font size, and caption length.
- Merge Image and Caption Data: The Merge Image & Caption and Merge Caption & Positions nodes combine the original image data with the generated caption and its calculated position.
- Apply Caption to Image: The Apply Caption to Image node draws a background rectangle and then overlays the generated caption onto the image.
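The positioning step can be sketched as follows. This is not the workflow's actual Calculate Positioning code. It is a minimal illustration that assumes a caption box anchored to the bottom of the image, a greedy word wrap, and an average glyph width of 0.6 times the font size. The function name, parameters, and defaults are all hypothetical.

```javascript
// Hypothetical helper: estimates a bottom-anchored caption box.
// All names and default values are assumptions for illustration.
function calculatePositioning(imageWidth, imageHeight, caption, options = {}) {
  const fontSize = options.fontSize ?? Math.round(imageHeight * 0.04);
  const padding = options.padding ?? fontSize;
  const lineHeight = Math.round(fontSize * 1.4);

  // Rough average glyph width for a Latin font (assumed ratio of 0.6).
  const charWidth = fontSize * 0.6;
  const maxCharsPerLine = Math.max(
    1,
    Math.floor((imageWidth - 2 * padding) / charWidth)
  );

  // Greedy word wrap: start a new line when the next word would overflow.
  const lines = [];
  let current = "";
  for (const word of caption.split(/\s+/).filter(Boolean)) {
    const candidate = current ? `${current} ${word}` : word;
    if (candidate.length > maxCharsPerLine && current) {
      lines.push(current);
      current = word;
    } else {
      current = candidate;
    }
  }
  if (current) lines.push(current);

  // The background rectangle grows upward from the bottom edge.
  const boxHeight = lines.length * lineHeight + 2 * padding;
  const boxTop = Math.max(0, imageHeight - boxHeight);

  return {
    fontSize,
    lineHeight,
    lines,
    background: { x: 0, y: boxTop, width: imageWidth, height: imageHeight - boxTop },
    text: { x: padding, y: boxTop + padding + fontSize },
  };
}

console.log(calculatePositioning(1000, 500, "A short caption", { fontSize: 20, padding: 20 }));
```

In an n8n Code node, the width and height would typically be read from the incoming item, and the returned rectangle and text origin passed on to the image-editing step. Scaling the default font size with image height is one way to keep the caption legible across differently sized images.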
To adapt for Chinese captions:
- Modify the messages in the Image Captioning Agent node to instruct Gemini to generate captions in Chinese.
- You may need to adjust the jsonSchemaExample in the Structured Output Parser if the Chinese output structure differs significantly, or if you want to translate the keys.
- Ensure your Apply Caption to Image node is configured with a font that supports Chinese characters.
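As a sketch of the first adaptation step, the prompt text can be parameterized by language so that only one value changes when switching to Chinese. The helper buildCaptionPrompt, its wording, and the key names caption_title and caption_text are assumptions for illustration, not the workflow's actual prompt.

```javascript
// Hypothetical helper: builds instruction text for the captioning agent.
// The wording and the JSON key names are assumptions for illustration.
function buildCaptionPrompt(language = "English") {
  return [
    "Generate a caption for this image.",
    `Write the caption title and caption text in ${language}.`,
    'Respond with JSON using the keys "caption_title" and "caption_text".',
    "Keep the JSON keys in English regardless of the caption language.",
  ].join("\n");
}

console.log(buildCaptionPrompt("Simplified Chinese"));
```

Keeping the keys in English, as this sketch does, means the output parser schema can stay unchanged while only the caption values are produced in the target language.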
Workflow JSON
{
"id": "6102b7ee-05ad-448c-b5a6-e3c2492029db",
"name": "Generate Multilingual Image Captions and Overlay",
"nodes": 13,
"category": "AI",
"status": "active",
"version": "1.0.0"
}
Note: This is a sample preview. The full workflow JSON contains node configurations, credentials placeholders, and execution logic.
About the Author
AI_Workflow_Bot
LLM Specialist
Building complex chains with OpenAI, Claude, and LangChain.