Structured Data Extraction from Medical Documents (Google Gemini AI)
This is an automation workflow in the Document Extraction and AI Summarization categories that contains 17 nodes. It mainly uses the Set, Webhook, HttpRequest, ExtractFromFile, and RespondToWebhook nodes. It uses Google Gemini AI to extract structured data from medical documents.
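As a rough illustration of what the two Gemini HTTP Request nodes in the workflow do, the sketch below builds a `generateContent` request body with the image inlined as base64 and then parses the JSON text out of a response. The field names mirror the workflow's `jsonBody` and its `candidates[0].content.parts[0].text` expression; the helper names `build_gemini_body` and `parse_gemini_json` are hypothetical, and the prompt is a shortened placeholder rather than the workflow's full prompt and schema.

```python
import base64
import json


def build_gemini_body(image_bytes: bytes, mime_type: str, prompt: str) -> dict:
    """Build a generateContent request body with the image inlined as base64.

    Hypothetical helper: it mirrors the shape of the workflow's jsonBody,
    without the response_schema the workflow also sends.
    """
    return {
        "contents": [
            {
                "parts": [
                    {
                        "inline_data": {
                            "mime_type": mime_type,
                            "data": base64.b64encode(image_bytes).decode("ascii"),
                        }
                    },
                    {"text": prompt},
                ]
            }
        ],
        "generationConfig": {
            "temperature": 0.1,
            "topP": 0.8,
            "response_mime_type": "application/json",
        },
    }


def parse_gemini_json(response: dict) -> dict:
    """Python equivalent of JSON.parse($json.candidates[0].content.parts[0].text)."""
    return json.loads(response["candidates"][0]["content"]["parts"][0]["text"])
```

The workflow runs this request twice: once to classify the document and extract its text, and once to structure the extracted text into the final schema.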
- HTTP Webhook endpoint (generated automatically by n8n)
- Authentication credentials for the target API may be required
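A minimal sketch of calling the webhook endpoint from Python, using only the standard library. The path `/webhook/analyze-medical-document` and the fields `image_url`, `expected_type`, and `language_hint` come from the workflow's Webhook node; the base URL `http://localhost:5678` is an assumption (a local n8n instance on its default port), and `build_request` and `analyze_document` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: a local n8n instance on its default port; replace with your own base URL.
N8N_BASE_URL = "http://localhost:5678"
# Path configured in the workflow's Webhook node.
WEBHOOK_PATH = "/webhook/analyze-medical-document"


def build_request(image_url, expected_type=None, language_hint=None,
                  base_url=N8N_BASE_URL):
    """Build the POST request; the two hints are optional, as in the workflow."""
    payload = {"image_url": image_url}
    if expected_type is not None:
        payload["expected_type"] = expected_type
    if language_hint is not None:
        payload["language_hint"] = language_hint
    return urllib.request.Request(
        base_url + WEBHOOK_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze_document(image_url, **hints):
    """Send the request and return the decoded JSON response."""
    request = build_request(image_url, **hints)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

If the workflow is active, `analyze_document("https://example.com/receipt.jpg")` should return an object with `success`, `result`, `token_usage`, and `processing_metadata`, matching the API Response node.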
Nodes used (17)
{
"meta": {
"instanceId": "e0b4273272d770192500fb71dd8e3f7185db564e7d7d721b18247acd525855e5"
},
"nodes": [
{
"id": "e363dddf-da87-44e2-9f4d-f9d66a6b2bbc",
"name": "Sticky Note",
"type": "n8n-nodes-base.stickyNote",
"position": [
280,
460
],
"parameters": {
"width": 300,
"height": 520,
"content": "# Medical Document AI Analysis\n\n**Extract structured data from medical documents using Google Gemini AI**\n\n• Classifies document types (receipts, prescriptions, reports)\n• 95%+ text extraction accuracy\n• Multi-language support\n• Structured JSON output\n\n**Use cases:**\n• Medical billing automation\n• Insurance processing\n• Clinical data extraction\n\n**Benefits:**\n• 90% reduction in manual work\n• Standardized output format"
},
"typeVersion": 1
},
{
"id": "2da8f7ac-709a-420d-a1c3-6a6ec1dcb5c4",
"name": "Sticky Note1",
"type": "n8n-nodes-base.stickyNote",
"position": [
920,
460
],
"parameters": {
"width": 300,
"height": 520,
"content": "# Setup Requirements\n\n**What you need:**\n• Google Gemini API Key (from Google AI Studio)\n• Add credentials to n8n\n\n**Setup:**\n1. Import workflow\n2. Configure Gemini credentials\n3. Test with image URL\n\n**Cost:** $0.01-0.05 per document\n\n⚠️ Ensure API quota is sufficient"
},
"typeVersion": 1
},
{
"id": "84d77cf0-b9f0-441b-ad0c-531c3ff78768",
"name": "Sticky Note3",
"type": "n8n-nodes-base.stickyNote",
"position": [
1560,
460
],
"parameters": {
"width": 300,
"height": 520,
"content": "# API Usage\n\n**Input:**\n```json\n{\n \"image_url\": \"https://example.com/receipt.jpg\"\n}\n```\n\n**Output:**\n```json\n{\n \"documentType\": \"financial\",\n \"content\": {\n \"amount\": 150.00\n },\n \"confidence\": 0.95\n}\n```"
},
"typeVersion": 1
},
{
"id": "9ed438ec-475a-49aa-8980-aec3e5c45c27",
"name": "Webhook Input",
"type": "n8n-nodes-base.webhook",
"notes": "Receives medical document analysis requests\n\nAccepts JSON payload with:\n• image_url: URL to medical document image\n• expected_type: Document type hint (optional)\n• language_hint: Language detection hint (optional)",
"position": [
280,
1120
],
"webhookId": "medical-doc-analyzer",
"parameters": {
"path": "analyze-medical-document",
"options": {},
"httpMethod": "POST",
"responseMode": "responseNode"
},
"typeVersion": 1.1
},
{
"id": "e13a0dda-b09a-4cfe-83ef-2a8f336fcb6d",
"name": "Parse Input",
"type": "n8n-nodes-base.set",
"notes": "Validates and extracts input parameters\n\n• Extracts image URL from request body\n• Sets default values for optional parameters\n• Prepares data for image processing pipeline",
"position": [
460,
1120
],
"parameters": {
"options": {},
"assignments": {
"assignments": [
{
"id": "image_url",
"name": "image_url",
"type": "string",
"value": "={{ $json.body.image_url }}"
},
{
"id": "expected_type",
"name": "expected_type",
"type": "string",
"value": "={{ $json.body.expected_type || 'unknown' }}"
},
{
"id": "language_hint",
"name": "language_hint",
"type": "string",
"value": "={{ $json.body.language_hint || 'auto' }}"
}
]
}
},
"typeVersion": 3.4
},
{
"id": "2216fddb-edf6-41b2-bc87-e4fdeecfe039",
"name": "Download Image",
"type": "n8n-nodes-base.httpRequest",
"notes": "Downloads image from provided URL\n\n• Supports JPEG, PNG, WebP formats\n• Handles various hosting services\n• Returns binary data with metadata\n• Automatic MIME type detection",
"position": [
640,
1120
],
"parameters": {
"url": "={{ $json.image_url }}",
"options": {
"response": {
"response": {
"responseFormat": "file"
}
}
}
},
"typeVersion": 4.2
},
{
"id": "b40dfc91-06a7-4e8f-a7a0-e15e5bd2c1cd",
"name": "Extract to Base64",
"type": "n8n-nodes-base.extractFromFile",
"notes": "Converts binary image to base64 format\n\n• Extracts image data for AI processing\n• Preserves original binary for reference\n• Handles various image formats\n• Prepares data for Gemini API",
"position": [
820,
1120
],
"parameters": {
"options": {
"keepSource": "both"
},
"operation": "binaryToPropery",
"destinationKey": "image_data"
},
"typeVersion": 1
},
{
"id": "5502a987-82bf-4879-a06b-d24fcd0e3817",
"name": "Prepare for AI",
"type": "n8n-nodes-base.set",
"notes": "Organizes data for AI processing\n\n• Extracts MIME type for proper API formatting\n• Combines image data with input parameters\n• Validates data completeness\n• Ready for Google Gemini processing",
"position": [
1000,
1120
],
"parameters": {
"options": {},
"assignments": {
"assignments": [
{
"id": "mime_type",
"name": "mime_type",
"type": "string",
"value": "={{ $binary.data.mimeType }}"
}
]
},
"includeOtherFields": true
},
"typeVersion": 3.4
},
{
"id": "84fcd675-a04f-4ec2-a10d-020cb0b6854d",
"name": "Gemini Classify Extract",
"type": "n8n-nodes-base.httpRequest",
"notes": "Document classification and text extraction using Google Gemini 2.0 Flash\n\n• Classifies document into medical taxonomy categories\n• Extracts all visible text with high accuracy OCR\n• Detects multiple languages automatically\n• Provides quality assessment and confidence scores\n• Optimized for medical document processing",
"position": [
1200,
1120
],
"parameters": {
"url": "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent",
"method": "POST",
"options": {},
"jsonBody": "={\n \"contents\": [\n {\n \"parts\": [\n {\n \"inline_data\": {\n \"mime_type\": \"{{ $json.mime_type }}\",\n \"data\": \"{{ $json.image_data }}\"\n }\n },\n {\n \"text\": \"You are an expert medical document classifier and text extractor. Your task is to:\\n\\n1. CLASSIFY the document type according to medical document taxonomy\\n2. EXTRACT all visible text from the document\\n3. DETECT the primary language(s)\\n\\nDOCUMENT TAXONOMY:\\n- financial: receipt, bill, insurance_claim, invoice, payment_record\\n- clinical: medical_chart, progress_notes, consultation_report, visit_summary\\n- prescription: prescription, medication_list, pharmacy_record, drug_interaction_alert\\n- administrative: appointment_schedule, referral_letter, insurance_authorization, patient_registration\\n- legal_regulatory: consent_form, medical_certificate, incident_report, compliance_record\\n- diagnostic: lab_report, imaging_report, screening_result, biopsy_report\\n\\nExpected type hint: {{ $json.expected_type }}\\nLanguage hint: {{ $json.language_hint }}\\n\\nProvide classification confidence and extract ALL visible text accurately.\"\n }\n ]\n }\n ],\n \"generationConfig\": {\n \"temperature\": 0.1,\n \"topP\": 0.8,\n \"maxOutputTokens\": 4096,\n \"response_mime_type\": \"application/json\",\n \"response_schema\": {\n \"type\": \"object\",\n \"properties\": {\n \"document_classification\": {\n \"type\": \"object\",\n \"properties\": {\n \"document_type\": {\n \"type\": \"string\",\n \"enum\": [\"financial\", \"clinical\", \"prescription\", \"administrative\", \"legal_regulatory\", \"diagnostic\"]\n },\n \"document_subtype\": {\n \"type\": \"string\"\n },\n \"classification_confidence\": {\n \"type\": \"number\",\n \"minimum\": 0,\n \"maximum\": 1\n },\n \"classification_reasoning\": {\n \"type\": \"string\"\n }\n }\n },\n \"text_extraction\": {\n \"type\": \"object\",\n \"properties\": {\n \"extracted_text\": {\n \"type\": \"string\"\n },\n \"primary_language\": {\n \"type\": \"string\"\n },\n \"secondary_languages\": {\n \"type\": \"array\",\n \"items\": {\n \"type\": \"string\"\n }\n },\n \"text_regions\": {\n \"type\": \"array\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"region\": {\n \"type\": \"string\"\n },\n \"text\": {\n \"type\": \"string\"\n },\n \"language\": {\n \"type\": \"string\"\n }\n }\n }\n }\n }\n },\n \"image_quality\": {\n \"type\": \"object\",\n \"properties\": {\n \"readability_score\": {\n \"type\": \"number\",\n \"minimum\": 0,\n \"maximum\": 1\n },\n \"quality_issues\": {\n \"type\": \"array\",\n \"items\": {\n \"type\": \"string\"\n }\n }\n }\n }\n }\n }\n }\n}",
"sendBody": true,
"specifyBody": "json",
"authentication": "predefinedCredentialType",
"nodeCredentialType": "googlePalmApi"
},
"credentials": {
"googlePalmApi": {
"id": "wAatqti05gvZyuMi",
"name": "Google Gemini(PaLM) Api account"
}
},
"typeVersion": 4.2
},
{
"id": "04084e15-5bbf-4a7b-b316-9dc71382e8cd",
"name": "Parse AI Results",
"type": "n8n-nodes-base.set",
"notes": "Processes Gemini classification results\n\n• Parses JSON response from Gemini API\n• Extracts document type and confidence scores\n• Validates classification results\n• Prepares data for structured extraction phase",
"position": [
1380,
1120
],
"parameters": {
"options": {},
"assignments": {
"assignments": [
{
"id": "classification_output",
"name": "classification_output",
"type": "object",
"value": "={{ JSON.parse($json.candidates[0].content.parts[0].text) }}"
}
]
}
},
"typeVersion": 3.4
},
{
"id": "29dc5d4f-4e1a-40de-be64-6ed47353473b",
"name": "Gemini Structure Data",
"type": "n8n-nodes-base.httpRequest",
"notes": "Converts extracted text into structured medical data\n\n• Applies medical document taxonomy schema\n• Extracts specific fields based on document type\n• Ensures regulatory compliance formatting\n• Validates data completeness and accuracy\n• Generates comprehensive quality metrics",
"position": [
1580,
1120
],
"parameters": {
"url": "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent",
"method": "POST",
"options": {},
"jsonBody": "={\n \"contents\": [\n {\n \"parts\": [\n {\n \"inline_data\": {\n \"mime_type\": \"{{ $('Prepare for AI').item.json.mime_type }}\",\n \"data\": \"{{ $('Prepare for AI').item.json.image_data }}\"\n }\n },\n {\n \"text\": \"Create the final medical document analysis result conforming to the medical document taxonomy schema. Combine and validate all analysis results:\\n\\nCLASSIFICATION: {{ $json.classification_output.document_classification.document_type }}\\n\\nEXTRACTED DATA: {{ encodeURI($json.classification_output.text_extraction.extracted_text) }}\\n\\nCreate a final result that strictly follows the medical document taxonomy schema with proper one based on document type. You should try to fill as many of the fields as you have data for. You should use the same language as the original document. Do not miss anything or you will face regulatory reprimand.\"\n }\n ]\n }\n ],\n \"generationConfig\": {\n \"temperature\": 0.1,\n \"topP\": 0.8,\n \"maxOutputTokens\": 8192,\n \"response_mime_type\": \"application/json\",\n \"response_schema\": {\n \"type\": \"object\",\n \"properties\": {\n \"documentId\": {\n \"type\": \"string\",\n \"description\": \"Unique identifier for the document\"\n },\n \"documentType\": {\n \"type\": \"string\",\n \"enum\": [\"financial\", \"clinical\", \"prescription\", \"administrative\", \"legal_regulatory\", \"diagnostic\"],\n \"description\": \"Primary category of the medical document\"\n },\n \"metadata\": {\n \"type\": \"object\",\n \"properties\": {\n \"createdDate\": {\n \"type\": \"string\",\n \"format\": \"date-time\",\n \"description\": \"Date the document was created\"\n },\n \"providerId\": {\n \"type\": \"string\",\n \"description\": \"Healthcare provider identifier\"\n },\n \"providerName\": {\n \"type\": \"string\",\n \"description\": \"Name of healthcare provider or institution\"\n },\n \"practitionerName\": {\n \"type\": \"string\",\n \"description\": \"Name of practitioner\"\n },\n \"providerTel\": {\n \"type\": \"string\",\n \"description\": \"Phone number of healthcare provider or institution\"\n },\n \"patientId\": {\n \"type\": \"string\",\n \"description\": \"Patient identifier (anonymized)\"\n },\n \"patientName\": {\n \"type\": \"string\",\n \"description\": \"Patient name (anonymized)\"\n },\n \"language\": {\n \"type\": \"string\",\n \"description\": \"Primary language of the document\"\n },\n \"currency\": {\n \"type\": \"string\",\n \"description\": \"Currency used for financial documents\"\n }\n },\n \"required\": [\"createdDate\", \"providerName\", \"practitionerName\"]\n },\n \"content\": {\n \"type\": \"object\",\n \"properties\": {\n \"subtype\": {\n \"type\": \"string\",\n \"description\": \"Specific document subtype\"\n },\n \"amount\": {\n \"type\": \"number\",\n \"minimum\": 0,\n \"description\": \"Total amount for financial documents\"\n },\n \"currency\": {\n \"type\": \"string\",\n \"description\": \"Currency code\"\n },\n \"paymentMethod\": {\n \"type\": \"string\",\n \"enum\": [\"cash\", \"credit_card\", \"insurance\", \"check\", \"electronic\"]\n },\n \"services\": {\n \"type\": \"array\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\"type\": \"string\"},\n \"cost\": {\"type\": \"number\", \"minimum\": 0}\n },\n \"required\": [\"description\", \"cost\"]\n }\n },\n \"diagnosis\": {\n \"type\": \"array\",\n \"items\": {\"type\": \"string\"}\n },\n \"medications\": {\n \"type\": \"array\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"name\": {\"type\": \"string\"},\n \"dosage\": {\"type\": \"string\"},\n \"frequency\": {\"type\": \"string\"},\n \"duration\": {\"type\": \"string\"},\n \"instructions\": {\"type\": \"string\"}\n },\n \"required\": [\"name\", \"dosage\", \"frequency\"]\n }\n },\n \"treatmentPlan\": {\"type\": \"string\"},\n \"followUpDate\": {\"type\": \"string\", \"format\": \"date-time\"},\n \"appointmentDate\": {\"type\": \"string\", \"format\": \"date-time\"},\n \"department\": {\"type\": \"string\"},\n \"referralTo\": {\"type\": \"string\"},\n \"authorizationNumber\": {\"type\": \"string\"},\n \"testType\": {\"type\": \"string\"},\n \"results\": {\n \"type\": \"array\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"parameter\": {\"type\": \"string\"},\n \"value\": {\"type\": \"string\"},\n \"referenceRange\": {\"type\": \"string\"},\n \"status\": {\"type\": \"string\", \"enum\": [\"normal\", \"abnormal\", \"critical\", \"pending\"]}\n },\n \"required\": [\"parameter\", \"value\"]\n }\n },\n \"interpretation\": {\"type\": \"string\"},\n \"recommendations\": {\"type\": \"string\"}\n }\n },\n \"quality_metrics\": {\n \"type\": \"object\",\n \"properties\": {\n \"overall_confidence\": {\"type\": \"number\", \"minimum\": 0, \"maximum\": 1},\n \"classification_confidence\": {\"type\": \"number\", \"minimum\": 0, \"maximum\": 1},\n \"extraction_confidence\": {\"type\": \"number\", \"minimum\": 0, \"maximum\": 1},\n \"readability_score\": {\"type\": \"number\", \"minimum\": 0, \"maximum\": 1},\n \"validation_notes\": {\"type\": \"string\"},\n \"quality_issues\": {\"type\": \"array\", \"items\": {\"type\": \"string\"}}\n }\n }\n },\n \"required\": [\"documentId\", \"documentType\", \"metadata\", \"content\"]\n }\n }\n}",
"sendBody": true,
"specifyBody": "json",
"authentication": "predefinedCredentialType",
"nodeCredentialType": "googlePalmApi"
},
"credentials": {
"googlePalmApi": {
"id": "wAatqti05gvZyuMi",
"name": "Google Gemini(PaLM) Api account"
}
},
"typeVersion": 4.2
},
{
"id": "56d8908a-a200-49bb-9881-636c98a43dae",
"name": "Finalize Track",
"type": "n8n-nodes-base.set",
"notes": "Prepares final response with comprehensive metrics\n\n• Parses structured medical data from Gemini\n• Calculates total token usage for cost tracking\n• Adds processing metadata and timestamps\n• Compiles quality metrics and confidence scores\n• Ready for API response or further processing",
"position": [
1780,
1120
],
"parameters": {
"options": {},
"assignments": {
"assignments": [
{
"id": "final_result",
"name": "final_result",
"type": "object",
"value": "={{ JSON.parse($json.candidates[0].content.parts[0].text) }}"
},
{
"id": "token_usage",
"name": "token_usage",
"type": "object",
"value": "={{ {\n \"classification_tokens\": $('Gemini Classify Extract').item.json.usageMetadata || {},\n \"structuring_tokens\": $json.usageMetadata || {},\n \"total_input_tokens\": ($('Gemini Classify Extract').item.json.usageMetadata.promptTokenCount || 0) + ($json.usageMetadata.promptTokenCount || 0),\n \"total_output_tokens\": ($('Gemini Classify Extract').item.json.usageMetadata.candidatesTokenCount || 0) + ($json.usageMetadata.candidatesTokenCount || 0)\n} }}"
},
{
"id": "processing_metadata",
"name": "processing_metadata",
"type": "object",
"value": "={{ {\n \"workflow_version\": \"1.0\",\n \"processing_timestamp\": new Date().toISOString(),\n \"stages_completed\": [\"image_fetch\", \"classification\", \"text_extraction\", \"data_structuring\"],\n \"ai_models_used\": [\"gemini-2.0-flash\"]\n} }}"
}
]
}
},
"typeVersion": 3.4
},
{
"id": "106bff46-74e6-400d-994a-6aad4effdc1a",
"name": "API Response",
"type": "n8n-nodes-base.respondToWebhook",
"notes": "Returns standardized JSON response\n\n• Structured medical document data\n• Quality metrics and confidence scores\n• Token usage for cost monitoring\n• Processing metadata for audit trails\n• RESTful API format for easy integration",
"position": [
1980,
1120
],
"parameters": {
"options": {},
"respondWith": "json",
"responseBody": "={{ {\n \"success\": true,\n \"result\": $json.final_result,\n \"token_usage\": $json.token_usage,\n \"processing_metadata\": $json.processing_metadata\n} }}"
},
"typeVersion": 1.1
},
{
"id": "d4cdf8ca-13e3-4b42-832d-38750b2727d9",
"name": "Sticky Note5",
"type": "n8n-nodes-base.stickyNote",
"position": [
600,
460
],
"parameters": {
"width": 300,
"height": 520,
"content": "# Technical Details\n\n**AI Model:** Google Gemini 2.0 Flash\n• Optimized for document processing\n• Superior OCR accuracy\n• Multi-language support\n\n**Requirements:**\n• Google Gemini API access\n• Internet connectivity\n\n**Performance:**\n• 95%+ extraction accuracy\n• 2-5 second processing\n• Multiple image formats"
},
"typeVersion": 1
},
{
"id": "d3d2b495-3bc8-406a-81cf-b40ed90882da",
"name": "Sticky Note7",
"type": "n8n-nodes-base.stickyNote",
"position": [
1240,
460
],
"parameters": {
"width": 300,
"height": 520,
"content": "# Quick Start\n\n**Test in 3 steps:**\n1. Import workflow\n2. Add Gemini API credentials\n3. Send test request\n\n**Test request:**\n```\nPOST /webhook/analyze-medical-document\n{ \"image_url\": \"your-image-url\" }\n```\n\n**Customize:**\n• Add database storage\n• Integrate with systems\n• Batch processing\n• Custom validation"
},
"typeVersion": 1
},
{
"id": "c983283b-8ac7-448c-8b53-7d17496bcf54",
"name": "Sticky Note2",
"type": "n8n-nodes-base.stickyNote",
"position": [
1140,
1000
],
"parameters": {
"color": 5,
"width": 220,
"height": 320,
"content": "## Replace with your Gemini API Key"
},
"typeVersion": 1
},
{
"id": "829f2952-42ab-4f9c-a98b-30a3fb4d1228",
"name": "Sticky Note4",
"type": "n8n-nodes-base.stickyNote",
"position": [
1520,
1000
],
"parameters": {
"color": 5,
"width": 220,
"height": 320,
"content": "## Replace with your Gemini API Key"
},
"typeVersion": 1
}
],
"pinData": {},
"connections": {
"Parse Input": {
"main": [
[
{
"node": "Download Image",
"type": "main",
"index": 0
}
]
]
},
"Webhook Input": {
"main": [
[
{
"node": "Parse Input",
"type": "main",
"index": 0
}
]
]
},
"Download Image": {
"main": [
[
{
"node": "Extract to Base64",
"type": "main",
"index": 0
}
]
]
},
"Finalize Track": {
"main": [
[
{
"node": "API Response",
"type": "main",
"index": 0
}
]
]
},
"Prepare for AI": {
"main": [
[
{
"node": "Gemini Classify Extract",
"type": "main",
"index": 0
}
]
]
},
"Parse AI Results": {
"main": [
[
{
"node": "Gemini Structure Data",
"type": "main",
"index": 0
}
]
]
},
"Extract to Base64": {
"main": [
[
{
"node": "Prepare for AI",
"type": "main",
"index": 0
}
]
]
},
"Gemini Structure Data": {
"main": [
[
{
"node": "Finalize Track",
"type": "main",
"index": 0
}
]
]
},
"Gemini Classify Extract": {
"main": [
[
{
"node": "Parse AI Results",
"type": "main",
"index": 0
}
]
]
}
}
}
How do I use this workflow?
Copy the JSON configuration above, create a new workflow in your n8n instance, choose "Import from JSON", paste the configuration, and then adjust the credential settings as needed.
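Before importing, it can help to confirm that the copied text parses as JSON and that every entry under `connections` resolves to a node. n8n keys connections by node name, so a source or target that matches no `name` in `nodes` will not be wired up after import. The sketch below is a hypothetical pre-import check; `find_unresolved_connections` is not part of n8n, and `workflow.json` in the usage note is an assumed file name.

```python
import json


def find_unresolved_connections(workflow: dict) -> list:
    """Return connection sources or targets that match no node name."""
    names = {node["name"] for node in workflow.get("nodes", [])}
    unresolved = []
    for source, outputs in workflow.get("connections", {}).items():
        if source not in names:
            unresolved.append(source)
        for branches in outputs.values():      # e.g. the "main" output type
            for branch in branches:            # one list per output index
                for link in branch or []:
                    if link["node"] not in names:
                        unresolved.append(link["node"])
    return unresolved


def load_workflow(text: str) -> dict:
    """Parse the copied configuration; raises json.JSONDecodeError if it is malformed."""
    return json.loads(text)
```

For example, `find_unresolved_connections(load_workflow(open("workflow.json").read()))` returns an empty list when every connection points at an existing node.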
Which scenarios is this workflow suited to?
Advanced - Document Extraction, AI Summarization
Is it paid?
This workflow is completely free; you can import and use it directly. Note, however, that third-party services used in the workflow (here, the Google Gemini API) may charge you separately for usage.
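Because the workflow returns a `token_usage` object with `total_input_tokens` and `total_output_tokens`, a per-document cost can be estimated from it. The sketch below assumes per-million-token pricing and takes both prices as parameters; the prices in the usage note are placeholders, not actual Gemini rates, and `estimate_cost` is a hypothetical helper.

```python
def estimate_cost(token_usage: dict,
                  input_price_per_million: float,
                  output_price_per_million: float) -> float:
    """Estimate the cost of one document from the workflow's token_usage.

    Prices are supplied by the caller; look them up for your own account,
    since this sketch does not know the provider's current rates.
    """
    input_tokens = token_usage.get("total_input_tokens", 0)
    output_tokens = token_usage.get("total_output_tokens", 0)
    total = (input_tokens * input_price_per_million
             + output_tokens * output_price_per_million)
    return total / 1_000_000
```

With placeholder prices of 1.0 and 2.0 per million tokens, a response reporting 2000 input tokens and 1000 output tokens would be estimated at about 0.004 in the same currency unit.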
Louis Chan
@louischan