Extract Website Data from Form Input (Gemini 2.5 Flash + Gmail)

Intermediate

This is an automation workflow in the AI Summarization and Multimodal AI categories, containing 13 nodes. It mainly uses nodes such as Html, Gmail, FormTrigger, HttpRequest, and ChainLlm. A form collects a source URL and a description of the data to extract, the page HTML is fetched and reduced to its body content, Gemini 2.5 Flash extracts only the requested data, and the result is sent by Gmail.
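The steps between the HTTP request and the email can be approximated outside n8n. The Python sketch below is illustrative only and is not how n8n implements its nodes: `extract_body_text`, `build_prompt`, and `parse_result` are hypothetical helper names, the prompt is a shortened version of the one in the workflow, the model call itself is left out, and falling back to "No data found" on an unparsable reply is an assumption borrowed from rule 5 of the workflow prompt.

```python
import json
from html.parser import HTMLParser


class BodyTextExtractor(HTMLParser):
    """Collects text found inside <body>, skipping <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif tag in ("script", "style") and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.in_body and self.skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_body_text(html: str) -> str:
    """Rough stand-in for the HTML node with the CSS selector "body"."""
    parser = BodyTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def build_prompt(request: str, body_text: str) -> str:
    """Shortened version of the prompt used by the LLM chain node."""
    return (
        "Your task is to extract the exact information specified by the user.\n\n"
        f'User\'s extraction request:\n"{request}"\n\n'
        "Always return the response in this format:\n"
        '{ "result": "extracted value(s)" }\n\n'
        f"Here is the source data:\n{body_text}\n"
    )


def parse_result(raw_reply: str) -> str:
    """Read the "result" field; fall back to the prompt's no-match text."""
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError:
        return "No data found"
    if isinstance(data, dict) and isinstance(data.get("result"), str):
        return data["result"]
    return "No data found"


# Usage with a made-up page and a made-up model reply.
page = "<html><head><title>T</title></head><body><h1>Price</h1><p>$42</p></body></html>"
prompt = build_prompt("the price", extract_body_text(page))
print(parse_result('{"result": "$42"}'))
```

In the workflow itself, the parsing step is handled by the structured output parser node, and the email body reads the value through `$json.output.result`.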

Prerequisites
  • Google account and Gmail API credentials
  • Authentication credentials for the target site, if the source URL requires them
  • Google Gemini API key
Export the workflow
Copy the following JSON configuration into n8n to import and use this workflow:
{
  "meta": {
    "instanceId": "d1786ab0d745a7498abf13a9c2cdabb1374c006e889b79eef64ce0386b8f8a41",
    "templateCredsSetupCompleted": true
  },
  "nodes": [
    {
      "id": "6d85bf32-59a5-4644-b5b0-d31aaf677bda",
      "name": "Structured Output Parser",
      "type": "@n8n/n8n-nodes-langchain.outputParserStructured",
      "position": [
        640,
        200
      ],
      "parameters": {
        "jsonSchemaExample": "{\n    \"result\": \"extracted value(s)\"\n}"
      },
      "typeVersion": 1.2
    },
    {
      "id": "d011794a-d4bf-4750-9ad1-3cb0df662aff",
      "name": "Get HTML from Source URL",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        40,
        0
      ],
      "parameters": {
        "url": "={{ $json['Source URL'] }}",
        "options": {}
      },
      "typeVersion": 4.2
    },
    {
      "id": "389ac2ce-39d8-4bc9-a1af-7ea3dba0240d",
      "name": "Data Extractor LLM Chain",
      "type": "@n8n/n8n-nodes-langchain.chainLlm",
      "position": [
        460,
        0
      ],
      "parameters": {
        "text": "=Your task is to extract the exact information specified by the user.\n\nUser’s extraction request:\n\"{{ $('Web Scraper form submission').item.json['Data to extract'] }}\"\n\nRules:\n1. Extract ONLY the requested information.\n2. If multiple matches exist, combine them into a single string separated by commas.\n3. Do NOT add explanations or extra text—output only the extracted data.\n4. Maintain the original values unless formatting is requested.\n5. If no matches are found, return: { \"result\": \"No data found\" }.\n6. Always return the response in this format:\n{\n    \"result\": \"extracted value(s)\"\n}\n\nHere is the source data:\n{{ $json.body }}\n",
        "promptType": "define",
        "hasOutputParser": true
      },
      "typeVersion": 1.6
    },
    {
      "id": "a73e7657-4a79-4b50-973f-e27d406f0278",
      "name": "Gmail - Send Result",
      "type": "n8n-nodes-base.gmail",
      "position": [
        880,
        0
      ],
      "webhookId": "fa29cdcc-e8e9-449a-a6a4-88a874e2a0c5",
      "parameters": {
        "sendTo": " template_data_extactor_replace_me@yopmail.com",
        "message": "=Your web scraping task has been completed.\n\nSource URL:\n{{ $('Web Scraper form submission').item.json['Source URL'] }}\n\nData Requested:\n{{ $('Web Scraper form submission').item.json['Data to extract'] }}\n\nExtracted Result:\n{{ $json.output.result }}\n\nThank you for using our web scraping automation.",
        "options": {
          "appendAttribution": false
        },
        "subject": "=✅ Web Scraping Result for {{ $('Web Scraper form submission').item.json['Source URL'] }}",
        "emailType": "text"
      },
      "credentials": {
        "gmailOAuth2": {
          "id": "CeBpTZBQSAMKVKJY",
          "name": "Gmail account (Billy Email 2)"
        }
      },
      "typeVersion": 2.1
    },
    {
      "id": "ec13f750-6015-4c17-b062-9036b0ae8697",
      "name": "Web Scraper form submission",
      "type": "n8n-nodes-base.formTrigger",
      "position": [
        -160,
        0
      ],
      "webhookId": "a757a352-5ab2-4fa7-a8ee-08bb5d3448cc",
      "parameters": {
        "options": {},
        "formTitle": "Web Scraper Form",
        "formFields": {
          "values": [
            {
              "fieldLabel": "Source URL"
            },
            {
              "fieldLabel": "Data to extract"
            }
          ]
        }
      },
      "typeVersion": 2.2
    },
    {
      "id": "1ed56fc9-5124-454c-8b4d-ee2c9e72076c",
      "name": "Sticky Note4",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1100,
        -260
      ],
      "parameters": {
        "color": 4,
        "width": 380,
        "height": 760,
        "content": "# 👋 Hi, I’m Billy!\n\nI help businesses build **n8n workflows** & **AI automation projects**.  \nNeed help with n8n or AI Automation projects? \nContact me and let’s build your automation together.\n\n📩 **Email:** billychartanto@gmail.com  \n🤝 **n8n Creator:** [n8n.io/creators/billy](https://n8n.io/creators/billy/)\n🌐 **My n8n Projects:** [billychristi.com/n8n](https://www.billychristi.com/n8n)  \n\n\n\n---\n💡 Feel free to get in touch if you’d like help on your next automation project or if you have any feedback or thoughts to share.\n"
      },
      "typeVersion": 1
    },
    {
      "id": "4b1b75f3-95fa-4a42-8180-cb47ef7c3a02",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -760,
        -80
      ],
      "parameters": {
        "color": 4,
        "width": 500,
        "height": 360,
        "content": "## SETUP REQUIRED\n\nWorkflow Configurations:\n- Update the email recipient in the Gmail node (currently set to template_data_extactor_replace_me@yopmail.com)\n- Adjust the JSON schema in the Structured Output Parser if you need different output formats\n- Modify the LLM prompt in the Data Extractor LLM Chain based on your specific extraction requirements\n\nRequired Credentials:\n- Google Gemini API Key (Google PaLM API account)\n- Gmail Credential for sending result emails"
      },
      "typeVersion": 1
    },
    {
      "id": "35116b5c-c3ab-4c47-9914-c6cecdf3e3b4",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -860,
        420
      ],
      "parameters": {
        "color": 4,
        "width": 600,
        "height": 400,
        "content": "## 🔍Extract Specific Website Data with Form Input, Gemini 2.5 flash and Gmail Delivery\n\nWhat This Template Does:\n\n- Provides a web form interface for users to submit scraping requests\n- Accepts any website URL and custom data extraction requirements\n- Fetches HTML content from the specified source URL\n- Uses Google Gemini AI to intelligently extract only the requested information\n- Processes raw HTML content and returns structured JSON results\n- Automatically sends extraction results via Gmail with detailed reporting\n- Handles various data types and formats while maintaining original values unless formatting is requested\n"
      },
      "typeVersion": 1
    },
    {
      "id": "b1526d24-29c6-4552-8964-84953933494b",
      "name": "Sticky Note2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -220,
        420
      ],
      "parameters": {
        "color": 4,
        "width": 1000,
        "height": 300,
        "content": "## 📋 WORKFLOW PROCESS OVERVIEW\n\nStep 1: 📝 Web Scraper Form Submission triggers the workflow when users submit URL and extraction requirements\nStep 2: 🌐 Get HTML from Source URL fetches the complete HTML content from the provided website\nStep 3: 🔧 HTML Extractor processes the raw HTML and extracts the body content for analysis\nStep 4: 🤖 Data Extractor LLM Chain uses Google Gemini AI to analyze content and extract only the specific data requested by the user\nStep 5: 📊 Structured Output Parser formats the AI response into clean JSON structure with standardized format\nStep 6: 📧 Gmail Send Result delivers the extraction results via email including:\n  - Original source URL\n  - Data extraction request details  \n  - Clean extracted results\n  - Professional formatting with success confirmation"
      },
      "typeVersion": 1
    },
    {
      "id": "461e0675-1755-41eb-b445-20fd0d733d8c",
      "name": "Sticky Note3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        380,
        -180
      ],
      "parameters": {
        "color": 4,
        "width": 400,
        "height": 560,
        "content": "## Data Extractor LLM Chain  \nThis is where we extract the content based on the user request  \n\nConfiguration:  \nYou can update the prompt and the model here to adjust to your use case.  \n"
      },
      "typeVersion": 1
    },
    {
      "id": "e147282c-51f0-4f76-8416-bfeb00a47f64",
      "name": "Sticky Note5",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        800,
        -160
      ],
      "parameters": {
        "color": 4,
        "width": 260,
        "height": 340,
        "content": "## Gmail - Send Results  \n\nConfiguration:  \nUpdate the target email  \nUpdate the email subject and body  \n"
      },
      "typeVersion": 1
    },
    {
      "id": "adb7c780-8210-4dab-ad4f-9e7fc366cd16",
      "name": "Google Gemini Chat Model",
      "type": "@n8n/n8n-nodes-langchain.lmChatGoogleGemini",
      "position": [
        440,
        200
      ],
      "parameters": {
        "options": {},
        "modelName": "models/gemini-2.5-flash"
      },
      "credentials": {
        "googlePalmApi": {
          "id": "gdaO8lU3HwsldifM",
          "name": "Google Gemini(PaLM) Api account"
        }
      },
      "typeVersion": 1
    },
    {
      "id": "25e49c2a-9018-4503-8ce1-a95699e9941c",
      "name": "HTML Extractor",
      "type": "n8n-nodes-base.html",
      "position": [
        220,
        0
      ],
      "parameters": {
        "options": {},
        "operation": "extractHtmlContent",
        "extractionValues": {
          "values": [
            {
              "key": "body",
              "cssSelector": "body"
            }
          ]
        }
      },
      "typeVersion": 1.2
    }
  ],
  "pinData": {},
  "connections": {
    "25e49c2a-9018-4503-8ce1-a95699e9941c": {
      "main": [
        [
          {
            "node": "389ac2ce-39d8-4bc9-a1af-7ea3dba0240d",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "389ac2ce-39d8-4bc9-a1af-7ea3dba0240d": {
      "main": [
        [
          {
            "node": "a73e7657-4a79-4b50-973f-e27d406f0278",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "d011794a-d4bf-4750-9ad1-3cb0df662aff": {
      "main": [
        [
          {
            "node": "25e49c2a-9018-4503-8ce1-a95699e9941c",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "adb7c780-8210-4dab-ad4f-9e7fc366cd16": {
      "ai_languageModel": [
        [
          {
            "node": "389ac2ce-39d8-4bc9-a1af-7ea3dba0240d",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "6d85bf32-59a5-4644-b5b0-d31aaf677bda": {
      "ai_outputParser": [
        [
          {
            "node": "389ac2ce-39d8-4bc9-a1af-7ea3dba0240d",
            "type": "ai_outputParser",
            "index": 0
          }
        ]
      ]
    },
    "ec13f750-6015-4c17-b062-9036b0ae8697": {
      "main": [
        [
          {
            "node": "d011794a-d4bf-4750-9ad1-3cb0df662aff",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
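To check the execution order encoded in the `connections` object, the main path can be walked from the form trigger. The sketch below is a minimal illustration in Python, not an n8n tool: `main_path` is a hypothetical helper, and `SAMPLE` is a trimmed stand-in with placeholder ids rather than the full export. The listing above keys connections by node id, whereas n8n exports commonly key them by node name, so the helper accepts either as an assumption.

```python
def main_path(workflow: dict, start_type: str = "n8n-nodes-base.formTrigger") -> list[str]:
    """Follow "main" connections from the trigger; return node names in order."""
    # Index nodes by both id and name so either style of connection key resolves.
    by_key = {}
    for node in workflow["nodes"]:
        by_key[node["id"]] = node
        by_key[node["name"]] = node

    current = next(n for n in workflow["nodes"] if n["type"] == start_type)
    path, seen = [], set()
    while current is not None and current["id"] not in seen:
        seen.add(current["id"])
        path.append(current["name"])
        connections = workflow["connections"]
        outputs = connections.get(current["id"]) or connections.get(current["name"]) or {}
        targets = outputs.get("main", [[]])
        # Only the first target of the first output is followed (linear workflows).
        next_key = targets[0][0]["node"] if targets and targets[0] else None
        current = by_key.get(next_key)
    return path


# Trimmed stand-in for the export above; ids "a", "b", "c" are placeholders.
SAMPLE = {
    "nodes": [
        {"id": "a", "name": "Web Scraper form submission", "type": "n8n-nodes-base.formTrigger"},
        {"id": "b", "name": "Get HTML from Source URL", "type": "n8n-nodes-base.httpRequest"},
        {"id": "c", "name": "HTML Extractor", "type": "n8n-nodes-base.html"},
    ],
    "connections": {
        "a": {"main": [[{"node": "b", "type": "main", "index": 0}]]},
        "b": {"main": [[{"node": "c", "type": "main", "index": 0}]]},
    },
}

print(" -> ".join(main_path(SAMPLE)))
```

The Gemini chat model and the structured output parser do not appear on this path because they attach to the LLM chain through `ai_languageModel` and `ai_outputParser` connections rather than `main`.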
Frequently asked questions

How do I use this workflow?

Copy the JSON configuration above, create a new workflow in your n8n instance, choose the option to import from JSON, paste the configuration, and then update the credentials (Gmail and Google Gemini) and the email recipient to match your own setup.
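Before importing, it can help to list what still needs to be changed by hand. The Python sketch below is a hypothetical pre-import check, not part of n8n: `setup_todos` is an invented helper, `SAMPLE` is a trimmed stand-in for the export, and treating any `sendTo` value containing `replace_me` as a placeholder is an assumption based on the address used in this template.

```python
def setup_todos(workflow: dict) -> list[str]:
    """List nodes whose credentials or recipient must be replaced after import."""
    todos = []
    for node in workflow.get("nodes", []):
        # Credentials in an export point at the author's account ids.
        for cred_type in node.get("credentials", {}):
            todos.append(f"{node['name']}: select your own {cred_type} credential")
        # Assumption: the template marks its placeholder address with "replace_me".
        send_to = node.get("parameters", {}).get("sendTo", "")
        if "replace_me" in send_to:
            todos.append(f"{node['name']}: replace recipient {send_to.strip()}")
    return todos


# Trimmed stand-in for the export; only the fields the check reads are kept.
SAMPLE = {
    "nodes": [
        {
            "name": "Gmail - Send Result",
            "parameters": {"sendTo": " template_data_extactor_replace_me@yopmail.com"},
            "credentials": {"gmailOAuth2": {"name": "Gmail account"}},
        },
        {
            "name": "Google Gemini Chat Model",
            "parameters": {"modelName": "models/gemini-2.5-flash"},
            "credentials": {"googlePalmApi": {"name": "Google Gemini(PaLM) Api account"}},
        },
    ]
}

for todo in setup_todos(SAMPLE):
    print("-", todo)
```

To run it against the real export, load the copied JSON with `json.loads` and pass the resulting dict to `setup_todos`.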

Which scenarios is this workflow suited to?

Intermediate - AI Summarization, Multimodal AI

Is it paid?

This workflow is free and can be used as-is. Note that the third-party services it relies on (such as the Google Gemini API) may charge for usage.

Workflow information
Difficulty level
Intermediate
Number of nodes: 13
Categories: 2
Node types: 8
Difficulty description

Suited to experienced users: medium-complexity workflows containing 6 to 15 nodes

Author
Billy Christi

@billy

I build scalable automation systems with n8n to help businesses save time and cut costs. 💼 n8n expert available for new projects 📩 billychartanto@gmail.com
