AI-Powered Lead Scraping (with APIFY Scraper, Gemini Filtering, and Output to Google Sheets)

Expert

This is an automation workflow in the Content Creation and Multimodal AI categories, with 19 nodes. It mainly uses the Wait, Telegram, HttpRequest, SplitInBatches, and Agent nodes, among others. It performs AI-based lead processing with Apify, Gemini, and Google Sheets.

Prerequisites
  • Telegram Bot Token
  • Credentials for the target API may be required
  • Google Sheets API credentials
  • Google Gemini API Key
Export workflow
Copy the following JSON configuration and import it into n8n
{
  "id": "37qfTKwl5HThtkgN",
  "meta": {
    "instanceId": "3dfb5a3650edc2b4757ba54350b9efb3f78be8117da0b1a84cc1dc9700b64bb4"
  },
  "name": "AI-Powered Lead Scraping w/APIFY Scraper, Gemini Filtring, to Google Sheets",
  "tags": [],
  "nodes": [
    {
      "id": "9a9e0e1e-63bf-43d6-8762-4bcae9a8528c",
      "name": "Haftnotiz",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1040,
        -224
      ],
      "parameters": {
        "color": 4,
        "width": 660,
        "height": 1312,
        "content": "## 📋 LEAD SCRAPING AUTOMATION - SETUP GUIDE\n\n### 🎯 What This Workflow Does:\nAutomatically processes scraped leads from Apollo/Apify:\n- Validates and cleanses lead data\n- Generates unique Lead IDs (AP-DDMMYY-xxxx)\n- Appends to Google Sheets with deduplication\n- Sends batch summary reports via Telegram\n- Handles 1000 leads per batch\n\n### 🔧 Required Setup:\n\n**1. Apify/Apollo Integration:**\n   - Configure HTTP Request node with scraping API endpoint\n   - Add API credentials if required\n\n**2. Google Sheets:**\n   - Create spreadsheet with columns:\n     Lead ID, Name, Email, Phone, Company Name, Job Title,\n     Website/LinkedIn, Address, Company Summary, Relevant Partner\n   - Share with service account email\n   - Add Google Sheets OAuth2 credentials in n8n\n\n**3. Telegram Bot:**\n   - Use @BotFather to create bot\n   - Get bot token and add to n8n credentials\n\n**4. Google Gemini API:**\n   - Get API key from Google AI Studio\n   - Add to n8n credentials\n\n### 📊 Data Processing Rules:\n- **Required Fields**: Name, Email, Company Name\n- **Lead ID Format**: AP-DDMMYY-xxxx (auto-incremented)\n- **Phone Format**: Wrapped in quotes, mobile preferred\n- **Location**: City, Country (no street addresses)\n- **Deduplication**: By email address\n- **Batch Size**: 1000 leads maximum\n\n### 🔄 Workflow Flow:\n1. Trigger from another workflow\n2. Fetch leads via HTTP Request (Apify/Apollo)\n3. Split into batches of 1000\n4. AI Agent validates & processes each batch\n5. Append validated leads to Google Sheets\n6. Send Telegram summary (successes, warnings, errors)\n7. Loop continues with 30-second delay\n\n### ⚠️ Important Notes:\n- AI Agent uses memory for context (20 messages)\n- Skips leads missing critical fields\n- Flags leads with missing optional fields\n- One Telegram summary per batch (not per lead)"
      },
      "typeVersion": 1
    },
    {
      "id": "7591ae13-e828-4925-9a0d-7703782751a4",
      "name": "Haftnotiz1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -352,
        368
      ],
      "parameters": {
        "color": 5,
        "width": 280,
        "height": 156,
        "content": "## 🚀 TRIGGER\nStarts when executed by another workflow.\n\nPasses lead data through for processing."
      },
      "typeVersion": 1
    },
    {
      "id": "d66bd4db-6062-4c23-8359-cac89327d515",
      "name": "Haftnotiz2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -48,
        320
      ],
      "parameters": {
        "color": 6,
        "width": 280,
        "height": 224,
        "content": "## 🌐 API CALL\nFetches scraped leads from Apify/Apollo API.\n\nConfigure:\n- URL endpoint\n- Method (POST)\n- Authentication headers\n- Request body with search params"
      },
      "typeVersion": 1
    },
    {
      "id": "130f0ea4-835a-464e-850a-4ae978577b06",
      "name": "Haftnotiz3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        208,
        816
      ],
      "parameters": {
        "color": 3,
        "width": 300,
        "height": 264,
        "content": "## 🔁 BATCH PROCESSING\nSplits leads into batches of 1000.\n\nLoop continues until all leads processed.\n\nBranches:\n- Output 1: Current batch → AI Agent\n- Output 2: Loop back → Wait 30s → Fetch next"
      },
      "typeVersion": 1
    },
    {
      "id": "81e229b5-555e-4e67-96a1-66038b88da45",
      "name": "Haftnotiz4",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        624,
        768
      ],
      "parameters": {
        "color": 7,
        "width": 340,
        "height": 156,
        "content": "## ⏱️ RATE LIMITING\nWaits 30 seconds between batches.\n\nPrevents API rate limiting and ensures stable processing."
      },
      "typeVersion": 1
    },
    {
      "id": "302be8a6-560e-41f7-b8e6-980f716f4d44",
      "name": "Haftnotiz5",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        592,
        -288
      ],
      "parameters": {
        "color": 2,
        "width": 392,
        "height": 420,
        "content": "## 🤖 AI PROCESSING ENGINE\nGoogle Gemini AI Agent:\n\n**Validates & Processes:**\n- Extracts lead data from API response\n- Validates required fields (Name, Email, Company)\n- Generates unique Lead IDs\n- Formats phone numbers and locations\n- Deduplicates by email\n\n**Connected Tools:**\n- Append to Google Sheets\n- Read from Google Sheets (deduplication)\n- Memory (20 message context)\n\n**Output:**\n- Batch summary with stats\n- Lists: Added, Flagged, Skipped leads"
      },
      "typeVersion": 1
    },
    {
      "id": "c4027074-090c-4353-a314-76a12e51ddb8",
      "name": "Haftnotiz6",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1312,
        416
      ],
      "parameters": {
        "width": 280,
        "height": 248,
        "content": "## 📊 GOOGLE SHEETS\n\n**Append Tool:**\nWrites validated leads to sheet\n\n**Read Tool:**\nChecks for duplicate emails\n\nBoth connected as AI tools to the agent."
      },
      "typeVersion": 1
    },
    {
      "id": "f95fe9dd-9920-43ef-8930-d0b57902bda4",
      "name": "Haftnotiz7",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1424,
        -112
      ],
      "parameters": {
        "color": 5,
        "width": 300,
        "height": 252,
        "content": "## 📱 TELEGRAM REPORT\nSends batch summary:\n\n✅ Total processed\n✅ Successfully added\n⚠️ Flagged (missing optional fields)\n❌ Skipped (missing critical fields)\n\nIncludes lead identifiers:\nName — Company — Email"
      },
      "typeVersion": 1
    },
    {
      "id": "bea27c22-afb2-44b2-9639-6d336ab905a8",
      "name": "When Executed by Another Workflow",
      "type": "n8n-nodes-base.executeWorkflowTrigger",
      "position": [
        -320,
        560
      ],
      "parameters": {
        "inputSource": "passthrough"
      },
      "typeVersion": 1.1
    },
    {
      "id": "a2975651-6c44-4e27-86ea-5e99a2388672",
      "name": "Über Elemente schleifen",
      "type": "n8n-nodes-base.splitInBatches",
      "position": [
        320,
        560
      ],
      "parameters": {
        "options": {
          "reset": false
        },
        "batchSize": 1000
      },
      "typeVersion": 3
    },
    {
      "id": "b093d4cd-dae3-4e27-b84c-128ceb0408e5",
      "name": "Knowledge Base Agent",
      "type": "@n8n/n8n-nodes-langchain.agent",
      "position": [
        672,
        160
      ],
      "parameters": {
        "text": "=# Role and Objective\n\nParse one or more scraped Apollo leads (structured or semi-structured) into validated, deduplicated Excel rows, and provide a **single well-formatted Telegram summary message** per batch. Each lead must still be processed individually for Excel (row by row).\n\n#Here is the Input Data:\n\n\n\n---\n\n# Instructions\n\n* Input data may contain:\n  `Name, Email, Phone, Company Name, Job Title, LinkedIn, Company Website, Location, Company Type, Company Summary, No. of Employees, Industry, Common Projects, Newest Updates, Relevant Partner`.\n* Extract required and available optional data per lead, validate, and append to the Excel sheet.\n* **Do not send a Telegram message per lead.** Instead, generate one **Telegram summary message per batch**.\n\n---\n\n## Excel Processing\n\n### Columns\n\n`Lead ID, Name, Email, Phone Number, Company Name, Job Title, LinkedIn, Company Website, Location, Company Type (opt), Company Summary (opt), No. of Employees (opt), Industry (opt), Common Projects (opt), Newest Updates (opt), Relevant Partner (opt)`\n\n### Lead ID Generation\n\n* Format: `AP-DDMMYY-xxxx`.\n* `DDMMYY` = today's date.\n* `xxxx` = incremental per batch, starting at 0001.\n\n### Field Formatting & Validation\n\n* **Phone Number**: Prefer mobile, else landline. Always wrap in `\"quotes\"`.\n* **Location**: Format as *City, Country*. Strip street info.\n* **Company Summary**: Short, clear, no fluff.\n* **Optional Columns**: Fill only if confidently present.\n\n### Validation Rules\n\n* **Required fields**: Name, Email, Company Name. 
Missing any → skip row.\n* **Deduplication**: Check by Email.\n* **Order**: Preserve input order in Excel output.\n\n---\n\n## Telegram Reporting\n\n* Generate **one consolidated message per batch**.\n* Show totals: processed, added, flagged, skipped.\n* Provide quick list of added leads with identifiers: `Name — Company — Email`.\n* For flagged rows: list missing non-critical fields.\n* For rejected rows: list missing critical fields.\n* Never show **n8n/system errors**. Only user-side data gaps.\n\n---\n\n## Telegram Message Examples\n\n### Batch Summary (All Good)\n\n```\n✅ Batch Complete  \n\nTotal Leads Processed: 10  \nAdded Successfully: 10  \nFlagged: 0  \nSkipped: 0  \n\nContacts Added:  \n- John Smith — Acme Inc. — john.smith@email.com  \n- Jane Doe — Beta Corp — jane.doe@email.com  \n- … (etc.)\n```\n\n### Batch Summary (With Warnings & Errors)\n\n```\n⚠️ Batch Complete With Issues  \n\nTotal Leads Processed: 12  \nAdded Successfully: 8  \nFlagged: 2  \nSkipped: 2  \n\nContacts Added:  \n- Sarah Lee — GreenTech — sarah.lee@email.com  \n- Ahmed Ali — FinSolve — ahmed.ali@email.com  \n…  \n\nFlagged (Missing Fields):  \n- Lead 5: Missing Job Title, LinkedIn  \n- Lead 9: Missing Website  \n\n❌ Skipped (Critical Missing):  \n- Lead 3: Missing Email  \n- Lead 7: Missing Company Name\n```\n\n---\n\n## Clarification\n\n* **✅ = All critical fields present** (Name, Email, Company Name).\n* **⚠️ = Missing important but not critical fields** (Job Title, Phone, LinkedIn, Website). Lead still added.\n* **❌ = Missing critical fields** (Name, Email, Company Name). Lead skipped.\n* Always consolidate into **one Telegram message per batch**, never one per lead.",
        "options": {
          "systemMessage": ""
        },
        "promptType": "define"
      },
      "typeVersion": 1.9
    },
    {
      "id": "16ba55c6-09e5-4a93-944e-8c4931782c08",
      "name": "Google Gemini-Chat-Modell",
      "type": "@n8n/n8n-nodes-langchain.lmChatGoogleGemini",
      "position": [
        608,
        368
      ],
      "parameters": {
        "options": {}
      },
      "typeVersion": 1
    },
    {
      "id": "eace1550-8dc1-4a6c-8aff-9ca7e52628cc",
      "name": "Simple Speicher",
      "type": "@n8n/n8n-nodes-langchain.memoryBufferWindow",
      "position": [
        768,
        400
      ],
      "parameters": {
        "sessionKey": "=memory_{{ $('Telegram Trigger').item.json.message.message_id }}",
        "sessionIdType": "customKey",
        "contextWindowLength": 20
      },
      "typeVersion": 1.3
    },
    {
      "id": "a52239b9-2886-4544-95fe-139ae07373ea",
      "name": "Append row in sheet in Google Tabellen",
      "type": "n8n-nodes-base.googleSheetsTool",
      "position": [
        992,
        368
      ],
      "parameters": {
        "columns": {
          "value": {
            "Name": "={{ $fromAI('Name', '', 'string') }}",
            "Email": "={{ $fromAI('Email', '', 'string') }}",
            "Address": "={{ $fromAI('Address', '', 'string') }}",
            "Lead \nID": "={{ $fromAI('Lead__ID', '', 'string') }}",
            "Job Title": "={{ $fromAI('Job_Title', '', 'string') }}",
            "Company Name": "={{ $fromAI('Company_Name', '', 'string') }}",
            "Phone Number": "={{ $fromAI('Phone_Number', '', 'string') }}",
            "Company Summary": "={{ $fromAI('Company_Summary', '', 'string') }}",
            "Relevant Partner": "={{ $fromAI('Relevant_Partner', '', 'string') }}",
            "Website / LinkedIn": "={{ $fromAI('Website___LinkedIn', '', 'string') }}"
          },
          "schema": [
            {
              "id": "Lead \nID",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "Lead \nID",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Name",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "Name",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Email",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "Email",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Phone Number",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "Phone Number",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Company Name",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "Company Name",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Job Title",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "Job Title",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Website / LinkedIn",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "Website / LinkedIn",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Address",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "Address",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Company Summary",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "Company Summary",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Relevant Partner",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "Relevant Partner",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            }
          ],
          "mappingMode": "defineBelow",
          "matchingColumns": [
            "Lead \nID"
          ],
          "attemptToConvertTypes": false,
          "convertFieldsToString": false
        },
        "options": {},
        "operation": "append",
        "sheetName": {
          "__rl": true,
          "mode": "list",
          "value": 1974176187,
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/1eJXVWo8FF8758gItdyfT5JOSInlm7xJ3QQL7zy7tCHc/edit#gid=1974176187",
          "cachedResultName": "Raw Scraping"
        },
        "documentId": {
          "__rl": true,
          "mode": "list",
          "value": "1eJXVWo8FF8758gItdyfT5JOSInlm7xJ3QQL7zy7tCHc",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/1eJXVWo8FF8758gItdyfT5JOSInlm7xJ3QQL7zy7tCHc/edit?usp=drivesdk",
          "cachedResultName": "EDCON LEADS UNFILTERED"
        }
      },
      "typeVersion": 4.7
    },
    {
      "id": "bf62c5ce-110d-4123-9f82-05954227f12e",
      "name": "Get row(s) in sheet in Google Tabellen",
      "type": "n8n-nodes-base.googleSheetsTool",
      "position": [
        1184,
        320
      ],
      "parameters": {
        "options": {},
        "sheetName": {
          "__rl": true,
          "mode": "list",
          "value": 1974176187,
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/1eJXVWo8FF8758gItdyfT5JOSInlm7xJ3QQL7zy7tCHc/edit#gid=1974176187",
          "cachedResultName": "Raw Scraping"
        },
        "documentId": {
          "__rl": true,
          "mode": "list",
          "value": "1eJXVWo8FF8758gItdyfT5JOSInlm7xJ3QQL7zy7tCHc",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/1eJXVWo8FF8758gItdyfT5JOSInlm7xJ3QQL7zy7tCHc/edit?usp=drivesdk",
          "cachedResultName": "EDCON LEADS UNFILTERED"
        }
      },
      "typeVersion": 4.7
    },
    {
      "id": "1ab18e0f-3d8d-43ed-b6db-bc0871970c38",
      "name": "Send a text message",
      "type": "n8n-nodes-base.telegram",
      "position": [
        1376,
        160
      ],
      "webhookId": "53b5dc25-a9ea-47b4-87b7-274984c1115d",
      "parameters": {
        "text": "={{ $json.output }}",
        "chatId": "={{ $('Telegram Trigger').item.json.message.chat.id }}",
        "additionalFields": {
          "appendAttribution": false
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "ccf8718f-a004-4c42-9cf3-611062d42235",
      "name": "APWennY Post Request",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        16,
        560
      ],
      "parameters": {
        "method": "POST",
        "options": {}
      },
      "typeVersion": 4.2
    },
    {
      "id": "e1d8b798-d325-453e-b552-dc223e1d3177",
      "name": "Wait for APWennY to Scrape",
      "type": "n8n-nodes-base.wait",
      "position": [
        720,
        576
      ],
      "webhookId": "ccf0293c-0614-4542-b947-40e6da8c39b0",
      "parameters": {
        "amount": 30
      },
      "typeVersion": 1.1
    },
    {
      "id": "59ff55ac-dc9e-4276-af98-b1f9b01bfaf2",
      "name": "Apify Get Request",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        1072,
        576
      ],
      "parameters": {
        "options": {}
      },
      "typeVersion": 4.2
    }
  ],
  "active": false,
  "pinData": {},
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "4ccad03f-070d-40c5-a3eb-5071667c8982",
  "connections": {
    "Simple Memory": {
      "ai_memory": [
        [
          {
            "node": "b093d4cd-dae3-4e27-b84c-128ceb0408e5",
            "type": "ai_memory",
            "index": 0
          }
        ]
      ]
    },
    "Loop Over Items": {
      "main": [
        [
          {
            "node": "b093d4cd-dae3-4e27-b84c-128ceb0408e5",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Wait for APIFY to Scrape",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "59ff55ac-dc9e-4276-af98-b1f9b01bfaf2": {
      "main": [
        [
          {
            "node": "Loop Over Items",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "APIFY Post Request": {
      "main": [
        [
          {
            "node": "Loop Over Items",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "b093d4cd-dae3-4e27-b84c-128ceb0408e5": {
      "main": [
        [
          {
            "node": "1ab18e0f-3d8d-43ed-b6db-bc0871970c38",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Google Gemini Chat Model": {
      "ai_languageModel": [
        [
          {
            "node": "b093d4cd-dae3-4e27-b84c-128ceb0408e5",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "Wait for APIFY to Scrape": {
      "main": [
        [
          {
            "node": "59ff55ac-dc9e-4276-af98-b1f9b01bfaf2",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "bea27c22-afb2-44b2-9639-6d336ab905a8": {
      "main": [
        [
          {
            "node": "APIFY Post Request",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Append row in sheet in Google Sheets": {
      "ai_tool": [
        [
          {
            "node": "b093d4cd-dae3-4e27-b84c-128ceb0408e5",
            "type": "ai_tool",
            "index": 0
          }
        ]
      ]
    },
    "Get row(s) in sheet in Google Sheets": {
      "ai_tool": [
        [
          {
            "node": "b093d4cd-dae3-4e27-b84c-128ceb0408e5",
            "type": "ai_tool",
            "index": 0
          }
        ]
      ]
    }
  }
}
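
In this workflow, the data processing rules from the setup note (required fields, the AP-DDMMYY-xxxx Lead ID format, quoted phone numbers, deduplication by email) are applied by the Gemini agent following its prompt. As a rough illustration only, the same rules could be written deterministically in Python. The function names `make_lead_id` and `process_batch`, the field names (taken from the agent prompt), and the choice to drop duplicates without reporting them are assumptions for this sketch, not part of the workflow.

```python
from datetime import date

# Field names follow the agent prompt; treat them as assumptions about the input.
REQUIRED = ("Name", "Email", "Company Name")
OPTIONAL_FLAGGED = ("Job Title", "Phone Number", "LinkedIn", "Company Website")


def is_blank(value) -> bool:
    """True when a field is missing, None, or only whitespace."""
    return not str(value or "").strip()


def make_lead_id(day: date, counter: int) -> str:
    """Build an ID in the AP-DDMMYY-xxxx format (counter is zero-padded to 4 digits)."""
    return f"AP-{day.strftime('%d%m%y')}-{counter:04d}"


def process_batch(leads, known_emails, day):
    """Return (rows, flagged, skipped) for one batch, preserving input order.

    flagged and skipped hold (1-based lead position, list of missing fields).
    """
    rows, flagged, skipped = [], [], []
    seen = {email.strip().lower() for email in known_emails}
    counter = 0
    for position, lead in enumerate(leads, start=1):
        missing = [field for field in REQUIRED if is_blank(lead.get(field))]
        if missing:
            skipped.append((position, missing))  # critical field missing: skip row
            continue
        email = lead["Email"].strip().lower()
        if email in seen:
            continue  # duplicate by email: not added again
        seen.add(email)
        counter += 1
        row = dict(lead)
        row["Lead ID"] = make_lead_id(day, counter)
        if not is_blank(lead.get("Phone Number")):
            # The prompt asks for phone numbers wrapped in quotes.
            row["Phone Number"] = '"' + str(lead["Phone Number"]).strip() + '"'
        gaps = [field for field in OPTIONAL_FLAGGED if is_blank(lead.get(field))]
        if gaps:
            flagged.append((position, gaps))  # lead is still added
        rows.append(row)
    return rows, flagged, skipped


example = [
    {"Name": "John Smith", "Email": "john.smith@email.com", "Company Name": "Acme Inc.",
     "Phone Number": "+1 555 0100", "Job Title": "CEO",
     "LinkedIn": "linkedin.example/john", "Company Website": "acme.example"},
    {"Name": "No Email", "Email": "", "Company Name": "Beta Corp"},
    {"Name": "Sarah Lee", "Email": "sarah.lee@email.com", "Company Name": "GreenTech"},
]
rows, flagged, skipped = process_batch(example, [], date(2025, 1, 31))
print([row["Lead ID"] for row in rows])  # ['AP-310125-0001', 'AP-310125-0002']
print(skipped)                           # [(2, ['Email'])]
```

A deterministic step like this could run before the agent to reduce the load on the model, but whether that suits a given setup depends on how clean the scraped input is.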
Frequently Asked Questions

How do I use this workflow?

Copy the JSON above, create a new workflow in your n8n instance, and choose "Import from JSON". Paste the configuration and adjust the credentials as needed.
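
Before importing, it can help to confirm that every entry under "connections" points at a node that actually exists under "nodes", since a renamed node would otherwise leave a link dangling. The following Python sketch assumes the export layout shown above (connections keyed by node name, each holding lists of branches); the function name `find_dangling_connections` and the sample data are hypothetical.

```python
def find_dangling_connections(workflow: dict) -> list[str]:
    """List connection sources and targets that do not match any node name."""
    names = {node["name"] for node in workflow.get("nodes", [])}
    problems = []
    for source, outputs in workflow.get("connections", {}).items():
        if source not in names:
            problems.append(f"unknown source node: {source}")
        # outputs maps a connection type ("main", "ai_tool", ...) to a list of branches
        for branches in outputs.values():
            for branch in branches:
                for link in branch or []:
                    if link["node"] not in names:
                        problems.append(f"{source} -> unknown target node: {link['node']}")
    return problems


# Tiny hand-made sample in the same shape as the export; not the real workflow.
sample = {
    "nodes": [{"name": "Loop Over Items"}, {"name": "Knowledge Base Agent"}],
    "connections": {
        "Loop Over Items": {
            "main": [
                [{"node": "Knowledge Base Agent", "type": "main", "index": 0}],
                [{"node": "Wait for APIFY to Scrape", "type": "main", "index": 0}],
            ]
        },
        "Simple Memory": {
            "ai_memory": [[{"node": "Knowledge Base Agent", "type": "ai_memory", "index": 0}]]
        },
    },
}

for problem in find_dangling_connections(sample):
    print(problem)
# Loop Over Items -> unknown target node: Wait for APIFY to Scrape
# unknown source node: Simple Memory
```

To check the real export, load the copied JSON with `json.load` and pass the resulting dict to the same function; an empty list means every link resolves.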

Which scenarios is this workflow suited for?

Expert - Content Creation, Multimodal AI

Does it cost anything?

This workflow is completely free. Note, however, that third-party services used in the workflow (such as the Google Gemini API or Apify) may incur charges.

Workflow Information
Difficulty
Expert
Number of nodes: 19
Categories: 2
Node types: 10
Difficulty description

For advanced users; complex workflows with 16+ nodes
