Monitor and extract seed-round startup data to Excel using RSS, GPT-4.1-MINI, and BrightData
Intermediate
This is an automation workflow in the Lead Generation and AI Summarization categories, containing 14 nodes. It mainly uses the Set, Code, Markdown, HttpRequest, and OpenAi nodes.
Prerequisites
- Authentication credentials for the target API may be required
- OpenAI API Key
Export workflow
Copy the JSON configuration below and import it into n8n to use this workflow
{
"meta": {
"instanceId": "588297c7214f1c4e25d370806d33145b7a547bf66f8157b64edb38d64fc3c5f2",
"templateCredsSetupCompleted": true
},
"nodes": [
{
"id": "66526413-badf-48cc-b08d-29a87490bf75",
"name": "Edit Fields",
"type": "n8n-nodes-base.set",
"notes": "Filter the",
"position": [
1024,
176
],
"parameters": {
"options": {},
"assignments": {
"assignments": [
{
"id": "28ed03f9-1e17-432f-9438-6484aab19e35",
"name": "",
"type": "array",
"value": "={{ $json.choices.map(choice => choice.message.content) }}"
}
]
}
},
"typeVersion": 3.4
},
{
"id": "c527cac4-fb30-48f0-82f2-f516aa266ce5",
"name": "Message a model",
"type": "@n8n/n8n-nodes-langchain.openAi",
"notes": "Get seed-funded company data",
"position": [
640,
176
],
"parameters": {
"modelId": {
"__rl": true,
"mode": "list",
"value": "gpt-4.1-mini",
"cachedResultName": "GPT-4.1-MINI"
},
"options": {},
"messages": {
"values": [
{
"role": "system",
"content": "You are an AI designed to extract key information from a specified news article related to startup funding. You will receive the link to the article and its content in markdown format. Your task is to meticulously gather relevant data concerning startup funding as outlined below.\n\n### Input:\n- The URL of the news article discussing recent startup funding events.\n- The complete markdown text of the article.\n\n### Tasks:\n1. Review the provided article content and extract necessary information regarding companies that have received seed funding. If the article contains multiple instances of seed funding data, ensure you gather details for each company without addressing any generic explanations of seed funding itself.\n2. Do not use the article URL for extracting the data. Use it only for output.\n\n3. Extract the following information in JSON format for each company reported in the article:\n\n - **companyName**: Name of the startup company.\n - **companyWebsite**: Official website of the company (do not reference any URLs provided in the markdown).\n - **companyLinkedIn**: URL of the company's LinkedIn page.\n - **fundingAmount**: The total amount raised in this funding round (e.g., \"£950,000\" or \"$1.2 million\").\n - **founderName**: An array containing the full names of all the founders.\n - **founderLinkedIn**: An array of LinkedIn profile URLs for each founder (set to null if not available).\n - **articleUrl**: Return the input article URL instead of the article content.\n\n### Output Format:\n- Provide your output strictly in JSON format, ensuring proper structure even if some fields contain null values. \nIf multiple companies are mentioned, return an array of objects, each representing a different company.\n\n### JSON Example:\n```json\n[\n {\n \"companyName\": \"Sample Startup 1\",\n \"companyWebsite\": \"https://www.samplestartup1.com\",\n \"companyLinkedIn\": \"https://www.linkedin.com/company/sample-startup-1\",\n \"fundingAmount\": \"$1.5 million\",\n \"founderName\": [\"John Doe\", \"Jane Smith\"],\n \"founderLinkedIn\": [\"https://www.linkedin.com/in/johndoe\", null],\n \"articleUrl\": \"https://www.example.com/sample-article\"\n },\n {\n \"companyName\": \"Sample Startup 2\",\n \"companyWebsite\": \"https://www.samplestartup2.com\",\n \"companyLinkedIn\": \"https://www.linkedin.com/company/sample-startup-2\",\n \"fundingAmount\": \"$950,000\",\n \"founderName\": [\"Alice Johnson\"],\n \"founderLinkedIn\": [null],\n \"articleUrl\": \"https://www.example.com/sample-article\"\n }\n]\n```\n\n### Guidelines:\n- Utilize only verified information from the article provided.\n- Set any unavailable fields to null.\n- Avoid seeking additional information from external websites.\n- Refrain from providing interpretations; stick strictly to the facts as presented in the markdown content."
},
{
"content": "=\narticle content in markdown format : {{ $json.data }}\narticle link : {{ $json.link }}\n"
}
]
},
"simplify": false,
"jsonOutput": true
},
"notesInFlow": true,
"typeVersion": 1.8
},
{
"id": "4c6baf0d-c407-40cb-8d4c-89f1a716de24",
"name": "Markdown",
"type": "n8n-nodes-base.markdown",
"onError": "continueRegularOutput",
"position": [
416,
176
],
"parameters": {
"html": "={{ $json.body }}",
"options": {
"ignore": "head, script, img",
"useLinkReferenceDefinitions": true
}
},
"typeVersion": 1
},
{
"id": "a0d497c0-81cb-4835-a632-42035ddc01e8",
"name": "Refactor article link",
"type": "n8n-nodes-base.code",
"notes": "Get the redirect URL",
"position": [
-256,
176
],
"parameters": {
"jsCode": "/**\n * Extract the actual article URL from each redirect URL.\n */\nfor (const item of $input.all()) {\n  /** Redirect URL */\n  const rawLink = item.json.link;\n\n  let extractedUrl = rawLink;\n\n  /**\n   * The actual URL is the value of the \"url\" query parameter,\n   * i.e. everything from \"url=\" up to the next \"&\".\n   */\n  const match = rawLink.match(/[?&]url=([^&]+)/);\n\n  if (match && match[1]) {\n    /** Decode the URL-encoded value */\n    extractedUrl = decodeURIComponent(match[1]);\n  }\n\n  /** Replace the redirect URL with the actual URL */\n  item.json.link = extractedUrl;\n}\nreturn $input.all();"
},
"notesInFlow": true,
"typeVersion": 2
},
{
"id": "8df2524a-ba30-48a0-afe7-059187fd334b",
"name": "Add article link",
"type": "n8n-nodes-base.code",
"position": [
192,
176
],
"parameters": {
"jsCode": "/**\n * Merge the article link from the \"Refactor article link\" node into the output of the \"Get article Page\" node.\n */\n\n/** Input of the \"Get article Page\" node */\nconst inputForBrightData = $items(\"Refactor article link\");\n\n/** Output of the \"Get article Page\" node (the BrightData response) */\nconst outputOfBrightData = $input.all();\n\nreturn outputOfBrightData.map((item, index) => {\n  const input = inputForBrightData[index].json;\n  const output = item.json;\n\n  return {\n    json: {\n      ...output,\n      link: input.link // Add the link from the original input\n    }\n  };\n});"
},
"typeVersion": 2
},
{
"id": "0d725bba-0d9e-4e18-a6e8-3fb10bb5835a",
"name": "RSS Feed Trigger",
"type": "n8n-nodes-base.rssFeedReadTrigger",
"position": [
-464,
176
],
"parameters": {
"feedUrl": "https://www.google.co.in/alerts/feeds/02881064610578318478/7170584238880554951",
"pollTimes": {
"item": [
{
"mode": "everyX",
"unit": "minutes",
"value": 5
}
]
}
},
"typeVersion": 1
},
{
"id": "30fa6ec5-7cb1-45cb-bfce-169dc5e284f6",
"name": "Sticky Note",
"type": "n8n-nodes-base.stickyNote",
"position": [
-528,
-32
],
"parameters": {
"width": 420,
"height": 380,
"content": "## Trigger and article discovery"
},
"typeVersion": 1
},
{
"id": "c705f45e-9063-4738-85d8-2f67bea53ea5",
"name": "Sticky Note1",
"type": "n8n-nodes-base.stickyNote",
"position": [
-64,
-32
],
"parameters": {
"width": 620,
"height": 380,
"content": "## Content scraping and preparation"
},
"typeVersion": 1
},
{
"id": "5cdcd2db-5e38-4245-8da1-281e28f238cc",
"name": "Sticky Note2",
"type": "n8n-nodes-base.stickyNote",
"position": [
592,
-32
],
"parameters": {
"width": 380,
"height": 380,
"content": "## Data extraction using AI"
},
"typeVersion": 1
},
{
"id": "ba4045f1-628b-4b6c-aef3-40f00c3f0e4e",
"name": "Sticky Note3",
"type": "n8n-nodes-base.stickyNote",
"position": [
992,
-32
],
"parameters": {
"width": 380,
"height": 380,
"content": "## Extract valid startup entries from nested data"
},
"typeVersion": 1
},
{
"id": "c5a79532-75a6-40b5-b7c1-a911ecb5cf82",
"name": "Sticky Note4",
"type": "n8n-nodes-base.stickyNote",
"position": [
1392,
-32
],
"parameters": {
"width": 280,
"height": 380,
"content": "## Append data to the Excel sheet"
},
"typeVersion": 1
},
{
"id": "edad49fb-c923-4374-b9e2-87eeb1e2630a",
"name": "Get article Page",
"type": "@brightdata/n8n-nodes-brightdata.brightData",
"onError": "continueRegularOutput",
"position": [
-32,
176
],
"parameters": {
"url": "={{ $json.link }}",
"zone": {
"__rl": true,
"mode": "list",
"value": "web_unlocker1"
},
"format": "json",
"country": {
"__rl": true,
"mode": "list",
"value": "us"
},
"requestOptions": {}
},
"retryOnFail": false,
"typeVersion": 1
},
{
"id": "684530ce-8472-4312-a3c6-6fc0c9c8ff84",
"name": "Filter company data",
"type": "n8n-nodes-base.code",
"position": [
1232,
176
],
"parameters": {
"jsCode": "/**\n * Build an array of company details from the raw, unstructured data of the previous node.\n * Duplicate entries (same normalized company name) are removed.\n */\n\nconst results = [];\nconst seenCompanyNames = new Set();\n\nfunction extractValidStartups(obj) {\n  if (Array.isArray(obj)) {\n    for (const item of obj) {\n      extractValidStartups(item);\n    }\n  } else if (typeof obj === 'object' && obj !== null) {\n    // Skip if it's an error object\n    if (obj.error) return;\n\n    // Check if it looks like a startup object\n    if (obj.companyName) {\n      const key = obj.companyName.trim().toLowerCase(); // normalize name\n      if (!seenCompanyNames.has(key)) {\n        seenCompanyNames.add(key);\n        results.push({ json: obj });\n      }\n      return;\n    }\n\n    // Otherwise, recursively search its values\n    for (const key in obj) {\n      extractValidStartups(obj[key]);\n    }\n  }\n}\n\nfor (const item of $input.all()) {\n  // The \"Edit Fields\" node stores the model output under an empty field name\n  const root = item.json[\"\"];\n  if (!Array.isArray(root)) continue;\n\n  for (const entry of root) {\n    extractValidStartups(entry);\n  }\n}\n\nreturn results;\n"
},
"typeVersion": 2
},
{
"id": "8c50058f-58c4-424e-a45a-ea27df89a47d",
"name": "Add data into excel sheet",
"type": "n8n-nodes-base.httpRequest",
"onError": "continueErrorOutput",
"position": [
1456,
176
],
"parameters": {
"url": "https://graph.microsoft.com/v1.0/drives/{{drive-id}}/items/{{file-id}}/workbook/tables/{ {{ sheet-id }} }/rows",
"method": "POST",
"options": {
"batching": {
"batch": {
"batchSize": 1,
"batchInterval": 3000
}
}
},
"jsonBody": "={\n \"values\": [\n {{ $input.all().map((item, index) => \n `${index > 0 ? ',' : ''}[` +\n `\"${$now.format('yyyy-MM-dd \\'at\\' T')}\",` +\n `\"${item.json.companyName || \"-\"}\",` +\n `\"${item.json.companyWebsite || \"-\"}\",` +\n `\"${item.json.companyLinkedIn || \"-\"}\",` +\n `\"${item.json.fundingAmount || \"-\"}\",` +\n `\"${Array.isArray(item.json.founderName) && item.json.founderName.filter(n => n).length > 0 \n ? item.json.founderName.filter(n => n).join(', ') \n : \"-\" }\",` +\n `\"${Array.isArray(item.json.founderLinkedIn) && item.json.founderLinkedIn.filter(n => n).length > 0 \n ? item.json.founderLinkedIn.filter(n => n).join(', ') \n : \"-\" }\",` +\n `\"${item.json.articleUrl || \"-\"}\"` +\n `]`\n ).join('\\n') }}\n ]\n}",
"sendBody": true,
"specifyBody": "json",
"authentication": "genericCredentialType",
"genericAuthType": "oAuth2Api"
},
"executeOnce": true,
"retryOnFail": true,
"typeVersion": 4.2
}
],
"pinData": {},
"connections": {
"Markdown": {
"main": [
[
{
"node": "Message a model",
"type": "main",
"index": 0
}
]
]
},
"Edit Fields": {
"main": [
[
{
"node": "Filter company data",
"type": "main",
"index": 0
}
]
]
},
"Message a model": {
"main": [
[
{
"node": "Edit Fields",
"type": "main",
"index": 0
}
]
]
},
"Add article link": {
"main": [
[
{
"node": "Markdown",
"type": "main",
"index": 0
}
]
]
},
"Get article Page": {
"main": [
[
{
"node": "Add article link",
"type": "main",
"index": 0
}
]
]
},
"RSS Feed Trigger": {
"main": [
[
{
"node": "Refactor article link",
"type": "main",
"index": 0
}
]
]
},
"Filter company data": {
"main": [
[
{
"node": "Add data into excel sheet",
"type": "main",
"index": 0
}
]
]
},
"Refactor article link": {
"main": [
[
{
"node": "Get article Page",
"type": "main",
"index": 0
}
]
]
}
}
}
Frequently asked questions
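How do I use this workflow?
Copy the JSON configuration above, create a new workflow in your n8n instance, choose "Import from JSON", paste the configuration, and adjust the credential settings as needed. The Excel request URL contains the placeholders `{{drive-id}}`, `{{file-id}}`, and `{{ sheet-id }}`, which need to be replaced with your own values.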
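Because n8n wires nodes together by name, it can help to confirm before importing that every name under `connections` matches a `name` in `nodes`. The following is a minimal sketch of such a check, assuming the configuration has already been parsed into an object; `findDanglingConnections` is a hypothetical helper written for illustration, not part of n8n.

```javascript
// Hypothetical helper (not an n8n API): list connection endpoints that
// do not match any node name in an exported workflow object.
function findDanglingConnections(workflow) {
  const nodeNames = new Set((workflow.nodes || []).map((node) => node.name));
  const dangling = new Set();

  for (const [source, outputs] of Object.entries(workflow.connections || {})) {
    // The key of each connection entry is the source node's name.
    if (!nodeNames.has(source)) dangling.add(source);

    // Each output type (e.g. "main") holds an array of output slots,
    // and each slot holds an array of { node, type, index } targets.
    for (const slots of Object.values(outputs)) {
      for (const slot of slots || []) {
        for (const target of slot || []) {
          if (!nodeNames.has(target.node)) dangling.add(target.node);
        }
      }
    }
  }

  // Sorted for stable, readable output.
  return [...dangling].sort();
}
```

Calling it as `findDanglingConnections(JSON.parse(copiedJson))`, where `copiedJson` is the pasted configuration string, should return an empty array when all names line up; any names it returns point to nodes that would be left unconnected after import.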
What scenarios is this workflow suited for?
Intermediate - Lead Generation, AI Summarization
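The workflow reads a Google Alerts RSS feed, and the "Refactor article link" Code node unwraps each redirect link into the real article URL by reading its `url` query parameter. The sketch below restates that step as a standalone function so it can be tried outside n8n; `unwrapRedirectLink` and the sample link are assumptions made for illustration.

```javascript
// Standalone restatement of the logic in the "Refactor article link" node.
// Returns the decoded value of the "url" query parameter, or the input
// unchanged when no such parameter is present.
function unwrapRedirectLink(rawLink) {
  const match = rawLink.match(/[?&]url=([^&]+)/);
  return match ? decodeURIComponent(match[1]) : rawLink;
}

// Hypothetical redirect link, for illustration only.
const sample =
  "https://www.google.com/url?rct=j&sa=t&url=https%3A%2F%2Fexample.com%2Fseed-round&ct=ga";

console.log(unwrapRedirectLink(sample)); // prints "https://example.com/seed-round"
```

Note that `decodeURIComponent` throws a `URIError` on malformed percent-encoding, so a feed with damaged links may need a `try`/`catch` around the call.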
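Does it cost anything?
This workflow is completely free and can be imported and used directly. Note, however, that third-party services used in the workflow (such as the OpenAI API) may require you to pay for them yourself.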