Live extraction of Y Combinator startups with Apify and Google Sheets
This is an automation workflow in the Lead Generation and Multimodal AI category, containing 9 nodes. It mainly uses the Google Sheets, Apify, and Manual Trigger nodes. It automates scraping Y Combinator startups with Apify and Google Sheets.
- Google Sheets API credentials
Nodes used (9)
{
"id": "f0l6j5GkLScFOfqK",
"meta": {
"instanceId": "1a54c41d9050a8f1fa6f74ca858828ad9fb97b9fafa3e9760e576171c531a787",
"templateCredsSetupCompleted": true
},
"name": "Live-Automate Scraping Y Combinator Startups with Apify & Google Sheets",
"tags": [],
"nodes": [
{
"id": "4d88b9f9-6909-47c8-91a5-c27ebc97de49",
"name": "Run an Actor",
"type": "@apify/n8n-nodes-apify.apify",
"position": [
1632,
1632
],
"parameters": {
"actorId": {
"__rl": true,
"mode": "list",
"value": "XXsXDaNQLjoF4lgmU",
"cachedResultUrl": "https://console.apify.com/actors/XXsXDaNQLjoF4lgmU/input",
"cachedResultName": "Y Combinator Directory Scraper | Fast & Reliable | $4.5 / 1K (fatihtahta/y-combinator-directory-scraper)"
},
"customBody": "{\n \"maxCompanies\": 5,\n \"startUrls\": \"{https://www.ycombinator.com/companies?industry=Fintech&regions=America%20%2F%20Canada&team_size=%5B%221%22%2C%2225%22%5D}\",\n \"proxyConfiguration\": {\n \"useApifyProxy\": true\n }\n}"
},
"credentials": {
"apifyApi": {
"id": "8decwrzbYTySCGCT",
"name": "Apify account 4"
}
},
"typeVersion": 1
},
{
"id": "e524c759-a193-42b6-9553-683656413431",
"name": "Get dataset items",
"type": "@apify/n8n-nodes-apify.apify",
"position": [
2432,
1968
],
"parameters": {
"resource": "Datasets",
"datasetId": "={{ $json.defaultDatasetId }}"
},
"credentials": {
"apifyApi": {
"id": "8decwrzbYTySCGCT",
"name": "Apify account 4"
}
},
"typeVersion": 1
},
{
"id": "4eea9bab-911c-4480-9073-831b8ac46571",
"name": "Sticky Note",
"type": "n8n-nodes-base.stickyNote",
"position": [
608,
1744
],
"parameters": {
"width": 528,
"height": 336,
"content": "### **Step 1 – Manual Trigger**\n\n- The workflow begins with a **Manual Trigger node**, allowing you to start the process on demand. \n- This approach ensures full control over when company data from **Y Combinator** is scraped and logged. \n"
},
"typeVersion": 1
},
{
"id": "b5814a97-7dd1-4488-8af3-6bf0af555d51",
"name": "Start Workflow",
"type": "n8n-nodes-base.manualTrigger",
"position": [
816,
1936
],
"parameters": {},
"typeVersion": 1
},
{
"id": "3eacc0a3-ca74-4405-ad0e-a25b9b4b964e",
"name": "Sticky Note1",
"type": "n8n-nodes-base.stickyNote",
"position": [
1392,
1424
],
"parameters": {
"color": 3,
"width": 592,
"height": 368,
"content": "### **Step 2 – Apify Actor (Scrape Company Data)**\n\n- This step uses an **Apify Actor node** to scrape details of companies listed on **Y Combinator**. \n- You need to provide the **URL of the Y Combinator search page** with your desired filters applied (e.g., industry, location, funding stage). \n- The actor then extracts structured company data, including names, descriptions, websites, and other available details, preparing it for downstream logging and processing.\n"
},
"typeVersion": 1
},
{
"id": "d67e5ff1-ff84-4196-9a76-cc59215e4061",
"name": "Sticky Note2",
"type": "n8n-nodes-base.stickyNote",
"position": [
2176,
1760
],
"parameters": {
"color": 4,
"width": 592,
"height": 368,
"content": "### **Step 3 – Apify Get Dataset Items**\n\n- This step uses the **Apify Get Dataset Items node** to fetch the actual company data generated by the Apify Actor in the previous step. \n- The node requires the **Dataset ID** returned by the Apify Actor to retrieve structured results. \n- The output includes detailed company information (e.g., name, description, website, location, sector), which is then prepared for logging into Google Sheets.\n"
},
"typeVersion": 1
},
{
"id": "04149226-1821-419d-b7c6-f2288de0f4cc",
"name": "Sticky Note3",
"type": "n8n-nodes-base.stickyNote",
"position": [
3040,
1104
],
"parameters": {
"color": 5,
"width": 640,
"height": 720,
"content": "### **Step 4 – Add or Update Row in Google Sheet**\n\n- This step uses the **Google Sheets (Add or Update Row) node** to log the company data into a connected Google Sheet. \n- You must **select the target Google Document and specific Sheet** where the data will be stored. \n- Ensure the following columns are already created in the sheet (**case-sensitive**): \n - Company \n - Location \n - Website \n - LinkedIn \n - Founded \n - Description \n - Industry Tags \n - Founder 1 Name \n - Founder 1 LinkedIn \n - Founder 2 Name \n - Founder 2 LinkedIn \n\n- The node will automatically add new rows or update existing entries, keeping the sheet clean and up to date with the latest scraped company details.\n"
},
"typeVersion": 1
},
{
"id": "e0cff6ae-ea8b-47c6-8cc1-884459e8224e",
"name": "Append data to Google Sheet",
"type": "n8n-nodes-base.googleSheets",
"position": [
3312,
1616
],
"parameters": {
"columns": {
"value": {
"Company": "={{ $json.company_name }}",
"Founded": "={{ $json.year_founded }}",
"Website": "={{ $json.website }}",
"LinkedIn": "={{ $json.company_linkedin }}",
"Location": "={{ $json.company_location }}",
"Description": "={{ $json.long_description }}",
"Industry Tags": "={{ $json['tags/0'] }} {{ $json['tags/1'] }} {{ $json['tags/2'] }} {{ $json['tags/3'] }}",
"Founder 1 Name": "={{ $json['founders/0/name'] }}",
"Founder 2 Name": "={{ $json['founders/1/name'] }}",
"Founder 1 LinkedIn": "={{ $json['founders/0/linkedin'] }}",
"Founder 2 LinkedIn": "={{ $json['founders/1/linkedin'] }}"
},
"schema": [
{
"id": "Company",
"type": "string",
"display": true,
"removed": false,
"required": false,
"displayName": "Company",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "Location",
"type": "string",
"display": true,
"required": false,
"displayName": "Location",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "Website",
"type": "string",
"display": true,
"required": false,
"displayName": "Website",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "LinkedIn",
"type": "string",
"display": true,
"required": false,
"displayName": "LinkedIn",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "Founded",
"type": "string",
"display": true,
"required": false,
"displayName": "Founded",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "Description",
"type": "string",
"display": true,
"required": false,
"displayName": "Description",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "Industry Tags",
"type": "string",
"display": true,
"required": false,
"displayName": "Industry Tags",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "Founder 1 Name",
"type": "string",
"display": true,
"required": false,
"displayName": "Founder 1 Name",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "Founder 1 LinkedIn",
"type": "string",
"display": true,
"required": false,
"displayName": "Founder 1 LinkedIn",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "Founder 2 Name",
"type": "string",
"display": true,
"required": false,
"displayName": "Founder 2 Name",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "Founder 2 LinkedIn",
"type": "string",
"display": true,
"required": false,
"displayName": "Founder 2 LinkedIn",
"defaultMatch": false,
"canBeUsedToMatch": true
}
],
"mappingMode": "defineBelow",
"matchingColumns": [
"Company"
],
"attemptToConvertTypes": false,
"convertFieldsToString": false
},
"options": {},
"operation": "appendOrUpdate",
"sheetName": {
"__rl": true,
"mode": "list",
"value": "gid=0",
"cachedResultUrl": "https://docs.google.com/spreadsheets/d/1AEOYMIRNgxYN3gihT1bIrGswnkCzuWbFljX2ac4XjUU/edit#gid=0",
"cachedResultName": "Sheet1"
},
"documentId": {
"__rl": true,
"mode": "list",
"value": "1AEOYMIRNgxYN3gihT1bIrGswnkCzuWbFljX2ac4XjUU",
"cachedResultUrl": "https://docs.google.com/spreadsheets/d/1AEOYMIRNgxYN3gihT1bIrGswnkCzuWbFljX2ac4XjUU/edit?usp=drivesdk",
"cachedResultName": "YCom Apify Scrapped "
}
},
"credentials": {
"googleSheetsOAuth2Api": {
"id": "dZG6jp43p2oX45HG",
"name": "Google Sheets account 4-Smit"
}
},
"typeVersion": 4.7
},
{
"id": "c8f614e2-2aa5-4f4a-8be9-090fb24bf616",
"name": "Sticky Note4",
"type": "n8n-nodes-base.stickyNote",
"position": [
368,
944
],
"parameters": {
"color": 3,
"width": 768,
"height": 672,
"content": "### **Step 0 – Prerequisites**\n\nBefore running the workflow, ensure the following configurations are complete:\n\n- **Apify Setup:**\n - Connect your Apify account in n8n. \n - Select the **Y Combinator Directory Scraper** actor. \n - Paste the Y Combinator search URL (with filters applied) into the `startUrls` parameter. \n - Adjust the `maxCompanies` parameter to control the number of companies scraped per run. \n\n- **Google Sheets Setup:**\n - Connect your Google account using **OAuth2 credentials** with both **Google Sheets** and **Google Drive** features enabled. \n - Ensure the target Google Sheet is created in advance with the following column headers (**case-sensitive**): \n - Company \n - Location \n - Website \n - LinkedIn \n - Founded \n - Description \n - Industry Tags \n - Founder 1 Name \n - Founder 1 LinkedIn \n - Founder 2 Name \n - Founder 2 LinkedIn \n\n- **n8n Configuration:**\n - Confirm that both Apify and Google integrations are properly authenticated and available in your workflow.\n"
},
"typeVersion": 1
}
],
"active": false,
"pinData": {},
"settings": {
"executionOrder": "v1"
},
"versionId": "36ae4ec1-b59a-49a4-b4e6-0f80bd2111f3",
"connections": {
"4d88b9f9-6909-47c8-91a5-c27ebc97de49": {
"main": [
[
{
"node": "e524c759-a193-42b6-9553-683656413431",
"type": "main",
"index": 0
}
]
]
},
"b5814a97-7dd1-4488-8af3-6bf0af555d51": {
"main": [
[
{
"node": "4d88b9f9-6909-47c8-91a5-c27ebc97de49",
"type": "main",
"index": 0
}
]
]
},
"e524c759-a193-42b6-9553-683656413431": {
"main": [
[
{
"node": "e0cff6ae-ea8b-47c6-8cc1-884459e8224e",
"type": "main",
"index": 0
}
]
]
}
}
}
How do I use this workflow?
Copy the JSON configuration above, create a new workflow in your n8n instance, select "Import from JSON", paste the configuration, and update the credentials to match your own accounts.
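Before importing, it can help to see which credentials the workflow references, since the IDs in the export belong to the original author's accounts and will need to be reconnected. The following is a minimal Python sketch, not part of the template: `WORKFLOW_JSON` is a trimmed excerpt containing only the fields the script reads, and `credentials_to_reconnect` is a hypothetical helper name.

```python
import json

# Trimmed excerpt of the workflow export above; only "type" and
# "credentials" are kept because that is all this sketch reads.
WORKFLOW_JSON = """
{
  "nodes": [
    {"type": "@apify/n8n-nodes-apify.apify",
     "credentials": {"apifyApi": {"id": "8decwrzbYTySCGCT", "name": "Apify account 4"}}},
    {"type": "@apify/n8n-nodes-apify.apify",
     "credentials": {"apifyApi": {"id": "8decwrzbYTySCGCT", "name": "Apify account 4"}}},
    {"type": "n8n-nodes-base.manualTrigger"},
    {"type": "n8n-nodes-base.googleSheets",
     "credentials": {"googleSheetsOAuth2Api": {"id": "dZG6jp43p2oX45HG",
                                               "name": "Google Sheets account 4-Smit"}}}
  ]
}
"""


def credentials_to_reconnect(workflow: dict) -> dict[str, list[str]]:
    """Map each credential type to the node types that reference it."""
    needed: dict[str, list[str]] = {}
    for node in workflow.get("nodes", []):
        # Nodes such as the manual trigger carry no "credentials" key.
        for cred_type in node.get("credentials", {}):
            needed.setdefault(cred_type, []).append(node["type"])
    return needed


needed = credentials_to_reconnect(json.loads(WORKFLOW_JSON))
for cred_type, node_types in needed.items():
    print(f"{cred_type}: referenced by {len(node_types)} node(s)")
```

For this workflow the result is two credential types to set up in n8n: one Apify API credential shared by both Apify nodes, and one Google Sheets OAuth2 credential.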
Which scenarios is this workflow suited to?
Intermediate - Lead Generation, Multimodal AI
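To target a different lead segment, the Y Combinator search URL in the actor input needs different filters. The sketch below rebuilds the URL used in the workflow from its three query parameters (`industry`, `regions`, `team_size`) with Python's standard library. `build_search_url` is a hypothetical helper, and the sketch only produces the URL string; the exact shape the actor expects for its input fields is not confirmed here.

```python
import json
from urllib.parse import quote, urlencode

YC_DIRECTORY = "https://www.ycombinator.com/companies"


def build_search_url(industry: str, region: str, team_size: tuple[int, int]) -> str:
    """Build a filtered directory URL with percent-encoded query values."""
    params = {
        "industry": industry,
        "regions": region,
        # The workflow's URL carries the team-size range as a JSON array
        # of strings, e.g. ["1","25"], so it is serialized the same way.
        "team_size": json.dumps([str(n) for n in team_size], separators=(",", ":")),
    }
    # quote_via=quote encodes spaces as %20 (not "+") and "/" as %2F,
    # matching the encoding seen in the workflow's URL.
    return f"{YC_DIRECTORY}?{urlencode(params, quote_via=quote)}"


url = build_search_url("Fintech", "America / Canada", (1, 25))
print(url)
```

Called with the values shown, this reproduces the Fintech, America / Canada, team size 1 to 25 filter from the workflow; other segments only require changing the arguments.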
Is it paid?
The workflow itself is completely free and can be used as is. Note that third-party services used in the workflow (such as the Apify actor) may require payment on your part.
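As a rough guide to third-party cost, the Apify actor selected in the workflow lists "$4.5 / 1K" in its name. The sketch below turns that into a per-run estimate, assuming the price applies per 1,000 scraped companies and that no other Apify charges apply; both are assumptions, so check the actor's pricing page before relying on the figure. `estimate_cost_usd` is a hypothetical helper.

```python
# Price taken from the actor's listed name; assumed to be per 1,000 companies.
PRICE_PER_1000_USD = 4.5


def estimate_cost_usd(max_companies: int, price_per_1000: float = PRICE_PER_1000_USD) -> float:
    """Estimate the actor cost for one run capped at max_companies results."""
    return max_companies * price_per_1000 / 1000


# The workflow sets maxCompanies to 5.
print(f"${estimate_cost_usd(5):.4f}")
```

Under these assumptions, a run with the workflow's `maxCompanies` value of 5 costs about two cents, and scraping 1,000 companies costs $4.50.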
Intuz
@intuz
Workflow automation can help automate your routine activities and save money as well as hours of time. As a boutique tech consulting company, Intuz helps businesses with custom AI/ML, AI workflow automations, and software development. Automate your business workflow for: Sales, Marketing, Accounting, Finance, Operations, E-Commerce, Customer Support, Admin & Backoffice, Logistics & Supply Chain.