Live-Automate Scraping Y Combinator Startups with Apify & Google Sheets

Name: Live-Automate Scraping Y Combinator Startups with Apify & Google Sheets
Rating: 4.5 (10 reviews)
Author: Intuz

Intermediate

This is an automation workflow in the Lead Generation and Multimodal AI categories. It contains 9 nodes, mainly GoogleSheets, Apify, and ManualTrigger, and automates scraping Y Combinator startups with Apify and Google Sheets.

Prerequisites

• Google Sheets API credentials

• Apify API credentials

Nodes used (9)

Categories

Lead Generation

Multimodal AI

Export workflow

Copy the following JSON configuration and import it into n8n to use this workflow.

{
  "id": "f0l6j5GkLScFOfqK",
  "meta": {
    "instanceId": "1a54c41d9050a8f1fa6f74ca858828ad9fb97b9fafa3e9760e576171c531a787",
    "templateCredsSetupCompleted": true
  },
  "name": "Live-Automate Scraping Y Combinator Startups with Apify & Google Sheets",
  "tags": [],
  "nodes": [
    {
      "id": "4d88b9f9-6909-47c8-91a5-c27ebc97de49",
      "name": "Run an Actor",
      "type": "@apify/n8n-nodes-apify.apify",
      "position": [
        1632,
        1632
      ],
      "parameters": {
        "actorId": {
          "__rl": true,
          "mode": "list",
          "value": "XXsXDaNQLjoF4lgmU",
          "cachedResultUrl": "https://console.apify.com/actors/XXsXDaNQLjoF4lgmU/input",
          "cachedResultName": "Y Combinator Directory Scraper | Fast & Reliable | $4.5 / 1K (fatihtahta/y-combinator-directory-scraper)"
        },
        "customBody": "{\n  \"maxCompanies\": 5,\n  \"startUrls\": \"{https://www.ycombinator.com/companies?industry=Fintech&regions=America%20%2F%20Canada&team_size=%5B%221%22%2C%2225%22%5D}\",\n  \"proxyConfiguration\": {\n    \"useApifyProxy\": true\n  }\n}"
      },
      "credentials": {
        "apifyApi": {
          "id": "8decwrzbYTySCGCT",
          "name": "Apify account 4"
        }
      },
      "typeVersion": 1
    },
    {
      "id": "e524c759-a193-42b6-9553-683656413431",
      "name": "Get Dataset Items",
      "type": "@apify/n8n-nodes-apify.apify",
      "position": [
        2432,
        1968
      ],
      "parameters": {
        "resource": "Datasets",
        "datasetId": "={{ $json.defaultDatasetId }}"
      },
      "credentials": {
        "apifyApi": {
          "id": "8decwrzbYTySCGCT",
          "name": "Apify account 4"
        }
      },
      "typeVersion": 1
    },
    {
      "id": "4eea9bab-911c-4480-9073-831b8ac46571",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        608,
        1744
      ],
      "parameters": {
        "width": 528,
        "height": 336,
        "content": "### **Step 1 – Manual Trigger**\n\n- The workflow begins with a **Manual Trigger node**, allowing you to start the process on demand.  \n- This approach ensures full control over when company data from **Y Combinator** is scraped and logged.  \n"
      },
      "typeVersion": 1
    },
    {
      "id": "b5814a97-7dd1-4488-8af3-6bf0af555d51",
      "name": "Start Workflow",
      "type": "n8n-nodes-base.manualTrigger",
      "position": [
        816,
        1936
      ],
      "parameters": {},
      "typeVersion": 1
    },
    {
      "id": "3eacc0a3-ca74-4405-ad0e-a25b9b4b964e",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1392,
        1424
      ],
      "parameters": {
        "color": 3,
        "width": 592,
        "height": 368,
        "content": "### **Step 2 – Apify Actor (Scrape Company Data)**\n\n- This step uses an **Apify Actor node** to scrape details of companies listed on **Y Combinator**.  \n- You need to provide the **URL of the Y Combinator search page** with your desired filters applied (e.g., industry, location, funding stage).  \n- The actor then extracts structured company data, including names, descriptions, websites, and other available details, preparing it for downstream logging and processing.\n"
      },
      "typeVersion": 1
    },
    {
      "id": "d67e5ff1-ff84-4196-9a76-cc59215e4061",
      "name": "Sticky Note2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        2176,
        1760
      ],
      "parameters": {
        "color": 4,
        "width": 592,
        "height": 368,
        "content": "### **Step 3 – Apify Get Dataset Items**\n\n- This step uses the **Apify Get Dataset Items node** to fetch the actual company data generated by the Apify Actor in the previous step.  \n- The node requires the **Dataset ID** returned by the Apify Actor to retrieve structured results.  \n- The output includes detailed company information (e.g., name, description, website, location, sector), which is then prepared for logging into Google Sheets.\n"
      },
      "typeVersion": 1
    },
    {
      "id": "04149226-1821-419d-b7c6-f2288de0f4cc",
      "name": "Sticky Note3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        3040,
        1104
      ],
      "parameters": {
        "color": 5,
        "width": 640,
        "height": 720,
        "content": "### **Step 4 – Add or Update Row in Google Sheet**\n\n- This step uses the **Google Sheets (Add or Update Row) node** to log the company data into a connected Google Sheet.  \n- You must **select the target Google Document and specific Sheet** where the data will be stored.  \n- Ensure the following columns are already created in the sheet (**case-sensitive**):  \n  - Company  \n  - Location  \n  - Website  \n  - LinkedIn  \n  - Founded  \n  - Description  \n  - Industry Tags  \n  - Founder 1 Name  \n  - Founder 1 LinkedIn  \n  - Founder 2 Name  \n  - Founder 2 LinkedIn  \n\n- The node will automatically add new rows or update existing entries, keeping the sheet clean and up to date with the latest scraped company details.\n"
      },
      "typeVersion": 1
    },
    {
      "id": "e0cff6ae-ea8b-47c6-8cc1-884459e8224e",
      "name": "Add Data to Google Sheet",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        3312,
        1616
      ],
      "parameters": {
        "columns": {
          "value": {
            "Company": "={{ $json.company_name }}",
            "Founded": "={{ $json.year_founded }}",
            "Website": "={{ $json.website }}",
            "LinkedIn": "={{ $json.company_linkedin }}",
            "Location": "={{ $json.company_location }}",
            "Description": "={{ $json.long_description }}",
            "Industry Tags": "={{ $json['tags/0'] }} {{ $json['tags/1'] }} {{ $json['tags/2'] }} {{ $json['tags/3'] }}",
            "Founder 1 Name": "={{ $json['founders/0/name'] }}",
            "Founder 2 Name": "={{ $json['founders/1/name'] }}",
            "Founder 1 LinkedIn": "={{ $json['founders/0/linkedin'] }}",
            "Founder 2 LinkedIn": "={{ $json['founders/1/linkedin'] }}"
          },
          "schema": [
            {
              "id": "Company",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "Company",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Location",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Location",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Website",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Website",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "LinkedIn",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "LinkedIn",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Founded",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Founded",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Description",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Description",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Industry Tags",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Industry Tags",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Founder 1 Name",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Founder 1 Name",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Founder 1 LinkedIn",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Founder 1 LinkedIn",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Founder 2 Name",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Founder 2 Name",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Founder 2 LinkedIn",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Founder 2 LinkedIn",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            }
          ],
          "mappingMode": "defineBelow",
          "matchingColumns": [
            "Company"
          ],
          "attemptToConvertTypes": false,
          "convertFieldsToString": false
        },
        "options": {},
        "operation": "appendOrUpdate",
        "sheetName": {
          "__rl": true,
          "mode": "list",
          "value": "gid=0",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/1AEOYMIRNgxYN3gihT1bIrGswnkCzuWbFljX2ac4XjUU/edit#gid=0",
          "cachedResultName": "Sheet1"
        },
        "documentId": {
          "__rl": true,
          "mode": "list",
          "value": "1AEOYMIRNgxYN3gihT1bIrGswnkCzuWbFljX2ac4XjUU",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/1AEOYMIRNgxYN3gihT1bIrGswnkCzuWbFljX2ac4XjUU/edit?usp=drivesdk",
          "cachedResultName": "YCom Apify Scrapped "
        }
      },
      "credentials": {
        "googleSheetsOAuth2Api": {
          "id": "dZG6jp43p2oX45HG",
          "name": "Google Sheets account 4-Smit"
        }
      },
      "typeVersion": 4.7
    },
    {
      "id": "c8f614e2-2aa5-4f4a-8be9-090fb24bf616",
      "name": "Sticky Note4",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        368,
        944
      ],
      "parameters": {
        "color": 3,
        "width": 768,
        "height": 672,
        "content": "### **Step 0 – Prerequisites**\n\nBefore running the workflow, ensure the following configurations are complete:\n\n- **Apify Setup:**\n  - Connect your Apify account in n8n.  \n  - Select the **Y Combinator Directory Scraper** actor.  \n  - Paste the Y Combinator search URL (with filters applied) into the `startUrls` parameter.  \n  - Adjust the `maxCompanies` parameter to control the number of companies scraped per run.  \n\n- **Google Sheets Setup:**\n  - Connect your Google account using **OAuth2 credentials** with both **Google Sheets** and **Google Drive** features enabled.  \n  - Ensure the target Google Sheet is created in advance with the following column headers (**case-sensitive**):  \n    - Company  \n    - Location  \n    - Website  \n    - LinkedIn  \n    - Founded  \n    - Description  \n    - Industry Tags  \n    - Founder 1 Name  \n    - Founder 1 LinkedIn  \n    - Founder 2 Name  \n    - Founder 2 LinkedIn  \n\n- **n8n Configuration:**\n  - Confirm that both Apify and Google integrations are properly authenticated and available in your workflow.\n"
      },
      "typeVersion": 1
    }
  ],
  "active": false,
  "pinData": {},
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "36ae4ec1-b59a-49a4-b4e6-0f80bd2111f3",
  "connections": {
    "4d88b9f9-6909-47c8-91a5-c27ebc97de49": {
      "main": [
        [
          {
            "node": "e524c759-a193-42b6-9553-683656413431",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "b5814a97-7dd1-4488-8af3-6bf0af555d51": {
      "main": [
        [
          {
            "node": "4d88b9f9-6909-47c8-91a5-c27ebc97de49",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "e524c759-a193-42b6-9553-683656413431": {
      "main": [
        [
          {
            "node": "e0cff6ae-ea8b-47c6-8cc1-884459e8224e",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
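
The Google Sheets node in the JSON above reads flattened keys such as `founders/0/name` and `tags/0` from each dataset item and runs `appendOrUpdate` matched on the `Company` column. The following is a minimal Python sketch of that mapping and upsert logic outside n8n, which may help when debugging empty columns or duplicate rows. Only the field and column names come from the workflow; the function names, the sample data, and the in-memory `sheet` list are hypothetical stand-ins, and unlike the n8n expression this sketch skips missing tags instead of leaving extra spaces.

```python
from typing import Any


def to_sheet_row(item: dict[str, Any]) -> dict[str, str]:
    """Map one flattened dataset item to the sheet columns used by the workflow."""

    def get(key: str) -> str:
        # Missing or null fields become empty cells.
        value = item.get(key)
        return "" if value is None else str(value)

    tags = [get(f"tags/{i}") for i in range(4)]  # the workflow reads tags/0..tags/3
    return {
        "Company": get("company_name"),
        "Location": get("company_location"),
        "Website": get("website"),
        "LinkedIn": get("company_linkedin"),
        "Founded": get("year_founded"),
        "Description": get("long_description"),
        "Industry Tags": " ".join(t for t in tags if t),
        "Founder 1 Name": get("founders/0/name"),
        "Founder 1 LinkedIn": get("founders/0/linkedin"),
        "Founder 2 Name": get("founders/1/name"),
        "Founder 2 LinkedIn": get("founders/1/linkedin"),
    }


def append_or_update(sheet: list[dict[str, str]], row: dict[str, str],
                     key: str = "Company") -> None:
    """Update the first row whose key column matches; otherwise append a new row."""
    for i, existing in enumerate(sheet):
        if existing.get(key) == row[key]:
            sheet[i] = {**existing, **row}
            return
    sheet.append(row)
```

Because rows are matched on `Company`, running the workflow twice with the same filters should update existing rows rather than duplicate them, provided the company name is unchanged between runs.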

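Frequently asked questions

How do I use this workflow?

Copy the JSON configuration above, create a new workflow in your n8n instance, select "Import from JSON", paste the configuration, and adjust the credential settings as needed.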
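
The exported JSON carries the original author's credential IDs, so each node that uses credentials has to be pointed at your own after import. As a rough aid, the sketch below lists which credential types a workflow export expects and which nodes use them. It relies only on the `nodes`, `name`, and `credentials` keys visible in the export; the function name and the trimmed-down sample are hypothetical.

```python
import json


def credentials_to_relink(workflow: dict) -> dict[str, list[str]]:
    """Group node names by the credential type each node expects."""
    needed: dict[str, list[str]] = {}
    for node in workflow.get("nodes", []):
        # Nodes such as sticky notes or the manual trigger have no credentials.
        for cred_type in node.get("credentials", {}):
            needed.setdefault(cred_type, []).append(node["name"])
    return needed


# Hypothetical, trimmed-down stand-in for the exported workflow JSON.
sample = json.loads("""
{
  "nodes": [
    {"name": "Actor node", "credentials": {"apifyApi": {"id": "x", "name": "y"}}},
    {"name": "Dataset node", "credentials": {"apifyApi": {"id": "x", "name": "y"}}},
    {"name": "Sheet node", "credentials": {"googleSheetsOAuth2Api": {"id": "z", "name": "w"}}},
    {"name": "Note", "parameters": {}}
  ]
}
""")

for cred_type, nodes in credentials_to_relink(sample).items():
    print(f"{cred_type}: {', '.join(nodes)}")
```

For this workflow the result should name two credential types, `apifyApi` and `googleSheetsOAuth2Api`.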

What scenarios is this workflow suited for?

Intermediate - Lead Generation, Multimodal AI
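
For lead generation, the segment that gets scraped is set by the filters encoded in the Y Combinator search URL passed to the actor. The sketch below rebuilds the URL from the workflow JSON out of its three query parameters (`industry`, `regions`, `team_size`), so the filters can be changed without hand-encoding them. The function name is hypothetical, and whether other filter values are accepted by the directory is an assumption to verify in the browser.

```python
import json
from urllib.parse import quote, urlencode

BASE_URL = "https://www.ycombinator.com/companies"


def build_search_url(industry: str, region: str, team_size: tuple[int, int]) -> str:
    """Build a filtered search URL in the same shape as the one in the workflow."""
    params = {
        "industry": industry,
        "regions": region,
        # The directory encodes the team-size range as a JSON array of strings.
        "team_size": json.dumps([str(n) for n in team_size], separators=(",", ":")),
    }
    # quote (not quote_plus) encodes spaces as %20, matching the original URL.
    return f"{BASE_URL}?{urlencode(params, quote_via=quote)}"


print(build_search_url("Fintech", "America / Canada", (1, 25)))
```

With the arguments shown, the output reproduces the URL used in the workflow's actor input.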

Is it paid?

This workflow is completely free and can be imported and used directly. However, third-party services used by the workflow (for example, Apify) may require you to pay for them yourself.
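
The actor's listing name in the workflow JSON includes "$4.5 / 1K". Assuming this means USD 4.50 per 1,000 scraped companies (an assumption; check the actor's current pricing on Apify, and note that other platform charges may apply), a per-run estimate is simple arithmetic:

```python
def estimate_actor_cost(companies: int, usd_per_thousand: float = 4.5) -> float:
    """Rough actor cost in USD, under the assumed per-1,000-results price."""
    return companies * usd_per_thousand / 1000


# The workflow sets maxCompanies to 5.
print(f"${estimate_actor_cost(5):.4f}")
```

Raising `maxCompanies` scales the estimate linearly under this assumption.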
