Automated Real Estate Listing Extractor

Intermediate

This is a Market Research workflow in the automation domain, containing 7 nodes. It mainly uses the Code, Google Sheets, Schedule Trigger, and Scrapeless nodes. It uses Scrapeless and Google Sheets to automate the scraping of real estate listings.

Prerequisites
  • Google Sheets API credentials
  • Scrapeless API credentials (used by the Scrapeless crawler node)
Export workflow
Copy the following JSON configuration into n8n to import and use this workflow
{
  "id": "EgeVsV76EKfXbkcW",
  "meta": {
    "instanceId": "7d291de9dc3bbf0106d65e069919a3de2507e3365a7b25788a79a3562af9bfc5"
  },
  "name": "Automated Real Estate Listing Extractor",
  "tags": [],
  "nodes": [
    {
      "id": "337aabda-3017-4057-8383-6855837d5e9a",
      "name": "Activador Semanal de Mercado",
      "type": "n8n-nodes-base.scheduleTrigger",
      "position": [
        60,
        780
      ],
      "parameters": {
        "rule": {
          "interval": [
            {
              "field": "weeks",
              "triggerAtDay": [
                1
              ],
              "triggerAtHour": 9
            }
          ]
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "2be97af8-6121-4cbc-9239-1901d947d8e2",
      "name": "Nota Adhesiva3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        0,
        0
      ],
      "parameters": {
        "color": 6,
        "width": 620,
        "height": 1160,
        "content": "## 🔹 **SECTION 1: 🔁 Schedule Trigger - Automate Workflow**\n\n### 🧩 1. 📅 Schedule Trigger\n\n**Node Name:** `Schedule Trigger`  \n**What it does:**  \nAutomatically triggers the workflow once a week at 09:00 on the configured day, with no manual intervention needed. Keeps your data fresh and updated regularly.\n\n🧠 **Beginner Benefit:**  \n\n> Set it once and forget it: your workflow runs automatically on schedule without any extra effort.\n\n---\n\n## 🔹 **SECTION 2: 🌐 Scrapeless Crawler - Fetch Webpage Data**\n\n### 🧩 2. 🕷️ Scrapeless Crawler\n\n**Node Name:** `Scrapeless Crawler`  \n**What it does:**  \nSends a request to the Scrapeless API to crawl the target real estate webpage. Returns the page content in Markdown format for easy parsing later.\n\n**Example URL:**  \nhttps://www.loopnet.com/search/commercial-real-estate/los-angeles-ca/for-lease/\n\n🧠 **Beginner Benefit:**  \n\n> Leverage powerful scraping as a service, with no need to write complicated crawler code yourself.\n\n---\n"
      },
      "typeVersion": 1
    },
    {
      "id": "ce4de51e-920e-4e72-9aee-13f2180952fc",
      "name": "Nota Adhesiva",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        660,
        -380
      ],
      "parameters": {
        "color": 5,
        "width": 700,
        "height": 1540,
        "content": "\n\n## 🔹 **SECTION 3: 🕵️ Parse Listings - Extract Property Data**\n\n### 🧩 3. 🔍 Parse Listings (Code Node)\n\n\n**Node Name:** `Parse Listings`\n**What it does:**\nHandles the entire extraction and cleaning process in a single code node to keep the workflow simple.\n\n\n### ✅ **Step 1: Extract Markdown Text**\n\n* Collects the `markdown` field of every page in the Scrapeless crawl response.\n* Relies on Scrapeless to deliver the page as Markdown, so the node itself does no HTML cleanup.\n\n---\n\n### ✅ **Step 2: Parse Key Information**\n\n* Uses regex and string manipulation to extract critical fields from the Markdown text, including:\n\n  * 🏢 **Property Title**\n  * 🔗 **Link**\n  * 📐 **Size**\n  * 🏗️ **Year Built**\n  * 🖼️ **Image**\n\n* Outputs clean, structured **JSON objects** that are easy to pass to downstream nodes.\n\n---\n\n### ✅ **Step 3: Clean & Format Data**\n\n* Keeps only the relevant fields:\n\n  * `title`\n  * `link`\n  * `size`\n  * `yearBuilt`\n  * `image`\n\n* Formats the output to be clean and ready for export to Google Sheets, Notion, Slack, databases, or other platforms.\n\n---\n\n### 🧠 **Beginner Benefit:**\n\n> Extracts text, parses listings, and cleans data in one step, saving time and reducing node complexity. Produces structured, ready-to-use data for your business needs.\n\n"
      },
      "typeVersion": 1
    },
    {
      "id": "3cb24f03-3bc2-4ca9-9234-9dc2ab9c36a2",
      "name": "Rastrear",
      "type": "n8n-nodes-scrapeless.scrapeless",
      "position": [
        360,
        780
      ],
      "parameters": {
        "url": "https://www.loopnet.com/search/commercial-real-estate/los-angeles-ca/for-lease/",
        "resource": "crawler",
        "operation": "crawl",
        "limitCrawlPages": 2
      },
      "credentials": {
        "scrapelessApi": {
          "id": "B73pdQXNjpqNbIhs",
          "name": "Scrapeless account"
        }
      },
      "typeVersion": 1
    },
    {
      "id": "64cbc99c-e071-4c0d-8758-a0e5167f4c88",
      "name": "Añadir o actualizar fila en hoja",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        1580,
        780
      ],
      "parameters": {
        "columns": {
          "value": {
            "Link": "={{ $json.link }}",
            "Size": "={{ $json.size }}",
            "Image": "={{ $json.image }}",
            "Title": "={{ $json.title }}",
            "YearBuilt": "={{ $json.yearBuilt }}"
          },
          "schema": [
            {
              "id": "Title",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "Title",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Link",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Link",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Size",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Size",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "YearBuilt",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "YearBuilt",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Image",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Image",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            }
          ],
          "mappingMode": "defineBelow",
          "matchingColumns": [
            "Title"
          ],
          "attemptToConvertTypes": false,
          "convertFieldsToString": false
        },
        "options": {},
        "operation": "appendOrUpdate",
        "sheetName": {
          "__rl": true,
          "mode": "list",
          "value": "gid=0",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/1of_9PIseDnbwGYiJ5SLx3bSU5m8TTXpBN6haS2f7EBY/edit#gid=0",
          "cachedResultName": "Sheet1"
        },
        "documentId": {
          "__rl": true,
          "mode": "list",
          "value": "1of_9PIseDnbwGYiJ5SLx3bSU5m8TTXpBN6haS2f7EBY",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/1of_9PIseDnbwGYiJ5SLx3bSU5m8TTXpBN6haS2f7EBY/edit?usp=drivesdk",
          "cachedResultName": "Real Estate Market Report"
        }
      },
      "typeVersion": 4.6
    },
    {
      "id": "0458cbbb-e60b-461d-aed0-562d5067946e",
      "name": "Nota Adhesiva1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1420,
        -60
      ],
      "parameters": {
        "width": 580,
        "height": 1220,
        "content": "\n\n## 🔹 **SECTION 6: 📊 Append to Google Sheets - Save Data**\n\n### 🧩 6. 📈 Append to Google Sheets\n\n**Node Name:** `Google Sheets Append`  \n**What it does:**  \nAppends the parsed and cleaned property data to a Google Sheets spreadsheet for easy review and analysis. Rows are matched on the `Title` column, so a listing that is already in the sheet is updated instead of duplicated.\n\n🧠 **Beginner Benefit:**  \n\n> Automatically keeps your spreadsheet up to date with fresh listings, no copy/paste required.\n\n"
      },
      "typeVersion": 1
    },
    {
      "id": "f0d425e4-af8d-4c6f-bced-625ba3b094f0",
      "name": "Analizar Listados",
      "type": "n8n-nodes-base.code",
      "position": [
        860,
        780
      ],
      "parameters": {
        "jsCode": "// Collect the Markdown of every page returned by the Scrapeless crawler.\nconst markdownData = [];\n$input.all().forEach((item) => {\n\t// The crawl result is expected to be an array of pages; a single page object is also accepted.\n\tconst pages = Array.isArray(item.json) ? item.json : [item.json];\n\tpages.forEach((page) => {\n\t\tif (page && page.markdown) markdownData.push(page.markdown);\n\t});\n});\n\nconst results = [];\n\n// Extract every listing found in one Markdown page and push it to results.\nfunction extractListings(md) {\n\tconst re = /\\[More details for ([^\\]]+)\\]\\((https:\\/\\/www\\.loopnet\\.com\\/Listing\\/[^\\)]+)\\)/g;\n\n\tlet match;\n\n\twhile ((match = re.exec(md))) {\n\t\tconst title = match[1].trim();\n\t\t// Keep only the URL, dropping any link title that follows it\n\t\tconst link = match[2].trim().split(' ')[0];\n\n\t\t// Extract a snippet of context around the match\n\t\tconst context = md.slice(match.index, match.index + 500);\n\n\t\t// Extract size range, e.g. \"10,000 - 20,000 SF\"\n\t\tconst sizeMatch = context.match(/([\\d,]+)\\s*-\\s*([\\d,]+)\\s*SF/);\n\t\tconst sizeRange = sizeMatch ? `${sizeMatch[1]} - ${sizeMatch[2]} SF` : null;\n\n\t\t// Extract year built, e.g. \"Built in 1988\"\n\t\tconst yearMatch = context.match(/Built in\\s*(\\d{4})/i);\n\t\tconst yearBuilt = yearMatch ? yearMatch[1] : null;\n\n\t\t// Extract image URL\n\t\tconst imageMatch = context.match(/!\\[[^\\]]*\\]\\((https:\\/\\/images1\\.loopnet\\.com[^\\)]+)\\)/);\n\t\tconst image = imageMatch ? imageMatch[1] : null;\n\n\t\tresults.push({\n\t\t\tjson: {\n\t\t\t\ttitle,\n\t\t\t\tlink,\n\t\t\t\tsize: sizeRange,\n\t\t\t\tyearBuilt,\n\t\t\t\timage,\n\t\t\t},\n\t\t});\n\t}\n}\n\nmarkdownData.forEach((md) => {\n\textractListings(md);\n});\n\n// Return the original markdown if no listings matched (for debugging)\nif (results.length === 0) {\n\treturn [\n\t\t{\n\t\t\tjson: {\n\t\t\t\terror: 'No listings matched',\n\t\t\t\traw: markdownData.join('\\n\\n'),\n\t\t\t},\n\t\t},\n\t];\n}\n\nreturn results;\n"
      },
      "typeVersion": 2
    }
  ],
  "active": false,
  "pinData": {},
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "3bbe4fe1-455d-4486-af39-d0980957100e",
  "connections": {
    "3cb24f03-3bc2-4ca9-9234-9dc2ab9c36a2": {
      "main": [
        [
          {
            "node": "f0d425e4-af8d-4c6f-bced-625ba3b094f0",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "f0d425e4-af8d-4c6f-bced-625ba3b094f0": {
      "main": [
        [
          {
            "node": "64cbc99c-e071-4c0d-8758-a0e5167f4c88",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "337aabda-3017-4057-8383-6855837d5e9a": {
      "main": [
        [
          {
            "node": "3cb24f03-3bc2-4ca9-9234-9dc2ab9c36a2",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
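The parsing done by the Code node can be tried outside n8n. The following is a minimal sketch, assuming the crawler's Markdown contains "More details for" links in the shape the node's regular expressions expect. The function name `parseListings` and the sample text are hypothetical and made up for illustration; the sample is not actual LoopNet output.

```javascript
// Standalone sketch of the Code node's parsing step (names are illustrative).
const LISTING_RE = /\[More details for ([^\]]+)\]\((https:\/\/www\.loopnet\.com\/Listing\/[^)]+)\)/g;

function parseListings(markdown) {
  const listings = [];
  for (const match of markdown.matchAll(LISTING_RE)) {
    const title = match[1].trim();
    // Keep only the URL, dropping any link title that follows it.
    const link = match[2].trim().split(' ')[0];
    // Search the 500 characters starting at the link for the other fields.
    const context = markdown.slice(match.index, match.index + 500);
    const size = context.match(/([\d,]+)\s*-\s*([\d,]+)\s*SF/);
    const year = context.match(/Built in\s*(\d{4})/i);
    const image = context.match(/!\[[^\]]*\]\((https:\/\/images1\.loopnet\.com[^)]+)\)/);
    listings.push({
      title,
      link,
      size: size ? `${size[1]} - ${size[2]} SF` : null,
      yearBuilt: year ? year[1] : null,
      image: image ? image[1] : null,
    });
  }
  return listings;
}

// Hypothetical sample input, invented for this example.
const sampleMarkdown = [
  '[More details for 123 Example St](https://www.loopnet.com/Listing/123-Example-St/1111/ "123 Example St")',
  '10,000 - 20,000 SF Office, Built in 1988',
  '![photo](https://images1.loopnet.com/i2/example.jpg)',
].join('\n');

const parsed = parseListings(sampleMarkdown);
console.log(parsed);
```

Listings whose size, year, or image does not appear within the 500-character window come back with `null` in that field, which mirrors how the node fills the corresponding sheet columns.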
Frequently asked questions

How do I use this workflow?

Copy the JSON configuration above, create a new workflow in your n8n instance, select "Import from JSON", paste the configuration, and then update the credential settings as needed.
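Before importing, it can help to see which nodes carry credential references that will need replacing. The following is a rough sketch, assuming the export has the same `nodes` and `credentials` layout as the JSON above; `nodesNeedingCredentials` is a hypothetical helper, and the embedded JSON is a trimmed stand-in for the full export.

```javascript
// List the nodes in an exported n8n workflow that reference credentials.
function nodesNeedingCredentials(workflow) {
  return (workflow.nodes || [])
    .filter((node) => node.credentials && Object.keys(node.credentials).length > 0)
    .map((node) => ({
      name: node.name,
      type: node.type,
      credentialTypes: Object.keys(node.credentials),
    }));
}

// Trimmed stand-in for the exported JSON; in practice, load the full export instead.
const workflowJson = `{
  "nodes": [
    { "name": "Rastrear", "type": "n8n-nodes-scrapeless.scrapeless",
      "credentials": { "scrapelessApi": { "id": "B73pdQXNjpqNbIhs", "name": "Scrapeless account" } } },
    { "name": "Analizar Listados", "type": "n8n-nodes-base.code" }
  ]
}`;

const workflow = JSON.parse(workflowJson); // throws if the JSON is malformed
const flagged = nodesNeedingCredentials(workflow);
console.log(flagged);
```

In this export the Google Sheets node has no `credentials` block, so a check like this would not flag it. Its credentials, target document, and sheet still have to be selected after import.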

Which scenarios is this workflow suited to?

Intermediate - Market research

Is it paid?

This workflow is completely free; you can import and use it directly. Note, however, that third-party services used in the workflow (such as the Scrapeless API) may require payment on your part.

Workflow information
Difficulty level: Intermediate
Number of nodes: 7
Categories: 1
Node types: 5
Difficulty description

Suitable for users with intermediate experience; medium-complexity workflows with 6-15 nodes
