Automated Real Estate Listing Extractor
This is a 7-node automation workflow in the Market Research category. It mainly uses the Code, Google Sheets, Schedule Trigger, and Scrapeless nodes. It automates the scraping of real estate listings with Scrapeless and stores the results in Google Sheets.
- Google Sheets API credentials
Nodes used (7)
Category
{
"id": "EgeVsV76EKfXbkcW",
"meta": {
"instanceId": "7d291de9dc3bbf0106d65e069919a3de2507e3365a7b25788a79a3562af9bfc5"
},
"name": "Automated Real Estate Listing Extractor",
"tags": [],
"nodes": [
{
"id": "337aabda-3017-4057-8383-6855837d5e9a",
"name": "Weekly Market Trigger",
"type": "n8n-nodes-base.scheduleTrigger",
"position": [
60,
780
],
"parameters": {
"rule": {
"interval": [
{
"field": "weeks",
"triggerAtDay": [
1
],
"triggerAtHour": 9
}
]
}
},
"typeVersion": 1.2
},
{
"id": "2be97af8-6121-4cbc-9239-1901d947d8e2",
"name": "Sticky Note3",
"type": "n8n-nodes-base.stickyNote",
"position": [
0,
0
],
"parameters": {
"color": 6,
"width": 620,
"height": 1160,
"content": "## 🔹 **SECTION 1: 🔁 Schedule Trigger - Automate Workflow**\n\n### 🧩 1. 📅 Schedule Trigger\n\n**Node Name:** `Schedule Trigger` \n**What it does:** \nAutomatically triggers the workflow once a week, at 09:00 on the configured weekday, with no manual intervention needed. Keeps your data fresh and updated regularly.\n\n🧠 **Beginner Benefit:** \n\n> Set it once and forget it: your workflow runs automatically on schedule without any extra effort.\n\n---\n\n## 🔹 **SECTION 2: 🌐 Scrapeless Crawler - Fetch Webpage Data**\n\n### 🧩 2. 🕷️ Scrapeless Crawler\n\n**Node Name:** `Scrapeless Crawler` \n**What it does:** \nSends a request to the Scrapeless API to crawl the target real estate webpage (up to 2 pages, per `limitCrawlPages`). Returns the page content in Markdown format for easy parsing later.\n\n**Example URL:** \nhttps://www.loopnet.com/search/commercial-real-estate/los-angeles-ca/for-lease/\n\n🧠 **Beginner Benefit:** \n\n> Leverage powerful scraping as a service, with no need to write complicated crawler code yourself.\n\n---\n"
},
"typeVersion": 1
},
{
"id": "ce4de51e-920e-4e72-9aee-13f2180952fc",
"name": "Sticky Note",
"type": "n8n-nodes-base.stickyNote",
"position": [
660,
-380
],
"parameters": {
"color": 5,
"width": 700,
"height": 1540,
"content": "\n\n## 🔹 **SECTION 3: 🕵️ Parse Listings - Extract Property Data**\n\n### 🧩 3. 🔍 Parse Listings (Code Node)\n\n\n**Node Name:** `Parse Listings`\n**What it does:**\nHandles the entire extraction and cleaning process in a single Code node to keep the workflow simple.\n\n\n### ✅ **Step 1: Collect Markdown Text**\n\n* Collects the `markdown` field of every page returned by the Scrapeless crawler.\n* The crawler already delivers the page content as Markdown, so this node does no HTML cleanup of its own.\n\n---\n\n### ✅ **Step 2: Parse Key Information**\n\n* Uses regex and string manipulation to extract critical fields from the Markdown text, including:\n\n * 🏢 **Property Title**\n * 🔗 **Link**\n * 📐 **Size**\n * 🏗️ **Year Built**\n * 🖼️ **Image**\n\n* Outputs clean, structured **JSON objects** that are easy to pass to downstream nodes.\n\n---\n\n### ✅ **Step 3: Clean & Format Data**\n\n* Keeps only the relevant fields, setting any field that cannot be found to `null`:\n\n * `title`\n * `link`\n * `size`\n * `yearBuilt`\n * `image`\n\n* Formats the output to be clean and ready for export to Google Sheets, Notion, Slack, databases, or other platforms.\n\n---\n\n### 🧠 **Beginner Benefit:**\n\n> Collects text, parses listings, and cleans data in one step, saving time and reducing node complexity. Produces structured, ready-to-use data for your business needs.\n\n"
},
"typeVersion": 1
},
{
"id": "3cb24f03-3bc2-4ca9-9234-9dc2ab9c36a2",
"name": "Crawl",
"type": "n8n-nodes-scrapeless.scrapeless",
"position": [
360,
780
],
"parameters": {
"url": "https://www.loopnet.com/search/commercial-real-estate/los-angeles-ca/for-lease/",
"resource": "crawler",
"operation": "crawl",
"limitCrawlPages": 2
},
"credentials": {
"scrapelessApi": {
"id": "B73pdQXNjpqNbIhs",
"name": "Scrapeless account"
}
},
"typeVersion": 1
},
{
"id": "64cbc99c-e071-4c0d-8758-a0e5167f4c88",
"name": "Append or update row in sheet",
"type": "n8n-nodes-base.googleSheets",
"position": [
1580,
780
],
"parameters": {
"columns": {
"value": {
"Link": "={{ $json.link }}",
"Size": "={{ $json.size }}",
"Image": "={{ $json.image }}",
"Title": "={{ $json.title }}",
"YearBuilt": "={{ $json.yearBuilt }}"
},
"schema": [
{
"id": "Title",
"type": "string",
"display": true,
"removed": false,
"required": false,
"displayName": "Title",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "Link",
"type": "string",
"display": true,
"required": false,
"displayName": "Link",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "Size",
"type": "string",
"display": true,
"required": false,
"displayName": "Size",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "YearBuilt",
"type": "string",
"display": true,
"required": false,
"displayName": "YearBuilt",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "Image",
"type": "string",
"display": true,
"required": false,
"displayName": "Image",
"defaultMatch": false,
"canBeUsedToMatch": true
}
],
"mappingMode": "defineBelow",
"matchingColumns": [
"Title"
],
"attemptToConvertTypes": false,
"convertFieldsToString": false
},
"options": {},
"operation": "appendOrUpdate",
"sheetName": {
"__rl": true,
"mode": "list",
"value": "gid=0",
"cachedResultUrl": "https://docs.google.com/spreadsheets/d/1of_9PIseDnbwGYiJ5SLx3bSU5m8TTXpBN6haS2f7EBY/edit#gid=0",
"cachedResultName": "Sheet1"
},
"documentId": {
"__rl": true,
"mode": "list",
"value": "1of_9PIseDnbwGYiJ5SLx3bSU5m8TTXpBN6haS2f7EBY",
"cachedResultUrl": "https://docs.google.com/spreadsheets/d/1of_9PIseDnbwGYiJ5SLx3bSU5m8TTXpBN6haS2f7EBY/edit?usp=drivesdk",
"cachedResultName": "Real Estate Market Report"
}
},
"typeVersion": 4.6
},
{
"id": "0458cbbb-e60b-461d-aed0-562d5067946e",
"name": "Sticky Note1",
"type": "n8n-nodes-base.stickyNote",
"position": [
1420,
-60
],
"parameters": {
"width": 580,
"height": 1220,
"content": "\n\n## 🔹 **SECTION 4: 📊 Append to Google Sheets - Save Data**\n\n### 🧩 4. 📈 Append to Google Sheets\n\n**Node Name:** `Google Sheets Append` \n**What it does:** \nWrites the parsed and cleaned property data to a Google Sheets spreadsheet for easy review and analysis. Each listing is appended as a new row, or updates the existing row when a row with the same `Title` is already present.\n\n🧠 **Beginner Benefit:** \n\n> Automatically keeps your spreadsheet up-to-date with fresh listings, with no copy/paste required.\n\n"
},
"typeVersion": 1
},
{
"id": "f0d425e4-af8d-4c6f-bced-625ba3b094f0",
"name": "Parse Listings",
"type": "n8n-nodes-base.code",
"position": [
860,
780
],
"parameters": {
"jsCode": "// Collect the Markdown of every crawled page from the incoming items\nconst markdownData = [];\n$input.all().forEach((item) => {\n\t// The crawler output is expected to be an array of pages; tolerate a single object too\n\tconst pages = Array.isArray(item.json) ? item.json : [item.json];\n\tpages.forEach((page) => {\n\t\tif (page && typeof page.markdown === 'string') {\n\t\t\tmarkdownData.push(page.markdown);\n\t\t}\n\t});\n});\n\nconst results = [];\n\n// Push one result per listing link found in a page of Markdown\nfunction extractListings(md) {\n\tconst re = /\\[More details for ([^\\]]+)\\]\\((https:\\/\\/www\\.loopnet\\.com\\/Listing\\/[^\\)]+)\\)/g;\n\n\tlet match;\n\n\twhile ((match = re.exec(md))) {\n\t\tconst title = match[1].trim();\n\t\t// Keep only the URL if a link title follows it after a space\n\t\tconst link = match[2].trim().split(' ')[0];\n\n\t\t// Take the 500 characters starting at the match as context for this listing\n\t\tconst context = md.slice(match.index, match.index + 500);\n\n\t\t// Extract size range, e.g. \"10,000 - 20,000 SF\"\n\t\tconst sizeMatch = context.match(/([\\d,]+)\\s*-\\s*([\\d,]+)\\s*SF/);\n\t\tconst sizeRange = sizeMatch ? `${sizeMatch[1]} - ${sizeMatch[2]} SF` : null;\n\n\t\t// Extract year built, e.g. \"Built in 1988\"\n\t\tconst yearMatch = context.match(/Built in\\s*(\\d{4})/i);\n\t\tconst yearBuilt = yearMatch ? yearMatch[1] : null;\n\n\t\t// Extract image URL\n\t\tconst imageMatch = context.match(/!\\[[^\\]]*\\]\\((https:\\/\\/images1\\.loopnet\\.com[^\\)]+)\\)/);\n\t\tconst image = imageMatch ? imageMatch[1] : null;\n\n\t\tresults.push({\n\t\t\tjson: {\n\t\t\t\ttitle,\n\t\t\t\tlink,\n\t\t\t\tsize: sizeRange,\n\t\t\t\tyearBuilt,\n\t\t\t\timage,\n\t\t\t},\n\t\t});\n\t}\n}\n\nmarkdownData.forEach((md) => {\n\textractListings(md);\n});\n\n// Return the original markdown if no listings matched on any page (for debugging)\nif (results.length === 0) {\n\treturn [\n\t\t{\n\t\t\tjson: {\n\t\t\t\terror: 'No listings matched',\n\t\t\t\traw: markdownData.join('\\n\\n'),\n\t\t\t},\n\t\t},\n\t];\n}\n\nreturn results;\n"
},
"typeVersion": 2
}
],
"active": false,
"pinData": {},
"settings": {
"executionOrder": "v1"
},
"versionId": "3bbe4fe1-455d-4486-af39-d0980957100e",
"connections": {
"3cb24f03-3bc2-4ca9-9234-9dc2ab9c36a2": {
"main": [
[
{
"node": "f0d425e4-af8d-4c6f-bced-625ba3b094f0",
"type": "main",
"index": 0
}
]
]
},
"f0d425e4-af8d-4c6f-bced-625ba3b094f0": {
"main": [
[
{
"node": "64cbc99c-e071-4c0d-8758-a0e5167f4c88",
"type": "main",
"index": 0
}
]
]
},
"337aabda-3017-4057-8383-6855837d5e9a": {
"main": [
[
{
"node": "3cb24f03-3bc2-4ca9-9234-9dc2ab9c36a2",
"type": "main",
"index": 0
}
]
]
}
}
}
How do I use this workflow?
Copy the JSON configuration above, create a new workflow in your n8n instance, select "Import from JSON", paste the configuration, and then replace the Scrapeless and Google Sheets credentials with your own.
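The parsing step of the Code node can be tried outside n8n before the workflow is imported or adapted to another site. The following is a minimal sketch, assuming Node.js is available. It reuses the regular expressions from the `jsCode` above; `parseListings` and `sampleMarkdown` are hypothetical names, and the sample text is invented for illustration rather than real LoopNet output.

```javascript
// Standalone sketch of the Code node's parsing step (run with Node.js).
// ASSUMPTION: this sample is made up for illustration; real crawler output may differ.
const sampleMarkdown = [
  '[More details for 123 Example St](https://www.loopnet.com/Listing/123-Example-St/1234567/)',
  '![Listing photo](https://images1.loopnet.com/i2/example/image.jpg)',
  '10,000 - 20,000 SF Office. Built in 1988.',
  '',
  '[More details for 456 Sample Ave](https://www.loopnet.com/Listing/456-Sample-Ave/7654321/)',
  '5,000 - 8,000 SF Retail',
].join('\n');

// Hypothetical helper: returns one plain object per listing link found in the Markdown.
function parseListings(md) {
  const listings = [];
  const re = /\[More details for ([^\]]+)\]\((https:\/\/www\.loopnet\.com\/Listing\/[^)]+)\)/g;

  let match;
  while ((match = re.exec(md)) !== null) {
    // Same fixed window as the workflow: 500 characters starting at the link.
    const context = md.slice(match.index, match.index + 500);

    const size = context.match(/([\d,]+)\s*-\s*([\d,]+)\s*SF/);
    const year = context.match(/Built in\s*(\d{4})/i);
    const image = context.match(/!\[[^\]]*\]\((https:\/\/images1\.loopnet\.com[^)]+)\)/);

    listings.push({
      title: match[1].trim(),
      link: match[2].trim().split(' ')[0],
      size: size ? `${size[1]} - ${size[2]} SF` : null,
      yearBuilt: year ? year[1] : null,
      image: image ? image[1] : null,
    });
  }
  return listings;
}

const listings = parseListings(sampleMarkdown);
console.log(listings);
```

Because the context is a fixed 500 characters starting at each link, a listing that lacks a field can pick up the value of the next listing when that one begins inside the window. This is worth checking when the patterns are adjusted for a different page layout.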
Which scenarios is this workflow suited for?
Intermediate - Market research
Is it paid?
This workflow is completely free and can be used as is. Note that the third-party services it relies on (such as the Scrapeless API) may require payment on your part.
scrapeless official
@scrapelessofficial