Template to extract PDFs

Advanced

This is an automation workflow in the Document Extraction and AI Summarization categories, containing 24 nodes. It mainly uses nodes such as Set, Code, Html, Merge, and Discord. It extracts PDFs from a website, summarizes them with a Llama AI model, and tracks them in Google Sheets.

Prerequisites
  • Discord bot token or webhook
  • May require authentication credentials for the target API
  • Google Sheets API credentials
Export the workflow
Copy the following JSON configuration into n8n to import and use this workflow
{
  "id": "7W7d0YwVNAq2e1ia",
  "meta": {
    "instanceId": "3cc1878486c4b89b99f849786349de8096d559c2a7d28662583b888d19dabd2f"
  },
  "name": "Template to extract PDFs",
  "tags": [],
  "nodes": [
    {
      "id": "e8790514-92e0-4f1f-bf17-94833512d86e",
      "name": "Lors du clic sur 'Exécuter le workflow'",
      "type": "n8n-nodes-base.manualTrigger",
      "position": [
        -400,
        -96
      ],
      "parameters": {},
      "typeVersion": 1
    },
    {
      "id": "8dea3c26-0cb2-4607-8135-d45963200c6a",
      "name": "HTTP Request",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        -192,
        -240
      ],
      "parameters": {
        "url": "PUT THE URL OF THE WEBSITE YOU WANT TO EXTRACT PDFS FROM",
        "options": {}
      },
      "typeVersion": 4.2
    },
    {
      "id": "aacd88bc-e89b-47fc-af70-7c85f5e6c4f8",
      "name": "HTML",
      "type": "n8n-nodes-base.html",
      "position": [
        -48,
        -96
      ],
      "parameters": {
        "options": {},
        "operation": "extractHtmlContent",
        "extractionValues": {
          "values": [
            {
              "key": "pdf",
              "attribute": "href",
              "cssSelector": "a[href$=\".pdf\"]",
              "returnArray": true,
              "returnValue": "attribute"
            },
            {
              "key": "tittle",
              "cssSelector": "a[href$=\".pdf\"]",
              "returnArray": true
            }
          ]
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "963ff0ab-af77-4e66-8339-11b7d105c318",
      "name": "Code",
      "type": "n8n-nodes-base.code",
      "notes": "Lists the PDFs",
      "position": [
        96,
        -96
      ],
      "parameters": {
        "jsCode": "const baseUrl = \"https://www.playway.com\";\n\nreturn $input.first().json.pdf.map(link => {\n  return {\n    json: {\n      url: `${baseUrl}${link}`\n    }\n  };\n});"
      },
      "notesInFlow": true,
      "typeVersion": 2
    },
    {
      "id": "e305771c-6ad1-43ea-a8c1-61ba61be5df7",
      "name": "HTTP Request1",
      "type": "n8n-nodes-base.httpRequest",
      "onError": "continueErrorOutput",
      "position": [
        -416,
        208
      ],
      "parameters": {
        "url": "={{ $json.url }}",
        "options": {
          "response": {
            "response": {
              "neverError": true,
              "responseFormat": "file"
            }
          }
        }
      },
      "retryOnFail": true,
      "typeVersion": 4.2
    },
    {
      "id": "1306f74c-cf7a-4f4b-897a-abc15281eac2",
      "name": "Extract from File",
      "type": "n8n-nodes-base.extractFromFile",
      "onError": "continueRegularOutput",
      "position": [
        -176,
        208
      ],
      "parameters": {
        "options": {},
        "operation": "pdf"
      },
      "typeVersion": 1
    },
    {
      "id": "ddaba8bb-676b-4c1e-9986-47b55503520d",
      "name": "Schedule Trigger",
      "type": "n8n-nodes-base.scheduleTrigger",
      "notes": "Weekly execution",
      "position": [
        -400,
        -288
      ],
      "parameters": {
        "rule": {
          "interval": [
            {
              "field": "weeks"
            }
          ]
        }
      },
      "notesInFlow": true,
      "typeVersion": 1.2
    },
    {
      "id": "f56b233d-f303-4845-be9e-7cde10110f87",
      "name": "Discord",
      "type": "n8n-nodes-base.discord",
      "position": [
        1024,
        144
      ],
      "webhookId": "b30b5012-4bb4-4a31-a291-f7490b21e4f6",
      "parameters": {
        "content": "={{ $json.message }}",
        "options": {},
        "authentication": "webhook"
      },
      "typeVersion": 2
    },
    {
      "id": "4c946a75-11bd-4c5e-9e29-9b95b56b342b",
      "name": "Get row(s) in sheet",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        336,
        -80
      ],
      "parameters": {
        "options": {},
        "sheetName": {
          "__rl": true,
          "mode": "name",
          "value": "Sheet1"
        },
        "documentId": {
          "__rl": true,
          "mode": "list",
          "value": "1eJVTi79GRuz05MRn55KsQGKnn9R3yX8BBH212POrcKA",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/1eJVTi79GRuz05MRn55KsQGKnn9R3yX8BBH212POrcKA/edit?usp=drivesdk",
          "cachedResultName": "PlayWay_BabelBots"
        }
      },
      "executeOnce": true,
      "typeVersion": 4.6,
      "alwaysOutputData": true
    },
    {
      "id": "792b65d0-3e97-44d2-994b-7ddbb765c9dd",
      "name": "OpenRouter Chat Model",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenRouter",
      "position": [
        -464,
        592
      ],
      "parameters": {
        "model": "meta-llama/llama-4-maverick",
        "options": {}
      },
      "typeVersion": 1
    },
    {
      "id": "377311e1-ee99-48a9-8b23-fc9b09b4abbf",
      "name": "AI Agent",
      "type": "@n8n/n8n-nodes-langchain.agent",
      "position": [
        32,
        208
      ],
      "parameters": {
        "text": "=Analyze the following company report and provide an executive summary in English, highlighting:\n\n* Key financial information (if available)\n* Relevant actions announced by the company\n* Changes in the board of directors or other important decisions\n* Any potential impact for investors\n\nReport text:\n{{ $json.text }}\n",
        "options": {},
        "promptType": "define"
      },
      "typeVersion": 2.1
    },
    {
      "id": "73691e8d-3eb0-4918-8f57-e42ffda2ff1e",
      "name": "Simple Memory",
      "type": "@n8n/n8n-nodes-langchain.memoryBufferWindow",
      "position": [
        144,
        432
      ],
      "parameters": {
        "sessionKey": "playway-agent",
        "sessionIdType": "customKey"
      },
      "typeVersion": 1.3
    },
    {
      "id": "f6a43984-cc3f-4a02-aadc-575ba322778f",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -192,
        160
      ],
      "parameters": {
        "color": 3,
        "width": 768,
        "height": 400,
        "content": "This is the AI process — go in and change the prompt to your liking.\n"
      },
      "typeVersion": 1
    },
    {
      "id": "1928302e-8082-4d28-a553-4a7bb295547e",
      "name": "Edit Fields1",
      "type": "n8n-nodes-base.set",
      "position": [
        656,
        496
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "657129da-9b3f-43c2-ac71-a41799949ff1",
              "name": "url",
              "type": "string",
              "value": "={{ $('HTTP Request1').item.json.url }}"
            },
            {
              "id": "a3583044-d67b-4576-a9b9-b87181ce34c1",
              "name": "enviado_discord",
              "type": "string",
              "value": "true"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "bfd92648-dc17-4dec-b1a0-089b949b990e",
      "name": "Update row in sheet",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        864,
        496
      ],
      "parameters": {
        "columns": {
          "value": {
            "url": "={{ $json.url }}",
            "enviado_discord": "={{ $json.enviado_discord }}"
          },
          "schema": [
            {
              "id": "url",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "url",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "enviado_discord",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "enviado_discord",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            }
          ],
          "mappingMode": "defineBelow",
          "matchingColumns": [
            "url"
          ],
          "attemptToConvertTypes": false,
          "convertFieldsToString": false
        },
        "options": {},
        "operation": "update",
        "sheetName": {
          "__rl": true,
          "mode": "list",
          "value": "gid=0",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/1eJVTi79GRuz05MRn55KsQGKnn9R3yX8BBH212POrcKA/edit#gid=0",
          "cachedResultName": "PDFs"
        },
        "documentId": {
          "__rl": true,
          "mode": "list",
          "value": "1eJVTi79GRuz05MRn55KsQGKnn9R3yX8BBH212POrcKA",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/1eJVTi79GRuz05MRn55KsQGKnn9R3yX8BBH212POrcKA/edit?usp=drivesdk",
          "cachedResultName": "PlayWay_BabelBots"
        }
      },
      "typeVersion": 4.6
    },
    {
      "id": "2996662d-4710-4340-96d0-3f6272a756ea",
      "name": "Append or update row in sheet",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        672,
        -80
      ],
      "parameters": {
        "columns": {
          "value": {},
          "schema": [
            {
              "id": "url",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "url",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "enviado_discord",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "enviado_discord",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            }
          ],
          "mappingMode": "autoMapInputData",
          "matchingColumns": [
            "url"
          ],
          "attemptToConvertTypes": false,
          "convertFieldsToString": false
        },
        "options": {},
        "operation": "appendOrUpdate",
        "sheetName": {
          "__rl": true,
          "mode": "list",
          "value": "gid=0",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/1eJVTi79GRuz05MRn55KsQGKnn9R3yX8BBH212POrcKA/edit#gid=0",
          "cachedResultName": "PDFs"
        },
        "documentId": {
          "__rl": true,
          "mode": "list",
          "value": "1eJVTi79GRuz05MRn55KsQGKnn9R3yX8BBH212POrcKA",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/1eJVTi79GRuz05MRn55KsQGKnn9R3yX8BBH212POrcKA/edit?usp=drivesdk",
          "cachedResultName": "PlayWay_BabelBots"
        }
      },
      "typeVersion": 4.6
    },
    {
      "id": "3bf5e883-58dc-4035-b0fd-a87a00702da8",
      "name": "Merge",
      "type": "n8n-nodes-base.merge",
      "position": [
        512,
        -128
      ],
      "parameters": {
        "mode": "combine",
        "options": {},
        "joinMode": "keepNonMatches",
        "fieldsToMatchString": "url"
      },
      "typeVersion": 3.2
    },
    {
      "id": "570fdfd2-e438-4b5c-86d2-e0729389b653",
      "name": "Filtro caracteres max",
      "type": "n8n-nodes-base.code",
      "position": [
        416,
        192
      ],
      "parameters": {
        "jsCode": "const input = $input.all();\nconst chunkSize = 1900;\nconst result = [];\n\nfor (const item of input) {\n  const text = item.json.output;\n  for (let i = 0; i < text.length; i += chunkSize) {\n    result.push({\n      json: {\n        message: text.slice(i, i + chunkSize)\n      }\n    });\n  }\n}\n\nreturn result;\n\n"
      },
      "typeVersion": 2
    },
    {
      "id": "23a4e1dc-5535-43a9-8f5e-0e4e7f757044",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -704,
        -400
      ],
      "parameters": {
        "width": 432,
        "height": 448,
        "content": "## Start\n\nThese blocks trigger the flow — one is manual, and one is scheduled. Open them and set the frequency you want.\n"
      },
      "typeVersion": 1
    },
    {
      "id": "807096dc-8460-4ce2-80e8-d08ce64e4141",
      "name": "Sticky Note2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -224,
        -368
      ],
      "parameters": {
        "color": 2,
        "width": 480,
        "height": 432,
        "content": "## Website Access and PDF Search\n\nOpen the first node that connects to the website and set it to the one you want.\n"
      },
      "typeVersion": 1
    },
    {
      "id": "2eac5cae-923f-49c3-8897-ffc22801c882",
      "name": "Sticky Note3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        304,
        -272
      ],
      "parameters": {
        "width": 496,
        "height": 336,
        "content": "## Google Connection\n\nHere you need to connect to the Google Cloud API: [https://cloud.google.com/](https://cloud.google.com/) and enable both Drive and Sheets.\nThen, create a sheet in Drive — this will be the one you use.\n"
      },
      "typeVersion": 1
    },
    {
      "id": "b03fe146-3fce-464e-9f91-5edff8c9b769",
      "name": "Sticky Note4",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -576,
        384
      ],
      "parameters": {
        "color": 6,
        "width": 320,
        "height": 368,
        "content": "## Model Connection\nHere you need to register on OpenRouter and enter the API key. You can choose any model you like, as long as it’s free.\n"
      },
      "typeVersion": 1
    },
    {
      "id": "446111cd-3794-4812-a379-9b69ea2ebf85",
      "name": "Sticky Note5",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        624,
        368
      ],
      "parameters": {
        "color": 4,
        "width": 448,
        "height": 288,
        "content": "## Marking the Registered URLs\n\nWith this, I simply receive which URLs I’ve downloaded the PDF from, and we mark them in the Google Sheet we created.\n"
      },
      "typeVersion": 1
    },
    {
      "id": "511215dc-2ba6-4e62-a622-565e765d4244",
      "name": "Sticky Note6",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        864,
        16
      ],
      "parameters": {
        "color": 4,
        "width": 448,
        "height": 288,
        "content": "## Sending to Discord\n\nIn a private Discord server, create a channel. In the channel settings, create an integration via webhook and paste it here.\n"
      },
      "typeVersion": 1
    }
  ],
  "active": false,
  "pinData": {},
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "b3ef412b-aeec-4a6e-89e3-f03acdc1e845",
  "connections": {
    "963ff0ab-af77-4e66-8339-11b7d105c318": {
      "main": [
        [
          {
            "node": "4c946a75-11bd-4c5e-9e29-9b95b56b342b",
            "type": "main",
            "index": 0
          },
          {
            "node": "3bf5e883-58dc-4035-b0fd-a87a00702da8",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "aacd88bc-e89b-47fc-af70-7c85f5e6c4f8": {
      "main": [
        [
          {
            "node": "963ff0ab-af77-4e66-8339-11b7d105c318",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "3bf5e883-58dc-4035-b0fd-a87a00702da8": {
      "main": [
        [
          {
            "node": "2996662d-4710-4340-96d0-3f6272a756ea",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "377311e1-ee99-48a9-8b23-fc9b09b4abbf": {
      "main": [
        [
          {
            "node": "570fdfd2-e438-4b5c-86d2-e0729389b653",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "1928302e-8082-4d28-a553-4a7bb295547e": {
      "main": [
        [
          {
            "node": "bfd92648-dc17-4dec-b1a0-089b949b990e",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "8dea3c26-0cb2-4607-8135-d45963200c6a": {
      "main": [
        [
          {
            "node": "aacd88bc-e89b-47fc-af70-7c85f5e6c4f8",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "e305771c-6ad1-43ea-a8c1-61ba61be5df7": {
      "main": [
        [
          {
            "node": "1306f74c-cf7a-4f4b-897a-abc15281eac2",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "73691e8d-3eb0-4918-8f57-e42ffda2ff1e": {
      "ai_memory": [
        [
          {
            "node": "377311e1-ee99-48a9-8b23-fc9b09b4abbf",
            "type": "ai_memory",
            "index": 0
          }
        ]
      ]
    },
    "ddaba8bb-676b-4c1e-9986-47b55503520d": {
      "main": [
        [
          {
            "node": "8dea3c26-0cb2-4607-8135-d45963200c6a",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "1306f74c-cf7a-4f4b-897a-abc15281eac2": {
      "main": [
        [
          {
            "node": "377311e1-ee99-48a9-8b23-fc9b09b4abbf",
            "type": "main",
            "index": 0
          },
          {
            "node": "1928302e-8082-4d28-a553-4a7bb295547e",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "4c946a75-11bd-4c5e-9e29-9b95b56b342b": {
      "main": [
        [
          {
            "node": "3bf5e883-58dc-4035-b0fd-a87a00702da8",
            "type": "main",
            "index": 1
          }
        ]
      ]
    },
    "570fdfd2-e438-4b5c-86d2-e0729389b653": {
      "main": [
        [
          {
            "node": "f56b233d-f303-4845-be9e-7cde10110f87",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "792b65d0-3e97-44d2-994b-7ddbb765c9dd": {
      "ai_languageModel": [
        [
          {
            "node": "377311e1-ee99-48a9-8b23-fc9b09b4abbf",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "2996662d-4710-4340-96d0-3f6272a756ea": {
      "main": [
        [
          {
            "node": "e305771c-6ad1-43ea-a8c1-61ba61be5df7",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "e8790514-92e0-4f1f-bf17-94833512d86e": {
      "main": [
        [
          {
            "node": "8dea3c26-0cb2-4607-8135-d45963200c6a",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
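The "Filtro caracteres max" Code node in the JSON above splits each AI summary into 1,900-character pieces before they reach the Discord node, presumably to stay under Discord's 2,000-character limit for message content. The sketch below restates that logic as two standalone functions. The names splitIntoChunks and toDiscordMessages are illustrative and do not appear in the workflow, and the guard for a missing output field is an added assumption rather than part of the original node.

```javascript
// Split a text into consecutive pieces of at most chunkSize characters.
// Non-string input (for example a missing `output` field) yields no chunks.
function splitIntoChunks(text, chunkSize = 1900) {
  const safeText = typeof text === "string" ? text : "";
  const chunks = [];
  for (let i = 0; i < safeText.length; i += chunkSize) {
    chunks.push(safeText.slice(i, i + chunkSize));
  }
  return chunks;
}

// Turn n8n-style items ({ json: { output } }) into one item per chunk,
// each shaped as { json: { message } } like the original node's output.
function toDiscordMessages(items, chunkSize = 1900) {
  return items.flatMap((item) =>
    splitIntoChunks(item.json.output, chunkSize).map((message) => ({
      json: { message },
    }))
  );
}
```

Inside an n8n Code node, the equivalent of the original body would be `return toDiscordMessages($input.all());`.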
Frequently asked questions

How do I use this workflow?

Copy the JSON configuration above, create a new workflow in your n8n instance, select "Import from JSON", paste the configuration, and adjust the authentication settings as needed.
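One parameter to adjust after import is the base URL in the "Code" node, which is hardcoded to https://www.playway.com and prepended to every extracted link. A minimal sketch of a more tolerant variant follows, assuming the standard URL class is available in your Code node environment. resolvePdfUrls is a hypothetical helper name, and the deduplication step is an addition that the original node does not perform.

```javascript
// Resolve each extracted href against baseUrl and return n8n-style items.
// Relative links are joined to the base; absolute links are kept as they are.
function resolvePdfUrls(links, baseUrl) {
  const seen = new Set();
  const items = [];
  for (const link of links || []) {
    let absolute;
    try {
      absolute = new URL(link, baseUrl).href;
    } catch {
      continue; // skip links that cannot be parsed as URLs
    }
    if (!seen.has(absolute)) {
      seen.add(absolute); // added assumption: drop duplicate links
      items.push({ json: { url: absolute } });
    }
  }
  return items;
}
```

In the Code node this would be called as `return resolvePdfUrls($input.first().json.pdf, baseUrl);`, with `baseUrl` set to the site configured in the first HTTP Request node.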

Which scenarios is this workflow suited for?

Advanced - Document Extraction, AI Summarization

Is it paid?

This workflow is completely free and can be used directly. Note that third-party services used in the workflow (such as the OpenRouter API) may require payment on your part.

Workflow information
Difficulty level: Advanced
Number of nodes: 24
Categories: 2
Node types: 14
Difficulty description

Suited to advanced users, with complex workflows containing 16+ nodes

Author
Cristian Baño Belchí

@babelbots

Automation and robotics technician. I will share all kinds of automations with the Spanish-speaking community through YouTube.
