Convert PDF, DOC, and Images to Markdown with the Datalab.to API

Intermediate

This is an automation workflow in the Document Extraction and Multimodal AI categories, containing 11 nodes. It mainly uses Set, Wait, Switch, FormTrigger, and HttpRequest nodes. It uses the Datalab.to API to convert PDF, DOC, and image files to Markdown.

Prerequisites
  • Authentication credentials for the target API (a Datalab.to API key, per the setup note inside the workflow)
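
The setup note inside the workflow names the credential as a header called X-API-Key, and the upload node posts a multipart form with max_pages, use_llm, output_format, and file fields. The sketch below builds an equivalent request outside n8n using only the Python standard library. The function name, the sample file name, and the default MIME type are assumptions made for illustration, and the request is only constructed, not sent.

```python
import uuid
from urllib.request import Request

# Endpoint taken from the workflow's upload node.
MARKER_URL = "https://www.datalab.to/api/v1/marker"


def build_marker_request(
    api_key: str,
    filename: str,
    file_bytes: bytes,
    max_pages: int = 4,
    use_llm: bool = True,
    output_format: str = "markdown",
    content_type: str = "application/pdf",  # assumption: adjust to the real file type
) -> Request:
    """Build (but do not send) a multipart POST mirroring the upload node."""
    boundary = uuid.uuid4().hex
    fields = {
        "max_pages": str(max_pages),
        "use_llm": "true" if use_llm else "false",
        "output_format": output_format,
    }
    parts = []
    # Plain text form fields, one multipart section each.
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode()
        )
    # The uploaded file. Simplification: the filename is not escaped.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f'Content-Type: {content_type}\r\n\r\n'.encode()
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    headers = {
        "X-API-Key": api_key,  # header name from the workflow's setup note
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return Request(MARKER_URL, data=b"".join(parts), headers=headers, method="POST")
```

Passing the returned object to urllib.request.urlopen would submit the file. Judging by the expression used in the workflow's second HTTP node, the JSON response is expected to include a request_check_url field to poll for the result.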
Export the workflow
Copy the following JSON configuration into n8n to import and use this workflow
{
  "meta": {
    "instanceId": "b3c467df4053d13fe31cc98f3c66fa1d16300ba750506bfd019a0913cec71ea3"
  },
  "nodes": [
    {
      "id": "c302f8e5-6bed-4b64-8d52-33eaa5fce86a",
      "name": "On form submission",
      "type": "n8n-nodes-base.formTrigger",
      "position": [
        -336,
        48
      ],
      "webhookId": "a857ad3f-4b4d-4574-b503-809a95b1fbbf",
      "parameters": {
        "options": {},
        "formTitle": "upload file",
        "formFields": {
          "values": [
            {
              "fieldType": "file",
              "fieldLabel": "file",
              "multipleFiles": false,
              "requiredField": true
            }
          ]
        }
      },
      "typeVersion": 2.2
    },
    {
      "id": "3cc12307-f5da-476a-811a-79920f20abf7",
      "name": "Get Markdown",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        336,
        48
      ],
      "parameters": {
        "url": "={{ $json.request_check_url }}",
        "options": {},
        "authentication": "genericCredentialType",
        "genericAuthType": "httpHeaderAuth"
      },
      "credentials": {
        "httpHeaderAuth": {
          "id": "7Wx4MGEDgCG0D7fT",
          "name": "datalab.io"
        }
      },
      "typeVersion": 4.2
    },
    {
      "id": "ec9e9703-fb38-4ba5-97f6-a73bee7c47f7",
      "name": "Send to Datalab API",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        -112,
        48
      ],
      "parameters": {
        "url": "https://www.datalab.to/api/v1/marker",
        "method": "POST",
        "options": {},
        "sendBody": true,
        "contentType": "multipart-form-data",
        "authentication": "genericCredentialType",
        "bodyParameters": {
          "parameters": [
            {
              "name": "max_pages",
              "value": "4"
            },
            {
              "name": "use_llm",
              "value": "true"
            },
            {
              "name": "output_format",
              "value": "markdown"
            },
            {
              "name": "file",
              "parameterType": "formBinaryData",
              "inputDataFieldName": "file"
            }
          ]
        },
        "genericAuthType": "httpHeaderAuth"
      },
      "credentials": {
        "httpHeaderAuth": {
          "id": "7Wx4MGEDgCG0D7fT",
          "name": "datalab.io"
        }
      },
      "typeVersion": 4.2
    },
    {
      "id": "542793bf-05c1-4c6c-b0e0-b087669f966c",
      "name": "Set Fields",
      "type": "n8n-nodes-base.set",
      "position": [
        560,
        48
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "20280dbd-8188-4f5e-8e4e-74621d65d40a",
              "name": "markdown",
              "type": "string",
              "value": "={{ $json.markdown }}"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "cb900de1-e4b1-488f-8a12-9946bb8466f6",
      "name": "Wait",
      "type": "n8n-nodes-base.wait",
      "position": [
        112,
        48
      ],
      "webhookId": "7b49cf9b-9859-4361-a8b8-2407d114d418",
      "parameters": {
        "amount": 10
      },
      "typeVersion": 1.1
    },
    {
      "id": "9fcdcaab-2041-4411-8ec1-10edfe29f1d6",
      "name": "Sticky Note13",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1424,
        -32
      ],
      "parameters": {
        "width": 972,
        "height": 564,
        "content": "![](https://res.cloudinary.com/dd6vlwblr/image/upload/v1756196495/Convert_PDF_DOC_IMAGES_to_1_sfdzu1.png)"
      },
      "typeVersion": 1
    },
    {
      "id": "32550fd8-2570-4e42-b249-8d53288da00a",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -416,
        -32
      ],
      "parameters": {
        "width": 1200,
        "height": 272,
        "content": "## Convert Files to Markdown for LLM with Datalab.to"
      },
      "typeVersion": 1
    },
    {
      "id": "f968189c-83c4-436c-91c6-dc0313ee4256",
      "name": "Sticky Note2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -416,
        272
      ],
      "parameters": {
        "width": 1200,
        "height": 256,
        "content": "## Guide\nThis simple automation lets you convert .doc, .pdf, .png, .jpg, and .webp files to Markdown using the datalab.to API. This is useful for LLM workflows and AI processing. \n\nSetup \n- Sign up at datalab.to to get your API key. Get the free $5 credits by entering your payment method.\n- Set up a Generic Header with name \"X-API-Key\" and the API key as the value. \n- That's it, you are good to go.\n\nLearn more about tweaking the body payload here: [api reference](https://www.datalab.to/redoc#operation/marker_api_v1_marker_post)"
      },
      "typeVersion": 1
    },
    {
      "id": "883e4090-cb4f-4de6-bfc4-4c5c7ce85728",
      "name": "Sticky Note3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        816,
        -32
      ],
      "parameters": {
        "color": 6,
        "width": 400,
        "height": 416,
        "content": "## Polling Confirmation Node\n\nYou can add this node after the \"Get Markdown\" node to confirm whether the GET request worked or the document is still being processed. If the outcome is \"*success*\", proceed; if it is \"*failed*\", connect that route to the Wait node to try again. You can also add more outcomes to this node to cover API call failures. If you are processing **larger files**, increase the wait time."
      },
      "typeVersion": 1
    },
    {
      "id": "4f0c8ae8-e92e-419f-a609-d426db8321d8",
      "name": "Switch",
      "type": "n8n-nodes-base.switch",
      "disabled": true,
      "position": [
        832,
        192
      ],
      "parameters": {
        "rules": {
          "values": [
            {
              "outputKey": "success",
              "conditions": {
                "options": {
                  "version": 2,
                  "leftValue": "",
                  "caseSensitive": true,
                  "typeValidation": "strict"
                },
                "combinator": "and",
                "conditions": [
                  {
                    "id": "cbac06bd-0ee6-40ac-bad9-bdb0929ca0ad",
                    "operator": {
                      "type": "string",
                      "operation": "equals"
                    },
                    "leftValue": "={{ $json.status }}",
                    "rightValue": "complete"
                  }
                ]
              },
              "renameOutput": true
            },
            {
              "outputKey": "failed",
              "conditions": {
                "options": {
                  "version": 2,
                  "leftValue": "",
                  "caseSensitive": true,
                  "typeValidation": "strict"
                },
                "combinator": "and",
                "conditions": [
                  {
                    "id": "bc2570f6-c9ec-4c4f-9a04-863be5fb1ced",
                    "operator": {
                      "type": "string",
                      "operation": "notEquals"
                    },
                    "leftValue": "={{ $json.status }}",
                    "rightValue": "complete"
                  }
                ]
              },
              "renameOutput": true
            }
          ]
        },
        "options": {}
      },
      "typeVersion": 3.2
    },
    {
      "id": "33ec2be6-e49a-4973-b105-5db71fc2cf40",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        816,
        400
      ],
      "parameters": {
        "color": 4,
        "width": 400,
        "height": 128,
        "content": "## Wanna work with me?\n**Email**: joseph@uppfy.com\n**X/Twitter**: [@juppfy](https://x.com/juppfy)"
      },
      "typeVersion": 1
    }
  ],
  "pinData": {},
  "connections": {
    "cb900de1-e4b1-488f-8a12-9946bb8466f6": {
      "main": [
        [
          {
            "node": "3cc12307-f5da-476a-811a-79920f20abf7",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "3cc12307-f5da-476a-811a-79920f20abf7": {
      "main": [
        [
          {
            "node": "542793bf-05c1-4c6c-b0e0-b087669f966c",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "c302f8e5-6bed-4b64-8d52-33eaa5fce86a": {
      "main": [
        [
          {
            "node": "ec9e9703-fb38-4ba5-97f6-a73bee7c47f7",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "ec9e9703-fb38-4ba5-97f6-a73bee7c47f7": {
      "main": [
        [
          {
            "node": "cb900de1-e4b1-488f-8a12-9946bb8466f6",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
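
The polling note inside the workflow describes re-checking the result until the status equals "complete" and routing anything else back to the Wait node. A minimal Python sketch of that loop follows. It assumes the check URL returns JSON with status and markdown fields, as the workflow's expressions suggest. The fetch_json callable, the function name, and the attempt limit are hypothetical and are injected so the loop can be exercised without network access.

```python
import time
from typing import Any, Callable, Dict


def poll_for_markdown(
    fetch_json: Callable[[str], Dict[str, Any]],  # hypothetical: GETs a URL, returns parsed JSON
    check_url: str,
    wait_seconds: float = 10.0,  # same pause as the workflow's Wait node
    max_attempts: int = 6,       # assumption: the workflow itself sets no limit
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Wait, check the result URL, and repeat until the status is "complete"."""
    for _ in range(max_attempts):
        sleep(wait_seconds)               # wait before each check, like the Wait node
        result = fetch_json(check_url)
        if result.get("status") == "complete":
            return result.get("markdown", "")
        # Any other status takes the "failed" route: loop back and wait again.
    raise TimeoutError(f"conversion not complete after {max_attempts} checks")
```

For larger files, raising wait_seconds corresponds to the note's advice to increase the wait time.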
Frequently asked questions

How do I use this workflow?

Copy the JSON configuration above, create a new workflow in your n8n instance, select "Import from JSON", paste the configuration, and adjust the authentication settings as needed.
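
The notes inside the workflow suggest increasing the wait time for larger files and tweaking the body payload. One way to apply such changes before importing is to edit the JSON programmatically. The Python sketch below is an illustration only: the function name and the choice of settings to change are assumptions, while the node types and parameter names are the ones that appear in the exported JSON.

```python
import copy
from typing import Any, Dict


def tune_workflow(workflow: Dict[str, Any], wait_seconds: int, max_pages: int) -> Dict[str, Any]:
    """Return a copy of the workflow with a new wait time and max_pages value."""
    tuned = copy.deepcopy(workflow)  # leave the caller's dict untouched
    for node in tuned.get("nodes", []):
        params = node.get("parameters")
        if params is None:
            continue
        if node.get("type") == "n8n-nodes-base.wait":
            params["amount"] = wait_seconds
        elif node.get("type") == "n8n-nodes-base.httpRequest":
            # Only the upload node carries body parameters; others are skipped.
            for field in params.get("bodyParameters", {}).get("parameters", []):
                if field.get("name") == "max_pages":
                    field["value"] = str(max_pages)  # the export stores it as a string
    return tuned
```

Loading the exported text with json.loads, passing it through this function, and serializing it again with json.dumps yields a configuration that can be pasted into n8n in the same way.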

Which scenarios is this workflow suited for?

Intermediate - Document Extraction, Multimodal AI

Is it paid?

This workflow is completely free and can be used directly. Note that third-party services used in the workflow (such as the Datalab.to API) may require payment on your part.

Workflow information
Difficulty level
Intermediate
Number of nodes: 11
Categories: 2
Node types: 6
Difficulty description

Suitable for experienced users; medium-complexity workflows containing 6-15 nodes

Author
Joseph

@mjomba

Automation expert specializing in building smart, scalable workflows using tools like n8n, Make, and Airtable. I help businesses save time, reduce manual work, and grow faster with tailored automation solutions. Feel free to reach out at joseph@uppfy.com to discuss your project. I am also on x.com/juppfy
