Benchmarking LLM Performance on Legal Documents with Google Sheets and OpenRouter
This is an automation workflow in the AI category containing 23 nodes. It mainly uses nodes such as If, Set, Limit, Merge, and Webhook, combined with AI technology for intelligent automation. It benchmarks LLM performance on legal documents using Google Sheets and OpenRouter.
- HTTP Webhook endpoint (generated automatically by n8n)
- Google Drive API credentials
- May require authentication credentials for the target API
- Google Sheets API credentials
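The Google Drive credential is used to download each source PDF. In the workflow JSON, the If node keeps only rows whose "Relevant Source Reference" contains ".pdf" and whose row_number is greater than 0, and the Google Drive node derives the file ID from the sheet's URL column with the pattern `[-\w]{25,}`. The following is a minimal Python sketch of those two steps, not the n8n implementation itself. The function names and the sample row are hypothetical, and unlike the n8n expression (which fails when nothing matches) this sketch returns None.

```python
import re
from typing import Optional

# Same pattern as the Google Drive node's expression; re.ASCII keeps \w
# limited to [A-Za-z0-9_], as in JavaScript.
DRIVE_ID_PATTERN = re.compile(r"[-\w]{25,}", re.ASCII)


def is_pdf_test_case(row: dict) -> bool:
    """Mirror the If node: source reference contains '.pdf' and row_number > 0."""
    source = str(row.get("Relevant Source Reference", ""))
    row_number = float(row.get("row_number", 0))  # loose typing: accept "2" or 2
    return ".pdf" in source and row_number > 0


def extract_drive_file_id(url: str) -> Optional[str]:
    """Return the first run of 25+ ID-like characters in the URL, if any."""
    match = DRIVE_ID_PATTERN.search(url)
    return match.group(0) if match else None


# Hypothetical row; the file ID below is made up for illustration.
sample_row = {
    "Relevant Source Reference": "master-services-agreement.pdf",
    "row_number": 2,
    "URL": "https://drive.google.com/file/d/1AbCdEfGhIjKlMnOpQrStUvWxYz012345/view",
}

if is_pdf_test_case(sample_row):
    print(extract_drive_file_id(sample_row["URL"]))
```

Because the pattern simply takes the first long run of word characters and hyphens, it assumes no other URL segment reaches 25 characters before the file ID.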
Nodes used (23)
{
"meta": {
"instanceId": "45e293393b5dd8437fb351e5b1ef5511ef67e6e0826a1c10b9b68be850b67593"
},
"nodes": [
{
"id": "17f30fc7-7b73-4588-8d10-27b1a98ea795",
"name": "When clicking 'Test workflow'",
"type": "n8n-nodes-base.manualTrigger",
"position": [
-180,
260
],
"parameters": {},
"typeVersion": 1
},
{
"id": "ca19d73c-7c73-4e3d-96cd-842fc0e4f014",
"name": "Webhook",
"type": "n8n-nodes-base.webhook",
"position": [
580,
500
],
"webhookId": "1cbce320-d28e-4e97-8663-bf2c6a36a358",
"parameters": {
"path": "1cbce320-d28e-4e97-8663-bf2c6a36a358",
"options": {},
"httpMethod": "POST",
"responseData": "allEntries",
"responseMode": "lastNode"
},
"typeVersion": 2
},
{
"id": "99b69f4c-e429-40f8-8a22-8ed5fc3c4daa",
"name": "Merge1",
"type": "n8n-nodes-base.merge",
"position": [
2180,
420
],
"parameters": {
"mode": "combine",
"options": {},
"combineBy": "combineByPosition"
},
"typeVersion": 3.1
},
{
"id": "29d94760-0dd8-4c28-b31e-a499962b14df",
"name": "Get Tests",
"type": "n8n-nodes-base.googleSheets",
"position": [
-20,
260
],
"parameters": {
"options": {},
"sheetName": {
"__rl": true,
"mode": "list",
"value": "gid=0",
"cachedResultUrl": "https://docs.google.com/spreadsheets/d/10l_gMtPsge00eTTltGrgvAo54qhh3_twEDsETrQLAGU/edit#gid=0",
"cachedResultName": "Tests"
},
"documentId": {
"__rl": true,
"mode": "list",
"value": "10l_gMtPsge00eTTltGrgvAo54qhh3_twEDsETrQLAGU",
"cachedResultUrl": "https://docs.google.com/spreadsheets/d/10l_gMtPsge00eTTltGrgvAo54qhh3_twEDsETrQLAGU/edit?usp=drivesdk",
"cachedResultName": "Info Extraction Tasks (LLM Judge)"
}
},
"credentials": {
"googleSheetsOAuth2Api": {
"id": "04iXS2lwUVyzn6F2",
"name": "Google Sheets account"
}
},
"typeVersion": 4.5
},
{
"id": "0853cbfc-f7fc-4097-88f1-1d08504d93d0",
"name": "Is PDF?",
"type": "n8n-nodes-base.if",
"position": [
140,
260
],
"parameters": {
"options": {},
"conditions": {
"options": {
"version": 2,
"leftValue": "",
"caseSensitive": true,
"typeValidation": "loose"
},
"combinator": "and",
"conditions": [
{
"id": "1609d1f6-2142-4965-8d6b-01cfa53251c4",
"operator": {
"type": "string",
"operation": "contains"
},
"leftValue": "={{ $json['Relevant Source Reference'] }}",
"rightValue": ".pdf"
},
{
"id": "6b767c0f-071c-4663-a0c5-b4278b413650",
"operator": {
"type": "number",
"operation": "gt"
},
"leftValue": "={{ $json.row_number }}",
"rightValue": 0
}
]
},
"looseTypeValidation": true
},
"typeVersion": 2.2
},
{
"id": "dd3eacbb-d93d-4e2d-83ce-7853e16ab00b",
"name": "Structured Output Parser2",
"type": "@n8n/n8n-nodes-langchain.outputParserStructured",
"position": [
1800,
520
],
"parameters": {
"jsonSchemaExample": "{\n \"reasoning\": \"The Assistant fabricated a $1 million figure and a 12-month provision that are not found in the source. This breaches factual correctness and completeness. The output would mislead business stakeholders if used without correction.\",\n \"decision\": \"Fail\"\n}"
},
"typeVersion": 1.2
},
{
"id": "ccaf5c92-d95a-4cff-aa19-1f7bd7f1aa0c",
"name": "Basic LLM Chain1",
"type": "@n8n/n8n-nodes-langchain.chainLlm",
"onError": "continueErrorOutput",
"position": [
1640,
340
],
"parameters": {
"text": "=INPUT:\n\n{\n \"task\": {{ $('Save Input/Output').item.json['Input '] }},\n \"source\": {{ $json.text }},\n \"output\": {{ $('Save Input/Output').item.json['Output '] }}\n}\n\nOUTPUT:",
"messages": {
"messageValues": [
{
"message": "=You are an evaluator of LLMs in the legal domain.\n\nYou will be given:\n\n- *Source Material*: the underlying document(s) that the AI Assistant was supposed to review.\n- *AI Assistant Output*: the answer generated by the AI Assistant in response to a legal task.\n\nYou must carefully review the source material to determine whether the AI Assistant’s output is accurate, relevant, and complete.\n\n---\n\n## Accuracy Standard\n\nThe AI Assistant’s response must satisfy *all three* of the following requirements:\n\n1. *Factual Correctness* \nThe response must accurately reflect the information in the source material. \nNo hallucinated, fabricated, or incorrect information is allowed. \nIf the answer is missing from the source material, the Assistant must acknowledge this. Inventing information = Fail.\n\n2. *Relevance to the Query* \nThe response must directly answer the specific question asked, without introducing unrelated or off-topic information.\n\n3. *Completeness* \nThe response must contain enough information to fully answer the question based on the source material. \nOmitting critical points that are needed to address the query = Fail.\n\n*Key Rule:* \n- If the output *materially fails any one* of the three requirements, the overall result must be marked *Fail*.\n- Minor phrasing or style issues that do not affect the meaning are acceptable.\n\n---\n\n## Common Failure Patterns to Watch For\n\nBe alert to the following known AI weaknesses:\n\n- Incomplete responses when the task is broad or vague.\n- Fabricated answers when the information is missing from the source.\n- Failure to correctly process or cross-reference multiple documents.\n- Reinforcing incorrect assumptions made in the question without verifying them.\n- Technical failures (e.g., missing pages, unreadable scans) that affect the output.\n- Ignoring contradictory information in the source.\n\nIf any of these issues occur and affect the substantive quality of the response, the correct result is *Fail*.\n\n---\n\n## How to Structure Your Evaluation\n\n*Reasoning:* \nBriefly explain why you reached this decision. \nReference the source material where necessary. \nState clearly whether the AI Assistant’s output was factually correct, relevant, and complete.\n\n*Final Decision:* Pass or Fail\n\nThe output that you give should be given in a JSON format, with keys of \"reasoning\" and \"decision\" as shown in the examples below. Return your answer as a raw JSON object under a top-level key called \"output\" — no markdown, no extra text.\n\n---\n\n## Example 1\n\nINPUT:\n\n{\n \"task\": \"Extract the liability cap and time-based provisions from a limitation of liability clause.\",\n \"source\": \"- The liability cap figure is redacted.\\n- There is no 12-month time limit mentioned.\",\n \"output\": \"The liability cap is $1 million with a 12-month limit.\"\n}\n\nOUTPUT:\n\n{\n \"output\": {\n \"reasoning\": \"The Assistant fabricated a $1 million figure and a 12-month provision that are not found in the source. This breaches factual correctness and completeness. The output would mislead business stakeholders if used without correction.\",\n \"decision\": \"Fail\"\n }\n}\n\n## Example 2\n\nINPUT:\n\n{\n \"task\": \"Identify LinkedIn’s indemnity obligations under a Master Services Agreement.\",\n \"source\": \"- LinkedIn has no indemnity obligations under the agreement.\\n- The indemnities are provided by the vendor only.\",\n \"output\": \"LinkedIn has no indemnity obligations under this MSA.\"\n}\n\nOUTPUT:\n\n{\n \"output\": {\n \"reasoning\": \"The Assistant correctly identified that LinkedIn has no indemnity obligations, fully answering the query. The response is factually correct, relevant, and complete based on the source material.\",\n \"decision\": \"Pass\"\n }\n}"
}
]
},
"promptType": "define",
"hasOutputParser": true
},
"typeVersion": 1.4
},
{
"id": "ae455c78-78aa-4857-bfea-5aed468ce224",
"name": "Google Drive",
"type": "n8n-nodes-base.googleDrive",
"onError": "continueErrorOutput",
"position": [
1220,
360
],
"parameters": {
"fileId": {
"__rl": true,
"mode": "id",
"value": "={{ $json[\"URL\"].match(/[-\\w]{25,}/)[0] }}"
},
"options": {},
"operation": "download"
},
"credentials": {
"googleDriveOAuth2Api": {
"id": "yej6mV2w6RslwOGo",
"name": "Google Drive account"
}
},
"typeVersion": 3
},
{
"id": "d33dc3e9-1f41-44e6-8a95-83c2ab735061",
"name": "Extract from File",
"type": "n8n-nodes-base.extractFromFile",
"position": [
1420,
340
],
"parameters": {
"options": {},
"operation": "pdf"
},
"typeVersion": 1
},
{
"id": "a1b8590f-6ec8-477d-ad8b-b4ca2ffbdd76",
"name": "Save Input/Output",
"type": "n8n-nodes-base.set",
"position": [
940,
480
],
"parameters": {
"mode": "raw",
"options": {},
"jsonOutput": "={{ $json.body }}"
},
"typeVersion": 3.4
},
{
"id": "76f8fed7-50d5-42d9-9dda-67a8d547da90",
"name": "Execute Subworkflow",
"type": "n8n-nodes-base.httpRequest",
"onError": "continueErrorOutput",
"maxTries": 2,
"position": [
580,
240
],
"parameters": {
"url": "https://webhook-processor-production-48f8.up.railway.app/webhook/1cbce320-d28e-4e97-8663-bf2c6a36a358",
"method": "POST",
"options": {
"batching": {
"batch": {
"batchSize": 1,
"batchInterval": 500
}
}
},
"jsonBody": "={{ $json }}",
"sendBody": true,
"specifyBody": "json"
},
"retryOnFail": false,
"typeVersion": 4.2
},
{
"id": "2644c10f-9b1f-4d91-b49e-f3114b3df205",
"name": "OpenRouter Chat Model",
"type": "@n8n/n8n-nodes-langchain.lmChatOpenRouter",
"position": [
1640,
520
],
"parameters": {
"model": "openai/gpt-4.1",
"options": {}
},
"credentials": {
"openRouterApi": {
"id": "ipzDVYsZqbum9bX4",
"name": "OpenRouter account 2"
}
},
"typeVersion": 1
},
{
"id": "2d326d74-85c9-4363-b4d6-ee0b2a8abeb3",
"name": "Sticky Note3",
"type": "n8n-nodes-base.stickyNote",
"position": [
520,
20
],
"parameters": {
"color": 4,
"height": 700,
"content": "## 2. Execute Subworkflow\nThis node runs immediately (batching requests), but waits for the result before moving to the next step."
},
"typeVersion": 1
},
{
"id": "57ee0b46-fb57-48f3-80fb-51d3e1102bdf",
"name": "Sticky Note6",
"type": "n8n-nodes-base.stickyNote",
"position": [
-120,
460
],
"parameters": {
"width": 460,
"height": 280,
"content": "## Data format\nOur Tests Sheet contains the following columns:\n- ID: A unique identifier for each row\n- Test No.: The test that the LLM was given\n- AI Platform: The LLM that was given the test.\n- Relevant Source: The file name of the source document that was given to the LLM.\n- URL: The Google Drive URL where the file is stored.\n- Input: The input prompt that the LLM was given.\n- Output: The response that the LLM gave."
},
"typeVersion": 1
},
{
"id": "ddfb78af-ff3b-4c83-ac30-b2c36b217eb3",
"name": "Sticky Note7",
"type": "n8n-nodes-base.stickyNote",
"position": [
-40,
20
],
"parameters": {
"color": 6,
"width": 340,
"height": 420,
"content": "## 1. Fetch test cases\nWe start by grabbing our list of test cases stored in a Google Sheet [here](https://docs.google.com/spreadsheets/d/10l_gMtPsge00eTTltGrgvAo54qhh3_twEDsETrQLAGU/edit?usp=sharing).\n\nWe only want the rows that connect to a PDF document, as DOCX downloads will need to be handled separately."
},
"typeVersion": 1
},
{
"id": "9d6ee8f0-0d88-4100-bd2a-78bd4439b448",
"name": "Keep original data",
"type": "n8n-nodes-base.set",
"position": [
1440,
840
],
"parameters": {
"mode": "raw",
"options": {},
"jsonOutput": "={{ $json.body }}"
},
"typeVersion": 3.4
},
{
"id": "d5f9df9c-e94e-4880-89e4-2792a2757256",
"name": "Sticky Note11",
"type": "n8n-nodes-base.stickyNote",
"position": [
1180,
20
],
"parameters": {
"color": 4,
"width": 360,
"height": 540,
"content": "## 3. Grab the PDF as text\nWe download the PDF from the Google Drive link in the Google Sheet, extracting the file as text for the next step. We filter out any files that do not return data."
},
"typeVersion": 1
},
{
"id": "bab6b7ec-250a-407e-a1b8-c2d1a3e1afec",
"name": "Sticky Note12",
"type": "n8n-nodes-base.stickyNote",
"position": [
840,
20
],
"parameters": {
"color": 6,
"width": 260,
"height": 380,
"content": "## 6. Update results\nWe create a new row in our output sheet, containing our original data together with the judge decision/reasoning."
},
"typeVersion": 1
},
{
"id": "7c0216ab-f5e7-4eb1-a3bd-3c71c72f1c39",
"name": "Limit (for testing)",
"type": "n8n-nodes-base.limit",
"disabled": true,
"position": [
360,
240
],
"parameters": {
"maxItems": 3
},
"typeVersion": 1
},
{
"id": "7320f4c7-cccb-465d-a103-eb76cc51feb0",
"name": "Sticky Note13",
"type": "n8n-nodes-base.stickyNote",
"position": [
1600,
20
],
"parameters": {
"color": 4,
"width": 360,
"height": 660,
"content": "## 4. Judge LLM outputs\nOur prompt judges the LLM input/output and decides if the LLM passed the test. We also ask for a reason why the judge made its decision, which we can use to refine our eval later.\n\nWe're using OpenRouter here, which lets us easily tweak which LLM we want to use.\n\nThe output parser makes sure that the output is in JSON format, making the data easy to parse in the next step."
},
"typeVersion": 1
},
{
"id": "b7d997a9-acf6-415d-a4d3-30a4985f53f7",
"name": "Sticky Note",
"type": "n8n-nodes-base.stickyNote",
"position": [
-440,
240
],
"parameters": {
"width": 180,
"height": 200,
"content": "## Start Here\nMake sure to click \"Execute Workflow\" here, rather than underneath, as that will set the webhook in listening mode."
},
"typeVersion": 1
},
{
"id": "de440bcf-2ff2-46fb-958d-158fecc1c451",
"name": "Sticky Note14",
"type": "n8n-nodes-base.stickyNote",
"position": [
2120,
20
],
"parameters": {
"color": 4,
"width": 220,
"height": 600,
"content": "## 5. Combine data and return\nReturn the result of the subworkflow back to our HTTP request.\n\nWe are merging our pass/fail + reason together with the original data that was passed in the body of our HTTP request, so we still have access to the other data here."
},
"typeVersion": 1
},
{
"id": "73f53c55-3374-4759-be24-28f54b795886",
"name": "Update Results",
"type": "n8n-nodes-base.googleSheets",
"position": [
920,
220
],
"parameters": {
"columns": {
"value": {
"ID": "={{ $json['ID'] }}",
"URL": "={{ $json['URL'] }}",
"Input": "={{ $json['Input'] }}",
"Output": "={{ $json['Output'] }}",
"Decision": "={{ $json.output.decision }}",
"Test No.": "={{ $json['Test No'][\"\"] }}",
"Reasoning": "={{ $json.output.reasoning }}",
"AI Platform": "={{ $json['AI Platform'] }}",
"Relevant Source Reference": "={{ $json['Relevant Source Reference'] }}"
},
"schema": [
{
"id": "ID",
"type": "string",
"display": true,
"removed": false,
"required": false,
"displayName": "ID",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "Test No.",
"type": "string",
"display": true,
"removed": false,
"required": false,
"displayName": "Test No.",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "AI Platform",
"type": "string",
"display": true,
"required": false,
"displayName": "AI Platform",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "Relevant Source Reference",
"type": "string",
"display": true,
"removed": false,
"required": false,
"displayName": "Relevant Source Reference",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "URL",
"type": "string",
"display": true,
"removed": false,
"required": false,
"displayName": "URL",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "Input",
"type": "string",
"display": true,
"required": false,
"displayName": "Input",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "Output",
"type": "string",
"display": true,
"required": false,
"displayName": "Output",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "Decision",
"type": "string",
"display": true,
"removed": false,
"required": false,
"displayName": "Decision",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "Reasoning",
"type": "string",
"display": true,
"required": false,
"displayName": "Reasoning",
"defaultMatch": false,
"canBeUsedToMatch": true
}
],
"mappingMode": "defineBelow",
"matchingColumns": [
"ID"
],
"attemptToConvertTypes": false,
"convertFieldsToString": false
},
"options": {},
"operation": "appendOrUpdate",
"sheetName": {
"__rl": true,
"mode": "list",
"value": 537199982,
"cachedResultUrl": "https://docs.google.com/spreadsheets/d/10l_gMtPsge00eTTltGrgvAo54qhh3_twEDsETrQLAGU/edit#gid=537199982",
"cachedResultName": "Results"
},
"documentId": {
"__rl": true,
"mode": "list",
"value": "10l_gMtPsge00eTTltGrgvAo54qhh3_twEDsETrQLAGU",
"cachedResultUrl": "https://docs.google.com/spreadsheets/d/10l_gMtPsge00eTTltGrgvAo54qhh3_twEDsETrQLAGU/edit?usp=drivesdk",
"cachedResultName": "Info Extraction Tasks (LLM Judge)"
}
},
"credentials": {
"googleSheetsOAuth2Api": {
"id": "04iXS2lwUVyzn6F2",
"name": "Google Sheets account"
}
},
"typeVersion": 4.5
}
],
"pinData": {},
"connections": {
"0853cbfc-f7fc-4097-88f1-1d08504d93d0": {
"main": [
[
{
"node": "7c0216ab-f5e7-4eb1-a3bd-3c71c72f1c39",
"type": "main",
"index": 0
}
]
]
},
"ca19d73c-7c73-4e3d-96cd-842fc0e4f014": {
"main": [
[
{
"node": "9d6ee8f0-0d88-4100-bd2a-78bd4439b448",
"type": "main",
"index": 0
},
{
"node": "a1b8590f-6ec8-477d-ad8b-b4ca2ffbdd76",
"type": "main",
"index": 0
}
]
]
},
"29d94760-0dd8-4c28-b31e-a499962b14df": {
"main": [
[
{
"node": "0853cbfc-f7fc-4097-88f1-1d08504d93d0",
"type": "main",
"index": 0
}
]
]
},
"ae455c78-78aa-4857-bfea-5aed468ce224": {
"main": [
[
{
"node": "d33dc3e9-1f41-44e6-8a95-83c2ab735061",
"type": "main",
"index": 0
}
]
]
},
"ccaf5c92-d95a-4cff-aa19-1f7bd7f1aa0c": {
"main": [
[
{
"node": "99b69f4c-e429-40f8-8a22-8ed5fc3c4daa",
"type": "main",
"index": 0
}
]
]
},
"d33dc3e9-1f41-44e6-8a95-83c2ab735061": {
"main": [
[
{
"node": "ccaf5c92-d95a-4cff-aa19-1f7bd7f1aa0c",
"type": "main",
"index": 0
}
]
]
},
"a1b8590f-6ec8-477d-ad8b-b4ca2ffbdd76": {
"main": [
[
{
"node": "ae455c78-78aa-4857-bfea-5aed468ce224",
"type": "main",
"index": 0
}
]
]
},
"9d6ee8f0-0d88-4100-bd2a-78bd4439b448": {
"main": [
[
{
"node": "99b69f4c-e429-40f8-8a22-8ed5fc3c4daa",
"type": "main",
"index": 1
}
]
]
},
"76f8fed7-50d5-42d9-9dda-67a8d547da90": {
"main": [
[
{
"node": "73f53c55-3374-4759-be24-28f54b795886",
"type": "main",
"index": 0
}
]
]
},
"7c0216ab-f5e7-4eb1-a3bd-3c71c72f1c39": {
"main": [
[
{
"node": "76f8fed7-50d5-42d9-9dda-67a8d547da90",
"type": "main",
"index": 0
}
]
]
},
"2644c10f-9b1f-4d91-b49e-f3114b3df205": {
"ai_languageModel": [
[
{
"node": "ccaf5c92-d95a-4cff-aa19-1f7bd7f1aa0c",
"type": "ai_languageModel",
"index": 0
}
]
]
},
"dd3eacbb-d93d-4e2d-83ce-7853e16ab00b": {
"ai_outputParser": [
[
{
"node": "ccaf5c92-d95a-4cff-aa19-1f7bd7f1aa0c",
"type": "ai_outputParser",
"index": 0
}
]
]
},
"17f30fc7-7b73-4588-8d10-27b1a98ea795": {
"main": [
[
{
"node": "29d94760-0dd8-4c28-b31e-a499962b14df",
"type": "main",
"index": 0
}
]
]
}
}
}
How do I use this workflow?
Copy the JSON configuration above, create a new workflow in your n8n instance, select "Import from JSON", paste the configuration, and adjust the authentication settings as needed.
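Before importing, it can help to see which nodes carry credentials that will need to be re-linked in your own instance. The sketch below is one possible way to do that in Python, assuming the export has been loaded into a dict. The helper names are hypothetical, and the embedded sample is a trimmed stand-in with placeholder values rather than the real export. It also lists the URL of any HTTP Request node, since in the export above that URL points at the original author's host and presumably needs to be replaced with your own webhook address.

```python
import json
from typing import Dict, List


def credentialed_nodes(workflow: dict) -> Dict[str, List[str]]:
    """Map each node name to the credential types it declares."""
    found: Dict[str, List[str]] = {}
    for node in workflow.get("nodes", []):
        credentials = node.get("credentials") or {}
        if credentials:
            found[node["name"]] = sorted(credentials)
    return found


def http_request_urls(workflow: dict) -> Dict[str, str]:
    """Map each HTTP Request node name to the URL it calls."""
    return {
        node["name"]: node.get("parameters", {}).get("url", "")
        for node in workflow.get("nodes", [])
        if node.get("type") == "n8n-nodes-base.httpRequest"
    }


# Trimmed, hypothetical stand-in for the exported workflow: only the fields
# inspected above are present, and IDs/URLs are placeholders.
SAMPLE = json.loads("""
{
  "nodes": [
    {"name": "Webhook", "type": "n8n-nodes-base.webhook", "parameters": {}},
    {"name": "Google Drive", "type": "n8n-nodes-base.googleDrive",
     "credentials": {"googleDriveOAuth2Api": {"id": "placeholder",
                                              "name": "Google Drive account"}}},
    {"name": "Execute Subworkflow", "type": "n8n-nodes-base.httpRequest",
     "parameters": {"url": "https://example.invalid/webhook/placeholder"}}
  ]
}
""")

print(credentialed_nodes(SAMPLE))
print(http_request_urls(SAMPLE))
```

To run it against the real export, replace SAMPLE with the result of `json.load` on a file where you saved the JSON above.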
Which scenarios is this workflow suited to?
Advanced - Artificial Intelligence
Is it paid?
This workflow is completely free and can be used directly. Note that third-party services used in the workflow (such as the OpenRouter API) may require payment on your part.
Adam Janes
@adamjanes
I am a product-minded technologist with hacker DNA building things in AI automation. I have a broad and varied background - having worked in Product, Design, and Sales - combined with deep technical experience as a Senior Developer and Fractional CTO. I am also a best-selling Udemy instructor (with 25K+ students), and founder of WOOFCODE - a free coding camp for fullstack developers. I practice non-violent communication, motivational interviewing, and Tibetan Buddhist meditation.