Context ingestion pipeline

Name: Context ingestion pipeline
Rating: 4.5 (10 reviews)
Author: Daniel Rosehill

Intermediate

This is an automation workflow in the Engineering and Multimodal AI categories, containing 15 nodes. It mainly uses nodes such as Set, Webhook, ConvertToFile, Agent, and EmbeddingsOpenAi. It extracts context from voice notes for a RAG system using OpenRouter AI and Milvus.

Prerequisites

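• HTTP webhook endpoint (generated automatically by n8n)
• OpenAI API key (used by the embeddings node)
• OpenRouter API credentials (used by the chat model node)
• Milvus credentials (used by the vector store node)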

Nodes used (15)

OutputParserStructured

DocumentDefaultDataLoader

Category

Engineering

Multimodal AI

Workflow overview

Webhook

Edit Fields

AI Agent

Structured Output Parser

OpenRouter Chat Model

Edit Fields1

Convert to File

Milvus Vector Store

Default Data Loader

Embeddings OpenAI
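
The node list above loses the wiring that the canvas shows. As a rough sketch, the Python below rebuilds the main execution path from a hand-written map. The links are transcribed from the `connections` section of the export further down, written with n8n's default English node names (an assumption about the original names); the `main_path` helper is hypothetical and not part of n8n.

```python
# Main-flow links transcribed by hand from the workflow export.
# Assumption: default English node names are used.
MAIN_LINKS = {
    "Webhook": "Edit Fields",
    "Edit Fields": "AI Agent",
    "AI Agent": "Edit Fields1",
    "Edit Fields1": "Convert to File",
    "Convert to File": "Milvus Vector Store",
}

# Sub-nodes attach to a parent node instead of sitting on the main path.
SUB_NODES = {
    "AI Agent": ["OpenRouter Chat Model", "Structured Output Parser"],
    "Milvus Vector Store": ["Embeddings OpenAI", "Default Data Loader"],
}


def main_path(start: str) -> list[str]:
    """Follow single-successor links from `start` until the chain ends."""
    path = [start]
    seen = {start}
    while path[-1] in MAIN_LINKS:
        nxt = MAIN_LINKS[path[-1]]
        if nxt in seen:  # guard against accidental cycles in the map
            break
        path.append(nxt)
        seen.add(nxt)
    return path


if __name__ == "__main__":
    for node in main_path("Webhook"):
        helpers = SUB_NODES.get(node, [])
        suffix = f"  (uses: {', '.join(helpers)})" if helpers else ""
        print(node + suffix)
```

Read this way, the workflow is a straight six-step chain, with the chat model and output parser feeding the agent, and the embeddings model and data loader feeding the vector store.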

Export the workflow

Copy the following JSON configuration into n8n to import and use this workflow.

{
  "id": "workflow-id-placeholder",
  "meta": {
    "instanceId": "instance-id-placeholder",
    "templateCredsSetupCompleted": true
  },
  "name": "Context Ingestion Pipeline",
  "tags": [],
  "nodes": [
    {
      "id": "cee1c3f4-a0d3-4e4c-8563-70814b37d99d",
      "name": "Webhook",
      "type": "n8n-nodes-base.webhook",
      "position": [
        -384,
        -80
      ],
      "webhookId": "webhook-uuid-placeholder",
      "parameters": {
        "path": "webhook-uuid-placeholder",
        "options": {},
        "httpMethod": "POST"
      },
      "typeVersion": 2
    },
    {
      "id": "4cf76388-5dbf-46a3-8750-1bbda180949d",
      "name": "Modifier les champs",
      "type": "n8n-nodes-base.set",
      "position": [
        -176,
        -80
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "d1c59fe6-0834-45bd-8cc2-1c399773d7ee",
              "name": "title",
              "type": "string",
              "value": "={{ $json.body.data.title }}"
            },
            {
              "id": "bde4d7fb-c21b-4a5e-bfbf-96aaf0ad7b6b",
              "name": "transcript",
              "type": "string",
              "value": "={{ $json.body.data.transcript }}"
            },
            {
              "id": "a79b01b6-e602-43b4-a3c2-7efca1cedf3a",
              "name": "timestamp",
              "type": "string",
              "value": "={{ $json.body.timestamp }}"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "5ffd63b4-fd8a-4be3-ae07-6fa1861579b8",
      "name": "Agent IA",
      "type": "@n8n/n8n-nodes-langchain.agent",
      "position": [
        -16,
        -112
      ],
      "parameters": {
        "text": "={{ $json.transcript }}",
        "options": {
          "systemMessage": "=You are a **Context Extraction Agent**.\nYour role is to ingest text from the user, which will have been captured using **speech-to-text** and may therefore contain transcription errors, missing words, or imprecise phrasing.\n\n**Your tasks are:**\n\n1. **Infer intended meaning:**\n\n   * If any words appear to be obvious mistranscriptions, you may replace them with the most likely intended words based on the context.\n\n2. **Reformulate into third person:**\n\n   * Change all first-person references (\"I\", \"me\", \"my\") into \"User\" or \"their\" where appropriate.\n   * Example: `\"I really enjoy spicy food\"` → `\"User enjoys spicy food\"`.\n\n3. **Extract context data only:**\n\n   * Identify and isolate **significant, specific facts** about the user that could be useful for grounding AI inference in a Retrieval-Augmented Generation (RAG) pipeline.\n   * Omit casual musings, filler thoughts, and irrelevant narrative.\n\n4. **Format the output in plain text:**\n\n   * Keep each fact as a separate line.\n   * Optionally group facts under **all-caps headers** with one blank line before and after.\n   * Avoid any other formatting, markup, or commentary.\n\n5. **Output rules:**\n\n   * No introductory or concluding remarks.\n   * The result is a single continuous plain text document containing only the extracted facts.\n   * Keep the facts **short, precise, and formulaic**.\n\n---\n\n**Example Input:**\n\n```\nI just moved to a new city last month, and I'm still figuring out the best pizza places.  \nI think my favorite so far is Margarita pizza, though I really miss the one I used to get back home.  \nOh, and my new apartment has a great view of the downtown area.  \n```\n\n**Example Output:**\n\n```\nLOCATION  \nUser moved to a new city recently.  \n\nFOOD PREFERENCES  \nUser likes pizza.  \nUser's favorite type of pizza is Margarita.  \n\nOTHER  \nUser's apartment has a view of the downtown area.  \n"
        },
        "promptType": "define",
        "hasOutputParser": true
      },
      "typeVersion": 2.1
    },
    {
      "id": "8b98eb0a-258c-4e43-bc74-8a007ae95668",
      "name": "Analyseur de sortie structurée",
      "type": "@n8n/n8n-nodes-langchain.outputParserStructured",
      "position": [
        208,
        128
      ],
      "parameters": {
        "jsonSchemaExample": " {\n  \"output\": \"User moved to a new city recently.\\nUser likes pizza.\\nUser's favorite type of pizza is Margarita.\\nUser's apartment has a view of the downtown area.\"\n}"
      },
      "typeVersion": 1.3
    },
    {
      "id": "b2eb0913-9dd9-4533-82a0-e09d61724b64",
      "name": "Modèle de chat OpenRouter",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenRouter",
      "position": [
        -160,
        96
      ],
      "parameters": {
        "options": {}
      },
      "credentials": {
        "openRouterApi": {
          "id": "credential-id-placeholder",
          "name": "OpenRouter account"
        }
      },
      "typeVersion": 1
    },
    {
      "id": "298d1326-fa5d-4bd6-9a5a-4d0a07f078b8",
      "name": "Modifier les champs1",
      "type": "n8n-nodes-base.set",
      "position": [
        336,
        -112
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "5676fee9-3080-4b08-be04-b6203d2b132b",
              "name": "tite.",
              "type": "string",
              "value": "={{ $('Edit Fields').item.json.title }}"
            },
            {
              "id": "a46332e5-ba8c-4094-87ed-e04ab8462367",
              "name": "output",
              "type": "string",
              "value": "=Context data created: {{ $('Webhook').item.json.body.timestamp }}\n\nCONTEXT:\n\n{{ $json.output }}"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "f5875c16-9c32-468f-89fa-cec55a21c236",
      "name": "Convertir en fichier",
      "type": "n8n-nodes-base.convertToFile",
      "position": [
        544,
        -144
      ],
      "parameters": {
        "options": {
          "fileName": "={{ $json.tite[\"\"] }}"
        },
        "operation": "toText",
        "sourceProperty": "output"
      },
      "typeVersion": 1.1
    },
    {
      "id": "874e9798-782a-4ce5-bbab-3203576b53d6",
      "name": "Magasin de vecteurs Milvus",
      "type": "@n8n/n8n-nodes-langchain.vectorStoreMilvus",
      "position": [
        752,
        -128
      ],
      "parameters": {
        "mode": "insert",
        "options": {
          "clearCollection": false
        },
        "milvusCollection": {
          "__rl": true,
          "mode": "list",
          "value": "user-context-collection",
          "cachedResultName": "user-context-collection"
        }
      },
      "credentials": {
        "milvusApi": {
          "id": "credential-id-placeholder",
          "name": "Milvus account"
        }
      },
      "typeVersion": 1.3
    },
    {
      "id": "7d82b497-4349-4039-9fcd-62776317a14a",
      "name": "Chargeur de données par défaut",
      "type": "@n8n/n8n-nodes-langchain.documentDefaultDataLoader",
      "position": [
        896,
        96
      ],
      "parameters": {
        "options": {},
        "dataType": "binary"
      },
      "typeVersion": 1.1
    },
    {
      "id": "44abc538-09d8-4359-9911-f66016b5aa28",
      "name": "Embeddings OpenAI",
      "type": "@n8n/n8n-nodes-langchain.embeddingsOpenAi",
      "position": [
        624,
        96
      ],
      "parameters": {
        "options": {}
      },
      "credentials": {
        "openAiApi": {
          "id": "credential-id-placeholder",
          "name": "OpenAI API"
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "752244c4-196f-44a0-99cf-eb1fde3b0407",
      "name": "Note adhésive",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -496,
        -288
      ],
      "parameters": {
        "width": 208,
        "height": 144,
        "content": "## Context data\n\nVoicenotes.com tag trigger for 'conext data'"
      },
      "typeVersion": 1
    },
    {
      "id": "fa03fa89-8cd7-4774-8c52-7a2bee25c02d",
      "name": "Note adhésive1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -224,
        -272
      ],
      "parameters": {
        "width": 160,
        "height": 80,
        "content": "## Narrow fields"
      },
      "typeVersion": 1
    },
    {
      "id": "5e9a54f2-dbaa-48f7-bd10-2e0095304aed",
      "name": "Note adhésive2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -16,
        -288
      ],
      "parameters": {
        "width": 272,
        "height": 144,
        "content": "## Context data extraction agent\n\nParses transcript and isolates context rich text"
      },
      "typeVersion": 1
    },
    {
      "id": "bbdb9d7a-ff61-484b-be60-53a684c3dcb1",
      "name": "Note adhésive3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        368,
        -320
      ],
      "parameters": {
        "width": 352,
        "height": 144,
        "content": "## Context data prepared for embedding\n\nTimestamp injected into agent output"
      },
      "typeVersion": 1
    },
    {
      "id": "6071a74f-3bf3-4f7f-b574-0e279bddaecd",
      "name": "Note adhésive4",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        736,
        -304
      ],
      "parameters": {
        "width": 352,
        "height": 144,
        "content": "## Embedding\n\nContext dasta embedded into Milvus\nvector database (hosted)"
      },
      "typeVersion": 1
    }
  ],
  "active": true,
  "pinData": {
    "Webhook": [
      {
        "json": {
          "body": {
            "data": {
              "id": "sample-note-id",
              "title": "Sample Voice Note Title",
              "transcript": "This is a sample transcript from a voice note. The user discusses their preferences and provides context that will be extracted and stored in the vector database for future reference."
            },
            "event": "tag.attached.299437",
            "timestamp": "2025-08-15T11:28:21+00:00"
          },
          "query": {},
          "params": {},
          "headers": {
            "host": "your-n8n-instance.com",
            "cf-ray": "ray-id-placeholder",
            "cdn-loop": "cloudflare; loops=1",
            "cf-visitor": "{\"scheme\":\"https\"}",
            "connection": "keep-alive",
            "user-agent": "GuzzleHttp/7",
            "cf-ipcountry": "US",
            "content-type": "application/json",
            "authorization": "Bearer",
            "cf-warp-tag-id": "warp-tag-placeholder",
            "content-length": "1481",
            "accept-encoding": "gzip, br",
            "x-forwarded-for": "xxx.xxx.xxx.xxx",
            "cf-connecting-ip": "xxx.xxx.xxx.xxx",
            "x-forwarded-proto": "https"
          },
          "webhookUrl": "https://your-n8n-instance.com/webhook-test/webhook-uuid-placeholder",
          "executionMode": "test"
        }
      }
    ],
    "Edit Fields": [
      {
        "json": {
          "title": "Sample Voice Note Title",
          "timestamp": "2025-08-15T11:28:21+00:00",
          "transcript": "This is a sample transcript from a voice note. The user discusses their preferences and provides context that will be extracted and stored in the vector database for future reference."
        }
      }
    ]
  },
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "version-id-placeholder",
  "connections": {
    "cee1c3f4-a0d3-4e4c-8563-70814b37d99d": {
      "main": [
        [
          {
            "node": "4cf76388-5dbf-46a3-8750-1bbda180949d",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "5ffd63b4-fd8a-4be3-ae07-6fa1861579b8": {
      "main": [
        [
          {
            "node": "298d1326-fa5d-4bd6-9a5a-4d0a07f078b8",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "4cf76388-5dbf-46a3-8750-1bbda180949d": {
      "main": [
        [
          {
            "node": "5ffd63b4-fd8a-4be3-ae07-6fa1861579b8",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "298d1326-fa5d-4bd6-9a5a-4d0a07f078b8": {
      "main": [
        [
          {
            "node": "f5875c16-9c32-468f-89fa-cec55a21c236",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "f5875c16-9c32-468f-89fa-cec55a21c236": {
      "main": [
        [
          {
            "node": "874e9798-782a-4ce5-bbab-3203576b53d6",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "44abc538-09d8-4359-9911-f66016b5aa28": {
      "ai_embedding": [
        [
          {
            "node": "874e9798-782a-4ce5-bbab-3203576b53d6",
            "type": "ai_embedding",
            "index": 0
          }
        ]
      ]
    },
    "7d82b497-4349-4039-9fcd-62776317a14a": {
      "ai_document": [
        [
          {
            "node": "874e9798-782a-4ce5-bbab-3203576b53d6",
            "type": "ai_document",
            "index": 0
          }
        ]
      ]
    },
    "b2eb0913-9dd9-4533-82a0-e09d61724b64": {
      "ai_languageModel": [
        [
          {
            "node": "5ffd63b4-fd8a-4be3-ae07-6fa1861579b8",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "8b98eb0a-258c-4e43-bc74-8a007ae95668": {
      "ai_outputParser": [
        [
          {
            "node": "5ffd63b4-fd8a-4be3-ae07-6fa1861579b8",
            "type": "ai_outputParser",
            "index": 0
          }
        ]
      ]
    }
  }
}
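
To make the two Set-node mappings in the export easier to follow, here is a minimal Python sketch of the same field shuffling outside n8n. `narrow_fields` mirrors the three expressions in the first Set node, and `build_document` mirrors the `output` template in the second. Both function names are hypothetical, and the agent step is replaced by a hard-coded string rather than a real model call.

```python
def narrow_fields(webhook_item: dict) -> dict:
    """Keep only title, transcript, and timestamp from the webhook item."""
    body = webhook_item["body"]
    return {
        "title": body["data"]["title"],
        "transcript": body["data"]["transcript"],
        "timestamp": body["timestamp"],
    }


def build_document(timestamp: str, agent_output: str) -> str:
    """Prefix the extracted facts with the creation timestamp."""
    return f"Context data created: {timestamp}\n\nCONTEXT:\n\n{agent_output}"


if __name__ == "__main__":
    # Shape copied from the pinned Webhook sample in the export.
    item = {
        "body": {
            "data": {
                "id": "sample-note-id",
                "title": "Sample Voice Note Title",
                "transcript": "This is a sample transcript from a voice note.",
            },
            "timestamp": "2025-08-15T11:28:21+00:00",
        }
    }
    fields = narrow_fields(item)
    # Stand-in for the agent's output (assumption, not real model output).
    facts = "User likes pizza."
    print(build_document(fields["timestamp"], facts))
```

The resulting string is what the workflow converts to a text file and hands to the data loader, so the timestamp becomes part of the embedded text.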

Frequently asked questions

How do I use this workflow?

Copy the JSON configuration above, create a new workflow in your n8n instance, select "Import from JSON", paste the configuration, and adjust the credential settings as needed.
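
Once the workflow is imported, one way to exercise it is to POST a body shaped like the pinned sample in the export. The sketch below uses only the Python standard library. The URL is the placeholder from the export, the `build_request` helper is hypothetical, and any authentication your instance requires is not shown. Nothing is sent unless you uncomment the `urlopen` call.

```python
import json
import urllib.request

# Placeholder URL copied from the export; replace it with your own test URL.
WEBHOOK_URL = "https://your-n8n-instance.com/webhook-test/webhook-uuid-placeholder"


def build_request(title: str, transcript: str, timestamp: str) -> urllib.request.Request:
    """Build (but do not send) a POST request matching the pinned sample shape."""
    payload = {
        "data": {"title": title, "transcript": transcript},
        "timestamp": timestamp,
    }
    return urllib.request.Request(
        WEBHOOK_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_request(
        "Sample Voice Note Title",
        "I just moved to a new city last month.",
        "2025-08-15T11:28:21+00:00",
    )
    print(req.get_method(), req.full_url)
    # To actually send it, uncomment the next line:
    # urllib.request.urlopen(req)
```

The three fields set here (`data.title`, `data.transcript`, `timestamp`) are the only ones the first Set node reads, so a minimal body like this should be enough for a test run.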

Which scenarios is this workflow suited to?

Intermediate - Engineering, Multimodal AI

Is it paid?

This workflow is completely free and can be used directly. Note that the third-party services it uses (such as the OpenAI API) may require payment on your part.