Contextual Ingestion Pipeline

Name: Contextual Ingestion Pipeline
Rating: 4.5 (10 reviews)
Author: Daniel Rosehill

Advanced

This is a 15-node automation workflow in the Engineering and Multimodal AI categories. It mainly uses the Set, Webhook, ConvertToFile, Agent, and EmbeddingsOpenAi nodes, among others. It extracts context from voice notes for RAG systems using OpenRouter AI and Milvus.

Prerequisites

•HTTP webhook endpoint (generated automatically by n8n)
•OpenAI API key
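
The webhook endpoint receives a JSON POST body. Judging from the expressions in the exported workflow, it reads `data.title`, `data.transcript`, and `timestamp` from that body. The following is a minimal Python sketch of a matching request, not a definitive client. The URL is the placeholder test URL from the workflow's sample data, and `build_payload` and `build_request` are hypothetical helper names introduced only for illustration.

```python
import json
import urllib.request

# Placeholder test URL taken from the workflow's sample data; replace it with
# the webhook URL that your own n8n instance displays.
WEBHOOK_URL = "https://your-n8n-instance.com/webhook-test/webhook-uuid-placeholder"


def build_payload(title: str, transcript: str, timestamp: str) -> dict:
    """Return a body with the fields the workflow reads (hypothetical helper)."""
    return {
        "data": {"title": title, "transcript": transcript},
        "timestamp": timestamp,
    }


def build_request(url: str, payload: dict) -> urllib.request.Request:
    """Prepare, but do not send, a JSON POST request (hypothetical helper)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(
    WEBHOOK_URL,
    build_payload(
        "Sample Voice Note Title",
        "This is a sample transcript from a voice note.",
        "2025-08-15T11:28:21+00:00",
    ),
)
# To actually send it, call urllib.request.urlopen(request).
```

The request is only constructed here, so the sketch can be run without a live n8n instance.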

Nodes used (15)

OutputParserStructured

DocumentDefaultDataLoader

Category

Engineering

Multimodal AI

Workflow preview

Webhook

Edit Fields

AI Agent

Structured Output Parser

OpenRouter Chat Model

Edit Fields1

Convert to File

Milvus Vector Store

Default Data Loader

Embeddings OpenAI

Export workflow

Copy the following JSON configuration and import it into n8n.

{
  "id": "workflow-id-placeholder",
  "meta": {
    "instanceId": "instance-id-placeholder",
    "templateCredsSetupCompleted": true
  },
  "name": "Context Ingestion Pipeline",
  "tags": [],
  "nodes": [
    {
      "id": "cee1c3f4-a0d3-4e4c-8563-70814b37d99d",
      "name": "Webhook",
      "type": "n8n-nodes-base.webhook",
      "position": [
        -384,
        -80
      ],
      "webhookId": "webhook-uuid-placeholder",
      "parameters": {
        "path": "webhook-uuid-placeholder",
        "options": {},
        "httpMethod": "POST"
      },
      "typeVersion": 2
    },
    {
      "id": "4cf76388-5dbf-46a3-8750-1bbda180949d",
      "name": "Edit Fields",
      "type": "n8n-nodes-base.set",
      "position": [
        -176,
        -80
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "d1c59fe6-0834-45bd-8cc2-1c399773d7ee",
              "name": "title",
              "type": "string",
              "value": "={{ $json.body.data.title }}"
            },
            {
              "id": "bde4d7fb-c21b-4a5e-bfbf-96aaf0ad7b6b",
              "name": "transcript",
              "type": "string",
              "value": "={{ $json.body.data.transcript }}"
            },
            {
              "id": "a79b01b6-e602-43b4-a3c2-7efca1cedf3a",
              "name": "timestamp",
              "type": "string",
              "value": "={{ $json.body.timestamp }}"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "5ffd63b4-fd8a-4be3-ae07-6fa1861579b8",
      "name": "AI Agent",
      "type": "@n8n/n8n-nodes-langchain.agent",
      "position": [
        -16,
        -112
      ],
      "parameters": {
        "text": "={{ $json.transcript }}",
        "options": {
          "systemMessage": "=You are a **Context Extraction Agent**.\nYour role is to ingest text from the user, which will have been captured using **speech-to-text** and may therefore contain transcription errors, missing words, or imprecise phrasing.\n\n**Your tasks are:**\n\n1. **Infer intended meaning:**\n\n   * If any words appear to be obvious mistranscriptions, you may replace them with the most likely intended words based on the context.\n\n2. **Reformulate into third person:**\n\n   * Change all first-person references (\"I\", \"me\", \"my\") into \"User\" or \"their\" where appropriate.\n   * Example: `\"I really enjoy spicy food\"` → `\"User enjoys spicy food\"`.\n\n3. **Extract context data only:**\n\n   * Identify and isolate **significant, specific facts** about the user that could be useful for grounding AI inference in a Retrieval-Augmented Generation (RAG) pipeline.\n   * Omit casual musings, filler thoughts, and irrelevant narrative.\n\n4. **Format the output in plain text:**\n\n   * Keep each fact as a separate line.\n   * Optionally group facts under **all-caps headers** with one blank line before and after.\n   * Avoid any other formatting, markup, or commentary.\n\n5. **Output rules:**\n\n   * No introductory or concluding remarks.\n   * The result is a single continuous plain text document containing only the extracted facts.\n   * Keep the facts **short, precise, and formulaic**.\n\n---\n\n**Example Input:**\n\n```\nI just moved to a new city last month, and I'm still figuring out the best pizza places.  \nI think my favorite so far is Margarita pizza, though I really miss the one I used to get back home.  \nOh, and my new apartment has a great view of the downtown area.  \n```\n\n**Example Output:**\n\n```\nLOCATION  \nUser moved to a new city recently.  \n\nFOOD PREFERENCES  \nUser likes pizza.  \nUser's favorite type of pizza is Margarita.  \n\nOTHER  \nUser's apartment has a view of the downtown area.  \n"
        },
        "promptType": "define",
        "hasOutputParser": true
      },
      "typeVersion": 2.1
    },
    {
      "id": "8b98eb0a-258c-4e43-bc74-8a007ae95668",
      "name": "Structured Output Parser",
      "type": "@n8n/n8n-nodes-langchain.outputParserStructured",
      "position": [
        208,
        128
      ],
      "parameters": {
        "jsonSchemaExample": " {\n  \"output\": \"User moved to a new city recently.\\nUser likes pizza.\\nUser's favorite type of pizza is Margarita.\\nUser's apartment has a view of the downtown area.\"\n}"
      },
      "typeVersion": 1.3
    },
    {
      "id": "b2eb0913-9dd9-4533-82a0-e09d61724b64",
      "name": "OpenRouter Chat Model",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenRouter",
      "position": [
        -160,
        96
      ],
      "parameters": {
        "options": {}
      },
      "credentials": {
        "openRouterApi": {
          "id": "credential-id-placeholder",
          "name": "OpenRouter account"
        }
      },
      "typeVersion": 1
    },
    {
      "id": "298d1326-fa5d-4bd6-9a5a-4d0a07f078b8",
      "name": "Edit Fields1",
      "type": "n8n-nodes-base.set",
      "position": [
        336,
        -112
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "5676fee9-3080-4b08-be04-b6203d2b132b",
              "name": "title",
              "type": "string",
              "value": "={{ $('Edit Fields').item.json.title }}"
            },
            {
              "id": "a46332e5-ba8c-4094-87ed-e04ab8462367",
              "name": "output",
              "type": "string",
              "value": "=Context data created: {{ $('Webhook').item.json.body.timestamp }}\n\nCONTEXT:\n\n{{ $json.output }}"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "f5875c16-9c32-468f-89fa-cec55a21c236",
      "name": "Convert to File",
      "type": "n8n-nodes-base.convertToFile",
      "position": [
        544,
        -144
      ],
      "parameters": {
        "options": {
          "fileName": "={{ $json.title }}"
        },
        "operation": "toText",
        "sourceProperty": "output"
      },
      "typeVersion": 1.1
    },
    {
      "id": "874e9798-782a-4ce5-bbab-3203576b53d6",
      "name": "Milvus Vector Store",
      "type": "@n8n/n8n-nodes-langchain.vectorStoreMilvus",
      "position": [
        752,
        -128
      ],
      "parameters": {
        "mode": "insert",
        "options": {
          "clearCollection": false
        },
        "milvusCollection": {
          "__rl": true,
          "mode": "list",
          "value": "user-context-collection",
          "cachedResultName": "user-context-collection"
        }
      },
      "credentials": {
        "milvusApi": {
          "id": "credential-id-placeholder",
          "name": "Milvus account"
        }
      },
      "typeVersion": 1.3
    },
    {
      "id": "7d82b497-4349-4039-9fcd-62776317a14a",
      "name": "Default Data Loader",
      "type": "@n8n/n8n-nodes-langchain.documentDefaultDataLoader",
      "position": [
        896,
        96
      ],
      "parameters": {
        "options": {},
        "dataType": "binary"
      },
      "typeVersion": 1.1
    },
    {
      "id": "44abc538-09d8-4359-9911-f66016b5aa28",
      "name": "Embeddings OpenAI",
      "type": "@n8n/n8n-nodes-langchain.embeddingsOpenAi",
      "position": [
        624,
        96
      ],
      "parameters": {
        "options": {}
      },
      "credentials": {
        "openAiApi": {
          "id": "credential-id-placeholder",
          "name": "OpenAI API"
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "752244c4-196f-44a0-99cf-eb1fde3b0407",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -496,
        -288
      ],
      "parameters": {
        "width": 208,
        "height": 144,
        "content": "## Context data\n\nVoicenotes.com tag trigger for 'context data'"
      },
      "typeVersion": 1
    },
    {
      "id": "fa03fa89-8cd7-4774-8c52-7a2bee25c02d",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -224,
        -272
      ],
      "parameters": {
        "width": 160,
        "height": 80,
        "content": "## Narrow fields"
      },
      "typeVersion": 1
    },
    {
      "id": "5e9a54f2-dbaa-48f7-bd10-2e0095304aed",
      "name": "Sticky Note2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -16,
        -288
      ],
      "parameters": {
        "width": 272,
        "height": 144,
        "content": "## Context data extraction agent\n\nParses transcript and isolates context-rich text"
      },
      "typeVersion": 1
    },
    {
      "id": "bbdb9d7a-ff61-484b-be60-53a684c3dcb1",
      "name": "Sticky Note3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        368,
        -320
      ],
      "parameters": {
        "width": 352,
        "height": 144,
        "content": "## Context data prepared for embedding\n\nTimestamp injected into agent output"
      },
      "typeVersion": 1
    },
    {
      "id": "6071a74f-3bf3-4f7f-b574-0e279bddaecd",
      "name": "Sticky Note4",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        736,
        -304
      ],
      "parameters": {
        "width": 352,
        "height": 144,
        "content": "## Embedding\n\nContext data embedded into Milvus\nvector database (hosted)"
      },
      "typeVersion": 1
    }
  ],
  "active": true,
  "pinData": {
    "Webhook": [
      {
        "json": {
          "body": {
            "data": {
              "id": "sample-note-id",
              "title": "Sample Voice Note Title",
              "transcript": "This is a sample transcript from a voice note. The user discusses their preferences and provides context that will be extracted and stored in the vector database for future reference."
            },
            "event": "tag.attached.299437",
            "timestamp": "2025-08-15T11:28:21+00:00"
          },
          "query": {},
          "params": {},
          "headers": {
            "host": "your-n8n-instance.com",
            "cf-ray": "ray-id-placeholder",
            "cdn-loop": "cloudflare; loops=1",
            "cf-visitor": "{\"scheme\":\"https\"}",
            "connection": "keep-alive",
            "user-agent": "GuzzleHttp/7",
            "cf-ipcountry": "US",
            "content-type": "application/json",
            "authorization": "Bearer",
            "cf-warp-tag-id": "warp-tag-placeholder",
            "content-length": "1481",
            "accept-encoding": "gzip, br",
            "x-forwarded-for": "xxx.xxx.xxx.xxx",
            "cf-connecting-ip": "xxx.xxx.xxx.xxx",
            "x-forwarded-proto": "https"
          },
          "webhookUrl": "https://your-n8n-instance.com/webhook-test/webhook-uuid-placeholder",
          "executionMode": "test"
        }
      }
    ],
    "Edit Fields": [
      {
        "json": {
          "title": "Sample Voice Note Title",
          "timestamp": "2025-08-15T11:28:21+00:00",
          "transcript": "This is a sample transcript from a voice note. The user discusses their preferences and provides context that will be extracted and stored in the vector database for future reference."
        }
      }
    ]
  },
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "version-id-placeholder",
  "connections": {
    "Webhook": {
      "main": [
        [
          {
            "node": "Edit Fields",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "AI Agent": {
      "main": [
        [
          {
            "node": "Edit Fields1",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Edit Fields": {
      "main": [
        [
          {
            "node": "AI Agent",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Edit Fields1": {
      "main": [
        [
          {
            "node": "Convert to File",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Convert to File": {
      "main": [
        [
          {
            "node": "Milvus Vector Store",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Embeddings OpenAI": {
      "ai_embedding": [
        [
          {
            "node": "Milvus Vector Store",
            "type": "ai_embedding",
            "index": 0
          }
        ]
      ]
    },
    "Default Data Loader": {
      "ai_document": [
        [
          {
            "node": "Milvus Vector Store",
            "type": "ai_document",
            "index": 0
          }
        ]
      ]
    },
    "OpenRouter Chat Model": {
      "ai_languageModel": [
        [
          {
            "node": "AI Agent",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "Structured Output Parser": {
      "ai_outputParser": [
        [
          {
            "node": "AI Agent",
            "type": "ai_outputParser",
            "index": 0
          }
        ]
      ]
    }
  }
}
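
To make the data flow in the JSON above easier to follow, here is a rough Python sketch of what the two `n8n-nodes-base.set` nodes appear to do: the first keeps only the title, transcript, and timestamp from the webhook item, and the second prefixes the agent's output with the timestamp. The function names `narrow_fields` and `build_context_document` are hypothetical, and the agent output is a hard-coded stand-in rather than a real model response.

```python
def narrow_fields(item: dict) -> dict:
    """Mirror the first Set node: keep title, transcript, and timestamp."""
    body = item["body"]
    return {
        "title": body["data"]["title"],
        "transcript": body["data"]["transcript"],
        "timestamp": body["timestamp"],
    }


def build_context_document(timestamp: str, extracted: str) -> str:
    """Mirror the second Set node's "output" expression."""
    return f"Context data created: {timestamp}\n\nCONTEXT:\n\n{extracted}"


# Sample item shaped like the pinned webhook data in the workflow.
webhook_item = {
    "body": {
        "data": {
            "title": "Sample Voice Note Title",
            "transcript": "This is a sample transcript from a voice note.",
        },
        "timestamp": "2025-08-15T11:28:21+00:00",
    }
}

fields = narrow_fields(webhook_item)
# Stand-in for the agent's output; the real text comes from the language model.
document = build_context_document(fields["timestamp"], "User likes pizza.")
print(document)
```

The resulting text is what the workflow converts to a file and hands to the Milvus vector store for embedding.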

Frequently asked questions

How do I use this workflow?

Copy the JSON above, create a new workflow in your n8n instance, and choose "Import from JSON". Paste the configuration and adjust the credentials as needed.
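
Before importing, it can help to check that the copied JSON is internally consistent. The sketch below assumes that the `connections` object is keyed by node name and that each target's `node` field also holds a node name, which is how n8n exports are normally structured. `find_dangling_connections` is a hypothetical helper, and the tiny workflow is made up purely to demonstrate the check.

```python
import json


def find_dangling_connections(workflow: dict) -> list[str]:
    """Return connection endpoints that do not match any node name."""
    names = {node["name"] for node in workflow.get("nodes", [])}
    dangling = []
    for source, outputs in workflow.get("connections", {}).items():
        if source not in names:
            dangling.append(source)
        for branches in outputs.values():
            for branch in branches:
                # An unconnected output may be stored as null; treat it as empty.
                for target in branch or []:
                    if target["node"] not in names:
                        dangling.append(target["node"])
    return dangling


# Made-up two-node workflow: the second connection points at a missing node.
sample = json.loads("""
{
  "nodes": [{"name": "Webhook"}, {"name": "Edit Fields"}],
  "connections": {
    "Webhook": {"main": [[{"node": "Edit Fields", "type": "main", "index": 0}]]},
    "Edit Fields": {"main": [[{"node": "AI Agent", "type": "main", "index": 0}]]}
  }
}
""")

print(find_dangling_connections(sample))
```

An empty list suggests every connection refers to a node that exists; any names it returns are worth fixing before or right after the import.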

Which scenarios is this workflow suited for?

Advanced - Engineering, Multimodal AI

Does it cost anything?

This workflow is completely free. Note, however, that third-party services used in the workflow (such as the OpenAI API) may incur charges.