Build an Academic Knowledge Graph with PDF Vector, GPT-4, and Neo4j

Advanced

This is a 10-node automation workflow in the AI RAG and Multimodal AI categories. It mainly uses the Code, Neo4j, OpenAI, Postgres, and PDF Vector nodes. It builds an academic knowledge graph from research papers using PDF Vector, GPT-4, and Neo4j.

Prerequisites
  • OpenAI API key
  • PostgreSQL database credentials
Export workflow
Copy the following JSON configuration and import it into n8n:
{
  "meta": {
    "instanceId": "placeholder"
  },
  "nodes": [
    {
      "id": "kb-info",
      "name": "Knowledge Base Info",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        250,
        150
      ],
      "parameters": {
        "content": "## Knowledge Base Builder\n\nExtracts and connects:\n- Concepts & Keywords\n- Authors & Institutions\n- Methods & Datasets\n- Citations & References\n\nBuilds searchable knowledge graph"
      },
      "typeVersion": 1
    },
    {
      "id": "daily-update",
      "name": "Daily KB Update",
      "type": "n8n-nodes-base.scheduleTrigger",
      "position": [
        450,
        300
      ],
      "parameters": {
        "rule": {
          "interval": [
            {
              "field": "days",
              "daysInterval": 1
            }
          ]
        }
      },
      "typeVersion": 1
    },
    {
      "id": "fetch-papers",
      "name": "PDF Vector - Fetch Papers",
      "type": "n8n-nodes-pdfvector.pdfVector",
      "position": [
        650,
        300
      ],
      "parameters": {
        "limit": 20,
        "query": "={{ $json.domain || 'artificial intelligence' }}",
        "fields": [
          "title",
          "authors",
          "abstract",
          "year",
          "doi",
          "pdfUrl",
          "totalCitations"
        ],
        "resource": "academic",
        "yearFrom": "={{ new Date().getFullYear() }}",
        "operation": "search",
        "providers": [
          "semantic_scholar",
          "arxiv"
        ]
      },
      "typeVersion": 1
    },
    {
      "id": "parse-papers",
      "name": "PDF Vector - Parse Papers",
      "type": "n8n-nodes-pdfvector.pdfVector",
      "position": [
        850,
        300
      ],
      "parameters": {
        "useLlm": "always",
        "resource": "document",
        "operation": "parse",
        "documentUrl": "={{ $json.pdfUrl }}"
      },
      "typeVersion": 1
    },
    {
      "id": "extract-entities",
      "name": "Extract Entities",
      "type": "n8n-nodes-base.openAi",
      "position": [
        1050,
        300
      ],
      "parameters": {
        "model": "gpt-4",
        "options": {
          "responseFormat": {
            "type": "json_object"
          }
        },
        "messages": {
          "values": [
            {
              "content": "=Extract knowledge graph entities from this paper:\n\nTitle: {{ $json.title }}\nContent: {{ $json.content }}\n\nExtract:\n1. Key concepts (5-10 main ideas)\n2. Methods used\n3. Datasets mentioned\n4. Research questions\n5. Key findings\n6. Future directions\n\nAlso identify relationships between these entities.\n\nReturn structured JSON with top-level arrays named concepts and methods (plain strings), plus arrays for the other entity types and a relationships array."
            }
          ]
        }
      },
      "typeVersion": 1
    },
    {
      "id": "build-graph",
      "name": "Build Graph Structure",
      "type": "n8n-nodes-base.code",
      "position": [
        1250,
        300
      ],
      "parameters": {
        "functionCode": "const extraction = JSON.parse($json.content);\nconst paper = $node['PDF Vector - Fetch Papers'].json;\n\n// Stable paper id, shared by the Paper node and every relationship\nconst paperId = paper.doi || paper.title.replace(/[^a-zA-Z0-9]/g, '');\n\n// Create nodes for Neo4j (each needs an id because the Cypher query merges on it)\nconst nodes = [];\n\n// Paper node\nnodes.push({\n  label: 'Paper',\n  properties: {\n    id: paperId,\n    title: paper.title,\n    year: paper.year,\n    authors: paper.authors.join('; '),\n    citations: paper.totalCitations\n  }\n});\n\n// Author nodes\npaper.authors.forEach(author => {\n  nodes.push({\n    label: 'Author',\n    properties: {\n      id: author,\n      name: author\n    }\n  });\n});\n\n// Concept nodes\nextraction.concepts?.forEach(concept => {\n  nodes.push({\n    label: 'Concept',\n    properties: {\n      id: concept,\n      name: concept\n    }\n  });\n});\n\n// Method nodes\nextraction.methods?.forEach(method => {\n  nodes.push({\n    label: 'Method',\n    properties: {\n      id: method,\n      name: method\n    }\n  });\n});\n\n// Create relationships (from/to must match the node ids above)\nconst relationships = [];\n\n// Paper-Author relationships\npaper.authors.forEach(author => {\n  relationships.push({\n    from: paperId,\n    to: author,\n    type: 'AUTHORED_BY'\n  });\n});\n\n// Paper-Concept relationships\nextraction.concepts?.forEach(concept => {\n  relationships.push({\n    from: paperId,\n    to: concept,\n    type: 'DISCUSSES'\n  });\n});\n\n// Paper-Method relationships\nextraction.methods?.forEach(method => {\n  relationships.push({\n    from: paperId,\n    to: method,\n    type: 'USES'\n  });\n});\n\n// n8n expects an array of items, each wrapping its data in a json key\nreturn [{ json: { nodes, relationships } }];"
      },
      "typeVersion": 1
    },
    {
      "id": "create-nodes",
      "name": "Create Graph Nodes",
      "type": "n8n-nodes-base.neo4j",
      "position": [
        1450,
        250
      ],
      "parameters": {
        "query": "=UNWIND $nodes AS node\nMERGE (n:Node {id: node.properties.id})\nSET n += node.properties\nSET n:${node.label}",
        "operation": "create",
        "parameters": "={{ { nodes: $json.nodes } }}"
      },
      "typeVersion": 1
    },
    {
      "id": "create-relationships",
      "name": "Create Relationships",
      "type": "n8n-nodes-base.neo4j",
      "position": [
        1450,
        350
      ],
      "parameters": {
        "query": "=UNWIND $relationships AS rel\nMATCH (a {id: rel.from})\nMATCH (b {id: rel.to})\nMERGE (a)-[r:${rel.type}]->(b)",
        "operation": "create",
        "parameters": "={{ { relationships: $json.relationships } }}"
      },
      "typeVersion": 1
    },
    {
      "id": "kb-stats",
      "name": "KB Statistics",
      "type": "n8n-nodes-base.code",
      "position": [
        1650,
        300
      ],
      "parameters": {
        "functionCode": "// Generate knowledge base statistics; keys match the kb_updates columns\nconst stats = {\n  papers_processed: $items().length,\n  concepts: $json.nodes.filter(n => n.label === 'Concept').length,\n  authors: $json.nodes.filter(n => n.label === 'Author').length,\n  methods: $json.nodes.filter(n => n.label === 'Method').length,\n  updated_at: new Date().toISOString()\n};\n\n// n8n expects an array of items, each wrapping its data in a json key\nreturn [{ json: stats }];"
      },
      "typeVersion": 1
    },
    {
      "id": "log-update",
      "name": "Log KB Update",
      "type": "n8n-nodes-base.postgres",
      "position": [
        1850,
        300
      ],
      "parameters": {
        "table": "kb_updates",
        "columns": "papers_processed,concepts,authors,methods,updated_at",
        "operation": "insert"
      },
      "typeVersion": 1
    }
  ],
  "connections": {
    "kb-stats": {
      "main": [
        [
          {
            "node": "log-update",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "daily-update": {
      "main": [
        [
          {
            "node": "fetch-papers",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "extract-entities": {
      "main": [
        [
          {
            "node": "build-graph",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "create-nodes": {
      "main": [
        [
          {
            "node": "kb-stats",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "create-relationships": {
      "main": [
        [
          {
            "node": "kb-stats",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "build-graph": {
      "main": [
        [
          {
            "node": "create-nodes",
            "type": "main",
            "index": 0
          },
          {
            "node": "create-relationships",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "fetch-papers": {
      "main": [
        [
          {
            "node": "parse-papers",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "parse-papers": {
      "main": [
        [
          {
            "node": "extract-entities",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
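
The two Neo4j queries in the export write the label and the relationship type as `${node.label}` and `${rel.type}`. That is JavaScript template syntax, not Cypher, and classic Cypher cannot take a label or relationship type as a query parameter. One possible workaround, sketched here as an assumption rather than as the author's method, is to group the items by label or type in a Code node and emit one statement per group, with the label checked against an allowlist before it is written into the query text. The helper names (`groupBy`, `buildNodeStatements`, `buildRelationshipStatements`) and the allowlists are hypothetical; unlike the original query, this sketch merges on the specific label instead of a generic `Node` label.

```javascript
// Labels and relationship types produced by the graph-building step.
// Anything else is rejected so arbitrary text never reaches the query string.
const ALLOWED_LABELS = new Set(['Paper', 'Author', 'Concept', 'Method']);
const ALLOWED_TYPES = new Set(['AUTHORED_BY', 'DISCUSSES', 'USES']);

// Group items by a key function, keeping the first-seen order of the groups.
function groupBy(items, keyOf) {
  const groups = new Map();
  for (const item of items) {
    const key = keyOf(item);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(item);
  }
  return groups;
}

// One MERGE statement per node label. The label goes into the query text,
// the property maps travel as the $rows parameter.
function buildNodeStatements(nodes) {
  const statements = [];
  for (const [label, group] of groupBy(nodes, (n) => n.label)) {
    if (!ALLOWED_LABELS.has(label)) throw new Error(`Unexpected label: ${label}`);
    statements.push({
      query: `UNWIND $rows AS row MERGE (n:${label} {id: row.id}) SET n += row`,
      params: { rows: group.map((n) => n.properties) },
    });
  }
  return statements;
}

// One MERGE statement per relationship type, matching endpoints by id.
function buildRelationshipStatements(relationships) {
  const statements = [];
  for (const [type, group] of groupBy(relationships, (r) => r.type)) {
    if (!ALLOWED_TYPES.has(type)) throw new Error(`Unexpected relationship type: ${type}`);
    statements.push({
      query: `UNWIND $rows AS row MATCH (a {id: row.from}) MATCH (b {id: row.to}) MERGE (a)-[:${type}]->(b)`,
      params: { rows: group.map((r) => ({ from: r.from, to: r.to })) },
    });
  }
  return statements;
}
```

Each returned object pairs a query string with its parameters, so a downstream Neo4j step can run them one at a time. Every node needs an `id` property for the `MERGE` to work.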
Frequently asked questions

How do I use this workflow?

Copy the JSON above, create a new workflow in your n8n instance, and choose "Import from JSON". Paste the configuration and adjust the credentials as needed.
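
Before importing, it may be worth checking that every connection in the JSON points at a node that exists. The sketch below assumes that n8n resolves connections by node `name` rather than by `id`. If that holds for your n8n version, keys such as `fetch-papers` in the `connections` block above would show up as unresolved and would need to be replaced with the corresponding node names. `findUnresolvedConnections` is a hypothetical helper, not part of n8n.

```javascript
// Return every connection source or target that does not match a node name.
// Assumption: connections are keyed by node name, not by node id.
function findUnresolvedConnections(workflow) {
  const names = new Set(workflow.nodes.map((node) => node.name));
  const unresolved = [];
  for (const [source, outputs] of Object.entries(workflow.connections || {})) {
    if (!names.has(source)) unresolved.push({ role: 'source', ref: source });
    // outputs looks like { main: [ [ { node, type, index }, ... ], ... ] }
    for (const branches of Object.values(outputs)) {
      for (const branch of branches) {
        for (const target of branch || []) {
          if (!names.has(target.node)) {
            unresolved.push({ role: 'target', ref: target.node });
          }
        }
      }
    }
  }
  return unresolved;
}
```

With Node.js, the function can be applied to a saved copy of the export, for example `findUnresolvedConnections(JSON.parse(require('fs').readFileSync('workflow.json', 'utf8')))`, where `workflow.json` is a placeholder file name. An empty array means every reference matched a node name.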

Which scenarios is this workflow suited for?

Advanced users; categories: AI RAG, Multimodal AI

Does it cost anything?

The workflow itself is free. Note, however, that third-party services used in the workflow (such as the OpenAI API) may charge for usage.

Workflow information
Difficulty: Advanced
Number of nodes: 10
Categories: 2
Node types: 7
Difficulty description: For experienced users; medium-complexity workflows with 6-15 nodes

Author
PDF Vector

@pdfvector

Fully featured PDF APIs for developers - Parse any PDF or Word document, extract structured data, and access millions of academic papers - all through simple APIs.
