[Showcase] Industrial GraphRAG: A Visual P&ID Editor to Build Physics-Aware Knowledge Graphs (Open Source)

Hi Neo4j Community! :waving_hand:

I am a Chemical Industry Expert (originally with no coding background) who partnered with Google Gemini to build an open-source solution for Industrial AI.

I’m excited to share a "Dual-Project" ecosystem that tackles a major bottleneck in Engineering RAG: turning unstructured P&ID drawings into high-quality Neo4j graphs.


:stop_sign: The Problem: LLMs Don't Understand "Physics"

In the process industry, P&IDs (Piping and Instrumentation Diagrams) are the bible of engineering.
However, standard RAG (Vector Search) fails miserably here because:

  1. Topology Loss: Text embeddings don't capture connectivity, so similarity search can't answer questions like "If Valve A closes, does flow stop at Tank B?".
  2. Context Blindness: An LLM can't tell whether a pipe connects to the Shell-side or the Tube-side of a heat exchanger just by reading a PDF.

We needed a Knowledge Graph. But manually writing Cypher for thousands of devices is impractical.


:light_bulb: The Solution: A Visual "Data Producer" & "Consumer"

I built two open-source tools to bridge this gap:

1. The Data Producer: Chemical P&ID Graph Editor

  • Tech Stack: React + AntV X6 + Neo4j Driver.
  • What it does: A drag-and-drop web editor that looks like CAD/Visio but behaves like a Graph Builder.
  • Neo4j Integration:
    • It maps visual nodes to Neo4j Labels (:Equipment, :Instrument).
    • It draws pipes with Orthogonal Routing on the canvas but saves only the logical relationship (:PIPING_CONNECT) to the graph.
    • Physics-Aware: I embedded domain logic into the ports. For example, connecting to a specific port on a heater automatically tags the relationship with region: "ShellSide" and phase: "Steam".
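
To illustrate the port logic described above, here is a minimal Python sketch. It is not the project's code (the editor itself is React). The `PORT_RULES` table, the port IDs, the `tag` node property, the equipment tags, the "TubeSide"/"Liquid" values, and the helper name are all hypothetical; only the :Equipment label, the :PIPING_CONNECT relationship type, and the region/phase values for steam on the shell side come from the post.

```python
# Hypothetical port rules: (equipment type, port id) -> relationship properties.
# Only the ShellSide/Steam pair is taken from the post; the rest is assumed.
PORT_RULES = {
    ("Heater", "shell_in"): {"region": "ShellSide", "phase": "Steam"},
    ("Heater", "tube_in"): {"region": "TubeSide", "phase": "Liquid"},
}


def build_connect_statement(source_tag, target_tag, target_type, target_port):
    """Return (cypher, params) for one pipe drawn onto a target port."""
    # Ports without a rule simply add no extra properties.
    props = PORT_RULES.get((target_type, target_port), {})
    cypher = (
        "MATCH (a:Equipment {tag: $source}), (b:Equipment {tag: $target}) "
        "MERGE (a)-[r:PIPING_CONNECT]->(b) "
        "SET r += $props"
    )
    return cypher, {"source": source_tag, "target": target_tag, "props": props}


cypher, params = build_connect_statement("B-201", "E-101", "Heater", "shell_in")
print(cypher)
print(params)
```

With the Neo4j Python driver, a pair like this could be sent with `session.run(cypher, params)`. Keeping the rules in a lookup table means the domain expert edits data, not query text.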

2. The Data Consumer: Industrial GraphRAG Chatbot

  • Tech Stack: Python + Streamlit + LangChain + Neo4j.
  • How it uses the Graph:
    • Implements Hybrid Search (Cypher generation + Vector search).
    • Uses the graph topology to answer complex reasoning questions like "Trace the downstream path of Reactor R-101" or "Check for dry-run risks in the heat exchanger."
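
To make the "trace the downstream path" question concrete, here is a minimal sketch, assuming :PIPING_CONNECT relationships are directed along the flow. The Cypher string is one way such a query could look, not the chatbot's actual generated query. The `tag` property, every tag except R-101, and the sample edges are invented for illustration. The pure-Python traversal mirrors the query on an in-memory edge list so it runs without a database.

```python
from collections import deque

# One possible Cypher form of the question (assumed schema).
DOWNSTREAM_QUERY = (
    "MATCH (start:Equipment {tag: $tag})-[:PIPING_CONNECT*1..]->(d) "
    "RETURN DISTINCT d.tag AS tag"
)

# Hypothetical flow edges: (upstream tag, downstream tag).
EDGES = [
    ("R-101", "P-101"),
    ("P-101", "E-101"),
    ("E-101", "T-201"),
    ("T-100", "R-101"),  # upstream feed, must not show up in the result
]


def downstream(edges, start):
    """Breadth-first search returning every tag reachable downstream of start."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, []):
            if nxt not in seen:  # guards against recycle loops
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


print(downstream(EDGES, "R-101"))  # ['P-101', 'E-101', 'T-201']
```

This is exactly the kind of question a similarity search over text chunks cannot answer, because the answer depends on edge direction and not on wording.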

:brain: Why Neo4j?

A strict graph schema and Neo4j's support for properties on relationships were essential. We use relationship properties like fromRegion, toRegion, and fluid so the AI understands which physical region of the equipment each pipe attaches to, not just that a connection exists.
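
As a small illustration of why those relationship properties matter, the sketch below separates what enters the shell side of an exchanger from what enters the tube side. It assumes the fromRegion, toRegion, and fluid properties named above; the equipment tags, the region and fluid values, the `tag` node property, and the helper name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipingConnect:
    """In-memory stand-in for one :PIPING_CONNECT relationship."""
    source: str
    target: str
    from_region: str  # mirrors the fromRegion property
    to_region: str    # mirrors the toRegion property
    fluid: str


# Hypothetical sample data: two pipes entering the same exchanger E-101.
CONNECTIONS = [
    PipingConnect("B-201", "E-101", "Outlet", "ShellSide", "Steam"),
    PipingConnect("P-101", "E-101", "Discharge", "TubeSide", "ProcessLiquid"),
]

# A possible Cypher equivalent of feeds_into (assumed schema).
REGION_QUERY = (
    "MATCH (src)-[r:PIPING_CONNECT]->(e:Equipment {tag: $tag}) "
    "WHERE r.toRegion = $region "
    "RETURN src.tag AS source, r.fluid AS fluid"
)


def feeds_into(connections, equipment, region):
    """List (source, fluid) pairs entering one region of one piece of equipment."""
    return [
        (c.source, c.fluid)
        for c in connections
        if c.target == equipment and c.to_region == region
    ]


print(feeds_into(CONNECTIONS, "E-101", "ShellSide"))  # [('B-201', 'Steam')]
```

Without the region on the relationship, both pipes would look identical to the model: two connections into E-101.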

:handshake: The "Expert + AI" Journey

As a non-programmer, I defined the Domain Logic (Chemical Engineering rules), and Gemini wrote 100% of the React/Python code. This project is a testament to how domain experts can leverage GenAI to build specialized graph tools.

:link: Links & Feedback

Both projects are MIT Licensed. I’d love to hear your thoughts on how to further optimize the graph schema for engineering RAG!

Cheers,
A Chemical Engineer & Graph Enthusiast