Searching for pre-existing graphs to avoid double writing them

I have a fairly interesting graph schema (see below). So far my focus has been on 'reading' the graph to build a good UX for my solution.

Now I am moving on to the tedious part, 'writing', and I need to avoid 'graph instance spawning', i.e. writing a second copy of a subgraph that already exists.

For example, I have the concept of 'versions' that should reference the same 'branch' (v1 created it, v2 references it). More importantly, an organisational unit can create a branch and anyone else should be able to reference it without a problem.

I am trying to figure out how this can be done. For v1 -> v2 I obviously know what comes from where. The challenge is when a delta (or an entirely new graph for a v1) is created: I need to work out whether that branch already exists in my graph.

The branches are quite 'simple' (depths of 2 to 5, depending on which levels of 'pre-existing' I am searching) and the total set of nodes should be relatively small (between 5 and 30), but they could be 'dense' (e.g. 1 node -> 15 leaves).

They are not semantically the same at each level. It's not a simple pairing such as movies and actors, or actors related to actors; think distributor -> producer -> movie -> actor -> role. So I don't want Keanu repeated against Neo for every Matrix movie (maybe not the best example, but I am trying to find those existing 'Keanu->Neo' graphs to avoid repetition).

... I've been ruminating, and I think that since my 'writing' is interactive (UX), the checks are done against an in-memory graph.

Practically, I don't have 2 graphs inside Neo4j; I only have 1 there, plus another that I am interactively 'building' at the front-end.
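One way to picture that in-memory check is a Merkle-style fingerprint: hash each node together with the sorted fingerprints of its children, so two branches with the same labels, names and relationship types get the same hash regardless of child order. This is only a minimal sketch under assumptions: the `(label, name)` node keys, the adjacency-dict shape, the relationship names and the `fingerprint` helper are all hypothetical, and it assumes branches are trees (no cycles) and that names do not contain the `|` or `>` separators.

```python
import hashlib

def fingerprint(node, children):
    """Order-independent hash of the branch rooted at `node`.

    `node` is a hypothetical (label, name) tuple; `children` maps a node
    to a list of (relationship_type, child_node) pairs. Assumes no cycles.
    """
    label, name = node
    # Sort child fingerprints so the order of relationships does not matter.
    parts = sorted(
        f"{rel}>{fingerprint(child, children)}"
        for rel, child in children.get(node, [])
    )
    payload = "|".join([label, name, *parts])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Graph already stored (illustrative data only).
existing = {
    ("Movie", "The Matrix"): [("HAS_ACTOR", ("Actor", "Keanu"))],
    ("Actor", "Keanu"): [("PLAYS", ("Role", "Neo"))],
}
# Branch being built interactively at the front-end.
draft = {
    ("Actor", "Keanu"): [("PLAYS", ("Role", "Neo"))],
}

# Index the stored branches by fingerprint, then look the draft up.
known = {fingerprint(n, existing): n for n in existing}
match = known.get(fingerprint(("Actor", "Keanu"), draft))
```

If `match` is not `None`, the draft branch already exists and the new version can reference it instead of writing a copy. With 5 to 30 nodes per branch, recomputing the hashes on every edit should be cheap.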

I don't fully understand, but it looks like you need the ability to manage your graph with basic add, remove, alter, and delete operations. I have built an app for myself that has a network of related graphs. I use Java as my backend, where I wrote a repository that gives me all the operations I need to alter the graphs. My front-end developer calls these APIs to update the graph accordingly. It's all Cypher code that gets executed.

I also wrote a suite of custom procedures to calculate specific metrics over my graphs. These calculations could not be done efficiently in Cypher. It seems your need to compare graphs, and other similar capabilities, may be better implemented with custom procedures.

It is possible. I am researching GDS to see how I can extrapolate 'similarity' from my graphs.

My problem is, for example, that (A)->(N nodes) and (B)->(N nodes) could be total opposites if A's semantics are opposite to B's, even if the related N nodes are the same (one could be 'do' and the other 'can't do').

I am thinking of assigning a weight to those 'concentration' nodes (probably as simple as 1 / 0 / -1) that would allow me to run some of the pre-existing algorithms and would give me a 'similarity' index with meaning.
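To make the weighting idea concrete, here is a rough sketch of a signed Jaccard score: the usual overlap of the two leaf sets, multiplied by the product of the two weights. It is plain Python, not a GDS call, and the `signed_jaccard` name and the sample leaf sets are made up for illustration.

```python
def signed_jaccard(leaves_a, weight_a, leaves_b, weight_b):
    """Jaccard overlap of two leaf sets, signed by the node weights.

    Weights are assumed to be 1, 0 or -1. Same sign gives a positive
    score, opposite signs a negative one, and a 0 weight gives 0.
    """
    union = leaves_a | leaves_b
    if not union:
        return 0.0
    overlap = len(leaves_a & leaves_b) / len(union)
    return weight_a * weight_b * overlap

n_nodes = {"n1", "n2", "n3"}

# A is 'do' (+1), B is "can't do" (-1), both over the same N nodes.
opposite = signed_jaccard(n_nodes, 1, n_nodes, -1)

# Two 'do' nodes that share two of three leaves.
similar = signed_jaccard(n_nodes, 1, {"n1", "n2"}, 1)
```

Here `opposite` comes out at -1.0 (identical leaves, opposite meaning) and `similar` at roughly 0.67, so the sign carries the semantics while the magnitude still measures overlap.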

I've also got to figure out intersections and unions, e.g. you have A' and A'' with the same weight, but N' and N'' are to be intersected or aggregated (it could also be that A' and B' are opposite and some intersection, or aggregation, needs to be executed).
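As a small illustration of those set operations, assuming the leaves under each concentration node can be treated as plain sets of identifiers: same-weight nodes are merged by union or intersection, and for opposite-weight nodes the intersection shows which leaves are claimed by both sides. The function names and the rule for opposite weights are assumptions, not something the schema dictates.

```python
def merge_same_weight(leaves_a, leaves_b, how="union"):
    """Combine the leaf sets of two nodes that carry the same weight."""
    if how == "union":
        return leaves_a | leaves_b
    if how == "intersection":
        return leaves_a & leaves_b
    raise ValueError(f"unknown merge mode: {how}")

def conflicts(leaves_do, leaves_cannot):
    """Leaves that sit under both a 'do' node and a "can't do" node."""
    return leaves_do & leaves_cannot

n1 = {"read", "write", "delete"}   # leaves of A'
n2 = {"write", "delete", "share"}  # leaves of A'' (same weight as A')

aggregated = merge_same_weight(n1, n2, how="union")
common = merge_same_weight(n1, n2, how="intersection")

# If the second node were instead B' with the opposite weight:
clashing = conflicts(n1, n2)
```

What to do with the clashing leaves (drop them, keep the 'do' side, or flag them in the UX) is a modelling decision this sketch leaves open.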

I've asked ChatGPT ... still waiting.