
CALL db.schema adding connections where they don't belong? - recursion vs. hierarchy use case

llpree
Graph Buddy

I might be confused about how db.schema() works. I see it as an architectural modeling view. I was building two sub-graphs to explore whether a recursive or a hierarchical model was best for what I'm creating. But when I called db.schema(), expecting to see two sub-graphs, I saw connections that I had not created. For this post, I created a very simple example to see if anyone can help. The hierarchy is head->body. The recursion is node->node. So, using db.schema(), I'd expect a connection between the first two and, for the recursive sub-graph, a connection to itself, as shown in the bottom right-hand side of the pic below.

I added a pic that hopefully makes this clear. The top left is the result of a simple CREATE then MERGE, leaving me with two sub-graphs in the data. The nodes "node1" and "node2" are type "Node"; the head and body are Head and Body types. The bottom left-hand pic is the db.schema() output for these two sub-graphs.
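For readers who want to reproduce the setup described above, a minimal sketch might look like the following. The labels (Head, Body, Node) and the CONNECTS_TO relationship name come from the post; the `name` property and the exact statement layout are assumptions, since the original Cypher was not included.

```cypher
// Hierarchy sub-graph: Head -> Body
// (the `name` property is an assumption for illustration)
CREATE (h:Head {name: 'head'}), (b:Body {name: 'body'})
MERGE (h)-[:CONNECTS_TO]->(b);

// Recursion sub-graph: Node -> Node, reusing the same relationship name
CREATE (n1:Node {name: 'node1'}), (n2:Node {name: 'node2'})
MERGE (n1)-[:CONNECTS_TO]->(n2);
```

Run as two separate statements, this should leave two disconnected sub-graphs in the data that share only the relationship name.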

So my first confusion is: why does the CALL to db.schema() give the result in the bottom left-hand corner? I just don't understand why these two sub-graphs are now joined by two connections I did NOT create. The only thing they have in common is that the connections are spelled the same. That should NOT impact the schema, right?

The last pic, on the bottom right-hand side, is after I removed the recursion's connection (the "Node" type) and added 'x' to the CONNECTS_TO connection name. Exact same MERGE Cypher call. That solved the issue, in that I now have a schema with a hierarchy and a recursion correctly modeling a graph with two sub-graphs. But that's not what I want or need. Depending on the semantic intent of this connection, it may make sense to use the same connection name with different node types and paths. But so far, that seems impossible to create, which seems wrong to me.

So, any idea why this happens or where I messed up? I'm not a Cypher expert, but I do know graphs, and I know of nothing in graph theory that makes the spelling of connection names or attributes significant.

Thanks for any thoughts/feedback/answers.

3 REPLIES

There are some false positives while generating the schema graph that may result in phantom connections. You should rely on apoc.meta.graph() from APOC Procedures for a more accurate view at this time, though there may be some additional sampling time needed to generate the graph.
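As a rough sketch of the suggestion above, the two calls below could be compared side by side. The first assumes the APOC plugin is installed; the second is a plain Cypher query (not part of the original reply) that lists which label-to-label connections actually exist in the data, which can serve as a sanity check against any schema view.

```cypher
// Meta-graph from APOC (requires the APOC plugin to be installed)
CALL apoc.meta.graph();

// Sanity check: list the label/relationship/label combinations
// that are actually present in the stored data
MATCH (a)-[r]->(b)
RETURN DISTINCT labels(a) AS fromLabels, type(r) AS relType, labels(b) AS toLabels;
```

For the example in the question, the second query would be expected to return only the Head-to-Body and Node-to-Node rows, both with CONNECTS_TO, and no row linking Head or Body to Node.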

Thanks to you both for the quick response. I just checked the APOC docs and found nothing to alert users like me to this, so I'd suggest adding this "heads-up" to the docs (or fixing this "bug"). Currently, the APOC docs say: "examines a subset of the graph to provide a map-like meta information". Clearly, adding connections is wrong. Maybe deprecation makes sense if there's another option that does work. Thanks a ton for pointing me in the right direction.

FYI: As a final test, I reloaded APOC and ran both graph() and schema() on the updated graph, after removing the 'x' from the connection name. As you've said, graph() works perfectly and schema() continues to give erroneous results. Scary, given that the schema() proc was what was taught in the course I took from Neo4j.

Again, thanks!

PS: Is there a URL where the things that were well known about APOC to you both are categorized and called out? I've depended on the docs, but now I'm a bit concerned as I look towards more serious cases like the graph algorithms. Thanks again!!

jerald_fernando
Node Link

As Andrew has rightly pointed out, CALL db.schema() shows unwanted connections. You can try clicking on the node labels or the relationship types, which can give you the right picture (see image attached).