Traversals that involve a supernode

md7 · September 15, 2021, 2:26pm

I am using database version 4.3.3. The query below involves a supernode: while finding paths between users u1 and u3, a supernode is encountered. On profiling, I cannot understand why the query engine doesn't check the supernode.

CREATE INDEX username_index FOR (u:User) ON (u.name);
MATCH (n) DETACH DELETE n;
CREATE (u1:User {name: 'u1'}),
(u2:User {name: 'u2'}),
(u3:User {name: 'u3'}),
(u4:User {name: 'u4'}),
(u5:User {name: 'u5'}),
(u6:User {name: 'u6'}),
(u7:User {name: 'u7'}),
(u8:User {name: 'u8'}),
(u9:User {name: 'u9'}),
(sperNode:User {name: 'super-node'}),
(u1) -[:FOLLOWS]-> (u2),
(u2) -[:FOLLOWS]-> (u3),
(u2) -[:FOLLOWS]-> (sperNode),
(u4) -[:FOLLOWS]-> (sperNode),
(u5) -[:FOLLOWS]-> (sperNode),
(u6) -[:FOLLOWS]-> (sperNode),
(u7) -[:FOLLOWS]-> (sperNode),
(u8) -[:FOLLOWS]-> (sperNode),
(u9) -[:FOLLOWS]-> (sperNode)
RETURN *

PROFILE
MATCH (u1:User { name: "u1" })
WITH u1
MATCH p=(u1)-[:FOLLOWS*]-(:User { name: "u3" })
RETURN p

Thanks in advance,
baseer

Benoit_d · September 15, 2021, 5:14pm

Maybe because "super-node" is not on any path between u1 and u3?
[image: Untitled graph (1)]

md7 · September 16, 2021, 4:16am

Even if I add the supernode to the path from user u1 to u3, the engine still does not check all of the supernode's relationships. Perhaps this is some performance magic of the Cypher query engine in optimising the search path.

Test data, with the supernode on the path:

CREATE  (u1:User {name: 'u1'}),
        (u2:User {name: 'u2'}),
        (u3:User {name: 'u3'}),
        (u4:User {name: 'u4'}),
        (u5:User {name: 'u5'}),
        (u6:User {name: 'u6'}),
        (u7:User {name: 'u7'}),
        (u8:User {name: 'u8'}),
        (u9:User {name: 'u9'}),
        (sperNode:User {name: 'super-node'}),
        (u1) -[:FOLLOWS]-> (u2),
        (u2) -[:FOLLOWS]-> (u3),
        (u2) -[:FOLLOWS]-> (sperNode),
        (sperNode) -[:FOLLOWS]-> (u3),
        (u4) -[:FOLLOWS]-> (sperNode),
        (u5) -[:FOLLOWS]-> (sperNode),
        (u6) -[:FOLLOWS]-> (sperNode),
        (u7) -[:FOLLOWS]-> (sperNode),
        (u8) -[:FOLLOWS]-> (sperNode),
        (u9) -[:FOLLOWS]-> (sperNode)
RETURN *

The supernode's relationships are still not checked. As I understand it, a supernode should degrade performance. That is why I did not specify a direction in the query below.

PROFILE
MATCH p = (u1:User { name: "u1" })-[r1:FOLLOWS*]-(u3:User { name: "u3" })
RETURN p

[image: Screenshot 2021-09-16 at 9.45.59 AM]

Bennu · September 16, 2021, 8:48pm

Hi!

Can you define what you mean by "check" for a supernode?

Bennu

md7 · September 17, 2021, 5:32am

I expect the query engine to check all relationships of the supernode if it is on a path between users u1 and u3, using the query below (relationship direction not specified):

PROFILE
MATCH p = (u1:User { name: "u1" })-[r1:FOLLOWS*]-(u3:User { name: "u3" })
RETURN p

andrew_bowman · September 25, 2021, 12:20am

We don't understand what "check" you want this to perform.

Should this prevent matching through a supernode? Should it stop and report a supernode when it encounters it? Should it process through it but also report that a supernode exists here? What behavior are you expecting?

Note that if you want it to not process it, then you technically will not be getting back correct answers to your query, since such a path exists.

md7 · September 25, 2021, 7:07am

Let me rephrase my question:
A supernode has many relationships. If you analyse the path search query using the PROFILE/EXPLAIN keywords, you will notice that the 'VarLengthExpand' operator returns exactly two relationships instead of returning all relationships of the supernode.

How does Neo4j intelligently find only the relationships that are needed for the path?

andrew_bowman · September 27, 2021, 7:22pm

Thanks for clarifying.

It may depend upon the query plan.

The one I see when I run this matches to both of the end nodes first, and then performs a VarLengthExpand(Into) operation, meaning that the filtering of the end nodes is performed within the operation, since it already found both of them. It does not need to separate the var length expand and filter steps into two separate ones. That said, even if the result is going to show a row count for after that filtering, the db hits should reflect the work that was done in expansion and node id comparison.

If the plan or query was different, then you might see a VarLengthExpand operation coming only from one side, and that's where all of the supernode's relationships would be considered, and then it might be followed by a filter operator on the property name, which would bring the rows down to the exact matches.

So there's no real "magic" here: all relationships of the supernode DO have to be expanded and filtered in some way. The VarLengthExpand(Into) operator just takes care of that filtering for you instead of needing a separate filter operator, so you don't see the rows being expanded or filtered in the plan, though you should see the db hits from it.
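
To make the difference between rows and work concrete, here is a minimal Python sketch of the idea. It is a toy model only, not Neo4j's actual implementation: the names RELS and expand_into are hypothetical, and the traversal is a plain depth-first walk that ignores direction and uses each relationship at most once per path. It runs over the second test data set from this thread and reports both the matching paths and the relationships that had to be looked at to find them.

```python
from collections import defaultdict

# FOLLOWS relationships from the second test data set in this thread,
# written as (start, end) pairs. The list index serves as a relationship id.
RELS = [
    ("u1", "u2"), ("u2", "u3"), ("u2", "super-node"), ("super-node", "u3"),
    ("u4", "super-node"), ("u5", "super-node"), ("u6", "super-node"),
    ("u7", "super-node"), ("u8", "super-node"), ("u9", "super-node"),
]


def expand_into(rels, source, target):
    """Toy model (an assumption, not the real operator): walk every
    undirected path from source, never reusing a relationship within a
    path, and keep only the paths that end at target."""
    adjacency = defaultdict(list)
    for rel_id, (a, b) in enumerate(rels):
        adjacency[a].append((rel_id, b))
        adjacency[b].append((rel_id, a))

    touched = set()   # every relationship the walk had to look at
    matches = []      # only the paths that survive the end-node check

    def walk(node, used, path):
        for rel_id, neighbour in adjacency[node]:
            touched.add(rel_id)
            if rel_id in used:
                continue
            new_path = path + [neighbour]
            if neighbour == target:
                matches.append(new_path)
            # Variable-length patterns keep expanding past a match.
            walk(neighbour, used | {rel_id}, new_path)

    walk(source, frozenset(), [source])
    return matches, touched


matches, touched = expand_into(RELS, "u1", "u3")
print(len(matches), "matching paths:", matches)
print(len(touched), "of", len(RELS), "relationships were looked at")
```

In this model only two paths come back (u1, u2, u3 and u1, u2, super-node, u3), yet all ten relationships are touched, including every spoke of the supernode. That mirrors the point above: the row count shows what survived the filtering, not how much expansion was needed to get there.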
