I am using database version 4.3.3, and the query below involves a supernode: while finding a path between users u1 and u3, a supernode is encountered. On profiling, I am not able to understand why the query engine doesn't check the supernode.
Even if I add a supernode on the path from user u1 to u3, it still does not check all the relationships of the supernode. Perhaps this is some performance magic of the Cypher query engine in optimising the search path.
Still, the supernode's relationships are not checked. In my understanding, a supernode should degrade performance. That is the reason I did not specify the relationship direction in the query below.
PROFILE
MATCH p = (u1:User { name: "u1" })-[r1:FOLLOWS*]-(u3:User { name: "u3" })
RETURN p
I expected the query engine to check all relationships of the supernode if it is on the path between users u1 and u3 in the query above (relationship direction not specified).
We don't understand what "check" you want this to perform.
Should this prevent matching through a supernode? Should it stop and report a supernode when it encounters one? Should it process through it but also report that a supernode exists here? What behavior are you expecting?
Note that if you want it to not process it, then you technically will not be getting back correct answers to your query, since such a path exists.
Let me rephrase my question:
A supernode has many relationships. If you analyse the path search query using the PROFILE or EXPLAIN keyword, you will notice that the 'VarLengthExpand' operator returns exactly two relationships instead of all the relationships of the supernode.
How does Neo4j intelligently find only the relationships that are needed for the path?
The plan I see when I run this matches both of the end nodes first and then performs a VarLengthExpand(Into) operation. Because both end nodes are already found, the filtering on the end node is performed within that operation, so the plan does not need separate var-length expand and filter steps. That said, even though the row count shown is the count after that filtering, the db hits should reflect the work that was done in expansion and node id comparison.
If the plan or query were different, you might see a VarLengthExpand operation coming from only one side, and that's where all of the supernode's relationships would be considered. It might then be followed by a filter operator on the name property, which would bring the rows down to the exact matches.
So there's no real "magic" here: all relationships of the supernode DO have to be expanded and filtered in some way. The VarLengthExpand(Into) operator just takes care of that filtering for you instead of needing a separate operator, so you don't see the rows being expanded or filtered in the plan, though you should see the db hits from it.
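To make the rows-versus-work distinction concrete, here is a small Python toy model. It is only a sketch and not how Neo4j implements the operator: the graph, the node name "s" for the supernode, the 2-hop cap, and the function names build_graph and var_length_expand are all made up for illustration. It walks outward from u1, counts every relationship it has to look at, and only keeps paths that end at u3, which is roughly the "expand and filter in one step" idea described above.

```python
from collections import defaultdict


def build_graph(supernode_degree):
    """Toy undirected FOLLOWS graph: u1 - s - u3, where s also has
    `supernode_degree` extra neighbours (hypothetical data)."""
    adj = defaultdict(set)

    def link(a, b):
        adj[a].add(b)
        adj[b].add(a)

    link("u1", "s")
    link("s", "u3")
    for i in range(supernode_degree):
        link("s", f"x{i}")
    return adj


def var_length_expand(adj, start, end, max_hops):
    """Enumerate paths from `start` without reusing a relationship and keep
    only those ending at `end`.

    Returns (matching_paths, relationships_touched). The hop cap is an
    assumption to keep the sketch small; the query in this thread is unbounded.
    """
    touched = 0
    matches = []

    def walk(node, path_nodes, used_edges):
        nonlocal touched
        if len(used_edges) >= max_hops:
            return
        for nxt in adj[node]:
            touched += 1  # every relationship on the node is looked at
            edge = frozenset((node, nxt))
            if edge in used_edges:
                continue  # a relationship may appear only once per path
            new_path = path_nodes + [nxt]
            if nxt == end:
                matches.append(new_path)  # only these become result rows
            walk(nxt, new_path, used_edges | {edge})

    walk(start, [start], frozenset())
    return matches, touched


if __name__ == "__main__":
    adj = build_graph(1000)
    paths, touched = var_length_expand(adj, "u1", "u3", 2)
    print(len(paths), "matching path(s);", touched, "relationships touched")
```

In this toy model, a supernode with 1000 extra relationships yields a single matching path while 1003 relationships are touched. That is the same shape as a profile showing a small row count next to a much larger db hits figure.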