Hi Neo4j community,
I am evaluating the performance of a query using the recommendations dataset included in the Sandbox.
I tried to emulate a multi-label match with a single label by doing the following:
MATCH (n:Actor:Director) SET n:ActorAndDirector RETURN COUNT(n)
So this way I had a label for actors who are directors as well.
Then I performed the simple query
PROFILE MATCH (n:ActorAndDirector) WHERE EXISTS(n.name) AND EXISTS(n.bornIn) RETURN n
which results in 442 nodes.
It takes 9545 total db hits, and I measured the runtime (in milliseconds) over 10 repeated runs as follows
298 67 50 111 35 108 34 35 27 30
I know that runtime is not the best measurement and that the cache needs to warm up, so I am happy to get suggestions on better metrics, since the db hits will always be the same. What I found especially confusing is the following.
To see how the same query performs when a lot more nodes are involved, I ran
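For reference, this is roughly how I would collect the timings more systematically. It is only a sketch in Python: `run_query` is a hypothetical placeholder for whatever function sends the Cypher query to the database (it is not part of any Neo4j API), and the stand-in workload at the bottom is there only so the snippet runs on its own.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_runs(run_query: Callable[[], object],
              repeats: int = 10,
              warmup: int = 1) -> Dict[str, float]:
    """Call run_query `warmup` times untimed, then `repeats` times timed.

    Returns min, median and max wall-clock time in milliseconds.
    run_query is a hypothetical callable that executes the query.
    """
    # Untimed warm-up runs so that caches are populated first.
    for _ in range(warmup):
        run_query()

    samples_ms: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    return {
        "min_ms": min(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "max_ms": max(samples_ms),
    }


if __name__ == "__main__":
    # Stand-in workload; replace with a function that runs the real query.
    print(time_runs(lambda: sum(range(100_000))))
```

Reporting the median of the timed runs rather than individual values should make the numbers less sensitive to a single slow execution.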
UNWIND range(1,44200) AS vertices
CREATE (n:ActorAndDirector {name:vertices, bornIn:vertices})
to create 100 times as many vertices, with name set to 1, 2, 3, etc. and bornIn set to the same values. These are not meaningful properties; they are only there for the sake of the performance test. Then I executed
PROFILE MATCH (n:ActorAndDirector) WHERE EXISTS(n.name) AND EXISTS(n.bornIn) RETURN n
again, this time with 44642 nodes as the result and 407345 total db hits. However, the runtimes of the first ten executions were
28 5 5 9 6 6 5 7 5 6
in milliseconds. Again there is some warm-up and fluctuation, but the query is much faster than before on a much larger (x101) vertex set.
Even weirder, after deleting all these new vertices and running the query on the original 442 vertices, the query was slower again, close to the performance of the first round.
I am sure that this is not the best way to measure performance, and I am just trying to get started somewhere, but I was surprised by the results and cannot make sense of them.
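To put the two sets of numbers side by side, here is a small Python sketch that uses only the figures quoted above. Dropping the first run as warm-up and taking the median of the rest is my own choice of summary, not an established Neo4j benchmark method.

```python
import statistics

# Runtimes in milliseconds, copied from the two experiments above.
small_ms = [298, 67, 50, 111, 35, 108, 34, 35, 27, 30]  # 442 nodes
large_ms = [28, 5, 5, 9, 6, 6, 5, 7, 5, 6]              # 44642 nodes


def warm_median(samples):
    """Median after discarding the first (warm-up) run."""
    return statistics.median(samples[1:])


# Total db hits divided by the number of returned nodes.
small_hits_per_node = 9545 / 442
large_hits_per_node = 407345 / 44642

print("warm median, 442 nodes:  ", warm_median(small_ms), "ms")
print("warm median, 44642 nodes:", warm_median(large_ms), "ms")
print("db hits per node, 442 nodes:  ", round(small_hits_per_node, 2))
print("db hits per node, 44642 nodes:", round(large_hits_per_node, 2))
```

So the warm median is 35 ms for the small set versus 6 ms for the large one, and the db hits per returned node also differ (about 21.6 versus about 9.1), which suggests the nodes I created are cheaper to process than the original ones.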
I have looked here in the community forum and on the internet, but could not find anything that could explain this.
Thank you very much for any helpful input, it is much appreciated.
Best,
Philipp