The answer on slide 8 of the browser guide indicates that the correct way to introduce pattern comprehension to this query:
PROFILE MATCH (a:Actor)-[:ACTED_IN]->(m:Movie) WHERE $relYear1 <= m.releaseYear <= $relYear2 AND a.born > $yearBorn RETURN a.name as actor, a.born as born, collect(DISTINCT m.title) AS titles ORDER BY actor
is like so:
PROFILE MATCH (a:Actor) WHERE a.born > $yearBorn RETURN a.name AS actor, a.born AS born, [(a)-->(x) WHERE $relYear1 <= x.releaseYear <= $relYear2 | x.title] AS titles ORDER BY actor
This generates 95036 db hits in 49 ms. My questions are:
Why is there seemingly no performance penalty from using [(a)-[:ACTED_IN]->(m:Movie)..? Does pattern comprehension really not care about having no relationship type or node labels included in this way?
If you do swap to the more explicit syntax of [(a)-[:ACTED_IN]->(m:Movie).., then you must include WITH DISTINCT a before the RETURN statement, otherwise there are duplicate rows for each actor. Why is this?
If you use the documented syntax of [(a)-->(x).., it's possible to get a slightly faster execution time (by a few ms) by adding a WITH DISTINCT a clause first, but strangely this increases db hits to 98273. Can anyone explain why?
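
For concreteness, the two variants mentioned in the last two questions might look something like the sketch below. The exact queries are not shown above, so the placement of WITH DISTINCT a, the variable name m in the explicit pattern, and the reuse of $yearBorn, $relYear1 and $relYear2 are all assumptions based on the guide's answer.

```cypher
// Assumed form of the "more explicit" variant: relationship type and
// node label are spelled out inside the pattern comprehension, and
// WITH DISTINCT a is placed before RETURN.
PROFILE
MATCH (a:Actor)
WHERE a.born > $yearBorn
WITH DISTINCT a
RETURN a.name AS actor,
       a.born AS born,
       [(a)-[:ACTED_IN]->(m:Movie)
          WHERE $relYear1 <= m.releaseYear <= $relYear2 | m.title] AS titles
ORDER BY actor;

// Assumed form of the documented variant with WITH DISTINCT a added:
// the pattern comprehension keeps the untyped, unlabeled (a)-->(x).
PROFILE
MATCH (a:Actor)
WHERE a.born > $yearBorn
WITH DISTINCT a
RETURN a.name AS actor,
       a.born AS born,
       [(a)-->(x)
          WHERE $relYear1 <= x.releaseYear <= $relYear2 | x.title] AS titles
ORDER BY actor;
```

Running each statement separately under PROFILE should make it possible to compare the db hits of the individual operators in the two plans, rather than only the totals.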