How to use order by and limit inside a multi-line cypher query

Greetings,

I am trying to build the link between multiple nodes using the following query which contains multiple match statements. On the third match I get more than one node as a result but I need to sort them on the childid in descending order and pick the one highest childid value but it is not allowing me to use order by and limit inside the query. I am sure I am doing something wrong. How do I go about achieving this. ?

match (f:Node1) where f.prop1 = "XX" and f.prop2 = "ABC" and f.prop3 = "DEF"
match (n:Node2 {name: f.prop4})-[:Node2_to_Node4]->(rt:Node4)
match (r0:Node3) where r0.prop2 = rt.name and r0.prop1 = "XX" order by r0.childid desc limit 1
return f.prop1, f.prop2, f.prop3, f.prop4 + "," + r0.prop1+ "," + r0.prop2+ "||" + r0.id as node3 order by f.prop1, f.prop2, f.prop3, f.prop4 limit 5

Thanks
Satish

Hi Satish,

ORDER BY sub-clause must be following clause RETURN or WITH.
So after the third Match you need to use WITH in case you need to utilize ORDER BY

Try this:

match (f:Node1) where f.prop1 = "XX" and f.prop2 = "ABC" and f.prop3 = "DEF"

match (n:Node2 {name: f.prop4})-[:Node2_to_Node4]->(rt:Node4)

match (r0:Node3) where r0.prop2 = rt.name and r0.prop1 = "XX"

with f, n, rt, r0, max(r0.childid) as chld

match(r1:Node3) where r1.prop2 = rt.name and r1.prop1 = "XX" and r1.childid = chld

with f, r1

return f.prop1, f.prop2, f.prop3, f.prop4 + "," + r1.prop1+ "," +

r1.prop2+ "||" + r1.id as node3 order by f.prop1, f.prop2, f.prop3, f.prop4 limit 5

Thank you @ameyasoft. Isn't it a performance issue running the match just to get the max value and then running the match again to use that max value.?

Thanks @intouch_vivek. Will try that.

An alternate approach for you here, if you have APOC Procedures (a release >= 3.5.0.5 for the 3.5.x branch) you can make use of apoc.coll.maxItems():

match (f:Node1) where f.prop1 = "XX" and f.prop2 = "ABC" and f.prop3 = "DEF"
match (n:Node2 {name: f.prop4})-[:Node2_to_Node4]->(rt:Node4)
match (r0:Node3) where r0.prop2 = rt.name and r0.prop1 = "XX"
with f, n, rt, apoc.agg.maxItems(r0, r0.childid, 1).items[0] as r0
return f.prop1, f.prop2, f.prop3, f.prop4 + "," + r0.prop1+ "," + r0.prop2+ "||" + r0.id as node3 order by f.prop1, f.prop2, f.prop3, f.prop4 limit 5

This will perform an aggregation to find the r0 nodes with the max childid properties. We also supply 1 so we only get 1 item for a max value (instead of collecting ties into the list), and then we extract that r0 node with the max value from the resulting items list.

1 Like

Thank you @andrew_bowman.

Just to clarify ....

apoc.agg.maxItems(r0, r0.childid, 1).items[0]

The statement above takes all the nodes of the variable r0 and sorts them by childid in descending order and then returns an array with 1 item(third argument). We are selecting that one item in the array using the items[0]. Did I understand that correctly ?

Thanks

Almost got it.

It's not doing sorting, but it is evaluating the value (r0.childid) per entry, and tracking only the entry (or entries, depending on the third parameter) with the highest value. If it evaluates a subsequent entry that has a higher value, it replaces the entry that it's tracking. If it finds an entry tied for the max value, it groups it by appending it to the list (though respecting that third parameter to limit the grouping with the same max value).

With these parameters, the result is a map structure like {items:[], value:n}, with one entry in the items list. Since you care just about the item, not the value, you just extract that single entry.

I have a Cypher knowledge base article pending about this, I'll link it here once it's published.

1 Like