We have been working with Neo4j for two years now and do a lot of performance testing.
We have searched a lot for a way to sort a large number of nodes efficiently, but have not found a good solution.
For example, when I run a sort like this:
MATCH(message:Message) RETURN message ORDER BY message.changed_on DESC LIMIT 10
Cypher version: CYPHER 3.4, planner: COST, runtime: COMPILED. 6600415 total db hits in 34132 ms.
What is the best way to do this?
We've had a request to improve ORDER BY performance on indexed properties for a while now, but until recently other performance improvements have taken precedence in our engineering work.
We expect to see indexed ORDER BY make its debut with the 3.5 release, which should be within the next month or two.
Something I usually do in these cases is the following.
Run a query at regular intervals that returns the timestamp of the message at position 100 or 1000 (counting from the newest) and keep it in memory.
MATCH(message:Message) RETURN message.changed_on ORDER BY message.changed_on DESC SKIP 1000 LIMIT 1
and then run:
MATCH(message:Message) WHERE message.changed_on > $timestamp RETURN message ORDER BY message.changed_on DESC LIMIT 10
This uses the index on changed_on for the range lookup and, even as new messages are added, keeps the number of nodes to sort in the 1,000 to 10k range.
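
The application side of this workaround could look roughly like the Python sketch below. It is only an illustration under a few assumptions: the class name RecentMessages, the run_query callable, the refresh interval and the fallback query are all hypothetical and not part of the original suggestion, and it assumes an index on :Message(changed_on) already exists. The cutoff query here sorts by changed_on so that SKIP picks the message at the intended position.

```python
import time
from typing import Any, Callable, Dict, List, Optional

Record = Dict[str, Any]
# Hypothetical adapter: takes a Cypher string and parameters, returns rows as dicts.
RunQuery = Callable[[str, Dict[str, Any]], List[Record]]

# Assumes an index exists, e.g. in Neo4j 3.x: CREATE INDEX ON :Message(changed_on)

# Timestamp of the message at position $offset, counting from the newest.
CUTOFF_QUERY = (
    "MATCH (message:Message) "
    "RETURN message.changed_on AS cutoff "
    "ORDER BY message.changed_on DESC SKIP $offset LIMIT 1"
)
# Range lookup on the indexed property, then a sort over a small candidate set.
RECENT_QUERY = (
    "MATCH (message:Message) WHERE message.changed_on > $timestamp "
    "RETURN message ORDER BY message.changed_on DESC LIMIT $limit"
)
# Used only when there are fewer than offset + 1 messages, so the sort is small anyway.
FALLBACK_QUERY = (
    "MATCH (message:Message) "
    "RETURN message ORDER BY message.changed_on DESC LIMIT $limit"
)


class RecentMessages:
    """Caches a cutoff timestamp and uses it to bound the ORDER BY."""

    def __init__(
        self,
        run_query: RunQuery,
        offset: int = 1000,
        refresh_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._run = run_query
        self._offset = offset
        self._refresh_seconds = refresh_seconds
        self._clock = clock
        self._cutoff: Optional[Any] = None
        self._fetched_at: Optional[float] = None

    def _refresh_cutoff(self) -> None:
        # The expensive full sort happens here, but only once per refresh interval.
        rows = self._run(CUTOFF_QUERY, {"offset": self._offset})
        self._cutoff = rows[0]["cutoff"] if rows else None
        self._fetched_at = self._clock()

    def latest(self, limit: int = 10) -> List[Record]:
        now = self._clock()
        stale = (
            self._fetched_at is None
            or now - self._fetched_at >= self._refresh_seconds
        )
        if stale:
            self._refresh_cutoff()
        if self._cutoff is None:
            return self._run(FALLBACK_QUERY, {"limit": limit})
        return self._run(
            RECENT_QUERY, {"timestamp": self._cutoff, "limit": limit}
        )
```

With the official Python driver, run_query could be a small wrapper around something like session.run(cypher, params).data(), though that wiring is left out here. The limit should stay well below the offset, since messages sharing the cutoff timestamp are excluded by the > comparison.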