Best way to sort nodes on changed_on


(Tomswinkels) #1

Hi,

We have been working with Neo4j for 2 years now and do a lot of performance checks.

We have searched a lot for ways to sort many nodes and have not found a good solution.

Even when I only do a sort like this:

MATCH (message:Message)
RETURN message
ORDER BY message.changed_on DESC
LIMIT 10

Cypher version: CYPHER 3.4, planner: COST, runtime: COMPILED. 6600415 total db hits in 34132 ms.

What is the best way to do this?


(Andrew Bowman) #2

We've had a request to improve ORDER BY performance on indexed properties for a while now, but until recently other performance improvements have taken precedence in engineering work.

We expect to see indexed ORDER BY make its debut with the 3.5 release, which should be within the next month or two.


(Tomswinkels) #3

Great!

Are the timestamps not indexed? Or is this a bug in the new Neo4j indexes?


(Andrew Bowman) #4

It's more of a missing feature. Indexes in Neo4j are currently used for lookup, not for sorting purposes. Once 3.5 is out, if there is an index present on the property you're using for the sort, you should see much better performance.


(Michael Hunger) #5

Something I usually do in these cases is the following.

Run a query at regular times that gives me the timestamp of the message at index 100 or 1000 and store it in memory.

MATCH (message:Message)
RETURN message.changed_on
ORDER BY message.changed_on DESC
SKIP 1000 LIMIT 1

and then run:

MATCH (message:Message)
WHERE message.changed_on > $timestamp
RETURN message
ORDER BY message.changed_on DESC
LIMIT 10

This uses the index for the range lookup and lets you (even if new messages are added) keep the volume to sort in the 1000 to 10k range.
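
To make the idea concrete, here is a minimal in-memory sketch in Python. It is only an illustration, not Neo4j code: the list of integers stands in for message.changed_on values, the list comprehension stands in for the index range lookup, and the function names (cutoff_timestamp, latest) are hypothetical. It assumes the cached timestamp is the one at position 1000 when messages are ordered newest first.

```python
import heapq
import random


def cutoff_timestamp(timestamps, skip):
    """Timestamp at position `skip` (0-based) in newest-first order.

    This is the expensive step that would run only periodically,
    with the result cached in memory.
    """
    return heapq.nlargest(skip + 1, timestamps)[-1]


def latest(timestamps, cutoff, limit=10):
    """Keep only entries newer than the cached cutoff, then sort those.

    In Neo4j the filter would be served by the index range lookup;
    here it is a plain scan, used only to show how few rows get sorted.
    """
    candidates = [t for t in timestamps if t > cutoff]
    return sorted(candidates, reverse=True)[:limit], len(candidates)


random.seed(0)
# 100,000 distinct fake timestamps standing in for message.changed_on
timestamps = random.sample(range(10_000_000), 100_000)

# Periodic job: cache the cutoff
cutoff = cutoff_timestamp(timestamps, 1000)

# 50 newer messages arrive after the cutoff was cached
timestamps += [10_000_000 + i for i in range(50)]

top10, sorted_count = latest(timestamps, cutoff)
print(top10[0], sorted_count)
```

The result matches a full sort of all 100,050 values, but only the entries above the cutoff are sorted. The cutoff has to be refreshed often enough that this candidate set stays small, and it must never leave fewer than LIMIT entries above it.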