Get most recent of 000's of transaction nodes (t) connected to a single account node (a)

performance

(Simon) #1

I have a number of nodes representing accounts, called say (a :Account). Each (a) can have potentially tens of thousands of (t :Transaction) nodes connected to it, each representing the data for a transaction that occurred involving that account.

The (t) nodes have a timestamp property representing when that transaction occurred. What would the most efficient way to get the latest (t :Transaction) node for each (a :Account) that occurs before a given timestamp, This could be one way to do it:

optional match (a :Account)-->(t :Transaction)
where t.timestamp <= date("2014-03-07")
return a, t
order by t.timestamp desc
limit 1

However I'm not sure if this method is very efficient when the number of (t) connected to each (a) becomes very very large if running for lots of account nodes. What would be the best way to get the most recent (t :Transaction) for an (a :Account) that occurs before a particular date?

Many thanks.


(Michael Hunger) #2

Exactly what you did (minus the optional) in Neo4j 3.5 benefits from index backed order by if you have an index on :Transaction(timestamp)

You see that it will use the information from the index for the sort + limit.

explain
match (a :Account)-->(t :Transaction)
where t.timestamp <= date("2014-03-07")
return a, t
order by t.timestamp desc
limit 1


(Simon) #3

Many thanks for your help resolving this.