Lingering Queries & TimedOut Transactions

Greetings, I'm running into a weird issue on my neo4j installation where timedout transactions and queries are lingering and cannot be killed. I opened an issue relating to the later problem (queries cannot be killed) but that is a side note, and I would like to determine the root cause of the issue.

Long story short I have parallel workers looking for nodes tagged with 'Ledger' whose 'processed' property is false. The exact query I am using to lookup these ledgers is as follows:

MATCH(l:Ledger { processed : false }) RETURN l LIMIT 5

I set the limit to 5 as I have 5 workers and thus guarantee that each pass will pull in at least one ledger which can be processed by that worker (workers are implemented as separate threads in a single process which have access to a shared array used to reserve ledgers being processed).

I currently have an index on Ledger.processed (which is required as lookup is slow otherwise) and since this is a batch import I have high checkpoint interval settings:

dbms.checkpoint.interval.time=2h
dbms.checkpoint.interval.tx=1200000

Upon spinning up my importer script everything works flawlessly for a while, usually for 1/2 a day at least the workers find unprocessed ledgers very quickly. Unfortunately after a while some of these queries stop returning and they start accumulating in the list of queries. I notice this as CPU usage increases on the server as more and more queries are left running.

There doesn't seem to be any rhyme or reason as to when this occurs, checkpoints happen without incident, and the lingering transactions seem to manifest randomly after a while running.

As a workaround I tried adding a short timeout to transactions so as to kill these queries if they take too long:

dbms.transaction.timeout=10s
dbms.lock.acquisition.timeout=10s

Unfortunately while it seems like the timeout is being enforced, the transactions are lingering as are the queries as indicated by the transactions list...

neo4j@neo4j> call dbms.listTransactions() yield transactionId, elapsedTimeMillis, currentQuery, currentQueryId, status, idleTimeMillis, pageHits, pageFaults where elapsedTimeMillis > 100 return transactionId, status, currentQuery, currentQueryId, elapsedTimeMillis, idleTimeMillis, pageHits, pageFaults;
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| transactionId               | status                                                                                 | currentQuery                                             | currentQueryId        | elapsedTimeMillis | idleTimeMillis | pageHits | pageFaults |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "neo4j-transaction-537609"  | "Terminated with reason: Status.Code[Neo.ClientError.Transaction.TransactionTimedOut]" | ""                                                       | ""                    | 18862184          | 18862184       | 0        | 0          |
| "neo4j-transaction-4129837" | "Terminated with reason: Status.Code[Neo.ClientError.Transaction.TransactionTimedOut]" | "MATCH(l:Ledger { processed : false }) RETURN l LIMIT 5" | "neo4j-query-4129853" | 3315643           | 3315643        | 17       | 0          |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

...and queries list:

neo4j@neo4j> call dbms.listQueries() yield queryId, query, elapsedTimeMillis, status, idleTimeMillis, pageHits, pageFaults where elapsedTimeMillis > 100 return queryId, query, status, elapsedTimeMillis, idleTimeMillis, pageHits, pageFaults;
+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
| queryId               | query                                                    | status    | elapsedTimeMillis | idleTimeMillis | pageHits | pageFaults |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
| "neo4j-query-4129853" | "MATCH(l:Ledger { processed : false }) RETURN l LIMIT 5" | "running" | 3425892           | NULL           | 17       | 0          |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------+

Could anyone provide any insights into

  1. Why this query is lingering indefinitely?
  2. Why terminated transactions are lingering?

Many thanks in advance for any help