I'm facing a strange behaviour while trying to optimize a Cypher query that should fetch Message nodes from our database. (Enterprise 4.4.2)
As we need to apply some simple conditions and some with more text comparison, I tried to separate them as I didn't want to perform those text comparisons on all of the 255.000 Message nodes. So I tried this:
MATCH (mail:Message)
WHERE mail.channel IN ["Mail","WhatsApp","Messenger","Sms"]
AND mail.sent_at is null
AND not mail.draft
AND mail.trial_count <= 30
WITH mail // until here there are 23 lines as result
WHERE (mail.send_at is null or mail.send_at <= "2022-12-08T10:34")
and
(
(
(mail.last_send_trial_at is null AND mail.created_at < "2022-12-08T08:33:53.5044341+01:00")
OR
(mail.trial_count > 0
AND mail.trial_count <= 5
AND (mail.last_send_trial_at is null or mail.last_send_trial_at < "2022-12-08T08:28:53.5044341+01:00"))
)
OR (mail.trial_count > 5 AND (mail.last_send_trial_at is null or mail.last_send_trial_at < "2022-12-08T07:33:53.5044341+01:00"))
)
WITH mail
The first part until the first WITH clause results 23 lines. On these 23 lines, the following conditions should be applied.
BUT adding these following conditions after the With slows down the cypher by more than 2 seconds?? That made me wonder.
When I looked into the query plan, I found out that some of those string comparison conditions on created_at and last_send_trial_at that are AFTER the first WITH are evaluated together with the conditions from before the first WITH - so I guess they are applied to 255.000 nodes.
Now I tried to force a separation of the conditions before and after with by replacing the first WITH with this
WITH mail as mail2
WITH mail2 as mail
And now it's working as inteded, those string comparisons take just a few ms as they are just applied to those 23 lines that resulted before.
This mixing of criteria from blocks before and after a WITH clause looks like a bug for me as the explaination of the WITH clause according to the documentation is
The
WITH
clause allows query parts to be chained together, piping the results from one to be used as starting points or criteria in the next.
Best
Reiner