Query mixes criteria from before and after WITH clause into on filter action

I'm facing a strange behaviour while trying to optimize a Cypher query that should fetch Message nodes from our database. (Enterprise 4.4.2)

As we need to apply some simple conditions and some with more text comparison, I tried to separate them as I didn't want to perform those text comparisons on all of the 255.000 Message nodes. So I tried this:

MATCH (mail:Message)
WHERE mail.channel IN ["Mail","WhatsApp","Messenger","Sms"] 
AND mail.sent_at is null 
AND not mail.draft 
AND mail.trial_count <= 30 

WITH  mail  // until here there are 23 lines as result

WHERE  (mail.send_at is null or mail.send_at <= "2022-12-08T10:34")
and
(
    (
        (mail.last_send_trial_at is null AND mail.created_at < "2022-12-08T08:33:53.5044341+01:00") 
        OR 
        (mail.trial_count > 0 
        AND mail.trial_count <= 5 
        AND (mail.last_send_trial_at is null or mail.last_send_trial_at < "2022-12-08T08:28:53.5044341+01:00"))
    ) 
    OR  (mail.trial_count > 5 AND (mail.last_send_trial_at is null or mail.last_send_trial_at < "2022-12-08T07:33:53.5044341+01:00"))
)
WITH mail

The first part until the first WITH clause results 23 lines. On these 23 lines, the following conditions should be applied.

BUT adding these following conditions after the With slows down the cypher by more than 2 seconds?? That made me wonder.

When I looked into the query plan, I found out that some of those string comparison conditions on created_at and last_send_trial_at that are AFTER the first WITH are evaluated together with the conditions from before the first WITH - so I guess they are applied to 255.000 nodes.

Now I tried to force a separation of the conditions before and after with by replacing the first WITH with this

WITH  mail as mail2
WITH  mail2 as mail

And now it's working as inteded, those string comparisons take just a few ms as they are just applied to those 23 lines that resulted before.

This mixing of criteria from blocks before and after a WITH clause looks like a bug for me as the explaination of the WITH clause according to the documentation is

The WITH clause allows query parts to be chained together, piping the results from one to be used as starting points or criteria in the next.

Best
Reiner

Hi Reiner,

The description in the documentation is with regards to the semantics of the language (as in: what results are returned), not how it is implemented. Cypher is a declarative language and the query optimiser tries to make use of all information available to it to find the optimal plan.

I'm sorry to learn that the optimiser in your case has chosen a suboptimal plan. You seem to have found a way to enforce the separation of the two query parts, though, which is what we would suggest in those cases.

If you provide the PROFILE of the query, we could have a closer look at how the query / the query optimiser could be improved.

Hi Arne,

that was almost a year ago. I can no longer reproduce this behaviour as the data has changed massively since then. With current data, both ways are fast.

Best,
Reiner