Eager operation in a previously running query

Neo4j: 5.26.7 EE, running on Ubuntu Pro 24.04 LTS
file
Well, good morning to everyone!

I got this very simple query

    WITH "pikeen" AS owner
    WITH pikeen.uuid5(owner,owner) as ownerUuid5
    
    MATCH (pikeen:Owner { uuid: ownerUuid5 })
    WITH pikeen, ownerUuid5
    
    LOAD CSV WITH HEADERS 
    FROM  "https://storage.googleapis.com/file.csv" 
    AS row FIELDTERMINATOR ';'
    
        WITH row, pikeen, toLower(row.url) AS url
            WHERE row.uuid <> "#"
        
        WITH row, pikeen, url, pikeen.uuid5(url,row.shortName) as uuid5
        
        MERGE (merchant:Merchant { uuid: uuid5})
            SET merchant.shortName= row.shortName,
                merchant.uuid= uuid5,
                merchant.url= url
        
        MERGE (pikeen)-[:HAS_MERCHANT]->(merchant)
        
        return pikeen, merchant;

Well, if I run this query, it do nothing but returns

This query was working nicely with the old version of Desktop, but now returns thi.

So, the question is why it signal a Eager operation, and where is my fault:

I tried to explain the query, and I find where there are Eager operations: but it is not clear what I have to do to repair it.

BTW, the csv file contains just 6 lines!!!

AND the CREATE CONSTRAINT ON (m:Merchant) ASSERT m.uuid IS UNIQUE; is already in place

Have you tried nested calls?

WITH "pikeen" AS owner
    WITH pikeen.uuid5(owner,owner) as ownerUuid5
    
    MATCH (pikeen:Owner { uuid: ownerUuid5 })
    WITH pikeen, ownerUuid5
    
    LOAD CSV WITH HEADERS 
    FROM  "https://storage.googleapis.com/file.csv" 
    AS row FIELDTERMINATOR ';'
    
        WITH row, pikeen, toLower(row.url) AS url
            WHERE row.uuid <> "#"
        
        WITH row, pikeen, url, pikeen.uuid5(url,row.shortName) as uuid5

        CALL {

                WITH row, pikeen, url, uuid5

                MERGE (merchant:Merchant { uuid: uuid5})
                ON CREATE
                    SET merchant.shortName= row.shortName,          
                        merchant.url= url
        
                MERGE (pikeen)-[:HAS_MERCHANT]->(merchant)

                return merchant
        }
        return pikeen, merchant;

Hi @joshcornejo ,

I tryied, but the result is exactly the same:

the execution plan for this query contains the eager operator, which forces all dependent data to be materialize in main memory before proceding

I don't have your pikeen module ... so i just substituted it with strings and I didn't get the error - you'll have to see if anyone from Neo picks it up.

You could just load all the csv inside a temporaryLabel and work it out from a MATCH instead (as I understand that EAGER is related to the LOAD CSV and the fact that it doesn't want to load everything into memory)

I didn't understand what do you mean with You could just load all the csv inside a temporaryLabel and work it out from a MATCH. Can you give me a sample?

BTW, the query in any case doesn't run after the EAGER message. It yust returns 0 managed records

I would do

LOAD CSV WITH HEADERS 
    FROM  "https://storage.googleapis.com/file.csv" 
    AS row FIELDTERMINATOR ';'

into some :tmpNode.

And then run the rest of the query.

Hi @joshcornejo,

Please, show me the exact syntax to assign Load CSV to :tmpnode, thank you.

I don't know what are the contents of your csv file?

But it looks like you have row.url, row.shortName ... that should be easy?

Something like this:

LOAD CSV 
    WITH HEADERS 
    FROM "https://storage.googleapis.com/file.csv" AS row 
    FIELDTERMINATOR ';'
CREATE (:tmpNode {url: row.url, shortName: row.shortName})

If you have duplicates on your csv on the URL ?

LOAD CSV 
    WITH HEADERS 
    FROM "https://storage.googleapis.com/file.csv" AS row 
    FIELDTERMINATOR ';'
MERGE (:tmpNode {url: row.url, shortName: row.shortName})

then you can run your queries on that tmpNode label

I think it's too cumbersome:

I tryied with a slightly different approach:

LOAD CSV 
    WITH HEADERS 
    FROM "https://storage.googleapis.com/file.csv" AS row1 
    FIELDTERMINATOR ';'

   CREATE (row:Row)
        set row = row1

but the LOAD CSV always returns the Eager error.

There should be something about indexes ....

The eager seems to be triggered by the SET operation. You could try to include the set as part of the merge with "on create" and "on match". I don't think that removes the eager though. In my plan, it just moved the eager operation.

WITH "pikeen" AS owner
WITH pikeen.uuid5(owner,owner) as ownerUuid5
    
MATCH (pikeen:Owner { uuid: ownerUuid5 })
WITH pikeen, ownerUuid5

LOAD CSV WITH HEADERS 
FROM  "https://storage.googleapis.com/file.csv" 
AS row FIELDTERMINATOR ';'

WITH row, pikeen, toLower(row.url) AS url
WHERE row.uuid <> "#"

WITH row, pikeen, url, pikeen.uuid5(url,row.shortName) as uuid5

MERGE (merchant:Merchant { uuid: uuid5})
ON CREATE SET merchant.shortName= row.shortName,
        merchant.uuid= uuid5,
        merchant.url= url
ON MATCH  SET merchant.shortName= row.shortName,
        merchant.uuid= uuid5,
        merchant.url= url

MERGE (pikeen)-[:HAS_MERCHANT]->(merchant)

return pikeen, merchant

The documentation says the EAGER is triggered on the MERGE inside the LOAD.

Not sure how often you will be running this process to consider it 'cumbersome'.

The problem is not how many times I run this process: the real problem is that the process halts after says it's EAGER and do not performs :enraged_face: :face_with_symbols_on_mouth:

As you just predict :crystal_ball:, it just moves the eager in another place.

2 Things:

  1. CYPHER is becoming worst a day after the other: looks like a spaghetti code salad

  2. if I put this statement CREATE CONSTRAINT Merchant_uuid IF NOT EXISTS FOR (n:Merchant) REQUIRE (n.uuid) IS UNIQUE; at the beginning of the query the Eager problem disappears, AND THEN. as the index already existed, I think this should be a CYPHER interpreter error.

Hi, @therese.magnusson :slight_smile:

could you try to help us to solve this crumbled situation?

Thank you

I'm confused

CREATE CONSTRAINT Merchant_uuid IF NOT EXISTS FOR (n:uuid) REQUIRE n.uuid IS UNIQUE;

TheFOR (n:uuid) ... is uuid a label or a property ?

I also ran your 2 queries from the browser and they both ran "correctly", even though:

  • There was no :uuid label (perhaps a warning would have been nice as it is not inherently an error?)
  • I don't have pikeen, nor a file in that URL ... both errors just went and the output was "No changes".

You're right Josh, I corrected the statement!

CREATE CONSTRAINT Merchant_uuid IF NOT EXISTS FOR (n:Merchant) REQUIRE n.uuid IS UNIQUE;

It is like If the system create the Constraint (or not create because it already exists!) the following query knows the presence of it and doesn't return the EAGER error

But if I remove the creat Constraint, it doesn't recognize the index and seems to run.

But it still look as spaghetti code

Well, I'm here again:This query run but not execute!!!

the result is **No changes, no records. Completed after 340 ms**

I espected to add 3 or four merchants with their properties.

The real .csv file has 4 lines available

The execution plan for this query contains the Eager operator, which forces all dependent data to be materialized in main memory before proceeding
Using LOAD CSV with a large data set in a query where the execution plan contains the Eager operator could potentially consume a lot of memory and is likely to not perform well. See the Neo4j Manual entry on the Eager operator for more information and hints on how problems could be avoided.
    LOAD CSV WITH HEADERS 
    FROM  "https://storage.googleapis.com/pikeen-neo4j/gdpr/merchants.csv" 
    AS row FIELDTERMINATOR ';'

    WITH row, pikeen.uuid5("pikeen","pikeen") as ownerUuid5, "pikeen" AS owner, toLower(row.url) AS url, toLower(row.shortName) AS shortName
    
    MATCH (pikeen:Owner { uuid: ownerUuid5 })

    WITH row, pikeen, url, shortName, ownerUuid5, pikeen.uuid5(shortName,url) as uuid5
        WHERE row.uuid <> "//"
 
    MERGE (merchant:Merchant { uuid: uuid5})
        ON CREATE SET merchant.shortName= shortName,
            merchant.uuid= uuid5,
            merchant.url= url
        ON MATCH  SET merchant.shortName= shortName,
            merchant.uuid= uuid5,
            merchant.url= url
  
    MERGE (pikeen)-[:HAS_MERCHANT]->(merchant)
    
    return pikeen, merchant;

Status code Neo.ClientNotification.Statement.EagerOperator

Hi @paolodipietro58,

Regarding Eager

Why do we have Eager

We add an Eager operator in the plan to make sure that the query behaves correctly. This should not interfere with obtaining the correct results. So, if you did get other results before, maybe there was a bug, that we fixed. The only problem is that it might make your query consume a lot of memory. So, unless you experience Out-Of-Memory problems, the Eager operator is not your problem.

How to "fix" Eager

If you prepend EXPLAIN to your query and have a look at the textual representation of the query plan, you should get a fairly good explanation as to why the Eager operators were put there.

When I ran the query, I got two Eager operators:

  • One is added because you set merchant.uuid=uuid5, while it is MATCHed on in the MERGE statement above. I do not see why you cannot leave that out, as the MERGE should already make sure it has the right value.
  • The other is added because you return the whole entities RETURN pikeen, merchant and have set properties before. We have discovered some inconsistencies in how we return entities that requires this Eager. However, do you really need the whole entities? Or would a count(*) or a property suffice?

Regarding no changes

That being said, I understand that you still have a problem of the query not creating the nodes that you expected. I cannot really tell you why that is the problem just from looking at your query but my first hunch would be that the entities (nodes and relationships) already existed and MERGE did not need to change anything. Could that be?

Otherwise, the output from a PROFILE run may be useful together with some information about the data present in the database (Is it empty? Or what is there?)

Hi all,

@arne.fischereit is on the right way.

but my first hunch would be that the entities (nodes and relationships) already existed and MERGE did not need to change anything. Could that be?

It can be true. this is a query running when we would have some new customer: sometimes the new customer is there, sometimes is not and in the latter case, there is nothing to change.

If this is the real problem, I coul relate the query execution at the result of the 'OPTIONAL MATCH (merchant:Merchant { uuid: uuid5})

if the result is not null, I could avoid to execute the rest of the query, while if it null, I can jump at the end.

The drawback is that this doesn't allow me to change attributes....

I'll try and let you know the result

At the same time thank you all for your collaboration and ideas, and to help me understand what really is and why there is the 'EAGER'.