:auto transaction doesn't run as described in the manual

I'm running neo4j server community 4.4.5 with browser 4.4.3 on Ubuntu 20.04 machine.

This is from the online manual:

This is the error:

And this is the query

:auto LOAD CSV WITH HEADERS FROM 'file:///transactions.csv' AS row FIELDTERMINATOR ';'

CALL {
WITH row
WITH row
    WHERE row.id IS NOT NULL

OPTIONAL MATCH (organisation:Organisation {id: toInteger(row.organisation_id)})

OPTIONAL MATCH (bankAccount:BankAccount {id: toInteger(row.bank_account_id)})

MERGE (transaction:Transaction { id : toInteger(row.id) })
		SET transaction = row,
			transaction.uuid = apoc.create.uuid(),
            transaction.id = toInteger(row.id),
            transaction.consumer_id = NULL // remove the attribute no more used

    WITH transaction, row, bankAccount, organisation

CALL {
    WITH transaction, row, bankAccount
    WITH transaction, row, bankAccount
        WHERE bankAccount IS NOT NULL

    MERGE (transaction)-[h:HAS]->(bankAccount)
}

CALL {
    WITH transaction, row, bankAccount
    WITH transaction, row, bankAccount
        WHERE bankAccount IS NULL
    
    MERGE (error:Error)
    MERGE (transaction)-[h:HAS_NO_BANK_ACCOUNT]->(error)
}

CALL {
    WITH transaction, row, organisation
    WITH transaction, row, organisation
        WHERE organisation IS NOT NULL

    MERGE (transaction)-[h:BELONGS_TO]->(organisation)
}

CALL {
    WITH transaction, row, organisation
    WITH transaction, row, organisation
        WHERE organisation IS NULL
    
    MERGE (error:Error)
    MERGE (transaction)-[h:HAS_NO_ORGANISATION]->(error)
}

} IN TRANSACTIONS;

I tried to make some changes but with no luck!

Any suggestion?

Hey there!

From a brief peek at your cypher code snippet, it appears you are hitting that error due to a syntax error on row 5? I was able to reproduce your query above (different data) using v4.4.5 (I am in v4.4.5 browser though).

I believe you are seeing an error because you are following the CALL { with row with row with a WHERE clause.

i.e. -> You have:

:auto LOAD CSV WITH HEADERS FROM 'file:///transactions.csv' AS row 
CALL { 
WITH row
WITH row 
      WHERE row.id IS NOT NULL
...

What you may want is to tuck that where clause in your OPTIONAL MATCH. Something like this:

:auto LOAD CSV WITH HEADERS FROM 'file:///transactions.csv' AS row 
CALL {  WITH row WITH row 
OPTIONAL MATCH (organisation:Organisation {id: toInteger(row.organisation_id)}) WHERE ...

Reading through regarding auto transactions, I don't think you can use a WHERE clause there... :thinking:

I hope this somewhat helps and there isn't much else to it. Post back here if you are seeing same thing. I'd be curious what is throwing that error if it isn't how you are stating WHERE row.id IS NOT NULL.

Cheers,
Rob

You may recall you had a query a few weeks back that had nested CALL subqueries and returned the same error. In that case, I was not able to get rid of the error and ended up refactoring the query not to have nested CALLs. I did the same in this case using the apoc 'when' method to replace the use of the inner CALL subqueries used to mimic if/else functionality.

The query executes, but I don't have test data to validate it is actually working. The 'RETURN 1' statements were necessary to make it execute, since I got errors about 'can't terminate without a RETURN statement'.

:auto LOAD CSV WITH HEADERS FROM 'file:///Book1.csv' AS row FIELDTERMINATOR ';'
WITH row
WHERE row.id IS NOT NULL

CALL {
    WITH row

    OPTIONAL MATCH (organisation:Organisation {id: toInteger(row.organisation_id)})

    OPTIONAL MATCH (bankAccount:BankAccount {id: toInteger(row.bank_account_id)})

    MERGE (transaction:Transaction { id : toInteger(row.id) })
		SET transaction = row,
			transaction.uuid = apoc.create.uuid(),
            transaction.id = toInteger(row.id),
            transaction.consumer_id = NULL // remove the attribute no more used

    WITH transaction, row, bankAccount, organisation

    CALL apoc.when(bankAccount IS NOT NULL, 
    'MERGE ($tran)-[h:HAS]->($bankAcct)', 
    'MERGE (error:Error) MERGE ($tran)-[h:HAS_NO_BANK_ACCOUNT]->(error)',
    {tran: transaction, bankAcct: bankAccount}) yield value

    with transaction, organisation

    CALL apoc.when(organisation IS NOT NULL, 
    'MERGE ($tran)-[h:BELONGS_TO]->($org)',
    'MERGE (error:Error) MERGE ($tran)-[h:HAS_NO_ORGANISATION]->(error)',
    {tran: transaction, org: organisation}) yield value

    RETURN 1

} IN TRANSACTIONS
RETURN 1

I tried another approach that allowed me to eliminate the extra return statements. I used some 'list filtering' to simulate an 'if' capability. Each of the lists 'bankAcctNotNull', 'bankAcctIsNull', 'organisationNotNull', and 'organisationIsNull' will either be empty or contain a single value of '1', depending on the 'is null / is not null' conditions. As such, the 'foreach' loops will either execute one time or zero times.

Again, the query runs but I have not tested it.

:auto LOAD CSV WITH HEADERS FROM 'file:///Book1.csv' AS row FIELDTERMINATOR ';'
WITH row
WHERE row.id IS NOT NULL

CALL {
    WITH row

    OPTIONAL MATCH (organisation:Organisation {id: toInteger(row.organisation_id)})

    OPTIONAL MATCH (bankAccount:BankAccount {id: toInteger(row.bank_account_id)})

    MERGE (transaction:Transaction { id : toInteger(row.id) })
		SET transaction = row,
			transaction.uuid = apoc.create.uuid(),
            transaction.id = toInteger(row.id),
            transaction.consumer_id = NULL // remove the attribute no more used

    WITH transaction, row, bankAccount, organisation, 
    [i in [1] where bankAccount IS NOT NULL] as bankAcctNotNull,
    [i in [1] where bankAccount IS NULL] as bankAcctIsNull,
    [i in [1] where organisation IS NOT NULL] as organisationNotNull,
    [i in [1] where organisation IS NULL] as organisationIsNull

    foreach(i in bankAcctNotNull | 
        MERGE (transaction)-[h:HAS]->(bankAccount)
    )

    foreach(i in bankAcctIsNull | 
        MERGE (error:Error) 
        MERGE (transaction)-[h:HAS_NO_BANK_ACCOUNT]->(error)
    )

    foreach(i in organisationNotNull | 
        MERGE (transaction)-[h:BELONGS_TO]->(organisation)
    )

    foreach(i in organisationIsNull | 
        MERGE (error:Error) 
        MERGE (transaction)-[h:HAS_NO_ORGANISATION]->(error)
    )

} IN TRANSACTIONS
1 Like

@glilienfield The latter looks really interesting! I'll try it tomorrow!

1 Like

The second WITH is used to filter the row.id, and it is trick because in a call the first WITH cannot be filtered.
It has ran until now, I don't think it is the problem, because if I remove it the problem remains.
I'm just moving to the latest 4.4.6, then let you know ....

@glilienfield @Rcolinp I used the first solution from @glilienfield : it is much more readable of the second, despite the return statement.
You can remove the second return 1 statement: it runs without.

So, the problem key are the nested calls: if you have a running query with CALL {} IN TRANSACTIONS and add a nested CALL {} doing nothing, the entire query fails with an error.

While the same query without the inner call, succeeds (without any return!):

This latest to demonstrate the usage of the double WITH ROW inside the CALL. With only one it doesn't run with the WHERE clause:

Now a question is still open: What is the meaning of using :auto combined with using WITH TRANSACTIONS OF 100 ROWS? Who has the precedence? :auto or 100?

Unfortunately, it looks nice but run badly :face_with_symbols_over_mouth:

Why not simply this way? (We need apoc.do.when because we are writing ....)

That is what I originally had, but I thought it would not recognize the variables 'transaction', 'organization', and bankAccount' when the apoc routine executed the cypher. I did not try it to verify. Did you try it?

The error looks like you can't use the passed parameters in the match pattern.

Hi @glilienfield ,

With $tran he will look for the variable within the outer scope. With tran alone should work.

Bennu

I had passed the variables from the outer scope as parameters to the apoc method, then referenced the parameters in the cypher. It looks like it does not like using parameters in replacement of match binder variables.

I just did an experiment. the apoc method did not recognize variables defined outside the method. This make sense to me, as the cypher will not have knowledge of these variables when it is executed on the server.

I tried this as a work around. I bound the passed parameters to variables in a 'with' clause, which can them be used in the match pattern.

:auto LOAD CSV WITH HEADERS FROM 'file:///Book1.csv' AS row FIELDTERMINATOR ';'
WITH row
WHERE row.id IS NOT NULL

CALL {
    WITH row

    OPTIONAL MATCH (organisation:Organisation {id: toInteger(row.organisation_id)})

    OPTIONAL MATCH (bankAccount:BankAccount {id: toInteger(row.bank_account_id)})

    MERGE (transaction:Transaction { id : toInteger(row.id) })
		SET transaction = row,
			transaction.uuid = apoc.create.uuid(),
            transaction.id = toInteger(row.id),
            transaction.consumer_id = NULL // remove the attribute no more used

    WITH transaction, row, bankAccount, organisation

    CALL apoc.do.when(bankAccount IS NOT NULL, 
    'WITH $tran as tran, $bankAcct as bankAcct MERGE (tran)-[h:HAS]->(bankAcct)', 
    'WITH $tran as tran  MERGE (error:Error) MERGE (tran)-[h:HAS_NO_BANK_ACCOUNT]->(error)',
    {tran: transaction, bankAcct: bankAccount}) yield value

    with transaction, organisation

    CALL apoc.do.when(organisation IS NOT NULL, 
    'WITH $tran as tran, $org as org MERGE (tran)-[h:BELONGS_TO]->(org)',
    'WITH $tran as tran MERGE (error:Error) MERGE (tran)-[h:HAS_NO_ORGANISATION]->(error)',
    {tran: transaction, org: organisation}) yield value

    RETURN 1

} IN TRANSACTIONS

RETURN 1

I'll give you another point to think about: the :auto keyword doesn't like to the cypher-shell!

Well, I gave it this version and it partially run:

But I have 895 nodes, and if I check I have 895 :HAS_NO_BANK_ACCOUNT but 0 :HAS_NO_ORGANIZATION AND 0 :BELONGS_TO. Still cannot explain

@glilienfield @Rcolinp @bennu_neo Yes, I tried a lot of times, the first APOC.DO.WHEN runs fine, but the second APOC.DO.WHEN is never executed!

Can you please try reversing the order? Does the 'organisation' one now run and the 'bankAccount' when doesn't?

Already done: it stops after the first one!

I think it may be caused by not returning a value from the 'when' cypher statements. I added returns. I also rearranged some things, but that should have no effect.

Try this:

:auto 
LOAD CSV WITH HEADERS FROM 'file:///Book1.csv' AS row FIELDTERMINATOR ';'
WITH row
WHERE row.id IS NOT NULL

CALL {
    WITH row

    MERGE (transaction:Transaction { id : toInteger(row.id) })
		SET transaction = row,
			transaction.uuid = apoc.create.uuid(),
            transaction.id = toInteger(row.id),
            transaction.consumer_id = NULL // remove the attribute no more used

    WITH transaction, row

    OPTIONAL MATCH (bankAccount:BankAccount {id: toInteger(row.bank_account_id)})

    CALL apoc.do.when(bankAccount IS NOT NULL, 
    'WITH $tran as tran, $bankAcct as bankAcct MERGE (tran)-[h:HAS]->(bankAcct) return 1', 
    'WITH $tran as tran  MERGE (error:Error) MERGE (tran)-[h:HAS_NO_BANK_ACCOUNT]->(error) return 1',
    {tran: transaction, bankAcct: bankAccount}) yield value

    with transaction, row

    OPTIONAL MATCH (organisation:Organisation {id: toInteger(row.organisation_id)})

    CALL apoc.do.when(organisation IS NOT NULL, 
    'WITH $tran as tran, $org as org MERGE (tran)-[h:BELONGS_TO]->(org) return 1',
    'WITH $tran as tran MERGE (error:Error) MERGE (tran)-[h:HAS_NO_ORGANISATION]->(error) return 1',
    {tran: transaction, org: organisation}) yield value

    RETURN 1

} IN TRANSACTIONS

RETURN 1

I have already tried this too: moving the optional matches below doesn't change anything.

BTW: You other solutions run as a charme, but I would like to figure this out and understand this unusual behavior

with the return statements added to the 'when' cypher? I tested it locally with similar code and it executed both 'when' statements.

Yes, It don't execute the second when. I changed the names to the relationships, so I can check if it creates one branch or the other, and I have no relationships from the second WHEN.

WITH row
    WHERE row.transactionid IS NOT NULL

MERGE (transaction:Transaction { transactionid : row.transactionid })
        SET transaction = row,
            transaction.uuid = apoc.create.uuid(),
            transaction.consumer_id = NULL // remove the attribute no more used

    with transaction, row
    
    OPTIONAL MATCH (organisation:Organisation {id: toInteger(row.organisation_id)})

    CALL apoc.do.when(organisation IS NOT NULL, 
    'WITH $tran as tran, $org as org MERGE (tran)-[h:TRANSACTION_BELONGS_TO_ORGANISATION]->(org)',
    'WITH $tran as tran MERGE (error:Error) MERGE (tran)-[h:TRANSACTION_HAS_NO_ORGANISATION]->(error)',
    {tran: transaction, org: organisation}) yield value AS X

    WITH transaction, row

    OPTIONAL MATCH (bankAccount:BankAccount {id: toInteger(row.bank_account_id)})

    CALL apoc.do.when(bankAccount IS NOT NULL, 
    'WITH $tran as tran, $bankAcct as bankAcct MERGE (tran)-[h:TRANSACTION_HAS_BANK_ACCOUNT]->(bankAcct)', 
    'WITH $tran as tran  MERGE (error:Error) MERGE (tran)-[h:TRANSACTION_HAS_NO_BANK_ACCOUNT]->(error)',
    {tran: transaction, bankAcct: bankAccount}) yield value AS Y


return 1

} IN TRANSACTIONS OF 5000 ROWS

return 1;