You can change the return values from all 1's to different values and return them. You can then see which cypher statements executed. maybe add a 'limit 10' or so after the first 'where' clause so it doesn't execute all the rows.
The above code you pasted does not have the 'return' statements in the apoc 'when' cypher. I was able to get both 'when' statements to execute in my prototype only when I returned something from the first 'when' cypher; otherwise, the query stops since it produces no results.
Please, show me what you mean with a return statements in the apoc.when cypher.
Anyway, I can't try now because neo4j is not responding: it' loading millions transactions using your other query and, even I split the input file in chunks of 10000lines (with shell split) and I'm running in the cypher shell, the browser doesn't respond. I'll have to wait it will finish to make more tests!
Try this code. See in the each cypher statement in each 'with' clause, it terminates with a 'RETURN 1' statement. This causes the apoc 'when' method to return a value in the 'yield value' statement, which gets merged with the current result. Without returning a value, the current result gets merged with 'null' and the query stops since there is no result to process. At least this is my hypothesis. I got it to work locally when I returned a value.
:auto
LOAD CSV WITH HEADERS FROM 'file:///Book1.csv' AS row FIELDTERMINATOR ';'
WITH row
WHERE row.id IS NOT NULL
CALL {
WITH row
MERGE (transaction:Transaction { id : toInteger(row.id) })
SET transaction = row,
transaction.uuid = apoc.create.uuid(),
transaction.id = toInteger(row.id),
transaction.consumer_id = NULL // remove the attribute no more used
WITH transaction, row
OPTIONAL MATCH (bankAccount:BankAccount {id: toInteger(row.bank_account_id)})
CALL apoc.do.when(bankAccount IS NOT NULL,
'WITH $tran as tran, $bankAcct as bankAcct MERGE (tran)-[h:HAS]->(bankAcct) RETURN 1',
'WITH $tran as tran MERGE (error:Error) MERGE (tran)-[h:HAS_NO_BANK_ACCOUNT]->(error) RETURN 1',
{tran: transaction, bankAcct: bankAccount}) yield value
with transaction, row
OPTIONAL MATCH (organisation:Organisation {id: toInteger(row.organisation_id)})
CALL apoc.do.when(organisation IS NOT NULL,
'WITH $tran as tran, $org as org MERGE (tran)-[h:BELONGS_TO]->(org) RETURN 1',
'WITH $tran as tran MERGE (error:Error) MERGE (tran)-[h:HAS_NO_ORGANISATION]->(error) RETURN 1',
{tran: transaction, org: organisation}) yield value
RETURN 1
} IN TRANSACTIONS
RETURN 1
This approach using apoc turned out to be by far the most readable one (after a bit of a sprucing up of the query!)
This is the result:
LOAD CSV WITH HEADERS FROM 'file:///transactions.csv' AS row FIELDTERMINATOR ';'
CALL {
WITH row
WITH row
WHERE row.transactionid IS NOT NULL
MERGE (transaction:Transaction { transactionid : row.transactionid })
SET transaction = row,
transaction.uuid = apoc.create.uuid(),
transaction.consumer_id = NULL // remove the attribute no more used
with transaction, row
OPTIONAL MATCH (organisation:Organisation {id: toInteger(row.organisation_id)})
CALL apoc.do.when(organisation IS NOT NULL,
'WITH $tran as tran, $org as org
MERGE (tran)-[h:TRANSACTION_BELONGS_TO_ORGANISATION]->(org)
RETURN 1',
'WITH $tran as tran
MERGE (error:Error)
MERGE (tran)-[h:TRANSACTION_HAS_NO_ORGANISATION]->(error)
RETURN 1',
{tran: transaction, org: organisation}) yield value AS X
WITH transaction, row
OPTIONAL MATCH (bankAccount:BankAccount {id: toInteger(row.bank_account_id)})
CALL apoc.do.when(bankAccount IS NOT NULL,
'WITH $tran as tran, $bankAcct as bankAcct
MERGE (tran)-[h:TRANSACTION_HAS_BANK_ACCOUNT]->(bankAcct)
RETURN 1',
'WITH $tran as tran
MERGE (error:Error)
MERGE (tran)-[h:TRANSACTION_HAS_NO_BANK_ACCOUNT]->(error)
RETURN 1',
{tran: transaction, bankAcct: bankAccount}) yield value AS Y
return transaction
} IN TRANSACTIONS OF 5000 ROWS
return count(transaction)
The story will follow with a new discussion .... but this is an achieved result I'd like to share!
I discovered tha you can remove the WITH statement inside and this further simplify the query, just keep attention to the names!
CALL apoc.do.when(organisation IS NOT NULL,
' MERGE (tran)-[h:TRANSACTION_BELONGS_TO_ORGANISATION]->(org)
RETURN 1',
' MERGE (error:Error)
MERGE (tran)-[h:TRANSACTION_HAS_NO_ORGANISATION]->(error)
RETURN 1',
{tran: transaction, org: organisation}) yield value AS X
That is very interesting, so I looked at the source code to understand how that works. It turns out the apoc procedure passes the parameter map to the neo4j API 'execute' method as parameters, so they can be referenced as parameters, i.e. with a '$' sign. It also turns out the the procedure creates a 'WITH' clause with all the parameter values individually assigned to their parameter's names, i.e. 'WITH value1 as key1, value2 as key2, ...' This 'WITH' clause is prepended to both the 'if' cypher and the 'else' cypher before they are passed to the 'execute' method. This allows the parameter values to be referenced directly as cypher variables. In summary, the apoc.do.when allows passed parameters to be references as parameters (with '$') and directly as cypher variables. This explains why both methods work. This approach is used in all of the apoc cypher methods that pass a parameter map.