Retrieving errors from runMany

Is there a way I can retrieve the errors during an apoc.cypher.runMany call?

I have a uniqueness constraint set in my database:

CREATE CONSTRAINT Test_unique FOR (test:Test) REQUIRE test.id IS UNIQUE

Then I am running a runMany procedure:

CALL apoc.cypher.runMany(
  '
  CREATE (n:Test {name:"John1",id:"1"});
  CREATE (n:Test {name:"John2",id:"1"});
  ', {}
)

I'd expect it to throw an error when it tries to create the second node ("John2"), since that violates my constraint.

Currently, it does not return any error; it only creates the first node ("John1") and reports "1 row entered" in the statistics.

Also, if the "John1" node already exists in the database and I only try to create the "John2" node in runMany, it does not return any error and reports "No records added".

Just wanted to check if there is a way to retrieve the list of errors when constraints are violated.

I think you get that behavior because they are separate Cypher statements. They are run in separate transactions but executed in sequence, so once one fails, the remaining ones are not executed.

If you want all the operations to succeed or fail together, don't separate the statements with semicolons. At that point, you don't need the APOC procedure.
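For example, the two creates can be combined into a single statement (no semicolon between them), at which point no APOC is needed. This sketch assumes the Test_unique constraint above; the constraint violation rolls back the whole statement, so neither node is created:

// Both creates run as one statement in one transaction,
// so they succeed or fail atomically.
CREATE (n1:Test {name: "John1", id: "1"})
CREATE (n2:Test {name: "John2", id: "1"})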

You can see this behavior when you execute the cypher statements directly in the browser.

I understand that's why the first statement works but the second does nothing. I'm okay with that.

But what about retrieving errors from a runMany?

If I run these statements one after the other in Neo4j Browser, the second one will give me back a constraint validation error. Is there a way to do that from a runMany?

Am I wrong in assuming that runMany basically runs operations in bulk? And I'm assuming that running in bulk is faster than running them one by one in Neo4j?

Looking at the source code, the procedure just runs each statement using tx.execute, where tx is a new transaction for each statement. As such, I see no performance benefit. I think the benefit is that you can programmatically build Cypher queries and pass them for execution in separate transactions. If the separate-transactions feature is not of value, you can do the same in one transaction using apoc.cypher.doIt or apoc.cypher.run.
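For reference, a minimal sketch of apoc.cypher.doIt (note that apoc.cypher.run only permits reads, while doIt also permits writes); the statement string and parameters here are just illustrative:

// doIt executes a write statement built as a string inside the
// current transaction and yields each result row as a map.
CALL apoc.cypher.doIt(
  'CREATE (n:Test {name: $name, id: $id}) RETURN n',
  {name: "John1", id: "1"}
) YIELD value
RETURN value.n AS created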

Interesting. If that's the case, why even use APOC? If there's no performance impact, I can just use a normal CREATE statement, right? Or do apoc.cypher.doIt or apoc.cypher.run give a performance benefit?

Also, is there anything I can do to perform bulk writes in Neo4j with better performance than running n Cypher CREATE statements?

The code just calls execute on a transaction. It's run on the server, as that is where the procedure is located. I have developed a suite of custom procedures for my application. I don't execute Cypher like that, as I can do the same with the driver; the driver just sends it to the server to execute. I used custom procedures to traverse my graphs using the Java API that lives on the server, which is what Neo4j is built on. I can't do what I need to efficiently, if at all, using Cypher queries.

I think the real benefit of these apoc.cypher procedures is that they let you build queries as strings within your query and execute them; that is, you can construct queries dynamically.
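As a sketch of that dynamic use case: labels cannot be parameterized in plain Cypher, so building the query string is the usual workaround (the label variable here is just illustrative):

// The label is chosen at runtime and spliced into the query string,
// which a static CREATE cannot do.
WITH 'Test' AS label
CALL apoc.cypher.doIt(
  'CREATE (n:`' + label + '` {id: $id}) RETURN n',
  {id: '2'}
) YIELD value
RETURN value.n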

You can use apoc.periodic.iterate or Cypher's CALL subqueries with IN TRANSACTIONS. They let you break your updates into batches executed in separate transactions, which typically has better performance.
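A minimal apoc.periodic.iterate sketch (the range and batch size are just illustrative): the first statement streams the items, and the second is applied to them in batches, each batch in its own transaction. Conveniently for the original question, it also YIELDs error information:

// Creates nodes in batches of 10,000; errorMessages reports any
// constraint violations or other failures encountered per batch.
CALL apoc.periodic.iterate(
  'UNWIND range(1, 100000) AS i RETURN i',
  'CREATE (:Test {id: toString(i), name: "John" + toString(i)})',
  {batchSize: 10000, parallel: false}
)
YIELD batches, total, errorMessages
RETURN batches, total, errorMessages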