Loading from json that includes existing node data

Lucas_
Node Link

I am loading data from JSON files. I have created a uniqueness constraint on the name property of a node label, but when I load the data using the implementation shown below, Neo4j returns a Neo.ClientError.Schema.ConstraintValidationFailed error with the message 'Node(x) already exists...'


CREATE CONSTRAINT ON (n:MyNode) ASSERT n.name IS UNIQUE
....
CALL apoc.load.json("file://data.json")
YIELD value
FOREACH ( node in value.myNodes |
MERGE ( a:MyNode { vertexID: node.vertexID, name: node.name, location: node.location } )
SET a.vertexID = node.vertexID, a.name = node.name, a.location = node.location
)

My understanding of MERGE is that Neo4j should skip items in the dataset that correspond to existing nodes in the graph and just carry on to the next item. How can I ensure all items are iterated through, rather than the operation exiting as soon as an existing item is reached?

I have seen a couple of posts on here about this error but still can't figure this out for my case.

1 ACCEPTED SOLUTION

OK, I now see that, in a nutshell, the answer is to MERGE only on the property that has the constraint, then SET the rest of the properties afterwards. For my case, the following works:

CALL apoc.load.json("file://data.json")
YIELD value
FOREACH ( node in value.myNodes |
MERGE ( a:MyNode { name: node.name } )
SET a.vertexID = node.vertexID, a.name = node.name, a.location = node.location
)
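
For anyone who would rather not overwrite properties on nodes that already exist, here is a minimal sketch of a variation. It assumes the same data.json layout as in the question (a top-level myNodes array) and swaps FOREACH for UNWIND, which is a choice made for this example rather than something the fix requires. ON CREATE SET is a standard MERGE subclause.

```cypher
// Sketch only: same file and property names as in the question.
CALL apoc.load.json("file://data.json")
YIELD value
// UNWIND turns each entry of myNodes into its own row.
UNWIND value.myNodes AS node
// Match or create using only the constrained property.
MERGE (a:MyNode { name: node.name })
// Applied only when MERGE had to create the node;
// nodes that already existed keep their current properties.
ON CREATE SET a.vertexID = node.vertexID, a.location = node.location
RETURN count(a) AS processed
```

The difference from the plain SET version is that existing nodes are left untouched, so use SET (or add ON MATCH SET) if later files should update earlier values.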

3 REPLIES

dana_canzano
Neo4j

@Lucas_

Your MERGE, namely

MERGE ( a:MyNode { vertexID: node.vertexID, name: node.name, location: node.location } )

says: find me a node with label :MyNode that has these 3 properties with the given values. If such a node exists it is matched (and the SET that follows updates it); if not, it is created. But you have a unique constraint on the name property, so creating a second node with an existing name fails. This failure is demonstrated, for example, by

neo4j@neo4j> CREATE CONSTRAINT ON (n:MyNode) ASSERT n.name IS UNIQUE;
0 rows
ready to start consuming query after 1097 ms, results consumed after another 0 ms
Added 1 constraints


neo4j@neo4j> merge (n:MyNode {name:'Lucas', location:'ABC'});
0 rows
ready to start consuming query after 834 ms, results consumed after another 0 ms
Added 1 nodes, Set 2 properties, Added 1 labels

neo4j@neo4j> merge (n:MyNode {name:'Lucas', location:'DEF'});
Node(0) already exists with: label `MyNode` and property `name` = 'Lucas'

The 2nd MERGE looks for a :MyNode node that has both name: 'Lucas' and location: 'DEF'. Since no such node exists, it attempts to create one. But the 1st MERGE already created a :MyNode node with name: 'Lucas', and name has a unique constraint, so the failure is expected.
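
As a rough sketch of how the same two statements could be written so that the second one does not fail, assuming the constraint above is in place: MERGE on name alone and set location in a separate SET.

```cypher
// First statement: if no :MyNode named 'Lucas' exists yet, it is created;
// otherwise the existing node is matched.
MERGE (n:MyNode {name: 'Lucas'})
SET n.location = 'ABC';

// Second statement: the node is matched by name, so nothing new is created
// and the unique constraint is not violated. location is overwritten.
MERGE (n:MyNode {name: 'Lucas'})
SET n.location = 'DEF';
```

Because the MERGE pattern now contains only the constrained property, a match on name is enough to reuse the existing node.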

Lucas_
Node Link

Thanks for the explanation, but I still don't see how to add data from a JSON file that includes a mixture of new and existing nodes when we only want one instance of each node in the graph.

To take your example, there is no node with name 'Lucas' and location 'DEF' (there can only be one 'Lucas'), but there are instances of 'Lucas'/'ABC' in multiple JSON files. The first time this data is encountered in a load operation, a new node is created. I have applied a uniqueness constraint so that if the same data is encountered again, it is understood to be already represented in the graph and a second node is not created for the same entity. My problem is that the process stops here rather than moving on to the next JSON item.
