Import JSON/GraphML with existing nodes and merge

I receive data from various external systems and need to import/load this data into my Neo4j graph. The structure of the data can differ between systems, so I was planning to convert the received data into JSON Lines or GraphML and then import it via the APOC import/load procedures. This would give me some flexibility, as I wouldn't have to write explicit Cypher statements with properties that depend on the actual system.

However, each dataset has an ID and this is unique across all systems. So when I load something from system A with ID=1 and later from system B with ID=1, I need to update the existing node (MERGE) instead of creating a new node.

I couldn't find any information in the docs for Import JSON or Load GraphML on how to specify a unique ID. Is there any way to achieve that?

Hi @chrszrkl,

You'll be happy to hear it's a pretty straightforward thing to do.

As you've rightly said, you'll have a property for your unique id (which I assume you've created an index/constraint for).

After that, you can do something along the lines of:

MERGE (myNode:SomeNode {id: '1234'})
ON MATCH SET myNode.someProp = 'some value'
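If the uniqueness constraint does not exist yet, a minimal sketch of the full setup might look like the following. The constraint name `some_node_id` is made up for illustration, and the label and property names are simply the ones from the example above. The constraint syntax shown assumes a recent Neo4j version (4.4 or later); older versions use a different form, noted in the comment.

```cypher
// Hypothetical constraint name; label and property taken from the example above.
// Assumes Neo4j 4.4+ syntax. Older versions use:
//   CREATE CONSTRAINT ON (n:SomeNode) ASSERT n.id IS UNIQUE
CREATE CONSTRAINT some_node_id IF NOT EXISTS
FOR (n:SomeNode) REQUIRE n.id IS UNIQUE;

// A plain SET after MERGE runs whether the node was just created or already
// existed, so the property is written on the first import as well as on updates.
MERGE (myNode:SomeNode {id: '1234'})
SET myNode.someProp = 'some value';
```

Note that `ON MATCH SET` on its own only fires when the node already exists; a node created by the `MERGE` would get the `id` but not `someProp`.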

Cheers,

Lju

Hi @lju,

thank you for your reply!

I wasn't aware that there is a difference between Load JSON and Import JSON. I was actually talking about Import JSON, and there it doesn't seem to be possible to configure which property a node should be merged on.

However, I can easily change my conversions to create JSON instead of JSON Lines as output format. Therefore, I will be able to use the Load JSON procedures.

Thank you for the hints!

Hey Chris,

Import JSON is more for use with the corresponding export JSON procedures. It makes a bunch of assumptions about the structure of the JSON, so isn't really appropriate for general-purpose JSON files.

I think what you need is Load JSON, which can handle data in the JSON Lines format as well. So you should have something like:

CALL apoc.load.json("file:///yourFile.json")
YIELD value
MERGE (myNode:SomeNode {id: value.id})
ON MATCH SET myNode.someProp = value.prop

Adjust the property names to match the ones in your file!
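Since the properties differ from system to system, one possible variation is to copy every key of the loaded object onto the node instead of naming each property. This is only a sketch: it assumes each JSON object is flat and carries its unique ID in a field called `id`, as in the query above.

```cypher
// Sketch only: assumes each JSON object is flat, because nested objects
// cannot be stored as node properties and would make the SET fail.
CALL apoc.load.json("file:///yourFile.json")
YIELD value
MERGE (myNode:SomeNode {id: value.id})
// += adds or overwrites the properties present in the map and leaves
// any other existing properties on the node untouched.
SET myNode += value
```

With this form, a record from system B with the same ID updates the node created from system A and adds whatever extra properties system B provides.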

Hey Mark,

thank you for the info. I didn't know that Load JSON supports the JSON Lines format as well. That makes things much easier for my use case.

Thank you!