Delete duplicate data and restore relationships

Hi everyone!

I have a problem and I would like to know if there is a simple way to solve it.

I have a Java function that creates nodes with several properties. I would then like to compare those newly created nodes (their labels, their property keys and the property values, except for the Id property, which is unique) with the nodes already in my database, using a Cypher query. It needs to be dynamic: the labels and properties differ depending on the data being created, but every node has an Id property.

If I find duplicate nodes (identical apart from the Id property and its value), I would like to delete the duplicates, but before deleting them I would like to move their relationships onto the one node that I keep.

Hi Benjamin,

For your first question: have you looked into node similarity algorithms yet? (Node Similarity - Neo4j Graph Data Science)

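For the second: if you create a relationship from a duplicate node to the original node and then delete the duplicate node, the relationship will disappear as well. A relationship always needs two nodes to cling to. (In Cypher, a plain DELETE on a node that still has relationships fails; DETACH DELETE removes the node together with its relationships.)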

So, if we assume that there is no relationship you can keep from the duplicate node, might a solution be to use the "MERGE" command instead of the "CREATE" command in your Java program? MERGE only creates a node if no matching node exists yet.
For your "keeping track" (which I guess is what you wanted to do with the relationship), would it be a solution to "MATCH" the node, see if it already exists, and then add some kind of property to it?

I hope my thoughts help you.
Regards,
Elena

When creating a node, try MERGE instead of CREATE.
// `line` is assumed to be a row of input data, e.g. from LOAD CSV ... AS line
MERGE (a:Node1 {id: line.node_id})
ON CREATE SET a.name = line.name, a.startdate = line.startDate
RETURN a
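
Since the labels and properties are dynamic in the question, the MERGE text cannot be hard-coded like above: labels and property keys cannot be passed as ordinary Cypher parameters, so they have to be written into the query string, while the values can still travel as a parameter map. Here is a rough Java sketch of that idea. The class name, the method names and the `$props` parameter name are all made up for illustration, and it assumes the unique property is literally called "Id". Note that MERGE matches any node that has at least these labels and properties, so a node with extra properties would also count as a match.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of the Neo4j driver.
class DynamicMerge {
    // Assumed name of the unique property that is ignored when matching.
    static final String ID_KEY = "Id";

    // Wrap a label or property key in backticks, doubling any embedded backtick.
    static String quote(String name) {
        return "`" + name.replace("`", "``") + "`";
    }

    // Build a MERGE that matches on every label and every property except Id,
    // and sets the Id only when a new node is created.
    static String buildMergeQuery(List<String> labels, Map<String, Object> props) {
        String labelPart = labels.stream()
                .sorted()
                .map(l -> ":" + quote(l))
                .collect(Collectors.joining());
        String propPart = props.keySet().stream()
                .sorted()
                .filter(k -> !k.equals(ID_KEY))
                .map(k -> quote(k) + ": $props." + quote(k))
                .collect(Collectors.joining(", "));
        return "MERGE (n" + labelPart + " {" + propPart + "}) "
                + "ON CREATE SET n." + quote(ID_KEY) + " = $props." + quote(ID_KEY)
                + " RETURN n";
    }
}
```

The resulting string would then be run with the property map passed under the parameter name `props`, so the values themselves never get concatenated into the query.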

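Otherwise, if the duplicate nodes already exist in the graph, try merging them:
// Groups nodes that share the same id; change the grouping key to whatever defines a duplicate for you.
MATCH (a:Node1)
WITH a.id AS id, collect(a) AS nodes
WHERE size(nodes) > 1
// mergeNodes keeps the first node in the list and moves the relationships of the others onto it.
CALL apoc.refactor.mergeNodes(nodes, {properties: "combine"}) YIELD node
RETURN node;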
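
One caveat for the original question: the Id property is unique there, so grouping by id alone will not find the duplicates. The grouping key would have to be the labels plus every property except Id. As a rough illustration of that comparison, here is a small Java sketch that works on plain in-memory data rather than on the Neo4j driver. `NodeData`, `fingerprint` and `duplicateIdGroups` are hypothetical names, and the unique property is assumed to be called "Id".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory model for illustration; not a Neo4j driver type.
class DuplicateGroups {
    // Assumed name of the unique property that is ignored when comparing.
    static final String ID_KEY = "Id";

    static final class NodeData {
        final Set<String> labels;
        final Map<String, Object> props;

        NodeData(Set<String> labels, Map<String, Object> props) {
            this.labels = labels;
            this.props = props;
        }
    }

    // Two nodes are duplicates when their sorted labels and all
    // properties except Id are equal.
    static List<Object> fingerprint(NodeData n) {
        Map<String, Object> rest = new TreeMap<String, Object>(n.props);
        rest.remove(ID_KEY);
        return List.<Object>of(new TreeSet<String>(n.labels), rest);
    }

    // Return the Id values of every group that contains more than one node.
    static List<List<Object>> duplicateIdGroups(List<NodeData> nodes) {
        Map<List<Object>, List<Object>> byFingerprint = new LinkedHashMap<>();
        for (NodeData n : nodes) {
            byFingerprint
                    .computeIfAbsent(fingerprint(n), k -> new ArrayList<>())
                    .add(n.props.get(ID_KEY));
        }
        List<List<Object>> result = new ArrayList<>();
        for (List<Object> ids : byFingerprint.values()) {
            if (ids.size() > 1) {
                result.add(ids);
            }
        }
        return result;
    }
}
```

Each returned group lists the Id values of nodes that look identical. The first Id in a group could be treated as the node to keep, and the whole group handed to a merge step like the mergeNodes call above. Array-valued properties would need extra care, because Java arrays do not compare by content.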