Delete duplicate data and restore relationship

benjamin.caoduro_ext · March 17, 2020, 1:51pm

Hi everyone!

I have a problem and I would like to know if there is a simple way to solve it.

I have a Java function that creates nodes with several properties. I would then like to compare those newly created nodes (their labels, property keys and property values, except for the Id property, which is unique) with the nodes already in my database, using a Cypher query. It needs to be dynamic: the labels and properties differ depending on the data created, but every node has an Id property.

If I find duplicate nodes (identical apart from the Id property), I would like to delete the duplicates, but before deleting them I would like to move their relationships onto the one node that I keep.

elena.kohlwey · March 17, 2020, 3:54pm

Hi Benjamin,

For your first question: have you looked into node similarity algorithms yet? (Node Similarity - Neo4j Graph Data Science)

For the second: if you create a relationship from a duplicate node to the original node and then delete the duplicate node, the relationship will disappear as well. A relationship always needs two nodes to cling to.

So, if we assume that there is no relationship you can keep from the duplicate node, might a solution be to use the "MERGE" command instead of the "CREATE" command in your Java program? This will only create a node if no matching node exists yet.
For your "keeping track" (which I guess is what you wanted to do with the relationship), would it be a solution to "MATCH" the node, see if it already exists, and then add some kind of property to it?

I hope my thoughts help you.
Regards,
Elena
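
Since the labels and properties are dynamic, a plain MERGE is awkward because Cypher does not accept labels as parameters. One possible sketch, assuming the APOC library is installed, uses apoc.merge.node, which takes the labels and the identifying properties as values. The parameters $labels, $props and $id are hypothetical names for what the Java code would pass in; $props is assumed to hold every property except Id.

```cypher
// $labels: list of label names, e.g. ['Person'] (hypothetical parameter)
// $props:  map of all properties EXCEPT Id, used to identify an existing node
// $id:     the unique Id, written only if a new node has to be created
CALL apoc.merge.node($labels, $props, {Id: $id}) YIELD node
RETURN node;
```

With this approach a node is created only when no node with the same labels and the same non-Id properties exists, so duplicates are avoided at write time rather than cleaned up afterwards.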

intouch_vivek · March 17, 2020, 4:33pm

When creating a node, try MERGE instead of CREATE:
MERGE (a:Node1 {id: line.node_id})
ON CREATE SET a.name = line.name, a.startdate = line.startDate
RETURN a

Otherwise, if the duplicate nodes already exist in the graph, try merging them:
MATCH (a:Node1)
WITH a.id AS id, collect(a) AS nodes
CALL apoc.refactor.mergeNodes(nodes, {properties: "combine"}) YIELD node
RETURN node;
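
The query above groups by id, whereas in the original question the Id is the one property that differs between duplicates. A minimal sketch of the same idea that groups on the labels plus every property except Id could look like the following. It assumes APOC is installed and that the unique property is literally named Id; the aliases lbls and props are just illustrative names.

```cypher
// Group nodes by their sorted labels and by their properties without Id.
MATCH (n)
WITH apoc.coll.sort(labels(n)) AS lbls,
     apoc.map.removeKey(properties(n), 'Id') AS props,
     n
ORDER BY n.Id
WITH lbls, props, collect(n) AS nodes
// Only groups with more than one node contain duplicates.
WHERE size(nodes) > 1
// Merge every group into its first node; 'discard' keeps that node's
// property values (including its Id), and mergeRels: true asks APOC to
// merge relationships of the same type instead of keeping parallel copies.
CALL apoc.refactor.mergeNodes(nodes, {properties: 'discard', mergeRels: true}) YIELD node
RETURN node;
```

As far as I know, apoc.refactor.mergeNodes moves the relationships of the removed nodes onto the surviving node before deleting the duplicates, which covers the "restore relationship" part of the question. MATCH (n) scans the whole graph, so on a large database it is probably worth restricting the match to one label at a time and testing on a copy first.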
