I have been struggling with what might be considered deduplication, except that it's a matter of deleting similar nodes rather than exact duplicates. Let me try to explain:
I have (:Author) nodes, with [:WROTE] relationships to (:Book) nodes. Each (:Book) node has a unique ID property, as well as a varying number of relationships to (:Topic) nodes. However, some Books are represented by several duplicate nodes. The duplicates share: 1. the Author node which :WROTE them, and 2. in many cases, the same 'title' property.
What I wish to do is keep a single node per book, that is, one node per distinct 'title' linked to the Author who wrote it, choosing the node with the MAXIMUM number of relationships to (:Topic) nodes. Essentially, I want to thin the database by purging the "duplicates" that have fewer Topic links. Is this possible? Easy?
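To make the goal concrete, here is a rough, untested sketch of the kind of query I imagine. It makes a few assumptions: the pattern to (:Topic) is written without a type or direction, since I haven't named that relationship above; it relies on collect() keeping the order set by the preceding ORDER BY; and when two duplicates tie on Topic count, it keeps an arbitrary one of them.

```cypher
// Count Topic links per Book; OPTIONAL MATCH keeps Books that have none.
MATCH (a:Author)-[:WROTE]->(b:Book)
OPTIONAL MATCH (b)--(t:Topic)
WITH a, b, count(t) AS topicCount
// Put the best-connected Book first within each (Author, title) group.
ORDER BY topicCount DESC
WITH a, b.title AS title, collect(b) AS books
// Keep head(books); delete the remaining duplicates and their relationships.
FOREACH (dup IN tail(books) | DETACH DELETE dup)
```

Replacing the FOREACH line with `RETURN a, title, size(books) AS copies` should give a dry run showing how many nodes each title currently has, without deleting anything.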