I have a dataset in Neo4j. Based on some key properties, I need to run a dedup check to find duplicate records. How can I find the duplicates? I am using AuraDB.
Can you share some more information? Is uniqueness defined by the value of a single property (like an ID field), or by multiple properties (like name + address, etc.)? Do you want to do more complex identity resolution, such as fuzzy name matching?
What have you tried so far?
As a starting point, let's say you have nodes with the label Person and the property name, and you want to find nodes with the same name:
MATCH (p:Person)
WITH p.name AS name, COUNT(p) AS num_duplicates
WHERE num_duplicates > 1
RETURN name, num_duplicates ORDER BY num_duplicates DESC LIMIT 100
Thank you so much, @William Lyon.
I already tried the single-attribute approach. Let me explain exactly what I am looking for. I am doing a POC for a customer dedup check, and it will be based on multiple attributes, for example name (first + last), address, DOB, SSN, etc. The intention is to apply the rule to a set of data and find the matching records.
Please suggest.
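One possible way to extend the single-property query to several attributes is to group on all of them at once, so that only nodes agreeing on every key land in the same group. The following is a minimal sketch, not a definitive rule set: the property names firstName, lastName, dob and ssn are hypothetical and should be replaced with whatever your data model uses, and the Person label is carried over from the earlier example. It finds exact matches only (after trimming and lower-casing the names); fuzzy matching would need a different approach.

```cypher
// Sketch: exact-match duplicates on a composite key.
// ASSUMPTION: firstName, lastName, dob and ssn are hypothetical property names.
MATCH (p:Person)
WHERE p.ssn IS NOT NULL                      // skip nodes missing the strongest key
WITH toLower(trim(p.firstName)) AS firstName, // normalize case and whitespace
     toLower(trim(p.lastName))  AS lastName,
     p.dob AS dob,
     p.ssn AS ssn,
     collect(p) AS dupes                     // all nodes sharing this key
WHERE size(dupes) > 1                        // keep only groups with duplicates
RETURN firstName, lastName, dob, ssn,
       size(dupes) AS num_duplicates,
       [n IN dupes | elementId(n)] AS node_ids
ORDER BY num_duplicates DESC
LIMIT 100
```

Dropping a key from the WITH clause (for example, grouping on ssn alone) loosens the rule, so you can compare how many candidate groups each combination of attributes produces during the POC.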