Neo4j: Optimize MATCH and MERGE based on fuzzy match

Node Link

Hi, I am running the following query:

MATCH (fn1:full_name)
MATCH (fn2:full_name)
WHERE fn1.full_name <> fn2.full_name AND apoc.text.fuzzyMatch(fn1.full_name, fn2.full_name)
MERGE (fn1)-[:FUZZY_MATCH]-(fn2)

It currently takes more than an hour to run. The graph contains approximately 54K full_name nodes, and the idea is to create a relationship between nodes with similar names.
Is there a way for me to optimize this process?

(Screenshot of query map for reference)


Graph Buddy

Hi, have you set indexes already?

Index creation is itself taking a while, but I'm currently creating an index on the full_name nodes. I'll post an update on performance once it finishes.

You could also look at creating a uniqueness constraint. I believe this can speed things up even more (but at the cost of requiring the field's values to be unique).
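
For reference, the two options mentioned so far could look roughly like this. This is only a sketch, assuming Neo4j 4.4 or later (older versions use a different syntax); the names full_name_idx and full_name_unique are made up for illustration. A uniqueness constraint comes with its own backing index, so these are alternatives rather than steps to run together. Also keep in mind that an index mainly helps lookups by exact value, while the fuzzy comparison still has to look at every pair of nodes.

```cypher
// Option A: a plain index on the property (hypothetical name full_name_idx)
CREATE INDEX full_name_idx IF NOT EXISTS
FOR (n:full_name) ON (n.full_name);

// Option B (instead of A, not in addition to it): a uniqueness constraint
// (hypothetical name full_name_unique). It fails if duplicate values exist.
CREATE CONSTRAINT full_name_unique IF NOT EXISTS
FOR (n:full_name) REQUIRE n.full_name IS UNIQUE;
```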

Is there anything I'm missing out on in terms of query tuning? I was hoping that might be an area to explore as well.
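
On the query tuning side, one thing that may be worth trying: the two MATCH clauses produce every ordered pair of nodes, which for roughly 54K nodes is on the order of 2.9 billion pairs, and each pair shows up twice (A,B and B,A) even though the MERGE is undirected. Below is a minimal sketch, not a tested solution, that keeps only one ordering per pair by comparing internal ids and runs the work in batches with apoc.periodic.iterate. The batch size of 100 is an arbitrary assumption, and id() is deprecated in Neo4j 5 (though it still works there).

```cypher
// Outer statement streams the fn1 nodes; inner statement runs once per fn1,
// with one transaction per batch instead of a single huge transaction.
CALL apoc.periodic.iterate(
  "MATCH (fn1:full_name) RETURN fn1",
  "MATCH (fn2:full_name)
   // id(fn1) < id(fn2) keeps one ordering of each pair
   WHERE id(fn1) < id(fn2)
     AND fn1.full_name <> fn2.full_name
     AND apoc.text.fuzzyMatch(fn1.full_name, fn2.full_name)
   MERGE (fn1)-[:FUZZY_MATCH]-(fn2)",
  {batchSize: 100, parallel: false}
);
```

This should cut the number of candidate pairs roughly in half and keep transaction sizes manageable, but the comparison is still quadratic in the number of nodes, so it will not make the query fast on its own.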