Neo4j: Optimize match and merge based on fuzzy match

prateek_sethi
Node Link

Hi, I am running the following query:

MATCH (fn1:full_name)
MATCH (fn2:full_name)
WHERE fn1.full_name <> fn2.full_name AND apoc.text.fuzzyMatch(fn1.full_name, fn2.full_name)
MERGE (fn1)-[:FUZZY_MATCH]-(fn2)

It currently takes more than an hour to run. The graph consists of approximately 54K full_name nodes. The goal is to create a relationship between nodes with similar names.
Is there a way for me to optimize this process?

(Screenshot of query map for reference)

4 REPLIES

andreperez
Graph Buddy

Hi, have you set indexes already?

Index creation is itself taking a while, but I'm currently creating an index on the full_name nodes. I'll post an update on performance once it finishes.

Also, you could try creating a uniqueness constraint. I believe this can speed things up even more (but at the cost of requiring the field's values to be unique).
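
For reference, here is a minimal sketch of what the index or constraint could look like. The names full_name_index and full_name_unique are made up for illustration, and the syntax assumes a fairly recent Neo4j version (the REQUIRE form is from 4.4 onward; older versions use ON ... ASSERT). Use one or the other, not both, since a uniqueness constraint is backed by its own index.

```cypher
// Option 1: a plain index on the full_name property of full_name nodes.
// The index name is hypothetical.
CREATE INDEX full_name_index IF NOT EXISTS
FOR (n:full_name) ON (n.full_name);

// Option 2 (instead of option 1): a uniqueness constraint.
// The constraint name is hypothetical. Creation fails if two nodes
// already share the same full_name value.
CREATE CONSTRAINT full_name_unique IF NOT EXISTS
FOR (n:full_name) REQUIRE n.full_name IS UNIQUE;
```

Either one mainly helps lookups by exact value or prefix, so the gain for a query that fuzzy-compares every pair of names may be limited.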

Is there anything I'm missing in terms of query tuning? I was hoping that might be an area to explore as well.
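
One tuning direction, offered as a sketch rather than a verified fix: the original query compares every ordered pair of names, which for roughly 54K nodes is about 2.9 billion calls to apoc.text.fuzzyMatch, and a property index cannot narrow a comparison that is computed per pair. Replacing `<>` with `<` visits each unordered pair once, which should roughly halve the work. This assumes that checking a pair in a single argument order is good enough for your purposes.

```cypher
// Sketch: visit each unordered pair once instead of twice.
// Assumption: checking fuzzyMatch in one argument order is acceptable.
MATCH (fn1:full_name)
MATCH (fn2:full_name)
WHERE fn1.full_name < fn2.full_name
  AND apoc.text.fuzzyMatch(fn1.full_name, fn2.full_name)
MERGE (fn1)-[:FUZZY_MATCH]-(fn2)
```

Running it with EXPLAIN first shows the plan without executing the query, which makes it easy to see whether the pairwise comparison is still the dominant step.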