How to execute the below queries in minimum time?

Hello Team ,

I need a help in knowing how can I fine tune the below queries of near about 300 lines to get execute in minimum time. Right now , it is taking nearly 4 hours to update 84 Lacs records in a label.

Below are the queries-

match(n:SITE) WHERE n.SITE_TYPE = 'eNodeB MW' and n.SITE_OWNER = 'Vodafone' SET n.CUSTOM_SITE_OWNER = 'IP COLO-Vodafone';
match(n:SITE) WHERE n.SITE_TYPE = 'ISC' and n.SITE_OWNER = 'Hotel Mayur' SET n.CUSTOM_SITE_OWNER = 'IP COLO-Hotel Mayur';
match(n:SITE) WHERE n.SITE_TYPE = 'eNodeB MW' and n.SITE_OWNER = 'VIOM' SET n.CUSTOM_SITE_OWNER = 'IP COLO-VIOM';
match(n:SITE) WHERE n.SITE_TYPE = 'eNodeB' and n.SITE_OWNER = 'Trident' SET n.CUSTOM_SITE_OWNER = 'IP COLO-Trident';
match(n:SITE) WHERE n.SITE_TYPE = 'ISC' and n.SITE_OWNER = 'Padmini Resort' SET n.CUSTOM_SITE_OWNER = 'IP COLO-Padmini Resort';
match(n:SITE) WHERE n.SITE_TYPE = 'HANDHOLE' and n.SITE_OWNER = 'null' SET n.CUSTOM_SITE_OWNER = 'IP COLO-null';
match(n:SITE) WHERE n.SITE_TYPE = 'AG1_R' and n.SITE_OWNER = 'BSNL' SET n.CUSTOM_SITE_OWNER = 'IP COLO-BSNL';
match(n:SITE) WHERE n.SITE_TYPE = 'MW Repeater' and n.SITE_OWNER = 'Bharti Infratel' SET n.CUSTOM_SITE_OWNER = 'IP COLO-Bharti Infratel';
match(n:SITE) WHERE n.SITE_TYPE = 'ENT' and n.SITE_OWNER = 'RDC' SET n.CUSTOM_SITE_OWNER = 'IP COLO-RDC';
match(n:SITE) WHERE n.SITE_TYPE = 'eNodeB' and n.SITE_OWNER = '0' SET n.CUSTOM_SITE_OWNER = 'IP COLO-0';
match(n:SITE) WHERE n.SITE_TYPE = 'NLD AG1' and n.SITE_OWNER = 'ATC' SET n.CUSTOM_SITE_OWNER = 'IP COLO-ATC';
match(n:SITE) WHERE n.SITE_TYPE = 'NB' and n.SITE_OWNER = 'JH' SET n.CUSTOM_SITE_OWNER = 'IP COLO-JH';
match(n:SITE) WHERE n.SITE_TYPE = 'ILA' and n.SITE_OWNER = 'RCOM' SET n.CUSTOM_SITE_OWNER = 'RP1-RCOM';
match(n:SITE) WHERE n.SITE_TYPE = 'ILA' and n.SITE_OWNER = 'VIOM' SET n.CUSTOM_SITE_OWNER = 'IP COLO-VIOM';
match(n:SITE) WHERE n.SITE_TYPE = 'MANHOLE' and n.SITE_OWNER = 'Owned' SET n.CUSTOM_SITE_OWNER = 'IP COLO-OWNED';
match(n:SITE) WHERE n.SITE_TYPE = 'eNodeB' and n.SITE_OWNER = 'IP1 COLO' SET n.CUSTOM_SITE_OWNER = 'IP COLO-IP1 COLO';
match(n:SITE) WHERE n.SITE_TYPE = 'eNodeB' and n.SITE_OWNER = 'MICROQUAL' SET n.CUSTOM_SITE_OWNER = 'IP COLO-MICROQUAL';
match(n:SITE) WHERE n.SITE_TYPE = 'OSC' and n.SITE_OWNER = 'RIL' SET n.CUSTOM_SITE_OWNER = 'IP COLO-RIL';
match(n:SITE) WHERE n.SITE_TYPE = 'eNodeB MW' and n.SITE_OWNER = 'INDUS' SET n.CUSTOM_SITE_OWNER = 'IP COLO-INDUS';
match(n:SITE) WHERE n.SITE_TYPE = 'ILA' and n.SITE_OWNER = 'RJIL' SET n.CUSTOM_SITE_OWNER = 'P1-RJIL';
match(n:SITE) WHERE n.SITE_TYPE = 'eNodeB' and n.SITE_OWNER = 'Suyog' SET n.CUSTOM_SITE_OWNER = 'IP COLO-Suyog';
match(n:SITE) WHERE n.SITE_TYPE = 'eNodeB' and n.SITE_OWNER = 'Videocon' SET n.CUSTOM_SITE_OWNER = 'IP COLO-Videocon';
match(n:SITE) WHERE n.SITE_TYPE = 'ENT' and n.SITE_OWNER = 'RIL' SET n.CUSTOM_SITE_OWNER = 'IP COLO-RIL';

Best Regards
Akshat

Hi Akshat,

Neo4j is strong on handling nodes and relationships but has its weaknesses in searching for nodes by properties for example. I assume, your database probably has a lot of nodes and hence it takes such a long time to find the nodes that you want to change?!

What you could try:

  1. your "SITE_TYPE" property. It seems that there is a limited number of types that keep repeating itself? Of course from your small example, I can't be sure but at least I see "eNodeB" and "ENT" being there at least twice. Try setting the type as a label on the node and not as a property. This will make the performance already a lot better, I guess. Then you will need to query for "MATCH (n:SITE:ENT) ..." for example.
  2. More or less the same is valid for the site owner. I see the same owner appearing several times. I would model that via a relationship, hence something like (o:Owner)-[:OWNS]->(n:SITE). This will again improve your performance since Neo4j is optimised for node / relationship searches.
  3. To be coherent with me second suggestion, I would also model the custom site owner as a separate node with a relationship to the site.

I hope this improves your performance!
Regards,
Elena

Hello Elena,

Thanks for the quick help.
It will be little difficult for me to create new label for SITE_TYPE and SITE_OWNER as SITE as a label already has 25 properties for 1 record.
Moreover , Below is the distinct site_type and its count.

n.SITE_TYPE COUNT(n.SITE_TYPE)
"NB" 6864047
"CUSTOMER" 980468
"SPOKE" 12970
"OSC" 33121
"CO" 2903
"eNodeB" 281604
"TOBY BOX" 105420
"PEDESTAL" 52448
"ENT" 20026
"IBS" 1253
"FSA" 363
"IAP" 10323
"ISC" 7824
"COW" 365
"HUB" 249
"ILA" 2111
"Metro AG1" 5079
"Metro AG2" 283
"POI" 1422
"OAP" 11206
"NLD AG1" 1157
"SNLD AG1" 368
"MW Repeater" 132
"SMetro AG1" 1071
"NLD AG2" 267
"JCO" 733
"AG1_R" 243
"eNodeB MW" 109
"AG2-OTN" 38
"GDN" 7
"AG3-IDC" 10
"SAG2" 51
"WCB" 20
"CLS" 72
"AG3" 8
"JCN" 145
"NBT" 3
"JIS" 73
"UBR" 22
"IXC" 13
"HOTEL" 9
"POLE" 22326
"Spoke" 23
"WAREHOUSE" 2
"IIDC" 3
"TP" 4
"BUILDING NODE" 1
"SAG2-IDC" 1
null 0
"MANHOLE" 6891
"HANDHOLE" 1553

Do you have indexes created on SITE_TYPE and SITE_OWNER? If not it s going to do label scans and take lot of time.

It's not the size of the cypher that is causing the performance delay.

Also, you could run each of those queries separately in parallel.

When you run all of them in a single query, you need more heap memory to commit the transaction.

Hi Akshat,

I don't see what is keeping you from adding a label. It will not "destroy" your properties.
If you just add the label (e.g. "MATCH (n:SITE {SITE_TYPE:"NB"}) SET n:NB") nothing else is affected. But afterwards you can look for the node by "MATCH (n:NB).... " which will find the node much quicker than "MATCH (n:SITE) WHERE n.SITE_TYPE = "NB". Again, this is just because of the way Neo4j is designed.

1 Like