Poor performance UNWINDing 2 collections to find possible node relationships

pdrangeid · December 10, 2019, 8:35pm

Here's the basic schema:

(:Auvikdevice)-[:WITHIN_TENANT]-(:Auviktenant)-[:CHILD_OF_TENANT]-(:Auviktenant)-[:AUVIK_SITE_FOR]-(:Company)-[:ASSET_OF]-(ca:Crmasset)

I have network devices discovered with a network discovery tool. The tenant is nested to satisfy child/sub locations for a client
Our CRM (IT asset management) stores assets as :Crmasset

We are building missing assets within our CRM by using an API into our CRM system when we discover a new device on their network.

I'm using Neo4j to try to track which network discovered items have had assets created, and which are still missing (or orphaned)

I use a number of methods to tie assets to devices (MAC address, serial number,name etc) within a tenants space.

I'm writting a query to find :Auvikdevice nodes without a relationship to :Crmasset, and then find possible :Crmasset nodes within the same company in order to attempt a match (in the example below, using MAC address).
The largest qty for a combination of child tenant to Crmasset list is only 571 Crmassets and 882 Auvikdevice nodes, and there are only 54 total child tenants we are examining.
I'm perplexed on why it takes over a minute to run this, and I'm trying to discover why it's so slow:

MATCH (at:Auviktenant)--(:Auviktenant)--(a:Company) where at.adcount >0
WITH collect(distinct a) as companies
UNWIND companies as a
MATCH (a)--(ca:Crmasset) where exists(ca.mac)
WITH a,collect(distinct ca) as assets
MATCH (at:Auviktenant)--()--(a) where at.adcount >0
WITH assets,a,collect(distinct at) as tenants
UNWIND tenants as at
MATCH (ad:Auvikdevice)--(at)--(:Auviktenant)--(a) where not (ad)--(:Crmasset) and exists(ad.macaddress)
UNWIND assets as ca
WITH ca,ad,"[^a-zA-Z\\d]" as regex
WITH ca,ad,toLower(apoc.text.replace(ca.mac, regex,"")) as camac,toLower(apoc.text.replace(ad.macaddress, regex,"")) as admac
WITH ca,ad where camac=admac
MERGE (ad)<-[:IS_AUVIK_MONITORED]-(ca)
RETURN ad.id,ca.name order by ad.name;

jhonathanmarolo · December 10, 2019, 9:19pm

Hi @pdrangeid, did you saw the execution plan?

Please upload for us to understand a litle bit more about your query.

Also did you saw this link: Execution plans - Cypher Manual ?

pdrangeid · December 10, 2019, 11:02pm

wasn't sure of the best/easiest way to post the "EXPLAIN" results on the forum, so here's goes with a png attachment:

jhonathanmarolo · December 11, 2019, 6:27pm

Did you have considered to create a index, expecialy where you use distinct?

Also try no to use toLower or replace inside this query.

pdrangeid · December 14, 2019, 5:08pm

I have indexes on these properties, but the string parsing made a substantial difference. Unfortunately to test this I had to first make some mods to all the datasource inputs. (basically I performed the toLower and regex filtering as I create the nodes now, instead of when trying to compare them.

Thanks!

Topic		Replies	Views
Neo4j Unwind query response very slow Cypher cypher	2	293	January 18, 2021
Query performance with not existing relationship checks in Neo4j Cypher	11	312	May 25, 2022
Query performance with not existing relationship checks in Neo4j Neo4j Graph Platform migrated	11	146	September 13, 2022
Merge and Nested Unwind: How to write an efficient query wrt. indexing Neo4j Graph Platform performance , cypher , unwind , merge	1	1683	April 2, 2020
1000 queries takes time(as expected), how can i approach this this in a better way Neo4j Graph Platform performance , cypher	3	343	November 24, 2023

Get Certified in June!

Poor performance UNWINDing 2 collections to find possible node relationships

Related topics