Hello,
I am trying to optimize a Cypher query. Given a node "source", the query should compute the number of weakly connected components (WCC) connected to another node "AL". For instance, consider a graph in which a source node and some target nodes are connected to the same AL node.
I want to create relationships from the source node to the target nodes connected to the same AL node at time T, and I want to add a property on each relationship that corresponds to the number of connected components attached to the AL node. In this example, the weights from the source node to the target nodes will be 2 because there are 2 WCCs (the source node by itself and the target nodes).
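To make the example concrete, here is a minimal sketch of such a toy graph in Cypher. The NUM_DOSSIER values, the name property on the AL node, and the single RELATIONSHIP edge between the two targets are assumptions made up for illustration, not my real data. It only shows the topology: the date properties are left out, so assume every order falls inside the 4-month window.

```cypher
// Hypothetical toy data: all property values below are invented for illustration.
CREATE (al:AdresseLivraison {name: "AL"})
CREATE (source:D {NUM_DOSSIER: "source"})
CREATE (t1:D {NUM_DOSSIER: "target1"})
CREATE (t2:D {NUM_DOSSIER: "target2"})
// All three orders share the same delivery address.
CREATE (source)-[:A_L]->(al)
CREATE (t1)-[:A_L]->(al)
CREATE (t2)-[:A_L]->(al)
// The two targets are already linked, so they form one component;
// the source has no RELATIONSHIP edge and forms a component on its own.
CREATE (t1)-[:RELATIONSHIP]->(t2);
```

On this graph, the result I expect is two SOFT_SIMILARITE_A_L relationships (source to target1 and source to target2), each with NUM_WCC_AL = 2.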
The difficulty is that I also want to do this for another type of node, which we can call "AF", and it is time-consuming...
Here is my code (for information, I use Neo4j v4.2.1):
CALL gds.graph.create("similarite_proj",
{
D : {properties: "ORDERDATE_TIME_INT"}
},
["RELATIONSHIP"]);
CALL gds.beta.graph.create.subgraph(
"similarite_4mois",
"similarite_proj",
"*",
"*"
);
CALL apoc.periodic.iterate(
'MATCH (s:D)
WHERE NOT EXISTS (s.WCC)
RETURN s',
'
SET s.WCC = 1 // marks the nodes we have already processed
WITH s
CALL {
WITH s
WITH s, s.ORDERDATE_TIME_INT AS sDate, toInteger(apoc.date.parse(toString(s.ORDERDATE_TIME - duration({months:4})), "s", "yyyy-MM-dd\'T\'HH:mm:ss")) AS oldDate
WITH s, sDate + " >= n.ORDERDATE_TIME_INT AND n.ORDERDATE_TIME_INT >= " + oldDate as nodeFilter // Filter for a subgraph projection
// First delete in-memory graph
// Note that this will break if no graph with that name is in memory yet
CALL gds.graph.drop("similarite_4mois")
YIELD graphName
// Projection of the subgraph with the filter
CALL gds.beta.graph.create.subgraph(
"similarite_4mois",
"similarite_proj",
nodeFilter,
"*"
)
YIELD nodeCount
// Create the wcc property on the in-memory (projected) graph
CALL gds.wcc.mutate("similarite_4mois", {mutateProperty: "wcc"})
YIELD componentCount
// Get the wcc number of the s node
CALL gds.graph.streamNodeProperty("similarite_4mois", "wcc")
YIELD nodeId, propertyValue
WHERE nodeId = id(s)
WITH s, propertyValue AS component_id_s
RETURN component_id_s
}
WITH s, component_id_s
// Compute # of component id connected to the same AL
CALL {
WITH s, component_id_s
MATCH (s)-[:A_L]->(al:AdresseLivraison)<-[:A_L]-(targets_A_L:D)
WHERE s.ORDERDATE_TIME>=targets_A_L.ORDERDATE_TIME>=s.ORDERDATE_TIME - duration({months:4}) AND NOT EXISTS((s)-[:RELATIONSHIP]->(targets_A_L)) AND id(s) <> id(targets_A_L)
// Allow us to filter on the component id of the targets
CALL gds.graph.streamNodeProperty("similarite_4mois", "wcc")
YIELD nodeId, propertyValue AS component_id_targets_A_L
WHERE nodeId = id(targets_A_L) AND component_id_targets_A_L <> component_id_s
WITH collect(DISTINCT targets_A_L) as targets_A_L_list, collect(DISTINCT component_id_targets_A_L) as component_id_targets_A_L
RETURN targets_A_L_list, size(component_id_targets_A_L)+1 as nb_wcc_A_L//+1 because we take into account the s component
}
WITH s, component_id_s, targets_A_L_list, nb_wcc_A_L
CALL{
WITH s, targets_A_L_list, nb_wcc_A_L
UNWIND targets_A_L_list as targets_A_L
MERGE (s)-[r:SOFT_SIMILARITE_A_L]->(targets_A_L)
SET r.NUM_WCC_AL = nb_wcc_A_L
}
// Compute # of component id connected to the same AF
WITH s, component_id_s
CALL {
WITH s, component_id_s
MATCH (s)-[:A_F]->(af:AdresseFacturationClean)<-[:A_F]-(targets_af:D)
WHERE s.ORDERDATE_TIME>=targets_af.ORDERDATE_TIME>=s.ORDERDATE_TIME - duration({months:4}) AND NOT EXISTS((s)-[:RELATIONSHIP]->(targets_af)) AND id(s) <> id(targets_af)
// Allow us to filter on the component id of the targets
CALL gds.graph.streamNodeProperty("similarite_4mois", "wcc")
YIELD nodeId, propertyValue AS component_id_targets_af
WHERE nodeId = id(targets_af) AND component_id_targets_af <> component_id_s
WITH collect(DISTINCT targets_af) as targets_af_list, collect(DISTINCT component_id_targets_af) as component_id_targets_af
RETURN targets_af_list, size(component_id_targets_af)+1 as nb_wcc_af//+1 because we take into account the s component
}
WITH s, component_id_s, targets_af_list, nb_wcc_af
CALL{
WITH s, targets_af_list, nb_wcc_af
UNWIND targets_af_list as targets_af
MERGE (s)-[r:SOFT_SIMILARITE_A_F]->(targets_af)
SET r.NOMBRE_WCC_AF = nb_wcc_af
}
RETURN COUNT(DISTINCT s.NUM_DOSSIER)
',
{batchSize:1, parallel:false})
Thanks for your help!