Hello everyone, I'm new to Neo4j and I'm working on a large graph with clustering algorithms, using Python with the Neo4j Python driver. I have a problem. I ran the Weakly Connected Components (WCC) algorithm on my network (~6M nodes, 23M edges) with a label I've chosen, and I successfully get the results back as a table:
Now I want to build a virtual graph in which I create virtual nodes, each carrying the property {component: componentId} from the query above. Then I want to build virtual edges like
(node {nodeId}) -[BELONGS_TO]-> (component {componentId}), so that I can plot clusters in which many nodes are connected to one component node. The problem is that, with the query below, Neo4j creates duplicate virtual nodes, even though I put "distinct" everywhere. Instead of connecting n nodes to a single component node, it creates n component nodes, as in the screenshot below:
I just want those 3 red nodes connected to the same node "0", but instead new nodes get created. The same happens for the other components. How can I change my query to remove these duplicates and create the correct relationships?
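To make the target structure concrete, here is a small plain-Python sketch of the shape I'm after. It does not touch Neo4j at all: the `rows` list is made-up sample data standing in for the (nodeId, componentId) pairs that `gds.wcc.stream` yields, and `build_virtual_graph` is a hypothetical helper name used only for illustration.

```python
from collections import defaultdict

# Hypothetical (nodeId, componentId) rows standing in for gds.wcc.stream output.
rows = [(10, 0), (11, 0), (12, 0), (20, 1), (21, 1)]


def build_virtual_graph(rows):
    """Return one component node per componentId and one BELONGS_TO edge per row."""
    # Group member node ids by component, so each component appears exactly once.
    members = defaultdict(list)
    for node_id, component_id in rows:
        members[component_id].append(node_id)

    # One "virtual" component node per distinct componentId.
    component_nodes = sorted(members)

    # One edge per original node, all pointing at the shared component node.
    edges = [
        (node_id, "BELONGS_TO", component_id)
        for component_id in component_nodes
        for node_id in members[component_id]
    ]
    return component_nodes, edges


component_nodes, edges = build_virtual_graph(rows)
print(component_nodes)  # two component nodes for five input rows
print(edges)
```

So for n nodes in the same component I expect exactly one component node and n BELONGS_TO edges, not n component nodes.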
Query:
call gds.wcc.stream('gene-interactions') yield nodeId, componentId
with componentId,nodeId,collect(distinct componentId) as componentList
unwind componentList as component
with distinct component, componentId, gds.util.asNode(nodeId) as n
call apoc.create.vNode(['component'],{component:component}) yield node
call apoc.create.vRelationship(n,'BELONGS_TO',{},node) yield rel
return n,rel,node limit 30
I hope you can help me with this. I'm completely stuck, but I still need to plot the correct result for my project. Thank you.