I am working with citation data. In academia, paper_0 is cited by paper_1, which is in turn cited by paper_2, and so on. When we created our database, we added a one_hop label to all paper_1 nodes and a two_hop label to all paper_2 nodes. It gets complicated when a node that is one_hop relative to one paper is also two_hop relative to a different paper, so any given node can carry both a one_hop and a two_hop label.
I need a solution that overrides the node labels when running what we call the two-hop analysis, so that each node has exactly one label, which can then be used to color the nodes appropriately in Bloom. In addition, the solution must not update the graph itself. That is how I found virtual nodes and relationships.
I am close to a solution but it is not working perfectly:
MATCH (n:Paper {paperid: '7608367'})
OPTIONAL MATCH (n)<-[r1:REFERS]-(o:one_hop)
OPTIONAL MATCH (o)<-[r2:REFERS]-(t:two_hop)
CALL apoc.create.vNode(['one_hop'], o{.*}) YIELD node AS one
CALL apoc.create.vNode(['two_hop'], {title: t.papertitle}) YIELD node AS two
CALL apoc.create.vRelationship(one, 'REFERS', {}, n) YIELD rel AS rel1
CALL apoc.create.vRelationship(two, 'REFERS', {}, one) YIELD rel AS rel2
RETURN n, one, two, rel1, rel2
The above seems to be creating duplicated nodes. See screenshot below:
If I were solving this issue in SQL, I would simply GROUP BY n, one, but I am not sure how to do that in Cypher.
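For context on the grouping idea: Cypher has no explicit GROUP BY. In a WITH or RETURN clause that contains an aggregate, the non-aggregated expressions act as the grouping keys, and WITH DISTINCT collapses repeated rows. Since a procedure call runs once per incoming row, one possible direction is to reduce the rows to one per (n, o) pair and create the one_hop virtual node before expanding to the two_hop papers. The following is an untested sketch under that assumption; it reuses the labels, properties, and APOC calls from the query above and only changes the order of the clauses.

```cypher
MATCH (n:Paper {paperid: '7608367'})
OPTIONAL MATCH (n)<-[:REFERS]-(o:one_hop)
// One row per (n, o) pair, so the procedure below runs once per one_hop paper.
WITH DISTINCT n, o
CALL apoc.create.vNode(['one_hop'], o{.*}) YIELD node AS one
CALL apoc.create.vRelationship(one, 'REFERS', {}, n) YIELD rel AS rel1
// Expand to two_hop papers only after the one_hop virtual node exists,
// so every two_hop row for the same o reuses the same virtual node.
OPTIONAL MATCH (o)<-[:REFERS]-(t:two_hop)
CALL apoc.create.vNode(['two_hop'], {title: t.papertitle}) YIELD node AS two
CALL apoc.create.vRelationship(two, 'REFERS', {}, one) YIELD rel AS rel2
RETURN n, one, two, rel1, rel2
```

This sketch still creates one two_hop virtual node per row, so a paper that cites several one_hop papers would presumably appear once for each of them. It also keeps the original behavior when an OPTIONAL MATCH finds nothing, which has not been checked here.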
I am relatively new to Neo4j so I appreciate any patience afforded :)
Thanks in advance for any help, and let me know if you have any questions.