I'm trying to run the degree centrality algorithm on my dataset. The submodel of my query looks like this:
The query that I'm trying to run: CALL algo.degree.stream("Transfer", "PARENT_TRANSFER", {direction:"outgoing"}) YIELD nodeId, score RETURN nodeId, score ORDER BY score DESC
I have verified that these relationships exist, but I receive a score of 0.0 for each record nonetheless:
When I implement the query myself, I do get proper results: MATCH (t:Transfer) RETURN t.Code, size((t)-[:PARENT_TRANSFER]->()) as score ORDER BY score DESC
Could anyone explain why I'm not getting the proper results when using the degree centrality algorithm?
CALL algo.degree.stream("Code", NULL, {direction:'both'})
YIELD nodeId, score
RETURN algo.asNode(nodeId).name AS name, score AS degree
ORDER BY degree DESC
Degree centrality expects a monopartite graph (only one node type). These data models look like there are two node labels; when you specify the node label, the algorithm only looks for relationships between that category of node.
If you have (:LabelA)-[:R1]->(:LabelB), degree centrality will find degree 0 for all LabelA nodes because there are no self-relationships between LabelA (if you ran MATCH p= (:LabelA)-[ ]->(:LabelA) RETURN COUNT(p) you would get 0). However if your data model also had (:LabelA)-[:R2]->(:LabelA), you would get a non-zero result.
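To make the label filtering concrete, here is a small sketch in plain Python. It is a toy in-memory model of the behaviour described above, not the library's actual implementation; the function name out_degree and the sample graph are made up for illustration.

```python
# Toy model (an assumption, not the library's code) of label-restricted degree.
# Each node id maps to a label; each edge is (source, relationship type, target).
labels = {1: "LabelA", 2: "LabelA", 3: "LabelB"}
edges = [(1, "R1", 3), (2, "R1", 3)]


def out_degree(labels, edges, node_label=None, rel_type=None):
    """Outgoing degree per node, counting only edges whose two endpoints
    both pass the label filter (None means 'all labels' / 'all types')."""
    kept = {n for n, lab in labels.items() if node_label is None or lab == node_label}
    degree = {n: 0 for n in kept}
    for source, etype, target in edges:
        if rel_type is not None and etype != rel_type:
            continue
        if source in kept and target in kept:
            degree[source] += 1
    return degree


# Only LabelA nodes are loaded, so every R1 edge points outside the projection.
bipartite = out_degree(labels, edges, node_label="LabelA", rel_type="R1")

# Adding a LabelA -> LabelA relationship gives a non-zero score.
with_self = out_degree(labels, edges + [(1, "R2", 2)], node_label="LabelA")

# No label filter at all: the R1 edges are counted again.
unfiltered = out_degree(labels, edges)

print(bipartite, with_self, unfiltered)
```

In this model the first call returns 0 for both LabelA nodes because the LabelB target was never loaded, which matches the all-zero scores in the question.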
The workaround is to pass NULL, NULL in the procedure call (all nodes, all relationships). This ignores node labels and relationship types and just returns the overall degree: CALL algo.degree.stream(NULL, NULL, {direction:'both'}). Alternatively, you can look into using Cypher projections or the huge graph loader to flexibly combine node or relationship types for the algorithms.
Thanks @alicia.frame. That solved the issue. Now I see that the original degree centrality expects a monopartite graph and does not consider a multigraph. Is that right?
Yes - it expects a monopartite graph, but you can use Cypher Projections or the graph loader to represent the input data model as if it was monopartite.
Great add @alicia.frame! Just a heads up. Putting in nulls for the first two parameters (this covers the nodes and relationships parameters) can produce invalid results when graphs are complex with various nodes and relationships. Here's an idea that will give you more control.
Put in null for the node parameter but in the relationship parameter enter the relationship type you want to measure. This will give you the degree for that relationship's impact. Note that the effect of this centrality algorithm depends a lot on your graph's architecture. Remember, the history of this algo was one based in social networks. So, make sure first and foremost, this is the algo you're looking for!
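Here is a rough sketch of that idea in plain Python, again as a toy model rather than the library's code. The Account label and the FROM_ACCOUNT relationship type are hypothetical additions to the Transfer / PARENT_TRANSFER model from the question, and the degree helper is made up for illustration.

```python
# Toy model (an assumption, not the library's code): no node label filter,
# optional relationship type filter, and a direction setting.
node_ids = [1, 2, 3]  # 1 and 2 are Transfer nodes, 3 is a hypothetical Account
edges = [
    (1, "PARENT_TRANSFER", 2),
    (1, "FROM_ACCOUNT", 3),  # hypothetical relationship type
    (2, "FROM_ACCOUNT", 3),
]


def degree(node_ids, edges, rel_type=None, direction="outgoing"):
    """Count relationships per node; rel_type=None means all types."""
    counts = {n: 0 for n in node_ids}
    for source, etype, target in edges:
        if rel_type is not None and etype != rel_type:
            continue
        if direction in ("outgoing", "both"):
            counts[source] += 1
        if direction in ("incoming", "both"):
            counts[target] += 1
    return counts


# null/null with direction 'both': every relationship type is mixed together.
all_types = degree(node_ids, edges, direction="both")

# null node label, one relationship type, one direction at a time.
parent_out = degree(node_ids, edges, rel_type="PARENT_TRANSFER", direction="outgoing")
parent_in = degree(node_ids, edges, rel_type="PARENT_TRANSFER", direction="incoming")

print(all_types, parent_out, parent_in)
```

In this tiny graph the null/null call gives every node the same score, while the per-relationship calls separate the parent transfer from the child transfer.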
A near textbook example dataset to play with is found in Barrasa's terrific blog post on building an automated taxonomy (QuickGraph#5 Learning a taxonomy from your tagged data, by Jesús Barrasa). The blog's about something else, but the dataset in this post (an online CSV file) will serve as a great example of how to see the power and benefits of this useful centrality algo.
In this graph, you can easily get to where you'll see all those darn "0" values. Then, if you use the null/null "trick", you'll get loads of numbers... but they'll be degree values that really have little data science value. This is really a multi-partite graph which does relate to Barrasa's blog. So, the degree values you'd want if say, this dataset was your target and you wanted to get some specific answers, these would come from each specific relationship's takeoff and landing meaning and purpose.
And when you start to see this by adding the relationship and the "incoming" or "outgoing" parameter value, you'll move your graph maturity from Neo's documentation's rather confusing and "hello world" examples, and get you into some more useful territory where you can create the building blocks of some valuable insights.
Ironically and on topic, me posting this bit changes the degree centrality on Barrasa's blog. Using this very algorithm to model this fact is a clear example of why centrality is so simple yet so powerful when placed into a solid graph architecture. Gotta love graphs! HTH.