What sort of affinity algorithm?

I should be grateful if you would guide me to the right resources (documentation and/or person) in order that I can solve a problem. I am very much a neo4j novice and have only represented my CV in neo4j and that was over a year ago.

I want to group one type of node according to how similarly they relate to another type of node. In my mind this is affinity analysis but others might call it cluster analysis.

To be more specific, I have a CRUD matrix where one axis is a set list of business processes and the other is a list of entity types (in the relational modelling/database sense) and the relationship between them is whether the Business Process has a Write, Read or Read/Write access to the Entity Type. I want to represent this in neo4j.
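As a rough sketch of how such a CRUD matrix might be loaded, each cell of the matrix can become one relationship. The labels and relationship types below match the ones used in the query later in this thread; the process and entity names, and the `Name` property on `Entity_Type`, are invented for illustration.

```cypher
// Hypothetical sample data: two processes and two entity types.
MERGE (takeOrder:Logical_Business_Process {Name: 'Take Order'})
MERGE (raiseInvoice:Logical_Business_Process {Name: 'Raise Invoice'})
MERGE (customer:Entity_Type {Name: 'Customer'})
MERGE (product:Entity_Type {Name: 'Product'})
// One relationship per non-empty cell of the CRUD matrix.
MERGE (takeOrder)-[:read_write]->(customer)
MERGE (takeOrder)-[:read]->(product)
MERGE (raiseInvoice)-[:read]->(customer)
MERGE (raiseInvoice)-[:write]->(product)
```

Using `MERGE` rather than `CREATE` means the statement can be re-run without duplicating nodes or relationships.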

I think I have two questions:

  1. Is the type of algorithm that I need listed in the link below?
  2. If so, where can I go to get more detail to solve my problem?

Solving this problem will do two things:

  • Give a useful insight into my client’s business architecture
  • Reveal the power of graphs to my client.

The link I mentioned is this:
https://www.google.co.uk/url?sa=t&rct=j&q=&esrc=s&source=web&cd=9&cad=rja&uact=8&ved=2ahUKEwih47Xf9ZXgAhW6RhUIHRhrBkQQFjAIegQIAhAB&url=https%3A%2F%2Fneo4j.com%2Fblog%2Fgraph-algorithms-neo4j-15-different-graph-algorithms-and-what-they-do%2F&usg=AOvVaw3glk2IKverTUAiHsUO4z_A

The Jaccard algorithm is perfect for you. Check out the docs and examples, and let us know if that helps.
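For orientation, Jaccard similarity between two sets is the size of their intersection divided by the size of their union. Applied here, two business processes score 1.0 when they touch exactly the same entity types and 0.0 when they share none. A minimal check, assuming the Graph Algorithms plugin is installed and using made-up id lists:

```cypher
// The lists stand in for the ids of the entity types each process touches.
// They share 2 items out of 5 distinct items overall, so the result is 2/5.
RETURN algo.similarity.jaccard([1, 2, 3], [1, 2, 4, 5]) AS similarity
```

This should evaluate to 0.4.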

Thank you. I will follow this up and I am sure I will have fun and games trying to work out how to use it.


There are also some good blog posts on the topic from @bratanic_tomaz and @mark.needham.

Michael, having run the query below, I can see that the algo has not been installed but 57 others have. How can I check if it has been removed/replaced? Many thanks.
"
CALL dbms.procedures() YIELD name, signature, description
WHERE name STARTS WITH "algo"
RETURN name, signature, description
"

Can you post a screenshot of what that returns? You can also call algo.list() to achieve a similar thing.

We haven't removed Jaccard Similarity in answer to your question!
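To narrow the check, it may help to search for the similarity entries by name. Jaccard is exposed both as a procedure and as a function, so a procedures-only listing does not show everything. A sketch, assuming a Neo4j 3.x instance where `dbms.procedures()` and `dbms.functions()` are available (run the two statements one at a time):

```cypher
// Procedures whose name mentions "similarity".
CALL dbms.procedures() YIELD name, signature
WHERE name CONTAINS 'similarity'
RETURN name, signature;

// User-defined functions whose name mentions "similarity".
CALL dbms.functions() YIELD name, signature
WHERE name CONTAINS 'similarity'
RETURN name, signature;
```

If both return no rows, the installed plugin version most likely predates the similarity algorithms.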

Here is a screen-shot, as requested. I was expecting to see it before algo.spanningTree but I don't see it anywhere. I will be happy to learn if I have loaded the algorithms incorrectly such that some might be missing. I look forward to reading your conclusions.

and this one...

Which versions of Neo4j and Graph Algorithms are you using?

Thank you for asking. It didn't occur to me that availability would be differentiated. I am using the Community edition. Desktop version 1.1.17. Browser 3.2.19.

Please check the Graph Algorithms version. Click on Manage, select the Plugins tab and scroll down to see the version number.

Latest version is 3.5.4.

Thank you very much. I was on the wrong version, as you suspected. I can now see it. Now I can enjoy wrestling with how to use the algo and the data together.
Thank you for sparing the time on my issue.

Dear all,
I have had success thanks to the guidance provided above. I have a result of 48,000 records. In my naivety with neo4j, I was expecting the output to be represented in graph form as well as a table, with the similarity score reflected in the relative proximity of the nodes. I was hoping then to see clusters of nodes.
Rummaging through the Graph Algorithms part of the manual and the interweb, I can't work out whether I am missing something in the query or whether I need to invoke something else to act on these results in order to get the graphical representation of the clusters I hope/expect to see.
Once again, any ideas and pointers will be gratefully received.
The query I ran was
"
MATCH (p1:Logical_Business_Process)-[:read|:write|:read_write]-(e1:Entity_Type)
WITH p1, collect(id(e1)) AS p1entity_type
MATCH (p2:Logical_Business_Process)-[:read|:write|:read_write]-(e2:Entity_Type) WHERE p1 <> p2
WITH p1, p1entity_type, p2, collect(id(e2)) AS p2entity_type
RETURN p1.Name AS from,
       p2.Name AS to,
       algo.similarity.jaccard(p1entity_type, p2entity_type) AS similarity
ORDER BY similarity DESC
"
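The query above uses the function form, which only returns a table. One possible route to a graph view is the procedure form of the same algorithm, which can write the strongest similarities back as relationships that the Browser can then draw. The sketch below assumes Graph Algorithms 3.5.x. The `topK` and `similarityCutoff` values are arbitrary starting points rather than recommendations, and `SIMILAR` and `score` are simply the names chosen here. Run the two statements one at a time.

```cypher
// Step 1: compute Jaccard similarity for every process and store the
// strongest matches as SIMILAR relationships with a score property.
MATCH (p:Logical_Business_Process)-[:read|:write|:read_write]-(e:Entity_Type)
WITH {item: id(p), categories: collect(DISTINCT id(e))} AS processData
WITH collect(processData) AS data
CALL algo.similarity.jaccard(data, {
  topK: 5,
  similarityCutoff: 0.5,
  write: true,
  writeRelationshipType: 'SIMILAR',
  writeProperty: 'score'
})
YIELD nodes, similarityPairs
RETURN nodes, similarityPairs;

// Step 2: return the stored relationships so they are drawn as a graph.
MATCH (p1:Logical_Business_Process)-[s:SIMILAR]->(p2:Logical_Business_Process)
RETURN p1, s, p2;
```

The Browser layout does not place nodes according to the score, so any visible grouping comes only from which pairs survive the cutoff. Raising `similarityCutoff` or lowering `topK` thins the graph and tends to make groups easier to see. A community detection procedure such as algo.louvain could then be run over the `SIMILAR` relationships to label the clusters explicitly.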