Graph Algorithms Book Not Finding Similarity Algorithms

ameyasoft · June 5, 2019, 10:51pm

In the Graph_Algorithms_Neo4j book, I can't find similarity algorithms such as Jaccard in the index. Am I missing something? Please let me know.
Thanks

andrew_bowman · June 5, 2019, 11:31pm

I don't believe these are covered in the book. I believe the explanation was that although they can be useful in a graph context, they aren't really graph algorithms (they work on vector inputs). This omission is mentioned in the Other Algorithms section:

Other Algorithms

Many algorithms can be used with graph data. In this book, we’ve focused on those that are most representative of classic graph algorithms and those of most use to application developers. Some algorithms, such as coloring and heuristics, have been omitted because they are either of more interest in academic cases or can be easily derived.

Other algorithms, such as edge-based community detection, are interesting but have yet to be implemented in Neo4j or Apache Spark. We expect the list of graph algorithms used in both platforms to increase as the use of graph analytics grows.

There are also categories of algorithms that are used with graphs but aren’t strictly graphy in nature. For example, we looked at a few algorithms used in the context of machine learning in Chapter 8. Another area of note is similarity algorithms, which are often applied to recommendations and link prediction. Similarity algorithms work out which nodes most resemble each other by using various methods to compare items like node attributes.

So I don't think there's anything special about Jaccard or other similarity algorithms in a graph compared with any other usage: you match on two nodes and feed the property vectors for those nodes to the similarity algorithm. The only really graphy parts are the similarity relationships you might create between the nodes being compared (if you do, it's probably best to add them only when the score is above a certain similarity threshold), and the later querying for similar nodes using the relationships that were written when the algorithm ran.

Or, more succinctly:

Establish how you want to pair off nodes for the similarity calculation (a cartesian product excluding mirrored pairings, or filtered down in some other way), and which properties you want to include in the calculation.
Decide whether you just want to stream the results or actually write out relationships based on the similarity calculations.
Run the query that does that, using the similarity algorithm of your choice and the approach of your choice (stream or not, write the rels or not).
Use the results (or later, query for similar nodes based on the similarity relationships you wrote to the graph).
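To make those steps concrete, here is a minimal sketch in plain Python rather than the Neo4j procedures themselves. It assumes each node is represented by a set of items; the names `jaccard`, `similar_pairs`, `likes`, and the `SIMILAR` relationship type are made up for illustration and do not come from the graph algorithms library.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two collections: |a ∩ b| / |a ∪ b|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)


def similar_pairs(items_by_node, threshold=0.0):
    """Yield (node1, node2, score) for each unordered pair of nodes.

    combinations() gives the cartesian product without mirrored or
    self pairings; pairs scoring below `threshold` are skipped.
    """
    for n1, n2 in combinations(sorted(items_by_node), 2):
        score = jaccard(items_by_node[n1], items_by_node[n2])
        if score >= threshold:
            yield n1, n2, score


# Hypothetical data: the items associated with each node.
likes = {
    "alice": {"sushi", "pizza", "tacos"},
    "bob": {"pizza", "tacos", "curry"},
    "carol": {"ramen"},
}

# "Stream" every pair with its score.
streamed = list(similar_pairs(likes))

# Or keep only pairs above a threshold, shaped like relationships to write.
similar_rels = [
    (n1, "SIMILAR", n2, {"score": score})
    for n1, n2, score in similar_pairs(likes, threshold=0.5)
]
```

In this toy data only alice and bob share items (2 of 4 distinct ones), so they are the only pair that survives the 0.5 threshold. In Neo4j the same choices show up as picking the streaming or the writing variant of the similarity procedure and setting a similarity cutoff.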

More info in our graph algo documentation, and here's the section for Jaccard similarity

ameyasoft · June 6, 2019, 5:20am

Thanks for your reply.
