Online Meetup: Link Prediction with Neo4j

Link prediction explores the problem of predicting new relationships in a graph based on the topology that already exists.

This has been an area of research for many years, and in the last month we've introduced link prediction algorithms to the Neo4j Graph Algorithms library.

In this session Amy and Mark explain the problem in more detail, describe the approaches that can be taken, and discuss the challenges that have to be addressed.

We finish by going through a worked example showing how to apply these approaches using Graph Algorithms and the popular scikit-learn machine learning library.
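
To make the topology-based approaches a little more concrete, here is a minimal sketch in plain Python of three commonly used link prediction scores (common neighbours, preferential attachment, and Adamic Adar), computed for every pair of nodes that is not yet connected. The toy graph and all function names are assumptions made for illustration; this is not the Graph Algorithms library's API, nor the code used in the session.

```python
import math
from itertools import combinations

# Hypothetical undirected toy graph, stored as adjacency sets.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"B", "C", "E"},
    "E": {"D"},
}

def common_neighbors(g, u, v):
    """Number of neighbours shared by u and v."""
    return len(g[u] & g[v])

def preferential_attachment(g, u, v):
    """Product of the two degrees."""
    return len(g[u]) * len(g[v])

def adamic_adar(g, u, v):
    """Sum of 1 / log(degree) over shared neighbours.

    In a simple graph a shared neighbour is linked to both u and v,
    so its degree is at least 2 and the logarithm is positive.
    """
    return sum(1.0 / math.log(len(g[z])) for z in g[u] & g[v])

def candidate_pairs(g):
    """All node pairs that do not yet share a relationship."""
    return [(u, v) for u, v in combinations(sorted(g), 2) if v not in g[u]]

def link_features(g):
    """Map each candidate pair to its three scores."""
    return {
        (u, v): (
            common_neighbors(g, u, v),
            preferential_attachment(g, u, v),
            adamic_adar(g, u, v),
        )
        for u, v in candidate_pairs(g)
    }

for pair, scores in link_features(graph).items():
    print(pair, scores)
```

Scores like these could then serve as feature columns for a classifier, with existing and missing relationships as the positive and negative examples.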

Presenters: @AmyH & @mark.needham


@AmyH @mark.needham @neo4j_devrel
Enjoyed the presentation just now. Here's a good paper on fraud prediction using graph analysis, specifically as applied to time-weighted bipartite graphs (which I think you mentioned you would be moving on to shortly?).


Indeed a very interesting meetup, thanks! I am using a combination of similarity and community detection algorithms since, as Mark noted at the end of the video, the current community algorithms only work if you are building up groups of nodes of the same type linked by a certain relationship. In general I have noticed that support for complex relations between nodes is less powerful. Even with the similarity algorithms, running them gets very complicated as soon as your relationships have properties. E.g., imagine you have (RESEARCHER)-[:IS_RELATED {role:author | reviewer | editor}]-(PAPER)-[:INCLUDES]-(RESEARCH_TOPIC) and would like to elicit relationships between the researcher role and the research topic. I guess that for now the only feasible way is to "flatten" :IS_RELATED into three separate relationships between RESEARCHER and PAPER.
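
As a rough sketch of what that "flattening" could look like outside the database, the plain Python below turns the role property into a relationship type and then counts how often each researcher, in each role, touches each topic. The sample rows and the type names (IS_AUTHOR, IS_REVIEWER, IS_EDITOR) are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical rows exported from (RESEARCHER)-[:IS_RELATED {role}]-(PAPER).
is_related = [
    ("alice", "paper1", "author"),
    ("bob", "paper1", "reviewer"),
    ("carol", "paper2", "editor"),
    ("alice", "paper2", "reviewer"),
]

# Hypothetical rows exported from (PAPER)-[:INCLUDES]-(RESEARCH_TOPIC).
includes = [("paper1", "graphs"), ("paper2", "graphs"), ("paper2", "ml")]

def flatten_roles(rows):
    """Replace the role property with a relationship type, e.g. author -> IS_AUTHOR."""
    return [(researcher, "IS_" + role.upper(), paper)
            for researcher, paper, role in rows]

def role_topic_counts(flat, paper_topics):
    """Count (researcher, relationship type, topic) combinations via shared papers."""
    topics_by_paper = defaultdict(set)
    for paper, topic in paper_topics:
        topics_by_paper[paper].add(topic)
    counts = Counter()
    for researcher, rel_type, paper in flat:
        for topic in topics_by_paper[paper]:
            counts[(researcher, rel_type, topic)] += 1
    return counts

flat = flatten_roles(is_related)
counts = role_topic_counts(flat, includes)
for key in sorted(counts):
    print(key, counts[key])
```

Once each role is its own relationship type, each type can be handed separately to an algorithm that expects a single relationship type.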


Thank you for your meetups! :) I have a question/request... is/would it be possible to create a Bayesian network from data using Neo4j? Something Weka and BayesiaLab do is create a network from data on which you're able to set evidence/observation on a node and see the probabilities change for all the other nodes, showing which are the most influential on the observed node. I've searched for Bayesian network and Neo4j but there's not much available. Thank you again!


Thanks for the wonderful session on Link Prediction. The material and algorithms covered in the session are very helpful for recommending or predicting similar items or friends in an already connected network, but if we are starting with completely unconnected data, how should we link similar items based on their properties? For example, if there is a database of food items, how would we determine from the ingredients that Milk is closely related to Skim Milk and Almond Milk, distantly related to Yogurt, and not at all related to Peanuts, so that we can then use something like Adamic Adar to close the triangle between Skim Milk and Almond Milk?
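
To illustrate the kind of thing I have in mind, here is a minimal sketch in plain Python that scores item pairs by the Jaccard similarity of their ingredient sets and keeps the pairs above a cutoff as candidate relationships. The ingredient lists, the function names, and the 0.3 threshold are all made up for illustration.

```python
from itertools import combinations

# Hypothetical ingredient sets; real data would need normalised ingredient names.
ingredients = {
    "Milk": {"milk", "milkfat", "vitamin d"},
    "Skim Milk": {"milk", "vitamin a", "vitamin d"},
    "Almond Milk": {"almonds", "water", "vitamin d"},
    "Yogurt": {"milk", "cultures"},
    "Peanuts": {"peanuts", "salt"},
}

def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similar_pairs(items, threshold):
    """Return (item, item, score) for every pair scoring at or above the threshold."""
    pairs = []
    for x, y in combinations(sorted(items), 2):
        score = jaccard(items[x], items[y])
        if score >= threshold:
            pairs.append((x, y, score))
    return pairs

# Each surviving pair would become a candidate similarity relationship.
for x, y, score in similar_pairs(ingredients, threshold=0.3):
    print(x, y, score)
```

With this toy data, ingredient overlap alone does not place Almond Milk close to Milk, so extra properties (a category, for instance) would probably be needed before a topological score could close that triangle.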

Thanks for submitting!

I've added a tag that allows your blog to be displayed on the community home page!

Thank you both for sharing! I am currently working on a prototype for similar purposes.

Thanks Mike. I'll take a look at the paper.


Thanks for your question! There are many ways you could approach creating your relationships. Since you're still building your model, below are a few resources that are helpful:

Best of luck!


How can I use the Link Prediction algorithms as part of my graph algorithms library? I am not able to access them in my Neo4j Browser.

Hi, Tiwari! What version of Neo4j and the graph algorithms library are you running? What error are you getting when you try to run one of the Link Prediction algorithms in your Neo4j Browser?


Dear Jennifer,

Greetings and hope you are doing well.

My version of Neo4j: Neo4j Desktop 3.5.6
Version of Neo4j ML Models: neo4j-ml-models-1.0.0.jar

When I install this library using the procedure mentioned in the following link, my database stops working and I have to delete it.

Are you seeing an error message? Did you restart the database when you pulled in the jar file?

Yes, I did but the database fails to restart.

What is the error message? Can you send a screenshot of what happens when it fails?

Also, you can actually use the Link Prediction algorithms by clicking Manage on your database and then clicking Install for Graph Algorithms. The ML Models jar is for graph embeddings, I believe. The instructions for installing the Graph Algorithms plugin are also listed in one of my blog posts here: Explore New Worlds — Adding Plugins to Neo4j | by Jennifer Reif | Neo4j Developer Blog | Medium

Hope this helps!

Thanks for this super interesting and accessible session (I'm only 2 years late to the party!!). You mention towards the end that most of the methods discussed are relevant to monopartite networks: I was wondering if there was any update on deployment of Npartite-suitable methods in the GA library? Thanks a lot.