I'll be presenting some of Octavian.ai's research on Deep Learning with Graphs at the Neo4j Online Meetup on Thursday 15th November at 9am PST / 5pm GMT.
It'll be hosted on YouTube live at the following link:
To get a flavour of our work, check out the articles at https://www.octavian.ai
The talk is available as a blog post with references as links here:
We intend the model to generalise to unseen graphs with the same schema, but we don't intend it to generalise to unseen questions at the moment.
All the test accuracy scores I gave were on graphs that the models had not seen during training, i.e. unseen graphs with the same schema.
Excellent talk @andrew.jefferson .
To preface my question(s), I want to say that I'm just starting to explore graphs as well as machine learning, so forgive any stupidity I may have, as I'm a total newb on the subjects.
From what I understand, graphs are great for representing real-world connections (as you talk about in the beginning), but I've also read that there are performance benefits for queries over highly connected data in a graph database.
I'm curious: what are the benefits of using a graph in a deep learning network vs a traditional dataset? From the talk, your examples seem promising in that they provide a higher level of confidence in the results. Is this because the data makes it easier for the NN to understand the relationships? Furthermore, have you found performance benefits in using a graph for neural networks, i.e. faster training or faster results?
Absolutely great talk. I will have to watch it multiple times to fully understand. I appreciate any response you can give me.
I'm glad that you enjoyed the talk.
This is a great question:
A "traditional" dataset would be a table of training data points. To achieve that, you typically have to flatten the real data, e.g. from a relational database or a graph like Neo4j, into a single table.
Imagine a hypothetical experiment using a social graph where I try to predict people's favourite music genre based on their friends (at least those friends whose favourite music genre we already know).
The flattened traditional dataset for ML would be a table like:
Target Person, Friend_1 Favourite Genre, Friend_2 Favourite Genre, Friend_3 Favourite Genre...
Not everyone has the same number of friends, so you have to come up with some way of inserting null values, ignoring extra friends, or dropping people who have too few friends.
Lots of information is also lost, e.g. friends with whom you have many friends in common might be much more influential than friends with whom you have few.
For many ML models, each column in the dataset is treated independently, so it's harder for the model to learn that all friends should be treated the same.
You could try and overcome some of these problems by adding more features to your tabular dataset:
Target Person, Friend_1 Favourite Genre, Friend_1 count(shared friends), Friend_2 Favourite Genre, Friend_2 count(shared friends), Friend_3 Favourite Genre, Friend_3 count(shared friends)...
But which features should you add? Each column is still treated as independent, and we still have to populate null values somehow. This approach is a lot of work when we could just train the model using the graph, which contains all the information and allows the model to learn to treat all entities of the same kind (friends) in the same way.
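To make the flattening problem concrete, here is a minimal sketch of turning a toy social graph into a fixed-width table. All the names, genres, and the friend cap are invented for illustration; the point is the padding and truncation that the flat format forces on you:

```python
# Hypothetical sketch: flattening a small social graph into a
# fixed-width table for a "traditional" ML model.

MAX_FRIENDS = 3  # arbitrary cap: friends beyond this are silently dropped

# friend -> favourite genre, for the friends of each target person
friends_of = {
    "alice": {"bob": "jazz", "carol": "rock"},  # only 2 friends -> needs padding
    "dave": {"erin": "pop", "frank": "jazz",
             "grace": "rock", "heidi": "pop"},  # 4 friends -> one is lost
}

def flatten(person):
    """One table row: truncate extra friends, pad missing ones with None."""
    genres = list(friends_of[person].values())[:MAX_FRIENDS]
    genres += [None] * (MAX_FRIENDS - len(genres))
    return [person] + genres

rows = [flatten(p) for p in friends_of]
# alice's row contains a null; dave's fourth friend (heidi) never reaches the model
```

Note how the shared-friends structure discussed above is gone entirely: the table has no way to say that two columns refer to mutual friends.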
Thanks for bringing that up!
The short answer is that at Octavian we have not tried this technique for subgraph similarity.
I think that by combining some of the techniques I discussed (attention and within-graph message passing) it could be possible to generate subgraph vectors/embeddings that could be used for similarity measurement.
I think you'd have to figure out what function(s) could be used to produce embeddings that capture the particular similarity you are interested in.
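As a rough illustration of that combination, here is a sketch of one round of within-graph message passing followed by attention-style pooling into a single subgraph vector. The graph, feature sizes, and weights are all placeholders (random, not learned), so this only shows the shape of the computation, not a working model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy subgraph: 4 nodes with 8-dim features, edges as (src, dst) pairs
node_feats = rng.normal(size=(4, 8))
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]

W_msg = rng.normal(size=(8, 8))  # message transform (would be learned)
w_attn = rng.normal(size=(8,))   # attention scoring vector (would be learned)

def message_pass(feats, edges):
    """Each node adds the transformed features of its in-neighbours."""
    out = feats.copy()
    for src, dst in edges:
        out[dst] += np.tanh(feats[src] @ W_msg)
    return out

def attention_pool(feats):
    """Softmax-weight the nodes, then sum them into one subgraph vector."""
    scores = feats @ w_attn
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ feats  # (4,) @ (4, 8) -> (8,) subgraph embedding

h = message_pass(node_feats, edges)
embedding = attention_pool(h)
```

Two subgraphs embedded this way could then be compared with e.g. cosine similarity; the open question is what training objective would make that comparison capture the similarity you actually care about.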
@watsaqat were you thinking of a particular application or dataset?
I'm working on customer journey mapping in the CXP space. One of our goals is to find similar customer journeys and group them into communities. From there, the idea is that, knowing a partial customer journey, we might be able to predict next moves, churn probability, etc. based on similar journeys. Vector similarity in graphs is easy, but we're dealing with unstructured journeys, which can have a full graph structure. However, for simplicity, we can reduce a journey to a linked-list (LL) structure and attempt to assign it a similarity with other journeys. The techniques I'm aware of for vector similarity don't fit here, and I'm exploring possible ways to go about this.
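For the simplified linked-list version of the problem, one baseline worth noting is normalised edit distance over the sequence of journey steps. This is only a sketch of one possible approach, and the step names below are invented for illustration:

```python
def edit_distance(a, b):
    """Classic Levenshtein distance over two sequences of journey steps."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # delete
                          d[i][j - 1] + 1,      # insert
                          d[i - 1][j - 1] + cost)  # substitute
    return d[m][n]

def journey_similarity(a, b):
    """1.0 for identical journeys, approaching 0.0 for disjoint ones."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))

# Two hypothetical journeys sharing a common prefix
j1 = ["visit", "signup", "purchase", "support"]
j2 = ["visit", "signup", "churn"]
sim = journey_similarity(j1, j2)  # 0.5: two of four steps need changing
```

This loses everything the full graph structure carries, so it would only serve as a starting point before the embedding-based approaches discussed above.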