How to project a GDS graph with "Twitter_User" and "Tweets" nodes to identify similarity between Twitter_Users

I would like to convey my thanks in advance.

I am working on a project to identify node similarities. The nodes are User (properties: follower_count, following_count, tweets_count) and Tweets (properties: Likes, RT, Reply, Quote). The relationship between them is (u:user)-[:posted]->(t:tweet); a user can post multiple tweets.

My question is: how do I project a GDS graph to compute "User" similarity coefficients that take both "User" and "Tweet" properties into account?

neo4j version: 5.3.0
desktop version: enterprise version
APOC version: 5.3.0
GDS version: 2.4.5

Thanks again.

Hello @p102271 ,
let's summarize what you might already know.
In GDS there are:
(1) Node Similarity, which uses the neighbors but no properties
(2) KNN, which uses node properties but no neighbors
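
As a rough illustration of the difference between the two, the calls could look like the sketch below. It assumes a graph has already been projected under the hypothetical name 'graph_name' and that every projected node carries a numeric `follower_count` property; both are assumptions for the example, not details from the question.

```cypher
// (1) Node Similarity: scores node pairs by the neighbors they share.
//     Node properties are not taken into account.
CALL gds.nodeSimilarity.stream('graph_name', { topK: 5 })
YIELD node1, node2, similarity
RETURN node1, node2, similarity
ORDER BY similarity DESC;

// (2) KNN: scores node pairs by the listed node properties.
//     Relationships are not taken into account.
CALL gds.knn.stream('graph_name', {
  nodeProperties: ['follower_count'],  // assumed property name
  topK: 5
})
YIELD node1, node2, similarity
RETURN node1, node2, similarity
ORDER BY similarity DESC;
```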

You want neighbors + node properties.
For this, I would suggest using a node embedding algorithm, which captures the neighbor information in a node property, and then applying filtered KNN on the embedding.

You could also look into link prediction as the relationships are node pairs.

Hi @florentin_dorre ,
Thank you very much for your reply. I am actually struggling to project the graph so that I can apply an algorithm to it, as the graph has two kinds of nodes, users and tweets. As the tutorial suggests, to project we use:

<CALL gds.graph.project ('graph_name', 'Users', 'Tweets' {INTERACTS:{orientation:'UNDIRECTED', properties:'weight'}}) >

My understanding is that the projection above treats user and tweet nodes equally, but the main goal is to determine node similarity between users, while still having both user and tweet properties considered by the algorithm. Since a user can post multiple tweets, I cannot (or don't know how to) use tweet information as user properties.
Would this be possible?
I apologize if my explanation is not clear. Again, thank you very much for your consideration. I really appreciate it.

I would project both labels through:

CALL gds.graph.project('graph_name', {Users: {properties: [...]}, Tweets: {properties: [...]}}, {INTERACTS: {orientation: 'UNDIRECTED', properties: 'weight'}})
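
For illustration, the same projection with the property lists written out might look like the sketch below. The labels `Users` and `Tweets` follow the call above, the relationship type `posted` is taken from the pattern in the question, and the property names are the ones listed in the original post. All of these are assumptions about the real schema and have to match the database exactly, since labels, relationship types and property keys are case-sensitive.

```cypher
// Sketch only: labels, relationship type and property names are assumed
// from the question and must match the actual database schema.
CALL gds.graph.project(
  'graph_name',
  {
    // each label projects only its own properties
    Users:  { properties: ['follower_count', 'following_count', 'tweets_count'] },
    Tweets: { properties: ['Likes', 'RT', 'Reply', 'Quote'] }
  },
  {
    // UNDIRECTED so that users can also be reached from their tweets
    posted: { orientation: 'UNDIRECTED' }
  }
)
YIELD graphName, nodeCount, relationshipCount
RETURN graphName, nodeCount, relationshipCount;
```

If some nodes are missing one of these properties, a `defaultValue` can be set for that property in the node projection.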

Then run GraphSage with properties, and afterwards run KNN over only the User nodes (with a nodeLabels filter on the KNN call).
With GraphSage supporting heterogeneous properties, I think it's worth a try to see whether your user embeddings can capture the neighboring tweets and their properties.
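
A minimal sketch of those two steps is shown below. It assumes the graph 'graph_name' was projected with the labels `Users` and `Tweets` and with the property names from the question. The model name 'userTweetSage', the property name 'embedding' and all numeric settings are hypothetical placeholders. The procedure names are the beta-tier GraphSAGE procedures as I understand them for GDS 2.4, so they may differ in other versions.

```cypher
// 1) Train GraphSAGE on both labels.
//    projectedFeatureDimension enables the multi-label mode, in which
//    each label may carry only a subset of the feature properties.
CALL gds.beta.graphSage.train('graph_name', {
  modelName: 'userTweetSage',          // hypothetical model name
  featureProperties: ['follower_count', 'following_count', 'tweets_count',
                      'Likes', 'RT', 'Reply', 'Quote'],
  projectedFeatureDimension: 8,        // placeholder value
  embeddingDimension: 64,              // placeholder value
  randomSeed: 42
})
YIELD modelInfo
RETURN modelInfo.modelName AS model;

// 2) Store the embeddings as a node property in the in-memory graph.
CALL gds.beta.graphSage.mutate('graph_name', {
  modelName: 'userTweetSage',
  mutateProperty: 'embedding'          // hypothetical property name
})
YIELD nodePropertiesWritten
RETURN nodePropertiesWritten;

// 3) Run KNN on the embeddings, restricted to the Users label.
CALL gds.knn.stream('graph_name', {
  nodeLabels: ['Users'],
  nodeProperties: ['embedding'],
  topK: 10                             // placeholder value
})
YIELD node1, node2, similarity
RETURN gds.util.asNode(node1) AS user1,
       gds.util.asNode(node2) AS user2,
       similarity
ORDER BY similarity DESC;
```

Because the embedding of each user is computed from its neighboring tweets as well as its own properties, the KNN step compares users on both, even though only `Users` nodes are scored.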

Curious about your results :)

Dear @florentin_dorre ,
Thank you very much for your suggestions. I will let you know the results :)
