GraphSAGE source code / detail

Hi all,

I have some questions about the GraphSAGE algorithm in the graphdatascience library.

  1. The GraphSAGE documentation page references two papers: GraphSAGE and SWAG. I noticed that there is a relationship weight property, so does this mean the algorithm uses the SWAG model and not GraphSAGE?

  2. Is the source code for this implementation available anywhere? I'd like to see what is actually happening.

  3. The page says the model supports directed networks. What does this actually mean? Is it possible to specify the direction so that it affects how the graph is sampled during training of GraphSAGE/SWAG (my network is a DAG)?

  4. What is the meaning of "The algorithm is defined for UNDIRECTED graphs.", especially given the support for directed networks in the point above?

Many thanks,
David

Hi David,

The code is public and can be found in the neo4j/graph-data-science repository on GitHub (master branch), under graph-data-science/algo/src/main/java/org/neo4j/gds/embeddings/graphsage.

In the weighted case, the algorithm sits somewhere between GraphSAGE and SWAG, though it is closer to GraphSAGE: it uses the relationship weights in the loss function, as SWAG does, but the sampling and aggregation work as in GraphSAGE.
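
To make the weighted case more concrete, here is a minimal Python sketch of a GraphSAGE-style unsupervised loss with a relationship weight added. It is not the GDS code. This thread does not say how exactly the weight enters the loss, so scaling the positive term by the weight is an assumption made purely for illustration, and all function and parameter names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def weighted_unsupervised_loss(z_u, z_pos, z_negs, weight=1.0):
    """GraphSAGE-style loss for one (node, positive neighbor) pair.

    z_u    : embedding of the source node
    z_pos  : embedding of a co-occurring (positive) node
    z_negs : embeddings of negative samples
    weight : relationship weight; ASSUMPTION: it scales the positive term
    """
    # Positive term: pull the node and its neighbor together.
    positive = -weight * math.log(sigmoid(dot(z_u, z_pos)))
    # Negative term: push the node away from the negative samples.
    negative = -sum(math.log(sigmoid(-dot(z_u, z_n))) for z_n in z_negs)
    return positive + negative


z_u = [0.5, -0.2]
z_pos = [0.4, 0.1]
z_negs = [[-0.3, 0.8], [0.9, 0.9]]

unweighted = weighted_unsupervised_loss(z_u, z_pos, z_negs)
weighted = weighted_unsupervised_loss(z_u, z_pos, z_negs, weight=2.0)
```

With `weight=1.0` this reduces to the unweighted GraphSAGE loss; a larger weight makes a poorly matched positive pair cost more. The sampling and aggregation steps are untouched by the weight in this sketch.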

The algorithm runs on directed graphs, and the sampling traverses outgoing relationships when expanding from the source nodes. The original paper only benchmarks on undirected graphs, so while the algorithm technically works on directed graphs, the embedding quality there is not backed by the research. It may nevertheless work well in practice.
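
As a rough illustration of what "traverses outgoing relationships" means for a DAG, here is a small Python sketch of layer-wise neighbor sampling that only follows outgoing edges. It is not taken from the GDS source. The graph, the function names, and the choice to sample with replacement are assumptions for the example.

```python
import random


def sample_out_neighbors(out_edges, node, sample_size, rng):
    """Sample `sample_size` outgoing neighbors of `node`, with replacement.

    Returns an empty list for nodes without outgoing edges (sinks).
    """
    neighbors = out_edges.get(node, [])
    if not neighbors:
        return []
    return [rng.choice(neighbors) for _ in range(sample_size)]


def sample_neighborhood(out_edges, batch, sample_sizes, rng):
    """Expand a batch of source nodes one layer at a time.

    layers[0] is the batch itself; layers[k] holds the nodes sampled
    k hops away, following outgoing edges only.
    """
    layers = [list(batch)]
    frontier = list(batch)
    for size in sample_sizes:
        next_frontier = []
        for node in frontier:
            next_frontier.extend(sample_out_neighbors(out_edges, node, size, rng))
        layers.append(next_frontier)
        frontier = next_frontier
    return layers


# A tiny DAG: a -> b, a -> c, b -> d, c -> d
dag = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}

layers = sample_neighborhood(dag, ["a"], [2, 2], random.Random(0))
sink_layers = sample_neighborhood(dag, ["d"], [2], random.Random(0))
```

In this sketch, expanding from the sink `d` yields an empty neighborhood, because nothing is reachable along outgoing edges. That is one reason embedding quality on a DAG can differ from the undirected setting the paper evaluates.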

The documentation is not very clear on this point. Apologies.

Best regards,
Jacob Sznajdman, developer of GDS.

Hi Jacob,

Thanks for your detailed reply and link to the source code, very helpful!

Kind regards,
David