Graph data sampling

Hello everyone,

I am working with a large graph for learning tasks, and it is too big to handle in one piece. I tried to extract a representative subgraph with graph sampling algorithms, but on my machine the execution time is always out of control.

I am wondering whether there are example graphs available together with their sampled versions (whatever the sampling method used) that I can download and use directly.

Please help me!

Hi @hm873154,

It's unfortunate that you're not able to run the sampling on your machine. Are you using GDS's random walk with restarts (RWR) sampling algorithm with a reasonable concurrency setting? It's a very efficient algorithm.
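
For intuition about why this kind of sampling is cheap, here is a toy pure-Python sketch of random walk with restarts sampling. It is not GDS's implementation, only an illustration of the idea; the function names, the adjacency-list format, and the default parameter values are all made up for this example.

```python
import random


def rwr_sample(adj, start, sampling_ratio=0.15, restart_prob=0.1,
               seed=42, max_steps=1_000_000):
    """Collect nodes visited by a random walk with restarts.

    adj maps each node to a list of its neighbors (hypothetical format).
    The walk stops once sampling_ratio of all nodes has been visited,
    or after max_steps steps, whichever comes first.
    """
    rng = random.Random(seed)
    target = max(1, int(sampling_ratio * len(adj)))
    visited = {start}
    current = start
    for _ in range(max_steps):
        if len(visited) >= target:
            break
        neighbors = adj[current]
        # Jump back to the start node on a restart or at a dead end.
        if not neighbors or rng.random() < restart_prob:
            current = start
        else:
            current = rng.choice(neighbors)
            visited.add(current)
    return visited


def induced_subgraph(adj, nodes):
    """Keep only the sampled nodes and the edges between them."""
    return {u: [v for v in adj[u] if v in nodes] for u in nodes}


# Tiny usage example: a ring of 20 nodes, sampling about half of it.
ring = {i: [(i - 1) % 20, (i + 1) % 20] for i in range(20)}
sample_nodes = rwr_sample(ring, start=0, sampling_ratio=0.5)
sample_graph = induced_subgraph(ring, sample_nodes)
```

Each step only looks at the neighbors of the current node, so the cost depends on the size of the sample rather than on the size of the full graph.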

If you cannot run the algorithm on your own graph, but are just interested in a sample subgraph of any example graph, you could for instance:

  1. Use the GDS Python client to load one of the smaller example graphs it comes shipped with, like Cora or LastFM, and then
  2. Use GDS's random walk with restarts algorithm to sample a representative subgraph from it

This is done with Cora, for example, in the example notebook tutorial "PyG integration: Sample and export".
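
The two steps above can be sketched roughly as follows. This assumes a running Neo4j instance with a recent GDS version installed and the `graphdatascience` Python package; the connection URI, credentials, sampling ratio, concurrency, and the name `sample_cora` are placeholders of mine. On older GDS versions the sampling procedure sits under `gds.alpha.graph.sample.rwr` instead.

```python
def sample_cora(uri="bolt://localhost:7687", user="neo4j",
                password="password", sampling_ratio=0.2, concurrency=4):
    """Load Cora and project an RWR-sampled subgraph (sketch only)."""
    # Imported here so the function can be defined without the client installed.
    from graphdatascience import GraphDataScience

    # Placeholder connection details; replace with your own.
    gds = GraphDataScience(uri, auth=(user, password))

    # Step 1: load the Cora example graph that ships with the client.
    G = gds.graph.load_cora()

    # Step 2: sample a subgraph with random walk with restarts.
    G_sample, result = gds.graph.sample.rwr(
        "cora_sample",  # name of the new in-memory graph
        G,
        samplingRatio=sampling_ratio,
        concurrency=concurrency,
    )
    print(f"Sampled {G_sample.node_count()} of {G.node_count()} nodes")
    return G_sample
```

Calling `sample_cora()` leaves both the full and the sampled graph in the GDS graph catalog, so you can export or stream the sample from there.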

If you need a larger graph, you could also use an OGB (Open Graph Benchmark) graph in step 1.

Hope this is helpful,
Adam