deepWalk algorithm doesn't go beyond iteration 0


(Vramanathan00) #1

Hi
I've tried several public datasets as well as internal ones. The deepWalk output always reports 0 iterations.

CALL embedding.deepWalk(null, null, { vectorSize:256, windowSize:100, writeProperty: 'dw_128', graph:'heavy'}) YIELD nodes, iterations;
+---------------------+
| nodes  | iterations |
+---------------------+
| 262119 | 0          |
+---------------------+

Using neo4j 3.5.3 (neo4j-ml-models 1.0.2; built from source)

Is this a bug?

Thanks


(Mark Needham) #2

Hey,

The algorithm decides whether to run another iteration based on whether it found any new unique features in the current iteration. You can see the code that checks that condition here - https://github.com/neo4j-graph-analytics/ml-models/blob/master/src/main/java/embedding/DeepGL.java#L165

The Pruning code that works out whether there are unique features is here - https://github.com/neo4j-graph-analytics/ml-models/blob/master/src/main/java/embedding/Pruning.java#L182

You can control how that pruning works by tweaking the config parameter pruningLambda - https://github.com/neo4j-graph-analytics/ml-models/blob/master/src/main/java/embedding/DeepGLProc.java#L51

I would try reducing that value (default is 0.7) and then see if that helps.

CALL embedding.deepWalk(null, null, 
  { vectorSize:256, windowSize:100, writeProperty: 'dw_128', graph:'heavy', 
    pruningLambda: 0.3}) 
YIELD nodes, iterations;
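
To make the stopping rule described above concrete, here is a minimal, standalone sketch. It is not the library's code: the class name, the method name, and the idea of representing each iteration's candidate features as a set of strings are all assumptions made for illustration, and how the procedure itself counts iterations is not shown in this thread. The sketch only shows the general pattern of "stop as soon as an iteration adds no new unique features".

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names come from ml-models.
class StoppingRuleSketch {

    // Counts how many iterations contributed at least one feature that had
    // not been seen before, stopping at the first iteration that adds none.
    static int countProductiveIterations(List<Set<String>> featuresPerIteration) {
        Set<String> seen = new HashSet<>();
        int productive = 0;
        for (Set<String> candidates : featuresPerIteration) {
            // Keep only the candidates that are new in this iteration.
            Set<String> unique = new HashSet<>(candidates);
            unique.removeAll(seen);
            if (unique.isEmpty()) {
                break; // nothing new was found, so no further iteration runs
            }
            seen.addAll(unique);
            productive++;
        }
        return productive;
    }

    public static void main(String[] args) {
        List<Set<String>> rounds = Arrays.asList(
                new HashSet<>(Arrays.asList("a", "b")),
                new HashSet<>(Arrays.asList("b", "c")),
                new HashSet<>(Arrays.asList("a", "c")), // adds nothing new
                new HashSet<>(Arrays.asList("d")));     // never reached
        System.out.println(countProductiveIterations(rounds)); // prints 2
    }
}
```

If the very first round yields no features that survive, a loop of this shape stops immediately, which is one way a result could stay at the starting iteration.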

Cheers, Mark


(Vramanathan00) #3

Thanks Mark. I tried several values. No luck.

Is there a way to enable this logging?
logger.log("Unique features this iteration: " + uniqueFeaturesSet.size());
if (uniqueFeaturesSet.size() == 0) {
Also, if you have a specific public dataset that you tested it on where it went beyond iteration 0, please let me know as well.

Thanks


(Vramanathan00) #4

Ignore my logging question. It just doesn't get to that state.
If you can tell me which public dataset this code has been tested on, please let me know.
Thanks