
Any suggestions on how to improve the FastRP/FastRPExtended algorithm's performance?

lingvisa
Graph Fellow

I am testing graph embedding algorithms, especially FastRP. I split all nodes in my graph into train and test sets, and then evaluate the prediction capability of FastRP/FastRPExtended. I am comparing performance across three categories:

  1. bertEmbedding: I created a BERT embedding vector for the 'name' property and use that as the single feature
  2. fastRP: I created a FastRP embedding vector and use that as the single feature
  3. fastRPExtended: I created a FastRPExtended embedding vector, feeding the 'bertEmbedding' vector in as a node property, and use that as the single feature
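For context: FastRPExtended can only see the BERT vectors if they are included as a node property in the projected in-memory graph. A minimal sketch of such a projection, where the label 'Item' and relationship type 'RELATED' are hypothetical placeholders and only 'bertEmbedding' and 'nodeGraph' come from my setup:

```cypher
// Project a named in-memory graph that carries the BERT vectors as a
// node property, so gds.beta.fastRPExtended can consume them.
// 'Item' and 'RELATED' are placeholder label/type names.
CALL gds.graph.create(
  'nodeGraph',
  { Item: { properties: ['bertEmbedding'] } },
  'RELATED'
)
YIELD graphName, nodeCount, relationshipCount
```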

In all cases, I am running the classification algorithm:
CALL gds.alpha.ml.nodeClassification.train
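For completeness, a sketch of roughly how I invoke that trainer. Everything except the procedure name and the feature property names is a placeholder I chose for this test (model name, target property, metric, holdout/folds, and the logistic-regression penalty), so treat the configuration as illustrative rather than a known-good recipe:

```cypher
// Hypothetical configuration for the alpha node-classification trainer.
// 'class' (the target property) and all hyperparameters are placeholders.
CALL gds.alpha.ml.nodeClassification.train('nodeGraph', {
  modelName: 'embedding-classifier',
  featureProperties: ['graphEmbedding'],   // or ['bertEmbedding']
  targetProperty: 'class',
  metrics: ['F1_WEIGHTED'],
  holdoutFraction: 0.2,
  validationFolds: 5,
  randomSeed: 42,
  params: [{ penalty: 0.0625 }]
})
YIELD modelInfo
```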

The results are below:

  1. fastRP is the worst, about 72% F1
  2. fastRPExtended (using bertEmbedding as a node property) is better, about 75% F1
  3. bertEmbedding alone is the best, about 90% F1

It seems that the structure of my graph doesn't help much, while the BERT embedding alone is far better than FastRP or FastRPExtended. Since FastRPExtended takes advantage of the bertEmbedding, I didn't expect it to be significantly worse than bertEmbedding alone. I suspect my parameter settings during training may be the cause; these are my settings:

CALL gds.beta.fastRPExtended.write(
  'nodeGraph',
  {
    embeddingDimension: 512,
    iterationWeights: [0.0, 1.0, 1.0, 1.0],
    normalizationStrength: 0,
    propertyDimension: 96,
    featureProperties: ['bertEmbedding'],
    writeProperty: 'graphEmbedding'
  }
)
YIELD nodePropertiesWritten

Graph embedding authors, any suggestions on tuning the parameters? The BERT embedding has 768 dimensions in its standard form.
