Relationship weight to Multi Label Model in GraphSage

Hi,

I have a graph with two node labels that have different properties, and my relationships carry weights. What is the command to train and stream a multi-label graph with relationship weights using GraphSage?

The example given only covers a single label:
CALL gds.beta.graphSage.train(
'persons',
{
modelName: 'weightedTrainedModel',
featureProperties: ['age', 'heightAndWeight'],
relationshipWeightProperty: 'relWeight',
nodeLabels: ['Person'],
relationshipTypes: ['KNOWS']
}
)

Thanks

Hello @kamalika.ray ,
The example is nearly correct.

According to the docs: "The projected feature dimension is configured with projectedFeatureDimension, and specifying it enables the multi-label mode."
I assume that in your case all labels have the same properties?

Also, it only makes sense if your nodeLabels parameter specifies multiple labels.

Hi @florentin_dorre ,

No, the nodes have different labels and different properties as well.
I am trying the following command but I am not sure if it is correct:

CALL gds.beta.graphSage.train(
'graphsage',
{
modelName: 'weightedTrainedModel',
featureProperties: ['pval_protein', 'pval_disease'],
projectedFeatureDimension: 2,
relationshipWeightProperty: 'confidence_score',
nodeLabels: ['Protein', 'Disease'],
relationshipTypes: ['RESPONSIBLE_FOR']
}
)

Is this correct?

Thanks

Okay, I see your issue. In that case, the projected dimension should be the maximum feature dimension over these labels.

I assume pval_protein only exists for the Protein label, so I would set projectedFeatureDimension to 1. This assumes pval_protein is a single scalar value. If it is an array, you should use the length of the array instead.

I have 400 Protein nodes, each with a numeric property pval_protein, and 200 Disease nodes, each with a numeric property pval_disease, in my graph. Each protein and each disease has exactly one value.

Okay, then a projectedFeatureDimension of 1 is correct.
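
For reference, here is a minimal sketch of the adjusted training call followed by a streaming call. It reuses the graph name, model name, properties, and relationship type from your post. It assumes that your GDS version exposes gds.beta.graphSage.stream, and that the projected graph 'graphsage' already contains both node properties and the confidence_score relationship property, so treat it as a starting point rather than a verified command.

```cypher
// Train in multi-label mode: each label contributes one scalar feature,
// so the largest per-label feature length is 1.
CALL gds.beta.graphSage.train(
  'graphsage',
  {
    modelName: 'weightedTrainedModel',
    featureProperties: ['pval_protein', 'pval_disease'],
    projectedFeatureDimension: 1,
    relationshipWeightProperty: 'confidence_score',
    nodeLabels: ['Protein', 'Disease'],
    relationshipTypes: ['RESPONSIBLE_FOR']
  }
);

// Stream embeddings from the trained model
// (procedure name assumed for this GDS version).
CALL gds.beta.graphSage.stream(
  'graphsage',
  { modelName: 'weightedTrainedModel' }
)
YIELD nodeId, embedding
RETURN nodeId, embedding;
```

If pval_protein or pval_disease were arrays rather than scalars, projectedFeatureDimension would need to be the array length instead of 1.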

Okay, Thank you very much.

I have a question regarding the example:

CALL gds.beta.graphSage.train(
'persons_with_instruments',
{
modelName: 'multiLabelModel',
featureProperties: ['age', 'heightAndWeight', 'cost'],
projectedFeatureDimension: 4
}
)

Why is projectedFeatureDimension 4 here? age per person and cost per instrument are scalars, right?

You are right, the example is confusing here. Looking at our implementation, I still stand by my explanation. I will get back to you about this example.


Just an update on this. The example was wrong and is currently being updated.

You can already find the updated wording at GraphSAGE - Neo4j Graph Data Science

The projectedFeatureDimension should equal the maximum length of the feature array over all labels. In our example, persons have age (1) and heightAndWeight (2), summing up to a length of 3. Instruments only have cost, with a length of 1. Thus, projectedFeatureDimension should be set to 3.
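
The arithmetic can be sketched in plain Cypher. The per-property lengths are taken from the example (age 1, heightAndWeight 2, cost 1); the label names and the query itself are only an illustration and not part of GDS.

```cypher
// Per label, list the length of each feature property (illustrative literals).
UNWIND [
  {label: 'Person',     lengths: [1, 2]},  // age (1), heightAndWeight (2)
  {label: 'Instrument', lengths: [1]}      // cost (1)
] AS entry
// Sum the property lengths to get the feature-array length per label.
WITH entry.label AS label,
     reduce(total = 0, len IN entry.lengths | total + len) AS featureLength
// The projected dimension is the largest per-label feature length.
RETURN max(featureLength) AS projectedFeatureDimension;
// returns 3
```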
