Relationship weight to Multi Label Model in GraphSage

Hi,

I have a graph with two node labels that have different properties, and my relationships carry weights. What is the command to train and stream a multi-label graph with relationship weights using GraphSAGE?

In the example given, it is only for one label:
CALL gds.beta.graphSage.train(
'persons',
{
modelName: 'weightedTrainedModel',
featureProperties: ['age', 'heightAndWeight'],
relationshipWeightProperty: 'relWeight',
nodeLabels: ['Person'],
relationshipTypes: ['KNOWS']
}
)

Thanks

Hello @kamalika.ray ,
The example is nearly correct.

According to the docs: "The projected feature dimension is configured with projectedFeatureDimension, and specifying it enables the multi-label mode."
I assume that in your case all nodes have the same label?

Also, multi-label mode only makes sense if your nodeLabels parameter specifies multiple labels.

Hi @florentin_dorre ,

No, it has different labels and properties as well.
I am trying the following command but I am not sure if it is correct:

CALL gds.beta.graphSage.train(
'graphsage',
{
modelName: 'weightedTrainedModel',
featureProperties: ['pval_protein', 'pval_disease'],
projectedFeatureDimension: 2,
relationshipWeightProperty: 'confidence_score',
nodeLabels: ['Protein', 'Disease'],
relationshipTypes: ['RESPONSIBLE_FOR']
}
)

Is this correct?

Thanks

Okay, I see your issue. In that case, the projected dimension should be the maximum feature dimension over these labels.

I assume pval_protein exists only on the Protein label.
So I would set projectedFeatureDimension to 1. This assumes pval_protein is a single scalar value. If it's an array, you should use the length of the array instead.

I have 400 Protein nodes, each with a numeric property pval_protein, and 200 Disease nodes, each with a numeric property pval_disease, in my graph. Each protein and each disease has only one value.

Okay, then a projectedFeatureDimension of 1 is correct.
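
Putting the points above together, the training call from the question would look roughly like the sketch below, followed by a stream call that reuses the trained model. This is an untested sketch under some assumptions: the in-memory graph is named 'graphsage' and was projected with both node properties and the confidence_score relationship property, and the beta procedure names match the GDS version used in this thread (later versions may use different names).

```cypher
// Train in multi-label mode: each label has a single scalar feature,
// so the maximum feature length over the labels is 1.
CALL gds.beta.graphSage.train(
  'graphsage',
  {
    modelName: 'weightedTrainedModel',
    featureProperties: ['pval_protein', 'pval_disease'],
    projectedFeatureDimension: 1,
    relationshipWeightProperty: 'confidence_score',
    nodeLabels: ['Protein', 'Disease'],
    relationshipTypes: ['RESPONSIBLE_FOR']
  }
);

// Stream embeddings from the trained model for the same graph.
CALL gds.beta.graphSage.stream(
  'graphsage',
  { modelName: 'weightedTrainedModel' }
)
YIELD nodeId, embedding
RETURN nodeId, embedding;
```

The stream call only needs the graph name and the model name. The feature properties and the relationship weight property are stored with the trained model.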

Okay, Thank you very much.

I have a question regarding the example:

CALL gds.beta.graphSage.train(
'persons_with_instruments',
{
modelName: 'multiLabelModel',
featureProperties: ['age', 'heightAndWeight', 'cost'],
projectedFeatureDimension: 4
}
)

Why is the projectedFeatureDimension 4 here? Age per person and cost per instrument are both scalars, right?

You are right, the example is confusing here. Looking at our implementation, I still stand by my explanation. I will get back to you about this example.

Just an update on this: the example was wrong here and is being updated now.

You can already find the updated wording at GraphSAGE - Neo4j Graph Data Science

The projectedFeatureDimension should equal the maximum length of the feature-array. In our example, persons have age (1) and heightAndWeight (2), summing up to a length of 3. Instruments only have cost with length of 1. Thus, the projectedFeatureDimension should be set to 3.
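
Applying that rule to the example quoted earlier in the thread, the call would presumably become the following. This is a sketch of the change described above, not a copy of the updated documentation; it assumes the 'persons_with_instruments' graph from the docs example, where heightAndWeight is an array of length 2.

```cypher
// Person features: age (1) + heightAndWeight (2) = 3
// Instrument features: cost (1) = 1
// projectedFeatureDimension = max(3, 1) = 3
CALL gds.beta.graphSage.train(
  'persons_with_instruments',
  {
    modelName: 'multiLabelModel',
    featureProperties: ['age', 'heightAndWeight', 'cost'],
    projectedFeatureDimension: 3
  }
)
```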