My machine runs CentOS 7 with 40 cores and 120 GB of memory. While training, top and htop show that only one core is in use and all the others are idle; in the htop output, only core number 20 is busy.
My subgraph has 1.7 million nodes and several million relationships.
Training code:
CALL gds.graph.create(
  'nodeGraph',
  ['Article', 'Topic'],
  {
    topicOf: {orientation: 'UNDIRECTED'}
  },
  {readConcurrency: 4, validateRelationships: true}
)
// node_degree_feature_query
CALL gds.degree.mutate(
  'nodeGraph',
  {mutateProperty: 'degree'}
)
// page_rank_feature_query
CALL gds.pageRank.mutate(
  'nodeGraph',
  {mutateProperty: 'pageRank'}
)
CALL gds.beta.graphSage.train(
  'nodeGraph',
  {
    modelName: 'graphSageModel',
    aggregator: 'mean',
    batchSize: 32,
    activationFunction: 'sigmoid',
    epochs: 15,
    searchDepth: 2,
    sampleSizes: [10, 5],
    learningRate: 0.1,
    embeddingDimension: 64,
    featureProperties: ['degree', 'pageRank'],
    projectedFeatureDimension: 2,
    randomSeed: 46,
    concurrency: 4
  }
)
YIELD modelInfo AS info
RETURN
  info.modelName AS modelName,
  info.metrics.didConverge AS didConverge,
  info.metrics.ranEpochs AS ranEpochs,
  info.metrics.epochLosses AS epochLosses
Is there anything I can do to make training actually use the 4 cores I set with concurrency: 4?