I have created a database:
I created nodes for monthly precipitation (with properties such as the node name and the precipitation amount), for monthly temperatures (with their own properties), and for the hydrothermal coefficient, whose properties are the cht value and the type of drought:
MERGE(r2c:YRC{name:'Rainfall 2002 Center', Rainfall:604}); . . .
MERGE (t5c:YTC{name:'Temperature 2005 Center', temp:10.5}); . . .
MERGE (y5C:YCHTC{name:'CHT 2005 Center', cht:1, drought:0}); . . .
MERGE(soy05:Productivity{name:'Productivity 2005', harvest:18}); . . .
Relationships are created between the nodes:
// :YRC -> :YCHTC
MATCH (r), (c) WHERE id(r) = 652 AND id(c) = 611 MERGE (r)-[rel:DETERMINE]->(c);
// :YTC -> :YCHTC
MATCH (t), (c) WHERE id(t) = 672 AND id(c) = 611 MERGE (t)-[rel:DETERMINE]->(c);
// :YCHTC -> :Productivity
MATCH (c), (p) WHERE id(c) = 611 AND id(p) = 631 MERGE (c)-[rel:DETERMINE]->(p);
. . .
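The same kind of relationship can also be written by matching on label and name rather than on internal ids, which differ from one database to another. A minimal sketch, assuming purely for illustration that the 'Temperature 2005 Center' node is the one linked to 'CHT 2005 Center' (the ids above do not confirm which nodes are paired):

```cypher
// Hypothetical pairing: the names come from the MERGE statements above,
// but which nodes ids 672 and 611 actually refer to is an assumption.
MATCH (t:YTC {name: 'Temperature 2005 Center'}),
      (c:YCHTC {name: 'CHT 2005 Center'})
MERGE (t)-[:DETERMINE]->(c);
```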
I then developed a prediction model that uses this data to predict drought levels for future periods.
a) Creating a pipeline for training:
CALL gds.beta.pipeline.nodeClassification.create('pipe')
Configuring the pipeline
b) Adding a node property step to the pipeline. Here, the input graph contains a cht property:
CALL gds.beta.pipeline.nodeClassification.addNodeProperty('pipe', 'alpha.scaleProperties', { nodeProperties: 'cht', scaler: 'L1Norm', mutateProperty:'scaledSizes' }) YIELD name, nodePropertySteps
c) Selecting features for the pipeline:
CALL gds.beta.pipeline.nodeClassification.selectFeatures('pipe', ['scaledSizes', 'cht']) YIELD name, featureProperties
// Adding a logistic regression model with default configuration:
CALL gds.beta.pipeline.nodeClassification.addLogisticRegression('pipe') YIELD parameterSpace;
// Adding a random forest model:
CALL gds.alpha.pipeline.nodeClassification.addRandomForest('pipe', {numberOfDecisionTrees: 5}) YIELD parameterSpace;
// Adding a multi-layer perceptron model with weighted focal loss:
CALL gds.alpha.pipeline.nodeClassification.addMLP('pipe', {classWeights: [0.4,0.3,0.3], focusWeight: 0.5}) YIELD parameterSpace;
// Adding a logistic regression model with an interval parameter:
CALL gds.beta.pipeline.nodeClassification.addLogisticRegression('pipe', {maxEpochs: 500, penalty: {range: [1e-4, 1e2]}}) YIELD parameterSpace RETURN parameterSpace.RandomForest AS randomForestSpace, parameterSpace.LogisticRegression AS logisticRegressionSpace, parameterSpace.MultilayerPerceptron AS MultilayerPerceptronSpace;
// Configuring auto-tuning:
CALL gds.alpha.pipeline.nodeClassification.configureAutoTuning('pipe', { maxTrials: 2 }) YIELD autoTuningConfig
Training the pipeline
The following statement will project a graph using a native projection and store it in the graph catalog under the name "dtGraph".
CALL gds.graph.project('dtGraph', { YCHTC: { properties: ['cht', 'drought'] }, YCHTCp: { properties: 'cht' } }, '*' )
I got the following error:
Failed to invoke procedure gds.graph.project: Caused by: java.lang.UnsupportedOperationException: Cannot safely convert 0.80 into an long value.
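One possible reading of this error is that the cht property is stored as an integer on some nodes (for example cht:1 in the MERGE above) and as a float on others (such as 0.80), so the projection cannot settle on a single type. This is an assumption; the message itself does not name the property. A sketch of a check, relying on toString() rendering floats with a decimal point:

```cypher
// Count integer-valued vs. float-valued cht properties per label.
// Heuristic: toString(1) gives '1', while toString(1.0) gives '1.0'.
MATCH (n)
WHERE (n:YCHTC OR n:YCHTCp) AND n.cht IS NOT NULL
RETURN labels(n) AS labels,
       toString(n.cht) CONTAINS '.' AS storedAsFloat,
       count(*) AS nodes;

// If both true and false rows come back, converting every value to a
// float before projecting may avoid the mismatch.
// Note: this rewrites the stored data.
MATCH (n)
WHERE (n:YCHTC OR n:YCHTCp) AND n.cht IS NOT NULL
SET n.cht = toFloat(n.cht);
```

If the first query returns only one kind of row, the mixed-type explanation does not apply and the cause lies elsewhere.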