GDS results combination

yuzr1 · September 18, 2023, 9:35am

neo4j 5.5, gds 2.4.5

I have a GDS graph projection with several nodes, and I compute both node embeddings and centrality scores on it by running:

CALL gds.beta.graphSage.train(
    'Project101',
    {
        modelName:'Model101',
        featureProperties: ['label'],
        aggregator: 'mean',
        activationFunction: 'sigmoid',
        randomSeed: 1337,
        sampleSizes: [3, 3]
    }
)

CALL gds.beta.graphSage.stream(
    'Project101',
    {
        modelName: 'Model101'
    }
)
YIELD nodeId, embedding

which gave me the embedding for each node,

nodeId | embedding
101    | [0.02, 0.015, 0.01 ...]
102    | [0.03, 0.013, 0.04 ...]

and:

CALL gds.eigenvector.stream(
     'Project101'
)
YIELD nodeId, score

which gave me the centrality score for each node:

nodeId | score
101    | 0.95
102    | 0.90

Now I am trying to output a dataframe like

nodeId | score | embedding
101    | 0.95  | [0.02, 0.015, 0.01 ...]
102    | 0.90  | [0.03, 0.013, 0.04 ...]

by combining the two outputs above. How should I write the Cypher for this?

P.S. Please do not use write mode, and please keep the whole solution inside Neo4j, without any other coding platform.

florentin_dorre · September 18, 2023, 9:57am

I would suggest running both algorithms in mutate mode and then using gds.graph.nodeProperties.stream to combine the results.
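A minimal sketch of that suggestion, reusing the graph and model names from the question. The property names 'embedding' and 'score' are chosen here for illustration, and the statements are meant to be run one at a time:

```cypher
// Store the GraphSAGE embeddings in the in-memory graph only
// (mutate mode does not write anything to the database).
CALL gds.beta.graphSage.mutate(
    'Project101',
    {
        modelName: 'Model101',
        mutateProperty: 'embedding'   // assumed property name
    }
)
YIELD nodePropertiesWritten
RETURN nodePropertiesWritten;

// Store the eigenvector centrality scores in the same in-memory graph.
CALL gds.eigenvector.mutate(
    'Project101',
    {
        mutateProperty: 'score'       // assumed property name
    }
)
YIELD nodePropertiesWritten
RETURN nodePropertiesWritten;

// Stream both properties in one call.
// The result has one row per (node, property) pair.
CALL gds.graph.nodeProperties.stream(
    'Project101',
    ['score', 'embedding']
)
YIELD nodeId, nodeProperty, propertyValue
RETURN nodeId, nodeProperty, propertyValue;
```

The streamed result is in long format, so a further pivot step is still needed to get one row per node.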

In general I would advise you to look into our Python client for GDS if you want a pandas dataframe.

yuzr1 · September 19, 2023, 3:02am

Thank you for this cool combination method!
Using mutate mode and gds.graph.nodeProperties.stream, I now get

nodeId | nodeProperty | propertyValue
101    | "score"      | 0.95
101    | "embedding"  | [0.02, 0.015, 0.01 ...]
102    | "score"      | 0.90
102    | "embedding"  | [0.03, 0.013, 0.04 ...]

However, one more step is needed to reach the form I want:

nodeId | score | embedding
101    | 0.95  | [0.02, 0.015, 0.01 ...]
102    | 0.90  | [0.03, 0.013, 0.04 ...]

Sure, I could use pandas or another Python package to reshape it. However, my Neo4j server is much more powerful than my Python server, so I hope to do this on Neo4j only.
Is there any Cypher for this kind of dataframe transform?

yuzr1 · September 19, 2023, 9:44am

I figured out the following Cypher, which works:

CALL gds.graph.nodeProperties.stream(
    'Project101',
    ['embedding', 'score']
)
YIELD nodeId, nodeProperty, propertyValue
// Pivot the long (nodeId, nodeProperty, propertyValue) rows into one row per node.
// A CASE without ELSE yields null, and collect() skips nulls,
// so each list ends up holding only the value of the matching property.
WITH nodeId,
     collect(CASE WHEN nodeProperty = 'score' THEN propertyValue END) AS scores,
     collect(CASE WHEN nodeProperty = 'embedding' THEN propertyValue END) AS embeddings
RETURN nodeId, scores[0] AS score, embeddings[0] AS embedding
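The pivot step can be tried on its own, without GDS, on literal rows shaped like the streamed output. This is a minimal sketch: the rows are the sample values from this thread, with the embeddings cut to the three numbers shown, so they are illustrative only.

```cypher
// Literal stand-in for the rows yielded by gds.graph.nodeProperties.stream.
UNWIND [
    {nodeId: 101, nodeProperty: 'score',     propertyValue: 0.95},
    {nodeId: 101, nodeProperty: 'embedding', propertyValue: [0.02, 0.015, 0.01]},
    {nodeId: 102, nodeProperty: 'score',     propertyValue: 0.90},
    {nodeId: 102, nodeProperty: 'embedding', propertyValue: [0.03, 0.013, 0.04]}
] AS row
// Group by nodeId. Each CASE has no ELSE branch, so non-matching rows
// produce null, which collect() drops.
WITH row.nodeId AS nodeId,
     collect(CASE WHEN row.nodeProperty = 'score' THEN row.propertyValue END) AS scores,
     collect(CASE WHEN row.nodeProperty = 'embedding' THEN row.propertyValue END) AS embeddings
RETURN nodeId, scores[0] AS score, embeddings[0] AS embedding
ORDER BY nodeId
```

Because each CASE only lets through the rows of one property, the result does not depend on the order in which the properties are streamed.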

florentin_dorre · September 20, 2023, 9:31am

Glad you got something to run.

In the Python client we offer a parameter called separate_property_columns, where we do this transformation for you.
