Embeddings only for selected nodes

Kronossos · August 4, 2022, 12:54pm

Let me quickly describe my graph.

I have a graph that links every mammalian species by its taxon. Below is a small example for Hominoidea:

There are five organisms (HSA, PPS, PTR, GGO, PON) at the end of this lineage. Only organisms at the end of a lineage have the property kegg=kegg_genome_id. Each of these nodes has relationships to a different node type labelled KO (functional orthologs). See the example below for just two organisms. The same KO node can link to many mammalian organisms, such as elephant, human or mouse (or even to all mammals).

This results in a network with 337 Taxa nodes (111 of which are organisms), 12,142 Ko nodes and over 1,200,000 relationships.

Now I want to build a model that predicts, based on KO, whether a given species belongs to Euarchontoglires. Every organism node that is linked to Euarchontoglires has the property category=1. The rest of the organisms have the property category=0.

This was just an introduction.

What I want to know is how I can calculate node2vec ONLY for these organism nodes. We do not want to have embeddings for KO nodes.

I have a projected graph:

graph.run("""
CALL gds.graph.project(
  'graph_info',
  {
    taxa: {
      label: 'Taxa',
      properties: ['category']
    },
    ko: {
      label: 'Ko'
    }
  },
  {
    RELS: {
      type: 'HAS_KO',
      orientation: 'UNDIRECTED'
    }
  }
)
""")

I do not know how to call gds.beta.node2vec.write so that it writes embeddings only for the nodes that I will later use for ML, which I then want to read back like this:

MATCH (n:Taxa) WHERE n.kegg is not null RETURN n.name, n.category, n.n2v_all_nodes

Can you guide me?
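One possible way to get there without changing the projection, as a rough sketch only: run node2vec in stream mode over the whole 'graph_info' projection and store the embedding only on Taxa nodes that have a kegg property. The KO nodes still take part in the random walks, but nothing is written to them. The helper name `write_organism_embeddings` and the embedding dimension of 64 are assumptions made for illustration; the property name n2v_all_nodes is taken from the query above, and `graph` is assumed to be a client exposing `run(query, params)`, such as a py2neo Graph.

```python
# Sketch only: stream node2vec over the whole projection, then keep the
# embeddings just for organism nodes (Taxa nodes that have a kegg property).
N2V_ORGANISMS_ONLY = """
CALL gds.beta.node2vec.stream($graph_name, {embeddingDimension: $dim})
YIELD nodeId, embedding
WITH gds.util.asNode(nodeId) AS n, embedding
WHERE n:Taxa AND n.kegg IS NOT NULL
SET n.n2v_all_nodes = embedding
RETURN count(n) AS organisms_written
"""


def write_organism_embeddings(graph, graph_name="graph_info", dim=64):
    """Run the query through any client exposing run(query, params).

    `dim` is an assumed value, not something taken from the post.
    """
    params = {"graph_name": graph_name, "dim": dim}
    return graph.run(N2V_ORGANISMS_ONLY, params)
```

With 111 organisms in the graph, the returned count would be expected to match that number if every organism has a kegg property.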

michael.hunger · August 5, 2022, 1:34pm

You could probably use a Cypher projection instead, where arbitrary filters and pattern matches determine which nodes and relationships are projected into the in-memory graph.

https://neo4j.com/docs/graph-data-science/current/graph-project-cypher/

https://neo4j.com/docs/graph-data-science/current/graph-project-cypher-aggregation/
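
As a hedged illustration of the Cypher projection idea, the sketch below keeps only organism Taxa nodes (those with a kegg property) and Ko nodes, dropping the intermediate taxa. It assumes HAS_KO is stored in the direction Taxa to Ko and returns every relationship in both directions, so that random walks can cross it either way. The graph name 'organisms_ko' and the helper `project_organisms_and_ko` are made up for the example. Ko nodes remain in the projection because they are the only links between organisms, so an algorithm run on it would still produce embeddings for them.

```python
# Sketch only: Cypher projection restricted to organism Taxa nodes and Ko nodes.
NODE_QUERY = """
MATCH (n) WHERE (n:Taxa AND n.kegg IS NOT NULL) OR n:Ko
RETURN id(n) AS id, labels(n) AS labels
"""

# Each HAS_KO link is emitted twice, once per direction, to mimic an
# undirected relationship (assumed stored direction: Taxa -> Ko).
REL_QUERY = """
MATCH (t:Taxa)-[:HAS_KO]->(k:Ko) WHERE t.kegg IS NOT NULL
RETURN id(t) AS source, id(k) AS target
UNION ALL
MATCH (t:Taxa)-[:HAS_KO]->(k:Ko) WHERE t.kegg IS NOT NULL
RETURN id(k) AS source, id(t) AS target
"""

PROJECT = "CALL gds.graph.project.cypher($name, $node_query, $rel_query)"


def project_organisms_and_ko(graph, name="organisms_ko"):
    """Create the filtered in-memory graph via a client exposing run(query, params)."""
    params = {"name": name, "node_query": NODE_QUERY, "rel_query": REL_QUERY}
    return graph.run(PROJECT, params)
```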
