
Embeddings only for selected nodes.


Let me quickly describe my graph.

I have a graph that links every mammalian species to its taxon. See the small example below for Hominoidea:


There are five organisms (HSA, PPS, PTR, GGO, PON) at the end of this lineage. Only organisms at the end of a lineage have the property kegg=kegg_genome_id. Each of these nodes has relationships to nodes of a different type, labelled KO (functional orthologs). See the example below for just two organisms. The same KO node can link to many mammalian organisms, such as elephant, human, or mouse (or even to all mammals).


This results in a network with 337 taxa nodes (111 of which are organisms), 12142 Ko nodes, and over 1,200,000 relationships.


Now I want to build a model that predicts, based on KO, whether a given species belongs to Euarchontoglires. Every organism node that is linked to Euarchontoglires has the property category=1. The rest of the organisms have the property category=0.

This was just an introduction.

What I want to know is how I can calculate node2vec ONLY for these organism nodes. We do not want to have embeddings for KO nodes.

I have a projected graph:


CALL gds.graph.project(
  'myGraph',  // graph name; placeholder here
  {
    taxa: {
      label: 'Taxa',
      properties: ['category']
    },
    ko: {
      label: 'Ko'
    }
  },
  {
    RELS: {
      type: 'HAS_KO',
      orientation: 'UNDIRECTED'
    }
  }
)



I do not know how to run gds.beta.node2vec.write only for the nodes that I will later use for ML.


MATCH (n:Taxa) WHERE n.kegg IS NOT NULL RETURN n.category, n.n2v_all_nodes
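
The closest thing I can think of is to compute the embeddings over the whole projection (so the random walks can still pass through the Ko nodes) and then keep only the organism rows. I am not sure this is the intended way, so treat it as a rough sketch. The graph name 'myGraph' and the embeddingDimension value are placeholders of mine, not settings I have tuned:

```cypher
// Stream node2vec embeddings for every node in the projection,
// then keep only the Taxa nodes that carry a kegg property (the organisms).
CALL gds.beta.node2vec.stream('myGraph', {
  embeddingDimension: 64  // placeholder value
})
YIELD nodeId, embedding
WITH gds.util.asNode(nodeId) AS n, embedding
WHERE n:Taxa AND n.kegg IS NOT NULL
RETURN n.kegg AS kegg, n.category AS category, embedding
```

With this, the Ko nodes still take part in the walks but no embedding is stored or returned for them. As far as I understand, restricting the algorithm itself to the Taxa label would leave no HAS_KO relationships to walk over, which is why the filtering happens after the stream.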

Can you guide me?


