Embeddings only for selected nodes.


Let me quickly describe my graph.

I have a graph that links every mammalian species through its taxonomic lineage. See the small example below for Hominoidea:


There are five organisms (HSA, PPS, PTR, GGO, PON) at the end of this lineage. Only organisms at the end of a lineage have the property kegg=kegg_genome_id. Each of these nodes has relationships to nodes of a different type, labelled KO (functional orthologs). See the example below for just two organisms. The same KO node can link to many mammalian organisms, such as elephant, human, or mouse (or even to all mammals).


This results in a network with 337 taxa nodes (111 of which are organisms), 12142 Ko nodes, and over 1,200,000 relationships.


Now I want to build a model that predicts, based on KO, whether a given species belongs to Euarchontoglires. Every organism node that is linked to Euarchontoglires has the property category=1. The rest of the organisms have the property category=0.

This was just an introduction.

What I want to know is how I can calculate node2vec ONLY for these organism nodes. We do not want to have embeddings for KO nodes.

I have a projected graph:


CALL gds.graph.project(
    'taxaKo',  // placeholder: the graph name is missing from the snippet as posted
    {
        taxa: { label: 'Taxa', properties: ['category'] },
        ko: { label: 'Ko' }
    },
    {
        RELS: { type: 'HAS_KO', orientation: 'UNDIRECTED' }
    }
)


I do not know how to write gds.beta.node2vec.write only for the nodes that I will later use for ML.


MATCH (n:Taxa) WHERE n.kegg IS NOT NULL RETURN n.category, n.n2v_all_nodes

Can you guide me?
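
One possible direction, offered as a sketch under assumptions and not as a confirmed answer: since the projection only contains HAS_KO relationships, organisms are connected to each other only through Ko nodes, so the random walks presumably still need the Ko nodes in the projected graph. The embeddings could then be computed for the whole projection in stream mode and kept only for the organism nodes. In the sketch below, 'taxaKo' is a hypothetical graph name standing in for whatever name the projection was given, and embeddingDimension: 64 is an arbitrary illustrative value.

```cypher
// Run node2vec over the whole projection (Taxa and Ko nodes),
// streaming the results instead of writing them for every node.
CALL gds.beta.node2vec.stream('taxaKo', {embeddingDimension: 64})
YIELD nodeId, embedding
// Map the internal id back to the database node.
WITH gds.util.asNode(nodeId) AS n, embedding
// Keep only organism nodes, i.e. Taxa nodes that carry a kegg property.
WHERE n:Taxa AND n.kegg IS NOT NULL
// Store the embedding only on those nodes.
SET n.n2v_all_nodes = embedding
RETURN n.kegg AS kegg, n.category AS category, n.n2v_all_nodes AS embedding
```

With this shape, Ko nodes take part in the walks but never receive an n2v_all_nodes property, so the MATCH query above would return embeddings for the organism nodes only.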