Do relationships get automatically generated between nodes?

Kevin6482
Node Clone

I'm constructing a biomedical knowledge graph from data I collected from different open sources. All values in each node are unique, and there are no duplicate rows in the relationships (verified thoroughly).

These are my nodes: assays, cells, clinicals, compounds, disorders, drugs, foods, genes, metabolites, organisms, pathways, peptides, proteins, targets, therapeutics.

These are my relationships:  cell_FROM_species, clinical_IS_ASSOCIATED_disorder, clinical_IS_ASSOCIATED_drug, compound_IS_ASSOCIATED_protein, drug_CAUSES_disorder, drug_INTERACTS_target, food_IS_ASSOCIATED_compound, metabolite_IS_ASSOCIATED_pathway, peptide_TESTED_IN_assay, peptide_BINDS_TO_protein, peptide_IS_ASSOCIATED_therapeutics, protein_IS_ASSOCIATED_disorder, protein_IS_ASSOCIATED_gene, protein_COMES_FROM_organism, protein_IS_EXPRESSED_IN_pathway.

I imported the data with neo4j-admin using the command below (it is a long one, so only a sample is shown):

 

C:/Users/mypc/.Neo4jDesktop/relate-data/dbmss/bin/neo4j-admin import --database=db1 --nodes=import/assays.csv --nodes=import/cells.csv --nodes=import/clinicals.csv --………………………………………. --relationships=import/ cell_FROM_species.csv --relationships=import/ clinical_IS_ASSOCIATED_disorder.csv …………………………………………………………………………--multiline-fields=true 

 

I ended up with this schema, in which I can see some new relationships that have been created between nodes (example: new_schema.png):

  1. peptide IS_ASSOCIATED with compound, which I never specified.
  2. protein IS_ASSOCIATED with compound, but I specified the opposite direction: compound IS_ASSOCIATED with protein.
  3. compound IS_ASSOCIATED with compound (the same node type), which I also never specified.

Can someone point out where I'm going wrong? Thanks in advance.




 

#neo4j-admin #relationships

1 ACCEPTED SOLUTION

glilienfield
Ninja

Did you use 'db.schema.visualization' to get the schema? I recall helping someone months ago whose schema did not accurately represent their data, and I believe someone else in the community mentioned that there is a known issue with this procedure. The relationships did not actually exist in that person's data. I suggest you query your data to verify whether those relationships really exist. Something like this for each relationship you don't expect:

 

 

return exists( (:Peptide)-[:IS_ASSOCIATED]->(:Compound) )
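The three unexpected relationships from the question could also be checked in a single pass. This is only a sketch: the labels `Peptide`, `Protein`, and `Compound` and the relationship type `IS_ASSOCIATED` are assumptions about how the CSV headers were defined and may need to be adjusted to the actual data.

```cypher
// Assumed labels and relationship type; adjust to match your import headers.
// OPTIONAL MATCH keeps one row per pattern even when nothing matches,
// so a missing relationship shows up with a count of 0.
OPTIONAL MATCH (:Peptide)-[r:IS_ASSOCIATED]->(:Compound)
RETURN 'Peptide->Compound' AS pattern, count(r) AS total
UNION ALL
OPTIONAL MATCH (:Protein)-[r:IS_ASSOCIATED]->(:Compound)
RETURN 'Protein->Compound' AS pattern, count(r) AS total
UNION ALL
OPTIONAL MATCH (:Compound)-[r:IS_ASSOCIATED]->(:Compound)
RETURN 'Compound->Compound' AS pattern, count(r) AS total
```

A total of 0 for a pattern means no such relationship is stored, regardless of what the schema view displays.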

 

 

The following query should give you an inventory of your relationships. I assumed that your data model has only one label per node.

 

match(n)-[r]->(m)
return labels(n)[0] as `start node`, type(r) as `relationship type`, labels(m)[0] as `end node`, count(*) as count
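If some nodes carry more than one label, `labels(n)[0]` reports only the first of them. A possible variant, offered as a sketch rather than a tested fix, groups by the full label lists instead (the column aliases are made up for illustration):

```cypher
// Variant of the inventory query for nodes that may have several labels:
// groups by the complete label list of the start and end node.
MATCH (n)-[r]->(m)
RETURN labels(n) AS `start labels`,
       type(r)   AS `relationship type`,
       labels(m) AS `end labels`,
       count(*)  AS total
ORDER BY total DESC
```

Each row then corresponds to one combination of start labels, relationship type, and end labels that actually exists in the data.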

 


2 REPLIES 2


Thanks for your response. Yes, I used 'db.schema.visualization' to get this schema. I ran your Cypher queries and found that those relationships do not actually exist, but I don't know why the schema was showing them. I also found that this issue was fixed with the apoc version.