Node relationships are not being displayed as I expect

ceiag
Node

Hi Neo4j community,

Following on from some previous advice (thank you @Rcolinp  ) I was able to import the following ontology file (Thesaurus_22.06d.OWL (https://evs.nci.nih.gov/ftp1/NCI_Thesaurus/Thesaurus_22.06d.OWL.zip)) into a neo4j knowledge graph. 

Although it appears all triples have been parsed and imported, the relationships displayed are not what I expected. I believe all the data is there, which leads me to think the cause is how I configured the graph before importing the data.

FWIW I'm looking to emulate something along the lines of the following

https://ncithesaurus.nci.nih.gov/ncitbrowser/ajax?action=view_graph&scheme=NCI_Thesaurus&version=22....

This is what I see after running the following query


MATCH (alpelisib)
WHERE single(x IN  alpelisib.ns2__NHC0 WHERE x = "C94214")

RETURN alpelisib

ceiag_0-1662562376791.png

As you can see, nodes appear correctly but relationships do not.

Any pointers as to where I may be going wrong would be greatly appreciated. 

Thanks in advance.

Chris

1 ACCEPTED SOLUTION

Rcolinp
Ninja

Hi @ceiag

Thanks for providing the graph view of your example from NCIt. I'll use that as my reference for what you are looking for when importing the Thesaurus.owl OWL ontology.
 
Your hunch that the graph configuration you set prior to import is causing the issues illustrated above is correct, but it isn't the only reason the naming in the NCIt graph view differs from what you are seeing in Neo4j. What you are seeing in Neo4j with the current configuration is actually the raw OWL ontology, except that the URIs have been shortened and prefixed with nsX__. This is a result of Neo4j applying the default value of the handleVocabUris parameter (see more here --> Configuring Neo4j to use RDF data).

The NCIt Graph View on the other hand is a modified/transformed graph visualization that is surfacing rdfs:label for the associations contained in this ontology rather than the URI or shortened URI (in Neo4j we are seeing the shortened). Reference NCI Thesaurus documentation regarding how the metadata within this ontology translates to "human readable language". (Thesaurus.owl metadata documentation)
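
The label that the NCIt view surfaces is still present in the Neo4j graph, only as a property on the node that represents the annotation property rather than as the relationship type. A minimal sketch for looking it up, assuming the default "SHORTEN" behaviour (so that rdfs:label is stored as rdfs__label) and that the A31 definition was imported as a Resource node:

```cypher
// Look up the node for the A31 annotation property and show its label.
// rdfs__label is an assumed property name; it depends on handleVocabUris being "SHORTEN".
MATCH (p:Resource {uri: "http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#A31"})
RETURN p.uri AS uri, p.rdfs__label AS label;
```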

With that said, there is a way to perform this transformation within Neo4j! No worries! But first, let's take a quick look at your current graphConfig:

In the current graphConfig, handleMultival has been set to "ARRAY". This instructs Neo4j to store property values as arrays. If we don't also provide a list of property URIs as multivalPropList (within the graphConfig), all properties are stored as arrays, including single-valued properties for which an array makes no sense. So if handleMultival needs to be set to "ARRAY", you should also specify multivalPropList within the graphConfig. This isn't the reason you are seeing ns2__A31 rather than Has_GDC_Value, but it does mean every node property value is stored as an array when not all of them should be.
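
As a quick check, the configuration currently in effect can be listed with the n10s.graphconfig.show procedure. This is a minimal sketch, assuming the neosemantics plugin is installed and a graphConfig has already been initialised in the database:

```cypher
// List the active neosemantics graph configuration, one row per parameter.
CALL n10s.graphconfig.show();
```

With the configuration from this thread, the handleMultival row would be expected to read ARRAY and the handleVocabUris row SHORTEN.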

Easy Initial Solution to Node Properties as Array Problem: 

Change your graphConfig to either omit handleMultival entirely (if the ontology doesn't contain any multi-valued properties, or you don't care about them) OR specify exactly which multi-valued properties should be stored as arrays via multivalPropList within your graphConfig. Take a look below:

graphConfig:

 

 

CALL n10s.graphconfig.init( { handleRDFTypes: "LABELS_AND_NODES" } );
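
For the second option, a sketch of a graphConfig that keeps "ARRAY" but restricts it to a named list of predicates might look like the following. The choice of P90 is only an illustration taken from the OWL snippet in this thread; which predicates are actually multi-valued is an assumption to check against the file. The graph configuration is normally set on an empty database, before the import.

```cypher
// Store only the listed predicates as arrays; all other properties stay single-valued.
// The P90 URI is an illustrative assumption, not a verified list of multi-valued properties.
CALL n10s.graphconfig.init( {
  handleMultival: "ARRAY",
  multivalPropList: ["http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#P90"],
  handleRDFTypes: "LABELS_AND_NODES"
} );
```

Note that omitting handleMultival falls back to the default "OVERWRITE", which keeps a single value per property, so any predicate that should retain several values needs to be in the list.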

 

 

Cypher Statement to Review:

 

 

MATCH (alpelisib)-[r:ns2__A32]->(pharmSub)
WHERE alpelisib.ns2__NHC0 = "C94214"
RETURN alpelisib, r, pharmSub;

 

 

Result Vis:

A31_A32_image.png

 
Note that this is 100% correct based on the OWL file:

 

 

<!-- http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#A31 -->

<owl:AnnotationProperty rdf:about="http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#A31">
    <NHC0>A31</NHC0>
    <P106>Conceptual Entity</P106>
    <P108>Has_GDC_Value</P108>
    <P90>Has_GDC_Value</P90>
    <P97>An association that connects a concept representing a GDC property to its dedicated permissible value concept(s).</P97>
    <rdfs:label>Has_GDC_Value</rdfs:label>
    <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#anyURI"/>
</owl:AnnotationProperty>

 

 

We can see that http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#A31 has been shortened by Neo4j to ns2__A31. This is expected, because handleVocabUris was not set in the graphConfig and so took its default value, "SHORTEN".
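
To confirm which namespace the generated ns2 prefix stands for, the prefix definitions in use can be listed. A minimal sketch, assuming the import from this thread has already run:

```cypher
// List every namespace prefix known to neosemantics in this database.
CALL n10s.nsprefixes.list();
```

If the reading above is right, ns2 should appear next to http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#.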

As we can see in the OWL snippet, the AnnotationProperty has an rdfs:label value of Has_GDC_Value, but on import with this graphConfig, Neo4j simply shortens the URI of the predicate and uses that as the relationship type. If you'd like to change the names of these relationship types (it sounds like you do, since you want to mirror the NCIt graph view), refer to Mapping Graph Models - Neosemantics (4.3). It walks you through setting the graph configuration so that you can use other neosemantics (n10s) procedures to add namespace prefix definitions and map individual elements of the ontology to the names used in the NCIt graph view.

To help you get going, I have provided the required steps below too 😃.
(Please note: You'll have to add each relationshipType as a distinct mapping using n10s.mapping.add()).

Solution To Get You Started: 

 

 

// Create Uniqueness Constraint
CREATE CONSTRAINT n10s_unique_uri ON (r:Resource) ASSERT r.uri IS UNIQUE;
// Create GraphConfig --> handleVocabUris must be set to "MAP" so that the mappings defined below are applied on import and Neo4j can mirror the NCIt Graph View
CALL n10s.graphconfig.init( {
  handleVocabUris: "MAP"
});
// Create Prefix Definitions (using addFromText procedure from n10s)
CALL n10s.nsprefixes.addFromText('
<rdf:RDF xmlns="http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#"
xml:base="http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xmlns:oboInOwl="http://www.geneontology.org/formats/oboInOwl#"
xmlns:Thesaurus="http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#"
xmlns:xml="http://www.w3.org/XML/1998/namespace"
xmlns:protege="http://protege.stanford.edu/plugins/owl/protege#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:dc="http://purl.org/dc/elements/1.1/">
');
// Create Mapping from http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#A31 to Has_GDC_Value
CALL n10s.mapping.add("http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#A31", "Has_GDC_Value");
// Create Mapping from http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#A32 to Is_Value_For_GDC_Property
CALL n10s.mapping.add("http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#A32", "Is_Value_For_GDC_Property");
// Add all mappings...
// Lastly... Import Thesaurus.owl
CALL n10s.rdf.import.fetch('file:///var/lib/neo4j/import/Thesaurus.owl', 'RDF/XML');
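
Since every relationship type needs its own mapping, the individual n10s.mapping.add calls above could also be driven from a list. The following is a sketch only: it would run after the prefix definitions and before the import, the two pairs are the ones from this thread, and the YIELD column names (schemaElement, elemName) are my recollection of what the procedure returns in n10s 4.x, so treat them as an assumption to verify.

```cypher
// Register several URI-to-name mappings in one statement.
// Extend the list with the remaining associations from the ontology.
UNWIND [
  {uri: "http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#A31", name: "Has_GDC_Value"},
  {uri: "http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#A32", name: "Is_Value_For_GDC_Property"}
] AS m
CALL n10s.mapping.add(m.uri, m.name) YIELD schemaElement, elemName
RETURN schemaElement, elemName;
```

Running CALL n10s.mapping.list(); afterwards is one way to review the registered mappings before starting the import.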

 

 

Now we can query the graph & see the transformation:

 

 

MATCH (x)-[r:Is_Value_For_GDC_Property]->(y)
WHERE x.NHC0 = 'C94214'
RETURN x, r, y;
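
To spot associations that still lack a mapping, one option is to list the relationship types that still look like raw NCIt codes. This sketch assumes that, under "MAP", unmapped predicates are stored under their local names (for example A33 or R51); the pattern is a guess at that naming and may need adjusting.

```cypher
// List relationship types that still look like raw codes such as A33 or R51.
CALL db.relationshipTypes() YIELD relationshipType
WHERE relationshipType =~ "[AR][0-9]+"
RETURN relationshipType
ORDER BY relationshipType;
```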

 

 


Desired Result!:

Transformed_Result.png

 
I hope this is of help to you! Feel free to ping back if you need more help!

Best,
Rob


9 REPLIES 9

glilienfield
Ninja

What do you mean by the relationships not appearing correctly? One thing to keep in mind is that Neo4j Browser by default shows all relationships connecting the nodes returned by a query, not only those the query matched. This may be misleading when you are looking for specific relationships between nodes. You can turn this off in the settings panel of the browser view, via the 'connect result nodes' checkbox.

Screen Shot 2022-09-07 at 11.08.41 AM.png

Hi,

Thanks for the reply.

Take for example the Relationships between the node Alpelisib &  Pharmacologic Substance. I would like to see the following.

ceiag_0-1662563713751.png

Note the 'Has_GDC_Value' and 'Is_Value_For_GDC_Property'. What I see in the Neo4j implementation is the following:

ceiag_1-1662563860670.png

I hope that makes sense.

Regards

 

Chris

 

The values shown on relationships and nodes are set in the browser. It looks like you want to display the relationship type, while you are showing a property of the relationship. You can change the visual style of a node label or relationship by clicking on either of them in the browser and setting the color, the property to show and, for relationships, the width.

Click on a relationship in the browser. That will show the relationship properties on the right. Click on the relationship button, which will bring up a box as shown below. Select 'type' to show the relationship type in the graph.

Screen Shot 2022-09-07 at 11.28.37 AM.png

Hi,

That is something I did previously try, but unfortunately to no avail. 

ceiag_0-1662629157195.png

See below for the corresponding  snippet from the ontology file

    <!-- http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#A31 -->

    <owl:AnnotationProperty rdf:about="http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#A31">
        <NHC0>A31</NHC0>
        <P106>Conceptual Entity</P106>
        <P108>Has_GDC_Value</P108>
        <P90>Has_GDC_Value</P90>
        <P97>An association that connects a concept representing a GDC property to its dedicated permissible value concept(s).</P97>
        <rdfs:label>Has_GDC_Value</rdfs:label>
        <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#anyURI"/>
    </owl:AnnotationProperty>

 

My mistake, you indicated in a previous reply that the values you want are properties. From the screenshot I can see that the relationship you selected does not have any properties, since the only options for display are the relationship's type and its internal 'id'. You may want to revisit how you imported the data, so you can include the properties you need.
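
Independently of how the browser renders things, a query can show what each relationship actually carries. This is a minimal sketch, assuming the import configuration from the original post, where ns2__NHC0 is stored as an array, and the Resource label that neosemantics puts on imported nodes:

```cypher
// Show the type and properties of each outgoing relationship of the Alpelisib node.
// ns2__NHC0 is treated as a list here; with a non-array config, use = instead of IN.
MATCH (a:Resource)-[r]->(b:Resource)
WHERE "C94214" IN a.ns2__NHC0
RETURN type(r) AS relationshipType, properties(r) AS relationshipProperties, b.uri AS target
LIMIT 25;
```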

Do you want to post the script and the data file to see if we can find the issue? 

Hi,

Apologies for any confusion caused, I'm relatively new to the world of knowledge graphs and neo4j. Really appreciate your continued assistance.

Data file is available at the following location - https://evs.nci.nih.gov/ftp1/NCI_Thesaurus/Thesaurus_22.06d.OWL.zip 

The scripts I used to upload the data are as follows:

CREATE CONSTRAINT n10s_unique_uri ON (r:Resource)
ASSERT r.uri IS UNIQUE;

CALL n10s.graphconfig.init( { handleMultival: "ARRAY", handleRDFTypes: "LABELS_AND_NODES" } );

CALL n10s.rdf.import.fetch("file:///var/lib/neo4j/import/Thesaurus.owl", "RDF/XML");


Hey @Rcolinp,

Thanks so much for this. I'm just working through it now, but it's exactly what I'm looking for.

Once again thanks for sharing your expertise, it's greatly appreciated. 

Chris,

Hey @Rcolinp 

I have been exploring the data a bit more closely and have a further question, if you don't mind me asking. At https://evsexplore.nci.nih.gov/evsexplore/alldocs you will notice there is a 'Roles' section. Is there a way to make these roles appear as relationships?

For example 

C30168---->gene_product_is_physical_part_of*---------->C17270

*Role - gene_product_is_physical_part_of = #R51

Relevant lines in the .owl file

C30168: Phosphatidylinositol 4,5-Bisphosphate 3-Kinase Catalytic Subunit Alpha Isoform
    <!-- http://purl.obolibrary.org/obo/NCIT_C30168 -->
    <owl:Class rdf:about="http://purl.obolibrary.org/obo/NCIT_C30168">
        <owl:equivalentClass>
            <owl:Class>
                <owl:intersectionOf rdf:parseType="Collection">
                    <rdf:Description rdf:about="http://purl.obolibrary.org/obo/NCIT_C16984"/>
….
                    <owl:Restriction>
                        <!-- R51: gene_product_is_physical_part_of/complex_has_physical_part -->
                        <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/NCIT_R51"/>
                        <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/NCIT_C17270"/>
                    </owl:Restriction>
…

 

I did have a play around with the mapping options, but can't quite figure it out. I understand that an 'owl:Restriction' is a special kind of class, so I do wonder if that is a factor?

Thanks

 

Chris