apoc.import.graphml in 4.3 and above ignores relationship/edge properties

ibaldin
Node

Hello,

I'm trying to debug an issue with APOC. I use GraphML to move graphs between the NetworkX library and Neo4j. With 4.1.6/APOC 4.1.0.0 and below things worked as expected; starting with 4.3.15/APOC 4.3.0.4 I see problems.

Here is the issue: NetworkX doesn't understand 'labels' as far as GraphML is concerned, so when exporting from NetworkX I make sure each relationship has a custom 'Class' property set, so that when I load it into Neo4j it becomes the label of the relationship. This works fine in earlier versions. In the newer version the first data key entry in the edge is ignored. So if the only data key entry I have is 'Class', the relationship just gets the default label RELATED and no other properties. If I add a second key or more, they start showing up on the relationship. Here is a simple file that reproduces the problem:

<graphml xmlns="http://graphml.graphdrawing.org/xmlns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
<key id="d10" for="edge" attr.name="label" attr.type="string" />
<key id="d9" for="edge" attr.name="Class" attr.type="string" />
<key id="d8" for="node" attr.name="Details" attr.type="string" />
<key id="d7" for="node" attr.name="Model" attr.type="string" />
<key id="d6" for="node" attr.name="Site" attr.type="string" />
<key id="d5" for="node" attr.name="StitchNode" attr.type="string" />
<key id="d4" for="node" attr.name="Type" attr.type="string" />
<key id="d3" for="node" attr.name="Name" attr.type="string" />
<key id="d2" for="node" attr.name="NodeID" attr.type="string" />
<key id="d1" for="node" attr.name="Class" attr.type="string" />
<key id="d0" for="node" attr.name="GraphID" attr.type="string" />
<key id="roles" for="edge" attr.name="roles"/>
<graph edgedefault="undirected">
<node id="1">
<data key="d0">bbfdd8d0-d8bd-4af5-9574-c4bef3e9a7b5</data>
<data key="d1">NetworkNode</data>
<data key="d2">225f3c62-1cb8-4dd3-ba19-5e8b149ebd87</data>
<data key="d3">NodeA</data>
<data key="d4">VM</data>
<data key="d5">false</data>
<data key="d6">RENC</data>
</node>
<node id="2">
<data key="d0">bbfdd8d0-d8bd-4af5-9574-c4bef3e9a7b5</data>
<data key="d1">Component</data>
<data key="d2">9c9dfe11-94ff-4db7-a5e9-52d556515d6c</data>
<data key="d3">gpu1</data>
<data key="d4">GPU</data>
<data key="d7">RTX6000</data>
<data key="d8">NVIDIA Corporation TU102GL [Quadro RTX 6000/8000] (rev a1)</data>
<data key="d5">false</data>
</node>
<!--
<edge source="1" target="2"><data key="d10">FIMREL</data><data key="d9">has</data></edge>
-->
<edge source="1" target="2"><data key="d9">has</data></edge>
</graph>
</graphml>

If you use the file as is with version 4.1.x, the Class property is automatically transferred to the relationship label, so the relationship becomes 'has'.

On 4.3 and above (I have not checked 4.2) this doesn't happen (the relationship stays 'RELATED'), and the Class property doesn't even show up on the relationship as a regular property. If you add another key in front of it (like the commented-out line), you can see it, but it seems the first key is always ignored.
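
To rule out the exported file itself, the edge properties can be read back independently of APOC. The following is a minimal sketch using only the Python standard library; the function name `edge_properties` and the trimmed `SAMPLE` document are made up for illustration and are not part of NetworkX or APOC.

```python
import xml.etree.ElementTree as ET

# GraphML default namespace, as declared in the file above.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def edge_properties(graphml_text):
    """Return one dict per <edge> with its endpoints, label attribute and data values."""
    root = ET.fromstring(graphml_text)
    # Map edge key ids (e.g. "d9") to their declared attr.name (e.g. "Class").
    names = {
        key.get("id"): key.get("attr.name")
        for key in root.findall("g:key", NS)
        if key.get("for") == "edge"
    }
    edges = []
    for edge in root.iterfind(".//g:edge", NS):
        props = {
            names.get(data.get("key"), data.get("key")): data.text
            for data in edge.findall("g:data", NS)
        }
        edges.append({
            "source": edge.get("source"),
            "target": edge.get("target"),
            "label": edge.get("label"),  # None when the attribute is absent
            "properties": props,
        })
    return edges


# Trimmed-down version of the reproduction file (edge keys and one edge only).
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="d10" for="edge" attr.name="label" attr.type="string" />
  <key id="d9" for="edge" attr.name="Class" attr.type="string" />
  <graph edgedefault="undirected">
    <node id="1"/>
    <node id="2"/>
    <edge source="1" target="2"><data key="d9">has</data></edge>
  </graph>
</graphml>"""

if __name__ == "__main__":
    for edge in edge_properties(SAMPLE):
        print(edge)
```

If the 'Class' value shows up here but not on the imported relationship, the property is being dropped on the import side rather than on export.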

Thoughts? Suggestions? Is this a bug or expected behavior?

3 REPLIES

koji
Ninja

Hi @ibaldin 

I tried various versions.
Even with the same Neo4j 4.1.6, the results differed depending on the APOC version.

Yes Class has, Neo4j 4.1.6, APOC 4.1.0.0
No Class, Neo4j 4.1.6, APOC 4.1.0.11
No Class, Neo4j 4.3.15, APOC 4.3.0.6
No Class, Neo4j 4.4.8, APOC 4.4.0.7

ibaldin
Node

I can deal with no support for 'Class'. The bigger problem is that the first key in the edge's property list is lost, no matter what its name is. As I mentioned above, if an edge is specified like so:

<edge source="1" target="2"><data key="d10">FIMREL</data><data key="d9">has</data></edge>

then only the property corresponding to key 'd9' is visible on the edge. If the edge is specified like so:

<edge source="1" target="2"><data key="d9">has</data></edge>

then there are no properties on the edge.

If the 'label' attribute is present on the edge, then the key/value pairs appear to be read properly. However, I can't rely on that, as the NetworkX GraphML export doesn't put a 'label' attribute on edges.
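
One possible stopgap, based only on the observation above, is to post-process the NetworkX output and copy the 'Class' value into a 'label' attribute on each edge before importing. This is a sketch using the Python standard library; the function name `add_edge_labels` and the trimmed `SAMPLE` are hypothetical, and whether APOC then uses the attribute as the relationship type in a given version would still need to be checked.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"
NS = {"g": GRAPHML_NS}


def add_edge_labels(graphml_text, prop_name="Class"):
    """Copy the value of the edge property `prop_name` into a 'label' attribute."""
    # Keep the GraphML namespace as the default (un-prefixed) one on output.
    ET.register_namespace("", GRAPHML_NS)
    root = ET.fromstring(graphml_text)
    # Ids of the edge keys whose attr.name matches prop_name (e.g. {"d9"}).
    key_ids = {
        key.get("id")
        for key in root.findall("g:key", NS)
        if key.get("for") == "edge" and key.get("attr.name") == prop_name
    }
    for edge in root.iterfind(".//g:edge", NS):
        if edge.get("label") is not None:
            continue  # leave edges that already carry a label untouched
        for data in edge.findall("g:data", NS):
            if data.get("key") in key_ids and data.text:
                edge.set("label", data.text)
                break
    return ET.tostring(root, encoding="unicode")


# Trimmed-down version of the reproduction file (edge keys and one edge only).
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="d9" for="edge" attr.name="Class" attr.type="string" />
  <graph edgedefault="undirected">
    <node id="1"/>
    <node id="2"/>
    <edge source="1" target="2"><data key="d9">has</data></edge>
  </graph>
</graphml>"""

if __name__ == "__main__":
    print(add_edge_labels(SAMPLE))
```

The original data entries are left in place, so nothing is lost if the importer starts reading them correctly again.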

If the key/value property list were faithfully reproduced by apoc.import.graphml, I could use other APOC functions (apoc.create.setRelProperty) to force one of them to be the label.

ibaldin
Node

Should I file a ticket for this and is there hope of it getting fixed?