Error creating Cypher GDS graph projection when including list properties for nodes

Hi,

I'm trying to create a graph projection using a list of node IDs (generated by applying a train-test split algorithm for the purposes of graphSAGE model training). Neo4j Desktop version is 1.4.5 and Neo4j database version is 4.2.6, with Graph Data Science plugin v1.5.2. Here's the query I'm using:

CALL gds.graph.create.cypher( 
    'train_citations', 
    'MATCH (n:Publication) WHERE ID(n) IN $ids RETURN ID(n) as id, n.feature_vector AS features, n.label_vector AS labels', 
    'MATCH (n:Publication)-[:CITES]-(m:Publication) WHERE ID(n) IN $ids AND ID(m) IN $ids RETURN ID(n) AS source, ID(m) AS target', 
    {parameters: { ids: [3501, 44757, 17332, 29883, 76708] }})
YIELD graphName, nodeCount, relationshipCount 
RETURN graphName, nodeCount, relationshipCount

I'm running into an error that suggests GDS isn't properly interpreting my Cypher queries:

Failed to invoke procedure `gds.graph.create.cypher`: Caused by: java.lang.IllegalArgumentException: Type of column `labels` should be of type List, but was `[D@32efdba4`

Note that I'm able to run the node query that seems to be the source of the problem by itself in Neo4j Browser (using a pre-set list of IDs parameter).

Does anyone know why it may be returning this error? It sounds like it's seeing some kind of uninterpretable data type instead of the list of floats for the labels column I intended (from the node query).

Interestingly, if I remove one of the two list-like properties (n.label_vector) in the cypher projection, such that my query becomes:

CALL gds.graph.create.cypher(
'train_citations',
'MATCH (n:Publication) WHERE ID(n) IN $ids RETURN ID(n) as id, n.feature_vector AS features',
'MATCH (n:Publication)-[:CITES]-(m:Publication) WHERE ID(n) IN $ids AND ID(m) IN $ids RETURN ID(n) AS source, ID(m) AS target',
{parameters: { ids: {train_node_ids} }})
YIELD graphName, nodeCount, relationshipCount
RETURN graphName, nodeCount, relationshipCount

I get a different error:

[Procedure.ProcedureCallFailed] Failed to invoke procedure `gds.graph.create.cypher`: Caused by: java.lang.IllegalArgumentException: Unsupported type [NUMBER_ARRAY] of value [D@157f1a08. Please use a numeric property.

@clair.sullivan Sorry, I see that the post was withdrawn. Did you have a question/solution?

I should note, by the way, that creation of a graph projection via native projection seems to work fine with the feature list I was trying to pass (I tested this by taking my list of train and test node IDs and using them to pre-label the nodes in question with extra Train and Test labels, then create the projection without the filtering step). It's only when I use a Cypher projection that it fails.

While this works for initial testing for my purposes, it isn't a long-term solution, as I will want to vary the nodes that belong in train vs. test as I do model experimentation, so the label-adding approach won't be flexible enough (I assume).
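As a rough sketch of how the label-based workaround described above could be re-run for each new split, the statements below clear the previous labels, tag the new split, and project natively. The Train and Test labels come from the post above; the `$train_ids` and `$test_ids` parameters, the reuse of the graph name `train_citations`, and the `UNDIRECTED` orientation (chosen to mirror the undirected `-[:CITES]-` match in the Cypher projection) are assumptions for illustration, not something confirmed in this thread.

```cypher
// 1. Remove labels left over from a previous split.
MATCH (n:Publication)
REMOVE n:Train, n:Test;

// 2. Tag the nodes of the current split ($train_ids / $test_ids are assumed parameters).
MATCH (n:Publication) WHERE id(n) IN $train_ids
SET n:Train;

MATCH (n:Publication) WHERE id(n) IN $test_ids
SET n:Test;

// 3. Native projection of the Train nodes with both list properties.
//    Only CITES relationships between projected nodes are kept.
CALL gds.graph.create(
    'train_citations',
    {Train: {label: 'Train', properties: ['feature_vector', 'label_vector']}},
    {CITES: {type: 'CITES', orientation: 'UNDIRECTED'}}
)
YIELD graphName, nodeCount, relationshipCount
RETURN graphName, nodeCount, relationshipCount;
```

If a graph with that name is already in the catalog, it would need to be dropped first with `CALL gds.graph.drop('train_citations')`. The statements are separated by semicolons, so they would have to be run one at a time or in a client that accepts multiple statements.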

I tested out a similar query and was able to get it to run without any errors:

CALL gds.graph.create.cypher( 
    'test', 
    'MATCH (n) WHERE ID(n) IN $ids RETURN ID(n) as id', 
    'MATCH (n)-[]-(m) WHERE ID(n) IN $ids RETURN ID(n) AS source, ID(m) AS target', 
    {parameters: { ids: [3501, 44757, 17332, 29883, 76708] }, validateRelationships: false})
YIELD graphName, nodeCount, relationshipCount 
RETURN graphName, nodeCount, relationshipCount

My suspicion is that the error message you're getting about unsupported types is misleading. Can you try running it without n.feature_vector and see if you still get the same error?

Sorry, @emigre459 ! For some reason I was working on a different question on the forum and it got posted here. But thank you for checking back on it!

@alicia_frame1 I agree with your assessment. Removing n.feature_vector allows it to execute without issue. The idea that list properties don't work with Cypher projections seems supported by this, and by my test using the feature vectors with native projections (which I mention above), which seems to work fine.

Is there another test I can do that would be helpful?

I suspect you've uncovered another bug 😬

I was able to reproduce the error on one of my own graphs - it seems that our Cypher projections throw an error with list properties. I tried a Cypher projection with an embedding, without any parameters, and still got the unsupported type message.

I've added a card to our engineering inbox, and I'll post as soon as we sort it out 🙂

@emigre459 Thanks for reporting the issue. We already fixed this as part of the current 1.6 development cycle, so please switch to 1.6.0 once it is out. The preview release went out last week, which means the final release should follow within the next few days.

If you want to try it out now, feel free to use the preview release: https://neo4j.com/download-center/#algorithms

Awesome, thanks @martin_junghann ! I'm fine waiting a couple days since I found a near-term workaround, but great to know that the new version has the fix.

Hi @alicia_frame1, do you know if this has been fixed? I'm using GDS 1.7.2 and getting this error with a Cypher projection:

Failed to invoke procedure gds.graph.create.cypher: Caused by: java.lang.IllegalArgumentException: [[55.860931, -4.259839]:java.util.ArrayList] is not a supported property value

Good news - we just published the 1.8 release today, which includes numerous bug fixes (see the GDS 1.8.0 release page of neo4j/graph-data-science on GitHub).

If you're still getting that error message, can you share your query with us?

Thanks Alicia, we'll upgrade and see how we go.

Upgraded to GDS 1.8 (on Neo4j 4.3.7), but no luck so far.

Here is the end of the node projection query:

RETURN DISTINCT id(n) AS id, labels(n) AS labels,
toInteger(n.personmasterrecord) AS officerNumber,
toInteger(n.partialdateofbirth) AS dob,
toFloat(n.latitude_prop) AS lat,
toFloat(n.longitude_prop) AS long,
toInteger(n.personmasterrecordcount) AS appointments,
coalesce([toFloat(n.latitude_prop), toFloat(n.longitude_prop)], 0.0) AS latlong'

Specifically, the expression

coalesce([toFloat(n.latitude_prop), toFloat(n.longitude_prop)], 0.0) AS latlong

returns the error:

Failed to invoke procedure gds.graph.create.cypher: Caused by: java.lang.IllegalArgumentException: [[55.860931, -4.259839]:java.util.ArrayList] is not a supported property value

I've tested switching in a simple list [1.0, 2.0] and the error is the same.

The whole procedure:

CALL gds.graph.create.cypher(
    'TEST',
    'MATCH (c:Company) WHERE c.number IN $RTA
     CALL apoc.path.expandConfig(c, {
         relationshipFilter: "IS_OFFICER",
         labelFilter: "-Corporate",
         minLevel: 0,
         maxLevel: 1
     }) YIELD path
     UNWIND nodes(path) AS x
     UNWIND relationships(path) AS r
     WITH x, r, COLLECT(startNode(r)) + COLLECT(endNode(r)) AS y
     UNWIND y AS n
     WITH n, r WHERE (n.status = "NA" OR n.status IS NULL) AND r.appointmentType = "01"
     RETURN DISTINCT
         id(n) AS id,
         labels(n) AS labels,
         toInteger(n.personmasterrecord) AS officerNumber,
         toInteger(n.partialdateofbirth) AS dob,
         toFloat(n.latitude_prop) AS lat,
         toFloat(n.longitude_prop) AS lng,
         toInteger(n.personmasterrecordcount) AS appointments,
         coalesce([toFloat(n.latitude_prop), toFloat(n.longitude_prop)], 0.0) AS latlong',
    'MATCH (c:Company) WHERE c.number IN $RTA
     CALL apoc.path.expandConfig(c, {
         relationshipFilter: "IS_OFFICER",
         labelFilter: "-Corporate",
         minLevel: 0,
         maxLevel: 1
     }) YIELD path
     UNWIND nodes(path) AS n
     UNWIND relationships(path) AS r
     WITH n, r WHERE (n.status = "NA" OR n.status IS NULL) AND r.appointmentType = "01"
     RETURN DISTINCT
         id(startNode(r)) AS source,
         id(endNode(r)) AS target,
         type(r) AS type,
         toInteger(r.appointmentType) AS appointmentType,
         toInteger(r.appointedOn) AS appointedOn,
         toInteger(r.resignedOn) AS resignedOn',
    {
        validateRelationships: false,
        parameters: { RTA: [
            "09647068", "10461978", "11299702", "07479524", "09677925", "09869279",
            "07593870", "07669978", "11592405", "09907206", "09275623", "05981946",
            "10581067", "01403177", "07995485", "10247723", "SC493013"
        ]}
    }
)
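
The simple-list test mentioned above ("switching in a simple list [1.0, 2.0]") could be boiled down to a minimal reproduction along these lines, which may be easier to share than the full procedure. This is only a sketch, not the query that was actually run: the graph name `list_repro` and the unfiltered `MATCH` patterns are assumptions. It implies no outcome beyond what was reported above.

```cypher
CALL gds.graph.create.cypher(
    'list_repro',
    // Node query: every node gets the same literal list as a property.
    'MATCH (n) RETURN id(n) AS id, [1.0, 2.0] AS latlong',
    // Relationship query: all relationships, no properties.
    'MATCH (n)-->(m) RETURN id(n) AS source, id(m) AS target'
)
YIELD graphName, nodeCount, relationshipCount
RETURN graphName, nodeCount, relationshipCount
```

Because the patterns are unfiltered, this projects the whole database, so on a large graph it would make sense to restrict the `MATCH` clauses to a small label or ID range first.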