K-Means Clustering in Neo4j Desktop version 5.13.0

We are trying to run the K-Means GDS algorithm (CALL gds.kmeans.write/stream) on 'node A', using a property whose value type is float. However, Neo4j throws an error stating that the property value must be an array.

Our data set is huge, with millions of records, so altering the stored property type across the entire data set whenever needed is not feasible. Below is the query I tried:

MATCH (a:A)
WITH toFloatlist(a.property) AS array
CALL gds.kmeans.stream('Graph01', {
nodeProperty: 'array',
k: 3,
randomSeed: 42
})
YIELD nodeId, communityId
RETURN gds.util.asNode(nodeId).name AS name, communityId
ORDER BY communityId DESC

Please suggest a way to deal with this issue.

Hello @Kchat,
I would suggest converting the property to the required type during the projection of Graph01, rather than changing the stored data.
This assumes you are using Cypher projection (Cypher projection - Neo4j Graph Data Science).

Hey @florentin_dorre,

Thank you for your reply!

Could you give an example?

It depends on which projection you use.

Let's take Cypher projection:

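MATCH (a)-->(b)
// toFloatList expects a list; if the stored value is a single float,
// wrapping it as [toFloat(a.property)] may be needed instead.
RETURN gds.graph.project('Graph01', a, b, {
    sourceNodeProperties: { property: toFloatList(a.property)},
    targetNodeProperties: { property: toFloatList(b.property)}
  })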
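
The query in the question fails because `array` there is only a Cypher variable created by WITH, while `nodeProperty` must name a property that exists in the projected in-memory graph. Once the graph has been projected with the converted values, the K-Means call should only need to reference that projected property. A sketch, assuming the projection stored the list under the name `property` and that the nodes have a `name` property as in the original query:

```cypher
// Run K-Means on the list-valued property held in the projected graph 'Graph01'.
CALL gds.kmeans.stream('Graph01', {
  nodeProperty: 'property',  // name used in the projection, not a Cypher variable
  k: 3,
  randomSeed: 42
})
YIELD nodeId, communityId
RETURN gds.util.asNode(nodeId).name AS name, communityId
ORDER BY communityId DESC
```

The stored node properties in the database are left untouched; only the in-memory projection carries the array form.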

For native projection, specify a default value of the correct type so that GDS knows the expected property type:

CALL gds.graph.project(
  'Graph01',
  {
    Person: {properties: {property: {defaultValue: [0.0]}}}
  },
  'KNOWS'
)
YIELD
  graphName, nodeProjection, nodeCount AS nodes, relationshipCount AS rels
RETURN graphName, nodeProjection.Person AS personProjection, nodes, rels

Thank you so much! I will try this approach.