Variable path length and multiple queries in GDS call

Hello all,

I’m currently struggling with the following situation:

I am setting up a GDS call that runs the Louvain algorithm on an anonymous graph created with a Cypher projection, and I have run into a few challenges:

  1. The relationship query has to match a path with multiple “in-between” nodes.
  2. There can be more than one matching path for each start node - end node combination, so some kind of DISTINCT is needed to make sure each projected start node - end node pair is only counted once.
  3. I need to create a virtual relationship property that can be used as the weight during the Louvain execution. There is no source for this property in the source graph, so I have to set the property value while producing the projected relationship.
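
For point 2, one way to see how often a start node - end node pair is reached by more than one path is to run the pattern on its own, outside of GDS. This is only a minimal sketch, assuming the same placeholder labels and relationship types used in the queries in this post; the alias paths is purely illustrative:

```cypher
// List start/end pairs that are reached by more than one matching path.
// Without deduplication, each such pair would produce several projected rows.
MATCH (entity1:LABEL_A)-[:TYPE_A]->(:LABEL_B)-[:TYPE_B]->(entity2:LABEL_A)
WITH id(entity1) AS source, id(entity2) AS target, count(*) AS paths
WHERE paths > 1
RETURN source, target, paths
ORDER BY paths DESC
LIMIT 10
```

If this returns rows, the DISTINCT (or an aggregation) in the relationship query is what keeps each pair down to a single projected relationship.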

I tried the following approach:

CALL gds.louvain.stream(
{
nodeQuery: '
  MATCH (entity)
  WHERE entity:LABEL_A
  RETURN id(entity) AS id, labels(entity) AS labels',
relationshipQuery: '
  MATCH (entity1:LABEL_A)-[r1:TYPE_A]->(:LABEL_B)-[r2:TYPE_B]->(entity2:LABEL_A)
  WITH DISTINCT entity1, entity2
  RETURN id(entity1) AS source, id(entity2) AS target',
includeIntermediateCommunities: true
}
)
YIELD nodeId, communityId, intermediateCommunityIds
RETURN gds.util.asNode(nodeId).name AS Name, communityId, intermediateCommunityIds

This query works as expected. But when I try to create a "virtual" relationship property for the projected relationship between start node and end node, I'm struggling to find the correct way of doing it.

When I add a weight value within the DISTINCT like this and RETURN the value, the query appears to run as expected:

CALL gds.louvain.stream(
{
nodeQuery: '
  MATCH (entity)
  WHERE entity:LABEL_A
  RETURN id(entity) AS id, labels(entity) AS labels',
relationshipQuery: '
  MATCH (entity1:LABEL_A)-[r1:TYPE_A]->(:LABEL_B)-[r2:TYPE_B]->(entity2:LABEL_A)
  WITH DISTINCT entity1, entity2, 1 AS weight
  RETURN id(entity1) AS source, id(entity2) AS target, weight',
includeIntermediateCommunities: true
}
)
YIELD nodeId, communityId, intermediateCommunityIds
RETURN gds.util.asNode(nodeId).name AS Name, communityId, intermediateCommunityIds

But in this case I haven't added relationshipProperties: 'weight' to the options of the algorithm.

When I add it like this:

CALL gds.louvain.stream(
{
nodeQuery: '
  MATCH (entity)
  WHERE entity:LABEL_A
  RETURN id(entity) AS id, labels(entity) AS labels',
relationshipQuery: '
  MATCH (entity1:LABEL_A)-[r1:TYPE_A]->(:LABEL_B)-[r2:TYPE_B]->(entity2:LABEL_A)
  WITH DISTINCT entity1, entity2, 1 AS weight
  RETURN id(entity1) AS source, id(entity2) AS target, weight',
relationshipProperties: 'weight',
includeIntermediateCommunities: true
}
)
YIELD nodeId, communityId, intermediateCommunityIds
RETURN gds.util.asNode(nodeId).name AS Name, communityId, intermediateCommunityIds

I get a "Failed to invoke procedure gds.louvain.stream: Caused by: java.lang.IllegalArgumentException: Invalid key: relationshipProperties" error.

Is it sufficient to just return the weight in the RETURN statement and not define it the way the "General configuration for algorithm execution on an anonymous graph" description points out?

I'm a bit confused and not sure whether this property will be available for weight calculations if I (additionally) set the Louvain parameter "relationshipWeightProperty".

Any hint on creating virtual properties for a Cypher-based anonymous graph projection within GDS, with more than one node in-between on the path and no "real" source property, is greatly appreciated.

Thanks

Krid

Hey Krid!

In this case, the relationshipProperties parameter is for when you are creating your graph directly from the database. So if you had some property already aggregated and stored, that's how you would tell GDS that you want to include it in your graph. However, since you're using Cypher to create your graph, you can ignore it and just include your weight property in the return statement.
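
To illustrate the distinction, here is a rough sketch of the native-projection form, where relationshipProperties is accepted. It assumes a hypothetical relationship type TYPE_C that directly connects LABEL_A nodes and already stores a numeric weight property. Neither is part of your graph as you described it, so treat both names as placeholders:

```cypher
// Native projection on an anonymous graph (sketch; TYPE_C and its stored
// weight property are hypothetical and only used for illustration).
// relationshipProperties loads the stored property into the in-memory graph;
// relationshipWeightProperty tells Louvain to use it as the weight.
CALL gds.louvain.stream({
  nodeProjection: 'LABEL_A',
  relationshipProjection: 'TYPE_C',
  relationshipProperties: 'weight',
  relationshipWeightProperty: 'weight'
})
YIELD nodeId, communityId
RETURN gds.util.asNode(nodeId).name AS Name, communityId
```

With nodeQuery and relationshipQuery there is no such stored property to load, which is why the extra columns of the RETURN clause take over that role.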

With that said, you DO still need to use the relationshipWeightProperty parameter to make sure Louvain uses it. So try this:

CALL gds.louvain.stream(
{
nodeQuery: '
  MATCH (entity)
  WHERE entity:LABEL_A
  RETURN id(entity) AS id, labels(entity) AS labels',
relationshipQuery: '
  MATCH (entity1:LABEL_A)-[r1:TYPE_A]->(:LABEL_B)-[r2:TYPE_B]->(entity2:LABEL_A)
  WITH DISTINCT entity1, entity2
  RETURN id(entity1) AS source, id(entity2) AS target, 1 AS weight',
relationshipWeightProperty: 'weight',
includeIntermediateCommunities: true
}
)
YIELD nodeId, communityId, intermediateCommunityIds
RETURN gds.util.asNode(nodeId).name AS Name, communityId, intermediateCommunityIds
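
As a possible variation (not something you asked for, and only a sketch): since there is no stored property to draw on, the weight could also be derived from the path pattern itself, for example the number of matching paths between each pair. Aggregating with count(*) groups by source and target, so each pair still appears only once and no DISTINCT is needed. Whether path count is a meaningful weight for your data is an assumption you would have to check:

```cypher
// Same Cypher projection as above, but the weight is the number of
// matching two-hop paths between each start/end pair instead of a constant 1.
CALL gds.louvain.stream(
{
nodeQuery: '
  MATCH (entity:LABEL_A)
  RETURN id(entity) AS id, labels(entity) AS labels',
relationshipQuery: '
  MATCH (entity1:LABEL_A)-[:TYPE_A]->(:LABEL_B)-[:TYPE_B]->(entity2:LABEL_A)
  RETURN id(entity1) AS source, id(entity2) AS target, count(*) AS weight',
relationshipWeightProperty: 'weight',
includeIntermediateCommunities: true
}
)
YIELD nodeId, communityId, intermediateCommunityIds
RETURN gds.util.asNode(nodeId).name AS Name, communityId, intermediateCommunityIds
```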

Hello Sean,

Thanks for your message. That makes the usage scenario of relationshipProperties clearer. I had taken it to be a global option, since it is listed in the "General configuration for algorithm execution on an anonymous graph" property list.

Once again, many thanks for clarifying.

Best

Krid