Hello, I am new to Neo4j and GDS. I have a complex graph whose schema is provided below. In this graph, the forwarded_message and message nodes have the following temporal properties: forwarded_date (for forwarded_message) and date (for message), both of type datetime.
I want to create a projection of this graph that keeps only the nodes whose forwarded_date and date properties fall within a specific month, say January, and later use that projected graph for GDS algorithms.
The problem is that I am not sure how to project a filtered graph with multiple labels and the relationships between them.
Just to add, the current graph contains 69M nodes and 139M relationships. In January, there are 3.6M Messages, 64K Users, 599 Channels, and 121K Forward_Message nodes.
I would be grateful if anyone could assist or guide me with this.
Thank you very much. I tried the Cypher projection below, but it exceeded the memory limit. Would you happen to know how I can optimize it?
MATCH (source)
WHERE source:User
   OR (source:Message AND datetime({year: 2019, month: 1, day: 1}) <= source.date < datetime({year: 2019, month: 1, day: 2}))
OPTIONAL MATCH (source)-[r:CREATED|SENT_TO]->(target)
WHERE target:Message OR target:Channel
WITH gds.graph.project(
  'messagesGraph',
  source,
  target,
  {
    sourceNodeLabels: labels(source),
    targetNodeLabels: labels(target),
    relationshipType: type(r)
  }
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels
Is this a very large graph? Does the Cypher part run without issue if you replace the "WITH" and "RETURN" clauses with something like "RETURN count(*)"? What count is returned?
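The check suggested here can be sketched as follows. It reuses the MATCH and OPTIONAL MATCH clauses from the posted query unchanged and only swaps the projection for a row count; the alias rowCount is an illustrative name, not something from the original query.

```cypher
// Same matching logic as the posted projection query,
// but counting rows instead of calling gds.graph.project.
MATCH (source)
WHERE source:User
   OR (source:Message
       AND datetime({year: 2019, month: 1, day: 1}) <= source.date < datetime({year: 2019, month: 1, day: 2}))
OPTIONAL MATCH (source)-[r:CREATED|SENT_TO]->(target)
WHERE target:Message OR target:Channel
// rowCount is a hypothetical alias chosen for this sketch.
RETURN count(*) AS rowCount
```

If this already runs slowly or exhausts memory, the matching itself is likely the bottleneck rather than the projection step.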
I don't believe this is an "optimization", but I feel it reads better:
CALL {
MATCH (source:User)
RETURN source
UNION
MATCH (source:Message)
WHERE datetime({year:2019, month: 1, day:1}) <= source.date < datetime({year:2019, month: 1, day:2})
RETURN source
}
OPTIONAL MATCH (source)-[r:CREATED|SENT_TO]->(target:Message|Channel)
WITH gds.graph.project(
  'messagesGraph',
  source,
  target,
  {
    sourceNodeLabels: labels(source),
    targetNodeLabels: labels(target),
    relationshipType: type(r)
  }
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels
BTW, isn't your date predicate the same as the following, since the date interval is only one day?
WHERE source.date = datetime({year:2019, month: 1, day:1})
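For the whole-month filter described in the original question, a half-open range is one option. This is only a sketch: it assumes the date property holds datetime values as stated in the question, the year 2019 is carried over from the posted query, and the alias januaryMessages is an illustrative name. An equality test against a datetime matches only that exact instant, so a range is used here.

```cypher
// Count Message nodes dated anywhere in January 2019.
// Lower bound is inclusive, upper bound (1 February) is exclusive.
MATCH (source:Message)
WHERE datetime({year: 2019, month: 1, day: 1}) <= source.date
  AND source.date < datetime({year: 2019, month: 2, day: 1})
// januaryMessages is a hypothetical alias chosen for this sketch.
RETURN count(source) AS januaryMessages
```

Comparing the result with the 3.6M January messages mentioned in the question would be a quick sanity check before using the same predicate in a projection.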