I have a requirement in my system to support subscriptions to parts of a graph, based on Cypher queries.
Basically, if a user wants to get all nodes related through a specific relationship, the query MATCH (n)-[r:my_relationship]->(p) RETURN p returns the wanted results. But if a new node with this relationship is created later, I would like to notify the subscribed users about this new node. So I need what we call a continuous query system, that is to say a way to automatically re-execute the queries that are impacted by graph modifications (create/update/delete). Of course, I can have more complex queries with LIMIT, sorting, etc.
What is the best way to support continuous queries with Neo4j?
I haven't used them yet, but APOC has triggers that can be used to execute Cypher queries in response to data being updated. Those may be worth looking into: Neo4j APOC Procedures User Guide
Yes, I have seen triggers. But as far as I know, you receive notifications for all transactions, and then you have to parse each transaction event to check whether the notification is relevant.
What I need is more fine-grained: I would need to declare a (dynamic, not hard-coded) Cypher query and receive events only when a node that affects the results of this Cypher query is created, updated, or deleted.
Do you see what I mean?
Exactly, our need is to define the Cypher queries at runtime when users ask for subscriptions, and these Cypher queries can be removed when all users have unsubscribed. Our Cypher queries are not hard-coded. They are defined by users via a DSL, and we interpret this DSL into Cypher.
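To make the subscribe/unsubscribe lifecycle described above concrete, here is a minimal sketch of a registry that keeps a query only while at least one user is subscribed to it. The class name SubscriptionRegistry and its methods are hypothetical, invented for illustration; nothing here talks to Neo4j, and the Cypher text is assumed to be whatever the DSL interpreter produced.

```python
from collections import defaultdict


class SubscriptionRegistry:
    """Hypothetical helper: tracks which users subscribe to which Cypher query."""

    def __init__(self):
        # Cypher text -> set of user ids currently subscribed to it.
        self._subscribers = defaultdict(set)

    def subscribe(self, user_id, cypher):
        # Registering the same user twice for one query is a no-op.
        self._subscribers[cypher].add(user_id)

    def unsubscribe(self, user_id, cypher):
        users = self._subscribers.get(cypher)
        if users is None:
            return
        users.discard(user_id)
        # Drop the query entirely once the last user has unsubscribed.
        if not users:
            del self._subscribers[cypher]

    def active_queries(self):
        # Queries that still have at least one subscriber.
        return sorted(self._subscribers)

    def subscribers(self, cypher):
        # Return a copy so callers cannot mutate internal state.
        return set(self._subscribers.get(cypher, ()))
```

Keying on the generated Cypher text means two users asking for the same thing share one query; if the DSL can produce differently formatted but equivalent Cypher, some normalization step would be needed first.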
Offhand, I don't see an easy way to avoid some scanning, but it may be possible to avoid polling, e.g. by combining a transaction hook with a scheduler service. The hook maintains a list of "subscriptions", but those details might vary by use case; I can only surmise. If it's as simple as each subscription having its own relationship type, then you would scan through the transaction change log for each subscription and, on a match, schedule a notification that the related query should be re-run. Or perhaps each subscription watches certain start or end nodes, in which case similar logic could be used.
This could obviously get pretty heavy-weight, but it's the only thing I can think of that doesn't involve polling, or using some major external machinery to match granular changes with subscriptions.
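The hook-plus-scheduler idea above could look roughly like the following sketch. It is a pure in-memory simulation under stated assumptions: RelChange, Subscription, and ChangeMatcher are hypothetical names, the change log is assumed to arrive as a list of relationship changes, and each subscription is assumed to declare up front which relationship types its query depends on. In a real setup, after_commit would be called from whatever transaction hook is in use, and rerun would execute the Cypher against the database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelChange:
    """Hypothetical shape of one relationship change in a committed transaction."""
    kind: str       # "created" or "deleted"
    rel_type: str
    start_id: int
    end_id: int


@dataclass(frozen=True)
class Subscription:
    sub_id: str
    cypher: str
    rel_types: frozenset  # relationship types this query depends on (assumed known)


class ChangeMatcher:
    def __init__(self):
        self._subs = {}
        self.pending = []  # subscription ids waiting for a re-run, no duplicates

    def register(self, sub):
        self._subs[sub.sub_id] = sub

    def remove(self, sub_id):
        self._subs.pop(sub_id, None)

    def after_commit(self, changes):
        # Scan the change log once, then match it against every subscription.
        touched = {c.rel_type for c in changes}
        for sub in self._subs.values():
            if sub.rel_types & touched and sub.sub_id not in self.pending:
                self.pending.append(sub.sub_id)

    def drain(self, rerun):
        # Called by the scheduler: re-run each affected query once.
        results = {}
        while self.pending:
            sub_id = self.pending.pop(0)
            sub = self._subs.get(sub_id)
            if sub is not None:  # skip subscriptions removed in the meantime
                results[sub_id] = rerun(sub.cypher)
        return results
```

Because pending is de-duplicated, several commits touching the same relationship type between two scheduler runs cause only one re-execution per subscription, which limits the cost noted above. Watching specific start or end nodes would follow the same pattern, with node ids in place of relationship types.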