Continuous query with Neo4j


(Eric) #1

I have a requirement on my system to support subscriptions to parts of a graph, based on Cypher queries.
Basically, if a user wants all nodes reached through a specific relationship, the query MATCH (n)-[r:my_relationship]->(p) RETURN p gives the wanted results. But if a new node with this relationship is created later, I would like to notify the subscribed users about this new node. So I need what we call a continuous query system, that is, a way to automatically re-execute the queries that are affected by graph modifications (create/update/delete). Of course, I can have more complex queries with LIMIT, sorting, etc.

What is the best way to support continuous queries with Neo4j?


(Eric) #2

Is support for continuous queries, like what RethinkDB or MongoDB offer, on your roadmap?


(Josh Southerland) #3

Hi Eric,

I haven't used them yet, but APOC has triggers that can run Cypher queries in response to data being updated. Those may be worth looking into: https://neo4j-contrib.github.io/neo4j-apoc-procedures/#_triggers


(Eric) #4

Yes, I have seen triggers. But as far as I know, you receive notifications for all transactions, and then you have to parse each transaction event to check whether the notification is relevant.
What I need is more fine-grained: I would like to declare a (dynamic, not hard-coded) Cypher query and receive events only when a node related to this query's results is created, updated, or deleted.
Do you see what I mean?


(Josh Southerland) #5

I see, that makes sense.

It seemed like you were asking for two things:

  1. A way to notify users
  2. A way to automatically re-execute queries when nodes are updated

and APOC triggers seemed like a possible solution to (2)

edit: It looks like the queries that need to be re-executed can't be fully specified at the time of setting the trigger?


(Eric) #6

Exactly. We need to define the Cypher queries at runtime, when users ask for subscriptions, and to remove them once all users have unsubscribed. The queries are not hard-coded: users define them via a DSL, which we translate into Cypher.
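
To make the lifecycle concrete, here is a minimal sketch in Python of a registry that keeps a query only while at least one user is subscribed to it. The class and method names are hypothetical, and the sketch assumes each subscription is identified by its generated Cypher string; it does not talk to Neo4j at all.

```python
from collections import defaultdict


class SubscriptionRegistry:
    """Tracks which users are subscribed to which Cypher query (hypothetical helper)."""

    def __init__(self):
        # Maps a generated Cypher query string to the set of subscribed user ids.
        self._subscribers = defaultdict(set)

    def subscribe(self, user_id, cypher):
        """Register the query on first use and add the user to it."""
        self._subscribers[cypher].add(user_id)

    def unsubscribe(self, user_id, cypher):
        """Remove the user; drop the query once nobody is subscribed to it."""
        users = self._subscribers.get(cypher)
        if users is None:
            return
        users.discard(user_id)
        if not users:
            del self._subscribers[cypher]

    def active_queries(self):
        """Return the queries that still have at least one subscriber, sorted."""
        return sorted(self._subscribers)
```

Whatever watches the graph for changes would then only need to consider the queries returned by active_queries().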


(Jiropole) #7

Offhand, I don't see an easy way to avoid some scanning, but you could maybe avoid polling, e.g. by combining a transaction hook with a scheduler service. The hook maintains a list of "subscriptions", but the details would vary with the use case; I can only surmise. If it's as simple as each subscription having its own relationship type, then you would scan through the transaction change log for each subscription and, on a match, schedule a notification that the related query should be re-run. Or perhaps each subscription watches certain start or end nodes; similar logic could be used.

This could obviously get pretty heavy-weight, but it's the only thing I can think of that doesn't involve polling, or using some major external machinery to match granular changes with subscriptions.
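
As a rough illustration of the simple case (one relationship type per subscription), the matching step could look like the Python sketch below. The RelChange and Subscription shapes and the affected_subscriptions function are hypothetical stand-ins, not Neo4j's actual transaction event API; a real hook would build the change list from whatever the transaction event exposes, and a scheduler would re-run the queries for the returned ids.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelChange:
    """One relationship change from a committed transaction (hypothetical shape)."""
    kind: str       # "created" or "deleted"
    rel_type: str   # e.g. "my_relationship"
    start_id: int
    end_id: int


@dataclass(frozen=True)
class Subscription:
    """A subscription keyed on a single relationship type (simplifying assumption)."""
    sub_id: str
    rel_type: str
    cypher: str


def affected_subscriptions(changes, subscriptions):
    """Return the ids of subscriptions whose query should be re-run."""
    # Collect the relationship types touched by the transaction once,
    # then keep every subscription that watches one of those types.
    changed_types = {change.rel_type for change in changes}
    return [s.sub_id for s in subscriptions if s.rel_type in changed_types]


subs = [
    Subscription("s1", "my_relationship",
                 "MATCH (n)-[r:my_relationship]->(p) RETURN p"),
    Subscription("s2", "other_relationship",
                 "MATCH (n)-[r:other_relationship]->(p) RETURN p"),
]
changes = [RelChange("created", "my_relationship", 1, 2)]
to_rerun = affected_subscriptions(changes, subs)  # ids to hand to the scheduler
```

This only tells you that a query might be affected. Queries with LIMIT or sorting would still have to be re-executed and compared with their previous results to know whether anything a subscriber sees actually changed.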