I read David Allen's blog post, "Streaming Graph Loading with Neo4j and APOC Triggers", and made some comments there. He eventually directed me here.
The premise of his post is that incoming IoT data records are inserted directly into Neo4j. Each record arrives as a node that is never intended to be kept. Instead, a trigger recognises it by its signature label, extracts the data, enriches the graph with nodes and relationships derived from it, and discards the original node.
Incoming IoT record inserts might look like these:
CREATE (:FriendRecord { p1: "David", p2: "Mark" });
CREATE (:FriendRecord { p1: "David", p2: "Susan" });
CREATE (:FriendRecord { p1: "Bob", p2: "Susan" });
I added a WHERE clause to David's example and here it is:
//TRIGGER EXAMPLE WITH HELPER FUNCTION & WHERE CLAUSE
CALL apoc.trigger.add('loadFriendRecords',
"
UNWIND apoc.trigger.nodesByLabel({assignedLabels}, 'FriendRecord') AS iotnode
WITH iotnode
WHERE iotnode.p1 = 'David'
AND (:Person {name:iotnode.p1})-[:FRIENDS]-(:Person {name:'Mark'})
MERGE (p1:Person { name: iotnode.p1 })
MERGE (p2:Person { name: iotnode.p2 })
MERGE (p1)-[f:FRIENDS]->(p2)
DETACH DELETE iotnode
RETURN p1,f,p2
",
{ phase: 'after' })
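To see what the trigger actually did with the three sample inserts, a few follow-up queries along these lines could help. This is only a sketch: it assumes the trigger above is installed, that the graph held no other Person or FriendRecord nodes beforehand, and that apoc.trigger.list is available in the installed APOC version.

```cypher
// Any FriendRecord nodes left behind? Rows filtered out by the WHERE clause
// never reach DETACH DELETE, so rejected records would be expected to show up here.
MATCH (r:FriendRecord)
RETURN r.p1 AS p1, r.p2 AS p2;

// Which Person nodes and FRIENDS relationships exist after the inserts?
MATCH (a:Person)-[:FRIENDS]->(b:Person)
RETURN a.name AS person, b.name AS friend;

// Confirm that the 'loadFriendRecords' trigger is registered.
CALL apoc.trigger.list();
```

Comparing the first two result sets shows which records passed the filter and which were left untouched as FriendRecord nodes.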
Questions:
1.) In this transaction we take an incoming event stream data record, stuff its contents into a newly created node (basically a temporary node), use it to create some proper graph objects, and then delete the temporary node. All of this happens in milliseconds, inside a transaction. Do these intermediate nodes ever actually get written to disk, or do they exist only in memory?
2.) If I use a specific node property only so that a trigger can match on it, and always dispose of the node immediately, is there any reason to index that property? Would it make the trigger any more efficient?
3.) The name apoc.trigger.nodesByLabel uses the plural 'nodes'. Will this function ever return more than a single node?
4.) In the example above, it looks like there is initially a 'match' action for each individual new node of type 'FriendRecord', followed by a filtering action in the WHERE clause. Is there any benefit to making more of the filtering happen before the WHERE clause? Is that even possible?
5.) Are there more examples outside the documentation showing usage of triggers?