Path conundrum for rule solving engine


(Hagen Ivar) #1

Hi!

I'm working on a rule solving engine. The problem is that when I run it first the results needs to be fed into the engine again and potentially again and again. Is it possible to solve this with just one run?

This is my data.

CREATE (a:XTEST {id:["A"], type:"start"})
CREATE (b:XTEST {id:["B"], type:"end"})
CREATE (a)-[:XGIVE]->(b)
CREATE (c:XTEST {id:["C"], type:"start"})
CREATE (d:XTEST {id:["D"], type:"start"})
CREATE (e:XTEST {id:["E"], type:"end"})
CREATE (c)-[:XGIVE]->(d)-[:XGIVE]->(e)

CREATE (be:XTEST {id:["B","E"], type:"start"})
CREATE (f:XTEST {id:["F"], type:"end"})
CREATE (be)-[:XGIVE]->(f)

Looks like this
path1

So when I run with A and C as input I will find nodes B, D and E.

WITH ["A","C"] AS input
MATCH p=(start:XTEST)-[r:XGIVE*]->(end:XTEST) 
WHERE ALL(p IN start.id WHERE p IN input)
RETURN p

But I also would like to get F as well since I now have found both B and E.
I can solve it by do the query again by now using A,B,C,D,E as input and do this until no new nodes are found. But it would be better to get all A,B,C,D,E,F with A,C as input with one query.

Anyone got any ideas on how to solve this?

Regards Ivar


(Paul Thomas) #3

Note I changed some of the property and query aliases to make it less confusing for me.
Something like this is doing what you want maybe ...

CREATE (a:XTEST {pklist:["A"], nodetype:"start"})
CREATE (b:XTEST {pklist:["B"], nodetype:"end"})
CREATE (a)-[:XGIVE]->(b)

CREATE (c:XTEST {pklist:["C"], nodetype:"start"})
CREATE (d:XTEST {pklist:["D"], nodetype:"start"})
CREATE (e:XTEST {pklist:["E"], nodetype:"end"})
CREATE (c)-[:XGIVE]->(d)-[:XGIVE]->(e)

CREATE (be:XTEST {pklist:["B","E"], nodetype:"start"})
CREATE (f:XTEST {pklist:["F"], nodetype:"end"})
CREATE (be)-[:XGIVE]->(f)

WITH ["A","C"] AS input
MATCH pathx = (p1Start:XTEST)-[r:XGIVE*]->(p1End:XTEST)
WHERE ALL(n1 IN p1Start.pklist WHERE n1 IN input)
with collect(distinct p1End.pklist) as pEnds

unwind pEnds as p1
unwind pEnds as p2
with p1 + p2 as pNew
where p1 <> p2
MATCH pathy = (p2Start:XTEST)-[r:XGIVE*]->(p2End:XTEST)
WHERE p2Start.pklist = pNew
return p2End


(Hagen Ivar) #4

Thanks for fast reply. This community is great. :slight_smile:

Your solution works for the scenario I listed. But if I have further rule chain it will not solve that.

If I add this

CREATE (fg:XTEST {pklist:["F","G"], nodetype:"start"})
CREATE (h:XTEST {pklist:["H"], nodetype:"end"})
CREATE (fg)-[:XGIVE]->(h)

So with this as input

WITH ["A","C","G"] AS input

I would like H to be returned as well. The problem is that I will need to run the query as many times as the depth of dependencies. But this maybe is the only way to solve it?
I'm also thinking of creating a user defined function that can do this in a loop until all paths have been explored with the new results (endnodes) are collected while traversing.
Perfomance is also an important issue since I'll be going through tens of tousands of nodes.


(Paul Thomas) #5

Hmm, maybe to avoid having the re-run queries you could change how your data is modeled and have the rules better connected via eg an extra relationship type which would link a pair of end nodes to the follow on start node?

Graphs can sometimes be a bit of a chicken v egg scenario where something you need to model the data while carefully thinking about the types of queries you want to run, rather than just thinking about the raw data ...


(Hagen Ivar) #6

Thanks.

Yes I tried to think of this as a solution:

path2

So having connections from B and E to [B,E] node and counting the number of incoming paths when running query. So if run with input A and C. The paths C-D-E-[B,E]-F and A-B-[B,E]-F with a filter that two paths needs to be present for returning F. But if only A as input then [B,E] node will only have one path through it which will not return F. Complicated and hard to implement maybe?

I will continue to redo the model maybe with other relationship types as you suggested.


(Paul Thomas) #7

Keep in mind, matches like this could be very useful for you ...

WITH ["A","C"] AS input
MATCH
(p1Start:XTEST)-[r:XGIVE*]->(p1End:XTEST)-->(pFork:XTEST)<--(p2End:XTEST)<-[r:XGIVE*]-(p2Start:XTEST),
(pFork)-[r:XGIVE*]->(p3End:XTEST)
WHERE
...