Break shortestPath loop having "WHERE property IN [..]"

Hello to everyone, I'm very newbie in neo4j graph DB.

I can get shortest paths between a source user and 100 other target users as follows:

MATCH p = shortestPath( 
	(u1:User)-[:follows*]->(u2:User {userID:"17911468618"}) 
) 
WHERE u1.userID IN ["173560420","232192182","427553890","6860189","26669533","2094200507","6590609","290023231","260375673","19410587","29394004","14455831","1287006597","186904952","303054725","212742998","44059601","1574083","979177642","1550693326","3132929984","1918078581","284920884","907025384","4213518589","1518284433","1822594644","1506607755","637874562","10482862","176702683","188222091","29421778","1521885756","246194371","2482066817","176618189","187619120","40992858","524226044","371299822","55795588","2274763833","1470414259","2033147472","421468021","1547627005","2289707047","22079794","18900337","144605776","1234301500","1942463581","496865116","1506896522","266928623","414113135","10206720","1363484236","232024162","350708102","324599988","23577429","197260775","50417061","1371714791","263110431","5486909","192417402","248633614","185546187","26444210","266319242","189396108","550150293","24065795","1036534773","489110643","6311667","1660662141","1962023419","271229086","1412007771","548017775","566276178","46983271","1635861286","21193118","9232256","378496191","314761038","1777543238","191110424","277509338","16264572","175353759","270902355","21629668","325848806","242066770"] 
RETURN nodes(p) as results

in this query WHERE u1.userID IN statement has 100 different unique IDs.

The returning results of this query:

Screen Shot 2021-04-26 at 23.37.20

That's okey, it means; 89 out of 100 target users in the list has a :follows path to the source user.

However I want to break the loop when the first 3 target users found which has a path to the source user , is it possible to do that? Thank you.

Yes, you can add LIMIT 3 to the end of your query.

Keep in mind that this won't necessarily get you the closest 3 nodes (top 3 shortest paths). If you need that then a different query is needed.

1 Like

I don't know why I just couldn't simply think "LIMIT 3" , the answer was actually very simple. Thank you so much.

Andrew, I want to ask another question, I want to exclude some userID's in the relationships chain, I mean, if I don't want to see some user's in the shortest path chain, how to do that?

for example:
A -> follows -> B
B -> follows -> C
B -> follows -> D
C -> follows -> M
D -> follows -> M

when I run shortestPath from:A to:M , output is A->B->C->M, however I dont want to see C user in the chain, so I want to get A->B->D->M.

After you have your collection of ids, add a WHERE clause to the shortestPath() match, such that

WHERE none(node IN nodes(p) WHERE node.userID IN excludedIds)
1 Like

Thank you so much again Andrew, sorry for my basic questions, I'm still very newbie at neo4j.

Andrew, I want to ask another question;

When I query shortest path with *1..4 bounds;

MATCH p=shortestPath(
(u1:User {userID:"3514890218"})-[:follows*1..4]->(u2:User {userID:"184460715"})
)
RETURN nodes(p)

it returns as;


in 24 ms.
2

and as you can see, the result consists of 4 hops. so there is no path with <4 hops for those users.

However when I run the same query with *1..5 bounds, like:

MATCH p=shortestPath(
(u1:User {userID:"3514890218"})-[:follows*1..5]->(u2:User {userID:"184460715"})
)
RETURN nodes(p)

it returns as;


in 3240 ms.
4

As you can see, the result still consists of 4 hops as expected. But it takes x135 times slower.
I'm a little confused right now, the query times are quite same everytime I run queries. What I'm expecting that the second query should return the results as fast as like the first query (whenever it finds first 4 hops chain, it should return). What I am missing? Is that normal?

Two things...first uncheck the "Connect result nodes" checkbox in your browser preferences (gear icon in the lower left).

Second, please run a PROFILE of the plan and attach it here.

Hello again, query1:

PROFILE MATCH p=shortestPath( (u1:User {userID:"3514890218"})-[:follows*1..4]->(u2:User {userID:"184460715"}) ) RETURN nodes(p)

Result Graph:

Result Plan:

Result Reponse:
query1_response.txt (8.9 KB)

query2:

PROFILE MATCH p=shortestPath(
(u1:User {userID:"3514890218"})-[:follows*1..5]->(u2:User {userID:"184460715"})
)
RETURN nodes(p)

Result Graph:

Result Plan:

Result Reponse:
query2_response.txt (8.9 KB)

I don't think shortestPath() is showing an accurate number of db hits, I've raised the question internally and we'll address it if it's a bug.

We do see significantly more memory being used for the operation in the second query compared to the first, so there does seem to be more work being done, but that too is puzzling, as shortestPath() should be shortcircuiting once the path is found.

Can you try running both queries again, prefixing each with CYPHER runtime=SLOTTED ?

1 Like

Hello again Andrew,

Query1:

CYPHER runtime=SLOTTED 
PROFILE MATCH p=shortestPath( (u1:User {userID:"3514890218"})-[:follows*1..4]->(u2:User {userID:"184460715"}) ) RETURN nodes(p)

Timing: 24ms
Screen Shot 2021-05-05 at 00.52.32

Results:
query1_summary.txt (12.2 KB)
query1_response.txt (1.3 KB)

Query 2:

CYPHER runtime=SLOTTED 
PROFILE MATCH p=shortestPath( (u1:User {userID:"3514890218"})-[:follows*1..5]->(u2:User {userID:"184460715"}) ) RETURN nodes(p)

Timing: 2780ms
Screen Shot 2021-05-05 at 00.52.48

Results:
query2_summary.txt (12.2 KB)
query2_response.txt (1.3 KB)

Thanks, can you edit that comment to also provide the timings for each of those?

1 Like

I have edited my post again, first is 24ms , second is 2780ms.

Okay, that rules out a runtime problem.

Can you let us know which version of Neo4j you are using?

1 Like

Yes of course. Here it is:
|Version:|4.2.3|
|Edition:|Enterprise|

Thanks, nothing in the changelogs about any known bug here, so this may be something new.

Do you see the time (and memory used for the shortestPath operation) increasing further if you keep raising the upper bound (5, 6, and so on)?

1..3

CYPHER runtime=SLOTTED 
PROFILE MATCH p=shortestPath( (u1:User {userID:"3514890218"})-[:follows*1..3]->(u2:User {userID:"184460715"}) ) RETURN nodes(p)

Screen Shot 2021-05-05 at 02.48.16
Screen Shot 2021-05-05 at 02.48.10
summary1-3.txt (12.2 KB)

1..4

CYPHER runtime=SLOTTED 
PROFILE MATCH p=shortestPath( (u1:User {userID:"3514890218"})-[:follows*1..4]->(u2:User {userID:"184460715"}) ) RETURN nodes(p)

Screen Shot 2021-05-05 at 02.48.57
Screen Shot 2021-05-05 at 02.49.04
summary1-4.txt (12.2 KB)

1..5

CYPHER runtime=SLOTTED 
PROFILE MATCH p=shortestPath( (u1:User {userID:"3514890218"})-[:follows*1..5]->(u2:User {userID:"184460715"}) ) RETURN nodes(p)

Screen Shot 2021-05-05 at 02.49.39
Screen Shot 2021-05-05 at 02.49.45
summary1-5.txt (12.2 KB)

1..6

CYPHER runtime=SLOTTED 
PROFILE MATCH p=shortestPath( (u1:User {userID:"3514890218"})-[:follows*1..6]->(u2:User {userID:"184460715"}) ) RETURN nodes(p)

Screen Shot 2021-05-05 at 02.50.35
Screen Shot 2021-05-05 at 02.50.45
summary1-6.txt (12.2 KB)

without bounds

CYPHER runtime=SLOTTED 
PROFILE MATCH p=shortestPath( (u1:User {userID:"3514890218"})-[:follows*]->(u2:User {userID:"184460715"}) ) RETURN nodes(p)

Screen Shot 2021-05-05 at 03.01.38
Screen Shot 2021-05-05 at 03.01.43
summary-no-bounds.txt (12.7 KB)