Friend-of-a-friend of many degrees query in Neo4j

damienthewave · May 6, 2023, 5:07pm

I stumbled upon very big execution time for searching for friends of the 5th degree of a person. With 20k people in the database, the query takes approx. 30 seconds to execute. I imagined that this type of query is what Neo4j would be good at. The plan states that there is a cartesian product being made - should that be the optimal way? Or has something gone wrong?

Comparing to Postgres, Neo4j performs poorly - 30 seconds to 0.1 of a second (both databases are filled with the same data).

Neo4j version: 5.7.0

Query:

MATCH (p1: Person {id: 6})-[:FOLLOWS*5]->(followed:Person)
RETURN DISTINCT followed.id as id

Explain:

Planner COST

Runtime SLOTTED

Runtime version 5.7

+------------------------+----+-------------------------------------------------+----------------+
| Operator               | Id | Details                                         | Estimated Rows |
+------------------------+----+-------------------------------------------------+----------------+
| +ProduceResults        |  0 | id                                              |        1288238 |
| |                      +----+-------------------------------------------------+----------------+
| +Distinct              |  1 | cache[followed.id] AS id                        |        1288238 |
| |                      +----+-------------------------------------------------+----------------+
| +VarLengthExpand(Into) |  2 | (p1)-[anon_0:FOLLOWS*5..5]->(followed)          |        1356040 |
| |                      +----+-------------------------------------------------+----------------+
| +CartesianProduct      |  3 |                                                 |          20100 |
| |\                     +----+-------------------------------------------------+----------------+
| | +CacheProperties     |  4 | cache[followed.id]                              |          20100 |
| | |                    +----+-------------------------------------------------+----------------+
| | +NodeByLabelScan     |  5 | followed:Person                                 |          20100 |
| |                      +----+-------------------------------------------------+----------------+
| +NodeIndexSeek         |  6 | RANGE INDEX p1:Person(id) WHERE id = $autoint_0 |              1 |
+------------------------+----+-------------------------------------------------+----------------+

Thanks!

glilienfield · May 6, 2023, 5:45pm

Do you have a complicated graph with lots of paths between your anchor node and other Person nodes?

I am also a little surprised that the approach would be to get every other Person node and determine a path between the two nodes. I figured it would start at your anchor node and expand the path along every possible path, stopping if the path reached a length of 5.

You should try the apoc.path procedures. They will traverse the graph as I describe above.

damienthewave · May 6, 2023, 6:21pm

Do you have a complicated graph with lots of paths between your anchor node and other Person nodes?

No, I only input data for Person nodes and only the FOLLOWS relationship.

ameyasoft · May 7, 2023, 10:53pm

Try this:

I created a sample graph like this:

Ran this query:
MATCH (p1: Person {id: 2})
CALL apoc.path.spanningTree(p1,{relationshipFilter:'FOLLOWS>', maxLevel:5}) YIELD path 
with nodes(path) as n1, p1
unwind n1 as n2
with n2 where id(n2) <> id(p1)
return collect(distinct n2.id) as ids

Result:
[3, 4, 5, 6, 7]

Topic		Replies	Views
Friend of a friend app General migrated	3	404	January 9, 2023
Slow query response time - simple friends of friends cypher query Cypher	3	2681	February 4, 2019
How to avoid factorial (n!) path? Cypher performance , cypher	6	406	January 26, 2022
“Query Optimization” : Neo4j Query builds a cartesian product between disconnected patterns - Cypher performance , cypher , relationship	3	1839	October 5, 2020
Need to optimize Neo4j Query Cypher optimization , performance , cypher	1	436	January 9, 2021

July Summer Fun!

Friend-of-a-friend of many degrees query in Neo4j

Related topics