Hello,
Yes, there's potentially a cardinality issue here. If I had to guess, you wrote the first MATCH almost as a declaration, as you would in other programming languages. That isn't needed in Cypher, and in some circumstances it can be problematic.
If you did not have the second MATCH, then the first MATCH would have led to a cartesian product between every :AP, :Customer, and :Customer node, so the number of rows would equal the product of the node counts for the three variables. For example, with 1000 :Customer nodes and 500 :AP nodes, you would get 1000 * 1000 * 500 = 500,000,000 rows, one for every combination of the three.
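One way to check whether a query will hit this is to look at its plan. The sketch below is an assumption about what a declaration-style MATCH might look like (your original first MATCH isn't reproduced here, and the variable names are borrowed from the patterns further down):

```cypher
// Hypothetical declaration-style MATCH with no pattern connecting the nodes.
// EXPLAIN only produces the plan; it does not execute the query.
EXPLAIN
MATCH (n:AP), (o:Customer), (m:Customer)
RETURN count(*)
```

If the plan contains a CartesianProduct operator, the rows are being multiplied as described above.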
Your second MATCH, thankfully, provided a pattern describing how these nodes are supposed to relate. The planner is able to understand that and plan around fulfilling the path, avoiding the cartesian product. However, if anything in the Cypher had restricted the ordering (such as a WITH between the two MATCHes introducing a new variable), then the planner would have had no choice but to perform the cartesian product from the first MATCH before it could consider the second MATCH.
Going forward, use the labels inline in your pattern, as there is no need to declare your variables:
MATCH pl=(o:Customer)<-[:PRESIDENT]-(n:AP)-[:SHAREHOLDER]->(m:Customer)
...
Now regarding your use case, if o and m should be the same customer (the :AP is president and shareholder of the same customer), then only one variable is needed, and you can move part of your pattern into the WHERE clause:
MATCH pl=(o:Customer)<-[:PRESIDENT]-(n:AP)
WHERE (n)-[:SHAREHOLDER]->(o)
...
If it doesn't need to be the same customer (the :AP is the president of any customer and the shareholder of any customer), then we can move both into the WHERE clause:
MATCH (n:AP)
WHERE (:Customer)<-[:PRESIDENT]-(n) AND (n)-[:SHAREHOLDER]->(:Customer)
...
If :PRESIDENT and :SHAREHOLDER relationships only point to :Customer nodes, then you can omit the :Customer label from those patterns and the query will become even more efficient (since we can tell from the n node which relationships it has, and if we don't need to filter on the node at the other end, then there's no need to expand to the other node).
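As a sketch of that label-free variant, and only under the assumption that these two relationship types never point at anything other than :Customer nodes in your graph, the previous query could look like this (the RETURN is just a placeholder for whatever you do next):

```cypher
// Assumes :PRESIDENT and :SHAREHOLDER only ever end at :Customer nodes,
// so the label check on the far end is dropped.
MATCH (n:AP)
WHERE (n)-[:PRESIDENT]->() AND (n)-[:SHAREHOLDER]->()
RETURN n
```

If that assumption doesn't hold for your data, keep the :Customer labels in the predicates.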
As for how you could get 3 n nodes and their connected nodes, assuming there is no ordering, we can build on the last query: find an :AP node with a :PRESIDENT and a :SHAREHOLDER relationship, then apply the LIMIT, then with that limited result set either expand the pattern and collect, or (more efficient) use a pattern comprehension to get the results of an expansion into lists:
MATCH (n:AP)
WHERE (:Customer)<-[:PRESIDENT]-(n) AND (n)-[:SHAREHOLDER]->(:Customer)
WITH n
LIMIT 3
RETURN n, [(n)-[:PRESIDENT]->(c:Customer) | c] as presidentForCustomers, [(n)-[:SHAREHOLDER]->(c:Customer) | c] as shareholderForCustomers
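For comparison, here is a rough sketch of the other option mentioned above, expanding the pattern and collecting. The variable names p and s are just illustrative. Collecting after the first expansion, before doing the second, keeps the two expansions from multiplying each other's rows:

```cypher
MATCH (n:AP)
WHERE (:Customer)<-[:PRESIDENT]-(n) AND (n)-[:SHAREHOLDER]->(:Customer)
WITH n
LIMIT 3
// Expand and collect one relationship type at a time,
// so each n is back to a single row before the next expansion.
MATCH (n)-[:PRESIDENT]->(p:Customer)
WITH n, collect(p) AS presidentForCustomers
MATCH (n)-[:SHAREHOLDER]->(s:Customer)
RETURN n, presidentForCustomers, collect(s) AS shareholderForCustomers
```

This should return the same lists as the pattern comprehension version, just with more intermediate rows along the way.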