Semantics of RETURN and relationship to previous clause

saman · November 15, 2024, 6:30pm

I have a question for the Cypher gurus regarding the semantics of Cypher and its conceptual model. Here are two queries and their results that illustrate the point.

Query 1:
return "a" as LetterA

As I would expect, this results in one row with "a", i.e.:

LetterA
  "a"

While Query 2:

match (n)
return "a" as LetterA

Surprisingly results in 3 rows (note there are 3 matches to node n in the database):

LetterA
   "a"
   "a"
   "a"

Why is this? I would have thought that the result projected via RETURN would not depend on the results of a seemingly arbitrary MATCH clause whose variables are not even in the RETURN projection.

thanks all!

dana_canzano · November 15, 2024, 7:47pm

@saman
the 2nd query basically says:
find me a node, any node in the graph, and for that node return the value "a".

So if you have 3 nodes you would get 3 rows of "a" returned.

I'm curious what you would expect in a traditional RDBMS with tables and rows.
If I ran

select 'a' from orders ;

and my orders table had 5 rows, would you not expect 5 rows of 'a'?
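One way to picture this, as a rough sketch and not a description of Neo4j's internals: every clause receives a table of rows and hands a table of rows to the next clause, and RETURN evaluates its expressions once per incoming row. The Python below models that with plain lists and dicts. The helper names `match_all_nodes` and `return_literal`, the three-node graph, and the choice to start from a single row with no variables are all assumptions made for illustration.

```python
# Toy model of Cypher's row-by-row evaluation (this is not Neo4j code).
# A "table" is a list of rows; each row is a dict of variable bindings.

NODES = ["n1", "n2", "n3"]  # hypothetical graph with three nodes


def match_all_nodes(rows, var):
    """Model of `MATCH (n)`: for each incoming row, emit one row per node."""
    return [{**row, var: node} for row in rows for node in NODES]


def return_literal(rows, alias, value):
    """Model of `RETURN <literal> AS alias`: one output row per incoming row."""
    return [{alias: value} for _ in rows]


# Modelling assumption: a query starts from one row with no variables.
start = [{}]

# Query 1: RETURN "a" AS LetterA
query1 = return_literal(start, "LetterA", "a")

# Query 2: MATCH (n) RETURN "a" AS LetterA
query2 = return_literal(match_all_nodes(start, "n"), "LetterA", "a")

print(query1)
print(query2)
```

In this model Query 1 sees one incoming row and Query 2 sees three, which is the only difference between them.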

saman · November 15, 2024, 8:31pm

Super interesting. To answer your question first: yes, I would expect the results you described if it were an RDBMS. I'm trying to make sure I understand Neo4j so I can use it effectively, since in this example with RETURN we are synthesizing data rather than returning what's in the db. So, let me test my understanding. Is it correct that the general statement is that the clause before the RETURN controls the projection of items listed in the RETURN clause? (e.g. in the case of query 2 above, the results of MATCH (n) control the projection of "a")

If so, what are the general rules for this? See the queries below for some examples of behavior using UNWIND instead of MATCH and a 2 row returned value (rather than just a 1 row "a").

For example
Query 3:

with ["a","b"] as x
unwind x as X //create X with two rows, row1="a" and row2="b"
match (n)
return X

results in this replication (which has a logic to it; in this case there are 6 n's in the db):

X
"a"
"a"
"a"
"b"
"b"
"b"

Meanwhile Query 4:

with ["a","b"] as x
unwind x as X //create X with two rows, row1="a" and row2="b"
return X

results in the simple results:

X
"a"
"b"

dana_canzano · November 15, 2024, 8:47pm

@saman

Is it correct that the general statement is that the clause before the RETURN controls the projection of items listed in the RETURN clause

yes
and your examples demonstrate that

saman · November 15, 2024, 8:58pm

Thanks Dana. Do you have the specific rules for how the clause before the RETURN controls the projection? If so, that would be great. Also, this seems like it could be quite powerful, much more so than the SELECT in a traditional SQL RDBMS. If you have any examples of queries that use this to do something interesting, that could be useful.

thanks again!

valerio.malenchino · November 15, 2024, 9:46pm

Hi saman,

If you want to understand Cypher's semantics, I suggest you read this short section of the Neo4j docs: Clause composition - Cypher Manual.
I found it very useful.

saman · November 17, 2024, 4:37pm

Thanks Valerio, I read it. Useful!

Do you understand the semantics of RETURN and know of any documentation for it? From what I can see (examples above) it is not like a typical "return" statement in a programming language, which just returns the RETURN's argument list to the calling program.

The semantics seem to be something like this: its behavior is governed by a side effect of whatever projection was done most recently before the RETURN statement. So, for example, if it is returning a variable with 1 row, it repeats that row to match the number of rows in the most recent previous projection (which may not be in the RETURN's argument list). If it is returning a variable with 2 rows, it uses some sort of expansion rules to replicate the 2 rows to align with the most recent prior projection.

These are obviously design choices in the semantics of RETURN and not some sort of bug.
If anyone can point to documentation on this it would be useful.

glilienfield · November 17, 2024, 6:14pm

The return statement returns each row of data with the values listed. It does not duplicate anything. What you are seeing that is confusing you is the unwind behavior. The unwind will expand the list into multiple rows, one per list element, and duplicate the other values in the current row onto each of those rows. See the example below:

saman · November 17, 2024, 6:34pm

Your explanation makes sense for the query you listed, since return * includes "element" in the argument list to return.

The case I was talking about is this query:

with [1,2] as list, "santa" as name
unwind list as element
return name

results in this result:

name
"santa"
"santa"

i.e. repetition of the value of name to match the number of rows in element, even though element is not in the argument list for RETURN.

glilienfield · November 17, 2024, 7:57pm

In that example, the unwind results in two rows:

element, name
1, "santa"
2, "santa"

The return then projects only the name variable in the result. The same outcome would occur with a "with" clause.
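To make the two steps concrete, here is a small Python model of that query. It is only a sketch of the row semantics, not Neo4j code, and the helper names `unwind` and `project` are invented for this illustration.

```python
# Toy model of WITH / UNWIND / RETURN (this is not Neo4j code).
# A "table" is a list of rows; each row is a dict of variable bindings.


def unwind(rows, list_var, element_var):
    """Model of `UNWIND list_var AS element_var`.

    Each incoming row becomes one row per list element; every other
    binding on the row is copied along unchanged.
    """
    return [{**row, element_var: item} for row in rows for item in row[list_var]]


def project(rows, *names):
    """Model of `RETURN a, b, ...`: keep only the named columns, row for row."""
    return [{name: row[name] for name in names} for row in rows]


# WITH [1,2] AS list, "santa" AS name
after_with = [{"list": [1, 2], "name": "santa"}]

# UNWIND list AS element
after_unwind = unwind(after_with, "list", "element")

# RETURN name
result = project(after_unwind, "name")

print(after_unwind)
print(result)
```

The duplication of name happens inside `unwind`; `project` only drops columns and never changes the number of rows.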

saman · November 17, 2024, 8:18pm

I think I might be on the verge of an Aha! moment.

So, are you saying that (using the example above) the model in Cypher is that when the UNWIND happens all the previously declared variables in the query (e.g. name in the WITH clause earlier) are replicated to align to the dimensions of the UNWIND's result?

So, the following is not how Cypher works: 1/ the UNWIND only operates on "list" and the other variables (e.g. name) are left alone, then 2/ the replication of name to align to the dimensions of list only happens if they appear together in the RETURN clause.

glilienfield · November 18, 2024, 12:13am

Yes. Unwind expands the list on each row into multiple rows, and the other values defined on that row are duplicated. A subsequent return or with statement can be used to pass all or some of the values on.

saman · November 18, 2024, 12:22am

That makes sense. So if I understand correctly, in the case of this query (where there are, say, 6 rows coming from match (n)):

with ["a","b"] as x
unwind x as X //create X with two rows, row1="a" and row2="b"
match (n)
return X

the result is:

X
"a"
"a"
"a"
"b"
"b"
"b"

Is the correct explanation for this that the results of the UNWIND (1 column, 2 rows) are combined/aligned with the results of the match (1 column, 6 rows) to create a table with 2 columns (n and X) and 6 rows? Then the RETURN just returns the 6 rows in the X column?

glilienfield · November 18, 2024, 1:53am

So close... there will actually be six rows of "a" and six rows of "b". The unwind produces two rows with X="a" and X="b". The match then executes for each row, generating six rows for X="a" and the same six rows for X="b". Each match row has its corresponding value of X appended to its results.

Test data:

unwind [1,2,3,4,5,6] as id
create (x:Test{id: id})

It is very clear if you also return the id for each node. You can see the match results repeat.
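For anyone without a database at hand, the same effect can be sketched in Python. This is a toy model of the row semantics and not Neo4j code; the helper names are made up, and the row order shown is an artifact of the model (Cypher does not guarantee an order without ORDER BY).

```python
# Toy model of UNWIND followed by MATCH (this is not Neo4j code).
# A "table" is a list of rows; each row is a dict of variable bindings.

TEST_NODE_IDS = [1, 2, 3, 4, 5, 6]  # stands in for the six :Test nodes


def unwind_literal(rows, values, var):
    """Model of `UNWIND [..] AS var`: one output row per value, per incoming row."""
    return [{**row, var: value} for row in rows for value in values]


def match_all(rows, var):
    """Model of `MATCH (n)`: runs once per incoming row, keeping that row's bindings."""
    return [{**row, var: node_id} for row in rows for node_id in TEST_NODE_IDS]


rows = [{}]                                   # modelling assumption: one empty starting row
rows = unwind_literal(rows, ["a", "b"], "X")  # 2 rows
rows = match_all(rows, "n")                   # 2 * 6 = 12 rows

# Model of `RETURN X, n.id`
result = [(row["X"], row["n"]) for row in rows]

for line in result:
    print(line)
```

The six node ids appear once under "a" and once again under "b", twelve rows in total.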

saman · November 18, 2024, 3:01pm

Wow, thank you! The fact that the match executes for each row from the previous clause (e.g. unwind) was lost on me. I think this demystifies a lot of behavior. I hope this thread is useful for others in the future.
thanks again!
saman

glilienfield · November 18, 2024, 7:41pm

Yes, the match executes for each row of data. Typically the match will use values from the row data, i.e., like a correlated query. In this case that correlation doesn't exist, so maybe that adds to the confusion. I am glad you have been enlightened.
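As a rough illustration of the correlated case, the Python sketch below models a hypothetical query along the lines of `unwind [2,5,9] as id match (n:Test {id: id}) return id, n.id` against the six test nodes from earlier in the thread. It is a toy model, not Neo4j code, and the helper names are invented.

```python
# Toy model of a correlated MATCH (this is not Neo4j code).
# A "table" is a list of rows; each row is a dict of variable bindings.

TEST_NODES = [{"id": i} for i in range(1, 7)]  # stands in for the six :Test nodes


def unwind_literal(rows, values, var):
    """Model of `UNWIND [..] AS var`: one output row per value, per incoming row."""
    return [{**row, var: value} for row in rows for value in values]


def match_test_by_id(rows, node_var, id_var):
    """Model of `MATCH (n:Test {id: id})`, run once per incoming row.

    The pattern reads the row's own `id` value, so each row only
    matches the nodes that agree with it.
    """
    return [
        {**row, node_var: node}
        for row in rows
        for node in TEST_NODES
        if node["id"] == row[id_var]
    ]


rows = unwind_literal([{}], [2, 5, 9], "id")  # 3 rows
rows = match_test_by_id(rows, "n", "id")      # rows with no matching node drop out

# Model of `RETURN id, n.id`
result = [(row["id"], row["n"]["id"]) for row in rows]
print(result)
```

Here each incoming row finds at most one node, and the row with id 9 disappears because MATCH found nothing for it, so the row count shrinks instead of multiplying.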

malsharafi · November 20, 2024, 10:04am

Thank you guys. I read that this is the effect of Cartesian products. The behavior of chained unwind, match, etc. is like nested loops over all nodes or elements.
thanks again

malsharafi · November 20, 2024, 10:35am

very helpful. Thanks
