Two unwind commend in the same query

Avi · February 14, 2023, 10:41pm

I successfully crated the following cypher query.
there are no errors on the run but my suspicious is that the second unwind is incorrect.
unwind range(0,size(vEmp)-1) as emp_index

MATCH x=(n:Employee{Workforce:'Y'})-[r:MTD2023]->(m:Month)
with collect(n.ID) as vEmp,collect(m.ID) as vMonth,collect(m:SalaryMTD)as vMTD, n.Salary as vSalary, m.DaysMonth as vDaysMonth,m.DaysYear as vDaysYear,(n.Salary/365)*m.DaysMonth as vSalaryMTD, (n.Salary/365)*m.DaysYear as vSalaryYTD
unwind range(0,size(vMonth)-1) as index
unwind range(0,size(vEmp)-1) as emp_index
merge (p:Month{ID:vMonth[index]})
merge (e:Employee{ID:vEmp[emp_index]})
merge (e)-[r:MTD2023]->(p)
set r.MTD2023=vSalaryMTD
Return r LIMIT 50

Any hint will help.

glilienfield · February 16, 2023, 12:13am

What is it that you are trying to calculate and update? In summary, you have a match that gives you a collections of rows of employees and months. You then group and collect employees and months together that have the same exact set of values for: n.Salary, vDaysMonth, vDaysYear, vSalaryMTD, and vSalaryYTD. Then you form the Cartesian product between the employee and month elements through the double-unwind, just to update the the relationship between the existing employee and month with the value of vSalaryMTD. I am thinking this can be simplified. I don't have insight into the data, but are you getting vMonth lists that have just one element?

Avi · February 17, 2023, 12:04pm

First, Thank you for taking the time. You are correct in the description of the steps.
I was able to resolve the issue by adding call function e.g. call{with }

Can you please let me know how can I run this query more efficient.
Thanks in advance!

MATCH (n:Employee) where n.Workforce='Y'
with n.ID as EmpList,n.Salary as EmpSalary
MATCH (m:Month) where m.Year=2023
with m.DaysMonth as vDaysMonth,m.DaysYear as vDaysYear,collect(m.ID) as vPeriod,collect(EmpList) as EmpID,EmpSalary
call{with EmpSalary,vDaysMonth,vDaysYear
return round((EmpSalary/365)*vDaysMonth) as SalaryMTD, round((EmpSalary/365)*vDaysYear) as SalaryYTD}
With SalaryMTD,SalaryYTD,EmpID,vPeriod,vDaysMonth,vDaysYear,EmpSalary
unwind range(0,size(EmpID)-1) as index
unwind range(0,size(vPeriod)-1) as index2
merge(e:Employee{ID:EmpID[index]})
merge(p:Month{ID:vPeriod[index2]})
merge(e)-[r:Paycheck]->(p)
SET r.MTD2023=SalaryMTD
SET r.Period=vPeriod
SET r.DaysMonth=vDaysMonth
SET r.AnnualSalary=EmpSalary
return e,p,r

glilienfield · February 17, 2023, 1:21pm

Are you able to provide sample data?

What do the relationship properties daysMonths and daysYear represent?

You changed this. You are no longer matching on an existing relationship between employee and month, but still are creating it. Is the purpose to create a relationship between each employee and every month and set the relationship properties based on that months?

glilienfield · February 17, 2023, 2:53pm

I took a stab at it. I think this does the same thing:

match (m:Month) where m.Year=2023
with collect(m) as months
match (n:Employee) where n.Workforce='Y'
unwind months as m
merge(n)-[r:Paycheck]->(m)
SET r.MTD2023=round((n.Salary/365)*m.DaysMonth)
SET r.Period=[i in months | i.id]
SET r.DaysMonth=m.DaysYear
SET r.AnnualSalary=n.Salary
return n, r, m

Avi · February 17, 2023, 3:07pm

Thank you again!
in the quick POC that I build , I stored the data in the employee and month modes (see attached) is that best practice?

glilienfield · February 17, 2023, 3:36pm

Sure, they each seem to be attributes of that type of entity. One note though, in a graph database you generally don't store values that refer to another entity; you create a relationship instead. It looks like you are doing something like that with your month's parentNode and nextNode attributes. You should just create relationships to these nodes such as, HAS_PARENT and HAS_NEXT, instead of storing test attributes. Then you can find any month's parent and next node quickly, instead of matching on the other node's primary key.

Did you query give you the results you expected?

Avi · February 17, 2023, 6:32pm

I received the following error after running the query.

Collections containing null values can not be stored in properties.

glilienfield · February 17, 2023, 6:36pm

I got the property wrong. I didn't check your original code first. I used 'id' instead of 'ID' for the month property.

match (m:Month) where m.Year=2023
with collect(m) as months
match (n:Employee) where n.Workforce='Y'
unwind months as m
merge(n)-[r:Paycheck]->(m)
SET r.MTD2023=round((n.Salary/365)*m.DaysMonth)
SET r.Period=[i in months | i.ID]
SET r.DaysMonth=m.DaysYear
SET r.AnnualSalary=n.Salary
return n, r, m

Avi · February 17, 2023, 7:21pm

works like a charm!! and of course made the query more efficient

Avi · February 23, 2023, 2:47pm

If I may ask one more question please, the query successfully run with 2 employees,
when I try to run a presser test with 40K employees I'm receiving the following error

The allocation of an extra 2.0 MiB would use more than the limit 250.0 MiB. Currently using 248.0 MiB. dbms.memory.transaction.global_max_size threshold reached

My understanding is that due to the unwind of 40,000 rows. I'm using the free version for testing.
Is there workaround?

Thanks a bunch!
Avi

glilienfield · February 24, 2023, 12:18am

Maybe it is due to trying to save 40,000 records in a single transaction. You can trying using a 'call subquery in transactions' or the apoc.periodic.iterate equivalent.

He is a refactor of the code to back the updates into transactions of 10,000 records. You can try a smaller number if the memory issues still persists.

match (m:Month) where m.Year=2023
with collect(m) as months
match (n:Employee) where n.Workforce='Y'
call {
    with n, months
    foreach(m in months |
        merge(n)-[r:Paycheck]->(m)
        SET r.MTD2023=round((n.Salary/365)*m.DaysMonth)
        SET r.Period=[i in months | i.ID]
        SET r.DaysMonth=m.DaysYear
        SET r.AnnualSalary=n.Salary
    )
} in transactions of 10000 rows

I also removed the return statement, as this slows things down and requires memory to buffer the the results. It seems unnecessary when updating 40,000 records.

Avi · February 27, 2023, 10:41pm

glilienfield:

match (m:Month) where m.Year=2023
with collect(m) as months
match (n:Employee) where n.Workforce='Y'
call {
    with n, months
    foreach(m in months |
        merge(n)-[r:Paycheck]->(m)
        SET r.MTD2023=round((n.Salary/365)*m.DaysMonth)
        SET r.Period=[i in months | i.ID]
        SET r.DaysMonth=m.DaysYear
        SET r.AnnualSalary=n.Salary
    )
} in transactions of 10000 rows

thanks for taking the time!
i received the following error when I ran the query

"A query with 'CALL { ... } IN TRANSACTIONS' can only be executed in an implicit transaction, but tried to execute in an explicit transaction."

glilienfield · February 27, 2023, 10:55pm

Add ‘:auto’ at the beginning of the query

Avi · February 27, 2023, 11:07pm

i added ":auto", now i received "Unable to complete transaction."
I also reduced the query to 100 rows

IN TRANSACTIONS OF 100 ROWS

Avi · February 27, 2023, 11:09pm

should we try

CALL apoc.periodic.iterate(
"MATCH (o:Officer)
RETURN o",
"WITH o, size((o)--()) AS degree
SET o.degree = degree",
{batchSize:10000, parallel: true})

glilienfield · February 28, 2023, 5:35am

That is odd. What is the new query? That is not the same.

Yes, you can try the apoc method instead, but that is a different query. Did you resolve the issue with the original one?

Avi · February 28, 2023, 8:05pm

the query : please note that I replaced Employee node with List

:auto match (m:Month) where m.Year=2023
with collect(m) as months
match (n:List) where n.Workforce='Y'
call {
with n, months
foreach(m in months |
merge(n)-[r:Paycheck]->(m)
SET r.MTD2023=round((n.Salary/365)*m.DaysMonth)
SET r.Period=m.ID
SET r.DaysMonth=m.DaysYear
SET r.AnnualSalary=n.Salary
)
}
in transactions of 10000 rows

glilienfield · February 28, 2023, 11:56pm

Can you run the query with ‘explain’ to get the query plan and post it?

Avi · March 2, 2023, 4:50pm

Attached Neo4j explain

Thanks a bunch!!

Topic		Replies	Views
How to chain UNWIND clauses Neo4j Graph Platform cypher	12	10656	April 29, 2021
Using UNWIND to generate multiple MATCH statements Cypher cypher	6	67	February 27, 2025
Double unwinds in a cypher script Cypher cypher	10	1247	September 14, 2020
Nested match query help Neo4j Graph Platform migrated	11	230	July 23, 2022
Merge and Nested Unwind: How to write an efficient query wrt. indexing Neo4j Graph Platform performance , cypher , unwind , merge	1	1682	April 2, 2020

Get Certified in June!

Two unwind commend in the same query

Related topics