Two unwind commend in the same query

I successfully crated the following cypher query.
there are no errors on the run but my suspicious is that the second unwind is incorrect.
unwind range(0,size(vEmp)-1) as emp_index

MATCH x=(n:Employee{Workforce:'Y'})-[r:MTD2023]->(m:Month)
with collect(n.ID) as vEmp,collect(m.ID) as vMonth,collect(m:SalaryMTD)as vMTD, n.Salary as vSalary, m.DaysMonth as vDaysMonth,m.DaysYear as vDaysYear,(n.Salary/365)*m.DaysMonth as vSalaryMTD, (n.Salary/365)*m.DaysYear as vSalaryYTD
unwind range(0,size(vMonth)-1) as index
unwind range(0,size(vEmp)-1) as emp_index
merge (p:Month{ID:vMonth[index]})
merge (e:Employee{ID:vEmp[emp_index]})
merge (e)-[r:MTD2023]->(p)
set r.MTD2023=vSalaryMTD
Return r LIMIT 50

Any hint will help.

What is it that you are trying to calculate and update? In summary, you have a match that gives you a collections of rows of employees and months. You then group and collect employees and months together that have the same exact set of values for: n.Salary, vDaysMonth, vDaysYear, vSalaryMTD, and vSalaryYTD. Then you form the Cartesian product between the employee and month elements through the double-unwind, just to update the the relationship between the existing employee and month with the value of vSalaryMTD. I am thinking this can be simplified. I don't have insight into the data, but are you getting vMonth lists that have just one element?

First, Thank you for taking the time. You are correct in the description of the steps.
I was able to resolve the issue by adding call function e.g. call{with }

Can you please let me know how can I run this query more efficient.
Thanks in advance!

MATCH (n:Employee) where n.Workforce='Y'
with n.ID as EmpList,n.Salary as EmpSalary
MATCH (m:Month) where m.Year=2023
with m.DaysMonth as vDaysMonth,m.DaysYear as vDaysYear,collect(m.ID) as vPeriod,collect(EmpList) as EmpID,EmpSalary
call{with EmpSalary,vDaysMonth,vDaysYear
return round((EmpSalary/365)*vDaysMonth) as SalaryMTD, round((EmpSalary/365)*vDaysYear) as SalaryYTD}
With SalaryMTD,SalaryYTD,EmpID,vPeriod,vDaysMonth,vDaysYear,EmpSalary
unwind range(0,size(EmpID)-1) as index
unwind range(0,size(vPeriod)-1) as index2
merge(e:Employee{ID:EmpID[index]})
merge(p:Month{ID:vPeriod[index2]})
merge(e)-[r:Paycheck]->(p)
SET r.MTD2023=SalaryMTD
SET r.Period=vPeriod
SET r.DaysMonth=vDaysMonth
SET r.AnnualSalary=EmpSalary
return e,p,r

Are you able to provide sample data?

What do the relationship properties daysMonths and daysYear represent?

You changed this. You are no longer matching on an existing relationship between employee and month, but still are creating it. Is the purpose to create a relationship between each employee and every month and set the relationship properties based on that months?

I took a stab at it. I think this does the same thing:

match (m:Month) where m.Year=2023
with collect(m) as months
match (n:Employee) where n.Workforce='Y'
unwind months as m
merge(n)-[r:Paycheck]->(m)
SET r.MTD2023=round((n.Salary/365)*m.DaysMonth)
SET r.Period=[i in months | i.id]
SET r.DaysMonth=m.DaysYear
SET r.AnnualSalary=n.Salary
return n, r, m

Thank you again!
in the quick POC that I build , I stored the data in the employee and month modes (see attached) is that best practice?

Sure, they each seem to be attributes of that type of entity. One note though, in a graph database you generally don't store values that refer to another entity; you create a relationship instead. It looks like you are doing something like that with your month's parentNode and nextNode attributes. You should just create relationships to these nodes such as, HAS_PARENT and HAS_NEXT, instead of storing test attributes. Then you can find any month's parent and next node quickly, instead of matching on the other node's primary key.

Did you query give you the results you expected?

I received the following error after running the query.

Collections containing null values can not be stored in properties.

I got the property wrong. I didn't check your original code first. I used 'id' instead of 'ID' for the month property.

match (m:Month) where m.Year=2023
with collect(m) as months
match (n:Employee) where n.Workforce='Y'
unwind months as m
merge(n)-[r:Paycheck]->(m)
SET r.MTD2023=round((n.Salary/365)*m.DaysMonth)
SET r.Period=[i in months | i.ID]
SET r.DaysMonth=m.DaysYear
SET r.AnnualSalary=n.Salary
return n, r, m
1 Like

works like a charm!! and of course made the query more efficient

1 Like

If I may ask one more question please, the query successfully run with 2 employees,
when I try to run a presser test with 40K employees I'm receiving the following error

The allocation of an extra 2.0 MiB would use more than the limit 250.0 MiB. Currently using 248.0 MiB. dbms.memory.transaction.global_max_size threshold reached

My understanding is that due to the unwind of 40,000 rows. I'm using the free version for testing.
Is there workaround?

Thanks a bunch!
Avi

Maybe it is due to trying to save 40,000 records in a single transaction. You can trying using a 'call subquery in transactions' or the apoc.periodic.iterate equivalent.

He is a refactor of the code to back the updates into transactions of 10,000 records. You can try a smaller number if the memory issues still persists.

match (m:Month) where m.Year=2023
with collect(m) as months
match (n:Employee) where n.Workforce='Y'
call {
    with n, months
    foreach(m in months |
        merge(n)-[r:Paycheck]->(m)
        SET r.MTD2023=round((n.Salary/365)*m.DaysMonth)
        SET r.Period=[i in months | i.ID]
        SET r.DaysMonth=m.DaysYear
        SET r.AnnualSalary=n.Salary
    )
} in transactions of 10000 rows

I also removed the return statement, as this slows things down and requires memory to buffer the the results. It seems unnecessary when updating 40,000 records.

thanks for taking the time!
i received the following error when I ran the query

"A query with 'CALL { ... } IN TRANSACTIONS' can only be executed in an implicit transaction, but tried to execute in an explicit transaction."

Add ‘:auto’ at the beginning of the query

i added ":auto", now i received "Unable to complete transaction."
I also reduced the query to 100 rows

IN TRANSACTIONS OF 100 ROWS

should we try

CALL apoc.periodic.iterate(
"MATCH (o:Officer)
RETURN o",
"WITH o, size((o)--()) AS degree
SET o.degree = degree",
{batchSize:10000, parallel: true})

That is odd. What is the new query? That is not the same.

Yes, you can try the apoc method instead, but that is a different query. Did you resolve the issue with the original one?

the query : please note that I replaced Employee node with List

:auto match (m:Month) where m.Year=2023
with collect(m) as months
match (n:List) where n.Workforce='Y'
call {
with n, months
foreach(m in months |
merge(n)-[r:Paycheck]->(m)
SET r.MTD2023=round((n.Salary/365)*m.DaysMonth)
SET r.Period=m.ID
SET r.DaysMonth=m.DaysYear
SET r.AnnualSalary=n.Salary
)
}
in transactions of 10000 rows

Can you run the query with ‘explain’ to get the query plan and post it?

Attached Neo4j explain

Thanks a bunch!!