Multiple Nodes in 1 column as target

lipchean · February 8, 2022, 7:50am

Hi,

The CSV shown above is the result of a manual manipulation from the original csv.

I then ran the following Cypher:
LOAD CSV WITH HEADERS FROM 'file:///Data_3.csv' AS row
MERGE (m:Module {ID: row.ID, RefBy1:COALESCE(row.RefBy1,"NA") , RefBy2: COALESCE(row.RefBy2,"NA")})
return m

Followed by
MATCH (child: Module)
MATCH (parent: Module) where parent.ID = child.RefBy1
MERGE (child)-[:RefBy]->(parent)
RETURN child, parent

Followed by
MATCH (child: Module)
MATCH (parent: Module) where parent.ID = child.RefBy2
MERGE (child)-[:RefBy]->(parent)
RETURN child, parent

The original CSV file is

Do you have any suggestions for how I can create the same model shown above with a single script? I have tried using UNWIND, as shown below, but nothing gets created.
LOAD CSV WITH HEADERS FROM 'file:///Data_1.csv' AS row
WITH row.ID AS impID, SPLIT(row.RefBy, ",") AS multiRef
UNWIND multiRef as impRef
MATCH (LI:LineItem {LIName: impID})
MATCH (RB:ReferenceBy {RBName:TRIM(impRef)})
MERGE (LI)-[rel:RefBy]->(RB)
return LI, RB

Any suggestions? Thanks in advance.

Cheers,
LC

lipchean · February 8, 2022, 10:18am

The model should look like this:

glilienfield · February 8, 2022, 7:18pm

Try this. It creates the nodes first, then processes the file again to create the relationships.

load CSV with headers FROM 'file:///Data_1.csv' AS row
merge(n:Module{ID:row.ID})
UNION ALL
load CSV with headers FROM 'file:///Data_1.csv' AS row
with row
where row.RefBy is not null
match(n:Module{ID:row.ID})
with n, split(row.RefBy, ",") as references
unwind references as reference
match(m:Module{ID:reference})
merge(n)-[:RefBy]->(m)

Test Data:
Screen Shot 2022-02-08 at 2.16.26 PM

Result:

lipchean · February 9, 2022, 7:02am

Hi @glilienfield

Your solution works. I do have a follow-up question: why is UNION ALL required? Is there a way to create both the row nodes and the referenced nodes from a single LOAD?

Thanks

glilienfield · February 9, 2022, 1:33pm

This seems to work too.

load CSV with headers FROM 'file:///Data_1.csv' AS row
merge(n:Module{ID:row.ID})
with n, row
call {
with n, row
with n, split(row.RefBy, ",") as references
unwind references as reference
merge(m:Module{ID:reference})
merge(n)-[:RefBy]->(m)
}
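
In case the RefBy cells contain spaces after the commas (for example "A, B"), here is a minimal sketch of the same single-pass load that trims each entry and skips empty ones. Whether your file needs this is an assumption on my part; the variable name rawRef is only illustrative.

```cypher
// Sketch: same single LOAD as above, tolerant of "A, B" style cells.
LOAD CSV WITH HEADERS FROM 'file:///Data_1.csv' AS row
MERGE (n:Module {ID: row.ID})
WITH n, row
CALL {
  WITH n, row
  // split() on a null RefBy yields null, and UNWIND of null produces no rows
  UNWIND split(row.RefBy, ",") AS rawRef
  WITH n, trim(rawRef) AS reference
  WHERE reference <> ""
  // MERGE creates the referenced node only if it does not exist yet
  MERGE (m:Module {ID: reference})
  MERGE (n)-[:RefBy]->(m)
}
```

Without the trim, a cell like "A, B" would merge a node whose ID is " B" (with a leading space), which is a separate node from "B".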

lipchean · February 11, 2022, 4:07am

Thanks @glilienfield
I'll try this too

lipchean · March 1, 2022, 8:06am

Hi @glilienfield

To make things a bit more complicated: how would you use the 'Value' column so that the script returns the path with the largest aggregated value? It should be A > C > D, because A = 2, C = 10 and D = 3, for a total of 15.

Thanks,
LC

glilienfield · March 1, 2022, 2:07pm

First update the query to consume the value attribute. Store it as an attribute of the node.

load CSV with headers FROM 'file:///Data_1.csv' AS row
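merge(n:Module{ID:row.ID})
// set value after the merge, so a node first created as a reference target is matched rather than duplicated
set n.value = toInteger(row.value)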
with n, row
call {
with n, row
with n, split(row.RefBy, ",") as references
unwind references as reference
merge(m:Module{ID:reference})
merge(n)-[:RefBy]->(m)
}

You can then sum the values along each path by applying the 'reduce' function to the collection of nodes on the path. The total is the path's rank.

match p=(n:Module)-[:RefBy*]->(m:Module)
with nodes(p) as nodes
return [i in nodes | i.ID] as pathNodes, reduce(s=0, i in nodes | s + i.value) as rank
order by rank desc
limit 1

Note that the query also returns partial paths, e.g. A->C. You can add a constraint to exclude them if you only want full paths. I did not, because all of your values are positive, so a partial path can never outrank the full path that contains it. Let me know if you may have negative values and want to eliminate partial paths as potential results.
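
If only full paths are wanted, a sketch of such a constraint could look like the query below. It assumes a "full path" is one that starts at a node with no incoming RefBy relationship and ends at a node with no outgoing one. The coalesce is an extra guard for nodes that never received a value.

```cypher
// Sketch: rank only paths that run from a start node to an end node.
MATCH p = (n:Module)-[:RefBy*]->(m:Module)
WHERE NOT ()-[:RefBy]->(n)   // nothing points at the first node
  AND NOT (m)-[:RefBy]->()   // the last node points at nothing
WITH nodes(p) AS pathNodes
RETURN [x IN pathNodes | x.ID] AS pathIDs,
       // treat a missing value as 0 so the sum does not become null
       reduce(s = 0, x IN pathNodes | s + coalesce(x.value, 0)) AS rank
ORDER BY rank DESC
LIMIT 1
```

With negative values in the data, this is the variant to use, since a partial path could otherwise score higher than the full path it belongs to.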

You can run the queries separately or concatenate them to achieve the result in one operation.
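
As a rough sketch of the concatenated form: the load and the ranking can be joined with an aggregating WITH, which collapses the per-row results into a single row so the ranking runs once, after every CSV row has been processed. The name loadedRows is only illustrative, and writing the value with SET after the MERGE is my assumption about how to keep one node per ID.

```cypher
// Sketch: load the file, then rank the paths, in one statement.
LOAD CSV WITH HEADERS FROM 'file:///Data_1.csv' AS row
MERGE (n:Module {ID: row.ID})
SET n.value = toInteger(row.value)
WITH n, row
CALL {
  WITH n, row
  UNWIND split(row.RefBy, ",") AS reference
  MERGE (m:Module {ID: reference})
  MERGE (n)-[:RefBy]->(m)
}
// count(*) is an aggregation, so all rows are loaded before the MATCH starts
WITH count(*) AS loadedRows
MATCH p = (:Module)-[:RefBy*]->(:Module)
WITH nodes(p) AS pathNodes
RETURN [x IN pathNodes | x.ID] AS pathIDs,
       reduce(s = 0, x IN pathNodes | s + coalesce(x.value, 0)) AS rank
ORDER BY rank DESC
LIMIT 1
```

Running the two queries separately gives the same result and is easier to debug, so the combined form is mostly a convenience.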

lipchean · March 2, 2022, 7:05am

Thanks for your very fast response, @glilienfield. I'll try it out.
