Unable to create loops present in a process flow in Neo4j

Hello All,

I'm stuck trying to model loops in a use case where the operator ID repeats. These are the columns I'm working with:

Operator ID, Activity Done

14, enter

12, open

13, change

14, update

15, close

16, view

14, update

11, Note

I used the query below:

LOAD CSV WITH HEADERS FROM "file:///C:/Process.csv" AS line
CREATE (n:Operator)
SET n.linenumber = linenumber(),
    n.id = line.`Operator ID`,
    n.activity = line.`Activity Done`;

MATCH (n:Operator)
MATCH (n1:Operator) WHERE n1.linenumber = n.linenumber - 1
CREATE (n1)-[:HAS_RELATION]->(n)
REMOVE n.linenumber, n1.linenumber
RETURN n1, n;

In the output, however, I think the linenumber() function prevents looping back to a node that occurred earlier (correct me if I'm wrong). The output in Neo4j from the query above is attached below:

Desired output: I am trying to replicate the process flow with loops (attached as an image below).

Process_Flow

(By "loop" I mean connecting back to an operator ID that occurred previously in the process.)

How can I create such a graph? Suggestions, please.
Thank you.

Hello @munavath.2001077 :slight_smile:

You should use two files:

  • one to create nodes
  • one to create relationships

You can't have a loop with your current query because of this line:

MATCH (n1:Operator) WHERE n1.linenumber = n.linenumber-1
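
A minimal sketch of the idea behind separating node creation from relationship creation, with the ids hard-coded rather than read from the CSV. CREATE makes a new node for every row, so operator 14 ends up as several separate nodes and nothing can point back to it. MERGE on the id reuses the node that already exists:

```cypher
// Sketch only: ids are inlined here instead of being loaded from Process.csv.
// LOAD CSV yields strings, so the ids are written as strings too.
UNWIND ['14', '12', '13', '14'] AS id
// MERGE finds the existing node for a repeated id instead of creating a new one
MERGE (n:Operator {id: id})
RETURN count(DISTINCT n) AS operators
```

On an empty database this returns 3 rather than 4, because the second 14 resolves to the node created for the first one.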

Regards,
Cobra

Thank you @cobra for the reply.

I've tried it, but I don't understand how to connect the operator IDs row by row without using the statement below:
MATCH (n1:Operator) WHERE n1.linenumber = n.linenumber-1

If I use it, I'm not able to implement loops and visualize the process flow. Also, how could I implement this on a larger dataset?

It would really help my understanding if you could provide a solution.

Thank you.

Hello @munavath.2001077 :slight_smile:

This query will create both the nodes and the relationships. You will need APOC installed on the database:

LOAD CSV WITH HEADERS from "file:///C:/Process.csv" AS line
WITH collect(line.`Operator ID`) AS ids, collect(line.`Activity Done`) AS activities
UNWIND ids AS id
MERGE (n:Operator {id: id})
WITH DISTINCT ids, activities
UNWIND range(0, size(ids)-1) AS i
WITH ids[i] AS a, ids[i+1] AS b, i, activities
WHERE a IS NOT null AND b IS NOT null
MATCH (n:Operator {id: a})
MATCH (m:Operator {id: b})
CALL apoc.create.relationship(n, activities[i], {}, m)
YIELD rel
RETURN rel

The result (6 nodes and 7 relationships):
graph
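
To see what the range/index part of the query does on its own, here is a small sketch with the sample columns inlined as lists (no CSV or APOC needed). It pairs each row with the next one, which is how a repeated id such as 14 ends up connected back to the same node:

```cypher
// Sample data from the question, inlined as lists for illustration.
WITH ['14', '12', '13', '14', '15', '16', '14', '11'] AS ids,
     ['enter', 'open', 'change', 'update', 'close', 'view', 'update', 'Note'] AS activities
// Stop at size(ids) - 2 so that ids[i + 1] always exists
UNWIND range(0, size(ids) - 2) AS i
RETURN ids[i] AS source, activities[i] AS activity, ids[i + 1] AS target
```

This gives one source/target pair per relationship. The query above runs the range up to size(ids) - 1 and reaches the same pairs by filtering out the final null with WHERE.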

Regards,
Cobra

Thank you @cobra for the idea & query. It is helpful.

No problem, did that solve your problem?

Yes, it has solved my problem, @cobra.

Hello @cobra, I have a question regarding the query you suggested.

How can we add additional properties to the relationship arcs in the graph if I have a few extra columns in the dataset? Is it possible?

In the line below I tried to add an additional property:

CALL apoc.create.relationship(n, activities[i], {path:line.Path}, m)

But it shows the error below:

Variable line not defined (line 11, column 55 (offset: 448))
"CALL apoc.create.relationship(n, activities[i], {path:line.Path}, m)"

What am I doing wrong?

Thank you.

LOAD CSV WITH HEADERS from "file:///C:/Process.csv" AS line
WITH collect(line.`Operator ID`) AS ids, collect(line.`Activity Done`) AS activities, collect(line.Path) AS path
UNWIND ids AS id
MERGE (n:Operator {id: id})
WITH DISTINCT ids, activities, path
UNWIND range(0, size(ids)-1) AS i
WITH ids[i] AS a, ids[i+1] AS b, i, activities, path
WHERE a IS NOT null AND b IS NOT null
MATCH (n:Operator {id: a})
MATCH (m:Operator {id: b})
CALL apoc.create.relationship(n, activities[i], {path: path[i]}, m)
YIELD rel
RETURN rel
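
The "Variable line not defined" error comes from how WITH works: only the variables listed in a WITH clause stay in scope, so once the rows are collected into lists, line no longer exists and each column has to be carried along as its own list. A minimal sketch, with inline maps standing in for the CSV rows:

```cypher
// Inline maps stand in for the rows that LOAD CSV would produce.
UNWIND [{Path: '1'}, {Path: '2'}] AS line
WITH collect(line.Path) AS path
// From here on only path is visible; referring to line would fail.
UNWIND range(0, size(path) - 1) AS i
RETURN i, path[i] AS value
```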

Thank You @cobra for the reply.

As suggested, I added:

-----> CALL apoc.create.relationship(n, activities[i], {path: path[i]}, m)

But it is not picking up the values of the path column present in the dataset.

Look at the whole query. I have added:

  • line 2: , collect(line.Path) AS path
  • line 5: , path
  • line 7: , path
  • line 11: {path: path[i]}

Thank you @cobra for the reply.

Here is the additional column I have in my dataset:

Operator ID, Activity Done, path

14, enter,1

12, open, 2

13, change, 3

14, update, 4

15, close, 5

16, view, 6

14, update, 7

11, Note, 8

I have modified the query as you said, but the properties are not being added to the relationships:

Did you delete everything before running the query, or were there still nodes and relationships in the database? If nodes and relationships have already been created, we have to use apoc.merge.relationship:

CALL apoc.merge.relationship(n, activities[i],
  {path: path[i]},
  {path: path[i]},
  m,
  {path: path[i]}
)
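
This call is only a fragment: n, m, activities, path and i have to be bound by the clauses before it, so it cannot be run on its own. As a standalone sketch with hard-coded values (the ids, type and path value are simply taken from the first sample row), the arguments are, in order: start node, relationship type, identifying properties, properties set on create, end node, properties set on match:

```cypher
// Requires APOC. Values are hard-coded for illustration only.
MERGE (n:Operator {id: '14'})
MERGE (m:Operator {id: '12'})
WITH n, m
// Reuses an existing 'enter' relationship with path '1' if there is one,
// otherwise creates it
CALL apoc.merge.relationship(n, 'enter', {path: '1'}, {}, m, {})
YIELD rel
RETURN type(rel) AS type, rel.path AS path
```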

Yes @cobra, I deleted everything before running it.

After running it again with apoc.merge.relationship, it shows the error below:

Variable n not defined (line 1, column 30 (offset: 29))
"CALL apoc.merge.relationship(n, activities[i],"
^

This query works on my database:

LOAD CSV WITH HEADERS from "file:///C:/Process.csv" AS line
WITH collect(line.`Operator ID`) AS ids, collect(line.`Activity Done`) AS activities, collect(line.path) AS path
UNWIND ids AS id
MERGE (n:Operator {id: id})
WITH DISTINCT ids, activities, path
UNWIND range(0, size(ids)-1) AS i
WITH ids[i] AS a, ids[i+1] AS b, i, activities, path
WHERE a IS NOT null AND b IS NOT null
MATCH (n:Operator {id: a})
MATCH (m:Operator {id: b})
CALL apoc.merge.relationship(n, activities[i], {path: path[i]}, {path: path[i]}, m, {path: path[i]})
YIELD rel
RETURN rel

With this CSV:

Operator ID,Activity Done,path
14,enter,1
12,open,2
13,change,3
14,update,4
15,close,5
16,view,6
14,update,7
11,Note,8
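
Note that this query reads line.path, while the earlier one read line.Path. Assuming the header in the file is the lowercase path shown above, that difference may be why the property was missing before: the keys of the row map are case-sensitive, and a missing key yields null. A small sketch with an inline map in place of a CSV row:

```cypher
// The map stands in for one CSV row with a lowercase "path" header.
WITH {path: '1'} AS line
RETURN line.Path AS wrongCase,   // null: there is no key named "Path"
       line.path AS rightCase    // '1'
```

Since collect() skips null values, collect(line.Path) would then be an empty list, and path[i] would be null for every row.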

Yes, this is working @cobra, thank you :slight_smile:

Hello @munavath.2001077 :slight_smile:

You should open a different topic :slight_smile:

Regards,
Cobra

Okay Sure @cobra, I will create another topic.

Hi @cobra, thanks for the input you gave earlier in this topic; it was really helpful.
I'm working further on this use case and I'm stuck. Could you provide some input?

I have a question regarding the query you suggested below:

LOAD CSV WITH HEADERS from "file:///C:/Process.csv" AS line
WITH collect(line.`Operator ID`) AS ids, collect(line.`Activity Done`) AS activities, collect(line.path) AS path
UNWIND ids AS id
MERGE (n:Operator {id: id})
WITH DISTINCT ids, activities, path
UNWIND range(0, size(ids)-1) AS i
WITH ids[i] AS a, ids[i+1] AS b, i, activities, path
WHERE a IS NOT null AND b IS NOT null
MATCH (n:Operator {id: a})
MATCH (m:Operator {id: b})
CALL apoc.merge.relationship(n, activities[i], {path: path[i]}, {path: path[i]}, m, {path: path[i]})
YIELD rel
RETURN rel

Output is shown below:

My question is:
Can we add a common label for all relationship arcs, just as all 6 nodes have the label "Operator", and store the "Activity Done" column values as properties of the arcs along with the "path" column values?

Could you please help me with this?
Thank you.

Hello @munavath.2001077 :slight_smile:

LOAD CSV WITH HEADERS from "file:///C:/Process.csv" AS line
WITH collect(line.`Operator ID`) AS ids, collect(line.`Activity Done`) AS activities, collect(line.path) AS path
UNWIND ids AS id
MERGE (n:Operator {id: id})
WITH DISTINCT ids, activities, path
UNWIND range(0, size(ids)-1) AS i
WITH ids[i] AS a, ids[i+1] AS b, i, activities, path
WHERE a IS NOT null AND b IS NOT null
MATCH (n:Operator {id: a})
MATCH (m:Operator {id: b})
MERGE (n)-[r:ACTIVITY_DONE]->(m)
SET r.activity = activities[i], r.path = path[i]
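
One thing to watch on a larger dataset: MERGE (n)-[r:ACTIVITY_DONE]->(m) keeps a single relationship per pair of operators, so if the same transition occurs twice, the later row overwrites activity and path. If one relationship per row is wanted, CREATE could be used instead. A sketch under that assumption, with made-up inline data in which 14 -> 12 occurs twice:

```cypher
// Made-up sample data (not from the CSV); the transition 14 -> 12 repeats.
WITH ['14', '12', '14', '12'] AS ids,
     ['enter', 'open', 'enter', 'close'] AS activities,
     ['1', '2', '3', '4'] AS path
UNWIND ids AS id
MERGE (:Operator {id: id})
WITH DISTINCT ids, activities, path
UNWIND range(0, size(ids) - 2) AS i
MATCH (n:Operator {id: ids[i]})
MATCH (m:Operator {id: ids[i + 1]})
// CREATE keeps one relationship per row, even for repeated transitions
CREATE (n)-[r:ACTIVITY_DONE {activity: activities[i], path: path[i]}]->(m)
RETURN n.id AS source, r.activity AS activity, r.path AS path, m.id AS target
```

Unlike MERGE, re-running this adds the relationships again, so it is meant for a freshly cleared database.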

Regards,
Cobra
