Create relationships for existing nodes in Cypher in an efficient way

I have a large document of JSON objects in the following format:


{...
  "employees": [
    {
      "id": "549394791bc19",
      "employee": "John Doe",
      "dep": "Google"
    },
    {
      "id": "9907849bv2312",
      "employee": "James",
      "dep": "Meta"
    },
    {
      "id": "987738347bd323",
      "employee": "Danila",
      "dep": "YouTube"
    }
  ],...
}

I have nodes labelled Employee, which contain the id and the employee attribute, and nodes labelled Department, which contain the dep attribute.

How can I create a relationship between each employee and its dep for all of them?

This is my solution, but it is really slow. It needs to be quite fast because the JSON document contains thousands of JSON objects.

CALL apoc.periodic.iterate(    
"CALL apoc.load.json('sampled.json') YIELD value
UNWIND value.employees as employee RETURN employee",
"MATCH (e:employee {employee: employee.employee})
MATCH (d:Department {dep:employee.dep})
MERGE(e)-[r:workAt]->(d)",
{batchSize:8000,parallel:false}
)

@myke-sch

a. Neo4j version?
b. Do you have indexes on :employee(employee) and :Department(dep)?

The version is 5.12.0

I have the indexes employee_id and dep_id

@myke-sch

You say you have the indexes employee_id and dep_id, but your matches are on

MATCH (e:employee {employee: employee.employee})
MATCH (d:Department {dep:employee.dep})

which would suggest these MATCHes would benefit from indexes on :employee(employee) and :Department(dep)
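
As a rough illustration, creating those two indexes in Neo4j 5 could look something like the sketch below. The index names employee_name_idx and department_dep_idx are made up for this example, and the labels are written exactly as they appear in the query. Labels are case-sensitive, so if the nodes are actually labelled :Employee, both the index and the MATCH would need that spelling.

```cypher
// Hypothetical index names; labels and properties are copied from the MATCH clauses.
CREATE INDEX employee_name_idx IF NOT EXISTS
FOR (e:employee) ON (e.employee);

CREATE INDEX department_dep_idx IF NOT EXISTS
FOR (d:Department) ON (d.dep);

// List the existing indexes with the label and property each one covers.
SHOW INDEXES YIELD name, labelsOrTypes, properties, state;
```

If SHOW INDEXES reports that employee_id and dep_id cover a different label or property (for example id), those indexes cannot help these particular MATCH clauses.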

So do I need to change them to the following?
MATCH (e:employee {employee: employee.employee_id})
and
MATCH (d:Department {dep:employee.dep_id})

@myke-sch

As to what you should do:

When you have a statement similar to

MATCH (e:employee {employee: employee.employee})

this effectively says: find me a node with the label :employee that has a property named employee whose value is whatever employee.employee evaluates to.

Now, if you have an index on :employee(employee), the query will be really fast, as it will use the index to find the node.
If you do not have an index on :employee(employee), then to find that node it will need to iterate over each and every :employee node, whether that be 100 nodes or 100k nodes.
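
One way to check which of the two cases applies is to profile a single lookup and read the plan. This is only a sketch: 'John Doe' is a sample value taken from the JSON in the question, and the label is spelled as in the original query.

```cypher
// Profile one lookup to see how Neo4j finds the node.
// 'John Doe' is a sample value from the question's JSON.
PROFILE
MATCH (e:employee {employee: 'John Doe'})
RETURN e;
```

A plan containing an operator such as NodeIndexSeek indicates the index is being used. A NodeByLabelScan followed by a Filter indicates every :employee node is being checked.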
