I anticipate >50 million nodes and >20 billion relationships.
I have been able to create nodes from the Persons file and relationships from the Calls files through neo4j-admin import.
The challenge comes when deleting the relationships for a certain callDate so that I can add newer relationships. It's painfully slow for large datasets.
MATCH ()-[r {callDate: 20200101}]->() DELETE r;
I found out I can't index relationship properties.
Is there a way to optimize this Cypher query? How could I re-model my CSVs?
Thanks much. Indeed, the APOC procedure you shared works (I just refactored the syntax a bit), but it's still a bit slow for the roughly 10 billion relationships I'm working with (6 months of data).
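For context, the APOC-based delete referred to above was presumably something along these lines; this is only a sketch, and the CALLED relationship type and batch settings are assumptions on my part:

CALL apoc.periodic.iterate(
  // stream the day's relationships in batches
  "MATCH ()-[r:CALLED {callDate: 20200101}]->() RETURN r",
  // delete each batch in its own transaction
  "DELETE r",
  {batchSize: 10000, parallel: false}
);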
I came across "db.index.fulltext.createRelationshipIndex" as a way of indexing a relationship property.
The index is currently populating; hopefully the Cypher will gain some speed once it's done.
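For reference, that procedure is typically invoked like this; a sketch only, assuming a CALLED relationship type, and keeping in mind that full-text indexes only cover string-valued properties, so callDate would need to be stored as a string for the index to apply:

// create a full-text index over the callDate property of CALLED relationships
CALL db.index.fulltext.createRelationshipIndex("callDateIdx", ["CALLED"], ["callDate"]);

// look up relationships through the index instead of scanning them all
CALL db.index.fulltext.queryRelationships("callDateIdx", "20200101")
YIELD relationship
DELETE relationship;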
Agreed; however, I had initially done a bulk import (neo4j-admin import).
Will neo4j-admin import preserve indexes if I create them in advance and then do the bulk import?
I don't think so, especially for relationship indexes.
I marked one of your messages as the solution because I tested it with a subset of the graph and it worked (it was fast), and also because I'm now working on a different issue.
Daniel,
I would suggest changing your data model so that the day of the call is its own node. It would look like:
(:Person) -[:CALLED_ON]->(:DayOfCall) <-[:RECEIVED_CALL]- (:Person)
Then you can index the DayOfCall nodes on their date property, and the DELETE will run much faster. Please note that you will still need to use apoc.periodic.iterate().
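A rough sketch of what that could look like; the index syntax here is Neo4j 4.x, and the date representation and batch settings are illustrative:

// index the day nodes on their date property
CREATE INDEX dayOfCall_date IF NOT EXISTS FOR (d:DayOfCall) ON (d.date);

// delete one day's call relationships in batches, starting from the indexed node
CALL apoc.periodic.iterate(
  "MATCH (:DayOfCall {date: 20200101})-[r]-() RETURN r",
  "DELETE r",
  {batchSize: 10000, parallel: false}
);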