I anticipate >50 million nodes and >20 billion relationships.
I have been able to create nodes from the Persons file and relationships from the Calls files through neo4j-admin import.
The challenge comes when deleting the relationships for a certain callDate so that I can add newer relationships. It's painfully slow for large datasets.
MATCH ()-[r {callDate: 20200101}]->() DELETE r;
I found out I can't index relationship properties.
Is there a way to optimize this Cypher query? How could I re-model my CSVs?
Thanks much. Indeed, the APOC procedure you shared works (I just refactored the syntax a bit), but it's still a bit slow for the roughly 10 billion relationships I'm working with (6 months of data).
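For context, the APOC-based delete referred to above was presumably something along these lines; this is only a sketch, and the CALLED relationship type and batch settings are assumptions on my part:

CALL apoc.periodic.iterate(
  // stream the day's relationships in batches
  "MATCH ()-[r:CALLED {callDate: 20200101}]->() RETURN r",
  // delete each batch in its own transaction
  "DELETE r",
  {batchSize: 10000, parallel: false}
);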
I came across "db.index.fulltext.createRelationshipIndex" as a way of indexing a relationship property.
The index is currently populating; hopefully the Cypher will gain some speed once it's done.
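For reference, that procedure is typically invoked like this; a sketch only, assuming a CALLED relationship type, and keeping in mind that full-text indexes only cover string-valued properties, so callDate would need to be stored as a string for the index to apply:

// create a full-text index over the callDate property of CALLED relationships
CALL db.index.fulltext.createRelationshipIndex("callDateIdx", ["CALLED"], ["callDate"]);

// look up relationships through the index instead of scanning them all
CALL db.index.fulltext.queryRelationships("callDateIdx", "20200101")
YIELD relationship
DELETE relationship;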
Agreed; however, I had initially done a bulk import (neo4j-admin import).
Will neo4j-admin import preserve indexes if I create them in advance and then do the bulk import?
I don't think so, especially for relationship indexes.
I marked one of your messages as the solution because I tested it with a subset of the graph and it worked (it was fast), and also because I'm now working on a different issue.
Daniel,
I would suggest changing your data model so that the day of the call is its own node. It would look like:
(:Person) -[:CALLED_ON]->(:DayOfCall) <-[:RECEIVED_CALL]- (:Person)
Then you can index the DayOfCall nodes on their date property, and the DELETE will run much faster. Please note that you will still need to use apoc.periodic.iterate().
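A rough sketch of what that could look like; the index syntax here is Neo4j 4.x, and the date representation and batch settings are illustrative:

// index the day nodes on their date property
CREATE INDEX dayOfCall_date IF NOT EXISTS FOR (d:DayOfCall) ON (d.date);

// delete one day's call relationships in batches, starting from the indexed node
CALL apoc.periodic.iterate(
  "MATCH (:DayOfCall {date: 20200101})-[r]-() RETURN r",
  "DELETE r",
  {batchSize: 10000, parallel: false}
);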