How to speed up apoc json load

david_rosenblum · October 15, 2021, 3:59pm

If your indexes are all in place and they are not what is slowing down the merges, then I suspect the problem is that you are loading the entire JSON file into memory and committing everything in a single transaction. To address that, I would recommend using apoc.periodic.iterate to process the data in batches.

So, something like:

// First statement streams the JSON records; second statement runs once per record, committed in batches of 100.
CALL apoc.periodic.iterate(
"UNWIND ['file:///CH.json'] AS filename
CALL apoc.load.json(filename) YIELD value AS v
RETURN v",
"MERGE (p:Paper {title: v.title})
    ON CREATE SET p.abstract = COALESCE(v.abstract, 'NULL'),
        p.lensId = v.lens_id,
        p.datePublished = COALESCE(v.date_published.date, 'NULL'),
        p.publicationType = COALESCE(v.publication_type, 'NULL'),
        p.scholarlyCitationsCount = v.scholarly_citations_count,
        p.patentCitationsCount = v.patent_citations_count

FOREACH (fund IN v.funding |
    MERGE (f:Funding {name: COALESCE(fund.org, 'NULL')})
        ON CREATE SET f.country = COALESCE(fund.country, 'NULL'),
            f.fundingId = COALESCE(fund.funding_id, 'NULL')
    MERGE (f)-[:FUNDED]->(p))

FOREACH (author IN v.authors |
    MERGE (a:Author {name: COALESCE(author.first_name + ' ' + author.initials + ' ' + author.last_name, 'NULL')})
        ON CREATE SET a.firstName = author.first_name,
            a.initials = author.initials,
            a.lastName = author.last_name,
            a.nameInstitution = COALESCE(author.first_name + '_' + author.initials + '_' + author.last_name + '_' + author.affiliations[0].name, 'NULL')
    MERGE (a)-[:AUTHORED]->(p)

    FOREACH (affiliation IN author.affiliations |
        MERGE (i:Institution {name: COALESCE(affiliation.name, 'NULL')})
            ON CREATE SET i.gridID = COALESCE(affiliation.grid.id, 'NULL')
        MERGE (a)-[:WORKS_AT]->(i)))",
{batchSize: 100, parallel: false})
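
On the index side, here is a rough sketch of what "all your indexes correct" could look like for the query above: one index per property used as a MERGE key. This assumes Neo4j 4.x or later (the `FOR ... ON` syntax), and the index names are hypothetical, so pick whatever names suit you. Run these once before the import rather than as part of it.

```cypher
// Index names (paper_title, funding_name, ...) are made up for illustration.
// Each index covers the property that the corresponding MERGE looks up.
CREATE INDEX paper_title IF NOT EXISTS FOR (p:Paper) ON (p.title);
CREATE INDEX funding_name IF NOT EXISTS FOR (f:Funding) ON (f.name);
CREATE INDEX author_name IF NOT EXISTS FOR (a:Author) ON (a.name);
CREATE INDEX institution_name IF NOT EXISTS FOR (i:Institution) ON (i.name);
```

Without an index on the MERGE key, each MERGE has to scan every node with that label, which tends to get slower as the graph grows. Whether a uniqueness constraint would be a better fit than a plain index depends on your data, so treat this as a starting point only.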
