How to speed up APOC JSON load

If all your indexes are in place and they are not what is slowing down the MERGEs, then I think the problem is that you are loading the entire JSON into memory and committing everything in a single transaction. I would recommend using apoc.periodic.iterate to split the work into batches, so that each batch is committed in its own transaction.

Something like this:

CALL apoc.periodic.iterate(
"UNWIND ['file:///CH.json'] AS filename
CALL apoc.load.json(filename) YIELD value AS v
RETURN v",
"MERGE (p:Paper {title: v.title})
   ON CREATE SET p.abstract = COALESCE(v.abstract, 'NULL'),
       p.lensId = v.lens_id,
       p.datePublished = COALESCE(v.date_published.date, 'NULL'),
       p.publicationType = COALESCE(v.publication_type, 'NULL'),
       p.scholarlyCitationsCount = v.scholarly_citations_count,
       p.patentCitationsCount = v.patent_citations_count

 FOREACH (fund IN v.funding |
   MERGE (f:Funding {name: COALESCE(fund.org, 'NULL')})
     ON CREATE SET f.country = COALESCE(fund.country, 'NULL'),
         f.fundingId = COALESCE(fund.funding_id, 'NULL')
   MERGE (f)-[:FUNDED]->(p))

 FOREACH (author IN v.authors |
   MERGE (a:Author {name: COALESCE(author.first_name + ' ' + author.initials + ' ' + author.last_name, 'NULL')})
     ON CREATE SET a.firstName = author.first_name,
         a.initials = author.initials,
         a.lastName = author.last_name,
         a.nameInstitution = COALESCE(author.first_name + '_' + author.initials + '_' + author.last_name + '_' + author.affiliations[0].name, 'NULL')
   MERGE (a)-[:AUTHORED]->(p)

   FOREACH (affiliation IN author.affiliations |
     MERGE (i:Institution {name: COALESCE(affiliation.name, 'NULL')})
       ON CREATE SET i.gridID = COALESCE(affiliation.grid.id, 'NULL')
     MERGE (a)-[:WORKS_AT]->(i)))"
,{batchSize:100, parallel:false})
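
On the index point: each MERGE has to look up its node by the property in the pattern, so it is worth double-checking that every one of those properties is indexed. A minimal sketch, assuming Neo4j 4.2 or later (older versions use a different index syntax). The index names are made up for illustration; the labels and properties are the MERGE keys from the query above:

```cypher
// Index names (paper_title, funding_name, ...) are hypothetical.
// Labels and properties are the keys used in the MERGE patterns of the import.
CREATE INDEX paper_title IF NOT EXISTS FOR (p:Paper) ON (p.title);
CREATE INDEX funding_name IF NOT EXISTS FOR (f:Funding) ON (f.name);
CREATE INDEX author_name IF NOT EXISTS FOR (a:Author) ON (a.name);
CREATE INDEX institution_name IF NOT EXISTS FOR (i:Institution) ON (i.name);
```

Run these once before the import and let the indexes come online; without them every MERGE falls back to a label scan, and batching alone will not help much.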