Another "speed up the load" question from a relatively inexperienced Neo4j user

Cobra · July 8, 2020, 10:10pm

I'm so young, and no, I'm working for a startup, but we are open to consulting

So now you need to send the data in batches:

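import pandas as pd

# graphDB_Driver is assumed to be the neo4j driver you already created in your script,
# e.g. graphDB_Driver = GraphDatabase.driver(uri, auth=(user, password)).

BATCH_SIZE = 1000
BATCH = {'batch': []}

NODE_QUERY = "UNWIND $batch AS row CALL apoc.merge.node(['ProgNode', row.nodetype], {inode:row.inode}, apoc.map.removeKeys(properties(row), ['nodetype', 'inode'])) YIELD node RETURN 1"
RELATION_QUERY = "UNWIND $batch AS row MATCH (a:ProgNode{inode:row.a}) MATCH (b:ProgNode{inode:row.b}) CALL apoc.merge.relationship(a, 'PROGRAM', {}, apoc.map.removeKeys(properties(row), ['a', 'b']), b) YIELD rel RETURN 1"


def reset_batch():
    """
    Function to reset the batch.
    """
    BATCH["batch"] = []


def flush_batch(query):
    """
    Function to send the current batch to the database, then reset it.
    """
    if BATCH['batch']:
        with graphDB_Driver.session() as ses:
            ses.run(query, batch=BATCH["batch"])
        reset_batch()


def merge_relation(args):
    """
    Function to create relations from a batch.
    """
    BATCH['batch'].append(args.to_dict())
    if len(BATCH['batch']) >= BATCH_SIZE:
        flush_batch(RELATION_QUERY)


def merge_node(args):
    """
    Function to create nodes from a batch.
    """
    BATCH['batch'].append(args.to_dict())
    if len(BATCH['batch']) >= BATCH_SIZE:
        flush_batch(NODE_QUERY)


# A separator longer than one character is treated as a regular expression by pandas,
# so the pipes must be escaped and the Python parsing engine is used.
nodes = pd.read_csv(filepath_or_buffer='nodes.csv', header=[0], sep=r'\|\|', engine='python', encoding='utf-8')
relations = pd.read_csv(filepath_or_buffer='relations.csv', header=[0], sep=r'\|\|', engine='python', encoding='utf-8')

nodes.apply(merge_node, axis=1)
flush_batch(NODE_QUERY)  # send the last, partially filled batch of nodes
relations.apply(merge_relation, axis=1)
flush_batch(RELATION_QUERY)  # send the last, partially filled batch of relations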
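The batching idea can also be written without a global list. Here is a minimal sketch of a generator that cuts any sequence of rows into lists of at most 1000 items and also yields the final, smaller list. The name chunked and the sample rows are made up for illustration; they are not part of pandas or the Neo4j driver.

```python
from itertools import islice


def chunked(rows, size=1000):
    """Yield lists of at most `size` rows, including the final partial list."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Plain dicts standing in for CSV rows (made-up data).
rows = [{"inode": i, "nodetype": "File"} for i in range(2500)]
sizes = [len(chunk) for chunk in chunked(rows)]
print(sizes)  # [1000, 1000, 500]
```

With pandas you could feed it nodes.to_dict('records') and pass each chunk as the $batch parameter of the UNWIND query.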

Don't forget to add the unique constraint before you load, so that the lookups on inode are backed by an index:

CREATE CONSTRAINT constraint_inode ON (p:ProgNode) ASSERT p.inode IS UNIQUE
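Note that ON ... ASSERT is the Neo4j 4.x form of this statement. Neo4j 5 only accepts FOR ... REQUIRE (as far as I know, 4.4 accepts both). If your script has to run against either version, a small helper can pick the statement. This is only a sketch: constraint_statement is a hypothetical function, not something the driver provides, and you have to supply the major version yourself.

```python
def constraint_statement(neo4j_major_version):
    """Return the unique-constraint statement for a Neo4j major version.

    Hypothetical helper for illustration; not part of the Neo4j driver.
    """
    if neo4j_major_version >= 5:
        # Syntax required from Neo4j 5 onwards.
        return ("CREATE CONSTRAINT constraint_inode "
                "FOR (p:ProgNode) REQUIRE p.inode IS UNIQUE")
    # Syntax used by Neo4j 4.x, as in the statement above.
    return ("CREATE CONSTRAINT constraint_inode "
            "ON (p:ProgNode) ASSERT p.inode IS UNIQUE")


print(constraint_statement(4))
print(constraint_statement(5))
```

You would then run the returned string once, before the load, with the same session.run call used for the batches.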

You also need to install the APOC plugin on your database.

Documentation:

I'm not sure the code works correctly as is, but the idea is there. I hope it helps you.

Regards,
Cobra
