MATCH hundreds of thousands of nodes and return in dataframe chunks

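import pandas as pd

def query(text_query, driver, db=None):
    """Run a Cypher query and return all records as a list."""
    session = None   # defined up front so the finally block can always check it
    response = []    # returned as-is if the query fails
    try:
        session = driver.session(database=db)
        response = list(session.run(text_query))
    except Exception as e:
        print("Query failed:", e)
    finally:
        if session is not None:
            session.close()
    return response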

text = "MATCH (n:technique) WHERE exists(n.embedding_vector) RETURN DISTINCT n.common_ds_cafe_id, n.embedding_vector"
tech_dtf_data = pd.DataFrame([dict(record) for record in query(text, driver=driver, db='neo4j')])
tech_dtf_data['label'] = 'technique'
print("Techniques loaded:", tech_dtf_data.shape)

I have a function, query, which returns my results as a list, and I can load that data into a pandas DataFrame. Is it possible to load the properties of hundreds of thousands of nodes into a single DataFrame in chunks? I looked into apoc.periodic.iterate, but it seems to return only the number of rows processed and to require updates to the nodes. I only want to return node property information in chunks.

I think you can use apoc.periodic.commit to achieve this. See the apoc.periodic.commit page in the APOC documentation.
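For comparison, a plain-Cypher approach that needs no APOC is to page through the result with SKIP and LIMIT and build the DataFrame from the pages. The sketch below is only an illustration of that idea, not something taken from the APOC docs: the helper name iter_chunks, the PAGED_QUERY string, the chunk_size default, and the column aliases are all made up for this example. It assumes the query has a stable ORDER BY and takes $skip and $limit parameters.

```python
# Hypothetical paged version of the query from the question. ORDER BY keeps
# the pages stable; $skip and $limit are filled in by iter_chunks below.
PAGED_QUERY = (
    "MATCH (n:technique) WHERE n.embedding_vector IS NOT NULL "
    "RETURN DISTINCT n.common_ds_cafe_id AS common_ds_cafe_id, "
    "n.embedding_vector AS embedding_vector "
    "ORDER BY common_ds_cafe_id SKIP $skip LIMIT $limit"
)


def iter_chunks(driver, cypher, chunk_size=50_000, db=None):
    """Yield lists of row dicts, at most chunk_size rows per list.

    `cypher` is assumed to use the $skip and $limit parameters.
    """
    skip = 0
    with driver.session(database=db) as session:
        while True:
            result = session.run(cypher, skip=skip, limit=chunk_size)
            rows = [dict(record) for record in result]
            if not rows:
                break  # no more data
            yield rows
            if len(rows) < chunk_size:
                break  # short page, so this was the last one
            skip += chunk_size


# Possible usage, assuming `driver` is an open neo4j driver and pandas is
# imported as pd:
#
#   frames = (pd.DataFrame(rows) for rows in iter_chunks(driver, PAGED_QUERY, db="neo4j"))
#   tech_dtf_data = pd.concat(frames, ignore_index=True)
```

Each page is a separate query, so only chunk_size rows are held as records at a time before they go into a DataFrame. One caveat: SKIP still has to walk past the skipped rows, so later pages get slower on very large results. Filtering on the last seen key (WHERE n.common_ds_cafe_id > $last) instead of SKIP avoids that if the key is unique and indexed.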
