MATCH hundreds of thousands of nodes and return in dataframe chunks

richard_lin
Node Link
def query(text_query, driver, db=None):
    # Initialize both names so the finally/return paths are safe even if
    # opening the session fails.
    session = None
    response = None
    try:
        session = driver.session(database=db)
        response = list(session.run(text_query))
    except Exception as e:
        print("Query failed:", e)
    finally:
        if session is not None:
            session.close()
    return response

import pandas as pd  # needed for the DataFrame below

text = "MATCH (n:technique) WHERE exists(n.embedding_vector) RETURN DISTINCT n.common_ds_cafe_id, n.embedding_vector"
tech_dtf_data = pd.DataFrame([dict(record) for record in query(text, driver, db='neo4j')])
tech_dtf_data['label'] = 'technique'
print("Techniques loaded:", tech_dtf_data.shape)

I have a function, query, which returns my results as a list, and I am able to load that data into a pandas DataFrame. I am wondering whether it is possible to load hundreds of thousands of node properties into a single DataFrame in chunks. I looked into apoc.periodic.iterate, but that seems only to return the number of rows processed and requires updating the nodes. I only want to return node property information in chunks.
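One pattern that might fit (a sketch, not from this thread): page the MATCH yourself with SKIP/LIMIT and build the DataFrame from per-page chunks. The names `fetch_in_chunks` and `run_query` are hypothetical, and a deterministic ORDER BY in the base query is assumed so that paging is stable.

```python
import pandas as pd


def fetch_in_chunks(run_query, base_query, chunk_size=10_000):
    """Yield one DataFrame per page of results.

    run_query:  a callable taking a Cypher string and returning a list of
                records (e.g. the query() helper above, with the driver
                already bound).
    base_query: a Cypher query WITHOUT SKIP/LIMIT; it should include an
                ORDER BY so pages do not overlap (assumption).
    """
    skip = 0
    while True:
        page = run_query(f"{base_query} SKIP {skip} LIMIT {chunk_size}")
        if not page:
            break  # no more rows
        yield pd.DataFrame([dict(record) for record in page])
        skip += chunk_size
```

The chunks can then be processed one at a time, or combined with `pd.concat(fetch_in_chunks(...), ignore_index=True)` if a single DataFrame is still wanted at the end.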

1 REPLY

maxim
Node

I think you can use apoc.periodic.commit to achieve this: apoc.periodic.commit - APOC Documentation
