Connect a New Node to its Correct Parent

raylukas97 · July 13, 2025, 4:40pm

Let's say a researcher in the Emerging Technology group wants to propose something like... Neo4j in their company. We create a Proposal node with a value of Neo4j. And we have a separate CSV file that has the corporate hierarchy (this Team is part of this Group, this Group is part of this Department, this Department is part of this Division, etc.).
But maybe some of these columns (Team, for example) are null. If this happens, I want the Proposal to point to the next level up (Group). If Group is null, then Department, etc.
What is a good way, in my CSV load script, to handle this scenario (other than tons of nested ForEach (When... ) clauses)? Can someone post an example?

Thanks Guys for Helping Me Out!

joshcornejo · July 14, 2025, 5:50am

This is an ingestion job that would probably be better handled in Python than by writing complicated one-off queries.

In Python you can do precisely what you want and even make it more sophisticated. This code assumes the CSV has a header row with the columns department, group, team and userid:

import csv
from typing import Optional, Tuple

def generate_cypher_queries(input_file: str, output_file: str) -> None:
    """
    Reads a CSV file and generates Cypher queries to create hierarchical relationships.
    
    Args:
        input_file: Path to the input CSV file
        output_file: Path to save the generated Cypher queries
    """
    # Variables to keep track of the last non-empty values
    last_department: Optional[str] = None
    last_group: Optional[str] = None
    last_team: Optional[str] = None
    
    # newline='' lets the csv module handle line endings inside quoted fields
    with open(input_file, 'r', newline='') as csv_file, open(output_file, 'w') as out_file:
        reader = csv.DictReader(csv_file)
        
        for row in reader:
            # Get values from current row; blank or whitespace-only cells become None
            department = (row['department'] or '').strip() or None
            group = (row['group'] or '').strip() or None
            team = (row['team'] or '').strip() or None
            userid = (row['userid'] or '').strip()
            
            # Determine which values to use: a blank cell inherits the value from
            # the previous row, but only while every higher level is blank as well
            current_department = department if department is not None else last_department
            current_group = group if group is not None else (last_group if department is None else None)
            current_team = team if team is not None else (last_team if group is None and department is None else None)
            
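            # Update last values for next iteration; a new higher level resets the
            # remembered lower levels so they are not carried over to a new parent
            if department is not None:
                last_department = department
                last_group = None
                last_team = None
            if group is not None:
                last_group = group
                last_team = None
            if team is not None:
                last_team = team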
                
            # Validate we have all required values
            if not userid:
                print(f"Warning: Skipping row with empty userid: {row}")
                continue
                
            if current_department is None:
                print(f"Warning: No department found for user {userid}")
                continue
                
            # Generate the Cypher query
            query_parts = []
            
            # Create department if it doesn't exist
            query_parts.append(
                f"MERGE (d:Department {{name: '{escape_cypher(current_department)}'}})"
            )
            
            # Create group if it exists
            if current_group:
                query_parts.append(
                    f"MERGE (g:Group {{name: '{escape_cypher(current_group)}'}})"
                )
                query_parts.append(
                    f"MERGE (g)-[:PART_OF]->(d)"
                )
                
                # Create team if it exists
                if current_team:
                    query_parts.append(
                        f"MERGE (t:Team {{name: '{escape_cypher(current_team)}'}})"
                    )
                    query_parts.append(
                        f"MERGE (t)-[:PART_OF]->(g)"
                    )
                    # Create user and connect to team
                    query_parts.append(
                        f"MERGE (u:User {{id: '{escape_cypher(userid)}'}})"
                    )
                    query_parts.append(
                        f"MERGE (u)-[:MEMBER_OF]->(t)"
                    )
                else:
                    # Connect user directly to group
                    query_parts.append(
                        f"MERGE (u:User {{id: '{escape_cypher(userid)}'}})"
                    )
                    query_parts.append(
                        f"MERGE (u)-[:MEMBER_OF]->(g)"
                    )
            else:
                # Connect user directly to department
                query_parts.append(
                    f"MERGE (u:User {{id: '{escape_cypher(userid)}'}})"
                )
                query_parts.append(
                    f"MERGE (u)-[:MEMBER_OF]->(d)"
                )
            
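            # Write all clauses as ONE statement so the variables d, g, t and u stay
            # bound; a semicolon after every MERGE would start a new statement
            out_file.write("\n".join(query_parts) + ";\n\n")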
            
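def escape_cypher(value: str) -> str:
    """Escape backslashes and quotes for a quoted Cypher string literal."""
    # Backslashes go first so the escapes added afterwards are not doubled
    return value.replace("\\", "\\\\").replace("'", "\\'").replace('"', '\\"')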

if __name__ == "__main__":
    input_csv = "input.csv"  # Change to your input file path
    output_cypher = "output.cypher"  # Change to your desired output file path
    
    generate_cypher_queries(input_csv, output_cypher)
    print(f"Cypher queries generated and saved to {output_cypher}")

... then you can run the output. I'll leave it to you if you want to incorporate the code that connects to Neo4j and executes the query.
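For the Proposal case in the original question, the per-row fallback ("Team if present, else Group, else Department, ...") can be sketched as a small helper. This is a minimal sketch rather than part of the script above: the column names (team, group, department, division), the node labels and the sample row are all assumptions made for illustration.

```python
from typing import Mapping, Optional, Tuple

# Hierarchy levels ordered from most specific to least specific.
# Column names and node labels are assumed, not taken from a real file.
LEVELS = [
    ("team", "Team"),
    ("group", "Group"),
    ("department", "Department"),
    ("division", "Division"),
]


def nearest_parent(row: Mapping[str, Optional[str]]) -> Optional[Tuple[str, str]]:
    """Return (label, name) for the most specific non-empty level, or None."""
    for column, label in LEVELS:
        value = (row.get(column) or "").strip()
        if value:
            return label, value
    return None


# Hypothetical row: the team cell is empty, so the group is used instead.
sample_row = {
    "team": "",
    "group": "Emerging Technology",
    "department": "R&D",
    "division": "Engineering",
}
print(nearest_parent(sample_row))
```

Because the levels live in one ordered list, adding another level is a one-line change instead of another nested conditional.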

raylukas97 · July 15, 2025, 2:02pm

Thanks, Josh. Well, my friend is, like me, a Java guy. But I think I get what you are doing here, which is a good idea if I understand it. Let me echo this back to see if I get it:
"Ray, Cypher, by design, needs to be a simple mechanism for anyone to use. What you are trying to do should be done in a higher-level language like Python or, in your case, Java. So what you can do is write a chunk of code to read in your CSV file and output the Cypher commands that you want to execute, perhaps running them from the same 'read this CSV and generate Cypher commands' code (just create a transaction to your database and fire the generated Cypher string(s) into it)."

Do I have this correct, Josh? (Basically: Ray, you were having trouble doing this inside your LOAD CSV script; Cypher doesn't really have this capability, but there are good workarounds, like this one.)

So Josh, did I get this right, sir?
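If the generated Cypher is fired straight into a transaction, passing values as query parameters avoids hand-escaping strings. A minimal sketch, assuming a Proposal node with a value property (as in the original question) and a hypothetical PROPOSED_IN relationship type. As far as I know, a node label cannot be supplied as an ordinary parameter, so the label is checked against a fixed allow-list (assumed names) before being placed in the query text.

```python
from typing import Any, Dict, Tuple

# Assumed label names; anything outside this set is rejected.
ALLOWED_LABELS = {"Team", "Group", "Department", "Division"}


def proposal_statement(
    proposal: str, parent_label: str, parent_name: str
) -> Tuple[str, Dict[str, Any]]:
    """Build a parameterized Cypher statement plus its parameter map."""
    if parent_label not in ALLOWED_LABELS:
        raise ValueError(f"Unexpected label: {parent_label}")
    query = (
        f"MERGE (parent:{parent_label} {{name: $parent_name}})\n"
        "MERGE (p:Proposal {value: $proposal})\n"
        # PROPOSED_IN is a hypothetical relationship type
        "MERGE (p)-[:PROPOSED_IN]->(parent)"
    )
    return query, {"parent_name": parent_name, "proposal": proposal}


query, params = proposal_statement("Neo4j", "Group", "Emerging Technology")
print(query)
print(params)
```

The returned pair could then be handed to whatever driver call you use inside your transaction, for example something like `session.run(query, params)` with the official Python driver; the Java driver offers an equivalent.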

joshcornejo · July 16, 2025, 5:58am

Yes ... no idea who Ray is.

john.stegeman · July 16, 2025, 12:20pm

He's the person you responded to.

joshcornejo · July 16, 2025, 12:21pm

Ahhh .... LOL ... got it now :D

joshcornejo · July 16, 2025, 12:29pm

@raylukas97 - You need to focus your work where it belongs:

ETL (extract-transform-load) is usually performed outside of the database (until the load). If you are loading deltas, you might do part of the transform inside the database, but building the entire process inside creates more complexity than required.

If you think about a database as sets, the focus of your fetches is always to collect the subset of information you require; most of the remaining logic belongs in the application.

I have very complex inserts (for example dynamic-length objects), and the statement to create one such object is 600+ lines. But I am going to use it to create thousands of objects in the future, and the "cohesion" is correct (the statement is focused only on creating the data, not on algorithms to figure out how the data is created).

A Python script is a straightforward way to do the "extract -> transform" and leave the "little transform -> load" to Cypher.
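One way to keep the Cypher side small is to let Python group the already-resolved rows by parent label and send each group as a single list parameter. This is a sketch under assumptions, not a tested loader: it assumes the hierarchy nodes already exist with a name property, the tuples of (proposal, parent label, parent name) were resolved beforehand, and PROPOSED_IN is a hypothetical relationship type. The sample rows are made up.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed label names for the hierarchy levels.
ALLOWED_LABELS = {"Team", "Group", "Department", "Division"}


def batch_by_label(
    rows: Iterable[Tuple[str, str, str]]
) -> Dict[str, List[Dict[str, str]]]:
    """Group (proposal, parent_label, parent_name) tuples by parent label."""
    batches: Dict[str, List[Dict[str, str]]] = defaultdict(list)
    for proposal, label, name in rows:
        batches[label].append({"proposal": proposal, "parent": name})
    return dict(batches)


def load_statement(label: str) -> str:
    """Build one UNWIND statement that loads a whole batch for one label."""
    if label not in ALLOWED_LABELS:
        raise ValueError(f"Unexpected label: {label}")
    return (
        "UNWIND $rows AS row\n"
        f"MATCH (parent:{label} {{name: row.parent}})\n"
        "MERGE (p:Proposal {value: row.proposal})\n"
        "MERGE (p)-[:PROPOSED_IN]->(parent)"
    )


# Made-up sample data for illustration only.
resolved = [
    ("Neo4j", "Group", "Emerging Technology"),
    ("Kafka", "Department", "R&D"),
]
for label, batch in batch_by_label(resolved).items():
    print(load_statement(label))
    print({"rows": batch})
```

All the branching stays in Python; each Cypher statement only matches a parent and merges the Proposal and its relationship.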

raylukas97 · July 16, 2025, 3:01pm

:) I am Ray. LOL, thanks, man.

raylukas97 · July 16, 2025, 3:03pm

Right, I like this. Thanks, man. I am a Java guy, but all this still holds true. Thanks, Josh.
