Cypher query to handle a few scenarios

mkdir491 · August 2, 2023, 12:21pm

hi,
in my Neo4j graph I have internet usernames which connected to persons like this:
(username) <- [has_username] - (person)
while I am loading the incremental data a few scenarios might be.
let's take the username: abc1234 for example
abc1234 can be multiple times in the DB while each abc1234 has it's own unique id property.
for example:
abc1234 with property unique id = 1 belong to person x
abc1234 with property unique id = 2 belong to person y
sometimes there a abc1234 which is still not connected to any person.

I am trying to use cypher in order to load the usernames csv file as follows:

CALL apoc.periodic.iterate(
    "
LOAD CSV FROM 'file:///neo4j/UserName.csv' AS line with line where line[0] is not null with line[0] as id, line[1] as userName, line[2] as idNumber, line[4] as status, line[5] as startDate, line[6] as endDate
RETURN id, userName, idNumber, status, startDate, endDate
    ",
    "
MERGE(un:UserName {userName: userName})
ON CREATE set un.id=id, un.userName=userName, un.idNumber=idNumber, un.status=status, un.startDate=startDate, un.endDate=endDate
with un
MATCH(un)
OPTIONAL MATCH (un)<-[:HAS_USERNAME]-(n:IdNumber)
with un, n, case when n is null then true else false end as hasusername
case
when hasusername then
set un.id=id, un.userName=userName, un.idNumber=idNumber, un.status=status, un.startDate=startDate, un.endDate=endDate
else
set un.status=status, un.endDate=endDate
end
   "  ,
    {batchSize:5000, parallel:False} 
);

what I am trying to do is basically create UserName node when it's not exists or
if it does exists and has a relationship with a person , I want to match it and change 2 properties or
if it does exists and has no relationship with a person, I want to match it and change all properties.

my query isn't working for no matter what, may I get some help from you guys please? :)

thank you!!!

glilienfield · August 2, 2023, 1:40pm

There are a couple things I see as problematic. First, on your 'with' clause where you define 'hasusername' you also try to use it in your 'case' statement. You can't use a variable defined on a 'with' on the same 'with'. You can change two 'with' clauses together to define and then use a variable. Secondly, the 'case' statement can not 'set' values conditionally. You can use a 'case' statement on the right-hand side of a 'set' operation to conditionally derive the value to set.

You can try the following refactored query. I used the 'call subquery' to implement a if/else type construct. You could also use apoc procedures that allow you to conditionally execute cypher blocks.

CALL apoc.periodic.iterate(
"
    LOAD CSV FROM 'file:///neo4j/UserName.csv' AS line 
    with line where line[0] is not null 
    with line[0] as id, line[1] as userName, line[2] as idNumber, line[4] as status, 
    line[5] as startDate, line[6] as endDate
    RETURN id, userName, idNumber, status, startDate, endDate
",
"
    MERGE(un:UserName {userName: userName})
    ON CREATE set un.id=id, un.userName=userName, un.idNumber=idNumber, un.status=status,
    un.startDate=startDate, un.endDate=endDate
    with un, exists{(un)<-[:HAS_USERNAME]-(:IdNumber)} as hasusername
    call{
        with un, hasusername
        with un, hasusername
        where hasusername
        set un.id=id, un.userName=userName, un.idNumber=idNumber, un.status=status, 
        un.startDate=startDate, un.endDate=endDate
    }
    call{
        with un, hasusername
        with un, hasusername
        where not hasusername
        set un.status=status, un.endDate=endDate
    }
",
{batchSize:5000, parallel:False} 
);

mkdir491 · August 6, 2023, 6:16am

Hi thank you so much!
sorry for the delay, I missed your answer.
I have an error while testing the query:

{

  "The EXISTS subclause is not valid inside a WITH or RETURN clause. (line 5, column 14 (offset: 378))\n"    with un, exists{(un)<-[:HAS_USERNAME]-(:IdNumber)} as hasusername"\n              ^": 4

}

I couldn't understand what was the error telling me.

glilienfield · August 6, 2023, 9:10am

I read the ab exists subquery can be used anywhere an expression can use. It didn’t seem true here. I converted it to the exists function. Try this instead:

CALL apoc.periodic.iterate(
"
    LOAD CSV FROM 'file:///neo4j/UserName.csv' AS line 
    with line where line[0] is not null 
    with line[0] as id, line[1] as userName, line[2] as idNumber, line[4] as status, 
    line[5] as startDate, line[6] as endDate
    RETURN id, userName, idNumber, status, startDate, endDate
",
"
    MERGE(un:UserName {userName: userName})
    ON CREATE set un.id=id, un.userName=userName, un.idNumber=idNumber, un.status=status,
    un.startDate=startDate, un.endDate=endDate
    with un, exists((un)<-[:HAS_USERNAME]-(:IdNumber)) as hasusername
    call{
        with un, hasusername
        with un, hasusername
        where hasusername
        set un.id=id, un.userName=userName, un.idNumber=idNumber, un.status=status, 
        un.startDate=startDate, un.endDate=endDate
    }
    call{
        with un, hasusername
        with un, hasusername
        where not hasusername
        set un.status=status, un.endDate=endDate
    }
",
{batchSize:5000, parallel:False} 
);

mkdir491 · August 6, 2023, 1:53pm

Hi,
thanks again, it worked.
I just need to find a way to merge by id property if there is HAS_USERNAME relationship because it is currently updating by unique usernames and it can be multiple 'ABC123', it's updating properties to all of them together.

glilienfield · August 7, 2023, 1:12am

Not sure I understand. Why don't you search by Id? if one is not found a new one will be created.

If that does not make sense based on your data model, can you search for the IdNumber node and then get the UserName node if there is a link? If the link does not exists then create the UserName and link it to an IdNumber node.

I can try to help if you can help me understand what you need.

mkdir491 · August 8, 2023, 5:20am

My problem is that I can't merge as a start with id property because there is a problem that I currently have with the data, I get usernames without an id property as a start and only a day later I get the results with id, an additional problem is that I can't wait for the next day And I have to load them as soon as they are created into the database because this is the user's requirement.

please take a look at my screenshot.
UserName ABC123 can be created multiple times but it is not the same UserName, each UserName has unique id property, the property of the name dos not matter. even though their sharing the same name.
what happens is while trying to merge, if it exists I don't know how to merge by Id.
the query setting all properties to both nodes.

Here are the possible scenarios:
1 - There is no such username at all and then I want to create it.
2- This username exists and then I have 3 options:
Or I check if he has a relationship called HAS_USERNAME and if so, I only update it with the status and the cancellation date.
Or I check if he has no relationship called HAS_USERNAME and then I want to update all the properties.
Or it exists both with and without a connection and then I want each of them to get the features they should get.
to handle these scenarios i must use the unique id property.

Topic		Replies	Views
Creating a unique relationship in Neo4j depending on multiple WHERE clauses via LOAD CSV Cypher apoc , cypher	0	412	March 11, 2020
Create new node/relationship based on availability of certain attributes in LOAD CSV Cypher load-csv	4	727	July 4, 2020
Node then Relationship Newbie Questions cypher , neo4j-desktop	2	326	July 2, 2020
Migrating Database to Neo4J as a newbie Newbie Questions apoc	12	420	June 1, 2021
Populating database from CSV file with relations Cypher cypher	3	384	December 1, 2020

Cypher query to handle a few scenarios

Related topics