Params and nodes creation during iterations

I have the following params:

    component: [
        {
            id: 'component1',
            parts: ['part1', 'part2']
        },
        {
            id: 'component2',
            parts: ['part2', 'part3', 'component1']
        }
    ],

(All the parts should already exist, either from earlier in the query or pre-existing in the database, but in this case none of the components exist yet.)

I have to check whether a component that already has all the parts exists; if it doesn't, I create it and then associate the parts.

From the above, component1 should be created before component2.

But from my output of a list of parts that exist, it looks like component1 hasn't been created when component2 searches for it.

Can you share your query, and its EXPLAIN plan?

The general approach for this kind of query would be to UNWIND the $component list parameter, and use a subquery CALL {} after the UNWIND so each element gets processed by the subquery in order.

Hi Andrew, the query is very mangled and around 400 lines at the moment. A lot of it deals with multi-tenancy: a "reference tenant" is inside the green triangle, and the sharing boundary is inside the blue circle. I start by creating the nodes nearest the blue circle and gradually move outwards, checking at each step whether a proposed group from the params already exists.

The problem seems to happen at the point now on the far right, where the top node ('beta') references the bottom node ('alpha'), which was created in the previous iteration of the same UNWIND. As you can see, the other edges are created fine to nodes that come from outside this iteration (pre-existing or from a previous UNWIND), but not the one with the hand-written line.

OK - I've wrapped the UNWIND inside a CALL and I now have the following issue:

UNWIND $theComponents AS theComponents
WITH theComponents, listOfPreviousElements
CALL {
  // all the checks, creations, etc etc etc ...

  // the new element is added ... duplicates removed
  // the new element is added to the listOfPreviousElements

  // but since lists are immutable, on the next iteration it goes back to the original list
}

The point of the CALL {} is to process the list elements one at a time: UNWIND exposes each element as its own row, and the subquery runs once per row, so each execution of the subquery handles the next element of the list.

So provided that the list ordering is such that the elements that are dependencies are processed earlier in the list, you should be fine.

For example, using your components, assuming that the parts all exist, and that the component dependencies are such that a component only uses components earlier in the list:

theComponents: [
    {
        id: 'component1',
        parts: ['part1', 'part2']
    },
    {
        id: 'component2',
        parts: ['part2', 'part3', 'component1']
    },
    {
        id: 'component3',
        parts: ['part3', 'component1', 'component2']
    }
]

You could process like this:

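UNWIND $theComponents AS component
CALL {
    WITH component
    // create the component if it doesn't exist yet
    MERGE (c:Component {id: component.id})
    WITH component, c
    // link the component to each of its parts, which must already exist
    UNWIND component.parts AS partId
    MATCH (part:Component {id: partId})
    MERGE (c)-[:USES]->(part)
}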

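There's no need for a separate list of previous elements: by the time a component is processed, the earlier elements of the list have already been processed and their nodes can be found with MATCH.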

But if there's something missing about what you're intending to do with this, please clarify.


If you have no guarantee that you'll be processing this with the ordering described, then provided the nested parts list has everything you need to create the node, just use MERGE for that instead of MATCH:

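UNWIND $theComponents AS component
CALL {
    WITH component
    // create the component if it doesn't exist yet
    MERGE (c:Component {id: component.id})
    WITH component, c
    // create any missing part on the fly instead of requiring it to exist
    UNWIND component.parts AS partId
    MERGE (part:Component {id: partId}) // changed from MATCH to MERGE
    MERGE (c)-[:USES]->(part)
}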

Thanks for the prompt response.

I can't just MERGE (I have to check beforehand whether the whole set belongs to the tenant).

And I need to carry the updated list over to the next part of the query (I need to process 5 more lists of params, and they all refer to this list that is being built as I process).

Okay, then do the processing in the order that makes sense.

Or at least list out the steps and the ordering they need to happen in. Once the steps are clear, see whether they can be accomplished with Cypher, or whether it requires breaking things up into multiple queries or a different approach completely, such as custom procedures.

Remember that Cypher is not an imperative language. While there are some things you can do to ensure some kind of ordering (such as UNWIND and subqueries), there are some things that it cannot do, or cannot do well (such as true looping and mutable lists).

If a single Cypher query isn't the right tool for the job you need this to do, then look for other options.
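
One such option is to settle the ordering on the client before the parameters are sent. The following is only a minimal Python sketch, not part of the original query: `order_components` is a hypothetical helper name, and it assumes that any id in `parts` which matches the `id` of another entry in the same list is a dependency, while every other id is a plain part that already exists. It relies on the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def order_components(components):
    """Return the components so that each one comes after the components it uses.

    Hypothetical helper: `components` is a list of dicts shaped like the
    $theComponents parameter, e.g. {"id": "component2", "parts": [...]}.
    """
    by_id = {c["id"]: c for c in components}
    sorter = TopologicalSorter()
    for c in components:
        # Assumption: only ids that are themselves components in this list
        # count as dependencies; other ids are treated as pre-existing parts.
        deps = [p for p in c["parts"] if p in by_id]
        sorter.add(c["id"], *deps)
    # static_order() yields dependencies before the components that use them
    # and raises graphlib.CycleError if the components reference each other
    # in a cycle.
    return [by_id[cid] for cid in sorter.static_order()]
```

The returned list could then be passed as the `$theComponents` parameter, so that the UNWIND sees dependencies first. A circular reference between components cannot be ordered this way and surfaces as a `graphlib.CycleError` instead of a silently missing relationship.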

I know it isn't imperative, and fair enough that lists are immutable.

I would rather have everything in one single statement (otherwise I have to introduce explicit transactions, weird recoveries, etc.).

Since this is just a one-off within the creation, and this path in the statement isn't going to be that common (or heavy), I thought overnight about using a temporary node to store my list:

// generate a unique identifier for the temporary node
WITH randomUUID() AS tempNode, ['a','b','c'] AS someProperties
// create the temporary node that holds the list
CREATE (:tNode {id: tempNode, lov: someProperties})

// the items to add
WITH tempNode, ['d', 'e', 'f'] AS toAddProperties
UNWIND toAddProperties AS theProperty
CALL {
    WITH tempNode, theProperty
    MATCH (G:tNode {id: tempNode})

    // -----------------------------------------
    // do whatever else I need to do in the query
    // -----------------------------------------

    // append the item to the list stored on the node
    SET G.lov = G.lov + theProperty

    // ignore the result
    RETURN true AS test
}
WITH DISTINCT tempNode AS theUUID

// fetch the new list into the variable
MATCH (G:tNode {id: theUUID})
WITH G.lov AS someProperties, G
// delete the temporary node
DELETE G

// continue with the next part of the statement
WITH someProperties
...
// ...

... and continue ...

I had to do some extra magic: since I had a map, and node properties can't hold maps, I had to split it into 2 lists to save them as properties (and they have to be merged and split again on each iteration).

But now I can create N-level relationships that are checked for pre-existing cases.

Added 8 labels, created 8 nodes, deleted 1 node, set 74 properties, created 23 relationships, started streaming 1 records after 476 ms and completed after 480 ms.

Now I will see if I can refactor some of the WITH clauses, as I have a lot of them :smiley:
