How to add a seqnuential number to nodes

I have the following question for you all, and I think it could be of general interest.

I have a cart containing products from different vendors
For each vendor I must create an order, and the order MUST have a sequential number starting form 1.

The query is as follow:

call {
    MATCH (o:Order)
        return o.number as m order by o.number desc limit 1
}
    MATCH (cart)-[contains:CONTAINS]->(product:Product)-[sold:IS_SOLD_BY]->(merchant:Merchant)
    WITH m,merchant, cart,collect({c: contains, p: product}) as merchantProducts, count(product) as noOfProducts

    CREATE (order:Order)-[b:BELONGS_TO]->(merchant)
            set order.number = **???**

If I have n merchant, each order should have a number value of m+1, m+2, m+3 ... m+n

And, of course, the solution should work in a multi-threaded environment!

And, finally, how can I set the initial value to '0' for the first order?

Any idea?

@paolodipietro58

sounds like you want a database sequence though even in a RDBMS world sequences are not guarenteed to be without gaps. Further Neo4j does not support sequences. You could create a sequence node for example create (n:Seq {id:1}); and then everytime you want new id, lock the node, increment and then release the lock. But this could get very lock heavy. And with RDBMS sequences many tried to avoid all this locking by for example giving out batches of ids. But this has side effects.

this is a very good question. I use Oracle and we use the built-in sequence generators to set all of our primary keys. We create a sequence generator for each table. Neo4j should consider such a capability. We could create named sequences and then call a function by sequence name to get the current value and have it incremented, all atomically.

I ended up building something like @dana_canzano referenced. I have a set of "Sequence" nodes representing each sequence I need for different entities. I lock the node, read the current value to return, and increment and set the new value. I do this on the server itself in a custom procedure. I will access it in a cypher query via a custom function so I can set entities keys in my application using the driver. I am already writing a library of custom procedures for my application, so it wasn't a big deal to add. It would be a lot of work for someone to do the same thing just for this if they are not already developing/deploying custom functions/procedures.

I don't yet know the performance impact of my solution yet, as @dana_canzano mentioned. I believe my application will be much more read heavy than write heavy, so I am expecting the locking to not be impactful.

Another idea for neo4j is to include a unique id with each entity that is permanent, unlike the current id. I don't really see the value of the current id in a multiuser environment for application development.

1 Like

Is the sequential number is based on merchant or based on order>?

It is based on Order.

Imagine the same thing for an invoice which must be ordered starting from the first of the year (so we could have duplicates ...): each invoice starts from 2022-1 then 2022-2 and so on, then again 2023-1 then 2023-2. The question is the same

See my answer below: maybe it couldn't be unique id because the number should restart from 1 at the beginning of the new year, just as an invoice number.

Hi @glilienfield ,

Another way you can always try is using an index on this incremental field and then using a trigger with APOC that on every creation of a node within this particular Label, you look for the max(onProperty) + 1. This approach should not lock that much.

On @paolodipietro58 case, the index should be a composite on (date, seq). The date +'-'+seq format can easily be handled by DTO on the application layer.

Bennu

Morning @bennu_neo,

I think this solution has a potential race condition in a multiuser environment, caused by two invoices being created concurrently and both trigger queries returning the same max value.

I saw this behavior when I built the solution I described above. I tested it by using a thread pool to call my sequence generator method concurrently. It was happening so fast that I got sequences back with many duplicates. I solved it by locking the sequence node before reading and updating the sequence value. I believe the only way to guarantee uniqueness in the sequence is to serialize the read and updates.

That being said, I think you have a good idea here. You could implement what I did, and @dana_canzano suggested, without needing to write custom procedures on the server. Instead, use a trigger to executes a cypher script that retrieves the sequence node, locks it by setting a dummy attribute, reads the sequence value, increments it, sets the value attribute with the updated value, removes the dummy attribute, and returns the new sequence value. The trigger can then set the invoice number on the invoice node. The cypher can prepend it with the year too. The sequence node tracks the sequence value and is a synchronizing object. This would be interesting to try.

You are always finding useful ways to slip APOC into solutions. I have learn a lot with your APOC suggestions.

Hi @glilienfield !

I'm always happy to help :smiley:

Good call on the race condition. You can always lock the nodes with apoc.lock.read.nodes as well. This may be a cool APOC functionality overall. Another thing to be aware is deadlock but if the trigger transaction is set properly, we should be able to get away with murder

Regards

Bennu

For Invoice number and other unique sequences, we use the following cypher (where 1200 is our start value)

MERGE (id:UniqueId {name:"Invoice"})
ON CREATE SET id.id = 1200
ON MATCH SET id.id = id.id+1
RETURN id.id

As this single query always is executed a transaction, you don't have to care about locking and you'll never get duplicates. You can include same logic in a bigger cypher that stores your orders and directly set order.id property with id.id. We use it in C# as part of a bigger transaction containing multiple cyphers so it is not commited, when the cart and order storing isn't commited in full.

But I'm not sure if it works in a cluster. Would be interested in that.

Best,
Reiner

1 Like

@Reiner Hi Reiner, well thanks for your suggestion: it looks like your proposal is a valid step ahead to get/initialize a unique sequential order-id.

But the main question is still unresolved. You described a method to have the first valid sequential id to be associated with a new order.

But How can I associate it if I'm working on a node set? In my original question I stated

MATCH (cart)-[contains:CONTAINS]->(product:Product)-[sold:IS_SOLD_BY]->(merchant:Merchant)
    WITH m,merchant, cart,collect({c: contains, p: product}) as merchantProducts, count(product) as noOfProducts

    CREATE (order:Order)-[b:BELONGS_TO]->(merchant)
            set order.number = **???**

So here, I have different merchants, and for each of them I created an order in just one CREATE statement, but for each order I created I need to have a different unique sequential id, which starts from the id.id you generated with your query.

So this is the core of the question!!!

Hello

Please note that you can create your own sequence generator library module and install it as a custom tranformer that you can reuse in many other projects as well.

Many thanks
Mr Sameer Sudhir G

Hi @sameer.gijare14 , thank you for your suggestion, but I don't know how to do that. Can you provide an example?
And, if you give a look at my answer before your latest one, I can I add an increasing number during a set operation? This is my main dilemma: I have no idea and cannot rid of this particular problem!

I noticed APOC has the ability to create custom procedures. You could take the ideas from @dana_canzano and @Reiner to create a 'sequence' node that tracks the current value and access the next value using a custom procedure installed using the APOC library. This makes the sequence generator reusable throughout your code (as @sameer.gijare14 suggested) and you don't have to repeat the sequence update logic every time you want to access the next number.

I tested it and it seems to work. I also added the ability to prefix a string to the id for your purposes. You can remove that and prefix the string after you get the id if you prefer.

Create the custom procedure by calling the following APOC method once. There are other apoc.custom methods to manage all the custom procedures, such as list and remove. The sequences are named, so you can create multiple sequences, one for each entity requiring is own sequence. The 'name' parameter refers to the sequence name.

call apoc.custom.declareProcedure(
'customSequenceWithPrefix(name :: STRING?, prefix :: STRING?) :: (id :: INTEGER?)',
'MERGE (id:Sequence {name: $name}) ON CREATE SET id.id = 0 ON MATCH SET id.id = id.id + 1 RETURN $prefix + id.id as id',
'write')

Once the custom procedure is installed, you can add the following line to your cypher to retrieve and use the next sequence number, prefixed with your preferred prefix.

call custom.customSequenceWithPrefix('Invoice', 'prefix') yield id

One caveat, I believe the sequence will be incremented even if your transaction fails, thus there is the potential of having gaps in your invoice numbers. @Reiner solution avoids this by including the sequence number update/retrieve logic in cypher code. It does require repeating the code each time it is used. Not such a problem if its used in one place.

Just an idea to consider.

Thank you for your suggestion. I already noticed that, it is a good idea (apart from the potential holes as you noted).

I solved the problem with the @reiner solution: simple, easy, it will be used in a couple of queries, so it is not a problem if repeated.

But an important part of the question remains: if I'm working on a set operation, how can I assign a new number? In the next sample, I could create many orders belonging to different merchants at once.
And with this approach everyone will have the same number.

MATCH (cart)-[contains:CONTAINS]->(product:Product)-[sold:IS_SOLD_BY]->(merchant:Merchant)
    WITH m,merchant, cart,collect({c: contains, p: product}) as merchantProducts, count(product) as noOfProducts

    CREATE (order:Order)-[b:BELONGS_TO]->(merchant)
            set order.number = **???**

I believe this works. I took the solution from the original post and inserted @Reiner code for the sequence generation.

MATCH (cart:Cart)-[contains:CONTAINS]->(product:Product)-[sold:IS_SOLD_BY]->(merchant:Merchant)
WITH merchant, collect({c: contains, p: product}) as merchantProducts, count(product) as noOfProducts, sum(product.price*contains.quantity) as orderTotalPrice

CALL {
    MERGE (id:UniqueId {name:"Invoice"})
    ON CREATE SET id.id = 1200
    ON MATCH SET id.id = id.id+1
    RETURN id.id as invoiceId
}

CREATE (order:Order)-[b:BELONGS_TO]->(merchant)
          SET order.id = invoiceId,
              order.total = orderTotalPrice

WITH order, merchantProducts
UNWIND merchantProducts as merchantProduct
WITH order, merchantProduct.c as contains, merchantProduct.p as product
MERGE (order)-[c:CONTAINS]->(product)
             set c=contains

RETURN distinct(order) as order

@glilienfield : Thank you again for your support, but I cannot get rid of this query.

My need is to have to start with one cart containing products belonging to ndifferent merchants. I want one order per merchant, each order having the same orderNo' but consecutive subOrderNostarting from1ton`.

At the end of this query, I have n orders with consecutive numbers each for any product.

Have you any idea why this is not running the way I'd like? Thank yopu again!


CALL {
    MERGE (id:Counters {name:"orderNo"})
    ON CREATE SET id.orderNo = 1
    ON MATCH SET id.orderNo = id.orderNo + 1

    RETURN id.orderNo as orderNo
}

WITH session, cart, orderNo, merchant, collect({c: contains, p: product}) as merchantProducts, count(product) as noOfItems, sum(product.price*contains.quantity) as orderTotalPrice

CALL {
    WITH session
    MERGE (id:Counters {name:"subOrderNo", uuid: session.uuid})
        ON CREATE 
            SET id.subOrderNo = 1,
                id.created_at = dateTime()
        ON MATCH 
            SET id.subOrderNo = id.subOrderNo + 1

    RETURN id.subOrderNo as subOrderNo 
}

CREATE (order:Order)-[b:BELONGS_TO]->(merchant)
          SET order.number = orderNo,
              order.subOrder = subOrderNo,
              order.itemsAmount = orderTotalPrice,
              order.itemsNumber = noOfItems,
              order.transportAmount = noOfItems + 9,
              order.totalAmount = orderTotalPrice + noOfItems + 9,
              order.uuid = apoc.create.uuid(),
              order.created_at = dateTime(),
              order.status = "created"

WITH session, cart, order, merchantProducts

UNWIND merchantProducts as merchantProduct
WITH session, cart, order, merchantProduct.c as contains, merchantProduct.p as product

MERGE (session)-[:HAS_ORDER]->(order)-[c:CONTAINS]->(product)
             set c=contains,
                 c.price = product.price

WITH cart, order, session

//MATCH (cart)-[r]-(m) DETACH DELETE cart
MATCH (id:Counters {name:"subOrderNo", uuid: session.uuid}) DELETE id

RETURN order, as order, products;

I think I understand what you want. The following query extends what we did earlier, but incorporates the sequence code to generate a single order number for all merchant orders and a derived set of indexes for assigning the subOrderNo for each merchant order. To calculate the indexes using the range method, I had to collect all the merchant orders into a list, generate the sequence, then unwind the merchant list of orders.

The following query does execute without error. It is getting complicated to test without some good test data, so give it a good shakeout. I think it should work. Let me know if you uncover anything and we can try to address it.

The query is for one specific cart. Change the cart criteria in the first MATCH to meet your needs.

MATCH (cart:Cart{id: 100})

CALL {
    MERGE (id:UniqueId {name:"orderNo"})
    ON CREATE SET id.id = 1
    ON MATCH SET id.id = id.id+1
    RETURN id.id as orderNo
}

MATCH (cart)-[contains:CONTAINS]->(product:Product)-[sold:IS_SOLD_BY]->(merchant:Merchant)
WITH orderNo, merchant, collect({c: contains, p: product}) as merchantProducts, count(product) as noOfItems, sum(product.price*contains.quantity) as orderTotalPrice

WITH orderNo, collect({m: merchant, p: merchantProducts, n: noOfItems, t: orderTotalPrice}) as merchantOrders
WITH orderNo, merchantOrders, range(1,size(merchantOrders)) as indexes
UNWIND indexes as index

WITH orderNo, index as subOrderNo, merchantOrders[index] as merchantOrder
WITH orderNo, subOrderNo, merchantOrder.m as merchant, merchantOrder.p as merchantProducts, merchantOrder.n as noOfItems, merchantOrder.t as orderTotalPrice

CREATE (order:Order)-[b:BELONGS_TO]->(merchant)
          SET order.number = orderNo,
              order.subOrder = subOrderNo,
              order.itemsAmount = orderTotalPrice,
              order.itemsNumber = noOfItems,
              order.transportAmount = noOfItems + 9,
              order.totalAmount = orderTotalPrice + noOfItems + 9,
              order.uuid = apoc.create.uuid(),
              order.created_at = dateTime(),
              order.status = "created"

WITH order, merchantProducts
UNWIND merchantProducts as merchantProduct
WITH order, merchantProduct.c as contains, merchantProduct.p as product
MERGE (order)-[c:CONTAINS]->(product)
             set c=contains,
                 c.price = product.price

RETURN distinct(order) as order

Yes, you understood perfectly my goal, and I (think/hope to) have understood your cypher.

But just copying and pasting the code and changed the cart id, returns me this error:

The strange thing is merchant@6: I have one cart with 5 products all belonging to just one merchant. So, @6 look out of range ....

#### Neo.DatabaseError.Statement.ExecutionFailed

Failed to create relationship `b`, node ` merchant@6` is missing. If you prefer to simply ignore rows where a relationship node is missing, set 'cypher.lenient_create_relationship = true' in neo4j.conf