How to create a unique index for a property > 8k in Neo4j 5+?

I want to create a unique index on a label/property combination where the property values can be longer than ~8k.

A simple

CREATE CONSTRAINT my_unique_index FOR (r:Chemical) REQUIRE (r.fingerprint) IS UNIQUE;

does not work when supplying a property value > 8k; I'm getting

Neo.DatabaseError.Statement.ExecutionFailed
Property value is too large to index, please see index documentation for limitations. Index: Index( id=3, name='my_unique_index', type='RANGE', schema=(:Chemical {fingerprint}), indexProvider='range-1.0', owningConstraint=4 ), element id: 4:6d6230f8-5a72-4075-babf-3f1ce5fc8d80:10, property size: 18011.

I fully understand that range-1.0 index provider has limitations. I could use CREATE TEXT INDEX ... for building an index to support longer strings.
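For reference, a plain (non-unique) text index in Neo4j 5 can be declared like this (index name is just a placeholder):

```
CREATE TEXT INDEX chemical_fingerprint_text IF NOT EXISTS
FOR (c:Chemical) ON (c.fingerprint);
```

But that only gives lookup support, not uniqueness enforcement.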

Is there a way to use text-2.0 index provider for a unique constraint?

Back in Neo4j 4.x this was perfectly possible using:

call db.createUniquePropertyConstraint('my_unique_constraint', ['Chemical'], ['fingerprint'], 'lucene+native-3.0')

So far I couldn't find a way to reach feature parity between Neo4j 4.x and Neo4j 5.x (and higher) for this scenario. Any hints?

Maybe just split your fingerprint into f1 and f2 (I have never tested this, but I hope we can game the system this way).

CREATE CONSTRAINT my_unique_index FOR (r:Chemical) REQUIRE (r.f1, r.f2) IS UNIQUE;

I hope you are well :hugs:

EDIT: This will not work; the ~8k limit is shared between the keys :sad_but_relieved_face:


If this synopsis from Claude is correct:

  • Index Acceleration: Uniqueness constraints benefit from Neo4j's index capabilities
  • Cache Impact: Constraint metadata is kept in Neo4j's schema cache
  • Execution Cost: Existence constraints add minimal overhead, while uniqueness constraints add index lookup costs

And you have a lot of Chemical nodes, you might be penalising your performance by having extremely large property values. So, would it not be possible to store a hash of those keys instead (either calculated in your code before inserting, or via a procedure), using SHA-256 or BLAKE3 depending on your needs:

hashFingerprint = blake3(fingerprint)

and then

CREATE CONSTRAINT hashIndex FOR (r:Chemical) REQUIRE (r.hashFingerprint) IS UNIQUE;
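A minimal Python sketch of the client-side hashing idea (the property name mirrors the Cypher above; SHA-256 from the standard library stands in for BLAKE3, which would need a third-party package):

```python
import hashlib

def fingerprint_hash(fingerprint: str) -> str:
    """Reduce an arbitrarily long fingerprint to a fixed 64-character
    hex key that fits well under the ~8k range-index limit."""
    return hashlib.sha256(fingerprint.encode("utf-8")).hexdigest()

# Even a fingerprint far beyond 8k maps to a 64-character key.
long_fp = "C1=CC=CC=C1" * 2000   # stand-in for a real >8k fingerprint
key = fingerprint_hash(long_fp)
print(len(key))  # 64
```

You would then write `hashFingerprint` alongside `fingerprint` on every insert and let the unique constraint live on the hash.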

I see some internal chatter about the index provider (sounds like an "I would not hold my breath" situation). So finding a way of reducing the size of the fingerprints is likely the only option, or some other hack that keeps a shorter key mapped to the fingerprint somewhere else.


Thanks for your answers. The easiest way would indeed be to use an APOC trigger that hashes the long property value and stores it in a secondary property carrying the unique constraint.
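Such a trigger could look roughly like this (a sketch only; the trigger and property names are placeholders, `apoc.trigger.install` must be run against the system database, and APOC triggers need `apoc.trigger.enabled=true`):

```
CALL apoc.trigger.install(
  'neo4j',              // target database (assumption)
  'hash_fingerprint',
  "UNWIND $createdNodes AS n
   WITH n WHERE n:Chemical AND n.fingerprint IS NOT NULL
   SET n.hashFingerprint = apoc.util.sha256([n.fingerprint])",
  {phase: 'before'}     // run before commit so the hash lands in the same tx
);
```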

An alternative would be to use a text index (aka the text-2.0 provider) on the given property and use MERGE in a single-threaded way to prevent race conditions.
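The MERGE variant might look like this (assuming a text index on :Chemical(fingerprint) to back the equality lookup, and a single writer so two concurrent transactions cannot both miss the match):

```
MERGE (c:Chemical {fingerprint: $fingerprint})
ON CREATE SET c.createdAt = datetime()
RETURN c;
```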

Does index provider lucene+native-3.0 still exist under the hood? If so, I could check the old source code of the db.createUniquePropertyConstraint procedure and try to reimplement it as a custom procedure.

Sorry, lucene+native-3.0 was removed in Neo4j 5. The BTREE index was overloaded trying to handle too much, which led to splitting it into RANGE, POINT, and TEXT indexes.

We are moving away from public index providers to focus on index types; providers have effectively become versions, and we want people using the latest version of an index type. The public surface for specifying an index provider has already been removed in the latest version of Cypher, and it will be ignored in many other existing cases.

It would be best to use a long, high-entropy hash of the relevant data you wish to have uniqueness over, potentially salted or affixed with a relevant UUID for that data, to help minimise collisions.
