I am trying to run a spatial bounding box search with the use of a spatial index.
In my graph, there are approximately 1.4 million nodes with a label of :ItemUsageInstance
(there are other nodes beyond those as well). These :ItemUsageInstance
nodes have a uniqueness constraint called instance_id
. There are also 3 point
properties on these node labels that are indexes: volume_min_coord
, volume_centroid
, volume_max_coord
.
What I am trying to do is, given a particular :ItemUsageInstance
node, find the closest other :ItemUsageInstance
s whose volume_centroid
is within a certain bounding box of the first ItemUsageInstance. While my query "works", it takes ~12 seconds and has 4359092 db hits. It does not seems like I am fully realizing the benefit of the spatial indexing.
Query so far:
MATCH (i:ItemUsageInstance {instance_id: '000000000123456'})
WITH i,
i.volume_min_coord.x + 50 AS upper_x, i.volume_max_coord.x - 50 AS lower_x,
i.volume_min_coord.y + 50 AS upper_y, i.volume_max_coord.y - 50 AS lower_y,
i.volume_min_coord.z + 50 AS upper_z, i.volume_max_coord.z - 50 AS lower_z
MATCH (other:ItemUsageInstance)
WHERE id(i) <> id(other)
AND point({x: lower_x, y: lower_y, z: lower_z}) < other.volume_centroid < point({x: upper_x, y: upper_y, z: upper_z})
RETURN distance(i.volume_centroid, other.volume_centroid) AS dist, other.instance_id
ORDER BY dist LIMIT 25;
If I PROFILE
the query, here is what I get:
That said, creating the index on the volume_centroid
property did seem to provide some modest improvement. I had initially forgotten to do so, and without the volume_centroid
index, there were 5805244 total db hits in 12188 ms
. After the volume_centroid
index was ONLINE
, the db hits reduced to: 4359092 total db hits in 12546 ms
Maybe my expectations are not realistic, but it seems like it ought to be doing much better. What can I do to improve the performance?
Result from :schema
:
Indexes
ON :ItemUsageInstance(volume_centroid) ONLINE
ON :ItemUsageInstance(volume_max_coord) ONLINE
ON :ItemUsageInstance(volume_min_coord) ONLINE
ON :ItemUsageInstance(instance_id) ONLINE (for uniqueness constraint)
ON :Item(item_id) ONLINE (for uniqueness constraint)
ON :Material(material_id) ONLINE (for uniqueness constraint)
Constraints
ON ( item:Item ) ASSERT item.item_id IS UNIQUE
ON ( itemusageinstance:ItemUsageInstance ) ASSERT ItemUsageInstance.instance_id IS UNIQUE
ON ( material:Material ) ASSERT material.material_id IS UNIQUE
If I drop the LIMIT 25
from the query, there are: 2994 rows available after 12209 ms, consumed after another 8 ms
Version: 3.4.5
Edition: community