Dear Neo4j Team,
I'm writing to suggest an enhancement to the db.index.vector.queryNodes procedure. Currently (Neo4j 5.11), this procedure allows querying a vector index by specifying indexName, numberOfNearestNeighbours, and a query vector. However, it lacks the flexibility to set a "LIMIT" beyond the numberOfNearestNeighbours parameter.
I propose the addition of a separate "LIMIT" parameter. This change would allow users to independently control the maximum number of results returned, enhancing customization and usability for various use cases.
For example, in scenarios where a broader sample from a dataset is needed for analysis, the current configuration restricts the ability to fetch more results than the nearest neighbors count.
I am sorry, but I don't understand the intent of this suggested parameter. Could you provide more detail on how it would change the results, along with a specific scenario that shows its utility?
Here is why I am confused. The procedure currently returns the number of closest nodes specified by the numberOfNearestNeighbours parameter. If a new limit parameter were provided, there would be two possible scenarios: 1) the limit exceeds the numberOfNearestNeighbours value, and 2) the limit is less than the numberOfNearestNeighbours value.
For the first case, what additional nodes would you want beyond the nearest ones that are presently returned? Do you want some additional random ones, or the next set of nearest nodes? If the latter, couldn't you get that by increasing numberOfNearestNeighbours to the value of your limit parameter?
For the second case, what subset of the nearest nodes would you want? Do you want a random smaller set from the nearest ones, or a subset of the nearest nodes that are all nearer than the ones removed by the limit? If the latter, that is the same as setting numberOfNearestNeighbours to your limit parameter.
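To make the two cases concrete, here is a minimal in-memory sketch in Python. It is not Neo4j code: nearest() is a brute-force stand-in for the vector index, and nearest_with_limit() is a hypothetical reading of the proposed parameter as "truncate the k nearest to at most limit rows". The sample points and function names are assumptions made up for illustration.

```python
import math


def nearest(points, query, k):
    """Return the k points closest to query, nearest first.

    Brute-force stand-in for a vector index lookup (assumption).
    """
    return sorted(points, key=lambda p: math.dist(p, query))[:k]


def nearest_with_limit(points, query, k, limit):
    """Hypothetical 'limit' parameter: keep at most `limit` of the k nearest."""
    return nearest(points, query, k)[:limit]


points = [(0, 0), (1, 1), (2, 2), (3, 3), (5, 5)]
query = (0, 0)

# Case 2: limit < k gives the same rows as asking for `limit` neighbours.
assert nearest_with_limit(points, query, k=4, limit=2) == nearest(points, query, k=2)

# Case 1: limit > k cannot add rows by truncation alone;
# getting more of the nearest nodes means raising k itself.
assert nearest_with_limit(points, query, k=2, limit=4) == nearest(points, query, k=2)
assert len(nearest(points, query, k=4)) == 4
```

Under this reading the extra parameter never returns anything that a different numberOfNearestNeighbours value would not, which is why I am asking what other behaviour you have in mind.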
If you don't want the limit parameter to behave as I described above, and instead want it to return additional random nodes or restrict the result to a random subset, that can be achieved with extra Cypher code.
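As a rough sketch of what that extra Cypher could look like, the Python snippet below only holds the query text and example parameters; it does not connect to a database, and I have not run these queries against your data. The parameter names ($indexName, $k, $limit, $extra), the index name 'my-index', and the label Item are assumptions for illustration.

```python
# Cypher text only; nothing here talks to a database.

# Random subset: take the k nearest, then keep `limit` of them at random.
RANDOM_SUBSET = """
CALL db.index.vector.queryNodes($indexName, $k, $queryVector)
YIELD node, score
WITH node, score ORDER BY rand() LIMIT $limit
RETURN node, score
"""

# Random padding: take the k nearest, then add `extra` random other nodes.
# The label Item is hypothetical; use whatever label your index covers.
RANDOM_PADDING = """
CALL db.index.vector.queryNodes($indexName, $k, $queryVector)
YIELD node
WITH collect(node) AS nearest
MATCH (other:Item)
WHERE NOT other IN nearest
WITH nearest, other ORDER BY rand() LIMIT $extra
RETURN nearest + collect(other) AS nodes
"""

# Example parameter map (all values are made up).
params = {
    "indexName": "my-index",
    "k": 10,
    "limit": 5,
    "extra": 5,
    "queryVector": [0.1, 0.2, 0.3],
}

if __name__ == "__main__":
    print(RANDOM_SUBSET)
    print(RANDOM_PADDING)
```

Note that ordering by rand() over a whole label touches every matching node, so the padding query is only meant to show the idea.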
Sorry for my misunderstanding. Hopefully you can provide some more detail so the Neo4j team can understand the request.