Understanding the performance of Skip/Limit

Hatem · December 14, 2023, 3:40pm

I'm trying to understand the performance of pagination using the SKIP/LIMIT combination on the latest Neo4j Community Edition container. The database has 100M nodes. I match on the nodes and control the output with SKIP/LIMIT using the following query:

MATCH (n) 
RETURN (n)
SKIP 0
LIMIT 1000000

The query skips 0 rows and returns 1M nodes, and it takes ~46 seconds to complete. The query plan in the browser showed that the first step was an AllNodesScan that produced 1M rows.

I then repeated the query with the SKIP value changed to 1M and the same LIMIT value as before. The query plan showed that the AllNodesScan step produced 2M rows, of which the first 1M were skipped and the remaining 1M returned. The execution time was similar to the previous query, i.e. ~46 seconds. I repeated the query several times after restarting the container to make sure that nothing was being cached.
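
For reference, the second run described above would presumably look like the following. The PROFILE prefix is an addition here (it is not stated in the post); it runs the query and reports the rows produced by each operator, which is one way to see the 2M rows coming out of AllNodesScan:

```cypher
// Second run: same LIMIT, but skip the first 1M rows.
// PROFILE (an assumption, not part of the original query) executes the
// query and shows per-operator row counts in the plan.
PROFILE
MATCH (n)
RETURN n
SKIP 1000000
LIMIT 1000000
```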

Why does the second query have a similar execution time to the first one, even though it hits 2M rows (1M more than the first query)? Shouldn't the second query take more time than the first one?

Thanks,

myron_higerd · December 20, 2023, 8:44pm

Neo4j can scan nodes very fast with AllNodesScan. The difference in the MATCH (n) step between 1M and 2M nodes was probably only a few milliseconds. Properties are not accessed until they are needed, so your query only reads them in the RETURN step. That means most of the time in your query is spent in RETURN (n), which pulls all the properties and then sends the data to the calling application. Since both queries return the same number of nodes (LIMIT 1000000), extracting the properties and returning them takes about the same amount of time.

If you want to see the difference, run

MATCH (n) WITH n SKIP 0 LIMIT 1000000 RETURN count(*)
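
To compare against the second run, the same count can be taken with the larger SKIP. This is a sketch rather than something tested against the poster's database: it assumes WITH must carry n forward for SKIP/LIMIT to apply, and PROFILE is added only to expose the per-operator row counts. Because only a count is returned, no properties are read or sent to the client, so any remaining difference in time should come from scanning and skipping the extra 1M rows:

```cypher
// Count-only version of the second run: scan 2M rows, skip the first 1M,
// keep 1M, and return a single number instead of 1M nodes.
PROFILE
MATCH (n)
WITH n
SKIP 1000000
LIMIT 1000000
RETURN count(*)
```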
