cancel
Showing results for 
Search instead for 
Did you mean: 

Wrong cypher query behaviour with the java driver

CrazyNinja
Node Link

I am building a Spring Boot service that connects to a neo4j database via the neo4j-java-driver. One REST endpoint builds a paginated search query with skip and limit that looks like the following:

 

MATCH (i:Item)-[:HAS_PRICE]->(p:Price) 
WHERE p.a>= $aMin AND p.a <= $aMax AND p.b >= $bMin
WITH i, p
ORDER BY p.b asc 
SKIP $skip
LIMIT $limit
RETURN i.code

 

With the query, I am expecting 320 unique items but only get 208. Also on some pages there are duplicate items that were already present in one of the previous pages. When returning all items without skip/limit the result is as expected. When not ordering, the result is also correct. The problem appears to be the combination of ordering with skip/limit.

I also tested the same query on the same database in python with the python neo4j driver version 4.4.2. The skip and limit sizes were the same and the query results were correct, no duplicates. Making the same queries in the Neo4j Browser (with the same skip/limit) does also return the correct results.

So I am thinking that the problem may be with the neo4j-java-driver. I tested it with versions 4.4.8, 5.0.0-alpha03 and 4.0.3. All have the same faulty behavior.

Please ask if you need more information and thank you for helping.

 

6 REPLIES 6

CrazyNinja
Node Link

Edit: Using this solution I found worked. I still have no idea why there has to be a second ordering for it to avoid duplicates.

Is it possible for an item to have multiple prices or multiple items to have the same code? 

The code is constraint to be unique and each item can only have one price

I thought that is what you would say. Its not making sense to me. do you mind running the following query?

MATCH (i:Item)-[:HAS_PRICE]->(p:Price) 
WHERE p.a>= $aMin AND p.a <= $aMax AND p.b >= $bMin
return count(i) as iCount, count(p) as pCount, count(distinct i) as distinctCount

I assume your $skip value is n*$limit, where 'n' is the page number. 

Yes $skip is pageNumber * $limit

I was running the query with this test method:

 

public void test() {
	String query = String.join(" ",
		"MATCH (i:Item)-[:HAS_PRICE]->(p:Price)",
		"WHERE p.a >= 7 AND p.a <= 39 AND p.b >= 1",
		"return count(i) as iCount, count(p) as pCount, count(distinct i) as distinctCount"
	);

	try (Session session = driver.session(SessionConfig.forDatabase(databaseName))) {
		Record rec = session.run(query).single();
		int iCount = rec.get("iCount").asInt();
		int pCount = rec.get("pCount").asInt();
		int distinctCount = rec.get("distinctCount").asInt();
		System.out.println("iCount: "+iCount+", pCount: "+pCount+", distinctCount: "+distinctCount);
	}
}

 

This method produces the following output:

iCount: 320, pCount: 320, distinctCount: 320

And that is the correct number of items that match the criteria.

However, when I execute this test method:

 

public void test2() {
	String query = String.join(" ",
		"MATCH (i:Item)-[:HAS_PRICE]->(p:Price)",
		"WHERE p.a >= 7 AND p.a <= 39 AND p.b >= 1",
		"WITH i, p",
		"ORDER BY p.b asc",
		"SKIP $skip",
		"LIMIT $limit",
		"RETURN i.code"
	);

	int limit = 25;
	int maxPage = 13; //25*13 exceeding the total number of 320 expected items

	for(int page = 0; page < maxPage; page++){
	   int skip = page * limit;
	   System.out.println("ON PAGE: "+ page);
		try (Session session = driver.session(SessionConfig.forDatabase(databaseName))) {
			session.run(query, parameters("skip", skip, "limit", limit))
				.stream()
				.forEach(rec -> {
					String code = rec.get(0).asString();
					System.out.println(code);
				});
		}
	}
}

 

I also get 320 item codes but for example one item code is present on page 5, 6, 7, 8 and 9 and there are many more duplicate items. When removing the duplicates, the total number of unique items is 208.

EDIT: it works correctly if I add another order by criteria like suggested in this post.

ORDER BY p.b asc, i.foundAt desc

Thank you for helping!!

When I run this python function, the result is also correct, 320 unique items with no duplicates.

def test():
	limit = 25;
	maxPage = 13; #25*13 exceeding the total number of 320 expected items

	for page in range(maxPage):
		skip = page * limit;
		print("ON PAGE: ", page)
		query = f"""
			MATCH (i:Item)-[:HAS_PRICE]->(p:Price)
			WHERE p.a >= 7 AND p.a <= 39 AND p.b >= 1
			WITH i, p
			ORDER BY p.b asc
			SKIP {skip}
			LIMIT {limit}
			RETURN i.code
		"""
		res = connection.query(query)
		for rec in res:
			code = rec[0]
			print(code)

 

Nodes 2022
Nodes
NODES 2022, Neo4j Online Education Summit

On November 16 and 17 for 24 hours across all timezones, you’ll learn about best practices for beginners and experts alike.