There are many different options for making a query faster, but in my case sometimes one query is faster and sometimes another.
I need help: what is the best approach?
I use Neo4j version 3.5.2.
In my case I have a large set of nodes with 4 million messages. Sometimes I want the latest messages, and sometimes I want to filter the messages. The messages are related to an hourly time tree, and we have split all the words into separate nodes, with relationships between each word and message.
In some cases it is faster to use those relationships, and in other cases it is better to filter the message nodes directly.
We only want to filter or display the messages from the last month, to keep performance good.
The examples only use word filters, but there are more filters, such as a group code (this code is not stored in the message; it is always a relationship to another node).
Some queries and results:
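To make the model described above concrete, here is a minimal sketch of it in Cypher. The labels, relationship types and property names are the ones used in the queries in this question. The property values are made up for illustration, and the group-code relationship is left out.

```cypher
// Hypothetical sample data: names come from the queries in this question,
// the values are invented for illustration only.

// Two consecutive leaves of the hourly time tree
CREATE (h1:Hour {hash: '2018/04/01/05'})
CREATE (h2:Hour {hash: '2018/04/01/06'})
CREATE (h1)-[:NEXT]->(h2)

// One message, attached to the hour in which it was sent
CREATE (m:TS_P2000Message {message: 'brand kruisbaan someren', sended: 1522559000000})
CREATE (m)-[:SENDED]->(h1)

// Every word of the message is its own node, linked with HAS_WORD
CREATE (w1:TS_Word {name: 'brand'})
CREATE (w2:TS_Word {name: 'kruisbaan'})
CREATE (w3:TS_Word {name: 'someren'})
CREATE (m)-[:HAS_WORD]->(w1)
CREATE (m)-[:HAS_WORD]->(w2)
CREATE (m)-[:HAS_WORD]->(w3)
```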
PROFILE
MATCH (startleaf:Hour{hash: '2018/04/01/05'}), (endleaf:Hour{hash: '2018/04/30/05'}), p = shortestPath((startleaf)-[:NEXT*0..]->(endleaf))
UNWIND nodes(p) AS leaf
MATCH (leaf)<-[:SENDED]-(message:TS_P2000Message)
WITH distinct message
MATCH (message)-[:HAS_WORD]->(:TS_Word { name:'someren'})
WITH distinct message AS message
MATCH (message)-[:HAS_WORD]->(:TS_Word { name:'kruisbaan'})
WITH distinct message AS message
WITH count(message) AS results, collect(message) AS messages
UNWIND(messages) AS message
WITH results, message AS message
SKIP 0 LIMIT 15
RETURN results, message
First run: Cypher version: CYPHER 3.5, planner: COST, runtime: SLOTTED. 848193 total db hits in 2099 ms.
Second run: Cypher version: CYPHER 3.5, planner: COST, runtime: SLOTTED. 848176 total db hits in 763 ms.
PROFILE
MATCH (startleaf:Hour{hash: '2018/04/01/05'}), (endleaf:Hour{hash: '2018/04/30/05'}), p = shortestPath((startleaf)-[:NEXT*0..]->(endleaf))
UNWIND nodes(p) AS leaf
MATCH (leaf)<-[:SENDED]-(message:TS_P2000Message)
WHERE message.message =~ '(?i).*someren.*' AND message.message =~ '(?i).*kruisbaan.*'
WITH count(message) AS results, collect(message) AS messages
UNWIND(messages) AS message
WITH results, message AS message
SKIP 0 LIMIT 15
RETURN results, message
First run: Cypher version: CYPHER 3.5, planner: COST, runtime: SLOTTED. 115168 total db hits in 3732 ms.
Second run: Cypher version: CYPHER 3.5, planner: COST, runtime: SLOTTED. 115168 total db hits in 338 ms.
PROFILE
MATCH p = shortestPath((startleaf:Hour{hash: '2018/04/01/05'})-[:NEXT*0..]->(endleaf:Hour{hash: '2018/04/30/05'}))
WITH NODES(p) AS dates
MATCH (message:TS_P2000Message)-[:SENDED]->(leaf),
(message)-[:HAS_WORD]->(word:TS_Word)
WHERE leaf IN dates AND
word.name IN ['kruisbaan', 'someren']
WITH distinct message AS message
WITH count(message) AS results, collect(message) AS messages
UNWIND(messages) AS message
WITH results, message AS message
SKIP 0 LIMIT 15
RETURN results, message
First run: Cypher version: CYPHER 3.5, planner: COST, runtime: SLOTTED. 4694 total db hits in 1086 ms.
Second run: Cypher version: CYPHER 3.5, planner: COST, runtime: SLOTTED. 4694 total db hits in 36 ms.
But this query takes very long when I use it without a word filter or with other (popular) words; in those cases the other queries are faster. See the next result:
PROFILE
MATCH p = shortestPath((startleaf:Hour{hash: '2018/04/01/05'})-[:NEXT*0..]->(endleaf:Hour{hash: '2018/04/30/05'}))
WITH NODES(p) AS dates
MATCH (message:TS_P2000Message)-[:SENDED]->(leaf),
(message)-[:HAS_WORD]->(word:TS_Word)
WHERE leaf IN dates AND
word.name IN ['brand']
WITH distinct message AS message
WITH count(message) AS results, collect(message) AS messages
UNWIND(messages) AS message
WITH results, message AS message
SKIP 0 LIMIT 15
RETURN results, message
First run: Cypher version: CYPHER 3.5, planner: COST, runtime: SLOTTED. 149648 total db hits in 21066 ms.
Second run: Cypher version: CYPHER 3.5, planner: COST, runtime: SLOTTED. 149652 total db hits in 1679 ms.
When we query the message nodes directly, without using the other nodes:
PROFILE
MATCH (message:TS_P2000Message)
WHERE 1546281754000 <= message.sended <= 1548960154000
WITH count(message) AS results, collect(message) AS messages
WITH results AS results, messages AS messages
UNWIND(messages) AS message
WITH results, message AS message
ORDER BY message.sended desc
SKIP 0 LIMIT 15
RETURN results, message
First run: Started streaming 15 records after 3098 ms and completed after 3100 ms.
Second run: Started streaming 15 records after 303 ms and completed after 303 ms.
Another way to filter:
PROFILE
MATCH (message:TS_P2000Message)
WHERE 1546281754000 <= message.sended <= 1548960154000
WITH message
WHERE message.message =~ "(?i).*\\bsomeren\\b.*"
OR message.message =~ "(?i).*\\bbrand\\b.*"
WITH count(message) AS results, collect(message) AS messages
WITH results AS results, messages AS messages
UNWIND(messages) AS message
WITH results, message AS message
ORDER BY message.sended desc
SKIP 0 LIMIT 10
RETURN results, message
First run: Cypher version: CYPHER 3.5, planner: COST, runtime: SLOTTED. 342324 total db hits in 3346 ms.
Second run: Cypher version: CYPHER 3.5, planner: COST, runtime: SLOTTED. 342324 total db hits in 1114 ms.