We are benchmarking several databases on connected data. Surprisingly, Neo4j is somewhat slow compared to JDO DataNucleus. Our test dataset has around 1.1 million nodes; we are aiming for much bigger datasets (around 10^12 nodes).
Here is our data model (generated by apoc.meta.graph), annotated with the number of nodes of each type.
Here is the query we are using:
    MATCH (n:Artefact) WHERE NOT ()-[:Link]->(n)
    WITH collect(n) AS sources_list
    MATCH (m:Artefact) WHERE NOT (m)-[:Link]->()
    WITH sources_list, collect(m) AS sinks_list
    MATCH trace = (sources)-[:Link*0..9]->(sinks)
    WHERE sinks IN sinks_list AND sources IN sources_list
    WITH [n IN nodes(trace) | n.neo4jIdentity] AS trace_nodes
    RETURN DISTINCT trace_nodes
This query takes 2min 8sec to execute.
Here is the execution plan:
We suspect it would help to avoid the Expand(All) operator for the Link matching, but we are not sure how to do that.
The goal of the query is to find all paths up to a specified length between Artefacts that have no incoming edge of type Link and Artefacts that have no outgoing edge of type Link.
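To make the expected result concrete, here is a minimal Python sketch of the semantics we are after, run on a made-up toy graph. It is not our actual code: the function name `source_to_sink_paths`, the node names, and the edge list are purely illustrative, and it assumes a graph without parallel edges so that a (start, end) pair identifies a Link. Like the `*0..9` pattern, it also reports zero-length paths for isolated nodes.

```python
from collections import defaultdict


def source_to_sink_paths(nodes, edges, max_hops=9):
    """Enumerate paths of at most max_hops edges from sources to sinks.

    A source has no incoming edge, a sink has no outgoing edge.
    Each edge is used at most once per path, as in a Cypher pattern.
    """
    outgoing = defaultdict(list)
    has_incoming = set()
    for start, end in edges:
        outgoing[start].append(end)
        has_incoming.add(end)

    sources = [n for n in nodes if n not in has_incoming]
    sinks = {n for n in nodes if not outgoing[n]}
    paths = []

    def walk(path, used_edges):
        node = path[-1]
        if node in sinks:
            paths.append(list(path))
        if len(path) - 1 == max_hops:
            return  # hop limit reached, do not expand further
        for nxt in outgoing[node]:
            edge = (node, nxt)
            if edge in used_edges:
                continue
            used_edges.add(edge)
            path.append(nxt)
            walk(path, used_edges)
            path.pop()
            used_edges.remove(edge)

    for source in sources:
        walk([source], set())
    return paths


# Hypothetical toy graph: "e" is isolated, so it is both a source and a sink.
toy_nodes = ["a", "b", "c", "d", "e"]
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "c")]
print(source_to_sink_paths(toy_nodes, toy_edges))
```

On our real data the number of such paths grows quickly with the hop limit, which is why we want the database to do this traversal efficiently.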
Neo4j Desktop 1.1.3
We ran the query both from the Neo4j Browser and programmatically from Java using the Object Graph Mapper (version 3.1.2).
Thanks in advance for your help.