Two equivalent queries (different node label) performing very differently

My Neo4j database has been loaded via NeoSemantics. This means that every node has a "Resource" label plus its own label depending on the type of object. For each node I've also added an internalId property that is unique across all nodes in the graph.

Here is my first query:

PROFILE 
MATCH (n:Resource&Organization)-[:documentSource]->(art: Resource&Article)
WHERE n.internalId > 0
AND art.datePublished >= datetime('2025-07-01')
AND n.internalMergedSameAsHighToUri IS NULL
RETURN DISTINCT(n)
ORDER BY n.internalId
LIMIT 100

This performs OK: planner: COST, runtime: PIPELINED. 441936 total db hits in 8932 ms.

But when I try the equivalent query with another label, e.g.:

PROFILE 
MATCH (n:Resource&AboutUs)-[:documentSource]->(art: Resource&Article)
WHERE n.internalId > 0
AND art.datePublished >= datetime('2025-07-01')
AND n.internalMergedSameAsHighToUri IS NULL
RETURN DISTINCT(n)
ORDER BY n.internalId
LIMIT 100

The performance degrades dramatically: planner: COST, runtime: PIPELINED. 10152421 total db hits in 35352 ms.

Interestingly, this uses a different index to start its query.

I then tried to hint to use the same index as the Resource&Organization query, like so:

PROFILE
MATCH (n:Resource&AboutUs)-[:documentSource]->(art: Resource&Article)
USING INDEX n:Resource(internalId)
WHERE n.internalId > 0
AND art.datePublished >= datetime('2025-07-01')
AND n.internalMergedSameAsHighToUri IS NULL
RETURN DISTINCT(n)
ORDER BY n.internalId
LIMIT 100

This reduced the elapsed time a bit, although db hits went up, and it's still pretty dire by comparison: planner: COST, runtime: PIPELINED. 34836518 total db hits in 25676 ms.
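For comparison, a hint can also point the planner at the other end of the pattern. The following is an untested sketch that assumes a range index on Article(datePublished) exists (worth confirming with SHOW INDEXES first). Whether it beats the Resource(internalId) hint on this data has not been measured.

```cypher
// Sketch only: hint the Article date index instead of Resource(internalId).
// Assumes a range index on :Article(datePublished) is present.
PROFILE
MATCH (n:Resource&AboutUs)-[:documentSource]->(art:Resource&Article)
USING INDEX art:Article(datePublished)
WHERE n.internalId > 0
  AND art.datePublished >= datetime('2025-07-01')
  AND n.internalMergedSameAsHighToUri IS NULL
RETURN DISTINCT n
ORDER BY n.internalId
LIMIT 100
```

Starting from the articles means the matched nodes have to be sorted by internalId afterwards, so this is only likely to pay off if the date filter is selective.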

Some context:
There are about 1.5 million matching Organization nodes, each of which can have many (>10) Articles attached. There are about 1 million matching AboutUs nodes, each of which is likely to have only 1 Article attached.

My question: What can I do to optimize these queries, or at least get them to perform at a similar level?
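One variant that might be worth profiling is to express the Article condition as an existential subquery, so that each node is returned at most once and DISTINCT is no longer needed. This is an untested sketch using the same labels and properties as the queries above. It should return the same nodes, but there is no guarantee the planner picks a better plan.

```cypher
// Sketch only: same filter as the original query, with the relationship
// check moved into EXISTS { ... } so no DISTINCT is required.
PROFILE
MATCH (n:Resource&AboutUs)
WHERE n.internalId > 0
  AND n.internalMergedSameAsHighToUri IS NULL
  AND EXISTS {
    (n)-[:documentSource]->(art:Resource&Article)
    WHERE art.datePublished >= datetime('2025-07-01')
  }
RETURN n
ORDER BY n.internalId
LIMIT 100
```

If the planner walks the nodes in internalId order, it can in principle stop as soon as 100 of them pass the EXISTS check, rather than expanding every relationship first.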

@alanbuxton

Which Neo4j version are you running?

Ah yes, sorry about that. It's 5.20.

Running in Neo4j Desktop

Even more bizarre.

Today the same query, with the same database, is performing a lot faster.

PROFILE 
MATCH (n:Resource&AboutUs)-[:documentSource]->(art: Resource&Article)
WHERE n.internalId > 0
AND art.datePublished >= datetime('2025-07-01')
AND n.internalMergedSameAsHighToUri IS NULL
RETURN DISTINCT(n)
ORDER BY n.internalId
LIMIT 100

planner: COST, runtime: PIPELINED. 10152421 total db hits in 3503 ms.

Run CALL db.stats.retrieve('GRAPH COUNTS') to check the cardinalities.
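Alongside the graph counts, it may also help to list the index definitions and to spot-check the per-label counts directly. A minimal sketch in Neo4j 5 syntax. The labels are the ones from the queries in this thread and the column aliases are arbitrary.

```cypher
// List index definitions and their state.
SHOW INDEXES
YIELD name, type, labelsOrTypes, properties, state;

// Per-label node counts, for comparison with the GRAPH COUNTS output.
MATCH (n:AboutUs) RETURN count(n) AS aboutUsCount;
MATCH (n:Organization) RETURN count(n) AS organizationCount;
```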

Hi @Froums here is the output:

{
  "nodes": [
    {
      "count": 14029405
    },
    {
      "count": 14029404,
      "label": "Resource"
    },
    {
      "count": 4219668,
      "label": "Organization"
    },
    {
      "count": 2457379,
      "label": "Article"
    },
    {
      "count": 29578,
      "label": "GeoNamesLocation"
    },
    {
      "count": 1500,
      "label": "IndustryCluster"
    },
    {
      "count": 1,
      "label": "_GraphConfig"
    },
    {
      "count": 633692,
      "label": "CorporateFinanceActivity"
    },
    {
      "count": 74708,
      "label": "LocationActivity"
    },
    {
      "count": 71357,
      "label": "Site"
    },
    {
      "count": 259065,
      "label": "Person"
    },
    {
      "count": 278055,
      "label": "Role"
    },
    {
      "count": 309572,
      "label": "RoleActivity"
    },
    {
      "count": 408683,
      "label": "PartnershipActivity"
    },
    {
      "count": 219935,
      "label": "ProductActivity"
    },
    {
      "count": 258999,
      "label": "Product"
    },
    {
      "count": 271511,
      "label": "MarketingActivity"
    },
    {
      "count": 103837,
      "label": "EquityActionsActivity"
    },
    {
      "count": 190046,
      "label": "OperationsActivity"
    },
    {
      "count": 214967,
      "label": "FinancialsActivity"
    },
    {
      "count": 58531,
      "label": "RegulatoryActivity"
    },
    {
      "count": 15110,
      "label": "IncidentActivity"
    },
    {
      "count": 105498,
      "label": "FinancialReportingActivity"
    },
    {
      "count": 9907,
      "label": "AnalystRatingActivity"
    },
    {
      "count": 66,
      "label": "RecognitionActivity"
    },
    {
      "count": 1007288,
      "label": "AboutUs"
    },
    {
      "count": 371162,
      "label": "IndustrySectorUpdate"
    }
  ],
  "indexes": [
    {
      "indexProvider": "token-lookup-1.0",
      "indexType": "LOOKUP",
      "updatesSinceEstimation": 0,
      "labels": [],
      "properties": [],
      "totalSize": 0,
      "estimatedUniqueSize": 0
    },
    {
      "indexProvider": "token-lookup-1.0",
      "indexType": "LOOKUP",
      "updatesSinceEstimation": 0,
      "relationshipTypes": [],
      "properties": [],
      "totalSize": 0,
      "estimatedUniqueSize": 0
    },
    {
      "indexProvider": "range-1.0",
      "indexType": "RANGE",
      "updatesSinceEstimation": 184736,
      "labels": [
        "Resource"
      ],
      "properties": [
        "uri"
      ],
      "totalSize": 13841235,
      "estimatedUniqueSize": 13841235
    },
    {
      "indexProvider": "range-1.0",
      "indexType": "RANGE",
      "updatesSinceEstimation": 414714,
      "labels": [
        "Resource"
      ],
      "properties": [
        "internalDocId"
      ],
      "totalSize": 11092964,
      "estimatedUniqueSize": 2368567
    },
    {
      "indexProvider": "range-1.0",
      "indexType": "RANGE",
      "updatesSinceEstimation": 26831,
      "labels": [
        "Resource"
      ],
      "properties": [
        "internalMergedSameAsHighToUri"
      ],
      "totalSize": 2706672,
      "estimatedUniqueSize": 427547
    },
    {
      "indexProvider": "range-1.0",
      "indexType": "RANGE",
      "updatesSinceEstimation": 36925,
      "labels": [
        "Article"
      ],
      "properties": [
        "datePublished"
      ],
      "totalSize": 2420463,
      "estimatedUniqueSize": 1444435
    },
    {
      "indexProvider": "range-1.0",
      "indexType": "RANGE",
      "updatesSinceEstimation": 5252,
      "labels": [
        "Resource"
      ],
      "properties": [
        "internalMergedActivityWithSimilarRelationshipsToUri"
      ],
      "totalSize": 180272,
      "estimatedUniqueSize": 151575
    },
    {
      "indexProvider": "range-1.0",
      "indexType": "RANGE",
      "updatesSinceEstimation": 0,
      "labels": [
        "GeoNamesLocation"
      ],
      "properties": [
        "countryCode",
        "admin1Code"
      ],
      "totalSize": 29558,
      "estimatedUniqueSize": 2427
    },
    {
      "indexProvider": "vector-2.0",
      "indexType": "VECTOR",
      "updatesSinceEstimation": 0,
      "labels": [
        "IndustryCluster"
      ],
      "properties": [
        "representative_doc_embedding"
      ],
      "totalSize": 0,
      "estimatedUniqueSize": 0
    },
    {
      "indexProvider": "vector-2.0",
      "indexType": "VECTOR",
      "updatesSinceEstimation": 0,
      "labels": [
        "Organization"
      ],
      "properties": [
        "industry_embedding"
      ],
      "totalSize": 0,
      "estimatedUniqueSize": 0
    },
    {
      "indexProvider": "range-1.0",
      "indexType": "RANGE",
      "updatesSinceEstimation": 91535,
      "labels": [
        "Resource"
      ],
      "properties": [
        "internalId"
      ],
      "totalSize": 13921419,
      "estimatedUniqueSize": 13921419
    }
  ],
  "constraints": [
    {
      "label": "Resource",
      "properties": [
        "uri"
      ],
      "type": "Uniqueness constraint"
    }
  ],
 
... etc

(Truncated to keep the message from getting any longer.) Is this helpful?
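In the index list above, internalId is indexed only under the Resource label (about 14 million entries), while AboutUs accounts for roughly 1 million nodes. One thing that could be tried is a label-specific index, so that an ordered scan touches only AboutUs nodes. This is an untested sketch. The index name aboutus_internal_id is made up for illustration, and there is no guarantee the planner will prefer it without the hint.

```cypher
// Hypothetical index name; creates a range index scoped to AboutUs nodes only.
CREATE INDEX aboutus_internal_id IF NOT EXISTS
FOR (n:AboutUs) ON (n.internalId);

// Once the index is ONLINE, hint it explicitly and compare the PROFILE output.
PROFILE
MATCH (n:Resource&AboutUs)-[:documentSource]->(art:Resource&Article)
USING INDEX n:AboutUs(internalId)
WHERE n.internalId > 0
  AND art.datePublished >= datetime('2025-07-01')
  AND n.internalMergedSameAsHighToUri IS NULL
RETURN DISTINCT n
ORDER BY n.internalId
LIMIT 100
```

The extra index adds some write and storage overhead, so it is only worth keeping if the profile shows a clear improvement.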