Efficiently getting all out-degrees in a large graph

I have a reasonably large and dense graph of 1.5 million nodes. I'd like to get a list of the out-degrees of all nodes, counting only relationships of a certain type. I've tried a few queries, but they are all slow enough that I'm looking for feedback on whether they can be tuned further.

I'm currently using the following:

MATCH (p:Package)-[r:PACKAGE_DEPENDS_ON]->()
RETURN p.name, COUNT(r) AS count

I'm not actually interested in p.name, but without it the query returns just a single number. Is there a way to write this query so that it doesn't return unnecessary information, and would that improve performance?
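For context on the single number: COUNT is an aggregate, and every non-aggregated expression in RETURN acts as a grouping key. Dropping p.name removes the only grouping key, so the count collapses into one global total. The sketch below shows two ways to keep one row per package without returning the name. It assumes Neo4j 3.x or 4.x, where size() accepts a pattern expression; the alias outDegree is just an illustrative name, and whether either form is faster on this graph would need to be checked with PROFILE.

```cypher
// Variant 1: group by the node in WITH, then return only the count.
// Packages with no outgoing PACKAGE_DEPENDS_ON relationship are omitted.
MATCH (p:Package)-[r:PACKAGE_DEPENDS_ON]->()
WITH p, count(r) AS outDegree
RETURN outDegree;

// Variant 2: ask for the degree of each node directly.
// Packages with no such relationship are included with outDegree = 0.
// In Neo4j 5 the equivalent is COUNT { (p)-[:PACKAGE_DEPENDS_ON]->() }.
MATCH (p:Package)
RETURN size((p)-[:PACKAGE_DEPENDS_ON]->()) AS outDegree;
```

The second variant lets the planner read each node's degree instead of expanding every relationship, which typically helps on dense nodes. Note that the two variants differ in how they treat packages with zero dependencies.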

  1. Can you share the schema?
  2. Can you share the EXPLAIN plan for the query?
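Both items can be produced from the Neo4j Browser or cypher-shell. A minimal sketch, reusing the query from the question as-is:

```cypher
// Schema overview: a built-in procedure that returns the labels and
// relationship types in use (rendered as a graph or as JSON in the Browser).
CALL db.schema.visualization();

// Execution plan without running the query.
EXPLAIN
MATCH (p:Package)-[r:PACKAGE_DEPENDS_ON]->()
RETURN p.name, COUNT(r) AS count;
```

Replacing EXPLAIN with PROFILE runs the query and also reports rows and db hits per operator, which is more informative but takes as long as the query itself.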

@taobojlen I guess you could use degree centrality with carefully chosen arguments. In your case, you could specify "OUTGOING", since you only want to count outgoing edges. https://neo4j.com/blog/graph-algorithms-neo4j-degree-centrality/
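As a rough sketch of this suggestion: the linked post uses the older Graph Algorithms library, whose procedure names differ from the ones below. This version assumes the Graph Data Science (GDS) 2.x plugin is installed, and 'packageDeps' is a hypothetical name for the in-memory graph.

```cypher
// Assumption: GDS 2.x. Project only Package nodes and
// PACKAGE_DEPENDS_ON relationships into a named in-memory graph.
CALL gds.graph.project('packageDeps', 'Package', 'PACKAGE_DEPENDS_ON');

// Degree centrality; the default orientation is NATURAL,
// so score is the number of outgoing relationships per node.
CALL gds.degree.stream('packageDeps')
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score AS outDegree
ORDER BY outDegree DESC;
```

The projection step has its own cost, so this mainly pays off when several algorithms are run on the same projected graph.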

Yes, here is the schema. I've added the table format since the visualization did not show all labels.

Graph schema

Screenshot from 2020-07-02 09-48-39

  • Red: Version
  • Blue: VersionRequirement
  • Orange: Package
  • Beige: User

Nodes

[
  {
    "identity": -3,
    "labels": [
      "VersionRequirement"
    ],
    "properties": {
      "indexes": [],
      "name": "VersionRequirement",
      "constraints": []
    }
  },
  {
    "identity": -2,
    "labels": [
      "User"
    ],
    "properties": {
      "indexes": [],
      "name": "User",
      "constraints": []
    }
  },
  {
    "identity": -4,
    "labels": [
      "Version"
    ],
    "properties": {
      "indexes": [],
      "name": "Version",
      "constraints": []
    }
  },
  {
    "identity": -1,
    "labels": [
      "Package"
    ],
    "properties": {
      "indexes": [],
      "name": "Package",
      "constraints": []
    }
  }
]

Relationships

[
  {
    "identity": -6,
    "start": -4,
    "end": -4,
    "type": "DEPENDS_ON_RESOLVES_TO",
    "properties": {}
  },
  {
    "identity": -8,
    "start": -1,
    "end": -1,
    "type": "PACKAGE_DEPENDS_ON",
    "properties": {}
  },
  {
    "identity": -4,
    "start": -3,
    "end": -4,
    "type": "RESOLVES_TO",
    "properties": {}
  },
  {
    "identity": -7,
    "start": -4,
    "end": -4,
    "type": "NEXT_VERSION",
    "properties": {}
  },
  {
    "identity": -1,
    "start": -4,
    "end": -3,
    "type": "DEPENDS_ON",
    "properties": {}
  },
  {
    "identity": -2,
    "start": -2,
    "end": -4,
    "type": "MAINTAINS",
    "properties": {}
  },
  {
    "identity": -3,
    "start": -3,
    "end": -1,
    "type": "REQUIREMENT_OF",
    "properties": {}
  },
  {
    "identity": -5,
    "start": -4,
    "end": -1,
    "type": "VERSION_OF",
    "properties": {}
  }
]

And the explain plan:
Screenshot from 2020-07-02 09-50-54

I might be missing something from this link, but the query there looks very similar to the one I have. In my query I do specify a direction for the relationship.

Well, running the query today seems totally fine -- yesterday I was waiting for ~30 minutes and assumed that it was due to a badly written query. It seems like Neo4j was stuck in a bad state. Sometimes turning it off and on again does work...!