Hi there!
I have a question about how Neo4j stores its relationships on disk vs in the page cache.
My current understanding is based on this Neo4j knowledge base article about storing data on disk: Understanding Neo4j’s data on disk - Knowledge Base
And the "Most Important Slide you will Ever See" from Max De Marzi's presentation on Neo4j internals: https://cdn.neo4jlabs.com/nodes2019/slides/Data+Modeling+Tricks-pdf.pdf
From the looks of it, relationships stored on disk are kept in a single linked list attached to each node. My understanding is that, on disk, queries can't make use of things like relationship direction or type, and must simply iterate through the node's entire linked list of relationships to find the ones the query wants.
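To make that concrete, here is a toy Python sketch of how I picture the on-disk case. It is only my mental model, not Neo4j's actual record format or code: `RelRecord`, `scan_chain`, and the node names are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class RelRecord:
    """Hypothetical relationship record; not Neo4j's real layout."""
    rel_type: str
    start: str
    end: str
    next_id: Optional[int]  # pointer to the next record in this node's chain


def scan_chain(store: Dict[int, RelRecord], first_id: Optional[int],
               node: str, wanted_type: str) -> Tuple[List[int], int]:
    """Walk the whole chain and keep outgoing relationships of wanted_type.

    Returns the matching record ids and how many records were visited.
    """
    matches: List[int] = []
    visited = 0
    rel_id = first_id
    while rel_id is not None:
        rec = store[rel_id]
        visited += 1  # every record is read, whatever its type
        if rec.rel_type == wanted_type and rec.start == node:
            matches.append(rel_id)
        rel_id = rec.next_id
    return matches, visited


# One Video node with three incoming LIKE relationships and one outgoing HAS.
store = {
    0: RelRecord("LIKE", "user1", "video", 1),
    1: RelRecord("LIKE", "user2", "video", 2),
    2: RelRecord("HAS", "video", "tag1", 3),
    3: RelRecord("LIKE", "user3", "video", None),
}
matches, visited = scan_chain(store, 0, "video", "HAS")
```

In this model, finding the single :HAS relationship still means reading all four records in the chain.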
Relationships stored within the cache are also essentially stored as linked lists, but metadata about relationship type and direction is also stored in the cache, which allows queries to follow only the relevant relationships instead of all of them.
For example: I have a Video node, User nodes, and Tag nodes, where Users :LIKE a Video and a Video :HAS a Tag. If I want to query for a Video based on whether or not it :HAS a tag, the query can simply ignore all the :LIKE relationships attached to the Video node and jump straight into iterating through the :HAS relationships, so long as the data it is querying is stored in the page cache.
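Here is a toy Python sketch of how I imagine the cached case. Again, this is just my assumption about the behavior, not Neo4j's real data structure: `CachedNode` and its methods are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class CachedNode:
    """Hypothetical in-memory node that groups relationships by (type, direction)."""

    def __init__(self) -> None:
        self.groups: Dict[Tuple[str, str], List[str]] = defaultdict(list)

    def add(self, rel_type: str, direction: str, other: str) -> None:
        # File each relationship under its type and direction.
        self.groups[(rel_type, direction)].append(other)

    def neighbours(self, rel_type: str, direction: str) -> List[str]:
        # Only the requested group is touched; other types are never read.
        return list(self.groups.get((rel_type, direction), []))


video = CachedNode()
for user in ("user1", "user2", "user3"):
    video.add("LIKE", "IN", user)
video.add("HAS", "OUT", "tag1")

tags = video.neighbours("HAS", "OUT")
```

In this model the lookup for :HAS goes straight to its own group and never touches the three :LIKE entries.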
However, my understanding is that were this query to run on disk, the query would not be able to make use of the relationship type or direction, meaning it would have to iterate across every relationship attached to the node, regardless of type or direction.
Is my understanding correct? Or does Neo4j make use of relationship types and directions when querying data stored on disk?
Thank you!
Matt