‘Real’ storage size in Neo4j

Hi, I am trying to compare the storage size of Neo4j and Protege for the same amount of information. According to much of the literature, the LPG format should be about ten times smaller than RDF triples. However, in my case, which has only 130 nodes and 130 edges, the Neo4j store (300 kB) is much larger than the Protege one (100 kB). Even if I ignore the logical log (100 kB), it is still larger. My questions are:

  1. Are there other types of data in Neo4j that I should also exclude?
  2. If the size really is what it is, is there an explanation for the difference between my results and the literature?



Perhaps some general observations might help triage what's going on:

  • Have you imported your RDF data and kept it reified?
  • How are you importing the data?

I don't know whether you've been using the Neosemantics plugin to import your data (https://neo4j.com/labs/nsmtx-rdf/). You may also find the following post useful when thinking about how to tackle reified nodes; remodelling them would reduce the amount of storage required (and would likely be the more appropriate model for LPG):

Thanks Lju. Actually, I didn't import anything from one into the other; I built the nodes and relationships one by one, separately, in the two tools. I believe they now hold the same information.


I think the different Neo4j store files store things in blocks of 8KB, and you can see that a lot of the sizes are multiples of 8. That means you can likely add a lot more data without the sizes shown by :sysinfo changing.

I don't think (off the top of my head) that there's a way you can have it flush to disk the exact amount of data that is there - it's always done in blocks of 8KB.

But if you load in more data, that 8KB block size stops being as much of an issue as it is when you've only loaded a hundred or so nodes and relationships.
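As a rough illustration of why the block size dominates at small scale, here is a minimal sketch of the rounding arithmetic. It assumes each store file grows in whole 8KB blocks, as described above; the per-record sizes (15 bytes per node, 34 bytes per relationship) and the function names are illustrative assumptions, not figures taken from this thread or from :sysinfo.

```python
BLOCK_BYTES = 8 * 1024  # 8KB block size mentioned above


def on_disk_size(record_count, record_bytes, block=BLOCK_BYTES):
    """Bytes a store file would occupy if it only grows in whole blocks."""
    raw = record_count * record_bytes
    blocks = -(-raw // block)  # integer ceiling division
    return blocks * block


def overhead_ratio(record_count, record_bytes, block=BLOCK_BYTES):
    """How many times larger the file is than the raw record data."""
    raw = record_count * record_bytes
    return on_disk_size(record_count, record_bytes, block) / raw


# Assumed record sizes, for illustration only.
NODE_BYTES = 15
REL_BYTES = 34

for count in (130, 1_000_000):
    print(count, "nodes:", on_disk_size(count, NODE_BYTES), "bytes,",
          "overhead x%.4f" % overhead_ratio(count, NODE_BYTES))
```

Under these assumptions, 130 nodes need under 2KB of raw record data but still occupy a full 8KB block, roughly four times the raw size, while at a million nodes the rounding adds well under 0.1%. A real store has several such files (nodes, relationships, properties, labels and so on), each with its own minimum size, so the fixed cost adds up quickly on a tiny graph.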

Cheers, Mark

Appreciate it, Mark! It does help.