Hi,
I am working on my PhD thesis and using Neo4J to study the links within a network of corrupt companies. While doing so, I came across a paper describing the advantages and disadvantages of Neo4J. The extract is as follows:
"In graph processing, application needs can be classified into two categories, online query processing, which requires low latency computing, and offline graph analytics, which requires high throughput computing. For example, community analysis on social networks or link analysis on click graphs are analytical tasks, it is important to ensure that the system also supports interactive user activities such as graph browsing and querying, like approximate shortest distances relying on indices or sketches derived from the data and building such indices or data sketches can be quite analytical.
Neo4J focuses on supporting online transaction processing on graph data (OLTP). It is like a regular database system but with more powerful and expressive data model. Neo4J is not distributed that is it does not handle graphs that are partitioned over multiple machines. This limits the size of graphs that Neo4J can efficiently handle. This is so because data access on graphs has no locality that is exploration on graphs incurs mostly random data access. For large graphs, that cannot be stored in memory, disk random access becomes the performance bottleneck. Moreover, a single machine does not have enough computation power compared with a distributed parallel system. It is difficult for Neo4J to handle web-scale graphs."
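To check that I understand the locality argument in the extract, I put together a minimal Python sketch. It has nothing to do with how Neo4J actually stores data. It assumes a made-up toy graph in which each node id is also its position in storage, and the helper names (bfs_access_order, non_sequential_jumps) are my own. It counts how often a breadth-first exploration jumps to a non-adjacent storage position:

```python
from collections import deque


def bfs_access_order(adjacency, start):
    """Return node ids in the order a breadth-first traversal visits them."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return order


def non_sequential_jumps(order):
    """Count consecutive accesses whose storage positions are not adjacent."""
    return sum(1 for a, b in zip(order, order[1:]) if abs(a - b) != 1)


# Hypothetical toy graph: the node id doubles as its position in storage.
adjacency = {
    0: [5, 9],
    1: [7],
    2: [8],
    3: [6],
    4: [],
    5: [2],
    6: [],
    7: [4],
    8: [1],
    9: [3],
}

order = bfs_access_order(adjacency, 0)
jumps = non_sequential_jumps(order)
print(order, jumps)
```

In this toy case 8 of the 9 steps land on a non-adjacent position, whereas a plain scan over positions 0 to 9 would have none. If I read the extract correctly, that is the access pattern that becomes expensive once the graph no longer fits in memory.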
My questions are:
How does Neo4J address these limitations, and how does it differ from other graph database platforms? If anyone can point me to a journal article on this topic, that would also be appreciated.
Thanks once again.