
Neo4j constantly reads data from disk into memory and stops responding while I import data

gong
Node

Neo4j constantly reads data from disk into memory, which causes it to stop responding while I import data.
My machine has 55g of memory. Following the advice of neo4j-admin memrec, I set heap = 21g and pageCache = 24g.
I use neo4j-jdbc to import the data.
The import runs against a database that already stores 157608100 nodes and 157360000 edges.
You can see this in fig.1
fig.1
I found that Neo4j keeps reading data from disk into memory, and disk utilization (%util) stays at 100% (measured with iostat -x 1 100).
You can see it from fig.2


fig.2
In addition, data cannot be written to Neo4j while the import is running. You can see my debug.log in fig.3


fig.3
My Cypher statement is simple:

unwind {0} as row merge (s:node{obj_id:row.obj_id}) on create set s += row, s:entity

neo4j version 3.5.3
driver.name=Neo4j JDBC Driver
driver.version=3.4.0

12 REPLIES

Do you have an index on node(obj_id)?

Yes, I created a constraint on node(obj_id) and entity(obj_id).

What's the size of the list that you're unwinding?

You need to make sure your transactions don't grow too large. apoc.periodic.iterate is key to controlling transaction size.
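The idea of bounding transaction size can also be sketched on the client side. The snippet below is not apoc.periodic.iterate itself; it is a minimal stand-in, assuming the importer holds the rows in memory and sends each batch as its own transaction. The helper name `batches` and the sample rows are hypothetical.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batches(rows: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists holding at most batch_size rows each."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batch: List[T] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


# Hypothetical input: 5000 rows keyed by obj_id, as in the MERGE statement.
rows = [{"obj_id": i} for i in range(5000)]

# Each batch would be passed as the list parameter of one UNWIND statement
# and committed in its own transaction (the driver call is omitted here).
sizes = [len(b) for b in batches(rows, 2000)]
print(sizes)  # [2000, 2000, 1000]
```

Each yielded batch caps how much transactional state one commit has to hold, which is the same goal the batchSize setting serves inside apoc.periodic.iterate.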

The list has 2000 entries, and each transaction includes two unwind statements, so a transaction covers 4000 rows.

a transaction includes two unwind statements.

Didn't see this in your example, so I think we'll need to see the full query. Depending on what you're doing, it might be multiplicative, not additive, so potentially 2000 * 2000 rows

The other statement in the transaction is unwind {0} as row merge (s:node{obj_id:row.obj_id}) on create set s += row, s:event. It is similar to the first statement.

As long as these aren't in the same query, then you're likely fine, and it would be additive. If these are both in the same query then depending on the full query it could be multiplicative.
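The additive versus multiplicative distinction can be checked with a little arithmetic. The sketch below only models row counts in Python; it assumes that a second unwind placed in the same query runs once for every row produced by the first, and the function names are made up for illustration.

```python
from itertools import product


def rows_if_chained(first, second):
    """Rows produced when the second list is unwound once per row of the first."""
    return sum(1 for _ in product(first, second))


def rows_if_separate(first, second):
    """Rows produced when each list is unwound by its own statement."""
    return len(first) + len(second)


# Small stand-in lists so the pairing is easy to check by hand.
small_a, small_b = [1, 2, 3], ["x", "y"]
print(rows_if_chained(small_a, small_b))   # 6
print(rows_if_separate(small_a, small_b))  # 5

# The batch size from this thread: two lists of 2000 rows each.
batch = range(2000)
print(rows_if_separate(batch, batch))  # 4000
print(len(batch) * len(batch))         # 4000000
```

With two separate statements the transaction touches 4000 rows; if both lists were unwound in one query without resetting cardinality in between, it could be 4,000,000.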

I can write data into Neo4j if I increase the size of the page cache. To me, this means that Neo4j behaves like an in-memory database: it needs a page cache almost as large as the data.

I'm not quite sure what you're asking. Neo4j persists data to disk, but does use the pagecache to speed up operations. It is not a purely in memory database, but it does take advantage of caching in memory.

Definitely the more memory you have available, and the larger you can configure your pagecache such that as much of the graph is in memory as possible, the better, though of course you'll still need heap space for holding transactional data before commit.

I set the heap size to 8g, which I think is enough. I know that Neo4j persists data to disk. What I mean is that I can keep importing data quickly when I increase the page cache size, whereas increasing the heap size does not help.

The heap is useful, as this is basically the memory that is available as queries execute, and as transactional state builds up. Running with a low heap will restrict what kinds of queries you can execute, as certain queries that cover too much data, or that build up too much transactional state, could blow your heap and require a restart. Please don't minimize the role of the heap, although 8g is a decent amount for a small to medium size graph and small to medium queries.

In terms of ratio, especially with more memory available on a server, you are right that we favor more memory allocated to pagecache. You can use neo4j-admin memrec to get a recommendation for memory configuration given your available memory.

I have already used neo4j-admin memrec to get a recommended memory configuration. My server has only 64g of memory, but my data already occupies 85g and I would like to keep importing. I do not have enough memory for the page cache. So is adding more memory the only way to keep importing data quickly?
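To put the numbers from this thread side by side, here is a rough back-of-the-envelope sketch. It assumes the 85g figure is the size of the store files the page cache would ideally hold, and it uses the 55g / 21g / 24g configuration from the original post; both helper functions are hypothetical and only do arithmetic.

```python
def cache_coverage(page_cache_gb: float, store_gb: float) -> float:
    """Fraction of the store that fits in the page cache, capped at 1.0."""
    if store_gb <= 0:
        raise ValueError("store_gb must be positive")
    return min(1.0, page_cache_gb / store_gb)


def os_headroom_gb(total_gb: float, heap_gb: float, page_cache_gb: float) -> float:
    """Memory left for the operating system and everything else."""
    return total_gb - heap_gb - page_cache_gb


# 24g page cache against 85g of data.
print(round(cache_coverage(24, 85), 2))  # 0.28

# 55g machine with a 21g heap and a 24g page cache.
print(os_headroom_gb(55, 21, 24))  # 10
```

Under these assumptions only a bit more than a quarter of the data can be cached at once, so a MERGE that looks up obj_id values spread across the whole store would be expected to hit the disk often.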