VM stop-the-world pause for simple LOAD csv query for creating a graph

Hi,

My code is as follows:

rom neo4j.v1 import GraphDatabase, basic_auth
import csv,os,glob
import time

print(os.getcwd())
subdir = '.\\..\\Dataset\\\\plasmin\\' #plasmin, acetylcholine_esterase
os.chdir(subdir)
print(os.getcwd())

fileList = [x.split(".")[0] for x in glob.glob('*.scalaprocessed_maxdist11_graph')]
#fileList = ['1A25']
print(len(fileList),fileList)

driver = GraphDatabase.driver("bolt://127.0.0.1:11005", auth=("neo4j", "123")) #7687 sandbox: neo4j, neo4j
session = driver.session()

session.run("MATCH(n) DETACH DELETE n")

for prot in fileList:
    print("\n", prot)
    if os.path.exists(prot+"_AllCCs_maxDist11.csv"):
        print("Already processed")
        continue
    # prerequisite. delete any pre-existing graph (else wil get added up for each subsequent protein)
    session.run("MATCH(n) DETACH DELETE n")
    
    #1. Create graph
    start = time.time()
    session.run("USING PERIODIC COMMIT "
                 "LOAD CSV FROM 'file:///'+{prot}+'.scalaProcessed_maxdist11/part-00000' AS line "
                 "MERGE (n:MyNode {Name:line[0]}) " 
                 "MERGE (m:MyNode {Name:line[1]}) " 
                 "MERGE (n) -[:TO {dist:line[2]}] -> (m) ", prot=prot)
    print("done step 1")
    #2 find CCs
    result = session.run("CALL algo.unionFind.stream('MyNode', 'TO', {graph:'huge'}) "
                          "YIELD nodeId,setId "
                          "MATCH (n) "
                          "WHERE id(n)=nodeId "
                          "WITH setId,collect(nodeId) as nodes, collect(n.Name) as labels,count(*) as size_of_component "
                          "ORDER BY size_of_component DESC "
                          "RETURN setId as componentId,size_of_component,labels as connectedTSRkeys ")
    # 3. save result
    with open(prot+"_AllCCs_maxDist11.csv","w",newline='') as csvfile: # newline='' <- to avoid blank line between two rows
        writer = csv.writer(csvfile,delimiter = ',')
        writer.writerow(['componentId','size_of_component','connectedTSRkeys'])
        for record in result:
            record =  str(record)[:-1].replace(", ",",").replace("'","").split() 
            #print("\n",record[1],record[2],record[3])
            writer.writerow([record[1].split("=")[1], record[2].split("=")[1], record[3].split("=")[1]])
    
    # 4. delete the graph (else wil get added up for each subsequent protein)
    session.run("MATCH(n) DETACH DELETE n")
    print("Total processing time = ", time.time()-start)

driver.close()
print ("Neo4j end")```

- I am using:
**CentOS 7** on a remote machine with **neo4j 3.5.8**.
**dbms.memory.heap.initial_size=32G
dbms.memory.heap.max.size=64G
dbms.memory.pagecache.size=128G** where available 256GB

- I am using python code to access neo4j through neo4j-python driver. 

The problem I am encountering is a VM stop-the-world pause is getting imposed after creating around 10000 nodes and the execution stalls though the code does not stop. 

The neo4j.log seems to be clean. 
Snapshot of the debug log:
![image|690x420](upload://fr9uU8Tu6uaaHRXqyryFbBCAxJn.png) 

Most of the available solutions are talking about adjusting the heap size, which in my case is more than enough I feel.

couple of issues

have you narrowed this down to the 'LOAD CSV' is triggering the vm stop-the-world? If so do you have indexes on :MyNode and on the Name property?

your code includes match (n) detach delete n; How many nodes/relationships does this end up deleting? If you are deleting 100k+ nodes relationships you might want to consider batching this using apoc.periodic.iterate. See https://neo4j.com/developer/kb/large-delete-transaction-best-practices-in-neo4j/.

vm stop-the-world messages are indicative of a java garbage collection and are a part of a java based app. You cant eliminate them for good but rather attempt to minimize

First, yes, LOAD CSV is causing the stop-the-world pause. I have checked periodically hoe many nodes are getting created, until 11k nodes are getting created (only nodes, not relationships yet), then the above pause is getting applied. The code is not moving to the next step.

Second, in pre cleanup stage it’s deleting the above created 11k nodes at maximum. The operation takes less than 1 sec.

Third, I have executed the same code in my laptop with 20k nodes in total to be created (min heap=2G, max heap=4G, available RAM=4G), the code does not get paused, but takes long time; for each protein (input file for one graph),
step 1 < 5 sec
step 2 > 24 hour
step 3 < 2 sec.
Therefore, it is not a feasible solution to use such low power machine to process > 2000 proteins in a loop.

thanks for the update.

If so do you have indexes on :MyNode and on the Name property?

I decided not to keep indexes on :MyNode on Name property as its seems not to be useful for me. If we think in terms of relational database, I have only one table in my database with one attribute Name, we have self-pointers for the table. Now, by indexing we cannot speedup to any level, while searching, we have to look up all the keys.

Here is a snapshot of one of my input file:

key_i    ,key_j    ,relatinship
7148759_2,7151776_1,783_788
7151776_1,8071389_1,783_788
7047240_2,7148759_2,597_639
8394145_1,8800128_1,732_735
6400697_1,6401718_1,612_653

We can see key '7148759_2' appeared as key_i in first line and as key_j in the third line in the relational table. We have only keys in a single level of the hierarchy, therefore, I guessed indexing on Name (which is key_i/key_j here) might not help.

Please correct me if I am wrong.

indexing will certainly help
Without an index on :MyNode(Name) a cypher statement that attempts to find all nodes with label :MyNode will perform a ScanByLabel and iterate over all nodes with this label and evaulate each of these nodes against the WHERE clause. However with an index it will simply perform a NodeIndexScan. In the case of an environment that has for example 100m nodes with label :MyNode and each Name property is unique, with no index you effectively have to examine all 100m nodes. With an index you would examine only 1 node

Index is the way to go. But this is no different whether graph or RDBMS. Same principles apply

I will apply index and get back to you with the results within 24 hours.

FYI, my last run of code got some results, though not complete: for one input file, it took 15 hr to create 11594 nodes and corresponding relationships. It failed after that. Can you think of any reason what might take 15 hr for a graph to be created, given the above settings?

Keep in mind that without those indexes, each MERGE you were performing was doing a node by label scan and property access (to determine which nodes fit your pattern)

You were performing 2 MERGEs per row, for every row, so if your CSV had 11594 rows then 23188 label scans were being performed along with property access total. While this would be cheap at the start, it would have likely grown more and more expensive the more nodes were present.

We would also advise you to do separate passes through the CSV for nodes. Your two MERGE operations on the same label likely created an Eager operation which disabled the periodic commit behavior of the load, leading to heap pressure and likely to issues with processing. Try creating the nodes first (if all exist under line[0]) then just one pass...but otherwise two passes one per line). When all nodes are created then for the relationships just do a pass using MATCH (instead of MERGE) and then MERGE the relationship.

Here is an update:
I have incorporated indexing before starting to load data from CSV and it improved step1. Here is the updated code (I have updated the deletion approach also).

.
.
.
driver = GraphDatabase.driver("bolt://127.0.0.1:7687", auth=("neo4j", "123")) #7687 sandbox: neo4j, neo4j
session = driver.session()

print("\nConnection Established.")

def delete_all():
        # get total #of nodes 
        res = session.run("MATCH(n) RETURN COUNT(*) AS n")
        total_nodes=0
        for item in res:
            total_nodes= item["n"]
        print("\n Existing nodes in db:", total_nodes)

        # get total #of relationships
        res1 = session.run("MATCH (n)-[r]->() RETURN COUNT(r) as r")
        total_rels=0
        for item in res1:
            total_rels= item["r"]
        print("\n Existing relationships in db:", total_rels)

        # delete all nodes in batches (for faster deletion)
        while total_nodes > 0 :
                res = session.run("MATCH(n) WITH n LIMIT 10000 DETACH DELETE n RETURN COUNT(n) AS count")
                count=0
                for item in res:
                        count = item["count"]#updates deleeted node count here
                total_nodes = total_nodes-count
        print("\n #of nodes in db after deletion completed = ", total_nodes)
start = time.time()
delete_all()
print("\n Pre cleanup time (sec): ", time.time()-start)

for prot in fileList:
    print("\n\n", prot)
    if os.path.exists(prot+"_AllCCs_maxDist11.csv"):
        print("\n Already Processed.")
        continue
    start = time.time()
    delete_all()
    pre_time = time.time()-start
    print("\n Pre cleanup time (sec): ", pre_time)

    # Database preparation 
    session.run("CREATE INDEX ON :MyNode(Name)")

    #1. Create graph
    start = time.time()
    session.run("USING PERIODIC COMMIT "
                 "LOAD CSV FROM 'file:///'+{prot}+'_conflict_resolved.txt' AS line "
                 "MERGE (n:MyNode {Name:line[0]}) "
                 "MERGE (m:MyNode {Name:line[1]}) "
                 "MERGE (n) -[:TO {dist:line[2]}] -> (m) ", prot=prot)

    end = time.time()
    step1_time = end - start
    print("\n Step 1 time (in sec) = ", end-start)

    #2 find CCs
    start = time.time()
    result = session.run("CALL algo.unionFind.stream('MyNode', 'TO', {graph:'huge'}) "
                          "YIELD nodeId,setId "
                          "MATCH (n) "
                          "WHERE id(n)=nodeId "
                          "WITH setId,collect(nodeId) as nodes, collect(n.Name) as labels,count(*) as size_of_component "
                          "ORDER BY size_of_component DESC "
                          "RETURN setId as componentId,size_of_component,labels as connectedTSRkeys ")
    end = time.time()
    step2_time = end - start
    print("\n Step 2 time (in sec) = ", end-start)
   #3. save result
    start = time.time()
    with open(prot+"_AllCCs_maxDist11.csv","w") as csvfile: # newline='' <- to avoid blank line between two rows
        writer = csv.writer(csvfile,delimiter = ',')
        writer.writerow(['componentId','size_of_component','connectedTSRkeys'])
        for record in result:
            record =  str(record)[:-1].replace(", ",",").replace("'","").split()
            print("\n",record[1],record[2],record[3])
            writer.writerow([record[1].split("=")[1], record[2].split("=")[1], record[3].split("=")[1]])
    end = time.time()
    step3_time = end - start
    print("\n Step 3 time (in sec) = ", end-start)

    #4. delete graph
    start = time.time()
    delete_all()
    end = time.time()
    post_time = end - start
    print("\n Post cleanup time (in sec) = ", end-start)

    print("\n Total time = ", pre_time+step1_time+step2_time+step3_time+post_time)

driver.close()
print ("\n Neo4j end")

The code executed for one input file and generated:

Existing nodes in db:11684
Existing relationships in db:4432666
Pre cleanup time (sec): 0.014729022979736328
Step 1 time (in sec) = 0.0004360675811767578
Step 2 time (in sec) = 1128.0729930400848
Step 3 time (in sec) = 0.19386005401611328
Post cleanup time (in sec) = 89.63982391357422
Total time = 1217.921842098236
Traceback (most recent call last):
  File "neo4j_copy.py", line 80, in <module>
    result = session.run("CALL algo.unionFind.stream('MyNode', 'TO', {graph:'huge'}) "
  File "/usr/lib64/python2.7/site-packages/neo4j/v1/api.py", line 331, in run
    self._connection.fetch()
  File "/usr/lib64/python2.7/site-packages/neo4j/bolt/connection.py", line 287, in fetch
    return self._fetch()
  File "/usr/lib64/python2.7/site-packages/neo4j/bolt/connection.py", line 327, in _fetch
    response.on_failure(summary_metadata or {})
  File "/usr/lib64/python2.7/site-packages/neo4j/v1/result.py", line 70, in on_failure
    raise CypherError.hydrate(**metadata)
neo4j.exceptions.DatabaseError: The database has encountered a critical error, and needs to be restarted. Please see database logs for more details.

Here is a snapshot of debug.log file after trying to restart neo4j:

2019-08-23 23:01:25.580+0000 INFO [o.n.i.d.DiagnosticsManager]   Total size of mapped files: 477.17 MB
2019-08-23 23:01:25.580+0000 INFO [o.n.i.d.DiagnosticsManager] --- STARTED diagnostics for KernelDiagnostics:StoreFiles END ---
2019-08-23 23:01:25.580+0000 INFO [o.n.k.a.DatabaseAvailabilityGuard] Fulfilling of requirement 'Database available' makes database graph.db available.
2019-08-23 23:01:25.580+0000 INFO [o.n.k.a.DatabaseAvailabilityGuard] Database graph.db is ready.
2019-08-23 23:01:25.580+0000 INFO [o.n.k.i.DatabaseHealth] Database health set to OK
2019-08-23 23:01:26.351+0000 INFO [o.n.b.BoltServer] Bolt server loaded
2019-08-23 23:01:26.468+0000 INFO [o.n.b.BoltServer] Bolt enabled on 127.0.0.1:7687.
2019-08-23 23:01:27.763+0000 INFO [o.n.i.d.DiagnosticsManager] --- SERVER STARTED START ---
2019-08-23 23:01:28.753+0000 INFO [o.n.i.d.DiagnosticsManager] --- SERVER STARTED END ---
2019-08-23 23:01:35.913+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=131, gcTime=136, gcCount=1}
2019-08-23 23:01:43.286+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=100, gcTime=140, gcCount=1}
2019-08-23 23:01:55.271+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=118, gcTime=119, gcCount=1}
2019-08-23 23:03:02.050+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=111, gcTime=123, gcCount=1}
2019-08-23 23:04:09.150+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=149, gcTime=152, gcCount=1}
2019-08-23 23:04:15.430+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=173, gcTime=257, gcCount=1}
2019-08-23 23:05:31.105+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=220, gcTime=227, gcCount=1}
2019-08-23 23:06:55.569+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=326, gcTime=328, gcCount=1}
2019-08-23 23:08:46.308+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=103, gcTime=0, gcCount=0}
2019-08-23 23:12:28.911+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=115, gcTime=181, gcCount=1}
2019-08-23 23:13:25.994+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=146, gcTime=171, gcCount=1}
2019-08-23 23:14:51.262+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=198, gcTime=249, gcCount=1}
2019-08-23 23:15:19.198+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=104, gcTime=156, gcCount=1}
2019-08-23 23:17:37.486+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=104, gcTime=157, gcCount=1}
2019-08-23 23:18:06.551+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=130, gcTime=141, gcCount=1}
2019-08-23 23:18:34.201+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=115, gcTime=160, gcCount=1}
2019-08-23 23:19:02.844+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=110, gcTime=172, gcCount=1}
2019-08-23 23:19:15.222+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=241, gcTime=309, gcCount=1}
2019-08-23 23:19:19.044+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=499, gcTime=506, gcCount=1}
2019-08-23 23:19:24.462+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=613, gcTime=575, gcCount=1}
2019-08-23 23:19:28.577+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=411, gcTime=365, gcCount=1}
2019-08-23 23:19:33.258+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=276, gcTime=305, gcCount=1}
2019-08-23 23:19:33.968+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=107, gcTime=168, gcCount=1}
2019-08-23 23:19:36.453+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=181, gcTime=0, gcCount=0}
2019-08-23 23:19:37.011+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=218, gcTime=0, gcCount=0}
2019-08-23 23:19:38.382+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=269, gcTime=265, gcCount=1}
2019-08-23 23:19:40.924+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=339, gcTime=331, gcCount=1}
2019-08-23 23:19:44.498+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=352, gcTime=380, gcCount=1}
2019-08-23 23:19:49.934+0000 INFO [o.n.k.NeoStoreDataSource] Rotated to transaction log [/var/lib/neo4j/data/databases/graph.db/neostore.transaction.db.16] version=16, last transaction in previous log=7870
2019-08-23 23:19:52.906+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=1060, gcTime=1086, gcCount=1}
2019-08-23 23:19:58.107+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=1043, gcTime=1056, gcCount=1}
2019-08-23 23:20:04.868+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=3204, gcTime=3257, gcCount=1}
2019-08-23 23:20:08.301+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=2331, gcTime=2314, gcCount=1}
2019-08-23 23:20:19.771+0000 INFO [o.n.k.i.p.Procedures] Node Store Scan: Imported 11,684 records and 0 properties from 1615 KiB (1,654,380 bytes); took 0.089 s, 131,692.26 Nodes/s, 17 MiB/s (18,646,784 bytes/s) (per thread: 32,923.06 Nodes/s, 4552 KiB/s (4,661,696 bytes/s))
2019-08-23 23:20:19.915+0000 INFO [o.n.k.i.p.Procedures] [relationship-store-scan-2] LOADING 1% Memory usage: 3864 KiB
2019-08-23 23:20:20.363+0000 INFO [o.n.k.i.p.Procedures] Relationship Store Scan: Imported 4,432,666 records and 0 properties from 287 MiB (301,740,480 bytes); took 0.579 s, 7,661,493.04 Relationships/s, 497 MiB/s (521,533,223 bytes/s) (per thread: 1,915,373.26 Relationships/s, 124 MiB/s (130,383,305 bytes/s))
2019-08-23 23:20:20.364+0000 INFO [o.n.k.i.p.Procedures] [neo4j.BoltWorker-5 [bolt] [/127.0.0.1:47196] ] LOADING 100% Memory usage: 6284 KiB
2019-08-23 23:20:20.465+0000 INFO [o.n.k.i.p.Procedures] [algo-3] ParallelUnionFind 100%
2019-08-23 23:20:20.524+0000 INFO [o.n.k.i.p.Procedures] UnionFind: overall memory usage: 6375 KiB
2019-08-23 23:20:20.971+0000 INFO [o.n.c.i.ExecutionEngine] Discarded stale query from the query cache after 1129 seconds: MATCH (n)-[r]->() RETURN COUNT(r) as r
2019-08-23 23:20:27.094+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=1871, gcTime=0, gcCount=0}
2019-08-23 23:20:35.282+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=1634, gcTime=1623, gcCount=1}
2019-08-23 23:31:54.221+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=262, gcTime=258, gcCount=1}
2019-08-23 23:31:58.854+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=228, gcTime=228, gcCount=1}
2019-08-23 23:33:05.202+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=289, gcTime=301, gcCount=1}
2019-08-23 23:33:19.039+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=121, gcTime=72, gcCount=1}
2019-08-23 23:34:21.600+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=289, gcTime=286, gcCount=1}
2019-08-23 23:35:09.729+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=168, gcTime=126, gcCount=1}
2019-08-23 23:35:37.672+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=109, gcTime=93, gcCount=1}
2019-08-23 23:35:53.499+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=308, gcTime=324, gcCount=1}
2019-08-23 23:36:52.506+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=111, gcTime=78, gcCount=1}
2019-08-23 23:37:43.229+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=116, gcTime=144, gcCount=1}
2019-08-23 23:37:53.015+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=776, gcTime=429, gcCount=1}
2019-08-23 23:38:02.838+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=712, gcTime=641, gcCount=1}
2019-08-23 23:38:10.275+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=604, gcTime=634, gcCount=1}
2019-08-23 23:38:15.773+0000 ERROR [o.n.k.i.DatabaseHealth] Database panic: The database has encountered a critical error, and needs to be restarted. Please see database logs for more details. No space left on device
java.io.IOException: No space left on device
	at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
	at sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:60)
	at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
	at sun.nio.ch.IOUtil.write(IOUtil.java:65)
	at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:211)
	at org.neo4j.io.fs.StoreFileChannel.write(StoreFileChannel.java:160)
	at org.neo4j.io.fs.StoreFileChannel.writeAll(StoreFileChannel.java:73)
	at org.neo4j.kernel.impl.transaction.log.PhysicalLogVersionedStoreChannel.writeAll(PhysicalLogVersionedStoreChannel.java:67)
	at org.neo4j.kernel.impl.transaction.log.PhysicalFlushableChannel.prepareForFlush(PhysicalFlushableChannel.java:72)
	at org.neo4j.kernel.impl.transaction.log.PhysicalFlushableChannel.bufferWithGuaranteedSpace(PhysicalFlushableChannel.java:158)
	at org.neo4j.kernel.impl.transaction.log.PhysicalFlushableChannel.putLong(PhysicalFlushableChannel.java:121)
	at org.neo4j.kernel.impl.transaction.log.PositionAwarePhysicalFlushableChannel.putLong(PositionAwarePhysicalFlushableChannel.java:81)
	at org.neo4j.kernel.impl.transaction.log.PositionAwarePhysicalFlushableChannel.putLong(PositionAwarePhysicalFlushableChannel.java:30)
	at org.neo4j.kernel.impl.transaction.command.Command$PropertyCommand.writePropertyRecord(Command.java:424)
	at org.neo4j.kernel.impl.transaction.command.Command$PropertyCommand.serialize(Command.java:412)
	at org.neo4j.kernel.impl.transaction.log.entry.StorageCommandSerializer.visit(StorageCommandSerializer.java:43)
	at org.neo4j.kernel.impl.transaction.log.entry.StorageCommandSerializer.visit(StorageCommandSerializer.java:30)
	at org.neo4j.kernel.impl.transaction.log.PhysicalTransactionRepresentation.accept(PhysicalTransactionRepresentation.java:70)
	at org.neo4j.kernel.impl.transaction.log.entry.LogEntryWriter.serialize(LogEntryWriter.java:84)
	at org.neo4j.kernel.impl.transaction.log.TransactionLogWriter.append(TransactionLogWriter.java:43)
	at org.neo4j.kernel.impl.transaction.log.BatchingTransactionAppender.appendToLog(BatchingTransactionAppender.java:211)
	at org.neo4j.kernel.impl.transaction.log.BatchingTransactionAppender.append(BatchingTransactionAppender.java:119)
	at org.neo4j.kernel.impl.api.TransactionRepresentationCommitProcess.appendToLog(TransactionRepresentationCommitProcess.java:64)
	at org.neo4j.kernel.impl.api.TransactionRepresentationCommitProcess.commit(TransactionRepresentationCommitProcess.java:48)
	at org.neo4j.kernel.impl.api.KernelTransactionImplementation.commit(KernelTransactionImplementation.java:716)
	at org.neo4j.kernel.impl.api.KernelTransactionImplementation.closeTransaction(KernelTransactionImplementation.java:586)
	at org.neo4j.internal.kernel.api.Transaction.close(Transaction.java:178)
	at org.neo4j.kernel.impl.coreapi.TopLevelTransaction.close(TopLevelTransaction.java:77)
	at org.neo4j.kernel.impl.query.Neo4jTransactionalContext.close(Neo4jTransactionalContext.java:130)
	at org.neo4j.cypher.internal.runtime.interpreted.TransactionalContextWrapper.close(TransactionalContextWrapper.scala:74)
	at org.neo4j.cypher.internal.runtime.interpreted.DelegatingQueryTransactionalContext.close(DelegatingQueryContext.scala:341)
	at org.neo4j.cypher.internal.compatibility.v3_5.ExceptionTranslatingQueryContext$ExceptionTranslatingTransactionalContext.org$neo4j$cypher$internal$compatibility$v3_5$ExceptionTranslatingQueryContext$ExceptionTranslatingTransactionalContext$$sup2019-08-23 23:41:36.957+0000 WARN [o.n.b.r.DefaultBoltConnection] The client is unauthorized due to authentication failure.
2019-08-23 23:41:44.486+0000 ERROR [o.n.b.v.r.ErrorReporter] Client triggered an unexpected error [Neo.DatabaseError.Transaction.TransactionStartFailed]: The database has encountered a critical error, and needs to be restarted. Please see database logs for more details., reference d615fd77-36c4-4be9-aa16-3fc32f0cd578.
2019-08-23 23:41:44.486+0000 ERROR [o.n.b.v.r.ErrorReporter] Client triggered an unexpected error [Neo.DatabaseError.Transaction.TransactionStartFailed]: The database has encountered a critical error, and needs to be restarted. Please see database logs for more details., reference d615fd77-36c4-4be9-aa16-3fc32f0cd578. The database has encountered a critical error, and needs to be restarted. Please see database logs for more details.
org.neo4j.graphdb.TransactionFailureException: The database has encountered a critical error, and needs to be restarted. Please see database logs for more details.
	at org.neo4j.graphdb.facade.spi.ClassicCoreSPI.beginTransaction(ClassicCoreSPI.java:193)
	at org.neo4j.kernel.impl.factory.GraphDatabaseFacade.beginTransactionInternal(GraphDatabaseFacade.java:691)
	at org.neo4j.kernel.impl.factory.GraphDatabaseFacade.beginTransaction(GraphDatabaseFacade.java:375)
	at org.neo4j.bolt.v1.runtime.TransactionStateMachineV1SPI.beginTransaction(TransactionStateMachineV1SPI.java:139)
	at org.neo4j.bolt.v1.runtime.TransactionStateMachineV1SPI.beginTransaction(TransactionStateMachineV1SPI.java:98)
	at org.neo4j.bolt.v1.runtime.TransactionStateMachine$State$1.execute(TransactionStateMachine.java:293)
	at org.neo4j.bolt.v1.runtime.TransactionStateMachine$State$1.run(TransactionStateMachine.java:269)
	at org.neo4j.bolt.v1.runtime.TransactionStateMachine.run(TransactionStateMachine.java:110)
	at org.neo4j.bolt.v3.runtime.ReadyState.processRunMessage(ReadyState.java:85)
	at org.neo4j.bolt.v3.runtime.ReadyState.processUnsafe(ReadyState.java:56)
	at org.neo4j.bolt.v3.runtime.FailSafeBoltStateMachineState.process(FailSafeBoltStateMachineState.java:48)
	at org.neo4j.bolt.v1.runtime.BoltStateMachineV1.nextState(BoltStateMachineV1.java:144)
	at org.neo4j.bolt.v1.runtime.BoltStateMachineV1.process(BoltStateMachineV1.java:92)
	at org.neo4j.bolt.messaging.BoltRequestMessageReader.lambda$doRead$1(BoltRequestMessageReader.java:89)
	at org.neo4j.bolt.runtime.DefaultBoltConnection.processNextBatch(DefaultBoltConnection.java:191)
	at org.neo4j.bolt.runtime.DefaultBoltConnection.processNextBatch(DefaultBoltConnection.java:139)
	at org.neo4j.bolt.runtime.ExecutorBoltScheduler.executeBatch(ExecutorBoltScheduler.java:171)
	at org.neo4j.bolt.runtime.ExecutorBoltScheduler.lambda$scheduleBatchOrHandleError$2(ExecutorBoltScheduler.java:154)
	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.neo4j.internal.kernel.api.exceptions.TransactionFailureException: The database has org.neo4j.cypher.internal.compatibility.ClosingExecutionResult.org$neo4j$cypher$internal$compatibility$ClosingExecutionResult$$safely(ClosingExecutionResult.scala:167)
	at org.neo4j.cypher.internal.compatibility.ClosingExecutionResult.initiate(ClosingExecutionResult.scala:65)
	at org.neo4j.cypher.internal.compatibility.ClosingExecutionResult$.wrapAndInitiate(ClosingExecutionResult.scala:214)
	at org.neo4j.cypher.internal.compatibility.CypherCurrentCompiler$CypherExecutableQuery$$anonfun$execute$3.apply(CypherCurrentCompiler.scala:217)
	at org.neo4j.cypher.internal.compatibility.CypherCurrentCompiler$CypherExecutableQuery$$anonfun$execute$3.apply(CypherCurrentCompiler.scala:190)
	at org.neo4j.cypher.exceptionHandler$runSafely$.apply(exceptionHandler.scala:89)
	at org.neo4j.cypher.internal.compatibility.CypherCurrentCompiler$CypherExecutableQuery.execute(CypherCurrentCompiler.scala:224)
	at org.neo4j.cypher.internal.ExecutionEngine.execute(ExecutionEngine.scala:101)
	at org.neo4j.cypher.internal.javacompat.ExecutionEngine.executeQuery(ExecutionEngine.java:85)
	at org.neo4j.bolt.v1.runtime.TransactionStateMachineV1SPI$BoltResultHandleV1.start(TransactionStateMachineV1SPI.java:189)
	at org.neo4j.bolt.v1.runtime.TransactionStateMachine$State.startExecution(TransactionStateMachine.java:506)
	at org.neo4j.bolt.v1.runtime.TransactionStateMachine$State$1.execute(TransactionStateMachine.java:300)
	at org.neo4j.bolt.v1.runtime.TransactionStateMachine$State$1.run(TransactionStateMachine.java:269)
	at org.neo4j.bolt.v1.runtime.TransactionStateMachine.run(TransactionStateMachine.java:110)
	at org.neo4j.bolt.v1.runtime.TransactionStateMachine.run(TransactionStateMachine.java:98)
	at org.neo4j.bolt.v1.runtime.ReadyState.processRunMessage(ReadyState.java:140)
	at org.neo4j.bolt.v1.runtime.ReadyState.processRunMessage(ReadyState.java:100)
	at org.neo4j.bolt.v1.runtime.ReadyState.process(ReadyState.java:61)
	... 11 more
2019-08-23 23:44:46.963+0000 INFO [o.n.k.a.DatabaseAvailabilityGuard] Database graph.db is unavailable.
2019-08-23 23:44:48.380+0000 INFO [o.n.g.f.GraphDatabaseFacadeFactory] Shutdown started
2019-08-23 23:44:50.926+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=1954, gcTime=1569, gcCount=1}
2019-08-23 23:44:51.602+0000 INFO [o.n.g.f.GraphDatabaseFacadeFactory] Shutting down 'graph.db' database.
2019-08-23 23:44:51.603+0000 INFO [o.n.g.f.GraphDatabaseFacadeFactory] Shutdown started
2019-08-23 23:44:52.752+0000 INFO [o.n.i.d.DiagnosticsManager] --- STOPPING diagnostics START ---
2019-08-23 23:44:52.752+0000 INFO [o.n.i.d.DiagnosticsManager] --- STOPPING diagnostics END ---
2019-08-23 23:45:14.563+0000 INFO [o.n.g.f.m.e.CommunityEditionModule] No locking implementation specified, defaulting to 'community'
2019-08-23 23:45:14.774+0000 INFO [o.n.g.f.GraphDatabaseFacadeFactory] Creating 'graph.db' database.
2019-08-23 23:45:14.807+0000 INFO [o.n.k.a.DatabaseAvailabilityGuard] Requirement `Database available` makes database graph.db unavailable.
2019-08-23 23:45:14.808+0000 INFO [o.n.k.a.DatabaseAvailabilityGuard] Database graph.db is unavailable.
2019-08-23 23:45:14.825+0000 INFO [o.n.i.d.DiagnosticsManager] --- INITIALIZED diagnostics START ---
2019-08-23 23:45:14.825+0000 INFO [o.n.i.d.DiagnosticsManager] Neo4j Kernel properties:
2019-08-23 23:45:14.825+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.connector.http.enabled=true
2019-08-23 23:45:14.825+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.security.auth_enabled=true
2019-08-23 23:45:14.826+0000 INFO [o.n.i.d.DiagnosticsManager] unsupported.dbms.directories.neo4j_home=/var/lib/neo4j
2019-08-23 23:45:14.826+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.directories.import=/var/lib/neo4j/import
2019-08-23 23:45:14.826+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.memory.heap.initial_size=32G
2019-08-23 23:45:14.826+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.directories.run=/var/run/neo4j
2019-08-23 23:45:14.826+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.directories.lib=/usr/share/neo4j/lib
2019-08-23 23:45:14.826+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.connector.https.enabled=true
2019-08-23 23:45:14.826+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.jvm.additional=-XX:+UseG1GC
2019-08-23 23:45:14.826+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.tx_log.rotation.retention_policy=1 days
2019-08-23 23:45:14.826+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.security.procedures.unrestricted=apoc.,algo.*
2019-08-23 23:45:14.827+0000 INFO [o.n.i.d.DiagnosticsManager] *****************************************************************=
2019-08-23 23:45:14.827+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.windows_service_name=neo4j
2019-08-23 23:45:14.827+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.connector.bolt.listen_address=:7687
2019-08-23 23:45:14.827+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.security.allow_csv_import_from_file_urls=true
2019-08-23 23:45:14.827+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.memory.pagecache.size=128G
2019-08-23 23:45:14.827+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.directories.data=/var/lib/neo4j/data
2019-08-23 23:45:14.827+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.logs.gc.enabled=true
2019-08-23 23:45:14.827+0000 INFO [o.n.i.d.DiagnosticsManager] unsupported.dbms.edition=community
2019-08-23 23:45:14.827+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.directories.logs=/var/log/neo4j
2019-08-23 23:45:14.827+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.connector.bolt.enabled=true
2019-08-23 23:45:14.828+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.directories.certificates=/var/lib/neo4j/certificates
2019-08-23 23:45:14.828+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.memory.heap.max.size=64G
2019-08-23 23:45:14.828+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.directories.plugins=/var/lib/neo4j/plugins
2019-08-23 23:45:14.828+0000 INFO [o.n.i.d.DiagnosticsManager] Diagnostics providers:
2019-08-23 23:45:14.828+0000 INFO [o.n.i.d.DiagnosticsManager] org.neo4j.kernel.configuration.Config
2019-08-23 23:45:14.828+0000 INFO [o.n.i.d.DiagnosticsManager] org.neo4j.internal.diagnostics.DiagnosticsManager
2019-08-23 23:45:14.828+0000 INFO [o.n.i.d.DiagnosticsManager] SYSTEM_MEMORY
2019-08-23 23:45:14.828+0000 INFO [o.n.i.d.DiagnosticsManager] JAVA_MEMORY
2019-08-23 23:45:14.828+0000 INFO [o.n.i.d.DiagnosticsManager] OPERATING_SYSTEM
2019-08-23 23:45:14.828+0000 INFO [o.n.i.d.DiagnosticsManager] JAVA_VIRTUAL_MACHINE
2019-08-23 23:45:14.829+0000 INFO [o.n.i.d.DiagnosticsManager] CLASSPATH
2019-08-23 23:45:14.829+0000 INFO [o.n.i.d.DiagnosticsManager] LIBRARY_PATH
2019-08-23 23:45:14.829+0000 INFO [o.n.i.d.DiagnosticsManager] SYSTEM_PROPERTIES
2019-08-23 23:45:14.829+0000 INFO [o.n.i.d.DiagnosticsManager] TIMEZONE_DATABASE
2019-08-23 23:45:14.829+0000 INFO [o.n.i.d.DiagnosticsManager] LINUX_SCHEDULERS
2019-08-23 23:45:14.829+0000 INFO [o.n.i.d.DiagnosticsManager] NETWORK
2019-08-23 23:45:14.829+0000 INFO [o.n.i.d.DiagnosticsManager] System memory information:
2019-08-23 23:45:14.838+0000 INFO [o.n.i.d.DiagnosticsManager] Total Physical memory: 440.62 GB
2019-08-23 23:45:14.838+0000 INFO [o.n.i.d.DiagnosticsManager] Free Physical memory: 234.08 GB
2019-08-23 23:45:14.838+0000 INFO [o.n.i.d.DiagnosticsManager] Committed virtual memory: 75.27 GB
2019-08-23 23:45:14.838+0000 INFO [o.n.i.d.DiagnosticsManager] Total swap space: 4.00 GB
2019-08-23 23:45:14.839+0000 INFO [o.n.i.d.DiagnosticsManager] Free swap space: 2.51 GB
2019-08-23 23:45:14.839+0000 INFO [o.n.i.d.DiagnosticsManager] JVM memory information:
2019-08-23 23:45:14.839+0000 INFO [o.n.i.d.DiagnosticsManager] Free  memory: 31.95 GB
2019-08-23 23:45:14.839+0000 INFO [o.n.i.d.DiagnosticsManager] Total memory: 32.00 GB
2019-08-23 23:45:14.839+0000 INFO [o.n.i.d.DiagnosticsManager] Max   memory: 64.00 GB
2019-08-23 23:45:14.840+0000 INFO [o.n.i.d.DiagnosticsManager] Garbage Collector: G1 Young Generation: [G1 Eden Space, G1 Survivor Space, G1 Old Gen]
2019-08-23 23:45:14.840+0000 INFO [o.n.i.d.DiagnosticsManager] Garbage Collector: G1 Old Generation: [G1 Eden Space, G1 Survivor Space, G1 Old Gen]
2019-08-23 23:45:14.841+0000 INFO [o.n.i.d.DiagnosticsManager] Memory Pool: Code Cache (Non-heap memory): committed=8.81 MB, used=8.78 MB, max=240.00 MB, threshold=0.00 B
2019-08-23 23:45:14.841+0000 INFO [o.n.i.d.DiagnosticsManager] Memory Pool: Metaspace (Non-heap memory): committed=22.00 MB, used=20.65 MB, max=-1.00 B, threshold=0.00 B
2019-08-23 23:45:14.841+0000 INFO [o.n.i.d.DiagnosticsManager] Memory Pool: G1 Eden Space (Heap memory): committed=1.66 GB, used=16.00 MB, max=-1.00 B, threshold=?
2019-08-23 23:45:14.841+0000 INFO [o.n.i.d.DiagnosticsManager] Memory Pool: G1 Survivor Space (Heap memory): committed=32.00 MB, used=32.00 MB, max=-1.00 B, threshold=?
2019-08-23 23:45:14.842+0000 INFO [o.n.i.d.DiagnosticsManager] Memory Pool: G1 Old Gen (Heap memory): committed=30.31 GB, used=0.00 B, max=64.00 GB, threshold=0.00 B
2019-08-23 23:45:14.842+0000 INFO [o.n.i.d.DiagnosticsManager] Operating system information:
2019-08-23 23:45:14.842+0000 INFO [o.n.i.d.DiagnosticsManager] Operating System: Linux; version: 3.10.0-514.el7.x86_64; arch: amd64; cpus: 56
2019-08-23 23:45:14.842+0000 INFO [o.n.i.d.DiagnosticsManager] Max number of file descriptors: 40000
2019-08-23 23:45:14.843+0000 INFO [o.n.i.d.DiagnosticsManager] Number of open file descriptors: 133
2019-08-23 23:45:16.800+0000 WARN [o.n.k.r.LogTailScanner] Fail to read transaction log version 17. java.lang.IllegalStateException: Weird, inUse was read in as false but the record is Property[3130370,used=true,prev=-1,next=5545057041199923203,PropertyBlock[blocks=1,SHORT_STRING,key=1,value=231]]. At position LogPosition{logVersion=17, byteOffset=271044598} and entry version V3_0_10
java.io.IOException: java.lang.IllegalStateException: Weird, inUse was read in as false but the record is Property[3130370,used=true,prev=-1,next=5545057041199923203,PropertyBlock[blocks=1,SHORT_STRING,key=1,value=231]]. At position LogPosition{logVersion=17, byteOffset=271044598} and entry version V3_0_10
2019-08-23 23:45:16.802+0000 INFO [o.n.i.d.DiagnosticsManager] --- STOPPING diagnostics END ---
2019-08-23 23:45:16.802+0000 INFO [o.n.g.f.GraphDatabaseFacadeFactory] Shutdown started

Now I am not being able to connec to cypher-shell too.

Change your code to do a multi-pass, the two MERGE operation on the same label and property cause Cypher to eagerly pass all rows through the first statement and the the second to avoid endless loops.

You should see that in your query plan.
Just do 3 passes over the data. You can even use a WITH distinct line[0] as name

session.run("USING PERIODIC COMMIT "
             "LOAD CSV FROM 'file:///'+{prot}+'_conflict_resolved.txt' AS line "
             "WITH distinct line[0] as name "
             "MERGE (:MyNode {Name:name}) ", prot=prot)
session.run("USING PERIODIC COMMIT "
             "LOAD CSV FROM 'file:///'+{prot}+'_conflict_resolved.txt' AS line "
             "WITH distinct line[1] as name "
             "MERGE (:MyNode {Name:name}) ", prot=prot)
session.run("USING PERIODIC COMMIT "
             "LOAD CSV FROM 'file:///'+{prot}+'_conflict_resolved.txt' AS line "
             "MATCH (n:MyNode {Name:line[0]}) "
             "MATCH (m:MyNode {Name:line[1]}) "
             "MERGE (n) -[:TO {dist:line[2]}] -> (m) ", prot=prot)

Question: do you really want to have potentially multiple TO relationships between m and n ? with different dist values? or just one?
Then you will be faster just using CREATE there if there are no duplicates.
or MERGE on :TO but without the property.

Are those directed or undirected rels? in the latter case you can leave off the arrow tip from the relationship merge.