I am running an import query using LOAD CSV with PERIODIC COMMIT on AWS, using the provided Community Edition AMI.
I am trying to debug a query that has some performance issues.
The query imports about 8 million relationships in total, so it is expected to take some time.
While monitoring progress through the data directory, I noticed that transaction files (250 MB each) are created about once a minute for the first 6 files, but the next one then takes about 20 minutes before the speed picks up again and returns to roughly normal.
I have not verified this systematically, but it seems to me that the problem is more likely to occur when the data directory contains many transaction files. After cleaning out the transaction files, the slowdown seems to happen later (but it still happens).
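To verify this more systematically, the gaps between transaction files could be measured with something like the sketch below. It assumes the files share the prefix `neostore.transaction.db` and that each file's modification time roughly marks when it was completed. The function name `creation_gaps` and the directory path in the usage line are made up for illustration.

```python
import os


def creation_gaps(directory, prefix="neostore.transaction.db"):
    """Return (filename, seconds since the previous file), ordered by mtime.

    Assumption: the modification time of each transaction file is a
    reasonable proxy for when that file was completed.
    """
    entries = []
    for name in os.listdir(directory):
        if name.startswith(prefix):
            path = os.path.join(directory, name)
            entries.append((os.path.getmtime(path), name))
    entries.sort()
    # Pair each file with its predecessor and report the time difference.
    return [
        (name, later - earlier)
        for (earlier, _), (later, name) in zip(entries, entries[1:])
    ]


# Hypothetical usage; replace the path with the actual transaction log directory:
# for name, gap in creation_gaps("/path/to/transaction/logs"):
#     print(f"{name}: {gap:.0f} s after the previous file")
```

A gap far larger than the others would point to the file during which the import stalled.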
debug.log shows many warnings like this one:
2020-04-03 05:05:08.875+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause {pauseTime=114, gcTime=163, gcCount=1}
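To get a feel for how much time these pauses add up to, the warnings could be tallied with a small script. This is only a sketch: the regular expression is based on the single line format shown above, the values are summed exactly as logged, and the names `parse_pauses` and `summarize` are made up for illustration.

```python
import re

# Matches the VmPauseMonitorComponent warning format shown above.
PAUSE_RE = re.compile(
    r"^(?P<ts>\S+ \S+)\s+WARN\s+\[o\.n\.k\.i\.c\.VmPauseMonitorComponent\] "
    r"Detected VM stop-the-world pause "
    r"\{pauseTime=(?P<pause>\d+), gcTime=(?P<gc>\d+), gcCount=(?P<count>\d+)\}"
)


def parse_pauses(lines):
    """Yield (timestamp, pauseTime, gcTime, gcCount) for each pause warning."""
    for line in lines:
        m = PAUSE_RE.match(line.strip())
        if m:
            yield m["ts"], int(m["pause"]), int(m["gc"]), int(m["count"])


def summarize(lines):
    """Count the pause warnings and total up the logged pauseTime values."""
    pauses = list(parse_pauses(lines))
    return {
        "warnings": len(pauses),
        "total_pause": sum(p[1] for p in pauses),
        "max_pause": max((p[1] for p in pauses), default=0),
    }


# Hypothetical usage; the path to debug.log depends on the installation:
# with open("debug.log") as f:
#     print(summarize(f))
```

If the total is small compared with the 20-minute stall, GC pauses alone probably do not explain the slowdown.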
The heap size is 18G.
My question is whether this is expected behaviour, and how I can avoid it.
Initially I believed it was related to my query or the data, but the slowdown happens at random stages of the import.
Thanks a lot!