Can't backup remote database using Neo4J

deemeetree · May 8, 2020, 12:57pm

I've been trying to remotely back up my Neo4J database for 2 days now and nothing works.

I run

sudo neo4j-admin backup --backup-dir=backup --name=graph.db-backup --from=1.1.1.1:1111 --timeout=50m

The files start to get saved, then are simply erased after a while, and there's no backup.

I tried setting --pagecache=16M and HEAP_SIZE but it had no effect. Sometimes it just stalls; sometimes I get an error like this:

unexpected error: java.io.IOException: org.neo4j.com.ComException: Channel has been closed

The DB I'm backing up is Enterprise 3.3.3 and the one I'm backing up with is 3.5.14

Thank you for any help.

Is this the normal Neo4J behavior?

david_allen · May 8, 2020, 1:20pm

The two notable things that you're mentioning here are:

1. You're backing up Neo4j 3.3 with Neo4j 3.5 tools. I believe there were some store upgrade changes between these versions, so I would not advise that. Have you tried using 3.3 tooling?
2. The concrete error that you've provided suggests there's a network interruption happening somewhere. It's tough to see what's happening without a full paste of the output of the command, and some knowledge of what's happening on the network between you and the database.
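
One way to capture the full output of a run is to wrap the command so that stdout and stderr are shown live and also appended to a file. This is a minimal sketch, not a Neo4j feature: run_logged and the log file name are hypothetical, and the backup flags in the usage comment are simply the ones posted earlier in this thread.

```shell
# Hypothetical wrapper (not part of Neo4j): run any command, show its output
# live, and append stdout, stderr and the exit status to a log file.
run_logged() {
  local logfile="$1"
  shift
  {
    echo "=== $(date -u '+%Y-%m-%d %H:%M:%S') UTC: $*"
    "$@" 2>&1
    # $? here is the exit status of the wrapped command
    echo "=== exit status: $?"
  } | tee -a "$logfile"
}

# Usage, with the flags from the original post:
# run_logged backup-run.log neo4j-admin backup --backup-dir=backup \
#   --name=graph.db-backup --from=1.1.1.1:1111 --timeout=50m
```

The function itself returns the status of tee, so the wrapped command's exit status should be read from the last line of the log.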

deemeetree · May 11, 2020, 12:05pm

Ok, I'm trying with 3.3 and I get this error now:

command failed: Backup failed: Unexpected Exception

How can I find out what's really happening? Is there a log file for the neo4j-admin backup command?

deemeetree · May 11, 2020, 1:09pm

And if that doesn't happen and the backup runs to the end, it automatically erases everything and nothing happens: the process stalls.

Honestly, I've been dealing with this for 5 days already. Is it supposed to be that hard to do a backup?

deemeetree · May 11, 2020, 3:19pm

Just to clarify once again: the files are being copied to the temp-copy folder inside the folder (backup) where I am making the backup. But either there is an error during the backup or, if the backup completes (judging by the size of temp-copy), then at the stage where I guess it's supposed to finalize the "write", it just erases everything, the process stalls, and everything disappears.

I'm doing a backup from a WebFaction server (production) to a Neo4J db hosted on AWS.

When I look at the log file in /var/log/neo4j/ there's no information there (just whether the database has launched or not), and either there is no other log I can access or I don't know where it is.

Could it be an issue with permissions of the folder where the backup is made?

Could it be I have to run neo4j-admin backup using systemctl?

Could you please provide some help on this because this topic is not so well documented in your manual and I think it's very important.
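
On the permissions question: one way to rule it out is to test, as the same user that runs the backup (root, if the command is run with sudo), whether the target directory exists, is writable, and has free space. A minimal sketch follows; check_backup_dir and the probe file name are hypothetical and not part of Neo4j.

```shell
# Hypothetical pre-flight check (not part of Neo4j): verify that the backup
# target is an existing directory the current user can write to, then
# report the free space on its filesystem.
check_backup_dir() {
  local dir="$1"
  if [ ! -d "$dir" ]; then
    echo "not a directory: $dir" >&2
    return 1
  fi
  # Create and remove a real file rather than trusting the permission bits
  local probe="$dir/.write-probe.$$"
  if ! ( : > "$probe" ) 2>/dev/null; then
    echo "cannot write to: $dir (running as $(id -un))" >&2
    return 1
  fi
  rm -f "$probe"
  df -h "$dir"
}

# Usage, with the directory name from the original command:
# check_backup_dir backup
```

A passing check only shows that the directory is usable; it says nothing about why the backup tool removes temp-copy.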

deemeetree · May 11, 2020, 11:40pm

I tried it from another machine, locally, and it gets further but then gives this error:

2020-05-11 19:20:23.480+0000 INFO [o.n.c.s.StoreCopyClient] Copying index/lucene/relationship/TO/_qmb8.si
2020-05-11 19:20:23.482+0000 INFO [o.n.c.s.StoreCopyClient] Copied index/lucene/relationship/TO/_qmb8.si 427.00 B
2020-05-11 19:20:23.482+0000 INFO [o.n.c.s.StoreCopyClient] Copying neostore
2020-05-11 19:20:23.483+0000 INFO [o.n.c.s.StoreCopyClient] Copied neostore 16.00 kB
2020-05-11 19:20:23.483+0000 INFO [o.n.c.s.StoreCopyClient] Done, copied 637 files
2020-05-11 19:20:23.590+0000 INFO [o.n.b.BackupService] Start receiving transactions from 3794881
2020-05-11 19:20:32.540+0000 INFO [o.n.b.BackupService] Finish receiving transactions at 3794881
2020-05-11 19:20:32.572+0000 INFO [o.n.b.BackupService] Start recovering store
command failed: Backup failed: Error starting org.neo4j.com.storecopy.ExternallyManagedPageCache$GraphDatabaseFactoryWithPageCacheFactory$1, /Volumes/Extreme SSD/Backup/main-graph.db-backup/temp-copy

jggomez · May 12, 2020, 12:16am

Hi, you could try this program locally. I developed the utility.

GitHub - jggomez/neo4j-backup: This repository contains a program that allows you to create backups for neo4j.

I hope it helps.

deemeetree · May 12, 2020, 10:43am

Yes, thank you, @jggomez, I saw this, but I don't want to run it locally. Also, your script uses the same neo4j-admin backup command internally, and that is what's not working for me.

I also want to get a conclusive answer from Neo4J engineers: does the backup, which is a feature (apart from clustering) setting the Enterprise version apart from Community, actually work? Or only in some cases and sometimes? And are the 3 pages of documentation that exist on it all there is to understand how it works?

I'm not new to Neo4J, but the 5 days I've spent trying to make this simple task of online backup work is the longest I've been stuck with this technology so far, and my experience is that it's super buggy and unreliable, with not enough options, and insufficiently documented too.

deemeetree · May 12, 2020, 3:33pm

Now, even when the process does run to completion (1 in 10 times), at this stage:

2020-05-12 15:25:44.046+0000 INFO [o.n.c.s.StoreCopyClient] Copying neostore
2020-05-12 15:25:44.048+0000 INFO [o.n.c.s.StoreCopyClient] Copied neostore 16.00 kB
2020-05-12 15:25:44.049+0000 INFO [o.n.c.s.StoreCopyClient] Done, copied 642 files
2020-05-12 15:25:44.202+0000 INFO [o.n.b.BackupService] Start receiving transactions from 3796108
2020-05-12 15:25:46.884+0000 INFO [o.n.b.BackupService] Finish receiving transactions at 3796108
2020-05-12 15:25:46.920+0000 INFO [o.n.b.BackupService] Start recovering store

I get this error after:

command failed: Backup failed: Error starting org.neo4j.com.storecopy.ExternallyManagedPageCache$GraphDatabaseFactoryWithPageCacheFactory$1

I saw a post about it on Backup Error · Issue #11992 · neo4j/neo4j · GitHub and changed the max open files on my system, but that didn't help either.

The last records in the log on the server I am backing up are:

2020-05-12 15:25:42.811+0000 INFO [o.n.k.i.s.c.CountsTracker] About to rotate counts store at transaction 3796320 to [/home/neo4j-enterprise-3.3.3/data/databases/graph.db/neostore.counts.db.b], from [/home/neo4j-enterprise-3.3.3/data/databases/graph.db/neostore.counts.db.a].
2020-05-12 15:25:42.819+0000 INFO [o.n.k.i.s.c.CountsTracker] Successfully rotated counts store at transaction 3796320 to [/home/neo4j-enterprise-3.3.3/data/databases/graph.db/neostore.counts.db.b], from [/home/neo4j-enterprise-3.3.3/data/databases/graph.db/neostore.counts.db.a].
2020-05-12 15:25:47.723+0000 INFO [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by scheduler for time threshold [3796320]:  Store flush completed
2020-05-12 15:25:47.723+0000 INFO [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by scheduler for time threshold [3796320]:  Starting appending check point entry into the tx log...
2020-05-12 15:25:47.724+0000 INFO [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by scheduler for time threshold [3796320]:  Appending check point entry into the tx log completed
2020-05-12 15:25:47.725+0000 INFO [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by scheduler for time threshold [3796320]:  Check pointing completed
2020-05-12 15:25:47.725+0000 INFO [o.n.k.i.t.l.p.LogPruningImpl] Log Rotation [898]:  Starting log pruning.
2020-05-12 15:25:47.728+0000 INFO [o.n.k.i.t.l.p.LogPruningImpl] Log Rotation [898]:  Log pruning complete.
2020-05-12 15:30:25.788+0000 WARN [o.n.k.i.c.MonitorGc] GC Monitor: Application threads blocked for 204ms.

The local backup log (where I'm backing up to) doesn't have any errors.

leo.szumel · May 12, 2020, 6:59pm

@deemeetree it might be worth double-checking the file limit increase is in effect. IIRC we edited the neo4j-admin script to print ulimit -a. Also note that 65535 has not always been sufficient for us. I would try a much larger value to rule that out as a root cause.
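
A quick way to see which open-file limit is actually in effect is to print it from the same shell, and as the same user, that launches neo4j-admin. A minimal sketch follows; nofile_limit_ok is a hypothetical helper, and the threshold in the usage comment is only an example value, not a recommendation from Neo4j.

```shell
# Hypothetical helper (not part of Neo4j): succeed only if the soft
# open-file limit of the current shell is at least the requested minimum.
nofile_limit_ok() {
  local min="$1"
  local current
  current="$(ulimit -Sn)"
  # "unlimited" always satisfies the requirement
  if [ "$current" = "unlimited" ]; then
    return 0
  fi
  [ "$current" -ge "$min" ]
}

# Limits that a command started from this shell would inherit
echo "soft open-file limit: $(ulimit -Sn)"
echo "hard open-file limit: $(ulimit -Hn)"

# Usage (example threshold only):
# nofile_limit_ok 65535 || echo "open-file limit is lower than expected" >&2
```

Limits are per process and are inherited from the launching shell, so a limit raised in one session is not necessarily the one a command sees when started some other way.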

deemeetree · May 15, 2020, 12:14pm

Somebody from Neo4J — could you please respond and advise?

elaine_rosenber · May 15, 2020, 12:54pm

Can you set debug level for logging and then provide the log file(s)?

dbms.logs.debug.level=DEBUG

And set the env variable NEO4J_DEBUG to true

If you don't want to provide the log file(s) here, you can also send them to the Intercom ticket you opened for this case.

Elaine

deemeetree · May 16, 2020, 12:02pm

Hi Elaine,

Thank you for responding. I'm actually communicating with you through DM on Twitter but I guess you are receiving it through Intercom, right?

Could you please tell me if I need to set this up as you advised above on the database I am backing up or on the remote system I'm using to do the backup? Or on both?

Thanks

elaine_rosenber · May 18, 2020, 12:03pm

To be sure, you are using the same version of Neo4j on the system you are backing up and on the system you are running neo4j-admin backup from, correct?

Since you are saying that it appears to do the backup and then the files disappear, I would say that what you need to debug is the system you are executing the backup command from. The server doesn't seem to be the problem.

That being said, you do not have a log file or debug log file on the system you are running neo4j-admin backup from, so perhaps setting the env variable will help you. Changing anything in neo4j.conf will not, as you do not use a local Neo4j instance to perform the backup.

Does the server that you want to back up need to be online 24x7? Another option you could try is to shut down the server that you want to back up and try neo4j-admin dump to at least get a dump file for the database.

Elaine

deemeetree · May 19, 2020, 10:53am

Hello Elaine,

The whole point of me switching to the enterprise version was to be able to do online backups. So I want to be able to do those.

Regarding the backup files — I already provided all the data from all the sources (both remote and local) above.

It looks like the backup feature in Neo4J Enterprise works really badly and is super unstable.

I guess I should just switch back to Community and do offline backups as before, right?

elaine_rosenber · May 19, 2020, 12:27pm

For now, can you back up on the same system as the server just to make sure that the backup works locally? Then copy the backup files to a different system. This will at least enable you to back up your database without any interruption of service.

Elaine

deemeetree · May 25, 2020, 1:48pm

I can try to do it on the same system, but do you know if it's going to slow down my app / database drastically compared to a remote backup? And how can I ensure that doesn't happen? Thanks!

deemeetree · May 25, 2020, 2:37pm

Also — is it possible to do an offline backup, then copy it to a remote location, and then do online incremental backups on that offline backup?

deemeetree · May 25, 2020, 2:38pm

Also, I'm doing it locally and it just crashes my database with this message:

command failed: Backup failed: Unexpected Exception

elaine_rosenber · May 26, 2020, 12:50pm

Can you send the log file for the time period where the local backup failed?

Elaine
