Running Neo4j Community on AWS

aws
ec2
knowledge-base
(M. David Allen) #1

I wanted to share an article I wrote recently for community members about how to use AMIs we provide on Amazon.

Get up and running quickly with the Neo4j VM Image on AWS

I'd love to get people's feedback on this. We're actively working on improving the documentation and writing technical articles, to make it easy to use Neo4j anywhere in the cloud.

(A Bramson) #2

Let's say I've built a graph database on my local machine. And let's say I've managed to follow the instructions and create a Neo4j Community instance on AWS (but don't otherwise know anything about how to use AWS). How would/could I load the database from my local machine onto my AWS instance? I managed to figure out from Stack Exchange how to create a dump of my database, but I don't know what to do from there.

Also, the easiest way to run code to build the network seems to be using AWS's Cloud9 online IDE. So instructions for using that to connect to the graph database (e.g., using Python) would be very helpful.

Thanks.

(M. David Allen) #3

@a_bramson for Community edition, some of the standard backup/restore tools don't apply (those are in Enterprise), so generally to make a copy of a graph database we do this:

  • Shut the database down
  • Create a copy of the entire /var/lib/neo4j/data directory
  • Zip that, move it to the target machine where you want to restore
  • Unzip
  • Restart database.
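
The steps above can be sketched as two small shell helpers. This is a minimal sketch, not an official procedure: the function names `archive_data_dir` and `restore_data_dir` are made up for illustration, `tar` stands in for zip, and the `systemctl`, `scp` and `chown` lines in the comments assume an Ubuntu-style install with a `neo4j` service user on both machines.

```shell
# Hypothetical helpers (names are illustrative, not part of Neo4j).

# Pack a data directory into a gzipped tarball.
# $1 = data directory (e.g. /var/lib/neo4j/data), $2 = archive to create
archive_data_dir() {
    tar -czf "$2" -C "$(dirname "$1")" "$(basename "$1")"
}

# Unpack the tarball under a parent directory.
# $1 = archive, $2 = parent directory that will contain data/
restore_data_dir() {
    tar -xzf "$1" -C "$2"
}

# Typical sequence (run by hand from a root shell; paths are assumptions):
#   source:  systemctl stop neo4j
#            archive_data_dir /var/lib/neo4j/data /tmp/neo4j-data.tar.gz
#   move:    scp /tmp/neo4j-data.tar.gz <user>@<target-host>:/tmp/
#   target:  systemctl stop neo4j
#            restore_data_dir /tmp/neo4j-data.tar.gz /var/lib/neo4j
#            chown -R neo4j:neo4j /var/lib/neo4j/data
#            systemctl start neo4j
```

Restoring over an existing data directory replaces what is there, so it is worth keeping a copy of the target's original directory first.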
(Ben Butler Cole) #4

@a_bramson You should be able to do neo4j-admin dump to avoid fiddling around on the filesystem directly. This produces a single-file dump which can be loaded elsewhere with neo4j-admin load.

(A Bramson) #5

@ben.butler-cole Yes, this is what I have been doing to create and restore backups locally. So the remaining question is: "Where do I input that load command?" I've not used AWS yet, but my understanding is that there won't be an interface like Neo4j Desktop (which raises another question about accessing something like the Neo4j Browser for a database running on AWS), so I don't know where the appropriate terminal will be. It was actually quite difficult to get useful information on this dump-load process for the desktop version, and I haven't found any info for doing this on AWS.

(M. David Allen) #6

@a_bramson you would input that command in an SSH terminal session.

Given that you've started an AMI on AWS, at the time of your AMI creation you needed to choose an SSH key to communicate with your instance. Follow AWS directions to SSH to that instance, and then you can run sudo neo4j-admin dump along the lines that Ben recommends.

(Ben Butler Cole) #7

I expect that neo4j-admin is in your path on the EC2 instance. To run neo4j-admin load, Neo4j must be stopped and you must run the command as the neo4j user.
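
Putting the last few replies together, the transfer and load could look roughly like this. It is a sketch under several assumptions: Neo4j 3.x `neo4j-admin` syntax, a dump file already created locally with `neo4j-admin dump`, the default 3.x database name `graph.db`, and `/tmp/` as the upload location. The helper names `run`, `copy_dump` and `load_dump` are hypothetical. By default the helpers only print the commands; set `DRY_RUN=0` to actually execute them.

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# On the local machine: copy an existing dump file to the instance.
# $1 = SSH key (.pem) chosen at launch, $2 = dump file, $3 = user@host
copy_dump() {
    run scp -i "$1" "$2" "$3:/tmp/"
}

# On the EC2 instance: stop Neo4j, load the dump as the neo4j user, restart.
# $1 = dump file, $2 = database name (assumed default: graph.db)
# --force overwrites an existing database of the same name.
load_dump() {
    run sudo systemctl stop neo4j
    run sudo -u neo4j neo4j-admin load --from="$1" --database="${2:-graph.db}" --force
    run sudo systemctl start neo4j
}
```

For example, `copy_dump my-key.pem graph.dump ubuntu@<public-dns>` on the local machine, then `load_dump /tmp/graph.dump` in the SSH session on the instance. Reviewing the printed commands before switching off the dry run is the point of the `run` wrapper.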

(Ninth Mind) #8

I followed your instructions, but I still can't connect to my instance. I double-checked that the security group was correct (ports 7473, 7474, 7687, 22), I tried all sorts of usernames (neo4j, ubuntu, root), I used the Public DNS and the Public IPv4, I ran chmod 400 on my .pem file, I tried deploying another instance, and in no circumstance am I able to connect!

Any ideas?

(Ninth Mind) #9

Found my problem! The VPC subnet route table was configured wrong. These docs helped me, specifically the part that said "Check the route table for the subnet.":

Troubleshooting Connecting to Instance

(Christine N Buckler) #10

@david.allen I followed your tutorial but I am not able to connect to the remote service. I was able to find that neo4j failed to start. My only feedback/error message is below. Have you seen this or know what this might be related to? I can't find anyone else with this issue.

$ systemctl status neo4j
● neo4j.service - Neo4j Graph Database
   Loaded: loaded (/etc/systemd/system/neo4j.service; enabled; vendor preset: enabled)
   Active: failed (Result: start-limit-hit) since Tue 2019-03-12 18:01:38 UTC; 34min ago
  Process: 1782 ExecStart=/etc/neo4j/pre-neo4j.sh (code=exited, status=1/FAILURE)
 Main PID: 1782 (code=exited, status=1/FAILURE)

Mar 12 18:01:38 ip-172-18-9-201 systemd[1]: neo4j.service: Main process exited, code=exited, status=1/FAILURE
Mar 12 18:01:38 ip-172-18-9-201 systemd[1]: neo4j.service: Unit entered failed state.
Mar 12 18:01:38 ip-172-18-9-201 systemd[1]: neo4j.service: Failed with result 'exit-code'.
Mar 12 18:01:38 ip-172-18-9-201 systemd[1]: neo4j.service: Service hold-off time over, scheduling restart.
Mar 12 18:01:38 ip-172-18-9-201 systemd[1]: Stopped Neo4j Graph Database.
Mar 12 18:01:38 ip-172-18-9-201 systemd[1]: neo4j.service: Start request repeated too quickly.
Mar 12 18:01:38 ip-172-18-9-201 systemd[1]: Failed to start Neo4j Graph Database.
Mar 12 18:01:38 ip-172-18-9-201 systemd[1]: neo4j.service: Unit entered failed state.
Mar 12 18:01:38 ip-172-18-9-201 systemd[1]: neo4j.service: Failed with result 'start-limit-hit'.
(M. David Allen) #11

Try posting the results of these commands -- the issue here is that I can see your service is failing, but not why. We need to have a look at the logs to further debug.

journalctl -u neo4j -b > neo4.log will get the service logs.

And you should find a file called debug.log in /var/log/neo4j which may also contain useful clues. In those logs, you're looking for some exception thrown on startup, which will tell us more.
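
To narrow those logs down to the interesting lines, something like the following can help. It is only a sketch: `find_startup_errors` is a hypothetical helper name, and the search pattern (`ERROR`, `Exception`, `Caused by`) is a guess at what a startup failure usually looks like, not an exhaustive list.

```shell
# Print lines that look like startup failures, prefixed with line numbers.
# With a file argument it reads that file; with no argument it reads stdin.
find_startup_errors() {
    grep -n -E 'ERROR|Exception|Caused by' "$@"
}
```

Usage would be `find_startup_errors /var/log/neo4j/debug.log`, or `journalctl -u neo4j -b | find_startup_errors` for the service logs.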

(Christine N Buckler) #12

I found a few lines in the log that might be relevant... the sh script was created by the AMI.

Mar 12 18:01:33 ip-172-18-9-201 pre-neo4j.sh[1506]: pre-neo4j.sh: Starting neo4j console...
Mar 12 18:01:33 ip-172-18-9-201 pre-neo4j.sh[1506]: /usr/share/neo4j/bin/neo4j: line 185: export: `<html': not a valid identifier
Mar 12 18:01:33 ip-172-18-9-201 systemd[1]: neo4j.service: Main process exited, code=exited, status=1/FAILURE
Mar 12 18:01:33 ip-172-18-9-201 systemd[1]: neo4j.service: Unit entered failed state.
Mar 12 18:01:33 ip-172-18-9-201 systemd[1]: neo4j.service: Failed with result 'exit-code'.
(M. David Allen) #13

Christine - OK, I think I know what's happening from just that little tidbit. Are you running the AMI inside of a VPC with no external addresses? The AMI tries to detect its external address and fails to find one if you are running in a network location where there are none. This is a recently reported bug in our AMIs that we're working on fixing; it only affects customers running the AMI where there are no external addresses (i.e. locked up in VPCs).

Edit the file /etc/neo4j/pre-neo4j.sh and look for lines that look like this:

export INTERNAL_IP_ADDR=$(curl --silent $API/meta-data/network/interfaces/macs/$MAC_ADDR/local-ipv4s)
export EXTERNAL_IP_ADDR=$(curl -f --silent $API/meta-data/network/interfaces/macs/$MAC_ADDR/public-ipv4s)

Change them to this:

export INTERNAL_IP_ADDR=$(curl --silent $API/meta-data/network/interfaces/macs/$MAC_ADDR/local-ipv4s)
export EXTERNAL_IP_ADDR=$INTERNAL_IP_ADDR

And then systemctl restart neo4j and you should be good.
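
The same edit can be applied without opening an editor. This is a minimal sketch, assuming the `EXTERNAL_IP_ADDR` line in your copy of the script looks like the one quoted above; `patch_external_ip` is a hypothetical helper name, not something shipped with the AMI. It keeps a `.bak` copy of the original next to the script.

```shell
# Hypothetical helper: point EXTERNAL_IP_ADDR at INTERNAL_IP_ADDR.
# $1 = path to the script (on the AMI: /etc/neo4j/pre-neo4j.sh)
patch_external_ip() {
    # Keep a backup so the change is easy to undo.
    cp "$1" "$1.bak" || return 1
    # The sed program is single-quoted, so $INTERNAL_IP_ADDR is written
    # literally into the file rather than expanded by this shell.
    sed 's|^\([[:space:]]*\)export EXTERNAL_IP_ADDR=.*|\1export EXTERNAL_IP_ADDR=$INTERNAL_IP_ADDR|' \
        "$1.bak" > "$1"
}
```

The script is owned by root, so this would be run from a root shell: `patch_external_ip /etc/neo4j/pre-neo4j.sh`, followed by `systemctl restart neo4j`.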

(Christine N Buckler) #14

I found that file and made the changes; however, I do not have permissions to edit this file...

Error writing pre-neo4j.sh: Permission denied

(M. David Allen) #15

If you control the VM, you do have permissions. If you ssh in as the ubuntu user, just use sudo in front of your editor command and you'll edit with permissions of root.

(Christine N Buckler) #16

Awesome, I am so close! I was able to get the service running and open the login page in the browser. I am trying to use the instance ID as the password, which isn't working. Any other passwords to try?

(M. David Allen) #17

Default password is usually 'neo4j' but under some launch configurations it can be the instanceID. Try 'neo4j', if it's not that, it's the instance ID.

(Christine N Buckler) #18

Ah yes, the tutorial says it's the instance ID but it is just 'neo4j'. Thank you very much for your help!

(David Gordon) #19

When running AMI ami-0118d82e9da26d491 (Neo4j 3.5.3 Community, us-east-1), the following errors occur in pre-neo4j.sh.

Apr 7 03:43:58 ip-10-0-0-23 pre-neo4j.sh[1517]: /etc/neo4j/pre-neo4j.sh: line 18: [: missing `]'
[...]
Apr 7 03:43:59 ip-10-0-0-23 pre-neo4j.sh[1517]: /etc/neo4j/pre-neo4j.sh: line 45: export: `aws:cloudformation:logical_id=Neo4jServer': not a valid identifier

The first is corrected on line 18 by adding the appropriate brackets:
if [ $? -ne 0 ] || [ "$EXTERNAL_IP_ADDR" == "" ] ; then

The second is corrected on line 45 by replacing all ':' with '_':
key=$(echo $key | /usr/bin/tr ':' '_' | /usr/bin/tr '-' '_' | /usr/bin/tr '[:upper:]' '[:lower:]')
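
The key clean-up on line 45 can be tried in isolation to see why it matters: a shell identifier may not contain ':' or '-', which is what the "not a valid identifier" message is complaining about. The sketch below assumes the intended replacement character is an underscore; `sanitize_key` is a hypothetical wrapper name, not part of pre-neo4j.sh.

```shell
# Turn a tag key into a valid shell identifier: ':' and '-' become '_',
# and upper case becomes lower case.
sanitize_key() {
    printf '%s\n' "$1" | tr ':' '_' | tr '-' '_' | tr '[:upper:]' '[:lower:]'
}

sanitize_key 'aws:cloudformation:logical-id'   # prints aws_cloudformation_logical_id
```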

Regards,

(Axtonpitt) #20

Hi there,

I'm having trouble restarting neo4j on the Neo4j Community AWS AMI after importing CSV files with neo4j-admin. I used the following command: sudo neo4j-admin import --nodes [insert file name here]. However, when I restart neo4j using sudo systemctl start neo4j, it fails to start (and gets stuck in a loop) because it is unable to obtain a store lock.

pre-neo4j.sh[18486]: 2019-05-16 01:01:12.446+0000 ERROR Failed to start Neo4j: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@6cce16f4' was successfully initialized, but failed to start. Please see the attached cause exception "Unable to obtain lock on store lock file: /var/lib/neo4j/data/databases/store_lock. Please ensure no other process is using this database, and that the directory is writable (required even for read-only access)". Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@6cce16f4' was successfully initialized, but failed to start. Please see the attached cause exception "Unable to obtain lock on store lock file: /var/lib/neo4j/data/databases/store_lock. Please ensure no other process is using this database, and that the directory is writable (required even for read-only access)".
pre-neo4j.sh[18486]: org.neo4j.server.ServerStartupException: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@6cce16f4' was successfully initialized, but failed to start. Please see the attached cause exception "Unable to obtain lock on store lock file: /var/lib/neo4j/data/databases/store_lock. Please ensure no other process is using this database, and that the directory is writable (required even for read-only access)".

Should I be running neo4j-admin with a different user? I was unable to use any other user, however, as neo4j and ubuntu do not have the necessary access. I was also unsure about other processes using neo4j, as the ubuntu user always seems to have a process using "org.neo4j.server" that reboots even when killed, e.g.:

$ ps aux | grep "org.neo4j.server"
ubuntu   23262  0.0  0.0  12944  1056 pts/0    S+   01:16   0:00 grep --color=auto org.neo4j.server
$ sudo kill -9 23262
$ ps aux | grep "org.neo4j.server"
ubuntu   23272  0.0  0.0  12944   932 pts/0    S+   01:19   0:00 grep --color=auto org.neo4j.server

Any help is appreciated, bit stuck here.

Kind regards
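
A note on the `ps` output above: the only line listed each time is the `grep` command itself, which has already exited by the time `kill` runs, so it does not show a Neo4j process owned by `ubuntu`. The sketch below filters `ps` output so that `grep` does not match itself. It also lists files not owned by a given user, because one possibility worth checking (an assumption, not something confirmed in this thread) is that running the import with `sudo` left files under the data directory owned by root and not writable by the `neo4j` service user. Both function names are hypothetical.

```shell
# Read `ps aux` output on stdin and keep only real Neo4j server processes.
# The [o] bracket keeps the pattern from matching grep's own command line.
neo4j_processes() {
    grep '[o]rg.neo4j.server'
}

# List files under a directory that are NOT owned by the given user.
# $1 = directory (e.g. /var/lib/neo4j/data), $2 = user name or uid
not_owned_by() {
    find "$1" ! -user "$2"
}
```

For example, `ps aux | neo4j_processes` prints nothing when no server process is running, and `sudo sh -c 'find /var/lib/neo4j/data ! -user neo4j'` is the equivalent of `not_owned_by /var/lib/neo4j/data neo4j` run with root privileges. If that lists files, `sudo chown -R neo4j:neo4j /var/lib/neo4j/data` would hand them back to the service user.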