This summer, Neo4j launched the ability to create graph clusters on managed Google Kubernetes Engine (GKE) instances. Check it out!
I'm also adding this as a thread on the community forum so that others can find it later. Feel free to post follow-up questions on this topic here, or in the #neo4j-graph-platform:cloud topic.
Hi @david_allen I've gone through this setup and your posts a few times but I'm missing a key detail... how do I connect to the cluster from my code? Normally I give py2neo a bolt or http address and a db username/password. I can't figure out what the address should be in Kubernetes.
@brendan please check the limitations section here:
The cluster is exposed primarily inside of the Kubernetes network, not outside. You can set up additional Kubernetes resources such as NodePorts to expose individual pods if you like. You might also check the SSH port-forwarding section in the docs linked above to forward bolt.
Bolt+routing from outside of the Kubernetes cluster is a bit problematic at the moment because of the way Kubernetes networking works. We weren't able to provide default templates for this because much depends on your local network settings, but those docs should help.
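As a sketch of the NodePort approach, a Service manifest along these lines could expose bolt (7687) for a single core pod. All names here are assumptions you would adjust to your own deployment; the per-pod selector relies on the `statefulset.kubernetes.io/pod-name` label that Kubernetes sets automatically on StatefulSet pods:

```yaml
# Hypothetical NodePort Service exposing bolt for one core pod.
apiVersion: v1
kind: Service
metadata:
  name: neo4j-core-0-bolt            # hypothetical name
spec:
  type: NodePort
  selector:
    # Assumes your release's first core pod; substitute your own pod name.
    statefulset.kubernetes.io/pod-name: my-release-neo4j-core-0
  ports:
    - name: bolt
      port: 7687
      targetPort: 7687
      nodePort: 30687                # any free port in the node port range
```

You would need one such Service per pod you want reachable from outside, which is part of why this isn't templated by default.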
We are trying to use Neo4j on GCP by deploying the causal cluster. We have a set of microservices deployed in another GKE cluster that would like to connect to the Neo4j instance that sits in its own cluster.
Since this was posted in November last year, I would like to know if there has been any update to make it easier to connect to the Neo4j cluster.
Would it make sense to expose a cluster IP that manages the set of Neo4j nodes and use that IP from the microservices cluster to connect to Neo4j?
We would like to avoid dealing with individual pods in Neo4j for the sake of connection.
Would really appreciate any help in this regard.
There is some more information and suggested solutions around the limitations you can find here:
We don't set this up for users directly though, because it depends on too many configuration aspects of the GKE cluster that we can't anticipate ahead of time in the packaging (like how you do DNS management).
From the outside you could create a DNS name that has multiple A records to point to all of the other cluster members, and then use bolt+routing to that. There are frankly a lot of ways to do it -- but it's for each organization to choose how they want to do this given their security posture and other configuration bits.
A core challenge here is that Neo4j uses a smart client-based routing approach (bolt+routing), while Kubernetes really wants to treat all pods as indistinguishable from one another and front them with load balancers, and these two approaches do not match well. It's a common situation for other databases in Kubernetes as well whose architectures differentiate between cluster member roles.
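To illustrate why the two models clash, here is a purely illustrative toy (not the real driver, and all addresses are made up) of what a bolt+routing client does with the role information it fetches from the cluster, versus a load balancer that treats every pod the same:

```python
import random

# Toy routing table, shaped like what a bolt+routing driver learns
# from the cluster's routing procedure. Addresses are invented.
ROUTING_TABLE = {
    "LEADER":   ["core-0.neo4j.default.svc.cluster.local:7687"],
    "FOLLOWER": ["core-1.neo4j.default.svc.cluster.local:7687",
                 "core-2.neo4j.default.svc.cluster.local:7687"],
}

ALL_MEMBERS = ROUTING_TABLE["LEADER"] + ROUTING_TABLE["FOLLOWER"]

def route(query_is_write: bool) -> str:
    """A routing driver sends writes to the leader, reads to any member."""
    if query_is_write:
        return ROUTING_TABLE["LEADER"][0]
    return random.choice(ALL_MEMBERS)

def load_balancer() -> str:
    """A plain Kubernetes load balancer picks any pod, so a write can
    land on a FOLLOWER, which will reject it."""
    return random.choice(ALL_MEMBERS)

write_target = route(query_is_write=True)  # always the leader
```

A load-balanced Service hides exactly the role information the driver needs, which is why a plain ClusterIP in front of all three pods produces intermittent "Writes must pass through the leader" errors.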
Thank you for your detailed response earlier.
For our dev environment on a GKE cluster in GCP, we have installed Neo4j Causal Cluster in the same cluster as our app microservices.
When trying to insert some seed data in the database, we connected to the cluster through kubectl as follows:
```shell
kubectl run -it --rm cypher-shell \
  --image=gcr.io/cloud-marketplace/neo4j-public/causal-cluster-k8s:3.5 \
  --restart=Never --namespace=default \
  --command -- ./bin/cypher-shell -u neo4j -p "$NEO4J_PASSWORD" \
  -a $APP_INSTANCE_NAME-neo4j.default.svc.cluster.local
```
Following is what we get after the connection succeeds:
Connected to Neo4j 3.5.1 at bolt://causal-cluster-k8s-1-neo4j.default.svc.cluster.local:7687 as user neo4j.
But when trying to insert some dummy data, we keep getting the following:
No write operations are allowed directly on this database. Writes must pass through the leader. The role of this server is: FOLLOWER
Not sure what we are doing wrong here; it does seem like we are using bolt to connect in cypher-shell.
We have three nodes in the dev environment.
Upon running dbms.cluster.overview(), I do see one of the nodes as the Leader and the other two as followers.
Could you please let me know if I am doing something wrong?
Thank you once again for being thorough in your explanation of DNS in the GKE cluster.
-a $APP_INSTANCE_NAME-neo4j.default.svc.cluster.local
You are connecting to a DNS service name with a default bolt driver. Essentially your client is looking up the first available node (which happens to be a FOLLOWER) and then connecting you to that. Your writes then fail.
Change it to this:
-a bolt+routing://$APP_INSTANCE_NAME-neo4j.default.svc.cluster.local
This will have your client use a "Routing Driver," which ensures that the application itself routes each query to the right machine.
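The same fix applies in application code: hand the driver a bolt+routing URI rather than a plain service address. A minimal sketch, assuming the in-cluster service DNS name from this thread (substitute your own release name and namespace); the driver call is left commented out because it needs a live cluster and credentials:

```python
def routing_uri(service_dns: str, port: int = 7687) -> str:
    """Build a bolt+routing URI for a cluster service DNS name."""
    return f"bolt+routing://{service_dns}:{port}"

# Hypothetical service name matching the deployment in this thread.
uri = routing_uri("my-release-neo4j.default.svc.cluster.local")

# With the official Neo4j Python driver (the 1.7-era driver matching
# Neo4j 3.5), the connection would look roughly like this:
#
# from neo4j import GraphDatabase
# driver = GraphDatabase.driver(uri, auth=("neo4j", NEO4J_PASSWORD))
# with driver.session() as session:
#     session.run("CREATE (:Seed {loaded: true})")  # routed to the LEADER
```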