Websocket connection failed - possible certificate chain issue

I am trying to set up Neo4j Community Server 4.0 with an HTTPS front end and TLS/SSL-enabled Bolt connections, running on Red Hat Enterprise Linux 7. I think it was working fine with self-signed certs, but since I switched it to use certificates obtained from InCommon, I've had problems getting the browser-based client to talk to the database.

One possible issue is that these certificates require intermediate certificates before they reach a root authority. We tried several ways of copying the certification chain into the certificate file, but every attempt resulted in the server starting without listening on the HTTPS or the Bolt port. There don't seem to have been any updates to the testing code for this in 7 years. I'm not sure whether that means it Just Works (in which case I could use a pointer to some docs on how to make a suitable cert file) or whether the test is just not actually testing anything any more.
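For anyone checking a combined certificate file, here is a minimal Python sketch (standard library only) that splits a PEM file into its individual certificate blocks so they can be counted and inspected. The function name `split_pem_chain` is hypothetical. The usual TLS convention is leaf certificate first, followed by the intermediates; whether Neo4j 4.0 accepts such a bundle in the public certificate file is an assumption this thread does not confirm.

```python
import re
from typing import List

# Matches one PEM-encoded certificate block, including its BEGIN/END markers.
PEM_CERT = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----",
    re.DOTALL,
)


def split_pem_chain(pem_text: str) -> List[str]:
    """Return every certificate block found in pem_text, in file order."""
    return PEM_CERT.findall(pem_text)
```

Reading a file such as combined.crt and passing its text to `split_pem_chain` gives one list entry per certificate; a host certificate with its intermediates should yield more than one entry, with the host certificate first.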

When I use a normal host certificate, the server starts OK, listening on both the HTTPS and Bolt ports. I can point Chrome at it and get the JS-based client. But when I try to log into the client I get the "ServiceUnavailable: WebSocket connection failure. Due to security constraints..." error. When I look in the console I see "WebSocket opening handshake was canceled" coming from neo4j-driver.chunkhash.bundle.js. Most of the answers I've seen online say this points to SSL issues, but Chrome doesn't have a problem talking to the HTTPS server. Any ideas?

Thanks,
Eric

You're right that these symptoms usually indicate SSL issues, so this is the right direction to be looking in.

You should double check your config & post some details about what files you have located where in your SSL config, along with the config snippets and so forth. 4.0.0 didn't ship with a self-signed cert out of the box, so if you created one yourself, how you staged it on disk is important. You should also check your debug.log file to see if on startup there are any messages related to SSL support. It's possible for example that HTTPS may not start at all but that the service still comes alive. Can you load the browser UI at all on port 7473?

There's sooo much config and sooo much output that it's hard to know what to include. I'll start with the answer to your question: yes, I can bring up the login screen of the UI by pointing my browser at port 7473. So SSL is working for HTTPS. This may be because my browser has been configured to trust the root CA and the intermediate CAs.

I'll put the SSL bit of the config here. The thing that's a little wonky is that I'm trying to use the same certs for bolt as for https, which the doc discourages. I'm not really sure how I can get different ones.

dbms.directories.certificates=/var/lib/neo4j/certificates
dbms.connectors.default_listen_address=0.0.0.0
# Bolt connector
dbms.connector.bolt.enabled=true
dbms.connector.bolt.tls_level=OPTIONAL
dbms.connector.bolt.listen_address=:7687
# HTTPS Connector. There can be zero or one HTTPS connectors.
dbms.connector.https.enabled=true
#dbms.connector.https.listen_address=:7473

# Bolt SSL configuration
dbms.ssl.policy.bolt.enabled=true
dbms.ssl.policy.bolt.base_directory=certificates/https
dbms.ssl.policy.bolt.private_key=private.key
dbms.ssl.policy.bolt.public_certificate=public.crt

# Https SSL configuration
dbms.ssl.policy.https.enabled=true
dbms.ssl.policy.https.base_directory=certificates/https
dbms.ssl.policy.https.private_key=private.key
dbms.ssl.policy.https.public_certificate=public.crt

And to show that the cert files exist:

# ls -lH /var/lib/neo4j/certificates/https/*crt
-rw-r--r-- 1 root root  6617 Apr  3 17:02 /var/lib/neo4j/certificates/https/combined.crt
-rw-r--r-- 1 root neo4j 2531 Mar 26 15:15 /var/lib/neo4j/certificates/https/public.crt
# ls -lH /var/lib/neo4j/certificates/https/*key
-rw-r----- 1 root neo4j 1704 Mar 23 10:13 /var/lib/neo4j/certificates/https/private.key
# id neo4j
uid=992(neo4j) gid=989(neo4j) groups=989(neo4j)

OK...so the one thing I'm not seeing here is where .base_directory is relative to. I'm expecting it to be /var/lib/neo4j but there's nothing in the config that states that. And yet https works, so...But I'm going to test that (by adding the full path into the base_directory clauses). Yeah, that didn't change it.
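A small Python sketch of how a relative `base_directory` would resolve if relative paths are taken against the Neo4j home directory. That resolution rule and the `/var/lib/neo4j` default are assumptions here, and the helper name `resolve_setting_path` is hypothetical.

```python
from pathlib import PurePosixPath


def resolve_setting_path(value: str, home: str = "/var/lib/neo4j") -> str:
    """Resolve a config path; relative values are joined onto `home` (assumed rule)."""
    path = PurePosixPath(value)
    if not path.is_absolute():
        # Assumption: relative paths are interpreted relative to the Neo4j home.
        path = PurePosixPath(home) / path
    return str(path)
```

Under that assumption, `certificates/https` and `/var/lib/neo4j/certificates/https` name the same directory, which would explain why switching to the full path changed nothing.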

Is there other config you'd like to see?

As for debug.log, I'll put snippets that look interesting. Please let me know if I'm missing anything you'd want to see.

2020-04-06 20:15:37.575+0000 INFO [o.n.i.d.DiagnosticsManager] --------------------------------------------------------------------------------
2020-04-06 20:15:37.575+0000 INFO [o.n.i.d.DiagnosticsManager]                         [ Operating system information ]       
2020-04-06 20:15:37.575+0000 INFO [o.n.i.d.DiagnosticsManager] --------------------------------------------------------------------------------
2020-04-06 20:15:37.576+0000 INFO [o.n.i.d.DiagnosticsManager] Operating System: Linux; version: 3.10.0-1062.18.1.el7.x86_64; arch: amd64; cpus: 4
2020-04-06 20:15:37.580+0000 INFO [o.n.i.d.DiagnosticsManager] Max number of file descriptors: 60000
2020-04-06 20:15:37.581+0000 INFO [o.n.i.d.DiagnosticsManager] Number of open file descriptors: 152
2020-04-06 20:15:37.610+0000 INFO [o.n.i.d.DiagnosticsManager] --------------------------------------------------------------------------------
2020-04-06 20:15:37.610+0000 INFO [o.n.i.d.DiagnosticsManager]                               [ JVM information ]              
2020-04-06 20:15:37.610+0000 INFO [o.n.i.d.DiagnosticsManager] --------------------------------------------------------------------------------
2020-04-06 20:15:37.611+0000 INFO [o.n.i.d.DiagnosticsManager] VM Name: OpenJDK 64-Bit Server VM
2020-04-06 20:15:37.611+0000 INFO [o.n.i.d.DiagnosticsManager] VM Vendor: Oracle Corporation
2020-04-06 20:15:37.611+0000 INFO [o.n.i.d.DiagnosticsManager] VM Version: 11.0.6+10-LTS
2020-04-06 20:15:37.612+0000 INFO [o.n.i.d.DiagnosticsManager] JIT compiler: HotSpot 64-Bit Tiered Compilers
2020-04-06 20:15:37.612+0000 INFO [o.n.i.d.DiagnosticsManager] VM Arguments: [-Xms4500m, -Xmx4500m, -XX:+UseG1GC, -XX:-OmitStackTraceInFastThrow, -XX:+AlwaysPreTouch, -XX:+UnlockExperimentalVMOptions, -XX:+TrustFinalNonStaticFields, -XX:+DisableExplicitGC, -Djdk.nio.maxCachedBufferSize=262144, -Dio.netty.tryReflectionSetAccessible=true, -Djdk.tls.ephemeralDHKeySize=2048, -Djdk.tls.rejectClientInitiatedRenegotiation=true, -Dfile.encoding=UTF-8]

2020-04-06 20:15:37.749+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.ssl.policy.bolt.base_directory=/var/lib/neo4j/certificates/https
2020-04-06 20:15:37.750+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.ssl.policy.bolt.enabled=true
2020-04-06 20:15:37.750+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.ssl.policy.bolt.private_key=/var/lib/neo4j/certificates/https/private.key
2020-04-06 20:15:37.750+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.ssl.policy.bolt.public_certificate=/var/lib/neo4j/certificates/https/public.crt
2020-04-06 20:15:37.750+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.ssl.policy.https.base_directory=/var/lib/neo4j/certificates/https
2020-04-06 20:15:37.750+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.ssl.policy.https.enabled=true
2020-04-06 20:15:37.750+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.ssl.policy.https.private_key=/var/lib/neo4j/certificates/https/private.key
2020-04-06 20:15:37.750+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.ssl.policy.https.public_certificate=/var/lib/neo4j/certificates/https/public.crt
2020-04-06 20:15:37.750+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.tx_log.rotation.retention_policy=1 days
2020-04-06 20:15:37.750+0000 INFO [o.n.i.d.DiagnosticsManager] dbms.windows_service_name=neo4j
2020-04-06 20:15:37.750+0000 INFO [o.n.i.d.DiagnosticsManager]
2020-04-06 20:15:38.087+0000 INFO [o.n.s.c.SslPolicyLoader] Loaded SSL policy 'HTTPS' = SslPolicy{keyCertChain=Subject: CN=amaretto-hub.broadinstitute.org, OU=BITS, O=The Broad Institute of MIT and Harvard, STREET=415 Main St., L=Cambridge, ST=Massachusetts, OID.2.5.4.17=02142, C=US, Issuer: CN=InCommon RSA Server CA, OU=InCommon, O=Internet2, L=Ann Arbor, ST=MI, C=US, ciphers=null, tlsVersions=[TLSv1.2], clientAuth=OPTIONAL}
2020-04-06 20:15:38.088+0000 INFO [o.n.s.c.SslPolicyLoader] Loaded SSL policy 'BOLT' = SslPolicy{keyCertChain=Subject: CN=amaretto-hub.broadinstitute.org, OU=BITS, O=The Broad Institute of MIT and Harvard, STREET=415 Main St., L=Cambridge, ST=Massachusetts, OID.2.5.4.17=02142, C=US, Issuer: CN=InCommon RSA Server CA, OU=InCommon, O=Internet2, L=Ann Arbor, ST=MI, C=US, ciphers=null, tlsVersions=[TLSv1.2], clientAuth=OPTIONAL}
2020-04-06 20:15:38.093+0000 INFO [o.n.g.f.EditionLocksFactories] Locking implementation 'community' selected.

Hopefully I've provided enough information, and thank you for responding, David.

Anybody have any more thoughts on things I can try? This is super frustrating.

I thought it had worked before using a self-signed certificate, but I wasn't able to dig up enough of my old configuration to reproduce it exactly. And the only self-signed certs I could find on the host were for Subject CN=localhost, so I decided to generate a new certificate using the actual hostname.

The chrome browser complains as expected about an untrusted connection. When I try to log in from the JS client I now get the websocket connection "failed: Error in connection establishment: net::ERR_CONNECTION_CLOSED" which strikes me as rather worse than before.

debug.log doesn't show any differences beyond the modified path to the certs (as far as I can tell).

I am also trying to set up a server with Neo4j 4.0.3, and I'm in the same situation as Eric.
I used Let's Encrypt to generate an SSL certificate. That works well for HTTPS, but the connection over Bolt seems to be the problem.

Here is the error generated during the connection:

SessionExpired: WebSocket connection failure. Due to security constraints in your web browser, the reason for the failure is not available to this Neo4j Driver. Please use your browsers development console to determine the root cause of the failure. Common reasons include the database being unavailable, using the wrong connection URL or temporary network problems. If you have enabled encryption, ensure your browser is configured to trust the certificate Neo4j is configured to use. WebSocket `readyState` is: 3

and here is the error in the browser console:

WebSocket connection to 'wss://neo4j.mydomaine.com:7687/' failed: WebSocket opening handshake was canceled
r @ neo4j-driver.chunkhash.bundle.js:1

Any ideas?

Check your SSL settings. The way SSL works in Neo4j is that you create an SSL "policy" and then you associate it with the various connectors (bolt, https, etc.).

It sounds as though you got your HTTPS SSL set up, but maybe you haven't wired that SSL policy to your bolt connector.

https://neo4j.com/docs/operations-manual/current/security/ssl-framework/#ssl-settings

In particular, the missing piece may be:

dbms.ssl.policy.bolt.enabled=true

OR

dbms.ssl.policy.<scope>.<setting-suffix>
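One way to check which SSL policy scopes a config actually defines is to scan it for `dbms.ssl.policy.<scope>.<setting-suffix>` keys. The following is a minimal Python sketch, not an official tool; the function name `ssl_policies` is hypothetical, and it only looks at uncommented lines of the config text it is given.

```python
import re
from typing import Dict

# dbms.ssl.policy.<scope>.<setting-suffix>=<value>
SSL_KEY = re.compile(r"^dbms\.ssl\.policy\.([^.]+)\.([^=\s]+)\s*=\s*(.*)$")


def ssl_policies(conf_text: str) -> Dict[str, Dict[str, str]]:
    """Group SSL policy settings by scope, skipping blank and commented lines."""
    policies: Dict[str, Dict[str, str]] = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = SSL_KEY.match(line)
        if match:
            scope, suffix, value = match.groups()
            policies.setdefault(scope, {})[suffix] = value.strip()
    return policies
```

If the returned dictionary has an `https` entry but no `bolt` entry, or `bolt` lacks `enabled` set to `true`, the Bolt policy is the missing piece.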

I think I did it right for the connectors.

dbms.default_listen_address=0.0.0.0
dbms.default_advertised_address=neo4j.mydomaine.com
dbms.connector.http.advertised_address=0.0.0.0:7474
dbms.memory.pagecache.size=512M
dbms.tx_log.rotation.retention_policy=100M size
dbms.directories.logs=/logs
dbms.directories.certificates=/ssl

# HTTPS CONNECTOR
dbms.connector.https.enabled=true
dbms.connector.https.listen_address=0.0.0.0
dbms.connector.https.advertised_address=0.0.0.0:7473

# HTTPS SSL POLICY
dbms.ssl.policy.https.enabled=true
dbms.ssl.policy.https.base_directory=/ssl/https

# BOLT CONNECTOR
dbms.connector.bolt.enabled=true
dbms.connector.bolt.listen_address=0.0.0.0
dbms.connector.bolt.advertised_address=0.0.0.0:7687

# BOLT SSL POLICY
dbms.ssl.policy.bolt.enabled=true
dbms.ssl.policy.bolt.base_directory=/ssl/bolt

Please let me know if something is missing.

Mine is similar (as shown above) with the exception that I specified the names of the cert and key files explicitly. I have heard from my user that he was able to connect when we were running 3.5 so I may try to revert to that to see if I have more success.

Cross post -- make sure to see this other related thread, with details & config that may be relevant to this problem: Neo4j Enterprise 4.0 on GCP doesn't work out of the box

OMIGOSH! OMIGOSH! I think the secret was setting the client_auth to disabled, but now it works! I'm so excited!

It would have been nice if SOMETHING got thrown into the debug.log when client_auth failed...
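For what it's worth, the SslPolicyLoader lines in the debug.log excerpt earlier in this thread do record the effective client authentication mode of each policy at startup (both show `clientAuth=OPTIONAL`), even though nothing is logged when a handshake fails. A minimal Python sketch for pulling those values out of a log; the function name `client_auth_by_policy` is hypothetical and the pattern assumes the line format shown in that excerpt.

```python
import re
from typing import Dict

# Matches lines like:
#   ... Loaded SSL policy 'BOLT' = SslPolicy{..., clientAuth=OPTIONAL}
POLICY_LINE = re.compile(
    r"Loaded SSL policy '([^']+)' = SslPolicy\{.*clientAuth=(\w+)"
)


def client_auth_by_policy(log_text: str) -> Dict[str, str]:
    """Map each loaded SSL policy name to the clientAuth value it was logged with."""
    result: Dict[str, str] = {}
    for line in log_text.splitlines():
        match = POLICY_LINE.search(line)
        if match:
            result[match.group(1)] = match.group(2)
    return result
```

Running this over debug.log before and after changing the `client_auth` setting is a quick way to confirm the new value was actually picked up.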

Nice, it works for me!