Very slow experience with JDBC driver against dockerized neo4j

(Miguel Fiandor Gutierrez) #1

Hi, we are running some tests that import data through the neo4j-jdbc driver, with springframework.jdbc on top of it.
The schema is very basic: two node types (Document and NamedEntity) and one relationship type (MENTIONED_IN).

The test consists of creating 100 documents, each with 100 named entities and a relationship between every named entity and its document. It takes over 10 minutes on several different machines.

We ran Neo4j server 3.5 as a Docker container, with neo4j-jdbc driver 3.3.1. Other combinations, like driver 3.4.0 with lower server versions 3.3.9 and 3.38, and 3.4.12, threw exceptions.

Our connection string is:
jdbcTemplate = new JdbcTemplate(new DriverManagerDataSource("jdbc:neo4j:bolt://neo4j/?user=neo4j&password=XXXX&flatten=-1"));

We create two uniqueness constraints, one on the id of each node type:

jdbcTemplate.update("CREATE CONSTRAINT ON (doc:Document) ASSERT IS UNIQUE");
jdbcTemplate.update("CREATE CONSTRAINT ON (ne:NamedEntity) ASSERT IS UNIQUE");

For instance, the first 10 docs, with a total of 1000 NamedEntity nodes, take approximately 2 minutes.
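
As a rough back-of-the-envelope reading of these numbers, here is a small sketch. It assumes every node and every relationship is written as its own statement with its own round trip, which this post does not confirm; the class and method names are made up for illustration. It counts the writes in the 10-document sample, derives an average cost per write, and extrapolates the sample rate linearly to the full 100 documents.

```java
import java.util.Locale;

public class ImportTimingEstimate {

    // Figures quoted in the post.
    static final int ENTITIES_PER_DOC = 100;
    static final int SAMPLE_DOCS = 10;
    static final double SAMPLE_SECONDS = 120.0; // "approximately 2 minutes"
    static final int TOTAL_DOCS = 100;

    /** Writes for a batch of documents: one Document node per doc, plus one
     *  NamedEntity node and one MENTIONED_IN relationship per named entity. */
    static long writesFor(int docs) {
        return (long) docs * (1 + 2L * ENTITIES_PER_DOC);
    }

    /** Average milliseconds per write. ASSUMPTION: one statement and one
     *  round trip per write; the real test may group them differently. */
    static double millisPerWrite(int docs, double seconds) {
        return seconds * 1000.0 / writesFor(docs);
    }

    /** Linear extrapolation of the sample rate to a larger run, in minutes. */
    static double projectedMinutes(int totalDocs) {
        return SAMPLE_SECONDS / SAMPLE_DOCS * totalDocs / 60.0;
    }

    public static void main(String[] args) {
        System.out.printf(Locale.ROOT, "writes in sample: %d%n", writesFor(SAMPLE_DOCS));
        System.out.printf(Locale.ROOT, "avg ms per write: %.1f%n",
                millisPerWrite(SAMPLE_DOCS, SAMPLE_SECONDS));
        System.out.printf(Locale.ROOT, "projected minutes for %d docs: %.1f%n",
                TOTAL_DOCS, projectedMinutes(TOTAL_DOCS));
    }
}
```

Under that assumption the sample works out to tens of milliseconds per write, and the linear projection for the whole run is in line with the "over 10 minutes" observed for the full test.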

Each kind of node has only a few, very basic properties, so there is nothing big or unusual to worry about here.

I was wondering whether there is any known performance issue with the Docker container, or some kind of overhead in the client/driver communication with the dockerized server, that we should take into account.



(William Lyon) #2

Hey Miguel - could you share some code snippets of the queries and how you are executing them?

From what you've posted I'm not aware of any issues that should cause that type of performance, so seeing some code would help troubleshoot.


(Miguel Fiandor Gutierrez) #4

Hey Will, we tried out the native Java driver (not the JDBC one) and saw a 20x improvement. So most of the overhead seems to be in the JDBC driver in this kind of architecture, with the server running in a Docker container. I'll try to share more details on how it looks now.