Golang Driver | Neo4j Performance Issue

I have to believe I MUST be doing something wrong. I am a database expert by the way, having worked in engineering and consulting at database companies for 24 years, and I'm a language expert in over 13 languages. So I know what I am doing.

When I run a simple insert benchmark against an ODBMS, an RDBMS, or just about any other DBMS, using records with two columns (or properties), I consistently get between 10,000 and 30,000 TPS. With Neo4j I get an appalling 11 TPS. I was getting over 10,000 TPS 23 years ago with ObjectStore, so I cannot understand why I only get 11 TPS with Neo4j when I follow the documentation. Can someone point out what I am doing incorrectly?

I have to be doing something terribly wrong. This same hardware (64 GB RAM, 2 TB NVMe, 10 Gbps Eth) gets bleeding edge performance with any other database. I have assigned 16 GB to the database. And I am simply trying to insert a few dozen records, upwards of a few thousand. So this is a VERY SMALL database, no reason to be slow. And I am not setting up ANY relationships, only inserting nodes.

Can someone explain to me what I could possibly be doing wrong in the code below that would result in this sort of performance? I am literally copying the code out of the golang driver "benchmark" (for the most part).

	ctx := context.Background()

	// create an auth token...
	neo4jAuthToken := getNeoAuthToken(neo4jUsername, neo4jPassword, neo4jRealm)

	// create a driver...
	driver, err := neo4j.NewDriverWithContext(neo4jUri, neo4jAuthToken)
	if err != nil {
		log.Fatal(err)
	}
	defer driver.Close(ctx)

	if err := driver.VerifyConnectivity(ctx); err != nil {
		log.Fatalf("failed to verify connection: %s", err)
	}

	// create the session configuration...
	config := neo4j.SessionConfig{
		AccessMode: neo4j.AccessModeWrite,
	}

	count := 100
	start := time.Now()
	for i := 0; i < count; i++ {
		session := driver.NewSession(ctx, config)
		defer session.Close(ctx)

		name := fmt.Sprintf("jim_%d", i)
		_, err = session.Run(ctx, `CREATE (n:Account {name: $name, hash: $hash}) RETURN n`,
			map[string]interface{}{
				"name": name,
				"hash": hash(name),
			})
		if err != nil {
			log.Fatal(err)
		}
	}
	elapsed := time.Since(start).Milliseconds()
	rate := float64(count) * 1000 / float64(elapsed)
	fmt.Println("Time to load: " + strconv.Itoa(int(elapsed)))
	fmt.Println("Rate in load: " + fmt.Sprintf("%f", rate))


It would probably be faster to post this as an issue on the Go Bolt driver's GitHub repository here:

https://github.com/neo4j/neo4j-go-driver

Also, you might want to run a test that follows the example code in the README more closely, and see whether the timing matches yours or is faster.

We would want to know the version of Neo4j you are using, the version of the Go driver, whether this is enterprise or community, and whether this is a single instance or cluster.

Thanks to my co-worker Rouven, who made this into some actually reasonable Python code:

from datetime import datetime

from neo4j import GraphDatabase


uri = "neo4j://localhost:7687"
auth = ("username", "password")

print(datetime.now())

names = [f"Jim_{i}" for i in range(100000)]

print(datetime.now())

with GraphDatabase.driver(uri, auth=auth) as driver:
    with driver.session(database="neo4j") as session:
        session.run("UNWIND $names AS name CREATE (n:Account {name: name})",
                    names=names)

print(datetime.now())

Oh, and I forgot to add: I was running this on a little autonomous cluster I had set up for a blog post that I'm working on. The cluster had 3 primaries, so my TPS figures above also include the overhead of guaranteed writes to the other 2 primary nodes. A single instance would do even better.

@rbuck-som

You're seeing the overhead of back-and-forth with every transaction. I tested this on a pretty puny VM running on my laptop (both python and the DB are in the VM):

from neo4j import GraphDatabase
from datetime import datetime
class Neo4jConnection:
    
    def __init__(self, uri, user, pwd):
        self.__uri = uri
        self.__user = user
        self.__pwd = pwd
        self.__driver = None
        try:
            self.__driver = GraphDatabase.driver(self.__uri, auth=(self.__user, self.__pwd))
        except Exception as e:
            print("Failed to create the driver:", e)
        
    def close(self):
        if self.__driver is not None:
            self.__driver.close()
        
    def query(self, query, db=None):
        assert self.__driver is not None, "Driver not initialized!"
        session = None
        response = None
        try: 
            session = self.__driver.session(database=db) if db is not None else self.__driver.session() 
            response = list(session.run(query))
        except Exception as e:
            print("Query failed:", e)
        finally: 
            if session is not None:
                session.close()
        return response

    def driver(self):
        d = self.__driver
        return d

conn = Neo4jConnection(uri="redacted", user="redacted", pwd="redacted")

batch = []
print(datetime.now())

for i in range(100000):
    batch.append({"name": "Jim_"+str(i)})
print(datetime.now())


s = conn.driver().session()
s.run("unwind $batch as row create (n:Account {name: row['name']})", batch=batch)
s.close()

print(datetime.now())

Don't critique my Python please 🙂

Results are about 62k tps

Running this twice in a row: 100K nodes in less than a second. Repeated with 500k rows (in a single transaction; this can probably be optimized for even better performance):

2022-11-22 11:42:58.046505
2022-11-22 11:43:02.242657
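
For anyone who wants to check the arithmetic, here is a small sketch that turns those two timestamps into a rate. It assumes the two timestamps bracket the write of all 500k rows, which is how the script above prints them; the variable names are just for illustration.

```python
from datetime import datetime

# Timestamps copied from the 500k-row run above.
start = datetime.fromisoformat("2022-11-22 11:42:58.046505")
end = datetime.fromisoformat("2022-11-22 11:43:02.242657")

rows = 500_000  # row count stated for this run
elapsed = (end - start).total_seconds()
rate = rows / elapsed

print(f"{elapsed:.3f} s elapsed, {rate:,.0f} rows/s")
```

That comes out to a little over 4 seconds for the whole batch.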

>119k tps. Now do this with multithreading and proper batch sizing 🙂
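
As a rough idea of what "proper batch sizing" could look like, here is a minimal sketch that splits the rows into fixed-size chunks and sends each chunk as one UNWIND statement. The helper names (`chunked`, `load_in_batches`), the batch size of 10,000, and the `fake_run` stand-in are all assumptions made for illustration, not something from the driver; `run` is meant to be any callable with the same shape as `session.run(query, batch=...)` used earlier in the thread. Multithreading is left out on purpose.

```python
from typing import Callable, Iterator

# Same shape of query as the UNWIND examples in this thread.
QUERY = "UNWIND $batch AS row CREATE (n:Account {name: row.name})"


def chunked(rows: list, size: int) -> Iterator[list]:
    """Yield consecutive slices of `rows` with at most `size` items each."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def load_in_batches(run: Callable, rows: list, size: int = 10_000) -> int:
    """Call run(QUERY, batch=chunk) once per chunk; return the number of calls."""
    calls = 0
    for batch in chunked(rows, size):
        run(QUERY, batch=batch)
        calls += 1
    return calls


# Hypothetical stand-in for session.run so the sketch runs without a database.
sent = []


def fake_run(query, batch):
    sent.append(len(batch))


rows = [{"name": f"Jim_{i}"} for i in range(25_000)]
print(load_in_batches(fake_run, rows), sent)
```

With a real driver you would pass `session.run` (or a wrapper around it) instead of `fake_run`. The batch size is a guess; the right value depends on row size and available memory, so measure a few sizes before settling on one.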

Thanks. Unless I can find a reasonable benchmark or simple test illustrating acceptable transaction performance, I will have to drop Neo4j from consideration and look at Allegro | Arango | Neptune | Cambridge Semantics. I just ran the equivalent code using the Python client, and I get 40.15 TPS. While that is nearly 4x faster than the Go driver, it's nowhere near the 10,000 to 20,000 TPS that I should expect from such a simple example.

I decided I'm not going to respond to the thread. I will find a different database to work with.

fwiw ~1800 tps:

print(datetime.now())

with GraphDatabase.driver(uri, auth=auth) as driver:
    with driver.session(database="neo4j") as session:
        with session.begin_transaction() as tx:
            for i in range(10000):
                tx.run("CREATE (a:Account {name: $name})", name="Jim_"+str(i))
            
            tx.commit()
print(datetime.now())
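
To put the numbers from this thread side by side, here is a quick sketch that converts each reported rate into time spent per node. The rates are the ones quoted above; the labels are my own shorthand, and the runs were made on different machines and setups (a laptop VM, a 3-primary cluster, the original poster's server), so treat this as an illustration of the per-transaction overhead rather than a like-for-like benchmark.

```python
# Rates quoted in this thread, in nodes created per second.
reported = {
    "Go, one auto-commit transaction per CREATE": 11,
    "Python, equivalent per-statement code": 40.15,
    "Python, 10k CREATEs in one explicit transaction": 1800,
    "Python, single UNWIND (first result)": 62_000,
    "Python, single UNWIND over 500k rows": 119_000,
}


def ms_per_op(rate_per_second: float) -> float:
    """Average milliseconds spent per node at the given rate."""
    return 1000.0 / rate_per_second


for label, rate in reported.items():
    print(f"{label:50s} {ms_per_op(rate):10.4f} ms per node")
```

At 11 TPS each node costs roughly 91 ms, which looks like the price of a full commit round trip per statement rather than the cost of creating a node.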