Can I use session.BeginTransaction() for running multiple queries inside one transaction?

muzos07 · April 17, 2020, 6:29pm

Hello,

I'm trying to integrate the official Neo4j .NET driver into my C# project.
While studying the docs, I learned about Transaction functions (Write/ReadTransaction).

However, in my case I would like to run more than one CREATE query inside such a transaction (I know for sure that all of these queries would fit into a WriteTransaction), which the current WriteTransaction method doesn't seem to allow.

Then I discovered that ISession can give me a transaction via the BeginTransaction() method, but I cannot find any mention of this in the driver manual or on the driver repository page.

So my question is: what is the best practice for running multiple CREATE queries inside a transaction?

If I put it inside a using() block, would the commit, rollback, and retry mechanisms work correctly?
Basically, will it work exactly the same way as WriteTransaction is described in the docs?

using (ITransaction tx = session.BeginTransaction())
{
    // run first query here using tx.Run()

    // run second query here using tx.Run()

    // run third query here using tx.Run()
}

Or do I have to apply these mechanisms myself using the available methods tx.Rollback(), tx.Commit(), tx.Dispose()?

A little info about what I'm trying to accomplish:

A bunch of "NodeGraphs" will be loaded into memory inside the project (each NodeGraph object contains data about specific nodes, and each node has information about its relationships to others). I would like to iterate over all NodeGraphs and use one transaction per NodeGraph. That means that if one node from NodeGraph B fails, all already loaded nodes from B are rolled back, but the nodes from NodeGraph A are not.

During one transaction I would like to iterate over all nodes and load them, then iterate over them again and load their relationships. I'm certain that no relationship exists between a node from A and a node from B.

Thanks, Peter.

muzos07 · April 17, 2020, 6:43pm

I looked into this further with the help of this article.

Inside Visual Studio I see that WriteTransaction() just has to return some IResult:

Does this mean that my task (running a lot of CREATE queries inside one transaction) could be performed by calling these queries inside WriteTransaction() and just returning an empty IResult at the end (since I don't really need any results)?

charlotte.skardon · April 20, 2020, 11:12am

Hey @muzos07,

To take your first question, err, first. If you had:

using(var tx = session.BeginTransaction())
{
	//Statements
}

If you don't call tx.Commit(), it will roll back upon Dispose(). So to get your 3 CREATE statements committed, you would need to do:

using(var tx = session.BeginTransaction())
{
	tx.Run("CREATE (:Node {Id:1})");
	tx.Run("CREATE (:Node {Id:2})");
	tx.Run("CREATE (:Node {Id:3})");
	tx.Commit();
}

Now, onto the WriteTransaction method:
You don't have to return an IResult at all, the signature for the WriteTransaction is:

T WriteTransaction<T>(Func<ITransaction, T> work)

So you could just return bool if you want, or maybe the time it took to run your statements?

You would use it the same way:

TimeSpan timeTaken = session.WriteTransaction(tx =>
{
	TimeSpan taken = TimeSpan.Zero;
	taken = taken.Add(tx.Run("CREATE (n:Node {Id:1})").Consume().ResultAvailableAfter);
	taken = taken.Add(tx.Run("CREATE (n:Node {Id:2})").Consume().ResultAvailableAfter);
	taken = taken.Add(tx.Run("CREATE (n:Node {Id:3})").Consume().ResultAvailableAfter);

	return taken;
});

Console.WriteLine($"Took: {timeTaken.TotalMilliseconds.ToString()}ms");

There is a KEY difference here though - this will Commit() without you having to ask it to.

Does that help at all?

All the best

Chris

muzos07 · April 20, 2020, 12:20pm

Hello, thank you.

Through trial and error I ended up with something similar to your first example, which goes something like this:

using (ISession session = driver.Session())
{
    foreach (NodeGraph nodeGraph in filteredNodeGraphsFromOneXml)
    {
        // One transaction per NodeGraph, so a failure only rolls back this graph.
        using (ITransaction tx = session.BeginTransaction())
        {
            try
            {
                var rootNodeID = tx.Run(
                    "CREATE (n:RootNode $parameters) RETURN id(n) AS id",
                    new { parameters = nodeGraph.rootNode.Parameters }).First()["id"].As<long>();

                foreach (<some foreach through main nodes>)
                {
                    // Note the trailing spaces: the strings are concatenated into one query.
                    var mainNodeID = tx.Run(
                        "MATCH (r:RootNode) WHERE id(r) = $rootNodeID " +
                        "CREATE (n:Executable $parameters)-[:HAS_ROOT_NODE]->(r) " +
                        "RETURN id(n) AS id",
                        new
                        {
                            rootNodeID = rootNodeID,
                            parameters = executable.Parameters
                        }).First()["id"].As<long>();

                    foreach (<some other foreach or condition to create secondary nodes>)
                    {
                        tx.Run(
                            "MATCH (e:Executable) WHERE id(e) = $mainNodeID CREATE (n:Detection $parameters)<-[:HAS_DETECTION]-(e)",
                            new { mainNodeID = mainNodeID, parameters = executable.Detection.Value.Parameters }).Consume();
                    }
                }

                <some other cycles or conditions for adding other nodes and relationships>

                tx.Commit();
            }
            catch
            {
                // The using blocks dispose tx and session, so only the rollback is explicit.
                tx.Rollback();
                throw;
            }
        }
    }
}

This works; however, I will need to work on the speed, which isn't great right now. I tried several approaches:

Multiple queries with same MATCH statement - #2 by stefan.armbruster - returning id(n) for further matching
using apoc.static.get/set to store current mainNode into database as some kind of global variable
creating relationships directly, which required a LOT of MERGE statements, which I guess performs worse than MATCH + MATCH + CREATE for a relationship

I haven't tested everything yet (I plan to do it in the next few days), so you can consider this issue regarding multiple statements in a transaction closed.

However, I have one follow-up question:
Is Write/ReadTransaction any better than BeginTransaction in terms of performance?

Thank you very much for this and following answers, appreciate that.

charlotte.skardon · April 20, 2020, 1:25pm

Depends. There should be no difference on a single-instance DB, and if you're doing writes, there should also be no difference in a cluster. The only time you could see a performance improvement is in a cluster with purely READ transactions: running those through BeginTransaction would route them to the Leader, whereas ReadTransaction lets them go to the Followers.

But that's all about clustering, in essence - if you can say the query is a read/write then use the correct one if you plan on going to a clustered environment in the future.
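
As a rough sketch of that advice, assuming the 4.x synchronous API (the Neo4j.Driver.Simple package): the URI, credentials, and the Node label with its Id property are placeholders for illustration, not anything from your project.

```csharp
using System;
using System.Linq;
using Neo4j.Driver;

class ReadWriteExample
{
    static void Main()
    {
        // Placeholder URI and credentials (assumptions); replace with your own.
        using (var driver = GraphDatabase.Driver(
            "neo4j://localhost:7687", AuthTokens.Basic("neo4j", "password")))
        using (var session = driver.Session())
        {
            // Write work: goes to the Leader in a cluster and commits automatically.
            session.WriteTransaction(tx =>
                tx.Run("CREATE (:Node {Id: $id})", new { id = 1 }).Consume());

            // Read work: in a cluster this may be routed away from the Leader.
            long count = session.ReadTransaction(tx =>
                tx.Run("MATCH (n:Node) RETURN count(n) AS c").Single()["c"].As<long>());

            Console.WriteLine($"Nodes: {count}");
        }
    }
}
```

On a single instance both calls hit the same server, so the split only pays off once routing is involved.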

In terms of performance, it's tricky to see without your class structure for the things you want to store in the database, but the quickest way when you have an array/list typically is to use UNWIND to do the work, so I would consider changing your Root/Executable node creation to something like this:

using(var tx = session.BeginTransaction())
{
    List<ExecutableParameters> allExecutableParameters = <???>; //All the params
    var query = 
        @"CREATE (root:RootNode $rootParameters)
          WITH root
          UNWIND $allExecutableParameters AS executableParameters
          CREATE (e:Executable)-[:HAS_ROOT_NODE]->(root)
          SET e = executableParameters";
    tx.Run(query, new {
            rootParameters = nodeGraph.Parameters,
            allExecutableParameters = allExecutableParameters
        });
    tx.Commit(); //Without this, the work is rolled back on Dispose()
}

You could RETURN something like COLLECT(id(e)) AS ids to get all the Ids.
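
A minimal sketch of reading those ids back, assuming the 4.x synchronous API. The RootNodeWriter class and its method are hypothetical names, and the parameters are assumed to be plain dictionaries (property name to value) rather than your own classes, since that is a shape the driver can be expected to serialize.

```csharp
using System.Collections.Generic;
using Neo4j.Driver;

static class RootNodeWriter
{
    // Hypothetical helper: creates one root node plus all of its executables
    // in a single statement and returns the ids of the created executables.
    public static List<long> CreateRootWithExecutables(
        ISession session,
        IDictionary<string, object> rootParameters,
        IList<IDictionary<string, object>> allExecutableParameters)
    {
        const string query = @"
            CREATE (root:RootNode $rootParameters)
            WITH root
            UNWIND $allExecutableParameters AS executableParameters
            CREATE (e:Executable)-[:HAS_ROOT_NODE]->(root)
            SET e = executableParameters
            RETURN collect(id(e)) AS ids";

        // collect() aggregates into exactly one row, so Single() is safe here.
        return session.WriteTransaction(tx =>
            tx.Run(query, new { rootParameters, allExecutableParameters })
              .Single()["ids"].As<List<long>>());
    }
}
```

The returned list could then be passed as a parameter to a second UNWIND statement for the Detection nodes, instead of one Run() per node.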

Obviously this is just to give an idea; given the code, it might make sense to start with the Executable->Detection side of things first, i.e. group the Detection parameters and UNWIND them.

Generally, the more you can do in one statement the better.

In terms of your MATCH vs MERGE type thing, there should be little difference, as MERGE is basically doing what you are doing. If you want - you could PROFILE the queries to see which is more efficient.
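
If you want to compare the two from code rather than from the browser, something along these lines might do, assuming the 4.x synchronous API. QueryProfiler and ProfileDbHits are hypothetical names; the sketch prefixes the query with PROFILE, sums the db hits over the returned plan tree, and rolls the transaction back so the profiled writes are not kept.

```csharp
using Neo4j.Driver;

static class QueryProfiler
{
    // Runs the query under PROFILE and returns the total db hits of its plan.
    public static long ProfileDbHits(ISession session, string cypher, object parameters)
    {
        using (var tx = session.BeginTransaction())
        {
            var summary = tx.Run("PROFILE " + cypher, parameters).Consume();
            tx.Rollback(); // discard whatever the profiled query wrote
            return summary.HasProfile ? SumDbHits(summary.Profile) : 0;
        }
    }

    // Walks the plan tree and adds up the db hits of every operator.
    private static long SumDbHits(IProfiledPlan plan)
    {
        long total = plan.DbHits;
        foreach (var child in plan.Children)
        {
            total += SumDbHits(child);
        }
        return total;
    }
}
```

Calling it once with the MATCH + CREATE version and once with the MERGE version, using the same parameters, gives two numbers to compare; fewer db hits usually means less work for the database.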

All the best

Chris
