How to get only the Node object using Spring Data Neo4j instead of populating its relationship lists

I'm facing a problem: whenever I fetch a node from Neo4j using the findById() method of the Spring Data repository, it also populates the relationship lists. For example, this is the node entity I'm trying to retrieve from the database:

@Node
public class Location extends Transaction {

    public final static String featureID = "12116";

    public final static String featureVariantID = "000";

    @Relationship("Parent")
    private Set<BaseRelationship<Location>> parents;

    @Relationship("Type")
    private Set<BaseRelationship<LocationType>> types;

    @Relationship("Workplace")
    private Set<BaseRelationship<Workplace>> workplaces;

    @PersistenceCreator
    public Location() {

    }

    public Location(String transactionID, String tenantID) {
        super(featureID, featureVariantID, transactionID, tenantID);
        // initializing the lists
    }

    // Getters and Setters

    private static final Map<String, String> allowedRelationships = new HashMap<>();
    static {
        allowedRelationships.put("Parent", "Data.ParentLocationID");
        allowedRelationships.put("Type", "Data.LocationLocationTypeID");
        allowedRelationships.put("Workplace", "Data.LocationWorkplaceID");
    }

    public Map<String, String> getAllowedRelationships() {
        return Location.allowedRelationships;
    }
}

The relationship class is like this:

@RelationshipProperties
public class BaseRelationship<Target extends Transaction> {

    @Id
    @GeneratedValue
    private Long id;

    @Property("EffectiveTillTimestamp")
    private String effectiveTillTimestamp;

    @Property("EffectiveFromTimestamp")
    private String effectiveFromTimestamp;

    @Property("Status")
    private String status;

    @TargetNode
    private Target targetNode;

    public BaseRelationship(String effectiveTillTimestamp, String effectiveFromTimestamp, String status, Target targetNode) {
        // initializing the data members
    }

    public BaseRelationship() {
    }

    // Getters and Setters
}

So whenever I pass a nodeID into the findById() method of the repository, it also populates the types, parents and workplaces relationship lists. The objects in those lists also have their own relationship lists populated. In my view this adds extra overhead when retrieving data from Neo4j.

This is the repository interface being used; the repository interfaces of the individual node classes inherit from it.

@Repository
public interface AbstractTransactionRepository<T extends Transaction> extends Neo4jRepository<T, String> {

    @Query("MATCH (n) WHERE n.NodeID= $nodeID RETURN n")
    Optional<T> findOnlyNode(String nodeID);
}

I tried using a custom query, as defined in the findOnlyNode method of the repository above, to fetch only the node without its relationships. However, after creating and saving a new relationship between the fetched node (as the target node) and another newly created node, the already existing relationships of the target node (the one I fetched using findOnlyNode) were destroyed. For example:

If the structure was previously:
Target Node -> Parent1 -> Parent2
then after saving it became:
New Node -> Target Node
Instead, it should have been:
New Node -> Target Node -> Parent1 -> Parent2 (correct)

If I just use the findById() method of Spring Data, it works correctly, but the time taken to fetch the node increases. If the node has several children, and those children in turn have their own children, fetching the node takes over 2 seconds; if the node has no children, it always takes less than 700 ms.

Is there a way to fetch just the node object from Neo4j without populating its relationships, and to save it without destroying the already existing relationships?

The use case for Spring Data Neo4j is to manage domain entities. As such, it keeps your Java domain model in sync with your database model. This is why you lost your relationships when saving the node without its associated relationships in your Java object: writes have to be done with the full domain model.

You can use a projection for reading a subset of the data. In your case, define an interface with methods that return just the properties you want; the method format is getPropertyName. Projections can also be hierarchical when using an interface projection.

https://docs.spring.io/spring-data/neo4j/docs/current/reference/html/#projections
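
For illustration, a minimal interface projection for the Location entity above could look like the sketch below. The property names (nodeID, tenantID) and the repository are assumptions; the getters have to match the getters of your actual entity, and the repository method simply declares the projection as its return type.

// Hypothetical sketch of an interface projection for Location: only the listed getters are fetched,
// the Parent/Type/Workplace relationships are not queried at all.
public interface LocationSummary {

    String getNodeID();   // assumed scalar property on the entity

    String getTenantID(); // assumed scalar property on the entity
}

// Declared on the repository; SDN derives a query that returns only these properties.
public interface LocationRepository extends Neo4jRepository<Location, String> {

    Optional<LocationSummary> findByNodeID(String nodeID);
}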


I don’t think projections would work for me because I need the whole node object, not just a property like nodeID. I have to pass the node object to the BaseRelationship constructor and assign it to the targetNode. Is there a way to create a new node object from the nodeID and use it to create the relationship? Would that also create the relationship with the existing node in the database and preserve its other relationships?

I mentioned a projection because you said you wanted to read a subset of the data, since reading the entire object took too long. Projections are used to retrieve a subset of your domain object.

For writing, you need to create your Java domain object with all the data, as SDN will create Neo4j objects that match what is in your Java object. Its purpose is to sync the Java object to Neo4j so you don't have to worry about the details.

I have two distinct use cases for Neo4j in my application. One is where I am managing isolated domain objects, like an invoice and its line items. In this case I use SDN, as it makes it easy to read and write invoices.

My other use case is a network of data. Here there is no identifiable domain object, as everything is tangled up in a mesh. For this use case I use the Neo4j driver directly, so I can define operations on my graph and manage insertions/updates/deletions to any part of it. There would be no practical, efficient way to use SDN here (in my opinion).

Is SDN the correct choice for your data?

Thank you for providing some use cases. After reviewing them, I've come to realize that my data structure closely resembles a network. For example, there is a node type that has 15 relationships, with 2 of those relationships pointing to the same node type (but different node objects), and each target node of these relationships in turn has deeply connected nodes. Given this complexity, it seems that SDN may not be a suitable choice for me.
Can you provide an example of saving and creating nodes using the Neo4j driver so that I could learn from it? That would greatly help my understanding.

You would start by defining all the operations you need to perform to create/update/delete/read any of the objects in your graph. I defined a separate repository for each Neo4j entity, but the design is entirely up to you. You may end up with a lot of methods to implement because you are providing coarse- and fine-grained changes to your graph.

I used the Neo4j driver directly in my first project because I knew I did not want to use SDN, given that I had a network of entities to manage. Later, when I learned SDN for my next project (which did manage domain objects), I also learned about SDN's Neo4jClient. Neo4jClient is built on top of the Neo4j driver. It provides a fluent API, and its transaction management is integrated with Spring's. I would probably use it instead of the raw driver if I were to redo my first project; it encapsulates a lot of the boilerplate code and I feel I would write code much faster. It is exactly like using streams instead of manually iterating over data: I use streams anywhere I can, so I would probably feel the same about Neo4jClient vs the Neo4j driver.
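
As a rough, untested sketch of that fluent style (assuming the same PersonEntity used in the driver-based example below), a read with Neo4jClient could look roughly like this:

// Sketch only: the same findByName read expressed with SDN's Neo4jClient
// (org.springframework.data.neo4j.core.Neo4jClient).
public List<PersonEntity> findByName(Neo4jClient neo4jClient, String name) {
    return List.copyOf(neo4jClient
            .query("""
                    MATCH (n:Person)
                    WHERE n.lastName CONTAINS $name OR n.firstName CONTAINS $name
                    RETURN n.key AS key, n.firstName AS firstName, n.lastName AS lastName, n.ssn AS ssn
                    """)
            .bind(name).to("name")
            .fetchAs(PersonEntity.class)
            .mappedBy((typeSystem, rec) -> new PersonEntity(
                    rec.get("key").asLong(),
                    rec.get("firstName").asString(),
                    rec.get("lastName").asString(),
                    rec.get("ssn").asString()))
            .all());
}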

For an example, I threw something together to give you an idea of what I mean by using the driver to create a repository. The code is just representative; I have not tested it, nor written it for my own use. I feel a lot of this code would be simplified by using Neo4jClient instead.

package com.example.neo4jdriverpoc;

import lombok.AllArgsConstructor;
import org.neo4j.driver.Driver;
import org.neo4j.driver.Session;
import org.neo4j.driver.Values;

import java.util.List;
import java.util.Map;
import java.util.Optional;

@AllArgsConstructor
public class Neo4jPersonRepository {

    Driver driver;

    public List<PersonEntity> findByName(String name) {
        // Return the individual properties so they can be read directly from the record.
        String query = """
                match(n:Person)
                where n.lastName contains $name or n.firstName contains $name
                return n.key as key, n.firstName as firstName, n.lastName as lastName, n.ssn as ssn
                """;
        try (Session session = driver.session()) {
            return session.executeRead(tx -> tx.run(query, Values.parameters("name", name))
                    .list(r -> new PersonEntity(
                            r.get("key").asLong(),
                            r.get("firstName").asString(),
                            r.get("lastName").asString(),
                            r.get("ssn").asString()
                    )));
        }
    }

    public Optional<PersonEntity> update(PersonEntity person) {
        Map<String, Object> properties = Map.of(
                "key", person.getKey(),
                "firstName", person.getFirstName(),
                "lastName", person.getLastName(),
                "ssn", person.getSsn()
        );
        // The property map is passed as the $properties parameter so "set n = $properties" can reference it.
        String query = """
                match(n:Person{key:$key})
                set n = $properties
                return n.key as key, n.firstName as firstName, n.lastName as lastName, n.ssn as ssn
                """;
        try (Session session = driver.session()) {
            return session.executeWrite(tx -> {
                List<PersonEntity> list = tx.run(query,
                                Values.parameters("key", person.getKey(), "properties", properties))
                        .stream()
                        .map(r -> new PersonEntity(
                                r.get("key").asLong(),
                                r.get("firstName").asString(),
                                r.get("lastName").asString(),
                                r.get("ssn").asString()))
                        .toList();
                if (list.isEmpty()) {
                    return Optional.empty();
                } else if (list.size() == 1) {
                    return Optional.of(list.get(0));
                } else {
                    throw new IllegalStateException("more than one Person entity exists with key '" + person.getKey() + "'");
                }
            });
        }
    }
}

You can write something like this in your repository:

@Query("MATCH (n) WHERE n.NodeID= $nodeID RETURN n {.*}")
    Optional<T> findOnlyNode(String nodeID);

That way you get all the "basic" data of your node mapped to your object by Spring Data.

If that is not enough, there are also ways to include "some" of the needed related objects.
E.g. see my example here:

The problem with your destroyed relationships on

New Node -> Target Node -> Parent1 -> Parent2 (correct)

is that you did a "recursive save" by saving the "New Node".
If the "Target Node" included in your save call did not contain its own relationships to "Parent1", the operation will simply "reflect that" in Neo4j and will therefore remove the relationship to "Parent1".

The only way out of this is - as far as I know - not to use the generic "save" methods provided by Spring Data Neo4j at all, as soon as the object includes a relationship.

We write our own "save" methods where we first store just the node properties and then create the relationships with dedicated Cypher calls.
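
A sketch of what such a properties-only save could look like as a repository method (the NodeID property is taken from the question; the props map and the method name are made up for illustration). The relationships are then created in separate, dedicated Cypher calls:

// Sketch: persists only the node's own properties and leaves existing relationships untouched.
@Query("MERGE (n:Location {NodeID: $nodeID}) SET n += $props RETURN n")
Optional<Location> saveNodeProperties(String nodeID, Map<String, Object> props);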

Thanks @glilienfield and @alexander.gehre for your awesome answers. I think creating a custom save method for each of my relationships is my only solution to the problem. I was just hoping there would be some solution using the built-in save method of SDN, because I also have to check the validity of the existing relationships and update them in some cases, not just add new ones. This adds to the complexity, and the built-in save method eased that. But thanks for the answers, they have really helped me out a lot.


We started with dedicated methods in our repository classes for creating each single relationship.
But besides being very error-prone, this can be achieved much more easily.
The following code does a good job in our solution:

public class GenericNeoService {

    private final Neo4jTemplate neo4jOperations;

    /**
     * Add specific relation from a source node to a target node. Will simply add no relation if source or target node
     * could not be found.
     *
     * @param sourceNodeLabel Required Label of the source node
     * @param sourceNodeUuid  Uuid of the source node
     * @param relationName    Name of the relation to add
     * @param targetNodeLabel Required Label of the target node
     * @param targetNodeUuid  Uuid of the target node
     * @return count of added relations
     */
    public long addRelation(String sourceNodeLabel, String sourceNodeUuid, String relationName,
                            String targetNodeLabel, String targetNodeUuid) {
        // Labels and relationship types cannot be Cypher parameters, so they are concatenated;
        // the uuids are passed as query parameters.
        String query = "MATCH (source:" + sourceNodeLabel + ") WHERE source.uuid = $sourceUuid " +
                "MATCH (target:" + targetNodeLabel + ") WHERE target.uuid = $targetUuid " +
                "MERGE (source)-[rel:" + relationName + "]->(target) " +
                "RETURN count(rel)";
        return neo4jOperations.count(query, Map.of("sourceUuid", sourceNodeUuid, "targetUuid", targetNodeUuid));
    }
    
    ..
}

Of course, if your classes do not use "uuid" as the "@Id" field, you should adapt the code...
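
Used from a service, a call could then look like this (labels, relationship name and uuid values are of course just placeholders):

// Connects two existing nodes without touching any of their other relationships.
long added = genericNeoService.addRelation("Location", sourceUuid, "Parent", "Location", targetUuid);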

BTW, as far as I remember, some time ago there was an option in Spring Data to define a "depth" for the recursion when reading/writing objects.
Unfortunately, they simply removed it. I really liked that approach, as it allowed you to define exactly where to "cut off" reading and writing operations while still using the generic save/find methods provided by the framework.

@alexander.gehre , that's true. Are there any plans to make depth available again?

@glilienfield , I want to ask you about performance. When I use a custom query to get just the node without its relationships, the performance is way better. So what is the best practice here to keep the balance between performance and keeping things in sync with the DB?

Thank you guys

Chiming in here to give some insights on this topic. Please feel free to link this also in other threads if you think it makes sense.

There is hardly any way to re-introduce depth as it existed in SDN with Neo4j-OGM before Spring Data Neo4j version 6.
The idea of a "fetch depth" was based on a cache that, when loading the data, could also be respected when persisting again. Although this cache works very well when invoked in a standard unit of work of load, modify and save, it behaved in a very surprising way for users when there were subsequent fetches/loads.
In addition, this cache prevented us from introducing support for reactive flows and immutable object creation, because later modifications to the cache might have had an impact on already mapped objects.

Looking at the existing problem of fetching the whole graph in worst-case modelling scenarios, we made use of the projection feature that already exists in Spring Data Commons (and was partially supported in earlier versions of SDN).
It not only gives you the option to define the depth, it is also very flexible when it comes to excluding relationships on the same "depth level" while keeping others.
It has a direct effect on query creation, so no unnecessary data gets queried from the database.
This would also help to fetch a node with just one relationship property and add new ones to it, because projections, in contrast to other Spring Data modules, are now also supported in persist operations via the Neo4jTemplate. (see Spring Data Neo4j Projections :: Spring Data Neo4j)
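
For example (a sketch only, with a hypothetical projection interface that exposes a single relationship of the Location entity from the question), a write scoped by a projection via the Neo4jTemplate could look like this:

// Only the properties and relationships declared in the projection are written;
// everything else on the entity and in the database is left alone.
public interface LocationWithParents {

    String getTransactionID(); // assumed getter on the entity

    Set<BaseRelationship<Location>> getParents();
}

// somewhere in a service, with an injected Neo4jTemplate:
LocationWithParents saved = neo4jTemplate.saveAs(location, LocationWithParents.class);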

We are very aware that writing projections for every use case can be a fair amount of work in order to benefit from the "best" query. That's why we are currently investigating/experimenting with different ways of making it more convenient. But this is ongoing work, and it is not yet decided which solution is the best.

Thanks @gerrit.meier , I am checking whether the projection feature is suitable for my case.

We do everything with custom queries in repositories and on the template classes.
Using this approach we can specify in detail the data we need from the requested node and its related nodes, without a 'generic' definition of a 'depth' for the request.
So we get exactly the data set we need with one single query, with the best performance, and everything mapped automatically to our Java model classes with one simple call. Of course, this also works for projections.
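
As an illustration of that idea applied to the Location entity from the question (the query and method name are a sketch, not our actual code), a custom repository query can fetch the node plus exactly one kind of relationship and let SDN map the result:

// Sketch: fetches the node and only its Parent relationships in one query;
// SDN maps the collected relationships and related nodes back onto the Location entity.
@Query("MATCH (n:Location) WHERE n.NodeID = $nodeID "
        + "OPTIONAL MATCH (n)-[r:Parent]->(p:Location) "
        + "RETURN n, collect(r), collect(p)")
Optional<Location> findWithParentsOnly(String nodeID);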

I am happy that you found a way to solve this for your situation.
Not a critique of this approach, but more a heads-up for others who read this:

The Java object graph fetched with custom Cypher statements probably doesn't contain all relationships (in Alexander's case this is the intention). When persisting such an entity again with the default mechanisms in SDN (and without projections to reduce the scope), it will persist the object graph as-is. Relationships that were not loaded will get removed from the database(!)

You're right, Gerrit!
Our approach only works for us because we do not save objects directly with the generic save methods of Spring Data.
In our use case this is most of the time not needed at all, as we use Neo4j as the backend for a stateless REST web service.
The important point for this use case: you do not get or provide a full data model from/to the client, but only the data set (partial model) for a specific use case, i.e. a node with 'some' related nodes.
So you have no chance at all to map and save the data directly to Neo4j without losing most of the existing relationships of the objects involved, at least in most of the PUT (update) use cases.
This is the reason why we have to store the data separately as simple properties and relationships. The good thing is that this works very well and easily, and gives us full control over what the final result in Neo4j is going to be.

I am also facing the same issue: I cannot retrieve data in a hierarchy, e.g. a parent inside a child and a child inside another sub-child.