Hi, let me give you a simplified example to explain how we boosted Neo4j performance and avoided StackOverflowErrors on our Java backend, using nothing but standard Spring Data Neo4j functionality.
Think of a standard use case for a REST backend:
a GET on a Users endpoint that returns all registered users, including the necessary (sub-)data in the User objects, with one single REST call, so that the users can be displayed on a web backend.
Let me start with a simplified demo model for User objects:
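// Imports for Spring Data Neo4j 6+ and Lombok; the other two classes need the same ones.
import lombok.Data;
import org.springframework.data.neo4j.core.schema.GeneratedValue;
import org.springframework.data.neo4j.core.schema.Id;
import org.springframework.data.neo4j.core.schema.Node;
import org.springframework.data.neo4j.core.schema.Relationship;
import org.springframework.data.neo4j.core.support.UUIDStringGenerator;

@Node
@Data
public class User {
    @Id
    @GeneratedValue(generatorClass = UUIDStringGenerator.class)
    private String uuid;
    private String title;
    private String firstName;
    private String lastName;
    private String email;
    @Relationship(type = "HAS_ADDRESS", direction = Relationship.Direction.OUTGOING)
    private Address address;
    @Relationship(type = "HAS_COMPANY", direction = Relationship.Direction.OUTGOING)
    private Company company;
    ...
}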
@Node
@Data
public class Address {
    @Id
    @GeneratedValue(generatorClass = UUIDStringGenerator.class)
    private String uuid;
    private String country;
    private String postalCode;
    private String city;
    private String street;
    private String houseNumber;
    ...
}
@Node
@Data
public class Company {
    @Id
    @GeneratedValue(generatorClass = UUIDStringGenerator.class)
    private String uuid;
    private String name;
    private String email;
    @Relationship(type = "HAS_ADDRESS", direction = Relationship.Direction.OUTGOING)
    private Address address;
    // A company can point to another company, so cycles are possible here.
    @Relationship(type = "HAS_PARENT_COMPANY", direction = Relationship.Direction.OUTGOING)
    private Company parentCompany;
    ...
}
With a simple CRUD repository interface like
public interface UserRepository extends CrudRepository<User, String> { }
you can now easily write code like this:
// CrudRepository.findAll() returns an Iterable
Iterable<User> users = userRepository.findAll();
However, this is not a good idea if you have a lot of users in your backend.
Spring Data will start by fetching all users in one call.
BUT: after that, it resolves the Address for each User, one by one.
The same happens with the Company.
And: in my small example it is possible to model a cyclic dependency through the parent Company. In that case you would get a StackOverflowError!
So you end up with really poor performance and a good chance of a severe error on the backend.
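To make the cyclic case concrete, here is a minimal plain-Java sketch. It is not Spring Data's actual mapping code. The class `CycleDemo` and its methods are hypothetical names used only for illustration. It shows how following `parentCompany` without remembering visited nodes never terminates on a cycle, while a visited-set guard does.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only: plain Java, no Spring or Neo4j involved.
class CycleDemo {

    // Stripped-down stand-in for the Company entity.
    static class Company {
        final String name;
        Company parentCompany;

        Company(String name) {
            this.name = name;
        }
    }

    // Naive resolution: follows parentCompany recursively without tracking visited nodes.
    static int naiveDepth(Company company) {
        if (company == null) {
            return 0;
        }
        return 1 + naiveDepth(company.parentCompany);
    }

    // Guarded resolution: stops as soon as a company is reached a second time.
    static int guardedDepth(Company company) {
        Set<Company> visited = new HashSet<>();
        int depth = 0;
        while (company != null && visited.add(company)) {
            depth++;
            company = company.parentCompany;
        }
        return depth;
    }

    // Returns true if the naive traversal blows the call stack.
    static boolean naiveOverflows(Company company) {
        try {
            naiveDepth(company);
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Company a = new Company("A");
        Company b = new Company("B");
        a.parentCompany = b;
        b.parentCompany = a; // A -> B -> A: a cycle

        System.out.println("naive overflows: " + naiveOverflows(a)); // prints "naive overflows: true"
        System.out.println("guarded depth: " + guardedDepth(a));     // prints "guarded depth: 2"
    }
}
```

The point is only that any resolver that walks relationships recursively needs either a guard or a hard limit on depth.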
What can we do?
Simply do not use the findAll() method. Instead, extend your repository with a method that uses a custom query:
@Query("""
    MATCH (user:User)
    RETURN user{.*,
        User_HAS_ADDRESS_Address:
            [(user)-[:HAS_ADDRESS]->(address:Address)
                | address{.*}],
        User_HAS_COMPANY_Company:
            [(user)-[:HAS_COMPANY]->(company:Company)
                | company{.uuid, .name,
                    Company_HAS_PARENT_COMPANY_Company:
                        [(company)-[:HAS_PARENT_COMPANY]->(parentCompany:Company)
                            | parentCompany{.uuid, .name}]
                }]
    }
    ORDER BY user.lastName, user.firstName
    """)
List<User> getAllUsers();
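The keys for the nested address and company in the query above follow a visible pattern: source label, relationship type and target label, joined with underscores. The small helper below just spells that pattern out. `ProjectionKeys` and `key` are hypothetical names, and the pattern is read off the query itself, so treat it as an assumption rather than a documented contract.

```java
// Hypothetical helper: builds the map-projection key used for a nested relationship,
// assuming the pattern <SourceLabel>_<RELATIONSHIP_TYPE>_<TargetLabel> seen in the query.
class ProjectionKeys {

    static String key(String sourceLabel, String relationshipType, String targetLabel) {
        return sourceLabel + "_" + relationshipType + "_" + targetLabel;
    }

    public static void main(String[] args) {
        System.out.println(key("User", "HAS_ADDRESS", "Address")); // prints "User_HAS_ADDRESS_Address"
        System.out.println(key("User", "HAS_COMPANY", "Company")); // prints "User_HAS_COMPANY_Company"
    }
}
```

If a nested object comes back empty after mapping, a key that does not match the relationship type is a good first thing to check.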
With that query in place, you can simply write
List<User> users = userRepository.getAllUsers();
in your service method, but now:
- you get exactly the data you need for your REST client, and nothing more
- you get the result with ONE SINGLE call to the Neo4j database, i.e. with the best possible performance
- Spring Data does not waste time walking recursively and uncontrolled through your model (sub graph) to resolve data you do not need
- there is no chance of a StackOverflowError here anymore, because the parentCompany of a parentCompany is not resolved at all.
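The "one single call" point can be illustrated by counting round trips in a toy model. The sketch below is not a benchmark and does not touch Neo4j. `RoundTripDemo` and `CountingDb` are hypothetical names, and it assumes the one-by-one behaviour described earlier: one call for the users, then one call per user for the address and one for the company.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: counts calls to a pretend database, no Neo4j or Spring involved.
class RoundTripDemo {

    // Hypothetical stand-in for a database session that counts round trips.
    static class CountingDb {
        private final int userCount;
        int roundTrips = 0;

        CountingDb(int userCount) {
            this.userCount = userCount;
        }

        List<String> loadUserIds() {
            roundTrips++;
            List<String> ids = new ArrayList<>();
            for (int i = 0; i < userCount; i++) {
                ids.add("user-" + i);
            }
            return ids;
        }

        String loadAddressOf(String userId) {
            roundTrips++;
            return "address-of-" + userId;
        }

        String loadCompanyOf(String userId) {
            roundTrips++;
            return "company-of-" + userId;
        }

        // One query that returns users together with their address and company.
        List<String> loadUsersWithAddressAndCompany() {
            roundTrips++;
            List<String> rows = new ArrayList<>();
            for (int i = 0; i < userCount; i++) {
                rows.add("user-" + i + " + address + company");
            }
            return rows;
        }
    }

    // Users first, then address and company per user: 1 + 2 * userCount calls.
    static int roundTripsOneByOne(int userCount) {
        CountingDb db = new CountingDb(userCount);
        for (String id : db.loadUserIds()) {
            db.loadAddressOf(id);
            db.loadCompanyOf(id);
        }
        return db.roundTrips;
    }

    // Everything in one custom query: always 1 call.
    static int roundTripsSingleQuery(int userCount) {
        CountingDb db = new CountingDb(userCount);
        db.loadUsersWithAddressAndCompany();
        return db.roundTrips;
    }

    public static void main(String[] args) {
        System.out.println(roundTripsOneByOne(1000));    // prints 2001
        System.out.println(roundTripsSingleQuery(1000)); // prints 1
    }
}
```

The actual gain depends on network latency and result size. The count only shows how the number of calls grows with the number of users in one case and stays constant in the other.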
In our experience, this improved the performance of such GET calls by a factor of 20 to 100, depending on the complexity of the (sub-)model and the amount of data.
I hope this small demo helps some developers here to use Spring Data more efficiently, with higher performance and fewer errors caused by the automatic resolution of cyclic dependencies in sub graphs.