How to get sub graph with custom query

9mikev · March 2, 2022, 5:13pm

I have the following classes for node and edge


@Node
public class MyNode {

    @Id
    String id;
    String name;
    String type;

    @Relationship(type = "CONTAINS")
    Set<MyEdge> outgoingEdges = new HashSet<>();

    public MyNode() {
    }

    public MyNode(String id, String name, String type) {
        this.id = id;
        this.name = name;
        this.type = type;
    }

    // Getters and setters... 
}

and

@RelationshipProperties
public class MyEdge {

    @Id
    String id;

    String idNodeFrom;
    String idNodeTo;

    @TargetNode
    MyNode targetNode;

    public MyEdge(String idNodeFrom, String idNodeTo, MyNode targetNode) {
        this.id = idNodeFrom + " -> " + idNodeTo;
        this.idNodeFrom = idNodeFrom;
        this.idNodeTo = idNodeTo;
        this.targetNode = targetNode;
    }
    
    // Getters and setters
}

and I want to find the sub graph starting at a specific node. The query I have is this

MATCH (root {name: "nodeName"})-[*]->(leaf) RETURN *

but I'm not sure what the return type will be in Java. That is,

@Query("MATCH (root {name: $nodeName})-[*]->(leaf) RETURN *")
ReturnType??? customMethod(@Param("nodeName") String nodeName);

There are no back edges or cycles in my directed graph so it will not produce any type of potential conflicts.

Any help would be appreciated. Thank you

michael_simons1 · March 3, 2022, 9:02am

Hello @9mikev

the return type can be of many things.

I assume that you have a repository that looks something like this

interface MyNodeRepository extends Neo4jRepository<MyNode, String> {
}

Then a valid return type would be one of the following:

MyNode (assumes there is always exactly one; when there is none, this returns null)
Optional<MyNode> (there may be none or one)
Collection<MyNode> (with whatever collection you want, usually List or Set is appropriate)

The query however needs a bit of shape:

MATCH (root {name: $nodeName})-[r*]->(leaf) RETURN root, collect(r), collect(leaf)

Basically making sure you get one record by root node which is explained here: Spring Data Neo4j
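
To illustrate why the aggregation matters: without collect(...), the pattern yields one row per matched path, so a root with several descendants comes back as several rows rather than one. The following is only a plain-Java analogy, with no SDN or driver code involved; CollectByRoot and Row are hypothetical names made up for this sketch. It folds such rows into a single entry per root, which is roughly what RETURN root, collect(leaf) does on the database side.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not part of Spring Data Neo4j.
final class CollectByRoot {

    // One un-aggregated result row: a root id and one leaf id reached from it.
    static final class Row {
        final String root;
        final String leaf;

        Row(String root, String leaf) {
            this.root = root;
            this.leaf = leaf;
        }
    }

    // Groups rows by root so that each root ends up with exactly one entry,
    // holding the list of all leaves that appeared next to it.
    static Map<String, List<String>> collectLeaves(List<Row> rows) {
        Map<String, List<String>> byRoot = new LinkedHashMap<>();
        for (Row row : rows) {
            byRoot.computeIfAbsent(row.root, key -> new ArrayList<>()).add(row.leaf);
        }
        return byRoot;
    }
}
```

Three rows that share the same root collapse into one map entry with three leaves, which matches the "one record per root node" shape that a single MyNode or Optional<MyNode> return type expects.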

However, when all you wanna do is find that root node by a simple, mapped attribute (name), why are you making it so hard on yourself? This is all you need:

interface MyNodeRepository extends Neo4jRepository<MyNode, String> {
    Optional<MyNode> findOneByName(String name);

    List<MyNode> findAllByName(String name);
}

Those methods are called "derived query methods". SDN understands the domain model and the repository is able to derive a query for you by the given method name. No need to write Cypher for that.

Let us know if this answer was helpful.

9mikev · March 3, 2022, 10:08am

Hey @michael_simons1 and thank you for your answer!

I am aware of the derived query methods, but the reason I'm not using findOneByName is because it performs very slowly for some reason. The graph I'm working on is relatively simple - around 630 nodes and 2,800 relationships - and I don't know why the execution time of these built in queries is so slow - we're talking about 30 seconds waiting time for some queries - but I have experimented and found that custom queries work much faster.

Now as to the query you suggested, I tried it like this

@Query("MATCH (root {id: $nodeId})-[r*]->(leaf) RETURN root, collect(r), collect(leaf)")
Optional<MyNode> customSubGraphStartingAt(@Param("nodeId") String nodeId);

but it returns only the root node and not the rest of the nodes it should return. That is, its outgoingEdges set is empty.

Another thing is that there is a deprecation warning that popped up when using this query. It says

    MATCH (root {id: $nodeId})-[contains*]->(tool) RETURN root, collect(contains), collect(tool)
	                          ^
Binding relationships to a list in a variable length pattern is deprecated. (Binding a variable length relationship pattern to a variable ('contains') is deprecated and will be unsupported in a future version. The recommended way is to bind the whole path to a variable, then extract the relationships:
	MATCH p = (...)-[...]-(...)
	WITH *, relationships(p) AS contains)

I considered using the path version of this query but again I'm not sure what the return type in Java should be.

Again, thanks for your time and help!

9mikev · March 15, 2022, 11:55am

@gerrit.meier Any thoughts on this...?

I also tried this, both Database-side reduction (which was very slow) and Client-side reduction (was not working and can't really recall the reason, it's been some days now) and I don't know what else to try.

Any help would be appreciated

gerrit.meier · March 15, 2022, 2:41pm

Please have a look at neo4j-issues-examples/discourse-52793 at master · meistermeier/neo4j-issues-examples · GitHub. I created a demo project with your reported problem, though with a more naive dataset that is missing some relationships.
I don't know where the difference is, but I hope it can help us track down your problem.
The check for the relationships down to the last two nodes is a little bit stupid because it checks every chain over and over, but hey, it shows that everything is loaded.

gerrit.meier · March 15, 2022, 3:05pm

Alright, I increased the amount of relationships a bit to match yours.
Please try to apply this version of the query and you should be fine. neo4j-issues-examples/MyNodeRepository.java at 8f46768fc873a00f4ccd1cd519f0d0d17c427313 · meistermeier/neo4j-issues-examples · GitHub

9mikev · March 16, 2022, 11:18am

First of all thanks so much for your time helping me with this.

The query you suggested is something I had already tried in my previous reply, when I said I tried the Client-side reduction. But the problem is that the query returns the first node with an empty set of outgoingEdges, so it is the only node I get.

gerrit.meier · March 17, 2022, 10:34am

Are you really sure that you are using the same query as I have linked?
MATCH p=... RETURN root, collect(nodes(p)), collect(relationships(p))
With this result all outgoingEdges (and their target's outgoingEdges, ...) are mapped. So it could only be that you have another relationship type and not CONTAINS.
Please take the time to see where my example and your project differ, or please provide a reproducer project.
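
One way to check on the client whether the whole sub graph was actually mapped is to walk outgoingEdges from the returned root and collect every node reached. The sketch below is an illustration under stated assumptions, not code from the linked demo project: SubGraphWalker, Node and Edge are hypothetical, annotation-free stand-ins for MyNode and MyEdge, and the visited set keeps each node from being processed more than once even when several paths lead to it.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-ins for the thread's MyNode/MyEdge, without SDN annotations.
final class SubGraphWalker {

    static final class Node {
        final String id;
        final Set<Edge> outgoingEdges = new HashSet<>();

        Node(String id) {
            this.id = id;
        }
    }

    static final class Edge {
        final Node targetNode;

        Edge(Node targetNode) {
            this.targetNode = targetNode;
        }
    }

    // Returns the ids of all nodes reachable from root (root included).
    // Each node is expanded at most once, however many paths lead to it.
    static Set<String> reachableIds(Node root) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node current = stack.pop();
            if (!seen.add(current.id)) {
                continue; // already visited via another path
            }
            for (Edge edge : current.outgoingEdges) {
                stack.push(edge.targetNode);
            }
        }
        return seen;
    }
}
```

If reachableIds(root) contains only the root's own id, the relationships were not mapped onto the returned object, which is the symptom described earlier in this thread.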

abccbaandy · January 4, 2025, 4:24am

Just found this topic.

I'm surprised that official staff ignored this important performance issue, which a user reported 3 years ago.

It still happens in 2025.
Related issue link:

github.com/spring-projects/spring-data-neo4j

[Performance Issue] Generic relationship does not honor type when query

opened 09:50AM - 20 Aug 24 UTC

abccbaandy

status: waiting-for-triage

I have some nodes:

````java
@Data
@Node
public abstract class BaseNode {
    …@Id
    @GeneratedValue
    private UUID id;
}

@EqualsAndHashCode(callSuper = true)
@Node
@Data
public class Child extends BaseNode{
}

@EqualsAndHashCode(callSuper = true)
@Node
@Data
@NoArgsConstructor
@AllArgsConstructor
@Builder
@ToString(callSuper = true)
public class Parent1 extends BaseNode{
    @Relationship(type = "Parent1_CONTAIN", direction = Relationship.Direction.OUTGOING)
    private List<BaseRelationship<BaseNode>> parent1Relationships;
}

@EqualsAndHashCode(callSuper = true)
@Node
@Data
@NoArgsConstructor
@ToString(callSuper = true)
public class Parent2 extends BaseNode {
    @Relationship(type = "Parent2_CONTAIN", direction = Relationship.Direction.OUTGOING)
    private List<BaseRelationship<BaseNode>> parent2Relationships;
}
````

When I get parent1 with find all:

````java
List<Parent1> all = parent1Repository.findAll();
````

The log shows

````
MATCH (parent1:`Parent1`:`BaseNode`) WITH collect(elementId(parent1)) AS __sn__ RETURN __sn__
MATCH (parent1:`Parent1`:`BaseNode`) OPTIONAL MATCH (parent1)-[__sr__:`Parent1_CONTAIN`]->(__srn__:`BaseNode`) WITH collect(elementId(parent1)) AS __sn__, collect(elementId(__srn__)) AS __srn__, collect(elementId(__sr__)) AS __sr__ RETURN __sn__, __srn__, __sr__
MATCH (baseNode:`BaseNode`) WHERE elementId(baseNode) IN $__ids__ OPTIONAL MATCH (baseNode)-[__sr__:`Parent2_CONTAIN`]->(__srn__:`BaseNode`) WITH collect(elementId(baseNode)) AS __sn__, collect(elementId(__srn__)) AS __srn__, collect(elementId(__sr__)) AS __sr__ RETURN __sn__, __srn__, __sr__
MATCH (baseNode:`BaseNode`) WHERE elementId(baseNode) IN $__ids__ OPTIONAL MATCH (baseNode)-[__sr__:`Parent1_CONTAIN`]->(__srn__:`BaseNode`) WITH collect(elementId(baseNode)) AS __sn__, collect(elementId(__srn__)) AS __srn__, collect(elementId(__sr__)) AS __sr__ RETURN __sn__, __srn__, __sr__
MATCH (rootNodeIds:`Parent1`) WHERE elementId(rootNodeIds) IN $rootNodeIds WITH collect(rootNodeIds) AS n OPTIONAL MATCH ()-[relationshipIds]-() WHERE elementId(relationshipIds) IN $relationshipIds WITH n, collect(DISTINCT relationshipIds) AS __sr__ OPTIONAL MATCH (relatedNodeIds) WHERE elementId(relatedNodeIds) IN $relatedNodeIds WITH n, __sr__ AS __sr__, collect(DISTINCT relatedNodeIds) AS __srn__ UNWIND n AS rootNodeIds WITH rootNodeIds AS parent1, __sr__, __srn__ RETURN parent1 AS __sn__, __sr__, __srn__
````

The query log shows that it searches for `Parent2_CONTAIN`, but it shouldn't, because `Parent2_CONTAIN` is not in the Parent1 node. In a real case, if I have 10 nodes extending the base node, it will end up querying the relationships of all 10 nodes. I think it is a performance issue.

Changing to `Child` still has this issue:

````java
@Relationship(type = "Parent1_CONTAIN", direction = Relationship.Direction.OUTGOING)
private List<BaseRelationship<Child>> parent1Relationships;
````

Also, I think this issue is associated with https://github.com/spring-projects/spring-data-neo4j/issues/2933

For anyone who has the same issue:
It seems the default non-Cypher query approach is totally unusable in a real project.
