Tree-like data, sorting nodes by connecting relationships

Good evening,

My data structure is like a tree and I would like to traverse it from a root object. The parents and the children are connected via a relationship which has a {:order}-property.

A variable length path is therefore required and on each level of the tree, I would like to order the results by this property. Unfortunately, it seems that Cypher is able to order by node-properties, but not by relationship-properties.

So, this works:

MATCH p=(rbs1:Rule_Book_Struct {rule_book_struct_id: "Rule Root Object"})-[:HAS_RULE_BOOK_STRUCT_CHILD*0..1]->(rbs2:Rule_Book_Struct)
RETURN p order by rbs2.rule_book_struct_id

but this does not:

MATCH p=(rbs1:Rule_Book_Struct {rule_book_struct_id: "Rule Root Object"})-[r1:HAS_RULE_BOOK_STRUCT_CHILD*0..1]->(rbs2:Rule_Book_Struct)
RETURN p order by r1.order

Any idea how to fix this issue?

Thank you, JJJ

Hello @janezic :slight_smile:

Since you have a depth relationship, r1 is a list so you need to UNWIND it to sort your RETURN statement, here is an example.

MATCH p=(rbs1:Rule_Book_Struct {rule_book_struct_id: "Rule Root Object"})-[r1:HAS_RULE_BOOK_STRUCT_CHILD*0..1]->(rbs2:Rule_Book_Struct)
UNWIND r1 AS rels
RETURN rels ORDER BY rels.order

Regards,
Cobra

Hello Cobra,

Thank you for your reply.

What happens if I have more than one levels of depth?

We have found a very complex solution for that which is limited to a certain no of levels of depth. But we would - of course - prefer a solution for an undefined no of levels.

Our solution (which creates the complete tree, sorted and 5 levels deep) looks like that:

MATCH path=(rbs1:Rule_Book_Struct {rule_book_struct_id: "Rule Root Object"})-[:HAS_RULE_BOOK_STRUCT_CHILD*0..5]->(rbs2:Rule_Book_Struct)
WITH *, relationships(path) as r1 order by size(r1), r1[0].order ASC, r1[1].order ASC, r1[2].order ASC, r1[3].order ASC, r1[4].order ASC
WITH COLLECT(path) AS paths 
CALL apoc.convert.toTree(paths) YIELD value
RETURN value

My query should work for all cases:

MATCH path=(rbs1:Rule_Book_Struct {rule_book_struct_id: "Rule Root Object"})-[:HAS_RULE_BOOK_STRUCT_CHILD*0..5]->(rbs2:Rule_Book_Struct)
UNWIND relationships(path) AS r
WITH path, r
ORDER BY r.order
WITH collect(path) AS paths
CALL apoc.convert.toTree(paths) YIELD value
RETURN value

Unfortunately this doesn't work. The nodes are not ordered properly (on no level of depth).

It looks as if the path-variable wich is "handed over" in the 2nd WITH-clause remains unordered. So there is no difference in the result even if I remove the "ORDER BY r.order"-part of the query.

Can you give me sample data and an example of result?

Hello Cobra,

Here is code to create (:Test_Elements) which are connected via [:HAS_TEST_ELEMENT_CHILD]-relationship. The relationship has a property {order}.

MERGE (elem0:Test_Element {id:"0"})

MERGE (elem0)-[:HAS_TEST_ELEMENT_CHILD {order:30}]-(elem3:Test_Element {id: "3"})
MERGE (elem0)-[:HAS_TEST_ELEMENT_CHILD {order:20}]-(elem2:Test_Element {id: "2"})
MERGE (elem0)-[:HAS_TEST_ELEMENT_CHILD {order:10}]-(elem1:Test_Element {id: "1"})

MERGE (elem2)-[:HAS_TEST_ELEMENT_CHILD {order:10}]-(elem2_1:Test_Element {id: "2_1"})
MERGE (elem2)-[:HAS_TEST_ELEMENT_CHILD {order:30}]-(elem2_3:Test_Element {id: "2_3"})
MERGE (elem2)-[:HAS_TEST_ELEMENT_CHILD {order:20}]-(elem2_2:Test_Element {id: "2_2"})

MERGE (elem1)-[:HAS_TEST_ELEMENT_CHILD {order:20}]-(elem1_2:Test_Element {id: "1_2"})
MERGE (elem1)-[:HAS_TEST_ELEMENT_CHILD {order:10}]-(elem1_1:Test_Element {id: "1_1"})
MERGE (elem1)-[:HAS_TEST_ELEMENT_CHILD {order:30}]-(elem1_3:Test_Element {id: "1_3"})

MERGE (elem2_1)-[:HAS_TEST_ELEMENT_CHILD {order:20}]-(elem2_1_2:Test_Element {id: "2_1_2"})
MERGE (elem2_1)-[:HAS_TEST_ELEMENT_CHILD {order:30}]-(elem2_1_3:Test_Element {id: "2_1_3"})
MERGE (elem2_1)-[:HAS_TEST_ELEMENT_CHILD {order:10}]-(elem2_1_1:Test_Element {id: "2_1_1"})

MERGE (elem3)-[:HAS_TEST_ELEMENT_CHILD {order:30}]-(elem3_3:Test_Element {id: "3_3"})
MERGE (elem3)-[:HAS_TEST_ELEMENT_CHILD {order:10}]-(elem3_1:Test_Element {id: "3_1"})
MERGE (elem3)-[:HAS_TEST_ELEMENT_CHILD {order:20}]-(elem3_2:Test_Element {id: "3_2"})

The nodes were not created in the right order, because then it would be very likely that the sorting would work anyway (just because of the internal id of the nodes).

Here is the result, how it should look like:

0
-1
--1_1
--1_2
--1_3
-2
--2_1
---2_1_1
---2_1_2
---2_1_3
--2_2
--2_3
-3
--3_1
--3_2
--3_3

Thank you for your assistance!

JJJ

I always forget GDS plugin and its BFS algorithm :sweat_smile:

  • To create the graph in-memory:
CALL gds.graph.create('myGraph', 'Test_Element', 'HAS_TEST_ELEMENT_CHILD', { relationshipProperties: 'order' })
  • To get the result:
MATCH (a:Test_Element {id: "0"})
WITH id(a) AS startNode
CALL gds.alpha.bfs.stream('myGraph', {startNode: startNode})
YIELD path
UNWIND [ n in nodes(path) | n.id ] AS tags
RETURN tags
ORDER BY tags

Regards,
Cobra

Unfortunately, I'm using Neo via Aura, and my latest info was, that AURO does not support GDS. :unamused:

Hello,

Is the order property can be changed? For example:

MERGE (elem0:Test_Element {id:"0"})

MERGE (elem0)-[:HAS_TEST_ELEMENT_CHILD {order:300}]-(elem3:Test_Element {id: "3"})
MERGE (elem0)-[:HAS_TEST_ELEMENT_CHILD {order:200}]-(elem2:Test_Element {id: "2"})
MERGE (elem0)-[:HAS_TEST_ELEMENT_CHILD {order:101}]-(elem1:Test_Element {id: "1"})

MERGE (elem2)-[:HAS_TEST_ELEMENT_CHILD {order:210}]-(elem2_1:Test_Element {id: "2_1"})
MERGE (elem2)-[:HAS_TEST_ELEMENT_CHILD {order:230}]-(elem2_3:Test_Element {id: "2_3"})
MERGE (elem2)-[:HAS_TEST_ELEMENT_CHILD {order:220}]-(elem2_2:Test_Element {id: "2_2"})

MERGE (elem1)-[:HAS_TEST_ELEMENT_CHILD {order:120}]-(elem1_2:Test_Element {id: "1_2"})
MERGE (elem1)-[:HAS_TEST_ELEMENT_CHILD {order:110}]-(elem1_1:Test_Element {id: "1_1"})
MERGE (elem1)-[:HAS_TEST_ELEMENT_CHILD {order:130}]-(elem1_3:Test_Element {id: "1_3"})

MERGE (elem2_1)-[:HAS_TEST_ELEMENT_CHILD {order:212}]-(elem2_1_2:Test_Element {id: "2_1_2"})
MERGE (elem2_1)-[:HAS_TEST_ELEMENT_CHILD {order:213}]-(elem2_1_3:Test_Element {id: "2_1_3"})
MERGE (elem2_1)-[:HAS_TEST_ELEMENT_CHILD {order:211}]-(elem2_1_1:Test_Element {id: "2_1_1"})

MERGE (elem3)-[:HAS_TEST_ELEMENT_CHILD {order:330}]-(elem3_3:Test_Element {id: "3_3"})
MERGE (elem3)-[:HAS_TEST_ELEMENT_CHILD {order:310}]-(elem3_1:Test_Element {id: "3_1"})
MERGE (elem3)-[:HAS_TEST_ELEMENT_CHILD {order:320}]-(elem3_2:Test_Element {id: "3_2"})

So the query is:

MATCH path=(a:Test_Element {id: "0"})-[r:HAS_TEST_ELEMENT_CHILD*0..5]->(:Test_Element)
WITH last(NODES(path)).id AS id, CASE WHEN last(RELATIONSHIPS(path)).order IS NULL THEN 0 END AS o
ORDER BY o ASC
RETURN id

Regards,
Cobra

Basically, the order-property is changeable for us.

Currently the numbering starts in each "branch" of the tree from the scratch. What we would need is a unified system of numbering which belongs to all branches and levels.

This is a little bit tricky, because if we create a system like this

0: 000000
-1:001000
-1_1:001010
-1_2:001020
-1_3:001030
-2:002000
-2_1:002010
-2_2:002020
-2_3:002030
-3:003000
-3_1:001010
-3_2:001020
-3_3:001030

But if I add one more level, suddenly all the integers have been changed. For example, adding 2_1_1 to 2_1_3 would result in:

0: 00000000
-1:00100000
-1_1:00101000
-1_2:00102000
-1_3:00103000
-2:00200000
-2_1:00201000
-2_1_1:00201010
-2_1_2:00201020
-2_1_3:00201030
-2_2:00202000
-2_3:00203000
-3:00300000
-3_1:00101000
-3_2:00102000
-3_3:00103000

This becomes very complicated.... Don't you think so?

But I will try to do it with a simple string instead of an integer, then

00
0010
001010
001020
001030
0020
002010
002020
002030
0030
003010
003020
003030

should be sufficient, and if I add one level, it could be

00
0010
001010
001020
001030
0020
002010
00201010
00201020
00201030
002020
002030
0030
003010
003020
003030

Sorting strings is a good idea and you could add as many levels as you want.

This looks pretty good:

MATCH path=(te1:Test_Element {id: "0"})-[:HAS_TEST_ELEMENT_CHILD*0..]->(te2:Test_Element)
WITH *, nodes(path) AS path2
CALL {
    WITH path2
    RETURN
    CASE
    WHEN exists(path2[3].order) THEN toString(path2[0].order)+toString(path2[1].order)+toString(path2[2].order)+toString(path2[3].order)
    WHEN exists(path2[2].order) THEN toString(path2[0].order)+toString(path2[1].order)+toString(path2[2].order)
    WHEN exists(path2[1].order) THEN toString(path2[0].order)+toString(path2[1].order)
    ELSE toString(path2[0].order)
    END AS sort_string
}
RETURN path ORDER BY sort_string

There are only two issues with that:

  1. It's still not flexible as regards the depth of the branches;
  2. I have to implement a function to bring the strings to a proper format.

And I have implemented the {order}-property now in the nodes instead of the connecting relationships.

But to be honest, I find it really annoying that there is no function for this problem implemented. Sorting a tree over all it's branches must be a standard-problem for Graph-DBs.

It seems, I have found now the solution (back now to have the {:order}-property in the relationships:

Match path=(te1:Test_Element {id:"0"})-[:HAS_TEST_ELEMENT_CHILD*]->(te2:Test_Element)
WITH path, toString(reduce(a=1, r in relationships(path) | a*1000+r.order)) as orders
WITH path AS path2 order by orders
WITH collect(path2) AS paths
CALL apoc.convert.toTree(paths) YIELD value
RETURN value

The solution goes back to: https://stackoverflow.com/questions/31…

We have improved the algorithm now by far and I would like to share it with the community:

First of all, we have now a structure wich includes nodes with different labels:

MERGE (elem_0:Test_Element {id:"0"})

MERGE (elem_0)-[:HAS_TEST_ELEMENT_CHILD {order: 30, order_lvl_01:30}]-(elem_3:Test_Element {id: "TE_3"})
MERGE (elem_0)-[:HAS_TEST_ELEMENT_CHILD {order: 20, order_lvl_01:20}]-(elem_2:Test_Element {id: "TE_2"})
MERGE (elem_0)-[:HAS_TEST_ELEMENT_CHILD {order: 10, order_lvl_01:10}]-(elem_1:Test_Element {id: "TE_1"})

MERGE (elem_2)-[:HAS_TEST_ELEMENT_CHILD {order: 10, order_lvl_01:10}]-(elem_2_1:Test_Element {id: "TE_2_1"})
MERGE (elem_2)-[:HAS_TEST_ELEMENT_CHILD {order: 30, order_lvl_01:30}]-(elem_2_3:Test_Element {id: "TE_2_3"})
MERGE (elem_2)-[:HAS_TEST_ELEMENT_CHILD {order: 20, order_lvl_01:20}]-(elem_2_2:Test_Element {id: "TE_2_2"})

MERGE (elem_1)-[:HAS_TEST_ELEMENT_CHILD {order: 20, order_lvl_01:20}]-(elem_1_2:Test_Element {id: "TE_1_2"})
MERGE (elem_1)-[:HAS_TEST_ELEMENT_CHILD {order: 10, order_lvl_01:10}]-(elem_1_1:Test_Element {id: "TE_1_1"})
MERGE (elem_1)-[:HAS_TEST_ELEMENT_CHILD {order: 30, order_lvl_01:30}]-(elem_1_3:Test_Element {id: "TE_1_3"})

MERGE (elem_2_1)-[:HAS_TEST_ELEMENT_CHILD {order: 20, order_lvl_01:20}]-(elem_2_1_2:Test_Element {id: "TE_2_1_2"})
MERGE (elem_2_1)-[:HAS_TEST_ELEMENT_CHILD {order: 30, order_lvl_01:30}]-(elem_2_1_3:Test_Element {id: "TE_2_1_3"})
MERGE (elem_2_1)-[:HAS_TEST_ELEMENT_CHILD {order: 10, order_lvl_01:10}]-(elem_2_1_1:Test_Element {id: "TE_2_1_1"})

MERGE (elem_3)-[:HAS_TEST_ELEMENT_CHILD {order: 30, order_lvl_01:30}]-(elem_3_3:Test_Element {id: "TE_3_3"})
MERGE (elem_3)-[:HAS_TEST_ELEMENT_CHILD {order: 10, order_lvl_01:10}]-(elem_3_1:Test_Element {id: "TE_3_1"})
MERGE (elem_3)-[:HAS_TEST_ELEMENT_CHILD {order: 20, order_lvl_01:20}]-(elem_3_2:Test_Element {id: "TE_3_2"})

MERGE (elem_2_1)-[:HAS_TEST_ELEMENT_STATE {order: 10, order_lvl_02:10}]-(elem_state_2_1_1:Test_Element_State {id: "TES_2_1_1"})
MERGE (elem_2)-[:HAS_TEST_ELEMENT_STATE {order: 20, order_lvl_02:20}]-(elem_state_2_2:Test_Element_State {id: "TES_2_2"})
MERGE (elem_2)-[:HAS_TEST_ELEMENT_STATE {order: 10, order_lvl_02:10}]-(elem_state_2_1:Test_Element_State {id: "TES_2_1"})
MERGE (elem_2_1)-[:HAS_TEST_ELEMENT_STATE {order: 30, order_lvl_02:30}]-(elem_state_2_1_3:Test_Element_State {id: "TES_2_1_3"})
MERGE (elem_2)-[:HAS_TEST_ELEMENT_STATE {order: 30, order_lvl_02:30}]-(elem_state_2_3:Test_Element_State {id: "TES_2_3"})
MERGE (elem_2_1)-[:HAS_TEST_ELEMENT_STATE {order: 20, order_lvl_02:20}]-(elem_state_2_1_2:Test_Element_State {id: "TES_2_1_2"})

Again the nodes are created not in he proper order to avoid correct sorting by chance.

This is the query - if you want to create a sorting string from numbers:

MATCH (te1:Test_Element {id:"0"})
CALL apoc.path.expandConfig(te1, {relationshipFilter: "HAS_TEST_ELEMENT_CHILD>|HAS_TEST_ELEMENT_STATE"})
YIELD path
WITH path, apoc.text.rpad(toString(reduce(a = 0, r in relationships(path) | a * 1000 + COALESCE(r.order_lvl_01,0))),20,"0")+apoc.text.rpad(toString(reduce(a = 0, r in relationships(path) | a * 1000 + COALESCE(r.order_lvl_02,0))),20,"0") as orders
RETURN path, orders order by orders

If your structure goes very deep, using numbers (and creating the sorting string at the very end) could produce a memory overflow very quickly, but if your structure is flat, this could work properly.

Since our structure is deep enough to run into trouble, we have developed an algorithm to create a sorting string completely on the basis of a string:

MATCH (te1:Test_Element {id:"0"})
CALL apoc.path.expandConfig(te1, {relationshipFilter: "HAS_TEST_ELEMENT_CHILD>|HAS_TEST_ELEMENT_STATE"})
YIELD path
WITH path, apoc.text.rpad(reduce(a = "", r in relationships(path) | a + apoc.text.rpad(toString(COALESCE(r.order_lvl_01,0)),4,"0")),100,"0") + apoc.text.rpad(reduce(a = "", r in relationships(path) | a + apoc.text.rpad(toString(COALESCE(r.order_lvl_02,0)),4,"0")),100,"0") as orders
RETURN path, orders order by orders

The idea is the following: you accumulate a string on the basis of an integer which is in an order-property stored in the relationships. After having accumulated this string you sort by that.