Hello everybody!
I recently started using Neo4j in my application and I'm working on a rather complex query. So far I've managed to get it to work by breaking it down into smaller queries and gluing them together with some Ruby. I'm guessing there has to be a way to do it all in Cypher, but I haven't figured out how yet, and I hope you'll be able to help me.
In my app, I have Item nodes and two types of relations between them: :SIMILAR_TO and :CONNECTED_TO. The :SIMILAR_TO relation has a Float attribute called value.
The requirement is that when two Items are similar enough, they are considered identical. Basically, given a number ($some_number), I need to collapse all the nodes that have a :SIMILAR_TO relation between them with a value less than that number. Then I render the resulting graph with all Item nodes for the selected IDs (in $item_ids) and only the :CONNECTED_TO relations between them.
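To make the requirement concrete, here is a minimal Ruby sketch of the intended result on plain arrays, with no Neo4j involved. The function name collapse_items and the input shapes are hypothetical, made up for illustration. It also rests on a few assumptions that the description above does not settle: similarity is treated as transitive (if A is close to B and B is close to C, all three merge), each merged group is represented by its smallest ID, and :CONNECTED_TO relations that end up inside a single group are dropped.

```ruby
# Hypothetical in-memory stand-in for the graph (not Neo4j or APOC code).
#   item_ids:  array of selected item IDs
#   similar:   array of [from_id, to_id, value] for :SIMILAR_TO
#   connected: array of [from_id, to_id] for :CONNECTED_TO
def collapse_items(item_ids, similar, connected, threshold)
  # Every item starts as the representative of its own group.
  parent = item_ids.to_h { |id| [id, id] }

  # Follow parent links until reaching the representative of a group.
  find = lambda do |id|
    id = parent[id] while parent[id] != id
    id
  end

  # Merge two selected items when their similarity value is below the threshold.
  similar.each do |from, to, value|
    next unless value < threshold
    next unless parent.key?(from) && parent.key?(to)

    root_a = find.call(from)
    root_b = find.call(to)
    # Assumption: the smaller ID becomes the representative of the merged group.
    parent[[root_a, root_b].max] = [root_a, root_b].min unless root_a == root_b
  end

  # Items without any qualifying :SIMILAR_TO stay as single-member groups.
  groups = item_ids.group_by { |id| find.call(id) }

  # Re-point :CONNECTED_TO at the group representatives, dropping
  # relations inside one group (an assumption) and duplicates.
  edges = connected
          .select { |from, to| parent.key?(from) && parent.key?(to) }
          .map { |from, to| [find.call(from), find.call(to)] }
          .reject { |from, to| from == to }
          .uniq

  { groups: groups, edges: edges }
end

result = collapse_items(
  [1, 2, 3, 4, 5],
  [[1, 2, 0.1], [2, 3, 0.2], [4, 5, 0.9]],
  [[1, 4], [3, 4], [2, 3], [5, 1]],
  0.5
)
p result[:groups] # items 1, 2, 3 merge; 4 and 5 stay on their own
p result[:edges]  # only :CONNECTED_TO relations between groups remain
```

Items 4 and 5 are not similar enough to anything, yet they still appear in the result, which is exactly the part the single-query attempts tend to lose.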
This is how I collapse the 'identical' nodes:
MATCH (i:Item)-[s:SIMILAR_TO]->(j:Item)
WHERE s.value < $some_number
AND id(i) IN $item_ids
AND id(j) IN $item_ids
WITH i + collect(j) AS identical
CALL apoc.nodes.collapse(identical, { properties: 'combine' })
YIELD from, rel, to
RETURN from, rel, to
My problem with this query is that:
(1) it leaves out other items with the provided IDs if they don't have a :SIMILAR_TO relation satisfying the value criterion, so they end up orphaned because the end node IDs on :SIMILAR_TO don't match the IDs of the virtual nodes, which have negative IDs (at least that's what I think is happening), and
(2) it returns all relationships in rel, i.e. both :SIMILAR_TO and :CONNECTED_TO.
What I'm trying to do is something like:
MATCH (i:Item)-->(j:Item)
WHERE id(i) IN $item_ids
AND id(j) IN $item_ids
// Then collapse these nodes
MATCH (i)-[s:SIMILAR_TO]->(j)
WHERE s.value < $some_number
WITH i + collect(j) AS identical
CALL apoc.nodes.collapse(identical, { properties: 'combine' })
YIELD from, rel, to
RETURN from, rel, to
// Then return the results in this format
MATCH (from)-[rel:CONNECTED_TO]->(to)
As I said, so far I've been able to do it by breaking it down into three separate queries and passing the results from one to the next one in Ruby. It works, but it's not the cleanest piece of code.
I'm sure there's a nice way to do it in Cypher and I hope you'll be able to help me. Thanks in advance!