I have the following "node hierarchy": (:MediaHost) has multiple (:Bucket)s and each bucket can have multiple (:MediaFile)s and are related like so:
(h:MediaHost)-[:HAS_BUCKET]->(b:Bucket)-[:LINKS_TO_FILE]->(f:MediaFile)
So, for getting 2 files from each bucket for a given host, and return a table, I do the following
MATCH (h: MediaHost {hostname: 'myhost.com'}) -[:HAS_BUCKET]-> (b: Bucket) -[:LINKS_TO_FILE]-> (f: MediaFile) WITH h, b, COLLECT(f)[0..2] AS files UNWIND files AS file RETURN b.name AS bucket_name, file.name ORDER BY b.name, file.name
This works well.
The files also "belong" to a (:MediaDir) node like so:
(d:MediaDir)-[:HAS_FILE]->(f:MediaFile)
I am trying to get the dir of each file like so:
MATCH (h: MediaHost {hostname: 'myhost.com'}) -[:HAS_BUCKET]-> (b: Bucket) -[:LINKS_TO_FILE]-> (f: MediaFile) WITH h, b, COLLECT(f)[0..2] AS files UNWIND files AS file MATCH (d:MediaDir)-[:HAS_FILE]->(file) RETURN b.name AS bucket_name, file.name, d.name ORDER BY b.name, file.name
Once I introduce the last MATCH section in the query to get the file's directory, I get duplicate MediaFiles.
However, without the last MATCH section, I don't get the duplicates.
Can anyone suggest what's wrong in my query?