How to search in properties of a relationship which has an array of values?

Hi,
How do you search in the relationship properties if the relationship properties has an array of values?
For e.g.

  1. In the Movie database, the Person has Acted_IN Movies.
  2. A person plays multiple roles for e.g. Keanu Reeves ACTED_IN 'The Matrix'. The 'ACTED_IN' has properties called roles, and the values are for e.g. ['Neo', 'Reo', 'Leo'].

In this case, If I want to check in the database who has ACTED_IN role ='Reo', how do we express this?

In this case, the following query will not return any result:
match (p:Person)-[a:ACTED_IN]->(m:Movie{title:'The Matrix'}) where a.roles = 'Reo' return p.name, m.title, a.roles

But if we use the
match (p:Person)-[a:ACTED_IN]->(m:Movie{title:'The Matrix'}) where a.roles = ['Neo', 'Reo', 'Leo'] return p.name, m.title, a.roles

Then we get the result back.

I donot want to query using all roles, but only one of the many roles.

Thanks
Bijay

You can use the IN keyword to test whether an element is contained in a list:

match (p:Person)-[a:ACTED_IN]->(m:Movie{title:'The Matrix'}) 
where 'Reo' IN a.roles
return p.name, m.title, a.roles

Alternately you can use the any() list predicate to see if any of the elements in the list have that value:

match (p:Person)-[a:ACTED_IN]->(m:Movie{title:'The Matrix'}) 
where any(role in a.roles WHERE role = 'Reo')
return p.name, m.title, a.roles

And if you have a list (let's say it's a list parameter) of roles and you want to make sure all of those roles were played by the same person in a movie, you can use the all() list predicate instead:

match (p:Person)-[a:ACTED_IN]->(m:Movie{title:'The Matrix'}) 
where all(role in $roleList WHERE role IN a.roles)
return p.name, m.title, a.roles

Oh, and keep in mind none of these will leverage indexes (even if these are list properties on a node).

4 Likes

Thank you! It worked.
I was trying to use UNWIND first and then search in the returned row, but that was not necessary as per your suggestion. Just for curiosity, could UNWIND be used for the relation properties and searched in the rowset returned? How would you do that?
Thank you once again!!

What about single?

where single(role in a.roles WHERE role = 'Reo')

In this particular example, single() would work, but would not be any more useful than any().

When using single() here, that means that there is exactly one match of that role in the list. In the movies graph, I believe elements in each role list are unique anyway, so this wouldn't produce any better results than any() unless you had duplicates in the list.

1 Like

You could do this, but with UNWIND you're increasing the cardinality (adding a row for every list element, but the start and end nodes and relationship would be the same for those extra rows). If you are only interested in one role, that may be fine. If you're looking for any of a set of roles, such that the multiple roles might be present in the roles list on a single relationship, then you might get back duplicate results (same person, movie, and relationship, one row per role matched), and so a means of deduplication could be needed (via aggregation or DISTINCT) which would make the query more expensive. In general for these, list predicates will be more concise and performant.

MATCH (p:Person)-[a:ACTED_IN]->(m:Movie{title:'The Matrix'}) 
UNWIND a.roles as role
WITH p, a, m
WHERE 'Reo' = role
RETURN p.name, m.title, a.roles

Would there be a method for using CONTAINS within an array property? or would unwind be required to execute CONTAINS?

Hey Eric,

Welcome to the Graph Community.

There are predicate functions in Neo4j that can perform the Contains operations for you.

You can read up on them here :

Best Regards,
Kshitiz Arora

1 Like

Here's a twisted question....

What if the movie is Kind Hearts and Coronets where Alec Guinness plays nine different roles at once?

Here, a person creating the ACTED_IN relationship might decide (badly?) that the roles property should be in a list instead of creating multiple relationships with a single string for each role. (They might decide have nine relationships of the same type with different properties clutters things up.)

CREATE (p:Person {name:'Alec Guinness', born : 1914})


CREATE (m:Movie {title: 'Kind Hearts and Coronets', released: 1949, tagline:'He chopped down the family tree'})


MATCH (m:Movie), (p:Person)
WHERE p.name = 'Alec Guinness' AND m.title STARTS WITH 'Kind'
CREATE (p)-[r:ACTED_IN {roles: [
 'Ethelred, 8th Duke of Chalfont',
 'The Reverend Lord Henry',
 'General Lord Rufus',
 'Admiral Lord Horatio',
 'Lord Ascoyne',
 'Lady Agatha D\'Ascoyne',
 'Young Ascoyne',
 'Young Henry',
 'Ethelred, 7th Duke of Chalfont'
]}]->(m)
return p,r,m

Then this doesn't return anything (I'm not sure how to make this query work either...):

[Fixed typo: Move => Movie]

MATCH (p:Person)-[a:ACTED_IN]->(m:Movie) WHERE 'Young Henry' IN a.roles RETURN p, a, m

I'm designing a schema, where I'm not sure if where some "tags" should really be entities that person nodes point to, or if the tags should be a list in a person's property. I'm not sure what the trade offs are.

This is vaguely reminiscent of 3rd Normal Form of RDBMS where having a list in a field is a no-no. Are there such recommendations for Graph DBs?

(Also annoying.... when you click on the ACTED_IN relationship, nothing shows up in the properties list except a triangle, which you are supposed to know to click on. It would be nice to show the first line of data.)

For this one, be careful of typos! Your MATCH query is using :Move instead of :Movie, if that's fixed then the query should execute as expected, though an index won't be able to be leveraged here.

(fixed Movie typo. When I don't get any results, I always start looking for typos!)

What I didn't realize, is IN will work if the attribute is not a list. (A surprise!)

So this works, even though the DB has only one role for Keanu Reeves.

MATCH (p:Person)-[a:ACTED_IN]->(m:Movie)
WHERE 'Neo' IN a.roles
RETURN p, a, m

but now I'm puzzled why this doesn't work:

MATCH (p:Person)-[a:ACTED_IN]->(m:Movie)
WHERE a.roles =  'Neo' 
RETURN p, a, m

The IN operation is designed to handle that, but the equals operator is not, as it's more strict. In most cases you should know enough about your graph that you would know when a property is a list type, and that allows you to take advantage of several functions and operations that are specific to lists.

1 Like

Is the query WHERE a.roles = 'Neo' not working because I have added to my copy of the Movie DB to have at least one instance of the ACTED_IN relationship to have a list?

It seems like this query now returns nothing for me:

MATCH (p:Person)-[a:ACTED_IN{roles:'Neo'}]->(m:Movie)
RETURN p, a, m

so I'm wondering if me making just one roles attribute into a list has made all the other roles attributes to implicitly all become lists (and require the IN operator?)

Unless you've changed the property, roles should still be a list of strings, not a single string. Your pattern is specifically looking for a roles property that is the single string 'Neo'.

If you wanted to express that it's a single-element list then you would use a list literal instead:

MATCH (p:Person)-[a:ACTED_IN{roles:['Neo']}]->(m:Movie)

Though the WHERE 'Neo' IN a.roles would be better, as that would still work even if they were present in more than one role.

And no, changing the property type in one instance of a relationship (or node) does not in any way affect the property type in any of the other instances.

1 Like

This would really be a good point to add to the manual, as I ran into this issue and only now seeing this I understand what was happening. The way you and clem break it down would be perfect to add the two examples as clem mentioned and adrew explanation as to why it failed.

Can Someone explain why we are talking about list of roles here for Keanu Reeves in 'The matrix'? when I searched the database it clearly said he has done only 1 role which is 'NEO' so why we have to search for list here? instead of just

where rel.roles='Neo'

Also How do we know that 1 property has LIST of values instead of a single value?

Query:

> match(p:Person)-[rel:ACTED_IN]->(m:Movie)
> 
> where m.title='The Matrix' and 'Neo' in rel.roles return p,m,rel

Output:

{
  "identity": 1,
  "labels": [
    "Person"
  ],
  "properties": {
"born": 1964,
"name": "Keanu Reeves"
  }
}
{
  "identity": 0,
  "labels": [
    "Movie"
  ],
  "properties": {
"tagline": "Welcome to the Real World",
"title": "The Matrix",
"released": 1999
  }
}
{
  "identity": 0,
  "start": 1,
  "end": 0,
  "type": "ACTED_IN",
  "properties": {
"roles": [
      "Neo"
    ]
  }
}

The datatype of 'role' property is set as an array to facilitate storing of multiple roles played by an actor in a the selected movie. Also, the array can have one or many values in the list. To check the number of values in the property 'role', you can use size(a.role) to get the number of values stored in that array. Setting it to an array simplifies the query as this accommodates one or many values.