Storing an array of string arrays as a node property

I have a huge amount of text data in my database, so calculating the bigrams and trigrams of those texts on demand is very expensive, as you can imagine. I am therefore looking for a way to import all the n-grams into the nodes and then group them in the database to get the most frequent n-grams (for example, the top 5). Here is an example:

For a sentence like the following:
"you have no idea how much i love neo4j graph database"

I want to import the n-grams with this query:

match (p)
where p.Id = 'da786ecb-7965-4ab6-84fd-3811da9b31a0'
set p.Ngrams = [['you', 'have'], ['you', 'have', 'no'], ['have', 'no'], ['have', 'no', 'idea'], ['no', 'idea'], ['no', 'idea', 'how'], ['idea', 'how'], ['idea', 'how', 'much'], ['how', 'much'], ['how', 'much', 'i'], ['much', 'i'], ['much', 'i', 'love'], ['i', 'love'], ['i', 'love', 'neo4j'], ['love', 'neo4j'], ['love', 'neo4j', 'graph'], ['neo4j', 'graph'], ['neo4j', 'graph', 'database'], ['graph', 'database']]

The query gives me the following error:

neo4j.exceptions.CypherTypeError: {code: Neo.ClientError.Statement.TypeError} {message: Property values can only be of primitive types or arrays thereof}

Any idea how to solve this? Thanks.

Hello @taylangezici1 :slight_smile:

You can store a list of strings, but you cannot store a list of lists of strings. In your case, you should create a new node type, NGram, store one n-gram per node, and then link these new nodes to the node that holds the sentence.
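
A minimal sketch of that model in Python, since the error above comes from the Python driver. The `HAS_NGRAM` relationship type, the `text` and `size` properties, the `$node_id` and `$ngrams` parameter names, and the helper functions are illustrative assumptions, not something from the original post; `session` is assumed to be an already open `neo4j` driver session.

```python
def ngrams(sentence, sizes=(2, 3)):
    """Return the n-grams of `sentence` as space-joined strings."""
    tokens = sentence.split()
    result = []
    for n in sizes:
        for i in range(len(tokens) - n + 1):
            result.append(" ".join(tokens[i:i + n]))
    return result


# Hypothetical schema: (p)-[:HAS_NGRAM]->(:NGram {text, size}).
# MERGE on the relationship links each distinct n-gram to the node once,
# even if the n-gram occurs several times in the same text.
LINK_NGRAMS = """
MATCH (p {Id: $node_id})
UNWIND $ngrams AS gram
MERGE (g:NGram {text: gram})
  ON CREATE SET g.size = size(split(gram, ' '))
MERGE (p)-[:HAS_NGRAM]->(g)
"""

# Counts how many nodes are linked to each n-gram and keeps the top 5.
TOP_NGRAMS = """
MATCH ()-[:HAS_NGRAM]->(g:NGram)
RETURN g.text AS ngram, count(*) AS freq
ORDER BY freq DESC
LIMIT 5
"""


def link_ngrams(session, node_id, sentence):
    """Send the n-grams of `sentence` to the database as a flat list of strings."""
    return session.run(LINK_NGRAMS, node_id=node_id, ngrams=ngrams(sentence))
```

Because every n-gram is passed as a single string, the parameter is a plain list of strings, which avoids the nested-list type error. An index or uniqueness constraint on the n-gram text would likely help the `MERGE` at this data volume.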


Cobra's idea is probably the best one from a graph performance perspective.

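However, it is possible to build a list-of-lists-like structure in Cypher by using a map.

The main syntactic difference is that a map is wrapped in curly braces instead of brackets. Note, though, that a map is not a primitive type either, so it cannot be stored directly as a property value; it would have to be serialized first (for example, to a JSON string). There are other performance implications as well.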
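
If the n-grams have to stay on the node itself, two workarounds fit within the "primitive types or arrays thereof" rule from the error message: flatten each n-gram into one string, or serialize the whole nested list into a single JSON string. A rough sketch in Python; the function names and the space delimiter are assumptions made for illustration.

```python
import json


def flatten_ngrams(nested, sep=" "):
    """Turn [['you', 'have'], ...] into ['you have', ...], a list of strings."""
    return [sep.join(gram) for gram in nested]


def restore_ngrams(flat, sep=" "):
    """Inverse of flatten_ngrams, assuming no token contains `sep`."""
    return [gram.split(sep) for gram in flat]


def to_json_property(nested):
    """Serialize the nested list into one string property value."""
    return json.dumps(nested)


def from_json_property(value):
    """Parse the string property back into a nested list on the client."""
    return json.loads(value)
```

The flattened list can be passed as a query parameter (for example `SET p.Ngrams = $ngrams`) and can still be expanded with `UNWIND` for grouping. The JSON string keeps the exact nesting but is just text to plain Cypher, so it generally has to be parsed on the client side.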