Can Neo4j find specific words/terms within fields of a TSV file containing sentences or paragraphs?

Here's the situation:

  1. Some of the columns of my TSV file contain sentences or paragraphs ("sentence file").
  2. Another of my TSV files is a dictionary of single or multi-word terms of interest ("dictionary file").

Can Neo4j identify just the words/terms from the dictionary file that appear in the context of the sentence file?

If so:

A) Is there a specific Cypher query you can provide to get me started in the right direction?

B) Is there anything special I need to do/prepare in either of the two TSV files to make this possible? (I don't think I could possibly provide a stopwords list that would help in this situation, by the way.)

I am brand new to Neo4j, so please explain like I'm five. ;)

Neo4j can search a text string, such as your sentences, for a phrase (such as one of your dictionary terms) using the string predicate CONTAINS. You could index the property that contains the sentences to improve search performance.

https://neo4j.com/docs/cyph/indexes/
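
As a minimal sketch of what that could look like, assuming the sentences are stored on nodes labelled Sentence in a property called text (both names, and the index name sentence_text, are assumptions for illustration). The TEXT index type needs Neo4j 4.4 or later.

```cypher
// find every sentence whose text includes the literal string "cat"
// note: CONTAINS is case-sensitive and also matches inside longer words
MATCH (s:Sentence)
WHERE s.text CONTAINS 'cat'
RETURN s.text;

// optional: a text index on the searched property (Neo4j 4.4+)
// "sentence_text" is a hypothetical index name
CREATE TEXT INDEX sentence_text IF NOT EXISTS
FOR (s:Sentence) ON (s.text);
```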

You can also use a full-text index for more flexible searching of your sentences.
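
A rough sketch of the full-text route, again assuming Sentence nodes with a text property; the index name sentenceFulltext is made up for this example. This uses the CREATE FULLTEXT INDEX syntax from Neo4j 4.3 and later.

```cypher
// build a full-text index over the sentence text
// "sentenceFulltext" is a hypothetical index name
CREATE FULLTEXT INDEX sentenceFulltext IF NOT EXISTS
FOR (s:Sentence) ON EACH [s.text];

// search it; the inner double quotes ask for the exact phrase,
// which is one way to look up a multi-word dictionary term
CALL db.index.fulltext.queryNodes('sentenceFulltext', '"sunny today"')
YIELD node, score
RETURN node.text AS sentence, score
ORDER BY score DESC;
```

With the default analyzer the text is split into words and lowercased, so this kind of search matches whole words regardless of case, which plain CONTAINS does not.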

Your first step is to design a data model for how you want to store the data.

After that, learn Cypher to build your data model and write search queries.


not sure exactly what you mean to do, but suppose these are your sentences.tsv and dictionary.tsv files

sentence
test sentence
Kitty is a cat
It's sunny today
Adam has a cat

word
cat
sunny

you need to put both in the import directory of your db

then run this to import them as nodes

// import each row of sentences.tsv as a Sentence node
LOAD CSV WITH HEADERS FROM "file:///sentences.tsv" AS line FIELDTERMINATOR '\t'
CREATE (:Sentence {text: line.sentence});

// import each row of dictionary.tsv as a Word node
LOAD CSV WITH HEADERS FROM "file:///dictionary.tsv" AS line FIELDTERMINATOR '\t'
CREATE (:Word {text: line.word});

with the sample files above, this gives you four Sentence nodes and two Word nodes, not yet connected to each other

then you can create links between words in the dictionary and sentences that contain them

MATCH (w:Word)
MATCH (s:Sentence)
WHERE s.text CONTAINS w.text
CREATE (s)-[:CONTAINS]->(w)
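
One thing to watch: CONTAINS is case-sensitive and matches substrings, so "cat" would also link to a sentence containing "category" and would miss "Cat". If that matters, a possible variant (a sketch only, and it assumes the dictionary terms contain no regular-expression special characters) uses the regex operator instead:

```cypher
MATCH (w:Word)
MATCH (s:Sentence)
// (?i) ignores case, (?s) lets . span line breaks, \b marks a word boundary
WHERE s.text =~ ('(?is).*\\b' + w.text + '\\b.*')
// MERGE instead of CREATE so re-running does not duplicate relationships
MERGE (s)-[:CONTAINS]->(w)
```

If only case is a concern, comparing toLower(s.text) with toLower(w.text) in the CONTAINS version is a simpler alternative.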

then here is how you would get the list of sentences that contains words
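
A query along these lines should do it, assuming the CONTAINS relationships from the previous step exist (the column names sentence and words are just illustrative aliases):

```cypher
// one row per sentence, with the dictionary words found in it
MATCH (s:Sentence)-[:CONTAINS]->(w:Word)
RETURN s.text AS sentence, collect(w.text) AS words
```

Sentences with no dictionary word, such as "test sentence" in the sample data, have no CONTAINS relationship and so do not appear in the result.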

Actually, you do not need to import words, you can run this with just sentences imported
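
As a sketch of that approach, assuming the Sentence nodes are already imported and dictionary.tsv is still in the import directory, the dictionary can be read row by row and matched directly:

```cypher
// read each dictionary term straight from the file, without creating Word nodes
LOAD CSV WITH HEADERS FROM "file:///dictionary.tsv" AS line FIELDTERMINATOR '\t'
MATCH (s:Sentence)
WHERE s.text CONTAINS line.word
RETURN line.word AS word, collect(s.text) AS sentences
```

With the sample files, "cat" should come back with the two cat sentences and "sunny" with "It's sunny today".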