Error while cypher-querying nodes using full-text index

awesomeanonymously88 · October 8, 2021, 12:11pm

Here is my use case for the full-text index:

I want to find nodes with similar values inside a property.

I get the following error when querying nodes using the full-text index:

Query:

MATCH (e: Email)
WITH e
CALL db.index.fulltext.queryNodes('convEmailFtIndex', e.convEmail)
YIELD node, score
RETURN e.email, node.convEmail, node.email, score ORDER BY e.email LIMIT 1000

Error:

ERROR 
Neo.ClientError.Procedure.ProcedureCallFailed
Failed to invoke procedure db.index.fulltext.queryNodes: Caused by: org.apache.lucene.queryparser.classic.ParseException: Encountered "<EOF>" at line 1, column 0.
Was expecting one of:
<NOT> ...
"+"...
"-"...
<BAREOPER> ...
"(" ...
"*"...
<QUOTED>...
<TERM>...
<PREFIXTERM>...
<WILDTERM>...
<REGEXPTERM>...
"["...
"{"...
<NUMBER>...
<TERM>...

My Email nodes have a property called convEmail, which contains only the alphabetic characters of the email ID, excluding digits and special characters.

This is how I created the full-text index:
CALL db.index.fulltext.createNodeIndex('convEmailFtIndex', ['Email'], ['convEmail'], {analyzer: 'standard-no-stop-words'});

It would be very helpful if someone could help me resolve this. Thank you.

Cobra · October 8, 2021, 7:36pm

Hello @awesomeanonymously88

Your query looks good.
Can you provide some data to recreate your Email nodes please?
Which version of Neo4j are you using?

Regards,
Cobra

awesomeanonymously88 · October 8, 2021, 8:17pm

Hi @Cobra, thanks for the response.
I am using Neo4j version 4.2.5.
And sorry, I cannot provide email information as those are customer data in our business.
One thing I can confirm is that I have preprocessed the emails to contain only letters for better querying.

However, if I pass one specific email ID to the full-text index query, instead of feeding in every node's value to get the top 1000 records, I do get results.

But in our use case, we need to find the top matching (most similar) emails for each email in order to merge those nodes together.

(PS: I have already tried APOC text functions such as Jaro-Winkler to compute the similarity between two emails, but unfortunately that takes too much time. Our database holds more than 10 million email IDs, and we need to find the top matching email for each one.)

It would be great if you could help me solve this problem. Thank you.

Cobra · October 8, 2021, 8:53pm

This is my dataset: email.txt (26.1 KB) (you must change the file extension to .csv).

I used Neo4j version 4.3.5.

First, I created the nodes:

LOAD CSV WITH HEADERS FROM 'file:///email.csv' AS row
WITH row
MERGE (c:Email {id: row.id})
SET c.email = row.email

Then, I created the index:

CREATE FULLTEXT INDEX email FOR (n:Email) ON EACH [n.email]

Finally, I tested your query:

MATCH (e:Email)
CALL db.index.fulltext.queryNodes('email', e.email)
YIELD node, score
RETURN e.email, node.email, score
ORDER BY e.email

Everything worked on my side; this is the result: export.txt (195.0 KB)

I also tried with the option:

CREATE FULLTEXT INDEX email FOR (n:Email) ON EACH [n.email] OPTIONS {indexConfig: {`fulltext.analyzer`: 'url_or_email'}}

The query also worked.

Regards,
Cobra

awesomeanonymously88 · October 8, 2021, 10:49pm

Thank you.
Btw do you have any idea about the error I posted? Like what it says or what needs to be done?

Cobra · October 9, 2021, 5:56am

Did you try another analyser in the option?
Can you try to update your database to the latest version?
I think it could also come from your data.
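
To make the data hypothesis concrete: Lucene's classic query parser reports `Encountered "<EOF>" at line 1, column 0` when it is handed an empty query string. So one possible cause (an assumption, since the data can't be shared) is an Email node whose convEmail is empty, for example because nothing was left after stripping digits and special characters. A minimal check, reusing the label and property names from the original post:

```cypher
// Count Email nodes whose convEmail is missing, empty, or whitespace-only.
// Assumption: such values are what the Lucene parser is choking on.
MATCH (e:Email)
WHERE e.convEmail IS NULL OR trim(e.convEmail) = ''
RETURN count(e) AS suspectNodes;
```

If this returns a non-zero count, those nodes are the first place to look.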

Can you try adding a LIMIT 2 after the WITH e and tell me if that works?
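
For reference, the LIMIT 2 test applied to the original query might look like the first statement below. The second statement is a sketch of a guarded variant that skips missing or blank values before calling the procedure. It assumes empty convEmail strings turn out to be the cause, so treat it as something to try rather than a confirmed fix.

```cypher
// 1) Run the lookup for two Email nodes only, to see whether the call itself works.
MATCH (e:Email)
WITH e LIMIT 2
CALL db.index.fulltext.queryNodes('convEmailFtIndex', e.convEmail)
YIELD node, score
RETURN e.email, node.convEmail, node.email, score;

// 2) Guarded variant: never pass a missing or blank string to the Lucene parser.
MATCH (e:Email)
WHERE e.convEmail IS NOT NULL AND trim(e.convEmail) <> ''
WITH e
CALL db.index.fulltext.queryNodes('convEmailFtIndex', e.convEmail)
YIELD node, score
RETURN e.email, node.convEmail, node.email, score
ORDER BY e.email LIMIT 1000;
```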
