I want to find nodes whose values for a given property are similar.
I get the following error when querying nodes through a full-text index:
Query:
MATCH (e: Email)
WITH e
CALL db.index.fulltext.queryNodes('convEmailFtIndex', e.convEmail)
YIELD node, score
RETURN e.email, node.convEmail, node.email, score ORDER BY e.email LIMIT 1000
Error:
ERROR
Neo.ClientError.Procedure.ProcedureCallFailed
Failed to invoke procedure db.index.fulltext.queryNodes: Caused by: org.apache.lucene.queryparser.classic.ParseException: Encountered "<EOF>" at line 1, column 0.
Was expecting one of:
<NOT> ...
"+"...
"-"...
<BAREOPER> ...
"(" ...
"*"...
<QUOTED>...
<TERM>...
<PREFIXTERM>...
<WILDTERM>...
<REGEXPTERM>...
"["...
"{"...
<NUMBER>...
<TERM>...
Each Email node has a property called convEmail, which contains only the alphabetic characters of the email ID, with digits and special characters removed.
This is how I created the full-text index: CALL db.index.fulltext.createNodeIndex('convEmailFtIndex', ['Email'], ['convEmail'], {analyzer: 'standard-no-stop-words'});
I would appreciate any help resolving this. Thank you.
Hi @cobra, thanks for the response.
I am using Neo4j version 4.2.5.
Sorry, I cannot share the email data because it is customer data from our business.
One thing I can confirm is that I have preprocessed the emails so that convEmail contains only alphabetic characters, for better querying.
However, if I query the full-text index with one specific email ID, instead of running the query across all Email nodes and returning the top 1000 records, I do get results.
In our use case, though, we need to find the top matching similar emails for each email so that we can merge those nodes.
(P.S. I have already tried APOC text functions such as Jaro-Winkler to measure the similarity between two emails, but it takes too much time. Our database has more than 10 million email IDs, and we need the top matching email for each one.)
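One possibility I have not ruled out: the parser reporting "<EOF>" at column 0 suggests that Lucene received an empty query string, which could happen if some Email nodes have an empty convEmail after preprocessing (for example, an email ID made up only of digits). That would also fit the fact that a specific email works while the per-node query fails. A minimal sketch of the same query with those nodes skipped, assuming that is really the cause (the WHERE filters and the self-match exclusion are my additions, not something I have verified on our data):

```cypher
// Sketch only: assumes the error comes from empty or whitespace-only convEmail values.
MATCH (e:Email)
WHERE e.convEmail IS NOT NULL AND trim(e.convEmail) <> ''
CALL db.index.fulltext.queryNodes('convEmailFtIndex', e.convEmail)
YIELD node, score
// Drop the trivial hit where a node matches itself.
WITH e, node, score
WHERE node <> e
RETURN e.email, node.convEmail, node.email, score
ORDER BY e.email
LIMIT 1000
```

If counting the Email nodes that fail the first WHERE filter gives a non-zero number, that would support this assumption.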
It would be great if you could help me solve this problem. Thank you.