Best Practices for optimising full text search in Neo4j

Greetings Neo4j!


I am currently building a search application on top of the SNOMED CT International Full RF2 release. The database is very large, so I decided to use a full-text search index for the best results. There are primarily 3 types of nodes in a SNOMED CT database:

  • ObjectConcept
  • Descriptions
  • Role Group
    There are multiple relationships between the nodes, but I will focus on those later.
    Currently I'm focusing on searching for ObjectConcept nodes by a string property called FSN, which stands for fully specified name. For this I tried two things:
  • Create text indexes on FSN: With MATCH queries the results were rather slow, even when I was using the CONTAINS predicate and limiting the number of returned results to 15.
  • Create full-text indexes: According to the docs, the FT indexes are powered by Apache Lucene. I created one for FSN. When I use the FT index with AND clauses in the search term, for example:
    >Search Term : head pain
    >Query Term: pain AND headache
    I observe an impressive improvement in query time in the profiler in Neo4j Browser (from around 43 ms to 10 ms for some queries). However, once I start querying the database through the Apollo server, query times go as high as 2 to 3 s.
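
For reference, here is a rough sketch of the kind of statement that creates such an index. The index name `searchIndex`, the `ObjectConcept` label and the `FSN` property are the ones used in this post; the helper name `fulltextIndexStatement` is made up for illustration, and the sketch assumes a Neo4j version that supports the `CREATE FULLTEXT INDEX` syntax.

```javascript
// Hypothetical helper (not part of any library): builds the Cypher statement
// that creates a full-text index over one or more properties of a label.
function fulltextIndexStatement(indexName, label, properties) {
  // Each property is referenced through the node variable n, e.g. n.FSN
  const props = properties.map((p) => `n.${p}`).join(', ')
  return `CREATE FULLTEXT INDEX ${indexName} IF NOT EXISTS FOR (n:${label}) ON EACH [${props}]`
}

// Names taken from this post: index 'searchIndex' on ObjectConcept.FSN
const statement = fulltextIndexStatement('searchIndex', 'ObjectConcept', ['FSN'])
console.log(statement)
```

The resulting string would be run once, for example from Neo4j Browser or through `session.run(statement)`.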

    The query is as follows, implemented by a custom resolver in neo4j/graphql and apollo-server:


    const session = context.driver.session()
    // Join the search words with AND so that every word must match
    const words = args.name.trim().split(/\s+/)
    let compoundQuery = words.join(" AND ")
    console.log(compoundQuery)
    // Additionally restrict the results by the requested type
    compoundQuery += ` AND (${args.type})`
    return session
      .run(
        `CALL db.index.fulltext.queryNodes('searchIndex', $name) YIELD node, score
        RETURN node
        LIMIT 10
        `,
        { name: compoundQuery }
      )
      .then((res) => res.records.map((record) => record.get('node').properties))
      // Close the session whether the query succeeds or fails
      .finally(() => session.close())
    }
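
The query-building step above can also be pulled out into a standalone function, which makes it easy to test without a database. The sketch below additionally escapes characters that have a special meaning in the Lucene query syntax, on the assumption that user input may contain them (SNOMED CT names often include parentheses, for example). The names `escapeLucene` and `buildCompoundQuery` are hypothetical, and this is only one possible way to do it.

```javascript
// Characters with special meaning in the Lucene query syntax.
const LUCENE_SPECIAL = /[+\-&|!(){}\[\]^"~*?:\\\/]/g

// Prefix every special character with a backslash so that Lucene
// treats it as literal text rather than as an operator.
function escapeLucene(term) {
  return term.replace(LUCENE_SPECIAL, '\\$&')
}

// Split the search text on whitespace, escape each word,
// and require all words to match by joining them with AND.
function buildCompoundQuery(name) {
  return name
    .trim()
    .split(/\s+/)
    .filter((w) => w.length > 0)
    .map(escapeLucene)
    .join(' AND ')
}

console.log(buildCompoundQuery('head pain'))      // head AND pain
console.log(buildCompoundQuery('pain (finding)')) // pain AND \(finding\)
```

Only the user-supplied words are escaped here. The `AND (${args.type})` part would still be appended afterwards, since its parentheses are meant to be interpreted by Lucene.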


    I have the following questions:

  • Am I getting as much out of the FT indexes as I can, or am I missing important optimisations?
  • I was trying to use Elasticsearch with Neo4j, but I read that Elasticsearch and the FT indexes are both powered by Lucene. So am I likely to gain anything from using Elasticsearch? If so, how should I go about it, considering that I am using Neo4j AuraDB and my GraphQL server is on EC2? I am confused about how to use Elasticsearch with the GRANDstack in general. Any help would be appreciated.
  • Any other suggestions for optimising search would be greatly appreciated.

    Thanks in advance!