We are building a GraphRAG application to make a collection of old documents accessible in an intelligent manner. The "diffbot" method from the "LLM Knowledge Graph Builder" doesn't work for us, since Diffbot only knows public entities (not the ones from our organization: products, customers, ...).
So we trained a spaCy model for NER, and we now get a list of named entities for every document and paragraph. However, since the spelling of the extracted entities sometimes differs from the name or title attributes of the nodes in our graph, I am wondering whether there is research or best practice on how to effectively onboard or link entities from incoming text documents to existing graph nodes. Levenshtein distance? Application logic? Or is there something helpful available in Neo4j itself?
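For illustration, here is a minimal sketch of the fuzzy-matching step I have in mind, using only Python's standard library (the entity and node names are made up; a real pipeline would presumably use a dedicated library such as rapidfuzz, or a similarity function on the database side):

```python
from difflib import get_close_matches

def link_entity(entity, node_names, cutoff=0.8):
    """Return the best-matching graph node name for an extracted entity,
    or None if nothing is similar enough (cutoff is a 0..1 ratio)."""
    matches = get_close_matches(entity, node_names, n=1, cutoff=cutoff)
    return matches[0] if matches else None

# Hypothetical node names loaded from the graph:
nodes = ["Acme Corporation", "WidgetPro 3000", "Jane Doe"]

# An NER result with a spelling variant of an existing node:
print(link_entity("Acme Corp.", nodes, cutoff=0.6))  # "Acme Corporation"
```

Of course this only covers surface-level spelling variants; abbreviations, aliases, and renamed products would still need an alias table or embedding-based matching on top.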
BTW, the combination of LLMs and 'traditional' NLP methods, as in this case, looks quite promising.
Thanks for any input!
Cheers,
Chris