Help with Building Knowledge Graph for Unstructured Data using Neo4j API

saleh.samer · October 6, 2025, 10:05pm

Hi,

I'm trying to use the Neo4j API to generate a knowledge graph out of unstructured data. I tried to follow the tutorial and the example listed in the developer guide. I'm using the SimpleKGPipeline as follows:

pipeline = SimpleKGPipeline(
    driver=driver,
    llm=llm,
    prompt_template=ERExtractionTemplate(system_instructions=system_instr),
    schema={
        "node_types": entities,
        "relationship_types": relations,
        "patterns": patterns,
        "additional_node_types": False,
    },
    from_pdf=False,
    embedder=embedder,
)

For the entity properties, I did not define any, as I would like to get all the properties listed with a given entity. I know this can be done because I was able to do it through the LangChain Neo4j wrapper when defining the LLMGraphTransformer as follows:

llm_transformer = LLMGraphTransformer(
    llm=llm,
    # Example node def: [{'label': 'EQUIPMENT', 'description': '...', 'additional_properties': True}
    allowed_nodes=allowed_nodes,
    allowed_relationships=allowed_relationships,
    node_properties=True,  # <=== Captures all properties
    strict_mode=True,  # Set to True if you want ONLY these types
    additional_instructions=additional_instr,
)
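On the SimpleKGPipeline side, the example node definition quoted above suggests one possible approach: give every node type an `additional_properties` flag instead of listing properties by hand. The following is only a sketch under that assumption. The helper `open_node_types` and the `LOCATION` label are hypothetical, and whether the flag is honoured depends on the installed neo4j-graphrag version.

```python
from typing import Any, Dict, List, Optional


def open_node_types(
    labels: List[str],
    descriptions: Optional[Dict[str, str]] = None,
) -> List[Dict[str, Any]]:
    """Build one node-type dict per label, each allowing extra properties."""
    descriptions = descriptions or {}
    return [
        {
            "label": label,
            "description": descriptions.get(label, ""),
            # Assumption: the schema in use honours this key, as in the
            # example node definition from the post.
            "additional_properties": True,
        }
        for label in labels
    ]


# Hypothetical labels; replace with the real entity list.
entities = open_node_types(
    ["EQUIPMENT", "LOCATION"],
    {"EQUIPMENT": "A physical asset"},
)
print(entities)
```

The resulting list could then be passed as the `"node_types"` entry of the schema dict shown earlier.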

Another thing I've noticed when using LangChain is that my entity resolution works much better and my KG looks more connected than what I get with the SimpleKGPipeline, where I get more isolated clusters.

What am I missing? Can someone point me in the right direction, please?
Thanks

XavPil · October 14, 2025, 3:39am

For text (from_pdf=False) or PDF (from_pdf=True), I tried this and it worked with or without a schema:
if SCHEMA is None:
    kg_builder = SimpleKGPipeline(
        llm=llm,
        driver=NEO4J_DRIVER,
        embedder=EMBEDDINGS,
        from_pdf=True,  # False if text
        neo4j_database=DATABASE,
    )
else:
    kg_builder = SimpleKGPipeline(
        llm=llm,
        driver=NEO4J_DRIVER,
        embedder=EMBEDDINGS,
        schema=SCHEMA,
        from_pdf=True,  # False if text
        neo4j_database=DATABASE,
    )
# Inside an async function, for text:
return await kg_builder.run_async(text=content)
# OR, for a PDF:
return await kg_builder.run_async(file_path=file_path)
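The two branches above differ only in whether `schema` is passed, so the arguments could be assembled once and the key added conditionally. This is a minimal sketch; `pipeline_kwargs` is a hypothetical helper, and the keyword names are simply those used in the snippet above.

```python
from typing import Any, Dict, Optional


def pipeline_kwargs(
    llm: Any,
    driver: Any,
    embedder: Any,
    database: str,
    from_pdf: bool,
    schema: Optional[Any] = None,
) -> Dict[str, Any]:
    """Collect the keyword arguments, adding `schema` only when one is given."""
    kwargs: Dict[str, Any] = {
        "llm": llm,
        "driver": driver,
        "embedder": embedder,
        "from_pdf": from_pdf,
        "neo4j_database": database,
    }
    if schema is not None:
        kwargs["schema"] = schema
    return kwargs


# Placeholder strings stand in for the real llm, driver and embedder objects.
without_schema = pipeline_kwargs("llm", "driver", "embedder", "neo4j", from_pdf=False)
with_schema = pipeline_kwargs(
    "llm", "driver", "embedder", "neo4j", from_pdf=False,
    schema={"node_types": ["EQUIPMENT"]},
)
print(sorted(without_schema))
print(sorted(with_schema))
```

With real objects in place of the placeholders, the call would become `kg_builder = SimpleKGPipeline(**pipeline_kwargs(...))`.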

nathalie.charbel · October 20, 2025, 10:54am

For the entity resolution part, we have a couple of resolvers that you can try out. The SimpleKGPipeline uses the default one, which merges nodes that have the same label and exactly the same name property. Unfortunately, you cannot yet customise the resolver component in SimpleKGPipeline. You can instead skip it when you run the pipeline, then run the more advanced resolver components afterwards:
pipeline = SimpleKGPipeline(
    # ...
    perform_entity_resolution=False,
    # ...
)
Then you can test the different resolvers separately once the KG is written to the database:
# run fuzzy match for entity resolution
# resolver = FuzzyMatchResolver(driver)
# run semantic match for entity resolution
resolver = SpaCySemanticMatchResolver(driver)
res = await resolver.run()
If needed, you can also configure similarity_threshold (for the advanced resolvers), resolve_properties (the list of properties to consider for the resolution), and filter_query (to run the resolution on a specific part of the graph).
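To see why exact-name matching can leave isolated clusters while a similarity threshold merges more nodes, here is a small standalone illustration. It is not the library's implementation: it uses the standard library's `difflib` as a stand-in similarity measure, and the function names and sample entity names are made up for the example.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def exact_groups(names: List[str]) -> List[List[str]]:
    """Group names only when the strings are identical."""
    groups: Dict[str, List[str]] = {}
    for name in names:
        groups.setdefault(name, []).append(name)
    return list(groups.values())


def fuzzy_groups(names: List[str], similarity_threshold: float = 0.8) -> List[List[str]]:
    """Greedily add each name to the first group whose first member is similar enough."""
    groups: List[List[str]] = []
    for name in names:
        for group in groups:
            ratio = SequenceMatcher(None, name.lower(), group[0].lower()).ratio()
            if ratio >= similarity_threshold:
                group.append(name)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append([name])
    return groups


# Hypothetical entity names as an LLM might extract them from different chunks.
names = ["Pump P-101", "pump P-101", "Pump P101", "Compressor C-7"]
print(len(exact_groups(names)), "groups with exact matching")
print(len(fuzzy_groups(names)), "groups with fuzzy matching")
```

In this toy data, the three spellings of the pump stay separate under exact matching and collapse into one group at a threshold of 0.8; lowering or raising `similarity_threshold` trades missed merges against false merges.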
