Vector Search with Traverse Functionality using Python Libraries

Hello Neo4j Community,

I am learning the basics and looking forward to your expertise.

Working on a project, I am leveraging RAG with LangChain to query nodes in Neo4j.

Please see the Python code at the end of this message.

Here’s the scenario I am dealing with:

I have a Node B (Income Statement) with a text property and an embedding property populated. Using LangChain in a Python script, I can easily query this node and retrieve relevant information.

Now I have added Node A, which is connected to Node B. Node A has properties, such as a year property with a value (e.g., 2000). This node does not have an embedding property.

My goal is to ask a question that queries both Node B and Node A, retrieving relevant information, including the properties of both nodes (e.g., the value of year from Node A and text-related content from Node B).

What kind of libraries or LangChain functions should I use to achieve this functionality effectively? Specifically:

How can I ensure that queries traverse the connection between Node A and Node B using Python scripts?

How can I retrieve both embeddings-based content from Node B and property-based data from Node A in the same query?
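While digging, I came across the retrieval_query parameter on Neo4jVector, which looks like it could do exactly this traversal: it runs after the vector search, with `node` and `score` bound for each match, and must return `text`, `score`, and `metadata`. A minimal sketch, where the HAS_STATEMENT relationship type and the Year label/property are placeholders for whatever the real schema uses:

```python
# Sketch: traverse from the indexed node (Node B) to a connected node (Node A)
# during retrieval. HAS_STATEMENT and Year are assumptions; adjust to your schema.
RETRIEVAL_QUERY = """
MATCH (node)<-[:HAS_STATEMENT]-(a:Year)
RETURN node.text AS text,
       score,
       {year: a.year} AS metadata
"""

def get_traversal_index():
    # Imports are kept local so the sketch reads without the packages installed.
    import os
    from langchain_openai import OpenAIEmbeddings
    from langchain_community.vectorstores import Neo4jVector

    # from_existing_index reuses the vector index; retrieval_query customizes
    # what each hit returns, so Node A's properties land in the metadata.
    return Neo4jVector.from_existing_index(
        OpenAIEmbeddings(),
        url=os.getenv('NEO4J_URI'),
        username=os.getenv('NEO4J_USERNAME'),
        password=os.getenv('NEO4J_PASSWORD'),
        index_name='Vi',
        retrieval_query=RETRIEVAL_QUERY,
    )
```

The retriever built from this index would then surface both the embedding-matched text and the connected year value in one query.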

Any guidance or best practices would be greatly appreciated. Thank you!

Hossein J.

My Python script currently looks like this:

import dotenv
import os
from langchain.chains import RetrievalQA
from langchain_community.vectorstores import Neo4jVector
from langchain_openai import OpenAIEmbeddings
from langchain_openai import ChatOpenAI

# Load environment variables
dotenv.load_dotenv()
os.environ['OPENAI_API_KEY'] = os.getenv("OPENAI_API_KEY")

# Function to initialize the Neo4j vector index
def get_vector_index():
    return Neo4jVector.from_existing_graph(
        OpenAIEmbeddings(),
        url=os.getenv('NEO4J_URI'),
        username=os.getenv('NEO4J_USERNAME'),
        password=os.getenv('NEO4J_PASSWORD'),
        index_name='Vi',
        node_label="Income Statement",
        text_node_properties=['title', 'drafting', 'engineering_consulting', 'total_revenue'],
        embedding_node_property='embedding'
    )

# Create the vector index retriever
vector_index = get_vector_index()

# Initialize the RAG QA chain
vector_qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(),
    chain_type="stuff",
    retriever=vector_index.as_retriever()
)

# Manually input your query
query = input("Enter your question here: ")

# Run the query and print the result
try:
    response = vector_qa.invoke({"query": query})
    print("Query Result:")
    print(response['result'])
except Exception as e:
    print(f"An error occurred: {e}")

I spent a bit more time on this and realized that I can use CypherQAChain (the full class name in LangChain is GraphCypherQAChain).

It does what I was initially looking for, except the response is choppy rather than human-like.

I imagine the choppy replies could be fed to an LLM as context to produce a human-like reply.
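From what I can tell, GraphCypherQAChain can already do that second pass: with return_direct=False (the default), a qa_llm rephrases the raw Cypher records into natural language. A minimal sketch, assuming the same environment variables as my script above:

```python
def get_cypher_qa_chain():
    # Imports kept local so the sketch reads without the packages installed.
    import os
    from langchain.chains import GraphCypherQAChain
    from langchain_community.graphs import Neo4jGraph
    from langchain_openai import ChatOpenAI

    graph = Neo4jGraph(
        url=os.getenv('NEO4J_URI'),
        username=os.getenv('NEO4J_USERNAME'),
        password=os.getenv('NEO4J_PASSWORD'),
    )
    # Two LLMs: one generates the Cypher, the other phrases the final answer
    # from the query results, so the reply is not just raw records.
    # (Newer LangChain versions may also require allow_dangerous_requests=True.)
    return GraphCypherQAChain.from_llm(
        cypher_llm=ChatOpenAI(temperature=0),  # writes the Cypher query
        qa_llm=ChatOpenAI(),                   # turns results into prose
        graph=graph,
        verbose=True,
    )
```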

I am hoping there is an easier way to retrieve information from nodes that do not have an embedding property.

Any hints would be appreciated.

Thank you.
Hossein

As I learn more, I can see that there are various ways to do the search: Vector, Graph, and FullText.
The next step is to figure out how to put the results of the three searches together and generate a human-like response.
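One simple way to combine them is to merge the three result sets into a single context block and hand that, with the question, to the LLM. A plain-Python sketch (the section names and prompt wording are just my choices):

```python
def build_prompt(question, vector_hits, graph_hits, fulltext_hits):
    """Merge vector, graph, and full-text results into one context prompt."""
    sections = [
        ("Vector search results", vector_hits),
        ("Graph search results", graph_hits),
        ("Full-text search results", fulltext_hits),
    ]
    parts = []
    for title, hits in sections:
        if hits:  # skip empty result sets entirely
            parts.append(title + ":\n" + "\n".join(f"- {h}" for h in hits))
    context = "\n\n".join(parts)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
```

The returned string can then be passed to any chat model as a single message, which is where the human-like phrasing comes from.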

I found more information related to my original question at:
GraphRAG Github

Let me explain more:
I let the Knowledge Graph Builder do the chunking and embedding.
Then I wanted to put together a Python script to search through the nodes.
The link above provides directions.
In the README file of that repository, scroll almost to the end and find the heading "Performing a Similarity Search".
That provided the answer I was looking for.
Now I still need improvements, like adding some prompt engineering so the query doesn't return information from external sources, and chat memory features.
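For anyone following along, the shape of that similarity search from the neo4j-graphrag package looks roughly like this; 'Vi' is my index name and "gpt-4o" is just an example model:

```python
def ask_graphrag(question):
    # Imports kept local so the sketch reads without the packages installed.
    import os
    import neo4j
    from neo4j_graphrag.embeddings import OpenAIEmbeddings
    from neo4j_graphrag.retrievers import VectorRetriever
    from neo4j_graphrag.llm import OpenAILLM
    from neo4j_graphrag.generation import GraphRAG

    driver = neo4j.GraphDatabase.driver(
        os.getenv('NEO4J_URI'),
        auth=(os.getenv('NEO4J_USERNAME'), os.getenv('NEO4J_PASSWORD')),
    )
    # VectorRetriever embeds the question and runs the similarity search
    # against the existing index that Knowledge Graph Builder populated.
    retriever = VectorRetriever(driver, index_name='Vi', embedder=OpenAIEmbeddings())
    rag = GraphRAG(retriever=retriever, llm=OpenAILLM(model_name="gpt-4o"))
    return rag.search(query_text=question, retriever_config={"top_k": 5})
```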

Happy coding,
Hossein

Ok, more lessons learned:
It seems that with GraphRAG, I have to use OpenAILLM.

That is all right, but how do I use prompts? After some trial and error, I realized that the answer is RagTemplate().
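Here is roughly how I understand it plugging in; the template text is my own, and I'm assuming RagTemplate accepts a custom template with the {context} and {query_text} inputs:

```python
# A custom prompt that keeps the model from answering with outside knowledge.
MY_TEMPLATE = """Answer the question using only the context below.
If the answer is not in the context, say you don't know; do not use external sources.

Context:
{context}

Question:
{query_text}

Answer:"""

def build_rag(retriever, llm):
    # Import kept local so the sketch reads without the package installed.
    from neo4j_graphrag.generation import RagTemplate, GraphRAG

    template = RagTemplate(
        template=MY_TEMPLATE,
        expected_inputs=["context", "query_text"],
    )
    # GraphRAG fills {context} with the retriever hits and {query_text}
    # with the user's question before calling the LLM.
    return GraphRAG(retriever=retriever, llm=llm, prompt_template=template)
```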

The next question is how to keep the chat history. With ChatOpenAI it is relatively easy, but with GraphRAG I have to use OpenAILLM. I need to read more to find out how to implement chat history with this wrapper (OpenAILLM).
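Until I find a built-in option, one workaround is to keep the history manually and fold recent turns into the next query_text before calling search. A plain-Python sketch of that idea:

```python
class ChatSession:
    """Minimal manual chat memory: fold prior turns into the next query."""

    def __init__(self, max_turns=5):
        self.history = []           # list of (question, answer) pairs
        self.max_turns = max_turns  # keeps the prompt from growing unbounded

    def build_query(self, question):
        # First turn: no history to include.
        if not self.history:
            return question
        turns = "\n".join(
            f"User: {q}\nAssistant: {a}" for q, a in self.history[-self.max_turns:]
        )
        return f"Conversation so far:\n{turns}\n\nNew question: {question}"

    def record(self, question, answer):
        self.history.append((question, answer))
```

The string from build_query goes in as query_text, and record is called with whatever answer comes back, so the next turn sees the context.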