Hello Neo4j Community,
I am learning the basics and looking forward to your expertise.
I am working on a project that uses RAG with LangChain to query nodes in Neo4j.
My Python code is at the end of this message.
Here’s the scenario I am dealing with:
I have a Node B (Income Statement) with a text property and an embedding property, both populated. Using LangChain in a Python script, I can easily query this node and retrieve relevant information.
Now I have added Node A, which is connected to Node B. Node A has properties such as a year property with a value (e.g., 2000). This node does not have an embedding property.
My goal is to ask a question that queries both Node B and Node A, retrieving relevant information, including the properties of both nodes (e.g., the value of year from Node A and text-related content from Node B).
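To make the scenario concrete, the connection between the two nodes can be sketched as a plain Cypher traversal run from Python. This is only an illustration under assumptions: the label `FinancialYear` for Node A and the relationship type `HAS_STATEMENT` are hypothetical placeholders (neither is stated in this post), and the helper assumes the official `neo4j` Python driver, version 5.x.

```python
# Hypothetical names: this post does not state Node A's label or the relationship type.
NODE_A_LABEL = "FinancialYear"
REL_TYPE = "HAS_STATEMENT"


def traversal_query(node_a_label=NODE_A_LABEL, rel_type=REL_TYPE):
    """Return Cypher that walks from Node A to the connected Income Statement node."""
    return (
        f"MATCH (a:`{node_a_label}`)-[:`{rel_type}`]-(b:`Income Statement`)\n"
        "RETURN a.year AS year, b.title AS title, b.total_revenue AS total_revenue"
    )


def run_traversal(uri, user, password):
    """Run the traversal and return one dict per matched pair of nodes."""
    # Imported here so the query text can be inspected without the driver installed.
    from neo4j import GraphDatabase

    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        records, _summary, _keys = driver.execute_query(traversal_query())
        return [record.data() for record in records]
```

A query like this returns the year from Node A and the properties from Node B, but it involves no embeddings, so it covers only the property-based half of the goal.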
What kind of libraries or LangChain functions should I use to achieve this functionality effectively? Specifically:
How can I ensure that queries traverse the connection between Node A and Node B using Python scripts?
How can I retrieve both embeddings-based content from Node B and property-based data from Node A in the same query?
Any guidance or best practices would be greatly appreciated. Thank you!
Hossein J.
My Python script currently looks like this:
import dotenv
import os
from langchain.chains import RetrievalQA
from langchain_community.vectorstores import Neo4jVector
from langchain_openai import OpenAIEmbeddings
from langchain_openai import ChatOpenAI

# Load environment variables
dotenv.load_dotenv()
os.environ['OPENAI_API_KEY'] = os.getenv("OPENAI_API_KEY")

# Function to initialize Neo4j vector index
def get_vector_index():
    return Neo4jVector.from_existing_graph(
        OpenAIEmbeddings(),
        url=os.getenv('NEO4J_URI'),
        username=os.getenv('NEO4J_USERNAME'),
        password=os.getenv('NEO4J_PASSWORD'),
        index_name='Vi',
        node_label="Income Statement",
        text_node_properties=['title', 'drafting', 'engineering_consulting', 'total_revenue'],
        embedding_node_property='embedding'
    )

# Create vector index retriever
vector_index = get_vector_index()

# Initialize the RAG QA chain
vector_qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(),
    chain_type="stuff",
    retriever=vector_index.as_retriever()
)

# Manually input your query
query = input("Enter your question here: ")

# Run the query and get the result
try:
    response = vector_qa.invoke({"query": query})
    print("Query Result:")
    print(response['result'])
except Exception as e:
    print(f"An error occurred: {e}")
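
One direction that might cover both nodes in a single retrieval is the `retrieval_query` argument of `Neo4jVector`. As far as I understand it, this Cypher fragment is appended to the vector index lookup, where `node` and `score` are already bound, and it has to return the columns `text`, `score` and `metadata`. The following is an untested sketch, not a confirmed solution: the label `FinancialYear` for Node A is a hypothetical placeholder, the pattern matches any relationship between the two nodes because the relationship type is not stated here, and the properties are assumed to be scalar values.

```python
def build_retrieval_query(text_properties, related_label, related_property):
    """Build a Cypher fragment that adds a neighbouring node's property to each hit."""
    prop_list = ", ".join(f"'{name}'" for name in text_properties)
    return (
        # `node` and `score` come from the vector index lookup that precedes this fragment.
        f"OPTIONAL MATCH (a:`{related_label}`)--(node)\n"
        f"WITH node, score, collect(DISTINCT toString(a.`{related_property}`)) AS related_values\n"
        f"WITH node, score, related_values,\n"
        f"     reduce(s = '', k IN [{prop_list}] | "
        f"s + k + ': ' + coalesce(toString(node[k]), '') + '\\n') AS node_text,\n"
        f"     reduce(s = '', v IN related_values | s + v + ' ') AS related_text\n"
        # The related value is appended to the text so the LLM sees it in the context.
        f"RETURN node_text + '{related_property}: ' + related_text AS text,\n"
        f"       score,\n"
        f"       node {{.*, `embedding`: null, `{related_property}`: related_values}} AS metadata"
    )


# 'FinancialYear' is a hypothetical label for Node A; replace it with the real one.
RETRIEVAL_QUERY = build_retrieval_query(
    ['title', 'drafting', 'engineering_consulting', 'total_revenue'],
    related_label='FinancialYear',
    related_property='year',
)


def get_vector_index_with_year(retrieval_query=RETRIEVAL_QUERY):
    """Same setup as get_vector_index() above, plus the custom retrieval query."""
    # Imported here so the query text can be inspected without LangChain installed.
    import os
    from langchain_community.vectorstores import Neo4jVector
    from langchain_openai import OpenAIEmbeddings

    return Neo4jVector.from_existing_graph(
        OpenAIEmbeddings(),
        url=os.getenv('NEO4J_URI'),
        username=os.getenv('NEO4J_USERNAME'),
        password=os.getenv('NEO4J_PASSWORD'),
        index_name='Vi',
        node_label="Income Statement",
        text_node_properties=['title', 'drafting', 'engineering_consulting', 'total_revenue'],
        embedding_node_property='embedding',
        retrieval_query=retrieval_query,
    )
```

If this works as assumed, `get_vector_index_with_year().as_retriever()` could replace `vector_index.as_retriever()` in the chain above. The year is written into `text` as well as `metadata` because, to my understanding, the "stuff" chain passes only the document text to the model.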