Error within Langchain RetrievalQA fetching page_content from a Vector built using existing Node data

Hi, I'm developing an application within Langchain that uses an existing Neo4j implementation to build embeddings using text content on a set of Nodes.

I'm following along with this blog post notebook -> blogs/llm/devops_rag.ipynb at master · tomasonjo/blogs · GitHub.

It worked up to the point where I hooked my vector store into the Langchain RetrievalQA chain. Any assistance on what I may be missing would be much appreciated.

NOTE: I'm creating three indexes. I've shared only one here for brevity.

Code snippet:

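# Imports: the Neo4jVector and RetrievalQA paths match the traceback below;
# the langchain_openai path is an assumption and may differ by version.
from langchain.chains import RetrievalQA
from langchain_community.vectorstores import Neo4jVector
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Create the 'MeetingTranscript' vector store from existing nodes
# (url, username, and password are defined elsewhere)
transcript_vector = Neo4jVector.from_existing_graph(
    OpenAIEmbeddings(),
    url=url,
    username=username,
    password=password,
    index_name='transcript',
    node_label="MeetingTranscript",
    text_node_properties=['meeting_id', 'transcript'],
    embedding_node_property='embedding',
)

# Build a RetrievalQA chain on top of the MeetingTranscript vector store
transcript_qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(),
    chain_type="stuff",
    retriever=transcript_vector.as_retriever(),
)

transcript_qa.invoke(
    {"query": "What are the meeting_ids where Health Care is discussed?"}
)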

Error output:

  File "/Users/dan/Documents/Gov-Sessions/rag_with_docs/main.py", line 72, in <module>
    transcript_qa.invoke(
  File "/Users/dan/Documents/Gov-Sessions/rag_with_docs/.conda/lib/python3.11/site-packages/langchain/chains/base.py", line 163, in invoke
    raise e
  File "/Users/dan/Documents/Gov-Sessions/rag_with_docs/.conda/lib/python3.11/site-packages/langchain/chains/base.py", line 153, in invoke
    self._call(inputs, run_manager=run_manager)
  File "/Users/dan/Documents/Gov-Sessions/rag_with_docs/.conda/lib/python3.11/site-packages/langchain/chains/retrieval_qa/base.py", line 142, in _call
    docs = self._get_docs(question, run_manager=_run_manager)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dan/Documents/Gov-Sessions/rag_with_docs/.conda/lib/python3.11/site-packages/langchain/chains/retrieval_qa/base.py", line 254, in _get_docs
    return self.retriever.invoke(
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dan/Documents/Gov-Sessions/rag_with_docs/.conda/lib/python3.11/site-packages/langchain_core/retrievers.py", line 194, in invoke
    return self.get_relevant_documents(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dan/Documents/Gov-Sessions/rag_with_docs/.conda/lib/python3.11/site-packages/langchain_core/_api/deprecation.py", line 148, in warning_emitting_wrapper
    return wrapped(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dan/Documents/Gov-Sessions/rag_with_docs/.conda/lib/python3.11/site-packages/langchain_core/retrievers.py", line 323, in get_relevant_documents
    raise e
  File "/Users/dan/Documents/Gov-Sessions/rag_with_docs/.conda/lib/python3.11/site-packages/langchain_core/retrievers.py", line 316, in get_relevant_documents
    result = self._get_relevant_documents(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dan/Documents/Gov-Sessions/rag_with_docs/.conda/lib/python3.11/site-packages/langchain_core/vectorstores.py", line 696, in _get_relevant_documents
    docs = self.vectorstore.similarity_search(query, **self.search_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dan/Documents/Gov-Sessions/rag_with_docs/.conda/lib/python3.11/site-packages/langchain_community/vectorstores/neo4j_vector.py", line 917, in similarity_search
    return self.similarity_search_by_vector(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dan/Documents/Gov-Sessions/rag_with_docs/.conda/lib/python3.11/site-packages/langchain_community/vectorstores/neo4j_vector.py", line 1078, in similarity_search_by_vector
    docs_and_scores = self.similarity_search_with_score_by_vector(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dan/Documents/Gov-Sessions/rag_with_docs/.conda/lib/python3.11/site-packages/langchain_community/vectorstores/neo4j_vector.py", line 1046, in similarity_search_with_score_by_vector
    docs = [
           ^
  File "/Users/dan/Documents/Gov-Sessions/rag_with_docs/.conda/lib/python3.11/site-packages/langchain_community/vectorstores/neo4j_vector.py", line 1048, in <listcomp>
    Document(
  File "/Users/dan/Documents/Gov-Sessions/rag_with_docs/.conda/lib/python3.11/site-packages/langchain_core/documents/base.py", line 22, in __init__
    super().__init__(page_content=page_content, **kwargs)
  File "/Users/dan/Documents/Gov-Sessions/rag_with_docs/.conda/lib/python3.11/site-packages/pydantic/v1/main.py", line 341, in __init__
    raise validation_error
pydantic.v1.error_wrappers.ValidationError: 1 validation error for Document
page_content
  none is not an allowed value (type=type_error.none.not_allowed)

I asked Claude to help diagnose this, and it suggested that I overwrite the provided 'neo4j_vector' code, which seemed extreme. I'm probably missing something obvious to someone with more experience.

I thought I was on to something with this thread -> community: Bug Fix, in Neo4j VectorStore when having multiple indexes the sort is not working and the store that returned is random by ehude · Pull Request #17396 · langchain-ai/langchain · GitHub

Unfortunately, I changed the name of the embedding property for each of my three indexes and still get the same error.

What am I missing?

This error means that for some nodes the text built from the text node properties comes back as null, which is really weird.

Can you show

transcript_vector.retrieval_query

And then run

MATCH (node:MeetingTranscript) WITH node, 1 AS score
+ retrieval query

with a filter added that keeps only the rows where text is null, because a null text value is the root cause.
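
A related check you could try first is a minimal sketch like the one below. It assumes the null text comes from one of the text node properties being null on some nodes, which is only a guess at this point. The helper name `build_null_text_check` is hypothetical and not part of Langchain or Neo4j; it only builds a Cypher string for the label and properties from the question.

```python
def build_null_text_check(label, text_properties, limit=10):
    """Build Cypher that lists nodes where any text property is null.

    Hypothetical helper for illustration; it only returns a query string.
    """
    def quote(name):
        # Backtick-quote identifiers and double any embedded backticks.
        return "`" + name.replace("`", "``") + "`"

    # One "IS NULL" condition per property, combined with OR.
    conditions = " OR ".join(
        f"node.{quote(prop)} IS NULL" for prop in text_properties
    )
    # Return each property so the null one is visible in the result rows.
    returned = ", ".join(
        f"node.{quote(prop)} AS {quote(prop)}" for prop in text_properties
    )
    return (
        f"MATCH (node:{quote(label)})\n"
        f"WHERE {conditions}\n"
        f"RETURN {returned}\n"
        f"LIMIT {int(limit)}"
    )


# Label and properties taken from the question above.
query = build_null_text_check("MeetingTranscript", ["meeting_id", "transcript"])
print(query)
```

You can paste the printed query into Neo4j Browser. If it returns any rows, those nodes are missing a value for at least one text property and are good candidates for the documents that fail validation.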