Some tips on using vector search over text embeddings with LangChain: my first steps

I'd like to share some tips about using LangChain Agents that I picked up while working with the Neo4j vector index for text embeddings. After some trial and error, I was completely satisfied with the final answer.

Tip 1 The first one is to use some instructions to tell the agent what to do. I had to update them every time I added extra information from other nodes, as you can see in Instruction 3 (labels in Spanish). I was not using a formal "Prompt Template" from LangChain, so I think these instructions act as the "prompt".

def run_semantic_search(text: str) -> str:
    """Use this function to do a semantic search
    Instruction 1 : Get input text and then use embeddings
    Instruction 2 : Use the function graph.query and get the results
    Instruction 3 : Complete the final text with Importe,Responsable,Clave_Unidad_Compra,Proveedor
    Instruction 4 : You must stop when you have all information about Contrato
    Instruction 5 : You must respond in Spanish language
    """
    # The function body (embedding + Cypher query) is shown in the next snippet.
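Since Instruction 3 has to change every time a new field is returned from the graph, one option is to generate the instruction text from a list of field names. This is only a minimal sketch: `build_instructions`, `FIELDS` and `INSTRUCTIONS` are hypothetical names that were not part of my original code.

```python
from typing import Sequence


def build_instructions(fields: Sequence[str]) -> str:
    """Assemble the numbered instruction text from the list of returned fields."""
    steps = [
        "Get input text and then use embeddings",
        "Use the function graph.query and get the results",
        "Complete the final text with " + ",".join(fields),
        "You must stop when you have all information about Contrato",
        "You must respond in Spanish language",
    ]
    # Number the steps starting at 1, matching the "Instruction N" style above.
    numbered = [f"Instruction {i} : {step}" for i, step in enumerate(steps, start=1)]
    return "Use this function to do a semantic search\n" + "\n".join(numbered)


# Field names taken from the instructions in this post.
FIELDS = ["Importe", "Responsable", "Clave_Unidad_Compra", "Proveedor"]
INSTRUCTIONS = build_instructions(FIELDS)
print(INSTRUCTIONS)
```

With this approach, adding a new field from another node means editing `FIELDS` only, and the resulting text can be placed wherever the instructions are needed.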

Here is my Cypher query, which uses the embedding and "grounds" the answer in my Knowledge Graph. The graph is about contracts, providers and the people who assign them. For example, I retrieve the "Importe" (contract amount) and the provider.

    # Embed the input text, then ask the vector index for the 10 nearest contracts
    query_embedding = embeddings.embed_query(text)
    results = graph.query(
        """
        CALL db.index.vector.queryNodes('contratos_VectorIndex', 10, $embedding)
        YIELD node AS contrato, score

        MATCH (responsable)-[:COMPRO]->(contrato)
        OPTIONAL MATCH (responsable)-[:PERTENECE_A]->(UC)
        OPTIONAL MATCH (contrato)-[:ASIGNADO_A]->(proveedor)
        RETURN
        contrato.Id_Contrato AS IdContrato
        ,contrato.Importe AS Importe
        ,responsable.ResponsableUC AS Responsable
        ,UC.ClaveUC AS Clave_Unidad_Compra
        ,proveedor.Proveedor AS Proveedor
        ,score AS score
        ,contrato.text AS Text
        """,
        {'embedding': query_embedding},
    )
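The function is annotated as returning `str`, while the query gives back rows of fields, so the rows need to be turned into text at some point. Below is one possible way to do it, as a sketch only: `format_rows` is a hypothetical helper, and the sample rows are made up to mimic the shape of the query output (a list of dictionaries keyed by the RETURN aliases).

```python
from typing import Any, Dict, List


def format_rows(rows: List[Dict[str, Any]]) -> str:
    """Render each result row as 'key: value' pairs, one row per line."""
    if not rows:
        return "No results found."
    lines = []
    for row in rows:
        # Skip missing values, which OPTIONAL MATCH can produce as None.
        parts = [f"{key}: {value}" for key, value in row.items() if value is not None]
        lines.append("; ".join(parts))
    return "\n".join(lines)


# Made-up rows for illustration; they are not real data from my graph.
sample = [
    {"IdContrato": 1, "Importe": 100.0, "Proveedor": None, "score": 0.91},
    {"IdContrato": 2, "Importe": 250.5, "Proveedor": "ACME", "score": 0.88},
]
print(format_rows(sample))
```

Dropping the `None` values keeps the text short when a contract has no provider or purchasing unit attached.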

Tip 2 Set this specific instruction when you add the LangChain tool: "You must use this to search information just once". With verbose mode set to true, I realized that the agent was making several "round trips" to the graph database even after the information had already been retrieved, and it finally ended with an "external error".

from langchain.agents import AgentType, Tool, initialize_agent

# llm is the chat model, created elsewhere
tools = [
    Tool(
        name="GetInformation",
        func=run_semantic_search,
        description="You must use this to search information just once, you must answer in Spanish",
    )
]
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION,
    handle_parsing_errors=True,
    verbose=False,
)
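The "just once" instruction is only a request to the LLM, so a code-level guard against repeated round trips may also help. The sketch below is not what I ran: it wraps the search function in a cache so that an identical query is answered without hitting the database again. `search_once` and `fake_search` are hypothetical names, and `fake_search` stands in for `run_semantic_search`.

```python
from functools import lru_cache
from typing import Callable


def search_once(search: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap a search function so repeated identical queries reuse the first result."""
    @lru_cache(maxsize=128)
    def cached(text: str) -> str:
        return search(text)
    return cached


# Stand-in for run_semantic_search that records every "database" hit.
calls = []


def fake_search(text: str) -> str:
    calls.append(text)
    return f"results for {text}"


guarded = search_once(fake_search)
guarded("guarderia")
guarded("guarderia")  # served from the cache, fake_search is not called again
print(len(calls))
```

This only helps when the agent repeats exactly the same input; if it rephrases the query, the database is still hit. As far as I know, the agent executor also accepts a `max_iterations` setting that can be passed through `initialize_agent` to cap the number of steps, but I have not tried it here.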

Then I'm ready to ask some questions, for example:
"Which contracts are for daycare services? Include all their information"

Sensitive and private data is masked with XX in this post, and only some of the results are shown.

try:
    result = agent("""
                 Cuáles contratos son de Servicio Guardería, incluye toda 
                 su información""")
except Exception:
    # Catch Exception rather than using a bare except, so Ctrl+C still works
    print("Exception on External Access")
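Calling the agent gives back a dictionary, and in the LangChain versions I know of the answer text sits under the "output" key. A small helper can combine the call, the error handling and the extraction of the answer. This is a sketch under that assumption: `ask`, `fake_agent` and `failing_agent` are hypothetical names, and the two fake agents only imitate a working and a failing call.

```python
from typing import Any, Callable, Dict, Optional


def ask(agent: Callable[[str], Dict[str, Any]], question: str) -> Optional[str]:
    """Run the agent and return its answer text, or None if the call fails."""
    try:
        result = agent(question)
    except Exception as exc:  # broad on purpose: the failure type was not identified
        print(f"Exception on External Access: {exc}")
        return None
    return result.get("output")


# Stand-in agents for illustration only; a real agent would call the LLM.
def fake_agent(question: str) -> Dict[str, Any]:
    return {"input": question, "output": "respuesta"}


def failing_agent(question: str) -> Dict[str, Any]:
    raise RuntimeError("timeout")


print(ask(fake_agent, "Cuáles contratos son de Servicio Guardería"))
print(ask(failing_agent, "Cuáles contratos son de Servicio Guardería"))
```

Printing the exception message makes it easier to see what the "external error" actually was.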

The response, in Spanish (currency formatting courtesy of the agent :smile:):

Los contratos de servicio de guardería son los siguientes:

- Contrato número 17144:
  - Importe: $90,138,835.00
  - Responsable: XXXX LXXX
  - Clave de unidad de compra: 050GYR007
  - Proveedor: CXXXX DE INSTITUCIONES XXXXXXXSC
  - Texto: El contrato fue asignado por el IMSS para realizar el servicio de guarderías del año 2022 al 2026.

- Contrato número 17131:
  - Importe: $71,998,764.00
  - Responsable: XXXXX
  - Clave de unidad de compra: 050GYR007
  - Proveedor: CXXXDE XXXX PEDAGOGICO Y PSICOLOGICO SC
  - Texto: El contrato fue asignado por el IMSS para realizar el servicio de guarderías del año 2022 al 2026.

- Contrato número 17142:
  - Importe: $52,705,765.00
  - Responsable: JXX  LXXX
  - Clave de unidad de compra: 050GYR007
  - Proveedor: XXX DE DESARROLLO INFANTIL XXXXX AC
  - Texto: El contrato fue asignado por el IMSS para realizar el servicio de guarderías del año 2022 al 2026.

My results were OK with LangChain 0.0.292. My next steps are to use a Prompt Template, Memory, and the newest version of LangChain, 0.0.310.
In any case, consider these tips if you run into the same issues.

The implementation was based on this article by Tomaz Bratanic:

Efficient semantic search over unstructured text in Neo4j (Tomaz Bratanic, Towards Data Science, Aug 2023)

Greetings
