Good morning,
I am a student working on a project on Text-to-Cypher. I have been reviewing the literature on training datasets used to fine-tune LLMs to improve their Cypher generation capabilities. The literature often highlights the difficulty LLMs face with advanced Cypher queries, such as navigation queries that traverse three or more relationship hops, yet I have not seen examples of such queries in the synthetic datasets used to train these models.
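To make sure we mean the same thing: by a third-level (three-hop) navigation query I mean a pattern like the sketch below. The labels and relationship types here are purely illustrative, not from any particular dataset:

```cypher
// Illustrative three-hop navigation query (hypothetical schema):
// find products bought by customers who live in a given city.
MATCH (city:City {name: $cityName})<-[:LIVES_IN]-(cust:Customer)
      -[:PLACED]->(order:Order)-[:CONTAINS]->(product:Product)
RETURN DISTINCT product.name
```

In contrast, the first-level queries in my dataset involve only a single relationship hop, e.g. `MATCH (c:Customer)-[:PLACED]->(o:Order) RETURN o`.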
Am I correct in this observation? Additionally, do you think it is reasonable to focus on first-level (single-hop) queries when creating a synthetic dataset for fine-tuning, as I have done when fine-tuning some LLMs?
Thank you very much.