Double splitting strings?

I'm trying an experiment that consists of ingesting a text file that has numbered questions and answers in the following format:

1.) Question : Answer
2.) Question : Answer
...

I'm able to read the TXT file using the LOAD CSV clause just fine. I can separate the answer from the question with the split() function (e.g., split(row, ':') as newrow). My question is: how can I do a second split to extract just the question number as well? I tried the following:

unwind(row) as newrow
with split(newrow, ':') as myrow
unwind(myrow) as thisrow
with myrow, split(thisrow, ".)") as q
return q, myrow[0], myrow[1]

But I seem to be looping over the elements of myrow: q comes back with the question number and question in one row, then the answer in the next row, and so on.

Any suggestions? I would really like to separate out the actual question number and the answer to further build the graph. The question text itself doesn't really matter (the numbering and questions are consistent).

Hi @mbandor!

Have you tried without the unwind?

Something like

with split(row, ':') as myrow
with myrow, split(myrow[0], ".)") as q
return q[0], q[1], myrow[1]

Bennu

Unfortunately, split() only works with string values. At this point, row is a list, so an error is thrown about the type mismatch.

Hello @mbandor!

Here is a try:

LOAD CSV FROM 'file:///test.txt' AS line
UNWIND line AS item
WITH split(item, ":") AS elements
WITH split(elements[0], ".)") AS questions, elements[1] AS answer
RETURN questions[0] AS question_number, questions[1] AS question, answer
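
One caveat with the query above: LOAD CSV splits each line on commas by default, so a question or answer that contains a comma would be broken into several items by the UNWIND. A possible variant, as a sketch only, is to let LOAD CSV split on the colon itself with FIELDTERMINATOR. It assumes every line contains exactly one ':' and no double quotes; the trim() and toInteger() calls are optional additions to tidy up the values.

```cypher
// Sketch: assumes lines look like "1.) Question : Answer",
// with exactly one ':' per line and no double quotes.
LOAD CSV FROM 'file:///test.txt' AS line FIELDTERMINATOR ':'
// line[0] holds "number.) question", line[1] holds the answer
WITH split(line[0], '.)') AS parts, line[1] AS raw_answer
RETURN toInteger(trim(parts[0])) AS question_number,
       trim(parts[1]) AS question,
       trim(raw_answer) AS answer
```

If an answer can itself contain a ':', this variant would cut it off at the second colon, so the split()-based query is the safer choice in that case.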

Regards,
Cobra