Proper Cypher regular expression examples

tideon · December 11, 2020, 3:06am

I have spent weeks reading and trying to figure out how to use regular expressions in Cypher. The website documentation has next to nothing on the subject, and the site everyone sends me to doesn't actually show the Neo4j Cypher way of doing it.

So my question is: what is the exact Cypher equivalent of this: \s[a-zA-Z]

Can someone write some normal regular expressions with the Cypher version next to each one? The documentation has nothing that would give you any idea how to write meaningful expressions.

Kind regards,
Jeffrey

cobra · December 11, 2020, 10:05am

Hello @tideon

In the documentation, there is a little example you can transform:

MATCH (n:Person)
WHERE n.name =~ '\s[a-zA-Z]'
RETURN n.name, n.age

But I think you already tried that, so:

Can you tell us what you are trying to achieve with the regex?
Can you tell us what the query should do?
Can you give us your model?

Regards,
Cobra

tideon · December 11, 2020, 2:22pm

Hello Cobra,

I think the regex isn't working: I copy-pasted the WHERE clause above and got the following error.

Here is the query
match (t:Toy)
WHERE t.ModelNr=~ '\s[a-zA-Z]'
return t


Invalid input 's': expected '\', ''', '"', 'b', 'f', 'n', 'r', 't', UTF16 or UTF32 (line 2, column 21 (offset: 34))
"WHERE t.ModelNr=~ '\s[a-zA-Z]'"

I am attempting to find the toys where the model number field contains letters, so I can identify the ones that need to be cleaned up.

So let's start with this and go from there. If what you gave me doesn't work, then something is fundamentally wrong.

Are you from Neo4j?

The internet is filled with people who can't get regular expressions to work in Neo4j. I spent over six hours yesterday reading every post I could find on the internet. So many people cannot get a clear understanding from the manual, as it gives very little explanation.

tideon · December 11, 2020, 2:39pm

I think there is a lot of confusion about how regex works in Neo4j.

Here is a good example that highlights what I have read on multiple forums, including here and Stack Overflow:

github.com/neo4j/neo4j

Cypher Regex matching with slashes

opened 04:40PM - 09 Jan 19 UTC

closed 08:30AM - 26 Feb 19 UTC

g3rd

team-cypher

## Bug report

After upgrading from Neo4j 3.4.x to 3.5.x, regex now needs to be escaped. Also, the python driver no longer binds regex values correctly. The following query example has worked on Neo4j since I believe 3.1ish, both community, and enterprise. Python 2.7, 3.6, and 3.7.

**Neo4j Version:** 3.5.1 EE and 3.5.0 EE
**Operating System:** Neo4j provided docker image & Ubuntu 18.04.1 LTS
**API:** Cypher, Neo4j Browser
**Driver:** Python 1.7.1 (Python 3.6 and Python 3.7)

## Steps to reproduce

1. Create data

```
CREATE (n:Document) SET n.tagIds = '111', n.isValid = false;
UNWIND ['111', '111,222,333', '222', '222,333', '111,222'] AS val CREATE (n:Document) SET n.tagIds = val, n.isValid = true;
```

### Neo4j Browser

2. Cypher to Query in the Neo4j Browser

```
MATCH (n:Document {isValid: true}) WHERE n =~ '.*\b222\b.*' RETURN n
```

This is how I got the above query to work in 3.5.x:

```
MATCH (n:Document {isValid: true}) WHERE n =~ '.*\\b222\\b.*' RETURN n
```

### Python

2. `query.py`

```python
from neo4j import GraphDatabase

driver = GraphDatabase.driver(
    "URL", auth=("NEO4J_USER", "NEO4J_PASSWORD"))


def query(tag_id):
    cypher = (
        "MATCH (n:Document {isValid: $is_true}) "
        "WHERE n =~ $regex_pattern "
        "RETURN n"
    )
    with driver.session() as session:
        results = session.run(
            cypher, is_true=True, regex_pattern=r".*\b222\b.*")
```

When I add a pdb breakpoint and examine what is being passed to the server, the escaping of the slash is wrong. I've tried various combinations of the string to get the escaping correct. However, none have been successful. The only way I have gotten the above query to work is to remove the binding:

```python
cypher = (
    "MATCH (n:Document {{isValid: $is_true}}) "
    "WHERE n =~ '.*\\\\b{}\\\\b.*' "
    "RETURN n"
).format(tag_id)
```

## Expected behavior

* Results to be returned
* Documentation that details the change to escaping (I'd be happy to contribute if pointed in the right direction)

## Actual behavior

No results are returned

That very short excerpt in the manual doesn't go into what flags there are, what each one means, or how to write a query that gets back digits.

I watched a very good tutorial on regular expressions and learned from it, but I can't translate that knowledge into Cypher.

Many / all things are not working.
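
For reference, Cypher's =~ hands the pattern to Java's regex engine, so Java's embedded flags can be written at the start of the pattern. A minimal sketch, assuming the :Toy label and ModelNr property from the query above plus a hypothetical name property that is not part of this thread's data:

```cypher
// Embedded flags go at the very start of the pattern:
//   (?i) = case-insensitive
//   (?s) = "." also matches line breaks
//   (?m) = ^ and $ match at line boundaries
// The name property is an assumption made for this sketch.
MATCH (t:Toy)
WHERE t.name =~ '(?i).*technic.*'   // "Technic", "TECHNIC", "technic", ...
RETURN t.name;

// Values that consist of digits only. =~ has to match the whole string,
// and [0-9] needs no backslash.
MATCH (t:Toy)
WHERE t.ModelNr =~ '[0-9]+'
RETURN t.ModelNr;
```

Neither pattern contains a backslash, so both can be typed exactly as they would be in any other regex tool.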

tideon · December 11, 2020, 2:42pm

Here is another example

https://staging.thepavilion.io/t/regex-in-cypher/8023

People refer to Java, but there are no robust examples of actual uses that show how it is actually written. Even O'Reilly's "Graph Databases" doesn't cover the topic. It is a very powerful option to have, but next to no attention is given to it.

cobra · December 11, 2020, 3:01pm

Yeah, it's weird. I just know that Neo4j's regexes come from Java. No, I'm not working for Neo4j.

I do not see the syntax error to be honest

tideon · December 12, 2020, 10:52am

Hello Cobra,

I figured it out.

All the metacharacters need to be escaped. So "\s" needs to be "\\s".

The manual doesn't address this at all.
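
To make that concrete: the backslash is an escape character inside a Cypher string literal, so every backslash in the regex has to be doubled before it reaches the regex engine. A short sketch of the translation, reusing the :Toy label and ModelNr property from earlier in this thread (the \bLego\b row is just an illustrative pattern):

```cypher
// Regex you want          Cypher string literal
//   \s[a-zA-Z]       ->     '\\s[a-zA-Z]'
//   \d+              ->     '\\d+'
//   \bLego\b         ->     '\\bLego\\b'
//   [a-zA-Z]         ->     '[a-zA-Z]'      (no backslash, nothing to double)

// =~ has to match the ENTIRE value, not just a part of it, so wrap the
// pattern in .* to find model numbers that contain a letter anywhere.
MATCH (t:Toy)
WHERE t.ModelNr =~ '.*[a-zA-Z].*'
RETURN t;

// The pattern from the original question: whitespace followed by a letter,
// anywhere in the value.
MATCH (t:Toy)
WHERE t.ModelNr =~ '.*\\s[a-zA-Z].*'
RETURN t;
```

Note that '\\s[a-zA-Z]' on its own is valid but would only match values that are exactly two characters long, because of the whole-string matching.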

tideon · December 12, 2020, 4:40pm

So now I understand how to use regular expressions. Is there a way to extract the part of a string that matches a criterion and put it into another field?

So for instance: Technic 42020: Twin-Rotor Helicopter
I would like to extract "42020" and put it in a property ModelNr

SET t.ModelNr = 42020

cobra · December 12, 2020, 6:37pm

You should find what you need here:

tideon · December 12, 2020, 7:29pm

I read everything there, but nothing seemed usable for achieving my goal.

Do you have an example of how it could be achieved?

I now know how to fully use regex in Cypher, but I have no way to extract the pattern I have found and put it in a property field.

Thanks in advance,
Tideon

clem · December 12, 2020, 8:13pm

You need to use the APOC regex functions.

This is what you want:
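
A rough sketch of how apoc.text.regexGroups could be used for the model number, assuming the APOC library is installed, that the title lives in a hypothetical t.name property, and that the model number is the run of digits right before the colon:

```cypher
// apoc.text.regexGroups returns one inner list per match:
// index 0 is the whole match, index 1 the first capture group, and so on.
// For 'Technic 42020: Twin-Rotor Helicopter' the result is [['42020:', '42020']].
MATCH (t:Toy)
WHERE t.name =~ '.*\\d+:.*'                        // skip titles without a number
WITH t, apoc.text.regexGroups(t.name, '(\\d+):') AS matches
SET t.ModelNr = toInteger(matches[0][1])           // first match, first group
RETURN t.name, t.ModelNr;
```

Drop the toInteger() call if ModelNr should stay a string, for example to preserve leading zeros.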

tideon · December 13, 2020, 10:13pm

I'm reading its documentation; this is exactly what I needed.

tideon · December 14, 2020, 8:38pm

Hello Clem,

This is how far I have got:

// Apoc JSON ADD STORE & INVENTORY
CALL apoc.load.json("file:///lego3-5.json") YIELD value
WITH value AS v
WITH apoc.text.regexGroups(v.Product_Price, '\\d{1,3},\\d{1,2}') AS price
RETURN price

Output

I get the output shown in the attachment, but I can't use UNWIND to get the values out of the nested list. I spent about 6 hours getting to this point and trying everything I could. The manual didn't explain how to use it beyond the RETURN example.

I keep asking myself why the manual is so bad. It never really explains a subject the way you would expect a manual to; it is often just a brief overview.

cobra · December 14, 2020, 8:52pm

Hello @tideon

Could you provide us the JSON file?

Regards,
Cobra

clem · December 14, 2020, 9:01pm

[edit because I misread the screen shot....]

Try returning price[0] to get the first element of the list of price(s) instead of the list of the single price. (European style of numbers with comma instead of decimal point.)

tideon · December 15, 2020, 12:50am

Hey Cobra,

Here is a sample of the data.
I had to give it the .txt extension so I could upload it.
sample_lego.json.txt (389 Bytes)

tideon · December 15, 2020, 12:52am

Hello Clem,

I uploaded a file with a sample of the data.
As you will see, the price contains a comma (Europe), so the regex is written to capture that. It is one price.

clem · December 15, 2020, 2:18am

Oops. My eyes aren't so good, so I misread it.

price[0] will give you a string. But if you want to store it as a float, you'll need to substitute the "," for a "." and then do toFloat()

tideon · December 15, 2020, 2:20am

the issue still remains, as to not being able to get the value out of this nested list, and I don't understand why apoc is doing that, the manual is very limited.
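
The nesting comes from the shape of the result: apoc.text.regexGroups returns a list of matches, and each match is itself a list (the whole match first, then any capture groups). A single price therefore comes back as a list inside a list, and it takes two indexes to reach the string. A minimal sketch, assuming the same file and Product_Price field as the query earlier in this thread:

```cypher
CALL apoc.load.json("file:///lego3-5.json") YIELD value AS v
WITH apoc.text.regexGroups(v.Product_Price, '\\d{1,3},\\d{1,2}') AS matches
WHERE size(matches) > 0                // skip rows where nothing matched
WITH matches[0][0] AS price            // first match, whole-match entry, e.g. "123,50"
RETURN price,                                              // string, comma kept
       toFloat(replace(price, ',', '.')) AS priceAsFloat;  // only if a number is needed
```

If a field could contain several prices, UNWIND matches AS m followed by m[0] would yield one row per match instead of only the first.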

tideon · December 15, 2020, 2:22am

It is one price, so for instance 123,50
In Europe we use commas, and I also want it to keep the comma, because all prices are written that way.

I have tweaked it in every way I can to see if something would work.
