Data type of a property

cypher

(Mike R Black) #1

In Cypher how can I check the data type of a property?

Once a property is used with one data type, does that property have to hold the same type on every node with that label? I'm assuming not. So if I have a node label :testnode with a property called mystery, I could mix and match numbers, strings, dates, etc. across the nodes. Obviously a data quality nightmare, but that leads me back to my first question: I'd like to be able to query my nodes and look for data quality issues.


(M. David Allen) #2

Property values in Cypher always have a type, but Neo4j doesn't constrain the type of a property. That is to say, if you have a node property called mystery, it's possible to make it sometimes a string, sometimes an integer.

For example, this is OK:

CREATE (:testnode { mystery: 1 });
CREATE (:testnode { mystery: "Hello" });

A way that you can profile looking for data quality issues is by checking types like this:

MATCH (t:testnode)
WHERE t.mystery = toString(t.mystery)
RETURN count(t);

That will tell you how many strings there are. If the property value were an int, it wouldn't match. You could do a similar thing with other types too, to build up a table of how many instances were which type for an attribute.


(Mike R Black) #3

Thanks!

I think you may have given me an idea for a way I could contribute to the APOC library: a function like DataType() that would go through a series of CASE statements testing each data type and return a string naming the data type it determined the value to be.


(Benoit Simard) #4

Hi Mike,

In fact it already exists: take a look at the apoc.meta.type function.

Example:

WITH [true, 42, 'foo', 1.2] AS data
UNWIND data as value
RETURN apoc.meta.type(value)

Result is:

"BOOLEAN"
"INTEGER"
"STRING"
"FLOAT" 
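
Applied to the :testnode example from earlier in the thread, the same function can be grouped to get a count of nodes per type. This is only a minimal sketch, assuming the APOC plugin is installed; the aliases type and nodes are just illustrative names:

```cypher
// Count how many :testnode nodes hold each type of value in mystery.
// Nodes that lack the property should be reported under the type "NULL".
MATCH (t:testnode)
RETURN apoc.meta.type(t.mystery) AS type, count(*) AS nodes
ORDER BY nodes DESC;
```

More than one row in the result means the property holds mixed types.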

Cheers


(M. David Allen) #5

Benoit's got a good idea. I would just caution you that telling the type of an individual property value is not the same thing as a property having a type. Properties don't have types, or at least the types of their values can vary.

The reason I bring this up is that you need to do some kind of sampling, for example MATCH (n:Node) RETURN n.mystery LIMIT 100 or MATCH (n:Node) WHERE id(n) % 3 = 0 RETURN n.mystery LIMIT 100. Only if the types of all the sampled values agree is it probably safe to assume that's the type of the property.
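
The sampling check can be combined with apoc.meta.type from the earlier reply. A rough sketch, assuming APOC is installed; the sample size of 100 and the aliases types and consistent are arbitrary choices, not anything prescribed:

```cypher
// Take a sample of nodes and collect the distinct value types seen for mystery.
MATCH (n:Node)
WITH n LIMIT 100
WITH collect(DISTINCT apoc.meta.type(n.mystery)) AS types
// consistent is true only when every sampled value has the same type.
RETURN types, size(types) = 1 AS consistent;
```

A sample that agrees is still only evidence, not a guarantee, since nodes outside the sample may hold other types.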


(Pradeep Ponduri) #6

Once a property is set as INT, how can I make sure that new values for that property are also INT?

Example:
I created a node with an age property and cast it to INT.
When I create nodes using LOAD CSV or JDBC, this age property is loaded as a string, not an INT.
Do I need to cast it every time I load?


(Elaine Rosenberg) #7

Properties do not have types. When you load the data, you must convert the data to the type that you want for that property.
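
For LOAD CSV, which reads every field as a string, the conversion could look something like the sketch below. The file name people.csv, the Person label, and the name and age columns are hypothetical, made up for illustration:

```cypher
// Every CSV field arrives as a string, so convert age explicitly on load.
// toInteger() returns null when the text cannot be parsed as a number.
LOAD CSV WITH HEADERS FROM 'file:///people.csv' AS row
CREATE (:Person { name: row.name, age: toInteger(row.age) });
```

The same conversion has to be repeated in every load statement, since nothing on the property itself remembers the type.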