Creating procedure in APOC with the same behaviour as cube function in OLAP

apoc

(Gabriela) #1

Hi,
I am new to Neo4j and Cypher and I am trying to create the behaviour of cube function (OLAP function) for Neo4j by using APOC library. More specifically, I am trying to have the same result I have using this query :

MATCH(p:Product)
RETURN p.reorderLevel as ReordLevel, p.supplierID as Supplier, sum(p.unitsIn Stock) as SommeUnits
ORDER BY ReordLevel 

UNION ALL

MATCH(p:Product)
RETURN 'null' as ReordLevel, p.supplierID as Supplier, sum(p.unitsIn- Stock) as SommeUnits
ORDER BY ReordLevel

UNION ALL

MATCH(p:Product)
RETURN p.reorderLevel as ReordLevel,'null' as Supplier, sum(p.unitsIn- Stock) as SommeUnits
ORDER BY ReordLevel 

UNION ALL

MATCH(p:Product)
RETURN 'null' as ReordLevel, 'null' as Supplier, sum(p.unitsInStock) as SommeUnits
ORDER BY ReordLevel

I would like to obtain something like that :

MATCH (p:Product, ....)
CUBE (p.reorderLevel , p.supplierID, ....)
ORDER BY p.reorderLevel , p.supplierID, ....;

Could someone help me with some ideas ?


(Michael Hunger) #2

I think you should be able to do that with a procedure that:

  1. takes a label
  2. takes a list of properties to cube
  3. takes a list of properties to aggregate

and then either generate and run that cypher code internally.
or use the java api to do the same.

or you could use apoc.cypher.run with a loop/combination on field-name + null

unwind ['p.reorderLevel',null] as ReordLevel
unwind ['p.supplierID', null] as Supplier
call apoc.cypher.run('
MATCH(p:Product)
RETURN '+ReordLevel+' as ReordLevel, '+Supplier+' as Supplier, sum(p.unitsIn Stock) as SommeUnits
ORDER BY ReordLevel 
') yield value as row
return row. ReordLevel as ReordLevel, row.Supplier as Supplier, row.SommeUnits as SommeUnits

(Gabriela) #3

Thank you for your answer, @michael.hunger. I am gonna take into consideration your ideas in order to create the procedure. I tried to execute your function for apoc.cypher.run but I get an error

Neo.ClientError.Procedure.ProcedureCallFailed: Failed to invoke procedure apoc.cypher.run: Caused by: org.opencypher.v9_0.util.SyntaxException: Variable p.reorderLevel not defined (line 1, column 25 (offset: 24))

I tried to fix it, but I am new to Neo4j so I couldn't fix it.


(Michael Hunger) #4

What exactly did you run? I don't see where this could come from in the code I've shared.


(Gabriela) #5

I ran again the code you have shared and now I have this

I even tried to add null as second param but it stil doesn't work and I don't understand why

unwind ['p.reorderLevel',null] as ReordLevel
unwind ['p.supplierID', null] as Supplier
call apoc.cypher.run('
MATCH(p:Product)
RETURN '+ReordLevel+' as ReordLevel, '+Supplier+' as Supplier, sum(p.unitsIn Stock) as SommeUnits ORDER BY ReordLevel', null ) yield value as row
return row. ReordLevel as ReordLevel, row.Supplier as Supplier, row.SommeUnits as SommeUnits

Neo.ClientError.Procedure.ProcedureCallFailed: Failed to invoke procedure apoc.cypher.run: Caused by: org.opencypher.v9_0.util.SyntaxException: Invalid input 'o': expected 'a/A' (line 1, column 98 (offset: 97))


(Michael Hunger) #6

can you remove the space there?


(Gabriela) #7

@michael.hunger, already done with all empty spaces, but still the same error

Neo.ClientError.Statement.SyntaxError: Procedure call does not provide the required number of arguments: got 1 expected 2.

Procedure apoc.cypher.run has signature: apoc.cypher.run(cypher :: STRING?, params :: MAP?) :: value :: MAP?
meaning that it expects 2 arguments of type STRING?, MAP?
Description: apoc.cypher.run(fragment, params) yield value - executes reading fragment with the given parameters (line 3, column 1 (offset: 87))
"call apoc.cypher.run('MATCH(p:Product) RETURN '+ReordLevel+' as ReordLevel, '+Supplier+' as Supplier, sum(p.unitsInStock) as SommeUnits ORDER BY ReordLevel') yield value as row"
^

If if add null as second param I have null pointer error

Neo.ClientError.Procedure.ProcedureCallFailed: Failed to invoke procedure apoc.cypher.run: Caused by: java.lang.NullPointerException


(Michael Hunger) #8

Then pass in an empty map {} as 2nd argument.


(Gabriela) #9

I tried it already and this gives another syntax error

Neo.ClientError.Procedure.ProcedureCallFailed: Failed to invoke procedure apoc.cypher.run: Caused by: org.opencypher.v9_0.util.SyntaxException: Invalid input 'o': expected 'a/A' (line 1, column 97 (offset: 96))

I also tried to check online for other syntax errors that are similar to mine but I still have no answer to it.


(Michael Hunger) #10

Do you have some example data for me to try this on?

Michael


(Michael Hunger) #11

I got it. It seems that the aproc procedure doesn't like to group on null, so I changed that to 'null' but you can change it also to a numeric value like -1 or whatever works for you.

unwind ['p.reorderLevel','null'] as ReordLevel
unwind ['s.supplierId', 'null'] as Supplier
call apoc.cypher.run('
MATCH (p:Product)-[:SUPPLIERS]->(s:Supplier)
RETURN '+ReordLevel+' as ReordLevel, '+Supplier+' as Supplier, sum(p.unitsInStock) as SommeUnits ORDER BY ReordLevel', {} ) 
yield value as row
return row. ReordLevel as ReordLevel, row.Supplier as Supplier, row.SommeUnits as SommeUnits
order by ReordLevel, Supplier

I ran this on a northwind import and it worked fine. Sorry that it took so long.


(Gabriela) #12

Thank you so much @michael.hunger, it works !


(Michael Hunger) #13

This sounds actually like a great topic to write up. Would you mind creating a short blog post/article about it? With your original intention / source of inspiration to how it was solved and can be generalized.


(Gabriela) #14

This would be a great idea, to write an article about this subject.
Now I am trying to compare the output obtained with Cube function in SQL with the output obtained with the apoc.cypher.run function. I just observed that there are two lines at the end of the output table corresponding to :

I don't understant the second line corresponding to null null 6238 and I didn't find how to get rid of it. Do you have any idea ?


(Michael Hunger) #15

Can it be that you have a few NaN in your data? Which might be rendered as null?


(Gabriela) #16

I am not sure there are NaN in the Northwind data set. But what suprises me is that 6238 is exactely the double of 3119 so I think it does somewhere 3119 + 3119 and I still don't know why.

[EDIT]

In fact I think I have an idea. It computes this output because it does this operation

MATCH(p:Product)
RETURN 'null' as ReordLevel, 'null' as Supplier, sum(p.unitsInStock) as SommeUnits
ORDER BY ReordLevel

And this corresponds exactely to the line null null 6238. But I don't know how to get rid of this operation.


(Michael Hunger) #17

Could you by accident imported the data twice?
Or there is another dimension that differentiates the two sets?


(Michael Hunger) #18

Oh and can you differentiate between our null string and an actual null stored?
you could pass instead of our 'null' e.g. '"total"' to see the difference.