Changing order variables in database

Hi everyone,

I´m trying to create some customers which later on will be displayed on a map in R. Unfortunatly the lat and long coordinates are mixed up sometimes.

Here is the code to create the data:

CREATE (C0001:Customer {name: 'Minnetonka MN', province: 'Minnesota', customer_lead_time_in_h: 48, lat: 44.930000, long: -93.46})
CREATE (C0002:Customer {name: 'Edinburg TX', province: 'Texas', customer_lead_time_in_h: 52, lat: 26.300000, long: -98.16})
CREATE (C0003:Customer {name: 'Charleston SC', province: 'South Carolina', customer_lead_time_in_h: 30, lat: 32.790000, long: -79.99})
CREATE (C0004:Customer {name: 'Brentwood NY', province: 'Philadelphia', customer_lead_time_in_h: 50, lat: 40.780000, long: -73.25})
CREATE (C0005:Customer {name: 'Bel Air South MD', province: 'Maryland', customer_lead_time_in_h: 48, lat: 39.500000, long: -76.32})
CREATE (C0006:Customer {name: 'Lincoln NE', province: 'Nebraska', customer_lead_time_in_h: 48, lat: 40.820000, long: -96.69})

When searching for all customers using:

MATCH (c:Customer) RETURN c

... the order of long and lat is changed for the 5th item (id 4):

"Normal" customer:

{
  "identity": 3,
  "labels": [
    "Customer"
  ],
  "properties": {
    "name": "Brentwood NY",
    "customer_lead_time_in_h": 50,
    "province": "Philadelphia",
    "lat": 40.78,
    "long": -73.25
  }
}

5th customer:

{
  "identity": 4,
  "labels": [
    "Customer"
  ],
  "properties": {
    "name": "Bel Air South MD",
    "customer_lead_time_in_h": 48,
    "province": "Maryland",
    "long": -76.32,
    "lat": 39.5
  }
}

I have more customers in my database which have the same problem (there are more customers than the 6 mentioned here).
This behaviour would not be a problem if R would recognise the columns correctly, which is not the case.
By using this code in R:

# function to query neo4j tibble
queryNeo <- function(q) {
  out <- q %>%
    call_neo4j(con)
  return(out)
}

cus <- queryNeo("MATCH (c:Customer) RETURN c")

usa <- map_data("usa")
ggplot() + 
  geom_polygon(data=usa, aes(x=long, y=lat, group=group)) +
  coord_fixed(1.3) +
  geom_point(data = cus$c, aes(x=long, y=lat, colour="blue"), size = 0.5)

I receive a tibble, in which some columns have their coordinates mixed up:

> cus$c
# A tibble: 50 x 5
   customer_lead_time_in~ province      name             lat  long
                    <int> <chr>         <chr>          <dbl> <dbl>
 1                     48 Minnesota     Minnetonka MN   44.9 -93.5
 2                     52 Texas         Edinburg TX     26.3 -98.2
 3                     30 South Caroli~ Charleston SC   32.8 -80.0
 4                     50 Philadelphia  Brentwood NY    40.8 -73.2
 5                     48 Maryland      Bel Air Sout~  -76.3  39.5
 6                     48 Nebraska      Lincoln NE      40.8 -96.7

There also comes a warning message when returing the customers:

Column 5 ['lat'] of item 5 appears in position 4 in item 4. Set use.names=TRUE to match by column name, or use.names=FALSE to ignore column names. use.names='check' (default from v1.12.2) emits this message and proceeds as if use.names=FALSE for  backwards compatibility. See news item 5 in v1.12.2 for options to control this message.

But the call_neo4j function does not have a variable use.names.

All this is created using Neo4j version 3.5.18 (because 3.6.X does not work with neo4r).

I also tried to create a new database, where the described behaviour was reproducable.

Any suggestions how to fix this are welcome!
Thanks everyone.

Instead of returning the c as-is you could make the reply order consistent by specifying it manually:

MATCH (c:Customer) RETURN collect( 
  {
      name:c.name,
      customer_lead_time_in_h: c. customer_lead_time_in_h,
      province: c. province,
      lat: c.lat,
      long: c.long
}) as c
1 Like

Your solution works fine in the Neo4j Browser. Unfortunatly there seems to be a problem with the collect function when using neo4r:

cus <- queryNeo("MATCH (c:Customer) RETURN collect( 
  {
      name: c.name,
      customer_lead_time_in_h: c.customer_lead_time_in_h,
      province: c.province,
      lat: c.lat,
      long: c.long
}) as c")
 Error in .x[[i]] : subscript out of bounds 

When in the browser do all records have all properties?
It could be that some have missing props which are skipped on the collect.
Not familiar with the neo4r driver but perhaps it gets confused when a prop is missing,

To clarify: if some nodes do not have a lat property the collect will create an object without that field.
So:

{
      name: 'name',
      customer_lead_time_in_h: 42,
      province: 'val',
      long: 42
}

instead of:

{
      name: 'name',
      customer_lead_time_in_h: 42,
      province: 'val',
      lat: 0,
      long: 42
}
1 Like

Thanks for the clarification!

The error occurs in R. As a workaround I figured out to return every variable without the collect-function.

This seems like a problem with neo4r.

Thanks for your help!

1 Like