cancel
Showing results for 
Search instead for 
Did you mean: 

Convert int to string with leading 0

trevor_miles
Node Link

I'm trying to convert a date into a YYYYMMDD and YYYYMM format in order to create an index for a GraphAware TimeTree, as suggested in this post.

As noted in their blog, the authors originally used as MM/DD/YYYY format for the UUID, which leads to some issues with sorting.

I've tried converting the original code as follows:

MATCH (root:TimeTreeRoot)-[:CHILD]->(year:Year)
WITH collect(year) AS years UNWIND years AS y
	SET y.uuid = toString(y.value)
	WITH y
	MATCH (y)-[:CHILD]->(month:Month)
	WITH y, collect(month) AS months UNWIND months AS m
		SET m.uuid = y.value + substring("00", size(m.value)) + m.value
		WITH y, m
		MATCH (m)-[:CHILD]->(day:Day)
		WITH y, m, collect(day) AS days UNWIND days AS d
			SET d.uuid = y.value + substring("00", size(m.value)) + m.value + substring("00", size(d.value)) +  d.value
;

First of all this does not seem very efficient to me. Second, I'm not getting the result I would expect. The days and months less than 10 are left padded with a0, but so are the days and months greater than 9. Running
MATCH (n:Month) WHERE EXISTS(n.uuid) RETURN DISTINCT "node" as entity, n.uuid AS uuid LIMIT 25 UNION ALL MATCH ()-[r]-() WHERE EXISTS(r.uuid) RETURN DISTINCT "relationship" AS entity, r.uuid AS uuid LIMIT 25
returns

|"node"|"201901"|
|"node"|"197001"|
|"node"|"201902"|
|"node"|"201903"|
|"node"|"201904"|
|"node"|"201905"|
|"node"|"201906"|
|"node"|"201907"|
|"node"|"201908"|
|"node"|"201909"|
|"node"|"2019010"|
|"node"|"2019011"|
|"node"|"2019012"|

Running the same query of Day returns

|node"|"201901027"|
|"node"|"201901028"|
|"node"|"201901029"|
|"node"|"201901030"|
|"node"|"201901031"|
|"node"|"20190201"|
|"node"|"20190202"|
|"node"|"20190203"|
|"node"|"20190204"|
|"node"|"20190205"|
|"node"|"20190206"|
|"node"|"20190207"|
|"node"|"20190208"|
|"node"|"20190209"|
|"node"|"201902010"|
|"node"|"201902011"|
|"node"|"201902012"|
|"node"|"201902013"|
|"node"|"201902014"|
|"node"|"201902015"|
|"node"|"201902016"|
|"node"|"201902017"|

I was looking for a format statement along the lines of the code below, but I cannot find anything in the Neo4j manuals.
SET m.uuid = y.value + format("00", m.value)

#GraphAware #TimeTree #format #convert-time

10 REPLIES 10

Christophe_Will
Graph Buddy

Are you sure you're trying to use the GraphAware timetree ? If yes, then it only accepts ms timestamps as time property.

Yes, I am indeed using GraphAware Time Tree. Yes, I know it uses ms timestamps to create the tree, but as the blog pointed out searching the time tree is very inefficient without an index on the nodes, so I'm trying to create the "uuid" values for the nodes so that they can be traversed very quickly.

The issue I'm having is not with creating the Time Tree, but rather with creating the "uuid" values on the Month and Day nodes.

Also make sure you actually need a timetree. Since indexable native temporal types were added in 3.4.x, many use cases can now be handled by using these new types and adding appropriate indexes.

That said, there are still several cases where timetrees remain useful. Can you give us some info on what you're using this for?

Good question, and I may be approaching this from the incorrect angle. I need to create a time series so that I can represent the state of a piece of equipment at different times. Let's say that the equipment has the following states: Off, Idle, Failure, Setup, Working. I need to capture the state changes over time.

I also need to follow a product during different stages of production and distribution. Think of a job application as it moves from initial inquiry to interviews to acceptance to start.

Hope this helps.

You could model your state/event nodes with temporal types as properties on those nodes, and relationships to the product in question going through these events. And while you can easily order and collect a product's events, you could even create a linked list between the event nodes themselves, if your queries require it.

Two primary cases come to mind as to why you might need a time tree (and I'm sure Christophe might come up with several more):

  1. If you need to traverse through dates when there are not relationships connecting events on the same date, so starting with an event, finding other events that are on the same date via traversal rather than by executing an additional indexed query.

  2. If you need to explore temporal elements across differing larger temporal contexts...for example, days spanning multiple of week/month/year (every day of the week...every first Monday of every month), or the same month/months spanning years (every October). Time trees can make it easier to query across discrete time elements. For instance, a query to find events in September would perform an index lookup for the :Month of September (or traverse from years to months and filter) then expand to events under each September month node.

By contrast, looking at temporal types, although we can index date values on nodes, lookup of nodes by specific date elements (WHERE n.eventDate.month = 9) don't use the index, and would fall back to a label scan. Using the time tree for these cases would likely be more efficient.

Thanks Andrew, this is very useful information and confirms that I will need the Time Tree.

Which means I am still left with my original issue: How to create sortable indices on Month and Day nodes. The most sensible to me is integer values of YYYYMMDD, so Jan 1, 2019 will become 20190101 and Oct 1, 2019 will become 20191001.

Any thoughts on how to do this in Cypher?

APOC date/time conversion functions should do the trick, but if all you need is padding then you can use APOC text functions for this:

return apoc.text.lpad('51', 2, '0')

returns '51' while

return apoc.text.lpad('1', 2, '0')

returns '01'

Thank you Andrew, I solved it in Cypher. See below. Is an index on a string more efficient than on an integer? I'm using an integer.

I was hoping to avoid apoc, but maybe I should branch out.

MATCH (root:TimeTreeRoot)-[:CHILD]->(year:Year)
WITH collect(year) AS years UNWIND years AS y
	SET y.uuid = toInt(y.value)
	WITH y
	MATCH (y)-[:CHILD]->(month:Month)
	WITH y, collect(month) AS months UNWIND months AS m
		SET m.uuid = toInt(y.value*100 + m.value)
		WITH y, m
		MATCH (m)-[:CHILD]->(day:Day)
		WITH y, m, collect(day) AS days UNWIND days AS d
			SET d.uuid = toInt(y.value*10000 + m.value*100 +  d.value)
;

Andrew, you must have missed my related question:
Is an Integer more efficient as an index that a String?

I'm not currently aware of any notable performance difference here. If I find out otherwise I'll let you know.