Finding the number of companies incorporated in a year

I wrote the following query to get the incorporation dates into date format:

MATCH (c:Company)
WITH c, toString(c.incorporation) AS date
RETURN c.name, date

But since it is in dd/mm/yyyy format, I am struggling to extract just the year (yyyy) and then count the number of companies incorporated in each year.

The right() string function will give you the rightmost characters of a string. Assuming your date is already a string, your query would be:

MATCH (c:Company)
RETURN right(c.incorporation, 4) as year, count(c) as numCompanies
// order by year or numCompanies if needed

Thanks a lot! It worked.

I used the following query:

MATCH (c:Company)
WHERE c.incorporation <> "NA"
RETURN right(c.incorporation, 4) as year, count(c) as numCompanies
ORDER BY year DESC

I have another question. What would be the Cypher query to create an "Early Incorporation" relationship between companies, let's say the ones incorporated before 2011?

I am using the following query:
MATCH (c:Company)
WITH right(c.incorporation, 4) as Year
WHERE Year <"2011"
MERGE (a:Company)-[:INC_EARLY {year:Year}]-(b:Company);

Is it correct?

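I'm not entirely clear on what you're trying to do here, in particular what you want a and b to match. Since both variables are new in that MERGE, they aren't tied to the companies you matched earlier (c isn't even carried through the WITH). MERGE works on the whole pattern: for each row it looks for any existing pair of :Company nodes joined by an :INC_EARLY relationship with that year, and if it finds none it creates two new, empty :Company nodes with the relationship between them. So there's something missing about how you want to select a and b.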

What is the relationship supposed to mean between the two companies in each pairing? If you're just trying to mark companies as being incorporated before 2011, then don't use relationships for this; an additional label, :EarlyIncorporation or something similar, would be a better fit.
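
As a rough sketch of the label approach, something like the following could work. It assumes incorporation is a dd/mm/yyyy string with "NA" for missing dates, as in your earlier query; the label name :EarlyIncorporation is only a suggestion.

```cypher
// Sketch: tag companies incorporated before 2011 with an extra label.
// Assumes c.incorporation is a dd/mm/yyyy string, or "NA" when unknown.
MATCH (c:Company)
WHERE c.incorporation <> "NA"
  AND toInteger(right(c.incorporation, 4)) < 2011  // compare the year as a number
SET c:EarlyIncorporation
RETURN count(c) AS numTagged
```

After that, MATCH (c:Company:EarlyIncorporation) selects only the early companies, without any extra relationships.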

Dear Andrew,

I was trying to replicate the last chapter of the data science course provided by the Neo4j online training academy on my dataset of companies incorporated in the UK.
https://colab.research.google.com/github/neo4j-contrib/training-v2/blob/master/Courses/DataScience/notebooks/04_Predictions.ipynb

I am trying to predict possible future links between the companies. The exercise provided in the course did that with the dataset of authors and the possibility of future collaborations.

Since the exercise in the course made use of date property to split the data, I was trying to replicate the same.

Initially, I thought I would identify the companies with common addresses, create a relationship on that basis, and use the incorporation date to split the dataset. However, I have been struggling with the Cypher queries needed to build a link prediction for companies.
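
A minimal sketch of the common-address step might look like the query below. The address property and the SHARES_ADDRESS relationship type are placeholder names assumed for illustration, not something confirmed for this dataset.

```cypher
// Sketch only: `address` and SHARES_ADDRESS are assumed names.
MATCH (c:Company)
WHERE c.address IS NOT NULL
WITH c.address AS address, collect(c) AS companies
WHERE size(companies) > 1          // only addresses shared by 2+ companies
UNWIND companies AS a
UNWIND companies AS b
WITH a, b
WHERE id(a) < id(b)                // keep each unordered pair once, skip self-pairs
                                   // (id() is deprecated in Neo4j 5 in favor of elementId())
MERGE (a)-[:SHARES_ADDRESS]-(b)
```

Grouping by address first avoids comparing every company with every other company, which matters on a large dataset.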

Any guidance or help on how I can do this would be much appreciated.

My dataset mainly consists of a list of companies that were incorporated in the UK and were involved in corruption cases.