Regarding the comment:
When Neo4j creates an index, it creates a redundant copy of the data in the database. Therefore using an index will result in more disk space being utilized, plus slower writes to the disk.
Therefore, you need to weigh up these factors when deciding which data/properties to index.
As a point of clarification, 'creates a redundant copy of data' should read 'creates a redundant copy of data for the property indexed'.
For example, if you have 100 million :Person nodes, each with 20 properties, and you then create an index on :Person(age), we do not create a duplicate copy of those 100 million :Person nodes with all 20 properties. Rather, we simply create a redundant copy of the given property (age) for the 100 million :Person nodes.
Also, with regard to 'using an index will result in more disk space being utilized, plus slower writes to the disk': this is true, but it is true of almost any database, including most RDBMSs. Indexes are not exactly free. They are free to create, yes, but they do impact load/write performance, simply because whenever you update the data you also need to update the associated indexes.
As to indexes and why they are important, if one runs
match (n:Person) where n.age>20 and n.age<30 return n;
without an index on the age property, we would need to iterate over all 100 million :Person nodes and check each one to see whether it satisfies the WHERE clause. However, with an index on :Person(age), the index already has the details of the age property, so the query would be much faster.
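To see the difference for yourself, here is a minimal sketch. It assumes Neo4j 4.x or later index syntax, and the index name person_age is just a placeholder I picked for illustration. On 3.x the equivalent statement would be CREATE INDEX ON :Person(age).

```cypher
// Create an index on the age property of :Person nodes.
// The name person_age is hypothetical; choose any name you like.
CREATE INDEX person_age IF NOT EXISTS FOR (n:Person) ON (n.age);

// Show the query plan without executing the query.
EXPLAIN
MATCH (n:Person)
WHERE n.age > 20 AND n.age < 30
RETURN n;
```

Without the index, the plan should typically show a NodeByLabelScan followed by a Filter. Once the index is online, you would expect a NodeIndexSeekByRange instead, though the exact plan depends on your version and on what the planner decides for your data.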