Thypari
(Thypari)
July 25, 2020, 10:54pm
1
I am reading a directory tree from disk. While traversing the directories, I stream them via IEnumerable directly to my asynchronous database methods. Because the writes are asynchronous, I can't rely on the correct order (parent before child directory), so I just create all the nodes first and connect them with relationships in a later step. This makes use of all Neo4j threads and seems a lot faster than writing synchronously.
But it's still really slow.
Would it be faster to write the nodes to disk first, e.g. into a CSV file, and then bulk insert them into Neo4j?
Any suggestions would be appreciated.
Hi Thypari,
If I were you, I would collect the directories into lists of 1,000-2,000 and then make one query call per list, using UNWIND to handle the whole list in one go.
For example, if creating a Dir was:
CREATE (:Directory $param)
it would now be:
UNWIND $param AS dir
CREATE (d:Directory) SET d = dir
That's typically a lot faster.
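To build those lists on the C# side, here is a minimal sketch of a batching helper. The `Batching` class and the `InBatches` name are made up for illustration and are not part of the Neo4j driver; the batch size is whatever you pick.

```csharp
using System;
using System.Collections.Generic;

public static class Batching
{
    // Lazily splits a sequence into lists of at most `size` items,
    // so a streamed IEnumerable never has to be fully materialised.
    public static IEnumerable<List<T>> InBatches<T>(this IEnumerable<T> source, int size)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        return Iterate(source, size);
    }

    private static IEnumerable<List<T>> Iterate<T>(IEnumerable<T> source, int size)
    {
        var batch = new List<T>(size);
        foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == size)
            {
                yield return batch;
                batch = new List<T>(size); // start a fresh list; the old one belongs to the caller
            }
        }
        // Emit the final, possibly smaller, batch.
        if (batch.Count > 0) yield return batch;
    }
}
```

Each list it yields would then be passed as `$param` to the UNWIND query, one query call per batch.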
All the best
Chris
Thypari
(Thypari)
July 31, 2020, 7:04am
3
Do you still have to convert all user-defined types to Dictionaries?
Directory { long size, string name, ShareInformation shareInformation }
ShareInformation { string someProperty1, int someProperty2 }
So I can't just pass a List<Directory> into the Cypher query because it contains a ShareInformation property? The same goes for types that aren't listed as supported, like GUIDs:
/// <see cref="ulong"/>,
/// <see cref="byte"/>,
/// <see cref="char"/>,
/// <see cref="bool"/>,
/// <see cref="string"/>,
/// <see cref="List{T}"/>,
/// <see cref="INode"/>,
/// <see cref="IRelationship"/>,
/// <see cref="IPath"/>.
/// Undefined support for other types that are not listed above.
/// No support for user-defined types, e.g. Person, Movie.
/// </typeparam>
/// <returns>The value of specified return type.</returns>
/// <remarks>Throws <see cref="InvalidCastException"/> if the specified cast is not possible.</remarks>
public static T As<T>(this object value, T defaultValue)
{
    return value == null ? defaultValue : value.As<T>();
}
/// <summary>
/// A helper method to explicitly cast the value streamed back via Bolt to a local type.
Is the recommended approach still to convert objects into nested Dictionaries?
Yep, unfortunately so - you'd need to parse the output, something like this:
// Needs the System.Collections.Generic, System.IO, System.Linq,
// System.Threading.Tasks and Neo4j.Driver namespaces.
async Task Main()
{
    var directory = new DirectoryInfo("d:\\Projects\\");
    var directories = directory
        .GetDirectories()
        .Select(d => new Directory
        {
            Name = d.Name,
            Size = d.GetFiles().Length + d.GetDirectories().Length,
            ShareInformation = new ShareInformation { PropInt = 1, PropString = d.FullName }
        })
        .ToList();

    // One query for the whole list: UNWIND turns the list parameter into rows.
    var query = new Query(
        @"UNWIND $directories AS dir
          CREATE (d:Directory) SET d = dir",
        new Dictionary<string, object> {
            { "directories", ConvertToDriverFormatFromCollection(directories) }
        });

    var driver = GraphDatabase.Driver("neo4j://localhost:7687", AuthTokens.Basic("neo4j", "neo"), config => config.WithEncryptionLevel(EncryptionLevel.None));
    var session = driver.AsyncSession();
    var x = await session.RunAsync(query);
    await x.ConsumeAsync();
    await session.CloseAsync();
}

public IEnumerable<IDictionary<string, object>> ConvertToDriverFormatFromCollection<T>(IEnumerable<T> items)
{
    return items.Select(i => ConvertToDriverFormat(i));
}

public IDictionary<string, object> ConvertToDriverFormat<T>(T item)
{
    // Keeps only readable value-type and string properties. Nested objects such as
    // ShareInformation are skipped, so they are not written to the node.
    // The parentheses matter: without them, && binds tighter than ||.
    return item.GetType().GetProperties()
        .Where(p => p.CanRead && (p.PropertyType.IsValueType || p.PropertyType == typeof(string)))
        .ToDictionary(p => p.Name, p => p.GetValue(item));
}

public class Directory
{
    public long Size { get; set; }
    public string Name { get; set; }
    public ShareInformation ShareInformation { get; set; }
}

public class ShareInformation
{
    public string PropString { get; set; }
    public int PropInt { get; set; }
}
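Note that the converter above only picks up value-type and string properties, so the nested ShareInformation is left out of the dictionary. If you want it stored on the node, one option is to flatten it into prefixed keys, since node properties can't hold nested maps. A rough sketch, assuming a made-up `DriverFormat.Flatten` helper (not a driver API) and assuming you are happy to store a Guid as its string form:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Reflection;

public static class DriverFormat
{
    // Flattens an object into a single-level dictionary.
    // Nested objects become prefixed keys, e.g. ShareInformation_PropInt.
    public static Dictionary<string, object> Flatten(object item, string prefix = "")
    {
        var result = new Dictionary<string, object>();
        foreach (PropertyInfo p in item.GetType().GetProperties())
        {
            // Skip write-only properties and indexers.
            if (!p.CanRead || p.GetIndexParameters().Length > 0) continue;

            object value = p.GetValue(item);
            if (value == null) continue; // leave null properties unset

            string key = prefix + p.Name;
            Type t = value.GetType(); // runtime type, so int? is seen as int

            if (t == typeof(Guid))
                result[key] = value.ToString();   // assumption: store Guids as strings
            else if (t.IsPrimitive || t == typeof(string))
                result[key] = value;              // pass simple values through unchanged
            else if (value is IEnumerable || t.IsValueType)
                continue;                         // collections and other structs: not handled in this sketch
            else
                foreach (var kv in Flatten(value, key + "_"))
                    result[kv.Key] = kv.Value;    // nested class: recurse (no cycle detection)
        }
        return result;
    }
}
```

For the Directory class above this gives the keys Size, Name, ShareInformation_PropString and ShareInformation_PropInt, which can be passed to the same UNWIND query.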