@michael.hunger/@mark.needham
I am just starting to learn Neo4j and am trying to build a graph from my asset data. The data contains both software and hardware information; below is a sample JSON record:
{
"DisplayName": "Google Chrome",
"DisplayVersion": " 67.0.3396.99",
"Publisher": " Google Inc.",
"InstallDate": "20160629",
"EstimatedSize": "",
"HostName": "LTP-1001",
"Manufacturer": "Dell Inc.",
"Model": "Vostro 3458",
"CPU": "Intel(R) Core(TM) i3-4005U CPU @ 1.70GHz",
"RAM": "3 GB",
"IPAddress": "10.101.52.199 fe80::1040:f590:6d5:4346",
"HDDCapacity": "465.66",
"HDDSpace": "87.44 %",
"OperatingSystem": "Microsoft Windows 7 Professional ",
"ServicePack": "0",
"LastReboot": "20180801130720.109999+330"
}
Note: the rest of the data similarly maps different software to different hardware.
Below is the Cypher query I used to create the nodes:
CALL apoc.load.json("file:/home/test/AssetInfo.json") YIELD value AS data
WITH data WHERE data.DisplayName <> 'null' AND data.DisplayName <> ''
MERGE (s:Software {DisplayName: data.DisplayName})
MERGE (h:Hardware {HostName: data.HostName})
SET h.Manufacturer = data.Manufacturer
SET h.Model = data.Model
SET h.CPU = data.CPU
SET h.RAM = data.RAM
SET h.IPAddress = data.IPAddress
SET h.HDDCapacity = data.HDDCapacity
SET h.HDDSpace = data.HDDSpace
SET h.OperatingSystem = data.OperatingSystem
SET h.ServicePack = data.ServicePack
SET h.UserloggedIn = data.UserloggedIn
SET h.LastReboot = data.LastReboot
WITH s, h, data, COUNT(*) AS count
MERGE (s)-[i:installed]->(h)
ON CREATE SET i.DisplayVersion = data.DisplayVersion,
  i.Publisher = data.Publisher,
  i.InstallDate = data.InstallDate,
  i.EstimatedSize = data.EstimatedSize,
  i.count = count
RETURN s,h
Now I want to build a graph that shows the most important software connected to each hardware device. I am using the PageRank graph algorithm for this, but I am unable to get a meaningful PageRank score.
Below is the PageRank query I executed. The score is the same for every software name, and I do not understand how the scores are generated. Please correct me if I am doing anything wrong in the setup.
CALL algo.pageRank.stream('Software', 'installed', {iterations:20, dampingFactor:0.85})
YIELD nodeId, score
MATCH (node) WHERE id(node) = nodeId
RETURN node.DisplayName AS page,score
ORDER BY score DESC LIMIT 10
I am following this page to understand the PageRank algorithm:
https://neo4j.com/docs/graph-algorithms/3.4/algorithms/page-rank/
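To check my understanding of how the scores could be produced, here is a minimal Python sketch of the iterative formula as I read it. It is not the library's implementation. It assumes the non-normalized update PR(n) = (1 - d) + d * sum(PR(m) / outdegree(m)) over the nodes m that point to n, and it uses a small hypothetical sample ("Mozilla Firefox" and "LTP-1002" are made up for illustration).

```python
# Minimal PageRank sketch (an illustration, NOT the Neo4j implementation).
# Assumed update rule:
#   PR(n) = (1 - d) + d * sum(PR(m) / out_degree(m)) for every m with an edge m -> n

def pagerank(nodes, edges, damping=0.85, iterations=20):
    # Count outgoing edges per node so each source splits its score evenly.
    out_degree = {n: 0 for n in nodes}
    for source, _ in edges:
        out_degree[source] += 1

    # Start every node at the base score (1 - d).
    scores = {n: 1.0 - damping for n in nodes}

    for _ in range(iterations):
        # Collect the share each node receives over its incoming edges.
        incoming = {n: 0.0 for n in nodes}
        for source, target in edges:
            incoming[target] += scores[source] / out_degree[source]
        scores = {n: (1.0 - damping) + damping * incoming[n] for n in nodes}
    return scores


# Hypothetical sample shaped like my model: (Software)-[:installed]->(Hardware).
nodes = ["Google Chrome", "Mozilla Firefox", "LTP-1001", "LTP-1002"]
edges = [
    ("Google Chrome", "LTP-1001"),
    ("Google Chrome", "LTP-1002"),
    ("Mozilla Firefox", "LTP-1001"),
]

scores = pagerank(nodes, edges)
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.4f}")
```

In this sketch every Software node ends at the base value 1 - d = 0.15, because all relationships point from Software to Hardware and nothing points back at a Software node. Only the Hardware nodes receive any score from their neighbours. If the real algorithm behaves the same way, that might explain the identical scores I see, but I am not sure this assumption holds.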
I also tried PageRank with a Cypher projection; below is the query I ran:
CALL algo.pageRank.stream(
  'MATCH (s:Software)-[:installed]->(h:Hardware) RETURN id(s) AS source, id(h) AS target')
YIELD node, score
WITH node, score
ORDER BY score DESC LIMIT 10
RETURN node.HostName, node.DisplayName, score
I am again a little confused about how the PageRank score is calculated in this case. Please share your thoughts and help me resolve this.
Regards,
Ganeshbabu R