Ah, sorry, I misunderstood. So "UniqEntity" is useless; I'm going to remove it, put a constraint on ID, and relaunch the query.
Can we keep the same query, or should it be updated?
Best regards,
If the property name is id, you can use this one:
MATCH (a)-[*]-(b)
WITH id(a) AS id, apoc.coll.sortText(apoc.coll.toSet(collect(DISTINCT b.id) + [a.id])) AS nodes_list
WITH DISTINCT nodes_list, size(nodes_list) AS size
WITH size, apoc.coll.flatten(collect(nodes_list)) AS nodes_list
CALL apoc.periodic.iterate('
MATCH (n)
WHERE n.id IN $nodes_list
RETURN n
', '
SET n.community_id = $community_id
', {batchSize:1000, params:{nodes_list:nodes_list, community_id:size}}) YIELD batch, operations
RETURN 1
This can require more memory than the database size, because Cypher needs to keep all these paths in memory, so it can trigger heavy garbage collection.
Have you tried the weakly connected components algorithm in GDS?
Maybe it can help identify the weakly connected components. Once you have run it, you can pick out the smaller communities and delete them.
Oh, I forgot about this one.
Yes, it could work.
Thank you for the suggestion, I'm going to try that.
Best regards!
I found this:
CALL gds.wcc.stream({
nodeProjection: "Library",
relationshipProjection: "DEPENDS_ON"
})
YIELD nodeId, componentId
RETURN componentId, collect(gds.util.asNode(nodeId).id) AS libraries
ORDER BY size(libraries) DESC;
Would that be a good starting point?
Yeah, good start!
If you want to do everything in one go (you may have to change the nodeProjection and the relationshipProjection), this query deletes communities that have fewer than 6 nodes:
CALL gds.wcc.stream({
nodeProjection: "Item",
relationshipProjection: "BELONGS_TO"
})
YIELD nodeId, componentId
WITH componentId, collect(gds.util.asNode(nodeId).id) AS libraries
WITH size(libraries) AS size, libraries
WHERE size < 6
WITH apoc.coll.flatten(collect(libraries)) AS nodes_list
CALL apoc.periodic.iterate('
MATCH (n)
WHERE n.id IN $nodes_list
RETURN n
', '
DETACH DELETE n
', {batchSize:1000, params:{nodes_list:nodes_list}}) YIELD batch, operations
RETURN 1
If you want to do it in two steps:
CALL gds.wcc.stream({
nodeProjection: "Item",
relationshipProjection: "BELONGS_TO"
})
YIELD nodeId, componentId
WITH componentId, collect(gds.util.asNode(nodeId).id) AS libraries
WITH size(libraries) AS size, libraries
WITH size, apoc.coll.flatten(collect(libraries)) AS nodes_list
CALL apoc.periodic.iterate('
MATCH (n)
WHERE n.id IN $nodes_list
RETURN n
', '
SET n.community_id = $community_id
', {batchSize:1000, params:{nodes_list:nodes_list, community_id:size}}) YIELD batch, operations
RETURN 1
CALL apoc.periodic.iterate('MATCH (n) WHERE n.community_id < $community_id RETURN n', 'DETACH DELETE n', {batchSize:1000, params:{community_id:6}})
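Before running the second statement, a quick check of what it would remove may be useful. A minimal sketch, assuming community_id holds the component size as set by the first statement:

```cypher
// Count nodes per stored community size, smallest first
MATCH (n)
WHERE n.community_id IS NOT NULL
RETURN n.community_id AS community_size, count(n) AS nodes
ORDER BY community_size
```

Rows with community_size below 6 are the ones the delete step would remove.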
Regards,
Cobra
Hi,
I tried to delete all the nodes so I could do another import, using the following command:
match (a) -[r] -> () delete a, r
After a while, I got this error message:
Neo.DatabaseError.Transaction.TransactionCommitFailed
This makes me think of a database settings issue (maybe the root cause of the issue with the query not ending?).
My settings:
dbms.directories.import=import
dbms.security.auth_enabled=true
dbms.memory.heap.initial_size=512m
dbms.memory.heap.max_size=4G
dbms.memory.pagecache.size=2G
dbms.tx_state.memory_allocation=ON_HEAP
dbms.connector.bolt.enabled=true
dbms.connector.http.enabled=true
dbms.connector.https.enabled=false
dbms.security.procedures.unrestricted=apoc.*
dbms.jvm.additional=-XX:+UseG1GC
dbms.jvm.additional=-XX:-OmitStackTraceInFastThrow
dbms.jvm.additional=-XX:+AlwaysPreTouch
dbms.jvm.additional=-XX:+UnlockExperimentalVMOptions
dbms.jvm.additional=-XX:+TrustFinalNonStaticFields
dbms.jvm.additional=-XX:+DisableExplicitGC
dbms.jvm.additional=-XX:MaxInlineLevel=15
dbms.jvm.additional=-Djdk.nio.maxCachedBufferSize=262144
dbms.jvm.additional=-Dio.netty.tryReflectionSetAccessible=true
dbms.jvm.additional=-Djdk.tls.ephemeralDHKeySize=2048
dbms.jvm.additional=-Djdk.tls.rejectClientInitiatedRenegotiation=true
dbms.jvm.additional=-XX:FlightRecorderOptions=stackdepth=256
dbms.jvm.additional=-XX:+UnlockDiagnosticVMOptions
dbms.jvm.additional=-XX:+DebugNonSafepoints
dbms.windows_service_name=neo4j
Do they seem OK to you?
Have a great day!
I tried again and got:
Neo.DatabaseError.Statement.ExecutionFailed
Java heap space
To delete everything in the database, you should use:
CALL apoc.periodic.iterate('MATCH (n) RETURN n', 'DETACH DELETE n', {batchSize:1000})
Thank you!
BTW, I increased dbms.memory.heap.max_size to 16G and the delete query then ran successfully.
That is another way, but it is still better to use the query I gave you, since it deletes in batches.
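The batched form is lighter on the heap because each batch is committed in its own transaction instead of holding the whole delete in one transaction state. A sketch of the same call that also reports failures; the YIELD columns are ones apoc.periodic.iterate returns, and the batch size of 1000 is simply the value used above, not a tuned figure:

```cypher
// Delete every node in batches of 1000 and report the outcome
CALL apoc.periodic.iterate(
  'MATCH (n) RETURN n',
  'DETACH DELETE n',
  {batchSize: 1000, parallel: false}
)
YIELD batches, total, failedBatches, errorMessages
RETURN batches, total, failedBatches, errorMessages
```

If failedBatches is not 0, errorMessages should indicate why some batches did not commit.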
Hi Cobra,
The query is successful, but it seems no nodes are deleted. The query also terminates very fast.
Query used:
CALL gds.wcc.stream({
nodeProjection: "Entity",
relationshipProjection: "DEPENDS"
})
YIELD nodeId, componentId
WITH componentId, collect(gds.util.asNode(nodeId).id) AS libraries
WITH size(libraries) AS size, libraries
WHERE size < 16
WITH apoc.coll.flatten(collect(libraries)) AS nodes_list
CALL apoc.periodic.iterate('
MATCH (n)
WHERE n.id IN $nodes_list
RETURN n
', '
DETACH DELETE n
', {batchSize:1000, params:{nodes_list:nodes_list}}) YIELD batch, operations
RETURN 1
Can you show me what is returned by:
CALL gds.wcc.stream({
nodeProjection: "Entity",
relationshipProjection: "DEPENDS"
})
YIELD nodeId, componentId
WITH componentId, collect(gds.util.asNode(nodeId).id) AS libraries
WITH size(libraries) AS size, libraries
RETURN *
Can you show me your properties on the right, please? And tell me which one is unique.
Sure. Here you go:
Here are the headers of my CSV file:
Entity:ID,description:LABEL
Entity and ID are the same data. I added a unique constraint on "Entity", even though I guess it is done automatically because Entity is used as the ID.
Best regards
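One way to see which property actually holds the identifier is to inspect a few nodes directly. This is a sketch that assumes nothing about labels or property names. With a header like Entity:ID, the import tool typically stores the identifier under a property named Entity rather than id, in which case n.id would be null. The two statements are meant to be run separately:

```cypher
// Show labels and property keys for a handful of nodes
MATCH (n)
RETURN labels(n) AS node_labels, keys(n) AS property_keys
LIMIT 5;

// Count nodes that actually have an `id` property
MATCH (n)
WHERE n.id IS NOT NULL
RETURN count(n) AS nodes_with_id;
```

If nodes_with_id is 0, then `WHERE n.id IN $nodes_list` matches nothing, which would explain a query that finishes quickly without deleting any node.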
What is returned by:
CALL gds.wcc.stream({
nodeProjection: "Entity",
relationshipProjection: "DEPENDS"
})
YIELD nodeId, componentId
WITH componentId, collect(gds.util.asNode(nodeId).Entity) AS libraries
WITH size(libraries) AS size, libraries
RETURN *
Try this:
CALL gds.wcc.stream({
nodeProjection: "Entity",
relationshipProjection: "DEPENDS"
})
YIELD nodeId, componentId
WITH componentId, collect(gds.util.asNode(nodeId)) AS libraries
RETURN *