Cannot train GraphSage algorithm - java.lang.IllegalStateException: No more running tasks error

isisdesade
Node Link

I run Neo4j 4.4.4 in Docker,
with the Python graphdatascience client and GDS version 2.0.2,
Python 3.9.7,
using VS Code notebooks.

I'm quite new to Neo4j and wanted to train a GraphSAGE node embedding on the Game of Thrones graph.

Here's the code:

from graphdatascience import GraphDataScience

# Use Neo4j URI and credentials according to your setup
gds = GraphDataScience('bolt://isa-neo4j:7687', auth=('neo4j', '<some_password>'))

print(gds.version())

# create the projected graph
graphName = 'GotGraph2'

if gds.graph.exists(graphName)['exists']:
    G = gds.graph.get(graphName)
else:
    G, graphProjectResults = gds.graph.project(
        graphName,
        {
            'Character': {
                'label': 'Character',
                'properties': ['degree']
            }
        },
        {
            'INTERACTS1': {
                'type': 'INTERACTS1',
                'orientation': 'UNDIRECTED'
            }
        }
    )

model, res = gds.beta.graphSage.train(
  G,
  modelName='GotGraphModel',
  featureProperties=['degree']
)

It fails with ClientError: {code: Neo.ClientError.Procedure.ProcedureCallFailed} {message: Failed to invoke procedure gds.beta.graphSage.train: Caused by: java.lang.IllegalStateException: No more running tasks}

Note that I get the same behaviour when running the Cypher query from Neo4j Browser, so I don't think it is related to the Python client.

Thanks for any tips,

7 REPLIES

isisdesade
Node Link

Here's the debug.log

2022-04-19 09:24:02.283+0000 INFO  [o.n.k.a.p.GlobalProcedures] [neo4j.BoltWorker-2 [bolt] [/172.19.0.2:37210] ] Loading :: Start
2022-04-19 09:24:02.287+0000 INFO  [o.n.k.a.p.GlobalProcedures] [neo4j.BoltWorker-2 [bolt] [/172.19.0.2:37210] ] Loading :: Nodes :: Start
2022-04-19 09:24:02.290+0000 INFO  [o.n.k.a.p.GlobalProcedures] [neo4j.BoltWorker-2 [bolt] [/172.19.0.2:37210] ] Loading :: Nodes :: Store Scan :: Start
2022-04-19 09:24:02.299+0000 INFO  [o.n.k.a.p.GlobalProcedures] [node-store-scan-0] Loading :: Nodes :: Store Scan 100%
2022-04-19 09:24:02.304+0000 INFO  [o.n.k.a.p.GlobalProcedures] [neo4j.BoltWorker-2 [bolt] [/172.19.0.2:37210] ] Loading :: Nodes :: Store Scan :: Finished
2022-04-19 09:24:02.308+0000 INFO  [o.n.k.a.p.GlobalProcedures] [neo4j.BoltWorker-2 [bolt] [/172.19.0.2:37210] ] Loading :: Nodes :: Finished
2022-04-19 09:24:02.308+0000 INFO  [o.n.k.a.p.GlobalProcedures] [neo4j.BoltWorker-2 [bolt] [/172.19.0.2:37210] ] Loading :: Relationships :: Start
2022-04-19 09:24:02.309+0000 INFO  [o.n.k.a.p.GlobalProcedures] [neo4j.BoltWorker-2 [bolt] [/172.19.0.2:37210] ] Loading :: Relationships :: Store Scan :: Start
2022-04-19 09:24:02.317+0000 INFO  [o.n.k.a.p.GlobalProcedures] [relationship-store-scan-0] Loading :: Relationships :: Store Scan 100%
2022-04-19 09:24:02.321+0000 INFO  [o.n.k.a.p.GlobalProcedures] [neo4j.BoltWorker-2 [bolt] [/172.19.0.2:37210] ] Loading :: Relationships :: Store Scan :: Finished
2022-04-19 09:24:02.322+0000 INFO  [o.n.k.a.p.GlobalProcedures] [neo4j.BoltWorker-2 [bolt] [/172.19.0.2:37210] ] Loading :: Relationships :: Finished
2022-04-19 09:24:02.323+0000 INFO  [o.n.k.a.p.GlobalProcedures] [neo4j.BoltWorker-2 [bolt] [/172.19.0.2:37210] ] Loading :: Actual memory usage of the loaded graph: 337 KiB
2022-04-19 09:24:02.324+0000 INFO  [o.n.k.a.p.GlobalProcedures] [neo4j.BoltWorker-2 [bolt] [/172.19.0.2:37210] ] Loading :: Finished
2022-04-19 09:24:05.959+0000 WARN  [o.n.k.a.p.GlobalProcedures] Computation failed
java.lang.IllegalStateException: No more running tasks
	at org.neo4j.gds.core.utils.progress.tasks.TaskProgressTracker.lambda$requireCurrentTask$1(TaskProgressTracker.java:189) ~[neo4j-graph-data-science-2.0.2.jar:?]
	at java.util.Optional.orElseThrow(Optional.java:408) ~[?:?]
	at org.neo4j.gds.core.utils.progress.tasks.TaskProgressTracker.requireCurrentTask(TaskProgressTracker.java:189) ~[neo4j-graph-data-science-2.0.2.jar:?]
	at org.neo4j.gds.core.utils.progress.tasks.TaskProgressTracker.endSubTaskWithFailure(TaskProgressTracker.java:153) ~[neo4j-graph-data-science-2.0.2.jar:?]
	at org.neo4j.gds.executor.ProcedureExecutor.lambda$executeAlgorithm$0(ProcedureExecutor.java:135) ~[neo4j-graph-data-science-2.0.2.jar:?]
	at org.neo4j.gds.executor.ProcedureExecutor.runWithExceptionLogging(ProcedureExecutor.java:197) ~[neo4j-graph-data-science-2.0.2.jar:?]
	at org.neo4j.gds.executor.ProcedureExecutor.executeAlgorithm(ProcedureExecutor.java:129) ~[neo4j-graph-data-science-2.0.2.jar:?]
	at org.neo4j.gds.executor.ProcedureExecutor.compute(ProcedureExecutor.java:109) ~[neo4j-graph-data-science-2.0.2.jar:?]
	at org.neo4j.gds.AlgoBaseProc.compute(AlgoBaseProc.java:79) ~[neo4j-graph-data-science-2.0.2.jar:?]
	at org.neo4j.gds.AlgoBaseProc.compute(AlgoBaseProc.java:70) ~[neo4j-graph-data-science-2.0.2.jar:?]
	at org.neo4j.gds.embeddings.graphsage.GraphSageTrainProc.train(GraphSageTrainProc.java:56) ~[neo4j-graph-data-science-2.0.2.jar:?]
	at org.neo4j.kernel.impl.proc.GeneratedProcedure_train1918881933410100.apply(Unknown Source) ~[?:?]
	at org.neo4j.procedure.impl.ProcedureRegistry.callProcedure(ProcedureRegistry.java:235) ~[neo4j-procedure-4.4.4.jar:4.4.4]
	at org.neo4j.procedure.impl.GlobalProceduresRegistry.callProcedure(GlobalProceduresRegistry.java:352) ~[neo4j-procedure-4.4.4.jar:4.4.4]
	at org.neo4j.kernel.impl.newapi.AllStoreHolder.callProcedure(AllStoreHolder.java:1092) ~[neo4j-kernel-4.4.4.jar:4.4.4]
	at org.neo4j.kernel.impl.newapi.AllStoreHolder.procedureCallRead(AllStoreHolder.java:1004) ~[neo4j-kernel-4.4.4.jar:4.4.4]
	at org.neo4j.cypher.internal.runtime.interpreted.CallSupport$.$anonfun$callReadOnlyProcedure$1(CallSupport.scala:47) ~[neo4j-cypher-interpreted-runtime-4.4.4.jar:4.4.4]
	at org.neo4j.cypher.internal.runtime.interpreted.CallSupport$.callProcedure(CallSupport.scala:70) ~[neo4j-cypher-interpreted-runtime-4.4.4.jar:4.4.4]
	at org.neo4j.cypher.internal.runtime.interpreted.CallSupport$.callReadOnlyProcedure(CallSupport.scala:47) ~[neo4j-cypher-interpreted-runtime-4.4.4.jar:4.4.4]
	at org.neo4j.cypher.internal.runtime.interpreted.TransactionBoundReadQueryContext.callReadOnlyProcedure(TransactionBoundQueryContext.scala:1135) ~[neo4j-cypher-interpreted-runtime-4.4.4.jar:4.4.4]
	at org.neo4j.cypher.internal.planning.ExceptionTranslatingReadQueryContext.callReadOnlyProcedure(ExceptionTranslatingQueryContext.scala:226) ~[neo4j-cypher-4.4.4.jar:4.4.4]
	at org.neo4j.cypher.internal.runtime.LazyReadOnlyCallMode$.callProcedure(ProcedureCallMode.scala:50) ~[neo4j-cypher-runtime-util-4.4.4.jar:4.4.4]
	at org.neo4j.cypher.internal.runtime.interpreted.pipes.ProcedureCallPipe.call(ProcedureCallPipe.scala:88) ~[neo4j-cypher-interpreted-runtime-4.4.4.jar:4.4.4]
	at org.neo4j.cypher.internal.runtime.interpreted.pipes.ProcedureCallPipe.$anonfun$internalCreateResultsByAppending$1(ProcedureCallPipe.scala:74) ~[neo4j-cypher-interpreted-runtime-4.4.4.jar:4.4.4]
	at org.neo4j.cypher.internal.runtime.ClosingIterator$$anon$1.nextCur(ClosingIterator.scala:107) ~[neo4j-cypher-runtime-util-4.4.4.jar:4.4.4]
	at org.neo4j.cypher.internal.runtime.ClosingIterator$$anon$1.innerHasNext(ClosingIterator.scala:113) ~[neo4j-cypher-runtime-util-4.4.4.jar:4.4.4]
	at org.neo4j.cypher.internal.runtime.ClosingIterator.hasNext(ClosingIterator.scala:93) ~[neo4j-cypher-runtime-util-4.4.4.jar:4.4.4]
	at org.neo4j.cypher.internal.runtime.ClosingIterator$$anon$3.innerHasNext(ClosingIterator.scala:152) ~[neo4j-cypher-runtime-util-4.4.4.jar:4.4.4]
	at org.neo4j.cypher.internal.runtime.ClosingIterator.hasNext(ClosingIterator.scala:93) ~[neo4j-cypher-runtime-util-4.4.4.jar:4.4.4]
	at org.neo4j.cypher.internal.runtime.interpreted.PipeExecutionResult.serveResults(PipeExecutionResult.scala:85) ~[neo4j-cypher-interpreted-runtime-4.4.4.jar:4.4.4]
	at org.neo4j.cypher.internal.runtime.interpreted.PipeExecutionResult.request(PipeExecutionResult.scala:73) ~[neo4j-cypher-interpreted-runtime-4.4.4.jar:4.4.4]
	at org.neo4j.cypher.internal.result.StandardInternalExecutionResult.request(StandardInternalExecutionResult.scala:90) ~[neo4j-cypher-4.4.4.jar:4.4.4]
	at org.neo4j.cypher.internal.result.ClosingExecutionResult.request(ClosingExecutionResult.scala:144) ~[neo4j-cypher-4.4.4.jar:4.4.4]
	at org.neo4j.fabric.stream.QuerySubject$BasicQuerySubject$1.doRequest(QuerySubject.java:184) ~[neo4j-fabric-4.4.4.jar:4.4.4]
	at org.neo4j.fabric.stream.QuerySubject$BasicQuerySubject$1.request(QuerySubject.java:167) ~[neo4j-fabric-4.4.4.jar:4.4.4]
	at reactor.core.publisher.FluxPeek$PeekSubscriber.request(FluxPeek.java:138) ~[reactor-core-3.4.11.jar:3.4.11]
	at reactor.core.publisher.FluxPeek$PeekSubscriber.request(FluxPeek.java:138) ~[reactor-core-3.4.11.jar:3.4.11]
	at reactor.core.publisher.FluxPeek$PeekSubscriber.request(FluxPeek.java:138) ~[reactor-core-3.4.11.jar:3.4.11]
	at reactor.core.publisher.FluxPeek$PeekSubscriber.request(FluxPeek.java:138) ~[reactor-core-3.4.11.jar:3.4.11]
	at reactor.core.publisher.FluxPeek$PeekSubscriber.request(FluxPeek.java:138) ~[reactor-core-3.4.11.jar:3.4.11]
	at reactor.core.publisher.Operators$MultiSubscriptionSubscriber.request(Operators.java:2158) ~[reactor-core-3.4.11.jar:3.4.11]
	at reactor.core.publisher.FluxPeek$PeekSubscriber.request(FluxPeek.java:138) ~[reactor-core-3.4.11.jar:3.4.11]
	at reactor.core.publisher.StrictSubscriber.request(StrictSubscriber.java:138) ~[reactor-core-3.4.11.jar:3.4.11]
	at org.neo4j.fabric.stream.Rx2SyncStream$RecordSubscriber.request(Rx2SyncStream.java:129) ~[neo4j-fabric-4.4.4.jar:4.4.4]
	at org.neo4j.fabric.stream.Rx2SyncStream.maybeRequest(Rx2SyncStream.java:91) ~[neo4j-fabric-4.4.4.jar:4.4.4]
	at org.neo4j.fabric.stream.Rx2SyncStream.readRecord(Rx2SyncStream.java:50) ~[neo4j-fabric-4.4.4.jar:4.4.4]
	at org.neo4j.fabric.bolt.BoltQueryExecutionImpl$QueryExecutionImpl.request(BoltQueryExecutionImpl.java:179) ~[neo4j-fabric-4.4.4.jar:4.4.4]
	at org.neo4j.bolt.runtime.AbstractCypherAdapterStream.handleRecords(AbstractCypherAdapterStream.java:105) ~[neo4j-bolt-4.4.4.jar:4.4.4]
	at org.neo4j.bolt.v3.messaging.ResultHandler.onPullRecords(ResultHandler.java:41) ~[neo4j-bolt-4.4.4.jar:4.4.4]
	at org.neo4j.bolt.v4.messaging.PullResultConsumer.consume(PullResultConsumer.java:42) ~[neo4j-bolt-4.4.4.jar:4.4.4]
	at org.neo4j.bolt.runtime.statemachine.impl.TransactionStateMachine$State.consumeResult(TransactionStateMachine.java:507) ~[neo4j-bolt-4.4.4.jar:4.4.4]
	at org.neo4j.bolt.runtime.statemachine.impl.TransactionStateMachine$State$1.streamResult(TransactionStateMachine.java:260) ~[neo4j-bolt-4.4.4.jar:4.4.4]
	at org.neo4j.bolt.runtime.statemachine.impl.TransactionStateMachine.streamResult(TransactionStateMachine.java:99) ~[neo4j-bolt-4.4.4.jar:4.4.4]
	at org.neo4j.bolt.transaction.StatementProcessorTxManager.streamResults(StatementProcessorTxManager.java:249) ~[neo4j-bolt-4.4.4.jar:4.4.4]
	at org.neo4j.bolt.transaction.StatementProcessorTxManager.pullData(StatementProcessorTxManager.java:111) ~[neo4j-bolt-4.4.4.jar:4.4.4]
	at org.neo4j.bolt.v4.runtime.AutoCommitState.processStreamPullResultMessage(AutoCommitState.java:45) ~[neo4j-bolt-4.4.4.jar:4.4.4]
	at org.neo4j.bolt.v4.runtime.AbstractStreamingState.processUnsafe(AbstractStreamingState.java:51) ~[neo4j-bolt-4.4.4.jar:4.4.4]
	at org.neo4j.bolt.v3.runtime.FailSafeBoltStateMachineState.process(FailSafeBoltStateMachineState.java:48) ~[neo4j-bolt-4.4.4.jar:4.4.4]
	at org.neo4j.bolt.runtime.statemachine.impl.AbstractBoltStateMachine.nextState(AbstractBoltStateMachine.java:154) ~[neo4j-bolt-4.4.4.jar:4.4.4]
	at org.neo4j.bolt.runtime.statemachine.impl.AbstractBoltStateMachine.process(AbstractBoltStateMachine.java:102) ~[neo4j-bolt-4.4.4.jar:4.4.4]
	at org.neo4j.bolt.messaging.BoltRequestMessageReader.lambda$doRead$1(BoltRequestMessageReader.java:93) ~[neo4j-bolt-4.4.4.jar:4.4.4]
	at org.neo4j.bolt.runtime.DefaultBoltConnection.lambda$enqueue$0(DefaultBoltConnection.java:156) ~[neo4j-bolt-4.4.4.jar:4.4.4]
	at org.neo4j.bolt.runtime.DefaultBoltConnection.processNextBatchInternal(DefaultBoltConnection.java:252) ~[neo4j-bolt-4.4.4.jar:4.4.4]
	at org.neo4j.bolt.runtime.DefaultBoltConnection.processNextBatch(DefaultBoltConnection.java:187) ~[neo4j-bolt-4.4.4.jar:4.4.4]
	at org.neo4j.bolt.runtime.DefaultBoltConnection.processNextBatch(DefaultBoltConnection.java:177) ~[neo4j-bolt-4.4.4.jar:4.4.4]
	at org.neo4j.bolt.runtime.scheduling.ExecutorBoltScheduler.executeBatch(ExecutorBoltScheduler.java:257) ~[neo4j-bolt-4.4.4.jar:4.4.4]
	at org.neo4j.bolt.runtime.scheduling.ExecutorBoltScheduler.lambda$scheduleBatchOrHandleError$3(ExecutorBoltScheduler.java:240) ~[neo4j-bolt-4.4.4.jar:4.4.4]
	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.73.Final.jar:4.1.73.Final]
	at java.lang.Thread.run(Thread.java:829) [?:?]

Hello @isisdesade ,
this error indicates a bug on our end. I am surprised to see that not even a single GraphSage related task was started.
I will try to reproduce it and get back to you.

@isisdesade , I could not reproduce your bug locally when I used the game-of-thrones dataset.

Could you share how you imported the data?

As a note, you can also compute the degree property via GDS by calling CALL gds.degree.mutate("GoT2", {mutateProperty: 'degree'})
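
With the Python client used in the original post, the same call might look roughly like the sketch below. It assumes the graph was projected without a `degree` node property (mutating onto a property name that already exists in the projection is expected to fail), and `add_degree_property` is a hypothetical helper name, not part of the library.

```python
def add_degree_property(gds, G, property_name="degree"):
    """Compute degree centrality on the projected graph G and keep it in memory.

    `gds` is assumed to be a connected GraphDataScience client and `G` a
    projected graph object. Every node in the projection receives a value,
    including nodes that have no relationships at all.
    """
    # Python-client counterpart of:
    #   CALL gds.degree.mutate(<graph name>, {mutateProperty: 'degree'})
    return gds.degree.mutate(G, mutateProperty=property_name)
```

After `add_degree_property(gds, G)`, the `degree` property should be usable in `featureProperties` when training. On an UNDIRECTED projection this counts relationships in both directions, so the values can differ from a hand-computed out-degree.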

isisdesade
Node Link

Hi, Thanks for the reply.
I imported the data by following the :play got data exploration guide:

CREATE CONSTRAINT ON (c:Character)
ASSERT c.name IS UNIQUE;

LOAD CSV WITH HEADERS FROM 'https://raw.githubusercontent.com/neo4j-examples/graphgists/master/browser-guides/data/asoiaf-book3-edges.csv' AS row
MERGE (src:Character {name: row.Source})
MERGE (tgt:Character {name: row.Target})
// relationship for the book
MERGE (src)-[r:INTERACTS3]->(tgt)
ON CREATE SET r.weight = toInteger(row.weight), r.book=3


Similar import for book 1 and 2 and 4+5

isisdesade
Node Link

I re-did the whole import exactly as above after deleting all nodes and relationships, just in case.

I added degree the same way as before as well:

MATCH (p)-[r]->(x)
with p, count(r) as degree
set p.degree = degree
return p

Then I ran the python notebook and had the same error.
My notebook also runs in a VS Code dev container on my local machine; I hope that is not the problem.

I added some print statements to neo4j_query_runner.py:

neo4j query runner query: CALL gds.beta.graphSage.train($graph_name, $config)
neo4j query runner params: {'graph_name': 'GotGraph2', 'config': {'modelName': 'GotGraphModel', 'featureProperties': ['degree']}}
neo4j query runner result: <neo4j.work.result.Result object at 0x7efc4dd7c940>
... _connection : <neo4j.io._common.ConnectionErrorHandler object at 0x7efc4dd7c850>
... _hydrant : <neo4j.data.DataHydrator object at 0x7efc4dd7cc10>
... _on_closed : <bound method Session._result_closed of <neo4j.work.simple.Session object at 0x7efc4dd7ca30>>
... _metadata : {'query': 'CALL gds.beta.graphSage.train($graph_name, $config)', 'parameters': {'graph_name': 'GotGraph2', 'config': {'modelName': 'GotGraphModel', 'featureProperties': ['degree']}}, 'server': <neo4j.api.ServerInfo object at 0x7efc4dd675b0>, 't_first': 1, 'fields': ['modelInfo', 'configuration', 'trainMillis']}
... _record_buffer : deque([])
... _summary : None
... _bookmark : None
... _raw_qid : -1
... _fetch_size : 1000
... _discarding : False
... _attached : True
... _streaming : True
... _has_more : False
... _closed : False
... _keys : ['modelInfo', 'configuration', 'trainMillis']

Beyond that, I have no more ideas about what is going wrong. This is my VS Code version:

Version: 1.66.2 (user setup)
Commit: dfd34e8260c270da74b5c2d86d61aee4b6d56977
Date: 2022-04-11T07:46:01.075Z
Electron: 17.2.0
Chromium: 98.0.4758.109
Node.js: 16.13.0
V8: 9.8.177.11-electron.0
OS: Windows_NT x64 10.0.19044

isisdesade
Node Link

Here's the stack trace:


Thank you for your queries.
We could reproduce the error, and it looks like our progress tracking hid the underlying error. The actual error is:

Node with ID `250` has invalid feature property value `NaN` for property `degree`

This indicates the degree property is not correctly set for each node.
Your query to set the degree property does not set the property for nodes with 0 outgoing relationships.
As an alternative, you can set a defaultValue for the property in the project query (see Creating graphs - Neo4j Graph Data Science).
Or, as already suggested above, you can use gds.degree.mutate (see Degree Centrality - Neo4j Graph Data Science).
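
As a rough sketch of the defaultValue option, the node projection from the original post could be extended as below. The helper names (`character_projection`, `interacts_projection`, `project_got_graph`) are hypothetical, and the default of 0 is an assumption that a missing `degree` should be read as "no outgoing relationships". An existing projection with the same name would need to be dropped before projecting again.

```python
def character_projection(default_degree=0):
    """Node projection for Character nodes with a fallback for a missing `degree`."""
    return {
        "Character": {
            "label": "Character",
            "properties": {
                # Nodes without a stored `degree` get default_degree instead of NaN.
                "degree": {"property": "degree", "defaultValue": default_degree}
            },
        }
    }


def interacts_projection(rel_type="INTERACTS1"):
    """Relationship projection matching the one used in the original post."""
    return {rel_type: {"type": rel_type, "orientation": "UNDIRECTED"}}


def project_got_graph(gds, graph_name="GotGraph2"):
    """Project the graph; `gds` is assumed to be a connected GraphDataScience client."""
    return gds.graph.project(
        graph_name, character_projection(), interacts_projection()
    )
```

A call such as `G, result = project_got_graph(gds)` would then replace the `gds.graph.project(...)` call in the notebook.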

We will look into fixing the general issue of hiding error messages by the progress tracking.
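
To see why the `MATCH (p)-[r]->(x)` query leaves some nodes without a `degree`, here is a small pure-Python analogue. The character names and edges are toy data made up for illustration, not taken from the dataset. The pattern only matches nodes that have at least one outgoing relationship, so a node that only ever appears as a target is never matched and never gets the property set.

```python
from collections import Counter


def out_degrees(edges):
    """Mimic `MATCH (p)-[r]->(x) WITH p, count(r)`: only source nodes appear."""
    return Counter(src for src, _tgt in edges)


def nodes_without_degree(nodes, edges):
    """Return the nodes the query above would leave without a `degree` property."""
    degrees = out_degrees(edges)
    return sorted(n for n in nodes if n not in degrees)


# Toy graph: "Sansa" is only ever a target, so she has no outgoing relationship.
edges = [("Arya", "Sansa"), ("Arya", "Jon"), ("Jon", "Sansa")]
nodes = {"Arya", "Jon", "Sansa"}

missing = nodes_without_degree(nodes, edges)
print(missing)  # ['Sansa']
```

In the projected graph, such a node has no value for `degree`, which is what surfaces as the `NaN` feature value in the error above.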
