Pregel function call never ends - Although log says it's finished

cuneyttyler
Ninja

I wrote a simple Pregel program to find the shortest path between a given source and target pair. However, when I call the function via Cypher, the execution never ends. I looked at the log: it lists the iterations and then says it's finished. When I force-terminate the query, an exception is thrown. Here is the exception:

ERROR [o.n.b.r.DefaultBoltConnection] Protocol breach detected in bolt session 'bolt-0'.
org.neo4j.bolt.runtime.BoltProtocolBreachFatality: Message 'PULL Map{qid -> Long(0), n -> Long(1000)}' cannot be handled by a session in the READY state.
at org.neo4j.bolt.runtime.statemachine.impl.AbstractBoltStateMachine.nextState(AbstractBoltStateMachine.java:159) ~[neo4j-bolt-4.4.7.jar:4.4.7]
at org.neo4j.bolt.runtime.statemachine.impl.AbstractBoltStateMachine.process(AbstractBoltStateMachine.java:102) ~[neo4j-bolt-4.4.7.jar:4.4.7]
at org.neo4j.bolt.messaging.BoltRequestMessageReader.lambda$doRead$1(BoltRequestMessageReader.java:93) ~[neo4j-bolt-4.4.7.jar:4.4.7]
at org.neo4j.bolt.runtime.DefaultBoltConnection.lambda$enqueue$0(DefaultBoltConnection.java:156) ~[neo4j-bolt-4.4.7.jar:4.4.7]
at org.neo4j.bolt.runtime.DefaultBoltConnection.processNextBatchInternal(DefaultBoltConnection.java:252) ~[neo4j-bolt-4.4.7.jar:4.4.7]
at org.neo4j.bolt.runtime.DefaultBoltConnection.processNextBatch(DefaultBoltConnection.java:187) ~[neo4j-bolt-4.4.7.jar:4.4.7]
at org.neo4j.bolt.runtime.DefaultBoltConnection.processNextBatch(DefaultBoltConnection.java:177) ~[neo4j-bolt-4.4.7.jar:4.4.7]
at org.neo4j.bolt.runtime.scheduling.ExecutorBoltScheduler.executeBatch(ExecutorBoltScheduler.java:257) ~[neo4j-bolt-4.4.7.jar:4.4.7]
at org.neo4j.bolt.runtime.scheduling.ExecutorBoltScheduler.lambda$scheduleBatchOrHandleError$3(ExecutorBoltScheduler.java:240) ~[neo4j-bolt-4.4.7.jar:4.4.7]
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700) [?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.75.Final.jar:4.1.75.Final]
at java.lang.Thread.run(Thread.java:829) [?:?]

I have a distance variable in my schema (which is public), and I call the function with 'YIELD values'.

What's the problem?

Thanks.

6 REPLIES

Could you please open an issue on https://github.com/neo4j/graph-data-science

I'm opening the issue.

Apart from what Martin wrote, it would also help if you could share your Pregel example, or at least some code that lets us reproduce the error.

Cheers

cuneyttyler
Ninja

So, as @martin_junghann pointed out in https://community.neo4j.com/t5/neo4j-graph-platform/pregel-node-degree-is-0-even-if-it-has-relations... the node ids obtained from the config and from the context don't match, because internal GDS ids are different from Neo4j database ids. I was comparing the two, which is why the computation never ends: it never converges. But I took that code from the Pregel examples, so there must be something wrong with those examples as well. In that case, how should we identify the start or end nodes in our algorithm during the procedure call?

Also, even though the maximum number of iterations has been reached and the log says 'Finished', Neo4j Browser is still stuck at the loading screen. No exceptions.

As discussed on the GitHub issue, we don't consider this a bug: the user attempts to stream 10 million result rows, which simply takes time. The "Finished" log entry refers to the computation of the algorithm, not to the point in time at which the results become visible in, e.g., the browser.

To address the Pregel node id space issue, as discussed on the other thread, I added functionality to read the original node id from the Pregel compute context. I also updated the documentation and the example code. It's released as part of GDS 2.1.
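
For anyone who runs into the same non-convergence, here is a minimal, self-contained sketch of the id space mismatch described in this thread. It does not use the GDS API: the class `IdMappingSketch` and its methods `originalOf` and `internalOf` are hypothetical names invented for illustration, and the sample ids are made up. It assumes only what the thread states, namely that the ids seen inside the computation are internal ids that differ from the Neo4j database ids passed in through the config. For the actual method names on the Pregel compute context, check the GDS 2.1 documentation.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of two id spaces; this is NOT the GDS API. */
public class IdMappingSketch {
    // Original (database) ids in load order; the array index plays the role of the internal id.
    private final long[] originalIds;
    // Reverse lookup from original id to internal id.
    private final Map<Long, Long> internalByOriginal = new HashMap<>();

    public IdMappingSketch(long[] originalIds) {
        this.originalIds = originalIds.clone();
        for (int i = 0; i < this.originalIds.length; i++) {
            internalByOriginal.put(this.originalIds[i], (long) i);
        }
    }

    /** Translates an internal id back to the original (database) id. */
    public long originalOf(long internalId) {
        return originalIds[(int) internalId];
    }

    /** Translates an original (database) id to the internal id. */
    public long internalOf(long originalId) {
        Long internal = internalByOriginal.get(originalId);
        if (internal == null) {
            throw new IllegalArgumentException("Unknown original id: " + originalId);
        }
        return internal;
    }

    public static void main(String[] args) {
        // Assumed sample data: three nodes whose database ids are 42, 7 and 1000.
        IdMappingSketch ids = new IdMappingSketch(new long[] {42L, 7L, 1000L});

        long configuredSource = 7L; // database id, as a user would pass it in the config
        long currentInternal = 1L;  // internal id of the same node inside the computation

        // Comparing across id spaces never matches, so the source node is never recognised.
        System.out.println(currentInternal == configuredSource);                 // false
        // Translating into a common id space first makes the comparison meaningful.
        System.out.println(ids.originalOf(currentInternal) == configuredSource); // true
    }
}
```

The first comparison is the mistake from my original code: the source node is never found, so the algorithm runs until the iteration limit. Translating one side into the other id space before comparing is the general idea behind the fix.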