Too slow Traverse


(Guille Rr) #1

Hi all . . .

The problem:

It's a multiiple origin one destination problem. The origin node is "7" and the origin nodes are: 9,1 and 10.

It has to return de numer of nodes into the diferent path between all the nodes (label:inicio) (Source) and the destination node.

The example:

image.png

The traverse that show all nodes: (but it's too slow)

TraversalDescription td = db.traversalDescription()
                    .depthFirst()
                    .expand(PathExpanders.forTypeAndDirection(RelTypes.SIGUIENTE, Direction.INCOMING))										
                    .uniqueness( Uniqueness.NODE_PATH)
		            .evaluator(new Evaluator() {
		                @Override
		                public Evaluation evaluate(Path path
		                    if( path.endNode().hasLabel(Labels.Inicio)) {
		                        return Evaluation.INCLUDE_AND_CONTINUE;
		                    }
		                    return Evaluation.EXCLUDE_AND_CONTINUE;
		                }
		            });
            
            HashSet<Node> pathNodes = new HashSet<>();
            
            for ( Path path: td.traverse(inicio)) {
            	for (Node node : path.nodes()) {
            		pathNodes.add(node);
            	}
            	
            }
            
            return pathNodes.stream().map( item -> new NodeResult(item)); 

With this other configuration, the traverse shows all the nodes correctly but it is much slower and takes half an hour. The solution 1 takes less than a second.

The question:

How can I solve the problem so that it shows all the nodes that intervene in the process and take a little time?

What kind of configuration do I have to use?

Is it possible to use Traverse to solve the problem? or do I need other functionality?

need I a bi-directional traversal ?

The solution 1 : (paths returned by Traverser) with this config:

TraversalDescription td = db.traversalDescription()
					.breadthFirst()
					.expand(PathExpanders.forTypeAndDirection(RelTypes.SIGUIENTE, Direction.INCOMING))		
					.uniqueness(Uniqueness.RELATIONSHIP_PATH)
					.evaluator(new Evaluator() {
						@Override
						public Evaluation evaluate(Path path) {
							
							System.out.println(" - - - - - - - "+ "\n");
							System.out.println(" - - dentro eval - - "+ "\n"+Paths.pathToString( path, pathPrinter ));
							
							if( path.endNode().hasLabel(Labels.Inicio)) {
								System.out.println("..........INCLUDE_AND_PRUNE");
								return Evaluation.INCLUDE_AND_PRUNE;
//								return Evaluation.ofContinues(false);
								
							}else {
								System.out.println("EXCLUDE_AND_CONTINUE...........");
								return Evaluation.EXCLUDE_AND_CONTINUE;
							}
						}
					});

			
			String output=" ";
         
			HashSet<Long> pathNodes = new HashSet<>();

			for ( Path path: td.traverse(inicio)) {
				for (Node node : path.nodes()) {
					pathNodes.add(node.getId());
				}
				
				output += "\n"+Paths.pathToString( path, pathPrinter ); 
				System.out.println("______Path dentro del Bucle FOR_________ ");
				System.out.println(" rutas: "+ output);
			}

			return pathNodes.stream().map(item -> new HashSetResult(item));	 
			
		}
 

(id7)<--[SIGUIENTE]--(id6)<--[SIGUIENTE]--(id5)<--[SIGUIENTE]--(id10)

(id7)<--[SIGUIENTE]--(id6)<--[SIGUIENTE]--(id5)<--[SIGUIENTE]--(id2)<--[SIGUIENTE]--(id1)

(id7)<--[SIGUIENTE]--(id4)<--[SIGUIENTE]--(id12)<--[SIGUIENTE]--(id11)<--[SIGUIENTE]--(id3)<--[SIGUIENTE]--(id9)

But the Traverser continue with the Evaluation . . .

    • dentro eval - -

(id7)<--[SIGUIENTE]--(id4)<--[SIGUIENTE]--(id12)<--[SIGUIENTE]--(id11)<--[SIGUIENTE]--(id3)<--[SIGUIENTE]--(id2)

EXCLUDE_AND_CONTINUE...........


    • dentro eval - -

(id7)<--[SIGUIENTE]--(id4)<--[SIGUIENTE]--(id3)

EXCLUDE_AND_CONTINUE...........


    • dentro eval - -

(id7)<--[SIGUIENTE]--(id4)

EXCLUDE_AND_CONTINUE...........

And in the following case, it does not include the two elements marked in red

image.png

Thanks in advance !


(Maxdemarzi) #2

You have a single destination... start there and walk backwards. Something like:

Node destination = some node.
HashSet<Long> inicios = new HashSet<>;
ResourceIterator<Node> iterator = db.findNodes(Labels.Inicio);
while (iterator.hasNext()){
  inicios.add(iterator.next().getId());
}

TraversalDescription td = db.traversalDescription()
					.breadthFirst()
					.expand(PathExpanders.forTypeAndDirection(RelTypes.SIGUIENTE, Direction.OUTGOING))
                    .evaluator(new Evaluator() {
		                @Override
		                public Evaluation evaluate(Path path
		                    if( inicios.contains(path.endNode().getId())) {
		                        return Evaluation.INCLUDE_AND_PRUNE;
		                    }
		                    return Evaluation.EXCLUDE_AND_CONTINUE;
		                }
		            });		

To make it better, I would create a custom expander that takes the inicios set and remove the nodes ids from the set as it finds them. Check if the set is empty, and if so return no more relationships, so it stops traversing. Otherwise in the for loop of the traverser, count to the number of paths in the set and stop once you get the number you need (don't loop forever).


(Guille Rr) #3

Hi Max,

Thank you very much for your answer. I am a follower of yours.

I have made the changes in the code and the speed has increased a lot. But there are some nodes that are not included in the route.

I expose two cases:

Case OK : 3 paths 11 nodes

Case KO : 1 paths 5 nodes instead 7

I only need the number of diferent nodes

The code:

        Node destination = db.findNode(Label.label("Job"), "jobId", jobId);
		HashSet<Long> inicios = new HashSet<>();
		HashSet<Long> pathNodes = new HashSet<>();
		int numPaths = 0;
		ResourceIterator<Node> iterator = db.findNodes(Labels.Inicio);
		while (iterator.hasNext()){
			  inicios.add(iterator.next().getId());
		}
		
		TraversalDescription td = db.traversalDescription()
				.depthFirst()
				.expand(PathExpanders.forTypeAndDirection(RelTypes.SIGUIENTE, Direction.INCOMING))
                .evaluator(new Evaluator() {
	                @Override
	                public Evaluation evaluate(Path path) {
	   
	                    if( inicios.contains(path.endNode().getId())) {
	                        return Evaluation.INCLUDE_AND_PRUNE;
	                    }
	                    return Evaluation.EXCLUDE_AND_CONTINUE;
	                }
	            });
		
		while(numPaths <= inicios.size()){
			for ( Path allpath: td.traverse(destination)) {
				for (Node node : allpath.nodes()) {
					pathNodes.add(node.getId());  >>>> count++  (faster ! )
				}
				numPaths+=1;
			}
		}
		
		return Stream.of(new LongResult((long)pathNodes.size())); 
		
		}

How can i fix this?

Thank you very much, greetings.


(Maxdemarzi) #4

NodeGlobal uniqueness will not work. NodePath is the one you want. That while loop you have may be a problem if you don't find all the potential paths for every "inicios" node. There may not be a path to get there, so the while loop will spin forever. You can add a limit of how far you are willing to go, by stopping at some depth of 50 hops or whatnot. Put a conditional break inside the Path for loop to break out of it once you find the number of paths you want. But since there may be multiple ways to get to the same "inicios" node, I'm not sure that's the right way to deal with it.


(Guille Rr) #5

Thanks for the recomendation, but I have found cycles or loops between nodes. That's the reason why the traverse is so slow.
Now the big question is ¿How can I resolve cycles? the traverse enters and does not come out. this causes memory problems
Thansk in advance!