Graph traversal using node property value as weight

Hi

I have a cloud asset network with assets information coded in the form of the properties. The graph reflects the topology of a cloud environment and the relationships and nodes both have properties.

I am trying to find a traversal path between a node which provides entry to the network and an asset with highly sensitive info.

I have a node property that has been calculated for each node and I am trying to use that property to do the traversal.

Staring from a node, i want to be able to follow along the path with the highest value of that property at any fork during the traversal.

I have been looking into neo4j GDS and looked at the Dikstra algorithms, but all of them either follow a relationship property 'cost' or the hops.

I also looked into Neo4j Path Expanders, but not sure about using the weights of the properties there.

Please suggest if there is a provision in the GDS to do that or if there is any other alternative.

Thanks

This is a perfect use case for a custom procedure using the neo4j Java API. This would be fairly simple. I can help you out if you are interested. I could even write a simple solution that you can use as a starting point. Let me know if you are interested.

Another solution is to use the Pregel API. I am not familiar with it, but it has been recommended in other posts as a means of writing your own traversal algorithms. From my brief review of the documentation, I feel this would be a lot more complicated then writing your own procedure. You will need access to your server to deploy the plug-in that has your custom procedure.

https://neo4j.com/docs/graph-data-science/current/algorithms/pregel-api/

https://neo4j.com/developer/java-procedures/

https://neo4j.com/developer/cypher/procedures-functions/

https://www.tutorialspoint.com/neo4j/neo4j_cypher_api_example.htm

https://neo4j.com/developer-blog/lets-write-a-stored-procedure-in-neo4j-part-i/

Try this:

match (a:StartNode)

CALL apoc.path.spanningTree(a, {maxLevel:5}) yield path
with nodes(path) as n1
unwind n1 as n2
with n2 where n2.weight is not null and n2.weight = "XY"
return n2

I believe this would work where you can provide a condition on a property for each node.

But in my case, nodes weights are not known in advance and I just want to follow the highest property value node whenever there is a fork in the path.

thanks

Are you stating you would like to proceed to the node with the highest property value across all related nodes? Would the result the list of nodes or specific properties of all the nodes that were traversed? This would be fairly simple with a custom procedure.

Going from A:External to DS, the path to be selected from the procedure should be A->B->G->H->DS as this path choses the highest 'sens_value' whenever there is a fork.

I would like to get the nodes back. Please suggest.

I don't see a means of doing this in cypher. I have not come across anything in APOC. I have not used the GDS library, but I think their path finding algorithms are based on total cost. Your algorithm is not a total cost problem.

This would be extremely easy to do with a custom procedure. A handful of lines of java. Are you interested in this solution? You will need to deploy a jar to each server in your cluster. This is not supported in Aura if that is what you are using. If you are interested, I can write a stripped down working version that you can take and productionalize.

That would be awesome. Also this is just the beginning, but I would be looking to make the traversal criteria more customized, so thinking custom procedure way would be able to provide me that flexibility.

Please suggest the solution. I am not using Aura but planning to use the EC2 deployment.

Thanks

Hi

Please share the solution, so I could work on that.

Thanks

I will. I started earlier. Will continue later today. Shouldn’t take long. You can use it as a prototype to get you going.

I created a maven project with a prototype of the custom procedure. I just got something to work to get you started. It works, but I don't know how you want to define the input parameters. I just specified the root node and the property of each node to test. The procedure terminates when it gets to a terminal node without any children or no node has the property to test. In your case, this is the 'DS' node. You could change the algorithm to pass the 'DS' node and terminate when it is reached. Anyways, you can use this to modify it for your needs.

I did't try to optimize it or make it production ready. It is just a quick prototype.

package customFunctions;

import org.neo4j.graphdb.Direction;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Relationship;
import org.neo4j.procedure.Description;
import org.neo4j.procedure.Name;
import org.neo4j.procedure.Procedure;

import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.stream.Stream;

public class CustomProcedure {

    @Procedure(name = "custom.traverseGraph")
    @Description("Get list of nodes from root node that traverse the path with highest values of property 'prop'")
    public Stream<TraversalResult> traverseTree(@Name("root") Node rootNode, @Name("property") String prop) {

        Objects.requireNonNull(rootNode);
        Objects.requireNonNull(prop);

        List<Node> listOfNodes = new ArrayList<>();
        processNode(rootNode, prop, listOfNodes);

        return Stream.of(TraversalResult.of(listOfNodes));
    }

    private void processNode(Node node, String property, List<Node> nodes) {
        boolean nodeFound = false;
        Long currentMaxValue = Long.MIN_VALUE;
        Node childNodeWithMaxValue = null;
        Iterable<Relationship> relationships = node.getRelationships(Direction.OUTGOING);
        for (Relationship relationship : relationships) {
            Node childNode = relationship.getOtherNode(node);
            if (childNode.hasProperty(property)) {
                Long propertyValue = (Long) childNode.getProperty(property);
                if (propertyValue > currentMaxValue) {
                    childNodeWithMaxValue = childNode;
                    currentMaxValue = propertyValue;
                    nodeFound = true;
                }
            }
        }
        if (nodeFound) {
            nodes.add(childNodeWithMaxValue);
            processNode(childNodeWithMaxValue, property, nodes);
        }
    }

    public static class TraversalResult {
        public List<Node> nodes;

        private TraversalResult(List<Node> nodes) {
            this.nodes = nodes;
        }

        public static TraversalResult of(List<Node> nodes) {
            return new TraversalResult(nodes);
        }
    }
}

Test class showing how to test the procedure using an embedded instance.

import customFunctions.CustomProcedure;
import org.junit.jupiter.api.*;
import org.neo4j.driver.*;
import org.neo4j.driver.types.Node;
import org.neo4j.harness.Neo4j;
import org.neo4j.harness.Neo4jBuilders;

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class CustomProcedureTests {

static Driver driver;  

@BeforeAll  
static void setup\_db() {  
    Neo4j neo4j = Neo4jBuilders.newInProcessBuilder()  
            .withProcedure(CustomProcedure.class)  
            .build();  

    driver \= GraphDatabase.driver(neo4j.boltURI(), Config.builder()  
            .withoutEncryption()  
            .build());  
}  

@BeforeEach  
void delete\_data() {  
    String cypher = "merge(A:External {name:'A'})\\n" +  
            "merge(B:Node {name:'B'}) set B.sens\_value = 80\\n" +  
            "merge(C:Node {name:'C'}) set C.sens\_value = 70\\n" +  
            "merge(D:Node {name:'D'}) set D.sens\_value = 50\\n" +  
            "merge(E:Node {name:'E'}) set E.sens\_value = 90\\n" +  
            "merge(F:Node {name:'F'}) set F.sens\_value = 10\\n" +  
            "merge(G:Node {name:'G'}) set G.sens\_value = 40\\n" +  
            "merge(H:Node {name:'H'}) set H.sens\_value = 60\\n" +  
            "merge(I:Node {name:'I'}) set I.sens\_value = 50\\n" +  
            "merge(J:Node {name:'J'}) set J.sens\_value = 30\\n" +  
            "merge(K:Node {name:'K'}) set K.sens\_value = 45\\n" +  
            "merge(L:Node {name:'L'}) set L.sens\_value = 55\\n" +  
            "merge(DS:Node {name:'DS'})\\n" +  
            "merge(A)-\[:RELATION\]->(B)\\n" +  
            "merge(A)-\[:RELATION\]->(C)\\n" +  
            "merge(A)-\[:RELATION\]->(D)\\n" +  
            "merge(D)-\[:RELATION\]->(E)\\n" +  
            "merge(E)-\[:RELATION\]->(F)\\n" +  
            "merge(F)-\[:RELATION\]->(DS)\\n" +  
            "merge(B)-\[:RELATION\]->(G)\\n" +  
            "merge(G)-\[:RELATION\]->(I)\\n" +  
            "merge(I)-\[:RELATION\]->(DS)\\n" +  
            "merge(G)-\[:RELATION\]->(H)\\n" +  
            "merge(H)-\[:RELATION\]->(DS)\\n" +  
            "merge(B)-\[:RELATION\]->(J)\\n" +  
            "merge(J)-\[:RELATION\]->(DS)\\n" +  
            "merge(C)-\[:RELATION\]->(K)\\n" +  
            "merge(K)-\[:RELATION\]->(L)\\n" +  
            "merge(L)-\[:RELATION\]->(DS)";  
    try (Session session = driver.session()) {  
        session.run("match(n) detach delete n");  
        session.run(cypher);  
    }  
}  

@Test  
@DisplayName("Test TraverseGraph Scenarios")  
void test() {  
    String cypher = "match (a:External {name: 'A'}) " +  
            "call custom.traverseGraph(a, 'sens\_value') yield nodes " +  
            "return nodes ";  
    List<Node> result = getCypherResults(cypher);  
    List<String> listOfIds = result.stream().map(n -> n.get("name").asString()).collect(Collectors.toList());  
    Assertions.assertIterableEquals(Arrays.asList("B", "G", "H"), listOfIds);  
}  

private List<Node> getCypherResults(String cypher) {  
    try (Session session = driver.session()) {  
        Result result = session.run(cypher);  
        return  result.single().get("nodes").asList(Value::asNode);  
    }  
}  

}

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
http://maven.apache.org/xsd/maven-4.0.0.xsd">
4.0.0

<groupId>com.domain.CustomProcedure</groupId>  
<artifactId>customProcedure</artifactId>  
<version>0.0.1</version>  

<packaging>jar</packaging>  
<name>Custom Procedure to Traverse Graph</name>  
<description>Prototype of a custom procedure to find the optimal path based on a property value</description>  

<properties>        <java.version>11</java.version>  
    <maven.compiler.source>${java.version}</maven.compiler.source>  
    <maven.compiler.target>${java.version}</maven.compiler.target>  
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>  
    <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>  
    <neo4j.version>4.4.8</neo4j.version>  
    <neo4j-java-driver.version>4.4.6</neo4j-java-driver.version>  
    <maven-shade-plugin.version>3.2.4</maven-shade-plugin.version>  
    <maven-compiler-plugin.version>3.8.1</maven-compiler-plugin.version>  
    <maven-surefire-plugin.version>2.22.2</maven-surefire-plugin.version>  
    <maven-failsafe-plugin.version>2.22.2</maven-failsafe-plugin.version>  
</properties>  
<dependencies>        <dependency>            <!-- This gives us the Procedure API our runtime code uses.  
             We have a \`provided\` scope on it, because when this is                 deployed in a Neo4j Instance, the API will be provided                 by Neo4j. If you add non-Neo4j dependencies to this                 project, their scope should normally be \`compile\` -->            <groupId>org.neo4j</groupId>  
        <artifactId>neo4j</artifactId>  
        <version>${neo4j.version}</version>  
        <scope>provided</scope>  

    </dependency>  
    <!-- Test Dependencies -->  
    <dependency>  
        <!-- This is used for a utility that lets us start Neo4j with  
             a specific Procedure, which is nice for writing tests. -->            <groupId>org.neo4j.test</groupId>  
        <artifactId>neo4j-harness</artifactId>  
        <version>${neo4j.version}</version>  
        <scope>test</scope>  
    </dependency>  
    <dependency>            <!-- Used to send cypher statements to our procedure. -->  
        <groupId>org.neo4j.driver</groupId>  
        <artifactId>neo4j-java-driver</artifactId>  
        <version>${neo4j-java-driver.version}</version>  
        <scope>test</scope>  
    </dependency>  
    <dependency>            <groupId>org.junit.jupiter</groupId>  
        <artifactId>junit-jupiter-api</artifactId>  
        <version>5.8.2</version>  
        <scope>test</scope>  
    </dependency>  
</dependencies>  
<build>        <plugins>            <plugin>                <groupId>org.apache.maven.plugins</groupId>  
            <artifactId>maven-compiler-plugin</artifactId>  
            <version>${maven-compiler-plugin.version}</version>  
        </plugin>            <plugin>                <groupId>org.apache.maven.plugins</groupId>  
            <artifactId>maven-surefire-plugin</artifactId>  
            <version>${maven-surefire-plugin.version}</version>  
            <configuration>                    <skipTests>true</skipTests>  
            </configuration>            </plugin>            <plugin>                <groupId>org.apache.maven.plugins</groupId>  
            <artifactId>maven-failsafe-plugin</artifactId>  
            <version>${maven-failsafe-plugin.version}</version>  
        </plugin>            <plugin>                <!-- This generates a jar-file with our procedure code,  
                 plus any dependencies marked as \`compile\` scope.                     This should then be deployed in the \`plugins\` directory                     of each Neo4j instance in your deployment.                     After a restart, the procedure is available for calling. -->                <groupId>org.apache.maven.plugins</groupId>  
            <artifactId>maven-shade-plugin</artifactId>  
            <version>${maven-shade-plugin.version}</version>  
            <configuration>                    <createDependencyReducedPom>false</createDependencyReducedPom>  
            </configuration>                <executions>                    <execution>                        <phase>package</phase>  
                    <goals>                            <goal>shade</goal>  
                    </goals>                    </execution>                </executions>            </plugin>        </plugins>    </build></project> 

These three files should be all you need.

Thanks , let me try it out and come back with the results.

There is a junit test to verify the results are correct, but it also shows you how the procedure would be called.

If you want to deploy it to a test server, you need to package the code into a jar and copy it to your server's plugin folder. You can use 'mvn package', which will create the jar in the 'target' folder. Sorry if you already know this.

Really appreciate your help, so far I was able to get it to the package and deploy and run query from the test using the UI . I am able to get the node properties evaluated and get the path based on the property value.

Will be making changes in this though, Can you point me to the documentation for the traversal framework that is used in this procedure.

Thanks for your help.

These are the references I used to get up to speed.

https://neo4j.com/developer/java-procedures/

https://neo4j.com/docs/java-reference/current/

https://neo4j.com/docs/java-reference/4.4/javadocs/

Hi, @glilienfield Gary (or anyone),
I'm trying to get the sample code to compile (not knowing anything about Java - other than how to spell it). I've found and downloaded the jars: neo4j-graphdb-api-5.5.0 & neo4j-procedure-api-5.5.0.

However, I'm getting a compilation error:

C:\Neo4j\Java>java -cp .\Lib\* CustomProcedure.java
CustomProcedure.java:34: error: cannot access ResourceIterable
Iterable<Relationship> relationships = node.getRelationships(Direction.OUTGOING);
^
class file for org.neo4j.graphdb.ResourceIterable not found
1 error
error: compilation failed

Since the jar doesn't contain the referenced class, is there an easy solution to this problem?
TIA,
Paolo

That code was using version 4. I got it to work with the following pom. See if this works.

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                      http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.domain.CustomProcedure</groupId>
    <artifactId>customProcedure</artifactId>
    <version>0.0.1</version>

    <packaging>jar</packaging>
    <name>Custom Procedure to Traverse Graph</name>
    <description>Prototype of a custom procedure to find the optimal path based on a property value</description>

    <properties>
        <java.version>11</java.version>
        <maven.compiler.source>${java.version}</maven.compiler.source>
        <maven.compiler.target>${java.version}</maven.compiler.target>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
        <neo4j.version>5.5.0</neo4j.version>
        <neo4j-java-driver.version>5.5.0</neo4j-java-driver.version>
        <maven-shade-plugin.version>3.2.4</maven-shade-plugin.version>
        <maven-compiler-plugin.version>3.8.1</maven-compiler-plugin.version>
        <maven-surefire-plugin.version>2.22.2</maven-surefire-plugin.version>
        <maven-failsafe-plugin.version>2.22.2</maven-failsafe-plugin.version>
    </properties>

    <dependencies>
        <dependency>
            <!-- This gives us the Procedure API our runtime code uses.
                 We have a `provided` scope on it, because when this is
                 deployed in a Neo4j Instance, the API will be provided
                 by Neo4j. If you add non-Neo4j dependencies to this
                 project, their scope should normally be `compile` -->
            <groupId>org.neo4j</groupId>
            <artifactId>neo4j</artifactId>
            <version>${neo4j.version}</version>
            <scope>provided</scope>

        </dependency>

        <!-- Test Dependencies -->
        <dependency>
            <!-- This is used for a utility that lets us start Neo4j with
                 a specific Procedure, which is nice for writing tests. -->
            <groupId>org.neo4j.test</groupId>
            <artifactId>neo4j-harness</artifactId>
            <version>${neo4j.version}</version>
            <scope>test</scope>
        </dependency>

        <dependency>
            <!-- Used to send cypher statements to our procedure. -->
            <groupId>org.neo4j.driver</groupId>
            <artifactId>neo4j-java-driver</artifactId>
            <version>${neo4j-java-driver.version}</version>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>org.junit.jupiter</groupId>
            <artifactId>junit-jupiter-api</artifactId>
            <version>5.9.2</version>
            <scope>test</scope>
        </dependency>

    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>${maven-compiler-plugin.version}</version>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>${maven-surefire-plugin.version}</version>
                <configuration>
                    <skipTests>true</skipTests>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-failsafe-plugin</artifactId>
                <version>${maven-failsafe-plugin.version}</version>
            </plugin>
            <plugin>
                <!-- This generates a jar-file with our procedure code,
                     plus any dependencies marked as `compile` scope.
                     This should then be deployed in the `plugins` directory
                     of each Neo4j instance in your deployment.
                     After a restart, the procedure is available for calling. -->
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>${maven-shade-plugin.version}</version>
                <configuration>
                    <createDependencyReducedPom>false</createDependencyReducedPom>
                </configuration>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>

Thanks, Gary @glilienfield,
Now I just have to learn about POM and Maven (and how to use them)... :grin: (I mentioned elsewhere that I was a newbie)
Seriously... Thanks! I'll "do some more digging" and get back if I really need help!

Paolo

I can send you the new project. Are you using an IDE, such as IntelliJ or Eclipse? I use IntelliJ. They have a free community version.

Maven is a build tool that helps manage dependencies.

Learning:

https://maven.apache.org/guides/getting-started/maven-in-five-minutes.html

Hi Gary,@glilienfield,
At present, just to get this started and investigate it, we're just using Notepad++ and CMD. At work, we generally use VSCode or Visual Studio. If I download IntelliJ and Maven can I build the project you send?
I really appreciate the effort you've gone to.

Paolo