KNN New GDS production changes - How to write to the default db

bryant
Node Clone

I have a kinda dumb question about how to write GDS data back to the default (not named) graph db.

I had been using the beta version of KNN and it didn't require a named graph, but now it does, and it won't take the db name. So what do you pass as the first parameter to gds.knn.write()?
It's not taking gds.knn.write('neo4j', $config).
null didn't work and '' doesn't work either. So how do you write back to the db directly with the new GDS KNN update, without going through an in-memory graph projection?


Can you share your original query?

You always need to go through the projected graph step. Previously it was happening silently with anonymous graphs (we created the graph, ran the algorithm, and threw away the projection in one call).

The updated syntax will look something like:

  1. Create the projection
CALL gds.graph.project(
    'myGraph',
    {
        Person: {
            label: 'Person',
            properties: ['age','lotteryNumbers','embedding']
        }
    },
    '*'
);
  2. Run KNN
CALL gds.knn.write('myGraph', {
    writeRelationshipType: 'SIMILAR',
    writeProperty: 'score',
    topK: 1,
    randomSeed: 42,
    concurrency: 1,
    nodeProperties: ['age']
})
  3. Drop the graph
CALL gds.graph.drop('myGraph')
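
To confirm that write mode stored its results in the database and not only in the projection, a plain Cypher query along these lines should still return rows after the graph has been dropped. This is a minimal sketch that assumes the `Person` label, the `SIMILAR` relationship type, and the `score` and `age` properties from the example above; the column aliases are made up for illustration.

```cypher
// Runs against the stored database, not the in-memory projection.
// Assumes the KNN write step above created SIMILAR relationships with a 'score' property.
MATCH (a:Person)-[r:SIMILAR]->(b:Person)
RETURN a.age AS ageA, b.age AS ageB, r.score AS score
ORDER BY score DESC
LIMIT 10
```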

bryant
Node Clone

Thanks for replying to my message, Alicia. Here's my original query from gds.beta.knn:

// K-Nearest Neighbors based on Volume
:param limit => ( 42);
:param config => ({nodeProjection: 'Price', relationshipProjection: {relType: {type: 'PRICE_PERIOD', orientation: 'NATURAL', properties: {} } }, nodeWeightProperty: 'volume', topK: 10,  randomJoins: 10, sampleRate: 0.5, deltaThreshold: 0.001, nodeProperties: ['volume'], writeProperty: 'volumeSimilarityScore', writeRelationshipType: 'SIMILAR_KNN_VOLUME'});
:param communityNodeLimit => ( 10);
CALL gds.beta.knn.write($config);  

Previously, I just set everything up in NEuler (Graph Data Science Playground) and had it write to the existing base database, not a named graph.

Now I'm trying to do this, but it's asking me to pass in the named graph, which I don't want to do. I just want it to write the property and relationships to my base neo4j database and not an in-memory graph. Here's my current attempt using the new version:

// K-Nearest Neighbors based on Volume
:param limit => ( 42);
:param graphConfig => ({
  nodeProjection: 'Price',
  relationshipProjection: {
    relType: {
      type: 'PRICE_PERIOD',
      orientation: 'NATURAL',
      properties: {}
    }
  },
  nodeProperties: [
    'volume'
  ]
});
:param config => ({
  topK: 10,
  randomJoins: 10,
  sampleRate: 0.5,
  deltaThreshold: 0.001,
  nodeProperties: {
    volume: 'PEARSON'
  },
  writeProperty: 'score',
  writeRelationshipType: 'SIMILAR_KNN_VOLUME'
});
:param communityNodeLimit => ( 10);
CALL gds.knn.write($config);

gds.knn.write() seems to need the name of a graph, like this: CALL gds.knn.write($generatedName, $config); But in my case I'm not projecting to a named graph, I'm writing directly to the database.
Well, at least I was in the beta.knn version anyway. So I'm just not sure how to do this in the new production version of KNN, or if it's possible.

Hopefully that makes sense. Let me know if you need more details or background on what I'm doing.

This is one step I use in my core demo data on the stock market series using Neo4j for Insight Driven Analytics and visualization with Power BI. I'm looking at correlations and studies on volume similarity between the same stock and different stocks on different days.

After reading through your reply, Alicia, I think I understand what you're saying. In the beta version it was automatically running everything in the anonymous graph, which was a special named graph or something? So the new production version just requires me to formally name the graph and run the algorithm against it, and it will then automatically write the results back to the db in permanent storage.

Am I understanding that correctly? Is there anything specific I need to do to tell it to write to my main db graph so it's persisted to the graph db? Or, put the other way, if I wanted to run the algo and just get some results without writing them to the db, how do I control that? I think that's the part I'm just not quite getting.

Thanks again, for your reply.

Yes - you've got it right. As of GDS 2.0, you have to explicitly create the named graph, and then when you run an algorithm in write mode, it executes against your named graph, but writes the results to the database.

The autogenerated syntax from NEuler can be a little confusing, but separating it out, you should run the following three procedures:

//1. Create your graph
// GDS 2.0 takes the node projection, relationship projection and config as separate arguments
CALL gds.graph.project(
  'myGraph',
  'Price',
  {
    relType: {
      type: 'PRICE_PERIOD',
      orientation: 'NATURAL'
    }
  },
  {nodeProperties: ['volume']}
)

//2. Run KNN on your graph and write results to database
CALL gds.knn.write('myGraph', {
  writeProperty: 'score',
  writeRelationshipType: 'SIMILAR_KNN_VOLUME',
  topK: 10,
  randomJoins: 10,
  sampleRate: 0.5,
  deltaThreshold: 0.001,
  nodeProperties: {volume: 'PEARSON'}
})

//3. Drop graph
CALL gds.graph.drop('myGraph')
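
On the other half of the question (getting results without persisting anything), the execution mode in the procedure name is what controls it: `write` stores results in the database, while `stream` only returns them to the caller. A minimal sketch, assuming the 'myGraph' projection from step 1 still exists (so run it before step 3) and reusing the same KNN settings minus the write options:

```cypher
// Stream mode: returns similarity pairs as rows and writes nothing to the database.
// Assumes 'myGraph' has been projected and not yet dropped.
CALL gds.knn.stream('myGraph', {
  topK: 10,
  randomJoins: 10,
  sampleRate: 0.5,
  deltaThreshold: 0.001,
  nodeProperties: {volume: 'PEARSON'}
})
YIELD node1, node2, similarity
RETURN gds.util.asNode(node1) AS price1, gds.util.asNode(node2) AS price2, similarity
ORDER BY similarity DESC
LIMIT 10
```

The column aliases `price1` and `price2` are illustrative only; `node1` and `node2` are internal node ids that `gds.util.asNode` resolves back to the stored nodes.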

bryant
Node Clone

Thanks again for your explanation on this, Alicia. I wanted to follow up, mark your comment as the solution, and let you know how it worked out.
I also had to do the same thing with all my GDS 2.0 algorithms, including Louvain and Label Propagation in addition to the KNN similarity algorithm. I rewrote all my GDS Cypher scripts for 2.0 and everything's working again now.
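
For readers making the same migration, the named-graph pattern carries over to the other algorithms mentioned. The following is only a sketch of what such calls might look like, not the scripts from this thread: it assumes a projection named 'myGraph' already exists, and the write property names `louvainCommunity` and `lpaCommunity` are hypothetical.

```cypher
// Louvain in write mode against a named projection (property name is hypothetical)
CALL gds.louvain.write('myGraph', {writeProperty: 'louvainCommunity'})
YIELD communityCount, modularity;

// Label Propagation in write mode against the same projection (property name is hypothetical)
CALL gds.labelPropagation.write('myGraph', {writeProperty: 'lpaCommunity'})
YIELD communityCount, ranIterations;
```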