Typo or correct in 04_Predictions.ipynb of Data Science with Neo4j 3.5

mark.needham · November 30, 2020, 1:54pm

Hey,

No, it isn't a typo.

We split the data into EARLY (train) and LATE (test) graphs to help pick the pairs of positive and negative examples that go into the feature matrices.

Then, when we're computing the scores for the train matrix, we need to make sure that we don't look at any data that's in the test graph, which is why we use CO_AUTHOR_EARLY for all our computations there.

But when we compute the scores for the test matrix we don't need to worry about that. It wouldn't actually make sense to compute those scores based only on the LATE graph, as we'd be missing all of the collaborations that have already happened.
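
To make that concrete, here's a minimal sketch in plain Python rather than the notebook's actual code. The toy edge lists, the helper names `build_adjacency` and `common_neighbors`, and the choice of common neighbours as the score are all made up for illustration. It assumes the test features are computed over the combined early and late graph, which is how I'd read the point above.

```python
from itertools import chain

# Hypothetical toy data: co-authorship pairs split by time.
early_edges = [("a", "b"), ("b", "c"), ("a", "d")]  # stands in for CO_AUTHOR_EARLY
late_edges = [("c", "d"), ("b", "d")]               # stands in for CO_AUTHOR_LATE


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def common_neighbors(adjacency, u, v):
    """Count the neighbours shared by u and v in the given graph."""
    return len(adjacency.get(u, set()) & adjacency.get(v, set()))


early_graph = build_adjacency(early_edges)
late_graph = build_adjacency(late_edges)
full_graph = build_adjacency(chain(early_edges, late_edges))

# Train features: early graph only, so nothing from the test period leaks in.
train_score = common_neighbors(early_graph, "a", "c")

# Test features: early + late, so earlier collaborations are still counted.
test_score = common_neighbors(full_graph, "a", "c")

# Late graph alone: the earlier collaborations are missing.
late_only_score = common_neighbors(late_graph, "a", "c")

print(train_score, test_score, late_only_score)
```

For the pair ("a", "c"), the late-only graph has no record of "a" at all, so the score drops to zero even though the two authors share collaborators.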

Hope that makes sense.

Cheers, Mark
