What is the recommended way to delete nodes in a Neo4j database using the Spark connector?
Documentation around this topic seems to be lacking. The following code is rejected as an invalid write query by neo4j-connector-apache-spark_2.12-5.1.0_for_spark_3.jar. Guidance on best practices would be welcome.
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, IntegerType

spark = SparkSession.builder.getOrCreate()

# Dummy single-column DataFrame, since the write API needs one.
empty_df = spark.createDataFrame(
    [],
    schema=StructType([
        StructField("boilerplate", IntegerType(), nullable=True),
    ]),
)

# NEO_URL, NEO_PORT, NEO_USER and NEO_PWD hold the connection details.
(empty_df
    .write
    .format("org.neo4j.spark.DataSource")
    .option("url", f"bolt://{NEO_URL}:{NEO_PORT}")
    .option("authentication.basic.username", NEO_USER)
    .option("authentication.basic.password", NEO_PWD)
    .mode("overwrite")
    .option("query", """
        :auto
        MATCH (n)
        CALL {
            WITH n
            DETACH DELETE n
        } IN TRANSACTIONS OF 10000 ROWS;
    """)
    .save())
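
In case it is useful context, one direction I have been considering is sketched below. It is untested and based only on my reading of the connector documentation, so treat the details as assumptions: the docs describe the query write option as prepending UNWIND $events AS event to the statement, so each DataFrame row should be visible as event, and batch.size should control how many rows go into each transaction (standing in for IN TRANSACTIONS OF ... ROWS). I also suspect the original attempt fails because :auto is a cypher-shell/Browser directive rather than Cypher. The idea is to push node identifiers through the DataFrame and let the connector do the batched deleting:

# Untested sketch: read the element ID of every node, then feed the IDs
# back through a write query so the connector handles the batching.
# Assumes Neo4j 5 (elementId(); on 4.x it would be id()) and that the
# connector prepends `UNWIND $events AS event` to write queries, per its docs.
ids_df = (spark.read
    .format("org.neo4j.spark.DataSource")
    .option("url", f"bolt://{NEO_URL}:{NEO_PORT}")
    .option("authentication.basic.username", NEO_USER)
    .option("authentication.basic.password", NEO_PWD)
    .option("query", "MATCH (n) RETURN elementId(n) AS nodeId")
    .load())

(ids_df
    .write
    .format("org.neo4j.spark.DataSource")
    .option("url", f"bolt://{NEO_URL}:{NEO_PORT}")
    .option("authentication.basic.username", NEO_USER)
    .option("authentication.basic.password", NEO_PWD)
    .option("batch.size", "10000")  # rows per transaction, in place of IN TRANSACTIONS
    .option("query", "MATCH (n) WHERE elementId(n) = event.nodeId DETACH DELETE n")
    .mode("overwrite")
    .save())

For smaller graphs, the script option (semicolon-separated Cypher statements the connector is said to run before the write) might accept a plain MATCH (n) DETACH DELETE n, though I assume the CALL { ... } IN TRANSACTIONS form would hit the same validation problem there. Is either of these the intended pattern, or is there a better-supported way?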