Spark Connector only returns empty DataFrames

lukas.b · November 9, 2020, 12:14pm

Hi,

I need help troubleshooting a rather weird error. I have set up a single-instance Azure VM running Neo4j, following the official documentation, to feed data to an Azure Databricks cluster running Spark. I connected to the Neo4j VM via HTTP on port 7474 to populate it with some data. On the Databricks cluster, I installed the connector and followed this documentation, basically just setting the connection address and login credentials as Spark parameters.

When I run a sample query via the Spark connector on the Databricks cluster, I can establish a connection successfully. However, the query only returns empty data:

%scala
import org.neo4j.spark._
val neo = Neo4j(sc)
// => neo: org.neo4j.spark.Neo4j = org.neo4j.spark.Neo4j@7c444d23

%scala
val rdd = neo.cypher("MATCH (n:Person) RETURN id(n) as id ").loadRowRdd
rdd.count
// => rdd: org.apache.spark.rdd.RDD[org.apache.spark.sql.Row] = Neo4jRDD partitions Partitions(1,9223372036854775807,9223372036854775807,None) MATCH (n:Person) RETURN id(n) as id  using Map()
// => res1: Long = 0

The same problem shows up with .loadDataFrame, .loadGraphFrame, etc., which fail because the result is empty:

// => java.lang.RuntimeException: Cannot infer schema-types from empty result, please use loadDataFrame(schema: (String,String)*)

I can confirm that the query should not return an empty result: when I connect to the remote VM from my local Neo4j Desktop and run it there, it returns data.

Where is my mistake here? Thanks in advance!

(Logs and specs, see below)

david_allen · November 12, 2020, 12:28pm

You are using the old connector, which works in a different way and supports a different set of Spark versions.

Please consider having a look at the new Neo4j Connector for Apache Spark. It's easier to use, more modern, and under active development: Neo4j Connector for Apache Spark v5.0.0 - Neo4j Spark Connector
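
For illustration, the query from the original post might look roughly like this with the new connector's DataSource API. This is a minimal sketch, not a verified fix for the empty result: it assumes the new connector is installed on the Databricks cluster and that Neo4j is reachable over Bolt. The host, port, and credentials are placeholders that do not come from this thread.

```scala
// Databricks notebook cell (%scala); `spark` is the predefined SparkSession.
// Placeholder connection details: replace host, user, and password with your own.
val df = spark.read
  .format("org.neo4j.spark.DataSource")
  .option("url", "bolt://<neo4j-vm-host>:7687")            // assumed Bolt endpoint of the Neo4j VM
  .option("authentication.basic.username", "neo4j")        // placeholder user
  .option("authentication.basic.password", "<password>")   // placeholder password
  .option("query", "MATCH (n:Person) RETURN id(n) AS id")  // same Cypher query as in the question
  .load()

// Inspect the inferred schema and count the returned rows.
df.printSchema()
df.count()
```

Here the connection settings are passed per read as options instead of through the `Neo4j(sc)` wrapper used by the old connector.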
