Data ingestion from scratch (via auto API)

I am a bit unsure what to do about populating a Neo4j database from scratch. So far I have mostly been focused on how to represent Neo4j data visually.

Most topics here are about ingesting data from already existing sources, in which case importing a CSV file makes sense. However, I don't have anything yet, so in practice I need to select text and paste it into the database.

So my first thought was to put some auto-generated REST/CRUD API in front of the database. If I were working with MariaDB/Postgres I would use an API like this[1] and see if I could integrate it with some client to automate input.
It looks like this has been discussed earlier[3], and this Sofa project[4] looks interesting. I was just wondering if there are any newer or better approaches than the ones suggested there?

Basically I am going to ingest nodes and their relationships, and I need something that records which user added them. (I don't have a DB schema yet.)

The other thing I am unsure about: it would be easier for me to implement this in a regular RDBMS and then export the data to Neo4j, but the disadvantage is that I then can't view newly added data in real time.

So I was wondering whether anyone is doing something similar, adding data 'manually', and what they decided to implement.

[1]
[2]
[3]
[4]

Think twice before trying the Sofa API; it could be a waste of your time. It seems buggy, and I barely tested it.

I don’t know your tech stack constraints, but you could build a REST API to perform CRUD operations and other specific updates/queries using Spring Boot and the Java API.
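
Something like this, for example: a rough sketch with Spring Data Neo4j's Neo4jClient, where the Person label, the /persons path, the X-User header, and the addedBy property are all made up for illustration.

    // Hypothetical sketch: a small CRUD-style endpoint with Spring Boot and
    // Spring Data Neo4j. Person, /persons, X-User and addedBy are made up.
    import java.util.Collection;
    import java.util.Map;
    import org.springframework.data.neo4j.core.Neo4jClient;
    import org.springframework.web.bind.annotation.*;

    @RestController
    @RequestMapping("/persons")
    public class PersonController {

        private final Neo4jClient neo4jClient;

        public PersonController(Neo4jClient neo4jClient) {
            this.neo4jClient = neo4jClient;
        }

        // Create a node and record which user added it.
        @PostMapping
        public void create(@RequestBody Map<String, Object> body,
                           @RequestHeader("X-User") String user) {
            neo4jClient.query("CREATE (p:Person {name: $name, addedBy: $user})")
                    .bind(body.get("name")).to("name")
                    .bind(user).to("user")
                    .run();
        }

        // Read all nodes back as plain maps.
        @GetMapping
        public Collection<Map<String, Object>> all() {
            return neo4jClient
                    .query("MATCH (p:Person) RETURN p.name AS name, p.addedBy AS addedBy")
                    .fetch().all();
        }
    }

Passing the user through a header like that would also cover your "register which user added them" requirement.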

Yes, maybe I have to look into that. Currently I am operating mostly at the lower end of the market, where e.g. Java developers charge me 16 hours for a downloaded STOMP WebSocket example and still don't know the difference between topics and queues, or how to implement queues correctly.

I don't get what your objective is.

If you are going to be doing ad-hoc ingestion, your pipeline could look like this:

source -> Python code doing ETL -> Neo4j
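
A minimal sketch of that ETL step, written in Java with the official neo4j-java-driver (5.x) rather than Python since Java is the stack discussed elsewhere in this thread; the input rows and the Person label are invented:

    // Hypothetical ETL sketch: extract rows from some source, transform them,
    // and batch-load them into Neo4j with UNWIND. All names are illustrative.
    import java.util.List;
    import java.util.Map;
    import org.neo4j.driver.AuthTokens;
    import org.neo4j.driver.Driver;
    import org.neo4j.driver.GraphDatabase;
    import org.neo4j.driver.Session;

    public class AdHocEtl {
        public static void main(String[] args) {
            // Extract: in reality these rows would come from your source system.
            List<Map<String, Object>> rows = List.of(
                    Map.of("name", "Alice", "addedBy", "admin"),
                    Map.of("name", "Bob", "addedBy", "admin"));

            try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                    AuthTokens.basic("neo4j", "password"));
                 Session session = driver.session()) {
                // Load: one round trip for the whole batch via UNWIND + MERGE.
                Map<String, Object> params = Map.of("rows", rows);
                session.executeWrite(tx -> tx.run(
                        "UNWIND $rows AS row "
                        + "MERGE (p:Person {name: row.name}) "
                        + "SET p.addedBy = row.addedBy", params).consume());
            }
        }
    }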

If you want to get updates regularly or have more dynamic types of ingestion, you will do something like this:

source -> Extract component -> GraphQL API -> Neo4j

Where your GraphQL API will take the shape of whatever "transform -> load" you are doing.
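
On the JVM that GraphQL layer could be sketched with Spring for GraphQL (spring-boot-starter-graphql); the schema and the addPerson mutation below are invented for illustration, and the official Neo4j GraphQL library would be another option entirely:

    // Hypothetical sketch of the "transform -> load" behind one GraphQL mutation.
    // Assumes a schema file src/main/resources/graphql/schema.graphqls containing:
    //   type Query { personCount: Int }
    //   type Mutation { addPerson(name: String!): Boolean }
    import org.springframework.data.neo4j.core.Neo4jClient;
    import org.springframework.graphql.data.method.annotation.Argument;
    import org.springframework.graphql.data.method.annotation.MutationMapping;
    import org.springframework.stereotype.Controller;

    @Controller
    public class IngestController {

        private final Neo4jClient neo4jClient;

        public IngestController(Neo4jClient neo4jClient) {
            this.neo4jClient = neo4jClient;
        }

        // Transform the incoming argument, then load it into the graph.
        @MutationMapping
        public Boolean addPerson(@Argument String name) {
            neo4jClient.query("MERGE (p:Person {name: $name})")
                    .bind(name.trim()).to("name") // trivial "transform" step
                    .run();
            return true;
        }
    }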

But so far, in my experience, here is what I ended up doing:

1. convert the source to CSV or JSON
2. move it to the import folder
3. apoc.load.csv() (see the sketch after this list)
4. cleanse the nodes
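
For step 3, a sketch of what the apoc.load.csv() call can look like, here driven from Java via the official driver; it assumes APOC is installed, people.csv is already in the import folder, and the Person label and columns are made up:

    // Hypothetical sketch of step 3: calling apoc.load.csv from Java.
    // Assumes APOC is installed and people.csv sits in Neo4j's import folder.
    import org.neo4j.driver.AuthTokens;
    import org.neo4j.driver.Driver;
    import org.neo4j.driver.GraphDatabase;
    import org.neo4j.driver.Session;

    public class CsvImport {
        public static void main(String[] args) {
            try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                    AuthTokens.basic("neo4j", "password"));
                 Session session = driver.session()) {
                session.run(
                    "CALL apoc.load.csv('file:///people.csv') YIELD map "
                    + "MERGE (p:Person {name: map.name}) "
                    + "SET p.email = map.email");
            }
        }
    }

Step 4 is then ordinary Cypher cleanup, run the same way.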

And I have https://debezium.io in my sights as a source for CDC (change data capture), if required.

I was thinking of updating a node, or the properties of a node, via an API. In some cases a property will be added. I want to keep the client as simple as possible, so I will do some of the modifications in the middleware and/or in a Neo4j plugin.
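
Roughly what I have in mind, as a sketch assuming Spring: Cypher's SET p += $props merges whatever properties the client sends, so a newly added property needs no schema change (the /nodes path and Person label are made up).

    // Hypothetical middleware sketch: accept arbitrary properties from a simple
    // client and merge them onto the node with `SET p += $props`, so adding a
    // new property needs no schema change. /nodes and Person are made up.
    import java.util.Map;
    import org.springframework.data.neo4j.core.Neo4jClient;
    import org.springframework.web.bind.annotation.*;

    @RestController
    @RequestMapping("/nodes")
    public class NodePatchController {

        private final Neo4jClient neo4jClient;

        public NodePatchController(Neo4jClient neo4jClient) {
            this.neo4jClient = neo4jClient;
        }

        // PATCH /nodes/{name} with a JSON body of properties to merge in.
        @PatchMapping("/{name}")
        public void patch(@PathVariable String name,
                          @RequestBody Map<String, Object> props) {
            neo4jClient.query("MATCH (p:Person {name: $name}) SET p += $props")
                    .bind(name).to("name")
                    .bind(props).to("props")
                    .run();
        }
    }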

I am currently testing a bit with

spring-boot-starter-data-neo4j and spring-boot-starter-data-rest

and switching from the movies database on bolt+s://demo.neo4jlabs.com:7687 to my own, and I am getting this error:

2025-03-25T20:05:26.976Z WARN 2827 --- [nio-8090-exec-1] o.s.data.neo4j.cypher.unrecognized : Neo.ClientNotification.Statement.UnknownPropertyKeyWarning: The provided property key is not in the database
MATCH (movie:Movie) RETURN movie{.released, .tagline, .title, nodeLabels: labels(movie), elementId: id(movie)}

This is because properties are missing on some nodes. But what I don't understand is that this is quite common for nodes in any database. AI is telling me the solution is to start overriding queries, but I have the impression that there must be a more 'automatic' solution than overriding every query?
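
For reference, the override that AI suggests would look roughly like this: a sketch assuming the Movie entity (with a String id) from the SDN movies example, where returning the whole node instead of the generated per-property projection keeps the server from complaining about keys that don't exist.

    // Hypothetical sketch of the suggested workaround: a custom @Query that
    // returns the node itself rather than the generated per-property projection.
    // Movie and its String id are assumed from the SDN movies example.
    import java.util.List;
    import org.springframework.data.neo4j.repository.Neo4jRepository;
    import org.springframework.data.neo4j.repository.query.Query;

    public interface MovieRepository extends Neo4jRepository<Movie, String> {

        @Query("MATCH (m:Movie) RETURN m")
        List<Movie> findAllMovies();
    }

But that is a per-repository workaround, not the 'automatic' solution I am after.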

It’s just a warning, so no impact.