Neo4j in Churn

skandagn · August 17, 2021, 7:21am

Hi Guys,

Posting a random idea/use case: using Neo4j for customer churn in the telecom domain. We all know that predicting customer churn is important for any company, and that taking steps to retain customers pays off in almost every industry. We focus specifically on telcos, and we are brainstorming with data on how to improve a churn model built with XGBoost. Traditional ML algorithms work well and give decent accuracy. We see potential for graph databases here, but the customer data is all in a traditional RDBMS. The challenge is: how do we use it with Neo4j? Pushing a huge volume of data every day would take a lot of effort and time. Are there any effective ways to do this?

Any thoughts are welcome!

Thanks!

david_allen · August 17, 2021, 11:57am

There are a lot of different integration approaches, from batch to streaming, but yes, basically you end up taking the data from the RDBMS and writing it into a graph shape so that you can use graph approaches like GDS. Which exact approach you take depends on what's in place, how fresh you need the data to be, and so on.

Some customers use the Spark connector (Neo4j Connector for Apache Spark v5.0.0 - Neo4j Spark Connector) to pull batches and engineer them into a graph shape. Others use the Kafka integration to set up source connectors for their original DBs and replicate data into the graph. There are a bunch of other ways besides these too, depending on what you're trying to do.
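To make "writing it into a graph shape" a bit more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the `customers` and `calls` tables, their columns, the `Customer` label and the `CALLED` relationship are hypothetical, and an in-memory SQLite database stands in for the real RDBMS. It only shows the reshaping step: rows become node and relationship parameter lists that a batched `UNWIND ... MERGE` Cypher statement could consume.

```python
import sqlite3

def load_sample_rdbms():
    """Build a tiny in-memory database standing in for the real RDBMS.
    Table and column names are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, plan TEXT, churned INTEGER);
        CREATE TABLE calls (caller_id INTEGER, callee_id INTEGER, minutes REAL);
        INSERT INTO customers VALUES (1, 'prepaid', 0), (2, 'postpaid', 1), (3, 'prepaid', 0);
        INSERT INTO calls VALUES (1, 2, 12.5), (1, 2, 3.0), (2, 3, 7.0);
    """)
    return conn

def to_graph_shape(conn):
    """Turn relational rows into node and relationship parameter lists."""
    # One node per customer row.
    nodes = [
        {"id": cid, "plan": plan, "churned": bool(churned)}
        for cid, plan, churned in conn.execute(
            "SELECT id, plan, churned FROM customers ORDER BY id"
        )
    ]
    # One relationship per (caller, callee) pair, aggregated over call records.
    rels = [
        {"src": src, "dst": dst, "calls": n, "minutes": total}
        for src, dst, n, total in conn.execute(
            "SELECT caller_id, callee_id, COUNT(*), SUM(minutes) "
            "FROM calls GROUP BY caller_id, callee_id "
            "ORDER BY caller_id, callee_id"
        )
    ]
    return nodes, rels

# Cypher statements that would consume the lists above as the $rows parameter.
NODE_CYPHER = """
UNWIND $rows AS row
MERGE (c:Customer {id: row.id})
SET c.plan = row.plan, c.churned = row.churned
"""

REL_CYPHER = """
UNWIND $rows AS row
MATCH (a:Customer {id: row.src}), (b:Customer {id: row.dst})
MERGE (a)-[r:CALLED]->(b)
SET r.calls = row.calls, r.minutes = row.minutes
"""

if __name__ == "__main__":
    nodes, rels = to_graph_shape(load_sample_rdbms())
    print(nodes)
    print(rels)
```

With the official Neo4j Python driver, the lists could then be passed as query parameters, for example `session.run(NODE_CYPHER, rows=nodes)`. Using `MERGE` rather than `CREATE` keeps a daily re-run from duplicating customers, at the cost of a lookup per row, so a uniqueness constraint on `Customer.id` would likely be worth adding.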

skandagn · August 19, 2021, 5:59am

Thanks for writing, David. It was helpful. If you can point to some example GitHub repos or demos, that would be great. The data will be updated daily, and the solution is in AWS. I guess services like AWS Glue could help automate a PySpark job that converts the data from the RDBMS to a graph DB.

sameer.gijare14 · August 28, 2021, 10:04pm

Hello skandagn,
You can write some scripts that connect to the RDBMS and pump the data into the graph store. The details of the script will vary from DB to DB, but the central idea is that you connect to the source and periodically pull data from the source system into the target. The pull frequency can be tuned based on the capacity of the store you are pushing data into, as well as the amount of data available at the source. If it is just a one-time upload, it is better to trigger it when the relevant event happens. But if it happens regularly, you can use a scheduler of your choice to pull the data from the source and push it into the store.
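A rough sketch of the periodic pull, assuming the source table has an `updated_at` column that can serve as a watermark (this column, the `customers` table and the `write_batch` callback are all hypothetical, and SQLite stands in for the real source DB). Each run pulls only rows changed since the previous run and hands them to the writer in batches, so the daily load stays proportional to what changed rather than to the full table.

```python
import sqlite3

def fetch_changed_rows(conn, last_watermark):
    """Return rows modified after last_watermark, plus the new watermark.
    Assumes updated_at holds ISO-8601 strings, which sort chronologically."""
    rows = conn.execute(
        "SELECT id, plan, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY updated_at, id",
        (last_watermark,),
    ).fetchall()
    # If nothing changed, keep the old watermark.
    new_watermark = rows[-1][2] if rows else last_watermark
    return rows, new_watermark

def batches(rows, size):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]

def run_daily_sync(conn, last_watermark, write_batch, batch_size=1000):
    """Pull changed rows and pass them to write_batch in chunks.
    write_batch is a hypothetical callback that pushes one list of
    dicts into the graph store (for example via a Cypher UNWIND query)."""
    rows, new_watermark = fetch_changed_rows(conn, last_watermark)
    for batch in batches(rows, batch_size):
        write_batch([{"id": r[0], "plan": r[1]} for r in batch])
    # The caller should persist this value for the next scheduled run.
    return new_watermark
```

A scheduler (cron, Airflow, or a Glue trigger, whichever is already in place) would call `run_daily_sync` once per day and store the returned watermark. One caveat with this simple scheme: a row written later with the same `updated_at` value as the stored watermark would be skipped, so a finer-grained timestamp or a small overlap window may be needed.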
Hope this clarifies things.
Thanking you
Yours faithfully
Sameer
