CDR analysis

dimespi · August 5, 2019, 10:13am

Hello,

I'm doing some analysis on Call Detail Records (CDR). My dataset is similar to the one in this post: Using Neo4j for Call Detail Records (CDR) Analytics [Community Post]

Here are the fields from my dataset :

source (operator)
called_number
calling_number
calling_date
country_code_from
country_code_to
usage
service_name (SMS, DATA, VOICE)
- SMS-OUTGOING
- SMS-OUTGOING-ROAMING
- SMS-INCOMING
- DATA-OUTGOING
- DATA-OUTGOING-ROAMING
- VOICE-OUTGOING
- VOICE-OUTGOING-ROAMING
- VOICE-INCOMING
- VOICE-INCOMING-ROAMING

If the service_name is SMS, the usage value will be set to 1.
If the service_name is DATA, the called_number and country_code_to will be empty.
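
To make the two rules above concrete, here is a minimal Python sketch that splits a service_name value into its parts and checks a record against those rules. The class and function names (CdrRecord, split_service_name, validate) are hypothetical, the field types are guesses, and it assumes that "service_name is SMS" means the value starts with SMS (e.g. SMS-OUTGOING).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# The three service families listed in the post.
SERVICE_TYPES = {"SMS", "DATA", "VOICE"}


@dataclass
class CdrRecord:
    # Field names come from the post; the types are assumptions.
    source: str
    called_number: Optional[str]
    calling_number: str
    calling_date: str  # kept as a raw string, the date format is not stated
    country_code_from: str
    country_code_to: Optional[str]
    usage: float  # unit is not stated in the post
    service_name: str


def split_service_name(service_name: str) -> Tuple[str, Optional[str], bool]:
    """Split e.g. 'VOICE-OUTGOING-ROAMING' into ('VOICE', 'OUTGOING', True)."""
    parts = service_name.split("-")
    service_type = parts[0]
    direction = parts[1] if len(parts) > 1 else None
    roaming = len(parts) > 2 and parts[2] == "ROAMING"
    return service_type, direction, roaming


def validate(record: CdrRecord) -> List[str]:
    """Return a list of rule violations; an empty list means the record is consistent."""
    problems = []
    service_type, _, _ = split_service_name(record.service_name)
    if service_type not in SERVICE_TYPES:
        problems.append("unknown service type: " + service_type)
    # Rule from the post: SMS records always carry usage == 1.
    if service_type == "SMS" and record.usage != 1:
        problems.append("SMS record with usage != 1")
    # Rule from the post: DATA records have no called_number or country_code_to.
    if service_type == "DATA" and (record.called_number or record.country_code_to):
        problems.append("DATA record with called_number or country_code_to set")
    return problems
```

Records that fail these checks could be set aside before any model is trained, so that data-quality problems are not mistaken for fraud.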

I'd like to apply machine learning algorithms to this data for fraud/anomaly detection and prediction. I'm wondering which one would be best for my use case: K-means, RandomForest, NaiveBayes, or time series analysis?

I found this:

I'm using py2neo and MLlib.

jennifer_reif · August 6, 2019, 7:01pm

What kinds of fraud or anomalies are you looking for in this data set? Understanding a bit more about your use case would help me narrow down the best options.

Cheers,
Jennifer

rsagar4 · September 7, 2019, 6:58am

Hi dimespi, were you able to find any example code for CDR analytics? It would be great if you could share a link to an example. Also, do you have a sample dataset for this? Thanks in advance.
