Adding correlation values as relationship properties

I have a csv file with 458 stocks of the NSE. It is a 458 X 458 CSV file which showcases correlation values of each stock with another stock in the recent coronavirus crisis. I intend to build a graph with the stocks as nodes, and the relationships as the correlation values. Then I intend to get the nodes between which we have a high correlation.

I have made the graph with stocks as nodes, but am having trouble understanding how to import the correlation values as relationships.

Hey @nilanjandebnath13, you already created the nodes of stocks with name property I guessed. So, this may help you to proceed further

LOAD CSV WITH HEADERS FROM 'nse.csv' as rows
WITH rows,keys(rows) AS header
WITH rows,header,range(1,size(header)-1) AS len
UNWIND len as l
MATCH (s:Stock {name:rows[header[0]]})
MERGE (s)-[:re {value:rows[header[l]]} ]->(:Stock {name:header[l]})

Hey, I am really sorry but I am pretty new at this. What does this code do exactly? I ran it and it takes a bit of time but I cannot see anything happen. No relationship is made, nor any property added to the nodes.

LOAD CSV WITH HEADERS FROM 'file:///stocks.csv' as rows
WITH rows,keys(rows) AS header
WITH rows,header,range(1,size(header)-1) AS len
UNWIND len as l
MATCH (s:Company {label:rows[header[0]]})
MERGE (s)-[:re {value:rows[header[l]]} ]->(:Company {label:header[l]})

I modified the code a bit because some things were different. But otherwise its all the same.

Before running this query, have you created nodes with Company label and label property?

Yes..
I used this.

LOAD CSV WITH HEADERS FROM "file:///stocks.csv" AS company CREATE(a1:Company{label:company.SC_NAME})

Hey, try to run this query:
LOAD CSV WITH HEADERS FROM 'file:///stocks.csv' as rows
WITH rows,keys(rows) AS header
WITH rows,header,range(1,size(header)-1) AS len
UNWIND len as l
MATCH (s:Company {label:rows[header[0]]})
Return s

And see whether you are getting all companies nodes or not?

If not, try to run this:
LOAD CSV WITH HEADERS FROM 'file:///stocks.csv' as rows
WITH rows,keys(rows) AS header
WITH rows,header,range(1,size(header)-1) AS len
UNWIND len as l
Return rows[header[0]]

And see whether you are getting same names of company or not

If I run the first query, nothing happens, however when I run the second query, I get a table with one column and 209764 rows. From the top around 400 entries are -0.11655942553863054. The next 400 are -0.11073719397142893. The next are 0.16142129640360856.

LOAD CSV WITH HEADERS FROM 'file:///stocks.csv' as rows
WITH rows,keys(rows) AS header
Return header

run this query and see whether you are getting same headers which are present in csv file

Yep. I am getting all 458 headers. But I am getting them 458 times. So basically I am getting 458 times this.

["HERITGFOOD", "ULTRATECH CM", "NOCIL LTD", "TV18BRDCST", "GRIND NORTON", "ADANIGREEN", "UNITED BREWA", "IDBI BANK L", "KAVERI SEED", "JAIN IRRIG", "RALLIS INDIA", "INFO EDGE", "TATASTEELPP", "TITAN", "K.P.R. MILL", "SUNDARAM FAS", "GLAXOSMITHKL", "TORRENT PHAR", "NAVIN FLUORI", "ALEM PHARMA", "MEGHMANI", "SOLAR IND", "BLUE DART EX", "FINOL. CABLE", "JAGRAN PRAK", "PHIL.CARBON", "SUN PHARMA.", "AARTI INDUST", "ASHOKA", "GLAXOSMITH", "MINERAL&METL", "GODREJ IND", "RAYMOND", "CENTURY TEXT", "QUESS", "SHANKARA", "ABBOTT (I)", "JYOTHYLAB", "KANSAI NERO", "MCX LTD", "SHRMCITYUN", "IBUL HSG FIN", "ITC LTD.", "SOBHA", "LEMONTREE", "TATA METALIK", "EIH LTD.", "NATCO PHARM", "LALPATHLAB", "GUJ.NARMADA", "VARDH TEXT", "MANAP FIN", "NLCINDIA", "GPPL", "MANPASAND", "UNION BANK L", "HIND UNI LT", "SIEMENS LTD.", "AVANTI FEEDS", "NIACL", "ICICI BANK", "MRF LTD.", "DISH TV", "BSOFT", "INFIBEAM", "AXIS BANK", "INFOSYS LTD", "V GUARD IND", "TORNT POWER", "NMDC LTD", "ORACLE FIN", "HONEYWELL", "FORTIS", "CHAMBAL FERT", "SANOFI", "IRB INFRA", "MAHINDCIE", "IFCI LTD", "BNK OF INDIA", "INDIAN BANK", "RAMCOCEM", "RADICOKHAIT", "AJANTA PHARM", "EICHER MOTOR", "POWER GRID", "PETRONET LNG", "VENKEY'S (I)", "CONTAIN.CORP", "CUMMINS (I)", "FORCE MOTR", "GIC HOUSING", "VMART", "DCAL", "PEL", "LAXMIMACH", "TATA ELXSI", "TAMIL.NEWSPR", "JAI CORP LTD", "MARICOLTD", "UJJIVAN", "AIAENGINEER", "GODREJAGRO", "PNBHOUSING", "LAURUSLABS", "CHOLAHLDNG", "NHPC", "GODREJ CONS", "SUVEN LIFE S", "JAMNA AUTO", "UPL", "HINDUS.ZI", "LIC HOUS.FIN", "STATE BANK", "PGHL", "FSL", "GREAVES COTT", "INDIAN HOTEL", "YES BANK", "NCC", "NBCC", "ARVIND Ltd", "MARUTISUZUK", "SYNDICATE BN", "HINDALCO", "AMAR RAJA BA", "SC_NAME", "THYROCARE", "SUZLONENERGY", "BALKRISH IND", "JSW ENERGY", "KEC INTERN", "CESC LTD.", "INDIAN OVER.", "PRSMJOHNSN", "GHCL LTD", "MAHINDRALIFE", "NAVKARCORP", "GRUH FINANCE", "PI INDUS.LTD", "PHOENIX MILL", "ESCORTS LTD.", "GODREJ PROP", "RAJESH EXPO.", "BOM DYEING", "GUJ.ST.FERT", "HEROMOTOCO", "EMAMI LTD.", "HDFC BANK", "KOTAK MAH.BK", "COX KINGS", "CENTRAL BK", "SKF INDIA", "GRASIM IND.", "BATA INDIA", "TRIDENT", "ASHOK LEYL.", "COROM INTER", "ADANI PORTS", "CIPLA LTD.", "MFSL", "CRISIL LTD", "CARBORUNDUM", "ASTRAZEN.PH.", "WOCKHARDT", "DABUR (I)", "SUPREM IND", "BAYER CROP", "GICRE", "ADVENZYMES", "ALKEM", "SCHAEFFLER", "LAKSHMIVILAS", "GRAPHITE IN.", "E.I.D.PAR", "AUROB.PHARMA", "JK LAKSHMI", "TEAMLEASE", "JK CEMENT", "MOTHERSON SS", "FCONSUMER", "SUN TVNET", "GAIL (I) LTD", "REPCO HOME", "IIFL", "CHENNAI PET.", "ZEE ENTER", "BHARAT ELECT", "PNCINFRA", "WABCO", "DR.REDDY'S", "KAJARIA CER.", "TATAMTRTDVR", "MAXINDIA", "ONGC CORPN", "THERMAX", "PIDILITE", "KEI INDUST.", "ZYDUS WELL", "PERSISTENT", "MINDTREE LTD", "SHARDACROP", "KNR CONST", "GUJGAS", "PROCTER &GAM", "SRF LIMITED", "MGL", "CASTROL", "MAGMA FIN", "MOIL LTD", "IPCA LAB LTD", "OBEROI REAL", "BAJ HOLD INV", "BEML LTD", "UCOBANK", "ENDURANCE", "CYIENT", "SUNPHA ADV", "MAH & MAH", "BHARAT FORGE", "DELTA CORP", "HIND.CONSTN.", "INDBUL REAL", "FRETAIL", "REDINGTON", "LTTS", "TATA POWER", "NIITTECH", "ICICIPRULI", "SWAN ENG", "CORPN.BANK", "INDIGO", "CROMPTON", "ZENSAR TECH.", "ALLAHABAD BK", "BAJFINANCE", "FUTLIF FAS", "RELIANCE", "PTC INDIA", "COLGATE PALM", "BASF INDIA", "INDUSIND BNK", "GILLETTE (I)", "MRPL", "PC JEWELLER", "ORINTAL BANK", "EQUITAS", "HCL TECHNO", "OIL INDIA", "IDFCFIRSTB", "INOX LEISURE", "JINDAL SAW", "SADBHAV ENGG", "MAHLOG", "HIND.COPPER", "HAL", "ORIENT CEM", "STAR", "PAGE INDUSTR", "ABB LTD.", "SHRIRAM TRAN", "TATA STEEL", "PUNJAB NATIO", "ANDHRA BANK", "COFFEEDAY", "EXIDE IND.", "ABFRL", "BAJAJ ELECT.", "AMBUJA CEME", "ECLERX", "WHIRLPOOL", "HSCL", "RECLTD", "WESTLIFE DEV", "HDFC", "J & K BANK", "APOLLO TYRES", "GET&D", "ASIAN PAINTS", "VINATI ORG.", "BERGE PAINT", "ICICIGI", "DLF LIMITED", "RASHTR.CHEM.", "INDIAN OIL", "JAIPRA", "JUBL FOOD", "LNT FH", "ERIS", "VST INDUS.", "DEEPAK FERT.", "SBILIFE", "GATEWAY DIST", "IEX", "INDIA CEM.", "HEG LIMITED", "BLUE STAR", "IDEA", "3M INDIA LTD", "BHARAT PET.", "PRESTIGE", "INTELLECT", "SYMPHONY", "TECH MAH", "ADANI POWER", "VIP INDUS.", "TAKE SOLUT", "NAV BHAR VEN", "TVS MOTOR L", "COAL INDIA", "TATAGLOBAL", "ENGINRS.(I)", "JK TYRE IND", "ASTRAL POLY", "RBLBANK", "HDFCLIFE", "CGPOWER", "BAJAJ AUTO", "DMART", "MOTILALOFS", "SHOPPER STOP", "TTK PRESTIGE", "ATUL LTD.", "BAJAJ FINSE", "HATSUN AGRO", "ADANITRANS", "BOM.BURMAH", "TATA MOTORS", "INDRA GAS", "UFLEX LTD", "VATECH WABA", "VBL", "ABCAPITAL", "CENTURYPLY", "INDOSTAR", "SUDARSH.CHEM", "TATA COMM", "SUNTECK REAL", "CENTRUM CAP", "POWER FINAN", "REL COM LTD", "RELAXO FOOT.", "IDFC", "WIPRO LTD.", "D B CORP", "ELGI EQUIP", "JSWSL", "NTPC LTD", "BHARTI ARTL", "GMR INFRASTR", "PARAGMILK", "NTWK MED INV", "SYNGENE", "RNAM", "STER TECH", "ADANI ENTER", "APOLLO HOSP.", "SREIINFRA", "ASAHI IN.GLA", "BALRAM.CHINI", "LINDEINDIA", "SUNDAR.CLAY", "VEDL", "THOMAS COOK", "CERA SANITAR", "LARSEN & TOU", "DCBBANK", "TCS LTD.", "BRITANIA IND", "JSLHISAR", "SJVN", "ISEC", "ITI LTD.", "GUJ PETRONET", "SPICEJET LTD", "BHEL", "APL APOLLO", "JM FINANCIAL", "SONATA SOFT.", "BH INFRATEL", "MPHASIS LTD", "HOUSING DEV", "HEIDEL CEM", "CARERATING", "DIVI'S LAB.", "TIINDIA", "M&M FINANSER", "BOSCH LTD", "DCMSHRIRAM", "VAKRANGEE", "HEXAWARE TEC", "KALPA.POWER", "DBL", "CANFIN HOMES", "HIMACHAL FUT", "REL.CAP.", "JUBILANT", "HUDCO", "CCL PRODUCTS", "GODFREY PH", "WELCORP", "HAVELLSINDIA", "RPOWER", "FINOLEX IND.", "FEDERAL BANK", "CITY UNION", "JINDAL STE", "SOUTH IND.BK", "HIND.PETRO", "NESTLE (I)", "RAIN", "GUJ.MINERAL", "DEWAN H.FIN.", "VOLTAS LTD", "JUSTDIAL", "OMAXE LTD", "TRENT LTD.", "LTI", "CEAT LIMITED", "STEEL AUTH.", "TATA INVEST.", "TATA CHEM.", "TIME TECHNO", "CHOLAFIN", "AUBANK", "NILKAMAL L", "LUPIN LTD.", "BANKBARODA", "CANARA BANK", "MINDA IND", "WELSPUN (I)", "GREAT EASTE", "GLENMARK PHA", "BIOCON LTD.", "AKZOINDIA", "GUJ.ALKALI", "AEGIS LOGIS", "MUTHOOT FIN", "NAT.ALUM", "KRBL LTD.", "DEEPAK NITR.", "PFIZER LTD.", "SHREE CEMENT", "GRANULES (I)", "KARNATAKA BK", "BIRLA CORPN.", "IBULISL", "PVR LTD", "CADILA HEALT", "BAJAJCON", "REL INFRA", "MONSANTO (I)", "ACC LTD", "BANDHANBNK", "LUXIND", "COCHINSHIP", "MAH HOLIDAY", "SHIPP.CORPN", "INOXWIND", "EDELWEISS"]

458 are Company names and these headers are printing 458 times because there are 458 rows in your CSV file.
Run this:
Hey, try to run this query:
LOAD CSV WITH HEADERS FROM 'file:///stocks.csv' as rows
WITH rows,keys(rows) AS header
WITH rows,header,range(1,size(header)-1) AS len
UNWIND len as l
MATCH (s:Company {label:rows.SC_NAME})
Return s
this will print 458 company names

Something weird happened. So If I look at the graph, its 458 nodes, however if I look at the table, its 209764 records.
I don't know how things happen here, but is it okay if I send you the stocks.csv file? I think then you'll be able to understand what is happening better.

Yeah, share CSV file

Thank you so much in advance for helping me out.

Hey, did you find anything?

Hey @nilanjandebnath13, run these two queries :

LOAD CSV WITH HEADERS FROM 'https://drive.google.com/uc?export=view&id=1Ue83ol0M2upK0_tqASzHzv5c6uXgFoZp' as rows 
merge (c:Company {name:rows['SC_NAME']})
return c

This will create nodes of companies


CALL apoc.periodic.iterate('LOAD CSV WITH HEADERS FROM "https://drive.google.com/uc?export=view&id=1Ue83ol0M2upK0_tqASzHzv5c6uXgFoZp" AS rows RETURN rows',
'WITH rows,keys(rows) AS header
WITH rows,header,range(1,size(header)-1) AS len
UNWIND len AS l
MATCH (c1:Company {name:rows["SC_NAME"]}),(c2:Company {name:header[l]})
MERGE (c1)-[:Value {value:rows[header[l]]}]->(c2)',
{batchSize:10000, parallel:false}
)

This will take time but it will create all the required relations

This worked! I had to install apoc. But thanks a lot. You saved my life.

Can you please tell me what the 2nd query means...

Refer this:
https://neo4j.com/docs/labs/apoc/current/graph-updates/periodic-execution/

Hello @nilanjandebnath13
My name is wa ode zalmawati, I am an 8th semester student at a polytechnic in Indonesia. I'm working on my thesis, and I'm having some confusion. The topic of my thesis is the same as your discussion topic, which is to calculate the correlation value between variables using neo4j, then look for which variable has the most high correlation value and then visualize it, all using neo4j. Honestly, I'm still new to neo4j, what I mean is, would you please give me your email for discussion with me to guide me in completing my thesis? :pray:
I hope you will be willing to discuss with me :pray:

Thank you, I hope we are all healthy
Greetings Zalma