Neo.ClientError.Procedure.ProcedureCallFailed: Can't read url or key https://de.wikipedia.org/w/api.php?

Hey,

I am just starting with Neo4j and am trying to build a mathematics knowledge graph from Wikipedia. I am following the steps from the YouTube tutorial (Neo4j Online Meetup #7: Building the Wikipedia Knowledge Graph in Neo4j - YouTube).

I just finished the first level and set the $level parameter to 2. Then I tried to load the next level with exactly the same code as before and got the following error:

"Failed to invoke procedure apoc.load.json: Caused by: java.lang.RuntimeException: Can't read url or key https://de.wikipedia.org/w/api.php?format=json&action=query&list=categorymembers&cmtype=subcat&cmtitle=Category::Maßtheorie&cmprop=ids|title&cmlimit=500 as json: Server returned HTTP response code: 400 for URL: https://de.wikipedia.org/w/api.php?format=json&action=query&list=categorymembers&cmtype=subcat&cmtitle=Category::Maßtheorie&cmprop=ids|title&cmlimit=500"

Can someone please help me?

If one tries to connect to said URL using a browser, the response is

{"error":{"code":"invalidtitle","info":"Bad title \"Category::Ma\u00dftheorie\".","*":"See https://de.wikipedia.org/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce> for notice of API deprecations and breaking changes."},"servedby":"mw1410"}

So the error from APOC is expected.

Hey @dana.canzano,
do you think it is because of the double ":" in the link?
I tried the link

in the browser and didn't get the error there.
So in the Neo4j Sandbox, instead of

MATCH (c:Category { fetched: false, level: $level - 1}) 
call apoc.load.json('https://de.wikipedia.org/w/api.php?format=json&action=query&list=categorymembers&cmtype=subcat&cmtitle=Category:' + replace(c.catName,' ','%20') + '&cmprop=ids%7Ctitle&cmlimit=500')
YIELD value as results
UNWIND results.query.categorymembers AS subcat
MERGE (sc:Category {catID:subcat.pageid})
ON CREATE SET sc.catName=substring(subcat.title,9),
              sc.fetched = false,
              sc.level=$level
WITH sc,c
CALL apoc.create.addLabels(sc,['Level' + $level + 'Category']) YIELD node
MERGE (sc)-[:SUBCAT_OF]->(c)
WITH DISTINCT c
SET c.fetched=TRUE

I tried another link, deleting the ":" after Category:

MATCH (c:Category { fetched: false, level: $level - 1}) 
call apoc.load.json('https://de.wikipedia.org/w/api.php?format=json&action=query&list=categorymembers&cmtype=subcat&cmtitle=Category' + replace(c.catName,' ','%20') + '&cmprop=ids%7Ctitle&cmlimit=500')
YIELD value as results
UNWIND results.query.categorymembers AS subcat
MERGE (sc:Category {catID:subcat.pageid})
ON CREATE SET sc.catName=substring(subcat.title,9),
              sc.fetched = false,
              sc.level=$level
WITH sc,c
CALL apoc.create.addLabels(sc,['Level' + $level + 'Category']) YIELD node
MERGE (sc)-[:SUBCAT_OF]->(c)
WITH DISTINCT c
SET c.fetched=TRUE

But I got the same error:
"Failed to invoke procedure apoc.load.json: Caused by: java.lang.RuntimeException: Can't read url or key https://de.wikipedia.org/w/api.php?format=json&action=query&list=categorymembers&cmtype=subcat&cmtitle=Category:Maßtheorie&cmprop=ids|title&cmlimit=500 as json: Server returned HTTP response code: 400 for URL: https://de.wikipedia.org/w/api.php?format=json&action=query&list=categorymembers&cmtype=subcat&cmtitle=Category:Maßtheorie&cmprop=ids|title&cmlimit=500"

I don't understand what I am doing wrong, because the first level works absolutely fine.

I don't completely understand the API itself, but yes, I think there may be 2 issues.

  1. the :: after Category?
  2. the Maßtheorie needs to be URL-encoded (percent-encoded), as in the following (for example, see Forms action page)
https://de.wikipedia.org/w/api.php?format=json&action=query&list=categorymembers&cmtype=subcat&cmtitle=Category:Ma%C3%9Ftheorie&cmprop=ids%7Ctitle&cmlimit=50
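
Both points could perhaps be handled directly in Cypher, as in the minimal sketch below. It rests on assumptions the thread does not confirm: that the German API returns titles prefixed with "Kategorie:" (10 characters, so substring(subcat.title,9) would leave a leading ":" in catName, which would explain the "::"), and that the apoc.text.urlencode function is available in the APOC version in use. The two titles are made-up sample values, so the statement runs without calling the API.

```cypher
// Sample titles (hypothetical values) with the English and the assumed German prefix
UNWIND ['Category:Measure theory', 'Kategorie:Maßtheorie'] AS title
// Strip everything up to and including the first ':' instead of a fixed 9 characters
WITH title, substring(title, size(split(title, ':')[0]) + 1) AS catName
// Percent-encode the name so that characters such as 'ß' are safe in a URL
RETURN title, catName, apoc.text.urlencode(catName) AS encodedName
```

In the query from the thread, this would mean replacing substring(subcat.title,9) with the prefix-independent expression and replace(c.catName,' ','%20') with apoc.text.urlencode(c.catName). Note that this function encodes spaces as "+" rather than "%20", which should be acceptable inside a query string. Nodes that were already created with a leading ":" in catName would still need to be corrected or re-fetched.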

Hey @dana.canzano

I tried everything again from scratch, and it works fine with the English Wikipedia and just one ":". With the German Wikipedia I have to run it twice: first with "::", and after that again with just one ":", because then I get more subcategories (I don't know why).

But now I get a Neo.ClientError.Transaction.TransactionTimedOut error.

Neo.ClientError.Transaction.TransactionTimedOut
The transaction has been terminated. Retry your operation in a new transaction, and you should see a successful result. The transaction has not completed within the specified timeout (dbms.transaction.timeout). You may want to retry with a longer timeout. 

My idea for a solution is to set a longer timeout. I found the following setting in the Transaction Management documentation: dbms.transaction.timeout=10s.
Is this the correct command? I ask because I have trouble using it in the sandbox.

The Neo.ClientError.Statement.SyntaxError says this is invalid input and that it expects a clause such as "SET". So I put SET in front of it, but then I got another syntax error.

SET dbms.transaction.timeout=10s
Neo.ClientError.Statement.SyntaxError
Variable `dbms` not defined (line 1, column 5 (offset: 4))
"SET dbms.transaction.timeout=10s"
     ^

dbms.transaction.timeout=10s
Neo.ClientError.Statement.SyntaxError
Invalid input 'dbms': expected 
  "RETURN"
  "CREATE"
  "DELETE"
  "SET"
  "REMOVE"
  "DETACH"
  "MATCH"
  "WITH"
  "UNWIND"
  "USE"
  "CALL"
  "LOAD"
  "FROM"
  "FOREACH"
  "MERGE"
  "OPTIONAL"
  "USING" (line 1, column 1 (offset: 0))
"dbms.transaction.timeout=10s"
 ^

Do you know what I am doing wrong?

Thank you for helping me!

Did encoding the URL help, per my last update?
And I'm really confused as to why running the same command twice results in the first run failing and the second run, with the same command, then succeeding.

As to dbms.transaction.timeout: as you experienced, this is not configured via a SET command, and I'm not aware of SET being valid for configuration settings at all. dbms.transaction.timeout is typically defined in the conf/neo4j.conf file, or you can set it dynamically using call dbms.setConfigValue(); see Dynamic settings - Operations Manual, though this documentation speaks of Neo4j v4.2. It should be valid for all 4.x versions, but since there is no prior detail of which version of Neo4j is in play here, call dbms.setConfigValue(); may not be valid if you are pre 4.x.
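
For illustration, the dynamic variant might look like the sketch below. It assumes Neo4j 4.x Enterprise Edition and a user with admin rights, which a hosted Sandbox may not grant. The value '10m' is an arbitrary example, not a recommendation from the thread.

```cypher
// Assumption: Neo4j 4.x Enterprise, executed by a user with admin privileges.
// Raise the transaction timeout at runtime; '10m' is only an example value.
CALL dbms.setConfigValue('dbms.transaction.timeout', '10m');

// Check which value is currently in effect.
CALL dbms.listConfig('dbms.transaction.timeout')
YIELD name, value
RETURN name, value;
```

A value changed this way is not written to conf/neo4j.conf, so it would have to be set again after a restart. The line dbms.transaction.timeout=10s from the documentation is the neo4j.conf syntax, which is why it is rejected as a Cypher statement.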