Neo4j ETL Mapping not working - using remote DB2 database on IBMi

I posted this on the GitHub issues for the tool but haven't got any responses, so figured it wouldn't hurt to post it here as well.

I am running the Neo4j Desktop v1.2.1 and Neo4j ETL Tool v1.4.2 on a Windows 10 machine. I have a remote instance of Neo4j running on a Linux machine (Neo4j v3.5.6) and I have a dB2 database running on an IBM i machine.

I am using a JDBC to connect to the dB2 database and 'Test Connection' results in success (though it does appear to register my username as the schema, even though I am specifying a default schema that is not my username) I am trying to use the ETL tool to map metadata and though the connection appears to be successful, the output (in the logs) is empty. I have attached a picture of my JDBC setup and the resulting logs (when I hit 'Start Mapping'). Any help or tips or questions for more information would be appreciated!

COMMAND: java -cp "C:\Users\User\.Neo4jDesktop\graphApps\_global\neo4j-etl-ui/dist/neo4j-etl.jar;C:\Program Files\IBM\SDPShared\plugins\com.ibm.datatools.db2_2.2.201.v20151119_1817\driver\db2jcc_license_cisuz.jar;C:\Program Files\IBM\SDPShared\plugins\com.ibm.datatools.db2_2.2.201.v20151119_1817\driver\db2jcc4.jar" org.neo4j.etl.NeoIntegrationCli generate-metadata-mapping --rdbms:url "jdbc:db2://192.168.1.3:446/S1031056" --rdbms:password "password" --rdbms:user "user" --schema "LSCPRDLIB" --output-mapping-file "C:\Users\USER\AppData\Local\Temp/db2_S1031056_LSCPRDLIB_mapping.json"
- Skipping reading import options from file because file [] doesn't exist.
- Creating RDBMS to CSV mappings...
Available schema: #CGULIB
.....(*ABRIDGED FOR BREVITY)
Available schema: ZMARIADB
- Using database plugin for 
- Crawling schemas
- Retrieving all schemas
- Retrieving all catalogs
- Processed 597 rows for <retrieveAllSchemas>
- Including schema LSCPRDLIB
- Retrieved 1 schemas
- Including schema LSCPRDLIB
- Total time taken for <crawlSchemas> - 00:00:00.063 hours
- 98.4% - 00:00:00.062 - <retrieveSchemas>
-  1.6% - 00:00:00.001 - <sortAndFilterSchemas>

- Retrieving database information
- Not retrieving additional database information, since this was not requested
- Retrieving JDBC driver information
- Not retrieving additional JDBC driver information, since this was not requested
- Retrieving SchemaCrawler crawl information
- Total time taken for <crawlDatabaseInfo> - 00:00:00.011 hours
-  0.0% - 00:00:00.000 - <retrieveDatabaseInfo>
-  0.0% - 00:00:00.000 - <retrieveAdditionalDatabaseInfo>
-  0.0% - 00:00:00.000 - <retrieveJdbcDriverInfo>
-  0.0% - 00:00:00.000 - <retrieveAdditionalJdbcDriverInfo>
-100.0% - 00:00:00.011 - <retrieveCrawlHeaderInfo>

- Crawling column data types
- Retrieving system column data types
- Not retrieving user column data types, since this was not requested
- Total time taken for <crawlColumnDataTypes> - 00:00:00.024 hours
-100.0% - 00:00:00.024 - <retrieveSystemColumnDataTypes>
-  0.0% - 00:00:00.000 - <retrieveUserDefinedColumnDataTypes>

- Crawling tables
- Retrieving tables
- Retrieving tables for schema <LSCPRDLIB>
- Processed 3704 rows for <retrieveTables>
- Retrieved 0 tables
- Crawling routines
- Retrieved 0 routines
- Not retrieving synonyms, since this was not requested
- Not retrieving sequences, since this was not requested

OUTPUT:
[ ]

@MarkESmith can you try with this jar

In order to install it provide the path where you downloaded with the file:// prefix

I look forward to your feedback.
Thanks a lot.
Andrea

@conker84 I tried with the file you suggested, deleted the installed instance of Neo4j ETL first.

Unfortunately, appears that the problem persists. The following are the last lines of the logs that it generates when I attempt to do the mapping:

- Crawling tables
- Retrieving tables
- Retrieving tables for schema <NEOETLTST>
- Processed 2 rows for <retrieveTables>
- Retrieved 0 tables
- Crawling routines
- Retrieved 0 routines
- Not retrieving synonyms, since this was not requested
- Not retrieving sequences, since this was not requested

OUTPUT:
[ ]

The schema specified (NEOETLTST) only has two tables but for some reason, the mapping procedure doesn't seem to progress past the point indicated in the logs.

One thing I noticed is that in your screenshot, you appear to be running the 1.4.0 version of the ETL tool and the one that is linked is the 1.4.2 version - is there any chance that an older version of this ETL tool does not have these problems?

@MarkESmith can you try to remove the schema field from the connection?

@conker84

Removed the schema from the connection - while the mapping did take longer and it did appears to crawl through all the schemas, there was still no output:

- Retrieving tables for schema <XMLSAMPLES>
- Processed 3 rows for <retrieveTables>
- Retrieving tables for schema <YAJL>
- Processed 4 rows for <retrieveTables>
- Retrieving tables for schema <ZC0141334>
- Processed 36 rows for <retrieveTables>
- Retrieving tables for schema <ZENDPHP7>
- Processed 15 rows for <retrieveTables>
- Retrieving tables for schema <ZENDSVR>
- Processed 93 rows for <retrieveTables>
- Retrieving tables for schema <ZENDSVR6>
- Processed 17 rows for <retrieveTables>
- Retrieving tables for schema <ZMARIADB>
- Processed 0 rows for <retrieveTables>
- Retrieved 0 tables
- Crawling routines
- Retrieved 0 routines
- Not retrieving synonyms, since this was not requested
- Not retrieving sequences, since this was not requested

OUTPUT:
[ ]

That's odd I was able to export the sample db from IBM, can you share a dump of your dataset?

You were able to get the mapping to work when targeting Db2 on an IBM i?

What exactly do you mean by a dump of my dataset? You want to see what the records/file structure looks like for a specific schema?

Sorry I mean that I was able to successfully map the sample dataset provided by IBM,
following the compose file that I used:

services:

  db:
    image: ibmcom/db2
    restart: always
    privileged: true
    ports:
      - 50000:50000
      - 55001:55000
    environment:
      LICENSE: accept
      DB2INST1_PASSWORD: neo4j
      SAMPLEDB: 'true'
    volumes: 
      - ./data:/database

For the dump I mean that I want your dataset to better understand what is the problem in your env.
If you can share please send it to my email address andrea.santurbano[at]larus-ba.it
Thanks a lot

@conker84 really appreciate the help! Is it okay if I send you a dump of a specific schema that I'm trying to target (NEOETLTST, in this case - has two tables with a PK/FK relation that I created for demonstrative purposes)? This is our production system, so some of the information is confidential in addition to the fact that in totality, the database size is gargantuan - much greater than any e-mail quota that I'm aware of.

Sure please, send it to andrea.santurbano[at]larus-ba.it

@MarkESmith please look at the comment above + another question are you using the driver JDBC downloaded from here:
https://www.ibm.com/support/pages/db2-jdbc-driver-versions-and-downloads
Or another one?

@conker84 I have tried two JDBC drivers - one of which was the latest from the link you provided and another was a JDBC driver that came packaged with IBM's Rational Developer for i.

@MarkESmith which DB2 version are you using?

@conker84 because it's a Db2 running on an IBM i, Db2 is part of the base operating system - so the versions, as far as I can tell and have read, are equivalent. Our version of IBM i is V7R3 so the version of Db2 would be V7.