Neo4J Error while importing example database Musicbrainz


(Wunderbardan) #1

I am trying to import the MusicBrainz database from the example datasets at https://neo4j.com/developer/example-data/. I downloaded the 4.5 GB archive and unzipped the whole database into a new graph.db folder. When I try to start the database after successfully opening it, I get the following error:

java.lang.RuntimeException: Error starting org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory, /home/neo4j/data/databases/graph.db
	at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.initFacade(GraphDatabaseFacadeFactory.java:212)
	at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.newFacade(GraphDatabaseFacadeFactory.java:125)
	at org.neo4j.graphdb.factory.GraphDatabaseFactory.newDatabase(GraphDatabaseFactory.java:137)
	at org.neo4j.graphdb.factory.GraphDatabaseFactory.newEmbeddedDatabase(GraphDatabaseFactory.java:130)
	at org.neo4j.graphdb.factory.GraphDatabaseFactory$1.newDatabase(GraphDatabaseFactory.java:107)
	at org.neo4j.graphdb.factory.GraphDatabaseBuilder.newGraphDatabase(GraphDatabaseBuilder.java:199)
	at org.neo4j.shell.kernel.GraphDatabaseShellServer.instantiateGraphDb(GraphDatabaseShellServer.java:228)
	at org.neo4j.shell.kernel.GraphDatabaseShellServer.<init>(GraphDatabaseShellServer.java:75)
	at org.neo4j.shell.StartClient.getGraphDatabaseShellServer(StartClient.java:311)
	at org.neo4j.shell.StartClient.tryStartLocalServerAndClient(StartClient.java:294)
	at org.neo4j.shell.StartClient.startLocal(StartClient.java:282)
	at org.neo4j.shell.StartClient.start(StartClient.java:213)
	at org.neo4j.shell.StartClient.main(StartClient.java:147)
Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.kernel.impl.index.labelscan.NativeLabelScanStore@549949be' was successfully initialized, but failed to start. Please see the attached cause exception "The node label update contained unsorted label ids [17, 9]".
	at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:466)
	at org.neo4j.kernel.lifecycle.LifeSupport.bringToState(LifeSupport.java:315)
	at org.neo4j.kernel.lifecycle.LifeSupport.add(LifeSupport.java:248)
	at org.neo4j.kernel.lifecycle.Lifespan.add(Lifespan.java:41)
	at org.neo4j.kernel.impl.storemigration.participant.NativeLabelScanStoreMigrator.migrate(NativeLabelScanStoreMigrator.java:83)
	at org.neo4j.kernel.impl.storemigration.StoreUpgrader.migrateToIsolatedDirectory(StoreUpgrader.java:224)
	at org.neo4j.kernel.impl.storemigration.StoreUpgrader.migrateStore(StoreUpgrader.java:144)
	at org.neo4j.kernel.impl.storemigration.StoreUpgrader.migrateIfNeeded(StoreUpgrader.java:122)
	at org.neo4j.kernel.impl.storemigration.DatabaseMigrator.migrate(DatabaseMigrator.java:100)
	at org.neo4j.kernel.NeoStoreDataSource.upgradeStore(NeoStoreDataSource.java:564)
	at org.neo4j.kernel.NeoStoreDataSource.start(NeoStoreDataSource.java:419)
	at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:445)
	at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:107)
	at org.neo4j.kernel.impl.transaction.state.DataSourceManager.start(DataSourceManager.java:100)
	at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:445)
	at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:107)
	at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.initFacade(GraphDatabaseFacadeFactory.java:208)
	... 12 more
Caused by: java.lang.IllegalArgumentException: The node label update contained unsorted label ids [17, 9]
	at org.neo4j.kernel.impl.index.labelscan.NativeLabelScanWriter.extractChange(NativeLabelScanWriter.java:210)
	at org.neo4j.kernel.impl.index.labelscan.NativeLabelScanWriter.flushPendingChanges(NativeLabelScanWriter.java:175)
	at org.neo4j.kernel.impl.index.labelscan.NativeLabelScanWriter.write(NativeLabelScanWriter.java:144)
	at org.neo4j.kernel.impl.api.scan.FullLabelStream.visit(FullLabelStream.java:61)
	at org.neo4j.kernel.impl.storemigration.participant.NativeLabelScanStoreMigrator$MonitoredFullLabelStream.visit(NativeLabelScanStoreMigrator.java:178)
	at org.neo4j.kernel.impl.storemigration.participant.NativeLabelScanStoreMigrator$MonitoredFullLabelStream.visit(NativeLabelScanStoreMigrator.java:164)
	at org.neo4j.kernel.impl.transaction.state.storeview.StoreViewNodeStoreScan.process(StoreViewNodeStoreScan.java:90)
	at org.neo4j.kernel.impl.transaction.state.storeview.NodeStoreScan.run(NodeStoreScan.java:74)
	at org.neo4j.kernel.impl.api.scan.FullLabelStream.applyTo(FullLabelStream.java:54)
	at org.neo4j.kernel.impl.index.labelscan.NativeLabelScanStore.start(NativeLabelScanStore.java:420)
	at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:445)
	... 28 more
	Suppressed: java.lang.IllegalArgumentException: The node label update contained unsorted label ids [17, 9]
		at org.neo4j.kernel.impl.index.labelscan.NativeLabelScanWriter.extractChange(NativeLabelScanWriter.java:210)
		at org.neo4j.kernel.impl.index.labelscan.NativeLabelScanWriter.flushPendingChanges(NativeLabelScanWriter.java:175)
		at org.neo4j.kernel.impl.index.labelscan.NativeLabelScanWriter.close(NativeLabelScanWriter.java:270)
		at org.neo4j.kernel.impl.index.labelscan.NativeLabelScanStore.start(NativeLabelScanStore.java:421)
		... 29 more

My Neo4j version is 3.4.5.
Thank you!


(Michael Hunger) #2

That dataset is for an older version, I'll upload one for 3.4.x.

Here you go: example-data.neo4j.org/3.4-datasets/musicbrainz.tgz


(Wunderbardan) #3

The new dataset works, thank you. I then ran call apoc.meta.graph(). I have now waited 443 minutes and the procedure is still running. My PC has 8 GB of RAM and a Core i7. I am not sure how long I should wait. Is there any tool to speed up APOC procedures on large datasets?
Thank you!


(Michael Hunger) #4

It should finish in seconds. Can you try call db.schema() meanwhile?


(Wunderbardan) #5

I monitored my memory. With call apoc.meta.graph(), memory usage rises to 50% and CPU usage goes above 100%.

With call db.schema(), the graph is displayed after a few seconds, and it reports over 13000 relationships. The graph is then updated incrementally. CPU usage climbs from over 100% to over 400%, and it looks like another endless procedure.

With call apoc.meta.data(), the process finished in 3 minutes and displayed the first 1000 rows. While it ran: CPU 10%~20%, memory over 40%. After the result was displayed, CPU usage was relatively low and memory stayed constant at 40.2%. After exporting the data to a CSV file, I saw that there are over 2000 rows.

Maybe visualizing over 10000 relationships is too demanding. Now I would like to understand the principle behind sampling datasets, as it would be useful for my research.
Thank you!
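
On the question of how sampling works in principle: the sketch below is only a toy illustration in Python, not the actual APOC or Neo4j implementation. It assumes a small in-memory graph (the labels Artist, Track, Album and the relationship types are made up for the example) and a hypothetical helper sample_meta_graph. The idea is to pick a random subset of nodes, look only at their outgoing relationships, and record which (start label, relationship type, end label) patterns occur. A small sample is cheaper to scan but may miss rare patterns.

```python
import random
from collections import Counter

# Toy graph, invented for illustration: node id -> set of labels.
NODES = {
    1: {"Artist"},
    2: {"Artist"},
    3: {"Track"},
    4: {"Track"},
    5: {"Album"},
}

# Relationships as (start node id, relationship type, end node id).
RELS = [
    (1, "PERFORMED", 3),
    (2, "PERFORMED", 4),
    (3, "APPEARS_ON", 5),
    (4, "APPEARS_ON", 5),
]


def sample_meta_graph(nodes, relationships, sample_size, seed=0):
    """Estimate the label-level schema from a random sample of nodes.

    Returns a Counter keyed by (start label, relationship type, end label),
    counting only relationships whose start node was sampled.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    node_ids = sorted(nodes)
    sampled = set(rng.sample(node_ids, min(sample_size, len(node_ids))))

    meta = Counter()
    for start, rel_type, end in relationships:
        if start not in sampled:
            continue  # skip relationships that start outside the sample
        for start_label in nodes[start]:
            for end_label in nodes[end]:
                meta[(start_label, rel_type, end_label)] += 1
    return meta


# Sampling every node reproduces the full schema of the toy graph.
full = sample_meta_graph(NODES, RELS, sample_size=len(NODES))
# Sampling only two nodes finds a subset of those patterns.
partial = sample_meta_graph(NODES, RELS, sample_size=2, seed=1)
print(sorted(full))
print(sorted(partial))
```

In this sketch the sample size is the trade-off knob: patterns found in the sample always exist in the full graph, but the reverse does not hold, so the counts are estimates rather than exact totals.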