Issues with dbms.memory.transaction.global_max_size

  1. I'm a backend engineer whose sole task is implementing a knowledge graph for SNOMED CT in Neo4j. We have a Neo4j Aura instance (the cheapest one: 1 GB RAM, 2 GB storage). When running the upload of these very large CSV files using pyingest I get the following error:
 {message: The allocation of an extra 15.1 MiB would use more than the limit 100.0 MiB. Currently using 88.4 MiB. dbms.memory.transaction.global_max_size threshold reached}

Am I not paying for 1 GB? Am I doing something terribly stupid or wrong? I've read the graph databases book (the one with the octopus). What else should I read?

  1. [ 3:55 PM ]

I've fallen into this beautiful rabbit hole and find new insights each day, but I have to deliver and be practical.
I'd appreciate any practical guidance and book/learning resource recommendations.

  1. [ 3:56 PM ]

again, sorry if this is the wrong place to ask!

  1. [ 3:58 PM ]
server_uri: neo4j+s://id:7687
admin_user: neo4j
admin_pass: pass

files:
  # concepts
  - url: /home/gocandra/workspace/uma/deep-learning/research/graphs/snomed-loader/csv/Concept_Snapshot.csv
    compression: none
    skip_file: false
    chunk_size: 100
    cql: |
      WITH $dict.rows as rows UNWIND rows as row
        MERGE (c:Concept {conceptId:row.id,term:row.term,descType:row.descType})
        ON CREATE SET c.conceptId = row.id, c.term = row.term, c.descType = row.descType
        ON MATCH SET c.conceptId = row.id, c.term = row.term, c.descType = row.descType
  
  ## concept synonym generator
  - url: /home/gocandra/workspace/uma/deep-learning/research/graphs/snomed-loader/csv/Concept_Snapshot_add.csv
    compression: none
    skip_file: false
    chunk_size: 50
    cql: |
      WITH $dict.rows as rows UNWIND rows as row
        MATCH (dest:Concept) WHERE dest.conceptId = row.id 
        CREATE (c:Concept:Synonym{
          conceptId: row.id,
          term: row.term,
          descType: row.descType
          })-[r:IS_A {
            relId:'116680003',
            term:'Is a (attribute)',
            descType:'900000000000003001'
          }]->(dest);

  # relationships
  - url: /home/gocandra/workspace/uma/deep-learning/research/graphs/snomed-loader/csv/Concept_Snapshot_add.csv
    compression: none
    skip_file: false
    chunk_size: 50
    cql: |
      WITH $dict.rows as rows UNWIND rows as row
        MATCH (source:Concept) WHERE source.conceptId = row.sourceId
        MATCH (dest:Concept:FSA) WHERE dest.conceptId = row.destinationId
        CREATE (source)-[r:row.relLabel{relId: row.typeId, term: row.term, descType: row.descType}]->(dest)"
  1. [ 3:59 PM ]

That's the config.yml with all the queries (I'm chunking to try and avoid this issue).

  1. [ 4:00 PM ]
 {code: Neo.TransientError.General.MemoryPoolOutOfMemoryError} {message: The allocation of an extra 7.3 MiB would use more than the limit 100.0 MiB. Currently using 99.0 MiB. dbms.memory.transaction.global_max_size threshold reached}
  1. [ 4:01 PM ]

Now I get this error. I'm not running any other queries on the database, nor is anyone else (I'm the only one with credentials).

I noticed one thing with your query:

  # relationships
  - url: /home/gocandra/workspace/uma/deep-learning/research/graphs/snomed-loader/csv/Concept_Snapshot_add.csv
    compression: none
    skip_file: false
    chunk_size: 50
    cql: |
      WITH $dict.rows as rows UNWIND rows as row
        MATCH (source:Concept) WHERE source.conceptId = row.sourceId
        MATCH (dest:Concept:FSA) WHERE dest.conceptId = row.destinationId
        CREATE (source)-[r:row.relLabel{relId: row.typeId, term: row.term, descType: row.descType}]->(dest)"

The [r:row.relLabel] won't resolve. Pyingest uses regular Cypher parameter substitution, and Cypher parameters can't supply a relationship type. To create a relationship with a dynamic type, you need to use something like apoc.create.relationship.
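For example, a minimal sketch of that relationships query rewritten with apoc.create.relationship (assuming APOC core is available on your Aura instance and the same CSV columns as in your config):

WITH $dict.rows as rows UNWIND rows as row
  MATCH (source:Concept) WHERE source.conceptId = row.sourceId
  MATCH (dest:Concept:FSA) WHERE dest.conceptId = row.destinationId
  // apoc.create.relationship(startNode, typeString, propertyMap, endNode)
  // builds the relationship with a type taken from the row at runtime
  CALL apoc.create.relationship(
    source,
    row.relLabel,
    {relId: row.typeId, term: row.term, descType: row.descType},
    dest
  ) YIELD rel
  RETURN count(rel)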

You can try to reduce your chunk sizes.

It might also be good to merge on a single property (with a constraint) only -> here, as your id is row.id, the other fields should not be part of the MERGE but go into an ON CREATE SET ...

MERGE (c:Concept {conceptId:row.id}) ON CREATE SET ...
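For instance, a sketch of the unique constraint plus the adjusted concepts query (the constraint name is illustrative; the REQUIRE syntax is Neo4j 5, on 4.x it would be ASSERT ... IS UNIQUE instead). The constraint is run once up front, e.g. in Browser or in a pre_ingest step:

CREATE CONSTRAINT concept_id_unique IF NOT EXISTS
FOR (c:Concept) REQUIRE c.conceptId IS UNIQUE;

WITH $dict.rows as rows UNWIND rows as row
  // merge only on the constrained key, set the remaining fields separately
  MERGE (c:Concept {conceptId: row.id})
  ON CREATE SET c.term = row.term, c.descType = row.descType
  ON MATCH SET c.term = row.term, c.descType = row.descType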

Do you see which of the import queries causes the memory issue?
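If it's not obvious from the pyingest output, one way to see what's actually running against the instance while the load is going (SHOW TRANSACTIONS needs Neo4j 4.4+, which Aura should have):

SHOW TRANSACTIONS
  YIELD transactionId, currentQuery, status, elapsedTime
  RETURN transactionId, currentQuery, status, elapsedTime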

Sometimes AuraDB Free works better in terms of memory limits; give it a try, as it doesn't have to support a clustered environment.

Hey there! Have you solved this issue of yours?

If yes, could you let me know how you did that?

Did you allocate more memory in the configuration or make changes to the internal structure?