Convert stream of records to JSON in Python driver

I am trying to convert the result of a query to JSON using the Neo4j Python driver. However, there are serialization issues with records.

What is the correct way to convert a stream of records to JSON?

I had a look at a similar question, but I don't seem to understand it completely.

This post is inspired by my Stack Overflow question.

Thank you.

Try using record.data(), which gives you a dictionary, and serialize that.

https://neo4j.com/docs/api/python-driver/current/api.html#neo4j.Record.data

Something like:

import json
json.dumps([r.data() for r in records])
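
A slightly fuller sketch of that idea, so it can be tried without a database. It assumes each record exposes data() as in the linked docs. `StubRecord` and `records_to_json` are hypothetical names, and `default=str` is my own assumption for values (dates, for example) that the json module cannot encode by itself.

```python
import json
from datetime import date


def records_to_json(records):
    """Serialize an iterable of record-like objects to a JSON string.

    Each item only needs a data() method that returns a dict.
    default=str is a fallback (an assumption, not driver behaviour)
    for values that json cannot encode natively.
    """
    return json.dumps([r.data() for r in records], default=str)


class StubRecord:
    """Hypothetical stand-in for neo4j.Record so the sketch runs offline."""

    def __init__(self, fields):
        self._fields = dict(fields)

    def data(self):
        return dict(self._fields)


rows = [StubRecord({"title": "Person", "created": date(2021, 1, 1)})]
print(records_to_json(rows))
```

With the real driver you would pass the result (or a list of its records) instead of the stub list.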

Thank you @michael.hunger for your response.

I did it like this; however, it does not return the labels field that the browser tool includes in its response:

Browser tool query execution response

[
  {
    "keys": [
      "a",
      "type(r)",
      "b"
    ],
    "length": 3,
    "_fields": [
      {
        "identity": {
          "low": 43,
          "high": 0
        },
        "labels": [
          "TBox"
        ],
        "properties": {
          "identifier": "http://example.org/tbox/person",
          "ontology_level": "upper",
          "neo4jImportId": "76",
          "html_info": "",
          "namespace": "skm",
          "admin": "yes",
          "description": "Person agents are people.",
          "sing": "Person",
          "title": "Person",
          "pl": "People",
          "version": "v3.1"
        }
      },
      "subclass_of",
      {
        "identity": {
          "low": 25,
          "high": 0
        },
        "labels": [
          "TBox"
        ],
        "properties": {
          "identifier": "http://example.org/tbox/agent",
          "ontology_level": "upper",
          "neo4jImportId": "25",
          "html_info": "",
          "namespace": "prov",
          "admin": "yes",
          "description": "An agent is something that bears some form of ",
          "sing": "Agent",
          "pl": "Agents",
          "title": "Agent",
          "version": "v6.0"
        }
      }
    ],
    "_fieldLookup": {
      "a": 0,
      "type(r)": 1,
      "b": 2
    }
  }
]

Code is as follows

API way

    def execute(self, index, query):
        print('executing query: ', index)
        with self.driver.session() as session:
            result = session.write_transaction(generate_select_query_function(query))
        print(result)
        return self.serialize_response(result)


    def serialize_response(self, result):
        return {'result': [self.serialize_data_custom(index, record) for index, record in enumerate(result)]}

    """This is another workaround using API helper methods but it lacks an ID field which is very important. 
    
    The following function does not return the ID, only the properties of each node. We have to explicitly ask for
    ID(n) in plain Cypher if we need IDs.
    
    API reference -- https://neo4j.com/docs/api/python-driver/4.2/api.html#neo4j.Record.data
    """
    def serialize_data_api_way(self, index, record):
        return [record.data() for r in record]
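
A note on the list comprehension in serialize_data_api_way: iterating over a record yields its values (here a, type(r) and b), so record.data() is called once per field and the same dictionary comes back three times. That matches the three identical entries in the API output. A minimal sketch of the effect and of one possible fix follows; `StubRecord` is a hypothetical stand-in that only mimics iteration and data(), not the real driver class.

```python
class StubRecord:
    """Hypothetical stand-in: iterating yields the values, as with a driver record."""

    def __init__(self, fields):
        self._fields = dict(fields)

    def __iter__(self):
        return iter(self._fields.values())

    def data(self):
        return dict(self._fields)


def serialize_repeated(record):
    # Same shape as the original comprehension: one data() call per field.
    return [record.data() for _ in record]


def serialize_once(record):
    # Possible fix: a single data() call already covers every field.
    return record.data()


record = StubRecord({"a": {"title": "Person"}, "type(r)": "subclass_of", "b": {"title": "Agent"}})
print(len(serialize_repeated(record)))  # one copy of the dictionary per field
print(serialize_once(record))
```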

API output (label names are missing):

{"result": [[{"a": {"admin": "yes", "comment": "", "description": "Person agents are people.", "html_info": "", "identifier": "http://example.org/tbox/person", "namespace": "skm", "ontology_level": "upper", "pl": "People", "sing": "Person", "title": "Person", "unique": "", "uri": "", "url": "", "version": "v3.1", "xsd_type": ""}, "type(r)": "subclass_of", "b": {"admin": "yes", "comment": "", "description": "An agent is something that bears some form of ", "html_info": "", "identifier": "http://example.org/tbox/agent", "namespace": "prov", "ontology_level": "upper", "pl": "Agents", "sing": "Agent", "title": "Agent", "unique": "", "uri": "", "url": "", "version": "v6.0", "xsd_type": ""}}, {"a": {"admin": "yes", "comment": "", "description": "Person agents are people.", "html_info": "", "identifier": "http://example.org/tbox/person", "namespace": "skm", "ontology_level": "upper", "pl": "People", "sing": "Person", "title": "Person", "unique": "", "uri": "", "url": "", "version": "v3.1", "xsd_type": ""}, "type(r)": "subclass_of", "b": {"admin": "yes", "comment": "", "description": "An agent is something that bears some form of ", "html_info": "", "identifier": "http://example.org/tbox/agent", "namespace": "prov", "ontology_level": "upper", "pl": "Agents", "sing": "Agent", "title": "Agent", "unique": "", "uri": "", "url": "", "version": "v6.0", "xsd_type": ""}}, {"a": {"admin": "yes", "comment": "", "description": "Person agents are people.", "html_info": "", "identifier": "http://example.org/tbox/person", "namespace": "skm", "ontology_level": "upper", "pl": "People", "sing": "Person", "title": "Person", "unique": "", "uri": "", "url": "", "version": "v3.1", "xsd_type": ""}, "type(r)": "subclass_of", "b": {"admin": "yes", "comment": "", "description": "An agent is something that bears some form of ", "html_info": "", "identifier": "http://example.org/tbox/agent", "namespace": "prov", "ontology_level": "upper", "pl": "Agents", "sing": "Agent", "title": "Agent", "unique": 
"", "uri": "", "url": "", "version": "v6.0", "xsd_type": ""}}]]}

However, with a custom serializer implementation, I do get the label names, as shown below:

    def serialize_data_custom(self, index, record):
        """
        A custom serializer.

        Keyword arguments:
        index -- optional
        record -- required

        Record class documentation - https://neo4j.com/docs/api/python-driver/4.2/api.html#record
        """
        print('record ', index, ':', record)  # console print statement
        # Create an empty dictionary
        graph_data_type_list = {}
        # Iterate over the values of the record, enumerating them.
        for j, graph_data_type in enumerate(record):
            # Check if the record has string or integer literal.
            if isinstance(graph_data_type, str) or isinstance(graph_data_type, int):
                # Return the keys and values of this record as a dictionary and store it inside graph_data_type_dict.
                graph_data_type_dict = record.data(j)
            else:
                # If the record fails the above check then manually convert them into dictionary with __dict__
                graph_data_type_dict = graph_data_type.__dict__
                # Remove unnecessary _graph as we do not need it to serialize from the record.
                if '_graph' in graph_data_type_dict:
                    del graph_data_type_dict['_graph']
                # Add a _start_node key from the record.
                if '_start_node' in graph_data_type_dict:
                    graph_data_type_dict['_start_node'] = graph_data_type_dict['_start_node'].__dict__
                    # Add a _labels key of start node from the record.
                    if '_labels' in graph_data_type_dict['_start_node']:
                        frozen_label_set = graph_data_type.start_node['_labels']
                        graph_data_type_dict['_start_node']['_labels'] = [v for v in frozen_label_set]
                    # Remove unnecessary _graph as we do not need it to serialize from the record.
                    if '_graph' in graph_data_type_dict['_start_node']:
                        del graph_data_type_dict['_start_node']['_graph']
                # Add an _end_node key from the record.
                if '_end_node' in graph_data_type_dict:
                    graph_data_type_dict['_end_node'] = graph_data_type_dict['_end_node'].__dict__
                    # Add a _labels key of the end node from the record.
                    if '_labels' in graph_data_type_dict['_end_node']:
                        frozen_label_set = graph_data_type_dict['_end_node']['_labels']
                        graph_data_type_dict['_end_node']['_labels'] = [v for v in frozen_label_set]
                    # Remove unnecessary _graph as we do not need it to serialize from the record.
                    if '_graph' in graph_data_type_dict['_end_node']:
                        del graph_data_type_dict['_end_node']['_graph']
                # Add other labels for representation from frozenset()
                if '_labels' in graph_data_type_dict:
                    frozen_label_set = graph_data_type_dict['_labels']
                    graph_data_type_dict['_labels'] = [v for v in frozen_label_set]
                # print(graph_data_type_dict) # test statement
            graph_data_type_list.update(graph_data_type_dict)

   
        return graph_data_type_list

Custom serializer output:

{"result": [{"_id": 25, "_properties": {"admin": "yes", "comment": "", "description": "An agent is something that bears some form of ", "html_info": "", "identifier": "http://example.org/tbox/agent", "namespace": "prov", "ontology_level": "upper", "pl": "Agents", "sing": "Agent", "title": "Agent", "unique": "", "uri": "", "url": "", "version": "v6.0", "xsd_type": ""}, "_labels": ["TBox"], "type(r)": "subclass_of"}]}

Is there a way to get a response like the Neo4j browser tool?
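
One possible route to a browser-like shape is to read the public attributes of each value instead of `__dict__`. The sketch below assumes the 4.x graph types, where a node exposes `id`, `labels` and `items()`, and a relationship additionally exposes `type`, `start_node` and `end_node`. It detects them by duck typing so that it runs without the driver; `StubNode`, `serialize_value` and `serialize_record` are hypothetical names. Keying each field by its record key also avoids the overwrite visible in the custom serializer output, where update() merged a and b into one dictionary because both share the keys _id, _labels and _properties.

```python
import json


def serialize_value(value):
    """Turn a node- or relationship-like value into plain JSON-ready data."""
    if hasattr(value, "start_node") and hasattr(value, "type"):
        # Relationship-like: keep the type and both endpoints.
        return {
            "identity": value.id,
            "type": value.type,
            "start": serialize_value(value.start_node),
            "end": serialize_value(value.end_node),
            "properties": dict(value.items()),
        }
    if hasattr(value, "labels"):
        # Node-like: labels is a frozenset, so turn it into a sorted list.
        return {
            "identity": value.id,
            "labels": sorted(value.labels),
            "properties": dict(value.items()),
        }
    return value  # strings, numbers and other literals pass through unchanged


def serialize_record(record):
    """Key every field by its record key so equal attribute names never clash."""
    return {key: serialize_value(record[key]) for key in record.keys()}


class StubNode:
    """Hypothetical stand-in for a driver node (id, labels, items())."""

    def __init__(self, node_id, labels, properties):
        self.id = node_id
        self.labels = frozenset(labels)
        self._properties = dict(properties)

    def items(self):
        return self._properties.items()


# A plain dict stands in for the record: it also supports keys() and [key].
record = {
    "a": StubNode(43, ["TBox"], {"title": "Person"}),
    "type(r)": "subclass_of",
    "b": StubNode(25, ["TBox"], {"title": "Agent"}),
}
print(json.dumps(serialize_record(record), indent=2))
```

On 5.x drivers `id` is deprecated in favour of `element_id`, so the attribute name would need adjusting there.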

Did you find a solution here? Getting the records, or calling result.data(), will give you dictionaries; however, you then need to restructure them to get a usable graph again.

Maybe you can add your own property to the object that duplicates the label?

Did you see that there is also a result.graph(), which gives you graph objects back?

You would just have to serialize those yourself.
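
A rough sketch of what serializing those graph objects by hand could look like. It assumes the object returned by result.graph() exposes `nodes` and `relationships` collections whose items carry `id`, `labels`, `type`, `start_node`, `end_node` and `items()`. The `stub` helper, `graph_to_dict` and the sample values (including the relationship id) are hypothetical and exist only so the snippet runs without a database.

```python
import json
from types import SimpleNamespace


def graph_to_dict(graph):
    """Flatten a graph-like object into lists of nodes and relationships."""
    nodes = [
        {"id": n.id, "labels": sorted(n.labels), "properties": dict(n.items())}
        for n in graph.nodes
    ]
    relationships = [
        {
            "id": r.id,
            "type": r.type,
            "start": r.start_node.id,
            "end": r.end_node.id,
            "properties": dict(r.items()),
        }
        for r in graph.relationships
    ]
    return {"nodes": nodes, "relationships": relationships}


def stub(properties, **attrs):
    """Hypothetical stand-in for a driver node or relationship."""
    return SimpleNamespace(items=properties.items, **attrs)


person = stub({"title": "Person"}, id=43, labels=frozenset({"TBox"}))
agent = stub({"title": "Agent"}, id=25, labels=frozenset({"TBox"}))
rel = stub({}, id=7, type="subclass_of", start_node=person, end_node=agent)
graph = SimpleNamespace(nodes=[person, agent], relationships=[rel])
print(json.dumps(graph_to_dict(graph), indent=2))
```

Referring to the endpoints by id keeps each node in the output once, which is usually what a front-end graph library expects.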