1.2. Serialization

Serialization and deserilization mechanisms were implemented to enable persistent storage and thus reusability of pulse template definitions. Currently, the serialization format is a plain text document containing JSON formatted data. [1]

Serialization was constructed in a way that allows that a given pulse template may refer to subtemplates which are used by several different parent templates (or more than once in one) such as, e.g., the measurement pulse. These should then be stored in a separate file and referenced by a unique identifier in all parent templates to avoid unnecessary redundancy. On the other hand, there might be subtemplates which are only relevant to their parent and thus should be embedded in its serialization to avoid creating a multitude of files that are meaningless to a user. To allow the serialization process to make this distinction, each pulse template (or other serializable object) provides an optional identifier (which can be set by the user via the constructor for all pulse template variants). If an identifier is present in a pulse template, it is stored in a separate file. If not, it is embedded in its parent’s serialization.

The implementation of (de)serialization features Serializer class and a Serializable interface. Every class that implements the latter can be serialized and thus stored as a JSON file. Currently, this is the case for all PulseTemplate variants as well as the ParameterDeclaration class. Additionally, the StorageBackend interface abstracts the actual storage backend. While currently there only exists a single implementation of this interface, namely the FileSystemStorageBackend, this allows to support, e.g., database storage, in the future.

The essential methods of Serializer are Serializer.serialize() and Serializer.deserialize(). Serializer.serialize() serializes a serializable object (i.e., any object of a class that implements/derives from Serializable) in a recursive process: It invokes the Serializable.get_serialization_data() method of the provided serializable object, which in turn might invoke the Sequencer to obtain a serialization for complex embedded data (such as a ParameterDeclaration in a TablePulseTemplate). In the end, a dictionary representation of the object is obtained which is then converted into a JSON string using built-in Python functionality. The JSON representation is finally stored using a given StorageBackend. Deserialization (Serializer.deserialize()) works analogously and is thus not explained in detail here.

For an example of how to use serialization to store and load pulse templates, see examples/03Serialization.ipynb in the examples section.

1.2.1. Implementing a Serializable Class

To make any new class serializable, it must derive from the Serializable and implement the methods Serializable.get_serialization_data(), Serializable.deserialize() and the Serializable.identifier property.

If class objects should be stored in a separate file, the identifier must be a non-empty string. If, on the other hand, class objects should be embedded into their parent’s serialization (as is the case for, e.g., ParameterDeclaration), identifier must be None.

The method serialize should return a dictionary of native Python types containing all relevant data. If the class has members that are not native Python types but must be serialized, they must be serializable and the serialize method can obtain their serialization as the return value of Serializer.dictify() and embed it in its result. The dictionary returned by serialize should not include the identifier in the returned dictionary.

The method deserialize is invoked with all key-value pairs created by a call to serialize as keyword arguments as well as an additional identifier keyword argument (which may be None) and must return a valid corresponding class instance.

The following may serve as a simple example:

from qctoolkit.serialization import Serializable, Serializer
from qctoolkit.pulses import PulseTemplate
from typing import Any, Dict, Optional

class Foo(Serializable):

    def __init__(self, template: PulseTemplate, identifier: Optional[str]=None) -> None:
        self.__template = template
        self.__identifier = identifier

    @property
    def identifier(self) -> Optional[str]:
        return self.__identifier

    def get_serialization_data(self, serializer: Serializer) -> Dict[str, Any]:
        return dict(template=serializer.dictify(self.__template))

    @staticmethod
    def deserialize(serializer: Serializer, template: Dict[str, Any], identifer: Optional[str]=None) -> Serializable:
        return Foo(serialzer.deserialize(template), identifier=identifier)

Footnotes

[1]After some discussion of the format in which to store the data, JSON files were the favored solution. The main competitor were relational SQL databases, which could provide a central, globally accessible pulse database. However, since pulses are often changed between experiments, a more flexible solution that can be maintained by users without database experience and also allows changes only in a local environment was desired. Storing pulse templates in files was the obvious solution to this. This greatest-simplicity-requirement was also imposed on the data format, which thus resulted in JSON being chosen over XML or other similar formats. An additional favorable argument for JSON is the fact that Python already provides methods that convert dictionaries containing only native python types into valid JSON and back.