Export Extension#

class metacatalog.ext.export.ExportExtension#

Export functions.

The default ExportExtension provides raw export functionality. The Entry has an export function that dispatches the requested format to an activated extension of that name, or falls back to this extension and calls the function of the same name. Raw export means that the given Entry is translated into a Python dictionary and then written to the given file format as natively as possible. This does not follow any specified standard or rule set. Standard metadata formats are implemented in separate extensions.

Currently, the following exports are supported:

  • JSON

  • XML (called fast_XML)

  • pickle

  • netCDF

Note

The XML export is currently done with another package (dicttoxml). This works well, but it is not possible to adjust the exported XML to standard requirements such as ISO 19115. Such an export will be implemented via lxml; the function name xml is therefore reserved for the base version of that export.

If no path is specified, the native Python object that will be used to create the file is returned. This can be very useful to pack more than one Entry together. With path=None, the export function will return the following objects:

  • JSON -> str: the JSON object as string

  • fast_XML -> str: the XML representation as (decoded UTF-8) string

  • pickle -> dict: the dict underlying all export functions

  • netCDF -> xarray: the xarray used to build the netCDF file

Note that the export of 'pickle' and 'netCDF' can be particularly useful without setting the path.
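As a rough illustration of packing more than one Entry together, the sketch below merges two dicts of the kind a path-less 'pickle' export might return and serializes them as one JSON document. The dicts, their contents, and the helper pack_entries are hypothetical stand-ins, not real export output or part of the extension.

```python
import json

# Hypothetical stand-ins for dicts returned by a path-less 'pickle' export.
entry_a = {"uuid": "a1", "title": "Air temperature", "version": 1}
entry_b = {"uuid": "b2", "title": "Soil moisture", "version": 3}


def pack_entries(entries):
    """Combine several exported dicts into one dict keyed by UUID."""
    return {e["uuid"]: e for e in entries}


packed = pack_entries([entry_a, entry_b])

# One JSON document that holds both entries.
packed_json = json.dumps(packed, indent=4)
```

Keying by UUID is only one possible layout; a plain list of dicts would work just as well.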

classmethod fast_xml(entry: Entry, path=None, no_data=False, **kwargs)#

Export an Entry to XML. If a path is given, a new file will be created. This is the fast XML version, which will convert the metadata to custom XML tags.

Parameters:
  • entry (metacatalog.models.Entry) – The entry instance to be exported

  • path (str) – If given, a file location for export.

  • no_data (bool) – If set to True, the actual data will not be loaded and included. This can be helpful if the data is not serializable or very large.

Returns:

out – The XML str if path is None, else None

Return type:

str

Notes

The content of the file will be created using an ImmutableResultSet. This will lazy-load sibling Entries and parent groups as needed for a useful metadata export. The list of exported properties is hardcoded into this extension, but can be overwritten. You can also import the list:

>>> from metacatalog.ext.export.extension import ENTRY_KEYS

An updated list can then be passed as kwargs:

>>> import metacatalog
>>> use_keys = [k for k in ENTRY_KEYS if not k.startswith('embargo')]
>>> Export = metacatalog.ext.extension('export')
>>> Export.fast_xml(entry, '/temp/metadata.xml', use_keys=use_keys)
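To give a feel for what converting metadata to custom XML tags means, here is a minimal sketch that uses only the standard library instead of dicttoxml, so it is a stand-in for the technique and not the extension's own code. The helper dict_to_xml and the sample metadata are hypothetical; lists and keys that are not valid tag names are not handled.

```python
import xml.etree.ElementTree as ET


def dict_to_xml(data, root_tag="root"):
    """Build an element whose child tags are named after the dict keys."""
    root = ET.Element(root_tag)
    for key, value in data.items():
        if isinstance(value, dict):
            # Nested dicts become nested elements.
            root.append(dict_to_xml(value, key))
        else:
            child = ET.SubElement(root, key)
            child.text = str(value)
    return root


# Hypothetical metadata, not real export output.
metadata = {"title": "Soil moisture", "license": {"short_title": "CC BY 4.0"}}

# encoding="unicode" returns a str instead of bytes.
xml_str = ET.tostring(dict_to_xml(metadata, "entry"), encoding="unicode")
```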
classmethod flat_keys(data: dict, prefix=False, delimiter: str = '.', **kwargs) dict#

Turn nested dictionaries into flat dictionaries by expanding nested keys using the given delimiter.
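The flattening idea can be sketched in a few lines. This is an illustration of the general technique, not the implementation behind flat_keys; the function flatten and the sample data are hypothetical, and the prefix argument of flat_keys is not modelled here.

```python
def flatten(data, delimiter=".", parent=""):
    """Recursively flatten nested dicts, joining keys with the delimiter."""
    flat = {}
    for key, value in data.items():
        full_key = f"{parent}{delimiter}{key}" if parent else str(key)
        if isinstance(value, dict):
            # Descend into nested dicts; empty dicts leave no key behind.
            flat.update(flatten(value, delimiter, full_key))
        else:
            flat[full_key] = value
    return flat


# Hypothetical nested metadata.
nested = {
    "title": "Soil moisture",
    "license": {"short_title": "CC BY 4.0", "share_alike": False},
}
flat = flatten(nested)
# flat now has the keys 'title', 'license.short_title' and 'license.share_alike'
```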

classmethod get_data(entry: Entry, serialize=True, **kwargs) dict#

Return the data as a UUID-indexed dict.

classmethod json(entry: Entry, path: Optional[str] = None, flat=False, indent: int = 4, no_data=False, **kwargs)#

Export an Entry to JSON. If a path is given, a new file will be created.

Parameters:
  • entry (metacatalog.models.Entry) – The entry instance to be exported

  • path (str) – If given, a file location for export.

  • flat (bool) – If True, the resulting JSON will be un-nested and formerly nested keys will be joined into keys like parent.child, where the delimiter defaults to ‘.’ but can be changed. Defaults to False.

  • indent (int) – The default indentation for the JSON file

  • no_data (bool) – If set to True, the actual data will not be loaded and included. This can be helpful if the data is not serializable or very large.

Returns:

out – The JSON string if path is None, else None

Return type:

str

Notes

The content of the file will be created using an ImmutableResultSet. This will lazy-load sibling Entries and parent groups as needed for a useful metadata export. The list of exported properties is hardcoded into this extension, but can be overwritten. You can also import the list:

>>> from metacatalog.ext.export.extension import ENTRY_KEYS

An updated list can then be passed as kwargs:

>>> import metacatalog
>>> use_keys = [k for k in ENTRY_KEYS if not k.startswith('embargo')]
>>> Export = metacatalog.ext.extension('export')
>>> Export.json(entry, '/temp/metadata.json', use_keys=use_keys)
classmethod netcdf(entry: Entry, path=None, **kwargs)#

Export an Entry to netCDF or xarray. If a path is given, a new netCDF file will be created. If path is None, the xarray used for building the netCDF is returned.

Note that the common attribute no_data, which is available for the other export functions, is not available for netCDF export. Furthermore, the flat flag is always true, as the Python netCDF implementation does not support nested attributes.

Parameters:
  • entry (metacatalog.models.Entry) – The entry instance to be exported

  • path (str) – If given, a file location for export.

Returns:

out – The xarray used to build the netCDF file if path is None, else None

Return type:

xarray

Notes

The content of the file will be created using an ImmutableResultSet. This will lazy-load sibling Entries and parent groups as needed for a useful metadata export. The list of exported properties is hardcoded into this extension, but can be overwritten. You can also import the list:

>>> from metacatalog.ext.export.extension import ENTRY_KEYS

An updated list can then be passed as kwargs:

>>> import metacatalog
>>> use_keys = [k for k in ENTRY_KEYS if not k.startswith('embargo')]
>>> Export = metacatalog.ext.extension('export')
>>> Export.netcdf(entry, '/temp/metadata.nc', use_keys=use_keys)
classmethod pickle(entry: Entry, path=None, flat=False, serialize=False, no_data=False, **kwargs)#

Export an Entry to a Python dict. If a path is given, a new file will be created.

Parameters:
  • entry (metacatalog.models.Entry) – The entry instance to be exported

  • path (str) – If given, a file location for export.

  • flat (bool) – If True, the resulting dict will be un-nested and formerly nested keys will be joined into keys like parent.child, where the delimiter defaults to ‘.’ but can be changed. Defaults to False.

  • serialize (bool) – If True, all output data will be converted to serializable types, if possible. This may not work for all data formats. If no path is given, it is recommended to set serialize to False. Defaults to False.

  • no_data (bool) – If set to True, the actual data will not be loaded and included. This can be helpful if the data is not serializable or very large.

Returns:

out – The native Python dict if path is None, else None

Return type:

dict

Notes

The content of the file will be created using an ImmutableResultSet. This will lazy-load sibling Entries and parent groups as needed for a useful metadata export. The list of exported properties is hardcoded into this extension, but can be overwritten. You can also import the list:

>>> from metacatalog.ext.export.extension import ENTRY_KEYS

An updated list can then be passed as kwargs:

>>> import metacatalog
>>> use_keys = [k for k in ENTRY_KEYS if not k.startswith('embargo')]
>>> Export = metacatalog.ext.extension('export')
>>> Export.pickle(entry, '/temp/metadata.pickle', use_keys=use_keys)
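Reading such a file back might look like the sketch below. It assumes the exported file is an ordinary pickle of the underlying dict, which the text above suggests but does not state outright; the dict contents are hypothetical and the file is written to a temporary directory to keep the example self-contained.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the dict underlying the export.
exported = {"uuid": "a1", "title": "Air temperature", "authors": ["Doe, J."]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "metadata.pickle")

    # Write the dict the way a pickle export with a path presumably would.
    with open(path, "wb") as f:
        pickle.dump(exported, f)

    # Load it again; the result is an equal, independent dict.
    with open(path, "rb") as f:
        restored = pickle.load(f)
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.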
classmethod to_dict(entry: Entry, use_keys: List[str] = ('uuid', 'external_id', 'title', 'authors', 'abstract', 'citation', 'location_shape', 'variable', 'license', 'datasource', 'details', 'embargo', 'embargo_end', 'version', 'latest_version', 'plain_keyword_dict', 'publication', 'lastUpdate', 'comment', 'associated_groups'), serialize=True, no_data=False, clean=True, **kwargs) dict#

Return the Entry as a dict, ready for export.