Export Extension
- class metacatalog.ext.export.ExportExtension
Export functions.
The default ExportExtension provides raw export functionality. The Entry has an export function that maps the requested format to an activated extension of that name, or falls back to this extension and calls the function of the same name. Raw export means that the given Entry is translated into a Python dictionary and then written to the given file format as natively as possible. This does not follow any specified standards or rules. Standard metadata formats are implemented in separate extensions.
Currently, the following exports are supported:
JSON
XML (called fast_XML)
pickle
netCDF
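The name-based lookup described above can be illustrated with a minimal sketch. It is not metacatalog code: RawExport, export, iso_like and the activated mapping are hypothetical stand-ins, and it assumes an activated extension can be treated as a plain callable.

```python
import json as jsonlib


class RawExport:
    """Hypothetical fallback offering one classmethod per raw format."""

    @classmethod
    def json(cls, data: dict, **kwargs) -> str:
        return jsonlib.dumps(data, **kwargs)


def export(data: dict, fmt: str, activated: dict, fallback=RawExport, **kwargs):
    """Resolve fmt to an activated extension first, else to a fallback method."""
    if fmt in activated:
        return activated[fmt](data, **kwargs)
    # Fall back to a method of the same name on the default extension
    handler = getattr(fallback, fmt, None)
    if handler is None:
        raise AttributeError(f"No export available for format '{fmt}'")
    return handler(data, **kwargs)


def iso_like(data: dict, **kwargs) -> str:
    """Hypothetical standard-specific export (assumption, for illustration)."""
    return f"<title>{data['title']}</title>"


activated = {'iso_like': iso_like}
entry_dict = {'title': 'Discharge'}

from_extension = export(entry_dict, 'iso_like', activated)
from_fallback = export(entry_dict, 'json', activated)
```

Here 'iso_like' is served by the activated stand-in, while 'json' is not activated and therefore resolves to the fallback method of the same name.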
Note
The XML export is currently done with another package (dicttoxml). This works well, but it is not possible to adjust the exported XML to, e.g., ISO19115 standard requirements. This will be implemented with an export via lxml. Thus, the function name xml is reserved for the base version of that export.
If no path is specified, the native Python object that would be used to create the file is returned. This can be very useful to pack more than one Entry together. With path=None, the export function will return the following objects:
JSON -> str: the JSON object as string
fast_XML -> str: the XML representation as (decoded UTF-8) string
pickle -> dict: the dict underlying all export functions
netCDF -> xarray: the xarray used to build the netCDF file
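As a rough illustration of the path=None convention and of packing several entries together, consider the sketch below. It does not call metacatalog: export_pickle is a hypothetical stand-in that only mirrors the documented behaviour (return the object without a path, write a file and return None with one), and the two dicts are made-up examples.

```python
import pickle
from typing import Optional


def export_pickle(data: dict, path: Optional[str] = None) -> Optional[dict]:
    """Hypothetical stand-in: return the dict, or write it to path and return None."""
    if path is None:
        return data
    with open(path, 'wb') as f:
        pickle.dump(data, f)
    return None


# Made-up dicts standing in for two exported entries
entries = [
    {'uuid': 'a1', 'title': 'Entry one'},
    {'uuid': 'b2', 'title': 'Entry two'},
]

# Without a path, each export yields a dict that can be packed by UUID
packed = {}
for entry_dict in entries:
    exported = export_pickle(entry_dict)
    packed[exported['uuid']] = exported

# The combined object can then be serialized once
blob = pickle.dumps(packed)
```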
Note that the export of 'pickle' and 'netCDF' can be particularly useful without setting the path.
- classmethod fast_xml(entry: Entry, path=None, no_data=False, **kwargs)
Export an Entry to XML. If a path is given, a new file will be created. This is the fast XML version, which will convert the metadata to custom XML tags.
- Parameters:
entry (metacatalog.models.Entry) – The entry instance to be exported
path (str) – If given, a file location for export.
no_data (bool) – If set to True, the actual data will not be loaded and included. This can be helpful if the data is not serializable or very large.
- Returns:
out – The XML str if path is None, else None
- Return type:
str
Notes
The content of the file will be created using an ImmutableResultSet. This will lazy-load sibling Entries and parent groups as needed for a useful metadata export. The list of exported properties is hardcoded into this extension, but can be overwritten. You can also import the list:
>>> from metacatalog.ext.export.extension import ENTRY_KEYS
An updated list can then be passed as kwargs:
>>> import metacatalog
>>> use_keys = [k for k in ENTRY_KEYS if not k.startswith('embargo')]
>>> Export = metacatalog.ext.extension('export')
>>> Export.fast_xml(entry, '/temp/metadata.xml', use_keys=use_keys)
- classmethod flat_keys(data: dict, prefix=False, delimiter: str = '.', **kwargs) -> dict
Turn nested dictionaries into flat dictionaries by expanding nested keys using the given delimiter.
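The flattening idea can be sketched in a few lines. This is not the implementation of flat_keys; the helper flatten and its parent argument are hypothetical, and the prefix option from the signature above is not modelled.

```python
def flatten(data: dict, delimiter: str = '.', parent: str = '') -> dict:
    """Expand nested keys into delimiter-joined keys of a single flat dict."""
    flat = {}
    for key, value in data.items():
        name = f"{parent}{delimiter}{key}" if parent else str(key)
        if isinstance(value, dict) and value:
            # Recurse into non-empty nested dicts, carrying the joined key along
            flat.update(flatten(value, delimiter=delimiter, parent=name))
        else:
            flat[name] = value
    return flat


nested = {
    'title': 'Discharge',
    'license': {'short_title': 'CC BY 4.0', 'share_alike': False},
}
flat = flatten(nested)
# flat == {'title': 'Discharge',
#          'license.short_title': 'CC BY 4.0',
#          'license.share_alike': False}
```

Empty nested dicts are kept as values in this sketch so that no key silently disappears.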
- classmethod get_data(entry: Entry, serialize=True, **kwargs) -> dict
Return the data as a UUID-indexed dict.
- classmethod json(entry: Entry, path: Optional[str] = None, flat=False, indent: int = 4, no_data=False, **kwargs)
Export an Entry to JSON. If a path is given, a new file will be created.
- Parameters:
entry (metacatalog.models.Entry) – The entry instance to be exported
path (str) – If given, a file location for export.
flat (bool) – If True, the resulting JSON will be flattened: formerly nested keys are joined like parent.child, where the delimiter defaults to ‘.’ but can be changed. Defaults to False.
indent (int) – The default indentation for the JSON file
no_data (bool) – If set to True, the actual data will not be loaded and included. This can be helpful if the data is not serializable or very large.
- Returns:
out – The JSON string if path is None, else None
- Return type:
str
Notes
The content of the file will be created using an ImmutableResultSet. This will lazy-load sibling Entries and parent groups as needed for a useful metadata export. The list of exported properties is hardcoded into this extension, but can be overwritten. You can also import the list:
>>> from metacatalog.ext.export.extension import ENTRY_KEYS
An updated list can then be passed as kwargs:
>>> import metacatalog
>>> use_keys = [k for k in ENTRY_KEYS if not k.startswith('embargo')]
>>> Export = metacatalog.ext.extension('export')
>>> Export.json(entry, '/temp/metadata.json', use_keys=use_keys)
- classmethod netcdf(entry: Entry, path=None, **kwargs)
Export an Entry to netCDF or xarray. If a path is given, a new netCDF file will be created; if path is None, the xarray used for building the netCDF is returned.
Note that the common attribute no_data, which is available for the other export functions, is not available for netCDF export. Furthermore, the flat flag is always true, as the Python netCDF implementation does not support nested attributes.
- Parameters:
entry (metacatalog.models.Entry) – The entry instance to be exported
path (str) – If given, a file location for export.
- Returns:
out – The xarray used to build the netCDF file if path is None, else None
- Return type:
xarray
Notes
The content of the file will be created using an ImmutableResultSet. This will lazy-load sibling Entries and parent groups as needed for a useful metadata export. The list of exported properties is hardcoded into this extension, but can be overwritten. You can also import the list:
>>> from metacatalog.ext.export.extension import ENTRY_KEYS
An updated list can then be passed as kwargs:
>>> import metacatalog
>>> use_keys = [k for k in ENTRY_KEYS if not k.startswith('embargo')]
>>> Export = metacatalog.ext.extension('export')
>>> Export.netcdf(entry, '/temp/metadata.nc', use_keys=use_keys)
- classmethod pickle(entry: Entry, path=None, flat=False, serialize=False, no_data=False, **kwargs)
Export an Entry to a Python dict. If a path is given, a new file will be created.
- Parameters:
entry (metacatalog.models.Entry) – The entry instance to be exported
path (str) – If given, a file location for export.
flat (bool) – If True, the resulting dict will be flattened: formerly nested keys are joined like parent.child, where the delimiter defaults to ‘.’ but can be changed. Defaults to False.
serialize (bool) – If True, all output data will be converted to serializable types, if possible. This may not work for all data formats. If no path is given, it is recommended to set serialize to False. Defaults to False.
no_data (bool) – If set to True, the actual data will not be loaded and included. This can be helpful if the data is not serializable or very large.
- Returns:
out – The native Python dict if path is None, else None
- Return type:
dict
Notes
The content of the file will be created using an ImmutableResultSet. This will lazy-load sibling Entries and parent groups as needed for a useful metadata export. The list of exported properties is hardcoded into this extension, but can be overwritten. You can also import the list:
>>> from metacatalog.ext.export.extension import ENTRY_KEYS
An updated list can then be passed as kwargs:
>>> import metacatalog
>>> use_keys = [k for k in ENTRY_KEYS if not k.startswith('embargo')]
>>> Export = metacatalog.ext.extension('export')
>>> Export.pickle(entry, '/temp/metadata.pickle', use_keys=use_keys)
- classmethod to_dict(entry: Entry, use_keys: List[str] = ('uuid', 'external_id', 'title', 'authors', 'abstract', 'citation', 'location_shape', 'variable', 'license', 'datasource', 'details', 'embargo', 'embargo_end', 'version', 'latest_version', 'plain_keyword_dict', 'publication', 'lastUpdate', 'comment', 'associated_groups'), serialize=True, no_data=False, clean=True, **kwargs) -> dict
Return the Entry as the dict that is finally exported.
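How a use_keys list restricts the exported properties can be pictured with the sketch below. It is only an assumption about the general idea, not the code of to_dict: to_dict_sketch, DEFAULT_KEYS and the SimpleNamespace stand-in for an Entry are hypothetical, and the serialize, no_data and clean options are not modelled.

```python
from types import SimpleNamespace

# Hypothetical subset of keys; the real default list is in the signature above
DEFAULT_KEYS = ('uuid', 'title', 'embargo', 'embargo_end')


def to_dict_sketch(entry, use_keys=DEFAULT_KEYS) -> dict:
    """Collect only the requested attributes from an entry-like object."""
    return {key: getattr(entry, key, None) for key in use_keys}


# Made-up object standing in for an Entry
entry = SimpleNamespace(uuid='a1', title='Discharge', embargo=False, embargo_end=None)

# Drop all embargo-related keys, as in the Notes of the export functions
use_keys = [k for k in DEFAULT_KEYS if not k.startswith('embargo')]
result = to_dict_sketch(entry, use_keys=use_keys)
# result == {'uuid': 'a1', 'title': 'Discharge'}
```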