Metadata Overview

Available metadata entities

The following table summarizes all metadata information that can be stored in metacatalog.

Metadata descriptions

header

description

user_provided

mandatory

external_id

If the metadata references a dataset in another system (e.g. on export), the external ID should identify the original metadata set.

yes

no

uuid

Version 4 universally unique identifier (UUID) that is assigned by the system.

no

yes

title

The title of the dataset. It should distinguish the dataset from similar ones, e.g. within the same campaign. Use e.g. IDs or variable names.

yes

yes

citation

Citation information stating how the dataset should be cited. If empty, a default standard citation style will be applied on export.

optional

yes

abstract

Detailed description of the dataset. Give as much information as possible in natural English text. The abstract is full-text searchable. Please also include any information about data provenance and quality, if available.

yes

yes

author

The one main author of the dataset.

yes

yes

coAuthors

List of additional co-authors of the dataset. They have the same structure as the first author but hold an additional order key to set the order of coAuthors. Note: Technically, relationships other than coAuthor can be set. On export to DataCite or ISO19115, anyone who is not an author will become a coAuthor.

yes

no

author.first_name

First name for real persons

yes

yes

author.last_name

Last name for real persons

yes

yes

author.affiliation

The person's affiliation, if applicable.

yes

no

author.order

Only applies to coAuthors. Can be used to set the order of contributors.

yes

no

contributors, editors, publishers, rightHolders, owners, originators

Lists of persons, structured just like coAuthors but with a different relationship. All are optional; one or more can be set. A person can appear in more than one list, except for contributors, which is mutually exclusive. Please note that all lists will be transformed to coAuthors on metadata export.

optional

no

location

WGS84 coordinates of the dataset. Important: this is only used as a reference location, thus only POINT geometry is allowed. If a point is not applicable, use the geometric center of the original geometry.

yes

yes

geom

WGS84 multi-geometry collection, used if the geometry of the dataset cannot be properly represented by a POINT (location). Add a description to the abstract if used.

yes

no

variable

The name of the variable used. Note that the main design decision of the metadata scheme is that each dataset has only one variable. If a dataset contains more than one variable, you need to split it and create a ‘Composite’ dataset-group.

yes

yes

unit

The unit of the variable. Use SI or derived SI units wherever possible.

yes

yes

license

Data license: Each dataset must contain usage information represented by a license. Available licenses are:
- ODbL (Open Data Commons Open Database License)
- ODC-By (Open Data Commons Attribution License v1.0)
- CC BY 4.0 (Creative Commons Attribution 4.0 International)
- CC BY-NC 4.0 (Creative Commons Attribution-NonCommercial 4.0 International)
Important: The usage of CC BY-NC is discouraged as it is not an open license.

yes

yes

comment

General comments of a more technical nature. Usually not used.

yes

no

temporalScale

The scale triplet for temporally scaled data. If you set a scale, you need to specify all three components.

yes

no

temporalScale.resolution

The temporal resolution of one time step in the dataset. Please use ISO 8601 durations to describe the resolution.

yes

no

temporalScale.extent

The temporal extent of the dataset. Provide a list of [start, end] for XML or JSON formats, or observation_start and observation_end for columnar metadata formats.

yes

no

temporalScale.support

The ratio of the specified resolution that is actually supported by an observation. Defaults to 1.0, which means the full time step. Use smaller numbers if observations do not support the full resolution. For example, a support of 0.5 at a resolution of 1 hour means the observation is only representative of the 30 minutes up to the observation timestamp.

optional

no

spatialScale

The scale triplet for spatially scaled datasets. If you set a scale, you need to specify all three components.

yes

no

spatialScale.resolution

Spatial resolution in meters. Please estimate or approximate if the data source does not use a meter-based CRS.

yes

no

spatialScale.extent

The extent of the dataset. Whenever possible use bounding boxes here. If not applicable, a POLYGON can be used.

yes

no

spatialScale.support

The ratio of the resolution that is supported by an observation. The support is usually 1.0 for remote sensing products. If ground truthing is available, the support can be set to the share of the resolution covered by ground truthing.

optional

no

data_names

The database stores default (column) names for the data on export. data_names can overwrite these settings. If used, you must describe the names in the abstract or via details keywords.

optional

no

encoding

If the dataset is exported to a file-based format, it will be UTF-8 encoded by default. If another encoding is needed for technical reasons, the encoding can be overwritten.

optional

yes

embargo

The dataset can be kept private for the first two years after uploading, e.g. while a publication is in preparation. After this period, the dataset will be public under the specified license.

optional

yes

keywords

The database implements the NASA GCMD Earth Science keywords (https://earthdata.nasa.gov/earth-observation-data/find-data/idn/gcmd-keywords). You can tag the dataset with as many keywords as needed. As they are hierarchical, only the UUID of the last element is needed.

optional

yes

details

A dataset can be described by an arbitrary amount of additional information as key-value pairs. The keys should be short and descriptive. The values can be literals or nested structures like lists. Details can be provided as a key=value list or as nested structures, e.g. JSON: {"mykey": {"value": "foobar", "description": "Text what mykey and foobar is all about"}}. Additionally, keys should be described in the abstract.

yes

no

groups

In the metadata scheme, groups are used to model relations between datasets. There are three important groups that can be specified, if applicable:
- Split-dataset: If the metadata changes for one dataset, it has to be split up into two datasets, each with its own metadata. A Split-dataset group will merge the data together again on export.
- Composite-dataset: If this variable must not or cannot be used without another dataset, you can indicate a composite dataset, which will always export siblings along with the data.
- Labeled-dataset: If there is any structuring within the project that groups datasets together, but not in a strict sense like a composite, a label can be used. This is usually a site or place name that makes working with the data more convenient.

yes

no

group.title

The title of the group. For Composites and Split-datasets, the same title for the group and each dataset is commonly used.

yes

no

group.description

If necessary, an additional description for the group

yes

no

project

Any number of datasets that were collected within the same context can be grouped together by a project. Technically, this is also a group, but it does not cause any lazy-loading of other datasets and is purely descriptive. If not specified, the original network is used as the project, if applicable.

optional

no
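
To make the temporalScale and details rows more concrete, the following minimal Python sketch assembles a hypothetical metadata record as a plain dictionary and computes the time window that a single observation is representative of. The dictionary layout, the example values, and the helper supported_window are assumptions made for illustration and are not part of the metacatalog API; only the field names are taken from the table above.

```python
import json
from datetime import datetime, timedelta

# Hypothetical record: key names follow the table above, all values are made up.
record = {
    "title": "Air temperature, station 12",
    "variable": "air temperature",
    "unit": "K",
    "license": "CC BY 4.0",
    "temporalScale": {
        "resolution": "PT1H",  # ISO 8601 duration: one hour
        "extent": ["2020-01-01T00:00:00", "2020-12-31T23:00:00"],  # [start, end]
        "support": 0.5,  # only half of each time step is supported
    },
    "details": {
        "mykey": {
            "value": "foobar",
            "description": "Text what mykey and foobar is all about",
        }
    },
}


def supported_window(timestamp, resolution, support):
    """Return (start, end) of the interval an observation is representative of.

    Assumes the supported interval ends at the observation timestamp,
    as in the 0.5 / 1 hour example in the table.
    """
    return timestamp - resolution * support, timestamp


# "PT1H" is written out by hand here because the standard library
# has no parser for ISO 8601 durations.
resolution = timedelta(hours=1)
first_obs = datetime.fromisoformat(record["temporalScale"]["extent"][0])
start, end = supported_window(
    first_obs, resolution, record["temporalScale"]["support"]
)

# The record, including nested details, serializes to JSON directly.
serialized = json.dumps(record, indent=2)
print(start, end)
```

Under this reading, the default support of 1.0 would make the window span the full hour before the timestamp, while 0.5 shortens it to the last 30 minutes.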