Find Command#
Help#
The help text for the find subcommand can be shown by passing the -h flag.
[1]:
%%bash
metacatalog find -h
usage: metacatalog find [-h] [--version] [--connection CONNECTION] [--verbose]
[--quiet] [--dev] [--logfile LOGFILE] [--by BY BY]
[--json] [--stdout] [--csv]
entity
positional arguments:
entity Name of the requested database entity.
optional arguments:
-h, --help show this help message and exit
--version, -v Returns the module version
--connection CONNECTION, -C CONNECTION
Connection string to the database instance.Follows the
syntax: driver://user:password@host:port/database
--verbose, -V Activate extended output.
--quiet, -q Suppress any kind of output.
--dev Development mode. Unexpected errors will not be
handled and the full traceback is printed to the
screen.
--logfile LOGFILE If a file is given, output will be written to that
file instead of printed to StdOut.
--by BY BY key value pair to be used for finding record(s) in the
database. Flag can be used multiple times.
--json Output the found entities as JSON objects
--stdout Default option. Print the string representation of
found entities to StdOut.
--csv Output the found entities as CSV.
Prerequisites#
The find command assumes that either `create <cli_create.ipynb>`__ and `populate <cli_populate.ipynb>`__ or `init <cli_init.ipynb>`__ were executed successfully.
Usage#
entity#
Note
The CLI endpoint of find
is just wrapping the Python API endpoint. The API is designed for building model instances, which is often not really helpful from the command line. In future releases, more database model clases will represent themselves correctly when printed to StdOut. Furthermore a set of export flags are planned, to export models into CSV or JSON files.
Until then, some entities might not turn out very helpful at the current state.
The find command has one required positional argument, entity. This is the name of the record entity that should be found. There is a dictionary in metacatalog that maps entity names to database models:
[2]:
from metacatalog.api._mapping import TABLE_MAPPING
from pprint import pprint
pprint(TABLE_MAPPING)
{'datasource_types': <class 'metacatalog.models.datasource.DataSourceType'>,
'datasources': <class 'metacatalog.models.datasource.DataSource'>,
'entries': <class 'metacatalog.models.entry.Entry'>,
'entry_groups': <class 'metacatalog.models.entrygroup.EntryGroup'>,
'keywords': <class 'metacatalog.models.keyword.Keyword'>,
'licenses': <class 'metacatalog.models.license.License'>,
'person_roles': <class 'metacatalog.models.person.PersonRole'>,
'persons': <class 'metacatalog.models.person.Person'>,
'thesaurus': <class 'metacatalog.models.keyword.Thesaurus'>,
'units': <class 'metacatalog.models.variable.Unit'>,
'variables': <class 'metacatalog.models.variable.Variable'>}
Many entities map to the same model. This is either due to different spellings or because the API creates database records in different contexts. For example, the API forces the user to pass at least one person as the first author of an Entry on creation. Contributors are optional and can be added if applicable. All persons are, however, saved into the same table.
connection#
If no default connection was created and saved, you have to supply a connection string to the database using the --connection flag. See the `connection <cli_connection.ipynb>`__ command.
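The help text gives the connection string syntax as driver://user:password@host:port/database. As a minimal sketch of how such a string breaks down into its parts, the snippet below uses only the Python standard library. The helper name split_connection and all values in the example string are placeholders chosen for illustration; this is not how metacatalog itself parses the string.

```python
from urllib.parse import urlsplit


def split_connection(connection):
    """Split a driver://user:password@host:port/database string into its parts.

    Hypothetical helper for illustration only, not part of metacatalog.
    """
    parts = urlsplit(connection)
    return {
        "driver": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        # The path carries a leading slash, which is not part of the name.
        "database": parts.path.lstrip("/"),
    }


# Placeholder credentials, made up for this example.
example = "postgresql://alice:secret@localhost:5432/metacatalog"
for key, value in split_connection(example).items():
    print(f"{key}: {value}")
```

Checking a string this way before passing it to --connection can help spot a missing port or database name early.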
passing arguments#
Arguments to filter for the correct records can be specified with the --by flag. Its use is optional. If no filter is set, all records will be returned, which might be a lot. You can pass --by multiple times to create multiple filters.
Note
The find endpoint is not made for open searches and does not offer fine-grained filtering. All filters passed are stacked on top of each other, effectively resulting in a logical AND connection.
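To illustrate what stacking filters with a logical AND means, here is a small sketch that works on plain dictionaries instead of database models. The function find_records and the sample records are made up for this example. Comparing values by their string form is also an assumption, made because command line arguments always arrive as strings; the real API may handle types differently.

```python
def find_records(records, filters):
    """Return the records that match every (column, value) pair.

    Hypothetical stand-in for a database query. Values are compared by
    their string form, which is an assumption of this sketch.
    """
    return [
        record
        for record in records
        if all(col in record and str(record[col]) == val for col, val in filters)
    ]


# Made-up sample records, not taken from a real metacatalog database.
records = [
    {"id": 1, "short_title": "A", "by_attribution": True},
    {"id": 2, "short_title": "B", "by_attribution": True},
    {"id": 3, "short_title": "C", "by_attribution": False},
]

# One filter: every record with by_attribution set to True.
print(find_records(records, [("by_attribution", "True")]))

# Two filters: both conditions must hold, so only one record remains.
print(find_records(records, [("by_attribution", "True"), ("short_title", "B")]))

# No filter at all returns every record.
print(len(find_records(records, [])))
```

Each additional filter can only narrow the result further. There is no way to express OR or NOT in this scheme, which matches the limitation described in the note above.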
The --by flag requires exactly two arguments. The first is the column to filter on and the second is the value that has to be matched. It cannot perform not-filters and does not accept None or null.
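The usage line shows the flag as --by BY BY, and the help text says it can be given multiple times. A flag with that behaviour can be declared with the standard library argparse module as sketched below. Whether metacatalog declares its parser exactly like this is an assumption; the program name find-sketch is made up for the example.

```python
import argparse

# Illustrative parser only; "find-sketch" is a made-up program name.
parser = argparse.ArgumentParser(prog="find-sketch")
parser.add_argument("entity", help="Name of the requested database entity.")
parser.add_argument(
    "--by",
    nargs=2,            # exactly two values per flag: column and value
    action="append",    # each use of --by adds another pair to the list
    metavar="BY",
    default=[],
    help="key value pair, can be used multiple times",
)

args = parser.parse_args(
    ["licenses", "--by", "short_title", "ODbL", "--by", "id", "4"]
)
print(args.entity)
print(args.by)
```

Every value arrives as a string, including the 4 passed for id, so any type conversion has to happen after parsing.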
Example#
[3]:
%%bash
metacatalog find licenses --by short_title ODbL
Open Data Commons Open Database License <ID=4>
[4]:
%%bash
metacatalog find licenses --by id 4
Open Data Commons Open Database License <ID=4>
[5]:
%%bash
metacatalog find licenses --by by_attribution True
Open Data Commons Open Database License <ID=4>
Open Data Commons Attribution License v1.0 <ID=5>
Creative Commons Attribution 4.0 International <ID=6>
Creative Commons Attribution-ShareAlike 4.0 International <ID=7>
Creative Commons Attribution-NonCommerical 4.0 International <ID=8>
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International <ID=9>
[6]:
%%bash
metacatalog find entry
<ID=3 Sap Flow - Hohes Hol [sap flow] >
<ID=4 Sap Flow - Hohes Hol [sap flow] >
<ID=5 Sap Flow - Hohes Hol [sap flow] >
<ID=6 Sap Flow - Hohes Hol [sap flow] >
<ID=7 Sap Flow - Hohes Hol [sap flow] >
<ID=8 Sap Flow - Hohes Hol [sap flow] >
<ID=9 Sap Flow - Hohes Hol [sap flow] >
<ID=16 Sap Flow - Hohes Hol [sap flow] >
<ID=17 Sap Flow - Hohes Hol [sap flow] >
<ID=18 Alfred's data [awesome] >
<ID=19 Alfred's data [awesome] >
<ID=1 Sap Flow - Hohes Hol [sap flow] >
<ID=11 Sap Flow - Hohes Hol [sap flow] >
<ID=10 Sap Flow - Hohes Hol [sap flow] >
<ID=12 Sap Flow - Hohes Hol [sap flow] >
<ID=13 Sap Flow - Hohes Hol [sap flow] >
<ID=14 Sap Flow - Hohes Hol [sap flow] >
<ID=15 Sap Flow - Hohes Hol [sap flow] >
<ID=2 Sap Flow - Hohes Hol [sap flow] >
<ID=20 Alfred data [awesome] >