{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Find Command" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Help\n", "The help text for the `find` subcommand can be shown by passing the `-h` flag." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "usage: metacatalog find [-h] [--version] [--connection CONNECTION] [--verbose]\n", " [--quiet] [--dev] [--logfile LOGFILE] [--by BY BY]\n", " [--json] [--stdout] [--csv]\n", " entity\n", "\n", "positional arguments:\n", " entity Name of the requested database entity.\n", "\n", "optional arguments:\n", " -h, --help show this help message and exit\n", " --version, -v Returns the module version\n", " --connection CONNECTION, -C CONNECTION\n", " Connection string to the database instance.Follows the\n", " syntax: driver://user:password@host:port/database\n", " --verbose, -V Activate extended output.\n", " --quiet, -q Suppress any kind of output.\n", " --dev Development mode. Unexpected errors will not be\n", " handled and the full traceback is printed to the\n", " screen.\n", " --logfile LOGFILE If a file is given, output will be written to that\n", " file instead of printed to StdOut.\n", " --by BY BY key value pair to be used for finding record(s) in the\n", " database. Flag can be used multiple times.\n", " --json Output the found entities as JSON objects\n", " --stdout Default option. Print the string representation of\n", " found entities to StdOut.\n", " --csv Output the found entities as CSV.\n" ] } ], "source": [ "%%bash\n", "metacatalog find -h" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prerequists\n", "\n", "The `find` command assumes that either [`create`](cli_create.ipynb) and [`populate`](cli_populate.ipynb) or [`init`](cli_init.ipynb) were executed successfully." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Usage\n", "\n", "### entity" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. note::\n", "\n", " The CLI endpoint of ``find`` is just wrapping the Python API endpoint. The API is designed for building model instances, which is often not really helpful from the command line. In future releases, more database model clases will represent themselves correctly when printed to StdOut. Furthermore a set of *export flags* are planned, to export models into CSV or JSON files.\n", " Until then, some entities might not turn out very helpful at the current state." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `find` command has one positional argument `entity` that has to be provided. This is the name of the record entitiy that should be `found`. There is a dictionary in `metacatalog` that maps enitity names to database models:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'datasource_types': ,\n", " 'datasources': ,\n", " 'entries': ,\n", " 'entry_groups': ,\n", " 'keywords': ,\n", " 'licenses': ,\n", " 'person_roles': ,\n", " 'persons': ,\n", " 'thesaurus': ,\n", " 'units': ,\n", " 'variables': }\n" ] } ], "source": [ "from metacatalog.api._mapping import TABLE_MAPPING\n", "from pprint import pprint\n", "pprint(TABLE_MAPPING)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Many entities map to the same model. This is either due to different spelling, or because the API creates database records in different contexts. E.g. the API forces the user to pass at least one *person* as the first author of an *Entry* on creation. The *contributors* are optional and can be added if applicable. All *person*s will, however, be saved into the same table." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### connection\n", "\n", "In case no default connection was created and saved, you have to supply a connection string to the database using the `--connection` flag. See [`connection`](cli_connection.ipynb) command." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### passing arguments\n", "\n", "Arguments to filter for the correct records can be spcified by the `--by` flag. It's usage is optional. If no filter is set, **all** records will be returned, which might be a lot.\n", "You can pass `--by` multiple times to create multiple filters. " ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. note::\n", "\n", " The ``find`` endpoint is not made for open searches and does not offer fine-granular filtering. Each filter passed is stacked **on top** of each other, effectively resulting in a logical **AND** connection. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `--by` flag requires exactly two arguments. The first is the column to filter and the second the value which has to be matched. It cannot perform *not*-filters and does not accept a `None` or `null`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Open Data Commons Open Database License \n" ] } ], "source": [ "%%bash\n", "metacatalog find licenses --by short_title ODbL" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Open Data Commons Open Database License \n" ] } ], "source": [ "%%bash\n", "metacatalog find licenses --by id 4" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Open Data Commons Open Database License \n", "Open Data Commons Attribution License v1.0 \n", "Creative Commons Attribution 4.0 International \n", "Creative Commons Attribution-ShareAlike 4.0 International \n", "Creative Commons Attribution-NonCommerical 4.0 International \n", "Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International \n" ] } ], "source": [ "%%bash \n", "metacatalog find licenses --by by_attribution True" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n" ] } ], "source": [ "%%bash\n", "metacatalog find entry" ] } ], "metadata": { "celltoolbar": "Raw Cell Format", "finalized": { "timestamp": 1590129905396, "trusted": true }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.7" } }, "nbformat": 4, "nbformat_minor": 4 }