├── .github └── ISSUE_TEMPLATE │ ├── new-term-template.md │ └── term-change-template.md ├── .gitignore ├── README.md ├── build ├── README.md ├── build-termlist.py ├── build.py ├── build_other_doc_header.py ├── dwc_doc_hierarchy │ └── index.md ├── dwc_doc_inclusive │ └── index.md ├── dwc_doc_tcr │ ├── authors_configuration.yaml │ ├── document_configuration.yaml │ └── termlist-header.md ├── generate_term_versions.py ├── qrg-list.csv ├── requirements.txt ├── tcr-2024-02-28 │ ├── config.yaml │ ├── tcr.csv │ └── vocab.yaml ├── tcr_build.py ├── termlist-footer.md ├── termlist-header.md ├── termlist-header_filled.md ├── terms.tmpl └── update_previous_doc.py ├── dist ├── simple_eco_horizontal.csv └── simple_eco_vertical.csv ├── docs ├── CNAME ├── _config.yml ├── _data │ ├── footer.yml │ └── navigation.yml ├── _sass │ └── _custom.scss ├── hierarchy │ ├── fig1.png │ ├── fig2.png │ ├── fig3.png │ └── index.md ├── humboldt_extension_implementation_experience_report.pdf ├── inclusive │ └── index.md ├── index.md ├── list │ ├── 2023-08-08.md │ ├── 2023-08-25.md │ ├── 2023-09-03.md │ ├── 2023-09-04.md │ ├── 2024-02-28.md │ └── index.md ├── tcr │ └── index.md └── terms │ └── index.md ├── material ├── Checklist Metadata - Data Entry Manual.docx ├── Guralnick et al Ecography 2017.pdf ├── HCSupplementalTable3_FullTermList_r2_v4_RW.xlsx ├── HC_SupplementalTable_ExamplesNEW.xlsx ├── TDWG_Task_Group_Charter_Template_03.docx └── desktop.ini └── vocabulary ├── old ├── HC_terms_2021-02-28.csv ├── HC_terms_2021-11-17.csv ├── HC_terms_2022-02-25.csv ├── HC_terms_2022-03-02.csv ├── README.md └── term_versions_eco_2023-03-02.csv └── term_versions.csv /.github/ISSUE_TEMPLATE/new-term-template.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: New term template 3 | about: This template sets up a new issue with the information needed for a new term 4 | request. 
5 | title: 'New Term - ' 6 | labels: Term - add 7 | assignees: '' 8 | 9 | --- 10 | 11 | ## New term 12 | 13 | * Submitter: 14 | * Efficacy Justification (why is this term necessary?): 15 | * Demand Justification (name at least two organizations that independently need this term): 16 | * Stability Justification (what concerns are there that this might affect existing implementations?): 17 | * Implications for dwciri: namespace (does this change affect a dwciri term version)?: 18 | 19 | Proposed attributes of the new term: 20 | 21 | * Term name (in lowerCamelCase for properties, UpperCamelCase for classes): 22 | * Term label (English, not normative): 23 | * Organized in Class (e.g., Occurrence, Event, Location, Taxon): 24 | * Definition of the term (normative): 25 | * Usage comments (recommendations regarding content, etc., not normative): 26 | * Examples (not normative): 27 | * Refines (identifier of the broader term this term refines; normative): 28 | * Replaces (identifier of the existing term that would be deprecated and replaced by this term; normative): 29 | * ABCD 2.06 (XPATH of the equivalent term in ABCD or EFG; not normative): 30 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/term-change-template.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Term change template 3 | about: This template sets up a new issue with the information needed for a term change 4 | request. 
5 | title: 'Change term - ' 6 | labels: '' 7 | assignees: '' 8 | 9 | --- 10 | 11 | ## Term change 12 | 13 | * Submitter: 14 | * Efficacy Justification (why is this change necessary?): 15 | * Demand Justification (if the change is semantic in nature, name at least two organizations that independently need this term): 16 | * Stability Justification (what concerns are there that this might affect existing implementations?): 17 | * Implications for dwciri: namespace (does this change affect a dwciri term version)?: 18 | 19 | Current Term definition: https://eco.tdwg.org/list/#eco_[term name here] 20 | 21 | Proposed attributes of the new term version (Please put actual changes to be implemented in **bold** and ~strikethrough~): 22 | 23 | * Term name (in lowerCamelCase for properties, UpperCamelCase for classes): 24 | * Term label (English, not normative): 25 | * Organized in Class (e.g., Occurrence, Event, Location, Taxon): 26 | * Definition of the term (normative): 27 | * Usage comments (recommendations regarding content, etc., not normative): 28 | * Examples (not normative): 29 | * Refines (identifier of the broader term this term refines; normative): 30 | * Replaces (identifier of the existing term that would be deprecated and replaced by this term; normative): 31 | * ABCD 2.06 (XPATH of the equivalent term in ABCD or EFG; not normative): 32 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Mac OS X 2 | .DS_Store 3 | 4 | # Jekyll 5 | docs/_site/* 6 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Humboldt Extension Task Group 2 | A TDWG Task Group of the Observations & specimens Interest Group 3 | 4 | ## :arrow_upper_right: Current activity 5 | 6 | [Guralnick et 
al.](https://onlinelibrary.wiley.com/doi/full/10.1111/ecog.02942) introduced the Humboldt Core as a proof of concept in 2018. In **2021**, the [TDWG Humboldt Extension Task Group](https://www.tdwg.org/community/osr/humboldt-extension/) was established to review how best to integrate the terms proposed in the original publication with existing standards and implementation schemas. In the context of sharing data using the **Darwin Core standard**, different types of inventories can be represented as **Events** with different nesting levels. Therefore, it was deemed appropriate to build a **DwC Extension** to include all Humboldt Extension terms that capture the details of the inventory process.
7 |
8 | The Task Group members reviewed all original terms from [Guralnick et al. 2018](https://onlinelibrary.wiley.com/doi/full/10.1111/ecog.02942), reformulated definitions, and discarded or added terms where needed. The latest version of the terms can be found in the [vocabulary folder](https://github.com/tdwg/hc/tree/main/vocabulary). We have also built a [Humboldt Extension Quick Reference Guide]().
9 |
10 | We are currently developing a [Humboldt Extension user guide](); in the meantime, see the GBIF summarised version [here](https://docs.gbif.org/survey-monitoring-quick-start/en/).
11 |
12 | As of February 2024, the Humboldt Extension was ratified as a Darwin Core Event class extension (https://eco.tdwg.org/). Hence, the Task Group was closed, and the extension is now maintained by the Darwin Core Maintenance Group (https://www.tdwg.org/community/dwc/).
13 |
14 |
15 |
16 | ## :books: Relevant documents and materials
17 |
18 | [TG Charter](https://github.com/MapofLife/hc/blob/main/material/TDWG_Task_Group_Charter_Template_03.docx)
19 |
20 | [Humboldt Core paper](https://github.com/MapofLife/hc/blob/main/material/Guralnick%20et%20al%20Ecography%202017.pdf) Guralnick et al. 2017
21 |
22 | [HC Supplemental: Full Term List](https://github.com/MapofLife/hc/blob/main/material/HCSupplementalTable3_FullTermList_r2_v4_RW.xlsx)
23 |
24 | [HC Supplemental: Examples table](https://github.com/MapofLife/hc/blob/main/material/HC_SupplementalTable_ExamplesNEW.xlsx)
25 |
26 |
27 |
--------------------------------------------------------------------------------
/build/README.md:
--------------------------------------------------------------------------------
1 | # Build scripts
2 |
3 | ## Generating the "list of terms" document for the main eco vocabulary
4 |
5 | Prior to building the production List of Terms document, the Python script `update_previous_doc.py` must be run to change the headers of the previous version of the document and to rename that previous version to a dated version. This must be done first; otherwise the previous index.md file will be overwritten by the new one generated by the build-termlist.py script. NOTE: The vocabulary and document metadata in the rs.tdwg.org repository must have been updated before running this script. See for details. Command line arguments are:
6 |
7 | `--slug` (required): the last part of the document URL before the trailing slash. For the Humboldt Extension List of Terms, this is `list`.
8 |
9 | `--dir` (required): the subdirectory of the `process/document_metadata_processing/` directory in the rs.tdwg.org repository where the `authors_configuration.yaml` and `document_configuration.yaml` files are located. For the Humboldt Extension List of Terms, this is `dwc_doc_eco`.
10 |
11 | `--branch` (optional): the branch of the rs.tdwg.org repository where the metadata are located. 
The default is `master`.
12 |
13 | If you are creating a List of Terms document for proofreading prior to ratification, you should first create a branch of the repo so that the previous version of the document in master is not overwritten and the preliminary draft does not appear in the GitHub Pages site. You can then run the build-termlist.py script to generate a new index.md file and commit it to the branch. You can then view the document in the GitHub repository (not the eco.tdwg.org GitHub Pages site) to see how it is rendered. It will not have the styling provided by the TDWG Jekyll theme.
14 |
15 | The Python script `build-termlist.py` reads the header template from `termlist-header.md`, then builds the list of terms and their metadata from data in the [rs.tdwg.org](http://github.com/tdwg/rs.tdwg.org) repository. The script also reads `termlist-footer.md` and appends it to the end of the generated document, although that file currently has no content. After the header, term list, and footer are concatenated, the script inserts author and document metadata from the `authors_configuration.yaml` and `document_configuration.yaml` files in the rs.tdwg.org GitHub repository. Therefore, those files must be in place and updated before running the script. The constructed Markdown document is saved as `/docs/list/index.md`.
16 |
17 | Command line arguments are:
18 |
19 | `--branch` (optional): the branch of the rs.tdwg.org repository where the metadata are located. The default is `master`.
20 |
21 | ## Generating the taxonCompletenessReported CV document
22 |
23 | As with the List of Terms document, the `update_previous_doc.py` script must be run prior to generating a production version of the document in order to update headers and preserve the previous version of the document. In this case the command line arguments are:
24 |
25 | `--slug` (required): `tcr`.
26 |
27 | `--dir` (required): `dwc_doc_tcr`. 
28 |
29 | `--branch` (optional): default is `master`.
30 |
31 | The vocabulary and document metadata in the rs.tdwg.org repository must also have been updated before running the `tcr_build.py` script.
32 |
33 | Command line arguments for `tcr_build.py` are:
34 |
35 | `--branch` (optional): the branch of the rs.tdwg.org repository where the metadata are located. The default is `master`.
36 |
37 | The header template is in the file `dwc_doc_tcr/termlist-header.md`. The header metadata are inserted from metadata in the rs.tdwg.org repository. The term list itself is generated from CSV metadata uploaded to the rs.tdwg.org repository. The output file is written to `docs/tcr/index.md`.
38 |
39 |
40 | ## Generating the additional standards documents from their templates
41 |
42 | The script `build_other_doc_header.py` retrieves document and author metadata from the rs.tdwg.org repo (as with the List of Terms document) and inserts them into the header of the document template, which is stored in a subdirectory whose name parallels the permanent IRI of the document (e.g. for `http://rs.tdwg.org/dwc/doc/hierarchy/`, the directory is `dwc_doc_hierarchy`; see the [example document template](https://github.com/tdwg/hc/blob/main/build/dwc_doc_hierarchy/index.md)). Unlike the List of Terms document, the document template is largely hand-edited (except for the header). If the document content is to be updated, it must be edited in the template file and the production doc regenerated by this script.
43 |
44 | Command line options are:
45 |
46 | `--slug` (required): the last part of the document URL before the trailing slash. For example, the slug for `http://rs.tdwg.org/dwc/doc/hierarchy/` is `hierarchy`.
47 |
48 | `--branch` (optional): the branch of the rs.tdwg.org repository where the metadata are located. The default is `master`.
49 |
50 | Once the production document is generated in the `docs` directory, check the diff of the production document to make sure it makes sense. 
([example production doc](https://github.com/tdwg/hc/blob/main/docs/hierarchy/index.md)) Then push the change to GitHub. If the main branch is being used, it will take some time for GitHub Pages to rebuild the site. When that is done, the TDWG styling will be applied to the production page.
51 |
52 | ## Generating the "normative document" (term versions CSV file)
53 |
54 | The script `generate_term_versions.py` pulls source data from the [rs.tdwg.org](http://github.com/tdwg/rs.tdwg.org) repository. The local file `qrg-list.csv` contains a list of the term IRIs in the order that they are to appear in the Quick Reference Guide. This list needs to be changed whenever terms are added to or deprecated from Darwin Core.
55 |
56 | It generates the file `term_versions.csv`, which is used as the input for the `build.py` script below.
57 |
58 | NOTE: the branch of rs.tdwg.org is hard-coded as `master`. If updates are made using a different branch, this will need to be changed. It would probably be a good idea to make this a command line option by copying the code from `build_other_doc_header.py`.
59 |
60 | ## Build script for the eco Quick Reference Guide
61 |
62 | The build script `build.py` uses as input:
63 |
64 | * [vocabulary/term_versions.csv](../vocabulary/term_versions.csv): the list of terms
65 | * [terms.tmpl](terms.tmpl): a Jinja2 template for the quick reference guide
66 |
67 | And creates:
68 |
69 | * The quick reference guide, a Markdown file at [docs/terms/index.md](../docs/terms/index.md). The guide is built as Markdown (with a lot of included HTML) rather than HTML, so it can be incorporated by Jekyll in the Darwin Core website (including a header, footer and table of contents).
70 | * Two simple Darwin Core CSV files in [dist/](../dist/)
71 |
72 | **Run the build script**
73 |
74 | 1. Install the required libraries (once):
75 |
76 |     ```bash
77 |     pip install -r requirements.txt
78 |     ```
79 |
80 | 2. 
Run the script from the command line:
81 |
82 |     ```bash
83 |     python build.py
84 |     ```
85 |
--------------------------------------------------------------------------------
/build/build-termlist.py:
--------------------------------------------------------------------------------
1 | # Script to build Markdown pages that provide term metadata for complex vocabularies
2 | # Steve Baskauf 2020-06-28 CC0
3 | # Modified for use with Humboldt Extension 2022-05-29
4 | # This script merges static Markdown header and footer documents with term information tables (in Markdown) generated from data in the rs.tdwg.org repo from the TDWG Github site
5 |
6 | import re
7 | import requests # best library to manage HTTP transactions
8 | import csv # library to read/write/parse CSV files
9 | import json # library to convert JSON to Python data structures
10 | import pandas as pd
11 | import yaml
12 | import sys
13 |
14 | # -----------------
15 | # Command line arguments
16 | # -----------------
17 |
18 | arg_vals = sys.argv[1:]
19 | opts = [opt for opt in arg_vals if opt.startswith('-')]
20 | args = [arg for arg in arg_vals if not arg.startswith('-')]
21 |
22 | # "master" for production, something else for development
23 | # Example: First part of branch URL is "https://raw.githubusercontent.com/tdwg/rs.tdwg.org/eco/", branch is "eco".
24 | if '--branch' in opts:
25 |     github_branch = args[opts.index('--branch')]
26 | else:
27 |     github_branch = 'master'
28 |
29 | # -----------------
30 | # Configuration section
31 | # -----------------
32 |
33 | # This is the base URL for raw files from the branch of the repo that has been pushed to GitHub
34 | githubBaseUri = 'https://raw.githubusercontent.com/tdwg/rs.tdwg.org/' + github_branch + '/'
35 |
36 | headerFileName = 'termlist-header.md'
37 | footerFileName = 'termlist-footer.md'
38 | outFileName = '../docs/list/index.md'
39 |
40 | # This is a Python list of the database names of the term lists to be included in the document. 
41 | termLists = ['humboldt', 'humboldt_iri']
42 |
43 | # If this list of terms is for terms in a single namespace, set the value of has_namespace to True. The value
44 | # of has_namespace should be False for a list of terms that contains multiple namespaces.
45 | has_namespace = False
46 |
47 | # NOTE! There may be problems unless every term list is of the same vocabulary type since the number of columns will differ
48 | # However, there probably aren't any circumstances where mixed types will be used to generate the same page.
49 | vocab_type = 1 # 1 is simple vocabulary, 2 is simple controlled vocabulary, 3 is c.v. with broader hierarchy
50 |
51 | # Terms in large vocabularies like Darwin and Audubon Cores may be organized into categories using tdwgutility_organizedInClass
52 | # If so, those categories can be used to group terms in the generated term list document.
53 | organized_in_categories = True
54 |
55 | # If organized in categories, the display_order list must contain the IRIs that are values of tdwgutility_organizedInClass
56 | # If not organized into categories, the value is irrelevant. There just needs to be one item in the list.
57 |
58 | display_order = [ 'http://rs.tdwg.org/dwc/terms/Event', 'http://rs.tdwg.org/dwc/terms/attributes/UseWithIRI']
59 | display_label = ['Literal-value terms', 'IRI-value terms']
60 | display_comments = ['','']
61 | display_id = ['event', 'use_with_iri']
62 |
63 | # ---------------
64 | # Load header data
65 | # ---------------
66 |
67 | config_file_path = 'process/document_metadata_processing/dwc_doc_eco/'
68 | contributors_yaml_file = 'authors_configuration.yaml'
69 | document_configuration_yaml_file = 'document_configuration.yaml'
70 |
71 | if has_namespace:
72 |     # Load the data about the namespace from term lists metadata at rs.tdwg.org
73 |     term_lists_df = pd.read_csv(githubBaseUri + 'term-lists/term-lists.csv')
74 |     # Find the row in the term-lists.csv file that corresponds to the database. 
75 |     term_list_row = term_lists_df.loc[term_lists_df['database'] == termLists[0]]
76 |     # Extract the namespace IRI and preferred namespace prefix from the row.
77 |     namespace_uri = term_list_row['vann_preferredNamespaceUri'].values[0]
78 |     pref_namespace_prefix = term_list_row['vann_preferredNamespacePrefix'].values[0]
79 |
80 | '''
81 |
82 | metadata_config_text = requests.get(githubBaseUri + config_file_path + 'config.yaml').text
83 | metadata_config = yaml.load(metadata_config_text, Loader=yaml.FullLoader)
84 | namespace_uri = metadata_config['namespaces'][0]['namespace_uri']
85 | pref_namespace_prefix = metadata_config['namespaces'][0]['pref_namespace_prefix']
86 | '''
87 |
88 | # Load the contributors YAML file from its GitHub URL
89 | contributors_yaml_url = githubBaseUri + config_file_path + contributors_yaml_file
90 | contributors_yaml = requests.get(contributors_yaml_url).text
91 | if contributors_yaml == '404: Not Found':
92 |     print('Contributors YAML file not found. Check the URL.')
93 |     print(contributors_yaml_url)
94 |     exit()
95 | contributors_yaml = yaml.load(contributors_yaml, Loader=yaml.FullLoader)
96 |
97 | # Load the document configuration YAML file from its GitHub URL
98 | document_configuration_yaml_url = githubBaseUri + config_file_path + document_configuration_yaml_file
99 | document_configuration_yaml = requests.get(document_configuration_yaml_url).text
100 | document_configuration_yaml = yaml.load(document_configuration_yaml, Loader=yaml.FullLoader)
101 |
102 | # ---------------
103 | # Function definitions
104 | # ---------------
105 |
106 | # replace URL with link
107 | #
108 | def createLinks(text):
109 |     def repl(match):
110 |         if match.group(1)[-1] == '.':
111 |             return '<a href="' + match.group(1)[:-1] + '">' + match.group(1)[:-1] + '</a>.'
112 |         return '<a href="' + match.group(1) + '">' + match.group(1) + '</a>'
113 |
114 |     pattern = '(https?://[^\s,;\)"]*)'
115 |     result = re.sub(pattern, repl, text)
116 |     return result
117 |
118 | # 2021-08-06 Replace the createLinks() function with functions copied from the QRG build script written by S. Van Hoey
119 | def convert_code(text_with_backticks):
120 |     """Takes all back-quoted sections in a text field and converts it to
121 |     the html tagged version of code blocks <code>...</code>
122 |     """
123 |     return re.sub(r'`([^`]*)`', r'<code>\1</code>', text_with_backticks)
124 |
125 | def convert_link(text_with_urls):
126 |     """Takes all links in a text field and converts it to the html tagged
127 |     version of the link
128 |     """
129 |     def _handle_matched(inputstring):
130 |         """quick hack version of url handling on the current prime versions data"""
131 |         url = inputstring.group()
132 |         return "<a href=\"{}\">{}</a>".format(url, url)
133 |
134 |     regx = "(http[s]?://[\w\d:#@%/;$()~_?\+-;=\\\.&]*)(?<![\)\.,])"
135 |     return re.sub(regx, _handle_matched, text_with_urls)
136 |
137 | # Hack the code taken from the terms.tmpl template to insert the HTML necessary to make
138 | # the semicolon-separated examples into a list.
139 | # {% if examples | length == 1 %}{{ examples | first }}{% else %}<ul>{% for example in examples %}
140 | #   <li>{{ example }}</li>{% endfor %}</ul>{% endif %}
141 | def convert_examples(text_with_list_of_examples: str) -> str:
142 |     examples_list = text_with_list_of_examples.split('; ')
143 |     if len(examples_list) == 1:
144 |         return examples_list[0]
145 |     else:
146 |         output = '<ul>\n'
147 |         for example in examples_list:
148 |             output += '  <li>' + example + '</li>\n'
149 |         output += '</ul>'
150 |         return output
151 |
152 | print('Retrieving term list metadata from GitHub')
153 | term_lists_info = []
154 |
155 | frame = pd.read_csv(githubBaseUri + 'term-lists/term-lists.csv', na_filter=False)
156 | for termList in termLists:
157 |     term_list_dict = {'list_iri': termList}
158 |     term_list_dict = {'database': termList}
159 |     for index,row in frame.iterrows():
160 |         if row['database'] == termList:
161 |             term_list_dict['pref_ns_prefix'] = row['vann_preferredNamespacePrefix']
162 |             term_list_dict['pref_ns_uri'] = row['vann_preferredNamespaceUri']
163 |             term_list_dict['list_iri'] = row['list']
164 |     term_lists_info.append(term_list_dict)
165 | #print(term_lists_info)
166 |
167 | # Create column list
168 | column_list = ['pref_ns_prefix', 'pref_ns_uri', 'term_localName', 'label', 'rdfs_comment', 'dcterms_description', 'examples', 'term_modified', 'term_deprecated', 'rdf_type', 'tdwgutility_abcdEquivalence', 'replaces_term', 'replaces1_term']
169 | if vocab_type == 2:
170 |     column_list += ['controlled_value_string']
171 | elif vocab_type == 3:
172 |     column_list += ['controlled_value_string', 'skos_broader']
173 | if organized_in_categories:
174 |     column_list.append('tdwgutility_organizedInClass')
175 | column_list.append('version_iri')
176 |
177 | print('Retrieving metadata about terms from all namespaces from GitHub')
178 | # Create list of lists metadata table
179 | table_list = []
180 | for term_list in term_lists_info:
181 |     # retrieve versions metadata for term list
182 |     versions_url = githubBaseUri + term_list['database'] + '-versions/' + term_list['database'] + '-versions.csv'
183 |     versions_df = pd.read_csv(versions_url, na_filter=False)
184 |
185 |     # retrieve current term metadata for term list
186 |     data_url = githubBaseUri + term_list['database'] + '/' + term_list['database'] + '.csv'
187 |     frame = pd.read_csv(data_url, na_filter=False)
188 |     for index,row in frame.iterrows():
189 |         row_list = [term_list['pref_ns_prefix'], 
term_list['pref_ns_uri'], row['term_localName'], row['label'], row['rdfs_comment'], row['dcterms_description'], row['examples'], row['term_modified'], row['term_deprecated'], row['rdf_type'], row['tdwgutility_abcdEquivalence'], row['replaces_term'], row['replaces1_term']]
190 |         #row_list = [term_list['pref_ns_prefix'], term_list['pref_ns_uri'], row['term_localName'], row['label'], row['definition'], row['usage'], row['notes'], row['term_modified'], row['term_deprecated'], row['type']]
191 |         if vocab_type == 2:
192 |             row_list += [row['controlled_value_string']]
193 |         elif vocab_type == 3:
194 |             if row['skos_broader'] =='':
195 |                 row_list += [row['controlled_value_string'], '']
196 |             else:
197 |                 row_list += [row['controlled_value_string'], term_list['pref_ns_prefix'] + ':' + row['skos_broader']]
198 |         if organized_in_categories:
199 |             # Hack on 2024-03-27 to make the ecoiri: terms be in separate sections
200 |             if term_list['list_iri'] == 'http://rs.tdwg.org/eco/iri/':
201 |                 row_list.append('http://rs.tdwg.org/dwc/terms/attributes/UseWithIRI')
202 |             else:
203 |                 row_list.append('http://rs.tdwg.org/dwc/terms/Event')
204 |             #row_list.append(row['tdwgutility_organizedInClass'])
205 |
206 |         # Borrowed terms really don't have implemented versions. They may be lacking values for version_status.
207 |         # In their case, their version IRI will be omitted. 
208 |         found = False
209 |         for vindex, vrow in versions_df.iterrows():
210 |             if vrow['term_localName']==row['term_localName'] and vrow['version_status']=='recommended':
211 |                 found = True
212 |                 version_iri = vrow['version']
213 |                 # NOTE: the current hack for non-TDWG terms without a version is to append # to the end of the term IRI
214 |                 if version_iri[len(version_iri)-1] == '#':
215 |                     version_iri = ''
216 |         if not found:
217 |             version_iri = ''
218 |         row_list.append(version_iri)
219 |
220 |         table_list.append(row_list)
221 |
222 | print('processing data')
223 | # Turn list of lists into dataframe
224 | terms_df = pd.DataFrame(table_list, columns = column_list)
225 |
226 | terms_sorted_by_label = terms_df.sort_values(by='label')
227 | #terms_sorted_by_localname = terms_df.sort_values(by='term_localName')
228 |
229 | # This makes sort case insensitive
230 | terms_sorted_by_localname = terms_df.iloc[terms_df.term_localName.str.lower().argsort()]
231 | #terms_sorted_by_localname
232 | print('done retrieving')
233 | print()
234 |
235 | print('Generating term index by CURIE')
236 | text = '### 3.1 Index By Term Name\n\n'
237 | text += '(See also [3.2 Index By Label](#32-index-by-label))\n\n'
238 |
239 | #text += '**Classes**\n'
240 | #text += '\n'
241 | #for row_index,row in terms_sorted_by_localname.iterrows():
242 | #    if row['rdf_type'] == 'http://www.w3.org/2000/01/rdf-schema#Class':
243 | #        curie = row['pref_ns_prefix'] + ":" + row['term_localName']
244 | #        curie_anchor = curie.replace(':','_')
245 | #        text += '[' + curie + '](#' + curie_anchor + ') |\n'
246 | #text = text[:len(text)-2] # remove final trailing vertical bar and newline
247 | #text += '\n\n' # put back removed newline
248 |
249 | for category in range(0,len(display_order)):
250 |     text += '**' + display_label[category] + '**\n'
251 |     text += '\n'
252 |     if organized_in_categories:
253 |         filtered_table = 
terms_sorted_by_localname[terms_sorted_by_localname['tdwgutility_organizedInClass']==display_order[category]]
254 |         filtered_table.reset_index(drop=True, inplace=True)
255 |     else:
256 |         filtered_table = terms_sorted_by_localname
257 |
258 |     for row_index,row in filtered_table.iterrows():
259 |         if row['rdf_type'] != 'http://www.w3.org/2000/01/rdf-schema#Class':
260 |             curie = row['pref_ns_prefix'] + ":" + row['term_localName']
261 |             curie_anchor = curie.replace(':','_')
262 |             text += '[' + curie + '](#' + curie_anchor + ') |\n'
263 |     text = text[:len(text)-2] # remove final trailing vertical bar and newline
264 |     text += '\n\n' # put back removed newline
265 |
266 | index_by_name = text
267 |
268 | #print(index_by_name)
269 |
270 | print('Generating term index by label')
271 | text = '\n\n'
272 |
273 | # Comment out the following two lines if there is no index by local names
274 | text = '### 3.2 Index By Label\n\n'
275 | text += '(See also [3.1 Index By Term Name](#31-index-by-term-name))\n\n'
276 |
277 | #text += '**Classes**\n'
278 | #text += '\n'
279 | #for row_index,row in terms_sorted_by_label.iterrows():
280 | #    if row['rdf_type'] == 'http://www.w3.org/2000/01/rdf-schema#Class':
281 | #        curie_anchor = row['pref_ns_prefix'] + "_" + row['term_localName']
282 | #        text += '[' + row['label'] + '](#' + curie_anchor + ') |\n'
283 | #text = text[:len(text)-2] # remove final trailing vertical bar and newline
284 | #text += '\n\n' # put back removed newline
285 |
286 | for category in range(0,len(display_order)):
287 |     if organized_in_categories:
288 |         text += '**' + display_label[category] + '**\n'
289 |         text += '\n'
290 |         filtered_table = terms_sorted_by_label[terms_sorted_by_label['tdwgutility_organizedInClass']==display_order[category]]
291 |         filtered_table.reset_index(drop=True, inplace=True)
292 |     else:
293 |         filtered_table = terms_sorted_by_label
294 |
295 |     for row_index,row in filtered_table.iterrows():
296 |         if row_index == 0 or (row_index != 0 and 
row['label'] != filtered_table.iloc[row_index - 1].loc['label']): # this is a hack to prevent duplicate labels
297 |             if row['rdf_type'] != 'http://www.w3.org/2000/01/rdf-schema#Class':
298 |                 curie_anchor = row['pref_ns_prefix'] + "_" + row['term_localName']
299 |                 text += '[' + row['label'] + '](#' + curie_anchor + ') |\n'
300 |     text = text[:len(text)-2] # remove final trailing vertical bar and newline
301 |     text += '\n\n' # put back removed newline
302 |
303 | index_by_label = text
304 |
305 | #print(index_by_label)
306 |
307 | decisions_df = pd.read_csv('https://raw.githubusercontent.com/tdwg/rs.tdwg.org/master/decisions/decisions-links.csv', na_filter=False)
308 |
309 | # ---------------
310 | # generate a table for each term, with terms grouped by category
311 | # ---------------
312 |
313 | print('Generating terms table')
314 | # generate the Markdown for the terms table
315 | text = '## 4 Vocabulary\n'
316 | if True:
317 |     filtered_table = terms_sorted_by_localname
318 |
319 | #for category in range(0,len(display_order)):
320 | #    if organized_in_categories:
321 | #        text += '### 4.' + str(category + 1) + ' ' + display_label[category] + '\n'
322 | #        text += '\n'
323 | #        text += display_comments[category] # insert the comments for the category, if any. 
324 | # filtered_table = terms_sorted_by_localname[terms_sorted_by_localname['tdwgutility_organizedInClass']==display_order[category]] 325 | # filtered_table.reset_index(drop=True, inplace=True) 326 | # else: 327 | # filtered_table = terms_sorted_by_localname 328 | 329 | for row_index,row in filtered_table.iterrows(): 330 | text += '\n' 331 | curie = row['pref_ns_prefix'] + ":" + row['term_localName'] 332 | curieAnchor = curie.replace(':','_') 333 | text += '\t\n' 334 | text += '\t\t\n' 335 | text += '\t\t\t\n' 336 | text += '\t\t\n' 337 | text += '\t\n' 338 | text += '\t\n' 339 | text += '\t\t\n' 340 | text += '\t\t\t\n' 341 | uri = row['pref_ns_uri'] + row['term_localName'] 342 | text += '\t\t\t\n' 343 | text += '\t\t\n' 344 | text += '\t\t\n' 345 | text += '\t\t\t\n' 346 | text += '\t\t\t\n' 347 | text += '\t\t\n' 348 | 349 | if row['version_iri'] != '': 350 | text += '\t\t\n' 351 | text += '\t\t\t\n' 352 | text += '\t\t\t\n' 353 | text += '\t\t\n' 354 | 355 | text += '\t\t\n' 356 | text += '\t\t\t\n' 357 | text += '\t\t\t\n' 358 | text += '\t\t\n' 359 | 360 | if row['term_deprecated'] != '': 361 | text += '\t\t\n' 362 | text += '\t\t\t\n' 363 | text += '\t\t\t\n' 364 | text += '\t\t\n' 365 | 366 | for dep_index,dep_row in filtered_table.iterrows(): 367 | if dep_row['replaces_term'] == uri: 368 | text += '\t\t\n' 369 | text += '\t\t\t\n' 370 | text += '\t\t\t\n' 371 | text += '\t\t\n' 372 | if dep_row['replaces1_term'] == uri: 373 | text += '\t\t\n' 374 | text += '\t\t\t\n' 375 | text += '\t\t\t\n' 376 | text += '\t\t\n' 377 | 378 | text += '\t\t\n' 379 | text += '\t\t\t\n' 380 | text += '\t\t\t\n' 381 | #text += '\t\t\t\n' 382 | text += '\t\t\n' 383 | 384 | if row['dcterms_description'] != '': 385 | #if row['notes'] != '': 386 | text += '\t\t\n' 387 | text += '\t\t\t\n' 388 | text += '\t\t\t\n' 389 | #text += '\t\t\t\n' 390 | text += '\t\t\n' 391 | 392 | if row['examples'] != '': 393 | #if row['usage'] != '': 394 | text += '\t\t\n' 395 | text += '\t\t\t\n' 396 
| text += '\t\t\t\n' 397 | #text += '\t\t\t\n' 398 | text += '\t\t\n' 399 | 400 | if row['tdwgutility_abcdEquivalence'] != '': 401 | #if row['usage'] != '': 402 | text += '\t\t\n' 403 | text += '\t\t\t\n' 404 | text += '\t\t\t\n' 405 | text += '\t\t\n' 406 | 407 | if vocab_type == 2 or vocab_type ==3: # controlled vocabulary 408 | text += '\t\t\n' 409 | text += '\t\t\t\n' 410 | text += '\t\t\t\n' 411 | text += '\t\t\n' 412 | 413 | if vocab_type == 3 and row['skos_broader'] != '': # controlled vocabulary with skos:broader relationships 414 | text += '\t\t\n' 415 | text += '\t\t\t\n' 416 | curieAnchor = row['skos_broader'].replace(':','_') 417 | text += '\t\t\t\n' 418 | text += '\t\t\n' 419 | 420 | text += '\t\t\n' 421 | text += '\t\t\t\n' 422 | if row['rdf_type'] == 'http://www.w3.org/1999/02/22-rdf-syntax-ns#Property': 423 | #if row['type'] == 'http://www.w3.org/1999/02/22-rdf-syntax-ns#Property': 424 | text += '\t\t\t\n' 425 | elif row['rdf_type'] == 'http://www.w3.org/2000/01/rdf-schema#Class': 426 | #elif row['type'] == 'http://www.w3.org/2000/01/rdf-schema#Class': 427 | text += '\t\t\t\n' 428 | elif row['rdf_type'] == 'http://www.w3.org/2004/02/skos/core#Concept': 429 | #elif row['type'] == 'http://www.w3.org/2004/02/skos/core#Concept': 430 | text += '\t\t\t\n' 431 | else: 432 | text += '\t\t\t\n' # this should rarely happen 433 | #text += '\t\t\t\n' # this should rarely happen 434 | text += '\t\t\n' 435 | 436 | # Look up decisions related to this term 437 | for drow_index,drow in decisions_df.iterrows(): 438 | if drow['linked_affected_resource'] == uri: 439 | text += '\t\t\n' 440 | text += '\t\t\t\n' 441 | text += '\t\t\t\n' 442 | text += '\t\t\n' 443 | 444 | text += '\t\n' 445 | text += '
    Term Name ' + curie + '
    Term IRI' + uri + '
    Modified' + row['term_modified'] + '
    Term version IRI' + row['version_iri'] + '
    Label' + row['label'] + '
    This term is deprecated and should no longer be used.
    Is replaced by' + dep_row['pref_ns_uri'] + dep_row['term_localName'] + '
    Is replaced by' + dep_row['pref_ns_uri'] + dep_row['term_localName'] + '
    Definition' + row['rdfs_comment'] + '' + row['definition'] + '
    Notes' + convert_link(convert_code(row['dcterms_description'])) + '' + createLinks(row['notes']) + '
    Examples' + convert_examples(convert_link(convert_code(row['examples']))) + '' + createLinks(row['usage']) + '
    ABCD equivalence' + convert_link(convert_code(row['tdwgutility_abcdEquivalence'])) + '
    Controlled value' + row['controlled_value_string'] + '
    Has broader concept' + row['skos_broader'] + '
    TypePropertyClassConcept' + row['rdf_type'] + '' + row['type'] + '
    Executive Committee decisionhttp://rs.tdwg.org/decisions/' + drow['decision_localName'] + '
    \n' 446 | text += '\n' 447 | text += '\n' 448 | term_table = text 449 | print('done generating') 450 | print() 451 | 452 | #print(term_table) 453 | 454 | print('Merging term table with header and footer and saving file') 455 | #text = index_by_label + term_table 456 | text = index_by_name + index_by_label + term_table 457 | 458 | # read in header and footer, merge with terms table, and output 459 | 460 | headerObject = open(headerFileName, 'rt', encoding='utf-8') 461 | header = headerObject.read() 462 | headerObject.close() 463 | 464 | # Build the Markdown for the contributors list 465 | contributors = '' 466 | for contributor in contributors_yaml: 467 | contributors += '[' + contributor['contributor_literal'] + '](' + contributor['contributor_iri'] + ') ' 468 | contributors += '([' + contributor['affiliation'] + '](' + contributor['affiliation_uri'] + ')), ' 469 | contributors = contributors[:-2] # Remove the last comma and space 470 | 471 | # Substitute values of ratification_date and contributors into the header template 472 | header = header.replace('{document_title}', document_configuration_yaml['documentTitle']) 473 | header = header.replace('{ratification_date}', document_configuration_yaml['doc_modified']) 474 | header = header.replace('{created_date}', document_configuration_yaml['doc_created']) 475 | header = header.replace('{contributors}', contributors) 476 | header = header.replace('{standard_iri}', document_configuration_yaml['dcterms_isPartOf']) 477 | header = header.replace('{current_iri}', document_configuration_yaml['current_iri']) 478 | header = header.replace('{abstract}', document_configuration_yaml['abstract']) 479 | header = header.replace('{creator}', document_configuration_yaml['creator']) 480 | header = header.replace('{publisher}', document_configuration_yaml['publisher']) 481 | year = document_configuration_yaml['doc_modified'].split('-')[0] 482 | header = header.replace('{year}', year) 483 | if has_namespace: 484 | header = 
header.replace('{namespace_uri}', namespace_uri) 485 | header = header.replace('{pref_namespace_prefix}', pref_namespace_prefix) 486 | 487 | # Determine whether there was a previous version of the document. 488 | if document_configuration_yaml['doc_created'] != document_configuration_yaml['doc_modified']: 489 | # Load versions list from document versions data in the rs.tdwg.org repo and find most recent version. 490 | versions_data_url = githubBaseUri + 'docs/docs-versions.csv' 491 | versions_list_df = pd.read_csv(versions_data_url, na_filter=False) 492 | # Slice all rows for versions of this document. 493 | matching_versions = versions_list_df[versions_list_df['current_iri']==document_configuration_yaml['current_iri']] 494 | # Sort the matching versions by version IRI in descending order so that the most recent version is first. 495 | matching_versions = matching_versions.sort_values(by=['version_iri'], ascending=[False]) 496 | # The previous version is the second row in the dataframe (row 1). 497 | # The version IRI is in the second column (column 1). 498 | most_recent_version_iri = matching_versions.iat[1, 1] 499 | #print(most_recent_version_iri) 500 | 501 | # Insert the previous version information into the header 502 | previous_version_metadata_string = '''Previous version 503 | : <''' + most_recent_version_iri + '''> 504 | 505 | ''' 506 | # Insert the previous version information into the designated slot. 507 | header = header.replace('{previous_version_slot}\n\n', previous_version_metadata_string) 508 | else: 509 | # If there was no previous version, remove the slot from the header. 
510 | header = header.replace('{previous_version_slot}\n\n', '') 511 | 512 | footerObject = open(footerFileName, 'rt', encoding='utf-8') 513 | footer = footerObject.read() 514 | footerObject.close() 515 | 516 | output = header + text + footer 517 | outputObject = open(outFileName, 'wt', encoding='utf-8') 518 | outputObject.write(output) 519 | outputObject.close() 520 | 521 | print('done') 522 | -------------------------------------------------------------------------------- /build/build.py: -------------------------------------------------------------------------------- 1 | # 2 | # S. Van Hoey, John Wieczorek 3 | # 4 | # Build script for document handling 5 | # Based on https://github.com/tdwg/dwc/blob/master/build/build.py 6 | # 7 | 8 | import io 9 | import os 10 | import re 11 | import csv 12 | import sys 13 | import codecs 14 | 15 | from urllib import request 16 | from jinja2 import FileSystemLoader, Environment 17 | 18 | NAMESPACES = { 19 | 'http://rs.tdwg.org/eco/iri/' : 'ecoiri', 20 | 'http://rs.tdwg.org/eco/terms/' : 'eco', 21 | 'http://rs.tdwg.org/dwc/terms/attributes/' : 'tdwgutility'} 22 | 23 | class DwcNamespaceError(Exception): 24 | """Namespace link is not available in the currently provided links""" 25 | pass 26 | 27 | class DwcBuildReader(): 28 | 29 | def __init__(self, dwc_build_file): 30 | """Custom Reader switching between raw Github or local file""" 31 | self.dwc_build_file = dwc_build_file 32 | 33 | def __enter__(self): 34 | if "https://raw.github" in self.dwc_build_file: 35 | self.open_dwc_term = request.urlopen(self.dwc_build_file) 36 | else: 37 | self.open_dwc_term = open(self.dwc_build_file, 'rb') 38 | return self.open_dwc_term 39 | 40 | def __exit__(self, *args): 41 | self.open_dwc_term.close() 42 | 43 | class DwcDigester(object): 44 | 45 | def __init__(self, term_versions, qrg_term_versions): 46 | """Digest the term and qrg documents of Darwin Core to support automatic 47 | generation of derivatives 48 | 49 | Parameters 50 | ----------- 51 
| term_versions : str 52 | Either a relative path and filename of the normative Dwc document 53 | or a URL link to the raw Github version of the file 54 | qrg_term_versions : str 55 | Either a relative path and filename of the Quick Reference Guide term order 56 | document or a URL link to the raw Github version of the file 57 | 58 | Notes 59 | ----- 60 | Remark that the sequence of the term versions entries is 61 | essential for the automatic generation of the individual documents 62 | (mainly the index.html) 63 | """ 64 | self.term_versions = term_versions 65 | self.qrg_term_versions = qrg_term_versions 66 | 67 | self.term_versions_data = {} 68 | self._store_versions() 69 | 70 | # create the defined data-object for the different outputs 71 | self.template_data = self.process_terms() 72 | self.properties = self.properties_list() 73 | 74 | def versions(self): 75 | """Iterator providing the terms as represented in the normative term 76 | versions file 77 | """ 78 | with DwcBuildReader(self.term_versions) as versions: 79 | for vterm in csv.DictReader(io.TextIOWrapper(versions), delimiter=','): 80 | if vterm["status"] == "recommended": 81 | yield vterm 82 | 83 | def _store_versions(self): 84 | """Collect all the versions data in a dictionary as the 85 | term_versions_data attribute 86 | """ 87 | for term in self.versions(): 88 | self.term_versions_data[term["term_iri"]] = term 89 | 90 | def _select_versions_term(self, term_iri): 91 | """Select a specific term of the versions data, using term_iri match 92 | """ 93 | return self.term_versions_data[term_iri] 94 | 95 | @staticmethod 96 | def split_iri(term_iri): 97 | """Split an iri field into the namespace url and the local name 98 | of the term 99 | """ 100 | prog = re.compile("(.*/)([^/]*$)") 101 | namespace, local_name = prog.findall(term_iri)[0] 102 | return namespace, local_name 103 | 104 | @staticmethod 105 | def resolve_namespace_abbrev(namespace): 106 | """Using the NAMESPACE constant, get the namespace 
abbreviation by 107 | providing the namespace link 108 | 109 | Parameters 110 | ----------- 111 | namespace : str 112 | valid key of the NAMESPACES variable 113 | """ 114 | if namespace not in NAMESPACES.keys(): 115 | print("namespace url: %s" % namespace) 116 | raise DwcNamespaceError("The namespace url is currently not supported in NAMESPACES") 117 | return NAMESPACES[namespace] 118 | 119 | def get_term_definition(self, term_iri): 120 | """Extract the required information from the terms table to show on 121 | the webpage of a single term by using the term_iri as the identifier 122 | 123 | Notes 124 | ------ 125 | Due to the current implementation, make sure to provide the same keys 126 | represented in the record-level specific version `process_terms` 127 | method (room for improvement) 128 | """ 129 | vs_term = self._select_versions_term(term_iri) 130 | 131 | term_data = {} 132 | term_data["label"] = vs_term['term_localName'] # See https://github.com/tdwg/dwc/issues/253#issuecomment-670098202 133 | term_data["iri"] = term_iri 134 | term_data["class"] = vs_term['organized_in'] 135 | term_data["definition"] = self.convert_link(vs_term['definition']) 136 | term_data["comments"] = self.convert_link(self.convert_code(vs_term['comments'])) 137 | term_data["examples"] = self.convert_link(self.convert_code(vs_term['examples'])) 138 | term_data["rdf_type"] = vs_term['rdf_type'] 139 | namespace_url, _ = self.split_iri(term_iri) 140 | term_data["namespace"] = self.resolve_namespace_abbrev(namespace_url) 141 | return term_data 142 | 143 | @staticmethod 144 | def convert_code(text_with_backticks): 145 | """Takes all back-quoted sections in a text field and converts it to 146 | the html tagged version of code blocks ...
147 | """ 148 | return re.sub(r'`([^`]*)`', r'<code>\1</code>', text_with_backticks) 149 | 150 | @staticmethod 151 | def convert_link(text_with_urls): 152 | """Takes all links in a text field and converts it to the html tagged 153 | version of the link 154 | """ 155 | def _handle_matched(inputstring): 156 | """quick hack version of url handling on the current prime versions data""" 157 | url = inputstring.group() 158 | return "<a href=\"{}\">{}</a>".format(url, url) 159 | 160 | regx = "(http[s]?://[\w\d:#@%/;$()~_?\+-;=\\\.&]*)(? 110 | 111 | ''' 112 | # Insert the previous version information into the designated slot. 113 | header = header.replace('{previous_version_slot}\n\n', previous_version_metadata_string) 114 | else: 115 | # If there was no previous version, remove the slot from the header. 116 | header = header.replace('{previous_version_slot}\n\n', '') 117 | 118 | outputObject = open(outFileName, 'wt', encoding='utf-8') 119 | outputObject.write(header) 120 | outputObject.close() 121 | 122 | print('done') 123 | -------------------------------------------------------------------------------- /build/dwc_doc_hierarchy/index.md: -------------------------------------------------------------------------------- 1 | # {document_title} 2 | 3 | Title 4 | : {document_title} 5 | 6 | Date version issued 7 | : {ratification_date} 8 | 9 | Date created 10 | : {created_date} 11 | 12 | Part of TDWG Standard 13 | : <{standard_iri}> 14 | 15 | This version 16 | : <{current_iri}{ratification_date}> 17 | 18 | Latest version 19 | : <{current_iri}> 20 | 21 | {previous_version_slot} 22 | 23 | Abstract 24 | : {abstract} 25 | 26 | Contributors 27 | : {contributors} 28 | 29 | Creator 30 | : {creator} 31 | 32 | Bibliographic citation 33 | : {creator}. {year}. {document_title}. {publisher}.
<{current_iri}{ratification_date}> 34 | 35 | 36 | ## 1 Introduction (non-normative) 37 | 38 | ### 1.1 Status of the content of this document 39 | 40 | Section 3 of this document is normative, serving as official guidelines 41 | in application of the Humboldt Extension. The other sections are 42 | non-normative and designed to help improve overall understanding in 43 | application and interpretation of the Extension. 44 | 45 | ### 1.2 RFC 2119 keywords 46 | 47 | 48 | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", 49 | "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to 50 | be interpreted as described in [BCP 14](https://datatracker.ietf.org/doc/html/bcp14) 51 | [[RFC2119]](https://datatracker.ietf.org/doc/html/rfc2119) 52 | [[RFC8174]](https://datatracker.ietf.org/doc/html/rfc8174) 53 | when, and only when, they are written in capitals (as shown here). 54 | 55 | ### 1.3 Namespaces and terminology 56 | 57 | The namespace `eco:` abbreviates terms minted for the Humboldt Extension 58 | for ecological inventories 59 | ([http://rs.tdwg.org/eco/terms/](http://rs.tdwg.org/eco/terms/)). 60 | `dwc:` abbreviates terms from the main Darwin Core vocabulary namespace 61 | ([http://rs.tdwg.org/dwc/terms/](http://rs.tdwg.org/dwc/terms/)). 62 | 63 | Words in `code markup` are term IRIs or literal values. The word 64 | "organism" is used colloquially and is not used in the technical sense 65 | of the dwc:Organism class, unless specifically presented as 66 | "dwc:Organism." The word "Event" is used in the technical sense of the 67 | dwc:Event class. "Humboldt Extension" is an abbreviation for the 68 | "Humboldt Extension for Ecological Inventories." 69 | 70 | ### 1.4 Intended audience and use for this document 71 | 72 | The information in this document is targeted at data providers, data 73 | aggregators, and data consumers.
*Data providers* are the individuals 74 | responsible for mapping ecological inventory data into an Event-based 75 | [Darwin Core 76 | Archive](https://ipt.gbif.org/manual/en/ipt/latest/dwca-guide) 77 | format that includes the Humboldt Extension. *Data aggregators* and 78 | *data consumers* can use this document to better understand the data 79 | shared by data providers, specifically with respect to the 80 | **relationships between hierarchical dwc:Event levels** and **when it is 81 | or is not appropriate to make inferences** about attributes such as 82 | abundance or absence of detection. 83 | 84 | 85 | ## 2 Rationale (non-normative) 86 | 87 | Ecological inventories in the context of Darwin Core can be considered 88 | as types of [dwc:Events](http://rs.tdwg.org/dwc/terms/Event) 89 | --- they are actions that occur at specific locations over defined 90 | periods of time. The terms in the Humboldt Extension are all properties 91 | of a dwc:Event. 92 | 93 | There are many types of ecological inventory, ranging from singular 94 | observations of individual taxa (1 event:1 observation; Example 1 in 95 | Figure 1) to highly structured and deeply nested observations within 96 | other observations (e.g., 1 event:2 sub-events, each sub-event:2 97 | sub-sub-events; Example 4 in Figure 1). The need for guidance on **how 98 | to capture the details of nested observations** (dwc:Event hierarchies) 99 | is the rationale for this document. Nested sampling designs can be 100 | translated into a relational database schema of parent-child dwc:Event 101 | relationships (a parent event with one or more child sub-events; Figure 102 | 1). This document describes the circumstances under which specific 103 | properties of parent and child dwc:Events SHOULD be populated based on 104 | the parent-child relationship. 105 | 106 | Note that the proposed structure for sharing ecological inventories does 107 | not follow typical database practice. 
Whilst a (relational) database 108 | would store information in multiple tables to avoid repetition of key 109 | information, datasets shared using the Darwin Core archive format and 110 | the Humboldt Extension instead use a "flattened" structure. In order to 111 | share inventory data such that no information is lost and no information 112 | is incorrectly inferred, one SHOULD **report all information at all 113 | applicable levels**. The rules for applicability and how to populate 114 | terms at parent and child levels in the dwc:Event hierarchy are captured 115 | in section *3.2 Guiding principles* and in section *3.3 Implementation principles*. 116 | 117 | 118 | ![Illustration of four examples of nested dwc:Events](fig1.png) 119 | 120 | **Figure 1.** Visual representation of an ecological inventory 121 | illustrating four examples of occurrence data associated with dwc:Events 122 | nested within parent dwc:Events, at varying levels of complexity ranging 123 | from low (Example 1) to high (Example 4). 124 | 125 | 126 | ## 3 Usage guidelines (normative) 127 | 128 | ### 3.1 Definitions 129 | 130 | **Inventory dataset** - An inventory (dataset) consists of one or more 131 | dwc:Events that MAY be related to each other in a hierarchy of parent 132 | and child dwc:Events. This is not new to the capabilities or intentions 133 | of Darwin Core. 134 | 135 | **Inventory hierarchy** - A set of related dwc:Events, in which a 136 | narrower dwc:Event (child) points to the related broader dwc:Event 137 | (parent) via the child's dwc:parentEventID. A higher-level dwc:Event 138 | generally contains information about the inventory design that applies 139 | to all of its children. 140 | 141 | **Parent dwc:Event** - A parent dwc:Event is any dwc:Event whose 142 | dwc:eventID is a dwc:parentEventID for at least one other dwc:Event 143 | (e.g. EVENT_01 in Figure 2). 
144 | 145 | **Child dwc:Event** - A child dwc:Event is any dwc:Event whose 146 | dwc:parentEventID is populated with the dwc:eventID of another dwc:Event 147 | (e.g. EVENT_02 or EVENT_03 in Figure 2). 148 | 149 | ![Visual representation of parent/child relationship](fig2.png) 150 | 151 | **Figure 2.** Visual representation of an inventory hierarchy 152 | illustrating parent-child dwc:Event relations. The higher-level (parent) 153 | dwc:Event, EVENT_01, may include general information about the inventory 154 | design. Species occurrences are captured for two child dwc:Events 155 | (EVENT_02 and EVENT_03). 156 | 157 | 158 | ## 3.2 Guiding principles 159 | 160 | 161 | ### 3.2.1 Principle of spatiotemporal coverage 162 | 163 | **A parent dwc:Event MUST encompass its child dwc:Events spatially 164 | and temporally.** Specifically, the spatial extent and temporal 165 | interval of a parent dwc:Event MUST contain the spatial extents and 166 | temporal intervals of all of its children. For example, if child 167 | dwc:Events took place in various locations throughout, and only within, 168 | Burundi, then the spatial extent of the parent dwc:Event would be 169 | Burundi. Similarly, if the child dwc:Events took place periodically 170 | throughout the year 2019, the temporal interval of the parent dwc:Event 171 | would begin when the earliest child dwc:Event began and end when the 172 | latest child dwc:Event ended. 173 | 174 | 175 | ### 3.2.2 Principle of applicability 176 | 177 | **Humboldt Extension terms SHOULD contain data explicitly at every level 178 | in the dwc:Event hierarchy to which they *directly* apply.** The value 179 | of a term for a dwc:Event SHOULD be populated for the Event itself 180 | rather than merely summarized in a higher-level dwc:Event. For example, 181 | a child dwc:Event (**C**) with multiple dwc:Occurrences, some of which 182 | resulted in voucher specimens, SHOULD possess a value of `true` for 183 | the term eco:hasVouchers. 
The data user SHOULD NOT be expected to look 184 | at the eco:hasVouchers term for the parent dwc:Event (**P**) of **C** in 185 | order to find the value. 186 | 187 | If a term genuinely applies at multiple levels of a dwc:Event 188 | hierarchy, values SHOULD be reported explicitly at *each* of those 189 | levels. The values for child dwc:Events might be the same as their 190 | parental values, or child dwc:Events might possess their own more 191 | specific values. This principle allows child dwc:Events to be 192 | "autonomous" to the greatest degree possible, and avoids uncertainty 193 | about where to look for the values of properties of any given dwc:Event. 194 | 195 | 196 | ### 3.2.3 Principle of non-derivation 197 | 198 | As a complement to the *Principle of applicability*, **Humboldt 199 | Extension terms SHOULD NOT be populated by deriving or summarizing 200 | information from child dwc:Events to their common parent dwc:Event**. If 201 | a term does not directly apply to a given level of dwc:Event (i.e., it 202 | is not an actual property of that dwc:Event), it SHOULD NOT be populated 203 | with a value. For example, if the parent dwc:Event **P** from the 204 | example in section *3.2.2* above is not directly linked to 205 | dwc:Occurrences, then the term eco:hasVouchers does not apply at that 206 | dwc:Event level and SHOULD be left unpopulated. Data providers SHOULD 207 | NOT construct a value for a parent dwc:Event from values at the level of 208 | child dwc:Events. 209 | 210 | In some cases, including the example above, it would not be valid to 211 | derive or summarize information from child dwc:Events to populate a 212 | parent dwc:Event. Suppose parent dwc:Event **P** has two child 213 | dwc:Events, one with eco:hasVouchers `true` and one with 214 | eco:hasVouchers `false`.
The value of eco:hasVouchers for **P** cannot 215 | be derived or summarized from its children, as it is neither `true` 216 | nor `false` for all of them (the only two values consistent with the 217 | recommended controlled vocabulary for the term). It would be neither 218 | desirable nor reliable to use the values of the child dwc:Events to 219 | infer a value for the parent dwc:Event. The *Principle of inference* 220 | (below) provides a further example, where *scope* terms of parent 221 | dwc:Events MUST NOT be populated by summarizing from lower levels 222 | (either through the scope values of child dwc:Events or, for example, 223 | through taxa detected in child dwc:Events). 224 | 225 | There are terms which could theoretically be populated for a parent 226 | dwc:Event from the primary data already provided for that dwc:Event\'s 227 | children (e.g., eco:materialSampleTypes). Populating the parent term 228 | could facilitate the discovery of higher-level dwc:Events among whose 229 | children there is a particular value of a property (e.g., a search 230 | through the highest-level dwc:Events in datasets to find datasets in 231 | which there are particular eco:materialSampleTypes). However, providing 232 | such summary values is specifically NOT RECOMMENDED. Doing so a\) adds no 233 | information to the dataset (the summary information is already available 234 | by inspecting the primary data in the dwc:Events in the dataset), b\) 235 | adds an extra burden of summary upon the data provider, and c\) is 236 | susceptible to errors (ambiguities, inconsistencies, incompleteness) 237 | when trying to construct secondary summary information for higher-level 238 | Events. 
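
As a hedged illustration of point a) above, the following sketch shows how a data user might compute such a summary on demand from the primary data, instead of relying on a provider-supplied value at the parent level. The record layout (plain dictionaries keyed by unprefixed term names), the `|` list separator, and the helper name `summarize_from_descendants` are assumptions made for this example only; none of them is defined by the Humboldt Extension.

```python
from collections import defaultdict


def summarize_from_descendants(events, root_event_id, term, separator="|"):
    """Collect the distinct values of `term` over all descendants of one dwc:Event.

    Hypothetical helper: `events` is a list of dicts with at least the keys
    "eventID" and "parentEventID"; multi-valued terms are assumed to be
    separated by `separator`.
    """
    # Index the children of every parent once, so the walk below is cheap.
    children = defaultdict(list)
    for event in events:
        parent_id = event.get("parentEventID", "")
        if parent_id:
            children[parent_id].append(event)

    values = set()
    seen = set()  # guards against accidental cycles in parentEventID
    stack = list(children[root_event_id])
    while stack:
        event = stack.pop()
        if event["eventID"] in seen:
            continue
        seen.add(event["eventID"])
        raw = event.get(term, "")
        values.update(v.strip() for v in raw.split(separator) if v.strip())
        stack.extend(children[event["eventID"]])
    return sorted(values)


# Illustrative data only; the parent deliberately leaves the term unpopulated.
events = [
    {"eventID": "EVENT_01", "parentEventID": "", "materialSampleTypes": ""},
    {"eventID": "EVENT_02", "parentEventID": "EVENT_01",
     "materialSampleTypes": "whole organism | tissue"},
    {"eventID": "EVENT_03", "parentEventID": "EVENT_01",
     "materialSampleTypes": "tissue"},
]
summary = summarize_from_descendants(events, "EVENT_01", "materialSampleTypes")
print(summary)
```

Because the summary is recomputed from the child dwc:Events each time, it cannot drift out of step with the primary data, which is the risk named in point c).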
239 | 240 | 241 | ### 3.2.4 Principle of inference 242 | 243 | **Certain terms in the Humboldt Extension support inferences.** Examples 244 | of terms that help data users to determine whether or not inferences can 245 | be made include those describing the *scope* of the inventory, such as 246 | eco:targetTaxonomicScope and eco:excludedTaxonomicScope, and terms 247 | describing *completeness*, such as eco:taxonCompletenessReported, 248 | eco:taxonCompletenessProtocols and eco:isTaxonomicScopeFullyReported. 249 | The values of these terms in a dwc:Event have implications for the 250 | interpretation of all of that dwc:Event's child dwc:Events. These terms 251 | MUST be populated for the highest level dwc:Event to which they apply, 252 | and all of its child dwc:Events. 253 | 254 | **The *scope* terms of a dwc:Event MUST be populated whenever the scope 255 | was in effect**. Having this information in a dwc:Event is the only way 256 | **to be able to infer absences of detection** within that dwc:Event, 257 | whenever the dwc:Occurrences linked to that dwc:Event do not explicitly 258 | state zero counts or when there are no dwc:Occurrence records for a 259 | given taxon that fell within the taxonomic scope (the combination of 260 | eco:targetTaxonomicScope and eco:excludedTaxonomicScope). The ability to 261 | "implicitly" support inferences about undetected dwc:Taxa (and other 262 | organismal targets) was a high priority objective in the design and 263 | structure of the Humboldt Extension. By "implicitly support 264 | inferences" we mean that a dwc:organismQuantity of zero individuals 265 | within a particular scope does not need to be provided explicitly as a 266 | separate dwc:Occurrence record, for a dwc:Event that does declare an 267 | encompassing scope and where all the taxa/targets that *were* detected 268 | were fully reported. Instead, those zero counts can be reconstituted by 269 | data users based on the data contained in other terms. 
When the target 270 | taxonomic scope (the combination of eco:targetTaxonomicScope and 271 | eco:excludedTaxonomicScope) is determined in advance of inventory data 272 | collection, and eco:isTaxonomicScopeFullyReported = `true`, then all 273 | dwc:Taxa that fall within the taxonomic scope but are not reported in 274 | the dwc:Occurrences of any child dwc:Events **can be inferred to be 275 | dwc:Occurrences with a dwc:organismQuantity of zero** (i.e., undetected 276 | dwc:Taxa). 277 | 278 | These inferred zero counts, in combination with information about 279 | sampling effort (i.e., eco:samplingEffortProtocol, 280 | eco:samplingEffortValue and eco:samplingEffortUnit), can then be used to 281 | estimate the likelihood that a count of zero organisms represents a 282 | *true* absence of a dwc:Taxon. However, if eco:taxonCompletenessReported 283 | = `reported incomplete` and/or eco:isTaxonomicScopeFullyReported = 284 | `false` for a dwc:Event, then future users SHOULD NOT make assumptions 285 | about absences. 286 | 287 | Data providers **MUST NOT retrospectively infer and populate 288 | eco:targetTaxonomicScope, or other *scope* terms**, for inclusion in a 289 | dataset shared with the Humboldt Extension. This is a further example of 290 | the *Principle of non-derivation* (*3.2.3*). Likewise, data users SHOULD 291 | NOT assume or reconstruct a scope that was not explicitly given by the 292 | data provider. There are at least two reasons for this: (1) Artificial 293 | construction of scope: retrospective inference of target scope by a data 294 | provider by aggregating information across all child dwc:Events may 295 | result in a reported scope that is narrower than the actual intended 296 | scope of the inventory. (2) Artificial broadening of scope: it is 297 | possible that the inferred scope can be described in multiple ways. 
For 298 | example, the scope of a list of species within a single genus could be 299 | described as the genus, as the family containing that genus, or as an 300 | even broader taxonomic concept. Thus, unless the true taxonomic scope is 301 | a known variable in the inventory protocol, then a presumed scope may be 302 | too broad or too narrow, leading to errors when inferring counts of 303 | zero. 304 | 305 | 306 | ## 3.3 Implementation principles 307 | 308 | 1. A Darwin Core-based inventory dataset MUST consist of at least one 309 | dwc:Event record. 310 | 311 | 2. Each dwc:Event in an inventory dataset MUST have a non-empty value 312 | for dwc:eventID that is unique within the dataset. More benefits 313 | are realizable if the dwc:eventIDs are also globally unique. 314 | 315 | 3. Any association of a Humboldt Extension record with a dwc:Event 316 | record MUST be done via that dwc:Event\'s dwc:eventID; the 317 | associated records MUST use the same dwc:eventID. It is 318 | permissible to have dwc:Event records without associated Humboldt 319 | Extension records. 320 | 321 | 4. An inventory hierarchy MUST be realized by explicitly relating each 322 | child dwc:Event to a parent dwc:Event through the child 323 | dwc:Event's dwc:parentEventID. 324 | 325 | 5. Data providers SHOULD follow [Darwin Core principle 326 | 4](https://dwc.tdwg.org/simple/#5-are-there-any-rules-normative), 327 | which is to fill the values of as many terms as possible, subject 328 | to the *Principle of applicability* and the *Principle of 329 | non-derivation* (sections *3.2.2* and *3.2.3*, respectively). 330 | 331 | 6. A child dwc:Event MUST NOT be assumed to implicitly "inherit" the 332 | value of any property of any of its parent dwc:Events; rather, the 333 | value SHOULD be provided explicitly as discussed in section *3.2.2 334 | Principle of applicability*. 335 | 336 | 7. 
A parent dwc:Event term SHOULD NOT be populated by deriving or 337 | summarizing information from child dwc:Events; rather, the value 338 | SHOULD be provided explicitly if appropriate to the nature and 339 | level of the dwc:Event, as discussed in section *3.2.3 Principle of non-derivation*. 340 | 341 | 342 | ## 4 Examples (non-normative) 343 | 344 | ![Tables illustrating implementation principles](fig3.png) 345 | 346 | **Figure 3.** Example illustrating the [Implementation 347 | principles](#implementation). Numbering of colored 348 | rectangles indicates the relevant principle; lines, arrows or rectangles 349 | in the same color indicate that the cells, columns or records are 350 | affected by the principle. *Notolepis coatsi* and *Cranchiidae* are not 351 | within the reported eco:targetTaxonomicScope. Principle 1 - an inventory 352 | dataset must have at least one dwc:Event record; here, 3 records can be 353 | identified. Principle 2 - each dwc:Event record must have a unique 354 | dwc:eventID. Principle 3 - Humboldt Extension records must be linked to 355 | the core dwc:Events via shared dwc:eventIDs. Principle 4 - every child 356 | dwc:Event must be related to its parent dwc:Event through a 357 | dwc:parentEventID. Principle 5 - term values for dwc:Events should be 358 | populated whenever possible; in the figure all records follow Darwin 359 | Core principle 4, subject to the *Principle of applicability* and the 360 | *Principle of non-derivation*. Principle 6 - terms for child dwc:Events 361 | must be explicitly populated rather than "inheriting" values from 362 | their parent dwc:Events. Principle 7 - terms for parent dwc:Events 363 | should be populated whenever relevant, but not be derived or summarized 364 | from their child dwc:Events. 
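
The structural parts of Implementation principles 1 to 4 can be checked mechanically. The sketch below is one possible way to do so, not a reference validator: it assumes records are available as plain dictionaries keyed by unprefixed term names, that a parent dwc:Event is expected to be present in the same dataset, and the function name `check_inventory_structure` is hypothetical.

```python
def check_inventory_structure(events, humboldt_records=()):
    """Return a list of problems found; an empty list means none were detected.

    Hypothetical helper covering only the structural rules (principles 1-4).
    """
    problems = []

    # Principle 1: the dataset needs at least one dwc:Event record.
    if not events:
        problems.append("dataset contains no dwc:Event records")

    # Principle 2: every dwc:eventID is non-empty and unique in the dataset.
    known_ids = set()
    for position, event in enumerate(events):
        event_id = event.get("eventID", "")
        if not event_id:
            problems.append("record %d: empty eventID" % position)
        elif event_id in known_ids:
            problems.append("duplicate eventID: %s" % event_id)
        else:
            known_ids.add(event_id)

    # Principle 3: Humboldt Extension records link to an Event via eventID.
    for record in humboldt_records:
        linked_id = record.get("eventID", "")
        if linked_id not in known_ids:
            problems.append("Humboldt record links to unknown eventID: %r" % linked_id)

    # Principle 4: a populated parentEventID points at another Event.
    # (Assumption: the parent is expected within the same dataset.)
    for event in events:
        parent_id = event.get("parentEventID", "")
        if not parent_id:
            continue
        if parent_id == event.get("eventID", ""):
            problems.append("%s: parentEventID refers to itself" % parent_id)
        elif parent_id not in known_ids:
            problems.append(
                "%s: parentEventID %s does not match any eventID"
                % (event.get("eventID", ""), parent_id)
            )
    return problems


good_events = [
    {"eventID": "EVENT_01", "parentEventID": ""},
    {"eventID": "EVENT_02", "parentEventID": "EVENT_01"},
    {"eventID": "EVENT_03", "parentEventID": "EVENT_01"},
]
bad_events = [
    {"eventID": "EVENT_01"},
    {"eventID": "EVENT_01"},
    {"eventID": "EVENT_04", "parentEventID": "EVENT_09"},
]
good_problems = check_inventory_structure(good_events, [{"eventID": "EVENT_02"}])
bad_problems = check_inventory_structure(bad_events)
print(good_problems)
print(bad_problems)
```

Principles 5 to 7 concern whether a value genuinely applies at a given level, which depends on the inventory design, so they are left to the data provider's judgement in this sketch.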
365 | -------------------------------------------------------------------------------- /build/dwc_doc_inclusive/index.md: -------------------------------------------------------------------------------- 1 | # {document_title} 2 | 3 | Title 4 | : {document_title} 5 | 6 | Date version issued 7 | : {ratification_date} 8 | 9 | Date created 10 | : {created_date} 11 | 12 | Part of TDWG Standard 13 | : <{standard_iri}> 14 | 15 | This version 16 | : <{current_iri}{ratification_date}> 17 | 18 | Latest version 19 | : <{current_iri}> 20 | 21 | {previous_version_slot} 22 | 23 | Abstract 24 | : {abstract} 25 | 26 | Contributors 27 | : {contributors} 28 | 29 | Creator 30 | : {creator} 31 | 32 | Bibliographic citation 33 | : {creator}. {year}. {document_title}. {publisher}. <{current_iri}{ratification_date}> 34 | 35 | ## 1 Introduction (non-normative) 36 | 37 | This document elaborates upon the meaning and use of the term `eco:isLeastSpecificTargetCategoryQuantityInclusive`. Use of this term is necessary in order to describe how to treat counts of organisms (or any other organism quantity) when records from a single `dwc:Event` include multiple target categories (e.g., taxonomic ranks within a higher rank or different life stages for the same species). For example, a statement of whether the least specific target category quantity is inclusive should be reported when a `dwc:Event` includes records reporting quantities that are associated with subcategories (e.g., subspecies) and records reporting quantities for more general categories (e.g., the species). In this example, the higher taxon rank (i.e., species) is the least specific category, because it is more general than the subspecies category nested below it. Species and subspecies are just one example of a pair of category and subcategory. Other examples of subcategories are life stages (e.g., “adult”, “larva”, “egg”), and sexes.
38 | 39 | ### 1.1 Status of the content of this document 40 | 41 | Section 3 of this document is normative. The other sections are non-normative. 42 | 43 | 44 | ### 1.2 RFC 2119 key words 45 | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", 46 | "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to 47 | be interpreted as described in [BCP 14](https://datatracker.ietf.org/doc/html/bcp14) 48 | [[RFC2119]](https://datatracker.ietf.org/doc/html/rfc2119) 49 | [[RFC8174]](https://datatracker.ietf.org/doc/html/rfc8174) 50 | when, and only when, they are written in capitals (as shown here). 51 | 52 | ### 1.3 Namespaces and terminology 53 | 54 | The namespace `eco:` abbreviates `http://rs.tdwg.org/eco/terms/` and is used with terms minted for the Humboldt Extension for ecological inventories. `dwc:` abbreviates `http://rs.tdwg.org/dwc/terms/`, and is used with terms in the main Darwin Core vocabulary namespace. Words in `code markup` are term IRIs or literal values. The word "organisms" is used colloquially and is not used in the technical sense of the `dwc:Organism` class. 55 | 56 | ## 2 Rationale (non-normative) 57 | 58 | The term `eco:isLeastSpecificTargetCategoryQuantityInclusive` was introduced into the Humboldt Extension for ecological inventories late in development, after testing it with real-world cases ([Sica et al., 2022](#ref2)). Testing revealed that the quantities of organisms stored in two major biodiversity databases, OBIS (OBIS, 2023) and eBird (Sullivan et al., 2014), need to be treated differently in order to calculate the total quantity of organisms in the least specific category.
In the specific case of data in the OBIS database, the information for a single `dwc:Event` can contain multiple records for a species, with one record for a species listing the quantity of individual organisms for the species without specifying any subcategory of life stage, and other records for the same species in the same `dwc:Event` listing quantities for different life stages (e.g., one record for adults and another record for juveniles). In this example the single `dwc:Event` will contain 3 records: one for the species without any life stage specified, one for adults of the species, and one for juveniles of the species. For the OBIS data, the quantity in the record for which no life stage is specified is the sum of three quantities: the number of juveniles, the number of adults, and the number of individuals that were not recorded as belonging to any specific life stage. In other words, when using OBIS data, the total quantity of individuals recorded for a species, across all life stages combined, has been pre-calculated and stored in the database; unless the quantities of individuals within specific life stages are of interest, the information in the life stage subcategories can be ignored. The value of the term `eco:isLeastSpecificTargetCategoryQuantityInclusive` in this case would be `true` - the least specific category (species without any life stage specified) already includes the counts of the more specific subcategories. 59 | 60 | eBird stores information about quantities of organisms differently. For the example of a `dwc:Event` that contains separate records for subspecies and their parent species, the total number of individuals of the species needs to be calculated by the end user as the sum of the quantity reported for the species plus the quantities reported for the subspecies. In other words, the total quantity of organisms of each species has not been pre-calculated and must be derived by the end user. 
The value of the term `eco:isLeastSpecificTargetCategoryQuantityInclusive` in this case would be `false` - the least specific category (species) does not include the counts of the more specific subcategories (subspecies). 61 | 62 | In summary, the term `eco:isLeastSpecificTargetCategoryQuantityInclusive` is required to inform the end user of whether they will need to derive the total quantity of organisms for the least specific category (e.g., for a species), or whether this total quantity has already been calculated prior to the data being entered into the database. Note that, if a dataset contains only simple targets that have no subcategories, the result of the term `eco:isLeastSpecificTargetCategoryQuantityInclusive` being `true` or `false` is exactly the same - the count is the total in either case. Only in this circumstance does the term not strictly need to be populated. However, given that data records acquire a "life of their own" separate from their associated metadata when aggregated from multiple data sets, best practice is to include and populate the term `eco:isLeastSpecificTargetCategoryQuantityInclusive`. 63 | 64 | ## 3 Usage guidelines (normative) 65 | 66 | The term `eco:isLeastSpecificTargetCategoryQuantityInclusive` is defined as "The total detected quantity of organisms for a `dwc:Taxon` (including subsets thereof) in a `dwc:Event` is given explicitly in a single record (`dwc:organismQuantity` value) for that `dwc:Taxon`." 67 | 68 | Values MUST be `true` or `false`. If `true`, the `dwc:organismQuantity` value of the least specific record for a `dwc:Taxon` in a `dwc:Event` is inclusive of all organisms of the `dwc:Taxon` (including more specific scopes such as different life stages or lower taxonomic ranks) and the total detected quantity of organisms for that `dwc:Taxon` in the `dwc:Event` MUST NOT be determined by summing the `dwc:organismQuantity` values for all records of the `dwc:Taxon` in the `dwc:Event`.
Instead, the total detected quantity of organisms for the `dwc:Taxon` in a `dwc:Event` MUST be reported in a single record for the `dwc:Taxon` in the `dwc:Event`, with this record having no further specific scopes. In this case the sum of `dwc:organismQuantity` values for the reported subsets of the `dwc:Taxon` MUST NOT exceed the value of `dwc:organismQuantity` for the single record for the `dwc:Taxon` without subsets (i.e., the total). If `false`, the `dwc:organismQuantity` values for a `dwc:Taxon` in a `dwc:Event` MUST be added to get the total detected quantity of organisms for that `dwc:Taxon` in the `dwc:Event`. 69 | 70 | ## 4 Examples (non-normative) 71 | 72 | ### 4.1 Single `dwc:Taxon` example 73 | 74 | As an example of the difference between `true` and `false` values for `eco:isLeastSpecificTargetCategoryQuantityInclusive`, suppose there are three records (see Table 1) with `dwc:organismQuantity` for a `dwc:Taxon` (taxon_01) for a `dwc:Event` (event_01). One record is for adults of the `dwc:Taxon` with `dwc:organismQuantity` = `1` and `dwc:organismQuantityType` = `individuals`, one record is for juveniles of the `dwc:Taxon` with `dwc:organismQuantity` = `2` and `dwc:organismQuantityType` = `individuals`, and one record is for the `dwc:Taxon` without specifying the life stage and with `dwc:organismQuantity` = `4` and `dwc:organismQuantityType` = `individuals`. 75 | 76 | If `eco:isLeastSpecificTargetCategoryQuantityInclusive` is `true` for event_01, then the total number of individuals of taxon_01 for the `dwc:Event` is 4 (the least specific `dwc:Taxon` record, the one with no more specific scopes, includes all individuals of the `dwc:Taxon`). This means that there were 1 adult, 2 juveniles and 1 individual of taxon_01 whose life stage was not recorded.
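
The arithmetic for the three records described above can be sketched in Python for both values of the term. This is an illustration only: the function name `total_quantity`, the dictionary layout, and the use of an empty `lifeStage` to mark the least specific record are assumptions of this sketch, not part of the standard.

```python
def total_quantity(records, is_inclusive):
    """Total detected quantity for one taxon in one event.

    records: list of dicts with "lifeStage" ("" marks the least specific
    record) and "organismQuantity". Hypothetical layout for illustration.
    """
    if is_inclusive:
        # true: the least specific record already holds the total,
        # so the subcategory records must not be added to it.
        least_specific = [r for r in records if not r["lifeStage"]]
        if len(least_specific) != 1:
            raise ValueError("expected exactly one least specific record")
        return least_specific[0]["organismQuantity"]
    # false: every record is a separate category, so all are summed.
    return sum(r["organismQuantity"] for r in records)


# The three records for taxon_01 in event_01.
records = [
    {"lifeStage": "adult", "organismQuantity": 1},
    {"lifeStage": "juvenile", "organismQuantity": 2},
    {"lifeStage": "", "organismQuantity": 4},
]
print(total_quantity(records, True))   # 4
print(total_quantity(records, False))  # 7
```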
77 | 78 | If `eco:isLeastSpecificTargetCategoryQuantityInclusive` is `false` for event_01, then the total number of individuals of taxon_01 for the `dwc:Event` is 7 (the least specific `dwc:Taxon` record - the one with no more specific scopes - does not include all individuals of the `dwc:Taxon`; rather, it is a separate category that must also be added to get the total). This means there were 1 adult, 2 juveniles and 4 individuals of taxon_01 whose life stage was not recorded. 79 | 80 | **Table 1. Organism quantities in `dwc:Occurrence` records** 81 | 82 | | occurrenceID | eventID | taxonID | lifeStage | organismQuantity | organismQuantityType | 83 | | ------------ | ------- | ------- | --------- | ---------------- | -------------------- | 84 | | occ_01 | event_01 | taxon_01 | adult | 1 | individual | 85 | | occ_02 | event_01 | taxon_01 | juvenile | 2 | individual | 86 | | occ_03 | event_01 | taxon_01 | | 4 | individual | 87 | 88 | ### 4.2 Nested taxa example 89 | 90 | Suppose there are three records (see Table 2) with `dwc:organismQuantity` for three taxa (*Hirundo rustica* and two subspecies) for a `dwc:Event` (event_01). The record for the species has `dwc:organismQuantity` = `3` and `dwc:organismQuantityType` = `individuals`. The record for *H. r. rustica* has `dwc:organismQuantity` = `2` and `dwc:organismQuantityType` = `individuals`. The record for *H. r. gutturalis* has `dwc:organismQuantity` = `1` and `dwc:organismQuantityType` = `individuals`. 91 | 92 | If `eco:isLeastSpecificTargetCategoryQuantityInclusive` is `true` for event_01, then the total number of individuals of the species *H. rustica* for the `dwc:Event` is 3 (the least specific `dwc:Taxon` record includes all individuals of the `dwc:Taxon`). This means there were 2 *H. r. rustica*, 1 *H. r. gutturalis*, and no other *H. rustica* of any kind detected. 93 | 94 | If `eco:isLeastSpecificTargetCategoryQuantityInclusive` is `false` for event_01, then the total number of individuals of the species *H.
rustica* for the `dwc:Event` is 6 (the least specific `dwc:Taxon` record does not include all individuals of the `dwc:Taxon`). This means there were 2 *H. r. rustica*, 1 *H. r. gutturalis*, and 3 other *H. rustica* detected that were not identified to subspecies. 95 | 96 | **Table 2. Organism quantities in `dwc:Event` records** 97 | 98 | | eventID | scientificName | organismQuantity | organismQuantityType | 99 | | ------- | -------------- | ---------------- | -------------------- | 100 | | event_01 | Hirundo rustica | 3 | individual | 101 | | event_01 | Hirundo rustica rustica | 2 | individual | 102 | | event_01 | Hirundo rustica gutturalis | 1 | individual | 103 | 104 | ## 5 References 105 | 106 | OBIS (2023). Ocean Biodiversity Information System. Intergovernmental Oceanographic Commission of UNESCO. 107 | 108 | Sica, Y. V., K. Ingenloff, Y-M Gan, Z. Kachian, S. J. Baskauf, J. Wieczorek, P. F. Zermoglio, R. D. Stevenson (2022). Application of Humboldt Extension to Real-world Cases. *Biodiversity Information Science and Standards* 6: e91502. 109 | 110 | Sullivan, B. L., J. L. Aycrigg, J. H. Barry, R. E. Bonney, N. Bruns, C. B. Cooper, T. Damoulas, A. A. Dhondt, T. Dietterich, A. Farnsworth, D. Fink, et al. (2014). The eBird enterprise: an integrated approach to development and application of citizen science. *Biological Conservation* 169:31-40. 111 | -------------------------------------------------------------------------------- /build/dwc_doc_tcr/authors_configuration.yaml: -------------------------------------------------------------------------------- 1 | - contributor_iri: https://orcid.org/0000-0002-1720-0127 2 | contributor_literal: Yanina V. Sica 3 | contributor_role: contributor 4 | role_uri: http://www.wikidata.org/entity/Q20204892 5 | affiliation: Yale University 6 | affiliation_uri: http://www.wikidata.org/entity/Q49112 7 | 8 | - contributor_iri: https://orcid.org/0000-0002-0595-7827 9 | contributor_literal: Wesley M.
Hochachka 10 | contributor_role: contributor 11 | role_uri: http://www.wikidata.org/entity/Q20204892 12 | affiliation: Cornell Lab of Ornithology 13 | affiliation_uri: http://www.wikidata.org/entity/Q2997535 14 | 15 | - contributor_iri: https://orcid.org/0000-0003-4365-3135 16 | contributor_literal: Steven J. Baskauf 17 | contributor_role: contributor 18 | role_uri: http://www.wikidata.org/entity/Q20204892 19 | affiliation: Vanderbilt University Libraries 20 | affiliation_uri: http://www.wikidata.org/entity/Q16849893 21 | -------------------------------------------------------------------------------- /build/dwc_doc_tcr/document_configuration.yaml: -------------------------------------------------------------------------------- 1 | # ---------------- 2 | # Values set by the task group or maintainers of the standard. 3 | # ---------------- 4 | 5 | # Official title of the document assigned by authors. 6 | documentTitle: "Taxon Completeness Reported Controlled Vocabulary List of Terms" 7 | 8 | # Abstract of document written by authors. 9 | abstract: The Humboldt Extension for Ecological Inventories mints the term 10 | `taxonCompletenessReported` to alert users that the inventory was conducted in 11 | such a way that all of the target taxa should have been detectable if they were 12 | present during the dwc:Event. This vocabulary provides terms that should be used 13 | as values for `eco:taxonCompletenessReported` and `ecoiri:taxonCompletenessReported`. 14 | 15 | # This value is generally the name of the task group that created the document. 16 | creator: TDWG Humboldt Extension Task Group 17 | 18 | # Current (2023-08-27) practice is to publish documents as Markdown files in a TDWG GitHub repository. 19 | # These Markdown documents are then converted to HTML by GitHub Pages. 
To match the TDWG theme, the 20 | # document maintainers will need to work with the Infrastructure team to set up the repository so 21 | # that it can host the ancillary website for the standard or vocabulary. 22 | # The exact setup of the repository will determine the values of accessUrl and browserRedirectUri. 23 | 24 | # Media type of source document used to generate the HTML version of the document. 25 | mediaType: text/markdown 26 | 27 | # Value determined by the location of the raw Markdown file in the GitHub repository. 28 | # The repository pattern used should be to create a subdirectory for the document whose name will be 29 | # the slug for the page, then place the Markdown file named index.md in that subdirectory. 30 | accessUrl: https://raw.githubusercontent.com/tdwg/hc/main/docs/tcr/index.md 31 | 32 | # Actual URL of the document to which the permanent IRI is redirected. 33 | # When generated by GitHub Pages, this will be related to the location of the raw Markdown file. 34 | # The initial default value is https://tdwg.github.io/repository_name/subdirectory_name/. 35 | # However, typically, the Infrastructure team sets up a subdomain of the tdwg.org domain for the 36 | # ancillary website. In that case, the value will eventually be 37 | # https://subdomain.tdwg.org/subdirectory_name/. 38 | browserRedirectUri: https://tdwg.github.io/hc/tcr/ 39 | 40 | # ---------------- 41 | # Values set by the TDWG Infrastructure team at time of ratification 42 | # ---------------- 43 | 44 | # Permanent IRI of the standard with which the document is associated. 45 | # For documents added to an existing standard, see the landing page 46 | # for the standard on the TDWG website for the correct value. 47 | # For new standards, this value will be set by the TDWG Infrastructure team. 48 | dcterms_isPartOf: http://www.tdwg.org/standards/450 49 | 50 | # IRI value assigned as a permanent identifier for the document based on standard TDWG IRI patterns.
51 | # This value will automatically get updated from the general_configuration.yaml file. It should not be 52 | # set manually. 53 | current_iri: http://rs.tdwg.org/dwc/doc/tcr/ 54 | 55 | # Date of first ratification of the document. Will match the doc_modified value for the first 56 | # version of the document. For lists of terms, this will also match the date that the 57 | # first version of the vocabulary was issued (date_issued). 58 | doc_created: '2023-09-13' 59 | 60 | # ---------------- 61 | # Do not edit below this line 62 | # ---------------- 63 | 64 | # Standard metadata determined by TDWG policy. 65 | publisher: Biodiversity Information Standards (TDWG) 66 | license_statement: Licensed under a Creative Commons Attribution 4.0 International (CC BY) License. 67 | license_uri: http://creativecommons.org/licenses/by/4.0/ 68 | 69 | # Typically left blank. May be used to provide additional information about the document 70 | # in the machine-readable metadata. 71 | comment: '' 72 | 73 | # This value will automatically get updated from the date_issued value in config.yaml if the document 74 | # is a list of terms document. For other types of documents, it is set from the general_configuration.yaml 75 | # file. 
76 | doc_modified: '2023-08-25' 77 | -------------------------------------------------------------------------------- /build/dwc_doc_tcr/termlist-header.md: -------------------------------------------------------------------------------- 1 | # {document_title} 2 | 3 | Title 4 | : {document_title} 5 | 6 | Namespace IRI 7 | : {namespace_uri} 8 | 9 | Preferred namespace abbreviation 10 | : {pref_namespace_prefix}: 11 | 12 | Date version issued 13 | : {ratification_date} 14 | 15 | Date created 16 | : {created_date} 17 | 18 | Part of TDWG Standard 19 | : <{standard_iri}> 20 | 21 | This version 22 | : <{current_iri}{ratification_date}> 23 | 24 | Latest version 25 | : <{current_iri}> 26 | 27 | {previous_version_slot} 28 | 29 | Abstract 30 | : {abstract} 31 | 32 | Contributors 33 | : {contributors} 34 | 35 | Creator 36 | : {creator} 37 | 38 | Bibliographic citation 39 | : {creator}. {year}. {document_title}. {publisher}. <{current_iri}{ratification_date}> 40 | 41 | ## 1 Introduction (non-normative) 42 | 43 | This document includes terms intended to be used as a controlled value for the Humboldt Extension terms with the local name `taxonCompletenessReported`. 44 | 45 | ### 1.1 Status of the content of this document 46 | 47 | Sections 1 and 3 are non-normative. Section 2 is normative. In Section 4, the values of the `Term IRI`, `Definition`, and `Controlled value` are normative. The value of `Usage` (if it exists for a given term) is normative. The values of `Term Name` are non-normative, although one can expect that the namespace abbreviation prefix is one commonly used for the term namespace. `Label` and the values of all other properties (such as `Notes`) are non-normative. 
48 | 49 | ### 1.2 RFC 2119 key words 50 | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", 51 | "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to 52 | be interpreted as described in [BCP 14](https://datatracker.ietf.org/doc/html/bcp14) 53 | [[RFC2119]](https://datatracker.ietf.org/doc/html/rfc2119) 54 | [[RFC8174]](https://datatracker.ietf.org/doc/html/rfc8174) 55 | when, and only when, they are written in capitals (as shown here). 56 | 57 | ### 1.3 Namespaces 58 | 59 | The namespace `eco:` abbreviates `http://rs.tdwg.org/eco/terms/` and the namespace `ecoiri:` abbreviates `http://rs.tdwg.org/eco/iri/`. Both namespaces are used with terms minted for the Humboldt Extension for Ecological Inventories. `ecotcr:` abbreviates `http://rs.tdwg.org/ecotcr/values/`, and is used with terms in this vocabulary. 60 | 61 | ## 2 Use of Terms (normative) 62 | 63 | Due to the requirements of [Section 1.4.3 of the Darwin Core RDF Guide](http://rs.tdwg.org/dwc/terms/guides/rdf/#143-use-of-darwin-core-terms-in-rdf-normative), term IRIs MUST be used as values of `ecoiri:taxonCompletenessReported`. Controlled value strings MUST be used as values of `eco:taxonCompletenessReported`. 64 | 65 | -------------------------------------------------------------------------------- /build/generate_term_versions.py: -------------------------------------------------------------------------------- 1 | # ----------------------------- 2 | # file import and configuration 3 | # ----------------------------- 4 | 5 | import pandas as pd 6 | 7 | # This is the base URL for raw files from the branch of the repo that has been pushed to GitHub 8 | github_baseUri = 'https://raw.githubusercontent.com/tdwg/rs.tdwg.org/master/' 9 | 10 | # This is a Python list of the database names of the term version lists to be included in the document.
11 | term_lists = ['humboldt', 'humboldt_iri'] 12 | 13 | column_mappings = [ 14 | {'norm': 'iri', 'accum': 'version'}, 15 | {'norm': 'term_localName', 'accum': 'term_localName'}, 16 | {'norm': 'label', 'accum': 'label'}, 17 | {'norm': 'definition', 'accum': 'rdfs_comment'}, 18 | {'norm': 'comments', 'accum': 'dcterms_description'}, 19 | {'norm': 'examples', 'accum': 'examples'}, 20 | {'norm': 'organized_in', 'accum': 'tdwgutility_organizedInClass'}, 21 | {'norm': 'issued', 'accum': 'version_issued'}, 22 | {'norm': 'status', 'accum': 'version_status'}, 23 | {'norm': 'replaces', 'accum': 'replaces_version'}, 24 | {'norm': 'rdf_type', 'accum': 'rdf_type'}, 25 | {'norm': 'term_iri', 'accum': 'term_iri'}, 26 | {'norm': 'abcd_equivalence', 'accum': 'tdwgutility_abcdEquivalence'}, 27 | {'norm': 'flags', 'accum': 'tdwgutility_usageScope'} 28 | ] 29 | 30 | # ----------------------------- 31 | # Load the term version data for all of the term lists that are included in Darwin Core (including obsolete ones) 32 | # ----------------------------- 33 | 34 | print('Loading namespace CSV files from GitHub:') 35 | for term_list_index in range(len(term_lists)): 36 | # retrieve configuration metadata for term list 37 | config_url = github_baseUri + term_lists[term_list_index] + '/constants.csv' 38 | config_df = pd.read_csv(config_url, na_filter=False) 39 | term_namespace = config_df.iloc[0].loc['domainRoot'] 40 | # print(term_namespace) 41 | 42 | # Retrieve versions metadata for term list 43 | versions_url = github_baseUri + term_lists[term_list_index] + '-versions/' + term_lists[term_list_index] + '-versions.csv' 44 | print(versions_url) 45 | versions_df = pd.read_csv(versions_url, na_filter=False) 46 | 47 | # Add a column for the term IRI by concatenating the term namespace with the local name value for each row 48 | versions_df['term_iri'] = term_namespace + versions_df['term_localName'] 49 | 50 | if term_list_index == 0: 51 | # start the DataFrame with the first term list versions 
data 52 | accumulated_frame = versions_df.copy() 53 | else: 54 | # append subsequent term lists data to the DataFrame 55 | #accumulated_frame = accumulated_frame._append(versions_df.copy(), sort=True) 56 | accumulated_frame = pd.concat([accumulated_frame, versions_df], sort=True) 57 | ''' 58 | # Special procedure for obsolete terms 59 | # Retrieve versions metadata 60 | versions_url = github_baseUri + 'dwc-obsolete-versions/dwc-obsolete-versions.csv' 61 | print(versions_url) 62 | versions_df = pd.read_csv(versions_url, na_filter=False) 63 | 64 | # Retrieve term/version join data 65 | join_url = github_baseUri + 'dwc-obsolete/dwc-obsolete-versions.csv' 66 | join_df = pd.read_csv(join_url, na_filter=False) 67 | 68 | # Find the term IRI for each version and add it to a list 69 | term_iri_list = [] 70 | 71 | for row_index,row in versions_df.iterrows(): 72 | for join_index,join_row in join_df.iterrows(): 73 | # Locate the row in the join data where the version matches the row in the versions DataFrame 74 | if join_row['version'] == row['version']: 75 | term_iri_list.append(join_row['term']) 76 | break 77 | 78 | # Locate the row in the join data where the version matches the row in the versions DataFrame 79 | term_iri_row = join_df.loc[join_df['version'] == row['version']] 80 | # Add the current term IRI from the join data row to the list 81 | term_iri_list.append(term_iri_row['term']) 82 | 83 | # Add the curren term IRI list to the DataFrame as the term_iri column 84 | versions_df['term_iri'] = term_iri_list 85 | # Add the obsolete terms DataFrame to the accumulated DataFrame 86 | accumulated_frame = accumulated_frame._append(versions_df.copy(), sort=True) 87 | ''' 88 | accumulated_frame.reset_index(drop=True, inplace=True) # reset the row indices to consecutive starting with zero 89 | accumulated_frame.fillna('', inplace=True) # replace all missing values with empty strings 90 | accumulated_frame.head() 91 | print() 92 | 93 | # ----------------------------- 94 | # 
Create a list of lists building each row of the normative document 95 | # ----------------------------- 96 | 97 | # Create column header list for the normative document 98 | column_headers = [] 99 | for column_mapping in column_mappings: 100 | # Add the value of the 'norm' key for the column 101 | column_headers.append(column_mapping['norm']) 102 | #print(column_headers) 103 | 104 | print('merging rows for output document') 105 | # Create the rows of the normative document 106 | normative_doc_list = [] 107 | for row_index,row in accumulated_frame.iterrows(): 108 | normative_doc_row = [] 109 | for column_mapping in column_mappings: 110 | # Add the value from the accumulation DataFrame column whose name is the value of the 'accum' key for the column 111 | if column_mapping['norm'] == 'replaces': 112 | # concatenate all versions that were replaced; pipe separated 113 | replace_iri = row['replaces_version'] 114 | if row['replaces1_version'] != '': 115 | replace_iri += '|' + row['replaces1_version'] 116 | if row['replaces2_version'] != '': 117 | replace_iri += '|' + row['replaces2_version'] 118 | normative_doc_row.append(replace_iri) 119 | else: 120 | normative_doc_row.append(row[column_mapping['accum']]) 121 | normative_doc_list.append(normative_doc_row) 122 | 123 | ''' NO LONGER NEEDED FOR HANDLING OF IRI VALUED TERMS 124 | # special handling for http://rs.tdwg.org/dwc/terms/attributes/UseWithIRI. Eventually we want to eliminate this. 
125 | use_with_iri_row = ['http://rs.tdwg.org/dwc/terms/attributes/UseWithIRI-2017-10-06', 126 | 'UseWithIRI', 127 | 'UseWithIRI', 128 | 'The category of terms that are recommended to have an IRI as a value.', 129 | 'A utility class to organize the dwciri: terms.', 130 | '', 131 | 'http://www.w3.org/2000/01/rdf-schema#Class', 132 | '2017-10-06', 133 | 'recommended', 134 | '', 135 | 'http://www.w3.org/2000/01/rdf-schema#Class', 136 | 'http://rs.tdwg.org/dwc/terms/attributes/UseWithIRI', 137 | 'not in ABCD', 138 | ''] 139 | normative_doc_list.append(use_with_iri_row) 140 | ''' 141 | 142 | # Turn list of lists into dataframe 143 | normative_doc_df = pd.DataFrame(normative_doc_list, columns = column_headers) 144 | # Set the row label as the version IRI 145 | normative_doc_df.set_index('iri', drop=False, inplace=True) 146 | normative_doc_df.index.names = ['row_index'] 147 | #normative_doc_df.to_csv('test.csv', index = False) 148 | #string1 = normative_doc_df.iloc[571]['term_iri'] 149 | 150 | # ----------------------------- 151 | # Order the rows as required for generating the Quick Reference Guide 152 | # ----------------------------- 153 | 154 | # DataFrame to hold built Quick Reference Guide-ordered rows 155 | built_rows_df = normative_doc_df.iloc[1:0].copy() 156 | 157 | # DataFrame to hold remaining rows 158 | remaining_rows_df = normative_doc_df.copy() 159 | 160 | # Load the ordered list of terms in the Quick Reference Guide (single column named recommended_term_iri) 161 | print('ordering rows for output document') 162 | qrg_df = pd.read_csv('qrg-list.csv', na_filter=False) 163 | for qrg_index,qrg_row in qrg_df.iterrows(): 164 | found = False 165 | for row_index,row in normative_doc_df.iterrows(): 166 | if (qrg_row['recommended_term_iri'] == row['term_iri']) and (row['status'] == 'recommended'): 167 | found = True 168 | #built_rows_df = built_rows_df.append(row) 169 | built_rows_df.loc[len(built_rows_df.index)] = row 170 | remaining_rows_df.drop(row['iri'], axis=0, 
inplace=True) 171 | break 172 | if not found: 173 | print('row not found:', qrg_row['recommended_term_iri']) 174 | 175 | # Alphabetize remaining term versions 176 | #remaining_rows_df.sort_values(by='iri', inplace=True) 177 | sorted_output = remaining_rows_df.iloc[remaining_rows_df.iri.str.lower().argsort()] 178 | 179 | # Concatenate ordered terms and remaining versions 180 | #normative_doc_df = built_rows_df.append(remaining_rows_df) 181 | #normative_doc_df = built_rows_df.append(sorted_output) 182 | normative_doc_df = pd.concat([built_rows_df, sorted_output]) 183 | 184 | # Save the normative document DataFrame as a CSV 185 | normative_doc_df.to_csv('../vocabulary/term_versions.csv', index = False) 186 | 187 | print('done') 188 | -------------------------------------------------------------------------------- /build/qrg-list.csv: -------------------------------------------------------------------------------- 1 | recommended_term_iri 2 | group:Site 3 | http://rs.tdwg.org/eco/terms/siteCount 4 | http://rs.tdwg.org/eco/terms/siteNestingDescription 5 | http://rs.tdwg.org/eco/terms/verbatimSiteDescriptions 6 | http://rs.tdwg.org/eco/terms/verbatimSiteNames 7 | http://rs.tdwg.org/eco/terms/geospatialScopeAreaValue 8 | http://rs.tdwg.org/eco/terms/geospatialScopeAreaUnit 9 | http://rs.tdwg.org/eco/terms/totalAreaSampledValue 10 | http://rs.tdwg.org/eco/terms/totalAreaSampledUnit 11 | http://rs.tdwg.org/eco/terms/reportedWeather 12 | http://rs.tdwg.org/eco/terms/reportedExtremeConditions 13 | group:Habitat Scope 14 | http://rs.tdwg.org/eco/terms/targetHabitatScope 15 | http://rs.tdwg.org/eco/terms/excludedHabitatScope 16 | group:Temporal Scope 17 | http://rs.tdwg.org/eco/terms/eventDurationValue 18 | http://rs.tdwg.org/eco/terms/eventDurationUnit 19 | group:Taxonomic Scope 20 | http://rs.tdwg.org/eco/terms/targetTaxonomicScope 21 | http://rs.tdwg.org/eco/terms/excludedTaxonomicScope 22 | http://rs.tdwg.org/eco/terms/taxonCompletenessReported 23 | 
http://rs.tdwg.org/eco/terms/taxonCompletenessProtocols 24 | http://rs.tdwg.org/eco/terms/isTaxonomicScopeFullyReported 25 | http://rs.tdwg.org/eco/terms/isAbsenceReported 26 | http://rs.tdwg.org/eco/terms/absentTaxa 27 | http://rs.tdwg.org/eco/terms/hasNonTargetTaxa 28 | http://rs.tdwg.org/eco/terms/nonTargetTaxa 29 | http://rs.tdwg.org/eco/terms/areNonTargetTaxaFullyReported 30 | group:Organismal Scope 31 | http://rs.tdwg.org/eco/terms/targetLifeStageScope 32 | http://rs.tdwg.org/eco/terms/excludedLifeStageScope 33 | http://rs.tdwg.org/eco/terms/isLifeStageScopeFullyReported 34 | http://rs.tdwg.org/eco/terms/targetDegreeOfEstablishmentScope 35 | http://rs.tdwg.org/eco/terms/excludedDegreeOfEstablishmentScope 36 | http://rs.tdwg.org/eco/terms/isDegreeOfEstablishmentScopeFullyReported 37 | http://rs.tdwg.org/eco/terms/targetGrowthFormScope 38 | http://rs.tdwg.org/eco/terms/excludedGrowthFormScope 39 | http://rs.tdwg.org/eco/terms/isGrowthFormScopeFullyReported 40 | http://rs.tdwg.org/eco/terms/hasNonTargetOrganisms 41 | http://rs.tdwg.org/eco/terms/verbatimTargetScope 42 | group:Methodology Description 43 | http://rs.tdwg.org/eco/terms/compilationTypes 44 | http://rs.tdwg.org/eco/terms/compilationSourceTypes 45 | http://rs.tdwg.org/eco/terms/inventoryTypes 46 | http://rs.tdwg.org/eco/terms/protocolNames 47 | http://rs.tdwg.org/eco/terms/protocolDescriptions 48 | http://rs.tdwg.org/eco/terms/protocolReferences 49 | http://rs.tdwg.org/eco/terms/isAbundanceReported 50 | http://rs.tdwg.org/eco/terms/isAbundanceCapReported 51 | http://rs.tdwg.org/eco/terms/abundanceCap 52 | http://rs.tdwg.org/eco/terms/isVegetationCoverReported 53 | http://rs.tdwg.org/eco/terms/isLeastSpecificTargetCategoryQuantityInclusive 54 | group:Material Collected 55 | http://rs.tdwg.org/eco/terms/hasVouchers 56 | http://rs.tdwg.org/eco/terms/voucherInstitutions 57 | http://rs.tdwg.org/eco/terms/hasMaterialSamples 58 | http://rs.tdwg.org/eco/terms/materialSampleTypes 59 | group:Sampling Effort 60 
| http://rs.tdwg.org/eco/terms/samplingPerformedBy 61 | http://rs.tdwg.org/eco/terms/isSamplingEffortReported 62 | http://rs.tdwg.org/eco/terms/samplingEffortProtocol 63 | http://rs.tdwg.org/eco/terms/samplingEffortValue 64 | http://rs.tdwg.org/eco/terms/samplingEffortUnit 65 | group:UseWithIRI 66 | http://rs.tdwg.org/eco/iri/absentTaxa 67 | http://rs.tdwg.org/eco/iri/compilationSourceTypes 68 | http://rs.tdwg.org/eco/iri/compilationTypes 69 | http://rs.tdwg.org/eco/iri/eventDurationUnit 70 | http://rs.tdwg.org/eco/iri/excludedDegreeOfEstablishmentScope 71 | http://rs.tdwg.org/eco/iri/excludedGrowthFormScope 72 | http://rs.tdwg.org/eco/iri/excludedHabitatScope 73 | http://rs.tdwg.org/eco/iri/excludedLifeStageScope 74 | http://rs.tdwg.org/eco/iri/excludedTaxonomicScope 75 | http://rs.tdwg.org/eco/iri/geospatialScopeAreaUnit 76 | http://rs.tdwg.org/eco/iri/inventoryTypes 77 | http://rs.tdwg.org/eco/iri/materialSampleTypes 78 | http://rs.tdwg.org/eco/iri/nonTargetTaxa 79 | http://rs.tdwg.org/eco/iri/protocolNames 80 | http://rs.tdwg.org/eco/iri/samplingEffortProtocol 81 | http://rs.tdwg.org/eco/iri/samplingEffortUnit 82 | http://rs.tdwg.org/eco/iri/samplingPerformedBy 83 | http://rs.tdwg.org/eco/iri/targetDegreeOfEstablishmentScope 84 | http://rs.tdwg.org/eco/iri/targetGrowthFormScope 85 | http://rs.tdwg.org/eco/iri/targetHabitatScope 86 | http://rs.tdwg.org/eco/iri/targetLifeStageScope 87 | http://rs.tdwg.org/eco/iri/targetTaxonomicScope 88 | http://rs.tdwg.org/eco/iri/taxonCompletenessProtocols 89 | -------------------------------------------------------------------------------- /build/requirements.txt: -------------------------------------------------------------------------------- 1 | jinja2 2 | PyYAML 3 | -------------------------------------------------------------------------------- /build/tcr-2024-02-28/config.yaml: -------------------------------------------------------------------------------- 1 | # To use this configuration file, it must be in the process 
directory from which the 2 | # process.py script is run. Typically, a copy is stored with the modifications CSV file. 3 | 4 | # Date assigned to all versions, usually the date of approval by the Executive Committee. 5 | # It is appended to all version IRIs. Format: YYYY-MM-DD 6 | date_issued: '2024-02-28' 7 | 8 | # UTC offset for the computer running the script (i.e. the appropriate offset for values produced by the 9 | # Python method datetime.datetime.now()). 10 | local_offset_from_utc: -05:00 11 | 12 | # Only relevant when new term lists or vocabularies are created. It does nothing when 13 | # existing terms are changed. 14 | # Technical note: this controls which template column mapping file is used from the "current terms" and "versions" 15 | # directories of the process directory in the rs.tdwg.org repo. If properties are added beyond 16 | # the standard ones, the template file will need to be edited. See Section 3 of process-vocabulary.md for details. 17 | # Categories: 18 | # 1: Simple vocabulary 19 | # 2: Simple controlled vocabulary 20 | # 3: Controlled vocabulary with broader hierarchy 21 | vocab_type: 2 22 | 23 | # Permanent IRI for the list of terms document that is associated with this vocabulary. 24 | # This is needed to automatically update the date_modified value of the list of terms document 25 | # using the date_issued value above. 26 | list_of_terms_iri: http://rs.tdwg.org/dwc/doc/tcr/ 27 | 28 | # IRI of containing standard. Existing standards IRIs: 29 | # Darwin Core - http://www.tdwg.org/standards/450 30 | # Audiovisual Core - http://www.tdwg.org/standards/638 31 | # Latimer Core - http://www.tdwg.org/standards/x 32 | standard: http://www.tdwg.org/standards/450 33 | 34 | # Text to describe the Executive Committee Decision that approved the change. 35 | decisions_text: Humboldt Extension for Ecological Inventories and controlled vocabulary for eco:taxonCompletenessReported ratified as a part of the Darwin Core Standard.
See https://github.com/tdwg/hc/milestone/1 36 | 37 | namespaces: 38 | 39 | # Repeat the following data for each namespace 40 | 41 | # For existing term lists, MUST be namespace assigned by issuing organization. For TDWG 42 | # term lists, MUST follow conventional TDWG IRI patterns. 43 | - namespace_uri: http://rs.tdwg.org/ecotcr/values/ 44 | 45 | # Standard namespace abbreviation for the namespace IRI. 46 | pref_namespace_prefix: ecotcr 47 | 48 | # Database name for associated directories and files in the rs.tdwg.org repository. 49 | # MUST NOT contain spaces. SHOULD be descriptive and lowerCamelCase is RECOMMENDED. 50 | # Borrowed term lists SHOULD use naming convention of Darwin and Audiovisual Cores. 51 | # Do not append -versions to this name, the versions directory will be created automatically. 52 | database: taxonCompletenessReported 53 | 54 | # MUST be set to true if namespace not issued by TDWG in the rs.tdwg.org subdomain. 55 | # MUST be set to false if namespace issued and controlled by TDWG. 56 | borrowed: false 57 | 58 | # Set to true if a new term list that has never been processed before. Otherwise, set to false. 59 | # MUST be set to true if it is a new term list that has never been processed before. 60 | # Note that there are extra configuration files that must be set up for term lists that are 61 | # part of new vocabularies. See Section 2.1.2 for details. 62 | # MUST be set to false if this is an existing term list that has been processed at some time in the past. 63 | new_term_list: true 64 | 65 | # Normally set to false except for non-versioned namespaces like decisions. 66 | utility_namespace: false 67 | 68 | # Path to hand-edited changes CSV file. Relative to process directory from which the 69 | # process.py script is run. 70 | modifications_file_path: dwc-revisions/tcr-2024-02-28/tcr.csv 71 | 72 | # For TDWG-minted terms, SHOULD be set to empty string (Termlist IRI will be set to be 73 | # the same as the namespace IRI). 
For borrowed terms, mint an IRI that conforms to the 74 | # TDWG termlist IRI pattern. 75 | 76 | # For TDWG-minted terms, this value SHOULD be the empty string and the termlist IRI will be set to be 77 | # the same as the namespace IRI. If a value is given for TDWG-minted terms, it MUST be the same as the 78 | # namespace IRI. When terms are borrowed from other non-TDWG vocabularies to be included within a TDWG 79 | # vocabulary, an IRI for the borrowed term list conforming to the term list IRI pattern 80 | # (https://github.com/tdwg/rs.tdwg.org#3rd-level-iris-denoting-term-lists) MUST be minted. 81 | # The subdomain MUST be `rs.tdwg.org` and the first level IRI component following the subdomain MUST be 82 | # the standard component for the vocabulary that is borrowing the terms. The second level IRI component 83 | # SHOULD be a short, memorable string commonly associated with the borrowed vocabulary. Examples: 84 | # http://rs.tdwg.org/ac/xmp/ for the XMP terms borrowed by the Audiovisual Core 85 | # http://rs.tdwg.org/dwc/dcterms/ for the Dublin Core dcterms: terms borrowed by the Darwin Core 86 | termlist_uri: '' 87 | 88 | # Label used for the term list in machine-readable metadata. 89 | label: taxonCompletenessReported controlled values list 90 | 91 | # Description of the term list used in machine-readable metadata. 92 | description: Controlled values list for the Humboldt Extension for Ecological Inventories term taxonCompletenessReported. 93 | 94 | # The following values are used to set up redirects to the list of terms document. 95 | 96 | # IRI string from List of Terms document URL to be prepended to the term fragment identifier when 97 | # dereferencing terms and an HTML representation is requested. 98 | prepend_url: https://eco.tdwg.org/tcr/# 99 | 100 | # Indicates whether the namespace abbreviation is included in the fragment identifier for the term. 
101 | use_namespace_in_fragment: true 102 | 103 | # String that is used to separate the namespace abbreviation from the term name in the fragment identifier. 104 | # If use_namespace_in_fragment is false, this value is ignored. 105 | separator: '_' 106 | -------------------------------------------------------------------------------- /build/tcr-2024-02-28/tcr.csv: -------------------------------------------------------------------------------- 1 | term_localName,label,skos_inScheme,definition,definition_derived_from,usage,notes,examples,controlled_value_string,type 2 | tcr,taxon completeness reported concept scheme,,a SKOS concept scheme for categorizing taxon completeness reporting,,,,,,http://www.w3.org/2004/02/skos/core#ConceptScheme 3 | tcr00,not reported,tcr,Taxonomic completeness was not assessed or reported for the dwc:Event.,,,,,notReported,http://www.w3.org/2004/02/skos/core#Concept 4 | tcr01,reported complete,tcr,"Taxonomic completeness was assessed for the dwc:Event, and it was determined to be complete.",,,,,reportedComplete,http://www.w3.org/2004/02/skos/core#Concept 5 | tcr02,reported incomplete,tcr,"Taxonomic completeness was assessed for the dwc:Event, and it was determined to be incomplete.",,,,,reportedIncomplete,http://www.w3.org/2004/02/skos/core#Concept 6 | -------------------------------------------------------------------------------- /build/tcr-2024-02-28/vocab.yaml: -------------------------------------------------------------------------------- 1 | # To use this configuration file, it must be in the process directory from which the 2 | # process.py script is run. Typically, a copy is stored with the modifications CSV file. 3 | 4 | # The following values are only required if the vocabulary is new. If they are provided for 5 | # an existing vocabulary, they will replace the existing values.
6 | vocabulary_label: Taxon Completeness Reported Controlled Vocabulary 7 | vocabulary_description: A controlled vocabulary for the Humboldt Extension for Ecological Inventories term eco:taxonCompletenessReported. 8 | 9 | # Current practice is to use the name of the Task Group that created it. For vocabularies 10 | # that have been heavily modified, the name of the Maintenance Group may be used. 11 | dc_creator: TDWG Humboldt Extension Task Group 12 | 13 | # TDWG's standard license should be used. 14 | dcterms_license: https://creativecommons.org/licenses/by/4.0/ 15 | 16 | # The following values are only required if the standard is new. If they are provided for 17 | # an existing standard, they will replace the existing values. 18 | standard_label: Darwin Core 19 | standard_description: Darwin Core is a standard maintained by the Darwin Core maintenance group. It includes a glossary of terms (in other contexts these might be called properties, elements, fields, columns, attributes, or concepts) intended to facilitate the sharing of information about biological diversity by providing identifiers, labels, and definitions. Darwin Core is primarily based on taxa, their occurrence in nature as documented by observations, specimens, samples, and related information.
20 | -------------------------------------------------------------------------------- /build/tcr_build.py: -------------------------------------------------------------------------------- 1 | # Script to build Markdown pages that provide term metadata for simple vocabularies 2 | # Steve Baskauf 2020-06-28 CC0 3 | # This script merges static Markdown header and footer documents with term information tables (in Markdown) generated from data in the rs.tdwg.org repo from the TDWG Github site 4 | 5 | # Note: this script calls a function from http_library.py, which requires importing the requests, csv, and json modules 6 | import re 7 | import requests # best library to manage HTTP transactions 8 | import csv # library to read/write/parse CSV files 9 | import json # library to convert JSON to Python data structures 10 | import pandas as pd 11 | import yaml 12 | import sys 13 | 14 | # ----------------- 15 | # Command line arguments 16 | # ----------------- 17 | 18 | arg_vals = sys.argv[1:] 19 | opts = [opt for opt in arg_vals if opt.startswith('-')] 20 | args = [arg for arg in arg_vals if not arg.startswith('-')] 21 | 22 | # "master" for production, something else for development 23 | # Example: First part of branch URL is "https://raw.githubusercontent.com/tdwg/rs.tdwg.org/eco/", branch is "eco". 24 | if '--branch' in opts: 25 | github_branch = args[opts.index('--branch')] 26 | else: 27 | github_branch = 'master' 28 | 29 | # ----------------- 30 | # Configuration section 31 | # ----------------- 32 | 33 | # This is the base URL for raw files from the branch of the repo that has been pushed to GitHub 34 | githubBaseUri = 'https://raw.githubusercontent.com/tdwg/rs.tdwg.org/' + github_branch + '/' 35 | 36 | headerFileName = 'dwc_doc_tcr/termlist-header.md' 37 | footerFileName = 'termlist-footer.md' 38 | outFileName = '../docs/tcr/index.md' 39 | 40 | # This is a Python list of the database names of the term lists to be included in the document. 
41 | termLists = ['taxonCompletenessReported'] 42 | 43 | # If this list of terms is for terms in a single namespace, set the value of has_namespace to True. The value 44 | # of has_namespace should be False for a list of terms that contains multiple namespaces. 45 | has_namespace = True 46 | 47 | # NOTE! There may be problems unless every term list is of the same vocabulary type since the number of columns will differ 48 | # However, there probably aren't any circumstances where mixed types will be used to generate the same page. 49 | vocab_type = 2 # 1 is simple vocabulary, 2 is simple controlled vocabulary, 3 is c.v. with broader hierarchy 50 | 51 | # Terms in large vocabularies like Darwin and Audubon Cores may be organized into categories using tdwgutility_organizedInClass 52 | # If so, those categories can be used to group terms in the generated term list document. 53 | organized_in_categories = False 54 | 55 | # If organized in categories, the display_order list must contain the IRIs that are values of tdwgutility_organizedInClass 56 | # If not organized into categories, the value is irrelevant. There just needs to be one item in the list. 
57 | display_order = [''] 58 | display_label = ['Vocabulary'] # these are the section labels for the categories in the page 59 | display_comments = [''] # these are the comments about the category to be appended following the section labels 60 | display_id = ['Vocabulary'] # these are the fragment identifiers for the associated sections for the categories 61 | 62 | # --------------- 63 | # Load header data 64 | # --------------- 65 | 66 | config_file_path = 'process/document_metadata_processing/dwc_doc_tcr/' 67 | contributors_yaml_file = 'authors_configuration.yaml' 68 | document_configuration_yaml_file = 'document_configuration.yaml' 69 | 70 | if has_namespace: 71 | # Load the data about the namespace from term lists metadata at rs.tdwg.org 72 | term_lists_df = pd.read_csv(githubBaseUri + 'term-lists/term-lists.csv') 73 | # Find the row in the term-lists.csv file that corresponds to the database. 74 | term_list_row = term_lists_df.loc[term_lists_df['database'] == termLists[0]] 75 | # Extract the namespace IRI and preferred namespace prefix from the row. 76 | namespace_uri = term_list_row['vann_preferredNamespaceUri'].values[0] 77 | pref_namespace_prefix = term_list_row['vann_preferredNamespacePrefix'].values[0] 78 | 79 | ''' 80 | # Load the configuration file used in the metadata creation process. 81 | metadata_config_text = requests.get(githubBaseUri + 'process/config.yaml').text 82 | metadata_config = yaml.load(metadata_config_text, Loader=yaml.FullLoader) 83 | namespace_uri = metadata_config['namespaces'][0]['namespace_uri'] 84 | pref_namespace_prefix = metadata_config['namespaces'][0]['pref_namespace_prefix'] 85 | ''' 86 | 87 | # Load the contributors YAML file from its GitHub URL 88 | contributors_yaml_url = githubBaseUri + config_file_path + contributors_yaml_file 89 | contributors_yaml = requests.get(contributors_yaml_url).text 90 | if contributors_yaml == '404: Not Found': 91 | print('Contributors YAML file not found. 
Check the URL.') 92 | print(contributors_yaml_url) 93 | exit() 94 | contributors_yaml = yaml.load(contributors_yaml, Loader=yaml.FullLoader) 95 | 96 | # Load the document configuration YAML file from its GitHub URL 97 | document_configuration_yaml_url = githubBaseUri + config_file_path + document_configuration_yaml_file 98 | document_configuration_yaml = requests.get(document_configuration_yaml_url).text 99 | document_configuration_yaml = yaml.load(document_configuration_yaml, Loader=yaml.FullLoader) 100 | 101 | # --------------- 102 | # Function definitions 103 | # --------------- 104 | 105 | # replace URL with link 106 | # 107 | def createLinks(text): 108 | def repl(match): 109 | if match.group(1)[-1] == '.': 110 | return '' + match.group(1)[:-1] + '.' 111 | return '' + match.group(1) + '' 112 | 113 | pattern = '(https?://[^\s,;\)"]*)' 114 | result = re.sub(pattern, repl, text) 115 | return result 116 | 117 | # 2021-08-06 Replace the createLinks() function with functions copied from the QRG build script written by S. Van Hoey 118 | def convert_code(text_with_backticks): 119 | """Takes all back-quoted sections in a text field and converts it to 120 | the html tagged version of code blocks ... 121 | """ 122 | return re.sub(r'`([^`]*)`', r'\1', text_with_backticks) 123 | 124 | def convert_link(text_with_urls): 125 | """Takes all links in a text field and converts it to the html tagged 126 | version of the link 127 | """ 128 | def _handle_matched(inputstring): 129 | """quick hack version of url handling on the current prime versions data""" 130 | url = inputstring.group() 131 | return "{}".format(url, url) 132 | 133 | regx = "(http[s]?://[\w\d:#@%/;$()~_?\+-;=\\\.&]*)(?{% for example in examples %}
  • {{ example }}
  • {% endfor %}{% endif %} 140 | def convert_examples(text_with_list_of_examples: str) -> str: 141 | examples_list = text_with_list_of_examples.split('; ') 142 | if len(examples_list) == 1: 143 | return examples_list[0] 144 | else: 145 | output = '
      \n' 146 | for example in examples_list: 147 | output += '
    • ' + example + '
    • \n' 148 | output += '
    ' 149 | return output 150 | 151 | print('Retrieving term list metadata from GitHub') 152 | term_lists_info = [] 153 | 154 | frame = pd.read_csv(githubBaseUri + 'term-lists/term-lists.csv', na_filter=False) 155 | for termList in termLists: 156 | term_list_dict = {'list_iri': termList} 157 | term_list_dict = {'database': termList} 158 | for index,row in frame.iterrows(): 159 | if row['database'] == termList: 160 | term_list_dict['pref_ns_prefix'] = row['vann_preferredNamespacePrefix'] 161 | term_list_dict['pref_ns_uri'] = row['vann_preferredNamespaceUri'] 162 | term_list_dict['list_iri'] = row['list'] 163 | term_lists_info.append(term_list_dict) 164 | 165 | print('Retrieving metadata about terms from all namespaces from GitHub') 166 | # Create column list 167 | column_list = ['pref_ns_prefix', 'pref_ns_uri', 'term_localName', 'label', 'definition', 'usage', 'notes', 'term_modified', 'term_deprecated', 'type'] 168 | if vocab_type == 2: 169 | column_list += ['controlled_value_string'] 170 | elif vocab_type == 3: 171 | column_list += ['controlled_value_string', 'skos_broader'] 172 | if organized_in_categories: 173 | column_list.append('tdwgutility_organizedInClass') 174 | column_list.append('version_iri') 175 | 176 | # Create list of lists metadata table 177 | table_list = [] 178 | for term_list in term_lists_info: 179 | # retrieve versions metadata for term list 180 | versions_url = githubBaseUri + term_list['database'] + '-versions/' + term_list['database'] + '-versions.csv' 181 | versions_df = pd.read_csv(versions_url, na_filter=False) 182 | 183 | # retrieve current term metadata for term list 184 | data_url = githubBaseUri + term_list['database'] + '/' + term_list['database'] + '.csv' 185 | frame = pd.read_csv(data_url, na_filter=False) 186 | for index,row in frame.iterrows(): 187 | row_list = [term_list['pref_ns_prefix'], term_list['pref_ns_uri'], row['term_localName'], row['label'], row['definition'], row['usage'], row['notes'], row['term_modified'], 
row['term_deprecated'], row['type']] 188 | if vocab_type == 2: 189 | row_list += [row['controlled_value_string']] 190 | elif vocab_type == 3: 191 | if row['skos_broader'] =='': 192 | row_list += [row['controlled_value_string'], ''] 193 | else: 194 | row_list += [row['controlled_value_string'], term_list['pref_ns_prefix'] + ':' + row['skos_broader']] 195 | if organized_in_categories: 196 | row_list.append(row['tdwgutility_organizedInClass']) 197 | 198 | # Borrowed terms really don't have implemented versions. They may be lacking values for version_status. 199 | # In their case, their version IRI will be omitted. 200 | found = False 201 | for vindex, vrow in versions_df.iterrows(): 202 | if vrow['term_localName']==row['term_localName'] and vrow['version_status']=='recommended': 203 | found = True 204 | version_iri = vrow['version'] 205 | # NOTE: the current hack for non-TDWG terms without a version is to append # to the end of the term IRI 206 | if version_iri[len(version_iri)-1] == '#': 207 | version_iri = '' 208 | if not found: 209 | version_iri = '' 210 | row_list.append(version_iri) 211 | 212 | table_list.append(row_list) 213 | 214 | print('processing data') 215 | # Turn list of lists into dataframe 216 | terms_df = pd.DataFrame(table_list, columns = column_list) 217 | 218 | terms_sorted_by_label = terms_df.sort_values(by='label') 219 | 220 | # This makes sort case insensitive 221 | terms_sorted_by_localname = terms_df.iloc[terms_df.term_localName.str.lower().argsort()] 222 | 223 | print('done retrieving') 224 | print() 225 | 226 | print('Generating term index by CURIE') 227 | # generate the index of terms grouped by category and sorted alphabetically by lowercase term local name 228 | 229 | text = '### 3.1 Index By Term Name\n\n' 230 | text += '(See also [3.2 Index By Label](#32-index-by-label))\n\n' 231 | for category in range(0,len(display_order)): 232 | text += '**' + display_label[category] + '**\n' 233 | text += '\n' 234 | if organized_in_categories: 235 | 
filtered_table = terms_sorted_by_localname[terms_sorted_by_localname['tdwgutility_organizedInClass']==display_order[category]] 236 | filtered_table.reset_index(drop=True, inplace=True) 237 | else: 238 | filtered_table = terms_sorted_by_localname 239 | filtered_table.reset_index(drop=True, inplace=True) 240 | 241 | for row_index,row in filtered_table.iterrows(): 242 | curie = row['pref_ns_prefix'] + ":" + row['term_localName'] 243 | curie_anchor = curie.replace(':','_') 244 | text += '[' + curie + '](#' + curie_anchor + ')' 245 | if row_index < len(filtered_table) - 1: 246 | text += ' |' 247 | text += '\n' 248 | text += '\n' 249 | index_by_name = text 250 | 251 | text = '\n\n' 252 | 253 | text = '## 3 Term Index \n\n' 254 | #text += '(See also [3.1 Index By Term Name](#31-index-by-term-name))\n\n' 255 | for category in range(0,len(display_order)): 256 | if organized_in_categories: 257 | text += '**' + display_label[category] + '**\n' 258 | text += '\n' 259 | filtered_table = terms_sorted_by_label[terms_sorted_by_label['tdwgutility_organizedInClass']==display_order[category]] 260 | filtered_table.reset_index(drop=True, inplace=True) 261 | else: 262 | filtered_table = terms_sorted_by_label 263 | filtered_table.reset_index(drop=True, inplace=True) 264 | 265 | for row_index,row in filtered_table.iterrows(): 266 | if row_index == 0 or (row_index != 0 and row['label'] != filtered_table.iloc[row_index - 1].loc['label']): # this is a hack to prevent duplicate labels 267 | curie_anchor = row['pref_ns_prefix'] + "_" + row['term_localName'] 268 | text += '[' + row['label'] + '](#' + curie_anchor + ')' 269 | if row_index < len(filtered_table) - 2 or (row_index == len(filtered_table) - 2 and row['label'] != filtered_table.iloc[row_index + 1].loc['label']): 270 | text += ' |' 271 | text += '\n' 272 | text += '\n' 273 | index_by_label = text 274 | 275 | decisions_df = pd.read_csv('https://raw.githubusercontent.com/tdwg/rs.tdwg.org/master/decisions/decisions-links.csv', 
na_filter=False) 276 | 277 | # generate a table for each term, with terms grouped by category 278 | 279 | # generate the Markdown for the terms table 280 | text = '## 4 Vocabulary\n' 281 | for category in range(0,len(display_order)): 282 | if organized_in_categories: 283 | text += '### 4.' + str(category + 1) + ' ' + display_label[category] + '\n' 284 | text += '\n' 285 | text += display_comments[category] # insert the comments for the category, if any. 286 | filtered_table = terms_sorted_by_localname[terms_sorted_by_localname['tdwgutility_organizedInClass']==display_order[category]] 287 | filtered_table.reset_index(drop=True, inplace=True) 288 | else: 289 | filtered_table = terms_sorted_by_localname 290 | filtered_table.reset_index(drop=True, inplace=True) 291 | 292 | for row_index,row in filtered_table.iterrows(): 293 | text += '\n' 294 | curie = row['pref_ns_prefix'] + ":" + row['term_localName'] 295 | curieAnchor = curie.replace(':','_') 296 | text += '\t\n' 297 | text += '\t\t\n' 298 | text += '\t\t\t\n' 299 | text += '\t\t\n' 300 | text += '\t\n' 301 | text += '\t\n' 302 | text += '\t\t\n' 303 | text += '\t\t\t\n' 304 | uri = row['pref_ns_uri'] + row['term_localName'] 305 | text += '\t\t\t\n' 306 | text += '\t\t\n' 307 | text += '\t\t\n' 308 | text += '\t\t\t\n' 309 | text += '\t\t\t\n' 310 | text += '\t\t\n' 311 | 312 | if row['version_iri'] != '': 313 | text += '\t\t\n' 314 | text += '\t\t\t\n' 315 | text += '\t\t\t\n' 316 | text += '\t\t\n' 317 | 318 | text += '\t\t\n' 319 | text += '\t\t\t\n' 320 | text += '\t\t\t\n' 321 | text += '\t\t\n' 322 | 323 | if row['term_deprecated'] != '': 324 | text += '\t\t\n' 325 | text += '\t\t\t\n' 326 | text += '\t\t\t\n' 327 | text += '\t\t\n' 328 | 329 | text += '\t\t\n' 330 | text += '\t\t\t\n' 331 | text += '\t\t\t\n' 332 | text += '\t\t\n' 333 | 334 | if row['usage'] != '': 335 | text += '\t\t\n' 336 | text += '\t\t\t\n' 337 | text += '\t\t\t\n' 338 | text += '\t\t\n' 339 | 340 | if row['notes'] != '': 341 | text += 
'\t\t\n' 342 | text += '\t\t\t\n' 343 | text += '\t\t\t\n' 344 | text += '\t\t\n' 345 | 346 | if (vocab_type == 2 or vocab_type == 3) and row['controlled_value_string'] != '': # controlled vocabulary 347 | text += '\t\t\n' 348 | text += '\t\t\t\n' 349 | text += '\t\t\t\n' 350 | text += '\t\t\n' 351 | 352 | if vocab_type == 3 and row['skos_broader'] != '': # controlled vocabulary with skos:broader relationships 353 | text += '\t\t\n' 354 | text += '\t\t\t\n' 355 | curieAnchor = row['skos_broader'].replace(':','_') 356 | text += '\t\t\t\n' 357 | text += '\t\t\n' 358 | 359 | text += '\t\t\n' 360 | text += '\t\t\t\n' 361 | if row['type'] == 'http://www.w3.org/1999/02/22-rdf-syntax-ns#Property': 362 | text += '\t\t\t\n' 363 | elif row['type'] == 'http://www.w3.org/2000/01/rdf-schema#Class': 364 | text += '\t\t\t\n' 365 | elif row['type'] == 'http://www.w3.org/2004/02/skos/core#Concept': 366 | text += '\t\t\t\n' 367 | else: 368 | text += '\t\t\t\n' # this should rarely happen 369 | text += '\t\t\n' 370 | 371 | # Look up decisions related to this term 372 | for drow_index,drow in decisions_df.iterrows(): 373 | if drow['linked_affected_resource'] == uri: 374 | text += '\t\t\n' 375 | text += '\t\t\t\n' 376 | text += '\t\t\t\n' 377 | text += '\t\t\n' 378 | 379 | text += '\t\n' 380 | text += '
    Term Name ' + curie + '
    Term IRI' + uri + '
    Modified' + row['term_modified'] + '
    Term version IRI' + row['version_iri'] + '
    Label' + row['label'] + '
    This term is deprecated and should no longer be used.
    Definition' + row['definition'] + '
    Usage' + convert_link(convert_code(row['usage'])) + '
    Notes' + convert_link(convert_code(row['notes'])) + '
    Controlled value' + row['controlled_value_string'] + '
    Has broader concept' + row['skos_broader'] + '
    TypePropertyClassConcept' + row['type'] + '
    Executive Committee decisionhttp://rs.tdwg.org/decisions/' + drow['decision_localName'] + '
    \n' 381 | text += '\n' 382 | text += '\n' 383 | term_table = text 384 | 385 | print('done generating') 386 | print() 387 | 388 | #print(term_table) 389 | 390 | print('Merging term table with header and footer and saving file') 391 | text = index_by_label + term_table 392 | 393 | # read in header and footer, merge with terms table, and output 394 | 395 | headerObject = open(headerFileName, 'rt', encoding='utf-8') 396 | header = headerObject.read() 397 | headerObject.close() 398 | 399 | # Build the Markdown for the contributors list 400 | contributors = '' 401 | for contributor in contributors_yaml: 402 | contributors += '[' + contributor['contributor_literal'] + '](' + contributor['contributor_iri'] + ') ' 403 | contributors += '([' + contributor['affiliation'] + '](' + contributor['affiliation_uri'] + ')), ' 404 | contributors = contributors[:-2] # Remove the last comma and space 405 | 406 | # Substitute values of ratification_date and contributors into the header template 407 | header = header.replace('{document_title}', document_configuration_yaml['documentTitle']) 408 | header = header.replace('{ratification_date}', document_configuration_yaml['doc_modified']) 409 | header = header.replace('{created_date}', document_configuration_yaml['doc_created']) 410 | header = header.replace('{contributors}', contributors) 411 | header = header.replace('{standard_iri}', document_configuration_yaml['dcterms_isPartOf']) 412 | header = header.replace('{current_iri}', document_configuration_yaml['current_iri']) 413 | header = header.replace('{abstract}', document_configuration_yaml['abstract']) 414 | header = header.replace('{creator}', document_configuration_yaml['creator']) 415 | header = header.replace('{publisher}', document_configuration_yaml['publisher']) 416 | year = document_configuration_yaml['doc_modified'].split('-')[0] 417 | header = header.replace('{year}', year) 418 | if has_namespace: 419 | header = header.replace('{namespace_uri}', namespace_uri) 420 | 
header = header.replace('{pref_namespace_prefix}', pref_namespace_prefix) 421 | 422 | # Determine whether there was a previous version of the document. 423 | if document_configuration_yaml['doc_created'] != document_configuration_yaml['doc_modified']: 424 | # Load versions list from document versions data in the rs.tdwg.org repo and find most recent version. 425 | versions_data_url = githubBaseUri + 'docs/docs-versions.csv' 426 | versions_list_df = pd.read_csv(versions_data_url, na_filter=False) 427 | # Slice all rows for versions of this document. 428 | matching_versions = versions_list_df[versions_list_df['current_iri']==document_configuration_yaml['current_iri']] 429 | # Sort the matching versions by version IRI in descending order so that the most recent version is first. 430 | matching_versions = matching_versions.sort_values(by=['version_iri'], ascending=[False]) 431 | # The previous version is the second row in the dataframe (row 1). 432 | # The version IRI is in the second column (column 1). 433 | most_recent_version_iri = matching_versions.iat[1, 1] 434 | #print(most_recent_version_iri) 435 | 436 | # Insert the previous version information into the header 437 | previous_version_metadata_string = '''Previous version 438 | : <''' + most_recent_version_iri + '''> 439 | 440 | ''' 441 | # Insert the previous version information into the designated slot. 442 | header = header.replace('{previous_version_slot}\n\n', previous_version_metadata_string) 443 | else: 444 | # If there was no previous version, remove the slot from the header. 
445 | header = header.replace('{previous_version_slot}\n\n', '') 446 | 447 | 448 | footerObject = open(footerFileName, 'rt', encoding='utf-8') 449 | footer = footerObject.read() 450 | footerObject.close() 451 | 452 | output = header + text + footer 453 | outputObject = open(outFileName, 'wt', encoding='utf-8') 454 | outputObject.write(output) 455 | outputObject.close() 456 | 457 | print('done') 458 | -------------------------------------------------------------------------------- /build/termlist-footer.md: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tdwg/hc/92e0ed94afceeea6a2d8ceb559da37f450ad007c/build/termlist-footer.md -------------------------------------------------------------------------------- /build/termlist-header.md: -------------------------------------------------------------------------------- 1 | # {document_title} 2 | 3 | Title 4 | : {document_title} 5 | 6 | Date version issued 7 | : {ratification_date} 8 | 9 | Date created 10 | : {created_date} 11 | 12 | Part of TDWG Standard 13 | : <{standard_iri}> 14 | 15 | This version 16 | : <{current_iri}{ratification_date}> 17 | 18 | Latest version 19 | : <{current_iri}> 20 | 21 | {previous_version_slot} 22 | 23 | Abstract 24 | : {abstract} 25 | 26 | Contributors 27 | : {contributors} 28 | 29 | Creator 30 | : {creator} 31 | 32 | Bibliographic citation 33 | : {creator}. {year}. {document_title}. {publisher}. <{current_iri}{ratification_date}> 34 | 35 | ## 1 Introduction 36 | 37 | This document contains all former and current terms in the {ratification_date} version of the Humboldt Extension for Ecological Inventories vocabulary (). The vocabulary uses the namespace abbreviation `eco:` for `http://rs.tdwg.org/eco/terms/` and `ecoiri:` for `http://rs.tdwg.org/eco/iri/`. 38 | 39 | For a simplified list that contains only the currently recommended terms, see the Humboldt Extension Quick Reference Guide (). 
40 | 41 | ### 1.1 Status of the content of this document 42 | 43 | In Section 4, the values of `Term IRI` and `Definition` are normative. The values of `Term Name` are non-normative, although one can expect that the namespace abbreviation prefix is one commonly used for the term namespace. `Label` and the values of all other properties (such as `Notes` and `Examples`) are non-normative. 44 | 45 | ### 1.2 RFC 2119 key words 46 | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [BCP 14](https://www.rfc-editor.org/info/bcp14) [\[RFC 2119\]](https://datatracker.ietf.org/doc/html/rfc2119) and [\[RFC 8174\]](https://datatracker.ietf.org/doc/html/rfc8174) when, and only when, they appear in all capitals, as shown here. 47 | 48 | ## 2 Use of Terms 49 | 50 | The terms in this extension are meant to provide stable definitions that can be used in a variety of biodiversity inventory contexts but were envisioned principally to function together as an extension to Darwin Core. This vocabulary allows the reporting of detailed information about the inventory process, such as i\) a general description of the survey, ii\) where an inventory takes place and the habitat characteristics and environmental conditions of survey sites, iii\) when an inventory takes place, iv\) the target taxonomic group, life stages, growth forms, and degrees of establishment of the organisms sampled, v\) the methodology implemented (inventory type performed, protocol(s) used, absence reported, material samples or vouchers collected, non-target taxa reported), and vi\) the completeness of the inventory and the sampling effort applied. 51 | 52 | This extension allows the representation of complex, highly nested survey designs.
An ancillary document explaining how dwc:Event hierarchies for ecological inventories should be structured and providing guidance on the use of the terms in the context of parent and child dwc:Event(s) can be found at [http://rs.tdwg.org/dwc/doc/hierarchy/](https://tdwg.github.io/hc/hierarchy/). 53 | 54 | To assist in the interpretation of the term eco:isLeastSpecificTargetCategoryQuantityInclusive a detailed description of its use is provided at [http://rs.tdwg.org/dwc/doc/inclusive/](https://tdwg.github.io/hc/inclusive/). 55 | 56 | Terms that are expected to have Booleans as values should use controlled value strings from the TDWG Boolean Controlled Vocabulary at [http://rs.tdwg.org/tag/doc/boolean/](https://tag.tdwg.org/boolean/) when those values are serialized in text form. See also the [Best practices for serializing booleans](https://tag.tdwg.org/guides/boolean/) and the [Boolean Values Best Practices Reference](https://tag.tdwg.org/reference/boolean/). 57 | 58 | ## 3 Term index 59 | 60 | -------------------------------------------------------------------------------- /build/termlist-header_filled.md: -------------------------------------------------------------------------------- 1 | # Humboldt Extension Vocabulary List of Terms 2 | 3 | Title 4 | : Humboldt Extension Vocabulary List of Terms 5 | 6 | Namespace IRI: 7 | : http://rs.tdwg.org/eco/terms/ 8 | 9 | Preferred namespace abbreviation 10 | : eco: 11 | 12 | Date version issued 13 | : 2023-xx-xx 14 | 15 | Date created 16 | : 2023-xx-xx 17 | 18 | Part of TDWG Standard 19 | : 20 | 21 | This version 22 | : 23 | 24 | Latest version 25 | : 26 | 27 | Abstract 28 | : The Humboldt Extension for Ecological Inventories is a vocabulary for transmitting information about biological inventories. It is used along with Darwin Core terms to extend descriptions of events. This document lists all terms currently used in the vocabulary. 
29 | 30 | Contributors 31 | : fill in 32 | 33 | Creator 34 | : TDWG Humboldt Extension Task Group 35 | 36 | Bibliographic citation 37 | : TDWG Humboldt Extension Task Group. 2023. Humboldt Extension Vocabulary List of Terms. Biodiversity Information Standards (TDWG). 38 | 39 | ## 1 Introduction 40 | 41 | This document contains all versions of terms in the Humboldt Extension for Ecological Inventories vocabulary (). The vocabulary uses the namespace abbreviation `eco:`. 42 | 43 | For a simplified list that contains only the currently recommended terms, see the Humboldt Extension Quick Reference Guide (). 44 | 45 | ### 1.1 Status of the content of this document 46 | 47 | In Section 4, the values of `Term IRI` and `Definition` are normative. The values of `Term Name` are non-normative, although one can expect that the namespace abbreviation prefix is one commonly used for the term namespace. `Label` and the values of all other properties (such as `Notes` and `Examples`) are non-normative. 48 | 49 | ### 1.2 RFC 2119 key words 50 | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [BCP 14](https://www.rfc-editor.org/info/bcp14) [\[RFC 2119\]](https://datatracker.ietf.org/doc/html/rfc2119) and [\[RFC 8174\]](https://datatracker.ietf.org/doc/html/rfc8174) when, and only when, they appear in all capitals, as shown here. 51 | 52 | ## 2 Use of Terms 53 | 54 | The terms in this standard are meant to provide stable definitions that can be used in a variety of contexts, but were envisioned principally to function together as an extension to Darwin Core, where each core record may be annotated by ... .
55 | 56 | ## 3 Term index 57 | 58 | -------------------------------------------------------------------------------- /build/terms.tmpl: -------------------------------------------------------------------------------- 1 | {# 2 | This template is NOT used by jekyll, but by the build script 3 | to create the terms/index.md file, which mostly contains html. 4 | #} 5 | --- 6 | container: fluid 7 | --- 8 | 9 | # Humboldt Extension quick reference guide 10 | 11 | This document is intended to be an easy-to-read reference to the currently recommended terms that extend the [Darwin Core standard](https://www.tdwg.org/standards/dwc/) with vocabulary to describe biological inventories. This document is not part of the standard. It draws on the [term names and definitions](../list/) from the normative part of the standard and combines them with comments and examples that are not normative, but that are meant to help people to use the terms consistently. The category to which all of the terms in this extension correspond is the Darwin Core Event (dwc:Event) class. Comprehensive metadata for current and obsolete terms in human readable form are found in a [list of terms document](../list/). CSV files with the [full history](https://github.com/tdwg/hc/blob/master/vocabulary/term_versions.csv) of the terms, with [horizontal and vertical lists](https://github.com/tdwg/hc/tree/master/dist) of these terms and the schema for the [Darwin Core Archive extension](https://github.com/tdwg/hc/tree/master/dist) can be found in the [Humboldt Extension repository](https://github.com/tdwg/hc). 12 | 13 | {% for class_group in class_groups %} 14 | 15 | ## {{ class_group.label }} 16 | 17 | {% if class_group.label == 'UseWithIRI' %} 18 | For more information on `UseWithIRI`, see [Section 2.5 of the RDF Guide](https://dwc.tdwg.org/rdf/#25-terms-in-the-dwciri-namespace-normative). 19 | {% endif %} 20 |
    21 | {% for term in class_group.terms %} 22 | {{ term.label }} 23 | {% endfor %} 24 |
    25 | 26 | {% if class_group.label != 'Site' and class_group.label != 'Habitat Scope' and class_group.label != 'Temporal Scope' and class_group.label != 'Taxonomic Scope' and class_group.label != 'Organismal Scope' and class_group.label != 'Identification' and class_group.label != 'Methodology Description' and class_group.label != 'Material Collected' and class_group.label != 'Sampling Effort' and class_group.label != 'UseWithIRI' %} 27 | {# Class #} 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 |
    {{ class_group.label }} Class
    Identifier{{ class_group.iri }}
    Definition{{ class_group.definition }}
    Comments{{ class_group.comments }}
    Examples{{ class_group.examples }}
    37 | {%endif %} 38 | 39 | {% for term in class_group.terms %} 40 | {# Term #} 41 | 44 | {% set examples = term.examples.split("; ") %} 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 53 |
    {{ term.label }} Property
    Identifier{{ term.iri }}
    Definition{{ term.definition }}
    Comments{{ term.comments }}
    Examples{% if examples | length == 1 %}{{ examples | first }}{% else %}
      {% for example in examples %}
    • {{ example }}
    • {% endfor %}
    {% endif %}
    54 | {% endfor %} 55 | 56 | {% endfor %} 57 | -------------------------------------------------------------------------------- /build/update_previous_doc.py: -------------------------------------------------------------------------------- 1 | # Script to make the current document be the previous document 2 | # This program is released under a GNU General Public License v3.0 http://www.gnu.org/licenses/gpl-3.0 3 | # Author: Steve Baskauf 4 | 5 | script_version = '0.1.1' 6 | version_modified = '2024-03-03' 7 | 8 | # NOTE: This script should be run only after the script updating the machine-readable metadata has been run. 9 | # It must be run before the script that generates the new document version. 10 | 11 | import requests 12 | import pandas as pd 13 | import yaml 14 | import os 15 | import sys 16 | 17 | # ----------------- 18 | # Command line arguments 19 | # ----------------- 20 | 21 | arg_vals = sys.argv[1:] 22 | opts = [opt for opt in arg_vals if opt.startswith('-')] 23 | args = [arg for arg in arg_vals if not arg.startswith('-')] 24 | 25 | # Name of the last part of the URL of the doc 26 | if '--slug' in opts: 27 | document_slug = args[opts.index('--slug')] 28 | else: 29 | print('Must specify URL slug for document using --slug option') 30 | print('For example, if the permanent URL is "http://rs.tdwg.org/dwc/doc/eco/", the slug is "eco".') 31 | exit() 32 | 33 | # Used as the directory name 34 | if '--dir' in opts: 35 | directory_name = args[opts.index('--dir')] 36 | else: 37 | print('Must specify name of directory containing template and configs using --dir option') 38 | print('For example, if the path to the templates in the rs.tdwg.org repo') 39 | print('is "process/document_metadata_processing/dwc_doc_eco/", the directory name is "dwc_doc_eco".') 40 | exit() 41 | 42 | # "master" for production, something else for development 43 | # Example: First part of branch URL is "https://raw.githubusercontent.com/tdwg/rs.tdwg.org/eco/", branch is "eco". 
44 | if '--branch' in opts: 45 | github_branch = args[opts.index('--branch')] 46 | else: 47 | github_branch = 'master' 48 | 49 | 50 | # ----------------- 51 | # Configuration section 52 | # ----------------- 53 | 54 | githubBaseUri = 'https://raw.githubusercontent.com/tdwg/rs.tdwg.org/' + github_branch + '/' 55 | 56 | config_file_path = 'process/document_metadata_processing/' + directory_name + '/' 57 | document_configuration_yaml_file = 'document_configuration.yaml' 58 | 59 | path_of_doc_relative_to_build_dir = '../docs/' + document_slug + '/' 60 | 61 | # Load the document configuration YAML file from its GitHub URL 62 | document_configuration_yaml_url = githubBaseUri + config_file_path + document_configuration_yaml_file 63 | document_configuration_yaml = requests.get(document_configuration_yaml_url).text 64 | document_configuration_yaml = yaml.load(document_configuration_yaml, Loader=yaml.FullLoader) 65 | 66 | # Determine date of the document that is to be turned into the previous document and the version IRI 67 | # of the most recent version of that document. 68 | 69 | # Load versions list from document versions data in the rs.tdwg.org repo and find most recent version. 70 | versions_data_url = githubBaseUri + 'docs-versions/docs-versions.csv' 71 | versions_list_df = pd.read_csv(versions_data_url, na_filter=False) 72 | 73 | # Slice all rows for versions of this document. 74 | matching_versions = versions_list_df[versions_list_df['current_iri']==document_configuration_yaml['current_iri']] 75 | # Sort the matching versions by version IRI in descending order so that the most recent version is first. 76 | matching_versions = matching_versions.sort_values(by=['version_iri'], ascending=[False]) 77 | 78 | # Check for the error condition of there being no matching versions. 79 | if len(matching_versions.index) == 0: 80 | print('There are no versions of this document. 
Did you run the script to update the document metadata?') 81 | exit() 82 | 83 | # If there is only one row in the matching_versions dataframe (only one version), then the rest of the script should not be run. 84 | if len(matching_versions.index) == 1: 85 | print('There is only one version of this document. No changes are being made to the documents.') 86 | exit() 87 | 88 | # The most recent version is the first row in the dataframe (row 0). 89 | 90 | # Find the column index of the column named "version_iri". 91 | version_iri_column_index = matching_versions.columns.get_loc('version_iri') 92 | most_recent_version_iri = matching_versions.iat[0, version_iri_column_index] 93 | print(most_recent_version_iri) 94 | 95 | # Find the date of the previous version, which is in the second row of the dataframe (row 1). 96 | # Find the column index of the column named "version_issued". 97 | version_issued_column_index = matching_versions.columns.get_loc('version_issued') 98 | previous_version_date = matching_versions.iat[1, version_issued_column_index] 99 | print(previous_version_date) 100 | 101 | # The document to be converted is named "index.md". Its name must be changed to the date of the previous version. 102 | os.rename(path_of_doc_relative_to_build_dir + 'index.md', path_of_doc_relative_to_build_dir + previous_version_date + '.md') 103 | 104 | # Open the renamed file and read its text. 105 | with open(path_of_doc_relative_to_build_dir + previous_version_date + '.md', 'rt') as file_object: 106 | file_text = file_object.read() 107 | 108 | # Build the "Replaced by" metadata entry that points to the most recent version. 109 | replacement_version_metadata_string = '''Replaced by 110 | : <''' + most_recent_version_iri + '''> 111 | 112 | ''' 113 | 114 | # Insert the replacement version information into the header above the Abstract section. 115 | header = file_text.replace('Abstract\n:', replacement_version_metadata_string + 'Abstract\n:') 116 | 117 | # Write the updated file text to the file.
118 | with open(path_of_doc_relative_to_build_dir + previous_version_date + '.md', 'wt') as file_object: 119 | file_object.write(header) 120 | -------------------------------------------------------------------------------- /dist/simple_eco_horizontal.csv: -------------------------------------------------------------------------------- 1 | siteCount,siteNestingDescription,verbatimSiteDescriptions,verbatimSiteNames,geospatialScopeAreaValue,geospatialScopeAreaUnit,totalAreaSampledValue,totalAreaSampledUnit,reportedWeather,reportedExtremeConditions,targetHabitatScope,excludedHabitatScope,eventDurationValue,eventDurationUnit,targetTaxonomicScope,excludedTaxonomicScope,taxonCompletenessReported,taxonCompletenessProtocols,isTaxonomicScopeFullyReported,isAbsenceReported,absentTaxa,hasNonTargetTaxa,nonTargetTaxa,areNonTargetTaxaFullyReported,targetLifeStageScope,excludedLifeStageScope,isLifeStageScopeFullyReported,targetDegreeOfEstablishmentScope,excludedDegreeOfEstablishmentScope,isDegreeOfEstablishmentScopeFullyReported,targetGrowthFormScope,excludedGrowthFormScope,isGrowthFormScopeFullyReported,hasNonTargetOrganisms,verbatimTargetScope,compilationTypes,compilationSourceTypes,inventoryTypes,protocolNames,protocolDescriptions,protocolReferences,isAbundanceReported,isAbundanceCapReported,abundanceCap,isVegetationCoverReported,isLeastSpecificTargetCategoryQuantityInclusive,hasVouchers,voucherInstitutions,hasMaterialSamples,materialSampleTypes,samplingPerformedBy,isSamplingEffortReported,samplingEffortProtocol,samplingEffortValue,samplingEffortUnit,absentTaxa,compilationSourceTypes,compilationTypes,eventDurationUnit,excludedDegreeOfEstablishmentScope,excludedGrowthFormScope,excludedHabitatScope,excludedLifeStageScope,excludedTaxonomicScope,geospatialScopeAreaUnit,inventoryTypes,materialSampleTypes,nonTargetTaxa,protocolNames,samplingEffortProtocol,samplingEffortUnit,samplingPerformedBy,targetDegreeOfEstablishmentScope,targetGrowthFormScope,targetHabitatScope,targetLifeStageSc
ope,targetTaxonomicScope,taxonCompletenessProtocols 2 | -------------------------------------------------------------------------------- /dist/simple_eco_vertical.csv: -------------------------------------------------------------------------------- 1 | siteCount 2 | siteNestingDescription 3 | verbatimSiteDescriptions 4 | verbatimSiteNames 5 | geospatialScopeAreaValue 6 | geospatialScopeAreaUnit 7 | totalAreaSampledValue 8 | totalAreaSampledUnit 9 | reportedWeather 10 | reportedExtremeConditions 11 | targetHabitatScope 12 | excludedHabitatScope 13 | eventDurationValue 14 | eventDurationUnit 15 | targetTaxonomicScope 16 | excludedTaxonomicScope 17 | taxonCompletenessReported 18 | taxonCompletenessProtocols 19 | isTaxonomicScopeFullyReported 20 | isAbsenceReported 21 | absentTaxa 22 | hasNonTargetTaxa 23 | nonTargetTaxa 24 | areNonTargetTaxaFullyReported 25 | targetLifeStageScope 26 | excludedLifeStageScope 27 | isLifeStageScopeFullyReported 28 | targetDegreeOfEstablishmentScope 29 | excludedDegreeOfEstablishmentScope 30 | isDegreeOfEstablishmentScopeFullyReported 31 | targetGrowthFormScope 32 | excludedGrowthFormScope 33 | isGrowthFormScopeFullyReported 34 | hasNonTargetOrganisms 35 | verbatimTargetScope 36 | compilationTypes 37 | compilationSourceTypes 38 | inventoryTypes 39 | protocolNames 40 | protocolDescriptions 41 | protocolReferences 42 | isAbundanceReported 43 | isAbundanceCapReported 44 | abundanceCap 45 | isVegetationCoverReported 46 | isLeastSpecificTargetCategoryQuantityInclusive 47 | hasVouchers 48 | voucherInstitutions 49 | hasMaterialSamples 50 | materialSampleTypes 51 | samplingPerformedBy 52 | isSamplingEffortReported 53 | samplingEffortProtocol 54 | samplingEffortValue 55 | samplingEffortUnit 56 | absentTaxa 57 | compilationSourceTypes 58 | compilationTypes 59 | eventDurationUnit 60 | excludedDegreeOfEstablishmentScope 61 | excludedGrowthFormScope 62 | excludedHabitatScope 63 | excludedLifeStageScope 64 | excludedTaxonomicScope 65 | 
geospatialScopeAreaUnit 66 | inventoryTypes 67 | materialSampleTypes 68 | nonTargetTaxa 69 | protocolNames 70 | samplingEffortProtocol 71 | samplingEffortUnit 72 | samplingPerformedBy 73 | targetDegreeOfEstablishmentScope 74 | targetGrowthFormScope 75 | targetHabitatScope 76 | targetLifeStageScope 77 | targetTaxonomicScope 78 | taxonCompletenessProtocols 79 | -------------------------------------------------------------------------------- /docs/CNAME: -------------------------------------------------------------------------------- 1 | eco.tdwg.org -------------------------------------------------------------------------------- /docs/_config.yml: -------------------------------------------------------------------------------- 1 | # SITE SETTINGS 2 | title: Humboldt Extension for Ecological Inventories 3 | description: Vocabulary maintained by the Darwin Core Maintenance Interest Group to facilitate the sharing of information about biological inventories. 4 | url: "https://tdwg.github.io/hc/" 5 | 6 | # THEME SETTINGS 7 | theme: minima 8 | remote_theme: tdwg/petridish 9 | github_edit: false 10 | logo: /assets/theme/images/tdwg-logo-short.svg 11 | 12 | # BUILD SETTINGS 13 | markdown: kramdown 14 | plugins: 15 | - jekyll-feed 16 | - jekyll-sitemap 17 | exclude: 18 | - README.md 19 | - Gemfile 20 | - Gemfile.lock 21 | - LICENSE 22 | 23 | # FRONTMATTER DEFAULTS 24 | defaults: 25 | - scope: 26 | path: "" 27 | values: 28 | layout: default 29 | toc: true 30 | -------------------------------------------------------------------------------- /docs/_data/footer.yml: -------------------------------------------------------------------------------- 1 | # Footer content is organized in columns, with the first one reserved for social icons (defined in _config.yml). 2 | # You can also add a small print license statement at the bottom. 
3 | 4 | # Columns (the more you add, the narrower they will be) 5 | columns: 6 | - links: 7 | - text: Biodiversity Information Standards (TDWG) 8 | href: https://www.tdwg.org/ 9 | 10 | # Small print license statement to add at the bottom of the footer. Can be Markdown 11 | # Will be prefixed by "© {{ site.author }}" if defined in _config.yml 12 | license: > 13 | Content on this site, made open by [Biodiversity Information Standards (TDWG)](https://www.tdwg.org/) 14 | is licensed under a [Creative Commons Attribution 4.0 International License](https://creativecommons.org/licenses/by/4.0/). 15 | 16 | -------------------------------------------------------------------------------- /docs/_data/navigation.yml: -------------------------------------------------------------------------------- 1 | # Links listed below will be included in your site's navbar (navigation at the top) 2 | 3 | - text: Home 4 | href: / 5 | - text: Terms 6 | menu: 7 | - text: Darwin Core 8 | href: https://dwc.tdwg.org/list/ 9 | - text: "---" 10 | - text: Humboldt Extension 11 | href: list/ 12 | - text: "---" 13 | - text: Taxon Completeness Reported Controlled Vocabulary 14 | href: tcr/ 15 | - text: Guides 16 | menu: 17 | - text: Darwin Core Quick Reference 18 | href: https://dwc.tdwg.org/terms/ 19 | - text: "---" 20 | - text: Humboldt Extension Quick Reference 21 | href: terms/ 22 | - text: "---" 23 | - text: isLeastSpecificTargetCategoryQuantityInclusive Guidelines 24 | href: inclusive/ 25 | - text: Hierarchical Events Guidelines 26 | href: hierarchy/ 27 | - text: "---" 28 | - text: Questions & Answers 29 | href: https://github.com/tdwg/dwc-qa/blob/master/README.md 30 | - text: GitHub 31 | href: https://github.com/tdwg/hc 32 | -------------------------------------------------------------------------------- /docs/_sass/_custom.scss: -------------------------------------------------------------------------------- 1 | // Custom styling 2 | 3 | .content { 4 | table { 5 | td:first-of-type { 6 | width: 
120px; // Label column, long words will still push this wider 7 | } 8 | 9 | .list-group-item { 10 | padding: 0.5rem 0; // Examples 11 | } 12 | } 13 | } 14 | -------------------------------------------------------------------------------- /docs/hierarchy/fig1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tdwg/hc/92e0ed94afceeea6a2d8ceb559da37f450ad007c/docs/hierarchy/fig1.png -------------------------------------------------------------------------------- /docs/hierarchy/fig2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tdwg/hc/92e0ed94afceeea6a2d8ceb559da37f450ad007c/docs/hierarchy/fig2.png -------------------------------------------------------------------------------- /docs/hierarchy/fig3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tdwg/hc/92e0ed94afceeea6a2d8ceb559da37f450ad007c/docs/hierarchy/fig3.png -------------------------------------------------------------------------------- /docs/hierarchy/index.md: -------------------------------------------------------------------------------- 1 | # Properties of hierarchical events in the Humboldt Extension for Ecological Inventories 2 | 3 | Title 4 | : Properties of hierarchical events in the Humboldt Extension for Ecological Inventories 5 | 6 | Date version issued 7 | : 2024-02-28 8 | 9 | Date created 10 | : 2024-02-28 11 | 12 | Part of TDWG Standard 13 | : 14 | 15 | This version 16 | : 17 | 18 | Latest version 19 | : 20 | 21 | Abstract 22 | : Ecological inventories in the context of Darwin Core can be considered as types of dwc:Events with the potential for hierarchical structure relating broader parent dwc:Events with narrower child dwc:Events. Terms in the Humboldt Extension are all properties of a dwc:Event. 
This document explains how dwc:Event hierarchies for ecological inventories should be structured and provides guidance on the use of Humboldt Extension terms in the context of parent and child dwc:Events. 23 | 24 | Contributors 25 | : [Yi-Ming Gan](https://orcid.org/0000-0001-7087-2646) ([Royal Belgian Institute of Natural Sciences](http://www.wikidata.org/entity/Q16665660)), [Wesley M. Hochachka](https://orcid.org/0000-0002-0595-7827) ([Cornell Lab of Ornithology](http://www.wikidata.org/entity/Q2997535)), [John Wieczorek](https://orcid.org/0000-0003-1144-0290) ([VertNet](http://www.wikidata.org/entity/Q98382028)), [Yanina V. Sica](https://orcid.org/0000-0002-1720-0127) ([Yale University](http://www.wikidata.org/entity/Q49112)), [Peter Brenton](https://orcid.org/0000-0001-9730-8340) ([Atlas of Living Australia, CSIRO](http://www.wikidata.org/entity/Q16335177)), [Robert D. Stevenson](https://orcid.org/0000-0003-1617-5895) ([Department of Biology, University of Massachusetts Boston](http://www.wikidata.org/entity/Q15144)), [Anahita J. N. Kazem](https://orcid.org/0000-0003-2475-132X) ([German Centre for Integrative Biodiversity Research, Leipzig and Friedrich Schiller University, Jena](http://www.wikidata.org/entity/Q1206134)), [Steven J. Baskauf](https://orcid.org/0000-0003-4365-3135) ([Vanderbilt University Libraries](http://www.wikidata.org/entity/Q16849893)), [Zachary R. Kachian](https://orcid.org/0000-0002-0500-0339) ([Keller Science Action Center, Field Museum of Natural History](http://www.wikidata.org/entity/Q1122595)), [Kate Ingenloff](https://orcid.org/0000-0001-5942-9053) ([Global Biodiversity Information Facility (GBIF)](http://www.wikidata.org/entity/Q1531570)) 26 | 27 | Creator 28 | : TDWG Humboldt Extension Task Group 29 | 30 | Bibliographic citation 31 | : TDWG Humboldt Extension Task Group. 2024. Properties of hierarchical events in the Humboldt Extension for Ecological Inventories. Biodiversity Information Standards (TDWG). 
32 | 33 | 34 | ## 1 Introduction (non-normative) 35 | 36 | ### 1.1 Status of the content of this document 37 | 38 | Section 3 of this document is normative, serving as official guidelines 39 | for applying the Humboldt Extension. The other sections are 40 | non-normative and designed to improve overall understanding of the 41 | application and interpretation of the Extension. 42 | 43 | ### 1.2 RFC 2119 keywords 44 | 45 | 46 | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", 47 | "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to 48 | be interpreted as described in [BCP 14](https://datatracker.ietf.org/doc/html/bcp14) 49 | [[RFC2119]](https://datatracker.ietf.org/doc/html/rfc2119) 50 | [[RFC8174]](https://datatracker.ietf.org/doc/html/rfc8174) 51 | when, and only when, they are written in capitals (as shown here). 52 | 53 | ### 1.3 Namespaces and terminology 54 | 55 | The namespace `eco:` abbreviates terms minted for the Humboldt Extension 56 | for ecological inventories 57 | ([http://rs.tdwg.org/eco/terms/](http://rs.tdwg.org/eco/terms/)). 58 | `dwc:` abbreviates terms from the main Darwin Core vocabulary namespace 59 | ([http://rs.tdwg.org/dwc/terms/](http://rs.tdwg.org/dwc/terms/)). 60 | 61 | Words in `code markup` are term IRIs or literal values. The word 62 | "organism" is used colloquially and is not used in the technical sense 63 | of the dwc:Organism class, unless specifically presented as 64 | "dwc:Organism." The word "Event" is used in the technical sense of the 65 | dwc:Event class. "Humboldt Extension" is an abbreviation for the 66 | "Humboldt Extension for Ecological Inventories." 67 | 68 | ### 1.4 Intended audience and use for this document 69 | 70 | The information in this document is targeted at data providers, data 71 | aggregators, and data consumers.
*Data providers* are the individuals 72 | responsible for mapping ecological inventory data into an Event-based 73 | [Darwin Core 74 | Archive](https://ipt.gbif.org/manual/en/ipt/latest/dwca-guide) 75 | format that includes the Humboldt Extension. *Data aggregators* and 76 | *data consumers* can use this document to better understand the data 77 | shared by data providers, specifically with respect to the 78 | **relationships between hierarchical dwc:Event levels** and **when it is 79 | or is not appropriate to make inferences** about attributes such as 80 | abundance or absence of detection. 81 | 82 | 83 | ## 2 Rationale (non-normative) 84 | 85 | Ecological inventories in the context of Darwin Core can be considered 86 | as types of [dwc:Events](http://rs.tdwg.org/dwc/terms/Event) 87 | --- they are actions that occur at specific locations over defined 88 | periods of time. The terms in the Humboldt Extension are all properties 89 | of a dwc:Event. 90 | 91 | There are many types of ecological inventory, ranging from singular 92 | observations of individual taxa (1 event:1 observation; Example 1 in 93 | Figure 1) to highly structured and deeply nested observations within 94 | other observations (e.g., 1 event:2 sub-events, each sub-event:2 95 | sub-sub-events; Example 4 in Figure 1). The need for guidance on **how 96 | to capture the details of nested observations** (dwc:Event hierarchies) 97 | is the rationale for this document. Nested sampling designs can be 98 | translated into a relational database schema of parent-child dwc:Event 99 | relationships (a parent event with one or more child sub-events; Figure 100 | 1). This document describes the circumstances under which specific 101 | properties of parent and child dwc:Events SHOULD be populated based on 102 | the parent-child relationship. 103 | 104 | Note that the proposed structure for sharing ecological inventories does 105 | not follow typical database practice. 
Whilst a (relational) database 106 | would store information in multiple tables to avoid repetition of key 107 | information, datasets shared using the Darwin Core archive format and 108 | the Humboldt Extension instead use a "flattened" structure. In order to 109 | share inventory data such that no information is lost and no information 110 | is incorrectly inferred, one SHOULD **report all information at all 111 | applicable levels**. The rules for applicability and how to populate 112 | terms at parent and child levels in the dwc:Event hierarchy are captured 113 | in section *3.2 Guiding principles* and in section *3.3 Implementation principles*. 114 | 115 | 116 | ![Illustration of four examples of nested dwc:Events](fig1.png) 117 | 118 | **Figure 1.** Visual representation of an ecological inventory 119 | illustrating four examples of occurrence data associated with dwc:Events 120 | nested within parent dwc:Events, at varying levels of complexity ranging 121 | from low (Example 1) to high (Example 4). 122 | 123 | 124 | ## 3 Usage guidelines (normative) 125 | 126 | ### 3.1 Definitions 127 | 128 | **Inventory dataset** - An inventory (dataset) consists of one or more 129 | dwc:Events that MAY be related to each other in a hierarchy of parent 130 | and child dwc:Events. This is not new to the capabilities or intentions 131 | of Darwin Core. 132 | 133 | **Inventory hierarchy** - A set of related dwc:Events, in which a 134 | narrower dwc:Event (child) points to the related broader dwc:Event 135 | (parent) via the child's dwc:parentEventID. A higher-level dwc:Event 136 | generally contains information about the inventory design that applies 137 | to all of its children. 138 | 139 | **Parent dwc:Event** - A parent dwc:Event is any dwc:Event whose 140 | dwc:eventID is a dwc:parentEventID for at least one other dwc:Event 141 | (e.g. EVENT_01 in Figure 2). 
142 | 143 | **Child dwc:Event** - A child dwc:Event is any dwc:Event whose 144 | dwc:parentEventID is populated with the dwc:eventID of another dwc:Event 145 | (e.g. EVENT_02 or EVENT_03 in Figure 2). 146 | 147 | ![Visual representation of parent/child relationship](fig2.png) 148 | 149 | **Figure 2.** Visual representation of an inventory hierarchy 150 | illustrating parent-child dwc:Event relations. The higher-level (parent) 151 | dwc:Event, EVENT_01, may include general information about the inventory 152 | design. Species occurrences are captured for two child dwc:Events 153 | (EVENT_02 and EVENT_03). 154 | 155 | 156 | ## 3.2 Guiding principles 157 | 158 | 159 | ### 3.2.1 Principle of spatiotemporal coverage 160 | 161 | **A parent dwc:Event MUST encompass its child dwc:Events spatially 162 | and temporally.** Specifically, the spatial extent and temporal 163 | interval of a parent dwc:Event MUST contain the spatial extents and 164 | temporal intervals of all of its children. For example, if child 165 | dwc:Events took place in various locations throughout, and only within, 166 | Burundi, then the spatial extent of the parent dwc:Event would be 167 | Burundi. Similarly, if the child dwc:Events took place periodically 168 | throughout the year 2019, the temporal interval of the parent dwc:Event 169 | would begin when the earliest child dwc:Event began and end when the 170 | latest child dwc:Event ended. 171 | 172 | 173 | ### 3.2.2 Principle of applicability 174 | 175 | **Humboldt Extension terms SHOULD contain data explicitly at every level 176 | in the dwc:Event hierarchy to which they *directly* apply.** The value 177 | of a term for a dwc:Event SHOULD be populated for the Event itself 178 | rather than merely summarized in a higher-level dwc:Event. For example, 179 | a child dwc:Event (**C**) with multiple dwc:Occurrences, some of which 180 | resulted in voucher specimens, SHOULD possess a value of `true` for 181 | the term eco:hasVouchers. 
The data user SHOULD NOT be expected to look 182 | at the eco:hasVouchers term for the parent dwc:Event (**P**) of **C** in 183 | order to find the value. 184 | 185 | If a term genuinely applies at multiple levels of a dwc:Event 186 | hierarchy, values SHOULD be reported explicitly at *each* of those 187 | levels. The values for child dwc:Events might be the same as their 188 | parental values, or child dwc:Events might possess their own more 189 | specific values. This principle allows child dwc:Events to be 190 | "autonomous" to the greatest degree possible, and avoids uncertainty 191 | about where to look for the values of properties of any given dwc:Event. 192 | 193 | 194 | ### 3.2.3 Principle of non-derivation 195 | 196 | As a complement to the *Principle of applicability*, **Humboldt 197 | Extension terms SHOULD NOT be populated by deriving or summarizing 198 | information from child dwc:Events to their common parent dwc:Event**. If 199 | a term does not directly apply to a given level of dwc:Event (i.e., it 200 | is not an actual property of that dwc:Event), it SHOULD NOT be populated 201 | with a value. For example, if the parent dwc:Event **P** from the 202 | example in section *3.2.2* above is not directly linked to 203 | dwc:Occurrences, then the term eco:hasVouchers does not apply at that 204 | dwc:Event level and SHOULD be left unpopulated. Data providers SHOULD 205 | NOT construct a value for a parent dwc:Event from values at the level of 206 | child dwc:Events. 207 | 208 | In some cases, including the example above, it would not be valid to 209 | derive or summarize information from child dwc:Events to populate a 210 | parent dwc:Event. Suppose parent dwc:Event **P** has two child 211 | dwc:Events, one with eco:hasVouchers `true` and one with 212 | eco:hasVouchers `false`.
The value of eco:hasVouchers for **P** cannot 213 | be derived or summarized from its children, as it is neither `true` 214 | nor `false` for all of them (the only two values consistent with the 215 | recommended controlled vocabulary for the term). It would be neither 216 | desirable nor reliable to use the values of the child dwc:Events to 217 | infer a value for the parent dwc:Event. The *Principle of inference* 218 | (below) provides a further example, where *scope* terms of parent 219 | dwc:Events MUST NOT be populated by summarizing from lower levels 220 | (either through the scope values of child dwc:Events or, for example, 221 | through taxa detected in child dwc:Events). 222 | 223 | There are terms which could theoretically be populated for a parent 224 | dwc:Event from the primary data already provided for that dwc:Event\'s 225 | children (e.g., eco:materialSampleTypes). Populating the parent term 226 | could facilitate the discovery of higher-level dwc:Events among whose 227 | children there is a particular value of a property (e.g., a search 228 | through the highest-level dwc:Events in datasets to find datasets in 229 | which there are particular eco:materialSampleTypes). However, providing 230 | such summary values is specifically NOT RECOMMENDED. Doing so a\) adds no 231 | information to the dataset (the summary information is already available 232 | by inspecting the primary data in the dwc:Events in the dataset), b\) 233 | adds an extra burden of summary upon the data provider, and c\) is 234 | susceptible to errors (ambiguities, inconsistencies, incompleteness) 235 | when trying to construct secondary summary information for higher-level 236 | Events. 
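The eco:hasVouchers example in this section can be turned into a simple review aid. The following is a minimal Python sketch, not part of the standard: the record layout (plain dictionaries keyed by `eventID` and `hasVouchers`) and the function name `find_derived_has_vouchers` are assumptions made purely for illustration. It lists dwc:Events for which eco:hasVouchers is populated even though no dwc:Occurrence is directly linked to them.

```python
def find_derived_has_vouchers(events, occurrences):
    """Return eventIDs whose hasVouchers value is populated although
    no occurrence record is directly linked to that event.

    Hypothetical helper: the dict keys used here are illustrative only.
    """
    # eventIDs that have at least one directly linked occurrence
    linked = {occ["eventID"] for occ in occurrences}
    flagged = []
    for event in events:
        value = event.get("hasVouchers")
        populated = value is not None and value != ""
        if populated and event["eventID"] not in linked:
            flagged.append(event["eventID"])
    return flagged


# Parent P has no direct occurrences, yet carries a hasVouchers value.
events = [
    {"eventID": "P", "parentEventID": "", "hasVouchers": "true"},
    {"eventID": "C1", "parentEventID": "P", "hasVouchers": "true"},
    {"eventID": "C2", "parentEventID": "P", "hasVouchers": "false"},
]
occurrences = [
    {"occurrenceID": "occ_01", "eventID": "C1"},
    {"occurrenceID": "occ_02", "eventID": "C2"},
]
flagged = find_derived_has_vouchers(events, occurrences)
print(flagged)
```

Because the guidance is expressed with SHOULD NOT, a flagged dwc:Event is a prompt for the data provider to check where the value came from, not proof of an error.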
237 | 238 | 239 | ### 3.2.4 Principle of inference 240 | 241 | **Certain terms in the Humboldt Extension support inferences.** Examples 242 | of terms that help data users to determine whether or not inferences can 243 | be made include those describing the *scope* of the inventory, such as 244 | eco:targetTaxonomicScope and eco:excludedTaxonomicScope, and terms 245 | describing *completeness*, such as eco:taxonCompletenessReported, 246 | eco:taxonCompletenessProtocols and eco:isTaxonomicScopeFullyReported. 247 | The values of these terms in a dwc:Event have implications for the 248 | interpretation of all of that dwc:Event's child dwc:Events. These terms 249 | MUST be populated for the highest level dwc:Event to which they apply, 250 | and all of its child dwc:Events. 251 | 252 | **The *scope* terms of a dwc:Event MUST be populated whenever the scope 253 | was in effect**. Having this information in a dwc:Event is the only way 254 | **to be able to infer absences of detection** within that dwc:Event, 255 | whenever the dwc:Occurrences linked to that dwc:Event do not explicitly 256 | state zero counts or when there are no dwc:Occurrence records for a 257 | given taxon that fell within the taxonomic scope (the combination of 258 | eco:targetTaxonomicScope and eco:excludedTaxonomicScope). The ability to 259 | "implicitly" support inferences about undetected dwc:Taxa (and other 260 | organismal targets) was a high priority objective in the design and 261 | structure of the Humboldt Extension. By "implicitly support 262 | inferences" we mean that a dwc:organismQuantity of zero individuals 263 | within a particular scope does not need to be provided explicitly as a 264 | separate dwc:Occurrence record, for a dwc:Event that does declare an 265 | encompassing scope and where all the taxa/targets that *were* detected 266 | were fully reported. Instead, those zero counts can be reconstituted by 267 | data users based on the data contained in other terms. 
When the target 268 | taxonomic scope (the combination of eco:targetTaxonomicScope and 269 | eco:excludedTaxonomicScope) is determined in advance of inventory data 270 | collection, and eco:isTaxonomicScopeFullyReported = `true`, then all 271 | dwc:Taxa that fall within the taxonomic scope but are not reported in 272 | the dwc:Occurrences of any child dwc:Events **can be inferred to be 273 | dwc:Occurrences with a dwc:organismQuantity of zero** (i.e., undetected 274 | dwc:Taxa). 275 | 276 | These inferred zero counts, in combination with information about 277 | sampling effort (i.e., eco:samplingEffortProtocol, 278 | eco:samplingEffortValue and eco:samplingEffortUnit), can then be used to 279 | estimate the likelihood that a count of zero organisms represents a 280 | *true* absence of a dwc:Taxon. However, if eco:taxonCompletenessReported 281 | = `reported incomplete` and/or eco:isTaxonomicScopeFullyReported = 282 | `false` for a dwc:Event, then future users SHOULD NOT make assumptions 283 | about absences. 284 | 285 | Data providers **MUST NOT retrospectively infer and populate 286 | eco:targetTaxonomicScope, or other *scope* terms**, for inclusion in a 287 | dataset shared with the Humboldt Extension. This is a further example of 288 | the *Principle of non-derivation* (*3.2.3*). Likewise, data users SHOULD 289 | NOT assume or reconstruct a scope that was not explicitly given by the 290 | data provider. There are at least two reasons for this: (1) Artificial 291 | constriction of scope: retrospective inference of target scope by a data 292 | provider by aggregating information across all child dwc:Events may 293 | result in a reported scope that is narrower than the actual intended 294 | scope of the inventory. (2) Artificial broadening of scope: it is 295 | possible that the inferred scope can be described in multiple ways.
For 296 | example, the scope of a list of species within a single genus could be 297 | described as the genus, as the family containing that genus, or as an 298 | even broader taxonomic concept. Thus, unless the true taxonomic scope is 299 | a known variable in the inventory protocol, then a presumed scope may be 300 | too broad or too narrow, leading to errors when inferring counts of 301 | zero. 302 | 303 | 304 | ## 3.3 Implementation principles 305 | 306 | 1. A Darwin Core-based inventory dataset MUST consist of at least one 307 | dwc:Event record. 308 | 309 | 2. Each dwc:Event in an inventory dataset MUST have a non-empty value 310 | for dwc:eventID that is unique within the dataset. More benefits 311 | are realizable if the dwc:eventIDs are also globally unique. 312 | 313 | 3. Any association of a Humboldt Extension record with a dwc:Event 314 | record MUST be done via that dwc:Event\'s dwc:eventID; the 315 | associated records MUST use the same dwc:eventID. It is 316 | permissible to have dwc:Event records without associated Humboldt 317 | Extension records. 318 | 319 | 4. An inventory hierarchy MUST be realized by explicitly relating each 320 | child dwc:Event to a parent dwc:Event through the child 321 | dwc:Event's dwc:parentEventID. 322 | 323 | 5. Data providers SHOULD follow [Darwin Core principle 324 | 4](https://dwc.tdwg.org/simple/#5-are-there-any-rules-normative), 325 | which is to fill the values of as many terms as possible, subject 326 | to the *Principle of applicability* and the *Principle of 327 | non-derivation* (sections *3.2.2* and *3.2.3*, respectively). 328 | 329 | 6. A child dwc:Event MUST NOT be assumed to implicitly "inherit" the 330 | value of any property of any of its parent dwc:Events; rather, the 331 | value SHOULD be provided explicitly as discussed in section *3.2.2 332 | Principle of applicability*. 333 | 334 | 7. 
A parent dwc:Event term SHOULD NOT be populated by deriving or 335 | summarizing information from child dwc:Events; rather, the value 336 | SHOULD be provided explicitly if appropriate to the nature and 337 | level of the dwc:Event, as discussed in section *3.2.3 Principle of non-derivation*. 338 | 339 | 340 | ## 4 Examples (non-normative) 341 | 342 | ![Tables illustrating implementation principles](fig3.png) 343 | 344 | **Figure 3.** Example illustrating the [Implementation 345 | principles](#implementation). Numbering of colored 346 | rectangles indicates the relevant principle; lines, arrows or rectangles 347 | in the same color indicate that the cells, columns or records are 348 | affected by the principle. *Notolepis coatsi* and *Cranchiidae* are not 349 | within the reported eco:targetTaxonomicScope. Principle 1 - an inventory 350 | dataset must have at least one dwc:Event record; here, 3 records can be 351 | identified. Principle 2 - each dwc:Event record must have a unique 352 | dwc:eventID. Principle 3 - Humboldt Extension records must be linked to 353 | the core dwc:Events via shared dwc:eventIDs. Principle 4 - every child 354 | dwc:Event must be related to its parent dwc:Event through a 355 | dwc:parentEventID. Principle 5 - term values for dwc:Events should be 356 | populated whenever possible; in the figure all records follow Darwin 357 | Core principle 4, subject to the *Principle of applicability* and the 358 | *Principle of non-derivation*. Principle 6 - terms for child dwc:Events 359 | must be explicitly populated rather than "inheriting" values from 360 | their parent dwc:Events. Principle 7 - terms for parent dwc:Events 361 | should be populated whenever relevant, but not be derived or summarized 362 | from their child dwc:Events. 
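The structural rules in Implementation principles 1 to 4 lend themselves to an automated check. The following is a minimal Python sketch under stated assumptions: events and Humboldt Extension records are represented as plain dictionaries, the function name `check_inventory` is hypothetical, and the check for Principle 4 assumes that every populated dwc:parentEventID should resolve to a dwc:eventID in the same dataset. Principles 5 to 7 concern how values are chosen and are not covered.

```python
def check_inventory(events, humboldt_records):
    """Return a list of problems found against Implementation principles 1-4.

    Hypothetical helper; the dictionary keys are illustrative only.
    """
    problems = []

    # Principle 1: at least one dwc:Event record.
    if not events:
        problems.append("Principle 1: no dwc:Event records")

    # Principle 2: every eventID is non-empty and unique within the dataset.
    event_ids = set()
    for event in events:
        event_id = (event.get("eventID") or "").strip()
        if not event_id:
            problems.append("Principle 2: dwc:Event with empty eventID")
        elif event_id in event_ids:
            problems.append(f"Principle 2: duplicate eventID {event_id}")
        else:
            event_ids.add(event_id)

    # Principle 3: Humboldt Extension records link to events via eventID.
    for record in humboldt_records:
        if record.get("eventID") not in event_ids:
            problems.append(
                f"Principle 3: Humboldt record points to unknown eventID {record.get('eventID')}"
            )

    # Principle 4 (assumed reading): a populated parentEventID must match
    # the eventID of another event in the dataset.
    for event in events:
        parent_id = (event.get("parentEventID") or "").strip()
        if parent_id and parent_id not in event_ids:
            problems.append(
                f"Principle 4: {event.get('eventID')} has unknown parentEventID {parent_id}"
            )
    return problems


# The hierarchy from Figure 2: one parent and two children.
events = [
    {"eventID": "EVENT_01", "parentEventID": ""},
    {"eventID": "EVENT_02", "parentEventID": "EVENT_01"},
    {"eventID": "EVENT_03", "parentEventID": "EVENT_01"},
]
humboldt_records = [
    {"eventID": "EVENT_01"},
    {"eventID": "EVENT_02"},
    {"eventID": "EVENT_03"},
]
print(check_inventory(events, humboldt_records))
```

An empty list means that no structural problem was detected; it does not show that the term values themselves follow the guiding principles.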
363 | -------------------------------------------------------------------------------- /docs/humboldt_extension_implementation_experience_report.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tdwg/hc/92e0ed94afceeea6a2d8ceb559da37f450ad007c/docs/humboldt_extension_implementation_experience_report.pdf -------------------------------------------------------------------------------- /docs/inclusive/index.md: -------------------------------------------------------------------------------- 1 | # isLeastSpecificTargetCategoryQuantityInclusive Guidelines 2 | 3 | Title 4 | : isLeastSpecificTargetCategoryQuantityInclusive Guidelines 5 | 6 | Date version issued 7 | : 2024-02-28 8 | 9 | Date created 10 | : 2024-02-28 11 | 12 | Part of TDWG Standard 13 | : 14 | 15 | This version 16 | : 17 | 18 | Latest version 19 | : 20 | 21 | Abstract 22 | : The Humboldt Extension for ecological inventories mints the term eco:isLeastSpecificTargetCategoryQuantityInclusive to describe how to treat counts of organisms when records from a single dwc:Event include multiple target categories. This document describes how to use that term. 23 | 24 | Contributors 25 | : [Yi-Ming Gan](https://orcid.org/0000-0001-7087-2646) ([Royal Belgian Institute of Natural Sciences](http://www.wikidata.org/entity/Q16665660)), [Wesley M. Hochachka](https://orcid.org/0000-0002-0595-7827) ([Cornell Lab of Ornithology](http://www.wikidata.org/entity/Q2997535)), [John Wieczorek](https://orcid.org/0000-0003-1144-0290) ([VertNet](http://www.wikidata.org/entity/Q98382028)), [Yanina V. Sica](https://orcid.org/0000-0002-1720-0127) ([Yale University](http://www.wikidata.org/entity/Q49112)), [Peter Brenton](https://orcid.org/0000-0001-9730-8340) ([Atlas of Living Australia, CSIRO](http://www.wikidata.org/entity/Q16335177)), [Steven J. 
Baskauf](https://orcid.org/0000-0003-4365-3135) ([Vanderbilt University Libraries](http://www.wikidata.org/entity/Q16849893)) 26 | 27 | Creator 28 | : TDWG Humboldt Extension Task Group 29 | 30 | Bibliographic citation 31 | : TDWG Humboldt Extension Task Group. 2024. isLeastSpecificTargetCategoryQuantityInclusive Guidelines. Biodiversity Information Standards (TDWG). 32 | 33 | ## 1 Introduction (non-normative) 34 | 35 | This document elaborates upon the meaning and use of the term `eco:isLeastSpecificTargetCategoryQuantityInclusive`. This term is needed to describe how to treat counts of organisms (or any other organism quantity) when records from a single `dwc:Event` include multiple target categories (e.g., taxonomic ranks within a higher rank or different life stages for the same species). For example, a statement of whether the least specific target category quantity is inclusive should be reported when a `dwc:Event` includes records reporting quantities that are associated with subcategories (e.g., subspecies) and records reporting quantities for more general categories (e.g., the species). In this example, the higher taxon rank (i.e., species) is the least specific category, because it is more general than the subspecies category nested below it. Species and subspecies are just one example of a pair of category and subcategory. Other examples of subcategories are life stages (e.g., “adult”, “larva”, “egg”), and sexes. 36 | 37 | ### 1.1 Status of the content of this document 38 | 39 | Section 3 of this document is normative. The other sections are non-normative.
40 | 41 | 42 | ### 1.2 RFC 2119 key words 43 | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", 44 | "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to 45 | be interpreted as described in [BCP 14](https://datatracker.ietf.org/doc/html/bcp14) 46 | [[RFC2119]](https://datatracker.ietf.org/doc/html/rfc2119) 47 | [[RFC8174]](https://datatracker.ietf.org/doc/html/rfc8174) 48 | when, and only when, they are written in capitals (as shown here). 49 | 50 | ### 1.3 Namespaces and terminology 51 | 52 | The namespace `eco:` abbreviates `http://rs.tdwg.org/eco/terms/` and is used with terms minted for the Humboldt Extension for ecological inventories. `dwc:` abbreviates `http://rs.tdwg.org/dwc/terms/`, and is used with terms in the main Darwin Core vocabulary namespace. Words in `code markup` are term IRIs or literal values. The word "organisms" is used colloquially and is not used in the technical sense of the `dwc:Organism` class. 53 | 54 | ## 2 Rationale (non-normative) 55 | 56 | The term `eco:isLeastSpecificTargetCategoryQuantityInclusive` was introduced into the Humboldt Extension for ecological inventories late in development, after testing it with real-world cases ([Sica et al., 2022](#ref2)). Testing revealed that the quantities of organisms stored in two major biodiversity databases — OBIS (OBIS, 2023) and eBird (Sullivan et al., 2014) — need to be treated differently in order to calculate the total quantity of organisms in the least specific category. In the specific case of data in the OBIS database, the information for a single `dwc:Event` can contain multiple records for a species, with one record for a species listing the quantity of individual organisms for the species without specifying any subcategory of life stage, and other records for the same species in the same `dwc:Event` listing quantities for different life stages (e.g., one record for adults and another record for juveniles). 
In this example the single `dwc:Event` will contain 3 records: one for the species without any life stage specified, one for adults of the species, and one for juveniles of the species. For the OBIS data, the quantity in the record for which no life stage is specified is the sum of three quantities: the number of juveniles, the number of adults, and the number of individuals that were not recorded as belonging to any specific life stage. In other words, when using OBIS data, the total quantity of individuals recorded for a species, across all life stages combined, has been pre-calculated and stored in the database; unless the quantities of individuals within specific life stages are of interest, the information in the life stage subcategories can be ignored. The value of the term `eco:isLeastSpecificTargetCategoryQuantityInclusive` in this case would be `true` - the least specific category (species without any life stage specified) already includes the counts of the more specific subcategories. 57 | 58 | eBird stores information about quantities of organisms differently. For the example of a `dwc:Event` that contains separate records for subspecies and their parent species, the total number of individuals of the species needs to be calculated by the end user as the sum of the quantity reported for the species plus the quantities reported for the subspecies. In other words, the total quantity of organisms of each species has not been pre-calculated and must be derived by the end user. The value of the term `eco:isLeastSpecificTargetCategoryQuantityInclusive` in this case would be `false` - the least specific category (species) does not include the counts of the more specific subcategories (subspecies). 
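The two conventions described above differ only in how the total for the least specific category is obtained. The following is a minimal Python sketch, not a normative algorithm: the function name `total_detected` and the dictionary layout are assumptions, and the least specific record is assumed to be the one with an empty `lifeStage`. The quantities are those of Table 1 in section 4.1.

```python
def total_detected(records, is_inclusive):
    """Total quantity for ONE taxon in ONE event.

    records: occurrence dicts for that taxon and event (hypothetical layout).
    is_inclusive: value of eco:isLeastSpecificTargetCategoryQuantityInclusive.
    """
    if is_inclusive:
        # The least specific record already contains the total;
        # here it is assumed to be the record without a life stage.
        least_specific = [r for r in records if not r.get("lifeStage")]
        if len(least_specific) != 1:
            raise ValueError("expected exactly one least specific record")
        return least_specific[0]["organismQuantity"]
    # Otherwise every record is a separate category and all are summed.
    return sum(r["organismQuantity"] for r in records)


records = [
    {"occurrenceID": "occ_01", "lifeStage": "adult", "organismQuantity": 1},
    {"occurrenceID": "occ_02", "lifeStage": "juvenile", "organismQuantity": 2},
    {"occurrenceID": "occ_03", "lifeStage": "", "organismQuantity": 4},
]
print(total_detected(records, True))   # 4
print(total_detected(records, False))  # 7
```

The same records therefore yield a total of 4 or 7 depending solely on the value of the term, which is why the value needs to travel with the data.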
59 | 60 | In summary, the term `eco:isLeastSpecificTargetCategoryQuantityInclusive` is required to inform the end user of whether they will need to derive the total quantity of organisms for the least specific category (e.g., for a species), or whether this total quantity has already been calculated prior to the data being entered into the database. Note that, if a dataset contains only simple targets that have no subcategories, the result of the term `eco:isLeastSpecificTargetCategoryQuantityInclusive` being `true` or `false` is exactly the same - the count is the total in either case. Only in this circumstance does the term not strictly need to be populated. However, given that data records acquire a "life of their own" separate from their associated metadata when aggregated from multiple datasets, best practice is to include and populate the term `eco:isLeastSpecificTargetCategoryQuantityInclusive`. 61 | 62 | ## 3 Usage guidelines (normative) 63 | 64 | The term `eco:isLeastSpecificTargetCategoryQuantityInclusive` is defined as "The total detected quantity of organisms for a `dwc:Taxon` (including subsets thereof) in a `dwc:Event` is given explicitly in a single record (`dwc:organismQuantity` value) for that `dwc:Taxon`." 65 | 66 | Values MUST be either `true` or `false`. If `true`, the `dwc:organismQuantity` value for a `dwc:Taxon` in a `dwc:Event` is inclusive of all organisms of the `dwc:Taxon` (including more specific scopes such as different life stages or lower taxonomic ranks) and the total detected quantity of organisms for that `dwc:Taxon` in the `dwc:Event` MUST NOT be determined by summing the `dwc:organismQuantity` values for all records of the `dwc:Taxon` in the `dwc:Event`. Instead, the total detected quantity of organisms for the `dwc:Taxon` in a `dwc:Event` MUST be reported in a single record for the `dwc:Taxon` in the `dwc:Event`, with this record having no further specific scopes.
In this case the sum of `dwc:organismQuantity` values for the reported subsets of the `dwc:Taxon` MUST NOT exceed the value of `dwc:organismQuantity` for the single record for the `dwc:Taxon` without subsets (i.e., the total). If `false`, the `dwc:organismQuantity` values for a `dwc:Taxon` in a `dwc:Event` MUST be added to get the total detected quantity of organisms for that `dwc:Taxon` in the `dwc:Event`. 67 | 68 | ## 4 Examples (non-normative) 69 | 70 | ### 4.1 Single `dwc:Taxon` example 71 | 72 | As an example of the difference between `true` and `false` values for `eco:isLeastSpecificTargetCategoryQuantityInclusive`, suppose there are three records (see Table 1) with `dwc:organismQuantity` for a `dwc:Taxon` (taxon_01) for a `dwc:Event` (event_01). One record is for adults of the `dwc:Taxon` with `dwc:organismQuantity` = `1` and `dwc:organismQuantityType` = `individuals`, one record is for juveniles of the `dwc:Taxon` with `dwc:organismQuantity` = `2` and `dwc:organismQuantityType` = `individuals`, and one record is for the `dwc:Taxon` without specifying the life stage and with `dwc:organismQuantity` = `4` and `dwc:organismQuantityType` = `individuals`. 73 | 74 | If `eco:isLeastSpecificTargetCategoryQuantityInclusive` is `true` for event_01, then the total number of individuals of taxon_01 for the `dwc:Event` is 4 (the least specific `dwc:Taxon` record, the one with no more specific scopes, includes all individuals of the `dwc:Taxon`). This means that there was 1 adult, 2 juveniles and 1 individual of taxon_01 whose life stage was not recorded. 75 | 76 | If `eco:isLeastSpecificTargetCategoryQuantityInclusive` is `false` for event_01, then the total number of individuals of taxon_01 for the `dwc:Event` is 7 (the least specific `dwc:Taxon` record, the one with no more specific scopes, does not include all individuals of the `dwc:Taxon`; rather, it is a separate category that must also be added to get the total).
This means there was 1 adult, 2 juveniles and 4 individuals of taxon_01 whose life stage was not recorded. 77 | 78 | **Table 1. Organism quantities in `dwc:Occurrence` records** 79 | 80 | | occurrenceID | eventID | taxonID | lifeStage | organismQuantity | organismQuantityType | 81 | | ------------ | ------- | ------- | --------- | ---------------- | -------------------- | 82 | | occ_01 | event_01 | taxon_01 | adult | 1 | individual | 83 | | occ_02 | event_01 | taxon_01 | juvenile | 2 | individual | 84 | | occ_03 | event_01 | taxon_01 | | 4 | individual | 85 | 86 | ### 4.2 Nested taxa example 87 | 88 | Suppose there are three records (see Table 2) with `dwc:organismQuantity` for three taxa (*Hirundo rustica* and two subspecies) for a `dwc:Event` (event_01). The record for the species has `dwc:organismQuantity` = `3` and `dwc:organismQuantityType` = `individuals`. The record for *H. r. rustica* has `dwc:organismQuantity` = `2` and `dwc:organismQuantityType` = `individuals`. The record for *H. r. gutturalis* has `dwc:organismQuantity` = `1` and `dwc:organismQuantityType` = `individuals`. 89 | 90 | If `eco:isLeastSpecificTargetCategoryQuantityInclusive` is `true` for event_01, then the total number of individuals of the species *H. rustica* for the `dwc:Event` is 3 (the least specific `dwc:Taxon` record includes all individuals of the `dwc:Taxon`). This means there were 2 *H. r. rustica*, 1 *H. r. gutturalis*, and no other *H. rustica* of any kind detected. 91 | 92 | If `eco:isLeastSpecificTargetCategoryQuantityInclusive` is `false` for event_01, then the total number of individuals of the species *H. rustica* for the `dwc:Event` is 6 (the least specific `dwc:Taxon` record does not include all individuals of the `dwc:Taxon`). This means there were 2 *H. r. rustica*, 1 *H. r. gutturalis*, and 3 other *H. rustica* detected that were not identified to subspecies. 93 | 94 | **Table 2. 
Organism quantities in `dwc:Event` records** 95 | 96 | | eventID | scientificName | organismQuantity | organismQuantityType | 97 | | ------- | -------------- | ---------------- | -------------------- | 98 | | event_01 | Hirundo rustica | 3 | individual | 99 | | event_01 | Hirundo rustica rustica | 2 | individual | 100 | | event_01 | Hirundo rustica gutturalis | 1 | individual | 101 | 102 | ## 5 References 103 | 104 | OBIS (2023). Ocean Biodiversity Information System. Intergovernmental Oceanographic Commission of UNESCO. 105 | 106 | Sica, Y. V., K. Ingenloff, Y.-M. Gan, Z. Kachian, S. J. Baskauf, J. Wieczorek, P. F. Zermoglio, R. D. Stevenson (2022). Application of Humboldt Extension to Real-world Cases. *Biodiversity Information Science and Standards* 6: e91502. 107 | 108 | Sullivan, B. L., J. L. Aycrigg, J. H. Barry, R. E. Bonney, N. Bruns, C. B. Cooper, T. Damoulas, A. A. Dhondt, T. Dietterich, A. Farnsworth, D. Fink, et al. (2014). The eBird enterprise: an integrated approach to development and application of citizen science. *Biological Conservation* 169:31-40. 109 | -------------------------------------------------------------------------------- /docs/index.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: home 3 | title: Humboldt Extension for Ecological Inventories 4 | description: The Humboldt Extension for Ecological Inventories is a vocabulary for transmitting information about biodiversity surveys with hierarchical structure. It is used along with Darwin Core terms to extend descriptions of Events. 5 | --- 6 | The Humboldt Extension for Ecological Inventories is a standard vocabulary maintained by the [Darwin Core Maintenance Group](https://www.tdwg.org/community/dwc/).
It includes a glossary of terms (in other contexts these might be called properties, elements, fields, columns, attributes, or concepts) intended to **facilitate the sharing of information about ecological inventories** by providing identifiers, labels, definitions, usage comments and examples. 7 | 8 | The official documents for this extension to [Darwin Core](http://www.tdwg.org/standards/450) include the [list of terms](list/), a [list of controlled vocabulary terms](tcr/) for the property `eco:taxonCompletenessReported`, and two guides that explain how the extension must be used ([isLeastSpecificTargetCategoryQuantityInclusive](inclusive/) and [hierarchical events](hierarchy/) guidelines). The [Quick reference guide](terms/) and [Usage guide](https://docs.google.com/document/d/1rX4m94rtZDR_8iIe3RvRnNYKDJcmSX3ii4S5hCznEA0/edit?usp=sharing) are not officially part of the addition to Darwin Core, but may provide a less technical point of entry to understanding the extension. 9 | 10 | ## Getting started 11 | 12 | * [Quick reference guide](terms/) 13 | * [Usage guide](https://docs.google.com/document/d/1rX4m94rtZDR_8iIe3RvRnNYKDJcmSX3ii4S5hCznEA0/edit?usp=sharing): how to use the Humboldt Extension 14 | * [GitHub repository](https://github.com/tdwg/hc): where the Humboldt Extension is maintained 15 | * [Guidelines for using isLeastSpecificTargetCategoryQuantityInclusive](inclusive/) 16 | * [Properties of hierarchical events in the Humboldt Extension for Ecological Inventories](hierarchy/). 17 | * [Term list](list/): the document containing complete metadata and normative term definitions for all Humboldt Extension terms. 
18 | * [Utility files](https://github.com/tdwg/hc/tree/master/dist): CSV files of vertical and horizontal term lists plus the Humboldt Extension schema 19 | * [Implementation Experience Report](humboldt_extension_implementation_experience_report.pdf) 20 | -------------------------------------------------------------------------------- /docs/tcr/index.md: -------------------------------------------------------------------------------- 1 | # Taxon Completeness Reported Controlled Vocabulary List of Terms 2 | 3 | Title 4 | : Taxon Completeness Reported Controlled Vocabulary List of Terms 5 | 6 | Namespace IRI 7 | : http://rs.tdwg.org/ecotcr/values/ 8 | 9 | Preferred namespace abbreviation 10 | : ecotcr: 11 | 12 | Date version issued 13 | : 2024-02-28 14 | 15 | Date created 16 | : 2024-02-28 17 | 18 | Part of TDWG Standard 19 | : 20 | 21 | This version 22 | : 23 | 24 | Latest version 25 | : 26 | 27 | Abstract 28 | : The Humboldt Extension for Ecological Inventories mints the term `taxonCompletenessReported` to alert users that the inventory was conducted in such a way that all of the target taxa should have been detectable if they were present during the dwc:Event. This vocabulary provides terms that should be used as values for `eco:taxonCompletenessReported` and `ecoiri:taxonCompletenessReported`. 29 | 30 | Contributors 31 | : [Yanina V. Sica](https://orcid.org/0000-0002-1720-0127) ([Yale University](http://www.wikidata.org/entity/Q49112)), [Wesley M. Hochachka](https://orcid.org/0000-0002-0595-7827) ([Cornell Lab of Ornithology](http://www.wikidata.org/entity/Q2997535)), [Steven J. Baskauf](https://orcid.org/0000-0003-4365-3135) ([Vanderbilt University Libraries](http://www.wikidata.org/entity/Q16849893)) 32 | 33 | Creator 34 | : TDWG Humboldt Extension Task Group 35 | 36 | Bibliographic citation 37 | : TDWG Humboldt Extension Task Group. 2024. Taxon Completeness Reported Controlled Vocabulary List of Terms. Biodiversity Information Standards (TDWG). 
38 | 39 | ## 1 Introduction (non-normative) 40 | 41 | This document includes terms intended to be used as controlled values for the Humboldt Extension terms with the local name `taxonCompletenessReported`. 42 | 43 | ### 1.1 Status of the content of this document 44 | 45 | Sections 1 and 3 are non-normative. Section 2 is normative. In Section 4, the values of the `Term IRI`, `Definition`, and `Controlled value` are normative. The value of `Usage` (if it exists for a given term) is normative. The values of `Term Name` are non-normative, although one can expect that the namespace abbreviation prefix is one commonly used for the term namespace. `Label` and the values of all other properties (such as `Notes`) are non-normative. 46 | 47 | ### 1.2 RFC 2119 key words 48 | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", 49 | "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to 50 | be interpreted as described in [BCP 14](https://datatracker.ietf.org/doc/html/bcp14) 51 | [[RFC2119]](https://datatracker.ietf.org/doc/html/rfc2119) 52 | [[RFC8174]](https://datatracker.ietf.org/doc/html/rfc8174) 53 | when, and only when, they are written in capitals (as shown here). 54 | 55 | ### 1.3 Namespaces 56 | 57 | The namespace `eco:` abbreviates `http://rs.tdwg.org/eco/terms/` and the namespace `ecoiri:` abbreviates `http://rs.tdwg.org/eco/iri/`. Both namespaces are used with terms minted for the Humboldt Extension for Ecological Inventories. `ecotcr:` abbreviates `http://rs.tdwg.org/ecotcr/values/`, and is used with terms in this vocabulary. 58 | 59 | ## 2 Use of Terms (normative) 60 | 61 | Due to the requirements of [Section 1.4.3 of the Darwin Core RDF Guide](http://rs.tdwg.org/dwc/terms/guides/rdf/#143-use-of-darwin-core-terms-in-rdf-normative), term IRIs MUST be used as values of `ecoiri:taxonCompletenessReported`. Controlled value strings MUST be used as values of `eco:taxonCompletenessReported`.
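The rule in section 2 amounts to choosing a different representation of the same concept depending on the property. The following is a minimal Python sketch, assuming a hypothetical lookup table `TCR_TERMS` and helper `value_for`; only the two terms whose IRI and controlled value string both appear in section 4 as shown here are included, and further terms would be added in the same way.

```python
# Hypothetical lookup keyed by the (non-normative) label of each term.
TCR_TERMS = {
    "not reported": {
        "iri": "http://rs.tdwg.org/ecotcr/values/tcr00",
        "controlled_value": "notReported",
    },
    "reported complete": {
        "iri": "http://rs.tdwg.org/ecotcr/values/tcr01",
        "controlled_value": "reportedComplete",
    },
}


def value_for(property_name, label):
    """Return the value to store for the given property and term label."""
    term = TCR_TERMS[label]
    if property_name == "ecoiri:taxonCompletenessReported":
        # IRI-valued property: use the term IRI.
        return term["iri"]
    if property_name == "eco:taxonCompletenessReported":
        # String-valued property: use the controlled value string.
        return term["controlled_value"]
    raise ValueError(f"unsupported property: {property_name}")


print(value_for("eco:taxonCompletenessReported", "reported complete"))
print(value_for("ecoiri:taxonCompletenessReported", "reported complete"))
```

Keying the table by label is only a convenience for this example; labels are non-normative and should not be stored as values.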
62 | 63 | ## 3 Term Index 64 | 65 | [not reported](#ecotcr_tcr00) | 66 | [reported complete](#ecotcr_tcr01) | 67 | [reported incomplete](#ecotcr_tcr02) | 68 | [taxon completeness reported concept scheme](#ecotcr_tcr) 69 | 70 | ## 4 Vocabulary

| Term Name | ecotcr:tcr |
| --- | --- |
| Term IRI | http://rs.tdwg.org/ecotcr/values/tcr |
| Modified | 2024-02-28 |
| Term version IRI | http://rs.tdwg.org/ecotcr/values/version/tcr-2024-02-28 |
| Label | taxon completeness reported concept scheme |
| Definition | a SKOS concept scheme for categorizing taxon completeness reporting |
| Type | http://www.w3.org/2004/02/skos/core#ConceptScheme |
| Executive Committee decision | http://rs.tdwg.org/decisions/decision-2024-02-28_42 |

| Term Name | ecotcr:tcr00 |
| --- | --- |
| Term IRI | http://rs.tdwg.org/ecotcr/values/tcr00 |
| Modified | 2024-02-28 |
| Term version IRI | http://rs.tdwg.org/ecotcr/values/version/tcr00-2024-02-28 |
| Label | not reported |
| Definition | Taxonomic completeness was not assessed or reported for the dwc:Event. |
| Controlled value | notReported |
| Type | Concept |
| Executive Committee decision | http://rs.tdwg.org/decisions/decision-2024-02-28_42 |

| Term Name | ecotcr:tcr01 |
| --- | --- |
| Term IRI | http://rs.tdwg.org/ecotcr/values/tcr01 |
| Modified | 2024-02-28 |
| Term version IRI | http://rs.tdwg.org/ecotcr/values/version/tcr01-2024-02-28 |
| Label | reported complete |
| Definition | Taxonomic completeness was assessed for the dwc:Event, and it was determined to be complete. |
| Controlled value | reportedComplete |
| Type | Concept |
| Executive Committee decision | http://rs.tdwg.org/decisions/decision-2024-02-28_42 |

| Term Name | ecotcr:tcr02 |
| --- | --- |
| Term IRI | http://rs.tdwg.org/ecotcr/values/tcr02 |
| Modified | 2024-02-28 |
| Term version IRI | http://rs.tdwg.org/ecotcr/values/version/tcr02-2024-02-28 |
| Label | reported incomplete |
| Definition | Taxonomic completeness was assessed for the dwc:Event, and it was determined to be incomplete. |
| Controlled value | reportedIncomplete |
| Type | Concept |
| Executive Committee decision | http://rs.tdwg.org/decisions/decision-2024-02-28_42 |
    234 | 235 | 236 | -------------------------------------------------------------------------------- /material/Checklist Metadata - Data Entry Manual.docx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tdwg/hc/92e0ed94afceeea6a2d8ceb559da37f450ad007c/material/Checklist Metadata - Data Entry Manual.docx -------------------------------------------------------------------------------- /material/Guralnick et al Ecography 2017.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tdwg/hc/92e0ed94afceeea6a2d8ceb559da37f450ad007c/material/Guralnick et al Ecography 2017.pdf -------------------------------------------------------------------------------- /material/HCSupplementalTable3_FullTermList_r2_v4_RW.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tdwg/hc/92e0ed94afceeea6a2d8ceb559da37f450ad007c/material/HCSupplementalTable3_FullTermList_r2_v4_RW.xlsx -------------------------------------------------------------------------------- /material/HC_SupplementalTable_ExamplesNEW.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tdwg/hc/92e0ed94afceeea6a2d8ceb559da37f450ad007c/material/HC_SupplementalTable_ExamplesNEW.xlsx -------------------------------------------------------------------------------- /material/TDWG_Task_Group_Charter_Template_03.docx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tdwg/hc/92e0ed94afceeea6a2d8ceb559da37f450ad007c/material/TDWG_Task_Group_Charter_Template_03.docx -------------------------------------------------------------------------------- /material/desktop.ini: -------------------------------------------------------------------------------- 1 | [.ShellClassInfo] 2 | ConfirmFileOp=0 3 | 
IconResource=C:\Users\yanis\AppData\Local\Temp\drive_fs_td_2_38823.ico 4 | -------------------------------------------------------------------------------- /vocabulary/old/HC_terms_2021-02-28.csv: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tdwg/hc/92e0ed94afceeea6a2d8ceb559da37f450ad007c/vocabulary/old/HC_terms_2021-02-28.csv -------------------------------------------------------------------------------- /vocabulary/old/HC_terms_2021-11-17.csv: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tdwg/hc/92e0ed94afceeea6a2d8ceb559da37f450ad007c/vocabulary/old/HC_terms_2021-11-17.csv -------------------------------------------------------------------------------- /vocabulary/old/README.md: -------------------------------------------------------------------------------- 1 | # Folder "old" 2 | 3 | This folder contains an archive of out-of-date vocabulary files. --------------------------------------------------------------------------------