├── .binder
│   ├── environment.yml
│   ├── install.R
│   └── runtime.txt
├── .gitignore
├── LICENSE.md
├── README.md
├── examples-python
│   ├── 01_re3data_API_medical_research_community.ipynb
│   ├── 02_re3data_API_certification_by_type.ipynb
│   └── 03_re3data_API_repository_APIs.ipynb
└── examples-r
    ├── 01_re3data_API_medical_research_community.ipynb
    ├── 02_re3data_API_certification_by_type.ipynb
    └── 03_re3data_API_repository_APIs.ipynb
/.binder/environment.yml:
--------------------------------------------------------------------------------
1 | dependencies:
2 | - python=3.10
3 | - httpx=0.23.0
4 | - lxml=4.8.0
5 | - pandas=1.4.2
6 | - seaborn=0.11.2
7 |
--------------------------------------------------------------------------------
/.binder/install.R:
--------------------------------------------------------------------------------
1 | install.packages("dplyr")
2 | install.packages("ggplot2")
3 | install.packages("httr")
4 | install.packages("purrr")
5 | install.packages("tidyr")
6 | install.packages("xml2")
--------------------------------------------------------------------------------
/.binder/runtime.txt:
--------------------------------------------------------------------------------
1 | r-2021-10-19
2 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | # R
2 | .Rproj.user
3 | .Rhistory
4 | .RData
5 | .Ruserdata
6 |
7 | # Python
8 | .venv
9 | __pycache__
10 |
11 | # Jupyter notebook checkpoints
12 | .ipynb_checkpoints
13 |
--------------------------------------------------------------------------------
/LICENSE.md:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2022 re3data
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Examples for using the re3data API
2 |
3 | [Launch in Binder](https://mybinder.org/v2/gh/re3data/using_the_re3data_API/HEAD)
4 |
5 | ## About re3data
6 |
7 | [re3data](https://www.re3data.org/) is a global registry providing detailed descriptions of more than 2800 research data repositories. These descriptions are based on the [re3data Metadata Schema](https://www.re3data.org/schema/2-2) and can be accessed via the [re3data API](https://www.re3data.org/api/doc).
8 |
9 | ## About this repository
10 | There are many conceivable use cases for re3data metadata. In this GitHub repository, we provide some examples for using the re3data API that can be adapted to fit other use cases.
11 |
12 | The example use cases are implemented in R and Python using Jupyter Notebooks, which can be run in Binder (see link above).
13 |
14 | ### Structure
15 |
16 | The repository is structured as follows:
17 | * **examples-r** - R implementation of the example use cases
18 | * **examples-python** - Python implementation of the example use cases
19 |
20 | ### Example use cases
21 |
22 | Currently, we provide notebooks for three example use cases:
23 | * **01_re3data_API_medical_research_community.ipynb** - identify and collect information about repositories catering to the medical research community
24 | * **02_re3data_API_certification_by_type.ipynb** - distribution of certificates across repository types
25 | * **03_re3data_API_repository_APIs.ipynb** - collect information on repository APIs
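
As a rough sketch of the pattern the notebooks share (build a query, then collect the `href` attributes from the XML response), the snippet below uses only the Python standard library instead of the `httpx` and `lxml` packages used in the notebooks. The helper names `build_query_url` and `extract_hrefs` and the `SAMPLE_RESPONSE` document are illustrative assumptions: the sample XML only mimics the one property the notebooks rely on (repository URLs stored in `href` attributes) and is not a verbatim API response.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Endpoint used by the notebooks for listing repositories.
BASE_URL = "https://www.re3data.org/api/beta/repositories"


def build_query_url(params):
    """Return the endpoint URL with the query parameters URL-encoded."""
    return f"{BASE_URL}?{urlencode(params)}"


def extract_hrefs(xml_text):
    """Return the value of every href attribute found in the XML document."""
    root = ET.fromstring(xml_text)
    return [el.attrib["href"] for el in root.iter() if "href" in el.attrib]


# Hypothetical, simplified stand-in for an API response (assumed shape).
SAMPLE_RESPONSE = """<list>
  <repository>
    <id>r3d100012823</id>
    <link href="https://www.re3data.org/api/beta/repository/r3d100012823" rel="self"/>
  </repository>
</list>"""

# Same parameters as in use case 1.
query_url = build_query_url(
    {"subjects[]": "205 Medicine", "dataUploads[]": "open", "pidSystems[]": "DOI"}
)
hrefs = extract_hrefs(SAMPLE_RESPONSE)
print(query_url)
print(hrefs)
```

To run this against the live API, the sample string would be replaced by the body of an HTTP GET request to `query_url`; the notebooks do this with `httpx.get()`.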
26 |
27 | ### Contributors
28 |
29 | [Dorothea Strecker](https://orcid.org/0000-0002-9754-3807) (R implementation)
30 |
31 | [Yi Wang](https://orcid.org/0000-0003-1354-3461) (R implementation)
32 |
33 | [Heinz-Alexander Fütterer](https://orcid.org/0000-0003-4397-027X) (Python implementation)
34 |
35 | [Binh Nguyen Thanh](https://orcid.org/0000-0001-5509-8357) (contributor)
36 |
37 | ## Contact
38 |
39 | If your specific use case is not covered here, feel free to [contact us](mailto:info@re3data.org).
40 |
--------------------------------------------------------------------------------
/examples-python/01_re3data_API_medical_research_community.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## Use case 1: Identify and collect information about repositories catering to the medical research community (Python)\n",
8 | "\n",
9 | "> This notebook is based on the examples written in `R` from Dorothea Strecker's [examples-r/01_re3data_API_medical_research_community.ipynb](https://github.com/re3data/using_the_re3data_API/blob/main/examples-r/01_re3data_API_medical_research_community.ipynb). \n",
10 | "> Adapted in `Python` by Heinz-Alexander Fütterer.\n",
11 | "\n",
12 | "Medical researchers are looking for a suitable repository to deposit their data. They require a repository catering to medical research that offers data upload and assigns DOIs to datasets.\n",
13 | "\n",
14 | "Repositories meeting these specifications can be identified via the re3data API. The API also provides the option to retrieve further information about these repositories, such as the name of the repository or a description.\n",
15 | "\n",
16 | "### Step 1: load packages\n",
17 | "\n",
18 | "The package `httpx` includes the HTTP method GET, which will be used to request data from the re3data API. Responses from the re3data API are returned in XML. `lxml` includes functions for working with XML, for example parsing or extracting content of specific elements. The `pandas` library is used for storing the responses in a tabular data structure (i.e. a `DataFrame`).\n",
19 | "\n",
20 | "If necessary, install the packages before loading them."
21 | ]
22 | },
23 | {
24 | "cell_type": "code",
25 | "execution_count": 1,
26 | "metadata": {},
27 | "outputs": [],
28 | "source": [
29 | "# !pip install httpx==0.23.0 lxml==4.8.0 pandas==1.4.2"
30 | ]
31 | },
32 | {
33 | "cell_type": "code",
34 | "execution_count": 2,
35 | "metadata": {},
36 | "outputs": [],
37 | "source": [
38 | "import typing\n",
39 | "\n",
40 | "import httpx\n",
41 | "import pandas\n",
42 | "from lxml import html"
43 | ]
44 | },
45 | {
46 | "cell_type": "markdown",
47 | "metadata": {},
48 | "source": [
49 | "### Step 2: define query parameters\n",
50 | "\n",
51 | "Information on individual repositories can be extracted using the re3data ID. Therefore, re3data IDs of repositories with the desired characteristics need to be identified first.\n",
52 | "\n",
53 | "The re3data API allows querying via the endpoint **/api/beta/repositories**. Parameters that can be queried are listed in the [re3data API documentation](https://www.re3data.org/api/doc). For more information on re3data metadata, including descriptions of available elements and controlled vocabularies, please refer to the documentation of the [re3data Metadata Schema](https://doi.org/10.2312/re3.006) (the API uses version 2.2 of the re3data Metadata Schema). \n",
54 | "The query below returns re3data IDs of repositories meeting the following conditions:\n",
55 | "\n",
56 | "* **\"subjects[]\" = \"205 Medicine\"** The repository caters to the subject *Medicine*, notation 205 in the DFG Subject Classification, the subject classification used by re3data.\n",
57 | "* **\"dataUploads[]\"=\"open\"** The repository allows data upload.\n",
58 | "* **\"pidSystems[]\"=\"DOI\"** The repository assigns DOIs."
59 | ]
60 | },
61 | {
62 | "cell_type": "code",
63 | "execution_count": 3,
64 | "metadata": {},
65 | "outputs": [],
66 | "source": [
67 | "re3data_query = {\n",
68 | " \"subjects[]\": \"205 Medicine\",\n",
69 | " \"dataUploads[]\": \"open\",\n",
70 | " \"pidSystems[]\": \"DOI\",\n",
71 | "}"
72 | ]
73 | },
74 | {
75 | "cell_type": "markdown",
76 | "metadata": {},
77 | "source": [
78 | "### Step 3: obtain URLs for further API queries\n",
79 | "\n",
80 | "The query parameters defined in the previous step can then be passed to the re3data API using `httpx.get()`.\n",
81 | "\n",
82 | "The XML response is parsed using `html.fromstring()`. XML elements or attributes can be identified using XPath syntax. The response from the re3data API includes URLs for further queries to the **/api/beta/repository** endpoint. These URLs can be identified with a simple XPath expression. All attributes matching the XPath syntax are identified with `.xpath()`.\n",
83 | "\n",
84 | "The example below combines these three calls."
85 | ]
86 | },
87 | {
88 | "cell_type": "code",
89 | "execution_count": 4,
90 | "metadata": {},
91 | "outputs": [
92 | {
93 | "data": {
94 | "text/plain": [
95 | "['https://www.re3data.org/api/beta/repository/r3d100012823',\n",
96 | " 'https://www.re3data.org/api/beta/repository/r3d100010953',\n",
97 | " 'https://www.re3data.org/api/beta/repository/r3d100012815',\n",
98 | " 'https://www.re3data.org/api/beta/repository/r3d100010261',\n",
99 | " 'https://www.re3data.org/api/beta/repository/r3d100012074']"
100 | ]
101 | },
102 | "execution_count": 4,
103 | "metadata": {},
104 | "output_type": "execute_result"
105 | }
106 | ],
107 | "source": [
108 | "URL = \"https://www.re3data.org/api/beta/repositories\"\n",
109 | "\n",
110 | "re3data_response = httpx.get(URL, params=re3data_query)\n",
111 | "urls = html.fromstring(re3data_response.content).xpath(\"//@href\")\n",
112 | "\n",
113 | "urls[:5]"
114 | ]
115 | },
116 | {
117 | "cell_type": "markdown",
118 | "metadata": {},
119 | "source": [
120 | "### Step 4: define what information about the repositories should be requested\n",
121 | "\n",
122 | "The function `extract_repository_info()` defined in the following code block extracts the content of specific XML elements and attributes. This function will be used to extract the specified information from responses of the re3data API. Its basic structure is similar to the process of extracting the URLs outlined in step 3 above.\n",
123 | "The XPath expressions defined here will extract the re3data IDs, names, URLs, and descriptions of the repositories. Results are stored in a dictionary that can be processed later.\n",
124 | "\n",
125 | "Depending on specific use cases, this function can be adapted to extract a different set of elements and attributes. For an overview of the metadata re3data offers, please refer to the documentation of the [re3data Metadata Schema](https://doi.org/10.2312/re3.006) (the API uses version 2.2 of the re3data Metadata Schema).\n",
126 | " \n",
127 | "Please note that in version 2.2 of the re3data Metadata Schema, the elements mentioned here have occurrences of 1 or 0-1, meaning that for each repository, they occur once at most. For information on how to deal with elements that can occur multiple times, please refer to other examples for using the re3data API."
128 | ]
129 | },
130 | {
131 | "cell_type": "code",
132 | "execution_count": 5,
133 | "metadata": {},
134 | "outputs": [],
135 | "source": [
136 | "def extract_repository_info(\n",
137 | " repository_metadata_xml: html.HtmlElement,\n",
138 | ") -> typing.Dict[str, str]:\n",
139 | " \"\"\"Extracts wanted metadata elements from a given repository metadata xml representation.\n",
140 | "\n",
141 | " Args:\n",
142 | " repository_metadata_xml: XML representation of repository metadata.\n",
143 | "\n",
144 | " Returns:\n",
145 | " Dictionary representation of repository metadata.\n",
146 | "\n",
147 | " \"\"\"\n",
148 | "\n",
149 | " namespaces = {\"r3d\": \"http://www.re3data.org/schema/2-2\"}\n",
150 | " return {\n",
151 | " \"re3data_id\": repository_metadata_xml.xpath(\"//re3data.orgidentifier/text()\", namespaces=namespaces)[0],\n",
152 | " \"name\": repository_metadata_xml.xpath(\"//repositoryname/text()\", namespaces=namespaces)[0],\n",
153 | " \"url\": repository_metadata_xml.xpath(\"//repositoryurl/text()\", namespaces=namespaces)[0],\n",
154 | " \"description\": repository_metadata_xml.xpath(\"//description/text()\", namespaces=namespaces)[0],\n",
155 | " }"
156 | ]
157 | },
158 | {
159 | "cell_type": "markdown",
160 | "metadata": {},
161 | "source": [
162 | "### Step 5: gather detailed information about repositories\n",
163 | "\n",
164 | "After preparing the list of URLs and the extracting function, these components can be put together. The code block below iterates through the list of URLs using a for-loop. For each repository, data is requested from the re3data API using `.get()` from an `httpx.Client`. The XML response is parsed with `html.fromstring()` before `extract_repository_info()` is called. The results are then appended to the list `results`.\n",
165 | "\n",
166 | "`repository_info` is a container for storing results of the API query. The DataFrame has four columns corresponding to the dictionary keys defined by `extract_repository_info()`."
167 | ]
168 | },
169 | {
170 | "cell_type": "code",
171 | "execution_count": 6,
172 | "metadata": {},
173 | "outputs": [],
174 | "source": [
175 | "results = []\n",
176 | "\n",
177 | "with httpx.Client() as client:\n",
178 | " for url in urls:\n",
179 | " repository_metadata_response = client.get(url)\n",
180 | " repository_metadata_xml = html.fromstring(repository_metadata_response.content)\n",
181 | " results.append(extract_repository_info(repository_metadata_xml))\n",
182 | "\n",
183 | "repository_info = pandas.DataFrame(results)"
184 | ]
185 | },
186 | {
187 | "cell_type": "markdown",
188 | "metadata": {},
189 | "source": [
190 | "### Step 6: Look at the results\n",
191 | "\n",
192 | "Results are now stored in `repository_info`. They can be inspected using `.head()`, visualized or stored locally with `.to_csv()`."
193 | ]
194 | },
195 | {
196 | "cell_type": "code",
197 | "execution_count": 7,
198 | "metadata": {},
199 | "outputs": [
200 | {
201 | "data": {
202 | "text/html": [
203 | "
\n",
204 | "\n",
217 | "
\n",
218 | " \n",
219 | "
\n",
220 | "
\n",
221 | "
re3data_id
\n",
222 | "
name
\n",
223 | "
url
\n",
224 | "
description
\n",
225 | "
\n",
226 | " \n",
227 | " \n",
228 | "
\n",
229 | "
0
\n",
230 | "
r3d100012823
\n",
231 | "
Vivli
\n",
232 | "
https://vivli.org/
\n",
233 | "
Vivli is a non-profit organization working to ...
\n",
234 | "
\n",
235 | "
\n",
236 | "
1
\n",
237 | "
r3d100010953
\n",
238 | "
Polar Data Catalogue
\n",
239 | "
https://www.polardata.ca/
\n",
240 | "
The Polar Data Catalogue is an online database...
\n",
241 | "
\n",
242 | "
\n",
243 | "
2
\n",
244 | "
r3d100012815
\n",
245 | "
UNB Libraries Dataverse
\n",
246 | "
https://dataverse.lib.unb.ca/
\n",
247 | "
UNB Dataverse is repository for research data ...
\n",
248 | "
\n",
249 | "
\n",
250 | "
3
\n",
251 | "
r3d100010261
\n",
252 | "
National Addiction & HIV Data Archive Program
\n",
253 | "
https://www.icpsr.umich.edu/web/pages/NAHDAP/i...
\n",
254 | "
NAHDAP acquires, preserves and disseminates da...
\n",
255 | "
\n",
256 | "
\n",
257 | "
4
\n",
258 | "
r3d100012074
\n",
259 | "
BindingDB
\n",
260 | "
http://bindingdb.org/bind/index.jsp
\n",
261 | "
BindingDB is a public, web-accessible knowledg...
\n",
262 | "
\n",
263 | " \n",
264 | "
\n",
265 | "
"
266 | ],
267 | "text/plain": [
268 | " re3data_id name \\\n",
269 | "0 r3d100012823 Vivli \n",
270 | "1 r3d100010953 Polar Data Catalogue \n",
271 | "2 r3d100012815 UNB Libraries Dataverse \n",
272 | "3 r3d100010261 National Addiction & HIV Data Archive Program \n",
273 | "4 r3d100012074 BindingDB \n",
274 | "\n",
275 | " url \\\n",
276 | "0 https://vivli.org/ \n",
277 | "1 https://www.polardata.ca/ \n",
278 | "2 https://dataverse.lib.unb.ca/ \n",
279 | "3 https://www.icpsr.umich.edu/web/pages/NAHDAP/i... \n",
280 | "4 http://bindingdb.org/bind/index.jsp \n",
281 | "\n",
282 | " description \n",
283 | "0 Vivli is a non-profit organization working to ... \n",
284 | "1 The Polar Data Catalogue is an online database... \n",
285 | "2 UNB Dataverse is repository for research data ... \n",
286 | "3 NAHDAP acquires, preserves and disseminates da... \n",
287 | "4 BindingDB is a public, web-accessible knowledg... "
288 | ]
289 | },
290 | "execution_count": 7,
291 | "metadata": {},
292 | "output_type": "execute_result"
293 | }
294 | ],
295 | "source": [
296 | "repository_info.head()"
297 | ]
298 | }
299 | ],
300 | "metadata": {
301 | "language_info": {
302 | "name": "python"
303 | }
304 | },
305 | "nbformat": 4,
306 | "nbformat_minor": 4
307 | }
308 |
--------------------------------------------------------------------------------
/examples-python/02_re3data_API_certification_by_type.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## Use case 2: Distribution of certificates across repository types (Python)\n",
8 | "\n",
9 | "> This notebook is based on the examples written in `R` from Dorothea Strecker's [examples-r/02_re3data_API_certification_by_type.ipynb](https://github.com/re3data/using_the_re3data_API/blob/main/examples-r/02_re3data_API_certification_by_type.ipynb). \n",
10 | "> Adapted in `Python` by Heinz-Alexander Fütterer.\n",
11 | "\n",
12 | "Observers of the repository landscape are interested in conducting a multivariate analysis of certification status and type of research data repositories.\n",
13 | "\n",
14 | "Research data repositories are diverse. The re3data Metadata Schema tries to account for that, resulting in rich and detailed metadata that can be accessed via the re3data API.\n",
15 | "\n",
16 | "### Step 1: load packages\n",
17 | "\n",
18 | "The package `httpx` includes the HTTP method GET, which will be used to request data from the re3data API. Responses from the re3data API are returned in XML. `lxml` includes functions for working with XML, for example parsing or extracting content of specific elements. The `pandas` library is used for storing the responses in a tabular data structure (i.e. a `DataFrame`). It offers useful functions for data manipulation and reshaping as well. `seaborn` is a package for statistical data visualization.\n",
19 | "\n",
20 | "If necessary, install the packages before loading them."
21 | ]
22 | },
23 | {
24 | "cell_type": "code",
25 | "execution_count": 1,
26 | "metadata": {},
27 | "outputs": [],
28 | "source": [
29 | "# !pip install httpx==0.23.0 lxml==4.8.0 pandas==1.4.2 seaborn==0.11.2"
30 | ]
31 | },
32 | {
33 | "cell_type": "code",
34 | "execution_count": 2,
35 | "metadata": {},
36 | "outputs": [],
37 | "source": [
38 | "import datetime\n",
39 | "import typing\n",
40 | "\n",
41 | "import httpx\n",
42 | "import pandas\n",
43 | "import seaborn\n",
44 | "from lxml import html\n",
45 | "\n",
46 | "seaborn.set_style(\"whitegrid\")"
47 | ]
48 | },
49 | {
50 | "cell_type": "markdown",
51 | "metadata": {},
52 | "source": [
53 | "### Step 2: obtain re3data IDs of all repositories indexed in re3data\n",
54 | "\n",
55 | "Information on individual repositories can be extracted using the re3data ID. Therefore, re3data IDs of all repositories indexed in re3data need to be identified first, using the endpoint **/api/beta/repositories**. Details of the re3data APIs are outlined in the [re3data API documentation](https://www.re3data.org/api/doc).\n",
56 | "\n",
57 | "The endpoint is queried using `httpx.get()`. The XML response is parsed using `html.fromstring()`. XML elements or attributes can be identified using XPath syntax. All attributes matching the XPath expression for finding re3data URLs are extracted with `.xpath()`. The example below applies these three calls in sequence.\n",
58 | "\n",
59 | "The endpoint **/api/beta/repository** provides detailed information about individual repositories that can be accessed via re3data IDs. Therefore, URLs for the next query are collected."
60 | ]
61 | },
62 | {
63 | "cell_type": "code",
64 | "execution_count": 3,
65 | "metadata": {},
66 | "outputs": [
67 | {
68 | "name": "stdout",
69 | "output_type": "stream",
70 | "text": [
71 | "2872\n"
72 | ]
73 | },
74 | {
75 | "data": {
76 | "text/plain": [
77 | "['https://www.re3data.org/api/beta/repository/r3d100010141',\n",
78 | " 'https://www.re3data.org/api/beta/repository/r3d100010148',\n",
79 | " 'https://www.re3data.org/api/beta/repository/r3d100010153',\n",
80 | " 'https://www.re3data.org/api/beta/repository/r3d100010201',\n",
81 | " 'https://www.re3data.org/api/beta/repository/r3d100010209']"
82 | ]
83 | },
84 | "execution_count": 3,
85 | "metadata": {},
86 | "output_type": "execute_result"
87 | }
88 | ],
89 | "source": [
90 | "URL = \"https://www.re3data.org/api/beta/repositories\"\n",
91 | "\n",
92 | "re3data_response = httpx.get(URL, timeout=60)\n",
93 | "tree = html.fromstring(re3data_response.content)\n",
94 | "urls = tree.xpath(\"//@href\")\n",
95 | "print(len(urls))\n",
96 | "\n",
97 | "urls[:5]"
98 | ]
99 | },
100 | {
101 | "cell_type": "markdown",
102 | "metadata": {},
103 | "source": [
104 | "### Step 3: define what information about the repositories should be requested\n",
105 | "\n",
106 | "The function `extract_repository_info()` defined in the following code block points to and extracts the content of specific XML elements and attributes. This function will be used later to extract the specified information from responses of the re3data API. Its basic structure is similar to the process of extracting the URLs outlined in step 2 above. \n",
107 | "The XPath expressions defined here will extract the re3data IDs, certificates, and types of the repositories. According to version 2.2 of the [re3data Metadata Schema](https://doi.org/10.2312/re3.006) used by the API, **type** and **certificate** have an occurrence of 1-n and 0-n, respectively. This means that the elements can occur multiple times. For this reason, all occurrences of these elements are stored in a list. These lists can be separated for the analysis later. In this and similar cases, extracting the re3data ID is particularly important, as it can serve as an ID column in the analysis. Results are stored in a dictionary.\n",
108 | "\n",
109 | "Depending on specific use cases, this function can be adapted to extract a different set of elements and attributes. For an overview of the metadata re3data offers, please refer to the documentation of the [re3data Metadata Schema](https://doi.org/10.2312/re3.006)."
110 | ]
111 | },
112 | {
113 | "cell_type": "code",
114 | "execution_count": 4,
115 | "metadata": {},
116 | "outputs": [],
117 | "source": [
118 | "def extract_repository_info(\n",
119 | " repository_metadata_xml: html.HtmlElement,\n",
120 | ") -> typing.Dict[str, typing.Any]:\n",
121 | " \"\"\"Extracts wanted metadata elements from a given repository metadata xml representation.\n",
122 | "\n",
123 | " Args:\n",
124 | " repository_metadata_xml: XML representation of repository metadata.\n",
125 | "\n",
126 | " Returns:\n",
127 | " Dictionary representation of repository metadata.\n",
128 | "\n",
129 | " \"\"\"\n",
130 | "\n",
131 | " namespaces = {\"r3d\": \"http://www.re3data.org/schema/2-2\"}\n",
132 | " return {\n",
133 | " \"re3data_id\": repository_metadata_xml.xpath(\"//re3data.orgidentifier/text()\", namespaces=namespaces)[0],\n",
134 | " \"type\": repository_metadata_xml.xpath(\"//type/text()\", namespaces=namespaces),\n",
135 | " \"certificate\": repository_metadata_xml.xpath(\"//certificate/text()\", namespaces=namespaces),\n",
136 | " }"
137 | ]
138 | },
139 | {
140 | "cell_type": "markdown",
141 | "metadata": {},
142 | "source": [
143 | "### Step 4: gather detailed information about repositories\n",
144 | "\n",
145 | "After preparing the list of URLs and the extracting function, these components can be put together. The code block below iterates through the list of URLs using a for-loop. For each repository, data is requested from the re3data API using `.get()` from an `httpx.Client`. The XML response is parsed with `html.fromstring()` before `extract_repository_info()` is called. The results are then appended to the list `results`.\n",
146 | "\n",
147 | "Because these steps are repeated for all repositories indexed in re3data, the process will take a while (~ 5 minutes).\n",
148 | "\n",
149 | "`repository_info` is a container for storing results of the API query. The DataFrame has three columns corresponding to the dictionary keys defined by `extract_repository_info()`."
150 | ]
151 | },
152 | {
153 | "cell_type": "code",
154 | "execution_count": 5,
155 | "metadata": {},
156 | "outputs": [
157 | {
158 | "data": {
218 | "text/plain": [
219 | " re3data_id type certificate\n",
220 | "0 r3d100010141 [disciplinary, institutional] []\n",
221 | "1 r3d100010148 [disciplinary, institutional] []\n",
222 | "2 r3d100010153 [disciplinary] []\n",
223 | "3 r3d100010201 [disciplinary, institutional] [RatSWD]\n",
224 | "4 r3d100010209 [disciplinary] []"
225 | ]
226 | },
227 | "execution_count": 5,
228 | "metadata": {},
229 | "output_type": "execute_result"
230 | }
231 | ],
232 | "source": [
233 | "results = []\n",
234 | "\n",
235 | "with httpx.Client() as client:\n",
236 | " for i, url in enumerate(urls):\n",
237 | " # Uncomment to see progress, every 100th url is printed\n",
238 | " # if i % 100 == 0:\n",
239 | " # print(url)\n",
240 | "\n",
241 | " repository_metadata_response = client.get(url, follow_redirects=True)\n",
242 | " repository_metadata_xml = html.fromstring(repository_metadata_response.content)\n",
243 | " results.append(extract_repository_info(repository_metadata_xml))\n",
244 | "\n",
245 | "repository_info = pandas.DataFrame(results)\n",
246 | "repository_info.head()"
247 | ]
248 | },
249 | {
250 | "cell_type": "markdown",
251 | "metadata": {},
252 | "source": [
253 | "### Step 5: process the results\n",
254 | "\n",
255 | "The first line in the code block below uses the method `.astype(bool)` on the column **certificate**, resulting in a new column indicating whether a repository received at least one certificate (`True`) or not (`False`).\n",
256 | "\n",
257 | "The results can be stored locally with `.to_csv()`. Columns containing lists of values (e.g. **type**) are separated with `.explode()`, creating a new row for each value if a repository was assigned multiple values. The resulting dataframe follows the specifications of [tidy data](http://dx.doi.org/10.18637/jss.v059.i10), a \"standard way of mapping the meaning of a dataset to its structure\". Tidy dataframes are often easier to understand and work with.\n",
258 | "\n",
259 | "Although this introduces duplication - multiple rows can now correspond to the same repository - the re3data IDs can be used to deduplicate results at any time."
260 | ]
261 | },
262 | {
263 | "cell_type": "code",
264 | "execution_count": 6,
265 | "metadata": {},
266 | "outputs": [
267 | {
268 | "data": {
358 | "text/plain": [
359 | " re3data_id type certification_status\n",
360 | "688 r3d100000001 disciplinary True\n",
361 | "1207 r3d100000002 disciplinary False\n",
362 | "1383 r3d100000004 disciplinary True\n",
363 | "2703 r3d100000005 institutional False\n",
364 | "2191 r3d100000006 disciplinary True\n",
365 | "639 r3d100000007 institutional False\n",
366 | "639 r3d100000007 disciplinary False\n",
367 | "1590 r3d100000011 disciplinary True\n",
368 | "1015 r3d100000012 disciplinary True\n",
369 | "2336 r3d100000013 other False"
370 | ]
371 | },
372 | "execution_count": 6,
373 | "metadata": {},
374 | "output_type": "execute_result"
375 | }
376 | ],
377 | "source": [
378 | "repository_info[\"certification_status\"] = repository_info[\"certificate\"].astype(bool)\n",
379 | "repository_info.drop(\"certificate\", axis=1, inplace=True)\n",
380 | "\n",
381 | "repository_info = repository_info.explode(\"type\")\n",
382 | "repository_info[\"type\"] = repository_info[\"type\"].astype(\"category\")\n",
383 | "\n",
384 | "repository_info.drop_duplicates(inplace=True)\n",
385 | "repository_info.sort_values(by=\"re3data_id\", inplace=True)\n",
386 | "repository_info.head(10)"
387 | ]
388 | },
389 | {
390 | "cell_type": "markdown",
391 | "metadata": {},
392 | "source": [
393 | "### Step 6: visualize the results\n",
394 | "\n",
395 | "Now that the results are processed, they can be visualized. The example below generates a `seaborn.catplot` showing the prevalence of (any) certification by repository type. \n",
396 | "Please note that, as mentioned above, **type** has an occurrence of 1-n. Some repositories are assigned more than one type, for example *institutional* and *other*."
397 | ]
398 | },
399 | {
400 | "cell_type": "code",
401 | "execution_count": 7,
402 | "metadata": {},
403 | "outputs": [
404 | {
405 | "data": {
406 | "image/png": "[base64-encoded PNG omitted: count plot titled 'Certification status per repository type for re3data repositories', x-axis: type, hue: certification_status]",
407 | "text/plain": [
408 | "
"
409 | ]
410 | },
411 | "metadata": {
412 | "needs_background": "light"
413 | },
414 | "output_type": "display_data"
415 | }
416 | ],
417 | "source": [
418 | "# color scheme for the plot\n",
419 | "colors = [\"#17C3B2\", \"#227C9D\"]\n",
420 | "\n",
421 | "# generate plot\n",
422 | "grid = seaborn.catplot(x=\"type\", hue=\"certification_status\", data=repository_info, kind=\"count\", palette=colors)\n",
423 | "\n",
424 | "# annotate counts for each bar\n",
425 | "for ax in grid.axes.flat:\n",
426 | " for container in ax.containers:\n",
427 | " ax.bar_label(container)\n",
428 | "\n",
429 | "today = datetime.date.today()\n",
430 | "num_repositories = len(repository_info[\"re3data_id\"].unique())\n",
431 | "\n",
432 | "# add title\n",
433 | "grid.set(\n",
434 | " title=f\"Certification status per repository type for re3data repositories [number of repositories: {num_repositories}, {today}]\"\n",
435 | ");"
436 | ]
437 | }
438 | ],
439 | "metadata": {
440 | "language_info": {
441 | "name": "python"
442 | }
443 | },
444 | "nbformat": 4,
445 | "nbformat_minor": 4
446 | }
447 |
--------------------------------------------------------------------------------
/examples-python/03_re3data_API_repository_APIs.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## Use case 3: Aggregating current API information and general information about repositories (Python)\n",
8 | "\n",
9 | "> This notebook is based on the `R` examples in Dorothea Strecker's [examples-r/03_re3data_API_repository_APIs.ipynb](https://github.com/re3data/using_the_re3data_API/blob/main/examples-r/03_re3data_API_repository_APIs.ipynb). \n",
10 | "> Adapted to `Python` by Heinz-Alexander Fütterer.\n",
11 | "\n",
12 | "“As a research data portal, it is important for us to know which repositories offer an API. We would like to aggregate API information, such as API endpoint, API type and general information about the repository.”\n",
13 | "\n",
14 | "### Step 1: load packages\n",
15 | "\n",
16 | "The package `httpx` provides the HTTP method GET, which will be used to request data from the re3data API. Responses from the re3data API are returned in XML. `lxml` includes functions for working with XML, for example parsing documents or extracting the content of specific elements. The `pandas` library is used for storing the responses in a tabular data structure (i.e. a `DataFrame`). It also offers useful functions for data manipulation and reshaping. `seaborn` is a package for beautiful data visualization.\n",
17 | "\n",
18 | "If necessary, install the packages before loading them."
19 | ]
20 | },
21 | {
22 | "cell_type": "code",
23 | "execution_count": 1,
24 | "metadata": {},
25 | "outputs": [],
26 | "source": [
27 | "# !pip install httpx==0.23.0 lxml==4.8.0 pandas==1.4.2 seaborn==0.11.2"
28 | ]
29 | },
30 | {
31 | "cell_type": "code",
32 | "execution_count": 2,
33 | "metadata": {},
34 | "outputs": [],
35 | "source": [
36 | "import datetime\n",
37 | "import typing\n",
38 | "\n",
39 | "import httpx\n",
40 | "import matplotlib.pyplot as plt\n",
41 | "import pandas\n",
42 | "import seaborn\n",
43 | "from lxml import html\n",
44 | "\n",
45 | "seaborn.set_style(\"whitegrid\")"
46 | ]
47 | },
48 | {
49 | "cell_type": "markdown",
50 | "metadata": {},
51 | "source": [
52 | "### Step 2: obtain URLs for further API queries\n",
53 | "\n",
54 | "Information on individual repositories can be extracted using the re3data ID. Therefore, re3data IDs of all repositories indexed in re3data need to be identified first, using the endpoint **/api/beta/repositories**. Details of the re3data APIs are outlined in the [re3data API documentation](https://www.re3data.org/api/doc).\n",
55 | "\n",
56 | "The endpoint is queried using `httpx.get()`. The XML response is parsed using `html.fromstring()` from `lxml`. XML elements or attributes can be identified using XPath syntax. The `.xpath()` method returns everything that matches an XPath expression; here, `//@href` selects the `href` attribute of each repository entry.\n",
57 | "\n",
58 | "The endpoint **/api/beta/repository** provides detailed information about individual repositories that can be accessed via re3data IDs. Each `href` value is already a complete URL for this endpoint that ends in the re3data ID, so the extracted URLs can be used directly for the next query."
59 | ]
60 | },
61 | {
62 | "cell_type": "code",
63 | "execution_count": 3,
64 | "metadata": {},
65 | "outputs": [
66 | {
67 | "name": "stdout",
68 | "output_type": "stream",
69 | "text": [
70 | "2874\n"
71 | ]
72 | },
73 | {
74 | "data": {
75 | "text/plain": [
76 | "['https://www.re3data.org/api/beta/repository/r3d100010141',\n",
77 | " 'https://www.re3data.org/api/beta/repository/r3d100010148',\n",
78 | " 'https://www.re3data.org/api/beta/repository/r3d100010153',\n",
79 | " 'https://www.re3data.org/api/beta/repository/r3d100010201',\n",
80 | " 'https://www.re3data.org/api/beta/repository/r3d100010209']"
81 | ]
82 | },
83 | "execution_count": 3,
84 | "metadata": {},
85 | "output_type": "execute_result"
86 | }
87 | ],
88 | "source": [
89 | "URL = \"https://www.re3data.org/api/beta/repositories\"\n",
90 | "\n",
91 | "re3data_response = httpx.get(URL, timeout=60)\n",
92 | "tree = html.fromstring(re3data_response.content)\n",
93 | "urls = tree.xpath(\"//@href\")\n",
94 | "print(len(urls))\n",
95 | "\n",
96 | "urls[:5]"
97 | ]
98 | },
99 | {
100 | "cell_type": "markdown",
101 | "metadata": {},
102 | "source": [
103 | "### Step 3: define what information about the repositories should be requested\n",
104 | "\n",
105 | "The function `extract_repository_info()` defined in the following code block points to and extracts the content of specific XML elements and attributes. This function will be used later to extract the specified information from responses of the re3data API. Its basic structure is similar to the process of extracting the URLs outlined in step 2 above.\n",
106 | "\n",
107 | "In our Metadata schema, **api** (the API endpoint) is an element with the attribute **apiType**. Please note that one repository can offer multiple APIs, and even several API types. Because the responses are parsed with `lxml.html`, element and attribute names are lowercased, which is why the XPath expressions below use `apitype` and `repositoryname`.\n",
108 | "\n",
109 | "The XPath expressions defined here will extract the re3data IDs, names, URLs, API endpoints and API types, in that order. Results are stored in a dictionary that can be processed later. Depending on specific use cases, this function can be adapted to extract a different set of elements and attributes. For an overview of the metadata re3data offers, please refer to the documentation of the [re3data Metadata Schema](https://doi.org/10.2312/re3.006) (the API uses version 2.2 of the re3data Metadata Schema)."
110 | ]
111 | },
112 | {
113 | "cell_type": "code",
114 | "execution_count": 4,
115 | "metadata": {},
116 | "outputs": [],
117 | "source": [
118 | "def extract_repository_info(\n",
119 | " repository_metadata_xml: html.HtmlElement,\n",
120 | ") -> typing.Dict[str, typing.Any]:\n",
121 | " \"\"\"Extracts wanted metadata elements from a given repository metadata xml representation.\n",
122 | "\n",
123 | " Args:\n",
124 | " repository_metadata_xml: XML representation of repository metadata.\n",
125 | "\n",
126 | " Returns:\n",
127 | " Dictionary representation of repository metadata.\n",
128 | "\n",
129 | " \"\"\"\n",
130 | "\n",
131 | " namespaces = {\"r3d\": \"http://www.re3data.org/schema/2-2\"}\n",
132 | " return {\n",
133 | " \"re3data_id\": repository_metadata_xml.xpath(\"//re3data.orgidentifier/text()\", namespaces=namespaces)[0],\n",
134 | " \"name\": repository_metadata_xml.xpath(\"//repositoryname/text()\", namespaces=namespaces)[0],\n",
135 | " \"url\": repository_metadata_xml.xpath(\"//repositoryurl/text()\", namespaces=namespaces),\n",
136 | " \"api\": repository_metadata_xml.xpath(\"//api/text()\", namespaces=namespaces),\n",
137 | " \"api_type\": repository_metadata_xml.xpath(\"//@apitype\", namespaces=namespaces),\n",
138 | " }"
139 | ]
140 | },
141 | {
142 | "cell_type": "markdown",
143 | "metadata": {},
144 | "source": [
145 | "### Step 4: gather detailed information about repositories\n",
146 | "\n",
147 | "After preparing the list of URLs, the extracting function and the container for results, these components can be put together. The code block below iterates through the list of URLs using a for-loop. For each repository, data is requested from the re3data API using `client.get()`. The XML response is parsed with `html.fromstring()`, and the function `extract_repository_info()` is called on the parsed document. If the returned dictionary contains at least one *api* entry, it is appended to the list `results`. Afterwards, `results` is converted into the `DataFrame` `repository_info`, and list-valued columns are expanded so that each API of a repository gets its own row."
148 | ]
149 | },
150 | {
151 | "cell_type": "code",
152 | "execution_count": 5,
153 | "metadata": {},
154 | "outputs": [],
155 | "source": [
156 | "results = []\n",
157 | "\n",
158 | "with httpx.Client() as client:\n",
159 | " for i, url in enumerate(urls):\n",
160 | " # Uncomment to see progress, every 100th url is printed\n",
161 | " # if i % 100 == 0:\n",
162 | " # print(url)\n",
163 | "\n",
164 | " repository_metadata_response = client.get(url, follow_redirects=True)\n",
165 | " repository_metadata_xml = html.fromstring(repository_metadata_response.content)\n",
166 | " repository_info = extract_repository_info(repository_metadata_xml)\n",
167 | "\n",
168 | " # filter out repositories with no information on APIs\n",
169 | " if len(repository_info[\"api\"]) > 0:\n",
170 | " results.append(repository_info)"
171 | ]
172 | },
173 | {
174 | "cell_type": "code",
175 | "execution_count": 6,
176 | "metadata": {},
177 | "outputs": [],
178 | "source": [
179 | "repository_info = pandas.DataFrame(results)\n",
180 | "repository_info = repository_info.apply(pandas.Series.explode)\n",
181 | "repository_info[\"api_type\"] = repository_info[\"api_type\"].astype(\"category\")\n",
182 | "repository_info.sort_values(by=\"re3data_id\", inplace=True)"
183 | ]
184 | },
185 | {
186 | "cell_type": "markdown",
187 | "metadata": {},
188 | "source": [
189 | "### Step 5: Look at the results\n",
190 | "\n",
191 | "Results are now stored in `repository_info`. They can be inspected using `.head()` or visualized."
192 | ]
193 | },
194 | {
195 | "cell_type": "code",
196 | "execution_count": 7,
197 | "metadata": {},
198 | "outputs": [
199 | {
200 | "data": {
201 | "text/html": [
202 | "<div>\n",
216 | "<table border=\"1\" class=\"dataframe\">\n",
217 | " <thead>\n",
218 | " <tr style=\"text-align: right;\">\n",
219 | " <th></th>\n",
220 | " <th>re3data_id</th>\n",
221 | " <th>name</th>\n",
222 | " <th>url</th>\n",
223 | " <th>api</th>\n",
224 | " <th>api_type</th>\n",
225 | " </tr>\n",
226 | " </thead>\n",
227 | " <tbody>\n",
228 | " <tr>\n",
229 | " <th>558</th>\n",
230 | " <td>r3d100000002</td>\n",
231 | " <td>Access to Archival Databases</td>\n",
232 | " <td>https://aad.archives.gov/aad/</td>\n",
233 | " <td>https://www.archives.gov/developer#toc-applica...</td>\n",
234 | " <td>other</td>\n",
235 | " </tr>\n",
236 | " <tr>\n",
237 | " <th>1208</th>\n",
238 | " <td>r3d100000005</td>\n",
239 | " <td>UNC Dataverse</td>\n",
240 | " <td>https://dataverse.unc.edu/</td>\n",
241 | " <td>https://guides.dataverse.org/en/latest/api/nat...</td>\n",
242 | " <td>REST</td>\n",
243 | " </tr>\n",
244 | " <tr>\n",
245 | " <th>1208</th>\n",
246 | " <td>r3d100000005</td>\n",
247 | " <td>UNC Dataverse</td>\n",
248 | " <td>https://dataverse.unc.edu/</td>\n",
249 | " <td>https://guides.dataverse.org/en/latest/api/swo...</td>\n",
250 | " <td>SWORD</td>\n",
251 | " </tr>\n",
252 | " <tr>\n",
253 | " <th>986</th>\n",
254 | " <td>r3d100000006</td>\n",
255 | " <td>Archaeology Data Service</td>\n",
256 | " <td>https://archaeologydataservice.ac.uk/</td>\n",
257 | " <td>http://data.archaeologydataservice.ac.uk/query/</td>\n",
258 | " <td>SPARQL</td>\n",
259 | " </tr>\n",
260 | " <tr>\n",
261 | " <th>986</th>\n",
262 | " <td>r3d100000006</td>\n",
263 | " <td>Archaeology Data Service</td>\n",
264 | " <td>https://archaeologydataservice.ac.uk/</td>\n",
265 | " <td>https://archaeologydataservice.ac.uk/about/met...</td>\n",
266 | " <td>OAI-PMH</td>\n",
267 | " </tr>\n",
268 | " </tbody>\n",
269 | "</table>\n",
270 | "</div>"
271 | ],
272 | "text/plain": [
273 | " re3data_id name \\\n",
274 | "558 r3d100000002 Access to Archival Databases \n",
275 | "1208 r3d100000005 UNC Dataverse \n",
276 | "1208 r3d100000005 UNC Dataverse \n",
277 | "986 r3d100000006 Archaeology Data Service \n",
278 | "986 r3d100000006 Archaeology Data Service \n",
279 | "\n",
280 | " url \\\n",
281 | "558 https://aad.archives.gov/aad/ \n",
282 | "1208 https://dataverse.unc.edu/ \n",
283 | "1208 https://dataverse.unc.edu/ \n",
284 | "986 https://archaeologydataservice.ac.uk/ \n",
285 | "986 https://archaeologydataservice.ac.uk/ \n",
286 | "\n",
287 | " api api_type \n",
288 | "558 https://www.archives.gov/developer#toc-applica... other \n",
289 | "1208 https://guides.dataverse.org/en/latest/api/nat... REST \n",
290 | "1208 https://guides.dataverse.org/en/latest/api/swo... SWORD \n",
291 | "986 http://data.archaeologydataservice.ac.uk/query/ SPARQL \n",
292 | "986 https://archaeologydataservice.ac.uk/about/met... OAI-PMH "
293 | ]
294 | },
295 | "execution_count": 7,
296 | "metadata": {},
297 | "output_type": "execute_result"
298 | }
299 | ],
300 | "source": [
301 | "repository_info.head()"
302 | ]
303 | },
304 | {
305 | "cell_type": "markdown",
306 | "metadata": {},
307 | "source": [
308 | "The example below generates a `seaborn.countplot` from the data. It first groups data by **api_type** and counts how many repositories are in each group, then orders **api_type** by occurrence in descending order. Then, a bar chart of APIs offered by repositories indexed in re3data is generated.\n",
309 | "Please note that, as mentioned above, **api_type** has an occurrence of 1-n. Some repositories are assigned more than one API type, for example REST and OAI-PMH."
310 | ]
311 | },
312 | {
313 | "cell_type": "code",
314 | "execution_count": 8,
315 | "metadata": {},
316 | "outputs": [
317 | {
318 | "data": {
320 | "text/plain": [
321 | "<Figure size 720x360 with 1 Axes>"
322 | ]
323 | },
324 | "metadata": {
325 | "needs_background": "light"
326 | },
327 | "output_type": "display_data"
328 | }
329 | ],
330 | "source": [
331 | "plt.figure(figsize=(10, 5))\n",
332 | "\n",
333 | "# generate plot\n",
334 | "ax = seaborn.countplot(\n",
335 | " x=\"api_type\",\n",
336 | " # count api_type only once per repository\n",
337 | " data=repository_info.drop_duplicates(subset=[\"re3data_id\", \"api_type\"]),\n",
338 | " order=repository_info[\"api_type\"].value_counts().index,\n",
339 | ")\n",
340 | "\n",
341 | "# annotate counts for each bar\n",
342 | "ax.bar_label(ax.containers[0])\n",
343 | "\n",
344 | "today = datetime.date.today()\n",
345 | "num_repositories = len(repository_info[\"re3data_id\"].unique())\n",
346 | "\n",
347 | "# add title\n",
348 | "ax.set(\n",
349 | " title=f\"API types for re3data repositories [number of repositories with at least one API: {num_repositories}, {today}]\"\n",
350 | ");"
351 | ]
352 | }
353 | ],
354 | "metadata": {
355 | "kernelspec": {
356 | "display_name": "Python 3.10.4 ('.venv': venv)",
357 | "language": "python",
358 | "name": "python3"
359 | },
360 | "language_info": {
361 | "codemirror_mode": {
362 | "name": "ipython",
363 | "version": 3
364 | },
365 | "file_extension": ".py",
366 | "mimetype": "text/x-python",
367 | "name": "python",
368 | "nbconvert_exporter": "python",
369 | "pygments_lexer": "ipython3",
370 | "version": "3.10.4"
371 | },
372 | "vscode": {
373 | "interpreter": {
374 | "hash": "f4e04fe1c2d2c9fb3f131bdb7502ac6b3e9db43803e91e12f0037d5645fe64f8"
375 | }
376 | }
377 | },
378 | "nbformat": 4,
379 | "nbformat_minor": 4
380 | }
381 |
--------------------------------------------------------------------------------
/examples-r/01_re3data_API_medical_research_community.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## Use case 1: identify and collect information about repositories catering to the medical research community (R)\n",
8 | "\n",
9 | "Medical researchers are looking for a suitable repository to deposit their data. They require a repository catering to medical research that offers data upload and assigns DOIs to datasets.\n",
10 | "\n",
11 | "Repositories meeting these specifications can be identified via the re3data API. The API also provides the option to retrieve further information about these repositories, such as the name of the repository or a description.\n",
12 | "\n",
13 | "### Step 1: load packages\n",
14 | "\n",
15 | "The package **httr** includes the HTTP method GET, which will be used to request data from the re3data API. Responses from the re3data API are returned in XML. **xml2** includes functions for working with XML, for example parsing or extracting content of specific elements. If necessary, install the packages before loading them."
16 | ]
17 | },
18 | {
19 | "cell_type": "code",
20 | "execution_count": 1,
21 | "metadata": {},
22 | "outputs": [
23 | {
24 | "name": "stderr",
25 | "output_type": "stream",
26 | "text": [
27 | "Warning message:\n",
28 | "\"package 'httr' was built under R version 4.1.1\"\n",
29 | "Warning message:\n",
30 | "\"package 'xml2' was built under R version 4.1.2\"\n"
31 | ]
32 | }
33 | ],
34 | "source": [
35 | "#install.packages(\"httr\")\n",
36 | "#install.packages(\"xml2\")\n",
37 | "library(httr)\n",
38 | "library(xml2)"
39 | ]
40 | },
41 | {
42 | "cell_type": "markdown",
43 | "metadata": {},
44 | "source": [
45 | "### Step 2: define query parameters\n",
46 | "\n",
47 | "Information on individual repositories can be extracted using the re3data ID. Therefore, re3data IDs of repositories with the desired characteristics need to be identified first.\n",
48 | "\n",
49 | "The re3data API allows querying via the endpoint **/api/beta/repositories**. Parameters that can be queried are listed in the [re3data API documentation](https://www.re3data.org/api/doc). For more information on re3data metadata, including descriptions of available elements and controlled vocabularies, please refer to the documentation of the [re3data Metadata Schema](https://doi.org/10.2312/re3.006) (the API uses version 2.2 of the re3data Metadata Schema). \n",
50 | "The query below returns re3data IDs of repositories meeting the following conditions:\n",
51 | "\n",
52 | "* **\"subjects[]\" = \"205 Medicine\"** The repository caters to the subject *Medicine*, notation 205 in the DFG Subject Classification, the subject classification used by re3data.\n",
53 | "* **\"dataUploads[]\"=\"open\"** The repository allows data upload.\n",
54 | "* **\"pidSystems[]\"=\"DOI\"** The repository assigns DOIs."
55 | ]
56 | },
57 | {
58 | "cell_type": "code",
59 | "execution_count": 2,
60 | "metadata": {},
61 | "outputs": [],
62 | "source": [
63 | "re3data_query <- list(\"subjects[]\" = \"205 Medicine\", \"dataUploads[]\"=\"open\", \"pidSystems[]\"=\"DOI\")"
64 | ]
65 | },
66 | {
67 | "cell_type": "markdown",
68 | "metadata": {},
69 | "source": [
70 | "### Step 3: obtain URLs for further API queries\n",
71 | "\n",
72 | "The query parameters defined in the previous step can then be passed to the re3data API using **GET**. \n",
73 | "The XML response is parsed using **read_xml**. XML elements or attributes can be identified using XPath syntax. The response from the re3data API includes URLs for further queries to the **/api/beta/repository** endpoint. These URLs can be identified with a simple XPath expression. All attributes matching the XPath expression are identified with **xml_find_all**, and their content is extracted using **xml_text**.\n",
74 | "\n",
75 | "The three functions are nested in the example below."
76 | ]
77 | },
78 | {
79 | "cell_type": "code",
80 | "execution_count": 3,
81 | "metadata": {},
82 | "outputs": [],
83 | "source": [
84 | "re3data_request <- GET(\"https://www.re3data.org/api/beta/repositories\", query = re3data_query)\n",
85 | "\n",
86 | "URLs <- xml_text(xml_find_all(read_xml(re3data_request), xpath = \"//@href\"))"
87 | ]
88 | },
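{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see what the XPath expression `//@href` selects, the same nested calls can be tried on a small hand-written XML snippet. This is a sketch under assumptions: the snippet is a made-up stand-in for an API response (the ID r3d100000001 and the element layout are hypothetical), not actual re3data output.\n",
"\n",
"```r\n",
"library(xml2)\n",
"\n",
"# Hypothetical, simplified stand-in for a response from the repositories endpoint\n",
"example_response <- read_xml('<list><repository><id>r3d100000001</id><link href=\"https://www.re3data.org/api/beta/repository/r3d100000001\" rel=\"self\"/></repository></list>')\n",
"\n",
"# Select every href attribute in the document and extract its value as text\n",
"example_URLs <- xml_text(xml_find_all(example_response, xpath = \"//@href\"))\n",
"print(example_URLs)\n",
"```\n",
"\n",
"Because `//@href` matches href attributes anywhere in the document, it returns one URL per repository as long as no other element carries an href attribute."
]
},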
89 | {
90 | "cell_type": "markdown",
91 | "metadata": {},
92 | "source": [
93 | "### Step 4: define what information about the repositories should be requested\n",
94 | "\n",
95 | "The function **extract_repository_info** defined in the following code block points to and extracts the content of specific XML elements and attributes. This function will be used later to extract the specified information from responses of the re3data API. Its basic structure is similar to the process of extracting the URLs outlined in step 3 above. \n",
96 | "The XPath expressions defined here will extract the re3data IDs, names, URLs, and descriptions of the repositories. Results are stored in a named list that can be processed later.\n",
97 | "\n",
98 | "Depending on specific use cases, this function can be adapted to extract a different set of elements and attributes. For an overview of the metadata re3data offers, please refer to the documentation of the [re3data Metadata Schema](https://doi.org/10.2312/re3.006) (the API uses version 2.2 of the re3data Metadata Schema).\n",
99 | "\n",
100 | "The function **xml_structure** from the package **xml2** can be very useful for inspecting the structure of XML objects and specifying XPath expressions.\n",
101 | " \n",
102 | "Please note that in version 2.2 of the re3data Metadata Schema, the elements mentioned here have occurrences of 1 or 0-1, meaning that for each repository, they occur once at most. For information on how to deal with elements that can occur multiple times, please refer to other examples for using the re3data API."
103 | ]
104 | },
105 | {
106 | "cell_type": "code",
107 | "execution_count": 4,
108 | "metadata": {},
109 | "outputs": [],
110 | "source": [
111 | "extract_repository_info <- function(repository_metadata_XML) {\n",
112 | " list(\n",
113 | " re3data_ID = xml_text(xml_find_all(repository_metadata_XML, \"//r3d:re3data.orgIdentifier\")),\n",
114 | " repositoryName = xml_text(xml_find_all(repository_metadata_XML, \"//r3d:repositoryName\")),\n",
115 | " repositoryUrl = xml_text(xml_find_all(repository_metadata_XML, \"//r3d:repositoryURL\")),\n",
116 | " description = xml_text(xml_find_all(repository_metadata_XML, \"//r3d:description\"))\n",
117 | " )\n",
118 | "}"
119 | ]
120 | },
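{
"cell_type": "markdown",
"metadata": {},
"source": [
"Elements with an occurrence of 0-1 may be absent for some repositories. In that case **xml_find_all** returns an empty node set, and the corresponding list item has length zero. One possible way to obtain **NA** instead is **xml_find_first**. The following is a minimal sketch on a made-up record: the XML snippet, the namespace URI and the helper **extract_single_value** are assumptions for illustration and are not part of the original notebook.\n",
"\n",
"```r\n",
"library(xml2)\n",
"\n",
"# Hypothetical record that has a name but no description element\n",
"example_record <- read_xml('<r3d:re3data xmlns:r3d=\"http://www.re3data.org/schema/2-2\"><r3d:repository><r3d:repositoryName>Example Repository</r3d:repositoryName></r3d:repository></r3d:re3data>')\n",
"\n",
"# Hypothetical helper: xml_find_first yields a missing node if nothing matches,\n",
"# and xml_text turns a missing node into NA\n",
"extract_single_value <- function(repository_metadata_XML, xpath) {\n",
" xml_text(xml_find_first(repository_metadata_XML, xpath))\n",
"}\n",
"\n",
"extract_single_value(example_record, \"//r3d:repositoryName\")\n",
"extract_single_value(example_record, \"//r3d:description\")\n",
"```\n",
"\n",
"Returning exactly one value per element keeps every list the same length, which can make appending rows to a dataframe more predictable."
]
},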
121 | {
122 | "cell_type": "markdown",
123 | "metadata": {},
124 | "source": [
125 | "### Step 5: create a container for storing results\n",
126 | "\n",
127 | "**repository_info** is a container for storing results of the API query. The dataframe has four columns corresponding to names of the list items defined by **extract_repository_info**."
128 | ]
129 | },
130 | {
131 | "cell_type": "code",
132 | "execution_count": 5,
133 | "metadata": {},
134 | "outputs": [],
135 | "source": [
136 | "repository_info <- data.frame(matrix(ncol = 4, nrow = 0))\n",
137 | "\n",
138 | "colnames(repository_info) <- c(\"re3data_ID\", \"repositoryName\", \"repositoryUrl\", \"description\")"
139 | ]
140 | },
141 | {
142 | "cell_type": "markdown",
143 | "metadata": {},
144 | "source": [
145 | "### Step 6: gather detailed information about repositories\n",
146 | "\n",
147 | "After preparing the list of URLs, the extracting function and the container for results, these components can be put together. The code block below iterates through the list of URLs using a for-loop. For each repository, data is requested from the re3data API using **GET**. The XML response is parsed with **read_xml** before **extract_repository_info** is called. The results are then appended as a new row to **repository_info**."
148 | ]
149 | },
150 | {
151 | "cell_type": "code",
152 | "execution_count": 6,
153 | "metadata": {},
154 | "outputs": [],
155 | "source": [
156 | "for (url in URLs) {\n",
157 | " repository_metadata_request <- GET(url)\n",
158 | " repository_metadata_XML <- read_xml(repository_metadata_request)\n",
159 | " results_list <- extract_repository_info(repository_metadata_XML)\n",
160 | " repository_info <- rbind(repository_info, results_list)\n",
161 | "}"
162 | ]
163 | },
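{
"cell_type": "markdown",
"metadata": {},
"source": [
"Growing a dataframe with **rbind** inside a loop works, but each iteration copies the dataframe. An alternative pattern is to collect one list per repository and bind all of them in a single step. The sketch below illustrates this with made-up values; **example_results** and **example_info** are hypothetical names, and no API requests are made.\n",
"\n",
"```r\n",
"# Made-up results in the shape returned by extract_repository_info\n",
"example_results <- list(\n",
" list(re3data_ID = \"r3d100000001\", repositoryName = \"Example Repository A\"),\n",
" list(re3data_ID = \"r3d100000002\", repositoryName = \"Example Repository B\")\n",
")\n",
"\n",
"# Convert each list to a one-row dataframe, then bind all rows at once\n",
"example_info <- do.call(rbind, lapply(example_results, as.data.frame))\n",
"print(example_info)\n",
"```\n",
"\n",
"For a handful of repositories the difference is negligible; the pattern mainly pays off when several thousand records are gathered."
]
},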
164 | {
165 | "cell_type": "markdown",
166 | "metadata": {},
167 | "source": [
168 | "### Results\n",
169 | "\n",
170 | "Results are now stored in **repository_info**. They can be inspected using **head**, visualized or stored locally with **write.csv**."
171 | ]
172 | },
173 | {
174 | "cell_type": "code",
175 | "execution_count": 7,
176 | "metadata": {},
177 | "outputs": [
178 | {
179 | "data": {
180 | "text/html": [
181 | "<table class=\"dataframe\">\n",
182 | "<caption>A data.frame: 5 × 4</caption>\n",
183 | "<thead>\n",
184 | "\t<tr><th></th><th scope=col>re3data_ID</th><th scope=col>repositoryName</th><th scope=col>repositoryUrl</th><th scope=col>description</th></tr>\n",
185 | "\t<tr><th></th><th scope=col>&lt;chr&gt;</th><th scope=col>&lt;chr&gt;</th><th scope=col>&lt;chr&gt;</th><th scope=col>&lt;chr&gt;</th></tr>\n",
186 | "</thead>\n",
187 | "<tbody>\n",
188 | "\t<tr><th scope=row>1</th><td>r3d100012823</td><td>Vivli</td><td>https://vivli.org/</td><td>Vivli is a non-profit organization working to advance human health through the insights and discoveries gained by sharing and analyzing data. It is home to an independent global data-sharing and analytics platform which serves all elements of the international research community. The platform includes a data repository, in-depth search engine and cloud-based analytics, and harmonizes governance, policy and processes to make sharing data easier. Vivli acts as a neutral broker between data contributor and data user and the wider data sharing community.</td></tr>\n",
189 | "\t<tr><th scope=row>2</th><td>r3d100010953</td><td>Polar Data Catalogue</td><td>https://www.polardata.ca/</td><td>The Polar Data Catalogue is an online database of metadata and data that describes, indexes and provides access to diverse data sets generated by polar researchers. These records cover a wide range of disciplines from natural sciences and policy, to health, social sciences, and more.</td></tr>\n",
190 | "\t<tr><th scope=row>3</th><td>r3d100012815</td><td>UNB Libraries Dataverse</td><td>https://dataverse.lib.unb.ca/</td><td>UNB Dataverse is repository for research data collected by researchers and organizations primarily affiliated with the University of New Brunswick. The repository allows researchers to deposit, share, analyze, cite, and explore data. Dataverse is an open source application developed by the Institute for Quantitative Social Science (IQSS) at Harvard University.</td></tr>\n",
191 | "\t<tr><th scope=row>4</th><td>r3d100010261</td><td>National Addiction &amp; HIV Data Archive Program</td><td>https://www.icpsr.umich.edu/web/pages/NAHDAP/index.html</td><td>NAHDAP acquires, preserves and disseminates data relevant to drug addiction and HIV research. By preserving and making available an easily accessible library of electronic data on drug addiction and HIV infection in the United States, NAHDAP offers scholars the opportunity to conduct secondary analysis on major issues of social and behavioral sciences and public policy</td></tr>\n",
192 | "\t<tr><th scope=row>5</th><td>r3d100012074</td><td>BindingDB</td><td>http://bindingdb.org/bind/index.jsp</td><td>BindingDB is a public, web-accessible knowledgebase of measured binding affinities, focusing chiefly on the interactions of proteins considered to be candidate drug-targets with ligands that are small, drug-like molecules. BindingDB supports medicinal chemistry and drug discovery via literature awareness and development of structure-activity relations (SAR and QSAR); validation of computational chemistry and molecular modeling approaches such as docking, scoring and free energy methods; chemical biology and chemical genomics; and basic studies of the physical chemistry of molecular recognition. BindingDB also includes a small collection of host-guest binding data of interest to chemists studying supramolecular systems.\n",
193 | "\n",
194 | "The data collection derives from a variety of measurement techniques, including enzyme inhibition and kinetics, isothermal titration calorimetry, NMR, and radioligand and competition assays. BindingDB includes data extracted from the literature and from US Patents by the BindingDB project, selected PubChem confirmatory BioAssays, and ChEMBL entries for which a well defined protein target (\"TARGET_TYPE='PROTEIN'\") is provided.</td></tr>\n",
195 | "</tbody>\n",
196 | "</table>\n"
197 | ],
198 | "text/latex": [
199 | "A data.frame: 5 × 4\n",
200 | "\\begin{tabular}{r|llll}\n",
201 | " & re3data\\_ID & repositoryName & repositoryUrl & description\\\\\n",
202 | " & & & & \\\\\n",
203 | "\\hline\n",
204 | "\t1 & r3d100012823 & Vivli & https://vivli.org/ & Vivli is a non-profit organization working to advance human health through the insights and discoveries gained by sharing and analyzing data. It is home to an independent global data-sharing and analytics platform which serves all elements of the international research community. The platform includes a data repository, in-depth search engine and cloud-based analytics, and harmonizes governance, policy and processes to make sharing data easier. Vivli acts as a neutral broker between data contributor and data user and the wider data sharing community. \\\\\n",
205 | "\t2 & r3d100010953 & Polar Data Catalogue & https://www.polardata.ca/ & The Polar Data Catalogue is an online database of metadata and data that describes, indexes and provides access to diverse data sets generated by polar researchers. These records cover a wide range of disciplines from natural sciences and policy, to health, social sciences, and more. \\\\\n",
206 | "\t3 & r3d100012815 & UNB Libraries Dataverse & https://dataverse.lib.unb.ca/ & UNB Dataverse is repository for research data collected by researchers and organizations primarily affiliated with the University of New Brunswick. The repository allows researchers to deposit, share, analyze, cite, and explore data. Dataverse is an open source application developed by the Institute for Quantitative Social Science (IQSS) at Harvard University. \\\\\n",
207 | "\t4 & r3d100010261 & National Addiction \\& HIV Data Archive Program & https://www.icpsr.umich.edu/web/pages/NAHDAP/index.html & NAHDAP acquires, preserves and disseminates data relevant to drug addiction and HIV research. By preserving and making available an easily accessible library of electronic data on drug addiction and HIV infection in the United States, NAHDAP offers scholars the opportunity to conduct secondary analysis on major issues of social and behavioral sciences and public policy \\\\\n",
208 | "\t5 & r3d100012074 & BindingDB & http://bindingdb.org/bind/index.jsp & BindingDB is a public, web-accessible knowledgebase of measured binding affinities, focusing chiefly on the interactions of proteins considered to be candidate drug-targets with ligands that are small, drug-like molecules. BindingDB supports medicinal chemistry and drug discovery via literature awareness and development of structure-activity relations (SAR and QSAR); validation of computational chemistry and molecular modeling approaches such as docking, scoring and free energy methods; chemical biology and chemical genomics; and basic studies of the physical chemistry of molecular recognition. BindingDB also includes a small collection of host-guest binding data of interest to chemists studying supramolecular systems.\n",
209 | "\n",
210 | "The data collection derives from a variety of measurement techniques, including enzyme inhibition and kinetics, isothermal titration calorimetry, NMR, and radioligand and competition assays. BindingDB includes data extracted from the literature and from US Patents by the BindingDB project, selected PubChem confirmatory BioAssays, and ChEMBL entries for which a well defined protein target (\"TARGET\\_TYPE='PROTEIN'\") is provided.\\\\\n",
211 | "\\end{tabular}\n"
212 | ],
213 | "text/markdown": [
214 | "\n",
215 | "A data.frame: 5 × 4\n",
216 | "\n",
217 | "| | re3data_ID <chr> | repositoryName <chr> | repositoryUrl <chr> | description <chr> |\n",
218 | "|---|---|---|---|---|\n",
219 | "| 1 | r3d100012823 | Vivli | https://vivli.org/ | Vivli is a non-profit organization working to advance human health through the insights and discoveries gained by sharing and analyzing data. It is home to an independent global data-sharing and analytics platform which serves all elements of the international research community. The platform includes a data repository, in-depth search engine and cloud-based analytics, and harmonizes governance, policy and processes to make sharing data easier. Vivli acts as a neutral broker between data contributor and data user and the wider data sharing community. |\n",
220 | "| 2 | r3d100010953 | Polar Data Catalogue | https://www.polardata.ca/ | The Polar Data Catalogue is an online database of metadata and data that describes, indexes and provides access to diverse data sets generated by polar researchers. These records cover a wide range of disciplines from natural sciences and policy, to health, social sciences, and more. |\n",
221 | "| 3 | r3d100012815 | UNB Libraries Dataverse | https://dataverse.lib.unb.ca/ | UNB Dataverse is repository for research data collected by researchers and organizations primarily affiliated with the University of New Brunswick. The repository allows researchers to deposit, share, analyze, cite, and explore data. Dataverse is an open source application developed by the Institute for Quantitative Social Science (IQSS) at Harvard University. |\n",
222 | "| 4 | r3d100010261 | National Addiction & HIV Data Archive Program | https://www.icpsr.umich.edu/web/pages/NAHDAP/index.html | NAHDAP acquires, preserves and disseminates data relevant to drug addiction and HIV research. By preserving and making available an easily accessible library of electronic data on drug addiction and HIV infection in the United States, NAHDAP offers scholars the opportunity to conduct secondary analysis on major issues of social and behavioral sciences and public policy |\n",
223 | "| 5 | r3d100012074 | BindingDB | http://bindingdb.org/bind/index.jsp | BindingDB is a public, web-accessible knowledgebase of measured binding affinities, focusing chiefly on the interactions of proteins considered to be candidate drug-targets with ligands that are small, drug-like molecules. BindingDB supports medicinal chemistry and drug discovery via literature awareness and development of structure-activity relations (SAR and QSAR); validation of computational chemistry and molecular modeling approaches such as docking, scoring and free energy methods; chemical biology and chemical genomics; and basic studies of the physical chemistry of molecular recognition. BindingDB also includes a small collection of host-guest binding data of interest to chemists studying supramolecular systems.\n",
224 | "\n",
225 | "The data collection derives from a variety of measurement techniques, including enzyme inhibition and kinetics, isothermal titration calorimetry, NMR, and radioligand and competition assays. BindingDB includes data extracted from the literature and from US Patents by the BindingDB project, selected PubChem confirmatory BioAssays, and ChEMBL entries for which a well defined protein target (\"TARGET_TYPE='PROTEIN'\") is provided. |\n",
226 | "\n"
227 | ],
228 | "text/plain": [
229 | " re3data_ID repositoryName \n",
230 | "1 r3d100012823 Vivli \n",
231 | "2 r3d100010953 Polar Data Catalogue \n",
232 | "3 r3d100012815 UNB Libraries Dataverse \n",
233 | "4 r3d100010261 National Addiction & HIV Data Archive Program\n",
234 | "5 r3d100012074 BindingDB \n",
235 | " repositoryUrl \n",
236 | "1 https://vivli.org/ \n",
237 | "2 https://www.polardata.ca/ \n",
238 | "3 https://dataverse.lib.unb.ca/ \n",
239 | "4 https://www.icpsr.umich.edu/web/pages/NAHDAP/index.html\n",
240 | "5 http://bindingdb.org/bind/index.jsp \n",
241 | " description \n",
242 | "1 Vivli is a non-profit organization working to advance human health through the insights and discoveries gained by sharing and analyzing data. It is home to an independent global data-sharing and analytics platform which serves all elements of the international research community. The platform includes a data repository, in-depth search engine and cloud-based analytics, and harmonizes governance, policy and processes to make sharing data easier. Vivli acts as a neutral broker between data contributor and data user and the wider data sharing community. \n",
243 | "2 The Polar Data Catalogue is an online database of metadata and data that describes, indexes and provides access to diverse data sets generated by polar researchers. These records cover a wide range of disciplines from natural sciences and policy, to health, social sciences, and more. \n",
244 | "3 UNB Dataverse is repository for research data collected by researchers and organizations primarily affiliated with the University of New Brunswick. The repository allows researchers to deposit, share, analyze, cite, and explore data. Dataverse is an open source application developed by the Institute for Quantitative Social Science (IQSS) at Harvard University. \n",
245 | "4 NAHDAP acquires, preserves and disseminates data relevant to drug addiction and HIV research. By preserving and making available an easily accessible library of electronic data on drug addiction and HIV infection in the United States, NAHDAP offers scholars the opportunity to conduct secondary analysis on major issues of social and behavioral sciences and public policy \n",
246 | "5 BindingDB is a public, web-accessible knowledgebase of measured binding affinities, focusing chiefly on the interactions of proteins considered to be candidate drug-targets with ligands that are small, drug-like molecules. BindingDB supports medicinal chemistry and drug discovery via literature awareness and development of structure-activity relations (SAR and QSAR); validation of computational chemistry and molecular modeling approaches such as docking, scoring and free energy methods; chemical biology and chemical genomics; and basic studies of the physical chemistry of molecular recognition. BindingDB also includes a small collection of host-guest binding data of interest to chemists studying supramolecular systems.\\n\\nThe data collection derives from a variety of measurement techniques, including enzyme inhibition and kinetics, isothermal titration calorimetry, NMR, and radioligand and competition assays. BindingDB includes data extracted from the literature and from US Patents by the BindingDB project, selected PubChem confirmatory BioAssays, and ChEMBL entries for which a well defined protein target (\"TARGET_TYPE='PROTEIN'\") is provided."
247 | ]
248 | },
249 | "metadata": {},
250 | "output_type": "display_data"
251 | }
252 | ],
253 | "source": [
254 | "head(repository_info)"
255 | ]
256 | }
257 | ],
258 | "metadata": {
259 | "kernelspec": {
260 | "display_name": "R",
261 | "language": "R",
262 | "name": "ir"
263 | },
264 | "language_info": {
265 | "codemirror_mode": "r",
266 | "file_extension": ".r",
267 | "mimetype": "text/x-r-source",
268 | "name": "R",
269 | "pygments_lexer": "r",
270 | "version": "4.0.4"
271 | }
272 | },
273 | "nbformat": 4,
274 | "nbformat_minor": 4
275 | }
276 |
--------------------------------------------------------------------------------
/examples-r/02_re3data_API_certification_by_type.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## Use case 2: distribution of certificates across repository types (R)\n",
8 | "\n",
9 | "Observers of the repository landscape are interested in conducting a multivariate analysis of certification status and type of research data repositories.\n",
10 | "\n",
11 | "Research data repositories are diverse. The re3data Metadata Schema tries to account for that, resulting in rich and detailed metadata that can be accessed via the re3data API.\n",
12 | "\n",
13 | "### Step 1: load packages\n",
14 | "\n",
15 | "The package **httr** includes the HTTP method GET, which will be used to request data from the re3data API. Responses from the re3data API are returned in XML. **xml2** includes functions for working with XML, for example parsing or extracting content of specific elements. **dplyr** and **tidyr** offer useful functions for data manipulation and reshaping. **ggplot2** is a package for data visualization.\n",
16 | "\n",
17 | "If necessary, install the packages before loading them."
18 | ]
19 | },
20 | {
21 | "cell_type": "code",
22 | "execution_count": 1,
23 | "metadata": {},
24 | "outputs": [
25 | {
26 | "name": "stderr",
27 | "output_type": "stream",
28 | "text": [
29 | "Warning message:\n",
30 | "\"package 'httr' was built under R version 4.1.1\"\n",
31 | "Warning message:\n",
32 | "\"package 'xml2' was built under R version 4.1.2\"\n",
33 | "Warning message:\n",
34 | "\"package 'dplyr' was built under R version 4.1.0\"\n",
35 | "\n",
36 | "Attaching package: 'dplyr'\n",
37 | "\n",
38 | "\n",
39 | "The following objects are masked from 'package:stats':\n",
40 | "\n",
41 | " filter, lag\n",
42 | "\n",
43 | "\n",
44 | "The following objects are masked from 'package:base':\n",
45 | "\n",
46 | " intersect, setdiff, setequal, union\n",
47 | "\n",
48 | "\n",
49 | "Warning message:\n",
50 | "\"package 'tidyr' was built under R version 4.1.2\"\n",
51 | "Warning message:\n",
52 | "\"package 'ggplot2' was built under R version 4.1.0\"\n"
53 | ]
54 | }
55 | ],
56 | "source": [
57 | "#install.packages(\"httr\")\n",
58 | "#install.packages(\"xml2\")\n",
59 | "#install.packages(\"dplyr\")\n",
60 | "#install.packages(\"tidyr\")\n",
61 | "#install.packages(\"ggplot2\")\n",
62 | "library(httr)\n",
63 | "library(xml2)\n",
64 | "library(dplyr)\n",
65 | "library(tidyr)\n",
66 | "library(ggplot2)"
67 | ]
68 | },
69 | {
70 | "cell_type": "markdown",
71 | "metadata": {},
72 | "source": [
73 | "### Step 2: obtain re3data IDs of all repositories indexed in re3data\n",
74 | "\n",
75 | "Information on individual repositories can be extracted using the re3data ID. Therefore, re3data IDs of all repositories indexed in re3data need to be identified first, using the endpoint **/api/v1/repositories**. Details of the re3data APIs are outlined in the [re3data API documentation](https://www.re3data.org/api/doc).\n",
76 | "\n",
77 | "The endpoint is queried using **GET**. The XML response is parsed using **read_xml**. XML elements or attributes can be identified using XPath syntax. All elements matching the XPath syntax for finding re3data IDs are identified with **xml_find_all**, and their content is extracted using **xml_text**. The three functions are nested in the example below.\n",
78 | "\n",
79 | "The endpoint **/api/v1/repository** provides detailed information about individual repositories that can be accessed via re3data IDs. Therefore, URLs for the next query are created by adding re3data IDs to the base URL."
80 | ]
81 | },
82 | {
83 | "cell_type": "code",
84 | "execution_count": 2,
85 | "metadata": {},
86 | "outputs": [],
87 | "source": [
88 | "re3data_request <- GET(\"https://www.re3data.org/api/v1/repositories\")\n",
89 | "re3data_IDs <- xml_text(xml_find_all(read_xml(re3data_request), xpath = \"//id\"))\n",
90 | "URLs <- paste(\"https://www.re3data.org/api/v1/repository/\", re3data_IDs, sep = \"\")"
91 | ]
92 | },
93 | {
94 | "cell_type": "markdown",
95 | "metadata": {},
96 | "source": [
97 | "### Step 3: define what information about the repositories should be requested\n",
98 | "\n",
99 | "The function **extract_repository_info** defined in the following code block points to and extracts the content of specific XML elements and attributes. This function will be used later to extract the specified information from responses of the re3data API. Its basic structure is similar to the process of extracting the URLs outlined in step 2 above. \n",
100 | "The XPath expressions defined here will extract the re3data IDs, certificates, and types of the repositories. According to version 2.2 of the [re3data Metadata Schema](https://doi.org/10.2312/re3.006) used by the API, **type** and **certificate** have an occurrence of 1-n and 0-n, respectively. This means that the elements can occur multiple times. For this reason, all occurrences of these elements are concatenated, separated by \"_AND_\". Concatenated values can be separated for the analysis later. In this and similar cases, extracting the re3data ID is particularly important, as it can serve as an ID column in the analysis. Results are stored in a named list.\n",
101 | "\n",
102 | "Depending on specific use cases, this function can be adapted to extract a different set of elements and attributes. For an overview of the metadata re3data offers, please refer to the documentation of the [re3data Metadata Schema](https://doi.org/10.2312/re3.006).\n",
103 | "\n",
104 | "The function **xml_structure** from the package **xml2** can be very useful for inspecting the structure of XML objects and specifying XPath expressions. "
105 | ]
106 | },
107 | {
108 | "cell_type": "code",
109 | "execution_count": 3,
110 | "metadata": {},
111 | "outputs": [],
112 | "source": [
113 | "extract_repository_info <- function(repository_metadata_XML) {\n",
114 | " list(\n",
115 | " re3data_ID = xml_text(xml_find_all(repository_metadata_XML, \"//r3d:re3data.orgIdentifier\")),\n",
116 | " type = paste(unique(xml_text(xml_find_all(repository_metadata_XML, \"//r3d:type\"))), collapse = \"_AND_\"),\n",
117 | " certificate = paste(unique(xml_text(xml_find_all(repository_metadata_XML, \"//r3d:certificate\"))), collapse = \"_AND_\")\n",
118 | " )\n",
119 | "}"
120 | ]
121 | },
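{
"cell_type": "markdown",
"metadata": {},
"source": [
"The concatenation step can be tried in isolation. The sketch below uses made-up values (**example_types** is a hypothetical name) to show how **unique** and **paste** behave for repeated values and for an element that does not occur at all.\n",
"\n",
"```r\n",
"# Made-up values for an element that occurs several times, including a duplicate\n",
"example_types <- c(\"disciplinary\", \"institutional\", \"disciplinary\")\n",
"\n",
"# Drop duplicates and join the remaining values into one string\n",
"paste(unique(example_types), collapse = \"_AND_\")\n",
"\n",
"# With no occurrences at all, the result is an empty string\n",
"paste(unique(character(0)), collapse = \"_AND_\")\n",
"```\n",
"\n",
"The empty string in the second case is why repositories without a certificate end up with an empty cell rather than a missing value."
]
},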
122 | {
123 | "cell_type": "markdown",
124 | "metadata": {},
125 | "source": [
126 | "### Step 4: create a container for storing results\n",
127 | "\n",
128 | "**repository_info** is a container for storing results of the API query. The dataframe has three columns corresponding to names of the list items defined by **extract_repository_info**."
129 | ]
130 | },
131 | {
132 | "cell_type": "code",
133 | "execution_count": 4,
134 | "metadata": {},
135 | "outputs": [],
136 | "source": [
137 | "repository_info <- data.frame(matrix(ncol = 3, nrow = 0))\n",
138 | "colnames(repository_info) <- c(\"re3data_ID\", \"type\", \"certificate\")"
139 | ]
140 | },
141 | {
142 | "cell_type": "markdown",
143 | "metadata": {},
144 | "source": [
145 | "### Step 5: gather detailed information about repositories\n",
146 | "\n",
147 | "After preparing the list of URLs, the extracting function and the container for results, these components can be put together. The code block below iterates through the list of URLs using a for-loop. For each repository, data is requested from the re3data API using **GET**. The XML response is parsed with **read_xml** before **extract_repository_info** is called. The results are then appended as a new row to **repository_info**.\n",
148 | "\n",
149 | "Because these steps are repeated for all repositories indexed in re3data, the process will take a while."
150 | ]
151 | },
152 | {
153 | "cell_type": "code",
154 | "execution_count": 5,
155 | "metadata": {},
156 | "outputs": [],
157 | "source": [
158 | "for (url in URLs) {\n",
159 | " repository_metadata_request <- GET(url)\n",
160 | " repository_metadata_XML <- read_xml(repository_metadata_request)\n",
161 | " results_list <- extract_repository_info(repository_metadata_XML)\n",
162 | " repository_info <- rbind(repository_info, results_list)\n",
163 | "}"
164 | ]
165 | },
166 | {
167 | "cell_type": "markdown",
168 | "metadata": {},
169 | "source": [
170 | "### Step 6: process the results\n",
171 | "\n",
172 | "The first line in the code block below uses the function **mutate_all** and the **%>%** operator to assign empty cells the value **NA**. Similarly, the next line modifies the column providing information on repository certification with **mutate** and an **ifelse** statement, resulting in a column indicating whether a repository received at least one certificate (TRUE) or not (FALSE).\n",
173 | "\n",
174 | "The results can be stored locally with **write.csv**. Concatenated values in the column type are separated with **separate_rows**, creating new rows if a repository was assigned multiple values. The resulting dataframe follows the specifications of [tidy data](http://dx.doi.org/10.18637/jss.v059.i10), a \"standard way of mapping the meaning of a dataset to its structure\". Tidy dataframes are often easier to understand and work with.\n",
175 | "\n",
176 | "Although this introduces duplication - multiple rows can now correspond to the same repository - the re3data IDs can be used to deduplicate results at any time."
177 | ]
178 | },
179 | {
180 | "cell_type": "code",
181 | "execution_count": 6,
182 | "metadata": {},
183 | "outputs": [],
184 | "source": [
185 | "repository_info <- repository_info %>% mutate_all(na_if, \"\")\n",
186 | "repository_info <- repository_info %>% mutate(certificate = ifelse(is.na(certificate), FALSE, TRUE))\n",
187 | "repository_info <- repository_info %>% separate_rows(type, sep = \"_AND_\")"
188 | ]
189 | },
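{
"cell_type": "markdown",
"metadata": {},
"source": [
"The effect of these processing steps can be checked on a small made-up dataframe before running them on the full results. This is a sketch under assumptions: **example_info** and all values in it are hypothetical, and **mutate** with **na_if** is applied to a single column here instead of **mutate_all**.\n",
"\n",
"```r\n",
"library(dplyr)\n",
"library(tidyr)\n",
"\n",
"# Made-up results: one repository with two types, one without a certificate\n",
"example_info <- data.frame(\n",
" re3data_ID = c(\"r3d100000001\", \"r3d100000002\"),\n",
" type = c(\"disciplinary_AND_institutional\", \"other\"),\n",
" certificate = c(\"Example Certificate\", \"\"),\n",
" stringsAsFactors = FALSE\n",
")\n",
"\n",
"example_info <- example_info %>%\n",
" mutate(certificate = na_if(certificate, \"\")) %>% # empty string becomes NA\n",
" mutate(certificate = !is.na(certificate)) %>% # TRUE if any certificate is present\n",
" separate_rows(type, sep = \"_AND_\") # one row per repository type\n",
"\n",
"print(example_info)\n",
"```\n",
"\n",
"The first repository appears in two rows afterwards, one per type, while its re3data ID still identifies it as a single repository."
]
},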
190 | {
191 | "cell_type": "markdown",
192 | "metadata": {},
193 | "source": [
194 | "### Step 7: visualize the results\n",
195 | "\n",
196 | "Now that the results are processed, they can be visualized. The example below first removes repositories without a specified type using **filter**, and then generates a bar chart showing the prevalence of (any) certification by repository type. \n",
197 | "Please note that, as mentioned above, **type** has an occurrence of 1-n. Some repositories are assigned more than one type, for example *institutional* and *other*."
198 | ]
199 | },
200 | {
201 | "cell_type": "code",
202 | "execution_count": 8,
203 | "metadata": {},
204 | "outputs": [
205 | {
206 | "data": {
207 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAA0gAAANICAMAAADKOT/pAAAANlBMVEUAAAAXw7IifJ1NTU1o\naGh8fHyMjIyampqnp6eysrK9vb3Hx8fQ0NDZ2dnh4eHp6enw8PD////I1Uz+AAAACXBIWXMA\nABJ0AAASdAHeZh94AAAgAElEQVR4nO2di7aiuhYFcy4q6kbF///ZayBAgOCrSVZcVI3RLY8Q\nM5HaQEA0dwD4Z4x0AwA0gEgAK4BIACuASAArgEgAK4BIACuASAArgEgAK4BIACuASAArsKJI\nfwdjzKFaml2XzfuZdtiYnRt+Xfpd2mWeT0lJ0/oPmvBpa9u1+CGnnSmOtT9lupJXXmkv1sJn\nH3G+rBbjWpiW/dI7meH/Q1Pu2Tr0S7/LvLTsp9S8+wdN+LS1hydre4l98xkVtydvu/JKe7EW\nEGnM9bEzujxeL3tzWHgn4w9fX7Xri4blJtKnTfi0ta/X4oyz2d/snsz/jCKLFK3OvFgr4M6c\n3dDehI/uxiK9qg+R1i9/tx9TPVsSkdZgpYDV8Dfuao725bFrMvtGKWPqnTk0hxTtGjXesF32\nUbJsj9qPu8dCl7sr0pe4loUpymtb2f1YmJ3n6vVxplCUl36ZxVqal750l9/cz4Vr6LzRfSE3\n4s/3l/Ma6L1Dn3WWoVkh7vSm7gaaorU/ealxQ/Fw1cvN7uYU3ZovTvfwSvOmxFwL3YfjNXH6\nEf0GK4l0mO6G/trP2TplrEXHRZHaOc0n606zqokClemnP6bs++GGSz+z+7QWamleLl5VbuKp\nb+i80X2hdmQ031vOb6D3Dv4mNM5gqzu6RvyZU/cutqg/ealxQ/Fw1cvNbrjtzd+w5svwSvOm\nxFwL7sPx6px9RL/BSiIVZtQR9Ngt2b9l1327Jvd2pt994P1/NUV1r/d2HZ6aTerUnECPSphT\nfa8fH/vNTiku9/ownGTvmo3iz/4hb5dZqKV98Uq7/KZ4TKkKe7oRaHRXqBkZz/eX8xo4bY/7\nSzzOYKu7uUbsza1/m/t48lLj/PKhqpeb3U5uhx4LtrWb0Erzp0RcC91HPNQ5+4h+g5VEmh4D\nH1uxansoYMxlKDL7v2xOrpojGv8A3itxdH9LS/vabgT18H7eO7eDC7V0fxpnDW834+IYbHRX\n6DIP5S83auC4Pd1ud1Skqe7gouxG5f3JS40blQ9Vvdjsu91cD/Zvl53enNbWhQmttPnZVJy1\n4D5iv877LxJJpJ3pGJ+izP/fjfZl1+q0nyqwc5/WbdjreO/32CrKv9t4YqiW9sUr7Rru/uIe\ndsFGj9KN5o+W8xo4bU+3Wc4zXJstuz+yG/5s95OXGjcqH6p6sdktjwOAy7DmDya00vwpMdeC\n+4iHOmcf0W+wkki7yaGd8df2U5H8j/5c9B94qMSsMsutWWZ3HiaGa2lfvNKuzl0XINjo4Y2n\noUbLeeWm7Rk1YbxbPNgjouHIrps+TF5q3Kh8sOqlZjuu1oBufG9CK82fEnMtuNlDnbOP6DdY\nSaTSPze8jT/Qt0U6Pw6bj3+3z0R6HKKXds2fuokLtXQvfWlXp9sU9ibY6ECCJ8tN3uGVSHbf\n4x3Z+XuqdvJS40bl3xBptuSorH2drzR/Ssy1MPtrOvuIfoOVRPK6v2/F3nY+DNcKn4vkdVPs\nvKNmr8TzQ7uGa2m7/brjh2At3jJtadc41257cBJo9GhkNN9bbtTAaXu6RoWOvx67He/IzttF\nuMlLjRuVf3loF1jS35Kb8835SvOnxFwLbiOYNNH/iH6DlUR6rIq/duCxZ67sHqq5uco/hFgQ\nqXTn1r0I1VSBY1vX6Cx29ld22DJCtdT9lPtkceNOc+3pcqDRo+Lj+cNyowZO29MuG8pgm3T0\njuz66f3kpcaNyoerXmi2+7NV2UmHdus9D6tutupDK23lteA2goVwv8Na
zb20twjVjyNr+1fq\n2lwTuBbdhQT7Tm7buI//r0xxdd3fzd0RbXesX/pxeHFsO02vAZHa3tLj0Ic0r2VvDrWb4pV2\n+ZserKrZwAKNvvvvNp4/LDdq4Kg9fRNCGWxbC7+Xt+sD6ycvNW7UrIWqw80+NX3Ol2YPcG76\nrv9Mu16mK82fEnMtNC9+nbOP6DdYzfvuWptbAW60vWrXTNl5J/H+/+1lQfun6NxVcRmXHl/G\nuw+LWtz1u+LWLTOvpS1yHAa9mzaNu+eiCjf6Pnq30fz5cv6lyKI7UWybEMrQTvVOBXbdqX03\nealx42aFqw4329202lbfDp/CK82fEnMtjD7i4z3wEf0G6+1Aa9tXasruD+bN3snTdL10f8l2\nw+Hb6NzlbG/sb4fs/SOX5nzLLz25Rch7sVyaO0pu/TvMa7lfdv2tMEPpe1fRwzB3X8u80X2h\nwHxvOf/mmOEd2j+2bRMCGe7N+Ym3vXRF+8lLjZs0K1h1uNnN2m5vL37wN9wiNF9p3pSYa8G9\neE2cfkS/wY8dia7Ot4fi6xzCn8PX77vJ2Z8nZN/AdGx9TYiK9DgpCN2b2U/OfjvNvoHp2Pqa\nEBRpcsYTmJz9dpp9A9Ox9TUhKNIufPnem5z9dpp9A9PBmgBYAUQCWAFEAliBlUQy5pPD5S8O\nrb969FTgXdd4LtR5P9ydfD0YU7qLPh3TMs9r8L8H7k8P1DD6yvix6C6/hWZPn7r12QcEH/Mr\nIn3z6KnQuy6/9duNcjcHNK25tsPthUdHMSnzvIbL7FaDdnqgBr9o+2WD0Z2d85oK/8IzIkVl\nNZHilXaLfPzoqbXeesLRPtHK3q5mt9iDvSPt7N8XVtmGjso8r6G5teyvaZY/PVSDV/Rhyrm2\nD184Bme/89QtWJXfEenjRSLVU3T3bhZ9dV6l1+b726MyT2u4tKYcrQT+9EANftH2a7TtTfOB\n2e88dQtWZdXt0wzPyvLupHr8O7VTz0X/pcnTy2dg+TdmDeceo2LpngsV2AqbSe327m3OxW5W\n5ulKO7Z72qspx9MDw37R0/whO4GaRh4jUlRWFql/VtZIpPaZTZfmi4/2L3b3dK6nz8DybxWe\nipT8uVAhkawyp/bQrj/+Oo2PQJ/3j9i5+7bievR1ivCwX/RgZjd1zmrqnrrVr7VnbYF/ZGWR\n+mdljUSyX1o5mqJ9ab4c+cYzsEbPbpr8lc7guVBVa8+pMN5XIW7jb9FUT78vXQ1fjx/l85eq\nAl+gdXv5qrml/R6c3b6O91qIFJWVReqflTUSyU2thinNn8rmmayLz8AaPbtpIlIGz4XatV8L\naHbBh273cBjvkHZPvy69G06yRm/mL+UPj0U6TTv1JjX1T92azocorH2O1L2MBuvZzHbHY549\nA2v8DIDAeYPoc6HaY6tr07d23bt+5nrc3x04/prNnYvkLzWqYSySleRx8HYOzm5pn7o1nQ9R\nSCLSPTxzOGsJidSPzK7/9DV5SyZ+LpR7sHDZbqkXd4I/7gI4zjsE5nNnm7+/1NGdn43XzrCX\nf4i8C852XEN7LIhCBiJNa5mOLIvkTUr7XKhua59svIUJlAnTzT10NeznS/WudXvIoaj3xqHZ\nHYsrF9ZGSKS6H158BtbLQzux50LV++IybnD76nc1DGVCDHPHndb+UrMa/KKB3c8HHemwPvFE\n6h+BFRKpOV662D+ei8/AGj/eKbBJSD0X6lb0t97s27e9tMXPwxmLVyaAN9ddRm0PCv2l5jX4\nRffdGjwEZ/tP3VrMAWsSSyTvEVghkZq7WZpvVC8+A2v07KaQSELPhaqLYRdVNb3lN/dDD0Of\nnV9mzmiuu7GnmEwP1eAVvdgLDf0bz2b7T92arDWIQyyRvEdghUSaXz6dPQNrdNE0eJCS8LlQ\n3tuXXn+Ee5aY2/Xt+j62UZlxo6dz3W2vf5PpgRr8ot3Tsvzbv/3Z/lO35g
2A9YklkvcIrGBn\ng71FyJ0CLDwDa3ynT/BoP91zoUZG+Jt45X11YShk5hos1uDf5zRMD9Qw/p6EvX/pybcs/Kdu\nzRoA67OqSHJIv/8bSLfwB1bRL4NIifiT/gW6/FfRT7OaSLKfU/5byVH2B3+kPyD1INI2kP6A\n1MPaBVgBRAJYAUQCWIGRSPbiijm0VyLq0XWM9pKKu0wzGgGAsUjuMn17ufziizS6yWA0AgD3\nkUhXYx/xZG8+sXeZnPwv/BemrO3jnYrZCADcRyKVTp32KQX+lzO7u4gP9i7T0QgAWDyR+q+l\ntbdJe1fiu2+bVe1DFrwRALCEeu3sQdvVlPa3XdtbNrvvBzSPJRyNAIAlIFJz7PZnvC6F0Dez\nuZsAYCAgQ/O1mrL55pj7qtwzkQzAdnkiUtnshHauK6EaPRxuaY+U2b4ps+ZEhrRyPFFg8vCb\nun9sSFMakfKDtHIsKzB7iNTMHUTKDNLKsajAYXa/gtXl0D+96jAZCdciTWbNiQxp5VgQ6bbr\nPXIPdHrosre7qfaE6dJeR/JGQrXIk1lzIkNaOcIiVabo70Qt3TPazvZ2h8rtfJr91WgkUEsG\nZNacyJBWjqBIN/83R6/tTxm4HzV5da9dXulya05kSCtHUKRy1Dvursi2brmnqLX7oNHIrJYc\nyKw5kSGtHEGRJpeZmgendb8wX9nnL5aXwMi0lhzIrDmRIa0ca3dc55Uut+ZEhrRyIJIiSCsH\nIimCtHIgkiJIKwciKYK0ciCSIkgrByIpgrRyIJIiSCsHIimCtHIgkiJIKwciKYK0ciCSIkgr\nByIpgrRyIJIiSCsHIimCtHIgkiJIKwciKYK0ciCSIkgrByIpgrRyIJIiSCsHIimCtHIgkiJI\nKwciKYK0ciCSIkgrByIpgrRyIJIiSCsHIimCtHIgkiJIKwciKYK0ciCSIkgrByIpgrRyIJIi\nSCsHIimCtHIgkiJIKwciKYK0ciCSIkgrByIpgrRyIJIiSCsHIimCtHIgkiJIKwciKYK0ciCS\nIkgrByIpgrRyIJIiSCsHIimCtHIgkiJIKwciKYK0ciCSIkgrByIpgrRyIJIiSCsHIimCtHIg\nkiJIKwciKYK0ciCSIkgrByIpgrRyIJIiSCsHIimCtHKsLhLAJllbpFVqWY3MmhMZ0sqBSIog\nrRyIpAjSyoFIiiCtHIikCNLKgUiKIK0ciKQI0sqBSIogrRyIpAjSyoFIiiCtHIikCNLKgUiK\nIK0ciKQI0sqBSIogrRyIpAjSyoFIiiCtHIikCNLKgUiKIK0ciKQI0sqBSIogrRyIpAjSyoFI\niiCtHIikCNLKgUiKIK0ciKQI0sqBSIogrRyIpAjSyoFIiiCtHIikCNLKgUiKIK0ciKQI0sqB\nSIogrRyIpAjSyoFIiiCtHDmJdC0LYw5VO3IpjSnK63xOPfutwUjN+T1IK0dGIh2dIEc7UrmR\najbngkhLkFaOfES6GnOu7/ezMX+PscKU9b0uTTGbc2r+j96cX4S0cuQjUukE+TM7u0Mqm5GD\nuUzmPCbdUjTnFyGtHPmIVHSL2oO2Y3tM9xDqOJlzb3ZS8Zvzi5BWjnxEGqoo7G6n7Wa4msNk\nztWU570x5Xu7pbxWdmxIK0d+IjVHdWa0E/Ln/Bm/GyJ+c34J0sqRn0i74rYgUjOnNMXjtOle\nFaZO0pxfgrRyZCfSwdoSFKmds7PdD3d39hS/OT8FaeXITaTjvtnRBERyczrqtzod8lrZsSGt\nHJmJdCzaToS5SN2c4Z3eeau8VnZsSCtHXiIdzLkbaLW5db12/ZzhnRBpCmnlyEmkW9HbcnRn\nQpf2TMib03Uy3Mw+cnN+D9LKkZFIVdsf54bbPdGh6eX255ROqfNbdwrltbJjQ1o58hHpVvhn\nQf69dqM5V1NYt9rbhSI25xchrRz5iFQa493XffYuu47nuCuy076HtZvzi5BW
jnxEMiNd7tXh\nMVBeAnOuD7F2x3cux+a2smNDWjnyESkKmTUnMqSVA5EUQVo5EEkRpJUDkRRBWjkQSRGklQOR\nFEFaORBJEaSVA5EUQVo5FkRaflbj8si8lgX+S0deKzs2pJUjLNLysxqXR+a1LIFIkSCtHEGR\nlp/V+GRkVssiiBQJ0soRFGn5WY3LI/NaFkGkSJBWjqBIy89qXB6Z17IIIkWCtHI877WbPatx\neeRJLVMQKRKkleOpSPNnNS6PLNcyA5EiQVo5niqwM9NnNT4TybxLSpEA0vBEpLI5B2KP9DOQ\nVo4nCri+BET6GUgrx7ICXZ8cIv0MpJVjUYFDd7/C6FmNyyPhWoIgUiRIK8eCSLddf9/P6FmN\nyyOhWhZApEiQVo6wSJUprsOw96zG5ZFALUsgUiRIK0dQpJtZelYj99rlDGnlWLjXzu8d95/V\n+GRkVssiiBQJ0soRFGlymcl7VuOzkWktiyBSJEgrx2cd15/UtwwiRYK0ciCSIkgrByIpgrRy\nIJIiSCsHIimCtHIgkiJIKwciKYK0ciCSIkgrByIpgrRyIJIiSCsHIimCtHIgkiJIKwciKYK0\nciCSIkgrByIpgrRyIJIiSCsHIimCtHIgkiJIKwciKYK0ciCSIkgrByIpgrRyIJIiSCsHIimC\ntHIgkiJIKwciKYK0ciCSIkgrByIpgrRyIJIiSCsHIimCtHIgkiJIKwciKYK0ciCSIkgrByIp\ngrRyIJIiSCsHIimCtHIgkiJIKwciKYK0ciCSIkgrByIpgrRyIJIiSCsHIimCtHIgkiJIKwci\nKYK0ciCSIkgrByIpgrRyIJIiSCsHIimCtHIgkiJIKwciKYK0ciCSIkgrByIpgrRyIJIiSCsH\nIimCtHIgkiJIKwciKYK0ciCSIkgrByIpgrRyrC7SG6QUCSANa4v0RpmUIq0S6lcgrRyIpAjS\nyoFIiiCtHIikCNLKgUiKIK0ciKQI0sqBSIogrRyIpAjSyoFIiiCtHIikCNLKgUiKIK0ciKQI\n0sqBSIogrRyIpAjSyoFIiiCtHIikCNLKgUiKIK0ciKQI0sqBSIogrRyIpAjSyoFIiiCtHIik\nCNLKgUiKIK0ciKQI0sqBSIogrRyIpAjSyoFIiiCtHIikCNLKgUiKIK0ciKQI0sqBSIogrRyI\npAjSyoFIiiCtHIikCNLKgUiKIK0ciKQI0sqBSIogrRyIpAjSyoFIiiCtHIikCNLKgUiKIK0c\niKQI0sqBSIogrRyIpAjSyoFIiiCtHIikCNLKgUiKIK0ciKQI0sqBSIogrRyIpAjSyoFIiiCt\nHIikCNLKgUiKIK0ciKQI0sqBSIogrRyIpAjSyoFIiiCtHIikCNLKgUiKIK0cT0Q6uSm1cTRj\nl9KYorze5yPhWgIgUiRIK8eySJUz537xRarccDUbCdcSApEiQVo5FkWqul3QY8/0N0wuTFnf\n69IUs5FgLUEQKRKklWNJpJMpOpEO5tZPrkzppl0mI8FawiBSJEgrx4JIhdndOpHMbph+dIdx\nlTlORkK1LIBIkSCtHAsimfPjXzvlasrzzhRls1s6mKubeJiMhGpZAJEiQVo5nvTaOZH+jNel\n0O2lmoHRyFItARApEqSV47VIpSnsKVBVmPq5SOZdUooEkIaXIu1cV0JzJsQeKWdIK8drkTpq\n28uNSDlDWjneF2nuDiJlBmnl+FCk7prSre2180aWagmASJEgrRyvRWo6Ge5Wl729dNSeMF3a\n60jeyFItARApEqSV451eu3PzerZ3ClVu53OwneGjkaVaAiBSJEgrx2uRrqawnvy1Nzhwr13G\nkFaON86R3BXZojkhOrte82o2slBLAESKBGnleKez4Voaszu2Z0r36vAwp7wERsK1BECkSJBW\njs86rj+pbxlEigRp5UAkRZBWDkRSBGnlQCRFkFYORFIEaeVAJEWQVg5EUgRp5U
AkRZBWDkRS\nBGnlQCRFkFYORFIEaeVAJEWQVg5EUgRp5UAkRZBWDkRSBGnlQCRFkFYORFIEaeVAJEWQVg5E\nUgRp5UAkRZBWDkRSBGnlQCRFkFYORFIEaeVAJEWQVg5EUgRp5UAkRZBWDkRSBGnlQCRFkFYO\nRFIEaeVAJEWQVg5EUgRp5UAkRZBWDkRSBGnlQCRFkFYORFIEaeVAJEWQVg5EUgRp5UAkRZBW\nDkRSBGnlQCRFkFYORFIEaeVAJEWQVg5EUgRp5UAkRZBWDkRSBGnlQCRFkFYORFIEaeVAJEWQ\nVg5EUgRp5UAkRZBWDkRSBGnlQCRFkFYORFIEaeVAJEWQVg5EUgRp5UAkRZBWDkRSBGnlWF2k\nN0gpEkAa1hbpjTIpRVol1K9AWjkQSRGklQORFEFaORBJEaSVA5EUQVo5EEkRpJUDkRRBWjkQ\nSRGklQORFEFaORBJEaSVA5EUQVo5EEkRpJUDkRRBWjkQSRGklQORFEFaORBJEaSVA5EUQVo5\nEEkRpJUDkRRBWjkQSRGklQORFEFaORBJEaSVA5EUQVo5EEkRpJUDkRRBWjkQSRGklQORFEFa\nORBJEaSVA5EUQVo5EEkRpJUDkRRBWjkQSRGklQORFEFaORBJEaSVA5EUQVo5EEkRpJUDkRRB\nWjkQSRGklQORFEFaORBJEaSVA5EUQVo5EEkRpJUDkRRBWjkQSRGklQORFEFaORBJEaSVA5EU\nQVo5EEkRpJUDkRRBWjkQSRGklQORFEFaORBJEaSVA5EUQVo5EEkRpJUDkRRBWjkQSRGklQOR\nFEFaORBJEaSVA5EUQVo5noh06qZcSmOK8vpqJFxLAESKBGnlWBapMqYfaKiej4RrCYFIkSCt\nHIsiWUnaocKU9b0uTfF8JFhLEESKBGnlWBLpZAonUmXK5vVgLs9GgrWEQaRIkFaOBZEKs7s5\nkY7uyK0yx2cjoVoWQKRIkFaOBZHM+fGvnXIwbWfC1RyejYRqWQCRIkFaOZ702jmRulOlZmB5\nZKmWAIgUCdLKsaJI5l1SigSQhvVECtcSIKVIbzRHD6SVA5EUQVo5EEkRpJXjtUgHc2teb21H\n3dLIUi0BECkSpJXjtUhHd7n10l46WhpZqiUAIkWCtHK8Fqly+5uDvfy6PLJUSwBEigRp5Xgt\nEvfa/QykleMNkc6uo7x6PrJQSwBEigRp5XhDpHt1eMhSXl6NhGsJgEiRIK0cn3Vcf1LfMogU\nCdLKgUiKIK0ciKQI0sqBSIogrRyIpAjSyoFIiiCtHIikCNLKgUiKIK0ciKQI0sqBSIogrRyI\npAjSyoFIiiCtHIikCNLKgUiKIK0ciKQI0sqBSIogrRyIpAjSyoFIiiCtHIikCNLKgUiKIK0c\niKQI0sqBSIogrRyIpAjSyoFIiiCtHIikCNLKgUiKIK0ciKQI0sqBSIogrRyIpAjSyoFIiiCt\nHIikCNLKgUiKIK0ciKQI0sqBSIogrRyIpAjSyoFIiiCtHIikCNLKgUiKIK0ciKQI0sqBSIog\nrRyIpAjSyoFIiiCtHIikCNLKgUiKIK0ciCSJGbCjl9KYorwO8y/mowCZp12ZvNIikiSDR8Vj\nrHLDVT+/QKRl8kqLSDlQmcvdalPW97pspGo4GURaJq+0iJQBN3O6W5vKZuzQWPXgatgjPSGv\ntIiUAftmH3R0x3SVObaTd4ZzpCfklRaR5DmbP/tyMG03w9Uc3OTDHZGWySstIslT7JqXXpp2\n4GaKGpGekFdaRBLn7A7pJiId7H4KkZbJKy0iiVO4XrqxSH9mf0ekZ+SVFpGk6fsWRiLVhbnd\nEekZeaVFJGnKrrd7JFJpzqNpb/EDaVckr7SIJE1//fXQ7INsL8NhdvPQm1Wt37qMySstIglz\ncZdh7XWki5tyRKQ3yCvt6iK9QUqRANKwtkhvlEkp0iqhYtJdhr0H7rXjHOkZeaVF
JGEKU3eD\nZ/e3bbj7G5GekFdaRBLGV6U6PDQqLwtz36hrpTb9BnmlRSRFkFYORFIEaeVAJEWQVg5EUgRp\n5UAkRZBWDkRSBGnlQCRFkFYORFIEaeVApNhsK21C8kqLSLHZVtqE5JUWkWKzrbQJySstIsVm\nW2kTkldaRIrNttImJK+0iBSbbaVNSF5pESk220qbkLzSIlJstpU2IXmlRaTYbCttQvJKi0ix\n2VbahOSVFpFis620CckrLSLFZltpE5JXWkSKzbbSJiSvtIgUm22lTUheaREpNttKm5C80iJS\nbLaVNiF5pUWk2GwrbULySotIsdlW2oTklRaRYrOttAnJKy0ixWZbaROSV1pEis220iYkr7SI\nFJttpU1IXmkRKTbbSpuQvNIiUmy2lTYheaVFpNhsK21C8kqLSLHZVtqE5JUWkWKzrbQJySst\nIsVmW2kTkldaRIrNttImJK+0iBSbbaVNSF5pESk220qbkLzSIlJstpU2IXmlRaTYbCttQvJK\ni0ix2VbahOSVFpFis620CckrLSLFZltpE5JXWkSKzbbSJiSvtIgUm22lTUheaREpNttKm5C8\n0iJSbLaVNiF5pUWk2GwrbULySotIsdlW2oTklRaRYrOttAnJKy0ixWZbaROSV1pEis220iYk\nr7SIFJttpU1IXmkRKTbbSpuQvNIiUmy2lTYheaV9Q6TaOJqxS2lMUV7v85HntXhsa9PaVtqE\n5JX2DZEuvkiVG65mIy9q8djWprWttAnJK+0bIp3M3zBSmLK+16UpZiMvavHY1qa1rbQJySvt\nGyIdzK0frkzppl0mI69q8djWprWttAnJK+0bIpndMHx0h3GVOU5GXtXisa1Na1tpE5JX2tci\nXU153pmibHZLB3N1Ew+TkRe1+Gxr09pW2oTklfa1SH/G61IwXSk7MBp5UYvPtjatbaVNSF5p\nXytQmsKeAlWFqZ+LZN4l5aYlz7bSbpeXIu1cV0JzJsQe6WO2lTYheaV9X4Ha9nIj0sdsK21C\n8kr7gQIzdxDpHbaVNiF5pf1QpO6a0q3ttfNG3q3lvrVNa1tpE5JX2tciNZ0Md6vL3l46ak+Y\nLu11JG/kRS0+29q0tpU2IXmlfafX7ty8nu2dQpXb+RxsZ/ho5EUtPtvatLaVNiF5pX3ngmxh\nPflrb3DgXrtP2VbahOSV9o1zJHdFtmhOiM6u17yajbyoxWNbm9a20iYkr7TvdDZcS2N2x/ZM\n6V4dHuaUl8DIi1oGtrVpbSttQvJKyzdkY7OttAnJKy0ixWZbaROSV1pEis220iYkr7SIFJtt\npU1IXmkRKTbbSpuQvNIiUmy2lTYheaVFpNhsK21C8kqLSLHZVtqE5JUWkWKzrbQJySstIsVm\nW2kTkj22rYsAABFjSURBVFdaRIrNttImJK+0iBSbbaVNSF5pESk220qbkLzSIlJstpU2IXml\nRaTYbCttQvJKi0ix2VbahOSVFpFis620CckrLSLFZltpE5JXWkSKzbbSJiSvtIgUm22lTUhe\naREpNttKm5C80iJSbLaVNiF5pUWk2GwrbULySotIsdlW2oTklRaRYrOttM+5loUxh+bBvNPf\nujsWpjg+XXhKXmkRKTbbSvuUozPHCjN41Dw5ft8M7j+pLa+0iBSbbaV9xtWYc908Mf5vmFg1\nvw10NMfm5xg+2SfllRaRYrOttM8onUDuh00abuZkX9of4ar93zV5SV5pESk220r7jKJrn/dT\nqfvCn2I+SZBXWkSKzbbSvsWw4zm7ndTO7ZF2S4uEalm5Vf8GIsVmW2nfoTJlN1g4c/6as6Oj\n+3HI98grLSLFZltp32HX/mbd3e6Qut+oOxe2/+4TjxBpY5vWttK+waH36F50x3i3Q9P9fbgt\nLBMir7SIFJttpX3NcV93g1XX3X0zhe0Er4riA5PySotIsdlW2pccPVdKcxkPeGdPr8
krLSLF\nZltpX3Hw+xP63ru+2/uT/u+80iJSbLaV9jm3wvfo0u9/EOlJfctsa9PaVtqnVO2pUMe5v1Vo\nb9rjvcsnd9vllRaRYrOttM+4TfoSDubqhi5mV9WPl8JcAsstkFdaRIrNttI+o5x8c6K9wa7B\nvy/8XfJKi0ix2VbaZ0y/guSfEF2tZeV1Yclwdas27l9BpNhsK21C8kqLSLHZVtqE5JUWkWKz\nrbQJySstIsVmW2kTkldaRPI4dcXtqe/uHBr5nGzT/jp5pUWkgarrRqrarqXdbTbyBbmm/Xny\nSotIPVXXLese0vHXXGYfjXxDpml/n7zSIlLHyRROpNLdEHa0XzsbjXxDnmkVkFdaRHIUj4O3\nyRX3m72pcjTyDVmmjUXCsBmk9UGkruHn+bNs7OHcaOQbskwbi4RhM0jrg0h+48ci1XZgNPIN\n2aaNQcKwGaT1QSS/8d05Uns6dLbjo5FvyDZtDBKGzSCtDyL5jW+LX0zxkKc+N714o5FvyDZt\nDBKGzSCtDyL5jXfF3U397XWl0cgXZJs2BgnDZpDWB5H8xnfFL/vmnv72wZ+jkc/JNm0MEobN\nIK0PIvmNHxe/mMPCyCdkmzYGCcNmkNZndZHeIOXKlmdTaROGzSDtiLVFeqNMypX9WePHF2Sb\nJ7uPRr4h27QxSBg2g7Q+iOQ3vi1+NPvr41ju0Px0z2jkG7JNG4OEYTNI64NIfuNd8fZ3GN1J\n0WjkC7JNG4OEYTNI64NIfuO74uedMfsqNPI52aaNQcKwGaT1QaTYbCptwrAZpPVBpNhsKm3C\nsBmk9UGk2GwqbcKwGaT1QaTYbCptwrAZpPVBpNhsKm3CsBmk9UGk2GwqbcKwGaT1QaTYbCpt\nwrAZpPVBpNhsKm3CsBmk9VEu0v/SkUFa+U0rYdgM0vogEiKtSMKwGaT1QSREWpGEYTNI64NI\niLQiCcNmkNYHkRBpRRKGzSCtDyIh0ookDJtBWh9EQqQVSRg2g7Q+iIRIK5IwbAZpfRAJkVYk\nYdgM0vogEiKtSMKwGaT1QSREWpGEYTNI64NImxap2nu/jzsa+Y6EYREJkWLx8Ud4aJ+QtJ+P\nfEnCsIiESLH49CM8tc/s2zXP7BuNfEvCsIiESLH48COsTfv02KspJiNfkzAsIiFSLD78CP/M\ncWHkaxKGRSREisWHH+HBXBdGviZhWERCpFh8+BHuzP1aGrO/zEa+JmFYREKkWHz4ETY/Q2g5\nTke+JmFYREKkWHwskjnV93tVmMtk5GsShkUkRIrFxyK1V18rU05GviZhWERCpFh8LJI3MBr5\nmoRhEQmRYvHhR7j33dnrF+laFsYc2h/oqWe/WPmPINJ2RTqam/vQisnI1yQM+7FIR+P1plwQ\n6SMQ6QkXdzpU2W1rNPI1CcN+mvb6OAus7/ezMX93e0PU3z/EDIBI2xXpcThnpamKZmc0GvmW\nhGE/TVs6df7M7m6vP/9LzACItGGRbrv28KaajXxLwrCfpi1GJ4H/dAAbApE2LFL7+7jlNTTy\nHQnDft1rZx26mvK8f6Rdb7eESJsWaW0Shv02bXOh7K/ra/in/a8PIiHSiiQM+23aXXGzJ0yF\nvYHjcUZYr5QckRBpRRKG/TLtwXp037kbof6tj9IHkRBpRRKG/S7tcT/aBdWrdTogEiKtSMKw\nX6U9FpPuhdWuyCISIq1IwrDfpD2Y6UOSEOk9ECktCcN+nvZW9B51nQy3f3tokgciIdKKJAz7\ncdqq7alrKJ1S59XuFEIkRSLJp00Y9lORboV3fnQ1hb2C1N4utAqIFHnT2lbahGE/v9fOeHd8\nuyuy076H70GkyJvWttImDPvFF+v9r07YJ73sjmtdjkWk6JvWttImDJvBGaEPIkXetLaVNmFY\nRNrWprWttAnDItK2Nq1tpU0YFpG2tWltK23CsIi0rU1rW2kThkWkbW1a20qbMCwibWvT
2lba\nhGFViXQpjSlGX/NHpE2nTRg2g7Q+/yRSNf/iOyJtOm3CsBmk9fknkQpT1ve69L9liEibTpsw\nbAZpff5FpO6HCw7eL4Eg0qbTJgybQVqffxHp6I7p/CdIINKm0yYMm0Fan38RqfvZ0as5fFTL\ntlb2ptImDJtBWp9/ESn0kzqItOm0CcNmkHYkw0cKTJYdi2QAtst6IuVItg2LAmlzAJF+HtLm\nACL9PKTNgW967dpHR9y8XjuAjfPNdaT2QuxltSeRA/w839zZ0O6JDuv9ygzAr7POvXYAG+cL\nkc6uC50dEkDHN50g1eGhUXl5XRBgK+TamzjG9rQ/7W2fzXy5RIY8b+911xe5zp5d7c1c4a2y\n49N4AmTduB5EcnO727KezFzhrbLj03gCZN24ns/XYd5r/Rv8+7rCIn1X2S+ASCuBSIiUeZuz\nbtyDy94UJ+9ArT7u7O8JuK9ElYXZNb8d1a3pc2H2l25CN/FRh3HPavlr+kmursTRmGPXi7/e\nD/R+zby9Q1p3r3Ebqx90S42n3I6FKY63+7SySfjcGWIM8Zow7Y01tZ1ddintJynZ2OxFah+0\nUvZa3ArX937pZzY/Y+i2wZOx88/3kUiumN2A9t7STbXmVLpu/D/pTyLQXi/t+yK5xZuf1HoS\nXirku3gx+nhumrWnWzVtyuaTlG1v3iv0sZv4a38eyn32e2N/3KY+299cqx8rr75f91Yctw3a\nX5A6G/tLoZ5IdpFHqdLO2j8K1CfnXmE3r6v7PdG9Ee/Pn7XXTzvrbPBE8v+r++vl9dPwEgE/\nYBSj/3jPXZh7YR7Dj0xtyuL6orr45L1CT+3uovIP1LyZzVNYrnYzczOb9Xm0f5y8Jc5tqcfI\nzt1v6894bK3twY74kd28vaPN/U2RTv0dXKen4ePH+SdGMSZrprAGtR/en5dSlrxX6N41r+i2\nqp05VHU/c/g75G8dzT7GE8nbfuzc6rQfzWifilTJH9nN2+unfVekfX9P8f5p+Phx/olRjC5e\nuy7syKFrv59SlLxXaPd57zstmqPk3fHiz+yHux997ySab2+3w/AN4X7xnf2AygzueJq110/7\nrkh9LH+Xthw+V0Yx5oFHX/bOIkwObVhmvO7s0KU9Zd7dQiJNSs8m2jPUXXm+jbels90ZFRms\niLkfXtp/FikcPlcQaVW6VVR6q6uubJ9w+c0eqTTt7/BOtqWieBxBlJGjvMHcDy/tP4u0ED5T\nXooUKitJDm1YpjtHOk3WXfNndX6O1N6CNj1HGkq44ekf5aOpjhkc2QVFstymG5NfcDpzco40\nVLYQPlOC50jtrPaz9/pYswiTQxuWOTpXdt2GsPNOOE9t/0BzJdWt6aZ0Oem1u/dLuOFysi3V\nZp/Dkd28vX7agEhN2vNk5qTXbqhsIXymBHvt2lntZ9/OHv25ECWHNixTNxfkLrt+Qzg3Ny7U\npT3YqZtuz/a6glvT9qrT7DpSW1X7d+zYXJvoSzhK0x0VijJrr5/23vRNdSLdbKP3V3uNyS3V\nzZxcRxoqWwqfJ9PrSLdxmMfsw2NSVdg9UxZhcmjDE9pr2afhL6q7PN+s3u73ZW79NnjwrnYH\nNsxLW/60m6z+qzF/yaPNmR/a+Wn3w80LzeDVXdrvC7qZ/q/uvBM+U0Y/HuTFc23vZh/vmYTJ\noQ3PuE7utbufHyt1524HudpfPDv6XVrH7l6soEhNL9jh8timDuPVvzf1XZ7AOZKX9rZ7HMO2\nU5vB+83uSC/tJG+mu0lt6Fi4vwifKV6MUTz3au+1s3HumYTJoQ1r8fUKrbM4soNfBpHu7kYT\ngH8AkR47pF0WR3bwyyDSY7EcrsbCb4NI910G96vCr6NJJAAxEAlgBRAJYAUQCWAFEAlgBRAJ\nYAUQCWAFEAlgBRAJYAUQCWAFEAlgBRAJYAUQCWAFEAlgBRAJYAUQCWAFEAlgBRAJYAUQCW
AF\nEAlgBRAJYAUQCWAFEAlgBRAJYAUQCWAFEAlgBRAJYAUQCWAFEEkR/y2z3pv8b5H13uMHQSRF\nIJIciKQIRJIDkRSBSHIgkiIQSQ5EUgQiyYFI73Dd3btf1qxLY4pXv7Lplf/0XV5OegIiyYFI\n79A40YpxMA9eSeKV//RdXk56AiLJgUjv4G3Pxlw/Kv/du3xXESLJgUjvMBLps/Lfvct3Fb0t\nkmlph73/u3lemVkD3hTJ9Mt7FbnKzHLtP4ueJP/CtSzM7twO18fCFOXNDj4+56Mxx36D6D7/\nfgP3l/uzB33l9d4XcmVutr5jV99l7wrdu3fbGbM7XoeqFytqX4cF5rwvkj9oRlP6saUt44M9\n0thRMxJpofafRV2gb6jaLXZvh29FO1Ld7ZZbPoZOiyL5y+3dn9jLZPt3ZYq2PjfWa9C922Ox\nruqlito39RaYs4JIrzd1RAqiLtAX1A9Z6vt1b+y+pTDn+n4/m6K2W27RbvJe54H3v7/c2ewf\nO5361Fo1lK8LU9a2q6+tzxybBcrurfd2wr0+m76bb6ki9zpaYMp6It0R6UPUBfqCU7thX+3G\neTbtkdqfOdkt1x22hUXyl9uZ230y25U5NNMPXn3X4dRgdJLQjCxV5F6fnlV8IZLx/rXvYcy0\nzBhECqIu0BfsvY64Q/+x7+1WdXMjQZH2kw68a3XaT7f/vTsGu/j1DTbszKGqu5F+cqgi9zpa\nYMqHnQ3N0H0skpvbl/m6s6GrfFQRnQ2q8T9OM+DNCIs02gxuB2/TGGaPqhj1GjQ050y748Wb\nvFCRex0tMOXzPVK/zzCzAuvukUY7J/ZIOllDJNsJsCvPtw9FanrxrBq3bvJSRd2rv8CUL0Tq\no84KrHxo55+JIZJOxiKFZrwWqWw6AeazX4l0v9eV7dAuu8lLFQ2vwwJTVtgjvd7UESmIukBf\n4J/r7P1+5Rci+ct1F41enCNN6u24zcwL7JFu3mK3WRWWj0UaXkdJo1xHMqP/1G136gJ9wckc\n7UttiqGXbbThL/baDcu5ouWLXjuvuoadqSdvEKyoEfbcdjbUkyo8/kWkoUOgPz1bq7PBH3tV\n+8+iJ8n31E23dHt5py7M4XH6URXtBVFXoulum4nkL9dc37F3hjcb+lB+ch2pq65767PZX5o7\nystuqUBFpdlf7aUju9hogSlvi/QvfCDSlkCke3/3QdM53Q3bnU2/we+HGwz8/73lLu3Qadcc\nynnluzLVPSRSdx+D1axdKlDR1dXgjhWHBaYgkhyIZLnaLxm198M199qZQ3Nm02/wt13/HST/\nf38525v2WOrSHMp55d29drVfn39Ec34stzsN7xKo6H57vM3+4hbzFpiCSHIgkiIQSQ5EUgQi\nyYFIikAkORAJYAUQCWAFEAlgBRAJYAUQCWAFEAlgBRAJYAUQCWAFEAlgBRAJYAUQCWAFEAlg\nBRAJYAUQCWAFEAlgBRAJYAUQCWAFEAlgBRAJYAUQCWAF/g8HTmO7Xgh3+gAAAABJRU5ErkJg\ngg==",
208 | "text/plain": [
209 | "plot without title"
210 | ]
211 | },
212 | "metadata": {
213 | "image/png": {
214 | "height": 420,
215 | "width": 420
216 | }
217 | },
218 | "output_type": "display_data"
219 | }
220 | ],
221 | "source": [
222 | "num_repositories <- n_distinct(repository_info$re3data_ID)\n",
223 | "\n",
224 | "repository_info %>%\n",
225 | " filter(!is.na(type)) %>%\n",
226 | " ggplot(aes(x = type, fill = certificate)) +\n",
227 | " geom_bar(position = \"dodge\") +\n",
228 | " ggtitle(paste(\"Certification status per repository type for re3data repositories\\n[number of repositories:\", num_repositories, \",\", Sys.Date(), \"]\")) +\n",
229 | " geom_text(aes(label = ..count..), size = 5, stat = \"count\", position = position_dodge(width = 1), vjust = -0.5) +\n",
230 | " scale_y_continuous(expand = c(0,0), limits = c(0,2500)) +\n",
231 | " scale_fill_manual(values = c(\"#17C3B2\", \"#227C9D\")) +\n",
232 | " labs(fill = \"certification status\") +\n",
233 | " theme_linedraw() +\n",
234 | " theme(axis.title = element_blank(),\n",
235 | " axis.text = element_text(size = 14),\n",
236 | " legend.position = \"bottom\",\n",
237 | " legend.title = element_text(size = 14))"
238 | ]
239 | }
240 | ],
241 | "metadata": {
242 | "kernelspec": {
243 | "display_name": "R",
244 | "language": "R",
245 | "name": "ir"
246 | },
247 | "language_info": {
248 | "codemirror_mode": "r",
249 | "file_extension": ".r",
250 | "mimetype": "text/x-r-source",
251 | "name": "R",
252 | "pygments_lexer": "r",
253 | "version": "4.0.4"
254 | }
255 | },
256 | "nbformat": 4,
257 | "nbformat_minor": 4
258 | }
259 |
--------------------------------------------------------------------------------
/examples-r/03_re3data_API_repository_APIs.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## Use case 3: aggregating current API information and general information about repositories (R)\n",
8 | "\n",
9 | "“As a research data portal, it is important for us to know which repositories offer an API. We would like to aggregate API information, such as API endpoint, API type and general information about the repository.”\n",
10 | "\n",
11 | "### Step 1: load packages\n",
12 | "\n",
13 | "The package **httr** includes the HTTP method GET, which will be used to request data from the re3data API. Responses from the re3data API are returned in XML. **xml2** includes functions for working with XML, for example parsing or extracting content of specific elements. **dplyr** offers useful functions for data manipulation. **purrr** provides tools for iteration, and **ggplot2** is a package for data visualization.\n",
14 | "\n",
15 | "If necessary, install the packages before loading them."
16 | ]
17 | },
18 | {
19 | "cell_type": "code",
20 | "execution_count": 1,
21 | "metadata": {},
22 | "outputs": [
23 | {
24 | "name": "stderr",
25 | "output_type": "stream",
26 | "text": [
27 | "Warning message:\n",
28 | "\"package 'httr' was built under R version 4.1.1\"\n",
29 | "Warning message:\n",
30 | "\"package 'xml2' was built under R version 4.1.2\"\n",
31 | "Warning message:\n",
32 | "\"package 'dplyr' was built under R version 4.1.0\"\n",
33 | "\n",
34 | "Attaching package: 'dplyr'\n",
35 | "\n",
36 | "\n",
37 | "The following objects are masked from 'package:stats':\n",
38 | "\n",
39 | " filter, lag\n",
40 | "\n",
41 | "\n",
42 | "The following objects are masked from 'package:base':\n",
43 | "\n",
44 | " intersect, setdiff, setequal, union\n",
45 | "\n",
46 | "\n",
47 | "Warning message:\n",
48 | "\"package 'purrr' was built under R version 4.1.1\"\n",
49 | "Warning message:\n",
50 | "\"package 'ggplot2' was built under R version 4.1.0\"\n"
51 | ]
52 | }
53 | ],
54 | "source": [
55 | "#install.packages(\"httr\")\n",
56 | "#install.packages(\"xml2\")\n",
57 | "#install.packages(\"dplyr\")\n",
58 | "#install.packages(\"purrr\")\n",
59 | "#install.packages(\"ggplot2\")\n",
60 | "library(httr)\n",
61 | "library(xml2)\n",
62 | "library(dplyr)\n",
63 | "library(purrr)\n",
64 | "library(ggplot2)"
65 | ]
66 | },
67 | {
68 | "cell_type": "markdown",
69 | "metadata": {},
70 | "source": [
71 | "### Step 2: obtain URLs for further API queries\n",
72 | "\n",
73 | "Information on individual repositories can be extracted using the re3data ID. Therefore, re3data IDs of all repositories indexed in re3data need to be identified first, using the endpoint **/api/v1/repositories**. Details of the re3data APIs are outlined in the [re3data API documentation](https://www.re3data.org/api/doc).\n",
74 | "\n",
75 | "The endpoint is queried using **GET**. The XML response is parsed using **read_xml**. XML elements or attributes can be identified using XPath syntax. All elements matching the XPath expression for re3data IDs are identified with **xml_find_all**, and their content is extracted using **xml_text**. The three functions are nested in the example below.\n",
76 | "\n",
77 | "The endpoint **/api/v1/repository** provides detailed information about individual repositories that can be accessed via re3data IDs. Therefore, URLs for the next query are created by adding re3data IDs to the base URL."
78 | ]
79 | },
80 | {
81 | "cell_type": "code",
82 | "execution_count": 2,
83 | "metadata": {},
84 | "outputs": [],
85 | "source": [
86 | "re3data_request <- GET(\"https://www.re3data.org/api/v1/repositories\")\n",
87 | "re3data_IDs <- xml_text(xml_find_all(read_xml(re3data_request), xpath = \"//id\"))\n",
88 | "URLs <- paste(\"https://www.re3data.org/api/v1/repository/\", re3data_IDs, sep = \"\")"
89 | ]
90 | },
91 | {
92 | "cell_type": "markdown",
93 | "metadata": {},
94 | "source": [
95 | "### Step 3: define what information about the repositories should be requested\n",
96 | "\n",
97 | "The function **extract\\_info** defined in the following code block points to and extracts the content of specific XML elements and attributes. This function will be used later to extract the specified information from responses of the re3data API. Its basic structure is similar to the process of extracting the re3data IDs outlined in step 2 above.\n",
98 | "\n",
99 | "In the re3data Metadata Schema, **api** (the API endpoint) is an element with the attribute **apiType**. Please note that one repository can offer multiple APIs, and even several API types.\n",
100 | "\n",
101 | "The XPath expressions defined here extract the re3data ID, name, URL, API endpoint and API type for every **api** element of a repository, navigating from the **api** element to its parent **repository** element where needed. **map_df** returns the results as a data frame with one row per API that can be processed later. Depending on specific use cases, this function can be adapted to extract a different set of elements and attributes. For an overview of the metadata re3data offers, please refer to the documentation of the [re3data Metadata Schema](https://doi.org/10.2312/re3.006) (the API uses version 2.2 of the re3data Metadata Schema).\n",
102 | "\n",
103 | "The function **xml_structure** from the package **xml2** can be very useful for inspecting the structure of XML objects and specifying XPath expressions."
104 | ]
105 | },
106 | {
107 | "cell_type": "code",
108 | "execution_count": 3,
109 | "metadata": {},
110 | "outputs": [],
111 | "source": [
112 | "extract_info <- function(repository_metadata_XML) {\n",
113 | " xml_find_all(repository_metadata_XML, \"//r3d:api\") %>%\n",
114 | " map_df(function(x) {\n",
115 | " list(\n",
116 | " re3data_ID = xml_text(xml_find_first(x, \"./parent::r3d:repository/r3d:re3data.orgIdentifier\")),\n",
117 | " repositoryName = xml_text(xml_find_first(x, \"./parent::r3d:repository/r3d:repositoryName\")),\n",
118 | " repositoryUrl = xml_text(xml_find_first(x, \"./parent::r3d:repository/r3d:repositoryURL\")),\n",
119 | " api = xml_text(xml_find_first(x, \".\")),\n",
120 | " apiType = xml_text(xml_find_first(x, \"./@apiType\"))\n",
121 | " )\n",
122 | " })\n",
123 | "}"
124 | ]
125 | },
126 | {
127 | "cell_type": "markdown",
128 | "metadata": {},
129 | "source": [
130 | "### Step 4: create a container for storing results\n",
131 | "\n",
132 | "**repository_info** is a container for storing results of the API query. The data frame has five columns corresponding to the names of the list items defined by **extract_info**."
133 | ]
134 | },
135 | {
136 | "cell_type": "code",
137 | "execution_count": 4,
138 | "metadata": {},
139 | "outputs": [],
140 | "source": [
141 | "repository_info <- data.frame(matrix(ncol = 5, nrow = 0))\n",
142 | "\n",
143 | "colnames(repository_info) <- c(\"re3data_ID\", \"repositoryName\", \"repositoryUrl\", \"api\", \"apiType\")"
144 | ]
145 | },
146 | {
147 | "cell_type": "markdown",
148 | "metadata": {},
149 | "source": [
150 | "### Step 5: gather detailed information about repositories\n",
151 | "\n",
152 | "After preparing the list of URLs, the extracting function and the container for results, these components can be put together. The code block below iterates through the list of URLs using a for-loop. For each repository, data is requested from the re3data API using **GET**. The XML response is parsed with **read_xml**. The function **extract_info** is then called and returns one row for each *api* element of the repository. If the result contains at least one row, the rows are appended to **repository_info**."
153 | ]
154 | },
155 | {
156 | "cell_type": "code",
157 | "execution_count": 5,
158 | "metadata": {},
159 | "outputs": [],
160 | "source": [
161 | "for (url in URLs) {\n",
162 | " repository_metadata_request <- GET(url)\n",
163 | " repository_metadata_XML <- read_xml(repository_metadata_request)\n",
164 | " \n",
165 | " results_list <- extract_info(repository_metadata_XML)\n",
166 | " \n",
167 | " if(nrow(results_list) > 0) {\n",
168 | " repository_info <- rbind(repository_info, results_list)\n",
169 | " }\n",
170 | "}"
171 | ]
172 | },
173 | {
174 | "cell_type": "markdown",
175 | "metadata": {},
176 | "source": [
177 | "### Step 6: Look at the results\n",
178 | "\n",
179 | "Results are now stored in **repository_info**. They can be inspected using **head** or visualized.\n",
180 | "\n",
181 | "The example below first deduplicates the data so that each API type is counted only once per repository. It then groups the data by **apiType**, counts how many repositories are in each group, and orders **apiType** by occurrence in descending order. Finally, a bar chart of the API types offered by repositories indexed in re3data is generated.\n",
182 | "Please note that, as mentioned above, **apiType** can occur more than once per repository. Some repositories are assigned more than one API type, for example REST and OAI-PMH."
183 | ]
184 | },
185 | {
186 | "cell_type": "code",
187 | "execution_count": 6,
188 | "metadata": {},
189 | "outputs": [
190 | {
191 | "data": {
192 | "text/html": [
193 | "