├── README.md ├── data-message ├── README.md └── docs │ └── sdmx-csv-field-guide.md └── metadata-message └── docs └── sdmx-csv-field-guide.md /README.md: -------------------------------------------------------------------------------- 1 | # Overview 2 | 3 | This repository is used for maintaining the SDMX-CSV data message specifications. 4 | 5 | This includes: 6 | 7 | - Normative documentation and samples for the SDMX-CSV data message. 8 | - [Wiki](https://github.com/sdmx-twg/sdmx-csv/wiki) for additional information 9 | -------------------------------------------------------------------------------- /data-message/README.md: -------------------------------------------------------------------------------- 1 | # Overview 2 | 3 | This repository is used for maintaining the SDMX-CSV data message specifications. 4 | 5 | This includes: 6 | 7 | - Normative documentation and samples for the SDMX-CSV data message. 8 | - [Wiki](https://github.com/sdmx-twg/sdmx-csv/wiki) for additional information -------------------------------------------------------------------------------- /data-message/docs/sdmx-csv-field-guide.md: -------------------------------------------------------------------------------- 1 | # Introduction 2 | 3 | SDMX-CSV Data Message is an SDMX data exchange format based on the [RFC 4180](https://tools.ietf.org/html/rfc4180). CSV is a widely used standardised and simple format to exchange data supported by many tools. 4 | 5 | SDMX-CSV integrates with other specifications, i.e.: 6 | - The SDMX API RESTful specification (e.g. 
content negotiation with mime-type to get SDMX-CSV representations, specific formats for responses, language selection through HTTP content negotiation) 7 | - The [RFC 4180](https://tools.ietf.org/html/rfc4180) specification 8 | 9 | ## RFC 4180: A common format for CSV files 10 | In order to benefit from best practices, SDMX-CSV is based on the rules defined in the [RFC 4180](https://tools.ietf.org/html/rfc4180), which defines a common format and MIME Type for CSV files. It is advised to read the (very short) RFC for a full list of requirements but, in a nutshell, the RFC defines rules such as: 11 | - How the CSV file should be structured (the RFC specifies that all records must have an identical structure (determined column number), like when using an SDMX "flat" representation for data); 12 | - When double-quotes should be used and how to escape them when needed; 13 | - How spaces should be handled: Spaces are considered part of a field and should not be ignored; 14 | - Which mime type should be used; 15 | - What is the default character set, etc. 16 | 17 | The SDMX-CSV format is flexible enough in its representation to support the needs of different target audiences: 18 | - It is designed and optimised for the purpose of general public data dissemination of statistical data, and for usage in common statistical software. 19 | - It allows using the messages to create pivot tables in spreadsheets applications. 20 | 21 | # Design principles for SDMX-CSV 2.1 Data Messages (aligned with SDMX 3.1) 22 | 23 | - In order to ensure the identifiability of the data contained in the message, the header row containing the column headers is mandatory and its content is well-defined. 24 | - After the mandatory header row, each row contains the information related to one specific observation or to one or more attributes attached to partial keys. For `Delete` actions a row can also concern several observations if dimensions are wildcarded. 
25 | - In [RFC 4180](https://tools.ietf.org/html/rfc4180), csv stands for "comma-separated values". However, while SDMX-CSV uses indeed the "comma" (%x2C) as the default field separator, it adopts the wider interpretation of csv as "character-separated values". It is recommended for implementers to provide SDMX-CSV messages according to the locale of the user (e.g. as indicated in the http Accept-Language header). It means that e.g. the semi-colon ‘;’ (as used typically in specific regions or countries) is acceptable as separator. See also the related example below. Note that the separator used in a message can be determined by retrieving the character that follows the fixed first column header term *STRUCTURE* (which may be extended by a squared bracket term). 26 | 27 | ## Columns 28 | 29 | - The first column is always used for the structure type: dataflow, data structure definition or data provision agreement. 30 | - The next one or two columns are always used for the structure's identification. 31 | - The next column is always used for the action to be performed. 32 | - The next up to two columns are used for the series and/or observation key. 33 | - Each Data Structure Definition (DSD) component (dimensions, measures, attributes including those defined through a referenced Metadata Structure Definition) included in the message is represented in one or two columns. SDMX web services should return the columns in the order of components as defined in (each of) the underlying Data Structure Definition(s), grouped by type of component, thus in case of data defined by different data structures: first the dimensions of the first data structure, then the remaining dimensions of the second data structure and so forth, then the measures of the first data structure, then the remaining measures of the second data structure and so forth, then the attributes of the first data structure, then the remaining attributes of the second data structure and so forth. 
However, any order of these columns is valid for data uploads to SDMX-consuming systems. 34 | - Only those dimension columns that are required to uniquely identify the concerned attributes and/or measures have to be present. 35 | - Attributes can, but do not need to, be included even if they have a mandatory status. 36 | - Measures can, but do not have to, be included. 37 | - When an SDMX RESTful web service implements streaming, it might not know, while generating the CSV header row, which measures and attributes actually have values. Therefore, it can happen that all values presented in an attribute or measure column are left empty. 38 | - Implementers can add any other custom columns as required, e.g. updated, prepared, etc. 39 | 40 | ## Column headers (first row) 41 | 42 | - The header field of the first column always contains the term `STRUCTURE`. 43 | - This field must be extended with a sub-field delimiter encapsulated in squared brackets "[]", e.g. `STRUCTURE[;]`, in case the message contains multi-valued or multi-language measure or attribute values. 44 | - The header field of the second column always contains the term `STRUCTURE_ID`. 45 | - If option `labels=name` (see *[here](#optional-parameters)*): An additional column is added right after the artefact identification column containing the term `STRUCTURE_NAME`. 46 | - The header field of the next column should contain the term `ACTION`. For convenience, if this column is not present, a default action ("Information") is assumed for the whole message. 47 | - The next up to two columns contain, if option `key=series|obs|both` (see *[here](#optional-parameters)*), in this order the terms `SERIES_KEY` and/or `OBS_KEY`. 48 | - The other columns for components contain: 49 | - Default: The ID of the component reported in that column, e.g. `DIM1`. 
50 | - If option `labels=both` (see *[here](#optional-parameters)*): The ID and the localised name of the component reported in that column separated by the term ": ", e.g. `DIM1: Dimension 1`. 51 | - If option `labels=name` (see *[here](#optional-parameters)*): An additional column is added right after the component identification column containing the localised name of the component reported in the previous column. 52 | - Any other custom column contains a custom but unique term, e.g. `UPDATED`. 53 | 54 | ## Column content (all rows after header) 55 | 56 | - The first column contains: `dataflow`, `datastructure` or `dataprovision`, depending on type of artefact for which the data contained in the row are defined: dataflow, data structure definition or data provision agreement. 57 | - The second column contains: 58 | - Default: The artefact identification information for the data in the row in the form *AGENCY:ARTEFACT_ID(VERSION)*(1), e.g. `ESTAT:NA_MAIN(1.6.0)`. 59 | - If option `labels=both` (see *[here](#optional-parameters)*): The artefact identification information and its localised name separated by the term ": ", e.g. `ESTAT:NA_MAIN(1.6.0): National Accounts Main Aggregates`. 60 | - If option `labels=name` (see *[here](#optional-parameters)*): An additional column is added right after the artefact identification column with the artefact's localised name, e.g. `National Accounts Main Aggregates`. 61 | - The next column contains one character representing one of the current 4 action types: 62 | - "I": Information - Deprecated. When used to update an SDMX storage system, the *Merge* action is assumed. 63 | - "A": Append - Deprecated. When used to update an SDMX storage system, the *Merge* action is assumed. 64 | - "M": Merge - Data or data-related reference metadata is to be merged, through either update or insertion depending on already existing information. This operation does not allow deleting any component values. 
Updating individual values in multi-valued measure, attribute or data-related reference metadata values is not supported either. The complete multi-valued value is to be provided. Only non-dimensional components (measure, attribute or data-related reference metadata values) can be **omitted** (\ cell or column is absent) as long as at least one of those components is present. Bulk merges are thus not supported. Only the provided values are merged. Dimension values for higher-level (data-related reference metadata) attributes can be **switched-off** (using `~`) when those are not attached to these dimensions. All observations as well as the sets of data-related reference metadata attributes at specific dimension combinations impacted by the *Merge* action change their time stamp when used to update an SDMX storage system. 65 | - "R": Replace - Data or data-related reference metadata is to be replaced, through either update, insert or delete depending on already existing information. A full replacement is hereby assumed to take place at specific “replacement levels”: for entire observations and for any specific dimension combination for data-related reference metadata attributes. Within these “replacement levels” the provided values are inserted or updated, and omitted values are deleted. Values provided for the other attributes (those above the observation level) are merged (see *Merge* action). Only non-dimensional components (measure, attribute or reference metadata values) can be **omitted** (\ cell or column is absent). Bulk replacing is thus not supported. Dimension values for higher-level (data-related reference metadata) attributes can be **switched-off** (using `~`) when those are not attached to these dimensions. Replacing non-existing elements is not resulting in an error. 
All observations as well as the sets of data-related reference metadata attributes at specific dimension combinations impacted by the *Replace* action change their time stamp when used to update an SDMX storage system. Because the *replace* action always takes place at specific levels, it cannot be used to replace a whole dataset or a whole series. However, a “*replace all*” effect can be achieved by combining a *Delete* row containing a completely wildcarded key (where all dimension values are omitted) with *Merge* or *Replace* rows within the same data message. Similarly, to replace a whole series, a message can combine a *delete* row containing only the partial key of the series (where the not used dimension values are omitted) with *Merge* or *Replace* rows for that series. 66 | - "D": Delete - Data or data-related reference metadata is to be deleted. Deletion is hereby assumed to take place at the lowest level of detail provided in the message. Any component (including dimensions) can be **omitted** (\ cell or column is absent). Omitting dimension values allows for bulk deletions. Partially omitting non-dimension component values allows restricting the deletion of measure, attribute or data-related reference metadata values to the ones being present. Instead of real values for non-dimensional components, it is sufficient to use any valid value, e.g. the dash character `-`. With this, all dataflow data, any slices of observations for dimension groups such as time series, observations or individual measure, attribute and data-related reference metadata attributes values can be deleted. Dimension values for higher-level (data-related reference metadata) attributes can be **switched-off** (using `~`) when those are not attached to these dimensions. Deleting non-existing elements or values is not resulting in an error. 
All observations as well as the sets of attributes and data-related reference metadata at higher partial keys impacted by the *Delete* action change their time stamp when used to update an SDMX storage system. 67 | - For convenience, if this column is absent then the *Merge* action is assumed. 68 | For more details see [here](#further-details-for-data-actions). 69 | - The next up to two columns contain, if option `key=series|obs|both`, in this order the series keys and/or the observation keys (see *[here](#optional-parameters)*). 70 | - The other columns for components contain: 71 | - Default: The ID(s) (if coded) or value(s) (if non-coded) for the component values reported in that column for the corresponding observation, e.g. `A`. 72 | - If option `labels=both` (see *[here](#optional-parameters)*): The ID(s) and the localised name separated by the term ": " (if coded) or the value(s) (if non-coded) for the component values reported in that column for the corresponding observation, e.g. `A: A value name`. 73 | - If option `labels=name` (see *[here](#optional-parameters)*): An additional column is added right after the component identification column containing the localised name, e.g. `A value name`, of the component value reported in the previous column. It is empty if the value has no localised name. 74 | - For rows containing the information related to one specific observation, the related values for attributes attached to partial keys may have to be replicated. 75 | - For rows containing the information related to one or more attributes attached to partial keys, in addition to these attributes only the components that are part of the partial key need to be filled, all other components can be left empty. Also the columns not related to the attribute's data structure (when data from different data structures are present) are to be left empty. 
76 | - For rows containing information to be deleted, the deletion is assumed to take place at the lowest level of detail provided in the message. For that purpose, to be deleted measure or attribute values are non-empty, e.g. marked with the dash character "-". Delete operations allow wildcarding dimensions by leaving the corresponding dimension field empty. 77 | - The other custom columns contain any potentially localised custom content. 78 | 79 | ## Further details for data actions 80 | 81 | The following convention is used to indicate the state of components in data messages: 82 | 83 | | | | **Dimension value is** | | **Measure, attribute or reference metadata value is** | | 84 | | --- | --- | --- | --- | --- | --- | 85 | | | | **Omitted** | **switched off** | **Omitted** | **Present** | 86 | | Action | Delete | bulk deletion: dimension value doesn't matter | only for irrelevant dimensions:1) higher-level (reference metadata) attributes not attached to this dimension(incl. TIME\_PERIOD)2) measures and attributes not attached to this dimension if the DSD allows for an ‘evolving structure’ (excl. TIME\_PERIOD) | to be deleted only if **all** non-dimension components are omitted | to be deleted | 87 | | | Merge | *bulk merge is not permitted* | (see above) | not to be changed | to be updated/inserted | 88 | | | Replace | *bulk replace is not permitted* | (see above) | at permitted replacement levels: to be deleted, otherwise not to be changed | to be updated/inserted | 89 | | Format | CSV | \ cell or column is absent | ~ | \ cell or column is absent | any valid or intentionally missing value | 90 | 91 | **Important notes:** 92 | 93 | The terms “*delete*”, “*merge*” and “*replace*” do **not** imply a physical replacement or deletion of values in the underlying database. 
To minimize the physical resource requirements, SDMX web service implementations that do not support the *includeHistory* and *asOf* URL parameters might physically replace the existing values in the database. SDMX web services that also do not support the *updatedAfter* URL parameter might additionally implement physical deletions. However, SDMX web services that support these parameters (or other time-machine features) would not overwrite or delete the physical values. 94 | 95 | SDMX web services that support the *includeHistory* or *asOf* URL parameters should never allow deleting their **historic** data content because this would interfere with the interests of data consumers, such as data aggregators. Therefore, a specific feature to physically delete previous (outdated) content is intentionally not added to the SDMX standard syntax. If such a feature is required by an organisation, then it needs to be implemented as a custom feature outside the SDMX standard. 96 | 97 | Likewise, all SDMX-compliant systems that do (or are configured to) support the *updatedAfter* URL parameter need to systematically retain the information about deleted data (or data-related reference metadata). 98 | 99 | All datasets – even with varying actions – within a single data message must always be treated as one **ACID transaction** to guarantee “transactional safety” (full data consistency and validity despite errors, power failures, and other mishaps). These datasets are to be processed in the order of appearance in the message. The advantage of such data messages is thus the ability to bundle separate *delete* and *replace* or *merge* actions into one transactional data message. 100 | 101 | **Recommended[^1] dataset actions in SDMX web service responses to GET data queries:** 102 | 103 | 1. 
Without the *updatedAfter*, *includeHistory*, *detail*, *attributes* or *measures* URL parameters: 104 | 105 | The response message should contain the retrieved data in a *Replace* dataset (instead of the previous *Information* dataset). 106 | 107 | 1. Without the *updatedAfter* and *includeHistory*, but with *detail*, *attributes* or *measures* URL parameters: 108 | 109 | The response message should contain the retrieved data in a *Merge* dataset (instead of the previous *Information* dataset). 110 | 111 | 1. With the *updatedAfter* URL parameter: 112 | 113 | The response must include the information of all previously updated, inserted and deleted data or data-related reference metadata, even if bulk deletions have been used. One of two approaches is possible: 114 | 115 | * a *Delete* dataset for entirely deleted observations and for entirely deleted sets of (data-related reference metadata) attribute values attached to specific dimension combinations and 116 | a *Replace* dataset for all other changed observations and changed attribute and data-related reference metadata values attached to specific dimension combinations, or 117 | * a *Delete* dataset for entirely deleted observations, for entirely deleted sets of (data-related reference metadata) attribute values attached to specific dimension combinations and for individually deleted measure, attribute and reference metadata values and 118 | a *Merge* dataset for all other updated or inserted observation, attribute and data-related reference metadata values. 119 | 120 | The DB synchronization use case requires that the generated response always allows the exact data content currently stored in the queried data source to be replicated. 121 | 122 | 1. 
With the *includeHistory* URL parameter: 123 | 124 | The response uses a number of datasets with *Delete*, *Replace* or *Merge* actions, each limited in its validity time span, that allow the exact data contents previously stored in the queried data source to be replicated. 125 | 126 | 1. With the *asOf* URL parameter: 127 | 128 | The recommendations of 1 and 2 apply depending on the other parameters. In addition, the returned dataset should have its validity time span limited to the point in time requested in the *asOf* parameter. 129 | 130 | [^1]: So far this is recommended for systems that do not require backward-compatibility. Later, with SDMX 4.0, this may generally be made mandatory. 131 | 132 | ## Intentionally missing values 133 | 134 | To indicate **intentionally missing** observations, attributes and reference metadata values, even if mandatory, the following special values are to be used in SDMX-CSV: 135 | - Numeric data types float and double: `NaN` 136 | - All other data types: `#N/A` 137 | 138 | ## Localisation 139 | 140 | - HTTP content negotiation, see [RFC 2616 - HTTP 1.1 Header Field Definitions](https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html) 141 | - Always use this mime-type in the Accept header: `application/vnd.sdmx.data+csv; version=2.0.0`. 142 | - The client can indicate preferred languages through the Accept-Language header, e.g. `fr, en-gb;q=0.8, en;q=0.7`. 143 | - Always localise all artefact names according to the preferred language. The first best language match according to the user’s preferred language choices in the HTTP Accept-Language header (or, if that is not available, then according to the system's default language order) is to be used for each localisable name element. However, the message does not indicate the returned language per localisable name element. 
In case that there is no such language match for a particular localisable name element, it is optional to return the element in a system-default language or alternatively to not return the element. 144 | **It is recommended to indicate all languages used anywhere in the message for localised name elements through http Content-Language response header (languages of the intended audience).** 145 | Note: For multi-language values, all language versions are provided independently from the preferred language (see below). 146 | 147 | ## Multi-valued components and nested metadata attributes 148 | 149 | - Some components (measures or attributes) allow for multiple values. Those multiple values are separated by a special sub-field separation character, e.g. `;`. 150 | - This sub-field separation character has to be defined as first character in the squared bracket term of the header field of the first column, e.g. `STRUCTURE[;]`. 151 | - Such components are indicated by having their IDs followed by empty squared brackets "[]", e.g. `ATTR4[]`. 152 | - For coded multi-valued components, if option `labels=both` (see *[here](#optional-parameters)*) then each individual value is to be prefixed with its ID and the term ": ", e.g. `A: Value A;B: Value B`. 153 | - Each metadata attribute is also to be presented in its own column(s), even if the metadata attributes are nested. In that case, the attribute IDs in the column headers are pre-fixed with the IDs of the related parent attribute(s) separated by a dot `.`, e.g. `CONTACT[].NAME[]`. All the values corresponding to one attribute are presented like a multi-valued component by respecting their position in the nested attribute tree, e.g. `name for contact 1;name for contact 2`. Parent branches in that attribute tree without a value for a specific attribute need to be indicated by leaving the corresponding multi-value sub-field empty, e.g. `name for contact 1;;name for contact 3`. 
Nested parent branches are indicated by using nested double quotes. Note that fields containing double quotes must themselves be encapsulated in double quotes and that nested inner double quotes need to be doubled recursively, e.g. `"""name 1 for contact 1;name 2 for contact 1"";""name 1 for contact 2;name 2 for contact 2"""`. 154 | 155 | ## Non-coded multi-lingual components 156 | 157 | - Some non-coded components (measures or attributes) allow for multi-lingual values. Those values are separated by a special sub-field separation character, e.g. `;`. 158 | - This sub-field separation character has to be defined as the first character in the squared bracket term of the header field of the first column, e.g. `STRUCTURE[;]`. 159 | - Such components are indicated by having their IDs followed by the list of possible 2-letter ISO language codes separated by the sub-field separator and encapsulated in squared brackets "[]", e.g. `ATTR2[en;fr]`. 160 | - Each individual language value is to be prefixed with its 2-letter ISO language code and a colon character ":", e.g. `en:Value;fr:Valeur`. Thus, in contrast to the ID prefix for coded values when using the HTTP Accept header parameter `labels=both` (see *[here](#optional-parameters)*), the language prefix `xx:` doesn't have an extra space character. 161 | - Note that multi-lingual components are always non-coded and therefore do not interfere with value IDs. 162 | 163 | ## Non-coded multi-lingual multi-valued components 164 | 165 | - Some non-coded components (measures or attributes) allow for multiple multi-lingual values. All individual values are separated by a special sub-field separation character, e.g. `;`. 166 | - This sub-field separation character has to be defined as the first character in the squared bracket term of the header field of the first column, e.g. `STRUCTURE[;]`. 
167 | - Such components are indicated by having their IDs followed by the list of possible language codes separated by the sub-field separator and encapsulated in squared brackets "[]", e.g. `ATTR2[en;fr;de]`. 168 | - Each individual language value is to be prefixed with its 2-letter ISO language code and a colon character ":", e.g. `en:Value1`. 169 | - Each multi-lingual value set is to be encapsulated in double quotes, e.g. `"en:Value1;fr:Valeur1";"en:Value2;de:Wert2"`. However, note that fields containing double quotes must themselves be encapsulated in double quotes and that the inner double quotes need to be doubled, thus the complete example is `"""en:Value1;fr:Valeur1"";""en:Value2;de:Wert2"""`. 170 | 171 | ## Non-coded XHTML-valued components 172 | 173 | - Some non-coded components (measures or attributes) allow for XHTML values. 174 | - Each XHTML value is to be encapsulated in double quotes, e.g. `"

This is some ""metadata html""

"`. Remember that the inner quotes need to be doubled. 175 | - The CSV format allows fields to contain line breaks if those fields are enclosed in double quotes. Thus XHTML values can also contain line breaks. 176 | 177 | # Optional parameters 178 | 179 | Optional parameters can be added to the HTTP Accept header. They need to be separated by the character combination `"; "`. 180 | - labels (id|name|both; default=id): This parameter applies to all Nameable SDMX Artefacts contained in the header and the body of the message: 181 | - If the parameter value is `id` then only the id/value of the artefacts is displayed. 182 | - If the parameter value is `name` then the id/value and the name of the artefacts are displayed in separate columns (see *[here](#columns)*), the ID/value column always directly preceding its related localised name column. 183 | - If the parameter value is `both` then the concatenated id/value and localised name of the artefacts (see the section on [localised names](#localised-names) on how the message deals with languages) separated by `": "` are displayed. Note that the character combination `": "` could also be part of the artefact name and could therefore occur several times within the concatenated string. 184 | - timeFormat (original|normalized; default=original): 185 | - If the parameter value is `original` then the time dimension (*TIME-PERIOD*) values are displayed in the SDMX *TIME_PERIOD* format as originally recorded. 186 | - If the parameter value is `normalized` then the time dimension (*TIME_PERIOD*) values are converted to the most granular [ISO 8601](https://www.iso.org/iso-8601-date-and-time-format.html) representation taking into account the highest frequency of the data in the message and the moment in time when the lower-frequency values were collected (which, e.g. at the ECB, is typically either at the beginning, middle or end of the reporting period). This eases comparisons and business analysis of multi-frequency values, e.g. 
in pivot tables. As an example, if annual and daily data are available in the message and the annual data were collected at the end of the reporting period, the formatted value for the annual period 2014 becomes 2014-12-31. 187 | - keys (none|obs|series|both; default=none): Request the addition of column(s) for keys. 188 | - If the value is `none` (the default), no related column will be added. 189 | - If the value is `obs`, a new column OBS_KEY will be added after the ACTION column. The column will contain the combination of IDs/values for all the dimensions, ordered by their order in the data structure definition and separated by a dot character (.), e.g. M.USD.EUR.SP00.2020-01 190 | - If the value is `series`, a new column SERIES_KEY will be added after the ACTION column. The column will contain the combination of IDs/values for all the dimensions except the one(s) attached to the observation, ordered by their order in the data structure definition and separated by a dot character (.), e.g. M.USD.EUR.SP00 191 | - If the value is `both`, both the SERIES_KEY and OBS_KEY columns must be added after the ACTION column, starting with the SERIES_KEY column. 
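The optional parameters and key columns described above can be sketched in a few lines of Python. This is only an illustration, not part of the specification: the helper names (`build_accept_header`, `obs_key`, `series_key`) and the dimension IDs are hypothetical, and the sketch assumes that `TIME_PERIOD` is the only dimension attached at observation level.

```python
def build_accept_header(version="2.1.0", **params):
    """Join the mime type, version and optional parameters with '; '."""
    parts = ["application/vnd.sdmx.data+csv", f"version={version}"]
    # Keyword arguments keep their call order (Python 3.7+).
    parts.extend(f"{name}={value}" for name, value in params.items())
    return "; ".join(parts)


def obs_key(dimensions):
    """Dot-join all dimension values, given in DSD order."""
    return ".".join(dimensions.values())


def series_key(dimensions, obs_level=("TIME_PERIOD",)):
    """Dot-join all dimension values except those attached to the observation."""
    return ".".join(
        value for dim_id, value in dimensions.items() if dim_id not in obs_level
    )


# Hypothetical dimension IDs, listed in the order of the data structure definition.
dims = {
    "FREQ": "M",
    "CURRENCY": "USD",
    "CURRENCY_DENOM": "EUR",
    "EXR_TYPE": "SP00",
    "TIME_PERIOD": "2020-01",
}

print(build_accept_header(labels="both", keys="series"))
# application/vnd.sdmx.data+csv; version=2.1.0; labels=both; keys=series
print(series_key(dims))  # M.USD.EUR.SP00
print(obs_key(dims))     # M.USD.EUR.SP00.2020-01
```

A plain `dict` is used for the dimensions because it preserves insertion order, which stands in here for the dimension order of the data structure definition.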
192 | 193 | # Examples 194 | 195 | Note: All examples assume the minimal HTTP Accept header: `application/vnd.sdmx.data+csv; version=2.1.0` 196 | 197 | #### 1) Ordinary case 198 | 199 | STRUCTURE,STRUCTURE_ID,ACTION,DIM_1,DIM_2,DIM_3,OBS_VALUE,ATTR_2,ATTR_3,ATTR_1,UPDATED 200 | dataflow,ESTAT:NA_MAIN(1.6.0),M,A,B,2014-01,12.4,Y,"Normal, special and other values",N,2021-01-22T13:15:41Z 201 | dataflow,ESTAT:NA_MAIN(1.6.0),M,A,B,2014-02,10.8,Y,"Normal, special and other values",Y,2021-01-22T13:15:41Z 202 | 203 | Notes: 204 | - The following default parameter settings are automatically applied: 205 | - labels=id 206 | - timeFormat=original 207 | - *UPDATED* is a custom column 208 | 209 | #### 2) Components in any order, missing component(s), component with multiple values 210 | 211 | STRUCTURE[;],STRUCTURE_ID,ACTION,OBS_VALUE1,OBS_VALUE2,ATTR_3,ATTR_1[],DIM_2,DIM_1,DIM_3 212 | dataflow,ESTAT:NA_MAIN(1.6.0),M,12.4,12.5,"Normal, special and other values",X;Y,B,A,2014-01 213 | dataflow,ESTAT:NA_MAIN(1.6.0),M,10.8,10.9,"Normal, special and other values",X;Z,B,A,2014-02 214 | 215 | #### 3) Components in any order and missing component, HTTP Accept header: `application/vnd.sdmx.data+csv; version=1.0.0; key=series` 216 | 217 | STRUCTURE[;],STRUCTURE_ID,ACTION,SERIES_KEY,OBS_VALUE1,OBS_VALUE2,ATTR_3,ATTR_1,DIM_2,DIM_1,DIM_3 218 | dataflow,ESTAT:NA_MAIN(1.6.0),M,A.B,12.4,12.5,"Normal, special and other values",N,B,A,2014-01 219 | dataflow,ESTAT:NA_MAIN(1.6.0),M,A.B,10.8,10.9,"Normal, special and other values",Y,B,A,2014-02 220 | 221 | #### 4) Localisation: HTTP Accept header: `application/vnd.sdmx.data+csv; version=1.0.0; labels=both; key=both`, HTTP Accept-Language header: `fr-FR, en;q=0.7` 222 | 223 | STRUCTURE[|];STRUCTURE_ID;ACTION;SERIES_KEY;OBS_KEY;DIM_1: Dimension 1;DIM_2: Dimension 2;DIM_3: Dimension 3;OBS_VALUE: Observation value;ATTR_2: Attribut 2;ATTR_3: Attribut 3;ATTR_1: Attribut 1 224 | dataflow;ESTAT:NA_MAIN(1.6.0): Principaux agrégats des comptes 
nationaux;M;A.B;A.B.2014-01;A: Value A;B: Value B;2014-01: 2014-01;12,4;Y: Oui;Normal, special and other values;N: Non 225 | dataflow;ESTAT:NA_MAIN(1.6.0): Principaux agrégats des comptes nationaux;M;A.B;A.B.2014-02;A: Value A;B: Value B;2014-02: 2014-02;10,8;Y: Oui;Normal, special and other values;Y: Oui 226 | 227 | Note that in this example the client prefers French (fr) language with the France (FR) locale, but will also accept any type of English. Therefore, in the message the French language with the France locale is applied, transforming also the field separator from comma (,) to semicolon (;), and the decimal separator from dot (.) to comma (,). 228 | 229 | #### 5) HTTP Accept header: `application/vnd.sdmx.data+csv; version=1.0.0; labels=both; timeFormat=normalized` 230 | 231 | STRUCTURE[;],STRUCTURE_ID,ACTION,DIM_1: Dimension 1,DIM_2: Dimension 2,DIM_3: Dimension 3,OBS_VALUE: Observation value,ATTR_2: Attribute 2,ATTR_3: Attribute 3,ATTR_1: Attribute 1 232 | dataflow,ESTAT:NA_MAIN(1.6.0): National Accounts Main Aggregates,M,A: Value A,B: Value B,2014-01-01,12.4,Y: Yes,"Normal, special and other values",N: No 233 | dataflow,ESTAT:NA_MAIN(1.6.0): National Accounts Main Aggregates,M,A: Value A,B: Value B,2014-02-01,10.8,Y: Yes,"Normal, special and other values",Y: Yes 234 | 235 | #### 6) HTTP Accept header: `application/vnd.sdmx.data+csv; version=1.0.0; labels=name` 236 | 237 | STRUCTURE,STRUCTURE_ID,STRUCTURE_NAME,ACTION,DIM_1,Dimension 1,DIM_2,Dimension 2,DIM_3,Dimension 3,OBS_VALUE,Observation value,ATTR_1,Attribute 1,ATTR_2,Attribute 2,ATTR_3,Attribute 3 238 | dataflow,ESTAT:NA_MAIN(1.6.0),National Accounts Main Aggregates,M,A,Value A,B,Value B,2014-01,2014-01,12.4,,Y,Yes,"Normal, special and other values",,N,No 239 | dataflow,ESTAT:NA_MAIN(1.6.0),National Accounts Main Aggregates,M,A,Value A,B,Value B,2014-02,2014-02,10.8,,Y,Yes,"Normal, special and other values",,Y,Yes 240 | 241 | #### 7) Multi-valued components 242 | 243 | 
STRUCTURE[;],STRUCTURE_ID,ACTION,DIM_1,DIM_2,DIM_3,OBS_VALUE,ATTR_1[],ATTR_2[],ATTR_3[] 244 | dataflow,ESTAT:NA_MAIN(1.6.0),M,A,B,2014-01,12.4,Value X;Value Y,"M, N & O;P & Q",A;B;C 245 | dataflow,ESTAT:NA_MAIN(1.6.0),M,A,B,2014-02,10.8,Value X;Value Y,"M, N & O;P & Q",A;C 246 | 247 | #### 8) Non-coded multi-lingual components, varying dataflows based on the same underlying data structure 248 | 249 | STRUCTURE[;],STRUCTURE_ID,ACTION,DIM_1,DIM_2,DIM_3,OBS_VALUE,ATTR_1[en;fr] 250 | dataflow,ESTAT:NA_MAIN(1.6.0),M,A,B,2014-01,12.4,en:Any Value;fr:N'importe quelle Valeur 251 | dataflow,ESTAT:NA_MAIN(1.7.0),M,A,B,2014-02,10.8,"en:Value ""X"";fr:Valeur ""X""" 252 | 253 | #### 9-A) Varying structural artefacts based on the same underlying data structure 254 | 255 | STRUCTURE[;],STRUCTURE_ID,ACTION,DIM_1,DIM_2,DIM_3,OBS_VALUE,ATTR_1[en;fr] 256 | dataflow,ESTAT:DF_NA_MAIN(1.6.0),M,A,B,2014-01,12.4,en:Any Value;fr:N'importe quelle Valeur 257 | datastructure,ESTAT:DSD_NA_MAIN(1.7.0),M,A,B,2014-02,10.8,"en:Value ""X"";fr:Valeur ""X""" 258 | dataprovision,ESTAT:DPA_NA_MAIN(1.8.0),M,A,B,2014-03,11.2,"en:Value ""Y"";fr:Valeur ""Y""" 259 | 260 | #### 9-B) Varying structural artefacts based on different underlying data structures 261 | 262 | STRUCTURE[;],STRUCTURE_ID,ACTION,DIM_A1B1,DIM_A2,DIM_A3C2,DIM_B2,DIM_C1,DIM_C3,MEAS_A1B1C1,MEAS_C2,ATTR_A1,ATTR_B1 263 | dataflow,ESTAT:DF_A(1.6.0),M,DIMVAL_A1B1,DIMVAL_A2,DIMVAL_A3C2,,,,"MEASVAL_A1B1C1",,"ATTRVAL_A1", 264 | datastructure,ESTAT:DSD_B(1.7.0),M,DIMVAL_A1B1,,,DIMVAL_B2,,,"MEASVAL_A1B1C1",,,"ATTRVAL_B1" 265 | dataprovision,ESTAT:DPA_C(1.8.0),M,,,DIMVAL_A3C2,,DIMVAL_C1,DIMVAL_C3,"MEASVAL_A1B1C1","MEASVAL_C2",, 266 | 267 | #### 10) Varying actions 268 | 269 | STRUCTURE,STRUCTURE_ID,ACTION,DIM_1,DIM_2,DIM_3,OBS_VALUE,ATTR_1 270 | dataflow,ESTAT:NA_MAIN(1.6.0),M,A,B,2014-01,12.4,X 271 | dataflow,ESTAT:NA_MAIN(1.6.0),R,A,B,2014-02,10.8,Y 272 | 273 | #### 11) Data for a non-versioned(1) data structure definition 274 | 275 | 
STRUCTURE,STRUCTURE_ID,ACTION,DIM_1,DIM_2,DIM_3,OBS_VALUE,ATTR_1 276 | datastructure,AGENCY:DF_ID,M,A,B,2014-01,12.4,N 277 | datastructure,AGENCY:DF_ID,M,A,B,2014-02,10.8,Y 278 | 279 | #### 12) Attributes attached to partial keys for a data provision agreement 280 | 281 | STRUCTURE,STRUCTURE_ID,ACTION,DIM_2,DIM_3,ATTR_1 282 | dataprovision,AGENCY:DPA_ID(1.0.0),M,B,2014-01,N 283 | dataprovision,AGENCY:DPA_ID(1.0.0),M,B,2014-02,Y 284 | 285 | #### 13) Mixing rows for attributes attached to partial keys with rows for observations 286 | 287 | STRUCTURE,STRUCTURE_ID,ACTION,DIM_1,DIM_2,DIM_3,MEAS_1,ATTR_1,ATTR_2 288 | dataflow,AGENCY:DF_ID(1.0.0),M,A,B,2014-01,12.4,N, 289 | dataflow,AGENCY:DF_ID(1.0.0),M,,B,,,,Y 290 | 291 | #### 14) Nested metadata attributes attached to partial keys 292 | 293 | STRUCTURE,STRUCTURE_ID,ACTION,DIM_2,COLLECTION.METHOD[en;fr],CONTACT[],CONTACT[].NAME[] 294 | dataflow,AGENCY:DF_ID(1.0.0),M,A,en:AAA;fr:BBB,Contact 1;Contact 2,"""Contact 1 Name 1;Contact 1 Name 2"";""Contact 1 Name 1;Contact 2 Name 2""" 295 | dataflow,AGENCY:DF_ID(1.0.0),M,B,en:CCC;fr:DDD,Contact 1;Contact 2;Contact 3,"""Contact 1 Name 1;Contact 1 Name 2"";;""Contact 3 Name 1;Contact 3 Name 2""" 296 | 297 | #### 15) Non-coded XHTML-formatted values with line-breaks 298 | 299 | STRUCTURE,STRUCTURE_ID,ACTION,DIM_1,DIM_2,DIM_3,OBS_VALUE,ATTR_1 300 | dataflow,ESTAT:NA_MAIN(1.6.0),M,A,B,2014-01,12.4,"

This is some ""xhtml"" with a line 301 | break

" 302 | dataflow,ESTAT:NA_MAIN(1.6.0),M,A,B,2014-02,10.8,"

This is some other ""xhtml""

" 303 | 304 | #### 16) Deleting specific measure and attribute values: all non-empty values (e.g. marked with "-") are deleted 305 | 306 | STRUCTURE,STRUCTURE_ID,ACTION,DIM_1,DIM_2,DIM_3,OBS_VALUE,ATTR_2,ATTR_3,ATTR_1 307 | dataflow,ESTAT:NA_MAIN(1.6.0),D,A,B,2014-01,-,,, 308 | dataflow,ESTAT:NA_MAIN(1.6.0),D,A,B,2014-02,,,-, 309 | 310 | #### 17) Deleting specific measure and attribute values with wildcarded dimensions: all non-empty values (e.g. marked with "-") are deleted for all dimension combinations where: 311 | - row 2: DIM2=A 312 | - row 3: DIM2=B 313 | 314 | STRUCTURE,STRUCTURE_ID,ACTION,DIM_1,DIM_2,DIM_3,OBS_VALUE,ATTR_2,ATTR_3,ATTR_1 315 | dataflow,ESTAT:NA_MAIN(1.6.0),D,,A,,-,,, 316 | dataflow,ESTAT:NA_MAIN(1.6.0),D,,B,,,,-, 317 | 318 | #### 18) Deleting whole observations with wildcarded dimensions: all observations are deleted for all dimension combinations where: 319 | - row 2: DIM2=A 320 | - row 3: DIM2=B and DIM3=C 321 | 322 | STRUCTURE,STRUCTURE_ID,ACTION,DIM_2,DIM_3 323 | dataflow,ESTAT:NA_MAIN(1.6.0),D,A,, 324 | dataflow,ESTAT:NA_MAIN(1.6.0),D,B,C, 325 | 326 | #### 19) Deleting all data for a data structure definition: 327 | 328 | STRUCTURE,STRUCTURE_ID,ACTION 329 | datastructure,ESTAT:DSD_NA_MAIN(1.6.0),D 330 | or 331 | 332 | STRUCTURE,STRUCTURE_ID,ACTION,DIM_1,DIM_2,DIM_3 333 | datastructure,ESTAT:DSD_NA_MAIN(1.6.0),D,,, 334 | 335 | ------------------------ 336 | 337 | **(1)** Note that since SDMX 3.0.0 the syntax *AGENCY:ARTEFACT_ID(VERSION)* allows omitting the version for non-versioned artefacts. In this case using *AGENCY:ARTEFACT_ID* is sufficient, e.g. `AGENCY:DF_ID` 338 | -------------------------------------------------------------------------------- /metadata-message/docs/sdmx-csv-field-guide.md: -------------------------------------------------------------------------------- 1 | # Introduction 2 | 3 | SDMX-CSV Data Message is an SDMX data exchange format based on the [RFC 4180](https://tools.ietf.org/html/rfc4180). 
CSV is a widely used standardised and simple format to exchange data supported by many tools. 4 | 5 | SDMX-CSV integrates with other specifications, i.e.: 6 | - The SDMX API RESTful specification (e.g. content negotiation with mime-type to get SDMX-CSV representations, specific formats for responses, language selection through HTTP content negotiation) 7 | - The [RFC 4180](https://tools.ietf.org/html/rfc4180) specification 8 | 9 | ## RFC 4180: A common format for CSV files 10 | 11 | In order to benefit from best practices, SDMX-CSV is based on the rules defined in the [RFC 4180](https://tools.ietf.org/html/rfc4180), which defines a common format and MIME Type for CSV files. It is advised to read the (very short) RFC for a full list of requirements but, in a nutshell, the RFC defines rules such as: 12 | - How the CSV file should be structured (the RFC specifies that all records must have an identical structure (determined column number), like when using an SDMX "flat" representation for data); 13 | - When double-quotes should be used and how to escape them when needed; 14 | - How spaces should be handled: Spaces are considered part of a field and should not be ignored; 15 | - Which mime type should be used; 16 | - What is the default character set, etc. 17 | 18 | # Design principles for SDMX-CSV 2.1 Metadata Messages (aligned with SDMX 3.1) 19 | 20 | - In order to ensure the identifiability of the metadata contained in the message, the header row containing the column headers is mandatory and its content is well-defined. 21 | - An SDMX-CSV referential metadata message contains metadata attribute values for one or more metadatasets reported for one or more metadataflows or metadata provision agreements. 22 | - After the mandatory header row, each row contains the information related to one specific metadataset attached to one or more identifiable artefacts (targets). 23 | - In [RFC 4180](https://tools.ietf.org/html/rfc4180), csv stands for "comma-separated values". 
However, while SDMX-CSV indeed uses the "comma" (%x2C) as the default field separator, it adopts the wider interpretation of csv as "character-separated values". Implementers are recommended to provide SDMX-CSV messages according to the locale of the user (e.g. as indicated in the HTTP Accept-Language header). This means that, e.g., the semicolon ‘;’ (typically used in specific regions or countries) is acceptable as separator. See also the examples below. Note that the separator used in a message can be determined by retrieving the character that follows the header field of the first column, which may be extended by a squared bracket term (see below). 24 | 25 | ## Columns 26 | 27 | - The first column is always used for the underlying type of structure by which the metadataset is defined: metadataflow or metadata provision agreement. 28 | - The next one or two columns are always used for the related structure identification. 29 | - The next one or two columns are used for the metadataset identification. 30 | - The next column, previously used for the action to be performed for the metadataset, is deprecated. Instead use the appropriate HTTP method when submitting the message to an SDMX Rest API as documented [here](https://github.com/sdmx-twg/sdmx-rest/blob/complement-maintenance-doc/doc/maintenance.md#maintaining-reference-metadata). 31 | - The next column is used for indicating whether the metadataset includes only a subset of the available languages. If false (the default), then the value is `0`, otherwise `1`. E.g., an SDMX Rest GET query with an HTTP header `Accept-Language` may result in a metadataset containing only some of the languages. If such a metadataset is again submitted to an SDMX Rest web service, then only the included languages are added or updated in the target SDMX system, while other languages are not changed. 32 | - The next column is used for the structure types of all targets of the metadataset. 
33 | - The next one or two columns are used for the identification of all targets of the metadataset. 34 | - Each metadata attribute of the included metadataset(s) is represented in one or two columns. SDMX web services should return the columns in the metadata attribute order as defined in (each of) the underlying Metadata Structure Definition(s), thus in case of data defined by different metadata structures: first the metadata attributes of the first metadata structure, then the remaining metadata attributes of the second metadata structure and so forth. However, any order of these columns is valid for metadata uploads to SDMX-consuming systems. 35 | - Implementers have the possibility to add any other custom columns as required, e.g. publicationPeriod, publicationYear, reportingBegin, reportingEnd, prepared, etc. 36 | - In the context of appending or deleting metadata, certain columns may be omitted, see below. 37 | 38 | ## Column headers (first row) 39 | 40 | - The header field of the first column always starts with the term `MDSTRUCTURE`. 41 | - This field must be extended with a sub-field delimiter encapsulated in squared brackets "[]", e.g. `MDSTRUCTURE[;]`, in case the message contains metadatasets with multiple targets or with multi-instance or multi-language metadata attributes. 42 | - The header field of the second column always contains the term `MDSTRUCTURE_ID`. 43 | - If option `labels=name` (see *[here](#optional-parameters)*): An additional column is added right after the structure identification column containing the term `MDSTRUCTURE_NAME`. 44 | - The header field of the next column always contains the term `METADATASET_ID`. 45 | - If option `labels=name` (see *[here](#optional-parameters)*): An additional column is added right after the metadataset identification column containing the term `METADATASET_NAME`. 46 | - The header field of the next column may contain the term `ACTION`. If this deprecated column is present, it is to be ignored. 
47 | - The header field of the next column may contain the term `IS_PARTIAL_LANGUAGE`. 48 | - The header field of the next column contains the term `TARGET_TYPES`. 49 | - The header field of the next column contains the term `TARGET_IDS`. 50 | - If option `labels=name` (see *[here](#optional-parameters)*): An additional column is added right after the target identification column containing the term `TARGET_NAMES`. 51 | - The other columns for components contain: 52 | - Default: The ID of the metadata attribute reported in that column prefixed by all corresponding nested parent metadata attributes separated by a dot "." in the form *METADATA_ID[.METADATA_ID]+*, e.g. `ATTRIBUTE_GRANDPARENT_ID.ATTRIBUTE_PARENT_ID.ATTRIBUTE_CHILD_ID`. Additional pairs of squared brackets `[]` are added at the end of the IDs of those metadata attributes that have multiple instances, e.g. `CONTACT[].NAME`, `CONTACT[].PHONE[]` or `CONTACT.PHONE[]`, and/or that contain localised values. In the latter case the brackets encapsulate the ISO 2-letter language codes that can be encountered in that column, separated by the special sub-field separation character, e.g. `;`, defined in the squared bracket term of the header field of the first column, e.g. `MDSTRUCTURE[;]`. Example of a localised child attribute: `PROCESS.STEP[en;fr]`, and for multiple instances: `PROCESS.STEP[][en;fr]`. 53 | - If option `labels=both` (see *[here](#optional-parameters)*): The full ID (as described above under 'Default') and the localised name of the metadata attribute reported in that column separated by the term ": ", e.g. `ATTRIBUTE_ID: ATTRIBUTE_NAME`. 54 | - If option `labels=name` (see *[here](#optional-parameters)*): An additional column is added right after the metadata attribute identification column containing the localised name of the metadata attribute reported in the previous column. 55 | - Any other custom column contains a custom but unique term, e.g. `publicationPeriod`. 
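
The header conventions above can be sketched in Python. This is only an illustrative sketch, not part of the specification: the function names `parse_first_header` and `parse_component_header`, the regular expressions and the set of characters accepted in an ID are assumptions, and only the default `labels=id` header form is handled.

```python
import re

# Assumed pattern: "MDSTRUCTURE" optionally followed by "[<sub-field separator>]".
FIRST_FIELD = re.compile(r"^MDSTRUCTURE(?:\[(?P<sub>[^\]])\])?")

# Assumed pattern for one dot-separated segment of a component header,
# e.g. "CONTACT[]", "STEP[en;fr]" or "STEP[][en;fr]".
SEGMENT = re.compile(
    r"^(?P<id>[A-Za-z0-9_@$-]+)(?P<multi>\[\])?(?:\[(?P<langs>[^\]]+)\])?$"
)


def parse_first_header(header_line):
    """Return (sub_field_separator, field_separator) from the raw header row."""
    m = FIRST_FIELD.match(header_line)
    if m is None:
        raise ValueError("header row must start with MDSTRUCTURE")
    rest = header_line[m.end():]
    # The character right after the first header field is the field separator.
    field_sep = rest[0] if rest else ","
    return m.group("sub"), field_sep


def parse_component_header(term, sub_sep=";"):
    """Split e.g. 'PROCESS.STEP[][en;fr]' into its nested attribute segments."""
    segments = []
    for part in term.split("."):
        m = SEGMENT.match(part)
        if m is None:
            raise ValueError(f"unrecognised header segment: {part!r}")
        langs = m.group("langs").split(sub_sep) if m.group("langs") else []
        segments.append(
            {"id": m.group("id"), "multi": m.group("multi") is not None, "langs": langs}
        )
    return segments
```

With these assumptions, `parse_first_header("MDSTRUCTURE[|];MDSTRUCTURE_ID")` yields the sub-field separator `|` and the field separator `;`, and `parse_component_header("PROCESS.STEP[][en;fr]")` marks `STEP` as multi-instance with the languages `en` and `fr`.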
56 | 57 | ## Column content (all rows after header) 58 | 59 | - The first column contains: `metadataflow` or `metadataprovision`, depending on the type of artefact for which the metadata contained in the message are defined: metadataflow or metadata provision agreement. 60 | - The second column contains: 61 | - Default: The structure identification information in the form *AGENCY:ARTEFACT_ID(VERSION)* (1), e.g. `ESTAT:MDF(1.6.0)`. 62 | - If option `labels=both` (see *[here](#optional-parameters)*): The structure identification information and its localised name separated by the term ": ", e.g. `ESTAT:MDF(1.6.0): Metadataflow name`. 63 | - If option `labels=name` (see *[here](#optional-parameters)*): An additional column is added right after the structure identification column with the structure's localised name, e.g. `Metadataflow name`. 64 | - The next column contains the metadataset identification information in the form *AGENCY:ARTEFACT_ID(VERSION)* (1), e.g. `AGENCY:MD_SET(1.0.0)`. 65 | - If option `labels=both` (see *[here](#optional-parameters)*): The ID and the localised name of the metadataset separated by the term ": ", e.g. `ESTAT:MD_SET(1.0.0): Metadataset 1`. 66 | - If option `labels=name` (see *[here](#optional-parameters)*): An additional column is added right after the metadataset identification column with the metadataset's localised name, e.g. `Metadata set name`. 67 | - The next column, if present, containing one of the action types, is deprecated and ignored. 68 | - The next column, if present, contains `1` or `0`, indicating whether the metadataset contains only a subset of all available languages. `1` stands for a partial language subset, and `0` for the complete set of available languages (default). 69 | - The next column contains the types of all the targets of the metadataset according to the resource names defined for Structural Metadata Queries, e.g. `dataflow`. Multiple targets are separated by the special sub-field separation character, e.g. 
`;`, defined in the squared bracket term of the header field of the first column, e.g. `MDSTRUCTURE[;]`. Example for multiple target types: `dataflow;dataflow`. 70 | - The next column contains the identification information of all the targets of the metadataset in the form *AGENCY:ARTEFACT_ID(VERSION)* (1), separated by the sub-field separation character, e.g. `AGENCY:DF1(1.0.0);AGENCY:DF2(1.0.0)`. 71 | - If option `labels=both` (see *[here](#optional-parameters)*): The column contains the ID and the localised name of the targets separated by the term ": ", e.g. `AGENCY:DF(1.0.0): Dataflow name` or `AGENCY:DF1(1.0.0): Dataflow 1 name;AGENCY:DF2(1.0.0): Dataflow 2 name`. 72 | - If option `labels=name` (see *[here](#optional-parameters)*): An additional column is added right after the target identification column with the target's localised name, e.g. `Dataflow name` or `Dataflow 1 name;Dataflow 2 name`. 73 | - The other columns for metadata attributes contain: 74 | - Default: The ID(s) (if coded) or value(s) (if non-coded) for the metadata attribute reported in that column, e.g. `A`, `A;B` or `"

An XHTML text

"`. 75 | - If option `labels=both` (see *[here](#optional-parameters)*): The ID(s) and their localised name(s) for the metadata attribute separated by the term ": " (if coded) or the value(s) (if non-coded) for the metadata attribute reported in that column, e.g. `A: A value name`, `A: A value name;B: B value name` or `"

An XHTML text

"`. 76 | - If option `labels=name` (see *[here](#optional-parameters)*): An additional column is added right after the metadata attribute identification column containing the localised name, e.g. `A value name` or `A value name;B value name`, of the metadata attribute value reported in the previous column. It is empty if the value has no localised name. 77 | - All string/textual values (complete string between column-separating characters including ID's or language codes) should always be encapsulated in quotation marks, they must be if they contain commas or inner quotation marks. Quotation marks in strings/textual values must always be escaped by doubling the quotes. 78 | - When metadata from different metadata structures are present then the columns not related to the attribute's metadata structure are to be left empty. 79 | - The other custom columns contain any potentially localised custom content. 80 | 81 | ## Localisation 82 | 83 | - HTTP content negotiation, see [RFC 2616 - HTTP 1.1 Header Field Definitions](https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html) 84 | - Always use this mime-type in the Accept header: `application/vnd.sdmx.metadata+csv; version=2.0.0`. 85 | - The client can indicate preferred languages through the Accept-Language header, e.g. `fr, en-gb;q=0.8, en;q=0.7`. 86 | - Always localise all artefact names according to the preferred language. The first best language match according to the user’s preferred language choices in the http Accept-Language header (or if that is not available than according to the system's default language order) is to be used for each localisable name element. The message does however not indicate the returned language per localisable name element. In case that there is no such language match for a particular localisable name element, it is optional to return the element in a system-default language or alternatively to not return the name element. 
87 | **It is recommended to indicate all languages used anywhere in the message for localised name elements through the HTTP Content-Language response header (languages of the intended audience).** 88 | Note: For multi-language metadata attribute values, all language versions are provided independently of the preferred language (see below). 89 | 90 | ## Multi-instance metadata attributes 91 | 92 | - Values from multiple instances of a metadata attribute within a metadataset are separated by the special sub-field separation character, e.g. `;`, defined in the squared bracket term of the header field of the first column, e.g. `MDSTRUCTURE[;]`. 93 | - Such metadata attributes are indicated in the column header by having their ID followed by empty squared brackets "[]", e.g. `ATTR[]`. 94 | - For coded multi-instance metadata attributes, if option `labels=both` (see *[here](#optional-parameters)*) then each individual value is to be prefixed with its ID and the term ": ", e.g. `A: Value A;B: Value B`. 95 | 96 | ## Non-coded multi-lingual metadata attributes 97 | 98 | - Non-coded metadata attributes allow for multi-lingual values. Those values are separated by the special sub-field separation character, e.g. `;`, defined in the squared bracket term of the header field of the first column, e.g. `MDSTRUCTURE[;]`. 99 | - Such metadata attributes are indicated in the column header by having their ID followed by the list of possible 2-letter ISO language codes separated by the sub-field separator and encapsulated in squared brackets "[]", e.g. `ATTR[en;fr]`. 100 | - Each individual language value is to be prefixed with its 2-letter ISO language code and a colon character ":", e.g. `en:Value;fr:Valeur`. Thus, in contrast to the ID prefix for coded values when using the HTTP Accept header parameter `labels=both` (see *[here](#optional-parameters)*), the language prefix `xx:` is not followed by a space character. 
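
As a rough illustration of the multi-lingual convention above, the following Python sketch splits one already-unquoted cell into its language versions. The function name `parse_multilingual` is hypothetical, and the sketch assumes that the text values themselves do not contain the sub-field separator; the standard `csv` module is used only to undo the RFC 4180 quoting first.

```python
import csv
import io


def parse_multilingual(cell, sub_sep=";"):
    """Split a cell such as 'en:Value;fr:Valeur' into {'en': 'Value', 'fr': 'Valeur'}.

    Assumption: the individual values do not contain the sub-field separator.
    """
    values = {}
    if not cell:
        return values
    for item in cell.split(sub_sep):
        # The language prefix is "xx:" without a following space.
        lang, colon, text = item.partition(":")
        if not colon:
            raise ValueError(f"missing language prefix in {item!r}")
        values[lang] = text
    return values


# The csv module removes the outer quotes and un-doubles the inner ones.
raw_row = 'CODE_ID,"en:Value ""X"";fr:Valeur ""X"""\n'
row = next(csv.reader(io.StringIO(raw_row)))
localised = parse_multilingual(row[1])
```

Here `localised` maps `en` to `Value "X"` and `fr` to `Valeur "X"`. A message that uses a different field separator would need the matching `delimiter` argument for `csv.reader`.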
101 | 102 | ## Non-coded multi-lingual multi-instance metadata attributes 103 | 104 | - When non-coded multi-lingual metadata attributes have multiple instances within a metadataset, then all individual values are included and separated by the special sub-field separation character, e.g. `;`, defined in the squared bracket term of the header field of the first column, e.g. `MDSTRUCTURE[;]`. 105 | - Such metadata attributes are indicated in the column header by having their ID followed by squared brackets "[]" as well as by the list of possible language codes separated by the sub-field separator and encapsulated in additional squared brackets "[]", e.g. `ATTR[][en;fr;de]`. 106 | - Each individual language value is to be prefixed with its 2-letter ISO language code and a colon character ":", e.g. `en:Value1`. 107 | - Not every value needs all language versions. To make clear which value the different language items belong to, each multi-lingual value set is to be encapsulated in double quotes, e.g. `"en:Value1;fr:Valeur1";"en:Value2;de:Wert2"`. However, note that fields with double quotes must themselves be encapsulated in double quotes and that the inner double quotes need to be doubled, thus the complete example is `"""en:Value1;fr:Valeur1"";""en:Value2;de:Wert2"""`. 108 | 109 | ## Non-coded XHTML-valued components 110 | 111 | - Some non-coded metadata attributes allow for XHTML values. 112 | - Each XHTML value is to be encapsulated in double quotes, e.g. `"

This is some ""metadata html""

"`. Remember that the inner double quotes need to be doubled. 113 | - The CSV format allows fields to contain line break characters if those fields are enclosed in double quotes. Thus XHTML values can also contain line breaks, although HTML viewers will ignore them. 114 | 115 | # Optional parameters 116 | 117 | The following optional parameter can be added to the HTTP Accept header. It needs to be separated by the character combination `"; "`. 118 | - labels (id|name|both; default=id): This parameter applies to all Nameable SDMX Artefacts contained in the header and the body of the message: 119 | - If the parameter value is `id` then only the id of the Artefacts is displayed. 120 | - If the parameter value is `both` then the concatenated id and localised name of the Artefacts (see the section on [localised names](#localised-names) on how the message deals with languages) separated by `": "` are displayed. Note that the character combination `": "` could also be part of the Artefact name and could therefore occur several times within the concatenated string. 121 | - If the parameter value is `name` then the id/value and the name of the artefacts are displayed in separate columns (see *[here](#columns)*), the ID/value column always directly preceding its related localised name column. 122 | 123 | # Examples 124 | 125 | Note: All examples assume the minimal HTTP Accept header: `application/vnd.sdmx.metadata+csv; version=2.1.0` 126 | 127 | #### 1) Ordinary case 128 | 129 | MDSTRUCTURE,MDSTRUCTURE_ID,METADATASET_ID,TARGET_TYPES,TARGET_IDS,ATTRIBUTE_1,ATTRIBUTE_1.CHILD,ATTRIBUTE_2 130 | metadataflow,OECD:MDF(1.0.0),OECD:MDS(1.0.0),dataflow,OECD:DF(1.0.0),A STRING VALUE,"

An XHTML text with ""quotes""

",123 131 | 132 | Note: 133 | The following default parameter settings are automatically applied: 134 | - labels=id 135 | 136 | #### 2) Metadata attribute with multiple instances and multi-lingual values 137 | 138 | MDSTRUCTURE[;],MDSTRUCTURE_ID,METADATASET_ID,TARGET_TYPES,TARGET_IDS,ATTRIBUTE_1,ATTRIBUTE_1.ATTRIBUTE_1_2[][en;fr],ATTRIBUTE_2[],ATTRIBUTE_3[] 139 | metadataflow,OECD:MDF(1.0.0),OECD:MDS(1.0.0),dataflow,OECD:DF(1.0.0),CODE_ID,"""en:""""

An XHTML text

"""";fr:""""

Un texte XHTML

""""";""en:""""

Another XHTML text

"""";fr:""""

Un autre texte XHTML

""""""","""Text with """"quotes"""""";""Another text""",123;456 140 | 141 | #### 3) Localisation: HTTP Accept header: `application/vnd.sdmx.metadata+csv; version=1.0.0; labels=both`, HTTP Accept-Language header: `fr-FR, en;q=0.7`, metadata attribute with multiple instances, metadata attributes with multi-lingual values 142 | 143 | MDSTRUCTURE[|],MDSTRUCTURE_ID;METADATASET_ID;TARGET_TYPES;TARGET_IDS;ATTRIBUTE_1: Attribut d'exemple 1;ATTRIBUTE_1.ATTRIBUTE_1_2[][en|fr]: Attribut d'exemple 12;ATTRIBUTE_2[]: Attribut d'exemple 2 144 | metadataflow;OECD:MDF(1.0.0): Metadataflow d'exemple;OECD:MDS(1.0.0): Metadataset d'exemple;dataflow;OECD:DF(1.0.0): Dataflow d'exemple;CODE_ID: Nom du code;"""en:""""

An XHTML text

""""|fr:""""

Un texte XHTML

""""""|""en:""""

Another XHTML text

""""|fr:""""

Un autre texte XHTML

""""""";123,45|6,789 145 | 146 | Note that in this example the client prefers French (fr) language with the France (FR) locale, but will also accept any type of English. Therefore, in the message the French language with the France locale is applied, transforming also the field separator from comma (,) to semicolon (;), and the decimal separator from dot (.) to comma (,). 147 | 148 | #### 4) Localisation: HTTP Accept header: `application/vnd.sdmx.metadata+csv; version=1.0.0; labels=name`, HTTP Accept-Language header: `en-US`, metadata attribute with multiple instances, metadata attributes with multi-lingual values, different targets and metadatasets 149 | 150 | MDSTRUCTURE[;],MDSTRUCTURE_ID,MDSTRUCTURE_NAME,METADATASET_ID,METADATASET_NAME,TARGET_TYPES,TARGET_IDS,TARGET_NAMES,ATTRIBUTE_1,Attribute 1,ATTRIBUTE_1.ATTRIBUTE_1_2[][en|fr],Attribute 12,ATTRIBUTE_2[],Attribute 2 151 | metadataflow,OECD:MDF(1.0.0),Metadataflow name,OECD:MDS(1.0.0),Metadataset name,dataflow;dataflow,OECD:DF(1.0.0);OECD:DF(1.1.0),Dataflow name 1;Dataflow name 2,CODE_ID,Code name,"""en:""""

An XHTML text

"""";fr:""""

Un texte XHTML

"""""";""en:""""

Another XHTML text

"""";fr:""""

Un autre texte XHTML

""""""",123.45;6.789 152 | metadataflow,OECD:MDF(1.0.0),Metadataflow name,OECD:MDS(1.1.0),Metadataset new name,codelist,OECD:CL(1.0.0),Codelist name,CODE_ID,Code name,"""en:""""

Text 1

"""";fr:""""

Texte 1

"""""";""en:""""

Text 2

"""";fr:""""

Texte 2

""""""",0 153 | 154 | #### 5) Varying metadataflows 155 | 156 | MDSTRUCTURE[;],MDSTRUCTURE_ID,METADATASET_ID,TARGET_TYPES,TARGET_IDS,ATTRIBUTE_1,ATTRIBUTE_2[][en;fr;de] 157 | metadataflow,OECD:MDF(1.0.0),OECD:MDS(1.0.0),dataflow,OECD:DF(1.0.0),CODE_ID,"""en:Value1;fr:Valeur1"";""en:Value2;de:Wert2""" 158 | metadataflow,OECD:MDF(1.1.0),OECD:MDS(1.1.0),dataflow,OECD:DF(1.1.0),CODE_ID,"""en:Value1;fr:Valeur1"";""en:Value2;de:Wert2""" 159 | 160 | #### 6) Non-versioned metadataset for a non-versioned[^1] data provision agreement 161 | 162 | MDSTRUCTURE[;],MDSTRUCTURE_ID,METADATASET_ID,TARGET_TYPES,TARGET_IDS,ATTRIBUTE_1,ATTRIBUTE_2[en;fr] 163 | metadataprovision,OECD:MDP,OECD:MDS,dataflow,OECD:DF(1.0.0),CODE_ID,"en:Value1;fr:Valeur1" 164 | 165 | #### 7) Non-coded metadata attribute values with line-breaks 166 | 167 | MDSTRUCTURE[;],MDSTRUCTURE_ID,METADATASET_ID,TARGET_TYPES,TARGET_IDS,ATTRIBUTE_1[] 168 | metadataflow,OECD:MDF(1.0.0),OECD:MDS(1.0.0),dataflow,OECD:DF(1.0.0),"""This text with a line 169 | break"";""This is some other text

""" 170 | 171 | #### 8) Metadataflows with partial languages 172 | 173 | MDSTRUCTURE[;],MDSTRUCTURE_ID,METADATASET_ID,IS_PARTIAL_LANGUAGE,TARGET_TYPES,TARGET_IDS,ATTRIBUTE_1,ATTRIBUTE_2[][en] 174 | metadataflow,OECD:MDF(1.0.0),OECD:MDS(1.0.0),1,dataflow,OECD:DF(1.0.0),CODE_ID,"""en:Value1"";""en:Value2""" 175 | metadataflow,OECD:MDF(1.1.0),OECD:MDS(1.1.0),1,dataflow,OECD:DF(1.1.0),CODE_ID,"""en:Value1"";""en:Value2""" 176 | 177 | ------------------------ 178 | 179 | [^1]: Note that since SDMX 3.0.0 the syntax *AGENCY:ARTEFACT_ID(VERSION)* allows omitting the version for non-versioned artefacts. In this case using *AGENCY:ARTEFACT_ID* is sufficient, e.g. `OECD:MDP` 180 | --------------------------------------------------------------------------------