26 |
--------------------------------------------------------------------------------
/Gemfile:
--------------------------------------------------------------------------------
1 | source "https://rubygems.org"
2 | # Hello! This is where you manage which Jekyll version is used to run.
3 | # When you want to use a different version, change it below, save the
4 | # file and run `bundle install`. Run Jekyll with `bundle exec`, like so:
5 | #
6 | # bundle exec jekyll serve
7 | #
8 | # This will help ensure the proper Jekyll version is running.
9 | # Happy Jekylling!
10 | gem "jekyll", "~> 4.3"
11 | # This is the default theme for new Jekyll sites. You may change this to anything you like.
12 | gem "just-the-docs"
13 | # If you want to use GitHub Pages, remove the `gem "jekyll"` line above and
14 | # uncomment the line below. To upgrade, run `bundle update github-pages`.
15 | # gem "github-pages", group: :jekyll_plugins
16 | # If you have any plugins, put them here!
17 | group :jekyll_plugins do
18 | gem "jekyll-feed", "~> 0.12"
19 | end
20 |
21 | # Windows and JRuby do not include zoneinfo files, so bundle the tzinfo-data gem
22 | # and associated library.
23 | platforms :mingw, :x64_mingw, :mswin, :jruby do
24 | gem "tzinfo", "~> 1.2"
25 | gem "tzinfo-data"
26 | end
27 |
28 | # Performance-booster for watching directories on Windows
29 | gem "wdm", "~> 0.1.1", :platforms => [:mingw, :x64_mingw, :mswin]
30 |
31 | # Lock `http_parser.rb` gem to `v0.6.x` on JRuby builds since newer versions of the gem
32 | # do not have a Java counterpart.
33 | gem "http_parser.rb", "~> 0.6.0", :platforms => [:jruby]
34 |
35 | gem "webrick", "~> 1.7"
36 |
37 | gem "jekyll-remote-theme", "~> 0.4.3"
38 |
39 | gem 'jekyll-redirect-from'
40 |
41 | gem 'jekyll-include-cache'
42 |
--------------------------------------------------------------------------------
/Gemfile.lock:
--------------------------------------------------------------------------------
1 | GEM
2 | remote: https://rubygems.org/
3 | specs:
4 | addressable (2.8.6)
5 | public_suffix (>= 2.0.2, < 6.0)
6 | colorator (1.1.0)
7 | concurrent-ruby (1.2.3)
8 | em-websocket (0.5.3)
9 | eventmachine (>= 0.12.9)
10 | http_parser.rb (~> 0)
11 | eventmachine (1.2.7)
12 | ffi (1.16.3)
13 | forwardable-extended (2.6.0)
14 | google-protobuf (3.25.3-arm64-darwin)
15 | google-protobuf (3.25.3-x86_64-linux)
16 | http_parser.rb (0.8.0)
17 | i18n (1.14.1)
18 | concurrent-ruby (~> 1.0)
19 | jekyll (4.3.3)
20 | addressable (~> 2.4)
21 | colorator (~> 1.0)
22 | em-websocket (~> 0.5)
23 | i18n (~> 1.0)
24 | jekyll-sass-converter (>= 2.0, < 4.0)
25 | jekyll-watch (~> 2.0)
26 | kramdown (~> 2.3, >= 2.3.1)
27 | kramdown-parser-gfm (~> 1.0)
28 | liquid (~> 4.0)
29 | mercenary (>= 0.3.6, < 0.5)
30 | pathutil (~> 0.9)
31 | rouge (>= 3.0, < 5.0)
32 | safe_yaml (~> 1.0)
33 | terminal-table (>= 1.8, < 4.0)
34 | webrick (~> 1.7)
35 | jekyll-feed (0.17.0)
36 | jekyll (>= 3.7, < 5.0)
37 | jekyll-include-cache (0.2.1)
38 | jekyll (>= 3.7, < 5.0)
39 | jekyll-redirect-from (0.16.0)
40 | jekyll (>= 3.3, < 5.0)
41 | jekyll-remote-theme (0.4.3)
42 | addressable (~> 2.0)
43 | jekyll (>= 3.5, < 5.0)
44 | jekyll-sass-converter (>= 1.0, <= 3.0.0, != 2.0.0)
45 | rubyzip (>= 1.3.0, < 3.0)
46 | jekyll-sass-converter (3.0.0)
47 | sass-embedded (~> 1.54)
48 | jekyll-seo-tag (2.8.0)
49 | jekyll (>= 3.8, < 5.0)
50 | jekyll-watch (2.2.1)
51 | listen (~> 3.0)
52 | just-the-docs (0.8.0)
53 | jekyll (>= 3.8.5)
54 | jekyll-include-cache
55 | jekyll-seo-tag (>= 2.0)
56 | rake (>= 12.3.1)
57 | kramdown (2.4.0)
58 | rexml
59 | kramdown-parser-gfm (1.1.0)
60 | kramdown (~> 2.0)
61 | liquid (4.0.4)
62 | listen (3.9.0)
63 | rb-fsevent (~> 0.10, >= 0.10.3)
64 | rb-inotify (~> 0.9, >= 0.9.10)
65 | mercenary (0.4.0)
66 | pathutil (0.16.2)
67 | forwardable-extended (~> 2.6)
68 | public_suffix (5.0.4)
69 | rake (13.1.0)
70 | rb-fsevent (0.11.2)
71 | rb-inotify (0.10.1)
72 | ffi (~> 1.0)
73 | rexml (3.2.6)
74 | rouge (4.2.0)
75 | rubyzip (2.3.2)
76 | safe_yaml (1.0.5)
77 | sass-embedded (1.71.1-arm64-darwin)
78 | google-protobuf (~> 3.25)
79 | sass-embedded (1.71.1-x86_64-linux-gnu)
80 | google-protobuf (~> 3.25)
81 | terminal-table (3.0.2)
82 | unicode-display_width (>= 1.1.1, < 3)
83 | unicode-display_width (2.5.0)
84 | webrick (1.8.1)
85 |
86 | PLATFORMS
87 | arm64-darwin-21
88 | arm64-darwin-23
89 | x86_64-linux
90 |
91 | DEPENDENCIES
92 | http_parser.rb (~> 0.6.0)
93 | jekyll (~> 4.3)
94 | jekyll-feed (~> 0.12)
95 | jekyll-include-cache
96 | jekyll-redirect-from
97 | jekyll-remote-theme (~> 0.4.3)
98 | just-the-docs
99 | tzinfo (~> 1.2)
100 | tzinfo-data
101 | wdm (~> 0.1.1)
102 | webrick (~> 1.7)
103 |
104 | BUNDLED WITH
105 | 2.5.6
106 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | Creative Commons Legal Code
2 |
3 | CC0 1.0 Universal
4 |
5 | CREATIVE COMMONS CORPORATION IS NOT A LAW FIRM AND DOES NOT PROVIDE
6 | LEGAL SERVICES. DISTRIBUTION OF THIS DOCUMENT DOES NOT CREATE AN
7 | ATTORNEY-CLIENT RELATIONSHIP. CREATIVE COMMONS PROVIDES THIS
8 | INFORMATION ON AN "AS-IS" BASIS. CREATIVE COMMONS MAKES NO WARRANTIES
9 | REGARDING THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS
10 | PROVIDED HEREUNDER, AND DISCLAIMS LIABILITY FOR DAMAGES RESULTING FROM
11 | THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS PROVIDED
12 | HEREUNDER.
13 |
14 | Statement of Purpose
15 |
16 | The laws of most jurisdictions throughout the world automatically confer
17 | exclusive Copyright and Related Rights (defined below) upon the creator
18 | and subsequent owner(s) (each and all, an "owner") of an original work of
19 | authorship and/or a database (each, a "Work").
20 |
21 | Certain owners wish to permanently relinquish those rights to a Work for
22 | the purpose of contributing to a commons of creative, cultural and
23 | scientific works ("Commons") that the public can reliably and without fear
24 | of later claims of infringement build upon, modify, incorporate in other
25 | works, reuse and redistribute as freely as possible in any form whatsoever
26 | and for any purposes, including without limitation commercial purposes.
27 | These owners may contribute to the Commons to promote the ideal of a free
28 | culture and the further production of creative, cultural and scientific
29 | works, or to gain reputation or greater distribution for their Work in
30 | part through the use and efforts of others.
31 |
32 | For these and/or other purposes and motivations, and without any
33 | expectation of additional consideration or compensation, the person
34 | associating CC0 with a Work (the "Affirmer"), to the extent that he or she
35 | is an owner of Copyright and Related Rights in the Work, voluntarily
36 | elects to apply CC0 to the Work and publicly distribute the Work under its
37 | terms, with knowledge of his or her Copyright and Related Rights in the
38 | Work and the meaning and intended legal effect of CC0 on those rights.
39 |
40 | 1. Copyright and Related Rights. A Work made available under CC0 may be
41 | protected by copyright and related or neighboring rights ("Copyright and
42 | Related Rights"). Copyright and Related Rights include, but are not
43 | limited to, the following:
44 |
45 | i. the right to reproduce, adapt, distribute, perform, display,
46 | communicate, and translate a Work;
47 | ii. moral rights retained by the original author(s) and/or performer(s);
48 | iii. publicity and privacy rights pertaining to a person's image or
49 | likeness depicted in a Work;
50 | iv. rights protecting against unfair competition in regards to a Work,
51 | subject to the limitations in paragraph 4(a), below;
52 | v. rights protecting the extraction, dissemination, use and reuse of data
53 | in a Work;
54 | vi. database rights (such as those arising under Directive 96/9/EC of the
55 | European Parliament and of the Council of 11 March 1996 on the legal
56 | protection of databases, and under any national implementation
57 | thereof, including any amended or successor version of such
58 | directive); and
59 | vii. other similar, equivalent or corresponding rights throughout the
60 | world based on applicable law or treaty, and any national
61 | implementations thereof.
62 |
63 | 2. Waiver. To the greatest extent permitted by, but not in contravention
64 | of, applicable law, Affirmer hereby overtly, fully, permanently,
65 | irrevocably and unconditionally waives, abandons, and surrenders all of
66 | Affirmer's Copyright and Related Rights and associated claims and causes
67 | of action, whether now known or unknown (including existing as well as
68 | future claims and causes of action), in the Work (i) in all territories
69 | worldwide, (ii) for the maximum duration provided by applicable law or
70 | treaty (including future time extensions), (iii) in any current or future
71 | medium and for any number of copies, and (iv) for any purpose whatsoever,
72 | including without limitation commercial, advertising or promotional
73 | purposes (the "Waiver"). Affirmer makes the Waiver for the benefit of each
74 | member of the public at large and to the detriment of Affirmer's heirs and
75 | successors, fully intending that such Waiver shall not be subject to
76 | revocation, rescission, cancellation, termination, or any other legal or
77 | equitable action to disrupt the quiet enjoyment of the Work by the public
78 | as contemplated by Affirmer's express Statement of Purpose.
79 |
80 | 3. Public License Fallback. Should any part of the Waiver for any reason
81 | be judged legally invalid or ineffective under applicable law, then the
82 | Waiver shall be preserved to the maximum extent permitted taking into
83 | account Affirmer's express Statement of Purpose. In addition, to the
84 | extent the Waiver is so judged Affirmer hereby grants to each affected
85 | person a royalty-free, non transferable, non sublicensable, non exclusive,
86 | irrevocable and unconditional license to exercise Affirmer's Copyright and
87 | Related Rights in the Work (i) in all territories worldwide, (ii) for the
88 | maximum duration provided by applicable law or treaty (including future
89 | time extensions), (iii) in any current or future medium and for any number
90 | of copies, and (iv) for any purpose whatsoever, including without
91 | limitation commercial, advertising or promotional purposes (the
92 | "License"). The License shall be deemed effective as of the date CC0 was
93 | applied by Affirmer to the Work. Should any part of the License for any
94 | reason be judged legally invalid or ineffective under applicable law, such
95 | partial invalidity or ineffectiveness shall not invalidate the remainder
96 | of the License, and in such case Affirmer hereby affirms that he or she
97 | will not (i) exercise any of his or her remaining Copyright and Related
98 | Rights in the Work or (ii) assert any associated claims and causes of
99 | action with respect to the Work, in either case contrary to Affirmer's
100 | express Statement of Purpose.
101 |
102 | 4. Limitations and Disclaimers.
103 |
104 | a. No trademark or patent rights held by Affirmer are waived, abandoned,
105 | surrendered, licensed or otherwise affected by this document.
106 | b. Affirmer offers the Work as-is and makes no representations or
107 | warranties of any kind concerning the Work, express, implied,
108 | statutory or otherwise, including without limitation warranties of
109 | title, merchantability, fitness for a particular purpose, non
110 | infringement, or the absence of latent or other defects, accuracy, or
111 | the present or absence of errors, whether or not discoverable, all to
112 | the greatest extent permissible under applicable law.
113 | c. Affirmer disclaims responsibility for clearing rights of other persons
114 | that may apply to the Work or any use thereof, including without
115 | limitation any person's Copyright and Related Rights in the Work.
116 | Further, Affirmer disclaims responsibility for obtaining any necessary
117 | consents, permissions or other rights required for any use of the
118 | Work.
119 | d. Affirmer understands and acknowledges that Creative Commons is not a
120 | party to this document and has no duty or obligation with respect to
121 | this CC0 or use of the Work.
122 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Zarr Enhancement Proposals (ZEPs)
2 |
3 | ## Community Feedback Process for Zarr Specifications
4 |
5 | ZEP stands for Zarr Enhancement Proposal. A ZEP is a design document providing
6 | information to the Zarr community, describing a modification or enhancement of
7 | the Zarr specifications or a new feature for its processes or environment. The
8 | ZEP should provide specific proposed changes to the Zarr specification and a
9 | narrative rationale for the specification changes.
10 |
11 | We intend ZEPs to be the primary mechanism for evolving the spec, collecting
12 | community input on significant issues, and documenting the design decisions that
13 | have gone into Zarr.
14 |
15 | ## Proposing a new ZEP
16 |
17 | ZEPs should be submitted as a draft ZEP in the [draft folder](https://github.com/zarr-developers/zeps/tree/main/draft)
18 | of this (*zeps*) repository via [GitHub pull request](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request).
19 |
20 | The PR should contain the narrative text of the ZEP in a file named `zep-<n>.md`,
21 | where `<n>` is an appropriately assigned four-digit number. The draft ZEP must
22 | use the [ZEP X - Template and Instructions](https://zarr.dev/zeps/template/template.html)
23 | file.
24 |
25 | To read more about submitting a ZEP, see [Submitting a ZEP](https://zarr.dev/zeps/active/ZEP0000.html#submitting-a-zep).
26 |
27 | ## Contributing to ZEPs
28 |
29 | The ZEPs in this repo are published automatically on the web at
30 | https://zarr.dev/zeps/. If you wish to contribute to the website, please build
31 | it locally on your machine first. Building the website requires [Jekyll](http://jekyllrb.com/);
32 | see the [Jekyll documentation](https://jekyllrb.com/docs/) for installation instructions.
33 |
34 | Steps to contribute:
35 |
36 | 1. Fork this repo and clone your fork
37 | 2. `cd` into the cloned repo
38 | 3. Run `bundle exec jekyll serve --incremental`
39 | or `docker run -p 4000:4000 -v $(pwd):/site bretfisher/jekyll-serve`
40 | 4. Open a browser and go to http://localhost:4000/zeps/ to see the
41 | website
42 | 5. Make the desired changes, save them, and refresh http://localhost:4000/zeps/ to see
43 | the changes
44 |
45 | Once done, push your changes to your fork and open a
46 | [GitHub pull request](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request).
47 |
48 |
--------------------------------------------------------------------------------
/_config.yml:
--------------------------------------------------------------------------------
1 | # Welcome to Jekyll!
2 | #
3 | # This config file is meant for settings that affect your whole blog, values
4 | # which you are expected to set up once and rarely edit after that. If you find
5 | # yourself editing this file very often, consider using Jekyll's data files
6 | # feature for the data you need to update frequently.
7 | #
8 | # For technical reasons, this file is *NOT* reloaded automatically when you use
9 | # 'bundle exec jekyll serve'. If you change this file, please restart the server process.
10 | #
11 | # If you need help with YAML syntax, here are some quick references for you:
12 | # https://learn-the-web.algonquindesign.ca/topics/markdown-yaml-cheat-sheet/#yaml
13 | # https://learnxinyminutes.com/docs/yaml/
14 | #
15 | # Site settings
16 | # These are used to personalize your new site. If you look in the HTML files,
17 | # you will see them accessed via {{ site.title }}, {{ site.email }}, and so on.
18 | # You can create any custom variable you would like, and they will be accessible
19 | # in the templates via {{ site.myvariable }}.
20 |
21 | title: ZEP
22 | email: zarrdevelopers@gmail.com
23 | description: >- # this means to ignore newlines until "baseurl:"
24 | ZEP (Zarr Enhancement Proposal) is a community feedback process for Zarr Specifications.
25 | A ZEP is a design document providing information to the Zarr community, describing a
26 | modification or enhancement of the Zarr specifications, a new feature for its
27 | processes or environment.
28 | baseurl: "/zeps" # the subpath of your site, e.g. /blog
29 | url: "https://zarr.dev" # the base hostname & protocol for your site, e.g. http://example.com
30 | #twitter_username: jekyllrb
31 | #github_username: jekyll
32 |
33 | # Set a path/url to a logo that will be displayed instead of the title
34 | logo: "/assets/images/zarr.png"
35 |
36 | # Google Analytics Tracking
37 | ga_tracking: G-BCRR9QE7Z0
38 | ga_tracking_anonymize_ip: true # Use GDPR compliant Google Analytics settings (true/nil by default)
39 |
40 | # Aux links for the upper right navigation
41 | aux_links:
42 | "Zarr Homepage":
43 | - "https://zarr.dev/"
44 |
45 | # Makes Aux links open in a new tab. Default is false
46 | aux_links_new_tab: true
47 |
48 | # Build settings
49 | remote_theme: pmarsceill/just-the-docs
50 | plugins:
51 | - jekyll-feed
52 | - jekyll-remote-theme
53 | - jekyll-redirect-from
54 | - jekyll-include-cache
55 |
56 | # Exclude from processing.
57 | # The following items will not be processed, by default.
58 | # Any item listed under the `exclude:` key here will be automatically added to
59 | # the internal "default list".
60 | #
61 | # Excluded items can be processed by explicitly listing the directories or
62 | # their entries' file path in the `include:` list.
63 | #
64 | # exclude:
65 | # - .sass-cache/
66 | # - .jekyll-cache/
67 | # - gemfiles/
68 | # - Gemfile
69 | # - Gemfile.lock
70 | # - node_modules/
71 | # - vendor/bundle/
72 | # - vendor/cache/
73 | # - vendor/gems/
74 | # - vendor/ruby/
75 |
--------------------------------------------------------------------------------
/_includes/custom.html:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/_posts/2022-05-26-welcome-to-jekyll.markdown:
--------------------------------------------------------------------------------
1 | ---
2 | layout: post
3 | title: "Welcome to Jekyll!"
4 | date: 2022-05-26 03:31:52 +0530
5 | categories: jekyll update
6 | ---
7 | You’ll find this post in your `_posts` directory. Go ahead and edit it and re-build the site to see your changes. You can rebuild the site in many different ways, but the most common way is to run `jekyll serve`, which launches a web server and auto-regenerates your site when a file is updated.
8 |
9 | Jekyll requires blog post files to be named according to the following format:
10 |
11 | `YEAR-MONTH-DAY-title.MARKUP`
12 |
13 | Where `YEAR` is a four-digit number, `MONTH` and `DAY` are both two-digit numbers, and `MARKUP` is the file extension representing the format used in the file. After that, include the necessary front matter. Take a look at the source for this post to get an idea about how it works.
14 |
15 | Jekyll also offers powerful support for code snippets:
16 |
17 | {% highlight ruby %}
18 | def print_hi(name)
19 | puts "Hi, #{name}"
20 | end
21 | print_hi('Tom')
22 | #=> prints 'Hi, Tom' to STDOUT.
23 | {% endhighlight %}
24 |
25 | Check out the [Jekyll docs][jekyll-docs] for more info on how to get the most out of Jekyll. File all bugs/feature requests at [Jekyll’s GitHub repo][jekyll-gh]. If you have questions, you can ask them on [Jekyll Talk][jekyll-talk].
26 |
27 | [jekyll-docs]: https://jekyllrb.com/docs/home
28 | [jekyll-gh]: https://github.com/jekyll/jekyll
29 | [jekyll-talk]: https://talk.jekyllrb.com/
30 |
--------------------------------------------------------------------------------
/accepted/accepted_zeps.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: accepted ZEPs
4 | description: List of Accepted ZEPs
5 | nav_order: 2
6 | has_children: true
7 | permalink: /accepted_zeps/
8 | ---
9 |
10 | # Accepted ZEPs
11 |
12 | ### Shows the list of Accepted ZEPs.
13 |
--------------------------------------------------------------------------------
/active/active_zep.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: active ZEPs
4 | description: List of Active ZEPs
5 | nav_order: 3
6 | has_children: true
7 | permalink: /active_zeps/
8 | ---
9 |
10 | # Active ZEPs
11 |
12 | ### Shows the list of Active ZEPs.
13 |
--------------------------------------------------------------------------------
/assets/images/Zarr_accumulation-App_Interface-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zarr-developers/zeps/a644e70eae9a247ee0895d977d025d34fce35adb/assets/images/Zarr_accumulation-App_Interface-1.png
--------------------------------------------------------------------------------
/assets/images/Zarr_accumulation-App_Interface-2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zarr-developers/zeps/a644e70eae9a247ee0895d977d025d34fce35adb/assets/images/Zarr_accumulation-App_Interface-2.png
--------------------------------------------------------------------------------
/assets/images/flowchart.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zarr-developers/zeps/a644e70eae9a247ee0895d977d025d34fce35adb/assets/images/flowchart.png
--------------------------------------------------------------------------------
/assets/images/sharding.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zarr-developers/zeps/a644e70eae9a247ee0895d977d025d34fce35adb/assets/images/sharding.png
--------------------------------------------------------------------------------
/assets/images/zarr.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zarr-developers/zeps/a644e70eae9a247ee0895d977d025d34fce35adb/assets/images/zarr.png
--------------------------------------------------------------------------------
/draft/ZEP0003.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: ZEP0003
4 | description: Variable chunk sizes
5 | parent: draft ZEPs
6 | nav_order: 3
7 | ---
8 |
9 | # ZEP 3 — Variable chunking
10 |
11 | Authors:
12 | * Martin Durant ([@martindurant](https://github.com/martindurant)), Anaconda, Inc.
13 | * Isaac Virshup ([@ivirshup](https://github.com/ivirshup)), Helmholtz Munich
14 |
15 | Status: Draft
16 |
17 | Type: Specification
18 |
19 | Created: 2022-10-17
20 |
21 | Discussion: https://github.com/orgs/zarr-developers/discussions/52
22 |
23 | ## Abstract
24 |
25 | This ZEP proposes allowing the chunks of a Zarr array to form a rectangular grid rather than a regular grid,
26 | with the chunk
27 | lengths along any dimension given as a list of integers rather than a single chunk size.
28 |
29 | ## Motivation and Scope
30 |
31 | Several use cases, given below, have motivated this proposal. This generalisation of Zarr's storage
32 | model can be seen as an optional enhancement, and it matches the data model currently used by dask.array.
33 |
34 | - when producing a [kerchunked](https://github.com/fsspec/kerchunk) dataset, the native chunking of the targets
35 | cannot be changed. It is common
36 | to have non-regular chunking on at least one dimension, such as a time dimension with one sample per day and chunks
37 | of one month or one year. The change would allow these datasets to be read via kerchunk, and/or converted to
38 | zarr with equivalent chunking to the original. Such data cannot currently be represented in zarr.
39 | - [awkward](https://github.com/scikit-hep/awkward) arrays, ragged arrays and sparse data can be represented as
40 | a set of one-dimensional arrays, with an appropriate metadata description convention. The chunks
41 | of each component array that correspond to one logical chunk of the overall array will not, in general, be equal
42 | in size to each other, nor consistent between logical chunks, as each row in the matrix can have a variable number
43 | of non-zero values.
44 | - sensor data may not arrive in fixed increments; variably chunked storage would make parallel writing easier.
45 | With variable chunk sizes, writers only need to make sure the offsets are
46 | correct once writing is done. Otherwise, the write location of each chunk depends on the previous chunks.
47 | - in some cases, parts of the overall data array may have very different data distributions, and it can
48 | be very convenient to partition the data by such characteristics to allow, for example, for more efficient encoding
49 | schemes.
50 | - when filtering regular table data on one column and applying the filter to the other columns, you necessarily end up with an unequal
51 | number of values in each chunk, which zarr does not currently handle.
52 |
53 | ## Usage and Impact
54 |
55 | ### Creation
56 |
57 | ```python
58 | zarr.create(1000, chunks=((100, 300, 500, 100),))
59 | ```
60 |
61 |
62 | ## Backward Compatibility
63 |
64 |
65 | This change is fully backward compatible - all old data will remain usable. However, data written with
66 | variable chunks will not be readable by older versions of Zarr. It would be reasonable to backport the
67 | feature to v2.
68 |
69 | ## Detailed description
70 |
71 | Currently, the array metadata specifies the chunking scheme as follows
72 | (see https://zarr-specs.readthedocs.io/en/latest/core/v3.0.html#chunk-grid):
73 | ```json
74 | {
75 | "type": "regular",
76 | "chunk_shape": [10, 10],
77 | "separator":"/"
78 | }
79 | ```
80 |
81 | The proposal is to allow metadata of the form
82 | ```json
83 | {
84 | "type": "rectangular",
85 | "chunk_shape": [[5, 5, 5, 15, 15, 20, 35], 10],
86 | "separator":"/"
87 | }
88 | ```
89 | Each element of `chunk_shape` corresponds to one dimension of the array and may be a single integer, as before,
90 | or a list of integers which add up to the size of the array in that dimension. In this example, the single value
91 | of `10` for the chunks on the second dimension is equivalent to `[10, 10, 10, 10, 10, 10, 10, 10, 10, 10]`.
92 | The number of values in the list is equal to the number of chunks along that dimension. Thus, a "rectangular"
93 | grid can also express any "regular" grid.
94 |
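The expansion rule above can be sketched in a few lines of Python. This is only an illustration, not part of the proposal: the helper name `normalize_chunk_shape` is hypothetical, the array shape `(100, 100)` is inferred from the example, and the sketch assumes that a single integer divides the axis length exactly (edge chunks are left out).

```python
def normalize_chunk_shape(chunk_shape, array_shape):
    """Expand every entry of `chunk_shape` into an explicit list of chunk lengths."""
    normalized = []
    for axis, (chunks, length) in enumerate(zip(chunk_shape, array_shape)):
        if isinstance(chunks, int):
            # Assumption of this sketch: the integer divides the axis length exactly.
            if length % chunks != 0:
                raise ValueError(f"axis {axis}: {chunks} does not divide {length}")
            chunks = [chunks] * (length // chunks)
        elif sum(chunks) != length:
            # A list of chunk lengths must add up to the size of the array on that axis.
            raise ValueError(
                f"axis {axis}: chunk lengths sum to {sum(chunks)}, expected {length}"
            )
        normalized.append(list(chunks))
    return normalized


print(normalize_chunk_shape([[5, 5, 5, 15, 15, 20, 35], 10], (100, 100)))
```

Normalising both forms to explicit lists lets the rest of an implementation treat "regular" and "rectangular" grids the same way.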
95 | The data index bounds of each hyperrectangle along a dimension are formed by a cumulative sum of the chunk values,
96 | starting at 0.
97 | ```
98 | bounds_axis0 = [0, 5, 10, 15, 30, 45, 65, 100]
99 | bounds_axis1 = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
100 | ```
101 | such that key "c0/0" contains the values for indices [0, 5) along the first dimension and [0, 10) along the second dimension.
102 | An array index of (17, 17) would be found in key "c3/1", at index (2, 7) within that chunk.
103 |
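The lookup described above can be sketched with the Python standard library. The function names `chunk_bounds` and `locate` are hypothetical and used only for illustration; the sketch treats each chunk as the half-open index range `[start, stop)`.

```python
from bisect import bisect_right
from itertools import accumulate


def chunk_bounds(chunk_sizes):
    """Cumulative-sum bounds for one axis, starting at 0."""
    return [0, *accumulate(chunk_sizes)]


def locate(index, bounds_per_axis):
    """Return (chunk coordinates, offset within that chunk) for an array index."""
    chunk_coords, offsets = [], []
    for i, bounds in zip(index, bounds_per_axis):
        # bisect_right finds the chunk whose half-open range [start, stop) holds i.
        c = bisect_right(bounds, i) - 1
        chunk_coords.append(c)
        offsets.append(i - bounds[c])
    return tuple(chunk_coords), tuple(offsets)


bounds_axis0 = chunk_bounds([5, 5, 5, 15, 15, 20, 35])
bounds_axis1 = chunk_bounds([10] * 10)

coords, offsets = locate((17, 17), [bounds_axis0, bounds_axis1])
key = "c" + "/".join(str(c) for c in coords)
print(key, offsets)
```

Because the bounds are sorted, each lookup is a binary search per axis, so the cost does not grow linearly with the number of chunks.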
104 | ## Related Work
105 |
106 | ### Dask
107 |
108 | `dask.array` uses rectangular chunking internally, and is one of the major consumers of zarr data. Much of the
109 | code translating logical slices into slices on the individual chunks should be reusable.
110 |
111 | ### Parquet / Arrow
112 |
113 | Arrow describes tables as a collection of record batches. There is no restriction on the size of these batches.
114 | This is not only very flexible, but can be used as an indexing strategy for low cardinality columns within parquet.
115 |
116 | ```
117 | dataset_name/
118 | year=2007/
119 | month=01/
120 | 0.parq
121 | 1.parq
122 | ...
123 | month=02/
124 | 0.parq
125 | 1.parq
126 | ...
127 | month=03/
128 | ...
129 | year=2008/
130 | month=01/
131 | ...
132 | ...
133 | ```
134 |
135 | This feature was cited as one of the reasons parquet was chosen over zarr for dask
136 | dataframes: https://github.com/dask/dask/issues/1599
137 |
138 | ### awkward array
139 |
140 | https://github.com/zarr-developers/zarr-specs/issues/62
141 |
142 |
143 | ## Implementation
144 |
145 | We hope that much of the code can be adapted from dask.array, which already allows variable chunk sizes
146 | on each dimension.
147 |
148 | ## Alternatives
149 |
150 | ### Just tune chunk sizes
151 |
152 | https://github.com/zarr-developers/zarr-specs/issues/62#issuecomment-1100806513
153 |
154 |
155 | ## Discussion
156 |
157 |
158 | ## References and Footnotes
159 |
160 | * Previous discussion:
161 | * [Zarr Dask Table dask/dask#1599](https://github.com/dask/dask/issues/1599)
162 | * [Protocol extensions for awkward arrays zarr-developers/zarr-specs#62](https://github.com/zarr-developers/zarr-specs/issues/62)
163 | * [Handling arrays with non-uniform chunking zarr-developers/zarr-specs#40](https://github.com/zarr-developers/zarr-specs/issues/40)
164 | * [Chunk spec zarr-developers/zarr-spec#7](https://github.com/zarr-developers/zarr-specs/issues/7#issuecomment-468127219)
165 |
166 |
167 |
168 | ## Copyright
169 |
170 | This document has been placed in the public domain.
171 |
--------------------------------------------------------------------------------
/draft/ZEP0005.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: ZEP0005
4 | description: This ZEP proposes a Zarr extension for an algorithm developed at NASA’s GES DISC for fast and cost-efficient multi-dimensional averaging service -- Zarr-based Chunk-level Accumulation in Reduced Dimensions.
5 | parent: draft ZEPs
6 | nav_order: 5
7 | ---
8 |
9 | # ZEP 5 — Zarr-based Chunk-level Accumulation in Reduced Dimensions
10 |
11 | Authors:
12 | * Hailiang Zhang ([@hailiangzhang](https://github.com/hailiangzhang)), Adnet Systems Inc, NASA Goddard Space Flight Center.
13 | * Mahabal Hegde ([@nasahegde](https://github.com/nasahegde)), NASA Goddard Space Flight Center.
14 | * Christine Smit ([@christine-e-smit](https://github.com/christine-e-smit)), Telophase Co, NASA Goddard Space Flight Center.
15 | * Brianna Pagan ([@briannapagan](https://github.com/briannapagan)), Adnet Systems Inc, NASA Goddard Space Flight Center.
16 | * Dieu My Nguyen ([@dieumynguyen](https://github.com/dieumynguyen)), Adnet Systems Inc, NASA Goddard Space Flight Center.
17 |
18 | Status: Draft
19 |
20 | Type: Specification
21 |
22 | Created: 2023-02-12
23 |
24 | Discussion:
25 |
26 | ## Abstract
27 |
28 | At NASA GES DISC, we receive a large number of user requests each day for a variety of analysis and visualization services involving averaging along one or more dimensions, some of which are computationally expensive when run against large amounts of geospatial data. We have proposed a generic and dimension-agnostic method based on chunk-level cumulative sums (accumulation) on a regular grid, which provides fast and cost-efficient cloud analysis for multidimensional averaging services. This method introduces a small, adjustable set of auxiliary data on top of the raw data and reduces the computational time by orders of magnitude through chunk-level accumulation along one or more dimensions.
29 |
30 | We hereby propose a Zarr extension for this chunk-level accumulation approach. In this proposal, we will present a Zarr group for the accumulation data, a JSON schema for the accumulation group attribute, a JSON schema for the accumulation data array attribute, and an example of the user application interface.
31 |
32 | ## Motivation and Scope
33 |
34 | At NASA GES DISC, our use case is computing averages along a range of data dimensions (e.g., space, time) in a cost-effective and highly performant manner. In the geospatial community, computing area or temporal averages over a long range of observations is popular with users. Performing the averaging operation over data in Zarr generally requires a full scan of the data. This can be parallelized with Dask (or any other distributed framework), but reading all of the data is an unavoidable bottleneck. Our proposed approach, Zarr-based Chunk-level Accumulation, will improve the speed and cost of long-range calculations by loading only a few data chunks at the averaging boundary. Here, we provide examples of how the approach is applied to geospatial data with temporal and spatial dimensions; however, the approach is dimension-agnostic and can be generalized to all types of multidimensional data aggregation services with Zarr.
35 |
36 | ## Detailed description
37 |
38 | The core idea of the Zarr-based Chunk-level Accumulation algorithm is to pre-compute [cumulative sums](https://mathworld.wolfram.com/CumulativeSum.html) of data values and weights/counts along data dimensions at the chunk intervals. These cumulative sums can then be used to find data averages over dimension ranges.
39 |
40 | Example:
41 |
42 | For a sequence `A=(a, b, c, d)`, the cumulative sums are `S=(s0, s1, s2, s3)` where `s0=a, s1=a+b, s2=a+b+c, s3=a+b+c+d`. The average of the sequence over a range can now be calculated using the cumulative sums. For example, assuming zero-based indexing, `average(A[1:]) = (S[3]-S[0])/3 = (s3-s0)/3`.
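A minimal Python sketch of the same arithmetic follows. The helper name `range_average` and the sample values are assumptions made for illustration; they are not part of the proposal.

```python
from itertools import accumulate


def range_average(cumsums, start, stop):
    """Average of A[start:stop], given inclusive cumulative sums S[i] = A[0] + ... + A[i].

    Hypothetical helper for illustration only.
    """
    if not 0 <= start < stop <= len(cumsums):
        raise ValueError("need 0 <= start < stop <= len(cumsums)")
    upper = cumsums[stop - 1]
    # Nothing precedes the range when it starts at index 0.
    lower = cumsums[start - 1] if start > 0 else 0
    return (upper - lower) / (stop - start)


A = [2.0, 4.0, 6.0, 8.0]            # sample values (a, b, c, d)
S = list(accumulate(A))             # cumulative sums (s0, s1, s2, s3)
avg_tail = range_average(S, 1, 4)   # average(A[1:]) = (s3 - s0) / 3
```

Only two entries of `S` are read regardless of how long the range is, which is what makes the lookup cheap.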
43 |
44 | The above example can be extended and generalized to multiple dimensions. This makes computing averages *O(1)* rather than *O(N^m)* for the dimensions being averaged, where *N* is the number of data values along each dimension and *m* is the number of dimensions to be averaged. See our [ESIP 2022 presentation](https://www.youtube.com/watch?v=ac_UKunUrNM&t=2250s) (and the [slides](https://docs.google.com/presentation/d/1RNvkIlCFvtoy89OTMzQNn_0jixOpdhnu/edit?usp=sharing&ouid=106287227661991623566&rtpof=true&sd=true)) for a more detailed description.
45 |
46 | ## Implementation
47 |
48 | We propose to formalize this Zarr-based Chunk-level Accumulation approach as a Zarr extension. To implement this approach, cumulative sums are computed at chunk intervals and are stored in a Zarr group. The API for averaging the data fetches the necessary pre-computed sums based on the user-requested dimensions (e.g., time) and dimension ranges (e.g., from 1980 to 1990).
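The sketch below illustrates the idea in one dimension, under simplifying assumptions: unweighted data, a plain Python list standing in for a chunked array, and a cumulative sum stored for every chunk. The function names are hypothetical and do not come from the proposal or from any Zarr library.

```python
from itertools import accumulate


def build_chunk_cumsums(values, chunk_size):
    """Cumulative sums recorded at the end of every chunk (hypothetical helper)."""
    chunk_sums = [sum(values[i:i + chunk_size])
                  for i in range(0, len(values), chunk_size)]
    return list(accumulate(chunk_sums))


def chunked_range_sum(values, chunk_cumsums, chunk_size, start, stop):
    """Sum of values[start:stop], reading raw values only in partial boundary chunks."""
    first_full = -(-start // chunk_size)  # ceil division: first chunk fully inside the range
    last_full = stop // chunk_size        # one past the last chunk fully inside the range
    if first_full >= last_full:
        # The range contains no whole chunk, so fall back to the raw values.
        return sum(values[start:stop])
    upper = chunk_cumsums[last_full - 1]
    lower = chunk_cumsums[first_full - 1] if first_full > 0 else 0
    total = upper - lower                                # all whole chunks, two lookups
    total += sum(values[start:first_full * chunk_size])  # partial chunk at the start
    total += sum(values[last_full * chunk_size:stop])    # partial chunk at the end
    return total


values = list(range(12))                       # 12 samples along one dimension
chunk_size = 4
cumsums = build_chunk_cumsums(values, chunk_size)
range_sum = chunked_range_sum(values, cumsums, chunk_size, 2, 11)
range_avg = range_sum / (11 - 2)
```

However long the requested range is, raw values are read from at most two chunks; the interior of the range is covered by the pre-computed sums.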
49 |
50 | Please note that this solution is also applicable for storing chunk statistics (min, max, sum, count, etc.) to help with performing aggregations.
51 |
52 | ### Zarr group structure of accumulation data
53 |
54 | Rather than storing the chunk-level statistics in a separate store, we could store them inline with the arrays they are derived from. This would enable other applications to take advantage of such pre-computed data to optimize queries. This is similar to an optimization in Snowflake ([twitter link](https://twitter.com/teej_m/status/1546591452750159873)).
55 |
56 | The accumulation datasets are organized in a data group adjacent to the raw data and dimension arrays with the following structure:
57 | ```
58 | ├── ${dimension_array}
59 | ├── ...
60 | ├── ${raw_dataset}
61 | ├── ...
62 | └── ${raw_dataset}_accumulation_group
63 |     ├── .zgroup
64 |     ├── .zattrs
65 |     ├── ${accumulation_dataset_1}
66 |     │   ├── .zarray
67 |     │   ├── .zattrs
68 |     │   └── ...
69 |     ├── ${accumulation_dataset_2}
70 |     │   ├── .zarray
71 |     │   ├── .zattrs
72 |     │   └── ...
73 |     ...
74 | ```
75 |
76 | where `${dimension_array}` is the data array for the dimension variable, `${raw_dataset}` is the data array for the raw dataset, `${raw_dataset}_accumulation_group` is the group for accumulation, and `${accumulation_dataset_1}` and `${accumulation_dataset_2}` are the data arrays for each accumulation dataset.
77 |
78 | ### Zarr attribute file of accumulation group
79 |
80 | The accumulation group attribute file, `${raw_dataset}_accumulation_group/.zattrs`, provides details of the accumulation implementation and data organization. It follows the JSON schema shown below:
81 | ```
82 | {
83 | "$schema": "http://json-schema.org/draft-07/schema#",
84 | "type": "object",
85 | "definitions": {
86 | "accumulation_data_array": {
87 | "type": "object",
88 | "properties": {
89 | "_DATA_UNWEIGHTED": {
90 | "type": "string"
91 | },
92 | "_DATA_WEIGHTED": {
93 | "type": "string"
94 | },
95 | "_WEIGHTS": {
96 | "type": "string"
97 | }
98 | },
99 | "patternProperties": {
100 | "^(?!_DATA_UNWEIGHTED|_DATA_WEIGHTED|_WEIGHTS).*$": {
101 | "$ref": "#/definitions/accumulation_data_array"
102 | }
103 | },
104 | "additionalProperties": false
105 | }
106 | },
107 | "properties": {
108 | "_ACCUMULATION_GROUP": {
109 | "type": "object",
110 | "patternProperties": {
111 | "^(?!_DATA_UNWEIGHTED|_DATA_WEIGHTED|_WEIGHTS).*$": {
112 | "$ref": "#/definitions/accumulation_data_array"
113 | }
114 | },
115 | "additionalProperties": false
116 | }
117 | },
118 | "required": [
119 | "_ACCUMULATION_GROUP"
120 | ]
121 | }
122 | ```
123 |
124 | The recursive definition (`#/definitions/accumulation_data_array`) under the schema root (`_ACCUMULATION_GROUP`) provides details of the cumulative sum statistics, including the dataset names, accumulation types and dimensions. The keys of its `properties` (`_DATA_UNWEIGHTED`, `_DATA_WEIGHTED`, and `_WEIGHTS`) indicate the cumulative sum types (for unweighted data, weighted data, and weights respectively), whereas the corresponding values give the cumulative sum dataset names. The accumulation dimension names are saved in the keys of its `patternProperties` along the recursion chain; note that these dimension names must follow a fixed order to avoid ambiguity and redundancy.
125 |
126 | An example of the above Zarr attribute file is given below. The data has three dimensions: *latitude*, *longitude*, and *time*. The cumulative sums are computed for the weighted data (`_DATA_WEIGHTED`) and weights (`_WEIGHTS`). If we want to provide the time-averaged map and area-averaged time series, the accumulation is only needed for the dimension combinations of *latitude*, *longitude*, *time*, and *latitude*+*longitude*; all other dimension combinations (e.g., *latitude*+*time*, *longitude*+*time*, and *latitude*+*longitude*+*time*) are empty (`{}`).
127 | ```
128 | {
129 | "_ACCUMULATION_GROUP": {
130 | "latitude": {
131 | "_DATA_WEIGHTED": "acc_lat",
132 | "_WEIGHTS": "acc_wt_lat",
133 | "longitude": {
134 | "_DATA_WEIGHTED": "acc_lat_lon",
135 | "_WEIGHTS": "acc_wt_lat_lon",
136 | "time": {}
137 | },
138 | "time": {}
139 | },
140 | "longitude": {
141 | "_DATA_WEIGHTED": "acc_lon",
142 | "_WEIGHTS": "acc_wt_lon",
143 | "time": {}
144 | },
145 | "time": {
146 | "_DATA_WEIGHTED": "acc_time",
147 | "_WEIGHTS": "acc_wt_time"
148 | }
149 | }
150 | }
151 | ```
152 |
153 | ### Zarr attribute file of accumulation data array
154 |
155 | With Zarr-based chunk-level accumulation, the cumulative sums are not necessarily computed for every single chunk. To further reduce the computation and storage cost of the accumulation data, the cumulative sums can be computed once every fixed number of chunks; we call this tunable number the *accumulation stride*. This information is saved in the Zarr attribute file for the accumulation dataset (e.g., `${raw_dataset}_accumulation_group/${accumulation_dataset_1}/.zattrs`).
156 |
157 | As mentioned above, the dimension labels are needed to identify the accumulation datasets. We assume that the dimensions are defined in the attributes of the dataset as `_ARRAY_DIMENSIONS`, following [the xarray implementation](https://docs.xarray.dev/en/stable/internals/zarr-encoding-spec.html). In the present approach, the *accumulation stride* is saved in an attribute called `_ACCUMULATION_STRIDE` alongside `_ARRAY_DIMENSIONS`. The related schema segment of this attribute file is shown below:
158 | ```
159 | {
160 | "$schema":"http://json-schema.org/draft-07/schema#",
161 | "type":"object",
162 | "properties":{
163 | "_ARRAY_DIMENSIONS":{
164 | "type":"array",
165 | "items":{
166 | "type":"string"
167 | }
168 | },
169 | "_ACCUMULATION_STRIDE":{
170 | "type":"array",
171 | "items":{
172 | "type":"integer"
173 | }
174 | }
175 | },
176 | "required":[
177 | "_ARRAY_DIMENSIONS",
178 | "_ACCUMULATION_STRIDE"
179 | ]
180 | }
181 | ```
182 |
183 | The `_ARRAY_DIMENSIONS` and `_ACCUMULATION_STRIDE` arrays should have the same length. Each item in the `_ACCUMULATION_STRIDE` array represents the accumulation stride along the dimension from the `_ARRAY_DIMENSIONS` array at the same index. Each accumulation stride should be a non-negative integer: a positive value represents the accumulation stride as defined above, whereas a value of 0 indicates that accumulation is not performed along the given dimension.
184 |
185 | For example, the following attribute file represents the accumulation that is performed along only the time dimension every other chunk:
186 | ```
187 | {
188 | "_ARRAY_DIMENSIONS":[
189 | "latitude",
190 | "longitude",
191 | "time"
192 | ],
193 | "_ACCUMULATION_STRIDE":[
194 | 0,
195 | 0,
196 | 2
197 | ]
198 | }
199 | ```
200 |
201 | and the following attribute file represents the accumulation that is performed along the latitude dimension for each chunk, and along the longitude dimension every 3 chunks:
202 | ```
203 | {
204 | "_ARRAY_DIMENSIONS":[
205 | "latitude",
206 | "longitude",
207 | "time"
208 | ],
209 | "_ACCUMULATION_STRIDE":[
210 | 1,
211 | 3,
212 | 0
213 | ]
214 | }
215 | ```
216 |
217 | ### Application Interface
218 |
219 | The accumulation-based workflow requires the application to locate the accumulation data along certain dimensions. The accumulation data array name for the given dimensions can be obtained from the accumulation group attributes. The following example shows the steps to get the weighted accumulation data array name along *latitude*+*longitude* dimensions:
220 |
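One way to perform this lookup is sketched below in Python. It assumes the group attributes have already been loaded into a dictionary (here copied from the example attribute file above); the function name `find_accumulation_array` is hypothetical and not part of the proposal.

```python
# Group attributes copied from the example attribute file in this proposal.
GROUP_ATTRS = {
    "_ACCUMULATION_GROUP": {
        "latitude": {
            "_DATA_WEIGHTED": "acc_lat",
            "_WEIGHTS": "acc_wt_lat",
            "longitude": {
                "_DATA_WEIGHTED": "acc_lat_lon",
                "_WEIGHTS": "acc_wt_lat_lon",
                "time": {},
            },
            "time": {},
        },
        "longitude": {
            "_DATA_WEIGHTED": "acc_lon",
            "_WEIGHTS": "acc_wt_lon",
            "time": {},
        },
        "time": {
            "_DATA_WEIGHTED": "acc_time",
            "_WEIGHTS": "acc_wt_time",
        },
    }
}


def find_accumulation_array(group_attrs, dimensions, kind="_DATA_WEIGHTED"):
    """Return the accumulation array name for `dimensions`, or None if not stored.

    `dimensions` must follow the same order as the nesting in the attribute
    file. Hypothetical helper for illustration only.
    """
    node = group_attrs["_ACCUMULATION_GROUP"]
    for dim in dimensions:
        if dim not in node:
            return None
        node = node[dim]       # descend one level per accumulated dimension
    return node.get(kind)      # empty nodes ({}) carry no dataset name


weighted_name = find_accumulation_array(GROUP_ATTRS, ["latitude", "longitude"])
weights_name = find_accumulation_array(GROUP_ATTRS, ["latitude", "longitude"], "_WEIGHTS")
```

Combinations that are listed but empty, such as *latitude*+*time*, return `None`, which an application could treat as a signal to fall back to scanning the raw data.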
221 |
222 |
223 | The accumulation stride is also needed to locate the accumulation data for a given chunk number. It can be obtained from the accumulation data attributes, and the following example shows the steps to get the accumulation stride for the accumulation data along *latitude*+*longitude* dimensions:
224 |
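A possible Python sketch is shown below. It assumes the array attributes have already been loaded into a dictionary (here copied from the second example attribute file above); the function name `strides_by_dimension` is hypothetical and not part of the proposal.

```python
# Array attributes copied from the latitude/longitude example in this proposal.
ARRAY_ATTRS = {
    "_ARRAY_DIMENSIONS": ["latitude", "longitude", "time"],
    "_ACCUMULATION_STRIDE": [1, 3, 0],
}


def strides_by_dimension(array_attrs):
    """Map each dimension name to its accumulation stride (hypothetical helper)."""
    dims = array_attrs["_ARRAY_DIMENSIONS"]
    strides = array_attrs["_ACCUMULATION_STRIDE"]
    if len(dims) != len(strides):
        raise ValueError("_ARRAY_DIMENSIONS and _ACCUMULATION_STRIDE differ in length")
    if any(s < 0 for s in strides):
        raise ValueError("accumulation strides must be non-negative integers")
    return dict(zip(dims, strides))


strides = strides_by_dimension(ARRAY_ATTRS)
# A stride of 0 means no accumulation is performed along that dimension.
accumulated_dims = [dim for dim, stride in strides.items() if stride > 0]
```

The result pairs each dimension with its stride, so the application can tell which dimensions are accumulated and how many chunks separate consecutive cumulative sums.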
225 |
226 |
227 | ## References and Footnotes
228 | * ESIP Summer 2022 Presentation on *Zarr-based chunk-level cumulative sums in reduced dimensions for fast high-resolution data analysis*:
229 | * [Abstract](https://2022esipjulymeeting.sched.com/event/12etJ/advances-and-challenges-of-cloud-native-data-including-analysis-ready-cloud-optimized-or-arco-formats-and-access-part-1-presentations)
230 | * [Slides](https://docs.google.com/presentation/d/1RNvkIlCFvtoy89OTMzQNn_0jixOpdhnu/edit?usp=sharing&ouid=106287227661991623566&rtpof=true&sd=true)
231 | * [Video](https://www.youtube.com/watch?v=ac_UKunUrNM&t=2250s)
232 |
233 | * [*Xarray* Zarr Encoding Specification](https://docs.xarray.dev/en/stable/internals/zarr-encoding-spec.html)
234 | * [*Snowflake* table statistics](https://twitter.com/teej_m/status/1546591452750159873)
235 |
236 | ## Copyright
237 |
238 | This proposal is licensed under [the Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0).
239 |
--------------------------------------------------------------------------------
/draft/draft_zeps.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: draft ZEPs
4 | description: List of Draft ZEPs
5 | nav_order: 4
6 | has_children: true
7 | permalink: /draft_zeps/
8 | ---
9 |
10 | # Draft ZEPs
11 |
12 | ### Shows the list of Draft ZEPs.
13 |
--------------------------------------------------------------------------------
/index.md:
--------------------------------------------------------------------------------
1 | ---
2 | # Feel free to add content and custom Front Matter to this file.
3 | # To modify the layout, see https://jekyllrb.com/docs/themes/#overriding-theme-defaults
4 |
5 | layout: default
6 | title: home
7 | nav_order: 1
8 | description: ZEP "A community feedback process for Zarr Specification"
9 | permalink: /
10 |
11 | ---
12 |
13 | # Zarr Enhancement Proposals (ZEPs)
14 | {: .fs-9}
15 |
16 | Community Feedback Process for Zarr Specifications.
17 | {: .fs-6 .fw-300 }
18 |
19 | [Propose a new ZEP](https://github.com/zarr-developers/zeps#proposing-a-new-zep){: .btn .btn-primary .fs-5 .mb-4 .mb-md-0 .mr-2 }
20 | [View it on GitHub](https://github.com/zarr-developers/zeps){: .btn .fs-5 .mb-4 .mb-md-0 }
21 |
22 | ---
23 |
24 | ZEP stands for Zarr Enhancement Proposal. A ZEP is a design document providing
25 | information to the Zarr community, describing a modification or enhancement of
26 | the [Zarr specifications](https://zarr-specs.readthedocs.io/en/latest/), or a new
27 | feature for its processes or environment. The ZEP should provide specific proposed
28 | changes to the Zarr specification and a narrative rationale for the specification
29 | changes.
30 |
31 | We intend ZEPs to be the primary mechanism for evolving the spec, collecting
32 | community input on major issues and documenting the design decisions that have
33 | gone into Zarr.
34 |
35 | ### ZEP Meetings 🧑🏻💻
36 |
37 | We hold bi-weekly ZEPs meetings to propose, discuss, review and finalize discussions around current ZEPs and Zarr Specification. More info available here: [https://zarr.dev/zeps/meetings/](https://zarr.dev/zeps/meetings/)
38 |
39 | ---
40 |
41 | ### Contributing 🤝🏻
42 |
43 | If you wish to contribute to Zarr's codebase, website, or blog posts, propose a new ZEP,
44 | or help in any other way, please visit Zarr's GitHub [here](https://github.com/zarr-developers/).
45 | You can discuss the change you want to see by opening an issue in the appropriate
46 | repository, or if the issue is already present, feel free to submit a pull request.
47 |
48 | ### Code of Conduct ⚖️
49 |
50 | ZEPs are governed by Zarr Community's
51 | [CODE OF CONDUCT](https://github.com/zarr-developers/.github/blob/main/CODE_OF_CONDUCT.md).
--------------------------------------------------------------------------------
/join-community.markdown:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: join the community
4 | permalink: /join-community/
5 | ---
6 |
7 | ## Join the Zarr Community
8 |
9 | Most discussions and chats related to Zarr and its [implementations](https://github.com/zarr-developers/zarr_implementations) take place on Gitter and GitHub. If you are looking to:
10 |
11 | - Interact with the maintainers, contributors and users of the project; join the ZulipChat → [here](https://ossci.zulipchat.com/)
12 | - Ask questions related to [`zarr-python`](https://github.com/zarr-developers/zarr-python) usage; create a new discussion on GitHub → [here](https://github.com/zarr-developers/zarr-python/discussions)
13 | - Contribute and engage in discussion related to Zarr Specification; check out the `zarr-specs` [repo](https://github.com/zarr-developers/zarr-specs/) or create an issue → [here](https://github.com/zarr-developers/zarr-specs/issues)
14 |
15 | Also, find us on:
16 |
17 | - [Twitter](https://twitter.com/zarr_dev)
18 | - [GitHub](https://github.com/zarr-developers)
19 | - [YouTube](https://www.youtube.com/@zarr_dev/playlists)
20 |
--------------------------------------------------------------------------------
/meetings/2022/2022-09-08.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 8th September
4 | description: ZEPs Meeting Notes for 2022-09-08
5 | grand_parent: ZEP meetings
6 | parent: 2022 meetings
7 | nav_order: 1
8 | ---
9 |
10 | # 2022-09-08
11 |
12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Ward Fisher (WF), Jonathan Striebel (JS), Norman Rzepka (NR), Ryan Abernathey (RA), Dennis Heimbigner (DH), Jeremy Maitin-Shepard (JMS)
13 |
14 | ## TL;DR
15 |
16 | This was the first ZEP meeting ever. Representatives from the [Zarr Implementations Council](https://github.com/zarr-developers/governance/blob/main/GOVERNANCE.md#zarr-implementation-council-zic) joined the meeting. Most discussions revolved around [ZEP1](https://zarr.dev/zeps/draft/ZEP0001.html). One of the critical decisions about ZEP1 was to accept it ‘[Provisionally](https://zarr.dev/zeps/active/ZEP0000.html#review-and-resolution)’ and move forward with the implementation in various programming languages.
17 |
18 | **Updates:**
19 | - SV: open ("draft") ZEPs:
20 | - [https://zarr.dev/zeps/draft/ZEP0001.html](https://zarr.dev/zeps/draft/ZEP0001.html)
21 | - [https://zarr.dev/zeps/draft/ZEP0002.html](https://zarr.dev/zeps/draft/ZEP0002.html)
22 | - SV: Author discussion on all the comments on ZEP0001
23 | - Have proposed resolutions for a number of those
24 | - Meeting again tomorrow to try to finish the list
25 | - [https://hackmd.io/sOos8rxrRvKCJPbbUKWtwA?view](https://hackmd.io/sOos8rxrRvKCJPbbUKWtwA?view)
26 |
27 | - SV: Critically propose marking ZEP0001 as "provisionally accepted" after the above are handled and passed by the ZIC
28 | - Implementations are free (and encouraged!) to start implementing.
29 | - Any blocking changes could still be handled.
30 | - Otherwise, "feature freeze".
31 | - Feedback?
32 | - RA: process to get to provisionally accepted
33 | - SV: draft == under review.
34 | - on vote, can move to provisionally or accepted state.
35 | - once implemented, moves to final.
36 | - could move to "deferred" state if the ZIC vetoes
37 | - WF: "ready to implement" jumped out (and caused anxiety but only since there's too much to do)
38 | - [https://zarr.dev/zeps/active/ZEP0000.html#review-and-resolution](https://zarr.dev/zeps/active/ZEP0000.html#review-and-resolution)
39 | - JMS: no substantial changes since early draft
40 | - JM: editors are preparing a rebuttal (Alistair's paper model)
41 | - JMS: not sure a paper model is best
42 | - RA: not in the sense that there's only one round and someone will decide. iterative
43 | - good to have authors who are organizing.
44 | - now in revision and we can continue until everyone is happy
45 | - gone slowly for various reasons (availability, summer, and it's our first time & massive)
46 | - would be useful to go through the outstanding issues
47 | - JS: in this cycle and not limited iterations is just the limited time.
48 | - but for now, trying to make batched changes
49 |
50 | **Meeting Minutes:**
51 | - JS: **review of memory order decision number-16 from list**
52 | - zarr's goal is interoperability. therefore propose to keep C & F (benefit for community)
53 | - could support read only, even with a transpose (if too slow, add a warning?)
54 | - JMS: agree. but would like an arbitrary permutation.
55 | - DH: good use case?
56 | - JMS: dimension that represents time. order you display to the user is logical for them but need not be logical for compression/access patterns.
57 | - JM/JS: core or extension?
58 | - RA: that's a key question
59 | - NR: re: backwards compatibility C/F is in V2 therefore that would need to be in core. but arbitrary could be an extension.
60 | - RA: but v3 is a chance to break backwards compatibility (explicitly not a goal)
61 | - NR: upgrade path? so be able to upgrade without re-writing the chunks.
62 | - RA: v2 will still be supported.
63 | - WF: that would be the hope, but worry about netcdf & archival -- assuming software will support it without it being expressed somewhere. aspirational sure but makes us nervous.
64 | - e.g. will future software implement the v2 standard?
65 | - RA: transform based solution? (but only if we support F) **if** we say the chunks should be backwards compatibility.
66 | - WF/DH: no one has ever asked for arbitrary. Someone at NOA asks for things that would help their lab. Technical debt. (Won't even request a pull request) See the trap that the HDF group fell into (single-writer-multiple-reader, several orders of magnitude that they are trying to recover from.)
67 | - JMS: arbitrary seems most natural. pass to `numpy.transpose()`
68 | - WF: shocked at the assertion that there _wouldn't_ be a migration path
69 | - JM: clarification -- we're only differentiating if _binary_ transformation is needed
70 | - **can add _requirement_ to v3 that implementations read v2**
71 | - WF: requirement of netcdf. can decide if that's a requirement.
72 | - DH: depends if it's a lot. operational definition - "too painful to copy v2 to v3"
73 | - (for RA): petabytes of data
74 | - JS: RA proposed transformer strategy - essentially rewriting metadata **formalize it?**
75 | - DH: how comfortable are you not supporting older version?
76 | - JM: for OME, got agreement but that's a layer higher
77 | - DH: will there be new implementations without V3 support?
78 | - NR: think there will be
79 | - JM: but it's so easy to implement
80 | - WF: people won't do that...what do we do if a popular implementation doesn't support v2?
81 | - other packages?
82 | - RA: recommend storage layer / translation?
83 | - JM: agreed but that's SHOULD (versus MUST)
84 | - JMS: only way to force it is a standardization
85 | - JM: agreed, but we can only do what the spec document allows us (i.e. labeling something as "compliant")
86 | - JS: it's a new major version and people know what we mean. (as a user, I wouldn't expect support for v2 if an implementation says "v3")
87 | - WF: convinced myself I'm worrying too much instead:
88 | - WF: in 18 months how do you know which Zarr is used to open it.
89 | - JM: metadata file is different (essentially the magic number). The proposal for `.zr3` was currently turned down.
90 | - SV: [data type naming](https://github.com/zarr-developers/zarr-specs/pull/149#discussion_r929140806)
91 | - JM: dropping the python-ness
92 | - JMS: helps provide a more natural scheme for some datatypes (and endianness as a codec)
93 | - no argument against (just "convenient in Python")
94 | - JM: will need names
95 | - JMS: in [https://github.com/zarr-developers/zarr-specs/pull/155](https://github.com/zarr-developers/zarr-specs/pull/155)
96 | - DH: netcdf ncchar type equivalent to 8-bit ascii, no equivalent in Zarr. Needed? NC uses it all the time. Why not in numpy?
97 | - JMS: thought numpy has char.
98 | - RA: revisit char question? JMS: different than varstring
99 | - RA: where does the encoding go? DH: in an attribute. "ascii" (or "utf-8")
100 | - RA: used for? DH: you see a lot of flags stored that way.
101 | - also historical: NC-3 didn't have strings of any type. (arrays of chars workaround)
102 | - JM: extension mechanism?
103 | - DH: where the wheel hits the road
104 | - JMS: just metadata?
105 | - RA: disagree, influences ...
106 | - JMS: agreed, but doesn't change how hard it is to implement
107 | - JM: but need to feel confident that they are low cost so we can change *when* we discuss these things
108 | - JMS: will changes start appearing?
109 | - JM: Very soon!
--------------------------------------------------------------------------------
/meetings/2022/2022-09-22.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 22nd September
4 | description: ZEPs Meeting Notes for 2022-09-22
5 | grand_parent: ZEP meetings
6 | parent: 2022 meetings
7 | nav_order: 2
8 | ---
9 |
10 | # 2022-09-22
11 |
12 | **Attending:** Ward Fisher (WF), Josh Moore (JM), Ryan Abernathey (RA), Jeremy Maitin-Shepard (JMS), Dennis Heimbigner (DH)
13 |
14 | ## TL;DR
15 |
16 | Consolidated metadata needs an extension for V3, which might result in a new ZEP. Next, JMS shared a document titled ‘Optionally-cooperative distributed b-tree for Tensorstore’, which the participants then discussed. After that, JM initiated the discussion on codecs-registry, which was built by one of the GSoC students this summer. The meeting ended with a discussion on the path to the metadata files.
17 |
18 | **Meeting Minutes:**
19 |
20 | - Java/NetCDF side:
21 | - JM: Sanket met people
22 | - WF: Unidata should be 3x the staff.
23 | - JM: perhaps starting with a kerchunk implementation?
24 | - WF: looking for more community involvement (like netcdf-c had)
25 | - JM: Greg mentioned consolidated metadata needs an extension for V3
26 | - RA: Iceberg issue, also see JMS' proposal
27 | - [https://github.com/zarr-developers/zarr-specs/issues/154](https://github.com/zarr-developers/zarr-specs/issues/154)
28 | - JMS: touches on not needing a file per chunk (like discussed last night)
29 | - [https://docs.google.com/document/d/1PLfyjtCnfJRr-zcWSxKy-gxgHJHSZvJ2y4C3JEHRwkQ/edit?resourcekey=0-o0JdDnC44cJ0FfT8K6U2pw#heading=h.8g8ih69qb0v](https://docs.google.com/document/d/1PLfyjtCnfJRr-zcWSxKy-gxgHJHSZvJ2y4C3JEHRwkQ/edit?resourcekey=0-o0JdDnC44cJ0FfT8K6U2pw#heading=h.8g8ih69qb0v)
30 | - db format that stores a btree.
31 | - uniquely: designed to allow distributed writes (s3, etc.) *but* doesn't need a persistent database
32 | - can also read it in a non-distributed fashion
33 | - downside: adds quite a bit of added complexity (greatly for binary format)
34 | - also good where sharding isn't appropriate (e.g. pre-defined shard size which is required for write)
35 | - e.g. large number of small arrays (where sharding won't help)
36 | - RA: nice document. comments:
37 | - focused on big distributed writes, but with iceberg had a different main motivation: more flexibility in mapping keys to chunks. kerchunk-like. virtual concatenate. can you reference random chunks? yes.
38 | - JMS: btree nodes have references to files (like kerchunk). but datafiles are identified with 128-bit path (not an fsspec URL)
39 | - RA: different use case, so can have them be optional transformers/extensions
40 | - RA: really similar to tiledb! why not use it?
41 | - JMS: tiledb is organized by time not space.
42 | - JM: need a compaction
43 | - JMS: and even after that you still have a million files.
44 | - DH: HDF5? internally it's btrees. (which is responsible for most of its complexity). Are you sure this is the path?
45 | - JMS: not sure there's an alternative to btrees. used in databases, filesystems, etc.
46 | - DH: if you don't want some ordered searches, then linear hashes are an alternative
47 | - JMS: ordered is useful for a lot of use cases. but there wasn't an obvious solution for distributed writes
48 | - DH: [extendable hashing](https://en.wikipedia.org/wiki/Extendible_hashing) is an easier data structure (old paper) works well with disk storage.
49 | - JMS: think this is more a key-value store (like zip)
50 | - RA: agreed. Nice that it's possible to experiment like this.
51 | - RA: can the V3 spec support this experimentation? (right extension points?)
52 | - RA: trying to do that with Iceberg. Martin suggested "IceChunk".
53 | - See also: hooty and others. Lots of smart ideas that we can copy.
54 | - Goal is to provide some level of branching & transactions for/on a Zarr store
55 | - Allow you to work on your staged area which all get written at once.
56 | - Branch non-destructively (or rollback)
57 | - The key is having a "manifest" (they all have some concept of that, even kerchunk)
58 | - Don't depend on the object stores listing as the source of truth
59 | - Need storage transformers at the top level, not array. But for JMS' idea array-level might suffice.
60 | - JMS: wasn't planning on an extension. root metadata would be in the same data store.
61 | - JM: basically writing DB/filesystem :+1: ZarrFS ;)
62 | - JMS: planning on mongo? Yeah, or Dynamo. (They store JSON)
63 | - JSON in S3 isn't ideal.
64 | - metadata in document store and chunks on disk. Beyond just filesystem. It's a data lake.
65 | - "meta-store"
66 | - JMS: regarding versioning, how are you representing the delta?
67 | - The chunk is the minimal writable unit. (out-of-scope)
68 | - Every chunk write is a uniquely ID'd (e.g. content addressable). That gets a key. Write that to DB.
69 | - JMS: expecting the database to provide the versioning?
70 | - RA: no, just a place for documents. versioning (in iceberg) has a branch or a tag that points to a specific chunk manifest. you can create a new one and point your HEAD at that. only rely on database to atomically change the references. iceberg tracks a number for the transaction.
71 | - JMS: use kerchunk model? limitation on the number of chunks?
72 | - RA: chunks are likely in a separate manifest. discussed that another extension with Martin.
73 | - RA: but can just query a chunk from the database.
74 | - JMS: 1M chunks in v1. then update to v2. What's the diff? A copy.
75 | - RA: yeah need to play with it.
76 | - JMS: when you get to wanting to update just a portion of it, then you get to b-trees :smile:
77 | - RA: no db guys, trying to keep it hackable.
78 | - RA: but megabyte kerchunk is already getting :heart: since it's so easy. looking for incremental improvement on _that_. (NASA will be pumping out GRIB forever...)
79 | - JMS: looking forward to hearing more and exchanging info re: b-trees
80 | - JMS: see also [https://github.com/janelia-flyem/dvid](https://github.com/janelia-flyem/dvid) (backed by KV database)
81 | - JM: sharing layers with them?
82 | - JMS: complicated by other priorities of the EM team. invite Bill to the Zarr meetings?
83 | - RA: see [https://lakefs.io/](https://lakefs.io/)
84 | - JM: API versus format
85 | - RA: thinking about it more like an API
86 | - JM: briefly codecs-registry
87 | - [https://zarr.dev/codecs-registry/](https://zarr.dev/codecs-registry/)
88 | - [https://github.com/zarr-developers/codecs-registry](https://github.com/zarr-developers/codecs-registry)
89 | - JMS: still want a schema per codec. JM: agreed!
90 | - JMS: talks about codecs having URLs.
91 | - would be an annoyance to have different V2 and V3 identifiers.
92 | - e.g. just numeric constants in the JSON that are from the C API
93 | - e.g. shuffle parameter which would be nicer as a string.
94 | - support integer or string for a while (in order to deprecate)
95 | - JM: have plans to have code in each language that checks for an id from the central registry
96 | - DH: approx. that with nczarr. ncdump lists the actual codecs in the file
97 | - would be good to have something more sophisticated
98 | - have the disadvantage of C code and interpreted files
99 | - 3 repositories on the C side. unidata + irvine + hdf5
100 | - hdf5 only has names, hdf5-ids and a pointer (which is often out of date)
101 | - something universal would be nice
102 | - WF: roping in the HDF5 group would be a heavy lift
103 | - JMS: **URL interface** :rocket:
104 | - DH: :+1: for the REST API
105 | - WF: NSF/CSSI solicitation has opened
106 | - [https://beta.nsf.gov/funding/opportunities/cyberinfrastructure-sustained-scientific-innovation-cssi](https://beta.nsf.gov/funding/opportunities/cyberinfrastructure-sustained-scientific-innovation-cssi)
107 | - perhaps something here
108 | - WF: planning on getting to [https://www.egu23.eu/](https://www.egu23.eu/)
109 | - tweet something from zarr_dev to see if there is interest :question:
110 | - could collaborate something re: nc/zarr
111 | - JMS: don't have clear resolution on the paths to the metadata files
112 | - JM: re-capped the previous discussion and think it's still good.
113 | - JMS: some details around the root array (the named files, etc.)
114 | - JMS: consolidated metadata? duplicated?
115 | - JM: would make it possible to have everything in the top-level
116 | - JMS: pointers in the subdirectories? bit annoying.
117 | - JMS: with iceberg & co. you likely don't need a consolidated metadata
118 | - JM: so you'd push it to the store level?
119 | - JMS: possibly, but not that simple
120 | - JMS: there are cases where you need path separation anyway (Zips)
121 | - JMS: so could see using a path separation strategy entirely
122 | - JMS: Davis did have a use case ...
123 | - (...details zip, consolidated brainstorming...)
124 | - JM: need both solutions...
--------------------------------------------------------------------------------
/meetings/2022/2022-10-06.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 6th October
4 | description: ZEPs Meeting Notes for 2022-10-06
5 | grand_parent: ZEP meetings
6 | parent: 2022 meetings
7 | nav_order: 3
8 | ---
9 |
10 | # 2022-10-06
11 |
12 | **Attending:** Ward Fisher (WF), Josh Moore (JM), Jeremy Maitin-Shepard (JMS), Greg Lee (GL), Jonathan Striebel (JS)
13 |
14 | ## TL;DR
15 |
16 | JM shared that there were some good conversations around OME-Zarr yesterday. The summary is available [here](https://forum.image.sc/t/ome-ngff-community-call-transforms-and-tables/71792/10). WF shared that Kitware is looking for partners and a link to the sign-up form. GL shared that during the CZI Open-Science Summit 2022, he worked on writing tests for Xarray. After this, there was an extensive discussion on URL syntax initiated by JMS.
17 |
18 | **Updates:**
19 |
20 | * miscellaneous reading before the meeting (JM)
21 | - [https://arrow.apache.org/blog/2022/10/05/arrow-parquet-encoding-part-1/](https://arrow.apache.org/blog/2022/10/05/arrow-parquet-encoding-part-1/)
22 | - [https://github.com/kaitai-io/kaitai_struct/issues/125](https://github.com/kaitai-io/kaitai_struct/issues/125)
23 | * NGFF (JM)
24 | - [https://forum.image.sc/t/ome-ngff-community-call-transforms-and-tables/71792/10](https://forum.image.sc/t/ome-ngff-community-call-transforms-and-tables/71792/10)
25 | - Good conversations around OME-Zarr yesterday
26 | * Enthusiasm for Kitware (WF)
27 | - Looking for partners. [Have form on webpage](https://www.kitware.com/contact/project/).
28 | - Unidata an option. They've mentioned Zarr a couple of times (Kitware Blog).
29 | * xarray test (GL)
30 | - during czi conference.
31 | - release of 2.13 hopefully fixed it all :tada:
32 |
33 | **Meeting Minutes:**
34 |
35 | * URL syntax? (JMS)
36 | - helps to figure out the metadata location.
37 | - Josh: great idea. have several ongoing discussions at the NGFF level
38 | - current proposal would be to support URIs internally (relative, absolute, remote)
39 | - however, in V2
40 | - JMS: in v3 the root exists
41 | - though not entirely clear that the new metadata organization is necessary
42 | - designed for S3 where there's no directory, but other problems exist
43 | - Josh: _summarized previous discussions for Greg_
44 | - GL, thoughts on the V3 situation?
45 | - GL: at the moment, you need helper methods to do that.
46 | - JM: one proposal was to have the metadata be the main directory which lets you then bootstrap the chunk loading
47 | - JMS: support multiple?
48 | - JM: conceivably. as extension or configuration.
49 | - JM: downside for consolidated metadata is that nothing exists in the metadata hierarchy
50 | - workaround of having a thin-hierarchy only with references to where the metadata exists
51 | - JS: losing the ability to nest any hierarchy. (everything is a root)
52 | - JM: are we proposing rolling it back completely
53 | - JMS: problem is the URI+rootpath metadata
54 | - JS: walking up the hierarchy would be an option (URL doesn't actually point)
55 | - JMS: would be nicer if you don't have to perform a search
56 | - Use case
57 | - URL case
58 | - Desktop double click on something
59 | - Similar issue: **Zips** :warning:
60 | - JMS: have an additional level
61 | - JM: except ZipStore v2 assumes the whole zip is a zgroup
62 | - JS: propose zip is a special case which is _easier_
63 | - JMS: unless you are mixing volumetric with a zarr then it wouldn't be at the top-level
64 | - Btree (JMS): need to be able to compose multiple layers (similar to fsspec and double colons)
65 | - Remote chunk store (or point to V2 chunks)
66 | - Renaming folders (keep data with arrays)
67 | - Options
68 | - Keep "/meta", clients must know
69 | - Drop "/meta", direct URLs
70 | - `?param` syntax
71 | - `#param` syntax
72 | - Separator syntax (e.g. "`//`")
73 | - root dir ends in .zarr
74 | - fsspec `::` separator
75 | - multiple protocols (git+ssh, zip+zarr)
76 | - further discussion
77 | - JS: without /meta and .zarr requirement, you still don't know where the root is
78 | - JS: if you drop "/meta" then you can't name anything "/data"
79 | - JMS: could use something more obfuscated
80 | - JS: why split?
81 | - JMS: if you are not using the filesystem (s3 or gcs) and you want to list all the metadata, it's not (as) efficient
82 | - JM: "data" could be registered in the metadata so it's a known (and configurable) thing
83 | - WF: NC anything with leading underscore is assumed reserved for the library
84 | - permitted to create them, but the spec says "please don't"
85 | - JM: `.z` prefix
86 | - WF: utilities and tools can scrape everything with that
87 | - WF: also don't have to put too much thought into new features
88 | - JMS: would prefer not a `.` prefix because of archiving tools, etc.
89 | - then `_z`?
90 | - JMS: root metadata file doesn't really do anything
91 | - JS: creates ambiguity
92 | - JM: think it was largely for bootstrapping global plugins (e.g. transformers)
93 | - JS: perhaps V2 compatibility
94 | - JMS: not clear you would nest sharding with other transformers. it would be the thing applied to the chunks.
95 | - JS: the metadata needs to be somewhere, and for that can be at the array level
96 | - brief summary
97 | - zarrs are essentially a metadata hierarchy
98 | - that configure (possibly remote) chunk stores
99 | - and the root is identified with .zarr
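A minimal sketch of the "root is identified with .zarr" idea from the brief summary, assuming the root is the right-most URL path component whose name ends in `.zarr`. The function name `split_zarr_url` and the example URL are hypothetical and not part of any Zarr library. The split is purely lexical, so no requests to the store (and no search) are needed:

```python
from pathlib import PurePosixPath
from typing import Optional, Tuple
from urllib.parse import urlsplit


def split_zarr_url(url: str) -> Optional[Tuple[str, str]]:
    """Split a URL into (root URL, path inside the hierarchy).

    Assumption: the root is the right-most path component ending in ".zarr".
    Returns None when no component carries that suffix.
    """
    parts = urlsplit(url)
    components = PurePosixPath(parts.path).parts
    # Scan from the right so the innermost ".zarr" directory wins.
    for i in range(len(components) - 1, -1, -1):
        if components[i].endswith(".zarr"):
            root_path = str(PurePosixPath(*components[: i + 1]))
            inner = "/".join(components[i + 1:])
            return f"{parts.scheme}://{parts.netloc}{root_path}", inner
    return None


result = split_zarr_url("https://example.org/data/sample.zarr/group/array")
```

Without the `.zarr` requirement (or a known `/meta` prefix), the same URL gives no lexical hint of where the root is, which is the ambiguity JS raised above.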
--------------------------------------------------------------------------------
/meetings/2022/2022-10-20.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 20th October
4 | description: ZEPs Meeting Notes for 2022-10-20
5 | grand_parent: ZEP meetings
6 | parent: 2022 meetings
7 | nav_order: 4
8 | ---
9 |
10 | # 2022-10-20
11 |
12 | **Attending:** Ward Fisher (WF), John Kirkham (JK), Jeremy Maitin-Shepard (JMS)
13 |
14 | ## TL;DR
15 |
16 | WF is working on the maintenance NetCDF release candidate (`v4.9.1-rc1`), and JMS added CMake support to TensorStore. After this, JMS initiated a discussion on path structure, which took up the rest of the meeting.
17 |
18 | **Updates:**
19 |
20 | - (WF) Working on maintenance netcdf release candidate (`v4.9.1-rc1`). No new features, just bug fixes and improvements.
21 | - (JMS) Added CMake support to TensorStore
22 | - Discussion about CMake, dependency management
23 | - https://cmake.org/cmake/help/book/mastering-cmake/chapter/CDash.html
24 | - https://github.com/cpm-cmake/CPM.cmake
25 |
26 | **Meeting Minutes:**
27 |
28 | * (JMS) Path structure
29 | * Require or encourage root directory to end in .zarr
30 | * How to name all the metadata files?
31 | * Root metadata could contain extension information
32 | * (JK) Mentioned `.zmeta` metadata file with paths to metadata file
33 | * (JMS) About listing
34 | * (WF) Possible issues with writing
35 | * (WF) Spec vs. library tension
36 | * (JK) Have file expire?
37 | * (JMS) Handle as read-only
38 | * (JK) Could also delete as part of writing?
39 | * (JMS) HDF5 has a hierarchy and Zarr replicates this
40 | * Have some array and non-array data next to each other
41 | * (JK) Examples?
42 | * (JMS) Segmentations & mesh representations
43 | * (JMS) Collection of volumes with annotations related to them
44 | * (WF) Have Zarr hierarchy with non-Zarr?
45 | * (JMS) Only have single individual arrays
46 | * (WF) Wouldn't have considered this structure
47 | * (WF) Does there need to be something in the spec about interleaving data?
48 | * (WF) Maybe interleaving poses some challenges
49 | * (JMS) Doesn't NetCDF have extra files as well?
50 | * (WF) Yes. Extra metadata used to map Zarr model to NetCDF model.
51 | * (JMS) Reason to use this structure as opposed to Zarr metadata files?
52 | * (WF) NetCDF supports different formats HDF5, Zarr, etc.
53 | * (JMS) Have user defined attributes. Types are stored in metadata file? Could those be in zattrs?
54 | * (WF) Yes. Not sure
55 | * (JMS) Hierarchy becomes more apparent with V3 as opposed to V2
56 | * (WF) Groups were a new feature that users were slow to pick up on
57 | * (JK) Does adding more top-level metadata cause issues?
58 | * (JMS) Could it contain the metadata?
59 | * (WF) Maybe include subset of metadata
60 | * (WF/JMS) Perhaps special case single array use case
61 | * (JK) How does data relate in non-hierarchical form
62 | * (JMS) Related, but not all Zarr data
63 | * (JK) Would other kinds of chunk formats (standardizing on kerchunk) be useful
64 | * (JMS) Meshes probably don't make sense in this way
65 | * (JMS) Neuroglancer meshes are a good example
66 | * (JMS) Sparse arrays seem similar in that they might be better handled by being their own file format
67 | * (WF) NetCDF users mention performance issues in moving to new version. Usually suggest using old NetCDF. Maybe same with V2/V3?
68 | * (JMS) Want to use V3 (sharding being of value).
69 | * (JK) Including unstructured binary blobs in Zarr?
70 | * (JMS) Has a group of files for mesh
71 | * (JK) Maybe ignore specific paths?
72 | * (WF) Having mixed media is valuable though can be logistically tricky
73 | * (WF) What defines Zarr as a data model? At least need to say some behavior is undefined (mixed media). Ideally ignores mixed media files.
--------------------------------------------------------------------------------
/meetings/2022/2022-11-03.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 3rd November
4 | description: ZEPs Meeting Notes for 2022-11-03
5 | grand_parent: ZEP meetings
6 | parent: 2022 meetings
7 | nav_order: 5
8 | ---
9 |
10 | # 2022-11-03
11 |
12 | **Attending:** Josh Moore (JM), Jonathan Striebel (JS), Jeremy Maitin-Shepard (JMS), Sanket Verma (SV)
13 |
14 | ## TL;DR
15 |
16 | Discussions were held on how to move forward with ZEP1 quickly. The summary can be viewed [here](https://github.com/zarr-developers/zarr-specs/pull/149#issuecomment-1302440391). Then the attendees discussed extensions in V3, and JMS is considering trying with non-zero origin. SV joined the meeting after 30 mins. After that, JS mentioned some high-level issues looming around V3 spec.
17 |
18 | **Meeting Minutes:**
19 |
20 | * JMS: number of PRs that could be merged into the working draft
21 | - JS: don't want to just close it
22 | - JM: can we cross link e.g. JMS' PR? Yes.
23 | - ==> once all cross-linked close PR.
24 | * JS: when to merge?
25 | - JM: when it matches the consensus?
26 | - JS: ok, but don't have merge rights.
27 | - ==> Let's merge proactively.
28 | * see: https://github.com/zarr-developers/zarr-specs/pull/149#issuecomment-1302440391
29 | * extensions
30 | - JMS: thinking of trying with non-0-origin
31 | - JM: think that's a general principle we should try for all issues/PR is "could it be an extension"
32 | - JMS: thinking of extensions as plugins? Not exactly.
33 | - JS: how to influence if an implementation adopts an extension? if there's a concrete implementation / clear interface
34 | - JMS: agreed and some obvious ones (codecs) but not clear there will be a broader abstraction
35 | - JS: "index transformer" _perhaps_
36 | - or as transformer _if_ multiple of chunking
37 | - JMS: unfortunate limitation
38 | - JMS: re: transformers - it doesn't make sense to compose a different storage transform _before_ sharding
39 | - JS: depends. cache of chunks or shards? also checksum
40 | - JM: codec is similar
41 | - JMS: caching enabled in code, but not in zarr metadata
42 | - JS: that's in spec, yes. "runtime-only" but still before or after
43 | - JMS: when implementing sharding, would check if it's first
44 | - want to be able to tell the user "this is the granularity to write"
45 | - JS: good that it's flexible. like c/f order.
46 | - JS: mention in implementation "sharding must be first"
47 | - JMS: composing makes for useful extension point
48 | - JS: most important point: are we sure enough about the extension points?
49 | - _Sanket joins_ 🧑🏻💻
50 | - Jonathan: high-level issues looming
51 | - paths/URL discussion (needs an issue)
52 | - global transformers
53 | - variable chunk length (possibly origin offset)
54 | - indexing more abstract
55 | - upgrade path! (`{"extension": ["@v2-layout"]}`)
--------------------------------------------------------------------------------
/meetings/2022/2022-11-17.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 17th November
4 | description: ZEPs Meeting Notes for 2022-11-17
5 | grand_parent: ZEP meetings
6 | parent: 2022 meetings
7 | nav_order: 6
8 | ---
9 |
10 | # 2022-11-17
11 |
12 | **Attending:** Sanket Verma (SV)...Jonathan Striebel (JS), Ryan Abernathey (RA), Ward Fisher (WF)
13 |
14 | ## TL;DR:
15 |
16 | Apparently there was a snafu where JS, RA and WF joined a Zoom meeting whereas SV joined another one! 🥲
17 |
18 | In the meeting there was a discussion on the V3 spec and some of its missing parts. Also, RA opened a PR on global transformers, which can be seen [here](https://github.com/zarr-developers/zarr-specs/pull/182).
19 |
20 | **Updates:**
21 |
22 | - ZEP1 Update, see [here](https://gitter.im/zarr-developers/community?at=6374fae6f9491f62c9b7ea61)
23 | - Check out the ZEP1 GH Project board [here](https://github.com/orgs/zarr-developers/projects/2/views/2); maintained by Jonathan Striebel
24 |
25 | **Meeting minutes:**
26 |
27 | Same as TL;DR. 👆🏻
28 |
--------------------------------------------------------------------------------
/meetings/2022/2022-12-01.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 1st December
4 | description: ZEPs Meeting Notes for 2022-12-01
5 | grand_parent: ZEP meetings
6 | parent: 2022 meetings
7 | nav_order: 7
8 | ---
9 |
10 | # 2022-12-01
11 |
12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Jonathan Striebel (JS), Jeremy Maitin-Shepard (JMS), Ward Fisher (WF), Dennis Heimbigner (DH), Ryan Abernathey (RA)
13 |
14 | ## TL;DR:
15 |
16 | RA started a discussion on dropping the `/meta` prefix. See the issue [here](https://github.com/zarr-developers/zarr-specs/issues/177), which led to a chain reaction of conversations on related topics. These discussions mostly concern lingering issues around the finalisation of the Zarr V3 spec.
17 |
18 | RA, JMS and JS took some action items which can be seen at the bottom
19 |
20 | **Updates:**
21 |
22 | - Conversations (issues and feedback) on ZEP1 [PR](https://github.com/zarr-developers/zarr-specs/pull/149) are now resolved. Check [this](https://github.com/zarr-developers/zarr-specs/pull/149#issuecomment-1327605570). Thanks to Jonathan Striebel! 🙌🏻
23 | - The conversations which need additional input have been moved to separate issues
24 | - Jeremy Maitin-Shepard promoted as one of the authors for [ZEP0001](https://zarr.dev/zeps/draft/ZEP0001.html)
25 | - Current status of ZEP1 can be viewed [here](https://github.com/orgs/zarr-developers/projects/2)
26 |
27 | **Meeting minutes:**
28 |
29 | - RA: suggest focusing on the meta/ prefix discussion
30 | - [https://github.com/zarr-developers/zarr-specs/issues/177](https://github.com/zarr-developers/zarr-specs/issues/177)
31 | - JMS: not sure it's solving a problem (optimally). nice feature of v2 is copying out an array
32 | - JS: was for performance, use exclusion mechanism
33 | - RA: never need to list chunks (even if implementations do...)
34 | - NB: don't like trying to open files to know things (404-based)
35 | - JM: so we all agree? Yes. But what's the default?
36 | - RA: suggest: drop meta, use .json on the array
37 | - Can then drop the root metadata?
38 | - DH: there is dataset-level metadata (superblock)
39 | - JS: discuss those separately?
40 | - Agreed
41 | - JS: so to that suggestion, how do you list all metadata?
42 | - RA: don't think we should plan for discovering all metadata (millions of arrays)
43 | - JS:
44 | - RA: listing recursively isn't ok?
45 | - JS: not with implicit groups
46 | - RA: use storage transformer to get the previous behavior
47 | - RA: data is out there so need to provide a mechanism
48 | - DH: don't think that's fair
49 | - RA: nice feature of this proposal if we could keep it.
50 | - JM: how is consolidated metadata related?
51 | - RA: that's another problem.
52 | - RA: had thought about explicitly listing the children (stac catalog)
53 | - DH: nczarr does that as well.
54 | - RA: downside is the concurrency issue
55 | - JS: good extension for groups (listing children)
56 | - JS: but could also have consolidated per group
57 | - JM: different communities here. some are definitely asking for listability
58 | - DH: not lots of formats that are listable without tools. They are asking for something powerful.
59 | - RA: so that's the root feature of the separate hierarchies
60 | - RA: we should look at some data. offer to write a script
61 | - JS one other alternative: chunks in an extra subfolder
62 | - JMS: k/v versus directory based will have different performance behavior
63 | - RA: does this require us to give up on implicit groups?
64 | - summary: foo/array.json and foo/chunks/0/0
65 | - JMS: re: dropping of root metadata
66 | - solution is perhaps storage transformer would need to write something in _array_ metadata
67 | - JS: but then everything is in the array metadata and you'd need to be able to read it
68 | - JMS: don't _always_ need to access the metadata directly. more like a safety measure.
69 | - JMS: could copy "global" metadata into each array
70 | - JM: that works to the design goal of being able to freely copy an array
71 | - DH: that assumes no global extensions.
72 | - RA: need to think through this separately:
73 | - portability ("invariance" of v2 that any group is standalone)
74 | - RA: danger is that you can build zarrs through the command-line (need no library)
75 | - DH: sure you want to do that?
76 | - RA: it was at least part of the design of V2
77 | - DH: as a principle, if that's what you want it's really critical to the v3 process
78 | - DH: cf. intellij -- fiddling in text file then going back (bypassing the tool)
79 | - JM: perhaps CLI manipulations are inherently "extension-less" and therefore this is "safe"
80 | - DH: one tool being a verification tool?
81 | - JS: consolidated metadata as an example
82 | - RA: primarily used as a way to allow easy listing
83 | - but you know that you can't touch the store.
84 | - JS: you need a root to do fancy things
85 | - **action items:**
86 | - RA to do some performance benchmarking
87 | - JMS to propose a new storage layout for v3
88 | - next time: root metadata discussion issue (JS)
89 |
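A minimal sketch of the layout summarised in the minutes as `foo/array.json` and `foo/chunks/0/0`, assuming array metadata lives next to a `chunks/` subfolder and chunk coordinates are joined with `/`. The helper names are hypothetical, and this is only one of the layouts under discussion, not the final V3 layout:

```python
from typing import Sequence


def _join(*parts: str) -> str:
    """Join key segments with "/", skipping empty ones (e.g. a root array)."""
    return "/".join(p for p in parts if p)


def array_metadata_key(array_path: str) -> str:
    """Store key of the metadata document for the array at `array_path`."""
    return _join(array_path.strip("/"), "array.json")


def chunk_key(array_path: str, chunk_index: Sequence[int]) -> str:
    """Store key of one chunk, placed in a `chunks/` subfolder of the array."""
    coords = "/".join(str(i) for i in chunk_index)
    return _join(array_path.strip("/"), "chunks", coords)


meta = array_metadata_key("foo")
chunk = chunk_key("foo", (0, 0))
```

Because every key of an array shares the `foo/` prefix, copying that prefix copies the whole array, which is the V2 property JMS wanted to keep.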
--------------------------------------------------------------------------------
/meetings/2022/2022-12-15.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 15th December
4 | description: ZEPs Meeting Notes for 2022-12-15
5 | grand_parent: ZEP meetings
6 | parent: 2022 meetings
7 | nav_order: 8
8 | ---
9 |
10 | # 2022-12-15
11 |
12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Ward Fisher (WF), Ryan Abernathey (RA), Jonathan Striebel (JS), Jeremy Maitin-Shepard (JMS), John Kirkham (JK)
13 |
14 | ## TL;DR:
15 |
16 | The meeting started with a discussion on some pending issues regarding V3. Then, we opened the [ZEP1 project board](https://github.com/orgs/zarr-developers/projects/2) and went through the issues individually to decide their conclusion. As a result, consensus on some issues was achieved, while others are yet to be discussed in successive ZEP meetings.
17 |
18 | The [ZEP0001](https://zarr.dev/zeps/draft/ZEP0001.html) has gone into feature freeze, as mentioned in the blog post [here](https://zarr.dev/blog/zep1-update/), and from now on, the community, ZSC and ZIC will be working on integrating and resolving existing features and issues, respectively.
19 |
20 | **Meeting minutes:**
21 |
22 | - Discussed with Jonathan on 12/9:
23 | - Adding a `diff` w.r.t. an earlier version of V3
24 | - Include filesystem in ZEP0001
25 | - Sync V3 implementation in `zarr-python` with the recent changes in spec; see - [https://github.com/zarr-developers/zarr-python/issues/1290](https://github.com/zarr-developers/zarr-python/issues/1290)
26 | - [https://github.com/orgs/zarr-developers/projects/2](https://github.com/orgs/zarr-developers/projects/2)
27 | - No issues added after the 19th
28 | - All need to be solved by the vote
29 | - Migrated NaN issue to zarr-specs
30 | - dropping /meta prefix
31 | - RA: make clear (at some spec locations) that iterative listing is necessary
32 | - also make more use of async calls
33 |
--------------------------------------------------------------------------------
/meetings/2022/meeting_notes_2022.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 2022 meetings
4 | description: List of ZEP meeting notes for the year 2022
5 | nav_order: 1
6 | parent: ZEP meetings
7 | has_children: true
8 | permalink: /meetings/2022/
9 | ---
10 |
11 | # ZEP Meeting Notes for 2022
12 |
13 | Shows the list of meeting notes for the year 2022.
14 |
--------------------------------------------------------------------------------
/meetings/2023/2023-01-12.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 12th January
4 | description: ZEPs Meeting Notes for 2023-01-12
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 1
8 | ---
9 |
10 | # 2023-01-12
11 |
12 | **Attending:** Jonathan Striebel (JS), Josh Moore (JM), Jeremy Maitin-Shepard (JMS), Ryan Abernathey (RA), Sanket Verma (SV), Ward Fisher (WF), Dennis Heimbigner (DH)
13 |
14 | ## TL;DR:
15 |
16 | SV started the meeting by asking everyone to review his latest [PR](https://github.com/zarr-developers/zeps/pull/27) in the ZEPs repo, which consolidates the discussion venues for the draft ZEP and makes minor additions to the webpage. After this, the attendees discussed the open issues for V3 (ZEP0001). We discussed what the ideal /meta prefix should be; please see [#177](https://github.com/zarr-developers/zarr-specs/issues/177) for the extensive discussion. Then we moved on to [#192](https://github.com/zarr-developers/zarr-specs/issues/192), which considers removing the entry point metadata.
17 |
18 | **Updates:**
19 |
20 | -
21 | - Any objections?
22 | - No.
23 |
24 | **Meeting Minutes:**
25 |
26 | - ZEP 1 issues that need attention:
27 | - **prefix**:
28 | - open question is the prefix for the chunk directory
29 | - potentially to-be-used for json files, etc.
30 | - also useful for extensions that add new folders
31 | - Dennis: `zarr.chunks`? Good to identify them. (since there are arbitrary delimiters)
32 | - with HDF5 people have experimented with accessing chunks directly, so need easy identification
33 | - Options:
34 | - `_`
35 | - `__`
36 | - `_z_`
37 | - `z.`
38 | - `zarr.`
39 | - `_zarr_`
40 | - Ryan: what's the goal of the prefix?
41 | - JS: preventing node-name collision
42 | - JM: preventing extension collision
43 | - DH: DAP4 rule attempts to have self-assigned namespace, piggybacking on DNS (.ucar.edu)
44 | - RA: work through use cases, e.g. nczarr files
45 | - RA: want the ability to offload metadata into a separate document
46 | - DH: also apply it to keys within the attributes
47 | - RA: see [ZEP4](https://github.com/zarr-developers/zeps/pull/28). special attributes is another discussion.
48 | - WF: A convention that _zarr are reserved is longer, but feels less prone to collision than _z
49 | - JMS: In general I think we want to reserve a prefix such as "_" for zarr itself and extensions, and then perhaps a subset of that should be reserved for just zarr itself (not extensions).
50 | - NB/DH: would suggest a top-level group
51 | - DH: Do you have sufficient metadata?
52 | - What does it mean to access a raw variable?
53 | - JM: there's still a directory. metadata+chunks (but a place to put extension files as well)
54 | - JM: bidirectionality would need some work to make sure that a group doesn't magically appear
55 | - RA: cf. how GDAL and Geo-tiff (etc) handle this
56 | - DH: two purposes of group -- namespace and a place to store attributes that aren't part of the variable
57 | - DH: example is group-level superblock marker
58 | - chat
59 | - RA: I worry that if we make `_` a disallowed prefix, lots of datasets may not work
60 | - RA: I feel like there is plenty of data out there in the wild that has a `_` as the first character in a variable
61 | - RA: Using ‘_’ as a convention in netCDF is a soft limit, not a hard; it’s part of the convention that it’s reserved, but if users disregard that, they can use ‘_’ with their own attributes. Whatever convention we decide upon can be phrased as guidance, without necessarily breaking extant datasets.
62 | - deciding
63 | - JM: ?
64 | - JS: compatibility extension for v2 is valid or not
65 | - JM: could see `zarr.json, zarr.chunks, zarr.extensions`
66 |
67 | - **no root group**:
68 | - explicit groups can have transformers
69 | - do we always need to have a zarr.json for an array
70 | - if an array has none, how do we open it?
71 | - search up the hierarchy? too inefficient?
72 | - JMS: group level transformer then you can't make any assumptions about what's underneath it (e.g. a redirect)
73 | - JMS: without searching, the group with the transform will become a "root"
74 | - RA: need URL syntax for group#path if we have transformers
75 | - JM: still need search up for desktop clients. no URL syntax for searching down
76 | - RA: formalize "**entrypoint**" what can and cannot be opened
77 | - shouldn't be able to open a chunk (and figure out that it's part of an array)
78 | - JS: alternatively, you MUST have a zarr.json
79 | - RA: like that one. for an entry point, there should be a zarr.json
80 | - That should be the **one** entrypoint definition.
81 | - DH: defines a leaf, everything below doesn't exist externally.
82 | - RA: would help to look at hierarchies. (too abstract)
83 | - DH: a bit like posix mountpoints? driver is responsible for interpretation
84 |
85 | - URL syntax:
86 | - little bit via the previous conversation
87 |
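A minimal sketch of the reserved-prefix idea from the **prefix** discussion, assuming `zarr.` is the chosen prefix. The notes list several candidates without settling on one, so the constant and both function names are hypothetical. Following the netCDF-style "soft limit" mentioned in the chat, the default is to warn rather than reject:

```python
import warnings

# Hypothetical choice; candidates in the notes were
# `_`, `__`, `_z_`, `z.`, `zarr.` and `_zarr_`.
RESERVED_PREFIX = "zarr."


def is_reserved(name: str, prefix: str = RESERVED_PREFIX) -> bool:
    """True if a node name falls in the namespace reserved for Zarr itself."""
    return name.startswith(prefix)


def check_node_name(name: str, strict: bool = False) -> bool:
    """Return True if `name` avoids the reserved prefix.

    Soft limit by default (warn and return False); with strict=True the
    name is rejected with a ValueError instead.
    """
    if not is_reserved(name):
        return True
    message = f"node name {name!r} uses the reserved prefix {RESERVED_PREFIX!r}"
    if strict:
        raise ValueError(message)
    warnings.warn(message)
    return False
```

A soft limit keeps existing datasets that already use the prefix readable, which was RA's concern about data in the wild.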
--------------------------------------------------------------------------------
/meetings/2023/2023-01-26.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 26th January
4 | description: ZEPs Meeting Notes for 2023-01-26
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 2
8 | ---
9 |
10 | # 2023-01-26
11 |
12 | **Attending:** Jonathan Striebel (JS), Ward Fisher (WF), Josh Moore (JM), Sanket Verma (SV), Jeremy Maitin-Shepard (JMS)
13 |
14 | ## TL;DR:
15 |
16 | SV announced that we would have weekly ZEP meetings instead of a bi-weekly routine to finalise the open issues for V3 (ZEP0001). Then we discussed the timeline to finish the ZEP1 and the possible timeframe required for the ZIC to vote on the same. After this, we touched on [#177](https://github.com/zarr-developers/zarr-specs/issues/177) and [#56](https://github.com/zarr-developers/zarr-specs/issues/56).
17 |
18 | **Updates:**
19 |
20 | - Weekly ZEP meetings until March, 2023
21 |
22 | **Meeting Minutes:**
23 |
24 | - JS: Timeline
25 | - (in discussion and TODO)
26 | - Not voting by end of January
27 | - More realistic? End of February
28 | - Josh: agreed with handover e.g. end of February. (can be more active in March)
29 | - JS: How long for the review?
30 | - SV: 1 month?
31 | - nodding...
32 | - [Prefix](https://github.com/zarr-developers/zarr-specs/issues/177)
33 | - Underscores and escaping. (needs to happen in group)
34 | - [unicode](https://github.com/zarr-developers/zarr-specs/issues/56)
35 | - allowed: +1
36 | - recommended set of characters (lower case, digits, hyphens)
37 | - normalization?
38 | - filesystem does normalization on matching
39 | - online there's no normalization
40 | - default: we do nothing
41 |
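A minimal sketch of why Unicode normalization matters for node names, using only Python's standard `unicodedata` module. The example name is made up. Two strings that render identically can differ as raw store keys unless both are normalized:

```python
import unicodedata

composed = "caf\u00e9"     # "é" as one code point (NFC form)
decomposed = "cafe\u0301"  # "e" followed by a combining acute accent (NFD form)

# As raw keys (e.g. in an object store) the two names are different.
equal_as_raw_keys = composed == decomposed

# After normalizing both to NFC they compare equal.
equal_after_nfc = (
    unicodedata.normalize("NFC", composed)
    == unicodedata.normalize("NFC", decomposed)
)
```

With the "we do nothing" default noted above, the two spellings would be distinct keys in a store that does no normalization, while a filesystem that normalizes on matching may treat them as the same name.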
--------------------------------------------------------------------------------
/meetings/2023/2023-02-02.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 2nd February
4 | description: ZEPs Meeting Notes for 2023-02-02
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 3
8 | ---
9 |
10 | # 2023-02-02
11 |
12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Ryan Abernathey (RA), Jonathan Striebel (JS), Jeremy Maitin-Shepard (JMS)
13 |
14 | ## TL;DR:
15 |
16 | This is the first Special ZEP Meeting, as announced during the last call. We extensively discussed various open issues for V3 on the GitHub project board. In between various V3 discussions, SV popped the question of whether there is an R implementation of Zarr. Also, RA tested the newly developed Sharding feature, which can be seen [here](https://github.com/zarr-developers/zarr-python/discussions/1338).
17 |
18 | **Meeting minutes:**
19 |
20 | - SV: Is there R implementation of Zarr?
21 | - JM: Only [Rarr](https://github.com/grimbough/Rarr) (with active development)
22 | - RA: A Rust implementation is a good place to put our efforts; it would be a good binary implementation that would be useful for communities using other languages
23 | - RA: Took sharding for the test-drive
24 | - RA: Storage transformers don't have `get_items` and `set_items`
25 | - Is a good thing
26 | - JS: Does have a partial values and it could cover keys
27 | - Martin thinks API is not clean now
28 |
--------------------------------------------------------------------------------
/meetings/2023/2023-02-09.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 9th February
4 | description: ZEPs Meeting Notes for 2023-02-09
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 4
8 | ---
9 |
10 | # 2023-02-09
11 |
12 | **Attending:** Sanket Verma (SV), Ward Fisher (WF), Isaac Virshup (IV), Virginia Scarlett (VS), Jonathan Striebel (JS), Jeremy Maitin-Shepard (JMS)
13 |
14 | ## TL;DR:
15 |
16 | The meeting was solely focused on discussing open issues related to ZEP1. After the discussion, IV proposed his idea for a variable-length string, which could be a potential ZEP.
17 |
18 | **Meeting minutes:**
19 |
20 | - Discussion on V3 issues - check issues @ [GitHub project board](https://github.com/orgs/zarr-developers/projects/2)
21 | - IV: Strings + variable length binary – New ZEP?
22 |
--------------------------------------------------------------------------------
/meetings/2023/2023-02-16.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 16th February
4 | description: ZEPs Meeting Notes for 2023-02-16
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 5
8 | ---
9 |
10 | # 2023-02-16
11 |
12 | **Attending:** Sanket Verma (SV), Dieu My Nguyen (DMN), John A. Kirkham (JK), Hailiang Zhang (HZ), Johana Chazaro (JC), Jeremy Maitin-Shepard (JMS), Akshay Subramaniam (AS)
13 |
14 | ## TL;DR:
15 |
16 | Jonathan Striebel laid out some discussion points to look at before the meeting. Unfortunately, he couldn’t make it to the meeting, and we decided to work on those points asynchronously via GitHub. Then SV asked everyone for their thoughts on [checksums](https://github.com/zarr-developers/zarr-specs/pull/152#issuecomment-1412688953) for the shard. Finally, HZ gave a summary of his newly submitted ZEP, which can be seen [here](https://zarr.dev/zeps/draft/ZEP0005.html).
17 |
18 | **Points of discussions:**
19 |
20 | ZEP 1:
21 | - Anything missing for ?
22 | - Change global storage transformer PR to group storage transformers:
23 |
24 | - Should we update or remove the "Storage – Operations" section?
25 |
26 | - ZEP 1 needs updates:
27 | - URL to groups and arrays:
28 |
29 | - Prepare mail for the councils for the vote
30 |
31 | **Meeting Minutes:**
32 |
33 | - SV: Your thoughts on checksum for shards? Check the discussion [here](https://github.com/zarr-developers/zarr-specs/pull/152#issuecomment-1412688953)
34 | - AS: Not really thought about the ZEP extension but it could be!
35 | - AS: Want to support applications that don’t support Zarr - would be nice to support shard - send shard over the network and decompress it over at the other end
36 | - AS: KwickIO doesn’t do compression - would be nice to support this
37 | - JMS: There’s some tension with the Zarr model - shard has data and metadata - maybe duplicate the metadata and add info over to it
38 | - AS: checksum issue is not critical - some more metadata to shard - applications in genomics and geospatial data - having number of chunks would help - applications have the context for unpacking
39 | - AS: Zarr can have wrapper which can put data in the right place
40 | - JMS: having a container of the string is the abstract of what we want
41 | - JK: Is it dependent on the data? - The compressor and the chunks and shard
42 | - AS: what compressor works with what data is a subjective choice
43 | - JK: Compressor could have branching logic?
44 | - AS: Could be logical to create a new compressor
45 | - JK: Branching logic would change for different datasets?
46 | - AS: It could.
47 | - SV: Dataset is public? Or can be made public?
48 | - AS: We can make it public - I’ll look into it
49 | - HZ: Brief summary of [ZEP0005](https://github.com/zarr-developers/zarr-specs/pull/205) GES DISC is looking for averaging the chunks - cost is high - introduce the algorithm - make overhead to be small - regional data is 1 TB the extra overhead after accumulation would be ~5% - it is working really well - big improvement in performance - very accurate - Ryan suggested to add it as extension during last year ESIP meeting
50 | - SV: Maybe add benchmarks?
51 | - HZ: We can do that!
52 | - JMS: Question on the specs PR
53 | - HZ: User can specify any random range - corner would aggregate the chunk value - loading single chunk is easy - this single chunk would contain the aggregation value and you load it - it would be transparent to the user
54 | - JMS: makes sense to have stride - for image you want to store pyramid - you can represent downsampling pyramid - you can do this by your proposal
55 | - HZ: if you can zoom in and zoom out - regular stride - how do we setup the stride? - in theory is it possible? - is it possible to have a version 2 for this extension ZEP? - the separation would be good idea
56 | - JMS: Multiply stride by chunk size
57 | - HZ: Saving multiple chunks is not a problem - currently doesn’t support any random stride
58 | - JMS: It would let implementation have more work but it would cover more generic use cases
59 | - HZ: I will think hard and include the modification in the PR
60 |
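
The accumulation idea HZ summarised above can be illustrated in one dimension with running totals: once they are stored, the average over any range needs two lookups instead of a scan over every value. This is only a minimal sketch of the general technique, not the algorithm specified in ZEP0005; the function names and the in-memory list are assumptions made for illustration.

```python
from itertools import accumulate


def build_prefix_sums(values):
    """Return running totals with a leading 0, so sums[i] == sum(values[:i])."""
    return [0, *accumulate(values)]


def range_mean(prefix_sums, start, stop):
    """Mean of values[start:stop], computed from two stored totals."""
    if not 0 <= start < stop < len(prefix_sums):
        raise ValueError("empty or out-of-bounds range")
    return (prefix_sums[stop] - prefix_sums[start]) / (stop - start)


# Hypothetical data standing in for per-cell values along one dimension.
values = [2, 4, 6, 8, 10, 12]
sums = build_prefix_sums(values)
mean_middle = range_mean(sums, 1, 5)  # averages 4, 6, 8, 10
```

The stored totals are the extra overhead; in exchange, the cost of a range average no longer grows with the length of the range.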
--------------------------------------------------------------------------------
/meetings/2023/2023-02-23.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 23rd February
4 | description: ZEPs Meeting Notes for 2023-02-23
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 6
8 | ---
9 |
10 | # 2023-02-23
11 |
12 | **Attending:** Sanket Verma (SV), Ward Fisher (WF), Ryan Abernathey (RA), Jonathan Striebel (JS)
13 |
14 | ## TL;DR:
15 |
16 | Finally, after several meetings, all the essential issues related to the ZEP1 (V3) are resolved! However, some minor tasks remain, which JS will be looking at in the upcoming days. Then, JS asked whether we should ship V3 with OGC and whether chunk key encoding could be an extension. RA mentioned he’s working on an extension ZEP for non-listable stores. We also discussed having a hack week to eliminate the technical gap between the zarr-python’s V3 implementation and the current V3 specification.
17 |
18 | **Updates:**
19 |
20 | - New extension by Hailiang Zhang, see here:
21 |
22 | **Meeting Minutes:**
23 |
24 | - JS: No items in TODOs and Needs pr in ZEP1 [project board](https://github.com/orgs/zarr-developers/projects/2/views/2) - spec is coming to final stage 🎉 - last few days at scalable minds - needs to finish the remaining tasks soon!
25 | - JS: Should we ship V3 with OGC?
26 | - RA: We already have Zarr V2 Spec as OGC standard - we can ask for v3 - but it's more of a take-it-or-leave-it thing
27 | - JS: [chunk key encoding](https://github.com/zarr-developers/zarr-specs/issues/172) could be an extension - separate key in the metadata - may prepend 0 - two things should be configurable separately
28 | - RA: could it be an extension?
29 | - JS: possibly!
30 | - RA: we should have it - this could enable to see only metadata when opening a directory without seeing the whole array
31 | - JMS: separator `/` would allow multiple possibilities
32 | - JS: should be backwards compatible, not a breaking feature
33 | - RA: working on an extension ZEP for non-listable stores -
34 | - wants to run it by the community first - read only stores - no writes to these stores
35 | - copied from STAC - link:
36 | - provide explicit link between parent and children document - write a store and create links for the store -
37 | - JMS: new group property, no reason to have parent - because you always know the parent
38 | - RA: what if someone gives you a middle hierarchy array address? - it’s helpful then
39 | - RA: Maybe we should not advertise V3 as we don't have a reference implementation
40 | - SV: hack week is a good idea to get rid of the existing technical debt
41 | - JMS: Looking to implement V3 in tensorstore and can help with Zarr-Python too!
42 |
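
The chunk key encoding discussion above treats the separator and any prepended part of the key as settings that can be configured independently. A minimal sketch of that idea follows; the function name, the `prefix` parameter and its default are assumptions made for illustration and are not taken from the spec text.

```python
def encode_chunk_key(coords, separator="/", prefix="c"):
    """Join chunk grid coordinates into a store key.

    `separator` and `prefix` are independent knobs (hypothetical names),
    mirroring the point that the two should be configurable separately.
    """
    parts = [str(c) for c in coords]
    if prefix:
        parts.insert(0, prefix)
    return separator.join(parts)


nested_key = encode_chunk_key((1, 0, 2))                         # "c/1/0/2"
flat_key = encode_chunk_key((1, 0, 2), separator=".", prefix="")  # "1.0.2"
```

With a `/` separator the chunks land in nested directories on a file system store, which is what lets a directory listing show metadata without enumerating every chunk.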
--------------------------------------------------------------------------------
/meetings/2023/2023-03-09.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 9th March
4 | description: ZEPs Meeting Notes for 2023-03-09
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 7
8 | ---
9 |
10 | # 2023-03-09
11 |
12 | **Attending:** Hailiang Zhang (HZ), Dieu My Nguyen (DMN), Josh Moore (JM), Ryan Abernathey (RA), Jeremy Maitin-Shepard (JMS)
13 |
14 | ## TL;DR:
15 |
16 | The meeting started with a discussion on Sharding as a [transformer vs codec](https://github.com/zarr-developers/zarr-specs/issues/220). Sharding is implemented as a transformer, and then we discussed the pros and cons of implementing Sharding as a codec. JMS had briefly discussed [#219](https://github.com/zarr-developers/zarr-specs/issues/219). After this, HZ gave a brief on [ZEP0005](https://zarr.dev/zeps/draft/ZEP0005.html) but couldn’t present his slides due to Zoom/screen sharing issues - tabled for the next community meeting.
17 |
18 | **Meeting Minutes:**
19 |
20 | - JMS: sharding as transform vs. codec
21 | -
22 | - RA: been playing with sharding, trying to get the most out of it
23 | - Key question (impl or spec?): if sharding is a codec how does the outer layer specify which range it wants? (general problem in zarr-python for blosc)
24 | - requires passing context to context
25 | - transform is explicit; codec is less clear
26 | - JMS: in zarrita he has the codec take an indexing expression (optional?). defer some of that for ZEP2? arrays vs. bytes vs. additional concept of arrays.
27 | - RA: similar to Martin's request
28 | - JMS: first codec is fine, but the next one is less clear. need to be more explicit about the interface.
29 | - RA: need to solve this. what information needs to be passed in between (implementors and at spec level)
30 | - RA: e.g. could be a codec that takes an HDF5 file (blosc2, etc.) missed a chance to build the right abstractions there.
31 | - JMS: `codecs := array|bytestream in; array|bytestream out`
32 | - JM: recursive zarrs all the way down?
33 | - JMS: concatenation of other arrays
34 | - RA: Norman's justification. JMS' proposal. re: how to integrate other things like referencing between arrays, shards defining own chunking, etc. (doesn't change anything in ZEP1)
35 | - JMS: transforms as bytes, and codecs can access arrays
36 | - JMS: NB: MD wants low level store to be aware of array indexing
37 | - JM: always thought of codecs as the lowest thing that is unaware of arrays
38 | - JMS: combined compression with filters (which can operate on arrays, transpose)
39 | - RA: sharding fundamentally breaks core abstraction between store / codec. at the impl. level, want an efficient/fast code to fetch chunks of shard, make smart decisions, close to the metal. but the naive thing isn't fast. do the core abstractions break down. no longer using key/value store API. using offsets into storage.
40 | - JMS: don't see byte range as breaking. addition to the interface.
41 | - RA: not just a file format, but a protocol for addressing chunks.
42 | - JMS: dimension names metadata
43 | -
44 | - RA: would be for keeping or making it easy to not use
45 | - HZ/DMN: ZEP5 presentation (recorded)
46 | -
47 | - 
48 | - HZ: Tabling because of Zoom issues.
49 | - RA: re: expectations -- very limited due to the numbers of people working on the spec. (it's taken *years*) so ... 6 months?
50 | - HZ: this is an extension, doesn't block anything.
51 |
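
JMS's shorthand `codecs := array|bytestream in; array|bytestream out` can be read as a chain in which each codec declares whether it consumes and produces an array or a byte stream. The sketch below is one way to picture that, assuming a small 2-D integer chunk held as nested lists; the class names and interface are hypothetical and do not come from zarrita or zarr-python.

```python
import struct
import zlib


class Transpose:
    """array -> array: swaps the two axes of a 2-D nested list."""

    def encode(self, array):
        return [list(row) for row in zip(*array)]

    def decode(self, array):
        return [list(row) for row in zip(*array)]


class LittleEndianInt32:
    """array -> bytes: flattens a 2-D nested list of ints in row-major order."""

    def __init__(self, n_cols):
        self.n_cols = n_cols  # row length of the array handed to encode()

    def encode(self, array):
        flat = [value for row in array for value in row]
        return struct.pack(f"<{len(flat)}i", *flat)

    def decode(self, data):
        flat = struct.unpack(f"<{len(data) // 4}i", data)
        return [list(flat[i:i + self.n_cols])
                for i in range(0, len(flat), self.n_cols)]


class Zlib:
    """bytes -> bytes: general-purpose compression."""

    def encode(self, data):
        return zlib.compress(data)

    def decode(self, data):
        return zlib.decompress(data)


def encode_chunk(chunk, codecs):
    """Apply the codecs in order; the value changes type along the chain."""
    for codec in codecs:
        chunk = codec.encode(chunk)
    return chunk


def decode_chunk(stored, codecs):
    """Undo the chain by applying the inverse steps in reverse order."""
    for codec in reversed(codecs):
        stored = codec.decode(stored)
    return stored


chunk = [[1, 2, 3], [4, 5, 6]]
pipeline = [Transpose(), LittleEndianInt32(n_cols=2), Zlib()]
stored = encode_chunk(chunk, pipeline)
restored = decode_chunk(stored, pipeline)
```

Nothing in this toy interface lets a caller ask for part of a chunk, which is the gap the "which range does the outer layer want" question points at.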
--------------------------------------------------------------------------------
/meetings/2023/2023-03-16.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 16th March
4 | description: ZEPs Meeting Notes for 2023-03-16
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 8
8 | ---
9 |
10 | # 2023-03-16
11 |
12 | **Note:** This ZEP meeting was an impromptu meeting. Please see the corresponding message on Gitter [here](https://matrix.to/#/!nZLdXRRzIbkoDjkEvS:gitter.im/$oxM2UpzOTs--6P1Itl6gPWvLRCBEv_npvxJYi5m95l8?via=gitter.im&via=matrix.org&via=cadair.com).
13 |
14 | **Attending:** Sanket Verma (SV), Norman Rzepka (NR), Jonathan Striebel (JS), Jeremy Maitin-Shepard (JMS)
15 |
16 | ## TL;DR:
17 |
18 | NR was confused about sharding as a codec and thought changing it would require rewriting the interface. JS and JMS answered NR’s questions. After this, we discussed sending ZEP1 for voting and concluded that there needs to be an editorial change before we send it. JS initiated a discussion on [#161](https://github.com/zarr-developers/zarr-specs/issues/161) and [#222](https://github.com/zarr-developers/zarr-specs/pull/222). NR also asked if we have a contributors list for V3, and SV took the job for himself.
19 |
20 | **Meeting Minutes:**
21 |
22 | - NR: Confused about sharding as a codec! - Making sharding a codec will change the interface -
23 | - JMS: doesn’t agree with Martin’s point
24 | - JS: doesn’t change the codec - will nest a new one - V3 doesn’t need to change anything
25 | - NR: Need to change ZEP2 then
26 | - NR: Zarr-Python API is a mess right now
27 |
28 | ~JMS joins in~
29 |
30 | - NR: Should we send ZEP1 for voting?
31 | - JMS: Martin wants to push everything as a storage transformer - keep storage transformers in the spec - we could also defer that decision to ZEP2
32 | - NR: I wonder if we have too much implementation detail in V3? - Whether we need partial read or not?
33 | - JMS: Partial reads are not required for sharding
34 | - JS:
35 | - JMS: Mostly concerned with JSON metadata - haven’t started doing the implementation
36 | - NR: Some behaviour needs to be defined - everything goes beyond - doesn’t need to strip the spec
37 | - JS: hierarchy discovery - what happens if you delete the chunks and what happens then?
38 | - JS: We’re fine as it is now!
39 | - JMS: For someone who is reviewing ZEP0001 - JSON metadata is important - but it is buried in the middle - an editorial change would be helpful to put it on the top
40 | - JS: Glossary defined at the top is not optimal
41 | - JMS: Would look into restructuring the metadata to the top
42 | - JMS: Would start working on V3 implementation of Neuroglancer and tensor store
43 | - JS: -
44 | - NR: This is something of an implementation detail
45 | - JS: Maybe we can mark them as implementation details
46 | - JMS: Josh and Ryan brought up the idea of codec vs transformers in the last ZEP meeting - so I wrote this PR
47 | - NR: Move sharding to a codec?
48 | - JMS: Josh was skeptical - Ryan was in favour - we should go ahead with the proposal for sharding as a codec
49 | - NR: Will make the changes to ZEP2 next week
50 | - JS: Having sharding as similar to blosc2 and nvcomp is a strongest opinion
51 | - NR: Contributors for V3 so far?
52 | - SV: Will make a contributor list after the voting email goes out!
53 |
--------------------------------------------------------------------------------
/meetings/2023/2023-03-23.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 23rd March
4 | description: ZEPs Meeting Notes for 2023-03-23
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 9
8 | ---
9 |
10 | # 2023-03-23
11 |
12 | **Attending:** Sanket Verma (SV), Jeremy Maitin-Shepard (JMS), Norman Rzepka (NR), Ward Fisher (WF)
13 |
14 | ## TL;DR:
15 |
16 | NR started wondering whether Hailiang’s [ZEP0005](https://zarr.dev/zeps/draft/ZEP0005.html) is in the ZEP scope. Everyone has different thoughts, and SV thought it might be part of [Ryan Abernathey’s ZEP](https://github.com/zarr-developers/zeps/pull/28). Next, SV presented the draft email that JMS and JS are supposed to send out soon. And lastly, JMS had some thoughts about Blosc being a special codec due to its shuffling nature.
17 |
18 | **Updates:**
19 |
20 | - Hailiang presented [ZEP0005](https://zarr.dev/zeps/draft/ZEP0005.html) yesterday
21 | - Check the recording here:
22 |
23 | **Meeting Minutes:**
24 |
25 | - NR: Is Hailiang’s ZEP in the Zarr scope?
26 | - JMS: Not possible to evaluate the proposal right now - agree with the scope - maybe it’s not for us to judge it - it could be a metadata convention - doesn’t need to be implemented with Zarr-Python itself
27 | - NR: Whether it needs to be standardised or not? - Could understand some part of the proposal - may not be an extension - because it’s on top of the Zarr (the accumulation attribute)
28 | - SV: Maybe part of the [Ryan’s ZEP](https://github.com/zarr-developers/zeps/pull/28)?
29 | - NR: Trying to find a specific use case and I couldn’t find it - something similar to OME naming conventions
30 | - JMS: Prefixing OME w.r.t. OME keys
31 |
32 | - SV: ZIC Email
33 | - JMS: We can remove Zarr-Python implementation's reference from the email
34 | - NR: You can add Zarrita!
35 | - SV: Is it in a better state of synchronisation as compared to zarr-python V3 implementation?
36 | - NR: Yes!
37 | - JMS: Sure, we can do it!
38 | - JMS: Reorder metadata to the top and then send the email
39 | - NR: Sharding as a codec PR: - if it can be merged before the email then it’ll be great!
40 | - NR: Want to bundle ZEP0002 with ZEP0001 -
41 | - SV: Not a good idea!
42 | - NR: Alright!
43 | - NR: Rendering of the read the docs - SV: check this out:
44 |
45 | - JMS: Blosc is a special codec! - bytes to bytes codec - the shuffle parameter has some logic - the shuffling is happening in the Zarr V3 - which is weird - adds a weird abstraction - potentially useful for users to specify shuffling manually
46 | - JMS: Proposed value for the shuffling for the blosc codec: {null, "bit", 1, 2, 3, 4…}
47 | - Will create an issue or send a PR to the numcodecs repository
48 |
49 | - SV: Once the re-ordering (moving metadata section to the top) of V3 is done, we can send out the email
50 | - NR: I’ve added Zarrita as the reference implementation for V3 in the email!
51 |
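
JMS's proposed shuffle values `{null, "bit", 1, 2, 3, 4…}` are easier to picture with a concrete byte shuffle. The sketch below reads the integers as an element size in bytes, which is an assumption about the proposal rather than something stated in the minutes, and it shows only the general byte-shuffle technique, not Blosc's blocked, SIMD-optimised implementation. The function names are hypothetical.

```python
import struct


def byte_shuffle(data: bytes, typesize: int) -> bytes:
    """Group byte 0 of every element, then byte 1 of every element, and so on."""
    if typesize < 1 or len(data) % typesize:
        raise ValueError("data length must be a multiple of typesize")
    return b"".join(data[i::typesize] for i in range(typesize))


def byte_unshuffle(data: bytes, typesize: int) -> bytes:
    """Inverse of byte_shuffle for the same typesize."""
    if typesize < 1 or len(data) % typesize:
        raise ValueError("data length must be a multiple of typesize")
    n_items = len(data) // typesize
    return b"".join(data[i::n_items] for i in range(n_items))


# Three little-endian uint32 values: the low bytes differ, the rest are zero.
raw = struct.pack("<3I", 1, 2, 3)
shuffled = byte_shuffle(raw, typesize=4)
roundtrip = byte_unshuffle(shuffled, typesize=4)
```

After shuffling, the nine zero bytes sit next to each other, which is the kind of run a byte-oriented compressor handles well. The right `typesize` depends on the array's data type, which is why a bytes-to-bytes codec that has to know it is awkward.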
--------------------------------------------------------------------------------
/meetings/2023/2023-04-06.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 6th April
4 | description: ZEPs Meeting Notes for 2023-04-06
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 10
8 | ---
9 |
10 | # 2023-04-06
11 |
12 | **Attending:** Sanket Verma (SV), Ryan Abernathey (RA), Jeremy Maitin-Shepard (JMS), Ethan Davis (ED)
13 |
14 | ## TL;DR:
15 |
16 | In today’s ZEP Meeting, we worked on the text to kick off the review process for ZEP0001. After reviewing the document and making a few necessary changes, JMS created an issue to notify the ZIC and ZSC that ZEP0001 has finally entered the review phase. Please check the relevant issue: .
17 |
18 | **Updates:**
19 |
20 | - ZEP1 is ready for review after merging [#224](https://github.com/zarr-developers/zarr-specs/pull/224)
21 |
22 | **Meeting Minutes:**
23 |
24 | - SV: [ZEP0001](https://zarr.dev/zeps/draft/ZEP0001.html) is ready for review
25 | - RA: What does ZIC think of implementing the spec? We should ask them!
26 | - JMS: Sounds good!
27 | - RA: We can use the main issue to cast the votes, and open separate issues for additional concerns as having everything in a single issue will clutter it
28 | - RA: Are we expecting a lot of feedback and then we would need to edit the spec? Because we've already done a lot of work to reach the current state of the SPEC
29 | - JMS: Think people who haven't been involved in the SPEC process so far would not be too much concerned with the details
30 | - ED: Has there been any involvements from the ZIC?
31 | - RA: Not much!
32 | - ED: Then maybe it's gonna be clarifications and similar questions
33 | - RA: In my experience, you can design as much as you want but real issues start coming up when you start implementing it
34 | - ED: Yeah!
35 | - JMS: It's not easy to get a large group of people to agree on the same thing at the same time!
36 | - RA: Do we have a reference implementation of V3?
37 | - SV: Yeah, [Zarrita](https://github.com/scalableminds/zarrita)!
38 | - RA: Need to provide some clarifications on the voting mechanism
39 | - They could vote to approve
40 | - They could vote to abstain
41 | - And they could vote to veto
42 | - RA: Provide guidance on the implementation as well
43 | - ED: In OGC, you can say no but with the comments
44 | - RA: We should plan that there's no veto!
45 | - JMS: Need to avoid the case where the implementors start working on a new spec on their own - also a fact that this is a community process - you can't force people to implement something new if it isn't helpful for them - instead you can help them with their additional use cases
46 | - ED: Why veto and why not just 'No'?
47 | - RA: That's for other ZEPs not for ZEP0001 because it's like a constitution of Zarr
48 | - ED: The veto is contentious issue
49 | - RA: Is there a process to respond to the feedback? Is there a time period for that?
50 | - SV: We should accommodate all of the feedback in a month's time
51 | - ED: Is the vote going to be, 'Yes' we're moving forward with it and there's gonna be a separate discussion for the implementation? - OGC have a public comment period
52 | - RA: There are heavy-handed processes like OGC, ISO but we don't want to use them; we just made a process for our project and the community to reach a consensus - like the idea of a comment period from the council and they have a month to solicit their feedback - and after that we'll take a vote
53 | - JMS: If people wanted to comment, they'd have already done so
54 | - RA: I'd love to think so but all the implementors are busy and they might be waiting for someone to come to them to review the spec - the downside of doing this is we're going to present this as a take-it-or-leave-it and that's not fine given Zarr is a community-owned OS project!
55 | - JMS: What does leave it mean? They'll not upgrade to V3 from V2 - Is it fine?
56 | - SV: I know some folks from the council couldn't keep up with the progress and they've been waiting for the SPEC to go into the review phase
57 | - RA: It's been in the review phase forever! - would like to see robust handling of the extension points - getting behind the idea of 1 month time
58 | - ED: Release candidate for the extension?
59 | - ED: Does approval mean that the SPEC will be 3.0 or 3.1.0 or 3.0.1?
60 | - SV: It's gonna be 3.0
61 | - JMS: How about the folks who were involved during the development of the SPEC?
62 | - SV: Will take care of it after the voting period
63 | - ED: It's good to list the contributors!
64 | - SV: Does everything seem fine?
65 | - JMS: Yes!
66 |
67 | - ZEP0001 Review issue: 🎉
68 |
--------------------------------------------------------------------------------
/meetings/2023/2023-04-20.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 20th April
4 | description: ZEPs Meeting Notes for 2023-04-20
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 11
8 | ---
9 |
10 | # 2023-04-20
11 |
12 | **Attending:** Jeremy Maitin-Shepard (JMS), Jonathan Striebel (JS), Ryan Abernathey (RA)
13 |
14 | ## TL;DR:
15 |
16 | During the discussion on the V3 specification, the community explored different codecs, including `array → array`, `array → bytes`, and `bytes → array`, and evaluated their advantages and disadvantages. They also debated whether to include codecs in the metadata or not.
17 |
--------------------------------------------------------------------------------
/meetings/2023/2023-05-04.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 4th May
4 | description: ZEPs Meeting Notes for 2023-05-04
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 12
8 | ---
9 |
10 | # 2023-05-04
11 |
12 | **Attending:** Ward Fisher (WF), Josh Moore (JM), Sanket Verma (SV), Jonathan Striebel (JS), Jeremy Maitin-Shepard (JMS), Ryan Abernathey (RA)
13 |
14 | ## TL;DR:
15 |
16 | During the meeting, SV provided an update on the current voting status for [ZEP0001](https://github.com/zarr-developers/zarr-specs/issues/227), while WF plans to vote soon with the assistance of colleagues at Unidata. RA suggested that we should be open to changes in the living document, i.e. V3 Spec and not adhere strictly to the by-laws. JS recommended having a list of contributors on the ZEP0001 page. SV also discussed pending tasks on the ZEP0001 project board before finalising the spec. JMS briefly discussed Mark’s [comments](https://github.com/zarr-developers/zarr-specs/pull/152#issuecomment-1533335795) on the sharding PR, and the meeting concluded with an impromptu discussion about organising a Zarr conference.
17 |
18 | **Updates:**
19 |
20 | - [ZEP0001](https://github.com/zarr-developers/zarr-specs/issues/227) review
21 | - Ends in 2 days on 5/6 @ 23:59
22 | - 2 votes so far
23 | - Constantine - `ABSTAIN`
24 | - Jeremy - `YES`
25 | - 8 votes pending - (if we leave zarr-python out then 7)
26 |
27 | **Meeting Minutes:**
28 |
29 | - SV: summary on votes so far (as above)
30 | - WF: Working with the colleagues @ Unidata - will be voting on the V3 Spec soon!
31 | - RA: wouldn't stick hard-and-fast to the by-laws if they need
32 | - JS: Mostly giving people a point in time for veto. It's a living document.
33 | - WF: comparing to markdown, might be good to have this process
34 | - SV: @Ward - Is there a PR you're looking to submit for ZEP1 review?
35 | - WF: User attributes for the metadata field - a bit of certainty in the `must_understand` flag and user-defined attributes - we can have arbitrary tags in user attributes specifying it doesn't require `must_understand` flag - will prepare a PR for the same
36 | - JMS: think of the review as an intent to implement
37 | - JM: agree, looking forward to using this to garner motivation
38 | - JS: looking forward to a retrospective; smaller changes.
39 | - RA: spectrum of processes; OGC is the most heavy-handed; STAC is most agile (what's their trick? everything has an implementation)
40 | - JM: STAC still needs to do the major upgrade to V2 - something I'm looking forward to
41 | - `must_understand` flag
42 | - JMS: Still not clear on Dennis' objection
43 | - WF: (for Dennis) worried about future changes painting us into a corner - good to keep Dennis in loop and listen to his concerns
44 | - JS: Point where you can see both the sides - it's not a deal breaker for the V3 Spec
45 | - RA: Any downside for making `must_understand` flag a required attribute for an extension? - Seems lightweight and can satisfy Dennis
46 | - JS: To rephrase JMS's point - "You can do this but then it's not possible to have non-object entries in the config again."
47 | - JMS: We haven't clarified in codecs the presence of unknown attributes in codecs!
48 | - JS: Would be good to have a list of contributors for the ZEP1
49 | - SV: Will complete it before we finalise ZEP1
50 | - JM: Would be good to put Zarr Spec on Zenodo as well!
51 | - SV: State of the [ZEP1 project board](https://github.com/orgs/zarr-developers/projects/2/views/2)
52 | - 2 issues in meta and 1 under discussion
53 | - JS: We can ignore the one under discussion - RA can take care of the [OGC](https://github.com/zarr-developers/zarr-specs/issues/42) one
54 | - RA: We can update the community standard we already have with OGC
55 | - JS: Not super happy with how we do the Spec work atm - see [#179](https://github.com/zarr-developers/zarr-specs/issues/179) - we need to address this once V3 gets finalised
56 | - SV: What does updating V3 at OGC mean? Does it supersede V2 or does V3 get published at a new URL alongside V2?
57 | - All: Don't know! RA can take care of it.
58 | - RA: geozarr has become a comparison of geo/weather specs
59 | - but will remain a convention
60 | - JM: love to hear more. Maybe we can have a Zarr conventions convention ;)
61 | - RA: all spatial/temporal. with infinite time, would be great to work together.
62 | - JMS: Discussion on [comments](https://github.com/zarr-developers/zarr-specs/pull/152#issuecomment-1533335795) by Mark on Sharding PR
63 | - Would be a good thing to add a checksum of the index, not the individual chunks (data) - will add 4 extra bytes to the index - JMS favours this!
64 | - JS: Would be good to ask Norman - but I also favour the idea
65 | - Impromptu discussion on organising a Zarr Conference
66 | - JS: Really good idea! We should have it
67 | - JMS: In-person has more value
68 | - JS: Would be good to co-locate with other conferences
69 | - JM: From my experience it was difficult to get together folks from different fields - maybe we can look for hosts like Janelia, NASA etc.
70 | - JM: Thermo Fisher is also looking into Zarr
71 |
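
The point about checksumming the shard index rather than the chunk data, at a cost of 4 extra bytes, can be sketched as follows. The index layout (pairs of little-endian 64-bit offset and length values), the function names, and the use of `zlib.crc32` as the 32-bit checksum are all assumptions made for illustration; the sharding PR may choose a different checksum function.

```python
import struct
import zlib


def pack_index(entries):
    """Serialise (offset, nbytes) pairs and append a 4-byte checksum."""
    body = b"".join(struct.pack("<QQ", offset, nbytes)
                    for offset, nbytes in entries)
    return body + struct.pack("<I", zlib.crc32(body))


def unpack_index(blob):
    """Verify the trailing checksum, then return the (offset, nbytes) pairs."""
    body = blob[:-4]
    (stored,) = struct.unpack("<I", blob[-4:])
    if zlib.crc32(body) != stored:
        raise ValueError("shard index checksum mismatch")
    return [struct.unpack_from("<QQ", body, pos)
            for pos in range(0, len(body), 16)]


# Hypothetical shard holding two chunks back to back.
entries = [(0, 100), (100, 250)]
blob = pack_index(entries)
decoded = unpack_index(blob)
```

A corrupted index would send a reader to the wrong byte ranges for every chunk in the shard, so protecting it is cheap relative to the damage it prevents; chunk payloads can still be covered by their own codecs if needed.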
--------------------------------------------------------------------------------
/meetings/2023/2023-05-18.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 18th May
4 | description: ZEPs Meeting Notes for 2023-05-18
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 13
8 | ---
9 |
10 | # 2023-05-18
11 |
12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Ryan Abernathey (RA), Norman Rzepka (NR), Jeremy Maitin-Shepard (JMS)
13 |
14 | ## TL;DR:
15 |
16 | The meeting covered the acceptance of ZEP0001, signaling the initiation of implementations. Discussions revolved around updates for ZEP0001 and V3, with Zarrita noted as a comprehensive PoC. ZEP0002 discussions focused on shard structures and an iterative ZEP process. Additionally, there were insights into experimentation with virtualization and metadata linking, and considerations for handling unmaintained Zarr implementations, emphasizing a shift towards V3 developments.
17 |
18 | **Meeting minutes:**
19 |
20 | - Discussions about climate and weather 🌡️
21 | - [ZEP0001](https://github.com/zarr-developers/zarr-specs/issues/227) is finally accepted! 🎉
22 | - Implementors can start working on their implementations
23 | - Will be moving the ZEP0001 under the new `Accepted` section
24 | - Will move it under `Final` section once we have at least one complete reference implementation
25 | - SV - updates and next steps for ZEP0001 and V3:
26 | - 1 year into the process... ([2019](https://zarr.dev/blog/v3-update/) first discussion)
27 | - [gdal](https://github.com/OSGeo/gdal/pull/7706) moving quickly
28 | - will be checking in on the various implementations
29 | - zarrita as one of the most complete (in terms of code, not docs or tests), i.e. PoC
30 | - JM: Reference implementation needs to be useable!
31 | - NR: Yeah!
32 | - RA: why it's not a complete implementation?
33 | - NR: no optimizations (sharding, etc.). meant to be easy to read code.
34 | - lacks features like buffer protocol, etc. but could be used.
35 | - don't currently plan to maintain it over a long period of time.
36 | - SV: was thinking less of end-user and more of supporting all the features so others can refer to it.
37 | - NR: that probably could be done now.
38 | - NR: could write an intro for people to read. (don't want to write end-user docs)
39 | - RA: NR's production implementation? use different file format currently. must have sharding.
40 | - considering using zarrita as the implementation (for Python stack)
41 | - also have a scala stack (baked into software)
42 | - JM: ZEP0002 voting and discussions
43 | - JM: We could open up the voting for ZEP0002 and give a month/two month/full summer for voting - any open issues?
44 | - NR: None!
45 | - JM: Shard as recursive Zarrs? - treat internals of shards like another Zarr
46 | - NR: Sharding being a codec would work that way but it's more of an implementation detail
47 | - JM: What would it look like from a URI structure? - similar to what Saalfeld is doing in N5 ecosystem - if I access a chunk inside a shard and I could treat it as a Zarr array and not as a blob
48 | - NR: Fair enough! I could have something like this in Zarrita
49 | - NR: Not have implementation details in the spec but rather point to the reference implementation for the details
50 | - RA: ZEP process should be more of an iterative process and not an ultimatum
51 | - JMS: I feel, most of the implementors would be working on sharding and V3 together
52 | - RA: flat files to virtualization (experimentation). (Once ZEP0003 lands!)
53 | - Also want to see linked referencing in metadata (to allow browsing through HTTP). Better than consolidated. For e.g. infinite hierarchies, allows browsing like a catalog
54 | - Allow parent to list its children in Zarr groups
55 | - Also multiscales. Relates the arrays. Within the array directory?
56 | - NR: listing parent/children
57 | - NR: paths...
58 | - RA: see "link" in for any node (self, root, parent, child)
59 | - NR: Discussions on unmaintained Zarr implementations
60 | - SV: Dissolve them with the help of the maintainers
61 | - JM: Tricky to get a hold of them
62 | - SV: Start deprecating them and then removing them from
63 | - JMS: But all of that is V2 - if there's going to be something on V3 we can certainly help them
64 |
--------------------------------------------------------------------------------
/meetings/2023/2023-06-01.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 1st June
4 | description: ZEPs Meeting Notes for 2023-06-01
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 14
8 | ---
9 |
10 | # 2023-06-01
11 |
12 | **Attending:** Ward Fisher (WF), Sanket Verma (SV), Jeremy Maitin-Shepard (JMS), Ryan Abernathey (RA), Norman Rzepka (NR), Josh Moore (JM)
13 |
14 | ## TL;DR:
15 |
16 | The meeting discussed ongoing work on ZEP 4, focusing on conventions for diverse domains like bio-imaging, geospatial, and genomics. There were considerations about whether these conventions should be general or domain-specific and how to handle legacy data. Additionally, discussions covered topics such as the organization of conventions on the website, the separation of codecs, challenges in introducing transformations to the codec pipeline, and the exploration of fallback data types, emphasizing the need for broader community discussions on certain aspects.
17 |
18 | **Updates:**
19 |
20 | **Meeting Minutes:**
21 |
22 | - SV: ZEP 4 →
23 | - RA: Still working on it
24 | - SV: How can we help?
25 | - RA: Need to decide if it's going to be a general convention (for bio-imaging, geospatial, genomics etc.), or a convention of Geospatial domain all
26 | - RA: Also need to decide if the datasets are conforming to a convention or not - lot of legacy data out there which doesn't conform to it
27 | - RA:
28 | - WF: Convention doesn't need to be broad, CF conventions are based on the NetCDF model - but there's nothing in the NetCDF library or code that mentions the CF conventions!
29 | - RA: The existing conventions are broad, and it's difficult to place cf in a specific place
30 | - WF: Agree with you
31 | - RA: Define the process for proposing a new convention to the community
32 | - JMS: You have group level attribute and array level attribute?
33 | - RA: Mostly yes!
34 | - WF: There
35 | - RA: Getting all conventions on the website would be a good way for cross-domain and community collaboration - conventions can be composable - conventions could not be universal
36 | - WF: Conventions move slowly - take time to adapt to new things - took a good amount of time to solve SST (Sea Surface Temperature)
37 | - RA: Will get another draft out soon
38 | - JMS: Not feasible to namespace an attribute?
39 | - RA: It would require a deep refactoring of the software we use! - It would break Xarray, GDAL, NetCDF - Zarr-Python doesn't care about the attributes
40 | - WF: Namespacing would definitely break the NetCDF library!
41 | - RA: JMS, how do you handle conventions?
42 | - JMS: Generally, I don't invent new conventions and implement existing ones instead - for the data formats I invented, I defined those conventions - these existing conventions don't lack _certain_ things
43 | - NR: In the process of adopting Zarr V3, currently using `OME-Key`
44 | - WF: Attributes are strictly defined - defined to be interpretable not changeable
45 | - JMS: Maybe the best idea is to say clearly that _X_ datasets use the conventions and multiscales
46 | - JMS:
47 | - NR: No need to separate them - my opinion: to keep it as it is - maybe Chris's comment is coming out of the Rust world and separating the codecs will be convenient for him
48 | - JMS:
49 | - JM: Cares about the codec pipeline - adding some transformations in the middle would be tricky!
50 | - NR: Feel the same! - Sharding as a single codec makes sense but adding anything would make it complicated -
51 | - JM: Current partial codec - defining the metadata format for the shard codec would be great!
52 | - NR: Adding 2 partial codecs would make it tricky to implement
53 | - JMS: Blosc is a kind of sharding codec - not clear if using sharding as a partial-read codec is a good idea!
54 | - JMS: How do you have partial writes for the codec?
55 | - JMS: Fallback data types
56 | - JMS: Need to have a broader discussion
57 | - NR: Also, do extension data types need to have a fallback?
58 | - JMS: No, it's optional
59 | - NR: The definition of fallback - like a tuple - having a datatype and the fallback value
60 | - JMS: Maybe it's the way to go!
61 |
--------------------------------------------------------------------------------
/meetings/2023/2023-06-15.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 15th June
4 | description: ZEPs Meeting Notes for 2023-06-15
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 15
8 | ---
9 |
10 | # 2023-06-15
11 |
12 | **Attending:** Sanket Verma (SV), Alan Watson (AW), Jeremy Maitin-Shepard (JMS), Josh Moore (JM), Norman Rzepka (NR)
13 |
14 | ## TL;DR:
15 |
16 | The meeting highlighted SV's recent talk on Zarr at Vrije Universiteit Amsterdam, providing access to the slides and code. Discussions covered the existence of command-line tools for Zarr, including the discovery of zarr-tools. AW shared Zarr adoption in a game project and encouraged its use in brain conferences. Additionally, there were updates on Allen's utilization of Zarr, a forthcoming V3 blog post by SV, and ongoing discussions about addressing the fallback data types issue and the review timeline for ZEP0002.
17 |
18 | **Updates:**
19 |
20 | - SV gave a talk on Zarr @ Vrije Universiteit Amsterdam -
21 | - Slides and code:
22 |
23 | **Meeting Minutes:**
24 |
25 | - Are there any command line tools for Zarr?
26 | - Found this:
27 | - JM: Was working on something on this but didn't use it much
28 | - NR: There are a bunch of tools which you can use with OME-Zarr; having something like `h5ls` would be cool
29 | - JMS: Operations for small things make sense - rechunking, copying
30 | - JM: Nextflow - workflow engine
31 | - AW: BIL is working on games
32 | -
33 | - Pushing them to use Zarr for their image (.png) data
34 | - AW: Interest and benefit in attending brain conferences - someone from the Zarr community
35 | -
36 | - AW: Allen is using Zarr for their work
37 | - SV: Recent paper out of Allen:
38 | - Extensive usage at Allen has revealed some problems and it may be worth addressing them
39 | - JMS: Writing electrophysiology data in Zarr rather than TIFF is a good way to go forward
40 | - SV: Blog post on V3 coming out soon!
41 | - JMS: Fallback data types issue needs to be addressed
42 | - JMS: Not clear how it'll be specified
43 | - JMS: How do you handle it in Zarrita?
44 | - NR: Currently, we do not!
45 | - SV: How serious is it? Implementation or spec issue?
46 | - JMS: Kind of ignoring it for now! We're in the implementation phase now!
47 | - JM: If everybody is ignoring it, then it's fine
48 | - NR: Would not be straightforward to add it later!
49 | - JMS: If an implementation doesn't support it, it'll fail
50 | - SV: Have you started on the implementation?
51 | - JMS: Tensorstore has V3 minus sharding; planning to work on it this week
52 | - NR: Rust implementation of V3 -
53 | - NR: Benchmarking in Zarrita
54 | -
55 | - SV: Benchmarks in Tensorstore?
56 | - JMS: Seen bottlenecks in the IO layer rather than the array layer
57 | - NR: What about codecs?
58 | - JMS:
59 | - NR: 10x performance improvement would be great!
60 | - NR: ZEP0002 review timeline
61 | - SV: V3 blog post, feedback for ZEP process, and then we can add ZEP0002 in the pipeline
62 | - NR: Ok!
63 | - SV: Maybe we need to invite Chris and others from the ZIC to the ZEP meeting
64 | - JM: There used to be a Zarr-Rust meeting
65 |
--------------------------------------------------------------------------------
/meetings/2023/2023-06-29.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 29th June
4 | description: ZEPs Meeting Notes for 2023-06-29
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 16
8 | ---
9 |
10 | # 2023-06-29
11 |
12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Jonathan Striebel (JS), Ryan Abernathey (RA), Ward Fisher (WF), Daniel Jahn (DJ), Jeremy Maitin-Shepard (JMS), Norman Rzepka (NR)
13 |
14 | ## TL;DR:
15 |
16 | The meeting focused on ZEPs 3 and 5, emphasizing the importance of ongoing implementation to avoid stalling and addressing technical debt in Zarr-Python. Discussions revolved around best practices, supporting old conventions, and the use of conformance tests. The possibility of multiscales as an extension was explored, and plans were made to kick off the ZEP0002 review process, with considerations for neat benchmarks and gathering feedback from the Zarr Implementation Council (ZIC).
17 |
18 | **Meeting minutes:**
19 |
20 | - SV: ZEPs 3 and 5 ...
21 | - RA: feedback on the ZEP
22 | - Need to be implementing as we go, otherwise leads to stalling
23 | - JS: Zarr-Python has tech debt which makes it difficult to implement new stuff
24 | - *Impromptu round of introduction*
25 | - JMS: Reference implementation in any language is helpful for any new ZEPs
26 | - RA: Having explicit tweet/statement about implementation would help
27 | - JS: benchmark repo? also sample data?
28 | - NR: Sample datasets
29 | - RA:
30 | - best practices going forward
31 | - but a way to support old conventions
32 | - from OGC, "conformance class" determined with "conformance test"
33 | - namespacing up to the convention
34 | - would like to get it into draft form and then we can move forward with the process
35 | - SV: few open comments?
36 | - RA: just merge it and move the process forward.
37 | - still need to open a template
38 | - NR: add existing conventions at this point (OME-NGFF)
39 | - JM: Thoughts on multiscales?
40 | - RA: It should be an extension
41 | - JMS: Multiscales doesn't lead to a lot of objects
42 | - SV: ZEP0002 review process to kick-off by next week
43 | - NR: Zarrita has sharding implementation
44 | - JM: Kicking-off review process and having neat benchmarks, any idea how we could do both?
45 | - NR: Make a new issue for ZIC feedback
46 |
--------------------------------------------------------------------------------
/meetings/2023/2023-07-13.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 13th July
4 | description: ZEPs Meeting Notes for 2023-07-13
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 17
8 | ---
9 |
10 | # 2023-07-13
11 |
12 | ## Josh Moore and Sanket Verma are presenting at [SciPy 2023](https://www.scipy2023.scipy.org/).
13 |
14 | ### Check the proposal:
--------------------------------------------------------------------------------
/meetings/2023/2023-07-27.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 27th July
4 | description: ZEPs Meeting Notes for 2023-07-27
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 18
8 | ---
9 |
10 | # 2023-07-27
11 |
12 | **Attending:** Sanket Verma (SV), Norman Rzepka (NR), Ward Fisher (WF), Jonathan Striebel (JS), Jeremy Maitin-Shepard (JMS)
13 |
14 | ## TL;DR:
15 |
16 | The meeting focused on advancing ZEP2, with plans to send it to the Zarr Implementation Council (ZIC) for review. Sharding implementations were discussed, including progress on Tensorstore and considerations for Zarrita's compatibility with FSSPEC. Updates on ZEP1 and V3 were highlighted, with pending pull requests, and the possibility of Unidata's involvement in the roadmap was mentioned. Overall, anticipation was expressed for ZEP2's review and progress in various implementations.
17 |
18 | **Meeting minutes:**
19 |
20 | - Send [ZEP2](https://zarr.dev/zeps/draft/ZEP0002.html) to the ZIC
21 | - Merged
22 | - Need to merge -> SV
23 | - SV will send out the email to the ZIC
24 | - Try to fix crosslinks
25 | - JS to close [PR #152](https://github.com/zarr-developers/zarr-specs/pull/152)
26 | - NR to create an issue to gather votes and update the ZEP [PR #40](https://github.com/zarr-developers/zeps/pull/40)
27 | - SV to send an email to the ZIC after issue creation and PR merging
28 | - Everyone looking forward to it!
29 | - Mark wanted to organise a ZEP2 review call - but it didn't happen
30 | - Sharding implementation
31 | - JMS working on Tensorstore implementation
32 | - V3 implementation on Tensorstore is close to completion
33 | - Zarrita had a noticeable overhead while running the benchmarks
34 | - NR: Zarr-Python sharding implementation has deviated over time
35 | - JS: Makes sense to add sharding as a codec once V3 in Zarr-Python gets in
36 | - JMS: Is there an aim to have a similar API for V2 and V3 in Zarr-Python?
37 | - NR: Zarrita doesn't have various stores
38 | - JMS: Is Zarrita compatible with FSSPEC?
39 | - NR: Yes!
40 | - ZEP1 and V3
41 | -
42 | -
43 | -
44 | -
45 | - Once the PRs are merged, SV to send out the FYI to the ZIC
46 | - SV: Any developments on Unidata side?
47 | - WF: Not yet, but it's on our roadmap
48 |
--------------------------------------------------------------------------------
/meetings/2023/2023-08-10.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 10th August
4 | description: ZEPs Meeting Notes for 2023-08-10
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 19
8 | ---
9 |
10 | # 2023-08-10
11 |
12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Jonathan Striebel (JS), Norman Rzepka (NR)
13 |
14 | ## TL;DR:
15 |
16 | The meeting discussed updates on Zarr-Python working groups, a proof-of-concept implementation for ZEP0003, and presentations at SciPy 2023. Notable mentions included removing watermarks from the V3 spec, recognizing contributors for ZEP0001, and ongoing discussions on collaborating with Blosc and Dask for efficient chunking and sharding across the PyData stack. There was also exploration into a lightweight ZEP process for codecs, with considerations for updating ZEP0000 and scheduling ZEPs for voting. Fundraising discussions and potential collaborations with Napari were explored to sustain momentum in the Zarr project.
17 |
18 | **Updates:**
19 |
20 | - Zarr-Python working groups
21 | - Benchmarking and performance:
22 | - Refactoring:
23 | - POC implementation of [ZEP0003](https://zarr.dev/zeps/draft/ZEP0003.html)
24 | -
25 | - SciPy 2023 proceedings
26 | - Talk slides:
27 | - Tools update slides:
28 | - ZEP0001 Contributors section:
29 |
30 | **Meeting minutes:**
31 |
32 | - JM: JS, are you using Zarr at work?
33 | - JS: We're convinced that it's a good idea to use Zarr! 😄
34 | - Growing rapidly - good thing
35 | - SV: Presentation for EuroSciPy 2023!
36 | - JS: Coming up
37 | - SV: You can also cite SciPy 2023 talks
38 | - JM: Will be using the EuroSciPy 2022 poster for an upcoming meeting
39 | - SV: Tweeting about the contributors for ZEP0001 and tagging everyone
40 | - JM: Thanks to everyone! 🙏🏻
41 | - JS: Removes the watermark from the V3 spec
42 | - Figured out the CSS selector
43 | - JM: Zeiss got back to JM
44 | - SciPy 2023 discussions
45 | - James Webb Space Telescope - how they can use Zarr
46 | - JS: Misc. link:
47 | - JS: Met with Francesc (Blosc) ?
48 | - JM: Yeah, we spent a lot of time and it was great!
49 | - How Blosc and Sharding can co-exist together - like a package
50 | - SV: Sharding can provide cloud enablement to Blosc2 - discussions with Francesc
51 | - JM: Recently filled out feedback for CZI EOSS - we can do joint grants as well
52 | - NR: Writing a Blosc2 codec for Zarr could help
53 | - JM: Dask chunking comes down to `.chunk()` property for the object - how about data API chunking specification around the chunks? - chunking could work across the whole PyData stack - and we can add sharding too - could help with what's the efficient access pattern for sharding chunk!?
54 | - Unified package for Zarr (Sharding), Blosc2 and Dask and other packages
55 | - NR: Interesting! Sharding has 2 access patterns
56 | - Chunk level for read and shard level for write
57 | - For Dask purposes you probably want the shard access pattern - because you're in an HPC environment
58 | - JS: Writing it as a spec and collaborating with Dask and Blosc team
59 | - NR: Agreed!
60 | - JS: Sharding also needs a lot of explanation - lots of education needed
61 | - NR: Limbo state rn - blog posts and videos can help a lot - maybe after 6 months
62 | - JM: Unifying names - block.dev? - and same documentation as well
63 | - SV: Can include HDF5 as well
64 | - JM: HDF5 could be beneficial if you're working on cluster/HPC and Zarr can help you bring that data down from the cloud to your machine
65 | - NR: Can apply to EOSS grant with same applications
66 | - JM: Lower chances of getting funded
67 | - JM: Zarr can solve world hunger! 😁
68 | - NR: Good momentum but need to deliver as well
69 | - JM: Zarr as a sister project of Napari!?
70 | - SV: Having conversations regarding fundraising for Zarr to keep the project funded
71 | - We can work on joint grant or something similar
72 | - NR: Lightweight ZEP process for codecs?
73 | - JM: Light voting procedure?
74 | - NR: Could be!
75 | - JM: A new ZEP to update ZEP0000 and add a new type of ZEP in the types and loosen the restrictions
76 | - NR: No problem with voting but how do we set the ZEPs up for voting - anyone can do it by creating an issue but that can lead to mayhem if everyone starts doing it!
77 | - JM: We're following a chronological order but we can have a statement which can allow lightweight ZEPs to come in while big ones are on the way
78 | - Having separate ZEPs for codecs and extensions with less voting burden
79 | - How do we schedule the ZEPs for voting?
80 | - Something for
81 | - JM: There are improvements we can make to ZEP0000 and let's keep working on that
82 |
--------------------------------------------------------------------------------
/meetings/2023/2023-08-24.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 24th August
4 | description: ZEPs Meeting Notes for 2023-08-24
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 20
8 | ---
9 |
10 | # 2023-08-24
11 |
12 | **Attending:** Jeremy Maitin-Shepard (JMS), Sanket Verma (SV), Josh Moore (JM)
13 |
14 | ## TL;DR:
15 |
16 | The meeting focused on ZEP0004 preliminary work for review and highlighted open pull requests addressing various aspects of the Zarr specifications. Discussions revolved around the proposal for a lightweight ZEP process specifically designed for adding codecs and data types, with considerations for fast-tracking certain additions, especially those crucial for specific domains. Additionally, updates on the Tensorstore implementation were provided, indicating the nearing completion of V3 and sharding implementations with ongoing bug fixes.
17 |
18 | **Updates:**
19 |
20 | - ZEP0004 preliminary work for review:
21 | - Open PRs:
22 | -
23 | -
24 | -
25 | -
26 |
27 | **Meeting minutes:**
28 |
29 | - JMS: Lightweight ZEP process for adding codecs and datatypes
30 | - JM: What do you think it should look like?
31 | - JMS: Opening a PR and getting the votes from ZIC and ZSC could be it
32 | - SV: Minimalistic ZEP for adding codecs and data types
33 | - JM: The voting process can keep going for codecs and data types without hindrance from the bigger ZEPs
34 | - JMS: Can work on a lightweight ZEP template for codec and dtype
35 | - JM: Certain data type additions may be blockers for some domains - having fast-track ZEPs would help with that
36 | - JMS: Would like to work on dtypes for ML use-cases
37 | - SV: Tensorstore implementation
38 | - JMS: V3 and Sharding implementation almost complete - working on some bugs - will be finalising soon!
39 |
--------------------------------------------------------------------------------
/meetings/2023/2023-09-07.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 7th September
4 | description: ZEPs Meeting Notes for 2023-09-07
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 21
8 | ---
9 |
10 | # 2023-09-07
11 |
12 | **Attending:** Sanket Verma (SV), Norman Rzepka (NR), Ward Fisher (WF), Jeremy Maitin-Shepard (JMS)
13 |
14 | ## TL;DR:
15 |
16 | The meeting introduced new ZEPs by Davis Bennett (ZEP0006) and Isaac Virshup (ZEP0007). Discussions included the possibility of hosting Zarr-Con with NASA F.15 grant support and updates on ZEP0002 progress, particularly in Tensorstore implementation. The addition of ZipStore as a ZEP was explored, with considerations for conventions, URL structures, and potential contributors from the microscopy and napari communities. Additionally, a discussion on handling non-Zarr stores by Zarr and the need for defined behaviors in case of malformed data was addressed.
17 |
18 | **Updates:**
19 |
20 | - ZEP0006 by Davis Bennett:
21 | - ZEP0007 by Isaac Virshup:
22 |
23 | **Meeting Minutes:**
24 |
25 | - WF: NASA F.15 grant could help hosting Zarr-Con over at Unidata
26 | - SV: Update on new ZEPs by Davis and Isaac
27 | - NR: Overview of the ZEP0007
28 | - WF: Character encoding addressed? - Not implemented robustly across NetCDF
29 | - SV: Norman as co-author?
30 | - NR: No, just left some comments
31 | - JMS: Define a name for the codec - array to bytes - can be applied to raw data buffer
32 | - NR: Could model it as a data type - not clear how the translation from bytes would work in a codec
33 | - JMS: Encourage a spec PR first - make things straightforward
34 | - SV: ZEP document and spec PR - either can come first - depends on which is the clearer and more straightforward way to introduce the changes
35 | - SV: ZEP0002
36 | - JMS: Extremely close on tensorstore
37 | - NR: Zarrita.js can be added to the sharding implementation in the issue review
38 | - NR: Adding ZipStore as a ZEP
39 | - JMS: Added read-only support for ZipStore to Tensorstore
40 | - NR: Certain features that can be included in the ZEP - like allow different types of hierarchy
41 | - JMS: Various ways to use ZipStore in Zarr-Python - depends on how you want to organise your data in the Zip
42 | - NR: Maybe more of a convention - Zip on S3: How do you access it? (URL gets funky)
43 | - JMS: `s3://bucket/path/to/zip.zip|zip:path/to/array/|zarr3` - pipe URL - convey what's happening - `:` downsides - they're valid in a URL
44 | - NR: Go down further and address things further in the Zarr like array
45 | - WF: We have Zip support in NCZarr - but not a similar URL style
46 | - NR: Microscopy folks - napari folks - Saalfeld could be potential people who could work on the Zip ZEP
47 | - NR: Protocol for Google storage? GS or GCS
48 | - JMS: `gs://bucket/path` - GS
49 | - JMS: General Zip required sequential access
50 | - JMS: Standardizing some kind of URL syntax would be a good thing
51 | - NR: Getting feedback from Ward, Stephan Saalfeld would be good
52 | - WF: HTTP post style syntax in NetCDF is supported
53 | - WF: What would happen if we try to read a non-Zarr store with Zarr?
54 | - JMS: Looking at the metadata file and then figuring it out?
55 | - WF: Some part of NetCDF uses HDF5 and we tried to open it with Zarr and it crashed
56 | - WF: Curious what a failed `open()` should look like? Having a defined behaviour would be good
57 | - JMS: Launch missiles if the data is malformed! 😄🚀
58 | - WF: NetCDF has certain error codes for when it can't read, instead of crashing the software - should be a part of the spec
59 | - JMS: Could be a good addition
60 | - WF: Just curious about the crashing!
61 |
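
The pipe-separated URL that JMS gave as an example in these notes (`s3://bucket/path/to/zip.zip|zip:path/to/array/|zarr3`) can be read as a chain of layers, each handing its result to the next. Below is a minimal Python sketch of that reading. It is not Tensorstore's parser or the syntax of any ZEP: the function name `split_pipe_url` is hypothetical, and the sketch assumes that `|` never occurs inside a layer and that the first `:` in a layer separates the scheme from the rest.

```python
def split_pipe_url(url: str) -> list[tuple[str, str]]:
    """Split a pipe-separated URL into (scheme, remainder) layers.

    Hypothetical helper for illustration only; assumes "|" never
    appears inside a layer and does no escaping or validation.
    """
    layers = []
    for part in url.split("|"):
        # Split on the first ":" only; a layer without ":" (such as
        # "zarr3") becomes a scheme with an empty remainder.
        scheme, _sep, rest = part.partition(":")
        layers.append((scheme, rest))
    return layers


if __name__ == "__main__":
    example = "s3://bucket/path/to/zip.zip|zip:path/to/array/|zarr3"
    for scheme, rest in split_pipe_url(example):
        print(f"{scheme!r}: {rest!r}")
```

Splitting on the first `:` only is what keeps any later colons inside a layer intact, which is relevant to the downside noted in the minutes that `:` is itself a valid URL character.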
--------------------------------------------------------------------------------
/meetings/2023/2023-09-21.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 21st September
4 | description: ZEPs Meeting Notes for 2023-09-21
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 22
8 | ---
9 |
10 | # 2023-09-21
11 |
12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Ward Fisher (WF), Davis Bennett (DB), Jeremy Maitin-Shepard (JMS)
13 |
14 | ## TL;DR:
15 |
16 | The meeting featured updates on new ZEPs, including ZEP0006 (Zarr Object Model) by Davis Bennett and its implementation progress, ZEP0007 (String) by Isaac Virshup, and ZEP0008 (URL Syntax) by Jeremy Maitin-Shepard. Discussions included an overview of the newly submitted ZEPs, with considerations for generalizing URL syntax and exploring query syntax similarities with JSON. Additionally, discussions touched on the need for standardizing JSON schemas for Zarr, the compatibility of Zarr shards with HDF5, and plans to kick off the voting process for ZEP0003, particularly its usefulness for Tensorstore.
17 |
18 | **Updates:**
19 |
20 | - ZEP0006 by Davis Bennett: (Zarr Object Model)
21 | - Implementation:
22 | - ZEP0007 by Isaac Virshup: (String)
23 | - ZEP0008 by Jeremy Maitin-Shepard: (URL Syntax)
24 |
25 | **Meeting Minutes:**
26 |
27 | - Overview of the newly submitted ZEPs
28 | - JMS: ZEP8 Could be generalised apart from the Zarr ecosystem - provides parameters at each specific level
29 | - DB: Considering query strings?
30 | - JMS: Clearly diverging from conventional syntax -
31 | - JMS: Issue with `#` syntax - interpretation will be different - a few downsides of using it - argument for using fragment identifier
32 | - JM: Descending down the attributes in N5 land?
33 | -
34 | - DB: Query syntax like JSON - the idea was shared among and used across in N5 land
35 | - DB: ZEP0006 (ZOM) discussions - [Tally Lambert](https://github.com/tlambert03) from Napari was looking for JSON schema for Zarr
36 | - JMS: JSON Schema for tensorstore
37 | - V2:
38 | - V3:
39 | - DB: Wise thing to move towards the standardisation of the schema
40 | - JMS: Consolidated metadata
41 | - JM: Engage positively with consolidated metadata and not break it
42 | - JMS: V3 consolidated metadata not in-line with ZOM would be a bad thing :)
43 | - DB: Could be used with HDF5 as well and can be future-proof for the upcoming Zarr specifications
44 | - SV: Zarr shards as a valid HDF5
45 | - DB: Need to have a mechanism where both the formats can talk to each other - otherwise may lead to brittleness
46 | - JMS: The approach is hacky atm
47 | - Martin wants to kick-off [ZEP0003](https://zarr.dev/zeps/draft/ZEP0003.html) voting ASAP
48 | - Prototype implementation:
49 | - Technical review needed
50 | - DB will have at it soon!
51 | - JMS: Useful for Tensorstore
52 |
--------------------------------------------------------------------------------
/meetings/2023/2023-10-05.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 5th October
4 | description: ZEPs Meeting Notes for 2023-10-05
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 23
8 | ---
9 |
10 | # 2023-10-05
11 |
12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Thomas Nicholas (TN), Jonathan Striebel (JS), Jeremy Maitin-Shepard (JMS)
13 |
14 | ## TL;DR:
15 |
16 | In this meeting, attendees shared their favorite books as a fun introduction. The discussions revolved around various GitHub issues and pull requests, including considerations for making a store less explicit, clarifying certain points, and addressing implementation details. Jeremy Maitin-Shepard highlighted the successful 100% V3 implementation in Neuroglancer and Tensorstore, while Thomas Nicholas expressed interest in performance improvements, particularly regarding variable chunking in Zarr-Python. Plans to finalize ZEP0003 for voting were also mentioned.
17 |
18 | **Meeting Minutes:**
19 |
20 | - Introductions with favourite book
21 | - SV: ASOIAF by G.R.R. Martin
22 | - TN: Dispossessed
23 | - JS: The Hitchhiker's Guide to the Galaxy
24 | - JM: Kingkiller Trilogy (Rothfuss) - can HIGHLY recommend. Beware: only 2 of the 3 books are written.
25 | - Issues and PRs to look at:
26 | -
27 | - JS and JMS: Mostly an implementation detail
28 | - JMS: Make store less explicit
29 | - JS: Should not enforce the parameter; will send a PR after 2 weeks
30 | -
31 | - JMS: Will try to make it more clear
32 | -
33 | - JMS: Chris Barnes' implementation hasn't made the change yet
34 | - SV: Send an email for this to ZIC
35 | - JMS:
36 | - JS: Fortran array needs to be inverted
37 | - JMS: C and Fortran arrays are contiguous in different ways
38 | - JMS: Neuroglancer and Tensorstore have 100% V3 implementation 🎉
39 | - Working on some CI issues and will merge it
40 | - JM: https://github.com/ome/ngff/pull/206
41 | - TN: Cubed discussions
42 | - Anything which increases the performance would be useful - interested in Jack's work
43 | - How can we get variable chunking into Zarr-Python?
44 | - SV: Need to finalise ZEP0003 - will go into voting soon!
45 |
--------------------------------------------------------------------------------
/meetings/2023/2023-10-19.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 19th October
4 | description: ZEPs Meeting Notes for 2023-10-19
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 24
8 | ---
9 |
10 | # 2023-10-19
11 |
12 | **Attending:** Sanket Verma (SV), Mark Kittisopikul (MK), Ward Fisher (WF), Davis Bennett (DB), Norman Rzepka (NR), Isaac Virshup (IV)
13 |
14 | ## TL;DR:
15 |
16 | In this meeting, the attendees discussed updates on ZEP0002, with a focus on potential changes related to handling checksums and compatibility with Kerchunk. ZEP0006 (Zarr Object Model) discussions included the suggestion of obtaining a JSON schema for ZOM and the idea of a separate ZEP for serializing children metadata. Additionally, there were considerations about namespacing codecs, discussions on ZEP0003 progress, and the potential adoption of HDF filters in Zarr. The meeting also touched on accommodating changes to ZEP0001 and the upcoming ZEP0002 voting deadline on October 31.
17 |
18 | **Updates:**
19 |
20 | - ZEP0002 voting closes on 31st October -
21 |
22 | **Meeting Minutes:**
23 |
24 | - MK: Changes to ZEP0001 are still coming in - how do we handle them?
25 | - SV: ZEP1 was provisionally accepted but not at the final stage
26 | - NR: These changes are minor and would need voting from ZIC
27 | - MK: Mentioning a potential grace period in ZEP0000 would be helpful
28 | - NR: Needs to be written out!
29 | - MK: Zarr shards as HDF5 file
30 | - ZEP0002 should proceed as it is atm
31 | - Having a null codec to ignore the checksum would be helpful - can work on this
32 | - Recent numcodecs release helps a lot - getting Jenkins lookup checksum
33 | - Relation with ZEP0002 and Kerchunk?
34 | - NR: Don't know if there is
35 | - IV: Zarrita can read the HDF5 file using Kerchunk
36 | - IV: Multiple arrays in a single Zarr shard
37 | - NR: I don't think it'll be possible
38 | - DB: Why are the chunks in a directory called C?
39 | - NR: Helps when scanning down the groups and arrays
40 | - DB: Any questions for ZEP0006?
41 | - NR: Getting a JSON schema out of the ZEP0006 would be helpful
42 | - DB: This would also help us to get a container level validation
43 | - IV: How does it differ from consolidated metadata?
44 | - DB: Consolidated metadata
45 | - NR: Serializing children metadata would be helpful - could be a different ZEP
46 | - DB: Flattening array
47 | - DB: Pydantic-Zarr would have the reference implementation for ZOM
48 | - ZEP0002 discussions at Unidata
49 | - Mostly going for 'Yes' - but looking for resources who can handle and complete it
50 | - MK: NetCDF has adopted HDF filters? Making a part of Zarr filters?
51 | - WF: Would like to see spec support - supports interoperability - but hasn't considered it
52 | - MK: Between N5 and Zarr we encounter difficulties with the LZ4 codec
53 | - DB: Sounds like N5 problem to me! ;)
54 | - IV: List of HDF5 filters? MK: Yes, there's a list
55 | - MK: But I think NCZarr supports a select few only
56 | - WF: Yes!
57 | - MK: List of HDF5 Registered Filters: - GitHub library for plugins:
58 | - DB: Getting away from storing F9 transformations in metadata
59 | - IV: Thoughts on namespacing codecs?
60 | - NR: Would be helpful
61 | - WF: The overhead of maintenance and administration is daunting - experience from Unidata - How would I guarantee that information would be there in 10 years?
62 | - IV: Pointing at URI where the codec is defined - Having this in Zenodo would be helpful
63 | - IV: ZEP0003 progress
64 | - SV: Waiting for technical review - would help if Martin/IV could make it to the ZEP meetings to raise the discussion
65 | - IV: Sure
66 |
--------------------------------------------------------------------------------
/meetings/2023/2023-11-02.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 2nd November
4 | description: ZEPs Meeting Notes for 2023-11-02
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 25
8 | ---
9 |
10 | # 2023-11-02
11 |
12 | **Attending:** Josh Moore (JM), Sanket Verma (SV), Jeremy Maitin-Shepard (JMS), Jonathan Striebel (JS)
13 |
14 | ## TL;DR:
15 |
16 | In this meeting, the attendees celebrated the acceptance of ZEP0002. The discussion revolved around potential changes to ZEP0, considering a phased voting approach and addressing implementation challenges. Additionally, topics included resolving conflicts in the Endian codecs to bytes pull request, updates to ZEP2 regarding index location, versioning, and tracking intermediate buffer sizes, and the addition of v1 and v2 specs to zarr-specs.
17 |
18 | **Updates:**
19 |
20 | - [ZEP0002](https://zarr.dev/zeps/accepted/ZEP0002.html) finally accepted! 🎉
21 |
22 | **Meeting Minutes:**
23 |
24 | - Changes to ZEP0
25 | -
26 | - JMS: Don't think there's gonna be any changes to ZEP1 now
27 | - JM: Could be 2 voting - but how do you get everyone to implement ZEP without voting?
28 | - SV: Yeah! Good point.
29 | - JMS: Don't need to go overhead for small changes
30 | - JM: `index_location` is backwards compatible and maybe the best case scenario
31 | - JMS: But endian codec is not backward compatible
32 | - SV: Very natural to see this scenario coming up
33 | - JM: Architect building a building - they go through submittals - 20% increment - show the adoption percentage
34 | - SV: How would you define percentage?
35 | - JM: Voting could be in phases - reading phase - implementation change - grace period
36 | - *Jonathan joins*
37 | - JS: Voting encourages people to read the spec
38 | - JM: Setting a common calendar for the ZIC and ZSC - can help the author
39 | - JS: It was good to see the response when we set the deadline - also the current examples of implementation are good
40 | - JMS: Would be great to have a table of what part of the spec they're currently implementing
41 | - JMS: Having a compatibility table would be great - e.g. Neuroglancer doesn't support boolean type
42 | - JMS: Once a ZEP is accepted the spec matters
43 | - JM: Looking at RFC for NGFF these days
44 | - JMS: Random reviews vs the expert group reviews
45 | - SV: If you're going to implement the spec and how you're going to implement the spec - Form a voting procedure around that
46 | - JS: Having a process definitely helped me to finish the sharding - find it good as contributor!
47 | - JM: Having rebuttal would change the tone a lil' bit
48 | - JS: Telling to vote before implementation is fine - as you can find things spec
49 | - JM: Having it defined in ZEP0 would be great!
50 | - JS: Doesn't like the grace period - leave the door open a little
51 | - JS: Having an implementation phase would be good
52 | - SV: Reading notification during the voting phase - could be helpful
53 | - JM: Keeping a table would be helpful - 5 states
54 | - JMS: Three phases -
55 | - reading phase and express opinions
56 | - implementation phase and raise issues
57 | - solve issues raised in the last phase
58 | - JM: It's clear we're in agreement of phases/periods - intent
59 | - JM: People want general confidence from the audience that they're moving forward
60 | - JMS:
61 | - JMS: Getting formal feedback from the ZIC before the vote
62 | - SV: Let ZIC know when you're going to put up the ZEP for voting
63 | - JM: Worst case scenarios - no-one reads the spec and someone vetoes the ZEP in the initial stage
64 | - JM: Roll call helps
65 | - JS: May not need a second vote - those are implementation details
66 | - JM: Add a different state - a pre-implementation checkpoint
67 | - Endian codecs to bytes
68 | -
69 | - JMS: Need to resolve git conflicts
70 | - Changes to ZEP2:
71 | - (add index_location)
72 | - (versioning)
73 | - (tracking intermediate buffer sizes)
74 | - Adding v1 and v2 spec to zarr-specs
75 | -
76 |
--------------------------------------------------------------------------------
/meetings/2023/2023-11-16.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 16th November
4 | description: ZEPs Meeting Notes for 2023-11-16
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 26
8 | ---
9 |
10 | # 2023-11-16
11 |
12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Ward Fisher (WF), Norman Rzepka (NR), Jeremy Maitin-Shepard (JMS)
13 |
14 | ## TL;DR:
15 |
16 | In this meeting, updates were provided on various pull requests, including PR#281 to update ZEP2 status. Discussions revolved around ZEP8, with Jeremy tasked to update the corresponding pull request. Davis provided an update on ZEP6, and topics included cleaning the zarr-specs documentation, considerations about version numbers in codecs and stores, and discussions on refining the ZEP process, including the duration of voting periods and the championing process for new ZEPs. John Bogovic's potential involvement in future meetings for ZEP8 discussions was mentioned.
17 |
18 | **Updates:**
19 |
20 | - Sanket recently added [PR#281](https://github.com/zarr-developers/zarr-specs/pull/281) to update ZEP2 status
21 | - Discussions around ZEP8:
22 | - Jeremy to update the PR
23 | - Davis update ZEP6
24 | - Question to Davis: Status of completion?
25 |
26 | **Meeting Minutes:**
27 |
28 | - Cleaning of zarr-specs.readthedocs.io
29 | - Remove 'Under construction'
30 | - Remove 'Array storage transformers'
31 | - Maybe rename data types to extension data types?
32 | - JMS: [bfloat16 dtype](https://github.com/zarr-developers/zarr-specs/pull/257) should be in core spec under data type table - required and optional table separately
33 | - NR: May need some explanation on the data type
34 | - NR: Remove version number from the codecs and stores
35 | - JMS: Could track down the implementations adopting the different versions of codecs/stores
36 | - NR: No version number in metadata, so not useful
37 | - JMS: But you'd not be allowed to change the metadata
38 | - JM: Helps to write down this; for example a ZEP
39 | - JMS: Implementations might not implement all the versions
40 | - JMS: How about STAC?
41 | - SV: STAC uses incremental versions to evolve the specification
42 | - NR: Zarr V2 & V3 compatibility issues may arise in the future
43 | - JM: Flip side if we're dealing with a long list of codecs - versions may help here
44 | - JM: For example: Bumping the Blosc2 codec
45 | - JMS: Pretty rare to change the versions
46 | - JMS: Having a shorter voting period may help
47 | - JM: Having a step voting period may be troublesome - experience from NGFF and OME-Zarr - strong word for the first phase
48 | - NR: Silence in the second phase?
49 | - JM: It'll be good!
50 | - NR: How do we make the vote earlier?
51 | - JM: One month for roll call - done reading, start implementing, finish implementing, also no vetoes
52 | - JM: Graph for the progress
53 | - JMS: Make revisions after the voting - working for now but not great
54 | - JMS: A word for grace period: `Final revision deadline`
55 | - NR: Finding 1-2 champion for starting a new ZEP
56 | - JM: Having a mailing list to ask for champion
57 | - JM: Close the ZEP proposal if it's not active for some time
58 | - JMS: C++ standard is much more complicated than Zarr - only some people capable of changing the wording
59 | - SV: If someone is not able to find a champion should they not proceed with the ZEP? - Not in favour of the champion process to be the only condition to move forward
60 | - JM: List of endorses and endorsement for the ZEP process
61 | - FYI: John Bogovic may join future ZEP meeting for ZEP8 discussions
62 | - TABLED
63 | - Thoughts on [PR#276](https://github.com/zarr-developers/zarr-specs/pull/276)?
64 |
--------------------------------------------------------------------------------
/meetings/2023/2023-11-30.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 30th November
4 | description: ZEPs Meeting Notes for 2023-11-30
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 27
8 | ---
9 |
10 | # 2023-11-30
11 |
12 | **Attending:** Sanket Verma (SV), Ward Fisher (WF), Jeremy Maitin-Shepard (JMS)
13 |
14 | ## TL;DR:
15 |
16 | In this meeting, discussions included Ward Fisher's virtual talk for AGU, focusing on Zarr as an archival format and its interoperability with NetCDF. There was also consideration of Zarr implementation in FORTRAN and the challenges in upgrading NetCDF FORTRAN modules. Additionally, ongoing work on zarr-developers/zeps/#48 was acknowledged, and the decision on several pull requests, including zarr-developers/zarr-specs/#276, zarr-developers/zarr-specs/#205, and zarr-developers/zarr-specs/#271, was tabled for future consideration.
17 |
18 | **Meeting Minutes:**
19 |
20 | - WF: Recording a virtual talk for AGU next week
21 | - - CMIP6 trying to get rid of NetCDF?
22 | - WF: Zarr as archival? - Trying to be there - much younger format - but NetCDF is a robust archival format
23 | - SV: The interoperability between Zarr and NetCDF is a good thing in here
24 | - WF:
25 | - WF: Zarr implementation in FORTRAN?
26 | - JMS: Are people writing FORTRAN actively?
27 | - WF: Yes - The latest book on FORTRAN came out last year - NCZarr is supported via FORTRAN
28 | - JMS: Zarr V3 FORTRAN would be good
29 | - WF: NetCDF FORTRAN is used by supercomputers across the US
30 | - SV: Why end NetCDF FORTRAN?
31 | - WF: Selfish reasons - takes a lot of time fixing the modules
32 | - SV: Why do supercomputers still use FORTRAN?
33 | - WF: Supercomputers' unparalleled performance using FORTRAN is just great!
34 | - WF: E.g. Mathworks upgrading to a newer NetCDF version is a long process
35 | - WF:
36 | - Good to go with?
37 | - [zarr-developers/zeps/#48](https://github.com/zarr-developers/zeps/pull/48)
38 | - JMS: Still need to work on this - also work on the implementation of the ZEP
39 | - TABLED
40 | - Let's go ahead with [zarr-developers/zarr-specs/#276](https://github.com/zarr-developers/zarr-specs/pull/276)?
41 | - Thoughts on [zarr-developers/zarr-specs/#205](https://github.com/zarr-developers/zarr-specs/pull/205)
42 | - Good to go with?
43 | - [zarr-developers/zarr-specs/#271](https://github.com/zarr-developers/zarr-specs/pull/271)
44 |
--------------------------------------------------------------------------------
/meetings/2023/2023-12-14.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 14th December
4 | description: ZEPs Meeting Notes for 2023-12-14
5 | grand_parent: ZEP meetings
6 | parent: 2023 meetings
7 | nav_order: 28
8 | ---
9 |
10 | # 2023-12-14
11 |
12 | **Attending:** Sanket Verma (SV) and Jeremy Maitin-Shepard (JMS)
13 |
14 | ## TL;DR:
15 |
16 | In this meeting, discussions revolved around the Zarr NYC Sprint and the possibility of hosting ZarrCon at Google. The decision to proceed with `zarr-developers/zarr-specs/#276` was made, with changes committed and awaiting review before merging. Additionally, thoughts were shared on `zarr-developers/zarr-specs/#205`, and decisions were made to merge `zarr-developers/zarr-specs#271` while considering the closure of `zarr-developers/zarr-specs#254`.
17 |
18 | **Meeting Minutes:**
19 |
20 | - Discussion about Zarr NYC Sprint
21 | - ZarrCon @ Google
22 | - JMS: Can be hosted at Google
23 | - JMS: Can provide rooms for conferences but lodging could be a challenge
24 | - JMS: Stephan Hoyer could be interested
25 | - Let's go ahead with [zarr-developers/zarr-specs/#276](https://github.com/zarr-developers/zarr-specs/pull/276)? - Changes committed; will wait for review and then merge
26 | - Thoughts on [zarr-developers/zarr-specs/#205](https://github.com/zarr-developers/zarr-specs/pull/205)
27 | - Good to go with?
28 | - [zarr-developers/zarr-specs/#271](https://github.com/zarr-developers/zarr-specs/pull/271) - MERGED
29 | - [zarr-developers/zarr-specs#280](https://github.com/zarr-developers/zarr-specs/pull/280)
30 | - And close?
31 | - [zarr-developers/zarr-specs#254](https://github.com/zarr-developers/zarr-specs/issues/254)
32 |
--------------------------------------------------------------------------------
/meetings/2023/meeting_notes_2023.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 2023 meetings
4 | description: List of ZEP meeting notes for the year 2023
5 | nav_order: 2
6 | parent: ZEP meetings
7 | has_children: true
8 | permalink: /meetings/2023/
9 | ---
10 |
11 | # ZEP Meeting Notes for 2023
12 |
13 | Shows the list of meeting notes for the year 2023.
14 |
--------------------------------------------------------------------------------
/meetings/2024/2024-01-11.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 11th January
4 | description: ZEPs Meeting Notes for 2024-01-11
5 | grand_parent: ZEP meetings
6 | parent: 2024 meetings
7 | nav_order: 1
8 | ---
9 |
10 | # 2024-01-11
11 |
12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Jeremy Maitin-Shepard (JMS), Ward Fisher (WF)
13 |
14 | ## TL;DR:
15 |
16 | Meeting covers progress updates, resolution discussions for various Zarr specs pull requests, and status check on ZEPs with considerations for improving the review process.
17 |
18 | **Updates:**
19 |
20 | - Happy New Year! 🥂
21 |
22 | **Meeting Minutes:**
23 |
24 | - Let's go ahead with [zarr-developers/zarr-specs/#276](https://github.com/zarr-developers/zarr-specs/pull/276)?
25 | - This would help resolve [zarr-developers/zarr-python#1582](https://github.com/zarr-developers/zarr-python/pull/1582)
26 | - Good to go with?
27 | - [zarr-developers/zarr-specs#280](https://github.com/zarr-developers/zarr-specs/pull/280)
28 | - [zarr-developers/zarr-specs#263](https://github.com/zarr-developers/zarr-specs/pull/263)
29 | - And close?
30 | - [zarr-developers/zarr-specs#254](https://github.com/zarr-developers/zarr-specs/issues/254)
31 | - Status of ZEP6, 7 & 8
32 | - JMS: Will look at it sometime soon
33 | - JMS: Also looking forward to the simplified version of ZEP0
34 | - JM: In NGFF space we're going away from single PR and merge thing as it becomes difficult to manage stuff
35 | - JMS: GitHub is not well suited for resolving comments and stuff
36 | - JM: The webpage becomes the official record of what was discussed and approved
37 | - SV: Process for Tensorstore and Neuroglancer
38 | - JMS: Open an issue and then PR - mostly me and my colleague working on stuff
39 | - JMS: We have internal repository for the changes as well
40 |
--------------------------------------------------------------------------------
/meetings/2024/2024-01-25.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 25th January
4 | description: ZEPs Meeting Notes for 2024-01-25
5 | grand_parent: ZEP meetings
6 | parent: 2024 meetings
7 | nav_order: 2
8 | ---
9 |
10 | # 2024-01-25
11 |
12 | **Attending:** Sanket Verma (SV) and Jeremy Maitin-Shepard (JMS)
13 |
14 | ## TL;DR:
15 |
16 | Discussion centers on revising ZEP0 with proposals to streamline the process, including incorporating reading and implementation phases, and considering methods to increase user engagement and ensure smooth decision-making, with attention given to potential veto options and quorum requirements.
17 |
18 | **Meeting Minutes:**
19 |
20 | - ZEP0 Revision
21 | -
22 | - JMS: Both Ryan's and SV's proposals can work together
23 | - SV: We can have an issue for reading/implementation comments and PR for the actual change
24 | - JMS: The idea of reading and implementation phase is an improvement from the existing proposal
25 | - JMS: Few people care about the Zarr specification, but streamlining the process is equally important
26 | - JMS: Create new issues if the discussion gets long and link it to the PR - and you can add it to the description
27 | - SV: There's also a question of how we can increase the activity of the users in Zarr specification work
28 | - JMS: The time and interests of the various council members depend on various factors
29 | - JMS: Let people veto at anytime, even at the voting phase? - you have processes but in the end you rely on people!
30 | - JMS: We hopefully don't get veto at the end of the stage
31 | - JMS: You should have the veto in the voting phase too! - There should be an option to veto if the problem comes up very late
32 | - JMS: Thinks having a quorum can help
33 | - Good to go with?
34 | - [zarr-developers/zarr-specs#280](https://github.com/zarr-developers/zarr-specs/pull/280)
35 | - SV commented on the PR to get Norman's attention
36 |
--------------------------------------------------------------------------------
/meetings/2024/2024-02-08.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 8th February
4 | description: ZEPs Meeting Notes for 2024-02-08
5 | grand_parent: ZEP meetings
6 | parent: 2024 meetings
7 | nav_order: 3
8 | ---
9 |
10 | # 2024-02-08
11 |
12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Jeremy Maitin-Shepard (JMS)
13 |
14 | ## TL;DR:
15 |
16 | Discussion revolves around refining storage transformer interfaces within ZEP1, exploring options for unified JSON representations, and considering the integration of Parquet with Zarr, alongside ZEP0 discussions focusing on streamlining processes while ensuring compatibility with existing bylaws.
17 |
18 | **Meeting Minutes:**
19 |
20 | -
21 | - JM: The state we left ZEP1 and storage transformer, where does this fit in?
22 | - JMS: Wrap the key-value interface in the existing implementation
23 | - JMS: Kerchunk approach has 1 `.JSON` file and the proposed approach has 2 `.JSON` files
24 | - JMS: Specify any array in-line?
25 | - JM: May look like specifying kerchunk in Zarr which we may or may not want to do
26 | - JMS: Kerchunk approach has keys and values - not exactly readable
27 | - JMS: Various flavours of `.JSON`s can we somehow unify them? - Does it help to have a representation for inline arrays?
28 | - JM: Will comment on the Joe's issue
29 | - JMS: Would be good to get Martin's POV
30 | - JMS: Kerchunk parquet format is worth looking at
31 | - JM: Parquet folks are looking to combine parquet and Zarr - could look at the tabular data as 2D array
32 | - JMS: Do you need to download the whole parquet to access it?
33 | - JM: I think the offset works in parquet
34 | - JM:
35 | - JMS: - created annotations in Tensorstore - spatial query has multi-index grid - sorta same like a sparse-array
36 | - JMS: general missing feature of a cloud database (Josh: cf. work on a graph/zarr version in Spain)
37 | - JM: Will try to get together SpatialData and JMS for discussions to prevent duplicative efforts
38 | - JM: Having URLs as indices and if not generate them on the fly and if you have write access then write to it
39 | - JMS: Annotations don't end up being too large
40 | - JM: Duckdb is worth looking at -
41 | - JMS: Cloud databases need regular maintenance
42 | - JM:
43 | - JM: Building index on the cloud or locally?
44 | - ZEP0 discussions -
45 | - JM: Let's open a PR and go ahead!?
46 | - JMS: Yes!
47 | - JM: In favour of having a lightweight process - but if we reach a point where we have contention then we should go back to the bylaws
48 | - JMS: If the future ZEPs overlap then there would be a problem
49 | - JM: Footnote is useful for future records
50 |
--------------------------------------------------------------------------------
/meetings/2024/2024-02-22.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 22nd February
4 | description: ZEPs Meeting Notes for 2024-02-22
5 | grand_parent: ZEP meetings
6 | parent: 2024 meetings
7 | nav_order: 4
8 | ---
9 |
10 | # 2024-02-22
11 |
12 | **Attending:** Sanket Verma (SV), Ward Fisher (WF), Josh Moore (JM), Martin Durant (MD), Tom Nicholas (TN)
13 |
14 | ## TL;DR:
15 |
16 | The meeting covered the use of LLMs in training, feedback on the Zarr-Specs website redesign, progress on the V3 refactor, and discussions on integrating kerchunk into Zarr, focusing on chunk manifest standardization and virtualized array concatenation.
17 |
18 | **Updates:**
19 |
20 | - HTTP Extension meeting:
21 |
22 | **Meeting Notes:**
23 |
24 | - LLMs and how WF is using them in trainings
25 | - Feedback for new design for Zarr-Specs website (combines ZEP and Zarr-Specs together)
26 | - Link:
27 | - MD: How's V3 refactor work going on these days?
28 | - JM: Quite good progress taking place these days
29 | - SV: V3 PRs can be found here -
30 | - MD:
31 | - TN: Been discussing →
32 | - interested in integrating kerchunk into zarr, especially two ZEPs
33 | - (1) chunk manifest (Joe) - standardizing what chunk json files do
34 | - (2) concatenation -
35 | - 1. manifest: opinion that it's an incredible idea that is very popular
36 | - fsspec relationship makes things complicated
37 | - move to the zarr spec for other implementations?
38 | - goal is readable in any language
39 | - difficult position
40 | - three things to think about
41 | - read byte ranges
42 | - write JSON
43 | - combine module
44 | - roadmap:
45 | - standardize json for the chunks. manifest file?
46 | - JM: storage in zarr array itself
47 | - JM: log file anytime you read a full file into memory
48 | - Josh: virtual zarr (access pattern)
49 | - 2. concatenation
50 | - multi-zarr-to-zarr leads to a loop
51 | - more sense to think of concat of virtualized arrays objects
52 | - see kerchunk array notebook
53 | - read in byte ranges with kerchunk. array class which only stores byte-offset arrays in memory
54 | - can be done in xarray. concat-classes can be put into xarray and can use higher-order API
55 | - JM: store that xarray as a zarr :smile: (but need additional metadata for realizing the array)
56 | - TN: part of notebook that isn't done. exactly.
57 | - common case in geo. multiple NC files, concat those arrays.
58 | - possibly compression options change over time.
59 | - prevents it from being one zarr array
60 | - JM: or just always serialize to the chunk manifest
61 | - JM: i.e. where do we stop? (when does Zarr become Turing Complete?)
62 | - TN: thought at concat (clear use case). but jeremy thought indexing (also clear use case)
63 | - JM: starting to sound like transforms ()
64 | - WF: periodically get requests for operations on the data
65 | - no one has come close to making the argument for adding that into the storage
66 | - so many math libraries that would do it better
67 | - TN: no computation since you don't need the values. can do some subset of concat & indexing without values.
68 | - TN: have now become a zarr producer :tada:
69 | - JM: cross-language motivation
70 | - SV: pyramiding ZEP discussions
71 |
--------------------------------------------------------------------------------
/meetings/2024/2024-03-07.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 7th March
4 | description: ZEPs Meeting Notes for 2024-03-07
5 | grand_parent: ZEP meetings
6 | parent: 2024 meetings
7 | nav_order: 5
8 | ---
9 |
10 | # 2024-03-07
11 |
12 | **Attending:** Sanket Verma (SV), Ward Fisher (WF), Davis Bennett (DB), Josh Moore (JM), Thomas Nicholas (TN), Jeremy Maitin-Shepard (JMS)
13 |
14 | ## TL;DR:
15 |
16 | The meeting discussed enhancing Zarr store browsing, pushing Kerchunk functionality into Zarr, and the potential for a chunk manifest as a ZEP. Additionally, the group explored ideas for revising ZEP0, creating a Zarr specification IETF standard, and shared updates on upcoming Zarr HTTP Extension and SEA conference events.
17 |
18 | **Updates:**
19 |
20 | - Zarr HTTP Extension Meeting next week
21 | - Check here:
22 | - TN:
23 | - Had a conversation with folks over at Development Seed, NASA, Earthmover
24 |
25 | **Meeting Minutes:**
26 |
27 | - TN: Jed wants to have nice ways to browse the Zarr stores - they have nice ways to browse `.tiff` files already
28 | - Wants to propose an extension to add more information in the metadata
29 | - The end result would look more like an Xarray HTML wrapper
30 | - TN:
31 | - Had a conversation with folks over at Development Seed, NASA, Earthmover
32 | - DB: Pushing Kerchunk functionality into Zarr stores
33 | - DB: Whether the feature could be file format agnostic?
34 | - TN: Argues that it should be a ZEP - and can be read by every Zarr implementation
35 | - JM: Having same thing implemented in FSSPEC
36 | - DB: Would ZEP
37 | - WF: HDF5 group may be open to a conversation
38 | - SV: might have some useful information
39 | - TN: _recaps the conversation for JMS_
40 | - TN: Should concatenation be a part of the current ZEP?
41 | - DB: Any reason you don't want to concatenate HDF5 and other file formats?
42 | - TN: Chunk manifest would point inside the arrays - chunk manifest could let you create a Zarr store over other formats as well
43 | - DB: This would make Zarr as an API/access pattern
44 | - TN: Can be created and tested fairly separate to Zarr - personally think chunk manifest is neat feature - implementation can support/not support it
45 | - DB: Array mutation can break the concatenation - having guidelines for archival arrays would help
46 | - TN: Currently we're thinking about read-only case
47 | - TN: Virtualisation in Kerchunk is a spotlight feature
48 | - JMS: Manifest is a good idea and keeping it separate would be a minor difference - needs to align with Kerchunk
49 | - JM: report/ZEP idea (time permitting)
50 | -
51 | - JM: Putting RO-Crate inside Zarr - or making Zarr specification an IETF standard
52 | - JM: Would probably go ahead and write a convention in NGFF space
53 | -
54 | - JM: Have a mechanism for going up/down the hierarchy - useful for the HTTP extension discussions
55 | - Revising ZEP0
56 | - - comments/feedback welcome
57 | - DB: :+1:
58 | - DB: Would be easy to have a single PR for my ZEP
59 | - JMS: Putting narrative document in PR description
60 | - JM: Weird for commenting on the PR description and for the public visibility
61 | - JMS: Rationale can be put down as a footnote
62 | - JMS: Having numeric numbering is something Python follows
63 | - JMS: The actual specification change can also serve as a ZEP narrative
64 | - SV: We can pick out certain sections out of the ZEP narrative document
65 | - JMS: Having a PR template similar to ZEP's narrative could also help us
66 | - WF:
67 | - In-person and virtual registrations are available
68 |
--------------------------------------------------------------------------------
/meetings/2024/2024-03-21.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 21st March
4 | description: ZEPs Meeting Notes for 2024-03-21
5 | grand_parent: ZEP meetings
6 | parent: 2024 meetings
7 | nav_order: 6
8 | ---
9 |
10 | # 2024-03-21
11 |
12 | **Attending:** Sanket Verma (SV), Thomas Nicholas (TN), Ward Fisher (WF)
13 |
14 | ## TL;DR:
15 |
16 | **Updates:**
17 |
18 | - Join ZulipChat:
19 | - HTTP Extension meeting took place on 3/14
20 | - Trying to figure out the best way forward, i.e. a ZEP or not
21 | - Gauging interest and use cases from others in the community
22 |
23 | **Meeting Minutes:**
24 |
25 | - HTTP Extension
26 | - WF: Can see the shape of it, and I think it would be useful
27 | - SV: Existing thread:
28 | - TN: Tom's company may have a use case for the HTTP work
29 | - Showing [VirtualiZarr](https://github.com/TomNicholas/VirtualiZarr) (related to the "chunk manifest" ZEP)
30 | - TN: Been working on the packages for the last 2 weeks - could potentially replace Kerchunk
31 | - TN: _code walkthrough via screen sharing_
32 | - TN: Storing the virtual Zarr manifests, not the actual array values
33 | - TN: Could move `class ManifestArray` to Zarr-Python - arguments in favour and against it
34 | - TN: Could see donating VirtualiZarr to zarr-developers
35 | - SV: **Action items**
36 | - TN to create a topic for VirtualiZarr to gather feedback/comments
37 | - SV to try VirtualiZarr
38 | - TN and SV to work on ZEP Extension proposal for virtual Zarr manifest and formally present it for broader feedback
39 | - TABLED
40 | - Revising ZEP0
41 | -
42 |
--------------------------------------------------------------------------------
/meetings/2024/2024-04-04.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 4th April
4 | description: ZEPs Meeting Notes for 2024-04-04
5 | grand_parent: ZEP meetings
6 | parent: 2024 meetings
7 | nav_order: 7
8 | ---
9 |
10 | # 2024-04-04
11 |
12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Ward Fisher (WF)
13 |
14 | ## TL;DR:
15 |
16 | **Updates:**
17 |
18 | - CZI EOSS6 Application not funded
19 |
20 | **Meeting Minutes:**
21 |
22 | - NASA Grant (WF)
23 | -
24 | - Townhall meeting slides:
25 | - Looking towards sustaining the already established open source software
26 | - NetCDF is looking for collaboration for their application
27 | - JM: Collaborators in the US could be NF, OpenCollective, NVIDIA, Columbia etc.
28 | - JM: Will reach out to NF for their NASA grants' experience
29 |
--------------------------------------------------------------------------------
/meetings/2024/2024-04-18.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 18th April
4 | description: ZEPs Meeting Notes for 2024-04-18
5 | grand_parent: ZEP meetings
6 | parent: 2024 meetings
7 | nav_order: 8
8 | ---
9 |
10 | # 2024-04-18
11 |
12 | **Attending:** Josh Moore (JM), Vincent Immler (VI), Sanket Verma (SV), Ward Fisher (WF), Altay Sansal (AS), Jeremy Maitin-Shepard (JMS)
13 |
14 | ## TL;DR:
15 |
16 | The meeting covered the proposal to remove implicit groups in Zarr, progress on ZEP4 and ZEP3, and updates on V3 implementation. Additionally, discussions included async read optimizations for Zarr and the impact on performance, especially concerning large datasets and parallel data ingestion.
17 |
18 | **Updates:**
19 |
20 | - Davis wants to remove implicit groups:
21 | - Activity going-on at ZEP4 Review PR
22 |
23 | **Meeting Minutes:**
24 |
25 | - Introductions w/ last gift you got
26 | - Sanket - cologne and clothes
27 | - Vincent - wooden board forged with family crest
28 | - Ward - camping tent
29 | - Josh - pecan nuts
30 | - Altay - lead data scientist - lego
31 | - Removing Implicit groups
32 | - JM: Discussed at community meeting - needs to go back to root node to figure out the group
33 | - JM: Tensorstore doesn't use Zarr groups at all
34 | - WF: Supposition from my side
35 | - WF: Dennis completed the V3 implementation!
36 | - JM: Are we closer to parity in V3 work - a question for Dennis!
37 | - VI: How do implicit groups affect performance?
38 | - JM: No, implicit groups means performance improvement
39 | - VI: Working on a new software implementation for students
40 | - JMS: No experience in working with groups
41 | - JM: Lot of callbacks
42 | - JMS: You'd definitely want to remove the looking upward
43 | - AS: Couldn't see a use-case for parallel creation of groups
44 | - JMS: You're ingesting a lot of data in S3 and they read group metadata and have implicit groups
45 | - AS: `.zattrs` would have a race condition?
46 | - JMS: Kind of a niche use-case
47 | - AS: Do multi-processing locks concern metadata?
48 | - JMS: Multiple machines can leverage this!
49 | - AS: Removing would be a good idea!
50 | - AS: ZEP4 and ZEP3 progress
51 | - SV: AS, are you using V2 or V3?
52 | - AS: Using V2 and would love to move to V3 - have 20-30 PB data
53 | - AS: Want to work on `dimension_names` - what would be the best time to do it?
54 | - SV: After V3 release
55 | - VI: _explains GSoC application_
56 | - AS: Hacked Zarr to submit reads in an async manner to the machine to circumvent the problem
57 | - AS: Zarr V3 is going to be fully async, so it helps alleviate the problem
58 | - VI: Would be good to have a way to improve the read speeds for Zarr
59 |
--------------------------------------------------------------------------------
/meetings/2024/2024-05-02.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 2nd May
4 | description: ZEPs Meeting Notes for 2024-05-02
5 | grand_parent: ZEP meetings
6 | parent: 2024 meetings
7 | nav_order: 9
8 | ---
9 |
10 | # 2024-05-02
11 |
12 | **Attending:** Josh Moore (JM), Sanket Verma (SV), Ward Fisher (WF), Jeremy Maitin-Shepard (JMS), Thomas Nicholas (TN)
13 |
14 | ## TL;DR:
15 |
16 | The meeting discussed the upcoming Zarr V3 release candidate, status and integration of the chunk manifest ZEP, and potential revisions for combining ZEPs. They also covered progress on ZEP0 and plans to move ZEP2 to "Active" status, while tabling the discussion on removing implicit groups.
17 |
18 | **Updates:**
19 |
20 | **Meeting Minutes:**
21 |
22 | - WF: Dennis has a PR coming up to revise the Zarr V3 - 6 months of work - release candidate coming soon!
23 | - TN: Status of unfinished ZEP w.r.t. chunk manifest?
24 | - E.g. Have a sharded chunk manifest?
25 | - JMS: Chunk manifest could refer to entire shard - use case might not be clear
26 | - JMS: For viz tools you would not load entire shard at once
27 | - TN: Change the chunk manifest ZEP or accommodate chunk manifest in the sharding codec?
28 | - JMS: No real need to change
29 | - JM: Maybe there's a way to re-write the ZEP in such a way that the existing ZEPs are composable - basically how extensions would interact with each other
30 | - TN:
31 | - JMS: There might be cases where combination of codecs may not work well
32 | - TN: Combining codecs seems straightforward compared to variable chunking which specifies what is allowed and what not
33 | - JMS: Proposals which change the data model are tricky - wanted to add non-zero origin
34 | - TN: Interested in variable chunking
35 | - SV: Would you be interested in contributing to ZEP3?
36 | - JM: We could also start thinking about ZEP2+ZEP3, ZEP3+ZEP4, ZEP4+ZEP2...
37 | - JMS: Any reason for not using Kerchunk?
38 | - TN: Chunk manifest is a clearly defined Zarr store compared to Kerchunk (which kinda looks like Zarr) - reference file-systems are not defined - there's value in getting chunk manifest into the Zarr specification as Kerchunk is more than Zarr
39 | - JMS: The actual implementation of the file-system would be same across the various libraries
40 | - TN: Relying on single maintainer code is not an ideal situation
41 | - SV: ZP V3 implementation was outdated which led to creation of Zarrita and then finally re-using Zarrita for ZP V3 refactor
42 | - TN: Working on nit-picking Xarray for Virtuali-Zarr
43 | - TN: Zarr arrays are kind-of lazy arrays - when you index into Z-arrays they provide you with bytes not the actual Zarr arrays - Xarray has lazy-loading hidden inside its codebase and there has been discussion to make it a standalone library
44 | - JMS: We could have two sizes for chunks - stored size and actual size for variable chunking strategy
45 | - Move ZEP2 from `Accepted` to `Active`
46 | - JM: Would be good to move ZEP1 and ZEP2 both at the same time
47 | - SV: ZP V3 refactor would be a good time to move ZEP1 to active
48 | - Finalise ZEP0 revisions
49 | -
50 | - Re-start the conversation and finalise it
51 | - **TABLED**
52 | - Removing implicit groups -
53 |
--------------------------------------------------------------------------------
/meetings/2024/2024-05-16.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 16th May
4 | description: ZEPs Meeting Notes for 2024-05-16
5 | grand_parent: ZEP meetings
6 | parent: 2024 meetings
7 | nav_order: 10
8 | ---
9 |
10 | # 2024-05-16
11 |
12 | **Attending:** Dennis Heimbigner (DH), Sanket Verma (SV), Josh Moore (JM), Jeremy Maitin-Shepard (JMS)
13 |
14 | ## TL;DR:
15 |
16 | The meeting discussed the release of Zarr-Python 2.18.0, a new blog post by Joe Hamman, and updates on the sharding support in the R implementation. They also covered the implementation of manifest storage transformers, standardizing URLs for Zarr, and the removal of implicit groups in Zarr-Python V3.
17 |
18 | **Updates:**
19 |
20 | **Meeting Minutes:**
21 |
22 | - Zarr-Python 2.18.0 out now:
23 | - One of the last few releases for Zarr Spec 2 - if there's anything you want to get in, please reply/tag us in the PRs/issues
24 | - New blog post by Joe Hamman:
25 | - Zarr-Python developers meeting new schedule - check here:
26 | - Lachlan Deakin added support for sharding in his R implementation:
27 |
28 | **Open agenda (add here 👇🏻):**
29 |
30 | - JM: In sharding you can recurse and browse through the chunks - something like _chunks([x, y])_
31 | - DH: Treating sub-chunks as regular chunks - like what we decided during the storage transformers proposal -
32 | - _DH understands this proposal better and favours it_
33 | - The relevant issue:
34 | - DH: Time to get our hands dirty with the implementation and figure out any problems we have
35 | - JMS: Using storage transformers and codecs in Neuroglancer to achieve sharding
36 | - DH:
37 | - SV: Manifest storage transformers - - defines and implements on top of the storage transformer in V3 core spec - discussion 👇🏻
38 | - JMS: Good to define the `JSON` and add other formats later on
39 | - DH: FSSPEC interprets the URL in Kerchunk
40 | - DH: Having complete key values in URL would help in the long run - DAP made a mistake earlier and we fixed it - having a complete URL is a better option and you can replace the contents within it later on
41 | - JM: Having a complete URL in manifest storage transformer for Zarr would help us but there's a question of backward compatibility
42 | - JMS: standardise the URL
43 | - DH: URL spec defines the format and correct way of defining a URL - if you consider things other than FSSPEC you should have a more standardised URL
44 | - DH: Not conforming to the [URL Spec](https://www.w3.org/Addressing/URL/url-spec.txt) should be actively avoided
45 | - JM: Having the URL defined in the storage transformer would help - currently not defined
46 | - Fix typo - - **MERGED**
47 | - Updated Zarr-Specs license to CC-BY-4.0 -
48 | - Implicit groups removed in Zarr-Python V3 via
49 | - Corresponding PR in Zarr-Specs -
50 |
--------------------------------------------------------------------------------
/meetings/2024/2024-05-30.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 30th May
4 | description: ZEPs Meeting Notes for 2024-05-30
5 | grand_parent: ZEP meetings
6 | parent: 2024 meetings
7 | nav_order: 11
8 | ---
9 |
10 | # 2024-05-30
11 |
12 | **Attending:** Sanket Verma (SV) and Davis Bennett (DB)
13 |
14 | ## TL;DR:
15 |
16 | **Updates:**
17 |
18 | - Zarr + NASA applications survey:
19 | - Zarr-Python 2.18.1 and 2.18.2 were released in the last 2 weeks - they include a couple of minor bug fixes
20 | - Latest update: ZP V3.0 alpha aimed to release this week
21 | - New blog post coming soon in collaboration with NASA POWER project -
22 |
23 | **Meeting Minutes:**
24 |
25 | - DB: Discussion on -
26 | - DB: - This is inconsistent with the current design
27 | - Also doesn't conform to the spec
28 | - The array metadata should be immutable but `codec = CodecPipeline` makes it mutable
29 | - Implicit groups removed in Zarr-Python V3 via
30 | - Corresponding PR in Zarr-Specs -
31 | - SV: There's an informal consensus - we should go ahead with this PR
32 |
--------------------------------------------------------------------------------
/meetings/2024/2024-06-13.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 13th June
4 | description: ZEPs Meeting Notes for 2024-06-13
5 | grand_parent: ZEP meetings
6 | parent: 2024 meetings
7 | nav_order: 12
8 | ---
9 |
10 | # 2024-06-13
11 |
12 | **Attending:** Davis Bennett (DB), Josh Moore (JM), Sanket Verma (SV), Jeremy Maitin-Shepard (JMS)
13 |
14 | ## TL;DR:
15 |
16 | The meeting discussed the timing for moving ZEPs 1 & 2 from "Accepted" to "Final," potential changes to ZEP1 related to v3 codec metadata and variable chunking, and considerations around implementing a chunk manifest with URL support.
17 |
18 | **Updates:**
19 |
20 | **Meeting Minutes:**
21 | - discussion for a future meeting: when are ZEPs 1 & 2 no longer just "Accepted"
22 | - see
23 | - when zarr-python v3 goes GA? some other time?
24 | - Sanket: after "accepted" is "final" for non-process
25 | - Davis: "active" isn't connected in the flowchart
26 | - Sanket:
27 | - Davis: defining ZEP with a ZEP seems problematic
28 | - Josh: certainly not necessary, but that requires "Yet Another Document"
29 | - Sanket: also definitely made a mistake of not taking into account the changes of ZEP0000
30 | - Davis: not a lot of ZEPs. writing the ZEP wasn't a good use of my time.
31 | - *jeremy joins*
32 | - ZEP1 changes ("bugs")
33 | - Davis: v3 codec metadata is cumbersome. could be json metadata rather than a list
34 | - could do this backwards compatible
35 | - as ZEP? Josh: suggest an issue first (like implicit group) then we can discuss
36 | - Jeremy: less likely to go in. need a high benefit (existing data out there, churn, etc.)
37 | - Davis: would argue that this is a wart in the spec and good to document that.
38 | - Davis: clarify relationship between the v3 spec and the codecs
39 | - current spec document is inconsistent
40 | - may impact implementations
41 | - Jeremy: intention was that whether in specs or in the "codecs" that there is a definition. i.e., no problem there and probably an editorial change.
42 | - Davis: variable chunking. extension defines a place to define the chunking (`name=regular` i.e. rectilinear)
43 | - minor version increment that just uses variable chunks ("easier"?)
44 | - on the implementations, if there's only one it's easier in the long-run
45 | - Jeremy: that's probably how the implementation will work. but hard to know if there are other types of chunking in the future. possibly geospatial.
46 | - Davis: propose not supporting the old version (one list of chunk sizes)
47 | - Jeremy: don't think that's workable (to always require full); but for each dimension, to allow an integer or a list. ok to have the identifier and not a lot of work to convert.
48 | - Jeremy: chunk manifest (tabled)
49 | - likely makes it necessary to have URL support
50 | -
51 | - examples use `s3://...`
52 | - Josh: concerned that it's bigger than Zarr
53 | - other things: fsspec, intake, ...
54 | - Davis: is this another way to do sharding? pros / cons to the codec approach? (serves same perhaps as the shard header)
55 | - Jeremy: except there's the binary / plain split.
56 | - Davis: just that there's more than one way to do something
57 | - Jeremy: not unusual that a complicated system has more than one way to do something
58 | - Davis: decision point for people to make. Not something we had in zarr v2
59 | - Jeremy: use shard if you're writing it; use manifest if you have some stuff
60 | - Josh: also you could potentially choose to put a manifest in front of a (old) sharded
61 |
--------------------------------------------------------------------------------
/meetings/2024/2024-06-27.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 27th June
4 | description: ZEPs Meeting Notes for 2024-06-27
5 | grand_parent: ZEP meetings
6 | parent: 2024 meetings
7 | nav_order: 13
8 | ---
9 |
10 | # 2024-06-27
11 |
12 | **Attending:** Davis Bennett (DB) and Sanket Verma (SV)
13 |
14 | ## TL;DR:
15 |
16 | **Updates:**
17 |
18 | - Zarr-Python 3.0.0a0 out
19 | -
20 | - Good momentum and lots of things happening with ZP-V3 - aiming for mid July release
21 | - SV represented Zarr at CZI Open Science 2024 meeting - various groups looking forward to V3 -
22 | - R users at bio-conductor looking to develop bindings for ZP-V3
23 | - New blog post:
24 | - ARCO-ERA5 got updated this week - ~6PB of Zarr data available - check:
25 | - - making weather data easy and accessible to work with
26 | - Check:
27 | - Video tutorial:
28 |
29 | **Meeting Minutes:**
30 |
31 | - SV: Would like to invite Norman for one of the showcase/lightning talks
32 | - DB: Having Tensorstore as backend for Zarr array writers would be good for performance
33 | - SV: How about Rust?
34 | - DB: Similar to C++ (Tensorstore)
35 | - DB: Slicing returns NumPy arrays - we should have lazy slicing API
36 | - DB: Would be good to keep the momentum after V3
37 | - SV: Anything we can do to keep them engaged?
38 | - DB: Not as of now!
39 | - - would like to go ahead with this
40 | - DB: Implicit groups is a big change - maybe we need a major version bump
41 | - SV: If there's a unanimous change then it could be submitted as a PR / Lean ZEP
42 | - DB: Sounds good!
43 | - Move ZEP1 and ZEP2 to `Final`?
44 |
--------------------------------------------------------------------------------
/meetings/2024/meeting_notes_2024.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: 2024 meetings
4 | description: List of ZEP meeting notes for the year 2024
5 | nav_order: 3
6 | parent: ZEP meetings
7 | has_children: true
8 | permalink: /meetings/2024/
9 | ---
10 |
11 | # ZEP Meeting Notes for 2024
12 |
13 | Shows the list of meeting notes for the year 2024.
14 |
--------------------------------------------------------------------------------
/meetings/meetings.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: ZEP meetings
4 | description: Information about bi-weekly ZEP meetings
5 | nav_order: 7
6 | has_children: true
7 | permalink: /meetings/
8 | ---
9 |
10 | # ZEP Meetings
11 | {: .fs-8}
12 |
13 | Agenda, joining instructions and meeting notes for Bi-Weekly ZEP meetings
14 | {: .fs-5 .fw-300 }
15 |
16 | [Join here](https://openmicroscopy-org.zoom.us/j/82447735305?pwd=U3VXTnZBSk84T1BRNjZxaXFnZVQvZz09){: .btn .btn-primary .fs-5 .mb-4 .mb-md-0 .mr-2 }
17 | [Agenda for the upcoming meeting](https://hackmd.io/ZilORe8AQvyqH6ArqDw0Cg?view){: .btn .fs-5 .mb-4 .mb-md-0 }
18 |
19 | ---
20 |
21 |
22 |
23 | Download the [.ics](https://calendar.google.com/calendar/ical/c_ba2k79i3u0lkf49vo0jre27j14%40group.calendar.google.com/public/basic.ics) file and add it to your calendar so you won't miss any of our
24 | meetings!
25 |
--------------------------------------------------------------------------------
/template/header.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: template
4 | description: Template and instructions for proposing a new ZEP
5 | nav_order: 5
6 | has_children: true
7 | permalink: /template/
8 | ---
9 |
10 | # ZEP Template and Instructions
11 |
12 | ### Template and instructions for proposing a new ZEP.
13 |
--------------------------------------------------------------------------------
/template/template.md:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: ZEP0000
4 | description: Template and instructions for proposing a new ZEP
5 | parent: template
6 | nav_order: 1
7 | ---
8 |
9 | # ZEP — Template and Instructions
10 |
11 | ---
12 |
13 | ```
14 | Author:
15 |
16 | Status: < Draft | Active | Accepted | Deferred | Rejected | Withdrawn | Final | Superseded >
17 |
18 | Type:
19 |
20 | Created:
21 |
22 | Discussion: (link to zarr-developers post for discussion)
23 |
24 | Resolution: (required for Accepted | Rejected | Withdrawn)
25 | ```
26 |
27 | ## Abstract
28 |
29 | The abstract should be a short description of what the ZEP will achieve.
30 |
31 | ## Motivation and Scope
32 |
33 | This section describes the need for the proposed change. It should describe the existing problem, who it affects, what it is trying to solve, and why.
34 | This section should explicitly address the scope of and key requirements for the proposed change.
35 |
36 | ## Usage and Impact
37 |
38 | This section describes how users of Zarr will use the new features, spec changes or a new process described in this ZEP. It should be comprised mainly of code examples that wouldn’t be possible
39 | without acceptance and implementation of this ZEP, as well as the impact the proposed changes would have on the ecosystem. This section should be written from
40 | the perspective of the users of Zarr, and the benefit it will provide them; as such, it should include implementation details only if necessary to explain the
41 | functionality.
42 |
43 | ## Backward Compatibility
44 |
45 | This section describes how the ZEP breaks backward compatibility.
46 |
47 | Its purpose is to provide a high-level summary to users who are not interested in detailed technical discussion, but may have opinions around, e.g., usage and
48 | impact.
49 |
50 | ## Detailed description
51 |
52 | This section should provide a detailed description of the proposed change. It should include examples of how the new functionality would be used, intended
53 | use-cases and pseudo-code illustrating its use.
54 |
55 | ## Related Work
56 |
57 | This section should list relevant and/or similar technologies, possibly in other libraries. It does not need to be comprehensive, just list the major examples
58 | of prior and relevant art.
59 |
60 | ## Implementation
61 |
62 | This section lists the major steps required to implement the ZEP. Where possible, it should be noted where one step is dependent on another, and which steps may
63 | be optionally omitted. Where it makes sense, each step should include a link to related pull requests as the implementation progresses.
64 |
65 | Any pull requests or development branches containing work on this ZEP should be linked to from here. (A ZEP does not need to be implemented in a single pull request if
66 | it makes sense to implement it in discrete phases).
67 |
68 | ## Alternatives
69 |
70 | If there were any alternative solutions to solving the same problem, they should be discussed here, along with a justification for the chosen approach.
71 |
72 | ## Discussion
73 |
74 | This section should have links related to any discussion regarding the ZEP. It could be GitHub issues and/or discussions. (Links to past discussions,
75 | if any, go in this section.)
76 |
77 | ## References and Footnotes
78 |
79 | Each ZEP must either be explicitly labelled as placed in the public domain (see this ZEP as an example) or licensed under the
80 | [Open Publication License](https://www.opencontent.org/openpub/).
81 |
82 | ## Copyright
83 |
84 | This document has been placed in the public domain.
85 |
--------------------------------------------------------------------------------
/zarr-implementations-council.markdown:
--------------------------------------------------------------------------------
1 | ---
2 | layout: default
3 | title: implementations council
4 | description: Representatives of various Zarr Implementations
5 | nav_order: 6
6 | permalink: /zic/
7 | ---
8 |
9 | # Zarr Implementation Council 🚀
10 |
11 | The [ZSC](https://github.com/zarr-developers/governance/blob/main/GOVERNANCE.md#zarr-steering-council) have invited Zarr Implementations to participate in the management of the Zarr specification through the Zarr Implementation Council (ZIC). Implementations are selected based on the maturity of implementation as well as the activity of the developer community. Preference will be given to open-source and *open-process* implementations. Multiple implementations in a single programming language may be invited, or such implementations could work together as a single community.
12 |
13 | The implementations currently participating in this process are (in alphabetical order):
14 |
15 | - [constantinpape/z5](https://github.com/constantinpape/z5) represented by [Constantin Pape](https://github.com/constantinpape) ([May 2022 – present](https://github.com/zarr-developers/governance/issues/26))
16 |
17 | - [google/tensorstore](https://github.com/google/tensorstore) represented by [Jeremy Maitin-Shepard](https://github.com/jbms) ([May 2022 – present](https://github.com/zarr-developers/governance/issues/22))
18 |
19 | - [freeman-lab/zarr-js](https://github.com/freeman-lab/zarr-js):
20 | - represented by [Jeremy Freeman](https://github.com/freeman-lab) ([May 2022 – March 2023](https://github.com/zarr-developers/governance/issues/27))
21 | - represented by [Anderson Banihirwe](https://github.com/andersy005) ([March 2023 - present](https://github.com/zarr-developers/governance/pull/36))
22 |
23 | - [gzuidhof/zarr.js](https://github.com/gzuidhof/zarr.js) represented by [Trevor Manz](https://github.com/manzt) ([May 2022 – present](https://github.com/zarr-developers/governance/issues/28))
24 |
25 | - [JuliaIO/Zarr.jl](https://github.com/JuliaIO/Zarr.jl) represented by [Fabian Gans](https://github.com/meggart) ([May 2022 – present](https://github.com/zarr-developers/governance/issues/18))
26 |
27 | - [saalfeldlab/n5-zarr](https://github.com/saalfeldlab/n5-zarr) represented by [Stephan Saalfeld](https://github.com/axtimwalde) ([May 2022 – present](https://github.com/zarr-developers/governance/issues/25))
28 |
29 | - [sci-rs/zarr](https://github.com/sci-rs/zarr) represented by [Andrew Champion](https://github.com/aschampion) ([May 2022 - present](https://github.com/zarr-developers/governance/issues/20))
30 |
31 | - [Unidata/netcdf-c](https://github.com/Unidata/netcdf-c) and [Unidata/netcdf-java](https://github.com/Unidata/netcdf-java) represented by [Ward Fisher](https://github.com/wardf) ([May 2022 - present](https://github.com/zarr-developers/governance/issues/21))
32 |
33 | - [xtensor-stack/xtensor-zarr](https://github.com/xtensor-stack/xtensor-zarr) represented by [David Brochart](https://github.com/davidbrochart) ([May 2022 - present](https://github.com/zarr-developers/governance/issues/23))
34 |
35 | - [zarr-developers/zarr-python](https://github.com/zarr-developers/zarr-python):
36 | - represented by [Gregory Lee](https://github.com/grlee77) ([May 2022 - January 2024](https://github.com/zarr-developers/governance/issues/19))
37 | - represented by [Joe Hamman](https://github.com/jhamman) and seconded by [Davis Bennett](https://github.com/d-v-b/) ([January 2024 - present](https://github.com/zarr-developers/governance/commit/0a12fdf653d5a32c47d9566eb3049d2961880bca))
38 |
39 |
40 | The core developers of each implementation have selected a representative of the ZIC. It is up to each implementation to determine its process for selecting its representatives.
41 |
42 | This member will represent that implementation in decisions regarding the Zarr Specification and other Zarr-wide contexts which require input from implementations.
43 |
44 | An additional representative should also be selected to act as an alternate when the primary representative is unavailable.
45 |
46 | Continued ZIC membership depends on timely feedback and votes on relevant issues. The ZSC also reserves the right to remove implementations from the council.
47 |
--------------------------------------------------------------------------------