├── .gitignore ├── .readthedocs.yml ├── 404.html ├── Gemfile ├── Gemfile.lock ├── LICENSE ├── README.md ├── _config.yml ├── _includes └── custom.html ├── _posts └── 2022-05-26-welcome-to-jekyll.markdown ├── accepted ├── ZEP0001.md ├── ZEP0002.md └── accepted_zeps.md ├── active ├── ZEP0000.md └── active_zep.md ├── assets └── images │ ├── Zarr_accumulation-App_Interface-1.png │ ├── Zarr_accumulation-App_Interface-2.png │ ├── flowchart.png │ ├── sharding.png │ └── zarr.png ├── draft ├── ZEP0003.md ├── ZEP0004.md ├── ZEP0005.md ├── ZEP0009.md └── draft_zeps.md ├── index.md ├── join-community.markdown ├── meetings ├── 2022 │ ├── 2022-09-08.md │ ├── 2022-09-22.md │ ├── 2022-10-06.md │ ├── 2022-10-20.md │ ├── 2022-11-03.md │ ├── 2022-11-17.md │ ├── 2022-12-01.md │ ├── 2022-12-15.md │ └── meeting_notes_2022.md ├── 2023 │ ├── 2023-01-12.md │ ├── 2023-01-26.md │ ├── 2023-02-02.md │ ├── 2023-02-09.md │ ├── 2023-02-16.md │ ├── 2023-02-23.md │ ├── 2023-03-09.md │ ├── 2023-03-16.md │ ├── 2023-03-23.md │ ├── 2023-04-06.md │ ├── 2023-04-20.md │ ├── 2023-05-04.md │ ├── 2023-05-18.md │ ├── 2023-06-01.md │ ├── 2023-06-15.md │ ├── 2023-06-29.md │ ├── 2023-07-13.md │ ├── 2023-07-27.md │ ├── 2023-08-10.md │ ├── 2023-08-24.md │ ├── 2023-09-07.md │ ├── 2023-09-21.md │ ├── 2023-10-05.md │ ├── 2023-10-19.md │ ├── 2023-11-02.md │ ├── 2023-11-16.md │ ├── 2023-11-30.md │ ├── 2023-12-14.md │ └── meeting_notes_2023.md ├── 2024 │ ├── 2024-01-11.md │ ├── 2024-01-25.md │ ├── 2024-02-08.md │ ├── 2024-02-22.md │ ├── 2024-03-07.md │ ├── 2024-03-21.md │ ├── 2024-04-04.md │ ├── 2024-04-18.md │ ├── 2024-05-02.md │ ├── 2024-05-16.md │ ├── 2024-05-30.md │ ├── 2024-06-13.md │ ├── 2024-06-27.md │ └── meeting_notes_2024.md └── meetings.md ├── template ├── header.md └── template.md └── zarr-implementations-council.markdown /.gitignore: -------------------------------------------------------------------------------- 1 | _site 2 | .jekyll-cache 3 | .jekyll-metadata 4 | .DS_Store 5 | 
-------------------------------------------------------------------------------- /.readthedocs.yml: -------------------------------------------------------------------------------- 1 | # .readthedocs.yml 2 | # Read the Docs configuration file 3 | # See https://docs.readthedocs.io/en/stable/config-file/v2.html for details 4 | 5 | # Required 6 | version: 2 7 | 8 | # Set the version of Python and other tools you might need 9 | build: 10 | os: ubuntu-22.04 11 | 12 | tools: 13 | ruby: "3.3" 14 | 15 | commands: 16 | - bundle install 17 | - > 18 | JEKYLL_ENV=production bundle exec jekyll build --destination 19 | _readthedocs/html --baseurl $(echo -n "$READTHEDOCS_CANONICAL_URL" | cut 20 | -d '/' -f 4-) 21 | -------------------------------------------------------------------------------- /404.html: -------------------------------------------------------------------------------- 1 | --- 2 | permalink: /404.html 3 | layout: default 4 | --- 5 | 6 | 19 | 20 |
21 |

404

22 | 23 |

Page not found :(

24 |

The requested page could not be found.

25 |
26 | -------------------------------------------------------------------------------- /Gemfile: -------------------------------------------------------------------------------- 1 | source "https://rubygems.org" 2 | # Hello! This is where you manage which Jekyll version is used to run. 3 | # When you want to use a different version, change it below, save the 4 | # file and run `bundle install`. Run Jekyll with `bundle exec`, like so: 5 | # 6 | # bundle exec jekyll serve 7 | # 8 | # This will help ensure the proper Jekyll version is running. 9 | # Happy Jekylling! 10 | gem "jekyll", "~> 4.3" 11 | # This is the default theme for new Jekyll sites. You may change this to anything you like. 12 | gem "just-the-docs" 13 | # If you want to use GitHub Pages, remove the "gem "jekyll"" above and 14 | # uncomment the line below. To upgrade, run `bundle update github-pages`. 15 | # gem "github-pages", group: :jekyll_plugins 16 | # If you have any plugins, put them here! 17 | group :jekyll_plugins do 18 | gem "jekyll-feed", "~> 0.12" 19 | end 20 | 21 | # Windows and JRuby does not include zoneinfo files, so bundle the tzinfo-data gem 22 | # and associated library. 23 | platforms :mingw, :x64_mingw, :mswin, :jruby do 24 | gem "tzinfo", "~> 1.2" 25 | gem "tzinfo-data" 26 | end 27 | 28 | # Performance-booster for watching directories on Windows 29 | gem "wdm", "~> 0.1.1", :platforms => [:mingw, :x64_mingw, :mswin] 30 | 31 | # Lock `http_parser.rb` gem to `v0.6.x` on JRuby builds since newer versions of the gem 32 | # do not have a Java counterpart. 
33 | gem "http_parser.rb", "~> 0.6.0", :platforms => [:jruby] 34 | 35 | gem "webrick", "~> 1.7" 36 | 37 | gem "jekyll-remote-theme", "~> 0.4.3" 38 | 39 | gem 'jekyll-redirect-from' 40 | 41 | gem 'jekyll-include-cache' 42 | -------------------------------------------------------------------------------- /Gemfile.lock: -------------------------------------------------------------------------------- 1 | GEM 2 | remote: https://rubygems.org/ 3 | specs: 4 | addressable (2.8.6) 5 | public_suffix (>= 2.0.2, < 6.0) 6 | colorator (1.1.0) 7 | concurrent-ruby (1.2.3) 8 | em-websocket (0.5.3) 9 | eventmachine (>= 0.12.9) 10 | http_parser.rb (~> 0) 11 | eventmachine (1.2.7) 12 | ffi (1.16.3) 13 | forwardable-extended (2.6.0) 14 | google-protobuf (3.25.3-arm64-darwin) 15 | google-protobuf (3.25.3-x86_64-linux) 16 | http_parser.rb (0.8.0) 17 | i18n (1.14.1) 18 | concurrent-ruby (~> 1.0) 19 | jekyll (4.3.3) 20 | addressable (~> 2.4) 21 | colorator (~> 1.0) 22 | em-websocket (~> 0.5) 23 | i18n (~> 1.0) 24 | jekyll-sass-converter (>= 2.0, < 4.0) 25 | jekyll-watch (~> 2.0) 26 | kramdown (~> 2.3, >= 2.3.1) 27 | kramdown-parser-gfm (~> 1.0) 28 | liquid (~> 4.0) 29 | mercenary (>= 0.3.6, < 0.5) 30 | pathutil (~> 0.9) 31 | rouge (>= 3.0, < 5.0) 32 | safe_yaml (~> 1.0) 33 | terminal-table (>= 1.8, < 4.0) 34 | webrick (~> 1.7) 35 | jekyll-feed (0.17.0) 36 | jekyll (>= 3.7, < 5.0) 37 | jekyll-include-cache (0.2.1) 38 | jekyll (>= 3.7, < 5.0) 39 | jekyll-redirect-from (0.16.0) 40 | jekyll (>= 3.3, < 5.0) 41 | jekyll-remote-theme (0.4.3) 42 | addressable (~> 2.0) 43 | jekyll (>= 3.5, < 5.0) 44 | jekyll-sass-converter (>= 1.0, <= 3.0.0, != 2.0.0) 45 | rubyzip (>= 1.3.0, < 3.0) 46 | jekyll-sass-converter (3.0.0) 47 | sass-embedded (~> 1.54) 48 | jekyll-seo-tag (2.8.0) 49 | jekyll (>= 3.8, < 5.0) 50 | jekyll-watch (2.2.1) 51 | listen (~> 3.0) 52 | just-the-docs (0.8.0) 53 | jekyll (>= 3.8.5) 54 | jekyll-include-cache 55 | jekyll-seo-tag (>= 2.0) 56 | rake (>= 12.3.1) 57 | kramdown (2.4.0) 58 | 
rexml 59 | kramdown-parser-gfm (1.1.0) 60 | kramdown (~> 2.0) 61 | liquid (4.0.4) 62 | listen (3.9.0) 63 | rb-fsevent (~> 0.10, >= 0.10.3) 64 | rb-inotify (~> 0.9, >= 0.9.10) 65 | mercenary (0.4.0) 66 | pathutil (0.16.2) 67 | forwardable-extended (~> 2.6) 68 | public_suffix (5.0.4) 69 | rake (13.1.0) 70 | rb-fsevent (0.11.2) 71 | rb-inotify (0.10.1) 72 | ffi (~> 1.0) 73 | rexml (3.2.6) 74 | rouge (4.2.0) 75 | rubyzip (2.3.2) 76 | safe_yaml (1.0.5) 77 | sass-embedded (1.71.1-arm64-darwin) 78 | google-protobuf (~> 3.25) 79 | sass-embedded (1.71.1-x86_64-linux-gnu) 80 | google-protobuf (~> 3.25) 81 | terminal-table (3.0.2) 82 | unicode-display_width (>= 1.1.1, < 3) 83 | unicode-display_width (2.5.0) 84 | webrick (1.8.1) 85 | 86 | PLATFORMS 87 | arm64-darwin-21 88 | arm64-darwin-23 89 | x86_64-linux 90 | 91 | DEPENDENCIES 92 | http_parser.rb (~> 0.6.0) 93 | jekyll (~> 4.3) 94 | jekyll-feed (~> 0.12) 95 | jekyll-include-cache 96 | jekyll-redirect-from 97 | jekyll-remote-theme (~> 0.4.3) 98 | just-the-docs 99 | tzinfo (~> 1.2) 100 | tzinfo-data 101 | wdm (~> 0.1.1) 102 | webrick (~> 1.7) 103 | 104 | BUNDLED WITH 105 | 2.5.6 106 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Creative Commons Legal Code 2 | 3 | CC0 1.0 Universal 4 | 5 | CREATIVE COMMONS CORPORATION IS NOT A LAW FIRM AND DOES NOT PROVIDE 6 | LEGAL SERVICES. DISTRIBUTION OF THIS DOCUMENT DOES NOT CREATE AN 7 | ATTORNEY-CLIENT RELATIONSHIP. CREATIVE COMMONS PROVIDES THIS 8 | INFORMATION ON AN "AS-IS" BASIS. CREATIVE COMMONS MAKES NO WARRANTIES 9 | REGARDING THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS 10 | PROVIDED HEREUNDER, AND DISCLAIMS LIABILITY FOR DAMAGES RESULTING FROM 11 | THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS PROVIDED 12 | HEREUNDER. 
13 | 14 | Statement of Purpose 15 | 16 | The laws of most jurisdictions throughout the world automatically confer 17 | exclusive Copyright and Related Rights (defined below) upon the creator 18 | and subsequent owner(s) (each and all, an "owner") of an original work of 19 | authorship and/or a database (each, a "Work"). 20 | 21 | Certain owners wish to permanently relinquish those rights to a Work for 22 | the purpose of contributing to a commons of creative, cultural and 23 | scientific works ("Commons") that the public can reliably and without fear 24 | of later claims of infringement build upon, modify, incorporate in other 25 | works, reuse and redistribute as freely as possible in any form whatsoever 26 | and for any purposes, including without limitation commercial purposes. 27 | These owners may contribute to the Commons to promote the ideal of a free 28 | culture and the further production of creative, cultural and scientific 29 | works, or to gain reputation or greater distribution for their Work in 30 | part through the use and efforts of others. 31 | 32 | For these and/or other purposes and motivations, and without any 33 | expectation of additional consideration or compensation, the person 34 | associating CC0 with a Work (the "Affirmer"), to the extent that he or she 35 | is an owner of Copyright and Related Rights in the Work, voluntarily 36 | elects to apply CC0 to the Work and publicly distribute the Work under its 37 | terms, with knowledge of his or her Copyright and Related Rights in the 38 | Work and the meaning and intended legal effect of CC0 on those rights. 39 | 40 | 1. Copyright and Related Rights. A Work made available under CC0 may be 41 | protected by copyright and related or neighboring rights ("Copyright and 42 | Related Rights"). Copyright and Related Rights include, but are not 43 | limited to, the following: 44 | 45 | i. the right to reproduce, adapt, distribute, perform, display, 46 | communicate, and translate a Work; 47 | ii. 
moral rights retained by the original author(s) and/or performer(s); 48 | iii. publicity and privacy rights pertaining to a person's image or 49 | likeness depicted in a Work; 50 | iv. rights protecting against unfair competition in regards to a Work, 51 | subject to the limitations in paragraph 4(a), below; 52 | v. rights protecting the extraction, dissemination, use and reuse of data 53 | in a Work; 54 | vi. database rights (such as those arising under Directive 96/9/EC of the 55 | European Parliament and of the Council of 11 March 1996 on the legal 56 | protection of databases, and under any national implementation 57 | thereof, including any amended or successor version of such 58 | directive); and 59 | vii. other similar, equivalent or corresponding rights throughout the 60 | world based on applicable law or treaty, and any national 61 | implementations thereof. 62 | 63 | 2. Waiver. To the greatest extent permitted by, but not in contravention 64 | of, applicable law, Affirmer hereby overtly, fully, permanently, 65 | irrevocably and unconditionally waives, abandons, and surrenders all of 66 | Affirmer's Copyright and Related Rights and associated claims and causes 67 | of action, whether now known or unknown (including existing as well as 68 | future claims and causes of action), in the Work (i) in all territories 69 | worldwide, (ii) for the maximum duration provided by applicable law or 70 | treaty (including future time extensions), (iii) in any current or future 71 | medium and for any number of copies, and (iv) for any purpose whatsoever, 72 | including without limitation commercial, advertising or promotional 73 | purposes (the "Waiver"). 
Affirmer makes the Waiver for the benefit of each 74 | member of the public at large and to the detriment of Affirmer's heirs and 75 | successors, fully intending that such Waiver shall not be subject to 76 | revocation, rescission, cancellation, termination, or any other legal or 77 | equitable action to disrupt the quiet enjoyment of the Work by the public 78 | as contemplated by Affirmer's express Statement of Purpose. 79 | 80 | 3. Public License Fallback. Should any part of the Waiver for any reason 81 | be judged legally invalid or ineffective under applicable law, then the 82 | Waiver shall be preserved to the maximum extent permitted taking into 83 | account Affirmer's express Statement of Purpose. In addition, to the 84 | extent the Waiver is so judged Affirmer hereby grants to each affected 85 | person a royalty-free, non transferable, non sublicensable, non exclusive, 86 | irrevocable and unconditional license to exercise Affirmer's Copyright and 87 | Related Rights in the Work (i) in all territories worldwide, (ii) for the 88 | maximum duration provided by applicable law or treaty (including future 89 | time extensions), (iii) in any current or future medium and for any number 90 | of copies, and (iv) for any purpose whatsoever, including without 91 | limitation commercial, advertising or promotional purposes (the 92 | "License"). The License shall be deemed effective as of the date CC0 was 93 | applied by Affirmer to the Work. 
Should any part of the License for any 94 | reason be judged legally invalid or ineffective under applicable law, such 95 | partial invalidity or ineffectiveness shall not invalidate the remainder 96 | of the License, and in such case Affirmer hereby affirms that he or she 97 | will not (i) exercise any of his or her remaining Copyright and Related 98 | Rights in the Work or (ii) assert any associated claims and causes of 99 | action with respect to the Work, in either case contrary to Affirmer's 100 | express Statement of Purpose. 101 | 102 | 4. Limitations and Disclaimers. 103 | 104 | a. No trademark or patent rights held by Affirmer are waived, abandoned, 105 | surrendered, licensed or otherwise affected by this document. 106 | b. Affirmer offers the Work as-is and makes no representations or 107 | warranties of any kind concerning the Work, express, implied, 108 | statutory or otherwise, including without limitation warranties of 109 | title, merchantability, fitness for a particular purpose, non 110 | infringement, or the absence of latent or other defects, accuracy, or 111 | the present or absence of errors, whether or not discoverable, all to 112 | the greatest extent permissible under applicable law. 113 | c. Affirmer disclaims responsibility for clearing rights of other persons 114 | that may apply to the Work or any use thereof, including without 115 | limitation any person's Copyright and Related Rights in the Work. 116 | Further, Affirmer disclaims responsibility for obtaining any necessary 117 | consents, permissions or other rights required for any use of the 118 | Work. 119 | d. Affirmer understands and acknowledges that Creative Commons is not a 120 | party to this document and has no duty or obligation with respect to 121 | this CC0 or use of the Work. 
122 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Zarr Enhancement Proposals (ZEPs) 2 | 3 | ## Community Feedback Process for Zarr Specifications 4 | 5 | ZEP stands for Zarr Enhancement Proposal. A ZEP is a design document providing 6 | information to the Zarr community, describing a modification or enhancement of 7 | the Zarr specifications or a new feature for its processes or environment. The 8 | ZEP should provide specific proposed changes to the Zarr specification and a 9 | narrative rationale for the specification changes. 10 | 11 | We intend ZEPs to be the primary mechanism for evolving the spec, collecting 12 | community input on significant issues and documenting the design decisions that 13 | have gone into Zarr. 14 | 15 | ## Proposing a new ZEP 16 | 17 | ZEPs should be submitted as a draft ZEP in the [draft folder](https://github.com/zarr-developers/zeps/tree/main/draft) 18 | of this (*zeps*) repository via a [GitHub pull request](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request). 19 | 20 | The PR should contain the narrative text of the ZEP with the name `zep-.md` 21 | where `` is an appropriately assigned four-digit number. The draft ZEP must 22 | use the [ZEP X - Template and Instructions](https://zarr.dev/zeps/template/template.html) 23 | file. 24 | 25 | To read more on `Submitting a ZEP`, please refer to [this section](https://zarr.dev/zeps/active/ZEP0000.html#submitting-a-zep). 26 | 27 | ## Contributing to ZEPs 28 | 29 | The ZEPs in this repo are published automatically on the web at 30 | https://zarr.dev/zeps/. If you wish to contribute to the website, please build 31 | the website locally on your machine. Building this website requires [Jekyll](http://jekyllrb.com/). 
32 | Refer to [this](https://jekyllrb.com/docs/) to install Jekyll. 33 | 34 | Steps to contribute: 35 | 36 | 1. Fork this repo 37 | 2. cd into the forked repo 38 | 3. Type `bundle exec jekyll serve --incremental` 39 | or `docker run -p 4000:4000 -v $(pwd):/site bretfisher/jekyll-serve` 40 | 4. Open a browser and go to http://localhost:4000/zeps/ to see the 41 | website 42 | 5. Make desired changes, save them and refresh http://localhost:4000/zeps/ to see 43 | the changes 44 | 45 | Once done, push your changes to your fork and open a 46 | [GitHub pull request](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request). 47 | 48 | -------------------------------------------------------------------------------- /_config.yml: -------------------------------------------------------------------------------- 1 | # Welcome to Jekyll! 2 | # 3 | # This config file is meant for settings that affect your whole blog, values 4 | # which you are expected to set up once and rarely edit after that. If you find 5 | # yourself editing this file very often, consider using Jekyll's data files 6 | # feature for the data you need to update frequently. 7 | # 8 | # For technical reasons, this file is *NOT* reloaded automatically when you use 9 | # 'bundle exec jekyll serve'. If you change this file, please restart the server process. 10 | # 11 | # If you need help with YAML syntax, here are some quick references for you: 12 | # https://learn-the-web.algonquindesign.ca/topics/markdown-yaml-cheat-sheet/#yaml 13 | # https://learnxinyminutes.com/docs/yaml/ 14 | # 15 | # Site settings 16 | # These are used to personalize your new site. If you look in the HTML files, 17 | # you will see them accessed via {{ site.title }}, {{ site.email }}, and so on. 18 | # You can create any custom variable you would like, and they will be accessible 19 | # in the templates via {{ site.myvariable }}. 
20 | 21 | title: ZEP 22 | email: zarrdevelopers@gmail.com 23 | description: >- # this means to ignore newlines until "baseurl:" 24 | ZEP (Zarr Enhancement Proposal) is a community feedback process for Zarr Specifications. 25 | A ZEP is a design document providing information to the Zarr community, describing a 26 | modification or enhancement of the Zarr specifications, a new feature for its 27 | processes or environment. 28 | baseurl: "/zeps" # the subpath of your site, e.g. /blog 29 | url: "https://zarr.dev" # the base hostname & protocol for your site, e.g. http://example.com 30 | #twitter_username: jekyllrb 31 | #github_username: jekyll 32 | 33 | # Set a path/url to a logo that will be displayed instead of the title 34 | logo: "/assets/images/zarr.png" 35 | 36 | # Google Analytics Tracking 37 | ga_tracking: G-BCRR9QE7Z0 38 | ga_tracking_anonymize_ip: true # Use GDPR compliant Google Analytics settings (true/nil by default) 39 | 40 | # Aux links for the upper right navigation 41 | aux_links: 42 | "Zarr Homepage": 43 | - "https://zarr.dev/" 44 | 45 | # Makes Aux links open in a new tab. Default is false 46 | aux_links_new_tab: true 47 | 48 | # Build settings 49 | remote_theme: pmarsceill/just-the-docs 50 | plugins: 51 | - jekyll-feed 52 | - jekyll-remote-theme 53 | - jekyll-redirect-from 54 | - jekyll-include-cache 55 | 56 | # Exclude from processing. 57 | # The following items will not be processed, by default. 58 | # Any item listed under the `exclude:` key here will be automatically added to 59 | # the internal "default list". 60 | # 61 | # Excluded items can be processed by explicitly listing the directories or 62 | # their entries' file path in the `include:` list. 
63 | # 64 | # exclude: 65 | # - .sass-cache/ 66 | # - .jekyll-cache/ 67 | # - gemfiles/ 68 | # - Gemfile 69 | # - Gemfile.lock 70 | # - node_modules/ 71 | # - vendor/bundle/ 72 | # - vendor/cache/ 73 | # - vendor/gems/ 74 | # - vendor/ruby/ 75 | -------------------------------------------------------------------------------- /_includes/custom.html: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /_posts/2022-05-26-welcome-to-jekyll.markdown: -------------------------------------------------------------------------------- 1 | --- 2 | layout: post 3 | title: "Welcome to Jekyll!" 4 | date: 2022-05-26 03:31:52 +0530 5 | categories: jekyll update 6 | --- 7 | You’ll find this post in your `_posts` directory. Go ahead and edit it and re-build the site to see your changes. You can rebuild the site in many different ways, but the most common way is to run `jekyll serve`, which launches a web server and auto-regenerates your site when a file is updated. 8 | 9 | Jekyll requires blog post files to be named according to the following format: 10 | 11 | `YEAR-MONTH-DAY-title.MARKUP` 12 | 13 | Where `YEAR` is a four-digit number, `MONTH` and `DAY` are both two-digit numbers, and `MARKUP` is the file extension representing the format used in the file. After that, include the necessary front matter. Take a look at the source for this post to get an idea about how it works. 14 | 15 | Jekyll also offers powerful support for code snippets: 16 | 17 | {% highlight ruby %} 18 | def print_hi(name) 19 | puts "Hi, #{name}" 20 | end 21 | print_hi('Tom') 22 | #=> prints 'Hi, Tom' to STDOUT. 23 | {% endhighlight %} 24 | 25 | Check out the [Jekyll docs][jekyll-docs] for more info on how to get the most out of Jekyll. File all bugs/feature requests at [Jekyll’s GitHub repo][jekyll-gh]. If you have questions, you can ask them on [Jekyll Talk][jekyll-talk]. 
26 | 27 | [jekyll-docs]: https://jekyllrb.com/docs/home 28 | [jekyll-gh]: https://github.com/jekyll/jekyll 29 | [jekyll-talk]: https://talk.jekyllrb.com/ 30 | -------------------------------------------------------------------------------- /accepted/accepted_zeps.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: accepted ZEPs 4 | description: List of Accepted ZEPs 5 | nav_order: 2 6 | has_children: true 7 | permalink: /accepted_zeps/ 8 | --- 9 | 10 | # Accepted ZEPs 11 | 12 | ### Shows the list of Accepted ZEPs. 13 | -------------------------------------------------------------------------------- /active/active_zep.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: active ZEPs 4 | description: List of Active ZEPs 5 | nav_order: 3 6 | has_children: true 7 | permalink: /active_zeps/ 8 | --- 9 | 10 | # Active ZEPs 11 | 12 | ### Shows the list of Active ZEPs. 
13 | -------------------------------------------------------------------------------- /assets/images/Zarr_accumulation-App_Interface-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zarr-developers/zeps/a644e70eae9a247ee0895d977d025d34fce35adb/assets/images/Zarr_accumulation-App_Interface-1.png -------------------------------------------------------------------------------- /assets/images/Zarr_accumulation-App_Interface-2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zarr-developers/zeps/a644e70eae9a247ee0895d977d025d34fce35adb/assets/images/Zarr_accumulation-App_Interface-2.png -------------------------------------------------------------------------------- /assets/images/flowchart.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zarr-developers/zeps/a644e70eae9a247ee0895d977d025d34fce35adb/assets/images/flowchart.png -------------------------------------------------------------------------------- /assets/images/sharding.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zarr-developers/zeps/a644e70eae9a247ee0895d977d025d34fce35adb/assets/images/sharding.png -------------------------------------------------------------------------------- /assets/images/zarr.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zarr-developers/zeps/a644e70eae9a247ee0895d977d025d34fce35adb/assets/images/zarr.png -------------------------------------------------------------------------------- /draft/ZEP0003.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: ZEP0003 4 | description: Variable chunk sizes 5 | parent: draft ZEPs 6 | nav_order: 3 7 | --- 8 | 9 | # ZEP 3 
— Variable chunking 10 | 11 | Authors: 12 | * Martin Durant ([@martindurant](https://github.com/martindurant)), Anaconda, Inc. 13 | * Isaac Virshup ([@ivirshup](https://github.com/ivirshup)), Helmholtz Munich 14 | 15 | Status: Draft 16 | 17 | Type: Specification 18 | 19 | Created: 2022-10-17 20 | 21 | Discussion: https://github.com/orgs/zarr-developers/discussions/52 22 | 23 | ## Abstract 24 | 25 | To allow the chunks of a zarr array to form a rectangular grid rather than a regular grid, 26 | with the chunk 27 | lengths along any dimension given as a list of integers rather than a single chunk size. 28 | 29 | ## Motivation and Scope 30 | 31 | The use cases given below have motivated this. However, this generalisation of Zarr's storage 32 | model can be seen as an optional enhancement, and it is the same data model as currently used by dask.array. 33 | 34 | - when producing a [kerchunked](https://github.com/fsspec/kerchunk) dataset, the native chunking of the targets 35 | cannot be changed. It is common 36 | to have non-regular chunking on at least one dimension, such as a time dimension with one sample per day and chunks 37 | of one month or one year. The change would allow these datasets to be read via kerchunk, and/or converted to 38 | zarr with equivalent chunking to the original. Such data cannot currently be represented in zarr. 39 | - [awkward](https://github.com/scikit-hep/awkward) arrays, ragged arrays and sparse data can be represented as 40 | a set of one-dimensional arrays, with an appropriate metadata description convention. The sizes of the chunks 41 | of each component array corresponding to a logical chunk of the overall array will not, in general, be equal 42 | to each other within a single logical chunk, nor consistent between chunks, as each row in the matrix can have a variable number 43 | of non-zero values. 44 | - sensor data may not come in fixed increments; variably chunked storage would suit parallel writing. 
45 | With variable chunk sizes, writers just need to make sure the offsets are 46 | correct once done. Otherwise, the write locations of chunks depend on previous chunks. 47 | - in some cases, parts of the overall data array may have very different data distributions, and it can 48 | be very convenient to partition the data by such characteristics to allow, for example, for more efficient encoding 49 | schemes. 50 | - when filtering regular table data on one column and applying to other columns, you necessarily end up with an unequal 51 | number of values in each chunk, which zarr does not currently handle. 52 | 53 | ## Usage and Impact 54 | 55 | ### Creation 56 | 57 | ```python 58 | zarr.create(1000, chunks=((100, 300, 500, 100),)) 59 | ``` 60 | 61 | 62 | ## Backward Compatibility 63 | 64 | 65 | This change is fully backward compatible - all old data will remain usable. However, data written with 66 | variable chunks will not be readable by older versions of Zarr. It would be reasonable to wish to backport the 67 | feature to v2. 68 | 69 | ## Detailed description 70 | 71 | Currently, the array metadata specifies the chunking scheme as follows 72 | (see https://zarr-specs.readthedocs.io/en/latest/core/v3.0.html#chunk-grid) 73 | ```json 74 | { 75 | "type": "regular", 76 | "chunk_shape": [10, 10], 77 | "separator":"/" 78 | } 79 | ``` 80 | 81 | The proposal is to allow metadata of the form 82 | ```json 83 | { 84 | "type": "rectangular", 85 | "chunk_shape": [[5, 5, 5, 15, 15, 20, 35], 10], 86 | "separator":"/" 87 | } 88 | ``` 89 | Each element of `chunk_shape`, corresponding to each dimension of the array, may be a single integer, as before, 90 | or a list of integers which add up to the size of the array in that dimension. In this example, the single value 91 | of `10` for the chunks on the second dimension would be identical to `[10, 10, 10, 10, 10, 10, 10, 10, 10, 10]`. 92 | The number of values in the list is equal to the number of chunks along that dimension. 
Thus, a "rectangular" 93 | grid may be fully compatible with a "regular" grid. 94 | 95 | The data index bounds on a dimension of each hyperrectangle are formed by a cumulative sum of the chunk values, 96 | starting at 0. 97 | ``` 98 | bounds_axis0 = [0, 5, 10, 15, 30, 45, 65, 100] 99 | bounds_axis1 = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100] 100 | ``` 101 | such that key "c0/0" contains values for indices [0, 5) along the first dimension and [0, 10) along the second dimension. 102 | An array index of (17, 17) would be found in key "c3/1", index (2, 7). 103 | 104 | ## Related Work 105 | 106 | ### Dask 107 | 108 | `dask.array` uses rectangular chunking internally, and is one of the major consumers of zarr data. Much of the 109 | code translating logical slices into slices on the individual chunks should be reusable. 110 | 111 | ### Parquet/ Arrow 112 | 113 | Arrow describes tables as a collection of record batches. There is no restriction on the size of these batches. 114 | This is not only very flexible, but can be used as an indexing strategy for low cardinality columns within parquet. 115 | 116 | ``` 117 | dataset_name/ 118 | year=2007/ 119 | month=01/ 120 | 0.parq 121 | 1.parq 122 | ... 123 | month=02/ 124 | 0.parq 125 | 1.parq 126 | ... 127 | month=03/ 128 | ... 129 | year=2008/ 130 | month=01/ 131 | ... 132 | ... 133 | ``` 134 | 135 | This feature was cited as one of the reasons parquet was chosen over zarr for dask 136 | dataframes: https://github.com/dask/dask/issues/1599 137 | 138 | ### awkward array 139 | 140 | https://github.com/zarr-developers/zarr-specs/issues/62 141 | 142 | 143 | ## Implementation 144 | 145 | It is to be hoped that much code can be adapted from dask.array, which already allows variable chunk sizes 146 | on each dimension. 
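The index lookup described under "Detailed description" can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the proposal or of any zarr API: the helper names `chunk_bounds` and `locate` are hypothetical, a single integer is expanded as a regular grid whose last chunk is truncated at the array edge, and each chunk covers a half-open index interval.

```python
from bisect import bisect_right
from itertools import accumulate


def chunk_bounds(chunks, size):
    """Return cumulative chunk bounds (starting at 0) for one dimension.

    `chunks` is either a single int (regular chunking) or a list of ints
    that must add up to `size` (rectangular chunking).
    """
    if isinstance(chunks, int):
        # Assumption: expand a regular grid, truncating the last chunk at the edge.
        n_full, rem = divmod(size, chunks)
        chunks = [chunks] * n_full + ([rem] if rem else [])
    bounds = [0, *accumulate(chunks)]
    if bounds[-1] != size:
        raise ValueError("chunk lengths must add up to the array size")
    return bounds


def locate(index, chunk_shape, shape):
    """Map an array index to (chunk key, index within that chunk)."""
    chunk_ids, offsets = [], []
    for i, chunks, size in zip(index, chunk_shape, shape):
        if not 0 <= i < size:
            raise IndexError(f"index {i} out of range for dimension of size {size}")
        bounds = chunk_bounds(chunks, size)
        # Chunk k covers the half-open interval [bounds[k], bounds[k + 1]).
        k = bisect_right(bounds, i) - 1
        chunk_ids.append(k)
        offsets.append(i - bounds[k])
    # Key layout follows the "c3/1" style used in the text above.
    return "c" + "/".join(map(str, chunk_ids)), tuple(offsets)


chunk_shape = [[5, 5, 5, 15, 15, 20, 35], 10]
key, inner = locate((17, 17), chunk_shape, (100, 100))
print(key, inner)  # c3/1 (2, 7)
```

A binary search over the cumulative bounds keeps the lookup at O(log n) in the number of chunks per dimension. A real implementation would likely compute the bounds once per array rather than on every call.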
147 | 148 | ## Alternatives 149 | 150 | ### Just tune chunk sizes 151 | 152 | https://github.com/zarr-developers/zarr-specs/issues/62#issuecomment-1100806513 153 | 154 | 155 | ## Discussion 156 | 157 | 158 | ## References and Footnotes 159 | 160 | * Previous discussion: 161 | * [Zarr Dask Table dask/dask#1599](https://github.com/dask/dask/issues/1599) 162 | * [Protocol extensions for awkward arrays zarr-developers/zarr-specs#62](https://github.com/zarr-developers/zarr-specs/issues/62) 163 | * [Handling arrays with non-uniform chunking zarr-developers/zarr-specs#40](https://github.com/zarr-developers/zarr-specs/issues/40) 164 | * [Chunk spec zarr-developers/zarr-spec#7](https://github.com/zarr-developers/zarr-specs/issues/7#issuecomment-468127219) 165 | 166 | 167 | 168 | ## Copyright 169 | 170 | This document has been placed in the public domain. 171 | -------------------------------------------------------------------------------- /draft/ZEP0005.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: ZEP0005 4 | description: This ZEP proposes a Zarr extension for an algorithm developed at NASA’s GES DISC for fast and cost-efficient multi-dimensional averaging service -- Zarr-based Chunk-level Accumulation in Reduced Dimensions. 5 | parent: draft ZEPs 6 | nav_order: 5 7 | --- 8 | 9 | # ZEP 5 — Zarr-based Chunk-level Accumulation in Reduced Dimensions 10 | 11 | Authors: 12 | * Hailiang Zhang ([@hailiangzhang](https://github.com/hailiangzhang)), Adnet Systems Inc, NASA Goddard Space Flight Center. 13 | * Mahabal Hegde ([@nasahegde](https://github.com/nasahegde)), NASA Goddard Space Flight Center. 14 | * Christine Smit ([@christine-e-smit](https://github.com/christine-e-smit)), Telophase Co, NASA Goddard Space Flight Center. 15 | * Brianna Pagan ([@briannapagan](https://github.com/briannapagan)), Adnet Systems Inc, NASA Goddard Space Flight Center. 
16 | * Dieu My Nguyen ([@dieumynguyen](https://github.com/dieumynguyen)), Adnet Systems Inc, NASA Goddard Space Flight Center. 17 | 18 | Status: Draft 19 | 20 | Type: Specification 21 | 22 | Created: 2023-02-12 23 | 24 | Discussion: 25 | 26 | ## Abstract 27 | 28 | At NASA GES DISC, we receive a large number of user requests each day for a variety of analysis and visualization services involving averaging along one or more dimensions, some of which are computationally expensive when running against large amounts of geospatial data. We proposed a generic and dimension-agnostic method based on chunk-level cumulative sums (accumulation) on a regular grid, which provides fast and cost-efficient cloud analysis for multidimensional averaging services. This method introduces a small adjustable set of auxiliary data on top of the raw data, and dramatically reduces the computational time by orders of magnitude based on chunk-level accumulation along one or more dimensions. 29 | 30 | We hereby propose a Zarr extension for this chunk-level accumulation approach. In this proposal, we will present a Zarr group for the accumulation data, a JSON schema for the accumulation group attribute, a JSON schema for the accumulation data array attribute, and an example of the user application interface. 31 | 32 | ## Motivation and Scope 33 | 34 | At NASA GES DISC, our use case is for computing averages along a range of data dimensions (e.g., space, time) in a cost effective and highly performant manner. In the Geo-spatial community, computing area or temporal averages over a long range of observations is popular with users. Performing the averaging operation over data in Zarr generally requires a full scan of the data. This can be parallelized with Dask (or any other distributed framework), but reading all of the data is an unavoidable bottleneck. 
Our proposed approach, Zarr-based Chunk-level Accumulation, will improve the speed and reduce the cost of long-range calculations by loading only a few data chunks at the averaging boundary. Here, we provide examples of how the approach is applied to geospatial data with temporal and spatial dimensions; however, it’s noteworthy that this approach is dimension-agnostic and can be generalized for all types of multidimensional data aggregation services with Zarr. 35 | 36 | ## Detailed description 37 | 38 | The core idea of this Zarr-based Chunk-level Accumulation algorithm is to pre-compute [cumulative sums](https://mathworld.wolfram.com/CumulativeSum.html) of data values and weights/counts along data dimensions at the chunk intervals. These cumulative sums can then be used to find data averages for dimension ranges. 39 | 40 | Example: 41 | 42 | For a sequence `A=(a, b, c, d)`, the cumulative sums are `S=(s0, s1, s2, s3)` where `s0=a, s1=a+b, s2=a+b+c, s3=a+b+c+d`. The average of the sequence over a range can now be calculated using the cumulative sums. For example, assuming zero-based indexing, `average(A[1:])=(S[3]-S[0])/3=(s3-s0)/3`. 43 | 44 | The above example can be extended and generalized for multiple dimensions. This reduces the cost of computing averages from *O(N^m)* to *O(1)* for the dimensions being averaged, where *N* is the number of data values and *m* is the number of dimensions to be averaged. See our [ESIP 2022 presentation](https://www.youtube.com/watch?v=ac_UKunUrNM&t=2250s) (and the [slides](https://docs.google.com/presentation/d/1RNvkIlCFvtoy89OTMzQNn_0jixOpdhnu/edit?usp=sharing&ouid=106287227661991623566&rtpof=true&sd=true)) for a more detailed description. 45 | 46 | ## Implementation 47 | 48 | We propose to formalize this Zarr-based Chunk-level Accumulation approach as a Zarr extension. To implement this approach, cumulative sums are computed at chunk intervals and are stored in a Zarr group. 
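The one-dimensional cumulative-sum example from the detailed description can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the proposed API: the function name `range_mean` and the sample values are assumptions made for the sketch, and it works on individual values rather than on chunk-level sums.

```python
from itertools import accumulate


def range_mean(cumsum, start, stop):
    """Mean of values[start:stop], using only inclusive cumulative sums.

    cumsum[i] is assumed to hold values[0] + ... + values[i].
    """
    if not 0 <= start < stop <= len(cumsum):
        raise ValueError("need 0 <= start < stop <= len(cumsum)")
    upper = cumsum[stop - 1]
    # Nothing to subtract when the range starts at the first element.
    lower = cumsum[start - 1] if start > 0 else 0
    return (upper - lower) / (stop - start)


values = [2.0, 4.0, 6.0, 8.0]         # A = (a, b, c, d), sample data
cumsum = list(accumulate(values))     # S = (s0, s1, s2, s3)
mean_tail = range_mean(cumsum, 1, 4)  # average(A[1:]) = (s3 - s0) / 3
```

Whatever the length of the range, the lookup touches at most two entries of `cumsum`, which is the property the chunk-level scheme relies on.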
The API for averaging the data fetches the necessary pre-computed sums based on the user-requested dimensions (e.g., time) and dimension ranges (e.g., from 1980 to 1990). 49 | 50 | Please note that this solution is also applicable for storing chunk statistics (min, max, sum, count, etc.) to help with performing aggregations. 51 | 52 | ### Zarr group structure of accumulation data 53 | 54 | Rather than storing the chunk-level statistics in a separate store, we could store them inline with the arrays they are derived from. This would enable other applications to take advantage of such pre-computed data to optimize queries. This is similar to an optimization in Snowflake ([twitter link](https://twitter.com/teej_m/status/1546591452750159873)). 55 | 56 | The accumulation datasets are organized in a data group adjacent to the raw data and dimension arrays with the following structure: 57 | ``` 58 | ├── ${dimension_array} 59 | ├── ... 60 | ├── ${raw_dataset} 61 | ├── ... 62 | └── ${raw_dataset}_accumulation_group 63 | ├── .zgroup 64 | ├── .zattr 65 | ├── ${accumulation_dataset_1} 66 | │ ├── .zarray 67 | │ ├── .zattr 68 | │ └── ... 69 | ├── ${accumulation_dataset_2} 70 | │ ├── .zarray 71 | │ ├── .zattr 72 | │ └── ... 73 | ... 74 | ``` 75 | 76 | where `${dimension_array}` is the data array for the dimension variable, `${raw_dataset}` is the data array for the raw dataset, `${raw_dataset}_accumulation_group` is the group for accumulation, and `${accumulation_dataset_1}` and `${accumulation_dataset_2}` are the data arrays for each accumulation dataset. 77 | 78 | ### Zarr attribute file of accumulation group 79 | 80 | The accumulation group attribute file, `${raw_dataset}_accumulation_group/.zattr`, provides details of the accumulation implementation and data organization. 
It follows the JSON schema shown below: 81 | ``` 82 | { 83 | "$schema": "http://json-schema.org/draft-07/schema#", 84 | "type": "object", 85 | "definitions": { 86 | "accumulation_data_array": { 87 | "type": "object", 88 | "properties": { 89 | "_DATA_UNWEIGHTED": { 90 | "type": "string" 91 | }, 92 | "_DATA_WEIGHTED": { 93 | "type": "string" 94 | }, 95 | "_WEIGHTS": { 96 | "type": "string" 97 | } 98 | }, 99 | "patternProperties": { 100 | "^(?!_DATA_UNWEIGHTED|_DATA_WEIGHTED|_WEIGHTS).*$": { 101 | "$ref": "#/definitions/accumulation_data_array" 102 | } 103 | }, 104 | "additionalProperties": false 105 | } 106 | }, 107 | "properties": { 108 | "_ACCUMULATION_GROUP": { 109 | "type": "object", 110 | "patternProperties": { 111 | "^(?!_DATA_UNWEIGHTED|_DATA_WEIGHTED|_WEIGHTS).*$": { 112 | "$ref": "#/definitions/accumulation_data_array" 113 | } 114 | }, 115 | "additionalProperties": false 116 | } 117 | }, 118 | "required": [ 119 | "_ACCUMULATION_GROUP" 120 | ] 121 | } 122 | ``` 123 | 124 | The recursive definition (`#/definitions/accumulation_data_array`) under the schema root (`_ACCUMULATION_GROUP`) provides details of the cumulative sum statistics, including the dataset names, accumulation types and dimensions. The keys of its `properties` (`_DATA_UNWEIGHTED`, `_DATA_WEIGHTED`, and `_WEIGHTS`) indicate the cumulative sum types (for unweighted data, weighted data, and weights respectively), whereas its values give the cumulative sum dataset names. The accumulation dimension names are saved in the keys of its `patternProperties` along the recursion chain; it is noteworthy that these dimension names need to be ordered to avoid ambiguity and redundancy. 125 | 126 | An example of the above zarr attribute file is given as follows. The data has three dimensions including *latitude*, *longitude* and *time*. The cumulative sums are computed for the weighted data (`_DATA_WEIGHTED`) and weights (`_WEIGHTS`). 
If we want to provide the time-averaged map and area-averaged time series, the accumulation is only needed for the dimension combinations of *latitude*, *longitude*, *time*, and *latitude*+*longitude*; all other dimension combinations (e.g. *latitude*+*time*, *longitude*+*time*, and *latitude*+*longitude*+*time*) are empty (`{}`). 127 | ``` 128 | { 129 | "_ACCUMULATION_GROUP": { 130 | "latitude": { 131 | "_DATA_WEIGHTED": "acc_lat", 132 | "_WEIGHTS": "acc_wt_lat", 133 | "longitude": { 134 | "_DATA_WEIGHTED": "acc_lat_lon", 135 | "_WEIGHTS": "acc_wt_lat_lon", 136 | "time": {} 137 | }, 138 | "time": {} 139 | }, 140 | "longitude": { 141 | "_DATA_WEIGHTED": "acc_lon", 142 | "_WEIGHTS": "acc_wt_lon", 143 | "time": {} 144 | }, 145 | "time": { 146 | "_DATA_WEIGHTED": "acc_time", 147 | "_WEIGHTS": "acc_wt_time" 148 | } 149 | } 150 | } 151 | ``` 152 | 153 | ### Zarr attribute file of accumulation data array 154 | 155 | With Zarr-based chunk-level accumulation, the cumulative sums are not necessarily computed for every single chunk. To further reduce the computation and storage cost for the accumulation data, the cumulative sums can be computed every certain number of chunks, and we call this tunable number the *accumulation stride*. This information is saved in the Zarr attribute file for the accumulation dataset (e.g., ``${raw_dataset}_accumulation_group/{accumulation_dataset_1}/.zattr``). 156 | 157 | As mentioned above, the dimension labels are needed to identify the accumulation datasets. We assume that the dimensions are defined in the attributes of the dataset as `_ARRAY_DIMENSIONS` as from [the xarray implementation](https://docs.xarray.dev/en/stable/internals/zarr-encoding-spec.html). In the present approach, the *accumulation stride* is saved in an object called `_ACCUMULATION_STRIDE` in parallel with `_ARRAY_DIMENSIONS`. 
The related schema segment of this attribute file is shown as follows: 158 | ``` 159 | { 160 | "$schema":"http://json-schema.org/draft-07/schema#", 161 | "type":"object", 162 | "properties":{ 163 | "_ARRAY_DIMENSIONS":{ 164 | "type":"array", 165 | "items":{ 166 | "type":"string" 167 | } 168 | }, 169 | "_ACCUMULATION_STRIDE":{ 170 | "type":"array", 171 | "items":{ 172 | "type":"integer" 173 | } 174 | } 175 | }, 176 | "required":[ 177 | "_ARRAY_DIMENSIONS", 178 | "_ACCUMULATION_STRIDE" 179 | ] 180 | } 181 | ``` 182 | 183 | The `_ARRAY_DIMENSIONS` and `_ACCUMULATION_STRIDE` arrays should have the same length. Each item in the `_ACCUMULATION_STRIDE` array represents the accumulation stride along the dimension from the `_ARRAY_DIMENSIONS` array at the same index. The value of accumulation stride should be a non-negative integer: a positive value represents the accumulation stride as defined above, whereas a value of 0 indicates the accumulation is not performed along the given dimension. 184 | 185 | For example, the following attribute file represents the accumulation that is performed along only the time dimension every other chunk: 186 | ``` 187 | { 188 | "_ARRAY_DIMENSIONS":[ 189 | "latitude", 190 | "longitude", 191 | "time" 192 | ], 193 | "_ACCUMULATION_STRIDE":[ 194 | 0, 195 | 0, 196 | 2 197 | ] 198 | } 199 | ``` 200 | 201 | and the following attribute file represents the accumulation that is performed along the latitude dimension for each chunk, and along longitude dimension every 3 chunks: 202 | ``` 203 | { 204 | "_ARRAY_DIMENSIONS":[ 205 | "latitude", 206 | "longitude", 207 | "time" 208 | ], 209 | "_ACCUMULATION_STRIDE":[ 210 | 1, 211 | 3, 212 | 0 213 | ] 214 | } 215 | ``` 216 | 217 | ### Application Interface 218 | 219 | The accumulation-based workflow requires the application to locate the accumulation data along certain dimensions. The accumulation data array name for the given dimensions can be obtained from the accumulation group attributes. 
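As a rough sketch of that lookup in code, the function below walks the nested `_ACCUMULATION_GROUP` mapping along an ordered list of dimension names. The helper name `find_accumulation_name` and its `kind` parameter are hypothetical and not part of the proposal; the attribute dictionary mirrors the example accumulation group attribute file given earlier, and the dimension names are assumed to be passed in the same order used in that file.

```python
TYPE_KEYS = {"_DATA_UNWEIGHTED", "_DATA_WEIGHTED", "_WEIGHTS"}


def find_accumulation_name(group_attrs, dimensions, kind="_DATA_WEIGHTED"):
    """Return the accumulation dataset name for `dimensions`, or None.

    `dimensions` must follow the ordering used in the attribute file.
    """
    if kind not in TYPE_KEYS:
        raise ValueError(f"unknown accumulation type: {kind}")
    node = group_attrs["_ACCUMULATION_GROUP"]
    for dim in dimensions:
        node = node.get(dim)
        if not isinstance(node, dict):
            return None  # this dimension combination is not listed
    # An empty object ({}) means no accumulation was stored here.
    return node.get(kind)


# Example attributes, copied from the accumulation group example above.
attrs = {
    "_ACCUMULATION_GROUP": {
        "latitude": {
            "_DATA_WEIGHTED": "acc_lat",
            "_WEIGHTS": "acc_wt_lat",
            "longitude": {
                "_DATA_WEIGHTED": "acc_lat_lon",
                "_WEIGHTS": "acc_wt_lat_lon",
                "time": {},
            },
            "time": {},
        },
        "longitude": {
            "_DATA_WEIGHTED": "acc_lon",
            "_WEIGHTS": "acc_wt_lon",
            "time": {},
        },
        "time": {"_DATA_WEIGHTED": "acc_time", "_WEIGHTS": "acc_wt_time"},
    }
}

name = find_accumulation_name(attrs, ["latitude", "longitude"])
```

Returning `None` for empty or missing combinations lets a caller fall back to a full scan of the raw data when no pre-computed sums exist.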
The following example shows the steps to get the weighted accumulation data array name along *latitude*+*longitude* dimensions: 220 | 221 | accumulation app interface 1 222 | 223 | The accumulation stride is also needed to locate the accumulation data for a given chunk number. They can be obtained from the accumulation data attributes, and the following example shows the steps to get the accumulation stride for the accumulation data along *latitude*+*longitude* dimensions: 224 | 225 | accumulation app interface 2 226 | 227 | ## References and Footnotes 228 | * ESIP Summer 2022 Presentation on *Zarr-based chunk-level cumulative sums in reduced dimensions for fast high-resolution data analysis*: 229 | * [Abstract](https://2022esipjulymeeting.sched.com/event/12etJ/advances-and-challenges-of-cloud-native-data-including-analysis-ready-cloud-optimized-or-arco-formats-and-access-part-1-presentations) 230 | * [Slides](https://docs.google.com/presentation/d/1RNvkIlCFvtoy89OTMzQNn_0jixOpdhnu/edit?usp=sharing&ouid=106287227661991623566&rtpof=true&sd=true) 231 | * [Video](https://www.youtube.com/watch?v=ac_UKunUrNM&t=2250s) 232 | 233 | * [*Xarray* Zarr Encoding Specification](https://docs.xarray.dev/en/stable/internals/zarr-encoding-spec.html) 234 | * [*Snowflake* table statistics](https://twitter.com/teej_m/status/1546591452750159873) 235 | 236 | ## Copyright 237 | 238 | This proposal is licensed under [the Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0). 239 | -------------------------------------------------------------------------------- /draft/draft_zeps.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: draft ZEPs 4 | description: List of Draft ZEPs 5 | nav_order: 4 6 | has_children: true 7 | permalink: /draft_zeps/ 8 | --- 9 | 10 | # Draft ZEPs 11 | 12 | ### Shows the list of Draft ZEPs. 
13 | -------------------------------------------------------------------------------- /index.md: -------------------------------------------------------------------------------- 1 | --- 2 | # Feel free to add content and custom Front Matter to this file. 3 | # To modify the layout, see https://jekyllrb.com/docs/themes/#overriding-theme-defaults 4 | 5 | layout: default 6 | title: home 7 | nav_order: 1 8 | description: ZEP "A community feedback process for Zarr Specification" 9 | permalink: / 10 | 11 | --- 12 | 13 | # Zarr Enhancement Proposals (ZEPs) 14 | {: .fs-9} 15 | 16 | Community Feedback Process for Zarr Specifications. 17 | {: .fs-6 .fw-300 } 18 | 19 | [Propose a new ZEP](https://github.com/zarr-developers/zeps#proposing-a-new-zep){: .btn .btn-primary .fs-5 .mb-4 .mb-md-0 .mr-2 } 20 | [View it on GitHub](https://github.com/zarr-developers/zeps){: .btn .fs-5 .mb-4 .mb-md-0 } 21 | 22 | --- 23 | 24 | ZEP stands for Zarr Enhancement Proposal. A ZEP is a design document providing 25 | information to the Zarr community, describing a modification or enhancement of 26 | the [Zarr specifications](https://zarr-specs.readthedocs.io/en/latest/), a new 27 | feature for its processes or environment. The ZEP should provide specific proposed 28 | changes to the Zarr specification and a narrative rationale for the specification 29 | changes. 30 | 31 | We intend ZEPs to be the primary mechanism for evolving the spec, collecting 32 | community input on major issues and documenting the design decision that has 33 | gone into Zarr. 34 | 35 | ### ZEP Meetings 🧑🏻‍💻 36 | 37 | We hold bi-weekly ZEPs meetings to propose, discuss, review and finalize discussions around current ZEPs and Zarr Specification. 
More info available here: [https://zarr.dev/zeps/meetings/](https://zarr.dev/zeps/meetings/) 38 | 39 | --- 40 | 41 | ### Contributing 🤝🏻 42 | 43 | If you wish to contribute to Zarr's codebase, propose a new ZEP(s), website, blog 44 | posts or in any way, please visit Zarr's GitHub [here](https://github.com/zarr-developers/). 45 | You can discuss the change you want to see by opening an issue in the appropriate 46 | repository, or if the issue is already present, feel free to submit a pull request. 47 | 48 | ### Code of Conduct ⚖️ 49 | 50 | ZEPs are governed by Zarr Community's 51 | [CODE OF CONDUCT](https://github.com/zarr-developers/.github/blob/main/CODE_OF_CONDUCT.md). -------------------------------------------------------------------------------- /join-community.markdown: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: join the community 4 | permalink: /join-community/ 5 | --- 6 | 7 | ## Join the Zarr Community 8 | 9 | Most discussions and chats related to Zarr and its [implementations](https://github.com/zarr-developers/zarr_implementations) take place on Gitter and GitHub. 
If you are looking to: 10 | 11 | - Interact with the maintainers, contributors and users of the project; join the ZulipChat → [here](https://ossci.zulipchat.com/) 12 | - Want to ask questions related to [`zarr-python`](https://github.com/zarr-developers/zarr-python) usage, create a new discussion on GitHub → [here](https://github.com/zarr-developers/zarr-python/discussions) 13 | - Contribute and engage in discussion related to Zarr Specification; check out the `zarr-specs` [repo](https://github.com/zarr-developers/zarr-specs/) or create an issue → [here](https://github.com/zarr-developers/zarr-specs/issues) 14 | 15 | Also, find us on: 16 | 17 | - [Twitter](https://twitter.com/zarr_dev) 18 | - [GitHub](https://github.com/zarr-developers) 19 | - [YouTube](https://www.youtube.com/@zarr_dev/playlists) 20 | -------------------------------------------------------------------------------- /meetings/2022/2022-09-08.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 8th September 4 | description: ZEPs Meeting Notes for 2022-09-08 5 | grand_parent: ZEP meetings 6 | parent: 2022 meetings 7 | nav_order: 1 8 | --- 9 | 10 | # 2022-09-08 11 | 12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Ward Fisher (WF), Jonathan Striebel (JS), Norman Rzepka (NR), Ryan Abernathey (RA), Dennis Heimbigner (DH), Jeremy Maitin-Shepard (JMS) 13 | 14 | ## TL;DR 15 | 16 | This was the first ZEP meeting ever. Representatives from the [Zarr Implementations Council](https://github.com/zarr-developers/governance/blob/main/GOVERNANCE.md#zarr-implementation-council-zic) joined the meeting. Most discussions revolved around [ZEP1](https://zarr.dev/zeps/draft/ZEP0001.html). One of the critical decisions about ZEP1 was to accept it ‘[Provisionally](https://zarr.dev/zeps/active/ZEP0000.html#review-and-resolution)’ and move forward with the implementation in various programming languages. 
17 | 18 | **Updates:** 19 | - SV: open ("draft") ZEPs: 20 | - [https://zarr.dev/zeps/draft/ZEP0001.html](https://zarr.dev/zeps/draft/ZEP0001.html) 21 | - [https://zarr.dev/zeps/draft/ZEP0002.html](https://zarr.dev/zeps/draft/ZEP0002.html) 22 | - SV: Author discussion on all the comments on ZEP0001 23 | - Have proposed resolutions for a number of those 24 | - Meeting again tomorrow to try to finish the list 25 | - [https://hackmd.io/sOos8rxrRvKCJPbbUKWtwA?view](https://hackmd.io/sOos8rxrRvKCJPbbUKWtwA?view) 26 | 27 | - SV: Critically propose marking ZEP0001 as "provisionally accepted" after the above are handled and passed by the ZIC 28 | - Implementations are free (and encouraged!) to start implementing. 29 | - Any blocking changes could still be handled. 30 | - Otherwise, "feature freeze". 31 | - Feedback? 32 | - RA: process to get to provisionally accepted 33 | - SV: draft == under review. 34 | - on vote, can move to provisionally or accepted state. 35 | - once implemented, moves to final. 36 | - could move to "deferred" state if the ZIC vetoes 37 | - WF: "ready to implement" jumped out (and caused anxiety but only since there's too much to do) 38 | - [https://zarr.dev/zeps/active/ZEP0000.html#review-and-resolution](https://zarr.dev/zeps/active/ZEP0000.html#review-and-resolution) 39 | - JMS: no substantial changes since early draft 40 | - JM: editors are preparing a rebuttal (Alistair's paper model) 41 | - JMS: not sure a paper model is best 42 | - RA: not in the sense that there's only one round and someone will decide. iterative 43 | - good to have authors who are organizing. 44 | - now in revision and we can continue until everyone is happy 45 | - gone slowly for various reasons (availability, summer, and it's our first time & massive) 46 | - would be useful to go through the outstanding issues 47 | - JS: in this cycle and not limited iterations is just the limited time. 
48 | - but for now, trying to make batched changes 49 | 50 | **Meeting Minutes:** 51 | - JS: **review of memory order decision number-16 from list** 52 | - zarr's goal is interoperability. therefore propose to keep C & F (benefit for community) 53 | - could support read only, even with a transpose (if too slow, add a warning?) 54 | - JMS: agree. but would like an arbitrary permutation. 55 | - DH: good use case? 56 | - JMS: dimension that represents time. order you display to the user is logical for them but need not be logical for compression/access patterns. 57 | - JM/JS: core or extension? 58 | - RA: that's a key question 59 | - NR: re: backwards compatibility C/F is in V2 therefore that would need to be in core. but arbitrary could be an extension. 60 | - RA: but v3 is a chance to break backwards compatibility (explicitly not a goal) 61 | - NR: upgrade path? so be able to upgrade without re-writing the chunks. 62 | - RA: v2 will still be supported. 63 | - WF: that would be the hope, but worry about netcdf & archival -- assuming software will support it without it being expressed somewhere. aspirational sure but makes us nervous. 64 | - e.g. will future software implement the v2 standard? 65 | - RA: transform based solution? (but only if we support F) **if** we say the chunks should be backwards compatibility. 66 | - WF/DH: no one has ever asked for arbitrary. Someone at NOA asks for things that would help their lab. Technical debt. (Won't even request a pull request) See the trap that the HDF group fell into (single-writer-multiple-reader, several orders of magnitude that they are trying to recover from.) 67 | - JMS: arbitrary seems most natural. pass to `numpy.transpose()` 68 | - WF: shocked at the assertion that there _wouldn't_ be a migration path 69 | - JM: clarification -- were only differentiating if _binary_ transformation is needed 70 | - **can add _requirement_ to v3 that implementations read v2** 71 | - WF: requirement of netcdf. 
can decide if that's a requirement. 72 | - DH: depends if it's a lot. operational definition - "too painful to copy v2 to v3" 73 | - (for RA): petabytes of data 74 | - JS: RA proposed transformer strategy - essentially rewriting metadata **formalize it?** 75 | - DH: how comfortable are you not supporting older version? 76 | - JM: for OME, got agreement but that's a layer higher 77 | - DH: will there be new implementations without V3 support? 78 | - NR: think there will be 79 | - JM: but it's so easy to implement 80 | - WF: people won't do that...what do we do if a popular implementation doesn't support v2? 81 | - other packages? 82 | - RA: recommend storage layer / translation? 83 | - JM: agreed but that's SHOULD (versus MUST) 84 | - JMS: only way to force it is a standardization 85 | - JM: agreed, but we can only do what the spec document allows us (i.e. labeling something as "compliant") 86 | - JS: it's a new major version and people know what we mean. (as a user, I wouldn't expect support for v2 if an implementation says "v3") 87 | - WF: convinced myself I'm worrying too much instead: 88 | - WF: in 18 months how do you know which Zarr is used to open it. 89 | - JM: metadata file is different (essentially the magic number). The proposal for `.zr3` was currently turned down. 90 | - SV: [data type naming](https://github.com/zarr-developers/zarr-specs/pull/149#discussion_r929140806) 91 | - JM: dropping the python-ness 92 | - JMS: helps provide a more natural scheme for some datatypes (and endianness as a codec) 93 | - no argument against (just "convenient in Python") 94 | - JM: will need names 95 | - JMS: in [https://github.com/zarr-developers/zarr-specs/pull/155](https://github.com/zarr-developers/zarr-specs/pull/155) 96 | - DH: netcdf ncchar type equivalent to 8-bit ascii, no equivalent in Zarr. Needed? NC uses it all the time. Why not in numpy? 97 | - JMS: thought numpy has char. 98 | - RA: revisit char question? 
JMS: different than varstring 99 | - RA: where does the encoding go? DH: in an attribute. "ascii" (or "utf-8") 100 | - RA: used for? DH: you see a lot of flags stored that way. 101 | - also historical: NC-3 didn't have strings of any type. (arrays of chars workaround) 102 | - JM: extension mechanism? 103 | - DH: where the wheel hits the road 104 | - JMS: just metadata? 105 | - RA: disagree, influences ... 106 | - JMS: agreed, but doesn't change how hard it is to implement 107 | - JM: but need to feel confident that they are low cost so we can change *when* we discuss these things 108 | - JMS: will changes start appearing? 109 | - JM: Very soon! -------------------------------------------------------------------------------- /meetings/2022/2022-09-22.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 22nd September 4 | description: ZEPs Meeting Notes for 2022-09-22 5 | grand_parent: ZEP meetings 6 | parent: 2022 meetings 7 | nav_order: 2 8 | --- 9 | 10 | # 2022-09-22 11 | 12 | **Attending:** Ward Fisher (WF), Josh Moore (JM), Ryan Abernathey (RA), Jeremy Maitin-Shepard (JMS), Dennis Heimbigner (DH) 13 | 14 | ## TL;DR 15 | 16 | Consolidated metadata needs an extension for V3, which might result in a new ZEP. Next, JMS shared a document titled ‘Optionally-cooperative distributed b-tree for Tensorstore’, which the participants then discussed. After that, JM initiated the discussion on codecs-registry, which was built by one of the GSoC students this summer. The meeting ended with a discussion on the path to the metadata files. 17 | 18 | **Meeting Minutes:** 19 | 20 | - Java/NetCDF side: 21 | - JM: Sanket met people 22 | - WF: Unidata should be 3x the staff. 23 | - JM: perhaps starting with a kerchunk implementation? 
24 | - WF: looking for more community involvement (like netcdf-c had) 25 | - JM: Greg mentioned consolidated metadata needs an extension for V3 26 | - RA: Iceberg issue, also see JMS' proposal 27 | - [https://github.com/zarr-developers/zarr-specs/issues/154](https://github.com/zarr-developers/zarr-specs/issues/154) 28 | - JMS: touches on not needing a file per chunk (like discussed last night) 29 | - [https://docs.google.com/document/d/1PLfyjtCnfJRr-zcWSxKy-gxgHJHSZvJ2y4C3JEHRwkQ/edit?resourcekey=0-o0JdDnC44cJ0FfT8K6U2pw#heading=h.8g8ih69qb0v](https://docs.google.com/document/d/1PLfyjtCnfJRr-zcWSxKy-gxgHJHSZvJ2y4C3JEHRwkQ/edit?resourcekey=0-o0JdDnC44cJ0FfT8K6U2pw#heading=h.8g8ih69qb0v) 30 | - db format that stores a btree. 31 | - uniquely: designed to allow distributed writes (s3, etc.) *but* doesn't need a persistent database 32 | - can also read it in a non-distributed fashion 33 | - downside: adds quite a bit of complexity (greatly for binary format) 34 | - also good where sharding isn't appropriate (e.g. pre-defined shard size which is required for write) 35 | - e.g. large number of small arrays (where sharding won't help) 36 | - RA: nice document. comments: 37 | - focused on big distributed writes, but with iceberg had a different main motivation: more flexibility in mapping keys to chunks. kerchunk-like. virtual concatenate. can you reference random chunks? yes. 38 | - JMS: btree nodes have references to files (like kerchunk). but datafiles are identified with 128-bit path (not an fsspec URL) 39 | - RA: different use case, so can have them be optional transformers/extensions 40 | - RA: really similar to tiledb! why not use it? 41 | - JMS: tiledb is organized by time not space. 42 | - JM: need a compaction 43 | - JMS: and even after that you still have a million files. 44 | - DH: HDF5? internally it's btrees. (which is responsible for most of its complexity). Are you sure this is the path? 45 | - JMS: not sure there's an alternative to btrees. 
used in databases, filesystems, etc. 46 | - DH: if you don't want some ordered searches, then linear hashes are an alternative 47 | - JMS: ordered is useful for a lot of use cases. but there wasn't an obvious solution for distributed writes 48 | - DH: [extendable hashing](https://en.wikipedia.org/wiki/Extendible_hashing) is an easier data structure (old paper) works well with disk storage. 49 | - JMS: think this is more a key-value store (like zip) 50 | - RA: agreed. Nice that it's possible to experiment like this. 51 | - RA: can the V3 spec support this experimentation? (right extension points?) 52 | - RA: trying to do that with Iceberg. Martin suggested "IceChunk". 53 | - See also: hooty and others. Lots of smart ideas that we can copy. 54 | - Goal is to provide some level of branching & transactions for/on a Zarr store 55 | - Allow you to work on your staged area which all get written at once. 56 | - Branch non-destructively (or rollback) 57 | - The key is having a "manifest" (they all have some concept of that, even kerchunk) 58 | - Don't depend on the object stores listing as the source of truth 59 | - Need storage transformers at the top level, not array. But for JMS' idea array-level might suffice. 60 | - JMS: wasn't planning on an extension. root metadata would be in the same data store. 61 | - JM: basically writing DB/filesystem :+1: ZarrFS ;) 62 | - JMS: planning on mongo? Yeah, or Dynamo. (They store JSON) 63 | - JSON in S3 isn't ideal. 64 | - metadata in document store and chunks on disk. Beyond just filesystem. It's a data lake. 65 | - "meta-store" 66 | - JMS: regarding versioning, how are you representing the delta? 67 | - The chunk is the minimal writable unit. (out-of-scope) 68 | - Every chunk write is a uniquely ID'd (e.g. content addressable). That gets a key. Write that to DB. 69 | - JMS: expecting the database to provide the versioning? 70 | - RA: no, just a place for documents. 
versioning (in iceberg) has a branch or a tag that points to a specific chunk manifest. you can create a new one and point your HEAD at that. only rely on database to atomically change the references. iceberg tracks a number for the transaction. 71 | - JMS: use kerchunk model? limitation on the number of chunks? 72 | - RA: chunks are likely in a separate manifest. discussed that another extension with Martin. 73 | - RA: but can just query a chunk from the database. 74 | - JMS: 1M chunks in v1. then update to v2. What's the diff? A copy. 75 | - RA: yeah need to play with it. 76 | - JMS: when you get to wanting to update just a portion of it, then you get to b-trees :smile: 77 | - RA: no db guys, trying to keep it hackable. 78 | - RA: but megabyte kerchunk is already getting :heart: since it's so easy. looking for incremental improvement on _that_. (NASA will be pumping out GRIB forever...) 79 | - JMS: looking forward to hearing more and exchanging info re: b-trees 80 | - JMS: see also [https://github.com/janelia-flyem/dvid](https://github.com/janelia-flyem/dvid) (backed by KV database) 81 | - JM: sharing layers with them? 82 | - JMS: complicated by other priorities of the EM team. invite Bill to the Zarr meetings? 83 | - RA: see [https://lakefs.io/](https://lakefs.io/) 84 | - JM: API versus format 85 | - RA: thinking about it more like an API 86 | - JM: briefly codecs-registry 87 | - [https://zarr.dev/codecs-registry/](https://zarr.dev/codecs-registry/) 88 | - [https://github.com/zarr-developers/codecs-registry](https://github.com/zarr-developers/codecs-registry) 89 | - JMS: still want a schema per codec. JM: agreed! 90 | - JMS: talks about codecs having URLs. 91 | - would be an annoyance to have different V2 and V3 identifiers. 92 | - e.g. just numeric constants in the JSON that are from the C API 93 | - e.g. shuffle parameter which would be nicer as a string. 
94 | - support integer or string for a while (in order to deprecate) 95 | - JM: have plans to have code in each languages that checks for an id from the central registry 96 | - DH: approx. that with nczarr. ncdump lists the actual codecs in the file 97 | - would be good to have something more sophisticated 98 | - have the disadvantage of C code and interpreted files 99 | - 3 repositories on the C side. unidata + irvine + hdf5 100 | - hdf5 only has names, hdf5-ids and a pointer (which is often out of date) 101 | - something universal would be nice 102 | - WF: roping in the HDF5 group would be a heavy lift 103 | - JMS: **URL interface** :rocket: 104 | - DH: :+1: for the REST API 105 | - WF: NSF/CSSI solicitation has opened 106 | - [https://beta.nsf.gov/funding/opportunities/cyberinfrastructure-sustained-scientific-innovation-cssi](https://beta.nsf.gov/funding/opportunities/cyberinfrastructure-sustained-scientific-innovation-cssi) 107 | - perhaps something here 108 | - WF: planning on getting to [https://www.egu23.eu/](https://www.egu23.eu/) 109 | - tweet something from zarr_dev to see if there is interest :question: 110 | - could collaborate something re: nc/zarr 111 | - JMS: don't have clear resolution on the paths to the metadata files 112 | - JM: re-capped the previous discussion and think it's still good. 113 | - JMS: some details around the root array (the named files, etc.) 114 | - JMS: consolidated metadata? duplicated? 115 | - JM: would make it possible to have everything in the top-level 116 | - JMS: pointers in the subdirectories? bit annoying. 117 | - JMS: with iceberg & co. you likely don't need a consolidated metadata 118 | - JM: so you'd push it to the store level? 119 | - JMS: possibly, but not that simple 120 | - JMS: there are cases where you need path separation anyway (Zips) 121 | - JMS: so could see using a path separation strategy entirely 122 | - JMS: Davis did have a use case ... 123 | - (...details zip, consolidated brainstorming...) 
124 | - JM: need both solutions... -------------------------------------------------------------------------------- /meetings/2022/2022-10-06.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 6th October 4 | description: ZEPs Meeting Notes for 2022-10-06 5 | grand_parent: ZEP meetings 6 | parent: 2022 meetings 7 | nav_order: 3 8 | --- 9 | 10 | # 2022-10-06 11 | 12 | **Attending:** Ward Fisher (WF), Josh Moore (JM), Jeremy Maitin-Shepard (JMS), Greg Lee (GL), Jonathan Striebel (JS) 13 | 14 | ## TL;DR 15 | 16 | JM shared that there were some good conversations around OME-Zarr yesterday. The summary is available [here](https://forum.image.sc/t/ome-ngff-community-call-transforms-and-tables/71792/10). WF shared that Kitware is looking for partners and a link to the sign-up form. GL shared that during the CZI Open-Science Summit 2022, he worked on writing tests for Xarray. After this, there was an extensive discussion on URL syntax initiated by JMS. 17 | 18 | **Updates:** 19 | 20 | * miscellaneous reading before the meeting (JM) 21 | - [https://arrow.apache.org/blog/2022/10/05/arrow-parquet-encoding-part-1/](https://arrow.apache.org/blog/2022/10/05/arrow-parquet-encoding-part-1/) 22 | - [https://github.com/kaitai-io/kaitai_struct/issues/125](https://github.com/kaitai-io/kaitai_struct/issues/125) 23 | * NGFF (JM) 24 | - [https://forum.image.sc/t/ome-ngff-community-call-transforms-and-tables/71792/10](https://forum.image.sc/t/ome-ngff-community-call-transforms-and-tables/71792/10) 25 | - Good conversations around OME-Zarr yesterday 26 | * Enthusiasm for Kitware (WF) 27 | - Looking for partners. [Have form on webpage](https://www.kitware.com/contact/project/). 28 | - Unidata an option. They've mentioned Zarr a couple of times (Kitware Blog). 29 | * xarray test (GL) 30 | - during czi conference. 31 | - release of 2.13 hopefully fixed it all :tada: 32 | 33 | **Meeting Minutes:** 34 | 35 | * URL syntax? 
(JMS) 36 | - helps to figure out the metadata location. 37 | - Josh: great idea. have several ongoing discussions at the NGFF level 38 | - current proposal would be to support URIs internally (relative, absolute, remote) 39 | - however, in V2 40 | - JMS: in v3 the root exists 41 | - though not entirely clear that the new metadata organization is necessary 42 | - designed for S3 where there's no directory, but other problems exist 43 | - Josh: _summarized previous discussions for Greg_ 44 | - GL, thoughts on the V3 situation? 45 | - GL: at the moment, you need helper methods to do that. 46 | - JM: one proposal was to have the metadata be the main directory which lets you then bootstrap the chunk loading 47 | - JMS: support multiple? 48 | - JM: conceivably. as extension or configuration. 49 | - JM: downside for consolidated metadata is that nothing exists in the metadata hierarchy 50 | - workaround of having a thin-hierarchy only with references to where the metadata exists 51 | - JS: losing the ability to be able to nest any hierarchy. 
(everything is a root) 52 | - JM: are we proposing rolling it back completely 53 | - JMS: problem is the URI+rootpath metadata 54 | - JS: walking up the hierarchy would be an option (URL doesn't actually point) 55 | - JMS: would be nicer if you don't have to perform a search 56 | - Use case 57 | - URL case 58 | - Desktop double click on something 59 | - Similar issue: **Zips** :warning: 60 | - JMS: have an additional level 61 | - JM: except ZipStore v2 assumes the whole zip is a zgroup 62 | - JS: propose zip is a special case which is _easier_ 63 | - JMS: unless you are mixing volumetric with a zarr then it wouldn't be at the top-level 64 | - Btree (JMS): need to be able to compose multiple layers (similar to fsspec and double colons) 65 | - Remote chunk store (or point to V2 chunks) 66 | - Renaming folders (keep data with arrays) 67 | - Options 68 | - Keep "/meta", clients must know 69 | - Drop "/meta", direct URLs 70 | - `?param` syntax 71 | - `#param` syntax 72 | - Separator syntax (e.g. "`//`") 73 | - root dir ends in .zarr 74 | - fsspec `::` separator 75 | - multiple protocols (git+ssh, zip+zarr) 76 | - further discussion 77 | - JS: without /meta and .zarr requirement, you still don't know where the root is 78 | - JS: if you drop "/meta" then you can't name anything "/data" 79 | - JMS: could use something more obfuscated 80 | - JS: why split? 81 | - JMS: if you are not using the filesystem (s3 or gcs) and you want to list all the metadata, it's not (as) efficient 82 | - JM: "data" could be registered in the metadata so it's a known (and configurable) thing 83 | - WF: NC anything with leading underscore is assumed reserved for the library 84 | - permitted to create them, but the spec says "please don't" 85 | - JM: `.z` prefix 86 | - WF: utilities and tools can scrape everything with that 87 | - WF: also don't have to put too much thought into new features 88 | - JMS: would prefer not a `.` prefix because of archiving tools, etc. 89 | - then `_z`? 
90 | - JMS: root metadata file doesn't really do anything 91 | - JS: creates ambiguity 92 | - JM: think it was largely for bootstrapping global plugins (e.g. transformers) 93 | - JS: perhaps V2 compatibility 94 | - JMS: not clear you would nest sharding with other transformers. it would be the thing applied to the chunks. 95 | - JS: the metadata needs to be somewhere, and for that it can be at the array level 96 | - brief summary 97 | - zarrs are essentially a metadata hierarchy 98 | - that configure (possibly remote) chunk stores 99 | - and the root is identified with .zarr -------------------------------------------------------------------------------- /meetings/2022/2022-10-20.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 20th October 4 | description: ZEPs Meeting Notes for 2022-10-20 5 | grand_parent: ZEP meetings 6 | parent: 2022 meetings 7 | nav_order: 4 8 | --- 9 | 10 | # 2022-10-20 11 | 12 | **Attending:** Ward Fisher (WF), John Kirkham (JK), Jeremy Maitin-Shepard (JMS) 13 | 14 | ## TL;DR 15 | 16 | WF is working on the maintenance NetCDF release candidate (`v4.9.1-rc1`), and JMS added CMake support to TensorStore. After this, JMS initiated a discussion on path structure, which took up the remainder of the meeting. 17 | 18 | **Updates:** 19 | 20 | - (WF) Working on maintenance netcdf release candidate (`v4.9.1-rc1`). No new features, just bug fixes and improvements. 21 | - (JMS) Added CMake support to TensorStore 22 | - Discussion about CMake, dependency management 23 | - https://cmake.org/cmake/help/book/mastering-cmake/chapter/CDash.html 24 | - https://github.com/cpm-cmake/CPM.cmake 25 | 26 | **Meeting Minutes:** 27 | 28 | * (JMS) Path structure 29 | * Require or encourage root directory to end in .zarr 30 | * How to name all the metadata files? 
31 | * Root metadata could contain extension information 32 | * (JK) Mentioned `.zmeta` metadata file with paths to metadata file 33 | * (JMS) About listing 34 | * (WF) Possible issues with writing 35 | * (WF) Spec vs. library tension 36 | * (JK) Have file expire? 37 | * (JMS) Handle as read-only 38 | * (JK) Could also delete as part of writing? 39 | * (JMS) HDF5 has hierarchy and Zarr replicates this 40 | * Have some array and non-array data next to each other 41 | * (JK) Examples? 42 | * (JMS) Segmentations & mesh representations 43 | * (JMS) Collection of volumes with annotations related to them 44 | * (WF) Have Zarr hierarchy with non-Zarr? 45 | * (JMS) Only have single individual arrays 46 | * (WF) Wouldn't have considered this structure 47 | * (WF) Does there need to be something in the spec about interleaving data? 48 | * (WF) Maybe interleaving poses some challenges 49 | * (JMS) Doesn't NetCDF have extra files as well? 50 | * (WF) Yes. Extra metadata used to map Zarr model to NetCDF model. 51 | * (JMS) Reason to use this structure as opposed to Zarr metadata files? 52 | * (WF) NetCDF supports different formats HDF5, Zarr, etc. 53 | * (JMS) Have user defined attributes. Types are stored in metadata file? Could those be in zattrs? 54 | * (WF) Yes. Not sure 55 | * (JMS) Hierarchy becomes more apparent with V3 as opposed to V2 56 | * (WF) Groups were a new feature that users were slow to pick up on 57 | * (JK) Does adding more top-level metadata cause issues? 58 | * (JMS) Could it contain the metadata? 
59 | * (WF) Maybe include subset of metadata 60 | * (WF/JMS) Perhaps special case single array use case 61 | * (JK) How does data relate in non-hierarchical form 62 | * (JMS) Related, but not all Zarr data 63 | * (JK) Would other kinds of chunk formats (standardizing on kerchunk) be useful 64 | * (JMS) Meshes probably don't make sense in this way 65 | * (JMS) Neuroglancer meshes are a good example 66 | * (JMS) Sparse arrays seem similar in that they might be better handled by being their own file format 67 | * (WF) NetCDF users mention performance issues in moving to new version. Usually suggest using old NetCDF. Maybe same with V2/V3? 68 | * (JMS) Want to use V3 (sharding being of value). 69 | * (JK) Including unstructured binary blobs in Zarr? 70 | * (JMS) Has a group of files for mesh 71 | * (JK) Maybe ignore specific paths? 72 | * (WF) Having mixed media is valuable though can be logistically tricky 73 | * (WF) What defines Zarr as a data model? At least need to say some behavior is undefined (mixed media). Ideally ignores mixed media files. -------------------------------------------------------------------------------- /meetings/2022/2022-11-03.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 3rd November 4 | description: ZEPs Meeting Notes for 2022-11-03 5 | grand_parent: ZEP meetings 6 | parent: 2022 meetings 7 | nav_order: 5 8 | --- 9 | 10 | # 2022-11-03 11 | 12 | **Attending:** Josh Moore (JM), Jonathan Striebel (JS), Jeremy Maitin-Shepard (JMS), Sanket Verma (SV) 13 | 14 | ## TL;DR 15 | 16 | Discussions were held on how to move forward with ZEP1 quickly. The summary can be viewed [here](https://github.com/zarr-developers/zarr-specs/pull/149#issuecomment-1302440391). Then the attendees discussed extensions in V3, and JMS is considering trying with non-zero origin. SV joined the meeting after 30 mins. After that, JS mentioned some high-level issues looming around V3 spec. 
17 | 18 | **Meeting Minutes:** 19 | 20 | * JMS: number of PRs that could be merged into the working draft 21 | - JS: don't want to just close it 22 | - JM: can we cross link e.g. JMS' PR? Yes. 23 | - ==> once all cross-linked close PR. 24 | * JS: when to merge? 25 | - JM: when it matches the consensus? 26 | - JS: ok, but don't have merge rights. 27 | - ==> Let's merge proactively. 28 | * see: https://github.com/zarr-developers/zarr-specs/pull/149#issuecomment-1302440391 29 | * extensions 30 | - JMS: thinking of trying with non-0-origin 31 | - JM: think that's a general principle we should try for all issues/PR is "could it be an extension" 32 | - JMS: thinking of extensions as plugins? Not exactly. 33 | - JS: how to influence if an implementation adopts an extension? if there's a concrete implementation / clear interface 34 | - JMS: agreed and some obvious ones (codecs) but not clear there will be a broader abstraction 35 | - JS: "index transformer" _perhaps_ 36 | - or as transformer _if_ multiple of chunking 37 | - JMS: unfortunate limitation 38 | - JMS: re: transformers - it doesn't make sense to compose a different storage transform _before_ sharding 39 | - JS: depends. cache of chunks or shards? also checksum 40 | - JM: codec is similar 41 | - JMS: caching enabled in code, but not in zarr metadata 42 | - JS: that's in spec, yes. "runtime-only" but still before or after 43 | - JMS: when implementing sharding, would check if it's first 44 | - want to be able to tell the user "this is the granularity to write" 45 | - JS: good that it's flexible. like c/f order. 46 | - JS: mention in implementation "sharding must be first" 47 | - JMS: composing makes for useful extension point 48 | - JS: most important point: are we sure enough about the extension points? 
49 | - _Sanket joins_ 🧑🏻‍💻 50 | - Jonathan: high-level issues looming 51 | - paths/URL discussion (needs an issue) 52 | - global transformers 53 | - variable chunk length (possibly origin offset) 54 | - indexing more abstract 55 | - upgrade path! (`{“extension”: [“@v2-layout”]}`) -------------------------------------------------------------------------------- /meetings/2022/2022-11-17.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 17th November 4 | description: ZEPs Meeting Notes for 2022-11-17 5 | grand_parent: ZEP meetings 6 | parent: 2022 meetings 7 | nav_order: 6 8 | --- 9 | 10 | # 2022-11-17 11 | 12 | **Attending:** Sanket Verma (SV)...Jonathan Striebel (JS), Ryan Abernathey (RA), Ward Fisher (WF) 13 | 14 | ## TL;DR: 15 | 16 | Apparently there was a snafu where JS, RA and WF joined a Zoom meeting whereas SV joined another one! 🥲 17 | 18 | In the meeting there was a discussion on the V3 spec and some of its missing parts. Also, RA opened a PR on global transformers, which can be seen [here](https://github.com/zarr-developers/zarr-specs/pull/182). 19 | 20 | **Updates:** 21 | 22 | - ZEP1 Update, see [here](https://gitter.im/zarr-developers/community?at=6374fae6f9491f62c9b7ea61) 23 | - Check out the ZEP1 GH Project board [here](https://github.com/orgs/zarr-developers/projects/2/views/2); maintained by Jonathan Striebel 24 | 25 | **Meeting minutes:** 26 | 27 | Same as TL;DR. 
👆🏻 28 | -------------------------------------------------------------------------------- /meetings/2022/2022-12-01.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 1st December 4 | description: ZEPs Meeting Notes for 2022-12-01 5 | grand_parent: ZEP meetings 6 | parent: 2022 meetings 7 | nav_order: 7 8 | --- 9 | 10 | # 2022-12-01 11 | 12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Jonathan Striebel (JS), Jeremy Maitin-Shepard (JMS), Ward Fisher (WF), Dennis Heimbigner (DH), Ryan Abernathey (RA) 13 | 14 | ## TL;DR: 15 | 16 | RA started a discussion on dropping the `/meta` prefix. See the issue [here](https://github.com/zarr-developers/zarr-specs/issues/177), which led to a chain reaction of several conversations around topics related to each other. These discussions are mostly around some lingering issues around the finalisation of Zarr V3 spec. 17 | 18 | RA, JMS and JS took some action items which can be seen at the bottom 19 | 20 | **Updates:** 21 | 22 | - Conversations (issues and feedback) on ZEP1 [PR](https://github.com/zarr-developers/zarr-specs/pull/149) are now resolved. Check [this](https://github.com/zarr-developers/zarr-specs/pull/149#issuecomment-1327605570). Thanks to Jonathan Striebel! 🙌🏻 23 | - The conversations which need additional input have been moved to separate issues 24 | - Jeremy Maitin-Shepard promoted as one of the authors for [ZEP0001](https://zarr.dev/zeps/draft/ZEP0001.html) 25 | - Current status of ZEP1 can be viewed [here](https://github.com/orgs/zarr-developers/projects/2) 26 | 27 | **Meeting minutes:** 28 | 29 | - RA: suggest focusing on the meta/ prefix discussion 30 | - [https://github.com/zarr-developers/zarr-specs/issues/177](https://github.com/zarr-developers/zarr-specs/issues/177) 31 | - JMS: not sure it's solving a problem (optimally). 
nice feature of v2 is copying out an array 32 | - JS: was for performance, use exclusion mechanism 33 | - RA: never need to list chunks (even if implementations do...) 34 | - NB: don't like trying to open files to know things (404-based) 35 | - JM: so we all agree? Yes. But what's the default? 36 | - RA: suggest: drop meta, use .json on the array 37 | - Can then drop the root metadata? 38 | - DH: there is dataset-level metadata (superblock) 39 | - JS: discuss those separately? 40 | - Agreed 41 | - JS: so to that suggestion, how do you list all metadata? 42 | - RA: don't think we should plan for discovering all metadata (millions of arrays) 43 | - JS: 44 | - RA: listing recursively isn't ok? 45 | - JS: not with implicit groups 46 | - RA: use storage transformer to get the previous behavior 47 | - RA: data is out there so need to provide a mechanism 48 | - DH: don't think that's fair 49 | - RA: nice feature of this proposal if we could keep it. 50 | - JM: how is consolidated metadata related? 51 | - RA: that's another problem. 52 | - RA: had thought about explicitly listing the children (stac catalog) 53 | - DH: nczarr does that as well. 54 | - RA: downside is the concurrency issue 55 | - JS: good extension for groups (listing children) 56 | - JS: but could also have consolidated per group 57 | - JM: different communities here. some are definitely asking for listability 58 | - DH: not lots of formats that are listable without tools. They are asking for something powerful. 59 | - RA: so that's the root feature of the separate hierarchies 60 | - RA: we should look at some data. offer to write a script 61 | - JS one other alternative: chunks in an extra subfolder 62 | - JMS: k/v versus directory based will have different performance behavior 63 | - RA: does this require us to give up on implicit groups? 
64 | - summary: foo/array.json and foo/chunks/0/0 65 | - JMS: re: dropping of root metadata 66 | - solution is perhaps storage transformer would need to write something in _array_ metadata 67 | - JS: but then everything in the array metadata and you'd need to be able to read it 68 | - JMS: don't _always_ need to access the metadata directly. more like a safety measure. 69 | - JMS: could copy "global" metadata into each array 70 | - JM: that works to the design goal of being able to freely copy an array 71 | - DH: that assumes no global extensions. 72 | - RA: need to think through this separately: 73 | - portability ("invariance" of v2 that any group is standalone) 74 | - RA: danger is that you can build zarrs through the command-line (need no library) 75 | - DH: sure you want to do that? 76 | - RA: it was at least part of the design of V2 77 | - DH: as a principle, if that's what you want it's really critical to the v3 process 78 | - DH: cf. intellij -- fiddling in text file then going back (bypassing the tool) 79 | - JM: perhaps CLI manipulations are inherently "extension-less" and therefore this is "safe" 80 | - DH: one tool being a verification tool? 81 | - JS: consolidated metadata as an example 82 | - RA: primarily used as a way to allow easy listing 83 | - but you know that you can't touch the store. 
84 | JS: you need a root to do fancy things 85 | - **action items:** 86 | - RA to do some performance benchmarking 87 | - JMS to propose a new storage layout for v3 88 | - next time: root metadata discussion issue (JS) 89 | -------------------------------------------------------------------------------- /meetings/2022/2022-12-15.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 15th December 4 | description: ZEPs Meeting Notes for 2022-12-15 5 | grand_parent: ZEP meetings 6 | parent: 2022 meetings 7 | nav_order: 8 8 | --- 9 | 10 | # 2022-12-15 11 | 12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Ward Fisher (WF), Ryan Abernathey (RA), Jonathan Striebel (JS), Jeremy Maitin-Shepard (JMS), John Kirkham (JK) 13 | 14 | ## TL;DR: 15 | 16 | The meeting started with a discussion on some pending issues regarding V3. Then, we opened the [ZEP1 project board](https://github.com/orgs/zarr-developers/projects/2) and went through the issues individually to decide their conclusion. As a result, consensus on some issues was achieved, while others are yet to be discussed in successive ZEP meetings. 17 | 18 | The [ZEP0001](https://zarr.dev/zeps/draft/ZEP0001.html) has gone into feature freeze, as mentioned in the blog post [here](https://zarr.dev/blog/zep1-update/), and from now on, the community, ZSC and ZIC will be working on integrating and resolving existing features and issues, respectively. 19 | 20 | **Meeting minutes:** 21 | 22 | - Discussed with Jonathan on 12/9: 23 | - Adding a `diff` w.r.t. 
to earlier version of V3 24 | - Include filesystem in ZEP0001 25 | - Sync V3 implementation in `zarr-python` with the recent changes in spec; see - [https://github.com/zarr-developers/zarr-python/issues/1290](https://github.com/zarr-developers/zarr-python/issues/1290) 26 | - [https://github.com/orgs/zarr-developers/projects/2](https://github.com/orgs/zarr-developers/projects/2) 27 | - No issues added after the 19th 28 | - All need to be solved by the vote 29 | - Migrated NaN issue to zarr-specs 30 | - dropping /meta prefix 31 | - RA: make clear (at some spec locations) that iterative listing is necessary 32 | - also make more use of async calls 33 | -------------------------------------------------------------------------------- /meetings/2022/meeting_notes_2022.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 2022 meetings 4 | description: List of ZEP meeting notes for the year 2022 5 | nav_order: 1 6 | parent: ZEP meetings 7 | has_children: true 8 | permalink: /meetings/2022/ 9 | --- 10 | 11 | # ZEP Meeting Notes for 2022 12 | 13 | Shows the list of meeting notes for the year 2022. 
14 | -------------------------------------------------------------------------------- /meetings/2023/2023-01-12.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 12th January 4 | description: ZEPs Meeting Notes for 2023-01-12 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 1 8 | --- 9 | 10 | # 2023-01-12 11 | 12 | **Attending:** Jonathan Striebel (JS), Josh Moore (JM), Jeremy Maitin-Shepard (JMS), Ryan Abernathey (RA), Sanket Verma (SV), Ward Fisher (WF), Dennis Heimbigner (DH) 13 | 14 | ## TL;DR: 15 | 16 | SV started the meeting by asking everyone to review his latest [PR](https://github.com/zarr-developers/zeps/pull/27) in the ZEPs repo, which consolidates the discussion venues for the draft ZEP and makes minor additions to the webpage. After this, the attendees started discussing the open issues for V3 (ZEP0001). We discussed what the ideal /meta prefix should be; please see [#177](https://github.com/zarr-developers/zarr-specs/issues/177) for extensive discussions. Then we started chatting about [#192](https://github.com/zarr-developers/zarr-specs/issues/192), which considers removing the entry point metadata. 17 | 18 | **Updates:** 19 | 20 | - 21 | - Any objections? 22 | - No. 23 | 24 | **Meeting Minutes:** 25 | 26 | - ZEP 1 issues that need attention: 27 | - **prefix**: 28 | - open question is the prefix for the chunk directory 29 | - potentially to-be-used for json files, etc. 30 | - also useful for extensions that add new folders 31 | - Dennis: `zarr.chunks`? Good to identify them. (since there are arbitrary delimiters) 32 | - with HDF5 people have experimented with accessing chunks directly, so need easy identification 33 | - Options: 34 | - `_` 35 | - `__` 36 | - `_z_` 37 | - `z.` 38 | - `zarr.` 39 | - `_zarr_` 40 | - Ryan: what's the goal of the prefix? 
41 | - JS: preventing node-name collision 42 | - JM: preventing extension collision 43 | - DH: DAP4 rule attempts to have self-assigned namespace, piggybacking on DNS (.ucar.edu) 44 | - RA: work through use cases, e.g. nczarr files 45 | - RA: won't the ability to offload metadata into a separate document 46 | - DH: also apply it to keys within the attributes 47 | - RA: see [ZEP4](https://github.com/zarr-developers/zeps/pull/28). special attributes is another discussion. 48 | - WF: A convention that _zarr are reserved is longer, but feels less prone to collision than _z 49 | - JMS: In general I think we want to reserve a prefix such as "_" for zarr itself and extensions, and then perhaps a subset of that should be reserved for just zarr itself (not extensions). 50 | - NB/DH: would suggest a top-level group 51 | - DH: Do you have sufficient metadata? 52 | - What does it mean to access a raw variable? 53 | - JM: there's still a directory. metadata+chunks (but a place to put extension files as well) 54 | - JM: bidirectionality would need some work to make sure that a group doesn't magically appear 55 | - RA: cf. how GDAL and Geo-tiff (etc) handle this 56 | - DH: two purposes of group -- namespace and a place to store attributes that aren't part of the variable 57 | - DH: example is group-level superblock marker 58 | - chat 59 | - RA: I worry that if we make `_` a disallowed prefix, lots of datasets may not work 60 | - RA: I feel like there is plenty of data out there in the wild that has a `_` as the first character in a variable 61 | - RA: Using ‘_’ as a convention in netCDF is a soft limit, not a hard; it’s part of the convention that it’s reserved, but if users disregard that, they can use ‘_’ with their own attributes. Whatever convention we decide upon can be phrased as guidance, without necessarily breaking extant datasets. 62 | - deciding 63 | - JM: ? 
64 | - JS: compatibility extension for v2 is valid or not 65 | - JM: could see `zarr.json, zarr.chunks, zarr.extensions` 66 | 67 | - **no root group**: 68 | - explicit groups can have transformers 69 | - do we always need to have a zarr.json for an array 70 | - if an array has none, how do we open it? 71 | - search up the hierarchy? too inefficient? 72 | - JMS: group level transformer then you can't make any assumptions about what's underneath it (e.g. a redirect) 73 | - JMS: without searching, the group with the transform will become a "root" 74 | - RA: need URL syntax for group#path if we have transformers 75 | - JM: still need search up for desktop clients. no URL syntax for searching down 76 | - RA: formalize "**entrypoint**" what can and cannot be opened 77 | - shouldn't be able to open a chunk (and figure out that it's part of an array) 78 | - JS: alternatively, you MUST have a zarr.json 79 | - RA: like that one. for an entry point, there should be a zarr.json 80 | - That should be the **one** entrypoint definition. 81 | - DH: defines a leaf, everything below doesn't exist externally. 82 | - RA: would help to look at hierarchies. (too abstract) 83 | - DH: a bit like posix mountpoints? 
driver is responsible for interpretation 84 | 85 | - URL syntax: 86 | - little bit via the previous conversation 87 | -------------------------------------------------------------------------------- /meetings/2023/2023-01-26.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 26th January 4 | description: ZEPs Meeting Notes for 2023-01-26 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 2 8 | --- 9 | 10 | # 2023-01-26 11 | 12 | **Attending:** Jonathan Striebel (JS), Ward Fisher (WF), Josh Moore (JM), Sanket Verma (SV), Jeremy Maitin-Shepard (JMS) 13 | 14 | ## TL;DR: 15 | 16 | SV announced that we would have weekly ZEP meetings instead of a bi-weekly routine to finalise the open issues for V3 (ZEP0001). Then we discussed the timeline to finish the ZEP1 and the possible timeframe required for the ZIC to vote on the same. After this, we touched on [#177](https://github.com/zarr-developers/zarr-specs/issues/177) and [#56](https://github.com/zarr-developers/zarr-specs/issues/56). 17 | 18 | **Updates:** 19 | 20 | - Weekly ZEP meetings until March, 2023 21 | 22 | **Meeting Minutes:** 23 | 24 | - JS: Timeline 25 | - (in discussion and TODO) 26 | - Not voting by end of January 27 | - More realistic? End of February 28 | - Josh: agreed with handover e.g. end of February. (can be more active in March) 29 | - JS: How long for the review? 30 | - SV: 1 month? 31 | - nodding... 32 | - [Prefix](https://github.com/zarr-developers/zarr-specs/issues/177) 33 | - Underscores and escaping. (needs to happen in group) 34 | - [unicode](https://github.com/zarr-developers/zarr-specs/issues/56) 35 | - allowed: +1 36 | - recommended set of characters (lower case, digits, hyphens) 37 | - normalization? 
38 | - filesystem does normalization on matching 39 | - online there's no normalization 40 | - default: we do nothing 41 | -------------------------------------------------------------------------------- /meetings/2023/2023-02-02.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 2nd February 4 | description: ZEPs Meeting Notes for 2023-02-02 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 3 8 | --- 9 | 10 | # 2023-02-02 11 | 12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Ryan Abernathey (RA), Jonathan Striebel (JS), Jeremy Maitin-Shepard (JMS) 13 | 14 | ## TL;DR: 15 | 16 | This is the first Special ZEP Meeting, as announced during the last call. We extensively discussed various open issues for V3 on the GitHub project board. In between various V3 discussions, SV popped the question of whether there is an R implementation of Zarr. Also, RA tested the newly developed Sharding feature, which can be seen [here](https://github.com/zarr-developers/zarr-python/discussions/1338). 17 | 18 | **Meeting minutes:** 19 | 20 | - SV: Is there R implementation of Zarr? 
21 | - JM: Only [Rarr](https://github.com/grimbough/Rarr) (with active development) 22 | - RA: Rust Implementation is a good place to put our efforts; would be a good binary implementation that would be useful for the communities for other languages 23 | - RA: Took sharding for a test drive 24 | - RA: Storage transformers don't have `get_items` and `set_items` 25 | - Is a good thing 26 | - JS: Does have partial values and it could cover keys 27 | - Martin thinks API is not clean now 28 | -------------------------------------------------------------------------------- /meetings/2023/2023-02-09.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 9th February 4 | description: ZEPs Meeting Notes for 2023-02-09 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 4 8 | --- 9 | 10 | # 2023-02-09 11 | 12 | **Attending:** Sanket Verma (SV), Ward Fisher (WF), Isaac Virshup (IV), Virginia Scarlett (VS), Jonathan Striebel (JS), Jeremy Maitin-Shepard (JMS) 13 | 14 | ## TL;DR: 15 | 16 | The meeting was solely focused on discussing open issues related to ZEP1. After the discussion, IV proposed his idea for a variable-length string, which could be a potential ZEP. 17 | 18 | **Meeting minutes:** 19 | 20 | - Discussion on V3 issues - check issues @ [GitHub project board](https://github.com/orgs/zarr-developers/projects/2) 21 | - IV: Strings + variable length binary – New ZEP? 22 | -------------------------------------------------------------------------------- /meetings/2023/2023-02-16.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 16th February 4 | description: ZEPs Meeting Notes for 2023-02-16 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 5 8 | --- 9 | 10 | # 2023-02-16 11 | 12 | **Attending:** Sanket Verma (SV), Dieu My Nguyen (DMN), John A. 
Kirkham (JK), Hailiang Zhang (HZ), Johana Chazaro (JC), Jeremy Maitin-Shepard (JMS), Akshay Subramaniam (AS) 13 | 14 | ## TL;DR: 15 | 16 | Jonathan Striebel laid out some discussion points before the meeting to look at. Unfortunately, he couldn’t make it to the meeting, and we decided to work on those points asynchronously via GitHub. Then SV asked everyone for their thoughts on [checksums](https://github.com/zarr-developers/zarr-specs/pull/152#issuecomment-1412688953) for the shard. Finally, HZ gave a summary of his newly submitted ZEP, which can be seen [here](https://zarr.dev/zeps/draft/ZEP0005.html). 17 | 18 | **Points of discussions:** 19 | 20 | ZEP 1: 21 | - Anything missing for ? 22 | - Change global storage transformer PR to group storage transformers: 23 | 24 | - Should we update or remove the "Storage – Operations" section? 25 | 26 | - ZEP 1 needs updates: 27 | - URL to groups and arrays: 28 | 29 | - Prepare mail for the councils for the vote 30 | 31 | **Meeting Minutes:** 32 | 33 | - SV: Your thoughts on checksum for shards? Check the discussion [here](https://github.com/zarr-developers/zarr-specs/pull/152#issuecomment-1412688953) 34 | - AS: Not really thought about the ZEP extension but it could be! 35 | - AS: Want to support applications that don’t support Zarr - would be nice to support shard - send shard over the network and decompress it over at the other end 36 | - AS: KwickIO doesn’t do compression - would be nice to support this 37 | - JMS: There’s some tension with the Zarr model - shard has data and metadata - maybe duplicate the metadata and add info over to it 38 | - AS: checksum issue is not critical - some more metadata to shard - applications in genomics and geospatial data - having number of chunks would help - applications have the context for unpacking 39 | - AS: Zarr can have wrapper which can put data in the right place 40 | - JMS: having a container of the string is the abstract of what we want 41 | - JK: Is it dependent on the data? 
- The compressor and the chunks and shard 42 | - AS: what compressor works with what data is a subjective choice 43 | - JK: Compressor could have branching logic? 44 | - AS: Could be logical to create a new compressor 45 | - JK: Branching logic would change for different datasets? 46 | - AS: It could. 47 | - SV: Dataset is public? Or can be made public? 48 | - AS: We can make it public - I’ll look into it 49 | - HZ: Brief summary of [ZEP0005](https://github.com/zarr-developers/zarr-specs/pull/205): GES DISC is looking into averaging the chunks - cost is high - introduce the algorithm - keep the overhead small - for regional data of 1 TB, the extra overhead after accumulation would be ~5% - it is working really well - big improvement in performance - very accurate - Ryan suggested adding it as an extension during last year’s ESIP meeting 50 | - SV: Maybe add benchmarks? 51 | - HZ: We can do that! 52 | - JMS: Question on the specs PR 53 | - HZ: User can specify any random range - corner would aggregate the chunk value - loading a single chunk is easy - this single chunk would contain the aggregation value and you load it - it would be transparent to the user 54 | - JMS: makes sense to have stride - for images you want to store a pyramid - you can represent a downsampling pyramid - you can do this with your proposal 55 | - HZ: if you can zoom in and zoom out - regular stride - how do we set up the stride? - in theory is it possible? - is it possible to have a version 2 for this extension ZEP? 
- the separation would be a good idea 56 | - JMS: Multiply stride by chunk size 57 | - HZ: Saving multiple chunks is not a problem - currently doesn’t support any random stride 58 | - JMS: It would mean more work for implementations but it would cover more generic use cases 59 | - HZ: I will think hard and include the modification in the PR 60 | -------------------------------------------------------------------------------- /meetings/2023/2023-02-23.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 23rd February 4 | description: ZEPs Meeting Notes for 2023-02-23 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 6 8 | --- 9 | 10 | # 2023-02-23 11 | 12 | **Attending:** Sanket Verma (SV), Ward Fisher (WF), Ryan Abernathey (RA), Jonathan Striebel (JS) 13 | 14 | ## TL;DR: 15 | 16 | Finally, after several meetings, all the essential issues related to ZEP1 (V3) are resolved! However, some minor tasks remain, which JS will be looking at in the upcoming days. Then, JS asked whether we should ship V3 with OGC and whether chunk key encoding could be an extension. RA mentioned he’s working on an extension ZEP for non-listable stores. We also discussed having a hack week to eliminate the technical gap between zarr-python’s V3 implementation and the current V3 specification. 17 | 18 | **Updates:** 19 | 20 | - New extension by Hailiang Zhang, see here: 21 | 22 | **Meeting Minutes:** 23 | 24 | - JS: No items in TODOs and Needs pr in ZEP1 [project board](https://github.com/orgs/zarr-developers/projects/2/views/2) - spec is reaching its final stage 🎉 - last few days at scalable minds - needs to finish the remaining tasks soon! 25 | - JS: Should we ship V3 with OGC? 
26 | - RA: We already have Zarr V2 Spec as an OGC standard - we can ask for v3 - but it’s more of a take-it-or-leave-it thing 27 | - JS: [chunk key encoding](https://github.com/zarr-developers/zarr-specs/issues/172) could be an extension - separate key in the metadata - may prepend 0 - two things should be configurable separately 28 | - RA: could it be an extension? 29 | - JS: possibly! 30 | - RA: we should have it - this could let you see only the metadata when opening a directory without seeing the whole array 31 | - JMS: separator `/` would allow multiple possibilities 32 | - JS: should be backwards compatible, not a breaking feature 33 | - RA: working on an extension ZEP for non-listable stores - 34 | - wants to run it by the community first - read only stores - no writes to these stores 35 | - copied from STAC - link: 36 | - provide explicit link between parent and child documents - write a store and create links for the store - 37 | - JMS: new group property, no reason to have parent - because you always know the parent 38 | - RA: what if someone gives you a middle hierarchy array address? - it’s helpful then 39 | - RA: Maybe we should not advertise V3 as we don't have a reference implementation 40 | - SV: hack week is a good idea to get rid of the existing technical debt 41 | - JMS: Looking to implement V3 in tensorstore and can help with Zarr-Python too! 
42 | -------------------------------------------------------------------------------- /meetings/2023/2023-03-09.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 9th March 4 | description: ZEPs Meeting Notes for 2023-03-09 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 7 8 | --- 9 | 10 | # 2023-03-09 11 | 12 | **Attending:** Hailiang Zhang (HZ), Dieu My Nguyen (DMN), Josh Moore (JM), Ryan Abernathey (RA), Jeremy Maitin-Shepard (JMS) 13 | 14 | ## TL;DR: 15 | 16 | The meeting started with a discussion on Sharding as a [transformer vs codec](https://github.com/zarr-developers/zarr-specs/issues/220). Sharding is implemented as a transformer, and then we discussed the pros and cons of implementing Sharding as a codec. JMS had briefly discussed [#219](https://github.com/zarr-developers/zarr-specs/issues/219). After this, HZ gave a brief on [ZEP0005](https://zarr.dev/zeps/draft/ZEP0005.html) but couldn’t present his slides due to Zoom/screen sharing issues - tabled for the next community meeting. 17 | 18 | **Meeting Minutes:** 19 | 20 | - JMS: sharding as transform vs. codec 21 | - 22 | - RA: been playing with sharding, trying to get the most out of it 23 | - Key question (impl or spec?): if sharding is a codec how does the outer layer which range it wants? (general problem in zarr-python for blosc) 24 | - requires passing context to context 25 | - transform is explicit; codec is less clear 26 | - JMS: in zarrita he has the codec take an indexing expression (optional?). defer some of that for ZEP2? arrays vs. bytes vs. additional concept of arrays. 27 | - RA: similar to Martin's request 28 | - JMS: first codec is fine, but the next one is less clear. need to be more explicit about the interface. 29 | - RA: need to solve this. what information needs to be passed in between (implementors and at spec level) 30 | - RA: e.g. could be a codec that takes an HDF5 file (blosc2, etc.) 
missed a chance to build the right abstractions there. 31 | - JMS: `codecs := array|bytestream in; array|bytestream out` 32 | - JM: recursive zarrs all the way down? 33 | - JMS: concatenation of other arrays 34 | - RA: Norman's justification. JMS' proposal. re: how to integrate other things like referencing between arrays, shards defining own chunking, etc. (doesn't change anything in ZEP1) 35 | - JMS: transforms as bytes, and codecs can access arrays 36 | - JMS: NB: MD wants low level store to be aware of array indexing 37 | - JM: always thought of codecs as the lowest thing that is unaware of arrays 38 | - JMS: combined compression with filters (which can operate on arrays, transpose) 39 | - RA: sharding fundamentally breaks core abstraction between store / codec. at the impl. level, want efficient/fast code to fetch chunks of shard, make smart decisions, close to the metal. but the naive thing isn't fast. do the core abstractions break down? no longer using key/value store API. using offsets into storage. 40 | - JMS: don't see byte range as breaking. addition to the interface. 41 | - RA: not just a file format, but a protocol for addressing chunks. 42 | - JMS: dimension names metadata 43 | - 44 | - RA: would be for keeping or making it easy to not use 45 | - HZ/DMN: ZEP5 presentation (recorded) 46 | - 47 | - ![](https://i.imgur.com/bGqN3gA.jpg) 48 | - HZ: Tabling because of Zoom issues. 49 | - RA: re: expectations -- very limited due to the number of people working on the spec. (it's taken *years*) so ... 6 months? 50 | - HZ: this is an extension, doesn't block anything. 
51 | -------------------------------------------------------------------------------- /meetings/2023/2023-03-16.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 16th March 4 | description: ZEPs Meeting Notes for 2023-03-16 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 8 8 | --- 9 | 10 | ## 2023-03-16 11 | 12 | **Note:** This ZEP meeting was an impromptu meeting. Please see the corresponding message on Gitter [here](https://matrix.to/#/!nZLdXRRzIbkoDjkEvS:gitter.im/$oxM2UpzOTs--6P1Itl6gPWvLRCBEv_npvxJYi5m95l8?via=gitter.im&via=matrix.org&via=cadair.com). 13 | 14 | **Attending:** Sanket Verma (SV), Norman Rzepka (NR), Jonathan Striebel (JS), Jeremy Maitin-Shepard (JMS) 15 | 16 | ## TL;DR: 17 | 18 | NR was confused about sharding as a codec and thought changing it would require rewriting the interface. JS and JMS answered NR’s questions. After this, we discussed sending ZEP1 for voting and concluded that there needs to be an editorial change before we send it. JS initiated a discussion on [#161](https://github.com/zarr-developers/zarr-specs/issues/161) and [#222](https://github.com/zarr-developers/zarr-specs/pull/222). NR also asked if we have a contributors list for V3, and SV took the job for himself. 19 | 20 | **Meeting Minutes:** 21 | 22 | - NR: Confused for sharding as a codec! - Making sharding as a codec will change the interface - 23 | - JMS: doesn’t agree with Martin’s point 24 | - JS: doesn’t change the codec - will nest a new one - V3 doesn’t need to change anything 25 | - NR: Need to change ZEP2 then 26 | - NR: Zarr-Python API is mess right now 27 | 28 | ~JMS joins in~ 29 | 30 | - NR: Should we send ZEP1 for voting? 31 | - JMS: Martin wants to push everything as a storage transformer - keep storage transformers in the spec - we could also defer that decision to ZEP2 32 | - NR: I wonder if we have too much implementation detail in V3? 
- Whether we need partial read or not? 33 | - JMS: Partial read are not required for sharding 34 | - JS: 35 | - JMS: Mostly concerned with JSON metadata - haven’t starting doing the implementation 36 | - NR: Some behaviour needs to be defined - everything goes beyond - doesn’t need to strip the spec 37 | - JS: hierarchy discovery - what happens if you delete the chunks and what happens then? 38 | - JS: We’re fine as it is now! 39 | - JMS: For someone is reviewing the ZEP0001 - JSON metadata is important - but it is burried in the middle - an editorial change would be helpful to put on the top 40 | - JS: Glossary defined at the top is not optimal 41 | - JMS: Would look into restructuring the metadata to the top 42 | - JMS: Would start working on V3 implementation of Neuroglancer and tensor store 43 | - JS: - 44 | - NR: This is something of an implementation detail 45 | - JS: Maybe we can mark it them as implementation detail 46 | - JMS: Josh and Ryan brought up the idea of codec vs transformers in the last ZEP meeting - so I wrote this PR 47 | - NR: Move sharding to a codec? 48 | - JMS: Josh was skeptical - Ryan was in favour - we should go ahead with the proposal for sharding as a codec 49 | - NR: Will make the changes to the ZEP2 to the next week 50 | - JS: Having sharding as similar to blosc2 and nvcomp is a strongest opinion 51 | - NR: Contributors for V3 so far? 52 | - SV: Will make a contributor list after the voting email goes out! 
53 | -------------------------------------------------------------------------------- /meetings/2023/2023-03-23.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 23rd March 4 | description: ZEPs Meeting Notes for 2023-03-23 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 9 8 | --- 9 | 10 | ## 2023-03-23 11 | 12 | **Attending:** Sanket Verma (SV), Jeremy Maitin-Shepard (JMS), Norman Rzepka (NR), Ward Fisher (WF) 13 | 14 | ## TL;DR: 15 | 16 | NR started by asking whether Hailiang’s [ZEP0005](https://zarr.dev/zeps/draft/ZEP0005.html) is in the ZEP scope. Everyone had different thoughts, and SV thought it might be part of [Ryan Abernathey’s ZEP](https://github.com/zarr-developers/zeps/pull/28). Next, SV presented the draft email that JMS and JS are supposed to send out soon. And lastly, JMS had some thoughts about Blosc being a special codec due to its shuffling nature. 17 | 18 | **Updates:** 19 | 20 | - Hailiang presented [ZEP0005](https://zarr.dev/zeps/draft/ZEP0005.html) yesterday 21 | - Check the recording here: 22 | 23 | **Meeting Minutes:** 24 | 25 | - NR: Is Hailiang’s ZEP in the Zarr scope? 26 | - JMS: Not possible to evaluate the proposal right now - agree with the scope - maybe it’s not for us to judge it - it could be a metadata convention - doesn’t need to be implemented with Zarr-Python itself 27 | - NR: Whether it needs to be standardised or not? - Could understand some part of the proposal - may not be an extension - because it’s on top of Zarr (the accumulation attribute) 28 | - SV: Maybe part of [Ryan’s ZEP](https://github.com/zarr-developers/zeps/pull/28)? 29 | - NR: Trying to find a specific use case and I couldn’t find it - something similar to OME naming conventions 30 | - JMS: Prefixing OME w.r.t. 
to OME keys 31 | 32 | - SV: ZIC Email 33 | - JMS: We can remove the reference to the Zarr-Python implementation from the email 34 | - NR: You can add Zarrita! 35 | - SV: Is it in a better state of synchronisation compared to the zarr-python V3 implementation? 36 | - NR: Yes! 37 | - JMS: Sure, we can do it! 38 | - JMS: Reorder metadata to the top and then send the email 39 | - NR: Sharding as a codec PR: - if it can be merged before the email then it’ll be great! 40 | - NR: Want to bundle ZEP0002 with ZEP0001 - 41 | - SV: Not a good idea! 42 | - NR: Alright! 43 | - NR: Rendering of the Read the Docs - SV: check this out: 44 | 45 | - JMS: Blosc is a special codec! - bytes to bytes codec - the shuffle parameter has some logic - the shuffling is happening in Zarr V3 - which is weird - adds a weird abstraction - potentially useful for users to specify shuffling manually 46 | - JMS: Proposed values for the shuffling for the blosc codec: {null, "bit", 1, 2, 3, 4…} 47 | - Will create an issue or send a PR to the numcodecs repository 48 | 49 | - SV: Once the re-ordering (moving metadata section to the top) of V3 is done, we can send out the email 50 | - NR: I’ve added Zarrita as the reference implementation for V3 in the email! 51 | -------------------------------------------------------------------------------- /meetings/2023/2023-04-06.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 6th April 4 | description: ZEPs Meeting Notes for 2023-04-06 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 10 8 | --- 9 | 10 | # 2023-04-06 11 | 12 | **Attending:** Sanket Verma (SV), Ryan Abernathey (RA), Jeremy Maitin-Shepard (JMS), Ethan Davis (ED) 13 | 14 | ## TL;DR: 15 | 16 | In today’s ZEP Meeting, we worked on the text to kick off the review process for ZEP0001. 
After reviewing the document and making a few necessary changes, JMS created an issue to notify the ZIC and ZSC that ZEP0001 has finally entered the review phase. Please check the relevant issue: . 17 | 18 | **Updates:** 19 | 20 | - ZEP1 is ready for review after merging [#224](https://github.com/zarr-developers/zarr-specs/pull/224) 21 | 22 | **Meeting Minutes:** 23 | 24 | - SV: [ZEP0001](https://zarr.dev/zeps/draft/ZEP0001.html) is ready for review 25 | - RA: What does ZIC think of implementing the spec? We should ask them! 26 | - JMS: Sounds good! 27 | - RA: We can use the main issue to cast the votes, and open separate issues for additional concerns as having everything in a single issue will clutter it 28 | - RA: Are we expecting a lot of feedback and then we would need to edit the spec? Because we've already done a lot of work to reach the current state of the SPEC 29 | - JMS: Think people who haven't been involved in the SPEC process so far would not be too concerned with the details 30 | - ED: Has there been any involvement from the ZIC? 31 | - RA: Not much! 32 | - ED: Then maybe it's gonna be clarifications and similar questions 33 | - RA: In my experience, you can design as much as you want but real issues start coming up when you start implementing it 34 | - ED: Yeah! 35 | - JMS: It's not easy to get a large group of people to agree on the same thing at the same time! 36 | - RA: Do we have a reference implementation of V3? 37 | - SV: Yeah, [Zarrita](https://github.com/scalableminds/zarrita)! 38 | - RA: Need to provide some clarifications on the voting mechanism 39 | - They could vote to approve 40 | - They could vote to abstain 41 | - And they could vote to veto 42 | - RA: Provide guidance on the implementation as well 43 | - ED: In OGC, you can say no but with comments 44 | - RA: We should plan that there's no veto! 
45 | - JMS: Need to avoid the case where the implementors start working on a new spec on their own - also a fact that this is a community process - you can't force people to implement something new if it isn't helpful for them - instead you can help them with their additional use cases 46 | - ED: Why veto and why not just 'No'? 47 | - RA: That's for other ZEPs not for ZEP0001 because it's like a constitution of Zarr 48 | - ED: The veto is a contentious issue 49 | - RA: Is there a process to respond to the feedback? Is there a time period for that? 50 | - SV: We should accommodate all of the feedback in a month's time 51 | - ED: Is the vote going to be, 'Yes' we're moving forward with it and there's gonna be a separate discussion for the implementation? - OGC has a public comment period 52 | - RA: There are heavy-handed processes like OGC, ISO but we don't want to use them; we just made a process for our project and the community to reach a consensus - like the idea of a comment period from the council and they have a month to solicit their feedback - and after that we'll take a vote 53 | - JMS: If people wanted to comment, they'd have already done so 54 | - RA: I'd love to think so but all the implementors are busy and they might be waiting for someone to come to them to review the spec - the downside of doing this is we're going to present this as a take it or leave it and that's not fine given Zarr is a community-owned OS project! 55 | - JMS: What does leave it mean? They'll not upgrade to V3 from V2 - Is it fine? 56 | - SV: I know some folks from the council couldn't keep up with the progress and they've been waiting for the SPEC to go into the review phase 57 | - RA: It's been in the review phase forever! - would like to see robust handling of the extension points - getting behind the idea of 1 month time 58 | - ED: Release candidate for the extension? 59 | - ED: Does approval mean that the SPEC will be 3.0 or 3.1.0 or 3.0.1? 
60 | - SV: It's gonna be 3.0 61 | - JMS: How about the folks who were involved during the development of the SPEC? 62 | - SV: Will take care of it after the voting period 63 | - ED: It's a good to list the contributors! 64 | - SV: Does everything seems fine? 65 | - JMS: Yes! 66 | 67 | - ZEP0001 Review issue: 🎉 68 | -------------------------------------------------------------------------------- /meetings/2023/2023-04-20.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 20th April 4 | description: ZEPs Meeting Notes for 2023-04-20 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 11 8 | --- 9 | 10 | # 2023-04-20 11 | 12 | **Attending:** Jeremy Maitin-Shepard (JMS), Jonathan Striebel (JS), Ryan Abernathey (RA) 13 | 14 | ## TL;DR: 15 | 16 | During the discussion on the V3 specification, the community explored different codecs, including `array → array`, `array → bytes`, and `bytes → array`, and evaluated their advantages and disadvantages. They also debated whether to include codecs in the metadata or not. 17 | -------------------------------------------------------------------------------- /meetings/2023/2023-05-04.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 4th May 4 | description: ZEPs Meeting Notes for 2023-05-04 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 12 8 | --- 9 | 10 | # 2023-05-04 11 | 12 | **Attending:** Ward Fisher (WF), Josh Moore (JM), Sanket Verma (SV), Jonathan Striebel (JS), Jeremy Maitin-Shepard (JMS), Ryan Abernathey (RA) 13 | 14 | ## TL;DR: 15 | 16 | During the meeting, SV provided an update on the current voting status for [ZEP0001](https://github.com/zarr-developers/zarr-specs/issues/227), while WF plans to vote soon with the assistance of colleagues at Unidata. RA suggested that we should be open to changes in the living document, i.e. 
V3 Spec and not adhere strictly to the by-laws. JS recommended having a list of contributors on the ZEP0001 page. SV also discussed pending tasks on the ZEP0001 project board before finalising the spec. JMS briefly discussed Mark’s [comments](https://github.com/zarr-developers/zarr-specs/pull/152#issuecomment-1533335795) on the sharding PR, and the meeting concluded with an impromptu discussion about organising a Zarr conference. 17 | 18 | **Updates:** 19 | 20 | - [ZEP0001](https://github.com/zarr-developers/zarr-specs/issues/227) review 21 | - Ends in 2 days on 5/6 @ 23:59 22 | - 2 votes so far 23 | - Constantine - `ABSTAIN` 24 | - Jeremy - `YES` 25 | - 8 votes pending - (if we leave zarr-python out then 7) 26 | 27 | **Meeting Minutes:** 28 | 29 | - SV: summary on votes so far (as above) 30 | - WF: Working with the colleagues @ Unidata - will be voting on the V3 Spec soon! 31 | - RA: wouldn't stick hard-and-fast to the by-laws if they need 32 | - JS: Mostly giving people a point in time for veto. It's a living document. 33 | - WF: comparing to markdown, might be good to have this process 34 | - SV: @Ward - Is there a PR you're looking to submit for ZEP1 review? 35 | - WF: User attributes for the metadata field - a bit of certainty in the `must_understand` flag and user-defined attributes - we can have arbitrary tags in user attributes specifying it doesn't require `must_understand` flag - will prepare a PR for the same 36 | - JMS: think of the review as an intent to implement 37 | - JM: agree, looking forward to using this to garner motivation 38 | - JS: looking forward to a retrospective; smaller changes. 39 | - RA: spectrum of processes; OGC is the most heavy-handed; STAC is most agile (what's their trick? 
everything has an implementation) 40 | - JM: STAC still needs to do the major upgrade to V2 - something I'm looking forward to 41 | - `must_understand` flag 42 | - JMS: Still not clear on Dennis' objection 43 | - WF: (for Dennis) worried about future changes painting us into a corner - good to keep Dennis in loop and listen to his concerns 44 | - JS: Point where you can see both the sides - it's not a deal breaker for the V3 Spec 45 | - RA: Any downside for making `must_understand` flag a required attribute for an extension? - Seems lightweight and can satisfy Dennis 46 | - JS: To rephrase JMS's point - "You can do this but then it's not possible to have non-object entries in the config again." 47 | - JMS: We haven't clarified in codecs the presence of unknown attributes in codecs! 48 | - JS: Would be good to have a list of contributors for the ZEP1 49 | - SV: Will complete it before we finalise ZEP1 50 | - JM: Would be good to put Zarr Spec on Zenodo as well! 51 | - SV: State of the [ZEP1 project board](https://github.com/orgs/zarr-developers/projects/2/views/2) 52 | - 2 issues in meta and 1 under discussion 53 | - JS: We can ignore the one under discussion - RA can take care of the [OGC](https://github.com/zarr-developers/zarr-specs/issues/42) one 54 | - RA: We can update the community standard we already have with OGC 55 | - JS: Not super happy with how we do the Spec work atm - see [#179](https://github.com/zarr-developers/zarr-specs/issues/179) - we need to address this once V3 gets finalised 56 | - SV: What does updating V3 at OGC means? Does it supersedes V2 or V3 gets published at a new URL alongside V2? 57 | - All: Don't know! RA can take care of it. 58 | - RA: geozarr has become a comparison of geo/weather specs 59 | - but will remain a convention 60 | - JM: love to hear more. Maybe we can have a Zarr conventions convention ;) 61 | - RA: all spatial/temporal. with infinite time, would be great to work together. 
62 | - JMS: Discussion on [comments](https://github.com/zarr-developers/zarr-specs/pull/152#issuecomment-1533335795) by Mark on Sharding PR 63 | - Would be a good thing to add a checksum of the index, not the individual chunks (data) - will add 4 extra bytes to the index - JMS favours this! 64 | - JS: Would be good to ask Norman - but I also favour the idea 65 | - Impromptu discussion on organising a Zarr Conference 66 | - JS: Really good idea! We should have it 67 | - JMS: In-person has more value 68 | - JS: Would be good to co-locate with other conferences 69 | - JM: From my experience it was difficult to get together folks from different fields - maybe we can look for hosts like Janelia, NASA etc. 70 | - JM: Thermo Fisher is also looking into Zarr 71 | -------------------------------------------------------------------------------- /meetings/2023/2023-05-18.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 18th May 4 | description: ZEPs Meeting Notes for 2023-05-18 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 13 8 | --- 9 | 10 | # 2023-05-18 11 | 12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Ryan Abernathey (RA), Norman Rzepka (NR), Jeremy Maitin-Shepard (JMS) 13 | 14 | ## TL;DR: 15 | 16 | The meeting covered the acceptance of ZEP0001, signaling the initiation of implementations. Discussions revolved around updates for ZEP0001 and V3, with Zarrita noted as a comprehensive PoC. ZEP0002 discussions focused on shard structures and an iterative ZEP process. Additionally, there were insights into experimentation with virtualization and metadata linking, and considerations for handling unmaintained Zarr implementations, emphasizing a shift towards V3 developments. 17 | 18 | **Meeting minutes:** 19 | 20 | - Discussions about climate and weather 🌡️ 21 | - [ZEP0001](https://github.com/zarr-developers/zarr-specs/issues/227) is finally accepted! 
🎉 22 | - Implementors can start working on their implementations 23 | - Will be moving ZEP0001 under the new `Accepted` section 24 | - Will move it under the `Final` section once we have at least one complete reference implementation 25 | - SV - updates and next steps for ZEP0001 and V3: 26 | - 1 year into the process... ([2019](https://zarr.dev/blog/v3-update/) first discussion) 27 | - [gdal](https://github.com/OSGeo/gdal/pull/7706) moving quickly 28 | - will be checking in on the various implementations 29 | - zarrita as one of the most complete (in terms of code, not docs or tests), i.e. PoC 30 | - JM: Reference implementation needs to be usable! 31 | - NR: Yeah! 32 | - RA: why isn't it a complete implementation? 33 | - NR: no optimizations (sharding, etc.). meant to be easy to read code. 34 | - lacks features like buffer protocol, etc. but could be used. 35 | - don't currently plan to maintain it over a long period of time. 36 | - SV: was thinking less of end-user and more of supporting all the features so others can refer to it. 37 | - NR: that probably could be done now. 38 | - NR: could write an intro for people to read. (don't want to write end-user docs) 39 | - RA: NR's production implementation? use different file format currently. must have sharding. 40 | - considering using zarrita as the implementation (for Python stack) 41 | - also have a scala stack (baked into software) 42 | - JM: ZEP0002 voting and discussions 43 | - JM: We could open up the voting for ZEP0002 and give a month/two months/the full summer for voting - any open issues? 44 | - NR: None! 45 | - JM: Shard as recursive Zarrs? - treat internals of shards like another Zarr 46 | - NR: Sharding being a codec would work that way but it's more of an implementation detail 47 | - JM: What would it look like from a URI structure? - similar to what Saalfeld is doing in the N5 ecosystem - if I access a chunk inside a shard and I could treat it as a Zarr array and not as a blob 48 | - NR: Fair enough! 
I could have something like this in Zarrita 49 | - NR: Not have implementation details in the spec but rather point to the reference implementation for the details 50 | - RA: ZEP process should be more of an iterative process and not an ultimatum 51 | - JMS: I feel most of the implementors would be working on sharding and V3 together 52 | - RA: flat files to virtualization (experimentation). (Once ZEP0003 lands!) 53 | - Also want to see linked referencing in metadata (to allow browsing through HTTP). Better than consolidated. For e.g. infinite hierarchies, allows browsing like a catalog 54 | - Allow parent to list its children in Zarr groups 55 | - Also multiscales. Relates the arrays. Within the array directory? 56 | - NR: listing parent/children 57 | - NR: paths... 58 | - RA: see "link" in for any node (self, root, parent, child) 59 | - NR: Discussions on unmaintained Zarr implementations 60 | - SV: Dissolve them with the help of the maintainers 61 | - JM: Tricky to get a hold of them 62 | - SV: Start deprecating them and then removing them from 63 | - JMS: But all of that is V2 - if there's going to be something on V3 we can certainly help them 64 | -------------------------------------------------------------------------------- /meetings/2023/2023-06-01.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 1st June 4 | description: ZEPs Meeting Notes for 2023-06-01 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 14 8 | --- 9 | 10 | # 2023-06-01 11 | 12 | **Attending:** Ward Fisher (WF), Sanket Verma (SV), Jeremy Maitin-Shepard (JMS), Ryan Abernathey (RA), Norman Rzepka (NR), Josh Moore (JM) 13 | 14 | ## TL;DR: 15 | 16 | The meeting discussed ongoing work on ZEP 4, focusing on conventions for diverse domains like bio-imaging, geospatial, and genomics. 
There were considerations about whether these conventions should be general or domain-specific and how to handle legacy data. Additionally, discussions covered topics such as the organization of conventions on the website, the separation of codecs, challenges in introducing transformations to the codec pipeline, and the exploration of fallback data types, emphasizing the need for broader community discussions on certain aspects. 17 | 18 | **Updates:** 19 | 20 | **Meeting Minutes:** 21 | 22 | - SV: ZEP 4 → 23 | - RA: Still working on it 24 | - SV: How can we help? 25 | - RA: Need to decide if it's going to be a general convention (for bio-imaging, geospatial, genomics etc.), or a convention of Geospatial domain all 26 | - RA: Also need to decide if the dataset are conforming to a convention or not - lot of legacy data out there which doesn't conform to it 27 | - RA: 28 | - WF: Convention doesn't need to be broad, cf convention are based on NetCDF model - but there's nothing in the NetCDF library or code that mentions the cf conventions! 29 | - RA: The existing conventions are broad, and it's difficult to place cf in a specific place 30 | - WF: Agree with you 31 | - RA: Define what's the process to put a new convention for the community 32 | - JMS: You have group level attribute and array level attribute? 33 | - RA: Mostly yes! 34 | - WF: There 35 | - RA: Getting all convention on the website would be a good way for cross domain and community collaboration - conventions can be composable - conventions could not be universal 36 | - WF: Conventions move slowly - take time to adopt to new things - took a good amount of time to solve SST (Sea Surface Temperature) 37 | - RA: Will get the another draft out soon 38 | - JMS: Not feasible to namespace an attribute? 39 | - RA: It would require a deep re-factoring for the software we use! 
- It would break Xarray, GDAL, NetCDF - Zarr-Python doesn't care about the attributes 40 | - WF: Namespacing would definitely break the NetCDF library! 41 | - RA: JMS, how do you handle conventions? 42 | - JMS: Generally, doesn't invent new conventions, and implement existing conventions - the data formats I invented, I defined those conventions - these existing conventions doesn't lack _certain_ things 43 | - NR: In the process of adopting Zarr V3, currently using `OME-Key` 44 | - WF: Attributes are strictly defined - defined to be interpretable not changeable 45 | - JMS: Maybe the best idea is to say clearly that _X_ datasets use the conventions and multiscales 46 | - JMS: 47 | - NR: No need to separate them - my opinion: to keep it as it is - maybe Chris's comment is coming out of the Rust world and separating the codecs will be convenient for him 48 | - JMS: 49 | - JM: Cares about the codec pipeline - adding some transformations in the middle would be tricky! 50 | - NR: Feel the same! - Sharding as a single codec makes sense but adding anything would make it complicated - 51 | - JM: Current partial codec --- define the metadata format for shard codec would be great! 52 | - NR: Adding 2 partial codecs would make it tricky to implement 53 | - JMS: Blosc is kind-of sharding codec - not clear if using sharding as a partial read codec is good idea! 54 | - JMS: How do you have partial writes for the codec? 55 | - JMS: Fallback data types 56 | - JMS: Need to have a broader discussion 57 | - NR: Also, do extensions data type need to have a fallback? 58 | - JMS: No, it's optional 59 | - NR: The definition of fallback - like a tuple - having a datatype and the fallback value 60 | - JMS: Maybe it's the way to go! 
61 | -------------------------------------------------------------------------------- /meetings/2023/2023-06-15.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 15th June 4 | description: ZEPs Meeting Notes for 2023-06-15 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 15 8 | --- 9 | 10 | # 2023-06-15 11 | 12 | **Attending:** Sanket Verma (SV), Alan Watson (AW), Jeremy Maitin-Shepard (JMS), Josh Moore (JM), Norman Rzepka (NR) 13 | 14 | ## TL;DR: 15 | 16 | The meeting highlighted SV's recent talk on Zarr at Vrije Universiteit Amsterdam, providing access to the slides and code. Discussions covered the existence of command-line tools for Zarr, including the discovery of zarr-tools. AW shared Zarr adoption in a game project and encouraged its use in brain conferences. Additionally, there were updates on Allen's utilization of Zarr, a forthcoming V3 blog post by SV, and ongoing discussions about addressing the fallback data types issue and the review timeline for ZEP0002. 17 | 18 | **Updates:** 19 | 20 | - SV gave a talk on Zarr @ Vrije Universiteit Amsterdam - 21 | - Slides and code: 22 | 23 | **Meeting Minutes:** 24 | 25 | - Are there any command line tools for Zarr? 
26 | - Found this: 27 | - JM: Was working on something on this but didn't use it much 28 | - NR: There are a bunch of tools which you can use with OME-Zarr, having something like H5LS would be cool 29 | - JMS: Operations for small things makes sense - rechunking, copying 30 | - JM: Nextflow - workflow engine 31 | - AW: BIL is working on games 32 | - 33 | - Pushing them to use Zarr for their image (.png) data 34 | - AW: Interest and benefit in attending brain conferences - someone from the Zarr community 35 | - 36 | - AW: Allen is using Zarr for their work 37 | - SV: Recent paper out of Allen: 38 | - Extensive usage at Allen has revealed some problems and it may be worth addressing them 39 | - JMS: Writing electrophysiology data in Zarr rather than tiff is a good way to go forward 40 | - SV: Blog post on V3 coming out soon! 41 | - JMS: Fallback data types issue needs to be addressed 42 | - JMS: Not clear how it'll be specified 43 | - JMS: How do you handle it in Zarrita? 44 | - NR: Currently, we do not! 45 | - SV: How serious is it? Implementation or spec issue? 46 | - JMS: Kind of ignoring it for now! We're in implementation phase now! 47 | - JM: If everybody is ignoring it, then it's fine 48 | - NR: Would not be straightforward to add it later! 49 | - JMS: If implementation doesn't support, it'll fail 50 | - SV: Have you started on the implementation? 51 | - JMS: Tensorstore has V3 minus sharding; planning to work on it this week 52 | - NR: Rust implementation of V3 - 53 | - NR: Benchmarking in Zarrita 54 | - 55 | - SV: Benchmarks in Tensorstore? 56 | - JMS: Seen bottlenecks in the IO layer rather than the array layer 57 | - NR: What about codecs? 58 | - JMS: 59 | - NR: 10x performance improvement would be great! 60 | - NR: ZEP0002 review timeline 61 | - SV: V3 blog post, feedback for ZEP process, and then we can add ZEP0002 in the pipeline 62 | - NR: Ok! 
63 | - SV: Maybe we need to invite Chris and others from the ZIC to the ZEP meeting 64 | - JM: There used to be a Zarr-Rust meeting 65 | -------------------------------------------------------------------------------- /meetings/2023/2023-06-29.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 29th June 4 | description: ZEPs Meeting Notes for 2023-06-29 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 16 8 | --- 9 | 10 | # 2023-06-29 11 | 12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Jonathan Striebel (JS), Ryan Abernathey (RA), Ward Fisher (WF), Daniel Jahn (DJ), Jeremy Maitin-Shepard (JMS), Norman Rzepka (NR) 13 | 14 | ## TL;DR: 15 | 16 | The meeting focused on ZEPs 3 and 5, emphasizing the importance of ongoing implementation to avoid stalling and addressing technical debt in Zarr-Python. Discussions revolved around best practices, supporting old conventions, and the use of conformance tests. The possibility of multiscales as an extension was explored, and plans were made to kick off the ZEP0002 review process, with considerations for neat benchmarks and gathering feedback from the Zarr Implementers Community (ZIC). 17 | 18 | **Meeting minutes:** 19 | 20 | - SV: ZEPs 3 and 5 ... 21 | - RA: feedback on the ZEP 22 | - Need to be implementing as we go, otherwise leads to stalling 23 | - JS: Zarr-Python has tech debt which makes it difficult to implement new stuff 24 | - *Impromptu round of introduction* 25 | - JMS: Reference implementation in any language is helpful for any new ZEPs 26 | - RA: Having explicit tweet/statement about implementation would help 27 | - JS: benchmark repo? also sample data? 
28 | - NR: Sample datasets 29 | - RA: 30 | - best practices going forward 31 | - but a way to support old conventions 32 | - from OGC, "conformance class" determined with "conformance test" 33 | - namespacing up to the convention 34 | - would like to get it into draft form and then we can move forward with the process 35 | - SV: few open comments? 36 | - RA: just merge it and move the process forward. 37 | - still need to open a template 38 | - NR: add existing conventions at this point (OME-NGFF) 39 | - JM: Thoughts on multiscales? 40 | - RA: It should be an extension 41 | - JMS: Multiscales doesn't lead to a lot of objects 42 | - SV: ZEP0002 review process to kick-off by next week 43 | - NR: Zarrita has sharding implementation 44 | - JM: Kicking-off review process and having neat benchmarks, any idea how we could do both? 45 | - NR: Make a new issue for ZIC feedback 46 | -------------------------------------------------------------------------------- /meetings/2023/2023-07-13.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 13th July 4 | description: ZEPs Meeting Notes for 2023-07-13 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 17 8 | --- 9 | 10 | # 2023-07-13 11 | 12 | ## Josh Moore and Sanket Verma are presenting at [SciPy 2023](https://www.scipy2023.scipy.org/). 
13 | 14 | ### Check the proposal: -------------------------------------------------------------------------------- /meetings/2023/2023-07-27.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 27th July 4 | description: ZEPs Meeting Notes for 2023-07-27 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 18 8 | --- 9 | 10 | # 2023-07-27 11 | 12 | **Attending:** Sanket Verma (SV), Norman Rzepka (NR), Ward Fisher (WF), Jonathan Striebel (JS), Jeremy Maitin-Shepard (JMS) 13 | 14 | ## TL;DR: 15 | 16 | The meeting focused on advancing ZEP2, with plans to send it to the Zarr Implementation Council (ZIC) for review. Sharding implementations were discussed, including progress on Tensorstore and considerations for Zarrita's compatibility with FSSPEC. Updates on ZEP1 and V3 were highlighted, with pending pull requests, and the possibility of Unidata's involvement in the roadmap was mentioned. Overall, anticipation was expressed for ZEP2's review and progress in various implementations. 17 | 18 | **Meeting minutes:** 19 | 20 | - Send [ZEP2](https://zarr.dev/zeps/draft/ZEP0002.html) to the ZIC 21 | - Merged 22 | - Need to merge -> SV 23 | - SV will send out the email to the ZIC 24 | - Try to fix crosslinks 25 | - JS to close [PR #152](https://github.com/zarr-developers/zarr-specs/pull/152) 26 | - NR to create an issue to gather votes and update the ZEP [PR #40](https://github.com/zarr-developers/zeps/pull/40) 27 | - SV to send an email to the ZIC after issue creation and PR merging 28 | - Everyone looking forward to it! 
29 | - Mark wanting to organise a ZEP2 review call - but didn't happen 30 | - Sharding implementation 31 | - JMS working on Tensorstore implementation 32 | - V3 implementation on Tensorstore is close to completion 33 | - Zarrita had a noticeable overhead while running the benchmarks 34 | - NR: Zarr-Python sharding implementation has deviated over the time 35 | - JS: Make sense to add sharding as a codec once V3 in Zarr-Python gets in 36 | - JMS: Is there aim for having the similar API for V2 and V3 in Zarr-Python? 37 | - NR: Zarrita doesn't have various stores 38 | - JMS: Is Zarrita compatible with FSSPEC? 39 | - NR: Yes! 40 | - ZEP1 and V3 41 | - 42 | - 43 | - 44 | - 45 | - Once the PRs are merged, SV to send out the FYI to the ZIC 46 | - SV: Any developments on Unidata side? 47 | - WF: Not yet, but it's on our roadmap 48 | -------------------------------------------------------------------------------- /meetings/2023/2023-08-10.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 10th August 4 | description: ZEPs Meeting Notes for 2023-08-10 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 19 8 | --- 9 | 10 | # 2023-08-10 11 | 12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Jonathan Striebel (JS), Norman Rzepka (NR) 13 | 14 | ## TL;DR: 15 | 16 | The meeting discussed updates on Zarr-Python working groups, a proof-of-concept implementation for ZEP0003, and presentations at SciPy 2023. Notable mentions included removing watermarks from the V3 spec, recognizing contributors for ZEP0001, and ongoing discussions on collaborating with Blosc and Dask for efficient chunking and sharding across the PyData stack. There was also exploration into a lightweight ZEP process for codecs, with considerations for updating ZEP0000 and scheduling ZEPs for voting. Fundraising discussions and potential collaborations with Napari were explored to sustain momentum in the Zarr project. 
17 | 18 | **Updates:** 19 | 20 | - Zarr-Python working groups 21 | - Benchmarking and performance: 22 | - Refactoring: 23 | - POC implementation of [ZEP0003](https://zarr.dev/zeps/draft/ZEP0003.html) 24 | - 25 | - SciPy 2023 proceedings 26 | - Talk slides: 27 | - Tools update slides: 28 | - ZEP0001 Contributors section: 29 | 30 | **Meeting minutes:** 31 | 32 | - JM: JS are you using Zarr at work? 33 | - JS: We're convinced that it's a good idea to use Zarr! 😄 34 | - Growing rapidly - good thing 35 | - SV: Presentation for EuroSciPy 2023! 36 | - JS: Coming up 37 | - SV: You can also cite SciPy 2023 talks 38 | - JM: Will be using the EuroSciPy 2022 poster for an upcoming meeting 39 | - SV: Tweeting about the contributors for ZEP0001 and tagging everyone 40 | - JM: Thanks to everyone! 🙏🏻 41 | - JS: Removes the watermark from the V3 spec 42 | - Figured out the CSS selector 43 | - JM: Zeiss got back to JM 44 | - SciPy 2023 discussions 45 | - James Webb Space Telescope - how they can use Zarr 46 | - JS: Misc. link: 47 | - JS: Met with Francesc (Blosc) ? 48 | - JM: Yeah, we spent a lot of time and it was great! 49 | - How Blosc and Sharding can coexist - like a package 50 | - SV: Sharding can provide cloud enablement to Blosc2 - discussions with Francesc 51 | - JM: Recently filled out feedback for CZI EOSS - we can do joint grants as well 52 | - NR: Writing a Blosc2 codec for Zarr could help 53 | - JM: Dask chunking comes down to `.chunk()` property for the object - how about data API chunking specification around the chunks? - chunking could work across the whole PyData stack - and we can add sharding too - could help with what's the efficient access pattern for sharding chunk!? 54 | - Unified package for Zarr (Sharding), Blosc2 and Dask and other packages 55 | - NR: Interesting! 
Sharding has 2 access patterns 56 | - Chunk level for read and shard level for write 57 | - For Dask purposes you probably want the shard access pattern - because you're in an HPC environment 58 | - JS: Writing it as a spec and collaborating with Dask and Blosc team 59 | - NR: Agreed! 60 | - JS: Sharding also needs a lot of explanation - lots of education needed 61 | - NR: Limbo state right now - blog posts and videos can help a lot - maybe after 6 months 62 | - JM: Unifying names - block.dev? - and same documentation as well 63 | - SV: Can include HDF5 as well 64 | - JM: HDF5 could be beneficial if you're working on cluster/HPC and Zarr can help you bring that data down from the cloud to your machine 65 | - NR: Can apply to EOSS grant with same applications 66 | - JM: Lower chances of getting funded 67 | - JM: Zarr can solve world hunger! 😁 68 | - NR: Good momentum but need to deliver as well 69 | - JM: Zarr as a sister project of Napari!? 70 | - SV: Having conversations regarding fundraising for Zarr to keep the project funded 71 | - We can work on a joint grant or something similar 72 | - NR: Lightweight ZEP process for codecs? 73 | - JM: Light voting procedure? 74 | - NR: Could be! 75 | - JM: A new ZEP to update ZEP0000 and add a new type of ZEP in the types and loosen the restrictions 76 | - NR: No problem with voting but how do we set the ZEPs up for voting - anyone can do it via creating an issue but that can lead to mayhem if everyone starts doing it! 77 | - JM: We're following a chronological order but we can have a statement which can allow lightweight ZEPs to come in while big ones are on the way 78 | - Having separate ZEPs for codecs and extension with less voting burden 79 | - How do we schedule the ZEPs for voting? 
80 | - Something for 81 | - JM: There are improvements we can make to ZEP0000 and let's keep working on that 82 | -------------------------------------------------------------------------------- /meetings/2023/2023-08-24.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 24th August 4 | description: ZEPs Meeting Notes for 2023-08-24 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 20 8 | --- 9 | 10 | # 2023-08-24 11 | 12 | **Attending:** Jeremy Maitin-Shepard (JMS), Sanket Verma (SV), Josh Moore (JM) 13 | 14 | ## TL;DR: 15 | 16 | The meeting focused on ZEP0004 preliminary work for review and highlighted open pull requests addressing various aspects of the Zarr specifications. Discussions revolved around the proposal for a lightweight ZEP process specifically designed for adding codecs and data types, with considerations for fast-tracking certain additions, especially those crucial for specific domains. Additionally, updates on the Tensorstore implementation were provided, indicating the nearing completion of V3 and sharding implementations with ongoing bug fixes. 17 | 18 | **Updates:** 19 | 20 | - ZEP0004 preliminary work for review: 21 | - Open PRs: 22 | - 23 | - 24 | - 25 | - 26 | 27 | **Meeting minutes:** 28 | 29 | - JMS: Lightweight ZEP process for adding codecs and datatypes 30 | - JM: How do you think it should look like? 
31 | - JMS: Opening a PR and getting the votes from ZIC and ZSC could be it 32 | - SV: Minimalistic ZEP for adding codec and data types 33 | - JM: The voting process keeps going on for codecs and data types without hindrance from the bigger ZEPs 34 | - JMS: Can work on a lightweight ZEP template for codec and dtype 35 | - JM: Certain datatype additions may be possible blockers for some domains - having fast-track ZEPs would help that 36 | - JMS: Would like to work on dtypes for ML use-cases 37 | - SV: Tensorstore implementation 38 | - JMS: V3 and Sharding implementation almost complete - working on some bugs - will be finalising soon! 39 | -------------------------------------------------------------------------------- /meetings/2023/2023-09-07.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 7th September 4 | description: ZEPs Meeting Notes for 2023-09-07 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 21 8 | --- 9 | 10 | # 2023-09-07 11 | 12 | **Attending:** Sanket Verma (SV), Norman Rzepka (NR), Ward Fisher (WF), Jeremy Maitin-Shepard (JMS) 13 | 14 | ## TL;DR: 15 | 16 | The meeting introduced new ZEPs by Davis Bennett (ZEP0006) and Isaac Virshup (ZEP0007). Discussions included the possibility of hosting Zarr-Con with NASA F.15 grant support and updates on ZEP0002 progress, particularly in Tensorstore implementation. The addition of ZipStore as a ZEP was explored, with considerations for conventions, URL structures, and potential contributors from the microscopy and napari communities. Additionally, a discussion on handling non-Zarr stores by Zarr and the need for defined behaviors in case of malformed data was addressed. 
17 | 18 | **Updates:** 19 | 20 | - ZEP0006 by Davis Bennett: 21 | - ZEP0007 by Isaac Virshup: 22 | 23 | **Meeting Minutes:** 24 | 25 | - WF: NASA F.15 grant could help hosting Zarr-Con over at Unidata 26 | - SV: Update on new ZEPs by Davis and Isaac 27 | - NR: Overview of the ZEP0007 28 | - WF: Character encoding addressed? - Not implemented robustly across NetCDF 29 | - SV: Norman as co-author? 30 | - NR: No, just left some comments 31 | - JMS: Define a name for the codec - array to bytes - can be applied to raw data buffer 32 | - NR: Could model it as a data type - not clear how the translation from bytes would work in a codec 33 | - JMS: Encourage a spec PR first - make things straightforward 34 | - SV: ZEP document and spec PR - either can come first - depends which is the clear and straightforward way to introduce the changes 35 | - SV: ZEP0002 36 | - JMS: Extremely close on tensorstore 37 | - NR: Zarrita.js can be added to the sharding implementation in the issue review 38 | - NR: Adding ZipStore as a ZEP 39 | - JMS: Added read-only support for ZipStore to tensorstore 40 | - NR: Certain features that can be included in the ZEP - like allow different types of hierarchy 41 | - JMS: Various ways to use ZipStore in Zarr-Python - depends on different ways how you want to organise your data in the Zip 42 | - NR: Maybe more of a convention - Zip on S3: How do you access it? (URL gets funky) 43 | - JMS: `s3://bucket/path/to/zip.zip|zip:path/to/array/|zarr3` - pipe URL - convey what's happening - `:` downsides - they're valid in a URL 44 | - NR: Go down further and address things further in the Zarr like array 45 | - WF: We have Zip support in NCZarr - not a similar URL style 46 | - NR: Microscopy folks - napari folks - Saalfeld could be potential people who could work on the Zip ZEP 47 | - NR: Protocol for Google storage? 
GS or GCS 48 | - JMS: `gs://bucket/path` - GS 49 | - JMS: General Zip required sequential access 50 | - JMS: Standardizing some kind of URL would be a good thing 51 | - NR: Getting feedback from Ward, Stephan Saalfeld would be good 52 | - WF: HTTP post style syntax in NetCDF is supported 53 | - WF: What would happen if we try to read a non-Zarr store with Zarr? 54 | - JMS: Looking at the metadata file and then figuring it out? 55 | - WF: Some part of NetCDF uses HDF5 and we try to open it with Zarr and it crashed 56 | - WF: Curious as to what the failed `open()` should look like? Having a defined behaviour would be good 57 | - JMS: Launch missiles if the data is malformed! 😄🚀 58 | - WF: NetCDF has certain error codes when it can't read, instead of crashing the software - should be a part of the spec 59 | - JMS: Could be a good addition 60 | - WF: Just curious about the crashing! 61 | -------------------------------------------------------------------------------- /meetings/2023/2023-09-21.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 21st September 4 | description: ZEPs Meeting Notes for 2023-09-21 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 22 8 | --- 9 | 10 | # 2023-09-21 11 | 12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Ward Fisher (WF), Davis Bennett (DB), Jeremy Maitin-Shepard (JMS) 13 | 14 | ## TL;DR: 15 | 16 | The meeting featured updates on new ZEPs, including ZEP0006 (Zarr Object Model) by Davis Bennett and its implementation progress, ZEP0007 (String) by Isaac Virshup, and ZEP0008 (URL Syntax) by Jeremy Maitin-Shepard. Discussions included an overview of the newly submitted ZEPs, with considerations for generalizing URL syntax and exploring query syntax similarities with JSON. 
Additionally, discussions touched on the need for standardizing JSON schemas for Zarr, the compatibility of Zarr shards with HDF5, and plans to kick off the voting process for ZEP0003, particularly its usefulness for Tensorstore. 17 | 18 | **Updates:** 19 | 20 | - ZEP0006 by Davis Bennett: (Zarr Object Model) 21 | - Implementation: 22 | - ZEP0007 by Isaac Virshup: (String) 23 | - ZEP0008 by Jeremy Maitin-Shepard: (URL Syntax) 24 | 25 | **Meeting Minutes:** 26 | 27 | - Overview of the newly submitted ZEPs 28 | - JMS: ZEP8 could be generalised apart from the Zarr ecosystem - provides parameters at each specific level 29 | - DB: Considering query strings? 30 | - JMS: Clearly diverging from conventional syntax - 31 | - JMS: Issue with `#` syntax - interpretation will be different - a few downsides of using it - argument for using fragment identifier 32 | - JM: Descending down the attributes in N5 land? 33 | - 34 | - DB: Query syntax like JSON - the idea was shared among and used across in N5 land 35 | - DB: ZEP0006 (ZOM) discussions - [Talley Lambert](https://github.com/tlambert03) from Napari was looking for JSON schema for Zarr 36 | - JMS: JSON Schema for tensorstore 37 | - V2: 38 | - V3: 39 | - DB: Wise thing to move towards the standardisation of the schema 40 | - JMS: Consolidated metadata 41 | - JM: Engage positively with consolidated metadata and not break it 42 | - JMS: V3 consolidated metadata not in-line with ZOM would be a bad thing :) 43 | - DB: Could be used with HDF5 as well and can be future proof for the upcoming Zarr specifications 44 | - SV: Zarr shards as a valid HDF5 45 | - DB: Need to have a mechanism where both the formats can talk to each other - otherwise may lead to brittleness 46 | - JMS: The approach is hacky atm 47 | - Martin wants to kick off [ZEP0003](https://zarr.dev/zeps/draft/ZEP0003.html) voting ASAP 48 | - Prototype implementation: 49 | - Technical review needed 50 | - DB will have at it soon! 
51 | - JMS: Useful for Tensorstore 52 | -------------------------------------------------------------------------------- /meetings/2023/2023-10-05.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 5th October 4 | description: ZEPs Meeting Notes for 2023-10-05 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 23 8 | --- 9 | 10 | # 2023-10-05 11 | 12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Thomas Nicholas (TN), Jonathan Striebel (JS), Jeremy Maitin-Shepard (JMS) 13 | 14 | ## TL;DR: 15 | 16 | In this meeting, attendees shared their favorite books as a fun introduction. The discussions revolved around various GitHub issues and pull requests, including considerations for making a store less explicit, clarifying certain points, and addressing implementation details. Jeremy Maitin-Shepard highlighted the successful 100% V3 implementation in Neuroglancer and Tensorstore, while Thomas Nicholas expressed interest in performance improvements, particularly regarding variable chunking in Zarr-Python. Plans to finalize ZEP0003 for voting were also mentioned. 17 | 18 | **Meeting Minutes:** 19 | 20 | - Introductions with favourite book 21 | - SV: ASOIAF by GRR Martin 22 | - TN: The Dispossessed 23 | - JS: The Hitchhiker's Guide to the Galaxy 24 | - JM: Kingkiller Trilogy (Rothfuss) - can HIGHLY recommend. Beware: only 2 of the 3 books are written. 
25 | - Issues and PRs to look at: 26 | - 27 | - JS and JMS: Mostly an implementation detail 28 | - JMS: Make store less explicit 29 | - JS: Should not enforce the parameter; will send a PR after 2 weeks 30 | - 31 | - JMS: Will try to make it more clear 32 | - 33 | - JMS: Chris Barnes' implementation hasn't made the change yet 34 | - SV: Send an email for this to ZIC 35 | - JMS: 36 | - JS: Fortran array needs to be inverted 37 | - JMS: C and Fortran array are contiguous in different ways 38 | - JMS: Neuroglancer and Tensorstore has 100% V3 implementation 🎉 39 | - Working on some CI issues and will merge it 40 | - JM: https://github.com/ome/ngff/pull/206 41 | - TN: Cubed discussions 42 | - Anything which increases the performance would be useful - interested in Jack's work 43 | - How can we get variable chunking into Zarr-Python? 44 | - SV: Needs to finalise the ZEP0003 - will go into voting soon! 45 | -------------------------------------------------------------------------------- /meetings/2023/2023-10-19.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 19th October 4 | description: ZEPs Meeting Notes for 2023-10-19 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 24 8 | --- 9 | 10 | # 2023-10-19 11 | 12 | **Attending:** Sanket Verma (SV), Mark Kittisopikul (MK), Ward Fisher (WF), Davis Bennett (DB), Norman Rzepka (NR), Isaac Virshup (IV) 13 | 14 | ## TL;DR: 15 | 16 | In this meeting, the attendees discussed updates on ZEP0002, with a focus on potential changes related to handling checksums and compatibility with Kerchunk. ZEP0006 (Zarr Object Model) discussions included the suggestion of obtaining a JSON schema for ZOM and the idea of a separate ZEP for serializing children metadata. Additionally, there were considerations about namespacing codecs, discussions on ZEP0003 progress, and the potential adoption of HDF filters in Zarr. 
The meeting also touched on accommodating changes to ZEP0001 and the upcoming ZEP0002 voting deadline on October 31. 17 | 18 | **Updates:** 19 | 20 | - ZEP0002 voting closes on 31st October - 21 | 22 | **Meeting Minutes:** 23 | 24 | - MK: Changes to ZEP0001 are still coming in - how do we handle them? 25 | - SV: ZEP1 was provisionally accepted but not at the final stage 26 | - NR: These changes are minor and would need voting from ZIC 27 | - MK: Mentioning a potential grace period in ZEP0000 would be helpful 28 | - NR: Needs to be written out! 29 | - MK: Zarr shards as HDF5 file 30 | - ZEP0002 should proceed as it is atm 31 | - Having a null codec to ignore the checksum would be helpful - can work on this 32 | - Recent numcodecs release helps a lot - getting Jenkins lookup checksum 33 | - Relation between ZEP0002 and Kerchunk? 34 | - NR: Don't know if there is 35 | - IV: Zarrita can read the HDF5 file using Kerchunk 36 | - IV: Multiple arrays in a single Zarr shard 37 | - NR: I don't think it'll be possible 38 | - DB: Why are the chunks in the directory called C? 39 | - NR: Helps when scanning down the groups and arrays 40 | - DB: Any questions for ZEP0006? 41 | - NR: Getting a JSON schema out of the ZEP0006 would be helpful 42 | - DB: This would also help us to get a container level validation 43 | - IV: How does it differ from consolidated metadata? 44 | - DB: Consolidated metadata 45 | - NR: Serializing children metadata would be helpful - could be a different ZEP 46 | - DB: Flattening array 47 | - DB: Pydantic-Zarr would have the reference implementation for ZOM 48 | - ZEP0002 discussions at Unidata 49 | - Mostly going for 'Yes' - but looking for resources who can handle and complete it 50 | - MK: NetCDF has adopted HDF filters? Making them a part of Zarr filters? 51 | - WF: Would like to see spec support - supports interoperability - but hasn't considered it 52 | - MK: Between N5 and Zarr we encounter difficulties for LZ4 codec 53 | - DB: Sounds like an N5 problem to me! 
;) 54 | - IV: List of HDF5 filters? MK: Yes, there's a list 55 | - MK: But I think NCZarr support a select few only 56 | - WF: Yes! 57 | - MK: List of HDF5 Registered Filters: - GitHub library for plugins: 58 | - DB: Getting away from storing F9 transformations in metadata 59 | - IV: Thoughts on namespacing codecs? 60 | - NR: Would be helpful 61 | - WF: The overhead of maintenance and administration is daunting - experience from Unidata - How would I guarantee that information would be there in 10 years? 62 | - IV: Pointing at URI where the codec is defined - Having this in Zenodo would be helpful 63 | - IV: ZEP0003 progress 64 | - SV: Waiting for technical review - would help if Martin/IV could make it the ZEP meetings to raise the discussion 65 | - IV: Sure 66 | -------------------------------------------------------------------------------- /meetings/2023/2023-11-02.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 2nd November 4 | description: ZEPs Meeting Notes for 2023-11-02 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 25 8 | --- 9 | 10 | # 2023-11-02 11 | 12 | **Attending:** Josh Moore (JM), Sanket Verma (SV), Jeremy Maitin-Shepard (JMS), Jonathan Striebel (JS) 13 | 14 | ## TL;DR: 15 | 16 | In this meeting, the attendees celebrated the acceptance of ZEP0002. The discussion revolved around potential changes to ZEP0, considering a phased voting approach and addressing implementation challenges. Additionally, topics included resolving conflicts in the Endian codecs to bytes pull request, updates to ZEP2 regarding index location, versioning, and tracking intermediate buffer sizes, and the addition of v1 and v2 specs to zarr-specs. 17 | 18 | **Updates:** 19 | 20 | - [ZEP0002](https://zarr.dev/zeps/accepted/ZEP0002.html) finally accepted! 
🎉 21 | 22 | **Meeting Minutes:** 23 | 24 | - Changes to ZEP0 25 | - 26 | - JMS: Don't think there's gonna be any changes to ZEP1 now 27 | - JM: Could be 2 voting - but how do you get everyone to implement ZEP without voting? 28 | - SV: Yeah! Good point. 29 | - JMS: Don't need to go overhead for small changes 30 | - JM: `index_location` is backwards compatible and maybe the best case scenario 31 | - JMS: But endian codec is not backward compatible 32 | - SV: Very natural to see this scenario coming up 33 | - JM: Architect building a building - they go through submittals - 20% increment - show the adoption percentage 34 | - SV: How would you define percentage? 35 | - JM: Voting could be in phases - reading phase - implementation change - grace period 36 | - *Jonathan joins* 37 | - JS: Voting encourages people to read the spec 38 | - JM: Setting a common calendar for the ZIC and ZSC - can help the author 39 | - JS: It was good to the response when we set the deadline - also the examples of implementation currently is good 40 | - JMS: Would be great to have a table of what part of spec they're current implementing would be great 41 | - JMS: Having a compatibility table would be great - e.g. Neuroglancer doesn't support boolean type 42 | - JMS: Once a ZEP is accepted the spec matters 43 | - JM: Looking at RFC for NGFF these days 44 | - JMS: Random reviews vs the expert group reviews 45 | - SV: If you're going to implement the spec and how you're going to implement the spec - Form a voting procedure around that 46 | - JS: Having a process definitely helped me to finsh the sharding - find it good as contributor! 47 | - JM: Having rebuttal would change the tone a lil' bit 48 | - JS: Telling to vote before implementation is fine - as you can find things spec 49 | - JM: Having it defined in ZEP0 would be great! 
50 | - JS: Doesn't like the grace period - leave the door open a little 51 | - JS: Having an implementation phase would be good 52 | - SV: Reading notification during the voting phase - could be helpful 53 | - JM: Keeping a table would be helpful - 5 states 54 | - JMS: Three phases - 55 | - reading phase and express opinions 56 | - implementation phase and raise issues 57 | - solve issues raised in the last phase 58 | - JM: It's clear we're in agreement of phases/periods - intent 59 | - JM: People want general confidence from the audience that they're moving forward 60 | - JMS: 61 | - JMS: Getting formal feedback from the ZIC before the vote 62 | - SV: Let ZIC know when you're going to put up the ZEP for voting 63 | - JM: Worst case scenarios - no-one read the spec and someone vetoes the ZEP in the initial stage 64 | - JM: Roll call helps 65 | - JS: May not need a second vote - those are implementation details 66 | - JM: Add a different state - a pre-implementation checkpoint 67 | - Endian codecs to bytes 68 | - 69 | - JMS: Need to resolve git conflicts 70 | - Changes to ZEP2: 71 | - (add index_location) 72 | - (versioning) 73 | - (tracking intermediate buffer sizes) 74 | - Adding v1 and v2 spec to zarr-specs 75 | - 76 | -------------------------------------------------------------------------------- /meetings/2023/2023-11-16.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 16th November 4 | description: ZEPs Meeting Notes for 2023-11-16 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 26 8 | --- 9 | 10 | # 2023-11-16 11 | 12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Ward Fisher (WF), Norman Rzepka (NR), Jeremy Maitin-Shepard (JMS) 13 | 14 | ## TL;DR: 15 | 16 | In this meeting, updates were provided on various pull requests, including PR#281 to update ZEP2 status. Discussions revolved around ZEP8, with Jeremy tasked to update the corresponding pull request. 
Davis provided an update on ZEP6, and topics included cleaning the zarr-specs documentation, considerations about version numbers in codecs and stores, and discussions on refining the ZEP process, including the duration of voting periods and the championing process for new ZEPs. John Bogovic's potential involvement in future meetings for ZEP8 discussions was mentioned. 17 | 18 | **Updates:** 19 | 20 | - Sanket recently added [PR#281](https://github.com/zarr-developers/zarr-specs/pull/281) to update ZEP2 status 21 | - Discussions around ZEP8: 22 | - Jeremy to update the PR 23 | - Davis update ZEP6 24 | - Question to Davis: Status of completion? 25 | 26 | **Meeting Minutes:** 27 | 28 | - Cleaning of zarr-specs.readthedocs.io 29 | - Remove 'Under construction' 30 | - Remove 'Array storage transformers' 31 | - Maybe rename data types to extension data types? 32 | - JMS: [bfloat16 dtype](https://github.com/zarr-developers/zarr-specs/pull/257) should be in core spec under data type table - required and optional table separately 33 | - NR: May need some explanation on the data type 34 | - NR: Remove version number from the codecs and stores 35 | - JMS: Could track down the implementations adopting the different versions of codecs/stores 36 | - NR: No version number in metadata, so not useful 37 | - JMS: But you'd not be allowed to change the metadata 38 | - JM: Helps to write down this; for example a ZEP 39 | - JMS: Implementations might not implement all the versions 40 | - JMS: How about STAC? 
41 | - SV: STAC uses incremental versions to evolve the specification 42 | - NR: Zarr V2 & V3 compatibility issues may arise in the future 43 | - JM: Flip side if we're dealing with a long list of codecs - versions may help here 44 | - JM: For example: Bumping the Blosc2 codec 45 | - JMS: Pretty rare to change the versions 46 | - JMS: Having a shorter voting period may help 47 | - JM: Having a step voting period may be troublesome - experience from NGFF and OME-Zarr - strong word for the first phase 48 | - NR: Silence in the second phase? 49 | - JM: It'll be good! 50 | - NR: How do we make the vote earlier? 51 | - JM: Once month for roll call - done reading, start implementing. finish implementing, also no vetoes 52 | - JM: Graph for the progress 53 | - JMS: Make revisions after the voting - working for now but not great 54 | - JMS: A word for grace period: `Final revision deadline` 55 | - NR: Finding 1-2 champions for starting a new ZEP 56 | - JM: Having a mailing list to ask for champions 57 | - JM: Close the ZEP proposal if it's not active for some time 58 | - JMS: C++ standard is much more complicated than Zarr - only some people capable of changing the wording 59 | - SV: If someone is not able to find a champion should they not proceed with the ZEP? - Not in favour of the champion process to be the only condition to move forward 60 | - JM: List of endorsers and endorsement for the ZEP process 61 | - FYI: John Bogovic may join future ZEP meeting for ZEP8 discussions 62 | - TABLED 63 | - Thoughts on [PR#276](https://github.com/zarr-developers/zarr-specs/pull/276)? 
64 | -------------------------------------------------------------------------------- /meetings/2023/2023-11-30.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 30th November 4 | description: ZEPs Meeting Notes for 2023-11-30 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 27 8 | --- 9 | 10 | # 2023-11-30 11 | 12 | **Attending:** Sanket Verma (SV), Ward Fisher (WF), Jeremy Maitin-Shepard (JMS) 13 | 14 | ## TL;DR: 15 | 16 | In this meeting, discussions included Ward Fisher's virtual talk for AGU, focusing on Zarr as an archival format and its interoperability with NetCDF. There was also consideration of Zarr implementation in FORTRAN and the challenges in upgrading NetCDF FORTRAN modules. Additionally, ongoing work on zarr-developers/zeps/#48 was acknowledged, and the decision on several pull requests, including zarr-developers/zarr-specs/#276, zarr-developers/zarr-specs/#205, and zarr-developers/zarr-specs/#271, was tabled for future consideration. 17 | 18 | **Meeting Minutes:** 19 | 20 | - WF: Recording a virtual talk for AGU next week 21 | - - CMIP6 trying to get rid of NetCDF? 22 | - WF: Zarr as archival? - Trying to be there - much younger format - but NetCDF is a robust archival format 23 | - SV: The interoperability b/w Zarr and NetCDF is a good thing in here 24 | - WF: 25 | - WF: Zarr implementation in FORTRAN? 26 | - JMS: Are people writing FORTRAN actively? 27 | - WF: Yes - The latest book on FORTRAN came out last year - NCZarr is supported via FORTRAN 28 | - JMS: Zarr V3 FORTRAN would be good 29 | - WF: NetCDF FORTRAN is used by supercomputers across US 30 | - SV: Why end NetCDF FORTRAN? 31 | - WF: Selfish reasons - takes a lot of time fixing the modules 32 | - SV: Why do supercomputers still use FORTRAN? 33 | - WF: Supercomputers' unparalleled performance using FORTRAN is just great! 34 | - WF: E.g. 
Mathworks upgrading to newer NetCDF version is a long process 35 | - WF: 36 | - Good to go with? 37 | - [zarr-developers/zeps/#48](https://github.com/zarr-developers/zeps/pull/48) 38 | - JMS: Still need to work on this - also work on the implementation of the ZEP 39 | - TABLED 40 | - Let's go ahead with [zarr-developers/zarr-specs/#276](https://github.com/zarr-developers/zarr-specs/pull/276)? 41 | - Thoughts on [zarr-developers/zarr-specs/#205](https://github.com/zarr-developers/zarr-specs/pull/205) 42 | - Good to go with? 43 | - [zarr-developers/zarr-specs/#271](https://github.com/zarr-developers/zarr-specs/pull/271) 44 | -------------------------------------------------------------------------------- /meetings/2023/2023-12-14.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 14th December 4 | description: ZEPs Meeting Notes for 2023-12-14 5 | grand_parent: ZEP meetings 6 | parent: 2023 meetings 7 | nav_order: 28 8 | --- 9 | 10 | # 2023-12-14 11 | 12 | **Attending:** Sanket Verma (SV) and Jeremy Maitin-Shepard (JMS) 13 | 14 | ## TL;DR: 15 | 16 | In this meeting, discussions revolved around the Zarr NYC Sprint and the possibility of hosting ZarrCon at Google. The decision to proceed with `zarr-developers/zarr-specs/#276` was made, with changes committed and awaiting review before merging. Additionally, thoughts were shared on `zarr-developers/zarr-specs/#205`, and decisions were made to merge zarr-developers/zarr-specs#271 while considering the closure of `zarr-developers/zarr-specs#254`. 17 | 18 | **Meeting Minutes:** 19 | 20 | - Discussion about Zarr NYC Sprint 21 | - ZarrCon @ Google 22 | - JMS: Can be hosted at Google 23 | - JMS: Can provide rooms for conferences but lodging could be a challenge 24 | - JMS: Stephan Hoyer could be interested 25 | - Let's go ahead with [zarr-developers/zarr-specs/#276](https://github.com/zarr-developers/zarr-specs/pull/276)? 
- Changes committed; will wait for review and then merge 26 | - Thoughts on [zarr-developers/zarr-specs/#205](https://github.com/zarr-developers/zarr-specs/pull/205) 27 | - Good to go with? 28 | - [zarr-developers/zarr-specs/#271](https://github.com/zarr-developers/zarr-specs/pull/271) - MERGED 29 | - [zarr-developers/zarr-specs#280](https://github.com/zarr-developers/zarr-specs/pull/280) 30 | - And close? 31 | - [zarr-developers/zarr-specs#254](https://github.com/zarr-developers/zarr-specs/issues/254) 32 | -------------------------------------------------------------------------------- /meetings/2023/meeting_notes_2023.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 2023 meetings 4 | description: List of ZEP meeting notes for the year 2023 5 | nav_order: 2 6 | parent: ZEP meetings 7 | has_children: true 8 | permalink: /meetings/2023/ 9 | --- 10 | 11 | # ZEP Meeting Notes for 2023 12 | 13 | Shows the list of meeting notes for the year 2023. 14 | -------------------------------------------------------------------------------- /meetings/2024/2024-01-11.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 11th January 4 | description: ZEPs Meeting Notes for 2024-01-11 5 | grand_parent: ZEP meetings 6 | parent: 2024 meetings 7 | nav_order: 1 8 | --- 9 | 10 | # 2024-01-11 11 | 12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Jeremy Maitin-Shepard (JMS), Ward Fisher (WF) 13 | 14 | ## TL;DR: 15 | 16 | Meeting covers progress updates, resolution discussions for various Zarr specs pull requests, and status check on ZEPs with considerations for improving the review process. 17 | 18 | **Updates:** 19 | 20 | - Happy New Year! 🥂 21 | 22 | **Meeting Minutes:** 23 | 24 | - Let's go ahead with [zarr-developers/zarr-specs/#276](https://github.com/zarr-developers/zarr-specs/pull/276)? 
25 | - This would help resolve [zarr-developers/zarr-python#1582](https://github.com/zarr-developers/zarr-python/pull/1582) 26 | - Good to go with? 27 | - [zarr-developers/zarr-specs#280](https://github.com/zarr-developers/zarr-specs/pull/280) 28 | - [zarr-developers/zarr-specs#263](https://github.com/zarr-developers/zarr-specs/pull/263) 29 | - And close? 30 | - [zarr-developers/zarr-specs#254](https://github.com/zarr-developers/zarr-specs/issues/254) 31 | - Status of ZEP6, 7 & 8 32 | - JMS: Will look at it sometime soon 33 | - JMS: Also looking forward to the simplified version of ZEP0 34 | - JM: In NGFF space we're going away from single PR and merge thing as it becomes difficult to manage stuff 35 | - JMS: GitHub is not well suited for resolving comments and stuff 36 | - JM: The webpage becomes the official record of what was discussed and approved 37 | - SV: Process for Tensorstore and Neuroglancer 38 | - JMS: Open a issue and then PR - mostly me and my colleague working on stuff 39 | - JMS: We have internal repository for the changes as well 40 | -------------------------------------------------------------------------------- /meetings/2024/2024-01-25.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 25th January 4 | description: ZEPs Meeting Notes for 2024-01-25 5 | grand_parent: ZEP meetings 6 | parent: 2024 meetings 7 | nav_order: 2 8 | --- 9 | 10 | # 2024-01-25 11 | 12 | **Attending:** Sanket Verma (SV) and Jeremy Maitin-Shepard (JMS) 13 | 14 | ## TL;DR: 15 | 16 | Discussion centers on revising ZEP0 with proposals to streamline the process, including incorporating reading and implementation phases, and considering methods to increase user engagement and ensure smooth decision-making, with attention given to potential veto options and quorum requirements. 
17 | 18 | **Meeting Minutes:** 19 | 20 | - ZEP0 Revision 21 | - 22 | - JMS: Both Ryan's and SV's proposals can work together 23 | - SV: We can have an issue for reading/implementation comments and PR for the actual change 24 | - JMS: The idea of reading and implementation phases is an improvement on the existing proposal 25 | - JMS: Few people care about the Zarr specification, but streamlining the process is equally important 26 | - JMS: Create new issues if the discussion gets long and link it to the PR - and you can add it to the description 27 | - SV: There's also a question of how we can increase the activity of the users in Zarr specification work 28 | - JMS: The time and interests of the various council members depend on various factors 29 | - JMS: Let people veto at any time, even at the voting phase? - you have processes but in the end you rely on people! 30 | - JMS: We hopefully don't get a veto at the end of the stage 31 | - JMS: You should have the veto in the voting phase too! - There should be an option to veto if the problem comes up very late 32 | - JMS: Thinks having a quorum can help 33 | - Good to go with? 
34 | - [zarr-developers/zarr-specs#280](https://github.com/zarr-developers/zarr-specs/pull/280) 35 | - SV commented on the PR to get Norman's attention 36 | -------------------------------------------------------------------------------- /meetings/2024/2024-02-08.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 8th February 4 | description: ZEPs Meeting Notes for 2024-02-08 5 | grand_parent: ZEP meetings 6 | parent: 2024 meetings 7 | nav_order: 3 8 | --- 9 | 10 | # 2024-02-08 11 | 12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Jeremy Maitin-Shepard (JMS) 13 | 14 | ## TL;DR: 15 | 16 | Discussion revolves around refining storage transformer interfaces within ZEP1, exploring options for unified JSON representations, and considering the integration of Parquet with Zarr, alongside ZEP0 discussions focusing on streamlining processes while ensuring compatibility with existing bylaws. 17 | 18 | **Meeting Minutes:** 19 | 20 | - 21 | - JM: The state we left ZEP1 and storage transformer, where does this fit in? 22 | - JMS: Wrap the key-value interface in the existing implementation 23 | - JMS: Kerchunk approach has 1 `.JSON` file and the proposed approach has 2 `.JSON` files 24 | - JMS: Specify any array in-line? 25 | - JM: May look like specifying kerchunk in Zarr which we may or may not want to do 26 | - JMS: Kerchunk approach has keys and values - not exactly readable 27 | - JMS: Various flavours of `.JSON`s can we somehow unify them? - Does it help to have a representation for inline arrays? 28 | - JM: Will comment on the Joe's issue 29 | - JMS: Would be good to get Martin's POV 30 | - JMS: Kerchunk parquet format is worth looking at 31 | - JM: Parquet folks are looking to combine parquet and Zarr - could look at the tabular data as 2D array 32 | - JMS: Do you need to download the whole parquet to access it? 
33 | - JM: I think the offset works in parquet 34 | - JM: 35 | - JMS: - created annotations in Tensorstore - spatial query has multi-index grid - sorta same like a sparse-array 36 | - JMS: general missing feature of a cloud database (Josh: cf. work on a graph/zarr version in Spain) 37 | - JM: Will try to get together SpatialData and JMS for discussions to prevent duplicative efforts 38 | - JM: Having URLs as indices and if not generate them on the fly and if you have write access then write to it 39 | - JMS: Annotations don't end up being too large 40 | - JM: DuckDB is worth looking at - 41 | - JMS: Cloud databases need regular maintenance 42 | - JM: 43 | - JM: Building index on the cloud or locally? 44 | - ZEP0 discussions - 45 | - JM: Let's open a PR and go ahead!? 46 | - JMS: Yes! 47 | - JM: In favour of having a lightweight process, but if we reach a point where we have contention then we should go back to the bylaws 48 | - JMS: If the future ZEPs overlap then there would be a problem 49 | - JM: Footnote is useful for future records 50 | -------------------------------------------------------------------------------- /meetings/2024/2024-02-22.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 22nd February 4 | description: ZEPs Meeting Notes for 2024-02-22 5 | grand_parent: ZEP meetings 6 | parent: 2024 meetings 7 | nav_order: 4 8 | --- 9 | 10 | # 2024-02-22 11 | 12 | **Attending:** Sanket Verma (SV), Ward Fisher (WF), Josh Moore (JM), Martin Durant (MD), Tom Nicholas (TN) 13 | 14 | ## TL;DR: 15 | 16 | The meeting covered the use of LLMs in training, feedback on the Zarr-Specs website redesign, progress on the V3 refactor, and discussions on integrating kerchunk into Zarr, focusing on chunk manifest standardization and virtualized array concatenation. 
17 | 18 | **Updates:** 19 | 20 | - HTTP Extension meeting: 21 | 22 | **Meeting Notes:** 23 | 24 | - LLMs and how WF is using them in trainings 25 | - Feedback for new design for Zarr-Specs website (combines ZEP and Zarr-Specs together) 26 | - Link: 27 | - MD: How's V3 refactor work going on these days? 28 | - JM: Quite good progress taking place these days 29 | - SV: V3 PRs can be found here - 30 | - MD: 31 | - TN: Been discussing → 32 | - interested in integrating kerchunk into zarr, especially two ZEPs 33 | - (1) chunk manifest (Joe) - standardizing what chunk json files do 34 | - (2) concatenation - 35 | - 1. manifest: opinion that it's an incredible idea that is very popular 36 | - fsspec relationship makes things complicated 37 | - move to the zarr spec for other implementations? 38 | - goal is readable in any language 39 | - difficult position 40 | - three things to think about 41 | - read byte ranges 42 | - write JSON 43 | - combine module 44 | - roadmap: 45 | - standardize json for the chunks. manifest file? 46 | - JM: storage in zarr array itself 47 | - JM: log file anytime you read a full file into memory 48 | - Josh: virtual zarr (access pattern) 49 | - 2. concatenation 50 | - multi-zarr-to-zarr leads to a loop 51 | - more sense to think of concat of virtualized arrays objects 52 | - see kerchunk array notebook 53 | - read in byte ranges with kerchunk. array class which only stores byte-offset arrays in memory 54 | - can be done in xarray. concat-classes can be put into xarray and can use higher-order API 55 | - JM: store that xarray as a zarr :smile: (but need additional metadata for realizing the array) 56 | - TN: part of notebook that isn't done. exactly. 57 | - common case in geo. multiple NC files, concat those array. 58 | - possibly compression options change over time. 59 | - prevents it from being one zarr array 60 | - JM: or just always serialize to the chunk manifest 61 | - JM: i.e. where do we stop? (when does Zarr become Turing Complete?) 
62 | - TN: thought at concat (clear use case). but jeremy thought indexing (also clear use case) 63 | - JM: starting to sound like transforms () 64 | - WF: periodically get requests for operations on the data 65 | - no one has come close to making the argument for adding that into the storage 66 | - so many math libraries that would do it better 67 | - TN: no computation since you don't need the values. can do some subset of concat & indexing without values. 68 | - TN: have now become a zarr producer :tada: 69 | - JM: cross-language motivation 70 | - SV: pyramiding ZEP discussions 71 | -------------------------------------------------------------------------------- /meetings/2024/2024-03-07.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 7th March 4 | description: ZEPs Meeting Notes for 2024-03-07 5 | grand_parent: ZEP meetings 6 | parent: 2024 meetings 7 | nav_order: 5 8 | --- 9 | 10 | # 2024-03-07 11 | 12 | **Attending:** Sanket Verma (SV), Ward Fisher (WF), Davis Bennett (DB), Josh Moore (JM), Thomas Nicholas (TN), Jeremy Maitin-Shepard (JMS) 13 | 14 | ## TL;DR: 15 | 16 | The meeting discussed enhancing Zarr store browsing, pushing Kerchunk functionality into Zarr, and the potential for a chunk manifest as a ZEP. Additionally, the group explored ideas for revising ZEP0, creating a Zarr specification IETF standard, and shared updates on upcoming Zarr HTTP Extension and SEA conference events. 
17 | 18 | **Updates:** 19 | 20 | - Zarr HTTP Extension Meeting next week 21 | - Check here: 22 | - TN: 23 | - Had a conversation with folks over at Development Seed, NASA, Earthmover 24 | 25 | **Meeting Minutes:** 26 | 27 | - TN: Jed wants to have nice ways to browse the Zarr stores - they have nice ways to browse `.tiff` files already 28 | - Wants to propose an extension to add more information in the metadata 29 | - The end result would look more like a Xarray HTML wrapper 30 | - TN: 31 | - Had a conversation with folks over at Development Seed, NASA, Earthmover 32 | - DB: Pushing Kerchunk functionality into Zarr stores 33 | - DB: Whether the feature could be file format agnostic? 34 | - TN: Argues that it should be a ZEP - and can be read every Zarr implementation 35 | - JM: Having same thing implemented in FSSPEC 36 | - DB: Would ZEP 37 | - WF: HDF5 group may be open to a conversation 38 | - SV: might have some useful information 39 | - TN: _recaps the conversation for JMS_ 40 | - TN: Should concatenation be a part of the current ZEP? 41 | - DB: Any reason you don't want to concatenate HDF5 and other file formats? 
42 | - TN: Chunk manifest would point inside the arrays - chunk manifest could let you create a Zarr store over other formats as well 43 | - DB: This would make Zarr as an API/access pattern 44 | - TN: Can be created and tested fairly separate to Zarr - personally think chunk manifest is neat feature - implementation can support/not support it 45 | - DB: Array mutation can break the concatenation - having guidelines for archival arrays would help 46 | - TN: Currently we're thinking about read-only case 47 | - TN: Virtualisation in Kerchunk is a spotlight feature 48 | - JMS: Manifest is a good idea and keeping it separate would be a minor difference - needs to align with Kerchunk 49 | - JM: report/ZEP idea (time permitting) 50 | - 51 | - JM: Putting ro-create inside Zarr - or making Zarr specification a IETF standard 52 | - JM: Would probably go ahead and write a convention in NGFF space 53 | - 54 | - JM: Have a mechanism for going up/down the hierarchy - useful for the HTTP extension discussions 55 | - Revising ZEP0 56 | - - comments/feedback welcome 57 | - DB: :+1: 58 | - DB: Would be easy to have a single PR for my ZEP 59 | - JMS: Putting narrative document in PR description 60 | - JM: Weird for commenting on the PR description and for the public visibility 61 | - JMS: Rationale can be put down as a footnote 62 | - JMS: Having numeric numbering is something Python follows 63 | - JMS: The actual specification change can also serve as a ZEP narrative 64 | - SV: We can pick out certain sections out of the ZEP narrative document 65 | - JMS: Having a PR template similar to ZEP's narrative could also help us 66 | - WF: 67 | - In-person and virtual registrations are available 68 | -------------------------------------------------------------------------------- /meetings/2024/2024-03-21.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 21st March 4 | description: ZEPs Meeting Notes for 2024-03-21 5 
| grand_parent: ZEP meetings 6 | parent: 2024 meetings 7 | nav_order: 6 8 | --- 9 | 10 | # 2024-03-21 11 | 12 | **Attending:** Sanket Verma (SV), Thomas Nicholas (TN), Ward Fisher (WF) 13 | 14 | ## TL;DR: 15 | 16 | **Updates:** 17 | 18 | - Join ZulipChat: 19 | - HTTP Extension meeting took place on 3/14 20 | - Trying to figure out the best way forward, i.e. a ZEP or not 21 | - Gauging interest and use cases from others in the community 22 | 23 | **Meeting Minutes:** 24 | 25 | - HTTP Extension 26 | - WF: Can see the shape of it, and I think it would be useful 27 | - SV: Existing thread: 28 | - TN: Tom's company may have a use case for the HTTP work 29 | - Showing [VirtualiZarr](https://github.com/TomNicholas/VirtualiZarr) (related to the "chunk manifest" ZEP) 30 | - TN: Been working on the packages for the last 2 weeks - could potentially replace Kerchunk 31 | - TN: _code walkthrough via screen sharing_ 32 | - TN: Storing the virtual Zarr manifests, not the actual array values 33 | - TN: Could move `class ManifestArray` to Zarr-Python - arguments in favour and against it 34 | - TN: Could see donating VirtualiZarr to zarr-developers 35 | - SV: **Action items** 36 | - TN to create a topic for VirtualiZarr to gather feedback/comments 37 | - SV to try VirtualiZarr 38 | - TN and SV to work on ZEP Extension proposal for virtual Zarr manifest and formally present it for broader feedback 39 | - TABLED 40 | - Revising ZEP0 41 | - 42 | -------------------------------------------------------------------------------- /meetings/2024/2024-04-04.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 4th April 4 | description: ZEPs Meeting Notes for 2024-04-04 5 | grand_parent: ZEP meetings 6 | parent: 2024 meetings 7 | nav_order: 7 8 | --- 9 | 10 | # 2024-04-04 11 | 12 | **Attending:** Sanket Verma (SV), Josh Moore (JM), Ward Fisher (WF) 13 | 14 | ## TL;DR: 15 | 16 | **Updates:** 17 | 18 | - CZI EOSS6 Application 
not funded 19 | 20 | **Meeting Minutes:** 21 | 22 | - NASA Grant (WF) 23 | - 24 | - Townhall meeting slides: 25 | - Looking towards sustaining the already established open source software 26 | - NetCDF is looking for collaboration for their application 27 | - JM: Collaborators in US could be NF, OpenCollective, NVIDIA, Columbia etc. 28 | - JM: Will reach out to NF for their NASA grants' experience 29 | -------------------------------------------------------------------------------- /meetings/2024/2024-04-18.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 18th April 4 | description: ZEPs Meeting Notes for 2024-04-18 5 | grand_parent: ZEP meetings 6 | parent: 2024 meetings 7 | nav_order: 8 8 | --- 9 | 10 | # 2024-04-18 11 | 12 | **Attending:** Josh Moore (JM), Vincent Immler (VI), Sanket Verma (SV), Ward Fisher (WF), Altay Sansal (AS), Jeremy Maitin-Shepard (JMS) 13 | 14 | ## TL;DR: 15 | 16 | The meeting covered the proposal to remove implicit groups in Zarr, progress on ZEP4 and ZEP3, and updates on V3 implementation. Additionally, discussions included async read optimizations for Zarr and the impact on performance, especially concerning large datasets and parallel data ingestion. 17 | 18 | **Updates:** 19 | 20 | - Davis wants to remove implicit groups: 21 | - Activity going on at ZEP4 Review PR 22 | 23 | **Meeting Minutes:** 24 | 25 | - Introductions w/ last gift you got 26 | - Sanket - cologne and clothes 27 | - Vincent - wooden board forged with family crescent 28 | - Ward - camping tent 29 | - Josh - pecan nuts 30 | - Altay - lead data scientist - lego 31 | - Removing Implicit groups 32 | - JM: Discussed at community meeting - needs to go back to root node to figure out the group 33 | - JM: Tensorstore doesn't use Zarr groups at all 34 | - WF: Supposition from my side 35 | - WF: Dennis completed the V3 implementation! 36 | - JM: Are we closer to parity in V3 work - a question for Dennis! 
37 | - VI: How does implicit groups affect performance? 38 | - JM: No, implicit groups means performance improvement 39 | - VI: Working on a new software implementation for students 40 | - JMS: No experience in working with groups 41 | - JM: Lot of callbacks 42 | - JMS: You'd definitely want to remove the looking upward 43 | - AL: Couldn't see a use-case for parallel creation of groups 44 | - JMS: You're ingesting lot of data in S3 and they read group metadata and have implicit groups 45 | - AL: `.zattrs` would have race condition? 46 | - JMS: Kind of a niche use-case 47 | - AL: Are Multi-processing locks concern metadata? 48 | - JMS: Multiple machine can leverage this! 49 | - AL: Removing would be a good idea! 50 | - AL: ZEP4 and ZEP3 progress 51 | - SV: AL, are you using V2 or V3? 52 | - AL: Using V2 and would love to move to V3 - have 20-30 PB data 53 | - AL: Want to work on `dimension_names` - what would be the best time to do it? 54 | - SV: After V3 release 55 | - VI: _explains GSoC application_ 56 | - AS: Hacked Zarr to submit reads in a async manner to the machine to circumvent the problem 57 | - AS: Zarr V3 is going to be fully async so, it helps alleviates the problem 58 | - VI: Would be good to have a way to improve the read speeds for Zarr 59 | -------------------------------------------------------------------------------- /meetings/2024/2024-05-02.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 2nd May 4 | description: ZEPs Meeting Notes for 2024-05-02 5 | grand_parent: ZEP meetings 6 | parent: 2024 meetings 7 | nav_order: 9 8 | --- 9 | 10 | # 2024-05-02 11 | 12 | **Attending:** Josh Moore (JM), Sanket Verma (SV), Ward Fisher (WF), Jeremy Maitin-Shepard (JMS), Thomas Nicholas (TN) 13 | 14 | ## TL;DR: 15 | 16 | The meeting discussed the upcoming Zarr V3 release candidate, status and integration of the chunk manifest ZEP, and potential revisions for combining ZEPs. 
They also covered progress on ZEP0 and plans to move ZEP2 to "Active" status, while tabling the discussion on removing implicit groups. 17 | 18 | **Updates:** 19 | 20 | **Meeting Minutes:** 21 | 22 | - WF: Dennis has a PR coming up to revise the Zarr V3 - 6 months long work - release candidate coming soon! 23 | - TN: Status of unfinished ZEP w.r.t. chunk manifest? 24 | - E.g. Have a sharded chunk manifest? 25 | - JMS: Chunk manifest could refer to entire shard - use case might not be clear 26 | - JMS: For viz tools you would not load entire shard at once 27 | - TN: Changing chunk manifest ZEP or accommodate chunk manifest in sharding codec? 28 | - JMS: No real need to change 29 | - JM: Maybe there's a way to re-write the ZEP in a way in which the existing ZEPs are composable - basically how extensions would interact with each other 30 | - TN: 31 | - JMS: There might be cases where combination of codecs may not work well 32 | - TN: Combining codecs seems straightforward compared to variable chunking which specifies what is allowed and what not 33 | - JMS: The proposals which change the data model are tricky - wanted to add non-zero origin 34 | - TN: Interested in variable chunking 35 | - SV: Would you be interested in contributing to ZEP3? 36 | - JM: We could also start thinking about ZEP2+ZEP3, ZEP3+ZEP4, ZEP4+ZEP2... 37 | - JMS: Any reason for not using Kerchunk? 
38 | - TN: Chunk manifest is clearly defined Zarr store compared to kerchunk (which kinda looks like Zarr) - reference file-system are not defined - there's value of getting chunk manifest into Zarr specification as Kerchunk is more than Zarr 39 | - JMS: The actual implementation of the file-system would be same across the various libraries 40 | - TN: Relying on single maintainer code is not an ideal situation 41 | - SV: ZP V3 implementation was outdated which led to creation of Zarrita and then finally re-using Zarrita for ZP V3 refactor 42 | - TN: Working on nit-picking Xarray for Virtuali-Zarr 43 | - TN: Zarr arrays are kind-of lazy arrays - when you index into Z-arrays they provide you with bytes not the actual Zarr arrays - Xarray has lazy-loading hidden inside in codebase and there has been discussion to make it a standalone library 44 | - JMS: We could have two sizes for chunks - stored size and actual size for variable chunking strategy 45 | - Move ZEP2 from `Accepted` to `Active` 46 | - JM: Would be good to move ZEP1 and ZEP2 both at the same time 47 | - SV: ZP V3 refactor would be a good time to move ZEP1 to active 48 | - Finalise ZEP0 revisions 49 | - 50 | - Re-start the conversation and finalise it 51 | - **TABLED** 52 | - Removing implicit groups - 53 | -------------------------------------------------------------------------------- /meetings/2024/2024-05-16.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 16th May 4 | description: ZEPs Meeting Notes for 2024-05-16 5 | grand_parent: ZEP meetings 6 | parent: 2024 meetings 7 | nav_order: 10 8 | --- 9 | 10 | # 2024-05-16 11 | 12 | **Attending:** Dennis Heimbigner (DH), Sanket Verma (SV), Josh Moore (JM), Jeremy Maitin-Shepard (JMS) 13 | 14 | ## TL;DR: 15 | 16 | The meeting discussed the release of Zarr-Python 2.18.0, a new blog post by Joe Hamman, and updates on the sharding support in the R implementation. 
They also covered the implementation of manifest storage transformers, standardizing URLs for Zarr, and the removal of implicit groups in Zarr-Python V3. 17 | 18 | **Updates:** 19 | 20 | **Meeting Minutes:** 21 | 22 | - Zarr-Python 2.18.0 out now: 23 | - One of the last few releases for Zarr Spec 2 - if there's anything you want to get in, please reply/tag us in the PRs/issues 24 | - New blog post by Joe Hamman: 25 | - Zarr-Python developers meeting new schedule - check here: 26 | - Lachlan Deakin added support for sharding in his R implementation: 27 | 28 | **Open agenda (add here 👇🏻):** 29 | 30 | - JM: In sharding you can recurse and browse through the chunks - something like _chunks([x, y])_ 31 | - DH: Treating sub-chunks as regular chunks - like what we decided during the storage transformers proposal - 32 | - _DH understands this proposal better and favours it_ 33 | - The relevant issue: 34 | - DH: Time to get our hands dirty with the implementation and figure out any problems we have 35 | - JMS: Using storage transformers and codecs in Neuroglancer to achieve sharding 36 | - DH: 37 | - SV: Manifest storage transformers - - defines and implements on top of the storage transformer in V3 core spec - discussion 👇🏻 38 | - JMS: Good to define the `JSON` and add other formats later on 39 | - DH: FSSPEC interprets the URL in Kerchunk 40 | - DH: Having complete key values in URL would help in the long run - DAP made a mistake earlier and we fixed it - having a complete URL is a better option and you can replace the contents within it later on 41 | - JM: Having a complete URL in manifest storage transformer for Zarr would help us but there's a question of backward compatibility 42 | - JMS: standardise the URL 43 | - DH: URL spec defines the format and correct way of defining a URL - if you consider things other than FSSPEC you should have a more standardised URL 44 | - DH: Conforming to the [URL Spec](https://www.w3.org/Addressing/URL/url-spec.txt) should be avoided 
actively 45 | - JM: Having URL defined in the storage transformer would help - currently not defined 46 | - Fix typo - - **MERGED** 47 | - Updated Zarr-Specs license to CC-BY-4.0 - 48 | - Implicit groups removed in Zarr-Python V3 via 49 | - Corresponding PR in Zarr-Specs - 50 | -------------------------------------------------------------------------------- /meetings/2024/2024-05-30.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 30th May 4 | description: ZEPs Meeting Notes for 2024-05-30 5 | grand_parent: ZEP meetings 6 | parent: 2024 meetings 7 | nav_order: 11 8 | --- 9 | 10 | # 2024-05-30 11 | 12 | **Attending:** Sanket Verma (SV) and Davis Bennett (DB) 13 | 14 | ## TL;DR: 15 | 16 | **Updates:** 17 | 18 | - Zarr + NASA applications survey: 19 | - Zarr-Python 2.18.1 and 2.18.2 were released in the last 2 weeks - includes a couple of minor bugs 20 | - Latest update: ZP V3.0 alpha aimed to release this week 21 | - New blog post coming soon in collaboration with NASA POWER project - 22 | 23 | **Meeting Minutes:** 24 | 25 | - DB: Discussion on - 26 | - DB: - This is inconsistent with the current design 27 | - Also doesn't conform to the spec 28 | - The array metadata should be immutable but `codec = CodecPipeline` makes it mutable 29 | - Implicit groups removed in Zarr-Python V3 via 30 | - Corresponding PR in Zarr-Specs - 31 | - SV: There's an informal consensus - we should go ahead with this PR 32 | -------------------------------------------------------------------------------- /meetings/2024/2024-06-13.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 13th June 4 | description: ZEPs Meeting Notes for 2024-06-13 5 | grand_parent: ZEP meetings 6 | parent: 2024 meetings 7 | nav_order: 12 8 | --- 9 | 10 | # 2024-06-13 11 | 12 | **Attending:** Davis Bennett (DB), Josh Moore (JM), Sanket Verma (SV), Jeremy Maitin-Shepard 
(JMS) 13 | 14 | ## TL;DR: 15 | 16 | The meeting discussed the timing for moving ZEPs 1 & 2 from "Accepted" to "Final," potential changes to ZEP1 related to v3 codec metadata and variable chunking, and considerations around implementing a chunk manifest with URL support. 17 | 18 | **Updates:** 19 | 20 | **Meeting Minutes:** 21 | - discussion for a future meeting: when are ZEPs 1 & 2 no longer just "Accepted" 22 | - see 23 | - when zarr-python v3 goes GA? some other time? 24 | - Sanket: after "accepted" is "final" for non-process 25 | - Davis: "active" isn't connected in the flowchart 26 | - Sanket: 27 | - Davis: defining ZEP with a ZEP seems problematic 28 | - Josh: certainly not necessary, but that requires "Yet Another Document" 29 | - Sanket: also definitely made a mistake of not taking into account the changes of ZEP0000 30 | - Davis: not a lot of ZEPs. writing the ZEP wasn't a good use of my time. 31 | - *jeremy joins* 32 | - ZEP1 changes ("bugs") 33 | - Davis: v3 codec metadata is cumbersome. could be json metadata rather than a list 34 | - could do this backwards compatible 35 | - as ZEP? Josh: suggest an issue first (like implicit group) then we can discuss 36 | - Jeremy: less likely to go in. need a high benefit (existing data out there, churn, etc.) 37 | - Davis: would argue that this is a wart in the spec and good to document that. 38 | - Davis: clarify relationship between the v3 spec and the codecs 39 | - current spec document is inconsistent 40 | - may impact implementations 41 | - Jeremy: intention was that whether in specs or in the "codecs" that there is a definition. i.e., no problem there and probably an editorial change. 42 | - Davis: variable chunking. extension defines a place to define the chunking (`name=regular` i.e. rectilinear) 43 | - minor version increment that just uses variable chunks ("easier"?) 
44 | - on the implementations, if there's only one it's easier in the long-run 45 | - Jeremy: that's probably how the implementation will work. but hard to know if there are other types of chunking in the future. possibly geospatial. 46 | - Davis: propose not supporting the old version (one list of chunk sizes) 47 | - Jeremy: don't think that's workable (to always require full); but for each dimension, to allow an integer or a list. ok to have the identifier and not a lot of work to convert. 48 | - Jeremy: chunk manifest (tabled) 49 | - likely makes it necessary to have URL support 50 | - 51 | - examples use `s3://...` 52 | - Josh: concerned that it's bigger than Zarr 53 | - other things: fsspec, intake, ... 54 | - Davis: is this another way to do sharding? pros / cons to the codec approach? (serves same perhaps as the shard header) 55 | - Jeremy: except there's the binary / plain split. 56 | - Davis: just that there's more than one way to do something 57 | - Jeremy: not unusual that a complicated system has more than one way to do something 58 | - Davis: decision point for people to make. 
Not something we had in zarr v2 59 | - Jeremy: use shard if you're writing it; use manifest if you have some stuff 60 | - Josh: also you could potentially choose to put a manifest in front of a (old) sharded 61 | -------------------------------------------------------------------------------- /meetings/2024/2024-06-27.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 27th June 4 | description: ZEPs Meeting Notes for 2024-06-27 5 | grand_parent: ZEP meetings 6 | parent: 2024 meetings 7 | nav_order: 13 8 | --- 9 | 10 | # 2024-06-27 11 | 12 | **Attending:** Davis Bennett (DB) and Sanket Verma (SV) 13 | 14 | ## TL;DR: 15 | 16 | **Updates:** 17 | 18 | - Zarr-Python 3.0.0a0 out 19 | - 20 | - Good momentum and lots of things happening with ZP-V3 - aiming for mid July release 21 | - SV represented Zarr at CZI Open Science 2024 meeting - various groups looking forward to V3 - 22 | - R users at bio-conductor looking to develop bindings for ZP-V3 23 | - New blog post: 24 | - ARCO-ERA5 got updated this week - ~6PB of Zarr data available - check: 25 | - - making weather data easy and accessible to work with 26 | - Check: 27 | - Video tutorial: 28 | 29 | **Meeting Minutes:** 30 | 31 | - SV: Would like to invite Norman for one of the showcase/lightning talks 32 | - DB: Having Tensorstore as backend for Zarr array writers would be good for performance 33 | - SV: How about Rust? 34 | - DB: Similar to C++ (Tensorstore) 35 | - DB: Slicing returns NumPy arrays - we should have lazy slicing API 36 | - DB: Would be good to keep the momentum after V3 37 | - SV: Anything we can do to keep them engaged? 38 | - DB: Not as of now! 39 | - - would like to go ahead with this 40 | - DB: Removing implicit groups is a big change - maybe we need a major version bump 41 | - SV: If there's a unanimous change then it could be submitted as a PR / Lean ZEP 42 | - DB: Sounds good! 43 | - Move ZEP1 and ZEP2 to `Final`? 
44 | -------------------------------------------------------------------------------- /meetings/2024/meeting_notes_2024.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: 2024 meetings 4 | description: List of ZEP meeting notes for the year 2024 5 | nav_order: 3 6 | parent: ZEP meetings 7 | has_children: true 8 | permalink: /meetings/2024/ 9 | --- 10 | 11 | # ZEP Meeting Notes for 2024 12 | 13 | Shows the list of meeting notes for the year 2024. 14 | -------------------------------------------------------------------------------- /meetings/meetings.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: ZEP meetings 4 | description: Information about bi-weekly ZEP meetings 5 | nav_order: 7 6 | has_children: true 7 | permalink: /meetings/ 8 | --- 9 | 10 | # ZEP Meetings 11 | {: .fs-8} 12 | 13 | Agenda, joining instructions and meeting notes for Bi-Weekly ZEP meetings 14 | {: .fs-5 .fw-300 } 15 | 16 | [Join here](https://openmicroscopy-org.zoom.us/j/82447735305?pwd=U3VXTnZBSk84T1BRNjZxaXFnZVQvZz09){: .btn .btn-primary .fs-5 .mb-4 .mb-md-0 .mr-2 } 17 | [Agenda for the upcoming meeting](https://hackmd.io/ZilORe8AQvyqH6ArqDw0Cg?view){: .btn .fs-5 .mb-4 .mb-md-0 } 18 | 19 | --- 20 | 21 | 22 | 23 | Download the [.ics](https://calendar.google.com/calendar/ical/c_ba2k79i3u0lkf49vo0jre27j14%40group.calendar.google.com/public/basic.ics) file and add it to your calendar so you won't miss any of our 24 | meetings! 
25 | -------------------------------------------------------------------------------- /template/header.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: template 4 | description: Template and instructions for proposing a new ZEP 5 | nav_order: 5 6 | has_children: true 7 | permalink: /template/ 8 | --- 9 | 10 | # ZEP Template and Instructions 11 | 12 | ### Template and instructions for proposing a new ZEP. 13 | -------------------------------------------------------------------------------- /template/template.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: ZEP0000 4 | description: Template and instructions for proposing a new ZEP 5 | parent: template 6 | nav_order: 1 7 | --- 8 | 9 | # ZEP — Template and Instructions 10 | 11 | --- 12 | 13 | ``` 14 | Author: 15 | 16 | Status: < Draft | Active | Accepted | Deferred | Rejected | Withdrawn | Final | Superseded > 17 | 18 | Type: 19 | 20 | Created: 21 | 22 | Discussion: (link to zarr-developers post for discussion) 23 | 24 | Resolution: (required for Accepted | Rejected | Withdrawn) 25 | ``` 26 | 27 | ## Abstract 28 | 29 | The abstract should be a short description of what the ZEP will achieve. 30 | 31 | ## Motivation and Scope 32 | 33 | This section describes the need for the proposed change. It should describe the existing problem, who it affects, what it is trying to solve, and why. 34 | This section should explicitly address the scope of and key requirements for the proposed change. 35 | 36 | ## Usage and Impact 37 | 38 | This section describes how users of Zarr will use the new features, spec changes or a new process described in this ZEP. It should be comprised mainly of code examples that wouldn’t be possible 39 | without acceptance and implementation of this ZEP, as well as the impact the proposed changes would have on the ecosystem. 
This section should be written from 40 | the perspective of the users of Zarr, and the benefit it will provide them; as such, it should include implementation details only if necessary to explain the 41 | functionality. 42 | 43 | ## Backward Compatibility 44 | 45 | This section describes how the ZEP breaks backward compatibility. 46 | 47 | Its purpose is to provide a high-level summary to users who are not interested in detailed technical discussion, but may have opinions around, e.g., usage and 48 | impact. 49 | 50 | ## Detailed description 51 | 52 | This section should provide a detailed description of the proposed change. It should include examples of how the new functionality would be used, intended 53 | use-cases and pseudo-code illustrating its use. 54 | 55 | ## Related Work 56 | 57 | This section should list relevant and/or similar technologies, possibly in other libraries. It does not need to be comprehensive, just list the major examples 58 | of prior and relevant art. 59 | 60 | ## Implementation 61 | 62 | This section lists the major steps required to implement the ZEP. Where possible, it should be noted where one step is dependent on another, and which steps may 63 | be optionally omitted. Where it makes sense, each step should include a link to related pull requests as the implementation progresses. 64 | 65 | Any pull requests or development branches containing work on this ZEP should be linked to from here. (A ZEP does not need to be implemented in a single pull request if 66 | it makes sense to implement it in discrete phases). 67 | 68 | ## Alternatives 69 | 70 | If there were any alternative solutions to solving the same problem, they should be discussed here, along with a justification for the chosen approach. 71 | 72 | ## Discussion 73 | 74 | This section should have links related to any discussion regarding the ZEP. It could be GitHub issues and/or discussions. (Links to past discussions, 75 | if any, go in this section.) 
76 | 77 | ## References and Footnotes 78 | 79 | Each ZEP must either be explicitly labelled as placed in the public domain (see this ZEP as an example) or licensed under the 80 | [Open Publication License](https://www.opencontent.org/openpub/). 81 | 82 | ## Copyright 83 | 84 | This document has been placed in the public domain. 85 | -------------------------------------------------------------------------------- /zarr-implementations-council.markdown: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: implementations council 4 | description: Representatives of various Zarr Implementations 5 | nav_order: 6 6 | permalink: /zic/ 7 | --- 8 | 9 | # Zarr Implementation Council 🚀 10 | 11 | The [ZSC](https://github.com/zarr-developers/governance/blob/main/GOVERNANCE.md#zarr-steering-council) have invited Zarr Implementations to participate in the management of the Zarr specification through the Zarr Implementation Council (ZIC). Implementations are selected based on the maturity of implementation as well as the activity of the developer community. Preference will be given to open-source and *open-process* implementations. Multiple implementations in a single programming language may be invited, or such implementations could work together as a single community. 
12 | 13 | The current list of implementations which are participating in this process are (in alphabetical order): 14 | 15 | - [constantinpape/z5](https://github.com/constantinpape/z5) represented by [Constantin Pape](https://github.com/constantinpape) ([May 2022 – present](https://github.com/zarr-developers/governance/issues/26)) 16 | 17 | - [google/tensorstore](https://github.com/google/tensorstore) represented by [Jeremy Maitin-Shepard](https://github.com/jbms) ([May 2022 – present](https://github.com/zarr-developers/governance/issues/22)) 18 | 19 | - [freeman-lab/zarr-js](https://github.com/freeman-lab/zarr-js): 20 | - represented by [Jeremy Freeman](https://github.com/freeman-lab) ([May 2022 – March 2023](https://github.com/zarr-developers/governance/issues/27)) 21 | - represented by [Anderson Banihirwe](https://github.com/andersy005) ([March 2023 - present](https://github.com/zarr-developers/governance/pull/36)) 22 | 23 | - [gzuidhof/zarr.js](https://github.com/gzuidhof/zarr.js) represented by [Trevor Manz](https://github.com/manzt) ([May 2022 – present](https://github.com/zarr-developers/governance/issues/28)) 24 | 25 | - [JuliaIO/Zarr.jl](https://github.com/JuliaIO/Zarr.jl) represented by [Fabian Gans](https://github.com/meggart) ([May 2022 – present](https://github.com/zarr-developers/governance/issues/18)) 26 | 27 | - [saalfeldlab/n5-zarr](https://github.com/saalfeldlab/n5-zarr) represented by [Stephan Saalfeld](https://github.com/axtimwalde) ([May 2022 – present](https://github.com/zarr-developers/governance/issues/25)) 28 | 29 | - [sci-rs/zarr](https://github.com/sci-rs/zarr) represented by [Andrew Champion](https://github.com/aschampion) ([May 2022 - present](https://github.com/zarr-developers/governance/issues/20)) 30 | 31 | - [Unidata/netcdf-c](https://github.com/Unidata/netcdf-c) and [Unidata/netcdf-java](https://github.com/Unidata/netcdf-java) represented by [Ward Fisher](https://github.com/wardf) ([May 2022 - 
present](https://github.com/zarr-developers/governance/issues/21)) 32 | 33 | - [xtensor-stack/xtensor-zarr](https://github.com/xtensor-stack/xtensor-zarr) represented by [David Brochart](https://github.com/davidbrochart) ([May 2022 - present](https://github.com/zarr-developers/governance/issues/23)) 34 | 35 | - [zarr-developers/zarr-python](https://github.com/zarr-developers/zarr-python): 36 | - represented by [Gregory Lee](https://github.com/grlee77) ([May 2022 - January 2024](https://github.com/zarr-developers/governance/issues/19)) 37 | - represented by [Joe Hamman](https://github.com/jhamman) and seconded by [Davis Bennett](https://github.com/d-v-b/) ([January 2024 - present](https://github.com/zarr-developers/governance/commit/0a12fdf653d5a32c47d9566eb3049d2961880bca)) 38 | 39 | 40 | The core developers of each implementation have selected a representative of the ZIC. It is up to each implementation to determine its process for selecting its representatives. 41 | 42 | This member will represent that implementation in decisions regarding the Zarr Specification and other Zarr-wide contexts which require input from implementations. 43 | 44 | An additional representative should also be selected to act as an alternate when the primary representative is unavailable. 45 | 46 | Continued ZIC membership depends on timely feedback and votes on relevant issues. The ZSC also reserves the right to remove implementations from the council. 47 | --------------------------------------------------------------------------------