├── .github
    └── ISSUE_TEMPLATE
    │   ├── add-a-new-gap-analysis-topic.md
    │   ├── ask-a-question.md
    │   └── other.md
├── .gitignore
├── .pr-preview.json
├── CONTRIBUTING.md
├── README.md
├── beng
    ├── echidna
    └── index.html
├── bengali
    ├── index.html
    └── local.css
├── charter
    ├── charter2018.html
    └── index.html
├── deva
    ├── echidna
    └── index.html
├── devanagari
    ├── images
    │   ├── first_letter.png
    │   └── first_letter_box.png
    ├── index.html
    └── local.css
├── gap-analysis
    ├── HOWTO.md
    ├── beng-gap.html
    ├── bengali-tests
    │   ├── bengali-danda-001.html
    │   └── highlighting-base-001.html
    ├── deva-gap.html
    ├── echidna-beng-gap
    ├── echidna-deva-gap
    ├── echidna-gujr-gap
    ├── echidna-guru-gap
    ├── echidna-taml-gap
    ├── gujr-gap.html
    ├── guru-gap.html
    ├── images
    │   ├── 107240559-f6157b00-6a21-11eb-9ce6-bf4494e6c3ca.png
    │   ├── 146041054-f96714a2-b926-4ea4-8e17-ba4e003a671e.png
    │   ├── 146041082-fbdd1cee-1bdc-4c45-b2d8-d427b8027c77.png
    │   ├── 146041100-ab30d934-2a22-4e2e-9c6e-68cb3965f3e9.png
    │   ├── 150516199-0b937d1f-ffa7-458f-a192-e49146503cf2.png
    │   ├── 150529011-bc06a209-1218-4e32-ba68-2d1b1d48533a.png
    │   ├── 150529074-0ea464b3-75c0-412e-b849-f6e0af82d6d3.png
    │   └── 153039565-773656ba-9a55-47d3-be14-a7c5a84e605d.png
    └── taml-gap.html
├── gujr
    ├── echidna
    └── index.html
├── gurmukhi
    ├── echidna
    ├── images
    │   ├── fig_abbrev.png
    │   ├── fig_alphabetic_counters.png
    │   ├── fig_baselines_baloo_paaji_2.png
    │   ├── fig_baselines_gurmukhi_mn.png
    │   ├── fig_baselines_mukta_mahee.png
    │   ├── fig_baselines_raavi.png
    │   ├── fig_baselines_sans.png
    │   ├── fig_baselines_serif.png
    │   ├── fig_counterstyles.png
    │   ├── fig_counterstyles2.png
    │   ├── fig_cs_separators.png
    │   ├── fig_danda.png
    │   ├── fig_first_letter.png
    │   ├── fig_grapheme_clusters.png
    │   ├── fig_justification.png
    │   ├── fig_numeric_counters.png
    │   ├── fig_orthographic_syllables.png
    │   └── fig_slanted_para.jpg
    ├── index.html
    └── local.css
├── guru
    ├── echidna
    ├── index.html
    └── local.css
├── home.md
├── homepage
    ├── index-data
    │   ├── local.css
    │   └── translations.js
    └── index.html
├── tamil
    ├── echidna
    ├── images
    │   ├── drop_cap_low.png
    │   ├── drop_cap_wide.png
    │   ├── fig_1978_reforms.png
    │   ├── fig_baselines.png
    │   ├── fig_baselines_latha.png
    │   ├── fig_baselines_mn.png
    │   ├── fig_baselines_sans.png
    │   ├── fig_counter_styles.png
    │   ├── fig_cs_additive.png
    │   ├── fig_ra_variants.jpeg
    │   ├── fig_raised_initial.png
    │   ├── fig_virama_seg.png
    │   ├── graphemes.png
    │   ├── hyphenation_ta.png
    │   ├── justification_full_stop.png
    │   ├── justification_gaps.png
    │   ├── justification_in_newsprint.png
    │   ├── justification_one_word.png
    │   ├── partridge.png
    │   ├── s_cut.png
    │   ├── s_fineness.png
    │   ├── s_lakh.png
    │   ├── s_road_street.png
    │   ├── s_shri.png
    │   ├── s_stalk.png
    │   ├── s_there.png
    │   └── s_weakness.png
    ├── index.html
    ├── local.css
    └── webfonts
    │   ├── notosanstamil-regular-webfont.woff2
    │   └── notoseriftamil-regular-webfont.woff2
├── taml
    ├── echidna
    ├── index.html
    ├── local.css
    └── webfonts
    │   ├── notosanstamil-regular-webfont.woff2
    │   └── notoseriftamil-regular-webfont.woff2
└── w3c.json
/.github/ISSUE_TEMPLATE/add-a-new-gap-analysis-topic.md:
--------------------------------------------------------------------------------
1 | ---
2 | name: Add a new gap-analysis topic
3 | about: Only those in the iip group should use this template.
4 | title: Brief_description_of_the_problem
5 | labels: gap
6 | assignees: ''
7 |
8 | ---
9 |
10 | This issue is applicable to most_languages.
11 |
12 | Brief_intro_illustrating_the_requirements
13 |
14 | More:
15 | - [requirements_doc]()
16 | - [etc]()
17 |
18 |
19 | IF THIS IS NOT THE ISSUE THAT IS BEING TRACKED BY THE GAP-ANALYSIS PIPELINE, ADD A POINTER TO THAT ISSUE. THE INITIAL BRIEF INTRO SHOULD REMAIN, AND MAY BE TAILORED WITH EXAMPLES RELEVANT TO THIS LANGUAGE. YOU MAY, OPTIONALLY, ALSO ADD OTHER DETAILS BELOW IF THEY ARE SPECIFIC TO THIS LANGUAGE. THEN ADD THIS:
20 |
21 | For more details, see [this GitHub issue](https://github.com/w3c/XXXX/issues/XX), which is being used to track this gap. Please add any discussion there, and not to this issue.
22 |
23 | THEN ADD THESE 2 PARAS TO THE SECOND COMMENT FIELD AND DELETE THE REST OF THIS TEMPLATE.
24 |
25 | _The first comment in this issue contains text that will automatically appear in one or more gap-analysis documents as a subsection with the same title as this issue. Any edits made to that comment will be immediately available in the Editor's draft of the document._
26 |
27 | _**Please add any discussion to the GitHub issue being used to track this gap, and not to this issue**_
28 |
29 |
30 |
31 |
32 |
33 | ### The GAP
34 |
35 | Description_of_the_problem_and_summary_of_test_results
36 |
37 | Brief_description_of_what_spec_says_on_the_matter
38 | [shortname](url_to_section) describe_what_it_says
39 |
40 | Gecko, Blink, and Webkit
41 |
42 | More:
43 | - [relevant_issues]()
44 | - [etc]()
45 |
46 |
47 |
48 |
49 | ### Priority
50 | Why_you_chose_the_priority
51 |
52 |
53 |
54 |
55 |
56 | ### Tests & results
57 |
58 | Interactive test, [assertion](url)
59 | I18n test suite, [section_head](url)
60 |
61 | Summarise_the_results_for_each_major_engine_only_if_useful
62 |
63 |
64 |
65 |
66 |
67 | ### Action taken
68 | Issue, [XXX](url) Closed.
69 |
70 | [Gecko](url) • [Blink](url) • [Webkit](url)
71 |
72 |
73 |
74 |
75 | ### Outcomes
76 | Brief_description_of_developments
77 |
78 |
79 |
80 |
81 |
82 | TEXT FOR THE SECOND COMMENT FIELD: ADAPT THE LINKS AS NEEDED; IF THE DOCS SPAN REPOS, BOLD THE ONE THAT IS REFERRED TO FROM THE PIPELINE
83 | _The first comment in this issue contains text that will automatically appear in one or more gap-analysis documents as a subsection with the same title as this issue. Any edits made to that comment will be immediately available in the Editor's draft of the document. Proposals for changes or discussion of the content can be made by adding comments below this point._
84 |
85 | _Relevant gap analysis documents include:_
86 | _[Adlam](https://www.w3.org/TR/adlm-gap#fragmentid) • [Arabic/Persian](https://www.w3.org/TR/alreq-gap#fragmentid) • [Bengali](https://www.w3.org/TR/beng-gap/#fragmentid) • [Cherokee](https://www.w3.org/TR/cher-gap#fragmentid) • [Chinese](https://www.w3.org/TR/clreq-gap#fragmentid) • [Dutch](https://www.w3.org/TR/latn-nl-gap#fragmentid) • [Ethiopic](https://www.w3.org/TR/elreq-gap#fragmentid) • [French](https://www.w3.org/TR/latn-fr-gap#fragmentid) • [**Georgian**](https://www.w3.org/TR/geor-gap#fragmentid) • [German](https://www.w3.org/TR/latn-de-gap#fragmentid) • [Greek](https://www.w3.org/TR/grek-gap#fragmentid) • [Gujarati](https://www.w3.org/TR/gujr-gap#fragmentid) • [Hebrew](https://www.w3.org/TR/hebr-gap#fragmentid) • [Hindi](https://www.w3.org/TR/deva-gap#fragmentid) • [Hungarian](https://w3c.github.io/eurlreq/gap-analysis/latn-nl-gap#fragmentid) • [Inuktitut/Cree](https://www.w3.org/TR/cans-iu-cr-gap#fragmentid) • [Japanese](https://www.w3.org/TR/jpan-gap#fragmentid) • [Javanese](https://www.w3.org/TR/java-gap#fragmentid) • [Kashmiri](https://www.w3.org/TR/arab-ks-gap#fragmentid) • [Khmer](https://www.w3.org/TR/khmr-gap#fragmentid) • [Korean](https://www.w3.org/TR/kore-gap#fragmentid) • [Lao](https://www.w3.org/TR/laoo-gap#fragmentid) • [Mongolian](https://www.w3.org/TR/mong-gap#fragmentid) • [N'Ko](https://www.w3.org/TR/nkoo-gap#fragmentid) • [Osage](https://www.w3.org/TR/osge-osa-gap#fragmentid) • [Punjabi](https://www.w3.org/TR/guru-gap#fragmentid) • [Tamil](https://www.w3.org/TR/taml-gap#fragmentid) • [Thai](https://www.w3.org/TR/thai-gap#fragmentid) • [Tibetan](https://www.w3.org/TR/tibt-gap#fragmentid) • [Uighur](https://www.w3.org/TR/arab-ug-gap#fragmentid)_
87 |
88 | SETTING LABELS (delete before submitting)
89 | gap should already be assigned
90 | doc:... should point to each document _in this repo_ where this gap report will appear
91 | i:... should indicate the section in those documents where this will appear
92 | x:blink/gecko/webkit should be set for browser engines that don't resolve the gap (and removed when they do)
93 | x:... language or script related tags should be set for all affected languages
94 | p:... should indicate the priority of this gap
95 |
--------------------------------------------------------------------------------
/.github/ISSUE_TEMPLATE/ask-a-question.md:
--------------------------------------------------------------------------------
1 | ---
2 | name: Ask a question
3 | about: Use to ask about how people use a language or script.
4 | title: Short_version_of_the_question?
5 | labels: question
6 | assignees: ''
7 |
8 | ---
9 |
10 | Ask_the_question_here_Use_pictures_and_links
11 |
--------------------------------------------------------------------------------
/.github/ISSUE_TEMPLATE/other.md:
--------------------------------------------------------------------------------
1 | ---
2 | name: Other
3 | about: Please use links or pictures for examples and sources where possible.
4 | title: ''
5 | labels: ''
6 | assignees: ''
7 |
8 | ---
9 |
10 |
11 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | *.DS_Store
2 |
--------------------------------------------------------------------------------
/.pr-preview.json:
--------------------------------------------------------------------------------
1 | {
2 | "src_file": "gap-analysis/deva-gap.html",
3 | "type": "respec"
4 | }
5 |
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | ## Contributions
2 |
3 | Contributions to this repository are intended to become part of the Internationalization Interest Group and Internationalization Working Group documents governed by the [Software and Document License](http://www.w3.org/Consortium/Legal/copyright-software). By committing here, you agree to that licensing of your contributions.
4 |
5 | If you are not the sole contributor to a contribution (pull request), please identify all contributors in the pull request comment.
6 |
7 | To add a contributor (other than yourself, that's automatic), mark them one per line as follows:
8 |
9 | ```
10 | +@github_username
11 | ```
12 |
13 | If you added a contributor by mistake, you can remove them in a comment with:
14 |
15 | ```
16 | -@github_username
17 | ```
18 |
19 | If you are making a pull request on behalf of someone else but you had no part in designing the feature, you can remove yourself with the above syntax.
20 |
21 |
22 |
23 | ## Copyright
24 |
25 | Copyright is a very important part of standardization activities. It allows the standards development organization to maintain vendor neutral control over a specification, and thus protect the consensus found within a Working Group.
26 |
27 | In the course of the development of materials within the W3C, Task Force Participants will make contributions. Those contributions will be integrated into the jointly developed work thus creating shared copyright on the Task Force Participant's contribution. Most W3C Specifications contain a section with acknowledgement of contributions.
28 |
29 | Task Force Participants grant to the W3C a perpetual, nonexclusive, royalty-free, world-wide right and license under any Task Force Participant's copyrights on his or her contributions, to copy, publish and distribute the contribution under a license of W3C's choosing. Additionally, the Task Force Participant grants a right and license of the same scope to any derivative works prepared by the W3C and based on, or incorporating all or part of, his or her contribution and that any derivative works of this contribution prepared by the W3C shall be solely owned by the W3C. Furthermore, the Task Force Participant understands that W3C will be able to exercise all rights as a copyright owner of Task Force Participant's contribution, including enforcement against infringers without additional agreement or notice.
30 |
31 | Nothing in this agreement restricts the Task Force Participant from using their individual contributions as they wish, even if those have later been amalgamated into joint works. Where W3C releases materials under a permissive license such as the W3C Software License or CC-BY, nothing in this agreement should be read to restrict the Task Force Participant from exercising the permissions granted by that license. The Task Force Participant represents that they are legally entitled to grant the above license. If their employer(s) have rights to intellectual property that the Task Force Participant creates that includes the contributions, they represent that they have received permission to make contributions on behalf of that employer or that the employer has waived such rights for the contributions to W3C.
32 |
33 |
34 | ## Decency
35 |
36 | The Task Force Participant will participate in the W3C Group in a decent way. Task Force Participants will refrain from defaming, harassing or otherwise offending other participants. [Section 3.1 of the Process Document](https://www.w3.org/2015/Process-20150901/#ParticipationCriteria) applies, as does the W3C [Code of Ethics and Professional Conduct](https://www.w3.org/Consortium/cepc/).
37 |
38 | The Task Force Participant will refrain from sending unsolicited commercial messages to W3C mailing-lists and other promotional activities for personal matters or for third parties. This is especially required from Task Force Participants sending messages to public W3C Groups.
39 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # India Language Enablement (IIP)
2 |
3 | This is the place to explore gaps in support for languages of India on the Web and in eBooks, and to document requirements.
4 |
5 | We aim to address the problem that local users don't know how to tell the W3C what problems exist for support of their language on the Web, and the W3C doesn't know how to contact people who can help when questions arise.
6 |
7 | Topics for discussion are suggested by [the gap-analysis template](https://www.w3.org/International/i18n-activity/templates/gap-analysis/gap-analysis_template.html). This work feeds into the [language matrix](https://www.w3.org/International/typography/gap-analysis/language-matrix.html) which provides a heat-map for language issues on the Web.
8 |
9 |
10 | ### Key links
11 | [GitHub repo](https://github.com/w3c/iip) • [Discussion threads](https://github.com/w3c/iip/issues) • [Charter](https://www.w3.org/International/iip/charter/)
12 |
13 | ---
14 |
15 | ## Help wanted! ###
16 | **We're looking for information about these writing systems. Follow the links for specific questions.**
17 |
18 | **[Bengali](https://github.com/w3c/iip/issues?q=is%3Aissue+is%3Aopen+label%3Al%3Abn+label%3Aquestion) • [Hindi](https://github.com/w3c/iip/issues?q=is%3Aissue+is%3Aopen+label%3Al%3Ahi+label%3Aquestion) • [Gujarati](https://github.com/w3c/iip/issues?q=is%3Aissue+is%3Aopen+label%3Al%3Agu+label%3Aquestion) • [Punjabi](https://github.com/w3c/iip/issues?q=is%3Aissue+is%3Aopen+label%3Al%3Apa-guru+label%3Aquestion) • [Tamil](https://github.com/w3c/iip/issues?q=is%3Aissue+is%3Aopen+label%3Al%3Ata+label%3Aquestion)**
19 |
20 | ---
21 |
22 |
23 |
24 | ### Resource & requirement docs
25 | - **Bengali Script Resources** • [DNOTE](https://www.w3.org/TR/beng-lreq/) • [*Editor's draft*](https://w3c.github.io/iip/beng/) • [*Latest commits*](https://github.com/w3c/iip/commits/gh-pages/beng/)
26 | - **Devanagari Script Resources** • [DNOTE](https://www.w3.org/TR/deva-lreq/) • [*Editor's draft*](https://w3c.github.io/iip/deva/) • [*Latest commits*](https://github.com/w3c/iip/commits/gh-pages/deva/)
27 | - **Gujarati Script Resources** • [DNOTE](https://www.w3.org/TR/gujr-lreq/) • [*Editor's draft*](https://w3c.github.io/iip/gujr/) • [*Latest commits*](https://github.com/w3c/iip/commits/gh-pages/gujr/)
28 | - **Gurmukhi Script Resources** • [DNOTE](https://www.w3.org/TR/guru-lreq/) • [*Editor's draft*](https://w3c.github.io/iip/guru/) • [*Latest commits*](https://github.com/w3c/iip/commits/gh-pages/guru/)
29 | - **Tamil Script Resources** • [DNOTE](https://www.w3.org/TR/taml-lreq/) • [*Editor's draft*](https://w3c.github.io/iip/taml/) • [*Latest commits*](https://github.com/w3c/iip/commits/gh-pages/taml/)
30 |
31 | ### Gap docs
32 | - **Bengali Gap Analysis** • [DNOTE](https://www.w3.org/TR/beng-gap) • [*Editor's draft*](https://www.w3.org/International/ilreq/gap-analysis/beng-gap) • [*Latest commits*](https://github.com/w3c/iip/commits/gh-pages/gap-analysis/beng-gap.html)
33 | - **Devanagari Gap Analysis** • [DNOTE](https://www.w3.org/TR/deva-gap) • [*Editor's draft*](https://www.w3.org/International/ilreq/gap-analysis/deva-gap) • [*Latest commits*](https://github.com/w3c/iip/commits/gh-pages/gap-analysis/deva-gap.html)
34 | - **Gujarati Gap Analysis** • [DNOTE](https://www.w3.org/TR/gujr-gap) • [*Editor's draft*](https://www.w3.org/International/ilreq/gap-analysis/gujr-gap) • [*Latest commits*](https://github.com/w3c/iip/commits/gh-pages/gap-analysis/gujr-gap.html)
35 | - **Gurmukhi Gap Analysis** • [DNOTE](https://www.w3.org/TR/guru-gap) • [*Editor's draft*](https://www.w3.org/International/ilreq/gap-analysis/guru-gap) • [*Latest commits*](https://github.com/w3c/iip/commits/gh-pages/gap-analysis/guru-gap.html)
36 | - **Tamil Gap Analysis** • [DNOTE](https://www.w3.org/TR/taml-gap) • [*Editor's draft*](https://www.w3.org/International/ilreq/gap-analysis/taml-gap) • [*Latest commits*](https://github.com/w3c/iip/commits/gh-pages/gap-analysis/taml-gap.html)
37 |
38 |
39 | ### Discussions
40 | - **Bengali** • [*Questions*](https://github.com/w3c/iip/issues?q=is%3Aissue+is%3Aopen+label%3Al%3Abn+label%3Aquestion)
41 | • [*Gap reports*](https://github.com/w3c/iip/labels/doc%3Abeng)
42 | • [*Other*](https://github.com/w3c/iip/issues?q=is%3Aopen+label%3Al%3Abn+-label%3Aquestion)
43 | • [*Spec issues*](https://github.com/w3c/i18n-activity/issues?q=is%3Aopen+label%3Ailreq+label%3Aspec-type-issue)
44 | - **Hindi** • [*Questions*](https://github.com/w3c/iip/issues?q=is%3Aissue+is%3Aopen+label%3Al%3Ahi+label%3Aquestion)
45 | • [*Gap reports*](https://github.com/w3c/iip/labels/doc%3Adeva)
46 | • [*Other*](https://github.com/w3c/iip/issues?q=is%3Aopen+label%3Al%3Ahi+-label%3Aquestion)
47 | • [*Spec issues*](https://github.com/w3c/i18n-activity/issues?q=is%3Aopen+label%3Ailreq+label%3Aspec-type-issue)
48 | - **Gujarati** • [*Questions*](https://github.com/w3c/iip/issues?q=is%3Aissue+is%3Aopen+label%3Al%3Agu+label%3Aquestion)
49 | • [*Gap reports*](https://github.com/w3c/iip/labels/doc%3Agujr)
50 | • [*Other*](https://github.com/w3c/iip/issues?q=is%3Aopen+label%3Al%3Agu+-label%3Aquestion)
51 | • [*Spec issues*](https://github.com/w3c/i18n-activity/issues?q=is%3Aopen+label%3Ailreq+label%3Aspec-type-issue)
52 | - **Punjabi** • [*Questions*](https://github.com/w3c/iip/issues?q=is%3Aissue+is%3Aopen+label%3Al%3Apa-guru+label%3Aquestion)
53 | • [*Gap reports*](https://github.com/w3c/iip/labels/doc%3Aguru)
54 | • [*Other*](https://github.com/w3c/iip/issues?q=is%3Aopen+label%3Al%3Apa-guru+-label%3Aquestion)
55 | • [*Spec issues*](https://github.com/w3c/i18n-activity/issues?q=is%3Aopen+label%3Ailreq+label%3Aspec-type-issue)
56 | - **Tamil** • [*Questions*](https://github.com/w3c/iip/issues?q=is%3Aissue+is%3Aopen+label%3Al%3Ata+label%3Aquestion)
57 | • [*Gap reports*](https://github.com/w3c/iip/labels/doc%3Ataml)
58 | • [*Other*](https://github.com/w3c/iip/issues?q=is%3Aopen+label%3Al%3Ata+-label%3Aquestion)
59 | • [*Spec issues*](https://github.com/w3c/i18n-activity/issues?q=is%3Aopen+label%3Ailreq+label%3Aspec-type-issue)
60 |
61 |
62 |
63 |
64 | ### Related documents
65 | - [Indic Layout Requirements](https://www.w3.org/TR/ilreq/)
66 | - [Ready-made Counter Styles](https://www.w3.org/TR/predefined-counter-styles/)
67 |
68 |
69 | #### Documents not currently being worked on
70 | - Bengali Layout Requirements • [*Editor's draft*](https://www.w3.org/International/ilreq/bengali/) • [*Latest commits*](https://github.com/w3c/iip/commits/gh-pages/bengali/index.html)
71 | - Devanagari Layout Requirements • [*Editor's draft*](https://www.w3.org/International/ilreq/devanagari/) • [*Latest commits*](https://github.com/w3c/iip/commits/gh-pages/devanagari/index.html)
72 | - Tamil Layout Requirements • [DNOTE](https://www.w3.org/TR/ilreq-taml) • [*Editor's draft*](https://www.w3.org/International/ilreq/tamil/) • [*Latest commits*](https://github.com/w3c/iip/commits/gh-pages/tamil/index.html)
73 |
74 |
75 | ### Feedback
76 | Please use the [GitHub issue list](https://github.com/w3c/iip/issues) to report issues for language support, for discussions, and to send feedback about documents. (Learn [how GitHub issues work](https://www.w3.org/International/i18n-activity/guidelines/issues.html).)
77 |
78 | Note that the public-i18n-indic mailing list is used to send notification digests & meeting minutes. It is **not** for technical discussion.
79 |
80 |
81 | ### Participate
82 | You can participate in the work at various levels. In order of increasing commitment, these include List subscriber, Participant, Editor, and Chair. [Explore the options](https://www.w3.org/International/i18n-drafts/pages/languagedev_participation.html).
83 |
84 | **To just follow the work:** Rather than 'Watch' this repository, [subscribe](mailto:public-i18n-indic-request@w3.org?subject=subscribe) to the [public-i18n-indic](https://lists.w3.org/Archives/Public/public-i18n-indic/) mailing list. That list is notified (no more than once a day, and in digest form), about changes to issues in this repository, but also about other W3C Working Group issues related to the Indian writing systems.
85 |
86 | **To contribute content:** All contributors must read and agree with [CONTRIBUTING.md](CONTRIBUTING.md).
87 |
88 | **To become a participant, editor, or chair:** contact [Richard Ishida](mailto:ishida@w3.org). We welcome participation requests.
89 |
90 |
91 | ### Contacts
92 |
93 | - Chairs: Alolita Sharma, Abhijit Dutta
94 | - W3C staff: [Atsushi Shimono](mailto:atsushi@w3.org), [Richard Ishida](mailto:ishida@w3.org)
95 |
96 |
97 | ### Links to practical information
98 | - [Mail archive](https://lists.w3.org/Archives/Public/public-i18n-indic/)
99 | - [Writing i18n tests](https://github.com/w3c/i18n-tests/wiki/Writing-i18n-tests)
100 | - [Practical tips for task forces](https://www.w3.org/International/i18n-activity/guidelines/process.html) (See also the github and editorial guidelines below)
101 | - [Meeting info](https://www.w3.org/2017/07/ilreq-meeting-info.html)
102 | - [Group members](https://www.w3.org/2000/09/dbwg/details?group=104979&public=1)
103 | - [Archived, former action tracker](https://www.w3.org/International/groups/indic-layout/track/)
104 |
105 |
106 | ### Links to background information
107 | The following information describes work going on at the W3C to support languages on the Web.
108 | - [Language support heatmap (matrix)](https://www.w3.org/International/typography/gap-analysis/language-matrix.html)
109 | - [Analysing support for text layout on the Web](https://www.w3.org/International/i18n-drafts/nav/languagedev)
110 | - [Overview of language enablement work in progress](https://www.w3.org/International/i18n-drafts/nav/languagedev)
111 | - [Get involved with Language Enablement](https://www.w3.org/International/i18n-drafts/pages/languagedev_participation)
112 | - [Setting up a Gap Analysis Project](https://github.com/w3c/typography/wiki/Setting-up-a-Gap-Analysis-Project)
113 | - [Internationalization Sponsorship Program](https://www.w3.org/International/sponsorship/)
114 |
115 |
116 | ### Links for editors
117 | If you end up creating a document, you should be familiar with and use the following:
118 |
119 | - [Github guidelines for working with i18n documents](https://www.w3.org/International/i18n-activity/guidelines/github)
120 | - [Editorial guidelines for working with i18n documents](https://www.w3.org/International/i18n-activity/guidelines/editing)
121 |
--------------------------------------------------------------------------------
/beng/echidna:
--------------------------------------------------------------------------------
1 | # ECHIDNA configuration
2 | index.html?specStatus=DNOTE&shortName=beng-lreq respec
3 |
--------------------------------------------------------------------------------
/bengali/index.html:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
This document describes or points to requirements for the layout and presentation of text in languages that use the Bengali script. The target audience is developers of Web standards and technologies, such as HTML, CSS, Mobile Web, Digital Publications, and Unicode, as well as implementers of web browsers, ebook readers, and other applications that need to render Bengali text.
52 |
53 |
54 |
55 |
This document describes the basic requirements for Bengali script layout and text support on the Web and in eBooks. These requirements provide information for Web technologies such as CSS, HTML and digital publications about how to support users of Bengali script languages. Currently the document focuses on the Bengali script as used for Bengali. The information here is developed in conjunction with a document that summarises gaps in support on the Web for Bengali.
To make it easier to track comments, please raise separate issues or emails for each comment, and point to the section you are commenting on using a URL.
60 |
61 |
62 |
63 |
64 |
Some links on this page point to repositories or pages to which information will be added over time. Initially, the link may produce no results, but as issues, tests, etc. are created they will show up.
65 |
66 |
Links that have a gray color led to no content the last time this document was updated. They are still live, however, since relevant content could be added at any time. When the document is updated, links that now point to results will have their live colour restored.
Thanks to the following people who contributed information that is used in this document (contributors' names listed in alphabetic order): Akshat Joshi, Hai Liang, John Hudson, Vivek Pani.
The aim of this document is to describe the basic requirements for Bengali script layout and text support on the Web and in eBooks. These requirements provide information for Web technologies such as CSS, HTML and digital publications, and for application developers, about how to support users of the Bengali script. The document currently focuses on texts using the Bengali language.
103 |
104 |
The document focuses on typographic layout issues. For a deeper understanding of how the Bengali language is written in the Bengali script, see Bengali Orthography Notes, which includes topics such as: Phonology, Vowels, Consonants, Encoding choices, and Numbers.
105 |
106 |
This document should contain no reference to a particular technology. For example, it should not say "CSS does/doesn't do such and such", and it should not describe how a technology, such as CSS, should implement the requirements. It is technology agnostic, so that it will be evergreen, and it simply describes how the script works. The gap analysis document is the appropriate place for all kinds of technology-specific information.
This document should be used alongside a separate document, Bengali Gap Analysis, which describes gaps in support for Bengali on the Web, and prioritises and describes the impact of those gaps on the user.
The document Language enablement index points to this document and others, and provides a central location for developers and implementers to find information related to various scripts.
132 |
133 |
The W3C also has a repository with discussion threads related to the Bengali script, including requests from developers to the user community for information about how scripts/languages work, and a notification system that tracks issues in W3C working groups related to the Bengali script. See a list of unresolved questions for Bengali experts. Each section below points to related discussions. See also the repository home page.
The Bengali script is an abugida. Consonants carry an inherent vowel which can be modified by appending vowel signs to the consonant.
156 |
157 |
The orthographic letters of the Bengali script are derived from Sanskrit, and in some cases don't quite fit the needs of modern Bangla (e.g. lack of simple vowels for the sounds ɛ and æ, letters for only 2 of many diphthongs, long and short letters where pronunciation no longer distinguishes those sounds, etc.).
158 |
159 |
Bengali text runs left to right in horizontal lines. Words are separated by spaces. There are no case distinctions.
160 |
161 |
The repertoire of consonant letters is extended by applying the nukta diacritic to existing characters.
162 |
163 |
Consonant clusters at any location are normally indicated using a virama (hasant) between consonants. This results in a large number of conjunct forms expressed using stacked consonants, conjoined consonants, and ligated glyphs. Conjuncts often have different pronunciations than might be expected from the letters involved and, in particular, gemination is very common. Occasionally, a visible virama is used. However, clusters are often not marked at all.
164 |
165 |
As part of a cluster, RA has special forms, for both cluster-initial and post-base positions.
166 |
167 |
Word-final consonant sounds may be represented by the special letter ৎ, or by dedicated combining marks (anusvara & visarga), but are generally ordinary consonants that are not marked by a virama.
168 |
169 |
The Bangla orthography is an abugida with 2 inherent vowels, pronounced ɔ and o. Other post-consonant vowels are written using combining marks (vowel signs) and a specialised use of the y consonant letter.
170 |
171 |
Vowel harmony plays a significant role in the pronunciation of vowel-related code points in Bangla.
172 |
173 |
There are pre-base and circumgraph vowel signs. In principle, there are no multipart vowels; however, in decomposed text the circumgraphs split into 2 parts each.
174 |
175 |
Standalone vowels are written using independent vowel letters, one for each vowel sound, including the inherent vowel and diphthongs. The final sound of numerous diphthongs is also represented using independent vowels.
176 |
Vowels may be nasalised, using the candrabindu diacritic.
The basic unit for working with Bengali text is the orthographic syllable, i.e. one consonant or a sequence of consonants with hasant between them, plus optional additional combining characters (such as vowel-signs).
240 |
241 |
In Bengali an orthographic syllable that forms a conjunct should be treated as an indivisible unit of text for most editing operations. The figure below shows a Bengali word with a conjunct at the end, and the expected segmentation.
242 |
243 |
244 |
245 |
ঝিল্লি → ঝি+ল্লি
246 | Expected minimal units (right) during segmentation of the word ঝিল্লি jhilli.
247 |
248 |
249 |
250 |
If, however, a conjunct is not formed and the hasant is visible, the first consonant plus hasant would be treated as separate from the second consonant, and the vowel-sign would appear to the left of the second consonant (see the figure below).
251 |
252 |
253 |
254 |
ঝিল্লি → ঝি+ল্+লি
255 | Expected segmentation of the word ঝিল্লি jhilli when there is no conjunct.
256 |
257 |
258 |
259 |
Note that in Bengali an orthographic syllable may be longer than a Unicode grapheme cluster, if it forms a conjunct. The figure below shows a Bengali word with a conjunct at the end, and the segmentation that would result from applying Unicode grapheme clusters only.
260 |
261 |
262 |
263 |
ঝিল্লি → ঝি+ল্+লি
264 | Segmentation of the word ঝিল্লি jhilli with a conjunct when using Unicode grapheme clusters.
265 |
266 |
267 |
268 |
For Bengali, applications need to provide tailored extensions to correctly segment the text. Such tailoring needs to be able to distinguish between sequences that are displayed as conjuncts, and those where the hasant is visible.
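The kind of tailoring described above can be illustrated with a minimal Python sketch. It is not a reference implementation. It assumes that a hasant followed directly by a consonant forms a conjunct, and that a visible hasant is signalled in the text by U+200C ZERO WIDTH NON-JOINER after the hasant. Whether a conjunct is actually displayed also depends on the font, which this sketch cannot see. The function names are hypothetical.

```python
import unicodedata

HASANT = "\u09CD"               # BENGALI SIGN VIRAMA
JOINERS = {"\u200C", "\u200D"}  # ZWNJ and ZWJ stay attached to the preceding unit


def is_bengali_consonant(ch: str) -> bool:
    """Rough test for a Bengali consonant letter (KA..HA plus RRA, RHA, YYA)."""
    return "\u0995" <= ch <= "\u09B9" or ch in "\u09DC\u09DD\u09DF"


def orthographic_syllables(text: str) -> list[str]:
    """Split text into units, keeping hasant-joined consonants together.

    Assumption: hasant + consonant is a conjunct; hasant + ZWNJ + consonant
    is not, so the consonant after the ZWNJ starts a new unit.
    """
    units: list[str] = []
    for ch in text:
        # Combining marks (vowel signs, hasant, nukta, etc.) extend the current unit.
        is_mark = unicodedata.category(ch) in ("Mn", "Mc", "Me")
        # A consonant immediately after a hasant continues the same unit.
        joins_conjunct = (
            bool(units) and units[-1].endswith(HASANT) and is_bengali_consonant(ch)
        )
        if units and (is_mark or ch in JOINERS or joins_conjunct):
            units[-1] += ch
        else:
            units.append(ch)
    return units
```

Under these assumptions, the word ঝিল্লি yields the two units ঝি and ল্লি, while the same word with a ZWNJ after the hasant yields three units, matching the segmentations shown in the figures above.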
। U+0964 DEVANAGARI DANDA is used for sentence-final punctuation.
There are two alternative approaches to the use of spaces with danda:
No space character appears between the end of the phrase and the danda glyph, but the advance width of the danda in a font should open a small gap before it. The danda is then typically followed by a single space.
A space is allowed before and after the danda in order to balance the space before and after it. In this case, the danda must still be kept from wrapping to a new line on its own; it should wrap with the previous word and space together.
These same principles apply to ॥ U+0965 DEVANAGARI DOUBLE DANDA.
The double danda should be written using the dedicated Unicode character, and not by combining two single dandas.
The double danda is sometimes used to set apart section or verse numbering, in which the number is placed between pairs of double dandas. To obtain the correct spacing, the character sequence is usually <double danda, space, numeral(s), double danda>.
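As a small illustration of the two points above, the sketch below replaces a pair of single dandas with the dedicated double danda character, and builds a verse-number marker using the sequence just described. The function names are hypothetical, and replacing every pair of adjacent dandas is an assumption that may not suit all content.

```python
DANDA = "\u0964"         # DEVANAGARI DANDA
DOUBLE_DANDA = "\u0965"  # DEVANAGARI DOUBLE DANDA


def normalise_double_danda(text):
    """Replace two adjacent single dandas with the dedicated character."""
    return text.replace(DANDA + DANDA, DOUBLE_DANDA)


def verse_number_marker(numerals):
    """Build <double danda, space, numeral(s), double danda>."""
    return f"{DOUBLE_DANDA} {numerals}{DOUBLE_DANDA}"


# Bengali digits one and two, written as escapes.
print(verse_number_marker("\u09E7\u09E8"))
```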
The mission of this task force is to support the use of languages of India by Web standards and technologies, such as HTML, CSS, Mobile Web, Digital Publications and Unicode. It does this by establishing a network of experts who explore, discuss and document gaps and requirements for the languages in scope.
The output of the task force is pointed to by the Language enablement index, and sits alongside similar work for other writing systems. For information about layout and typographic requirements work for other scripts, see Layout & typography.
This charter is intended to reflect the current direction of the group, so that there is common agreement. It may be altered at any point in order to reflect new priorities or work items.
Teleconferences: At least once per month.
Face-to-face meetings: On an as-needed basis.
Video Conferences: On an as-needed basis.
Scope
Use of the GitHub issue list is not restricted to any particular set of languages, other than that the group focuses on languages and scripts used in India.
The set of languages for which gap-analysis and requirements documents will be provided is determined based on the availability of linguistic and typographic experts. Currently, the group is working on gap-analysis and/or requirements documents for the following:
Devanagari (Hindi, Marathi)
Bengali (Bangla, Assamese)
Tamil (Tamil)
Gurmukhi (Punjabi)
Documents related to additional languages may be added as expertise becomes available.
Deliverables
The India International Program Task Force will not produce Recommendation-track deliverables but will produce documents that can be published by the Internationalization Working Group as Working Group Notes or articles.
The group will also assist in developing tests for language features, most of which will be made available via the Internationalization Test Suite, and some of which may be ported to the Web Platform Tests repository.
The major deliverables of the program are:
a network of experts who receive notifications of issues raised in GitHub and respond when needed with advice about requirements for Indian languages on the Web. It aims to address the problem that experts don't know how to tell the W3C what problems exist for support of their language on the Web, and the W3C doesn't know how to contact people who can help when questions arise. This network of experts should help to significantly reduce that problem.
gap-analysis documents focused on specific languages and/or scripts, which describe features that need attention and prioritise them. The gap-analysis document will describe the problems, demonstrate them using tests or screen grabs, and describe whether work is needed on specifications (such as the CSS spec) or implementations (such as major web browsers). This work feeds into the language matrix which provides a heat-map for language issues on the Web.
requirements documents, which describe in a technology-agnostic way how the script/language works. The requirements documents may be developed piecemeal, to match progress in the gap analysis documents, ie. as a new section is created for the latter, content may be added to the requirements document to indicate the expected result.
Note that the requirements document should always remain technology-agnostic, so that it is evergreen. The gap analysis document, however, should be technology-specific, and as issues are addressed parts of the gap analysis document will eventually become merely historical records (and should be labelled as such).
There will be one gap-analysis document per script. Where the gap-analysis issues reported relate to more than one language, the document content should clearly indicate the affected language(s).
The gap analysis will be done in two stages:
A preliminary version can be produced quickly with a small number of experts. The information will be added to the W3C language matrix.
A more detailed version of the gap analysis report, preferably with test cases and links to existing requirements information. After an initial effort, this information may be added to on an ongoing basis as new issues arise, or as gaps are closed. The results will again be reflected in the W3C language matrix.
W3C India has already done some work on layout requirements for Indian Languages. A Working Draft is available. This work will be merged into the deliverables of this task force, and adapted as appropriate. Such documents should at least document the requirements for unsupported features that are identified during the gap analysis work, but can go beyond that to describe how the script functions.
Success Criteria
The success of the Task Force will be evaluated based on:
the number of experts recruited to participate in or follow discussions
the number of issues raised and dealt with
the number of documents produced
how successful the group is in advancing support for Indian languages and scripts on the Web
Relationships to other groups
Working Drafts and Notes will be published by the i18n WG, and the i18n WG will work with the task force closely to assist with development and review of the documents.
Dependencies
W3C Internationalization WG
The W3C i18n WG will oversee the work of the Task Force, and will publish the Working Drafts and Notes on its behalf. The i18n WG will also help the Task Force produce work that fits with the work of other Task Forces, and wider initiatives at the W3C.
W3C Publishing Working Group
The India International Program Task Force will work with the Digital Publishing Working Group to ensure that the work the Task Force is doing is known to that group, and that any issues common to the two groups are identified and tracked appropriately.
The India International Program Task Force has no formal dependencies on any other W3C Working Groups, but important points of contact include:
The India International Program Task Force is also expected to take advantage of opportunities for discussion and collaboration with existing groups and communities in India as well as groups and communities elsewhere.
Participation
A number of types of participation are possible, ranging from very low commitment (eg. 'Followers') to significant (eg. 'Editors' and 'Chairs'). These are described in a wiki page.
The GitHub home page for the group describes how to participate.
Everyone participating in the work of the task force, be it through the issue list, by contributing content or tests, or any other communication, must do so in conformance with the provisions of the CONTRIBUTING document.
The GitHub issue list is used to report issues for language support, for discussions, and to send feedback about documents.
The public-i18n-indic mailing list is used to send notification digests & meeting minutes. It is not for technical discussion.
There is also a public-iip-admin mailing list for internal and administrative use by the TF participants, for example for announcing teleconference agendas, new participants, preparing for publication, etc. or for discussing other non-technical, practical arrangements related to the group. Only participants in the task force are subscribed to that list.
The task force aims to hold teleconference or face-to-face meetings at least once a month, with additional meetings as needed to enable discussion and review status of the work. Such meetings have proven to be extremely useful in maintaining the heartbeat of the work.
The #ilreq IRC channel is used for supplementary communication and for minute-taking during meetings. Instructions for use are sent out with the meeting agenda.
Decision Policy
As explained in the Process Document (section 3.3), this group will seek to make decisions when there is consensus. In cases where there is a need to formally produce a group resolution about a particular issue, its Chair will put a question about the issue to the group and gather responses (including any formal objections); then, after due consideration of all the responses, the Chair will record a group resolution (possibly after a formal vote and also along with responding to any formal objections).
Patent Policy
Participants in the Task Force are obligated to comply with W3C patent-disclosure policy as outlined in Section 6 of the W3C Patent Policy document. Although the Task Force is not chartered to produce Recommendation-track documents that themselves require patent disclosure, participants in the group are nevertheless obligated to comply with W3C patent-disclosure policy for any Recommendation-track specifications that they review or comment on.
This charter for the Task Force within the Internationalization Interest Group is not a formal document and does not require W3C management or Advisory Committee review or approval. It is intended to summarise the goals and procedures of the group at any given time, and can be changed at any time to realign with changed priorities for the group.
Charter Authors: Richard Ishida
This document describes or points to requirements for the layout and presentation of text in languages that use the Devanagari script. The target audience is developers of Web standards and technologies, such as HTML, CSS, Mobile Web, Digital Publications, and Unicode, as well as implementers of web browsers, ebook readers, and other applications that need to render Devanagari text.
This document describes the basic requirements for Devanagari script layout and text support on the Web and in eBooks. These requirements provide information for Web technologies such as CSS, HTML and digital publications about how to support users of Devanagari script languages. Currently the document focuses on the Devanagari script as used for Hindi and Marathi. The information here is developed in conjunction with a document that summarises gaps in support on the Web for Devanagari.
To make it easier to track comments, please raise separate issues or emails for each comment, and point to the section you are commenting on using a URL.
Some links on this page point to repositories or pages to which information will be added over time. Initially, the link may produce no results, but as issues, tests, etc. are created they will show up.
Links that have a gray color led to no content the last time this document was updated. They are still live, however, since relevant content could be added at any time. When the document is updated, links that now point to results will have their live colour restored.
The initial version of this document was prepared by Richard Ishida.
Thanks to the following people who contributed information that is used in this document (contributors' names listed in alphabetic order): Akshat Joshi, Alolita Sharma, Vivek Pani.
The aim of this document is to describe the basic requirements for Devanagari script layout and text support on the Web and in eBooks. These requirements provide information for Web technologies such as CSS, HTML and digital publications, and for application developers, about how to support users of the Devanagari script. The document currently focuses on texts using the Hindi and Marathi languages.
The document focuses on typographic layout issues. For a deeper understanding of the Devanagari script and how it works see Hindi Orthography Notes, which includes topics such as: Phonology, Vowels, Consonants, Encoding choices, and Numbers.
This document should contain no reference to a particular technology. For example, it should not say "CSS does/doesn't do such and such", and it should not describe how a technology, such as CSS, should implement the requirements. It is technology agnostic, so that it will be evergreen, and it simply describes how the script works. The gap analysis document is the appropriate place for all kinds of technology-specific information.
This document should be used alongside a separate document, Devanagari Gap Analysis, which describes gaps in support for Devanagari on the Web, and prioritises and describes the impact of those gaps on the user.
To complement any content authored specifically for this document, the sections in the document also point to related, external information, tests, GitHub discussions, etc.
The document Language enablement index points to this document and others, and provides a central location for developers and implementers to find information related to various scripts.
The W3C also has a repository with discussion threads related to the Devanagari script, including requests from developers to the user community for information about how scripts/languages work, and a notification system that tracks issues in W3C working groups related to the Devanagari script. See a list of unresolved questions for Indian script experts. Each section below points to related discussions. See also the repository home page.
The Devanagari script is an abugida. Consonant letters have an inherent vowel sound. Combining vowel signs are attached to the consonant to indicate that a different vowel follows the consonant.
Devanagari text runs left-to-right in horizontal lines. Words are separated by spaces. There is no case distinction.
Orthographic syllables (as opposed to phonetic syllables) play a significant role in Devanagari. An orthographic syllable starts at the beginning of any cluster of consonants and incorporates the whole cluster plus any following vowels and diacritics. A phonetic syllable may begin and end within a consonant conjunct.
Languages written with the Devanagari script often have aspirated forms of stops and a set of retroflex consonants. These are all represented separately in the orthography.
Consonant letters may be supplemented by repertoire extensions for non-native sounds by applying the nukta diacritic to characters.
Consonant clusters at any location are normally indicated using the virama between consonants. This results in a large number of conjunct forms expressed using half-forms, stacked consonants, and ligated glyphs. Occasionally, a visible virama is used. As part of a cluster, RA typically has special forms.
Word-final consonant sounds may be represented by dedicated combining marks (anusvara & visarga), but are generally ordinary consonants that are not marked by a virama. An elided inherent vowel is not always marked. In Hindi, the inherent vowel of a penultimate consonant in a word of 3 syllables that ends in a non-inherent vowel is usually elided, and not marked as such.
Standalone vowel sounds are typically written using independent vowels, one for each vowel sound, including the inherent vowel.
Vowel nasalisation is typically indicated using a diacritic.
There is a set of native number digits. Punctuation is mostly ASCII, but dandas may be used for phrase boundaries.
The Unicode Devanagari block contains more characters than other Indic scripts, partly because it serves as a pivot script for transliterations of other scripts.
The basic unit for working with Devanagari text is the orthographic syllable, ie. one consonant or a sequence of consonants with halant between, plus optional additional combining characters (such as vowel-signs).
In Devanagari, an orthographic syllable that forms a conjunct should be treated as an indivisible unit of text for most editing operations. The example below shows a Devanagari word with a conjunct at the end, and the expected segmentation.
हिन्दी → हि+न्दी
Expected minimal units (right) during segmentation of the word हिन्दी hindī.
If, however, a conjunct is not formed and the halant is visible, the first consonant plus halant would be treated as separate from the second consonant, and the vowel-sign would appear to the left of the second consonant (see the example below).
हिन्दी → हि+न्+दी
Expected segmentation of the word हिन्दी hindī when there is no conjunct.
Note that in Devanagari an orthographic syllable may be longer than a Unicode grapheme cluster, if it forms a conjunct. The example below shows a Devanagari word with a conjunct at the end, and the segmentation that would result from applying Unicode grapheme clusters only.
हिन्दी → हि+न्+दी
Segmentation of the word हिन्दी hindī with a conjunct when using Unicode grapheme clusters.
For Devanagari, applications need to provide tailored extensions to correctly segment the text. Such tailoring needs to be able to distinguish between sequences that are displayed as conjuncts, and those where the halant is visible.
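One cue that is available in the encoded text itself is U+200C ZERO WIDTH NON-JOINER: placing it after the halant requests a visible halant rather than a conjunct. The following Python sketch (hypothetical function name; a simplification, not a full tailoring) joins consonants across a halant unless a ZWNJ follows it. It assumes that every other halant forms a conjunct, which in practice also depends on the font.

```python
import unicodedata

HALANT = "\u094D"  # DEVANAGARI SIGN VIRAMA
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER


def orthographic_syllables(text):
    """Split text into units, keeping consonant clusters together.

    Assumption: halant + letter forms a conjunct unless ZWNJ intervenes.
    """
    units = []
    join_next = False  # True right after a halant with no ZWNJ
    for ch in text:
        category = unicodedata.category(ch)
        if ch == ZWNJ and units:
            units[-1] += ch      # keep ZWNJ with the halant it follows
            join_next = False    # and block the join with the next letter
        elif category.startswith("M") and units:
            units[-1] += ch      # combining marks attach to the current unit
            join_next = ch == HALANT
        elif join_next and category == "Lo":
            units[-1] += ch      # second consonant of a conjunct
            join_next = False
        else:
            units.append(ch)
            join_next = False
    return units


conjunct = "\u0939\u093F\u0928\u094D\u0926\u0940"       # hindī, as above
visible = "\u0939\u093F\u0928\u094D\u200C\u0926\u0940"  # with ZWNJ after halant
print("+".join(orthographic_syllables(conjunct)))
print("+".join(orthographic_syllables(visible)))
```

The first word yields two units and the second yields three, matching the two expected segmentations described above.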
Word boundaries
Words are separated by spaces.
Devanagari has hyphenated words, mainly conjoined nouns, eg. लाभ-हानि lābʰ-hāni (profit-loss), and माता-पिता mātā-pitā (parents).
। U+0964 DEVANAGARI DANDA is used for sentence-final punctuation.
There are two alternative approaches to the use of spaces with danda:
No space character appears between the end of the phrase and the danda glyph, but the advance width of the danda in a font should open a small gap before it. The danda is then typically followed by a single space.
A space is allowed before and after the danda in order to balance the space before and after it. In this case, the danda must still be kept from wrapping to a new line on its own; it should wrap with the previous word and space together.
These same principles apply to ॥ U+0965 DEVANAGARI DOUBLE DANDA.
The double danda should be written using the dedicated Unicode character, and not by combining two single dandas.
The double danda is sometimes used to set apart section or verse numbering, in which the number is placed between pairs of double dandas. To obtain the correct spacing, the character sequence is usually <double danda, space, numeral(s), double danda>.
The primary break opportunities for line breaking are at inter-word spaces.
If a line is broken inside a word, any consonant clusters should be kept intact unless they are separated by visible halant characters.
Line breaking should not move a danda or double danda to the beginning of a new line, even if they are preceded by a space character. These punctuation characters should behave in the same way as a full stop does in English text.
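The danda rule can be illustrated with a simple greedy wrapper. This is only a sketch under stated assumptions: the function name is hypothetical, breaks are taken at spaces only, and line width is counted in code points rather than in rendered glyph widths, which a real layout engine would measure.

```python
DANDAS = ("\u0964", "\u0965")  # danda and double danda


def wrap_keeping_danda(text, width):
    """Greedy wrap at spaces; a danda never starts a line."""
    # First glue any token that begins with a danda to the previous word,
    # together with the space that separates them.
    units = []
    for token in text.split():
        if units and token[0] in DANDAS:
            units[-1] += " " + token
        else:
            units.append(token)

    # Then fill lines greedily with the glued units.
    lines, current = [], ""
    for unit in units:
        candidate = unit if not current else current + " " + unit
        if current and len(candidate) > width:
            lines.append(current)
            current = unit
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


for line in wrap_keeping_danda("ab cd \u0964 ef", 5):
    print(line)
```

Because the danda travels with the preceding word and space, a line may slightly exceed the requested width instead of leaving the danda stranded at the start of the next line.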
This document describes and prioritises gaps for the support of the Bengali script on the Web and in eBooks. In particular, it is concerned with text layout. It checks that needed features are supported in W3C specifications, such as HTML and CSS and those relating to digital publications. It also checks whether the features have been implemented in browsers and ereaders.
This document describes and prioritises gaps for the support of the Bengali script on the Web and in eBooks. In particular, it is concerned with text layout. It checks that needed features are supported in W3C specifications, in particular HTML and CSS and those relating to digital publications. It also checks whether the features have been implemented in browsers and ereaders.
It is linked to from the language matrix that tracks Web support for many languages.
The framework of this document was created by Richard Ishida. The text for most gap descriptions is automatically pulled from GitHub issues, and that text may have been written or contributed to by others.
The following people contributed information to the gap reports in this document (in alphabetic order): Akshat Joshi, Hai Liang, John Hudson, Vivek Pani.
The W3C needs to make sure that the needs of scripts and languages around the world are built in to technologies such as HTML, CSS, SVG, etc. so that Web pages and eBooks can look and behave as people expect around the world.
This page documents difficulties that people encounter when trying to use languages written in the Bengali script on the Web.
Having identified an issue, it investigates the current status with regard to web specifications and implementations by user agents (browsers, e-readers, etc.), and attempts to prioritise the severity of the issue for web users.
This document not only describes gaps, it also attempts to prioritise them in terms of the impact on the local user. The prioritisation is indicated by colour.
Key:
It is important to note that these colours do not indicate to what extent a particular feature is broken. They indicate the impact of a broken or missing feature on the content author or end user.
A cell can be scored as OK if the feature in question is specified in an appropriate specification (including Candidate Recommendations), and is supported by at least two major browser engines.
Advanced level support includes features that one might expect to include in ebooks or other advanced typographic formats. If a feature of a script or language is not supported on the Web, but is not generally regarded as necessary (usually archaic or obscure features), even if the feature is described here, the status may be marked as OK. The decision as to what priority level is assigned to a described gap is down to the experts doing the gap analysis. It may not always be straightforward to decide.
If a given section in this document refers to more than one feature that is broken, each with different impacts on Web users, the priority for the section will be the lowest denominator.
A summary of this report and others can be found as part of the Language Matrix.
Gap reports are brought to the attention of spec and browser implementers, and are tracked via the Gap Analysis Pipeline. Find the Bengali items.
For more information about the Bengali script, including requirements, tests, GitHub discussions, type samples, and more, see Bengali Script Resources.
Sometimes a script or language does things that are not common outside of its sphere of influence. This is a loose bag of additional items that weren't previously mentioned. This section may also be relevant for observations related to locale formats (such as number, date, currency, format support).
There are many other CSS modules which may need review for script-specific requirements, not to mention the SVG, HTML, Speech, MathML and other specifications. What else is likely to cause problems for worldwide deployment of the Web, and what requirements need to be addressed to make the Web function well locally?
This document describes and prioritises gaps for the support of the Devanagari script on the Web and in eBooks. In particular, it is concerned with text layout. It checks that needed features are supported in W3C specifications, such as HTML and CSS and those relating to digital publications. It also checks whether the features have been implemented in browsers and ereaders.
This document describes and prioritises gaps for the support of the Devanagari script on the Web and in eBooks. In particular, it is concerned with text layout. It checks that needed features are supported in W3C specifications, in particular HTML and CSS and those relating to digital publications. It also checks whether the features have been implemented in browsers and ereaders.
It is linked to from the language matrix that tracks Web support for many languages.
The framework of this document was created by Richard Ishida. The text for most gap descriptions is automatically pulled from GitHub issues, and that text may have been written or contributed to by others.
The following people contributed information to the gap reports in this document (in alphabetic order): Akshat Joshi, Hai Liang, John Hudson, Vivek Pani.
The W3C needs to make sure that the needs of scripts and languages around the world are built in to technologies such as HTML, CSS, SVG, etc. so that Web pages and eBooks can look and behave as people expect around the world.
This page documents difficulties that people encounter when trying to use languages written in the Devanagari script on the Web.
Having identified an issue, it investigates the current status with regard to web specifications and implementations by user agents (browsers, e-readers, etc.), and attempts to prioritise the severity of the issue for web users.
This document not only describes gaps, it also attempts to prioritise them in terms of the impact on the local user. The prioritisation is indicated by colour.
Key:
It is important to note that these colours do not indicate to what extent a particular feature is broken. They indicate the impact of a broken or missing feature on the content author or end user.
A cell can be scored as OK if the feature in question is specified in an appropriate specification (including Candidate Recommendations), and is supported by at least two major browser engines.
Advanced level support includes features that one might expect to include in ebooks or other advanced typographic formats. If a feature of a script or language is not supported on the Web, but is not generally regarded as necessary (usually archaic or obscure features), even if the feature is described here, the status may be marked as OK. The decision as to what priority level is assigned to a described gap is down to the experts doing the gap analysis. It may not always be straightforward to decide.
If a given section in this document refers to more than one feature that is broken, each with different impacts on Web users, the priority for the section will be the lowest denominator.
A summary of this report and others can be found as part of the Language Matrix.
Gap reports are brought to the attention of spec and browser implementers, and are tracked via the Gap Analysis Pipeline. Find the Devanagari items.
For more information about the Devanagari script, including requirements, tests, GitHub discussions, type samples, and more, see Devanagari Script Resources.
Sometimes a script or language does things that are not common outside of its sphere of influence. This is a loose bag of additional items that weren't previously mentioned. This section may also be relevant for observations related to locale formats (such as number, date, currency, format support).
There are many other CSS modules which may need review for script-specific requirements, not to mention the SVG, HTML, Speech, MathML and other specifications. What else is likely to cause problems for worldwide deployment of the Web, and what requirements need to be addressed to make the Web function well locally?
This document describes and prioritises gaps for the support of the Gujarati script on the Web and in eBooks. In particular, it is concerned with text layout. It checks that needed features are supported in W3C specifications, such as HTML and CSS and those relating to digital publications. It also checks whether the features have been implemented in browsers and ereaders.
This document describes and prioritises gaps for the support of the Gujarati script on the Web and in eBooks. In particular, it is concerned with text layout. It checks that needed features are supported in W3C specifications, in particular HTML and CSS and those relating to digital publications. It also checks whether the features have been implemented in browsers and ereaders.
It is linked to from the language matrix that tracks Web support for many languages.
The framework of this document was created by Richard Ishida. The text for most gap descriptions is automatically pulled from GitHub issues, and that text may have been written or contributed to by others.
The W3C needs to make sure that the needs of scripts and languages around the world are built in to technologies such as HTML, CSS, SVG, etc. so that Web pages and eBooks can look and behave as people expect around the world.
This page documents difficulties that people encounter when trying to use languages written in the Gujarati script on the Web.
Having identified an issue, it investigates the current status with regard to web specifications and implementations by user agents (browsers, e-readers, etc.), and attempts to prioritise the severity of the issue for web users.
This document not only describes gaps, it also attempts to prioritise them in terms of the impact on the local user. The prioritisation is indicated by colour.
Key:
It is important to note that these colours do not indicate to what extent a particular feature is broken. They indicate the impact of a broken or missing feature on the content author or end user.
A cell can be scored as OK if the feature in question is specified in an appropriate specification (including Candidate Recommendations), and is supported by at least two major browser engines.
Advanced level support includes features that one might expect to include in ebooks or other advanced typographic formats. If a feature of a script or language is not supported on the Web, but is not generally regarded as necessary (usually archaic or obscure features), even if the feature is described here, the status may be marked as OK. The decision as to what priority level is assigned to a described gap is down to the experts doing the gap analysis. It may not always be straightforward to decide.
If a given section in this document refers to more than one feature that is broken, each with different impacts on Web users, the priority for the section will be the lowest denominator.
A summary of this report and others can be found as part of the Language Matrix.
Gap reports are brought to the attention of spec and browser implementers, and are tracked via the Gap Analysis Pipeline. Find the Gujarati items.
For more information about the Gujarati script, including requirements, tests, GitHub discussions, type samples, and more, see Gujarati Script Resources.
Sometimes a script or language does things that are not common outside of its sphere of influence. This is a loose bag of additional items that weren't previously mentioned. This section may also be relevant for observations related to locale formats (such as number, date, currency, format support).
There are many other CSS modules which may need review for script-specific requirements, not to mention the SVG, HTML, Speech, MathML and other specifications. What else is likely to cause problems for worldwide deployment of the Web, and what requirements need to be addressed to make the Web function well locally?
This document describes and prioritises gaps for the support of the Gurmukhi script on the Web and in eBooks. In particular, it is concerned with text layout. It checks that needed features are supported in W3C specifications, such as HTML and CSS and those relating to digital publications. It also checks whether the features have been implemented in browsers and ereaders.
This document describes and prioritises gaps for the support of the Gurmukhi script on the Web and in eBooks. In particular, it is concerned with text layout. It checks that needed features are supported in W3C specifications, in particular HTML and CSS and those relating to digital publications. It also checks whether the features have been implemented in browsers and ereaders.
It is linked to from the language matrix that tracks Web support for many languages.
The framework of this document was created by Richard Ishida. The text for most gap descriptions is automatically pulled from GitHub issues, and that text may have been written or contributed to by others.
The W3C needs to make sure that the needs of scripts and languages around the world are built in to technologies such as HTML, CSS, SVG, etc. so that Web pages and eBooks can look and behave as people expect around the world.
This page documents difficulties that people encounter when trying to use languages written in the Gurmukhi script on the Web.
Having identified an issue, it investigates the current status with regards to web specifications and implementations by user agents (browsers, e-readers, etc.), and attempts to prioritise the severity of the issue for web users.
This document not only describes gaps, it also attempts to prioritise them in terms of the impact on the local user. The prioritisation is indicated by colour.
Key:
It is important to note that these colours do not indicate to what extent a particular feature is broken. They indicate the impact of a broken or missing feature on the content author or end user.
A cell can be scored as OK if the feature in question is specified in an appropriate specification (including Candidate Recommendations), and is supported by at least two major browser engines.
Advanced level support includes features that one might expect to include in ebooks or other advanced typographic formats. If a feature of a script or language is not supported on the Web, but is not generally regarded as necessary (usually archaic or obscure features), even if the feature is described here, the status may be marked as OK. The decision as to what priority level is assigned to a described gap is down to the experts doing the gap analysis. It may not always be straightforward to decide.
If a given section in this document refers to more than one feature that is broken, each with different impacts on Web users, the priority for the section will be the lowest denominator.
A summary of this report and others can be found as part of the Language Matrix.
Gap reports are brought to the attention of spec and browser implementers, and are tracked via the Gap Analysis Pipeline. Find the Gurmukhi items.
For more information about the Gurmukhi script, including requirements, tests, GitHub discussions, type samples, and more, see Gurmukhi Script Resources.
Sometimes a script or language does things that are not common outside of its sphere of influence. This is a loose bag of additional items that weren't previously mentioned. This section may also be relevant for observations related to locale formats (such as number, date, currency, format support).
There are many other CSS modules which may need review for script-specific requirements, not to mention the SVG, HTML, Speech, MathML and other specifications. What else is likely to cause problems for worldwide deployment of the Web, and what requirements need to be addressed to make the Web function well locally?
This document describes and prioritises gaps for the support of the Tamil script on the Web and in eBooks. In particular, it is concerned with text layout. It checks that needed features are supported in W3C specifications, such as HTML and CSS and those relating to digital publications. It also checks whether the features have been implemented in browsers and ereaders.
This document describes and prioritises gaps for the support of the Tamil script on the Web and in eBooks. In particular, it is concerned with text layout. It checks that needed features are supported in W3C specifications, in particular HTML and CSS and those relating to digital publications. It also checks whether the features have been implemented in browsers and ereaders.
It is linked to from the language matrix that tracks Web support for many languages.
The framework of this document was created by Richard Ishida. The text for most gap descriptions is automatically pulled from GitHub issues, and that text may have been written or contributed to by others.
The W3C needs to make sure that the needs of scripts and languages around the world are built in to technologies such as HTML, CSS, SVG, etc. so that Web pages and eBooks can look and behave as people expect around the world.
This page documents difficulties that people encounter when trying to use languages written in the Tamil script on the Web.
Having identified an issue, it investigates the current status with regards to web specifications and implementations by user agents (browsers, e-readers, etc.), and attempts to prioritise the severity of the issue for web users.
This document not only describes gaps, it also attempts to prioritise them in terms of the impact on the local user. The prioritisation is indicated by colour.
Key:
It is important to note that these colours do not indicate to what extent a particular feature is broken. They indicate the impact of a broken or missing feature on the content author or end user.
A cell can be scored as OK if the feature in question is specified in an appropriate specification (including Candidate Recommendations), and is supported by at least two major browser engines.
Advanced level support includes features that one might expect to include in ebooks or other advanced typographic formats. If a feature of a script or language is not supported on the Web, but is not generally regarded as necessary (usually archaic or obscure features), even if the feature is described here, the status may be marked as OK. The decision as to what priority level is assigned to a described gap is down to the experts doing the gap analysis. It may not always be straightforward to decide.
If a given section in this document refers to more than one feature that is broken, each with different impacts on Web users, the priority for the section will be the lowest denominator.
Sometimes a script or language does things that are not common outside of its sphere of influence. This is a loose bag of additional items that weren't previously mentioned. This section may also be relevant for observations related to locale formats (such as number, date, currency, format support).
There are many other CSS modules which may need review for script-specific requirements, not to mention the SVG, HTML, Speech, MathML and other specifications. What else is likely to cause problems for worldwide deployment of the Web, and what requirements need to be addressed to make the Web function well locally?
This task force does gap analysis and documents requirements related to the layout and presentation of text in languages that use Indian scripts, in the context of Web standards and technologies such as HTML, CSS, Mobile Web, Digital Publications, and Unicode.
We welcome participation requests from people who are interested in contributing to the work of the Task Force. There are two ways to get involved:
Task force members are expert contributors who participate actively in producing the work of the group, regularly contributing text and advice to create the outputs, and participating in meetings. For more information about becoming a task force member, contact Richard Ishida.
It is also possible to follow and contribute to discussions without the commitment required in being an expert contributor. See the GitHub home page for details.
The India International Program Task Force will not produce Recommendation-track deliverables but expects to produce Working Group Notes, published by the Internationalization Working Group.
To find and follow progress on deliverables, see the GitHub repo.
The main deliverables include the following:
Gap analysis reports for several Indian languages/scripts (Template)
Text Layout Requirements for several Indic scripts (Editor's draft)
The group charter also allows review of draft specifications produced by other working groups, and provision of translations of relevant W3C specifications and resources. The group may also choose to produce other non-normative deliverables, such as test cases and error reports, under the terms of the Policies for Contribution of Test Cases to W3C, and in coordination with any relevant working groups.
Most of the technical discussion takes place in the GitHub issues list. If you want to raise an issue with the documents, this is the place to raise it.
To follow the work, you can 'Watch' the repository, or subscribe to the public-i18n-indic mailing list, which is notified once a day about changes to the repo. The www-international list is also notified daily. (Please use github issues rather than the mailing list to send feedback.) Meeting minutes are sent to public-i18n-arabic.
There is also a public-iip-admin mailing list for internal and administrative use by the TF participants, for example for announcing teleconference agendas, new participants, preparing for publication, etc., or for discussing other non-technical, practical arrangements related to the group. Only participants in the task force are subscribed to that list.
The task force aims to hold teleconference or face-to-face meetings at least once a month, with additional meetings as needed to enable discussion and review status of the work.
The #ilreq IRC channel is used for supplementary communication and minute-taking during meetings. Instructions for use are sent out with the meeting agenda.
This document describes or points to requirements for the layout and presentation of text in languages that use the Tamil script. The target audience is developers of Web standards and technologies, such as HTML, CSS, Mobile Web, Digital Publications, and Unicode, as well as implementers of web browsers, ebook readers, and other applications that need to render Tamil text.
This document describes the basic requirements for Tamil script layout and text support on the Web and in eBooks. These requirements provide information for Web technologies such as CSS, HTML and digital publications about how to support users of Tamil script languages. Currently the document focuses on the Tamil script as used for Tamil. The information here is developed in conjunction with a document that summarises gaps in support on the Web for Tamil.
🚩 This document is a stub awaiting future edits. See Tamil Script Resources instead.
To make it easier to track comments, please raise separate issues or emails for each comment, and point to the section you are commenting on using a URL.
The aim of this document is to describe the basic requirements for Tamil script layout and text support on the Web and in eBooks. These requirements provide information for Web technologies such as CSS, HTML and digital publications, and for application developers, about how to support users of the Tamil script. The document currently focuses on texts using the Tamil language.
The Tamil script is an abugida, i.e. consonants carry an inherent vowel sound that is overridden using vowel signs or killed using a virama.
The Tamil script is written horizontally, left to right. Words are separated by spaces.
There are fewer consonants than in other Indic scripts. Tamil has no aspirated consonant letters, and symbols are allocated on a phonemic basis rather than a phonetic one. This means that க, for example, may be pronounced as the allophones k, ɡ, x, ɣ, or h, according to where it appears relative to other sounds in a word, but its pronunciation doesn't change the word.
The consonant letters used for pure Tamil words are supplemented by Grantha consonant signs which are used for English and Sanskrit loan words. Repertoire extensions for non-native sounds are achieved by preceding a consonant with ஃ U+0B83 TAMIL SIGN VISARGA (āytam).
Consonant clusters are indicated using the visible puḷḷi dot (the virama) to indicate that no vowel follows a consonant. Exceptions to the rule are 2 ligated forms: க்ஷ kʃʌ and ஶ்ரீ ʃri.
Word-initial clusters do not appear in Tamil. Syllable-/word-final consonants are just written using ordinary consonants with the puḷḷi overhead, e.g. தமிழ் t̪amiɻ (Tamil).
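As a rough illustration of how the puḷḷi appears in stored text, the following Python sketch lists the code points of the example word above and of the two ligated forms. The helper names `describe` and `contains_pulli` are hypothetical, introduced only for this example; the Unicode names come from Python's standard `unicodedata` module. Note that the ligated forms still contain U+0BCD in storage, even though no dot is displayed.

```python
import unicodedata

PULLI = "\u0BCD"  # U+0BCD TAMIL SIGN VIRAMA, displayed as the puḷḷi dot

# Strings are written with escapes so the stored order is explicit.
TAMIL_WORD = "\u0BA4\u0BAE\u0BBF\u0BB4\u0BCD"  # the word "Tamil" from the text
KSSA = "\u0B95\u0BCD\u0BB7"                    # first ligated form
SHRI = "\u0BB6\u0BCD\u0BB0\u0BC0"              # second ligated form


def describe(text):
    """Return (code point, Unicode name) pairs in storage order."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


def contains_pulli(text):
    """True if the stored text contains U+0BCD anywhere."""
    return PULLI in text


for label, text in [("word", TAMIL_WORD), ("kssa", KSSA), ("shri", SHRI)]:
    print(label, "contains puḷḷi:", contains_pulli(text))
    for code_point, name in describe(text):
        print("  ", code_point, name)
```

Whether the puḷḷi is drawn or absorbed into a ligature is decided by the font and rendering engine; the stored sequence is the same in both cases.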
The Tamil orthography has an inherent vowel, and represents vowels using vowel signs, including pre-base glyphs and circumgraphs. All circumgraphs can be decomposed. All vowel signs are combining marks, and are stored after the base character.
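The points about vowel signs can be checked with a short Python sketch using the standard `unicodedata` module. It is only an illustration: the helper names `decompose` and `tamil_vowel_signs` are hypothetical, and the results depend on the Unicode data bundled with the Python version in use. It shows that the three circumgraph vowel signs decompose under NFD into two parts, that every Tamil vowel sign has a combining-mark general category (`Mn` or `Mc`), and that a pre-base vowel sign is still stored after its consonant.

```python
import unicodedata

KA = "\u0B95"  # TAMIL LETTER KA, used as a sample base consonant

# The three circumgraph vowel signs (O, OO, AU).
CIRCUMGRAPHS = ["\u0BCA", "\u0BCB", "\u0BCC"]


def decompose(sign):
    """Return the NFD decomposition of a character as U+XXXX strings."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFD", sign)]


def tamil_vowel_signs():
    """Collect assigned characters named TAMIL VOWEL SIGN ... in the block."""
    signs = []
    for cp in range(0x0BBE, 0x0BCD):
        ch = chr(cp)
        # name() returns the default "" for unassigned code points.
        if unicodedata.name(ch, "").startswith("TAMIL VOWEL SIGN"):
            signs.append(ch)
    return signs


for sign in CIRCUMGRAPHS:
    print(unicodedata.name(sign), "->", decompose(sign))

for sign in tamil_vowel_signs():
    print(f"U+{ord(sign):04X}", unicodedata.category(sign))

# A syllable with a pre-base vowel sign (U+0BC6): the glyph is drawn to the
# left of the consonant, but the consonant comes first in storage.
syllable = KA + "\u0BC6"
print([f"U+{ord(ch):04X}" for ch in syllable])
```

Because the decompositions are canonical, comparing or searching Tamil text is safer after normalising both sides to the same form (NFC or NFD).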
There are also independent vowels, one for each vowel sound, including the inherent vowel, and these are used to write all standalone vowel sounds.
The only composite vowels are those created by decomposition of the circumgraphs, and involve 2 glyphs, one on each side of the base consonant(s).
Tamil is diglossic: the classic form is preferred for writing and public speaking, and is mostly standard across the Tamil-speaking regions; the colloquial, spoken form differs widely from the written.
There can also be differences in letter shapes and other typographic approaches between the Tamil used in India and that used in places like Singapore and Malaysia (and even Sri Lanka).